Author: Tinoryj
Documentation List
SDFS Official Guide & Docs
opendedup website on GitHub
opendedup official site
GitLab : SDFS System Research Report
System Environment
Ubuntu 14.04.5
On VMware Fusion 11.0 For macOS
standalone install function
IDEA 2018.1
Modification Aims
Output the chunk processing order after chunking and before in-file deduplication.
Output the chunk processing order after in-file deduplication and before cross-file deduplication.
Output the chunk processing order when the chunks are about to be stored on disk.
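To make the three trace points concrete, the sketch below models the pipeline as three stages and prints the chunk order at each one. It is an illustration only, not SDFS code: the class and method names are hypothetical, chunks are represented by their fingerprints as plain strings, and a HashSet stands in for the store-wide chunk index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ChunkOrderTrace {
    // Hypothetical stand-in for the fingerprint index shared by all files.
    private final Set<String> globalIndex = new HashSet<>();

    /** In-file deduplication: drop repeats inside one file, keeping first-seen order. */
    static List<String> inFileDedup(List<String> chunks) {
        return new ArrayList<>(new LinkedHashSet<>(chunks));
    }

    /** Cross-file deduplication: keep only chunks the index has not seen yet. */
    List<String> crossFileDedup(List<String> chunks) {
        List<String> toStore = new ArrayList<>();
        for (String fingerprint : chunks) {
            // Set.add returns true only for fingerprints that are new to the index.
            if (globalIndex.add(fingerprint)) {
                toStore.add(fingerprint);
            }
        }
        return toStore;
    }

    /** Runs one file through the stages and prints the order at each trace point. */
    List<String> write(String fileName, List<String> chunked) {
        System.out.println(fileName + " after chunking:      " + chunked);
        List<String> unique = inFileDedup(chunked);
        System.out.println(fileName + " after in-file dedup: " + unique);
        List<String> stored = crossFileDedup(unique);
        System.out.println(fileName + " stored to disk:      " + stored);
        return stored;
    }

    public static void main(String[] args) {
        ChunkOrderTrace trace = new ChunkOrderTrace();
        trace.write("file1", Arrays.asList("A", "B", "A", "C"));
        trace.write("file2", Arrays.asList("B", "D"));
    }
}
```

In this toy run, file1 stores A, B and C, while file2 stores only D because B is already in the shared index. In SDFS itself the trace output would have to be added at the corresponding points in the real write path.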
Install SDFS in Standalone Mode
This installation can be overwritten by installing the deb package again.

Step 1: Change to the directory where the SDFS deb package is stored.

Step 2: Install SDFS and its dependencies.

    sudo apt-get install fuse libfuse2 ssh openssh-server jsvc libxml2-utils
    sudo dpkg -i sdfs-version.deb

Step 3: Raise the maximum number of open files allowed (run these as root, since the redirection writes to /etc/security/limits.conf).

    echo "* hard nofile 65535" >> /etc/security/limits.conf
    echo "* soft nofile 65535" >> /etc/security/limits.conf

Build SDFS
Basic Environment for Package
To package the system, the following dependencies need to be installed.
Maven - Java Package Management Tool
Maven requires a JDK, so install one first.

    sudo add-apt-repository ppa:openjdk-r/ppa
    sudo apt-get update
    sudo apt-get install openjdk-8-jdk

Use the following command to check the Java environment.

    java -version

If the installation succeeded, output similar to the following is printed.

    openjdk version "1.8.0_01-internal"
    OpenJDK Runtime Environment (build 1.8.0_01-internal-b04)
    OpenJDK 64-Bit Server VM (build 25.40-b08, mixed mode)

Then install Maven and verify that it works.

    sudo apt-get install maven
    mvn -version

FPM - Package Creator
Install it with apt-get and gem (Ruby).

    sudo apt-get install ruby-dev build-essential
    sudo gem install fpm

Build System deb Package
Modify build.sh
To create the deb package, use the modified build.sh shell script in /install-packages/.

    export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
    VERSION=3.7.8
    DEBFILE="sdfs_${VERSION}_amd64.deb"
    echo $DEBFILE
    sudo rm -rf deb/usr/share/sdfs/lib/*
    cd ../
    mvn package
    cd install-packages
    sudo cp ../target/lib/b2-2.0.3.jar deb/usr/share/sdfs/lib/
    sudo cp ../target/sdfs-${VERSION}-jar-with-dependencies.jar deb/usr/share/sdfs/lib/sdfs.jar
    echo
    sudo rm *.deb
    sudo rm deb/usr/share/sdfs/bin/libfuse.so.2
    sudo rm deb/usr/share/sdfs/bin/libulockmgr.so.1
    sudo rm deb/usr/share/sdfs/bin/libjavafs.so
    sudo cp DEBIAN/libfuse.so.2 deb/usr/share/sdfs/bin/
    sudo cp DEBIAN/libulockmgr.so.1 deb/usr/share/sdfs/bin/
    sudo cp DEBIAN/libjavafs.so deb/usr/share/sdfs/bin/
    sudo cp ../src/readme.txt deb/usr/share/sdfs/
    sudo fpm -s dir -t deb -n sdfs -v $VERSION -C deb/ \
        -d fuse -d libxml2 -d libxml2-utils \
        --vendor datishsystems --deb-no-default-config-files

The key point for building the deb package is adding --deb-no-default-config-files to the fpm command in line 22.
Add jre Package
The original build tree does not contain the jre package in /install-packages/deb/usr/share/sdfs/bin/ (because of the .gitignore rules).
So you may need to install the official version first and copy /usr/share/sdfs/bin/jre to your repo's /install-packages/deb/usr/share/sdfs/bin/ (the uploaded jre package may be broken or unsuitable for your environment).
Modify pom.xml for Maven Project
In lines 268 and 269, the two paths may cause mvn package errors.
Change ./script to /src/script.
Create ./test/java in the SDFS project.
After the modification, the two lines look like this:

    <scriptSourceDirectory>./scripts</scriptSourceDirectory>
    <testSourceDirectory>./test/java</testSourceDirectory>

Build and Install
Use the following commands to build your SDFS deb package and install it.

    cd ./install-packages
    sudo ./build.sh
    sudo dpkg -i sdfs_3.7.8_amd64.deb

Step 3 (line 3) can be repeated any number of times; each run overwrites the old installation.
The first build of the deb package may take a long time because Maven has to download the packages it needs. They are all stored in ~/.m2/, so later builds are quicker because nothing is downloaded again.
Modify SDFS Files
Important Data Struct
Finger
This data structure is almost the same as the normal chunk data structure in CD-Store and REED.
It is implemented in Finger.java; the most important data in the struct is shown below:
HashLocPair
This data structure is an intermediate structure of chunk processing. It is implemented in HashLocPair.java; the most important data in the struct is shown below:

    public class HashLocPair implements Comparable<HashLocPair>, Externalizable {
        public static final int BAL = HashFunctionPool.hashLength + 8 + 4 + 4 + 4 + 4;
        public byte[] hash;    // chunk's hash fingerprint
        public byte[] hashloc; // hashed position
        public byte[] data;    // chunk's logical data
        public int len;        // chunk's logical data size
        public int pos;
        public int offset;
        public int nlen;
        private boolean dup = false;
        public boolean inserted = false;
        // ... (rest of the class omitted)
    }

The Way to Get Chunks
By default, chunks are produced by variable-size chunking based on Rabin fingerprints; this can be switched to fixed-size chunking by editing the XML configuration file.
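The difference between the two chunking modes can be illustrated with a minimal sketch. This is not SDFS's implementation: a simple polynomial rolling hash stands in for a real Rabin fingerprint, and every name and constant here (ChunkingSketch, fixedSize, contentDefined, WINDOW, BASE, the mask and size limits) is a hypothetical choice made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class ChunkingSketch {
    static final int WINDOW = 16; // rolling-hash window in bytes (assumed value)
    static final int BASE = 31;   // polynomial base (assumed value)

    /** Fixed-size chunking: cut every chunkSize bytes. Returns the chunk lengths. */
    static List<Integer> fixedSize(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Integer> lengths = new ArrayList<>();
        for (int start = 0; start < data.length; start += chunkSize) {
            lengths.add(Math.min(chunkSize, data.length - start));
        }
        return lengths;
    }

    /**
     * Content-defined chunking: cut where the rolling hash of the last WINDOW
     * bytes has all mask bits clear, subject to minSize and maxSize.
     */
    static List<Integer> contentDefined(byte[] data, int minSize, int maxSize, int mask) {
        int highPower = 1; // BASE^(WINDOW-1), the weight of the byte leaving the window
        for (int i = 1; i < WINDOW; i++) {
            highPower *= BASE;
        }
        List<Integer> lengths = new ArrayList<>();
        int start = 0;
        int hash = 0;
        for (int i = 0; i < data.length; i++) {
            int len = i - start + 1;
            if (len > WINDOW) {
                // Remove the contribution of the byte that slides out of the window.
                hash -= (data[i - WINDOW] & 0xff) * highPower;
            }
            hash = hash * BASE + (data[i] & 0xff);
            boolean atBoundary = len >= minSize && (hash & mask) == 0;
            if (atBoundary || len >= maxSize) {
                lengths.add(len);
                start = i + 1;
                hash = 0; // restart the window for the next chunk
            }
        }
        if (start < data.length) {
            lengths.add(data.length - start); // trailing partial chunk
        }
        return lengths;
    }

    public static void main(String[] args) {
        System.out.println(fixedSize(new byte[10], 4)); // prints [4, 4, 2]
        byte[] sample = new byte[1 << 16];
        new Random(42).nextBytes(sample);
        List<Integer> chunks = contentDefined(sample, 256, 4096, 0x3FF);
        System.out.println(chunks.size() + " content-defined chunks");
    }
}
```

Because a content-defined boundary depends only on the bytes inside the window, inserting data near the start of a file tends to leave later boundaries where they were, whereas fixed-size chunking shifts every boundary after the insertion point.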