Opendedup

Administration Guide

Introduction:

This is intended to be a detailed guide for the SDFS file-system. For most purposes, the Quickstart Guide will get you going, but if you are interested in advanced topics, this is the place to look.

SDFS is a filesystem designed to provide inline deduplication and flexibility for applications. Services such as backup, archiving, NAS storage, and Virtual Machine primary and secondary storage can benefit greatly from SDFS.

According to Wikipedia, "Data deduplication is a specific form of compression where redundant data is eliminated, typically to improve storage utilization. In the deduplication process, duplicate data is deleted, leaving only one copy of the data to be stored. However, indexing of all data is still retained should that data ever be required. Deduplication is able to reduce the required storage capacity since only the unique data is stored. For example, a typical email system might contain 100 instances of the same one megabyte (MB) file attachment. If the email platform is backed up or archived, all 100 instances are saved, requiring 100 MB storage space. With data deduplication, only one instance of the attachment is actually stored; each subsequent instance is just referenced back to the one saved copy. In this example, a 100 MB storage demand could be reduced to only 1 MB. Different applications have different levels of data redundancy. Backup applications generally benefit the most from de-duplication due to the nature of repeated full backups of an existing file system."[2]

Virtual Machines can also benefit greatly from deduplication as most operating system specific binaries are similar across different guest instances. In many cases deduplication with SDFS can provide between 80% and 90% storage reduction.

Architecture:

SDFS is comprised of 4 basic components:

  • SDFS Volume
  • SDFS file-system service
  • Deduplication Storage Engine (DSE)
  • Data Chunks

The SDFS Volume is a mounted file-system presented to the operating system. This is the primary way applications and services interact with SDFS. SDFS Volumes can be shared through SAMBA or NFS.

The SDFS file-system service provides a typical POSIX-compliant view of deduplicated files and folders to volumes. It stores metadata about files and folders, such as file size, file path, and most other aspects of files and folders other than the actual file data. Each SDFS Volume manages its own SDFS file-system service. In addition to metadata, the SDFS file-system service also manages file maps that map data locations to deduplicated or non-deduplicated chunks. The chunks themselves can live either within the SDFS file-system, if the option "dedupAll=false" is selected, or within the Deduplication Storage Engine.

The Deduplication Storage Engine (DSE) stores, retrieves, and removes all deduped chunks. Chunks of data are stored on disk and indexed for retrieval with a custom-written in-memory hashtable. The deduplication storage engine can be run as part of an SDFS Volume, which is the default, or as a network service.

Data Chunks are the unit by which raw data is processed and stored with SDFS. Chunks of data are stored either with the Deduplication Storage Engine or the SDFS file-system service depending on the deduplication process (see below). Chunks are hashed using the Tiger Hash computation.
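
The relationship between file maps, chunks, and the hash index can be illustrated with a toy example. This is only a sketch and not SDFS code: the ChunkStore class and its methods are hypothetical, everything is held in memory, and SHA-256 stands in for the Tiger hash because the Python standard library does not provide Tiger.

```python
import hashlib


class ChunkStore:
    """Toy in-memory stand-in for a deduplicating chunk store (hypothetical)."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # hash digest -> chunk bytes; each unique chunk is kept once

    def write(self, data):
        """Split data into fixed-size chunks and return the file map (list of hashes)."""
        file_map = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            # Assumption: SHA-256 replaces the Tiger hash used by SDFS.
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            file_map.append(digest)
        return file_map

    def read(self, file_map):
        """Rebuild the original data by following the file map."""
        return b"".join(self.chunks[digest] for digest in file_map)


store = ChunkStore(chunk_size=4096)
data = b"A" * 8192 + b"B" * 4096  # three chunks, two of them identical
file_map = store.write(data)
print(len(file_map), "chunks referenced,", len(store.chunks), "chunks stored")
```

The file map plays the role of the metadata held by the file-system service, while the dictionary plays the role of the DSE.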

Managing and Mounting SDFS Volumes:

SDFS Volumes are created with the command mkfs.sdfs. There are plenty of options for creating a file system; to see them run "./mkfs.sdfs --help" or take a look at them here.

By default volumes store all data in the folder structure /opt/sdfs/<volume-name>. This may not be optimal and can be changed before a volume is mounted for the first time. In addition, volume configurations are held in the /etc/sdfs folder. Each volume configuration is created when the mkfs.sdfs command is run and is stored as an XML file named <volume-name>-volume-cfg.xml. A typical volume configuration is as follows:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<subsystem-config version="1.1.0">
  <locations dedup-db-store="/opt/sdfs/volumes/pool0/ddb" io-log="/opt/sdfs/volumes/pool0/ioperf.log"/>
  <io chunk-size="4" claim-hash-schedule="0 0 0/2 * * ?" dedup-files="true" file-read-cache="5" hash-size="16" log-level="1" max-file-inactive="900" max-file-write-buffers="1" max-open-files="1024" meta-file-cache="1024" multi-read-timeout="1000" safe-close="false" safe-sync="false" system-read-cache="1000" write-threads="18"/>
  <permissions default-file="0644" default-folder="0755" default-group="0" default-owner="0"/>
  <volume capacity="4TB" closed-gracefully="true" current-size="0" maximum-percentage-full="-1.0" path="/opt/sdfs/volumes/pool0/files"/>
  <launch-params class-path="/usr/share/sdfs/lib/truezip-samples-7.3.2-jar-with-dependencies.jar:/usr/share/sdfs/lib/commons-collections-3.2.1.jar:/usr/share/sdfs/lib/sdfs.jar:/usr/share/sdfs/lib/jacksum.jar:/usr/share/sdfs/lib/slf4j-log4j12-1.5.10.jar:/usr/share/sdfs/lib/slf4j-api-1.5.10.jar:/usr/share/sdfs/lib/simple-4.1.21.jar:/usr/share/sdfs/lib/commons-io-1.4.jar:/usr/share/sdfs/lib/clhm-release-1.0-lru.jar:/usr/share/sdfs/lib/trove-3.0.0a3.jar:/usr/share/sdfs/lib/quartz-1.8.3.jar:/usr/share/sdfs/lib/log4j-1.2.15.jar:/usr/share/sdfs/lib/bcprov-jdk16-143.jar:/usr/share/sdfs/lib/commons-codec-1.3.jar:/usr/share/sdfs/lib/commons-httpclient-3.1.jar:/usr/share/sdfs/lib/commons-logging-1.1.1.jar:/usr/share/sdfs/lib/java-xmlbuilder-1.jar:/usr/share/sdfs/lib/jets3t-0.7.4.jar:/usr/share/sdfs/lib/commons-cli-1.2.jar" java-options="-Djava.library.path=/usr/share/sdfs/bin/ -Dorg.apache.commons.logging.Log=fuse.logging.FuseLog -Dfuse.logging.level=INFO -server -XX:+UseG1GC -Xmx3000m -Xmn400m" java-path="/usr/share/sdfs/jre1.7.0/bin/java"/>
  <sdfscli enable="true" enable-auth="true" listen-address="0.0.0.0" password="cbe709382e6ee7cf22116ffb1cb645df0be48b52bb378ddd895d9a0d561714b0" port="6442" salt="o3iVPw"/>
  <local-chunkstore allocation-size="214748364800" chunk-gc-schedule="0 0 0/4 * * ?" chunk-store="/opt/sdfs/volumes/pool0/chunkstore/chunks" chunk-store-dirty-timeout="1000" chunk-store-read-cache="5" chunkstore-class="org.opendedup.sdfs.filestore.FileChunkStore" enabled="true" encrypt="false" encryption-key="2CUKKdfav6PM28p71hX-r7sG@hutaV5bDFX" eviction-age="6" gc-class="org.opendedup.sdfs.filestore.gc.PFullGC" hash-db-store="/opt/sdfs/volumes/pool0/chunkstore/hdb" pre-allocate="false" read-ahead-pages="8">
    <network enable="true" hostname="0.0.0.0" port="2222" upstream-enabled="false" upstream-host="" upstream-host-port="2222" upstream-password="admin" use-udp="false"/>
  </local-chunkstore>
</subsystem-config>
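
Because the volume configuration is plain XML, its settings can be inspected with any XML parser. The following is a minimal sketch using the Python standard library. The summarize_volume_config function is hypothetical, the sample is trimmed to a few attributes, and the chunk-size value is assumed to be in kilobytes.

```python
import xml.etree.ElementTree as ET

# Trimmed sample in the shape of a volume configuration file.
SAMPLE = """<subsystem-config version="1.1.0">
  <io chunk-size="4" max-file-inactive="900" safe-close="false"/>
  <volume capacity="4TB" path="/opt/sdfs/volumes/pool0/files"/>
</subsystem-config>"""


def summarize_volume_config(xml_text):
    """Return a few settings from a volume configuration document."""
    root = ET.fromstring(xml_text)
    io = root.find("io")
    volume = root.find("volume")
    return {
        "chunk_size_kb": int(io.get("chunk-size")),  # assumed to be kilobytes
        "max_file_inactive_s": int(io.get("max-file-inactive")),
        "safe_close": io.get("safe-close") == "true",
        "capacity": volume.get("capacity"),
        "path": volume.get("path"),
    }


summary = summarize_volume_config(SAMPLE)
print(summary)
```

To inspect a real volume, the file contents at /etc/sdfs/<volume-name>-volume-cfg.xml could be passed in instead of the sample.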

SDFS Volumes are mounted with the mount.sdfs command. Mounting a volume is typically executed by running "mount.sdfs -v <volume-name> -m <mount-point>". As an example, "mount.sdfs -v sdfs -m /media/dedup" will mount the volume as configured by /etc/sdfs/sdfs-volume-cfg.xml to the path /media/dedup. Volume mounting options are as follows:

-m <arg>    mount point for SDFS file system
            e.g. /media/dedup
-o <arg>    fuse mount options.
            Will default to:
            direct_io,big_writes,allow_other,fsname=SDFS
-r <arg>    path to Dedup Storage Engine routing file.
            Will default to:
            /etc/sdfs/routing-config.xml
-v <arg>    sdfs volume to mount
            e.g. dedup
-vc <arg>   sdfs volume configuration file to mount
            e.g. /etc/sdfs/dedup-volume-cfg.xml

Volumes are unmounted automatically when the mount.sdfs process is killed or the volume is unmounted using the umount command.

Mounting SDFS Volumes as NFS Shares:

SDFS can be shared through NFS exports on Linux kernel 2.6.31 and above. It can be shared on kernel levels below that, but performance will suffer as you will need to disable the fuse direct_io option when mounting the SDFS filesystem. NFS opens and closes files with every read or write. File opens and closes are expensive for SDFS and as such can degrade performance when running over NFS. SDFS volumes can be optimized for NFS with the option "--io-safe-close=false" when creating the volume. This will leave files open for NFS reads and writes. File data will still be sync'd with every write command, so data integrity will still be maintained. Files will be closed after an inactivity period has been reached. By default this inactivity period is 15 minutes (900 seconds), but it can be changed at any time, along with the io-safe-close option, within the XML configuration file located at /etc/sdfs/<volume-name>-volume-cfg.xml.


Managing SDFS Volumes for Virtual Machines:

It was the original goal of SDFS to be a file system for virtual machines. To get proper deduplication rates for VMDK files, set io-chunk-size to "4" when creating the volume. This will usually match the chunk size of the guest OS file system. NTFS allows 32k chunk sizes, but not on root volumes. It may be advantageous, for Windows guest environments, to have the root volume on one mounted SDFS path at 4k chunk size and data volumes in another SDFS path at 32k chunk size. Then format the data NTFS volumes, within the guest, for 32k chunk sizes. This will provide optimal performance.

For the base OS VMDK, you may also want to set dedupAll to false after you have initially copied the data into the volume. This will prevent changes to the page file, or other files that change rapidly, from filling up the dedup storage engine. Take a look at "Inline vs Batch based deduplication for SDFS Volumes" for more detail.

SDFS provides a convenience function to create flat VMDK files quickly and in a fashion that will be easy to snapshot (see "File-system Snapshots for SDFS Volumes"). To use this convenience function run "setfattr -n user.cmd.vmdk.make -v 5556:<name of vmdk>:<size of vmdk> <path where vmdk will be placed>".
e.g.

setfattr -n user.cmd.vmdk.make -v 5556:win2k8-1:100GB dedup/vmfs/

Inline vs Batch based deduplication for SDFS Volumes:

SDFS provides the option to do either inline or batch based deduplication. Inline deduplication stores the data to the Dedup Storage Engine in real time. Batch based deduplication stores data to disk as a normal file unless a match is found in the Dedup Storage Engine to a previously persisted deduplicated chunk of data. Inline deduplication is perfect for backup but not as good for live data such as VMs or databases because there is a high rate of change for those types of files. Batch based deduplication has a lot of overhead since there is eventually twice as much IO per file, once to store it to disk and once to dedup it. By default, inline is enabled, but it can be disabled with the option "--io-dedup-files=false" when creating the volume. When this option is set to false all data will be stored in its original format until deduplication is set on a per file basis. To turn deduplication on, run the command "setfattr -n user.cmd.dedupAll -v 556:true <path to file>". Conversely, running "setfattr -n user.cmd.dedupAll -v 556:false <path to file>" will disable deduplication of a specific file from that point forward. Finally, batch based deduplicated files can be checked to see if duplicate chunks exist with the command "setfattr -n user.cmd.optimize -v 555:100 <path to file>".

Managing SDFS Volumes through extended attributes (Linux Only):

SDFS provides IO metrics and SDFS specific file/folder management through extended file attributes. Extended file attributes can be accessed through the setfattr and getfattr commands available on most Linux distributions. The command line for getting extended attributes for a specific file or folder is "getfattr -d <file-or-folder-name>".

Metrics attributes are as follows:

user.sdfs.ActualBytesWritten - The actual number of bytes written to the SDFS volume for this specific file. This does not include dedup chunks that were not written because they were already there.
user.sdfs.BytesRead - The actual number of bytes read for the file.
user.sdfs.DuplicateData - The amount of data within the file that is duplicate.
user.sdfs.UniqueData - The amount of data within the file that is unique.
user.sdfs.VMDK - Whether the file is a VMDK.
user.sdfs.VirtualBytesWritten - The amount of data that would have been written to the volume if it were not deduplicated. This includes dedup and non-dedup data.
user.sdfs.dedupAll - Whether all data will be deduped or not. True means all data will be deduped. See Inline vs Batch mode for more details.
user.sdfs.dfGUID - The GUID for the dedup file map. This can be used to determine the location of the map file on disk.
user.sdfs.file.isopen - True if the file is open.
user.sdfs.fileGUID - The metadata file GUID.

The commands to manage SDFS volumes by setting extended attributes are as follows:

user.cmd.cleanstore - Cleans the DSE, if local, of data older than the defined minutes e.g. setfattr -n user.cmd.cleanstore -v 5555:<minutes> <mount point> (See Data Chunks for more detail)
user.cmd.dedupAll - Sets the file to dedup all chunks or not. Set to true if you would like to dedup all chunks <unique-command-id:true or false>
user.cmd.file.flush - Flush write cache for the specified file <unique-command-id>
user.cmd.flush.all - Flush write cache for all files
user.cmd.ids.clearstatus - Clear all command id status
user.cmd.ids.status - Get the status of a specific command e.g. to get the status of command id 54333 run getfattr -n user.cmd.ids.status.54333
user.cmd.nextid - Returns a random GUID
user.cmd.optimize - Optimize the file by specifying a specific length <unique-command-id:length-in-bytes>
user.cmd.snapshot - Take a snapshot of a file or folder <unique-command-id:snapshotdst> (See File-System Snapshots)
user.cmd.vmdk.make - Creates a simple flat vmdk in this directory <unique-command-id:vmdkname:size(TB|GB|MB)>. The command must be executed on a directory. e.g. setfattr -n user.cmd.vmdk.make -v 5556:bigvserver:500GB /dir

Setting extended attributes takes the following command line structure :

setfattr -n <attribute> -v <attribute-value> <file or folder>

e.g.

setfattr -n user.cmd.vmdk.make -v 5556:win2k8-0:40GB dedup/vmfs/
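
When these commands are issued from scripts, the attribute value can be assembled programmatically. The helper below is a hypothetical sketch, not part of SDFS: it only builds the setfattr argument list in the structure shown above, with the command id and its arguments joined by colons. Actually running it requires setfattr and a mounted SDFS volume.

```python
import subprocess


def sdfs_command(attribute, command_id, target, *args):
    """Build the setfattr argument list for an SDFS command attribute.

    The attribute value is the command id followed by any arguments,
    joined with ':' (for example 5556:win2k8-0:40GB).
    """
    value = ":".join([str(command_id), *map(str, args)])
    return ["setfattr", "-n", attribute, "-v", value, target]


def run_sdfs_command(argv):
    """Execute the command; only meaningful on a host with a mounted SDFS volume."""
    subprocess.run(argv, check=True)


cmd = sdfs_command("user.cmd.vmdk.make", 5556, "dedup/vmfs/", "win2k8-0", "40GB")
print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.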

Managing SDFS Volumes through SDFS command line

On Windows, SDFS management is done through sdfscli. This command line executable provides access to management functions and information about a particular SDFS volume. The volume in question must be mounted when the command line is executed. Below are the command line parameters that can be run. Help is also available by running sdfscli --help.

usage: sdfs.cmd <options>

--archive-out <folder to archive out> Creates an archive tar for a particular file or folder and outputs the location.
--change-password <new password> Change the administrative password.
--cleanstore <minutes> Clean the dedup storage engine of data that is older than the defined minutes and is unclaimed by current files. This command only works if the dedup storage engine is local and not in network mode.
--debug Makes output more verbose.
--debug-info Returns debug information.
--dedup-file <true|false> Deduplicates all file blocks if set to true, otherwise it will only dedup blocks that are already stored in the DSE.
--dse-info Returns Dedup Storage Engine statistics.
--expandvolume <new size> Expand the volume, online, to a size in MB, GB, or TB.
--file-info Returns io file attributes such as dedup rate and file io statistics.
e.g. --file-info --file-path=<path to file or folder>
--file-path <RELATIVE PATH> The relative path to the file or folder to take action on.
e.g. --file-path=readme.txt or --file-path=file\file.txt
--flush-all-buffers Flushes all buffers within an SDFS file system.
--flush-file-buffers Flushes the buffer of a particular file.
e.g. --flush-file-buffers --file-path=<file to flush>
--help Display these options.
--import-archive <path to archive> Imports an archive created using archive-out.
e.g. --import-archive <archive created with archive-out> --file-path=<relative-folder-destination>
--password <password> Password to authenticate to the SDFS CLI interface for the volume.
--port <tcp port> SDFS CLI interface TCP listening port for the volume.
--server <host name/ip> SDFS host location.
--snapshot Creates a snapshot for a particular file or folder.
e.g. --snapshot --file-path=<source-file> --snapshot-path=<snapshot-destination>
--snapshot-path <RELATIVE PATH> The relative path to the destination of the snapshot.
e.g. --snapshot-path=snap-readme.txt or --snapshot-path=file\snap-file.txt
--volume-info Returns SDFS Volume statistics.

File-system Snapshots for SDFS Volumes:

SDFS provides snapshot functions for files and folders. To snapshot a file or folder you will need the setfattr command available on your system. On Ubuntu this is available through the attr package. The snapshot command is "setfattr -n user.cmd.snapshot -v 5555:<destination path> <snapshot source>". The destination path is relative to the mount point of the SDFS filesystem.

Dedup Storage Engine:

The Dedup Storage Engine (DSE) provides services to store, retrieve, and remove deduplicated chunks of data. By default, each SDFS Volume contains and manages its own DSE. When a DSE runs as part of a volume it is referred to as running in local mode. A DSE can also be configured to run as a network service and support many volumes across multiple physical servers. When a DSE is running as a network service it is referred to as running in network mode.

A DSE running in local mode is configured automatically when a volume is created, and works well in most environments. In local mode, all settings can be changed after volume creation. Keep in mind that if directory locations are changed, data will need to be migrated to the new directories as well.

The advantage of running the DSE in network mode is that it allows for global deduplication and better scalability. The disadvantage is complexity of configuration.

A DSE is run in network mode through the startDSE.sh script. It requires the path to the DSE configuration file. A sample DSE configuration file is located at "etc/hashserver-config.xml" in the directory of the SDFS package. The parameters are almost identical to those in a local volume configuration and look as follows:

<chunk-server>
<network port="2222" hostname="0.0.0.0" use-udp="false"/>
<locations chunk-store="/opt/dchunks/chunkstore/chunks" hash-db-store="/opt/ddb/hdb"/>
<chunk-store pre-allocate="false" chunk-gc-schedule="0 0 0/2 * * ?" eviction-age="4" allocation-size="161061273600" page-size="4096" read-ahead-pages="8"/>
</chunk-server>

To run the DSE execute :

./startDSE.sh <location-of-config-file>

e.g.

./startDSE.sh etc/hashserver-config.xml

To enable a volume to use a DSE requires 4 steps:

  1. Make sure the chunk sizes are exactly the same for the volume and the DSE. The DSE config determines chunk size with the attribute "page-size" in bytes.
  2. Edit the volume configuration file and change the attribute enabled="true" to enabled="false" within the "local-chunkstore" element.
  3. Edit the server element within the routing config file to point at the DSE. A sample routing config file is located at etc/routing-config.xml
  4. Mount the volume with the "-r" option
    e.g. ./mount.sdfs -r etc/routing-config.xml -v sdfs -m /media/dedup
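
The first step can be checked before mounting. The sketch below is a hypothetical helper, not part of SDFS. It assumes the volume's chunk-size attribute is expressed in kilobytes while the DSE's page-size attribute is in bytes, and it uses trimmed inline samples in place of the real configuration files.

```python
import xml.etree.ElementTree as ET

# Trimmed samples; real files carry many more attributes.
VOLUME_CFG = '<subsystem-config><io chunk-size="4"/></subsystem-config>'
DSE_CFG = '<chunk-server><chunk-store page-size="4096"/></chunk-server>'


def chunk_sizes_match(volume_xml, dse_xml):
    """Compare the volume chunk size (assumed KB) with the DSE page size (bytes)."""
    volume_kb = int(ET.fromstring(volume_xml).find("io").get("chunk-size"))
    dse_bytes = int(ET.fromstring(dse_xml).find("chunk-store").get("page-size"))
    return volume_kb * 1024 == dse_bytes


print(chunk_sizes_match(VOLUME_CFG, DSE_CFG))
```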

Dedup Storage Engine - Cloud Based Deduplication:

The DSE can be configured to store data to the Amazon S3 cloud storage service. When enabled, all unique blocks will be stored to an S3 bucket of your choosing. Data can be encrypted before transit and at rest in the S3 cloud using AES-256 bit encryption. In addition, all data is compressed by default before being sent to the cloud.

The purpose of deduplicating data before sending it to cloud storage is to minimize storage and maximize write performance. The concept behind deduplication is to only store unique blocks of data. If only unique data is sent to cloud storage, bandwidth can be optimized and cloud storage can be reduced. Opendedup approaches cloud storage differently than a traditional cloud based file system. The volume data, such as the name space and file metadata, is stored locally on the system where the SDFS volume is mounted. Only the unique chunks of data are stored at the cloud storage provider. This ensures maximum performance by allowing all file system functions to be performed locally except for data reads and writes. In addition, local read and write caching should make writing smaller files transparent to the user or service writing to the volume.

Cloud based storage has been enabled for the Amazon Web Services S3 service. To create a volume using the S3 provider follow these steps:

1. Go to http://aws.amazon.com and create an account.
2. Sign up for S3 data storage.
3. Get your Access Key ID and Secret Access Key.
4. Make an SDFS volume using the following parameters:
./mkfs.sdfs --volume-name=<volume name> --volume-capacity=<volume capacity> --aws-enabled=true --aws-access-key=<the aws assigned access key> --aws-bucket-name=<a universally unique bucket name such as the aws-access-key> --aws-secret-key=<assigned aws secret key> --chunk-store-encrypt=true
5. Mount volume and go to town!
./mount.sdfs -v <volume name> -m <mount point>

Dedup Storage Engine Memory:

The mount.sdfs shell script currently allocates up to 2GB of RAM for the SDFS file system. This is fine for SDFS file systems of around 200GB for 4k chunk size and around 6TB for 128k chunk size. To expand the memory, edit the "-Xmx2g" within mount.sdfs or the startDSE.sh script to something better for your environment. Each stored chunk takes up approximately 25 bytes of RAM. To calculate how much RAM you will need for a specific volume, divide the volume size (in bytes) by the chunk size (in bytes) and multiply the result by 25.

Memory Requirements Calculation:

(volume size in bytes / chunk size in bytes) * 25 bytes
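
As a worked example of this calculation, the sketch below applies the 25 bytes per chunk figure to a 1 TiB volume at two chunk sizes. The function name is hypothetical, and the result covers only the chunk index entries, not any other memory the JVM needs.

```python
BYTES_PER_CHUNK_ENTRY = 25  # approximate RAM per stored chunk, as stated in this guide


def dse_memory_bytes(volume_size_bytes, chunk_size_bytes):
    """Estimate the RAM needed to index a volume of the given size."""
    chunks = -(-volume_size_bytes // chunk_size_bytes)  # ceiling division
    return chunks * BYTES_PER_CHUNK_ENTRY


TIB = 1024 ** 4
GIB = 1024 ** 3
for label, chunk_size in (("4k", 4 * 1024), ("128k", 128 * 1024)):
    needed = dse_memory_bytes(1 * TIB, chunk_size)
    print(f"1 TiB volume, {label} chunks: {needed / GIB:.2f} GiB")
```

For 4k chunks this comes to 268,435,456 chunks and about 6.25 GiB of index memory per TiB; for 128k chunks the same volume needs roughly 0.2 GiB.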

Routing Config File:

The routing config file is used by the SDFS file-system service to route where chunks of data will be persisted when running in network mode. Blocks of data are routed based on the first byte of the Tiger hash of the block. This allows for maximum scalability in a RAIN configuration.

The routing config file is comprised of two sections. The first section is the server definition section. The server name is a unique name used to identify the server within the configuration file. The host and port determine the host IP/name and port for the DSE. The compress and enable-udp attributes are not currently used. Network threads are the number of TCP connections the client will use to connect to the DSE.

<routing-config>
<servers>
<server name="server1" host="127.0.0.1" port="2222" enable-udp="false" compress="false" network-threads="8"/>
<server name="server2" host="127.0.0.1" port="2222" enable-udp="false" compress="false" network-threads="8"/>
</servers>

The second section is chunks. This section determines which data chunk will be stored at which server. The "name" is the first byte of the chunk hash, in hexadecimal, and the "server" is the name, as defined in the servers section, of the server where hashes that begin with that byte will be stored.

<chunks>
<chunk name="00" server="server1"/>
<chunk name="01" server="server1"/>
<chunk name="02" server="server1"/>
<chunk name="03" server="server1"/>
<chunk name="04" server="server1"/>

...(256 entries)

<chunk name="fe" server="server2"/>
<chunk name="ff" server="server2"/>
</chunks>
</routing-config>
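
Writing 256 chunk entries by hand is tedious, so the section can be generated. The following is a minimal sketch with hypothetical function names. It assumes contiguous ranges of first bytes are assigned to each server, which matches the sample above where low values go to server1 and high values to server2.

```python
def chunk_routes(servers):
    """Assign each possible first hash byte (00..ff) to a server in contiguous ranges."""
    routes = {}
    for value in range(256):
        server = servers[value * len(servers) // 256]
        routes[f"{value:02x}"] = server
    return routes


def render_chunks(routes):
    """Render the <chunks> section of a routing config file."""
    lines = ["<chunks>"]
    lines += [f'<chunk name="{name}" server="{server}"/>' for name, server in routes.items()]
    lines.append("</chunks>")
    return "\n".join(lines)


def server_for_hash(routes, digest):
    """Look up the server responsible for a chunk from the first byte of its hash."""
    return routes[f"{digest[0]:02x}"]


routes = chunk_routes(["server1", "server2"])
print(render_chunks(routes).splitlines()[1])
print(server_for_hash(routes, bytes.fromhex("fe01")))
```

With two servers, bytes 00 through 7f are routed to the first server and 80 through ff to the second.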

Scalability through a RAIN Configuration:

SDFS can be set up in a RAIN configuration that allows for maximum throughput and scalability. To set up SDFS in this configuration, run the DSE on multiple nodes and set up the SDFS volumes to use network mode. Finally, edit the routing config file to evenly allocate chunks to the specific nodes. The sample routing config file is already set up to share data across 2 nodes.

Data Chunks:

The chunk size must match for both the SDFS Volume and the Deduplication Storage Engine. The default for SDFS is to store chunks at 128k size. This size provides optimal memory performance and IO throughput. The chunk size must be set at volume and Deduplication Storage Engine creation. When Volumes are created with their own local Deduplication Storage Engine chunk sizes are matched up automatically, but, when the Deduplication Storage Engine is run as a network service this must be set before the data is stored within the engine.


Within an SDFS volume the chunk size is set upon creation with the option --io-chunk-size. This option sets the size of chunks that are hashed and can only be changed before the file system is mounted for the first time. The default setting is 128k but can be set as low as 4k. The size of chunks determines the efficiency with which files will be deduplicated, at the cost of RAM. As an example, a 4k chunk size is perfect for Virtual Machines (VMDKs) because it matches the cluster size of most guest OS file systems, but it can cost as much as 8GB of RAM per 1TB to store. In contrast, setting the chunk size to 128k is perfect for archived data, such as backups, and will allow you to store as much as 32TB of data with the same 8GB of memory.


To create a volume that will store VMs (VMDK files) create a volume using 4k chunk size as follows:
sudo ./mkfs.sdfs --volume-name=sdfs_vol1 --volume-capacity=150GB --io-chunk-size=4k --io-max-file-write-buffers=32

As stated, when running SDFS Volumes with a local DSE, chunk sizes are matched automatically. If running the DSE as a network service, a parameter within the DSE configuration XML file will need to be set before any data is stored. The parameter is:

page-size="<chunk-size in bytes>".

As an example to set a 4k chunk size the option would need to be set to:

page-size="4096"

File and Folder Placement:

Deduplication is IO intensive. SDFS, by default, writes data to /opt/sdfs. SDFS does a lot of writes when persisting data and a lot of random IO when reading data. For highly IO intensive applications it is suggested that you split at least the chunk-store-data-location and chunk-store-hashdb-location onto fast and separate physical disks. From experience these are the most IO intensive stores and could take advantage of faster IO.

Other options and extended attributes:

SDFS uses extended attributes to manipulate the SDFS file system and the files contained within. They are also used to report on IO performance. To get a list of commands and readable IO statistics run "getfattr -d *" within the mount point of the SDFS file system.

Linux e.g.

user@server:/media/dedup$ getfattr -d * .

Windows e.g.

sdfscli --file-info --file-path=<path to file or folder>

SDFS Volume Replication:

SDFS now provides asynchronous master/slave volume and subvolume replication through the sdfsreplicate service and script. SDFS volume replication takes a snapshot of the designated master volume or subfolder and then replicates metadata and unique blocks to the secondary, or slave, SDFS volume. Only unique blocks that are not already stored on the slave volume are replicated, so data transfer should be minimal. The benefits of SDFS Replication are:

* Fast replication - SDFS can replicate large volume sets quickly.
* Reduced bandwidth - Only unique data is replicated between volumes.
* Built in scheduling - The sdfsreplicate service has a built in scheduling engine based on cron style syntax.
* Sub-volume replication - The sdfsreplicate service can replicate volumes or subfolders to slave volumes. In addition, replication can be targeted to sub-volumes on the slave.
* Sub-volume targets on the slave allow for wildcard naming such as an appended timestamp or the hostname of the master.
* Replicate to S3 (Cloud based Storage) with limited bandwidth requirements.

The steps SDFS uses to perform asynchronous replication are the following:

1. The sdfsreplicate service, on the slave volume host, requests a snapshot of the master volume or subfolder.
2. The master volume creates a snapshot of all SDFS metadata and data maps.
3. The master volume tars and zips the snapshot metadata and data maps.
4. The sdfsreplicate service, on the slave volume host, downloads the snapshot tar.
5. The slave volume unzips and imports the tar to its volume structure.
6. The slave volume imports data associated with the master snapshot to its dedup storage engine from the master volume.
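
The bandwidth saving in the last step comes from transferring only chunks the slave does not already hold. Conceptually this is a set difference, sketched below with hypothetical names and short placeholder strings in place of real chunk hashes. It illustrates the idea only and is not how sdfsreplicate is implemented.

```python
def blocks_to_transfer(snapshot_file_maps, slave_hashes):
    """Return the chunk hashes referenced by the snapshot that the slave lacks."""
    referenced = set()
    for file_map in snapshot_file_maps.values():
        referenced.update(file_map)
    return referenced - set(slave_hashes)


# Hypothetical snapshot: two files that share most of their chunks.
snapshot = {
    "vm1.vmdk": ["h1", "h2", "h3"],
    "vm2.vmdk": ["h1", "h2", "h4"],
}
slave_store = {"h1", "h2"}
missing = blocks_to_transfer(snapshot, slave_store)
print(sorted(missing))
```

Six chunk references in the snapshot result in only two chunks being sent.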

The steps required to setup master/slave replication are the following:

1. Configure your SDFS master volume to allow replication. This is done by creating an SDFS volume with the command line parameter "--enable-replication-master". e.g. mkfs.sdfs --volume-name=vol0 --volume-capacity=1TB --io-chunk-size=4 --chunk-store-size=200GB --enable-replication-master
2. Configure your SDFS slave volume to allow replication. This is done by creating an SDFS volume with the command line parameters "--enable-replication-slave" and "--replication-master=<master-ip-or-hostname>". e.g. mkfs.sdfs --volume-name=vol0-slave --volume-capacity=1TB --io-chunk-size=4 --chunk-store-size=200GB --enable-replication-slave --replication-master=192.168.0.12
3. Configure the replication.props configuration file on the slave. An example of this file is included in the etc/sdfs directory and includes the following parameters:
#Replication master settings
#IP address of the server where the master volume is located
replication.master=master-ip
#the password of the master. This defaults to "admin"
replication.master.password=admin
#The sdfscli port on the master server. This defaults to 6442
replication.master.port=6442
#The folder within the volume that should be replicated. If you would like to replicate the entire volume use "/"
replication.master.folder=/
#Replication slave settings
#The local ip address that the sdfscli is listening on for the slave volume.
replication.slave=localhost
#the password used on the sdfscli for the slave volume. This defaults to admin
replication.slave.password=admin
#The tcp port the sdfscli is listening on for the slave
replication.slave.port=6442
#The folder where you would like to replicate to. Wild cards are %d (date as yyMMddHHmmss) and %h (remote host)
#e.g. backup-%h-%d will output "backup-<master-name>-<timestamp>"
replication.slave.folder=backup-%h-%d
#Replication service settings
#The folder where the SDFS master snapshot will be downloaded to on the slave. The snapshot tar archive is deleted after import.
archive.staging=/tmp
#The log file that will output replication status
logfile=/var/log/sdfs/replication.log
#Schedule cron = as a cron job, single = run one time
schedule.type=cron
#Every 30 minutes. Take a look at http://www.quartz-scheduler.org/documentation/quartz-2.x/tutorials/tutorial-lesson-06 for a scheduling tutorial
schedule.cron=0 0/30 * * * ?

4. Run the sdfsreplicate script on the slave. This will either run once and exit if schedule.type=single or will run continuously with schedule.type=cron.

e.g. ./sdfsreplicate /etc/sdfs/replication.props
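
To preview how the slave folder wildcards expand, the properties file can be read with a few lines of code. This is a hypothetical sketch, not the parser sdfsreplicate uses. It assumes simple key=value lines with "#" comments and maps the yyMMddHHmmss date format to its strftime equivalent.

```python
from datetime import datetime


def parse_props(text):
    """Parse key=value lines, ignoring blank lines and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def expand_slave_folder(pattern, master_host, when):
    """Expand %h (master host) and %d (yyMMddHHmmss timestamp) in a folder pattern."""
    stamp = when.strftime("%y%m%d%H%M%S")
    return pattern.replace("%h", master_host).replace("%d", stamp)


SAMPLE = """
#Replication master settings
replication.master=192.168.0.12
replication.slave.folder=backup-%h-%d
schedule.type=cron
"""

props = parse_props(SAMPLE)
folder = expand_slave_folder(
    props["replication.slave.folder"],
    props["replication.master"],
    datetime(2013, 1, 2, 3, 4, 5),  # fixed timestamp for a repeatable example
)
print(folder)
```

With the fixed timestamp above, the pattern expands to backup-192.168.0.12-130102030405.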

Data Chunk Removal:

SDFS uses two methods to remove unused data from a Dedup Storage Engine (DSE). If the SDFS volume has its own dedup storage engine, which it does by default, unused (orphaned) chunks are removed as the size of the DSE increases at 10% increments. The process is as follows:

1. The SDFS Volume scans all files, claims their chunks, and informs the DSE what chunks are currently in use. This happens when chunks are first stored and then every time the chunk store grows by 10%.
2. The DSE checks for data that has not been claimed by the file system and time stamps all data that has been claimed by the volume.
3. The chunks that have not been claimed by the volume are de-referenced and put into a pool for re-use as new data is written to the DSE.

If the DSE is decoupled from the SDFS volume, a batch process is used to remove unused blocks of hashed data. This process is used because the file-system is decoupled from the back end storage (Dedup Storage Engine) where the actual data is held. As hashed data becomes stale it is removed from the Dedup Storage Engine. The process for determining and removing stale chunks is as follows:

  1. SDFS file-system informs the Dedup Storage Engine what chunks are currently in use. This happens when chunks are first created and then every 2 hours on the hour after that.
  2. The DSE checks for data that has not been claimed in the last 8 hours upon mount and then every 4 hours after that.
  3. The chunks that have not been claimed in the last 10 hours upon mount and 6 hours after that are put into a pool and overwritten as new data is written to the Dedup Storage Engine.

The Dedup Storage Engine can be cleaned manually by running:

sdfscli --cleanstore=<minutes>

The size of the chunks.chk file will not diminish; rather, SDFS will re-allocate space that was already written to but is now unclaimed.

As stated above, the volume claims chunks every two hours when the DSE is decoupled from the SDFS Volume. This can be configured to happen more or less frequently by editing the SDFS configuration file and modifying the "claim-hash-schedule" attribute. Claims should always occur more frequently than the "eviction-age" attribute set for the DSE ("chunk-store" tag).

The DSE claim schedule can be modified through the "chunk-gc-schedule" attribute. Again, this should occur more frequently than the "eviction-age" attribute set for the DSE ("chunk-store" tag).

Finally, the "eviction-age" is set in hours and by default it is 6. This can be changed but should be greater than the intervals set by the "claim-hash-schedule" and "chunk-gc-schedule".
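
The relationship between the three settings can be expressed as a simple check, together with a sketch of which chunks would be eligible for eviction. Both functions are hypothetical illustrations: intervals are given directly in hours rather than parsed from cron expressions, and the timestamps are made up.

```python
def validate_schedule(claim_interval_h, gc_interval_h, eviction_age_h):
    """Check that claims and garbage collection both run more often than chunks expire."""
    return claim_interval_h < eviction_age_h and gc_interval_h < eviction_age_h


def evictable_chunks(last_claimed, now_h, eviction_age_h):
    """Return the chunks whose last claim is older than the eviction age (hours)."""
    return sorted(name for name, claimed_at in last_claimed.items()
                  if now_h - claimed_at > eviction_age_h)


# Intervals described in this section: claims every 2 hours, GC every 4 hours,
# eviction age of 6 hours.
print(validate_schedule(2, 4, 6))

# Hypothetical claim times, in hours since mount.
last_claimed = {"chunk-a": 9.0, "chunk-b": 2.0}
print(evictable_chunks(last_claimed, now_h=10.0, eviction_age_h=6))
```

If the claim interval were longer than the eviction age, chunks still in use could be evicted before the volume had a chance to claim them again.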

All of this is configurable and can be changed after a volume is written to. Take a look at cron format for more details.

Troubleshooting:

There are a few common errors with simple fixes.

1. OutOfMemoryError - This is caused by the memory requirements of the Dedup Storage Engine being larger than the heap size allocated for the JVM. To fix this edit the mount.sdfs script and increase the -Xmx2g to something larger (e.g. -Xmx4g).

2. java.io.IOException : Too Many Open Files - This is caused by there not being enough available file handles for underlying filesystem processes. To fix this add the following lines to /etc/security/limits.conf and then log in again or restart your system.

* soft nofile 65535
* hard nofile 65535

References:

1. Wikipedia - Tiger (cryptography)
2. Wikipedia - Data Deduplication