Cloud Based Deduplication Quick Start Guide


It is now possible to store dedup chunks in the Amazon S3 cloud storage service or in Microsoft Azure. This lets you store data without being limited by local storage capacity for the chunk data. AES 256-bit encryption and compression (default) are provided for data stored in the cloud. It is suggested that the chunk size be left at the default (128k) to allow for maximum compression and the fewest round trips for data.
The purpose of deduplicating data before sending it to cloud storage is to minimize storage use and maximize write performance. Deduplication stores only unique blocks of data. If only unique data is sent to cloud storage, less bandwidth is used and less cloud storage is consumed. Opendedup approaches cloud storage differently than a traditional cloud based file system. Volume data, such as the namespace and file metadata, is stored locally on the system where the SDFS volume is mounted. Only the unique chunks of data are stored at the cloud storage provider. This ensures maximum performance by allowing all file system functions to be performed locally except for data reads and writes. In addition, local read and write caching should make writing smaller files transparent to the user or service writing to the volume.
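The idea of sending only unique chunks can be illustrated with a small shell sketch. This is not how SDFS is implemented; it only demonstrates the concept. It assumes GNU coreutils (`split`, `sha256sum`, `head`), uses a local directory as a stand-in for the cloud bucket, and cuts a sample file into fixed 128 KiB chunks. All file and variable names are hypothetical.

```shell
#!/bin/sh
# Concept sketch only: fixed-size chunking plus hash-based deduplication.
set -eu

WORK=$(mktemp -d)
STORE="$WORK/store"     # stands in for the cloud bucket (unique chunks only)
CHUNKS="$WORK/chunks"   # temporary chunk files cut from the sample file
mkdir -p "$STORE" "$CHUNKS"

CHUNK_SIZE=131072       # 128 KiB, matching the suggested default chunk size

# Build a 4-chunk sample file: three identical zero-filled chunks and one
# chunk filled with "B\n" lines.
{
  head -c "$CHUNK_SIZE" /dev/zero
  yes B | head -c "$CHUNK_SIZE"
  head -c "$CHUNK_SIZE" /dev/zero
  head -c "$CHUNK_SIZE" /dev/zero
} > "$WORK/sample.bin"

# Cut the file into fixed-size chunks.
split -b "$CHUNK_SIZE" "$WORK/sample.bin" "$CHUNKS/chunk."

total=0
for c in "$CHUNKS"/chunk.*; do
  hash=$(sha256sum "$c" | cut -d' ' -f1)
  total=$((total + 1))
  # "Upload" the chunk only if no chunk with the same hash is stored yet.
  [ -e "$STORE/$hash" ] || cp "$c" "$STORE/$hash"
done

unique=$(ls "$STORE" | wc -l | tr -d ' ')
echo "chunks=$total unique=$unique"   # prints: chunks=4 unique=2

rm -rf "$WORK"
```

Here four chunks are written to the file system, but only two would need to travel to the storage provider.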
Requirements:
Read the quickstart guide first!
To set up AWS-enabled deduplication volumes, follow these steps:
1. Go to http://aws.amazon.com and create an account.
2. Sign up for S3 data storage.
3. Get your Access Key ID and Secret Access Key.
4. Make an SDFS volume using the following parameters:
./mkfs.sdfs --volume-name=<volume name> --volume-capacity=<volume capacity> --aws-enabled=true --cloud-access-key=<the aws assigned access key> --cloud-bucket-name=<a universally unique bucket name such as the aws-access-key> --cloud-secret-key=<assigned aws secret key> --chunk-store-encrypt=true
5. Mount the volume and go to town!
./mount.sdfs <volume name> <mount point>
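As a rough illustration of steps 4 and 5 with the placeholders filled in, the sketch below assembles both commands and prints them for review rather than running them. Every value is a hypothetical placeholder, and the capacity format (`100GB`) is an assumption; only the flags already shown above are used.

```shell
#!/bin/sh
# Dry-run sketch: all values are hypothetical placeholders, substitute your own.
VOLUME_NAME="pool0"
VOLUME_CAPACITY="100GB"               # assumed capacity format
ACCESS_KEY="EXAMPLE-ACCESS-KEY-ID"    # AWS Access Key ID (placeholder)
SECRET_KEY="EXAMPLE-SECRET-KEY"       # AWS Secret Access Key (placeholder)
BUCKET_NAME="sdfs-pool0-example"      # must be unique across all of S3
MOUNT_POINT="/media/pool0"

# Step 4: create the volume, using only the flags from the command above.
MKFS_CMD="./mkfs.sdfs --volume-name=$VOLUME_NAME --volume-capacity=$VOLUME_CAPACITY --aws-enabled=true --cloud-access-key=$ACCESS_KEY --cloud-bucket-name=$BUCKET_NAME --cloud-secret-key=$SECRET_KEY --chunk-store-encrypt=true"

# Step 5: mount the volume.
MOUNT_CMD="./mount.sdfs $VOLUME_NAME $MOUNT_POINT"

# Print the commands so they can be checked before being run by hand.
echo "$MKFS_CMD"
echo "$MOUNT_CMD"
```

Printing the commands first makes it easy to catch a mistyped key or bucket name. Keep in mind that a secret key passed on the command line may end up in shell history.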
To set up Azure-enabled deduplication volumes, follow these steps:
1. Make an SDFS volume using the following parameters:
./mkfs.sdfs --volume-name=<volume name> --volume-capacity=<volume capacity> --azure-enabled=true --cloud-access-key=<storage account> --cloud-bucket-name=<the bucket name> --cloud-secret-key=<primary access key> --chunk-store-encrypt=true
2. Mount the volume and go to town!
./mount.sdfs <volume name> <mount point>
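A mistyped bucket name is an easy way to make volume creation fail, so it can help to check the name before calling mkfs.sdfs. The sketch below assumes that the value of --cloud-bucket-name is used as an Azure blob container name, which this guide does not state. Under that assumption it applies Azure's container naming rules (3 to 63 characters; lowercase letters, digits, and hyphens; no leading, trailing, or consecutive hyphens). The function name and all values are hypothetical.

```shell
#!/bin/sh
# Sketch: validate a candidate name against Azure container naming rules
# before printing the mkfs.sdfs command. All values are placeholders.
LC_ALL=C   # make the a-z range in the patterns below match ASCII lowercase only

is_valid_container_name() {
  name=$1
  len=${#name}
  # Length must be between 3 and 63 characters.
  [ "$len" -ge 3 ] && [ "$len" -le 63 ] || return 1
  case $name in
    *[!a-z0-9-]*) return 1 ;;  # only lowercase letters, digits, hyphens
    -*|*-)        return 1 ;;  # must start and end with a letter or digit
    *--*)         return 1 ;;  # no consecutive hyphens
  esac
  return 0
}

CONTAINER_NAME="sdfs-pool0"
if is_valid_container_name "$CONTAINER_NAME"; then
  # Dry run: print the command instead of executing it.
  echo "./mkfs.sdfs --volume-name=pool0 --volume-capacity=100GB --azure-enabled=true --cloud-access-key=examplestorageacct --cloud-bucket-name=$CONTAINER_NAME --cloud-secret-key=EXAMPLE-PRIMARY-KEY --chunk-store-encrypt=true"
else
  echo "invalid container name: $CONTAINER_NAME" >&2
fi
```

A name such as `My_Bucket` would be rejected by this check because of the uppercase letters and the underscore.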






