Opendedup

Global Deduplication


Performance Metrics


Introduction:

SDFS takes a unique approach to storing data, and its performance characteristics vary widely from traditional kernel-based file systems. Current tools for measuring performance do not always accurately depict the performance of SDFS. To illustrate the performance of the file system, three different tools were used: dd, a custom random file writer, and bonnie++. SDFS was tested at 128k, 64k, 32k, and 4k block sizes to accurately depict the roughly linear nature of SDFS IO performance across the different block sizes.
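As a rough illustration of the dd-based portion of the testing, a sequential write can be timed against a mounted volume like this. This is a minimal sketch, not the exact harness used here; the target directory is a temporary stand-in that you would point at a real SDFS mount.

```shell
# Sketch of a dd write test; MNT is a stand-in for a real SDFS mount point.
MNT=$(mktemp -d)
# 64 MB of zeros (fully dedupable) written in 128k blocks; dd reports throughput.
dd if=/dev/zero of="$MNT/zeros.bin" bs=128k count=512 conv=fsync
# 64 MB of random (100% unique) data for comparison.
dd if=/dev/urandom of="$MNT/random.bin" bs=128k count=512 conv=fsync
```

Comparing the two dd runs gives a quick feel for how dedupable versus unique data behaves on the same volume.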

Testing of SDFS 1.0.5:

SDFS performance depends mostly on the percentage of data within a dataset that is unique. To test performance accurately, a sample test harness was created that read, and subsequently wrote, sample data with various percentages of unique data. The test was also performed using various chunk sizes within SDFS: 4k, 8k, 16k, 32k, 64k, and 128k. In addition, a volume was mounted on the same hardware with the XFS filesystem for comparison.
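The core of such a harness is generating files with a controlled percentage of unique data. A minimal sketch, assuming `make_testfile` is a hypothetical helper (not part of SDFS): the first N% of chunks are random bytes that never deduplicate, and the remainder repeats zero-filled chunks that all deduplicate to a single stored chunk.

```shell
# Hypothetical helper: build a file that is pct_unique% unique at chunk_kb granularity.
make_testfile() {
  out=$1; pct_unique=$2; chunk_kb=$3; total_mb=$4
  total_chunks=$(( total_mb * 1024 / chunk_kb ))
  uniq_chunks=$(( total_chunks * pct_unique / 100 ))
  dup_chunks=$(( total_chunks - uniq_chunks ))
  # Unique portion: random bytes never dedupe.
  dd if=/dev/urandom of="$out" bs="${chunk_kb}k" count="$uniq_chunks" 2>/dev/null
  # Duplicate portion: zero chunks all dedupe to one stored chunk.
  dd if=/dev/zero bs="${chunk_kb}k" count="$dup_chunks" 2>/dev/null >> "$out"
}
# Example: a 16 MB file that is 25% unique with 128k chunks.
make_testfile /tmp/t25.bin 25 128 16
```

Writing such files to the mounted volume at each chunk size yields the unique-percentage sweep described above.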

SDFS performance in the 1.0.5 release is much improved over previous releases, and testing was performed to demonstrate the improvement.

The configuration for this test differs from that above and includes the following:

  • Hardware Brand : Generic white box workstation.
  • CPU : Intel i7 920 (4 core) (not overclocked)
  • RAM : 16 GB (2 GB Used for testing)
  • Hard Drives : 3 100 GB SSD drives, RAID 0 using software RAID
    • 3 Drives for SDFS Volume storage (XFS)
    • 1 Drive for OS (1 TB SATA Drive)
  • OS : Ubuntu 11.04 (x64)
  • SDFS Version : 1.0.5
    • Standard SDFS Configuration
    • Local Mode with 1 volume
    • 128k, 64k, 32k, and 4k block sizes tested

SDFS Write Performance:

Write performance was tested using data sets with predetermined rates of unique data within. Data sets with 0%, 25%, 50%, 75%, and 100% unique data were chosen for comparison at different SDFS chunk sizes. Each data set test contained 5 unique 1 GB files that were simultaneously written to the mounted SDFS or XFS volume to compare performance. Each file contained a predetermined percentage of unique data per test.
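The five simultaneous writes can be sketched as below. This is a scaled-down illustration (16 MB files instead of 1 GB), and the target directory is a stand-in for the mounted SDFS or XFS volume.

```shell
DIR=$(mktemp -d)   # stand-in for the mounted SDFS or XFS volume
SECONDS=0          # bash built-in elapsed-time counter
for i in 1 2 3 4 5; do
  # The article used 1 GB files; 16 MB here keeps the sketch quick.
  dd if=/dev/zero of="$DIR/file$i.bin" bs=128k count=128 2>/dev/null &
done
wait               # aggregate throughput = (5 * file size) / elapsed seconds
echo "wrote 5 files in ${SECONDS}s"
```

Dividing the total bytes written by the elapsed time gives the aggregate MB/s figures reported below.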

Here are the results:

[Graph: Write Performance of SDFS 1.0.5]

As indicated above, XFS performed better initially. This is suspected to be due to write caching within the array rather than actual IO performance, because XFS IO performance dropped off drastically later in the testing, as the graph above shows.

Write performance in the testing for SDFS was limited by CPU in all cases, and performance could reach much higher rates on a more powerful CPU configuration.

Below are the detailed results for copying data that is 100% duplicate to SDFS at various chunk size configurations, and to XFS.

XFS RAID 0 = 731 MB/s 
128k Chunk Size = 614 MB/s
64k Chunk Size = 640 MB/s
32k Chunk Size = 610 MB/s
16k Chunk Size = 591 MB/s
8k Chunk Size = 512 MB/s
4k Chunk Size = 468 MB/s


Below are the detailed results for copying 100% unique data to an SDFS volume. As you can see, SDFS performed on par with or better than XFS to the same disks.

[Graph: SDFS Write Performance, 100% unique data]

XFS RAID 0 = 349 MB/s 
128k Chunk Size = 359 MB/s
64k Chunk Size = 354 MB/s
32k Chunk Size = 349 MB/s
16k Chunk Size = 351 MB/s
8k Chunk Size = 341 MB/s
4k Chunk Size = 309 MB/s


SDFS Read Performance:

Read performance, like write performance, was tested using data sets with predetermined rates of unique data within. Data sets with 0%, 25%, 50%, 75%, and 100% unique data were chosen for comparison at different SDFS chunk sizes. Each data set test contained 5 unique 1 GB files that were simultaneously read from the mounted SDFS or XFS volume to compare performance. Each file contained a predetermined percentage of unique data per test.
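The read side of the test can be sketched the same way: the files are read back concurrently with dd. For a cold-cache measurement you would first drop the page cache (shown as a comment, since it requires root); the directory below is again a temporary stand-in for the mounted volume.

```shell
DIR=$(mktemp -d)   # stand-in for the mounted volume; pre-populate it first
for i in 1 2 3 4 5; do
  dd if=/dev/zero of="$DIR/f$i.bin" bs=128k count=64 2>/dev/null
done
# For a cold-cache measurement (requires root):
#   sync && echo 3 > /proc/sys/vm/drop_caches
SECONDS=0
for i in 1 2 3 4 5; do
  dd if="$DIR/f$i.bin" of=/dev/null bs=128k 2>/dev/null &
done
wait
echo "read 5 files in ${SECONDS}s"
```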

Here are the results:

[Graph: SDFS Read Performance]

As indicated by the graph above, SDFS performance drops off as more unique data sets are read. The drop-off is due to the IOPS required for SDFS block data retrieval, and it leaves room for improvement and code optimization. In general, across VMDK data sets SDFS will perform better than standard XFS volumes, because VMDK files are by nature 90-95% duplicate data.

Below are the detailed results for reading data that is 100% duplicate from SDFS at various chunk size configurations, and from XFS.

XFS RAID 0 = 863 MB/s 
128k Chunk Size = 3840 MB/s
64k Chunk Size = 3840 MB/s
32k Chunk Size = 3840 MB/s
16k Chunk Size = 3840 MB/s
8k Chunk Size = 2560 MB/s
4k Chunk Size = 1920 MB/s

Below are the detailed results for reading 100% unique data from an SDFS volume.

[Graph: SDFS Read Performance, 100% unique data]

XFS RAID 0 = 668 MB/s 
128k Chunk Size = 526 MB/s
64k Chunk Size = 452 MB/s
32k Chunk Size = 452 MB/s
16k Chunk Size = 341 MB/s
8k Chunk Size = 284 MB/s
4k Chunk Size = 237 MB/s


Last Updated on Thursday, 26 May 2011 00:23  
