Heuristics in Distributing Data and Parity with Distributed Hash Tables
Date
2021-11-30
Type of Degree
Master's Thesis
Department
Computer Science and Software Engineering
Abstract
We compare multiple methods of distributing data and error-correcting codes across distributed hash tables. We focus on how distributed hash tables scale and on which methods move the least data while maintaining an even distribution. A common technique is to use erasure coding and store the pieces of each file on separate hardware. This approach makes the placement of each piece dependent on earlier placements. We identify several rules that, when applied to standard methods, dramatically reduce the amount of data moved during scaling. Even though CRUSH [28] already includes these heuristics, we found that tweaking its approach allowed it to migrate less data when the cluster layout changes.
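The dependence between placements described in the abstract can be illustrated with a minimal sketch. It assumes a plain consistent-hashing ring with virtual nodes, which is neither the thesis's method nor CRUSH. The names `Ring`, `place`, and `moved_fraction`, the choice of SHA-256, and the parameters `vnodes` and `pieces` are all hypothetical and chosen only for illustration. Each piece of a file goes to the next node clockwise on the ring that does not already hold a piece of that file, so every placement depends on the ones before it.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on a 64-bit ring (SHA-256 is an arbitrary choice)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class Ring:
    """Hypothetical consistent-hashing ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        unique = set(nodes)
        self.node_count = len(unique)
        # Each physical node owns `vnodes` points on the ring.
        self.points = sorted(
            (_hash(f"{node}#{i}"), node) for node in unique for i in range(vnodes)
        )
        self.keys = [point for point, _ in self.points]

    def place(self, file_id, pieces):
        """Return one distinct node per piece, walking clockwise from the file's hash."""
        if pieces > self.node_count:
            raise ValueError("more pieces than nodes")
        start = bisect.bisect_right(self.keys, _hash(file_id))
        chosen = []
        for step in range(len(self.points)):
            node = self.points[(start + step) % len(self.points)][1]
            if node not in chosen:  # skip hardware that already holds a piece
                chosen.append(node)
                if len(chosen) == pieces:
                    break
        return chosen


def moved_fraction(before, after, file_ids, pieces):
    """Fraction of (file, piece) slots whose node differs between two rings."""
    moved = 0
    for file_id in file_ids:
        old = before.place(file_id, pieces)
        new = after.place(file_id, pieces)
        moved += sum(a != b for a, b in zip(old, new))
    return moved / (len(file_ids) * pieces)


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(10)]
    files = [f"file{i}" for i in range(1000)]
    before = Ring(nodes)
    after = Ring(nodes + ["node10"])  # scale the cluster by one node
    # Assumed layout: 4 data pieces plus 2 parity pieces per file.
    print(moved_fraction(before, after, files, pieces=6))
```

Because a new node can displace an early piece and shift every later piece of the same file, this naive walk may move more data than the new node's share of the ring would suggest. That effect is the kind of cost the heuristics in the abstract aim to reduce.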