《存储系统容错编码简介》由会员分享,可在线阅读,更多相关《存储系统容错编码简介(119页珍藏版)》请在金锄头文库上搜索。
1、存储系统容错编码简介内容mRAID、容错编码mReed-Solomon编码m二进制线性码m阵列码m利用组合数学工具构造容错编码内容mRAID、容错编码mReed-Solomon编码m二进制线性码m阵列码m利用组合数学工具构造容错编码RAIDmRedundant Arrays of Inexpensive Disks Redundant Arrays of Independent Disksq容量q性能q可靠性mChen, P. M., Lee, E. K., Gibson, G. A., Katz, R. H., and Patterson, D. A. “RAID: high-perform
2、ance, reliable secondary storage.” ACM Computing Surveys 26(2), pp. 143-185, June 1994.RAID结构mData Stripingstripe unitstripe04KB-14KB 8KB-18KB 12KB-112KB 16KB-116KB 20KB-1RAID结构mRedundancy04KB-1RAID结构m编码:d1 XOR d2 XOR XOR dn = pm解码:di = p XOR d1 XOR XOR di-1 XOR di-1 m解码:pnew = p XOR diold XOR dinew
3、RAID结构mdata unit、parity unitmRAID5:更新负载均匀分布RAID结构RAID5的读写RAID5的读写RAID5布局mEdward K. Lee, Randy H. Katz, “The Performance of Parity Placements in Disk Arrays”, IEEE Transactions on Computers, vol. 42 no. 6, pp. 651-664, 1993.RAID5布局RAID5布局RAID0mHui-I Hsiao and David J. DeWitt, “Chained declustering: A
4、 new availability strategy for multiprocessor database machines”, Technical Report CS TR 854, University of Wisconsin, Madison, June 1989. RAID0mGang Wang, Xiaoguang Liu, Sheng Lin, Guangjun Xie, Jing Liu, “Constructing Double- and Triple-erasure- correcting Codes with High Availability Using Mirror
5、ing and Parity Approaches”, ICPADS2007.What is an Erasure Code?mJ. S. Plank, “Erasure Codes for Storage Applications”, Tutorial of the 4th Usenix Conference on File and Storage Technologies, San Francisco, CA, Dec, 2005.When are they useful?mAnytime you need to tolerate failures.When are they useful
6、?mAnytime you need to tolerate failures.When are they useful?mAnytime you need to tolerate failures.When are they useful?mAnytime you need to tolerate failures.When are they useful?mAnytime you need to tolerate failures.When are they useful?mAnytime you need to tolerate failures.When are they useful
7、?mAnytime you need to tolerate failures.Terms & DefinitionsmNumber of data disks:nmNumber of coding disks: mmRate of a code:R = n/(n+m)mIdentifiable Failure: “Erasure”The problem, once againIssues with Erasure CodingmPerformanceqEncoding Typically O(mn), but not always.qUpdate Typically O(m), but no
8、t always.qDecoding Typically O(mn), but not always.Issues with Erasure CodingmSpace UsageqQuantified by two of four: Data Devices:n Coding Devices:m Sum of Devices:(n+m) Rate:R = n/(n+m)qHigher rates are more space efficient, but less fault-tolerant.Issues with Erasure CodingmFlexibilityqCan you arb
9、itrarily add data / coding nodes?q(Can you change the rate)?qHow does this impact failure coverage?Trivial Example: ReplicationmMDSmExtremely fast encoding/decoding/update.mRate: R = 1/(m+1) - Very space inefficientmThere are many replication/based systemsm(P2P especially).Less Trivial Example: Simp
10、le ParitymPatterson D A, Gibson G A, Katz R H, “A case for redundant arrays of inexpensive disks (RAID)”, ACM International Conference on Management of Data, Chicago, ACM Press, 1988, pp. 109-116.mP. M. Chen, E. K. Lee, G. A. Gibson, R. H. Katz, and D. A. Patterson. RAID: High-performance, reliable
11、secondary storage. ACM Computing Surveys, 26(2):145185, June 1994.Evaluating ParitymMDSmRate: R = n/(n+1) - Very space efficientmOptimal encoding/decoding/update:qn-1 XORs to encode & decodeq2 XORs to updatemExtremely popular (RAID Level 5).mDownside: m = 1 is limited.UnfortunatelymThose are the las
12、t easy things youll see.mFor (n 1, m 1), there is no consensus on the best coding technique.mThey all have tradeoffs.Why is this such a pain?mCoding theory historically has been the purview of coding theorists.mTheir goals have had their roots elsewhere (noisy communication lines, byzantine memory s
13、ystems, etc).mThey are not systems programmers.m(They dont care)内容mRAID、容错编码mReed-Solomon编码m二进制线性码内容mRAID、容错编码mReed-Solomon编码m二进制线性码m阵列码m利用组合数学工具构造容错编码Reed-Solomon CodesmThe only MDS coding technique for arbitrary n & m.mThis means that m erasures are always tolerated.mHave been around for decades.m
14、Expensive. J. S. Plank. A tutorial on Reed-Solomon coding for fault-tolerance in RAID-like systems. Software Practice& Experience, 27(9):9951012, September 1997.Reed-Solomon CodesmOperate on binary words of data, composed of w bits, where 2w n+m.Reed-Solomon CodesmOperate on binary words of data, co
15、mposed of w bits, where 2w n+m.Reed-Solomon CodesmThis means we only have to focus on words, rather than whole devices.mWord size is an issue:qIf n+m 256, we can use bytes as words.qIf n+m 65,536, we can use shorts as words.Reed-Solomon CodesmCodes are based on linear algebra.qFirst, consider the da
16、ta words as a column vector D:Reed-Solomon CodesmCodes are based on linear algebra.qNext, define an (n+m)*n “Distribution Matrix” B, whose first n rows are the identity matrix:Reed-Solomon CodesmCodes are based on linear algebra.qB*D equals an (n+m)*1 column vector composed of D and C (the coding words):Reed-Solomon CodesmThis means that each data and coding word has a corresponding row in the distribution matrix.Reed-Solomon C