Design and Implementation of a Hierarchical Storage System Based on Logical Volumes

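Huazhong University of Science and Technology, Master's Thesis

Name: 陈洁
Degree applied for: Master's
Major: Computer Architecture
Supervisor: 曹强
2011-01-13

摘要 (Abstract)

With the rapid development of information technology, data is growing explosively, and storing it efficiently poses great challenges for the design, construction, and operation of large-scale storage systems. In such systems, the computing, transmission, and storage devices differ widely in physical characteristics such as performance and reliability. At the same time, real workloads do not access stored data uniformly: access varies widely in both space and time. Storing all data on high-performance devices is neither realistic nor wise. Hierarchical storage addresses this problem. It monitors the data access load and optimizes the allocation of storage resources according to the load, the application requirements, and the characteristics of each resource.

After analyzing the shortcomings of file-level data migration, this thesis designs a hierarchical storage system based on logical volumes. The system automatically allocates storage resources according to the real-time heat of each logical volume, which improves the overall efficiency of data storage.

First, the thesis introduces the application environment of hierarchical storage systems, discusses the key technologies involved in implementing such a system, and analyzes the issues raised by a hierarchical storage system based on user access frequency and storage device performance. Second, a prototype system is designed and implemented. Its architecture consists of three parts: the client, the manager, and the storage resource agent. Each part contains several modules, of which the user requirements monitoring module, the storage virtualization module, and the data migration module form the core. The user requirements monitoring module monitors the access heat of logical volumes and thereby implements hierarchical storage based on user access frequency. The storage virtualization module grades storage devices by performance and thereby unifies heterogeneous resources. The data migration module migrates data between storage devices of different performance. The system supports online migration, so foreground applications are not interrupted, and it can also control the migration rate, which effectively improves user access performance. Finally, experiments verify part of the system's functionality, evaluate its access performance, and analyze the impact of data migration on user access performance. The experimental data also show that controlling the data migration rate can improve foreground application throughput by 10%-40%.

Keywords: massive data storage, hierarchical storage, load balancing, storage virtualization, data migration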

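Abstract

With the rapid development of information technology, data is growing explosively, and storing massive data efficiently poses great challenges for the design, construction, and operation of large-scale storage systems. In large-scale storage systems, the various computing, transmission, and storage devices differ greatly in their physical characteristics. At the same time, the access patterns that real applications impose on storage devices are not uniform and vary in time and space, so it is not wise to store all data on high-performance devices. Hierarchical storage solves these problems properly: it efficiently monitors data access patterns and, based on them, optimizes the configuration of storage resources with heterogeneous characteristics.

By analyzing the shortcomings of file-level data migration mechanisms, we designed a volume-based hierarchical storage management system. It automatically assigns storage resources according to the data access patterns of the logical volumes and thus enhances the overall efficiency of the storage system.

We first introduce the application environment of hierarchical storage systems, discuss the key technologies for implementing the system, and analyze the issues related to building a hierarchical storage management system based on access patterns and storage device performance. Second, we design and implement the system prototype. The system is composed of three parts: the client, the manager, and the storage resource agent. Each part has several modules, each module provides specific functionality, and the user requirements monitor module, the storage virtualization module, and the data migration module are the core. The user requirements monitor module collects users' access patterns to achieve hierarchical storage based on user access frequency. The storage virtualization module classifies devices according to their performance to unify heterogeneous resources. The data migration module uses the collected user access frequency to migrate data between devices of different performance. The system supports online migration and control of the migration rate, and thus effectively protects users' access performance. Finally, we verify part of the system's functionality, evaluate the system's performance, and analyze how data migration affects users' access performance. The experiments also show that foreground application throughput can improve by 10%-40% when the data migration rate is controlled.

Keywords: massive data storage, hierarchical storage, load balancing, storage virtualization, data migration

Declaration of Originality

I declare that the thesis submitted is the result of research work that I carried out under the guidance of my supervisor. To the best of my knowledge, except for content already marked as cited in the text, this thesis contains no research results that have been published or written by any other individual or group. Individuals and groups that contributed to this research are clearly identified in the text. I am fully aware that I bear the legal consequences of this declaration.

Signature of thesis author:
Date: (year, month, day)

Authorization for Use of Thesis Copyright

The author of this thesis fully understands the university's regulations on retaining and using theses, namely: the university has the right to retain the thesis and to submit paper and electronic copies of it to the relevant national departments or institutions, and it allows the thesis to be consulted and borrowed. I authorize Huazhong University of Science and Technology to include all or part of this thesis in relevant databases for retrieval, and to preserve and compile this thesis by photocopying, reduced-size printing, scanning, or other means of reproduction.

This thesis is: confidential, with this authorization applying after declassification in year ____ / not confidential. (Please mark the applicable box above.)

Signature of thesis author:        Signature of supervisor:
Date: (year, month, day)           Date: (year, month, day)

1 Introduction

As data grows explosively, enterprises face more and more challenges in storing it. This chapter first describes the rapid growth of data, then introduces the concept of hierarchical storage and proposes a new solution to the shortcomings of existing hierarchical storage systems, and finally states the research goals and the organization of the thesis.

1.1 Research Background

With the rapid development of information technology, government agencies, enterprises, public institutions, and individuals all depend on data more than in any earlier era. US government agencies hold more than ten billion medical and business records. Google processes 20 PB of data every day. Facebook already manages more than 60 billion photo files and stores more than 25 TB of new photos every week. The Opera browser processes more than 1 PB of data every month. According to forecasts, the total amount of data in the world will reach 2 ZB in 2011.

As data grows explosively, storing it efficiently poses great challenges for the design, construction, and operation of large-scale storage systems. In such systems, the computing, transmission, and storage devices differ widely in physical characteristics such as performance and reliability [1]. At the same time, real workloads do not access stored data uniformly: access varies widely in both space and time [2,7,8]. Storing all data on high-performance devices is neither realistic nor wise, so intelligently placing the data of different workloads on devices of different performance becomes the key to solving the practical problem. Hierarchical storage addresses this problem. It monitors the data access load and intelligently optimizes the allocation of storage resources according to the load, the application requirements, and the characteristics of each resource [6,38].

The main function of hierarchical storage is to place data on different tiers according to its workload [36,37]. Figure 1.1 shows the framework of hierarchical storage. A bridge sits between the application services and the storage infrastructure. Its main role is to monitor the data load once data has been produced and to store the data in tiers according to that load. Data migration allows data to be moved from one kind of storage resource ...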
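The monitoring and tiering loop described above (monitor the load on each logical volume, pick a tier according to that load, then migrate) can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The names `Volume`, `record_window`, `target_tier`, and `plan_migrations`, the exponential decay factor, and the heat thresholds are all assumptions chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """A logical volume with its current tier (0 = fastest) and access heat."""
    name: str
    tier: int
    heat: float = 0.0  # decayed access count (hypothetical heat metric)


def record_window(vol, accesses, decay=0.5):
    """Fold one monitoring window's access count into the volume's heat.

    Older windows are down-weighted by `decay` (an assumed parameter),
    so heat follows recent activity rather than lifetime totals.
    """
    vol.heat = decay * vol.heat + accesses


def target_tier(heat, thresholds=(100.0, 10.0)):
    """Map heat to a tier index: hotter volumes get faster (lower) tiers.

    `thresholds` are assumed cut-offs in descending order; a volume below
    every threshold lands on the slowest tier.
    """
    for tier, limit in enumerate(thresholds):
        if heat >= limit:
            return tier
    return len(thresholds)


def plan_migrations(volumes):
    """Return (name, from_tier, to_tier) for each volume that should move."""
    plan = []
    for vol in volumes:
        dest = target_tier(vol.heat)
        if dest != vol.tier:
            plan.append((vol.name, vol.tier, dest))
    return plan


# One monitoring window: vol_a is heavily used, vol_b is nearly idle.
volumes = [Volume("vol_a", tier=2), Volume("vol_b", tier=0), Volume("vol_c", tier=1)]
for vol, hits in zip(volumes, (400, 3, 20)):
    record_window(vol, hits)

plan = plan_migrations(volumes)
print(plan)
```

In this sketch the hot volume is promoted to the fastest tier and the idle one is demoted, while a volume already on a suitable tier is left alone. A real system would additionally need to carry out the planned moves online and limit their rate, as the abstract describes.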
