Research on High-Availability Replica Management and Performance Optimization in Data Grids


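Ph.D. Dissertation of Chongqing University

Research on High-Availability Replica Management and Performance Optimization in Data Grids

Ph.D. Candidate: Wu Changze
Supervisor: Prof. Chen Shuyu
Major: Computer Science (Computer Applications)
College of Computer Science, Chongqing University
October 2007

ABSTRACT (Chinese version)

The emergence of data grids addressed a problem that traditional data management systems struggle with: accessing, transferring, and analyzing large-scale distributed data. It has greatly advanced scientific research and engineering practice that involve large-scale data management. Replica management was introduced into data grids to improve data availability, reduce network traffic, and enhance data access performance. However, the highly dynamic, heterogeneous, and wide-area nature of data grid systems and their resources hinders the achievement of high data availability and performance optimization. How to build an appropriate replica management mechanism that is tailored to these characteristics, and that genuinely improves data availability and data access performance, has become an active research topic in data grids.

Based on a comparative study of related work, this dissertation summarizes the replica management requirements of data grids, presents a dynamic replica management service model, and on that basis proposes an adaptive replica creation strategy, a dynamically balanced replica location algorithm, a replica selection algorithm based on fuzzy grey prediction, and a dynamic asynchronous replica consistency maintenance algorithm. The main contributions are as follows.

According to the characteristics of data grids, the requirements for high availability of replica data and for data access performance optimization are analyzed separately, and a dynamic replica management service model is built. The model accommodates the dynamic nature of data grids, improves data availability, and optimizes data access performance.

For the replica creation problem, a Markov model is used to calculate the replica redundancy. The model takes into account the effect of inconsistency among multiple replicas on availability, and so guarantees data availability more accurately. A replica creation strategy based on cost sharing is also proposed, which achieves global performance optimization while preserving node autonomy. The correctness and the global performance optimization property of the cost-sharing replica creation algorithm are proved theoretically. Finally, simulation experiments further verify the effectiveness and correctness of the algorithm.

For the replica location problem, a dynamically balanced replica location algorithm is proposed on the basis of an improved ant colony algorithm. It adapts to nodes dynamically joining and leaving, locates replicas accurately, and improves data access performance.

For the replica selection problem, a replica selection algorithm based on fuzzy grey prediction is proposed on top of the dynamically balanced replica location algorithm. It places few demands on the prediction samples, and achieves high prediction accuracy through optimization by a fuzzy controller. Finally, simulation experiments determine the choice of the self-learning factor of the fuzzy controller and verify the effectiveness and correctness of the replica location and replica selection algorithms.

For the replica consistency maintenance problem, a dynamic asynchronous data consistency maintenance strategy is proposed. Through a purpose-built dynamic voting mechanism, the strategy remains applicable when the online rate of nodes is low. By reducing the number of nodes that take part in read and write operations during consistency maintenance, it improves system performance and data availability. By lowering the communication overhead inside the system, it improves system scalability. The correctness of the algorithm is proved in terms of global ordering and read consistency. Finally, simulation experiments verify the correctness and effectiveness of the replica consistency maintenance algorithm.

In summary, this dissertation addresses the requirements of replica management in data grids by proposing a dynamic replica management service model and related algorithms, which offer a sound solution for improving data availability and optimizing performance. Theoretical analysis and simulation results show that the proposed replica management service strategies are correct and effective, that they remedy shortcomings of existing research, and that they can be used for replica management in data grid environments.

Keywords: data grid, replica management, high availability, performance optimization

ABSTRACT (English version)

Data grids have resolved the difficulty that traditional data management systems have in accessing, transferring, and analyzing large-scale distributed data, and they have greatly boosted the development of scientific research and engineering practice. Replica management was introduced into data grids to enhance data availability, reduce network traffic, and improve data access performance. Nevertheless, because data grid systems are highly dynamic, heterogeneous, and large in scale, they fail to achieve high data availability and performance optimization. How to build a replica management mechanism that suits the characteristics of data grids, so as to enhance data availability and improve data access performance, is one of the hot-spot issues in the literature.

Based on comparative research, this dissertation summarized the replica management requirements of data grids and constructed a dynamic replica management service model. Moreover, the author presented the corresponding algorithms: adaptive replica creation strategies, a dynamic equilibrium replica location algorithm, a replica selection algorithm based on fuzzy-grey prediction, and a dynamic asynchronous consistency maintenance algorithm. The main contents are as follows.

According to the characteristics of data grids, the author analyzed the high availability requirements and the performance optimization requirements of replica data separately. On this basis, a dynamic replica management service model was built that incorporates data availability measurement and a performance optimization strategy layer.

For the problem of replica creation in data grids, the author chose a Markov model to determine replica redundancy accurately, guaranteeing data availability by taking the influence of data consistency into account. Moreover, the dissertation presented a suite of cost-sharing replica creation algorithms. A cost-sharing mechanism was established to encourage autonomous nodes to create replicas in a way that favors global performance optimization. The performance optimization algorithm was then analyzed theoretically. Finally, simulation results demonstrated the correctness and effectiveness of the algorithm.

For the problem of replica location in data grids, a dynamic equilibrium replica location algorithm was proposed on the basis of a modified ant algorithm. It can adapt to nodes dynamically joining or leaving.

With accurate replica location as a precondition, a replica selection strategy based on grey prediction was presented. It places few demands on the prediction samples, and it obtains accurate predictions by compensating the predicted values through a fuzzy controller. In the end, the author determined the self-learning factor of the fuzzy controller, and the correctness and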
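The abstract states that grey prediction needs only a small number of samples. As an illustration of that idea only, the following is a minimal sketch of the textbook GM(1,1) grey model in plain Python. It is not the dissertation's algorithm: the fuzzy-controller compensation is omitted, and the function name gm11_forecast and the sample values (read here as recent transfer rates observed for one replica site) are hypothetical.

```python
import math


def gm11_forecast(samples, steps=1):
    """Forecast `steps` future values with a basic GM(1,1) grey model.

    `samples` is a short series of positive measurements, oldest first.
    Hypothetical helper for illustration; not taken from the dissertation.
    """
    n = len(samples)
    if n < 4:
        raise ValueError("need at least 4 samples")
    if any(x <= 0 for x in samples):
        raise ValueError("samples must be positive")

    # 1. Accumulated generating operation (running sum) smooths the series.
    x1 = []
    total = 0.0
    for x in samples:
        total += x
        x1.append(total)

    # 2. Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = samples[1:]

    # 3. Least-squares fit of the grey equation y(k) = -a * z(k) + b.
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    a = -slope

    # A (near) constant series has a ~ 0; the model then predicts b.
    if abs(a) < 1e-12:
        return [b] * steps

    # 4. Time response of the accumulated series, then difference it
    #    to get back to the original scale.
    def x1_hat(k):
        return (samples[0] - b / a) * math.exp(-a * k) + b / a

    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


if __name__ == "__main__":
    # Five hypothetical transfer-rate observations for one replica site.
    history = [100.0, 110.0, 121.0, 133.1, 146.41]
    print(gm11_forecast(history, steps=2))
```

A replica selector could, under these assumptions, run such a forecast per candidate site and prefer the site with the best predicted rate. The model is fitted from only five observations here, which is the property the abstract highlights.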
