Data Storage Solutions - 金锄头文库


Introduction

The source is a paper by Rick Cattell. It discusses scalable data storage schemes for structured data and for unstructured data (including key-value, document, and column-oriented stores).

(Translator's note: NoSQL is key to supporting big-data applications. Rendering NoSQL as "unstructured" is not quite accurate, because the more common reading is "Not Only SQL". In other words, NoSQL does not stand in opposition to structured SQL; it can cover both structured and unstructured data.)

Paper information

Scalable SQL and NoSQL Data Stores
Rick Cattell
Originally published in 2010, last revised December 2011

ABSTRACT

In this paper, we examine a number of SQL and so-called “NoSQL” data stores designed to scale simple OLTP-style application loads over many servers. Originally motivated by Web 2.0 applications, these systems are designed to scale to thousands or millions of users doing updates as well as reads, in contrast to traditional DBMSs and data warehouses. We contrast the new systems on their data model, consistency mechanisms, storage mechanisms, durability guarantees, availability, query support, and other dimensions. These systems typically sacrifice some of these dimensions, e.g. database-wide transaction consistency, in order to achieve others, e.g. higher availability and scalability.

Note: Bibliographic references for systems are not listed, but URLs for more information can be found in the System References table at the end of this paper. (Translator's note: the references are omitted from this translation.)

Caveat: Statements in this paper are based on sources and documentation that may not be reliable, and the systems described are “moving targets,” so some statements may be incorrect. Verify through other sources before depending on information here. Nevertheless, we hope this comprehensive survey is useful! Check for future corrections on the author's web site, cattell.net/datastores.

Disclosure: The author is on the technical advisory board of Schooner Technologies and has a consulting business advising on scalable databases.

1. OVERVIEW

In recent years a number of new systems have been designed to provide good horizontal scalability for simple read/write database operations distributed over many servers. In contrast, traditional database products have comparatively little or no ability to scale horizontally on these applications. This paper examines and compares the various new systems.

(Yol's note: in essence this is a comparison of the various NoSQL designs, for example classifying and briefly introducing systems such as Mongo and HBase.)

Many of the new systems are referred to as “NoSQL” data stores. The definition of NoSQL, which stands for “Not Only SQL” or “Not Relational”, is not entirely agreed upon. (Translator's gloss: “Not Relational” here means weakly relational compared with, say, MySQL.) For the purposes of this paper, NoSQL systems generally have six key features:

1. the ability to horizontally scale “simple operation” throughput over many servers,
2. the ability to replicate and to distribute (partition) data over many servers,
3. a simple call level interface or protocol (in contrast to a SQL binding),
4. a weaker concurrency model than the ACID transactions of most relational (SQL) database systems,
5. efficient use of distributed indexes and RAM for data storage, and
6. the ability to dynamically add new attributes to data records.

The systems differ in other ways, and in this paper we contrast those differences. They range in functionality from the simplest distributed hashing, as supported by the popular memcached open source cache, to highly scalable partitioned tables, as supported by Google's BigTable [1]. In fact, BigTable, memcached, and Amazon's Dynamo [2] provided a “proof of concept” that inspired many of the data stores we describe here:

Memcached demonstrated that in-memory indexes can be highly scalable, distributing and replicating objects over multiple nodes.

Dynamo pioneered the idea of eventual consistency as a way to achieve higher availability and scalability: data fetched are not guaranteed to be up-to-date, but updates are guaranteed to be propagated to all nodes eventually.

BigTable demonstrated that persistent record storage could be scaled to thousands of nodes, a feat that most of the other systems aspire to.

A key feature of NoSQL systems is “shared nothing” horizontal scaling: replicating and partitioning data over many servers. This allows them to support a large numb
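
As a rough illustration of three ideas from this overview (distributed hashing in the style of memcached, replication over several shared-nothing nodes, and Dynamo-style eventual consistency), here is a minimal Python sketch. It is not taken from the paper or from any of the systems named; `Node`, `TinyStore`, `put`, `get`, and `propagate` are hypothetical names, and modulo hashing plus a single in-process queue are assumptions standing in for real key placement and network messaging.

```python
import hashlib
from collections import deque


class Node:
    """One shared-nothing server: it owns a private in-memory dict."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class TinyStore:
    """Hypothetical toy store: hash partitioning plus lazy replication."""

    def __init__(self, node_names, replicas=2):
        self.nodes = [Node(n) for n in node_names]
        self.replicas = min(replicas, len(self.nodes))
        self.pending = deque()  # replication messages not yet delivered

    def _owners(self, key):
        # Simplest distributed hashing (an assumption of this sketch):
        # hash the key, take it modulo the node count, and use the
        # following nodes in order as the extra replicas.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        start = int(digest, 16) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replicas)]

    def put(self, key, record):
        # Simple call-level interface: no SQL, no schema.
        primary, *others = self._owners(key)
        primary.data[key] = dict(record)   # written to the primary at once
        for node in others:                # other replicas are updated later
            self.pending.append((node, key, dict(record)))

    def get(self, key, replica=0):
        # Reading from a non-primary replica may return stale data.
        return self._owners(key)[replica].data.get(key)

    def propagate(self):
        # Deliver every queued update in order; afterwards replicas agree.
        while self.pending:
            node, key, record = self.pending.popleft()
            node.data[key] = record


store = TinyStore(["a", "b", "c"], replicas=2)
store.put("user:1", {"name": "Ann"})
store.put("user:1", {"name": "Ann", "city": "Oslo"})  # new attribute added freely
stale = store.get("user:1", replica=1)   # None: update not yet propagated
store.propagate()
fresh = store.get("user:1", replica=1)   # {"name": "Ann", "city": "Oslo"}
```

The read before `propagate()` shows the trade-off the overview describes: the second replica is briefly out of date, but it converges once the queued updates are delivered. Real systems often replace modulo hashing with consistent hashing so that adding a node moves fewer keys.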
