Serengeti: Virtualize Your Big Data Applications

Uploader: 我**** Document ID: 183794915 Upload date: 2021-06-15 Format: PPTX Pages: 42 Size: 6.37 MB

Serengeti: Virtualize Your Big Data Applications
蔺永华, VMware, Inc.
© 2009 VMware Inc. All rights reserved.

Agenda
- Today's big data system
- Why virtualize Hadoop?
- Serengeti introduction
- Common questions about virtualization
- Serengeti solution
- Deep insight into Serengeti
- Summary
- Q&A

Today's Big Data System
(Diagram labels: ETL; Unstructured Data (HDFS); Real-Time Structured Database; Big SQL; Data; Parallel Batch Processing; Real-Time Streams; Real-Time Processing (S4, Storm); Analytics)

Challenges of Using Hadoop on Physical Infrastructure
Deployment
- Difficult to deploy: it takes several people several days, or even months.
- Difficult to tune cluster performance.
Low efficiency
- Hadoop clusters typically do not fully utilize all hardware resources.
- Difficult to share resources safely between different workloads.
Single point of failure
- The NameNode and the JobTracker are single points of failure.
- No HA for Hive, HCatalog, etc.

Why Virtualize Hadoop? Get Your Hadoop Cluster in Minutes
- 1/1000 of the human effort, with minimal Hadoop operations knowledge required.
- Fully automated process: 10 minutes to get a Hadoop/HBase cluster from scratch.
- Manual process (takes days): server preparation, OS installation, network configuration, Hadoop installation and configuration.
- Automated by Serengeti on vSphere, following best practices.

Why Virtualize Hadoop? Consolidate Sprawling Clusters
- Clusters share servers with strong isolation.
- Single hardware infrastructure, unified operations.
- Optimized shared resources = higher utilization.
- Elastic resources = faster on-demand access.
- Cluster sprawl: single-purpose clusters for various business applications lead to cluster sprawl.
- Cluster consolidation: simplify.
- 30% CAPEX reduction.
(Diagram labels: Finance Hadoop; Portal Hadoop; Hadoop Dev; Hadoop Prod; HBase; Virtualization Platform)

Why Virtualize Hadoop? Utilize All Your Resources to Solve the Priority Problem
- 50%+ of resources sit idle while a high-priority job is burning up its own cluster.
- Utilize all resources from the pool on demand.
- Dynamic elastic scaling on a shared resource pool.
- 3x faster to get analytic results.

vSphere High Availability (HA): Protection Against Unplanned Downtime
Overview
- Protection against host and VM failures.
- Automatic failure detection (host, guest OS).
- Automatic virtual machine restart in minutes, on any available host in the cluster.
- OS- and application-independent; does not require complex configuration changes.

High Availability for the Hadoop Stack
(Diagram labels: ZooKeeper (Coordination); Management Server; HDFS (Hadoop Distributed File System); HBase (Key-Value Store); MapReduce (Job Scheduling/Execution System); Pig (Data Flow); Hive (SQL); BI Reporting; ETL Tools; RDBMS; JobTracker; NameNode; Hive MetaDB; HCatalog; HCatalog MDB; Server; HA; App/OS virtual machines on VMware ESX hosts, with failed components marked X)

vSphere Fault Tolerance (FT) Provides Continuous Protection
Overview
- Single identical VMs running in lockstep on separate hosts.
- Zero-downtime, zero-data-loss failover for all virtual machines in case of hardware failures.
- Integrated with VMware HA/DRS.
- No complex clustering or specialized hardware required.
- Single common mechanism for all applications and operating systems.
- Zero downtime for the NameNode, the JobTracker, and other components in Hadoop clusters.

Serengeti
- Open source project launched in June 2012; version 0.8 was released in April, and 0.9 is planned for June.
- Toolkit that leverages virtualization to simplify Hadoop deployment and operations.
- Easy and rapid deployment and management:
  - Deploy a cluster in 10 minutes, fully automated.
  - Customize Hadoop and HBase clusters.
  - Automated cluster operation.
- Comes with ecosystem components.
- Supports all popular Hadoop distributions.

Demo: 10 Minutes to a Hadoop Cluster with Serengeti

Common Questions About Virtualization
- Local disk: can local disks be used in a virtualized environment?
- Flexibility and scalability: how can resources be scheduled flexibly between clusters and different applications, as mentioned above?
- Data stability: in a virtual environment, how can data be distributed across hosts and racks?
- Data locality: Hadoop schedules compute tasks near the data to reduce network I/O for data reads and writes. Can a virtual environment achieve the same result?
- Performance: how does Hadoop perform in a virtual environment?

Can I Use Local Disk Easily?
(Diagram labels: Hadoop VMs running alongside other VMs)
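The data stability question above (how to distribute data across hosts and racks when several VMs share one physical host) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the VM names, the `VM_TOPOLOGY` inventory, and the `place_replicas` helper are hypothetical, and the greedy rule shown is neither HDFS's actual block placement policy nor Serengeti's implementation. It only shows why the placement logic needs to know which physical host and rack each VM runs on.

```python
# Hypothetical inventory (not from the slides): VM name -> (rack, physical host).
VM_TOPOLOGY = {
    "vm-a1": ("rack1", "host-a"),
    "vm-a2": ("rack1", "host-a"),
    "vm-b1": ("rack1", "host-b"),
    "vm-c1": ("rack2", "host-c"),
    "vm-c2": ("rack2", "host-c"),
}


def place_replicas(topology, replicas=3):
    """Greedily pick VMs for the replicas of one data block.

    No two replicas may share a physical host, and a rack that holds
    no replica yet is preferred over one that already does.
    """
    chosen = []
    used_hosts = set()
    used_racks = set()
    candidates = sorted(topology)
    while len(chosen) < replicas:
        # Only VMs on physical hosts without a replica are eligible.
        eligible = [vm for vm in candidates if topology[vm][1] not in used_hosts]
        if not eligible:
            break  # fewer physical hosts than requested replicas
        # False sorts before True, so unused racks come first;
        # the sort is stable, so ties keep alphabetical order.
        eligible.sort(key=lambda vm: topology[vm][0] in used_racks)
        vm = eligible[0]
        rack, host = topology[vm]
        chosen.append(vm)
        used_hosts.add(host)
        used_racks.add(rack)
    return chosen


print(place_replicas(VM_TOPOLOGY))  # ['vm-a1', 'vm-c1', 'vm-b1']
```

With only VM-level information, vm-a1 and vm-a2 would look like two independent nodes even though a single failure of host-a would take both down; keying the decision on the physical host avoids that.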
