Serengeti: Virtualize Your Big Data Applications

Uploaded by: 876****10 | Document ID: 127671968 | Upload date: 2020-04-04 | Format: PPT | Pages: 41 | Size: 6.40 MB


© 2009 VMware Inc. All rights reserved.

Serengeti: Virtualize Your Big Data Applications
蔺永华, VMware Inc.

Agenda
- Today's big data system
- Why virtualize Hadoop
- Serengeti introduction
- Common questions about virtualization
- Serengeti solution
- Deep insight into Serengeti
- Summary
- Q&A

Today's Big Data System
[Diagram labels: ETL; Unstructured Data (HDFS); Real-Time Structured Database; Big SQL; Data-Parallel Batch Processing; Real-Time Streams; Real-Time Processing (S4, Storm); Analytics]

Challenges of Using Hadoop on Physical Infrastructure
Deployment
- Difficult to deploy: takes several people several days, or even months
- Difficult to tune cluster performance
Low Efficiency
- Hadoop clusters are typically not 100% utilized across all hardware resources
- Difficult to share resources safely between different workloads
Single Point of Failure
- NameNode and JobTracker are single points of failure
- No HA for Hive, HCatalog, etc.

Why Virtualize Hadoop: Get your Hadoop cluster in minutes
- 1/1000 of the human effort
- Minimal Hadoop operations knowledge required
- Fully automated process: 10 minutes to get a Hadoop/HBase cluster from scratch
- Steps: server preparation, OS installation, network configuration, Hadoop installation and configuration
- Manual process: costs days
- Automated by Serengeti on vSphere with best practices

Why Virtualize Hadoop: Consolidate sprawling clusters
- Clusters share servers with strong isolation
- Single hardware infrastructure, unified operations
- Optimize: shared resources give higher utilization; elastic resources give faster on-demand access
- Cluster sprawl: single-purpose clusters for various business applications lead to cluster sprawl
- Cluster consolidation: simplify
- 30% CAPEX down
[Diagram labels: Hadoop Dev, Hadoop Prod, HBase, Finance Hadoop, Portal Hadoop, Virtualization Platform]

Why Virtualize Hadoop: Utilize all your resources to solve the priority problem
- 50% of resources sit idle while a high-priority job is burning up its cluster
- Utilize all resources from the pool on demand
- Dynamic elastic scaling on a shared resource pool
- 3X faster to get analytic results

vSphere High Availability (HA): protection against unplanned downtime
Overview
- Protection against host and VM failures
- Automatic failure detection (host, guest OS)
- Automatic virtual machine restart in minutes, on any available host in the cluster
- OS- and application-independent; does not require complex configuration changes

High Availability for the Hadoop Stack
[Diagram labels: Coordination (ZooKeeper); Management Server; HDFS (Hadoop Distributed File System); HBase (key-value store); MapReduce (job scheduling/execution system); Pig (data flow); Hive (SQL); BI reporting; ETL tools; RDBMS; JobTracker; NameNode; Hive MetaDB; HCatalog; HCatalog MDB; Server; HA; App/OS; VMware ESX]

vSphere Fault Tolerance (FT): continuous protection
Overview
- Single identical VMs running in lockstep on separate hosts
- Zero downtime, zero data loss failover for all virtual machines in case of hardware failures
- Integrated with VMware HA/DRS
- No complex clustering or specialized hardware required
- Single common mechanism for all applications and operating systems
- Zero downtime for NameNode, JobTracker and other components in Hadoop clusters

Serengeti introduction: easy and rapid deployment and management
- Open source project launched in June 2012
- Version 0.8 was released in April; 0.9 is planned for June
- Toolkit that leverages virtualization to simplify Hadoop deployment and operations
- Deploy a cluster in 10 minutes, fully automated
- Customize Hadoop and HBase clusters
- Automated cluster operation
- Comes with ecosystem components
- Supports all popular Hadoop distributions

Serengeti Demo
- 10 minutes to a Hadoop cluster with Serengeti

Common questions about virtualization
- Local disk: Can local disk be used in a virtualized environment?
- Flexibility and scalability: How can resources be scheduled flexibly between clusters and different applications, as mentioned above?
- Data stability: In a virtual environment, how can we distribute data across hosts and racks?
- Data locality: Hadoop schedules compute tasks near the data to reduce network I/O for data reads and writes. Can a virtual environment get the same result?
- Performance: How is performance in a virtual environment?
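The data stability question above can be made concrete with a small sketch. When several Hadoop VMs share one physical host, replicas of the same block may land on different VMs and still be lost together if that host fails. The Python below is only an illustration of that concern, not Serengeti's implementation: the function names, the `vm_to_host` mapping, and the sample block and VM names are all hypothetical.

```python
def hosts_per_block(replica_vms, vm_to_host):
    """Map each block to the set of physical hosts that hold one of its replicas."""
    return {
        block: {vm_to_host[vm] for vm in vms}
        for block, vms in replica_vms.items()
    }


def blocks_at_risk(replica_vms, vm_to_host):
    """Return blocks whose replicas all sit on fewer than two physical hosts."""
    per_block = hosts_per_block(replica_vms, vm_to_host)
    return sorted(block for block, hosts in per_block.items() if len(hosts) < 2)


# Hypothetical layout: vm1 and vm2 are different VMs on the same physical host.
vm_to_host = {"vm1": "hostA", "vm2": "hostA", "vm3": "hostB"}
replica_vms = {
    "blk_1": ["vm1", "vm2"],  # two replicas, but one physical host
    "blk_2": ["vm1", "vm3"],  # replicas spread over two physical hosts
}

print(blocks_at_risk(replica_vms, vm_to_host))
```

In this layout only `blk_1` is flagged, because both of its replicas depend on `hostA`. The same check could be repeated with a VM-to-rack mapping to cover the rack half of the question.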

Can I use local disk easily?
[Diagram labels: Other VM (x8), Hadoop (x10)]

Serengeti: Extend Virtual Storage Architecture to Include Local Disk
Shared Storage: SAN or NAS
- Easy to provision
- Automated cluster rebalancing
Hybrid Storage
- SAN for boot images and other workloads
- Local disk for Hadoop and HDFS
[Diagram labels: Host (x6)]

How to flexibly scale in and scale out?
How to flexibly schedule resources between clusters and different applications?
[Diagram labels: Current Hadoop; Compute; T1; T2; VM; Storage; Combined Storage/Compute; Hadoop in VM; Separate Storage; Separate Compute Clusters]
- VM lifecycle determined by DataNode
- Limited elasticity
- Separate compute from da
