
Uploaded by: jiups****uk12 | Document ID: 57188940 | Uploaded: 2018-10-19 | Format: PPT | Pages: 16 | Size: 882.50 KB


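Introduction to MapReduce
王耀聰, 陳威宇
National Center for High-performance Computing (NCHC)

Divide and Conquer
Example 1: approximation by tenths (successive decimal approximation)
Example 2: estimating an area by counting grid squares
Example 3: covering a board with L-shaped tiles
Example 4: a staircase has five steps, and each move climbs either one step or two. How many different ways are there to climb all five? E.g. (1,1,1,1,1) or (1,2,1,1).

Origins of MapReduce
Functional programming: map and reduce
  map(.):    1,2,3,4 -(*2)-> 2,4,6,8
  reduce(.): 1,2,3,4 -(sum)-> 10
Algorithms: divide and conquer
As a software architecture for programming, it is well suited to computation over large-scale data.

Definition of Hadoop MapReduce
Hadoop Map/Reduce is an easy-to-use software platform. Applications built on MapReduce can run on large clusters made up of thousands of PCs and process petabyte-scale data sets in parallel in a reliable, fault-tolerant manner.

Hadoop MapReduce Workflow
[Diagram: input HDFS -> map (x3) -> sort/copy -> merge -> reduce (x2) -> output HDFS (part0, part1)]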
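The two ideas from the opening slides can be tried out in a few lines of Python. This is only a minimal sketch: the function names `double_all`, `total`, and `count_climbs` are made up for illustration and do not come from the slides. It applies map (*2) and reduce (sum) to the list 1,2,3,4, and counts the ways to climb the five-step staircase by splitting the problem on the size of the last move.

```python
from functools import lru_cache, reduce


def double_all(values):
    """map step: apply (*2) to every element independently."""
    return list(map(lambda x: x * 2, values))


def total(values):
    """reduce step: fold all elements into a single value by summing."""
    return reduce(lambda acc, x: acc + x, values, 0)


@lru_cache(maxsize=None)
def count_climbs(steps):
    """Divide and conquer: the last move is either 1 step or 2 steps,
    so the count is the sum of two smaller subproblems."""
    if steps < 0:
        return 0  # overshot the top: not a valid way
    if steps == 0:
        return 1  # nothing left to climb: one completed way
    return count_climbs(steps - 1) + count_climbs(steps - 2)


if __name__ == "__main__":
    print(double_all([1, 2, 3, 4]))  # [2, 4, 6, 8]
    print(total([1, 2, 3, 4]))       # 10
    print(count_climbs(5))           # 8
```

Each element in the map step is processed without looking at any other element, which is what makes that step easy to spread across many machines. The reduce step is where the partial results are combined.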

1. The JobTracker asks the NameNode for the blocks that need to be processed.
2. The JobTracker selects several TaskTrackers to run the map tasks, which produce intermediate files.
3. The JobTracker merges and sorts the intermediate files, then copies them to the TaskTrackers that need them.
4. The JobTracker dispatches TaskTrackers to run reduce.
5. When reduce finishes, the JobTracker and the NameNode are notified so that the output can be produced.

MapReduce and [rest of title missing in the source]
[Diagram: Raw Data -> Map -> Reduce]

MapReduce Illustrated
[Diagram not preserved in the source]

MapReduce in Parallel
The JobTracker first selects three TaskTrackers to run map.

Example
Input: I am a tiger, you are also a tiger
[Diagram: map (x3) -> sort & shuffle -> reduce]
Output: a 2, also 1, am 1, are 1, I 1, tiger 2, you 1
After map finishes, Hadoop regroups and sorts the intermediate data. The JobTracker then selects one TaskTracker to run reduce.

What Hadoop Is Suited To
- Text tokenization
- Indexing and search
- Data mining
- Machine learning
- Problems that can be decomposed
http:/ (URL truncated in the source)

Hadoop Applications (1)
- Adobe: uses Hadoop and HBase in several areas, from social services to structured data storage and processing for internal use.
- Adknowledge (ad network): uses Hadoop to build the recommender system for behavioral targeting, plus other clickstream analytics.
- Alibaba: processes various kinds of business data dumped out of databases and joins them together. The data is then fed into iSearch, Alibaba's vertical search engine.
- AOL: uses Hadoop for a variety of things, ranging from ETL-style processing and statistics generation to running advanced algorithms for behavioral analysis.

Hadoop Applications (2)
- Baidu (the leading Chinese-language search engine): uses Hadoop to analyze search logs and to do mining work on its web page database.
- Contextweb (ADSDAQ Ad Exchange): uses Hadoop to store ad-serving logs and as a source for ad optimization, analytics, reporting, and machine learning.
- Detikcom (Indonesia's largest news portal): uses Hadoop, Pig, and HBase to analyze search logs, generate the Most Viewed News list, generate top word clouds, and analyze all of its logs.

Hadoop Applications (3)
- DropFire: generates Pig Latin scripts that describe structural and semantic conversions between data contexts, and uses Hadoop to execute these scripts for production-level deployments.
- Facebook: uses Hadoop to store copies of internal log and dimension data sources, and uses it as a source for reporting, analytics, and machine learning.
- Freestylers (image retrieval engine): uses Hadoop for image processing.
- Hosting Habitat: collects software information from all clients, analyzes it, and tells clients which software is missing or out of date.

Hadoop Applications (4)
- IBM: Blue Cloud Computing Clusters.
- ICCS: uses Hadoop and Nutch to crawl blog posts and analyze them.
- IIIT, Hyderabad: uses Hadoop for information retrieval and extraction.
- Journey Dynamics: uses Hadoop MapReduce to analyze billions of lines of GPS data and generate traffic route information.
- Krugle: uses Hadoop and Nutch to build a source code search engine.

Hadoop Applications (5)
- SEDNS (Security Enhanced DNS Group): collects DNS data from around the world to explore distributed content on the network.
- Technical analysis and Stock Research: analyzes stock information.
- University of Maryland: uses Hadoop for research on machine translation, language modeling, bioinformatics, email analysis, and image processing.
- University of Nebraska Lincoln, Research Computing Facility: uses Hadoop to run analysis on about 200 TB of CMS experiment data. The Compact Muon Solenoid (CMS) is one of the two large general-purpose particle detectors of the Large Hadron Collider project at CERN, the European Organization for Nuclear Research, in Switzerland.

Hadoop Applications (6)
- PARC: used Hadoop to analyze Wikipedia conflicts.
- Search Wikia: a project to help develop open source social search tools.
- Yahoo!: uses Hadoop to support research for Ad Systems and Web Search, and uses the Hadoop platform to detect botnets that send spam.
- 趨勢科技 (Trend Micro): filters web content such as phishing sites and malicious links.
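The word-count example with the sentence "I am a tiger, you are also a tiger" can be imitated on a single machine. The sketch below is an assumption-laden illustration in plain Python, not Hadoop code: the function names are hypothetical, and the way the sentence is cut into three input splits is a guess, since the slides only show that three map tasks run. It follows the same three stages: map emits (word, 1) pairs, sort & shuffle groups them by word, and reduce sums each group.

```python
import re
from collections import defaultdict


def map_phase(chunk):
    """Mapper: emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in re.findall(r"[A-Za-z]+", chunk)]


def shuffle_and_sort(mapped_outputs):
    """Group intermediate pairs by word, then order the words
    case-insensitively, as in the slide's output listing."""
    groups = defaultdict(list)
    for pairs in mapped_outputs:
        for word, count in pairs:
            groups[word].append(count)
    return sorted(groups.items(), key=lambda item: item[0].lower())


def reduce_phase(grouped):
    """Reducer: sum the counts collected for each word."""
    return [(word, sum(counts)) for word, counts in grouped]


def word_count(splits):
    """Run map on every split, then shuffle/sort, then reduce."""
    mapped = [map_phase(chunk) for chunk in splits]
    return reduce_phase(shuffle_and_sort(mapped))


if __name__ == "__main__":
    # Hypothetical split of the sentence across three map tasks.
    splits = ["I am a tiger,", "you are", "also a tiger"]
    for word, n in word_count(splits):
        print(word, n)
```

With these splits the result is a 2, also 1, am 1, are 1, I 1, tiger 2, you 1, which matches the output listed for the example. In a real cluster the map calls would run on different TaskTrackers and the grouping would involve copying intermediate files between machines; here everything happens in one process.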

