Data Warehousing Lecture Notes: Recent Developments in Data Warehousing

Uploaded by: jiups****uk12 | Document ID: 45623240 | Uploaded: 2018-06-18 | Format: PPT | Pages: 62 | Size: 1.18 MB

Recent Developments in Data Warehousing
Hugh J. Watson
Terry College of Business
University of Georgia
hwatson@terry.uga.edu
http://www.terry.uga.edu/hwatson/dw_tutorial.ppt

Tutorial Objectives
- Provide an overview of data warehousing
- Provide materials to support the teaching of data warehousing
- Discuss recent developments in data warehousing

The Importance of Data Warehousing
- Provide a “single version of the truth”
- Improve decision making
- Support key corporate initiatives such as performance management, B2C and B2B e-commerce, and customer relationship management
- Estimated to be a $113.5 billion market in 2002 for systems, software, services, and in-house expenditures (Palo Alto Management Group)

Data Warehouse Characteristics
- Subject oriented - data are organized around sales, products, etc.
- Integrated - data are integrated to provide a comprehensive view
- Time variant - historical data are maintained
- Nonvolatile - data are not updated by users

Topics Covered
- Definitions and concepts
- Two case studies: Harrah's Entertainment (first) and Owens [...]

[...] street number and street name; and city and state.

Correcting
- Corrects parsed individual data components using sophisticated data algorithms and secondary data sources.
- Examples include replacing a vanity address and adding a zip code.

Standardizing
- Standardizing applies conversion routines to transform data into its preferred (and consistent) format using both standard and custom business rules.
- Examples include adding a pre name, replacing a nickname, and using a preferred street name.

Matching
- Searching and matching records within and across the parsed, corrected, and standardized data based on predefined business rules to eliminate duplications.
- Examples include identifying similar names and addresses.

Consolidating
- Analyzing and identifying relationships between matched records and consolidating/merging them into ONE representation.

Data Staging
- Often used as an interim step between data extraction and later steps
- Accumulates data from asynchronous sources using native interfaces, flat files, FTP sessions, or other processes
- At a predefined cutoff time, data in the staging file is transformed and loaded to the warehouse
- There is usually no end user access to the staging file
- An operational data store may be used for data staging

Data Transformation
- Transforms the data in accordance with the business rules and standards that have been established
- Examples include: format changes, deduplication, splitting up fields, replacement of codes, derived values, and aggregates

Data Loading
- Data are physically moved to the data warehouse
- The loading takes place within a “load window”
- The trend is toward near-real-time updates of the data warehouse as the warehouse is increasingly used for operational applications

Meta Data
- Data about data
- Needed by both information technology personnel and users
- IT personnel need to know data sources and targets; database, table, and column names; refresh schedules; data usage measures; etc.
- Users need to know entity/attribute definitions; reports/query tools available; report distribution information; help desk contact information; etc.

Recent Development: Meta Data Integration
- A growing realization that meta data is critical to data warehousing success
- Progress is being made on getting vendors to agree on standards and to incorporate the sharing of meta data among their tools
- Vendors like Microsoft, Computer Associates, and Oracle have entered the meta data marketplace with significant product offerings

Database Vendors
- High end (i.e., terabyte plus) vendors include IBM (DB2) and NCR-Teradata (Teradata)
- Oracle (8i) and Microsoft (SQL Server 7) are major players for smaller databases

On-line Analytical Processing (OLAP)
- A set of functionality that facilitates multidimensional analysis
- Allows users to analyze data in ways that are natural to them
- Comes in many varieties - ROLAP, MOLAP, DOLAP, etc.

ROLAP
- Relational OLAP
- Uses an RDBMS to implement an OLAP environment
- Typically involves a star schema to provide the multidimensional capabilities
- OLAP tool manipulates RDBMS star schema data
- Called “slowlap” by MOLAP vendors

MOLAP
- Multidimensional OLAP
- Uses an MDDBS (e.g., Essbase) to store and access data
- Usually requires proprietary (non-SQL) data access tools
- Provides exceptionally fast response times

Star Schema
- Creates non-normalized data structures
- Easier for users to understand
- Optimized for OLAP
- Uses fact (facts or measures in the business) and dimension (establishes the context of the facts) tables

OLAP Tools
- Products come from vendors such as Brio, Cognos, Hyperion, and BusinessObjects
- Typically available as a fat or thin (i.e., browser) client
- In a web environment, the browser communicates with a web server, which talks to an application server, which connects to backend databases
- The application server provides query, reporting, and OLAP analysis functionality ove
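The star schema and ROLAP slides above can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in RDBMS. The table names (fact_sales, dim_product, dim_date), the column names, and the sample rows are all hypothetical and do not come from the tutorial. The sketch shows one fact table holding the measures, two dimension tables giving them context, and a roll-up query of the kind a ROLAP tool might generate.

```python
import sqlite3

# Hypothetical star schema: one fact table and two dimension tables.
# All names and sample values are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY,
    year    INTEGER,
    quarter INTEGER
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    units_sold INTEGER,
    revenue    REAL
);
""")

# Dimension rows establish the context of the facts.
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)", [
    (1, "Widget", "Hardware"),
    (2, "Gadget", "Hardware"),
    (3, "Manual", "Books"),
])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
    (1, 2001, 4),
    (2, 2002, 1),
])

# Fact rows hold the measures, keyed by the dimensions.
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", [
    (1, 1, 10, 100.0),
    (2, 1, 5, 75.0),
    (3, 1, 2, 30.0),
    (1, 2, 8, 80.0),
    (3, 2, 4, 60.0),
])


def revenue_by_category_and_year(connection):
    """Roll revenue up along the product and date dimensions."""
    query = """
        SELECT p.category, d.year, SUM(f.revenue) AS total_revenue
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_date AS d ON d.date_id = f.date_id
        GROUP BY p.category, d.year
        ORDER BY p.category, d.year
    """
    return connection.execute(query).fetchall()


rows = revenue_by_category_and_year(conn)
for category, year, total in rows:
    print(category, year, total)
```

Changing the GROUP BY columns (for example, to quarter instead of year) gives a different view of the same facts without touching the fact table. That is the sense in which the star schema supplies the multidimensional capability.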
