Chapter 2: Data Warehousing and OLAP Technology for Data Mining (Data Mining: Concepts and Techniques, English-language teaching slides)

Uploaded by: M****1 | Document ID: 585818733 | Upload date: 2024-09-03 | Format: PPT | Pages: 64 | Size: 395.50 KB

Data Mining: Concepts and Techniques
Slides for Textbook, Chapter 2
Jiawei Han and Micheline Kamber
Intelligent Database Systems Research Lab
School of Computing Science
Simon Fraser University, Canada
cs.sfu.ca
9/3/2024

Chapter 2: Data Warehousing and OLAP Technology for Data Mining
- What is a data warehouse?
- A multi-dimensional data model
- Data warehouse architecture
- Data warehouse implementation
- Further development of data cube technology
- From data warehousing to data mining

What is a Data Warehouse?
- Defined in many different ways, but not rigorously.
  - A decision support database that is maintained separately from the organization's operational database
  - Supports information processing by providing a solid platform of consolidated, historical data for analysis.
- "A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process." (W. H. Inmon)
- Data warehousing:
  - The process of constructing and using data warehouses

Data Warehouse: Subject-Oriented
- Organized around major subjects, such as customer, product, sales.
- Focuses on the modeling and analysis of data for decision makers, not on daily operations or transaction processing.
- Provides a simple and concise view around particular subject issues by excluding data that are not useful in the decision support process.

Data Warehouse: Integrated
- Constructed by integrating multiple, heterogeneous data sources
  - relational databases, flat files, on-line transaction records
- Data cleaning and data integration techniques are applied.
  - Ensure consistency in naming conventions, encoding structures, attribute measures, etc. among different data sources
    - E.g., hotel price: currency, tax, breakfast covered, etc.
  - When data is moved to the warehouse, it is converted.

Data Warehouse: Time Variant
- The time horizon for the data warehouse is significantly longer than that of operational systems.
  - Operational database: current value data.
  - Data warehouse data: provide information from a historical perspective (e.g., past 5-10 years)
- Every key structure in the data warehouse
  - Contains an element of time, explicitly or implicitly
  - But the key of operational data may or may not contain a "time element".

Data Warehouse: Non-Volatile
- A physically separate store of data transformed from the operational environment.
- Operational update of data does not occur in the data warehouse environment.
  - Does not require transaction processing, recovery, and concurrency control mechanisms
  - Requires only two operations in data accessing:
    - initial loading of data and access of data.

Data Warehouse vs. Heterogeneous DBMS
- Traditional heterogeneous DB integration:
  - Build wrappers/mediators on top of heterogeneous databases
  - Query-driven approach
    - When a query is posed to a client site, a meta-dictionary is used to translate the query into queries appropriate for the individual heterogeneous sites involved, and the results are integrated into a global answer set
    - Complex information filtering; queries compete for resources
- Data warehouse: update-driven, high performance
  - Information from heterogeneous sources is integrated in advance and stored in warehouses for direct query and analysis

Data Warehouse vs. Operational DBMS
- OLTP (on-line transaction processing)
  - Major task of traditional relational DBMS
  - Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
- OLAP (on-line analytical processing)
  - Major task of data warehouse system
  - Data analysis and decision making
- Distinct features (OLTP vs. OLAP):
  - User and system orientation: customer vs. market
  - Data contents: current, detailed vs. historical, consolidated
  - Database design: ER + application vs. star + subject
  - View: current, local vs. evolutionary, integrated
  - Access patterns: update vs. read-only but complex queries

OLTP vs. OLAP
[Table: OLTP vs. OLAP comparison]

Why Separate Data Warehouse?
- High performance for both systems
  - DBMS tuned for OLTP: access methods, indexing, concurrency control, recovery
  - Warehouse tuned for OLAP: complex OLAP queries, multidimensional view, consolidation.
- Different functions and different data:
  - missing data: decision support requires historical data which operational DBs do not typically maintain
  - data consolidation: decision support requires consolidation (aggregation, summarization) of data from heterogeneous sources
  - data quality: different sources typically use inconsistent data representations, codes and formats which have to be reconciled
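The OLTP and OLAP access patterns contrasted above (short key-based updates vs. read-only consolidating queries) can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module; the `orders` table, its columns, and all values are hypothetical and are not taken from the slides. A real deployment would keep the two workloads in separate systems, which is the point of the "Why Separate Data Warehouse?" slide; a single in-memory database is used here only to show the two query shapes side by side.

```python
import sqlite3

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "alice", 2022, 120.0),
        (2, "bob", 2022, 80.0),
        (3, "alice", 2023, 200.0),
        (4, "carol", 2023, 50.0),
    ],
)

# OLTP-style access: a short transaction that updates one current record by key.
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE order_id = ?", (90.0, 2))

# OLAP-style access: a read-only query that consolidates historical data.
yearly_totals = conn.execute(
    "SELECT year, SUM(amount) FROM orders GROUP BY year ORDER BY year"
).fetchall()
print(yearly_totals)
```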

From Tables and Spreadsheets to Data Cubes
- A data warehouse is based on a multidimensional data model which views data in the form of a data cube
- A data cube, such as sales, allows data to be modeled and viewed in multiple dimensions
  - Dimension tables, such as item (item_name, brand, type), or time (day, week, month, quarter, year)
  - Fact table contains measures (such as dollars_sold) and keys to each of the related dimension tables
- In data warehousing literature, an n-D base cube is called a base cuboid. The topmost 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid. The lattice of cuboids forms a data cube.

Cube: A Lattice of Cuboids
- 0-D (apex) cuboid: all
- 1-D cuboids: time; item; location; supplier
- 2-D cuboids: time,item; time,location; time,supplier; item,location; item,supplier; location,supplier
- 3-D cuboids: time,item,location; time,item,supplier; time,location,supplier; item,location,supplier
- 4-D (base) cuboid: time,item,location,supplier

Conceptual Modeling of Data Warehouses
- Modeling data warehouses: dimensions & measures
  - Star schema: A fact table in the middle connected to a set of dimension tables
  - Snowflake schema: A refinement of star schema where some dimensional hierarchy is normalized into a set of smaller dimension tables, forming a shape similar to a snowflake
  - Fact constellations: Multiple fact tables share dimension tables, viewed as a collection of stars, therefore called galaxy schema or fact constellation

Example of Star Schema
- Sales Fact Table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
- time: time_key, day, day_of_the_week, month, quarter, year
- item: item_key, item_name, brand, type, supplier_type
- branch: branch_key, branch_name, branch_type
- location: location_key, street, city, province_or_state, country

Example of Snowflake Schema
- Sales Fact Table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
- time: time_key, day, day_of_the_week, month, quarter, year
- item: item_key, item_name, brand, type, supplier_key
- supplier: supplier_key, supplier_type
- branch: branch_key, branch_name, branch_type
- location: location_key, street, city_key
- city: city_key, city, province_or_state, country

Example of Fact Constellation
- Sales Fact Table: time_key, item_key, branch_key, location_key; measures: units_sold, dollars_sold, avg_sales
- Shipping Fact Table: time_key, item_key, shipper_key, from_location, to_location; measures: dollars_cost, units_shipped
- time: time_key, day, day_of_the_week, month, quarter, year
- item: item_key, item_name, brand, type, supplier_type
- branch: branch_key, branch_name, branch_type
- location: location_key, street, city, province_or_state, country
- shipper: shipper_key, shipper_name, location_key, shipper_type

A Data Mining Query Language, DMQL: Language Primitives
- Cube Definition (Fact Table)
    define cube ⟨cube_name⟩ [⟨dimension_list⟩]: ⟨measure_list⟩
- Dimension Definition (Dimension Table)
    define dimension ⟨dimension_name⟩ as (⟨attribute_or_subdimension_list⟩)
- Special Case (Shared Dimension Tables)
  - First time as "cube definition"
  - define dimension ⟨dimension_name⟩ as ⟨dimension_name_first_time⟩ in cube ⟨cube_name_first_time⟩

Defining a Star Schema in DMQL
    define cube sales_star [time, item, branch, location]:
        dollars_sold = sum(sales_in_dollars), avg_sales = avg(sales_in_dollars), units_sold = count(*)
    define dimension time as (time_key, day, day_of_week, month, quarter, year)
    define dimension item as (item_key, item_name, brand, type, supplier_type)
    define dimension branch as (branch_key, branch_name, branch_type)
    define dimension location as (location_key, street, city, province_or_state, country)

Defining a Snowflake Schema in DMQL
    define cube sales_snowflake [time, item, branch, location]:
        dollars_sold = sum(sales_in_dollars), avg_sales = avg(sales_in_dollars), units_sold = count(*)
    define dimension time as (time_key, day, day_of_week, month, quarter, year)
    define dimension item as (item_key, item_name, brand, type, supplier(supplier_key, supplier_type))
    define dimension branch as (branch_key, branch_name, branch_type)
    define dimension location as (location_key, street, city(city_key, province_or_state, country))

Defining a Fact Constellation in DMQL
    define cube sales [time, item, branch, location]:
        dollars_sold = sum(sales_in_dollars), avg_sales = avg(sales_in_dollars), units_sold = count(*)
    define dimension time as (time_key, day, day_of_week, month, quarter, year)
    define dimension item as (item_key, item_name, brand, type, supplier_type)
    define dimension branch as (branch_key, branch_name, branch_type)
    define dimension location as (location_key, street, city, province_or_state, country)
    define cube shipping [time, item, shipper, from_location, to_location]:
        dollar_cost = sum(cost_in_dollars), unit_shipped = count(*)
    define dimension time as time in cube sales
    define dimension item as item in cube sales
    define dimension shipper as (shipper_key, shipper_name, location as location in cube sales, shipper_type)
    define dimension from_location as location in cube sales
    define dimension to_location as location in cube sales

Measures: Three Categories
- distributive: if the result derived by applying the function to n aggregate values is the same as that derived by applying the function on all the data without partitioning.
  - E.g., count(), sum(), min(), max().
- algebraic: if it can be computed by an algebraic function with M arguments (where M is a bounded integer), each of which is obtained by applying a distributive aggregate function.
  - E.g., avg(), min_N(), standard_deviation().
- holistic: if there is no constant bound on the storage size needed to describe a subaggregate.
  - E.g., median(), mode(), rank().

A Concept Hierarchy: Dimension (location)
[Figure: hierarchy with levels all > region > country > city > office. Regions: Europe, North_America. Countries: Germany, Spain (Europe); Canada, Mexico (North_America). Cities: Frankfurt (Germany); Vancouver, Toronto (Canada). Offices: L. Chan, M. Wind (Vancouver).]

View of Warehouses and Hierarchies
Specification of hierarchies
- Schema hierarchy
    day < {month < quarter; week} < year
- Set_grouping hierarchy
    {1..10} < inexpensive

Multidimensional Data
- Sales volume as a function of product, month, and region
[Figure: cube with axes Product, Region, Month]
Dimensions: Product, Location, Time
Hierarchical summarization paths:
- Product: Industry > Category > Product
- Location: Region > Country > City > Office
- Time: Year > Quarter > Month or Week > Day
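The three measure categories defined in the "Measures: Three Categories" slide can be made concrete with a small sketch. The data values and the two-way partitioning below are hypothetical. The sketch shows that sum() can be combined from per-partition sums (distributive), that avg() can be derived from a bounded number of distributive results, here sum and count (algebraic), and that combining per-partition medians does not in general give the true median (holistic).

```python
from statistics import median

# Hypothetical data, split into two partitions as a warehouse might do.
data = [3, 1, 4, 1, 5, 9, 2, 6]
partitions = [data[:3], data[3:]]

# Distributive: the sum of partial sums equals the sum over all the data.
partial_sums = [sum(p) for p in partitions]
total = sum(partial_sums)

# Algebraic: avg() is computed from two distributive aggregates (sum, count).
partial_counts = [len(p) for p in partitions]
average = sum(partial_sums) / sum(partial_counts)

# Holistic: per-partition medians are not enough to recover the true median.
median_of_medians = median(median(p) for p in partitions)
true_median = median(data)

print(total, average, median_of_medians, true_median)
```

For this data the combined partial medians give 4.0 while the true median is 3.5, which is why holistic measures are harder to precompute across cuboids.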

A Sample Data Cube
[Figure: 3-D cube with axes Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), Product (TV, VCR, PC, sum), and Country (U.S.A., Canada, Mexico, sum); one highlighted cell is labeled "Total annual sales of TV in U.S.A."]

Cuboids Corresponding to the Cube
- 0-D (apex) cuboid: all
- 1-D cuboids: product; date; country
- 2-D cuboids: product,date; product,country; date,country
- 3-D (base) cuboid: product,date,country

Browsing a Data Cube
- Visualization
- OLAP capabilities
- Interactive manipulation

Typical OLAP Operations
- Roll up (drill-up): summarize data
  - by climbing up hierarchy or by dimension reduction
- Drill down (roll down): reverse of roll-up
  - from higher level summary to lower level summary or detailed data, or introducing new dimensions
- Slice and dice:
  - project and select
- Pivot (rotate):
  - reorient the cube, visualization, 3D to series of 2D planes.
- Other operations
  - drill across: involving (across) more than one fact table
  - drill through: through the bottom level of the cube to its back-end relational tables (using SQL)

A Star-Net Query Model
[Figure: radial lines, one per dimension, with abstraction levels marked along each line. Shipping Method: AIR-EXPRESS, TRUCK. Customer Orders: ORDER, CONTRACTS. Customer. Product: PRODUCT GROUP, PRODUCT LINE, PRODUCT ITEM. Organization: SALES PERSON, DISTRICT, DIVISION. Promotion. Location: CITY, COUNTRY, REGION. Time: DAILY, QTRLY, ANNUALLY.]
Each circle is called a footprint

Design of a Data Warehouse: A Business Analysis Framework
- Four views regarding the design of a data warehouse
  - Top-down view
    - allows selection of the relevant information necessary for the data warehouse
  - Data source view
    - exposes the information being captured, stored, and managed by operational systems
  - Data warehouse view
    - consists of fact tables and dimension tables
  - Business query view
    - sees the perspectives of data in the warehouse from the view of the end-user

Data Warehouse Design Process
- Top-down, bottom-up approaches or a combination of both
  - Top-down: Starts with overall design and planning (mature)
  - Bottom-up: Starts with experiments and prototypes (rapid)
- From software engineering point of view
  - Waterfall: structured and systematic analysis at each step before proceeding to the next
  - Spiral: rapid generation of increasingly functional systems, short turnaround time
- Typical data warehouse design process
  - Choose a business process to model, e.g., orders, invoices, etc.
  - Choose the grain (atomic level of data) of the business process
  - Choose the dimensions that will apply to each fact table record
  - Choose the measure that will populate each fact table record

Multi-Tiered Architecture
[Figure: Data Sources (operational DBs, other sources) feed Extract / Transform / Load / Refresh through a Monitor & Integrator with Metadata; Data Storage (data warehouse, data marts) serves the OLAP Server (OLAP engine); Front-End Tools: analysis, query, reports, data mining.]

Three Data Warehouse Models
- Enterprise warehouse
  - collects all of the information about subjects spanning the entire organization
- Data Mart
  - a subset of corporate-wide data that is of value to a specific group of users. Its scope is confined to specific, selected groups, such as a marketing data mart
  - Independent vs. dependent (directly from warehouse) data mart
- Virtual warehouse
  - A set of views over operational databases
  - Only some of the possible summary views may be materialized

Data Warehouse Development: A Recommended Approach
[Figure: Define a high-level corporate data model; then, with model refinement, build Data Marts and the Enterprise Data Warehouse; these lead to Distributed Data Marts and then a Multi-Tier Data Warehouse.]

OLAP Server Architectures
- Relational OLAP (ROLAP)
  - Use relational or extended-relational DBMS to store and manage warehouse data and OLAP middleware to support missing pieces
  - Include optimization of DBMS backend, implementation of aggregation navigation logic, and additional tools and services
  - greater scalability
- Multidimensional OLAP (MOLAP)
  - Array-based multidimensional storage engine (sparse matrix techniques)
  - fast indexing to pre-computed summarized data
- Hybrid OLAP (HOLAP)
  - User flexibility, e.g., low level: relational, high-level: array
- Specialized SQL servers
  - specialized support for SQL queries over star/snowflake schemas
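In the spirit of the ROLAP architecture above, where warehouse data stays in a relational DBMS, two of the OLAP operations from the "Typical OLAP Operations" slide can be sketched as SQL over a tiny star schema. The sketch uses Python's built-in sqlite3 module. The table and column names follow the star schema example (sales, item, location, dollars_sold) but are trimmed down, and every data value is hypothetical.

```python
import sqlite3

# A trimmed-down star schema with hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE location (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
CREATE TABLE item (item_key INTEGER PRIMARY KEY, item_name TEXT, type TEXT);
CREATE TABLE sales (item_key INTEGER, location_key INTEGER, dollars_sold REAL);
INSERT INTO location VALUES (1, 'Vancouver', 'Canada'), (2, 'Toronto', 'Canada'), (3, 'Frankfurt', 'Germany');
INSERT INTO item VALUES (1, 'TV-A', 'TV'), (2, 'PC-B', 'PC');
INSERT INTO sales VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 2, 300.0), (1, 3, 200.0), (2, 3, 50.0);
""")

# Roll-up: climb the location hierarchy from city to country.
rollup = conn.execute("""
    SELECT l.country, SUM(s.dollars_sold)
    FROM sales s JOIN location l ON s.location_key = l.location_key
    GROUP BY l.country ORDER BY l.country
""").fetchall()

# Slice: fix one dimension value (type = 'TV') and view sales by city.
tv_slice = conn.execute("""
    SELECT l.city, SUM(s.dollars_sold)
    FROM sales s
    JOIN location l ON s.location_key = l.location_key
    JOIN item i ON s.item_key = i.item_key
    WHERE i.type = 'TV'
    GROUP BY l.city ORDER BY l.city
""").fetchall()

print(rollup)
print(tv_slice)
```

Drill-down is the reverse of the first query: group by `l.city` instead of `l.country`.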

Efficient Data Cube Computation
- Data cube can be viewed as a lattice of cuboids
  - The bottom-most cuboid is the base cuboid
  - The top-most cuboid (apex) contains only one cell
  - How many cuboids in an n-dimensional cube with L levels?
      T = (L_1 + 1) × (L_2 + 1) × ... × (L_n + 1), where L_i is the number of levels of dimension i
- Materialization of data cube
  - Materialize every cuboid (full materialization), none (no materialization), or some (partial materialization)
  - Selection of which cuboids to materialize
    - Based on size, sharing, access frequency, etc.

Cube Operation
- Cube definition and computation in DMQL
    define cube sales [item, city, year]: sum(sales_in_dollars)
    compute cube sales
- Transform it into a SQL-like language (with a new operator cube by, introduced by Gray et al. '96)
    SELECT item, city, year, SUM(amount)
    FROM SALES
    CUBE BY item, city, year
- Need to compute the following group-bys
    (item, city, year),
    (item, city), (item, year), (city, year),
    (item), (city), (year),
    ()
[Figure: lattice of these group-bys, from () at the top through (item), (city), (year) and (city, item), (city, year), (item, year) down to (city, item, year)]

Cube Computation: ROLAP-Based Method
- Efficient cube computation methods
  - ROLAP-based cubing algorithms (Agarwal et al. '96)
  - Array-based cubing algorithm (Zhao et al. '97)
  - Bottom-up computation method (Beyer & Ramakrishnan '99)
- ROLAP-based cubing algorithms
  - Sorting, hashing, and grouping operations are applied to the dimension attributes in order to reorder and cluster related tuples
  - Grouping is performed on some subaggregates as a "partial grouping step"
  - Aggregates may be computed from previously computed aggregates, rather than from the base fact table

Multi-way Array Aggregation for Cube Computation
- Partition arrays into chunks (a small subcube which fits in memory).
- Compressed sparse array addressing: (chunk_id, offset)
- Compute aggregates in "multiway" by visiting cube cells in the order which minimizes the number of times each cell is visited, and reduces memory access and storage cost.
What is the best traversing order to do multi-way aggregation?
[Figure: 3-D array with dimensions A (a0 to a3), B (b0 to b3), and C (c0 to c3), partitioned into 64 chunks numbered 1 to 64]
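To make the set of group-bys behind the cube by operator concrete, here is a minimal sketch that enumerates all 2^n subsets of the dimensions and aggregates a measure for each. The function name `cube_by` and the sample rows are hypothetical. This naive version rescans the base rows for every group-by; it is not the ROLAP-based or multiway array algorithm described above, both of which share work across cuboids.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical base fact rows.
rows = [
    {"item": "TV", "city": "Vancouver", "year": 1997, "amount": 100},
    {"item": "TV", "city": "Toronto", "year": 1997, "amount": 150},
    {"item": "PC", "city": "Vancouver", "year": 1998, "amount": 300},
]
dims = ("item", "city", "year")


def cube_by(rows, dims, measure):
    """Return {group_by_dims: {group_key: sum of measure}} for every subset of dims."""
    cube = {}
    for k in range(len(dims), -1, -1):          # from the base cuboid down to the apex
        for group in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in group)
                totals[key] += row[measure]
            cube[group] = dict(totals)
    return cube


cube = cube_by(rows, dims, "amount")
print(len(cube))           # number of cuboids for 3 dimensions without hierarchies
print(cube[()])            # apex cuboid: a single cell
print(cube[("item",)])     # one of the 1-D cuboids
```

With three dimensions and no hierarchies, the result contains 2^3 = 8 cuboids, matching the eight group-bys listed on the "Cube Operation" slide.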

Multi-Way Array Aggregation for Cube Computation (Cont.)
- Method: the planes should be sorted and computed according to their size in ascending order.
  - See the details of Example 2.12 (pp. 75-78)
  - Idea: keep the smallest plane in the main memory, fetch and compute only one chunk at a time for the largest plane
- Limitation of the method: it performs well only for a small number of dimensions
  - If there are a large number of dimensions, "bottom-up computation" and iceberg cube computation methods can be explored

Indexing OLAP Data: Bitmap Index
- Index on a particular column
- Each value in the column has a bit vector: bit-op is fast
- The length of the bit vector: # of records in the base table
- The i-th bit is set if the i-th row of the base table has the value for the indexed column
- not suitable for high cardinality domains
[Tables: Base table; Index on Region; Index on Type]

Indexing OLAP Data: Join Indices
- Join index: JI(R-id, S-id) where R (R-id, ...) ⋈ S (S-id, ...)
- Traditional indices map the values to a list of record ids
  - It materializes the relational join in the JI file and speeds up relational join, a rather costly operation
- In data warehouses, a join index relates the values of the dimensions of a star schema to rows in the fact table.
  - E.g., fact table: Sales and two dimensions city and product
    - A join index on city maintains for each distinct city a list of R-IDs of the tuples recording the Sales in the city
  - Join indices can span multiple dimensions

Efficient Processing OLAP Queries
- Determine which operations should be performed on the available cuboids:
  - transform drill, roll, etc. into corresponding SQL and/or OLAP operations, e.g., dice = selection + projection
- Determine to which materialized cuboid(s) the relevant operations should be applied.
- Exploring indexing structures and compressed vs. dense array structures in MOLAP

Metadata Repository
- Metadata is the data defining warehouse objects. It has the following kinds:
  - Description of the structure of the warehouse
    - schema, view, dimensions, hierarchies, derived data definitions, data mart locations and contents
  - Operational metadata
    - data lineage (history of migrated data and transformation path), currency of data (active, archived, or purged), monitoring information (warehouse usage statistics, error reports, audit trails)
  - The algorithms used for summarization
  - The mapping from operational environment to the data warehouse
  - Data related to system performance
    - warehouse schema, view and derived data definitions
  - Business data
    - business terms and definitions, ownership of data, charging policies

Data Warehouse Back-End Tools and Utilities
- Data extraction:
  - get data from multiple, heterogeneous, and external sources
- Data cleaning:
  - detect errors in the data and rectify them when possible
- Data transformation:
  - convert data from legacy or host format to warehouse format
- Load:
  - sort, summarize, consolidate, compute views, check integrity, and build indices and partitions
- Refresh:
  - propagate the updates from the data sources to the warehouse
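Returning to the bitmap index described earlier in this part, the following is a minimal sketch of the idea, using Python integers as bit vectors. The base table contents, the column values, and the helper names `build_bitmap_index` and `matching_rows` are all hypothetical, since the slide's own tables are not preserved in this text. Bit i of a value's vector is set when row i holds that value, so a conjunctive selection becomes a single bitwise AND.

```python
from collections import defaultdict

# Hypothetical base table; row position i corresponds to bit i.
base_table = [
    {"region": "Asia", "type": "Retail"},
    {"region": "Europe", "type": "Dealer"},
    {"region": "Asia", "type": "Dealer"},
    {"region": "America", "type": "Retail"},
    {"region": "Europe", "type": "Dealer"},
]


def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an int used as a bit vector."""
    index = defaultdict(int)
    for i, row in enumerate(rows):
        index[row[column]] |= 1 << i
    return dict(index)


def matching_rows(bits):
    """Decode a bit vector back into the list of row positions that are set."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


region_index = build_bitmap_index(base_table, "region")
type_index = build_bitmap_index(base_table, "type")

# Selection "region = Asia AND type = Dealer" as one bitwise operation.
asia_dealers = region_index["Asia"] & type_index["Dealer"]
print(matching_rows(asia_dealers))
```

One vector is kept per distinct value, which is why the slide notes that the technique does not suit high cardinality domains.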

Discovery-Driven Exploration of Data Cubes
- Hypothesis-driven: exploration by user, huge search space
- Discovery-driven (Sarawagi et al. '98)
  - pre-compute measures indicating exceptions, guide user in the data analysis, at all levels of aggregation
  - Exception: significantly different from the value anticipated, based on a statistical model
  - Visual cues such as background color are used to reflect the degree of exception of each cell
  - Computation of exception indicators (model fitting and computing SelfExp, InExp, and PathExp values) can be overlapped with cube construction

Examples: Discovery-Driven Data Cubes
[Figure: example screens of discovery-driven data cubes]

Complex Aggregation at Multiple Granularities: Multi-Feature Cubes
- Multi-feature cubes (Ross et al. 1998): Compute complex queries involving multiple dependent aggregates at multiple granularities
- Ex. Grouping by all subsets of {item, region, month}, find the maximum price in 1997 for each group, and the total sales among all maximum price tuples
    select item, region, month, max(price), sum(R.sales)
    from purchases
    where year = 1997
    cube by item, region, month: R
    such that R.price = max(price)
- Continuing the last example, among the max price tuples, find the min and max shelf life, and find the fraction of the total sales due to tuples that have min shelf life within the set of all max price tuples
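The first multi-feature cube query above can be sketched in plain Python to show what "dependent aggregates" means: for every group-by over subsets of item, region, and month, the maximum price is computed first, and the sales total is then restricted to the tuples that attain that maximum. The `purchases` rows and the function name `multi_feature_cube` are hypothetical, and this brute-force version is only an illustration, not the evaluation strategy of Ross et al.

```python
from itertools import combinations

# Hypothetical purchase tuples.
purchases = [
    {"item": "TV", "region": "West", "month": 1, "year": 1997, "price": 500, "sales": 10},
    {"item": "TV", "region": "West", "month": 2, "year": 1997, "price": 500, "sales": 5},
    {"item": "TV", "region": "East", "month": 1, "year": 1997, "price": 450, "sales": 7},
    {"item": "PC", "region": "West", "month": 1, "year": 1997, "price": 900, "sales": 3},
    {"item": "PC", "region": "West", "month": 1, "year": 1996, "price": 950, "sales": 4},
]
dims = ("item", "region", "month")


def multi_feature_cube(rows, dims, year=1997):
    """Return {(group_by_dims, group_key): (max_price, sales_among_max_price_tuples)}."""
    rows = [r for r in rows if r["year"] == year]      # where year = 1997
    result = {}
    for k in range(len(dims) + 1):
        for group in combinations(dims, k):            # cube by item, region, month
            groups = {}
            for r in rows:
                groups.setdefault(tuple(r[d] for d in group), []).append(r)
            for key, members in groups.items():
                max_price = max(r["price"] for r in members)
                # Dependent aggregate: only tuples R with R.price = max(price).
                sales_at_max = sum(r["sales"] for r in members if r["price"] == max_price)
                result[(group, key)] = (max_price, sales_at_max)
    return result


cube = multi_feature_cube(purchases, dims)
print(cube[((), ())])
print(cube[(("item",), ("TV",))])
```

The follow-up query on the slide (min and max shelf life among the max price tuples) would add a further aggregate over the same restricted set of tuples.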

Data Warehouse Usage
- Three kinds of data warehouse applications
  - Information processing
    - supports querying, basic statistical analysis, and reporting using crosstabs, tables, charts and graphs
  - Analytical processing
    - multidimensional analysis of data warehouse data
    - supports basic OLAP operations, slice-dice, drilling, pivoting
  - Data mining
    - knowledge discovery from hidden patterns
    - supports associations, constructing analytical models, performing classification and prediction, and presenting the mining results using visualization tools.
- Differences among the three tasks

From On-Line Analytical Processing to On-Line Analytical Mining (OLAM)
- Why online analytical mining?
  - High quality of data in data warehouses
    - DW contains integrated, consistent, cleaned data
  - Available information processing structure surrounding data warehouses
    - ODBC, OLEDB, Web accessing, service facilities, reporting and OLAP tools
  - OLAP-based exploratory data analysis
    - mining with drilling, dicing, pivoting, etc.
  - On-line selection of data mining functions
    - integration and swapping of multiple mining functions, algorithms, and tasks.
- Architecture of OLAM

An OLAM Architecture
[Figure: four layers. Layer 1, Data Repository: databases and data warehouse, built through data cleaning, data integration, and filtering. Layer 2, MDDB: multidimensional database with metadata, reached through the Database API (filtering & integration). Layer 3, OLAP/OLAM: OLAM engine and OLAP engine, reached through the Data Cube API. Layer 4, User Interface: User GUI API, taking mining queries and returning mining results.]

Summary
- Data warehouse
  - A subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process
- A multi-dimensional model of a data warehouse
  - Star schema, snowflake schema, fact constellations
  - A data cube consists of dimensions & measures
- OLAP operations: drilling, rolling, slicing, dicing and pivoting
- OLAP servers: ROLAP, MOLAP, HOLAP
- Efficient computation of data cubes
  - Partial vs. full vs. no materialization
  - Multiway array aggregation
  - Bitmap index and join index implementations
- Further development of data cube technology
  - Discovery-driven and multi-feature cubes
  - From OLAP to OLAM (on-line analytical mining)

References (I)
- S. Agarwal, R. Agrawal, P. M. Deshpande, A. Gupta, J. F. Naughton, R. Ramakrishnan, and S. Sarawagi. On the computation of multidimensional aggregates. In Proc. 1996 Int. Conf. Very Large Data Bases, 506-521, Bombay, India, Sept. 1996.
- D. Agrawal, A. E. Abbadi, A. Singh, and T. Yurek. Efficient view maintenance in data warehouses. In Proc. 1997 ACM-SIGMOD Int. Conf. Management of Data, 417-427, Tucson, Arizona, May 1997.
- R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high dimensional data for data mining applications. In Proc. 1998 ACM-SIGMOD Int. Conf. Management of Data, 94-105, Seattle, Washington, June 1998.
- R. Agrawal, A. Gupta, and S. Sarawagi. Modeling multidimensional databases. In Proc. 1997 Int. Conf. Data Engineering, 232-243, Birmingham, England, April 1997.
- K. Beyer and R. Ramakrishnan. Bottom-up computation of sparse and iceberg CUBEs. In Proc. 1999 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'99), 359-370, Philadelphia, PA, June 1999.
- S. Chaudhuri and U. Dayal. An overview of data warehousing and OLAP technology. ACM SIGMOD Record, 26:65-74, 1997.
- OLAP council. MDAPI specification version 2.0. In olapcouncil.org/research/apily.htm, 1998.
- J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, and H. Pirahesh. Data cube: A relational aggregation operator generalizing group-by, cross-tab and sub-totals. Data Mining and Knowledge Discovery, 1:29-54, 1997.

References (II)
- V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing data cubes efficiently. In Proc. 1996 ACM-SIGMOD Int. Conf. Management of Data, 205-216, Montreal, Canada, June 1996.
- Microsoft. OLEDB for OLAP programmer's reference version 1.0. In microsoft /data/oledb/olap, 1998.
- K. Ross and D. Srivastava. Fast computation of sparse datacubes. In Proc. 1997 Int. Conf. Very Large Data Bases, 116-125, Athens, Greece, Aug. 1997.
- K. A. Ross, D. Srivastava, and D. Chatziantoniou. Complex aggregation at multiple granularities. In Proc. Int. Conf. of Extending Database Technology (EDBT'98), 263-277, Valencia, Spain, March 1998.
- S. Sarawagi, R. Agrawal, and N. Megiddo. Discovery-driven exploration of OLAP data cubes. In Proc. Int. Conf. of Extending Database Technology (EDBT'98), 168-182, Valencia, Spain, March 1998.
- E. Thomsen. OLAP Solutions: Building Multidimensional Information Systems. John Wiley & Sons, 1997.
- Y. Zhao, P. M. Deshpande, and J. F. Naughton. An array-based algorithm for simultaneous multidimensional aggregates. In Proc. 1997 ACM-SIGMOD Int. Conf. Management of Data, 159-170, Tucson, Arizona, May 1997.

cs.sfu.ca/han
Thank you!
