数据库原理与系统chRelationalDatabaseDesignNormalization

上传人:m**** 文档编号:587471214 上传时间:2024-09-06 格式:PPT 页数:87 大小:799.50KB
返回 下载 相关 举报
数据库原理与系统chRelationalDatabaseDesignNormalization_第1页
第1页 / 共87页
数据库原理与系统chRelationalDatabaseDesignNormalization_第2页
第2页 / 共87页
数据库原理与系统chRelationalDatabaseDesignNormalization_第3页
第3页 / 共87页
数据库原理与系统chRelationalDatabaseDesignNormalization_第4页
第4页 / 共87页
数据库原理与系统chRelationalDatabaseDesignNormalization_第5页
第5页 / 共87页
点击查看更多>>
资源描述

《数据库原理与系统chRelationalDatabaseDesignNormalization》由会员分享,可在线阅读,更多相关《数据库原理与系统chRelationalDatabaseDesignNormalization(87页珍藏版)》请在金锄头文库上搜索。

1、Database System Concepts, 5th Ed.Chapter 7: Relational Database Design1Silberschatz, Korth and Sudarshan7.2Database System Concepts - 5th Edition, Oct 5, 2006Chapter 7: Relational Database DesignnFeatures of Good Relational DesignnAtomic Domains and First Normal FormnDecomposition Using Functional D

2、ependenciesnFunctional Dependency TheorynAlgorithms for Functional DependenciesnDecomposition Using Multivalued Dependencies nMore Normal FormnDatabase-Design ProcessnModeling Temporal Data2Silberschatz, Korth and Sudarshan7.3Database System Concepts - 5th Edition, Oct 5, 2006The Banking Schemanbran

3、ch = (branch_name, branch_city, assets)ncustomer = (customer_id, customer_name, customer_street, customer_city)nloan = (loan_number, amount)naccount = (account_number, balance)nemployee = (employee_id. employee_name, telephone_number, start_date)ndependent_name = (employee_id, dname)naccount_branch

4、= (account_number, branch_name)nloan_branch = (loan_number, branch_name)nborrower = (customer_id, loan_number)ndepositor = (customer_id, account_number)ncust_banker = (customer_id, employee_id, type)nworks_for = (worker_employee_id, manager_employee_id)npayment = (loan_number, payment_number, paymen

5、t_date, payment_amount)nsavings_account = (account_number, interest_rate)nchecking_account = (account_number, overdraft_amount)3Silberschatz, Korth and Sudarshan7.4Database System Concepts - 5th Edition, Oct 5, 2006Combine Schemas?nSuppose we combine borrower and loan to get bor_loan = (customer_id,

6、 loan_number, amount )nResult is possible repetition of information (L-100 in example below)4Silberschatz, Korth and Sudarshan7.5Database System Concepts - 5th Edition, Oct 5, 2006A Combined Schema Without RepetitionnConsider combining loan_branch and loanloan_amt_br = (loan_number, amount, branch_n

7、ame)nNo repetition (as suggested by example below)5Silberschatz, Korth and Sudarshan7.6Database System Concepts - 5th Edition, Oct 5, 2006What About Smaller Schemas?nSuppose we had started with bor_loan. How would we know to split up (decompose) it into borrower and loan?nWrite a rule “if there were

8、 a schema (loan_number, amount), then loan_number would be a candidate key”nDenote as a functional dependency: loan_number amountnIn bor_loan, because loan_number is not a candidate key, the amount of a loan may have to be repeated. This indicates the need to decompose bor_loan.nNot all decompositio

9、ns are good. Suppose we decompose employee intoemployee1 = (employee_id, employee_name)employee2 = (employee_name, telephone_number, start_date)nThe next slide shows how we lose information - we cannot reconstruct the original employee relation - and so, this is a lossy decomposition.6Silberschatz,

10、Korth and Sudarshan7.7Database System Concepts - 5th Edition, Oct 5, 2006A Lossy Decomposition7Silberschatz, Korth and Sudarshan7.8Database System Concepts - 5th Edition, Oct 5, 2006First Normal FormnDomain is atomic if its elements are considered to be indivisible unitslExamples of non-atomic domai

11、ns:4Set of names, composite attributes4Identification numbers like CS101 that can be broken up into partsnA relational schema R is in first normal form if the domains of all attributes of R are atomicnNon-atomic values complicate storage and encourage redundant (repeated) storage of datalExample: Se

12、t of accounts stored with each customer, and set of owners stored with each accountlWe assume all relations are in first normal form (and revisit this in Chapter 9)8Silberschatz, Korth and Sudarshan7.9Database System Concepts - 5th Edition, Oct 5, 2006First Normal Form (Contd)nAtomicity is actually

13、a property of how the elements of the domain are used.lExample: Strings would normally be considered indivisible lSuppose that students are given roll numbers which are strings of the form CS0012 or EE1127lIf the first two characters are extracted to find the department, the domain of roll numbers i

14、s not atomic.lDoing so is a bad idea: leads to encoding of information in application program rather than in the database.9Silberschatz, Korth and Sudarshan7.10Database System Concepts - 5th Edition, Oct 5, 2006Goal Devise a Theory for the FollowingnDecide whether a particular relation R is in “good

15、” form.nIn the case that a relation R is not in “good” form, decompose it into a set of relations R1, R2, ., Rn such that leach relation is in good form lthe decomposition is a lossless-join decompositionnOur theory is based on:lfunctional dependencieslmultivalued dependencies10Silberschatz, Korth a

16、nd Sudarshan7.11Database System Concepts - 5th Edition, Oct 5, 2006Functional DependenciesnConstraints on the set of legal relations.nRequire that the value for a certain set of attributes determines uniquely the value for another set of attributes.nA functional dependency is a generalization of the

17、 notion of a key.11Silberschatz, Korth and Sudarshan7.12Database System Concepts - 5th Edition, Oct 5, 2006Functional Dependencies (Cont.)nLet R be a relation schema R and RnThe functional dependency holds on R if and only if for any legal relations r(R), whenever any two tuples t1 and t2 of r agree

18、 on the attributes , they also agree on the attributes . That is, t1 = t2 t1 = t2 nExample: Consider r(A,B ) with the following instance of r.nOn this instance, A B does NOT hold, but B A does hold. 141 53712Silberschatz, Korth and Sudarshan7.13Database System Concepts - 5th Edition, Oct 5, 2006Func

19、tional Dependencies (Cont.)nK is a superkey for relation schema R if and only if K RnK is a candidate key for R if and only if lK R, andlfor no K, RnFunctional dependencies allow us to express constraints that cannot be expressed using superkeys. Consider the schema:bor_loan = (customer_id, loan_num

20、ber, amount ).We expect this functional dependency to hold:loan_number amountbut would not expect the following to hold: amount customer_name13Silberschatz, Korth and Sudarshan7.14Database System Concepts - 5th Edition, Oct 5, 2006Use of Functional DependenciesnWe use functional dependencies to:ltes

21、t relations to see if they are legal under a given set of functional dependencies. 4 If a relation r is legal under a set F of functional dependencies, we say that r satisfies F.lspecify constraints on the set of legal relations4We say that F holds on R if all legal relations on R satisfy the set of

22、 functional dependencies F.nNote: A specific instance of a relation schema may satisfy a functional dependency even if the functional dependency does not hold on all legal instances. lFor example, a specific instance of loan may, by chance, satisfy amount customer_name.14Silberschatz, Korth and Suda

23、rshan7.15Database System Concepts - 5th Edition, Oct 5, 2006Functional Dependencies (Cont.)nA functional dependency is trivial if it is satisfied by all instances of a relationlExample:4 customer_name, loan_number customer_name4 customer_name customer_namelIn general, is trivial if 15Silberschatz, K

24、orth and Sudarshan7.16Database System Concepts - 5th Edition, Oct 5, 2006Closure of a Set of Functional DependenciesnGiven a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F.lFor example: If A B and B C, then we can infer that A CnThe

25、set of all functional dependencies logically implied by F is the closure of F.nWe denote the closure of F by F+.nF+ is a superset of F.16Silberschatz, Korth and Sudarshan7.17Database System Concepts - 5th Edition, Oct 5, 2006Boyce-Codd Normal Formn is trivial (i.e., )n is a superkey for RA relation

26、schema R is in BCNF with respect to a set F of functional dependencies if for all functional dependencies in F+ of the form where R and R, at least one of the following holds:Example schema not in BCNF:bor_loan = ( customer_id, loan_number, amount )because loan_number amount holds on bor_loan but lo

27、an_number is not a superkey17Silberschatz, Korth and Sudarshan7.18Database System Concepts - 5th Edition, Oct 5, 2006Decomposing a Schema into BCNFnSuppose we have a schema R and a non-trivial dependency causes a violation of BCNF.We decompose R into:(U )( R - ( - ) )nIn our example, l = loan_number

28、l = amountand bor_loan is replaced byl (U ) = ( loan_number, amount )l( R - ( - ) ) = ( customer_id, loan_number )18Silberschatz, Korth and Sudarshan7.19Database System Concepts - 5th Edition, Oct 5, 2006BCNF and Dependency PreservationnConstraints, including functional dependencies, are costly to c

29、heck in practice unless they pertain to only one relationnIf it is sufficient to test only those dependencies on each individual relation of a decomposition in order to ensure that all functional dependencies hold, then that decomposition is dependency preserving.nBecause it is not always possible t

30、o achieve both BCNF and dependency preservation, we consider a weaker normal form, known as third normal form.19Silberschatz, Korth and Sudarshan7.20Database System Concepts - 5th Edition, Oct 5, 2006Third Normal FormnA relation schema R is in third normal form (3NF) if for all: in F+at least one of

31、 the following holds:l is trivial (i.e., )l is a superkey for RlEach attribute A in is contained in a candidate key for R. (NOTE: each attribute may be in a different candidate key)nIf a relation is in BCNF it is in 3NF (since in BCNF one of the first two conditions above must hold).nThird condition

32、 is a minimal relaxation of BCNF to ensure dependency preservation (will see why later).20Silberschatz, Korth and Sudarshan7.21Database System Concepts - 5th Edition, Oct 5, 2006Goals of NormalizationnLet R be a relation scheme with a set F of functional dependencies.nDecide whether a relation schem

33、e R is in “good” form.nIn the case that a relation scheme R is not in “good” form, decompose it into a set of relation scheme R1, R2, ., Rn such that leach relation scheme is in good form lthe decomposition is a lossless-join decompositionlPreferably, the decomposition should be dependency preservin

34、g.21Silberschatz, Korth and Sudarshan7.22Database System Concepts - 5th Edition, Oct 5, 2006How good is BCNF?nThere are database schemas in BCNF that do not seem to be sufficiently normalized nConsider a database classes (course, teacher, book ) such that (c, t, b) classes means that t is qualified

35、to teach c, and b is a required textbook for cnThe database is supposed to list for each course the set of teachers any one of which can be the courses instructor, and the set of books, all of which are required for the course (no matter who teaches it).22Silberschatz, Korth and Sudarshan7.23Databas

36、e System Concepts - 5th Edition, Oct 5, 2006nThere are no non-trivial functional dependencies and therefore the relation is in BCNF nInsertion anomalies i.e., if Marilyn is a new teacher that can teach database, two tuples need to be inserted(database, Marilyn, DB Concepts)(database, Marilyn, Ullman

37、)courseteacherbookdatabasedatabasedatabasedatabasedatabasedatabaseoperating systemsoperating systemsoperating systemsoperating systemsAviAviHankHankSudarshanSudarshanAviAvi PetePeteDB ConceptsUllmanDB ConceptsUllmanDB ConceptsUllmanOS ConceptsStallingsOS ConceptsStallingsclassesHow good is BCNF? (Co

38、nt.)23Silberschatz, Korth and Sudarshan7.24Database System Concepts - 5th Edition, Oct 5, 2006nTherefore, it is better to decompose classes into:courseteacherdatabasedatabasedatabaseoperating systemsoperating systemsAviHankSudarshanAvi Jimteachescoursebookdatabasedatabaseoperating systemsoperating s

39、ystemsDB ConceptsUllmanOS ConceptsShawtextThis suggests the need for higher normal forms, such as Fourth Normal Form (4NF), which we shall see later.How good is BCNF? (Cont.)How good is BCNF? (Cont.)24Silberschatz, Korth and Sudarshan7.25Database System Concepts - 5th Edition, Oct 5, 2006Functional-

40、Dependency TheorynWe now consider the formal theory that tells us which functional dependencies are implied logically by a given set of functional dependencies.nWe then develop algorithms to generate lossless decompositions into BCNF and 3NFnWe then develop algorithms to test if a decomposition is d

41、ependency-preserving25Silberschatz, Korth and Sudarshan7.26Database System Concepts - 5th Edition, Oct 5, 2006Closure of a Set of Functional DependenciesnGiven a set F set of functional dependencies, there are certain other functional dependencies that are logically implied by F.lFor example: If A B

42、 and B C, then we can infer that A CnThe set of all functional dependencies logically implied by F is the closure of F.nWe denote the closure of F by F+.nWe can find all of F+ by applying Armstrongs Axioms:lif , then (reflexivity)lif , then (augmentation)lif , and , then (transitivity)nThese rules a

43、re lsound (generate only functional dependencies that actually hold) and lcomplete (generate all functional dependencies that hold).26Silberschatz, Korth and Sudarshan7.27Database System Concepts - 5th Edition, Oct 5, 2006ExamplenR = (A, B, C, G, H, I)F = A B A CCG HCG I B Hnsome members of F+lA H 4

44、by transitivity from A B and B HlAG I 4by augmenting A C with G, to get AG CG and then transitivity with CG I lCG HI 4by augmenting CG I to infer CG CGI, and augmenting of CG H to infer CGI HI, and then transitivity27Silberschatz, Korth and Sudarshan7.28Database System Concepts - 5th Edition, Oct 5,

45、 2006Procedure for Computing F+nTo compute the closure of a set of functional dependencies F: F + = Frepeatfor each functional dependency f in F+ apply reflexivity and augmentation rules on f add the resulting functional dependencies to F +for each pair of functional dependencies f1and f2 in F + if

46、f1 and f2 can be combined using transitivity then add the resulting functional dependency to F +until F + does not change any furtherNOTE: We shall see an alternative procedure for this task later28Silberschatz, Korth and Sudarshan7.29Database System Concepts - 5th Edition, Oct 5, 2006Closure of Fun

47、ctional Dependencies (Cont.)nWe can further simplify manual computation of F+ by using the following additional rules.lIf holds and holds, then holds (union)lIf holds, then holds and holds (decomposition)lIf holds and holds, then holds (pseudotransitivity)The above rules can be inferred from Armstro

48、ngs axioms.29Silberschatz, Korth and Sudarshan7.30Database System Concepts - 5th Edition, Oct 5, 2006Closure of Attribute SetsnGiven a set of attributes , define the closure of under F (denoted by +) as the set of attributes that are functionally determined by under Fn Algorithm to compute +, the cl

49、osure of under F result := ;while (changes to result) dofor each in F dobeginif result then result := result end30Silberschatz, Korth and Sudarshan7.31Database System Concepts - 5th Edition, Oct 5, 2006Example of Attribute Set ClosurenR = (A, B, C, G, H, I)nF = A BA C CG HCG IB Hn(AG)+1. result = AG

50、2. result = ABCG(A C and A B)3. result = ABCGH(CG H and CG AGBC)4. result = ABCGHI(CG I and CG AGBCH)nIs AG a candidate key? 1.Is AG a super key?1.Does AG R? = Is (AG)+ R2.Is any subset of AG a superkey?1.Does A R? = Is (A)+ R2.Does G R? = Is (G)+ R31Silberschatz, Korth and Sudarshan7.32Database Sys

51、tem Concepts - 5th Edition, Oct 5, 2006Uses of Attribute ClosureThere are several uses of the attribute closure algorithm:nTesting for superkey:lTo test if is a superkey, we compute +, and check if + contains all attributes of R.nTesting functional dependencieslTo check if a functional dependency ho

52、lds (or, in other words, is in F+), just check if +. lThat is, we compute + by using attribute closure, and then check if it contains . lIs a simple and cheap test, and very usefulnComputing closure of FlFor each R, we find the closure +, and for each S +, we output a functional dependency S.32Silbe

53、rschatz, Korth and Sudarshan7.33Database System Concepts - 5th Edition, Oct 5, 2006Canonical CovernSets of functional dependencies may have redundant dependencies that can be inferred from the otherslFor example: A C is redundant in: A B, B ClParts of a functional dependency may be redundant4E.g.: o

54、n RHS: A B, B C, A CD can be simplified to A B, B C, A D 4E.g.: on LHS: A B, B C, AC D can be simplified to A B, B C, A D nIntuitively, a canonical cover of F is a “minimal” set of functional dependencies equivalent to F, having no redundant dependencies or redundant parts of dependencies 33Silbersc

55、hatz, Korth and Sudarshan7.34Database System Concepts - 5th Edition, Oct 5, 2006Extraneous AttributesnConsider a set F of functional dependencies and the functional dependency in F.lAttribute A is extraneous in if A and F logically implies (F ) ( A) .lAttribute A is extraneous in if A and the set of

56、 functional dependencies (F ) ( A) logically implies F.nNote: implication in the opposite direction is trivial in each of the cases above, since a “stronger” functional dependency always implies a weaker onenExample: Given F = A C, AB C lB is extraneous in AB C because A C, AB C logically implies A

57、C (I.e. the result of dropping B from AB C).nExample: Given F = A C, AB CDlC is extraneous in AB CD since AB C can be inferred even after deleting C34Silberschatz, Korth and Sudarshan7.35Database System Concepts - 5th Edition, Oct 5, 2006Testing if an Attribute is ExtraneousnConsider a set F of func

58、tional dependencies and the functional dependency in F.nTo test if attribute A is extraneous in pute ( A)+ using the dependencies in F 2. check that ( A)+ contains ; if it does, A is extraneous in nTo test if attribute A is extraneous in pute + using only the dependencies in F = (F ) ( A), 2. check

59、that + contains A; if it does, A is extraneous in 35Silberschatz, Korth and Sudarshan7.36Database System Concepts - 5th Edition, Oct 5, 2006Canonical CovernA canonical cover for F is a set of dependencies Fc such that lF logically implies all dependencies in Fc, and lFc logically implies all depende

60、ncies in F, andlNo functional dependency in Fc contains an extraneous attribute, andlEach left side of functional dependency in Fc is unique.nTo compute a canonical cover for F:repeatUse the union rule to replace any dependencies in F 1 1 and 1 2 with 1 1 2 Find a functional dependency with an extra

61、neous attribute either in or in If an extraneous attribute is found, delete it from until F does not changenNote: Union rule may become applicable after some extraneous attributes have been deleted, so it has to be re-applied36Silberschatz, Korth and Sudarshan7.37Database System Concepts - 5th Editi

62、on, Oct 5, 2006Computing a Canonical CovernR = (A, B, C)F = A BC B C A BAB CnCombine A BC and A B into A BClSet is now A BC, B C, AB CnA is extraneous in AB ClCheck if the result of deleting A from AB C is implied by the other dependencies4Yes: in fact, B C is already present!lSet is now A BC, B CnC

63、 is extraneous in A BC lCheck if A C is logically implied by A B and the other dependencies4Yes: using transitivity on A B and B C. Can use attribute closure of A in more complex casesnThe canonical cover is: A BB C37Silberschatz, Korth and Sudarshan7.38Database System Concepts - 5th Edition, Oct 5,

64、 2006Lossless-join DecompositionnFor the case of R = (R1, R2), we require that for all possible relations r on schema Rr = R1 (r ) R2 (r ) nA decomposition of R into R1 and R2 is lossless join if and only if at least one of the following dependencies is in F+:lR1 R2 R1lR1 R2 R238Silberschatz, Korth

65、and Sudarshan7.39Database System Concepts - 5th Edition, Oct 5, 2006ExamplenR = (A, B, C)F = A B, B C)lCan be decomposed in two different waysnR1 = (A, B), R2 = (B, C)lLossless-join decomposition: R1 R2 = B and B BClDependency preservingnR1 = (A, B), R2 = (A, C)lLossless-join decomposition: R1 R2 =

66、A and A ABlNot dependency preserving (cannot check B C without computing R1 R2)39Silberschatz, Korth and Sudarshan7.40Database System Concepts - 5th Edition, Oct 5, 2006Dependency PreservationDependency Preservationn Let Fi be the set of dependencies F + that include only attributes in Ri. 4 A decom

67、position is dependency preserving, if (F1 F2 Fn )+ = F +4If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive.40Silberschatz, Korth and Sudarshan7.41Database System Concepts - 5th Edition, Oct 5, 2006Testing for Dependency Prese

68、rvationnTo check if a dependency is preserved in a decomposition of R into R1, R2, , Rn we apply the following test (with attribute closure done with respect to F)lresult = while (changes to result) dofor each Ri in the decompositiont = (result Ri)+ Riresult = result tlIf result contains all attribu

69、tes in , then the functional dependency is preserved.nWe apply the test on all dependencies in F to check if a decomposition is dependency preservingnThis procedure takes polynomial time, instead of the exponential time required to compute F+ and (F1 F2 Fn)+ 41Silberschatz, Korth and Sudarshan7.42Da

70、tabase System Concepts - 5th Edition, Oct 5, 2006ExamplenR = (A, B, C )F = A B B CKey = AnR is not in BCNFnDecomposition R1 = (A, B), R2 = (B, C)lR1 and R2 in BCNFlLossless-join decompositionlDependency preserving42Silberschatz, Korth and Sudarshan7.43Database System Concepts - 5th Edition, Oct 5, 2

71、006Testing for BCNFnTo check if a non-trivial dependency causes a violation of BCNF1. compute + (the attribute closure of ), and 2. verify that it includes all attributes of R, that is, it is a superkey of R.nSimplified test: To check if a relation schema R is in BCNF, it suffices to check only the

72、dependencies in the given set F for violation of BCNF, rather than checking all dependencies in F+.lIf none of the dependencies in F causes a violation of BCNF, then none of the dependencies in F+ will cause a violation of BCNF either.nHowever, using only F is incorrect when testing a relation in a

73、decomposition of RlConsider R = (A, B, C, D, E), with F = A B, BC D4Decompose R into R1 = (A,B) and R2 = (A,C,D, E) 4Neither of the dependencies in F contain only attributes from (A,C,D,E) so we might be mislead into thinking R2 satisfies BCNF. 4In fact, dependency AC D in F+ shows R2 is not in BCNF

74、. 43Silberschatz, Korth and Sudarshan7.44Database System Concepts - 5th Edition, Oct 5, 2006Testing Decomposition for BCNFnTo check if a relation Ri in a decomposition of R is in BCNF, lEither test Ri for BCNF with respect to the restriction of F to Ri (that is, all FDs in F+ that contain only attri

75、butes from Ri)lor use the original set of dependencies F that hold on R, but with the following test:for every set of attributes Ri, check that + (the attribute closure of ) either includes no attribute of Ri- , or includes all attributes of Ri.4If the condition is violated by some in F, the depende

76、ncy (+ - ) Rican be shown to hold on Ri, and Ri violates BCNF.4We use above dependency to decompose Ri44Silberschatz, Korth and Sudarshan7.45Database System Concepts - 5th Edition, Oct 5, 2006BCNF Decomposition Algorithmresult := R ;done := false;compute F +;while (not done) doif (there is a schema

77、Ri in result that is not in BCNF)then beginlet be a nontrivial functional dependency that holds on Ri such that Ri is not in F +, and = ; result := (result Ri ) (Ri ) (, ); endelse done := true; Note: each Ri is in BCNF, and decomposition is lossless-join.45Silberschatz, Korth and Sudarshan7.46Datab

78、ase System Concepts - 5th Edition, Oct 5, 2006Example of BCNF DecompositionnR = (A, B, C )F = A B B CKey = AnR is not in BCNF (B C but B is not superkey)nDecompositionlR1 = (B, C)lR2 = (A,B)46Silberschatz, Korth and Sudarshan7.47Database System Concepts - 5th Edition, Oct 5, 2006Example of BCNF Deco

79、mpositionnOriginal relation R and functional dependency F R = (branch_name, branch_city, assets, customer_name, loan_number, amount ) F = branch_name assets branch_city loan_number amount branch_name Key = loan_number, customer_namenDecompositionlR1 = (branch_name, branch_city, assets )lR2 = (branch

80、_name, customer_name, loan_number, amount )lR3 = (branch_name, loan_number, amount )lR4 = (customer_name, loan_number )nFinal decomposition R1, R3, R447Silberschatz, Korth and Sudarshan7.48Database System Concepts - 5th Edition, Oct 5, 2006BCNF and Dependency PreservationnR = (J, K, L )F = JK L L K

81、Two candidate keys = JK and JLnR is not in BCNFnAny decomposition of R will fail to preserveJK L This implies that testing for JK L requires a joinIt is not always possible to get a BCNF decomposition that is dependency preserving48Silberschatz, Korth and Sudarshan7.49Database System Concepts - 5th

82、Edition, Oct 5, 2006Third Normal Form: MotivationnThere are some situations where lBCNF is not dependency preserving, and lefficient checking for FD violation on updates is importantnSolution: define a weaker normal form, called Third Normal Form (3NF)lAllows some redundancy (with resultant problems

83、; we will see examples later)lBut functional dependencies can be checked on individual relations without computing a join.lThere is always a lossless-join, dependency-preserving decomposition into 3NF.49Silberschatz, Korth and Sudarshan7.50Database System Concepts - 5th Edition, Oct 5, 20063NF Examp

84、lenRelation R:lR = (J, K, L )F = JK L, L K lTwo candidate keys: JK and JLlR is in 3NFJK L JK is a superkeyL KK is contained in a candidate key50Silberschatz, Korth and Sudarshan7.51Database System Concepts - 5th Edition, Oct 5, 2006Redundancy in 3NFJj1j2j3nullLl1l1l1l2Kk1k1k1k2nrepetition of informa

85、tion (e.g., the relationship l1, k1) nneed to use null values (e.g., to represent the relationship l2, k2 where there is no corresponding value for J).nThere is some redundancy in this schemanExample of problems due to redundancy in 3NFlR = (J, K, L)F = JK L, L K 51Silberschatz, Korth and Sudarshan7

86、.52Database System Concepts - 5th Edition, Oct 5, 2006Testing for 3NFnOptimization: Need to check only FDs in F, need not check all FDs in F+.nUse attribute closure to check for each dependency , if is a superkey.nIf is not a superkey, we have to verify if each attribute in is contained in a candida

87、te key of Rlthis test is rather more expensive, since it involve finding candidate keysltesting for 3NF has been shown to be NP-hardlInterestingly, decomposition into third normal form (described shortly) can be done in polynomial time 52Silberschatz, Korth and Sudarshan7.53Database System Concepts

88、- 5th Edition, Oct 5, 20063NF Decomposition AlgorithmLet Fc be a canonical cover for F;i := 0;for each functional dependency in Fc doif none of the schemas Rj, 1 j i contains then begini := i + 1;Ri := endif none of the schemas Rj, 1 j i contains a candidate key for Rthen begini := i + 1;Ri := any c

89、andidate key for R;end return (R1, R2, ., Ri) 53Silberschatz, Korth and Sudarshan7.54Database System Concepts - 5th Edition, Oct 5, 20063NF Decomposition Algorithm (Cont.)nAbove algorithm ensures:leach relation schema Ri is in 3NFldecomposition is dependency preserving and lossless-joinlProof of cor

90、rectness is at end of this presentation (click here)54Silberschatz, Korth and Sudarshan7.55Database System Concepts - 5th Edition, Oct 5, 20063NF Decomposition: An ExamplenRelation schema:cust_banker_branch = (customer_id, employee_id, branch_name, type )nThe functional dependencies for this relatio

91、n schema are:1.customer_id, employee_id branch_name, type2.employee_id branch_name3.customer_id, branch_name employee_idnWe first compute a canonical coverlbranch_name is extraneous in the r.h.s. of the 1st dependencylNo other attribute is extraneous, so we get FC = customer_id, employee_id type emp

92、loyee_id branch_name customer_id, branch_name employee_id55Silberschatz, Korth and Sudarshan7.56Database System Concepts - 5th Edition, Oct 5, 20063NF Decompsition Example (Cont.)nThe for loop generates following 3NF schema: (customer_id, employee_id, type ) (employee_id, branch_name) (customer_id,

93、branch_name, employee_id)lObserve that (customer_id, employee_id, type ) contains a candidate key of the original schema, so no further relation schema needs be addednIf the FDs were considered in a different order, with the 2nd one considered after the 3rd, (employee_id, branch_name) would not be i

94、ncluded in the decomposition because it is a subset of (customer_id, branch_name, employee_id)nMinor extension of the 3NF decomposition algorithm: at end of for loop, detect and delete schemas, such as (employee_id, branch_name), which are subsets of other schemaslresult will not depend on the order

95、 in which FDs are considerednThe resultant simplified 3NF schema is: (customer_id, employee_id, type) (customer_id, branch_name, employee_id)56Silberschatz, Korth and Sudarshan7.57Database System Concepts - 5th Edition, Oct 5, 2006Comparison of BCNF and 3NFnIt is always possible to decompose a relat

96、ion into a set of relations that are in 3NF such that:lthe decomposition is losslesslthe dependencies are preservednIt is always possible to decompose a relation into a set of relations that are in BCNF such that:lthe decomposition is losslesslit may not be possible to preserve dependencies.57Silber

97、schatz, Korth and Sudarshan7.58Database System Concepts - 5th Edition, Oct 5, 2006Design GoalsnGoal for a relational database design is:lBCNF.lLossless join.lDependency preservation.nIf we cannot achieve this, we accept one oflLack of dependency preservation lRedundancy due to use of 3NFnInteresting

98、ly, SQL does not provide a direct way of specifying functional dependencies other than superkeys.Can specify FDs using assertions, but they are expensive to testnEven if we had a dependency preserving decomposition, using SQL we would not be able to efficiently test a functional dependency whose lef

99、t hand side is not a key.58Silberschatz, Korth and Sudarshan7.59Database System Concepts - 5th Edition, Oct 5, 2006Multivalued Dependencies (MVDs)nLet R be a relation schema and let R and R. The multivalued dependency holds on R if in any legal relation r(R), for all pairs for tuples t1 and t2 in r

100、such that t1 = t2 , there exist tuples t3 and t4 in r such that: t1 = t2 = t3 = t4 t3 = t1 t3R = t2R t4 = t2 t4R = t1R 59Silberschatz, Korth and Sudarshan7.60Database System Concepts - 5th Edition, Oct 5, 2006MVD (Cont.)nTabular representation of 60Silberschatz, Korth and Sudarshan7.61Database Syste

101、m Concepts - 5th Edition, Oct 5, 2006ExamplenLet R be a relation schema with a set of attributes that are partitioned into 3 nonempty subsets.Y, Z, WnWe say that Y Z (Y multidetermines Z )if and only if for all possible relations r (R ) r and rthen r and rnNote that since the behavior of Z and W are

102、 identical it follows that Y Z if Y W 61Silberschatz, Korth and Sudarshan7.62Database System Concepts - 5th Edition, Oct 5, 2006Example (Cont.)nIn our example:course teachercourse booknThe above formal definition is supposed to formalize the notion that given a particular value of Y (course) it has

103、associated with it a set of values of Z (teacher) and a set of values of W (book), and these two sets are in some sense independent of each other.nNote: lIf Y Z then Y ZlIndeed we have (in above notation) Z1 = Z2The claim follows.62Silberschatz, Korth and Sudarshan7.63Database System Concepts - 5th

104、Edition, Oct 5, 2006Use of Multivalued DependenciesnWe use multivalued dependencies in two ways: 1. To test relations to determine whether they are legal under a given set of functional and multivalued dependencies2. To specify constraints on the set of legal relations. We shall thus concern ourselv

105、es only with relations that satisfy a given set of functional and multivalued dependencies.nIf a relation r fails to satisfy a given multivalued dependency, we can construct a relations r that does satisfy the multivalued dependency by adding tuples to r. 63Silberschatz, Korth and Sudarshan7.64Datab

106、ase System Concepts - 5th Edition, Oct 5, 2006Theory of MVDsnFrom the definition of multivalued dependency, we can derive the following rule:lIf , then That is, every functional dependency is also a multivalued dependencynThe closure D+ of D is the set of all functional and multivalued dependencies

107、logically implied by D. lWe can compute D+ from D, using the formal definitions of functional dependencies and multivalued dependencies.lWe can manage with such reasoning for very simple multivalued dependencies, which seem to be most common in practicelFor complex dependencies, it is better to reas

108、on about sets of dependencies using a system of inference rules (see Appendix C).64Silberschatz, Korth and Sudarshan7.65Database System Concepts - 5th Edition, Oct 5, 2006Fourth Normal FormnA relation schema R is in 4NF with respect to a set D of functional and multivalued dependencies if for all mu

109、ltivalued dependencies in D+ of the form , where R and R, at least one of the following hold:l is trivial (i.e., or = R)l is a superkey for schema RnIf a relation is in 4NF it is in BCNF65Silberschatz, Korth and Sudarshan7.66Database System Concepts - 5th Edition, Oct 5, 2006Restriction of Multivalu

110、ed DependenciesnThe restriction of D to Ri is the set Di consisting oflAll functional dependencies in D+ that include only attributes of RilAll multivalued dependencies of the form ( Ri) where Ri and is in D+ 66Silberschatz, Korth and Sudarshan7.67Database System Concepts - 5th Edition, Oct 5, 20064

111、NF Decomposition Algorithm result: = R;done := false;compute D+;Let Di denote the restriction of D+ to Ri while (not done) if (there is a schema Ri in result that is not in 4NF) then begin let be a nontrivial multivalued dependency that holds on Ri such that Ri is not in Di, and ; result := (result

112、- Ri) (Ri - ) (, ); end else done:= true;Note: each Ri is in 4NF, and decomposition is lossless-join67Silberschatz, Korth and Sudarshan7.68Database System Concepts - 5th Edition, Oct 5, 2006ExamplenR =(A, B, C, G, H, I)F = A BB HICG H nR is not in 4NF since A B and A is not a superkey for RnDecompos

113、itiona) R1 = (A, B) (R1 is in 4NF)b) R2 = (A, C, G, H, I) (R2 is not in 4NF)c) R3 = (C, G, H) (R3 is in 4NF)d) R4 = (A, C, G, I) (R4 is not in 4NF)nSince A B and B HI, A HI, A Ie) R5 = (A, I) (R5 is in 4NF)f)R6 = (A, C, G) (R6 is in 4NF)68Silberschatz, Korth and Sudarshan7.69Database System Concepts

114、 - 5th Edition, Oct 5, 2006Further Normal FormsnJoin dependencies generalize multivalued dependenciesllead to project-join normal form (PJNF) (also called fifth normal form)nA class of even more general constraints, leads to a normal form called domain-key normal form.nProblem with these generalized

115、 constraints: are hard to reason with, and no set of sound and complete set of inference rules exists.nHence rarely used69Silberschatz, Korth and Sudarshan7.70Database System Concepts - 5th Edition, Oct 5, 2006Overall Database Design ProcessnWe have assumed schema R is givenlR could have been genera

116、ted when converting E-R diagram to a set of tables.lR could have been a single relation containing all attributes that are of interest (called universal relation).lNormalization breaks R into smaller relations.lR could have been the result of some ad hoc design of relations, which we then test/conve

117、rt to normal form.70Silberschatz, Korth and Sudarshan7.71Database System Concepts - 5th Edition, Oct 5, 2006ER Model and NormalizationnWhen an E-R diagram is carefully designed, identifying all entities correctly, the tables generated from the E-R diagram should not need further normalization.nHowev

118、er, in a real (imperfect) design, there can be functional dependencies from non-key attributes of an entity to other attributes of the entitylExample: an employee entity with attributes department_number and department_address, and a functional dependency department_number department_addresslGood de

119、sign would have made department an entitynFunctional dependencies from non-key attributes of a relationship set possible, but rare - most relationships are binary 71Silberschatz, Korth and Sudarshan7.72Database System Concepts - 5th Edition, Oct 5, 2006Denormalization for PerformancenMay want to use

120、 non-normalized schema for performancenFor example, displaying customer_name along with account_number and balance requires join of account with depositornAlternative 1: Use denormalized relation containing attributes of account as well as depositor with all above attributeslfaster lookuplextra spac

121、e and extra execution time for updateslextra coding work for programmer and possibility of error in extra codenAlternative 2: use a materialized view defined as account depositorlBenefits and drawbacks same as above, except no extra coding work for programmer and avoids possible errors72Silberschatz

122、, Korth and Sudarshan7.73Database System Concepts - 5th Edition, Oct 5, 2006Other Design IssuesnSome aspects of database design are not caught by normalizationnExamples of bad database design, to be avoided: Instead of earnings (company_id, year, amount ), use learnings_2004, earnings_2005, earnings

123、_2006, etc., all on the schema (company_id, earnings).4Above are in BCNF, but make querying across years difficult and needs new table each yearlcompany_year(company_id, earnings_2004, earnings_2005, earnings_2006)4Also in BCNF, but also makes querying across years difficult and requires new attribu

124、te each year.4Is an example of a crosstab, where values for one attribute become column names4Used in spreadsheets, and in data analysis tools73Silberschatz, Korth and Sudarshan7.74Database System Concepts - 5th Edition, Oct 5, 2006Modeling Temporal DatanTemporal data have an association time interv

125、al during which the data are valid.nA snapshot is the value of the data at a particular point in timenSeveral proposals to extend ER model by adding valid time tolattributes, e.g. address of a customer at different points in timelentities, e.g. time duration when an account existslrelationships, e.g

126、. time during which a customer owned an accountnBut no accepted standardnAdding a temporal component results in functional dependencies likecustomer_id customer_street, customer_citynot to hold, because the address varies over timenA temporal functional dependency X Y holds on schema R if the functi

127、onal dependency X Y holds on all snapshots for all legal instances r (R )t74Silberschatz, Korth and Sudarshan7.75Database System Concepts - 5th Edition, Oct 5, 2006Modeling Temporal Data (Cont.)nIn practice, database designers may add start and end time attributes to relationslE.g. course(course_id,

128、 course_title) course(course_id, course_title, start, end)4Constraint: no two tuples can have overlapping valid timesHard to enforce efficientlynForeign key references may be to current version of data, or to data at a point in timelE.g. student transcript should refer to course information at the t

129、ime the course was taken75Database System Concepts, 5th Ed.End of Chapter76Database System Concepts, 5th Ed.Proof of Correctness of 3NF Decomposition Algorithm77Silberschatz, Korth and Sudarshan7.78Database System Concepts - 5th Edition, Oct 5, 2006Correctness of 3NF Decomposition Algorithmn3NF deco

130、mposition algorithm is dependency preserving (since there is a relation for every FD in Fc)nDecomposition is losslesslA candidate key (C ) is in one of the relations Ri in decompositionlClosure of candidate key under Fc must contain all attributes in R. lFollow the steps of attribute closure algorit

131、hm to show there is only one tuple in the join result for each tuple in Ri78Silberschatz, Korth and Sudarshan7.79Database System Concepts - 5th Edition, Oct 5, 2006Correctness of 3NF Decomposition Algorithm (Contd.)Claim: if a relation Ri is in the decomposition generated by the above algorithm, the

132、n Ri satisfies 3NF.nLet Ri be generated from the dependency nLet B be any non-trivial functional dependency on Ri. (We need only consider FDs whose right-hand side is a single attribute.)nNow, B can be in either or but not in both. Consider each case separately.79Silberschatz, Korth and Sudarshan7.8

133、0Database System Concepts - 5th Edition, Oct 5, 2006Correctness of 3NF Decomposition (Contd.)nCase 1: If B in :lIf is a superkey, the 2nd condition of 3NF is satisfiedlOtherwise must contain some attribute not in lSince B is in F+ it must be derivable from Fc, by using attribute closure on .lAttribu

134、te closure not have used . If it had been used, must be contained in the attribute closure of , which is not possible, since we assumed is not a superkey.lNow, using (- B) and B, we can derive B(since , and B since B is non-trivial)lThen, B is extraneous in the right-hand side of ; which is not poss

135、ible since is in Fc.lThus, if B is in then must be a superkey, and the second condition of 3NF must be satisfied.80Silberschatz, Korth and Sudarshan7.81Database System Concepts - 5th Edition, Oct 5, 2006Correctness of 3NF Decomposition (Contd.)nCase 2: B is in .lSince is a candidate key, the third a

136、lternative in the definition of 3NF is trivially satisfied.lIn fact, we cannot show that is a superkey.lThis shows exactly why the third alternative is present in the definition of 3NF.Q.E.D.81Silberschatz, Korth and Sudarshan7.82Database System Concepts - 5th Edition, Oct 5, 2006Figure 7.5: Sample

137、Relation r82Silberschatz, Korth and Sudarshan7.83Database System Concepts - 5th Edition, Oct 5, 2006Figure 7.683Silberschatz, Korth and Sudarshan7.84Database System Concepts - 5th Edition, Oct 5, 2006Figure 7.784Silberschatz, Korth and Sudarshan7.85Database System Concepts - 5th Edition, Oct 5, 2006

138、Figure 7.15: An Example of Figure 7.15: An Example of Redundancy in a BCNF RelationRedundancy in a BCNF Relation85Silberschatz, Korth and Sudarshan7.86Database System Concepts - 5th Edition, Oct 5, 2006Figure 7.16: An Illegal R2 Relation86Silberschatz, Korth and Sudarshan7.87Database System Concepts - 5th Edition, Oct 5, 2006Figure 7.18: Relation of Practice Exercise 7.287

展开阅读全文
相关资源
正为您匹配相似的精品文档
相关搜索

最新文档


当前位置:首页 > 办公文档 > 工作计划

电脑版 |金锄头文库版权所有
经营许可证:蜀ICP备13022795号 | 川公网安备 51140202000112号