Relational Division (Dzielenie relacyjne). Databases and Data Warehouses (Bazy i hurtownie danych), TWO1 2009/2010 https://ophelia.cs.put.poznan.pl/webdav/dbdw/students/dbdw-winter_2009-10/


1 Relational Division (Dzielenie relacyjne). Databases and Data Warehouses, TWO1 2009/2010
Based on: V.M. Matos, R. Grasser, Assessing performance of the relational division operator. Data Base Management

2 7/11/2009 Databases and Data Warehouses: Relational Division
The basic operators of the relational algebra: union (UNION), difference (MINUS), Cartesian product, projection and selection (SELECT ... FROM ...).
Additional operators added to the relational algebra: join (the most popular in practice), rename (renaming fields), intersection, division.

3 Relational Division
The division operator is less common than select-project-join queries; however, it is applicable to many common queries, for example: find students who have taken all the core source courses; find customers who have ordered all items from a given line of products.
The division operator can also be employed in data mining algorithms (e.g., generation of association rules).

4 Informal Definition
The division operator verifies whether a candidate subject is related to each of the values held in the base set. The base set is called the divisor (or denominator, T2[B]), and the table holding the subjects' data is called the dividend (or numerator, T1[A, B]).
The expression T1[A, B]/T2[B] selects those A values from the dividend table T1[A, B] whose associated B values are a superset of the B values held in the divisor table T2[B].
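The informal definition can be tried out directly on sets. The following is a minimal sketch in Python, not the presentation's own code; the function name `divide` and the student/course sample data are assumptions made for illustration.

```python
from collections import defaultdict


def divide(dividend, divisor):
    """Return the A values whose B values are a superset of the divisor."""
    b_values_by_a = defaultdict(set)
    for a, b in dividend:
        b_values_by_a[a].add(b)
    required = set(divisor)
    # bs >= required is Python's superset test on sets
    return {a for a, bs in b_values_by_a.items() if bs >= required}


# Hypothetical sample data: T1[A, B] as (student, course) pairs,
# T2[B] as the set of core courses.
T1 = {
    ("ann", "db"), ("ann", "os"), ("ann", "ai"),
    ("bob", "db"),
    ("eve", "db"), ("eve", "os"),
}
T2 = {"db", "os"}

print(sorted(divide(T1, T2)))  # ['ann', 'eve']
```

Only `ann` and `eve` are related to every value in the divisor; `bob` is missing `os` and is therefore excluded.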

5 Informal Definition (figure; not preserved in the transcript)

6 Formal Definition: Relational Algebra
Let us assume that the numerator table T1 always consists of two columns, A and B, and that the denominator T2 has only one attribute, B. Then the expression T1[A, B]/T2[B] is semantically equivalent to:
    T1[A, B]/T2[B] = T1[A] − ((T1[A] × T2[B]) − T1[A, B])[A]
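The algebraic expression can be followed step by step with Python sets. This is only a sketch under the slide's assumptions (T1 has columns A and B, T2 has a single column B); the name `divide_algebraic` and the sample data are hypothetical.

```python
from itertools import product


def divide_algebraic(t1, t2):
    """Evaluate T1[A] - ((T1[A] x T2[B]) - T1[A, B])[A] on Python sets."""
    all_a = {a for a, _ in t1}               # T1[A]: projection on A
    all_pairs = set(product(all_a, t2))      # T1[A] x T2[B]
    missing = all_pairs - set(t1)            # required pairs absent from T1
    disqualified = {a for a, _ in missing}   # (...)[A]: projection on A
    return all_a - disqualified              # A values with nothing missing


# Hypothetical sample data
t1 = {(1, "x"), (1, "y"), (2, "x"), (3, "x"), (3, "y"), (3, "z")}
t2 = {"x", "y"}

print(sorted(divide_algebraic(t1, t2)))  # [1, 3]
```

The value 2 is removed because the pair (2, "y") appears in the Cartesian product but not in T1.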

7 Formal Definition: Relational Algebra (figure; not preserved in the transcript)

8 Formal Definition: Tuple-calculus
Using the relational tuple-calculus language, the division operator can be rephrased as follows:
    T1[A, B]/T2[B] = { t1[A] | t1 ∈ T1 and for-all t2 (t2 ∈ T2 ⇒ exists t3 (t3 ∈ T1 and (t1[A] = t3[A]) and (t2[B] = t3[B]))) }

9 SQL Implementation: Q0
    SELECT A
    FROM T1
    WHERE B IN (SELECT B FROM T2)
    GROUP BY A
    HAVING COUNT(*) = (SELECT COUNT(*) FROM T2)

10 SQL Implementation: Q1
Based on the formal predicate-calculus definition, modified to fit SQL:
The universal quantifier for-all x (f(x)) is replaced by not exists x (not f(x)).
The implication X ⇒ Y is replaced by (not(X) or Y).
    T1[A, B]/T2[B] = { t1[A] | t1 ∈ T1 and not exists t2 (not(not(t2 ∈ T2) or (exists t3 (t3 ∈ T1 and (t1[A] = t3[A]) and (t2[B] = t3[B]))))) }
Byzantine approach.

11 SQL Implementation: Q1
The previous definition is equivalent (by De Morgan's law) to:
    T1[A, B]/T2[B] = { t1[A] | t1 ∈ T1 and not exists t2 ((t2 ∈ T2) and (not exists t3 (t3 ∈ T1 and (t1[A] = t3[A]) and (t2[B] = t3[B])))) }

12 SQL Implementation: Q1
    SELECT DISTINCT x.A
    FROM T1 AS x
    WHERE NOT EXISTS
        (SELECT * FROM T2 y
         WHERE NOT EXISTS
            (SELECT * FROM T1 AS z
             WHERE (z.A=x.A) AND (z.B=y.B)))
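Q1 can be checked on a small data set. The sketch below runs it in an in-memory SQLite database through Python's sqlite3 module; SQLite is an assumption standing in for the SQL Server setup used in the course, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A TEXT, B TEXT);
    CREATE TABLE T2 (B TEXT);
    INSERT INTO T1 VALUES
        ('a1', 'b1'), ('a1', 'b2'),
        ('a2', 'b1'),
        ('a3', 'b1'), ('a3', 'b2'), ('a3', 'b3');
    INSERT INTO T2 VALUES ('b1'), ('b2');
""")

# Q1: keep x.A if there is no divisor row y that lacks a matching (x.A, y.B) in T1.
Q1 = """
SELECT DISTINCT x.A
FROM T1 AS x
WHERE NOT EXISTS (
    SELECT * FROM T2 AS y
    WHERE NOT EXISTS (
        SELECT * FROM T1 AS z
        WHERE z.A = x.A AND z.B = y.B))
"""

q1_result = sorted(row[0] for row in conn.execute(Q1))
print(q1_result)  # ['a1', 'a3']
```

Here `a2` is rejected because the divisor row `b2` has no matching (a2, b2) row in T1.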

13 SQL Implementation: Q2
Based on the algebraic definition of the division operator and broken into two steps:
    SELECT DISTINCT y.A, z.B INTO T3
    FROM T1 AS y, T2 AS z
    WHERE NOT EXISTS
        (SELECT * FROM T1 WHERE (T1.A = y.A) AND (T1.B=z.B))

    SELECT DISTINCT A
    FROM T1
    WHERE NOT EXISTS
        (SELECT * FROM T3 WHERE (T3.A=T1.A))

14 SQL Implementation: Q3
Similar to Q0, with GROUP BY and HAVING replaced by a join:
    SELECT DISTINCT x.A
    FROM T1 AS x
    WHERE (SELECT COUNT(*) FROM T2) =
          (SELECT COUNT(*) FROM T1, T2
           WHERE (T1.A=x.A) AND (T1.B=T2.B))

15 Zero Division
The divide operator is defined in such a way that T1[A, B]/T2[B] produces exactly all A values in T1 each time that T2[B] is either empty or has a zero selectivity with respect to T1[A, B]. An empty set would be a more appropriate answer; this is how Q0 works.
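The empty-divisor case can be observed by running the queries against an empty T2. The sketch below uses an in-memory SQLite database via Python's sqlite3 module (an assumption; the course used SQL Server) with hypothetical sample rows, and covers Q0, Q1 and Q3 only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (A TEXT, B TEXT);
    CREATE TABLE T2 (B TEXT);  -- the divisor is left empty on purpose
    INSERT INTO T1 VALUES ('a1', 'b1'), ('a1', 'b2'), ('a2', 'b1');
""")

Q0 = """
SELECT A FROM T1 WHERE B IN (SELECT B FROM T2)
GROUP BY A HAVING COUNT(*) = (SELECT COUNT(*) FROM T2)
"""
Q1 = """
SELECT DISTINCT x.A FROM T1 AS x WHERE NOT EXISTS
  (SELECT * FROM T2 AS y WHERE NOT EXISTS
    (SELECT * FROM T1 AS z WHERE z.A = x.A AND z.B = y.B))
"""
Q3 = """
SELECT DISTINCT x.A FROM T1 AS x
WHERE (SELECT COUNT(*) FROM T2) =
      (SELECT COUNT(*) FROM T1, T2 WHERE T1.A = x.A AND T1.B = T2.B)
"""


def run(sql):
    """Execute a query and return the sorted list of A values."""
    return sorted(row[0] for row in conn.execute(sql))


empty_divisor = {name: run(sql) for name, sql in (("Q0", Q0), ("Q1", Q1), ("Q3", Q3))}
print(empty_divisor)
# {'Q0': [], 'Q1': ['a1', 'a2'], 'Q3': ['a1', 'a2']}
```

With an empty divisor, Q1 and Q3 return every A value, while Q0 returns nothing because its WHERE B IN (...) filter removes all rows before grouping.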

16 Experiment
Conduct an experiment with the following settings: number of A-values in T1 = , number of B-values in T1 = 100, number of B-values in T2 = 20, 40, 60, 80, 100.
Create an appropriate script that generates the data samples, implement the four queries (Q0…Q3), and test their performance (execution time). Collect the observations in tabular and graphical form and describe the results.
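One possible shape for such an experiment is sketched below in Python with an in-memory SQLite database. Several things here are assumptions rather than part of the assignment: SQLite replaces SQL Server, the tables get primary keys so the nested queries stay fast, the number of A-values (not preserved in the transcript) is a free parameter `n_a`, and Q2 is left out because SQLite has no SELECT ... INTO.

```python
import sqlite3
import time

QUERIES = {
    "Q0": """SELECT A FROM T1 WHERE B IN (SELECT B FROM T2)
             GROUP BY A HAVING COUNT(*) = (SELECT COUNT(*) FROM T2)""",
    "Q1": """SELECT DISTINCT x.A FROM T1 AS x WHERE NOT EXISTS
             (SELECT * FROM T2 AS y WHERE NOT EXISTS
              (SELECT * FROM T1 AS z WHERE z.A = x.A AND z.B = y.B))""",
    "Q3": """SELECT DISTINCT x.A FROM T1 AS x
             WHERE (SELECT COUNT(*) FROM T2) =
                   (SELECT COUNT(*) FROM T1, T2
                    WHERE T1.A = x.A AND T1.B = T2.B)""",
}


def build_tables(conn, n_a, n_b, n_t2):
    """Recreate T1 with n_a * n_b rows and T2 with n_t2 rows."""
    conn.executescript("""
        DROP TABLE IF EXISTS T1;
        DROP TABLE IF EXISTS T2;
        CREATE TABLE T1 (A TEXT, B TEXT, PRIMARY KEY (A, B));
        CREATE TABLE T2 (B TEXT PRIMARY KEY);
    """)
    conn.executemany(
        "INSERT INTO T1 VALUES (?, ?)",
        ((f"a{i}", f"b{j}") for i in range(1, n_a + 1) for j in range(1, n_b + 1)),
    )
    conn.executemany(
        "INSERT INTO T2 VALUES (?)", ((f"b{j}",) for j in range(1, n_t2 + 1))
    )
    conn.commit()


def run_experiment(n_a, n_b=100, t2_sizes=(20, 40, 60, 80, 100)):
    """Return (t2_size, query_name, result_rows, elapsed_seconds) tuples."""
    conn = sqlite3.connect(":memory:")
    rows = []
    for n_t2 in t2_sizes:
        build_tables(conn, n_a, n_b, n_t2)
        for name, sql in QUERIES.items():
            start = time.perf_counter()
            result = conn.execute(sql).fetchall()
            elapsed = time.perf_counter() - start
            rows.append((n_t2, name, len(result), elapsed))
    conn.close()
    return rows


if __name__ == "__main__":
    # n_a=20 is an arbitrary illustrative value, not the assignment's setting.
    for n_t2, name, count, elapsed in run_experiment(n_a=20):
        print(f"|T2|={n_t2:3d}  {name}  rows={count:3d}  {elapsed * 1000:8.3f} ms")
```

Because every A value is paired with all B values, each query should return `n_a` rows, which gives a quick sanity check before comparing timings.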

17 Checking Execution Time
Turn SET STATISTICS TIME on (Tools → Options).

18 Data Set Generation (1)
    -- The identifiers were lost in the transcript; fill_tables, @t2_count, @i and @j are placeholder names.
    CREATE PROCEDURE fill_tables @t2_count int AS
    BEGIN
      BEGIN TRY
        BEGIN TRANSACTION
        DELETE FROM T1
        DELETE FROM T2
        DECLARE @i int
        DECLARE @j int
        SET @i = 1
        WHILE @i <=       -- upper bound (number of A-values) is missing in the transcript
        BEGIN
          SET @j = 1
          WHILE @j <= 100
          BEGIN
            INSERT INTO T1 VALUES('a' + CAST(@i AS varchar(10)), 'b' + CAST(@j AS varchar(10)))
            SET @j = @j + 1
          END
          SET @i = @i + 1
        END
A procedure that fills tables T1 and T2 with data. The procedure assumes that columns A and B in both tables are of type VARCHAR. Its only parameter is the number of records in table T2. This version uses explicit transaction handling (BEGIN ... COMMIT ... ROLLBACK TRANSACTION) together with exception handling (TRY ... CATCH ...). This is not necessary for the procedure to work correctly, but it improves its efficiency.

19 Data Set Generation (2)
    -- The identifiers were lost in the transcript; @j and @t2_count are placeholder names.
        SET @j = 1
        WHILE @j <= @t2_count
        BEGIN
          INSERT INTO T2 VALUES('b' + CAST(@j AS varchar(10)))
          SET @j = @j + 1
        END
        COMMIT TRANSACTION
      END TRY
      BEGIN CATCH
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION
      END CATCH
      SET STATISTICS TIME ON
      SET STATISTICS IO ON
    END

20 Improving the Procedure's Efficiency
Following the discussion in the last class about whether using BEGIN ... COMMIT TRANSACTION makes sense, I checked its effect on the procedure's efficiency (execution time). I tested two variants of the procedure: with BEGIN ... COMMIT TRANSACTION (variant 1) and without it (variant 2). I ran each variant 10 times and recorded the execution times. The results (mean and standard deviation, in seconds) are given in the table below.
               Variant 1                  Variant 2
               CPU TIME   ELAPSED TIME    CPU TIME   ELAPSED TIME
    AVERAGE
    STDEV
Based on these results, it seems worthwhile to use BEGIN ... COMMIT TRANSACTION: the total execution time of the procedure (ELAPSED TIME) is reduced 10-fold.

