|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Where Not Existsthe meaning of sub-selects inside a WHERE NOT EXISTS clause. My brain just can't visualize it. SELECT Stuff FROM TableA WHERE NOT EXISTS ( SELECT OtherStuff FROM TableB ) I am gasping to find a rephrasing of a general form that uses some kind of OUTER JOIN WHERE xxx IS NULL. THAT makes sense to me. All these upside down where not exists (select big long huge statement) queries we do here are just unreadable to me. Also, is there a performance difference between the above SQL statement and the rephrasing of it that uses outer joins? Or is it statement specific? -- Peace & happy computing, Mike Labosh, MCSD "Mr. McKittrick, after very careful consideration, I have come to the conclusion that this new system SUCKS." -- General Barringer, "War Games" SELECT A.k, A.x
FROM A WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.k = a.k) is logically equivalent to SELECT A.k, A.x FROM A WHERE A.k IN (SELECT k FROM B) is logically equivalent to SELECT DISTINCT A.k, A.x FROM A LEFT OUTER JOIN B ON (B.k = A.k) WHERE B.k IS NULL is logically equivalent to SELECT A.k, A.x FROM A LEFT OUTER JOIN B ON (B.k = A.k) WHERE B.k IS NULL GROUP BY A.k, A.x Note that the key value from A exists in each query. That's because the relationship between A and B could be 1..*, so with a left join you could get several identical rows. As far as performance, the query optimizer may produce different execution plans. The first is a correlated subquery and may produce an iteration. The second is just a subquery, and will probably produce a hash or merge join, but may also produce an iteration. The third and fourth are left outer joins, and will probably produce a hash or merge join, and most probably the same execution plan. Depending on available indexes the query optimizer may produce the same plan for each of the above. Your best bet is to try them with real data and find out. Show quote "Mike Labosh" <mlab***@hotmail.com> wrote in message news:e5zPsD1gFHA.2560@TK2MSFTNGP10.phx.gbl... > Maybe it's a personality flaw, but I have always had trouble figuring out > the meaning of sub-selects inside a WHERE NOT EXISTS clause. My brain just > can't visualize it. > > SELECT Stuff > FROM TableA > WHERE NOT EXISTS ( > SELECT OtherStuff FROM TableB > ) > > I am gasping to find a rephrasing of a general form that uses some kind of > OUTER JOIN WHERE xxx IS NULL. THAT makes sense to me. All these upside > down where not exists (select big long huge statement) queries we do here > are just unreadable to me. > > Also, is there a performance difference between the above SQL statement and > the rephrasing of it that uses outer joins? Or is it statement specific? > > -- > Peace & happy computing, > > Mike Labosh, MCSD > > "Mr. McKittrick, after very careful consideration, I have > come to the conclusion that this new system SUCKS." > -- General Barringer, "War Games" > > Brian,
I have to correct this. If one uses an OUTER JOIN to determine the non-related rows (the Outer Join variant of Not Exists), then a DISTINCT or GROUP BY is not necessary. In fact, it will give incorrect results if there are any duplicates in the outer table. This is demonstrated in the example below. CREATE TABLE #t1 (id int) CREATE TABLE #t2 (id int) INSERT INTO #t1 VALUES (1) INSERT INTO #t1 VALUES (1) INSERT INTO #t1 VALUES (2) -- incorrect result SELECT DISTINCT T1.id FROM #t1 T1 LEFT JOIN #t2 T2 ON T2.id=T1.id WHERE T2.id IS NULL -- incorrect result SELECT T1.id FROM #t1 T1 LEFT JOIN #t2 T2 ON T2.id=T1.id WHERE T2.id IS NULL GROUP BY T1.id -- correct result SELECT T1.id FROM #t1 T1 LEFT JOIN #t2 T2 ON T2.id=T1.id WHERE T2.id IS NULL DROP TABLE #t1 DROP TABLE #t2 Gert-Jan Brian Selzer wrote: Show quote > > SELECT A.k, A.x > FROM A > WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.k = a.k) > > is logically equivalent to > > SELECT A.k, A.x > FROM A > WHERE A.k IN (SELECT k FROM B) > > is logically equivalent to > > SELECT DISTINCT A.k, A.x > FROM A > LEFT OUTER JOIN B > ON (B.k = A.k) > WHERE B.k IS NULL > > is logically equivalent to > > SELECT A.k, A.x > FROM A > LEFT OUTER JOIN B > ON (B.k = A.k) > WHERE B.k IS NULL > GROUP BY A.k, A.x > > Note that the key value from A exists in each query. That's because the > relationship between A and B could be 1..*, so with a left join you could > get several identical rows. > > As far as performance, the query optimizer may produce different execution > plans. The first is a correlated subquery and may produce an iteration. > The second is just a subquery, and will probably produce a hash or merge > join, but may also produce an iteration. The third and fourth are left > outer joins, and will probably produce a hash or merge join, and most > probably the same execution plan. Depending on available indexes the query > optimizer may produce the same plan for each of the above. Your best bet is > to try them with real data and find out. > > "Mike Labosh" <mlab***@hotmail.com> wrote in message > news:e5zPsD1gFHA.2560@TK2MSFTNGP10.phx.gbl... > > Maybe it's a personality flaw, but I have always had trouble figuring out > > the meaning of sub-selects inside a WHERE NOT EXISTS clause. My brain > just > > can't visualize it. > > > > SELECT Stuff > > FROM TableA > > WHERE NOT EXISTS ( > > SELECT OtherStuff FROM TableB > > ) > > > > I am gasping to find a rephrasing of a general form that uses some kind of > > OUTER JOIN WHERE xxx IS NULL. THAT makes sense to me. All these upside > > down where not exists (select big long huge statement) queries we do here > > are just unreadable to me. > > > > Also, is there a performance difference between the above SQL statement > and > > the rephrasing of it that uses outer joins? Or is it statement specific? > > > > -- > > Peace & happy computing, > > > > Mike Labosh, MCSD > > > > "Mr. McKittrick, after very careful consideration, I have > > come to the conclusion that this new system SUCKS." > > -- General Barringer, "War Games" > > > > >> Maybe it's a Personality Flaw, but .. << Nah! you have a wrong mental model! Just read it as bad English orgood Greek or formal logic. This is a language problem. "There ain't nobody that can do this!" is valid but confusing to English speakers. >> All these upside down where not exists (select big long huge statement) queries we do here are just unreadable to me. << Sorry, but that is how us math geeks think. Have you bought a basiclogic book yet? I thought that my 8-th book might be "logic for programmers" ... if there is an demand. >> is there a performance difference between the above SQL statement and the rephrasing of it that uses outer joins? << Depends on the SQL engine. DB2 likes EXISTS() predicates and Ingresoptimizes so it does n ot matter, etc. |
|||||||||||||||||||||||