|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
How to use "Not Exists" ?I need to pull a subset of records from a table. Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) This works fine and returns the correct number of rows. But then I try this: Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1, t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) and get an error message stating that I need to use Not Exists when using mutiple columns in the subquery. Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) Now I don't get any rows - same data - same conditions (I think?) What are the rules for using Not Exists? Thanks, Rich Here is some sample data (hope I clarify my problem)
CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2 varchar(10)) INSERT INTO #temp1 SELECT 1, 10, 'aa1', 'aa2' union SELECT 2, 20, 'bb1', 'bb2' union SELECT 3, 30, 'cc1', 'cc2' union SELECT 4, 40, 'dd1', 'dd2' union SELECT 5, 50, 'ee1', 'ee2' union SELECT 6, 60, 'ff1', 'ff2' union SELECT 7, 70, 'gg1', 'gg2' union SELECT 8, 80, 'hh1', 'hh2' union SELECT 9, 90, 'ii1', 'ii2' union SELECT 10, 100, 'jj1', 'jj2' SELECT * FROM #temp1 CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10)) INSERT INTO #temp2 SELECT 10, 'aa1', 'aa2' union SELECT 20, 'bb1', 'bb2' union SELECT 30, 'cc1', 'cc2' union SELECT 40, 'dd1', 'dd2' union SELECT 50, 'ee1', 'ee2' select * from #temp1 where otherID Not In (select otherID from #temp2) this returns all the rows in #temp1 where recID > 50 select * from #temp1 where NOT exists (select otherID from #temp2) This does not return any rows. When do I want to use Not Exists? Show quote "Rich" wrote: > Hello, > > I need to pull a subset of records from a table. > > Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1 > join tbl2 t2 on t1.OtherID = t2.OtherID) > > This works fine and returns the correct number of rows. But then I try this: > > Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1, > t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) > > and get an error message stating that I need to use Not Exists when using > mutiple columns in the subquery. > > Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2 > from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) > > Now I don't get any rows - same data - same conditions (I think?) What are > the rules for using Not Exists? > > Thanks, > Rich > SELECT *
FROM #temp1 WHERE OtherID NOT IN (select OtherID from #temp2) recID otherID col1 col2 ----------- ----------- ---------- ---------- 6 60 ff1 ff2 7 70 gg1 gg2 8 80 hh1 hh2 9 90 ii1 ii2 10 100 jj1 jj2 SELECT * FROM #temp1 WHERE NOT EXISTS (select * from #temp2 where #temp1.OtherID = #temp2.OtherID) recID otherID col1 col2 ----------- ----------- ---------- ---------- 6 60 ff1 ff2 7 70 gg1 gg2 8 80 hh1 hh2 9 90 ii1 ii2 10 100 jj1 jj2 Roy Harvey Beacon Falls, CT On Wed, 2 Aug 2006 09:11:02 -0700, Rich <R***@discussions.microsoft.com> wrote: Show quote >Here is some sample data (hope I clarify my problem) > >CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2 >varchar(10)) >INSERT INTO #temp1 >SELECT 1, 10, 'aa1', 'aa2' union >SELECT 2, 20, 'bb1', 'bb2' union >SELECT 3, 30, 'cc1', 'cc2' union >SELECT 4, 40, 'dd1', 'dd2' union >SELECT 5, 50, 'ee1', 'ee2' union >SELECT 6, 60, 'ff1', 'ff2' union >SELECT 7, 70, 'gg1', 'gg2' union >SELECT 8, 80, 'hh1', 'hh2' union >SELECT 9, 90, 'ii1', 'ii2' union >SELECT 10, 100, 'jj1', 'jj2' > >SELECT * FROM #temp1 > >CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10)) >INSERT INTO #temp2 >SELECT 10, 'aa1', 'aa2' union >SELECT 20, 'bb1', 'bb2' union >SELECT 30, 'cc1', 'cc2' union >SELECT 40, 'dd1', 'dd2' union >SELECT 50, 'ee1', 'ee2' > > >select * from #temp1 where otherID Not In (select otherID from #temp2) > >this returns all the rows in #temp1 where recID > 50 > > >select * from #temp1 where NOT exists (select otherID from #temp2) > >This does not return any rows. > >When do I want to use Not Exists? > > >"Rich" wrote: > >> Hello, >> >> I need to pull a subset of records from a table. >> >> Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1 >> join tbl2 t2 on t1.OtherID = t2.OtherID) >> >> This works fine and returns the correct number of rows. But then I try this: >> >> Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1, >> t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) >> >> and get an error message stating that I need to use Not Exists when using >> mutiple columns in the subquery. >> >> Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2 >> from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) >> >> Now I don't get any rows - same data - same conditions (I think?) What are >> the rules for using Not Exists? >> >> Thanks, >> Rich >> In has also a 'side effect' with NULLS take a look at this
CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2 varchar(10)) INSERT INTO #temp1 SELECT 1, 10, 'aa1', 'aa2' union SELECT 2, 20, 'bb1', 'bb2' union SELECT 3, 30, 'cc1', 'cc2' union SELECT 4, 40, 'dd1', 'dd2' union SELECT 5, 50, 'ee1', 'ee2' union SELECT 6, 60, 'ff1', 'ff2' union SELECT 7, 70, 'gg1', 'gg2' union SELECT 8, 80, 'hh1', 'hh2' union SELECT 9, 90, 'ii1', 'ii2' union SELECT 10, 100, 'jj1', 'jj2' CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10)) INSERT INTO #temp2 SELECT 10, 'aa1', 'aa2' union SELECT 20, 'bb1', 'bb2' union SELECT 30, 'cc1', 'cc2' union SELECT 40, 'dd1', 'dd2' union SELECT 50, 'ee1', 'ee2' union SELECT NULL, 'ee1', 'ee2' -- added this --won't work because of the NULL value SELECT 'query1',* FROM #temp1 WHERE OtherID NOT IN (select OtherID from #temp2) --this works now SELECT 'query2',* FROM #temp1 WHERE OtherID NOT IN (select OtherID from #temp2 WHERE OtherID IS NOT NULL ) --Exists is always safe SELECT 'query3',* FROM #temp1 WHERE NOT EXISTS (select * from #temp2 where #temp1.OtherID = #temp2.OtherID) just keep that in mind Denis the SQL Menace http://sqlservercode.blogspot.com/ Roy Harvey wrote: Show quote > SELECT * > FROM #temp1 > WHERE OtherID NOT IN > (select OtherID from #temp2) > > recID otherID col1 col2 > ----------- ----------- ---------- ---------- > 6 60 ff1 ff2 > 7 70 gg1 gg2 > 8 80 hh1 hh2 > 9 90 ii1 ii2 > 10 100 jj1 jj2 > > SELECT * > FROM #temp1 > WHERE NOT EXISTS > (select * from #temp2 > where #temp1.OtherID = #temp2.OtherID) > > recID otherID col1 col2 > ----------- ----------- ---------- ---------- > 6 60 ff1 ff2 > 7 70 gg1 gg2 > 8 80 hh1 hh2 > 9 90 ii1 ii2 > 10 100 jj1 jj2 > > Roy Harvey > Beacon Falls, CT > > On Wed, 2 Aug 2006 09:11:02 -0700, Rich > <R***@discussions.microsoft.com> wrote: > > >Here is some sample data (hope I clarify my problem) > > > >CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2 > >varchar(10)) > >INSERT INTO #temp1 > >SELECT 1, 10, 'aa1', 'aa2' union > >SELECT 2, 20, 'bb1', 'bb2' union > >SELECT 3, 30, 'cc1', 'cc2' union > >SELECT 4, 40, 'dd1', 'dd2' union > >SELECT 5, 50, 'ee1', 'ee2' union > >SELECT 6, 60, 'ff1', 'ff2' union > >SELECT 7, 70, 'gg1', 'gg2' union > >SELECT 8, 80, 'hh1', 'hh2' union > >SELECT 9, 90, 'ii1', 'ii2' union > >SELECT 10, 100, 'jj1', 'jj2' > > > >SELECT * FROM #temp1 > > > >CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10)) > >INSERT INTO #temp2 > >SELECT 10, 'aa1', 'aa2' union > >SELECT 20, 'bb1', 'bb2' union > >SELECT 30, 'cc1', 'cc2' union > >SELECT 40, 'dd1', 'dd2' union > >SELECT 50, 'ee1', 'ee2' > > > > > >select * from #temp1 where otherID Not In (select otherID from #temp2) > > > >this returns all the rows in #temp1 where recID > 50 > > > > > >select * from #temp1 where NOT exists (select otherID from #temp2) > > > >This does not return any rows. > > > >When do I want to use Not Exists? > > > > > >"Rich" wrote: > > > >> Hello, > >> > >> I need to pull a subset of records from a table. > >> > >> Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1 > >> join tbl2 t2 on t1.OtherID = t2.OtherID) > >> > >> This works fine and returns the correct number of rows. But then I try this: > >> > >> Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1, > >> t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) > >> > >> and get an error message stating that I need to use Not Exists when using > >> mutiple columns in the subquery. > >> > >> Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2 > >> from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) > >> > >> Now I don't get any rows - same data - same conditions (I think?) What are > >> the rules for using Not Exists? > >> > >> Thanks, > >> Rich > >> Point taken. Thanks very much for this insight. Now I am real glad that I
learned how to use Not Exists. Rich Show quote "SQL Menace" wrote: > In has also a 'side effect' with NULLS take a look at this > > CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2 > varchar(10)) > INSERT INTO #temp1 > SELECT 1, 10, 'aa1', 'aa2' union > SELECT 2, 20, 'bb1', 'bb2' union > SELECT 3, 30, 'cc1', 'cc2' union > SELECT 4, 40, 'dd1', 'dd2' union > SELECT 5, 50, 'ee1', 'ee2' union > SELECT 6, 60, 'ff1', 'ff2' union > SELECT 7, 70, 'gg1', 'gg2' union > SELECT 8, 80, 'hh1', 'hh2' union > SELECT 9, 90, 'ii1', 'ii2' union > SELECT 10, 100, 'jj1', 'jj2' > > > CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10)) > INSERT INTO #temp2 > SELECT 10, 'aa1', 'aa2' union > SELECT 20, 'bb1', 'bb2' union > SELECT 30, 'cc1', 'cc2' union > SELECT 40, 'dd1', 'dd2' union > SELECT 50, 'ee1', 'ee2' union > SELECT NULL, 'ee1', 'ee2' -- added this > > > --won't work because of the NULL value > SELECT 'query1',* > FROM #temp1 > WHERE OtherID NOT IN > (select OtherID from #temp2) > > --this works now > SELECT 'query2',* > FROM #temp1 > WHERE OtherID NOT IN > (select OtherID from #temp2 WHERE OtherID IS NOT NULL ) > > --Exists is always safe > SELECT 'query3',* > FROM #temp1 > WHERE NOT EXISTS > (select * from #temp2 > where #temp1.OtherID = #temp2.OtherID) > > just keep that in mind > > Denis the SQL Menace > http://sqlservercode.blogspot.com/ > > Roy Harvey wrote: > > SELECT * > > FROM #temp1 > > WHERE OtherID NOT IN > > (select OtherID from #temp2) > > > > recID otherID col1 col2 > > ----------- ----------- ---------- ---------- > > 6 60 ff1 ff2 > > 7 70 gg1 gg2 > > 8 80 hh1 hh2 > > 9 90 ii1 ii2 > > 10 100 jj1 jj2 > > > > SELECT * > > FROM #temp1 > > WHERE NOT EXISTS > > (select * from #temp2 > > where #temp1.OtherID = #temp2.OtherID) > > > > recID otherID col1 col2 > > ----------- ----------- ---------- ---------- > > 6 60 ff1 ff2 > > 7 70 gg1 gg2 > > 8 80 hh1 hh2 > > 9 90 ii1 ii2 > > 10 100 jj1 jj2 > > > > Roy Harvey > > Beacon Falls, CT > > > > On Wed, 2 Aug 2006 09:11:02 -0700, Rich > > <R***@discussions.microsoft.com> wrote: > > > > >Here is some sample data (hope I clarify my problem) > > > > > >CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2 > > >varchar(10)) > > >INSERT INTO #temp1 > > >SELECT 1, 10, 'aa1', 'aa2' union > > >SELECT 2, 20, 'bb1', 'bb2' union > > >SELECT 3, 30, 'cc1', 'cc2' union > > >SELECT 4, 40, 'dd1', 'dd2' union > > >SELECT 5, 50, 'ee1', 'ee2' union > > >SELECT 6, 60, 'ff1', 'ff2' union > > >SELECT 7, 70, 'gg1', 'gg2' union > > >SELECT 8, 80, 'hh1', 'hh2' union > > >SELECT 9, 90, 'ii1', 'ii2' union > > >SELECT 10, 100, 'jj1', 'jj2' > > > > > >SELECT * FROM #temp1 > > > > > >CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10)) > > >INSERT INTO #temp2 > > >SELECT 10, 'aa1', 'aa2' union > > >SELECT 20, 'bb1', 'bb2' union > > >SELECT 30, 'cc1', 'cc2' union > > >SELECT 40, 'dd1', 'dd2' union > > >SELECT 50, 'ee1', 'ee2' > > > > > > > > >select * from #temp1 where otherID Not In (select otherID from #temp2) > > > > > >this returns all the rows in #temp1 where recID > 50 > > > > > > > > >select * from #temp1 where NOT exists (select otherID from #temp2) > > > > > >This does not return any rows. > > > > > >When do I want to use Not Exists? > > > > > > > > >"Rich" wrote: > > > > > >> Hello, > > >> > > >> I need to pull a subset of records from a table. > > >> > > >> Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1 > > >> join tbl2 t2 on t1.OtherID = t2.OtherID) > > >> > > >> This works fine and returns the correct number of rows. But then I try this: > > >> > > >> Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1, > > >> t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) > > >> > > >> and get an error message stating that I need to use Not Exists when using > > >> mutiple columns in the subquery. > > >> > > >> Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2 > > >> from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) > > >> > > >> Now I don't get any rows - same data - same conditions (I think?) What are > > >> the rules for using Not Exists? > > >> > > >> Thanks, > > >> Rich > > >> > > On Wed, 2 Aug 2006 10:29:03 -0700, Rich
<R***@discussions.microsoft.com> wrote: >Point taken. Thanks very much for this insight. Now I am real glad that I The other situation where NOT EXISTS is required is when there is a>learned how to use Not Exists. > >Rich multi-column join. IN only works with single columns (at least in Transact-SQL, the dialect of SQL used by SQL Server). An EXISTS subquery can join on as many columns as required. Roy On Wed, 2 Aug 2006 08:39:02 -0700, Rich
<R***@discussions.microsoft.com> wrote: >Hello, First, as written there is no need for tbl1 to be specified in the> >I need to pull a subset of records from a table. > >Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1 >join tbl2 t2 on t1.OtherID = t2.OtherID) subquery, and the NOT IN can simply be on OtherID. SELECT * FROM tbl1 WHERE OtherID NOT IN (select OtherID from tbl2) >This works fine and returns the correct number of rows. But then I try this: It makes no sense to include multiple columns in an IN subquery, as> >Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1, >t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) the results are only used to compare to the one column specified before the IN. >and get an error message stating that I need to use Not Exists when using SELECT *>mutiple columns in the subquery. > >Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2 >from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) > >Now I don't get any rows - same data - same conditions (I think?) What are >the rules for using Not Exists? FROM tbl1 WHERE NOT EXISTS (select * from tbl2 where tbl1.OtherID = tbl2.OtherID) The rule for using NOT EXISTS is that the subquery is a correlated subquery. Note that tbl1 only appears in the outer FROM clause, but it is referenced in the inner WHERE clause. That is what makes it correlated. >Thanks, Roy Harvey>Rich Beacon Falls, CT Hi Roy,
Thanks very much for your replies and explanation. SELECT * FROM tbl1 WHERE NOT EXISTS (select * from tbl2 where tbl1.OtherID = tbl2.OtherID) >>select * from tbl2 where tbl1.OtherID = tbl2.OtherID<<OK. Now I think I get it - when using Exists/Not Exists tbl1 (the outer table) needs to be referenced in a WHERE clause in the subquery. Got it. Thanks, Rich Show quote "Roy Harvey" wrote: > On Wed, 2 Aug 2006 08:39:02 -0700, Rich > <R***@discussions.microsoft.com> wrote: > > >Hello, > > > >I need to pull a subset of records from a table. > > > >Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1 > >join tbl2 t2 on t1.OtherID = t2.OtherID) > > First, as written there is no need for tbl1 to be specified in the > subquery, and the NOT IN can simply be on OtherID. > > SELECT * > FROM tbl1 > WHERE OtherID NOT IN > (select OtherID from tbl2) > > >This works fine and returns the correct number of rows. But then I try this: > > > >Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1, > >t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) > > It makes no sense to include multiple columns in an IN subquery, as > the results are only used to compare to the one column specified > before the IN. > > >and get an error message stating that I need to use Not Exists when using > >mutiple columns in the subquery. > > > >Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2 > >from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID) > > > >Now I don't get any rows - same data - same conditions (I think?) What are > >the rules for using Not Exists? > > SELECT * > FROM tbl1 > WHERE NOT EXISTS > (select * from tbl2 > where tbl1.OtherID = tbl2.OtherID) > > The rule for using NOT EXISTS is that the subquery is a correlated > subquery. Note that tbl1 only appears in the outer FROM clause, but > it is referenced in the inner WHERE clause. That is what makes it > correlated. > > >Thanks, > >Rich > > Roy Harvey > Beacon Falls, CT > Roy Harvey <roy_har***@snet.net> wrote:
> This works, but in my experience the NOT EXISTS is a performance>SELECT * > FROM tbl1 > WHERE NOT EXISTS > (select * from tbl2 > where tbl1.OtherID = tbl2.OtherID) > no-no. You can get the same results much faster with SELECT * FROM tbl1 LEFT JOIN tbl2 on tbl2.OtherID = tbl1.OtherID WHERE tbl2.OtherID is null -John John,
Interesting, because I have frequently seen quite the opposite: NOT EXISTS working faster then an OUTER JOIN followed by a filter. Would you mind posting a repro? "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote: I don't have a real example I can post, but the difference is>John, > >Interesting, because I have frequently seen quite the opposite: NOT >EXISTS working faster then an OUTER JOIN followed by a filter. Would >you mind posting a repro? sometimes dramatic. If you look at the execution plans on the Northwind DB of SELECT c.* FROM [Northwind].[dbo].[Customers] C WHERE NOT EXISTS (SELECT 1 FROM [Northwind].[dbo].[Orders] o WHERE o.[CustomerID] = c.[CustomerID]) and SELECT c.* FROM [Northwind].[dbo].[Customers] C LEFT JOIN [Northwind].[dbo].[Orders] o on o.[CustomerID] = c.[CustomerID] WHERE o.[CustomerID] IS NULL you will see that they are about the same with the added filter on the later. While with just a few rows that looks like it takes longer, with real data, I've always found it a lot faster. -John Regarding Customers and Orders, I would say that we need to distinguish
at least 3 differerent cases: 1. Some customers have a few orders, such as 100K customers and 100K orders, 50K customers without orders. 2. Most customers have a few orders, such as 100K customers and 500K orders, 10K customers without orders. 3. Most customers have a lot of orders, just a few don't have any, such as 100K customers and 2M orders, 1K customers without orders. In the third case I have seen NOT EXISTS run somewhat (though not dramatically) faster than NOT IN. In which case did you see NOT IN outperform NOT EXISTS? Small point, but he was not saying NOT IN performed better, but rather
LEFT OUTER with a NULL test. Roy On 3 Aug 2006 08:23:43 -0700, "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote: Show quote >Regarding Customers and Orders, I would say that we need to distinguish >at least 3 differerent cases: >1. Some customers have a few orders, such as 100K customers and 100K >orders, 50K customers without orders. >2. Most customers have a few orders, such as 100K customers and 500K >orders, 10K customers without orders. >3. Most customers have a lot of orders, just a few don't have any, such >as 100K customers and 2M orders, 1K customers without orders. > >In the third case I have seen NOT EXISTS run somewhat (though not >dramatically) faster than NOT IN. >In which case did you see NOT IN outperform NOT EXISTS? Yes, Roy, that is correct. The LEFT JOIN with a NULL test is a lot
faster in our applications than a NOT IN or NOT EXISTS. While the LEFT+null seemed odd to me at first, I now think that way and never use the NOT EXISTS anymore. -John Roy Harvey <roy_har***@snet.net> wrote: Show quote >Small point, but he was not saying NOT IN performed better, but rather John Baima>LEFT OUTER with a NULL test. > >Roy > >On 3 Aug 2006 08:23:43 -0700, "Alexander Kuznetsov" ><AK_TIREDOFSPAM@hotmail.COM> wrote: > >>Regarding Customers and Orders, I would say that we need to distinguish >>at least 3 differerent cases: >>1. Some customers have a few orders, such as 100K customers and 100K >>orders, 50K customers without orders. >>2. Most customers have a few orders, such as 100K customers and 500K >>orders, 10K customers without orders. >>3. Most customers have a lot of orders, just a few don't have any, such >>as 100K customers and 2M orders, 1K customers without orders. >> >>In the third case I have seen NOT EXISTS run somewhat (though not >>dramatically) faster than NOT IN. >>In which case did you see NOT IN outperform NOT EXISTS? John,
Can you recall which kind of data were you running your query against: 1. Some customers have a few orders, such as 100K customers and 100K orders, 50K customers without orders. 2. Most customers have a few orders, such as 100K customers and 500K orders, 10K customers without orders. 3. Most customers have a lot of orders, just a few don't have any, such as 100K customers and 2M orders, 1K customers without orders. I'm trying to say that in case 3 it can be the other way around and in case 1 I did not notice any difference between NOT EXISTS, NOT IN and an OUTER JOIN at all. Alexander Kuznetsov wrote:
> Regarding Customers and Orders, I would say that we need to distinguish you are correct, my question needs to be corrected:> at least 3 differerent cases: > 1. Some customers have a few orders, such as 100K customers and 100K > orders, 50K customers without orders. > 2. Most customers have a few orders, such as 100K customers and 500K > orders, 10K customers without orders. > 3. Most customers have a lot of orders, just a few don't have any, such > as 100K customers and 2M orders, 1K customers without orders. > Roy, In the third case I have seen NOT EXISTS run somewhat (though not dramatically) faster than OUTER JOIN. In which case did you see NOT IN outperform OUTER JOIN? I would like to add that sometimes comparing 2 queries against different data gives entirely different results... |
|||||||||||||||||||||||