Home All Groups Group Topic Archive Search About

How to use "Not Exists" ?

Author
2 Aug 2006 3:39 PM
Rich
Hello,

I need to pull a subset of records from a table.

Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1
join tbl2 t2 on t1.OtherID = t2.OtherID)

This works fine and returns the correct number of rows.  But then I try this:

Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1,
t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)

and get an error message stating that I need to use Not Exists when using
mutiple columns in the subquery.

Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2
from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)

Now I don't get any rows - same data - same conditions (I think?)   What are
the rules for using Not Exists?

Thanks,
Rich

Author
2 Aug 2006 4:11 PM
Rich
Here is some sample data (hope I clarify my problem)

CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2
varchar(10))
INSERT INTO #temp1
SELECT 1, 10, 'aa1', 'aa2' union
SELECT 2, 20, 'bb1', 'bb2' union
SELECT 3, 30, 'cc1', 'cc2' union
SELECT 4, 40, 'dd1', 'dd2' union
SELECT 5, 50, 'ee1', 'ee2' union
SELECT 6, 60, 'ff1', 'ff2' union
SELECT 7, 70, 'gg1', 'gg2' union
SELECT 8, 80, 'hh1', 'hh2' union
SELECT 9, 90, 'ii1', 'ii2' union
SELECT 10, 100, 'jj1', 'jj2'

SELECT * FROM #temp1

CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10))
INSERT INTO #temp2
SELECT  10, 'aa1', 'aa2' union
SELECT  20, 'bb1', 'bb2' union
SELECT  30, 'cc1', 'cc2' union
SELECT  40, 'dd1', 'dd2' union
SELECT  50, 'ee1', 'ee2'


select * from #temp1 where otherID Not In (select otherID from #temp2)

this returns all the rows in #temp1 where recID > 50


select * from #temp1 where NOT exists (select otherID from #temp2)

This does not return any rows.

When do I want to use Not Exists?


Show quote
"Rich" wrote:

> Hello,
>
> I need to pull a subset of records from a table.
>
> Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1
> join tbl2 t2 on t1.OtherID = t2.OtherID)
>
> This works fine and returns the correct number of rows.  But then I try this:
>
> Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1,
> t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
>
> and get an error message stating that I need to use Not Exists when using
> mutiple columns in the subquery.
>
> Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2
> from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
>
> Now I don't get any rows - same data - same conditions (I think?)   What are
> the rules for using Not Exists?
>
> Thanks,
> Rich
>
Author
2 Aug 2006 4:18 PM
Roy Harvey
SELECT *
  FROM #temp1
WHERE OtherID NOT IN
       (select OtherID from #temp2)

recID       otherID     col1       col2      
----------- ----------- ---------- ----------
          6          60 ff1        ff2
          7          70 gg1        gg2
          8          80 hh1        hh2
          9          90 ii1        ii2
         10         100 jj1        jj2

SELECT *
  FROM #temp1
WHERE NOT EXISTS
       (select * from #temp2
         where #temp1.OtherID = #temp2.OtherID)

recID       otherID     col1       col2      
----------- ----------- ---------- ----------
          6          60 ff1        ff2
          7          70 gg1        gg2
          8          80 hh1        hh2
          9          90 ii1        ii2
         10         100 jj1        jj2

Roy Harvey
Beacon Falls, CT

On Wed, 2 Aug 2006 09:11:02 -0700, Rich
<R***@discussions.microsoft.com> wrote:

Show quote
>Here is some sample data (hope I clarify my problem)
>
>CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2
>varchar(10))
>INSERT INTO #temp1
>SELECT 1, 10, 'aa1', 'aa2' union
>SELECT 2, 20, 'bb1', 'bb2' union
>SELECT 3, 30, 'cc1', 'cc2' union
>SELECT 4, 40, 'dd1', 'dd2' union
>SELECT 5, 50, 'ee1', 'ee2' union
>SELECT 6, 60, 'ff1', 'ff2' union
>SELECT 7, 70, 'gg1', 'gg2' union
>SELECT 8, 80, 'hh1', 'hh2' union
>SELECT 9, 90, 'ii1', 'ii2' union
>SELECT 10, 100, 'jj1', 'jj2'
>
>SELECT * FROM #temp1
>
>CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10))
>INSERT INTO #temp2
>SELECT  10, 'aa1', 'aa2' union
>SELECT  20, 'bb1', 'bb2' union
>SELECT  30, 'cc1', 'cc2' union
>SELECT  40, 'dd1', 'dd2' union
>SELECT  50, 'ee1', 'ee2'
>
>
>select * from #temp1 where otherID Not In (select otherID from #temp2)
>
>this returns all the rows in #temp1 where recID > 50
>
>
>select * from #temp1 where NOT exists (select otherID from #temp2)
>
>This does not return any rows.
>
>When do I want to use Not Exists?
>
>
>"Rich" wrote:
>
>> Hello,
>>
>> I need to pull a subset of records from a table.
>>
>> Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1
>> join tbl2 t2 on t1.OtherID = t2.OtherID)
>>
>> This works fine and returns the correct number of rows.  But then I try this:
>>
>> Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1,
>> t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
>>
>> and get an error message stating that I need to use Not Exists when using
>> mutiple columns in the subquery.
>>
>> Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2
>> from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
>>
>> Now I don't get any rows - same data - same conditions (I think?)   What are
>> the rules for using Not Exists?
>>
>> Thanks,
>> Rich
>>
Author
2 Aug 2006 4:49 PM
SQL Menace
In has also a 'side effect' with NULLS take a look at this

CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2
varchar(10))
INSERT INTO #temp1
SELECT 1, 10, 'aa1', 'aa2' union
SELECT 2, 20, 'bb1', 'bb2' union
SELECT 3, 30, 'cc1', 'cc2' union
SELECT 4, 40, 'dd1', 'dd2' union
SELECT 5, 50, 'ee1', 'ee2' union
SELECT 6, 60, 'ff1', 'ff2' union
SELECT 7, 70, 'gg1', 'gg2' union
SELECT 8, 80, 'hh1', 'hh2' union
SELECT 9, 90, 'ii1', 'ii2' union
SELECT 10, 100, 'jj1', 'jj2'


CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10))
INSERT INTO #temp2
SELECT  10, 'aa1', 'aa2' union
SELECT  20, 'bb1', 'bb2' union
SELECT  30, 'cc1', 'cc2' union
SELECT  40, 'dd1', 'dd2' union
SELECT  50, 'ee1', 'ee2' union
SELECT  NULL, 'ee1', 'ee2'  -- added this


--won't work because of the NULL value
SELECT 'query1',*
  FROM #temp1
WHERE OtherID NOT IN
       (select OtherID from #temp2)

--this works now
SELECT 'query2',*
  FROM #temp1
WHERE OtherID NOT IN
       (select OtherID from #temp2 WHERE OtherID IS NOT NULL )

--Exists is always safe
SELECT 'query3',*
  FROM #temp1
WHERE NOT EXISTS
       (select * from #temp2
         where #temp1.OtherID = #temp2.OtherID)

just keep that in mind

Denis the SQL Menace
http://sqlservercode.blogspot.com/

Roy Harvey wrote:
Show quote
> SELECT *
>   FROM #temp1
>  WHERE OtherID NOT IN
>        (select OtherID from #temp2)
>
> recID       otherID     col1       col2
> ----------- ----------- ---------- ----------
>           6          60 ff1        ff2
>           7          70 gg1        gg2
>           8          80 hh1        hh2
>           9          90 ii1        ii2
>          10         100 jj1        jj2
>
> SELECT *
>   FROM #temp1
>  WHERE NOT EXISTS
>        (select * from #temp2
>          where #temp1.OtherID = #temp2.OtherID)
>
> recID       otherID     col1       col2
> ----------- ----------- ---------- ----------
>           6          60 ff1        ff2
>           7          70 gg1        gg2
>           8          80 hh1        hh2
>           9          90 ii1        ii2
>          10         100 jj1        jj2
>
> Roy Harvey
> Beacon Falls, CT
>
> On Wed, 2 Aug 2006 09:11:02 -0700, Rich
> <R***@discussions.microsoft.com> wrote:
>
> >Here is some sample data (hope I clarify my problem)
> >
> >CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2
> >varchar(10))
> >INSERT INTO #temp1
> >SELECT 1, 10, 'aa1', 'aa2' union
> >SELECT 2, 20, 'bb1', 'bb2' union
> >SELECT 3, 30, 'cc1', 'cc2' union
> >SELECT 4, 40, 'dd1', 'dd2' union
> >SELECT 5, 50, 'ee1', 'ee2' union
> >SELECT 6, 60, 'ff1', 'ff2' union
> >SELECT 7, 70, 'gg1', 'gg2' union
> >SELECT 8, 80, 'hh1', 'hh2' union
> >SELECT 9, 90, 'ii1', 'ii2' union
> >SELECT 10, 100, 'jj1', 'jj2'
> >
> >SELECT * FROM #temp1
> >
> >CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10))
> >INSERT INTO #temp2
> >SELECT  10, 'aa1', 'aa2' union
> >SELECT  20, 'bb1', 'bb2' union
> >SELECT  30, 'cc1', 'cc2' union
> >SELECT  40, 'dd1', 'dd2' union
> >SELECT  50, 'ee1', 'ee2'
> >
> >
> >select * from #temp1 where otherID Not In (select otherID from #temp2)
> >
> >this returns all the rows in #temp1 where recID > 50
> >
> >
> >select * from #temp1 where NOT exists (select otherID from #temp2)
> >
> >This does not return any rows.
> >
> >When do I want to use Not Exists?
> >
> >
> >"Rich" wrote:
> >
> >> Hello,
> >>
> >> I need to pull a subset of records from a table.
> >>
> >> Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1
> >> join tbl2 t2 on t1.OtherID = t2.OtherID)
> >>
> >> This works fine and returns the correct number of rows.  But then I try this:
> >>
> >> Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1,
> >> t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
> >>
> >> and get an error message stating that I need to use Not Exists when using
> >> mutiple columns in the subquery.
> >>
> >> Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2
> >> from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
> >>
> >> Now I don't get any rows - same data - same conditions (I think?)   What are
> >> the rules for using Not Exists?
> >>
> >> Thanks,
> >> Rich
> >>
Author
2 Aug 2006 5:29 PM
Rich
Point taken.  Thanks very much for this insight.  Now I am real glad that I
learned how to use Not Exists.

Rich

Show quote
"SQL Menace" wrote:

> In has also a 'side effect' with NULLS take a look at this
>
> CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2
> varchar(10))
> INSERT INTO #temp1
> SELECT 1, 10, 'aa1', 'aa2' union
> SELECT 2, 20, 'bb1', 'bb2' union
> SELECT 3, 30, 'cc1', 'cc2' union
> SELECT 4, 40, 'dd1', 'dd2' union
> SELECT 5, 50, 'ee1', 'ee2' union
> SELECT 6, 60, 'ff1', 'ff2' union
> SELECT 7, 70, 'gg1', 'gg2' union
> SELECT 8, 80, 'hh1', 'hh2' union
> SELECT 9, 90, 'ii1', 'ii2' union
> SELECT 10, 100, 'jj1', 'jj2'
>
>
> CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10))
> INSERT INTO #temp2
> SELECT  10, 'aa1', 'aa2' union
> SELECT  20, 'bb1', 'bb2' union
> SELECT  30, 'cc1', 'cc2' union
> SELECT  40, 'dd1', 'dd2' union
> SELECT  50, 'ee1', 'ee2' union
> SELECT  NULL, 'ee1', 'ee2'  -- added this
>
>
> --won't work because of the NULL value
> SELECT 'query1',*
>   FROM #temp1
>  WHERE OtherID NOT IN
>        (select OtherID from #temp2)
>
> --this works now
> SELECT 'query2',*
>   FROM #temp1
>  WHERE OtherID NOT IN
>        (select OtherID from #temp2 WHERE OtherID IS NOT NULL )
>
> --Exists is always safe
> SELECT 'query3',*
>   FROM #temp1
>  WHERE NOT EXISTS
>        (select * from #temp2
>          where #temp1.OtherID = #temp2.OtherID)
>
> just keep that in mind
>
> Denis the SQL Menace
> http://sqlservercode.blogspot.com/
>
> Roy Harvey wrote:
> > SELECT *
> >   FROM #temp1
> >  WHERE OtherID NOT IN
> >        (select OtherID from #temp2)
> >
> > recID       otherID     col1       col2
> > ----------- ----------- ---------- ----------
> >           6          60 ff1        ff2
> >           7          70 gg1        gg2
> >           8          80 hh1        hh2
> >           9          90 ii1        ii2
> >          10         100 jj1        jj2
> >
> > SELECT *
> >   FROM #temp1
> >  WHERE NOT EXISTS
> >        (select * from #temp2
> >          where #temp1.OtherID = #temp2.OtherID)
> >
> > recID       otherID     col1       col2
> > ----------- ----------- ---------- ----------
> >           6          60 ff1        ff2
> >           7          70 gg1        gg2
> >           8          80 hh1        hh2
> >           9          90 ii1        ii2
> >          10         100 jj1        jj2
> >
> > Roy Harvey
> > Beacon Falls, CT
> >
> > On Wed, 2 Aug 2006 09:11:02 -0700, Rich
> > <R***@discussions.microsoft.com> wrote:
> >
> > >Here is some sample data (hope I clarify my problem)
> > >
> > >CREATE TABLE #temp1(recID int, otherID int, col1 varchar(10), col2
> > >varchar(10))
> > >INSERT INTO #temp1
> > >SELECT 1, 10, 'aa1', 'aa2' union
> > >SELECT 2, 20, 'bb1', 'bb2' union
> > >SELECT 3, 30, 'cc1', 'cc2' union
> > >SELECT 4, 40, 'dd1', 'dd2' union
> > >SELECT 5, 50, 'ee1', 'ee2' union
> > >SELECT 6, 60, 'ff1', 'ff2' union
> > >SELECT 7, 70, 'gg1', 'gg2' union
> > >SELECT 8, 80, 'hh1', 'hh2' union
> > >SELECT 9, 90, 'ii1', 'ii2' union
> > >SELECT 10, 100, 'jj1', 'jj2'
> > >
> > >SELECT * FROM #temp1
> > >
> > >CREATE TABLE #temp2(otherID int, col1 varchar(10), col2 varchar(10))
> > >INSERT INTO #temp2
> > >SELECT  10, 'aa1', 'aa2' union
> > >SELECT  20, 'bb1', 'bb2' union
> > >SELECT  30, 'cc1', 'cc2' union
> > >SELECT  40, 'dd1', 'dd2' union
> > >SELECT  50, 'ee1', 'ee2'
> > >
> > >
> > >select * from #temp1 where otherID Not In (select otherID from #temp2)
> > >
> > >this returns all the rows in #temp1 where recID > 50
> > >
> > >
> > >select * from #temp1 where NOT exists (select otherID from #temp2)
> > >
> > >This does not return any rows.
> > >
> > >When do I want to use Not Exists?
> > >
> > >
> > >"Rich" wrote:
> > >
> > >> Hello,
> > >>
> > >> I need to pull a subset of records from a table.
> > >>
> > >> Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1
> > >> join tbl2 t2 on t1.OtherID = t2.OtherID)
> > >>
> > >> This works fine and returns the correct number of rows.  But then I try this:
> > >>
> > >> Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1,
> > >> t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
> > >>
> > >> and get an error message stating that I need to use Not Exists when using
> > >> mutiple columns in the subquery.
> > >>
> > >> Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2
> > >> from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
> > >>
> > >> Now I don't get any rows - same data - same conditions (I think?)   What are
> > >> the rules for using Not Exists?
> > >>
> > >> Thanks,
> > >> Rich
> > >>
>
>
Author
2 Aug 2006 6:18 PM
Roy Harvey
On Wed, 2 Aug 2006 10:29:03 -0700, Rich
<R***@discussions.microsoft.com> wrote:

>Point taken.  Thanks very much for this insight.  Now I am real glad that I
>learned how to use Not Exists.
>
>Rich

The other situation where NOT EXISTS is required is when there is a
multi-column join.  IN only works with single columns (at least in
Transact-SQL, the dialect of SQL used by SQL Server).  An EXISTS
subquery can join on as many columns as required.

Roy
Author
2 Aug 2006 4:15 PM
Roy Harvey
On Wed, 2 Aug 2006 08:39:02 -0700, Rich
<R***@discussions.microsoft.com> wrote:

>Hello,
>
>I need to pull a subset of records from a table.
>
>Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1
>join tbl2 t2 on t1.OtherID = t2.OtherID)

First, as written there is no need for tbl1 to be specified in the
subquery, and the NOT IN can simply be on OtherID.

SELECT *
  FROM tbl1
WHERE OtherID NOT IN
       (select OtherID from tbl2)

>This works fine and returns the correct number of rows.  But then I try this:
>
>Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1,
>t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)

It makes no sense to include multiple columns in an IN subquery, as
the results are only used to compare to the one column specified
before the IN.

>and get an error message stating that I need to use Not Exists when using
>mutiple columns in the subquery.
>
>Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2
>from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
>
>Now I don't get any rows - same data - same conditions (I think?)   What are
>the rules for using Not Exists?

SELECT *
  FROM tbl1
WHERE NOT EXISTS
       (select * from tbl2
         where tbl1.OtherID = tbl2.OtherID)

The rule for using NOT EXISTS is that the subquery is a correlated
subquery.  Note that tbl1 only appears in the outer FROM clause, but
it is referenced in the inner WHERE clause.  That is what makes it
correlated.

>Thanks,
>Rich

Roy Harvey
Beacon Falls, CT
Author
2 Aug 2006 4:53 PM
Rich
Hi Roy,

Thanks very much for your replies and explanation.

SELECT *
  FROM tbl1
WHERE NOT EXISTS
       (select * from tbl2
         where tbl1.OtherID = tbl2.OtherID)

>>select * from tbl2
         where tbl1.OtherID = tbl2.OtherID<<

OK.  Now I think I get it - when using Exists/Not Exists  tbl1 (the outer
table) needs to be referenced in a WHERE clause in the subquery.    Got it.

Thanks,
Rich

Show quote
"Roy Harvey" wrote:

> On Wed, 2 Aug 2006 08:39:02 -0700, Rich
> <R***@discussions.microsoft.com> wrote:
>
> >Hello,
> >
> >I need to pull a subset of records from a table.
> >
> >Select * from tbl1 where RecordID Not In (select t1.RecordID from tbl1 t1
> >join tbl2 t2 on t1.OtherID = t2.OtherID)
>
> First, as written there is no need for tbl1 to be specified in the
> subquery, and the NOT IN can simply be on OtherID.
>
> SELECT *
>   FROM tbl1
>  WHERE OtherID NOT IN
>        (select OtherID from tbl2)
>
> >This works fine and returns the correct number of rows.  But then I try this:
> >
> >Select * from tbl1 where RecordID Not In (select t1.RecordID, t1.col1,
> >t1.col2 from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
>
> It makes no sense to include multiple columns in an IN subquery, as
> the results are only used to compare to the one column specified
> before the IN.
>
> >and get an error message stating that I need to use Not Exists when using
> >mutiple columns in the subquery.
> >
> >Select * from tbl1 where Not Exists (select t1.RecordID, t1.col1, t1.col2
> >from tbl1 t1 join tbl2 t2 on t1.OtherID = t2.OtherID)
> >
> >Now I don't get any rows - same data - same conditions (I think?)   What are
> >the rules for using Not Exists?
>
> SELECT *
>   FROM tbl1
>  WHERE NOT EXISTS
>        (select * from tbl2
>          where tbl1.OtherID = tbl2.OtherID)
>
> The rule for using NOT EXISTS is that the subquery is a correlated
> subquery.  Note that tbl1 only appears in the outer FROM clause, but
> it is referenced in the inner WHERE clause.  That is what makes it
> correlated.
>
> >Thanks,
> >Rich
>
> Roy Harvey
> Beacon Falls, CT
>
Author
3 Aug 2006 1:41 PM
John Baima
Roy Harvey <roy_har***@snet.net> wrote:

>
>SELECT *
>  FROM tbl1
> WHERE NOT EXISTS
>       (select * from tbl2
>         where tbl1.OtherID = tbl2.OtherID)
>

This works, but in my experience the NOT EXISTS is a performance
no-no. You can get the same results much faster with

SELECT *
  FROM tbl1
  LEFT JOIN tbl2 on tbl2.OtherID = tbl1.OtherID
WHERE tbl2.OtherID is null

-John
Author
3 Aug 2006 1:45 PM
Alexander Kuznetsov
John,

Interesting, because I have frequently seen quite the opposite: NOT
EXISTS working faster then an OUTER JOIN followed by a filter. Would
you mind posting a repro?
Author
3 Aug 2006 2:34 PM
John Baima
"Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote:

>John,
>
>Interesting, because I have frequently seen quite the opposite: NOT
>EXISTS working faster then an OUTER JOIN followed by a filter. Would
>you mind posting a repro?


I don't have a real example I can post, but the difference is
sometimes dramatic. If you look at the execution plans on the
Northwind DB of

SELECT c.*
FROM [Northwind].[dbo].[Customers] C
WHERE NOT EXISTS (SELECT 1 FROM [Northwind].[dbo].[Orders] o
        WHERE o.[CustomerID] = c.[CustomerID])

and

SELECT c.*
FROM [Northwind].[dbo].[Customers] C
LEFT JOIN [Northwind].[dbo].[Orders] o on o.[CustomerID] =
c.[CustomerID]
WHERE o.[CustomerID] IS NULL

you will see that they are about the same with the added filter on the
later. While with just a few rows that looks like it takes longer,
with real data, I've always found it a lot faster.

-John
Author
3 Aug 2006 3:23 PM
Alexander Kuznetsov
Regarding Customers and Orders, I would say that we need to distinguish
at least 3 differerent cases:
1. Some customers have a few orders, such as 100K customers and 100K
orders, 50K customers without orders.
2. Most customers have a few orders, such as 100K customers and 500K
orders, 10K customers without orders.
3. Most customers have a lot of orders, just a few don't have any, such
as 100K customers and 2M orders, 1K customers without orders.

In the third case I have seen NOT EXISTS run somewhat (though not
dramatically) faster than NOT IN.
In which case did you see NOT IN outperform NOT EXISTS?
Author
3 Aug 2006 10:56 PM
Roy Harvey
Small point, but he was not saying NOT IN performed better, but rather
LEFT OUTER with a NULL test.

Roy

On 3 Aug 2006 08:23:43 -0700, "Alexander Kuznetsov"
<AK_TIREDOFSPAM@hotmail.COM> wrote:

Show quote
>Regarding Customers and Orders, I would say that we need to distinguish
>at least 3 differerent cases:
>1. Some customers have a few orders, such as 100K customers and 100K
>orders, 50K customers without orders.
>2. Most customers have a few orders, such as 100K customers and 500K
>orders, 10K customers without orders.
>3. Most customers have a lot of orders, just a few don't have any, such
>as 100K customers and 2M orders, 1K customers without orders.
>
>In the third case I have seen NOT EXISTS run somewhat (though not
>dramatically) faster than NOT IN.
>In which case did you see NOT IN outperform NOT EXISTS?
Author
4 Aug 2006 1:52 AM
John Baima
Yes, Roy, that is correct. The LEFT JOIN with a NULL test is a lot
faster in our applications than a NOT IN or NOT EXISTS. While the
LEFT+null seemed odd to me at first, I now think that way and never
use the NOT EXISTS anymore.

-John

Roy Harvey <roy_har***@snet.net> wrote:

Show quote
>Small point, but he was not saying NOT IN performed better, but rather
>LEFT OUTER with a NULL test.
>
>Roy
>
>On 3 Aug 2006 08:23:43 -0700, "Alexander Kuznetsov"
><AK_TIREDOFSPAM@hotmail.COM> wrote:
>
>>Regarding Customers and Orders, I would say that we need to distinguish
>>at least 3 differerent cases:
>>1. Some customers have a few orders, such as 100K customers and 100K
>>orders, 50K customers without orders.
>>2. Most customers have a few orders, such as 100K customers and 500K
>>orders, 10K customers without orders.
>>3. Most customers have a lot of orders, just a few don't have any, such
>>as 100K customers and 2M orders, 1K customers without orders.
>>
>>In the third case I have seen NOT EXISTS run somewhat (though not
>>dramatically) faster than NOT IN.
>>In which case did you see NOT IN outperform NOT EXISTS?

John Baima
Author
4 Aug 2006 2:00 AM
Alexander Kuznetsov
John,

Can you recall which kind of data were you running your query against:

1. Some customers have a few orders, such as 100K customers and 100K
orders, 50K customers without orders.
2. Most customers have a few orders, such as 100K customers and 500K
orders, 10K customers without orders.
3. Most customers have a lot of orders, just a few don't have any, such

as 100K customers and 2M orders, 1K customers without orders.

I'm trying to say that in case 3 it can be the other way around and in
case 1 I did not notice any difference between NOT EXISTS, NOT IN and
an OUTER JOIN at all.
Author
4 Aug 2006 1:52 AM
Alexander Kuznetsov
Alexander Kuznetsov wrote:
> Regarding Customers and Orders, I would say that we need to distinguish
> at least 3 differerent cases:
> 1. Some customers have a few orders, such as 100K customers and 100K
> orders, 50K customers without orders.
> 2. Most customers have a few orders, such as 100K customers and 500K
> orders, 10K customers without orders.
> 3. Most customers have a lot of orders, just a few don't have any, such
> as 100K customers and 2M orders, 1K customers without orders.
>
Roy,

you are correct, my question needs to be corrected:

In the third case I have seen NOT EXISTS run somewhat (though not
dramatically) faster than OUTER JOIN. In which case did you see NOT IN
outperform OUTER JOIN?
I would like to add that sometimes comparing 2 queries against
different data gives entirely different results...

AddThis Social Bookmark Button