Home All Groups Group Topic Archive Search About
Author
10 Aug 2006 9:12 AM
Olivier Matrot
Hello,
I would like to optimize the following query. The goal is to give a record
number on each row based on a row owner and a timestamp.
UPDATE dbo.mytable

SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
a.[timestamp] >= b.[timestamp])

FROM dbo.mytable a WHERE owner = N'me' AND rec_id IS NULL

TIA.

Author
10 Aug 2006 9:22 AM
Chris Lim
Olivier Matrot wrote:
> Hello,
> I would like to optimize the following query. The goal is to give a record
> number on each row based on a row owner and a timestamp.
> UPDATE dbo.mytable
>
> SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
> a.[timestamp] >= b.[timestamp])
>
> FROM dbo.mytable a WHERE owner = N'me' AND rec_id IS NULL

Assuming you have an index on myTable on columns (owner, timestamp),
then I think that query is pretty much as good as you can get it.

Note that you don't need the FROM clause though in your UPDATE. i.e.

UPDATE dbo.mytable
SET rec_id = (SELECT COUNT(*)
                     FROM dbo.mytable b
                     WHERE b.owner= dbo.mytable.owner
                     AND b.[timestamp] < dbo.mytable.[timestamp] )
WHERE owner = N'me'
AND rec_id IS NULL
Author
10 Aug 2006 9:51 AM
Olivier Matrot
Ok thanks for that.
Could we imagine using a user function to get the next record number instead
? Would it be faster ?

Show quote
"Chris Lim" <blackca***@hotmail.com> wrote in message
news:1155201742.194301.111490@q16g2000cwq.googlegroups.com...
> Olivier Matrot wrote:
>> Hello,
>> I would like to optimize the following query. The goal is to give a
>> record
>> number on each row based on a row owner and a timestamp.
>> UPDATE dbo.mytable
>>
>> SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
>> a.[timestamp] >= b.[timestamp])
>>
>> FROM dbo.mytable a WHERE owner = N'me' AND rec_id IS NULL
>
> Assuming you have an index on myTable on columns (owner, timestamp),
> then I think that query is pretty much as good as you can get it.
>
> Note that you don't need the FROM clause though in your UPDATE. i.e.
>
> UPDATE dbo.mytable
> SET rec_id = (SELECT COUNT(*)
>                     FROM dbo.mytable b
>                     WHERE b.owner= dbo.mytable.owner
>                     AND b.[timestamp] < dbo.mytable.[timestamp] )
> WHERE owner = N'me'
> AND rec_id IS NULL
>
Author
10 Aug 2006 10:05 AM
Chris Lim
Olivier Matrot wrote:
> Ok thanks for that.
> Could we imagine using a user function to get the next record number instead
> ? Would it be faster ?

If you are using SQL Server 2005 you could use the ROW_NUMBER() OVER()
function to generate your sequence numbers. e.g.

UPDATE dbo.mytable
SET rec_id = b.RowNbr
FROM dbo.mytable a
INNER JOIN (
                     SELECT owner,
                                  [timestamp],
                                  RowNbr = ROW_NUMBER()
                                                   OVER(PARTITION BY
owner
                                                               ORDER BY
[timestamp])
                      FROM mytable
                      WHERE owner = N'me'
                      AND recid is NULL
                   ) b
  ON b.owner = a.owner
  AND b.[timestamp] = a.[timestamp]
WHERE a.owner = N'me'
AND a.rec_id IS NULL

(I haven't actually used SQL Server 2005 so excuse me if I got the
syntax wrong!)

That might be faster since it's not doing a co-related sub-query to
work out the row numbers, but as you can see it's a more complicated
query.

Chris
Author
10 Aug 2006 3:11 PM
Olivier Matrot
Chris,

Thanks for that. I'll give it a try.
BTW the syntax is good !!!!

Show quote
"Chris Lim" <blackca***@hotmail.com> wrote in message
news:1155204321.292170.185540@m79g2000cwm.googlegroups.com...
> Olivier Matrot wrote:
>> Ok thanks for that.
>> Could we imagine using a user function to get the next record number
>> instead
>> ? Would it be faster ?
>
> If you are using SQL Server 2005 you could use the ROW_NUMBER() OVER()
> function to generate your sequence numbers. e.g.
>
> UPDATE dbo.mytable
> SET rec_id = b.RowNbr
> FROM dbo.mytable a
> INNER JOIN (
>                     SELECT owner,
>                                  [timestamp],
>                                  RowNbr = ROW_NUMBER()
>                                                   OVER(PARTITION BY
> owner
>                                                               ORDER BY
> [timestamp])
>                      FROM mytable
>                      WHERE owner = N'me'
>                      AND recid is NULL
>                   ) b
>  ON b.owner = a.owner
>  AND b.[timestamp] = a.[timestamp]
> WHERE a.owner = N'me'
> AND a.rec_id IS NULL
>
> (I haven't actually used SQL Server 2005 so excuse me if I got the
> syntax wrong!)
>
> That might be faster since it's not doing a co-related sub-query to
> work out the row numbers, but as you can see it's a more complicated
> query.
>
> Chris
>
Author
10 Aug 2006 9:58 AM
Uri Dimant
Hi
Does the query run slow or give a wrong output?

UPDATE dbo.mytable
SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
a.[timestamp] >= b.[timestamp])
WHERE EXISTS (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id IS
NULL)





Show quote
"Olivier Matrot" <olivier.matrot.rte@online.nospam> wrote in message
news:Ojs$n0FvGHA.4140@TK2MSFTNGP03.phx.gbl...
> Hello,
> I would like to optimize the following query. The goal is to give a record
> number on each row based on a row owner and a timestamp.
> UPDATE dbo.mytable
>
> SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
> a.[timestamp] >= b.[timestamp])
>
> FROM dbo.mytable a WHERE owner = N'me' AND rec_id IS NULL
>
> TIA.
>
Author
10 Aug 2006 10:07 AM
Chris Lim
Uri Dimant wrote:
> UPDATE dbo.mytable
> SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
> a.[timestamp] >= b.[timestamp])
> WHERE EXISTS (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id IS
> NULL)

Wouldn't that update all rows in the table, instead of just the ones
where owner = 'Me' and rec_id is null?
Author
10 Aug 2006 10:12 AM
Uri Dimant
Chris
NO, because in correlated subquery is a WHERE condition 'WHERE owner = N'me'
AND rec_id IS NULL'


Show quote
"Chris Lim" <blackca***@hotmail.com> wrote in message
news:1155204424.696723.105920@b28g2000cwb.googlegroups.com...
> Uri Dimant wrote:
>> UPDATE dbo.mytable
>> SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
>> a.[timestamp] >= b.[timestamp])
>> WHERE EXISTS (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id
>> IS
>> NULL)
>
> Wouldn't that update all rows in the table, instead of just the ones
> where owner = 'Me' and rec_id is null?
>
Author
10 Aug 2006 10:17 AM
Chris Lim
Uri Dimant wrote:
> Chris
> NO, because in correlated subquery is a WHERE condition 'WHERE owner = N'me'
> AND rec_id IS NULL'

But that isn't a correlated subquery! Is it? Am I missing something?
Author
10 Aug 2006 10:26 AM
Uri Dimant
Chris
Yes ,it is a corellated subquery.EXISTS simply checks for a nonempty set.It
returns (internally) either TRUE or NOT TRUE

Show quote
"Chris Lim" <blackca***@hotmail.com> wrote in message
news:1155205073.143853.213620@75g2000cwc.googlegroups.com...
> Uri Dimant wrote:
>> Chris
>> NO, because in correlated subquery is a WHERE condition 'WHERE owner =
>> N'me'
>> AND rec_id IS NULL'
>
> But that isn't a correlated subquery! Is it? Am I missing something?
>
Author
10 Aug 2006 10:37 AM
Chris Lim
Uri Dimant wrote:
> Chris
> Yes ,it is a corellated subquery.EXISTS simply checks for a nonempty set.It
> returns (internally) either TRUE or NOT TRUE

How is this sub-query correlated to the outer query?

> (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id IS
> NULL)

Would this not return TRUE if there was one or more rows in mytable
with owner = 'Me' and rec_id null, and then update all rows in the
outer query?

> UPDATE dbo.mytable
> SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
> a.[timestamp] >= b.[timestamp])
> WHERE EXISTS (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id IS
> NULL)

Chris
Author
10 Aug 2006 10:56 AM
Uri Dimant
Chris
For example you have the following query

select <column list> from tabel_a where not exists ( select * from table_a
a where a.col=table_a.col)

NOT EXISTS  correlated  table_a to those matches ,looking  for table_a that
don't  have  a single row returned in the subquery
The optimizer knows that as soon as it gets a single hit using NOT EXISTS
the operation is TRUE or NOT TRUE   and futher operation is unnecessary


Think about correlated subquries as a loop in programming  such as do
....while or for... The result of each execution of the subquery must be
correlated to a row of the main query





Show quote
"Chris Lim" <blackca***@hotmail.com> wrote in message
news:1155206245.887708.326960@75g2000cwc.googlegroups.com...
> Uri Dimant wrote:
>> Chris
>> Yes ,it is a corellated subquery.EXISTS simply checks for a nonempty
>> set.It
>> returns (internally) either TRUE or NOT TRUE
>
> How is this sub-query correlated to the outer query?
>
>> (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id IS
>> NULL)
>
> Would this not return TRUE if there was one or more rows in mytable
> with owner = 'Me' and rec_id null, and then update all rows in the
> outer query?
>
>> UPDATE dbo.mytable
>> SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
>> a.[timestamp] >= b.[timestamp])
>> WHERE EXISTS (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id
>> IS
>> NULL)
>
> Chris
>
Author
10 Aug 2006 11:12 AM
Chris Lim
Uri Dimant wrote:
> Think about correlated subquries as a loop in programming  such as do
> ...while or for... The result of each execution of the subquery must be
> correlated to a row of the main query

Thanks, I'm aware of how correlated subqueries work. I still don't see
how your one is a correlated subquery though.

Perhaps someone else could explain this to me? I don't have access to
SQL Server right now otherwise I'd try it myself.

Chris
Author
10 Aug 2006 11:19 AM
Uri Dimant
Chris
> Perhaps someone else could explain this to me?

UPDATE dbo.mytable
SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
a.[timestamp] >= b.[timestamp])
WHERE EXISTS (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id IS
NULL)


--------
EXISTS  correlated dbo.mytable
to those matches ,looking  for dbo.mytable
that do  have  a single row returned in the subquery



Show quote
"Chris Lim" <blackca***@hotmail.com> wrote in message
news:1155208333.091637.110730@h48g2000cwc.googlegroups.com...
> Uri Dimant wrote:
>> Think about correlated subquries as a loop in programming  such as do
>> ...while or for... The result of each execution of the subquery must be
>> correlated to a row of the main query
>
> Thanks, I'm aware of how correlated subqueries work. I still don't see
> how your one is a correlated subquery though.
>
> Perhaps someone else could explain this to me? I don't have access to
> SQL Server right now otherwise I'd try it myself.
>
> Chris
>
Author
10 Aug 2006 11:24 AM
Chris Lim
Uri Dimant wrote:
> UPDATE dbo.mytable
> SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
> a.[timestamp] >= b.[timestamp])
> WHERE EXISTS (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id IS
> NULL)
>
>  EXISTS  correlated dbo.mytable
>  to those matches ,looking  for dbo.mytable
>  that do  have  a single row returned in the subquery

Still don't see the correlation. It's getting late, maybe someone will
be able to explain this to me in the morning!
Author
10 Aug 2006 12:00 PM
Tibor Karaszi
That subquery cannot be correlated. Since the table referred to in the EXISTS subquery is the same
table which will be updates, the column names referred to (non-qualified, in this case) either
doesn't exist in that table and an error is produced or refers to the table inside the subquery
which mean no correlation to the outer query.

Show quote
"Chris Lim" <blackca***@hotmail.com> wrote in message
news:1155209067.943185.249640@i3g2000cwc.googlegroups.com...
> Uri Dimant wrote:
>> UPDATE dbo.mytable
>> SET rec_id = (SELECT COUNT(*) FROM dbo.mytable b WHERE owner= N'me' and
>> a.[timestamp] >= b.[timestamp])
>> WHERE EXISTS (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id IS
>> NULL)
>>
>>  EXISTS  correlated dbo.mytable
>>  to those matches ,looking  for dbo.mytable
>>  that do  have  a single row returned in the subquery
>
> Still don't see the correlation. It's getting late, maybe someone will
> be able to explain this to me in the morning!
>
Author
10 Aug 2006 12:17 PM
Johan Sjöström
Uri left out the alias (table prefix) "a" in his original post, making
the example hard to comment from there.


A correlated example with EXISTS, straight from the SQL2K5 Impl. and
maintenance, p.175:

SELECT *
FROM HumanResources.Employee AS E
WHERE EXISTS (
SELECT *
FROM HumanResources.EmployeeAddress EA
WHERE E.EmployeeId = EA.EmployeeId
)

As you all know, what makes this correlated is that the subquery
correlates the outer query's EmployeeId value to the subquery's
EmployeeId.

Tibor just made a good point though.

Cheers,
Johan Sjöström
MSc, MCP, MCAD
Author
10 Aug 2006 11:16 AM
Chris Lim
Uri Dimant wrote:
> select <column list> from tabel_a where not exists ( select * from table_a
> a where a.col=table_a.col)

In your above example, there is a link between table_a in your subquery
with table_a in your outer query. That is the correlation.

Where is the equivalent link in this subquery?

> (SELECT * FROM  dbo.mytable  WHERE owner = N'me' AND rec_id IS NULL)

Chris

AddThis Social Bookmark Button