Home All Groups Group Topic Archive Search About

unaccountable result for select

Author
3 Dec 2005 7:33 PM
helmut woess
Hi,

I am using MSDE2000 sp4 and tried following:
(you can copy it directly in QueryAnalizer)
-----------------------------------------------------
declare @tbl table (
  yyyymm  int,
  id      int,
  x       int )
--
insert into @tbl values(200502,1,0)
insert into @tbl values(200502,2,0)
insert into @tbl values(200502,1,0)
insert into @tbl values(200502,3,0)
insert into @tbl values(200503,1,0)
insert into @tbl values(200503,1,0)
insert into @tbl values(200503,1,0)
--
select distinct c.yyyymm,
  (select count(*) from (select distinct d.id from
    @tbl d where d.yyyymm = c.yyyymm) as dd) as idSum
from @tbl c group by c.yyyymm
---------------------------------------------------

and this is the result:

200502    3
200502    3
200502    3
200502    3
200502    3
200502    3
200502    3
200502    3
200503    1
200503    1
200503    1
200503    1
200503    1
200503    1

hmmm, 14 rows instead 2 as i estimated:
200502    3
200503    1

this is what i want: how many different id's per yyyymm.

I always thought, if i use distinct, i can never get complete identically
rows - but wrong. Is this normal? I can't believe that SQL-Server has a
bug. But i can't see my mistake.

thanks,
Helmut

Author
3 Dec 2005 8:25 PM
Steve Kass
Helmut,

You have not made a mistake.  This is a bug.  It is fixed in SQL Server
2005,
and a workaround in SQL Server 2000 for this particular query is

select distinct * from (
  select c.yyyymm,
    (select count(*) from (select distinct d.id from
      @tbl d where d.yyyymm = c.yyyymm) as dd) as idSum
  from @tbl c group by c.yyyymm
) T

However, SQL Server does not handle DISTINCT and
subqueries very well, so you should be careful in situations
where you use them.  If alternatives are possible, use them.
Here, for example, I think you could write this equivalent:

select
  c.yyyymm,
  count(distinct d.id) as idSum
from @tbl as c
join @tbl as d
on d.yyyymm = c.yyyymm
group by c.yyyymm


Steve Kass
Drew University

helmut woess wrote:

Show quote
>Hi,
>
>I am using MSDE2000 sp4 and tried following:
>(you can copy it directly in QueryAnalizer)
>-----------------------------------------------------
>declare @tbl table (
>  yyyymm  int,
>  id      int,
>  x       int )
>--
>insert into @tbl values(200502,1,0)
>insert into @tbl values(200502,2,0)
>insert into @tbl values(200502,1,0)
>insert into @tbl values(200502,3,0)
>insert into @tbl values(200503,1,0)
>insert into @tbl values(200503,1,0)
>insert into @tbl values(200503,1,0)
>--
>select distinct c.yyyymm,
>  (select count(*) from (select distinct d.id from
>    @tbl d where d.yyyymm = c.yyyymm) as dd) as idSum
>from @tbl c group by c.yyyymm
>---------------------------------------------------
>
>and this is the result:
>
>200502    3
>200502    3
>200502    3
>200502    3
>200502    3
>200502    3
>200502    3
>200502    3
>200503    1
>200503    1
>200503    1
>200503    1
>200503    1
>200503    1
>
>hmmm, 14 rows instead 2 as i estimated:
>200502    3
>200503    1
>
>this is what i want: how many different id's per yyyymm.
>
>I always thought, if i use distinct, i can never get complete identically
>rows - but wrong. Is this normal? I can't believe that SQL-Server has a
>bug. But i can't see my mistake.
>
>thanks,
>Helmut
>   

>
Author
3 Dec 2005 8:26 PM
Razvan Socol
Hi, Helmut

Remove the GROUP BY clause to get the expected result.

Razvan
Author
3 Dec 2005 8:31 PM
Steve Kass
Good suggestion - I missed seeing that, and with luck, that's an
easier and more reliable way to fix this.

Steve

Razvan Socol wrote:

Show quote
>Hi, Helmut
>
>Remove the GROUP BY clause to get the expected result.
>
>Razvan
>

>
Author
4 Dec 2005 8:27 AM
helmut woess
Am 3 Dec 2005 12:26:53 -0800 schrieb Razvan Socol:

> Hi, Helmut
>
> Remove the GROUP BY clause to get the expected result.
>
> Razvan

Hello Razvan and Steve,

thank's for your answers and the solution.

bye,
Helmut

AddThis Social Bookmark Button