Home All Groups Group Topic Archive Search About

Newbie:UNION with SELECT TOP

Author
1 Dec 2005 7:19 PM
Chuck
I have been trying to query the Top 100 selling items for each category to be
one result set...all day.  I thought that this should work...

SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
FROM dbo.Sales
WHERE categoryid ='01'
GROUP BY productid
ORDER BY sum(sales) DESC
UNION
SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
FROM dbo.Sales
WHERE categoryid ='02'
GROUP BY productid
ORDER BY sum(sales) DESC
UNION...

....and so on for our 14 different categories.

This is the first time that I have tried to use that UNION command, and I'm
wondering if it is the right way to go about this? Obviously I am muddling my
way through SQL Queries and appreciate any help.

Author
1 Dec 2005 7:30 PM
Bob Barrows [MVP]
Chuck wrote:
Show quote
> I have been trying to query the Top 100 selling items for each
> category to be one result set...all day.  I thought that this should
> work...
>
> SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
> FROM dbo.Sales
> WHERE categoryid ='01'
> GROUP BY productid
> ORDER BY sum(sales) DESC
> UNION
> SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
> FROM dbo.Sales
> WHERE categoryid ='02'
> GROUP BY productid
> ORDER BY sum(sales) DESC
> UNION...
>
> ...and so on for our 14 different categories.
>
> This is the first time that I have tried to use that UNION command,
> and I'm wondering if it is the right way to go about this? Obviously
> I am muddling my way through SQL Queries and appreciate any help.

Apart from suggesting the use of UNION ALL instead of UNION*, I can make no
suggestions here. Is it not doing what you want it to do?

Bob Barrows
*Using the ALL keyword prevents the query engine from wasting time trying to
eliminate duplicate rows from the result. You know there will be no
duplicates, so use UNION ALL
--
Microsoft MVP -- ASP/ASP.NET
Please reply to the newsgroup. The email account listed in my From
header is my spam trap, so I don't check it very often. You will get a
quicker response by posting to the newsgroup.
Author
1 Dec 2005 7:48 PM
Aaron Bertrand [SQL Server MVP]
> Apart from suggesting the use of UNION ALL instead of UNION*, I can make
> no
> suggestions here. Is it not doing what you want it to do?

Well, you can't have ORDER BY within each UNION, you can only order the
entire result.

Though I am not sure why this query all has to be in one resultset.  My
three suggestions would be:

(a) get the top 100 across all categories:

SELECT TOP 100
        productID, categoryID, SUM(sales)
    FROM dbo.Sales
    GROUP BY productID, categoryID
    ORDER BY 3 DESC;

(b) an easy way to bypass the ORDER/UNION quandary is to use a #temp table
or @table variable and insert each result there, then you can order on the
whole instead of its parts.

(c) have 14 results, or take category as a parameter and call the stored
procedure however many times is relevant (my guess is that, in general, we
are either looking at sales overall (in which case (a) is more prudent) or
we are looking at a specific category).

A
Author
1 Dec 2005 8:07 PM
Bob Barrows [MVP]
Aaron Bertrand [SQL Server MVP] wrote:
>> Apart from suggesting the use of UNION ALL instead of UNION*, I can
>> make no
>> suggestions here. Is it not doing what you want it to do?
>
> Well, you can't have ORDER BY within each UNION, you can only order
> the entire result.
Thanks, I missed that (obviously ... )
--
Microsoft MVP -- ASP/ASP.NET
Please reply to the newsgroup. The email account listed in my From
header is my spam trap, so I don't check it very often. You will get a
quicker response by posting to the newsgroup.
Author
1 Dec 2005 7:50 PM
Alexander Kuznetsov
create table #t(productid int, categoryid int, sumsales float)
insert into #t values(1,1,12.3)
insert into #t values(2,1,11.0)
insert into #t values(3,1,15.3)
insert into #t values(1,2,12.3)
insert into #t values(2,2,1.3)
insert into #t values(3,2,2.3)
insert into #t values(4,2,7.3)
insert into #t values(5,2,12.1)

select * from #t t1 where
(select count(*) from #t t2 where t1.categoryid = t2.categoryid
and t2.sumsales>t1.sumsales)<2
order by categoryid, sumsales desc


(1 row(s) affected)


(1 row(s) affected)


(1 row(s) affected)


(1 row(s) affected)


(1 row(s) affected)


(1 row(s) affected)


(1 row(s) affected)


(1 row(s) affected)

productid   categoryid  sumsales

----------- -----------
-----------------------------------------------------
3           1           15.300000000000001
1           1           12.300000000000001
1           2           12.300000000000001
5           2           12.1

(4 row(s) affected)



drop table #t
Author
1 Dec 2005 8:03 PM
Alexander Kuznetsov
oops I hit Post button too soon.

the query I posted will give you top 2 in each category, *assuming
there are no ties*

I added a tie:

insert into #t values(4,1,12.3)

and this works:

select * from #t t1 where sumsales in
(select top 2 with ties sumsales
from #t t2 where t1.categoryid = t2.categoryid
order by sumsales desc)
order by categoryid, sumsales desc


productid   categoryid  sumsales

----------- -----------
3           1           15.300000000000001
4           1           12.300000000000001
1           1           12.300000000000001
1           2           12.300000000000001
5           2           12.1

(5 row(s) affected)
Author
1 Dec 2005 9:06 PM
Chuck
Show quote
"Alexander Kuznetsov" wrote:

> oops I hit Post button too soon.
>
> the query I posted will give you top 2 in each category, *assuming
> there are no ties*
>
> I added a tie:
>
> insert into #t values(4,1,12.3)
>
> and this works:
>
> select * from #t t1 where sumsales in
> (select top 2 with ties sumsales
> from #t t2 where t1.categoryid = t2.categoryid
> order by sumsales desc)
> order by categoryid, sumsales desc
>
>
> productid   categoryid  sumsales
>
> ----------- -----------
> 3           1           15.300000000000001
> 4           1           12.300000000000001
> 1           1           12.300000000000001
> 1           2           12.300000000000001
> 5           2           12.1
>
> (5 row(s) affected)
>
>

Bob, Aaron, and Alexander,

Thank you for your quick replies.  In answer to your questions...I have
thought of creating individual result sets, which would solve part of the
issue. The ultimate goal here is to create a VIEW which will join with an
inventory table to produce the percentage of Top 100 selling merchandise
"in-stock"  by category. As a result, at some point it will be necessary to
rejoin the individual result sets, which I am trying to avoid.

However, I want to break down the project into bite sized pieces, focussing
for the moment on this problem.

I have tried Alexander's suggestion, but the sumsales column in it is not an
aggregate, while my query it is.  That produced the error that I could not
use an aggregate as the value for an IN statement.  I have altered it to
this, which has produced a timeout error.

SELECT     ProductID, CategoryID, SUM(Sales) AS Expr1, SUM(QTY) AS Expr2
FROM         cghastin.jvsProd_Performance t1
WHERE     (ProductID IN
                          (SELECT     TOP 2 productid
                            FROM          cghastin.jvsprod_performance t2
                            WHERE      t1.categoryid = t2.categoryid
                            GROUP BY productid
                            ORDER BY SUM(sales) DESC))
GROUP BY CategoryID, ProductID
Author
1 Dec 2005 9:31 PM
Alexander Kuznetsov
an index on (productID, category_id, sales, qty)
might speed it up

Also you may try this:

select productid, categoryid, sumsales, sumqty
   from (
     select productid, categoryid, sum(sales) as sumsales, sum(qty) as
sumqty
     from sales
     group by productid, categoryid
   ) t
where (select count(*) from (
     select productid, categoryid, sum(sales) as sumsales, sum(qty) as
sumqty
     from sales
     group by productid, categoryid ) t1 where t.productid =
t1.productid
    and t.categoryid = t1.categoryid and t.sumsales < t1.sumsales)<100

that is with ties
Author
2 Dec 2005 4:18 PM
Chuck
ALL,

This has been very helpful.  I appreciate your input.  I have the syntax
that works, but am now experiencing timeout errors.  Since the topic has
changed somewhat, I have reposted and if you'd like to respond to the new
post I would appreciate that as well.




Show quote
"Alexander Kuznetsov" wrote:

> an index on (productID, category_id, sales, qty)
> might speed it up
>
> Also you may try this:
>
> select productid, categoryid, sumsales, sumqty
>    from (
>      select productid, categoryid, sum(sales) as sumsales, sum(qty) as
> sumqty
>      from sales
>      group by productid, categoryid
>    ) t
> where (select count(*) from (
>      select productid, categoryid, sum(sales) as sumsales, sum(qty) as
> sumqty
>      from sales
>      group by productid, categoryid ) t1 where t.productid =
> t1.productid
>     and t.categoryid = t1.categoryid and t.sumsales < t1.sumsales)<100
>
> that is with ties
>
>
Author
1 Dec 2005 9:31 PM
Alexander Kuznetsov
an index on (productID, category_id, sales, qty)
might speed it up

Also you may try this:

select productid, categoryid, sumsales, sumqty
   from (
     select productid, categoryid, sum(sales) as sumsales, sum(qty) as
sumqty
     from sales
     group by productid, categoryid
   ) t
where (select count(*) from (
     select productid, categoryid, sum(sales) as sumsales, sum(qty) as
sumqty
     from sales
     group by productid, categoryid ) t1 where t.productid =
t1.productid
    and t.categoryid = t1.categoryid and t.sumsales < t1.sumsales)<100

that is with ties
Author
1 Dec 2005 8:47 PM
Trey Walpole
you can do it with derived tables and subqueries as well, and not have
to hard-code categoryid's
e.g.

select productid, categoryid, sales
   from (
     select productid, categoryid, sum(sales) as sales
     from sales
     group by productid, categoryid
   ) SumAll
where productid in  (
   select top 100 productid
   from (
     select productid, sum(sales) as sales
     from sales
     where categoryid=SumAll.categoryid
     group by productid
   ) SumCategory
   order by sales desc
)
order by categoryid asc, sales desc



Chuck wrote:
Show quote
> I have been trying to query the Top 100 selling items for each category to be
> one result set...all day.  I thought that this should work...
>
> SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
> FROM dbo.Sales
> WHERE categoryid ='01'
> GROUP BY productid
> ORDER BY sum(sales) DESC
> UNION
> SELECT TOP 100 productid, categoryid, sum(sales) as [Sales]
> FROM dbo.Sales
> WHERE categoryid ='02'
> GROUP BY productid
> ORDER BY sum(sales) DESC
> UNION...
>
> ...and so on for our 14 different categories.
>
> This is the first time that I have tried to use that UNION command, and I'm
> wondering if it is the right way to go about this? Obviously I am muddling my
> way through SQL Queries and appreciate any help.

AddThis Social Bookmark Button