Home All Groups Group Topic Archive Search About

Using aliases in SELECT DISTINCT clause and ORDER BY

Author
16 Feb 2006 7:41 PM
Juan Darío Tempesta
Hello,

I'm having a problem when using aliases for columns in the select list and
using the DISTINCT keyword. This is a simplified version of the real problem.

CREATE TABLE #Authors
(
    [AuthorId] INT,
    [Name] VARCHAR(50)
)

CREATE TABLE #Books
(
    [BookId] INT,
    [AuthorId] INT,
    [Name] VARCHAR(50),
    [Item] VARCHAR(10)
)

This query generates the error "ORDER BY items must appear in the select
list if SELECT DISTINCT is specified.", which is OK because I'm ordering by
a.[name] column which is not included in the SELECT list.

SELECT DISTINCT b.[item]
FROM #Books b
    INNER JOIN #Authors a
        ON b.AuthorId = a.AuthorId
ORDER BY a.[name]

I'm building a custom query language which is very similar to T-SQL and I
translate commands from my language to T-SQL. The problem is that the user
can add aliases to columns in the select list. Suppose that the user adds the
alias [name] to the column b.item and in the ORDER BY I already have a colum
named [name] but belonging to another table. This is the query and is totally
valid in my context.

SELECT DISTINCT b.item AS [name]
FROM #Books b
    INNER JOIN #Authors a
        ON b.AuthorId = a.AuthorId
ORDER BY a.[name]

The query is the same as the first one with the difference of the alias
[name], but this query executes OK. But why? I'm not including the column
a.[name] in the select list... The rare thing is that the order is being
applied by the first column which has the [name] alias and not by the [name]
column of the table "a". It looks like SQL is being misled by the aliases,
but I'm just guessing.

So my question is: is this a bug or there is a reasonable explanation for
this behavior?

Thanks,
Darío

Author
16 Feb 2006 8:17 PM
Gert-Jan Strik
The ORDER BY clause always sorts the resultset. If you use an expression
in the ORDER BY clause (which is disallowed if you use the DISTINCT
keyword), then this expression is added (hidden) to the Selection List.

In your case, the parser does not seem to notice that there is a table
alias prefixed to the Name column. It seems to ignore it.

The matching of the ORDER BY criteria to the column names, select list
aliases and expressions is very mysterious. I once did a whole set of
experiments in a discussion with Steve Kass. The best advice I can give
is to only use Select List aliases in the ORDER BY clause (so without
table aliases).

HTH,
Gert-Jan


Juan Darío Tempesta wrote:
Show quote
>
> Hello,
>
> I'm having a problem when using aliases for columns in the select list and
> using the DISTINCT keyword. This is a simplified version of the real problem.
>
> CREATE TABLE #Authors
> (
>         [AuthorId] INT,
>         [Name] VARCHAR(50)
> )
>
> CREATE TABLE #Books
> (
>         [BookId] INT,
>         [AuthorId] INT,
>         [Name] VARCHAR(50),
>         [Item] VARCHAR(10)
> )
>
> This query generates the error "ORDER BY items must appear in the select
> list if SELECT DISTINCT is specified.", which is OK because I'm ordering by
> a.[name] column which is not included in the SELECT list.
>
> SELECT DISTINCT b.[item]
> FROM #Books b
>         INNER JOIN #Authors a
>                 ON b.AuthorId = a.AuthorId
> ORDER BY a.[name]
>
> I'm building a custom query language which is very similar to T-SQL and I
> translate commands from my language to T-SQL. The problem is that the user
> can add aliases to columns in the select list. Suppose that the user adds the
> alias [name] to the column b.item and in the ORDER BY I already have a colum
> named [name] but belonging to another table. This is the query and is totally
> valid in my context.
>
> SELECT DISTINCT b.item AS [name]
> FROM #Books b
>         INNER JOIN #Authors a
>                 ON b.AuthorId = a.AuthorId
> ORDER BY a.[name]
>
> The query is the same as the first one with the difference of the alias
> [name], but this query executes OK. But why? I'm not including the column
> a.[name] in the select list... The rare thing is that the order is being
> applied by the first column which has the [name] alias and not by the [name]
> column of the table "a". It looks like SQL is being misled by the aliases,
> but I'm just guessing.
>
> So my question is: is this a bug or there is a reasonable explanation for
> this behavior?
>
> Thanks,
> Darío
Author
16 Feb 2006 8:28 PM
Jim Underwood
I think this is what is happening...

Table1 (simplified for this example)
Name        Age
_______________
John           50
John           22
Paul           35
Thom          22
Mike           50

Select distinct Name from Table1 order by name;
Name
_____
John
Mike
Paul
Thom

Select distinct Age from Table1 order by age;
Name
_____
22
35
50

Select distinct Name from Table1 order by age;
Name
_____
John (50 or 22?)
Thom (22)
Paul (35)
Mike (50)

The first 2 examples are simple.  We can order by any field that we are
selecting.  In the last example, we are selecting distinct values and
attempting to order by a different value which is not a 1 to 1 relationship
to the values we are selecting.  In order to get a distinct list of names
ordered by age, we have to select age, and make the distinct rows dependent
on it as well.  In the case below this means we will get two rows where name
= John.

Select distinct Name, age from Table1 order by age;
Name        Age
_______________
John           22
Thom          22
Paul           35
John           50
Mike           50

I wont ask why you are building your own query language, but there are
dozens of reporting tools out there that are written to make writing queries
easy, I can't imagine a reason for writing your own, unless you intend to
market it.

Show quote
"Juan Darío Tempesta" <Juan Daro Tempe***@discussions.microsoft.com> wrote
in message news:A4F5EFE8-A8B0-4B97-B947-1137ABF7CCCF@microsoft.com...
> Hello,
>
> I'm having a problem when using aliases for columns in the select list and
> using the DISTINCT keyword. This is a simplified version of the real
problem.
>
> CREATE TABLE #Authors
> (
> [AuthorId] INT,
> [Name] VARCHAR(50)
> )
>
> CREATE TABLE #Books
> (
> [BookId] INT,
> [AuthorId] INT,
> [Name] VARCHAR(50),
> [Item] VARCHAR(10)
> )
>
> This query generates the error "ORDER BY items must appear in the select
> list if SELECT DISTINCT is specified.", which is OK because I'm ordering
by
> a.[name] column which is not included in the SELECT list.
>
> SELECT DISTINCT b.[item]
> FROM #Books b
> INNER JOIN #Authors a
> ON b.AuthorId = a.AuthorId
> ORDER BY a.[name]
>
> I'm building a custom query language which is very similar to T-SQL and I
> translate commands from my language to T-SQL. The problem is that the user
> can add aliases to columns in the select list. Suppose that the user adds
the
> alias [name] to the column b.item and in the ORDER BY I already have a
colum
> named [name] but belonging to another table. This is the query and is
totally
> valid in my context.
>
> SELECT DISTINCT b.item AS [name]
> FROM #Books b
> INNER JOIN #Authors a
> ON b.AuthorId = a.AuthorId
> ORDER BY a.[name]
>
> The query is the same as the first one with the difference of the alias
> [name], but this query executes OK. But why? I'm not including the column
> a.[name] in the select list... The rare thing is that the order is being
> applied by the first column which has the [name] alias and not by the
[name]
> column of the table "a". It looks like SQL is being misled by the aliases,
> but I'm just guessing.
>
> So my question is: is this a bug or there is a reasonable explanation for
> this behavior?
>
> Thanks,
> Darío
>
Author
17 Feb 2006 2:02 AM
05ponyGT
"Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message
news:%23szcFezMGHA.1536@TK2MSFTNGP11.phx.gbl...
>.
> I wont ask why you are building your own query language,
>.

You mean your satisfied with the mess that Chamberlain and his friends
have sattled us with for 35 years?Think if sql was written as an algebra.
People could possibly make some sense of it:)I think we should be
incouraging people to try to get it right:)

$.02 from
www.rac4sql.net
Author
17 Feb 2006 8:03 AM
Razvan Socol
Hi, Dario

The behaviour of SQL Server 2000 in respect to table prefixes in the
ORDER BY clause is somewhat buggy and has been changed in SQL Server
2005. Books Online 2005 says the following, in the "Behavior Changes to
Database Engine Features in SQL Server 2005" topic:

   [In SQL Server 2000] Column names in the ORDER BY clause are
   resolved to columns listed in the select list, regardless if
   they are qualified.

   For example, the following query executes without error:

   USE pubs
   SELECT au_fname AS 'FName',
     au_lname AS 'LName'
   FROM authors a
   ORDER BY a.LName

   SQL Server ignores the qualifier a in the ORDER BY clause and
   resolves the column name LName to the select list.

   [In SQL Server 2005] Qualified column names and aliases are
   resolved to columns of tables listed in the FROM clause. If
   order_by_expression is not qualified, it must be unique among
   all columns listed in the SELECT statement.

   For example, the following equivalent query returns an error:

   USE AdventureWorks
   SELECT FirstName AS 'FName',
    LastName AS 'LName'
   FROM Person.Contact p
   ORDER BY p.LName

   SQL Server does not ignore the qualifier p in the ORDER BY clause,
   and resolves the column name LName to tables listed in the
   FROM clause. But the FROM clause does not recognize that
   column LName is a column alias of table p."

Even more unexpected than what it's presented above, the following
query executes with no errors, although the table alias x is not
defined in the query:

USE pubs
SELECT au_fname AS 'FName',
  au_lname AS 'LName'
FROM authors a
ORDER BY x.LName

Also unexpected is the behaviour of this query:

USE pubs
SELECT REVERSE(au_fname) AS au_fname
FROM authors a
ORDER BY a.au_fname

Although the ORDER BY clause states that the result should be sorted on
the original column (a.au_fname), SQL Server sorts it on the column
with the same name which is in the final resultset.

As I've said earlier, this strange behaviours have been corrected in
SQL Server 2005.

Razvan

AddThis Social Bookmark Button