|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Using aliases in SELECT DISTINCT clause and ORDER BYI'm having a problem when using aliases for columns in the select list and using the DISTINCT keyword. This is a simplified version of the real problem. CREATE TABLE #Authors ( [AuthorId] INT, [Name] VARCHAR(50) ) CREATE TABLE #Books ( [BookId] INT, [AuthorId] INT, [Name] VARCHAR(50), [Item] VARCHAR(10) ) This query generates the error "ORDER BY items must appear in the select list if SELECT DISTINCT is specified.", which is OK because I'm ordering by a.[name] column which is not included in the SELECT list. SELECT DISTINCT b.[item] FROM #Books b INNER JOIN #Authors a ON b.AuthorId = a.AuthorId ORDER BY a.[name] I'm building a custom query language which is very similar to T-SQL and I translate commands from my language to T-SQL. The problem is that the user can add aliases to columns in the select list. Suppose that the user adds the alias [name] to the column b.item and in the ORDER BY I already have a colum named [name] but belonging to another table. This is the query and is totally valid in my context. SELECT DISTINCT b.item AS [name] FROM #Books b INNER JOIN #Authors a ON b.AuthorId = a.AuthorId ORDER BY a.[name] The query is the same as the first one with the difference of the alias [name], but this query executes OK. But why? I'm not including the column a.[name] in the select list... The rare thing is that the order is being applied by the first column which has the [name] alias and not by the [name] column of the table "a". It looks like SQL is being misled by the aliases, but I'm just guessing. So my question is: is this a bug or there is a reasonable explanation for this behavior? Thanks, DarÃo The ORDER BY clause always sorts the resultset. If you use an expression
in the ORDER BY clause (which is disallowed if you use the DISTINCT keyword), then this expression is added (hidden) to the Selection List. In your case, the parser does not seem to notice that there is a table alias prefixed to the Name column. It seems to ignore it. The matching of the ORDER BY criteria to the column names, select list aliases and expressions is very mysterious. I once did a whole set of experiments in a discussion with Steve Kass. The best advice I can give is to only use Select List aliases in the ORDER BY clause (so without table aliases). HTH, Gert-Jan Juan Darío Tempesta wrote: Show quote > > Hello, > > I'm having a problem when using aliases for columns in the select list and > using the DISTINCT keyword. This is a simplified version of the real problem. > > CREATE TABLE #Authors > ( > [AuthorId] INT, > [Name] VARCHAR(50) > ) > > CREATE TABLE #Books > ( > [BookId] INT, > [AuthorId] INT, > [Name] VARCHAR(50), > [Item] VARCHAR(10) > ) > > This query generates the error "ORDER BY items must appear in the select > list if SELECT DISTINCT is specified.", which is OK because I'm ordering by > a.[name] column which is not included in the SELECT list. > > SELECT DISTINCT b.[item] > FROM #Books b > INNER JOIN #Authors a > ON b.AuthorId = a.AuthorId > ORDER BY a.[name] > > I'm building a custom query language which is very similar to T-SQL and I > translate commands from my language to T-SQL. The problem is that the user > can add aliases to columns in the select list. Suppose that the user adds the > alias [name] to the column b.item and in the ORDER BY I already have a colum > named [name] but belonging to another table. This is the query and is totally > valid in my context. > > SELECT DISTINCT b.item AS [name] > FROM #Books b > INNER JOIN #Authors a > ON b.AuthorId = a.AuthorId > ORDER BY a.[name] > > The query is the same as the first one with the difference of the alias > [name], but this query executes OK. But why? I'm not including the column > a.[name] in the select list... The rare thing is that the order is being > applied by the first column which has the [name] alias and not by the [name] > column of the table "a". It looks like SQL is being misled by the aliases, > but I'm just guessing. > > So my question is: is this a bug or there is a reasonable explanation for > this behavior? > > Thanks, > DarÃo I think this is what is happening...
Table1 (simplified for this example) Name Age _______________ John 50 John 22 Paul 35 Thom 22 Mike 50 Select distinct Name from Table1 order by name; Name _____ John Mike Paul Thom Select distinct Age from Table1 order by age; Name _____ 22 35 50 Select distinct Name from Table1 order by age; Name _____ John (50 or 22?) Thom (22) Paul (35) Mike (50) The first 2 examples are simple. We can order by any field that we are selecting. In the last example, we are selecting distinct values and attempting to order by a different value which is not a 1 to 1 relationship to the values we are selecting. In order to get a distinct list of names ordered by age, we have to select age, and make the distinct rows dependent on it as well. In the case below this means we will get two rows where name = John. Select distinct Name, age from Table1 order by age; Name Age _______________ John 22 Thom 22 Paul 35 John 50 Mike 50 I wont ask why you are building your own query language, but there are dozens of reporting tools out there that are written to make writing queries easy, I can't imagine a reason for writing your own, unless you intend to market it. Show quote "Juan Darío Tempesta" <Juan Daro Tempe***@discussions.microsoft.com> wrote in message news:A4F5EFE8-A8B0-4B97-B947-1137ABF7CCCF@microsoft.com... > Hello, > > I'm having a problem when using aliases for columns in the select list and > using the DISTINCT keyword. This is a simplified version of the real problem. > > CREATE TABLE #Authors > ( > [AuthorId] INT, > [Name] VARCHAR(50) > ) > > CREATE TABLE #Books > ( > [BookId] INT, > [AuthorId] INT, > [Name] VARCHAR(50), > [Item] VARCHAR(10) > ) > > This query generates the error "ORDER BY items must appear in the select > list if SELECT DISTINCT is specified.", which is OK because I'm ordering by > a.[name] column which is not included in the SELECT list. > > SELECT DISTINCT b.[item] > FROM #Books b > INNER JOIN #Authors a > ON b.AuthorId = a.AuthorId > ORDER BY a.[name] > > I'm building a custom query language which is very similar to T-SQL and I > translate commands from my language to T-SQL. The problem is that the user > can add aliases to columns in the select list. Suppose that the user adds the > alias [name] to the column b.item and in the ORDER BY I already have a colum > named [name] but belonging to another table. This is the query and is totally > valid in my context. > > SELECT DISTINCT b.item AS [name] > FROM #Books b > INNER JOIN #Authors a > ON b.AuthorId = a.AuthorId > ORDER BY a.[name] > > The query is the same as the first one with the difference of the alias > [name], but this query executes OK. But why? I'm not including the column > a.[name] in the select list... The rare thing is that the order is being > applied by the first column which has the [name] alias and not by the [name] > column of the table "a". It looks like SQL is being misled by the aliases, > but I'm just guessing. > > So my question is: is this a bug or there is a reasonable explanation for > this behavior? > > Thanks, > Darío > "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message You mean your satisfied with the mess that Chamberlain and his friendsnews:%23szcFezMGHA.1536@TK2MSFTNGP11.phx.gbl... >. > I wont ask why you are building your own query language, >. have sattled us with for 35 years?Think if sql was written as an algebra. People could possibly make some sense of it:)I think we should be incouraging people to try to get it right:) $.02 from www.rac4sql.net Hi, Dario
The behaviour of SQL Server 2000 in respect to table prefixes in the ORDER BY clause is somewhat buggy and has been changed in SQL Server 2005. Books Online 2005 says the following, in the "Behavior Changes to Database Engine Features in SQL Server 2005" topic: [In SQL Server 2000] Column names in the ORDER BY clause are resolved to columns listed in the select list, regardless if they are qualified. For example, the following query executes without error: USE pubs SELECT au_fname AS 'FName', au_lname AS 'LName' FROM authors a ORDER BY a.LName SQL Server ignores the qualifier a in the ORDER BY clause and resolves the column name LName to the select list. [In SQL Server 2005] Qualified column names and aliases are resolved to columns of tables listed in the FROM clause. If order_by_expression is not qualified, it must be unique among all columns listed in the SELECT statement. For example, the following equivalent query returns an error: USE AdventureWorks SELECT FirstName AS 'FName', LastName AS 'LName' FROM Person.Contact p ORDER BY p.LName SQL Server does not ignore the qualifier p in the ORDER BY clause, and resolves the column name LName to tables listed in the FROM clause. But the FROM clause does not recognize that column LName is a column alias of table p." Even more unexpected than what it's presented above, the following query executes with no errors, although the table alias x is not defined in the query: USE pubs SELECT au_fname AS 'FName', au_lname AS 'LName' FROM authors a ORDER BY x.LName Also unexpected is the behaviour of this query: USE pubs SELECT REVERSE(au_fname) AS au_fname FROM authors a ORDER BY a.au_fname Although the ORDER BY clause states that the result should be sorted on the original column (a.au_fname), SQL Server sorts it on the column with the same name which is in the final resultset. As I've said earlier, this strange behaviours have been corrected in SQL Server 2005. Razvan |
|||||||||||||||||||||||