|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Somewhat complicated GROUP behavior that has me stumpedbefore and it's got me stumped. Let me first say that I cannot redesign this database at all, it is an off-the-shelf product that my company has purchased, and I have to deal with the structure. That said, I'm trying to do something along the folowing lines: Assuming that I have a table that looks like this: AccountID TransData TransID 1 10 101 1 12 102 Basically the table stores a series of transactions that can be revised, and if a revision is entered, the old data stays and a new entry for the same AccountID is logged, but with a higher TransID. Please let me know if you're confused about what's going on. Basically, I need a query that returns the TransData for the highest TransID for each AccountID. If this makes any sense, I need to group by AccountID, and return the TransData column for the row that has the highest TransID. I've come up with something that concatenates the TransData and TransID and takes the Max of that, and then takes a substring of that to get back to just the TransData, but that seems like a goofy, overcomplicated, and inefficient way to do it. Anyone out there know how to do this elegantly? Thanks a ton! Stefan,
This is a pretty common question, and the answer isn't obvious until you've seen it. At least for most people... Your MAX solution is an excellent idea, and in fact it can be the best solution. (Be sure to use COALESCE on any nullable columns in the mix.) I think it can even be optimized by putting an index on the concatenation (as a computed column), but it's late and I'm too tired to check now. A more common solution to your question is this: select AccountID, TransData from T where TransID = ( select max(TransID) from T as Tcopy where Tcopy.AccountID = T.AccountID ) or similarly, select AccountID, TransData from T where not exists ( select * from T as Tcopy where Tcopy.AccountID = T.AccountID and Tcopy.TransID > T.TransID ) The SQL Server engine usually optimizes these well if there are supporting indexes, but another thing to try if you find otherwise is select AccountID, TransData from T where TransID = ( select top 1 TransID from T as Tcopy where Tcopy.AccountID = T.AccountID order by TransID desc ) I suspect the nullability of TransID affects potential query plans. The nice thing about your "carry along" solution is that it is expressed more clearly as an aggregate, which is the natural way to think of the problem. SK Stefan Wrobel wrote: Show quote >I'm trying to group results in a database in a way that I've never done >before and it's got me stumped. Let me first say that I cannot redesign this >database at all, it is an off-the-shelf product that my company has >purchased, and I have to deal with the structure. That said, I'm trying to >do something along the folowing lines: > >Assuming that I have a table that looks like this: >AccountID TransData TransID >1 10 101 >1 12 102 > >Basically the table stores a series of transactions that can be revised, and >if a revision is entered, the old data stays and a new entry for the same >AccountID is logged, but with a higher TransID. Please let me know if you're >confused about what's going on. Basically, I need a query that returns the >TransData for the highest TransID for each AccountID. If this makes any >sense, I need to group by AccountID, and return the TransData column for the >row that has the highest TransID. I've come up with something that >concatenates the TransData and TransID and takes the Max of that, and then >takes a substring of that to get back to just the TransData, but that seems >like a goofy, overcomplicated, and inefficient way to do it. Anyone out >there know how to do this elegantly? Thanks a ton! > > Stefan,
I just noticed your "can't redesign" comment, so I'll add that the optimization I mentioned might still be possible via an indexed view. If you create a view over this table with one extra (computed) column defined by the concatenation, then put a clustered index on (AccountID, newColumn), you might get lucky and have the optimizer see how to use the index, even if your query is against the table, in the case of SQL Server Enterprise Edition. For Standard, you would have write the query directly against the view, and perhaps even add the NOEXPAND query hint. You also might get unlucky and have this materialized view cause too much extra overhead to be worth anything. SK Stefan Wrobel wrote: Show quote >I'm trying to group results in a database in a way that I've never done >before and it's got me stumped. Let me first say that I cannot redesign this >database at all, it is an off-the-shelf product that my company has >purchased, and I have to deal with the structure. That said, I'm trying to >do something along the folowing lines: > >Assuming that I have a table that looks like this: >AccountID TransData TransID >1 10 101 >1 12 102 > >Basically the table stores a series of transactions that can be revised, and >if a revision is entered, the old data stays and a new entry for the same >AccountID is logged, but with a higher TransID. Please let me know if you're >confused about what's going on. Basically, I need a query that returns the >TransData for the highest TransID for each AccountID. If this makes any >sense, I need to group by AccountID, and return the TransData column for the >row that has the highest TransID. I've come up with something that >concatenates the TransData and TransID and takes the Max of that, and then >takes a substring of that to get back to just the TransData, but that seems >like a goofy, overcomplicated, and inefficient way to do it. Anyone out >there know how to do this elegantly? Thanks a ton! > > Hi Stefan
Try this SELECT A.AccountID,A.TransData,A.TransID FROM TRANS A where A.TransID IN (SELECT Max(TransID) FROM TRANS GROUP BY AccountId) trans is your table name here Regards R.D Show quote "Stefan Wrobel" wrote: > I'm trying to group results in a database in a way that I've never done > before and it's got me stumped. Let me first say that I cannot redesign this > database at all, it is an off-the-shelf product that my company has > purchased, and I have to deal with the structure. That said, I'm trying to > do something along the folowing lines: > > Assuming that I have a table that looks like this: > AccountID TransData TransID > 1 10 101 > 1 12 102 > > Basically the table stores a series of transactions that can be revised, and > if a revision is entered, the old data stays and a new entry for the same > AccountID is logged, but with a higher TransID. Please let me know if you're > confused about what's going on. Basically, I need a query that returns the > TransData for the highest TransID for each AccountID. If this makes any > sense, I need to group by AccountID, and return the TransData column for the > row that has the highest TransID. I've come up with something that > concatenates the TransData and TransID and takes the Max of that, and then > takes a substring of that to get back to just the TransData, but that seems > like a goofy, overcomplicated, and inefficient way to do it. Anyone out > there know how to do this elegantly? Thanks a ton! Thanks to both of you for your help solving this. I ended up using the
solution that Steve had proposed and R.D. also proposed. The execution time was basically equal for that and my aggregate method, but the query is just much more readable without all of the crazy concatenation and substring stuff. It seems like a straightforward thing that would have a more elegant solution, but there are obviously a million ways to get the job done! Thanks a ton to both of you! Show quote "R.D" wrote: > Hi Stefan > Try this > SELECT A.AccountID,A.TransData,A.TransID FROM TRANS A where A.TransID IN > (SELECT Max(TransID) FROM TRANS GROUP BY AccountId) > > trans is your table name here > > Regards > R.D > > > "Stefan Wrobel" wrote: > > > I'm trying to group results in a database in a way that I've never done > > before and it's got me stumped. Let me first say that I cannot redesign this > > database at all, it is an off-the-shelf product that my company has > > purchased, and I have to deal with the structure. That said, I'm trying to > > do something along the folowing lines: > > > > Assuming that I have a table that looks like this: > > AccountID TransData TransID > > 1 10 101 > > 1 12 102 > > > > Basically the table stores a series of transactions that can be revised, and > > if a revision is entered, the old data stays and a new entry for the same > > AccountID is logged, but with a higher TransID. Please let me know if you're > > confused about what's going on. Basically, I need a query that returns the > > TransData for the highest TransID for each AccountID. If this makes any > > sense, I need to group by AccountID, and return the TransData column for the > > row that has the highest TransID. I've come up with something that > > concatenates the TransData and TransID and takes the Max of that, and then > > takes a substring of that to get back to just the TransData, but that seems > > like a goofy, overcomplicated, and inefficient way to do it. Anyone out > > there know how to do this elegantly? Thanks a ton! Stefan,
In SQL Server 2005, and on products that support analytic functions, you can do this: select AccountID, TransData, TransID from ( select AccountID, TransData, TransID, rank() over (partition by AccountID order by TransID desc) as rk from T ) as T where rk = 1 (Windowed functions cannot appear in the WHERE clause in SQL Server 2005, but I don't know if that's a standard limitation.) There may be a cleaner solution, too, but I'm new to these features. SK Stefan Wrobel wrote: Show quote >Thanks to both of you for your help solving this. I ended up using the >solution that Steve had proposed and R.D. also proposed. The execution time >was basically equal for that and my aggregate method, but the query is just >much more readable without all of the crazy concatenation and substring >stuff. It seems like a straightforward thing that would have a more elegant >solution, but there are obviously a million ways to get the job done! Thanks >a ton to both of you! > >"R.D" wrote: > > > >>Hi Stefan >>Try this >>SELECT A.AccountID,A.TransData,A.TransID FROM TRANS A where A.TransID IN >>(SELECT Max(TransID) FROM TRANS GROUP BY AccountId) >> >>trans is your table name here >> >>Regards >>R.D >> >> >>"Stefan Wrobel" wrote: >> >> >> >>>I'm trying to group results in a database in a way that I've never done >>>before and it's got me stumped. Let me first say that I cannot redesign this >>>database at all, it is an off-the-shelf product that my company has >>>purchased, and I have to deal with the structure. That said, I'm trying to >>>do something along the folowing lines: >>> >>>Assuming that I have a table that looks like this: >>>AccountID TransData TransID >>>1 10 101 >>>1 12 102 >>> >>>Basically the table stores a series of transactions that can be revised, and >>>if a revision is entered, the old data stays and a new entry for the same >>>AccountID is logged, but with a higher TransID. Please let me know if you're >>>confused about what's going on. Basically, I need a query that returns the >>>TransData for the highest TransID for each AccountID. If this makes any >>>sense, I need to group by AccountID, and return the TransData column for the >>>row that has the highest TransID. I've come up with something that >>>concatenates the TransData and TransID and takes the Max of that, and then >>>takes a substring of that to get back to just the TransData, but that seems >>>like a goofy, overcomplicated, and inefficient way to do it. Anyone out >>>there know how to do this elegantly? Thanks a ton! >>> >>> |
|||||||||||||||||||||||