|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
SQL Join question - getting most recent value in a history tableMy question concerns three tables: - tblHeader - containing general information such as product ID, description etc HeaderID HeaderDescription HeaderSerialNumber lstLocations - lookup table containing location IDs LocID LocDescription tblLocationHistory - a transaction table containing the location history of each header and the date of their installation. HistoryID HeaderID LocID InstallDate What I'm after is a select which shows each line in tblHeader and its most recent location. However my first attempt (as you would expect from SQL) returns every product and every location they have been in - e.g. Product X has 7 locations - thus appears 7 times in the results from the below query. Unfortunately I'm after the most recent location of each Header *only*. I feel like this is something I should be able to do with my eyes closed but I can't seem to crack it. Please could someone point me in the right direction! Should I be looking at using UDF's for example? Cheers Chris. ******************* SELECT dbo.tblHeader.HeaderID, dbo.tblHeader.Type, dbo.tblHeader.SerialNumber, dbo.lstLocations.Location_Name, dbo.tblLocationHistory.InstallDate FROM dbo.lstLocations INNER JOIN dbo.tblLocationHistory ON dbo.lstLocations.Location_ID = dbo.tblLocationHistory.LocationID INNER JOIN dbo.tblHeader ON dbo.tblLocationHistory.HeaderID = dbo.tblHeader.HeaderID Chris Strug wrote:
> What I'm after is a select which shows each line in tblHeader and its most You need a derived table that holds the most recent location for each> recent location. header. Something like the following, though there are probably better ways (it's getting late..!): SELECT h.HeaderID, h.Type, h.SerialNumber, l.Location_Name, lh.InstallDate FROM dbo.tblHeader h INNER JOIN ( SELECT l.HeaderID, LocID = MAX(l.LocID) FROM lstLocations l WHERE l.InstallDate = ( SELECT MAX(l2.InstallDate) FROM lstLocations l2 WHERE l2.HeaderID = l.HeaderID) GROUP BY l.Header ID) lastInstall ON lastInstall.HeaderID = h.HeaderID INNER JOIN tblLocationHistory lh ON lh.HeaderID = lastInstall.HeaderID AND lh.LocID = lastInstall.LocID INNER JOIN lstLocations l ON l.LocID = lh.LocID Chris Chris Lim wrote:
> SELECT h.HeaderID, Correction:SELECT h.HeaderID, h.Type, h.SerialNumber, l.Location_Name, lh.InstallDate FROM dbo.tblHeader h INNER JOIN ( SELECT l.HeaderID, LocID = MAX(l.LocID) FROM dbo.tblLocationHistory l WHERE l.InstallDate = ( SELECT MAX(l2.InstallDate) FROM dbo.tblLocationHistory l2 WHERE l2.HeaderID = l.HeaderID) GROUP BY l.Header ID) lastInstall ON lastInstall.HeaderID = h.HeaderID INNER JOIN dbo.tblLocationHistory lh ON lh.HeaderID = lastInstall.HeaderID AND lh.LocID = lastInstall.LocID INNER JOIN dbo.lstLocations l ON l.LocID = lh.LocID Chris,
Many thanks for your help - from my limited understanding of SQL, I think I see what you're doing. I was just wondering, would it be possible (even recommended!) to abstract the "derived" select into another view? Anyway, thank you once again for your kind help. Cheers Chris Show quote "Chris Lim" <blackca***@hotmail.com> wrote in message news:1153996695.893580.87360@75g2000cwc.googlegroups.com... > Chris Lim wrote: >> SELECT h.HeaderID, > > Correction: > > SELECT h.HeaderID, > h.Type, > h.SerialNumber, > l.Location_Name, > lh.InstallDate > FROM dbo.tblHeader h > INNER JOIN ( SELECT l.HeaderID, > LocID = MAX(l.LocID) > FROM dbo.tblLocationHistory l > WHERE l.InstallDate = ( SELECT MAX(l2.InstallDate) > FROM > dbo.tblLocationHistory l2 > WHERE l2.HeaderID = > l.HeaderID) > GROUP BY > l.Header ID) lastInstall > ON lastInstall.HeaderID = h.HeaderID > INNER JOIN dbo.tblLocationHistory lh > ON lh.HeaderID = lastInstall.HeaderID > AND lh.LocID = lastInstall.LocID > INNER JOIN dbo.lstLocations l > ON l.LocID = lh.LocID > Chris Strug wrote:
> Many thanks for your help - from my limited understanding of SQL, I think I Putting that derived table into a view could be useful. It really> see what you're doing. > > I was just wondering, would it be possible (even recommended!) to abstract > the "derived" select into another view? depends if you'd use that bit of information elsewhere in your system. If this is the only place where you're likely to use it, then creating a view would be an unnecessary overhead IMO. Chris Chris,
If you happen to be using SQL Server 2005, queries like this are very easy to write. The most recent location is the one that would be ranked #1 in decending InstallDate order for a particular HeaderID, so you can do this: SELECT HeaderID, Type, SerialNumber, Location_Name, InstallDate FROM ( SELECT dbo.tblHeader.HeaderID, dbo.tblHeader.Type, dbo.tblHeader.SerialNumber, dbo.lstLocations.Location_Name, dbo.tblLocationHistory.InstallDate, ROW_NUMBER() OVER ( PARTITION BY dbo.tblHeader.HeaderID ORDER BY dbo.tblLocationHistory.InstallDate DESC ) AS rk FROM dbo.lstLocations INNER JOIN dbo.tblLocationHistory ON dbo.lstLocations.Location_ID = dbo.tblLocationHistory.LocationID INNER JOIN dbo.tblHeader ON dbo.tblLocationHistory.HeaderID = dbo.tblHeader.HeaderID ) AS RankedResults WHERE rk = 1 Steve Kass Drew University www.stevekass.com Chris Strug wrote: Show quote >Hi, > >My question concerns three tables: > >- tblHeader - containing general information such as product ID, description >etc > HeaderID > HeaderDescription > HeaderSerialNumber > >lstLocations - lookup table containing location IDs > LocID > LocDescription > >tblLocationHistory - a transaction table containing the location history of >each header and the date of their installation. > HistoryID > HeaderID > LocID > InstallDate > >What I'm after is a select which shows each line in tblHeader and its most >recent location. However my first attempt (as you would expect from SQL) >returns every product and every location they have been in - e.g. Product X >has 7 locations - thus appears 7 times in the results from the below query. >Unfortunately I'm after the most recent location of each Header *only*. > >I feel like this is something I should be able to do with my eyes closed but >I can't seem to crack it. Please could someone point me in the right >direction! Should I be looking at using UDF's for example? > >Cheers > >Chris. > >******************* > >SELECT > dbo.tblHeader.HeaderID, > dbo.tblHeader.Type, > dbo.tblHeader.SerialNumber, > dbo.lstLocations.Location_Name, > dbo.tblLocationHistory.InstallDate >FROM dbo.lstLocations INNER JOIN > dbo.tblLocationHistory ON dbo.lstLocations.Location_ID >= dbo.tblLocationHistory.LocationID INNER JOIN > dbo.tblHeader ON dbo.tblLocationHistory.HeaderID = >dbo.tblHeader.HeaderID > > > > > Steve Kass wrote:
> Chris, Hmm, very useful to know! Currently still using 2000, but will keep in> > If you happen to be using SQL Server 2005, queries like this > are very easy to write. The most recent location is the one > that would be ranked #1 in decending InstallDate order for > a particular HeaderID, so you can do this: mind for 2005. Steve Kass wrote:
> FROM ( I'm trying to understand how this works. Do you not even need a FROM> SELECT > dbo.tblHeader.HeaderID, > dbo.tblHeader.Type, > dbo.tblHeader.SerialNumber, > dbo.lstLocations.Location_Name, > dbo.tblLocationHistory.InstallDate, > ROW_NUMBER() OVER ( > PARTITION BY dbo.tblHeader.HeaderID > ORDER BY dbo.tblLocationHistory.InstallDate DESC > ) AS rk clause? You do need a from clause, but you snipped it out here. The query I
posted was SELECT HeaderID, Type, SerialNumber, Location_Name, InstallDate FROM ( SELECT dbo.tblHeader.HeaderID, dbo.tblHeader.Type, dbo.tblHeader.SerialNumber, dbo.lstLocations.Location_Name, dbo.tblLocationHistory.InstallDate, ROW_NUMBER() OVER ( PARTITION BY dbo.tblHeader.HeaderID ORDER BY dbo.tblLocationHistory.InstallDate DESC ) AS rk FROM dbo.lstLocations INNER JOIN dbo.tblLocationHistory ON dbo.lstLocations.Location_ID = dbo.tblLocationHistory.LocationID INNER JOIN dbo.tblHeader ON dbo.tblLocationHistory.HeaderID = dbo.tblHeader.HeaderID ) AS RankedResults WHERE rk = 1 The outermost SELECT selects from a derived table, and the derived table selects from the tables of the original query. What would be more readable, but which the syntax of the language doesn't allow, is SELECT dbo.tblHeader.HeaderID, dbo.tblHeader.Type, dbo.tblHeader.SerialNumber, dbo.lstLocations.Location_Name, dbo.tblLocationHistory.InstallDate FROM dbo.lstLocations INNER JOIN dbo.tblLocationHistory ON dbo.lstLocations.Location_ID = dbo.tblLocationHistory.LocationID INNER JOIN dbo.tblHeader ON dbo.tblLocationHistory.HeaderID = dbo.tblHeader.HeaderID WHERE ROW_NUMBER() OVER ( PARTITION BY dbo.tblHeader.HeaderID ORDER BY dbo.tblLocationHistory.InstallDate DESC ) = 1 If allowed, this would be the same thing, but the ROW_NUMBER() calculation can only appear in SELECT and ORDER BY clauses. There is a sneaky version that you can write with less typing: SELECT TOP 1 WITH TIES dbo.tblHeader.HeaderID, dbo.tblHeader.Type, dbo.tblHeader.SerialNumber, dbo.lstLocations.Location_Name, dbo.tblLocationHistory.InstallDate FROM dbo.lstLocations INNER JOIN dbo.tblLocationHistory ON dbo.lstLocations.Location_ID = dbo.tblLocationHistory.LocationID INNER JOIN dbo.tblHeader ON dbo.tblLocationHistory.HeaderID = dbo.tblHeader.HeaderID ORDER BY ROW_NUMBER() OVER ( PARTITION BY dbo.tblHeader.HeaderID ORDER BY dbo.tblLocationHistory.InstallDate DESC ) This is less readable, I think. SK Chris Lim wrote: Show quote >Steve Kass wrote: > > >>FROM ( >> SELECT >> dbo.tblHeader.HeaderID, >> dbo.tblHeader.Type, >> dbo.tblHeader.SerialNumber, >> dbo.lstLocations.Location_Name, >> dbo.tblLocationHistory.InstallDate, >> ROW_NUMBER() OVER ( >> PARTITION BY dbo.tblHeader.HeaderID >> ORDER BY dbo.tblLocationHistory.InstallDate DESC >> ) AS rk >> >> > >I'm trying to understand how this works. Do you not even need a FROM >clause? > > > Steve Kass wrote:
> The outermost SELECT selects from a derived table, and the derived Oh I see. I mismatched the brackets when I was reading which is why I> table selects from the tables of the original query. got myself confused! So the bit that SQL Server 2005 provides is: > ROW_NUMBER() OVER ( I can see myself using that a bit. Saves having to do sub-queries or> PARTITION BY dbo.tblHeader.HeaderID > ORDER BY dbo.tblLocationHistory.InstallDate DESC > ) AS rk additional dervied tables to do the ranking! Chris Chris,
In cases like this, you might go for a little denormalization and add a column NumberOfPreviousVersions. Inserting a history row, setnit to 0 and up it for all the previous versions. That done, you can quickly search on NumberOfPreviousVersions = 0, which is usually much faster and simpler than subqueries or OLAP functions. Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in your schema are. Sample data is also a good idea, along with clear specifications. It is very hard to debug code when you do not let us see it. You also need to learn ISO-11179 rules so you will stop using those silly redundant affixes that describe physical implementations in an RDBMS. It makes you look like a BASIC programmer. Also, tables are sets which usually have more than one element, so you need to use collecitve or plural names. The proper way to model temporal data is with duration, not chronons (google it). Try this: CREATE TABLE LocationHistory (header_id INTEGER NOT NULL REFERENCES Headers (header_id), loc_id INTEGER NOT NULL REFERENCES Locations (loc_id), start_date DATETIME NOT NULL. end_date DATETIME, -- null is current location PRIMARY KEY (header_id, loc_id, start_date), etc.); >> What I'm after is a select which shows each line in Headers and its most recent location. << Look for "end_date IS NULL" in your query. Proper DDL means mucheasier coding. |
|||||||||||||||||||||||