|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Compute sum of count(*) with group byGiven the following table and test data: CREATE TABLE test ( recordId numeric(18, 0) NOT NULL, spId int NOT NULL, startTime datetime NULL, endTime datetime NULL ) INSERT INTO test VALUES (1,1,'2005-01-01 12:00','2005-01-01 14:33') INSERT INTO test VALUES (2,2,'2005-01-01 12:26','2005-01-01 14:00') INSERT INTO test VALUES (3,1,'2005-01-01 14:00','2005-01-01 14:33') INSERT INTO test VALUES (4,2,'2005-01-01 14:00','2005-01-01 15:15') INSERT INTO test VALUES (5,1,'2005-01-01 15:15','2005-01-01 15:20') INSERT INTO test VALUES (6,2,'2005-01-01 15:15','2005-01-01 16:00') INSERT INTO test VALUES (7,3,'2005-01-01 12:00','2005-01-01 14:30') the following query lists only the spid's with non-unique spid's and their respective counts: SELECT spid, count(*) AS 'Count' FROM test GROUP BY spid HAVING count(*) > 1 ORDER BY spid I'm new to SQL and am having difficulty with a couple of things: 1. Modify the above query to compute the grand total for the Count, or indeed a separate SQL statement to return just the grand total (= 6 in this example). 2. This is the big challenge :). Taking the grouping returned by the above query, write a query/stored procedure which looks for records with identical spId's and the endtime of one spid equal to the startTime of another. With the above test data, recordIds 3, 5, and 2, 4, 6 match this criteria. Thanks very much for any help with this. 1. Use a derived table construct:
SELECT SUM( total ) FROM ( SELECT spid, COUNT(*) FROM tbl GROUP BY spid HAVING COUNT(*) > 1 ) D ( spid, total ) ; 2. Not sure if your requirements are clear since . Something like: SELECT recordId, spId, ... ( SELECT MIN( t2.startTime ) FROM tbl t2 WHERE t2.spId = t1.spId AND t2.startTime >= t1.endtime ) FROM tbl t1 ORDER BY t1.spid, startTime ; If, not please post the sample resultset for the dataset you posted. -- Anith Anith Sen wrote:
> 1. Use a derived table construct: Thanks. What does the 'D' mean above?> > SELECT SUM( total ) > FROM ( SELECT spid, COUNT(*) > FROM tbl > GROUP BY spid > HAVING COUNT(*) > 1 ) D ( spid, total ) ; > > 2. Not sure if your requirements are clear since . Something like: CREATE TABLE test (> > SELECT recordId, spId, ... > ( SELECT MIN( t2.startTime ) > FROM tbl t2 WHERE t2.spId = t1.spId > AND t2.startTime >= t1.endtime ) > FROM tbl t1 > ORDER BY t1.spid, startTime ; > > If, not please post the sample resultset for the dataset you posted. recordId numeric(18, 0) NOT NULL, spId int NOT NULL, startTime datetime NULL, endTime datetime NULL ) INSERT INTO test VALUES (1,1,'2005-01-01 12:00','2005-01-01 14:33') INSERT INTO test VALUES (3,1,'2005-01-01 14:00','2005-01-01 14:33') INSERT INTO test VALUES (5,1,'2005-01-01 14:33','2005-01-01 15:20') INSERT INTO test VALUES (2,2,'2005-01-01 12:26','2005-01-01 14:00') INSERT INTO test VALUES (4,2,'2005-01-01 14:00','2005-01-01 15:15') INSERT INTO test VALUES (6,2,'2005-01-01 15:15','2005-01-01 16:00') INSERT INTO test VALUES (7,3,'2005-01-01 12:00','2005-01-01 14:30') (Sorry, no wonder it wasn't clear as there was mistake in my original test data. I've corrected the data above and put records with the same spId together to make the grouping more obvious.) So, from the above test data the expected results contain 2 sets of matching data: 1. recordIds 3 and 5 because they have the same spId (1) and the endTime of recordId 3 is the same as the startTime of recordId 5. 2. recordIds 2, 4 and 6 because they have the same spId (2) and the endTime of recordId 2 is the same as the startTime of recordId 4; the endTime of 4 is the same as the startTime of 6. I hope that makes sense now. cheers, >> What does the 'D' mean above? D in the query stands for an alias for the derived table ( some folks explicitly use AS keyword before the alias as well. ) >> from the above test data the expected results contain 2 sets of matching Can you post the sample resultset here ( as you'd want to see on the QA >> data results pane ). -- Anith Anith Sen wrote:
> Can you post the sample resultset here ( as you'd want to see on the QA First recordId, Second recordId, spId, endTime, startTime> results pane ). 3 5 1 2005-01-01 14:33 2005-01-01 14:33 2 4 2 2005-01-01 14:00 2005-01-01 14:00 4 6 2 2005-01-01 15:15 2005-01-01 15:15 The resultset shows pairs of 'matching' records, which is slightly different (and better) to how I first envisioned it. Thanks. This is one way of getting it:
SELECT MAX( t1.recordid ), t2.recordid, t1.spid, t1.endtime FROM test t1 INNER JOIN test t2 ON t1.spId = t2.spId AND t1.endTime = t2.starttime GROUP BY t1.spid, t2.recordid, t1.endtime ; -- Anith On 11 Nov 2005 09:30:25 -0800, "J Williams"
<johnwilliams_esqu***@hotmail.com> wrote: >SELECT spid, count(*) AS 'Count' If that does the job, great, otherwise you can always store the>FROM test >GROUP BY spid WITH ROLLUP >HAVING count(*) > 1 ORDER BY spid results of the first query in a table an do further summations against it. J. >SELECT spid, count(*) AS 'Count' Thanks, but that doesn't give the expected result. The basic SELECT:>FROM test >GROUP BY spid WITH ROLLUP >HAVING count(*) > 1 ORDER BY spid SELECT spid, count(*) AS 'Count' FROM test GROUP BY spid HAVING count(*) > 1 ORDER BY spid returns: spid Count 1 3 2 3 The grand total of Count in the above resultset is 6 and the SQL posted earlier by Anith Sen gives this result: SELECT SUM( total ) FROM ( SELECT spid, COUNT(*) FROM tbl GROUP BY spid HAVING COUNT(*) > 1 ) D ( spid, total ) |
|||||||||||||||||||||||