Home All Groups Group Topic Archive Search About
Author
1 Sep 2006 7:04 PM
stavros
I'm trying to report on the runtimes of a couple new jobs through
several iterations, testing various code and data changes in each
iteration.  The sysjobs and sysjobhistory tables have all the info I
need for this, except I'd like to add a column showing the run
iteration for each job.  I'm surprised this isn't already in the
sysjobhistory table; something that ties the individual steps together
into one instance of a job run.

For example, this query returns the data ordered the way I want (faked
data here):

select j.name, jh.step_id, jh.step_name, run_duration
from sysjobhistory jh join sysjobs j
    on j.job_id = jh.job_id
order by j.name, jh.instance_id

name     step_id   step_name        run_duration
Job1     1         Job 1 Step 1     30
Job1     2         Job 1 Step 2     30
Job1     3         Job 1 Step 3     30
Job1     0         (Job outcome)    90
Job1     1         Job 1 Step 1     20
Job1     2         Job 1 Step 2     20
Job1     3         Job 1 Step 3     20
Job1     0         (Job outcome)    60
Job2     1         Job 2 Step 1     10
Job2     2         Job 2 Step 2     10
Job2     3         Job 2 Step 3     10
Job2     4         Job 2 Step 4     10
Job2     0         (Job outcome)    40
Job2     1         Job 2 Step 1     15
Job2     2         Job 2 Step 2     15
Job2     3         Job 2 Step 3     15
Job2     4         Job 2 Step 4     15
Job2     0         (Job outcome)    60

But I'd like to add an instance/iteration column:

name   iteration    step_id   step_name        run_duration
Job1   1            1         Job 1 Step 1     30
Job1   1            2         Job 1 Step 2     30
Job1   1            3         Job 1 Step 3     30
Job1   1            0         (Job outcome)    90
Job1   2            1         Job 1 Step 1     20
Job1   2            2         Job 1 Step 2     20
Job1   2            3         Job 1 Step 3     20
Job1   2            0         (Job outcome)    60
Job2   1            1         Job 2 Step 1     10
Job2   1            2         Job 2 Step 2     10
Job2   1            3         Job 2 Step 3     10
Job2   1            4         Job 2 Step 4     10
Job2   1            0         (Job outcome)    40
Job2   2            1         Job 2 Step 1     15
Job2   2            2         Job 2 Step 2     15
Job2   2            3         Job 2 Step 3     15
Job2   2            4         Job 2 Step 4     15
Job2   2            0         (Job outcome)    60

I'd be happy with a unique id for each iteration; it doesn't have to
reset for each job_name the way I've shown.  This is on 2005, and I
thought the ranking functions might help (DENSE_RANK), but I can't
figure out how to make it work.  Any thoughts?

Author
2 Sep 2006 6:40 AM
Anith Sen
Have you looked into the possibility of using the ROLLUP operation with
GROUP BY?

--
Anith
Author
2 Sep 2006 11:21 AM
Erland Sommarskog
stavros (stav***@mailinator.com) writes:
Show quote
> I'm trying to report on the runtimes of a couple new jobs through
> several iterations, testing various code and data changes in each
> iteration.  The sysjobs and sysjobhistory tables have all the info I
> need for this, except I'd like to add a column showing the run
> iteration for each job.  I'm surprised this isn't already in the
> sysjobhistory table; something that ties the individual steps together
> into one instance of a job run.
>
> For example, this query returns the data ordered the way I want (faked
> data here):
>
> select j.name, jh.step_id, jh.step_name, run_duration
> from sysjobhistory jh join sysjobs j
>      on j.job_id = jh.job_id
> order by j.name, jh.instance_id
>...
> I'd be happy with a unique id for each iteration; it doesn't have to
> reset for each job_name the way I've shown.  This is on 2005, and I
> thought the ranking functions might help (DENSE_RANK), but I can't
> figure out how to make it work.  Any thoughts?

The problem is that the sysjobhistory does not seemed to be design to
cope with this sort of queries. A proper data model would probably have
add sysjobhistory and sysjobstephistory. Trying to get the data out
from this demorlalised design is not that easy. (But I should add that
I have not worked much with sysjobhistory.)

Here is a query that I came up with:

   WITH jobhist AS (
      SELECT job_id, step_id, step_name, run_duration,
             run_date, run_time, instance_id,
             sortkey = row_number()
             OVER (ORDER BY run_date, run_time,
                            CASE WHEN step_id = 0 THEN -10000
                                 ELSE instance_id
                            END)
       FROM  sysjobhistory      
   )        
   select j.name, jh.step_id, jh.step_name, jh.run_duration, jh.run_date,
          jh.run_time, jh.instance_id, jh.sortkey, jh2.sortkey,
          iteration = dense_rank() over (ORDER BY j.name, jh2.sortkey)
   from jobhist jh
   join jobhist jh2 ON jh.job_id = jh2.job_id
                   AND  jh2.sortkey = (SELECT  MAX(jh3.sortkey)
                                       FROM    jobhist jh3
                                       WHERE   jh3.job_id = jh2.job_id
                                         AND   jh3.step_id = 0
                                         AND   jh.sortkey >= jh3.sortkey)
   join sysjobs j on j.job_id = jh.job_id
   order by j.name, jh.sortkey

As you see it's quite messy. It could be simplified if you make the
assumption that instance id is growing monotonically, and if you
use step 1 as your baseline and not step 0. (As step 0 appears to
be written last to the table.)

You could still get funky results if the same job runs twice in the
same second, which could happen for jobs triggered by WMI events I
guess.

Once you have the grouping in place, using dense_rank() is trivial.


--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
5 Sep 2006 10:37 PM
stavros
Thanks very much.  You're right, it's a little messy, but a good
introduction to common table expressions.  And it works!  Cheers...

Erland Sommarskog wrote:
Show quote
> stavros (stav***@mailinator.com) writes:
> > I'm trying to report on the runtimes of a couple new jobs through
> > several iterations, testing various code and data changes in each
> > iteration.  The sysjobs and sysjobhistory tables have all the info I
> > need for this, except I'd like to add a column showing the run
> > iteration for each job.  I'm surprised this isn't already in the
> > sysjobhistory table; something that ties the individual steps together
> > into one instance of a job run.
> >
> > For example, this query returns the data ordered the way I want (faked
> > data here):
> >
> > select j.name, jh.step_id, jh.step_name, run_duration
> > from sysjobhistory jh join sysjobs j
> >      on j.job_id = jh.job_id
> > order by j.name, jh.instance_id
> >...
> > I'd be happy with a unique id for each iteration; it doesn't have to
> > reset for each job_name the way I've shown.  This is on 2005, and I
> > thought the ranking functions might help (DENSE_RANK), but I can't
> > figure out how to make it work.  Any thoughts?
>
> The problem is that the sysjobhistory does not seemed to be design to
> cope with this sort of queries. A proper data model would probably have
> add sysjobhistory and sysjobstephistory. Trying to get the data out
> from this demorlalised design is not that easy. (But I should add that
> I have not worked much with sysjobhistory.)
>
> Here is a query that I came up with:
>
>    WITH jobhist AS (
>       SELECT job_id, step_id, step_name, run_duration,
>              run_date, run_time, instance_id,
>              sortkey = row_number()
>              OVER (ORDER BY run_date, run_time,
>                             CASE WHEN step_id = 0 THEN -10000
>                                  ELSE instance_id
>                             END)
>        FROM  sysjobhistory
>    )
>    select j.name, jh.step_id, jh.step_name, jh.run_duration, jh.run_date,
>           jh.run_time, jh.instance_id, jh.sortkey, jh2.sortkey,
>           iteration = dense_rank() over (ORDER BY j.name, jh2.sortkey)
>    from jobhist jh
>    join jobhist jh2 ON jh.job_id = jh2.job_id
>                    AND  jh2.sortkey = (SELECT  MAX(jh3.sortkey)
>                                        FROM    jobhist jh3
>                                        WHERE   jh3.job_id = jh2.job_id
>                                          AND   jh3.step_id = 0
>                                          AND   jh.sortkey >= jh3.sortkey)
>    join sysjobs j on j.job_id = jh.job_id
>    order by j.name, jh.sortkey
>
> As you see it's quite messy. It could be simplified if you make the
> assumption that instance id is growing monotonically, and if you
> use step 1 as your baseline and not step 0. (As step 0 appears to
> be written last to the table.)
>
> You could still get funky results if the same job runs twice in the
> same second, which could happen for jobs triggered by WMI events I
> guess.
>
> Once you have the grouping in place, using dense_rank() is trivial.
>
>
> --
> Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se
>
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

AddThis Social Bookmark Button