Home All Groups Group Topic Archive Search About

SQL query: is object in a static list showing up in daily results? if so, how many times?

Author
28 Apr 2006 10:33 PM
Randall Arnold
The title of this post is a little cryptic but hopefully I can explain well
enough.  I asked a similar question a while back, didn't get usable answers
and then thought I'd solved it on my own.  However, my current approach to
the query is running way too slowly.

In this case I have a static table of assembly lines, MPWS_L_AUDIT_LINE.
Each line has a Line_ID (immaterial to this need) and a LineName (A0, B0,
C0, etc).

I also have a daily defect query, MPWS_ProductAuditFails_view.  It's a
series of records detailing failures that can occur on any given line.

What I need as an end result is a list of all lines for every day an audit
was performed.  Each row should look like the following:

  lineview2 LineName PeriodYear Period PeriodWeek DateInput LineCount
      AA 2006 1 1 1/3/2006 1
      AB 2006 1 1 1/3/2006 1
      AC 2006 1 1 1/3/2006 1
      AD 2006 1 1 1/3/2006 0
      AE 2006 1 1 1/3/2006 1
      AF 2006 1 1 1/3/2006 1
      B0 2006 1 1 1/3/2006 1
      D0 2006 1 1 1/3/2006 1
      G0 2006 1 1 1/3/2006 1
      HL 2006 1 1 1/3/2006 0
      I0 2006 1 1 1/3/2006 1
      O0 2006 1 1 1/3/2006 1
      P0 2006 1 1 1/3/2006 0

(etc)

"LineCount" would be the number of defects found per line per day.  Every
line should be represented (I can get that to happen using a Cross Join
only).  Here is what my current query looks like:

SELECT DISTINCT
                      TOP 100 PERCENT dbo.MPWS_L_AUDIT_LINE.Descr AS
LineName, dev.MPWS_ProductAuditFails_view.PeriodYear,
                      dev.MPWS_ProductAuditFails_view.Period,
dev.MPWS_ProductAuditFails_view.PeriodWeek,
dev.MPWS_ProductAuditFails_view.DateInput
FROM         dbo.MPWS_L_AUDIT_LINE CROSS JOIN
                      dev.MPWS_ProductAuditFails_view
GROUP BY dev.MPWS_ProductAuditFails_view.PeriodYear,
dev.MPWS_ProductAuditFails_view.Period,
dev.MPWS_ProductAuditFails_view.PeriodWeek,
                      dev.MPWS_ProductAuditFails_view.DateInput,
dbo.MPWS_L_AUDIT_LINE.Descr
ORDER BY dev.MPWS_ProductAuditFails_view.DateInput

This gets me every column but LineCount.  What I have tried for LineCount is
to use a subquery to count in MPWS_ProductAuditFails_view how many defects
showed up for a line on each audited day.  Problem is, I can't seem to get
the subquery right.  And even the subqueries that don't produce proper
results take wayyyyy too long to run.

Sooo... any ideas?  I'm just about out.

Randall Arnold

Author
28 Apr 2006 11:20 PM
Roy Harvey
As everyone always says here, "You need to include your DDL.
http://www.aspfaq.com/etiquette.asp?id=5006"

More information is needed.  In particular, more information about the
result set from the view MPWS_ProductAuditFails_view.  Since you are
treating it like a table, we need to know the same things we need to
know for a table.  What are the columns?  What is the key?  Yes, I
know the views don't have defined keys, but there should be some
column or set of columns that are unique.

Next, you talk about a subquery that is too slow, but you don't show
the subquery.  How can anyone suggest improvements to something
unknown?

Without that information my comments are few.  One is that, since
MPWS_ProductAuditFails_view is a view, there is no reason for us to
assume the problem is with the query you are asking about; we have no
reason to rule out the view as being the problem.  Another is that I
see no benefit to the TOP 100 PERCENT.  And finally, if the DISTINCT
is really necessary, it would seem that the columns used from
MPWS_ProductAuditFails_view do not constitute a key.  In that case it
is probably worth making those columns DISTINCT before the cross join,
possibly by using a view on MPWS_ProductAuditFails_view, or
alternately using a derived table.

SELECT DISTINCT
       dev.MPWS_ProductAuditFails_view.PeriodYear,
       dev.MPWS_ProductAuditFails_view.Period,
       dev.MPWS_ProductAuditFails_view.PeriodWeek,
       dev.MPWS_ProductAuditFails_view.DateInput
  FROM dev.MPWS_ProductAuditFails_view

Roy Harvey
Beacon Falls, CT


On Fri, 28 Apr 2006 22:33:37 GMT, "Randall Arnold"
<randall.nospam.arnold@nokia.com.> wrote:

Show quote
>The title of this post is a little cryptic but hopefully I can explain well
>enough.  I asked a similar question a while back, didn't get usable answers
>and then thought I'd solved it on my own.  However, my current approach to
>the query is running way too slowly.
>
>In this case I have a static table of assembly lines, MPWS_L_AUDIT_LINE.
>Each line has a Line_ID (immaterial to this need) and a LineName (A0, B0,
>C0, etc).
>
>I also have a daily defect query, MPWS_ProductAuditFails_view.  It's a
>series of records detailing failures that can occur on any given line.
>
>What I need as an end result is a list of all lines for every day an audit
>was performed.  Each row should look like the following:
>
>  lineview2 LineName PeriodYear Period PeriodWeek DateInput LineCount
>      AA 2006 1 1 1/3/2006 1
>      AB 2006 1 1 1/3/2006 1
>      AC 2006 1 1 1/3/2006 1
>      AD 2006 1 1 1/3/2006 0
>      AE 2006 1 1 1/3/2006 1
>      AF 2006 1 1 1/3/2006 1
>      B0 2006 1 1 1/3/2006 1
>      D0 2006 1 1 1/3/2006 1
>      G0 2006 1 1 1/3/2006 1
>      HL 2006 1 1 1/3/2006 0
>      I0 2006 1 1 1/3/2006 1
>      O0 2006 1 1 1/3/2006 1
>      P0 2006 1 1 1/3/2006 0
>
>(etc)
>
>"LineCount" would be the number of defects found per line per day.  Every
>line should be represented (I can get that to happen using a Cross Join
>only).  Here is what my current query looks like:
>
>SELECT DISTINCT
>                      TOP 100 PERCENT dbo.MPWS_L_AUDIT_LINE.Descr AS
>LineName, dev.MPWS_ProductAuditFails_view.PeriodYear,
>                      dev.MPWS_ProductAuditFails_view.Period,
>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>dev.MPWS_ProductAuditFails_view.DateInput
>FROM         dbo.MPWS_L_AUDIT_LINE CROSS JOIN
>                      dev.MPWS_ProductAuditFails_view
>GROUP BY dev.MPWS_ProductAuditFails_view.PeriodYear,
>dev.MPWS_ProductAuditFails_view.Period,
>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>                      dev.MPWS_ProductAuditFails_view.DateInput,
>dbo.MPWS_L_AUDIT_LINE.Descr
>ORDER BY dev.MPWS_ProductAuditFails_view.DateInput
>
>This gets me every column but LineCount.  What I have tried for LineCount is
>to use a subquery to count in MPWS_ProductAuditFails_view how many defects
>showed up for a line on each audited day.  Problem is, I can't seem to get
>the subquery right.  And even the subqueries that don't produce proper
>results take wayyyyy too long to run.
>
>Sooo... any ideas?  I'm just about out.
>
>Randall Arnold
>
Author
29 Apr 2006 4:54 AM
Randall Arnold
*sigh*

If I had thought more info would have been needed I honestly would have
provided it.  Really, the things you mention as missing are moot.  The key
isn't included in the result set (it's an Indentity).  There will likely be
no unique values in the result set at all.  And I didn't include the
subquery mentioned for the very reason I gave: it didn't work properly (AND
was way too slow).

The columns I showed in the query are the only ones I care about.  All
others in MPWS_ProductAuditFails_view are utterly moot for the purposes I'm
describing,  This is a highly specific need and the view itself is very
broad.

There is nothing at all wrong with the view's construction or performance,
which would be the safe assumption on the reader's part since I was very
specific in the issue at hand.  I have used this view for some time with
numerous other related queries with 100% success and zero problems.  And
what I'm asking for isn't the result of a problem with anything per se-- I'm
looking for a technique.

The Top 100 PERCENT is a SQL Server requirement if any sort of ORDER BY is
used, as I am using.  The DISTINCT may be moot; I was testing different
things to try to get this to work.  It seems to neither hurt nor help.

I appreciate your reply, but I'm really not necessarily looking for someone
to solve my specific need, but rather as I said, for a general technique I
can apply to all such cases.  This isn't the only query of this sort I'm
going to need to create.  Bottom line, not every line will have defects
every day... yet they will still need to be listed with 0 defects.  All
others need the number of defects shown.  The only way I can see to force
the appearance of every line per day is to cross join on the Line table.
But after that I'm coming up empty on showing the defect quantities int he
same rows.

Randall Arnold

Show quote
"Roy Harvey" <roy_har***@snet.net> wrote in message
news:6385521dmfthofcbjpb9qv1pu4gj4474oi@4ax.com...
> As everyone always says here, "You need to include your DDL.
> http://www.aspfaq.com/etiquette.asp?id=5006"
>
> More information is needed.  In particular, more information about the
> result set from the view MPWS_ProductAuditFails_view.  Since you are
> treating it like a table, we need to know the same things we need to
> know for a table.  What are the columns?  What is the key?  Yes, I
> know the views don't have defined keys, but there should be some
> column or set of columns that are unique.
>
> Next, you talk about a subquery that is too slow, but you don't show
> the subquery.  How can anyone suggest improvements to something
> unknown?
>
> Without that information my comments are few.  One is that, since
> MPWS_ProductAuditFails_view is a view, there is no reason for us to
> assume the problem is with the query you are asking about; we have no
> reason to rule out the view as being the problem.  Another is that I
> see no benefit to the TOP 100 PERCENT.  And finally, if the DISTINCT
> is really necessary, it would seem that the columns used from
> MPWS_ProductAuditFails_view do not constitute a key.  In that case it
> is probably worth making those columns DISTINCT before the cross join,
> possibly by using a view on MPWS_ProductAuditFails_view, or
> alternately using a derived table.
>
> SELECT DISTINCT
>       dev.MPWS_ProductAuditFails_view.PeriodYear,
>       dev.MPWS_ProductAuditFails_view.Period,
>       dev.MPWS_ProductAuditFails_view.PeriodWeek,
>       dev.MPWS_ProductAuditFails_view.DateInput
>  FROM dev.MPWS_ProductAuditFails_view
>
> Roy Harvey
> Beacon Falls, CT
>
>
> On Fri, 28 Apr 2006 22:33:37 GMT, "Randall Arnold"
> <randall.nospam.arnold@nokia.com.> wrote:
>
>>The title of this post is a little cryptic but hopefully I can explain
>>well
>>enough.  I asked a similar question a while back, didn't get usable
>>answers
>>and then thought I'd solved it on my own.  However, my current approach to
>>the query is running way too slowly.
>>
>>In this case I have a static table of assembly lines, MPWS_L_AUDIT_LINE.
>>Each line has a Line_ID (immaterial to this need) and a LineName (A0, B0,
>>C0, etc).
>>
>>I also have a daily defect query, MPWS_ProductAuditFails_view.  It's a
>>series of records detailing failures that can occur on any given line.
>>
>>What I need as an end result is a list of all lines for every day an audit
>>was performed.  Each row should look like the following:
>>
>>  lineview2 LineName PeriodYear Period PeriodWeek DateInput LineCount
>>      AA 2006 1 1 1/3/2006 1
>>      AB 2006 1 1 1/3/2006 1
>>      AC 2006 1 1 1/3/2006 1
>>      AD 2006 1 1 1/3/2006 0
>>      AE 2006 1 1 1/3/2006 1
>>      AF 2006 1 1 1/3/2006 1
>>      B0 2006 1 1 1/3/2006 1
>>      D0 2006 1 1 1/3/2006 1
>>      G0 2006 1 1 1/3/2006 1
>>      HL 2006 1 1 1/3/2006 0
>>      I0 2006 1 1 1/3/2006 1
>>      O0 2006 1 1 1/3/2006 1
>>      P0 2006 1 1 1/3/2006 0
>>
>>(etc)
>>
>>"LineCount" would be the number of defects found per line per day.  Every
>>line should be represented (I can get that to happen using a Cross Join
>>only).  Here is what my current query looks like:
>>
>>SELECT DISTINCT
>>                      TOP 100 PERCENT dbo.MPWS_L_AUDIT_LINE.Descr AS
>>LineName, dev.MPWS_ProductAuditFails_view.PeriodYear,
>>                      dev.MPWS_ProductAuditFails_view.Period,
>>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>>dev.MPWS_ProductAuditFails_view.DateInput
>>FROM         dbo.MPWS_L_AUDIT_LINE CROSS JOIN
>>                      dev.MPWS_ProductAuditFails_view
>>GROUP BY dev.MPWS_ProductAuditFails_view.PeriodYear,
>>dev.MPWS_ProductAuditFails_view.Period,
>>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>>                      dev.MPWS_ProductAuditFails_view.DateInput,
>>dbo.MPWS_L_AUDIT_LINE.Descr
>>ORDER BY dev.MPWS_ProductAuditFails_view.DateInput
>>
>>This gets me every column but LineCount.  What I have tried for LineCount
>>is
>>to use a subquery to count in MPWS_ProductAuditFails_view how many defects
>>showed up for a line on each audited day.  Problem is, I can't seem to get
>>the subquery right.  And even the subqueries that don't produce proper
>>results take wayyyyy too long to run.
>>
>>Sooo... any ideas?  I'm just about out.
>>
>>Randall Arnold
>>
Author
29 Apr 2006 6:59 PM
Roy Harvey
*sigh*

I really do not need more information, as I do not really need to help
you solve your problem.  I am, or was, ready to try to help, and I
spent a good bit of time trying to decipher the information you gave
so that I could try.  While it may be clear to you, or to the rest of
the world, it was not clear to me.  Not being able to help based on
the given information, I asked for what seemed to be missing.  My
apologies.

>The Top 100 PERCENT is a SQL Server requirement if any sort of ORDER BY is
>used, as I am using.

You did not mention it was in a view.  It IS required of ORDER BY is
used in a view.  In SQL Server 2000 it happens that the order of the
rows (generally) is controlled by this ORDER BY when the view is
queried.  This undocumented behavior that is NOT perpetuated in SQL
Server 2005.  If you depend on it, it will break if you ever upgrade.
You might want to keep that in mind.

If it was not in a view, TOP 100 PERCENT does nothing.

>I appreciate your reply, but I'm really not necessarily looking for someone
>to solve my specific need, but rather as I said, for a general technique I
>can apply to all such cases.

OK, here are some general techniques.

Beware of DISTINCT.  Sometimes DISTINCT is an absolutely indispensable
tool.  DISTINCT is also the quick fix for many queries that were
written incorrectly.  Likewise, it is the quick fix for many queries
written against databases that were designed incorrectly.  When trying
to solve a problem, if you see DISTINCT treat it as a red flag and
PROVE that it is required - as it very well may be.

Be very care full of CROSS JOINs.  When the tables involved are
"master" tables such as your MPWS_L_AUDIT_LINE it does not worry me
too much.  When used against something with greater levels of detail,
especially when it is necessary to guess the degree of detail, I get
concerned.  I have seen too many cases where the use of a table with
more detail than required was a reflection of shortcomings in the
database design, to be specific missing "master" tables.  Not always,
of course, but an issue that should be reviewed.

If there is redundancy resulting from a CROSS JOIN, redundancy
requiring a DISTINCT, I much prefer to remove the redundancy BEFORE
the CROSS JOIN.  Usually I do this with a derived table query against
the table causing the redundancy, including only the columns needed
for the CROSS JOIN, with a DISTINCT (or GROUP BY) inside the derived
table query.

Beware of using a view that has everything you need, but also has
things you do not need.  Sometimes views require processing and
joining to derive the columns that are not being referenced.  It is
all too common for performance to suffer from this.  It is a very easy
trap to fall into, as my own past mistakes have taught me.  Especially
when views are written against views that are written against views.

Never loose sight of the key.  If you have a table, it should have a
unique key.  If you have a view, you should know without having to
think about it what the effective key is.  Usually this is simple. For
a view (or query) that uses GROUP BY the key is the GROUP BY columns.
For a view that performs an INNER JOIN, the key is the combination of
the key columns for both tables.  Mostly this is a no-brainer, but the
important concept here is that you can not really think relationally
about the table (or view or result set) if you do not know the key.
Of course it is possible to loose the key, say by not including all
the key columns when JOINING and at the same time not doing a GROUP
BY.  That is a serious problem but easily avoided by not loosing sight
of the key.

A required DISTINCT, when combined with a subquery IN THE SELECT LIST,
is poison.  Don't do it.  Instead, resolve the DISTINCT, then perform
the subquery.  To do this, put the DISTINCT in a derived query, and
the subquery in the SELECT list of an outer query.

The best way to confirm that you understand something is to explain it
to someone so that they understand it too.  The most common advice
when debugging a programming problem is to grab somebody and explain
what is going on, and what is going wrong.  It often doesn't even
matter whether they actually are qualified to help - in fact sometimes
the less they know the better.  Not being able to assume anything
about their knowledge means you have to explain all your assumptions -
which can often be at the root of the problem.  I have lost count of
the times that the very act of explaining the problem was all I needed
to solve it myself.  Likewise I have helped others many times just by
letting them explain things to me until they solved their own
problems.  A very powerful tool!

One corollary to this last point and I am done.  If you can not
explain a problem so that someone else understands it, what reason
have you given them to believe that you understand it yourself?

Roy Harvey
Beacon Falls, CT


On Fri, 28 Apr 2006 23:54:46 -0500, "Randall Arnold"
<fabricator@cynicalnospamsigns.com> wrote:

Show quote
>*sigh*
>
>If I had thought more info would have been needed I honestly would have
>provided it.  Really, the things you mention as missing are moot.  The key
>isn't included in the result set (it's an Indentity).  There will likely be
>no unique values in the result set at all.  And I didn't include the
>subquery mentioned for the very reason I gave: it didn't work properly (AND
>was way too slow).
>
>The columns I showed in the query are the only ones I care about.  All
>others in MPWS_ProductAuditFails_view are utterly moot for the purposes I'm
>describing,  This is a highly specific need and the view itself is very
>broad.
>
>There is nothing at all wrong with the view's construction or performance,
>which would be the safe assumption on the reader's part since I was very
>specific in the issue at hand.  I have used this view for some time with
>numerous other related queries with 100% success and zero problems.  And
>what I'm asking for isn't the result of a problem with anything per se-- I'm
>looking for a technique.
>
>The Top 100 PERCENT is a SQL Server requirement if any sort of ORDER BY is
>used, as I am using.  The DISTINCT may be moot; I was testing different
>things to try to get this to work.  It seems to neither hurt nor help.
>
>I appreciate your reply, but I'm really not necessarily looking for someone
>to solve my specific need, but rather as I said, for a general technique I
>can apply to all such cases.  This isn't the only query of this sort I'm
>going to need to create.  Bottom line, not every line will have defects
>every day... yet they will still need to be listed with 0 defects.  All
>others need the number of defects shown.  The only way I can see to force
>the appearance of every line per day is to cross join on the Line table.
>But after that I'm coming up empty on showing the defect quantities int he
>same rows.
>
>Randall Arnold
>
>"Roy Harvey" <roy_har***@snet.net> wrote in message
>news:6385521dmfthofcbjpb9qv1pu4gj4474oi@4ax.com...
>> As everyone always says here, "You need to include your DDL.
>> http://www.aspfaq.com/etiquette.asp?id=5006"
>>
>> More information is needed.  In particular, more information about the
>> result set from the view MPWS_ProductAuditFails_view.  Since you are
>> treating it like a table, we need to know the same things we need to
>> know for a table.  What are the columns?  What is the key?  Yes, I
>> know the views don't have defined keys, but there should be some
>> column or set of columns that are unique.
>>
>> Next, you talk about a subquery that is too slow, but you don't show
>> the subquery.  How can anyone suggest improvements to something
>> unknown?
>>
>> Without that information my comments are few.  One is that, since
>> MPWS_ProductAuditFails_view is a view, there is no reason for us to
>> assume the problem is with the query you are asking about; we have no
>> reason to rule out the view as being the problem.  Another is that I
>> see no benefit to the TOP 100 PERCENT.  And finally, if the DISTINCT
>> is really necessary, it would seem that the columns used from
>> MPWS_ProductAuditFails_view do not constitute a key.  In that case it
>> is probably worth making those columns DISTINCT before the cross join,
>> possibly by using a view on MPWS_ProductAuditFails_view, or
>> alternately using a derived table.
>>
>> SELECT DISTINCT
>>       dev.MPWS_ProductAuditFails_view.PeriodYear,
>>       dev.MPWS_ProductAuditFails_view.Period,
>>       dev.MPWS_ProductAuditFails_view.PeriodWeek,
>>       dev.MPWS_ProductAuditFails_view.DateInput
>>  FROM dev.MPWS_ProductAuditFails_view
>>
>> Roy Harvey
>> Beacon Falls, CT
>>
>>
>> On Fri, 28 Apr 2006 22:33:37 GMT, "Randall Arnold"
>> <randall.nospam.arnold@nokia.com.> wrote:
>>
>>>The title of this post is a little cryptic but hopefully I can explain
>>>well
>>>enough.  I asked a similar question a while back, didn't get usable
>>>answers
>>>and then thought I'd solved it on my own.  However, my current approach to
>>>the query is running way too slowly.
>>>
>>>In this case I have a static table of assembly lines, MPWS_L_AUDIT_LINE.
>>>Each line has a Line_ID (immaterial to this need) and a LineName (A0, B0,
>>>C0, etc).
>>>
>>>I also have a daily defect query, MPWS_ProductAuditFails_view.  It's a
>>>series of records detailing failures that can occur on any given line.
>>>
>>>What I need as an end result is a list of all lines for every day an audit
>>>was performed.  Each row should look like the following:
>>>
>>>  lineview2 LineName PeriodYear Period PeriodWeek DateInput LineCount
>>>      AA 2006 1 1 1/3/2006 1
>>>      AB 2006 1 1 1/3/2006 1
>>>      AC 2006 1 1 1/3/2006 1
>>>      AD 2006 1 1 1/3/2006 0
>>>      AE 2006 1 1 1/3/2006 1
>>>      AF 2006 1 1 1/3/2006 1
>>>      B0 2006 1 1 1/3/2006 1
>>>      D0 2006 1 1 1/3/2006 1
>>>      G0 2006 1 1 1/3/2006 1
>>>      HL 2006 1 1 1/3/2006 0
>>>      I0 2006 1 1 1/3/2006 1
>>>      O0 2006 1 1 1/3/2006 1
>>>      P0 2006 1 1 1/3/2006 0
>>>
>>>(etc)
>>>
>>>"LineCount" would be the number of defects found per line per day.  Every
>>>line should be represented (I can get that to happen using a Cross Join
>>>only).  Here is what my current query looks like:
>>>
>>>SELECT DISTINCT
>>>                      TOP 100 PERCENT dbo.MPWS_L_AUDIT_LINE.Descr AS
>>>LineName, dev.MPWS_ProductAuditFails_view.PeriodYear,
>>>                      dev.MPWS_ProductAuditFails_view.Period,
>>>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>>>dev.MPWS_ProductAuditFails_view.DateInput
>>>FROM         dbo.MPWS_L_AUDIT_LINE CROSS JOIN
>>>                      dev.MPWS_ProductAuditFails_view
>>>GROUP BY dev.MPWS_ProductAuditFails_view.PeriodYear,
>>>dev.MPWS_ProductAuditFails_view.Period,
>>>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>>>                      dev.MPWS_ProductAuditFails_view.DateInput,
>>>dbo.MPWS_L_AUDIT_LINE.Descr
>>>ORDER BY dev.MPWS_ProductAuditFails_view.DateInput
>>>
>>>This gets me every column but LineCount.  What I have tried for LineCount
>>>is
>>>to use a subquery to count in MPWS_ProductAuditFails_view how many defects
>>>showed up for a line on each audited day.  Problem is, I can't seem to get
>>>the subquery right.  And even the subqueries that don't produce proper
>>>results take wayyyyy too long to run.
>>>
>>>Sooo... any ideas?  I'm just about out.
>>>
>>>Randall Arnold
Author
1 May 2006 1:04 AM
Roy Harvey
On Sat, 29 Apr 2006 14:59:59 -0400, Roy Harvey <roy_har***@snet.net>
wrote:

>OK, here are some general techniques.

One more I should have included.

The graphical Execution Plan viewer in Query Analyzer is your most
powerful friend.  Learn to use it.

A simple trick when looking at execution plans.  Put two versions of
the query in the same query window, and generate the query plans for
both at the same time.  Each one will have an estimated percentage of
the time it would take to run both.  So, if both say 50% the optimizer
thinks they will run about the same.  And if it says 25% for one and
75% for the other it expects performance to differ by a factor of
three.

Roy Harvey
Beacon Falls, CT
Author
1 May 2006 9:47 AM
Erland Sommarskog
Roy Harvey (roy_har***@snet.net) writes:
> The best way to confirm that you understand something is to explain it
> to someone so that they understand it too.  The most common advice
> when debugging a programming problem is to grab somebody and explain
> what is going on, and what is going wrong.  It often doesn't even
> matter whether they actually are qualified to help - in fact sometimes
> the less they know the better.  Not being able to assume anything
> about their knowledge means you have to explain all your assumptions -
> which can often be at the root of the problem.  I have lost count of
> the times that the very act of explaining the problem was all I needed
> to solve it myself.  Likewise I have helped others many times just by
> letting them explain things to me until they solved their own
> problems.  A very powerful tool!

How true. I've experienced this myself many times.

> One corollary to this last point and I am done.  If you can not
> explain a problem so that someone else understands it, what reason
> have you given them to believe that you understand it yourself?

Amen to that, Roy!

--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
1 May 2006 1:57 PM
Randall Arnold
I don't think he really needs the cheerleading any more than I needed the
verbal beating.

Randall Arnold

Show quote
"Erland Sommarskog" <esq***@sommarskog.se> wrote in message
news:Xns97B677ED9C565Yazorman@127.0.0.1...
> Roy Harvey (roy_har***@snet.net) writes:
>> The best way to confirm that you understand something is to explain it
>> to someone so that they understand it too.  The most common advice
>> when debugging a programming problem is to grab somebody and explain
>> what is going on, and what is going wrong.  It often doesn't even
>> matter whether they actually are qualified to help - in fact sometimes
>> the less they know the better.  Not being able to assume anything
>> about their knowledge means you have to explain all your assumptions -
>> which can often be at the root of the problem.  I have lost count of
>> the times that the very act of explaining the problem was all I needed
>> to solve it myself.  Likewise I have helped others many times just by
>> letting them explain things to me until they solved their own
>> problems.  A very powerful tool!
>
> How true. I've experienced this myself many times.
>
>> One corollary to this last point and I am done.  If you can not
>> explain a problem so that someone else understands it, what reason
>> have you given them to believe that you understand it yourself?
>
> Amen to that, Roy!
>
> --
> Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se
>
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
1 May 2006 1:56 PM
Randall Arnold
I really didn't need that lengthy lecture.  It gets far afield of where I'm
trying to go, and seems to based on the assumption that I'm an idiot.

As a developer of over 20 years, I understand very well the need to get all
the info that's relevant.  I've also helped numerous people on similar
forums in similar situations (where I had the expertise in question) for
years.  So I know first hand the frustration of trying to help someone who
doesn't provide the requisite critical info.

Problem here is, in this case, I did.  Again: I'm NOT looking for
highly-specific code to create the results I'm looking for.  I'm simply
providing an example of what I'm working with and asking for high-level
techinique(s) that will apply across a broad range of needs.  I've reviewed
what I provided and there is certainly enough info for someone more versed
in SQL than I to provide at least a starting point as opposed to a
condescending lesson in posting etiquette.  I've seen (and have participated
in) solutions to far more vexing problems on far less information.

For this current dilemma, though, I guess I'll look in other areas where
there's less arrogance and no philosophical labyrinth to navigate.

Randall Arnold

Show quote
"Roy Harvey" <roy_har***@snet.net> wrote in message
news:i397525dblgml0cfk3lntmko2pif53udkl@4ax.com...
> *sigh*
>
> I really do not need more information, as I do not really need to help
> you solve your problem.  I am, or was, ready to try to help, and I
> spent a good bit of time trying to decipher the information you gave
> so that I could try.  While it may be clear to you, or to the rest of
> the world, it was not clear to me.  Not being able to help based on
> the given information, I asked for what seemed to be missing.  My
> apologies.
>
>>The Top 100 PERCENT is a SQL Server requirement if any sort of ORDER BY is
>>used, as I am using.
>
> You did not mention it was in a view.  It IS required of ORDER BY is
> used in a view.  In SQL Server 2000 it happens that the order of the
> rows (generally) is controlled by this ORDER BY when the view is
> queried.  This undocumented behavior that is NOT perpetuated in SQL
> Server 2005.  If you depend on it, it will break if you ever upgrade.
> You might want to keep that in mind.
>
> If it was not in a view, TOP 100 PERCENT does nothing.
>
>>I appreciate your reply, but I'm really not necessarily looking for
>>someone
>>to solve my specific need, but rather as I said, for a general technique I
>>can apply to all such cases.
>
> OK, here are some general techniques.
>
> Beware of DISTINCT.  Sometimes DISTINCT is an absolutely indispensable
> tool.  DISTINCT is also the quick fix for many queries that were
> written incorrectly.  Likewise, it is the quick fix for many queries
> written against databases that were designed incorrectly.  When trying
> to solve a problem, if you see DISTINCT treat it as a red flag and
> PROVE that it is required - as it very well may be.
>
> Be very care full of CROSS JOINs.  When the tables involved are
> "master" tables such as your MPWS_L_AUDIT_LINE it does not worry me
> too much.  When used against something with greater levels of detail,
> especially when it is necessary to guess the degree of detail, I get
> concerned.  I have seen too many cases where the use of a table with
> more detail than required was a reflection of shortcomings in the
> database design, to be specific missing "master" tables.  Not always,
> of course, but an issue that should be reviewed.
>
> If there is redundancy resulting from a CROSS JOIN, redundancy
> requiring a DISTINCT, I much prefer to remove the redundancy BEFORE
> the CROSS JOIN.  Usually I do this with a derived table query against
> the table causing the redundancy, including only the columns needed
> for the CROSS JOIN, with a DISTINCT (or GROUP BY) inside the derived
> table query.
>
> Beware of using a view that has everything you need, but also has
> things you do not need.  Sometimes views require processing and
> joining to derive the columns that are not being referenced.  It is
> all too common for performance to suffer from this.  It is a very easy
> trap to fall into, as my own past mistakes have taught me.  Especially
> when views are written against views that are written against views.
>
> Never loose sight of the key.  If you have a table, it should have a
> unique key.  If you have a view, you should know without having to
> think about it what the effective key is.  Usually this is simple. For
> a view (or query) that uses GROUP BY the key is the GROUP BY columns.
> For a view that performs an INNER JOIN, the key is the combination of
> the key columns for both tables.  Mostly this is a no-brainer, but the
> important concept here is that you can not really think relationally
> about the table (or view or result set) if you do not know the key.
> Of course it is possible to loose the key, say by not including all
> the key columns when JOINING and at the same time not doing a GROUP
> BY.  That is a serious problem but easily avoided by not loosing sight
> of the key.
>
> A required DISTINCT, when combined with a subquery IN THE SELECT LIST,
> is poison.  Don't do it.  Instead, resolve the DISTINCT, then perform
> the subquery.  To do this, put the DISTINCT in a derived query, and
> the subquery in the SELECT list of an outer query.
>
> The best way to confirm that you understand something is to explain it
> to someone so that they understand it too.  The most common advice
> when debugging a programming problem is to grab somebody and explain
> what is going on, and what is going wrong.  It often doesn't even
> matter whether they actually are qualified to help - in fact sometimes
> the less they know the better.  Not being able to assume anything
> about their knowledge means you have to explain all your assumptions -
> which can often be at the root of the problem.  I have lost count of
> the times that the very act of explaining the problem was all I needed
> to solve it myself.  Likewise I have helped others many times just by
> letting them explain things to me until they solved their own
> problems.  A very powerful tool!
>
> One corollary to this last point and I am done.  If you can not
> explain a problem so that someone else understands it, what reason
> have you given them to believe that you understand it yourself?
>
> Roy Harvey
> Beacon Falls, CT
>
>
> On Fri, 28 Apr 2006 23:54:46 -0500, "Randall Arnold"
> <fabricator@cynicalnospamsigns.com> wrote:
>
>>*sigh*
>>
>>If I had thought more info would have been needed I honestly would have
>>provided it.  Really, the things you mention as missing are moot.  The key
>>isn't included in the result set (it's an Indentity).  There will likely
>>be
>>no unique values in the result set at all.  And I didn't include the
>>subquery mentioned for the very reason I gave: it didn't work properly
>>(AND
>>was way too slow).
>>
>>The columns I showed in the query are the only ones I care about.  All
>>others in MPWS_ProductAuditFails_view are utterly moot for the purposes
>>I'm
>>describing,  This is a highly specific need and the view itself is very
>>broad.
>>
>>There is nothing at all wrong with the view's construction or performance,
>>which would be the safe assumption on the reader's part since I was very
>>specific in the issue at hand.  I have used this view for some time with
>>numerous other related queries with 100% success and zero problems.  And
>>what I'm asking for isn't the result of a problem with anything per se-- 
>>I'm
>>looking for a technique.
>>
>>The Top 100 PERCENT is a SQL Server requirement if any sort of ORDER BY is
>>used, as I am using.  The DISTINCT may be moot; I was testing different
>>things to try to get this to work.  It seems to neither hurt nor help.
>>
>>I appreciate your reply, but I'm really not necessarily looking for
>>someone
>>to solve my specific need, but rather as I said, for a general technique I
>>can apply to all such cases.  This isn't the only query of this sort I'm
>>going to need to create.  Bottom line, not every line will have defects
>>every day... yet they will still need to be listed with 0 defects.  All
>>others need the number of defects shown.  The only way I can see to force
>>the appearance of every line per day is to cross join on the Line table.
>>But after that I'm coming up empty on showing the defect quantities int he
>>same rows.
>>
>>Randall Arnold
>>
>>"Roy Harvey" <roy_har***@snet.net> wrote in message
>>news:6385521dmfthofcbjpb9qv1pu4gj4474oi@4ax.com...
>>> As everyone always says here, "You need to include your DDL.
>>> http://www.aspfaq.com/etiquette.asp?id=5006"
>>>
>>> More information is needed.  In particular, more information about the
>>> result set from the view MPWS_ProductAuditFails_view.  Since you are
>>> treating it like a table, we need to know the same things we need to
>>> know for a table.  What are the columns?  What is the key?  Yes, I
>>> know the views don't have defined keys, but there should be some
>>> column or set of columns that are unique.
>>>
>>> Next, you talk about a subquery that is too slow, but you don't show
>>> the subquery.  How can anyone suggest improvements to something
>>> unknown?
>>>
>>> Without that information my comments are few.  One is that, since
>>> MPWS_ProductAuditFails_view is a view, there is no reason for us to
>>> assume the problem is with the query you are asking about; we have no
>>> reason to rule out the view as being the problem.  Another is that I
>>> see no benefit to the TOP 100 PERCENT.  And finally, if the DISTINCT
>>> is really necessary, it would seem that the columns used from
>>> MPWS_ProductAuditFails_view do not constitute a key.  In that case it
>>> is probably worth making those columns DISTINCT before the cross join,
>>> possibly by using a view on MPWS_ProductAuditFails_view, or
>>> alternately using a derived table.
>>>
>>> SELECT DISTINCT
>>>       dev.MPWS_ProductAuditFails_view.PeriodYear,
>>>       dev.MPWS_ProductAuditFails_view.Period,
>>>       dev.MPWS_ProductAuditFails_view.PeriodWeek,
>>>       dev.MPWS_ProductAuditFails_view.DateInput
>>>  FROM dev.MPWS_ProductAuditFails_view
>>>
>>> Roy Harvey
>>> Beacon Falls, CT
>>>
>>>
>>> On Fri, 28 Apr 2006 22:33:37 GMT, "Randall Arnold"
>>> <randall.nospam.arnold@nokia.com.> wrote:
>>>
>>>>The title of this post is a little cryptic but hopefully I can explain
>>>>well
>>>>enough.  I asked a similar question a while back, didn't get usable
>>>>answers
>>>>and then thought I'd solved it on my own.  However, my current approach
>>>>to
>>>>the query is running way too slowly.
>>>>
>>>>In this case I have a static table of assembly lines, MPWS_L_AUDIT_LINE.
>>>>Each line has a Line_ID (immaterial to this need) and a LineName (A0,
>>>>B0,
>>>>C0, etc).
>>>>
>>>>I also have a daily defect query, MPWS_ProductAuditFails_view.  It's a
>>>>series of records detailing failures that can occur on any given line.
>>>>
>>>>What I need as an end result is a list of all lines for every day an
>>>>audit
>>>>was performed.  Each row should look like the following:
>>>>
>>>>  lineview2 LineName PeriodYear Period PeriodWeek DateInput LineCount
>>>>      AA 2006 1 1 1/3/2006 1
>>>>      AB 2006 1 1 1/3/2006 1
>>>>      AC 2006 1 1 1/3/2006 1
>>>>      AD 2006 1 1 1/3/2006 0
>>>>      AE 2006 1 1 1/3/2006 1
>>>>      AF 2006 1 1 1/3/2006 1
>>>>      B0 2006 1 1 1/3/2006 1
>>>>      D0 2006 1 1 1/3/2006 1
>>>>      G0 2006 1 1 1/3/2006 1
>>>>      HL 2006 1 1 1/3/2006 0
>>>>      I0 2006 1 1 1/3/2006 1
>>>>      O0 2006 1 1 1/3/2006 1
>>>>      P0 2006 1 1 1/3/2006 0
>>>>
>>>>(etc)
>>>>
>>>>"LineCount" would be the number of defects found per line per day.
>>>>Every
>>>>line should be represented (I can get that to happen using a Cross Join
>>>>only).  Here is what my current query looks like:
>>>>
>>>>SELECT DISTINCT
>>>>                      TOP 100 PERCENT dbo.MPWS_L_AUDIT_LINE.Descr AS
>>>>LineName, dev.MPWS_ProductAuditFails_view.PeriodYear,
>>>>                      dev.MPWS_ProductAuditFails_view.Period,
>>>>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>>>>dev.MPWS_ProductAuditFails_view.DateInput
>>>>FROM         dbo.MPWS_L_AUDIT_LINE CROSS JOIN
>>>>                      dev.MPWS_ProductAuditFails_view
>>>>GROUP BY dev.MPWS_ProductAuditFails_view.PeriodYear,
>>>>dev.MPWS_ProductAuditFails_view.Period,
>>>>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>>>>                      dev.MPWS_ProductAuditFails_view.DateInput,
>>>>dbo.MPWS_L_AUDIT_LINE.Descr
>>>>ORDER BY dev.MPWS_ProductAuditFails_view.DateInput
>>>>
>>>>This gets me every column but LineCount.  What I have tried for
>>>>LineCount
>>>>is
>>>>to use a subquery to count in MPWS_ProductAuditFails_view how many
>>>>defects
>>>>showed up for a line on each audited day.  Problem is, I can't seem to
>>>>get
>>>>the subquery right.  And even the subqueries that don't produce proper
>>>>results take wayyyyy too long to run.
>>>>
>>>>Sooo... any ideas?  I'm just about out.
>>>>
>>>>Randall Arnold
Author
1 May 2006 10:07 PM
Roy Harvey
Another suggestion.

I understand that you need the result set to include not just the
instances that actually have audits, but also those that do not.
Perhaps it would help to write a SELECT ... COUNT(*) ... GROUP BY ...
query that shows the result set in the format you need, summarized to
the level you need, but only including the rows that DO have activity.
This is a much simpler query than the one requiring the cross join. If
nothing else it would help clarify where the counts that need to be
derived are to come from.

Roy Harvey
Beacon Falls, CT
Author
3 May 2006 6:44 PM
Randall Arnold
I appreciate the suggestion.

Randall Arnold

Show quote
"Roy Harvey" <roy_har***@snet.net> wrote in message
news:ek1d521715b89u8iqah3s9pus50sfqu2og@4ax.com...
> Another suggestion.
>
> I understand that you need the result set to include not just the
> instances that actually have audits, but also those that do not.
> Perhaps it would help to write a SELECT ... COUNT(*) ... GROUP BY ...
> query that shows the result set in the format you need, summarized to
> the level you need, but only including the rows that DO have activity.
> This is a much simpler query than the one requiring the cross join. If
> nothing else it would help clarify where the counts that need to be
> derived are to come from.
>
> Roy Harvey
> Beacon Falls, CT
Author
29 Apr 2006 9:34 PM
Hugo Kornelis
On Fri, 28 Apr 2006 22:33:37 GMT, Randall Arnold wrote:

(snip)
Show quote
> Here is what my current query looks like:
>
>SELECT DISTINCT
>                      TOP 100 PERCENT dbo.MPWS_L_AUDIT_LINE.Descr AS
>LineName, dev.MPWS_ProductAuditFails_view.PeriodYear,
>                      dev.MPWS_ProductAuditFails_view.Period,
>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>dev.MPWS_ProductAuditFails_view.DateInput
>FROM         dbo.MPWS_L_AUDIT_LINE CROSS JOIN
>                      dev.MPWS_ProductAuditFails_view
>GROUP BY dev.MPWS_ProductAuditFails_view.PeriodYear,
>dev.MPWS_ProductAuditFails_view.Period,
>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>                      dev.MPWS_ProductAuditFails_view.DateInput,
>dbo.MPWS_L_AUDIT_LINE.Descr
>ORDER BY dev.MPWS_ProductAuditFails_view.DateInput

Hi Randall,

Even though you claim otherwise in your reply to Roy's message, there
are a few things wrong in this query.

First: If this is to be a view (like you say in your other message),
then get rid of the ORDER BY clause. A view in the relational model is
BY DEFINITION not ordered. This is exactly the reason why ORDER BY is
normally not even allowed in a CREATE VIEW statement. The only exception
to this rule is when a TOP clause is included as well; in that case, the
ORDER BY is used to determine which rows are considered to be "the top",
not the order of the rows in the result.

In SQL Server 2000, a view with TOP 100 PERCENT ... ORDER BY would
*usually* return the rows in the specified order if no ORDER BY clause
is used in the SELECT .. FROM View. But this was not documented, nor
guaranteed - in some (rare) situations, you might witness different
behaviour.

In SQL Server 2005, the optimizer can employ new techniques that *will*
break this behaviour. There have already been lots of reports about this
in the newsgroups by people who, erroneously, were relying on this
undocumented, unsupported and unguaranteed behaviour of SQL Server 2000.

Remove the ORDER BY clause and the TOP 100 PERCENT clause from this view
definition and include an ORDER BY clause on all SELECT statements that
need the data to be in this order. That is, has alwayys been and
probably will always be, the ONLY supported way to guarantee that the
data is returned in the correct order.


Second: you have a GROUP BY on all columns in the SELECT clause, and you
also have a DISTINCT. That's redundant - one of them can be removed
without changing the result one single bit.

I don't think this change will change the performance, though - the
optimizer is smart enough to disregard one of the operations in a
completely redundant situation such as this.

>This gets me every column but LineCount.  What I have tried for LineCount is
>to use a subquery to count in MPWS_ProductAuditFails_view how many defects
>showed up for a line on each audited day.  Problem is, I can't seem to get
>the subquery right.

Hard to tell without knowing what column(s) in this view need to be
matched against which columns in the partial query you already have.

A pattern that might be useful here is this one:

SELECT          a.Col1, a.Col2, a.Col3,
                COUNT(b.KeyCol)
FROM            SomeTable AS a
LEFT OUTER JOIN DetailsTable AS b
ON              b.KeyCol = a.KeyCol
GROUP BY        a.Col1, a.Col2, a.Col3

This assumes that base data is in table SomeTable, and that you need to
add a count of detail entries from DetailsTable. The LEFT OUTER JOIN is
used to preserve rows with a detailcount of 0. If you want to exclude
those, or if the count is always at least 1, use an INNER JOIN instead.

What intrigues me in this case is that MPWS_ProductAuditFails_view is
already used in your query, and that you are grouping on a bunch of
columns from this table. For all I know, just adding a COUNT(*) column
to your query might already suffice. But then again, you might also need
to join to the same view once more. Or a third possibility - maybe yoou
should change the CROSS JOIN to some other kind of join.
There's really no way of answering that without knowing how your tables
(or views, in this case) really look, what data is in it, and what
results you want to get from that data. Unfortunately, you've given us
only the latter.

> And even the subqueries that don't produce proper
>results take wayyyyy too long to run.

As one of the last resorts, it might be worthwhile to check the view
definitions of the various views used in yoour query. Even if they run
quickly on their own, the combination can be too much. You might have to
consider rewriting the query to use the base table directly instead of
the views.

--
Hugo Kornelis, SQL Server MVP
Author
1 May 2006 2:06 PM
Randall Arnold
I cannot get rid of the ORDER BY.  For the customer's requirements, tt is
essential that the results (Linename) be generated in alphabetical order.
I'm amazed and perplexed to hear that such a thing is frowned upon!

As I said in another post, DISTINCT is irrelevant; I was experimenting
because techniques I thought should work were not producing the desired
results.

COUNT(*) by itself did not work; however, I have been suspecting I might
have to join on the MPWS_ProductAuditFails_view again as you suggest.

My problem here is that while I have a very good grasp of table design and
rudimentary SQL, I come up short when having to create something that led to
my post.  I don't instinctively visualize the logic that enables a count of
data elements in the static table that don't exist in the daily records.  I
know how to do such a thing in Visual Basic (using conditional logic) but I
can't see how to translate that knowledge to SQL.

Thank you very much for your attempt to help, Hugo; I'll try your
suggestions.

Randall Arnold

The Query window will not let me get rid
Show quote
"Hugo Kornelis" <h***@perFact.REMOVETHIS.info.INVALID> wrote in message
news:9hl752t0hrvg7f4idrs9897d5lb796i0do@4ax.com...
> On Fri, 28 Apr 2006 22:33:37 GMT, Randall Arnold wrote:
>
> (snip)
>> Here is what my current query looks like:
>>
>>SELECT DISTINCT
>>                      TOP 100 PERCENT dbo.MPWS_L_AUDIT_LINE.Descr AS
>>LineName, dev.MPWS_ProductAuditFails_view.PeriodYear,
>>                      dev.MPWS_ProductAuditFails_view.Period,
>>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>>dev.MPWS_ProductAuditFails_view.DateInput
>>FROM         dbo.MPWS_L_AUDIT_LINE CROSS JOIN
>>                      dev.MPWS_ProductAuditFails_view
>>GROUP BY dev.MPWS_ProductAuditFails_view.PeriodYear,
>>dev.MPWS_ProductAuditFails_view.Period,
>>dev.MPWS_ProductAuditFails_view.PeriodWeek,
>>                      dev.MPWS_ProductAuditFails_view.DateInput,
>>dbo.MPWS_L_AUDIT_LINE.Descr
>>ORDER BY dev.MPWS_ProductAuditFails_view.DateInput
>
> Hi Randall,
>
> Even though you claim otherwise in your reply to Roy's message, there
> are a few things wrong in this query.
>
> First: If this is to be a view (like you say in your other message),
> then get rid of the ORDER BY clause. A view in the relational model is
> BY DEFINITION not ordered. This is exactly the reason why ORDER BY is
> normally not even allowed in a CREATE VIEW statement. The only exception
> to this rule is when a TOP clause is included as well; in that case, the
> ORDER BY is used to determine which rows are considered to be "the top",
> not the order of the rows in the result.
>
> In SQL Server 2000, a view with TOP 100 PERCENT ... ORDER BY would
> *usually* return the rows in the specified order if no ORDER BY clause
> is used in the SELECT .. FROM View. But this was not documented, nor
> guaranteed - in some (rare) situations, you might witness different
> behaviour.
>
> In SQL Server 2005, the optimizer can employ new techniques that *will*
> break this behaviour. There have already been lots of reports about this
> in the newsgroups by people who, erroneously, were relying on this
> undocumented, unsupported and unguaranteed behaviour of SQL Server 2000.
>
> Remove the ORDER BY clause and the TOP 100 PERCENT clause from this view
> definition and include an ORDER BY clause on all SELECT statements that
> need the data to be in this order. That is, has alwayys been and
> probably will always be, the ONLY supported way to guarantee that the
> data is returned in the correct order.
>
>
> Second: you have a GROUP BY on all columns in the SELECT clause, and you
> also have a DISTINCT. That's redundant - one of them can be removed
> without changing the result one single bit.
>
> I don't think this change will change the performance, though - the
> optimizer is smart enough to disregard one of the operations in a
> completely redundant situation such as this.
>
>>This gets me every column but LineCount.  What I have tried for LineCount
>>is
>>to use a subquery to count in MPWS_ProductAuditFails_view how many defects
>>showed up for a line on each audited day.  Problem is, I can't seem to get
>>the subquery right.
>
> Hard to tell without knowing what column(s) in this view need to be
> matched against which columns in the partial query you already have.
>
> A pattern that might be useful here is this one:
>
> SELECT          a.Col1, a.Col2, a.Col3,
>                COUNT(b.KeyCol)
> FROM            SomeTable AS a
> LEFT OUTER JOIN DetailsTable AS b
> ON              b.KeyCol = a.KeyCol
> GROUP BY        a.Col1, a.Col2, a.Col3
>
> This assumes that base data is in table SomeTable, and that you need to
> add a count of detail entries from DetailsTable. The LEFT OUTER JOIN is
> used to preserve rows with a detailcount of 0. If you want to exclude
> those, or if the count is always at least 1, use an INNER JOIN instead.
>
> What intrigues me in this case is that MPWS_ProductAuditFails_view is
> already used in your query, and that you are grouping on a bunch of
> columns from this table. For all I know, just adding a COUNT(*) column
> to your query might already suffice. But then again, you might also need
> to join to the same view once more. Or a third possibility - maybe yoou
> should change the CROSS JOIN to some other kind of join.
> There's really no way of answering that without knowing how your tables
> (or views, in this case) really look, what data is in it, and what
> results you want to get from that data. Unfortunately, you've given us
> only the latter.
>
>> And even the subqueries that don't produce proper
>>results take wayyyyy too long to run.
>
> As one of the last resorts, it might be worthwhile to check the view
> definitions of the various views used in yoour query. Even if they run
> quickly on their own, the combination can be too much. You might have to
> consider rewriting the query to use the base table directly instead of
> the views.
>
> --
> Hugo Kornelis, SQL Server MVP
Author
1 May 2006 2:47 PM
Raymond D'Anjou
"Randall Arnold" <randall.nospam.arnold@nokia.com.> wrote in message
news:%23NizohSbGHA.4580@TK2MSFTNGP03.phx.gbl...
>I cannot get rid of the ORDER BY.  For the customer's requirements, tt is
>essential that the results (Linename) be generated in alphabetical order.
>I'm amazed and perplexed to hear that such a thing is frowned upon!
>

I've erased part of the message but will answer this part.
It's never been said here not to order the results, just don't use ORDER BY
in a view.
It is undocumented, even though it will work in SQL server 2000.
If ever you upgrade to SQL 2005, it may not work.
That's the risk of relying on undocumented behavior.
You could create the view without Ordering and then you have at least 2
options:
- create a stored procedure that returns an ordered recordset (select from
view order by ...)
- select from the view with the ORDER BY clause directly from the client
application

I prefer and always use the first method.
Everybody here HAS to go through a stored procedure for anything database
related.
Author
1 May 2006 3:15 PM
Erland Sommarskog
Randall Arnold (randall.nospam.arnold@nokia.com.) writes:
> I cannot get rid of the ORDER BY.  For the customer's requirements, tt is
> essential that the results (Linename) be generated in alphabetical order.
> I'm amazed and perplexed to hear that such a thing is frowned upon!

So what is the TOP 100 PERCENT doing there?

> As I said in another post, DISTINCT is irrelevant; I was experimenting
> because techniques I thought should work were not producing the desired
> results.

A piece of advice: include what is releant to your problem, and leave out
what is not. We have enough difficulties to understand what result you
are looking for.

As I said in another post, include table definition (preferably as
CREATE TABLE statements), sample data (preferably as INSERT statements)
and the desired results given the sample.
--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
3 May 2006 6:42 PM
Randall Arnold
I already said: SQL server automatically sticks in the top 100 pct if ANY
ORDER BY is used.  I thought this was common knowledge.

And I'm still of the belief I included everything that was relevant.  I note
that other respondents have not had difficulties offering suggestions based
on what I presented.

Randall Arnold

Show quote
"Erland Sommarskog" <esq***@sommarskog.se> wrote in message
news:Xns97B6AF9C5185BYazorman@127.0.0.1...
> Randall Arnold (randall.nospam.arnold@nokia.com.) writes:
>> I cannot get rid of the ORDER BY.  For the customer's requirements, tt is
>> essential that the results (Linename) be generated in alphabetical order.
>> I'm amazed and perplexed to hear that such a thing is frowned upon!
>
> So what is the TOP 100 PERCENT doing there?
>
>> As I said in another post, DISTINCT is irrelevant; I was experimenting
>> because techniques I thought should work were not producing the desired
>> results.
>
> A piece of advice: include what is releant to your problem, and leave out
> what is not. We have enough difficulties to understand what result you
> are looking for.
>
> As I said in another post, include table definition (preferably as
> CREATE TABLE statements), sample data (preferably as INSERT statements)
> and the desired results given the sample.
> --
> Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se
>
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
3 May 2006 10:15 PM
Erland Sommarskog
Randall Arnold (randall.nospam.arnold@nospamnokia.com.) writes:
> I already said: SQL server automatically sticks in the top 100 pct if ANY
> ORDER BY is used.  I thought this was common knowledge.

No, SQL Server does not stick in any TOP 100 PERCENT automatically.

The Query Designer in Enterprise Manager may do, but the Query
Designer is not SQL Server, but appears to be crappy tool that looks
like it origins from an Access environment. Unfortunately, Microsoft
have not been able to provide tools that are in par with the engine.

> And I'm still of the belief I included everything that was relevant.  I
> note that other respondents have not had difficulties offering
> suggestions based on what I presented.

I thought I saw more people asking for the same as I did? Oh well,
nevermind. I hope those suggestions were helpful to you.


--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
8 May 2006 9:20 PM
Hugo Kornelis
On Mon, 1 May 2006 09:06:12 -0500, Randall Arnold wrote:

>I cannot get rid of the ORDER BY.  For the customer's requirements, tt is
>essential that the results (Linename) be generated in alphabetical order.
>I'm amazed and perplexed to hear that such a thing is frowned upon!

Hi Randall,

If it is essential that the result appear in alphabetical order, then it
is also essential that you use ORDER BY where it belongs: on the final
SELECT statement that generates the output for the customer. Not on the
view. This is undocumented and not guaranteed. It will usually work in
SQL Server 2000, but it might fail on you when you least expect it. And
from reports in this group, I know that it WILL fail on you once you
make the move to SQL Server 2005.

Remove both TOP 100 PERCENT and ORDER BY from the view. Include ORDER BY
in the final SELECT.

>COUNT(*) by itself did not work; however, I have been suspecting I might
>have to join on the MPWS_ProductAuditFails_view again as you suggest.

Too bad. But then, it was just a wild guess.

>My problem here is that while I have a very good grasp of table design and
>rudimentary SQL, I come up short when having to create something that led to
>my post.

And my problem (and that of others as well) is that we have a very good
grasp of DB design and advanced query writing, we have too little
information aboout your particular problem to be able to help you. But
since that has already been pointed out to you several times, and you
have already declined to provide more details, I guess I might just as
well abandon this thread and spend my time helping people that _are_
willing to provide more information when asked to do so.

(from another post)
>I'm NOT looking for
>highly-specific code to create the results I'm looking for.  I'm simply
>providing an example of what I'm working with and asking for high-level
>techinique(s) that will apply across a broad range of needs.

But you left out so many details that it's impossible to make any
suggestions other than "consider rewriting your query", "check if the
database design is completely correct", "create supporting indexes or
change your current indexes", and "look into indexed views".

But I really hate giving such broad suggestions, because I honestly
doubt that they'll help you at all.

--
Hugo Kornelis, SQL Server MVP
Author
1 May 2006 9:46 AM
Erland Sommarskog
Randall Arnold (randall.nospam.arnold@nokia.com.) writes:
Show quote
> The title of this post is a little cryptic but hopefully I can explain
> well enough.  I asked a similar question a while back, didn't get usable
> answers and then thought I'd solved it on my own.  However, my current
> approach to the query is running way too slowly.
>
> In this case I have a static table of assembly lines, MPWS_L_AUDIT_LINE.
> Each line has a Line_ID (immaterial to this need) and a LineName (A0, B0,
> C0, etc).
>
> I also have a daily defect query, MPWS_ProductAuditFails_view.  It's a
> series of records detailing failures that can occur on any given line.
>
> What I need as an end result is a list of all lines for every day an audit
> was performed.  Each row should look like the following:
>
>   lineview2 LineName PeriodYear Period PeriodWeek DateInput LineCount
>       AA 2006 1 1 1/3/2006 1
>       AB 2006 1 1 1/3/2006 1
>       AC 2006 1 1 1/3/2006 1
>       AD 2006 1 1 1/3/2006 0
>       AE 2006 1 1 1/3/2006 1
>       AF 2006 1 1 1/3/2006 1
>       B0 2006 1 1 1/3/2006 1
>       D0 2006 1 1 1/3/2006 1
>       G0 2006 1 1 1/3/2006 1
>       HL 2006 1 1 1/3/2006 0
>       I0 2006 1 1 1/3/2006 1
>       O0 2006 1 1 1/3/2006 1
>       P0 2006 1 1 1/3/2006 0

Despite your sighing, I can only echo Roy's suggestion. Please post:

o   CREATE TABLE statements for your tables.
o   INSERT statements with sample data.
o   The desired result given the sample.

It may not have to be the actual table, but a simplified case. I've look at
this post, and I don't understand what this linecount is. Could you have
something like:

       AA 2006 1 1 1/3/2006 3
       AA 2006 1 1 1/3/2006 3
       AA 2006 1 1 1/3/2006 3



--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
1 May 2006 3:08 PM
Jim Underwood
Please post DDL so we can see your indexes, keys, relationships, etc.  You
describe them, but DDL is more explicit and less open to interpretation.

Right off the bat, there are several opportunities for improvement with this
query.

1.  Top 100 percent.  Are you using this only so you can do an order by in a
view?  This seems completely unnecessary here.  You should probably remove
it.  If you need to sort a view, do it when you query against the view, not
in the view itself.

2. Group by - You are grouping by every column in your select list, this
does nothing except force a distinct, as near as I can tell.  I suspect that
this may be less efficient than simply using distinct, but I am not certain.
Nonetheless, this is redundant, so remove it.

3. Distinct - Why is this here?  From what you describe I am guessing you
used distinct because your cross join is returning duplicates.  Why are you
using a cross join rather than a left outer join?  Without the DDL I can't
be sure, but I think you need a left outer join.

Basically, it looks like your SQL needs to be rewritten from scratch, and
you are asking us to fix the flawed SQL.  Give the DDL, sample data (insert
statements), results, etc.  Let us see the SQL for the view
(MPWS_ProductAuditFails_view) that you are using in the sql below.  I
realize you are looking for a general solution.  Let us use this specific
case as an example that we can all learn from.  Start over, write the
MPWS_ProductAuditFails_view from scratch, join in the tables as needed, and
produce the required results in clean and efficient SQL.

If you insist on sticking to what you already have, then you may not be able
to fix it correctly.

Show quote
"Randall Arnold" <randall.nospam.arnold@nokia.com.> wrote in message
news:5Lw4g.28207$_k2.490989@news2.nokia.com...
> The title of this post is a little cryptic but hopefully I can explain
well
> enough.  I asked a similar question a while back, didn't get usable
answers
> and then thought I'd solved it on my own.  However, my current approach to
> the query is running way too slowly.
>
> In this case I have a static table of assembly lines, MPWS_L_AUDIT_LINE.
> Each line has a Line_ID (immaterial to this need) and a LineName (A0, B0,
> C0, etc).
>
> I also have a daily defect query, MPWS_ProductAuditFails_view.  It's a
> series of records detailing failures that can occur on any given line.
>
> What I need as an end result is a list of all lines for every day an audit
> was performed.  Each row should look like the following:
>
>   lineview2 LineName PeriodYear Period PeriodWeek DateInput LineCount
>       AA 2006 1 1 1/3/2006 1
>       AB 2006 1 1 1/3/2006 1
>       AC 2006 1 1 1/3/2006 1
>       AD 2006 1 1 1/3/2006 0
>       AE 2006 1 1 1/3/2006 1
>       AF 2006 1 1 1/3/2006 1
>       B0 2006 1 1 1/3/2006 1
>       D0 2006 1 1 1/3/2006 1
>       G0 2006 1 1 1/3/2006 1
>       HL 2006 1 1 1/3/2006 0
>       I0 2006 1 1 1/3/2006 1
>       O0 2006 1 1 1/3/2006 1
>       P0 2006 1 1 1/3/2006 0
>
> (etc)
>
> "LineCount" would be the number of defects found per line per day.  Every
> line should be represented (I can get that to happen using a Cross Join
> only).  Here is what my current query looks like:
>
> SELECT DISTINCT
>                       TOP 100 PERCENT dbo.MPWS_L_AUDIT_LINE.Descr AS
> LineName, dev.MPWS_ProductAuditFails_view.PeriodYear,
>                       dev.MPWS_ProductAuditFails_view.Period,
> dev.MPWS_ProductAuditFails_view.PeriodWeek,
> dev.MPWS_ProductAuditFails_view.DateInput
> FROM         dbo.MPWS_L_AUDIT_LINE CROSS JOIN
>                       dev.MPWS_ProductAuditFails_view
> GROUP BY dev.MPWS_ProductAuditFails_view.PeriodYear,
> dev.MPWS_ProductAuditFails_view.Period,
> dev.MPWS_ProductAuditFails_view.PeriodWeek,
>                       dev.MPWS_ProductAuditFails_view.DateInput,
> dbo.MPWS_L_AUDIT_LINE.Descr
> ORDER BY dev.MPWS_ProductAuditFails_view.DateInput
>
> This gets me every column but LineCount.  What I have tried for LineCount
is
> to use a subquery to count in MPWS_ProductAuditFails_view how many defects
> showed up for a line on each audited day.  Problem is, I can't seem to get
> the subquery right.  And even the subqueries that don't produce proper
> results take wayyyyy too long to run.
>
> Sooo... any ideas?  I'm just about out.
>
> Randall Arnold
>
>

AddThis Social Bookmark Button