Home All Groups Group Topic Archive Search About
Author
29 Aug 2006 7:57 PM
Dan D.
Using SS2000 SP4. I have a table with many instances of each company id. I
want to get only one row based on a date. I tried this:
select distinct T.numCompanyId,
(select top 1  tblCompanyFranchise.numProgramId ProgramId,
tblCompanyFranchise.numCompanyStatusId StatusId, numTaskId,
dtTaskCreationDate
from tblTasks, tblCompanyFranchise  Where tblTasks.numCompanyId =
tblCompanyFranchise.numCompanyId
AND tblTasks.numCompanyId = T.numCompanyId
order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.numCompanyId,
dtTaskCreationDate desc)
from tblTasks T

but I get this error: "Only one expression can be specified in the select
list when the subquery is not introduced with EXISTS." I tried with "case
when exists" but couldn't figure out how to get that to work either.

Any ideas?

Thanks,
--
Dan D.

Author
29 Aug 2006 8:53 PM
Tom Cooper
Please post DDL for your tables, and sample data and show the result you
would want for the sample data you give us.  Otherwise we are just guessing
at what your table structure is and the result you want.  See
www.aspfaq.com/5006 for help on how to provide that.

Tom

Show quote
"Dan D." <D***@discussions.microsoft.com> wrote in message
news:0A323244-D18A-4D38-B65B-5601A2F3CD6F@microsoft.com...
> Using SS2000 SP4. I have a table with many instances of each company id. I
> want to get only one row based on a date. I tried this:
> select distinct T.numCompanyId,
> (select top 1  tblCompanyFranchise.numProgramId ProgramId,
> tblCompanyFranchise.numCompanyStatusId StatusId, numTaskId,
> dtTaskCreationDate
> from tblTasks, tblCompanyFranchise  Where tblTasks.numCompanyId =
> tblCompanyFranchise.numCompanyId
> AND tblTasks.numCompanyId = T.numCompanyId
> order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.numCompanyId,
> dtTaskCreationDate desc)
> from tblTasks T
>
> but I get this error: "Only one expression can be specified in the select
> list when the subquery is not introduced with EXISTS." I tried with "case
> when exists" but couldn't figure out how to get that to work either.
>
> Any ideas?
>
> Thanks,
> --
> Dan D.
Author
30 Aug 2006 2:34 PM
Dan D.
CREATE TABLE [tblTasks] (
    [numTaskId] [decimal](18, 0) NOT NULL ,
    [numUserId] [decimal](18, 0) NULL ,
    [numCompanyId] [decimal](18, 0) NULL ,
    [numProgramId] [decimal](18, 0) NULL ,
    [numFranchiseId] [decimal](18, 0) NULL ,
    [numCorporateId] [decimal](18, 0) NULL ,
    [dtTaskCreationDate] [datetime] NOT NULL ,
    [dtTaskCreationTime] [datetime] NOT NULL ,
    [Priority] [int] NULL
) ON [PRIMARY]
GO
CREATE TABLE [tblCompanyFranchise] (
    [ID] [numeric](18, 0) NOT NULL ,
    [numFranchiseId] [numeric](18, 0) NOT NULL ,
    [numCompanyId] [numeric](18, 0) NOT NULL ,
    [numCompanyStatusId] [numeric](18, 0) NOT NULL ,
    [numProgramId] [numeric](18, 0) NOT NULL ,
    [pinId] [varchar] (30) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
    CONSTRAINT [PK_tblCompanyFranchise] PRIMARY KEY  CLUSTERED
    (
        [ID]
    ) WITH  FILLFACTOR = 90  ON [PRIMARY] ,
    CONSTRAINT [CK_tblCompanyFranchise_Pinid] UNIQUE  NONCLUSTERED
    (
        [ID]
    ) WITH  FILLFACTOR = 90  ON [PRIMARY]
) ON [PRIMARY]
GO

In my original email I should have said that I need one row for each
company. This is what I came up with. It seems to work.

select distinct T.numCompanyId CompanyId,
    (select top 1  tblTasks.numProgramId
    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
    order by dtTaskCreationDate desc)
AS ProgramId,
    (select top 1  tblCompanyFranchise.numCompanyStatusId
    from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
T.numCompanyId )
AS StatusId,
    (select top 1  dtTaskCreationDate
    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
    order by dtTaskCreationDate desc)
AS dtTaskCreationDate,
    (select top 1  numTaskId
    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
    order by dtTaskCreationDate desc)
AS numTaskId,
    (select top 1  tblCompanyFranchise.numFranchiseId
    from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
T.numCompanyId )
AS StatusId
from tblTasks T
order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.CompanyId,
dtTaskCreationDate desc
--
Dan D.


Show quote
"Tom Cooper" wrote:

> Please post DDL for your tables, and sample data and show the result you
> would want for the sample data you give us.  Otherwise we are just guessing
> at what your table structure is and the result you want.  See
> www.aspfaq.com/5006 for help on how to provide that.
>
> Tom
>
> "Dan D." <D***@discussions.microsoft.com> wrote in message
> news:0A323244-D18A-4D38-B65B-5601A2F3CD6F@microsoft.com...
> > Using SS2000 SP4. I have a table with many instances of each company id. I
> > want to get only one row based on a date. I tried this:
> > select distinct T.numCompanyId,
> > (select top 1  tblCompanyFranchise.numProgramId ProgramId,
> > tblCompanyFranchise.numCompanyStatusId StatusId, numTaskId,
> > dtTaskCreationDate
> > from tblTasks, tblCompanyFranchise  Where tblTasks.numCompanyId =
> > tblCompanyFranchise.numCompanyId
> > AND tblTasks.numCompanyId = T.numCompanyId
> > order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.numCompanyId,
> > dtTaskCreationDate desc)
> > from tblTasks T
> >
> > but I get this error: "Only one expression can be specified in the select
> > list when the subquery is not introduced with EXISTS." I tried with "case
> > when exists" but couldn't figure out how to get that to work either.
> >
> > Any ideas?
> >
> > Thanks,
> > --
> > Dan D.
>
>
>
Author
29 Aug 2006 11:01 PM
Roy Harvey
A subquery in the SELECT list can only return one column.  The one in
the example tries to return four.  Not that it would work as you wish
even if there were only one, as there is no correlation between the
outer query and the inner query.

Without the DDL for the tables it isn't even practical to guess how it
can be rewritten.  In particular there is no indication what table
dtTaskCreationDate comes from, and the whole query is supposed to
depend on that column.

Roy Harvey
Beacon Falls, CT

On Tue, 29 Aug 2006 12:57:02 -0700, Dan D.
<D***@discussions.microsoft.com> wrote:

Show quote
>Using SS2000 SP4. I have a table with many instances of each company id. I
>want to get only one row based on a date. I tried this:
> select distinct T.numCompanyId,
> (select top 1  tblCompanyFranchise.numProgramId ProgramId,
>tblCompanyFranchise.numCompanyStatusId StatusId, numTaskId,
>dtTaskCreationDate
>from tblTasks, tblCompanyFranchise  Where tblTasks.numCompanyId =
>tblCompanyFranchise.numCompanyId
>AND tblTasks.numCompanyId = T.numCompanyId
>order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.numCompanyId,
>dtTaskCreationDate desc)
>from tblTasks T
>
>but I get this error: "Only one expression can be specified in the select
>list when the subquery is not introduced with EXISTS." I tried with "case
>when exists" but couldn't figure out how to get that to work either.
>
>Any ideas?
>
>Thanks,
Author
30 Aug 2006 2:35 PM
Dan D.
See other post for schemas. I think this works.

select distinct T.numCompanyId CompanyId,
    (select top 1  tblTasks.numProgramId
    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
    order by dtTaskCreationDate desc)
AS ProgramId,
    (select top 1  tblCompanyFranchise.numCompanyStatusId
    from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
T.numCompanyId )
AS StatusId,
    (select top 1  dtTaskCreationDate
    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
    order by dtTaskCreationDate desc)
AS dtTaskCreationDate,
    (select top 1  numTaskId
    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
    order by dtTaskCreationDate desc)
AS numTaskId,
    (select top 1  tblCompanyFranchise.numFranchiseId
    from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
T.numCompanyId )
AS StatusId
from tblTasks T
order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.CompanyId,
dtTaskCreationDate desc
--
Dan D.


Show quote
"Roy Harvey" wrote:

> A subquery in the SELECT list can only return one column.  The one in
> the example tries to return four.  Not that it would work as you wish
> even if there were only one, as there is no correlation between the
> outer query and the inner query.
>
> Without the DDL for the tables it isn't even practical to guess how it
> can be rewritten.  In particular there is no indication what table
> dtTaskCreationDate comes from, and the whole query is supposed to
> depend on that column.
>
> Roy Harvey
> Beacon Falls, CT
>
> On Tue, 29 Aug 2006 12:57:02 -0700, Dan D.
> <D***@discussions.microsoft.com> wrote:
>
> >Using SS2000 SP4. I have a table with many instances of each company id. I
> >want to get only one row based on a date. I tried this:
> > select distinct T.numCompanyId,
> > (select top 1  tblCompanyFranchise.numProgramId ProgramId,
> >tblCompanyFranchise.numCompanyStatusId StatusId, numTaskId,
> >dtTaskCreationDate
> >from tblTasks, tblCompanyFranchise  Where tblTasks.numCompanyId =
> >tblCompanyFranchise.numCompanyId
> >AND tblTasks.numCompanyId = T.numCompanyId
> >order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.numCompanyId,
> >dtTaskCreationDate desc)
> >from tblTasks T
> >
> >but I get this error: "Only one expression can be specified in the select
> >list when the subquery is not introduced with EXISTS." I tried with "case
> >when exists" but couldn't figure out how to get that to work either.
> >
> >Any ideas?
> >
> >Thanks,
>
Author
30 Aug 2006 3:16 PM
Roy Harvey
Three subqueriesin the SELECT list, all against the same table, all
with the same criteria, generally means that a different approach can
be used with less redundancy.

Also, using TOP without any ORDER BY, as in two other subqueries,
gives a random result from among the set of possible results.

I would expect something more along these lines:

SELECT T.numCompanyId as CompanyId,
       T.numProgramId as ProgramId,
       C.numCompanyStatusId as StatusId,
       T.dtTaskCreationDate as TaskCreationDate,
       T.numTaskId as TaskId,
       C.numFranchiseId as FranchiseId
  FROM tblCompanyFranchise as C
  JOIN tblTasks as T
    ON C.numCompanyId = T.numCompanyId
WHERE T.dtTaskCreationDate =
       (select max(dtTaskCreationDate) from tblTasks as T2
         where C.numCompanyId = T2.numCompanyId)

This may not be what you need - it is still unclear to me what
constitutes a unique row in the input data.  Hopefully you will be
able to get the idea in any case.  One possible change, if this
provides more than one row per company:

SELECT T.numCompanyId as CompanyId,
       T.numProgramId as ProgramId,
       MAX(C.numCompanyStatusId) as StatusId,
       T.dtTaskCreationDate as TaskCreationDate,
       T.numTaskId as TaskId,
       MAX(C.numFranchiseId) as FranchiseId
  FROM tblCompanyFranchise as C
  JOIN tblTasks as T
    ON C.numCompanyId = T.numCompanyId
WHERE T.dtTaskCreationDate =
       (select max(dtTaskCreationDate) from tblTasks as T2
         where C.numCompanyId = T2.numCompanyId)
GROUP BY T.numCompanyId, T.numProgramId, T.dtTaskCreationDate,
       T.numTaskId

A couple of notes about the table definitions are in order.  First, it
is generally considered poor practice to prefix column names with
their data types, and the same goes for prefixing table names with
tbl.  Second, looking at the definition of tblCompanyFranchise, the
key is the ID column, but I find it hard to believe that there is not
a natural key among the other columns, either a single column or
combination of columns.  In that case the use of the ID column seems
redundant, but in any case IF there is a natural key that too should
have a unique constraint.

Roy Harvey
Beacon Falls, CT

On Wed, 30 Aug 2006 07:35:02 -0700, Dan D.
<D***@discussions.microsoft.com> wrote:

Show quote
>See other post for schemas. I think this works.
>
>select distinct T.numCompanyId CompanyId,
>     (select top 1  tblTasks.numProgramId
>    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
>    order by dtTaskCreationDate desc)
>AS ProgramId,
>    (select top 1  tblCompanyFranchise.numCompanyStatusId
>    from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
>T.numCompanyId )
>AS StatusId,
>    (select top 1  dtTaskCreationDate
>    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
>    order by dtTaskCreationDate desc)
>AS dtTaskCreationDate,
>    (select top 1  numTaskId
>    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
>    order by dtTaskCreationDate desc)
>AS numTaskId,
>    (select top 1  tblCompanyFranchise.numFranchiseId
>    from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
>T.numCompanyId )
>AS StatusId
>from tblTasks T
>order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.CompanyId,
>dtTaskCreationDate desc
Author
30 Aug 2006 3:51 PM
Dan D.
I'm all for doing it more efficiently if possible. Your queries ran a lot
faster than mine (8 to 10 secs vs 37 secs). They didn't give too many rows.
They gave too few rows. Your queries both returned 125269 rows and mine
returned 128803. If I just run "select distinct numcompanid from tbltasks" I
get 128803 rows. I'm trying to figure out why.

There is a one to many relationship between tblCompanyFranchise and
tblTasks. The data comes from a telemarketing center. For each contact with a
company a row gets added to tblTasks. The purpose of the query is to only get
the data for the last contact with the company. Hope that helps. Thanks.
--
Dan D.


Show quote
"Roy Harvey" wrote:

> Three subqueriesin the SELECT list, all against the same table, all
> with the same criteria, generally means that a different approach can
> be used with less redundancy.
>
> Also, using TOP without any ORDER BY, as in two other subqueries,
> gives a random result from among the set of possible results.
>
> I would expect something more along these lines:
>
> SELECT T.numCompanyId as CompanyId,
>        T.numProgramId as ProgramId,
>        C.numCompanyStatusId as StatusId,
>        T.dtTaskCreationDate as TaskCreationDate,
>        T.numTaskId as TaskId,
>        C.numFranchiseId as FranchiseId
>   FROM tblCompanyFranchise as C
>   JOIN tblTasks as T
>     ON C.numCompanyId = T.numCompanyId
>  WHERE T.dtTaskCreationDate =
>        (select max(dtTaskCreationDate) from tblTasks as T2
>          where C.numCompanyId = T2.numCompanyId)
>
> This may not be what you need - it is still unclear to me what
> constitutes a unique row in the input data.  Hopefully you will be
> able to get the idea in any case.  One possible change, if this
> provides more than one row per company:
>
> SELECT T.numCompanyId as CompanyId,
>        T.numProgramId as ProgramId,
>        MAX(C.numCompanyStatusId) as StatusId,
>        T.dtTaskCreationDate as TaskCreationDate,
>        T.numTaskId as TaskId,
>        MAX(C.numFranchiseId) as FranchiseId
>   FROM tblCompanyFranchise as C
>   JOIN tblTasks as T
>     ON C.numCompanyId = T.numCompanyId
>  WHERE T.dtTaskCreationDate =
>        (select max(dtTaskCreationDate) from tblTasks as T2
>          where C.numCompanyId = T2.numCompanyId)
>  GROUP BY T.numCompanyId, T.numProgramId, T.dtTaskCreationDate,
>        T.numTaskId
>
> A couple of notes about the table definitions are in order.  First, it
> is generally considered poor practice to prefix column names with
> their data types, and the same goes for prefixing table names with
> tbl.  Second, looking at the definition of tblCompanyFranchise, the
> key is the ID column, but I find it hard to believe that there is not
> a natural key among the other columns, either a single column or
> combination of columns.  In that case the use of the ID column seems
> redundant, but in any case IF there is a natural key that too should
> have a unique constraint.
>
> Roy Harvey
> Beacon Falls, CT
>
> On Wed, 30 Aug 2006 07:35:02 -0700, Dan D.
> <D***@discussions.microsoft.com> wrote:
>
> >See other post for schemas. I think this works.
> >
> >select distinct T.numCompanyId CompanyId,
> >     (select top 1  tblTasks.numProgramId
> >    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> >    order by dtTaskCreationDate desc)
> >AS ProgramId,
> >    (select top 1  tblCompanyFranchise.numCompanyStatusId
> >    from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
> >T.numCompanyId )
> >AS StatusId,
> >    (select top 1  dtTaskCreationDate
> >    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> >    order by dtTaskCreationDate desc)
> >AS dtTaskCreationDate,
> >    (select top 1  numTaskId
> >    from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> >    order by dtTaskCreationDate desc)
> >AS numTaskId,
> >    (select top 1  tblCompanyFranchise.numFranchiseId
> >    from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
> >T.numCompanyId )
> >AS StatusId
> >from tblTasks T
> >order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.CompanyId,
> >dtTaskCreationDate desc
>
Author
30 Aug 2006 3:58 PM
Jim Underwood
Do you have null values in dtTaskCreationDate or numCompanyId?

Show quote
"Dan D." <D***@discussions.microsoft.com> wrote in message
news:A339228E-87C2-456F-9921-9BF6DCD30C87@microsoft.com...
> I'm all for doing it more efficiently if possible. Your queries ran a lot
> faster than mine (8 to 10 secs vs 37 secs). They didn't give too many
rows.
> They gave too few rows. Your queries both returned 125269 rows and mine
> returned 128803. If I just run "select distinct numcompanid from tbltasks"
I
> get 128803 rows. I'm trying to figure out why.
>
> There is a one to many relationship between tblCompanyFranchise and
> tblTasks. The data comes from a telemarketing center. For each contact
with a
> company a row gets added to tblTasks. The purpose of the query is to only
get
> the data for the last contact with the company. Hope that helps. Thanks.
> --
> Dan D.
>
>
> "Roy Harvey" wrote:
>
> > Three subqueriesin the SELECT list, all against the same table, all
> > with the same criteria, generally means that a different approach can
> > be used with less redundancy.
> >
> > Also, using TOP without any ORDER BY, as in two other subqueries,
> > gives a random result from among the set of possible results.
> >
> > I would expect something more along these lines:
> >
> > SELECT T.numCompanyId as CompanyId,
> >        T.numProgramId as ProgramId,
> >        C.numCompanyStatusId as StatusId,
> >        T.dtTaskCreationDate as TaskCreationDate,
> >        T.numTaskId as TaskId,
> >        C.numFranchiseId as FranchiseId
> >   FROM tblCompanyFranchise as C
> >   JOIN tblTasks as T
> >     ON C.numCompanyId = T.numCompanyId
> >  WHERE T.dtTaskCreationDate =
> >        (select max(dtTaskCreationDate) from tblTasks as T2
> >          where C.numCompanyId = T2.numCompanyId)
> >
> > This may not be what you need - it is still unclear to me what
> > constitutes a unique row in the input data.  Hopefully you will be
> > able to get the idea in any case.  One possible change, if this
> > provides more than one row per company:
> >
> > SELECT T.numCompanyId as CompanyId,
> >        T.numProgramId as ProgramId,
> >        MAX(C.numCompanyStatusId) as StatusId,
> >        T.dtTaskCreationDate as TaskCreationDate,
> >        T.numTaskId as TaskId,
> >        MAX(C.numFranchiseId) as FranchiseId
> >   FROM tblCompanyFranchise as C
> >   JOIN tblTasks as T
> >     ON C.numCompanyId = T.numCompanyId
> >  WHERE T.dtTaskCreationDate =
> >        (select max(dtTaskCreationDate) from tblTasks as T2
> >          where C.numCompanyId = T2.numCompanyId)
> >  GROUP BY T.numCompanyId, T.numProgramId, T.dtTaskCreationDate,
> >        T.numTaskId
> >
> > A couple of notes about the table definitions are in order.  First, it
> > is generally considered poor practice to prefix column names with
> > their data types, and the same goes for prefixing table names with
> > tbl.  Second, looking at the definition of tblCompanyFranchise, the
> > key is the ID column, but I find it hard to believe that there is not
> > a natural key among the other columns, either a single column or
> > combination of columns.  In that case the use of the ID column seems
> > redundant, but in any case IF there is a natural key that too should
> > have a unique constraint.
> >
> > Roy Harvey
> > Beacon Falls, CT
> >
> > On Wed, 30 Aug 2006 07:35:02 -0700, Dan D.
> > <D***@discussions.microsoft.com> wrote:
> >
> > >See other post for schemas. I think this works.
> > >
> > >select distinct T.numCompanyId CompanyId,
> > > (select top 1  tblTasks.numProgramId
> > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> > > order by dtTaskCreationDate desc)
> > >AS ProgramId,
> > > (select top 1  tblCompanyFranchise.numCompanyStatusId
> > > from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
> > >T.numCompanyId )
> > >AS StatusId,
> > > (select top 1  dtTaskCreationDate
> > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> > > order by dtTaskCreationDate desc)
> > >AS dtTaskCreationDate,
> > > (select top 1  numTaskId
> > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> > > order by dtTaskCreationDate desc)
> > >AS numTaskId,
> > > (select top 1  tblCompanyFranchise.numFranchiseId
> > > from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
> > >T.numCompanyId )
> > >AS StatusId
> > >from tblTasks T
> > >order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.CompanyId,
> > >dtTaskCreationDate desc
> >
Author
30 Aug 2006 4:30 PM
Tom Cooper
Do you have duplicate values in dtTaskCreationDate for the same company?
And why do you have a dtTaskCreationDate and a dtTaskCreationTime.  If you
are storing date in one column and time in the other, it is usually better
to store both date and time in the same column.

Also, in your schema you have a unique constraint name
CK_tblCompanyFranchise_Pinid which sounds like you wanted that constraint to
ensure that PinId is unique.  But, in fact, that constraint is ensuring that
ID is unique, and you already know that is true since you have your primary
key on ID.

Finally, do you have a primary key and a clustered index on tblTasks?
You'll want to choose the clustered index very carefully (possibly the
primary key, but probably on some other column(s) like numCompanyId) if you
have multiple tasks for 125,000+ companies.

Tom

Show quote
"Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message
news:eCnkm0EzGHA.2516@TK2MSFTNGP06.phx.gbl...
> Do you have null values in dtTaskCreationDate or numCompanyId?
>
> "Dan D." <D***@discussions.microsoft.com> wrote in message
> news:A339228E-87C2-456F-9921-9BF6DCD30C87@microsoft.com...
>> I'm all for doing it more efficiently if possible. Your queries ran a lot
>> faster than mine (8 to 10 secs vs 37 secs). They didn't give too many
> rows.
>> They gave too few rows. Your queries both returned 125269 rows and mine
>> returned 128803. If I just run "select distinct numcompanid from
>> tbltasks"
> I
>> get 128803 rows. I'm trying to figure out why.
>>
>> There is a one to many relationship between tblCompanyFranchise and
>> tblTasks. The data comes from a telemarketing center. For each contact
> with a
>> company a row gets added to tblTasks. The purpose of the query is to only
> get
>> the data for the last contact with the company. Hope that helps. Thanks.
>> --
>> Dan D.
>>
>>
>> "Roy Harvey" wrote:
>>
>> > Three subqueriesin the SELECT list, all against the same table, all
>> > with the same criteria, generally means that a different approach can
>> > be used with less redundancy.
>> >
>> > Also, using TOP without any ORDER BY, as in two other subqueries,
>> > gives a random result from among the set of possible results.
>> >
>> > I would expect something more along these lines:
>> >
>> > SELECT T.numCompanyId as CompanyId,
>> >        T.numProgramId as ProgramId,
>> >        C.numCompanyStatusId as StatusId,
>> >        T.dtTaskCreationDate as TaskCreationDate,
>> >        T.numTaskId as TaskId,
>> >        C.numFranchiseId as FranchiseId
>> >   FROM tblCompanyFranchise as C
>> >   JOIN tblTasks as T
>> >     ON C.numCompanyId = T.numCompanyId
>> >  WHERE T.dtTaskCreationDate =
>> >        (select max(dtTaskCreationDate) from tblTasks as T2
>> >          where C.numCompanyId = T2.numCompanyId)
>> >
>> > This may not be what you need - it is still unclear to me what
>> > constitutes a unique row in the input data.  Hopefully you will be
>> > able to get the idea in any case.  One possible change, if this
>> > provides more than one row per company:
>> >
>> > SELECT T.numCompanyId as CompanyId,
>> >        T.numProgramId as ProgramId,
>> >        MAX(C.numCompanyStatusId) as StatusId,
>> >        T.dtTaskCreationDate as TaskCreationDate,
>> >        T.numTaskId as TaskId,
>> >        MAX(C.numFranchiseId) as FranchiseId
>> >   FROM tblCompanyFranchise as C
>> >   JOIN tblTasks as T
>> >     ON C.numCompanyId = T.numCompanyId
>> >  WHERE T.dtTaskCreationDate =
>> >        (select max(dtTaskCreationDate) from tblTasks as T2
>> >          where C.numCompanyId = T2.numCompanyId)
>> >  GROUP BY T.numCompanyId, T.numProgramId, T.dtTaskCreationDate,
>> >        T.numTaskId
>> >
>> > A couple of notes about the table definitions are in order.  First, it
>> > is generally considered poor practice to prefix column names with
>> > their data types, and the same goes for prefixing table names with
>> > tbl.  Second, looking at the definition of tblCompanyFranchise, the
>> > key is the ID column, but I find it hard to believe that there is not
>> > a natural key among the other columns, either a single column or
>> > combination of columns.  In that case the use of the ID column seems
>> > redundant, but in any case IF there is a natural key that too should
>> > have a unique constraint.
>> >
>> > Roy Harvey
>> > Beacon Falls, CT
>> >
>> > On Wed, 30 Aug 2006 07:35:02 -0700, Dan D.
>> > <D***@discussions.microsoft.com> wrote:
>> >
>> > >See other post for schemas. I think this works.
>> > >
>> > >select distinct T.numCompanyId CompanyId,
>> > > (select top 1  tblTasks.numProgramId
>> > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
>> > > order by dtTaskCreationDate desc)
>> > >AS ProgramId,
>> > > (select top 1  tblCompanyFranchise.numCompanyStatusId
>> > > from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
>> > >T.numCompanyId )
>> > >AS StatusId,
>> > > (select top 1  dtTaskCreationDate
>> > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
>> > > order by dtTaskCreationDate desc)
>> > >AS dtTaskCreationDate,
>> > > (select top 1  numTaskId
>> > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
>> > > order by dtTaskCreationDate desc)
>> > >AS numTaskId,
>> > > (select top 1  tblCompanyFranchise.numFranchiseId
>> > > from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
>> > >T.numCompanyId )
>> > >AS StatusId
>> > >from tblTasks T
>> > >order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.CompanyId,
>> > >dtTaskCreationDate desc
>> >
>
>
Author
30 Aug 2006 5:43 PM
Dan D.
Yes there are duplicate values in dtTaskCreationDate for the same company. I
wouldn't have thought so but its true. As to why a dtTaskCreationDate and a
dtTaskCreationTime column, I don't know. We inherited this app and I don't
know what the intended purpose of those two columns was supposed to be.

The CK_tblCompanyFranchise_Pinid  problem is on the list to be fixed. There
is a clustered index on numTaskId and numCompanyId (separate indexes) in
tblTasks but no primary key. Another thing that will change in the future.
--
Dan D.


Show quote
"Tom Cooper" wrote:

> Do you have duplicate values in dtTaskCreationDate for the same company?
> And why do you have a dtTaskCreationDate and a dtTaskCreationTime.  If you
> are storing date in one column and time in the other, it is usually better
> to store both date and time in the same column.
>
> Also, in your schema you have a unique constraint name
> CK_tblCompanyFranchise_Pinid which sounds like you wanted that constraint to
> ensure that PinId is unique.  But, in fact, that constraint is ensuring that
> ID is unique, and you already know that is true since you have your primary
> key on ID.
>
> Finally, do you have a primary key and a clustered index on tblTasks?
> You'll want to choose the clustered index very carefully (possibly the
> primary key, but probably on some other column(s) like numCompanyId) if you
> have multiple tasks for 125,000+ companies.
>
> Tom
>
> "Jim Underwood" <james.underwoodATfallonclinic.com> wrote in message
> news:eCnkm0EzGHA.2516@TK2MSFTNGP06.phx.gbl...
> > Do you have null values in dtTaskCreationDate or numCompanyId?
> >
> > "Dan D." <D***@discussions.microsoft.com> wrote in message
> > news:A339228E-87C2-456F-9921-9BF6DCD30C87@microsoft.com...
> >> I'm all for doing it more efficiently if possible. Your queries ran a lot
> >> faster than mine (8 to 10 secs vs 37 secs). They didn't give too many
> > rows.
> >> They gave too few rows. Your queries both returned 125269 rows and mine
> >> returned 128803. If I just run "select distinct numcompanid from
> >> tbltasks"
> > I
> >> get 128803 rows. I'm trying to figure out why.
> >>
> >> There is a one to many relationship between tblCompanyFranchise and
> >> tblTasks. The data comes from a telemarketing center. For each contact
> > with a
> >> company a row gets added to tblTasks. The purpose of the query is to only
> > get
> >> the data for the last contact with the company. Hope that helps. Thanks.
> >> --
> >> Dan D.
> >>
> >>
> >> "Roy Harvey" wrote:
> >>
> >> > Three subqueriesin the SELECT list, all against the same table, all
> >> > with the same criteria, generally means that a different approach can
> >> > be used with less redundancy.
> >> >
> >> > Also, using TOP without any ORDER BY, as in two other subqueries,
> >> > gives a random result from among the set of possible results.
> >> >
> >> > I would expect something more along these lines:
> >> >
> >> > SELECT T.numCompanyId as CompanyId,
> >> >        T.numProgramId as ProgramId,
> >> >        C.numCompanyStatusId as StatusId,
> >> >        T.dtTaskCreationDate as TaskCreationDate,
> >> >        T.numTaskId as TaskId,
> >> >        C.numFranchiseId as FranchiseId
> >> >   FROM tblCompanyFranchise as C
> >> >   JOIN tblTasks as T
> >> >     ON C.numCompanyId = T.numCompanyId
> >> >  WHERE T.dtTaskCreationDate =
> >> >        (select max(dtTaskCreationDate) from tblTasks as T2
> >> >          where C.numCompanyId = T2.numCompanyId)
> >> >
> >> > This may not be what you need - it is still unclear to me what
> >> > constitutes a unique row in the input data.  Hopefully you will be
> >> > able to get the idea in any case.  One possible change, if this
> >> > provides more than one row per company:
> >> >
> >> > SELECT T.numCompanyId as CompanyId,
> >> >        T.numProgramId as ProgramId,
> >> >        MAX(C.numCompanyStatusId) as StatusId,
> >> >        T.dtTaskCreationDate as TaskCreationDate,
> >> >        T.numTaskId as TaskId,
> >> >        MAX(C.numFranchiseId) as FranchiseId
> >> >   FROM tblCompanyFranchise as C
> >> >   JOIN tblTasks as T
> >> >     ON C.numCompanyId = T.numCompanyId
> >> >  WHERE T.dtTaskCreationDate =
> >> >        (select max(dtTaskCreationDate) from tblTasks as T2
> >> >          where C.numCompanyId = T2.numCompanyId)
> >> >  GROUP BY T.numCompanyId, T.numProgramId, T.dtTaskCreationDate,
> >> >        T.numTaskId
> >> >
> >> > A couple of notes about the table definitions are in order.  First, it
> >> > is generally considered poor practice to prefix column names with
> >> > their data types, and the same goes for prefixing table names with
> >> > tbl.  Second, looking at the definition of tblCompanyFranchise, the
> >> > key is the ID column, but I find it hard to believe that there is not
> >> > a natural key among the other columns, either a single column or
> >> > combination of columns.  In that case the use of the ID column seems
> >> > redundant, but in any case IF there is a natural key that too should
> >> > have a unique constraint.
> >> >
> >> > Roy Harvey
> >> > Beacon Falls, CT
> >> >
> >> > On Wed, 30 Aug 2006 07:35:02 -0700, Dan D.
> >> > <D***@discussions.microsoft.com> wrote:
> >> >
> >> > >See other post for schemas. I think this works.
> >> > >
> >> > >select distinct T.numCompanyId CompanyId,
> >> > > (select top 1  tblTasks.numProgramId
> >> > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> >> > > order by dtTaskCreationDate desc)
> >> > >AS ProgramId,
> >> > > (select top 1  tblCompanyFranchise.numCompanyStatusId
> >> > > from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
> >> > >T.numCompanyId )
> >> > >AS StatusId,
> >> > > (select top 1  dtTaskCreationDate
> >> > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> >> > > order by dtTaskCreationDate desc)
> >> > >AS dtTaskCreationDate,
> >> > > (select top 1  numTaskId
> >> > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> >> > > order by dtTaskCreationDate desc)
> >> > >AS numTaskId,
> >> > > (select top 1  tblCompanyFranchise.numFranchiseId
> >> > > from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
> >> > >T.numCompanyId )
> >> > >AS StatusId
> >> > >from tblTasks T
> >> > >order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.CompanyId,
> >> > >dtTaskCreationDate desc
> >> >
> >
> >
>
>
>
Author
30 Aug 2006 5:24 PM
Dan D.
There are no nulls in either column.
--
Dan D.


Show quote
"Jim Underwood" wrote:

> Do you have null values in dtTaskCreationDate or numCompanyId?
>
> "Dan D." <D***@discussions.microsoft.com> wrote in message
> news:A339228E-87C2-456F-9921-9BF6DCD30C87@microsoft.com...
> > I'm all for doing it more efficiently if possible. Your queries ran a lot
> > faster than mine (8 to 10 secs vs 37 secs). They didn't give too many
> rows.
> > They gave too few rows. Your queries both returned 125269 rows and mine
> > returned 128803. If I just run "select distinct numcompanid from tbltasks"
> I
> > get 128803 rows. I'm trying to figure out why.
> >
> > There is a one to many relationship between tblCompanyFranchise and
> > tblTasks. The data comes from a telemarketing center. For each contact
> with a
> > company a row gets added to tblTasks. The purpose of the query is to only
> get
> > the data for the last contact with the company. Hope that helps. Thanks.
> > --
> > Dan D.
> >
> >
> > "Roy Harvey" wrote:
> >
> > > Three subqueriesin the SELECT list, all against the same table, all
> > > with the same criteria, generally means that a different approach can
> > > be used with less redundancy.
> > >
> > > Also, using TOP without any ORDER BY, as in two other subqueries,
> > > gives a random result from among the set of possible results.
> > >
> > > I would expect something more along these lines:
> > >
> > > SELECT T.numCompanyId as CompanyId,
> > >        T.numProgramId as ProgramId,
> > >        C.numCompanyStatusId as StatusId,
> > >        T.dtTaskCreationDate as TaskCreationDate,
> > >        T.numTaskId as TaskId,
> > >        C.numFranchiseId as FranchiseId
> > >   FROM tblCompanyFranchise as C
> > >   JOIN tblTasks as T
> > >     ON C.numCompanyId = T.numCompanyId
> > >  WHERE T.dtTaskCreationDate =
> > >        (select max(dtTaskCreationDate) from tblTasks as T2
> > >          where C.numCompanyId = T2.numCompanyId)
> > >
> > > This may not be what you need - it is still unclear to me what
> > > constitutes a unique row in the input data.  Hopefully you will be
> > > able to get the idea in any case.  One possible change, if this
> > > provides more than one row per company:
> > >
> > > SELECT T.numCompanyId as CompanyId,
> > >        T.numProgramId as ProgramId,
> > >        MAX(C.numCompanyStatusId) as StatusId,
> > >        T.dtTaskCreationDate as TaskCreationDate,
> > >        T.numTaskId as TaskId,
> > >        MAX(C.numFranchiseId) as FranchiseId
> > >   FROM tblCompanyFranchise as C
> > >   JOIN tblTasks as T
> > >     ON C.numCompanyId = T.numCompanyId
> > >  WHERE T.dtTaskCreationDate =
> > >        (select max(dtTaskCreationDate) from tblTasks as T2
> > >          where C.numCompanyId = T2.numCompanyId)
> > >  GROUP BY T.numCompanyId, T.numProgramId, T.dtTaskCreationDate,
> > >        T.numTaskId
> > >
> > > A couple of notes about the table definitions are in order.  First, it
> > > is generally considered poor practice to prefix column names with
> > > their data types, and the same goes for prefixing table names with
> > > tbl.  Second, looking at the definition of tblCompanyFranchise, the
> > > key is the ID column, but I find it hard to believe that there is not
> > > a natural key among the other columns, either a single column or
> > > combination of columns.  In that case the use of the ID column seems
> > > redundant, but in any case IF there is a natural key that too should
> > > have a unique constraint.
> > >
> > > Roy Harvey
> > > Beacon Falls, CT
> > >
> > > On Wed, 30 Aug 2006 07:35:02 -0700, Dan D.
> > > <D***@discussions.microsoft.com> wrote:
> > >
> > > >See other post for schemas. I think this works.
> > > >
> > > >select distinct T.numCompanyId CompanyId,
> > > > (select top 1  tblTasks.numProgramId
> > > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> > > > order by dtTaskCreationDate desc)
> > > >AS ProgramId,
> > > > (select top 1  tblCompanyFranchise.numCompanyStatusId
> > > > from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
> > > >T.numCompanyId )
> > > >AS StatusId,
> > > > (select top 1  dtTaskCreationDate
> > > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> > > > order by dtTaskCreationDate desc)
> > > >AS dtTaskCreationDate,
> > > > (select top 1  numTaskId
> > > > from tblTasks where tblTasks.numCompanyId = T.numCompanyId
> > > > order by dtTaskCreationDate desc)
> > > >AS numTaskId,
> > > > (select top 1  tblCompanyFranchise.numFranchiseId
> > > > from tblCompanyFranchise where tblCompanyFranchise.numCompanyId =
> > > >T.numCompanyId )
> > > >AS StatusId
> > > >from tblTasks T
> > > >order by tblCompanyFranchise.ProgramId, tblCompanyFranchise.CompanyId,
> > > >dtTaskCreationDate desc
> > >
>
>
>
Author
30 Aug 2006 4:07 PM
Roy Harvey
Try a LEFT OUTER JOIN:

SELECT T.numCompanyId as CompanyId,
       T.numProgramId as ProgramId,
       C.numCompanyStatusId as StatusId,
       T.dtTaskCreationDate as TaskCreationDate,
       T.numTaskId as TaskId,
       C.numFranchiseId as FranchiseId
  FROM tblCompanyFranchise as C
  LEFT OUTER
  JOIN tblTasks as T
    ON C.numCompanyId = T.numCompanyId
WHERE T.dtTaskCreationDate =
       (select max(dtTaskCreationDate) from tblTasks as T2
         where C.numCompanyId = T2.numCompanyId)

Roy Harvey
Beacon Falls, CT

On Wed, 30 Aug 2006 08:51:01 -0700, Dan D.
<D***@discussions.microsoft.com> wrote:

Show quote
>I'm all for doing it more efficiently if possible. Your queries ran a lot
>faster than mine (8 to 10 secs vs 37 secs). They didn't give too many rows.
>They gave too few rows. Your queries both returned 125269 rows and mine
>returned 128803. If I just run "select distinct numcompanid from tbltasks" I
>get 128803 rows. I'm trying to figure out why.
>
>There is a one to many relationship between tblCompanyFranchise and
>tblTasks. The data comes from a telemarketing center. For each contact with a
>company a row gets added to tblTasks. The purpose of the query is to only get
>the data for the last contact with the company. Hope that helps. Thanks.
>--
>Dan D.
Author
30 Aug 2006 4:32 PM
Jim Underwood
Roy,
Won't the where clause negate the left outer join?  Shouldn't it be:

  LEFT OUTER
  JOIN tblTasks as T
    ON C.numCompanyId = T.numCompanyId
AND T.dtTaskCreationDate =
       (select max(dtTaskCreationDate) from tblTasks as T2
         where C.numCompanyId = T2.numCompanyId)

Show quote
"Roy Harvey" <roy_har***@snet.net> wrote in message
news:ntdbf29uh5p1k42gelm4l3b10echojm8fk@4ax.com...
> Try a LEFT OUTER JOIN:
>
> SELECT T.numCompanyId as CompanyId,
>        T.numProgramId as ProgramId,
>        C.numCompanyStatusId as StatusId,
>        T.dtTaskCreationDate as TaskCreationDate,
>        T.numTaskId as TaskId,
>        C.numFranchiseId as FranchiseId
>   FROM tblCompanyFranchise as C
>   LEFT OUTER
>   JOIN tblTasks as T
>     ON C.numCompanyId = T.numCompanyId
>  WHERE T.dtTaskCreationDate =
>        (select max(dtTaskCreationDate) from tblTasks as T2
>          where C.numCompanyId = T2.numCompanyId)
>
> Roy Harvey
> Beacon Falls, CT
>
> On Wed, 30 Aug 2006 08:51:01 -0700, Dan D.
> <D***@discussions.microsoft.com> wrote:
>
> >I'm all for doing it more efficiently if possible. Your queries ran a lot
> >faster than mine (8 to 10 secs vs 37 secs). They didn't give too many
rows.
> >They gave too few rows. Your queries both returned 125269 rows and mine
> >returned 128803. If I just run "select distinct numcompanid from
tbltasks" I
> >get 128803 rows. I'm trying to figure out why.
> >
> >There is a one to many relationship between tblCompanyFranchise and
> >tblTasks. The data comes from a telemarketing center. For each contact
with a
> >company a row gets added to tblTasks. The purpose of the query is to only
get
> >the data for the last contact with the company. Hope that helps. Thanks.
> >--
> >Dan D.
Author
30 Aug 2006 5:02 PM
Roy Harvey
On Wed, 30 Aug 2006 12:32:28 -0400, "Jim Underwood"
<james.underwoodATfallonclinic.com> wrote:

>Roy,
>Won't the where clause negate the left outer join?  Shouldn't it be:
>
>  LEFT OUTER
>  JOIN tblTasks as T
>    ON C.numCompanyId = T.numCompanyId
> AND T.dtTaskCreationDate =
>       (select max(dtTaskCreationDate) from tblTasks as T2
>         where C.numCompanyId = T2.numCompanyId)

Yes, you are correct that I blew the WHERE clause.  Thanks for
catching that!

Roy
Author
30 Aug 2006 5:46 PM
Dan D.
That returns too many rows - 143380.
--
Dan D.


Show quote
"Jim Underwood" wrote:

> Roy,
> Won't the where clause negate the left outer join?  Shouldn't it be:
>
>   LEFT OUTER
>   JOIN tblTasks as T
>     ON C.numCompanyId = T.numCompanyId
>  AND T.dtTaskCreationDate =
>        (select max(dtTaskCreationDate) from tblTasks as T2
>          where C.numCompanyId = T2.numCompanyId)
>
> "Roy Harvey" <roy_har***@snet.net> wrote in message
> news:ntdbf29uh5p1k42gelm4l3b10echojm8fk@4ax.com...
> > Try a LEFT OUTER JOIN:
> >
> > SELECT T.numCompanyId as CompanyId,
> >        T.numProgramId as ProgramId,
> >        C.numCompanyStatusId as StatusId,
> >        T.dtTaskCreationDate as TaskCreationDate,
> >        T.numTaskId as TaskId,
> >        C.numFranchiseId as FranchiseId
> >   FROM tblCompanyFranchise as C
> >   LEFT OUTER
> >   JOIN tblTasks as T
> >     ON C.numCompanyId = T.numCompanyId
> >  WHERE T.dtTaskCreationDate =
> >        (select max(dtTaskCreationDate) from tblTasks as T2
> >          where C.numCompanyId = T2.numCompanyId)
> >
> > Roy Harvey
> > Beacon Falls, CT
> >
> > On Wed, 30 Aug 2006 08:51:01 -0700, Dan D.
> > <D***@discussions.microsoft.com> wrote:
> >
> > >I'm all for doing it more efficiently if possible. Your queries ran a lot
> > >faster than mine (8 to 10 secs vs 37 secs). They didn't give too many
> rows.
> > >They gave too few rows. Your queries both returned 125269 rows and mine
> > >returned 128803. If I just run "select distinct numcompanid from
> tbltasks" I
> > >get 128803 rows. I'm trying to figure out why.
> > >
> > >There is a one to many relationship between tblCompanyFranchise and
> > >tblTasks. The data comes from a telemarketing center. For each contact
> with a
> > >company a row gets added to tblTasks. The purpose of the query is to only
> get
> > >the data for the last contact with the company. Hope that helps. Thanks.
> > >--
> > >Dan D.
>
>
>
Author
30 Aug 2006 6:18 PM
Jim Underwood
Try these two queries to see where you have duplicates...

SELECT T.numCompanyId as CompanyId,
       T.numProgramId as ProgramId,
       T.dtTaskCreationDate as TaskCreationDate,
       T.numTaskId as TaskId
  FROM tblTasks as T
where exists
(
select 1
from tblTasks as T1
where T.numCompanyId = T1.numCompanyId
and T.dtTaskCreationDate = T1.dtTaskCreationDate
having count(1) > 1
);

SELECT C.numCompanyStatusId as StatusId,
       C.numFranchiseId as FranchiseId
  FROM tblCompanyFranchise as C
where exists
(
select 1
from tblCompanyFranchise as C1
where C.numCompanyId = C1.numCompanyId
having count(1) > 1
);

Also, try adding dtTaskCreationTime to the join and see if it gets the right
number of rows...

SELECT T.numCompanyId as CompanyId,
      T.numProgramId as ProgramId,
      C.numCompanyStatusId as StatusId,
      T.dtTaskCreationDate as TaskCreationDate,
      T.numTaskId as TaskId,
      C.numFranchiseId as FranchiseId
FROM tblCompanyFranchise as C
LEFT OUTER
JOIN tblTasks as T
   ON C.numCompanyId = T.numCompanyId
AND T.dtTaskCreationDate =
      (select max(dtTaskCreationDate) from tblTasks as T2
        where C.numCompanyId = T2.numCompanyId)

AND T.dtTaskCreationTime =
      (select max(dtTaskCreationTime) from tblTasks as T2
        where C.numCompanyId = T2.numCompanyId
        and T2.dtTaskCreationDate = T.dtTaskCreationDate )

Show quote
"Dan D." <D***@discussions.microsoft.com> wrote in message
news:5ED843C2-9CC1-4C02-99DD-AF2B6C8AA8A9@microsoft.com...
> That returns too many rows - 143380.
> --
> Dan D.
>
>
> "Jim Underwood" wrote:
>
> > Roy,
> > Won't the where clause negate the left outer join?  Shouldn't it be:
> >
> >   LEFT OUTER
> >   JOIN tblTasks as T
> >     ON C.numCompanyId = T.numCompanyId
> >  AND T.dtTaskCreationDate =
> >        (select max(dtTaskCreationDate) from tblTasks as T2
> >          where C.numCompanyId = T2.numCompanyId)
> >
> > "Roy Harvey" <roy_har***@snet.net> wrote in message
> > news:ntdbf29uh5p1k42gelm4l3b10echojm8fk@4ax.com...
> > > Try a LEFT OUTER JOIN:
> > >
> > > SELECT T.numCompanyId as CompanyId,
> > >        T.numProgramId as ProgramId,
> > >        C.numCompanyStatusId as StatusId,
> > >        T.dtTaskCreationDate as TaskCreationDate,
> > >        T.numTaskId as TaskId,
> > >        C.numFranchiseId as FranchiseId
> > >   FROM tblCompanyFranchise as C
> > >   LEFT OUTER
> > >   JOIN tblTasks as T
> > >     ON C.numCompanyId = T.numCompanyId
> > >  WHERE T.dtTaskCreationDate =
> > >        (select max(dtTaskCreationDate) from tblTasks as T2
> > >          where C.numCompanyId = T2.numCompanyId)
> > >
> > > Roy Harvey
> > > Beacon Falls, CT
> > >
> > > On Wed, 30 Aug 2006 08:51:01 -0700, Dan D.
> > > <D***@discussions.microsoft.com> wrote:
> > >
> > > >I'm all for doing it more efficiently if possible. Your queries ran a
lot
> > > >faster than mine (8 to 10 secs vs 37 secs). They didn't give too many
> > rows.
> > > >They gave too few rows. Your queries both returned 125269 rows and
mine
> > > >returned 128803. If I just run "select distinct numcompanid from
> > tbltasks" I
> > > >get 128803 rows. I'm trying to figure out why.
> > > >
> > > >There is a one to many relationship between tblCompanyFranchise and
> > > >tblTasks. The data comes from a telemarketing center. For each
contact
> > with a
> > > >company a row gets added to tblTasks. The purpose of the query is to
only
> > get
> > > >the data for the last contact with the company. Hope that helps.
Thanks.
> > > >--
> > > >Dan D.
> >
> >
> >
Author
30 Aug 2006 6:48 PM
Dan D.
The first query returns 5046 and the second one zero. I had done a slightly
different query earlier and I think there were 1600+ companies that had at
least one dupe. We discovered a batch process that was timing out sometimes
and the user would try again and apparently some of the rows would get
inserted before it timed out and then they would get inserted again. We're
working on fixing that process now and then we'll have to clean up the data.

The third query returned 143380 rows.
--
Dan D.


Show quote
"Jim Underwood" wrote:

> Try these two queries to see where you have duplicates...
>
> SELECT T.numCompanyId as CompanyId,
>        T.numProgramId as ProgramId,
>        T.dtTaskCreationDate as TaskCreationDate,
>        T.numTaskId as TaskId
>   FROM tblTasks as T
> where exists
> (
> select 1
> from tblTasks as T1
> where T.numCompanyId = T1.numCompanyId
> and T.dtTaskCreationDate = T1.dtTaskCreationDate
> having count(1) > 1
> );
>
> SELECT C.numCompanyStatusId as StatusId,
>        C.numFranchiseId as FranchiseId
>   FROM tblCompanyFranchise as C
> where exists
> (
> select 1
> from tblCompanyFranchise as C1
> where C.numCompanyId = C1.numCompanyId
> having count(1) > 1
> );
>
> Also, try adding dtTaskCreationTime to the join and see if it gets the right
> number of rows...
>
> SELECT T.numCompanyId as CompanyId,
>       T.numProgramId as ProgramId,
>       C.numCompanyStatusId as StatusId,
>       T.dtTaskCreationDate as TaskCreationDate,
>       T.numTaskId as TaskId,
>       C.numFranchiseId as FranchiseId
>  FROM tblCompanyFranchise as C
>  LEFT OUTER
>  JOIN tblTasks as T
>    ON C.numCompanyId = T.numCompanyId
> AND T.dtTaskCreationDate =
>       (select max(dtTaskCreationDate) from tblTasks as T2
>         where C.numCompanyId = T2.numCompanyId)
>
> AND T.dtTaskCreationTime =
>       (select max(dtTaskCreationTime) from tblTasks as T2
>         where C.numCompanyId = T2.numCompanyId
>         and T2.dtTaskCreationDate = T.dtTaskCreationDate )
>
> "Dan D." <D***@discussions.microsoft.com> wrote in message
> news:5ED843C2-9CC1-4C02-99DD-AF2B6C8AA8A9@microsoft.com...
> > That returns too many rows - 143380.
> > --
> > Dan D.
> >
> >
> > "Jim Underwood" wrote:
> >
> > > Roy,
> > > Won't the where clause negate the left outer join?  Shouldn't it be:
> > >
> > >   LEFT OUTER
> > >   JOIN tblTasks as T
> > >     ON C.numCompanyId = T.numCompanyId
> > >  AND T.dtTaskCreationDate =
> > >        (select max(dtTaskCreationDate) from tblTasks as T2
> > >          where C.numCompanyId = T2.numCompanyId)
> > >
> > > "Roy Harvey" <roy_har***@snet.net> wrote in message
> > > news:ntdbf29uh5p1k42gelm4l3b10echojm8fk@4ax.com...
> > > > Try a LEFT OUTER JOIN:
> > > >
> > > > SELECT T.numCompanyId as CompanyId,
> > > >        T.numProgramId as ProgramId,
> > > >        C.numCompanyStatusId as StatusId,
> > > >        T.dtTaskCreationDate as TaskCreationDate,
> > > >        T.numTaskId as TaskId,
> > > >        C.numFranchiseId as FranchiseId
> > > >   FROM tblCompanyFranchise as C
> > > >   LEFT OUTER
> > > >   JOIN tblTasks as T
> > > >     ON C.numCompanyId = T.numCompanyId
> > > >  WHERE T.dtTaskCreationDate =
> > > >        (select max(dtTaskCreationDate) from tblTasks as T2
> > > >          where C.numCompanyId = T2.numCompanyId)
> > > >
> > > > Roy Harvey
> > > > Beacon Falls, CT
> > > >
> > > > On Wed, 30 Aug 2006 08:51:01 -0700, Dan D.
> > > > <D***@discussions.microsoft.com> wrote:
> > > >
> > > > >I'm all for doing it more efficiently if possible. Your queries ran a
> lot
> > > > >faster than mine (8 to 10 secs vs 37 secs). They didn't give too many
> > > rows.
> > > > >They gave too few rows. Your queries both returned 125269 rows and
> mine
> > > > >returned 128803. If I just run "select distinct numcompanid from
> > > tbltasks" I
> > > > >get 128803 rows. I'm trying to figure out why.
> > > > >
> > > > >There is a one to many relationship between tblCompanyFranchise and
> > > > >tblTasks. The data comes from a telemarketing center. For each
> contact
> > > with a
> > > > >company a row gets added to tblTasks. The purpose of the query is to
> only
> > > get
> > > > >the data for the last contact with the company. Hope that helps.
> Thanks.
> > > > >--
> > > > >Dan D.
> > >
> > >
> > >
>
>
>
Author
30 Aug 2006 7:11 PM
Jim Underwood
A primary key / unique constraint on the proper columns will make it
impossible for duplicates to be inserted.  Of course the data needs to be
cleaned up first, and you will want to build error handling into your
application so the users get a friendly message when this happens (assuming
you don't code the application to prevent it as well)..

Out of curiosity, do you have any rows in tblTasks  where TaskCreationDate
<> TaskCreationTime?

Show quote
"Dan D." <D***@discussions.microsoft.com> wrote in message
news:479B64B7-FFF7-47B3-A0D5-2E17D24AEB10@microsoft.com...
> The first query returns 5046 and the second one zero. I had done a
slightly
> different query earlier and I think there were 1600+ companies that had at
> least one dupe. We discovered a batch process that was timing out
sometimes
> and the user would try again and apparently some of the rows would get
> inserted before it timed out and then they would get inserted again. We're
> working on fixing that process now and then we'll have to clean up the
data.
>
> The third query returned 143380 rows.
> --
> Dan D.
>
>
> "Jim Underwood" wrote:
>
> > Try these two queries to see where you have duplicates...
> >
> > SELECT T.numCompanyId as CompanyId,
> >        T.numProgramId as ProgramId,
> >        T.dtTaskCreationDate as TaskCreationDate,
> >        T.numTaskId as TaskId
> >   FROM tblTasks as T
> > where exists
> > (
> > select 1
> > from tblTasks as T1
> > where T.numCompanyId = T1.numCompanyId
> > and T.dtTaskCreationDate = T1.dtTaskCreationDate
> > having count(1) > 1
> > );
> >
> > SELECT C.numCompanyStatusId as StatusId,
> >        C.numFranchiseId as FranchiseId
> >   FROM tblCompanyFranchise as C
> > where exists
> > (
> > select 1
> > from tblCompanyFranchise as C1
> > where C.numCompanyId = C1.numCompanyId
> > having count(1) > 1
> > );
> >
> > Also, try adding dtTaskCreationTime to the join and see if it gets the
right
> > number of rows...
> >
> > SELECT T.numCompanyId as CompanyId,
> >       T.numProgramId as ProgramId,
> >       C.numCompanyStatusId as StatusId,
> >       T.dtTaskCreationDate as TaskCreationDate,
> >       T.numTaskId as TaskId,
> >       C.numFranchiseId as FranchiseId
> >  FROM tblCompanyFranchise as C
> >  LEFT OUTER
> >  JOIN tblTasks as T
> >    ON C.numCompanyId = T.numCompanyId
> > AND T.dtTaskCreationDate =
> >       (select max(dtTaskCreationDate) from tblTasks as T2
> >         where C.numCompanyId = T2.numCompanyId)
> >
> > AND T.dtTaskCreationTime =
> >       (select max(dtTaskCreationTime) from tblTasks as T2
> >         where C.numCompanyId = T2.numCompanyId
> >         and T2.dtTaskCreationDate = T.dtTaskCreationDate )
> >
> > "Dan D." <D***@discussions.microsoft.com> wrote in message
> > news:5ED843C2-9CC1-4C02-99DD-AF2B6C8AA8A9@microsoft.com...
> > > That returns too many rows - 143380.
> > > --
> > > Dan D.
> > >
> > >
> > > "Jim Underwood" wrote:
> > >
> > > > Roy,
> > > > Won't the where clause negate the left outer join?  Shouldn't it be:
> > > >
> > > >   LEFT OUTER
> > > >   JOIN tblTasks as T
> > > >     ON C.numCompanyId = T.numCompanyId
> > > >  AND T.dtTaskCreationDate =
> > > >        (select max(dtTaskCreationDate) from tblTasks as T2
> > > >          where C.numCompanyId = T2.numCompanyId)
> > > >
> > > > "Roy Harvey" <roy_har***@snet.net> wrote in message
> > > > news:ntdbf29uh5p1k42gelm4l3b10echojm8fk@4ax.com...
> > > > > Try a LEFT OUTER JOIN:
> > > > >
> > > > > SELECT T.numCompanyId as CompanyId,
> > > > >        T.numProgramId as ProgramId,
> > > > >        C.numCompanyStatusId as StatusId,
> > > > >        T.dtTaskCreationDate as TaskCreationDate,
> > > > >        T.numTaskId as TaskId,
> > > > >        C.numFranchiseId as FranchiseId
> > > > >   FROM tblCompanyFranchise as C
> > > > >   LEFT OUTER
> > > > >   JOIN tblTasks as T
> > > > >     ON C.numCompanyId = T.numCompanyId
> > > > >  WHERE T.dtTaskCreationDate =
> > > > >        (select max(dtTaskCreationDate) from tblTasks as T2
> > > > >          where C.numCompanyId = T2.numCompanyId)
> > > > >
> > > > > Roy Harvey
> > > > > Beacon Falls, CT
> > > > >
> > > > > On Wed, 30 Aug 2006 08:51:01 -0700, Dan D.
> > > > > <D***@discussions.microsoft.com> wrote:
> > > > >
> > > > > >I'm all for doing it more efficiently if possible. Your queries
ran a
> > lot
> > > > > >faster than mine (8 to 10 secs vs 37 secs). They didn't give too
many
> > > > rows.
> > > > > >They gave too few rows. Your queries both returned 125269 rows
and
> > mine
> > > > > >returned 128803. If I just run "select distinct numcompanid from
> > > > tbltasks" I
> > > > > >get 128803 rows. I'm trying to figure out why.
> > > > > >
> > > > > >There is a one to many relationship between tblCompanyFranchise
and
> > > > > >tblTasks. The data comes from a telemarketing center. For each
> > contact
> > > > with a
> > > > > >company a row gets added to tblTasks. The purpose of the query is
to
> > only
> > > > get
> > > > > >the data for the last contact with the company. Hope that helps.
> > Thanks.
> > > > > >--
> > > > > >Dan D.
> > > >
> > > >
> > > >
> >
> >
> >
Author
30 Aug 2006 5:45 PM
Dan D.
That's as fast as your other queries but still returns 125269 rows.
--
Dan D.


Show quote
"Roy Harvey" wrote:

> Try a LEFT OUTER JOIN:
>
> SELECT T.numCompanyId as CompanyId,
>        T.numProgramId as ProgramId,
>        C.numCompanyStatusId as StatusId,
>        T.dtTaskCreationDate as TaskCreationDate,
>        T.numTaskId as TaskId,
>        C.numFranchiseId as FranchiseId
>   FROM tblCompanyFranchise as C
>   LEFT OUTER
>   JOIN tblTasks as T
>     ON C.numCompanyId = T.numCompanyId
>  WHERE T.dtTaskCreationDate =
>        (select max(dtTaskCreationDate) from tblTasks as T2
>          where C.numCompanyId = T2.numCompanyId)
>
> Roy Harvey
> Beacon Falls, CT
>
> On Wed, 30 Aug 2006 08:51:01 -0700, Dan D.
> <D***@discussions.microsoft.com> wrote:
>
> >I'm all for doing it more efficiently if possible. Your queries ran a lot
> >faster than mine (8 to 10 secs vs 37 secs). They didn't give too many rows.
> >They gave too few rows. Your queries both returned 125269 rows and mine
> >returned 128803. If I just run "select distinct numcompanid from tbltasks" I
> >get 128803 rows. I'm trying to figure out why.
> >
> >There is a one to many relationship between tblCompanyFranchise and
> >tblTasks. The data comes from a telemarketing center. For each contact with a
> >company a row gets added to tblTasks. The purpose of the query is to only get
> >the data for the last contact with the company. Hope that helps. Thanks.
> >--
> >Dan D.
>
Author
30 Aug 2006 5:56 PM
Roy Harvey
On Wed, 30 Aug 2006 10:45:01 -0700, Dan D.
<D***@discussions.microsoft.com> wrote:

>That's as fast as your other queries but still returns 125269 rows.

Please run:

SELECT COUNT(distinct numcompanid) FROM tblCompanyFranchise
SELECT COUNT(distinct numcompanid) FROM tbltasks

SELECT numcompanid
  FROM tbltasks as T
WHERE NOT EXISTS
       (select * from tblCompanyFranchise as C
         where T.numcompanid = C.numcompanid)

and post the results.

Roy Harvey
Beacon Falls, CT
Author
30 Aug 2006 6:56 PM
Dan D.
143366
128803
8092 - 3548 unique companies

So there are quite a few companies with no rows in tblTasks and quite a few
companies that don't even exist.:)
--
Dan D.


Show quote
"Roy Harvey" wrote:

> On Wed, 30 Aug 2006 10:45:01 -0700, Dan D.
> <D***@discussions.microsoft.com> wrote:
>
> >That's as fast as your other queries but still returns 125269 rows.
>
> Please run:
>
> SELECT COUNT(distinct numcompanid) FROM tblCompanyFranchise
> SELECT COUNT(distinct numcompanid) FROM tbltasks
>
> SELECT numcompanid
>   FROM tbltasks as T
>  WHERE NOT EXISTS
>        (select * from tblCompanyFranchise as C
>          where T.numcompanid = C.numcompanid)
>
> and post the results.
>
> Roy Harvey
> Beacon Falls, CT
>
Author
30 Aug 2006 7:09 PM
Roy Harvey
On Wed, 30 Aug 2006 11:56:01 -0700, Dan D.
<D***@discussions.microsoft.com> wrote:

>143366
>128803
>8092 - 3548 unique companies
>
>So there are quite a few companies with no rows in tblTasks and quite a few
>companies that don't even exist.:)

Sounds like you have some decisions to make about just how the
contradictions in the data need to be treated in your query.

Roy
Author
30 Aug 2006 7:27 PM
Dan D.
Yes we do. Thanks for your help Roy.
--
Dan D.


Show quote
"Roy Harvey" wrote:

> On Wed, 30 Aug 2006 11:56:01 -0700, Dan D.
> <D***@discussions.microsoft.com> wrote:
>
> >143366
> >128803
> >8092 - 3548 unique companies
> >
> >So there are quite a few companies with no rows in tblTasks and quite a few
> >companies that don't even exist.:)
>
> Sounds like you have some decisions to make about just how the
> contradictions in the data need to be treated in your query.
>
> Roy
>
Author
30 Aug 2006 7:01 PM
Dan D.
Someone who knows that data better than I do, said that even though the
company is removed from tblCompanyFranchise, which means the company is no
longer associated with a franchise, the company still exists in tblCompany
and is shown as inactive.
--
Dan D.


Show quote
"Roy Harvey" wrote:

> On Wed, 30 Aug 2006 10:45:01 -0700, Dan D.
> <D***@discussions.microsoft.com> wrote:
>
> >That's as fast as your other queries but still returns 125269 rows.
>
> Please run:
>
> SELECT COUNT(distinct numcompanid) FROM tblCompanyFranchise
> SELECT COUNT(distinct numcompanid) FROM tbltasks
>
> SELECT numcompanid
>   FROM tbltasks as T
>  WHERE NOT EXISTS
>        (select * from tblCompanyFranchise as C
>          where T.numcompanid = C.numcompanid)
>
> and post the results.
>
> Roy Harvey
> Beacon Falls, CT
>
Author
30 Aug 2006 7:06 PM
Dan D.
So I guess that could explain the difference between my count of 128803 and
your count of 125269. Your count is only for those companies that exist in
tblCompanyFranchise and my count would include companies that are not in
tblCompanyFranchise.
--
Dan D.


Show quote
"Roy Harvey" wrote:

> On Wed, 30 Aug 2006 10:45:01 -0700, Dan D.
> <D***@discussions.microsoft.com> wrote:
>
> >That's as fast as your other queries but still returns 125269 rows.
>
> Please run:
>
> SELECT COUNT(distinct numcompanid) FROM tblCompanyFranchise
> SELECT COUNT(distinct numcompanid) FROM tbltasks
>
> SELECT numcompanid
>   FROM tbltasks as T
>  WHERE NOT EXISTS
>        (select * from tblCompanyFranchise as C
>          where T.numcompanid = C.numcompanid)
>
> and post the results.
>
> Roy Harvey
> Beacon Falls, CT
>
Author
30 Aug 2006 7:11 PM
Dan D.
I double checked the StatusId column in my query and there are 3548 nulls. So
those are the companies that don't exist in tblCompanyFranchise.

I'll use your query. It's also faster. Thanks Roy.
--
Dan D.


Show quote
"Roy Harvey" wrote:

> On Wed, 30 Aug 2006 10:45:01 -0700, Dan D.
> <D***@discussions.microsoft.com> wrote:
>
> >That's as fast as your other queries but still returns 125269 rows.
>
> Please run:
>
> SELECT COUNT(distinct numcompanid) FROM tblCompanyFranchise
> SELECT COUNT(distinct numcompanid) FROM tbltasks
>
> SELECT numcompanid
>   FROM tbltasks as T
>  WHERE NOT EXISTS
>        (select * from tblCompanyFranchise as C
>          where T.numcompanid = C.numcompanid)
>
> and post the results.
>
> Roy Harvey
> Beacon Falls, CT
>

AddThis Social Bookmark Button