Home All Groups Group Topic Archive Search About

how to: INSERT IGNORE INTO?

Author
20 Jul 2006 11:04 AM
R. Nachtsturm
Hi,

I have the following situation:
a partitioned view as the target of a "insert into", with the source being
an "select from" statement.

the problem:
the partitioned view requires a primary key, the source table however does
not have either a primary nor a unique key and does have several duplicate
values

i want to destination to only have unique values, so i would like to ignore
errors (duplicates) during the insert into...

MySQL has an "IGNORE" hint/option for the insert into statement, which does
exactly that, is there something similar for SQL Server?

things i don't want to do:
use some kind of "not exists in" sub query.. the performance is way to slow
same goes for a "group by" of the unique column fields..
(the source table doesn't have ANY indices/keys, not my fault btw! ;)

also the destination is currently being written to, so can't drop/recreate
the source or the destination

any help would be appreciated!

Author
20 Jul 2006 11:21 AM
Adi
You can try using select distinct.  This should get unique records
only.  Not sure about the performance because I don't know how many
records it will have to check for uniqueness

Adi

R. Nachtsturm wrote:
Show quote
> Hi,
>
> I have the following situation:
> a partitioned view as the target of a "insert into", with the source being
> an "select from" statement.
>
> the problem:
> the partitioned view requires a primary key, the source table however does
> not have either a primary nor a unique key and does have several duplicate
> values
>
> i want to destination to only have unique values, so i would like to ignore
> errors (duplicates) during the insert into...
>
> MySQL has an "IGNORE" hint/option for the insert into statement, which does
> exactly that, is there something similar for SQL Server?
>
> things i don't want to do:
> use some kind of "not exists in" sub query.. the performance is way to slow
> same goes for a "group by" of the unique column fields..
> (the source table doesn't have ANY indices/keys, not my fault btw! ;)
>
> also the destination is currently being written to, so can't drop/recreate
> the source or the destination
>
> any help would be appreciated!
Author
20 Jul 2006 11:31 AM
Omnibuzz
Do you have duplicate rows or keys? Can you give an example of what you are
trying to achieve.


--
-Omnibuzz (The SQL GC)

http://omnibuzz-sql.blogspot.com/



Show quote
"R. Nachtsturm" wrote:

> Hi,
>
> I have the following situation:
> a partitioned view as the target of a "insert into", with the source being
> an "select from" statement.
>
> the problem:
> the partitioned view requires a primary key, the source table however does
> not have either a primary nor a unique key and does have several duplicate
> values
>
> i want to destination to only have unique values, so i would like to ignore
> errors (duplicates) during the insert into...
>
> MySQL has an "IGNORE" hint/option for the insert into statement, which does
> exactly that, is there something similar for SQL Server?
>
> things i don't want to do:
> use some kind of "not exists in" sub query.. the performance is way to slow
> same goes for a "group by" of the unique column fields..
> (the source table doesn't have ANY indices/keys, not my fault btw! ;)
>
> also the destination is currently being written to, so can't drop/recreate
> the source or the destination
>
> any help would be appreciated!
Author
20 Jul 2006 1:49 PM
R. Nachtsturm
Both, well if the row has a duplicate then the any key for that row would be
a duplicate too...

bascially i have one schema where they have one table for each object:
i.e. ObjectID101 (table name)
with columns: Val, Tim (for value and time)

i want to create another table (in another schema) that combines records
from multiple objects (tables in the old schema) into the new table (and
schema), i.e
Data (table name)
with columns: ID, VAL, TIM (for object id, value and time)
on the new table i have set the primary key on the combination of ID and TIM.

example data for old ObjectID101 (TIM as sort of timestamp for this example)
[VAL], [TIM]
1,1
1,1
2,2
2,3
2,3

example data for new Data table:
[ID],[VAL],[TIM]
101,2,2
101,2,3
101,4,5
201,2,3
201,3,4

so i have the problem that some records in the old schema do not exist in
the new schema, some do, and some of those records that don't exist in the
new schema exists in duplicats in the old schema... and i want to get all
recors without duplicates into the new schema as fast as possible

one cuase of the problem is that the new schema is constantly getting new
records, the same records however are also added to the old schema

another problem is, like i said, that the old object tables do not have any
indecies and i cannot modify them since i only have read only access...

well i hope you there still is great simple, fast solution here :)
otherwise my only option is to use 'group by key where no exists in
target'?... or to make a small app. that tries to insert record by record
into the new table .. but that will take even longer

anyhow,
thanks for the help!

Show quote
"Omnibuzz" wrote:

> Do you have duplicate rows or keys? Can you give an example of what you are
> trying to achieve.
>
>
> --
> -Omnibuzz (The SQL GC)
>
> http://omnibuzz-sql.blogspot.com/
>
>
>
> "R. Nachtsturm" wrote:
>
> > Hi,
> >
> > I have the following situation:
> > a partitioned view as the target of a "insert into", with the source being
> > an "select from" statement.
> >
> > the problem:
> > the partitioned view requires a primary key, the source table however does
> > not have either a primary nor a unique key and does have several duplicate
> > values
> >
> > i want to destination to only have unique values, so i would like to ignore
> > errors (duplicates) during the insert into...
> >
> > MySQL has an "IGNORE" hint/option for the insert into statement, which does
> > exactly that, is there something similar for SQL Server?
> >
> > things i don't want to do:
> > use some kind of "not exists in" sub query.. the performance is way to slow
> > same goes for a "group by" of the unique column fields..
> > (the source table doesn't have ANY indices/keys, not my fault btw! ;)
> >
> > also the destination is currently being written to, so can't drop/recreate
> > the source or the destination
> >
> > any help would be appreciated!
Author
20 Jul 2006 3:17 PM
Arnie Rowland
> one cuase of the problem is that the new schema is constantly getting new
> records, the same records however are also added to the old schema

Are new records being added to the 'old schema' BUT not to the 'new schema'?
In order words, is this a one time need, or an ongoing situation?

--
Arnie Rowland
Most good judgment comes from experience.
Most experience comes from bad judgment.
- Anonymous


Show quote
"R. Nachtsturm" <RNachtst***@discussions.microsoft.com> wrote in message
news:CEAFF93E-557B-4169-B2AB-54006471FB1A@microsoft.com...
> Both, well if the row has a duplicate then the any key for that row would
> be
> a duplicate too...
>
> bascially i have one schema where they have one table for each object:
> i.e. ObjectID101 (table name)
> with columns: Val, Tim (for value and time)
>
> i want to create another table (in another schema) that combines records
> from multiple objects (tables in the old schema) into the new table (and
> schema), i.e
> Data (table name)
> with columns: ID, VAL, TIM (for object id, value and time)
> on the new table i have set the primary key on the combination of ID and
> TIM.
>
> example data for old ObjectID101 (TIM as sort of timestamp for this
> example)
> [VAL], [TIM]
> 1,1
> 1,1
> 2,2
> 2,3
> 2,3
>
> example data for new Data table:
> [ID],[VAL],[TIM]
> 101,2,2
> 101,2,3
> 101,4,5
> 201,2,3
> 201,3,4
>
> so i have the problem that some records in the old schema do not exist in
> the new schema, some do, and some of those records that don't exist in the
> new schema exists in duplicats in the old schema... and i want to get all
> recors without duplicates into the new schema as fast as possible
>
> one cuase of the problem is that the new schema is constantly getting new
> records, the same records however are also added to the old schema
>
> another problem is, like i said, that the old object tables do not have
> any
> indecies and i cannot modify them since i only have read only access...
>
> well i hope you there still is great simple, fast solution here :)
> otherwise my only option is to use 'group by key where no exists in
> target'?... or to make a small app. that tries to insert record by record
> into the new table .. but that will take even longer
>
> anyhow,
> thanks for the help!
>
> "Omnibuzz" wrote:
>
>> Do you have duplicate rows or keys? Can you give an example of what you
>> are
>> trying to achieve.
>>
>>
>> --
>> -Omnibuzz (The SQL GC)
>>
>> http://omnibuzz-sql.blogspot.com/
>>
>>
>>
>> "R. Nachtsturm" wrote:
>>
>> > Hi,
>> >
>> > I have the following situation:
>> > a partitioned view as the target of a "insert into", with the source
>> > being
>> > an "select from" statement.
>> >
>> > the problem:
>> > the partitioned view requires a primary key, the source table however
>> > does
>> > not have either a primary nor a unique key and does have several
>> > duplicate
>> > values
>> >
>> > i want to destination to only have unique values, so i would like to
>> > ignore
>> > errors (duplicates) during the insert into...
>> >
>> > MySQL has an "IGNORE" hint/option for the insert into statement, which
>> > does
>> > exactly that, is there something similar for SQL Server?
>> >
>> > things i don't want to do:
>> > use some kind of "not exists in" sub query.. the performance is way to
>> > slow
>> > same goes for a "group by" of the unique column fields..
>> > (the source table doesn't have ANY indices/keys, not my fault btw! ;)
>> >
>> > also the destination is currently being written to, so can't
>> > drop/recreate
>> > the source or the destination
>> >
>> > any help would be appreciated!
Author
20 Jul 2006 4:39 PM
R. Nachtsturm
it is a one time thing, needing to sync the new schema with the old, but
since nobody wants so loose any data or allow any downtime, i have to try to
add all the records missing in the new schema from the old one..

but its ok now, i'm just going to delete all the records of the new schema,
note the time and then just import all records of the old schema with time
less then the noted time.. and i wil use a "group by" of the key components
to make sure i'm only getting unique entries...

this will take a while though... but i can just leave it running for a few
days.. (i do hope its done before the weekend ;)

anyhow, thanks for the help!!

Show quote
"Arnie Rowland" wrote:

> > one cuase of the problem is that the new schema is constantly getting new
> > records, the same records however are also added to the old schema
>
> Are new records being added to the 'old schema' BUT not to the 'new schema'?
> In order words, is this a one time need, or an ongoing situation?
>
> --
> Arnie Rowland
> Most good judgment comes from experience.
> Most experience comes from bad judgment.
> - Anonymous
>
>
> "R. Nachtsturm" <RNachtst***@discussions.microsoft.com> wrote in message
> news:CEAFF93E-557B-4169-B2AB-54006471FB1A@microsoft.com...
> > Both, well if the row has a duplicate then the any key for that row would
> > be
> > a duplicate too...
> >
> > bascially i have one schema where they have one table for each object:
> > i.e. ObjectID101 (table name)
> > with columns: Val, Tim (for value and time)
> >
> > i want to create another table (in another schema) that combines records
> > from multiple objects (tables in the old schema) into the new table (and
> > schema), i.e
> > Data (table name)
> > with columns: ID, VAL, TIM (for object id, value and time)
> > on the new table i have set the primary key on the combination of ID and
> > TIM.
> >
> > example data for old ObjectID101 (TIM as sort of timestamp for this
> > example)
> > [VAL], [TIM]
> > 1,1
> > 1,1
> > 2,2
> > 2,3
> > 2,3
> >
> > example data for new Data table:
> > [ID],[VAL],[TIM]
> > 101,2,2
> > 101,2,3
> > 101,4,5
> > 201,2,3
> > 201,3,4
> >
> > so i have the problem that some records in the old schema do not exist in
> > the new schema, some do, and some of those records that don't exist in the
> > new schema exists in duplicats in the old schema... and i want to get all
> > recors without duplicates into the new schema as fast as possible
> >
> > one cuase of the problem is that the new schema is constantly getting new
> > records, the same records however are also added to the old schema
> >
> > another problem is, like i said, that the old object tables do not have
> > any
> > indecies and i cannot modify them since i only have read only access...
> >
> > well i hope you there still is great simple, fast solution here :)
> > otherwise my only option is to use 'group by key where no exists in
> > target'?... or to make a small app. that tries to insert record by record
> > into the new table .. but that will take even longer
> >
> > anyhow,
> > thanks for the help!
> >
> > "Omnibuzz" wrote:
> >
> >> Do you have duplicate rows or keys? Can you give an example of what you
> >> are
> >> trying to achieve.
> >>
> >>
> >> --
> >> -Omnibuzz (The SQL GC)
> >>
> >> http://omnibuzz-sql.blogspot.com/
> >>
> >>
> >>
> >> "R. Nachtsturm" wrote:
> >>
> >> > Hi,
> >> >
> >> > I have the following situation:
> >> > a partitioned view as the target of a "insert into", with the source
> >> > being
> >> > an "select from" statement.
> >> >
> >> > the problem:
> >> > the partitioned view requires a primary key, the source table however
> >> > does
> >> > not have either a primary nor a unique key and does have several
> >> > duplicate
> >> > values
> >> >
> >> > i want to destination to only have unique values, so i would like to
> >> > ignore
> >> > errors (duplicates) during the insert into...
> >> >
> >> > MySQL has an "IGNORE" hint/option for the insert into statement, which
> >> > does
> >> > exactly that, is there something similar for SQL Server?
> >> >
> >> > things i don't want to do:
> >> > use some kind of "not exists in" sub query.. the performance is way to
> >> > slow
> >> > same goes for a "group by" of the unique column fields..
> >> > (the source table doesn't have ANY indices/keys, not my fault btw! ;)
> >> >
> >> > also the destination is currently being written to, so can't
> >> > drop/recreate
> >> > the source or the destination
> >> >
> >> > any help would be appreciated!
>
>
>
Author
20 Jul 2006 11:58 AM
Roy Harvey
On Thu, 20 Jul 2006 04:04:01 -0700, R. Nachtsturm
<RNachtst***@discussions.microsoft.com> wrote:

>things i don't want to do:
>use some kind of "not exists in" sub query.. the performance is way to slow
>same goes for a "group by" of the unique column fields..
>(the source table doesn't have ANY indices/keys, not my fault btw! ;)

So what is preventing you from adding an index or two to the source
table?  You need not make any of them clustered, and you need not make
any of them unique, so they could be dropped when you are done without
changing anything.

Also, I can understand that without indexes subqueries are going to be
very slow, but I can't see any reason why GROUP BY on the columns you
want to be the key would be.  Have you actually tried it, or are you
just assuming?

Roy Harvey
Beacon Falls, CT
Author
20 Jul 2006 4:40 PM
R. Nachtsturm
Hi,

now i have tried it.. it's faster then i though but it will still take a while

thanks!

Show quote
"Roy Harvey" wrote:

> On Thu, 20 Jul 2006 04:04:01 -0700, R. Nachtsturm
> <RNachtst***@discussions.microsoft.com> wrote:
>
> >things i don't want to do:
> >use some kind of "not exists in" sub query.. the performance is way to slow
> >same goes for a "group by" of the unique column fields..
> >(the source table doesn't have ANY indices/keys, not my fault btw! ;)
>
> So what is preventing you from adding an index or two to the source
> table?  You need not make any of them clustered, and you need not make
> any of them unique, so they could be dropped when you are done without
> changing anything.
>
> Also, I can understand that without indexes subqueries are going to be
> very slow, but I can't see any reason why GROUP BY on the columns you
> want to be the key would be.  Have you actually tried it, or are you
> just assuming?
>
> Roy Harvey
> Beacon Falls, CT
>

AddThis Social Bookmark Button