Home All Groups Group Topic Archive Search About

Most efficient way to update/insert order details?!

Author
26 Aug 2005 6:07 PM
JorgeC
Hi!
I've been working on a stock management software for a couple of months.
The user creates/changes stock documents which have a header and several
hundreds (usually) lines.
Now, when we change a stock document (deleting, adding, changing lines) the
current software deletes all rows before saving the document (we're using SQL
Server and Access databases).
The person responsible for this design says it's faster to delete and insert
all lines, than to update the lines which have been changed + deleting the
lines that were marked to be deleted + inserting the new lines.

Basically, he is saying that an Update operation is slower than a DELETE +
INSERT operation. Is this true? How have you solved this problem?
ATTENTION: the lines table does not have primary keys, because of this
DELETE+INSERT stuff...

From my newbie viewpoint I would use the UPDATE methodology....

I really need to understand why! So, can anyone explain this to me? Please?

Thanks for your help and attention.
Best Regards,
Jorge C.

Author
26 Aug 2005 7:03 PM
David Gugick
JorgeC wrote:
Show quote
> Hi!
> I've been working on a stock management software for a couple of
> months.
> The user creates/changes stock documents which have a header and
> several hundreds (usually) lines.
> Now, when we change a stock document (deleting, adding, changing
> lines) the current software deletes all rows before saving the
> document (we're using SQL Server and Access databases).
> The person responsible for this design says it's faster to delete and
> insert all lines, than to update the lines which have been changed +
> deleting the lines that were marked to be deleted + inserting the new
> lines.
>
> Basically, he is saying that an Update operation is slower than a
> DELETE + INSERT operation. Is this true? How have you solved this
> problem?
> ATTENTION: the lines table does not have primary keys, because of this
> DELETE+INSERT stuff...
>

We really need to understand how the lines table cannot have a primary
key... Without a key neither updates nor deletes are going to run with
any expediency. Plus you expose possible data integrity issues.

Once you have the proper keys in place, and generally speaking, I would
say the overhead is about equal. That is, a delete and insert is about
the same as an update. I would use an update, personally, unless there
was a proven performance benefit to the system and the update operation
was causing some performance issues that could not be addressed any
other way.

What performance issues was your system having that the "designer" chose
to perform a full delete and insert. Certainly, more rows are affected
that way. And there are no keys in place, which makes me wonder if a
table scan occurs every time you perform this procedure...

--
David Gugick
Quest Software
www.imceda.com
www.quest.com
Author
26 Aug 2005 11:35 PM
JorgeC
Hi!
The table doesn't have primary keys because the "designer" says it will
improve the DELETE all rows + insert all rows performance.
Also the designer says both Access and SQL Server take longer to release the
lock after an UPdate than  a DELETE+INSERT...

The software is eing used by dozens of users at the same time, probably
selling the same products, so locks are an issue and this is the reason this
design has been chosen...
But I still can't get how is it that deleting + inserting is faster than
updating.
It's just awkward to delete hundreds of lines (very often, specially when
dealing with stock documents) just to update that one line hte user has
changed or added.

Plus, how come SQL Server doesn't return/raise an error when there is a
primary key violation?! It takes its time to "figure out" there a
violation... So we're already half way through our  "saving the order header
+ all those related objects" when we get an error from SQL server saying:
Remember that first insert you have made? Well, the primary key is invalid,
please repeat the operation with a new key.
Unbelievable, I really need to my hands on a high concurrency system to see
how these problems have been solved.

Once again, any ideas are always appreciated.
Jorge C.

Show quote
"David Gugick" wrote:

> JorgeC wrote:
> > Hi!
> > I've been working on a stock management software for a couple of
> > months.
> > The user creates/changes stock documents which have a header and
> > several hundreds (usually) lines.
> > Now, when we change a stock document (deleting, adding, changing
> > lines) the current software deletes all rows before saving the
> > document (we're using SQL Server and Access databases).
> > The person responsible for this design says it's faster to delete and
> > insert all lines, than to update the lines which have been changed +
> > deleting the lines that were marked to be deleted + inserting the new
> > lines.
> >
> > Basically, he is saying that an Update operation is slower than a
> > DELETE + INSERT operation. Is this true? How have you solved this
> > problem?
> > ATTENTION: the lines table does not have primary keys, because of this
> > DELETE+INSERT stuff...
> >
>
> We really need to understand how the lines table cannot have a primary
> key... Without a key neither updates nor deletes are going to run with
> any expediency. Plus you expose possible data integrity issues.
>
> Once you have the proper keys in place, and generally speaking, I would
> say the overhead is about equal. That is, a delete and insert is about
> the same as an update. I would use an update, personally, unless there
> was a proven performance benefit to the system and the update operation
> was causing some performance issues that could not be addressed any
> other way.
>
> What performance issues was your system having that the "designer" chose
> to perform a full delete and insert. Certainly, more rows are affected
> that way. And there are no keys in place, which makes me wonder if a
> table scan occurs every time you perform this procedure...
>
> --
> David Gugick
> Quest Software
> www.imceda.com
> www.quest.com
>
>
Author
28 Aug 2005 11:18 PM
Hugo Kornelis
On Fri, 26 Aug 2005 16:35:02 -0700, JorgeC wrote:

>Hi!
>The table doesn't have primary keys because the "designer" says it will
>improve the DELETE all rows + insert all rows performance.
>Also the designer says both Access and SQL Server take longer to release the
>lock after an UPdate than  a DELETE+INSERT...
>
>The software is eing used by dozens of users at the same time, probably
>selling the same products, so locks are an issue and this is the reason this
>design has been chosen...

Hi Jorge,

I really can't believe this. Seriously: I wouldn't allow any "designer"
who dares to suggest to delete and re-insert live data on a live
database near one of my databases. Deleting and re-inserting during off
hours can sometimes be a good idea - but NEVER, NEVER, NEVER during
business hours. PLEASE, change this!

Also, SQL Server releases locks as soon as the transaction is either
committed or rolled back. Your designer apparently knows &%#$ about SQL
Server.

>But I still can't get how is it that deleting + inserting is faster than
>updating.
>It's just awkward to delete hundreds of lines (very often, specially when
>dealing with stock documents) just to update that one line hte user has
>changed or added.

Of course you can't get that, because it's rubbish. There is no way that
deleting and inserting hundreds of rows will ever be faster than
updating one row - I'm pretty sure that if I do everything I can to
optimize the delete+re-insert AND do everything I can to slow down the
update of the single row, the latter will still run rings around the
former.

>Plus, how come SQL Server doesn't return/raise an error when there is a
>primary key violation?! It takes its time to "figure out" there a
>violation... So we're already half way through our  "saving the order header
>+ all those related objects" when we get an error from SQL server saying:
>Remember that first insert you have made? Well, the primary key is invalid,
>please repeat the operation with a new key.

Please post the code your system uses, or at least enough of it to allow
me (and others) to reproduce this. The person who designed your system
must have used some pretty obscure techniques to force this strange
behaviour. If you just INSERT a row in the order header that violates
the primary key, you get an error - at once.

Ah, but I forgot - yopur "designer" chose not to have any primary keys,
so he obviously implemented the checking and error handling himself. And
he screwed up that part of the job just as much as every other part.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Author
26 Aug 2005 10:48 PM
Hugo Kornelis
On Fri, 26 Aug 2005 11:07:03 -0700, JorgeC wrote:

(snip)
>ATTENTION: the lines table does not have primary keys, because of this
>DELETE+INSERT stuff...

Hi Jorge,

This is nonsense. The DELETE+INSERT stuff doesn't preclude the use of
primary keys. It does preclude the use of foreign key constraints; maybe
that is what you wanted to write?

Anyhow, even in that case, I'd prefer to have the foreign key
constraints and choose another way to update the data - integrity is
worth a million times more than performance!

>I really need to understand why! So, can anyone explain this to me? Please?

The true answer is: it depends.

If the update approach would have to delete some rows, insert some new
ones and update a few existing rows, then simply deleting all and
re-inserting might indeed be faster.

If there would be limited deletes and inserts, but lots of updates, it
depends: if most updates can be done "in-place", the update version has
a good chance to win; if the updates can't be done "in-place", then the
delete/insert has the better papers.
(Note: update in place can always be done if only fixed-length columns
change, and can sometimes be done if variable-length columns change).

If there would be limited or no deletes and inserts and limited updates,
the delete/insert would almost certainly be slower.

All this is also influenced by the number of indexed columns, and
whether the column for the clustered index can be updated as well.

In short - test both versions to be sure.

Note: If you want to test the UPDATE version, make sure to use the best
performing order:

1. DELETE rows that have been deleted from other table;
2. UPDATE rows that are in both tables but are not equal;
3. INSERT rows that are in other table but not in this table.

This order minimizes the amount of work to do.
Also, be sure to enclose these statements in a transaction and include
proper error handling, if you don't want to end up with only half the
work being done.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Author
26 Aug 2005 11:40 PM
JorgeC
> This is nonsense. The DELETE+INSERT stuff doesn't preclude the use of
> primary keys. It does preclude the use of foreign key constraints; maybe
> that is what you wanted to write?

Yes, but aren't primary keys normally used to speed up access time(thus
improving delete performance, but descreasing insert performance) and enforce
integrity?

Jorge C.


Show quote
"Hugo Kornelis" wrote:

> On Fri, 26 Aug 2005 11:07:03 -0700, JorgeC wrote:
>
> (snip)
> >ATTENTION: the lines table does not have primary keys, because of this
> >DELETE+INSERT stuff...
>
> Hi Jorge,
>
> This is nonsense. The DELETE+INSERT stuff doesn't preclude the use of
> primary keys. It does preclude the use of foreign key constraints; maybe
> that is what you wanted to write?
>
> Anyhow, even in that case, I'd prefer to have the foreign key
> constraints and choose another way to update the data - integrity is
> worth a million times more than performance!
>
> >I really need to understand why! So, can anyone explain this to me? Please?
>
> The true answer is: it depends.
>
> If the update approach would have to delete some rows, insert some new
> ones and update a few existing rows, then simply deleting all and
> re-inserting might indeed be faster.
>
> If there would be limited deletes and inserts, but lots of updates, it
> depends: if most updates can be done "in-place", the update version has
> a good chance to win; if the updates can't be done "in-place", then the
> delete/insert has the better papers.
> (Note: update in place can always be done if only fixed-length columns
> change, and can sometimes be done if variable-length columns change).
>
> If there would be limited or no deletes and inserts and limited updates,
> the delete/insert would almost certainly be slower.
>
> All this is also influenced by the number of indexed columns, and
> whether the column for the clustered index can be updated as well.
>
> In short - test both versions to be sure.
>
> Note: If you want to test the UPDATE version, make sure to use the best
> performing order:
>
> 1. DELETE rows that have been deleted from other table;
> 2. UPDATE rows that are in both tables but are not equal;
> 3. INSERT rows that are in other table but not in this table.
>
> This order minimizes the amount of work to do.
> Also, be sure to enclose these statements in a transaction and include
> proper error handling, if you don't want to end up with only half the
> work being done.
>
> Best, Hugo
> --
>
> (Remove _NO_ and _SPAM_ to get my e-mail address)
>
Author
27 Aug 2005 1:18 AM
David Gugick
JorgeC wrote:
>> This is nonsense. The DELETE+INSERT stuff doesn't preclude the use of
>> primary keys. It does preclude the use of foreign key constraints;
>> maybe that is what you wanted to write?
>
> Yes, but aren't primary keys normally used to speed up access
> time(thus improving delete performance, but descreasing insert
> performance) and enforce integrity?

Primary keys are only used to enforce data integrity. They are a logical
concept, despite their physical nature once added to a table. Indexing
is used to speed data access.

Are you deleting all rows from the table? Your last reply to me seemed
to indicate this was a complete delete of all rows. If not, then you
need some indexing so SQL Server can find those rows to update or delete
with some speed. And if you used a clustered index, keeping those rows
together, that would help too. And if you were deleting on the primary
key, despite the fact that one is not implemented, you could just use a
clustered primary key.

You should always start with the proven methods and fallback to the
"different" only when there's no other way to get the performance you
need. But that should happen almost never.


--
David Gugick
Quest Software
www.imceda.com
www.quest.com
Author
28 Aug 2005 11:08 PM
Hugo Kornelis
On Fri, 26 Aug 2005 16:40:03 -0700, JorgeC wrote:

>> This is nonsense. The DELETE+INSERT stuff doesn't preclude the use of
>> primary keys. It does preclude the use of foreign key constraints; maybe
>> that is what you wanted to write?
>
>Yes, but aren't primary keys normally used to speed up access time(thus
>improving delete performance, but descreasing insert performance) and enforce
>integrity?

Hi Jorge,

No.

First the theory. PRIMARY KEY, UNIQUE, FOREIGN KEY and CHECK are all
_constraints_. The sole purpose of a constraint is to enforce integrity;
they have nothing whatsoever to do with speed. Without integrity, your
database is useless. I don't care if I get my answer in 20 seconds or in
3 nanoseconds if I can't be sure that it's correct.

Once integrity is assured, optimizing for speed is the second step.
Adding indexes is one of the best ways to improve speed. They can also
help improve concurrency. But be aware that, while indexes can improve
the time to find data, they also hurt the performance of modifications
of your data.

Now, the practice. If you declare a PRIMARY KEY or UNIQUE constraint,
SQL Server will immediately create an index on the column(s) that are
contained in the constraint. So if you work on your integrity, you'll
get improved performance as a bonus - though the indexes created for the
constraint might not be optimal for all your queries, they'll certainly
benefit some of them. (Of course, the main reason for SQL Server to
create these indexes is to speed up the checking of the constraint).

But the bottom line: never use PRIMARY KEY or UNIQUE as a performance
tool (*); use them only to maintain integrity. If you need further
optimization, use the CREATE INDEX statement to create extra indexes
(that can be both UNIQUE or NONUNIQUE).

(*) Changing the options to influence the type of index created by SQL
Server for these constraints CAN be a valuable strategy, though.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Author
28 Aug 2005 4:56 PM
Brian Selzer
> Basically, he is saying that an Update operation is slower than a DELETE +
> INSERT operation. Is this true? How have you solved this problem?
> ATTENTION: the lines table does not have primary keys, because of this
> DELETE+INSERT stuff...

It is not always true.  Depending on the number of indexes and the extent of
the updates, issuing DELETE...UPDATE...INSERT can perform much, much better
than issuing DELETE...INSERT.  Of course, since you don't have a primary key
constraint on the stock lines table, even the DELETE...INSERT will run slow,
because the system is probably issuing a table scan.  I think that the lack
of a primary key constraint is an even bigger problem.

Integrity should be maintained by the database, not the application.  You
should have a primary key constraint on every, EVERY table, and you should
enforce referential integrity on every relationship either by using
declarative foreign key constraints, or if necessary, triggers.  The only
exceptions to this rule are temporary tables or work tables--neither of
which should reside in the same database.  You should also have check
constraints that mirror the validation done in the client application to
ensure column integrity and row integrity, and procedural constraints
(triggers) to deal with dependencies between rows in the same table, such as
preventing overlapping time intervals.


Show quote
"JorgeC" <Jor***@discussions.microsoft.com> wrote in message
news:3E91960F-50BC-4082-9CC7-2E5C147AB99D@microsoft.com...
> Hi!
> I've been working on a stock management software for a couple of months.
> The user creates/changes stock documents which have a header and several
> hundreds (usually) lines.
> Now, when we change a stock document (deleting, adding, changing lines)
the
> current software deletes all rows before saving the document (we're using
SQL
> Server and Access databases).
> The person responsible for this design says it's faster to delete and
insert
> all lines, than to update the lines which have been changed + deleting the
> lines that were marked to be deleted + inserting the new lines.
>
> Basically, he is saying that an Update operation is slower than a DELETE +
> INSERT operation. Is this true? How have you solved this problem?
> ATTENTION: the lines table does not have primary keys, because of this
> DELETE+INSERT stuff...
>
> From my newbie viewpoint I would use the UPDATE methodology....
>
> I really need to understand why! So, can anyone explain this to me?
Please?
>
> Thanks for your help and attention.
> Best Regards,
> Jorge C.

AddThis Social Bookmark Button