|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Most efficient way to update/insert order details?!I've been working on a stock management software for a couple of months. The user creates/changes stock documents which have a header and several hundreds (usually) lines. Now, when we change a stock document (deleting, adding, changing lines) the current software deletes all rows before saving the document (we're using SQL Server and Access databases). The person responsible for this design says it's faster to delete and insert all lines, than to update the lines which have been changed + deleting the lines that were marked to be deleted + inserting the new lines. Basically, he is saying that an Update operation is slower than a DELETE + INSERT operation. Is this true? How have you solved this problem? ATTENTION: the lines table does not have primary keys, because of this DELETE+INSERT stuff... From my newbie viewpoint I would use the UPDATE methodology.... I really need to understand why! So, can anyone explain this to me? Please? Thanks for your help and attention. Best Regards, Jorge C. JorgeC wrote:
Show quote > Hi! We really need to understand how the lines table cannot have a primary > I've been working on a stock management software for a couple of > months. > The user creates/changes stock documents which have a header and > several hundreds (usually) lines. > Now, when we change a stock document (deleting, adding, changing > lines) the current software deletes all rows before saving the > document (we're using SQL Server and Access databases). > The person responsible for this design says it's faster to delete and > insert all lines, than to update the lines which have been changed + > deleting the lines that were marked to be deleted + inserting the new > lines. > > Basically, he is saying that an Update operation is slower than a > DELETE + INSERT operation. Is this true? How have you solved this > problem? > ATTENTION: the lines table does not have primary keys, because of this > DELETE+INSERT stuff... > key... Without a key neither updates nor deletes are going to run with any expediency. Plus you expose possible data integrity issues. Once you have the proper keys in place, and generally speaking, I would say the overhead is about equal. That is, a delete and insert is about the same as an update. I would use an update, personally, unless there was a proven performance benefit to the system and the update operation was causing some performance issues that could not be addressed any other way. What performance issues was your system having that the "designer" chose to perform a full delete and insert. Certainly, more rows are affected that way. And there are no keys in place, which makes me wonder if a table scan occurs every time you perform this procedure... Hi!
The table doesn't have primary keys because the "designer" says it will improve the DELETE all rows + insert all rows performance. Also the designer says both Access and SQL Server take longer to release the lock after an UPdate than a DELETE+INSERT... The software is eing used by dozens of users at the same time, probably selling the same products, so locks are an issue and this is the reason this design has been chosen... But I still can't get how is it that deleting + inserting is faster than updating. It's just awkward to delete hundreds of lines (very often, specially when dealing with stock documents) just to update that one line hte user has changed or added. Plus, how come SQL Server doesn't return/raise an error when there is a primary key violation?! It takes its time to "figure out" there a violation... So we're already half way through our "saving the order header + all those related objects" when we get an error from SQL server saying: Remember that first insert you have made? Well, the primary key is invalid, please repeat the operation with a new key. Unbelievable, I really need to my hands on a high concurrency system to see how these problems have been solved. Once again, any ideas are always appreciated. Jorge C. Show quote "David Gugick" wrote: > JorgeC wrote: > > Hi! > > I've been working on a stock management software for a couple of > > months. > > The user creates/changes stock documents which have a header and > > several hundreds (usually) lines. > > Now, when we change a stock document (deleting, adding, changing > > lines) the current software deletes all rows before saving the > > document (we're using SQL Server and Access databases). > > The person responsible for this design says it's faster to delete and > > insert all lines, than to update the lines which have been changed + > > deleting the lines that were marked to be deleted + inserting the new > > lines. > > > > Basically, he is saying that an Update operation is slower than a > > DELETE + INSERT operation. Is this true? How have you solved this > > problem? > > ATTENTION: the lines table does not have primary keys, because of this > > DELETE+INSERT stuff... > > > > We really need to understand how the lines table cannot have a primary > key... Without a key neither updates nor deletes are going to run with > any expediency. Plus you expose possible data integrity issues. > > Once you have the proper keys in place, and generally speaking, I would > say the overhead is about equal. That is, a delete and insert is about > the same as an update. I would use an update, personally, unless there > was a proven performance benefit to the system and the update operation > was causing some performance issues that could not be addressed any > other way. > > What performance issues was your system having that the "designer" chose > to perform a full delete and insert. Certainly, more rows are affected > that way. And there are no keys in place, which makes me wonder if a > table scan occurs every time you perform this procedure... > > -- > David Gugick > Quest Software > www.imceda.com > www.quest.com > > On Fri, 26 Aug 2005 16:35:02 -0700, JorgeC wrote:
>Hi! Hi Jorge,>The table doesn't have primary keys because the "designer" says it will >improve the DELETE all rows + insert all rows performance. >Also the designer says both Access and SQL Server take longer to release the >lock after an UPdate than a DELETE+INSERT... > >The software is eing used by dozens of users at the same time, probably >selling the same products, so locks are an issue and this is the reason this >design has been chosen... I really can't believe this. Seriously: I wouldn't allow any "designer" who dares to suggest to delete and re-insert live data on a live database near one of my databases. Deleting and re-inserting during off hours can sometimes be a good idea - but NEVER, NEVER, NEVER during business hours. PLEASE, change this! Also, SQL Server releases locks as soon as the transaction is either committed or rolled back. Your designer apparently knows &%#$ about SQL Server. >But I still can't get how is it that deleting + inserting is faster than Of course you can't get that, because it's rubbish. There is no way that>updating. >It's just awkward to delete hundreds of lines (very often, specially when >dealing with stock documents) just to update that one line hte user has >changed or added. deleting and inserting hundreds of rows will ever be faster than updating one row - I'm pretty sure that if I do everything I can to optimize the delete+re-insert AND do everything I can to slow down the update of the single row, the latter will still run rings around the former. >Plus, how come SQL Server doesn't return/raise an error when there is a Please post the code your system uses, or at least enough of it to allow>primary key violation?! It takes its time to "figure out" there a >violation... So we're already half way through our "saving the order header >+ all those related objects" when we get an error from SQL server saying: >Remember that first insert you have made? Well, the primary key is invalid, >please repeat the operation with a new key. me (and others) to reproduce this. The person who designed your system must have used some pretty obscure techniques to force this strange behaviour. If you just INSERT a row in the order header that violates the primary key, you get an error - at once. Ah, but I forgot - yopur "designer" chose not to have any primary keys, so he obviously implemented the checking and error handling himself. And he screwed up that part of the job just as much as every other part. Best, Hugo -- (Remove _NO_ and _SPAM_ to get my e-mail address) On Fri, 26 Aug 2005 11:07:03 -0700, JorgeC wrote:
(snip) >ATTENTION: the lines table does not have primary keys, because of this Hi Jorge,>DELETE+INSERT stuff... This is nonsense. The DELETE+INSERT stuff doesn't preclude the use of primary keys. It does preclude the use of foreign key constraints; maybe that is what you wanted to write? Anyhow, even in that case, I'd prefer to have the foreign key constraints and choose another way to update the data - integrity is worth a million times more than performance! >I really need to understand why! So, can anyone explain this to me? Please? The true answer is: it depends.If the update approach would have to delete some rows, insert some new ones and update a few existing rows, then simply deleting all and re-inserting might indeed be faster. If there would be limited deletes and inserts, but lots of updates, it depends: if most updates can be done "in-place", the update version has a good chance to win; if the updates can't be done "in-place", then the delete/insert has the better papers. (Note: update in place can always be done if only fixed-length columns change, and can sometimes be done if variable-length columns change). If there would be limited or no deletes and inserts and limited updates, the delete/insert would almost certainly be slower. All this is also influenced by the number of indexed columns, and whether the column for the clustered index can be updated as well. In short - test both versions to be sure. Note: If you want to test the UPDATE version, make sure to use the best performing order: 1. DELETE rows that have been deleted from other table; 2. UPDATE rows that are in both tables but are not equal; 3. INSERT rows that are in other table but not in this table. This order minimizes the amount of work to do. Also, be sure to enclose these statements in a transaction and include proper error handling, if you don't want to end up with only half the work being done. Best, Hugo -- (Remove _NO_ and _SPAM_ to get my e-mail address) > This is nonsense. The DELETE+INSERT stuff doesn't preclude the use of Yes, but aren't primary keys normally used to speed up access time(thus > primary keys. It does preclude the use of foreign key constraints; maybe > that is what you wanted to write? improving delete performance, but descreasing insert performance) and enforce integrity? Jorge C. Show quote "Hugo Kornelis" wrote: > On Fri, 26 Aug 2005 11:07:03 -0700, JorgeC wrote: > > (snip) > >ATTENTION: the lines table does not have primary keys, because of this > >DELETE+INSERT stuff... > > Hi Jorge, > > This is nonsense. The DELETE+INSERT stuff doesn't preclude the use of > primary keys. It does preclude the use of foreign key constraints; maybe > that is what you wanted to write? > > Anyhow, even in that case, I'd prefer to have the foreign key > constraints and choose another way to update the data - integrity is > worth a million times more than performance! > > >I really need to understand why! So, can anyone explain this to me? Please? > > The true answer is: it depends. > > If the update approach would have to delete some rows, insert some new > ones and update a few existing rows, then simply deleting all and > re-inserting might indeed be faster. > > If there would be limited deletes and inserts, but lots of updates, it > depends: if most updates can be done "in-place", the update version has > a good chance to win; if the updates can't be done "in-place", then the > delete/insert has the better papers. > (Note: update in place can always be done if only fixed-length columns > change, and can sometimes be done if variable-length columns change). > > If there would be limited or no deletes and inserts and limited updates, > the delete/insert would almost certainly be slower. > > All this is also influenced by the number of indexed columns, and > whether the column for the clustered index can be updated as well. > > In short - test both versions to be sure. > > Note: If you want to test the UPDATE version, make sure to use the best > performing order: > > 1. DELETE rows that have been deleted from other table; > 2. UPDATE rows that are in both tables but are not equal; > 3. INSERT rows that are in other table but not in this table. > > This order minimizes the amount of work to do. > Also, be sure to enclose these statements in a transaction and include > proper error handling, if you don't want to end up with only half the > work being done. > > Best, Hugo > -- > > (Remove _NO_ and _SPAM_ to get my e-mail address) > JorgeC wrote:
>> This is nonsense. The DELETE+INSERT stuff doesn't preclude the use of Primary keys are only used to enforce data integrity. They are a logical >> primary keys. It does preclude the use of foreign key constraints; >> maybe that is what you wanted to write? > > Yes, but aren't primary keys normally used to speed up access > time(thus improving delete performance, but descreasing insert > performance) and enforce integrity? concept, despite their physical nature once added to a table. Indexing is used to speed data access. Are you deleting all rows from the table? Your last reply to me seemed to indicate this was a complete delete of all rows. If not, then you need some indexing so SQL Server can find those rows to update or delete with some speed. And if you used a clustered index, keeping those rows together, that would help too. And if you were deleting on the primary key, despite the fact that one is not implemented, you could just use a clustered primary key. You should always start with the proven methods and fallback to the "different" only when there's no other way to get the performance you need. But that should happen almost never. On Fri, 26 Aug 2005 16:40:03 -0700, JorgeC wrote:
>> This is nonsense. The DELETE+INSERT stuff doesn't preclude the use of Hi Jorge,>> primary keys. It does preclude the use of foreign key constraints; maybe >> that is what you wanted to write? > >Yes, but aren't primary keys normally used to speed up access time(thus >improving delete performance, but descreasing insert performance) and enforce >integrity? No. First the theory. PRIMARY KEY, UNIQUE, FOREIGN KEY and CHECK are all _constraints_. The sole purpose of a constraint is to enforce integrity; they have nothing whatsoever to do with speed. Without integrity, your database is useless. I don't care if I get my answer in 20 seconds or in 3 nanoseconds if I can't be sure that it's correct. Once integrity is assured, optimizing for speed is the second step. Adding indexes is one of the best ways to improve speed. They can also help improve concurrency. But be aware that, while indexes can improve the time to find data, they also hurt the performance of modifications of your data. Now, the practice. If you declare a PRIMARY KEY or UNIQUE constraint, SQL Server will immediately create an index on the column(s) that are contained in the constraint. So if you work on your integrity, you'll get improved performance as a bonus - though the indexes created for the constraint might not be optimal for all your queries, they'll certainly benefit some of them. (Of course, the main reason for SQL Server to create these indexes is to speed up the checking of the constraint). But the bottom line: never use PRIMARY KEY or UNIQUE as a performance tool (*); use them only to maintain integrity. If you need further optimization, use the CREATE INDEX statement to create extra indexes (that can be both UNIQUE or NONUNIQUE). (*) Changing the options to influence the type of index created by SQL Server for these constraints CAN be a valuable strategy, though. Best, Hugo -- (Remove _NO_ and _SPAM_ to get my e-mail address) > Basically, he is saying that an Update operation is slower than a DELETE + It is not always true. Depending on the number of indexes and the extent of> INSERT operation. Is this true? How have you solved this problem? > ATTENTION: the lines table does not have primary keys, because of this > DELETE+INSERT stuff... the updates, issuing DELETE...UPDATE...INSERT can perform much, much better than issuing DELETE...INSERT. Of course, since you don't have a primary key constraint on the stock lines table, even the DELETE...INSERT will run slow, because the system is probably issuing a table scan. I think that the lack of a primary key constraint is an even bigger problem. Integrity should be maintained by the database, not the application. You should have a primary key constraint on every, EVERY table, and you should enforce referential integrity on every relationship either by using declarative foreign key constraints, or if necessary, triggers. The only exceptions to this rule are temporary tables or work tables--neither of which should reside in the same database. You should also have check constraints that mirror the validation done in the client application to ensure column integrity and row integrity, and procedural constraints (triggers) to deal with dependencies between rows in the same table, such as preventing overlapping time intervals. Show quote "JorgeC" <Jor***@discussions.microsoft.com> wrote in message news:3E91960F-50BC-4082-9CC7-2E5C147AB99D@microsoft.com... > Hi! > I've been working on a stock management software for a couple of months. > The user creates/changes stock documents which have a header and several > hundreds (usually) lines. > Now, when we change a stock document (deleting, adding, changing lines) the > current software deletes all rows before saving the document (we're using SQL > Server and Access databases). > The person responsible for this design says it's faster to delete and insert > all lines, than to update the lines which have been changed + deleting the > lines that were marked to be deleted + inserting the new lines. > > Basically, he is saying that an Update operation is slower than a DELETE + > INSERT operation. Is this true? How have you solved this problem? > ATTENTION: the lines table does not have primary keys, because of this > DELETE+INSERT stuff... > > From my newbie viewpoint I would use the UPDATE methodology.... > > I really need to understand why! So, can anyone explain this to me? Please? > > Thanks for your help and attention. > Best Regards, > Jorge C.
Other interesting topics
|
|||||||||||||||||||||||