|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Deleting large amount of data on sql server 2000Hi
I am new to SQL server. Hence please help me. I need to delete around 100 million records of a table in SQL Server 2000. One delete statement is creating the transaction log issue. 1) Can I run a DELETE statement without recording in the transaction log? 2) Can I run the delete in steps? If so how do I do it? 3) The table has a composite index on 4 columns A, B, C and D (in that order). If the where clause in the index contains the conditions on B and C coulmns alone. Will the index be used or not? Thanks Harish > 2) Can I run the delete in steps? If so how do I do it? Try something like this:delete <table> from <table> inner join ( select top 1000 * from <table> order by <indexed_column> ) <alias> on <alias>.<key_column> = <table>.<key_column> ML Oh, yeah - and you might want to wrap that one into a loop:
while @@rowcount > 0 begin ...delete statement here... end Test, re-test, double-test, and - just to make sure - test again. ML while @@rowcount > 0
begin ...delete statement here... end Hi thanks in this case should it be run as a stored procedure or a single sql statement? *** Sent via Developersdex http://www.developersdex.com *** If you plan on using it often, then put it in a procedure. Also look at
Itzik's posts - his solution may perform better. ML Harish,
> 1) Can I run a DELETE statement without recording in the transaction No. Only TRUNCATE and DROP statements are minimally logged. But these you > log? use when you need to clean/drop the whole table. > 2) Can I run the delete in steps? If so how do I do it? Yes. e.g., steps of 5000 in each transaction:SET ROW_COUNT 5000; WHILE 1 = 1 BEGIN DELETE FROM T1 WHERE dt < '20030101' -- original delete IF @@rowcount < 5000 BREAK; END SET ROW_COUNT 5000; > 3) The table has a composite index on 4 columns A, B, C and D (in that Not for an efficient seek operation. But the optimizer will consult > order). If the where clause in the index contains the conditions on B > and C coulmns alone. Will the index be used or not? statistics to see if scanning the whole leaf level of the index and then doing lookups (in case it's not a covering index) might help. -- Show quoteBG, SQL Server MVP www.SolidQualityLearning.com Join us for the SQL Server 2005 launch at the SQL Week in Israel! http://www.microsoft.com/israel/sql/sqlweek/default.mspx "harish" <harish.prabh***@gmail.com> wrote in message news:1131061415.639807.113740@g49g2000cwa.googlegroups.com... > Hi > I am new to SQL server. Hence please help me. > > I need to delete around 100 million records of a table in SQL Server > 2000. > One delete statement is creating the transaction log issue. > > 1) Can I run a DELETE statement without recording in the transaction > log? > > 2) Can I run the delete in steps? If so how do I do it? > > 3) The table has a composite index on 4 columns A, B, C and D (in that > order). If the where clause in the index contains the conditions on B > and C coulmns alone. Will the index be used or not? > > Thanks > Harish > Hii
Thanks a lot for the code. i have another question. Do I run this as a single SQl or as a stored procedure. Thanks Harish Either way is fine.
-- Show quoteBG, SQL Server MVP www.SolidQualityLearning.com Join us for the SQL Server 2005 launch at the SQL Week in Israel! http://www.microsoft.com/israel/sql/sqlweek/default.mspx "harish" <harish.prabh***@gmail.com> wrote in message news:1131068462.451488.129400@g44g2000cwa.googlegroups.com... > Hii > > Thanks a lot for the code. > i have another question. > Do I run this as a single SQl or as a stored procedure. > > Thanks > Harish > Hey
I didnt understand how this @@rowcount works.Basically I didnt understand how this loop will work. Can you please explain? Sure,
The idea is to split the large transaction into smaller ones, like you asked. Remember that in SQL Server, unless within an explicit outer transaction, each statement is its own transaction, as if encapsulated with BEGIN TRAM/COMMIT TRAN. The SET ROW_COUNT command changes the way your session behaves in the sense that any statement would stop processing rows as once it processed the number of rows specified in this option. So, the following statement sets this value to 5000: SET ROW_COUNT 5000; From now on, any statement processing data will stop after 5000 rows are processed. Then the following endless loop fires a DELETE, which affects 5000 rows in each iteration. And since it's an independent statement, it commits every 5000 rows. Once the DELETE affects less (you reached the last batch), the loop breaks. WHILE 1 = 1 BEGIN DELETE FROM T1 WHERE dt < '20030101' -- original delete IF @@rowcount < 5000 BREAK; END Once done, remove the SET ROW_COUNT limitation by setting it to 0: SET ROW_COUNT 0; -- Show quoteBG, SQL Server MVP www.SolidQualityLearning.com Join us for the SQL Server 2005 launch at the SQL Week in Israel! http://www.microsoft.com/israel/sql/sqlweek/default.mspx "harish" <harish.prabh***@gmail.com> wrote in message news:1131072963.044861.113350@g49g2000cwa.googlegroups.com... > hey ben > can you pls explain the logic > Correction:
Should be SET ROWCOUNT, and not SET ROW_COUNT. Here's the full code again: SET ROWCOUNT 5000; WHILE 1 = 1 BEGIN DELETE FROM T1 WHERE dt < '20030101' -- original delete IF @@rowcount < 5000 BREAK; END SET ROWCOUNT 0; -- Show quoteBG, SQL Server MVP www.SolidQualityLearning.com Join us for the SQL Server 2005 launch at the SQL Week in Israel! http://www.microsoft.com/israel/sql/sqlweek/default.mspx "Itzik Ben-Gan" <it***@REMOVETHIS.SolidQualityLearning.com> wrote in message news:%23p10e8P4FHA.2816@tk2msftngp13.phx.gbl... > Sure, > > The idea is to split the large transaction into smaller ones, like you > asked. > > Remember that in SQL Server, unless within an explicit outer transaction, > each statement is its own transaction, as if encapsulated with BEGIN > TRAM/COMMIT TRAN. > > The SET ROW_COUNT command changes the way your session behaves in the > sense that any statement would stop processing rows as once it processed > the number of rows specified in this option. > So, the following statement sets this value to 5000: > SET ROW_COUNT 5000; > From now on, any statement processing data will stop after 5000 rows are > processed. > Then the following endless loop fires a DELETE, which affects 5000 rows in > each iteration. And since it's an independent statement, it commits every > 5000 rows. Once the DELETE affects less (you reached the last batch), the > loop breaks. > WHILE 1 = 1 > BEGIN > DELETE FROM T1 WHERE dt < '20030101' -- original delete > IF @@rowcount < 5000 BREAK; > END > > Once done, remove the SET ROW_COUNT limitation by setting it to 0: > SET ROW_COUNT 0; > > -- > BG, SQL Server MVP > www.SolidQualityLearning.com > > Join us for the SQL Server 2005 launch at the SQL Week in Israel! > http://www.microsoft.com/israel/sql/sqlweek/default.mspx > > > "harish" <harish.prabh***@gmail.com> wrote in message > news:1131072963.044861.113350@g49g2000cwa.googlegroups.com... >> hey ben >> can you pls explain the logic >> > > Hey ben
We have a problem with performance. The idea is DELETING 100 million from a table weekly SQl SERVER 2000 **************************************************************************** We have a table in SQL SERVER 2000 which has about 250 million records and this will be growing by 100 million every week. At a time the table should contain just 13 weeks of data. when the 14th week data needs to be loaded the first week's data has to be deleted. And this deletes 100 million every week, since the delete is taking lot of transaction log space the job is not successful. Can you please help with what are the approaches we can take to fix this problem? Performance and transaction log are the issues we are facing. We tried deletion in steps too but that also is taking time. What are the different ways we can address this quickly. Please reply at the earliest. Thanks Harish Hi
You could do daily deletions? A more clunky alternative would be to have a partioned view which gets re-defined each week and therefore you would only need to drop the table pertaining to the week in question. John Show quote "harish" wrote: > Hey ben > > We have a problem with performance. The idea is > > DELETING 100 million from a table weekly SQl SERVER 2000 > **************************************************************************** > We have a table in SQL SERVER 2000 which has about 250 million records > and this will be growing by 100 million every week. At a time the table > should contain just 13 weeks of data. when the 14th week data needs to > be loaded the first week's data has to be deleted. > And this deletes 100 million every week, since the delete is taking lot > of transaction log space the job is not successful. > Can you please help with what are the approaches we can take to fix > this problem? > Performance and transaction log are the issues we are facing. We tried > deletion in steps too but that also is taking time. What are the > different ways we can address this quickly. > Please reply at the earliest. > Thanks > Harish > > Hey
Thanks. We have an index on four columns in this table. For Ex A, B, C and D The delete statement's where clause has the conditions for A, B and C The delete statement's where clause has the conditions for just A Which of the two's performance will be faster? We tried something like this: SET ROWCOUNT 5000; WHILE 1 = 1 BEGIN DELETE FROM T1 WHERE dt < '20030101' -- original delete IF @@rowcount < 5000 BREAK; END SET ROWCOUNT 0; 1) Does this setting ROWCOUNT first sort the table and then delete? 2) The above query is executed to delete all records satisfying the condition in steps of 5000 until the delete is comple. How can I stop it after one 5000? On 5 Nov 2005 11:01:59 -0800, harish wrote:
(snip) Hi Harish, I just replied to the same question in another thread. Could you please ask your questions in JUST ONE place, and in JUST ONE group? I've seen your messages scattered over several groups, and several different messages in just this group. Many of them have attracted replies. It's very hard to keep track of what is going on in all thesse threads, and it's a waste of other people's time if someone posts a reply to you that you already had received in another group. Best, Hugo -- (Remove _NO_ and _SPAM_ to get my e-mail address) Hi
Your reply is confusing, ther first suggestion I made would reduce the volume of data to be deleted and therefore speed should be quicker, the draw back is that you do it more often. The second option would mean that you view is not available for a short period of time, but the deletion is effectively offline and you don't need to use delete as you can just drop the table. Show quote "harish" wrote: Why only 5000? Remove the WHILE 1 = 1 statement and IF @@rowcount < 5000 > Hey > > > Thanks. We have an index on four columns in this table. For Ex A, B, C > and D > The delete statement's where clause has the conditions for A, B and C > The delete statement's where clause has the conditions for just A > > > Which of the two's performance will be faster? > > > We tried something like this: > > > SET ROWCOUNT 5000; > WHILE 1 = 1 > BEGIN > DELETE FROM T1 WHERE dt < '20030101' -- original delete > IF @@rowcount < 5000 BREAK; > END > SET ROWCOUNT 0; > > > 1) Does this setting ROWCOUNT first sort the table and then delete? > No > > 2) The above query is executed to delete all records satisfying the > condition in steps of 5000 until the delete is comple. Yes > How can I stop it after one 5000? > BREAK; will do this! But then you will be left with the majority of the records. > JohnOn Sat, 5 Nov 2005 12:53:02 -0800, John Bell wrote:
>Hi Hi John,> >Your reply is confusing, No wonder. This guy posted the same question in at least six different messages, in four different newsgroups. He received different replies to most of his messages; now he has posted the same followup to all these replies. Maybe his questions do make sense as a followup to a reply in one of the other threads. But frankly, I can't be bothered to waste my time trying to find that out. Best, Hugo -- (Remove _NO_ and _SPAM_ to get my e-mail address) Have you considered working from the opposite direction to solve this issue?
You have not mentioned the number of existing rows in the table so I am assuming that the number of rows remaining after the delete will be relatively small compared to the 100,000,000 rows you are deleting. Try (in a single batch) something like: select <column list> into #temp from <table> where <reverse of where clause for deletion> truncate table <table> insert into <table>(<column list) select (column list) from #temp drop table #temp Show quote "harish" <harish.prabh***@gmail.com> wrote in message news:1131061415.639807.113740@g49g2000cwa.googlegroups.com... > Hi > I am new to SQL server. Hence please help me. > > I need to delete around 100 million records of a table in SQL Server > 2000. > One delete statement is creating the transaction log issue. > > 1) Can I run a DELETE statement without recording in the transaction > log? > > 2) Can I run the delete in steps? If so how do I do it? > > 3) The table has a composite index on 4 columns A, B, C and D (in that > order). If the where clause in the index contains the conditions on B > and C coulmns alone. Will the index be used or not? > > Thanks > Harish > Thanks Stephan
Actually we considered this "create table as" approach but the disk space is a issue. We do no have sufficient disk space as its a huge table. Also I have another question. Do we have something like a swap table in SQL Server like EXCHANGE PARTITION. byee harish Before you do anything, make sure you have good backups.
When processing a large volume of data with limited disk space, you need to investigate into creatively manageing the location of your database files on the server. The location of the tempdb and transaction log files are particularly important in this situation. Physical Database Files and Filegroups: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/architec/8_ar_da2_9sab.asp You may also want to consider using DTS to copy the data you wish to *keep* into perhaps a tab delimited flat text file, truncate the table, and then DTS import the data back into the table. If you set the recovery model of the database to "bulk-logged", then transaction logging will be kept to a minimum during the load. Logged and Minimally Logged Bulk Copy Operations http://msdn.microsoft.com/library/default.asp?url=/library/en-us/adminsql/ad_impt_bcp_9esz.asp SQL Server 2000 Incremental Bulk Load Case Study http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/incbulkload.mspx Finally, I don't know if this delete 100 million rows issue is a one time maintenace fix, but if you do this often, then try to think about redesigning your workflow in such a way to avoid it. Show quote "harish" <harish.prabh***@gmail.com> wrote in message news:1131061415.639807.113740@g49g2000cwa.googlegroups.com... > Hi > I am new to SQL server. Hence please help me. > > I need to delete around 100 million records of a table in SQL Server > 2000. > One delete statement is creating the transaction log issue. > > 1) Can I run a DELETE statement without recording in the transaction > log? > > 2) Can I run the delete in steps? If so how do I do it? > > 3) The table has a composite index on 4 columns A, B, C and D (in that > order). If the where clause in the index contains the conditions on B > and C coulmns alone. Will the index be used or not? > > Thanks > Harish > SET ROWCOUNT 5000;
WHILE 1 = 1 BEGIN DELETE FROM T1 WHERE dt < '20030101' -- original delete IF @@rowcount < 5000 BREAK; END SET ROWCOUNT 0; Using this code if I delete 1 million in each step out of a total of 100million, Will i encounter the transaction log problem? I sthe transaction log cleared after every 1million are deleted? Hrish, please read the reply I provided in your question in the new thread.
-- Show quoteBG, SQL Server MVP www.SolidQualityLearning.com Join us for the SQL Server 2005 launch at the SQL Week in Israel! http://www.microsoft.com/israel/sql/sqlweek/default.mspx "harish" <harish.prabh***@gmail.com> wrote in message news:1131124865.801597.115790@g49g2000cwa.googlegroups.com... > SET ROWCOUNT 5000; > WHILE 1 = 1 > BEGIN > DELETE FROM T1 WHERE dt < '20030101' -- original delete > IF @@rowcount < 5000 BREAK; > END > SET ROWCOUNT 0; > > Using this code if I delete 1 million in each step out of a total of > 100million, Will i encounter the transaction log problem? > > I sthe transaction log cleared after every 1million are deleted? > |
|||||||||||||||||||||||