Home All Groups Group Topic Archive Search About

Deleting large amount of data on sql server 2000

Author
3 Nov 2005 11:43 PM
harish
Hi
I am new to SQL server. Hence please help me.

I need to delete around 100 million records of a table in SQL Server
2000.
One delete statement is creating the transaction log issue.

1) Can I run a DELETE statement without recording in the transaction
log?

2) Can I run the delete in steps? If so how do I do it?

3) The table has a composite index on 4 columns A, B, C and D (in that
order). If the where clause in the index contains the conditions on B
and C coulmns alone. Will the index be used or not?

Thanks
Harish

Author
4 Nov 2005 12:02 AM
ML
> 2) Can I run the delete in steps? If so how do I do it?

Try something like this:
delete    <table>
    from    <table>
        inner join    (
                select    top 1000 *
                    from    <table>
                    order by    <indexed_column>
                ) <alias>
                on    <alias>.<key_column> = <table>.<key_column>


ML
Author
4 Nov 2005 12:07 AM
ML
Oh, yeah - and you might want to wrap that one into a loop:

while @@rowcount > 0
        begin

               ...delete statement here...

        end


Test, re-test, double-test, and - just to make sure - test again.


ML
Author
4 Nov 2005 1:21 AM
harish p
while @@rowcount > 0
begin

...delete statement here...

end

Hi thanks
in this case should it be run as a stored procedure or a single sql
statement?

*** Sent via Developersdex http://www.developersdex.com ***
Author
4 Nov 2005 8:21 AM
ML
If you plan on using it often, then put it in a procedure. Also look at
Itzik's posts - his solution may perform better.


ML
Author
4 Nov 2005 1:28 AM
Itzik Ben-Gan
Harish,

> 1) Can I run a DELETE statement without recording in the transaction
> log?

No. Only TRUNCATE and DROP statements are minimally logged. But these you
use when you need to clean/drop the whole table.

> 2) Can I run the delete in steps? If so how do I do it?

Yes. e.g., steps of 5000 in each transaction:

SET ROW_COUNT 5000;
WHILE 1 = 1
BEGIN
  DELETE FROM T1 WHERE dt < '20030101' -- original delete
  IF @@rowcount < 5000 BREAK;
END
SET ROW_COUNT 5000;

> 3) The table has a composite index on 4 columns A, B, C and D (in that
> order). If the where clause in the index contains the conditions on B
> and C coulmns alone. Will the index be used or not?

Not for an efficient seek operation. But the optimizer will consult
statistics to see if scanning the whole leaf level of the index and then
doing lookups (in case it's not a covering index) might help.

--
BG, SQL Server MVP
www.SolidQualityLearning.com

Join us for the SQL Server 2005 launch at the SQL Week in Israel!
http://www.microsoft.com/israel/sql/sqlweek/default.mspx


Show quote
"harish" <harish.prabh***@gmail.com> wrote in message
news:1131061415.639807.113740@g49g2000cwa.googlegroups.com...
> Hi
> I am new to SQL server. Hence please help me.
>
> I need to delete around 100 million records of a table in SQL Server
> 2000.
> One delete statement is creating the transaction log issue.
>
> 1) Can I run a DELETE statement without recording in the transaction
> log?
>
> 2) Can I run the delete in steps? If so how do I do it?
>
> 3) The table has a composite index on 4 columns A, B, C and D (in that
> order). If the where clause in the index contains the conditions on B
> and C coulmns alone. Will the index be used or not?
>
> Thanks
> Harish
>
Author
4 Nov 2005 1:41 AM
harish
Hii

Thanks a lot for the code.
i have another question.
Do I run this as a single SQl or as a stored procedure.

Thanks
Harish
Author
4 Nov 2005 1:53 AM
Itzik Ben-Gan
Either way is fine.

--
BG, SQL Server MVP
www.SolidQualityLearning.com

Join us for the SQL Server 2005 launch at the SQL Week in Israel!
http://www.microsoft.com/israel/sql/sqlweek/default.mspx


Show quote
"harish" <harish.prabh***@gmail.com> wrote in message
news:1131068462.451488.129400@g44g2000cwa.googlegroups.com...
> Hii
>
> Thanks a lot for the code.
> i have another question.
> Do I run this as a single SQl or as a stored procedure.
>
> Thanks
> Harish
>
Author
4 Nov 2005 1:57 AM
harish
Hey

I didnt understand how this @@rowcount works.Basically I didnt
understand how this loop will work.
Can you please explain?
Author
4 Nov 2005 2:10 AM
harish
But is there no COMMIT statement required here?
Author
4 Nov 2005 2:56 AM
harish
hey ben
can you pls explain the logic
Author
4 Nov 2005 5:17 AM
Itzik Ben-Gan
Sure,

The idea is to split the large transaction into smaller ones, like you
asked.

Remember that in SQL Server, unless within an explicit outer transaction,
each statement is its own transaction, as if encapsulated with BEGIN
TRAM/COMMIT TRAN.

The SET ROW_COUNT command changes the way your session behaves in the sense
that any statement would stop processing rows as once it processed the
number of rows specified in this option.
So, the following statement sets this value to 5000:
SET ROW_COUNT 5000;
From now on, any statement processing data will stop after 5000 rows are
processed.
Then the following endless loop fires a DELETE, which affects 5000 rows in
each iteration. And since it's an independent statement, it commits every
5000 rows. Once the DELETE affects less (you reached the last batch), the
loop breaks.
WHILE 1 = 1
BEGIN
  DELETE FROM T1 WHERE dt < '20030101' -- original delete
  IF @@rowcount < 5000 BREAK;
END

Once done, remove the SET ROW_COUNT limitation by setting it to 0:
SET ROW_COUNT 0;

--
BG, SQL Server MVP
www.SolidQualityLearning.com

Join us for the SQL Server 2005 launch at the SQL Week in Israel!
http://www.microsoft.com/israel/sql/sqlweek/default.mspx


Show quote
"harish" <harish.prabh***@gmail.com> wrote in message
news:1131072963.044861.113350@g49g2000cwa.googlegroups.com...
> hey ben
> can you pls explain the logic
>
Author
4 Nov 2005 5:27 AM
Itzik Ben-Gan
Correction:

Should be SET ROWCOUNT, and not SET ROW_COUNT.

Here's the full code again:

SET ROWCOUNT 5000;
WHILE 1 = 1
BEGIN
  DELETE FROM T1 WHERE dt < '20030101' -- original delete
  IF @@rowcount < 5000 BREAK;
END
SET ROWCOUNT 0;

--
BG, SQL Server MVP
www.SolidQualityLearning.com

Join us for the SQL Server 2005 launch at the SQL Week in Israel!
http://www.microsoft.com/israel/sql/sqlweek/default.mspx


Show quote
"Itzik Ben-Gan" <it***@REMOVETHIS.SolidQualityLearning.com> wrote in message
news:%23p10e8P4FHA.2816@tk2msftngp13.phx.gbl...
> Sure,
>
> The idea is to split the large transaction into smaller ones, like you
> asked.
>
> Remember that in SQL Server, unless within an explicit outer transaction,
> each statement is its own transaction, as if encapsulated with BEGIN
> TRAM/COMMIT TRAN.
>
> The SET ROW_COUNT command changes the way your session behaves in the
> sense that any statement would stop processing rows as once it processed
> the number of rows specified in this option.
> So, the following statement sets this value to 5000:
> SET ROW_COUNT 5000;
> From now on, any statement processing data will stop after 5000 rows are
> processed.
> Then the following endless loop fires a DELETE, which affects 5000 rows in
> each iteration. And since it's an independent statement, it commits every
> 5000 rows. Once the DELETE affects less (you reached the last batch), the
> loop breaks.
> WHILE 1 = 1
> BEGIN
>  DELETE FROM T1 WHERE dt < '20030101' -- original delete
>  IF @@rowcount < 5000 BREAK;
> END
>
> Once done, remove the SET ROW_COUNT limitation by setting it to 0:
> SET ROW_COUNT 0;
>
> --
> BG, SQL Server MVP
> www.SolidQualityLearning.com
>
> Join us for the SQL Server 2005 launch at the SQL Week in Israel!
> http://www.microsoft.com/israel/sql/sqlweek/default.mspx
>
>
> "harish" <harish.prabh***@gmail.com> wrote in message
> news:1131072963.044861.113350@g49g2000cwa.googlegroups.com...
>> hey ben
>> can you pls explain the logic
>>
>
>
Author
5 Nov 2005 3:42 AM
harish
Hey ben

We have a problem with performance. The idea is

DELETING 100 million from a table weekly SQl SERVER 2000
****************************************************************************
We have a table in SQL SERVER 2000 which has about 250 million records
and this will be growing by 100 million every week. At a time the table
should contain just 13 weeks of data. when the 14th week data needs to
be loaded the first week's data has to be deleted.
And this deletes 100 million every week, since the delete is taking lot
of transaction log space the job is not successful.
Can you please help with what are the approaches we can take to fix
this problem?
Performance and transaction log are the issues we are facing. We tried
deletion in steps too but that also is taking time. What are the
different ways we can address this quickly.
Please reply at the earliest.
Thanks
Harish
Author
5 Nov 2005 8:40 AM
John Bell
Hi

You could do daily deletions?

A more clunky alternative would be to have a partioned view which gets
re-defined each week and therefore you would only need to drop the table
pertaining to the week in question.

John

Show quote
"harish" wrote:

> Hey ben
>
> We have a problem with performance. The idea is
>
> DELETING 100 million from a table weekly SQl SERVER 2000
> ****************************************************************************
> We have a table in SQL SERVER 2000 which has about 250 million records
> and this will be growing by 100 million every week. At a time the table
> should contain just 13 weeks of data. when the 14th week data needs to
> be loaded the first week's data has to be deleted.
> And this deletes 100 million every week, since the delete is taking lot
> of transaction log space the job is not successful.
> Can you please help with what are the approaches we can take to fix
> this problem?
> Performance and transaction log are the issues we are facing. We tried
> deletion in steps too but that also is taking time. What are the
> different ways we can address this quickly.
> Please reply at the earliest.
> Thanks
> Harish
>
>
Author
5 Nov 2005 7:01 PM
harish
Hey


Thanks. We have an index on four columns in this table. For Ex A, B, C
and D
The delete statement's where clause has the conditions for A, B and C
The delete statement's where clause has the conditions for just A


Which of the two's performance will be faster?


We tried something like this:


SET ROWCOUNT 5000;
WHILE 1 = 1
BEGIN
  DELETE FROM T1 WHERE dt < '20030101' -- original delete
  IF @@rowcount < 5000 BREAK;
END
SET ROWCOUNT 0;


1) Does this setting ROWCOUNT first sort the table and then delete?


2) The above query is executed to delete all records satisfying the
condition in steps of 5000 until the delete is comple.
How can I stop it after one 5000?
Author
5 Nov 2005 8:28 PM
Hugo Kornelis
On 5 Nov 2005 11:01:59 -0800, harish wrote:

(snip)

Hi Harish,

I just replied to the same question in another thread.

Could you please ask your questions in JUST ONE place, and in JUST ONE
group? I've seen your messages scattered over several groups, and
several different messages in just this group. Many of them have
attracted replies. It's very hard to keep track of what is going on in
all thesse threads, and it's a waste of other people's time if someone
posts a reply to you that you already had received in another group.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Author
5 Nov 2005 8:53 PM
John Bell
Hi

Your reply is confusing, ther first suggestion I made would reduce the
volume of data to be deleted and therefore speed should be quicker, the draw
back is that you do it more often.

The second option would mean that you view is not available for a short
period of time, but the deletion is effectively offline and you don't need to
use delete as you can just drop the table.

Show quote
"harish" wrote:

> Hey
>
>
> Thanks. We have an index on four columns in this table. For Ex A, B, C
> and D
> The delete statement's where clause has the conditions for A, B and C
> The delete statement's where clause has the conditions for just A
>
>
> Which of the two's performance will be faster?
>
>
> We tried something like this:
>
>
> SET ROWCOUNT 5000;
> WHILE 1 = 1
> BEGIN
>   DELETE FROM T1 WHERE dt < '20030101' -- original delete
>   IF @@rowcount < 5000 BREAK;
> END
> SET ROWCOUNT 0;
>
>
> 1) Does this setting ROWCOUNT first sort the table and then delete?
>
No
>
> 2) The above query is executed to delete all records satisfying the
> condition in steps of 5000 until the delete is comple.
Yes

> How can I stop it after one 5000?
>
Why only 5000? Remove the WHILE 1 = 1 statement and IF @@rowcount < 5000
BREAK; will do this! But then you will be left with the majority of the
records.
>

John
Author
5 Nov 2005 11:53 PM
Hugo Kornelis
On Sat, 5 Nov 2005 12:53:02 -0800, John Bell wrote:

>Hi
>
>Your reply is confusing,

Hi John,

No wonder. This guy posted the same question in at least six different
messages, in four different newsgroups. He received different replies to
most of his messages; now he has posted the same followup to all these
replies.

Maybe his questions do make sense as a followup to a reply in one of the
other threads. But frankly, I can't be bothered to waste my time trying
to find that out.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)
Author
4 Nov 2005 6:56 AM
Stephany Young
Have you considered working from the opposite direction to solve this issue?

You have not mentioned the number of existing rows in the table so I am
assuming that the number of rows remaining after the delete will be
relatively small compared to the 100,000,000 rows you are deleting.

Try (in a single batch) something like:

    select <column list> into #temp from <table> where <reverse of where
clause for deletion>

    truncate table <table>

    insert into <table>(<column list) select (column list) from #temp

    drop table #temp


Show quote
"harish" <harish.prabh***@gmail.com> wrote in message
news:1131061415.639807.113740@g49g2000cwa.googlegroups.com...
> Hi
> I am new to SQL server. Hence please help me.
>
> I need to delete around 100 million records of a table in SQL Server
> 2000.
> One delete statement is creating the transaction log issue.
>
> 1) Can I run a DELETE statement without recording in the transaction
> log?
>
> 2) Can I run the delete in steps? If so how do I do it?
>
> 3) The table has a composite index on 4 columns A, B, C and D (in that
> order). If the where clause in the index contains the conditions on B
> and C coulmns alone. Will the index be used or not?
>
> Thanks
> Harish
>
Author
4 Nov 2005 3:14 PM
harish
Thanks Ben for the detailed explanation.

Thanks again
byee
harish
Author
4 Nov 2005 3:16 PM
harish
Thanks Stephan

Actually we considered this "create table as" approach but the disk
space is a issue. We do no have sufficient disk space as its a huge
table.

Also I have another question. Do we have something like a swap table in
SQL Server like EXCHANGE PARTITION.

byee
harish
Author
4 Nov 2005 3:47 PM
JT
Before you do anything, make sure you have good backups.

When processing a large volume of data with limited disk space, you need to
investigate into creatively manageing the location of your database files on
the server. The location of the tempdb and transaction log files are
particularly important in this situation.

Physical Database Files and Filegroups:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/architec/8_ar_da2_9sab.asp

You may also want to consider using DTS to copy the data you wish to *keep*
into perhaps a tab delimited flat text file, truncate the table, and then
DTS import the data back into the table. If you set the recovery model of
the database to "bulk-logged", then transaction logging will be kept to a
minimum during the load.

Logged and Minimally Logged Bulk Copy Operations
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/adminsql/ad_impt_bcp_9esz.asp
SQL Server 2000 Incremental Bulk Load Case Study
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/incbulkload.mspx

Finally, I don't know if this delete 100 million rows issue is a one time
maintenace fix, but if you do this often, then try to think about
redesigning your workflow in such a way to avoid it.

Show quote
"harish" <harish.prabh***@gmail.com> wrote in message
news:1131061415.639807.113740@g49g2000cwa.googlegroups.com...
> Hi
> I am new to SQL server. Hence please help me.
>
> I need to delete around 100 million records of a table in SQL Server
> 2000.
> One delete statement is creating the transaction log issue.
>
> 1) Can I run a DELETE statement without recording in the transaction
> log?
>
> 2) Can I run the delete in steps? If so how do I do it?
>
> 3) The table has a composite index on 4 columns A, B, C and D (in that
> order). If the where clause in the index contains the conditions on B
> and C coulmns alone. Will the index be used or not?
>
> Thanks
> Harish
>
Author
4 Nov 2005 5:21 PM
harish
SET ROWCOUNT 5000;
WHILE 1 = 1
BEGIN
  DELETE FROM T1 WHERE dt < '20030101' -- original delete
  IF @@rowcount < 5000 BREAK;
END
SET ROWCOUNT 0;

Using this code if I delete 1 million in each step out of a total of
100million, Will i encounter the transaction log problem?

I sthe transaction log cleared after every 1million are deleted?
Author
4 Nov 2005 5:54 PM
Itzik Ben-Gan
Hrish, please read the reply I provided in your question in the new thread.

--
BG, SQL Server MVP
www.SolidQualityLearning.com

Join us for the SQL Server 2005 launch at the SQL Week in Israel!
http://www.microsoft.com/israel/sql/sqlweek/default.mspx


Show quote
"harish" <harish.prabh***@gmail.com> wrote in message
news:1131124865.801597.115790@g49g2000cwa.googlegroups.com...
> SET ROWCOUNT 5000;
> WHILE 1 = 1
> BEGIN
>  DELETE FROM T1 WHERE dt < '20030101' -- original delete
>  IF @@rowcount < 5000 BREAK;
> END
> SET ROWCOUNT 0;
>
> Using this code if I delete 1 million in each step out of a total of
> 100million, Will i encounter the transaction log problem?
>
> I sthe transaction log cleared after every 1million are deleted?
>
Author
4 Nov 2005 8:30 PM
harish
Hey Ben

Thanks a lot. That was a great help.

It was a design problem and we are trying to modify the design and fix
the problem.
Thanks a lot for the help.

Byee
Harish

AddThis Social Bookmark Button