Home All Groups Group Topic Archive Search About

Transaction Log in a large delete operation performed in steps.

Author
4 Nov 2005 5:26 PM
harish
I am using teh following code in a large delete operation performed in
steps.

SET ROWCOUNT 5000;
WHILE 1 = 1
BEGIN
  DELETE FROM T1 WHERE dt < '20030101' -- original delete
  IF @@rowcount < 5000 BREAK;
END
SET ROWCOUNT 0;


Using this code if I delete 1 million in each step out of a total of
100million, Will i encounter the transaction log problem?

Is the transaction log cleared after every 1million are deleted?


Thanks
harish

Author
4 Nov 2005 5:46 PM
Itzik Ben-Gan
Harish,

The subject is much deeper than what would appear on the surface. I should
have provided more details...

Deleting a large rowset in a single transaction has several drawbacks. A
DELETE statement is fully logged, and besides being slow, it will require
enough space in the transaction log to accommodate the whole transaction.
The section in the log starting with the oldest open transaction until the
current pointer cannot be overridden, preventing the log from recycling.
Furthermore, if the transaction breaks in the middle for some reason, the
whole activity that took place so far will be rolled back, and this will
take a while. Finally, with large deletes, SQL Server will probably have to
escalate the locks held on the deleted data to a full blown exclusive table
lock, preventing both read and write access against the target table until
the DELETE finishes.

You realize that it makes a lot of sense to break the single large DELETE
transaction to multiple smaller ones. Small enough not to cause lock
escalation (typically a few thousand rows, not a million), and allowing
recycling of the transaction log. Keep in mind that unless the database is
in SIMPLE recovery model, the log won't override committed transactions
until you backup the log. To make things easy, you can define a SQL Agent
alert that fires when the log percent usage reaches a certain threshold
(e/g/, 90%), and when fired, it would invoke a job that backs up the log to
free space.

As for lock escalation, you can easily verify that the number you chose
doesn't cause lock escalation by testing a DELETE with the TOP option while
monitoring a Lock Escalation event with Profiler. Splitting the large DELETE
will also allow overriding committed transactions in sections in the log
that were already backed up.

And most importantly, if your environment reached sizes where purging is
very expensive, INSERTs start to become slower, maintenance activities like
rebuilding indexes become very expensive, etc., maybe it's time to consider
partitioning.

You can find a lot of coverage on partitioning in www.sqlmag.com.
Type partitioning in the search box.

--
BG, SQL Server MVP
www.SolidQualityLearning.com

Join us for the SQL Server 2005 launch at the SQL Week in Israel!
http://www.microsoft.com/israel/sql/sqlweek/default.mspx


Show quote
"harish" <harish.prabh***@gmail.com> wrote in message
news:1131125178.299996.12240@g44g2000cwa.googlegroups.com...
>I am using teh following code in a large delete operation performed in
> steps.
>
> SET ROWCOUNT 5000;
> WHILE 1 = 1
> BEGIN
>  DELETE FROM T1 WHERE dt < '20030101' -- original delete
>  IF @@rowcount < 5000 BREAK;
> END
> SET ROWCOUNT 0;
>
>
> Using this code if I delete 1 million in each step out of a total of
> 100million, Will i encounter the transaction log problem?
>
> Is the transaction log cleared after every 1million are deleted?
>
>
> Thanks
> harish
>
Author
4 Nov 2005 8:30 PM
harish
Hey Ben

Thanks a lot. That was a great help.

It was a design problem and we are trying to modify the design and fix
the problem.
Thanks a lot for the help.

Byee
Harish
Author
5 Nov 2005 3:43 AM
harish
Hey ben

We have a problem with performance. The idea is

DELETING 100 million from a table weekly SQl SERVER 2000
****************************************************************************
We have a table in SQL SERVER 2000 which has about 250 million records
and this will be growing by 100 million every week. At a time the table
should contain just 13 weeks of data. when the 14th week data needs to
be loaded the first week's data has to be deleted.
And this deletes 100 million every week, since the delete is taking lot
of transaction log space the job is not successful.
Can you please help with what are the approaches we can take to fix
this problem?
Performance and transaction log are the issues we are facing. We tried
deletion in steps too but that also is taking time. What are the
different ways we can address this quickly.
Please reply at the earliest.
Thanks
Harish
Author
5 Nov 2005 8:36 AM
Itzik Ben-Gan
Harish,

Check out my previous reply; specifically the last paragraph:

>> And most importantly, if your environment reached sizes where purging is
very expensive, INSERTs start to become slower, maintenance activities like
rebuilding indexes become very expensive, etc., maybe it's time to consider
partitioning. <<

Sounds like your environment qualifies.

--
BG, SQL Server MVP
www.SolidQualityLearning.com

Join us for the SQL Server 2005 launch at the SQL Week in Israel!
http://www.microsoft.com/israel/sql/sqlweek/default.mspx


Show quote
"harish" <harish.prabh***@gmail.com> wrote in message
news:1131162223.056584.205850@g47g2000cwa.googlegroups.com...
> Hey ben
>
> We have a problem with performance. The idea is
>
> DELETING 100 million from a table weekly SQl SERVER 2000
> ****************************************************************************
> We have a table in SQL SERVER 2000 which has about 250 million records
> and this will be growing by 100 million every week. At a time the table
> should contain just 13 weeks of data. when the 14th week data needs to
> be loaded the first week's data has to be deleted.
> And this deletes 100 million every week, since the delete is taking lot
> of transaction log space the job is not successful.
> Can you please help with what are the approaches we can take to fix
> this problem?
> Performance and transaction log are the issues we are facing. We tried
> deletion in steps too but that also is taking time. What are the
> different ways we can address this quickly.
> Please reply at the earliest.
> Thanks
> Harish
>
Author
5 Nov 2005 7:03 PM
harish
Hey


Thanks. We have an index on four columns in this table. For Ex A, B, C
and D
The delete statement's where clause has the conditions for A, B and C
The delete statement's where clause has the conditions for just A


Which of the two's performance will be faster?


We tried something like this:


SET ROWCOUNT 5000;
WHILE 1 = 1
BEGIN
  DELETE FROM T1 WHERE dt < '20030101' -- original delete
  IF @@rowcount < 5000 BREAK;
END
SET ROWCOUNT 0;


1) Does this setting ROWCOUNT first sort the table and then delete?


2) The above query is executed to delete all records satisfying the
condition in steps of 5000 until the delete is comple.
How can I stop it after one 5000?
Author
5 Nov 2005 8:29 PM
Hugo Kornelis
On 5 Nov 2005 11:03:39 -0800, harish wrote:

(snip)

Hi Harish,

I just replied to the same question in another thread.

Could you please ask your questions in JUST ONE place, and in JUST ONE
group? I've seen your messages scattered over several groups, and
several different messages in just this group. Many of them have
attracted replies. It's very hard to keep track of what is going on in
all thesse threads, and it's a waste of other people's time if someone
posts a reply to you that you already had received in another group.

Best, Hugo
--

(Remove _NO_ and _SPAM_ to get my e-mail address)

AddThis Social Bookmark Button