|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Better way to do a large delete?delete about 4 million rows, based on the date in another table Table Session SessionID SessionDate 1 1/25/2004 2 7/25/2001 I have created a view (SessionView) containing all sessionid's where the sessiondate less than 3 years old. (Create view SessionView As Select SessionID From Session Where Session.sessiondate >= getdate() - 1095) Table Claims ClaimID SessionID etc etc etc 24578 1 15 HealthPractice 5554441234 6548975 2 52 ChirocPract 5551234567 What I need is to delete all rows from Table Claims that have sessionids in SessionView. Delete From Claims Where Claims.sessionID In (Select SessionID from SessionView) -In this case ClaimID 6548975 would be deleted but ClaimID 24578 would not. But I must be doing something wrong (I am very new at SQL Server, and have never done a large scale delete that wasn't just straight forward), because the delete seems to take forever. Last time I had to stop it at just over 5 hours because the network admin needed to do some work on the server and reboot it. I have set the recovery to simple, and the server has no other active databases or other applications running on it. Any assistance appreciated. Nancy Lytle N_Ly***@terplaum.umd.edu you could also copy the rows you want to keep to another table, drop
the orginal one, and rename the copy Thanks, I was trying to avoid that because I actually have about 10 tables
that I have to handle in a similiar fashion, and copying was also going slowly. I think I will end up using Dean's suggesting and chunking it out. Thanks, Nancy Show quote "Alexander Kuznetsov" <AK_TIREDOFSPAM@hotmail.COM> wrote in message news:1138312336.363053.109410@o13g2000cwo.googlegroups.com... > you could also copy the rows you want to keep to another table, drop > the orginal one, and rename the copy > nancy,
try deleting in smaller chunks, say 1000 rows each. this should help. dean Show quote "Nancy Lytle" <word_d***@hotmail.com> wrote in message news:er8ntEsIGHA.2668@tk2msftngp13.phx.gbl... > > I have to do a delete from a table with about 25 million rows. I need to > delete about 4 million rows, based on the date in another table > > Table Session > > SessionID SessionDate > > 1 1/25/2004 > > 2 7/25/2001 > > I have created a view (SessionView) containing all sessionid's where the > sessiondate less than 3 years old. > > (Create view SessionView As Select SessionID From Session Where > Session.sessiondate >= getdate() - 1095) > > Table Claims > > ClaimID SessionID etc etc etc > > 24578 1 15 HealthPractice > 5554441234 > > 6548975 2 52 > ChirocPract 5551234567 > > What I need is to delete all rows from Table Claims that have sessionids > in SessionView. > > Delete From Claims Where Claims.sessionID In (Select SessionID from > SessionView) > > -In this case ClaimID 6548975 would be deleted but ClaimID 24578 would > not. > > But I must be doing something wrong (I am very new at SQL Server, and have > never done a large scale delete that wasn't just straight forward), > because the delete seems to take forever. Last time I had to stop it at > just over 5 hours because the network admin needed to do some work on the > server and reboot it. > > I have set the recovery to simple, and the server has no other active > databases or other applications running on it. > > Any assistance appreciated. > > Nancy Lytle > > N_Ly***@terplaum.umd.edu > > I think this is the way I will have to go, so I will just use smaller date
intervals, until I get everything deleted. Thanks, Nancy Show quote "Dean" <dvitner@nospam.gmail.com> wrote in message news:OvslVLsIGHA.2828@TK2MSFTNGP12.phx.gbl... > nancy, > > try deleting in smaller chunks, say 1000 rows each. this should help. > > dean > > "Nancy Lytle" <word_d***@hotmail.com> wrote in message > news:er8ntEsIGHA.2668@tk2msftngp13.phx.gbl... >> >> I have to do a delete from a table with about 25 million rows. I need to >> delete about 4 million rows, based on the date in another table >> >> Table Session >> >> SessionID SessionDate >> >> 1 1/25/2004 >> >> 2 7/25/2001 >> >> I have created a view (SessionView) containing all sessionid's where the >> sessiondate less than 3 years old. >> >> (Create view SessionView As Select SessionID From Session Where >> Session.sessiondate >= getdate() - 1095) >> >> Table Claims >> >> ClaimID SessionID etc etc etc >> >> 24578 1 15 HealthPractice >> 5554441234 >> >> 6548975 2 52 ChirocPract >> 5551234567 >> >> What I need is to delete all rows from Table Claims that have sessionids >> in SessionView. >> >> Delete From Claims Where Claims.sessionID In (Select SessionID from >> SessionView) >> >> -In this case ClaimID 6548975 would be deleted but ClaimID 24578 would >> not. >> >> But I must be doing something wrong (I am very new at SQL Server, and >> have never done a large scale delete that wasn't just straight forward), >> because the delete seems to take forever. Last time I had to stop it at >> just over 5 hours because the network admin needed to do some work on the >> server and reboot it. >> >> I have set the recovery to simple, and the server has no other active >> databases or other applications running on it. >> >> Any assistance appreciated. >> >> Nancy Lytle >> >> N_Ly***@terplaum.umd.edu >> >> > > Another thing could be to drop all indexes, delete your rows, recreate
the indexes. Its all about how much you can do with the table. Do you have a maintenance window where no one are using the database or is it always online? Nancy Lytle (word_d***@hotmail.com) writes:
Show quote > I have created a view (SessionView) containing all sessionid's where the Rather than creating a view, insert all the sessionids to delete into> sessiondate less than 3 years old. > > (Create view SessionView As Select SessionID From Session Where > Session.sessiondate >= getdate() - 1095) >... > What I need is to delete all rows from Table Claims that have sessionids > in SessionView. > > Delete From Claims Where Claims.sessionID In (Select SessionID from > SessionView) > > -In this case ClaimID 6548975 would be deleted but ClaimID 24578 would > not. > > But I must be doing something wrong (I am very new at SQL Server, and > have never done a large scale delete that wasn't just straight forward), > because the delete seems to take forever. Last time I had to stop it at > just over 5 hours because the network admin needed to do some work on > the server and reboot it. a temp table. You could also try this syntax: DELETE Claims FROM Claims c JOIN #temp t ON c.session_id = t.session_id If Claims does not have its clustered index on SessionId, it could be worth to do CREATE CLUSTERED INDEX session_id_ix ON Claims(session_id) WITH DROP_EXISTING for this task, and then restore the original index. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx When bulk deleting 4 million rows, the slowness you are experiencing
probably has more to do with transaction logging than anything else. I assume this must a one time purge, but if it is a recurring process, then you will probably want to reconsider your data model and workflow. Start by dropping any indexes from the table that would not assist in qualifying the rows needed for deletion. For example, if you are deleting based on transaction date, then you would not need to retain an index on LastName or AccountType. This will reduce the amount of transaction logging during the process, and besides, all the indexes will need to be defragmented after the process has completed anyway. You can try creating a loop and delete the rows based on a date column, however, this usally results in batches that vary greatly in size. For example, your company have a lot more transactions in 2005 than in 2004. Using the technique below, you can delete in batches of exactly 1000 rows. set rowcount 1000 while delete from mytable where . . . if @@rowcount = 0 break checkpoint end Show quote "Nancy Lytle" <word_d***@hotmail.com> wrote in message news:er8ntEsIGHA.2668@tk2msftngp13.phx.gbl... > > I have to do a delete from a table with about 25 million rows. I need to > delete about 4 million rows, based on the date in another table > > Table Session > > SessionID SessionDate > > 1 1/25/2004 > > 2 7/25/2001 > > I have created a view (SessionView) containing all sessionid's where the > sessiondate less than 3 years old. > > (Create view SessionView As Select SessionID From Session Where > Session.sessiondate >= getdate() - 1095) > > Table Claims > > ClaimID SessionID etc etc etc > > 24578 1 15 HealthPractice > 5554441234 > > 6548975 2 52 > ChirocPract 5551234567 > > What I need is to delete all rows from Table Claims that have sessionids > in SessionView. > > Delete From Claims Where Claims.sessionID In (Select SessionID from > SessionView) > > -In this case ClaimID 6548975 would be deleted but ClaimID 24578 would > not. > > But I must be doing something wrong (I am very new at SQL Server, and have > never done a large scale delete that wasn't just straight forward), > because the delete seems to take forever. Last time I had to stop it at > just over 5 hours because the network admin needed to do some work on the > server and reboot it. > > I have set the recovery to simple, and the server has no other active > databases or other applications running on it. > > Any assistance appreciated. > > Nancy Lytle > > N_Ly***@terplaum.umd.edu > > |
|||||||||||||||||||||||