|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
best way to bulk updateHi I am new to sql server 2005.
I have a flat file with about 10 million rows of data which I need to use to update the values in a table every day. The table has a single clustered index which is the column ID. I do not need to insert new data, just update the data on this table. The file has about 10 columns with tab separators. I just need someone to tell me what I should look into or what is the best way to accomplish this task? Thank you. Hi,
1. Use BULK insert with BATCHSIZE option or BCP IN with -b option to load the data into temporary Table (Real table) 2. Create a Index on key column in temp table 3. Using the primary key to update the required columns in batches..Use while loop 4. Truncate the temp table... 5. You could schedule this a SQL Agent job Thanks Hari SQL Server MVP <kenim***@gmail.com> wrote in message Show quote news:1157932970.916946.152660@i3g2000cwc.googlegroups.com... > Hi I am new to sql server 2005. > I have a flat file with about 10 million rows of data which I need to > use to update the values in a table every day. The table has a single > clustered index which is the column ID. I do not need to insert new > data, just update the data on this table. The file has about 10 columns > with tab separators. I just need someone to tell me what I should look > into or what is the best way to accomplish this task? > > Thank you. > You haven't really said enough to allow too much of a
recommendation, since we don't know how the "column ID" is related to any of the information in the flat file. However, I think it's likely you will need to import the contents of the entire file into a table separate from the one you are updating. BULK INSERT and bcp are very fast ways to do this. (I suppose SQL Server Integration Services also is, but I have not used it much.). How you update the big table once you import the data is something you'll still have to figure out - what indexes are best, or whether you should break the update up into batches, depend on things we don't know from your post. -- Steve Kass -- Drew University -- http://www.stevekass.com kenim***@gmail.com wrote: Show quote >Hi I am new to sql server 2005. >I have a flat file with about 10 million rows of data which I need to >use to update the values in a table every day. The table has a single >clustered index which is the column ID. I do not need to insert new >data, just update the data on this table. The file has about 10 columns >with tab separators. I just need someone to tell me what I should look >into or what is the best way to accomplish this task? > >Thank you. > > > Thanks so much for your answers. It helps me a lot. I will try bcp.
The key column is the cluster index, and it is one of the columns in the flat file. Simple table, no triggers. Why would I have to update in batches doing the while loop, if I am not keeping a log file (simple recovery)? Is there any advantage in speed? Do I create an index the temp table after or before I insert the data into it ? Thanks. (kenim***@gmail.com) writes:
> Why would I have to update in batches doing the while loop, if I am not Actually, batchings makes more sense with simple recovery than with full> keeping a log file (simple recovery)? Is there any advantage in speed? recovery. You may have space enough for your log file, but if you want to keep it down in size, batching helps. If you update all in one go, the entire transaction must be logged, so it can be rolled back. As for speed, the best is to benchmark on your own. But generally, if it works OK with one big update, go for that. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx kenim***@gmail.com wrote:
> Thanks so much for your answers. It helps me a lot. I will try bcp. Because even in Simple mode, the transaction log still has to be large > > The key column is the cluster index, and it is one of the columns in > the flat file. Simple table, no triggers. > > Why would I have to update in batches doing the while loop, if I am not > keeping a log file (simple recovery)? Is there any advantage in speed? > > Do I create an index the temp table after or before I insert the data > into it ? > > Thanks. > enough to hold the rollback info for the transaction. A 10-million row transaction will be pretty big. See http://realsqlguy.com/serendipity/archives/14-When-Is-A-Transaction-Log-Not-A-Transaction-Log.html How many rows does the target table have? It's important - if you
typically update a significant share of the rows in your target table, you might be better off if you: 1. Create a new table and populate it with modified values. 2. Drop the old table. 3. Rename the new one. This approach may be faster, and your brand new table will have no fragmentation.
Other interesting topics
|
|||||||||||||||||||||||