Home All Groups Group Topic Archive Search About
Author
17 Feb 2006 2:52 PM
KBuser
I guess I'll just post as much about my situation which I feel is
pertinent, and hope I get the feedback I'm looking for. I'm not exactly
sure what it is I am trying to figure out here, but I think I'll get
some good suggestions...

I'm working on a program in C# which takes files (either delimited text
or excel format) performs some operations (lots of regex) and then
creates a BCP batch file and can execute it as well. The bcp uploads to
our sql server (2000)  the new file created from parsing the
aforementioned files (usually will be between 5,000 and 20,000
records).
The main issue here is you have to manually fix any insert errors,
though I could have my program read the bcp error file and display the
results. Either way, as I add more functionality, there will always be
the issue of whether or not all the data made it into the server
correctly or not. I have been looking into T-SQL and stored procedures,
and am wondering how effective ROLLBACKs are, if they're easy to
manipulate, and also if its possible to 'step through' a SP. By that
meaning execute until a set point, wait for input, and then continute
executing, or would it be wiser to use multiple SPs? Also, can you
rollback a bulk-copy insert, or initiate it in a SP? Can you use an SP
to insert from a delimited file? If so, does it perform well/reliably?
I know this was fairly vague, but admittedly I'm not entirely sure what
else I'll add to my prog, or what issues I will run into, but I think
if I can get a few answers/suggestions here I'll have a much better
idea of where I want to go.

Thanks,
KrB

Author
17 Feb 2006 4:00 PM
JT
In your specific case, it sounds like error handling and recovery are more
important concerns than optimizing insert performance. 20,000 rows per batch
is really not a lot of data, and > 100,000 is considered to be more of a
bulk load.

Bulk inserting is designed to optimize performance by minimizing transaction
logging. The degree to which a bulk insert can be rolled back is determined
by the database recovery model. There is a Bulk-Logged Recovery model that
allows better recovery than Simple, but I would reccomend just keeping the
database setting at Full Recovery.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/adminsql/ad_bkprst_4ku1.asp

The SQL-DMO library provides an object wrapper around the bulk load API, and
it would provide better programability and error handling than the BCP.EXE
shell command or the T-SQL bulk insert command:
SQL-DMO BulkCopy Object
http://msdn2.microsoft.com/en-us/library/ms133968

If you still have problems, then consider loading into a staging table
first, and then re-insert into production only after a successful load.

Show quote
"KBuser" <Kyle.Bu***@gmail.com> wrote in message
news:1140187953.722933.163790@o13g2000cwo.googlegroups.com...
>I guess I'll just post as much about my situation which I feel is
> pertinent, and hope I get the feedback I'm looking for. I'm not exactly
> sure what it is I am trying to figure out here, but I think I'll get
> some good suggestions...
>
> I'm working on a program in C# which takes files (either delimited text
> or excel format) performs some operations (lots of regex) and then
> creates a BCP batch file and can execute it as well. The bcp uploads to
> our sql server (2000)  the new file created from parsing the
> aforementioned files (usually will be between 5,000 and 20,000
> records).
> The main issue here is you have to manually fix any insert errors,
> though I could have my program read the bcp error file and display the
> results. Either way, as I add more functionality, there will always be
> the issue of whether or not all the data made it into the server
> correctly or not. I have been looking into T-SQL and stored procedures,
> and am wondering how effective ROLLBACKs are, if they're easy to
> manipulate, and also if its possible to 'step through' a SP. By that
> meaning execute until a set point, wait for input, and then continute
> executing, or would it be wiser to use multiple SPs? Also, can you
> rollback a bulk-copy insert, or initiate it in a SP? Can you use an SP
> to insert from a delimited file? If so, does it perform well/reliably?
> I know this was fairly vague, but admittedly I'm not entirely sure what
> else I'll add to my prog, or what issues I will run into, but I think
> if I can get a few answers/suggestions here I'll have a much better
> idea of where I want to go.
>
> Thanks,
> KrB
>
Author
17 Feb 2006 9:48 PM
KBuser
Thank you kindly.
Author
17 Feb 2006 9:55 PM
Brian Selzer
I think that you should scrub the data better in the C# program.  For 20,000
rows, I would insert the data into a temp table with a status column, and
then process the table with a stored procedure, updating the temp table with
whether or not any given row made it into the table.  Then you can either
dump the unsuccessful rows into a permanent table for later reporting, or
read them out in the C# program for further processing.

Show quote
"KBuser" <Kyle.Bu***@gmail.com> wrote in message
news:1140187953.722933.163790@o13g2000cwo.googlegroups.com...
>I guess I'll just post as much about my situation which I feel is
> pertinent, and hope I get the feedback I'm looking for. I'm not exactly
> sure what it is I am trying to figure out here, but I think I'll get
> some good suggestions...
>
> I'm working on a program in C# which takes files (either delimited text
> or excel format) performs some operations (lots of regex) and then
> creates a BCP batch file and can execute it as well. The bcp uploads to
> our sql server (2000)  the new file created from parsing the
> aforementioned files (usually will be between 5,000 and 20,000
> records).
> The main issue here is you have to manually fix any insert errors,
> though I could have my program read the bcp error file and display the
> results. Either way, as I add more functionality, there will always be
> the issue of whether or not all the data made it into the server
> correctly or not. I have been looking into T-SQL and stored procedures,
> and am wondering how effective ROLLBACKs are, if they're easy to
> manipulate, and also if its possible to 'step through' a SP. By that
> meaning execute until a set point, wait for input, and then continute
> executing, or would it be wiser to use multiple SPs? Also, can you
> rollback a bulk-copy insert, or initiate it in a SP? Can you use an SP
> to insert from a delimited file? If so, does it perform well/reliably?
> I know this was fairly vague, but admittedly I'm not entirely sure what
> else I'll add to my prog, or what issues I will run into, but I think
> if I can get a few answers/suggestions here I'll have a much better
> idea of where I want to go.
>
> Thanks,
> KrB
>

AddThis Social Bookmark Button