Home All Groups Group Topic Archive Search About

Best solution, iterate over millions records and call extended sp

Author
11 Nov 2005 5:43 AM
nick
Hi,

I need to iterate over millions rows in a table and I need call an extended
stored procedure (written in C++ and not possible be written in TSQL) using
the columns of each row as parameters and write the return values to an new
table. The current script open a cursor.

What's the best way to implement it? BCP to a text file and external program
parse and write the text file and BCP back? sp_cmdshell an executible for
each row? (so needn't worry about C++ memory leak issu). XML?.....

Thanks,

Author
11 Nov 2005 6:42 AM
David Portas
Show quote
"nick" <n***@discussions.microsoft.com> wrote in message
news:75FAE6C1-EEC1-4D8B-A3A2-073F10CD859E@microsoft.com...
> Hi,
>
> I need to iterate over millions rows in a table and I need call an
> extended
> stored procedure (written in C++ and not possible be written in TSQL)
> using
> the columns of each row as parameters and write the return values to an
> new
> table. The current script open a cursor.
>
> What's the best way to implement it? BCP to a text file and external
> program
> parse and write the text file and BCP back? sp_cmdshell an executible for
> each row? (so needn't worry about C++ memory leak issu). XML?.....
>
> Thanks,

I'd say the "best" solution would be to rethink your architecture and
implement it differently if you plan to do this on a regular basis. This
doesn't sound like a very scalable or desirable way to use a client-server
database. Have you considered using SQL Server 2005, where you can implement
..NET code in the database? Or implementing your code in ADO rather than with
an XP?

If it's just a one-off requirement then you just need to test which approach
works best for you. It isn't really a SQL question since, for the purposes
of this exercise, you are just using SQL Server as a file dump rather than
what it was designed for. Why not just loop in TSQL and call the proc for
each row?

--
David Portas
SQL Server MVP
--
Author
11 Nov 2005 3:01 PM
nick
Not possible to switch to V2005 now. The program current does loop using
server side cursor... I am thinking using a client side code... Any
information about server side cursor vs client side forward-only recordset?

Show quote
"David Portas" wrote:

> "nick" <n***@discussions.microsoft.com> wrote in message
> news:75FAE6C1-EEC1-4D8B-A3A2-073F10CD859E@microsoft.com...
> > Hi,
> >
> > I need to iterate over millions rows in a table and I need call an
> > extended
> > stored procedure (written in C++ and not possible be written in TSQL)
> > using
> > the columns of each row as parameters and write the return values to an
> > new
> > table. The current script open a cursor.
> >
> > What's the best way to implement it? BCP to a text file and external
> > program
> > parse and write the text file and BCP back? sp_cmdshell an executible for
> > each row? (so needn't worry about C++ memory leak issu). XML?.....
> >
> > Thanks,
>
> I'd say the "best" solution would be to rethink your architecture and
> implement it differently if you plan to do this on a regular basis. This
> doesn't sound like a very scalable or desirable way to use a client-server
> database. Have you considered using SQL Server 2005, where you can implement
> ..NET code in the database? Or implementing your code in ADO rather than with
> an XP?
>
> If it's just a one-off requirement then you just need to test which approach
> works best for you. It isn't really a SQL question since, for the purposes
> of this exercise, you are just using SQL Server as a file dump rather than
> what it was designed for. Why not just loop in TSQL and call the proc for
> each row?
>
> --
> David Portas
> SQL Server MVP
> --
>
>
>
Author
11 Nov 2005 1:38 PM
JT
If you have a situation that calls for looping through a cursor, then it's
better to implement the cursor on the client side than on the server. Open a
read-only, forward only ADO recordset and Command.Execute the stored
procedure for each row.

Show quote
"nick" <n***@discussions.microsoft.com> wrote in message
news:75FAE6C1-EEC1-4D8B-A3A2-073F10CD859E@microsoft.com...
> Hi,
>
> I need to iterate over millions rows in a table and I need call an
> extended
> stored procedure (written in C++ and not possible be written in TSQL)
> using
> the columns of each row as parameters and write the return values to an
> new
> table. The current script open a cursor.
>
> What's the best way to implement it? BCP to a text file and external
> program
> parse and write the text file and BCP back? sp_cmdshell an executible for
> each row? (so needn't worry about C++ memory leak issu). XML?.....
>
> Thanks,
Author
11 Nov 2005 2:55 PM
nick
Yes, then I needn't create an extended stored procedure at all.

Show quote
"JT" wrote:

> If you have a situation that calls for looping through a cursor, then it's
> better to implement the cursor on the client side than on the server. Open a
> read-only, forward only ADO recordset and Command.Execute the stored
> procedure for each row.
>
> "nick" <n***@discussions.microsoft.com> wrote in message
> news:75FAE6C1-EEC1-4D8B-A3A2-073F10CD859E@microsoft.com...
> > Hi,
> >
> > I need to iterate over millions rows in a table and I need call an
> > extended
> > stored procedure (written in C++ and not possible be written in TSQL)
> > using
> > the columns of each row as parameters and write the return values to an
> > new
> > table. The current script open a cursor.
> >
> > What's the best way to implement it? BCP to a text file and external
> > program
> > parse and write the text file and BCP back? sp_cmdshell an executible for
> > each row? (so needn't worry about C++ memory leak issu). XML?.....
> >
> > Thanks,
>
>
>
Author
11 Nov 2005 3:09 PM
nick
In fact, the functionarity needs to be available on the server. So I will
created a program, maybe C# or C++ program to do the looping and calculate
and put the executible on the server so it can be launched via xp_cmdshell...
It should be better than big TSQL cursor?

Show quote
"JT" wrote:

> If you have a situation that calls for looping through a cursor, then it's
> better to implement the cursor on the client side than on the server. Open a
> read-only, forward only ADO recordset and Command.Execute the stored
> procedure for each row.
>
> "nick" <n***@discussions.microsoft.com> wrote in message
> news:75FAE6C1-EEC1-4D8B-A3A2-073F10CD859E@microsoft.com...
> > Hi,
> >
> > I need to iterate over millions rows in a table and I need call an
> > extended
> > stored procedure (written in C++ and not possible be written in TSQL)
> > using
> > the columns of each row as parameters and write the return values to an
> > new
> > table. The current script open a cursor.
> >
> > What's the best way to implement it? BCP to a text file and external
> > program
> > parse and write the text file and BCP back? sp_cmdshell an executible for
> > each row? (so needn't worry about C++ memory leak issu). XML?.....
> >
> > Thanks,
>
>
>
Author
11 Nov 2005 3:35 PM
nick
In fact, my question is

Transact-SQL script "declare cursor" (fast forward) vs. Client side ADO code
with fast forward server cursor

Which one is better for very large rows?


Show quote
"nick" wrote:

> In fact, the functionarity needs to be available on the server. So I will
> created a program, maybe C# or C++ program to do the looping and calculate
> and put the executible on the server so it can be launched via xp_cmdshell...
> It should be better than big TSQL cursor?
>
> "JT" wrote:
>
> > If you have a situation that calls for looping through a cursor, then it's
> > better to implement the cursor on the client side than on the server. Open a
> > read-only, forward only ADO recordset and Command.Execute the stored
> > procedure for each row.
> >
> > "nick" <n***@discussions.microsoft.com> wrote in message
> > news:75FAE6C1-EEC1-4D8B-A3A2-073F10CD859E@microsoft.com...
> > > Hi,
> > >
> > > I need to iterate over millions rows in a table and I need call an
> > > extended
> > > stored procedure (written in C++ and not possible be written in TSQL)
> > > using
> > > the columns of each row as parameters and write the return values to an
> > > new
> > > table. The current script open a cursor.
> > >
> > > What's the best way to implement it? BCP to a text file and external
> > > program
> > > parse and write the text file and BCP back? sp_cmdshell an executible for
> > > each row? (so needn't worry about C++ memory leak issu). XML?.....
> > >
> > > Thanks,
> >
> >
> >
Author
11 Nov 2005 3:47 PM
JT
After thinking about it more, I believe that DTS would be most appropriate
solution for what you are wanting to do: copy the contents of one table to
another table while performing data transformations. You can use a VBScript
or JScript task for performing the complex transformation. The DTS package
is a standard SQL Server approach, it can be contained within the database,
and can be scheduled via a job.

Show quote
"nick" <n***@discussions.microsoft.com> wrote in message
news:0B329B5A-860D-475B-BC24-DF284BEEE547@microsoft.com...
> In fact, my question is
>
> Transact-SQL script "declare cursor" (fast forward) vs. Client side ADO
> code
> with fast forward server cursor
>
> Which one is better for very large rows?
>
>
> "nick" wrote:
>
>> In fact, the functionarity needs to be available on the server. So I will
>> created a program, maybe C# or C++ program to do the looping and
>> calculate
>> and put the executible on the server so it can be launched via
>> xp_cmdshell...
>> It should be better than big TSQL cursor?
>>
>> "JT" wrote:
>>
>> > If you have a situation that calls for looping through a cursor, then
>> > it's
>> > better to implement the cursor on the client side than on the server.
>> > Open a
>> > read-only, forward only ADO recordset and Command.Execute the stored
>> > procedure for each row.
>> >
>> > "nick" <n***@discussions.microsoft.com> wrote in message
>> > news:75FAE6C1-EEC1-4D8B-A3A2-073F10CD859E@microsoft.com...
>> > > Hi,
>> > >
>> > > I need to iterate over millions rows in a table and I need call an
>> > > extended
>> > > stored procedure (written in C++ and not possible be written in TSQL)
>> > > using
>> > > the columns of each row as parameters and write the return values to
>> > > an
>> > > new
>> > > table. The current script open a cursor.
>> > >
>> > > What's the best way to implement it? BCP to a text file and external
>> > > program
>> > > parse and write the text file and BCP back? sp_cmdshell an executible
>> > > for
>> > > each row? (so needn't worry about C++ memory leak issu). XML?.....
>> > >
>> > > Thanks,
>> >
>> >
>> >
Author
11 Nov 2005 4:25 PM
nick
Thanks. In fact, each row of the very big table provides the parameters for a
complex C++ function written by others. The source code is available but I
better not touch the C++ code since the author may upgrade. The C++ code will
return quite a few values for each row and these values need to written in a
SQL table. The process is simple but the data amount is big.

Maybe use JScript in DTS to call an executible? But need to parse the
input/ouput...

Show quote
"JT" wrote:

> After thinking about it more, I believe that DTS would be most appropriate
> solution for what you are wanting to do: copy the contents of one table to
> another table while performing data transformations. You can use a VBScript
> or JScript task for performing the complex transformation. The DTS package
> is a standard SQL Server approach, it can be contained within the database,
> and can be scheduled via a job.
>
> "nick" <n***@discussions.microsoft.com> wrote in message
> news:0B329B5A-860D-475B-BC24-DF284BEEE547@microsoft.com...
> > In fact, my question is
> >
> > Transact-SQL script "declare cursor" (fast forward) vs. Client side ADO
> > code
> > with fast forward server cursor
> >
> > Which one is better for very large rows?
> >
> >
> > "nick" wrote:
> >
> >> In fact, the functionarity needs to be available on the server. So I will
> >> created a program, maybe C# or C++ program to do the looping and
> >> calculate
> >> and put the executible on the server so it can be launched via
> >> xp_cmdshell...
> >> It should be better than big TSQL cursor?
> >>
> >> "JT" wrote:
> >>
> >> > If you have a situation that calls for looping through a cursor, then
> >> > it's
> >> > better to implement the cursor on the client side than on the server.
> >> > Open a
> >> > read-only, forward only ADO recordset and Command.Execute the stored
> >> > procedure for each row.
> >> >
> >> > "nick" <n***@discussions.microsoft.com> wrote in message
> >> > news:75FAE6C1-EEC1-4D8B-A3A2-073F10CD859E@microsoft.com...
> >> > > Hi,
> >> > >
> >> > > I need to iterate over millions rows in a table and I need call an
> >> > > extended
> >> > > stored procedure (written in C++ and not possible be written in TSQL)
> >> > > using
> >> > > the columns of each row as parameters and write the return values to
> >> > > an
> >> > > new
> >> > > table. The current script open a cursor.
> >> > >
> >> > > What's the best way to implement it? BCP to a text file and external
> >> > > program
> >> > > parse and write the text file and BCP back? sp_cmdshell an executible
> >> > > for
> >> > > each row? (so needn't worry about C++ memory leak issu). XML?.....
> >> > >
> >> > > Thanks,
> >> >
> >> >
> >> >
>
>
>
Author
11 Nov 2005 4:56 PM
JT
What you just described is not really 'data transformation', but some type
of data entry that needs to be automated using a large list of parameters
supplied by a 3rd party. In that case, rather than DTS, I would suggest
writing a client side application that:

#1    Opens a forwardonly / readonly recordset recordset from the source
table
#2    Steps through the recorset from top to bottom (not really looping)
#3    For each record, call the C++ executable using parameters from the
record
#4    Write the result of each SP call to a tab delimited text file
#5    Once the last record has been processed, close the recordset, and use
the bulk copy utility (BCP.EXE) to load the tab delimited file into the
destination table.
#6    Truncate the source table?

I would not reccomend running the executable on the database server, becuase
it would provide only a marginal performance benefit but result in security,
deployment, and resource utilization issues.

Read up on the 'bcp utility' in SQL Server Books Online.

The following article describes a bulk loading, with an emphasis on
optimizing performance, but may provide more detail than what you need.
SQL Server 2000 Incremental Bulk Load Case Study
http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/incbulkload.mspx

Show quote
"nick" <n***@discussions.microsoft.com> wrote in message
news:BCA25342-6927-4416-9F0F-F9CCA2E76FA5@microsoft.com...
> Thanks. In fact, each row of the very big table provides the parameters
> for a
> complex C++ function written by others. The source code is available but I
> better not touch the C++ code since the author may upgrade. The C++ code
> will
> return quite a few values for each row and these values need to written in
> a
> SQL table. The process is simple but the data amount is big.
>
> Maybe use JScript in DTS to call an executible? But need to parse the
> input/ouput...
>
> "JT" wrote:
>
>> After thinking about it more, I believe that DTS would be most
>> appropriate
>> solution for what you are wanting to do: copy the contents of one table
>> to
>> another table while performing data transformations. You can use a
>> VBScript
>> or JScript task for performing the complex transformation. The DTS
>> package
>> is a standard SQL Server approach, it can be contained within the
>> database,
>> and can be scheduled via a job.
>>
>> "nick" <n***@discussions.microsoft.com> wrote in message
>> news:0B329B5A-860D-475B-BC24-DF284BEEE547@microsoft.com...
>> > In fact, my question is
>> >
>> > Transact-SQL script "declare cursor" (fast forward) vs. Client side ADO
>> > code
>> > with fast forward server cursor
>> >
>> > Which one is better for very large rows?
>> >
>> >
>> > "nick" wrote:
>> >
>> >> In fact, the functionarity needs to be available on the server. So I
>> >> will
>> >> created a program, maybe C# or C++ program to do the looping and
>> >> calculate
>> >> and put the executible on the server so it can be launched via
>> >> xp_cmdshell...
>> >> It should be better than big TSQL cursor?
>> >>
>> >> "JT" wrote:
>> >>
>> >> > If you have a situation that calls for looping through a cursor,
>> >> > then
>> >> > it's
>> >> > better to implement the cursor on the client side than on the
>> >> > server.
>> >> > Open a
>> >> > read-only, forward only ADO recordset and Command.Execute the stored
>> >> > procedure for each row.
>> >> >
>> >> > "nick" <n***@discussions.microsoft.com> wrote in message
>> >> > news:75FAE6C1-EEC1-4D8B-A3A2-073F10CD859E@microsoft.com...
>> >> > > Hi,
>> >> > >
>> >> > > I need to iterate over millions rows in a table and I need call an
>> >> > > extended
>> >> > > stored procedure (written in C++ and not possible be written in
>> >> > > TSQL)
>> >> > > using
>> >> > > the columns of each row as parameters and write the return values
>> >> > > to
>> >> > > an
>> >> > > new
>> >> > > table. The current script open a cursor.
>> >> > >
>> >> > > What's the best way to implement it? BCP to a text file and
>> >> > > external
>> >> > > program
>> >> > > parse and write the text file and BCP back? sp_cmdshell an
>> >> > > executible
>> >> > > for
>> >> > > each row? (so needn't worry about C++ memory leak issu). XML?.....
>> >> > >
>> >> > > Thanks,
>> >> >
>> >> >
>> >> >
>>
>>
>>
Author
11 Nov 2005 5:19 PM
nick
Thanks. The existed solution is:

#1 A "wrapper"(extended stored procedure) to call the C++ routines.
#2 A stored procedure which declares a fast forward cursor over the large
table and fetch every row and calls the extended stored procedure with
fetched values as parameters.

I am not sure about the C++ code and worry about memory leaks, etc, that's
the reason I am seeking an alternative solution. Assume the C++ code (for
both 3rd party code and extended SP) is good, what's the issue of the existed
approach? Performance/resource usage?

Show quote
"JT" wrote:

> What you just described is not really 'data transformation', but some type
> of data entry that needs to be automated using a large list of parameters
> supplied by a 3rd party. In that case, rather than DTS, I would suggest
> writing a client side application that:
>
> #1    Opens a forwardonly / readonly recordset recordset from the source
> table
> #2    Steps through the recorset from top to bottom (not really looping)
> #3    For each record, call the C++ executable using parameters from the
> record
> #4    Write the result of each SP call to a tab delimited text file
> #5    Once the last record has been processed, close the recordset, and use
> the bulk copy utility (BCP.EXE) to load the tab delimited file into the
> destination table.
> #6    Truncate the source table?
>
> I would not reccomend running the executable on the database server, becuase
> it would provide only a marginal performance benefit but result in security,
> deployment, and resource utilization issues.
>
> Read up on the 'bcp utility' in SQL Server Books Online.
>
> The following article describes a bulk loading, with an emphasis on
> optimizing performance, but may provide more detail than what you need.
> SQL Server 2000 Incremental Bulk Load Case Study
> http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/incbulkload.mspx
>
> "nick" <n***@discussions.microsoft.com> wrote in message
> news:BCA25342-6927-4416-9F0F-F9CCA2E76FA5@microsoft.com...
> > Thanks. In fact, each row of the very big table provides the parameters
> > for a
> > complex C++ function written by others. The source code is available but I
> > better not touch the C++ code since the author may upgrade. The C++ code
> > will
> > return quite a few values for each row and these values need to written in
> > a
> > SQL table. The process is simple but the data amount is big.
> >
> > Maybe use JScript in DTS to call an executible? But need to parse the
> > input/ouput...
> >
> > "JT" wrote:
> >

AddThis Social Bookmark Button