Home All Groups Group Topic Archive Search About

many small queries vs one large query

Author
6 Apr 2006 1:06 PM
Allan Ebdrup
I just had a discussion with my boss, we are running a query that fetches
about 5000 objects from the database using MSSQL and DotNet 2.0.
My boss thinks that the recommendation of keeping the connections open for
as short a time as possible means that one shoule open a connection, fetch
one row, close the connection, open another connection, fetch one row and
close the connection and so on.
I belive that the extra overhead of opening and closing connections all the
time, of generating execution plans for each query, and running the queries
many times gives his solution a large performance hit. My recommendation
will be to fetch all the rows needed in one large SQL query.
I've done several other projects where I have proven this to be true, there
is indeed a large performance hit from generating many small queries instead
of one large. But my boss just says "no" and disagrees.

Can you give me some good arguments and/or point me to some best practice
documents that describe this so I can convince my boss he's wrong?

Kind Regards,
Allan Ebdrup

FUT: microsoft.public.sqlserver.programming

Author
6 Apr 2006 1:26 PM
David Browne
Show quote
"Allan Ebdrup" <ebdrup@noemail.noemail> wrote in message
news:%23kIhprXWGHA.3332@TK2MSFTNGP02.phx.gbl...
>I just had a discussion with my boss, we are running a query that fetches
>about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
> the time, of generating execution plans for each query, and running the
> queries many times gives his solution a large performance hit. My
> recommendation will be to fetch all the rows needed in one large SQL
> query.
> I've done several other projects where I have proven this to be true,
> there is indeed a large performance hit from generating many small queries
> instead of one large. But my boss just says "no" and disagrees.
>
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
>


Don't argue with your boss.  Implement it both ways and show him.

David
Author
6 Apr 2006 1:44 PM
Allan Ebdrup
> Don't argue with your boss.  Implement it both ways and show him.

Hi David
Unfortunately it's him who's doing the implementing.
Author
6 Apr 2006 2:28 PM
ML
SQL Server is quite capable of handling long connections, and as long as
they're actually doing something they make perfect sense. Many short lasting
connections may seem to put less stress on the server, but in fact the
overhead of many connections versus a single connection, when calls to the
server constitute a single business operation, must be multiplied with the
number of connections to give a true estimate of the actual (and expected)
stress on the server.

Will puts it very simply (and very true): 5000 connections where each lasts
50ms might kick the server pretty hard. 2nd grade mathematics.


ML

---
http://milambda.blogspot.com/
Author
8 Apr 2006 12:47 PM
Erland Sommarskog
ML (M*@discussions.microsoft.com) writes:
> SQL Server is quite capable of handling long connections, and as long as
> they're actually doing something they make perfect sense.

Yes, if you design a Windows-based application, it makes perfect sense
to have one global conection that stays connected as long as the application
is running. At least as long as the number of users is moderate, and most
users are active most of the time.

Nevertheless, disconnected mode has become the prime model for client
programming these days. I guess there are two major reasons for this:

1) For a web-based app, it's not a good idea to have connection per web
   user, as there could be a lot of user. At the same time, all connections
   are from the same process: the web server. So pooling seems like a
   very good idea.

2) The programming becomes easier to handle when you have to drag a
   global object around, but each function that needs to get or update
   some data, can connect to perform its operation and so, without
   interferring with others. For instance, assume that when you run
   loop with ExecuteReader and in the middle of the loop you call some
   some other function that needs to access the database. With a global
   object you run into conflict.



--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
10 Apr 2006 2:27 PM
ML
As I've mentioned im my other post, for parallel tasks one connection may not
be enough, but for consecutive tasks many conections are wasteful.


ML

---
http://milambda.blogspot.com/
Author
6 Apr 2006 1:37 PM
Will
I can guarantee, having had to optimise an application that used your
bosses tactics, it will destroy your app. it's not just the opening and
closing of connections, the whole act of retrieving data adds a
significant overhead, you see if you were getting a load of records
based on record id, you'd be making about 5000 separate scans of a
table based on it's id column, each one getting one record. This means
that when your index get's fragmented and takes 1 second more, your
operation takes 5000 seconds more. If you were making one large
statement, your index getting fragmented adds 1 second on to the whole
operation.

I can't stress this enough - your boss does not know SQL server very
well

Cheers
Will
Author
6 Apr 2006 1:37 PM
Spam Catcher
"Allan Ebdrup" <ebdrup@noemail.noemail> wrote in
news:#kIhprXWGHA.3332@TK2MSFTNGP02.phx.gbl:

> I've done several other projects where I have proven this to be true,
> there is indeed a large performance hit from generating many small
> queries instead of one large. But my boss just says "no" and disagrees

I would have submitted one large bulk query as well.

Perhaps multiple small queries would reduce the time a table is locked?

Any other solutions? Maybe open a connection - execute a long running SP... 
and fetch the results after the fact?
Author
6 Apr 2006 1:45 PM
Will
My post just now seems to have got lost (wonder where it went).

To summarise what I said:

your boss is making a massive mistake. I've seen this tactic destroy a
large application.

it's fine at first, your queries all run nearly instantaneously, and
you'll load test it later, no need to worry about that right now.

then a table gets big. suddenly the instant query takes 50 milliseconds
to get the data. not a problem if you're only doing one select. if
you're doing 5000 separate ones it takes 5000x50 milliseconds.
Author
6 Apr 2006 2:16 PM
Jim Underwood
IF your query is executing 5000 times against the same tables and retrieving
the same columns with selection criteria against the same columns, then
running a single query to retreive all 5000 rows would be faster, without
question.
i.e.
    select employee_id, employee_name from employees
Will have a connection open for much less time than cycling through each
employee with:
    select employee_id, employee_name from employee where employee_id =
'SOMEID'

The first one will also use less CPU on the database, less IO, less network
traffic, etc.  You can play with the code to control how you open the
dataset once you have retrieved it, but the first SQL will almost always be
faster.  The only exception I can think of is if your client does not have
enough memory to handle the entire dataset.

If you have more complicated logic, and you are running against all
different tables, and combining this into one query means extensive joins
and business logic in the where clause, then you need to test each approach
and see what the performance gain is with the view.

Opening and closing connections for each communication wiht the database is
usually most efficient in my experience.  Connection pooling manages the
resources really well.  The connections do not really get opened and closed,
rather they stay open and available, but the client is able to free up some
of the resources associated with them between each call.  This assumes that
you actually need to execute multiple queries to begin with.

Making 5000 seperate calls to the database will use much more network
resources than making a single call to the database to retreive the same
data.  That part is a no brainer.  Dependign on what you have for SQL, it
will not necessarily need to generate a plan every time (one would hope you
are using stored procedures with parameters and the plans are cached).
However, there would be some small overhead associated with running each
command.


Show quote
"Allan Ebdrup" <ebdrup@noemail.noemail> wrote in message
news:%23kIhprXWGHA.3332@TK2MSFTNGP02.phx.gbl...
> I just had a discussion with my boss, we are running a query that fetches
> about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
the
> time, of generating execution plans for each query, and running the
queries
> many times gives his solution a large performance hit. My recommendation
> will be to fetch all the rows needed in one large SQL query.
> I've done several other projects where I have proven this to be true,
there
> is indeed a large performance hit from generating many small queries
instead
> of one large. But my boss just says "no" and disagrees.
>
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
>
> Kind Regards,
> Allan Ebdrup
>
> FUT: microsoft.public.sqlserver.programming
>
>
Author
6 Apr 2006 2:35 PM
ML
If the calls to the database are made against isolated objects (i.e. against
different tables or to procedures that target different tables), then
individual connections may benefit from being initiated asynchronously. But
it would make more sense in issuing a single call to related objects for a
single business operation. At least IMHO.


ML

---
http://milambda.blogspot.com/
Author
6 Apr 2006 4:01 PM
Grant
All posts made a good point and I agree that your boss is wrong. It's a no
wonder he is a not a SQL Programmer but he is your boss. He signs your
paycheck so you may not have a choice. I often had to bow to what my boss
wants, hate it and it sucks but what can I do.


--
Grant

Who gives a {censored} if I am wrong.
Show quote
"Allan Ebdrup" <ebdrup@noemail.noemail> wrote in message
news:%23kIhprXWGHA.3332@TK2MSFTNGP02.phx.gbl...
>I just had a discussion with my boss, we are running a query that fetches
>about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
> the time, of generating execution plans for each query, and running the
> queries many times gives his solution a large performance hit. My
> recommendation will be to fetch all the rows needed in one large SQL
> query.
> I've done several other projects where I have proven this to be true,
> there is indeed a large performance hit from generating many small queries
> instead of one large. But my boss just says "no" and disagrees.
>
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
>
> Kind Regards,
> Allan Ebdrup
>
> FUT: microsoft.public.sqlserver.programming
>
Author
6 Apr 2006 4:06 PM
Grant
I forget to add this, leave a note so that way the future sql programming
will not blame you for that bad sql programming. I have plenty of notes if
only my boss reads my notes LOL.

--
Grant

Who gives a {censored} if I am wrong.
Show quote
"Grant" <em***@nowhere.com> wrote in message
news:%23wgVuOZWGHA.4452@TK2MSFTNGP04.phx.gbl...
> All posts made a good point and I agree that your boss is wrong. It's a no
> wonder he is a not a SQL Programmer but he is your boss. He signs your
> paycheck so you may not have a choice. I often had to bow to what my boss
> wants, hate it and it sucks but what can I do.
>
>
> --
> Grant
>
> Who gives a {censored} if I am wrong.
> "Allan Ebdrup" <ebdrup@noemail.noemail> wrote in message
> news:%23kIhprXWGHA.3332@TK2MSFTNGP02.phx.gbl...
>>I just had a discussion with my boss, we are running a query that fetches
>>about 5000 objects from the database using MSSQL and DotNet 2.0.
>> My boss thinks that the recommendation of keeping the connections open
>> for as short a time as possible means that one shoule open a connection,
>> fetch one row, close the connection, open another connection, fetch one
>> row and close the connection and so on.
>> I belive that the extra overhead of opening and closing connections all
>> the time, of generating execution plans for each query, and running the
>> queries many times gives his solution a large performance hit. My
>> recommendation will be to fetch all the rows needed in one large SQL
>> query.
>> I've done several other projects where I have proven this to be true,
>> there is indeed a large performance hit from generating many small
>> queries instead of one large. But my boss just says "no" and disagrees.
>>
>> Can you give me some good arguments and/or point me to some best practice
>> documents that describe this so I can convince my boss he's wrong?
>>
>> Kind Regards,
>> Allan Ebdrup
>>
>> FUT: microsoft.public.sqlserver.programming
>>
>
>
Author
6 Apr 2006 4:08 PM
Martin C K Poon
If responsiveness of your application is important, you might consider to
fetch a certain amount of objects per fetch (say 100 objects, depending on
the size of your objects), instead of fetching 5000 objects together.

If you are going to get 5000 objects at once, while responsiveness is not a
problem, you should fetch all rows using one query.

--
Martin C K Poon
Senior Analyst Programmer
====================================
Show quote
"Allan Ebdrup" <ebdrup@noemail.noemail> ¦b¶l¥ó
news:%23kIhprXWGHA.3332@TK2MSFTNGP02.phx.gbl ¤¤¼¶¼g...
> I just had a discussion with my boss, we are running a query that fetches
> about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
the
> time, of generating execution plans for each query, and running the
queries
> many times gives his solution a large performance hit. My recommendation
> will be to fetch all the rows needed in one large SQL query.
> I've done several other projects where I have proven this to be true,
there
> is indeed a large performance hit from generating many small queries
instead
> of one large. But my boss just says "no" and disagrees.
>
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
>
> Kind Regards,
> Allan Ebdrup
>
> FUT: microsoft.public.sqlserver.programming
>
>
Author
7 Apr 2006 4:53 AM
Jéjé
you are right, opening / closing a connection is time consuming.
I have no number for this overhead.
A developper try to minimize the number of connections and monitor the
connection pool used by the .Net 2.0.

With SQL 2005 and the MARS capability, you are able to execute multiple
commands using the same connection.

maybe your boss misunderstand the difference between retrieving 1 row and
releasing a connection as quickly as possible.
For good performance you have to release the connection as quickly as
possible to allow another user to use it and to release the database
resources.
also you have to retrieve just what you need, don't read 5000rows if you'll
display only 100rows.
but if you want to cache your result, retrieving more rows is better.


using a good transaction option, you can insure that you'll get all the data
from all the objects at a point in time. so you insure a good integrity in
your data. But this could cause some lock issues.

in another hand, retrieving each row individually insure that the user will
have the latest version of the row, but the data integrity is not guarantee.
if the employeeID change in table 1, and you don't have read the employee
name for this specific ID, you'll have some problems if you try to join the
data in your application.
But this method could fill your application pool more quickly because you'll
ask for a new connection before the previous connection has been correctly
re-pooled.
And each connection in the pool consume your SQL Server resource (the
connection still alive)


I Think you can simply create a sample application to test the 2 options and
monitor the performance.

Show quote
"Allan Ebdrup" <ebdrup@noemail.noemail> wrote in message
news:%23kIhprXWGHA.3332@TK2MSFTNGP02.phx.gbl...
>I just had a discussion with my boss, we are running a query that fetches
>about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
> the time, of generating execution plans for each query, and running the
> queries many times gives his solution a large performance hit. My
> recommendation will be to fetch all the rows needed in one large SQL
> query.
> I've done several other projects where I have proven this to be true,
> there is indeed a large performance hit from generating many small queries
> instead of one large. But my boss just says "no" and disagrees.
>
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
>
> Kind Regards,
> Allan Ebdrup
>
> FUT: microsoft.public.sqlserver.programming
>
Author
8 Apr 2006 12:33 PM
Erland Sommarskog
Allan Ebdrup (ebdrup@noemail.noemail) writes:
> I just had a discussion with my boss, we are running a query that
> fetches about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open
> for as short a time as possible means that one shoule open a connection,
> fetch one row, close the connection, open another connection, fetch one
> row and close the connection and so on. I belive that the extra overhead
> of opening and closing connections all the time, of generating execution
> plans for each query, and running the queries many times gives his
> solution a large performance hit. My recommendation will be to fetch all
> the rows needed in one large SQL query. I've done several other projects
> where I have proven this to be true, there is indeed a large performance
> hit from generating many small queries instead of one large. But my boss
> just says "no" and disagrees.

You conclusions are correct, but some of the arguments are not.

There is indeed an overhead to open and a close a physical connection.
For this reason, ADO .Net maintains a connection pool, so that when you
say cn.Close, ADO .Net does not close the physical connection to the
server. Instead, it keeps the connection open, and if you request a
new connection with the same connection properties within 60 seconds,
the connection will be reused. In this way the overhead is minimized.
There is still a network roundtrip when you reconnect however, as ADO .Net
issues the command sp_reset_connection to make sure that no settings from
the previous connection lingers around.

The problem with your boss's reasoning is not really that he wants to
connect and disconnect. It would be equally bad if he srayed connected
and then fetched one row at a time. Of course, if he stays connected,
he can use ExecuteReader which will buffer under the hood. When reconnects,
he needs to issue a query each time, and all those network roundtrips
will be detrimental to the performance.

It should be easy to write an sample application that emplooys different
models of getting data:
1) The boss method: connect, issue a query, fetch one row, disconnect.
2) Loop over ExecuteRead.
3) Datadatper.Fill.

2 and 3 will probably have very similar performance. 1) will crawl like
a snail.

If the boss does not change his mind, get a new boss.


--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
11 Apr 2006 2:18 AM
Ryan Hunt
Alan, your question is very vague - which may be leading to the lack of
answers.  When you say you are fetching 5000 objects, do you mean records?

What is going to be best depends on a lot.  In general, yes, continually
opening and closing connections is going to incur overhead on both your DB
server and your app server.  Your concerns about the small queries can be
somewhat allayed by the efficient plan reuse from SQL 2000 and 2005.  Once a
query is run, subsequent queries with similar paramters (same column args
and similar cardinality) will use the same compiled plan from the cache.
So, that part doesn't disprove your boss.

Now, where your argument comes into play is in network traffic.  50 query
calls of 100 records each will require 49 more ack/syn TCP handshakes
(server and network CPU) and network traversal (network latency)than is
necessary - not to mention that you are also sending a simiar query command
(as text) to the server 50 times which does take up additional network
bandwith.  Furthmore, 50 different queries will result in many more sql
server RPC's than just one query - which again increases overall CPU.  Also,
if you are trying to break up 5000 records into smaller symetrical chunks
(but in the end you need all 5000), then your queries become more complex.
You are now selecting records in a pseudo ordinal format by assigning a
numbering sequence - which is likely to increase the complexity of the SQL,
require more logic, and require record sorts - which will all increase the
CPU.

In my experience with web apps, its better to increase the memory on your
web/application server by a gig or so and simply return all 5000 records in
one fell swoop (as long as all records are required).  Let the web server
render multiple pages and separate the records into workable chunks for the
users.

If you are looking for a very active SQL forum, check out
http://www.sqlservercentral.com/Forums/Default.aspx

Best of luck.

Ryan

Show quote
"Allan Ebdrup" <ebdrup@noemail.noemail> wrote in message
news:%23kIhprXWGHA.3332@TK2MSFTNGP02.phx.gbl...
> I just had a discussion with my boss, we are running a query that fetches
> about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
the
> time, of generating execution plans for each query, and running the
queries
> many times gives his solution a large performance hit. My recommendation
> will be to fetch all the rows needed in one large SQL query.
> I've done several other projects where I have proven this to be true,
there
> is indeed a large performance hit from generating many small queries
instead
> of one large. But my boss just says "no" and disagrees.
>
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
>
> Kind Regards,
> Allan Ebdrup
>
> FUT: microsoft.public.sqlserver.programming
>
>
Author
11 Apr 2006 9:46 AM
Will
One point that is perhaps not relevant to SQL, but might be worth
considering - is it necessary to have 5000 objects in a web
application. If you're basically turning your database structure into a
massive framework, you'll be transferring huge amounts of data (whether
done in one big query or lots of small ones), and you'll be storing
that in memory on your web server (I would assume in Session), so not
only will your web server have to handle the web requests and the
presentation layer, you'll also be making it handle a large amount of
application logic. Also if you're storing the result of your query in
session then you'll be storing that per user, which defeats the
scalability benefits of a web application.

If you're not persisting these objects throughout the users session,
that means you'll be pulling 5000 objects over your network per
request!

I'm just guessing at the architecture here I admit, but might it be the
time to put a bit more logic in the stored procedures, then just return
a small result set?
Author
11 Apr 2006 1:20 PM
Allan Ebdrup
It's actually an agent that runs scheduled every night.
Author
11 Apr 2006 1:47 PM
Will
ok, that caught me out.

I think everyone was working on the assumption that you were trying to
reduce user response times. If it's just a scheduled task that no-one
even sees, there's no problem with having a long running sql statement,
just remember to set the timeout property on your sqlcommand object.
Author
11 Apr 2006 3:03 PM
ML
Maybe you should elaborate on this. What exactly is the business requirement?


ML

---
http://milambda.blogspot.com/
Author
12 Apr 2006 10:26 AM
Allan Ebdrup
"ML" <M*@discussions.microsoft.com> wrote in message
news:FAE1AE0D-5CE3-46D5-B220-FEB057783A48@microsoft.com...
> Maybe you should elaborate on this. What exactly is the business
> requirement?

We're running an agent that matches 5000 XML structures against 50000
others, we're loading it all into memory and processing there on one big
sweep.
It's to run in under 3 hours. Much less if possible.
Author
12 Apr 2006 10:46 AM
ML
Do all the objects need to be compared in a single run? Why?


ML

---
http://milambda.blogspot.com/
Author
12 Apr 2006 2:35 PM
Allan Ebdrup
"ML" <M*@discussions.microsoft.com> wrote in message
news:C97F49D4-0F3D-4738-B957-7FDFAF9524C4@microsoft.com...
> Do all the objects need to be compared in a single run? Why?

We can split them up into large chunks, but running them one at a time is to
slow, running more is faster.
Author
13 Apr 2006 8:32 AM
ML
What DML is involved here? Only retrieving or is data modified somehow and
written back to the database?

Read-only tasks (where no locking is needed) could run in parallel.

Parallel execution might also be possible if data needs to be locked (e.g.
in order to be modified), if individual processes can be isolated on the
database level (i.e. row level locks).


ML

---
http://milambda.blogspot.com/
Author
18 Apr 2006 12:35 PM
Allan Ebdrup
"ML" <M*@discussions.microsoft.com> wrote in message
news:4CA62DDA-77A5-4A9A-AEBB-70DC4D0D8235@microsoft.com...
> What DML is involved here? Only retrieving or is data modified somehow and
> written back to the database?
>
> Read-only tasks (where no locking is needed) could run in parallel.
>
> Parallel execution might also be possible if data needs to be locked (e.g.
> in order to be modified), if individual processes can be isolated on the
> database level (i.e. row level locks).

Good idea to run the stuff parallel, I'm not sure it's a multi-processor
machine it's running on, but I'll check it out. (The sql server is 8-way)

Kind Regards,
Allan Ebdrup
Author
12 Apr 2006 3:53 AM
Doug
it depends. are you retrieving objects, or rows?
how many bytes is the 5000 objects?
is the client capable of receiving the 5000 objects as fast as the
server can send them?
Author
12 Apr 2006 10:24 AM
Allan Ebdrup
"Doug" <drmiller***@hotmail.com> wrote in message
news:1144814016.810297.282890@z34g2000cwc.googlegroups.com...
> it depends. are you retrieving objects, or rows?
> how many bytes is the 5000 objects?
> is the client capable of receiving the 5000 objects as fast as the
> server can send them?

We're retrieving rows, the objects are of different sizes as there is a XML
Text field in the data that describes a small object model contained in the
object, this will not be made into objects by our program, simply modified
as XML. I guess the average size of an object is 10-30KB.
The client is an application server with a 1Gbs network between the two
servers.
Author
25 Apr 2006 3:38 AM
Jeremy
I think there would likely be a performance degredation of doing 5000
queries instead of one.  Not necessarily from opening and closing the
connection because.net uses connection pooling,so the actual the connection
to the db would likely stay open anyhow and when you close a connection in
..net it just gets returned to the pool, but sql server has to process a
query 5000 times instead of once.  As well, more dataset objects will get
instantiated and the garbage collection will work harder.

Put 5000 papers on your boss's desk.  Tell him he can run from your desk to
his and return 1 paper 5000 times, or he can run once and return all 5000.
If he says he won't be working harder by running back and forth 5000 times
get him to prove it!

If for example, you were needing to query client data and there were 5000
clients applicable, I disagree that 5000 queries would be the right way to
go, at least under normal circumstances, maybe there are special cases but
I'd think that would be very rare.

Sql is designed to deal with SETS of data, a set being 5000 records of data
in your case.

Lets say you were to go with your boss's idea, how would you determine what
the 5000 queries would be anyhow?  I mean you need to return some records to
the application but which ones?  If you wanted all of them, using a standard
approach, you'd select * from 1 or more tables.  Or maybe you'd add a where
clause to return based on a specific criteria.  Using your boss's approach,
how would you do this?  You'd need to know the records you wanted, and how
would you know that?

Lastly, if you queried 5000 times, internally sql server is going to have to
do a bunch of logic,  unless you're doing 5000 vastly different queries,
there will be a lot of duplication of effort on sql servers' part such as
table/index scans, joins, etc.

If' I'm understanding your situation correctly I'd say your boss is out to
lunch.


Show quote
"Allan Ebdrup" <ebdrup@noemail.noemail> wrote in message
news:#kIhprXWGHA.3332@TK2MSFTNGP02.phx.gbl...
> I just had a discussion with my boss, we are running a query that fetches
> about 5000 objects from the database using MSSQL and DotNet 2.0.
> My boss thinks that the recommendation of keeping the connections open for
> as short a time as possible means that one shoule open a connection, fetch
> one row, close the connection, open another connection, fetch one row and
> close the connection and so on.
> I belive that the extra overhead of opening and closing connections all
the
> time, of generating execution plans for each query, and running the
queries
> many times gives his solution a large performance hit. My recommendation
> will be to fetch all the rows needed in one large SQL query.
> I've done several other projects where I have proven this to be true,
there
> is indeed a large performance hit from generating many small queries
instead
> of one large. But my boss just says "no" and disagrees.
>
> Can you give me some good arguments and/or point me to some best practice
> documents that describe this so I can convince my boss he's wrong?
>
> Kind Regards,
> Allan Ebdrup
>
> FUT: microsoft.public.sqlserver.programming
>
>

AddThis Social Bookmark Button