|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Tough SQL problem, need expert advice!!!I think I am having a very tough problem, I need some expert advice here. Please bear with me since it will takes me a while to explain the situation. (Using SQL2005) I need to design the generic search utility on the database server (run as Web Service), client supply the search criteria (where clause) I need return only one field:id back to user, the data source the user need to search is coming from different relational database tables , there is a main table and a few sub tables, the main table has a primary key (id) and all the sub-table have the main tables primary key as foreign key., but there are not relationship among the sub tables, and client needs to able to search the all the fields in all the tables. Also, we have different database, each has its own main table and sub tables, and the structure of each table is different. for each set, I have those information stored in my own meta table. For each set I could create view, dynamically generate sql statement on the fly from the metadata table I have, but the code that generate the dynamic SQL has to be the same I have two ways to solve this problem but neither is satisfactory 1) Create a view or query that joins all the tables, because the sub-tables only related to each other with the main table's primary key, if I do a join on the main table and all sub- tables, I would create a Cartesian products. Select distinct id from cartesian_product_view where maintable.field1=field1 and subtable1.field2=fied2 and subtable3.field3=field3 Here on the main table with 40,000 rows, the cartesian_product_view has 100 million rows!!!!! 2) Substitute the where clause with subquery . for example, if the where clause is following: where subtable1.name ='john' or subtable2.product_code ='xyz' or subtable1.name ='tom I will replace it with the following : (please trust me I have the code to do the Substitution but it will be too hard to explain here) Select distinct maintable.id where exists (select * from subttable1 where name = 'john' and maintable.id=subtable1.id) or exists (select * from subttable2 where product_code = 'xyz' and maintable.id=subtable2.id) or exists (select * from subttable1 where name = 'tom' and maintable.id=subtable1.id) order by ... the problem with that is the query is getting slower & slower when user specify more search criteria, it spend more time parsing the dynamic query than querying the database, when you specify enough criteria (which translated into subqueries) SQL Server just gave up and throw the following exception. The query processor ran out of internal resources and could not produce a query plan. This is a rare event and only expected for extremely complex queries or queries that reference a very large number of tables or partitions. Please simplify the query. If you believe you have received this message in error, contact Customer Support Services for more information. So I am totally stuck. Please advice. Any suggestions and advice is greatly appreciated. Thanks in advance. John John (john***@yahoo.com) writes:
> (Using SQL2005) I need to design the generic search utility on the The WHERE clause is not coming as a parameter, I hope?> database server (run as Web Service), client supply the search criteria > (where clause) Show quoteHide quote > 2) Substitute the where clause with subquery . for example, if the So, the rule is that if any of the criterias match, the row should> where clause is following: > > where subtable1.name ='john' or subtable2.product_code ='xyz' > or subtable1.name ='tom > > I will replace it with the following : (please trust me I have the code > to do the Substitution but it will be too hard to explain here) > > Select distinct maintable.id where > exists (select * from subttable1 where name = 'john' and > maintable.id=subtable1.id) > or > exists (select * from subttable2 where product_code = 'xyz' and > maintable.id=subtable2.id) > or > exists (select * from subttable1 where name = 'tom' and > maintable.id=subtable1.id) > order by ... be returned. Often dynamic searches are the other way round: all should match. One thing you could try is: Select maintable.id where exists (select * from subttable1 where name = 'john' and maintable.id=subtable1.id) union Select maintable.id exists (select * from subttable2 where product_code = 'xyz' and maintable.id=subtable2.id) union Select maintable.id exists (select * from subttable1 where name = 'tom' and maintable.id=subtable1.id) If the query processor chokes, you could insert ids into temp tables as you handle the tables, one by one. (Or two by two or whatever.) > the problem with that is the query is getting slower & slower when user If more criterias means more data to search and return that may be> specify more search criteria, inevitable. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx Unfortunately, the client supply the where clause, which could be
anything, such as (Lastname='johnson' or firstname='smith') and (city='Los Angeles') or productid in (234,23434,33) .... So it is very hard for me to break the search criteria into different part and do a union considering the different logical operator combination Erland Sommarskog wrote: Show quoteHide quote > John (john***@yahoo.com) writes: > > (Using SQL2005) I need to design the generic search utility on the > > database server (run as Web Service), client supply the search criteria > > (where clause) > > The WHERE clause is not coming as a parameter, I hope? > > > 2) Substitute the where clause with subquery . for example, if the > > where clause is following: > > > > where subtable1.name ='john' or subtable2.product_code ='xyz' > > or subtable1.name ='tom > > > > I will replace it with the following : (please trust me I have the code > > to do the Substitution but it will be too hard to explain here) > > > > Select distinct maintable.id where > > exists (select * from subttable1 where name = 'john' and > > maintable.id=subtable1.id) > > or > > exists (select * from subttable2 where product_code = 'xyz' and > > maintable.id=subtable2.id) > > or > > exists (select * from subttable1 where name = 'tom' and > > maintable.id=subtable1.id) > > order by ... > > So, the rule is that if any of the criterias match, the row should > be returned. Often dynamic searches are the other way round: all > should match. > > One thing you could try is: > > Select maintable.id > where exists (select * from subttable1 where name = 'john' and > maintable.id=subtable1.id) > union > Select maintable.id > exists (select * from subttable2 where product_code = 'xyz' and > maintable.id=subtable2.id) > union > Select maintable.id > exists (select * from subttable1 where name = 'tom' and > maintable.id=subtable1.id) > > If the query processor chokes, you could insert ids into temp tables > as you handle the tables, one by one. (Or two by two or whatever.) > > > the problem with that is the query is getting slower & slower when user > > specify more search criteria, > > If more criterias means more data to search and return that may be > inevitable. > > > -- > Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se > > Books Online for SQL Server 2005 at > http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx > Books Online for SQL Server 2000 at > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx John (john***@yahoo.com) writes:
> Unfortunately, the client supply the where clause, which could be And who got that bright idea?> anything, such as > (Lastname='johnson' or firstname='smith') and (city='Los > Angeles') or productid in (234,23434,33) .... > So it is very hard for me to break the search criteria into different > part and do a union considering the different logical operator > combination Sending WHERE clauses as parameters is a really bad idea. Either you compose the entire SQL statement in the client. Or you have all the SQL in the stored procedure. What you have now is the worst of both worlds. It will be very difficult to work well. And obvoiusly there is a risk with SQL injection. And the procedure cache will be a waste. Back to the drawing board, if you want to hear my opinion. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx Well, when I say I allow client to specify the where clause, that is a
just simple way to put it, actually I publish a list of client is allowed to do search, which is different name than the database field name, I have a table to map the field name the client is using to real database field name ( or XML Query) when I get the where clause I do the parsing and convert the client supplied name into real database name (or XML Query expression), since the parse is doing all the checks, I am not worry about the SQL injection You may ask why I am going through so much trouble, the reason it I am writing a framework that needs to be extendable, which means I design the basic infrastructure, other group could design application sitting on top of the framework. For example, they could design their own database, tables, as long as they following the general design, they could reuse everything I am doing. Erland Sommarskog wrote: Show quoteHide quote > John (john***@yahoo.com) writes: > > Unfortunately, the client supply the where clause, which could be > > anything, such as > > (Lastname='johnson' or firstname='smith') and (city='Los > > Angeles') or productid in (234,23434,33) .... > > So it is very hard for me to break the search criteria into different > > part and do a union considering the different logical operator > > combination > > And who got that bright idea? > > Sending WHERE clauses as parameters is a really bad idea. Either you > compose the entire SQL statement in the client. Or you have all the > SQL in the stored procedure. What you have now is the worst of both > worlds. > > It will be very difficult to work well. And obvoiusly there is a risk > with SQL injection. And the procedure cache will be a waste. > > Back to the drawing board, if you want to hear my opinion. > > > > -- > Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se > > Books Online for SQL Server 2005 at > http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx > Books Online for SQL Server 2000 at > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx John (john***@yahoo.com) writes:
Show quoteHide quote > Well, when I say I allow client to specify the where clause, that is a Now, I understand lesser and lesser. If you are parsing the input, you> just simple way to put it, actually I publish a list of client is > allowed to do search, which is different name than the database field > name, I have a table to map the field name the client is using to real > database field name ( or XML Query) > when I get the where clause I do the parsing and convert the client > supplied name into real database name (or XML Query expression), since > the parse is doing all the checks, I am not worry about the SQL > injection > > You may ask why I am going through so much trouble, the reason it I am > writing a framework that needs to be extendable, which means I design > the basic infrastructure, other group could design application sitting > on top of the framework. For example, they could design their own > database, tables, as long as they following the general design, they > could reuse everything I am doing. should be able to break up in the input so you could try the union thing. Looks like a very good idea to do this in a CLR procedure, by the way. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx Unfortunately I did not write the parser, it is a generic component,
passing me the following information 1) Field 2) Operation 3) Value I could replace all of the value above but the parser does not passing me the logic operator, even if it does, it is going to be very hard to for me to do a union because I could get very complicated logical operation and I could end up query the same table multiple time for different fields. Such as (Lastname='johnson' or firstname='smith') and (city='Los Angeles') or (productid in (234,23434,33) and (Lastname='Smith' or firstname='tom')) also for "and" operation, I will need to do a intersection instead of union. I do not believe SQLServer provide this function on transaction sql. Erland Sommarskog wrote: Show quoteHide quote > John (john***@yahoo.com) writes: > > Well, when I say I allow client to specify the where clause, that is a > > just simple way to put it, actually I publish a list of client is > > allowed to do search, which is different name than the database field > > name, I have a table to map the field name the client is using to real > > database field name ( or XML Query) > > when I get the where clause I do the parsing and convert the client > > supplied name into real database name (or XML Query expression), since > > the parse is doing all the checks, I am not worry about the SQL > > injection > > > > You may ask why I am going through so much trouble, the reason it I am > > writing a framework that needs to be extendable, which means I design > > the basic infrastructure, other group could design application sitting > > on top of the framework. For example, they could design their own > > database, tables, as long as they following the general design, they > > could reuse everything I am doing. > > Now, I understand lesser and lesser. If you are parsing the input, you > should be able to break up in the input so you could try the union thing. > > Looks like a very good idea to do this in a CLR procedure, by the way. > > -- > Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se > > Books Online for SQL Server 2005 at > http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx > Books Online for SQL Server 2000 at > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx Hi There,
I think you can try something like this . Select * from tableA A Where Exists ( Select 1 from TableB B where B.coulmn1=A.coulmn1 Union All Select 1 from TableB B where B.coulmn2=A.coulmn2 Union All Select 1 from TableB B where B.coulmn3=A.coulmn3 and so on........ ) Using OR in where condition is causing it to FULL SCAN the table. also create index on columns used for comparing. With Warm regards Jatinder Singh Show quoteHide quote > Unfortunately I did not write the parser, it is a generic component, > passing me the following information > 1) Field > 2) Operation > 3) Value > I could replace all of the value above > but the parser does not passing me the logic operator, even if it > does, it is going to be very hard to for me to do a union because I > could get very complicated logical operation and I could end up query > the same table multiple time for different fields. Such as > (Lastname='johnson' or firstname='smith') and (city='Los Angeles') > or (productid in (234,23434,33) and (Lastname='Smith' or > firstname='tom')) > > also for "and" operation, I will need to do a intersection instead > of union. I do not believe SQLServer provide this function on > transaction sql. > > Erland Sommarskog wrote: > > John (john***@yahoo.com) writes: > > > Well, when I say I allow client to specify the where clause, that is a > > > just simple way to put it, actually I publish a list of client is > > > allowed to do search, which is different name than the database field > > > name, I have a table to map the field name the client is using to real > > > database field name ( or XML Query) > > > when I get the where clause I do the parsing and convert the client > > > supplied name into real database name (or XML Query expression), since > > > the parse is doing all the checks, I am not worry about the SQL > > > injection > > > > > > You may ask why I am going through so much trouble, the reason it I am > > > writing a framework that needs to be extendable, which means I design > > > the basic infrastructure, other group could design application sitting > > > on top of the framework. For example, they could design their own > > > database, tables, as long as they following the general design, they > > > could reuse everything I am doing. > > > > Now, I understand lesser and lesser. If you are parsing the input, you > > should be able to break up in the input so you could try the union thing. > > > > Looks like a very good idea to do this in a CLR procedure, by the way. > > > > -- > > Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se > > > > Books Online for SQL Server 2005 at > > http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx > > Books Online for SQL Server 2000 at > > http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx John (john***@yahoo.com) writes:
> Unfortunately I did not write the parser, it is a generic component, Sounds like it's time to reconsider the solution from bottom up. Or accept> passing me the following information > 1) Field > 2) Operation > 3) Value > I could replace all of the value above > but the parser does not passing me the logic operator, even if it > does, it is going to be very hard to for me to do a union because I > could get very complicated logical operation and I could end up query > the same table multiple time for different fields. Such as > (Lastname='johnson' or firstname='smith') and (city='Los Angeles') > or (productid in (234,23434,33) and (Lastname='Smith' or > firstname='tom')) that you will never get a solution that performs well. Not that I know that the UNION thing is going to run with lightning speed, but the less degree of freedom you have, the poorer are you odds. In any case, I feel that I cannot give much further advice, because there is too much I don't know. > also for "and" operation, I will need to do a intersection instead INTERSECT is indeed a new operator, akin to UNION that was added in SQL > of union. I do not believe SQLServer provide this function on > transaction sql. 2005. Although AND conditions are probably better implemented with regular AND operators. They are far less devilish than the OR. -- Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se Books Online for SQL Server 2005 at http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx Books Online for SQL Server 2000 at http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx I think what you want is the ability to load tables with criteria and
not have to use dynamic SQL: skill = Java AND (skill = Perl OR skill = PHP) becomes the disjunctive canonical form: (Java AND Perl) OR (Java AND PHP) which we load into this table: CREATE TABLE Query (and_grp INTEGER NOT NULL, skill CHAR(4) NOT NULL, PRIMARY KEY (and_grp, skill)); INSERT INTO Query VALUES (1, 'Java'); INSERT INTO Query VALUES (1, 'Perl'); INSERT INTO Query VALUES (2, 'Java'); INSERT INTO Query VALUES (2, 'PHP'); Assume we have a table of job candidates: CREATE TABLE Candidates (candidate_name CHAR(15) NOT NULL, skill CHAR(4) NOT NULL, PRIMARY KEY (candidate_name, skill)); INSERT INTO Candidates VALUES ('John', 'Java'); --winner INSERT INTO Candidates VALUES ('John', 'Perl'); INSERT INTO Candidates VALUES ('Mary', 'Java'); --winner INSERT INTO Candidates VALUES ('Mary', 'PHP'); INSERT INTO Candidates VALUES ('Larry', 'Perl'); --winner INSERT INTO Candidates VALUES ('Larry', 'PHP'); INSERT INTO Candidates VALUES ('Moe', 'Perl'); --winner INSERT INTO Candidates VALUES ('Moe', 'PHP'); INSERT INTO Candidates VALUES ('Moe', 'Java'); INSERT INTO Candidates VALUES ('Celko', 'Java'); -- loser INSERT INTO Candidates VALUES ('Celko', 'Algol'); INSERT INTO Candidates VALUES ('Smith', 'APL'); -- loser INSERT INTO Candidates VALUES ('Smith', 'Algol'); The query is simple now: SELECT DISTINCT C1.candidate_name FROM Candidates AS C1, Query AS Q1 WHERE C1.skill = Q1.skill GROUP BY Q1.and_grp, C1.candidate_name HAVING COUNT(C1.skill) = (SELECT COUNT(*) FROM Query AS Q2 WHERE Q1.and_grp = Q2.and_grp); You can retain the COUNT() information to rank candidates. For example Moe meets both qualifications, while other candidates meet only one of the two. You can Google "canonical disjunctive form" for more details. This is a form of relational division. If you have concurrency, then there might be a problem in reading so many
tables. One suggestion I would give is to have a seperate stored procedure for each of the table that is scanned and have an output param to return the ID. And have one master SP that will call each of the SPs in a particular order. Start with the SP where you will most likely find the result and go in a sequence till you get an ID from one of the SP. This way, not all tables will be read at the same time and you will have more control over the flow. Hope this helps. two problem here
1) since the client supply the where clause, which could be anything, such as (Lastname='johnson' or firstname='smith') and (city='Los Angeles') or productid in (234,23434,33) .... that means the master SP has to dynamically call the different sp depends on the client input. 2) also, if the slave sp returns more than one id, then the output param will not work. Omnibuzz wrote: Show quoteHide quote > If you have concurrency, then there might be a problem in reading so many > tables. > One suggestion I would give is to have a seperate stored procedure for each > of the table that is scanned and have an output param to return the ID. > And have one master SP that will call each of the SPs in a particular order. > Start with the SP where you will most likely find the result and go in a > sequence till you get an ID from one of the SP. > > This way, not all tables will be read at the same time and you will have > more control over the flow. > > Hope this helps.
Other interesting topics
Identity or GUID?
How to add separator blank rows by SQL Query? Advice Requested : Trying to write portable SQL Comparing dates in one field Obtain values from different tables How can I update the col value using extended stored procedure ALL IN ONE SQL STATEMENT? Dynamic View Insert by Parameter trigger will not execute |
|||||||||||||||||||||||