Home All Groups Group Topic Archive Search About

Help with efficiently storing a variety nested queries

Author
23 Aug 2006 4:43 PM
J055
Hi

Please bear with me on this. Is there a way of storing information
in a table which can then be used to generate the kind of results below?

I need to create a 'sequential' query whereby each new condition is applied
to the results of the previous condition(s). For example (using attached
dataset) please see some examples below.  It would be nice to avoid using
temp tables if possible.


CREATE TABLE t (
AccountID int NOT NULL ,
ParamID char (1) NOT NULL ,
CONSTRAINT PK_t PRIMARY KEY CLUSTERED
(  AccountID,  ParamID ))
GO

INSERT INTO t VALUES(1, 'A')
INSERT INTO t VALUES(1, 'B')
INSERT INTO t VALUES(1, 'C')
INSERT INTO t VALUES(1, 'D')
INSERT INTO t VALUES(1, 'E')
INSERT INTO t VALUES(2, 'A')
INSERT INTO t VALUES(2, 'C')
INSERT INTO t VALUES(2, 'E')
INSERT INTO t VALUES(3, 'B')
INSERT INTO t VALUES(3, 'D')
INSERT INTO t VALUES(3, 'F')
INSERT INTO t VALUES(4, 'A')
INSERT INTO t VALUES(5, 'D')
INSERT INTO t VALUES(5, 'E')

/*
A
AND
B
OR
C
AND NOT
D
OR
E

Results:

AccountID
---------
1
2

....returns only accountIDs 1 and 2. Obviously the order is important, as the
final 'OR E' adds account 1 back into the result set. Another example...

B
OR
C
AND NOT
A
OR
D

Results:

AccountID
---------
1
3
5

*/

DROP TABLE t

/*
Cheers
Andrew
*/

Author
24 Aug 2006 9:51 AM
Charles Wang[MSFT]
Hi,
My understanding this issue is:
AND is for intersection; OR is for combination; NOT is for other accountid
values that don't match the current factor. Every appending condition is
applied to the results of the previous query.
If I have misunderstood, please let me know.

To let me better understand this issue, I would like to confirm the result
with you about your first example:
According to my understanding I think the result of your first example is
1,2,5.

Also, I'm not sure what your real purpose is. I would like to know:
1.    Do you want to realize a query statement which can let you query by
applying conditions freely?
2.    Is your table structure fixed? Or I need to redesign a table to query?

Sincerely yours
Charles Wang
Microsoft Online Community Support

======================================================
When responding to posts, please "Reply to Group" via
your newsreader so that others may learn and benefit
from this issue.
======================================================
This posting is provided "AS IS" with no warranties, and confers no rights.
======================================================
Author
24 Aug 2006 1:17 PM
J055
Hi Charles

Thanks for your reply.

> My understanding this issue is:
> AND is for intersection; OR is for combination; NOT is for other accountid
> values that don't match the current factor. Every appending condition is
> applied to the results of the previous query.
> If I have misunderstood, please let me know.

Yes, that's exactly right.

> To let me better understand this issue, I would like to confirm the result
> with you about your first example:
> According to my understanding I think the result of your first example is
> 1,2,5.

Yes, it should be (1,2,5), sorry.

> Also, I'm not sure what your real purpose is. I would like to know:
> 1. Do you want to realize a query statement which can let you query by
> applying conditions freely?

Yes. The query will be built through a user interface and then stored for
retrieval, either to edit or to pull AccountID values which match the query
(filter) criteria. The idea is that a fairly non-technical user can build a
fairly powerful query on the data. There may be lots of different queries to
filter AccountIDs by the ParamIDs. ParamIDs are use to describe an Account
so some accounts may share the same ParamIDs and other Accounts may have a
unique ParamID to them or any variation of this. The end application must be
able to read the filtered data very frequently and fast whereas queries
(filters) are written infrequently.

> 2. Is your table structure fixed? Or I need to redesign a table to query?

We have a solution at present where we can store a query as follows:

FilterID ParamID OrGroup Exclude
-------- ------- ------- -------
1        B       1       0
1        D       2       0
1        C       1       0
1        E       2       0
1        A       2       1

It allows us to represent queries in a table to do things like:

(B or C or D) and (A) not (F)

Please see below. I've tried to streamline the code as much as possible.
What we would like to do is use the same concept but store enough
information in the table to, as you say, "Every appending condition is
applied to the results of the previous query.". So in the table rows above
we would get the results of the first to rows and then append the nest row
condition and so on, in the order that the rows are in. I wonder if we can
do this by adding a column(s) to the 'f' (Filter) table, a kind of summary
or order column?

---------------------------------------------------------
CREATE TABLE t (
AccountID int NOT NULL ,
ParamID char (1) NOT NULL ,
CONSTRAINT PK_t PRIMARY KEY CLUSTERED
(  AccountID,  ParamID ))
GO

INSERT INTO t VALUES(1, 'A')
INSERT INTO t VALUES(1, 'B')
INSERT INTO t VALUES(1, 'C')
INSERT INTO t VALUES(1, 'D')
INSERT INTO t VALUES(1, 'E')
INSERT INTO t VALUES(2, 'A')
INSERT INTO t VALUES(2, 'C')
INSERT INTO t VALUES(2, 'E')
INSERT INTO t VALUES(3, 'B')
INSERT INTO t VALUES(3, 'D')
INSERT INTO t VALUES(3, 'F')
INSERT INTO t VALUES(4, 'A')
INSERT INTO t VALUES(5, 'D')
INSERT INTO t VALUES(5, 'E')


CREATE TABLE f (
FilterID int NOT NULL ,
ParamID char(1) NOT NULL ,
OrGroup tinyint NOT NULL ,
Exclude bit NOT NULL ,
CONSTRAINT PK_f PRIMARY KEY  CLUSTERED
(FilterID,ParamID,OrGroup))
GO

INSERT INTO f VALUES(1, 'B', 1, 0)
INSERT INTO f VALUES(1, 'C', 1, 0)
INSERT INTO f VALUES(1, 'D', 1, 0)
INSERT INTO f VALUES(1, 'A', 2, 0)
INSERT INTO f VALUES(1, 'F', 2, 1)

GO
-- Remove excluded accounts
CREATE VIEW dbo.v
AS
SELECT     t.AccountID, f.FilterID, f.OrGroup
FROM         dbo.t INNER JOIN
                      dbo.f ON t.ParamID = f.ParamID
WHERE     (f.Exclude = 0) AND (t.AccountID NOT IN
                          (SELECT     t2.AccountID
                            FROM          dbo.t t2 INNER JOIN
                                                   dbo.f f2 ON t2.ParamID =
f2.ParamID
                            WHERE      (f.FilterID = f2.FilterID) AND
(f2.Exclude = 1)))
GO

-- Sort out ANDs and ORs
SELECT     FilterID, AccountID
FROM         (SELECT     AccountID, COUNT(DISTINCT OrGroup) Num, FilterID
                       FROM          dbo.v
                       GROUP BY AccountID, FilterID) d
WHERE     ((SELECT     COUNT(DISTINCT OrGroup)
                         FROM         dbo.v
                         WHERE     FilterID = d.FilterID) = Num)
DROP VIEW v
DROP TABLE t
DROP TABLE f
------------------------------------------------------------------

Hope this helps. Please let me know if you need more information.
Thanks
Andrew
Author
27 Aug 2006 5:33 PM
Charles Wang[MSFT]
Hi Andrew,
Thanks very much for your detailed comments. Basically I understand your
meaning. Unfortunately there are still two points I'm not very clear and
need your confirmation:
1. I'm not sure how  you use ORGROUP to express the AND conception. From
your description, I think it's   most likely that the value 2 means "AND"
and meanwhile if the Exclude value is 1, it will lost the "AND" meaning.
2. What's the usage of the view and the last query?   I think the sequence
of query conditions are also important to your needs and it seems that the
last query is lack of order. If you want to sort out ANDS and ORs, it's
recommended that you add a sort number column which represents multiple
groups sequencely with different contexts such as OR group 1 , AND group 2,
NOT group3, AND group 4, etc.

Here is the filter table that I think it should be like:
FilterID    ParamID     OrGroup    AndGroup    NotGroup    SortNumber
1                B                    1                  0                 
      0                   1
1                C                    1                  0                 
      0                   1
1                D                    1                  0                 
      0                   1
1                A                    0                  1                 
      0                    2
1                F                    0                   0                
       1                   3

The format of this table is not too difficult to write a dynamic SQL query
script in stored procedure.

If you have any other questions or concerns, please feel free to let me
know. It's my pleasure for your assistance.

Charles Wang
Microsoft Online Community Support
Author
29 Aug 2006 4:11 PM
J055
Hi Charles

Thanks for your help so far.

1. I'm not sure how  you use ORGROUP to express the AND conception. From
> your description, I think it's   most likely that the value 2 means "AND"
> and meanwhile if the Exclude value is 1, it will lost the "AND" meaning.

The OrGroup is used to group all of the ORs together and then AND them
together, e.g.

ParamID OrGroup
------- -------
A       1
B       1
C       1
D       2
E       2
F       2
G       3
H       3
I       3

Is the same as

WHERE (ParamID IN (A,B,C)) AND (ParamID IN (D,E,F)) AND (ParamID IN (G,H,I))

The Exclude excludes all ParamIDs regardless of OrGroup values.

> 2. What's the usage of the view and the last query?   I think the sequence

The view is an intermediate query which removes all accounts where the
ParamID is excluded in the 'f' table. Once all the AccountIDs are removed I
do a query on the view to get AccountIDs which meet the ParamID and OrGroup
values.

> of query conditions are also important to your needs and it seems that the
> last query is lack of order. If you want to sort out ANDS and ORs, it's
> recommended that you add a sort number column which represents multiple
> groups sequencely with different contexts such as OR group 1 , AND group
> 2,
> NOT group3, AND group 4, etc.

I think this is getting closer to what I'm trying to achieve but I'm not
sure how to present the data in a query. I'd really like to avoid Dynamic
SQL if possible.

Thanks again
Andrew
Author
31 Aug 2006 4:46 PM
Charles Wang[MSFT]
Hi Andrew,
This is a quick note to let you know that I'm performing research on this
issue and may need more time. I'll let you know the result within two
business days.
If it's convenient for you, I would like to know your email address so that
I can timely reply you.
If you don't want to publish your private email address on newsgroup, you
can just mail me a response (chang***@microsoft.com).

Sincerely,
Charles Wang
Microsoft Online Community Support
Author
27 Aug 2006 5:45 PM
--CELKO--
>> Is there a way of storing information in a table which can then be used to generate the kind of results below? <<

Maybe this will help.  I think what you want is the ability to load
tables with criteria and not have to use dynamic SQL:

skill = Java AND (skill = Perl OR skill = PHP)

becomes the disjunctive canonical form:

(Java AND Perl) OR (Java AND PHP)

which we load into this table:

CREATE TABLE Query
(and_grp INTEGER NOT NULL,
skill CHAR(4) NOT NULL,
PRIMARY KEY (and_grp, skill));

INSERT INTO Query VALUES (1, 'Java');
INSERT INTO Query VALUES (1, 'Perl');
INSERT INTO Query VALUES (2, 'Java');
INSERT INTO Query VALUES (2, 'PHP');

Assume we have a table of job candidates:

CREATE TABLE Candidates
(candidate_name CHAR(15) NOT NULL,
skill CHAR(4) NOT NULL,
PRIMARY KEY (candidate_name, skill));

INSERT INTO Candidates VALUES ('John', 'Java'); --winner
INSERT INTO Candidates VALUES ('John', 'Perl');
INSERT INTO Candidates VALUES ('Mary', 'Java'); --winner
INSERT INTO Candidates VALUES ('Mary', 'PHP');
INSERT INTO Candidates VALUES ('Larry', 'Perl'); --winner
INSERT INTO Candidates VALUES ('Larry', 'PHP');
INSERT INTO Candidates VALUES ('Moe', 'Perl'); --winner
INSERT INTO Candidates VALUES ('Moe', 'PHP');
INSERT INTO Candidates VALUES ('Moe', 'Java');
INSERT INTO Candidates VALUES ('Celko', 'Java'); -- loser
INSERT INTO Candidates VALUES ('Celko', 'Algol');
INSERT INTO Candidates VALUES ('Smith', 'APL');  -- loser
INSERT INTO Candidates VALUES ('Smith', 'Algol');

The query is simple now:

SELECT DISTINCT C1.candidate_name
  FROM Candidates AS C1, Query AS Q1
WHERE C1.skill = Q1.skill
GROUP BY Q1.and_grp, C1.candidate_name
HAVING COUNT(C1.skill)
       = (SELECT COUNT(*)
            FROM Query AS Q2
           WHERE Q1.and_grp = Q2.and_grp);

You can retain the COUNT() information to rank candidates.  For example
Moe meets both qualifications, while other candidates meet only one of
the two. You can Google "canonical disjunctive form" for more details.
This is a form of relational division.

AddThis Social Bookmark Button