|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
DB Design issueI have four tables namely course,lesson,topic,page. 1.course ( 1,2,3) courseid PK 2.Lesson (lesson id would be 1,2,3 for evry course) courseid+lessonid PK 3.topic( topicid wud be 1,2,3 for evry lesson) courseid+lessonid+topicid PK 4.page (pageid wud 1,2,3.. for evry page courseid+lessonid+pageid+topicid PK I feel that this design is not the efficient one. i've redesigned it this way. 1. course -- courseid PK 2.Lesson -- lessonid Pk (i wud have to append courseid with lesson no to make it unique) courseid FK 3.Topic-- topicid PK ( i wud have to append lessonid with topicid to make it unique) lessonid FK ( courseid wont be required ) 4.Page pageid PK (i wud have to append topicid with pageno to make it unique). topicid FK (no courseid,lessonid columns added) My Question is which of these designs is better and efficient . If any of u could give me a better idea that would help me a lot. thank you for ur replies, Toby I think you are dealing with the difference of having data like so:
Course 1 Some Reallly Great Course Lesson 11 Some Really Great Course Lesson 1 12 Some Really Great Course Lesson 2 etc. versus Course 1 Some Reallly Great Course Lesson 1 1 Some Really Great Course Lesson 1 2 1 Some Really Great Course Lesson 2 If so, having separate PK/FK is better than concatenating. A few reasons why: 1. Can work with numerics rather than alphanumerics 2. Can easily join two tables on a numeric Now, this does not mean you cannot make a human readable concatenated field, either stored in the table ro created on output, but it does mean that you are not joining inefficiently. -- Show quoteGregory A. Beamer MVP; MCP: +I, SE, SD, DBA *********************************************** Think Outside the Box! *********************************************** "Toby" <T***@discussions.microsoft.com> wrote in message news:6ABC3779-CA55-4763-9B44-EA8134935EAA@microsoft.com... > Hello everyone, > > I have four tables namely course,lesson,topic,page. > 1.course ( 1,2,3) > courseid PK > 2.Lesson (lesson id would be 1,2,3 for evry course) > courseid+lessonid PK > 3.topic( topicid wud be 1,2,3 for evry lesson) > courseid+lessonid+topicid PK > 4.page (pageid wud 1,2,3.. for evry page > courseid+lessonid+pageid+topicid PK > > I feel that this design is not the efficient one. > > i've redesigned it this way. > > 1. course -- > courseid PK > > 2.Lesson -- > lessonid Pk (i wud have to append courseid with lesson no to make it > unique) > courseid FK > > 3.Topic-- > topicid PK ( i wud have to append lessonid with topicid to make it unique) > lessonid FK ( courseid wont be required ) > > 4.Page > pageid PK (i wud have to append topicid with pageno to make it unique). > topicid FK (no courseid,lessonid columns added) > > My Question is which of these designs is better and efficient . > > If any of u could give me a better idea that would help me a lot. > > thank you for ur replies, > Toby > > Thanks for the reply Gregory. But my design wouldnot be having alphanumeric
things. I'll just give some sample datas in both designs here. Existing design: Course : courseid coursename ------------------------------------------------------------------------------- 1 SQL 2 .Net Lesson : ( Lessonid+courseid = PK ) lessonid courseid lessonname ---------------------------------------------------- 1 1 intro 2 1 blabla 1 2 intro 2 2 .Net for you Topic : ( topicid+lessonid+courseid=PK) topicid lessonid courseid topicname ----------------------------------------------------- 1 1 1 topic1 2 1 1 topic2 3 1 1 topic3 Page : (pageid+topicid+lessonid+courseid=PK) pageid topicid lessonid courseid content ----------------------------------------------------------- 1 1 1 1 blsadfsdf........... 2 1 1 1 fgfghfgfjh....... My Design: Course : courseid coursename ------------------------------------------------------------------------------- 1 SQL 2 .Net Lesson : ( Lessonid= PK ) lessonid is concatenated with courseid. lessonid courseid lessonname ---------------------------------------------------- 11 1 intro 12 1 blabla 21 2 intro 22 2 .Net for you Topic : ( topicid=PK) topicid lessonid topicname ----------------------------------------------------- 111 1 topic1 112 1 topic2 113 1 topic3 121 2 topic4 ..... Page (pageid = pk) pageid topicid content --------------------------------- 1111 111 dfsad 1112 111 sdfdsf 1113 111 fsdfsfd 1121 112 sadfsdf 1122 112 vcbnbvn there would be a problem with this design tooo. i'd have to use string functions to get the correct lesson no , topic no , page no. in select query. Is it a big overhead to have a composite primary key of 4 columns.if so why? What would be an ideal design for this scenario........ Thanks for your replies, Toby Without full details of your business model, it is hard to conclude your
second design is the best possible option. However the drawbacks of the first design are obvious. There is nothing in your narrative that suggests the need for a loaded superkey ( A superkey is a composite key that has more columns than what is necessary to uniquely identify a row ). All apparent relationships seem to be one-to-many, which when represented using overloaded keys can introduce additional complexity in query expressions. The lack of referential constraints can be a critical data integrity issue. Even if foreign keys are enforced, updates and deletes to courses or lessons would have a longer cascade path rather than a single reference. There might be other business rules which can have an impact on the logical design. However, based only on the limited information in your post, your first design choice is not recommended. -- Anith Thanks for the reply Anith.
> There is nothing in your narrative that suggests the need for a loaded The design has a composite key bcos that is the only thing with which we can > superkey ( A superkey is a composite key that has more columns than what is > necessary to uniquely identify a row ). > identify the row. that is why they have proceeded with composite primary key. Moreover the client insists that we have a primary key like that. what if i add a surrogate key say an identity column to the tables and make it primary . wouldnt it overcome the composite key problem . but having said that i wouldnt be using it to identify the row? so what is the solution. >> The design has a composite key bcos that is the only thing with which we You seem to have misinterpreted what I wrote. Please read again, pay >> can identify the row. that is why they have proceeded with composite >> primary key. Moreover the client insists that we have a primary key like >> that. attention to what a super key is. -- Anith >A superkey is a composite key that has more columns than what is No the design doesnt have unnecessary columns in the primary key. data is >necessary to uniquely identify a row stored in such a way. only the combination of the columns mentioned gives me a unique value. if you had seen my second post(i have posted some sample data ) you would have understood abt that. clearly this is not the case of a superkey with unnecessary columns >> if you had seen my second post(i have posted some sample data ) you would I see what you are referring to. You are just comparing the difference of >> have understood abt that. performance overhead between using a composite key vs. a simple overloaded key. As usual it cannot be generalized based on the logical design alone since there are so many physical factors involved. In general, when all references are encoded within a simple key, there is inherent redundancy. You have a column as well as a part of a column, representing the same attribute. Another problem is with the changes in data. Every update/delete will have to be cascaded to all the four entity types. Having large composite keys have other issues as well. It may yield complex relational expressions and partially nullable references ( where part of composite foreign key can be null) can be problem as well. A reasonable approach might be using surrogate identifiers. Here is a simple schema assuming all relationships are 1-to-many: CREATE TABLE Courses( course_id INT NOT NULL PRIMARY KEY, course_name VARCHAR(40) NOT NULL ); CREATE TABLE Lessons ( lesson_id INT NOT NULL PRIMARY KEY, course_id INT NOT NULL REFERENCES Courses ( course_id ) ON UPDATE CASCADE ON DELETE CASCADE lesson_name VARCHAR(40) NOT NULL UNIQUE ( course_id, lesson_name ) ); CREATE TABLE Topics ( topic_id NOT NULL PRIMARY KEY, lesson_id NOT NULL REFERENCES Lessons ( lesson_id ) ON UPDATE CASCADE ON DELETE CASCADE topic_name VARCHAR(40) NOT NULL UNIQUE( lesson_id, topic_name ) ) ; CREATE TABLE Pages ( page_id INT NOT NULL PRIMARY KEY, topic_id NOT NULL REFERENCES Topics ( topic_id ) ON UPDATE CASCADE ON DELETE CASCADE, content TEXT NOT NULL ... ); If any of these relationships are many-to-many, say for instance if same topic/lesson is in multiple courses, you will have to introduce an "association" table so that it can resolved to two or more 1-to-many relationships. -- Anith Thank you Anith for taking time to explain and yes it is many to many. thats
why all the trouble. I think i'd have to settle for associate tables. Best Regards, Toby Show quote "Anith Sen" wrote: > >> if you had seen my second post(i have posted some sample data ) you would > >> have understood abt that. > > I see what you are referring to. You are just comparing the difference of > performance overhead between using a composite key vs. a simple overloaded > key. As usual it cannot be generalized based on the logical design alone > since there are so many physical factors involved. > > In general, when all references are encoded within a simple key, there is > inherent redundancy. You have a column as well as a part of a column, > representing the same attribute. Another problem is with the changes in > data. Every update/delete will have to be cascaded to all the four entity > types. Having large composite keys have other issues as well. It may yield > complex relational expressions and partially nullable references ( where > part of composite foreign key can be null) can be problem as well. > > A reasonable approach might be using surrogate identifiers. Here is a simple > schema assuming all relationships are 1-to-many: > > CREATE TABLE Courses( > course_id INT NOT NULL PRIMARY KEY, > course_name VARCHAR(40) NOT NULL ); > CREATE TABLE Lessons ( > lesson_id INT NOT NULL PRIMARY KEY, > course_id INT NOT NULL > REFERENCES Courses ( course_id ) > ON UPDATE CASCADE > ON DELETE CASCADE > lesson_name VARCHAR(40) NOT NULL > UNIQUE ( course_id, lesson_name ) ); > CREATE TABLE Topics ( > topic_id NOT NULL PRIMARY KEY, > lesson_id NOT NULL > REFERENCES Lessons ( lesson_id ) > ON UPDATE CASCADE > ON DELETE CASCADE > topic_name VARCHAR(40) NOT NULL > UNIQUE( lesson_id, topic_name ) ) ; > CREATE TABLE Pages ( > page_id INT NOT NULL PRIMARY KEY, > topic_id NOT NULL > REFERENCES Topics ( topic_id ) > ON UPDATE CASCADE > ON DELETE CASCADE, > content TEXT NOT NULL > ... ); > > If any of these relationships are many-to-many, say for instance if same > topic/lesson is in multiple courses, you will have to introduce an > "association" table so that it can resolved to two or more 1-to-many > relationships. > > -- > Anith > > > |
|||||||||||||||||||||||