|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Implementing natural keyswould play around with them a little. A database that I am working on needs to have the ability to repopulate some of its tables from a flat file. By doing this, however, the foreign keys that exist in other tables would all be invalidated. I need a way to get around this problem. I have a table (CITTAS_Primary) that has a foreign key to a table (CITTAS_OriginCityState) that contains three columns: identity/PK, City, and State. I decided that the best way to put a natural key on this table would be to use a combination of City and State. This was easy enough to change. However, when I went to establish the foreign key in CITTAS_Primary, I wasn't able to link my natural key to the existing int field. What data type would the field have to be to reference CITTAS_OriginCityState? If I had to create a city and state field, it would almost be pointless to have the extra tables, wouldn't it? Thanks in advance. -- Chris Lieb UPS CACH, Hodgekins, IL Tech Support Group - Systems/Apps Either you would have to make an amalgamated key (SOMEPLACECA for Someplace, CA)
or use both columns as the PK and link to both columns for all child relationships. (Isn't that fun? ;->) Of course, if you want to associate a zip code with that city and state you will run into problems. There are numerous situations where multiple zip codes are in the same city. Thomas Show quoteHide quote "Chris Lieb" <ChrisL***@discussions.microsoft.com> wrote in message news:D138B4E3-7359-4844-B73D-28F69403085F@microsoft.com... > After reading a thread about the 'merits' of natural keys, I decided that I > would play around with them a little. A database that I am working on needs > to have the ability to repopulate some of its tables from a flat file. By > doing this, however, the foreign keys that exist in other tables would all be > invalidated. I need a way to get around this problem. > > I have a table (CITTAS_Primary) that has a foreign key to a table > (CITTAS_OriginCityState) that contains three columns: identity/PK, City, and > State. I decided that the best way to put a natural key on this table would > be to use a combination of City and State. This was easy enough to change. > However, when I went to establish the foreign key in CITTAS_Primary, I wasn't > able to link my natural key to the existing int field. What data type would > the field have to be to reference CITTAS_OriginCityState? If I had to create > a city and state field, it would almost be pointless to have the extra > tables, wouldn't it? > > Thanks in advance. > > -- > Chris Lieb > UPS CACH, Hodgekins, IL > Tech Support Group - Systems/Apps I don't have to worry about zip codes since the cities are just locations of
our hubs. This is part of a tracking system that we use to track flat cars as they travel between hubs over rail. Chris Show quoteHide quote "Thomas Coleman" wrote: > Either you would have to make an amalgamated key (SOMEPLACECA for Someplace, CA) > or use both columns as the PK and link to both columns for all child > relationships. (Isn't that fun? ;->) Of course, if you want to associate a zip > code with that city and state you will run into problems. There are numerous > situations where multiple zip codes are in the same city. > > > Thomas > > > "Chris Lieb" <ChrisL***@discussions.microsoft.com> wrote in message > news:D138B4E3-7359-4844-B73D-28F69403085F@microsoft.com... > > After reading a thread about the 'merits' of natural keys, I decided that I > > would play around with them a little. A database that I am working on needs > > to have the ability to repopulate some of its tables from a flat file. By > > doing this, however, the foreign keys that exist in other tables would all be > > invalidated. I need a way to get around this problem. > > > > I have a table (CITTAS_Primary) that has a foreign key to a table > > (CITTAS_OriginCityState) that contains three columns: identity/PK, City, and > > State. I decided that the best way to put a natural key on this table would > > be to use a combination of City and State. This was easy enough to change. > > However, when I went to establish the foreign key in CITTAS_Primary, I wasn't > > able to link my natural key to the existing int field. What data type would > > the field have to be to reference CITTAS_OriginCityState? If I had to create > > a city and state field, it would almost be pointless to have the extra > > tables, wouldn't it? > > > > Thanks in advance. > > > > -- > > Chris Lieb > > UPS CACH, Hodgekins, IL > > Tech Support Group - Systems/Apps > > > >> I decided that the best way to put a natural key on this table would be to use a combination of City and State. << No problem; a key can be mutli-column:CREATE TABLE CITTAS_OriginCityState (city_name CHAR(20) NOT NULL, state_code CHAR(2) NOT NULL, PRIMARY KEY (city_name, state_code), ...); CREATE TABLE CITTAS_Primary ( .. city_name CHAR(20) NOT NULL, state_code CHAR(2) NOT NULL, FOREIGN KEY (city_name, state_code) REFERENCES CITTAS_Primary (city_name, state_code) ON UPDATE CASCADE ON DELETE CASCADE, ..); >> If I had to create a city and state field [sic], it would almost be pointless to have the extra tables, wouldn't it? <<Maybe, but we don't know what your data model looks like. What makes "CITTAS_OriginCityState" logically different entities from "CITTAS_Primary"? Or is this a matter of something called a "CITTAS" being in different states or status? For example, you would not have a "MalePersonnel" and a "FemalePersonnel" table; you would a "gender" column instead. Personnel are the same kind of entities. You would not have a "Personnel_or_Automobile" table. People and automobiles are clearly different. CITTAS is just the name of the system being deployed, to keep the tables from
getting mixed up with the other tables in the DB. The system is used to track a trailer from its origin to CACH to its destination. The CITTAS_OriginCityState table is a table of the hubs that we receive trailers from, and, for example, CITTAS_DestinationCityState is a list of hubs that we send trailers to. The CITTAS_Primary table links all of these pieces together under a unique string. The downloads that we get are matched against this key string to get the data about each trailer. For instance, the download says that a trailer Q29 passed through hub X yesterday and was delayed due to mechanical problems. When we get the download, we match Q29s unique key against CITTAS_Primary to retrieve the origin and destination of the trailer among other things. Then, we can inform other hubs down the line that the trailer will not arrive or change sort schedules so we don't have people standing around wailting for a trailer that won't be arriving today. Hope this helps Chris Show quoteHide quote "--CELKO--" wrote: > >> I decided that the best way to put a natural key on this table would be to use a combination of City and State. << > > No problem; a key can be mutli-column: > > CREATE TABLE CITTAS_OriginCityState > (city_name CHAR(20) NOT NULL, > state_code CHAR(2) NOT NULL, > PRIMARY KEY (city_name, state_code), > ...); > > CREATE TABLE CITTAS_Primary > ( .. > city_name CHAR(20) NOT NULL, > state_code CHAR(2) NOT NULL, > FOREIGN KEY (city_name, state_code) > REFERENCES CITTAS_Primary (city_name, state_code) > ON UPDATE CASCADE > ON DELETE CASCADE, > ..); > > >> If I had to create a city and state field [sic], it would almost be > pointless to have the extra tables, wouldn't it? << > > Maybe, but we don't know what your data model looks like. What makes > "CITTAS_OriginCityState" logically different entities from > "CITTAS_Primary"? > Or is this a matter of something called a "CITTAS" being in different > states or status? > > For example, you would not have a "MalePersonnel" and a > "FemalePersonnel" table; you would a "gender" column instead. > Personnel are the same kind of entities. > > You would not have a "Personnel_or_Automobile" table. People and > automobiles are clearly different. > > An idea just came to mind. To prevent having the actual values in
CITTAS_Primary and having to add extra fields to the child tables to handle the multiple keys, could I generate a hash of the data (City + State) and place it in the 'ID' field. By doing this, The number of fields in each table stays the same and, at the same time, I still have a reason to have the parent tables. Since the data in this table should not be altered other that to be repopulated, Users would not have to worry about entering the hashes themselves and programmers could treat the 'ID' fields normally instead of having to do multiple JOINs to relate the data in queries. Is there a way to do this in TSQL? It would not have to be an industrial-strength hash like MD5 or SHA1, just maybe something like CRC. (Ok, maybe CRC isn't actually a hash, but it would do the job, I do believe.) Unfortunately, I do not have permissions to use extended stored procedures, so that is not a viable solution. I figure that a UDF or SP might be the path that I have to go, but I am sure that there are probably hashing functions out there. Chris Show quoteHide quote "Chris Lieb" wrote: > After reading a thread about the 'merits' of natural keys, I decided that I > would play around with them a little. A database that I am working on needs > to have the ability to repopulate some of its tables from a flat file. By > doing this, however, the foreign keys that exist in other tables would all be > invalidated. I need a way to get around this problem. > > I have a table (CITTAS_Primary) that has a foreign key to a table > (CITTAS_OriginCityState) that contains three columns: identity/PK, City, and > State. I decided that the best way to put a natural key on this table would > be to use a combination of City and State. This was easy enough to change. > However, when I went to establish the foreign key in CITTAS_Primary, I wasn't > able to link my natural key to the existing int field. What data type would > the field have to be to reference CITTAS_OriginCityState? If I had to create > a city and state field, it would almost be pointless to have the extra > tables, wouldn't it? > > Thanks in advance. > > -- > Chris Lieb > UPS CACH, Hodgekins, IL > Tech Support Group - Systems/Apps > An idea just came to mind. To prevent having the actual values in It's hard to tell ... but I believe you may have actually thought of a > CITTAS_Primary and having to add extra fields to the child tables to handle > the multiple keys, could I generate a hash of the data (City + State) and > place it in the 'ID' field. [...] *worse* solution than going with an IDENTITY. You need to seriously get a grasp of relational theory before touching RDBMS. Honestly. Take the opportunity this weekend to go to the library and get a book on the topic. > By doing this, The number of fields in each Another advantage to reading up on relational theory is that you will not > table stays the same and, at the same time, I still have a reason to have the > parent tables. sound like at total newbie. In relational databases, there are no fields (only columns) or parent/child tables (only referenced/referencing). It may seem like stupid semantics now, but it's about as bad as saying there's no difference between a "class" and an "object" to an OO guy. > Since the data in this table should not be altered other that Ok -- there's not even a point to explain why this is such a bad idea. Views > to be repopulated, Users would not have to worry about entering the hashes > themselves and programmers could treat the 'ID' fields normally instead of > having to do multiple JOINs to relate the data in queries. (virtual tables) are used to ease complex multi-join queries. Think of them as a "layer of abstraction". > Unfortunately, I do not have permissions to use extended stored procedures I couldn't even immagine the mess you would create if you had such access. There is some hope. First off, I recall reading somewhere about the seven rules of relational
databases, or something like that, that were set out by one of the pioneers in the field. One of the rules was that ALL primary keys should be abstracted from the data in their rows. (Sorry if the language is wrong, but you already managed to point out the fact that I am a real n00b to relational databases.) I kind of have to break that rule so that when I reconstruct the CITTAS_Primary table, all of the tables that reference it will not be invalidated, which is what would happen if I used abstract keys. At the same time, I will not be seeing this app through to the front-end development (or even the completion of the back-end, for that matter), so I don't want to complicate all of the relations and force the new programmer to spend weeks trying to understand it. Using abstract IDs is pretty common here and I don't want my IDs to act differently. Also, I don't see why hashes would be worse than abstract keys. They are basically combining the data that I would have built into multiple keys into a single, easy to use key. Chris Aside: I really take no offence to someone using object and class interchengably. After all, in most OO languages that I have encountered, all classes inherit from Object, so any class that is created is a child of Object. Background: This is my first DB project outside of Access, and it happens to be quite a large and important one. This project has already had two seperate programmers work on it, and neither of them are around anymore. I am basically learning the whole relational model as I go. I actually am employed as a co-op and am currently a sophomore in college. I am not trying to excuse my ignorance, just explain my complete lack of experience. After all, how many teenagers grow up with a full blown DB server in their house for them to play with? Not I. I occasionaly mess around with MySQL over the weekends, but not too often since I have to run it on my only computer, which really slows down under the load. Even then, about all that I can get from it are DB structure and basic SQL syntax. Chris,
> First off, I recall reading somewhere about the seven rules of relational I'm going guess that you were refering to the Access Rule (#2) of Dr. Codd's > databases, or something like that, that were set out by one of the pioneers > in the field. One of the rules was that ALL primary keys should be > abstracted from the data in their rows. 12 Rules of Relational Database Management Systems. And you got some of the words right, but have a totally wrong impression of it. A more accurate characterization would be that primary keys *are* made up of columns within the row -- not abstracted from them. A ID/GUID/Hash/etc are logically the same as a physical locator (pointer) and violate this rule. > I kind of have to break that rule so that when I reconstruct the Well, you're not in a good position to change things. Considering your > CITTAS_Primary table, all of the tables that reference it will not be > invalidated, which is what would happen if I used abstract keys. > [...] > Using abstract IDs is pretty common here and I > don't want my IDs to act differently. circumstance, I think you'd be best just going with what they have. > Also, I don't see why hashes would be Their both bad at the logical level (and are pretty much the same). No one > worse than abstract keys. They are basically combining the data that I would > have built into multiple keys into a single, easy to use key. will understand the hashes; they will understand IDENTITYs. If I had PRINCIPLES OF PROGRAM DESIGN (1975) handy, I'd pull out a quote talking about a programmers' tendancy to want to write unecessarily complex things to satisfy a professional challenge. Needless complexity is also a sign of a newbiew/bad programmer. Remember that and you will do a lot better as you advance in your career. > Aside: I really take no offence to someone using object and class They are fundamentally different concepts. One is an instance and the other > interchengably. is a definition. Nothing to be offended by, but it's still wrong to use them interchangably. Same goes with fields vs. columns. > I am basically learning the whole relational model as I go. I actually am That is a good excuse for ignorance; you're in the process of learning. Most > employed as a co-op and am currently a sophomore in college. I am not trying > to excuse my ignorance, just explain my complete lack of experience. people who post to newsgroups are seeking hacks for their horrendous deisgn. No less, you still should take the time to read a book on the topic. If nothign else, it'll help you with your "A" in DATABASE 101. > I occasionaly mess around with MySQL over the weekends Note that comparing MySQL to any other real database is like comparing a lawnmower engine with a car engine. Pick up a copy of SQL Server developer edition. It's a full-blown verison that retails for $50; you can grab it on fleabay for like $5. I think i've even got 4 or five copies. Your school should have it for like $10. > After reading a thread about the 'merits' of natural keys, I decided that I Welcome to the world of relational data.> would play around with them a little. > I have a table (CITTAS_Primary) that has a foreign key to a table You're halfway there: a table describes a set and should be plural, not > (CITTAS_OriginCityState) that contains three columns: identity/PK, City, and > State. I decided that the best way to put a natural key on this table would > be to use a combination of City and State. singular. > What data type would A FK will need to be the same as the referencing PK. So, you would need to > the field have to be to reference CITTAS_OriginCityState? have a City + State column. > If I had to create This is one of the things oponents of the relational model said back in the > a city and state field, it would almost be pointless to have the extra > tables, wouldn't it? 70's. As it turns out, it works much better than having a pointer (artificial INT pk). To undestand why, you need to read a book on relational theory. I suggest Date's INTRODUCTION TO DATABASE SYSTEMS - you can get an old edition for a few bucks. Celko's DATA & DATABASES would be my second choice. One thing you seem to be misunderstanding is that although every table
should have a Natural key, that key doesn't have to be used as the foreign key in other tables. Often you will want to create a surrogate key as well to be used as a reference for foreign keys. No reason why you can't use a code as an intelligent surrogate key for each city/state. There are plenty of schemes that exist for these things. In the freight business there are surely directories of codes for these locations already in use? If you don't find a suitable scheme then invent one of your own. -- David Portas SQL Server MVP -- The way that I see it is that I need to use natural keys to keep from
invalidating other tables in the DB when I repopulate this group of tables. After, there is no guarantee otherwise that Atlanta, GA will always have an ID of 4. After I repopulate the tables, Atlanta, GA might even drop off of the list. As for the intelligent surrogate, each hub does have a four digit ID number called a SLIC. Unfortunately, not all of the entries are consistent. For example, in the entries key, it might have the SLIC for Cleveland, OH as the origin, but Pittsburgh, PA will be in the Origin field. Most of the records don't have this problem, but the number is signifigant enough that it does create a problem. If it wasn't for that, this problem would be a lot easier to solve. Chris Show quoteHide quote "David Portas" wrote: > One thing you seem to be misunderstanding is that although every table > should have a Natural key, that key doesn't have to be used as the foreign > key in other tables. Often you will want to create a surrogate key as well > to be used as a reference for foreign keys. > > No reason why you can't use a code as an intelligent surrogate key for each > city/state. There are plenty of schemes that exist for these things. In the > freight business there are surely directories of codes for these locations > already in use? If you don't find a suitable scheme then invent one of your > own. > > -- > David Portas > SQL Server MVP > -- > > > It sounds to me like your problems have nothing at all to
do with the fact that you are using artificial keys. Your problem is that you aren't keeping track of what they mean anywhere, and you are storing redundant information in your tables. I don't know what "repopulate" means, but if Atlanta, GA has an (artificial) ID of 4, then you should have a table keyed on the ID where Atlanta, GA and 4 are in the same row. You should never have to "repopulate" this table. As for your row that contains the SLIC for Cleveland in one column and the name Pittsburgh in another, why are you storing redundant origin information? Store the SLIC, and when you need the name, get it from the table of SLICs. Sure, the SLIC might be wrong, if someone entered Pittsburgh incorrectly, when Cleveland was the right value, but your problem is redundancy. No fact (like "the Origin is Cleveland, OH") should be stored more than once in your database. You can store it in many ways, but pick just one, to avoid the trouble you are in now. Steve Kass Drew University Chris Lieb wrote: Show quoteHide quote >The way that I see it is that I need to use natural keys to keep from >invalidating other tables in the DB when I repopulate this group of tables. >After, there is no guarantee otherwise that Atlanta, GA will always have an >ID of 4. After I repopulate the tables, Atlanta, GA might even drop off of >the list. > >As for the intelligent surrogate, each hub does have a four digit ID number >called a SLIC. Unfortunately, not all of the entries are consistent. For >example, in the entries key, it might have the SLIC for Cleveland, OH as the >origin, but Pittsburgh, PA will be in the Origin field. Most of the records >don't have this problem, but the number is signifigant enough that it does >create a problem. If it wasn't for that, this problem would be a lot easier >to solve. > >Chris > >"David Portas" wrote: > > > >>One thing you seem to be misunderstanding is that although every table >>should have a Natural key, that key doesn't have to be used as the foreign >>key in other tables. Often you will want to create a surrogate key as well >>to be used as a reference for foreign keys. >> >>No reason why you can't use a code as an intelligent surrogate key for each >>city/state. There are plenty of schemes that exist for these things. In the >>freight business there are surely directories of codes for these locations >>already in use? If you don't find a suitable scheme then invent one of your >>own. >> >>-- >>David Portas >>SQL Server MVP >>-- >> >> >> >> >>
Other interesting topics
Migrating from RMS III
Multiple FK or "TableName" column? Mode and Median functions in SQL 2000 Copy Database Last Function in SQL Server ? Monitoring of Replication using dev tools SQL Query with Current date + 30 days ORDER BY before UNION syntax error Simple Question (I hope) Indexes on SQL Server |
|||||||||||||||||||||||