|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Fastest way to translate data for normalization in MS SQLI am importing data from a 3rd party that contains categories and subcategories. The data looks like this (the :: are just for visual purposes, representing the different columns in my table): categoryPathID :: categoryLevel :: categoryName :: categoryPath 1 :: Cat1 :: California :: California 1 :: Cat2 :: San Diego :: California / San Diego 1 :: Cat3 :: Downtown :: California / San Diego / Downtown 1 :: Cat4 :: Gaslamp :: California / San Diego / Downtown / Gaslamp 2 :: Cat1 :: California :: California 2 :: Cat2 :: Inland Empire :: California / Inland Empire 2 :: Cat3 :: San Bernardino :: California / Inland Empire / San Bernardino 2 :: Cat4 :: University Heights :: California / Inland Empire / San Bernardino / University Heights I want to transform this data into the following format: categoryID :: categoryName :: parentCategoryID 1 :: California :: NULL 2 :: San Diego :: 1 3 :: Downtown :: 2 4 :: Gaslamp :: 3 5 :: Inland Empire :: 1 6 :: San Bernardino :: 5 7 :: University Heights :: 6 The solution I have in place now works by looping over all the categoryPathIDs, a subloop over each category level with that pathID, then inserting the data in the new table. It works, but because I have 70,000 records in the original data, it can take up to 45 minutes to run. Is there a more elegant way to translate this data that might speed up the process a bit? How are you importing/translating it? Are you using SSIS, or are just
dumping it into a table and then running SQL against it? I don't have a lot of experience with SSIS, but in DTS you could do this sort of thing using an inline VBScript transformatin, and it would be relatively quick. Stu sdwebguy wrote: Show quote > I am running SQL 2005. > > I am importing data from a 3rd party that contains categories and > subcategories. The data looks like this (the :: are just for visual > purposes, representing the different columns in my table): > > categoryPathID :: categoryLevel :: categoryName :: categoryPath > 1 :: Cat1 :: California :: California > 1 :: Cat2 :: San Diego :: California / San Diego > 1 :: Cat3 :: Downtown :: California / San Diego / Downtown > 1 :: Cat4 :: Gaslamp :: California / San Diego / Downtown / Gaslamp > 2 :: Cat1 :: California :: California > 2 :: Cat2 :: Inland Empire :: California / Inland Empire > 2 :: Cat3 :: San Bernardino :: California / Inland Empire / San > Bernardino > 2 :: Cat4 :: University Heights :: California / Inland Empire / San > Bernardino / University Heights > > I want to transform this data into the following format: > categoryID :: categoryName :: parentCategoryID > 1 :: California :: NULL > 2 :: San Diego :: 1 > 3 :: Downtown :: 2 > 4 :: Gaslamp :: 3 > 5 :: Inland Empire :: 1 > 6 :: San Bernardino :: 5 > 7 :: University Heights :: 6 > > The solution I have in place now works by looping over all the > categoryPathIDs, a subloop over each category level with that pathID, > then inserting the data in the new table. It works, but because I have > 70,000 records in the original data, it can take up to 45 minutes to > run. > > Is there a more elegant way to translate this data that might speed up > the process a bit? The original data comes to me in a flat file. I use bcp to import it
into my SQL table and that process is very quick (a few seconds). I didnt use SSIS because it was easier for me to customize the bcp command on the fly as needed (if the filename changes, or the location of the file, or the format of the file, etc). How would the VCScript transformation work using DTS? I might be able to get the idea of the transformation and recode it somehow in T-SQL? Tim Stu wrote: Show quote > How are you importing/translating it? Are you using SSIS, or are just > dumping it into a table and then running SQL against it? > > I don't have a lot of experience with SSIS, but in DTS you could do > this sort of thing using an inline VBScript transformatin, and it would > be relatively quick. > > Stu > |
|||||||||||||||||||||||