|
database
newsgroups
|
|||||||||||||||||||||||
|
|||||||||||||||||||||||
Sort Numbers in Nvarchar As Numbersaround replacements and substrings and other paraphernalia. In my case the nvarchar field can contain any type of numbers/letters in any combination, but strings that are numbers should be sorted as such. For example: 1 7 100 20 A8 G57X 91b17 B32 should be sorted as: 1 7 20 100 91b17 A8 B32 G57X I believe that COLLATE can be used, but I want to only use it when I SELECT since modifying the collation of the column in the table design gives me the usual "Cannot resolve collation conflict" error. I tried ORDER BY CAST(ProductCode AS nvarchar(20)) COLLATE Latin1_General_BIN but that had no effect on the sort order. Nor did putting any other collation in there seem to affect the output. Is there any way to do this without using temporary tables or custom types? Andrew
create table #t (c varchar(20)) insert into #t values ('1') insert into #t values ('7') insert into #t values ('100') insert into #t values ('20') insert into #t values ('A8') insert into #t values ('G57X') insert into #t values ('91b17') insert into #t values ('B32') SELECT * FROM #t ORDER BY substring(c, 0,patindex('%[0-9]%',c))+right ('00000' + substring(c, patindex('%[0-9]%',c) , len(c)),5) Show quote "Andrew Hayes" <AndrewHa***@discussions.microsoft.com> wrote in message news:ehj6mygsGHA.1596@TK2MSFTNGP05.phx.gbl... > I've read several posts and articles about this, but they all see to > revolve around replacements and substrings and other paraphernalia. In my > case the nvarchar field can contain any type of numbers/letters in any > combination, but strings that are numbers should be sorted as such. > > For example: > > 1 > 7 > 100 > 20 > A8 > G57X > 91b17 > B32 > > should be sorted as: > > 1 > 7 > 20 > 100 > 91b17 > A8 > B32 > G57X > > I believe that COLLATE can be used, but I want to only use it when I > SELECT since modifying the collation of the column in the table design > gives me the usual "Cannot resolve collation conflict" error. > > I tried > > ORDER BY CAST(ProductCode AS nvarchar(20)) COLLATE Latin1_General_BIN > > but that had no effect on the sort order. Nor did putting any other > collation in there seem to affect the output. > > Is there any way to do this without using temporary tables or custom > types? > Thanks Uri. I modified the ORDER BY to be like you mentioned and it seemed
to give me the right result, but then I realised it's not exactly what I was looking for. That is, if the table contains: 91 91b17 91g 91fz64 91g8 It would sort those as: 91 91b17 91fz64 91g 91g8 I'll try messing about with patindex and substrings to see if I can get the results I need. Andrew
Well , I can see two posible soultions . 1) Using a SORT opeartion on the CLIENT side istead of T-SQL 2) If you persist using T-SQL , try loop thru the cursor to get what you want Show quote "Andrew Hayes" <AndrewHa***@discussions.microsoft.com> wrote in message news:OsIQaRhsGHA.1304@TK2MSFTNGP06.phx.gbl... > Thanks Uri. I modified the ORDER BY to be like you mentioned and it seemed > to give me the right result, but then I realised it's not exactly what I > was looking for. > > That is, if the table contains: > > 91 > 91b17 > 91g > 91fz64 > 91g8 > > It would sort those as: > > 91 > 91b17 > 91fz64 > 91g > 91g8 > > I'll try messing about with patindex and substrings to see if I can get > the results I need. > Yeah. It would be simple if it was a fixed format, but it's a general column
used for multiple customers with different methods, so I have to be generic about it. Will see how it goes. Thanks for your help though. It's given me a few new ideas to try. Show quote "Uri Dimant" <u***@iscar.co.il> wrote in message news:uokUQWhsGHA.3264@TK2MSFTNGP03.phx.gbl... > Andrew > > Well , I can see two posible soultions . > 1) Using a SORT opeartion on the CLIENT side istead of T-SQL > 2) If you persist using T-SQL , try loop thru the cursor to get what you > want > > > > "Andrew Hayes" <AndrewHa***@discussions.microsoft.com> wrote in message > news:OsIQaRhsGHA.1304@TK2MSFTNGP06.phx.gbl... >> Thanks Uri. I modified the ORDER BY to be like you mentioned and it >> seemed to give me the right result, but then I realised it's not exactly >> what I was looking for. >> >> That is, if the table contains: >> >> 91 >> 91b17 >> 91g >> 91fz64 >> 91g8 >> >> It would sort those as: >> >> 91 >> 91b17 >> 91fz64 >> 91g >> 91g8 >> >> I'll try messing about with patindex and substrings to see if I can get >> the results I need. >> > > Andrew,
Here are two solutions that allow you to sort, if the rule is as follows: Separate each identifier into its contiguous alphabetic and numeric "runs". Sort the identifiers by first "run", then second "run", then third "run", etc. Sort numeric values before alphabetic ones, and sort numeric values by number, alphabetic values lexicographically. Before giving the examples, I'll make one more suggestion - if each customer's [id]s sort correctly among themselves, can you sort by customer first, or is there a real business reason to define an ordering that tells you whether one of customer X's names comes before or after one of customer Y's? Anyway, using Uri's setup, here are the two solutions. Each has limitations, and the UDF might better be written using CLR routines, if you are using SQL Server 2005. The CTE solution can also be rewritten using techniques from SQL Server 2000, though it's a bit messier - the two solutions are not that much different, really - they both split the identifiers up into their pieces, then sort the pieces, right-justifying numeric segments and left-justifying string ones. create function Stretch ( @c varchar(20) ) returns varchar(200) as begin declare @return varchar(200) set @return = '' declare @chunk varchar(10) set @chunk = '' declare @pos int set @pos = 0 declare @ten int set @ten = 10 while @pos <= len(@c) begin set @chunk = @chunk + substring(@c,@pos,1) while substring(@c,@pos,2)+'0' like '[0-9][0-9]%' or substring(@c,@pos,2)+'a' like '[^0-9][^0-9]%' begin set @pos = @pos + 1 set @chunk = @chunk + substring(@c,@pos,1) end set @pos = @pos + 1 if @chunk like '[0-9]%' set @return = @return + space(10-len(@chunk)) + @chunk else set @return = @return + @chunk + space(10-len(@chunk)) set @chunk = '' end return @return end go create table #t (c varchar(20)) insert into #t values ('1') insert into #t values ('7') insert into #t values ('100') insert into #t values ('20') insert into #t values ('A8') insert into #t values ('G57X') insert into #t values ('91a17') insert into #t values ('91ac7') insert into #t values ('91b17') insert into #t values ('B32') go with Nbr(n) as ( select distinct number from master..spt_values where number between 0 and 50 ), Tsplits(c,n,rk) as ( select c, n, row_number() over (partition by c order by n) as rk from #t, Nbr where n <= len(c)+1 and ( n = len(c)+1 or substring('a'+c,n,2) like '[^0-9][0-9]' or substring('0'+c,n,2) like '[0-9][^0-9]' ) ), Tchunks(c,s,p) as ( select c, substring(c,min(n),max(n)-min(n)), min(rk) from Tsplits cross join ( select 0 as i union all select 1 ) Two group by c,rk+i having count(n) = 2 ), TchunksSlid(c,s,p) as ( select c, case when s like '[0-9]%' then right('0000000000'+s,10) else s end, p from Tchunks ) select c from TchunksSlid group by c order by max(case when p = 1 then s else '' end), max(case when p = 2 then s else '' end), max(case when p = 3 then s else '' end), max(case when p = 4 then s else '' end) go select c from #t order by dbo.Stretch(c) go drop table #t drop function Stretch -- Steve Kass -- Drew University -- http://www.stevekass.com -- 8F2E85D3-2C21-4855-94F3-9A600F4BC58B Andrew Hayes wrote: Show quote >Yeah. It would be simple if it was a fixed format, but it's a general column >used for multiple customers with different methods, so I have to be generic >about it. > >Will see how it goes. Thanks for your help though. It's given me a few new >ideas to try. > >"Uri Dimant" <u***@iscar.co.il> wrote in message >news:uokUQWhsGHA.3264@TK2MSFTNGP03.phx.gbl... > > >>Andrew >> >>Well , I can see two posible soultions . >>1) Using a SORT opeartion on the CLIENT side istead of T-SQL >>2) If you persist using T-SQL , try loop thru the cursor to get what you >>want >> >> >> >>"Andrew Hayes" <AndrewHa***@discussions.microsoft.com> wrote in message >>news:OsIQaRhsGHA.1304@TK2MSFTNGP06.phx.gbl... >> >> >>>Thanks Uri. I modified the ORDER BY to be like you mentioned and it >>>seemed to give me the right result, but then I realised it's not exactly >>>what I was looking for. >>> >>>That is, if the table contains: >>> >>>91 >>>91b17 >>>91g >>>91fz64 >>>91g8 >>> >>>It would sort those as: >>> >>>91 >>>91b17 >>>91fz64 >>>91g >>>91g8 >>> >>>I'll try messing about with patindex and substrings to see if I can get >>>the results I need. >>> >>> >>> >> >> > > > >
Show quote
> That is, if the table contains: It would be so cool, if you could get away with something outrageous like > > 91 > 91b17 > 91g > 91fz64 > 91g8 > > It would sort those as: > > 91 > 91b17 > 91fz64 > 91g > 91g8 this: SELECT c, ISNUMERIC(c) AS ItsANumber FROM #t CASE WHEN ISNUMERIC(c) THEN ORDER BY CAST(c AS INT) ELSE ORDER BY c END And then leave it buried way down deep in the system, without even a single comment, just waiting for the poor guy that comes after you. -- Peace & happy computing, Mike Labosh, MCSD MCT Owner, vbSensei.Com "y = (-b ± (b^2 - 4 * a * c)^.5) / 2 * a" -- Dr. Houser Andrew Hayes wrote:
> I've read several posts and articles about this, but they all see to revolve Have you tried something like:> around replacements and substrings and other paraphernalia. In my case the > nvarchar field can contain any type of numbers/letters in any combination, > but strings that are numbers should be sorted as such. ORDER BY CASE WHEN ISNUMERIC(ProductCode) = 1 THEN CAST(ProductCode AS INT) ELSE <2billion,or whatever max INT value can be> END, ProductCode I don't have access to SQL Server at the moment so not sure if that will work or not, but give it a go. Note that ISNUMERIC returns 1 for decimals as well, so if PATINDEX might be better if you only want integers (e.g. PATINDEX('%[^0-9]%') = 0). Chris
if you add insert into #t values ('1.') to my DDL ,sql server will throw an error. In some sitations it is not reliable to use ISNUMERIC() Aaron has a great article on his web www.aspfaq.com site about the subject Show quote "Chris Lim" <blackca***@hotmail.com> wrote in message news:1154068610.636318.52980@75g2000cwc.googlegroups.com... > Andrew Hayes wrote: >> I've read several posts and articles about this, but they all see to >> revolve >> around replacements and substrings and other paraphernalia. In my case >> the >> nvarchar field can contain any type of numbers/letters in any >> combination, >> but strings that are numbers should be sorted as such. > > Have you tried something like: > > ORDER BY > CASE WHEN ISNUMERIC(ProductCode) = 1 > THEN CAST(ProductCode AS INT) > ELSE <2billion,or whatever max INT value can be> > END, > ProductCode > > I don't have access to SQL Server at the moment so not sure if that > will work or not, but give it a go. Note that ISNUMERIC returns 1 for > decimals as well, so if PATINDEX might be better if you only want > integers (e.g. PATINDEX('%[^0-9]%') = 0). > > if you add insert into #t values ('1.') to my DDL ,sql server will throw There is a similar evil behavior for currencies:> an error. In some sitations it is not reliable to use ISNUMERIC() > Aaron has a great article on his web www.aspfaq.com site about the > subject "$1, 234.56" is clearly not a numeric string, but on a system configured for USA Dollars (via Control Panel -> Regional Settings) says this: IF ISNUMERIC("$1,234.56"), it succeeds, but then on the same server, if you execute this: IF ISNUMERIC("EU1,234.56") (for Euros) it fails. Then on that same server, you can visit the Regional Settings and tell it to use EUROS instead of US Dollars, and you will get opposite behavior: ISNUMERIC("$1,234.56") is not, but ISNUMERIC("EU1,234.56") will be. What a painful afternoon THAT was! Ditto on dates: When a user types "5/6/70", is that May 6th or June 5th? Depending on how the Regional Settings are set, it will be interpreted as one or the other. And it gets even better. Your server has a regional settings configuration, and so does the client workstation, and if they are not the same, both computers can interpret the same value differently. For both numeric currency AND dates. GRR!! -- Peace & happy computing, Mike Labosh, MCSD MCT Owner, vbSensei.Com "y = (-b ± (b^2 - 4 * a * c)^.5) / 2 * a" -- Dr. Houser All of which shows how important it is to leave as little
as possible to chance. Either use culture-independent literals (For dates, 'YYYYMMDD', and for currency, decimal literals with no comma or currency symbol.), or explicity convert. If you think the date and number settings are a mess, try troubleshooting non-Unicode code page incompatibilities. SK Mike Labosh wrote: Show quote >>if you add insert into #t values ('1.') to my DDL ,sql server will throw >>an error. In some sitations it is not reliable to use ISNUMERIC() >>Aaron has a great article on his web www.aspfaq.com site about the >>subject >> >> > >There is a similar evil behavior for currencies: > >"$1, 234.56" is clearly not a numeric string, but on a system configured for >USA Dollars (via Control Panel -> Regional Settings) says this: > >IF ISNUMERIC("$1,234.56"), it succeeds, but then on the same server, if you >execute this: > >IF ISNUMERIC("EU1,234.56") (for Euros) it fails. > >Then on that same server, you can visit the Regional Settings and tell it to >use EUROS instead of US Dollars, and you will get opposite behavior: > >ISNUMERIC("$1,234.56") is not, but ISNUMERIC("EU1,234.56") will be. > >What a painful afternoon THAT was! > >Ditto on dates: When a user types "5/6/70", is that May 6th or June 5th? >Depending on how the Regional Settings are set, it will be interpreted as >one or the other. > >And it gets even better. Your server has a regional settings configuration, >and so does the client workstation, and if they are not the same, both >computers can interpret the same value differently. For both numeric >currency AND dates. > >GRR!! > > > All of which shows how important it is to leave as little OMG! I have been talking to databases since I was 9 years old, and NOW the > as possible to chance. Either use culture-independent > literals (For dates, 'YYYYMMDD', and for currency, > decimal literals with no comma or currency symbol.), > or explicity convert. light just clicked on: This is precisely why Mainframers do bizarre COBOL nonsense that stores it as X8 data type as yyyymmdd, so you can cast it to a number, and it has place value that helps you do date-diffs.. And more importantly, it is unmistakably deterministic. I always thought it was because 1,000 years ago, disk capacity cost so much that the ancient Romans had to save bytes. (or something like that) But the catistrophic bug is built in: When the date value of 31-Dec-9999 rolls over to the next day, all our code is guaranteed to break because of that pesky "Y-10-K" bug. Dammit, back to the drawing board. -- Peace & happy computing, Mike Labosh, MCSD MCT Owner, vbSensei.Com "y = (-b ± (b^2 - 4 * a * c)^.5) / 2 * a" -- Dr. Houser And if it really takes 6.02E+23 protons to make a "mole", is that why those critters are so little? After having carefully studied this entire thread, and hacking around inside
QA for 4 hours or something, a buddy of mine got you *ALMOST* there: --drop table #t create table #t(f varchar(10)) insert #t values('10') insert #t values('1') insert #t values('100') insert #t values('91') insert #t values('91b17') insert #t values('91g') insert #t values('91fz64') insert #t values('91g8') select f2, f from ( select f, f2 = cast( case when isnumeric(substring(f,1,1))= 1 then substring(f,1,1) else '' end + case when isnumeric(substring(f,2,1))= 1 then substring(f,2,1) else '' end + case when isnumeric(substring(f,3,1))= 1 then substring(f,3,1) else '' end + case when isnumeric(substring(f,4,1))= 1 then substring(f,4,1) else '' end + case when isnumeric(substring(f,5,1))= 1 then substring(f,5,1) else '' end + case when isnumeric(substring(f,6,1))= 1 then substring(f,6,1) else '' end as int) from #t ) t order by f2 The only problem is that (1) is the filthiest hack I have ever seen and (2) it does not sort the 100 after the 91b. But I'm still chewing on it some more. I dig questions of this bizarre nature because they stimulate my brain, and pull my (married) mind away from the totally hot babes sitting in my class this week. So congradulations: My wife will hate you forever because your interesting problem kept me sitting at my computer for two days straight. :) -- Peace & happy computing, Mike Labosh, MCSD MCT Owner, vbSensei.Com "y = (-b ± (b^2 - 4 * a * c)^.5) / 2 * a" -- Dr. Houser Did sorting by my Stretch() function get the order wrong?
SK Mike Labosh wrote: Show quote >After having carefully studied this entire thread, and hacking around inside >QA for 4 hours or something, a buddy of mine got you *ALMOST* there: > >--drop table #t >create table #t(f varchar(10)) >insert #t values('10') >insert #t values('1') >insert #t values('100') >insert #t values('91') >insert #t values('91b17') >insert #t values('91g') >insert #t values('91fz64') >insert #t values('91g8') > >select f2, f from > ( > select f, f2 = > cast( > case when isnumeric(substring(f,1,1))= 1 then substring(f,1,1) else '' >end + > case when isnumeric(substring(f,2,1))= 1 then substring(f,2,1) else '' >end + > case when isnumeric(substring(f,3,1))= 1 then substring(f,3,1) else '' >end + > case when isnumeric(substring(f,4,1))= 1 then substring(f,4,1) else '' >end + > case when isnumeric(substring(f,5,1))= 1 then substring(f,5,1) else '' >end + > case when isnumeric(substring(f,6,1))= 1 then substring(f,6,1) else '' >end > as int) > from #t > ) t >order by f2 > >The only problem is that (1) is the filthiest hack I have ever seen and (2) >it does not sort the 100 after the 91b. > >But I'm still chewing on it some more. I dig questions of this bizarre >nature because they stimulate my brain, and pull my (married) mind away from >the totally hot babes sitting in my class this week. > >So congradulations: My wife will hate you forever because your interesting >problem kept me sitting at my computer for two days straight. :) > > > > Mike Labosh wrote:
> The only problem is that (1) is the filthiest hack I have ever seen and (2) I thought the OP only wanted strings that were entirely numbers to be> it does not sort the 100 after the 91b. treated as numbers, and strings that were were alphanumeric to be sorted as strings? The required order of his sample data was: 1 7 20 100 91b17 A8 B32 G57X Have I totally missed the point? (I think I have, given that everyone else has interpreted the requirements differently to me!) Chris |
|||||||||||||||||||||||