
I want to take a Data Warehouse home.

Author
28 Jul 2006 1:40 AM
Mike Labosh
Business rule:  I am a dev-guy, and they pay me for 40 hours a week, but
they expect me to do 8192 hours of production time.

So I have to get a "copy" of the data warehouse.

But the stupid company-issue "craptop" doesn't have that kind of storage.

So here's what I am thinking:

1. Make a local empty structure of the Fact Table and each of the Dimension
Tables
2. Do a SELECT TOP 1 PERCENT INTO each of the local dimension tables
3. Do a SELECT (yakety yakety yack) from the big fat giant fact table on the
server, but joining it against the little tiny dimension tables that I have
locally.

That should get me an honest sampling of the data.  Correct?  Enough to test
my code with?
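
The three steps above can be sketched roughly as follows. This is only an
illustration in Python with in-memory SQLite databases standing in for the
server and the laptop; the tables dim_product and fact_sales and all of
their columns are made-up names, and LIMIT with a computed row count is used
because SQLite has no TOP n PERCENT.

```python
import sqlite3

# Stand-in for the warehouse server; every table and column name here is
# hypothetical and only illustrates the three steps.
server = sqlite3.connect(":memory:")
server.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT)")
server.execute("CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, "
               "product_id INTEGER, amount REAL)")
server.executemany("INSERT INTO dim_product VALUES (?, ?)",
                   [(i, f"product {i}") for i in range(1, 201)])
server.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                   [(i, i % 200 + 1, float(i)) for i in range(1, 10001)])

# Step 1: empty local structures with the same shape.
local = sqlite3.connect(":memory:")
local.execute("CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT)")
local.execute("CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, "
              "product_id INTEGER, amount REAL)")

# Step 2: copy roughly 1 percent of the dimension rows (at least one row).
dim_total = server.execute("SELECT COUNT(*) FROM dim_product").fetchone()[0]
sample_size = max(1, dim_total // 100)
dim_rows = server.execute(
    "SELECT product_id, name FROM dim_product ORDER BY product_id LIMIT ?",
    (sample_size,)).fetchall()
local.executemany("INSERT INTO dim_product VALUES (?, ?)", dim_rows)

# Step 3: pull only the fact rows whose key is in the sampled dimension.
keys = [row[0] for row in dim_rows]
marks = ",".join("?" * len(keys))
fact_rows = server.execute(
    f"SELECT sale_id, product_id, amount FROM fact_sales "
    f"WHERE product_id IN ({marks})", keys).fetchall()
local.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", fact_rows)

dim_count = local.execute("SELECT COUNT(*) FROM dim_product").fetchone()[0]
fact_count = local.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
print(dim_count, fact_count)
```

Taking the first rows by key is not a random sample, and with several
dimensions a fact row has to match every sampled dimension at once, so the
surviving fact set may turn out much smaller than 1 percent.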



--

Peace & happy computing,

Mike Labosh, MCSD MCT
Owner, vbSensei.Com
"y = (-b ± (b^2 - 4 * a * c)^.5) / 2 * a" -- Dr. Houser

Author
28 Jul 2006 2:15 AM
Stu
Typically dimension tables are rather small already; for example, many
data warehouses will have dimensions for date and time.  10 years of
days is only about 3,650 records, and a day's worth of
minutes is only 1440 rows.
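
As a quick check of those sizes, here is a small Python sketch. The
ten-year range is an arbitrary example (1997 through 2006), not anything
taken from a real warehouse, and the function name is made up.

```python
from datetime import date

def date_dimension_rows(start: date, end: date) -> int:
    """Rows in a day-grain date dimension covering start..end inclusive."""
    return (end - start).days + 1

# One row per minute of a single day in a minute-grain time dimension.
MINUTES_PER_DAY = 24 * 60

# Hypothetical ten-year span; it happens to contain two leap years.
ten_years = date_dimension_rows(date(1997, 1, 1), date(2006, 12, 31))
print(ten_years, MINUTES_PER_DAY)
```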

I guess what I'm saying is that a teeny slice of dimension data may not
be much data at all; you may want to grab the entire dimension, and
just take a tiny slice of your Facts.

Stu

Author
28 Jul 2006 2:42 AM
Aaron Bertrand [SQL Server MVP]
> But the stupid company issue "craptop" doesn't have that kind of storage.

What is "that kind of storage"?  100 GB?  300 GB?  There are several
portable/external hard drives out there with this kind of capacity for under
$200.  500 GB?  Under $300.  Weight and dimensions are roughly equivalent to
a laptop, so it's not like bringing a RAID array home.  Tell your boss to
spring for this so you can take your work home with you; otherwise you will
be spending part of your 8192 hours of production time finding hacks /
workarounds to simulate your normal 40-hour work week.
Author
28 Jul 2006 11:18 AM
Mike Labosh
> What is "that kind of storage"?  100 GB?  300 GB?  There are several
> portable/external hard drives out there with this kind of capacity for
> under $200.  500 GB?  Under $300.  Weight and dimensions are roughly
> equivalent to

Well, spoken like a true DBA.  But not like a Code-Monkey.

If I want to take my work home with me, I'm talking about MB or even KB, if
I can get it, not GB.  As it should be.  As Arnie pointed out, there are
too many dumbasses that take the whole system home, only to have it stolen
and then you see 750 thousand social security numbers scrolling across the
bottom of your tele on CNN.

I'm simply talking about getting a teeny weenie subset of the data so that
the front-side application I'm working on has something "honest" to see, ONLY
a teeny weenie bit.  That way, it's true to structure, but portable.

Paraphrasing President Nixon, "I am not a data thief".
Author
28 Jul 2006 11:25 AM
Mike Labosh
Oh, and I don't work for Enron corporation, either.

Author
28 Jul 2006 12:13 PM
Aaron Bertrand [SQL Server MVP]
> If I want to take my work home with me, I'm talking about MB or even KB,
> if I can get it, not GB.  As it should be.  As Arnie pointed out, there
> are too many dumbasses that take the whole system home,

I guess you confused me, because in your original post it sounded like the
reason you couldn't take the whole thing home was that your crappy laptop
didn't have enough storage.  Essentially, you directed me to a "get more
storage" solution.  Sorry it wasn't good enough.
Author
28 Jul 2006 5:24 AM
Arnie Rowland
Mike,

Will we be seeing your name in the 6 o'clock news related to the data
security breach affecting hundreds or thousands of 'customers' because your
laptop went 'missing'?

--
Arnie Rowland, Ph.D., MCT
Westwood Consulting, Inc

Most good judgment comes from experience.
Most experience comes from bad judgment.
- Anonymous


Author
28 Jul 2006 11:00 AM
Mike Labosh
> Will we be seeing your name in the 6 o'clock news related to the data
> security breach affecting hundreds or thousands of 'customers' because
> your laptop went 'missing'?

Absolutely not.

They can have my computer when they pry it from my cold dead fingers.
Missing?  Never happen.  Stolen?  They will have to kill me first.

Want to see the murder rate in Philadelphia rise some more?  Just try to
steal my laptop.  I will decorate the bottom of my shoes with your entrails.
(or whoever the scumbag is)

But you are correct in the sense that I am dangerous.  In this case, not
anything related to the social-security-number issue-du-jour, but in the
sense that, as a code monkey, my brain has an uncountable number of
intellectual secrets.
Author
28 Jul 2006 3:00 PM
Arnie Rowland
Mike,

If you haven't tried it yet, I recommend that you take a look at the Visual
Studio Database Pro edition CTP. It can 'build' sample data to fit a
table/database, which is perfectly good for meeting an application's
needs with junk data.

I think that you can still download the recent CTP.

Otherwise, I've done things like selecting every xth row from a parent table
(x calculated from the percentage wanted), and then all child rows from other
tables. Then, perhaps because I'm so paranoid about being on the 6 o'clock
news, I 'scrub' the data, replacing all names, addresses, phone numbers, etc.,
with strings of equal length built from [xyz123] characters, still usable for
form display.
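
A minimal sketch of that sample-then-scrub approach, in Python against an
in-memory SQLite database. The customer and orders tables, their columns,
and the helper names are all hypothetical; a real warehouse would need its
own list of sensitive columns.

```python
import sqlite3

FILLER = "xyz123"

def scrub(value: str) -> str:
    """Replace a string with filler characters of the same length."""
    repeats = len(value) // len(FILLER) + 1
    return (FILLER * repeats)[:len(value)]

def sample_and_scrub(conn: sqlite3.Connection, step: int):
    """Take every step-th parent row, all of its child rows, and scrub names/phones."""
    ids = [row[0] for row in conn.execute(
        "SELECT customer_id FROM customer ORDER BY customer_id")]
    sampled_ids = ids[::step]
    marks = ",".join("?" * len(sampled_ids))
    customers = [
        (cid, scrub(name), scrub(phone))
        for cid, name, phone in conn.execute(
            f"SELECT customer_id, name, phone FROM customer "
            f"WHERE customer_id IN ({marks}) ORDER BY customer_id", sampled_ids)
    ]
    orders = conn.execute(
        f"SELECT order_id, customer_id, total FROM orders "
        f"WHERE customer_id IN ({marks}) ORDER BY order_id", sampled_ids).fetchall()
    return customers, orders

# Made-up demo data: 100 customers with 3 orders each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)",
                 [(i, f"Customer {i}", "215-555-0100") for i in range(1, 101)])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(c * 10 + k, c, 9.99) for c in range(1, 101) for k in range(3)])

# A 10 percent sample means every 10th parent row.
customers, orders = sample_and_scrub(conn, step=10)
print(len(customers), len(orders), customers[0])
```

Keeping the scrubbed strings the same length as the originals means form
layouts and column widths still get exercised realistically.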



Author
28 Jul 2006 10:32 PM
Mike Labosh
> Then, perhaps because I'm so paranoid about being on the 6 o'clock news, I
> 'scrub' the data, replacing all names, addresses, phone numbers, etc., with
> strings of equal length [xyz123] characters, usable for form display.

EXCELLENT IDEA!!!
