Home All Groups Group Topic Archive Search About

Use of NULL and DEFAULT

Author
30 Dec 2005 2:16 AM
Gerard Marshall Vignes
I have a rule of thumb which I would like to get opinions about.

RULE: Provide Data or Go Meta

SCOPE: Designing SQL Tables, UDFs and Stored Procedures

MEANING:

A good database design will restrict the occurrence of NULL. An
overabundance of NULLs in a database suggests, among other things, that the
schema is not properly normalized. It may also be difficult to determine what
NULL actually means in a give context, e.g., data that is needed but missing,
data that is not necessary, etc. This uncertainty is not good. We are
essentially corrupting our data in the process of storing it.

I believe there is one special case where NULL is valid. This is the case
where the absence of valid data requires that we defer to the meta-data to
determine what action should be taken.

If we specify that a column can accept NULL values, then we are saying that
it is ACCEPTABLE to defer to the meta-data for a situation in which the data
is not provided (for whatever reason).

If we specify that a column cannot accept NULL values, then we are saying
that it is NOT ACCEPTABLE to defer to the meta-data for a situation in which
the data is not provided (for whatever reason).

By capturing the knowledge that it is necessary to defer to the meta-data,
we are leaving it up to the meta-data to tell us how to handle the situation.
The response stipulated by the meta-data is then free to be changed---without
changing the data. This decouples changes in the meta-data response from the
data itself.

We should never use DEFAULT. We should use NULL instead. If we use DEFAULT
to plug in valid data, then we have lost the knowledge that we should be
deferring to the meta-data. We have corrupted that data by erasing the fact
that it was not provided (for whatever reason). Whe have further corrupted
the data by plugging in a value which may not have been valid in the past
and/or may not be valid in the future.

This rule of thumb attempts to avoid confustion between data and meta-data.

Feel free to comment (or to ignore me completely).

If you have seen this suggested before, in SQL or any other context, please
let me know.

Thanks,

Gerard

Author
30 Dec 2005 4:49 AM
Brian Selzer
I think your rule of thumb sucks.

A NULL indicates the absence of a value.  It does not indicate WHY the value
does not exist--nor should it: if you need that information, add another
column.

Whether or not a column can contain NULLs is a multiplicity
constraint--nothing more.  It doesn't make any sense to ascribe additional
meaning beyond that.

It doesn't make any sense to "defer" to metadata once a row has been
inserted into the database.

I don't understand what you mean by "what action should be taken."  It
appears that you want to query the current state of the metadata in order to
replace NULLs in a result set.  This makes no sense for existing rows.

DEFAULTs are good things.  They specify the initial state of an entity.

If you need to record which values were supplied, then you should add code
to the stored procedure to record that information.

Your rule of thum blurs the definition of metadata, thus increasing
confusion.


Show quote
"Gerard Marshall Vignes" <Gerard Marshall Vig***@discussions.microsoft.com>
wrote in message news:FA3008AF-93B8-48DC-8CAE-0B198BF9C12C@microsoft.com...
>I have a rule of thumb which I would like to get opinions about.
>
> RULE: Provide Data or Go Meta
>
> SCOPE: Designing SQL Tables, UDFs and Stored Procedures
>
> MEANING:
>
> A good database design will restrict the occurrence of NULL. An
> overabundance of NULLs in a database suggests, among other things, that
> the
> schema is not properly normalized. It may also be difficult to determine
> what
> NULL actually means in a give context, e.g., data that is needed but
> missing,
> data that is not necessary, etc. This uncertainty is not good. We are
> essentially corrupting our data in the process of storing it.
>
> I believe there is one special case where NULL is valid. This is the case
> where the absence of valid data requires that we defer to the meta-data to
> determine what action should be taken.
>
> If we specify that a column can accept NULL values, then we are saying
> that
> it is ACCEPTABLE to defer to the meta-data for a situation in which the
> data
> is not provided (for whatever reason).
>
> If we specify that a column cannot accept NULL values, then we are saying
> that it is NOT ACCEPTABLE to defer to the meta-data for a situation in
> which
> the data is not provided (for whatever reason).
>
> By capturing the knowledge that it is necessary to defer to the meta-data,
> we are leaving it up to the meta-data to tell us how to handle the
> situation.
> The response stipulated by the meta-data is then free to be
> changed---without
> changing the data. This decouples changes in the meta-data response from
> the
> data itself.
>
> We should never use DEFAULT. We should use NULL instead. If we use DEFAULT
> to plug in valid data, then we have lost the knowledge that we should be
> deferring to the meta-data. We have corrupted that data by erasing the
> fact
> that it was not provided (for whatever reason). Whe have further corrupted
> the data by plugging in a value which may not have been valid in the past
> and/or may not be valid in the future.
>
> This rule of thumb attempts to avoid confustion between data and
> meta-data.
>
> Feel free to comment (or to ignore me completely).
>
> If you have seen this suggested before, in SQL or any other context,
> please
> let me know.
>
> Thanks,
>
> Gerard
>
Author
30 Dec 2005 7:23 AM
Gerard Marshall Vignes
Thanks for your response, Brian!

Your comment about Multiplicity is interesting. NULLs are limited to the
case of 0 or 1. An additional table with a link via a foreign key is able to
handle a wider range 0 to N.

It is possible to build fewer tables, with more fields and use NULLs to
handle optional fields. That is poor design. It is better to create new
tables for optional fields and restrict the use of NULLs.  In fact, it is
possible to design a schema that does not use NULLs at all. Such a schema may
be more complex (more tables, more joins), but it will also be more
extensible.

You are missing the point about missing data. The objective is not to
explain why the data is missing. The point is to avoid hiding the fact that
the data is missing.

If a field must be provided (NOT NULL) and it is not available, then the row
will not be inserted. There is no confusion here.

If a field is "optional," i.e., can be NULL, then there is an opportunity
for confusion. Without deferring to meta-data (data about the data), we do
not know what a NULL value means. Assuming the meaning of NULL [and assuming
that meaning of NULL is universal] leads to confusion. This can give
unexpected results.

By deferring to the meta-data (data about the data) for a non-ambiguous
response (action), we eliminate this confusion. A response could be to (1)
ignore the field or (2) use some default value for the field. It makes more
sense to let the meta-data determine what NULL means than to let each person
or program come up with its own interpretation.

Ignoring a field is an action. Using a default value in place of the field
is an action. The point is to not read anything into the NULL that is not
stipulated in the meta-data.

DEFAULTs are poor technique. They hide the fact that no data was provided,
and thus they corrupt the data. If there is a default, it belongs in the
meta-data and not in the data.

All things being equal, the simpler solution is more desirable. It is
possible to shadow fields with other fields that describe necessary
omissions. It is better to design a schema that naturally accommodates
incomplete or otherwise imperfect data. Such a schema will not require
guesswork or workaround code.

Assuming stored procedures or triggers or any other specific programming
technique is a mistake. This is design, and design should be flexible.

Gerard

Show quote
"Brian Selzer" wrote:

> I think your rule of thumb sucks.
>
> A NULL indicates the absence of a value.  It does not indicate WHY the value
> does not exist--nor should it: if you need that information, add another
> column.
>
> Whether or not a column can contain NULLs is a multiplicity
> constraint--nothing more.  It doesn't make any sense to ascribe additional
> meaning beyond that.
>
> It doesn't make any sense to "defer" to metadata once a row has been
> inserted into the database.
>
> I don't understand what you mean by "what action should be taken."  It
> appears that you want to query the current state of the metadata in order to
> replace NULLs in a result set.  This makes no sense for existing rows.
>
> DEFAULTs are good things.  They specify the initial state of an entity.
>
> If you need to record which values were supplied, then you should add code
> to the stored procedure to record that information.
>
> Your rule of thum blurs the definition of metadata, thus increasing
> confusion.
>
>
> "Gerard Marshall Vignes" <Gerard Marshall Vig***@discussions.microsoft.com>
> wrote in message news:FA3008AF-93B8-48DC-8CAE-0B198BF9C12C@microsoft.com...
> >I have a rule of thumb which I would like to get opinions about.
> >
> > RULE: Provide Data or Go Meta
> >
> > SCOPE: Designing SQL Tables, UDFs and Stored Procedures
> >
> > MEANING:
> >
> > A good database design will restrict the occurrence of NULL. An
> > overabundance of NULLs in a database suggests, among other things, that
> > the
> > schema is not properly normalized. It may also be difficult to determine
> > what
> > NULL actually means in a give context, e.g., data that is needed but
> > missing,
> > data that is not necessary, etc. This uncertainty is not good. We are
> > essentially corrupting our data in the process of storing it.
> >
> > I believe there is one special case where NULL is valid. This is the case
> > where the absence of valid data requires that we defer to the meta-data to
> > determine what action should be taken.
> >
> > If we specify that a column can accept NULL values, then we are saying
> > that
> > it is ACCEPTABLE to defer to the meta-data for a situation in which the
> > data
> > is not provided (for whatever reason).
> >
> > If we specify that a column cannot accept NULL values, then we are saying
> > that it is NOT ACCEPTABLE to defer to the meta-data for a situation in
> > which
> > the data is not provided (for whatever reason).
> >
> > By capturing the knowledge that it is necessary to defer to the meta-data,
> > we are leaving it up to the meta-data to tell us how to handle the
> > situation.
> > The response stipulated by the meta-data is then free to be
> > changed---without
> > changing the data. This decouples changes in the meta-data response from
> > the
> > data itself.
> >
> > We should never use DEFAULT. We should use NULL instead. If we use DEFAULT
> > to plug in valid data, then we have lost the knowledge that we should be
> > deferring to the meta-data. We have corrupted that data by erasing the
> > fact
> > that it was not provided (for whatever reason). Whe have further corrupted
> > the data by plugging in a value which may not have been valid in the past
> > and/or may not be valid in the future.
> >
> > This rule of thumb attempts to avoid confustion between data and
> > meta-data.
> >
> > Feel free to comment (or to ignore me completely).
> >
> > If you have seen this suggested before, in SQL or any other context,
> > please
> > let me know.
> >
> > Thanks,
> >
> > Gerard
> >
>
>
>
Author
30 Dec 2005 3:33 PM
Brian Selzer
For the record: I wasn't advocating the use of NULLs.  I believe that they
have no place in base relations: "NULL value" is an oxymor0n.  They can,
however, be useful in relational expressions to simplify queries.

> DEFAULTs are poor technique. They hide the fact that no data was provided,
> and thus they corrupt the data. If there is a default, it belongs in the
> meta-data and not in the data.

Hello McFly!  DEFAULTs ARE a part of the metadata.  What doesn't make sense
is "deferring" to the metadata once a row has been inserted.  For example,
consider two types of time stamps in a temporal database.  One identifies
the time that an external event occurred; the other identifies the time that
information arrived in the database.  For the first type, the information
MUST be provided by the application; for the second type, DEFAULT(GETDATE())
is best.  It doesn't make sense for the user to specify it during an INSERT
because blocking may cause an interval to exist between the time that the
INSERT is issued and the time that the information becomes part of the
database, and it also makes no sense for the time stamp to be absent--in
fact, it is usually imperative that it exist in order to extract the state
of the database at a specific point-in-time.  Your technique of "deferring"
to the metadata for subsequent queries would cause vital information to be
lost--that is, the time that the information arrived in the database.

I reject your assertion that data is in any way corrupted by using DEFAULTs.
Your argument that information is lost is unconvincing.  Again, if you need
to record the fact that a value was not provided, then explicitly record
that fact in another column.

The meaning of NULL IS universal.  Interpretation of that meaning in context
is outside the scope of the database.  It is the purview of the application
developer or of the user.  All that is important from the database
perspective is that no value exists.  Again, if you need to record why it
doesn't exist, then add another column and record it.

Stored procedures and triggers are part of the database.  Triggers are
necessary to enforce some constraints that cannot be enforced
declaratively--for example, transition constraints.  Stored procedures
provide a means to embed business logic in the database where it belongs.
(Business rules are essentially database constraints.)  Sometimes triggers
are contraindicated for performance or logistical reasons.  Stored
procedures are the logical alternative.

Show quote
"Gerard Marshall Vignes" <GerardMarshallVig***@discussions.microsoft.com>
wrote in message news:07461B4D-0703-4428-8795-4D21C180C531@microsoft.com...
> Thanks for your response, Brian!
>
> Your comment about Multiplicity is interesting. NULLs are limited to the
> case of 0 or 1. An additional table with a link via a foreign key is able
> to
> handle a wider range 0 to N.
>
> It is possible to build fewer tables, with more fields and use NULLs to
> handle optional fields. That is poor design. It is better to create new
> tables for optional fields and restrict the use of NULLs.  In fact, it is
> possible to design a schema that does not use NULLs at all. Such a schema
> may
> be more complex (more tables, more joins), but it will also be more
> extensible.
>
> You are missing the point about missing data. The objective is not to
> explain why the data is missing. The point is to avoid hiding the fact
> that
> the data is missing.
>
> If a field must be provided (NOT NULL) and it is not available, then the
> row
> will not be inserted. There is no confusion here.
>
> If a field is "optional," i.e., can be NULL, then there is an opportunity
> for confusion. Without deferring to meta-data (data about the data), we do
> not know what a NULL value means. Assuming the meaning of NULL [and
> assuming
> that meaning of NULL is universal] leads to confusion. This can give
> unexpected results.
>
> By deferring to the meta-data (data about the data) for a non-ambiguous
> response (action), we eliminate this confusion. A response could be to (1)
> ignore the field or (2) use some default value for the field. It makes
> more
> sense to let the meta-data determine what NULL means than to let each
> person
> or program come up with its own interpretation.
>
> Ignoring a field is an action. Using a default value in place of the field
> is an action. The point is to not read anything into the NULL that is not
> stipulated in the meta-data.
>
> DEFAULTs are poor technique. They hide the fact that no data was provided,
> and thus they corrupt the data. If there is a default, it belongs in the
> meta-data and not in the data.
>
> All things being equal, the simpler solution is more desirable. It is
> possible to shadow fields with other fields that describe necessary
> omissions. It is better to design a schema that naturally accommodates
> incomplete or otherwise imperfect data. Such a schema will not require
> guesswork or workaround code.
>
> Assuming stored procedures or triggers or any other specific programming
> technique is a mistake. This is design, and design should be flexible.
>
> Gerard
>
> "Brian Selzer" wrote:
>
>> I think your rule of thumb sucks.
>>
>> A NULL indicates the absence of a value.  It does not indicate WHY the
>> value
>> does not exist--nor should it: if you need that information, add another
>> column.
>>
>> Whether or not a column can contain NULLs is a multiplicity
>> constraint--nothing more.  It doesn't make any sense to ascribe
>> additional
>> meaning beyond that.
>>
>> It doesn't make any sense to "defer" to metadata once a row has been
>> inserted into the database.
>>
>> I don't understand what you mean by "what action should be taken."  It
>> appears that you want to query the current state of the metadata in order
>> to
>> replace NULLs in a result set.  This makes no sense for existing rows.
>>
>> DEFAULTs are good things.  They specify the initial state of an entity.
>>
>> If you need to record which values were supplied, then you should add
>> code
>> to the stored procedure to record that information.
>>
>> Your rule of thum blurs the definition of metadata, thus increasing
>> confusion.
>>
>>
>> "Gerard Marshall Vignes" <Gerard Marshall
>> Vig***@discussions.microsoft.com>
>> wrote in message
>> news:FA3008AF-93B8-48DC-8CAE-0B198BF9C12C@microsoft.com...
>> >I have a rule of thumb which I would like to get opinions about.
>> >
>> > RULE: Provide Data or Go Meta
>> >
>> > SCOPE: Designing SQL Tables, UDFs and Stored Procedures
>> >
>> > MEANING:
>> >
>> > A good database design will restrict the occurrence of NULL. An
>> > overabundance of NULLs in a database suggests, among other things, that
>> > the
>> > schema is not properly normalized. It may also be difficult to
>> > determine
>> > what
>> > NULL actually means in a give context, e.g., data that is needed but
>> > missing,
>> > data that is not necessary, etc. This uncertainty is not good. We are
>> > essentially corrupting our data in the process of storing it.
>> >
>> > I believe there is one special case where NULL is valid. This is the
>> > case
>> > where the absence of valid data requires that we defer to the meta-data
>> > to
>> > determine what action should be taken.
>> >
>> > If we specify that a column can accept NULL values, then we are saying
>> > that
>> > it is ACCEPTABLE to defer to the meta-data for a situation in which the
>> > data
>> > is not provided (for whatever reason).
>> >
>> > If we specify that a column cannot accept NULL values, then we are
>> > saying
>> > that it is NOT ACCEPTABLE to defer to the meta-data for a situation in
>> > which
>> > the data is not provided (for whatever reason).
>> >
>> > By capturing the knowledge that it is necessary to defer to the
>> > meta-data,
>> > we are leaving it up to the meta-data to tell us how to handle the
>> > situation.
>> > The response stipulated by the meta-data is then free to be
>> > changed---without
>> > changing the data. This decouples changes in the meta-data response
>> > from
>> > the
>> > data itself.
>> >
>> > We should never use DEFAULT. We should use NULL instead. If we use
>> > DEFAULT
>> > to plug in valid data, then we have lost the knowledge that we should
>> > be
>> > deferring to the meta-data. We have corrupted that data by erasing the
>> > fact
>> > that it was not provided (for whatever reason). Whe have further
>> > corrupted
>> > the data by plugging in a value which may not have been valid in the
>> > past
>> > and/or may not be valid in the future.
>> >
>> > This rule of thumb attempts to avoid confustion between data and
>> > meta-data.
>> >
>> > Feel free to comment (or to ignore me completely).
>> >
>> > If you have seen this suggested before, in SQL or any other context,
>> > please
>> > let me know.
>> >
>> > Thanks,
>> >
>> > Gerard
>> >
>>
>>
>>
Author
30 Dec 2005 8:50 PM
Gerard Marshall Vignes
Thanks again Brian!

I agree with you, Brian, and with Scott. There are times when DEFAULT is
good technique. Logging and other activities that support forensics are good
examples. I was being a purist and taking my so-called "rule-of-thumb" too
far.

KEY POINT: Since our response to missing data may change over time, we need
to retain the knowledge that the data was missing. This is the one thing that
NULL does well. I advocate using the meta-data to stipulate the response to a
NULL (missing data).

Meta-data that remains meta-data is good meta-data. Meta-data that gets
copied into data and erases the fact that no data was provided is not good
meta-data.

Allowing missing data without stipulating how to respond to missing data
invites confusion. Is the record usable? Should it be retained?

Your point about Stored Procedures and Triggers is good. I would add to that
User Defined Functions, DTS Jobs (which can run VBScript & COM!) and new
technologies that Microsoft is adding to SQL Server.

I usually decide to implement logic in the database for performance reasons.
Database servers are usually very powerful and can grind out code rapidly.
Reducing the amount of data moving between the database and its clients
reduces network traffic. This downside is that this produces systems that are
highly non-portable.

That is an entirely new thread, and probably one more interesting than this
one :-)

Thanks Again,

Gerard


Show quote
"Brian Selzer" wrote:

> For the record: I wasn't advocating the use of NULLs.  I believe that they
> have no place in base relations: "NULL value" is an oxymor0n.  They can,
> however, be useful in relational expressions to simplify queries.
>
> > DEFAULTs are poor technique. They hide the fact that no data was provided,
> > and thus they corrupt the data. If there is a default, it belongs in the
> > meta-data and not in the data.
>
> Hello McFly!  DEFAULTs ARE a part of the metadata.  What doesn't make sense
> is "deferring" to the metadata once a row has been inserted.  For example,
> consider two types of time stamps in a temporal database.  One identifies
> the time that an external event occurred; the other identifies the time that
> information arrived in the database.  For the first type, the information
> MUST be provided by the application; for the second type, DEFAULT(GETDATE())
> is best.  It doesn't make sense for the user to specify it during an INSERT
> because blocking may cause an interval to exist between the time that the
> INSERT is issued and the time that the information becomes part of the
> database, and it also makes no sense for the time stamp to be absent--in
> fact, it is usually imperative that it exist in order to extract the state
> of the database at a specific point-in-time.  Your technique of "deferring"
> to the metadata for subsequent queries would cause vital information to be
> lost--that is, the time that the information arrived in the database.
>
> I reject your assertion that data is in any way corrupted by using DEFAULTs.
> Your argument that information is lost is unconvincing.  Again, if you need
> to record the fact that a value was not provided, then explicitly record
> that fact in another column.
>
> The meaning of NULL IS universal.  Interpretation of that meaning in context
> is outside the scope of the database.  It is the purview of the application
> developer or of the user.  All that is important from the database
> perspective is that no value exists.  Again, if you need to record why it
> doesn't exist, then add another column and record it.
>
> Stored procedures and triggers are part of the database.  Triggers are
> necessary to enforce some constraints that cannot be enforced
> declaratively--for example, transition constraints.  Stored procedures
> provide a means to embed business logic in the database where it belongs.
> (Business rules are essentially database constraints.)  Sometimes triggers
> are contraindicated for performance or logistical reasons.  Stored
> procedures are the logical alternative.
>
> "Gerard Marshall Vignes" <GerardMarshallVig***@discussions.microsoft.com>
> wrote in message news:07461B4D-0703-4428-8795-4D21C180C531@microsoft.com...
> > Thanks for your response, Brian!
> >
> > Your comment about Multiplicity is interesting. NULLs are limited to the
> > case of 0 or 1. An additional table with a link via a foreign key is able
> > to
> > handle a wider range 0 to N.
> >
> > It is possible to build fewer tables, with more fields and use NULLs to
> > handle optional fields. That is poor design. It is better to create new
> > tables for optional fields and restrict the use of NULLs.  In fact, it is
> > possible to design a schema that does not use NULLs at all. Such a schema
> > may
> > be more complex (more tables, more joins), but it will also be more
> > extensible.
> >
> > You are missing the point about missing data. The objective is not to
> > explain why the data is missing. The point is to avoid hiding the fact
> > that
> > the data is missing.
> >
> > If a field must be provided (NOT NULL) and it is not available, then the
> > row
> > will not be inserted. There is no confusion here.
> >
> > If a field is "optional," i.e., can be NULL, then there is an opportunity
> > for confusion. Without deferring to meta-data (data about the data), we do
> > not know what a NULL value means. Assuming the meaning of NULL [and
> > assuming
> > that meaning of NULL is universal] leads to confusion. This can give
> > unexpected results.
> >
> > By deferring to the meta-data (data about the data) for a non-ambiguous
> > response (action), we eliminate this confusion. A response could be to (1)
> > ignore the field or (2) use some default value for the field. It makes
> > more
> > sense to let the meta-data determine what NULL means than to let each
> > person
> > or program come up with its own interpretation.
> >
> > Ignoring a field is an action. Using a default value in place of the field
> > is an action. The point is to not read anything into the NULL that is not
> > stipulated in the meta-data.
> >
> > DEFAULTs are poor technique. They hide the fact that no data was provided,
> > and thus they corrupt the data. If there is a default, it belongs in the
> > meta-data and not in the data.
> >
> > All things being equal, the simpler solution is more desirable. It is
> > possible to shadow fields with other fields that describe necessary
> > omissions. It is better to design a schema that naturally accommodates
> > incomplete or otherwise imperfect data. Such a schema will not require
> > guesswork or workaround code.
> >
> > Assuming stored procedures or triggers or any other specific programming
> > technique is a mistake. This is design, and design should be flexible.
> >
> > Gerard
> >
> > "Brian Selzer" wrote:
> >
> >> I think your rule of thumb sucks.
> >>
> >> A NULL indicates the absence of a value.  It does not indicate WHY the
> >> value
> >> does not exist--nor should it: if you need that information, add another
> >> column.
> >>
> >> Whether or not a column can contain NULLs is a multiplicity
> >> constraint--nothing more.  It doesn't make any sense to ascribe
> >> additional
> >> meaning beyond that.
> >>
> >> It doesn't make any sense to "defer" to metadata once a row has been
> >> inserted into the database.
> >>
> >> I don't understand what you mean by "what action should be taken."  It
> >> appears that you want to query the current state of the metadata in order
> >> to
> >> replace NULLs in a result set.  This makes no sense for existing rows.
> >>
> >> DEFAULTs are good things.  They specify the initial state of an entity.
> >>
> >> If you need to record which values were supplied, then you should add
> >> code
> >> to the stored procedure to record that information.
> >>
> >> Your rule of thum blurs the definition of metadata, thus increasing
> >> confusion.
> >>
> >>
> >> "Gerard Marshall Vignes" <Gerard Marshall
> >> Vig***@discussions.microsoft.com>
> >> wrote in message
> >> news:FA3008AF-93B8-48DC-8CAE-0B198BF9C12C@microsoft.com...
> >> >I have a rule of thumb which I would like to get opinions about.
> >> >
> >> > RULE: Provide Data or Go Meta
> >> >
> >> > SCOPE: Designing SQL Tables, UDFs and Stored Procedures
> >> >
> >> > MEANING:
> >> >
> >> > A good database design will restrict the occurrence of NULL. An
> >> > overabundance of NULLs in a database suggests, among other things, that
> >> > the
> >> > schema is not properly normalized. It may also be difficult to
> >> > determine
> >> > what
> >> > NULL actually means in a give context, e.g., data that is needed but
> >> > missing,
> >> > data that is not necessary, etc. This uncertainty is not good. We are
> >> > essentially corrupting our data in the process of storing it.
> >> >
> >> > I believe there is one special case where NULL is valid. This is the
> >> > case
> >> > where the absence of valid data requires that we defer to the meta-data
> >> > to
> >> > determine what action should be taken.
> >> >
> >> > If we specify that a column can accept NULL values, then we are saying
> >> > that
> >> > it is ACCEPTABLE to defer to the meta-data for a situation in which the
> >> > data
> >> > is not provided (for whatever reason).
> >> >
> >> > If we specify that a column cannot accept NULL values, then we are
> >> > saying
> >> > that it is NOT ACCEPTABLE to defer to the meta-data for a situation in
> >> > which
> >> > the data is not provided (for whatever reason).
> >> >
> >> > By capturing the knowledge that it is necessary to defer to the
> >> > meta-data,
> >> > we are leaving it up to the meta-data to tell us how to handle the
> >> > situation.
> >> > The response stipulated by the meta-data is then free to be
> >> > changed---without
> >> > changing the data. This decouples changes in the meta-data response
> >> > from
> >> > the
> >> > data itself.
> >> >
> >> > We should never use DEFAULT. We should use NULL instead. If we use
> >> > DEFAULT
> >> > to plug in valid data, then we have lost the knowledge that we should
> >> > be
> >> > deferring to the meta-data. We have corrupted that data by erasing the
> >> > fact
> >> > that it was not provided (for whatever reason). Whe have further
> >> > corrupted
> >> > the data by plugging in a value which may not have been valid in the
> >> > past
> >> > and/or may not be valid in the future.
> >> >
> >> > This rule of thumb attempts to avoid confustion between data and
> >> > meta-data.
> >> >
> >> > Feel free to comment (or to ignore me completely).
> >> >
> >> > If you have seen this suggested before, in SQL or any other context,
> >> > please
> >> > let me know.
> >> >
> >> > Thanks,
> >> >
> >> > Gerard
> >> >
> >>
> >>
> >>
>
>
>
Author
31 Dec 2005 12:03 AM
Brian Selzer
> KEY POINT: Since our response to missing data may change over time, we
> need
> to retain the knowledge that the data was missing. This is the one thing
> that
> NULL does well. I advocate using the meta-data to stipulate the response
> to a
> NULL (missing data).

The problem is that the metadata may also change over time, so you lose
information either way.  If the metadata changes today, and queries rely on
it, then a query that was run yesterday will return different results than a
query that ran today.  One of Codd's rules stipulates that a nondestructive
schema change shouldn't break existing applications.  I would consider an
application that returns different results on different days given the same
set of data broken.


Show quote
"Gerard Marshall Vignes" <GerardMarshallVig***@discussions.microsoft.com>
wrote in message news:8B80941F-0668-4EB2-B5E8-55EFFB117A16@microsoft.com...
> Thanks again Brian!
>
> I agree with you, Brian, and with Scott. There are times when DEFAULT is
> good technique. Logging and other activities that support forensics are
> good
> examples. I was being a purist and taking my so-called "rule-of-thumb" too
> far.
>
> KEY POINT: Since our response to missing data may change over time, we
> need
> to retain the knowledge that the data was missing. This is the one thing
> that
> NULL does well. I advocate using the meta-data to stipulate the response
> to a
> NULL (missing data).
>
> Meta-data that remains meta-data is good meta-data. Meta-data that gets
> copied into data and erases the fact that no data was provided is not good
> meta-data.
>
> Allowing missing data without stipulating how to respond to missing data
> invites confusion. Is the record usable? Should it be retained?
>
> Your point about Stored Procedures and Triggers is good. I would add to
> that
> User Defined Functions, DTS Jobs (which can run VBScript & COM!) and new
> technologies that Microsoft is adding to SQL Server.
>
> I usually decide to implement logic in the database for performance
> reasons.
> Database servers are usually very powerful and can grind out code rapidly.
> Reducing the amount of data moving between the database and its clients
> reduces network traffic. This downside is that this produces systems that
> are
> highly non-portable.
>
> That is an entirely new thread, and probably one more interesting than
> this
> one :-)
>
> Thanks Again,
>
> Gerard
>
>
> "Brian Selzer" wrote:
>
>> For the record: I wasn't advocating the use of NULLs.  I believe that
>> they
>> have no place in base relations: "NULL value" is an oxymor0n.  They can,
>> however, be useful in relational expressions to simplify queries.
>>
>> > DEFAULTs are poor technique. They hide the fact that no data was
>> > provided,
>> > and thus they corrupt the data. If there is a default, it belongs in
>> > the
>> > meta-data and not in the data.
>>
>> Hello McFly!  DEFAULTs ARE a part of the metadata.  What doesn't make
>> sense
>> is "deferring" to the metadata once a row has been inserted.  For
>> example,
>> consider two types of time stamps in a temporal database.  One identifies
>> the time that an external event occurred; the other identifies the time
>> that
>> information arrived in the database.  For the first type, the information
>> MUST be provided by the application; for the second type,
>> DEFAULT(GETDATE())
>> is best.  It doesn't make sense for the user to specify it during an
>> INSERT
>> because blocking may cause an interval to exist between the time that the
>> INSERT is issued and the time that the information becomes part of the
>> database, and it also makes no sense for the time stamp to be absent--in
>> fact, it is usually imperative that it exist in order to extract the
>> state
>> of the database at a specific point-in-time.  Your technique of
>> "deferring"
>> to the metadata for subsequent queries would cause vital information to
>> be
>> lost--that is, the time that the information arrived in the database.
>>
>> I reject your assertion that data is in any way corrupted by using
>> DEFAULTs.
>> Your argument that information is lost is unconvincing.  Again, if you
>> need
>> to record the fact that a value was not provided, then explicitly record
>> that fact in another column.
>>
>> The meaning of NULL IS universal.  Interpretation of that meaning in
>> context
>> is outside the scope of the database.  It is the purview of the
>> application
>> developer or of the user.  All that is important from the database
>> perspective is that no value exists.  Again, if you need to record why it
>> doesn't exist, then add another column and record it.
>>
>> Stored procedures and triggers are part of the database.  Triggers are
>> necessary to enforce some constraints that cannot be enforced
>> declaratively--for example, transition constraints.  Stored procedures
>> provide a means to embed business logic in the database where it belongs.
>> (Business rules are essentially database constraints.)  Sometimes
>> triggers
>> are contraindicated for performance or logistical reasons.  Stored
>> procedures are the logical alternative.
>>
>> "Gerard Marshall Vignes" <GerardMarshallVig***@discussions.microsoft.com>
>> wrote in message
>> news:07461B4D-0703-4428-8795-4D21C180C531@microsoft.com...
>> > Thanks for your response, Brian!
>> >
>> > Your comment about Multiplicity is interesting. NULLs are limited to
>> > the
>> > case of 0 or 1. An additional table with a link via a foreign key is
>> > able
>> > to
>> > handle a wider range 0 to N.
>> >
>> > It is possible to build fewer tables, with more fields and use NULLs to
>> > handle optional fields. That is poor design. It is better to create new
>> > tables for optional fields and restrict the use of NULLs.  In fact, it
>> > is
>> > possible to design a schema that does not use NULLs at all. Such a
>> > schema
>> > may
>> > be more complex (more tables, more joins), but it will also be more
>> > extensible.
>> >
>> > You are missing the point about missing data. The objective is not to
>> > explain why the data is missing. The point is to avoid hiding the fact
>> > that
>> > the data is missing.
>> >
>> > If a field must be provided (NOT NULL) and it is not available, then
>> > the
>> > row
>> > will not be inserted. There is no confusion here.
>> >
>> > If a field is "optional," i.e., can be NULL, then there is an
>> > opportunity
>> > for confusion. Without deferring to meta-data (data about the data), we
>> > do
>> > not know what a NULL value means. Assuming the meaning of NULL [and
>> > assuming
>> > that meaning of NULL is universal] leads to confusion. This can give
>> > unexpected results.
>> >
>> > By deferring to the meta-data (data about the data) for a non-ambiguous
>> > response (action), we eliminate this confusion. A response could be to
>> > (1)
>> > ignore the field or (2) use some default value for the field. It makes
>> > more
>> > sense to let the meta-data determine what NULL means than to let each
>> > person
>> > or program come up with its own interpretation.
>> >
>> > Ignoring a field is an action. Using a default value in place of the
>> > field
>> > is an action. The point is to not read anything into the NULL that is
>> > not
>> > stipulated in the meta-data.
>> >
>> > DEFAULTs are poor technique. They hide the fact that no data was
>> > provided,
>> > and thus they corrupt the data. If there is a default, it belongs in
>> > the
>> > meta-data and not in the data.
>> >
>> > All things being equal, the simpler solution is more desirable. It is
>> > possible to shadow fields with other fields that describe necessary
>> > omissions. It is better to design a schema that naturally accommodates
>> > incomplete or otherwise imperfect data. Such a schema will not require
>> > guesswork or workaround code.
>> >
>> > Assuming stored procedures or triggers or any other specific
>> > programming
>> > technique is a mistake. This is design, and design should be flexible.
>> >
>> > Gerard
>> >
>> > "Brian Selzer" wrote:
>> >
>> >> I think your rule of thumb sucks.
>> >>
>> >> A NULL indicates the absence of a value.  It does not indicate WHY the
>> >> value
>> >> does not exist--nor should it: if you need that information, add
>> >> another
>> >> column.
>> >>
>> >> Whether or not a column can contain NULLs is a multiplicity
>> >> constraint--nothing more.  It doesn't make any sense to ascribe
>> >> additional
>> >> meaning beyond that.
>> >>
>> >> It doesn't make any sense to "defer" to metadata once a row has been
>> >> inserted into the database.
>> >>
>> >> I don't understand what you mean by "what action should be taken."  It
>> >> appears that you want to query the current state of the metadata in
>> >> order
>> >> to
>> >> replace NULLs in a result set.  This makes no sense for existing rows.
>> >>
>> >> DEFAULTs are good things.  They specify the initial state of an
>> >> entity.
>> >>
>> >> If you need to record which values were supplied, then you should add
>> >> code
>> >> to the stored procedure to record that information.
>> >>
>> >> Your rule of thum blurs the definition of metadata, thus increasing
>> >> confusion.
>> >>
>> >>
>> >> "Gerard Marshall Vignes" <Gerard Marshall
>> >> Vig***@discussions.microsoft.com>
>> >> wrote in message
>> >> news:FA3008AF-93B8-48DC-8CAE-0B198BF9C12C@microsoft.com...
>> >> >I have a rule of thumb which I would like to get opinions about.
>> >> >
>> >> > RULE: Provide Data or Go Meta
>> >> >
>> >> > SCOPE: Designing SQL Tables, UDFs and Stored Procedures
>> >> >
>> >> > MEANING:
>> >> >
>> >> > A good database design will restrict the occurrence of NULL. An
>> >> > overabundance of NULLs in a database suggests, among other things,
>> >> > that
>> >> > the
>> >> > schema is not properly normalized. It may also be difficult to
>> >> > determine
>> >> > what
>> >> > NULL actually means in a give context, e.g., data that is needed but
>> >> > missing,
>> >> > data that is not necessary, etc. This uncertainty is not good. We
>> >> > are
>> >> > essentially corrupting our data in the process of storing it.
>> >> >
>> >> > I believe there is one special case where NULL is valid. This is the
>> >> > case
>> >> > where the absence of valid data requires that we defer to the
>> >> > meta-data
>> >> > to
>> >> > determine what action should be taken.
>> >> >
>> >> > If we specify that a column can accept NULL values, then we are
>> >> > saying
>> >> > that
>> >> > it is ACCEPTABLE to defer to the meta-data for a situation in which
>> >> > the
>> >> > data
>> >> > is not provided (for whatever reason).
>> >> >
>> >> > If we specify that a column cannot accept NULL values, then we are
>> >> > saying
>> >> > that it is NOT ACCEPTABLE to defer to the meta-data for a situation
>> >> > in
>> >> > which
>> >> > the data is not provided (for whatever reason).
>> >> >
>> >> > By capturing the knowledge that it is necessary to defer to the
>> >> > meta-data,
>> >> > we are leaving it up to the meta-data to tell us how to handle the
>> >> > situation.
>> >> > The response stipulated by the meta-data is then free to be
>> >> > changed---without
>> >> > changing the data. This decouples changes in the meta-data response
>> >> > from
>> >> > the
>> >> > data itself.
>> >> >
>> >> > We should never use DEFAULT. We should use NULL instead. If we use
>> >> > DEFAULT
>> >> > to plug in valid data, then we have lost the knowledge that we
>> >> > should
>> >> > be
>> >> > deferring to the meta-data. We have corrupted that data by erasing
>> >> > the
>> >> > fact
>> >> > that it was not provided (for whatever reason). Whe have further
>> >> > corrupted
>> >> > the data by plugging in a value which may not have been valid in the
>> >> > past
>> >> > and/or may not be valid in the future.
>> >> >
>> >> > This rule of thumb attempts to avoid confustion between data and
>> >> > meta-data.
>> >> >
>> >> > Feel free to comment (or to ignore me completely).
>> >> >
>> >> > If you have seen this suggested before, in SQL or any other context,
>> >> > please
>> >> > let me know.
>> >> >
>> >> > Thanks,
>> >> >
>> >> > Gerard
>> >> >
>> >>
>> >>
>> >>
>>
>>
>>
Author
31 Dec 2005 3:12 AM
Gerard Marshall Vignes
Gerard is no longer participating in this discussion.

For further reference, please contact:

Name: John Doe
Phone: 999-999-9999
Address: 123 Primrose Lane, Anytown USA 99999-9999

Thank You :-)


Show quote
"Brian Selzer" wrote:

> > KEY POINT: Since our response to missing data may change over time, we
> > need
> > to retain the knowledge that the data was missing. This is the one thing
> > that
> > NULL does well. I advocate using the meta-data to stipulate the response
> > to a
> > NULL (missing data).
>
> The problem is that the metadata may also change over time, so you lose
> information either way.  If the metadata changes today, and queries rely on
> it, then a query that was run yesterday will return different results than a
> query that ran today.  One of Codd's rules stipulates that a nondestructive
> schema change shouldn't break existing applications.  I would consider an
> application that returns different results on different days given the same
> set of data broken.
>
>
> "Gerard Marshall Vignes" <GerardMarshallVig***@discussions.microsoft.com>
> wrote in message news:8B80941F-0668-4EB2-B5E8-55EFFB117A16@microsoft.com...
> > Thanks again Brian!
> >
> > I agree with you, Brian, and with Scott. There are times when DEFAULT is
> > good technique. Logging and other activities that support forensics are
> > good
> > examples. I was being a purist and taking my so-called "rule-of-thumb" too
> > far.
> >
> > KEY POINT: Since our response to missing data may change over time, we
> > need
> > to retain the knowledge that the data was missing. This is the one thing
> > that
> > NULL does well. I advocate using the meta-data to stipulate the response
> > to a
> > NULL (missing data).
> >
> > Meta-data that remains meta-data is good meta-data. Meta-data that gets
> > copied into data and erases the fact that no data was provided is not good
> > meta-data.
> >
> > Allowing missing data without stipulating how to respond to missing data
> > invites confusion. Is the record usable? Should it be retained?
> >
> > Your point about Stored Procedures and Triggers is good. I would add to
> > that
> > User Defined Functions, DTS Jobs (which can run VBScript & COM!) and new
> > technologies that Microsoft is adding to SQL Server.
> >
> > I usually decide to implement logic in the database for performance
> > reasons.
> > Database servers are usually very powerful and can grind out code rapidly.
> > Reducing the amount of data moving between the database and its clients
> > reduces network traffic. This downside is that this produces systems that
> > are
> > highly non-portable.
> >
> > That is an entirely new thread, and probably one more interesting than
> > this
> > one :-)
> >
> > Thanks Again,
> >
> > Gerard
> >
> >
> > "Brian Selzer" wrote:
> >
> >> For the record: I wasn't advocating the use of NULLs.  I believe that
> >> they
> >> have no place in base relations: "NULL value" is an oxymor0n.  They can,
> >> however, be useful in relational expressions to simplify queries.
> >>
> >> > DEFAULTs are poor technique. They hide the fact that no data was
> >> > provided,
> >> > and thus they corrupt the data. If there is a default, it belongs in
> >> > the
> >> > meta-data and not in the data.
> >>
> >> Hello McFly!  DEFAULTs ARE a part of the metadata.  What doesn't make
> >> sense
> >> is "deferring" to the metadata once a row has been inserted.  For
> >> example,
> >> consider two types of time stamps in a temporal database.  One identifies
> >> the time that an external event occurred; the other identifies the time
> >> that
> >> information arrived in the database.  For the first type, the information
> >> MUST be provided by the application; for the second type,
> >> DEFAULT(GETDATE())
> >> is best.  It doesn't make sense for the user to specify it during an
> >> INSERT
> >> because blocking may cause an interval to exist between the time that the
> >> INSERT is issued and the time that the information becomes part of the
> >> database, and it also makes no sense for the time stamp to be absent--in
> >> fact, it is usually imperative that it exist in order to extract the
> >> state
> >> of the database at a specific point-in-time.  Your technique of
> >> "deferring"
> >> to the metadata for subsequent queries would cause vital information to
> >> be
> >> lost--that is, the time that the information arrived in the database.
> >>
> >> I reject your assertion that data is in any way corrupted by using
> >> DEFAULTs.
> >> Your argument that information is lost is unconvincing.  Again, if you
> >> need
> >> to record the fact that a value was not provided, then explicitly record
> >> that fact in another column.
> >>
> >> The meaning of NULL IS universal.  Interpretation of that meaning in
> >> context
> >> is outside the scope of the database.  It is the purview of the
> >> application
> >> developer or of the user.  All that is important from the database
> >> perspective is that no value exists.  Again, if you need to record why it
> >> doesn't exist, then add another column and record it.
> >>
> >> Stored procedures and triggers are part of the database.  Triggers are
> >> necessary to enforce some constraints that cannot be enforced
> >> declaratively--for example, transition constraints.  Stored procedures
> >> provide a means to embed business logic in the database where it belongs.
> >> (Business rules are essentially database constraints.)  Sometimes
> >> triggers
> >> are contraindicated for performance or logistical reasons.  Stored
> >> procedures are the logical alternative.
> >>
> >> "Gerard Marshall Vignes" <GerardMarshallVig***@discussions.microsoft.com>
> >> wrote in message
> >> news:07461B4D-0703-4428-8795-4D21C180C531@microsoft.com...
> >> > Thanks for your response, Brian!
> >> >
> >> > Your comment about Multiplicity is interesting. NULLs are limited to
> >> > the
> >> > case of 0 or 1. An additional table with a link via a foreign key is
> >> > able
> >> > to
> >> > handle a wider range 0 to N.
> >> >
> >> > It is possible to build fewer tables, with more fields and use NULLs to
> >> > handle optional fields. That is poor design. It is better to create new
> >> > tables for optional fields and restrict the use of NULLs.  In fact, it
> >> > is
> >> > possible to design a schema that does not use NULLs at all. Such a
> >> > schema
> >> > may
> >> > be more complex (more tables, more joins), but it will also be more
> >> > extensible.
> >> >
> >> > You are missing the point about missing data. The objective is not to
> >> > explain why the data is missing. The point is to avoid hiding the fact
> >> > that
> >> > the data is missing.
> >> >
> >> > If a field must be provided (NOT NULL) and it is not available, then
> >> > the
> >> > row
> >> > will not be inserted. There is no confusion here.
> >> >
> >> > If a field is "optional," i.e., can be NULL, then there is an
> >> > opportunity
> >> > for confusion. Without deferring to meta-data (data about the data), we
> >> > do
> >> > not know what a NULL value means. Assuming the meaning of NULL [and
> >> > assuming
> >> > that meaning of NULL is universal] leads to confusion. This can give
> >> > unexpected results.
> >> >
> >> > By deferring to the meta-data (data about the data) for a non-ambiguous
> >> > response (action), we eliminate this confusion. A response could be to
> >> > (1)
> >> > ignore the field or (2) use some default value for the field. It makes
> >> > more
> >> > sense to let the meta-data determine what NULL means than to let each
> >> > person
> >> > or program come up with its own interpretation.
> >> >
> >> > Ignoring a field is an action. Using a default value in place of the
> >> > field
> >> > is an action. The point is to not read anything into the NULL that is
> >> > not
> >> > stipulated in the meta-data.
> >> >
> >> > DEFAULTs are poor technique. They hide the fact that no data was
> >> > provided,
> >> > and thus they corrupt the data. If there is a default, it belongs in
> >> > the
> >> > meta-data and not in the data.
> >> >
> >> > All things being equal, the simpler solution is more desirable. It is
> >> > possible to shadow fields with other fields that describe necessary
> >> > omissions. It is better to design a schema that naturally accommodates
> >> > incomplete or otherwise imperfect data. Such a schema will not require
> >> > guesswork or workaround code.
> >> >
> >> > Assuming stored procedures or triggers or any other specific
> >> > programming
> >> > technique is a mistake. This is design, and design should be flexible.
> >> >
> >> > Gerard
> >> >
> >> > "Brian Selzer" wrote:
> >> >
> >> >> I think your rule of thumb sucks.
> >> >>
> >> >> A NULL indicates the absence of a value.  It does not indicate WHY the
> >> >> value
> >> >> does not exist--nor should it: if you need that information, add
> >> >> another
> >> >> column.
> >> >>
> >> >> Whether or not a column can contain NULLs is a multiplicity
> >> >> constraint--nothing more.  It doesn't make any sense to ascribe
> >> >> additional
> >> >> meaning beyond that.
> >> >>
> >> >> It doesn't make any sense to "defer" to metadata once a row has been
> >> >> inserted into the database.
> >> >>
> >> >> I don't understand what you mean by "what action should be taken."  It
> >> >> appears that you want to query the current state of the metadata in
> >> >> order
> >> >> to
> >> >> replace NULLs in a result set.  This makes no sense for existing rows.
> >> >>
> >> >> DEFAULTs are good things.  They specify the initial state of an
> >> >> entity.
> >> >>
> >> >> If you need to record which values were supplied, then you should add
> >> >> code
> >> >> to the stored procedure to record that information.
> >> >>
> >> >> Your rule of thum blurs the definition of metadata, thus increasing
> >> >> confusion.
> >> >>
> >> >>
> >> >> "Gerard Marshall Vignes" <Gerard Marshall
> >> >> Vig***@discussions.microsoft.com>
> >> >> wrote in message
> >> >> news:FA3008AF-93B8-48DC-8CAE-0B198BF9C12C@microsoft.com...
> >> >> >I have a rule of thumb which I would like to get opinions about.
> >> >> >
> >> >> > RULE: Provide Data or Go Meta
> >> >> >
> >> >> > SCOPE: Designing SQL Tables, UDFs and Stored Procedures
> >> >> >
> >> >> > MEANING:
> >> >> >
> >> >> > A good database design will restrict the occurrence of NULL. An
> >> >> > overabundance of NULLs in a database suggests, among other things,
> >> >> > that
> >> >> > the
> >> >> > schema is not properly normalized. It may also be difficult to
> >> >> > determine
> >> >> > what
> >> >> > NULL actually means in a give context, e.g., data that is needed but
> >> >> > missing,
> >> >> > data that is not necessary, etc. This uncertainty is not good. We
> >> >> > are
> >> >> > essentially corrupting our data in the process of storing it.
> >> >> >
> >> >> > I believe there is one special case where NULL is valid. This is the
> >> >> > case
> >> >> > where the absence of valid data requires that we defer to the
> >> >> > meta-data
> >> >> > to
> >> >> > determine what action should be taken.
> >> >> >
> >> >> > If we specify that a column can accept NULL values, then we are
> >> >> > saying
> >> >> > that
> >> >> > it is ACCEPTABLE to defer to the meta-data for a situation in which
> >> >> > the
> >> >> > data
> >> >> > is not provided (for whatever reason).
> >> >> >
> >> >> > If we specify that a column cannot accept NULL values, then we are
> >> >> > saying
> >> >> > that it is NOT ACCEPTABLE to defer to the meta-data for a situation
> >> >> > in
> >> >> > which
> >> >> > the data is not provided (for whatever reason).
> >> >> >
> >> >> > By capturing the knowledge that it is necessary to defer to the
> >> >> > meta-data,
> >> >> > we are leaving it up to the meta-data to tell us how to handle the
> >> >> > situation.
> >> >> > The response stipulated by the meta-data is then free to be
> >> >> > changed---without
> >> >> > changing the data. This decouples changes in the meta-data response
> >> >> > from
> >> >> > the
> >> >> > data itself.
> >> >> >
> >> >> > We should never use DEFAULT. We should use NULL instead. If we use
> >> >> > DEFAULT
> >> >> > to plug in valid data, then we have lost the knowledge that we
> >> >> > should
> >> >> > be
> >> >> > deferring to the meta-data. We have corrupted that data by erasing
> >> >> > the
> >> >> > fact
> >> >> > that it was not provided (for whatever reason). Whe have further
Author
30 Dec 2005 2:25 PM
Scott Morris
> A good database design will restrict the occurrence of NULL. An
> overabundance of NULLs in a database suggests, among other things, that
> the
> schema is not properly normalized. It may also be difficult to determine
> what

I agree with this rule.

> We should never use DEFAULT. We should use NULL instead. If we use DEFAULT
> to plug in valid data, then we have lost the knowledge that we should be

I don't completely agree with this rule.  One popular use for defaults is
auditing (to track the identity of the creator and datetime of creation).
However, a gui-based system will often require that a "default" be visible
(and changeable) to the user, thus obviating the need for the corresponding
default within the schema.    In general, I think a "rule of thumb" should
avoid absolutes ("never").
Author
30 Dec 2005 7:32 PM
Gerard Marshall Vignes
Thanks Scott!

My mistake. I agree that absolutes should be avoided.

Logging activities are an example of where DEFAULT is good. Thank you for
pointing this out. I was a purist.

Gerard

Show quote
"Scott Morris" wrote:

> > A good database design will restrict the occurrence of NULL. An
> > overabundance of NULLs in a database suggests, among other things, that
> > the
> > schema is not properly normalized. It may also be difficult to determine
> > what
>
>  I agree with this rule.
>
> > We should never use DEFAULT. We should use NULL instead. If we use DEFAULT
> > to plug in valid data, then we have lost the knowledge that we should be
>
> I don't completely agree with this rule.  One popular use for defaults is
> auditing (to track the identity of the creator and datetime of creation).
> However, a gui-based system will often require that a "default" be visible
> (and changeable) to the user, thus obviating the need for the corresponding
> default within the schema.    In general, I think a "rule of thumb" should
> avoid absolutes ("never").
>
>
>

AddThis Social Bookmark Button