Home All Groups Group Topic Archive Search About
Author
10 Jun 2006 4:00 AM
James Hancock
I'm doing synchronization of two databases. I cannot use replication in this
case because I need to control too much of the process and I need to be able
to update tables. (if I'm wrong about these limitations requiring me to use
something else, please let me know!)

So I'm looking at using Timestamps in my tables to keep track of the
changes.

So what I'm thinking is that if I could keep a table that stores the values
of the next timestamp that is going to be issued as I update each table like
so:

TableName        Next TimeStamp from Last Sync
<Table>             <Next One>

Then I could do a query (somehow, this is where most of the question comes
in) for all of the records where the timestamp in the table is > the one on
file for that table.

The trick comes in updating the two records between the server and the
workstation that is remote.  In that case I need to be able to put the
server's timestamp into the record not the one that would be automatically
updated. (and visa versa when it's a server update) I read the documentation
in Books Online that you can do this using the Select Into, but it says that
it's not recommended.

Why isn't it recommended? And is it possible to do a pure insert with a
param into a timestamp field and have it put that value in and not update
it? The key here is that if it's the same database that has been backed up
and restored on the workstation, then the two are identical at the time of
the backup. So the next value for the timestamps should be exactly the same
in both cases.

Hence I should be able to use that information to match any records between
both the server and the workstation (I presume this is how replication
works).

Concurency can be dealt with when both SQL databases have the same record
with new timestamps and they are different from eachother.

So my question is three fold (if there isn't any logic issues above)

1. Can I directly insert an existing value into a timestamp field in a table
and have it stick so that the two will match after synchronization between
the two servers? (and why does MS say this is a bad thing?)
2. Will the two servers have the same incremental order for time stamp
values?  The documentation is confusing me here, because it looks like the
timestamp is from the time of creation of the database and is actually a
time value, and not just an incremental value. But other places, it looks
like this is just a universal incremented value that counts and not a real
date/time from the creation of the database and thus could cause me issues
in this scenario.
3. How do I compare time stamp fields and get which is newer? (I know that
it's basically a binary field, so how do I find which is larger?)

Thanks for the help, I think this could greatly simplify the way I'm doing
this, and since it appears to be very similar to the way replication works
it would be very nice to be able to use a proven system.

James Hancock

Author
10 Jun 2006 10:13 PM
Erland Sommarskog
James Hancock (~ja***@darwinproductions.net) writes:
> I'm doing synchronization of two databases. I cannot use replication in
> this case because I need to control too much of the process and I need
> to be able to update tables. (if I'm wrong about these limitations
> requiring me to use something else, please let me know!)

Updates are no problem with replication. As for your other issues,
I have no idea, as I am not inro replication. I suggest that you pay
a visit to microsoft.public.sqlserver.replication to see what the
people there have to say.

> Then I could do a query (somehow, this is where most of the question
> comes in) for all of the records where the timestamp in the table is >
> the one on file for that table.
>
> The trick comes in updating the two records between the server and the
> workstation that is remote.  In that case I need to be able to put the
> server's timestamp into the record not the one that would be
> automatically updated. (and visa versa when it's a server update) I read
> the documentation in Books Online that you can do this using the Select
> Into, but it says that it's not recommended.

I'm not sure that I exactly understand what you are up to, but if you
plan to keep timestamp columns in two databases in sync, this can but
fail. Just one failed update somewhere with a timestamp in column in
the receiving database, and you are out of sync. Keep in mind that
timestamp values are like identity values: they are consumed, even if
the transaction is rolled back.

However, what may very well make sense is to mirror the server's
timestamp values on the receiving side in a binary(8) column.

> Hence I should be able to use that information to match any records
> between both the server and the workstation (I presume this is how
> replication works).

Transaction replication works out of the transaction log and simply
pushes the transactions in the publisher to the subscribers. Really
how merge replication works, I don't know.

> 1. Can I directly insert an existing value into a timestamp field in a
> table and have it stick so that the two will match after synchronization
> between the two servers? (and why does MS say this is a bad thing?)

As I said, don't try this. But mirroring the timestamp in one database
as binary(8) is OK.

> 2. Will the two servers have the same incremental order for time stamp
> values?  The documentation is confusing me here, because it looks like the
> timestamp is from the time of creation of the database and is actually a
> time value, and not just an incremental value. But other places, it looks
> like this is just a universal incremented value that counts and not a real
> date/time from the creation of the database and thus could cause me issues
> in this scenario.

Timestamp has nothing to do with date and time whatsover. It is a 64-
bit value which is monotonically increasing.

> 3. How do I compare time stamp fields and get which is newer? (I know that
> it's basically a binary field, so how do I find which is larger?)

You can use the same > operator that you use for integers etc.





--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
Author
11 Jun 2006 4:25 AM
Roger Wolter[MSFT]
I'm not sure if I completely understand all the issues but here's my take.
The biggest issue with a timestamp is knowing when to record it.  For
example, if you want to pull data out of the customer table in database A
and transfer it to the customer table in database B and you take the
timestamp when you start selecting customer records, you may pick up records
with later timestamps if they were inserted between the time you took the
timestamp and when your select statement reached that point in the table.
You can avoid this with a table lock if you can afford to lock the table
long enough to do the select.  By the way, the timestamp is a number and
doesn't represent the time the update was done.  If you record the timestamp
at the end of the select you would miss updates.

Any timestamp based system will have to deal with two major issues -
conflicting updates and deletes.  I can't think of a way to fix conflicting
updates in your proposal without recording before and after images which in
essence means you're duplicating Merge Replication.  You could also use some
kind of data stewardship to avoid conflicts but if you do, Peer to Peer
replication will probably work much better for you purposes because it
handles deletes.  The only real way to handle deletes in your proposal is to
mark records without deleting them and only delete them once your sure they
have been replicated.

If you're determined to make this work, the safest technique would be to
lock the table, select all the records with timestamps newer than the last
timestamp which you kept track of in another table, update a dummy record in
the source table to get a timestamp, commit the transaction and then read
the dummy record and record the timestamp in a timestamp table, and finally
update and insert the records in the opposite table.  This assumes you came
up with solutions for the update conflict and delete issues.  If this sounds
like what you're dong then replication would do this with a huge amount less
effort on your part.

Feel free to contact me offline if you want more analysis then you get on
the newsgroup.

--
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at
http://www.microsoft.com/info/cpyright.htm

Show quote
"James Hancock" <~ja***@darwinproductions.net> wrote in message
news:2FD6AFCE-CE8F-4E95-B052-D51ADCA3DB80@microsoft.com...
> I'm doing synchronization of two databases. I cannot use replication in
> this case because I need to control too much of the process and I need to
> be able to update tables. (if I'm wrong about these limitations requiring
> me to use something else, please let me know!)
>
> So I'm looking at using Timestamps in my tables to keep track of the
> changes.
>
> So what I'm thinking is that if I could keep a table that stores the
> values of the next timestamp that is going to be issued as I update each
> table like so:
>
> TableName        Next TimeStamp from Last Sync
> <Table>             <Next One>
>
> Then I could do a query (somehow, this is where most of the question comes
> in) for all of the records where the timestamp in the table is > the one
> on file for that table.
>
> The trick comes in updating the two records between the server and the
> workstation that is remote.  In that case I need to be able to put the
> server's timestamp into the record not the one that would be automatically
> updated. (and visa versa when it's a server update) I read the
> documentation in Books Online that you can do this using the Select Into,
> but it says that it's not recommended.
>
> Why isn't it recommended? And is it possible to do a pure insert with a
> param into a timestamp field and have it put that value in and not update
> it? The key here is that if it's the same database that has been backed up
> and restored on the workstation, then the two are identical at the time of
> the backup. So the next value for the timestamps should be exactly the
> same in both cases.
>
> Hence I should be able to use that information to match any records
> between both the server and the workstation (I presume this is how
> replication works).
>
> Concurency can be dealt with when both SQL databases have the same record
> with new timestamps and they are different from eachother.
>
> So my question is three fold (if there isn't any logic issues above)
>
> 1. Can I directly insert an existing value into a timestamp field in a
> table and have it stick so that the two will match after synchronization
> between the two servers? (and why does MS say this is a bad thing?)
> 2. Will the two servers have the same incremental order for time stamp
> values?  The documentation is confusing me here, because it looks like the
> timestamp is from the time of creation of the database and is actually a
> time value, and not just an incremental value. But other places, it looks
> like this is just a universal incremented value that counts and not a real
> date/time from the creation of the database and thus could cause me issues
> in this scenario.
> 3. How do I compare time stamp fields and get which is newer? (I know that
> it's basically a binary field, so how do I find which is larger?)
>
> Thanks for the help, I think this could greatly simplify the way I'm doing
> this, and since it appears to be very similar to the way replication works
> it would be very nice to be able to use a proven system.
>
> James Hancock
>
Author
11 Jun 2006 9:24 PM
Erland Sommarskog
Roger Wolter[MSFT] (rwol***@online.microsoft.com) writes:
> I'm not sure if I completely understand all the issues but here's my
> take. The biggest issue with a timestamp is knowing when to record it.
> For example, if you want to pull data out of the customer table in
> database A and transfer it to the customer table in database B and you
> take the timestamp when you start selecting customer records, you may
> pick up records with later timestamps if they were inserted between the
> time you took the timestamp and when your select statement reached that
> point in the table. You can avoid this with a table lock if you can
> afford to lock the table long enough to do the select. 

Yes, this is a potential problem. Particularly, if there is a heavy
update frequency.

There are alternatives to table locks, though. For instance, you can
start with retrieving the currenst highest tstamp, and retrieve a
closed interval. You should also return this maxvalue in the result
set, so that you know what the maxvalue is. (That row may be updated
before you actually retrieve it.) There are still a few more things
that can go wrong, if the update frequency is really high. (And in
such table locks are not an alternative, because it will kill
concurrency.)

> kind of data stewardship to avoid conflicts but if you do, Peer to Peer
> replication will probably work much better for you purposes because it
> handles deletes.  The only real way to handle deletes in your proposal
> is to mark records without deleting them and only delete them once your
> sure they have been replicated.

I didn't even consider deletes, thanks for pointing out, Roger.

I think this may be enough from deterring James from taking this
path. :-)

> Feel free to contact me offline if you want more analysis then you get on
> the newsgroup.

Trying to lure him into a solution with service broker? :-)


--
Erland Sommarskog, SQL Server MVP, esq***@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

AddThis Social Bookmark Button