Bookmarks

Yahoo Gmail Google Facebook Delicious Twitter Reddit Stumpleupon Myspace Digg

Search queries

sqldatasource dal, wwwxxxenden, convert raid5 to raid 10 mdadm, apache force chunked, nrao wwwxxx, xxxxxdup, procmail change subject header, wwwXxx not20, Wwwxxx.doks sas, linux raid resync after reboot

Links

XODOX
Impressum

#1: CHECKSUM to determine record changes

Posted on 2008-04-22 00:22:36 by rcamarda

I've searched the forum for uses of CHECKSUM and havent found a
satisfactory answer.
I need to know when a row changes and I dont care what it was or is, I
just need to know it changed.
Detail:
Hourly I select rows where the record as a create date or change date
of the last 3 days. I am only interested when the address or name
changes, not other columns.
Currently, I might select 30,000 rows that are new or changed, but
turns out that only 100 have address changes and 500 are new. I want
to process the 600, not the 30,000.
My main concern is if some columns change, but the resulting checksum
doesnt. Then I would have missed processing that record.
The column types that I will be tracking are:
"forenames" VARCHAR(50) NULL,
"surname" VARCHAR(51) NULL,
"ADDRESS1" VARCHAR(60) NULL,
"ADDRESS2" VARCHAR(60) NULL,
"ADDRESS4" VARCHAR(50) NULL,
"STATE" VARCHAR(10) NULL,
"ZIP" VARCHAR(20) NULL,
"email" VARCHAR(50) NULL,
"telephone" VARCHAR(10) NULL
(I left off the keys, these are the only fields that I will be using
for the checksum()).
Am I safe? I read about 32 bit CRC and some data changes would go
unnoticed, but not sure if this layout would qualifiy.

TIA
Rob

Report this message

#2: Re: CHECKSUM to determine record changes

Posted on 2008-04-22 00:33:34 by rcamarda

Im using SQL Server 2005 currently patched.

Report this message

#3: Re: CHECKSUM to determine record changes

Posted on 2008-04-22 08:25:56 by David Portas

"rcamarda" <robert.a.camarda@gmail.com> wrote in message
news:2d88cd48-2853-4f01-b797-c5c166bd167d@k13g2000hse.google groups.com...
> Im using SQL Server 2005 currently patched.

CHECKSUM isn't a reliable way to detect change because it's quite common to
find different rows with the same CHECKSUM value. You could use a ROWVERSION
column instead. ROWVERSION is guaranteed to increment when the row data is
updated.

Another alternative is to use a hash. The HashBytes function will return a
secure hash of a binary value with a very high probability of uniqueness.
Duplicate hashes are theoretically possible but are incredibly unlikely to
occur unintentionally. If you are extremely paranoid then you can use two
different hashes.

--
David Portas

Report this message

#4: Re: CHECKSUM to determine record changes

Posted on 2008-04-23 18:42:38 by Gert-Jan Strik

David Portas wrote:
>
> "rcamarda" <robert.a.camarda@gmail.com> wrote in message
> news:2d88cd48-2853-4f01-b797-c5c166bd167d@k13g2000hse.google groups.com...
> > Im using SQL Server 2005 currently patched.
>
> CHECKSUM isn't a reliable way to detect change because it's quite common to
> find different rows with the same CHECKSUM value. You could use a ROWVERSION
> column instead. ROWVERSION is guaranteed to increment when the row data is
> updated.
>
> Another alternative is to use a hash. The HashBytes function will return a
> secure hash of a binary value with a very high probability of uniqueness.
> Duplicate hashes are theoretically possible but are incredibly unlikely to
> occur unintentionally. If you are extremely paranoid then you can use two
> different hashes.
>
> --
> David Portas

I agree about the advice for use ROWVERSION.

However, CHECKSUM is also just a hash. Whether the chance of an
unnotices change is lower if you use HashBytes function instead of
CHECKSUM depends on your data.

Although adding a second hash value will lower the chance of a missed
change, it would be silly to do so. In the end you would need as many
hash bytes as there are bytes in the data.

--
Gert-Jan

Report this message