Automatically Archiving a Large Table

Automatically Archiving a Large Table

am 04.12.2007 00:43:05 von Brian - Support

I have an extremely large table that is starting to slow down
production. I want to keep it for historical queries, but want to have
it fast for daily production activity. The daily activity will never
access records older than a few days. One of the difficult things
about this table is there is a lot of foreign keys that reference the
table.

So, I'm thinking of doing this:

1) Keep a "live" archive of the table using INSERT, UPDATE, and DELETE
triggers.
2) Having a job that disables the DELETE triggers and will delete
records older than 1 week.
3) Of course, part of this is also deleting the records that reference
this table.
4) Doing the same thing, with other large tables including, of course,
the ones that reference this table.

It seems the trick, mostly, is to delete the records from various
tables in order, so that none of the foreign keys are left hanging.

Is this a reasonable approach? Is there a more straight forward/built-
in technique for doing this?

We examined partitioning and this does not seem to be a practical
approach because, amoung other things, some of the tables are self
referencing and these references would cross partition boundaries.

Thanks!

- Brian

Re: Automatically Archiving a Large Table

am 04.12.2007 04:14:03 von Tom van Stiphout

On Mon, 3 Dec 2007 15:43:05 -0800 (PST), Brian - Support
wrote:

What do you consider "extremely large"?

A query like:
select * from SomeTable
where SomeDate between 'x' and 'y'
should be fast if SomeDate is indexed and x and y are only a few days
apart, even if there are millions of rows. That's the power of
indexing. So it should be fast to select the "active records" for use
in the front-end application.

-Tom.



>I have an extremely large table that is starting to slow down
>production. I want to keep it for historical queries, but want to have
>it fast for daily production activity. The daily activity will never
>access records older than a few days. One of the difficult things
>about this table is there is a lot of foreign keys that reference the
>table.
>
>So, I'm thinking of doing this:
>
>1) Keep a "live" archive of the table using INSERT, UPDATE, and DELETE
>triggers.
>2) Having a job that disables the DELETE triggers and will delete
>records older than 1 week.
>3) Of course, part of this is also deleting the records that reference
>this table.
>4) Doing the same thing, with other large tables including, of course,
>the ones that reference this table.
>
>It seems the trick, mostly, is to delete the records from various
>tables in order, so that none of the foreign keys are left hanging.
>
>Is this a reasonable approach? Is there a more straight forward/built-
>in technique for doing this?
>
>We examined partitioning and this does not seem to be a practical
>approach because, amoung other things, some of the tables are self
>referencing and these references would cross partition boundaries.
>
>Thanks!
>
>- Brian

Re: Automatically Archiving a Large Table

am 04.12.2007 20:56:19 von Brian - Support

A handful of tables with 10 - 20 million records.

For one thing, we're having to do update statitistics quite frequently
or performance slows down, and the update statistics is taking 3-4
minutes for each table (at 2% sampling) and using quite a bit of CPU
when it runs.

Also, we would put the larger "archive" tables in a seperate filegroup
so that we can do frequent backups of the smaller tables for quicker
emergency restores.

Re: Automatically Archiving a Large Table

am 05.12.2007 00:02:18 von Erland Sommarskog

Brian - Support (mail@brianrice.com) writes:
> A handful of tables with 10 - 20 million records.
>
> For one thing, we're having to do update statitistics quite frequently
> or performance slows down, and the update statistics is taking 3-4
> minutes for each table (at 2% sampling) and using quite a bit of CPU
> when it runs.

10-20 millions rows is not that much. 3-4 minutes for 2% sample sound
a lot. Then again, what is the average row size?

One thing that we found when we want to put UPDATE STATISTICS in an
maintenance job, was that adding the WITH INDEX clause gave us good
performance. This means that we neglected the statistics on non-
indexed columns. Which may or may not be acceptable.

As for your archiving scheme, it's difficult to comment without knowing
all the details. But beware that the DELETE operations can be painful,
particularly, if there is activity in the other end of the table while
the DELETE is running.


--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downlo ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books .mspx