T-SQL Duplicate Record Help

T-SQL Duplicate Record Help

am 12.04.2008 21:55:04 von Jane T

I want to check the data from an invoicing database for possible duplicate
invoices.

I have created an example script below.

I want to return rows where there is a match for both reference_ and value_,
so in this
example in my SELECT statement I only want to return the two rows that match
on both
reference_ and value_


CREATE TABLE #TMP
(reference_ varchar(10), suppliercode_ varchar(10), value_ money)

INSERT INTO #TMP VALUES ('A1','123',100)
INSERT INTO #TMP VALUES ('A2','234',100)
INSERT INTO #TMP VALUES ('A1','345',100)
INSERT INTO #TMP VALUES ('A2','234',70)

SELECT * FROM #TMP ORDER BY REFERENCE_

Re: T-SQL Duplicate Record Help

am 13.04.2008 02:59:37 von Tom van Stiphout

On Sat, 12 Apr 2008 20:55:04 +0100, "Jane T" wrote:

(I renamed your table to tblTest)
Here is how I solve it for simple cases involving a single field:
select reference_
from tblTest
group by reference_
having count(reference_) > 1

With more than one field, I simply concatenate those fields and then
I'm using the same method as in the first query:
select reference_ + '-' + cast(value_ as varchar)
from tblTest
group by reference_ + '-' + cast(value_ as varchar)
having count(reference_ + '-' + cast(value_ as varchar)) > 1

If you then want the entire records for the found values, then compare
the concatenated values with those of the values from the previous
query:
select *
from tblTest
where reference_ + '-' + cast(value_ as varchar) = (select
reference_ + '-' + cast(value_ as varchar)
from tblTest
group by reference_ + '-' + cast(value_ as varchar)
having count(reference_ + '-' + cast(value_ as varchar)) > 1)

-Tom.


>I want to check the data from an invoicing database for possible duplicate
>invoices.
>
>I have created an example script below.
>
>I want to return rows where there is a match for both reference_ and value_,
>so in this
>example in my SELECT statement I only want to return the two rows that match
>on both
>reference_ and value_
>
>
>CREATE TABLE #TMP
>(reference_ varchar(10), suppliercode_ varchar(10), value_ money)
>
>INSERT INTO #TMP VALUES ('A1','123',100)
>INSERT INTO #TMP VALUES ('A2','234',100)
>INSERT INTO #TMP VALUES ('A1','345',100)
>INSERT INTO #TMP VALUES ('A2','234',70)
>
>SELECT * FROM #TMP ORDER BY REFERENCE_
>

Re: T-SQL Duplicate Record Help

am 13.04.2008 03:59:02 von Erland Sommarskog

Jane T (janet@nospam.net) writes:

> I want to check the data from an invoicing database for possible duplicate
> invoices.
>
> I have created an example script below.
>
> I want to return rows where there is a match for both reference_ and
> value_, so in this example in my SELECT statement I only want to return
> the two rows that match on both reference_ and value_
>
>
> CREATE TABLE #TMP
> (reference_ varchar(10), suppliercode_ varchar(10), value_ money)
>
> INSERT INTO #TMP VALUES ('A1','123',100)
> INSERT INTO #TMP VALUES ('A2','234',100)
> INSERT INTO #TMP VALUES ('A1','345',100)
> INSERT INTO #TMP VALUES ('A2','234',70)
>
> SELECT * FROM #TMP ORDER BY REFERENCE_

SELECT reference_, value_, COUNT(*)
FROM #TMP
GROUP BY reference_, value_
HAVING COUNT(*) > 1



--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downlo ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books .mspx

Re: T-SQL Duplicate Record Help

am 13.04.2008 05:29:13 von Plamen Ratchev

Here is one way to return all duplicate rows based on the two columns (SQL
Server 2005):

;WITH Dups
AS
(SELECT reference_, suppliercode_, value_,
COUNT(*) OVER(
PARTITION BY reference_,
value_) AS cnt
FROM #TMP)
SELECT reference_, suppliercode_, value_
FROM Dups
WHERE cnt > 1;


HTH,

Plamen Ratchev
http://www.SQLStudio.com