Possible to find this duplicate?

Possible to find this duplicate?

am 13.02.2010 17:12:24 von Brian Dunning

--Apple-Mail-46-7535885
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain;
charset=us-ascii

Hey all -

I have a table listing references for chapters in a book. I'm trying to =
find all the cases where a single chapter lists more than one reference =
from the same author. In this example table, I want it to find IDs 1 and =
2, because they're both from the same author, and both in chapter 1 of =
the book. It should not return ID 4, because that's in a different =
chapter.

Note that J. and John have to be considered the same. For my purposes, =
it's sufficient to look at the first word, Smith, and consider that a =
duplicate.

+----+--------------+---------+
| ID | Author | Chapter |=20
+----+--------------+---------+
| 1 | Smith, John | 1 |=20
| 2 | Smith, J. | 1 |=20
| 3 | Williams, B. | 1 |=20
| 4 | Smith, John | 2 |=20
+----+--------------+---------+

I haven't been able to even get a start on this. Any suggestions?

- Brian


--Apple-Mail-46-7535885--

Re: Possible to find this duplicate?

am 13.02.2010 18:06:10 von Mark Goodge

On 13/02/2010 16:12, Brian Dunning wrote:
> Hey all -
>
> I have a table listing references for chapters in a book. I'm trying to find all the cases where a single chapter lists more than one reference from the same author. In this example table, I want it to find IDs 1 and 2, because they're both from the same author, and both in chapter 1 of the book. It should not return ID 4, because that's in a different chapter.
>
> Note that J. and John have to be considered the same. For my purposes, it's sufficient to look at the first word, Smith, and consider that a duplicate.
>
> +----+--------------+---------+
> | ID | Author | Chapter |
> +----+--------------+---------+
> | 1 | Smith, John | 1 |
> | 2 | Smith, J. | 1 |
> | 3 | Williams, B. | 1 |
> | 4 | Smith, John | 2 |
> +----+--------------+---------+
>
> I haven't been able to even get a start on this. Any suggestions?

Try this:

select
count(id) as total,
concat(substring_index(Author,",",1),Chapter) as my_reference
from my_table
group by my_reference
having total > 1

That may or may not work straight off, I haven't tested it. But the
thing you're looking for is something involving a substring_index on the
Author column.

Mark

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org