Not Exists joining 2 tables

on 23.10.2007 20:52:02 by Artie

Hi,
I was hoping somebody could assist me with this. I need to find accounts
that do not contain a contact person called 'Accounting'. Each account may
contain multiple contacts.

Here's my query to find accounts that DO contain an 'Accounting' contact:

SELECT company.code, company.name, company.type, contacts.fullname
FROM company
INNER JOIN contacts ON company.code = contacts.code
WHERE company.type IN ('C', 'R')
  AND contacts.fullname = 'Accounting'
ORDER BY company.code


How do I find accounts that DO NOT contain an 'Accounting' contact?

Thanks

Re: Not Exists joining 2 tables

on 23.10.2007 23:31:21 by Erland Sommarskog

Artie (artie2269@yahoo.com) writes:
> I was hoping somebody could assist me with this. I need to find
> accounts that do not contain a contact person called 'Accounting'. Each
> account may contain multiple contacts.
>
> Here's my query to find accounts that DO contain an 'Accounting' contact:
>
> SELECT company.code, company.name, company.type,
> contacts.fullname
> FROM company INNER JOIN
> contacts ON company.code = contacts.code
> WHERE (company.type in ('C', 'R')) and (contacts.fullname =
> 'Accounting')
> order by company.code
>
>
> How do I find accounts that DO NOT contain an 'Accounting' contact?

SELECT cm.code, cm.name, cm.type, ct.fullname
FROM company cm
JOIN contacts ct ON cm.code = ct.code
WHERE cm.type IN ('C', 'R')
  AND NOT EXISTS (SELECT *
                  FROM contacts ct2
                  WHERE cm.code = ct2.code
                    AND ct2.fullname = 'Accounting')
ORDER BY cm.code
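
To see what the NOT EXISTS query returns, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The table layout and sample rows are assumptions made up for illustration (the real DDL was never posted), and SQLite stands in for SQL Server, so treat it as a demonstration of the logic only.

```python
import sqlite3

# Hypothetical schema and rows; the thread never showed the real DDL.
SETUP = """
CREATE TABLE company  (code TEXT PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE contacts (code TEXT, fullname TEXT);
INSERT INTO company VALUES
    ('A1', 'Acme', 'C'), ('B2', 'Bolt', 'R'),
    ('C3', 'Cord', 'C'), ('D4', 'Dune', 'X');
INSERT INTO contacts VALUES
    ('A1', 'Accounting'), ('A1', 'Jane Doe'),
    ('B2', 'John Roe'),   ('D4', 'Sam Poe');
"""

# The query from the post, with the parentheses balanced.
QUERY = """
SELECT cm.code, cm.name, cm.type, ct.fullname
FROM company cm
JOIN contacts ct ON cm.code = ct.code
WHERE cm.type IN ('C', 'R')
  AND NOT EXISTS (SELECT *
                  FROM contacts ct2
                  WHERE cm.code = ct2.code
                    AND ct2.fullname = 'Accounting')
ORDER BY cm.code
"""


def run_not_exists():
    """Build the sample database and return the rows the query selects."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_not_exists():
        print(row)
```

In this sample data, company C3 has no contacts at all, so the inner join to contacts drops it even though it has no 'Accounting' contact. If such companies should be listed too, one option is to remove the join (and ct.fullname from the select list) and keep only the NOT EXISTS test.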



--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx

Re: Not Exists joining 2 tables

on 24.10.2007 02:17:55 by Artie

Thanks Erland.



Re: Not Exists joining 2 tables

on 24.10.2007 02:32:31 by Joe Celko

Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. If you know how, follow ISO-11179 data element naming
conventions and formatting rules. Sample data is also a good idea,
along with clear specifications. It is very hard to debug code when
you do not let us see it.

>> I need to find accounts that do not contain a contact person called 'Accounting'. Each account may contain multiple contacts.<<

What is the key of the Companies table? (Your singular name means that
you have one or fewer rows in that table, but I assume this is part of
the other violations of ISO-11179, with uselessly vague data element
names like "name" (of my dog?), "code" (ZIP code?) and "type" (blood
type?).)

It sorta looks like code is the key, but that makes no sense. BY
DEFINITION a code of any kind cannot be a key; this is fundamental.

You need a DUNS number or other industry standard company identifier.
Your spec asked only for the companies without a contact =
'Accounting'; but we have no idea if the Contacts table (properly
named!) has a reference to the companies.

Here is a weird way to do this, based on guessing at your DDL:

SELECT CT.company_name
FROM Contacts AS CT
GROUP BY CT.company_name
HAVING SUM(CASE WHEN CT.contact_name = 'Accounting'
                THEN 1 ELSE 0 END) = 0;

This is untested; if we had DDL, we could try it!! I assumed that
Contacts ought to be a relationship between a company and a lawful
person or role within the company.

Re: Not Exists joining 2 tables

on 24.10.2007 03:16:20 by Artie

I did not post the full DDL because there are so many fields that are not
relevant, and I felt it would complicate things. It would open up a can of
worms as to why it is designed the way it is (I did not design it).

Erland's response did the trick.

Do you guys (or gals) have a preferred method of generating insert
statements that pull data from the table? I have used sp_generate_inserts
from http://vyaskn.tripod.com but have run into some cases where the 8000
char limit was not enough.

I really appreciate the help all of you provide on these boards to newbs
like me.



Re: Not Exists joining 2 tables

on 24.10.2007 23:19:41 by Erland Sommarskog

Artie (artie2269@yahoo.com) writes:
> I did not post the full DDL because there are so many fields that are not
> relevant and felt it would complicate things. It would open up a can of
> worms as to why it is designed the way it is (I did not design it).

In many cases it's better to post a simplified schema that captures the
essence of the problem.

> Do you guys (or gals) have a preferred method of generating insert
> statements that pull data from the table? I have used sp_generate_inserts
> from http://vyaskn.tripod.com but have run into some cases where the 8000
> char limit was not enough.

I haven't looked into Vyas's code, but if you are on SQL 2005, it should
be easy to modify it to use varchar(MAX) rather than varchar(8000).


Re: Not Exists joining 2 tables

on 26.10.2007 05:48:06 by Ed Murphy

--CELKO-- wrote:

> Your spec asked only for the companies without a contact =
> 'Accounting'; but we have no idea if the Contacts table (properly
> named!) has a reference to the companies.

Yes, we do. "It sorta looks like code is the key", by your own
admission, and his "companies with a contact = 'Accounting'"
sample query backs this up:

FROM company INNER JOIN
contacts ON company.code = contacts.code

Granted, "'code' is a bad name for a key column" is a valid complaint.

> SELECT CT.company_name
> FROM Contacts AS CT
> GROUP BY CT.company_name
> HAVING SUM(CASE WHEN CT.contact_name = 'Accounting' THEN 1 ELSE 0
> END)= 0;
>
> This is untested; if we had DDL, we could try it!! I assumed that
> Contacts ought to be a relationship between a company and a lawful
> person or role within the company.

You mean a many-to-many linking table between Companies and Persons
(i.e. a person might be a contact for multiple companies)? Okay, but
I still don't see why you would get company_name from any table other
than Companies. Basic normalization.

If you do get company_name from Companies, then you might have a company
with no contacts at all. You could use LEFT JOIN and COALESCE(SUM()),
but Erland's NOT EXISTS is more natural.
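
The difference between grouping on contacts alone and the LEFT JOIN with COALESCE(SUM()) variant can be sketched as follows. This is only an illustration: it uses Python's sqlite3 module rather than SQL Server, the schema and rows are invented, and the GROUP BY query is adapted to the thread's code/fullname columns instead of the guessed company_name/contact_name.

```python
import sqlite3

# Hypothetical schema and rows; company C3 deliberately has no contacts.
SETUP = """
CREATE TABLE company  (code TEXT PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE contacts (code TEXT, fullname TEXT);
INSERT INTO company VALUES
    ('A1', 'Acme', 'C'), ('B2', 'Bolt', 'R'),
    ('C3', 'Cord', 'C'), ('D4', 'Dune', 'X');
INSERT INTO contacts VALUES
    ('A1', 'Accounting'), ('A1', 'Jane Doe'),
    ('B2', 'John Roe'),   ('D4', 'Sam Poe');
"""

# Conditional aggregation over contacts only: a company without any
# contact rows never forms a group, so it cannot appear in the result.
CONTACTS_ONLY = """
SELECT ct.code
FROM contacts ct
GROUP BY ct.code
HAVING SUM(CASE WHEN ct.fullname = 'Accounting' THEN 1 ELSE 0 END) = 0
ORDER BY ct.code
"""

# Driving the query from company with a LEFT JOIN keeps contact-less
# companies; COALESCE guards against a NULL sum.
LEFT_JOIN = """
SELECT cm.code
FROM company cm
LEFT JOIN contacts ct ON ct.code = cm.code
GROUP BY cm.code
HAVING COALESCE(SUM(CASE WHEN ct.fullname = 'Accounting'
                         THEN 1 ELSE 0 END), 0) = 0
ORDER BY cm.code
"""


def codes(sql):
    """Run one of the queries on fresh sample data and return the codes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


if __name__ == "__main__":
    print("contacts only:", codes(CONTACTS_ONLY))
    print("left join:    ", codes(LEFT_JOIN))
```

Neither query applies the type IN ('C', 'R') filter from the original question; that was left out to keep the comparison focused on the contact-less company.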

Re: Not Exists joining 2 tables

on 28.11.2007 01:53:22 by creedp.71


I know this thread is a month old, but I couldn't help but renew it. It
warms my heart to see a couple of people who know how to use a NOT
EXISTS clause with a correlated subquery properly, especially when so
many people don't understand the impact of the seemingly simple "NOT
IN()" version, which uses a nested sub-select instead and can leave a
much larger footprint on the DB.
However, unless I am mistaken, this is a classic case of exactly the
type of scenario that forced me to switch from the old-school Oracle
SQL+ syntax (pre v8) and start using the ANSI SQL syntax, which seems
far too "wordy" just to join a couple of tables together. It was with a
scenario just like this that I realized there are things you can only
do with the newer syntax. And (depending on the data, of course) it may
be a more efficient query than the correlated subquery.

The basic idea is to change the join to an outer join, and then move
the additional criteria into the JOIN clause just as if it were another
column you're joining on. Then the only criterion in the WHERE clause
is a test for NULL in the primary key column of the joined table,
which means there is no matching record for the join. This
gives you the ability to evaluate criteria on the records both before
and after the join is made. It is also a common method in ETL
when writing an "insert into select from" statement: a single DML
statement that both tests for the existence of a record in a table
and inserts only the batch of rows that are not there yet.

Consider this alternative and let me know if I screw this one up,
after all, it's getting late, and I'm still at work :) .....

SELECT company.code, company.name, company.type, contacts.fullname
FROM company
LEFT JOIN contacts ON company.code = contacts.code
    AND company.type IN ('C', 'R')
    AND contacts.fullname = 'Accounting'
WHERE contacts.code IS NULL
ORDER BY company.code
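
The "insert into select from" use of the same outer-join test, as described earlier in this post, might look roughly like the sketch below. The staging and target tables are hypothetical names invented for the example, and it runs on SQLite through Python's sqlite3 module rather than on SQL Server or Oracle.

```python
import sqlite3


def load_new_rows():
    """Insert only the staging rows whose id is not yet in target."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical tables: 'target' already holds id 1, 'staging' brings
    # one duplicate (id 1) and one genuinely new row (id 2).
    conn.executescript("""
        CREATE TABLE target  (id INTEGER PRIMARY KEY, val TEXT);
        CREATE TABLE staging (id INTEGER, val TEXT);
        INSERT INTO target  VALUES (1, 'a');
        INSERT INTO staging VALUES (1, 'a-dup'), (2, 'b');
    """)
    # Anti-join: the LEFT JOIN leaves t.id NULL exactly for staging rows
    # that have no counterpart in target, and only those are inserted.
    conn.execute("""
        INSERT INTO target (id, val)
        SELECT s.id, s.val
        FROM staging s
        LEFT JOIN target t ON t.id = s.id
        WHERE t.id IS NULL
    """)
    rows = conn.execute("SELECT id, val FROM target ORDER BY id").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(load_new_rows())
```

The existing row for id 1 is left untouched, so the statement does not fail on the primary key and does not overwrite anything.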

Re: Not Exists joining 2 tables

on 28.11.2007 06:46:39 by Ed Murphy

creedp.71@gmail.com wrote:

> Consider this alternative and let me know if I screw this one up,
> after all, it's getting late, and I'm still at work :) .....
>
> SELECT company.code, company.name, company.type,
> contacts.fullname
> FROM company Left JOIN
> contacts ON company.code = contacts.code
> AND (company.type in ('C', 'R')) and (contacts.fullname =
> 'Accounting')
> WHERE contacts.code IS NULL
> order by company.code

I think this would work (except in the unlikely case that
contacts.code is nullable, in which case it might return
some data that it shouldn't). But NOT EXISTS has the strong
advantage of letting you say what you mean.
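
One detail of the quoted LEFT JOIN query may be worth checking before relying on it: a condition on the outer (left) table placed in the ON clause does not remove rows of that table, it only decides whether a match is found. The sketch below illustrates this with invented sample data on SQLite via Python's sqlite3 module; the schema is an assumption, and only company.code is selected to keep the output short.

```python
import sqlite3

# Hypothetical schema and rows; D4 has a type other than 'C' or 'R'.
SETUP = """
CREATE TABLE company  (code TEXT PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE contacts (code TEXT, fullname TEXT);
INSERT INTO company VALUES
    ('A1', 'Acme', 'C'), ('B2', 'Bolt', 'R'),
    ('C3', 'Cord', 'C'), ('D4', 'Dune', 'X');
INSERT INTO contacts VALUES
    ('A1', 'Accounting'), ('A1', 'Jane Doe'),
    ('B2', 'John Roe'),   ('D4', 'Sam Poe');
"""

# Type filter inside the ON clause, as in the quoted query.
TYPE_IN_ON = """
SELECT company.code
FROM company
LEFT JOIN contacts ON company.code = contacts.code
    AND company.type IN ('C', 'R')
    AND contacts.fullname = 'Accounting'
WHERE contacts.code IS NULL
ORDER BY company.code
"""

# Same anti-join, but the company filter moved to the WHERE clause.
TYPE_IN_WHERE = """
SELECT company.code
FROM company
LEFT JOIN contacts ON company.code = contacts.code
    AND contacts.fullname = 'Accounting'
WHERE contacts.code IS NULL
  AND company.type IN ('C', 'R')
ORDER BY company.code
"""


def codes(sql):
    """Run one of the queries on fresh sample data and return the codes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    result = [row[0] for row in conn.execute(sql)]
    conn.close()
    return result


if __name__ == "__main__":
    print("type filter in ON:   ", codes(TYPE_IN_ON))
    print("type filter in WHERE:", codes(TYPE_IN_WHERE))
```

With the filter in the ON clause, the type 'X' company finds no match and therefore passes the IS NULL test; with the filter in WHERE, it is excluded, which matches what the NOT EXISTS query returns for company codes.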

Re: Not Exists joining 2 tables

on 28.11.2007 12:40:09 by jhofmeyr

On Nov 28, 5:46 am, Ed Murphy wrote:
> creedp...@gmail.com wrote:
> > Consider this alternative and let me know if I screw this one up,
> > after all, it's getting late, and I'm still at work :) .....
>
> > SELECT company.code, company.name, company.type,
> > contacts.fullname
> > FROM company Left JOIN
> > contacts ON company.code = contacts.code
> > AND (company.type in ('C', 'R')) and (contacts.fullname =
> > 'Accounting')
> > WHERE contacts.code IS NULL
> > order by company.code
>
> I think this would work (except in the unlikely case that
> contacts.code is nullable, in which case it might return
> some data that it shouldn't). But NOT EXISTS has the strong
> advantage of letting you say what you mean.

The last time I tried this (on a fairly complex query as well) on SQL
2005, the execution plans for the NOT EXISTS and the LEFT OUTER JOIN
versions of the filter were identical. Could it be that the query
optimiser actually understands what we are trying to achieve and derives
the best method to do so regardless of syntax these days? More likely it
was simply a quirk of the query / tables / indexing, I guess...

J