SQL confusion
am 09.10.2004 22:39:45 von adward555
I'm trying to figure out how to do a particular query,
and I'm beating my head against a wall. Here's my
situation:
I'm running postgres 7.3.2 on linux, and making my
requests from Perl scripts using DBD::Pg. My table
structure is as follows (irrelevant cols removed)
CREATE TABLE name (
namecounter integer NOT NULL,
firstmiddle character varying(64) NOT NULL,
lastname character varying(64) NOT NULL,
birthdate date,
hh smallint,
famnu integer,
);
Each row represents a person with a unique
namecounter. Families share a famnu, and usually one
person in a family is marked as head of household
(hh>0), with everyone else hh=0. However, there are a
few families with nobody marked as hh, and I'd like to
elect one by age. The query I'm trying to do is to
pull one person from each household, either the head
of household if available, or the eldest if not. I
want them sorted by last name, so I'd prefer to find
them all in one query, no matter how ugly and nested
it has to be.
I can pull the list with hh>0 easily enough, but I'm
not sure how to pull out the others.
I realize that this could be done through some looping
in the Perl script, but I'd like to avoid pulling the
whole list into memory in case the list gets long. My
preference is to just handle one record at a time in
Perl if possible.
Help?
Andrew Ward
adward55@yahoo.com
__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html
Re: SQL confusion
am 09.10.2004 23:11:45 von RP
On Sat, Oct 09, 2004 at 01:39:45PM -0700, Andrew Ward wrote:
>
> I'm running postgres 7.3.2 on linux, and making my
> requests from Perl scripts using DBD::Pg. My table
> structure is as follows (irrelevant cols removed)
>
> CREATE TABLE name (
> namecounter integer NOT NULL,
> firstmiddle character varying(64) NOT NULL,
> lastname character varying(64) NOT NULL,
> birthdate date,
> hh smallint,
> famnu integer,
> );
This may not be the fastest query, but it should be quite comprehensible.
Lightly tested only on 7.4.5 (do you know why you're not using 7.4.x?).
SELECT namecounter -- and whatever other columns you need
FROM name outername -- alias so it's clear what the joins do
WHERE hh = 1 -- simple case: head of family
OR (
hh = 0 -- not a head
AND NOT EXISTS ( -- there is no head of this family
SELECT namecounter
FROM name in1
WHERE hh = 1 AND in1.famnu = outername.famnu)
AND birthdate = ( -- this person is as old as the oldest person
SELECT MIN(birthdate)
FROM name in2
WHERE in2.famnu = outername.famnu)
);
Richard
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
Re: SQL confusion
am 09.10.2004 23:59:51 von Mike
On Sat, Oct 09, 2004 at 01:39:45PM -0700, Andrew Ward wrote:
>
> CREATE TABLE name (
> namecounter integer NOT NULL,
> firstmiddle character varying(64) NOT NULL,
> lastname character varying(64) NOT NULL,
> birthdate date,
> hh smallint,
> famnu integer,
> );
>
> Each row represents a person with a unique
> namecounter. Families share a famnu, and usually one
> person in a family is marked as head of household
> (hh>0), with everyone else hh=0. However, there are a
> few families with nobody marked as hh, and I'd like to
> elect one by age. The query I'm trying to do is to
> pull one person from each household, either the head
> of household if available, or the eldest if not. I
> want them sorted by last name, so I'd prefer to find
> them all in one query, no matter how ugly and nested
> it has to be.
This should be close to what you need:
SELECT DISTINCT ON (lastname, famnu) *
FROM name
ORDER BY lastname, famnu, COALESCE(hh, 0) DESC, birthdate;
The order of the fields in the ORDER BY clause is important, so if
you want to sort the results by some other criteria then you might
need to do it with a subselect.
I used COALESCE on the hh field because a descending sort puts NULL
values ahead of non-NULL values -- without COALESCE a person with
hh=1 won't be recognized as the head of household if another member
of the family has hh=NULL.
You might need to modify the query to handle NULL birthdates or to
add other tiebreakers.
--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
Re: SQL confusion
am 12.10.2004 19:01:06 von tfo
This is untested, but it might be enough to get you started:
SELECT namecounter
FROM name n
WHERE NOT EXISTS (
SELECT 1
FROM name
WHERE hh > 0
AND famnu = n.famnu
)
GROUP BY famnu
HAVING birthdate = min( birthdate );
What I'm trying to do here is grab all families that don't have a head
of household, group them by family, and get only the namecounter
corresponding to the minimum birthdate for that family.
If I recall, I've had some trouble using HAVING with min/max in ways
that seem intuitive to me, but this might help get you started.
-tfo
On Oct 9, 2004, at 3:39 PM, Andrew Ward wrote:
> I'm trying to figure out how to do a particular query,
> and I'm beating my head against a wall. Here's my
> situation:
>
> I'm running postgres 7.3.2 on linux, and making my
> requests from Perl scripts using DBD::Pg. My table
> structure is as follows (irrelevant cols removed)
>
> CREATE TABLE name (
> namecounter integer NOT NULL,
> firstmiddle character varying(64) NOT NULL,
> lastname character varying(64) NOT NULL,
> birthdate date,
> hh smallint,
> famnu integer,
> );
>
> Each row represents a person with a unique
> namecounter. Families share a famnu, and usually one
> person in a family is marked as head of household
> (hh>0), with everyone else hh=0. However, there are a
> few families with nobody marked as hh, and I'd like to
> elect one by age. The query I'm trying to do is to
> pull one person from each household, either the head
> of household if available, or the eldest if not. I
> want them sorted by last name, so I'd prefer to find
> them all in one query, no matter how ugly and nested
> it has to be.
>
> I can pull the list with hh>0 easily enough, but I'm
> not sure how to pull out the others.
>
> I realize that this could be done through some looping
> in the Perl script, but I'd like to avoid pulling the
> whole list into memory in case the list gets long. My
> preference is to just handle one record at a time in
> Perl if possible.
>
> Help?
>
> Andrew Ward
> adward55@yahoo.com
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam? Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
http://archives.postgresql.org