COUNT(*) to find records which have a certain number of dependencies?

am 20.09.2004 17:14:46 von mailreg

Hello,

I apologize in advance for this garbled message, but I've been banging my
head against a brick wall for a while and I just can't figure out how to do
the following:

I have 3 tables, BRAND, MODEL, and TYPE, which are related to each other:

BRAND
=====
BRAND_PK
BRAND_NAME

MODEL
=====
MODEL_PK
MODEL_NAME
BRAND_FK (NOT NULL, references BRAND_PK)

TYPE
====
TYPE_PK
TYPE_NAME
MODEL_FK (NOT NULL, references MODEL_PK)

------------------------------------

I want to select only those BRAND/MODEL combinations, where the MODEL
has more than one TYPE, but only where one of those has TYPE_NAME='xyz'.
I am not interested in MODELs with multiple TYPEs where none of them are
called 'xyz'.
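
In case it helps anyone trying the queries, here is a minimal test setup
(the column types and sample values are my own invention; the real tables
are bigger):

CREATE TABLE brand (
    brand_pk   integer PRIMARY KEY,
    brand_name text NOT NULL
);

CREATE TABLE model (
    model_pk   integer PRIMARY KEY,
    model_name text NOT NULL,
    brand_fk   integer NOT NULL REFERENCES brand (brand_pk)
);

CREATE TABLE type (
    type_pk    integer PRIMARY KEY,
    type_name  text NOT NULL,
    model_fk   integer NOT NULL REFERENCES model (model_pk)
);

-- invented sample data: model M1 should be selected, M2 should not
INSERT INTO brand VALUES (1, 'Acme');
INSERT INTO model VALUES (1, 'M1', 1);
INSERT INTO model VALUES (2, 'M2', 1);
INSERT INTO type  VALUES (1, 'xyz', 1);
INSERT INTO type  VALUES (2, 'abc', 1);
INSERT INTO type  VALUES (3, 'abc', 2);
INSERT INTO type  VALUES (4, 'def', 2);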


--


Regards/Gruß,

Tarlika Elisabeth Schmitz


Re: COUNT(*) to find records which have a certain number of

am 20.09.2004 18:19:11 von mailreg

I figured it out eventually. (The only thing I don't know is where to put
the ORDER BY.)

> I want to select only those BRAND/MODEL combinations, where the MODEL
> has more than one TYPE, but only where one of those has TYPE_NAME='xyz'.
> I am not interested in MODELs with multiple TYPEs where none of them are
> called 'xyz'.

SELECT
BRAND_NAME,MODEL_NAME
FROM TYPE
left outer join MODEL on MODEL_PK =TYPE.MODEL_FK
left outer join BRAND on BRAND_PK =MODEL.BRAND_FK
group by BRAND.BRAND_NAME,MODEL_NAME
having count(*)>1

intersect

SELECT
BRAND_NAME,MODEL.MODEL_NAME
FROM TYPE
left outer join MODEL on MODEL_PK =TYPE.MODEL_FK
left outer join BRAND on BRAND_PK =MODEL.BRAND_FK
where TYPE_NAME='xyz'
group by BRAND.BRAND_NAME,MODEL_NAME


--


Regards/Gruß,

Tarlika Elisabeth Schmitz





Re: COUNT(*) to find records which have a certain number of

am 20.09.2004 18:31:36 von pg

On Mon, 2004-09-20 at 12:19, T E Schmitz wrote:
> I figured it out eventually. (The only thing I don't know is where to put
> the ORDER BY.)

Try this:

SELECT brand_name, model_name
FROM (SELECT ... INTERSECT SELECT ...) AS t
ORDER BY ...
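
Spelled out with the INTERSECT query from the previous message (untested,
just dropping the two SELECTs into the wrapper):

SELECT brand_name, model_name
FROM (
    SELECT brand_name, model_name
    FROM type
    LEFT OUTER JOIN model ON model_pk = type.model_fk
    LEFT OUTER JOIN brand ON brand_pk = model.brand_fk
    GROUP BY brand_name, model_name
    HAVING count(*) > 1

    INTERSECT

    SELECT brand_name, model_name
    FROM type
    LEFT OUTER JOIN model ON model_pk = type.model_fk
    LEFT OUTER JOIN brand ON brand_pk = model.brand_fk
    WHERE type_name = 'xyz'
    GROUP BY brand_name, model_name
) AS t
ORDER BY brand_name, model_name;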



Re: COUNT(*) to find records which have a certain number of

am 20.09.2004 19:59:39 von mailreg

Hi Rod,

Rod Taylor wrote:

> On Mon, 2004-09-20 at 12:19, T E Schmitz wrote:
>
>>I figured it out eventually. (The only thing I don't know is where to put
>>the ORDER BY.)
>
>
> Try this:
>
> SELECT brand_name, model_name
> FROM (SELECT ... INTERSECT SELECT ...) AS t
> ORDER BY ...

That does the trick. You're a genius ;-)

--


Regards/Gruß,

Tarlika


Re: COUNT(*) to find records which have a certain number of dependencies ?

am 20.09.2004 20:32:17 von gsstark

T E Schmitz writes:

> I want to select only those BRAND/MODEL combinations, where the MODEL has more
> than one TYPE, but only where one of those has TYPE_NAME='xyz'.
> I am not interested in MODELs with multiple TYPEs where none of them are called
> 'xyz'.


There are lots of approaches to this with various pros and cons.

The simplest one off the top of my head:

select *
from brand
join model on (brand_pk = brand_fk)
where exists (select 1 from type where model_fk = model_pk and type_name = 'xyz')
and (select count(*) from type where model_fk = model_pk) > 1


You could try to be clever about avoiding the redundant access to the type table:

select *
from brand
join model on (brand_pk = brand_fk)
where (select count(*)
from type
where model_fk = model_pk
having sum(case when type = 'xyz' then 1 else 0 end) >= 1
) > 1

I haven't tested that; it might need some tweaking. In any case I don't
think it's worth the added complexity, assuming you have indexes on type. I'm
not even sure it would run faster.

You could try to be really clever about it by turning the whole thing into a
join:

select *
from brand
join model on (brand_pk = brand_fk)
join (select model_fk
from type
group by model_fk
having sum(case when type = 'xyz' then 1 else 0 end) >= 1
and count(*) > 1
) on (model_fk = model_pk)

This would let the planner have more plans to choose from and might be a big
win if there are lots of brands and models but few that satisfy the criteria
you're looking for.
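
You could also skip the subquery entirely and fold everything into one
grouped join. Again untested, and note that it can only return the grouped
columns rather than *:

select brand_name, model_name
from brand
join model on (brand_pk = brand_fk)
join type on (model_pk = model_fk)
group by brand_name, model_name
having count(*) > 1
   and sum(case when type_name = 'xyz' then 1 else 0 end) >= 1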

--
greg



Re: COUNT(*) to find records which have a certain number of

am 20.09.2004 20:33:33 von gsstark

T E Schmitz writes:

> SELECT
> BRAND_NAME,MODEL_NAME
....
> intersect
....

Huh, I never think of the set operation solutions. I'm curious how it compares
speed-wise.

--
greg



JOIN performance

am 20.09.2004 21:24:49 von postgresql3

I have a view that, when used, is slow:

CREATE VIEW "Data".genlic_a4avail AS
SELECT
genlic_a4.*,
last_action_date,
end_date,
canceled
FROM genlic_a4
LEFT JOIN lic_hd USING( sys_id )
WHERE status != 'A';

Here is the EXPLAIN output:

 Merge Join  (cost=155360.47..159965.70 rows=13063 width=75)
   Merge Cond: ("outer".sys_id = "inner".sys_id)
   ->  Sort  (cost=3912.51..3916.48 rows=1589 width=62)
         Sort Key: "_GenLicGroupA4".sys_id
         ->  Nested Loop  (cost=0.00..3828.04 rows=1589 width=62)
               Join Filter: ("outer".callsign ~ ("inner".pattern)::text)
               ->  Seq Scan on "_GenLicGroupA4"  (cost=0.00..1667.40 rows=1589 width=21)
                     Filter: ((status <> 'R'::bpchar) AND (status <> 'A'::bpchar) AND (geo_region = 12))
               ->  Seq Scan on "_GeoRestrict"  (cost=0.00..1.16 rows=16 width=41)
   ->  Sort  (cost=262032.96..264249.96 rows=886799 width=72)
         Sort Key: lic_hd.sys_id
         ->  Subquery Scan lic_hd  (cost=0.00..24529.99 rows=886799 width=72)
               ->  Seq Scan on "_LicHD"  (cost=0.00..24529.99 rows=886799 width=72)

If I change the view to this:

CREATE VIEW "Data".genlic_a4avail AS
SELECT
genlic_a4.*,
(SELECT last_action_date FROM lic_hd WHERE sys_id = genlic_a4.sys_id LIMIT 1) AS last_action_date,
(SELECT end_date FROM lic_hd WHERE sys_id = genlic_a4.sys_id LIMIT 1) AS end_date,
(SELECT canceled FROM lic_hd WHERE sys_id = genlic_a4.sys_id LIMIT 1) AS canceled
FROM genlic_a4
WHERE status != 'A';

Then the performance is MUCH better:

 Subquery Scan genlic_a4avail  (cost=0.00..3828.04 rows=1589 width=62)
   ->  Nested Loop  (cost=0.00..3828.04 rows=1589 width=62)
         Join Filter: ("outer".callsign ~ ("inner".pattern)::text)
         ->  Seq Scan on "_GenLicGroupA4"  (cost=0.00..1667.40 rows=1589 width=21)
               Filter: ((status <> 'R'::bpchar) AND (status <> 'A'::bpchar) AND (geo_region = 12))
         ->  Seq Scan on "_GeoRestrict"  (cost=0.00..1.16 rows=16 width=41)
         SubPlan
           ->  Limit  (cost=0.00..3.01 rows=1 width=4)
                 ->  Index Scan using "_LicHD_pkey" on "_LicHD"  (cost=0.00..3.01 rows=1 width=4)
                       Index Cond: (unique_system_identifier = $0)
           ->  Limit  (cost=0.00..3.01 rows=1 width=8)
                 ->  Index Scan using "_LicHD_pkey" on "_LicHD"  (cost=0.00..3.01 rows=1 width=8)
                       Index Cond: (unique_system_identifier = $0)
           ->  Limit  (cost=0.00..3.01 rows=1 width=8)
                 ->  Index Scan using "_LicHD_pkey" on "_LicHD"  (cost=0.00..3.01 rows=1 width=8)
                       Index Cond: (unique_system_identifier = $0)

Note that genlic_a4 is small (4,519 rows), lic_hd is large (886,799 rows), and
lic_hd has sys_id as its PRIMARY KEY.

Is there a better way to write the LEFT JOIN so as to achieve the
performance of the second VIEW without the clumsiness of the three (SELECT
.... LIMIT 1) ???



Re: COUNT(*) to find records which have a certain number of

am 20.09.2004 21:50:34 von mailreg

Hello Greg,
You have given me plenty of food for thought. Thank you for taking the
time.
Currently, the tables have so few records (350, 900, 1000) that
performance does not come into it, particularly seeing as this was only
needed for a one-shot report.
However, I have stashed your examples away for future reference.

I was feeling a bit guilty about posting such a trivial question. I can
cobble together some straightforward SQL but I could really do with a
source of more complex SQL examples.
If you know of any links, that would be great and save the list from more
such questions ;-)

I am correcting a couple of typos below in case someone tries these
examples out.

Greg Stark wrote:

> select *
> from brand
> join model on (brand_pk = brand_fk)
> where (select count(*)
> from type
> where model_fk = model_pk
> having sum(case when type = 'xyz' then 1 else 0 end) >= 1
> ) > 1

.... having sum(case when type_name = 'xyz' ...

> select *
> from brand
> join model on (brand_pk = brand_fk)
> join (select model_fk
> from type
> group by model_fk
> having sum(case when type = 'xyz' then 1 else 0 end) >= 1
> and count(*) > 1
> ) on (model_fk = model_pk)
>

) as somealias on (model_fk = model_pk)

(subquery in FROM must have an alias)
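
For the record, here is the last example with both corrections applied
(I haven't compared it against my INTERSECT version):

select *
from brand
join model on (brand_pk = brand_fk)
join (select model_fk
      from type
      group by model_fk
      having sum(case when type_name = 'xyz' then 1 else 0 end) >= 1
         and count(*) > 1
     ) as somealias on (model_fk = model_pk)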

--


Regards/Gruß,

Tarlika


Re: COUNT(*) to find records which have a certain number of dependencies ?

am 20.09.2004 22:45:48 von gsstark

T E Schmitz writes:

> ) as somealias on (model_fk = model_pk)
>
> (subquery in FROM must have an alias)

ARGH! This is one of the most annoying things about postgres! It bites me all
the time. Obviously it's totally insignificant since it's easy for me to just
throw an "AS x" on the end of it. But damn.

I see there's a comment foreseeing some annoyance value for this in the
source:

/*
 * The SQL spec does not permit a subselect
 * (<derived table>) without an alias clause,
 * so we don't either. This avoids the problem
 * of needing to invent a unique refname for it.
 * That could be surmounted if there's sufficient
 * popular demand, but for now let's just implement
 * the spec and see if anyone complains.
 * However, it does seem like a good idea to emit
 * an error message that's better than "syntax error".
 */

So where can I officially register my complaint? :)

--
greg



Re: COUNT(*) to find records which have a certain number of

am 20.09.2004 22:55:38 von chester

On Mon, 20 Sep 2004, T E Schmitz wrote:
> I was feeling a bit guilty about posting such a trivial question. I can
> cobble together some straightforward SQL but I could really do with a
> source of more complex SQL examples.
> If you know of any links, that would be great and save the list from more
> such questions ;-)

SQL for Smarties has some more complicated examples and topics for
"advanced" queries. I can't say it's exhaustive, but I found it a good
bridge, hinting at what is really possible. I also found that a good way
to improve is to try to write every complicated query in all the
different ways I can think of, like:

- UNION (ALL)
- SUB-SELECT
- LEFT OUTER JOINS
- HAVING
etc.

Here is the link for SQL for Smarties:

http://www.amazon.com/exec/obidos/tg/detail/-/1558605762/002-2222957-7220055?v=glance

The bad thing about the book is that it is sort of SQL-agnostic, so
some of the examples would be sub-optimal on PostgreSQL, or may not
even work.

I would like to hear about other sources too.



Re: JOIN performance

am 21.09.2004 01:06:14 von tgl

"Dean Gibson (DB Administrator)" writes:
> I have a view that, when used, is slow:

It's obvious that you haven't given us anything remotely like full
information here. AFAICT the "tables" in your view are actually
other views, plus it looks like your call to the view is a query
that adds some other conditions you didn't mention. (Or are those
conditions coming from the sub-views? Hard to tell.) If you want
useful help you need to be more complete.

regards, tom lane


Re: JOIN performance

am 21.09.2004 02:10:58 von postgresql3

Tom Lane wrote on 2004-09-20 16:06:
>"Dean Gibson (DB Administrator)" writes:
> > I have a view that, when used, is slow:
>
>... If you want useful help you need to be more complete.

I use views to "hide" tables so that I can populate new tables and then
atomically switch to them with "CREATE OR REPLACE ...". Here is the same
data with the raw tables:

=> explain select * from "20040920_070010"."_GenLicGroupA4" AS x
   LEFT JOIN "20040919_070713"."_LicHD" AS y ON x.sys_id = y.unique_system_identifier;

                                          QUERY PLAN
-------------------------------------------------------------------------------------------------
 Merge Join  (cost=5235.14..35123.51 rows=43680 width=365)
   Merge Cond: ("outer".unique_system_identifier = "inner".sys_id)
   ->  Index Scan using "_LicHD_pkey" on "_LicHD" y  (cost=0.00..27361.79 rows=886799 width=344)
   ->  Sort  (cost=5235.14..5344.34 rows=43680 width=21)
         Sort Key: x.sys_id
         ->  Seq Scan on "_GenLicGroupA4" x  (cost=0.00..1339.80 rows=43680 width=21)

Using first-level views, as mentioned above, the results are the same:

=> explain select * from "Base"."GenLicGroupA4" AS x
   LEFT JOIN "Base"."LicHD" AS y ON x.sys_id = y.unique_system_identifier;

                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Merge Join  (cost=5235.14..35123.51 rows=43680 width=365)
   Merge Cond: ("outer".unique_system_identifier = "inner".sys_id)
   ->  Index Scan using "_LicHD_pkey" on "_LicHD"  (cost=0.00..27361.79 rows=886799 width=344)
   ->  Sort  (cost=5235.14..5344.34 rows=43680 width=21)
         Sort Key: "_GenLicGroupA4".sys_id
         ->  Seq Scan on "_GenLicGroupA4"  (cost=0.00..1339.80 rows=43680 width=21)

However, when I introduce the following second-level view for the second
table:
CREATE VIEW "Data".lic_hd AS
SELECT
    unique_system_identifier AS sys_id,
    callsign                 AS callsign,
    uls_file_number          AS uls_file_num,
    applicant_type_code      AS applicant_type,
    radio_service_code       AS radio_service,
    license_status           AS license_status,
    grant_date               AS grant_date,
    effective_date           AS effective_date,
    cancellation_date        AS cancel_date,
    expired_date             AS expire_date,
    last_action_date         AS last_action_date,
    CASE WHEN cancellation_date < expired_date
         THEN cancellation_date
         ELSE expired_date
    END AS end_date,
    cancellation_date < expired_date AS canceled
FROM "Base"."LicHD";

When I then change the query to use it, I get:

=> explain select * from "Base"."GenLicGroupA4" AS x
   LEFT JOIN lic_hd AS y ON x.sys_id = y.sys_id;

                                   QUERY PLAN
--------------------------------------------------------------------------------------
 Merge Join  (cost=280258.11..289399.92 rows=359154 width=98)
   Merge Cond: ("outer".sys_id = "inner".sys_id)
   ->  Sort  (cost=5235.14..5344.34 rows=43680 width=21)
         Sort Key: "_GenLicGroupA4".sys_id
         ->  Seq Scan on "_GenLicGroupA4"  (cost=0.00..1339.80 rows=43680 width=21)
   ->  Sort  (cost=262032.96..264249.96 rows=886799 width=72)
         Sort Key: y.sys_id
         ->  Subquery Scan y  (cost=0.00..24529.99 rows=886799 width=72)
               ->  Seq Scan on "_LicHD"  (cost=0.00..24529.99 rows=886799 width=72)

Note that the scan on _LicHD is now sequential. If I change the above view
to remove the last two columns, I get:

                                   QUERY PLAN
--------------------------------------------------------------------------------------
 Merge Join  (cost=5235.14..35123.51 rows=43680 width=93)
   Merge Cond: ("outer".unique_system_identifier = "inner".sys_id)
   ->  Index Scan using "_LicHD_pkey" on "_LicHD"  (cost=0.00..27361.79 rows=886799 width=72)
   ->  Sort  (cost=5235.14..5344.34 rows=43680 width=21)
         Sort Key: x.sys_id
         ->  Seq Scan on "_GenLicGroupA4" x  (cost=0.00..1339.80 rows=43680 width=21)

Which is back to my original (good) performance.

Question: why do the last two column definitions in the second VIEW change
the scan on _LicHD from indexed to sequential ??

-- Dean



Re: JOIN performance

am 21.09.2004 02:54:30 von tgl

"Dean Gibson (DB Administrator)" writes:
> Question: why do the last two column definitions in the second VIEW change
> the scan on _LicHD from indexed to sequential ??

It's the CASE that's getting you. The poor plan is basically because
the sub-view isn't getting "flattened" into the upper query, and so it's
not possible to choose a plan for it that's dependent on the upper query
context. And the reason it's not getting flattened is that subselects
that are on the nullable side of an outer join can't be flattened unless
they have nullable targetlists --- otherwise the results might not go to
NULL when they are supposed to. A CASE construct is always going to be
treated as non-nullable.

Fixing this properly is a research project, and I haven't thought of any
quick-and-dirty hacks that aren't too ugly to consider :-(

In the meantime, you could easily replace that CASE construct with a
min() function that's declared strict. I think date_smaller would
do nicely, assuming the columns are actually of type date.

regards, tom lane


Re: JOIN performance

am 21.09.2004 03:41:33 von postgresql3

Ahh, that explains why a "plain" JOIN (which doesn't do what I need) gave
much better performance than the LEFT JOIN.

I could ask why a CASE statement is always non-nullable, but I don't think
the answer would help me solve my problem. I suppose it's that even
though my particular CASE statement has WHEN/ELSE values that come from the
nullable side of the JOIN, in general that's not true ...

Okay, now for my big question: I searched high and low for a function that
would return the minimum of two dates, and found none. Now you come up
with "date_smaller", which works fine (as does "date_larger"), but where
are those documented? More importantly, where are other functions like
them documented?

-- Dean

ps: Who dreamt up the names "date_smaller" and "date_larger" ??? Much more
intuitive are "min_date" and "max_date".

pps: I realize that "date_smaller" isn't exactly equivalent to my CASE
statement; a NULL value for one of the CASE operands causes the result of
the ELSE clause to be returned, whereas "date_smaller" just returns NULL in
that case. In my data, that's significant. I suppose that COALESCE has
the same problem as CASE ...

Tom Lane wrote on 2004-09-20 17:54:
>"Dean Gibson (DB Administrator)" writes:
>Question: Why do the last two column definitions in the second VIEW
>change the scan on _LicHD from indexed to sequential ??
>
>It's the CASE that's getting you. The poor plan is basically because the
>sub-view isn't getting "flattened" into the upper query, and so it's not
>possible to choose a plan for it that's dependent on the upper query
>context. And the reason it's not getting flattened is that subselects
>that are on the nullable side of an outer join can't be flattened unless
>they have nullable targetlists --- otherwise the results might not go to
>NULL when they are supposed to. A CASE construct is always going to be
>treated as non-nullable.
>
>Fixing this properly is a research project, and I haven't thought of any
>quick-and-dirty hacks that aren't too ugly to consider :-(
>
>In the meantime, you could easily replace that CASE construct with a min()
>function that's declared strict. I think date_smaller would do nicely,
>assuming the columns are actually of type date.



Re: JOIN performance

am 21.09.2004 06:09:19 von chester

On Mon, 20 Sep 2004, Dean Gibson (DB Administrator) wrote:
> Okay, now for my big question: I searched high and low for a function that
> would return the minimum of two dates, and found none. Now you come up
> with "date_smaller", which works fine (as does "date_larger"), but where
> are those documented? More importantly, where are other functions like
> them documented?

You can make them yourself:

http://www.postgresql.org/docs/7.4/interactive/server-programming.html

"strict" means the function returns NULL whenever any of its inputs is NULL.




Re: JOIN performance

am 21.09.2004 06:10:13 von tgl

"Dean Gibson (DB Administrator)" writes:
> I could ask why a CASE statement is always non-nullable, but I don't think
> the answer would help me solve my problem. I suppose it's that even
> though my particular CASE statement has WHEN/ELSE values that come from the
> nullable side of the JOIN, in general that's not true ...

Right, the code just sees CASE and barfs. A finer grain of analysis
could conclude that this CASE is actually safe, but I'm unconvinced that
it's worth doing. (Feel free to pursue this on -hackers if you care.)

> Okay, now for my big question: I searched high and low for a function that
> would return the minimum of two dates, and found none. Now you come up
> with "date_smaller", which works fine (as does "date_larger"), but where
> are those documented?

They aren't; they are actually only intended as support functions for
MIN (resp. MAX) on dates. But they're there, and there's nothing to
stop you using 'em.

(You do of course realize that you could have implemented these
functions for yourself in a one-liner sql or plpgsql function. Finding
the function in the system saves you a few minutes at most.)
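
For instance, something like this ought to do (just a sketch, not tested;
pick whatever name you like):

-- one-liner equivalent of date_smaller; STRICT means it returns NULL
-- whenever either argument is NULL, which is the property the planner
-- needs here
CREATE FUNCTION min_date(date, date) RETURNS date AS
    'SELECT CASE WHEN $1 < $2 THEN $1 ELSE $2 END'
LANGUAGE sql STRICT IMMUTABLE;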

> ps: Who dreamt up the names "date_smaller" and "date_larger" ???

[ shrug... ] Some now-forgotten grad student at Berkeley. All the
support functions for MIN and MAX are named 'foosmaller' and
'foolarger'. Try

select aggtransfn from pg_aggregate, pg_proc
where pg_proc.oid = aggfnoid and proname = 'min';

> pps: I realize that "date_smaller" isn't exactly equivalent to my CASE
> statement; a NULL value for one of the CASE operands causes the result of
> the ELSE clause to be returned, whereas "date_smaller" just returns NULL in
> that case. In my data, that's significant.

Well, in that case you may have a problem here. The point of the test
in question is that the expression has to return NULL if *either* input
is null. Now it's true that the LEFT JOIN only cares that it goes to
NULL when *both* inputs are NULL, but we don't have any way to declare
that particular property of a function. To make the planner happy you
will have to declare the function as STRICT, which will force it to
behave in the first fashion.

> I suppose that COALESCE has the same problem as CASE ...

Yup, of course.

regards, tom lane


Re: JOIN performance

am 21.09.2004 06:41:40 von postgresql3

It turns out that even when I removed the CASE statement from the VIEW, the
performance problem remained. I had to remove the conditional as well from
the VIEW.

To refresh your memory, there was this line in the VIEW (said VIEW being
the subject of the LEFT JOIN):

cancel_date < expire_date AS canceled,

(cancel_date and expire_date are DATEs in the base view)

When I removed the above line from the VIEW, the performance problem
disappeared.

Do you know why? I'm running 7.3.4, by the way.

-- Dean



Re: JOIN performance

am 21.09.2004 07:19:33 von tgl

"Dean Gibson (DB Administrator)" writes:
> It turns out that even when I removed the CASE statement from the VIEW, the
> performance problem remained. I had to remove the conditional as well from
> the VIEW.

> cancel_date < expire_date AS canceled,

Yuck :-(

> Do you know why? I'm running 7.3.4, by the way.

7.3 is even stupider than 7.4, is why ...

In 7.3 only a view whose targetlist consists of simple column references
can be pulled up into the nullable side of an outer join.

regards, tom lane


Re: JOIN performance

am 21.09.2004 20:26:37 von gsstark

Tom Lane writes:

> Fixing this properly is a research project, and I haven't thought of any
> quick-and-dirty hacks that aren't too ugly to consider :-(

Just thinking out loud here. Instead of trying to peek inside the CASE
couldn't the optimizer just wrap the non-strict expression in a conditional
that tests whether the row was found?


--
greg



Re: COUNT(*) to find records which have a certain number of dependencies?

am 22.09.2004 00:16:24 von Mischa Sandberg

Greg Stark wrote:
> T E Schmitz writes:

>>) as somealias on (model_fk = model_pk)
>>(subquery in FROM must have an alias)
>
> ARGH! This is one of the most annoying things about postgres! It bites me all
> the time. Obviously it's totally insignificant since it's easy for me to just
> throw an "AS x" on the end of it. But damn.
>
> So where can I officially register my complaint? :)

Hope you don't mind an opinion from someone who looks at this from the
underside ...

Leaving out the alias just gives the parser a better chance of confusing you.

Making the tag optional is only "totally insignificant" if you want a bunch
of special validation cases for when there is only ONE anonymous pseudotable
and no ambiguity is possible.

If all it does is give you an annoying but understandable error message,
consider the cryptic error messages you get from systems that try to 'do
what you mean, not what you say' in such special cases ... and thereby turn
a typo into an error twenty lines further down the page.

BTW, the "as" is optional, but I always suggest that people use it
explicitly. Why? Because without it, you get another silly error message
or even a runtime error when what you did was omit a comma. For example

select salary name from Employee

returns one column (a dollar figure called "name").
(Yes, I know it's harder to cook up an example when the comma is missing
between tables in the FROM list; just wanted it to be obvious)

Okay, apologies for what may sound like a rant.
I've just been wrangling with an interpreter that tries WAY too hard to
make something executable out of what you tell it ... even if that's
really nothing like your intent.

Re: COUNT(*) to find records which have a certain number of

am 27.09.2004 11:45:01 von mailreg

Hello Martin,

Martin Knipper wrote:

> On 20.09.2004 18:19, T E Schmitz wrote:
>
>>>I want to select only those BRAND/MODEL combinations, where the MODEL
>>>has more than one TYPE, but only where one of those has TYPE_NAME='xyz'.
>>>I am not interested in MODELs with multiple TYPEs where none of them are
>>>called 'xyz'.
>>
>>
>
> Try this:
>
> select b.brand_name,m.model_name,count(*) from model m inner join
> brand b on m.brand_fk=b.brand_pk inner join type t on
> m.model_pk=t.model_fk where t.type_name='xyz' group by
> b.brand_name,m.model_name having count(m.model_name)>0;
>

Sorry, no, this doesn't work. It definitely doesn't return the required
result set. It just returns BRAND_NAME, MODEL_NAME for every MODEL that
has a TYPE with TYPE_NAME='xyz'.

--


Regards/Gruß,

Tarlika Elisabeth Schmitz


Re: COUNT(*) to find records which have a certain number of dependencies

am 27.09.2004 15:27:26 von gsstark

Mischa Sandberg writes:

> Hope you don't mind an opinion from someone who looks at this from the
> underside ...
>
> Leaving out the alias just gives the parser a better chance of confusing you.

Ok, I understand the basic idea that a parser that muddles along too long
before reporting an error makes it harder to track down the original error.
However, this isn't such a case. I can't think of any way I could accidentally
introduce an extra comma that would lead to a valid-looking alias and an error
much further along. You're talking about something like:

select * from a, as x

which would still produce a syntax error directly after the erroneous comma.
Which is right where you want the error to happen.

> Making the tag optional is only "totally insignificant" if you want a bunch
> of special validation cases for when there is only ONE anonymous pseudotable
> and no ambiguity is possible.

There's no additional ambiguity. There can't be, since all the user would do to
make the parser happy is go and specify aliases, not necessarily use them. If
the user didn't feel the need to put aliases in initially, presumably it was
because he wasn't using them and didn't need them.

Can you really say queries like this are dangerous:

select a_id,b_id
from (select a_id from a where xyz=?),
(select b_id from b where xyz=?)

or queries like

select a_id,b_id
from (select a_id from a where xyz=?) as x,
(select b_id from b where xyz=?) as x

are any clearer or less ambiguous?


> If all it does is give you an annoying but understandable error message

The reason it's annoying is because the database is saying, "I know perfectly
well what you're doing: I parsed the subquery fine, but I'm going to refuse to
run it because you didn't say the magic word." But when I add in "as x" after
every subquery even though it's utterly meaningless to the rest of the query,
then postgres is perfectly happy to cooperate.



> BTW, the "as" is optional, but I always suggest that people use it explicitly.
> Why? Because without it, you get another silly error message or even a runtime
> error when what you did was omit a comma. For example
>
> select salary name from Employee

I understand what you mean but I don't understand how always using the "as"
helps you here. You'll still get this error even if you were always using AS
when you intended. There's no option to make AS mandatory.

Incidentally, I also always use AS in both column and table aliases (when
specifying them) but for a different reason. I could never keep straight which
of Oracle and MSSQL required AS for columns and which required it for tables.
IIRC they're exactly reversed. However as best as I recall they both allowed
subqueries without any aliases at all.

--
greg



Is NULLIF nullable?

am 10.11.2004 04:29:49 von postgresql4

Recently I asked about why a field from the nullable side of an OUTER JOIN
was causing the JOIN to be inefficient, and was told that it was because
that field had a CASE statement as part of its definition, and that CASE
(and by extension COALESCE) were non-nullable constructs.

Is NULLIF nullable, in that sense?

Reason: I'd like to define a VIEW with a field definition thusly:

'P.O. Box' || NULLIF( po_box, ' '::char ) AS ...

And I would like the field thusly defined to be nullable.

-- Dean



Re: Is NULLIF nullable?

am 10.11.2004 06:12:43 von postgresql4

You can ignore my question below, since I just put the field definition
below in an SQL FUNCTION and marked it STRICT.
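
Roughly like this (the function name and the argument type are just
placeholders):

-- STRICT makes the result NULL whenever po_box comes up NULL from the
-- nullable side of the join, so the planner can still flatten the view.
CREATE FUNCTION po_box_label(char) RETURNS text AS
    'SELECT ''P.O. Box'' || NULLIF($1, '' ''::char)'
LANGUAGE sql STRICT;

The VIEW then uses po_box_label(po_box) AS ... instead of the inline
expression.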

-- Dean

Dean Gibson (DB Administrator) wrote on 2004-11-09 19:29:
>Recently I asked about why a field from the nullable side of an OUTER JOIN
>was causing the JOIN to be inefficient, and was told that it was because
>that field had a CASE statement as part of its definition, and that CASE
>(and by extension COALESCE) were non-nullable constructs.
>
>Is NULLIF nullable, in that sense?
>
>Reason: I'd like to define a VIEW with a field definition thusly:
>
>'P.O. Box' || NULLIF( po_box, ' '::char ) AS ...
>
>And I would like the field thusly defined to be nullable.
>
>-- Dean

