ORDER BY and NULLs
am 19.09.2004 16:57:33 von mailreg
Hello,
I am using PostgreSQL 7.4.2 and as I understand NULL values always sort
last.
However, I have a table from which select using two numerical sort keys
"FROM" and "TO". "TO" might be NULL and I would like to display those
rows first (without sorting the column in descending order).
Is there any way this can be achieved without inserting bogus values
into that column?
--
Regards/Gruß,
Tarlika Elisabeth Schmitz
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
Re: ORDER BY and NULLs
am 19.09.2004 17:17:31 von Terry
Use the coalesce() function. (coalesce returns the first non-null value in its list)
Specifically
ORDER BY coalesce("TO", 0), "FROM"
If you have records in "TO" column whose values is LESS then 0, then you need to replace 0 with
something that sorts BEFORE the first most value that your TO result can return.
Terry Fielder
Manager Software Development and Deployment
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com
Fax: (416) 441-9085
> -----Original Message-----
> From: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of T E Schmitz
> Sent: Sunday, September 19, 2004 10:58 AM
> To: pgsql-sql@postgresql.org
> Subject: [SQL] ORDER BY and NULLs
>
>
> Hello,
>
> I am using PostgreSQL 7.4.2 and as I understand NULL values
> always sort
> last.
>
> However, I have a table from which select using two numerical
> sort keys
> "FROM" and "TO". "TO" might be NULL and I would like to display those
> rows first (without sorting the column in descending order).
>
> Is there any way this can be achieved without inserting bogus values
> into that column?
>
> --
>
>
> Regards/Gruß,
>
> Tarlika Elisabeth Schmitz
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html
Re: ORDER BY and NULLs
am 19.09.2004 18:19:55 von jllachan
select ... order by "FROM" is not null, "FROM";
If you have large amount of rows (with or without nulls) it is faster if
use a partial index.
create index ... on ...("FROM");
create index ... on ...("FROM") where "FROM" is null;
JLL
terry@ashtonwoodshomes.com wrote:
> Use the coalesce() function. (coalesce returns the first non-null value in its list)
>
> Specifically
>
> ORDER BY coalesce("TO", 0), "FROM"
>
> If you have records in "TO" column whose values is LESS then 0, then you need to replace 0 with
> something that sorts BEFORE the first most value that your TO result can return.
>
> Terry Fielder
> Manager Software Development and Deployment
> Great Gulf Homes / Ashton Woods Homes
> terry@greatgulfhomes.com
> Fax: (416) 441-9085
>
>
>
>>-----Original Message-----
>>From: pgsql-sql-owner@postgresql.org
>>[mailto:pgsql-sql-owner@postgresql.org]On Behalf Of T E Schmitz
>>Sent: Sunday, September 19, 2004 10:58 AM
>>To: pgsql-sql@postgresql.org
>>Subject: [SQL] ORDER BY and NULLs
>>
>>
>>Hello,
>>
>>I am using PostgreSQL 7.4.2 and as I understand NULL values
>>always sort
>>last.
>>
>>However, I have a table from which select using two numerical
>>sort keys
>>"FROM" and "TO". "TO" might be NULL and I would like to display those
>>rows first (without sorting the column in descending order).
>>
>>Is there any way this can be achieved without inserting bogus values
>>into that column?
>>
>>--
>>
>>
>>Regards/Gruß,
>>
>>Tarlika Elisabeth Schmitz
>>
>>---------------------------(end of
>>broadcast)---------------------------
>>TIP 4: Don't 'kill -9' the postmaster
>>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
Re: ORDER BY and NULLs
am 19.09.2004 18:36:36 von mailreg
Hello Jean-Luc,
You must've been reading my mind. I was just wondering what to do about
indexing on that particular table. I read somewhere that an Index is not
going to improve the performance of an ORDER BY if the sort column
contains NULLs because NULLs aren't indexed?
For the sake of the example I had simplified matters a wee bit. What I
really have is:
SELECT * FROM PRODUCT ORDER BY NAME, FROM, TO, FROM2, TO2
FROM, TO, FROM2, TO2 might be NULL. If FROM is NULL, TO will be NULL. If
FROM2 is NULL, TO2 will be NULL.
How would you index this table?
Kind regards,
Tarlika
Jean-Luc Lachance wrote:
> select ... order by "FROM" is not null, "FROM";
>
> If you have large amount of rows (with or without nulls) it is faster if
> use a partial index.
>
> create index ... on ...("FROM");
> create index ... on ...("FROM") where "FROM" is null;
>
>
> JLL
>
>
> terry@ashtonwoodshomes.com wrote:
>
>> Use the coalesce() function. (coalesce returns the first non-null
>> value in its list)
>>
>> Specifically
>>
>> ORDER BY coalesce("TO", 0), "FROM"
>>
>> If you have records in "TO" column whose values is LESS then 0, then
>> you need to replace 0 with
>> something that sorts BEFORE the first most value that your TO result
>> can return.
>>
>> Terry Fielder
>> Manager Software Development and Deployment
>> Great Gulf Homes / Ashton Woods Homes
>> terry@greatgulfhomes.com
>> Fax: (416) 441-9085
>>
>>
>>
>>> -----Original Message-----
>>> From: pgsql-sql-owner@postgresql.org
>>> [mailto:pgsql-sql-owner@postgresql.org]On Behalf Of T E Schmitz
>>> Sent: Sunday, September 19, 2004 10:58 AM
>>> To: pgsql-sql@postgresql.org
>>> Subject: [SQL] ORDER BY and NULLs
>>>
>>>
>>> Hello,
>>>
>>> I am using PostgreSQL 7.4.2 and as I understand NULL values
>>> always sort
>>> last.
>>>
>>> However, I have a table from which select using two numerical
>>> sort keys
>>> "FROM" and "TO". "TO" might be NULL and I would like to display those
>>> rows first (without sorting the column in descending order).
>>>
>>> Is there any way this can be achieved without inserting bogus values
>>> into that column?
>>>
>>> --
>>>
>>>
>>> Regards/Gruß,
>>>
>>> Tarlika Elisabeth Schmitz
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
Re: ORDER BY and NULLs
am 19.09.2004 18:58:57 von tgl
T E Schmitz writes:
> You must've been reading my mind. I was just wondering what to do about
> indexing on that particular table. I read somewhere that an Index is not
> going to improve the performance of an ORDER BY if the sort column
> contains NULLs because NULLs aren't indexed?
Whatever you were reading had it pretty badly garbled :-(
Btree indexes *do* store nulls, so the presence of nulls doesn't affect
whether they are usable for meeting an ORDER BY spec. However the index
sort order does have to exactly match the ORDER BY list, and even then
it's not necessarily the case that the index is useful. The brutal fact
is that seqscan-and-sort is generally faster than a full-table indexscan
for large tables anyway, unless the table is clustered or otherwise
roughly in order by the index.
If you are going to use an ORDER BY that involves COALESCE or NOT NULL
expressions, then the only way that it could be met with an index is to
create an expressional index on exactly that list of expressions. For
instance
regression=# create table foo (f int, t int);
CREATE TABLE
regression=# explain select * from foo order by f, coalesce(t, -1);
QUERY PLAN
------------------------------------------------------------ -
Sort (cost=69.83..72.33 rows=1000 width=8)
Sort Key: f, COALESCE(t, -1)
-> Seq Scan on foo (cost=0.00..20.00 rows=1000 width=8)
(3 rows)
regression=# create index fooi on foo (f, (coalesce(t, -1)));
CREATE INDEX
regression=# explain select * from foo order by f, coalesce(t, -1);
QUERY PLAN
------------------------------------------------------------ --------
Index Scan using fooi on foo (cost=0.00..52.00 rows=1000 width=8)
(1 row)
regression=#
I'm a bit dubious that such an index would be worth its update costs,
given that it's likely to be no more than a marginal win for the query.
But try it and see.
> Jean-Luc Lachance wrote:
>> If you have large amount of rows (with or without nulls) it is faster if
>> use a partial index.
This advice seems entirely irrelevant to the problem of sorting the
whole table...
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
http://archives.postgresql.org
Re: ORDER BY and NULLs
am 19.09.2004 19:18:37 von mailreg
Hello Tom,
Tom Lane wrote:
> T E Schmitz writes:
>
>>I read somewhere that an Index is not
>>going to improve the performance of an ORDER BY if the sort column
>>contains NULLs because NULLs aren't indexed?
>
> Whatever you were reading had it pretty badly garbled :-(
I just dug out the PostgreSQL book again because I thought I might've
garbled it:
Quote: "PostgreSQL will not index NULL values. Because an index will
never include NULL values, it cannot be used to satisfy the ORDER BY
clause of a query that returns all rows in a table."
> Btree indexes *do* store nulls, so the presence of nulls doesn't affect
Thank you for your explanations. At the moment the table has only 1300
entries and any query is responsive. I'm just planning ahead...
--
Regards/Gruß,
Tarlika Elisabeth Schmitz
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
Re: ORDER BY and NULLs
am 19.09.2004 20:37:11 von gsstark
T E Schmitz writes:
> I just dug out the PostgreSQL book again because I thought I might've garbled
> it:
>
> Quote: "PostgreSQL will not index NULL values. Because an index will never
> include NULL values, it cannot be used to satisfy the ORDER BY clause of a
> query that returns all rows in a table."
You should just cross out that whole section. It's just flatly wrong.
I had always assumed it was just people bringing assumptions over from Oracle
where it is true. Perhaps this book is to blame for some of the confusion.
Which book is it?
Postgres indexes NULLs. It can use them for ORDER BY clauses.
Where it cannot use them is to satisfy "WHERE foo IS NULL" or "WHERE foo IS
NOT NULL" constraints though. That's an implementation detail, but it can be
worked around with partial indexes.
--
greg
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings
Re: ORDER BY and NULLs
am 19.09.2004 21:00:08 von tgl
T E Schmitz writes:
> Tom Lane wrote:
>> Whatever you were reading had it pretty badly garbled :-(
> I just dug out the PostgreSQL book again because I thought I might've
> garbled it:
> Quote: "PostgreSQL will not index NULL values. Because an index will
> never include NULL values, it cannot be used to satisfy the ORDER BY
> clause of a query that returns all rows in a table."
[ shrug ] It's wrong on both counts, and has been since (checks CVS) 1997.
What book is that anyway?
There is a related statement that is still true: "WHERE x IS NULL"
(or NOT NULL) clauses are not indexscannable. This is a shortcoming of
the planner-to-index-access-method interface, though, not a question of
whether the index can store NULLs.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match
Re: ORDER BY and NULLs
am 19.09.2004 22:22:23 von mailreg
Hello Greg,
Greg Stark wrote:
> T E Schmitz writes:
>
>>Quote: "PostgreSQL will not index NULL values. Because an index will never
>>include NULL values, it cannot be used to satisfy the ORDER BY clause of a
>>query that returns all rows in a table."
>
>
> You should just cross out that whole section. It's just flatly wrong.
> I had always assumed it was just people bringing assumptions over from Oracle
> where it is true. Perhaps this book is to blame for some of the confusion.
> Which book is it?
PostgreSQL by Korry Douglas + Susan Douglas, ISBN 0-7357-1257-3; Feb 2003
> Postgres indexes NULLs. It can use them for ORDER BY clauses.
>
> Where it cannot use them is to satisfy "WHERE foo IS NULL" or "WHERE foo IS
> NOT NULL" constraints though. That's an implementation detail, but it can be
> worked around with partial indexes.
The paragraph continues:
"If the SELECT command included the clause WHERE phone NOT NULL,
PostgreSQL could use the index to satisfy the ORDER BY clause.
An index that covers optional (NOT NULL) columns will not be used to
speed table joins either."
--
Regards/Gruß,
Tarlika Elisabeth Schmitz
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
Re: ORDER BY and NULLs
am 19.09.2004 22:30:29 von tgl
T E Schmitz writes:
> Greg Stark wrote:
>> Which book is it?
> PostgreSQL by Korry Douglas + Susan Douglas, ISBN 0-7357-1257-3; Feb 2003
Hmm, I've heard of that book but never seen it. The authors are not
participants in the PG community --- AFAICT neither of them have ever
posted anything in the mailing lists.
> The paragraph continues:
> "If the SELECT command included the clause WHERE phone NOT NULL,
> PostgreSQL could use the index to satisfy the ORDER BY clause.
> An index that covers optional (NOT NULL) columns will not be used to
> speed table joins either."
My goodness, it seems to be a veritable fount of misinformation :-(
I wonder how much of this is stuff that is true for Oracle and they just
assumed it carried over?
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?
http://archives.postgresql.org
Re: ORDER BY and NULLs
am 19.09.2004 22:50:00 von mailreg
Hello Tom,
Tom Lane wrote:
> T E Schmitz writes:
>
>>Greg Stark wrote:
>>
>>>Which book is it?
>
>
>>PostgreSQL by Korry Douglas + Susan Douglas, ISBN 0-7357-1257-3; Feb 2003
>
>
> Hmm, I've heard of that book but never seen it. The authors are not
> participants in the PG community --- AFAICT neither of them have ever
> posted anything in the mailing lists.
>
>
>>The paragraph continues:
>>"If the SELECT command included the clause WHERE phone NOT NULL,
>>PostgreSQL could use the index to satisfy the ORDER BY clause.
>>An index that covers optional (NOT NULL) columns will not be used to
>>speed table joins either."
>
>
> My goodness, it seems to be a veritable fount of misinformation :-(
Well, that's great. My knowledge of SQL is good enough to model a DB and
do run of the mill queries; but when it comes to some fine details I
rely on sensible input ;-)
Thanks for chipping in here and answering what I thought was a dummy
question.
--
Kind Regards/Gruß,
Tarlika
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
Re: ORDER BY and NULLs
am 20.09.2004 02:51:35 von gsstark
Tom Lane writes:
> > The paragraph continues:
> > "If the SELECT command included the clause WHERE phone NOT NULL,
> > PostgreSQL could use the index to satisfy the ORDER BY clause.
> > An index that covers optional (NOT NULL) columns will not be used to
> > speed table joins either."
>
> My goodness, it seems to be a veritable fount of misinformation :-(
>
> I wonder how much of this is stuff that is true for Oracle and they just
> assumed it carried over?
The first part is true for Oracle. You have to add the WHERE phone NOT NULL to
convince Oracle it can use an index. Or just make the column NOT NULL to begin
with I think.
However as far as I recall the second part is not true. Oracle is smart enough
to realize that an equijoin clause implies NOT NULL and therefore allows it to
use the index.
(This may have all changed in Oracle 9+. The last I saw of Oracle was 8i)
I wonder if they just tried explain on a bunch of queries and noticed that
postgres wasn't using an index for SELECT * FROM foo ORDER BY bar and came up
with explanations for the patterns they saw?
--
greg
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings
Re: ORDER BY and NULLs
am 20.09.2004 10:04:26 von mailreg
Hello,
Greg Stark wrote:
> Tom Lane writes:
>
>
>>>The paragraph continues:
>>>"If the SELECT command included the clause WHERE phone NOT NULL,
>>>PostgreSQL could use the index to satisfy the ORDER BY clause.
>>>An index that covers optional (NOT NULL) columns will not be used to
>>>speed table joins either."
>>
>>My goodness, it seems to be a veritable fount of misinformation :-(
>>
> I wonder if they just tried explain on a bunch of queries and noticed that
> postgres wasn't using an index for SELECT * FROM foo ORDER BY bar and came up
> with explanations for the patterns they saw?
This paragraph was in the chapter about PostgreSQL indexing and the
paragraph itself was entitled "Indexes and Null values".
--
Regards/Gruß,
Tarlika Elisabeth Schmitz
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
Re: ORDER BY and NULLs
am 28.10.2004 18:52:31 von Murphy Pope
>> I just dug out the PostgreSQL book again because I thought I might've
>> garbled it:
>>
>> Quote: "PostgreSQL will not index NULL values. Because an index will
>> never include NULL values, it cannot be used to satisfy the ORDER BY
>> clause of a query that returns all rows in a table."
>
> You should just cross out that whole section. It's just flatly wrong.
>
> I had always assumed it was just people bringing assumptions over from
> Oracle where it is true. Perhaps this book is to blame for some of the
> confusion. Which book is it?
>
> Postgres indexes NULLs. It can use them for ORDER BY clauses.
Now I'm confused... here's a quote from Bruce Momjian from Oct. 2003:
> To be specific, we do not do index NULL values in a column, but we
> easily index non-null values in the column.
And a comment from backend/access/gist/gist.c (appears a few times):
> GiST cannot index tuples with leading NULLs
So what's the story? Do GiST indexes index NULLs? Do other index types
index NULLs? Is the comment wrong or am I misreading it?
Re: ORDER BY and NULLs
am 28.10.2004 20:36:36 von Murphy Pope
>> You should just cross out that whole section. It's just flatly wrong.
>>
>> I had always assumed it was just people bringing assumptions over from
>> Oracle where it is true. Perhaps this book is to blame for some of the
>> confusion. Which book is it?
>>
>> Postgres indexes NULLs. It can use them for ORDER BY clauses.
>
> Now I'm confused...
I think I found the definitive answer and it looks like everyone (Bruce,
Tom, the book) is half-right. Maybe this should go in a FAQ or something
since there seems to be so much confusion.
From section 41.3 of the documentation - this section describes the pg_am
table:
> An index access method that supports multiple columns
> (has amcanmulticol true) must support indexing null
> values in columns after the first, because the planner
> will assume the index can be used for queries on just
> the first column(s). For example, consider an index
> on (a,b) and a query with WHERE a = 4. The system will
> assume the index can be used to scan for rows
> with a = 4, which is wrong if the index omits rows
> where b is null. It is, however, OK to omit rows
> where the first indexed column is null. (GiST
> currently does so.) amindexnulls should be set true
> only if the index access method indexes all rows,
> including arbitrary combinations of null values.
Here's what I get when I look at pg_am:
select amname, amcanmulticol, amindexnulls from pg_am;
amname | amcanmulticol | amindexnulls
--------+---------------+--------------
rtree | f | f
btree | t | t
hash | f | f
gist | t | f
So it looks like btree indexes will index completely-NULL values, but the
other types won't index a row where all of the index columns are NULL.
Am I reading that right?
It sounds like the explanation quoted from the book is correct for all types
except for btree?
Re: ORDER BY and NULLs
am 31.10.2004 02:00:25 von Murphy Pope
> You should just cross out that whole section. It's just flatly wrong.
>
> I had always assumed it was just people bringing assumptions over from
> Oracle where it is true. Perhaps this book is to blame for some of the
> confusion. Which book is it?
>
> Postgres indexes NULLs. It can use them for ORDER BY clauses.
I know this is an old-ish topic, but the question keeps coming up and I see
different answers every time.
I think I found the definitive answer and it looks like everyone (Bruce,
Tom, the book) is half-right.  Maybe this should go in a FAQ or some thing
since there seems to be so much confusion.
From section 41.3 of the documentation - this section describes the pg_am
table:
> An index access method that supports multiple columns
> (has amcanmulticol true) must  support indexing nullÂ
> values in columns after the first, because the planner
> will assume the index can be used for queries on just
> the first column(s). For example, consider an index
> on (a,b) and a query with WHERE a = 4. The system will
> assume the index can be used to scan for rows
> with a = 4, which is wrong if the index omits rows
> where b is null. It is, however, OK to omit rows
> where the first indexed column is null. (GiST
> currently does so.) amindexnulls should be set true
> only if the index access method indexes all rows,
> including arbitrary combinations of null values.
Here's what I get when I look at pg_am:
select amname, amcanmulticol, amindexnulls from pg_am;
 amname | amcanmulticol | amindexnulls
--------+---------------+--------------
 rtree  | f             | f
 btree  | t             | t
 hash   | f             | f
 gist   | t             | f
So it looks like btree indexes will index completely-NULL values, but the
other types won't index a row where all of the index columns are NULL.
Am I reading that right?
It sounds like the explanation quoted from the book is correct for all types
except for btree?