SELECT * vs SELECT col1, col2

am 17.09.2007 22:25:30 von Danny Burton

I know that in order to reduce network traffic you should always select only
the columns you require but I have a stored procedure which must **always**
return all columns from a specific table.

When returning all columns from a table is there any performance gain (or
any other reason) for using SELECT col1, col2 etc... instead of just SELECT
* ?

I can immediately think of a disadvantage which is that if you include named
columns in your SELECT then you must update the stored procedure each time a
column is added to the table, using SELECT * gets around this so the stored
procedure will never need to be modified.

Would anyone care to share their thoughts on the above?

Thanks,

Clive

Re: SELECT * vs SELECT col1, col2

am 17.09.2007 23:06:11 von Leythos

In article , no@email says...
> I know that in order to reduce network traffic you should always select only
> the columns you require but I have a stored procedure which must **always**
> return all columns from a specific table.
>
> When returning all columns from a table is there any performance gain (or
> any other reason) for using SELECT col1, col2 etc... instead of just SELECT
> * ?
>
> I can immediately think of a disadvantage which is that if you include named
> columns in your SELECT then you must update the stored procedure each time a
> column is added to the table, using SELECT * gets around this so the stored
> procedure will never need to be modified.
>
> Would anyone care to share their thoughts on the above?

Imagine if you returned columns that the application requesting the info
didn't know how to handle. Select * means you didn't know what you were
expecting on one side or the other or both....

--

Leythos
- Igitur qui desiderat pacem, praeparet bellum.
- Calling an illegal alien an "undocumented worker" is like calling a
drug dealer an "unlicensed pharmacist"
spam999free@rrohio.com (remove 999 for proper email address)

Re: SELECT * vs SELECT col1, col2

am 17.09.2007 23:11:18 von David Portas

"Danny Burton" wrote in message
news:Uv-dnaiBfeHjf3PbRVnyugA@giganews.com...
>I know that in order to reduce network traffic you should always select
>only the columns you require but I have a stored procedure which must
>**always** return all columns from a specific table.
>
> When returning all columns from a table is there any performance gain (or
> any other reason) for using SELECT col1, col2 etc... instead of just
> SELECT * ?
>
> I can immediately think of a disadvantage which is that if you include
> named columns in your SELECT then you must update the stored procedure
> each time a column is added to the table, using SELECT * gets around this
> so the stored procedure will never need to be modified.
>
> Would anyone care to share their thoughts on the above?
>
> Thanks,
>
> Clive
>
>
>
>
>

When doing maintenance or making changes it's often useful to do a search
over your code to find exactly where a column is being referenced. If you
use * rather than column names then you will be less likely to find
potential dependencies in your code. That can make debugging a much harder
task.

In some environments it can be difficult to maintain total control over the
logical column order returned by *. For example it's easy to re-create a
table in development (maybe even by accident) but it's probably totally
unacceptable to do that in production. So if you use SELECT * you may get
columns returned in some unexpected order, which could be a problem
depending on how your client process consumes that data. By the way, I don't
recommend that data be processed based on fixed column order anyway, but
based on your remark about not wanting to make code changes when you add new
columns I assume that must be what you are doing.

Some code may break or give the wrong results due to column order
dependencies or new or missing columns. For example UNION or CHECKSUM(*) or
INSERT without a column list. Regarding INSERT, how will you populate your
table if you don't know what columns exist in it?.

It is also a foundation of relational database design, that columns are
identified by name and not position. The compromise you are suggesting is
only possible at all because SQL violates that principle.

In summary, experience suggests it is much less trouble to maintain code
with column names than without them.

One other consideration. One day someone else will review or even inherit
your code. Using SELECT * in your code just guarantees an oppotunity for
someone to point the finger and tut-tut about best practices and coding
standards...

--
David Portas

Re: SELECT * vs SELECT col1, col2

am 17.09.2007 23:12:22 von Erland Sommarskog

Danny Burton (no@email) writes:
> I know that in order to reduce network traffic you should always select
> only the columns you require but I have a stored procedure which must
> **always** return all columns from a specific table.

Unless this is a temp table created within the procedure, I would
question the wise in this. A question that I often have reason to
ask about our database is "is this column actually used for something".
It is not uncommon that I find a couple of procedures that return this
column, but I also see that they return of columns that are of no
interest in that context, so it looks like "let's select all columns
while we're at it, in case we need them later".

A stored procedure should in my opinion only return the column that
the caller actually requests.

> When returning all columns from a table is there any performance gain
> (or any other reason) for using SELECT col1, col2 etc... instead of just
> SELECT * ?

No.

> I can immediately think of a disadvantage which is that if you include
> named columns in your SELECT then you must update the stored procedure
> each time a column is added to the table, using SELECT * gets around
> this so the stored procedure will never need to be modified.

As I said, I question the wise in this. And what if the colunms in the
tables are rearranged, renamed or a column is dropped? With SELECT *,
the procedure will continue to run, but the client may be utterly
confused. With listed columns, you would get an error if columns are
renamed or dropped. And if they are merely rearranged, the client
will not even notice.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downlo ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books .mspx

Re: SELECT * vs SELECT col1, col2

am 19.09.2007 21:03:26 von Hugo Kornelis

On Mon, 17 Sep 2007 21:25:30 +0100, Danny Burton wrote:

>When returning all columns from a table is there any performance gain (or
>any other reason) for using SELECT col1, col2 etc... instead of just SELECT
>* ?

Hi Clive,

I have to contradict Erland here - there will be a small performance
hit, that might (in very rare cases) grow to be a annoying enough to be
noticeable.

For SELECT *, the first step in parsing is to find out which columns are
in the table. That requires a read on the syscolumns system table in SQL
Server 2000, or it's undocumented equivalent in SQL Server 2005. That in
itself means that you'll have a small bit of overhead, due to requesting
a lock, getting it, reading the data, and releasing the lock again.

If this happens frequently in a system that also frequently adds or
removes columns to of from tables, you might see blocking on the system
table, since the Sch-S (schema stability) lock taken for reading the
columns can't co-exist with the Sch-M (schema modification) lock taken
for changing the table structure.

--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis

Re: SELECT * vs SELECT col1, col2

am 19.09.2007 22:21:53 von cphite

> I know that in order to reduce network traffic you should always select only
> the columns you require but I have a stored procedure which must **always**
> return all columns from a specific table.

Reducing traffic really isn't the main reason.

> When returning all columns from a table is there any performance gain (or
> any other reason) for using SELECT col1, col2 etc... instead of just SELECT
> * ?

Simply put, when you are writing production code you should KNOW for
certain exactly how and why everything works. You should know exactly
which columns are going to be returned; even if it happens to be all
of them.

> I can immediately think of a disadvantage which is that if you include named
> columns in your SELECT then you must update the stored procedure each time a
> column is added to the table, using SELECT * gets around this so the stored
> procedure will never need to be modified.

Yes, you will have to modify the procedure in that case. However,
while that does require that you spend time actually doing work, it is
still preferable to a situation where somebody changed a table
somewhere and your stored procedure keeps "working" despite returning
the wrong data. If all they did is add a column you might be okay -
what if they changed the order of some columns? What if they dropped
a column your application was expecting?

While there may not be a significant performance gain, there might
well be gains in terms of not getting phone calls in the middle of the
night because some application crashed, or because some job failed
when a table was changed earlier in the day.

> Would anyone care to share their thoughts on the above?

Using SELECT * in a procedure, view, or whatever is generally just
lazy and sloppy.

Re: SELECT * vs SELECT col1, col2

am 19.09.2007 23:10:46 von Erland Sommarskog

Hugo Kornelis (hugo@perFact.REMOVETHIS.info.INVALID) writes:
> For SELECT *, the first step in parsing is to find out which columns are
> in the table. That requires a read on the syscolumns system table in SQL
> Server 2000, or it's undocumented equivalent in SQL Server 2005. That in
> itself means that you'll have a small bit of overhead, due to requesting
> a lock, getting it, reading the data, and releasing the lock again.

But if you list all columns in the table, the optimizer still has to
read all rows in sys.columns for the table verify that all columns
exists, to find their data types etc. So the amount of locking would be
the same.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downlo ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books .mspx

Re: SELECT * vs SELECT col1, col2

am 19.09.2007 23:45:50 von Serge Rielau

Erland Sommarskog wrote:
> Hugo Kornelis (hugo@perFact.REMOVETHIS.info.INVALID) writes:
>> For SELECT *, the first step in parsing is to find out which columns are
>> in the table. That requires a read on the syscolumns system table in SQL
>> Server 2000, or it's undocumented equivalent in SQL Server 2005. That in
>> itself means that you'll have a small bit of overhead, due to requesting
>> a lock, getting it, reading the data, and releasing the lock again.
>
> But if you list all columns in the table, the optimizer still has to
> read all rows in sys.columns for the table verify that all columns
> exists, to find their data types etc. So the amount of locking would be
> the same.
Uhmm.. I don't think I'd want to confuse the official information schema
with what the DBMS uses for itself. You assume there is neither
compacting of the decomposed information into one object (like a LOB)
nor caching of the schema in memory.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Re: SELECT * vs SELECT col1, col2

am 20.09.2007 23:10:13 von Erland Sommarskog

Serge Rielau (srielau@ca.ibm.com) writes:
> Uhmm.. I don't think I'd want to confuse the official information schema
> with what the DBMS uses for itself. You assume there is neither
> compacting of the decomposed information into one object (like a LOB)
> nor caching of the schema in memory.

I would guess that SQL Server caches system tables more or less like
either tables. That would at least make most sense to me. DB2 may be
different.

And caching or not, the metadata changes requires a lock. I did this in
one window:

begin transaction
alter table Orders add xyz int null

And in another window I tried to running SELECTs against the Orders table,
and it no difference if I had "SELECT *" or if I selected a small number
of columns. The query blocked.

Again, this may be different in DB2.

--
Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downlo ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books .mspx

Re: SELECT * vs SELECT col1, col2

am 19.01.2008 13:08:39 von Knowledgy

Adding columns to tables frequently points to a poor database design.

--
Sincerely,
John K
Knowledgy Consulting
http://knowledgy.org

Atlanta's Business Intelligence and Data Warehouse Experts

"Danny Burton" wrote in message
news:Uv-dnaiBfeHjf3PbRVnyugA@giganews.com...
>I know that in order to reduce network traffic you should always select
>only the columns you require but I have a stored procedure which must
>**always** return all columns from a specific table.
>
> When returning all columns from a table is there any performance gain (or
> any other reason) for using SELECT col1, col2 etc... instead of just
> SELECT * ?
>
> I can immediately think of a disadvantage which is that if you include
> named columns in your SELECT then you must update the stored procedure
> each time a column is added to the table, using SELECT * gets around this
> so the stored procedure will never need to be modified.
>
> Would anyone care to share their thoughts on the above?
>
> Thanks,
>
> Clive
>
>
>
>
>

Re: SELECT * vs SELECT col1, col2

am 19.01.2008 16:23:46 von Serge Rielau

Knowledgy wrote:
> Adding columns to tables frequently points to a poor database design.
.... or a vibrant changing company with new requirements in their IT...
I see no harm in adding columns. Things that don't change are dead...

--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Re: SELECT * vs SELECT col1, col2

am 27.01.2008 19:02:51 von bb_43

In article , "Knowledgy" wrote:
>Adding columns to tables frequently points to a poor database design.
>

It must be nice to work for a company that doesnt change the requirements
either mid development, or after it's deployed.