Question about table design

Question about table design

am 11.01.2005 12:45:37 von Manlio Perillo

Hi.
I'm a new user of PostgreSQL.

I want to create a catalogue of books but there are several problems.
Here are the two tables I need:

create table books (
title text,
author text,
primary key (title, author)
);

create table books_edition (
title text
autor text,
editor text,
ISBN text,
...
foreign key (title, author) references books (title, author)
);


Unfortunately books can have more then one author!.
Should I use an array?
However with an array it is hard to do queries on authors.

The best I have found is:

create table books (
title text,
author text,
other_authors text[],
primary key (title, author)
);


The problem is that I want title and author to be primary key.
Better solution?



Thanks and regards Manlio Perillo

Re: Question about table design

am 12.01.2005 13:55:07 von Erwin Moller

Manlio Perillo wrote:

> Hi.
> I'm a new user of PostgreSQL.
>
> I want to create a catalogue of books but there are several problems.
> Here are the two tables I need:
>
> create table books (
> title text,
> author text,
> primary key (title, author)
> );
>
> create table books_edition (
> title text
> autor text,
> editor text,
> ISBN text,
> ...
> foreign key (title, author) references books (title, author)
> );
>
>
> Unfortunately books can have more then one author!.
> Should I use an array?
> However with an array it is hard to do queries on authors.
>
> The best I have found is:
>
> create table books (
> title text,
> author text,
> other_authors text[],
> primary key (title, author)
> );
>
>
> The problem is that I want title and author to be primary key.
> Better solution?
>
>
>
> Thanks and regards Manlio Perillo


Hi,

If you insist on using title and author as PK, you are making your life
difficult for no appearant reason (at least for me).
Make life easy and start using a unique number to identify a row (use
SERIAL).
I never hit on a designproblem I couldn't simply solve by using an id
(SERIAL) for each table.
(Actually, I never use a table without it)

Why not try this: ???

create table tblbooks (
bookid SERIAL PRIMARY KEY,
title text,
editor text,
ISBN text,
edition text
);

now tblbooks just contains information about the book itself.
Ok?

create table tblauthors (
authorid SERIAL PRIMARY KEY,
authorname text
) ;

tblauthors contains a list of possible authors.
But they are not yet linked to a book(s).
Use another table to link the author(s) to book(s), like this:

create table tblauthorbooks (
authorbookid SERIAL PRIMARY KEY,
bookid int REFERENCES tblbooks(bookid),
authorid int REFERENCES tblauthor(authorid)
);

If you design like this, you are able to link more authors to a book.
tblauthorbooks contains the links between books and authors.

If you want to know who wrote a book with bookid=41, just:

SELECT authorid FROM tblauthorbooks WHERE (bookid=41);

or to get the names:

SELECT A.authorid, A.authorname FROM tblauthorbooks AS AB, tblauthors AS A
WHERE ( (A.authorid=AB.authorid) AND (AB.bookid=41));

(Or use different JOINstyle, but I am an oldfashioned guy.)


Hope this help you a bit.

(I didn't check my code, so beware for typos. It is just ment as an idea.)

Regards,
Erwin Moller

Re: Question about table design

am 13.01.2005 18:56:22 von Manlio Perillo

On Wed, 12 Jan 2005 13:55:07 +0100, Erwin Moller
wrote:

> [...]
>> create table books (
>> title text,
>> author text,
>> primary key (title, author)
>> );
>>
>> create table books_edition (
>> title text
>> autor text,
>> editor text,
>> ISBN text,
>> ...
>> foreign key (title, author) references books (title, author)
>> );
>>
>>
>> Unfortunately books can have more then one author!.
> [...]

>If you insist on using title and author as PK, you are making your life
>difficult for no appearant reason (at least for me).
>Make life easy and start using a unique number to identify a row (use
>SERIAL).

My problem is that I would like to use 'real' primary keys, not
'fictious' ones...

>I never hit on a designproblem I couldn't simply solve by using an id
>(SERIAL) for each table.
>(Actually, I never use a table without it)
>
>Why not try this: ???
>
>create table tblbooks (
> bookid SERIAL PRIMARY KEY,
> title text,
> editor text,
> ISBN text,
> edition text
>);
>

I would like to keep separate books and book's editions.

>now tblbooks just contains information about the book itself.
>Ok?
>
>create table tblauthors (
> authorid SERIAL PRIMARY KEY,
> authorname text
>) ;
>
>tblauthors contains a list of possible authors.
>But they are not yet linked to a book(s).
>Use another table to link the author(s) to book(s), like this:
>
>create table tblauthorbooks (
> authorbookid SERIAL PRIMARY KEY,
> bookid int REFERENCES tblbooks(bookid),
> authorid int REFERENCES tblauthor(authorid)
>);
>

Ok, I think this is the only solution.

>If you design like this, you are able to link more authors to a book.
>tblauthorbooks contains the links between books and authors.
>
>If you want to know who wrote a book with bookid=41, just:
>
>SELECT authorid FROM tblauthorbooks WHERE (bookid=41);
>
>or to get the names:
>
>SELECT A.authorid, A.authorname FROM tblauthorbooks AS AB, tblauthors AS A
>WHERE ( (A.authorid=AB.authorid) AND (AB.bookid=41));
>
>(Or use different JOINstyle, but I am an oldfashioned guy.)
>

In a 'real' query should appear tblbooks.title = 'some title' and not
tblbooks.bookid = some_number.



>
>Hope this help you a bit.
>

Of course, thank you.


Regards Manlio Perillo

Re: Question about table design

am 14.01.2005 11:11:10 von Erwin Moller

Manlio Perillo wrote:

Hi Manlio,

> On Wed, 12 Jan 2005 13:55:07 +0100, Erwin Moller
> wrote:
>
>> [...]
>>> create table books (
>>> title text,
>>> author text,
>>> primary key (title, author)
>>> );
>>>
>>> create table books_edition (
>>> title text
>>> autor text,
>>> editor text,
>>> ISBN text,
>>> ...
>>> foreign key (title, author) references books (title, author)
>>> );
>>>
>>>
>>> Unfortunately books can have more then one author!.
>> [...]
>
>>If you insist on using title and author as PK, you are making your life
>>difficult for no appearant reason (at least for me).
>>Make life easy and start using a unique number to identify a row (use
>>SERIAL).
>
> My problem is that I would like to use 'real' primary keys, not
> 'fictious' ones...

No punch intended, but that IS your problem.
autonumberfields for UNIQUE (PRIMARY) KEYS is your friend.

Imagine the (rather unusual) situation the we have 2 books with the same
title and same authorname?
With SERIAL you can tell which one is which one, with a PK on 2 fields you
cannot.
And futhermore:
- Joining is MUCH faster using a number instead of 2 textfields.
- Your queries are MUCH easier to read.

Really, you call it 'fictious', I call it a no-doubt/clear/unconfusing
'pointer' to the row you want.

Give it a shot, I am sure you want nothing else after some time. :-)

>
>>I never hit on a designproblem I couldn't simply solve by using an id
>>(SERIAL) for each table.
>>(Actually, I never use a table without it)
>>
>>Why not try this: ???
>>
>>create table tblbooks (
>> bookid SERIAL PRIMARY KEY,
>> title text,
>> editor text,
>> ISBN text,
>> edition text
>>);
>>
>
> I would like to keep separate books and book's editions.

Ok, I admit I was lazy.
When I typed it I didn't want to create another table.
If you want to keep track of the editions, the best way is of course to
create a table for it:

create table tblbookeditions(
bookeditionid SERIAL PRIMARY KEY,
bookid int REFERENCES tblbooks(bookid),
edition text
}

Of course you can also use a number instead of text for edition.


>
>>now tblbooks just contains information about the book itself.
>>Ok?
>>
>>create table tblauthors (
>> authorid SERIAL PRIMARY KEY,
>> authorname text
>>) ;
>>
>>tblauthors contains a list of possible authors.
>>But they are not yet linked to a book(s).
>>Use another table to link the author(s) to book(s), like this:
>>
>>create table tblauthorbooks (
>> authorbookid SERIAL PRIMARY KEY,
>> bookid int REFERENCES tblbooks(bookid),
>> authorid int REFERENCES tblauthor(authorid)
>>);
>>
>
> Ok, I think this is the only solution.
>
>>If you design like this, you are able to link more authors to a book.
>>tblauthorbooks contains the links between books and authors.
>>
>>If you want to know who wrote a book with bookid=41, just:
>>
>>SELECT authorid FROM tblauthorbooks WHERE (bookid=41);
>>
>>or to get the names:
>>
>>SELECT A.authorid, A.authorname FROM tblauthorbooks AS AB, tblauthors AS A
>>WHERE ( (A.authorid=AB.authorid) AND (AB.bookid=41));
>>
>>(Or use different JOINstyle, but I am an oldfashioned guy.)
>>
>
> In a 'real' query should appear tblbooks.title = 'some title' and not
> tblbooks.bookid = some_number.

Which shouldn't raise any problems.
Works also just fine.

>
>
>
>>
>>Hope this help you a bit.
>>
>
> Of course, thank you.
>
>
> Regards Manlio Perillo

One general word of advise when designing databases:
(whole books are written to cover this subject, but I'll do it in a few
sentences. :P)

This is how I approach it:

If you need to add some characteristic (eg: publishyear or author) to an
entity (book), just wonder if:
- that characteristic can be viewed as an entity too you might need in other
places (in the case of an author),
- or that it is just an extra piece of information you DON'T need to
'categorize' (= making a special table for it) as in the case of
publishingyear.

Of course you could make a table tblpublishingyears to which an entry in
tblbooks points to by using a Foreign Key, but that is overkill in my
humble opinion.

A few examples:
1) if you want to store the year of birth of an author, add it as a new
field to tblauthor.
2) If you want to store the publisher of a book, make a new table because
you proably want to use that same publisher too for other books.
3) If you want to store the publisher of a book PER EDITION (which of course
happens), you use a Foreign Key in our newly created table tblbookeditions
that points to the PK in tblpublisher.



Good luck!

Regards,
Erwin Moller

Re: Question about table design

am 15.01.2005 14:02:03 von Manlio Perillo

On Fri, 14 Jan 2005 11:11:10 +0100, Erwin Moller
wrote:


>> My problem is that I would like to use 'real' primary keys, not
>> 'fictious' ones...
>
>No punch intended, but that IS your problem.
>autonumberfields for UNIQUE (PRIMARY) KEYS is your friend.
>
>Imagine the (rather unusual) situation the we have 2 books with the same
>title and same authorname?

The assertion was that this is not the case!
However SERIAL is really the solution.

>With SERIAL you can tell which one is which one, with a PK on 2 fields you
>cannot.
>And futhermore:
>- Joining is MUCH faster using a number instead of 2 textfields.
>- Your queries are MUCH easier to read.
>
>Really, you call it 'fictious', I call it a no-doubt/clear/unconfusing
>'pointer' to the row you want.

> [...]

>>
>>>If you design like this, you are able to link more authors to a book.
>>>tblauthorbooks contains the links between books and authors.
>>>
>>>If you want to know who wrote a book with bookid=41, just:
>>>
>>>SELECT authorid FROM tblauthorbooks WHERE (bookid=41);
>>>
>>>or to get the names:
>>>
>>>SELECT A.authorid, A.authorname FROM tblauthorbooks AS AB, tblauthors AS A
>>>WHERE ( (A.authorid=AB.authorid) AND (AB.bookid=41));
>>>
>>>(Or use different JOINstyle, but I am an oldfashioned guy.)
>>>
>>
>> In a 'real' query should appear tblbooks.title = 'some title' and not
>> tblbooks.bookid = some_number.
>
>Which shouldn't raise any problems.
>Works also just fine.
>

It works just fine if the query is done by the database administrator!
Moreover what if one repopulate the tables in a different order?
The SERIAL ids are no more the same.

>One general word of advise when designing databases:
>(whole books are written to cover this subject, but I'll do it in a few
>sentences. :P)
>

Thank you very much.
Can you suggest me a good book on relational theory, SQL manual and
database design (with real life examples)?
How is Date's 'Introduction to Database Systems' ?


>This is how I approach it:
>
>If you need to add some characteristic (eg: publishyear or author) to an
>entity (book), just wonder if:
>- that characteristic can be viewed as an entity too you might need in other
>places (in the case of an author),
>- or that it is just an extra piece of information you DON'T need to
>'categorize' (= making a special table for it) as in the case of
>publishingyear.
>
>Of course you could make a table tblpublishingyears to which an entry in
>tblbooks points to by using a Foreign Key, but that is overkill in my
>humble opinion.
>
>A few examples:
>1) if you want to store the year of birth of an author, add it as a new
>field to tblauthor.
>2) If you want to store the publisher of a book, make a new table because
>you proably want to use that same publisher too for other books.
>3) If you want to store the publisher of a book PER EDITION (which of course
>happens), you use a Foreign Key in our newly created table tblbookeditions
>that points to the PK in tblpublisher.
>
>
>
>Good luck!
>


Thanks and regards Manlio Perillo

Re: Question about table design

am 18.01.2005 15:57:04 von Erwin Moller

Manlio Perillo wrote:

> On Fri, 14 Jan 2005 11:11:10 +0100, Erwin Moller
> wrote:
>
>
>>> My problem is that I would like to use 'real' primary keys, not
>>> 'fictious' ones...
>>
>>No punch intended, but that IS your problem.
>>autonumberfields for UNIQUE (PRIMARY) KEYS is your friend.
>>
>>Imagine the (rather unusual) situation the we have 2 books with the same
>>title and same authorname?
>
> The assertion was that this is not the case!
> However SERIAL is really the solution.

Ok, no need to shout. :-)
I didn't know of any assertions.

>
>>With SERIAL you can tell which one is which one, with a PK on 2 fields you
>>cannot.
>>And futhermore:
>>- Joining is MUCH faster using a number instead of 2 textfields.
>>- Your queries are MUCH easier to read.
>>
>>Really, you call it 'fictious', I call it a no-doubt/clear/unconfusing
>>'pointer' to the row you want.
>
>> [...]
>
>>>
>>>>If you design like this, you are able to link more authors to a book.
>>>>tblauthorbooks contains the links between books and authors.
>>>>
>>>>If you want to know who wrote a book with bookid=41, just:
>>>>
>>>>SELECT authorid FROM tblauthorbooks WHERE (bookid=41);
>>>>
>>>>or to get the names:
>>>>
>>>>SELECT A.authorid, A.authorname FROM tblauthorbooks AS AB, tblauthors AS
>>>>A WHERE ( (A.authorid=AB.authorid) AND (AB.bookid=41));
>>>>
>>>>(Or use different JOINstyle, but I am an oldfashioned guy.)
>>>>
>>>
>>> In a 'real' query should appear tblbooks.title = 'some title' and not
>>> tblbooks.bookid = some_number.
>>
>>Which shouldn't raise any problems.
>>Works also just fine.
>>
>
> It works just fine if the query is done by the database administrator!

How so?
Anybody who has SELECT-rights to the relevant tables should be able to run
that query.
No difference between:
(bookid=XX) or
(title LIKE '%XXX%')

You really don't need administrator rights. It is just a query.

Or am I missing some point here?


> Moreover what if one repopulate the tables in a different order?
> The SERIAL ids are no more the same.

You are really not the first one that tries that. :-)
Of course you can repopulate the table and keep the SERIALs the same.

This is how it works:

1) If you insert in a table with a serial and DON'T give a value for the
serial, POSTGRESQL will use a function to get the next value. (Called
NEXTVAL)

This happens because Postgres will use the DEFAULT-value for that
serialfield (as it should, because you didn't give a value).
The defaultvalue is set to a function in this case.

Please check this by making a simple table with a serial and then describe
the table.
You will see what I mean.


2) If you insert into a table AND you give a value for the SERIAL-field, no
problem: Postgres tries to insert your number. (Of course, when this number
already exists and you made that column PRIMARY KEY, you receive an error.)

Bottomline:
There really is no need to EVER change the value you gave to a SERIAL-field.
(At least, none that I ever encountered.)



>
>>One general word of advise when designing databases:
>>(whole books are written to cover this subject, but I'll do it in a few
>>sentences. :P)
>>
>
> Thank you very much.
> Can you suggest me a good book on relational theory, SQL manual and
> database design (with real life examples)?
> How is Date's 'Introduction to Database Systems' ?

No, I don't have advise here, except this:

When I asked that very same question a few years back to a friend who was
good with databases he told me: "Designing databases is something you have
to do a lot of times. Make mistakes and learn how you can avoid them next
time. You will recognize your designmistakes whenever your queries become
extremely long and complicated, and thus errorprone. AFTER that read a
book, if you still need it because most books start making sense when you
have some experience."

Really, I am NOT sure if that is good advise in general, but it worked great
for me.

I never came to reading a book about it. :P

In general:
- When you design a database, ALWAYS try to make the queries you need in
your mind. Try to figure out if your design allows reasonably simple
queries to get an answer. If not, redesign the tablestructure.

- Make a visual representation of the database.
Make sure you show all Foreign Keys by drawing lines from the FK-column on
some table to the PK (SERIAL) to another table.

You can use some softwarepackage for this, but I prefer doing in on paper,
using just a marker and 2 colorpens.

- NEVER start programming before you are confident with your design.
(Of course, when you are new you are never confident, then just try.)


Anyway, I am sure there must be many good books concerning databasedesign, I
just don't know any.
(I know actually one, but that is in Dutch only, my mothertongue.)

>
>
>>This is how I approach it:
>>
>>If you need to add some characteristic (eg: publishyear or author) to an
>>entity (book), just wonder if:
>>- that characteristic can be viewed as an entity too you might need in
>>other places (in the case of an author),
>>- or that it is just an extra piece of information you DON'T need to
>>'categorize' (= making a special table for it) as in the case of
>>publishingyear.
>>
>>Of course you could make a table tblpublishingyears to which an entry in
>>tblbooks points to by using a Foreign Key, but that is overkill in my
>>humble opinion.
>>
>>A few examples:
>>1) if you want to store the year of birth of an author, add it as a new
>>field to tblauthor.
>>2) If you want to store the publisher of a book, make a new table because
>>you proably want to use that same publisher too for other books.
>>3) If you want to store the publisher of a book PER EDITION (which of
>>course happens), you use a Foreign Key in our newly created table
>>tblbookeditions that points to the PK in tblpublisher.
>>
>>
>>
>>Good luck!
>>
>
>
> Thanks and regards Manlio Perillo


Good luck Manlio!

Regards,
Erwin Moller

Re: Question about table design

am 20.01.2005 21:17:27 von Manlio Perillo

On Tue, 18 Jan 2005 15:57:04 +0100, Erwin Moller
wrote:


>>>>>SELECT A.authorid, A.authorname FROM tblauthorbooks AS AB, tblauthors AS
>>>>>A WHERE ( (A.authorid=AB.authorid) AND (AB.bookid=41));
>>>>>
>>>>>(Or use different JOINstyle, but I am an oldfashioned guy.)
>>>>>
>>>>
>>>> In a 'real' query should appear tblbooks.title = 'some title' and not
>>>> tblbooks.bookid = some_number.
>>>
>>>Which shouldn't raise any problems.
>>>Works also just fine.
>>>
>>
>> It works just fine if the query is done by the database administrator!
>
>How so?
>Anybody who has SELECT-rights to the relevant tables should be able to run
>that query.
>No difference between:
>(bookid=XX) or
>(title LIKE '%XXX%')
>
>You really don't need administrator rights. It is just a query.
>
>Or am I missing some point here?
>

It is a misunderstanding!
As non administrator database user I intend an user who interacts with
the database via an user interface.
Such an user of cource would like to do a query using author name and
book title.

>
>> Moreover what if one repopulate the tables in a different order?
>> The SERIAL ids are no more the same.
>
>You are really not the first one that tries that. :-)
>Of course you can repopulate the table and keep the SERIALs the same.
>
>This is how it works:
>
> [...]

Ok.
I have already read the PostgreSQL documentation.

>
>>
>>>One general word of advise when designing databases:
>>>(whole books are written to cover this subject, but I'll do it in a few
>>>sentences. :P)
>>>
>>
>> Thank you very much.
>> Can you suggest me a good book on relational theory, SQL manual and
>> database design (with real life examples)?
>> How is Date's 'Introduction to Database Systems' ?
>
>No, I don't have advise here, except this:
>
>When I asked that very same question a few years back to a friend who was
>good with databases he told me: "Designing databases is something you have
>to do a lot of times. Make mistakes and learn how you can avoid them next
>time. You will recognize your designmistakes whenever your queries become
>extremely long and complicated, and thus errorprone. AFTER that read a
>book, if you still need it because most books start making sense when you
>have some experience."
>
>Really, I am NOT sure if that is good advise in general, but it worked great
>for me.
>
>I never came to reading a book about it. :P
>
>In general:
> [...]

Many thanks for the advices!
I have asked for a book because it is wise (and 'easy') to learn from
"other' errors"

Moreover, usually there are well known design patterns for common
problems


Regards, Manlio Perillo