SHOW TABLES, SELECT and REGEXP

am 27.01.2007 12:01:02 von Szymon Bobek

Hi!

I have databese with a few tables named like that: films_triller,
films_horror, films_action etc.
I decided to create many tables instead of creting just one big table
containing all sort of films. Is it a good idea? (I think, that it boost
search time much - am I right?)
Now - to find a film (when only title is known) I have to search in all
that tables. But I have a problem with that.. How to do that? SELECT * FROM
REGEXP 'film_*' WHERE title='the film' does not work. The same is when I
want to list all tables (to see what groups of films I have) - if I try do
join SHOW TABLES with REGEXP I get an syntax error too.

Where is the mistake and how to correct it?

Thanks in advance
Szymon Bobek

Re: SHOW TABLES, SELECT and REGEXP

am 27.01.2007 13:40:41 von Paul Lautman

Szymon Bobek wrote:
> Hi!
>
> I have databese with a few tables named like that: films_triller,
> films_horror, films_action etc.
> I decided to create many tables instead of creting just one big table
> containing all sort of films. Is it a good idea? (I think, that it
> boost search time much - am I right?)
> Now - to find a film (when only title is known) I have to search in
> all that tables. But I have a problem with that.. How to do that?
> SELECT * FROM REGEXP 'film_*' WHERE title='the film' does not work.
> The same is when I want to list all tables (to see what groups of
> films I have) - if I try do join SHOW TABLES with REGEXP I get an
> syntax error too.
> Where is the mistake and how to correct it?
>
> Thanks in advance
> Szymon Bobek

Well the first mistake is that you have lots of tables. The way you have is
set up will mean that search times are far greater than they need to be.

Put them all in one table with an ID column. Then have another table
containing columns ID and Film_Type where an ID can be entered many times
each with a different film type. This is because a film can be a
comedy/thriller so if the film's ID was 225, in this table would be entries:
ID | Film_Type
225 | Comedy
225 | Thriller.

This is called "normalisation".

Re: SHOW TABLES, SELECT and REGEXP

am 27.01.2007 15:32:19 von Norman Peelman

Szymon Bobek wrote:
> Hi!
>
> I have databese with a few tables named like that: films_triller,
> films_horror, films_action etc.
> I decided to create many tables instead of creting just one big table
> containing all sort of films. Is it a good idea? (I think, that it boost
> search time much - am I right?)
> Now - to find a film (when only title is known) I have to search in all
> that tables. But I have a problem with that.. How to do that? SELECT * FROM
> REGEXP 'film_*' WHERE title='the film' does not work. The same is when I
> want to list all tables (to see what groups of films I have) - if I try do
> join SHOW TABLES with REGEXP I get an syntax error too.
>
> Where is the mistake and how to correct it?
>
> Thanks in advance
> Szymon Bobek
>
>

Your structure is fine... the problem is your method of searching. I'm
not really sure you can use REGEX in that context:

1) You can obtain your TABLE names with 'SHOW TABLES'
2) You could keep a list of film types in an array
3) populate your SELECT with the TABLE names

example:

$tables = '';
$columns = '';
$query = "SHOW TABLES";
$result = mysql_query($query,$dbc);
while ($row = mysql_fetch_array($result))
{
$tables. = "$row,";
$columns .= "$row.title,";
}
$tables = substr($tables,0,-1); // strip trailing commas
$columns = substr($columns,0,-1);

$query = "SELECT * FROM $tables WHERE $columns.title LIKE "$film_name";
// or "%$film_name%", play with the % (wildcard matching) this should be
used as you could have films like 'GODZILA vs MOTHRA', 'GODZILA vs
MEGAZILA' or 'KING KONG', 'KING KONG (2002)', etc.
// echo "$query
"; // uncomment to see what your query looks like
---

You need to have an index on the film_name columns to make it real fast.

Norm

Re: SHOW TABLES, SELECT and REGEXP

am 27.01.2007 16:17:32 von Paul Lautman

Norman Peelman wrote:
> Szymon Bobek wrote:
>> Hi!
>>
>> I have databese with a few tables named like that: films_triller,
>> films_horror, films_action etc.
>> I decided to create many tables instead of creting just one big table
>> containing all sort of films. Is it a good idea? (I think, that it
>> boost search time much - am I right?)
>> Now - to find a film (when only title is known) I have to search in
>> all that tables. But I have a problem with that.. How to do that?
>> SELECT * FROM REGEXP 'film_*' WHERE title='the film' does not work.
>> The same is when I want to list all tables (to see what groups of
>> films I have) - if I try do join SHOW TABLES with REGEXP I get an
>> syntax error too. Where is the mistake and how to correct it?
>>
>> Thanks in advance
>> Szymon Bobek
>>
>>
>
> Your structure is fine... the problem is your method of searching. I'm
> not really sure you can use REGEX in that context:
You REALLY think that structure is fine???

Re: SHOW TABLES, SELECT and REGEXP

am 27.01.2007 16:55:44 von Norman Peelman

Paul Lautman wrote:
> Norman Peelman wrote:
>> Szymon Bobek wrote:
>>> Hi!
>>>
>>> I have databese with a few tables named like that: films_triller,
>>> films_horror, films_action etc.
>>> I decided to create many tables instead of creting just one big table
>>> containing all sort of films. Is it a good idea? (I think, that it
>>> boost search time much - am I right?)
>>> Now - to find a film (when only title is known) I have to search in
>>> all that tables. But I have a problem with that.. How to do that?
>>> SELECT * FROM REGEXP 'film_*' WHERE title='the film' does not work.
>>> The same is when I want to list all tables (to see what groups of
>>> films I have) - if I try do join SHOW TABLES with REGEXP I get an
>>> syntax error too. Where is the mistake and how to correct it?
>>>
>>> Thanks in advance
>>> Szymon Bobek
>>>
>>>
>> Your structure is fine... the problem is your method of searching. I'm
>> not really sure you can use REGEX in that context:
> You REALLY think that structure is fine???
>
>

What do you THINK is wrong with it? If you want to go for one big
table then your structure could become:

film_type set('horror','thriller','comedy','romance','action',...)

and then a row could be:

film_id film_title film_type
1 Romancing the Stone action, comedy, romance
2 GODZILLA action, sci-fi
3 Jurasic Park action, sci-fi

and your search(s) becomes:

SELECT * FROM film_table WHERE film_title LIKE '%Jurasic%'
returns data on Jurasic Park

or

SELECT * FROM film_table WHERE FIND_IN_SET('sci-fi')
returns data on GODZILLA and Jurasic Park

....does that look better. (an ENUM can contain only one item from a
list, a SET can contain multiple item from a list)

You would still want an index on the film_title. Not sure if you can
index a SET column but you could try.

Norm

Re: SHOW TABLES, SELECT and REGEXP

am 27.01.2007 17:37:57 von Szymon Bobek

Thank you all for help. But now I'm still wondering if this structure is
really good. I mean - I read that it is better when there is more smaller
tables that when you use one big table (MySQL searches faster)- that's why I
decided to organize my database like that.
But is it true? When you can really feel that some data structures work
faster than the others? Is it like matter of tables with milions of records
only? Or if structure of database is crucial thing even for small databases?
I'm asking, because I think that it would be easier form me to have just one
big table then. Searching will be easier, adding easier, etc.

Szymon

Re: SHOW TABLES, SELECT and REGEXP

am 27.01.2007 18:18:16 von zac.carey

On Jan 27, 3:55 pm, Norman Peelman wrote:
> Paul Lautman wrote:
> > Norman Peelman wrote:
> >> Szymon Bobek wrote:
> >>> Hi!
>
> >>> I have databese with a few tables named like that: films_triller,
> >>> films_horror, films_action etc.
> >>> I decided to create many tables instead of creting just one big table
> >>> containing all sort of films. Is it a good idea? (I think, that it
> >>> boost search time much - am I right?)
> >>> Now - to find a film (when only title is known) I have to search in
> >>> all that tables. But I have a problem with that.. How to do that?
> >>> SELECT * FROM REGEXP 'film_*' WHERE title='the film' does not work.
> >>> The same is when I want to list all tables (to see what groups of
> >>> films I have) - if I try do join SHOW TABLES with REGEXP I get an
> >>> syntax error too. Where is the mistake and how to correct it?
>
> >>> Thanks in advance
> >>> Szymon Bobek
>
> >> Your structure is fine... the problem is your method of searching. I'm
> >> not really sure you can use REGEX in that context:
> > You REALLY think that structure is fine??? What do you THINK is wrong with it? If you want to go for one big
> table then your structure could become:
>
> film_type set('horror','thriller','comedy','romance','action',...)
>
> and then a row could be:
>
> film_id film_title film_type
> 1 Romancing the Stone action, comedy, romance
> 2 GODZILLA action, sci-fi
> 3 Jurasic Park action, sci-fi
>
> and your search(s) becomes:
>
> SELECT * FROM film_table WHERE film_title LIKE '%Jurasic%'
> returns data on Jurasic Park
>
> or
>
> SELECT * FROM film_table WHERE FIND_IN_SET('sci-fi')
> returns data on GODZILLA and Jurasic Park
>
> ...does that look better. (an ENUM can contain only one item from a
> list, a SET can contain multiple item from a list)
>
> You would still want an index on the film_title. Not sure if you can
> index a SET column but you could try.
>
> Norm

NO!

Read up on normalisation.

Re: SHOW TABLES, SELECT and REGEXP

am 27.01.2007 20:51:24 von Norman Peelman

Szymon Bobek wrote:
> Thank you all for help. But now I'm still wondering if this structure is
> really good. I mean - I read that it is better when there is more smaller
> tables that when you use one big table (MySQL searches faster)- that's why I
> decided to organize my database like that.

MySQL searches fastest when indexing and keys are involved. Anything
outside that will start hitting peformance issues. The bigger the
database the bigger the issue. The second you have to look at every row
in a table your performance is out the window (even when referencing
multiple tables)

> But is it true? When you can really feel that some data structures work
> faster than the others? Is it like matter of tables with milions of records
> only? Or if structure of database is crucial thing even for small databases?
> I'm asking, because I think that it would be easier form me to have just one
> big table then. Searching will be easier, adding easier, etc.
>
> Szymon
>
>

I'm assuming that searching/adding/updating/deleting will be easier
on one big table for you.

Norm

Re: SHOW TABLES, SELECT and REGEXP

am 27.01.2007 21:07:37 von Paul Lautman

Norman Peelman wrote:
> Paul Lautman wrote:
>> Norman Peelman wrote:
>>> Szymon Bobek wrote:
>>>> Hi!
>>>>
>>>> I have databese with a few tables named like that: films_triller,
>>>> films_horror, films_action etc.
>>>> I decided to create many tables instead of creting just one big
>>>> table containing all sort of films. Is it a good idea? (I think,
>>>> that it boost search time much - am I right?)
>>>> Now - to find a film (when only title is known) I have to search
>>>> in all that tables. But I have a problem with that.. How to do
>>>> that? SELECT * FROM REGEXP 'film_*' WHERE title='the film' does
>>>> not work. The same is when I want to list all tables (to see what
>>>> groups of films I have) - if I try do join SHOW TABLES with
>>>> REGEXP I get an syntax error too. Where is the mistake and how to
>>>> correct it? Thanks in advance
>>>> Szymon Bobek
>>>>
>>>>
>>> Your structure is fine... the problem is your method of searching.
>>> I'm not really sure you can use REGEX in that context:
>> You REALLY think that structure is fine???
>>
>>
>
> What do you THINK is wrong with it? If you want to go for one big
> table then your structure could become:
>
> film_type set('horror','thriller','comedy','romance','action',...)
>
> and then a row could be:
>
> film_id film_title film_type
> 1 Romancing the Stone action, comedy, romance
> 2 GODZILLA action, sci-fi
> 3 Jurasic Park action, sci-fi
>
> and your search(s) becomes:
>
> SELECT * FROM film_table WHERE film_title LIKE '%Jurasic%'
> returns data on Jurasic Park
>
> or
>
> SELECT * FROM film_table WHERE FIND_IN_SET('sci-fi')
> returns data on GODZILLA and Jurasic Park
>
> ...does that look better. (an ENUM can contain only one item from a
> list, a SET can contain multiple item from a list)
>
> You would still want an index on the film_title. Not sure if you can
> index a SET column but you could try.
>
> Norm

If you read my post you would know that that was not what I was suggesting.
Have you heard of normalisation?
What you are suggesting is the sort of thing I expect to see from a first
year student!

Re: SHOW TABLES, SELECT and REGEXP

am 28.01.2007 07:33:41 von Norman Peelman

Paul Lautman wrote:
> Norman Peelman wrote:
>> Paul Lautman wrote:
>>> Norman Peelman wrote:
>>>> Szymon Bobek wrote:
>>>>> Hi!
>>>>>
>>>>> I have databese with a few tables named like that: films_triller,
>>>>> films_horror, films_action etc.
>>>>> I decided to create many tables instead of creting just one big
>>>>> table containing all sort of films. Is it a good idea? (I think,
>>>>> that it boost search time much - am I right?)
>>>>> Now - to find a film (when only title is known) I have to search
>>>>> in all that tables. But I have a problem with that.. How to do
>>>>> that? SELECT * FROM REGEXP 'film_*' WHERE title='the film' does
>>>>> not work. The same is when I want to list all tables (to see what
>>>>> groups of films I have) - if I try do join SHOW TABLES with
>>>>> REGEXP I get an syntax error too. Where is the mistake and how to
>>>>> correct it? Thanks in advance
>>>>> Szymon Bobek
>>>>>
>>>>>
>>>> Your structure is fine... the problem is your method of searching.
>>>> I'm not really sure you can use REGEX in that context:
>>> You REALLY think that structure is fine???
>>>
>>>
>> What do you THINK is wrong with it? If you want to go for one big
>> table then your structure could become:
>>
>> film_type set('horror','thriller','comedy','romance','action',...)
>>
>> and then a row could be:
>>
>> film_id film_title film_type
>> 1 Romancing the Stone action, comedy, romance
>> 2 GODZILLA action, sci-fi
>> 3 Jurasic Park action, sci-fi
>>
>> and your search(s) becomes:
>>
>> SELECT * FROM film_table WHERE film_title LIKE '%Jurasic%'
>> returns data on Jurasic Park
>>
>> or
>>
>> SELECT * FROM film_table WHERE FIND_IN_SET('sci-fi')
>> returns data on GODZILLA and Jurasic Park
>>
>> ...does that look better. (an ENUM can contain only one item from a
>> list, a SET can contain multiple item from a list)
>>
>> You would still want an index on the film_title. Not sure if you can
>> index a SET column but you could try.
>>
>> Norm
>
> If you read my post you would know that that was not what I was suggesting.
> Have you heard of normalisation?
> What you are suggesting is the sort of thing I expect to see from a first
> year student!
>
>

First of all, my apologies to you, I thought I was replying to the
original poster. Secondly, I seem to be the only one offering
suggestions. Yes, read up on normalization, get your head spun around
with a lot of things you may or may not understand (and I don't claim
to). I would imagine the original poster would like to get his database
up to speed. Looking at your original post you seem to describe a
schema that is the opposite of normalization.

For your example:

> ID | Film_Type
> 225 | Comedy
> 225 | Thriller.

> This is called "normalisation".

ID is not unique and Film_type can (and will) be repeated. And
actually both will be repeated quite often as alot of films fit multiple
categories.

1NF suggests that ID should be unique (or at least there should be a
unique identifier, and there is none) and Film_Type (and ID) should not
repeat. This schema in reality is no different than using a SET() for
Film_Type and I would imagine the later is more efficient since the
SET() data is stored in the schema definition and only pointers are
stored in the tables. 1NF suggests that for normalization you should
have one table for each Film_Type thus ensuring unique ID's and
non-repeating data.

p.s. Is normalization a requirement?

Norm

Re: SHOW TABLES, SELECT and REGEXP

am 28.01.2007 13:09:38 von Paul Lautman

Norman Peelman wrote:
> Paul Lautman wrote:
>> Norman Peelman wrote:
>>> Paul Lautman wrote:
>>>> Norman Peelman wrote:
>>>>> Szymon Bobek wrote:
>>>>>> Hi!
>>>>>>
>>>>>> I have databese with a few tables named like that: films_triller,
>>>>>> films_horror, films_action etc.
>>>>>> I decided to create many tables instead of creting just one big
>>>>>> table containing all sort of films. Is it a good idea? (I think,
>>>>>> that it boost search time much - am I right?)
>>>>>> Now - to find a film (when only title is known) I have to search
>>>>>> in all that tables. But I have a problem with that.. How to do
>>>>>> that? SELECT * FROM REGEXP 'film_*' WHERE title='the film' does
>>>>>> not work. The same is when I want to list all tables (to see what
>>>>>> groups of films I have) - if I try do join SHOW TABLES with
>>>>>> REGEXP I get an syntax error too. Where is the mistake and how to
>>>>>> correct it? Thanks in advance
>>>>>> Szymon Bobek
>>>>>>
>>>>>>
>>>>> Your structure is fine... the problem is your method of searching.
>>>>> I'm not really sure you can use REGEX in that context:
>>>> You REALLY think that structure is fine???
>>>>
>>>>
>>> What do you THINK is wrong with it? If you want to go for one big
>>> table then your structure could become:
>>>
>>> film_type set('horror','thriller','comedy','romance','action',...)
>>>
>>> and then a row could be:
>>>
>>> film_id film_title film_type
>>> 1 Romancing the Stone action, comedy, romance
>>> 2 GODZILLA action, sci-fi
>>> 3 Jurasic Park action, sci-fi
>>>
>>> and your search(s) becomes:
>>>
>>> SELECT * FROM film_table WHERE film_title LIKE '%Jurasic%'
>>> returns data on Jurasic Park
>>>
>>> or
>>>
>>> SELECT * FROM film_table WHERE FIND_IN_SET('sci-fi')
>>> returns data on GODZILLA and Jurasic Park
>>>
>>> ...does that look better. (an ENUM can contain only one item from a
>>> list, a SET can contain multiple item from a list)
>>>
>>> You would still want an index on the film_title. Not sure if you can
>>> index a SET column but you could try.
>>>
>>> Norm
>>
>> If you read my post you would know that that was not what I was
>> suggesting. Have you heard of normalisation?
>> What you are suggesting is the sort of thing I expect to see from a
>> first year student!
>>
>>
>
> First of all, my apologies to you, I thought I was replying to the
> original poster. Secondly, I seem to be the only one offering
> suggestions. Yes, read up on normalization, get your head spun around
> with a lot of things you may or may not understand (and I don't claim
> to). I would imagine the original poster would like to get his
> database up to speed. Looking at your original post you seem to
> describe a schema that is the opposite of normalization.
>
> For your example:
>
>> ID | Film_Type
>> 225 | Comedy
>> 225 | Thriller.
>
>> This is called "normalisation".
>
> ID is not unique and Film_type can (and will) be repeated. And
> actually both will be repeated quite often as alot of films fit
> multiple categories.
>
> 1NF suggests that ID should be unique (or at least there should be a
> unique identifier, and there is none) and Film_Type (and ID) should
> not repeat.
In this case the primay key would be (ID, Film_Type). 1NF does not require
that the unique identifier should consist of a single column of a table.

> This schema in reality is no different than using a SET()
> for Film_Type
this schema is VERY DIFFERENT from using a SET().

> and I would imagine the later is more efficient since
> the SET()
Unless you want to look things up in an efficient way!

> data is stored in the schema definition and only pointers
> are stored in the tables. 1NF suggests that for normalization you
> should have one table for each Film_Type thus ensuring unique ID's and
> non-repeating data.
>
> p.s. Is normalization a requirement?
>
> Norm

Re: SHOW TABLES, SELECT and REGEXP

am 29.01.2007 04:27:19 von Norman Peelman

Paul Lautman wrote:
> Norman Peelman wrote:
>> Paul Lautman wrote:
>>> Norman Peelman wrote:
>>>> Paul Lautman wrote:
>>>>> Norman Peelman wrote:
>>>>>> Szymon Bobek wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> I have databese with a few tables named like that: films_triller,
>>>>>>> films_horror, films_action etc.
>>>>>>> I decided to create many tables instead of creting just one big
>>>>>>> table containing all sort of films. Is it a good idea? (I think,
>>>>>>> that it boost search time much - am I right?)
>>>>>>> Now - to find a film (when only title is known) I have to search
>>>>>>> in all that tables. But I have a problem with that.. How to do
>>>>>>> that? SELECT * FROM REGEXP 'film_*' WHERE title='the film' does
>>>>>>> not work. The same is when I want to list all tables (to see what
>>>>>>> groups of films I have) - if I try do join SHOW TABLES with
>>>>>>> REGEXP I get an syntax error too. Where is the mistake and how to
>>>>>>> correct it? Thanks in advance
>>>>>>> Szymon Bobek
>>>>>>>
>>>>>>>
>>>>>> Your structure is fine... the problem is your method of searching.
>>>>>> I'm not really sure you can use REGEX in that context:
>>>>> You REALLY think that structure is fine???
>>>>>
>>>>>
>>>> What do you THINK is wrong with it? If you want to go for one big
>>>> table then your structure could become:
>>>>
>>>> film_type set('horror','thriller','comedy','romance','action',...)
>>>>
>>>> and then a row could be:
>>>>
>>>> film_id film_title film_type
>>>> 1 Romancing the Stone action, comedy, romance
>>>> 2 GODZILLA action, sci-fi
>>>> 3 Jurasic Park action, sci-fi
>>>>
>>>> and your search(s) becomes:
>>>>
>>>> SELECT * FROM film_table WHERE film_title LIKE '%Jurasic%'
>>>> returns data on Jurasic Park
>>>>
>>>> or
>>>>
>>>> SELECT * FROM film_table WHERE FIND_IN_SET('sci-fi')
>>>> returns data on GODZILLA and Jurasic Park
>>>>
>>>> ...does that look better. (an ENUM can contain only one item from a
>>>> list, a SET can contain multiple item from a list)
>>>>
>>>> You would still want an index on the film_title. Not sure if you can
>>>> index a SET column but you could try.
>>>>
>>>> Norm
>>> If you read my post you would know that that was not what I was
>>> suggesting. Have you heard of normalisation?
>>> What you are suggesting is the sort of thing I expect to see from a
>>> first year student!
>>>
>>>
>> First of all, my apologies to you, I thought I was replying to the
>> original poster. Secondly, I seem to be the only one offering
>> suggestions. Yes, read up on normalization, get your head spun around
>> with a lot of things you may or may not understand (and I don't claim
>> to). I would imagine the original poster would like to get his
>> database up to speed. Looking at your original post you seem to
>> describe a schema that is the opposite of normalization.
>>
>> For your example:
>>
>>> ID | Film_Type
>>> 225 | Comedy
>>> 225 | Thriller.
>>> This is called "normalisation".
>> ID is not unique and Film_type can (and will) be repeated. And
>> actually both will be repeated quite often as alot of films fit
>> multiple categories.
>>
>> 1NF suggests that ID should be unique (or at least there should be a
>> unique identifier, and there is none) and Film_Type (and ID) should
>> not repeat.
> In this case the primay key would be (ID, Film_Type). 1NF does not require
> that the unique identifier should consist of a single column of a table.
>

Something that you neglected to mention in your original post.

To the original poster, sorry if I gave you wrong advice. You will
definitely need an index on your film_title (whether it be a
multi-column primary or standard index) to have fast searches by that
column.

Norm

Re: SHOW TABLES, SELECT and REGEXP

am 29.01.2007 06:03:37 von Norman Peelman

Paul Lautman wrote:
> Norman Peelman wrote:
>> Paul Lautman wrote:
>>> Norman Peelman wrote:
>>>> Paul Lautman wrote:
>>>>> Norman Peelman wrote:
>>>>>> Szymon Bobek wrote:
>>>>>>> Hi!
>>>>>>>
>>>>>>> I have databese with a few tables named like that: films_triller,
>>>>>>> films_horror, films_action etc.
>>>>>>> I decided to create many tables instead of creting just one big
>>>>>>> table containing all sort of films. Is it a good idea? (I think,
>>>>>>> that it boost search time much - am I right?)
>>>>>>> Now - to find a film (when only title is known) I have to search
>>>>>>> in all that tables. But I have a problem with that.. How to do
>>>>>>> that? SELECT * FROM REGEXP 'film_*' WHERE title='the film' does
>>>>>>> not work. The same is when I want to list all tables (to see what
>>>>>>> groups of films I have) - if I try do join SHOW TABLES with
>>>>>>> REGEXP I get an syntax error too. Where is the mistake and how to
>>>>>>> correct it? Thanks in advance
>>>>>>> Szymon Bobek
>>>>>>>
>>>>>>>
>>>>>> Your structure is fine... the problem is your method of searching.
>>>>>> I'm not really sure you can use REGEX in that context:
>>>>> You REALLY think that structure is fine???
>>>>>
>>>>>
>>>> What do you THINK is wrong with it? If you want to go for one big
>>>> table then your structure could become:
>>>>
>>>> film_type set('horror','thriller','comedy','romance','action',...)
>>>>
>>>> and then a row could be:
>>>>
>>>> film_id film_title film_type
>>>> 1 Romancing the Stone action, comedy, romance
>>>> 2 GODZILLA action, sci-fi
>>>> 3 Jurasic Park action, sci-fi
>>>>
>>>> and your search(s) becomes:
>>>>
>>>> SELECT * FROM film_table WHERE film_title LIKE '%Jurasic%'
>>>> returns data on Jurasic Park
>>>>
>>>> or
>>>>
>>>> SELECT * FROM film_table WHERE FIND_IN_SET('sci-fi')
>>>> returns data on GODZILLA and Jurasic Park
>>>>
>>>> ...does that look better. (an ENUM can contain only one item from a
>>>> list, a SET can contain multiple item from a list)
>>>>
>>>> You would still want an index on the film_title. Not sure if you can
>>>> index a SET column but you could try.
>>>>
>>>> Norm
>>> If you read my post you would know that that was not what I was
>>> suggesting. Have you heard of normalisation?
>>> What you are suggesting is the sort of thing I expect to see from a
>>> first year student!
>>>
>>>
>> First of all, my apologies to you, I thought I was replying to the
>> original poster. Secondly, I seem to be the only one offering
>> suggestions. Yes, read up on normalization, get your head spun around
>> with a lot of things you may or may not understand (and I don't claim
>> to). I would imagine the original poster would like to get his
>> database up to speed. Looking at your original post you seem to
>> describe a schema that is the opposite of normalization.
>>
>> For your example:
>>
>>> ID | Film_Type
>>> 225 | Comedy
>>> 225 | Thriller.
>>> This is called "normalisation".
>> ID is not unique and Film_type can (and will) be repeated. And
>> actually both will be repeated quite often as alot of films fit
>> multiple categories.
>>
>> 1NF suggests that ID should be unique (or at least there should be a
>> unique identifier, and there is none) and Film_Type (and ID) should
>> not repeat.
> In this case the primay key would be (ID, Film_Type). 1NF does not require
> that the unique identifier should consist of a single column of a table.
>
>> This schema in reality is no different than using a SET()
>> for Film_Type
> this schema is VERY DIFFERENT from using a SET().
>
>> and I would imagine the later is more efficient since
>> the SET()
> Unless you want to look things up in an efficient way!
>
>> data is stored in the schema definition and only pointers
>> are stored in the tables. 1NF suggests that for normalization you
>> should have one table for each Film_Type thus ensuring unique ID's and
>> non-repeating data.
>>
>> p.s. Is normalization a requirement?
>>
>> Norm
>
>
>

It also appears that having an index on film_title using queries like:

1) SELECT * FROM table WHERE film_title = '????'; = instant retrieval
(.01 secs)
2) SELECT * FROM table WHERE film_title LIKE '????'; = instant retrieval
(.03 secs)
3) SELECT * FROM table WHERE film_title LIKE '%???'; = full table scan
(my system ~28 seconds on 10 million rows)
4) SELECT * FROM table WHERE film_title = '????%'; = instant retrieval
(.01 secs)
5) SELECT * FROM table WHERE film_title = '%??%'; = full table scan (my
system ~28 seconds on 10 million rows)

What can be done to speed #3 and #5 up? I would almost expect MySQL
to look into the INDEX first but it doesn't seem to want to. It tried to
USE/FORCE INDEX but no go if '%' was the first character in the LIKE
clause. This is really what the original poster needs to know (in
addition to normalization).

Norm

Re: SHOW TABLES, SELECT and REGEXP

am 29.01.2007 10:31:35 von Captain Paralytic

On 29 Jan, 05:03, Norman Peelman wrote:
> Paul Lautman wrote:
> > Norman Peelman wrote:
> >> Paul Lautman wrote:
> >>> Norman Peelman wrote:
> >>>> Paul Lautman wrote:
> >>>>> Norman Peelman wrote:
> >>>>>> Szymon Bobek wrote:
> >>>>>>> Hi!
>
> >>>>>>> I have databese with a few tables named like that: films_triller,
> >>>>>>> films_horror, films_action etc.
> >>>>>>> I decided to create many tables instead of creting just one big
> >>>>>>> table containing all sort of films. Is it a good idea? (I think,
> >>>>>>> that it boost search time much - am I right?)
> >>>>>>> Now - to find a film (when only title is known) I have to search
> >>>>>>> in all that tables. But I have a problem with that.. How to do
> >>>>>>> that? SELECT * FROM REGEXP 'film_*' WHERE title='the film' does
> >>>>>>> not work. The same is when I want to list all tables (to see what
> >>>>>>> groups of films I have) - if I try do join SHOW TABLES with
> >>>>>>> REGEXP I get an syntax error too. Where is the mistake and how to
> >>>>>>> correct it? Thanks in advance
> >>>>>>> Szymon Bobek
>
> >>>>>> Your structure is fine... the problem is your method of searching.
> >>>>>> I'm not really sure you can use REGEX in that context:
> >>>>> You REALLY think that structure is fine???
>
> >>>> What do you THINK is wrong with it? If you want to go for one big
> >>>> table then your structure could become:
>
> >>>> film_type set('horror','thriller','comedy','romance','action',...)
>
> >>>> and then a row could be:
>
> >>>> film_id film_title film_type
> >>>> 1 Romancing the Stone action, comedy, romance
> >>>> 2 GODZILLA action, sci-fi
> >>>> 3 Jurasic Park action, sci-fi
>
> >>>> and your search(s) becomes:
>
> >>>> SELECT * FROM film_table WHERE film_title LIKE '%Jurasic%'
> >>>> returns data on Jurasic Park
>
> >>>> or
>
> >>>> SELECT * FROM film_table WHERE FIND_IN_SET('sci-fi')
> >>>> returns data on GODZILLA and Jurasic Park
>
> >>>> ...does that look better. (an ENUM can contain only one item from a
> >>>> list, a SET can contain multiple item from a list)
>
> >>>> You would still want an index on the film_title. Not sure if you can
> >>>> index a SET column but you could try.
>
> >>>> Norm
> >>> If you read my post you would know that that was not what I was
> >>> suggesting. Have you heard of normalisation?
> >>> What you are suggesting is the sort of thing I expect to see from a
> >>> first year student!
>
> >> First of all, my apologies to you, I thought I was replying to the
> >> original poster. Secondly, I seem to be the only one offering
> >> suggestions. Yes, read up on normalization, get your head spun around
> >> with a lot of things you may or may not understand (and I don't claim
> >> to). I would imagine the original poster would like to get his
> >> database up to speed. Looking at your original post you seem to
> >> describe a schema that is the opposite of normalization.
>
> >> For your example:
>
> >>> ID | Film_Type
> >>> 225 | Comedy
> >>> 225 | Thriller.
> >>> This is called "normalisation".
> >> ID is not unique and Film_type can (and will) be repeated. And
> >> actually both will be repeated quite often as alot of films fit
> >> multiple categories.
>
> >> 1NF suggests that ID should be unique (or at least there should be a
> >> unique identifier, and there is none) and Film_Type (and ID) should
> >> not repeat.
> > In this case the primay key would be (ID, Film_Type). 1NF does not require
> > that the unique identifier should consist of a single column of a table.
>
> >> This schema in reality is no different than using a SET()
> >> for Film_Type
> > this schema is VERY DIFFERENT from using a SET().
>
> >> and I would imagine the later is more efficient since
> >> the SET()
> > Unless you want to look things up in an efficient way!
>
> >> data is stored in the schema definition and only pointers
> >> are stored in the tables. 1NF suggests that for normalization you
> >> should have one table for each Film_Type thus ensuring unique ID's and
> >> non-repeating data.
>
> >> p.s. Is normalization a requirement?
>
> >> Norm It also appears that having an index on film_title using queries like:
>
> 1) SELECT * FROM table WHERE film_title = '????'; = instant retrieval
> (.01 secs)
> 2) SELECT * FROM table WHERE film_title LIKE '????'; = instant retrieval
> (.03 secs)
> 3) SELECT * FROM table WHERE film_title LIKE '%???'; = full table scan
> (my system ~28 seconds on 10 million rows)
> 4) SELECT * FROM table WHERE film_title = '????%'; = instant retrieval
> (.01 secs)
> 5) SELECT * FROM table WHERE film_title = '%??%'; = full table scan (my
> system ~28 seconds on 10 million rows)
>
> What can be done to speed #3 and #5 up? I would almost expect MySQL
> to look into the INDEX first but it doesn't seem to want to. It tried to
> USE/FORCE INDEX but no go if '%' was the first character in the LIKE
> clause. This is really what the original poster needs to know (in
> addition to normalization).
>
> Norm- Hide quoted text -- Show quoted text -
If you understand what an index is, then it is obvious why it cannot
be used if % is the first character in a LIKE!

Re: SHOW TABLES, SELECT and REGEXP

am 30.01.2007 21:40:43 von unknown

Post removed (X-No-Archive: yes)