Re: [PHP] SQL Syntax

Re: [PHP] SQL Syntax

am 16.06.2010 03:05:27 von Daniel Brown

[Top-post.]

You'll probably have much better luck on the MySQL General list.
CC'ed on this email.


On Tue, Jun 15, 2010 at 20:58, Jan Reiter wrote:
> Hi folks!
>
> I'm kind of ashamed to ask a question, as I haven't followed this list ve=
ry
> much lately.
>
>
>
> This isn't exactly a PHP question, but since mysql is the most popular
> database engine used with php, I figured someone here might have an idea.
>
>
>
> I have 2 tables. Table A containing 2 fields. A user ID and a picture ID =
=3D>
> A(uid,pid) and another table B, containing 3 fields. The picture ID, an
> attribute ID and a value for that attribute =3D> B(pid,aid,value).
>
>
>
> Table B contains several rows for a single PID with various AIDs and valu=
es.
> Each AID is unique to a PID. =A0(e.g. AID =3D 1 always holding the value =
for the
> image size and AID =3D 3 always holding a value for the image type)
>
>
>
> The goal is now to join table A on table B using pid, and selecting the r=
ows
> based on MULTIPLE =A0attributes.
>
>
>
> So the result should only contain rows for images, that relate to an
> attribute ID =3D 1 (size) that is bigger than 100 AND!!!!!!! an attribute=
ID =3D
> 5 that equals 'jpg'.
>
>
>
> I know that there is an easy solution to this, doing it in one query and =
I
> have the feeling, that I can almost touch it with my fingertips in my min=
d,
> but I can't go that final step, if you know what I mean. AND THAT DRIVES =
ME
> CRAZY!!!!!!
>
>
>
> I appreciate your thoughts on this.
>
>
>
> Regards,
>
> Jan
>
>



--=20

daniel.brown@parasane.net || danbrown@php.net
http://www.parasane.net/ || http://www.pilotpig.net/
We now offer SAME-DAY SETUP on a new line of servers!

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg

Re: [PHP] SQL Syntax

am 16.06.2010 14:49:10 von Joerg Bruehe

Hi!


Daniel Brown wrote:
> [Top-post.]
>=20
> You'll probably have much better luck on the MySQL General list=
..
> CC'ed on this email.
>=20
>=20
> On Tue, Jun 15, 2010 at 20:58, Jan Reiter wrot=
e:
>> Hi folks!
>>
>> [[...]]
>>
>> I have 2 tables. Table A containing 2 fields. A user ID and a pict=
ure ID =3D>
>> A(uid,pid) and another table B, containing 3 fields. The picture I=
D, an
>> attribute ID and a value for that attribute =3D> B(pid,aid,value).
>>
>> Table B contains several rows for a single PID with various AIDs a=
nd values.
>> Each AID is unique to a PID. (e.g. AID =3D 1 always holding the v=
alue for the
>> image size and AID =3D 3 always holding a value for the image type=
)
>>
>> The goal is now to join table A on table B using pid, and selectin=
g the rows
>> based on MULTIPLE attributes.
>>
>> So the result should only contain rows for images, that relate to =
an
>> attribute ID =3D 1 (size) that is bigger than 100 AND!!!!!!! an at=
tribute ID =3D
>> 5 that equals 'jpg'.
>>
>> [[...]]

You need to do a multi-table join, table A joined to one instance of
table B for each attribute relevant to your search.

Roughly, syntax not tested, it is something like
SELECT a.uid, a.pid FROM a JOIN b AS b1 ON a.pid=3Db1.pid
JOIN b AS b2 ON a.pid=3Db2.pid
JOIN ...
WHERE b1.aid =3D 1 AND b1.value > 100
AND b2.aid =3D 3 AND b2.value =3D 5
AND ...
(assuming 'jpg' is coded as 5, what I take from your text).

Now, I see some difficulties with this:
1) You are using the "value" column for anything, that may cause data
type problems.
2) AFAIR, there was a post recently claiming the alias names (b1, b2,
...) could not be used in WHERE conditions, and the recommendation
was to replace WHERE by HAVING.
3) If you need to support many attributes in one search, the number o=
f
tables joined grows, and the amount of data to handle (cartesian
product!) will explode.
What works fine with 3 criteria on 10 pictures (10 * 10 * 10 =3D 1=
000)
may totally fail with 4 criteria on 200 pictures
(200**4 =3D 800.000.000 =3D 800 million)
4) The more different attributes you store per picture, the larger yo=
ur
table B will become, and this will make the data grow for each joi=
n
step.
If you store 4 attributes each for 200 pictures, table B will alre=
ady
have 800 entries. In itself, that isn't much, but now the 4-way jo=
in
will produce a cartesian product of
800**4 =3D 8**4 * 100**4 =3D 4096 * 100.000.000 =3D 409.600.000=
..000
combinations.
In your place, I would use a separate table for attributes which a=
re
expected to be defined for all pictures, like size and image type.
Then your general attributes table B will hold much fewer rows, th=
us
each join step will profit.
5) Because of that explosion, it may be better to work with a tempora=
ry
table, joining it to B for one attribute and thus reducing the dat=
a,
then looping over such a step for all the relevant attributes.

Good luck in experimenting!


Jörg

--=20
Joerg Bruehe, MySQL Build Team, Joerg.Bruehe@Sun.COM
Sun Microsystems GmbH, Komturstrasse 18a, D-12099 Berlin
Geschaeftsfuehrer: Juergen Kunz
Amtsgericht Muenchen: HRB161028


--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=3Dgcdmg-mysql-2@m.gmane.o rg

RE: SQL Syntax

am 16.06.2010 15:50:20 von Jan Reiter

Thank you, for your response!

I'm testing with exactly that table structure to evaluate flexibility vs
speed. I only use images, as it gives me easy a vast amount of sample =
date.
I'm testing with 30,000 images. :-)

So far I was testing multi-table joins in various combinations, trying =
to
help mysql's optimization routines using IN() etc.=20

But comparing for 2 attributes costs 1.8 seconds/ query with 7 =
attributes
per image and 30,000 images.=20
Not acceptable. ;-)

Views without conditions on the values resulted in no speed advantages.
(figures ... )=20

I see chances, that temporary tables will speed things up a bit. I'm =
testing
that next.=20
Another attempt could be to switch DB engines from InnoDB to MyISAM ... =
I'll
see what it results in. :D.


This is just a private test on a sandbox machine. I earn my living =
flying
airliners, not programming.

Thanks and Regards,
Jan


-----Original Message-----
From: Joerg.Bruehe@Sun.COM [mailto:Joerg.Bruehe@Sun.COM]=20
Sent: Wednesday, June 16, 2010 2:49 PM
To: MYSQL General List
Cc: Jan Reiter
Subject: Re: [PHP] SQL Syntax

Hi!


Daniel Brown wrote:
> [Top-post.]
>=20
> You'll probably have much better luck on the MySQL General list.
> CC'ed on this email.
>=20
>=20
> On Tue, Jun 15, 2010 at 20:58, Jan Reiter wrote:
>> Hi folks!
>>
>> [[...]]
>>
>> I have 2 tables. Table A containing 2 fields. A user ID and a picture =
ID
=3D>
>> A(uid,pid) and another table B, containing 3 fields. The picture ID, =
an
>> attribute ID and a value for that attribute =3D> B(pid,aid,value).
>>
>> Table B contains several rows for a single PID with various AIDs and
values.
>> Each AID is unique to a PID. (e.g. AID =3D 1 always holding the =
value for
the
>> image size and AID =3D 3 always holding a value for the image type)
>>
>> The goal is now to join table A on table B using pid, and selecting =
the
rows
>> based on MULTIPLE attributes.
>>
>> So the result should only contain rows for images, that relate to an
>> attribute ID =3D 1 (size) that is bigger than 100 AND!!!!!!! an =
attribute
ID =3D
>> 5 that equals 'jpg'.
>>
>> [[...]]

You need to do a multi-table join, table A joined to one instance of
table B for each attribute relevant to your search.

Roughly, syntax not tested, it is something like
SELECT a.uid, a.pid FROM a JOIN b AS b1 ON a.pid=3Db1.pid
JOIN b AS b2 ON a.pid=3Db2.pid
JOIN ...
WHERE b1.aid =3D 1 AND b1.value > 100
AND b2.aid =3D 3 AND b2.value =3D 5
AND ...
(assuming 'jpg' is coded as 5, what I take from your text).

Now, I see some difficulties with this:
1) You are using the "value" column for anything, that may cause data
type problems.
2) AFAIR, there was a post recently claiming the alias names (b1, b2,
...) could not be used in WHERE conditions, and the recommendation
was to replace WHERE by HAVING.
3) If you need to support many attributes in one search, the number of
tables joined grows, and the amount of data to handle (cartesian
product!) will explode.
What works fine with 3 criteria on 10 pictures (10 * 10 * 10 =3D =
1000)
may totally fail with 4 criteria on 200 pictures
(200**4 =3D 800.000.000 =3D 800 million)
4) The more different attributes you store per picture, the larger your
table B will become, and this will make the data grow for each join
step.
If you store 4 attributes each for 200 pictures, table B will already
have 800 entries. In itself, that isn't much, but now the 4-way join
will produce a cartesian product of
800**4 =3D 8**4 * 100**4 =3D 4096 * 100.000.000 =3D =
409.600.000.000
combinations.
In your place, I would use a separate table for attributes which are
expected to be defined for all pictures, like size and image type.
Then your general attributes table B will hold much fewer rows, thus
each join step will profit.
5) Because of that explosion, it may be better to work with a temporary
table, joining it to B for one attribute and thus reducing the data,
then looping over such a step for all the relevant attributes.

Good luck in experimenting!


Jörg

--=20
Joerg Bruehe, MySQL Build Team, Joerg.Bruehe@Sun.COM
Sun Microsystems GmbH, Komturstrasse 18a, D-12099 Berlin
Geschaeftsfuehrer: Juergen Kunz
Amtsgericht Muenchen: HRB161028


--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php