MySQL queries in a relational database

am 30.03.2011 12:58:57 von francesca casalino

--20cf304346f43b2a7f049fb11085
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

Dear Perl and MySQL gurus,

I have been learning a lot from you for the past month, and, learning from
your comments and with your help, I was able to build a database on MySQL
through Perl DBI.

I am now having trouble with generating a query in Perl DBI to take data
from this database and insert it into another table in MySQL; I am still
just at the start in learning both MySQL and Perl, and sorry if this is a
simple question again, but I have been stuck on this for a while now=85if y=
ou
have any advice on how to tackle this please let me know.

My database is constructed with different tables connected to each other
through foreign keys. This is an example of the data that I am having
trouble with (primary and foreign keys are specified in parenthesis, and
foreign keys refer to another table in the database). I used ENGINE=3DINNOD=
B
for constructing the database, and MySQL version 5.1.

*Allele*

Allele_id (Primary key) Variation_id (Foreign key) Allele
Reference

12 1
A 0

13 1
G 1

*Genotype*

Genotype_id (Primary key) Sample_id (Foreign key) Allele_id
(Foreign key)

1 Sample1
12

2 Sample1
13

I am trying to:

1) Fill in another table which contains frequencies for each of these
alleles, for each variation, and

2) Find a way to select and count the samples that have allele.reference =
=3D 0
and 0 for each variation and store them in one group, and the ones that hav=
e
allele.reference =3D 0 and 1 in another group, and the ones that have
allele.reference =3D 1 and 1 ina athird group.

For 1) I have solved the counts that I will need for the frequency
calculations on MySQL, but for some reason it does not work when I try the
first of these queries on Perl. The Mysql is this:

#count number of alleles entered for each variation:

select allele.variation_id, count(genotype.allele_id)

from allele, genotype

where allele.allele_id =3D genotype.allele_id

group by allele.variation_id;

#Count the number of samples that have reference=3D0 for each variation:

select allele.variation_id, count(genotype.allele_id)

from allele, genotype

where allele.allele_id =3D genotype.allele_id and allele.reference=3D0

group by allele.variation_id;

When I try this in Perl it does not work=85

#count number of alleles entered for each variation:

my $sth =3D$dbh->prepare("SELECT allele.variation_id,
count(genotype.allele_id)

from allele, genotype

where allele.allele_id =3D genotype.allele_id

group by allele.variation_id");

$sth->execute();

my (@allele);

while (@allele =3D $sth->fetchrow_array()) {

print @allele."\n";

}

2) I really have no idea how to record the values of the successive entries
of =93reference=94 grouped by variation and sample_id=85 I have been lookin=
g
through MySQL tutorials, but I really need to understand this better to kno=
w
how to approach this=85

#Count the number of samples that have reference =3D0 for both entries for
each variation

select allele.variation_id, count(genotype.sample_id)

from allele JOIN genotype ON allele.allele_id =3D genotype.allele_id

group by allele.variation_id, genotype.sample_id;

Thank you VERY VERY much for any help/suggestions, or any books/tutorials
that I could look at to understand how to solve these problems=85

-francesca

--20cf304346f43b2a7f049fb11085--

Re: MySQL queries in a relational database

am 30.03.2011 16:32:19 von Michael Boudreau

--_000_C9B8A923C262mboudreauchicagoedu_
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Hi Francesca,

Given the sample tables below, your query should return a single row ('1', =
'1'). You'll need to describe in more detail what you mean by "it does not =
work". In particular, I'd recommend adding some code to collect error messa=
ges, e.g.,

my $sth =3D $dbh->prepare("...")
or print "Cannot prepare query: $DBI::errstr\n";

$sth->execute
or print "Cannot execute query: $DBI::errstr\n";

Here are some books that I have found helpful for learning MySQL and the Pe=
rl DBI:

Alligator Descartes and Tim Bunce, Programming the Perl DBI. O'Reilly, 2000=
..

Paul DuBois, MySQL. 4th edition. Addison-Wesley, 2009.

Paul DuBois, MySQL Cookbook. 2nd edition. O'Reilly, 2007.

On 3/30/11 5:58 AM, "francesca casalino" wrote:

Dear Perl and MySQL gurus,

I have been learning a lot from you for the past month, and, learning from
your comments and with your help, I was able to build a database on MySQL
through Perl DBI.

I am now having trouble with generating a query in Perl DBI to take data
from this database and insert it into another table in MySQL; I am still
just at the start in learning both MySQL and Perl, and sorry if this is a
simple question again, but I have been stuck on this for a while now...if y=
ou
have any advice on how to tackle this please let me know.

My database is constructed with different tables connected to each other
through foreign keys. This is an example of the data that I am having
trouble with (primary and foreign keys are specified in parenthesis, and
foreign keys refer to another table in the database). I used ENGINE=3DINNOD=
B
for constructing the database, and MySQL version 5.1.

*Allele*

Allele_id (Primary key) Variation_id (Foreign key) Allele
Reference

12 1
A 0

13 1
G 1

*Genotype*

Genotype_id (Primary key) Sample_id (Foreign key) Allele_id
(Foreign key)

1 Sample1
12

2 Sample1
13

I am trying to:

1) Fill in another table which contains frequencies for each of these
alleles, for each variation, and

2) Find a way to select and count the samples that have allele.reference =
=3D 0
and 0 for each variation and store them in one group, and the ones that hav=
e
allele.reference =3D 0 and 1 in another group, and the ones that have
allele.reference =3D 1 and 1 ina athird group.

For 1) I have solved the counts that I will need for the frequency
calculations on MySQL, but for some reason it does not work when I try the
first of these queries on Perl. The Mysql is this:

#count number of alleles entered for each variation:

select allele.variation_id, count(genotype.allele_id)

from allele, genotype

where allele.allele_id =3D genotype.allele_id

group by allele.variation_id;

#Count the number of samples that have reference=3D0 for each variation:

select allele.variation_id, count(genotype.allele_id)

from allele, genotype

where allele.allele_id =3D genotype.allele_id and allele.reference=3D0

group by allele.variation_id;

When I try this in Perl it does not work...

#count number of alleles entered for each variation:

my $sth =3D$dbh->prepare("SELECT allele.variation_id,
count(genotype.allele_id)

from allele, genotype

where allele.allele_id =3D genotype.allele_id

group by allele.variation_id");

$sth->execute();

my (@allele);

while (@allele =3D $sth->fetchrow_array()) {

print @allele."\n";

}

2) I really have no idea how to record the values of the successive entries
of "reference" grouped by variation and sample_id... I have been looking
through MySQL tutorials, but I really need to understand this better to kno=
w
how to approach this...

#Count the number of samples that have reference =3D0 for both entries for
each variation

select allele.variation_id, count(genotype.sample_id)

from allele JOIN genotype ON allele.allele_id =3D genotype.allele_id

group by allele.variation_id, genotype.sample_id;

Thank you VERY VERY much for any help/suggestions, or any books/tutorials
that I could look at to understand how to solve these problems...

-francesca

--
Michael R. Boudreau
Senior Publishing Technology Analyst
The University of Chicago Press
1427 E. 60th Street
Chicago, IL 60637
(773) 753-3298 fax: (773) 753-3383

--_000_C9B8A923C262mboudreauchicagoedu_--

Re: MySQL queries in a relational database

am 30.03.2011 17:08:49 von francesca casalino

--bcaec519643be00144049fb48d96
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

Hi and thank you for your replies and suggestions.

I realise I was not clear at all, sorry!

The problem I have with both of my questions is actually that I don't reall=
y
know how to do it and whether I am approaching the problems correctly (well=
,
just the first problem, for the second I don't even have ideas...)

I can get the counts of the alleles for each variation, and the counts of
the minor allele, but I do not know how to store and retrieve the values fo=
r
both of these select statements, in order to calculate the frequency (for
each variation_id) and insert this frequency in another MySQL table...

Here is what I can do so far:

#count number of alleles entered for each variation: This gives me 2
columns: one with the variation_id and one with the count
my $sth =3D$dbh->prepare("SELECT allele.variation_id,
count(genotype.allele_id)
from allele, genotype
where allele.allele_id =3D genotype.allele_id
group by allele.variation_id");
$sth->execute();
my (@allele_count);
while (@allele_count =3D $sth->fetchrow_array()) {
print join (",", @allele_count), "\n";
}
$sth->finish;

#Count the number of samples that have alternate (minor) allele: This gives
me two columns: one with the variation_id and one with the minor allele
counts

my $sth =3D$dbh->prepare("SELECT allele.variation_id,
count(genotype.allele_id)
from allele, genotype where allele.allele_id =3D genotype.allele_id and
allele.reference=3D1

group by allele.variation_id");
$sth->execute();
my (@min_allele_count);
while (@min_allele_count =3D $sth->fetchrow_array()) {
print join (",", @min_allele_count), "\n";
}
$sth->finish;

I then need to divide min_allele_count by allele_count to get the frequency=
,
but as it is I cannot get these values because they are not stored...I hope
this helps explaining 1) ....

Thank you for any suggestions..
-francy

2011/3/30 Michael R Boudreau

> Hi Francesca,
>
> Given the sample tables below, your query should return a single row (=91=
1=92,
> =911=92). You=92ll need to describe in more detail what you mean by =93it=
does not
> work=94. In particular, I=92d recommend adding some code to collect error
> messages, e.g.,
>
> my $sth =3D $dbh->prepare(=93...=94)
> or print =93Cannot prepare query: $DBI::errstr\n=94;
>
> $sth->execute
> or print =93Cannot execute query: $DBI::errstr\n=94;
>
>
> Here are some books that I have found helpful for learning MySQL and the
> Perl DBI:
>
> Alligator Descartes and Tim Bunce, *Programming the Perl DBI*. O=92Reilly=
,
> 2000.
>
> Paul DuBois, *MySQL*. 4th edition. Addison-Wesley, 2009.
>
> Paul DuBois, *MySQL Cookbook*. 2nd edition. O=92Reilly, 2007.
>
>
>
>
>
>
> On 3/30/11 5:58 AM, "francesca casalino"
> wrote:
>
> Dear Perl and MySQL gurus,
>
>
>
> I have been learning a lot from you for the past month, and, learning fro=
m
> your comments and with your help, I was able to build a database on MySQL
> through Perl DBI.
>
>
>
> I am now having trouble with generating a query in Perl DBI to take data
> from this database and insert it into another table in MySQL; I am still
> just at the start in learning both MySQL and Perl, and sorry if this is a
> simple question again, but I have been stuck on this for a while now=85if=
you
> have any advice on how to tackle this please let me know.
>
>
>
> My database is constructed with different tables connected to each other
> through foreign keys. This is an example of the data that I am having
> trouble with (primary and foreign keys are specified in parenthesis, and
> foreign keys refer to another table in the database). I used ENGINE=3DINN=
ODB
> for constructing the database, and MySQL version 5.1.
>
>
>
>
>
> *Allele*
>
> Allele_id (Primary key) Variation_id (Foreign key) Alle=
le
> Reference
>
> 12 1
> A 0
>
> 13 1
> G 1
>
>
>
> *Genotype*
>
> Genotype_id (Primary key) Sample_id (Foreign key) Allele_=
id
> (Foreign key)
>
> 1 Sample1
> 12
>
> 2 Sample1
> 13
>
>
>
>
>
>
>
> I am trying to:
>
> 1) Fill in another table which contains frequencies for each of these
> alleles, for each variation, and
>
> 2) Find a way to select and count the samples that have allele.reference =
=3D
> 0
> and 0 for each variation and store them in one group, and the ones that
> have
> allele.reference =3D 0 and 1 in another group, and the ones that have
> allele.reference =3D 1 and 1 ina athird group.
>
>
>
> For 1) I have solved the counts that I will need for the frequency
> calculations on MySQL, but for some reason it does not work when I try th=
e
> first of these queries on Perl. The Mysql is this:
>
>
>
> #count number of alleles entered for each variation:
>
> select allele.variation_id, count(genotype.allele_id)
>
> from allele, genotype
>
> where allele.allele_id =3D genotype.allele_id
>
> group by allele.variation_id;
>
>
>
> #Count the number of samples that have reference=3D0 for each variation:
>
> select allele.variation_id, count(genotype.allele_id)
>
> from allele, genotype
>
> where allele.allele_id =3D genotype.allele_id and allele.reference=3D0
>
> group by allele.variation_id;
>
>
>
> When I try this in Perl it does not work=85
>
> #count number of alleles entered for each variation:
>
> my $sth =3D$dbh->prepare("SELECT allele.variation_id,
> count(genotype.allele_id)
>
> from allele, genotype
>
> where allele.allele_id =3D genotype.allele_id
>
> group by allele.variation_id");
>
> $sth->execute();
>
> my (@allele);
>
> while (@allele =3D $sth->fetchrow_array()) {
>
> print @allele."\n";
>
> }
>
>
>
> 2) I really have no idea how to record the values of the successive entri=
es
> of =93reference=94 grouped by variation and sample_id=85 I have been look=
ing
> through MySQL tutorials, but I really need to understand this better to
> know
> how to approach this=85
>
>
>
> #Count the number of samples that have reference =3D0 for both entries fo=
r
> each variation
>
> select allele.variation_id, count(genotype.sample_id)
>
> from allele JOIN genotype ON allele.allele_id =3D genotype.allele_id
>
> group by allele.variation_id, genotype.sample_id;
>
>
>
>
>
> Thank you VERY VERY much for any help/suggestions, or any books/tutorials
> that I could look at to understand how to solve these problems=85
>
>
>
> -francesca
>
>
> --
> Michael R. Boudreau
> Senior Publishing Technology Analyst
> The University of Chicago Press
> 1427 E. 60th Street
> Chicago, IL 60637
> (773) 753-3298 fax: (773) 753-3383
>
>

--bcaec519643be00144049fb48d96--

Re: MySQL queries in a relational database

am 30.03.2011 20:14:24 von Darren Duncan

francesca casalino wrote:
> I am now having trouble with generating a query in Perl DBI to take data
> from this database and insert it into another table in MySQL; I am still
> just at the start in learning both MySQL and Perl, and sorry if this is a
> simple question again, but I have been stuck on this for a while nowâ¦if you
> have any advice on how to tackle this please let me know.

I don't know if you're already doing this, but you may be able to do your SELECT
and INSERT as a single SQL statement, which is simpler than pulling your data
out to Perl and pushing it in again.

insert into mytbl (mycol1, mycol2)
select mycol1, mycol2 ...

-- Darren Duncan

--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules@m .gmane.org

Re: MySQL queries in a relational database

am 31.03.2011 10:27:48 von francesca casalino

--00504502cc169562fc049fc311c8
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: quoted-printable

Hi Duncan,
Thank you for the hint about doing the insert and select in one step. I
tried that but it is not inserting any records, and it does not give me an
error message.
This is what I am doing:

#count number of alleles entered for each variation:

my $sth =3D$dbh->prepare("INSERT into frequency (total_count)
SELECT count(genotype.allele_id)
from allele, genotype
where allele.allele_id =3D genotype.allele_id
group by allele.variation_id");
$sth->execute();
$sth->finish;

Thank you for your help again,
-f

2011/3/30 Darren Duncan

> francesca casalino wrote:
>
>> I am now having trouble with generating a query in Perl DBI to take data
>> from this database and insert it into another table in MySQL; I am still
>> just at the start in learning both MySQL and Perl, and sorry if this is =
a
>> simple question again, but I have been stuck on this for a while now=85i=
f
>> you
>> have any advice on how to tackle this please let me know.
>>
>
> I don't know if you're already doing this, but you may be able to do your
> SELECT and INSERT as a single SQL statement, which is simpler than pullin=
g
> your data out to Perl and pushing it in again.
>
> insert into mytbl (mycol1, mycol2)
> select mycol1, mycol2 ...
>
> -- Darren Duncan
>

--00504502cc169562fc049fc311c8--

Re: MySQL queries in a relational database

am 31.03.2011 10:52:58 von Darren Duncan

francesca casalino wrote:
> Hi Duncan,
> Thank you for the hint about doing the insert and select in one step. I
> tried that but it is not inserting any records, and it does not give me an
> error message.
> This is what I am doing:
>
> #count number of alleles entered for each variation:
>
> my $sth =$dbh->prepare("INSERT into frequency (total_count)
> SELECT count(genotype.allele_id)
> from allele, genotype
> where allele.allele_id = genotype.allele_id
> group by allele.variation_id");
> $sth->execute();
> $sth->finish;

The first thing to try when that happens is to run the SELECT portion by itself
and see if it returns any rows. If it doesn't, then the above INSERT...SELECT
would be a no-op.

The only thing that would cause no rows for the above query is if the two tables
"allele" and "genotype" have no rows matching each other.

Although you have a count(), the fact you also have a group-by means that the
query could return zero rows.

Another thing you should do when writing SQL is to put the table join conditions
in the FROM clause rather than the WHERE clause, like this:

INSERT into frequency (total_count)
SELECT count(genotype.allele_id)
from allele inner join genotype on allele.allele_id = genotype.allele_id
group by allele.variation_id;

Moreover, because you are joining on common column names, this shorter version
is better:

INSERT into frequency (total_count)
SELECT count(allele_id)
from allele inner join genotype using (allele_id)
group by allele.variation_id;

When you say "using", besides being shorter, the two matching columns are also
combined into one automatically, so you can just say "allele_id" elsewhere and
it won't be confused about "which table" since it doesn't matter. (The only
time it might matter is if you're doing an outer join and you want to explicitly
test that a row didn't match by testing that the other column copy is null.)

-- Darren Duncan

--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules@m .gmane.org

Re: MySQL queries in a relational database

am 31.03.2011 11:31:20 von francesca casalino

--001636e0a87bba1ca6049fc3f4c2
Content-Type: text/plain; charset=ISO-8859-1

Hi Duncan,

Thank you again for the very useful advice! I am not sure how much better it
is to have the insert stetment like this and I think I am forced to use
placeholders in this case, since I realised that I also need the
last_insert_id for the population and for the variation_id that are foreign
keys in the frequency table, so the new error that comes up is "Cannot add
or update a child row: a foreign key constraint fails"...

But your join structure is much more clean so I will definitly use that.
What I am trying to do is to calculate a frequency with the two counts I get
from two insert statements, and for now I am getting the two counts
seperately, but I do not know how to go about to calculate the division
between these counts in the most convenient way: should I try to do it
directly in the same statement, or maybe I can insert these columns in the
"frequency" table and then retrieve them and devide them? I really do not
kow how to go about doing this so if anybody has an idea could you please
let me know?

I tried dividing the two sleect statements (giving counts), but this is the
error message I get:
"Argument "SELECT allele.variation_id, count(genotype.allele_id) as..."
isn't numeric in division (/) at maf.pl line 16.
Illegal division by zero at maf.pl line 16"

I was also trying to isert the two statements into one using SELECT (SELECT
but I get the error message "DBD::mysql::st execute failed: Subquery returns
more than 1 row at freq.pl line 26." And I don't think this would solve my
problem anyway...Just trying things but I can't seem to get to the answer...

Thank you for any suggestions...
-f
2011/3/31 Darren Duncan

> francesca casalino wrote:
>
>> Hi Duncan,
>> Thank you for the hint about doing the insert and select in one step. I
>> tried that but it is not inserting any records, and it does not give me an
>> error message.
>> This is what I am doing:
>>
>> #count number of alleles entered for each variation:
>>
>> my $sth =$dbh->prepare("INSERT into frequency (total_count)
>> SELECT count(genotype.allele_id)
>> from allele, genotype
>> where allele.allele_id = genotype.allele_id
>> group by allele.variation_id");
>> $sth->execute();
>> $sth->finish;
>>
>
> The first thing to try when that happens is to run the SELECT portion by
> itself and see if it returns any rows. If it doesn't, then the above
> INSERT...SELECT would be a no-op.
>
> The only thing that would cause no rows for the above query is if the two
> tables "allele" and "genotype" have no rows matching each other.
>
> Although you have a count(), the fact you also have a group-by means that
> the query could return zero rows.
>
> Another thing you should do when writing SQL is to put the table join
> conditions in the FROM clause rather than the WHERE clause, like this:
>
>
> INSERT into frequency (total_count)
> SELECT count(genotype.allele_id)
> from allele inner join genotype on allele.allele_id = genotype.allele_id
> group by allele.variation_id;
>
> Moreover, because you are joining on common column names, this shorter
> version is better:
>
>
> INSERT into frequency (total_count)
> SELECT count(allele_id)
> from allele inner join genotype using (allele_id)
> group by allele.variation_id;
>
> When you say "using", besides being shorter, the two matching columns are
> also combined into one automatically, so you can just say "allele_id"
> elsewhere and it won't be confused about "which table" since it doesn't
> matter. (The only time it might matter is if you're doing an outer join and
> you want to explicitly test that a row didn't match by testing that the
> other column copy is null.)
>
> -- Darren Duncan
>

--001636e0a87bba1ca6049fc3f4c2--