Help with constructing a SQL query

Help with constructing a SQL query

am 03.12.2007 13:49:44 von Marcus Claesson

Hi!

I have a "SQL query construction" question that I hope someone can help
me with. After comparing a bunch of DNA fragments (see name below) with
a larger reference sequence I get a ordered list ranked according to
similarities, and with start/stop co-ordinates where the fragments map
to the reference sequence:

+------+------+-------+------+----------+
| name | rank | start | stop | sub_rank |
+------+------+-------+------+----------+
| A | 1 | 1 | 1000 | NULL |
| B | 2 | 2 | 998 | NULL |
| C | 4 | 1100 | 2000 | NULL |
| D | 3 | 3050 | 4100 | NULL |
| E | 5 | 2040 | 3000 | NULL |
| F | 6 | 1102 | 2000 | NULL |
| G | 7 | 1098 | 1998 | NULL |
| H | 8 | 3048 | 4100 | NULL |
| I | 9 | 3051 | 4102 | NULL |
+------+------+-------+------+----------+

A graphical representation of fragments mapped to the ref sequence:

ref========================================>
1 A------>
2 B---->
3 D------>
4 C------>
5 E---->
6 F------->
7 G------->
8 H------->
9 I------->

Now, I want to group fragments in each overlapping position and sub-rank
them according to their rank in that position. The final table would
then look like:
+------+------+-------+------+----------+
| name | rank | start | stop | sub_rank |
+------+------+-------+------+----------+
| A | 1 | 1 | 1000 | 1 |
| B | 2 | 2 | 998 | 2 |
| C | 4 | 1100 | 2000 | 1 |
| D | 3 | 3050 | 4100 | 1 |
| E | 5 | 2040 | 3000 | 1 |
| F | 6 | 1102 | 2000 | 2 |
| G | 7 | 1098 | 1998 | 3 |
| H | 8 | 3048 | 4100 | 2 |
| I | 9 | 3051 | 4102 | 3 |
+------+------+-------+------+----------+

Is this possible to achieve using SQL queries alone (perhaps with GROUP
BY, nested SELECTs etc)?

I've managed to do this with a Perl-DBI script, but would much prefer to
do it completely with MySQL instead. The Perl code is below and below
that is the MySQL-dump of the test data set...

Many thanks in advance!
Marcus


while (@{$dbh->selectcol_arrayref("SELECT rank FROM test WHERE sub_rank
IS NULL")}) {
@null_sub_ranks = @{$dbh->selectcol_arrayref("SELECT rank FROM test
WHERE sub_rank IS NULL AND NOT (start>=(SELECT stop FROM test WHERE rank
= (SELECT min(rank) FROM test WHERE sub_rank IS NULL)) OR stop <=
(SELECT start FROM test WHERE rank = (SELECT min(rank) FROM test WHERE
sub_rank IS NULL)))")};
for ($rank=0; $rank < scalar(@null_sub_ranks); $rank++ ) {
$sub_rank = $rank + 1;
$dbh->do("UPDATE test SET sub_rank=$sub_rank WHERE rank=
$null_sub_ranks[$rank]");
}
}


-- MySQL dump 10.10
--
-- Host: localhost Database: bxb
-- ------------------------------------------------------
-- Server version 5.0.22

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS,
FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO'
*/;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Table structure for table `test`
--

DROP TABLE IF EXISTS `test`;
CREATE TABLE `test` (
`name` text,
`rank` int(11) default NULL,
`start` int(11) default NULL,
`stop` int(11) default NULL,
`sub_rank` int(11) default NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

--
-- Dumping data for table `test`
--


/*!40000 ALTER TABLE `test` DISABLE KEYS */;
LOCK TABLES `test` WRITE;
INSERT INTO `test` VALUES
('A',1,1,1000,NULL),('B',2,2,998,NULL),('C',4,1100,2000,NULL ),('D',3,3050,4100,NULL),('E',5,2040,3000,NULL),('F',6,1102, 2000,NULL),('G',7,1098,1998,NULL),('H',8,3048,4100,NULL),('I ',9,3051,4102,NULL);
UNLOCK TABLES;
/*!40000 ALTER TABLE `test` ENABLE KEYS */;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;



--
MySQL Perl Mailing List
For list archives: http://lists.mysql.com/perl
To unsubscribe: http://lists.mysql.com/perl?unsub=gcdmp-msql-mysql-modules@m .gmane.org