Complex sql query?

Complex sql query?

am 06.03.2006 13:25:25 von jardar.maatje

I am logging scientific data. For this a normal relationship database
is not id=E9al but I am going to try it.

I have basically three tables like:

datarecord:
* idx - integer
* time - datetime
* type - integer

metadata:
* idx - integer
* unit - varchar(20) (for example m,bar,m/s,celcius)
* name - text (for exampel Pressure, Temperature)

datavalue:
* idx - integer
* rec_id - integer (foreign key pointing to datarecord.idx)
* meta_id - integer (foreign key pointing to metadata.idx)
* value - real

An example dataset would be:
datarecord
idx time type
1 01.02.2006 12:12:00 1
2 01.02.2006 12:13:00 1

metadata:
idx unit name
1 bar pressure
2 celcius temperature
3 degrees heading
4 celcius maxtemp
5 celcius mintemp


datavalue
idx rec_id meta_id value
1 1 1 10.3
2 1 2 44.2
3 1 3 34
4 1 4 55.3
5 2 1 9.2
6 2 2 44.0
7 2 3 48
8 2 4 54.3
9 2 5 22.8

I would like to construct a query that for example could give me time,
pressure, temperature, maxtemperature like:
01.02.2006 10.3 44.2 55.3
01.02.2006 9.2 44.0 54.3

Is this possible with SQL? And how do I do this?

Best regards

Jardar

Re: Complex sql query?

am 06.03.2006 15:50:41 von zeljko.prince

Yes it is possible.

In order to retrieve time, pressure, temperature and maximum
temperature from tables you've defined, you need to execute the
following query:

SELECT datarecord.time,(SELECT datavalue.value FROM datavalue WHERE
meta_id=1 AND datavalue.rec_id=datarecord.idx),(SELECT datavalue.value
FROM datavalue WHERE meta_id=2 AND
datavalue.rec_id=datarecord.idx),(SELECT datavalue.value FROM datavalue
WHERE meta_id=4 AND datavalue.rec_id=datarecord.idx);

But your final query will depend on several parameters: perhaps you do
not want hard-code the meta_id ("WHERE meta_id=1") but would like more
sophisticated solution.
Furthermore, primary keys should be defined in order previous query not
to crash if there are several rows in datavalue, all matching the same
rec_id.

Zeljko

Re: Complex sql query?

am 06.03.2006 15:55:55 von zeljko.prince

Yes it is possible.

In order to retrieve time, pressure, temperature and maximum
temperature from tables you've defined, you need to execute the
following query:

SELECT datarecord.time,(SELECT datavalue.value FROM datavalue WHERE
meta_id=1 AND datavalue.rec_id=datarecord.idx),(SELECT datavalue.value
FROM datavalue WHERE meta_id=2 AND
datavalue.rec_id=datarecord.idx),(SELECT datavalue.value FROM datavalue
WHERE meta_id=4 AND datavalue.rec_id=datarecord.idx);

But your final query will depend on several parameters: perhaps you do
not want hard-code the meta_id ("WHERE meta_id=1") but would like more
sophisticated solution.
Furthermore, primary keys should be defined in order previous query not
to crash if there are several rows in datavalue, all matching the same
rec_id.

Zeljko

Re: Complex sql query?

am 06.03.2006 19:40:28 von Bill Karwin

wrote in message
news:1141647925.427260.307510@i39g2000cwa.googlegroups.com.. .
> I would like to construct a query that for example could give me time,
> pressure, temperature, maxtemperature like:
> 01.02.2006 10.3 44.2 55.3
> 01.02.2006 9.2 44.0 54.3

If you can rely on the values in the metadata table, you can do this:

SELECT r.time, p.value, t.value, tmax.value
FROM datarecord AS r
LEFT OUTER JOIN datavalue AS p ON r.idx = p.rec_id AND p.meta_id = 1
LEFT OUTER JOIN datavalue AS t ON r.idx = t.rec_id AND t.meta_id = 2
LEFT OUTER JOIN datavalue AS tmax ON r.idx = tmax.rec_id AND tmax.meta_id
= 4;

But it would be safer to use the following query, which assumes metadata
records have reliable names, but not reliable idx values:

SELECT r.time, p.value, t.value, tmax.value
FROM datarecord AS r
LEFT OUTER JOIN (datavalue AS p INNER JOIN metadata AS pm ON p.meta_id =
pm.idx AND pm.name = 'pressure') ON r.idx = p.rec_id
LEFT OUTER JOIN (datavalue AS t INNER JOIN metadata AS tm ON t.meta_id =
tm.idx AND tm.name = 'temperature') ON r.idx = t.rec_id
LEFT OUTER JOIN (datavalue AS tmax INNER JOIN metadata AS tmaxm ON
tmax.meta_id = tmaxm.idx AND tmaxm.name = 'maxtemp') ON r.idx = tmax.rec_id;

I have to agree with you that this schema is not ideal for this purpose.
- There's no way to ensure that a datavalue exists for pressure,
temperature, and maxtemp for a given datarecord. That's why you should use
outer joins instead of inner joins, or else some of your records may seem to
disappear from the result set.
- The 'metadata' table is guilty of mixing data and metadata. You can see
the difficulty this causes.
- The datavalue table is guilty of assigning five different meanings to the
values in the 'value' column.

IMHO, it would be better to make pressure, temperature, and maxtemp be
attribute columns of the datarecord table.

Regards,
Bill K.

Re: Complex sql query?

am 07.03.2006 02:30:19 von avidfan

jardar.maatje@gmail.com wrote:
> I am logging scientific data. For this a normal relationship database
> is not idéal but I am going to try it.
>
> I have basically three tables like:
>
> datarecord:
> * idx - integer
> * time - datetime
> * type - integer
>
> metadata:
> * idx - integer
> * unit - varchar(20) (for example m,bar,m/s,celcius)
> * name - text (for exampel Pressure, Temperature)
>
> datavalue:
> * idx - integer
> * rec_id - integer (foreign key pointing to datarecord.idx)
> * meta_id - integer (foreign key pointing to metadata.idx)
> * value - real
>
> An example dataset would be:
> datarecord
> idx time type
> 1 01.02.2006 12:12:00 1
> 2 01.02.2006 12:13:00 1
>
> metadata:
> idx unit name
> 1 bar pressure
> 2 celcius temperature
> 3 degrees heading
> 4 celcius maxtemp
> 5 celcius mintemp
>
>
> datavalue
> idx rec_id meta_id value
> 1 1 1 10.3
> 2 1 2 44.2
> 3 1 3 34
> 4 1 4 55.3
> 5 2 1 9.2
> 6 2 2 44.0
> 7 2 3 48
> 8 2 4 54.3
> 9 2 5 22.8
>
> I would like to construct a query that for example could give me time,
> pressure, temperature, maxtemperature like:
> 01.02.2006 10.3 44.2 55.3
> 01.02.2006 9.2 44.0 54.3
>
> Is this possible with SQL? And how do I do this?
>
> Best regards
>
> Jardar
>


First of all this is NOT a complex query, if it is for you then you
might consider starting with something like "SQL for Dummies" -
excellent beginner book! or similar to understand join and outer join
query constructs. Another is might be to de-normalize this a bit...

One reason to "normalize" a database is to reduce repeating data but
in your metadata table, you will be repeating unit many times, however
there are times when you must denormalize for various reasons
(performance is one simplicity another).

When inserting data into this database you must insert data in two
tables for EACH record - and the way you need to code it would be to
insert into datarecord - get the resulting idx (assuming it is an
autokey or next value), the return to the program and insert into
datavalue table - very expensive in terms of I/O.

I would suggest combining datarecord and datavalue (denormalize) or
redesign taking the previous paragraph into consideration.

given:
insert into datarecord values (date_time, presssure, temp, heading,etc.)

consider min and max temp (anything) can be calculated -(Select
min(temp) from datavalue where date_time betweed date1 and date2;)
You really only have one value for temperature at any give point in time
(date_time) you cannot not have a mintemp, maxtemp and temp that are
different at the same point in time - all 3 values will be the same.

Re: Complex sql query?

am 07.03.2006 10:02:50 von jardar.maatje

Thanks a lot for very good answers. Then an explanation of why I
constructed the database as I did. The issue is that I do not
necesarility know upfront what data will be put into the database. Thus
why datarecord and datavalue is split.
Then the reason why metadata was not normalized was because it made the
structure more simple. This is also a small table so optimization in
terms is datastorage is not that important.
And I agree with the max/min temp it is no need to split it up.
A last reason why the datavalue and datarecord was separated was the I
would like to store arrays with a datarecord as well. Thus another
table arraydata which should work almost as datarecord, but without
timestamp.

arraydata:
* id
* rec_id
* meta_id

An example data set would be:

metadata:
id unit name
1 bar pressure
2 celcius temperature
3 m altitude
4 none altitude profile - old method
5 none altitude profile - new method


datarecord
id time type
1 01.02.2006 12:12:00 1
2 01.02.2006 12:13:00 1

arraydata:
id rec_id meta_id
3 1 4
4 2 4
5 2 5

datavalue:
id rec_id meta_id value
6 3 1 0.1
7 3 2 10.2
8 3 3 100
9 3 1 0.2
10 3 2 8.5
11 3 3 200
12 3 1 0.3
13 3 2 6.4
14 3 3 300
15 4 1 0.11
16 4 2 10.5
17 4 3 100
18 4 1 0.22
19 4 2 8.8
20 4 3 200
21 4 1 0.33
22 4 2 6.5
23 4 3 300
24 5 1 0.12
25 5 2 10.6
26 5 3 110
27 5 1 0.23
29 5 3 210
30 5 1 0.34
31 5 2 6.6
32 5 3 310
33 5 1 0.44
34 5 2 5.6
35 5 3 410


>From this I would like to be able to get something like this:
date meta_id pressure temp altitude
01.02.2006 12:12:00 4 0.1 10.2 100
01.02.2006 12:12:00 4 0.2 8.5 200
01.02.2006 12:12:00 4 0.3 10.2 300
01.02.2006 12:13:00 4 0.11 10.5 100
01.02.2006 12:13:00 4 0.22 8.8 200
01.02.2006 12:13:00 4 0.33 6.5 300
01.02.2006 12:13:00 5 0.12 10.6 110
01.02.2006 12:13:00 5 0.23 NULL 210
01.02.2006 12:13:00 5 0.34 6.6 310
01.02.2006 12:13:00 5 0.44 5.6 410

How would I construct a query like that?

Jardar

Re: Complex sql query?

am 07.03.2006 15:03:31 von jardar.maatje

I tried to do it with a query similar to (meta_id=11 is for array
data):

SELECT a2.time, d1.value
FROM (SELECT r2.time AS time, a.id AS arrid
FROM norsci_record AS r2
JOIN norsci_arraydata AS a ON r2.id = a.rec_id AND a.meta_id = 11) as
a2
LEFT JOIN norsci_realdata AS d1 ON d1.rec_id = a2.arrid AND d1.meta_id
= 5;

This seems to work with one datavalue, however it does not work when I
add another join like :
SELECT a2.time, a2.arrid, d1.value, d2.value
FROM
(SELECT r2.time AS time, a.id AS arrid
FROM norsci_record AS r2
JOIN norsci_arraydata AS a ON r2.id = a.rec_id AND a.meta_id = 11) as
a2
LEFT JOIN norsci_realdata AS d1 ON d1.rec_id = a2.arrid AND d1.meta_id
= 5
LEFT JOIN norsci_realdata AS d2 ON d2.rec_id = a2.arrid AND d2.meta_id
= 6;

I then get squared number of values back instead.

Best regards

Jardar

Re: Complex sql query?

am 08.03.2006 02:27:11 von Bill Karwin

"Utopico" wrote in message
news:1141722170.534927.26530@j33g2000cwa.googlegroups.com...
> The issue is that I do not
> necesarility know upfront what data will be put into the database.

I don't mean to be unsympathetic, but if this is the case, then creating a
database at all is premature. You might want to start by storing your
scientific data in spreadsheets, and then once you have some better idea of
the data model, implement it in a normalized fashion. You'll find that the
path you're on (an approach called Entity-Attribute-Value or EAV) is very
difficult to get right.

Regards,
Bill K.

Re: Complex sql query?

am 09.03.2006 09:59:26 von jardar.maatje

It might be that it is EAV that I try to achive. Normally I would go
for a standard database structure where the data is embedded in the
datarecord. But since the data can come from different instruments that
can have different types of data. It is also nice to be able to attache
metadata directly to the data.

In any case:
Are there any good articles on using EAV?
Are there any alternative databases that are made for this kind of
data? I have had a look at HDF and NetCDF and they might suit my needs
but are are not as standard as relational databases.

Jardar

Re: Complex sql query?

am 09.03.2006 19:21:01 von Bill Karwin

"Utopico" wrote in message
news:1141894766.479577.228900@j33g2000cwa.googlegroups.com.. .
> Are there any good articles on using EAV?

Here's one:
http://ycmi.med.yale.edu/nadkarni/Introduction%20to%20EAV%20 systems.htm

The EAV paradigm seems to be used widely in medical database systems, based
on Google searches I've done.

> Are there any alternative databases that are made for this kind of
> data? I have had a look at HDF and NetCDF and they might suit my needs
> but are are not as standard as relational databases.

For data that is both heirarchical and array-like, is extensible ad hoc, and
is structured even though it has no predefined schema, I'd choose XML.
There are lots of tools to manipulate XML, and even W3C standardized query
languages XSL and XPath.

Regards,
Bill K.