column definition that references other data in same row

am 17.06.2006 19:37:45 von Steve

I realize that this probably isn't a best practice, but I'm working with
legacy code that has a query stored in one column of a table. Because
the queries vary, the JSP page that selects a query and runs it cannot
add any additional information (like a WHERE clause).

I need to add a few more records to the the table, and would like the
query to include a value from another field in the current row in a
WHERE clause -- something like 'WHERE id = this.idlist' to use a
Java-like syntax, where 'this' would mean a value retrieved from the
current row of this table.

I.e., the table will have rows like the following:

| id | idlist | query |
------ --------- --------------------------------------------
3 12 'SELECT * FROM X WHERE id = this.idlist'

So that the query value ends up as:

'SELECT * FROM X WHERE id = 12'

Is such a thing possible?

Re: column definition that references other data in same row

am 17.06.2006 20:22:42 von gordonb.xbyzx

>I realize that this probably isn't a best practice, but I'm working with
>legacy code that has a query stored in one column of a table. Because
>the queries vary, the JSP page that selects a query and runs it cannot
>add any additional information (like a WHERE clause).

Since when is anything in a WHERE clause a *definition* of a column?

>I need to add a few more records to the the table, and would like the
>query to include a value from another field in the current row in a

Define: current row. Is this anything besides the row you are adding?

>WHERE clause -- something like 'WHERE id = this.idlist' to use a
>Java-like syntax, where 'this' would mean a value retrieved from the
>current row of this table.

Define: current row. Is this anything besides the row you are adding?

Isn't this adding a WHERE clause like you said you couldn't do?

You can have a query:
SELECT * FROM X WHERE id = idlist

which selects rows where id and idlist in the same row are equal.
It does *not* select based on the row where the query came from,
as there is no concept of "the row where the query came from" in
SQL, and the vast majority of queries do not come from rows.

>
>I.e., the table will have rows like the following:
>
>| id | idlist | query |
> ------ --------- --------------------------------------------
> 3 12 'SELECT * FROM X WHERE id = this.idlist'
>
>
>So that the query value ends up as:
>
>'SELECT * FROM X WHERE id = 12'
>
>Is such a thing possible?

If you want to substitute values into queries, substitute values into
queries. This normally involves changing the code.

Gordon L. Burditt

Re: column definition that references other data in same row

am 18.06.2006 04:21:25 von Steve

Gordon Burditt wrote:
>> I realize that this probably isn't a best practice, but I'm working with
>> legacy code that has a query stored in one column of a table. Because
>> the queries vary, the JSP page that selects a query and runs it cannot
>> add any additional information (like a WHERE clause).
>
> Since when is anything in a WHERE clause a *definition* of a column?

It isn't. But one column of this table contains queries as text. I.e.,
the query I'm referring to is data in the table. The JSP retrieves the
appropriate query from the table and runs it.

>
>> I need to add a few more records to the the table, and would like the
>> query to include a value from another field in the current row in a
>
> Define: current row. Is this anything besides the row you are adding?

No, other than semantics. By current row I mean the row I have added,
but at the time I retrieve it as opposed to at the time I add it.

>
>> WHERE clause -- something like 'WHERE id = this.idlist' to use a
>> Java-like syntax, where 'this' would mean a value retrieved from the
>> current row of this table.
>
> Define: current row. Is this anything besides the row you are adding?

See above.

>
> Isn't this adding a WHERE clause like you said you couldn't do?

No, because the WHERE clause is in the contents of the field as opposed
to in the JSP.

>
> You can have a query:
> SELECT * FROM X WHERE id = idlist
>
> which selects rows where id and idlist in the same row are equal.
> It does *not* select based on the row where the query came from,
> as there is no concept of "the row where the query came from" in
> SQL, and the vast majority of queries do not come from rows.

No, but data does, and there are certainly plenty of functions that work
on data from the current row (IF, CONCAT, etc.). What I'd like to do is
embed some of that type of logic into a field at the time the record is
written as opposed to having to write external logic to handle the task.

The queries I'm speaking are quite varied, which is why they are stored
in a table. The problem is that for the simpler queries, the WHERE
clause simply uses the autonumber field for this record. My current
solution is to add a record, then edit it so I can see what the
autonumber field was and hard-code that into the query stored in that
row. I'd like something a little more dynamic.

>
>
>> I.e., the table will have rows like the following:
>>
>> | id | idlist | query |
>> ------ --------- --------------------------------------------
>> 3 12 'SELECT * FROM X WHERE id = this.idlist'
>>
>>
>> So that the query value ends up as:
>>
>> 'SELECT * FROM X WHERE id = 12'
>>
>> Is such a thing possible?
>
> If you want to substitute values into queries, substitute values into
> queries. This normally involves changing the code.
>
> Gordon L. Burditt

Re: column definition that references other data in same row

am 22.06.2006 23:18:55 von kreeves

I'm trying to do this same thing. Have a table that has a column
definition that finds it's value based on values in the same row.

This sort of practice is used on Access very often.

Re: column definition that references other data in same row

am 22.06.2006 23:56:28 von Bill Karwin

Steve wrote:
> | id | idlist | query |
> ------ --------- --------------------------------------------
> 3 12 'SELECT * FROM X WHERE id = this.idlist'
>
> So that the query value ends up as:
>
> 'SELECT * FROM X WHERE id = 12'
>
> Is such a thing possible?

Certainly it is possible, but you'll have to parse the query string and
substitute values into it. Probably involving coming up with your own
notation in the query string.

But this seems like perpetuating -- and even extending -- a code design
that was a mistake to begin with. Sounds like the normalization of this
database is completely broken, if idlist can reference different things
_per row_.

If this were an object-oriented system like Java, it'd be like using a
Collection of objects to store objects of different types. This is a
not good to do, because it's prone to errors that are hard to diagnose.
Hence Java 5 introduced generics, to catch such mistakes at compile-time.

What you're describing doing with SQL is similarly a bad design. Any
set (e.g. a column) should contain only objects of the same type. I
don't mean primitive datatypes like INT, I mean the entities should be
members of the same logical domain.

For instance, you wouldn't use a single INT to store either a person's
age, or the number of dogs they own, or a foreign key to some other
table. It's too confusing to tell which of these meanings to use on a
given row; if you make a mistake, you can make false conclusions about
the attribute.

If idlist can reference multiple things in other tables, this is a clue
that idlist should be multiple columns. Each column references one
_type_ of thing, and those that are not relevant on a given row should
have a NULL state.

If your database is properly normalized, you shouldn't need to store
different queries per row. You'd be better off spending your time
fixing this legacy code than extending it.

Regards,
Bill K.

Re: column definition that references other data in same row

am 28.06.2006 18:17:12 von Steve

I think you're misunderstanding my app, based on a column name. idlist
is not a list of id's -- therefore no collection is needed. It is the
id of a list, since this is a distribution list application. For a
particular project, the distribution list might be "everyone in plant
engineering for plant 123 and everyone on capital expenditures committee
and everyone in finance over a certain grade level". So the table is
basically storing decision rules in the form of queries.

But, there are a lot of manually created lists for which the
distribution is "everyone on this list", which is where the
self-referencing part comes in -- for a row in this table with idlist of
12, a matching table would have a set of entries each with an employee
id and the idlist of 12 for this list.

And yes, this design could be normalized, but that would be by using
some sort of criteria object, and a self-referencing table that and-ed
and or-ed existing criteria into new criteria, etc. For the amount of
this that needs to be done, creating and debugging everything associated
with it would be far more effort and risk than the current approach.

I did come up with a solution -- since there is no case where there is
both a complex criteria and a simple one, I just checked the query
column for NULL, and if it was, manually create the simple criteria from
the idlist field. Resulting in a minimal and controlled amount of
messing with the JSP code.

Bill Karwin wrote:
> Steve wrote:
>> | id | idlist | query |
>> ------ --------- --------------------------------------------
>> 3 12 'SELECT * FROM X WHERE id = this.idlist'
>>
>> So that the query value ends up as:
>>
>> 'SELECT * FROM X WHERE id = 12'
>>
>> Is such a thing possible?
>
> Certainly it is possible, but you'll have to parse the query string and
> substitute values into it. Probably involving coming up with your own
> notation in the query string.
>
> But this seems like perpetuating -- and even extending -- a code design
> that was a mistake to begin with. Sounds like the normalization of this
> database is completely broken, if idlist can reference different things
> _per row_.
>
> If this were an object-oriented system like Java, it'd be like using a
> Collection of objects to store objects of different types. This is a
> not good to do, because it's prone to errors that are hard to diagnose.
> Hence Java 5 introduced generics, to catch such mistakes at compile-time.
>
> What you're describing doing with SQL is similarly a bad design. Any
> set (e.g. a column) should contain only objects of the same type. I
> don't mean primitive datatypes like INT, I mean the entities should be
> members of the same logical domain.
>
> For instance, you wouldn't use a single INT to store either a person's
> age, or the number of dogs they own, or a foreign key to some other
> table. It's too confusing to tell which of these meanings to use on a
> given row; if you make a mistake, you can make false conclusions about
> the attribute.
>
> If idlist can reference multiple things in other tables, this is a clue
> that idlist should be multiple columns. Each column references one
> _type_ of thing, and those that are not relevant on a given row should
> have a NULL state.
>
> If your database is properly normalized, you shouldn't need to store
> different queries per row. You'd be better off spending your time
> fixing this legacy code than extending it.
>
> Regards,
> Bill K.

Re: column definition that references other data in same row

am 28.06.2006 22:24:11 von Bill Karwin

Steve wrote:
> I think you're misunderstanding my app, based on a column name.

No, I assumed that idlist was a value, not a list of values. You hadn't
mentioned that it was a list, and the way you were using it in the
example in your original post showed that it was a single value.

What I was cautioning against was using the criteria differently on each
row. You have a column which is a string forming a SQL statement. This
suggests that the criteria are potentially different on each row.
Possibly idlist is used differently on each row.

So for example, I was anticipating that something the following might occur:

| id | idlist | query |
---- -------- --------------------------------------------
3 12 'SELECT * FROM user_directory WHERE user_id = this.idlist'
4 8675309 'SELECT * FROM department_users WHERE dept_no = this.idlist'

The usage of the idlist column to reference logically different
attributes in other tables is against database normalization. It's
extremely fragile, depends on the application-level logic knowing the
conventions perfectly. Frequently designs like this unravel as more
complex conditions are added.

Also if schema changes occur, like a table or column being renamed, it's
hard to debug when some of the SQL is in your application, and some of
it is in the database. You may not find errors for weeks or months,
because some of these per-row queries are not run regularly.

It's better to keep separate columns for the separate criteria, and use
them consistently. Name the columns for the logical attribute they
reference. If one or the other is irrelevant in a certain case, then
enter NULL.

| id | user_id | dept_no | committee_id | ...etc... |
---- --------- --------- ------------------------------
3 12 NULL NULL
4 NULL 8675309 NULL

This requires that you codify _all_ the possible complex criteria for
your user groups. And the above table does not distinguish between AND
and OR, so if you list non-null values in more than one column, it's up
to the application to figure that out.

Most traditional email distribution lists are effectively "OR" only.
You can list individual recipients, or even additional distribution
lists. But the only way to combine them is inclusion; there's no
feature of making a list like, "everyone in list A except those who are
in list B". If you need that level of control, then perhaps you do need
to do it the way you are doing it.

I do understand that real-life scenarios are complex collections of
exception cases, and do not follow patterns. Designing a database in a
well-normalized way requires some kind of consistency. You're trying to
model an organization in which you cannot assume any patterns.

Anyway, I'm glad you found a solution.

Regards,
Bill K.