Re: Doing Summation on multiple criterias on the same column in a

Re: Doing Summation on multiple criterias on the same column in a

am 31.03.2008 21:28:13 von Joe Celko

The example of a medical questionnaire is very appropriate for me
right now. I am getting a physical on 2008-04-01 and just had to fill
a four-page basic intake questionnaire I got in the mail.

1)The pre-existing conditions are asked as yes/no questions for the
intake form ("Do you have high cholesterol?") so that they can be
measured on an appropriate scale later in the exam (LDL cholesterol,
HDL cholesterol, and triglycerides)
2)The surgery list asks for the calendar year of the operations. Not
just yes/no, not within a range of years past, but the actual calendar
year. They want the fact, not a flag.
3)The family history also asks about the calendar years when family
members were diagnosed for heart problems, cancer, etc. Not just yes/
no, not within a range of years past, but the actual calendar year.
They want the fact, not a flag.
4)The "life style" questions are also detailed and not just flags;
they want measurements.
1.Do you use tobacco? What kind? (cigarettes, cigars, snuff, etc.)
How much?
2.Do you drink alcohol? What kind (beer, wine, liquor, etc.) How many
drinks per week?
3.Do you use caffeine? What kind? (coffee, tea, etc.) How many drinks
per day?
4.How many sex partners do you have? What genders? Animals don't seem
to count:)
5)Male and female conditions are clearly separated to avoid
conflicting data entries. One of the problems with flags is that
certain combinations might not be valid data -- "pregnant men" -- and
you need elaborate CHECK() constraints to avoid bad data. But this is
a Data Quality issue.

This sort of form is for intake only; it is not meant to be a medical
record. The actual database will contain my blood pressure, blood
type, cholesterol level and any tests indicated by the intake form --
not a yes/no flag for "do you have blood:)

Now, we are into data quality issues and the use of scales and
measurement. There standards for the acceptable levels of error and
risk in particular industries. There are measures of "fuzziness" in
data.

While all of this DQ stuff is important, it has little to do with the
use of flags in an RDBMS.