Unwanted Multiplication Due to Join

Unwanted Multiplication Due to Join

am 31.08.2006 02:09:13 von Cirene

I have a series of tables as below:
CREATE TABLE `collectorfeetable` (
`colFeeID` int(10) unsigned NOT NULL default '0',
`colFeeAmount` decimal(10,2) NOT NULL default '0.00',
`colTimeStamp` datetime default NULL,
`colWorklist` char(4) NOT NULL default '',
PRIMARY KEY (`colFeeID`),
KEY `feeamounts` (`colFeeAmount`),
KEY `feesandworklist` (`colFeeAmount`,`colWorklist`),
KEY `worklistandfee` (`colWorklist`,`colFeeAmount`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=1

CREATE TABLE `collectorworklists` (
`colWorklistxref` int(10) unsigned NOT NULL auto_increment,
`collectorID` char(4) NOT NULL default '',
`collectorWorklist` char(5) character set utf8 NOT NULL default '',
`collectorWorklistLastTimeStamp` datetime NOT NULL default
'0000-00-00 00:00:00',
`collectorWorklistCallSheet` tinyint(1) NOT NULL default '0',
PRIMARY KEY (`colWorklistxref`),
KEY `Worklist` (`collectorWorklist`)
) ENGINE=MyISAM AUTO_INCREMENT=61 DEFAULT CHARSET=latin1 PACK_KEYS=1

CREATE TABLE `collectordailygoals` (
`collectorID` char(4) NOT NULL default '',
`GoalDAY` tinyint(3) unsigned NOT NULL default '0',
`PromiseGoal` smallint(5) unsigned default NULL,
`WorkedGoal` smallint(5) unsigned default NULL,
`goalUpdate` timestamp NOT NULL default '0000-00-00 00:00:00',
PRIMARY KEY (`collectorID`,`GoalDAY`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=1

The problem query is:
SELECT sum(cf.colfeeamount) , collectordailygoals.collectorid from
collectorfeetable cf inner join collectorworklists cw on
cw.collectorworklist = cf.colworklist inner join collectordailygoals on
collectordailygoals.collectorid = cw.collectorid where
month(cf.coltimestamp) = month(current_date()) and year(cf.coltimestamp)
= year(current_date()) group by collectordailygoals.collectorid order by
collectorid ;

My problem is that the sum is multiplied by the number of times a
collector ID is present in the collectordailygoals table. How do I set
up the join such that it looks at only unique instances of the collector
ID in the collectordailygoals table? Alternatively how would I divide
the sum by the number of instances that the collector ID shows up in the
collectordailygoals table? Despite the presence of the goalupdate
timestamp this particular table does not hold historical data. That is
kept in another table.

When I get a chance I will look into overhauling the database structure.
Until then I am stuck with what you see.

Thanks.

Re: Unwanted Multiplication Due to Join

am 01.09.2006 19:07:05 von Cirene

I forgot to mention that this is with 5.0.24 (not 5.0.24a yet) on
Windows 2003. Oops.

No bother wrote:
> I have a series of tables as below:
> CREATE TABLE `collectorfeetable` (
> `colFeeID` int(10) unsigned NOT NULL default '0',
> `colFeeAmount` decimal(10,2) NOT NULL default '0.00',
> `colTimeStamp` datetime default NULL,
> `colWorklist` char(4) NOT NULL default '',
> PRIMARY KEY (`colFeeID`),
> KEY `feeamounts` (`colFeeAmount`),
> KEY `feesandworklist` (`colFeeAmount`,`colWorklist`),
> KEY `worklistandfee` (`colWorklist`,`colFeeAmount`)
> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=1
>
> CREATE TABLE `collectorworklists` (
> `colWorklistxref` int(10) unsigned NOT NULL auto_increment,
> `collectorID` char(4) NOT NULL default '',
> `collectorWorklist` char(5) character set utf8 NOT NULL default '',
> `collectorWorklistLastTimeStamp` datetime NOT NULL default '0000-00-00
> 00:00:00',
> `collectorWorklistCallSheet` tinyint(1) NOT NULL default '0',
> PRIMARY KEY (`colWorklistxref`),
> KEY `Worklist` (`collectorWorklist`)
> ) ENGINE=MyISAM AUTO_INCREMENT=61 DEFAULT CHARSET=latin1 PACK_KEYS=1
>
> CREATE TABLE `collectordailygoals` (
> `collectorID` char(4) NOT NULL default '',
> `GoalDAY` tinyint(3) unsigned NOT NULL default '0',
> `PromiseGoal` smallint(5) unsigned default NULL,
> `WorkedGoal` smallint(5) unsigned default NULL,
> `goalUpdate` timestamp NOT NULL default '0000-00-00 00:00:00',
> PRIMARY KEY (`collectorID`,`GoalDAY`)
> ) ENGINE=MyISAM DEFAULT CHARSET=utf8 PACK_KEYS=1
>
> The problem query is:
> SELECT sum(cf.colfeeamount) , collectordailygoals.collectorid from
> collectorfeetable cf inner join collectorworklists cw on
> cw.collectorworklist = cf.colworklist inner join collectordailygoals on
> collectordailygoals.collectorid = cw.collectorid where
> month(cf.coltimestamp) = month(current_date()) and year(cf.coltimestamp)
> = year(current_date()) group by collectordailygoals.collectorid order by
> collectorid ;
>
> My problem is that the sum is multiplied by the number of times a
> collector ID is present in the collectordailygoals table. How do I set
> up the join such that it looks at only unique instances of the collector
> ID in the collectordailygoals table? Alternatively how would I divide
> the sum by the number of instances that the collector ID shows up in the
> collectordailygoals table? Despite the presence of the goalupdate
> timestamp this particular table does not hold historical data. That is
> kept in another table.
>
> When I get a chance I will look into overhauling the database structure.
> Until then I am stuck with what you see.
>
> Thanks.

Re: Unwanted Multiplication Due to Join

am 01.09.2006 20:38:02 von Bill Karwin

No bother wrote:
> I forgot to mention that this is with 5.0.24 (not 5.0.24a yet) on
> Windows 2003. Oops.

Thanks for specifying the version!

Okay, to solve your duplication problem, how about eliminating the
collectordailygoals from the query?

SELECT SUM(cf.colfeeamount), cw.collectorid
FROM collectorfeetable cf
INNER JOIN collectorworklists cw
ON cw.collectorworklist = cf.colworklist
WHERE MONTH(cf.coltimestamp) = MONTH(CURRENT_DATE())
AND YEAR(cf.coltimestamp) = YEAR(CURRENT_DATE())
GROUP BY cw.collectorid
ORDER BY cw.collectorid;

collectordailygoals doesn't seem to be doing anything in your original
query, unless there are fewer collector id's in that table than in
collectorworklists, and you need the join to pair down the list. If
that's the case, you can use a subquery instead of a JOIN to avoid the
duplication.

SELECT SUM(cf.colfeeamount), cw.collectorid
FROM collectorfeetable cf
INNER JOIN collectorworklists cw
ON cw.collectorworklist = cf.colworklist
WHERE MONTH(cf.coltimestamp) = MONTH(CURRENT_DATE())
AND YEAR(cf.coltimestamp) = YEAR(CURRENT_DATE())
AND cw.collectorid IN (SELECT cd.collectorid FROM collectordailygoals cd)
GROUP BY cw.collectorid
ORDER BY cw.collectorid;

Regards,
Bill K.

Re: Unwanted Multiplication Due to Join

am 11.09.2006 20:23:59 von Cirene

I kept the query in my question sufficiently simple to illustrate the
problem. I will be pulling data from collectordailygoals but did not
use it in the example query so as to avoid confusion. Thanks for the help!


Bill Karwin wrote:
> No bother wrote:
>> I forgot to mention that this is with 5.0.24 (not 5.0.24a yet) on
>> Windows 2003. Oops.
>
> Thanks for specifying the version!
>
> Okay, to solve your duplication problem, how about eliminating the
> collectordailygoals from the query?
>
> SELECT SUM(cf.colfeeamount), cw.collectorid
> FROM collectorfeetable cf
> INNER JOIN collectorworklists cw
> ON cw.collectorworklist = cf.colworklist
> WHERE MONTH(cf.coltimestamp) = MONTH(CURRENT_DATE())
> AND YEAR(cf.coltimestamp) = YEAR(CURRENT_DATE())
> GROUP BY cw.collectorid
> ORDER BY cw.collectorid;
>
> collectordailygoals doesn't seem to be doing anything in your original
> query, unless there are fewer collector id's in that table than in
> collectorworklists, and you need the join to pair down the list. If
> that's the case, you can use a subquery instead of a JOIN to avoid the
> duplication.
>
> SELECT SUM(cf.colfeeamount), cw.collectorid
> FROM collectorfeetable cf
> INNER JOIN collectorworklists cw
> ON cw.collectorworklist = cf.colworklist
> WHERE MONTH(cf.coltimestamp) = MONTH(CURRENT_DATE())
> AND YEAR(cf.coltimestamp) = YEAR(CURRENT_DATE())
> AND cw.collectorid IN (SELECT cd.collectorid FROM collectordailygoals cd)
> GROUP BY cw.collectorid
> ORDER BY cw.collectorid;
>
> Regards,
> Bill K.