Probability Selects
am 07.01.2010 21:10:17 von Matt Neimeyer
I've tried Googling till my brain is fried and I'm obviously missing
something because I'm not finding anything useful.
I'm trying to select names at random from a table that contains the
name and the frequency at which it is actually used in society. The
table is defined as follows:
CREATE TABLE `MaleNames` (
`Name_ID` int(11) NOT NULL auto_increment,
`Name` char(50) default NULL,
`Frequency` decimal(5,3) default NULL,
PRIMARY KEY (`Name_ID`)
)
Some examples:
1, Aaron, 0.240
3, Abe, 0.006
13, Adrian, 0.069
What's the best way to select names at random from this but still take
into account frequency of use?
Thanks in advance!
Matt
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
RE: Probability Selects
am 07.01.2010 22:25:28 von Daevid Vincent
http://www.greggdev.com/web/articles.php?id=6
> -----Original Message-----
> From: Matt Neimeyer [mailto:matt@neimeyer.org]
> Sent: Thursday, January 07, 2010 12:10 PM
> To: mysql@lists.mysql.com
> Subject: Probability Selects
>
> I've tried Googling till my brain is fried and I'm obviously missing
> something because I'm not finding anything useful.
>
> I'm trying to select names at random from a table that contains the
> name and the frequency at which it is actually used in society. The
> table is defined as follows:
>
> CREATE TABLE `MaleNames` (
> `Name_ID` int(11) NOT NULL auto_increment,
> `Name` char(50) default NULL,
> `Frequency` decimal(5,3) default NULL,
> PRIMARY KEY (`Name_ID`)
> )
>
> Some examples:
>
> 1, Aaron, 0.240
> 3, Abe, 0.006
> 13, Adrian, 0.069
>
> What's the best way to select names at random from this but still take
> into account frequency of use?
>
> Thanks in advance!
>
> Matt
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:
> http://lists.mysql.com/mysql?unsub=daevid@daevid.com
>
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
Re: Probability Selects
am 08.01.2010 00:40:19 von Tom Worster
On 1/7/10 3:10 PM, "Matt Neimeyer" wrote:
> I'm trying to select names at random from a table that contains the
> name and the frequency at which it is actually used in society. The
> table is defined as follows:
>
> CREATE TABLE `MaleNames` (
> `Name_ID` int(11) NOT NULL auto_increment,
> `Name` char(50) default NULL,
> `Frequency` decimal(5,3) default NULL,
> PRIMARY KEY (`Name_ID`)
> )
>
> Some examples:
>
> 1, Aaron, 0.240
> 3, Abe, 0.006
> 13, Adrian, 0.069
>
> What's the best way to select names at random from this but still take
> into account frequency of use?
after reading the source, i'd avoid using MySQL's RAND(). you can probably
easily get better quality pseudorandom number in your app's environment.
to get a random row, "LIMIT 1, r", where offset r is a random number between
0 and (tablelength - 1), should work.
if you must use RAND(), FLOOR(tablelength*RAND()) will work for r.
if you want a random row from a constrained subset of rows based on
frequency, e.g. "WHERE Frequency > 0.001", you can include that condition in
the query and repeat it until you get a non-empty response.
but if the constrained subset is only a small fraction of the table's rows
then it might be more efficient to compute its size (with COUNT() in another
query or a subquery) and use that in generating the offset r.
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
Re: Probability Selects
am 11.01.2010 15:31:06 von Baron Schwartz
Matt,
On Thu, Jan 7, 2010 at 3:10 PM, Matt Neimeyer wrote:
> What's the best way to select names at random from this but still take
> into account frequency of use?
Here's the link I usually send clients:
http://jan.kneschke.de/projects/mysql/order-by-rand/
--
Baron Schwartz
Percona Inc: Services and Support for MySQL
http://www.percona.com/
--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe: http://lists.mysql.com/mysql?unsub=gcdmg-mysql-2@m.gmane.org
Re: Probability Selects
am 11.01.2010 15:50:32 von Johan De Meersman
--000e0cd56c32c09bc0047ce4a815
Content-Type: text/plain; charset=ISO-8859-1
I think what he's trying to accomplish is not truly random, but to use the
probability that's indicated in the second field of the table:
1, Aaron, 0.240
> 3, Abe, 0.006
> 13, Adrian, 0.069
>
So there would be a probability of 0.240 that the call returns Aaron, a
probability of 0.006 that it returns Abe, and so on.
I've no clue how to do this in SQL, though, save for the utter stupidity of
creating a table that repeats each name (or UID, saves memory/disk)
1000*probability times, and do a random select on that (or 100* or 10*
depending on how precise you need it). I'd not recommend it, though - it's
gonna be a mess and a huge performance drain.
I suspect this would be better done in code, but I've been out of coding (or
statistics) for too long to give pointers there.
On Mon, Jan 11, 2010 at 3:31 PM, Baron Schwartz wrote:
> Matt,
>
> On Thu, Jan 7, 2010 at 3:10 PM, Matt Neimeyer wrote:
> > What's the best way to select names at random from this but still take
> > into account frequency of use?
>
> Here's the link I usually send clients:
> http://jan.kneschke.de/projects/mysql/order-by-rand/
>
> --
> Baron Schwartz
> Percona Inc: Services and Support for MySQL
> http://www.percona.com/
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql?unsub=vegivamp@tuxera.be
>
>
--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel
--000e0cd56c32c09bc0047ce4a815--