Devel::Peek -- hash "Quality" query
Devel::Peek -- hash "Quality" query
am 19.01.2010 22:18:42 von Deane.Rothenmaier
This is a multipart message in MIME format.
--===============1276713214==
Content-Type: multipart/alternative;
boundary="=_alternative 007511C0862576B0_="
This is a multipart message in MIME format.
--=_alternative 007511C0862576B0_=
Content-Type: text/plain; charset="US-ASCII"
Dear gurus,
>From the perldoc for Devel::Peek:
The "quality" of a hash is defined as the total number of comparisons
needed to access
every element once, relative to the expected number needed for a
random hash. The value
can go over 100%.
The total number of comparisons is equal to the sum of the squares of
the number of
entries in each bucket. For a random hash of "n" keys into "k"
buckets, the expected
value is:
n + n(n-1)/2k
I'm working on some code documentation and have gotten to this point. The
above is vaguely worded. Or maybe I should just say that it doesn't
explicitly state what I'm looking for... So, do I want a higher
percentage for hash "quality," or a lower percentage? That is, which
direction of change signals improvement?
TIA,
Deane Rothenmaier
Programmer/Analyst
Walgreens Corp.
224-542-5150
You cannot strengthen the weak by weakening the strong. -- Abraham Lincoln
--=_alternative 007511C0862576B0_=
Content-Type: text/html; charset="US-ASCII"
Dear gurus,
From the perldoc for Devel::Peek:
The "quality"
of a hash is defined as the total number of comparisons needed to access
every element once, relative
to the expected number needed for a random hash. The value
can go over 100%.
The total number of comparisons
is equal to the sum of the squares of the number of
entries in each bucket.
For a random hash of "n" keys into "k" buckets, the
expected
value is:
n + n(n-1)/2k
I'm working on some code documentation
and have gotten to this point. The above is vaguely worded. Or maybe I
should just say that it doesn't explicitly state what I'm looking for...
So, do I want a higher percentage for hash "quality," or
a lower percentage? That is, which direction of change signals improvement?
TIA,
Deane Rothenmaier
Programmer/Analyst
Walgreens Corp.
224-542-5150
You cannot strengthen the weak by weakening the strong. -- Abraham Lincoln
--=_alternative 007511C0862576B0_=--
--===============1276713214==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
_______________________________________________
ActivePerl mailing list
ActivePerl@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
--===============1276713214==--
RE: Devel::Peek -- hash "Quality" query
am 20.01.2010 01:09:40 von Jan Dubois
This is a multipart message in MIME format.
--===============0117868236==
Content-Type: multipart/alternative;
boundary="----=_NextPart_000_0108_01CA9921.CE0EFE20"
Content-Language: en-us
This is a multipart message in MIME format.
------=_NextPart_000_0108_01CA9921.CE0EFE20
Content-Type: text/plain;
charset="us-ascii"
Content-Transfer-Encoding: 7bit
Lower values (closer to 100%) are better:
A random hash should distribute the keys equally to all buckets, so it would need the minimal number of comparisons. Real world
distributions will have some buckets fuller than others. Since the number of comparisons (for visiting each element exactly once) is
quadratic with the number of elements in a bucket (linear search), the additional comparisons for the longer chains will not be
offset by the savings of comparisons from shorter chains.
But don't worry about it too much, as you can't really do anything about it anyways. The only reason for Devel::Peek to print this
metric is to allow Perl5 Porters to play with different implementations of the hashing function.
Cheers,
-Jan
From: activeperl-bounces@listserv.ActiveState.com [mailto:activeperl-bounces@listserv.ActiveState.com] On Behalf Of
Deane.Rothenmaier@walgreens.com
Sent: Tuesday, January 19, 2010 1:19 PM
To: activeperl@listserv.ActiveState.com
Subject: Devel::Peek -- hash "Quality" query
Dear gurus,
>From the perldoc for Devel::Peek:
The "quality" of a hash is defined as the total number of comparisons needed to access
every element once, relative to the expected number needed for a random hash. The value
can go over 100%.
The total number of comparisons is equal to the sum of the squares of the number of
entries in each bucket. For a random hash of "n" keys into "k" buckets, the expected
value is:
n + n(n-1)/2k
I'm working on some code documentation and have gotten to this point. The above is vaguely worded. Or maybe I should just say that
it doesn't explicitly state what I'm looking for... So, do I want a higher percentage for hash "quality," or a lower percentage?
That is, which direction of change signals improvement?
TIA,
Deane Rothenmaier
Programmer/Analyst
Walgreens Corp.
224-542-5150
You cannot strengthen the weak by weakening the strong. -- Abraham Lincoln
------=_NextPart_000_0108_01CA9921.CE0EFE20
Content-Type: text/html;
charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
xmlns:o=3D"urn:schemas-microsoft-com:office:office" =
xmlns:w=3D"urn:schemas-microsoft-com:office:word" =
xmlns:m=3D"http://schemas.microsoft.com/office/2004/12/omml" =
xmlns=3D"http://www.w3.org/TR/REC-html40">
http-equiv=3DContent-Type content=3D"text/html; =
charset=3Dus-ascii">
(filtered medium)">
vlink=3Dpurple>
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>Lower values (closer to 100%) are better:
class=3DMsoNormal>
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>A random hash should distribute the keys equally to all buckets, so =
it would need the minimal number of comparisons. Real world =
distributions will have some buckets fuller than others. Since the =
number of comparisons (for visiting each element exactly once) is =
quadratic with the number of elements in a bucket (linear search), the =
additional comparisons for the longer chains will not be offset by the =
savings of comparisons from shorter chains.
class=3DMsoNormal>
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>But don’t worry about it too much, as you can’t really do =
anything about it anyways. The only reason for Devel::Peek to =
print this metric is to allow Perl5 Porters to play with different =
implementations of the hashing function.
class=3DMsoNormal>
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>Cheers,
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>-Jan
style=3D'font-size:11.0pt;font-family:"Calibri","sans-serif" ;color:#1F497=
D'>
style=3D'border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in =
4.0pt'>
1.0pt;padding:3.0pt 0in 0in 0in'>
style=3D'font-size:10.0pt;font-family:"Tahoma","sans-serif"' >From:=
=
activeperl-bounces@listserv.ActiveState.com =
[mailto:activeperl-bounces@listserv.ActiveState.com] On Behalf Of =
Deane.Rothenmaier@walgreens.com
Sent: Tuesday, January 19, =
2010 1:19 PM
To: =
activeperl@listserv.ActiveState.com
Subject: Devel::Peek -- =
hash "Quality" query
class=3DMsoNormal>
style=3D'font-size:10.0pt;font-family:"Arial","sans-serif"'> Dear =
gurus,
style=3D'font-size:10.0pt;font-family:"Arial","sans-serif"'> From the =
perldoc for Devel::Peek:
style=3D'font-size:7.5pt;font-family:"Courier New"'> The =
"quality" of a hash is defined as the total number of =
comparisons needed to access
style=3D'font-size:7.5pt;font-family:"Courier New"'> every =
element once, relative to the expected number needed for a random hash. =
The value
New"'> can go over 100%.
style=3D'font-size:7.5pt;font-family:"Courier New"'> The =
total number of comparisons is equal to the sum of the squares of the =
number of
New"'> entries in each bucket. For a random hash of =
"n" keys into "k" buckets, the expected =
New"'> value is:
style=3D'font-size:7.5pt;font-family:"Courier New"'> =
n + n(n-1)/2k =
style=3D'font-size:10.0pt;font-family:"Arial","sans-serif"'> I'm working =
on some code documentation and have gotten to this point. The above is =
vaguely worded. Or maybe I should just say that it doesn't explicitly =
state what I'm looking for... So, do I want a higher percentage =
for hash "quality," or a lower percentage? That is, =
which direction of change signals improvement?
style=3D'font-size:10.0pt;font-family:"Arial","sans-serif"'> TIA, =
style=3D'font-size:10.0pt;font-family:"Arial","sans-serif"'> Deane =
Rothenmaier
Programmer/Analyst
Walgreens =
Corp.
224-542-5150
You cannot strengthen the weak by weakening =
the strong. -- Abraham =
Lincoln
------=_NextPart_000_0108_01CA9921.CE0EFE20--
--===============0117868236==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
_______________________________________________
ActivePerl mailing list
ActivePerl@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
--===============0117868236==--