Counting the repetition count of repeated substrings
am 15.11.2007 16:38:23 von Michele DondiThis is the subject of the following PM node:
http://perlmonks.org/?node_id=650761
I am the author of the node and to briefly summarize, the problem is:
"to calculate the repetition count of repeated substrings of length 2
or more of a given string."
Currently, the most interesting code suggestions are within the
respective subs of the following benchmark:
(For definiteness the subs take the given string as an argument and
return a hashref of the counts.)
#!/usr/bin/perl
use strict;
use warnings;
use constant MIN => 2;
use Benchmark qw/:all :hireswallclock/;
my $str='aabcdabcabcecdecd';
sub oha {
my $s=shift;
my %saw;
while($s =~ /(..+)(?=.*?\1)/g) {
for my $x (0..length $1) {
@saw{ map {substr $1, $x, $_} $x+2..length $1 } = ();
}
}
$saw{$_} =()= $s =~ /\Q$_/g for keys %saw;
\%saw;
}
sub kramba {
my %count;
my $count;
$count = sub {
my( $string) = @_;
my $length = length( $string );
if ($length < MIN) {
$count{$_} == 1 and delete $count{$_} for keys %count;
return \%count;
}
for my $l (MIN..$length) {
my $s = substr( $string, 0, $l );
$count{ $s } += 1;
}
$count->( substr( $string, 1 ) );
};
$count->(shift);
for (keys %count) {
delete $count{$_} unless
$count{$_} >= 2 and length($_) >= MIN;
}
\%count;
}
sub ikegami {
my $str = shift;
local our %counts;
$str =~ /
(.{2,}) # or (.+)
(?(?{ !$counts{$1} })(?=.*\1))
(?{ ++$counts{$1} })
(?!)
/x;
\%counts;
}
sub lodin {
my $str = shift;
local our %count;
$str =~ /
(.{2,})
(?(?{ $count{$1} })
(?!)
)
.*
\1
(?{ ($count{$1} ||= 1)++ })
(?!)
/x;
\%count;
}
cmpthese 10_000 => {
oha => sub { oha $str },
kramba => sub { kramba $str },
ikegami => sub { ikegami $str },
lodin => sub { lodin $str },
};
__END__
Note: benchmarking is interesting but I'm also interested in other
charachteristics, e.g. readability, intuitiveness and so on.
Michele
--
{$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
(($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^