script takes long time to run when comparing digits within strings using foreach
on 27.05.2011 10:18:01 by eventual
Hi,
I have an array, @datas, and each element within @datas is a string that's
made up of 6 digits with spaces in between, like this: “1 2 3 4 5 6”. So the
array looks like this:
@datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8', '1 2 3 4 5 9', '6 7 8 9 10 11');
Now I wish to compare each element of @datas with the rest of the elements
in @datas in such a way that if 5 of the digits match, I take note of the
matching indices. The script I wrote is appended below.
However, the script takes a long time to run if @datas is huge (e.g. 30,000
elements). I wonder whether there is a way to rewrite the script so that it
runs faster.
Thanks
###### script below #######################
#!/usr/bin/perl
use strict;

my @matched_location = ();
my @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8', '1 2 3 4 5 9', '6 7 8 9 10 11');

my $iteration_counter = -1;
foreach (@datas){
    $iteration_counter++;
    my $reference = $_;

    my $second_iteration_counter = -1;
    my $string = '';
    foreach (@datas){
        $second_iteration_counter++;
        my @individual_digits = split / /,$_;

        my $ctr = 0;
        foreach(@individual_digits){
            if($reference =~/^$_ | $_ | $_$/){
                $ctr++;
            }
        }
        if ($ctr >= 5){
            $string = $string . "$second_iteration_counter ";
        }
    }
    $matched_location[$iteration_counter] = $string;
}

my $ctr = -1;
foreach(@matched_location){
    $ctr++;
    print "Index $ctr of \@matched_location = $_\n";
}
Re: script takes long time to run when comparing digits within strings using foreach
on 27.05.2011 11:38:39 by Shlomi Fish
Hi eventual,
On Friday 27 May 2011 11:18:01 eventual wrote:
> Hi,
> I have an array , @datas, and each element within @datas is a string that's
> made up of 6 digits with spaces in between like this “1 2 3 4 5 6”, so the
> array look like this @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5
> 8', '1 2 3 4 5 9' , '6 7 8 9 10 11'); Now I wish to compare each element
> of @datas with the rest of the elements in @datas in such a way that if 5
> of the digits match, to take note of the matching indices, and so the
> script I wrote is appended below. However, the script below takes a long
> time to run if the datas at @datas are huge( eg 30,000 elements). I then
> wonder is there a way to rewrite the script so that the script can run
> faster. Thanks
>
> ###### script below #######################
>
> #!/usr/bin/perl
> use strict;
>
> my @matched_location = ();
> my @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8', '1 2 3 4 5 9'
> , '6 7 8 9 10 11');
> my $iteration_counter = -1;
> foreach (@datas){
>     $iteration_counter++;
>     my $reference = $_;
>
>     my $second_iteration_counter = -1;
>     my $string = '';
>     foreach (@datas){
>         $second_iteration_counter++;
>         my @individual_digits = split / /,$_;
>
>         my $ctr = 0;
>         foreach(@individual_digits){
>             if($reference =~/^$_ | $_ | $_$/){
>                 $ctr++;
>             }
>         }
>         if ($ctr >= 5){
>             $string = $string . "$second_iteration_counter ";
>         }
>     }
>     $matched_location[$iteration_counter] = $string;
> }
>
> my $ctr = -1;
> foreach(@matched_location){
>     $ctr++;
>     print "Index $ctr of \@matched_location = $_\n";
> }
>
First of all, you should add "use warnings;" to your code. Then you should get
rid of the implicit $_ as loop iterator because it's easy to break. For more
information see:
http://perl-begin.org/tutorials/bad-elements/
Other than that - you should use a better algorithm. One option would be to
sort the integers and then use a diff/merge-like algorithm:
http://en.wikipedia.org/wiki/Merge_algorithm
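The sort-then-merge idea could look roughly like the sketch below. It is only an illustration, not code from this thread: the names count_common_sorted and matching_pairs are made up, and it assumes that no number is repeated within a single record.

```perl
use strict;
use warnings;

# Count the values common to two numerically sorted array refs by
# walking both lists in step, as in the merge phase of merge sort.
# Assumes neither list contains repeated values.
sub count_common_sorted {
    my ($x, $y) = @_;
    my ($i, $j, $common) = (0, 0, 0);
    while ($i < @$x && $j < @$y) {
        if    ($x->[$i] < $y->[$j]) { $i++ }
        elsif ($x->[$i] > $y->[$j]) { $j++ }
        else                        { $common++; $i++; $j++ }
    }
    return $common;
}

# Return "i-j" labels for every pair of records that share at least
# $threshold numbers. Each pair is compared only once.
sub matching_pairs {
    my ($threshold, @records) = @_;

    # Split and sort every record once, before the comparison loops.
    my @sorted = map { [ sort { $a <=> $b } split ' ', $_ ] } @records;

    my @pairs;
    for my $i (0 .. $#sorted - 1) {
        for my $j ($i + 1 .. $#sorted) {
            push @pairs, "$i-$j"
                if count_common_sorted($sorted[$i], $sorted[$j]) >= $threshold;
        }
    }
    return @pairs;
}

my @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8',
             '1 2 3 4 5 9', '6 7 8 9 10 11');
print join(' ', matching_pairs(5, @datas)), "\n";
```

For the five sample records this should report the pairs 0-2, 0-3 and 2-3. The number of pairs compared is still quadratic in the number of records, so this mainly reduces the cost of each comparison.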
A different way would be to use a hash to count the number of times each
number occurred in the two sets, and then see how many of them got a value of 2
(indicating they are in both sets).
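A minimal sketch of the hash-counting idea, for illustration only. The helper name count_common is hypothetical, and the sketch assumes that no number is repeated within a single string (otherwise a count of 2 would not mean "present in both").

```perl
use strict;
use warnings;

# Count how many numbers appear in both space-separated strings.
# Every number from both strings is tallied in one hash; a tally of 2
# means the number occurred in both (assuming no repeats per string).
sub count_common {
    my ($first, $second) = @_;
    my %seen;
    $seen{$_}++ for split(' ', $first), split(' ', $second);
    my $common = grep { $_ == 2 } values %seen;
    return $common;
}

print "common: ", count_common('1 2 3 4 5 6', '1 2 3 4 5 8'), "\n";    # common: 5
print "common: ", count_common('1 2 3 4 5 6', '6 7 8 9 10 11'), "\n";  # common: 1
```

Two records would then count as a match when count_common returns 5 or more.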
But at the moment, everything is very inefficient there.
Regards,
Shlomi Fish
-- 
-----------------------------------------------------------------
Shlomi Fish http://www.shlomifish.org/
"Star Trek: We, the Living Dead" - http://shlom.in/st-wtld
I often wonder why I hang out with so many people who are so pedantic. And
then I remember - because they are so pedantic.
-- Israeli Perl Monger
Please reply to list if it's a mailing list post - http://shlom.in/reply .
--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/
Re: script takes long time to run when comparing digits within strings using foreach
on 27.05.2011 12:27:21 by rvtol+usenet
On 2011-05-27 10:18, eventual wrote:
> I have an array , @datas, and each element within @datas is a string that's made up of 6 digits with spaces in between like this “1 2 3 4 5 6”, so the array look like this
> @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8', '1 2 3 4 5 9' , '6 7 8 9 10 11');
> Now I wish to compare each element of @datas with the rest of the elements in @datas in such a way that if 5 of the digits match, to take note of the matching indices, and so the script I wrote is appended below.
a. Do once what you can do only once. There are at least 2 points where
you didn't: 1. prepare @datas before looping; 2. don't compare the same
stuff more than once.
b. Assemble a result, and report at the end. Don't use any 'shared
resources' like incrementing global counters while going along.
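#!/usr/bin/perl
use strict;
use warnings;

use Data::Dumper;

# Read one record per line from the __DATA__ section below.
my @data = <DATA>;

# Prepare once: turn each line into a hash whose keys are its numbers.
$_ = { map { $_ => 1 } split } for @data;

# Dump the prepared data when a true first argument is given.
$ARGV[0] and print Dumper( \@data );

my @result;

# Compare each pair only once ($j is always greater than $i).
for my $i ( 0 .. $#data - 1 ) {
    my @k = keys %{ $data[ $i ] };
    for my $j ( $i + 1 .. $#data ) {
        my $n = 0;
        exists $data[ $j ]{ $_ } and ++$n for @k;
        $n >= 5 and push @result, [ $i, $j ];
    }
}

# Report at the end.
print Dumper( \@result );

__DATA__
1 2 3 4 5 6
1 2 9 10 11 12
1 2 3 4 5 8
1 2 3 4 5 9
6 7 8 9 10 11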
-- 
Ruud
Re: script takes long time to run when comparing digits within strings using foreach
on 29.05.2011 04:17:46 by jwkrahn
eventual wrote:
> Hi,
Hello,
> I have an array , @datas, and each element within @datas is a string
> that's made up of 6 digits with spaces in between like this “1 2 3 4 5
> 6”, so the array look like this
> @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8', '1 2 3 4 5
> 9' , '6 7 8 9 10 11');
> Now I wish to compare each element of @datas with the rest of the
> elements in @datas in such a way that if 5 of the digits match, to
> take note of the matching indices, and so the script I wrote is
> appended below.
> However, the script below takes a long time to run if the datas at
> @datas are huge( eg 30,000 elements). I then wonder is there a way to
> rewrite the script so that the script can run faster.
> Thanks
>
> ###### script below #######################
>
> #!/usr/bin/perl
> use strict;
>
> my @matched_location = ();
> my @datas = ('1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8', '1 2 3 4 5 9' , '6 7 8 9 10 11');
>
> my $iteration_counter = -1;
> foreach (@datas){
>     $iteration_counter++;
>     my $reference = $_;
>
>     my $second_iteration_counter = -1;
>     my $string = '';
>     foreach (@datas){
>         $second_iteration_counter++;
>         my @individual_digits = split / /,$_;
>
>         my $ctr = 0;
>         foreach(@individual_digits){
>             if($reference =~/^$_ | $_ | $_$/){
>                 $ctr++;
>             }
>         }
>         if ($ctr>= 5){
>             $string = $string . "$second_iteration_counter ";
>         }
>     }
>     $matched_location[$iteration_counter] = $string;
> }
>
> my $ctr = -1;
> foreach(@matched_location){
>     $ctr++;
>     print "Index $ctr of \@matched_location = $_\n";
> }
Your program can be reduced to:
my @matched_location;
my @datas = ( '1 2 3 4 5 6', '1 2 9 10 11 12', '1 2 3 4 5 8', '1 2 3 4 5 9', '6 7 8 9 10 11' );

for my $i ( 0 .. $#datas ) {
    for my $j ( 0 .. $#datas ) {
        $matched_location[ $i ] .= "$j " if 5 <= grep $datas[ $i ] =~ /(?:^|(?<= ))$_(?= |$)/, split ' ', $datas[ $j ];
    }
}

print map "Index $_ of \@matched_location = $matched_location[$_]\n", 0 .. $#matched_location;
You should benchmark it to see if it is any faster than your original code.
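One way to run such a benchmark is the core Benchmark module. The sketch below is an illustration under assumptions, not a measurement from this thread: the test data is randomly generated (200 records of 6 distinct numbers from 1 to 49), and by_regex and by_hash are made-up names for a regex-based and a hash-lookup variant of the comparison.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

srand(42);    # fixed seed so repeated runs use the same data

# Hypothetical test data: 6 distinct numbers in 1..49 per record.
sub random_record {
    my %pick;
    while (keys(%pick) < 6) {
        $pick{ 1 + int(rand(49)) } = 1;
    }
    my @nums = sort { $a <=> $b } keys %pick;
    return join ' ', @nums;
}

my @datas = map { random_record() } 1 .. 200;

# Regex variant: match every number of record $j against string $i.
sub by_regex {
    my @matched;
    for my $i (0 .. $#datas) {
        for my $j (0 .. $#datas) {
            my $n = grep { $datas[$i] =~ /(?:^|(?<= ))$_(?= |$)/ }
                    split ' ', $datas[$j];
            $matched[$i] .= "$j " if $n >= 5;
        }
    }
    return \@matched;
}

# Hash variant: build one lookup hash per record up front.
sub by_hash {
    my @sets;
    for my $line (@datas) {
        my %set = map { $_ => 1 } split ' ', $line;
        push @sets, \%set;
    }
    my @matched;
    for my $i (0 .. $#sets) {
        my @nums = keys %{ $sets[$i] };
        for my $j (0 .. $#sets) {
            my $n = grep { exists $sets[$j]{$_} } @nums;
            $matched[$i] .= "$j " if $n >= 5;
        }
    }
    return \@matched;
}

# Run each variant for at least 1 CPU second and print a comparison table.
cmpthese(-1, { regex => \&by_regex, hash => \&by_hash });
```

The table printed by cmpthese shows the rate of each variant and the relative difference between them. Timings depend on the machine and the data, so it is worth repeating the run with data closer to the real 30,000-element case.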
John
-- 
Any intelligent fool can make things bigger and
more complex... It takes a touch of genius -
and a lot of courage to move in the opposite
direction. -- Albert Einstein