"Right way" of doing threads in Perl?

on 03.08.2011 12:16:50 by Deyan Ginev

Hi all,

I have heard a lot of bashing of the thread support in Perl and tried it
myself with limited success.

For example, some months ago I tried the standard "use threads;" and
remember seeing a segfault when it reached the "join" (but everything
worked well otherwise, which gave me a mixed feeling).

I will soon have access to a cluster of hexacore machines and am
thinking of the best way to utilize them - obviously threaded programs
would get me a long way in such a setup. Is there a "best" way to do
threads in Perl? Do you know of any production-ready applications that
are using Perl threads?

Thanks for any suggestions!

Cheers,
Deyan

--
Deyan Ginev, Jacobs University Bremen,
http://kwarc.info/people/dginev

--
To unsubscribe, e-mail: beginners-unsubscribe@perl.org
For additional commands, e-mail: beginners-help@perl.org
http://learn.perl.org/

Re: "Right way" of doing threads in Perl?

on 03.08.2011 13:49:29 by rvtol+usenet

On 2011-08-03 12:16, Deyan Ginev wrote:

> I have heard a lot of bashing of the thread support in Perl and tried it
> myself with limited success.
>
> For example, some months ago I tried the standard "use threads;" and
> remember seeing a segfault when it reached the "join" (but everything
> worked well otherwise, which gave me a mixed feeling).
>
> I will soon have access to a cluster of hexacore machines and am
> thinking of the best way to utilize them - obviously threaded programs
> would get me a long way in such a setup. Is there a "best" way to do
> threads in Perl? Do you know of any production-ready applications that
> are using Perl threads?

If at all possible, design in a 'map-reduce-merge' way
(which is basically just 'init / process / activate' anyway).

Then use a perl binary that was built without thread support,
because it is about 20% faster.
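
To see which kind of build you are running, a minimal sketch using the core Config module might look like the following. The helper name perl_has_ithreads is made up for illustration; the only thing relied on is that $Config{useithreads} is 'define' on a threaded build. From the shell, "perl -V:useithreads" reports the same setting.

```perl
use strict;
use warnings;
use Config;    # core module: build-time configuration of this perl

# Hypothetical helper (not from any module): returns 1 if this perl
# was built with interpreter threads (ithreads), 0 otherwise.
sub perl_has_ithreads {
    return ( $Config{useithreads} // '' ) eq 'define' ? 1 : 0;
}

print perl_has_ithreads()
    ? "this perl was built with ithreads\n"
    : "this perl was built without ithreads\n";
```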

And fork.

My code (which runs on 24-core boxes with 96 GB RAM) often looks like this:

#!/usr/bin/perl -wl
use strict;

# Each singer (=child) sings (=process)
# all of the lines (=job: ordered set of tasks).

use Data::Dumper;
use Time::HiRes qw(sleep);    # the built-in sleep truncates 0.7 to 0
use Parallel::Series;

my $NAME = "Brother Jacob Song";

my @lines = split /\n/, <<'EOT';
Brother Jacob - Brother Jacob
Sleeping still? - Sleeping still?
Morning bells are ringing! - Morning bells are ringing!
Ding, dang, dong - Ding, dang, dong
EOT

# One task per line of the song, numbered from 1.
my @TASKS = map +{ ix => $_ + 1, info => $lines[ $_ ] },
    0 .. $#lines;

my $todo = Parallel::Series::->new(
    DEBUG => 0,
    NAME  => $NAME,
    TASKS => \@TASKS,
);

$todo->set(
    map    => \&init,       # returns \@jobs

    reduce => \&process,    # a child's job is to process an
                            # ordered set of (one or more) tasks

    merge  => \&activate,   # wrap up
);

$todo->LOG( 3, 'starting: %s', Dumper( $todo ) );

$todo->run;

exit 0;

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

# map step: build the list of jobs, one per singer.
sub init {
    my ( $self ) = @_;
    $self->LOG( 2, '>>> init <<<' );
    return [ map [ $_ ], 'A' .. 'F' ];    # singers
}

# reduce step: called in a child for each task of its job.
sub process {
    my ( $self, $chunk, $task ) = @_;
    sleep 0.7;

    my $ok = ( rand > 0.2 ? 1 : 0 );    # simulate failure
    $self->LOG( 0, q(processed: chunk=%s.%s: %s),
        join( '-', @$chunk ),
        $task->{ ix },
        ( $ok ? $task->{ info } : '<>' ) );
    return $ok;
}

# merge step: runs once after the children are done.
sub activate {
    my ( $self, $skip ) = @_;
    $self->LOG( 2, '>>> activate <<<' );
    $self->LOG( 3, q(activate: skip=%s), $skip || 0 );
    return;
}

__END__

I haven't put Parallel::Series on CPAN yet,
but there are several fine alternatives there.
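
For anyone who wants to try the idea without any module at all, here is a rough sketch of the same fork-per-task pattern using only the core fork and waitpid calls. It is not how Parallel::Series works internally; run_forked and its arguments are hypothetical names chosen for this illustration, and each child reports back only through its exit status (0 for success, 1 for failure).

```perl
#!/usr/bin/perl
use strict;
use warnings;

$| = 1;    # unbuffered output, so nothing pending is duplicated by fork

# Hypothetical helper (not part of any module): run $worker->($task) in a
# separate child process for each task, with at most $max_kids children
# alive at a time. Returns a hashref of task index => child exit status.
sub run_forked {
    my ( $max_kids, $worker, @tasks ) = @_;
    my ( %pid_to_ix, %status );

    # Wait for one child to finish and record its exit status.
    my $reap = sub {
        my $pid = waitpid( -1, 0 );
        die "waitpid found no child to wait for\n" if $pid <= 0;
        return unless exists $pid_to_ix{ $pid };
        $status{ delete $pid_to_ix{ $pid } } = $? >> 8;
    };

    for my $ix ( 0 .. $#tasks ) {
        # Throttle: do not start a new child until a slot is free.
        $reap->() while scalar( keys %pid_to_ix ) >= $max_kids;

        my $pid = fork();
        die "fork failed: $!\n" unless defined $pid;

        if ( $pid == 0 ) {    # child: do the work, then exit
            my $ok = eval { $worker->( $tasks[ $ix ] ) };
            exit( $ok ? 0 : 1 );
        }
        $pid_to_ix{ $pid } = $ix;    # parent: remember which task this is
    }

    $reap->() while %pid_to_ix;      # wait for the remaining children
    return \%status;
}

my @lines = (
    'Brother Jacob',
    'Sleeping still?',
    'Morning bells are ringing!',
    'Ding, dang, dong',
);

my $status = run_forked(
    2,
    sub { my ( $line ) = @_; print "$$ sings: $line\n"; return 1 },
    @lines,
);

printf "task %d exited with status %d\n", $_ + 1, $status->{ $_ }
    for sort { $a <=> $b } keys %$status;
```

The order of the "sings" lines varies from run to run, because the children run concurrently. If the children need to hand real data back to the parent, a pipe or a temporary file per child is the usual next step.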

--
Ruud
