write a new module for suffix trees?

on 03.04.2008 23:12:07 by donotreply

Hello,

for the purpose of detecting code copies in Perl sources,
I would like to use a suffix tree with a method to get
all longest repeated substrings (lrs).
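
To make the lrs requirement concrete, here is a minimal sketch of the expected result. It is an assumption-laden illustration, not the API of any module discussed in this thread: the function name `longest_repeated_substrings` is hypothetical, it is written in Python for brevity, and it uses a naively sorted list of suffixes instead of a real suffix tree (so it is far slower than a proper implementation would be).

```python
def longest_repeated_substrings(text):
    """Return all longest substrings that occur at least twice in text.

    Occurrences may overlap. Uses a naively sorted suffix list, not a
    suffix tree, so this is only suitable for small inputs.
    """
    # Start positions of all suffixes, ordered by the suffix text.
    suffixes = sorted(range(len(text)), key=lambda i: text[i:])

    best_len = 0
    best = set()
    # Any repeated substring is a common prefix of two suffixes that are
    # adjacent in sorted order, so comparing neighbours is sufficient.
    for a, b in zip(suffixes, suffixes[1:]):
        n = 0
        while (a + n < len(text) and b + n < len(text)
               and text[a + n] == text[b + n]):
            n += 1
        if n > best_len:
            best_len, best = n, {text[a:a + n]}
        elif n == best_len and n > 0:
            best.add(text[a:a + n])
    return sorted(best)
```

For code-copy detection the input would more likely be a sequence of tokens than raw characters, but the shape of the result (every repeated run of maximal length) is the same.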

I had a look at

1. Tree::Suffix, which looks good at first sight.
It is a wrapper around the C library libstree.
But the latest version of libstree is quite buggy, and
the library's author does not respond to patches or email.
It has an lrs function, but that function is buggy (incomplete).
I have patches for some of the bugs, but not all.

2. SuffixTree, which has no bugs, but lacks the lrs function.
The module does not allow for extensions.

3. Array::Suffix, whose interface is not generic enough for my
purpose, so it does not seem suitable.

So, I am thinking of writing my own.
What strategy would you suggest:

Making a pure Perl implementation?
Adapting a C library?

Thanks for any input,
Heiko (who only wants to write a new perlcritic policy)
(heiko!at!hexco!dot!de)

Re: write a new module for suffix trees?

on 04.04.2008 00:20:11 by John Bokma

"Heiko Eißfeldt" wrote:

> Hello,
>
> for the purpose of detecting code copies in Perl sources,
> I would like to use a suffix tree with a method to get
> all longest repeated substrings (lrs).
>
> I had a look at
>
> 1. Tree::Suffix, which looks good at first sight.
> It is a wrapper around the c library libstree.
> But libstree has a pretty buggy last version, and
> the lib author does not respond to patches and email.
> It has a lrs function, but it is buggy (incomplete).
> I have patches for some of the bugs, but not all.

Can you fork this lib?

If so, would it be possible to make Tree::Suffix use your fork?

--
John

http://johnbokma.com/perl/

Re: write a new module for suffix trees?

on 04.04.2008 00:30:14 by donotreply

John Bokma wrote in
news:Xns9A75A62DF26FAcastleamber@130.133.1.4:

>> 1. Tree::Suffix, which looks good at first sight.
>> It is a wrapper around the c library libstree.
>> But libstree has a pretty buggy last version, and
>> the lib author does not respond to patches and email.
>> It has a lrs function, but it is buggy (incomplete).
>> I have patches for some of the bugs, but not all.
>
> Can you fork this lib?
>
> if so, would it be possible to make Tree::Suffix use your fork.

I guess I could. But sometimes it is less work to
start afresh than to bend your mind around
an existing lib...
It would really need a test suite first.

OK, so you would prefer the C library implementation over a
pure Perl implementation.

Thanks, heiko

Re: write a new module for suffix trees?

on 04.04.2008 18:18:23 by Colin von Heuring

On Apr 3, 2:12 pm, "Heiko Eißfeldt" wrote:
> 2. SuffixTree, which has no bugs, but lacks the lrs function.
> The module does not allow for extensions.

I suggest forking this one.

Re: write a new module for suffix trees?

on 04.04.2008 18:23:30 by Colin von Heuring

Heiko Eißfeldt wrote:
> 2. SuffixTree, which has no bugs, but lacks the lrs function.
> The module does not allow for extensions.

This sounds the most promising to me. Can you get the author to add
the function? Or write a patch yourself and submit it?