Narvy tree in PERL.

am 06.11.2004 11:32:59 von Smurf

Hi All.

I want to write my own Nary tree for learning purposes, so I can get a good
handle on the language. I have read documentation on references to build
multi dymention arrays, hash to hash, array to hash, etc. But I am missing
something. NOte, I am relative new to perl and this is the first time I
have tryed to do something of this nature ever.

Outline of problem:

1. I have a directory structure that I am reading from a file.
2. The hash structure for the top of the tre is:

%tree = ("node", "\\", "value", []);

3. I can break the line of the path into an array by using split and place
this into an array.
4. There are multiple parent directories and child of the parents with the
same directory name. EG:

a/a/file
a/a/file1
a/b/file1
a/b/file2
a/c/file1
b/a/file1
b/b/file

Questions:

1. How do I keep the keys unique as I process the file one line at a time?
2. How do you walk a Nary tree?
3. How can I do this by using on references?

Since I want to keep the hash structure that I mention above. Since the
node tells me the name of the directory and array tells me how many children
are in this directory. Also, the files are at the end of the path, never in
one of the parent directories.

Help please, I am completely lost.

smurf

Re: Narvy tree in PERL.

am 09.11.2004 03:11:04 von Jim Gibson

In article , smurf
wrote:

> Hi All.
>
> I want to write my own Nary tree for learning purposes, so I can get a good
> handle on the language. I have read documentation on references to build
> multi dymention arrays, hash to hash, array to hash, etc. But I am missing
> something. NOte, I am relative new to perl and this is the first time I
> have tryed to do something of this nature ever.
>
> Outline of problem:
>
> 1. I have a directory structure that I am reading from a file.
> 2. The hash structure for the top of the tre is:
>
> %tree = ("node", "\\", "value", []);

This is better written as

%tree = ( 'node' => '\\', 'value' => [] );

Note the use of single quotes and the 'fat' commas (=>), but why are
you using '\' as your root node, when your directory separator
character appears to be '/'?

What are in the 'node' and 'value' slots? I can guess that node is the
path name of the directory, and value is an array of references to the
contents of the directory, but that is just a guess. If this guess is
correct, you should use the keys 'name' and 'contents', or something
like that that describes what they contain. 'node' and 'value' are too
generic. Can you describe your data structure in more detail?

The main question is whether you want the value array to contain
directories or just files. If just files, you don't need a tree; just
use the full path name of the directory in the node slot and the files
in the value slot. If you really want an N-ary tree, then each
directory node can contain references to other nodes, which can be
directories or files. In this case, you should consider using separate
elements for directories and files. Thus, each node might be
represented by a hash with 3 members: name, directories, files. The
name entry is a string, the directories entry is a reference to an
array of hashes (nodes), and the files entry is a reference to an array
of strings (file names).

>
> 3. I can break the line of the path into an array by using split and place
> this into an array.

That would be useful for each line in the file, but the array will be a
temporary one.

> 4. There are multiple parent directories and child of the parents with the
> same directory name. EG:
>
> a/a/file
> a/a/file1
> a/b/file1
> a/b/file2
> a/c/file1
> b/a/file1
> b/b/file
>
> Questions:
>
> 1. How do I keep the keys unique as I process the file one line at a time?

Within each node, subdirectory and file names will be unique.

> 2. How do you walk a Nary tree?

By iterating over the directory and file arrays with a foreach loop.
You can use a recursive subroutine or a stack of unvisited nodes to
save where you are in your tree.

> 3. How can I do this by using on references?

Each directory entry is a reference to an array of references to hashes
(nodes). Each file entry is a reference to an array of strings (file
names). You use a foreach loop to iterate over each array.

>
> Since I want to keep the hash structure that I mention above. Since the
> node tells me the name of the directory and array tells me how many children
> are in this directory. Also, the files are at the end of the path, never in
> one of the parent directories.

OK. Now you give us more detail on what a node contains.

You can put the directory pointers and file names in the same array if
you want to, and then use the ref function to figure out whether each
entry is a file name (scalar) or pointer-to-node (reference), but
having two entries as I suggested above might prove simpler.

>
> Help please, I am completely lost.

Give it a shot and post what you come up with for more help.

HTH.