tree processing

tree processing

am 30.10.2007 12:48:20 von Evert Lammerts

------=_Part_1245_16897560.1193744900894
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Hello all,

In my MySQL database, with which I communicate through the mysqli extension,
I have two tables that each store a tree.

My first problem. The first tree is based on the nested set model, with a
primary key on the id (INT(10)) column, an index on the left and right
(INT(10)) columns and a VARCHAR (250) column to represent a name. This table
holds almost 14 thousand records. If I execute my single query (buffered, I
work with a custom template system that only allows output after all other
processing is done) to fetch the complete tree, it takes around 40 seconds
to execute. This query is a standard query to fetch a tree and the depth per
node. Is there any way I could optimize this?

My second, bigger problem. The other table holds a binary tree in which
every node holds the id of its parent. All relevant fields are indexed. This
table holds now almost 11 thousand records. The functionality that creates
the problem is the fetching of a big tree, and the looking for the optimal
insertion point for a node (I try to keep the tree symmetric according to a
simple recursive algorithm - I follow the path with the least children per
node, e.g. topnode (lft (1000), rgt (800)) -> Rightnode (lft (300),
rgt(500)) -> Leftnode ... etc). For the biggest tree this functionality can
take very long to execute. So I thought that instead of using a recursive
algorithm, I use binary operations - it is a binary tree after all.

My solution for fetching the tree looks like this: for every node you save
it's path as a binary number represented as an integer in the database, eg
11 = left left = 3, 10 = left right = 2, 1001110 = left right right left
left left right = 78, etc. To get all children of a certain node you do a
logical and on the parent's path and all other paths - SELECT * FROM tree
WHERE TRUE = path & {$parent_path}. However, this solution requires the
tree's longest path to be no more then 64 (MySQL's BIGINT is 8 bytes). This
is not enough for us.

I'm desperately looking for suggestions, and I guess I'm not the first one
to have problems with processing big amounts of data in tree structures. Do
you guys have any suggestions?

Thanks,
Evert Lammerts

------=_Part_1245_16897560.1193744900894--

Re: tree processing

am 31.10.2007 05:19:19 von dmagick

Evert Lammerts wrote:
> Hello all,
>
> In my MySQL database, with which I communicate through the mysqli extension,
> I have two tables that each store a tree.
>
> My first problem. The first tree is based on the nested set model, with a
> primary key on the id (INT(10)) column, an index on the left and right
> (INT(10)) columns and a VARCHAR (250) column to represent a name. This table
> holds almost 14 thousand records. If I execute my single query (buffered, I
> work with a custom template system that only allows output after all other
> processing is done) to fetch the complete tree, it takes around 40 seconds
> to execute. This query is a standard query to fetch a tree and the depth per
> node. Is there any way I could optimize this?

Show us the query, the table definition and the indexes on the table.

> My second, bigger problem. The other table holds a binary tree in which
> every node holds the id of its parent. All relevant fields are indexed. This
> table holds now almost 11 thousand records. The functionality that creates
> the problem is the fetching of a big tree, and the looking for the optimal
> insertion point for a node (I try to keep the tree symmetric according to a
> simple recursive algorithm - I follow the path with the least children per
> node, e.g. topnode (lft (1000), rgt (800)) -> Rightnode (lft (300),
> rgt(500)) -> Leftnode ... etc). For the biggest tree this functionality can
> take very long to execute. So I thought that instead of using a recursive
> algorithm, I use binary operations - it is a binary tree after all.
>
> My solution for fetching the tree looks like this: for every node you save
> it's path as a binary number represented as an integer in the database, eg
> 11 = left left = 3, 10 = left right = 2, 1001110 = left right right left
> left left right = 78, etc. To get all children of a certain node you do a
> logical and on the parent's path and all other paths - SELECT * FROM tree
> WHERE TRUE = path & {$parent_path}. However, this solution requires the
> tree's longest path to be no more then 64 (MySQL's BIGINT is 8 bytes). This
> is not enough for us.

Try a varchar field? They can be indexed of course, but I'm not sure how
well it will work with those sort of operations.

--
Postgresql & php tutorials
http://www.designmagick.com/

--
PHP Database Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php