NAME

GT::SQL::Tree - Helps create and manage a tree in an SQL database.


SYNOPSIS

    use GT::SQL::Tree;
    my $tree = $table->tree;
    my $children = $tree->children(id => [1,2,3], max_depth => 2);
    my $parents = $tree->parents(id => [4,5,6]);


DESCRIPTION

GT::SQL::Tree implements a tree structure in an SQL table. Most of the work of managing the table is performed automatically behind the scenes; however, a GT::SQL::Tree object provides a few front-end methods for retrieving tree nodes.


METHODS

new, tree

The usual way to get a tree object is to call ->tree on a table object. The table object then calls GT::SQL::Tree->new for you and returns the result, which is a GT::SQL::Tree object. You should not normally call ->new directly; instead, let $table->tree call it with the proper arguments.

create, add_tree

To use GT::SQL::Tree, you first need to call create(). You shouldn't call it directly, but instead call ->add_tree() on an editor object. The arguments to add_tree are passed through to create, so the two take essentially the same arguments (with one exception: add_tree passes in table => $table_object itself).

create() will create a tree table whose name is based on the name of the table passed in. For example, if you wish to build a tree on 'MyTable', the tree table that is created by create() will be named MyTable_tree. The tree table provides easy one-query access to all of a node's parents or children, and also keeps track of the number of hops between a node and its descendant, allowing you to limit how far you descend into the tree.
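To make the idea of the tree table concrete, here is a minimal sketch in plain Perl that does not use GT::SQL at all. From a set of father links it derives one row per (node, ancestor) pair together with the number of hops between them. The column names tree_id_fk, tree_anc_id_fk and tree_dist are the ones mentioned under INDICES; the helper name build_tree_rows and the sample IDs are hypothetical, and the real module may populate its table differently.

```perl
use strict;
use warnings;

# Hypothetical helper: given { node_id => father_id } (0 meaning "no father"),
# return one row per (node, ancestor) pair, similar in spirit to a tree table.
sub build_tree_rows {
    my ($father_of) = @_;
    my @rows;
    for my $id (sort { $a <=> $b } keys %$father_of) {
        my ($anc, $dist) = ($father_of->{$id}, 1);
        # Walk up the father links until we run out of ancestors.
        while ($anc) {
            push @rows, {
                tree_id_fk     => $id,
                tree_anc_id_fk => $anc,
                tree_dist      => $dist,
            };
            $anc = $father_of->{$anc};
            $dist++;
        }
    }
    return \@rows;
}

# Chain 1 -> 2 -> 3: node 3 has ancestors 2 (1 hop) and 1 (2 hops).
my $rows = build_tree_rows({ 1 => 0, 2 => 1, 3 => 2 });
printf "%d is %d hop(s) below %d\n",
    @$_{qw(tree_id_fk tree_dist tree_anc_id_fk)} for @$rows;
```

With rows of this shape, all descendants or all ancestors of a node can be found with a single lookup on one column, which is the property the tree table is described as providing.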

The following arguments are required:

table
This contains the table object for the table the tree is to be built upon. Note that when calling add_tree you should not specify this - add_tree passes it along on its own.

father
This must specify the name of the father ID column. The father ID column controls the relationship between father/child.

For example, if your primary key is ``my_id'' and your father id column is ``my_father_id'', you would pass in ``my_father_id'' as the value to father.

root
This is used to specify the name of the root column. For example, if your primary key is ``my_id'' and your root id column is ``my_root_id'', you would pass in ``my_root_id'' as the value to root.

depth
This is used to specify the name of the depth column for the table. For example, if you are using a column named ``my_depth'' to keep track of the depth of a node, you would pass in ``my_depth'' as the value to depth.

The following are optional arguments to create/add_tree:

force
Takes a value such as 'force' or 'check'. This value is passed on to the GT::SQL table creation subroutine.

rebuild
You can pass in a GT::SQL::Tree::Rebuild object if you have an incomplete or invalid table structure. See the GT::SQL::Tree::Rebuild manpage for more details.

debug
Sets the debug level of the tree object. add_tree() automatically passes in the debug value for the table object, so it normally is not necessary to set this.
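As an illustration of the arguments above, the following sketch wraps a call to add_tree in a small subroutine. It assumes $editor is a table editor object for the table in question (how that object is obtained is not covered here), and it reuses the example column names my_father_id, my_root_id and my_depth from the argument descriptions. The wrapper name add_tree_to_table is hypothetical, and nothing is assumed about what add_tree returns.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the documented add_tree() call.
# $editor is assumed to be a table editor object for the table
# the tree should be built on.
sub add_tree_to_table {
    my ($editor) = @_;
    return $editor->add_tree(
        father => 'my_father_id',   # name of the father ID column
        root   => 'my_root_id',     # name of the root ID column
        depth  => 'my_depth',       # name of the depth column
        # "table" is deliberately omitted: add_tree passes it along itself.
    );
}
```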

destroy, drop_tree

Calling $tree->destroy() on a GT::SQL::Tree object destroys the tree. This involves dropping the tree table and deleting the tree reference from the table the tree was on. Typically, however, this is invoked by calling $editor->drop_tree() on a table editor object.

Neither $tree->destroy() nor $editor->drop_tree() takes any arguments.

root_id_col, father_id_col, depth_col

These three tree object methods return the name of the associated column in the main table. Usually you will already know them, and these methods are primarily used internally.

children

This is where the usefulness of the tree module comes into play. $tree->children is used to access all of the children of a particular node. It takes a wide variety of arguments to control the return.

Usually, the return will be either a hash reference of array references each containing hash references, or else an array reference of hash references. Which reference you get depends on what you request via the id parameter, described below. Each inner hash reference is a row from the database, typically a joined row from the table the tree is on with the tree table, however the roots_only, cols, and select_from parameters all change this behaviour.

The arguments to children() are as follows:

id
The value of the id key is either a scalar value or an array reference. It should be the ID, or IDs, whose descendants you are looking for. For example, if you are looking for the children of ID 3 and ID 4, you would pass in id => [3, 4]. The return value of children will be a hash reference containing two keys: 3 and 4.

If you are looking for the children of a single ID and pass the id as a scalar value, you will get back an array reference as described above.

So, basically, if the value to id is an array reference, you will get back a hash reference of array references of hash references; if it is a scalar value, you will get back an array reference of hash references. $tree->children(id => [1])->{1}; and $tree->children(id => 1); will result in the same thing.

To get all the trees in a single query, you pass in 0 as the value. This is as if you are requesting the children of the imaginary root to which all roots belong.

id is the only required parameter.
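Because the return shape depends on whether id is a scalar or an array reference, calling code sometimes wants a single shape to work with. The sketch below is one way to do that, assuming $tree is a GT::SQL::Tree object; the helper name children_by_id is hypothetical. It relies only on the behaviour described above: an array reference yields a hash reference keyed by ID, a scalar yields the array reference directly.

```perl
use strict;
use warnings;

# Hypothetical helper: always return { id => [ rows ] }, whichever form
# of id the caller passes in.
sub children_by_id {
    my ($tree, $id) = @_;
    my $result = $tree->children(id => $id);
    return ref($id) eq 'ARRAY'
        ? $result                 # already keyed by ID
        : { $id => $result };     # wrap the plain array reference
}
```

With this helper, children_by_id($tree, 1)->{1} and children_by_id($tree, [1])->{1} refer to the same kind of value.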

max_depth
You can specify a max_depth value to require that the records returned be no more than a certain distance from the node. For example, suppose you have this tree:

    a
      b
        c
          d

Selecting the children of a with a max_depth of 1 would return just b, not c or d. A max_depth of 2 would return b and c.

Not specifying max_depth means that you do not want to limit the maximum distance from the parent of the returned values.
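The effect of max_depth can be modelled in plain Perl, without GT::SQL, as a filter on the distance stored with each (node, ancestor) pair. The sketch below hard-codes rows for a chain in which b is a child of a, c a child of b, and d a child of c. The column names follow those mentioned under INDICES; the helper name descendants is hypothetical, and this is only a model of the behaviour, not the module's actual query.

```perl
use strict;
use warnings;

# Rows for the chain a -> b -> c -> d, showing only ancestor 'a'.
my @rows = (
    { tree_id_fk => 'b', tree_anc_id_fk => 'a', tree_dist => 1 },
    { tree_id_fk => 'c', tree_anc_id_fk => 'a', tree_dist => 2 },
    { tree_id_fk => 'd', tree_anc_id_fk => 'a', tree_dist => 3 },
);

# Hypothetical helper: descendants of $anc, optionally limited to those
# within $max_depth hops. An undefined $max_depth means "no limit".
sub descendants {
    my ($anc, $max_depth) = @_;
    return map  { $_->{tree_id_fk} }
           grep { $_->{tree_anc_id_fk} eq $anc
                  && (!defined $max_depth || $_->{tree_dist} <= $max_depth) }
           @rows;
}

print join(',', descendants('a', 1)), "\n";   # prints "b"
print join(',', descendants('a', 2)), "\n";   # prints "b,c"
print join(',', descendants('a')),    "\n";   # prints "b,c,d"
```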

cols
You can specify an array reference as the value to cols to alter the values returned. Instead of doing ``SELECT * FROM ...'', the query will be ``SELECT <what you specify> FROM ...''. Note, however, that the father, root, and depth columns are required and will be present in the rows returned whether or not you specify them.

sort_col, sort_order
Where the sort option sorts the results based on tree levels, sort_col and sort_order control the sorting for nodes with the same father ID. For example, with this tree:

    a
      b
      c

sort_col and sort_order affect whether b comes before or after c. The value of each can be either a scalar value or an array reference. There is essentially no difference; the scalar value is just a little easier when you are only sorting on a single column. The values of sort_col should be column names, and the values of sort_order 'ASC' or 'DESC', one per sort column. For example:

    sort_col => ['a', 'b'], sort_order => ['ASC', 'DESC']

will sort first in ascending order based on the value of column a, then in descending order based on the value of column b. This correlates directly to SQL - it becomes ``ORDER BY a ASC, b DESC''.

You can specify a different sort order for roots by using the roots_order_by option, when using id => 0. See below.

condition
If you want to limit the results, you can pass a GT::SQL::Condition object into children() via the condition key. The condition will apply to the select performed. For example, if you want to select rows with a column ``a'' having a value less than 20, you could do:

    my $cond = GT::SQL::Condition->new(a => '<' => 20);
    my $children = $tree->children(..., condition => $cond);

limit
You can specify any valid LIMIT value here, for example ``50, 25''. This option is only used when using id => 0 - it will limit the number of roots returned, taking into account the sort_col and sort_order.

roots_only
If you specify this option, it will assume that what you passed in via id consists only of root IDs. Doing so makes a join with the tree table unnecessary and allows you to use the select_from option. This option can be used (and generally this is a good idea) when specifying id => 0.

roots_order_by
This option controls the order of the roots when selecting roots using id => 0 and a limit. sort_order above will affect the order of children of the roots, but the order of the roots themselves will be controlled by whatever ORDER BY value you specify here.

Again, this option requires that id => 0, roots_only, and limit are also being used.

If this option is omitted, the ORDER BY will be generated from the values of the sort_col and sort_order options.

select_from
If you are using roots_only, you can also specify the select_from option. This option allows you to perform the selects from a GT::SQL::Relation object instead of just the table associated with the tree. Note that the table associated with the tree must be part of the relation; however, you can have as many other tables as you like.

left_join
If the select_from relation should be a left join, pass left_join => 1. This simply passes the left_join option to ->select. This option is only applicable when select_from is used.
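The options that only apply when selecting roots are easiest to see together. The sketch below combines id => 0, roots_only, limit and roots_order_by in one call, assuming $tree is a GT::SQL::Tree object. The wrapper name page_of_roots and the column my_date are hypothetical, passing 1 for roots_only is an assumption (any true value is presumed to work), and the exact form accepted by roots_order_by is assumed to be an ORDER BY fragment as described above.

```perl
use strict;
use warnings;

# Hypothetical wrapper: fetch a limited set of roots and their children.
sub page_of_roots {
    my ($tree) = @_;
    return $tree->children(
        id             => 0,               # the imaginary root of all roots
        roots_only     => 1,               # assumed: any true value enables it
        limit          => '50, 25',        # the example LIMIT value from above
        roots_order_by => 'my_date DESC',  # hypothetical column; orders the roots
        sort_col       => 'my_date',       # orders nodes sharing a father
        sort_order     => 'ASC',
    );
}
```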

parents

This is effectively the opposite of children. Instead of getting back all of the child nodes, it gives the parents, all the way up to the root, for any given node. The return value has the same form as that of children, so see that section.

Each array returned by parents is sorted by depth from root to parent.

id
id is the only required parameter for parents(). It should be either a scalar value or an array reference. You specify the IDs of the children whose parents you are looking for. The type of argument (scalar or array ref) affects the return in the same way as for children().

cols
cols works in a similar way to the cols parameter to children. You specify the columns you want in the return as an array ref. What you get back will have these columns in it. If cols is not specified, you'll get back all columns.

Note that 'tree_id_fk' and the depth column for the table are required fields and will be added if not specified.
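Since each array returned by parents() runs from the root down to the immediate parent, it maps naturally onto a path or breadcrumb string. The following sketch assumes $tree is a GT::SQL::Tree object and that the main table has a column called my_name; both the column name and the helper name path_to are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build "root > ... > parent" for a single node ID.
sub path_to {
    my ($tree, $id) = @_;
    # A scalar id yields an array reference of rows (hash references).
    my $parents = $tree->parents(id => $id, cols => ['my_name']);
    return join ' > ', map { $_->{my_name} } @$parents;
}
```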

child_ids

If you are looking for just the IDs of the children of a particular node, you should use this. The return value is one of the following, depending on what you pass in:

hash reference of array references
{ ID => [ID, ID, ...], ... } with one key in the hash reference for each id you specify. Each array reference contains the child IDs of the key ID.

hash reference of hash references
{ ID => { ID => dist, ID => dist, ... }, ... } with one key in the outer hash reference for each id you specify. Each inner hash reference is made of child_id => child_distance key-value pairs.

array reference
[ID, ID, ...]

hash reference
{ ID => dist, ID => dist, ... }

The first two apply when passing in an array reference for id, the latter two when passing a scalar value for id. The first and third are without include_dist specified, the second and fourth occur when you specify include_dist.

id
Like all other accessors, child_ids takes a scalar value or array reference as the id value. Return as noted above.

include_dist
This changes the return as noted above - instead of just getting an array reference of child IDs, you get the child IDs as the keys of a hash reference, and the distance of each child from the parent you requested as the values.
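As a small illustration of the include_dist form, the sketch below lists the descendants of one node ordered by their distance from it. It assumes $tree is a GT::SQL::Tree object, that passing a true value such as 1 is how include_dist is enabled, and that IDs are numeric; the helper name child_ids_nearest_first is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: child IDs of $id, nearest descendants first.
sub child_ids_nearest_first {
    my ($tree, $id) = @_;
    # Scalar id plus include_dist: { child_id => distance, ... }
    my $dist = $tree->child_ids(id => $id, include_dist => 1);
    # Sort by distance, then by ID (assumed numeric) for a stable order.
    return sort { $dist->{$a} <=> $dist->{$b} || $a <=> $b } keys %$dist;
}
```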

parent_ids

Exactly the same as child_ids, except that this works up the tree instead of down. Takes the same arguments, gives the same possible returns.


INDICES

A tree requires a few indices for optimal performance. If the table is never expected to hold more than a few rows, you won't notice a substantial difference; however, as with any table, the benefit of proper indexing becomes more noticeable as the table grows.

Two indices are created automatically on the tree table, one on tree_id_fk, and the other on tree_anc_id_fk,tree_dist, so you don't need to worry about that table.

Obviously, how you use the tree affects which indices you want; this section simply provides some general guidelines for the indices required.

Because the roots_only option is based solely on the main table and not the tree, if you are using roots_only (calling children with id => 0 automatically turns on the roots_only option), you want to make sure you have an index on the root column. If you also use the max_depth option, add the depth column to this index.

Keep in mind that you may need to mix other columns in here if you are using a condition with children(). This also applies when using the sort_col and sort_order parameters - basically you need to figure out what your indices are, and then add in the root column and, if using max_depth, the depth column.


COPYRIGHT

Copyright (c) 2004 Gossamer Threads Inc. All Rights Reserved. http://www.gossamer-threads.com/


VERSION

Revision: $Id: Tree.pm,v 1.29 2005/05/31 06:26:32 brewt Exp $