GT::SQL::Tree - Helps create and manage a tree in an SQL database.
    use GT::SQL::Tree;
    my $tree = $table->tree;
    my $children = $tree->children(id => [1,2,3], max_depth => 2);
    my $parents = $tree->parents(id => [4,5,6]);
GT::SQL::Tree is designed to implement a tree structure with an SQL table. Most of the work of managing the table is performed automatically behind the scenes; however, there are a couple of front-end methods for retrieving the tree nodes from a GT::SQL::Tree object.
Typically, the way to get a tree object is to call ->tree on a table object. The table object then calls GT::SQL::Tree->new for you and returns the result, which is a GT::SQL::Tree object. You should not normally call ->new directly, but instead let $table->tree call it with the proper arguments.
To use GT::SQL::Tree, you first need to call create(). You shouldn't call it directly, but instead call ->add_tree() on an editor object. The arguments to add_tree are passed through to create, so they are essentially the same (there is one exception: add_tree passes in table => $table_object).
create() will create a tree table, with a name based on the name of the table passed in. For example, if you wish to build a tree on 'MyTable', the tree table that is created by create() will be named MyTable_tree. The tree table provides easy one-query access to all of a node's parents or children, and also keeps track of the number of hops between a node and its descendant, allowing you to limit how far you descend into the tree.
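To picture what such a tree table holds, here is a minimal sketch in plain Perl that builds equivalent rows from a father-ID mapping. The column names tree_id_fk, tree_anc_id_fk, and tree_dist are the ones named in the indexing notes of this document; their meaning here (node, ancestor, hops) is an assumption drawn from the names. The %father data and the build_tree_rows helper are hypothetical and are not part of GT::SQL::Tree.

```perl
use strict;
use warnings;

# Hypothetical data: node ID => father ID (0 means the node is a root).
my %father = (1 => 0, 2 => 1, 3 => 1, 4 => 2);

# Build one row per (node, ancestor) pair, recording the number of hops.
# This only illustrates the idea; the module maintains its table itself.
sub build_tree_rows {
    my ($father) = @_;
    my @rows;
    for my $id (sort { $a <=> $b } keys %$father) {
        my ($anc, $dist) = ($father->{$id}, 1);
        while ($anc) {
            push @rows, {
                tree_id_fk     => $id,
                tree_anc_id_fk => $anc,
                tree_dist      => $dist,
            };
            ($anc, $dist) = ($father->{$anc}, $dist + 1);
        }
    }
    return \@rows;
}

my $rows = build_tree_rows(\%father);

# All descendants of node 1 that are at most one hop away.
my @near = map  { $_->{tree_id_fk} }
           grep { $_->{tree_anc_id_fk} == 1 && $_->{tree_dist} <= 1 } @$rows;
print "@near\n";
```

Because every ancestor pair is stored with its distance, a single filter on the ancestor and the hop count is enough to fetch a limited subtree.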
The following arguments are required:
For example, if your primary key is "my_id" and your father id column is "my_father_id", you would pass in "my_father_id" as the value to father.
The root and depth arguments are passed the same way, each naming the corresponding column in the table.
The following are optional arguments to create/add_tree:
add_tree() automatically passes in the debug value for the table object, so it is normally not necessary to set this.
You can call $tree->destroy to destroy a tree. This involves dropping the tree table and deleting the tree reference from the table the tree was on. It can be done by calling $tree->destroy() on a GT::SQL::Tree object; however, it is typically invoked by calling $editor->drop_tree() on a table editor object.
Neither $tree->destroy() nor $editor->drop_tree() takes any arguments.
These three tree object methods return the name of the associated column in the main table. Usually you will already know them, and these methods are primarily used internally.
This is where the usefulness of the tree module comes into play. $tree->children is used to access all of the children of a particular node. It takes a wide variety of arguments to control the return.
Usually, the return will be either a hash reference of array references each containing hash references, or else an array reference of hash references. Which reference you get depends on what you request via the id parameter, described below. Each inner hash reference is a row from the database, typically a joined row from the table the tree is on with the tree table; however, the roots_only, cols, and select_from parameters all change this behaviour.
The arguments to children() are as follows:
Passing an array reference as id, for example id => [3, 4], makes the return value of children a hash reference containing two keys: 3 and 4.
If you are looking for the children of a single ID and pass the id as a scalar value, you will get back an array reference as described above.
So, basically, if the value to id is an array reference, you will get back a hash reference of array references of hash references; if it is a scalar value, you will get back an array reference of hash references. $tree->children(id => [1])->{1}; and $tree->children(id => 1); will result in the same thing.
To get all the trees in a single query, you pass in 0 as the value. This is as if you are requesting the children of the imaginary root to which all roots belong.
id is the only required parameter.
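The two return shapes described for id can be pictured with a small stand-in written in plain Perl. This is only a sketch of the shapes: the %nodes and %desc data and the children_shape function are hypothetical, and the real method reads its rows from the database.

```perl
use strict;
use warnings;

# Hypothetical rows standing in for the joined database rows.
my %nodes = (
    2 => { id => 2, name => 'b' },
    3 => { id => 3, name => 'c' },
    4 => { id => 4, name => 'd' },
);

# Hypothetical descendant lists: ancestor ID => [descendant IDs].
my %desc = (1 => [2, 3, 4], 2 => [4]);

# Mimics the documented return shapes only; this is not the module's code.
sub children_shape {
    my %args = @_;
    my $id   = $args{id};
    if (ref($id) eq 'ARRAY') {
        # Array reference in: hash reference of array references out.
        my %result;
        for my $parent (@$id) {
            $result{$parent} = [ map { $nodes{$_} } @{ $desc{$parent} || [] } ];
        }
        return \%result;
    }
    # Scalar in: array reference of hash references out.
    return [ map { $nodes{$_} } @{ $desc{$id} || [] } ];
}

my $by_id = children_shape(id => [1]);
my $flat  = children_shape(id => 1);

# Both forms reach the same rows; only the outer wrapper differs.
print scalar(@{ $by_id->{1} }), " ", scalar(@$flat), "\n";
```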
Not specifying max_depth means that you do not want to limit the maximum distance from the parent of the returned values.
Use cols to alter the values returned. Instead of doing "SELECT * FROM ...", the query will be "SELECT <what you specify> FROM ...". Note, however, that the father, root, and depth columns are required and will be present in the rows returned whether or not you specify them.
The sort option sorts the results based on tree levels; sort_col and sort_order control the sorting for nodes with the same father ID. For example, with this tree:
    a
      b
      c
sort_col and sort_order affect whether b comes before or after c.
The value of each can either be a scalar value or an array reference. There is essentially no difference; the scalar value is just a little easier when you are only sorting on a single column. The values of sort_col should be column names, and the values of sort_order 'ASC' or 'DESC', one per sort column respectively. For example:
    sort_col => ['a','b'], sort_order => ['ASC', 'DESC']
will sort first in ascending order based on the value of column a, then in descending order based on the value of column b. This correlates directly to SQL: it becomes "ORDER BY a ASC, b DESC".
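The mapping from sort_col and sort_order to an ORDER BY clause can be sketched in plain Perl as follows. The order_by_clause helper is hypothetical and is not the module's implementation; in particular, falling back to ASC when a direction is missing is an assumption made only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: accepts scalars or array references, as the options do.
sub order_by_clause {
    my ($sort_col, $sort_order) = @_;
    my @cols   = ref($sort_col)   ? @$sort_col   : ($sort_col);
    my @orders = ref($sort_order) ? @$sort_order : ($sort_order);
    my @parts;
    for my $i (0 .. $#cols) {
        # Assumption: a missing direction falls back to ASC.
        my $dir = defined $orders[$i] ? uc $orders[$i] : 'ASC';
        push @parts, "$cols[$i] $dir";
    }
    return 'ORDER BY ' . join(', ', @parts);
}

print order_by_clause(['a', 'b'], ['ASC', 'DESC']), "\n";
print order_by_clause('a', 'DESC'), "\n";
```

The scalar and array reference forms go through the same code path, which matches the point that there is essentially no difference between them.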
You can specify a different sort order for roots by using the roots_order_by option when using id => 0. See below.
A GT::SQL::Condition object can be passed to children() via the condition key. The condition will apply to the select performed. For example, if you want to select rows with a column "a" having a value less than 20, you could do:
    my $cond = GT::SQL::Condition->new(a => '<' => 20);
    my $children = $tree->children(..., condition => $cond);
When using id => 0, limit restricts the number of roots returned, taking into account sort_col and sort_order.
roots_only can be set when id consists only of root IDs. Doing so makes a join with the tree table unnecessary and allows you to use the select_from option. This option can be used (and generally this is a good idea) when specifying id => 0.
roots_order_by applies when using id => 0 and a limit. sort_order above will affect the order of children of the roots, but the order of the roots themselves will be controlled by whatever ORDER BY value you specify here.
Again, this option requires that id => 0, roots_only, and limit are also being used.
If this option is omitted, the ORDER BY will be generated from the values of the sort_col and sort_order options.
The select_from option allows you to perform the selects from a GT::SQL::Relation object instead of just the table associated with the tree. Note that the table associated with the tree must be part of the relation; however, you can have as many other tables as you like.
left_join => 1 simply passes the left_join option to ->select. This option is only applicable when select_from is used.
This is effectively the opposite of children. Instead of getting back all of the child nodes, it gives the parents, all the way up to the root, for any given node. The return value is the same as that of children, so see that section.
Each array returned by parents is sorted by depth from root to parent.
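The root-to-parent ordering can be pictured with a short plain-Perl sketch. The %father data and the ancestors_root_first helper are hypothetical; the module itself obtains the ancestors from the tree table instead of walking father IDs.

```perl
use strict;
use warnings;

# Hypothetical data: node ID => father ID (0 means the node is a root).
my %father = (1 => 0, 2 => 1, 4 => 2, 7 => 4);

# Walk up from a node, then reverse so that the root comes first
# and the immediate parent comes last.
sub ancestors_root_first {
    my ($id) = @_;
    my @up;
    my $anc = $father{$id};
    while ($anc) {
        push @up, $anc;
        $anc = $father{$anc};
    }
    return [ reverse @up ];
}

my $chain = ancestors_root_first(7);
print "@$chain\n";
```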
id is the only required parameter for parents(). It should be either a scalar value or an array reference. You specify the IDs of the children whose parents you are looking for. The type of argument (scalar or array ref) affects the return in the same way as for children().
cols works in a similar way to the cols parameter to children. You specify the columns you want in the return as an array ref. What you get back will have these columns in it. If cols is not specified, you'll get back all columns.
Note that 'tree_id_fk' and the depth column for the table are required fields and will be added if not specified.
If you are looking for just the IDs of the children of a particular node, you should use this. The return value is one of the following, depending on what you pass in:
hash reference of array references: { ID => [ID, ID, ...], ... } with one ID in the hash reference for each id you specify. The array reference contains the child IDs of the key ID.
hash reference of hash references: { ID => { ID => dist, ID => dist, ... }, ... } with one ID in the outer hash reference for each id you specify. The inner hash reference is made of child_id => child_distance key-value pairs.
array reference: [ID, ID, ...]
hash reference: { ID => dist, ID => dist }
The first two apply when passing in an array reference for id, the latter two when passing a scalar value for id. The first and third are without include_dist specified; the second and fourth occur when you specify include_dist.
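The four return shapes can be sketched with a plain-Perl stand-in. The %dist data and the child_ids_shape function are hypothetical and only mimic the documented shapes; sorting the IDs is an assumption made here to keep the output predictable.

```perl
use strict;
use warnings;

# Hypothetical distances: ancestor ID => { descendant ID => hops }.
my %dist = (1 => { 2 => 1, 3 => 1, 4 => 2 }, 2 => { 4 => 1 });

# Mimics the four documented return shapes; not the module's own code.
sub child_ids_shape {
    my %args = @_;
    my $one = sub {
        my %d = %{ $dist{ $_[0] } || {} };
        # With include_dist: { ID => dist, ... }; without: [ID, ID, ...].
        return $args{include_dist} ? \%d : [ sort { $a <=> $b } keys %d ];
    };
    # Scalar id: return the inner structure directly.
    return $one->($args{id}) unless ref $args{id};
    # Array reference id: wrap one inner structure per requested ID.
    my %result;
    $result{$_} = $one->($_) for @{ $args{id} };
    return \%result;
}

my $ids   = child_ids_shape(id => 1);
my $dists = child_ids_shape(id => [1, 2], include_dist => 1);
print "@$ids\n";
print $dists->{1}{4}, "\n";
```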
id value. Return as noted above.
Exactly the same as child_ids, except that this works up the tree instead of down. Takes the same arguments, gives the same possible returns.
A tree requires a few indices to perform optimally. If the table is never expected to hold more than a few rows, you won't notice a substantial difference; however, as with any table, the benefit of proper indexing becomes more noticeable as the table grows.
Two indices are created automatically on the tree table, one on tree_id_fk, and the other on tree_anc_id_fk,tree_dist, so you don't need to worry about that table.
Obviously, how you use the tree affects which indices you want; this section simply provides some general guidelines for the indices required.
Because the roots_only option is based solely on the main table and not the tree, if you are using roots_only (calling children with id => 0 automatically turns on the roots_only option), you want to make sure you have an index on the root column. If you also use the max_depth option, add the depth column to this index.
Keep in mind that you may need to mix other columns in here if you are using a condition with children(). This also applies when using the sort_col and sort_order parameters: basically, you need to figure out what your indices are, and then add in the root column and, if using max_depth, the depth column.
Copyright (c) 2004 Gossamer Threads Inc. All Rights Reserved. http://www.gossamer-threads.com/
Revision: $Id: Tree.pm,v 1.29 2005/05/31 06:26:32 brewt Exp $