NAME
Tree::Simple - A simple tree object
SYNOPSIS
use Tree::Simple;
# make a tree root
my $tree = Tree::Simple->new("0", Tree::Simple->ROOT);
# explicitly add a child to it
$tree->addChild(Tree::Simple->new("1"));
# specify the parent when creating
# an instance and it adds the child implicitly
my $sub_tree = Tree::Simple->new("2", $tree);
# chain method calls
$tree->getChild(0)->addChild(Tree::Simple->new("1.1"));
# add more than one child at a time
$sub_tree->addChildren(
Tree::Simple->new("2.1"),
Tree::Simple->new("2.2")
);
# add siblings
$sub_tree->addSibling(Tree::Simple->new("3"));
# insert a child at a specified index
$sub_tree->insertChild(1, Tree::Simple->new("2.1a"));
# clean up circular references
$tree->DESTROY();
DESCRIPTION
This module is a fully object-oriented implementation of a simple n-ary tree.
It is built upon the concept of parent-child relationships, so every
Tree::Simple object has both a parent and a set of children (who
themselves may have children, and so on). Every Tree::Simple object
also has siblings, as they are just the children of their immediate parent.
It can be used to model hierarchical information such as a file-system, the
organizational structure of a company, an object inheritance hierarchy,
versioned files from a version control system or even an abstract syntax tree
for use in a parser. It makes no assumptions as to your intended usage, but
instead simply provides the structure and means of accessing and traversing
said structure.
This module uses exceptions and a minimal Design By Contract style. All method
arguments are required unless the documentation says otherwise; if a required
argument is not defined, an exception will usually be thrown. Many arguments
are also required to be of a specific type. For instance, the $parent argument
to the constructor must be a Tree::Simple object or an object derived from
Tree::Simple, otherwise an exception is thrown. This may
seem harsh to some, but it allows me to have confidence that my code
works as I intend, and for you to enjoy the same level of confidence when
using this module. Note, however, that this module does not use any Exception or
Error module; the exceptions are just strings thrown with "die".
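As a minimal sketch of what this contract looks like from the caller's side, the following wraps a bad call in "eval" and inspects the error. The exact wording of the error message is not shown here because it is not specified above; the only behavior relied on is that the exception is a plain string.

```perl
use strict;
use warnings;
use Tree::Simple;

my $tree = Tree::Simple->new("root");

# Passing something that is not a Tree::Simple object violates the contract,
# so addChild dies. The trailing "1" lets us tell success from failure.
my $ok    = eval { $tree->addChild("not a tree"); 1 };
my $error = $@;

unless ($ok) {
    # The exception is a plain string, not an exception object.
    print "caught a string exception: $error" unless ref $error;
}
```

Because the exceptions are strings, matching on their text is fragile; checking only whether the "eval" succeeded is usually the safer choice.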
I consider this module to be production stable. It is based on a module which
has been in use on a few production systems for approx. 2 years now with no
issue. The only difference is that the code has been cleaned up a bit,
comments added and thorough tests written for its public release. I am
confident it behaves as I would expect it to, and is (as far as I know)
bug-free. I have not stress-tested it under extreme duress, but I don't
really intend for it to be used in that type of situation. If this module cannot
keep up with your Tree needs, I suggest switching to one of the modules listed
in the "OTHER TREE MODULES" section below.
CONSTANTS
- ROOT
- This class constant serves as a placeholder for the root of
our tree. If a tree does not have a parent, then it is considered a
root.
METHODS
Constructor
- new ($node, $parent)
- The constructor accepts two arguments: a $node value and an
optional $parent. The $node value can be any scalar value (which includes
references and objects). The optional $parent value must be a
Tree::Simple object, or an object derived from Tree::Simple.
Setting this value implies that your new tree is a child of the parent
tree, and therefore adds it to the parent's children. If the $parent is
not specified then its value defaults to ROOT.
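A short sketch of the two ways of calling the constructor described above, plus the type check on $parent. The node values and the invalid parent string are made up for illustration.

```perl
use strict;
use warnings;
use Tree::Simple;

# No $parent given: the new tree is a root.
my $root = Tree::Simple->new("top");

# $parent given: the new tree is added to the parent's children implicitly.
my $child = Tree::Simple->new("child", $root);

print "root is a root\n"           if $root->isRoot;
print "root has one child\n"       if $root->getChildCount == 1;
print "child's parent is \$root\n" if $child->getParent == $root;

# A $parent that is not a Tree::Simple object is rejected with an exception.
my $ok = eval { Tree::Simple->new("orphan", "not a tree object"); 1 };
print "invalid parent was rejected\n" unless $ok;
```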
Mutator Methods
- setNodeValue ($node_value)
- This sets the node value to the scalar $node_value; an
exception is thrown if $node_value is not defined.
- setUID ($uid)
- This allows you to set your own unique ID for this specific
Tree::Simple object. A default value derived from the object's hex address
is provided for you, so use of this method is entirely optional. It is the
responsibility of the user to ensure the value's uniqueness; all that is
tested by this method is that $uid is a true value (evaluates to true in a
boolean context). For even more information about the Tree::Simple UID see
the "getUID" method.
- addChild ($tree)
- This method accepts only Tree::Simple objects or
objects derived from Tree::Simple; an exception is thrown
otherwise. This method will append the given $tree to the end of its
children list, and set up the correct parent-child relationships. This
method returns its invocant so that method calls can be chained.
For example:
my $tree = Tree::Simple->new("root")->addChild(Tree::Simple->new("child one"));
Or the more complex:
my $tree = Tree::Simple->new("root")->addChild(
Tree::Simple->new("1.0")->addChild(
Tree::Simple->new("1.0.1")
)
);
- addChildren (@trees)
- This method accepts an array of Tree::Simple
objects, and adds them to its children list. Like "addChild"
this method will return its invocant to allow for method call
chaining.
- insertChild ($index, $tree)
- This method accepts a numeric $index and a
Tree::Simple object ($tree), and inserts the $tree into the
children list at the specified $index. This results in the shifting down
of all children after the $index. The $index is checked to be sure it is
within the bounds of the child list; if it is out of bounds an exception is
thrown. The $tree argument's type is verified to be a Tree::Simple or
Tree::Simple derived object; if this condition fails, an exception
is thrown.
- insertChildren ($index, @trees)
- This method functions much as insertChild does, but instead
of inserting a single Tree::Simple, it inserts an array of
Tree::Simple objects. It too bounds checks the value of $index and
type checks the objects in @trees just as "insertChild"
does.
- removeChild ($child | $index)
- Accepts one of two different arguments. If given a
Tree::Simple object ($child), this method finds that specific
$child by comparing it with all the other children until it finds a match,
at which point the $child is removed. If no match is found, an exception
is thrown. If a non-Tree::Simple object is given as the $child
argument, an exception is thrown.
This method also accepts a numeric $index and removes the child found at
that index from its list of children. The $index is bounds checked; if
this check fails, an exception is thrown.
When a child is removed, all children after it shift up, and the removed
child is returned. The removed child is properly
disconnected from the tree and all its references to its old parent are
removed. However, in order to properly clean up any circular references
the removed child might have, it is advised to call its
"DESTROY" method. See the "CIRCULAR REFERENCES"
section for more information.
- addSibling ($tree)
- addSiblings (@trees)
- insertSibling ($index, $tree)
- insertSiblings ($index, @trees)
- The "addSibling", "addSiblings",
"insertSibling" and "insertSiblings" methods pass
along their arguments to the "addChild",
"addChildren", "insertChild" and
"insertChildren" methods of their parent object respectively.
This eliminates the need to overload these methods in subclasses which may
have specialized versions of the *Child(ren) methods. The one exception
is that if an attempt is made to add or insert siblings to the ROOT
of the tree then an exception is thrown.
NOTE: There is no "removeSibling" method as I felt it was
probably a bad idea. The same effect can be achieved by manual upwards
traversal.
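The mutators above can be sketched together as follows. This is only an illustration with made-up node values; it relies on the behavior described in this section (insertion shifts later children down, removal returns the removed child, sibling methods delegate to the parent and die on the root).

```perl
use strict;
use warnings;
use Tree::Simple;

my $tree = Tree::Simple->new("0");
$tree->addChildren(map { Tree::Simple->new($_) } qw(a b d));

# Insert "c" at index 2; "d" shifts down to index 3.
$tree->insertChild(2, Tree::Simple->new("c"));

# Remove by index; the removed child is returned and later children shift up.
my $removed = $tree->removeChild(0);

my @values = map { $_->getNodeValue } $tree->getAllChildren;
print "remaining children: @values\n";
print "removed child: ", $removed->getNodeValue, "\n";

# Siblings are added through the parent, so "e" becomes a child of $tree.
$tree->getChild(0)->addSibling(Tree::Simple->new("e"));
print "child count after addSibling: ", $tree->getChildCount, "\n";

# The root has no parent to delegate to, so adding a sibling to it dies.
my $ok = eval { $tree->addSibling(Tree::Simple->new("x")); 1 };
print "cannot add a sibling to the root\n" unless $ok;
```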
Accessor Methods
- getNodeValue
- This returns the value stored in the object's node
field.
- getUID
- This returns the unique ID associated with this particular
tree. This can be custom set using the "setUID" method, or you
can just use the default. The default is the hex-address extracted from
the stringified Tree::Simple object. This may not be a universally
unique identifier, but it should be adequate for at least the current
instance of your perl interpreter. If you need a UUID, one can be
generated with an outside module (there are
many to choose from on CPAN) and the "setUID" method (see
above).
- getChild ($index)
- This returns the child (a Tree::Simple object) found
at the specified $index. Note that we do use standard zero-based array
indexing.
- getAllChildren
- This returns an array of all the children (all
Tree::Simple objects). It will return an array reference in scalar
context.
- getSibling ($index)
- getAllSiblings
- Much like "addSibling" and
"addSiblings", these two methods simply call
"getChild" and "getAllChildren" on the invocant's
parent.
- getDepth
- Returns a number representing the invocant's depth within
the hierarchy of Tree::Simple objects.
NOTE: A "ROOT" tree has the depth of -1. This is because
Tree::Simple assumes that a tree's root will usually not contain data, but
just be an anchor for the data-containing branches. This may not be
intuitive in all cases, so I mention it here.
- getParent
- Returns the invocant's parent, which could be either
ROOT or a Tree::Simple object.
- getHeight
- Returns a number representing the length of the longest
path from the current tree to the furthest leaf node.
- getWidth
- Returns a number representing the breadth of the
current tree; basically, it is a count of all the leaf nodes.
- getChildCount
- Returns the number of children the invocant contains.
- getIndex
- Returns the index of this tree within its parent's child
list. Returns -1 if the tree is the root.
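To make the accessors concrete, here is a small sketch that builds a four-node tree and prints the structural values for each node. The tree shape and node values are chosen for illustration; the heights are printed rather than stated, since only their relationship (a parent is one taller than its tallest child) is relied on here.

```perl
use strict;
use warnings;
use Tree::Simple;

# 0
# |-- 1
# |   `-- 1.1
# `-- 2
my $root = Tree::Simple->new("0");
my $one  = Tree::Simple->new("1",   $root);
my $two  = Tree::Simple->new("2",   $root);
my $leaf = Tree::Simple->new("1.1", $one);

for my $node ($root, $one, $two, $leaf) {
    printf "%-4s depth=%2d height=%d children=%d index=%2d\n",
        $node->getNodeValue, $node->getDepth, $node->getHeight,
        $node->getChildCount, $node->getIndex;
}

# The sibling accessors go through the parent, so the list includes
# the invocant itself.
my @sibling_values = map { $_->getNodeValue } $two->getAllSiblings;
print "siblings of 2: @sibling_values\n";
```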
Predicate Methods
- isLeaf
- Returns true (1) if the invocant does not have any
children, false (0) otherwise.
- isRoot
- Returns true (1) if the invocant's "parent" field
is ROOT, returns false (0) otherwise.
Recursive Methods
- traverse ($func, ?$postfunc)
- This method accepts two arguments: a mandatory $func and an
optional $postfunc. If the argument $func is not defined then an exception
is thrown. If $func or $postfunc are not in fact CODE references then an
exception is thrown. The function $func is then applied recursively to all
the children of the invocant. If given, the function $postfunc will be
applied to each child after the child's children have been traversed.
Here is an example of a traversal function that will print out the hierarchy
as a tab-indented list.
$tree->traverse(sub {
my ($_tree) = @_;
print (("\t" x $_tree->getDepth()), $_tree->getNodeValue(), "\n");
});
Here is an example of a traversal function that will print out the hierarchy
in an XML-style format.
$tree->traverse(sub {
my ($_tree) = @_;
print ((' ' x $_tree->getDepth()),
'<', $_tree->getNodeValue(),'>',"\n");
},
sub {
my ($_tree) = @_;
print ((' ' x $_tree->getDepth()),
'</', $_tree->getNodeValue(),'>',"\n");
});
- size
- Returns the total number of nodes in the current tree and
all its sub-trees.
- height
- This method has been deprecated in favor of the
"getHeight" method above; it remains as an alias to
"getHeight" for backwards compatibility.
NOTE: This is also no longer a recursive method which gets its
value on demand, but a value stored in the Tree::Simple object itself,
hopefully making it much more efficient and usable.
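The order in which "traverse" calls $func and $postfunc can be seen by collecting node values instead of printing them. The sketch below uses an illustrative four-node tree; note that the invocant itself is not passed to either function, while "size" does count it.

```perl
use strict;
use warnings;
use Tree::Simple;

my $tree = Tree::Simple->new("0");
my $one  = Tree::Simple->new("1", $tree);
Tree::Simple->new("1.1", $one);
Tree::Simple->new("2", $tree);

my (@pre, @post);
$tree->traverse(
    sub { push @pre,  $_[0]->getNodeValue },   # called before a node's children
    sub { push @post, $_[0]->getNodeValue },   # called after a node's children
);

print "pre-order:  @pre\n";    # 1 1.1 2
print "post-order: @post\n";   # 1.1 1 2
print "size:       ", $tree->size, "\n";   # 4 (the invocant is counted)
```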
Visitor Methods
- accept ($visitor)
- It accepts either a Tree::Simple::Visitor object
(which includes classes derived from Tree::Simple::Visitor), or an
object which has a "visit" method available
(tested with "$visitor->can('visit')"). If these
qualifications are not met, an exception will be thrown. We then run
the Visitor's "visit" method, giving the current tree as its argument.
I have also created a number of Visitor objects and packaged them into the
Tree::Simple::VisitorFactory.
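As a sketch of the second option (an object that merely has a "visit" method), the following defines a small visitor class and passes it to "accept". The class name My::LeafCounter and its "leaves" field are hypothetical and exist only for this example.

```perl
use strict;
use warnings;
use Tree::Simple;

# Hypothetical visitor: it does not inherit from Tree::Simple::Visitor,
# it only provides the "visit" method that accept() looks for.
{
    package My::LeafCounter;

    sub new { return bless { leaves => 0 }, shift }

    sub visit {
        my ($self, $tree) = @_;
        # traverse() walks the descendants of $tree, not $tree itself.
        $tree->traverse(sub {
            my ($node) = @_;
            $self->{leaves}++ if $node->isLeaf;
        });
        return $self->{leaves};
    }
}

my $tree = Tree::Simple->new("0");
my $one  = Tree::Simple->new("1", $tree);
Tree::Simple->new("1.1", $one);
Tree::Simple->new("2", $tree);

my $visitor = My::LeafCounter->new;
$tree->accept($visitor);
print "leaves: $visitor->{leaves}\n";   # leaves: 2

# An object without a "visit" method is rejected.
my $ok = eval { $tree->accept(bless {}, 'My::NotAVisitor'); 1 };
print "non-visitor was rejected\n" unless $ok;
```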
Cloning Methods
Cloning a tree can be an extremely expensive operation for large trees, so we
provide two options for cloning, a deep clone and a shallow clone.
When a Tree::Simple object is cloned, the node is deep-copied in the following
manner. If we find a normal scalar value (non-reference), we simply copy it.
If we find an object, we attempt to call "clone" on it, otherwise we
just copy the reference (since we assume the object does not want to be
cloned). If we find a SCALAR or REF reference we copy the value contained within
it. If we find a HASH or ARRAY reference we copy the reference and recursively
copy all the elements within it (following these exact guidelines). We also do
our best to assure that circular references are cloned only once and
connections restored correctly. This cloning will not be able to copy CODE,
RegExp and GLOB references, as they are pretty much impossible to clone. We
also do not handle "tied" objects, and they will simply be copied as
plain references, and not re-"tied".
- clone
- The clone method does a full deep-copy clone of the object,
calling "clone" recursively on all its children. This does not
call "clone" on the parent tree however. Doing this would result
in a slowly degenerating spiral of recursive death, so it is not
recommended and therefore not implemented. What happens is that the copy of
the tree instance that "clone" is actually called upon is detached from
the tree and becomes a root node; all of the cloned children are then
attached as children of that copy. I personally think this is more
intuitive; having the cloning crawl back up the tree is not
what I think most people would expect.
- cloneShallow
- This method is an alternate option to the plain
"clone" method. This method allows the cloning of single
Tree::Simple object while retaining connections to the rest of the
tree/hierarchy.
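The difference between the two methods can be sketched as below. The tree shape and the hash used as a node value are illustrative; Scalar::Util's "refaddr" is used only to compare object identity.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);
use Tree::Simple;

my $root   = Tree::Simple->new("root");
my $branch = Tree::Simple->new({ name => "branch" }, $root);
Tree::Simple->new("leaf", $branch);

# Deep clone: detached from $root, with its own copies of node and children.
my $deep = $branch->clone;
print "deep clone is a root\n" if $deep->isRoot;
print "deep clone has its own node hash\n"
    if refaddr($deep->getNodeValue) != refaddr($branch->getNodeValue);
print "deep clone has its own child\n"
    if refaddr($deep->getChild(0)) != refaddr($branch->getChild(0));

# Shallow clone: a single new object that still points at the same
# parent and the same child objects as the original.
my $shallow = $branch->cloneShallow;
print "shallow clone keeps the parent\n"
    if refaddr($shallow->getParent) == refaddr($root);
print "shallow clone shares the child\n"
    if refaddr($shallow->getChild(0)) == refaddr($branch->getChild(0));
```

Note that a shallow clone is not added to the parent's children list; it only holds the same connections as the object it was copied from.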
Misc. Methods
- DESTROY
- To avoid memory leaks through uncleaned-up circular
references, we implement the "DESTROY" method. This method will
attempt to call "DESTROY" on each of its children (if it has
any). This will result in a cascade of calls to "DESTROY" on
down the tree. It also cleans up its parental relations as well.
Because of perl's reference counting scheme and how that interacts with
circular references, if you want an object to be properly reaped you
should manually call "DESTROY". This is especially necessary if
your object has any children. See the section on "CIRCULAR
REFERENCES" for more information.
- fixDepth
- Tree::Simple will manage your tree's depth field for you
using this method. You should never need to call it on your own, however
if you ever did need to, here it is. Running this method will traverse
all the invocant's sub-trees, correcting the depth as it goes.
- fixHeight
- Tree::Simple will manage your tree's height field for you
using this method. You should never need to call it on your own, however
if you ever did need to, here it is. Running this method will correct the
heights of the current tree and all its ancestors.
- fixWidth
- Tree::Simple will manage your tree's width field for you
using this method. You should never need to call it on your own, however
if you ever did need to, here it is. Running this method will correct the
widths of the current tree and all its ancestors.
Private Methods
I would not normally document private methods, but in case you need to subclass
Tree::Simple, here they are.
- _init ($node, $parent, $children)
- This method is here largely to facilitate subclassing. This
method is called by new to initialize the object, where new's primary
responsibility is creating the instance.
- _setParent ($parent)
- This method sets up the parental relationship. It is for
internal use only.
- _setHeight ($child)
- This method will set the height field based upon the height
of the given $child.
CIRCULAR REFERENCES
I have revised the model by which Tree::Simple deals with circular references.
In the past all circular references had to be manually destroyed by calling
DESTROY. The call to DESTROY would then call DESTROY on all the children, and
therefore cascade down the tree. This however was not always what was needed,
nor what made sense, so I have now revised the model to handle things in what
I feel is a more consistent and sane way.
Circular references are now managed with the simple idea that the parent makes
the decisions for the child. This means that child-to-parent references are
weak, while parent-to-child references are strong. So if a parent is destroyed
it will force all its children to detach from it; however, if a child is
destroyed it will not be detached from its parent.
Optional Weak References
By default, you are still required to call DESTROY in order for things to
happen. However I have now added the option to use weak references, which
alleviates the need for the manual call to DESTROY and allows Tree::Simple to
manage this automatically. This is accomplished with a compile time setting
like this:
use Tree::Simple 'use_weak_refs';
And from that point on Tree::Simple will use weak references to allow for perl's
reference counting to clean things up properly.
For those who are unfamiliar with weak references, and how they affect the
reference counts, here is a simple illustration. First is the normal model
that Tree::Simple uses:
+---------------+
| Tree::Simple1 |<---------------------+
+---------------+                      |
|        parent |                      |
|      children |-+                    |
+---------------+ |                    |
                  |                    |
                  |  +---------------+ |
                  +->| Tree::Simple2 | |
                     +---------------+ |
                     |        parent |-+
                     |      children |
                     +---------------+
Here, Tree::Simple1 has a reference count of 2 (one for the original variable it
is assigned to, and one for the parent reference in Tree::Simple2), and
Tree::Simple2 has a reference count of 1 (for the child reference in
Tree::Simple1).
Now, with weak references:
+---------------+
| Tree::Simple1 |.......................
+---------------+                      :
|        parent |                      :
|      children |-+                    : <--[ weak reference ]
+---------------+ |                    :
                  |                    :
                  |  +---------------+ :
                  +->| Tree::Simple2 | :
                     +---------------+ :
                     |        parent |..
                     |      children |
                     +---------------+
Now Tree::Simple1 has a reference count of 1 (for the variable it is assigned
to) and 1 weakened reference (for the parent reference in Tree::Simple2). And
Tree::Simple2 has a reference count of 1, just as before.
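The effect of a weakened parent reference can be illustrated with plain hashes and the core Scalar::Util module. This sketch deliberately does not use Tree::Simple and says nothing about its internals; the hash keys "parent" and "children" are just stand-ins for the two boxes in the diagrams.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Two plain hashes that point at each other, like the diagrams above.
my $parent = { name => "Tree::Simple1", children => [] };
my $child  = { name => "Tree::Simple2", parent => $parent };
push @{ $parent->{children} }, $child;

# Weaken the child-to-parent link: it no longer counts towards
# the parent's reference count.
weaken($child->{parent});
print "parent link is weak\n" if isweak($child->{parent});

# Drop the only strong reference to the parent. Its count reaches zero,
# it is freed, and the weak link in the child becomes undef.
undef $parent;
print "parent was freed\n" unless defined $child->{parent};
```

Without the call to "weaken", the two hashes would keep each other alive after "undef $parent", which is the leak that a manual call to DESTROY works around.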
BUGS
None that I am aware of. The code is pretty thoroughly tested (see "CODE
COVERAGE" below) and is based on a (non-publicly released) module which
I had used in production systems for about 3 years without incident. Of
course, if you find a bug, let me know, and I will be sure to fix it.
CODE COVERAGE
I use Devel::Cover to test the code coverage of my tests; below is the
Devel::Cover report on this module's test suite.
---------------------------- ------ ------ ------ ------ ------ ------ ------
File                           stmt branch   cond    sub    pod   time  total
---------------------------- ------ ------ ------ ------ ------ ------ ------
Tree/Simple.pm                 99.6   96.0   92.3  100.0   97.0   95.5   98.0
Tree/Simple/Visitor.pm        100.0   96.2   88.2  100.0  100.0    4.5   97.7
---------------------------- ------ ------ ------ ------ ------ ------ ------
Total                          99.7   96.1   91.1  100.0   97.6  100.0   97.9
---------------------------- ------ ------ ------ ------ ------ ------ ------
SEE ALSO
I have written a number of other modules which use or augment this module. They
are described below and are available on CPAN.
- Tree::Parser - A module for parsing formatted files into
Tree::Simple hierarchies.
- Tree::Simple::View - A set of classes for viewing
Tree::Simple hierarchies in various output formats.
- Tree::Simple::VisitorFactory - A set of several useful
Visitor objects for Tree::Simple objects.
- Tree::Binary - If you are looking for a binary tree,
you might want to check this one out.
Also, the author of Data::TreeDumper and I have worked together to make sure
that Tree::Simple and his module work well together. If you need a
quick and handy way to dump out a Tree::Simple hierarchy, this module does an
excellent job (and plenty more as well).
I have also recently stumbled upon some packaged distributions of Tree::Simple
for the various Unix flavors. Here are some links:
- FreeBSD Port -
<http://www.freshports.org/devel/p5-Tree-Simple/>
- Debian Package -
<http://packages.debian.org/unstable/perl/libtree-simple-perl>
- Linux RPM -
<http://rpmpan.sourceforge.net/Tree.html>
OTHER TREE MODULES
There are a few other Tree modules out there; here is a quick comparison between
Tree::Simple and them. Obviously I am biased, so take what I say with a
grain of salt, and keep in mind that I wrote Tree::Simple because I could
not find a Tree module that suited my needs. If Tree::Simple does not
fit your needs, I recommend looking at these modules. Please note that I am
only listing Tree::* modules I am familiar with here; if you think I have
missed a module, please let me know. I have also seen a few tree-ish modules
outside of the Tree::* namespace, but most of them are part of another
distribution (HTML::Tree, Pod::Tree, etc) and are likely
specialized in purpose.
- Tree::DAG_Node
- This module seems pretty stable and very robust with a lot
of functionality. However, Tree::DAG_Node does not come with any
automated tests. Its test.pl file simply checks that the module loads
and nothing else. While I am sure the author tested his code, I would feel
better if I was able to see that. The module is approx. 3000 lines with
POD, and 1,500 without the POD. The sheer depth and detail of the
documentation and the ratio of code to documentation is impressive, and
not to be taken lightly. But given that it is a well known fact that the
likelihood of bugs increases alongside the size of the code, I do not
feel comfortable with large modules like this which have no tests.
All this said, I am not a huge fan of the API either, I prefer the gender
neutral approach in Tree::Simple to the mother/daughter style of
Tree::DAG_Node. I also feel very strongly that
Tree::DAG_Node is trying to do much more than makes sense in a
single module, and is offering too many ways to do the same or similar
things.
However, of all the Tree::* modules out there, Tree::DAG_Node seems
to be one of the favorites, so it may be worth investigating.
- Tree::MultiNode
- I am not very familiar with this module; however, I have
heard some good reviews of it, so I thought it deserved mention here. I
believe it is based upon C++ code found in the book Algorithms in
C++ by Robert Sedgewick. It uses a number of interesting ideas, such as
a ::Handle object to traverse the tree with (similar to Visitors, but it
also seems to be kind of like a cursor). However, like
Tree::DAG_Node, it is somewhat lacking in tests and has only 6
tests in its suite. It also has one glaring bug, which is that there is
currently no way to remove a child node.
- Tree::Nary
- It is a (somewhat) direct translation of the N-ary tree
from the GLIB library, and the API is based on that. GLIB is a C library,
which means this is a very C-ish API. That doesn't appeal to me, but it might
to you; to each their own.
This module is similar in intent to Tree::Simple. It implements a
tree with n branches and has polymorphic node containers. It
implements many of the same methods as Tree::Simple and a few
others on top of that, but being based on a C library, is not very OO. In
most of the method calls the $self argument is not used and the second
argument $node is. Tree::Simple is a much more OO module than
Tree::Nary, so while they are similar in functionality they greatly
differ in implementation style.
- Tree
- This module is pretty old; it has not been updated since
Oct. 31, 1999 and is still on version 0.01. It also seems to be (from the
limited documentation) a binary and a balanced binary tree, whereas
Tree::Simple is an n-ary tree, and makes no attempt to
balance anything.
- Tree::Ternary
- This module is older than Tree, last update was
Sept. 24th, 1999. It seems to be a special purpose tree, for storing and
accessing strings, not general purpose like Tree::Simple.
- Tree::Ternary_XS
- This module is an XS implementation of the above tree
type.
- Tree::Trie
- This too is a specialized tree type. It sounds similar to
Tree::Ternary, but is much newer (latest release in 2003). It
seems specialized for the lookup and retrieval of information like a
hash.
- Tree::M
- Is a wrapper for a C++ library, whereas Tree::Simple
is pure-perl. It also seems to be a more specialized implementation of a
tree, therefore not really the same as Tree::Simple.
- Tree::Fat
- Is a wrapper around a C library, again Tree::Simple
is pure-perl. The author describes FAT-trees as a combination of a Tree
and an array. It looks like a pretty mean and lean module, and good if you
need speed and are implementing a custom data-store of some kind. The
author points out too that the module is designed for embedding and there
is no default embedding, so you can't really use it "out of the
box".
ACKNOWLEDGEMENTS
- Thanks to Nadim Ibn Hamouda El Khemir for making
Data::TreeDumper work with Tree::Simple.
- Thanks to Brett Nuske for his idea for the
"getUID" and "setUID" methods.
- Thanks to whoever submitted the memory leak bug to RT
(#7512).
- Thanks to Mark Thomas for his insight into how to best
handle the height and width properties without unnecessary
recursion.
- Thanks to Mark Lawrence for the &traverse post-func
patch, tests and docs.
AUTHOR
Stevan Little, <stevan@iinteractive.com>
Rob Kinyon, <rob@iinteractive.com>
COPYRIGHT AND LICENSE
Copyright 2004-2006 by Infinity Interactive, Inc.
<http://www.iinteractive.com>
This library is free software; you can redistribute it and/or modify it under
the same terms as Perl itself.