NAME¶
UR::Context - Manage the current state of the application
SYNOPSIS¶
use AppNamespace;
my $obj = AppNamespace::SomeClass->get(id => 1234);
$obj->some_property('I am changed');
UR::Context->get_current->rollback; # some_property reverts to its original value
$obj->other_property('Now, I am changed');
UR::Context->commit; # other_property now permanently has that value
DESCRIPTION¶
The main application code will rarely interact with UR::Context objects
directly, except for the "commit" and "rollback" methods.
It manages the mappings between an application's classes, object cache, and
external data sources.
SUBCLASSES¶
UR::Context is an abstract class. When an application starts up, the system
creates a handful of Contexts that logically exist within one another:
- 1. UR::Context::Root - A context to represent all the data reachable in
the application's namespace. It connects the application to external data
sources.
- 2. UR::Context::Process - A context to represent the state of data within
the currently running application. It handles the transfer of data to and
from the Root context, through the object cache, on behalf of the
application code.
- 3. UR::Context::Transaction - A context to represent an in-memory
transaction as a diff of the object cache. The Transaction keeps a list of
changes to objects and is able to revert those changes with
"rollback()", or apply them to the underlying context with
"commit()".
CONSTRUCTOR¶
- begin
-
my $trans = UR::Context::Transaction->begin();
UR::Context::Transaction instances are created through
"begin()".
A UR::Context::Root and UR::Context::Process context will be created for you
when the application initializes. Additional instances of these classes are
not usually instantiated.
METHODS¶
Most of the methods below can be called as either a class or object method of
UR::Context. If called as a class method, they will operate on the current
context.
- get_current
-
my $context = UR::Context::get_current();
Returns the UR::Context instance of whatever is the most currently created
Context. Can be called as a class or object method.
- query_underlying_context
-
my $should_load = $context->query_underlying_context();
$context->query_underlying_context(1);
A property of the Context that sets the default value of the $should_load
flag inside "get_objects_for_class_and_rule" as described below.
Initially, its value is undef, meaning that during a get(), the
Context will query the underlying data sources only if this query has not
been done before. Setting this property to 0 will make the Context never
query data sources, meaning that the only objects retrievable are those
already in memory. Setting the property to 1 means that every query will
hit the data sources, even if the query has been done before.
- get_objects_for_class_and_rule
-
@objs = $context->get_objects_for_class_and_rule(
$class_name,
$boolexpr,
$should_load,
$should_return_iterator
);
This is the method that serves as the main entry point to the Context behind
the "get()", and "is_loaded()" methods of UR::Object,
and "reload()" method of UR::Context.
$class_name and $boolexpr are required arguments, and specify the target
class by name and the rule used to filter the objects the caller is
interested in.
$should_load is a flag indicating whether the Context should load objects
satisfying the rule from external data sources. A true value means it
should always ask the relevent data sources, even if the Context believes
the requested data is in the object cache, A false but defined value means
the Context should not ask the data sources for new data, but only return
what is currently in the cache matching the rule. The value
"undef" means the Context should use the value of its
query_underlying_context property. If that is also undef, then it will use
its own judgement about asking the data sources for new data, and will
merge cached and external data as necessary to fulfill the request.
$should_return_iterator is a flag indicating whether this method should
return the objects directly as a list, or iterator function instead. If
true, it returns a subref that returns one object each time it is called,
and undef after the last matching object:
my $iter = $context->get_objects_for_class_and_rule(
'MyClass',
$rule,
undef,
1
);
my @objs;
while (my $obj = $iter->());
push @objs, $obj;
}
- has_changes
-
my $bool = $context->has_changes();
Returns true if any objects in the given Context's object cache (or the
current Context if called as a class method) have any changes that haven't
been saved to the underlying context.
- commit
-
UR::Context->commit();
Causes all objects with changes to save their changes back to the underlying
context. If the current context is a UR::Context::Transaction, then the
changes will be applied to whatever Context the transaction is a part of.
if the current context is a UR::Context::Process context, then
"commit()" pushes the changes to the underlying
UR::Context::Root context, meaning that those changes will be applied to
the relevent data sources.
In the usual case, where no transactions are in play and all data sources
are RDBMS databases, calling "commit()" will cause the program
to begin issuing SQL against the databases to update changed objects,
insert rows for newly created objects, and delete rows from deleted
objects as part of an SQL transaction. If all the changes apply cleanly,
it will do and SQL "commit", or "rollback" if not.
commit() returns true if all the changes have been safely
transferred to the underlying context, false if there were problems.
- rollback
-
UR::Context->rollback();
Causes all objects' changes for the current transaction to be reversed. If
the current context is a UR::Context::Transaction, then the transactional
properties of those objects will be reverted to the values they had when
the transaction started. Outside of a transaction, object properties will
be reverted to their values when they were loaded from the underlying data
source. rollback() will also ask all the underlying databases to
rollback.
- clear_cache
-
UR::Context->clear_cache();
Asks the current context to remove all non-infrastructional data from its
object cache. This method will fail and return false if any object has
changes.
- resolve_data_source_for_object
-
my $ds = $obj->resolve_data_source_for_object();
For the given $obj object, return the UR::DataSource instance that object
was loaded from or would be saved to. If objects of that class do not have
a data source, then it will return "undef".
- resolve_data_sources_for_class_meta_and_rule
-
my @ds = $context->resolve_data_sources_for_class_meta_and_rule($class_obj, $boolexpr);
For the given class metaobject and boolean expression (rule), return the
list of data sources that will need to be queried in order to return the
objects matching the rule. In most cases, only one data source will be
returned.
- infer_property_value_from_rule
-
my $value = $context->infer_property_value_from_rule($property_name, $boolexpr);
For the given boolean expression (rule), and a property name not mentioned
in the rule, but is a property of the class the rule is against, return
the value that property must logically have.
For example, if this object is the only TestClass object where
"foo" is the value 'bar', it can infer that the TestClass
property "baz" must have the value 'blah' in the current
context.
my $obj = TestClass->create(id => 1, foo => 'bar', baz=> 'blah');
my $rule = UR::BoolExpr->resolve('TestClass', foo => 'bar);
my $val = $context->infer_property_value_from_rule('baz', $rule);
# val now is 'blah'
- object_cache_size_highwater
-
UR::Context->object_cache_size_highwater(5000);
my $highwater = UR::Context->object_cache_size_highwater();
Set or get the value for the Context's object cache pruning high water mark.
The object cache pruner will be run during the next "get()" if
the cache contains more than this number of prunable objects. See the
"Object Cache Pruner" section below for more information.
- object_cache_size_lowwater
-
UR::Context->object_cache_size_lowwater(5000);
my $lowwater = UR::Context->object_cache_size_lowwater();
Set or get the value for the Context's object cache pruning high water mark.
The object cache pruner will stop when the number of prunable objects
falls below this number.
- prune_object_cache
-
UR::Context->prune_object_cache();
Manually run the object cache pruner.
- reload
-
UR::Context->reload($object);
UR::Context->reload('Some::Class', 'property_name', value);
Ask the context to load an object's data from an underlying Context, even if
the object is already cached. With a single parameter, it will use that
object's ID parameters as the basis for querying the data source.
"reload" will also accept a class name and list of key/value
parameters the same as "get".
- _light_cache
-
UR::Context->_light_cache(1);
Turn on or off the light caching flag. Light caching alters the behavior of
the object cache in that all object references in the cache are made weak
by Scalar::Util::weaken(). This means that the application code
must keep hold of any object references it wants to keep alive. Light
caching defaults to being off, and must be explicitly turned on with this
method.
Custom observer aspects¶
UR::Context sends signals for observers watching for some non-standard aspects.
- precommit
- After "commit()" has been called, but before any changes are
saved to the data sources. The only parameters to the Observer's callback
are the Context object and the aspect ("precommit").
- commit
- After "commit()" has been called, and after an attempt has been
made to save the changes to the data sources. The parameters to the
callback are the Context object, the aspect ("commit"), and a
boolean value indicating whether the commit succeeded or not.
- prerollback
- After "rollback()" has been called, but before and object state
is reverted.
- rollback
- After "rollback()" has been called, and after an attempt has
been made to revert the state of all the loaded objects. The parameters to
the callback are the Context object, the aspect ("rollback"),
and a boolean value indicating whether the rollback succeeded or not.
Data Concurrency¶
Currently, the Context is optimistic about data concurrency, meaning that it
does very little to prevent clobbering data in underlying Contexts during a
commit() if other processes have changed an object's data after the
Context has cached and object. For example, a database has an object with ID 1
and a property with value 'bob'. A program loads this object and changes the
property to 'fred', but does not yet
commit(). Meanwhile, another
program loads the same object, changes the value to 'joe' and does
commit(). Finally the first program calls
commit(). The final
value in the database will be 'fred', and no exceptions will be raised.
As part of the caching behavior, the Context keeps a record of what the object's
state is as it's loaded from the underlying Context. This is how the Context
knows what object have been changed during "commit()".
If an already cached object's data is reloaded as part of some other query, data
consistency of each property will be checked. If there are no conflicting
changes, then any differences between the object's initial state and the
current state in the underlying Context will be applied to the object's notion
of what it thinks its initial state is.
In some future release, UR may support additional data concurrency methods such
as pessimistic concurrency: check that the current state of all changed (or
even all cached) objects in the underlying Context matches the initial state
before committing changes downstream. Or allowing the object cache to operate
in write-through mode for some or all classes.
Internal Methods¶
There are many methods in UR::Context meant to be used internally, but are worth
documenting for anyone interested in the inner workings of the Context code.
- _create_import_iterator_for_underlying_context
-
$subref = $context->_create_import_iterator_for_underlying_context(
$boolexpr, $data_source, $serial_number
);
$next_obj = $subref->();
This method is part of the object loading process, and is called by
"get_objects_for_class_and_rule" when it is determined that the
requested data does not exist in the object cache, and data should be
brought in from another, underlying Context. Usually this means the data
will be loaded from an external data source.
$boolexpr is the UR::BoolExpr rule, usually from the application code.
$data_source is the UR::DataSource that will be used to load data from.
$serial_number is used by the object cache pruner. Each object loaded
through this iterator will have $serial_number in its
"__get_serial" hashref key.
It works by first getting an iterator for the data source (the
$db_iterator). It calls "_resolve_query_plan_for_ds_and_bxt" to
find out how data is to be loaded and whether this request spans multiple
data sources. It calls
"__create_object_fabricator_for_loading_template" to get a list
of closures to transform the primary data source's data into UR objects,
and "_create_secondary_loading_closures" (if necessary) to get
more closures that can load and join data from the primary to the
secondary data source(s).
It returns a subref that works as an iterator, loading and returning objects
one at a time from the underlying context into the current context. It
returns undef when there are no more objects to return.
The returned iterator works by first asking the $db_iterator for the next
row of data as a listref. Asks the secondary data source joiners whether
there is any matching data. Calls the object fabricator closures to
convert the data source data into UR objects. If any of the object
requires subclassing, then additional importing iterators are created to
handle that. Finally, the objects matching the rule are returned to the
caller one at a time.
- _resolve_query_plan_for_ds_and_bxt
-
my $query_plan = $context->_resolve_query_plan_for_ds_and_bxt(
$data_source,
$boolexpr_tmpl
);
my($query_plan, @addl_info) = $context->_resolve_query_plan_for_ds_and_bxt(
$data_source,
$boolexpr_tmpl
);
When a request is made that will hit one or more data sources,
"_resolve_query_plan_for_ds_and_bxt" is used to call a method of
the same name on the data source. It retuns a hashref used by many other
parts of the object loading system, and describes what data source to use,
how to query that data source to get the objects, how to use the raw data
returned by the data source to construct objects and how to resolve any
delegated properties that are a part of the rule.
$data_source is a UR::DataSource object ID. $coolexpr_tmpl is a
UR::BoolExpr::Template object.
In the common case, the query will only use one data source, and this method
returns that data directly. But if the primary data source sets the
"joins_across_data_sources" key on the data structure as may be
the case when a rule involves a delegated property to a class that uses a
different data source, then this methods returns an additional list of
data. For each additional data source needed to resolve the query, this
list will have three items:
- 1.
- The secondary data source ID
- 2.
- A listref of delegated UR::Object::Property objects joining the primary
data source to this secondary data source.
- 3.
- A UR::BoolExpr::Template rule template applicable against the secondary
data source
- _create_secondary_rule_from_primary
-
my $new_rule = $context->_create_secondary_rule_from_primary(
$primary_rule,
$delegated_properties,
$secondary_rule_tmpl
);
When resolving a request that requires multiple data sources, this method is
used to construct a rule against applicable against the secondary data
source. $primary_rule is the UR::BoolExpr rule used in the original query.
$delegated_properties is a listref of UR::Object::Property objects as
returned by " _resolve_query_plan_for_ds_and_bxt()"
linking the primary to the secondary data source. $secondary_rule_tmpl is
the rule template, also as returned by "
_resolve_query_plan_for_ds_and_bxt()".
- _create_secondary_loading_closures
-
my($obj_importers, $joiners) = $context->_create_secondary_loading_closures(
$primary_rule_tmpl,
@addl_info);
When reolving a request that spans multiple data sources, this method is
used to construct two lists of subrefs to aid in the request.
$primary_rule_tmpl is the UR::BoolExpr::Template rule template made from
the original rule. @addl_info is the same list returned by
"_resolve_query_plan_for_ds_and_bxt". For each secondary data
source, there will be one item in the two listrefs that are returned, and
in the same order.
$obj_importers is a listref of subrefs used as object importers. They
transform the raw data returned by the data sources into UR objects.
$joiners is also a listref of subrefs. These closures know how the
properties link the primary data source data to the secondary data source.
They take the raw data from the primary data source, load the next row of
data from the secondary data source, and returns the secondary data that
successfully joins to the primary data. You can think of these closures as
performing the same work as an SQL "join" between data in
different data sources.
- _cache_is_complete_for_class_and_normalized_rule
-
($is_cache_complete, $objects_listref) =
$context->_cache_is_complete_for_class_and_normalized_rule(
$class_name, $boolexpr
);
This method is part of the object loading process, and is called by
"get_objects_for_class_and_rule" to determine if the objects
requested by the UR::BoolExpr $boolexpr will be found entirely in the
object cache. If the answer is yes then $is_cache_complete will be true.
$objects_listef may or may not contain objects matching the rule from the
cache. If that list is not returned, then
"get_objects_for_class_and_rule" does additional work to locate
the matching objects itself via
"_get_objects_for_class_and_rule_from_cache"
It does its magic by looking at the $boolexpr and loosely matching it
against the query cache $UR::Context::all_params_loaded
- _get_objects_for_class_and_rule_from_cache
-
@objects = $context->_get_objects_for_class_and_rule_from_cache(
$class_name, $boolexpr
);
This method is called by "get_objects_for_class_and_rule" when
_cache_is_complete_for_class_and_normalized_rule says the requested
objects do exist in the cache, but did not return those items directly.
The UR::BoolExpr $boolexpr contains hints about how the matching data is
likely to be found. Its "_context_query_strategy" key will
contain one of three values
- 1. all
- This rule is against a class with no filters, meaning it should return
every member of that class. It calls
"$class->all_objects_loaded" to extract all objects of that
class in the object cache.
- 2. id
- This rule is against a class and filters by only a single ID, or a list of
IDs. The request is fulfilled by plucking the matching objects right out
of the object cache.
- 3. index
- This rule is against one more more non-id properties. An index is built
mapping the filtered properties and their values, and the cached objects
which have those values. The request is fulfilled by using the index to
find objects matching the filter.
- 4. set intersection
- This is a group-by rule and will return a ::Set object.
- _loading_was_done_before_with_a_superset_of_this_params_hashref
-
$bool = $context->_loading_was_done_before_with_a_superset_of_this_params_hashref(
$class_name,
$params_hashref
);
This method is used by
"_cache_is_complete_for_class_and_normalized_rule" to determine
if the requested data was asked for previously, either from a get()
asking for a superset of the current request, or from a request on a
parent class of the current request.
For example, if a get() is done on a class with one param:
@objs = ParentClass->get(param_1 => 'foo');
And then later, another request is done with an additional param:
@objs2 = ParentClass->get(param_1 => 'foo', param_2 => 'bar');
Then the first request must have returned all the data that could have
possibly satisfied the second request, and so the system will not issue a
query against the data source.
As another example, given those two previously done queries, if another
get() is done on a class that inherits from ParentClass
@objs3 = ChildClass->get(param_1 => 'foo');
again, the first request has already loaded all the relevent data, and
therefore won't query the data source.
- _sync_databases
-
$bool = $context->_sync_databases();
Starts the process of committing all the Context's changes to the external
data sources. _sync_databases() is the workhorse behind
"commit".
First, it finds all objects with changes. Checks those changed objects for
validity with "$obj->invalid". If any objects are found
invalid, then _sync_databases() will fail. Finally, it bins all the
changed objects by data source, and asks each data source to save those
objects' changes. It returns true if all the data sources were able to
save the changes, false otherwise.
- _reverse_all_changes
-
$bool = $context->_reverse_all_changes();
_reverse_all_changes() is the workhorse behind "rollback".
For each class, it goes through each object of that class. If the object is
a UR::Object::Ghost, representing a deleted object, it converts the ghost
back to the live version of the object. For other classes, it makes a list
of properties that have changed since they were loaded (represented by the
"db_committed" hash key in the object), and reverts those
changes by using each property's accessor method.
The Object Cache¶
The object cache is integral to the way the Context works, and also the main
difference between UR and other ORMs. Other systems do no caching and require
the calling application to hold references to any objects it is interested in.
Say one part of the app loads data from the database and gives up its
references, then if another part of the app does the same or similar query, it
will have to ask the database again.
UR handles caching of classes, objects and queries to avoid asking the data
sources for data it has loaded previously. The object cache is essentially a
software transaction that sits above whatever database transaction is active.
After objects are loaded, any changes, creations or deletions exist only in
the object cache, and are not saved to the underlying data sources until the
application explicitly requests a commit or rollback.
Objects are returned to the application only after they are inserted into the
object cache. This means that if disconnected parts of the application are
returned objects with the same class and ID, they will have references to the
same exact object reference, and changes made in one part will be visible to
all other parts of the app. An unchanged object can be removed from the object
cache by calling its "unload()" method.
Since changes to the underlying data sources are effectively delayed, it is
possible that the application's notion of the object's current state does not
match the data stored in the data source. You can mitigate this by using the
"load()" class or object method to fetch the latest data if it's a
problem. Another issue to be aware of is if multiple programs are likely to
commit conflicting changes to the same data, then whichever applies its
changes last will win; some kind of external locking needs to be applied.
Finally, if two programs attempt to insert data with the same ID columns into
an RDBMS table, the second application's commit will fail, since that will
likely violate a constraint.
Object Change Tracking¶
As objects are loaded from their data sources, their properties are initialized
with the data from the query, and a copy of the same data is stored in the
object in its "db_committed" hash key. Anyone can ask the object for
a list of its changes by calling "$obj->changed". Internally,
changed() goes through all the object's properties, comparing the
current values in the object's hash with the same keys under 'db_committed'.
Objects created through the "create()" class method have no
'db_committed', and so the object knows it it a newly created object in this
context.
Every time an object is retrieved with
get() or through an iterator, it
is assigned a serial number in its "__get_serial" hash key from the
$UR::Context::GET_SERIAL counter. This number is unique and increases with
each
get(), and is used by the "Object Cache Pruner" to
expire the least recently requested data.
Objects also track what parameters have been used to
get() them in the
hash "$obj->{__load}". This is a copy of the data in
"$UR::Context::all_params_loaded->{$template_id}". For each rule
ID, it will have a count of the number of times that rule was used in a
get().
Deleted Objects and Ghosts¶
Calling
delete() on an object is tracked in a different way. First, a new
object is created, called a ghost. Ghost classes exist for every class in the
application and are subclasses of UR::Object::Ghost. For example, the ghost
class for MyClass is MyClass::Ghost. This ghost object is initialized with the
data from the original object. The original object is removed from the object
cache, and is reblessed into the UR::DeletedRef class. Any attempt to interact
with the object further will raise an exception.
Ghost objects are not included in a
get() request on the regular class,
though the app can ask for them specificly using
"MyClass::Ghost->get(%params)".
Ghost classes do not have ghost classes themselves. Calling
create() or
delete() on a Ghost class or object will raise an exception. Calling
other methods on the Ghost object that exist on the original, live class will
delegate over to the live class's method.
all_objects_are_loaded¶
$UR::Context::all_objects_are_loaded is a hashref keyed by class names. If the
value is true, then
"_cache_is_complete_for_class_and_normalized_rule" knows that all
the instances of that class exist in the object cache, and it can avoid asking
the underlying context/datasource for that class' data.
all_params_loaded¶
$UR::Context::all_params_loaded is a two-level hashref. The first level is class
names. The second level is rule (UR::BoolExpr) IDs. The values are how many
times that class and rule have been involved in a
get(). This data is
used by
"_loading_was_done_before_with_a_superset_of_this_params_hashref" to
determine if the requested data will be found in the object cache for non-id
queries.
all_objects_loaded¶
$UR::Context::all_objects_loaded is a two-level hashref. The first level is
class names. The second level is object IDs. Every time an object is created,
defined or loaded from an underlying context, it is inserted into the
"all_objects_loaded" hash. For queries involving only ID properties,
the Context can retrieve them directly out of the cache if they appear there.
The entire cache can be purged of non-infrastructional objects by calling
"clear_cache".
Object Cache Pruner¶
The default Context behavior is to cache all objects it knows about for the
entire life of the process. For programs that churn through large amounts of
data, or live for a long time, this is probably not what you want.
The Context has two settings to loosely control the size of the object cache.
"object_cache_size_highwater" and
"object_cache_size_lowwater". As objects are created and loaded, a
count of uncachable objects is kept in $UR::Context::all_objects_cache_size.
The first part of "get_objects_for_class_and_rule" checks to see of
the current size is greater than the highwater setting, and call
"prune_object_cache" if so.
prune_object_cache() works by looking at what $UR::Context::GET_SERIAL
was the last time it ran, and what it is now, and making a guess about what
object serial number to use as a guide for removing objects by starting at 10%
of the difference between the last serial and the current value, called the
target serial.
It then starts executing a loop as long as $UR::Context::all_objects_cache_size
is greater than the lowwater setting. For each uncachable object, if its
"__get_serial" is less than the target serial, it is weakened from
any UR::Object::Indexes it may be a member of, and then weakened from the main
object cache, $UR::Context::all_objects_loaded.
The application may lock an object in the cache by calling
"__strengthen__" on it, Likewise, the app may hint to the pruner to
throw away an object as soon as possible by calling "__weaken__".
SEE ALSO¶
UR::Context::Root, UR::Context::Process, UR::Object, UR::DataSource,
UR::Object::Ghost, UR::Observer