GT::SQL::Relation - manage multiple table joins
    my $relation = $DB->table('Company', 'Employees');
    my $sth = $relation->select(
        {
            'Company.Name'   => 'Gossamer Threads',
            'Employees.Name' => 'Alex Krohn'
        },
        ['Employees.Salary', 'Company.City']
    );
    my ($salary, $city) = $sth->fetchrow_array;
    print "Alex works in $city and earns $salary!\n";
This module emulates a set of tables that are related to each other through foreign keys, as if they were one big table.
The module interface is meant to be as compatible as possible with GT::SQL::Table, so you should be familiar with GT::SQL::Table before reading this.
This documentation explains the differences between GT::SQL::Relation and GT::SQL::Table, and how the module works internally.
GT::SQL supports the concept of foreign keys (also known as external references). Two tables that are linked together using external references can look like this:
    .-------------.      .---------.
    |  EMPLOYEE   |      | COMPANY |
    `-------------'      `---------'
    | ID          |  .--->ID       |
    | COMPANY_ID ----'   | NAME    |
    | NAME        |      `---------'
    | SALARY      |
    `-------------'
In this example, the COMPANY_ID attribute records which COMPANY an EMPLOYEE belongs to.
Using a Relation object can make these tables look like this:
    .----------------------.
    |   EMPLOYEE-COMPANY   |
    `----------------------'
    | EMPLOYEE.ID          |
    | EMPLOYEE.COMPANY_ID  |
    | EMPLOYEE.NAME        |
    | EMPLOYEE.SALARY      |
    | COMPANY.NAME         |
    `----------------------'
The first thing to notice is that COMPANY.ID has disappeared from this ``Virtual'' table.
For any given ``joined'' record this value must be the same in both tables, so representing it twice would only have been a source of confusion.
Selecting from a Relation object is pretty simple using the GT::SQL module. As the interface is (almost) the same as that of the GT::SQL::Table manpage, the GT::SQL wrapper returns a Table or a Relation object depending on the arguments that are passed to table().
    # This gives me a GT::SQL::Table object for
    # the EMPLOYEE table.
    my $emp = $sql->table('EMPLOYEE');

    # This gives me a GT::SQL::Relation object for
    # the relation between the EMPLOYEE and COMPANY tables.
    my $emp_cmp = $sql->table('EMPLOYEE', 'COMPANY');
From there, performing a select is pretty simple:
    # select all the people from a real cool company
    my $sth = $emp_cmp->select( { 'COMPANY.NAME' => "Gossamer Threads" } );
Internally, the generated SQL query would look like:
    SELECT EMPLOYEE.ID, EMPLOYEE.COMPANY_ID, EMPLOYEE.NAME,
           EMPLOYEE.SALARY, COMPANY.NAME
    FROM   EMPLOYEE, COMPANY
    WHERE  COMPANY.NAME = 'Gossamer Threads'
    AND    EMPLOYEE.COMPANY_ID = COMPANY.ID
Note that the join condition is computed and automatically appended at the end of the query, so you do not have to worry about this.
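To make the automatic join condition more concrete, here is a minimal sketch of how such a clause could be derived from a foreign key map. It assumes a map shaped like the fk() hash described in this document; join_conditions() is a hypothetical helper written for this illustration, not a GT::SQL function, and the real module may build its SQL differently.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of GT::SQL): turn a foreign key map of
# the form { target_table => { source_col => target_col } } into the
# join clause for one source table.
sub join_conditions {
    my ($source_table, $fk) = @_;
    my @clauses;
    for my $target (sort keys %$fk) {
        my $cols = $fk->{$target};
        for my $source_col (sort keys %$cols) {
            push @clauses,
                "$source_table.$source_col = $target.$cols->{$source_col}";
        }
    }
    return join ' AND ', @clauses;
}

# The user condition comes first, the computed join condition is appended.
my $where = join ' AND ',
    "COMPANY.NAME = 'Gossamer Threads'",
    join_conditions('EMPLOYEE', { COMPANY => { COMPANY_ID => 'ID' } });
print "WHERE $where\n";
```

With the EMPLOYEE-COMPANY example this yields the same WHERE clause as the query shown above.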
The select options for a Relation are similar to those of a Table. select_options() sets options that apply to the next query. Example:
    $relation->select_options("LIMIT 10");
This would append 'LIMIT 10' to your next select query. Another useful method is join_on(), which allows you to specify the FK relation for the next select. This overrides what is in the def files. It is useful when one table has to be joined differently depending on what you are doing. The arguments are the same as for fk().
Example:
    $relation->join_on( remote_table => { local_column => remote_column } );
The FK relation will be changed to this the next time you call select(), but it is cleared afterwards.
* As previously said, the cols() method, when invoked on a GT::SQL::Relation object, does not return all the columns: it removes the duplicate external references. So how does it decide which column to keep and which one to drop?
In the EMPLOYEE-COMPANY example we have the constraint EMPLOYEE.COMPANY_ID => COMPANY.ID and it keeps COMPANY_ID, i.e. the foreign key instead of the key itself.
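The rule can be sketched in a few lines of plain Perl. This is only an illustration under stated assumptions: the column lists and the %fk map are hand-written stand-ins for the table definitions, and relation_cols() is a hypothetical helper, not the module's actual implementation.

```perl
use strict;
use warnings;

# Hand-written stand-ins for the table definitions (assumptions).
my @tables  = qw(EMPLOYEE COMPANY);
my %columns = (
    EMPLOYEE => [qw(ID COMPANY_ID NAME SALARY)],
    COMPANY  => [qw(ID NAME)],
);
# source table => { target table => { source column => target column } }
my %fk = ( EMPLOYEE => { COMPANY => { COMPANY_ID => 'ID' } } );

# Hypothetical helper: list the relation columns, dropping every key
# column that is referenced from inside the relation.
sub relation_cols {
    my ($tables, $columns, $fk) = @_;
    my %in_relation = map { $_ => 1 } @$tables;
    my %dropped;
    for my $source (@$tables) {
        my $targets = $fk->{$source} || {};
        for my $target (grep { $in_relation{$_} } keys %$targets) {
            $dropped{"$target.$_"} = 1 for values %{ $targets->{$target} };
        }
    }
    return grep { !$dropped{$_} }
           map  { my $t = $_; map { "$t.$_" } @{ $columns->{$t} } } @$tables;
}

print join(', ', relation_cols(\@tables, \%columns, \%fk)), "\n";
```

The foreign key EMPLOYEE.COMPANY_ID survives and the referenced key COMPANY.ID is dropped, which matches the ``Virtual'' table shown earlier.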
* The pk() method has to return the table primary key. The defining property of a primary key is that it is a non-null unique record identifier. When pk() is invoked on a Relation object, this base definition is applied to construct the object's primary key.
To find a unique set of fields that makes a good primary key for a Relation object, the following, simple algorithm is used:
    for each table
        if the table is not referenced by another table that
           is in the current relation
        do
            append the current table's primary key fields to
            the Relation primary key fields
        end-do
        end-if
    end-for
This algorithm selects all the tables that represent the ``many'' side of one-to-many relations and, for each of these tables, adds the fields that ensure record uniqueness.
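The algorithm above can be written down as a short, self-contained sketch. The per-table primary keys and the foreign key map are passed in as plain hashes (an assumption made for this illustration), and relation_pk() is a hypothetical name rather than a method of the module.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the primary key fields of every table
# that is not referenced by another table of the same relation.
sub relation_pk {
    my ($tables, $pk, $fk) = @_;
    my %in_relation = map { $_ => 1 } @$tables;

    # Mark every table that is the target of an internal reference.
    my %referenced;
    for my $source (@$tables) {
        $referenced{$_} = 1
            for grep { $in_relation{$_} } keys %{ $fk->{$source} || {} };
    }

    # Keep the primary key fields of the unreferenced ("many") tables.
    return map  { my $t = $_; map { "$t.$_" } @{ $pk->{$t} } }
           grep { !$referenced{$_} } @$tables;
}

my @pk = relation_pk(
    [qw(EMPLOYEE COMPANY)],
    { EMPLOYEE => ['ID'], COMPANY => ['ID'] },
    { EMPLOYEE => { COMPANY => { COMPANY_ID => 'ID' } } },
);
print "@pk\n";   # EMPLOYEE.ID
```

COMPANY is referenced by EMPLOYEE, so only EMPLOYEE contributes its primary key to the relation.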
* When invoked on a GT::SQL::Table object, the fk() method returns a hash which has the following general structure:
    {
        target_table_1 => {
            source_col_1 => target_col_1,
            source_col_2 => target_col_2
        },
        target_table_2 => { source_col_1 => target_col_1 }
    }
The GT::SQL::Relation module returns a hash which has the same structure. The only difference is that it does not return the external references that are managed internally.
This is done for two reasons. First, as one field is removed from a Relation table, it would not be logical to return a structure that points to non-existent fields.
Moreover, from the ``Relation'' point of view these internal references have nothing to do with the external world and thus should not be shown.
(i.e. EMPLOYEE.COMPANY_ID |===> COMPANY.ID would not count in our example)
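As an illustration of this filtering, the sketch below merges the foreign keys of all tables in a relation and keeps only those that point outside it. The DEPARTMENT table and the DEPARTMENT_ID column are invented for the example, relation_fk() is a hypothetical helper, and the exact key format returned by the real fk() may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the references whose target table is
# not part of the relation.
sub relation_fk {
    my ($tables, $fk) = @_;
    my %in_relation = map { $_ => 1 } @$tables;
    my %external;
    for my $source (@$tables) {
        my $targets = $fk->{$source} || {};
        for my $target (grep { !$in_relation{$_} } keys %$targets) {
            $external{$target}{$_} = $targets->{$target}{$_}
                for keys %{ $targets->{$target} };
        }
    }
    return \%external;
}

# EMPLOYEE references COMPANY (internal) and DEPARTMENT (external; this
# second table is made up for the example).
my $fk = relation_fk(
    [qw(EMPLOYEE COMPANY)],
    {
        EMPLOYEE => {
            COMPANY    => { COMPANY_ID    => 'ID' },
            DEPARTMENT => { DEPARTMENT_ID => 'ID' },
        },
    },
);
print join(', ', sort keys %$fk), "\n";   # DEPARTMENT
```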
The interface for inserting data in a Relation is the same as the one used for Table. However, because rows are inserted into a one-to-many relation, things work a bit differently internally.
The Relation insert() method takes an optional argument, which can be 'complete' or 'abort' (the default being 'complete').
insert() splits the relation columns into separate records, each of which can be inserted in a single table. However, some of the records may already exist! For example, if we perform:
    $sql = shift; # our GT::SQL object
    $rel = $sql->table(qw/EMPLOYEE COMPANY/);
    $rel->insert({
        'EMPLOYEE.NAME'   => $your_name,
        'EMPLOYEE.SALARY' => $big_buck,
        'COMPANY.NAME'    => "Gossamer Threads"
    });
Obviously the company ``Gossamer Threads'' already exists, but you were not in the ``EMPLOYEE'' table. Thus, when 'complete' is specified (it is the default option), the program does not fail if a record to insert already exists; it just warns and continues the insertion work.
In other words, Gossamer Threads exists already and it will not be inserted twice, but the employee will still be inserted and will belong to this company.
On the other hand, if you specify ``abort'', then no data is inserted if a record that has to be inserted would trigger an error in GT::SQL::Table.
This feature can be useful if you want to insert a relation record assuming that none of the entities that you specify should exist.
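The difference between the two modes can be sketched with an in-memory stand-in for the database. Everything here is an assumption made for illustration: split_record() and insert_relation() are hypothetical helpers, records are treated as unique by NAME, the salary is sample data, and the linking of EMPLOYEE.COMPANY_ID to the company row is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: split one relation row into per-table records.
sub split_record {
    my ($row) = @_;
    my %by_table;
    for my $key (keys %$row) {
        my ($table, $column) = split /\./, $key, 2;
        $by_table{$table}{$column} = $row->{$key};
    }
    return \%by_table;
}

# $store maps table => { NAME => record }. Returns the list of tables
# that actually received a new record.
sub insert_relation {
    my ($store, $row, $mode) = @_;
    $mode ||= 'complete';
    my $records = split_record($row);
    my @tables  = sort keys %$records;
    my @new     = grep { !exists $store->{$_}{ $records->{$_}{NAME} } } @tables;

    # 'abort': insert nothing if any of the records already exists.
    return () if $mode eq 'abort' and @new < @tables;

    # 'complete': skip the existing records, insert the others.
    $store->{$_}{ $records->{$_}{NAME} } = $records->{$_} for @new;
    return @new;
}

my %store = ( COMPANY => { 'Gossamer Threads' => { NAME => 'Gossamer Threads' } } );
my @inserted = insert_relation(\%store, {
    'EMPLOYEE.NAME'   => 'Alex Krohn',
    'EMPLOYEE.SALARY' => 1000,
    'COMPANY.NAME'    => 'Gossamer Threads',
});
print "inserted into: @inserted\n";   # inserted into: EMPLOYEE
```

Passing 'abort' as the third argument with the same company would return an empty list and leave the store untouched.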
Deleting data from a Relation object works using the following pattern:
    for each row that matches the delete condition
    do
        split the row in table-based records
        for each table that contains foreign keys from the
            current relation object
        do
            delete the record
        end-do

        for each table that is being referenced by another
            table in the current relation object
        do
            delete the record unless there exists
            some "referencing" data.
        end-do
    end-do
As I feel that this explanation is probably very confusing, let us see how it works using our classical example (The salary column has been removed).
    .-------------------------------------------------------------.
    | EMPLOYEE.ID | COMPANY_ID | EMPLOYEE.NAME | COMPANY.NAME     |
    `-------------------------------------------------------------'
    | 1           | 1          | Alex          | Gossamer Threads |
    |-------------|------------|---------------|------------------|
    | 2           | 1          | Scott         | Gossamer Threads |
    |-------------|------------|---------------|------------------|
    | 3           | 1          | Aki           | Gossamer Threads |
    `-------------------------------------------------------------'
Now let us say that we do the following:
    # remove all the crazy geeks
    $relation->delete({ 'EMPLOYEE.NAME' => 'Scott' });
This will remove ``Scott'' from the EMPLOYEE table, but of course Gossamer Threads will not be deleted, because Alex and Aki still reference it.
Now if we do:
    $relation->delete({ 'COMPANY.NAME' => 'Gossamer Threads' });
or even
    my $condition = GT::SQL::Condition->new;
    $condition->add(qw/EMPLOYEE.NAME LIKE %/);
    $relation->delete($condition);
Then we have generated a condition that matches all the employees. This means that once the last employee record has been deleted, the company Gossamer Threads has no employees left and is therefore deleted as well.
(Yeah, well, this is for the purpose of this example, of course this will never happen in real life :) )
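The delete pattern can be replayed on in-memory data. This is a sketch only: the two tables are plain Perl structures, the condition is a code reference instead of a GT::SQL::Condition, and delete_from_relation() is a hypothetical helper that hard-codes the EMPLOYEE-COMPANY example.

```perl
use strict;
use warnings;

my %company  = ( 1 => 'Gossamer Threads' );
my @employee = (
    { ID => 1, COMPANY_ID => 1, NAME => 'Alex'  },
    { ID => 2, COMPANY_ID => 1, NAME => 'Scott' },
    { ID => 3, COMPANY_ID => 1, NAME => 'Aki'   },
);

# Hypothetical helper: delete the matching "many" records first, then
# delete each referenced company that no remaining employee points to.
sub delete_from_relation {
    my ($employees, $companies, $match) = @_;
    my (@kept, @deleted);
    for my $emp (@$employees) {
        my %joined = (
            'EMPLOYEE.NAME' => $emp->{NAME},
            'COMPANY.NAME'  => $companies->{ $emp->{COMPANY_ID} },
        );
        if ($match->(\%joined)) { push @deleted, $emp }
        else                    { push @kept,    $emp }
    }
    @$employees = @kept;
    for my $id (map { $_->{COMPANY_ID} } @deleted) {
        delete $companies->{$id}
            unless grep { $_->{COMPANY_ID} == $id } @kept;
    }
    return scalar @deleted;
}

# Scott goes, the company stays because Alex and Aki still reference it.
delete_from_relation(\@employee, \%company,
    sub { $_[0]{'EMPLOYEE.NAME'} eq 'Scott' });
print scalar(@employee), " employees, ", scalar(keys %company), " company\n";

# Everybody goes, and the now unreferenced company goes with them.
delete_from_relation(\@employee, \%company,
    sub { $_[0]{'COMPANY.NAME'} eq 'Gossamer Threads' });
print scalar(@employee), " employees, ", scalar(keys %company), " companies\n";
```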
Currently, there exists a limitation on updating records in a Relation, which is that only the records that represent the ``many'' part of the Relation are updated.
The way it proceeds to perform the update is pretty simple:
    for each row that matches the update condition
    do
        split the row in table-based records
        for each table that contains foreign keys from the
            current relation object
        do
            update the record
        end-do
    end-do
That means that this will work:
    # SALARY being a property of EMPLOYEE, it will be updated
    # because EMPLOYEE references COMPANY and therefore is a
    # "many"
    $relation->update(
        { SALARY => $big_bill },
        { 'COMPANY.NAME' => 'Gossamer Threads' }
    );

    # nope, you cannot use Relation to update the COMPANY table that
    # way, this will not do anything.
    $relation->update(
        { 'COMPANY.NAME' => 'New_Name' },
        { 'COMPANY.NAME' => 'Gossamer Threads' }
    );
Who would like to change such a great name anyway?
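One way to picture this limitation is as a filter on the columns being set: only assignments that belong to the ``many'' table are applied. The sketch below is an illustration under assumptions, not the module's code. assignments_for_many() is a hypothetical helper, the salary value is sample data, and treating an unqualified column name as belonging to the ``many'' table is a guess about name resolution.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the assignments that target the "many"
# table. Unqualified names are assumed to belong to that table.
sub assignments_for_many {
    my ($set, $many_table) = @_;
    my %applied;
    for my $key (keys %$set) {
        my ($table, $column) = $key =~ /\./
            ? split(/\./, $key, 2)
            : ($many_table, $key);
        $applied{$column} = $set->{$key} if $table eq $many_table;
    }
    return \%applied;
}

my $ok   = assignments_for_many({ SALARY => 5000 }, 'EMPLOYEE');
my $noop = assignments_for_many({ 'COMPANY.NAME' => 'New_Name' }, 'EMPLOYEE');
print scalar(keys %$ok), " applied, ", scalar(keys %$noop), " applied\n";
```

The first call keeps the SALARY assignment, while the second one ends up with nothing to apply, which mirrors the two update() calls above.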
Select behaves exactly like the GT::SQL::Table manpage select. The only difference is the ability to specify LEFT JOINs. For instance, if you want to see a list of Employees who don't belong to a company, you can do:
    my $relation = $DB->table('Employees', 'Company');
    my $cond     = GT::SQL::Condition->new('Company.ID', 'IS', \'NULL');
    my $sth      = $relation->select('left_join', $cond);
The order of tables specified in the relation constructor is important!
When selecting columns, calling a function on a fully qualified column name will cause GT::SQL::Relation to fail. Pass the expression as a scalar reference instead, as in the second call below.
    my $sth = $relation->select("MIN(Company.ID)");  # will fail
    my $sth = $relation->select(\"MIN(Company.ID)"); # will work
Copyright (c) 2004 Gossamer Threads Inc. All Rights Reserved. http://www.gossamer-threads.com/
Revision: $Id: Relation.pm,v 1.102 2004/08/28 03:53:43 jagerman Exp $