First, though, I encourage everyone to read the mod_perl Guide, by Stas Bekman. In particular, the Scenario and Performance Tuning sections are required reading for anyone who is trying to scale a mod_perl site.
As a note, creating an infinitely scalable mod_perl architecture is not impossible; it is not even particularly rare. The following well-known sites use mod_perl:
As a rule of thumb, vertical scaling strategies are more expensive from a hardware standpoint but cost less to maintain, and horizontal scaling strategies are cheaper from a hardware standpoint but cost more to maintain. In general, most solutions will consist of some combination of all of these suggestions: scaling vertically in some places, scaling horizontally in others, etc.
Here are some scaling techniques that can be used.
RESULTS: I tested the following configurations:
::::::::::::::
session_mysql.pl2
::::::::::::::
use Apache::Session::MySQL;
my %session;
# Connect to the database.
my $dbh = DBI->connect("DBI:mysql:database=apachesession;host=localhost",
"root", "", {'RaiseError' => 1});
$opts = {
Handle => $dbh,
LockHandle => $dbh
};
tie %session, 'Apache::Session::MySQL', undef, $opts;
use Time::HiRes qw(gettimeofday tv_interval);
$t0 = [gettimeofday];
for (1..1000) {
$session{username} = 'epark';
$session_id = $session{_session_id};
untie %session;
tie %session2, 'Apache::Session::MySQL',$session_id, $opts;
}
print "Content-type: text/html\n\n";
print tv_interval($t0);
::::::::::::::
session_oracle.pl2
::::::::::::::
use Apache::Session::Oracle;
my %session;
# Connect to the database.
my $dbh = DBI->connect("dbi:Oracle:XXXX","XXXX/XXXX");
$opts = {
Handle => $dbh,
LockHandle => $dbh,
Commit => 0
};
tie %session, 'Apache::Session::Oracle', undef, $opts;
use Time::HiRes qw(gettimeofday tv_interval);
$t0 = [gettimeofday];
for (1..1000) {
$session{username} = 'epark';
$session_id = $session{_session_id};
untie %session;
tie %session2, 'Apache::Session::Oracle',$session_id, $opts;
}
print "Content-type: text/html\n\n";
print tv_interval($t0);
::::::::::::::
session_file.pl2
::::::::::::::
use Apache::Session::File;
my %session;
my $opts = { Directory => '/tmp/session', LockDirectory => '/tmp/session', Transaction => 1 };
$session_id = 1;
tie %session, 'Apache::Session::File', undef, $opts;
use Time::HiRes qw(gettimeofday tv_interval);
$t0 = [gettimeofday];
for (1..1000) {
$session{username} = 'epark';
$session_id = $session{_session_id};
untie %session;
tie %session2, 'Apache::Session::File',$session_id, $opts;
}
print "Content-type: text/html\n\n";
print tv_interval($t0);
::::::::::::::
session_dbfile.pl2
::::::::::::::
use Apache::Session::DB_File;
my %session;
$opts = {
FileName => '/tmp/sessions/sessions.db',
LockDirectory => '/tmp/sessions',
};
tie %session, 'Apache::Session::DB_File', undef, $opts;
use Time::HiRes qw(gettimeofday tv_interval);
$t0 = [gettimeofday];
for (1..1000) {
$session{username} = 'epark';
$session_id = $session{_session_id};
untie %session;
tie %session2, 'Apache::Session::DB_File',$session_id, $opts;
}
print "Content-type: text/html\n\n";
print tv_interval($t0);
Solution: You can change the default Oracle schema! See this message from the dbi-dev archive. In particular, if you look at the official DBD::Oracle install directory. In there, there's a file called t/reauth.t. Here's a small chunk of code from that:
ok(0, ($dbh->selectrow_array("SELECT USER FROM DUAL"))[0] eq $uid1 );
ok(0, $dbh->func($dbuser_2, '', 'reauthenticate'));
ok(0, ($dbh->selectrow_array("SELECT USER FROM DUAL"))[0] eq $uid2 );
Another way of achieving the same effect would be to connect to the database once as a master user, and use the completely undocumented Oracle command "alter session set current_schema=$dbuser", via a simple:
$dbh->do("alter session set current_schema=$dbuser")
This actually executes much faster than the previous code. On a dual PIII/600 client (Linux) talking to a dual PIII-500 (Oracle on Linux), I benched the following for 300 change-schemas:
reauthenticate: 7.866089 alter session: 1.09673
Using DBD::Oracle, there are at least two levels of caching you should be aware of:
If you want to know what actually happens when you call $dbh->prepare, view this slide from one of Tim Bunce's DBI talks:
You may also want to consider using dbi's prepare_cached function, which will cache statement handles client-side so that initial prepare call to Oracle doesn't have to be made. Depending on your application, your mileage with this may vary.The raw performance impact of these two caching methods will, of course, depend on your particular application, but here is an example benchmark that can give you a feel for what the relative magnitudes of these techniques on a *very* simple SQL statement:
[bench]# perl dbibench.pl prepare_cached... 1.391559 prepare, use Oracle SGA... 3.445736 prepare, do not use Oracle SGA... 4.370455
from the DBI documentation:
$sth = $dbh->prepare_cached($statement)
$sth = $dbh->prepare_cached($statement, \%attr)
$sth = $dbh->prepare_cached($statement, \%attr, $allow_active)
Like the prepare entry elsewhere in this documentexcept that the statement handle returned will be stored
in a hash associated with the $dbh. If another call is made to prepare_cached with the same parameter
values then the corresponding cached $sth will be returned without contacting the database server.
This caching can be useful in some applications but it can also cause problems and should be used with
care. A warning will be generated if the cached $sth being returned is active (i.e., is a select that may
still have data to be fetched) unless $allow_active is true.
The cache can be accessed (and cleared) via the the CachedKids entry elsewhere in this documentattribute.
The following program was used to generate the results:
#!/usr/bin/perl
use Time::HiRes;
use DBI;
$dbh = DBI->connect("dbi:Oracle:DEV", "scott/tiger") || die "cannot connect to master root db\n";
$sql = "select * from dual";
print "prepare_cached...\n";
$t0 = [Time::HiRes::gettimeofday];
for ((1..1000)) {
$sth = $dbh->prepare_cached("select * from dual");
$sth->execute or die;
while ($sth->fetchrow_array) {};
}
print Time::HiRes::tv_interval($t0);
print "\n";
print "prepare...\n";
$t0 = [Time::HiRes::gettimeofday];
for ((1..1000)) {
$sth = $dbh->prepare("select * from dual");
$sth->execute or die;
while ($sth->fetchrow_array) {};
}
print Time::HiRes::tv_interval($t0);
print "\n";
print "prepare...\n";
$t0 = [Time::HiRes::gettimeofday];
for $i ((1..1000)) {
$sth = $dbh->prepare("select $i from dual");
$sth->execute or die;
while ($sth->fetchrow_array) {};
}
print Time::HiRes::tv_interval($t0);
print "\n";
To change the number of file descriptors used, try:
echo 128000 > /proc/sys/fs/inode-max echo 64000 > /proc/sys/fs/file-max and as root: ulimit -n 64000before running your server process. For more information on this and other OS techniques, see System Tuning Info for Linux Servers.
If you don't know what Oracle's explain plan is and you're trying to create a high-performance website, then you are seriously depriving yourself. Essentially, explain plan tells you the execution path of an Oracle statement in the database.
The intended use of this program is to understand where you are making database round-trips, and why. Because DBI is so clean and easy to use, it is easy to get trapped into making useless calls to the database; those calls become immediately apparent when using this utility.
Sample output:
1 SQL: select count(*) from transaction where voided is null and patientid=:p1
BINDS: 8100033794
Elapsed Time: 0.004796
SELECT STATEMENT Cost =
SORT AGGREGATE
TABLE ACCESS BY INDEX ROWID TRANSACTION
INDEX RANGE SCAN TRANSACTION_PATIENTID
2 SQL: select * from medicalgroup order by id
BINDS:
Elapsed Time: 0.00796
SELECT STATEMENT Cost =
TABLE ACCESS BY INDEX ROWID MEDICALGROUP
INDEX FULL SCAN PK_MEDICALGROUP
3 SQL: select value from tablespaceinfo where key='Time Zone Offset Hours'
BINDS:
Elapsed Time: 0.004105
SELECT STATEMENT Cost =
TABLE ACCESS FULL TABLESPACEINFO
Total Elapsed Time: 0.01086
Files
If you're interested in tuning your Oracle app in general have access to an X terminal, a good place to start is Alan Burlison's oracle_explain utility, which is bundled with the stock distribution of DBD::Oracle. Read the README.explain section of DBD::Oracle for more information. A quote from README.explain:
explain allows the user to grab all the SQL currently cached by Oracle. The SQL capture can be filtered and sorted by different criterea, e.g. all SQL matching a pattern, order by number of executions etc.
Another excellent tool that, among other things, includes an explain_plan window is Adam vonNieda's oracletool CGI script. This also allows you to manage your Oracle installation from any browser.
Finally, there is the venerable TOAD. The freeware version of this tool is a general-purpose replacement for SQL Worksheet that also allows you to explain plans, snoop in on active SQL sessions to see what's spinning your database, etc.
References/Other tools you can use:
package AccessTimer;
# USAGE:
# Just put the following line into your .conf file:
#
# PerlFixupHandler AccessTimer
#
# and use a custom Apache log (this logging piece is not at all
mod_perl-based...
# see http://httpd.apache.org/docs/mod/mod_log_config.html)
#
# CustomLog /path/to/your/log "%h %l %u %t \"%r\" %>s %b %{ELAPSED}e"
#
use strict;
use Apache::Constants qw(:common);
use Time::HiRes qw(gettimeofday tv_interval);
use vars qw($begin);
sub handler {
my $r = shift;
$begin = [gettimeofday];
$r->push_handlers(PerlLogHandler=>\&log);
return OK;
}
sub log {
my $r = shift;
my $elapsed = tv_interval($begin);
$r->subprocess_env('ELAPSED' => "$elapsed");
return DECLINED;
}
1;