As mentioned earlier, the following library modules are arranged in alphabetical order, for easy reference.
use AnyDBM_File;
This module is a "pure virtual base class"--it has nothing of its own. It's just there to inherit from the various DBM packages. By default it inherits from NDBM_File for compatibility with earlier versions of Perl. If it doesn't find NDBM_File, it looks for DB_File, GDBM_File, SDBM_File (which is always there--it comes with Perl), and finally ODBM_File.
Perl's dbmopen function (which now exists only for backward compatibility) actually just calls tie to bind a hash to AnyDBM_File. The effect is to bind the hash to one of the specific DBM classes that AnyDBM_File inherits from.
You can override the defaults and determine which class dbmopen will tie to. Do this by redefining @ISA:
@AnyDBM_File::ISA = qw(DB_File GDBM_File NDBM_File);
Note, however, that an explicit use takes priority over the ordering of @ISA, so that:
use GDBM_File;
will cause the next dbmopen to tie your hash to GDBM_File.
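As a minimal sketch of the override described above, the following script narrows the search list to SDBM_File before AnyDBM_File is loaded and then ties a hash through AnyDBM_File. SDBM_File is chosen only because it is bundled with Perl; the temporary directory and the database name `demo` are assumptions made for the illustration.

```perl
use strict;
use warnings;
use Fcntl;                          # for O_* values
use File::Temp qw(tempdir);

# Narrow the search list before AnyDBM_File is loaded.
# SDBM_File is chosen only because it is bundled with Perl.
BEGIN { @AnyDBM_File::ISA = qw(SDBM_File) }
use AnyDBM_File;

my $dir    = tempdir(CLEANUP => 1);
my $dbname = "$dir/demo";           # hypothetical database name

my %h;
tie(%h, 'AnyDBM_File', $dbname, O_RDWR|O_CREAT, 0644)
    or die "can't tie $dbname: $!";
$h{color} = 'blue';
untie %h;

# Reopen the same files to confirm that the value reached disk.
tie(%h, 'AnyDBM_File', $dbname, O_RDWR, 0644)
    or die "can't retie $dbname: $!";
my $color   = $h{color};
my $is_sdbm = AnyDBM_File->isa('SDBM_File') ? 1 : 0;
untie %h;

print "color=$color, SDBM_File selected: $is_sdbm\n";
```

The assignment to @ISA sits in a BEGIN block so that it happens before the use statement runs; setting it afterward would be too late to influence which DBM class gets loaded.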
You can tie hash variables directly to the desired class yourself, without using dbmopen or AnyDBM_File. For example, by using multiple DBM implementations, you can copy a database from one format to another:
use Fcntl; # for O_* values
use NDBM_File;
use DB_File;
tie %oldhash, "NDBM_File", $old_filename, O_RDWR;
tie %newhash, "DB_File", $new_filename, O_RDWR|O_CREAT|O_EXCL, 0644;
while (($key,$val) = each %oldhash) {
    $newhash{$key} = $val;
}
Here's a table of the features that the different DBMish packages offer:
| Feature | ODBM | NDBM | SDBM | GDBM | BSD-DB |
|---|---|---|---|---|---|
| Linkage comes with Perl | Yes | Yes | Yes | Yes | Yes |
| Source bundled with Perl | No | No | Yes | No | No |
| Source redistributable | No | No | Yes | GPL | Yes |
| Often comes with UNIX | Yes | Yes[1] | No | No | No |
| Builds OK on UNIX | N/A | N/A | Yes | Yes | Yes[2] |
| Code size | Varies[3] | Varies[3] | Small | Big | Big |
| Disk usage | Varies[3] | Varies[3] | Small | Big | OK[4] |
| Speed | Varies[3] | Varies[3] | Slow | OK | Fast |
| FTPable | No | No | Yes | Yes | Yes |
| Easy to build | N/A | N/A | Yes | Yes | OK[5] |
| Block size limits | 1k | 4k | 1k[6] | None | None |
| Byte-order independent | No | No | No | No | Yes |
| User-defined sort order | No | No | No | No | Yes |
| Wildcard lookups | No | No | No | No | Yes |
Relevant library modules include: DB_File, GDBM_File, NDBM_File, ODBM_File, and SDBM_File. Related manpages: dbm (3), ndbm (3). Tied variables are discussed extensively in Chapter 5, Packages, Modules, and Object Classes, and the dbmopen entry in Chapter 3, Functions, may also be helpful. You can pick up the unbundled modules from the src/misc/ directory on your nearest CPAN site. Here are the most popular ones, but note that their version numbers may have changed by the time you read this:
http://www.perl.com/CPAN/src/misc/db.1.85.tar.gz
http://www.perl.com/CPAN/src/misc/gdbm-1.7.3.tar.gz
package GoodStuff;
use Exporter;
use AutoLoader;
@ISA = qw(Exporter AutoLoader);
The AutoLoader module provides a standard mechanism for delayed loading of functions stored in separate files on disk. Each file has the same name as the function (plus a .al), and lives in a directory named after the package, under the auto/ directory. For example, the function named GoodStuff::whatever() would be loaded from the file auto/GoodStuff/whatever.al.
A module using the AutoLoader should have the special marker __END__ prior to the actual subroutine declarations. All code before this marker is loaded and compiled when the module is used. At the marker, Perl stops parsing the file.
When a subroutine not yet in memory is called, the AUTOLOAD function attempts to locate it in a directory relative to the location of the module file itself. As an example, assume POSIX.pm is located in /usr/local/lib/perl5/POSIX.pm. The AutoLoader will look for the corresponding subroutines for this package in /usr/local/lib/perl5/auto/POSIX/*.al.
Lexicals declared with my in the main block of a package using the AutoLoader will not be visible to autoloaded functions, because the given lexical scope ends at the __END__ marker. A module using such variables as file-scoped globals will not work properly under the AutoLoader. Package globals must be used instead. When running under use strict, the use vars pragma may be employed in such situations as an alternative to explicitly qualifying all globals with the package name. Package variables predeclared with this pragma will be accessible to any autoloaded routines, but of course they will also be visible outside the module file.
The AutoLoader is a counterpart to the SelfLoader module. Both delay the loading of subroutines, but the SelfLoader accomplishes this by storing the subroutines right there in the module file rather than in separate files elsewhere. While this avoids the use of a hierarchy of disk files and the associated I/O for each routine loaded, the SelfLoader suffers a disadvantage in the one-time parsing of the lines after __DATA__, after which routines are cached. The SelfLoader can also handle multiple packages in a file.
AutoLoader, on the other hand, only reads code as it is requested, and in many cases should be faster. But it requires a mechanism like AutoSplit to be used to create the individual files.
On systems with restrictions on file name length, the file corresponding to a subroutine may have a shorter name than the routine itself. This can lead to conflicting filenames. The AutoSplit module will warn of these potential conflicts when used to split a module.
See the discussion of autoloading in Chapter 5, Packages, Modules, and Object Classes. Also see the AutoSplit module, a utility that automatically splits a module into a collection of files for autoloading.
# from a program
use AutoSplit;
autosplit_modules(@ARGV)
# or from the command line
perl -MAutoSplit -e 'autosplit(FILE, DIR, KEEP, CHECK, MODTIME)' ...
# another interface
perl -MAutoSplit -e 'autosplit_lib_modules(@ARGV)' ...
This function splits up your program or module into files that the AutoLoader module can handle. It is mainly used to build autoloading Perl library modules, especially complex ones like POSIX. It is used by both the standard Perl libraries and by the MakeMaker module to automatically configure libraries for autoloading.
The autosplit() interface splits the specified FILE into a hierarchy rooted at the directory DIR. It creates directories as needed to reflect class hierarchy. It then creates the file autosplit.ix, which acts as both a forward declaration for all package routines and also as a timestamp for when the hierarchy was last updated.
The remaining three arguments to autosplit() govern other options to the autosplitter. If the third argument, KEEP, is false, then any pre-existing .al files in the autoload directory are removed if they are no longer part of the module (obsoleted functions). The fourth argument, CHECK, instructs autosplit() to check the module currently being split to ensure that it really does include a use specification for the AutoLoader module, and skips the module if AutoLoader is not detected. Lastly, the MODTIME argument specifies that autosplit() is to check the modification time of the module against that of the autosplit.ix file, and only split the module if it is newer.
Here's a typical use of AutoSplit by the MakeMaker utility via the command line:
perl -MAutoSplit -e 'autosplit($ARGV[0], $ARGV[1], 0, 1, 1)'
MakeMaker defines this as a make macro, and it is invoked with file and directory arguments. The autosplit() function splits the named file into the given directory and deletes obsolete .al files, after checking first that the module does use the AutoLoader and ensuring that the module isn't already split in its current form.
The autosplit_lib_modules() form is used in the building of Perl. It takes as input a list of files (modules) that are assumed to reside in a directory lib/ relative to the current directory. Each file is sent to the autosplitter one at a time, to be split into the directory lib/auto/.
In both usages of the autosplitter, only subroutines defined following the Perl special marker __END__ are split out into separate files. Routines placed prior to this marker are not autosplit, but are forced to load when the module is first required.
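The following sketch tries to show the whole cycle on a throwaway module: it writes a small GoodStuff.pm into a temporary directory, splits it with autosplit(), and then calls a routine that lives after the end marker. The module body, the `$Greeting` variable, and the directory layout are all assumptions made for the illustration. The module imports AUTOLOAD with `use AutoLoader 'AUTOLOAD'` rather than inheriting it through @ISA, because recent versions of Perl no longer permit an inherited AUTOLOAD for plain function calls.

```perl
use strict;
use warnings;
use AutoSplit;
use File::Path qw(make_path);
use File::Temp qw(tempdir);

my $root = tempdir(CLEANUP => 1);
make_path("$root/lib/auto");        # create lib/ and lib/auto/ up front

# A throwaway module: eager() sits before the end marker, whatever() after it.
my $pm = "$root/lib/GoodStuff.pm";
open(my $out, '>', $pm) or die "can't write $pm: $!";
print $out <<'END_OF_MODULE';
package GoodStuff;
use strict;
use AutoLoader 'AUTOLOAD';
use vars qw($Greeting);             # package global, visible to .al files
$Greeting = 'hello';
sub eager { return 'compiled at load time' }
1;
__END__
sub whatever {
    return "$Greeting from whatever";
}
END_OF_MODULE
close($out) or die "can't close $pm: $!";

$AutoSplit::Verbose = 0;            # keep the splitter quiet
autosplit($pm, "$root/lib/auto", 0, 1, 1);

my $al_file    = "$root/lib/auto/GoodStuff/whatever.al";
my $eager_file = "$root/lib/auto/GoodStuff/eager.al";
print "whatever.al exists\n"      if -e $al_file;
print "eager() was not split\n"   unless -e $eager_file;

unshift @INC, "$root/lib";
require GoodStuff;
my $result = GoodStuff::whatever(); # triggers loading of whatever.al
print "$result\n";
```

Because `$Greeting` is a package global declared with use vars, the autoloaded routine can see it; a file-scoped my variable in its place would not be visible from whatever.al.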
Currently, AutoSplit cannot handle multiple package specifications within one file.
AutoSplit will inform the user if it is necessary to create the top-level directory specified in the invocation. It's better if the script or installation process that invokes AutoSplit has created the full directory path ahead of time. This warning may indicate that the module is being split into an incorrect path.
AutoSplit will also warn the user of subroutines whose names cause potential naming conflicts on machines with severely limited (eight characters or fewer) filename length. Since the subroutine name is used as the filename, these warnings can aid in portability to such systems.
Warnings are issued and the file skipped if AutoSplit cannot locate either the __END__ marker or a specification of the form package Name;. AutoSplit will also complain if it can't create directories or files.
use Benchmark;
# timeit(): run $count iterations of the given Perl code, and time it
$t = timeit($count, 'CODE'); # $t is now a Benchmark object
# timestr(): convert Benchmark times to printable strings
print "$count loops of 'CODE' took:", timestr($t), "\n";
# timediff(): calculate the difference between two times
$t = timediff($t1, $t2);
# timethis(): run "code" $count times with timeit(); also, print out a
# header saying "timethis $count: "
$t = timethis($count, "CODE");
# timethese(): run timethis() on multiple chunks of code
@t = timethese($count, {
    'Name1' => '...CODE1...',
    'Name2' => '...CODE2...',
});
# new method: return the current time
$t0 = new Benchmark;
# ... your CODE here ...
$t1 = new Benchmark;
$td = timediff($t1, $t0);
print "the code took: ", timestr($td), "\n";
# debug method: enable or disable debugging
Benchmark->debug (1);
$t = timeit(10, ' 5 ** $Global ');
Benchmark->debug(0);
The Benchmark module encapsulates a number of routines to help you figure out how long it takes to execute some code a given number of times within a loop.
For the timeit() routine, $count is the number of times to run the loop. CODE is a string containing the code to run. timeit() runs a null loop with $count iterations, and then runs the same loop with your code inserted. It reports the difference between the times of execution.
For timethese(), a loop of $count iterations is run on each code chunk separately, and the results are reported separately. The code to run is given as a hash with keys that are names and values that are code. timethese() is handy for quick tests to determine which way of doing something is faster. For example:
$ perl -MBenchmark -Minteger
timethese(1000000, { add => '$i += 2', inc => '$i++; $i++' });
__END__
Benchmark: timing 1000000 iterations of add, inc...
add: 4 secs ( 4.52 usr 0.00 sys = 4.52 cpu)
inc: 6 secs ( 5.32 usr 0.00 sys = 5.32 cpu)
The following routines are exported into your namespace if you use the Benchmark module:
timeit() timethis() timethese() timediff() timestr()
The following routines will be exported into your namespace if you specifically ask that they be imported:
clearcache()      # clear just the cache element indexed by $key
clearallcache()   # clear the entire cache
disablecache()    # do not use the cache
enablecache()     # resume caching
Code is executed in the caller's package.
The null loop times are cached, the key being the number of iterations. You can control caching with calls like these:
clearcache($key);
clearallcache();
disablecache();
enablecache();
Benchmark inherits only from the Exporter class.
The elapsed time is measured using time (2) and the granularity is therefore only one second. Times are given in seconds for the whole loop (not divided by the number of iterations). Short tests may produce negative figures because Perl can appear to take longer to execute the empty loop than a short test.
The user and system CPU time is measured to millisecond accuracy using times (3). In general, you should pay more attention to the CPU time than to elapsed time, especially if other processes are running on the system. Also, elapsed times of five seconds or more are needed for reasonable accuracy.
Because you pass in a string to be evaled instead of a closure to be executed, lexical variables declared with my outside of the eval are not visible.
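To make the point about lexical visibility concrete, here is a small sketch under a few assumptions: the variable names `$Total` and `$hidden` and the iteration count are invented for the example. The timed string refers to a fully qualified package variable, which the string eval can reach, while the lexical is only used through the object interface, where no string eval is involved.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr timediff);

our $Total  = 0;                    # package global: reachable from 'CODE' strings
my  $hidden = 0;                    # lexical: NOT reachable from 'CODE' strings

my $count = 1000;                   # arbitrary; small enough to finish quickly
my $t = timeit($count, '$main::Total += 2');
print "$count loops took: ", timestr($t), "\n";
print "Total is now $Total\n";

# The object interface avoids string eval, so lexicals work normally.
my $t0 = Benchmark->new;
$hidden += 2 for 1 .. $count;
my $t1 = Benchmark->new;
my $td = timediff($t1, $t0);        # two arguments, later time first
print "plain loop took: ", timestr($td), "\n";
```

With so few iterations the reported times are too small to mean anything; the sketch is only meant to show which variables each style of timing can see.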
use Carp;
carp "Be careful!";          # warn of errors (from perspective of caller)
croak "We're outta here!";   # die of errors (from perspective of caller)
confess "Bye!";              # die of errors with stack backtrace
carp() and croak() behave like warn and die, respectively, except that they report the error as occurring not at the line of code where they are invoked, but at a line in one of the calling routines. Suppose, for example, that you have a routine goo() containing an invocation of carp(). In that case--and assuming that the current stack shows no callers from a package other than the current one--carp() will report the error as occurring where goo() was called. If, on the other hand, callers from different packages are found on the stack, then the error is reported as occurring in the package immediately preceding the package in which the carp() invocation occurs. The intent is to let library modules act a little more like built-in functions, which always report errors where you call them from.
confess() is like die except that it prints out a stack backtrace. The error is reported at the line where confess() is invoked, not at a line in one of the calling routines.
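The difference in where the error is reported can be seen in a short sketch. The package name `Library` and the routines `goo()` and `boom()` are hypothetical; the script captures the messages rather than letting them reach STDERR so that the reported line numbers can be compared with the lines where the calls were made.

```perl
use strict;
use warnings;

package Library;                    # hypothetical library package
use Carp;
sub goo  { carp  "goo() is deprecated" }
sub boom { croak "boom() needs an argument" }

package main;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $carp_line = __LINE__ + 1;
Library::goo();                     # carp() blames this line, not goo()'s body

my $croak_line = __LINE__ + 1;
eval { Library::boom() };           # croak() dies; the eval traps it
my $error = $@;

print "expected line $carp_line: $warnings[0]";
print "expected line $croak_line: $error";
```

Both messages name the line in package main where the library routine was called, which is the behavior a caller would expect from a built-in function.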
use Config;
if ($Config{cc} =~ /gcc/) {
    print "built by gcc\n";
}
use Config qw(myconfig config_sh config_vars);
print myconfig();
print config_sh();
config_vars(qw(osname archname));
The Config module contains all the information that the Configure script had to figure out at Perl build time (over 450 values).[1]
[1] Perl was written in C, not because it's a portable language, but because it's a ubiquitous language. A bare C program is about as portable as Chuck Yeager on foot.
Shell variables from the config.sh file (written by Configure) are stored in a readonly hash, %Config, indexed by their names. Values set to the string "undef" in config.sh are returned as undefined values. The Perl exists function should be used to check whether a named variable exists.
myconfig() returns a textual summary of the major Perl configuration values. See also the explanation of Perl's -V command-line switch in Chapter 6, Social Engineering.
config_sh() returns the entire Perl configuration information in the form of the original config.sh shell variable assignment script.
config_vars(@names) prints to STDOUT the values of the named configuration variables. Each is printed on a separate line in the form:
name='value';
Names that are unknown are output as name='UNKNOWN';.
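The distinction between a variable that does not exist and one that exists but is undefined can be checked along these lines. This is only a sketch; `no_such_config_name` is a made-up name that is assumed not to be a real configuration variable.

```perl
use strict;
use warnings;
use Config;                         # imports %Config
use Config qw(config_vars);         # optional function, imported on request

my @report;
for my $name (qw(osname archname no_such_config_name)) {
    if (!exists $Config{$name}) {
        push @report, "$name: not a configuration variable";
    }
    elsif (!defined $Config{$name}) {
        push @report, "$name: known, but undefined on this build";
    }
    else {
        push @report, "$name: $Config{$name}";
    }
}
print "$_\n" for @report;

config_vars(qw(osname));            # prints a line of the form name='value';
```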
Here's a more sophisticated example using %Config:
use Config;
defined $Config{sig_name} or die "No sigs?";
$i = 0;    # signal numbers start at zero
foreach $name (split(' ', $Config{sig_name})) {
    $signo{$name} = $i;
    $signame[$i] = $name;
    $i++;
}
print "signal #17 = $signame[17]\n";
if ($signo{ALRM}) {
    print "SIGALRM is $signo{ALRM}\n";
}
Because configuration information is not stored within the Perl executable itself, it is possible (but unlikely) that the information might not relate to the actual Perl binary that is being used to access it. The Config module checks the Perl version number when loaded to try to prevent gross mismatches, but can't detect subsequent rebuilds of the same version.
use Cwd;
$dir = cwd(); # get current working directory safest way
$dir = getcwd(); # like getcwd(3) or getwd(3)
$dir = fastcwd(); # faster and more dangerous
use Cwd 'chdir'; # override chdir; keep PWD up to date
chdir "/tmp";
print $ENV{PWD}; # prints "/tmp"
cwd() gets the current working directory using the most natural and safest form for the current architecture. For most systems it is identical to `pwd` (but without the trailing line terminator).
getcwd() does the same thing by re-implementing getcwd (3) or getwd (3) in Perl.
fastcwd() looks the same as getcwd(), but runs faster. It's also more dangerous because you might chdir out of a directory that you can't chdir back into.
It is recommended that one of these functions be used in all code to ensure portability because the pwd program probably only exists on UNIX systems.
If you consistently override your chdir built-in function in all packages of your program, then your PWD environment variable will automatically be kept up to date. Otherwise, you shouldn't rely on it. (Which means you probably shouldn't rely on it.)
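A short sketch of the override in action follows. The scratch directory comes from File::Temp and is an assumption of the example, as are the variable names; the script returns to its starting directory at the end so that the scratch directory can be cleaned up.

```perl
use strict;
use warnings;
use Cwd qw(cwd getcwd chdir);       # importing chdir overrides the built-in
use File::Temp qw(tempdir);

my $start = cwd();
my $tmp   = tempdir(CLEANUP => 1);  # scratch directory for the demonstration

chdir($tmp) or die "can't chdir to $tmp: $!";
my $pwd_env  = $ENV{PWD};           # maintained by Cwd's chdir
my $from_cwd = cwd();
my $from_get = getcwd();
print "PWD:      $pwd_env\n";
print "cwd():    $from_cwd\n";
print "getcwd(): $from_get\n";

chdir($start) or die "can't chdir back to $start: $!";
my $back = cwd();
print "back in $back\n";
```

On systems where the temporary directory is reached through a symbolic link, the value in PWD may differ textually from what cwd() and getcwd() report, since PWD records the path that was asked for.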
use DB_File;
# brackets in following code indicate optional arguments
[$X =] tie %hash, "DB_File", $filename [, $flags, $mode, $DB_HASH];
[$X =] tie %hash, "DB_File", $filename, $flags, $mode, $DB_BTREE;
[$X =] tie @array, "DB_File", $filename, $flags, $mode, $DB_RECNO;
$status = $X->del($key [, $flags]);
$status = $X->put($key, $value [, $flags]);
$status = $X->get($key, $value [, $flags]);
$status = $X->seq($key, $value [, $flags]);
$status = $X->sync([$flags]);
$status = $X->fd;
untie %hash;
untie @array;
DB_File is the most flexible of the DBM-style tie modules. It allows Perl programs to make use of the facilities provided by Berkeley DB (not included). If you intend to use this module you should really have a copy of the Berkeley DB manual page at hand. The interface defined here mirrors the Berkeley DB interface closely.
Berkeley DB is a C library that provides a consistent interface to a number of database formats. DB_File provides an interface to all three of the database (file) types currently supported by Berkeley DB.
The file types are:
DB_HASH allows arbitrary key/data pairs to be stored in data files. This is equivalent to the functionality provided by other hashing packages like DBM, NDBM, ODBM, GDBM, and SDBM. Remember, though, the files created using DB_HASH are not binary compatible with any of the other packages mentioned. A default hashing algorithm that will be adequate for most applications is built into Berkeley DB. If you do need to use your own hashing algorithm, it's possible to write your own and have DB_File use it instead.
DB_BTREE allows arbitrary key/data pairs to be stored in a sorted, balanced binary tree. It is possible to provide a user-defined Perl routine to perform the comparison of keys. By default, though, the keys are stored in lexical order. This is useful for providing an ordering for your hash keys, and may be used on hashes that are only in memory and never go to disk.
DB_RECNO allows both fixed-length and variable-length flat text files to be manipulated using the same key/value pair interface as in DB_HASH and DB_BTREE. In this case the key will consist of a record (line) number.
DB_File gives access to Berkeley DB files using Perl's tie function. This allows DB_File to access Berkeley DB files using either a hash (for DB_HASH and DB_BTREE file types) or an ordinary array (for the DB_RECNO file type).
In addition to the tie interface, it is also possible to use most of the functions provided in the Berkeley DB API.
Berkeley DB uses the function dbopen (3) to open or create a database. Below is the C prototype for dbopen (3).
DB *
dbopen (const char *file, int flags, int mode,
DBTYPE type, const void *openinfo)
The type parameter is an enumeration selecting one of the three interface methods, DB_HASH, DB_BTREE or DB_RECNO. Depending on which of these is actually chosen, the final parameter, openinfo, points to a data structure that allows tailoring of the specific interface method.
This interface is handled slightly differently in DB_File. Here is an equivalent call using DB_File.
tie %array, "DB_File", $filename, $flags, $mode, $DB_HASH;
The filename, flags, and mode parameters are the direct equivalent of their dbopen (3) counterparts. The final parameter $DB_HASH performs the function of both the type and openinfo parameters in dbopen (3).
In the example above $DB_HASH is actually a reference to a hash object. DB_File has three of these predefined references. Apart from $DB_HASH, there are also $DB_BTREE and $DB_RECNO.
The keys allowed in each of these predefined references are limited to the names used in the equivalent C structure. So, for example, the $DB_HASH reference will only allow keys called bsize, cachesize, ffactor, hash, lorder, and nelem.
To change one of these elements, just assign to it like this:
$DB_HASH->{cachesize} = 10_000;
In order to make RECNO more compatible with Perl, the array offset for all RECNO arrays begins at 0 rather than 1 as in Berkeley DB.
Berkeley DB allows the creation of in-memory databases by using NULL (that is, a (char *)0 in C) in place of the filename. DB_File uses undef instead of NULL to provide this functionality.
use strict;
use Fcntl;
use DB_File;
my ($k, $v, %hash);
tie(%hash, 'DB_File', undef, O_RDWR|O_CREAT, 0, $DB_BTREE)
    or die "can't tie DB_File: $!";
foreach $k (keys %ENV) {
    $hash{$k} = $ENV{$k};
}
# this will now come out in sorted lexical order
# without the overhead of sorting the keys
while (($k,$v) = each %hash) {
    print "$k=$v\n";
}
In addition to accessing Berkeley DB using a tied hash or array, you can also make direct use of most functions defined in the Berkeley DB documentation.
To do this you need to remember the return value from tie, or use the tied function to get at it yourself later on.
$db = tie %hash, "DB_File", "filename";
Once you have done that, you can access the Berkeley DB API functions directly.
$db->put($key, $value, R_NOOVERWRITE); # invoke the DB "put" function
All the functions defined in the dbopen (3) manpage are available except for close() and dbopen() itself. The DB_File interface to these functions mirrors the way Berkeley DB works. In particular, note that all these functions return only a status value. Whenever a Berkeley DB function returns data via one of its parameters, the DB_File equivalent does exactly the same thing.
All the constants defined in the dbopen manpage are also available.
Below is a list of the functions available. (The comments only tell you the differences from the C version.)
$X->get($key, $value [, $flags]): The $flags parameter is optional. The value associated with the key you request is returned in the $value parameter.
$X->put($key, $value [, $flags]): As usual the flags parameter is optional. If you use either the R_IAFTER or R_IBEFORE flags, the $key parameter will be set to the record number of the inserted key/value pair.
$X->del($key [, $flags]): The $flags parameter is optional.
$X->fd: No differences encountered.
$X->seq($key, $value [, $flags]): The $flags parameter is optional. Both the $key and $value parameters will be set.
$X->sync([$flags]): The $flags parameter is optional.
Here are a few examples. First, using $DB_HASH:
use DB_File;
use Fcntl;
tie %h, "DB_File", "hashed", O_RDWR|O_CREAT, 0644, $DB_HASH;
# Add a key/value pair to the file
$h{apple} = "orange";
# Check for value of a key
print "No, we have some bananas.\n" if $h{banana};
# Delete
delete $h{"apple"};
untie %h;
Here is an example using $DB_BTREE. Just to make life more interesting, the default comparison function is not used. Instead, a Perl subroutine, Compare(), does a case-insensitive comparison.
use DB_File;
use Fcntl;
sub Compare {
    my ($key1, $key2) = @_;
    "\L$key1" cmp "\L$key2";
}
$DB_BTREE->{compare} = \&Compare;
tie %h, 'DB_File', "tree", O_RDWR|O_CREAT, 0644, $DB_BTREE;
# Add a key/value pair to the file
$h{Wall} = 'Larry';
$h{Smith} = 'John';
$h{mouse} = 'mickey';
$h{duck} = 'donald';
# Delete
delete $h{duck};
# Cycle through the keys printing them in order.
# Note it is not necessary to sort the keys as
# the btree will have kept them in order automatically.
while ($key = each %h) { print "$key\n" }
untie %h;
The preceding code yields this output:
mouse
Smith
Wall
Next, an example using $DB_RECNO. You may access a regular textfile as an array of lines. But the first line of the text file is the zeroth element of the array, and so on. This provides a clean way to seek to a particular line in a text file.
my(@line, $number);
$number = 10;
use Fcntl;
use DB_File;
tie(@line, "DB_File", "/tmp/text", O_RDWR|O_CREAT, 0644, $DB_RECNO)
or die "can't tie file: $!";
$line[$number - 1] = "this is a new line $number";
Here's an example of updating a file in place:
use Fcntl;
use DB_File;
tie(@file, 'DB_File', "/tmp/sample", O_RDWR, 0644, $DB_RECNO)
or die "can't update /tmp/sample: $!";
print "line #3 was ", $file[2], "\n";
$file[2] = `date`;
untie @file;
Note that the tied array interface is incomplete, causing some operations on the resulting array to fail in strange ways. See the discussion of tied arrays in Chapter 5, Packages, Modules, and Object Classes. Some object methods are provided to avoid this. Here's an example of reading a file backward:
use DB_File;
use Fcntl;
$H = tie(@h, "DB_File", $file, O_RDWR, 0640, $DB_RECNO)
or die "Cannot open file $file: $!\n";
# print the records in reverse order
for ($i = $H->length - 1; $i >= 0; --$i) {
    print "$i: $h[$i]\n";
}
untie @h;
Concurrent access of a read-write database by several parties requires that each use some kind of locking. Here's an example that uses the fd() method to get the file descriptor, and then a careful open to give something Perl will flock for you. Run this repeatedly in the background to watch the locks granted in proper order. You have to call the sync() method to ensure that the writes make it to disk between access, or else the library would normally hold some in its own cache.
use Fcntl; use DB_File;
use strict;
sub LOCK_SH { 1 }
sub LOCK_EX { 2 }
sub LOCK_NB { 4 }
sub LOCK_UN { 8 }
my($oldval, $fd, $db_obj, %db_hash, $value, $key);
$key = shift || 'default'; $value = shift || 'magic';
$value .= " $$";
$db_obj = tie(%db_hash, 'DB_File', '/tmp/foo.db', O_CREAT|O_RDWR, 0644)
or die "dbcreat /tmp/foo.db $!";
$fd = $db_obj->fd;
print "$$: db fd is $fd\n";
open(DB_FH, "+<&=$fd") or die "fdopen $!";
unless (flock (DB_FH, LOCK_SH | LOCK_NB)) {
print "$$: CONTENTION; can't read during write update!
Waiting for read lock ($!) ....";
unless (flock (DB_FH, LOCK_SH)) { die "flock: $!" }
}
print "$$: Read lock granted\n";
$oldval = $db_hash{$key};
print "$$: Old value was $oldval\n";
flock(DB_FH, LOCK_UN);
unless (flock (DB_FH, LOCK_EX | LOCK_NB)) {
print "$$: CONTENTION; must have exclusive lock!
Waiting for write lock ($!) ....";
unless (flock (DB_FH, LOCK_EX)) { die "flock: $!" }
}
print "$$: Write lock granted\n";
$db_hash{$key} = $value;
sleep 10;
$db_obj->sync(); # to flush
flock(DB_FH, LOCK_UN);
untie %db_hash;
undef $db_obj; # removing the last reference to the DB
# closes it. Closing DB_FH is implicit.
print "$$: Updated db to $key=$value\n";
Related manpages: dbopen (3), hash (3), recno (3), btree (3).
Berkeley DB is available from these locations:
use Devel::SelfStubber;
$modulename = "Mystuff::Grok";   # no .pm suffix or slashes
$lib_dir = "";                   # defaults to current directory
Devel::SelfStubber->stub($modulename, $lib_dir);   # stubs only
# to generate the whole module with stubs inserted correctly
use Devel::SelfStubber;
$Devel::SelfStubber::JUST_STUBS = 0;
Devel::SelfStubber->stub($modulename, $lib_dir);
Devel::SelfStubber supports inherited, autoloaded methods by printing the stubs you need to put in your module before the __DATA__ token. A subroutine stub looks like this:
sub moo;
The stub ensures that if a method is called, it will get loaded. This is best explained using the following example:
Assume four classes, A, B, C, and D. A is the root class, B is a subclass of A, C is a subclass of B, and D is another subclass of A.
    A
   / \
  B   D
 /
C
If D calls an autoloaded method moo() which is defined in class A, then the method is loaded into class A, and executed. If C then calls method moo(), and that method was reimplemented in class B, but set to be autoloaded, then the lookup mechanism never gets to the AUTOLOAD mechanism in B because it first finds the moo() method already loaded in A, and so erroneously uses that. If the method moo() had been stubbed in B, then the lookup mechanism would have found the stub, and correctly loaded and used the subroutine from B.
So, to get autoloading to work right with classes and subclasses, you need to make sure the stubs are loaded.
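The lookup problem can be reproduced without SelfLoader at all. The sketch below is a simplified stand-in, not SelfLoader's actual mechanism: it builds the four-class hierarchy twice, once without a stub (the hypothetical `P::` classes) and once with a `sub moo;` stub in the B class (the hypothetical `S::` classes). A hash of code references plays the part of routines that have not been loaded yet, and each stand-in moo() simply returns its own fully qualified name.

```perl
use strict;
use warnings;

# Stand-ins for subroutine bodies that have not been loaded yet.
# All class names are hypothetical: P::* has no stubs, S::* has one.
my %unloaded = map { my $name = $_; ($name => sub { $name }) }
    qw(P::A::moo P::B::moo S::A::moo S::B::moo);

# "Load" the wanted method into $class, much as an autoloader would.
sub load_method {
    my ($class, $wanted) = @_;
    my ($method) = $wanted =~ /([^:]+)\z/;
    my $code = $unloaded{"${class}::$method"}
        or die "can't autoload ${class}::$method";
    no strict 'refs';
    *{"${class}::$method"} = $code;
    return $code;
}

{ package P::A; our $AUTOLOAD;
  sub AUTOLOAD { my $c = main::load_method(__PACKAGE__, $AUTOLOAD); goto &$c } }
{ package P::B; our @ISA = ('P::A'); our $AUTOLOAD;
  sub AUTOLOAD { my $c = main::load_method(__PACKAGE__, $AUTOLOAD); goto &$c } }
{ package P::C; our @ISA = ('P::B'); }
{ package P::D; our @ISA = ('P::A'); }

{ package S::A; our $AUTOLOAD;
  sub AUTOLOAD { my $c = main::load_method(__PACKAGE__, $AUTOLOAD); goto &$c } }
{ package S::B; our @ISA = ('S::A'); our $AUTOLOAD;
  sub moo;                          # the stub that makes the difference
  sub AUTOLOAD { my $c = main::load_method(__PACKAGE__, $AUTOLOAD); goto &$c } }
{ package S::C; our @ISA = ('S::B'); }
{ package S::D; our @ISA = ('S::A'); }

package main;

# In each family, D calls moo() first, then C does.
my @plain   = (P::D->moo, P::C->moo);
my @stubbed = (S::D->moo, S::C->moo);
print "without stub: @plain\n";
print "with stub:    @stubbed\n";
```

Without the stub, the call through C picks up the copy of moo() already loaded into A. With the stub in place, method lookup stops at B, and B's own AUTOLOAD loads the reimplemented routine.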
The SelfLoader can load stubs automatically at module initialization with:
SelfLoader->load_stubs();
But you may wish to avoid having the stub-loading overhead associated with your initialization.[2] In this case, you can put the subroutine stubs before the __DATA__ token. This can be done manually, by inserting the output of the first call to the stub() method above. But the module also allows automatic insertion of the stubs. By default the stub() method just prints the stubs, but you can set the global $Devel::SelfStubber::JUST_STUBS to 0 and it will print out the entire module with the stubs positioned correctly, as in the second call to stub().
[2] Although note that the load_stubs() method will be called sooner or later, at latest when the first subroutine is being autoloaded--which may be too late, if you're trying to moo().
At the very least, this module is useful for seeing what the SelfLoader thinks are stubs; in order to ensure that future versions of the SelfStubber remain in step with the SelfLoader, the SelfStubber actually uses the SelfLoader to determine which stubs are needed.
# As a pragma:
use diagnostics;
use diagnostics -verbose;
enable diagnostics;
disable diagnostics;
# As a program:
$ perl program 2>diag.out
$ splain [-v] [-p] diag.out
The diagnostics module extends the terse diagnostics normally emitted by both the Perl compiler and the Perl interpreter, augmenting them with the more explicative and endearing descriptions found in Chapter 9, Diagnostic Messages. It affects the compilation phase of your program rather than merely the execution phase.
To use in your program as a pragma, merely say:
use diagnostics;
at the start (or near the start) of your program. (Note that this enables Perl's -w flag.) Your whole compilation will then be subject to the enhanced diagnostics. These are still issued to STDERR.
Due to the interaction between run-time and compile-time issues, and because it's probably not a very good idea anyway, you may not use:
no diagnostics
to turn diagnostics off at compile time. However, you can turn diagnostics on or off at run-time by invoking diagnostics::enable() and diagnostics::disable(), respectively.
The -verbose argument first prints out the perldiag (1) manpage introduction before any other diagnostics. The $diagnostics::PRETTY variable, if set in a BEGIN block, results in nicer escape sequences for pagers:
BEGIN { $diagnostics::PRETTY = 1 }
While apparently a whole other program, splain is actually nothing more than a link to the (executable) diagnostics.pm module. It acts upon the standard error output of a Perl program, which you may have treasured up in a file, or piped directly to splain.
The -v flag has the same effect as:
use diagnostics -verbose
The -p flag sets $diagnostics::PRETTY to true. Since you're post-processing with splain, there's no sense in being able to enable() or disable() diagnostics.
Output from splain (unlike the pragma) is directed to STDOUT.
The following file is certain to trigger a few errors at both run-time and compile-time:
use diagnostics;
print NOWHERE "nothing\n";
print STDERR "\n\tThis message should be unadorned.\n";
warn "\tThis is a user warning";
print "\nDIAGNOSTIC TESTER: Please enter a <CR> here: ";
my $a, $b = scalar <STDIN>;
print "\n";
print $x/$y;
If you prefer to run your program first and look at its problems afterward, do this while talking to a Bourne-like shell:
perl -w test.pl 2>test.out
./splain < test.out
If you don't want to modify your source code, but still want on-the-fly warnings, do this:
perl -w -Mdiagnostics test.pl
If you want to control warnings on the fly, do something like this. (Make sure the use comes first, or you won't be able to get at the enable() or disable() methods.)
use diagnostics; # checks entire compilation phase
print "\ntime for 1st bogus diags: SQUAWKINGS\n";
print BOGUS1 'nada';
print "done with 1st bogus\n";
disable diagnostics; # only turns off run-time warnings
print "\ntime for 2nd bogus: (squelched)\n";
print BOGUS2 'nada';
print "done with 2nd bogus\n";
enable diagnostics; # turns back on run-time warnings
print "\ntime for 3rd bogus: SQUAWKINGS\n";
print BOGUS3 'nada';
print "done with 3rd bogus\n";
disable diagnostics;
print "\ntime for 4th bogus: (squelched)\n";
print BOGUS4 'nada';
print "done with 4th bogus\n";
use DirHandle;
my $d = new DirHandle "."; # open the current directory
if (defined $d) {
while (defined($_ = $d->read)) { something($_); }
$d->rewind;
while (defined($_ = $d->read)) { something_else($_); }
}
DirHandle provides an alternative interface to Perl's opendir, closedir, readdir, and rewinddir functions.
The only objective benefit to using DirHandle is that it avoids name-space pollution by creating anonymous globs to hold directory handles. Well, and it also closes the DirHandle automatically when the last reference goes out of scope. But since most people only keep a directory handle open long enough to slurp in all the filenames, this is of dubious value. But hey, it's object-oriented.
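To make the automatic close a little more concrete, here is a minimal sketch. The list_entries() helper is a name invented for this example, not part of DirHandle. It relies on read returning every remaining entry when called in list context, and on the handle being closed when $d goes out of scope at the end of the subroutine.

```perl
use strict;
use warnings;
use DirHandle;

# list_entries() is a hypothetical helper, not part of DirHandle itself.
sub list_entries {
    my ($dir) = @_;
    my $d = DirHandle->new($dir);
    return unless defined $d;            # the directory could not be opened

    # In list context, read returns all remaining entries at once.
    my @names = grep { $_ ne '.' && $_ ne '..' } $d->read;

    return sort @names;                  # $d is closed when it goes out of scope
}

print "$_\n" for list_entries(".");
```

Call the helper in list context; a directory that cannot be opened simply yields an empty list.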
package YourModule;
require DynaLoader;
@ISA = qw(... DynaLoader ...);
bootstrap YourModule;
This module defines the standard Perl interface to the dynamic linking mechanisms available on many platforms. A common theme throughout the module system is that using a module should be easy, even if the module itself (or the installation of the module) is more complicated as a result. This applies particularly to the DynaLoader. To use it in your own module, all you need are the incantations listed above in the synopsis. This will work whether YourModule is statically or dynamically linked into Perl. (This is a Configure option for each module.) The bootstrap() method will either call YourModule's bootstrap routine directly if YourModule is statically linked into Perl, or if not, YourModule will inherit the bootstrap() method from DynaLoader, which will do everything necessary to load in your module, and then call YourModule's bootstrap() method for you, as if it were there all the time and you called it yourself. Piece of cake, of the have-it-and-eat-it-too variety.
The rest of this description talks about the DynaLoader from the viewpoint of someone who wants to extend the DynaLoader module to a new architecture. The Configure process selects which kind of dynamic loading to use by choosing to link in one of several C implementations, which must be linked into perl statically. (This is unlike other C extensions, which provide a single implementation, which may be linked in either statically or dynamically.)
The DynaLoader is designed to be a very simple, high-level interface that is sufficiently general to cover the requirements of SunOS, HP-UX, NeXT, Linux, VMS, Win-32, and other platforms. By itself, though, DynaLoader is practically useless for accessing non-Perl libraries because it provides almost no Perl-to-C "glue". There is, for example, no mechanism for calling a C library function or supplying its arguments in any sort of portable form. This job is delegated to the other extension modules that you may load in by using DynaLoader.
Variables:
@dl_library_path
@dl_resolve_using
@dl_require_symbols
$dl_debug
Subroutines:
bootstrap($modulename);
@filepaths = dl_findfile(@names);
$filepath = dl_expandspec($spec);
$libref = dl_load_file($filename);
$symref = dl_find_symbol($libref, $symbol);
@symbols = dl_undef_symbols();
dl_install_xsub($name, $symref [, $filename]);
$message = dl_error;
The bootstrap() and dl_findfile() routines are standard across all platforms, and so are defined in DynaLoader.pm. The rest of the functions are supplied by the particular .xs file that supplies the implementation for the platform. (You can examine the existing implementations in the ext/DynaLoader/*.xs files in the Perl source directory. You should also read DynaLoader.pm, of course.) These implementations may also tweak the default values of the variables listed below.
@dl_library_path
The default list of directories in which dl_findfile() will search for libraries. Directories are searched in the order they are given in this array variable, beginning with subscript 0. @dl_library_path is initialized to hold the list of "normal" directories (/usr/lib and so on) determined by the Perl installation script, Configure, and given by $Config{'libpth'}. This is to ensure portability across a wide range of platforms. @dl_library_path should also be initialized with any other directories that can be determined from the environment at run-time (such as LD_LIBRARY_PATH for SunOS). After initialization, @dl_library_path can be manipulated by an application using push and unshift before calling dl_findfile(). unshift can be used to add directories to the front of the search order either to save search time or to override standard libraries with the same name. The load function that dl_load_file() calls might require an absolute pathname. The dl_findfile() function and @dl_library_path can be used to search for and return the absolute pathname for the library/object that you wish to load.
@dl_resolve_using
A list of additional libraries or other shared objects that can be used to resolve any undefined symbols that might be generated by a later call to dl_load_file(). This is only required on some platforms that do not handle dependent libraries automatically. For example, the Socket extension shared library (auto/Socket/Socket.so) contains references to many socket functions that need to be resolved when it's loaded. Most platforms will automatically know where to find the "dependent" library (for example, /usr/lib/libsocket.so). A few platforms need to be told the location of the dependent library explicitly. Use @dl_resolve_using for this. Example:
@dl_resolve_using = dl_findfile('-lsocket');
@dl_require_symbols
A list of one or more symbol names that are in the library/object file to be dynamically loaded. This is only required on some platforms.
$message = dl_error();
Error message text from the last failed DynaLoader function. Note that, similar to errno in UNIX, a successful function call does not reset this message. Implementations should detect the error as soon as it occurs in any of the other functions and save the corresponding message for later retrieval. This will avoid problems on some platforms (such as SunOS) where the error message is very temporary (see, for example, dlerror(3)).
$dl_debug
Internal debugging messages are enabled when $dl_debug is set true. Currently, setting $dl_debug only affects the Perl side of the DynaLoader. These messages should help an application developer to resolve any DynaLoader usage problems. $dl_debug is set to $ENV{'PERL_DL_DEBUG'} if defined. For the DynaLoader developer and porter there is a similar debugging variable added to the C code (see dlutils.c) and enabled if Perl was built with the -DDEBUGGING flag. This can also be set via the PERL_DL_DEBUG environment variable. Set to 1 for minimal information or higher for more.
@filepaths = dl_findfile(@names)
Determines the full paths (including file suffix) of one or more loadable files, given their generic names and optionally one or more directories. Searches directories in @dl_library_path by default and returns an empty list if no files were found. Names can be specified in a variety of platform-independent forms. Any names in the form -lname are converted into libname.*, where .* is an appropriate suffix for the platform. If a name does not already have a suitable prefix or suffix, then the corresponding file will be sought by trying prefix and suffix combinations appropriate to the platform: $name.o, lib$name.* and $name. If any directories are included in @names, they are searched before @dl_library_path. Directories may be specified as -Ldir. Any other names are treated as filenames to be searched for. Using arguments of the form -Ldir and -lname is recommended. Example:
@dl_resolve_using = dl_findfile(qw(-L/usr/5lib -lposix));
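As a rough illustration of how the search path and dl_findfile() work together, the sketch below adds a directory to the front of @dl_library_path and then asks for a library by its generic -lname form. The directory /opt/demo/lib is made up for the example, and whether -lm resolves to anything depends entirely on the platform, so the script is written to cope with an empty result.

```perl
use strict;
use warnings;
use DynaLoader ();

# Put a (hypothetical) private directory at the front of the search order.
unshift @DynaLoader::dl_library_path, '/opt/demo/lib';

# Ask for the math library by its generic name. The answer is
# platform-dependent and may well be an empty list.
my @found = DynaLoader::dl_findfile('-lm');

print @found ? "found: @found\n" : "no match for -lm on this system\n";
```

Because the new directory was added with unshift, it is searched before the directories that Configure recorded.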
$filepath = dl_expandspec($spec)
Some unusual systems such as VMS require special filename handling in order to deal with symbolic names for files (that is, VMS's Logical Names). To support these systems a dl_expandspec() function can be implemented either in the dl_*.xs file or code can be added to the autoloadable dl_expandspec() function in DynaLoader.pm.
$libref = dl_load_file($filename)
Dynamically load $filename, which must be the path to a shared object or library. An opaque "library reference" is returned as a handle for the loaded object. dl_load_file() returns the undefined value on error. (On systems that provide a handle for the loaded object such as SunOS and HP-UX, the returned handle will be $libref. On other systems $libref will typically be $filename or a pointer to a buffer containing $filename. The application should not examine or alter $libref in any way.) Below are some of the functions that do the real work. Such functions should use the current values of @dl_require_symbols and @dl_resolve_using if required.
SunOS: dlopen($filename)
HP-UX: shl_load($filename)
Linux: dld_create_reference(@dl_require_symbols); dld_link($filename)
NeXT: rld_load($filename, @dl_resolve_using)
VMS: lib$find_image_symbol($filename, $dl_require_symbols[0])
$symref = dl_find_symbol($libref, $symbol)
Returns the address of the symbol $symbol, or the undefined value if not found. If the target system has separate functions to search for symbols of different types, then dl_find_symbol() should search for function symbols first and then search for other types. The exact manner in which the address is returned in $symref is not currently defined. The only initial requirement is that $symref can be passed to, and understood by, dl_install_xsub(). Here are some current implementations:
SunOS: dlsym($libref, $symbol)
HP-UX: shl_findsym($libref, $symbol)
Linux: dld_get_func($symbol) and/or dld_get_symbol($symbol)
NeXT: rld_lookup("_$symbol")
VMS: lib$find_image_symbol($libref, $symbol)
@symbols = dl_undef_symbols()
Returns a list of symbol names which remain undefined after dl_load_file(). It returns () if these names are not known. Don't worry if your platform does not provide a mechanism for this. Most platforms do not need it and hence do not provide it; they just return an empty list.
dl_install_xsub($perl_name, $symref [, $filename])
Creates a new Perl external subroutine named $perl_name using $symref as a pointer to the function that implements the routine. This is simply a direct call to newXSUB(). It returns a reference to the installed function. The $filename parameter is used by Perl to identify the source file for the function if required by die, caller, or the debugger. If $filename is not defined, then DynaLoader will be used.
bootstrap($module);
This is the normal entry point for automatic dynamic loading in Perl.
It performs the following actions:
use English;
...
if ($ERRNO =~ /denied/) { ... }
This module provides aliases for the built-in "punctuation" variables. Variables with side effects that get triggered merely by accessing them (like $0) will still have the same effects under the aliases.
For those variables that have an awk(1) version, both long and short English alternatives are provided. For example, the $/ variable can be referred to either as $RS or as $INPUT_RECORD_SEPARATOR if you are using the English module.
Here is the list of variables along with their English alternatives:
| Perl | English | Perl | English |
|---|---|---|---|
| @_ | @ARG | $? | $CHILD_ERROR |
| $_ | $ARG | $! | $OS_ERROR |
| $& | $MATCH | $! | $ERRNO |
| $` | $PREMATCH | $@ | $EVAL_ERROR |
| $' | $POSTMATCH | $$ | $PROCESS_ID |
| $+ | $LAST_PAREN_MATCH | $$ | $PID |
| $. | $INPUT_LINE_NUMBER | $< | $REAL_USER_ID |
| $. | $NR | $< | $UID |
| $/ | $INPUT_RECORD_SEPARATOR | $> | $EFFECTIVE_USER_ID |
| $/ | $RS | $> | $EUID |
| $\| | $OUTPUT_AUTOFLUSH | $( | $REAL_GROUP_ID |
| $, | $OUTPUT_FIELD_SEPARATOR | $( | $GID |
| $, | $OFS | $) | $EFFECTIVE_GROUP_ID |
| $\ | $OUTPUT_RECORD_SEPARATOR | $) | $EGID |
| $\ | $ORS | $0 | $PROGRAM_NAME |
| $" | $LIST_SEPARATOR | $] | $PERL_VERSION |
| $; | $SUBSCRIPT_SEPARATOR | $^A | $ACCUMULATOR |
| $; | $SUBSEP | $^D | $DEBUGGING |
| $% | $FORMAT_PAGE_NUMBER | $^F | $SYSTEM_FD_MAX |
| $= | $FORMAT_LINES_PER_PAGE | $^I | $INPLACE_EDIT |
| $- | $FORMAT_LINES_LEFT | $^P | $PERLDB |
| $~ | $FORMAT_NAME | $^T | $BASETIME |
| $^ | $FORMAT_TOP_NAME | $^W | $WARNING |
| $: | $FORMAT_LINE_BREAK_CHARACTERS | $^X | $EXECUTABLE_NAME |
| $^L | $FORMAT_FORMFEED | $^O | $OSNAME |
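A short sketch of the aliases in use may help. It checks that the long and short names really refer to the same variable, and shows that an alias can be localized just like the punctuation variable it stands for. The variable names $same_pid, @parts, and $joined are only for this example.

```perl
use strict;
use warnings;
use English;

# The long and short aliases refer to the same underlying variable.
my $same_pid = ($PROCESS_ID == $$ && $PID == $$) ? "yes" : "no";
print "aliases agree on the process ID: $same_pid\n";

my $joined;
{
    # Localize the alias just as you would the punctuation variable.
    local $LIST_SEPARATOR = "-";         # same as: local $" = "-";
    my @parts = qw(a b c);
    $joined = "@parts";
}
print "$joined\n";                       # prints a-b-c
```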
use Env; # import all possible variables
use Env qw(PATH HOME TERM); # import only specified variables
Perl maintains environment variables in a pseudo-associative array named %ENV. Since this access method is sometimes inconvenient, the Env module allows environment variables to be treated as simple variables.
The Env::import() routine ties environment variables to global Perl variables with the same names. By default it ties suitable, existing environment variables (that is, variables yielded by keys %ENV). An environment variable is considered suitable if its name begins with an alphabetic character, and if it consists of nothing but alphanumeric characters plus underscore.
If you supply arguments when invoking use Env, they are taken to be a list of environment variables to tie. It's OK if the variables don't yet exist.
After an environment variable is tied, you can use it like a normal variable. You may access its value:
@path = split(/:/, $PATH);
or modify it any way you like:
$PATH .= ":.";
To remove a tied environment variable from the environment, make it the undefined value:
undef $PATH;
Note that the corresponding operation performed directly against %ENV is not undef, but delete:
delete $ENV{PATH};
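The following sketch shows the tie working in both directions. DEMO_GREETING is a made-up variable name, set in a BEGIN block only so that the example does not depend on your real environment.

```perl
use strict;
use warnings;

# DEMO_GREETING is a made-up variable name used only for this sketch.
BEGIN { $ENV{DEMO_GREETING} = "hello" }
use Env qw(DEMO_GREETING);

print "$DEMO_GREETING\n";                # reads $ENV{DEMO_GREETING}
$DEMO_GREETING .= ", world";             # writes through to %ENV
print "$ENV{DEMO_GREETING}\n";           # prints hello, world
```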
# in module YourModule.pm:
package YourModule;
use Exporter ();
@ISA = qw(Exporter);
@EXPORT = qw(...); # Symbols to export by default.
@EXPORT_OK = qw(...); # Symbols to export on request.
%EXPORT_TAGS = (tag => [...]); # Define names for sets of symbols.
# in other files that wish to use YourModule:
use YourModule; # Import default symbols into my package.
use YourModule qw(...); # Import listed symbols into my package.
use YourModule (); # Do not import any symbols!
Any module may define a class method called import(). Perl automatically calls a module's import() method when processing the use statement for the module. The module itself doesn't have to define the import() method, though. The Exporter module implements a default import() method that many modules choose to inherit instead. The Exporter module supplies the customary import semantics, and any other import() methods will tend to deviate from the normal import semantics in various (hopefully documented) ways. Now we'll talk about the normal import semantics.
Ignoring the class name, which is always the first argument to a class method, the arguments that are passed into the import() method are known as an import list. Usually the import list is nothing more than a list of subroutine or variable names, but occasionally you may want to get fancy. If the first entry in an import list begins with !, :, or /, the list is treated as a series of specifications that either add to or delete from the list of names to import. They are processed left to right. Specifications are in the form:
| Symbol | Meaning |
|---|---|
| [!]name | This name only |
| [!]:DEFAULT | All names in @EXPORT |
| [!]:tag | All names in $EXPORT_TAGS{tag} anonymous list |
| [!]/pattern/ | All names in @EXPORT and @EXPORT_OK that match pattern |
A leading ! indicates that matching names should be deleted from the list of names to import. If the first specification is a deletion, it is treated as though preceded by :DEFAULT. If you just want to import extra names in addition to the default set, you will still need to include :DEFAULT explicitly.
For example, suppose that YourModule.pm says:
@EXPORT = qw(A1 A2 A3 A4 A5);
@EXPORT_OK = qw(B1 B2 B3 B4 B5);
%EXPORT_TAGS = (
T1 => [qw(A1 A2 B1 B2)],
T2 => [qw(A1 A2 B3 B4)]
);
Individual names in %EXPORT_TAGS must also appear in @EXPORT or @EXPORT_OK. Note that you cannot use the tags directly within either @EXPORT or @EXPORT_OK (though you could preprocess tags into either of those arrays, and in fact, the export_tags() and export_ok_tags() functions below do precisely that).
An application using YourModule can then say something like this:
use YourModule qw(:DEFAULT :T2 !B3 A3);
The :DEFAULT adds in A1, A2, A3, A4, and A5. The :T2 adds in only B3 and B4, since A1 and A2 were already added. The !B3 then deletes B3, and the A3 does nothing because A3 was already included. Other examples include:
use Socket qw(!/^[AP]F_/ !SOMAXCONN !SOL_SOCKET);
use POSIX qw(:DEFAULT :errno_h :termios_h !TCSADRAIN !/^EXIT/);
Remember that most patterns (using //) will need to be anchored with a leading ^, for example, /^EXIT/ rather than /EXIT/.
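If you would like to watch the YourModule import list being processed, the sketch below defines the module inline, gives each exportable name a trivial subroutine (something the text does not specify, so treat it as an assumption), and then lists which names actually arrived in package main.

```perl
use strict;
use warnings;

{
    package YourModule;                  # the hypothetical module from the text
    use Exporter ();
    our @ISA         = qw(Exporter);
    our @EXPORT      = qw(A1 A2 A3 A4 A5);
    our @EXPORT_OK   = qw(B1 B2 B3 B4 B5);
    our %EXPORT_TAGS = (
        T1 => [qw(A1 A2 B1 B2)],
        T2 => [qw(A1 A2 B3 B4)],
    );

    # Give every exportable name a trivial subroutine so that there is
    # something real to import.
    no strict 'refs';
    for my $name (@EXPORT, @EXPORT_OK) {
        *{"YourModule::$name"} = sub { $name };
    }
}

# The run-time counterpart of: use YourModule qw(:DEFAULT :T2 !B3 A3);
YourModule->import(qw(:DEFAULT :T2 !B3 A3));

my @candidates = ((map { "A$_" } 1 .. 5), (map { "B$_" } 1 .. 5));
my @imported   = grep { main->can($_) } @candidates;
print "@imported\n";                     # prints A1 A2 A3 A4 A5 B4
```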
You can say:
BEGIN { $Exporter::Verbose=1 }
in order to see how the specifications are being processed and what is actually being imported into modules.
The Exporter module will convert an attempt to import a number from a module into a call to $module_name->require_version($value). This can be used to validate that the version of the module being used is greater than or equal to the required version. The Exporter module also supplies a default require_version() method, which checks the value of $VERSION in the exporting module.
Since the default require_version() method treats the $VERSION number as a simple numeric value, it will regard version 1.10 as lower than 1.9. For this reason it is strongly recommended that the module developer use numbers with at least two decimal places; for example, 1.09.
Prior to release 5.004 or so of Perl, this only worked with modules that use the Exporter module; in particular, this means that you can't check the version of a class module that doesn't require the Exporter module.
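Here is a small sketch of the version check, calling require_version() directly rather than through an import list. YourModule and its version number are invented for the example; the last line simply demonstrates the numeric comparison that makes two-decimal version numbers advisable.

```perl
use strict;
use warnings;

{
    package YourModule;                  # hypothetical module
    use Exporter ();
    our @ISA     = qw(Exporter);
    our $VERSION = '1.09';
}

# Succeeds: 1.09 is at least 1.05.
my $old_enough = eval { YourModule->require_version(1.05); 1 } ? "accepted" : "rejected";

# Fails: the module is older than the version asked for.
my $too_new = eval { YourModule->require_version(2.00); 1 } ? "accepted" : "rejected";

print "1.05: $old_enough, 2.00: $too_new\n";

# The numeric pitfall: as a number, 1.10 is just 1.1.
print "1.10 is lower than 1.9\n" if 1.10 < 1.9;
```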
In some situations you may want to prevent certain symbols from being exported. Typically this applies to extensions with functions or constants that may not exist on some systems.
The names of any symbols that cannot be exported should be listed in the @EXPORT_FAIL array.
If a module attempts to import any of these symbols, the Exporter will give the module an opportunity to handle the situation before generating an error. The Exporter will call an export_fail() method with a list of the failed symbols:
@failed_symbols = $module_name->export_fail(@failed_symbols);
If the export_fail() method returns an empty list, then no error is recorded and all requested symbols are exported. If the returned list is not empty, then an error is generated for each symbol and the export fails. The Exporter provides a default export_fail() method that simply returns the list unchanged.
Uses for the export_fail() method include giving better error messages for some symbols and performing lazy architectural checks. Put more symbols into @EXPORT_FAIL by default and then take them out if someone actually tries to use them and an expensive check shows that they are usable on that platform.
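A lazy check of this kind might be sketched as follows. The module, its symbols, and the %usable table (which stands in for an expensive platform test) are all invented for the example. Two symbols are listed in @EXPORT_FAIL: one that turns out to be usable and one that does not.

```perl
use strict;
use warnings;

{
    package YourModule;                  # hypothetical module
    use Exporter ();
    our @ISA         = qw(Exporter);
    our @EXPORT_OK   = qw(portable fast_io big_mem);
    our @EXPORT_FAIL = qw(fast_io big_mem);   # check these before exporting

    sub portable { "portable" }
    sub fast_io  { "fast_io" }
    sub big_mem  { "big_mem" }

    # Pretend an expensive platform check found that only fast_io works here.
    my %usable = (fast_io => 1);

    # Return the symbols that really cannot be exported. An empty list
    # tells the Exporter to go ahead.
    sub export_fail {
        my ($class, @symbols) = @_;
        return grep { !$usable{$_} } @symbols;
    }
}

# fast_io passes the check, so the import succeeds.
YourModule->import(qw(portable fast_io));
print "fast_io imported\n" if main->can('fast_io');

# big_mem does not, so the Exporter warns on STDERR and then dies.
my $imported = eval { YourModule->import('big_mem'); 1 };
print "big_mem refused\n" unless $imported;
```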
Since the symbols listed within %EXPORT_TAGS must also appear in either @EXPORT or @EXPORT_OK, two utility functions are provided that allow you to easily add tagged sets of symbols to @EXPORT or @EXPORT_OK:
%EXPORT_TAGS = (Bactrian => [qw(aa bb cc)], Dromedary => [qw(aa cc dd)]);
Exporter::export_tags('Bactrian'); # add aa, bb and cc to @EXPORT
Exporter::export_ok_tags('Dromedary'); # add aa, cc and dd to @EXPORT_OK
Any names that are not tags are added to @EXPORT or @EXPORT_OK unchanged, but will trigger a warning (with -w) to avoid misspelt tag names being silently added to @EXPORT or @EXPORT_OK. Future versions may regard this as a fatal error.
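The sketch below runs the two utility functions and prints the resulting arrays. Camel is a made-up module name. Note that both functions work on the package they are called from, so they belong inside the module, after %EXPORT_TAGS has been filled in.

```perl
use strict;
use warnings;

{
    package Camel;                       # hypothetical module name
    use Exporter ();
    our @ISA = qw(Exporter);
    our (@EXPORT, @EXPORT_OK);
    our %EXPORT_TAGS = (
        Bactrian  => [qw(aa bb cc)],
        Dromedary => [qw(aa cc dd)],
    );

    # Both functions act on the calling package.
    Exporter::export_tags('Bactrian');       # pushes aa bb cc onto @EXPORT
    Exporter::export_ok_tags('Dromedary');   # pushes aa cc dd onto @EXPORT_OK
}

print "EXPORT:    @Camel::EXPORT\n";     # aa bb cc
print "EXPORT_OK: @Camel::EXPORT_OK\n";  # aa cc dd
```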
use ExtUtils::Install;
install($hashref, $verbose, $nonono);
uninstall($packlistfile, $verbose, $nonono);
install() and uninstall() are specific to the way ExtUtils::MakeMaker handles the platform-dependent installation and deinstallation of Perl extensions. They are not designed as general-purpose tools. If you're reading this chapter straight through (brave soul), you probably want to take a glance at the MakeMaker entry first. (Or just skip over everything in the ExtUtils package until you start writing an Ext.)
install() takes three arguments: a reference to a hash, a verbose switch, and a don't-really-do-it switch. The hash reference contains a mapping of directories; each key/value pair is a combination of directories to be copied. The key is a directory to copy from, and the value is a directory to copy to. The whole tree below the "from" directory will be copied, preserving timestamps and permissions.
There are two keys with a special meaning in the hash: "read" and "write". After the copying is done, install will write the list of target files to the file named by $hashref->{write}. If there is another file named by $hashref->{read}, the contents of this file will be merged into the written file. The read and the written file may be identical, but on the Andrew File System (AFS) it is fairly likely that people are installing to a different directory than the one where the files later appear.
uninstall() takes as first argument a file containing filenames to be unlinked. The second argument is a verbose switch, the third is a no-don't-really-do-it-now switch (useful to know what will happen without actually doing it).
require ExtUtils::Liblist;
ExtUtils::Liblist::ext($potential_libs, $Verbose);
This utility takes a list of libraries in the form -llib1 -llib2 -llib3 and returns lines suitable for inclusion in a Perl extension Makefile on the current platform. Extra library paths may be included with the form -L/another/path. This will affect the searches for all subsequent libraries.
ExtUtils::Liblist::ext() returns a list of four scalar values, which MakeMaker will eventually use in constructing a Makefile, among other things. The values are:
EXTRALIBS
List of libraries that need to be linked with ld(1) when linking a Perl binary that includes a static extension. Only those libraries that actually exist are included.
LDLOADLIBS
List of those libraries that can or must be linked when creating a shared library using ld(1). These may be static or dynamic libraries.
LD_RUN_PATH
A colon-separated list of the directories in LDLOADLIBS. It is passed as an environment variable to the process that links the shared library.
BSLOADLIBS
List of those libraries that are needed but can be linked in dynamically with the DynaLoader at run-time on this platform. This list is used to create a .bs (bootstrap) file. SunOS/Solaris does not need this because ld(1) records the information (from LDLOADLIBS) into the object file.
This module deals with a lot of system dependencies and has quite a few architecture-specific ifs in the code.
use ExtUtils::MakeMaker;
WriteMakefile( ATTRIBUTE => VALUE, ... );
# which internally is really more like...
%att = (ATTRIBUTE => VALUE, ...);
MM->new(\%att)->flush;
When you build an extension to Perl, you need to have an appropriate Makefile[3] in the extension's source directory. And while you could conceivably write one by hand, this would be rather tedious. So you'd like a program to write it for you.
[3] If you don't know what a Makefile is, or what the make(1) program does with one, you really shouldn't be reading this section. We will be assuming that you know what happens when you type a command like make foo.
Originally, this was done using a shell script (actually, one for each extension) called Makefile.SH, much like the one that writes the Makefile for Perl itself. But somewhere along the line, it occurred to the perl5-porters that, by the time you want to compile your extensions, there's already a bare-bones version of the Perl executable called miniperl, if not a fully installed perl. And for some strange reason, Perl programmers prefer programming in Perl to programming in shell. So they wrote MakeMaker, just so that you can write Makefile.PL instead of Makefile.SH.
MakeMaker isn't a program; it's a module (or it wouldn't be in this chapter). The module provides the routines you need; you just need to use the module, and then call the routines. As with any programming job, there are many degrees of freedom; but your typical Makefile.PL is pretty simple. For example, here's ext/POSIX/Makefile.PL from the Perl distribution's POSIX extension (which is by no means a trivial extension):
use ExtUtils::MakeMaker;
WriteMakefile(
NAME => 'POSIX',
LIBS => ["-lm -lposix -lcposix"],
MAN3PODS => ' ', # Pods will be built by installman.
XSPROTOARG => '-noprototypes', # XXX remove later?
VERSION_FROM => 'POSIX.pm',
);
Several things are apparent from this example, but the most important is that the WriteMakefile() function uses named parameters. This means that you can pass many potential parameters, but you're only required to pass the ones you want to be different from the default values. (And when we say "many", we mean "many"--there are about 75 of them. See the Attributes section later.)
As the synopsis above indicates, the WriteMakefile() function actually constructs an object. This object has attributes that are set from various sources, including the parameters you pass to the function. It's this object that actually writes your Makefile, meshing together the demands of your extension with the demands of the architecture on which the extension is being installed. Like many craftily crafted objects, this MakeMaker object delegates as much of its work as possible to various other subroutines and methods. Many of these may be overridden in your Makefile.PL if you need to do some fine tuning. (Generally you don't.)
But let's not lose track of the goal, which is to write a Makefile that will know how to do anything to your extension that needs doing. Now as you can imagine, the Makefile that MakeMaker writes is quite, er, full-featured. It's easy to get lost in all the details. If you look at the POSIX Makefile generated by the bit of code above, you will find a file containing about 122 macros and 77 targets. You will want to go off into a corner and curl up into a little ball, saying, "Never mind, I didn't really want to know."
Well, the fact of the matter is, you really don't want to know, nor do you have to. Most of these items take care of themselves--that's what MakeMaker is there for, after all. We'll lay out the various attributes and targets for you, but you can just pick and choose, like in a cafeteria. We'll talk about the make targets first, because they're the actions you eventually want to perform, and then work backward to the macros and attributes that feed the targets.
But before we do that, you need to know just a few more architectural features of MakeMaker to make sense of some of the things we'll say. The targets at the end of your Makefile depend on the macro definitions that are interpolated into them. Those macro definitions in turn come from any of several places. Depending on how you count, there are about five sources of information for these attributes. Ordered by increasing precedence and (more or less) decreasing permanence, they are:
The first four of these turn into attributes of the object we mentioned, and are eventually written out as macro definitions in your Makefile. In most cases, the names of the values are consistent from beginning to end. (Except that the Config database keeps the names in lowercase, as they come from Perl's config.sh file. The names are translated to uppercase when they become attributes of the object.) In any case, we'll tend to use the term attributes to mean both attributes and the Makefile macros derived from them.
The Makefile.PL and the hints may also provide overriding methods for the object, if merely changing an attribute isn't good enough.
The hints files are expected to be named like their counterparts in PERL_SRC/hints, but with a .pl filename extension (for example, next_3_2.pl), because the file consists of Perl code to be evaluated. Apart from that, the rules governing which hints file is chosen are the same as in Configure. The hints file is evaled within a routine that is a method of our MakeMaker object, so if you want to override or create an attribute, you would say something like:
$self->{LIBS} = ['-ldbm -lucb -lc'];
By and large, if your Makefile isn't doing what you want, you just trace back the name of the misbehaving attribute to its source, and either change it there or override it downstream.
Extensions may be built using the contents of either the Perl source directory tree or the installed Perl library. The recommended way is to build extensions after you have run make install on Perl itself. You can then build your extension in any directory on your hard disk that is not below the Perl source tree. The support for extensions below the ext/ directory of the Perl distribution is only good for the standard extensions that come with Perl.
If an extension is being built below the ext/ directory of the Perl source, then MakeMaker will set PERL_SRC automatically (usually to ../..). If PERL_SRC is defined and the extension is recognized as a standard extension, then other variables default to the following:
PERL_INC = PERL_SRC
PERL_LIB = PERL_SRC/lib
PERL_ARCHLIB = PERL_SRC/lib
INST_LIB = PERL_LIB
INST_ARCHLIB = PERL_ARCHLIB
If an extension is being built away from the Perl source, then MakeMaker will leave PERL_SRC undefined and default to using the installed copy of the Perl library. The other variables default to the following:
PERL_INC = $archlibexp/CORE
PERL_LIB = $privlibexp
PERL_ARCHLIB = $archlibexp
INST_LIB = ./blib/lib
INST_ARCHLIB = ./blib/arch
If Perl has not yet been installed, then PERL_SRC can be defined as an override on the command line.
Far and away the most commonly used make targets are those used by the installer to install the extension. So we aim to make the normal installation very easy:
perl Makefile.PL # generate the Makefile
make # compile the extension
make test # test the extension
make install # install the extension
This assumes that the installer has dynamic linking available. If not, a couple of additional commands are also necessary:
make perl # link a new perl statically with this extension
make inst_perl # install that new perl appropriately
Other interesting targets in the generated Makefile are:
make config # check whether the Makefile is up-to-date
make clean # delete local temp files (Makefile gets renamed)
make realclean # delete derived files (including ./blib)
make ci # check in all files in the MANIFEST file
make dist # see the "Distribution Support" section below
Now we'll talk about some of these commands, and how each of them is related to MakeMaker. So we'll not only be talking about things that happen when you invoke the make target, but also about what MakeMaker has to do to generate that make target. So brace yourself for some temporal whiplash.
The perl Makefile.PL command is the one most closely related to MakeMaker because it's the one in which you actually run MakeMaker. No temporal whiplash here. As we mentioned earlier, some of the default attribute values may be overridden by adding arguments of the form KEY=VALUE. For example:
perl Makefile.PL PREFIX=/tmp/myperl5
To get a more detailed view of what MakeMaker is doing, say:
perl Makefile.PL verbose
A make command without arguments performs any compilation needed and puts any generated files into staging directories that are named by the attributes INST_LIB, INST_ARCHLIB, INST_EXE, INST_MAN1DIR, and INST_MAN3DIR. These directories default to something below ./blib if you are not building below the Perl source directory. If you are building below the Perl source, INST_LIB and INST_ARCHLIB default to ../../lib, and INST_EXE is not defined.
make test
The goal of this command is to run any regression tests supplied with the extension, so MakeMaker checks for the existence of a file named test.pl in the current directory and, if it exists, adds commands to the test target of the Makefile that will execute the script with the proper set of Perl -I options (since the files haven't been installed into their final location yet).
MakeMaker also checks for any files matching glob("t/*.t"). It will add commands to the test target that execute all matching files via the Test::Harness module with the -I switches set correctly. If you pass TEST_VERBOSE=1, the test target will run the tests verbosely.
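A test script that Test::Harness can run only has to print a plan line of the form 1..N followed by one "ok" or "not ok" line per test. The following is a minimal sketch of such a script; the filename t/basic.t, the helper report_lines, and the three trivial checks are made up for illustration, and a real extension would exercise its own functions instead.

```perl
#!/usr/bin/perl
# t/basic.t: hypothetical test script
use strict;
use warnings;

# Each check is a code reference that returns true on success.
my @checks = (
    sub { 1 + 1 == 2 },
    sub { 'ab' x 2 eq 'abab' },
    sub { join(',', sort qw(b c a)) eq 'a,b,c' },
);

# Build the plan line followed by one result line per check.
sub report_lines {
    my @tests = @_;
    my @lines = ('1..' . scalar @tests);
    my $number = 0;
    for my $test (@tests) {
        $number++;
        push @lines, ($test->() ? 'ok' : 'not ok') . " $number";
    }
    return @lines;
}

print "$_\n" for report_lines(@checks);
```

Run directly, the script prints 1..3 and then ok 1, ok 2, and ok 3 on separate lines, which is the form the harness summarizes when you run make test.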
make install
Once the installer has tested the extension, the various generated files need to get put into their final resting places. The install target copies the files found below each of the INST_* directories to their INSTALL* counterparts.
| Staging directory | Installation directory |
|---|---|
| INST_LIB | INSTALLPRIVLIB[1] or INSTALLSITELIB[2] |
| INST_ARCHLIB | INSTALLARCHLIB[1] or INSTALLSITEARCH[2] |
| INST_EXE | INSTALLBIN |
| INST_MAN1DIR | INSTALLMAN1DIR |
| INST_MAN3DIR | INSTALLMAN3DIR |

Footnotes:
[1] If INSTALLDIRS is set to "perl".
[2] If INSTALLDIRS is set to "site" (the default).
The INSTALL* attributes in turn default to their %Config counterparts, $Config{installprivlib}, $Config{installarchlib}, and so on.
If you don't set INSTALLARCHLIB or INSTALLSITEARCH, MakeMaker will assume you want them to be subdirectories of INSTALLPRIVLIB and INSTALLSITELIB, respectively. The exact relationship is determined by Configure. But you can usually just go with the defaults for all these attributes.
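The correspondence between staging and installation attributes can be written down as a small lookup. This is only a sketch of the mapping described here, not code taken from MakeMaker; the hash %destination and the function install_attribute are hypothetical names, and the "perl" versus "site" choice follows the INSTALLDIRS attribute.

```perl
use strict;
use warnings;

# Staging attribute => installation attribute. The two library
# directories depend on INSTALLDIRS ("perl" or "site").
my %destination = (
    INST_LIB     => { perl => 'INSTALLPRIVLIB', site => 'INSTALLSITELIB' },
    INST_ARCHLIB => { perl => 'INSTALLARCHLIB', site => 'INSTALLSITEARCH' },
    INST_EXE     => 'INSTALLBIN',
    INST_MAN1DIR => 'INSTALLMAN1DIR',
    INST_MAN3DIR => 'INSTALLMAN3DIR',
);

# Hypothetical helper: name the INSTALL* attribute that receives the
# files staged under the given INST_* attribute.
sub install_attribute {
    my ($staging, $installdirs) = @_;
    $installdirs = 'site' unless defined $installdirs;   # "site" is the default
    my $dest = $destination{$staging};
    die "Unknown staging attribute: $staging\n" unless defined $dest;
    return ref $dest ? $dest->{$installdirs} : $dest;
}

print "$_ -> ", install_attribute($_), "\n" for sort keys %destination;
```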
The PREFIX attribute can be used to redirect all the INSTALL* attributes in one go. Here's the quickest way to install a module in a nonstandard place:
perl Makefile.PL PREFIX=~
The value you specify for PREFIX replaces one or more leading pathname components in all INSTALL* attributes. The prefix to be replaced is determined by the value of $Config{prefix}, which typically has a value like /usr. (Note that the tilde expansion above is done by MakeMaker, not by perl or make.)
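The replacement of leading pathname components can be pictured as an anchored substitution. The sketch below illustrates the idea only and is not MakeMaker's implementation; the function apply_prefix is a hypothetical name, and it does not attempt the tilde expansion mentioned above.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: swap a leading $old_prefix for $new_prefix.
# The lookahead keeps "/usr" from matching the start of "/usrlocal".
sub apply_prefix {
    my ($path, $old_prefix, $new_prefix) = @_;
    (my $result = $path) =~ s{^\Q$old_prefix\E(?=/|\z)}{$new_prefix};
    return $result;
}

# Show where this perl's private library directory would move to.
print apply_prefix($Config{installprivlib}, $Config{prefix}, '/tmp/myperl5'), "\n";
```

A path that does not start with the old prefix is returned unchanged, which mirrors the caveat that an attribute lying outside $Config{prefix} is not redirected.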
If the user has superuser privileges and is not working under the Andrew File System (AFS) or relatives, then the defaults for INSTALLPRIVLIB, INSTALLARCHLIB, INSTALLBIN, and so on should be appropriate.
By default, make install writes some documentation of what has been done into the file given by $(INSTALLARCHLIB)/perllocal.pod. This feature can be bypassed by calling make pure_install.
If you are using AFS, you must specify the installation directories, since these most probably have changed since Perl itself was installed. Do this by issuing these commands:
perl Makefile.PL INSTALLSITELIB=/afs/here/today \
    INSTALLBIN=/afs/there/now INSTALLMAN3DIR=/afs/for/manpages
make
Be careful to repeat this procedure every time you recompile an extension, unless you are sure the AFS installation directories are still valid.
The steps above are sufficient on a system supporting dynamic loading. On systems that do not support dynamic loading, however, the extension has to be linked together statically with everything else you might want in your perl executable. MakeMaker supports the linking process by creating appropriate targets in the Makefile. If you say:
make perl
it will produce a new perl binary in the current directory with all extensions linked in that can be found in INST_ARCHLIB, SITELIBEXP, and PERL_ARCHLIB. To do that, MakeMaker writes a new Makefile; on UNIX it is called Makefile.aperl, but the name may be system-dependent. When you want to force the creation of a new perl, we recommend that you delete this Makefile.aperl so the directories are searched for linkable libraries again.
The binary can be installed in the directory where Perl normally resides on your machine with:
make inst_perl
To produce a Perl binary with a different filename than perl, either say:
perl Makefile.PL MAP_TARGET=myperl
make myperl
make inst_perl
or say:
perl Makefile.PL
make myperl MAP_TARGET=myperl
make inst_perl MAP_TARGET=myperl
In either case, you will be asked to confirm the invocation of the inst_perl target, since this invocation is likely to overwrite your existing Perl binary in INSTALLBIN.
By default make inst_perl documents what has been done in the file given by $(INSTALLARCHLIB)/perllocal.pod. This behavior can be bypassed by calling make pure_inst_perl.
Sometimes you might want to build a statically linked Perl even though your system supports dynamic loading. In this case you may explicitly set the linktype:
perl Makefile.PL LINKTYPE=static
The following attributes can be specified as arguments to WriteMakefile() or as NAME=VALUE pairs on the command line. We give examples below in the form they would appear in your Makefile.PL, that is, as though passed as a named parameter to WriteMakefile() (including the comma that comes after it).
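To show how these fragments fit together, here is a minimal sketch of a complete Makefile.PL. The module name My::Extension and the version number are hypothetical, VERSION is a MakeMaker attribute not described in this list, and the DEFINE, INC, and LIBS values are placeholders borrowed from the examples in this section rather than settings any particular extension needs.

```perl
# Makefile.PL: a minimal sketch with placeholder values
use ExtUtils::MakeMaker;

WriteMakefile(
    NAME    => 'My::Extension',        # hypothetical module name
    VERSION => '0.01',                 # hypothetical version number
    DEFINE  => '-DHAVE_UNISTD_H',      # extra defines for the C compiler
    INC     => '-I/path/to/inc',       # include directories, in -I form
    LIBS    => ['-lgdbm', '-ldbm'],    # alternatives, tried in order
);
```

Any of these values could equally be given on the command line, as in perl Makefile.PL INC=-I/other/inc.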
C
A reference to an array of *.c filenames. It's initialized by doing a directory scan and by derivation from the values of the XS attribute hash. This is not currently used by MakeMaker but may be handy in Makefile.PLs.
CONFIG
An array reference containing a list of attributes to fetch from %Config. For example:
CONFIG => [qw(archname manext)],
defines ARCHNAME and MANEXT from config.sh. MakeMaker will automatically add the following values to CONFIG:
ar dlext ldflags ranlib cc dlsrc libc sitelibexp cccdlflags ld lib_ext sitearchexp ccdlflags lddlflags obj_ext so
CONFIGURE
A reference to a subroutine returning a hash reference. The hash may contain further attributes, for example, {LIBS => ...}, that have to be determined by some evaluation method. Be careful, because any attributes defined this way will override hints and WriteMakefile() parameters (but not command-line arguments).
DEFINE
An attribute containing additional defines, such as -DHAVE_UNISTD_H.
DIR
A reference to an array of subdirectories containing Makefile.PLs. For example, SDBM_File has:
DIR => ['sdbm'],
MakeMaker will automatically do recursive MakeMaking if subdirectories contain Makefile.PL files. A separate MakeMaker class is generated for each subdirectory, so each MakeMaker object can override methods using the fake MY:: class (see below) without interfering with other MakeMaker objects. You don't even need a Makefile.PL in the top level directory if you pass one in via -M and -e:
perl -MExtUtils::MakeMaker -e 'WriteMakefile()'
DISTNAME
Your name for distributing the package (by tar file). This defaults to NAME below.
DL_FUNCS
A reference to a hash of symbol names for routines to be made available as universal symbols. Each key/value pair consists of the package name and an array of routine names in that package. This attribute is used only under AIX (export lists) and VMS (linker options) at present. The routine names supplied will be expanded in the same way as XSUB names are expanded by the XS attribute.
The default key/value pair looks like this:
"$PKG" => ["boot_$PKG"]
For a pair of packages named RPC and NetconfigPtr, you might, for example, set it to this:
DL_FUNCS => {
RPC => [qw(boot_rpcb rpcb_gettime getnetconfigent)],
NetconfigPtr => ['DESTROY'],
},
DL_VARS
An array of symbol names for variables to be made available as universal symbols. It's used only under AIX (export lists) and VMS (linker options) at present. Defaults to []. A typical value might look like this:
DL_VARS => [ qw( Foo_version Foo_numstreams Foo_tree ) ],
EXE_FILES
A reference to an array of executable files. The files will be copied to the INST_EXE directory. A make realclean command will delete them from there again.
FIRST_MAKEFILE
The name of the Makefile to be produced. Defaults to the contents of MAKEFILE, but can be overridden. This is used for the second Makefile that will be produced for the MAP_TARGET.
FULLPERL
A Perl binary able to run this extension.
H
A reference to an array of *.h filenames. Similar to C.
INC
Directories containing include files, in -I form. For example:
INC => "-I/usr/5include -I/path/to/inc",
INSTALLARCHLIB
Used by make install, which copies files from INST_ARCHLIB to this directory if INSTALLDIRS is set to "perl".
INSTALLBIN
Used by make install, which copies files from INST_EXE to this directory.
INSTALLDIRS
Determines which of the two sets of installation directories to choose: installprivlib and installarchlib versus installsitelib and installsitearch. The first pair is chosen with INSTALLDIRS=perl, the second with INSTALLDIRS=site. The default is "site".
INSTALLMAN1DIR
This directory gets the command manpages at make install time. It defaults to $Config{installman1dir}.
INSTALLMAN3DIR
This directory gets the library manpages at make install time. It defaults to $Config{installman3dir}.
INSTALLPRIVLIB
Used by make install, which copies files from INST_LIB to this directory if INSTALLDIRS is set to "perl".
INSTALLSITELIB
Used by make install, which copies files from INST_LIB to this directory if INSTALLDIRS is set to "site" (default).
INSTALLSITEARCH
Used by make install, which copies files from INST_ARCHLIB to this directory if INSTALLDIRS is set to "site" (default).
INST_ARCHLIB
Same as INST_LIB, but for architecture-dependent files.
INST_EXE
Directory where executable scripts should be staged during running of make. Defaults to ./blib/bin, just to have a dummy location during testing. make install will copy the files in INST_EXE to INSTALLBIN.
INST_LIB
Directory where we put library files of this extension while building it.
INST_MAN1DIR
Directory to hold the command manpages at make time.
INST_MAN3DIR
Directory to hold the library manpages at make time.
LDFROM
Defaults to $(OBJECT) and is used in the ld(1) command to specify what files to link/load from. (Also see dynamic_lib later for how to specify ld flags.)
LIBPERL_A
The filename of the Perl library that will be used together with this extension. Defaults to libperl.a.
LIBS
An anonymous array of alternative library specifications to be searched for (in order) until at least one library is found.
For example:
LIBS => ["-lgdbm", "-ldbm -lfoo", "-L/path -ldbm.nfs"],
Note that any element of the array contains a complete set of arguments for the ld command. So do not specify:
LIBS => ["-ltcl", "-ltk", "-lX11"],
See NDBM_File/Makefile.PL for an example where an array is needed. If you specify a scalar as in:
LIBS => "-ltcl -ltk -lX11",
MakeMaker will turn it into an array with one element.
LINKTYPE
"static" or "dynamic" (the latter is the default unless usedl=undef in config.sh). Should only be used to force static linking. (Also see linkext, later in this chapter.)
MAKEAPERL
Boolean that tells MakeMaker to include the rules for making a Perl binary. This is handled automatically as a switch by MakeMaker. The user normally does not need it.
MAKEFILE
The name of the Makefile to be produced.
MAN1PODS
A reference to a hash of POD-containing files. MakeMaker will default this to all EXE_FILES files that include POD directives. The files listed here will be converted to manpages and installed as requested at Configure time.
MAN3PODS
A reference to a hash of .pm and .pod files. MakeMaker will default this to all .pod and any .pm files that include POD directives. The files listed here will be converted to manpages and installed as requested at Configure time.
MAP_TARGET
If it is intended that a new Perl binary be produced, this variable holds the name for that binary. Defaults to perl.
MYEXTLIB
If the extension links to a library that it builds, set this to the name of the library (see SDBM_File).
NAME
Perl module name for this extension (for example, DBD::Oracle). This will default to the directory name, but should really be explicitly defined in the Makefile.PL.
NEEDS_LINKING
MakeMaker will figure out whether an extension contains linkable code anywhere down the directory tree, and will set this variable accordingly. But you can speed it up a very little bit if you define this Boolean variable yourself.
NOECHO
Governs make's @ (echoing) feature. By setting NOECHO to an empty string, you can generate a Makefile that echoes all commands. Mainly used in debugging MakeMaker itself.
NORECURS
A Boolean that inhibits the automatic descent into subdirectories (see DIR above). For example:
NORECURS => 1,
OBJECT
A string containing a list of object files, defaulting to $(BASEEXT)$(OBJ_EXT). But it can be a long string containing all object files. For example:
OBJECT => "tkpBind.o tkpButton.o tkpCanvas.o",
PERL
Perl binary for tasks that can be done by miniperl.
PERLMAINCC
The command line that is able to compile perlmain.c. Defaults to $(CC).
PERL_ARCHLIB
Same as PERL_LIB for architecture-dependent files.
PERL_LIB
The directory containing the Perl library to use.
PERL_SRC
The directory containing the Perl source code. Use of this should be avoided, since it may be undefined.
PL_FILES
A reference to a hash of files to be processed as Perl programs. By default MakeMaker will turn the names of any *.PL files it finds (except Makefile.PL) into keys, and use the basenames of these files as values. For example:
PL_FILES => {'whatever.PL' => 'whatever'},
This turns into a Makefile entry resembling:
all :: whatever

whatever :: whatever.PL
	$(PERL) -I$(INST_ARCHLIB) -I$(INST_LIB) \
		-I$(PERL_ARCHLIB) -I$(PERL_LIB) whatever.PL
You'll note that there's no I/O redirection into whatever there. The *.PL files are expected to produce output to the target files themselves.
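Since nothing redirects the script's output, a *.PL file has to open and write its target itself. The following is one way such a script might look, assuming the target name is obtained by stripping the .PL suffix from the script's own name; the helper names target_for and write_target and the generated one-line program are purely illustrative.

```perl
#!/usr/bin/perl
# whatever.PL: hypothetical generator script
use strict;
use warnings;

# Derive the target name by dropping the ".PL" suffix
# (whatever.PL becomes whatever).
sub target_for {
    my ($script) = @_;
    (my $target = $script) =~ s/\.PL\z//;
    return $target;
}

# Write the generated program to the target file ourselves,
# because the Makefile rule does not redirect our output.
sub write_target {
    my ($target) = @_;
    open(my $out, '>', $target) or die "Cannot write $target: $!\n";
    print {$out} "#!/usr/bin/perl\n",
                 "print \"Hello from a generated script\\n\";\n";
    close($out) or die "Cannot close $target: $!\n";
    chmod 0755, $target;
    return $target;
}

# Only generate output when run under a name ending in ".PL".
write_target(target_for($0)) if $0 =~ /\.PL\z/;
```

A real generator would typically consult %Config or other build-time information to decide what to write, which is the usual reason for producing a file this way.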
PM
A reference to a hash of .pm files and .pl files to be installed. For example:
PM => {'name_of_file.pm' => '$(INST_LIBDIR)/install_as.pm'},
By default this includes *.pm and *.pl. If a lib/ subdirectory exists and is not listed in DIR (above) then any *.pm and *.pl files it contains will also be included by default. Defining PM in the Makefile.PL will override PMLIBDIRS.
PMLIBDIRS
A reference to an array of subdirectories that contain library files. Defaults to:
PMLIBDIRS => [ 'lib', '$(BASEEXT)' ],
The directories will be scanned and any files they contain will be installed in the corresponding location in the library. A libscan() method may be used to alter the behavior. Defining PM in the Makefile.PL will override PMLIBDIRS.
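As an illustration of altering the scan, a Makefile.PL could define libscan() in the MY:: class so that unwanted files are left out. The sketch below assumes that libscan() receives a pathname and returns it unchanged to accept the file, or an empty string to skip it; the particular filters (RCS and CVS directories, backup files ending in a tilde) are only examples.

```perl
use strict;
use warnings;

# Hypothetical override: return the path to install the file,
# or an empty string to leave it out of the library.
sub MY::libscan {
    my ($self, $path) = @_;
    return '' if $path =~ m{(?:^|/)(?:RCS|CVS)/};   # version-control directories
    return '' if $path =~ /~\z/;                    # editor backup files
    return $path;
}
```

Defining the method in MY:: keeps the change local to this Makefile.PL, in line with the per-directory MakeMaker objects described under DIR.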
PREFIX
May be used to set the three INSTALL* attributes in one go (except, probably, for INSTALLMAN1DIR if it is not below PREFIX according to %Config). They will have PREFIX as a common directory node and will branch from that node into lib/, lib/ARCHNAME, or whatever Configure decided at the build time of your Perl (unless you override one of them, of course).
A placeholder, not yet implemented. Will eventually be a hash reference: the keys of the hash are names of modules that need to be available to run this extension (for example, Fcntl f