PERL -- I/O Operations

binmode(FILEHANDLE)

Arranges for the file to be read in "binary" mode in operating systems that distinguish between binary and text files. Files that are not read in binary mode have CR LF sequences translated to LF on input and LF translated to CR LF on output. Binmode has no effect under Unix. If FILEHANDLE is an expression, the value is taken as the name of the filehandle.
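
As a rough illustration, the sketch below writes and re-reads a small file with binmode applied to both handles, so the byte count is the same whether or not the system translates line endings. The scratch file name is hypothetical.

```perl
# Hypothetical scratch file; $$ is our process id.
$file = "/tmp/binmode.$$";

open(OUT, ">$file") || die "Can't create $file: $!";
binmode(OUT);                   # no LF to CR LF translation on output
print OUT "line one\nline two\n";
close(OUT);

open(IN, $file) || die "Can't open $file: $!";
binmode(IN);                    # no CR LF to LF translation on input
$size = read(IN, $buf, 1024);   # raw bytes, exactly as stored
close(IN);
unlink($file);

print "read $size bytes\n";     # prints "read 18 bytes"
```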

close(FILEHANDLE)

close FILEHANDLE

Closes the file or pipe associated with the file handle. You don't have to close FILEHANDLE if you are immediately going to do another open on it, since open will close it for you. (See open.) However, an explicit close on an input file resets the line counter ($.), while the implicit close done by open does not. Also, closing a pipe will wait for the process executing on the pipe to complete, in case you want to look at the output of the pipe afterwards. Closing a pipe explicitly also puts the status value of the command into $?. Example:

	open(OUTPUT, '|sort >foo');	# pipe to sort
	...	# print stuff to output
	close OUTPUT;		# wait for sort to finish
	open(INPUT, 'foo');	# get sort's results

FILEHANDLE may be an expression whose value gives the real filehandle name.

dbmclose(ASSOC_ARRAY)

dbmclose ASSOC_ARRAY

Breaks the binding between a dbm file and an associative array. The values remaining in the associative array are meaningless unless you happen to want to know what was in the cache for the dbm file. This function is only useful if you have ndbm.

dbmopen(ASSOC,DBNAME,MODE)

This binds a dbm or ndbm file to an associative array. ASSOC is the name of the associative array. (Unlike normal open, the first argument is NOT a filehandle, even though it looks like one). DBNAME is the name of the database (without the .dir or .pag extension). If the database does not exist, it is created with protection specified by MODE (as modified by the umask). If your system only supports the older dbm functions, you may perform only one dbmopen in your program. If your system has neither dbm nor ndbm, calling dbmopen produces a fatal error.

Values assigned to the associative array prior to the dbmopen are lost. A certain number of values from the dbm file are cached in memory. By default this number is 64, but you can increase it by preallocating that number of garbage entries in the associative array before the dbmopen. You can flush the cache if necessary with the reset command.

If you don't have write access to the dbm file, you can only read associative array variables, not set them. If you want to test whether you can write, either use file tests or try setting a dummy array entry inside an eval, which will trap the error.

Note that functions such as keys() and values() may return huge array values when used on large dbm files. You may prefer to use the each() function to iterate over large dbm files. Example:

	# print out history file offsets
	dbmopen(HIST,'/usr/lib/news/history',0666);
	while (($key,$val) = each %HIST) {
		print $key, ' = ', unpack('L',$val), "\n";
	}
	dbmclose(HIST);

eof(FILEHANDLE)

eof()

eof

Returns 1 if the next read on FILEHANDLE will return end of file, or if FILEHANDLE is not open. FILEHANDLE may be an expression whose value gives the real filehandle name. (Note that this function actually reads a character and then ungetc's it, so it is not very useful in an interactive context.) An eof without an argument returns the eof status for the last file read. Empty parentheses () may be used to indicate the pseudo file formed of the files listed on the command line, i.e. eof() is reasonable to use inside a while (<>) loop to detect the end of only the last file. Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop. Examples:

	# insert dashes just before last line of last file
	while (<>) {
		if (eof()) {
			print "--------------\n";
		}
		print;
	}

	# reset line numbering on each input file
	while (<>) {
		print "$.\t$_";
		if (eof) {	# Not eof().
			close(ARGV);
		}
	}

fcntl(FILEHANDLE,FUNCTION,SCALAR)

Implements the fcntl(2) function. You'll probably have to say

	require "fcntl.ph";	# probably /usr/local/lib/perl/fcntl.ph

first to get the correct function definitions. If fcntl.ph doesn't exist or doesn't have the correct definitions you'll have to roll your own, based on your C header files such as <sys/fcntl.h>. (There is a perl script called h2ph that comes with the perl kit which may help you in this.) Argument processing and value return work just like ioctl below. Note that fcntl will produce a fatal error if used on a machine that doesn't implement fcntl(2).

fileno(FILEHANDLE)

fileno FILEHANDLE

Returns the file descriptor for a filehandle. Useful for constructing bitmaps for select(). If FILEHANDLE is an expression, the value is taken as the name of the filehandle.
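
One plausible way to build such a bitmap is with vec(), setting one bit per descriptor. This is only a sketch: the variable names are illustrative, and the actual descriptor numbers depend on how the program was started.

```perl
$rin = '';
vec($rin, fileno(STDIN), 1) = 1;    # set the bit for STDIN's descriptor
vec($rin, fileno(STDERR), 1) = 1;   # and the bit for STDERR's

# Show the descriptors and the resulting bitmap, lowest bit first.
print "descriptors: ", fileno(STDIN), " and ", fileno(STDERR), "\n";
print "bitmap: ", unpack("b*", $rin), "\n";
```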

flock(FILEHANDLE,OPERATION)

Calls flock(2) on FILEHANDLE. See the manual page for flock(2) for the definition of OPERATION. Returns true for success, false on failure. Will produce a fatal error if used on a machine that doesn't implement flock(2). Here's a mailbox appender for BSD systems:

	$LOCK_SH = 1;
	$LOCK_EX = 2;
	$LOCK_NB = 4;
	$LOCK_UN = 8;

	sub lock {
	    flock(MBOX,$LOCK_EX);
	    # and, in case someone appended
	    # while we were waiting...
	    seek(MBOX, 0, 2);
	}

	sub unlock {
	    flock(MBOX,$LOCK_UN);
	}

	open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}")
		|| die "Can't open mailbox: $!";

	do lock();
	print MBOX $msg,"\n\n";
	do unlock();

getc(FILEHANDLE)

getc FILEHANDLE

getc

Returns the next character from the input file attached to FILEHANDLE, or a null string at EOF. If FILEHANDLE is omitted, reads from STDIN.
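
A minimal sketch of a character-at-a-time loop, using a hypothetical scratch file. The loop ends when getc returns a null string.

```perl
# Hypothetical scratch file holding three characters.
$file = "/tmp/getc.$$";
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "abc";
close(OUT);

open(IN, $file) || die "Can't open $file: $!";
$count = 0;
$seen = '';
while (($ch = getc(IN)) ne '') {    # null string signals EOF
	$count++;
	$seen .= $ch;
	print "char $count is '$ch'\n";
}
close(IN);
unlink($file);
```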

ioctl(FILEHANDLE,FUNCTION,SCALAR)

Implements the ioctl(2) function. You'll probably have to say

	require "ioctl.ph";	# probably /usr/local/lib/perl/ioctl.ph

first to get the correct function definitions. If ioctl.ph doesn't exist or doesn't have the correct definitions you'll have to roll your own, based on your C header files such as <sys/ioctl.h>. (There is a perl script called h2ph that comes with the perl kit which may help you in this.) SCALAR will be read and/or written depending on the FUNCTION--a pointer to the string value of SCALAR will be passed as the third argument of the actual ioctl call. (If SCALAR has no string value but does have a numeric value, that value will be passed rather than a pointer to the string value. To guarantee this to be true, add a 0 to the scalar before using it.) The pack() and unpack() functions are useful for manipulating the values of structures used by ioctl(). The following example sets the erase character to DEL.

	require 'ioctl.ph';
	$sgttyb_t = "ccccs";		# 4 chars and a short
	if (ioctl(STDIN,$TIOCGETP,$sgttyb)) {
		@ary = unpack($sgttyb_t,$sgttyb);
		$ary[2] = 127;
		$sgttyb = pack($sgttyb_t,@ary);
		ioctl(STDIN,$TIOCSETP,$sgttyb)
			|| die "Can't ioctl: $!";
	}

The return value of ioctl (and fcntl) is as follows:

	if OS returns:          perl returns:
	  -1                      undefined value
	  0                       string "0 but true"
	  anything else           that number

Thus perl returns true on success and false on failure, yet you can still easily determine the actual value returned by the operating system:

	($retval = ioctl(...)) || ($retval = -1);
	printf "System returned %d\n", $retval;
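
To see why the idiom works without a real device, the sketch below uses a hypothetical stand-in, fake_ioctl, that maps raw system return values the way the table describes. It is not part of perl.

```perl
# Hypothetical stand-in for ioctl() or fcntl(): maps a raw system
# return value to what perl would hand back.
sub fake_ioctl {
	local($raw) = @_;
	return undef if $raw == -1;         # failure
	return "0 but true" if $raw == 0;   # success, but numerically zero
	$raw;                               # any other value is passed through
}

foreach $code (-1, 0, 5) {
	($retval = &fake_ioctl($code)) || ($retval = -1);
	printf "System returned %d\n", $retval;
}
```

The string "0 but true" is true in a boolean test yet evaluates to 0 in a numeric context, so the three lines printed are -1, 0 and 5.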

open(FILEHANDLE,EXPR)

open(FILEHANDLE)

open FILEHANDLE

Opens the file whose filename is given by EXPR, and associates it with FILEHANDLE. If FILEHANDLE is an expression, its value is used as the name of the real filehandle wanted. If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE contains the filename. If the filename begins with "<" or nothing, the file is opened for input. If the filename begins with ">", the file is opened for output. If the filename begins with ">>", the file is opened for appending. (You can put a '+' in front of the '>' or '<' to indicate that you want both read and write access to the file.) If the filename begins with "|", the filename is interpreted as a command to which output is to be piped, and if the filename ends with a "|", the filename is interpreted as a command which pipes input to us. (You may not have a command that pipes both in and out.) Opening '-' opens STDIN and opening '>-' opens STDOUT. Open returns non-zero upon success, the undefined value otherwise. If the open involved a pipe, the return value happens to be the pid of the subprocess. Examples:

    
	$article = 100;
	open article || die "Can't find article $article: $!\n";
	while (<article>) {...

	open(LOG, '>>/usr/spool/news/twitlog');
					# (log is reserved)

	open(article, "caesar <$article |");
					# decrypt article

	open(extract, "|sort >/tmp/Tmp$$");
					# $$ is our process#

	# process argument list of files along with any includes

	foreach $file (@ARGV) {
		do process($file, 'fh00');	# no pun intended
	}

	sub process {
		local($filename, $input) = @_;
		$input++;		# this is a string increment
		unless (open($input, $filename)) {
			print STDERR "Can't open $filename: $!\n";
			return;
		}
		while (<$input>) {		# note use of indirection
			if (/^#include "(.*)"/) {
				do process($1, $input);
				next;
			}
			...		# whatever
		}
	}

You may also, in the Bourne shell tradition, specify an EXPR beginning with ">&", in which case the rest of the string is interpreted as the name of a filehandle (or file descriptor, if numeric) which is to be duped and opened. You may use & after >, >>, <, +>, +>> and +<. The mode you specify should match the mode of the original filehandle. Here is a script that saves, redirects, and restores STDOUT and STDERR:

	#!/usr/bin/perl
	open(SAVEOUT, ">&STDOUT");
	open(SAVEERR, ">&STDERR");

	open(STDOUT, ">foo.out") || die "Can't redirect stdout";
	open(STDERR, ">&STDOUT") || die "Can't dup stdout";

	select(STDERR); $| = 1;		# make unbuffered
	select(STDOUT); $| = 1;		# make unbuffered

	print STDOUT "stdout 1\n";	# this works for
	print STDERR "stderr 1\n"; 	# subprocesses too

	close(STDOUT);
	close(STDERR);

	open(STDOUT, ">&SAVEOUT");
	open(STDERR, ">&SAVEERR");

	print STDOUT "stdout 2\n";
	print STDERR "stderr 2\n";

If you open a pipe on the command "-", i.e. either "|-" or "-|", then there is an implicit fork done, and the return value of open is the pid of the child within the parent process, and 0 within the child process. (Use defined($pid) to determine if the open was successful.) The filehandle behaves normally for the parent, but i/o to that filehandle is piped from/to the STDOUT/STDIN of the child process. In the child process the filehandle isn't opened--i/o happens from/to the new STDOUT or STDIN. Typically this is used like the normal piped open when you want to exercise more control over just how the pipe command gets executed, such as when you are running setuid, and don't want to have to scan shell commands for metacharacters. The following pairs are more or less equivalent:

	open(FOO, "|tr '[a-z]' '[A-Z]'");
	open(FOO, "|-") || exec 'tr', '[a-z]', '[A-Z]';

	open(FOO, "cat -n '$file'|");
	open(FOO, "-|") || exec 'cat', '-n', $file;

Explicitly closing any piped filehandle causes the parent process to wait for the child to finish, and returns the status value in $?. Note: on any operation which may do a fork, unflushed buffers remain unflushed in both processes, which means you may need to set $| to avoid duplicate output.
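
A fuller sketch of the forking open, assuming a system that implements fork. The handle name KID and the message text are illustrative only. Here the child prints directly rather than exec'ing a command.

```perl
$| = 1;                         # keep unflushed output from being duplicated
$pid = open(KID, "-|");         # implicit fork; child's STDOUT feeds KID
die "Can't fork: $!" unless defined($pid);

if ($pid == 0) {                # in the child
	print "hello from the child\n";
	exit 0;
}

$line = <KID>;                  # in the parent
close(KID);                     # waits for the child; status lands in $?
print "parent read: $line";
print "child status: $?\n";
```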

The filename that is passed to open will have leading and trailing whitespace deleted. In order to open a file with arbitrary weird characters in it, it's necessary to protect any leading and trailing whitespace thusly:

	$file =~ s#^(\s)#./$1#;
	open(FOO, "< $file\0");

pipe(READHANDLE,WRITEHANDLE)

Opens a pair of connected pipes like the corresponding system call. Note that if you set up a loop of piped processes, deadlock can occur unless you are very careful. In addition, note that perl's pipes use stdio buffering, so you may need to set $| to flush your WRITEHANDLE after each command, depending on the application. [Requires version 3.0 patchlevel 9.]
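
As a minimal sketch, assuming fork is available, a child can write one line into the pipe while the parent reads it. Each process closes the end it does not use, and the writer sets $| so the data is not held back by stdio buffering. The handle names are illustrative.

```perl
pipe(READER, WRITER) || die "Can't create pipe: $!";

$pid = fork;
die "Can't fork: $!" unless defined($pid);

if ($pid == 0) {                        # child: the writing end
	close(READER);
	select(WRITER); $| = 1;             # flush after each print
	print WRITER "ping\n";
	close(WRITER);
	exit 0;
}

close(WRITER);                          # parent: the reading end
$reply = <READER>;
close(READER);
wait;                                   # reap the child
print "parent got: $reply";
```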

print(FILEHANDLE LIST)

print(LIST)

print FILEHANDLE LIST

print LIST

print

Prints a string or a comma-separated list of strings. Returns non-zero if successful. FILEHANDLE may be a scalar variable name, in which case the variable contains the name of the filehandle, thus introducing one level of indirection. (NOTE: If FILEHANDLE is a variable and the next token is a term, it may be misinterpreted as an operator unless you interpose a + or put parens around the arguments.) If FILEHANDLE is omitted, prints by default to standard output (or to the last selected output channel--see select()). If LIST is also omitted, prints $_ to STDOUT. To set the default output channel to something other than STDOUT use the select operation. Note that, because print takes a LIST, anything in the LIST is evaluated in an array context, and any subroutine that you call will have one or more of its expressions evaluated in an array context. Also be careful not to follow the print keyword with a left parenthesis unless you want the corresponding right parenthesis to terminate the arguments to the print--interpose a + or put parens around all the arguments.
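
The parenthesis pitfall and the filehandle indirection are easier to see in a short sketch. The variable $fh is illustrative.

```perl
print (1 + 2) * 3;              # prints "3": the parens end print's arguments
print "\n";
print +(1 + 2) * 3;             # prints "9": the + keeps the parens in the expression
print "\n";
print((1 + 2) * 3);             # prints "9": parens around all the arguments
print "\n";

$fh = 'STDOUT';                 # filehandle name held in a scalar
print $fh "via indirection\n";  # one level of indirection
```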

printf(FILEHANDLE LIST)

printf(LIST)

printf FILEHANDLE LIST

printf LIST

Equivalent to a "print FILEHANDLE sprintf(LIST)".

read(FILEHANDLE,SCALAR,LENGTH,OFFSET)

read(FILEHANDLE,SCALAR,LENGTH)

Attempts to read LENGTH bytes of data into variable SCALAR from the specified FILEHANDLE. Returns the number of bytes actually read, or undef if there was an error. SCALAR will be grown or shrunk to the length actually read. An OFFSET may be specified to place the read data at some other place than the beginning of the string. This call is actually implemented in terms of stdio's fread call. To get a true read system call, see sysread.
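
The effect of OFFSET and of a short read can be sketched with a ten-byte scratch file. The file name is hypothetical.

```perl
# Hypothetical scratch file holding the ten bytes "0123456789".
$file = "/tmp/read.$$";
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "0123456789";
close(OUT);

open(IN, $file) || die "Can't open $file: $!";
$n1 = read(IN, $buf, 4);        # returns 4; $buf is "0123"
$n2 = read(IN, $buf, 4, 4);     # returns 4; data placed at offset 4
$both = $buf;                   # "01234567"
$n3 = read(IN, $buf, 100);      # only 2 bytes remain; $buf shrinks to "89"
$n4 = read(IN, $buf, 100);      # at end of file: returns 0
close(IN);
unlink($file);

print "$n1 $n2 $n3 $n4 $both\n";    # prints "4 4 2 0 01234567"
```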

seek(FILEHANDLE,POSITION,WHENCE)

Randomly positions the file pointer for FILEHANDLE, just like the fseek() call of stdio. FILEHANDLE may be an expression whose value gives the name of the filehandle. Returns 1 upon success, 0 otherwise.

select(FILEHANDLE)

select

Returns the currently selected filehandle. Sets the current default filehandle for output, if FILEHANDLE is supplied. This has two effects: first, a write or a print without a filehandle will default to this FILEHANDLE. Second, references to variables related to output will refer to this output channel. For example, if you have to set the top of form format for more than one output channel, you might do the following:

	select(REPORT1);
	$^ = 'report1_top';
	select(REPORT2);
	$^ = 'report2_top';

FILEHANDLE may be an expression whose value gives the name of the actual filehandle. Thus:

	$oldfh = select(STDERR); $| = 1; select($oldfh);

sprintf(FORMAT,LIST)

Returns a string formatted by the usual printf conventions. The * character is not supported.
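
A short sketch of common conversions follows. Where a field width must be computed at run time, one workaround for the missing * is to interpolate the width into the format string. The variable names are illustrative.

```perl
# Left-justified string, right-justified integer, fixed-point, padded hex.
$line = sprintf("%-8s|%5d|%6.2f|%03x", "widget", 42, 3.14159, 255);
print "$line\n";                # prints "widget  |   42|  3.14|0ff"

# Width chosen at run time by building the format string.
$width = 10;
$padded = sprintf("%${width}s", "right");
print "[$padded]\n";            # prints "[     right]"
```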

sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)

sysread(FILEHANDLE,SCALAR,LENGTH)

Attempts to read LENGTH bytes of data into variable SCALAR from the specified FILEHANDLE, using the system call read(2). It bypasses stdio, so mixing this with other kinds of reads may cause confusion. Returns the number of bytes actually read, or undef if there was an error. SCALAR will be grown or shrunk to the length actually read. An OFFSET may be specified to place the read data at some other place than the beginning of the string.

syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)

syswrite(FILEHANDLE,SCALAR,LENGTH)

Attempts to write LENGTH bytes of data from variable SCALAR to the specified FILEHANDLE, using the system call write(2). It bypasses stdio, so mixing this with prints may cause confusion. Returns the number of bytes actually written, or undef if there was an error. An OFFSET may be specified to write the data from some part of the string other than the beginning.
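
The sketch below pairs syswrite with sysread on a hypothetical scratch file. It assumes that OFFSET for syswrite selects where in SCALAR the written data starts. No print or <> is used on these handles, so stdio buffering does not get in the way.

```perl
$file = "/tmp/sysio.$$";        # hypothetical scratch file
$msg = "hello, world\n";

open(OUT, ">$file") || die "Can't create $file: $!";
$w1 = syswrite(OUT, $msg, 5);       # writes "hello"
$w2 = syswrite(OUT, $msg, 6, 7);    # writes "world\n", starting at offset 7
close(OUT);

open(IN, $file) || die "Can't open $file: $!";
$got = sysread(IN, $buf, 100);      # asks for 100 bytes, gets the 11 available
close(IN);
unlink($file);

print "wrote $w1 + $w2, read $got: $buf";
```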

tell(FILEHANDLE)

tell FILEHANDLE

tell

Returns the current file position for FILEHANDLE. FILEHANDLE may be an expression whose value gives the name of the actual filehandle. If FILEHANDLE is omitted, assumes the file last read.
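
A small sketch of tell used together with seek, on a hypothetical ten-byte scratch file. The WHENCE values follow the fseek() convention: 0 counts from the start of the file, 1 from the current position, and 2 from the end.

```perl
$file = "/tmp/seek.$$";         # hypothetical scratch file
open(OUT, ">$file") || die "Can't create $file: $!";
print OUT "abcdefghij";
close(OUT);

open(IN, $file) || die "Can't open $file: $!";
seek(IN, 3, 0);                 # 3 bytes from the start
$c1 = getc(IN);                 # "d"
$pos = tell(IN);                # 4: just past the byte we read
seek(IN, -2, 2);                # 2 bytes before the end
$c2 = getc(IN);                 # "i"
seek(IN, 0, 2);                 # to the very end
$size = tell(IN);               # 10: the file's length
close(IN);
unlink($file);

print "$c1 $pos $c2 $size\n";   # prints "d 4 i 10"
```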

write(FILEHANDLE)

write(EXPR)

write

Writes a formatted record (possibly multi-line) to the specified file, using the format associated with that file. By default the format for a file is the one having the same name as the filehandle, but the format for the current output channel (see select) may be set explicitly by assigning the name of the format to the $~ variable.

Top of form processing is handled automatically: if there is insufficient room on the current page for the formatted record, the page is advanced by writing a form feed, a special top-of-page format is used to format the new page header, and then the record is written. By default the top-of-page format is the name of the filehandle with "_TOP" appended, but it may be dynamically set to the format of your choice by assigning the name to the $^ variable while the filehandle is selected. The number of lines remaining on the current page is in variable $-, which can be set to 0 to force a new page.

If FILEHANDLE is unspecified, output goes to the current default output channel, which starts out as STDOUT but may be changed by the select operator. If the FILEHANDLE is an EXPR, then the expression is evaluated and the resulting string is used to look up the name of the FILEHANDLE at run time. For more on formats, see the section on formats later on.
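
A minimal sketch of write with the default format names for STDOUT. The picture lines and the variables $name and $count are illustrative, and the format syntax itself is only assumed here, since it is covered in the section on formats.

```perl
# Top-of-page format: the filehandle name with "_TOP" appended.
format STDOUT_TOP =
Name       Count
---------- -----
.

# Record format: has the same name as the filehandle.
format STDOUT =
@<<<<<<<<< @>>>>
$name,     $count
.

$records = 0;
foreach $pair ("apples 3", "pears 12") {
	($name, $count) = split(' ', $pair);
	write;                      # header first if needed, then one record
	$records++;
}
```

Because no filehandle is given, each write goes to the current default output channel, which is STDOUT unless select has changed it.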

Note that write is NOT the opposite of read.