The World Wide Web Security FAQ

	Up to Table of Contents
Back to CGI Scripts		Forward to Server Logs and Privacy

7. Safe Scripting in Perl

Q44: How do I avoid passing user variables through a shell when calling exec() and system()?

In Perl, you can invoke external programs in many different ways. You can capture the output of an external program using backticks:

   $date = `/bin/date`;

You can open up a pipe to a program:

   open (SORT, " | /usr/bin/sort | /usr/bin/uniq");

You can invoke an external program and wait for it to return with system():

   system "/usr/bin/sort < foo.in";

or you can invoke an external program and never return with exec():

   exec "/usr/bin/sort < foo.in";

All of these constructions can be risky if they involve user input that may contain shell metacharacters. For system() and exec(), there's a somewhat obscure syntactical feature that allows you to call external programs directly rather than going through a shell. If you pass the arguments to the external program, not in one long string, but as separate members in a list, then Perl will not go through the shell and shell metacharacters will have no unwanted side effects. For example:

   system "/usr/bin/sort","foo.in";

You can take advantage of this feature to open up a pipe without going through a shell. By calling open on the magic character sequence |-, you fork a copy of Perl and open a pipe to the copy. The child copy can then exec another program using the argument list variant of exec().

   my $result =  open (SORT,"|-");
   die "Couldn't open pipe to subprocess" unless defined($result);
   exec "/usr/bin/sort",$uservariable or die "Couldn't exec sort"
        if $result == 0;
   for my $line (@lines) {
     print SORT $line,"\n";
   }
   close SORT;

The initial call to open() tries to fork a copy of Perl. If the call fails it returns an undefined value and the script immediately dies (you might want to do something more sophisticated, such as sending an HTML error message to the user). Otherwise, the result will return zero to the child process, and the child's process ID to the parent. The child process checks the result value, and immediately attempts to exec the sort program. If something fails at this point, the child quits.

The parent process can then print to the SORT filehandle in the normal way.

To read from a pipe without opening up a shell, you can do something similar with the sequence -|:

   $result = open(GREP,"-|");
   die "Couldn't open pipe to subprocess" unless defined($result);
   exec "/usr/bin/grep",'-i',$userpattern,$filename
              or die "Couldn't exec grep" if $result == 0;
   while (<GREP>) {
     print "match: $_";
   }
   close GREP;

These are the forms of open() you should use whenever you would otherwise perform a piped open to a command.

An even more obscure feature allows you to call an external program and lie to it about its name. This is useful for calling programs that behave differently depending on the name by which they were invoked.

The syntax is

   system $real_name "fake_name","argument1","argument2"

For example:

   $shell = "/bin/sh"

   system $shell "-sh","-norc"

This invokes the shell using the name "-sh", forcing it to behave interactively. Note that the real name of the program must be stored in a variable, and that there's no comma between the variable holding the real name and the start of the argument list.

There's also a more compact syntax for this construction:

   system { "/bin/sh" } "-sh","-norc"

Q45: What are Perl taint checks? How do I turn them on?

As we've seen, one of the most frequent security problems in CGI scripts is inadvertently passing unchecked user variables to the shell. Perl provides a "taint" checking mechanism that prevents you from doing this. Any variable that is set using data from outside the program (including data from the environment, from standard input, and from the command line) is considered tainted and cannot be used to affect anything else outside your program. The taint can spread. If you use a tainted variable to set the value of another variable, the second variable also becomes tainted. Tainted variables cannot be used in eval(), system(), exec() or piped open() calls. If you try to do so, Perl exits with a warning message. Perl will also exit if you attempt to call an external program without explicitly setting the PATH environment variable.

You turn on taint checks in version 4 of Perl by using a special version of the interpreter named "taintperl":

   #!/usr/local/bin/taintperl

In version 5 of perl, pass the -T flag to the interpreter:

   #!/usr/local/bin/perl -T

See below for how to "untaint" a variable.

See Gunther Birznieks' CGI/Perl Taint Mode FAQ for a full discussion of taint mode.

Q46: OK, I turned on taint checks like you said. Now my script dies with the message: "Insecure $ENV{PATH} at line XX" every time I try to run it!

Even if you don't rely on the path when you invoke an external program, there's a chance that the invoked program might. Therefore you need to include the following line towards the top of your script whenever you use taint checks:

   $ENV{'PATH'} = '/bin:/usr/bin:/usr/local/bin';

Adjust this as necessary for the list of directories you want searched. It's not a good idea to include the current directory (".") in the path.

Q47: How do I "untaint" a variable?

Once a variable is tainted, Perl won't allow you to use it in a system(), exec(), piped open, eval(), backtick command, or any function that affects something outside the program (such as unlink). You can't use it even if you scan it for shell metacharacters or use the tr/// or s/// commands to remove metacharacters. The only way to untaint a tainted variable is by performing a pattern matching operation on it and extracting the matched substrings. For example, if you expect a variable to contain an e-mail address, you can extract an untainted copy of the address in this way:

   $mail_address=~/(\S+)\@([\w.-]+)/;
   $untainted_address = "$1\@$2";

This pattern match accepts e-mail addresses of the form "who@where" where "where" looks like a domain name, and "who" consists of one or more non-whitespace characters. Note that this regular expression will not remove shell meta-characters from the e-mail address. This is because it is perfectly valid for e-mail addresses to contain such characters, as in:

fred&barney@bedrock.com

Just because you have untainted a variable doesn't mean that it is now safe to pass it to a shell. E-mail addresses are the perfect examples of this. The taint checks are there in order to force you to recognize when a variable is potentially dangerous. Use the techniques described in Q44 to avoid passing dangerous variables to the shell.

Q48: I'm removing shell metacharacters from the variable, but Perl still thinks it's tainted!

See the answer to the question above. The only way to untaint a variable is to extract substrings using a pattern matching operation.

Q49: Is it true that the pattern matching operation `$foo=~/$user_variable/` is unsafe?

A frequent task for Perl CGI scripts is to take a list of keywords provided by the remote user and to use them in a patttern matching operation to fetch a list of matching file names (or something similar). This, in and of itself, isn't dangerous. What is dangerous is an optimization that many Perl programmers use to speed up the pattern matching operation. When you use a variable inside a pattern matching operation, the pattern is recompiled every time the operation is invoked. In order to avoid this expensive recompilation, you can provide the "o" flag to the pattern matching operation to tell Perl to compile the expression once:

    foreach (@files) {

       m/$user_pattern/o;

    }

Now, however, Perl will ignore any changes you make to the user variable, making this sort of loop fail:

    foreach $user_pattern (@user_patterns) {
       foreach (@files) {
          print if m/$user_pattern/o;
       }
    }

To get around this problem Perl programmers often use this sort of trick:

   foreach $user_pattern (@user_patterns) {
      eval "foreach (\@files) { print if m/$user_pattern/o; }";
   }

The problem here is that the eval() statement involves a user-supplied variable. Unless this variable is checked carefully, the eval() statement can be tricked into executing arbitrary Perl code. (For example of what can happen, consider what the eval statement does if the user passes in this pattern: "/; system 'rm *'; /"

The taint checks described above will catch this potential problem. Your alternatives include using the unoptimized form of the pattern matching operation, or carefully untainting user-supplied patterns. In Perl5, a useful trick is to use the escape sequence \Q \E to quote metacharacters so that they won't be interpreted:

   print if m/\Q$user_pattern\E/o;

Q50: My CGI script needs more privileges than it's getting as user "nobody". How do I run a Perl script as suid?

First of all, do you really need to run your Perl script as suid? This represents a major risk insofar as giving your script more privileges than the "nobody" user has also increases the potential for damage that a subverted script can cause. If you're thinking of giving your script root privileges, think it over extremely carefully.

You can make a script run with the privileges of its owner by setting its "s" bit:

   chmod u+s foo.pl

You can make it run with the privileges of its owner's group by setting the s bit in the group field:

   chmod g+s foo.pl

However, many Unix systems contain a hole that allows suid scripts to be subverted. This hole affects only scripts, not compiled programs. On such systems, an attempt to execute a Perl script with the suid bits set will result in a nasty error message from Perl itself.

You have two options on such systems:

You can apply a patch to the kernel that disables the suid bits for scripts. Perl will detect these bits nevertheless and do the suid function safely. See the Perl faq for details on obtaining this kernel patch. This faq can be found at:
ftp://rtfm.mit.edu/pub/usenet-by-group/comp.lang.perl/
You can put a C wrapper around the program. A typical wrapper looks like this:
```
       #include <unistd.h>
       void main () {
       execl("/usr/local/bin/perl","foo.pl","/local/web/cgi-bin/foo.pl",NULL);
       }
       
```
After compiling this program, make it suid. It will run under its owner's permission, launching a Perl interpreter and executing the statements in the file "foo.pl".

Another option is to run the server itself as a user that has sufficient privileges to do whatever the scripts need to do. If you're using the CERN server, you can even run as a different user for each script. See the CERN documentation for details.

	Up to Table of Contents
Back to CGI Scripts		Forward to Server Logs and Privacy

Lincoln D. Stein (lstein@cshl.org)
WWW Consortium

Last modified: Sun Dec 20 12:47:56 MET 1998