03 October 2003

Perl Programming for Everybody

by Joseph DiVerdi

An Apologia

Let me start off with an admission: Perl is my very most favorite programming language in the whole world. If some task before me can be performed in Perl (and there are few that cannot) I will do everything I possibly can to perform it in Perl. Now usually I'm not that fanatical about programming languages. I don't really consider myself a real programmer (a view readily offered by others). Rather, I view myself more as an application-driven or as-needed programmer. I embrace the field of computer science as a scholarly and important enterprise; I've even contributed to that field with one of my own progeny. I can often appreciate the beauty and occasional subtlety in a well-crafted code fragment. But for my particular work, a computer program is a tool which is necessary to accomplish some particular computational or textual analysis or synthesis. It's part of getting my job done - nothing more, nothing less. That said, there are very few days in which I don't find myself carrying out some sort of analysis or synthesis suitable for implementation by computer program. These are the times when I usually turn to Perl.

I should note clearly and without chance of misinterpretation that although my enthusiasm for Perl borders on the extreme, I did not (and will not) write that Perl is "the best programming language in the world". The spectrum of tasks which call upon computer programs is too wide and far-ranging for that claim to be made in favor of any single language. I find that when such a claim is made it is invariably accompanied by a set of limiting conditions (either explicitly stated, implied, or worse, assumed). Beware such claims, for they may be true in their particular domain but they are usually made when the claimant wants others (perhaps you) to fall in line and use the language of his or her choice. This is perhaps a penetrating test case for challenging one's true views on diversity.

In any event, my purpose in embarking on this series of articles is to provide a taste of Perl through a series of different examples including various numerical calculations and text processing. It is important to note that this series is intended for those already familiar with some sort of programming language.

Perl in a Nutshell

Perl is a high-level programming language which is easy to use, nearly unlimited in its capabilities, mostly fast, extremely portable, and very expressive. On the other hand, it can be kind of ugly, can be written so it is very hard to read (intimates enjoy the fun-filled, annual "Obfuscated Perl Contest"), and isn't particularly easy to learn. It's quite good for those quick-and-dirty programs that you cook up in a few minutes and it works just as well for those big projects that take multiple programmers multiple time units to complete. Perhaps its most compelling attribute is that it is free - really free. Some folks will try to charge you for it but don't be fooled. The only Perl items for which I've ever parted with my hard-earned shekels are printed books from which I learn more about my favorite language.

The Perl language is reputed to be optimized for problems which are about 90% working with text and about 10% everything else. That pretty much sums up most programs written these days. Perl is actively being used in such far-flung applications as the US Census, the Human Genome Project and its ilk, aerospace engineering, linguistics, graphics, database manipulation, dynamic web page generation, network management, and many financial futures analyses on Wall Street. A Perl-insider joke is that the next big stock market crash will probably be triggered by a bug in someone's Perl program.

Although Perl was conceived, born, and grew up on the Unix operating system platforms, it has outgrown its roots and is available on a wide range of platforms including the pre-OS X Macintosh and the Windows family of OSes. Perl also grew up with the World Wide Web and was, and still is, the most popular programming language for the CGI (Common Gateway Interface) programs which do the "heavy lifting" behind web sites. Hassan Schroeder, Sun Microsystems' first webmaster, has called Perl the "Duct Tape of the Internet".

Getting Perl and Perl Information

CPAN is the Comprehensive Perl Archive Network, the font from which most things Perl flow. From this site one can get the source code for Perl, "binary distributions" or ready-to-install packages for all sorts of operating systems, tutorials, documentation, "modules" which are easily installed extensions to the language, example code, and much more. The amount of information available on this site is daunting but in time you will find the various nooks and crannies quite familiar. In the meantime, there is an on-site search engine to help.

While I've been accused of receiving kickbacks from the publisher O'Reilly and Associates (I don't), I do find its series of Perl books to be outstanding, recommend particular ones constantly, and have even made some of them the basis for Perl courses I have taught. For your first book on learning Perl I recommend:

Learning Perl, 3rd Edition
by Randal Schwartz & Tom Phoenix
ISBN 0-596-00132-0

The first and final word on Perl, known to the cognoscenti as the "Camel Book" because of its cover, is co-authored by Larry Wall, the creator of Perl and its chief architect. Every serious Perl programmer has a copy and uses it often.

Programming Perl, 3rd Edition
by Larry Wall, Tom Christiansen, & Jon Orwant
ISBN 0-596-00027-8

Lastly, a book which I would choose as one of the three I'd want to have with me on a deserted island is:

Perl 5 Pocket Reference, 3rd Edition
by Johan Vromans
ISBN 0-596-00032-4

Running Perl

There are principally two environments for running Perl programs: (1) on your own computer and (2) on your or someone else's web server.

Programs can be run directly from your own desktop to accomplish many useful tasks. For example, I maintain a number of special-interest name lists which occasionally require me to send out a bulk e-mailing to their members. I use a small Perl program of my own design to accomplish this because the commercially-available bulk e-mail programs just don't do exactly what I want. Additionally, my wife and I enjoy solving crossword puzzles and we have often been frustrated when looking up a word in a printed dictionary (yes, we use external resources to solve puzzles). Just try finding all possible five-letter words with "o" as the third and "e" as the fifth letter in a regular dictionary. My hand-rolled and very short program accepts a specification like "..o.e" for the previous example, searches a dictionary file, and returns an alphabetized list of matching words. Very spiffy.

Alternatively, you will automatically execute Perl CGI programs when viewing certain of your web site pages. Others who visit your web pages will also execute the Perl programs. These are pages which display dynamic content, that is, content which didn't exist before the page was requested. A good example is a page on the National Institute of Standards and Technology (NIST) web site which displays the current time and date derived from an atomic clock. I can assure you that the good folks who maintain this site do not write a new page of HTML every second. Rather a CGI program, written in Perl, writes the appropriate HTML when and only when a request is received.
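
A minimal sketch of how a crossword lookup like the one mentioned above might work follows. It is an illustration only, not the actual program described: the subroutine name match_pattern and the sample word list are hypothetical, and a real run would read its words from a dictionary file (on many Unix systems /usr/share/dict/words, though that path is an assumption). Each "." in the specification stands for any single letter, which happens to be exactly what "." means in a Perl regular expression.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the alphabetized, de-duplicated words that match a crossword
# specification such as "..o.e", where "." stands for any single letter.
# (match_pattern is a hypothetical name chosen for this sketch.)
sub match_pattern {
    my ($pattern, @words) = @_;

    # Accept only letters and dots so the specification is safe to use
    # directly as a regular expression.
    die "pattern may contain only letters and dots\n"
        unless $pattern =~ /\A[A-Za-z.]+\z/;

    # Anchor at both ends so the word length must equal the pattern length;
    # the /i flag makes the match case-insensitive.
    my $regex = qr/\A$pattern\z/i;

    my %seen;
    my @matches = grep { /$regex/ && !$seen{ lc($_) }++ } @words;
    return sort { lc($a) cmp lc($b) } @matches;
}

# Hypothetical sample list standing in for a real dictionary file.
my @sample = qw(stone above prove apple whole Chose house);

print "$_\n" for match_pattern('..o.e', @sample);
```

Run as written, this prints above, Chose, prove, stone, and whole, one per line. To search a real word list instead, the sample array could be filled by reading a file line by line and removing each trailing newline with chomp.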

Where We're Going With This Series

In upcoming articles I will show how various tasks relevant to different scientific fields can be accomplished using Perl programs. Some of these will be numeric computations, others will be text analysis. All will be pure Perl.

A Special Offer to SAS Members

Since it can be an inconvenient chore and a potential barrier to have to install Perl on your own desktop computer before trying out programs, I am offering a service to SAS members. The service is an account on one of my servers for the express purpose of learning Perl. The account will permit SSH access (Secure Shell, a secure form of telnet), SFTP access (Secure File Transfer Protocol), and HTTP/CGI access (the famous World Wide Web). This service will be available only to SAS members in good standing, for a limited (but useful) time, and after agreeing to a Terms of Use Agreement which describes all sorts of bad things that you will not do with this account. So if you're thinking of wearing a black hat then don't apply. More details on this offer soon.


Joseph DiVerdi can often be found happily getting his job done using Perl. Contact him at diverdi@xtrsystems.com.