Why Python?

I recently advised a friend who wanted to learn computer programming to start with the Python language. I also mentioned that I liked Python to an old friend who is a fellow experienced programmer, but I wasn't very clear about why. Now that I'm in the middle of a project that uses both the Python and C languages, I've come to better understand my reasons for favoring Python both for learning to program and for serious use.

Computer programming is, at its core, communication. At the lowest level, a program instructs a computer how to solve a problem. But at a more important level, a program communicates to people the thought process of the programmer in translating a vague problem into a specific solution. A programming language, then, should be expressive. That is, it should be easy for a programmer to concisely and accurately describe his thoughts, and it should be easy for someone reading the code (often that same programmer years later) to understand the original programmer's intent.

A brief history of programming

Computer hardware is reasonably simple, conceptually (though there are certainly complex details). There is a large memory that stores numbers. Programs move these numbers from one place to another in memory and do math operations on them. Attached devices use numbers to represent dots of color on a screen, letters on a keyboard or printer, the position of a mouse, the movement of a loudspeaker.

The computer also uses numbers in memory to represent instructions to itself. This is called “machine code”, and it was how the very first programmers had to program. If I wanted to write the letter “A” to the screen, I had to know that the screen interprets the number 65 as that letter, that putting it on the screen involved writing that number to a specific address in memory, that the instruction code for “write this number to this address” is another number, and so on. Then I had to put the numbers 248, 65, 2, 193 in the right place in memory and press a start button. This worked, but it was complex, tedious, and error-prone.

It is no accident that the first programs written by early programmers were tools to simplify programming. One such tool is called an assembler. This is a program that takes computer instructions described as more human-readable text and translates them into the raw numbers. For example, I might give the memory address of the screen the name screen. The number representing the write-to-memory instruction is named write, and so on. Now I can type write screen, "A", and the assembler will translate that into 248, 65, 2, 193 for me. Assembly instructions are an exact one-to-one match for machine instructions. They are just a more convenient—and more expressive—way to write them.
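To make the idea concrete, here is a toy assembler sketched in Python. The opcode table, the symbol table, and the reading of 2, 193 as a two-byte screen address are all assumptions made up to match the example above; they don't describe any real machine.

```python
# Toy assembler: translates one line of made-up assembly into numbers.
# Both tables are hypothetical, chosen only to match the example in the text.
OPCODES = {"write": 248}          # instruction name -> instruction number
SYMBOLS = {"screen": (2, 193)}    # address name -> two-byte address (assumed)


def assemble(line):
    """Translate a line such as: write screen, "A" into a list of numbers."""
    mnemonic, operands = line.split(None, 1)
    target, value = [part.strip() for part in operands.split(",", 1)]
    char = value.strip('"')
    return [OPCODES[mnemonic], ord(char), *SYMBOLS[target]]


print(assemble('write screen, "A"'))  # prints [248, 65, 2, 193]
```

Each name maps to exactly one number or group of numbers, which is the one-to-one correspondence described above.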

The next tools were real programming languages, which are a level of abstraction above machine code. Instead of describing machine instructions directly, the programmer used expressions like y = (x + 3) / 2, and a program called a compiler would translate that into a string of instructions to load numbers from memory, do the math, and write them back. The details of exactly how that was accomplished were delegated to the compiler program. A computer is meant to solve problems, so why not have it solve problems about how to program itself? FORTRAN was the first popular such language, although the C language overtook it and remains popular today.

In addition to compilers, there are programs called interpreters that do roughly the same job of translating a programming language into machine instructions, but do so on the fly, as the program is running. This makes using them even simpler since a programmer can try things interactively and get immediate feedback. BASIC is the canonical example of this kind of interpreted language.

Modern languages

Programming languages today add one more level of abstraction. Instead of operating only on numbers directly, or even names given to numbers, they allow you to describe “objects”, which are large collections of numbers that represent real-world things like people, places, accounts, documents, and so on. You express actions in terms of these objects, and the compiler or interpreter then decomposes these into the actions needed on the individual numbers and finally into machine code.

The measure of a modern programming language then is twofold: how well does it allow the programmer to clearly describe the problem to be solved, and how well does it translate that solution into machine code? These goals are often at odds: efficient translation to machine code is often accomplished by making special-case exceptions and back doors in the higher-level abstractions that allow the programmer to fiddle with the numbers directly at the expense of clarity and safety.

The best example of doing this badly is the C++ language. It is an extension to the C language that adds some modern abstraction tools, but it retains all of the low-level number-twiddling of C, which allows—indeed encourages—programmers to step outside of the abstractions. The resulting programs are complex, hard to understand, loaded with exceptions, and hard to debug.

The Java language does a better job, producing efficient machine code while maintaining well-defined higher-level abstractions with few exceptions. It achieves this in part by requiring the programmer to be very explicit about a lot of implementation details that don't really express programmer intent, so it tends to be verbose and hard to read and write.

The Python language manages to maintain consistent and clear use of its high-level abstractions without special cases, and yet runs remarkably fast. It achieves this mostly at the expense of using more memory at runtime than other languages. Internally, Python leans heavily on hash tables (its dictionaries) to speed up the namespace and associative-array lookups that underlie its expressiveness. This is a good tradeoff. Today, memory is a lot cheaper than time. Time saved running a program, and time saved writing it, more than make up for a running Python program taking up perhaps twice the memory of a comparable Java program.

Examples

Python also adds many features that increase expressiveness without sacrificing either efficiency or high-level abstraction. Things like multiple-value function returns, default function arguments, named arguments, and tuple assignment allow a programmer to provide the information he wants to see in the code and eliminate much that isn't really expressive but merely pro forma. Here are some geeky examples for those who want (and understand) details:

Let's say I have a function of two arguments, the second of which is a value from 1 to 100, but almost always a default value the caller may not know about. In C++, I'd have two choices of how to write this. One, and the way you'd do it in C, would be to use a value like 0 to mean “use the default”:

void dosomething(int a, int b) {
    if (b == 0) { b = 53; }
    . . .
}
. . .
dosomething(5, 12);
dosomething(5, 0);

What would a programmer reading this later think on seeing the second call? It looks as if 0 is just an ordinary argument, so the reader might be surprised that it gets changed. Maybe some later change makes 0 a valid argument, and we have to pick a different default marker. If that happens, and we run across that call in another piece of code, is it a valid call with 0 or was it the old default?

The other option in C++ uses function overloading:

void dosomething(int a, int b) { . . . }
void dosomething(int a) { dosomething(a, 53); }
. . .
dosomething(5, 12);
dosomething(5);

Now we don't have the problem of confusing a real value with a default marker, but we have a new problem. In C++, overloaded functions are not related in any way and might do completely different things. The first call might draw a line on the screen while the second one plays music. A programmer might reasonably expect the latter call to be a default case of the former, but he can't rely on it. In Python, two function calls with the same name in the same place are guaranteed to call the same function. Omitting an argument means “use the default”, and can mean nothing else.

def dosomething(a, b=53):
    ...  # body elided

dosomething(5, 12)
dosomething(5)

Python doesn't need overloading because its arguments aren't typed (more on this later). You treat arguments of different types differently by checking them explicitly, and only when necessary. In this way, the code more closely matches the programmer's intent. Both languages can do what we want, but in C++ it's easy to get it wrong, while in Python it's easy to get it right, and the result is more lucid. Even the much-maligned feature of Python that code indentation is significant helps catch errors by forcing the physical appearance of the code to match its real meaning.
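Here is a short sketch of the other features mentioned earlier: named arguments, multiple-value returns, tuple assignment, and an explicit type check. The function names and values are made up purely for illustration.

```python
# Hypothetical functions, for illustration only.
def describe(a, b=53, label="value"):
    """Default and named arguments: callers state only what differs."""
    return "{0}: {1}/{2}".format(label, a, b)


def min_max(values):
    """Multiple-value return: hand back two results as a tuple."""
    return min(values), max(values)


def double(x):
    """Check the argument's type explicitly, and only where it matters."""
    if isinstance(x, str):
        x = int(x)            # accept "21" as well as 21
    return x * 2


print(describe(5))                    # b takes its default
print(describe(5, label="n", b=12))   # named arguments, in any order
low, high = min_max([3, 9, 4])        # tuple assignment unpacks both results
low, high = high, low                 # swap without a temporary variable
print(low, high)
print(double("21"), double(21))
```

The single double() function plays the role that two overloads would play in C++, and the type check appears only on the one line where the difference matters.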

Many of the more expressive idioms in Python come from the world of “functional programming”, a field of study in computer science that uses functions as the overriding abstraction rather than objects—more about verbs, less about nouns. Python is not a functional language itself. It is firmly grounded in the not-so-old school of organizing your problem first by the things involved and then what they do. But its carefully-borrowed features from that world make it capable of expressing complex actions more clearly and effectively than is possible in many other languages.

Let's say you have two lists of things called lista and listb and a function f(), and you want to create a new list containing f() of each thing in lista that also appears in listb. In Java, most of your work is specifying implementation (my Java is a little rusty, so forgive me if there are errors; I'm just trying to convey the flavor of the code):

List<Thing> nl = new ArrayList<Thing>();
Iterator<Thing> it = lista.iterator();

// Walk lista, keeping f() of each element that also appears in listb
while (it.hasNext()) {
    Thing a = it.next();
    if (listb.contains(a)) {
        nl.add(f(a));
    }
}

Here's the equivalent Python:

nl = [ f(x) for x in lista if x in listb ]

The advantage here is not just that it's easier to type, but that it clearly and concisely describes what the programmer wants, and no more. I didn't have to tell the computer exactly how to do what I wanted, just what I wanted. And this code will be easier for me and others to understand later.
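For anyone who wants to try it, here is a runnable version of that one-liner. The sample lists and the body of f() are invented for the sake of the example.

```python
# Sample data and f() are made up for illustration.
def f(x):
    return x * x


lista = [1, 2, 3, 4, 5, 6]
listb = [2, 4, 6, 8]

# f() of each thing in lista that also appears in listb
nl = [f(x) for x in lista if x in listb]
print(nl)  # prints [4, 16, 36]
```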

“But wait”, I hear Java programmers saying about both of my examples, “Python code isn't type-safe!” That's true. My Python list here might contain non-Things, and the code will still compile. More, it will actually work as long as f(x) succeeds for whatever it finds in lista. Some believe that strict type safety catches programmer mistakes. I believe (and some evidence suggests) that this is a myth. Strict typing gets in the way more than it helps. And here's an important point: if I wanted to add strict type-checking to the Python code, I could, making it look more like the Java code. So the Python code is simpler for the more generally useful case, and becomes more complex only when the programmer wants the extra checking.
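As a rough sketch of what opting in to that checking could look like, here is the same filter-and-map with an explicit check added. The Thing class, the body of f(), and the name checked_filter_map are hypothetical stand-ins, not part of any real library.

```python
class Thing:
    """Hypothetical stand-in for the Java Thing type."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        return isinstance(other, Thing) and self.name == other.name

    def __hash__(self):
        return hash(self.name)


def f(thing):
    # Made-up transformation, for illustration only
    return Thing(thing.name.upper())


def checked_filter_map(lista, listb):
    """Same result as the one-line comprehension, but rejects non-Things."""
    for x in lista:
        if not isinstance(x, Thing):
            raise TypeError("expected Thing, got {0}".format(type(x).__name__))
    return [f(x) for x in lista if x in listb]


result = checked_filter_map([Thing("a"), Thing("b")], [Thing("a")])
print([t.name for t in result])  # prints ['A']
```

The check costs a few extra lines, but only the programmer who wants it pays for it.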

Performance and conclusion

As a final note, I'm sure others will point out that speed of execution is critical in many applications, and that Python may not be suitable for those. That's true, but such cases are fewer than you might think. I've used Python for graphics, sound, number crunching, database access and many other things that might seem performance-hungry, and it's up to the task. It's certainly on par with Java. If you have written a program in Python and it's not fast enough, odds are it's your algorithm, not your language. Even if it is the language, it's likely that you've saved enough time in development that you could translate your slow—but working and clearly defined—code into C and still have spent less total time than it would have taken to develop in C in the first place, and have fewer bugs.
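Here is one small illustration of an algorithm fix, using the earlier list comprehension. The test x in listb scans a list from the start every time, so the work grows with the size of lista times the size of listb. Converting listb to a set first turns each test into a hash lookup. This sketch assumes the items are hashable; the sample data and f() are made up.

```python
# Sample data and f() are made up for illustration.
def f(x):
    return x + 1


lista = list(range(0, 2000))
listb = list(range(0, 2000, 2))

slow = [f(x) for x in lista if x in listb]   # scans listb for every x

setb = set(listb)                            # one-time conversion
fast = [f(x) for x in lista if x in setb]    # hash lookup for every x

print(slow == fast)  # prints True
```

Both versions produce the same list; only the cost of the membership test changes, and the language stays the same.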

My OneJoker library is a hybrid: the core stuff that needs to run blindingly fast is in C, and Python code links to it. This is so that I can write the code for a simulation in nice readable Python, and still do millions of hands in reasonable time. Let's say I wanted to compare the odds of starting with ace-king in a Texas Hold'em game against pocket deuces. Here's the whole program, using my library:

#!/usr/bin/env python3
import onejoker as oj

h1 = oj.Sequence(7, "Ac Kh")
h2 = oj.Sequence(7, "2s 2d")

deck = oj.Sequence(52)
deck.fill()
deck.remove(h1)
deck.remove(h2)

wins1, wins2, ties = 0, 0, 0
boards = oj.Iterator(deck, 5)

for b in boards.all():
    h1.append(b)
    v1, h = oj.poker_best5(h1)
    h1.truncate(2)

    h2.append(b)
    v2, h = oj.poker_best5(h2)
    h2.truncate(2)

    if v1 < v2:
        wins1 += 1
    elif v2 < v1:
        wins2 += 1
    else:
        ties += 1

print("{0:10d} boards".format(boards.total))
print("{0:10d} wins for {1}".format(wins1, h1))
print("{0:10d} wins for {1}".format(wins2, h2))
print("{0:10d} ties".format(ties))

Run this, and it dutifully prints:

1712304 boards
 799119 wins for (Ac Kh)
 903239 wins for (2s 2d)
   9946 ties

in about a minute and a half on my old laptop. And this is actually a pretty bad case for my library: I have run other simulations that complete billions of hands in minutes. If I wanted an approximate answer faster instead of waiting for the exact one, I could replace the line for b in boards.all() with for b in boards.random(10000).
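The board count in that output is easy to verify independently. With four cards removed, 48 remain in the deck, and the number of possible five-card boards is 48 choose 5. This check uses only the standard library (math.comb needs Python 3.8 or later) and the figures printed above; it does not use OneJoker.

```python
import math

boards = math.comb(48, 5)                   # 5-card boards from 48 remaining cards
wins1, wins2, ties = 799119, 903239, 9946   # figures copied from the run above

print(boards)                               # prints 1712304
print(wins1 + wins2 + ties == boards)       # prints True

# Express each outcome as a share of all boards
for label, count in (("Ac Kh", wins1), ("2s 2d", wins2), ("ties", ties)):
    print("{0:>6} {1:6.2f}%".format(label, 100 * count / boards))
```

The three counts add up to the total, and the pocket deuces come out ahead by roughly 53% to 47%.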

So I repeat my recommendation for others out there who may be looking to get into computer programming: try Python. If you want to learn C or Java after that, go ahead. But if anyone suggests you learn C++, run as fast as you can.