Why Python?

I recently counseled a friend who wanted to learn about computer programming to start by learning the Python language. I also mentioned that I liked Python to an old friend who is a fellow experienced programmer, but I wasn't very clear about why. Now that I'm in the middle of a project that uses both the Python and C languages, I've come to better understand my reasons for favoring Python both for learning to program and for serious use.

Computer programming is, at its core, communication. At the lowest level, a program instructs a computer how to solve a problem. But at a more important level, a program communicates to people the thought process of the programmer in translating a vague problem into a specific solution. A programming language, then, should be expressive. That is, it should be easy for a programmer to concisely and accurately describe his thoughts, and it should be easy for someone reading the code (often that same programmer years later) to understand the original programmer's intent.

A brief history of programming

Computer hardware is reasonably simple, conceptually (though there are certainly complex details). There is a large memory that stores numbers. Programs move these numbers from one place to another in memory and do math operations on them. Attached devices use numbers to represent dots of color on a screen, letters on a keyboard or printer, the position of a mouse, the movement of a loudspeaker.

The computer also uses numbers in memory to represent instructions to itself. This is called “machine code”, and was how the very first programmers had to program. If I wanted to write the letter “A” to the screen, I had to know that the screen interprets the number 65 as that letter, and that putting it on the screen involved writing that number to a specific address in memory, and that the instruction code for “write this number to this address” is another number, and so on. Then I put the numbers 248, 65, 2, 193 in the right place in memory and press a start button. This worked, but was complex, tedious, and error-prone.

It is no accident that the first programs written by early programmers were tools to simplify programming. One such tool is called an assembler. This is a program that takes computer instructions described as more human-readable text and translates them into the raw numbers. For example, I might give the memory address of the screen the name screen. The number representing the write-to-memory instruction is named write, and so on. Now I can type write screen, "A", and the assembler will translate that into 248, 65, 2, 193 for me. Assembly instructions are an exact one-to-one match for machine instructions. They are just a more convenient—and more expressive—way to write them.

The next tools were real programming languages, which are a level of abstraction above machine code. Instead of describing machine instructions directly, the programmer used expressions like y = (x + 3) / 2, and a program called a compiler would translate that into a string of instructions to load numbers from memory, do the math, and write them back. The details of exactly how that was accomplished were delegated to the compiler program. A computer is meant to solve problems, so why not have it solve problems about how to program itself? FORTRAN was the first popular such language, although the C language overtook it and remains popular today.

In addition to compilers, there are programs called interpreters that do roughly the same job of translating a programming language into machine instructions, but do so on the fly, as the program is running. This makes using them even simpler since a programmer can try things interactively and get immediate feedback. BASIC is the canonical example of this kind of interpreted language.

Modern languages

Programming languages today add one more level of abstraction. Instead of operating only on numbers directly, or even names given to numbers, they allow you to describe “objects”, which are large collections of numbers that represent real-world things like people, places, accounts, documents, and so on. You express actions in terms of these objects, and the compiler or interpreter then decomposes these into the actions needed on the individual numbers and finally into machine code.

The measure of a modern programming language then is twofold: how well does it allow the programmer to clearly describe the problem to be solved, and how well does it translate that solution into machine code? These goals are often at odds: efficient translation to machine code is often accomplished by making special-case exceptions and back doors in the higher-level abstractions that allow the programmer to fiddle with the numbers directly at the expense of clarity and safety.

The best example of doing this badly is the C++ language. It is an extension to the C language that adds some modern abstraction tools, but it retains all of the low-level number-twiddling of C, which allows—indeed encourages—programmers to step outside of the abstractions. The resulting programs are complex, hard to understand, loaded with exceptions, and hard to debug.

The Java language does a better job, producing efficient machine code while maintaining well-defined higher-level abstractions with few exceptions. It achieves this in part by requiring the programmer to be very explicit about a lot of implementation details that don't really express programmer intent, so it tends to be verbose and hard to read and write.

The Python language manages to maintain consistent and clear use of its high-level abstractions without special cases, and yet produces machine instructions that are remarkably efficient in terms of speed. It achieves this goal mostly at the expense of using more memory at runtime than other languages. Python programs internally use dozens of big hash tables to speed up the namespace and associative-array lookups that accomplish its expressiveness. This is a good tradeoff. Today, memory is a lot cheaper than time. Time saved running a program—and time saved writing it—more than make up for the fact that a running Python program takes up twice the memory of a running Java program.

Examples

Python also adds many features that increase expressiveness without sacrificing either efficiency or high-level abstraction. Things like multiple-value function returns, default function arguments, named arguments, and tuple assignment allow a programmer to provide the information he wants to see in the code and eliminate much that isn't really expressive but merely pro forma. Here are some geeky examples for those who want (and understand) details:

Let's say I have a function of two arguments, the second of which is a value from 1 to 100, but almost always a default value the caller may not know about. In C++, I'd have two choices of how to write this. One, and the way you'd do it in C, would be to use a value like 0 to mean “use the default”:

void dosomething(int a, int b) {
    if (b == 0) { b = 53; }
    . . .
}
. . .
dosomething(5, 12);
dosomething(5, 0);

What would a programmer reading this later think on seeing the call? It looks as if 0 is just an ordinary passed argument and might be surprised that it gets changed. Maybe some change has made 0 a valid argument now, and we'll have to change the default marker. If that happens, and we run across that call in another piece of code, is that call a valid one with 0 or was it the old default?

The other option in C++ uses function overloading:

void dosomething(int a, int b) { . . . }
void dosomething(int a) { dosomething(a, 53); }
. . .
dosomething(5, 12);
dosomething(5);

Now we don't have the problem of confusing a real value with a default marker, but we have a new problem. In C++, overloaded functions are not related in any way and might do completely different things. the first call might draw a line on the screen while the second one plays music. A programmer might reasonably expect the latter call to be a default case of the former, but he can't rely on it. In Python, two function calls with the same name in the same place are guaranteed to call the same function. Omitting an argument means “use the default”, and can mean nothing else.

def dosomething(a, b = 53):
. . .
dosomething(5, 12)
dosomething(5)

Python doesn't need overloading because its arguments aren't typed (more on this later). You treat the arguments of different type differently by checking them explicitly only when necessary. In this way, the code more closely matches the programmer intent. Both languages can do what we want, but in C++ it's easy to get it wrong, while in Python it's easy to get right and is more lucid. Even the much-maligned feature of Python that code indentation is significant helps catch errors by forcing the physical appearance of the code to match its real meaning.

Many of the more expressive idioms in Python come from the world of “functional programming”, a field of study in computer science that uses functions as the overriding abstraction rather than objects—more about verbs, less about nouns. Python is not a functional language itself. It is firmly grounded in the not-so-old school of organizing your problem first by the things involved and then what they do. But its carefully-borrowed features from that world make it capable of expressing complex actions more clearly and effectively than is possible in many other languages.

Let's say you have two lists of things called lista and listb, a function f(), and you want to create another list of f() of things from lista where that thing also appears in listb. In Java, most of your work is specifying implementation (my Java is a little rusty, so forgive me if there are errors; I'm just trying to convey the flavor of the code):

List<Thing> nl = new ArrayList<Thing>();
Iterator it = lista.iterator();

while (it.hasNext()) {
    <Thing>a = it.next();
    if (listb.contains(a)) {
        nl.add(f(a));
    }
}

Here's the equivalent Python:

nl = [ f(x) for x in lista if x in listb ]

The advantage here is not just that it's easier to type, but that it clearly and concisely describes what the programmer wants, and no more. I didn't have to tell the computer exactly how to do what I wanted, just what I wanted. And this code will be easier for me and others to understand later.

“But wait”, I hear Java programmers saying about both of my examples, “Python code isn't type-safe!” That's true. My Python list here might contain non-Things, and the code will still compile. More, it will actually work as long as f(x) succeeds for whatever it finds in lista. It is believed by some that strict type safety catches programmer mistakes. I believe (and some evidence suggests) that this is a myth. Strict typing gets in the way more than it helps. And here's an important point: if I wanted to add strict type-checking to the Python code, I could, making it look more like the Java code. So the Python code is simpler for the more generally useful case, and more complex if that's what the programmer wants.

Performance and Conclusion

As a final note, I'm sure others will point out that speed of execution is critical in many applications, and that Python may not be suitable for those. That's true, but such cases are fewer than you might think. I've used Python for graphics, sound, number crunching, database access and many other things that might seem performance-hungry, and it's up to the task. It's certainly on par with Java. If you have written a program in Python and it's not fast enough, odds are it's your algorithm, not your language. Even if it is the language, it's likely that you've saved enough time in development that you could translate your slow—but working and clearly defined—code into C and still have spent less total time than it would have taken to develop in C in the first place, and have fewer bugs.

My OneJoker library is a hybrid: the core stuff that needs to run blindingly fast is in C, and Python code links to it. This is so that I can write the code for a simulation in nice readable Python, and still do millions of hands in reasonable time. Let's say I wanted to compare the odds of starting with ace-king in a Texas Hold'en game against pocket deuces. Here's the whole program, using my library:

#!/usr/bin/env python3
import onejoker as oj

h1 = oj.Sequence(7, "Ac Kh")
h2 = oj.Sequence(7, "2s 2d")

deck = oj.Sequence(52)
deck.fill()
deck.remove(h1)
deck.remove(h2)

wins1, wins2, ties = 0, 0, 0
boards = oj.Iterator(deck, 5)

for b in boards.all():
    h1.append(b)
    v1, h = oj.poker_best5(h1)
    h1.truncate(2)

    h2.append(b)
    v2, h = oj.poker_best5(h2)
    h2.truncate(2)

    if v1 < v2:
        wins1 += 1
    elif v2 < v1:
        wins2 += 1
    else:
        ties += 1

print("{0:10d} boards".format(boards.total))
print("{0:10d} wins for {1}".format(wins1, h1))
print("{0:10d} wins for {1}".format(wins2, h2))
print("{0:10d} ties".format(ties))

Run this, and it dutifully prints:

1712304 boards
 799119 wins for (Ac Kh)
 903239 wins for (2s 2d)
   9946 ties

in about a minute and a half on my old laptop. And this is actually a pretty bad case for my library: I have run other simulations that complete billions of hands in minutes. If I wanted an approximate answer faster instead of waiting for the exact one, I could replace the line for b boards.all() with for b in boards.random(10000).

So I repeat my recommendation for others out there who may be looking to get into computer programming: try Python. If you want to learn C or Java after that, go ahead. But if anyone suggests you learn C++, run as fast as you can.

Why is prostitution illegal?

Many of you may have heard by now that my friend and WSOP champion Greg Raymer was arrested after setting up a date with a prostitute that turned out to be an undercover cop. But that isn't the real news story. Frankly, I think “Overweight wealthy man hires hooker” is about as newsworthy as “Teenager caught drinking beer”. Yes, they're both illegal, but I really don't have a problem with either if they're done responsibly—and they can be. What I do have a problem with, and what really should be the story here, is that the government of North Carolina thought it was a good idea to spend taxpayer money to catch two consenting adults doing something in private that they have every right to do.

We own our bodies. A woman (or man) has a right to have sex. And a right to charge for services. But for some reason, our backward society takes issue with doing those things together. It is true that other kinds of crime often surround prostitution, but prosecuting it for those reasons makes no more sense than jailing the friends and family of a criminal instead of the criminal. San Francisco recently busted a gang caught smuggling women from Asia to work as forced prostitutes, and that's great—not because they cut down on prostitution, but because they cut down on fucking slavery. I am also happy when the police grab an abusive pimp or a hooker who robs her customers. But laws against prostitution itself actually make catching these crimes harder, because it forces prostitutes and customers to work outside the law and not report such crimes.

It is no accident that prostitutes in places like Nevada and Amsterdam do not suffer from crime, disease, or abuse. Where it is legal and regulated, it is safe and actually reduces crimes like rape and spousal abuse.

If a woman freely chooses to charge money for sex, and a man freely chooses to hire her, and they do so safely, that's their own business. The man's wife might have an issue with it, or she might not. That's her business. What I'm quite sure of is that it's none of my business, and none of my government's. What is my business is how my government spends my money, and I for one don't want them wasting it on sting operations for non-crimes.

How many poker hands are there?

I've been posting a lot of philosophical geekery lately, so I'll balance that today with some math geekery. Today's math term is equivalence class. The basic idea is simple: if you have a big set of things, you can reduce it to a smaller set of things, each of which is a subset, or "class", of those things that are “equivalent” in some well-defined way.

Here's an example: how many five-card poker hands are there? Well we can pick any card from a 52-card deck, then pick a second card from the 51 remaining, and so on five times. This gives us 52×51×50×49×48 hands, or 311,875,200. Those 300 million hands include A♥4♦A♣9♥4♣ and also 4♣9♥A♣A♥4♦, so we can immediately reduce that by a factor of 120 by noting that poker rules don't care what order the cards are in. So we collect all those together and reduce our number to 2,598,960.

But we can go further still. That 2.5 million counts our two hands above (along with other combinations like 9♥4♦A♥4♣A♣) as equivalent, but it counts separately the hand A♠A♥4♥4♦9♠, which is the “same” hand in the sense of being identically valued—it is “two pair, aces and fours, nine kicker”, just like the first two. So how many poker hands are there, only counting those that are actually of different value in the game? As it turns out, only 7,462. Number one at the top of that list of 7,462 is simply “royal flush”, which accounts for four of our 2.5 million hands, and 480 of our 300 million. Number 2 is “king high straight flush”, and so on down to number 7,462 which is “no pair, 7-5-4-3-2” (which accounts for 1,020 of our 2.5 million, or 122,400 of our 300 million).

Notice a major difference between our two reduction operations: in the first case, we reduced the big set into subsets that were all the same size. Each of our 2.5 million hands contains exactly 120 of our 300 million—that's the number of different ways you can arrange 5 cards. As a consequence of this, the probability of each of those 2.5 million hands is exactly the same, just as is the probability of each of the 300 million. The 7,462 sets, however, are of different sizes. There are hundreds of times as many ways to get 7-5-4-3-2 as there are to get a royal flush, so the probability of each is different.

One common application of equivalence classes is in computer science: sometimes you need to do something to a very large set of inputs, and you can simplify and speed up the operation by reducing them to a smaller set. If you ask Google for pages about “poker”, not only would you expect it to return pages that mention “Poker” and “POKER”, but Google would save time and disk space by indexing those only once.

This can be applied to life as well. Perhaps there is a large set of things you'd like to improve about your life in some way. If you can group them by things that might have a similar cause or similar solutions, not only will you reduce the number of things to think about, you might notice that some groups are much larger than others, giving you guidance about what to focus on.

Prisoner's Dilemma

Grade schools in the US ignore philosophy as a subject. A few high schools give it brief mention (and then it's only to cover historically important people). Even in many colleges it remains elective. The result is that many important subjects in philosophy are unknown to the general public, despite the fact that they are simple and can have a great influence on our everyday lives.

I've mentioned concepts like confirmation bias and the sunk cost fallacy before. These are common mistakes all people make in reasoning that can be avoided if we learn about them. These have aspects of psychology as well as philosophy. A more purely philosophical concept everyone should understand is the Prisoner's Dilemma. A typical example goes like this:

Two suspects are arrested for a robbery. Each is questioned separately by police and told this: Our evidence against the two of you for the robbery is thin, but we can give each of you a year in jail on a lesser weapons charge. If you confess and squeal on your buddy, he'll get five years and we'll let you walk. But if you both squeal, you each get three years.

A keeps silentA confesses
B keeps silentA: 1 year
B: 1 year
A: free
B: 5 years
B confessesA: 5 years
B: free
A: 3 years
B: 3 years

Each suspect reasons like this: I can't talk to my buddy, and I have no control over what he does. If he clams up, I get a year if I do as well, but I go free if I confess. If he squeals, then I get five years if I stay silent and three if I confess. In both cases, I'm better off confessing. Both suspects reason this way, so both confess, and each gets three years in prison. But if they had both remained silent, they both would have gotten only a year. So the essence of the Prisoner's dilemma is this: reasoning separately, both parties doing what is clearly in their best interest end up with a result that is worse that what they would have gotten if they had cooperated.

Many situations in life mirror this. Take doping in sports, for example. Whether or not your opponent is doping is out of your control. If he is, you must dope to compete. If he isn't, doping won't lessen your chance of winning, so individually you are always better off doping. But if everyone in the sport is doping, the results will be roughly the same as if no one is doping, so as a group, it would be better if everyone didn't. Other situations like the tragedy of the commons can be modeled this way.

The way out of these dilemmas is to find some means to encourage or force cooperation. For example, a criminal gang might have a prior agreement—or strong social taboo—against snitching. Sports regulators might have strong rules against doping and do regular testing. It has even been suggested that one of the primary motivations for which people create governments is to have a third party to resolve such dilemmas between citizens.

Like any simplified mathematical model of complex human interaction, there is a danger of applying it to situations that don't quite match. In a recent episode of the Philosophy Bites podcast, Jeff McMahan suggests modeling aspects of the gun control debate as a prisoner's dilemma: for example, the interaction between a burglar and homeowner. Each reasons that if the opponent is armed, he is certainly safer being armed himself, and if his opponent is unarmed, being armed doesn't hurt, so it is better to be armed in each case. But collectively, they are safer if both are unarmed.

I don't think this particular argument holds water, even ignoring all the other aspects of a very complicated issue. First, the “payoffs” (that's mathematical jargon for the relative value of the results to each of the participants) are not symmetrical. In the “both unarmed” condition, the winner of the interaction is likely to be whoever is bigger, stronger, or the more experienced fighter—probably the criminal. The “both armed” condition raises the stakes and the danger for both, but is also equalizes them, so it is likely to benefit the homeowner relative to the burglar. Second, it assumes that both the criminal and the homeowner place equal value on their own safety. This is a psychological question. Perhaps the burglar is a sociopath who values violence for its own sake. It also makes the assumption that the “both unarmed” condition is something that's possible to achieve in real life.

Another way out of the prisoner's dilemma is available when situations are repeated more than once: reciprocity. When we have multiple interactions with people, we can gain a reputation for being cooperative, making others more likely to cooperate with us. Robert Axelrod's classic experiments along these lines are explored in his book The Evolution of Cooperation. Reciprocity can also explain how our moral sense (and things like our ability to recognize faces) can evolve from the essentially selfish process of natural selection.

This concept is simple, and recognizing it in real life situations can make such a difference in life that it should be taught to everyone in grade school.

Predictably Irrational

Dan Ariely's Predictably Irrational should be required reading in all English-speaking high schools. Though he is an academic of impeccable credentials (including a 2008 Ig Nobel prize), he is also an entertaining writer—a rare combination. The book details some of his important and cutting-edge research in the emerging field of behavioral economics, but its writing is accessible, clear, funny, and effective.

His examples resonate with the ordinary choices we all make in life: buying magazines, dating, vacationing. He shows us the mistakes we all make, but not in a way that is condescending or cynical. Indeed, his intent is clearly to show us how we can avoid making those mistakes even while he shows us how universal they are. His advice is not that of a college professor or a parent, but more like a best friend telling you “Wow, I just did something really stupid—don't do that.”

Indeed, its very lucidity might be a risk: you might be tempted to think “Well, of course, how obvious” after he explains some aspect of human behavior, and not realize that his discoveries were not obvious, and that they are backed up by solid experimental evidence, not just platitudes.

You can get a taste of his style from Youtube, but the details in the book are worth the time spent. While it is likely that academics will continue to cite the groundbreaking 1974 Kahneman and Tversky paper as the founding work of the field, Ariely's book is likely to be one most discussed by the rest of us, and it will serve you well to be familiar with it when related subjects come up in conversation.