Representing playing cards in software

There are several different ways to represent playing cards in software, each with its own benefits, drawbacks, and best application. I want to outline these, and explain why I chose the particular integer representation used in the OneJoker card library.

A standard deck from Copag, popular in casino poker rooms.

A standard deck from Copag, popular in casino poker rooms.

What is a card?

The set of cards in a standard Anglo-American deck is the cartesian product of two sets: 13 ranks and 4 “French” suits, each card having one of each. Operations on cards typically involve comparing the ranks of two cards based on an ordering dependent on the game, and comparing suits for equality. Many games also use one or more jokers, which have neither rank nor suit. Decks of cards today are manufactured with two jokers, one of which is typically printed in black only, and the other in color (or distinguished in some similar way). Games that distinguish between these often call them “red” and “black” by analogy to suited cards.

Whatever representation is used, it is useful to be able to get the rank or suit of a card as a small integer that can be used to index lookup tables and compute sums. Direct comparison of ranks and identifying sequences are also common, but ordering varies from one game to another, so this should be done carefully. If an application is designed for one game, choosing a representation suited to that game will be handy. For example, poker applications should choose a rank that gives the lowest number to deuces and the highest number to aces so that ranks can compare directly.

Text

While interactive games will display cards to the user as graphical images and accept input from a mouse, other applications that use playing cards must at some point acquire input and produce output as text for humans. A common and effective method is to use one digit or letter for the card's rank and another for the suit: 2c, 9h, Qd, As, etc. The letter T is usually used for tens to keep these strings uniform. This is a good way to save card information in text files, to communicate them over network protocols, and so on. It is common to use JK to represent the joker. It is not common at all to distinguish between the red and black jokers, though some games require it. I recommend using JR for the red joker when the distinction matters, and JK for the black (or when the distinction doesn't matter).

In the spirit of the networking axiom “Be conservative in what you produce, liberal in what you accept”, I recommend that cards be consistently written in this two-character format, uppercase rank and lowercase suit, with a space between cards when representing a list or set. When reading such a list, one can be more liberal by accepting case differences, extra whitespace, no whitespace, or even 10 if uniformity is not required. If such text is for human consumption only (such as running text on a web page or printed book not likely to ever be read by a program), one might use the Unicode suit symbols (♣ ♦ ♥ ♠) as well as red and black text, but these are awkward for use in 8-bit data formats.

Using such a text representation of cards internally for code that runs a game or simulation is always a bad idea. There is no programming language or application I know of for which such an internal representation does not lead to loss of performance and excessive memory usage. Converting other representations to strings for output is always trivial and fast. Converting from input strings may be a tiny bit harder, but it is still simple, and even programs using string representations will have these same complications dealing with irregular inputs and such. So it is always better to represent cards internally with a different representation and convert them for input and output as needed.

Objects

In object-oriented languages, using objects to represent cards is reasonably efficient for most uses. Operations on cards often involve comparing ranks and suits separately, so the card object should have two member variables for rank and suit. Rank should be an integer or an integer-like class (such as an enumeration class) that can do ordered comparisons. Suits are generally only compared for equality, so they can be integers, enumerations, or pointers to one of four static suit objects. Identifying a card object as a joker can be done with an additional flag, or else it can be assigned a unique rank.

Such a representation is fairly compact, so it will not cause excessive memory use. It should be pointed out, though, that even object-oriented languages typically have efficient “primitive” types like integers, and so it might make sense for some applications to forgo objects in favor of one of the integer representations below for extra performance. One might still have a card class with static functions that operate on these integers for clarity. A good example is the Pokerstove application in C++ which uses Card, Rank, and Suit objects for I/O and some functions, but computes different internal representations from them when needed for performance.

If you want to keep extra information in the card object, you can avoid the cost of copying larger objects by keeping a single collection of 52 static card objects and using pointers to these as the cards that get manipulated at runtime.

Bitmaps

If the card games being simulated involve sets of cards with no duplicates, and for which the order of cards in a set is not important, one can represent a set of cards as a single 64-bit integer in which each bit indicates the presence or absence of one particular card in the set. In addition to being the most compact representation for sets, this can speed up many complex calculations. If the bit positions are chosen so that each 16-bit subset of the value repesents one suit, and 13 of each of those 16 bits is the rank, then the 16-bit sub-integers can be used directly for comparisons as well, speeding up calculations further.

As noted, this does not preserve the order of cards, so if you want to do something like shuffle a deck, you'll have to represent the deck as an array of these masks, with each member having one bit set, and then OR them together into a hand as they are dealt. This may be slower than dealing with arrays of machine-size integers. This representation also makes using lookup tables indexed by rank or suit difficult. Also, since no duplciates are allowed, this method cannot be used for games that require duplicate cards such as Pinochle and Canasta.

This representation is most useful for single-purpose applications doing very complex calculations on fixed-sized sets of cards. The venerable pokersource library uses bitmaps to evaluate poker hands, and it is quite effective.

Bitfields

Because the typical 32-bit integer size of most machines is much larger than necessary to identify a card, we can use groups of bits within an integer to store information about the card. Specifically, two bits for suit, four for rank, and the rest for flags or anything else the application might need. This is similar to treating an integer as an object with member variables stored in a very compact way. The well-known Suffecool/Senzee poker hand evaluator uses this method to store along with each card one of 13 prime numbers used in its calculations.

This gives us some of the advantages of the object representation while being more compact. This speeds up applications that need to move and copy many cards from place to place, such as blackjack simulations. A blackjack simulation might use 4 bits to store the 1 to 10 numerical value of a card to avoid some branching in the innermost loop that computes a hand value (though you'll still need to deal with aces specially). Getting ranks and suits out of our numbers requires only fast bit-masking operations to get numbers suitable for indexing lookup tables.

Integers

Finally, there is what is probably the simplest representation of all, but no less powerful if done correctly: simply assigning a small integer value to each card. One can see software in which cards are ordered the way they are when you open a typical new deck of cards, which is Ac, 2c, 3c, ... Kc, Ad, 2d, ... Qs, Ks. This is a bad idea for two reasons. First, getting a numerical rank and suit from a number requires an expensive division by 13, and even after that aces will usually have to be special-cased to move them to their usual high rank.

Better is to order the cards in the standard poker “high card by suit” ordering, which is 2c, 2d, 2h, 2s, 3c, 3d, ... Ks, Ac, Ad, Ah, As. This has many advantages. First, you can separate rank and suit with fast bit masking (in fact, this ordering is essentially a bitfield representation with suit as the low order bits). Also, one can often compare or sort cards by rank without even separating the ranks just by comparing the values themselves. Likewise, comparing ranges of ranks can be done by comparing ranges of values (the “10 count” cards in blackjack, for example, are the range 32 to 47).

This representation is ideal for indexing lookup tables. The values that one might store in a bitfield or object, for example, can simply be fetched from a small lookup table with almost no performance hit. Sets of cards (hands, decks, discard piles, etc.) are simply arrays of integers, for which many programming languages are highly optimized. Duplicate values are no problem, so games like Pinochle and things like 6-deck blackjack shoes need no special handling.

The OneJoker card library uses this represention with a minor change: I add one, so cards have the values 1 to 52 rather than 0 to 51 (the values 53 and 54 are used for jokers). The need for an occasional -1 is not a significant performance hit, it can often be avoided entirely by adding one element to lookup tables, and being able to use 0 as a “null” value is very handy in the C language.

While any one particular application might be faster with a different representation, this simple one is very fast for the vast majority of applications, and can be easily converted to others when needed, so it is probably ideal for a general-purpose library.