Card Bit Mask

We already have one representation of cards using bit indices but there is another representation we could use that would be easier to work with when evaluating poker hands. A long can be split into 4 × 16 bit sections, each of which can represent a suit. 16 bits is enough to represent each of the 13 ranks with a few bits left over. Poker can treat the ace as both high and low depending on the other cards so we can use the 14th bit to represent a high ace. This gives us the following layout:

|     Clubs     ||    Diamonds   ||     Hearts    ||     Spades    |
--AKQJT987654321A--AKQJT987654321A--AKQJT987654321A--AKQJT987654321A

This layout is designed to make horizontal reductions easier. What do I mean by a horizontal reduction? I'm taking inspiration from the various horizontal instructions CPUs have for the naming, such as _mm_hadd_pi16 which adds together 16 bit sections from a 64 bit number. In our case we are going to be combining the 16 bit sections with bitwise logical operators. So reducing using AND would look something like:

public static ulong HorizontalAnd16(this ulong value)
{
    value &= value >> 32;

    return value & (value >> 16);
}

What does this give us? Four of a kind requires us to have a card of the same rank in each suit. If we therefore combine the sections with AND then we will be left with one bit set which will be the rank of the four of a kind. Other logical operators will let us extract other information about the hand during evaluation.

We could of course do exactly the same using 14 bits for each suit and 8 high bits unused. However 16 bits is a short which might come in handy, it's more symmetrical, the shifts are all multiples of 16 and I've spent a lot of my years programming 8 and 16 bit machines so it just feels natural...

Creating a bit mask from our card is simple; shift 1 left by the value of rank to get the rank bit mask, then shift that by 16 × value of the suit. How do we go the other way? We could check each section in turn for a set bit and extract that. This would involve if conditions and branching though; a quicker way would be just to extract the sections, shift them into the correct place and then use the fast lookup by bit index we already have. Ignoring high aces that would look something like this:

public static ulong BitMaskToBitIndex(ulong bitMask)
{
    // Bit indices: Spades = 0 -> 12, Hearts = 13 -> 25, Diamonds = 26 -> 38, Clubs = 39 -> 51.
    // Bit masks: Spades = 0 -> 12, Hearts = 16 -> 28, Diamonds = 32 -> 44, Clubs = 48 -> 60.
    var spades = bitMask & 0x000000000000FFFF;
    var hearts = (bitMask & 0x00000000FFFF0000) >> 3;
    var diamonds = (bitMask & 0x0000FFFF00000000) >> 6;
    var clubs = (bitMask & 0xFFFF000000000000) >> 9;
    return spades | hearts | diamonds | clubs;
}

This is an example of an extract operation, sometimes called a compress. (Once again see the excellent Hacker's Delight for an in depth discussion) We extract bits from certain positions in the source value and ignore the rest. We then compress those bits into the low end of the result. Modern CPUs have a built in instruction to do this for us, _pext_u64. This takes a bit mask for the bits we want to extract and performs the extraction in a single operation. We can use .NET hardware intrinsics to do this:

internal static ulong BitMaskToBitIndex(ulong bitMask)
{
    const ulong mask = 0b0001111111111111_0001111111111111_0001111111111111_0001111111111111;
    return Bmi2.X64.ParallelBitExtract(bitMask, mask);
}

Comparing these two methods along with an implementation of the branching mentioned above using Benchmark.NET we can see no branching is better than branching and intrinsics are even better still:

|      Method |     Mean |   Error |  StdDev | Ratio |
|------------ |---------:|--------:|--------:|------:|
|   Branching | 378.3 ns | 3.13 ns | 2.93 ns |  1.00 |
| NoBranching | 352.4 ns | 0.70 ns | 0.65 ns |  0.93 |
|   Intrinsic | 328.1 ns | 1.53 ns | 1.44 ns |  0.87 |

Poker Hand Representation

When we run our poker evaluation it will work out what type of hand we have (four of a kind, straight, etc.) and give us some information about the ranks involved, e.g. if we have two four of a kinds we also need to know the rank of the four cards so we can state that four aces beats four kings. If we twiddle our bits in the right way we can reduce this information to a single number where a higher number means a better hand.

Firstly we will represent the hand type as an enum. I am going to treat a royal flush as it's own type rather than treating it as a straight flush to ace; if you're ever luckily enough to get a royal flush I think its only right you're treated specially. I am also going to include five of a kind in case I ever get around to implementing wild cards. This gives a total of 11 hand types which fits nicely into 4 bits. We will use these as the most significant bits in our number so that the hand with the highest numbered type is always greater than lower numbered type.

Next we need the rank information. This will be used to differentiate between two hands of the same type. For some hands we only need a single rank. For example with four of a kind you only need to know about the rank of the four cards to work out who wins against another four of a kind; the rank of the other card is irrelevant. For two high card hands you need to know the rank of every card in the hands as if the two high cards are the same the winner is the one with the next highest card, if they're the same the third highest and so on.

We cannot however just use a single bit mask of ranks. This would work for some hands such as high card; the numbers represented by the five bits will be in the correct order to give a higher number for the winning hand, e.g.:

Loser:  Q J 8 5 4 -> 00110010011000 ->  3,224
Winner: K Q 5 4 3 -> 01100000011100 -> 14,034

Loser:  Q J 8 5 4 -> 00110010011000 ->  3,224
Winner: Q J 9 8 4 -> 00110110001000 ->  6,610

But for other hands such as two pair, full house or three of a kind this falls down because the pair/triple should take precedence over the other cards, e.g.:

Loser:  K K 8 8 8 -> 01000010000000 -> 10,200
Winner: K K K 8 8 -> 01000010000000 -> 10,200

Loser:  9 9 9 K Q -> 01100100000000 -> 14,400
Winner: J J J 9 8 -> 00010110000000 ->  2,600

Loser:  Q Q J J K -> 01110000000000 -> 16,000
Winner: K K 3 3 2 -> 01000000000110 -> 10,006

We will therefore have two rank bit masks, primary and secondary. For our full house the primary will be the three, the secondary the two. For three of a kind the primary is the three, the secondary the other cards. And for a two pair the primary is the pairs, the secondary the other cards. Using the primary as the high bits gives us numbers in the correct order:

Loser:  K K 8 8 8 -> 00000010000000 01000000000000 ->  10,010,000
Winner: K K K 8 8 -> 01000000000000 00000010000000 -> 400,000,200

Loser:  9 9 9 K Q -> 00000100000000 01100000000000 ->  20,014,000
Winner: J J J 9 8 -> 00010000000000 00000110000000 -> 100,000,600

Loser:  Q Q J J K -> 00110000000000 01000000000000 -> 300,010,000
Winner: K K 3 3 2 -> 01000000000100 00000000000010 -> 400,200,002

There are 14 bits needed for ranks if we have both high and low ace. The secondary rank will only ever have ace high so we can represent that as 13 bits. Although the primary rank does need low aces for ace to five straights/straight flushes we could represent straights just by the high card and the comparisons would still work as expected. I have decided to use all five cards for the ranks in straights however as to extract the high card (as we will see in a later post) would require extra operations.

So in total we have 4 + 14 +13 = 31 bits. Which handily fits into an int without using the most significant bit so we will never have a negative number to worry about. (Negative numbers in two's compliment have the most significant bit set) This satisifes our goal of representing hands by a single number where a higher number means a better hand.

We now have all the tools we need to start evaluating hands in the next post.