In Search of... Clumping (was Re: Reply to Ken Fuchs)
by Abdul Jalib

From r.g.b.m

"RUSSELL  J. HALL"  writes:

> We believe that the more players in the game, the more the
> cards clump up.  Perhaps not for that number of players but for a fewer
> number of players.  When you go from 7 to 5 players the game should get
> worse for the players.  On the other hand 1 or 2 players vs the dealer have
> a tendency to unclump the cards.

Nrrr. The opposite is true, as I will show.

I arm myself with my realistic blackjack simulator and go In Search Of...
                        CLUMPING [insert discordant creepy music]

(Actually, I did a similar study around 1990, and I posted the results,
but I don't have a copy, so I have to do it from scratch here.  Brace
yourself, this is going to be a long one.  You can cheat by skipping
down to the line that says "*SUMMARY*".)

The game being simulated is 6 decks, S17, DOA, DAS.  My program includes
very detailed implementations of real Atlantic City casino shuffles, but
I thought the clumpers would go bonkers trying to make all sorts of
modifications to the shuffle, so I opted for a simple shuffle: unplayed
stacked on top of played, cut in two, half deck picks, single riff each
pair of picks, stack until done, random cut.  The interlace frequency
was modelled after an actual shitty dealer's drops: 66% chance 1 card,
26% chance 2 cards, 5% chance 3 cards, 2% chance 4 cards, and 1% 5 cards.
The grabs are subject to up to a +-20% error, evenly distributed.  To avoid
any disputes about "card boxing" I stuck to a single riff.  My program
puts the cards in the discard tray just like Atlantic City dealers
do.  The hands are played using proper basic strategy (sometimes without
splitting, as noted).  If clumping does not occur with this, then why
would it occur with more thorough casino shuffles?
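
As a rough illustration of the single riffle described above, here is a
minimal Python sketch. It is not the author's simulator. The clump-size odds
and the +-20% grab error come from the post; the names (`riffle`,
`grab_size`), the random choice of which side drops first, and the handling
of a packet that runs out are assumptions made for the example.

```python
import random

# Clump sizes and their odds (in percent), as quoted in the post.
DROP_SIZES = [1, 2, 3, 4, 5]
DROP_WEIGHTS = [66, 26, 5, 2, 1]

def grab_size(nominal, rng):
    """Size of one pick: the nominal size with up to +-20% uniform error."""
    return max(1, round(nominal * rng.uniform(0.8, 1.2)))

def riffle(left, right, rng):
    """Interleave two packets once, dropping small clumps from alternating sides.

    Assumptions: the side that drops first is chosen at random, and once one
    packet is empty the remainder of the other follows in its original order.
    """
    packets = [list(left), list(right)]
    side = rng.randrange(2)
    out = []
    while packets[0] or packets[1]:
        if packets[side]:
            n = rng.choices(DROP_SIZES, weights=DROP_WEIGHTS)[0]
            out.extend(packets[side][:n])
            del packets[side][:n]
        side = 1 - side
    return out
```

A half deck pick would be something like `grab_size(26, rng)` cards from each
half of the shoe, with each pair of picks passed through `riffle` once.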

The simulation is of 100,000 shoes, and in each shoe each of the first 169
cards is compared with its successor, for 16,900,000 trials in all.
(There's a method to this madness of 169, but mostly it was unnecessary.)
For a randomly ordered shoe, the expected distribution of those trials can
be computed mathematically.  I show this theoretical frequency distribution
of previous card (column) to next card (row) below:

                FREQUENCY DISTRIBUTION OF PREVIOUS CARD TO NEXT
            (Rows are the next card, A,2,3...T, from top to bottom)
           theoretical expected frequencies from math not simulation
================================PREVIOUS CARD==================================
   A       2       3       4       5       6       7       8       9       T
------- ------- ------- ------- ------- ------- ------- ------- ------- -------
 96141  100322  100322  100322  100322  100322  100322  100322  100322  401286
100322   96141  100322  100322  100322  100322  100322  100322  100322  401286
100322  100322   96141  100322  100322  100322  100322  100322  100322  401286
100322  100322  100322   96141  100322  100322  100322  100322  100322  401286
100322  100322  100322  100322   96141  100322  100322  100322  100322  401286
100322  100322  100322  100322  100322   96141  100322  100322  100322  401286
100322  100322  100322  100322  100322  100322   96141  100322  100322  401286
100322  100322  100322  100322  100322  100322  100322   96141  100322  401286
100322  100322  100322  100322  100322  100322  100322  100322   96141  401286
401286  401286  401286  401286  401286  401286  401286  401286  401286 1588424
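
The four distinct values in this table can be reproduced with a few lines of
Python, assuming a uniformly random 312-card shoe. This is a sketch of the
arithmetic, not the author's code; `copies` and `expected_count` are names
chosen here.

```python
from fractions import Fraction

DECKS = 6
SHOE = 52 * DECKS            # 312 cards
TRIALS = 16_900_000          # 100,000 shoes x 169 adjacent pairs

def copies(rank):
    """Cards of a rank in the shoe; 'T' covers 10, J, Q and K."""
    return (16 if rank == "T" else 4) * DECKS

def expected_count(prev, nxt):
    """Expected number of trials in which `nxt` directly follows `prev`."""
    # One copy of the next rank is already used up if it equals the previous card.
    remaining = copies(nxt) - (1 if nxt == prev else 0)
    p = Fraction(copies(prev), SHOE) * Fraction(remaining, SHOE - 1)
    return round(TRIALS * p)

for prev, nxt in [("A", "A"), ("A", "2"), ("T", "A"), ("T", "T")]:
    print(prev, nxt, expected_count(prev, nxt))
```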

For the sims, the 3 standard deviation confidence intervals on the figures are:
    For 96141:  [95214, 97068]
    For 100322: [99375,101269]
    For 401286: [399408,403164]
    For 1588424:[1584808,1592040]
For example, the bottom right hand figure of the table, 1588424, is the
expected number of times a 10 follows a 10 in 16,900,000 trials, and in a
simulation with a random shuffle it's not likely to occur fewer than
1584808 times or more than 1592040 times.  We'll use these confidence
bounds to judge whether the simulations are exhibiting anything out of
the norm.
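
The post does not say how the intervals were computed. One plausible reading,
sketched below, treats each cell as a binomial count over the 16,900,000
trials and takes 3 standard deviations either side of the expected value.
Under that assumption the first three intervals come out within a count or
two of the figures above, and the ten-after-ten interval within about 20.
`three_sd_bounds` is a name chosen for this example.

```python
import math

TRIALS = 16_900_000

def three_sd_bounds(expected, trials=TRIALS):
    """Return (low, high) = expected -/+ 3 binomial standard deviations."""
    p = expected / trials
    sd = math.sqrt(trials * p * (1 - p))
    return expected - 3 * sd, expected + 3 * sd

for expected in (96141, 100322, 401286, 1588424):
    low, high = three_sd_bounds(expected)
    print(expected, round(low), round(high))
```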

Next I'll present the control case: 7 hands, no splitting, and a truly
random shuffle, which serves as a crosscheck on the theoretical values...

                FREQUENCY DISTRIBUTION OF PREVIOUS CARD TO NEXT
            (Rows are the next card, A,2,3...T, from top to bottom)
                7 hands, no splitting, random shuffle (control)
================================PREVIOUS CARD==================================
   A       2       3       4       5       6       7       8       9       T
------- ------- ------- ------- ------- ------- ------- ------- ------- -------
 96488  100400  100316  100294  100600  100617  100601  100044  100573  401561
100103   95999  100660   99633  100922  100333  100139  100021  100764  401423
100421  100777   95575  100067   99526  100491  100680  100504  100312  401374
100099  100391  100236   96338  100244  100827  100561  100124   99827  400342
100716   99963  100102  100582   96325  100725   99828  100446   99939  401405
100528  100342  100270  100160  100816   95720   99984  100364  100231  401853
100154  100007  101524> 100226   99714  100731   95913  100164  100273  401679
100225  100590  100221  100022   99652  100193  100236   96167  100370  401483
100481   99855  100419  100089  100303  100166  100902  100229   95677  401666
402240  401665  400569  401764  401816  400378  401646  401059  401803 1587223

Interpretation: This is just a "control" simulation - I used a random
shuffle, even though I already know what the results should be from theory.
One result falls outside the 3 standard deviation confidence interval.
I've indicated it with a greater than sign (">"), meaning it's above the
upper bound.  When you see the less than signs ("<") later, you can
guess what they mean.  One outlier is not totally unexpected, as there are
100 entries in the table, and there is about a 1 in 370 chance (0.27%) of
any one entry being outside 3 standard deviation bounds strictly by random
fluctuation.
Anyway, the random shuffle appears pretty random, which is all I wanted
to verify.
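
The ">" and "<" flags used in the tables amount to a simple lookup against
the four intervals listed earlier. A minimal sketch follows; the names
`BOUNDS`, `cell_kind` and `marker` are invented for the example and are not
from the author's program.

```python
# 3 standard deviation intervals quoted in the post, keyed by cell type.
BOUNDS = {
    "same rank": (95214, 97068),         # e.g. ace after ace
    "different ranks": (99375, 101269),  # e.g. deuce after ace
    "one ten": (399408, 403164),         # ten before or after a non-ten
    "two tens": (1584808, 1592040),      # ten after ten
}

def cell_kind(prev, nxt):
    """Classify a (previous card, next card) cell; 'T' is any ten-valued card."""
    tens = (prev == "T") + (nxt == "T")
    if tens == 2:
        return "two tens"
    if tens == 1:
        return "one ten"
    return "same rank" if prev == nxt else "different ranks"

def marker(prev, nxt, observed):
    """Return '>' above the upper bound, '<' below the lower bound, else ''."""
    low, high = BOUNDS[cell_kind(prev, nxt)]
    if observed > high:
        return ">"
    if observed < low:
        return "<"
    return ""

# The one flagged control entry: a 7 following a 3, observed 101524 times.
print(repr(marker("3", "7", 101524)))
```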

Now let's see what happens if we use a realistic shuffle...

                FREQUENCY DISTRIBUTION OF PREVIOUS CARD TO NEXT
            (Rows are the next card, A,2,3...T, from top to bottom)
                     7 hands, no splitting, realistic shuffle
================================PREVIOUS CARD==================================
   A       2       3       4       5       6       7       8       9       T
------- ------- ------- ------- ------- ------- ------- ------- ------- -------
 97869> 100280  100185   99945   99900   99757  100521   99980  100371  400399
 99667   97643> 100847  101419> 101306> 101113  101236  100652  100303  398742<
100299  100680   96271  100690  101248  100596  100598  100610  100203  399898
 99848  100745  100932   97056  100424  100224  100883  100514  100311  400604
 99578  100396  100969  101247   96496  100457  100700  101181  100648  400132
 99529  100371  100238  100202  100317   95867  100330  100485  100335  400506
100580  100551  100683  100591  100570   99969   96702  100555  100165  401565
100279  100893  100671  100311  100329  100407  100196   96083  100124  401948
 99965  101144  100595  100157  100078  100372  100569  100289   95658  400698
401603  400405  399824  399689  401176  399613  400116  400841  401333 1587950

Interpretation: With 7 players, no splitting, and a realistic shuffle,
small cards have a slight tendency to clump with each other.  Aces tend
to follow aces, and deuces tend to follow deuces, fours, and fives.

Now let's see what happens if we allow splitting...

                FREQUENCY DISTRIBUTION OF PREVIOUS CARD TO NEXT
            (Rows are the next card, A,2,3...T, from top to bottom)
                 7 hands, realistic shuffle, splitting allowed
================================PREVIOUS CARD==================================
   A       2       3       4       5       6       7       8       9       T
------- ------- ------- ------- ------- ------- ------- ------- ------- -------
 88899< 101521> 101363> 100931  100398  100417  100990  102048> 100837  402606
100736   91932< 102310> 101434> 101258  101218  101959> 101838> 100863  398538<
101111  101793>  91535< 101136  101119  101368> 101387> 101420> 100931  400498
100231  101730> 101152   94828< 100515  100429  101459> 101016  100548  399079<
 99960  101043  101245  100035   96142  100191  101023  100805  100286  399613
100273  100592  100997  100308  100787   92974< 100627  101903> 100677  400945
101266  101291> 101190  101175  100972  101113   91008< 101468> 101479> 401813
102018> 101058  101185  101190  101456> 101369> 101612>  86059< 101785> 402378
101128  100490  100289  101040  100252  100920  101119  101428>  90735< 402118
404498> 400587  401115  399084< 397442< 400217  401354  402250  401374 1583868<

Interpretation: If we allow splitting, then the results change fairly
dramatically.  Now aces *hate* aces, and deuces *hate* deuces, and
similarly for all the other split cards.  Note that 5's, which basic
strategy never splits, get along just fine with each other.  The sudden
profusion of values over their max bounds must be a side effect of the
split cards' distaste for each other.  For example, if an 8 tends not to
follow an 8, then every other card is more likely to follow an 8 than it
would be with a random shuffle.  The anticlumping of 10's is probably also
a side effect, since sequences like 88TT tend to be less likely than
sequences like 8T8T - runs of tens get broken up in the haste of split
cards to get far away from each other.  In case it's not obvious, the
split cards don't really hate each other - it's just that the shuffle is
too weak to bring back together cards that have been split apart and had
a few cards inserted in between.

Now let's see what happens when we drop from seven hands to one hand...

                FREQUENCY DISTRIBUTION OF PREVIOUS CARD TO NEXT
            (Rows are the next card, A,2,3...T, from top to bottom)
                   1 hand, splitting allowed, realistic shuffle
================================PREVIOUS CARD==================================
   A       2       3       4       5       6       7       8       9       T
------- ------- ------- ------- ------- ------- ------- ------- ------- -------
 90751< 101179> 100944   99986  100937  101031  101019  101883> 100474  402763
102033>  95692  103085> 102691> 101271> 102070> 100564  100163   99654  393693<
101527> 103890>  94247< 101760> 101464> 101149  100342   99589   99399  396220<
101218  102470> 102079>  96545  101349> 101497> 100198  100342   99623  395246<
100945  102176> 102079> 101784>  96266  100478  100363   99957   99666  395885<
100567  102148> 101794> 101079  100603   93924< 100539  100145  100400  397574<
101247  100103   99690   99919   99686  100242   92658< 101676> 101131  403862>
100620   99230< 100032   99491  100369  100096  101290>  90260< 101274> 407644>
100807   99190<  99041< 100411   99591   99783  100855  101735>  93129< 405038>
401266  394761< 396652< 396605< 398235< 398799< 402399  404454> 404920>1601400>

Interpretation: when dropping to heads up, still with splitting and realistic
pick-up and shuffling procedures, split cards still anti-clump and small 
nonidentical cards clump more. The big difference is that now big (8,9,T)
cards clump (except that 8's anticlump and 9's anticlump within themselves).
The largest effect is the 8 and ace anticlumping, which is about equivalent
to the effect of an ace/8 or two being removed from a six deck shoe when
the previous card was an ace/8.  The ten clumping effect is the equivalent
of an extra ten being in the shoe when the previous card was a ten.  This
could be used as a tie breaker for close strategy decisions, but no extreme
strategy or betting changes would be warranted.
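
The "cards removed" comparison can be checked with back-of-envelope
arithmetic. The sketch below assumes the previous card shows up at its
expected rate (24/312 of trials for a non-ten rank, 96/312 for tens) rather
than summing the observed column, and converts an observed count into an
effective number of matching cards among the other 311. `effective_copies`
is a name made up here.

```python
SHOE = 312
TRIALS = 16_900_000

def effective_copies(observed_pairs, copies_of_prev):
    """Matching cards 'seen' among the other 311, implied by an observed count."""
    prev_trials = TRIALS * copies_of_prev / SHOE   # expected trials with that previous card
    return observed_pairs / prev_trials * (SHOE - 1)

# Heads-up, splitting allowed, realistic shuffle (diagonal entries of the table above).
print(round(effective_copies(90260, 24), 2))    # 8 after 8; a random shoe gives 23
print(round(effective_copies(90751, 24), 2))    # ace after ace; a random shoe gives 23
print(round(effective_copies(1601400, 96), 2))  # ten after ten; a random shoe gives 95
```

On these assumptions an 8 after an 8 behaves like roughly 21.6 eights left
instead of 23, an ace after an ace like roughly 21.7 aces, and a ten after a
ten like roughly 95.8 tens instead of 95.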

Now let's see what happens if we disallow splitting for the one player...

                FREQUENCY DISTRIBUTION OF PREVIOUS CARD TO NEXT
            (Rows are the next card, A,2,3...T, from top to bottom)
                   1 hand, no splitting, realistic shuffle
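================================PREVIOUS CARD==================================
   A       2       3       4       5       6       7       8       9       T
------- ------- ------- ------- ------- ------- ------- ------- ------- -------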
 96041  100423  100099  100191  100733   99929  100077  100999  100509  400881
101402>  99533> 102966> 102205> 101747> 101345>  99983  100104   99300< 392658<
100564  103723>  98063> 101878> 101700> 101387>  99683   99116<  98702< 395634<
100922  102454> 101778>  97342> 101023  101115  100843   99893   99483  395733<
100819  102755> 100867  101434>  97044  100931  100064  100052   99736  396415<
100728  101463> 101321> 101339> 100291   96395  100423   99697   99836  397927<
100450  100275  100474   99648  100295  100092   95817   99554  100810  402956
 99819   98798<  99435   99588   99767  100125  100945   96871  100629  404870>
 99826   98467<  98821<  99320<  99250< 100265  100937  100851   96595  405255>
399102< 393338< 396890< 397695< 398298< 397917< 401709  403712> 404096>1604740>

Interpretation: Clumping ahoy!  Small cards (A-6) tend to clump, as do
big cards (8,9,T).  This seems to be the most pronounced clumping case.
Even so, the clumping effect of a ten preceding a ten only just neutralizes
the effect of removal of that first ten, so there's not much exploitability
here.  Also, it costs 1% to never split, so you should not consider
refraining from splitting.

Now just for completeness, let's see the control case for 1 hand...

                FREQUENCY DISTRIBUTION OF PREVIOUS CARD TO NEXT
            (Rows are the next card, A,2,3...T, from top to bottom)
                   1 hand, no splitting, random shuffle (control)
================================PREVIOUS CARD==================================
   A       2       3       4       5       6       7       8       9       T
------- ------- ------- ------- ------- ------- ------- ------- ------- -------
 95545  100294  100775  100116   99664   99950  100721  100262  100143  400991
100501   96080  100458  100598  100191  101010  100328  100630  100526  401427
100509  100375   95680  100050  100912   99809  100739  100193  100029  401498
100548  100330  100035   96252  100353  100281  100533  100277  100275  400985
 99751  100577  100061  100417   95902  100517  100404  100349  100311  401254
100320  100050  100696  100236  100175   96113  100578   99612  100172  401382
100202  100247  100480  100520  100482  100424   95939  100176  100274  401806
100262  100734   99776  101132  100695  100024  100230   95941  100501  400540
 99743  100296  100520  100170  100834  100381  100078  100386   96038  401881
401012  402847  401293  400400  400132  400935  400838  402024  402103 1588954

Interpretation: The previous control had one value outside the bounds; this
one does not, which may allay concerns about the random number generator
and whatnot.

If you've gotten this far, you're probably overwhelmed by numbers, so
I'll try to summarize briefly.

*SUMMARY*

     % CLUMPING OF TENS

           SPLITTING 
          Yes      No
  H      ------  ------
  A 1 |  +0.817  +1.03
  N
  D 7 |  -0.287  -0.036
  S 

The above table shows you how far off the tens-follow-tens results are
for each of the four experiments.  You can infer that the fewer the hands,
the less the anti-clumping of tens (ultimately becoming clumping). Also,
if splitting is not allowed, then the anti-clumping is reduced (or the
clumping is increased), though this is a much less important factor
than the number of hands.
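
The percentages appear to be the observed ten-after-ten count relative to
the theoretical 1588424. A small sketch under that assumption
(`pct_clumping` is a name chosen here) reproduces three of the four cells.
For 7 hands without splitting it gives about -0.030 from the 1587950 in the
table rather than the -0.036 shown, and the post does not say where that
difference comes from.

```python
EXPECTED_TT = 1588424   # theoretical ten-after-ten count

def pct_clumping(observed):
    """Percent deviation of an observed ten-after-ten count from theory."""
    return 100.0 * (observed - EXPECTED_TT) / EXPECTED_TT

observed_tt = {
    ("1 hand", "splitting"): 1601400,
    ("1 hand", "no splitting"): 1604740,
    ("7 hands", "splitting"): 1583868,
    ("7 hands", "no splitting"): 1587950,
}
for case, count in observed_tt.items():
    print(case, f"{pct_clumping(count):+.3f}")
```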

Before clumpers go insane with glee, note that you've probably got
things exactly backwards, as anti-clumping is probably the rule for
multiplayer games, and in any case, the effects are very small,
only about as large as removing a single card from a six deck shoe,
which any card counter knows is worth very little.

I'm sure the clumpers will cry for simulations showing the
correlations of the previous N cards to the next M cards.  I did
such a study back around 1990, and it's a similar story - the
clumping effect can be observed, but it's minuscule in size, much
smaller than the effects I revealed above.  In any case, it looks
like someone is about to post such a study.

The simulations I've done here, and back in 1990, should have been
done by the clumpers long ago, long before they ever dreamed of trying
to exploit clumps in the casinos, long before they ever dreamed of
declaring that clumping was real and not a perceptual artifact.  By not
doing their homework, they got things pretty much exactly backwards,
and vastly overestimated the strengths of the effects and the impact
of casino shuffles on normal players and counters.

The fact that the card distributions are not random does not
invalidate card counting, because the card distributions are *almost*
random.  Simulations show that if there is any impact on player/counter
expectation of a casino shuffle compared to a random shuffle, it's below
.02% advantage, because most very long simulations show no statistically
significant difference.

-- 
Abdul Jalib wearing the hat of | May you never be tapped on the shoulder
Professional Degenerate Gambler| in the New Year.
AbdulJ_DELETE_@PosEV.com       | (Delete _DELETE_ to reply via email.)