Drew University
The problem of determining the distribution of the number of "hits" or "matchings" under random matching of two decks of cards has received attention from a number of authors within the last few years. In 1984 Chapman [2] considered pairings between two series of t elements each, and later [3] generalized the problem to series of u and t (R u) elements respectively. In the same paper he also considered the distribution of the mean number of correct matchings resulting from n independent trials, and gave a method, and tables, for determining the significance of any obtained mean. In 1937 Bartlett [1] considered matchings of two decks of cards, using a number of interesting moment generating functions.

In 1937 Huntington [12, 13] gave tables of probabilities for matchings between decks with the compositions (5^{5}), (4^{4}), and (3^{3}), where (s^{t}) denotes a deck consisting of s of each of t kinds of cards. More generally (s_{1}s_{2}…s_{t}) denotes s_{i} cards of the first kind, s_{2} of the second, etc. Sterne [16] has given the first four moments of the frequency distribution for the (5^{5}) case and has fitted a Pearson Type I distribution function to the distribution. Sterne obtained his results by considering the probabilities in a 5 X 5 contingency table. He also considered the 4 X 4 and 3 X 3 cases. In 1938 Greville [7] gave a table of the exact probabilities for matchings between two decks of compositions (5^{5}). Greenwood [4] obtained the variance of the distribution of hits for matchings between two decks having the respective compositions (s^{t}) and (s_{1}s_{2}…s_{t}) with s_{i}_{ }s_{1} +s_{2 }+^{…} s_{t}_{ } = st = n, and where it is not necessary that all the s's should be different from zero. Earlier Wilks [19] had considered the same problem for t = 5 and n = 25.
In a very interesting paper Olds
[15] in 1938 used permanents to express a moment generating function suitable
for the problem in question. He obtained factorial moments and the first four
ordinary moments about the mean, first for two decks with composition (4^{2}), and then for two decks of composition
(s^{t}) . In
1938 Stevens [17] considered a contingency table in connection with matchings
between two sets of n objects
each, and gave the means, variances, and covariances of the single cell entries
and various subtotals of the cell entries. Stevens [18] also gave a treatment
of the problem of matchings between two decks which was based on elementary
considerations. In 1940 Greenwood [6] gave the first four moments of the distribution
of hits between two decks of any composition whatever, generalizing the problem
which had been treated earlier by Olds [15]. Finally in 1941, Greville [8]
gave the exact distribution of hits for matchings between two decks of arbitrary
composition. He also considered the problem from the standpoint of a contingency
table, as had been done earlier by Stevens.
In 1939 Kullback [14] considered matchings between two sequences obtained by drawing at random a single element in turn from each of n urns U_{i} containing elements of r types E_{j} in the respective proportions p_{ij}. He showed that if the process of drawing were indefinitely repeated the distribution of hits would be that of a Poisson series.
The work which has been done thus far applies to the problem of matching two decks of cards. In the present paper a method is developed for obtaining the moments of the distribution of hits for matchings between three or more decks of cards of arbitrary composition.
In the present paper it will be convenient to take as the point of departure the method used by Wilks [19] in his treatment of the problem of hits occurring under random matching of two decks of 25 cards each, namely a target deck with composition (5^{5}) and a matching deck with composition _{ } . He showed that
_{ }
where, _{ }
is a suitable generating function
for obtaining the moments of the distribution.
In fact, if we define an operator _{ } as
_{ }
where _{ } , and if h denotes the number of hits, then for r = 1, 2, ... , 5,
P(h = r) = coefficient of _{ } in _{ }
And it is readily seen that
_{ }
Wilks' _{ } function involves a particular order for the target deck. If we are to generalize and obtain moments for matchings between more than two decks, it is obvious that we must devise a procedure which will, in the case of two decks, be perfectly symmetrical and not require that one deck be given a preferred status. In the case of two decks this is readily accomplished by the use of Kronecker deltas, and in the case of three or more decks by the use of obvious generalizations of these deltas.
For two decks of 25 cards each with compositions (5^{5}) we need only let
_{ }
where _{ } . Then, if
_{ }
it readily follows that
_{ }
More
generally, for two decks of
n cards each, the cards being of k types, and the decks having compositions _{ } respectively,
we let
_{ } .
The factors of 0 are in onetoone correspondence with the n events of dealing a card from each of the two decks. The values which can be assumed by the subscripts i and j are in onetoone correspondence with the k types of cards. The symbol x_{i} corresponds to the first deck, y_{i} to the second, the subscripts i and j corresponding to the different types of cards in each deck. The expansion of _{ } consists of all products which can be formed by choosing one and only one pair _{ } from each factor of _{ } as a factor of the product. In forming any term of _{ } , choosing _{ } . from any factor of _{ } corresponds to dealing a card of type _{ } from both decks, and introduces _{ } into the coefficient of the term. Choosing _{ } from any factor corresponds to dealing a card of type _{ } from the first deck, _{ } from the second. If _{ } , then, since _{ } _{ } is not introduced into the coefficient. Therefore in the coefficient of any term of _{ } , _{ } will be raised to a power, say s, which is equal to the number of factors of _{ } from which pairs _{ } . have been chosen.
The total number of ways in which the term
_{ }
can arise is equal to the number
of ways in which two decks of types_{ } respectively
can be dealt, {where _{ } and similarly for_{ } }. But
this is given by
_{ }
The coefficient of _{ } in
_{ } is
the total number of ways in which the term _{ }
can be formed subject to the restriction that pairs _{ } are chosen from s of the factors of _{ } . But this is precisely the number of ways in which the two decks can be dealt so that there will be s hits. Hence if, as above, h is the number of hits, the probability that h = s, assuming all permutations in each deck to be equally likely, is given by
_{ }
Since this is true for all values of s it follows that
_{ }
Since _{ }
we have at once
_{ }
_{ }
It is an equally straightforward matter to show that
_{ }
and that
Evidently any of the _{ } and _{ } may be zero, provided only that _{ }
The case of two decks with unequal numbers of cards m and n,(m < n), is readily handled by substituting for the smaller deck one obtained by adding nm "blank" cards  that is, cards of any type not already appearing in either deck, as indicated by Greville [8], who however obtained his results by considering a preferred order for one of the decks.
EXAMPLE 1. In the case of the decks treated by Wilks [19], n = 25, k = 5, n_{1i} = n_{2j} = 5.
Hence from ( _{ }
and from _{ }
EXAMPLE 2.
Suppose we have two decks as shown by the
scheme
Type of cards 
Total of all types 

1 
2 
3 
4 
5 

N° in deck A 
5 
7 
8 
0 
0 
20 
N° in deck B 
0 
3 
4 
6 
2 
15 
Here deck B has five fewer cards
than deck A. Hence we must presume that there are six types of cards in all,
and that the decks have the respective distributions (578000) and (034625).
We then have at once
_{ }
And
_{ }
 3. Matchings between three
decks.
Let the three decks be of types respectively, with _{ } , and consider the function
where
_{ }
and the other deltas are the usual Kronecker symbols.
Each factor of _{ } corresponds to one deal from each of the three decks. The symbols x, y, and z correspond respectively to cards in the first, second, and third decks. The subscripts i, j, k, = 1, 2, …q correspond to the types of cards  there being q distinct types.
Choosing from a factor
of corresponds
to a deal in which a card of type a is dealt from all three decks, and introduces
into the coefficient
of the corresponding term in the expansion of . Similarly, choosing
, corresponds
to a hit between the first and second decks, and introduces into the coefficient.
Similarly choosing introduces _{
} ;
_{ } introduces . Choosing _{
} corresponds to a deal with no hits, and introduces
no powers of e into
the coefficient, since all the d's are zero.
Let _{ } be
defined by :
_{ }
Then the coefficient of _{ } is the number of ways in which the cards can be dealt so as to yield precisely h_{123} triples, or hits between all three decks. Similarly the coefficient of _{ } is the number of ways in which the cards can be dealt so as to yield precisely h_{12} hits between the first and second decks, with corresponding results for the first and third (h_{13}) and second and third (h_{23}) decks.
By the same reasoning as before then, we have
_{ }
_{ }
with similar results for h_{13 } and h_{23} . And it is a straightforward matter to show that
_{ }
_{ }
_{ }
_{ }
_{ }
_{ }
with corresponding results for other moments. It is understood each summation
index takes values from 1 to q.
As before, if the decks do not all have the same total number of cards it is merely necessary to introduce one or more sets of "blank" cards. Thus we would replace decks with the compositions (57800), (03462), (00335) by hypothetical decks (5780000 ), (0346250), (0033509) and proceed as before.
EXAMPLE 3. For three decks of 25 cards, consisting of five of each of five kinds we have n = 25, _{ } .
Hence _{ }
_{ }
_{ }
_{ }
_{ }
_{ }
with similar results for _{ } _{ } and
_{ }
4.
Generalization to any number of decks.
If the moments of the distribution of hits doubles, triples, quadruples, . . . in matching any number of decks is desired, these can be obtained by using an obvious generalization of . Thus for four decks we would define _{ } not all equal, and use
_{ }
However, it is evident that as the number of decks is increased the summations
involved and the manipulation of the (generalized) K
operators rapidly become complicated.
5. Application of our
momentgenerating technique to twoway contingency tables.
The momentgenerating technique
which we have discussed has wider applications than merely to matching problems.
As an example of considerable interest we shall consider the contingency problem.
Consider the array
and also the function
_{ }
If i and j are particular values of a and b respectively, then to the ith row of the array corresponds the product _{ } , consisting of _{ } identical factors _{ } , one such factor corresponding to each of the _{ } elements in the row. To the jth column of the array corresponds the x_{j} which appears in each of the factors of _{ } . To the ijth cell of the array corresponds which appears only in the factors _{ } , and in each of these only as the coefficient of xj
The expansion of _{ } consists of all products which can be formed by taking as factors one and only one element_{ } ,' (not summed) from each factor of _{ } . But taking _{ } from one of the factors _{ } of _{ } corresponds exactly to putting an element in the ijth cell of a lattice such as . Hence every term in the expansion of _{ } corresponds to a particular distribution in such a lattice. Moreover, all terms of _{ } correspond to distributions in which the row totals are _{ } , for we must take n_{i}. elements from the product _{ } Further, those terms in which the _{ } appear in the particular product _{ } correspond to distributions in which the column totals are_{ } , since choosing _{ } elements _{ } corresponds to putting _{ } elements in the jth column and some row of the lattice.
Expanding _{ } we
obtain
where the summation is over all partitions
_{ }
It is clear that since every set of values of the _{ } subject to the partition restrictions _{ } corresponds to a particular distribution of n elements in the lattice , every particular product _{ } corresponds to such a distribution, and represents the number of ways in which it can arise. Further, the total coefficient displayed , namely _{ } , represents the total number of ways in which distributions with row totals _{ } and column totals _{ } can arise. Setting all the _{ } we readily find
_{ }
Hence the probability of any particular distribution _{ } with
fixed row totals _{ } and
fixed column totals _{ } is
:
_{ }
Consider now the result of differentiating with respect to a particular . We obtain
_{ }
where_{ } denotes summation
over indices such that _{ } . Now _{
} , but also
n_{ij} can never be less than
_{ } . For _{ } . Since the maximum value of _{ } the maximum value of _{ } .
Hence
_{ }
_{ }
Accordingly, combining all the terms of in which n_{ij} has a particular value, g, we have
_{ }
where S* denotes summation and P* multiplication with n_{ij} = g.
Since _{ } is
precisely the number of distributions _{ } for which n_{ij} = g, it follows that
_{ }
Similarly it follows that
_{ }
_{ }
where we may have (i = k)
or (i ≠ k),
and (j =
l))or (j ≠ l).
By straightforward differentiation and reduction we find that for the array
with given marginal totals_{ }
_{ }
_{ }
_{ }
_{ }
_{ }
_{ }
_{ }
_{ }
For the array :
_{ }
Hence, using the above results
we can, theoretically, find all the moments of the exact distribution Of _{
} . It is not
difficult to show that
_{ }
The value of _{ } and the variance
of _{ } . were found
by straightforward application of our methods and the results agreed with
those given by Haldane [10].
The writer is indebted to Professor Wilks for helpful criticisms and suggestions.
[1] M. S. BARTLETT, "Properties of sufficiency and statistical tests," Proc. Roy. Soc., A, Vol. 160 (1937), pp. 268282.
[2] DWIGHT CHAPMAN, "The statistics of the method of correct matchings," Amer. Jour. Psych., Vol. 46 (1934), pp. 287298.
[3] DWIGHT CHAPMAN, "The generalized problem of correct matchings," Annals of Math. Stat., Vol. 6 (1935), pp. 8595.
[4] J. A. GREENWOOD, "Variance of a general matching problem," Annals of Math. Stal., Vol. 9 (1938), pp. 5659.
[5] J. A. GREENWOOD, "Variance of the ESP call series," Jour. of Parapsychology, Vol. 2 (1938), pp. 6064.
[7] T. N. E. GREVILLE, "Exact probabilities for the matching hypothesis," Jour. of Parapsychology, Vol. 2 (1938), pp. 5559.
[8] T. N. E. GREVILLE, "The frequency distribution of a general matching problem," Annals of Math. Stat., Vol. 12 (1941), pp. 350354.
[9] J. B. S. HALDANE, "The mean and variance of Chi square, when used as a test of homogeneity, when expectations are small," Biometrika, Vol . 31 (1940).
[10] J. B. S. HALDANE, "The first six moments of Chisquare for an nfold table with n degrees of freedom when some expectations are small," Biometrika, Vol. 29 (1939), p. 389.
[11] J. B. S. HALDANE, "The exact value of the moments of the distribution of chisquare, used as a test of goodness of fit, when the expectations are small," Biometrika, Vol. 29 (1939), p. 133.
[12] E. V. HUNTINGTON, "A rating table for card matching experiments," Jour. of Parapsychology, Vol. 4 (1937), pp. 292294.
[13] E. V. HUNTINGTON, "Exact probabilities in certain cardmatching problems," Science, Vol. 86 (1937), pp. 499500.
[14] SOLOMON KULLBACK, "Note on a matching problem," Annals of Math. Stat., Vol. 10 (1939), pp. 7780.
[15] E. G. OLDS, "A momentgenerating function which is useful in solving certain matching problems," Bull. Amer. Math. Soc., Vol. 44 (1938), pp. 407413.
[16] T. E. STERNE,, "The solution of a problem in probability," Science, Vol. 86, (1937) pp. 500501.
[17] W. L. STEVENS, "Distribution of entries in a contingency table," Annals of Eugenics, Vol. 8 (1938), pp. 238244.
[18] W. L. STEVENS, "Tests of significance for extra sensory perception data," Psychological Review, Vol. 46 (1938), pp. 142150.
[19] S. S. WILKS, "Statistical aspects of experiments in telepathy," a lecture delivered to the Galois Institute of Mathematics, Long Island University, December 4, 1937.