I did laundry the other day, and as I grabbed a chunk of my newly-laundried socks to transition from the living room to my bedroom, I wondered: how many pairs of socks am I holding (on average)?

I couldn’t immediately think of the right probability distribution, and I really just wanted an answer, so I wrote a small Haskell program, below. (I plan on doing many of these sub-fifty-line programs to address tiny computational curiosities in the future, with emphesis on readability more than optimization.)

```
import Data.List (delete)
import Prelude hiding (Left,Right)
import System.Environment (getArgs)
import System.Random (random, getStdGen, RandomGen)
type Sock = (Species,Parity)
type N_Pairs = Int
data Parity = Left | Right deriving (Eq,Show)
type Species = Int
main = do
args <- getArgs
g <- getStdGen
{- Note that n is the number of initial *pairs*,
- while k is the number of picked *socks*. Therefore
- the case n == k corresponds to picking half the
- jumbled socks at the end. -}
let (n,k) = ( read (args!!0) :: Int
, read (args!!1) :: Int )
putStrLn . show . pairs . take k . jumble g $ socks n
socks :: N_Pairs -> [Sock]
socks n = [ (j,foot) | j <- [1..n]
, foot <- [Left,Right] ]
pairs :: [Sock] -> N_Pairs
pairs [] = 0
pairs (sock:pile) = if other sock `elem` pile
then succ . pairs . delete (other sock) $ pile
else pairs pile
jumble :: RandomGen g => g -> [Sock] -> [Sock]
jumble _ [] = []
jumble g pile = pick : jumble g' (delete pick pile)
where (k,g') = random g
pick = pile !! (k `mod` length pile)
other :: Sock -> Sock
other (j,foot) = case foot
of Left -> (j,Right)
Right -> (j,Left)
```

A sequence of this program’s output is represented in the image. After running this, I decided to do some “real” math and look up the appropriate distribution.

I don’t feel too bad for not knowing it; this problem (almost) obeys a *multivariate hypergeometric distribution*. You can read the linked Wikipedia article for more general details, but the result is that, if we pick *n* socks out of *N*, the likelikhood of picking **k**≡(k_{1},k_{2}, …, k_{N/2}) socks is (∏_{j} _{2}C_{kj}) / _{N}C_{n}, where _{n}C_{k} means *n* choose *k*. It might be more helpful to read that in pseudo-Haskell:

```
-- the argument ks is the vector k above,
-- where |k| is the total picked
pmf :: [Integer] -> Double
pmf ks = ( product [ 2 `choose` k | k <- ks ] )
/ ( n `choose` length ks )
```

In each expression of the probability mass function, the 2 comes from the fact that there are two socks in a pair.

For the laundry problem, though, I wasn’t particularly the interested in *which* pairs I got, but rather *how many* pairs. If we’re interested in any of the *n* picked socks being paired, then we’d actually like to ask about the subspace of all vectors **k** with *L ^{1}*(

**Z**) norm

*n*(socks picked) which have

*p*“twos” (pairs).

I’m pretty sure, **but please correct me if I’m wrong**, that for an *S* dimensional 2×2×…×2 lattice, there are ∑_{j}(*S*)_{j}, where *j* runs from ⌈*k*/2⌉ to min(*k*,*S*), points with Manhattan norm *k*, where (*x*)_{j} is to be read as a Pochhammer symbol (falling factorial). Correspondingly, for norm *k*, there are (*S*)_{⌈k/2⌉} distinct instances of ⌊*k*/2⌋ pairs for *k* picked; there are (*S*)_{⌈k/2⌉+1} for (*k*-1) picked, and so forth (with some weird effects when you get to the case *k* > *S*).

I’m not going to tackle this further analytically, but I will try plotting the modified multivariate hypergeometric distribution when I get back to my *Mathematica* and can have it do the heavy lifting.

Edit: I didn’t really mean 2×2× .. ×2, I meant, a lattice embedded in **R**^{n} with points where there are non-negative integral coordinates less than 3.

**TL;DR:** I really can’t be blamed for not knowing why I have so few matching socks.