=== Hypergeometric Distribution ===
The hypergeometric distribution describes the number of successes in a sequence of n draws without replacement from a population of N items that contains m successes in total.
Its probability mass function is:

    {\displaystyle f(x)={{{m \choose x}{{N-m} \choose {n-x}}} \over {N \choose n}}{\text{ for all }}x\in [0,n]}
  Technically, the support of the function is only x∈[max(0, n+m-N), min(m, n)]. When this range is smaller than [0,n], f(x)=0 for every x outside it, because a binomial coefficient whose lower index exceeds its upper index is zero; for example, for k>0, {\displaystyle {0 \choose k}=0}.
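The pmf and its support can be illustrated numerically. The following is a minimal sketch, not a reference implementation; the function name `hypergeom_pmf` and the parameter values n=5, m=7, N=10 are assumptions chosen only for illustration. It relies on Python's `math.comb`, which returns 0 when the lower index exceeds the upper one, matching the convention used above.

```python
from math import comb


def hypergeom_pmf(x: int, n: int, m: int, N: int) -> float:
    """P(X = x) for n draws without replacement from N items, m of them successes."""
    # math.comb raises ValueError for negative arguments, so guard those cases.
    if x < 0 or n - x < 0:
        return 0.0
    # comb(a, b) is 0 whenever b > a, so values outside the support give 0.
    return comb(m, x) * comb(N - m, n - x) / comb(N, n)


# Illustrative (assumed) parameters: 5 draws from 10 items, 7 of them successes.
n, m, N = 5, 7, 10
support = range(max(0, n + m - N), min(m, n) + 1)

for x in range(n + 1):
    print(x, x in support, hypergeom_pmf(x, n, m, N))
```

With these parameters only 3 items are failures, so at least 2 of the 5 draws must be successes, and the printed probabilities for x=0 and x=1 are zero.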


==== Probability Mass Function ====
We first check that f(x) is a valid pmf. This requires that it is non-negative everywhere and that it sums to 1. The first condition is obvious. For the second condition we start with Vandermonde's identity

    {\displaystyle \sum _{x=0}^{n}{a \choose x}{b \choose n-x}={a+b \choose n}}

Dividing both sides by the right-hand side gives

    {\displaystyle \sum _{x=0}^{n}{{a \choose x}{b \choose n-x} \over {a+b \choose n}}=1}
  Setting a=m and b=N-m, we see that the condition is satisfied.
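As a numerical sanity check of the argument above, the sketch below evaluates both sides of Vandermonde's identity and the total of the pmf using exact rational arithmetic. The helper names `vandermonde_lhs` and `pmf_total` and the values a=7, b=3, n=5 are assumptions introduced here for illustration.

```python
from fractions import Fraction
from math import comb


def vandermonde_lhs(a: int, b: int, n: int) -> int:
    """Left-hand side of Vandermonde's identity: sum over x of C(a, x) * C(b, n - x)."""
    return sum(comb(a, x) * comb(b, n - x) for x in range(n + 1))


def pmf_total(n: int, m: int, N: int) -> Fraction:
    """Exact sum of the hypergeometric pmf over x = 0..n."""
    denom = comb(N, n)
    return sum(Fraction(comb(m, x) * comb(N - m, n - x), denom) for x in range(n + 1))


# Illustrative (assumed) values; a = m and b = N - m as in the text.
a, b, n = 7, 3, 5
print(vandermonde_lhs(a, b, n), comb(a + b, n))  # the two sides of the identity
print(pmf_total(n, a, a + b))                    # total probability
```

Using `Fraction` avoids floating-point rounding, so the total can be compared with 1 exactly.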


==== Mean ====
We derive the mean as follows:

    {\displaystyle \operatorname {E} [X]=\sum _{x=0}^{n}x\cdot f(x;n,m,N)=\sum _{x=0}^{n}x\cdot {{{m \choose x}{{N-m} \choose {n-x}}} \over {N \choose n}}}

    {\displaystyle \operatorname {E} [X]=0\cdot {{{m \choose 0}{{N-m} \choose {n-0}}} \over {N \choose n}}+\sum _{x=1}^{n}x\cdot {{{m \choose x}{{N-m} \choose {n-x}}} \over {N \choose n}}}
  We use the identity {\displaystyle {\binom {a}{b}}={\frac {a}{b}}{\binom {a-1}{b-1}}} in the denominator.

    {\displaystyle \operatorname {E} [X]=0+\sum _{x=1}^{n}x\cdot {{{m \choose x}{{N-m} \choose {n-x}}} \over {{N \over n}{{N-1} \choose {n-1}}}}}

    {\displaystyle \operatorname {E} [X]={n \over N}\sum _{x=1}^{n}x\cdot {{{m \choose x}{{N-m} \choose {n-x}}} \over {{N-1} \choose {n-1}}}}
  Next we use the identity {\displaystyle b{\binom {a}{b}}=a{\binom {a-1}{b-1}}} in the first binomial of the numerator.

    {\displaystyle \operatorname {E} [X]={n \over N}\sum _{x=1}^{n}{m{{m-1 \choose x-1}{{N-m} \choose {n-x}}} \over {{N-1} \choose {n-1}}}}
  Next, for each variable inside the sum we define a corresponding primed variable that is one less: N′=N−1, m′=m−1, x′=x−1, n′=n−1. Note that N−m=N′−m′ and n−x=n′−x′, and that x running from 1 to n corresponds to x′ running from 0 to n′.

    {\displaystyle \operatorname {E} [X]={mn \over N}\sum _{x'=0}^{n'}{{{m' \choose x'}{{N'-m'} \choose {n'-x'}}} \over {{N'} \choose {n'}}}}

    {\displaystyle \operatorname {E} [X]={mn \over N}\sum _{x'=0}^{n'}f(x';n',m',N')}
  Now we see that the sum is the total sum over a Hypergeometric pmf with modified parameters. This is equal to 1. Therefore

    {\displaystyle \operatorname {E} [X]={nm \over N}}
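The closed form can be compared against the defining sum for a few parameter choices. This is only a sketch under stated assumptions: the function name `hypergeom_mean` and the parameter triples are hypothetical and chosen for illustration, and exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from math import comb


def hypergeom_mean(n: int, m: int, N: int) -> Fraction:
    """E[X] computed directly from the definition: sum over x of x * f(x; n, m, N)."""
    denom = comb(N, n)
    return sum(
        Fraction(x * comb(m, x) * comb(N - m, n - x), denom) for x in range(n + 1)
    )


# Illustrative (assumed) parameter triples (n, m, N).
for n, m, N in [(5, 7, 10), (3, 4, 20), (6, 2, 9)]:
    direct = hypergeom_mean(n, m, N)
    closed_form = Fraction(n * m, N)
    print((n, m, N), direct, closed_form, direct == closed_form)
```

The third triple has n > m, which exercises the case where some terms of the sum are zero because x exceeds m.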
  


==== Variance ====
We first determine E[X²].

    {\displaystyle \operatorname {E} [X^{2}]=\sum _{x=0}^{n}f(x;n,m,N)\cdot x^{2}=\sum _{x=0}^{n}{{{m \choose x}{{N-m} \choose {n-x}}} \over {N \choose n}}\cdot x^{2}}

    {\displaystyle \operatorname {E} [X^{2}]={{{m \choose 0}{{N-m} \choose {n-0}}} \over {N \choose n}}\cdot 0^{2}+\sum _{x=1}^{n}{{{m \choose x}{{N-m} \choose {n-x}}} \over {N \choose n}}\cdot x^{2}}

  Applying the same two identities as in the derivation of the mean, one to the denominator and one to the first binomial of the numerator (which absorbs one factor of x), gives

    {\displaystyle \operatorname {E} [X^{2}]=0+\sum _{x=1}^{n}{{m{m-1 \choose x-1}{{N-m} \choose {n-x}}} \over {{N \over n}{{N-1} \choose {n-1}}}}\cdot x}

    {\displaystyle \operatorname {E} [X^{2}]={mn \over N}\sum _{x=1}^{n}{{{m-1 \choose x-1}{{N-m} \choose {n-x}}} \over {{N-1} \choose {n-1}}}\cdot x}
  We use the same variable substitution as when deriving the mean.

    {\displaystyle \operatorname {E} [X^{2}]={mn \over N}\sum _{x'=0}^{n'}{{{m' \choose x'}{{N'-m'} \choose {n'-x'}}} \over {{N'} \choose {n'}}}(x'+1)}
  

  
    
      
    {\displaystyle \operatorname {E} [X^{2}]={mn \over N}\left[\sum _{x'=0}^{n'}{{{m' \choose x'}{{N'-m'} \choose {n'-x'}}} \over {{N'} \choose {n'}}}x'+\sum _{x'=0}^{n'}{{{m' \choose x'}{{N'-m'} \choose {n'-x'}}} \over {{N'} \choose {n'}}}\right]}
  The first sum is the expected value of a hypergeometric random variable with parameters (n', m', N'). The second sum is that random variable's pmf summed over its whole support, which equals 1.

  
    
      
    {\displaystyle \operatorname {E} [X^{2}]={mn \over N}\left[{n'm' \over N'}+1\right]}
  

  
    
      
    {\displaystyle \operatorname {E} [X^{2}]={mn \over N}\left[{(n-1)(m-1) \over (N-1)}+1\right]={mn \over N}\left[{{(n-1)(m-1)+(N-1)} \over (N-1)}\right]}
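  The closed form for the second moment can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names are illustrative, exact rational arithmetic is used to avoid rounding, and it assumes Python 3.8 or later (for math.comb) and N > 1.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, n, m, N):
    """P(X = k) for n draws without replacement, m successes among N items."""
    # math.comb returns 0 when the lower index exceeds the upper one,
    # but raises ValueError on negative arguments, so guard that case.
    if k < 0 or n - k < 0:
        return Fraction(0)
    return Fraction(comb(m, k) * comb(N - m, n - k), comb(N, n))


def second_moment_direct(n, m, N):
    """E[X^2] computed directly from the definition, summing k^2 * f(k)."""
    return sum(k * k * hypergeom_pmf(k, n, m, N) for k in range(n + 1))


def second_moment_closed(n, m, N):
    """E[X^2] from the closed form (mn/N) * [(n-1)(m-1)/(N-1) + 1]."""
    return Fraction(m * n, N) * (Fraction((n - 1) * (m - 1), N - 1) + 1)


# Example parameters chosen only for illustration: n = 5, m = 4, N = 10.
print(second_moment_direct(5, 4, 10))  # 14/3
print(second_moment_closed(5, 4, 10))  # 14/3
```

  Both routes agree for these parameters, which is what the derivation predicts.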
  We then solve for the variance, using the mean E[X] = mn/N:

  
    
      
    {\displaystyle \operatorname {Var} (X)=\operatorname {E} [X^{2}]-(\operatorname {E} [X])^{2}}
  

  
    
      
    {\displaystyle \operatorname {Var} (X)={mn \over N}\left[{{(n-1)(m-1)+(N-1)} \over (N-1)}\right]-\left({mn \over N}\right)^{2}}
  

  
    
      
    {\displaystyle \operatorname {Var} (X)={Nmn \over N^{2}}\left[{{(n-1)(m-1)+(N-1)} \over (N-1)}\right]-{(N-1)(mn)^{2} \over (N-1)N^{2}}}
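  As an intermediate step, spelled out here for clarity, both terms can be placed over the common denominator N²(N−1) and the numerator expanded:

    {\displaystyle \operatorname {Var} (X)={mn\left[N(n-1)(m-1)+N(N-1)-(N-1)mn\right] \over N^{2}(N-1)}={mn\left[N^{2}-Nn-Nm+mn\right] \over N^{2}(N-1)}}
  The bracket factors as (N−n)(N−m), which gives the closed form.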
  
  
    
      
    {\displaystyle \operatorname {Var} (X)={nm(N-n)(N-m) \over N^{2}(N-1)}}
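  As a sanity check on the final result, the variance can be computed directly from the pmf and compared with the closed form. This is a sketch under stated assumptions rather than part of the derivation: the helper names are hypothetical, the parameter triples are arbitrary examples, and Python 3.8 or later is assumed for math.comb.

```python
from fractions import Fraction
from math import comb


def hypergeom_variance_direct(n, m, N):
    """Var(X) = E[X^2] - (E[X])^2, computed by summing over k = 0..n."""
    total = comb(N, n)
    # comb(a, b) is 0 whenever b > a, so terms outside the support vanish.
    pmf = [Fraction(comb(m, k) * comb(N - m, n - k), total) for k in range(n + 1)]
    mean = sum(k * p for k, p in enumerate(pmf))
    second = sum(k * k * p for k, p in enumerate(pmf))
    return second - mean ** 2


def hypergeom_variance_closed(n, m, N):
    """Var(X) = n m (N - n)(N - m) / (N^2 (N - 1)); requires N > 1."""
    return Fraction(n * m * (N - n) * (N - m), N * N * (N - 1))


# Arbitrary (n, m, N) triples, including one where the support is not [0, n].
for n, m, N in [(5, 4, 10), (3, 7, 8), (13, 4, 52)]:
    assert hypergeom_variance_direct(n, m, N) == hypergeom_variance_closed(n, m, N)

print(hypergeom_variance_closed(5, 4, 10))  # 2/3
```

  Using Fraction keeps the comparison exact, so the equality test does not depend on a floating-point tolerance.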