Contingency Table

Contingency Table, Contingency Grid, Cross Tabulation, Cross Tab, Statistics Example

  • Technique
  • Setting up grid for test efficacy or risk factor
  1. Examples
    1. Test efficacy: How well does a test detect a certain condition
    2. Risk Factor: How much is a particular risk associated with a given condition
  2. Draw 2x2 grid
  3. Labels
    1. Upper Boxes: Across the top (x-axis) place the Disease State Labels
      1. Left Box: Disease present (e.g. Breast Cancer)
      2. Right Box: Disease not present (e.g. Not Breast Cancer)
    2. Left Boxes: Down the left side (y-axis) place the Test Result Labels
      1. Upper Box: Test Positive, Screened, or exposed to contributing factor
      2. Lower Box: Test Negative, Not screened, or not exposed
  1. Given
    1. Risk of Breast Cancer based on age
      1. Age 40 years old: 1 in 69
      2. Age 50 years old: 1 in 42
      3. Age 60 years old: 1 in 29
    2. Mammogram efficacy
      1. Note: We use the upper end of the Test Sensitivity and Specificity ranges for this example
      2. Test Sensitivity: 77-95%
      3. Test Specificity: 94-97%
  2. Create a hypothetical grid for patients age 40 who undergo Mammograms
    1. Generating example data
      1. Of 100,000 patients, 1449 will have Breast Cancer (1 in 69)
      2. Of the 1449 with Breast Cancer, 1376 will be detected with Mammogram (95% Test Sensitivity)
      3. Of the 98,551 without Breast Cancer, 95,594 will have a normal Mammogram (97% Specificity)
    2. Label the grid top
      1. Disease Positive (or D+): Breast Cancer positive
      2. Disease Negative (or D-): Breast Cancer negative
    3. Label the grid left
      1. Test Positive (or T+): Mammogram positive
      2. Test Negative (or T-): Mammogram negative
      3. Total patients
    4. Fill in total patients first (bottom row)
      1. Breast Cancer positive (D+): 1449
        1. At age 40, 1 in every 69 of the 100,000 patients will have Breast Cancer (100,000 / 69 = 1449)
      2. Breast Cancer negative (D-): 98,551
        1. The remainder of the 100,000 without Breast Cancer
    5. Complete the left column (D+)
      1. Top left: Mammogram Positive (or T+): 1376
        1. True positive patients represent 95% of 1449 (the Test Sensitivity)
      2. Bottom left: Mammogram Negative (or T-): 73
        1. False Negative patients are the remaining D+ patients: 1449 - 1376 = 73
    6. Complete the right column (D-)
      1. Bottom right: Mammogram Negative (or T-): 95,594
        1. True negative patients represent 97% of 98,551 (the Test Specificity)
      2. Top right: Mammogram Positive (or T+): 2957
        1. False Positive patients are the remaining D- patients: 98,551 - 95,594 = 2957
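The grid-filling steps above can be sketched in Python. This is a minimal illustration, not part of the original technique: the function name `build_grid` and its parameters are hypothetical, and cell counts are truncated to whole patients (an assumption made so that 95% of 1449 gives 1376, as in the worked example).

```python
def build_grid(population, prevalence, sensitivity, specificity):
    """Return the four cells of a 2x2 contingency grid as whole patients."""
    diseased = round(population * prevalence)  # bottom row, D+ column
    healthy = population - diseased            # bottom row, D- column
    # int() truncates toward zero, matching the worked example
    # (0.95 * 1449 = 1376.55 is recorded as 1376).
    true_pos = int(diseased * sensitivity)     # top left (D+ T+)
    false_neg = diseased - true_pos            # bottom left (D+ T-)
    true_neg = int(healthy * specificity)      # bottom right (D- T-)
    false_pos = healthy - true_neg             # top right (D- T+)
    return {"TP": true_pos, "FP": false_pos, "FN": false_neg, "TN": true_neg}


# Age 40: risk 1 in 69, Test Sensitivity 95%, Test Specificity 97%
grid = build_grid(100_000, 1 / 69, 0.95, 0.97)
print(grid)
```

Filling the column totals first and deriving the False Negative and False Positive cells by subtraction mirrors the order used in the outline and guarantees the four cells sum to the full population.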
  3. Summary of grid
    1. D+ T+: 1376 (true positives)
    2. D- T+: 2957 (False Positives)
    3. D+ T- : 73 (False Negatives)
    4. D- T- : 95,594 (true negatives)
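As a quick check on the summary, the four cells can be laid out as a grid with row and column totals. The following Python sketch is illustrative only; the variable names and column width are arbitrary choices, and the counts are copied from the summary above.

```python
# Cell counts from the summary of the grid
tp, fp, fn, tn = 1376, 2957, 73, 95594

rows = [
    ("", "D+", "D-", "Total"),
    ("T+", tp, fp, tp + fp),
    ("T-", fn, tn, fn + tn),
    ("Total", tp + fn, fp + tn, tp + fp + fn + tn),
]

# Right-align every cell in a 9-character column
table = "\n".join("".join(f"{str(cell):>9}" for cell in row) for row in rows)
print(table)
```

The bottom row should reproduce the starting figures (1449 with Breast Cancer, 98,551 without, 100,000 in total), which is a simple way to catch arithmetic slips before calculating any ratios.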
  4. Calculations
    1. Test Sensitivity (Test Recall)
      1. Sensitivity: True positives / (true positives + False Negatives)
      2. Sensitivity: 1376 / (1376 + 73) = 95%
    2. Test Specificity
      1. Specificity: True negatives / (true negatives + False Positives)
      2. Specificity: 95,594 / (95,594 + 2957) = 97%
    3. Positive Predictive Value (PPV, Test Precision)
      1. PPV: True positives / (true positives + False Positives)
      2. PPV: 1376 / (1376 + 2957) = 32%
    4. Negative Predictive Value (NPV)
      1. NPV: True negatives / (true negatives + False Negatives)
      2. NPV: 95,594 / (95,594 + 73) = 99.9%
    5. False Positive Rate (Type I error or α)
      1. α: (1 - Test Specificity)
      2. α: (1 - 0.97) = 3%
    6. False Negative Rate (Type II error or β)
      1. β: (1 - Test Sensitivity)
      2. β: (1 - 0.95) = 5%
    7. Likelihood Ratio positive (LR+)
      1. LR+: Sensitivity / (1-Specificity)
      2. LR+: 0.95 / (1 - 0.97) = 32 (high likelihood of disease if >10)
    8. Likelihood Ratio negative (LR-)
      1. LR-: (1 - Sensitivity) / (Specificity)
      2. LR-: (1 - 0.95) / 0.97 = 0.05 (low likelihood of disease if <0.1)
    9. F1 Score
      1. F1 Score is the harmonic mean of Test Precision (PPV) and Test Recall (Test Sensitivity)
      2. F1 ranges from 0 (least accurate) to 1 (most accurate)
      3. F1 = 2 * (Precision * Recall) / (Precision + Recall)
      4. F1 = 2 * (0.32 * 0.95)/(0.32 + 0.95) = 0.48
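The calculations above can be reproduced from the four grid cells. This Python sketch is a hedged illustration: the variable names are arbitrary, and it works from the unrounded cell counts, so intermediate values differ slightly from the rounded percentages used in the outline (for example, LR+ comes out near 31.6 before rounding to 32).

```python
# Cell counts from the summary of the grid
tp, fp, fn, tn = 1376, 2957, 73, 95594

sensitivity = tp / (tp + fn)            # Test Recall
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)                    # Test Precision
npv = tn / (tn + fn)
false_positive_rate = 1 - specificity   # type I error (alpha)
false_negative_rate = 1 - sensitivity   # type II error (beta)
lr_positive = sensitivity / (1 - specificity)
lr_negative = (1 - sensitivity) / specificity
f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity

print(f"Sensitivity: {sensitivity:.1%}")
print(f"Specificity: {specificity:.1%}")
print(f"PPV: {ppv:.1%}")
print(f"NPV: {npv:.1%}")
print(f"False Positive Rate: {false_positive_rate:.1%}")
print(f"False Negative Rate: {false_negative_rate:.1%}")
print(f"LR+: {lr_positive:.1f}")
print(f"LR-: {lr_negative:.2f}")
print(f"F1: {f1:.2f}")
```

Working from the raw counts rather than the rounded percentages avoids compounding rounding error when one result feeds into another, as with the F1 Score.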