Bayes Theorem

Bayes Theorem, Bayesian Statistics

  • Definitions
  1. Bayes Theorem (calculation)
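    1. P(Disease | Positive Test) = P(Positive Test | Disease) * P(Disease) / P(Positive Test)
    2. Where
      1. P(A | B) = Probability of A given B
      2. P(Positive Test | Disease) = Test Sensitivity
      3. P(Disease) = Pretest Probability of Disease in the group tested
      4. P(Positive Test) = Probability of a Positive Test (true positives plus false positives)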
  • Evaluation: Example - Probability of Disease Based on a Test
  1. Positive Test
    1. Disease Y Present in 75
    2. Disease Y NOT Present in 25
  2. Negative Test
    1. Disease Y Present in 10
    2. Disease Y NOT Present in 190
  3. Probabilities
    1. P(Positive Test | Disease) = Test Sensitivity = 75 / (75 + 10) = 0.88
    2. P(Disease) = Pretest Probability in cohort tested = (75+10)/(75+10+25+190) = 0.28
    3. P(Positive Test) = True positives and False Positives = (75 + 25)/(75+10+25+190) = 0.33
  4. Conclusion
    1. P(Disease | Pos Test) = P(Pos Test | Disease) * P(Disease) / P(Pos Test) = 0.88 * 0.28 / 0.33 = 0.75
    2. In this case, a patient from the given cohort with a Positive Test has a 75% probability of having Disease Y
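The calculation above can be reproduced directly from the four counts in the example. The following is a minimal sketch using exact fractions so that rounding of the intermediate values (0.88, 0.28, 0.33) does not affect the result. The variable names are illustrative and are not part of the original example.

```python
from fractions import Fraction

# Counts taken from the example above
true_pos = 75    # Positive Test, Disease Y present
false_pos = 25   # Positive Test, Disease Y NOT present
false_neg = 10   # Negative Test, Disease Y present
true_neg = 190   # Negative Test, Disease Y NOT present

total = true_pos + false_pos + false_neg + true_neg

# P(Positive Test | Disease) = Test Sensitivity
sensitivity = Fraction(true_pos, true_pos + false_neg)
# P(Disease) = Pretest Probability in the cohort tested
p_disease = Fraction(true_pos + false_neg, total)
# P(Positive Test) = true positives plus false positives over all tested
p_positive = Fraction(true_pos + false_pos, total)

# Bayes Theorem: P(Disease | Positive Test)
posterior = sensitivity * p_disease / p_positive

print(posterior)         # 3/4
print(float(posterior))  # 0.75
```

The exact result, 75 / (75 + 25), is the same as the Positive Predictive Value read directly from the table, which is a useful check on the arithmetic.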
  • Evaluation: Example - Probability of a disease based on a group of findings
  1. The probability of a disease given one or more findings can be calculated from:
    1. Prevalence of a Disease (and of its differential diagnosis) AND
    2. Probability of findings when the disease is present (and when other conditions on the differential diagnosis are present)
  2. Assumptions
    1. Conditional independence of findings
      1. For a given disease, the findings are assumed to occur independently of one another
      2. Example: For Acute Coronary Syndrome, Chest Pain and Shortness of Breath are treated as independent of each other
    2. Mutual exclusivity of conditions
      1. For a given presentation with specific findings, only one disease is present to explain those findings
      2. Example: The patient with Chest Pain, Tachypnea and Shortness of Breath
        1. Does NOT have both a Myocardial Infarction AND a Pulmonary Embolism
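  3. Calculation
    1. P(D | F) = Probability of Disease (D) given Findings (F) = P(D) * P(F | D) / ( P(DDx) * P(F | DDx) )
    2. Where
      1. P(D) = Probability of Disease (D)
      2. P(F | D) = Probability of Findings (F) given Disease (D)
      3. P(DDx) = Sum of probabilities of a group of Diseases including the Disease (D) of interest (Differential Diagnosis)
      4. P(F | DDx) = Probability of Findings (F) given the group of diseases (DDx)
      5. P(DDx) * P(F | DDx) = Sum of P(Di) * P(F | Di) over each Disease (Di) in the Differential Diagnosis
    3. With conditional independence, P(F | D) is the product of the probabilities of each individual finding given Disease (D)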
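As a rough illustration of the calculation above, the sketch below computes the probability of each disease in a differential diagnosis under both stated assumptions (conditional independence of findings and mutual exclusivity of conditions). The function name, the diseases "A", "B" and "C", the findings "f1" and "f2", and every number are hypothetical placeholders chosen only to show the arithmetic. They are not clinical data.

```python
from math import prod

def posterior_over_differential(priors, likelihoods, findings):
    """Return P(disease | findings) for each disease in the differential.

    priors:      {disease: P(disease)}, conditions assumed mutually exclusive
    likelihoods: {disease: {finding: P(finding | disease)}}
    findings:    findings present, assumed conditionally independent
    """
    # Numerator for each disease: P(D) * product of P(finding | D)
    joint = {
        disease: priors[disease]
        * prod(likelihoods[disease][f] for f in findings)
        for disease in priors
    }
    # Denominator: sum of the numerators across the whole differential
    total = sum(joint.values())
    return {disease: value / total for disease, value in joint.items()}

# Hypothetical placeholder values (NOT clinical data)
priors = {"A": 0.2, "B": 0.3, "C": 0.5}
likelihoods = {
    "A": {"f1": 0.9, "f2": 0.8},
    "B": {"f1": 0.5, "f2": 0.4},
    "C": {"f1": 0.1, "f2": 0.2},
}

result = posterior_over_differential(priors, likelihoods, ["f1", "f2"])
for disease, probability in result.items():
    print(disease, round(probability, 3))
```

With these placeholder inputs, disease "A" becomes the most likely diagnosis despite having the lowest prior, because both findings are far more common in "A" than in the alternatives.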
  • Evaluation: Example of Family Tree and Hemophilia
  1. Setup
    1. A healthy woman has a brother with Hemophilia (xY)
    2. Hemophilia is X-linked, and because she is unaffected, she is either xX (Hemophilia carrier) or XX (normal)
    3. She has two healthy male children without Hemophilia (each XY)
    4. What is the probability that she is XX (no Hemophilia gene)?
  2. Assumptions
    1. P(mxX) = P(mXX) = 0.5 = probability, before considering her children, that the mother is a Hemophilia carrier (xX) or normal (XX)
    2. P(cXY and cXY|mXX) = probability that both children are XY (normal) given mother is XX = 1
    3. P(cXY and cXY|mxX) = probability that both children are XY (normal) given mother is xX (Hemophilia carrier) = 0.5 * 0.5 = 0.25
  3. Bayes Formula
    1. P(A|B) = P(B|A) * P(A) / (P(B|A)*P(A) + P(B|not A)*P(not A) )
    2. P(mXX| cXY and cXY) = Probability mother has 2 normal X copies given 2 non-Hemophiliac sons
    3. P(mXX| cXY and cXY) = P(cXY and cXY|mXX) * P(mXX) / ( P(cXY and cXY|mXX) * P(mXX) + P(cXY and cXY|mxX) * P(mxX) )
    4. P(mXX| cXY and cXY) = (1 * 0.5) / ( 1 * 0.5 + 0.25 * 0.5 ) = 0.5 / 0.625 = 0.8 or 4/5
  4. References
    1. (2015) Columbia Statistical Thinking for Data Science and Analytics, EDX, accessed online 2/4/2017
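The Hemophilia calculation above can be sketched as a small function that also shows how the answer changes with the number of healthy sons. This is a minimal illustration under the assumptions listed in the example (a prior carrier probability of 0.5, and each son of a carrier having a 0.5 chance of being unaffected). The function name and parameters are hypothetical.

```python
from fractions import Fraction

def probability_noncarrier(healthy_sons, prior_carrier=Fraction(1, 2)):
    """Return P(mother is XX | all of her sons are unaffected)."""
    prior_noncarrier = 1 - prior_carrier
    # P(all sons unaffected | mother XX) = 1
    likelihood_noncarrier = Fraction(1)
    # P(all sons unaffected | mother xX) = 0.5 for each son
    likelihood_carrier = Fraction(1, 2) ** healthy_sons
    numerator = likelihood_noncarrier * prior_noncarrier
    denominator = numerator + likelihood_carrier * prior_carrier
    return numerator / denominator

print(probability_noncarrier(2))  # 4/5
print(probability_noncarrier(0))  # 1/2
print(probability_noncarrier(3))  # 8/9
```

With no sons the result is simply the prior (1/2), and each additional healthy son raises the probability that the mother is not a carrier.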
  • References
  1. Desai (2014) Clinical Decision Making, AMIA’s CIBRC Online Course
  2. Hersh (2014) Knowledge Acquisition and Use for Clinical Decision Support, AMIA’s CIBRC Online Course