What is Bayesian Statistics?
Bayesian Statistics = Probability Theory
This is the central message of the book “Probability Theory: The Logic of Science” by E. T. Jaynes, which is arguably the most important book on Bayesian statistics.
How does Bayesian Statistics work?
Probability works in the following way. We are interested in knowing whether a certain proposition is true, and we do not have access to full information that would allow us to conclusively determine whether the proposition is true or not. Probability theory allows us to determine a number between 0 and 1 representing how likely it is that the proposition is true based on the available information. This is achieved by the following two steps:
Step One: The available information that we either possess or that we assume for the sake of argument is converted into numerical assignments for the probabilities of certain basic or elementary propositions. This step is often referred to as the modeling step.
Step Two: Based on the probability model, we calculate probabilities of the propositions of interest using the rules of probability.
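In the context of Bayesian statistics, the unknown proposition is usually written in terms of a variable $\theta$, e.g., $\theta \in A$ for some set $A$. The goal is to calculate the probability:
$$P(\theta \in A \mid \text{data}, \text{background information}) \tag{1}$$
The modeling step involves specification of:
$$P(\text{data} \mid \theta, \text{background information}) \tag{2}$$
as well as
$$P(\theta \mid \text{background information}) \tag{3}$$
It is okay to simply think of all these as just probabilities. However, it is common to use the following terminology: (3) is called the prior model (or prior distribution), (2) is called the likelihood, and (1) is called the posterior probability.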
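The two steps can be sketched in a few lines of Python for a parameter that takes finitely many values. This is only an illustrative sketch: the function name `posterior`, the three candidate coin biases, the uniform prior, and the single observed head are all hypothetical choices made for this example and do not come from the text.

```python
from fractions import Fraction

def posterior(prior, likelihood):
    """Combine a prior and a likelihood over a finite set of parameter values.

    prior[t]      plays the role of P(theta = t | background information)
    likelihood[t] plays the role of P(data | theta = t, background information)
    """
    # Product rule: joint probability of (theta = t) and the data.
    joint = {t: prior[t] * likelihood[t] for t in prior}
    # Sum rule: total probability of the data over all parameter values.
    evidence = sum(joint.values())
    # Bayes rule: posterior probability of each parameter value.
    return {t: joint[t] / evidence for t in joint}

# Step One (modeling), with hypothetical numbers:
# theta is a coin's probability of heads, restricted to three values.
values = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = {t: Fraction(1, 3) for t in values}   # uniform prior
likelihood = {t: t for t in values}           # data = one observed head

# Step Two (calculation): posterior, then P(theta in A | data) for A = {theta >= 1/2}.
post = posterior(prior, likelihood)
prob_A = sum(p for t, p in post.items() if t >= Fraction(1, 2))
print(post)
print(prob_A)
```

Exact fractions are used so that the posterior probabilities sum to exactly 1; floating-point numbers would work equally well for practical purposes.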
Rules of Probability
Probabilities are assigned to propositions (also known as events). Every probability is conditional on some information (this could be available information or some information that we assume for the sake of argument). We shall denote the probability of a proposition $A$ conditioned on some information $I$ by $P(A \mid I)$. When the information $I$ is clear from context, we sometimes omit it and write the probability as simply $P(A)$. Even when we do this, it should be kept in mind that probabilities are always conditioned on some information.
The probability of a proposition always lies between 0 and 1. The probability of an impossible proposition is 0 and the probability of a certain proposition is 1.
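Product Rule: $P(AB \mid I) = P(A \mid I)\, P(B \mid A, I)$. Here $AB$ is the proposition: “both $A$ and $B$ are true”. Also $P(B \mid A, I)$ is the probability of $B$ conditioned on the truth of the proposition $A$ as well as the information $I$. A direct consequence of the product rule (applied to $AB$ in both orders) is:
$$P(B \mid A, I) = \frac{P(A \mid B, I)\, P(B \mid I)}{P(A \mid I)}$$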
The above formula is known as the Bayes rule.
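Sum Rule: If a proposition $A$ is broken down into disjoint propositions $A_1, \dots, A_k$, then
$$P(A \mid I) = \sum_{i=1}^{k} P(A_i \mid I)$$
Disjoint here means that no two of the $A_i$s can happen simultaneously. For the sum rule, we need
$$A = A_1 \text{ or } A_2 \text{ or } \dots \text{ or } A_k$$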
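We shall see some justification for these rules later. All other rules of probability follow as a consequence of these rules. For example, combining the product and sum rules, the Bayes rule states that:
$$P(B \mid A, I) = \frac{P(A \mid B, I)\, P(B \mid I)}{P(A \mid B, I)\, P(B \mid I) + P(A \mid \bar{B}, I)\, P(\bar{B} \mid I)}$$
where $\bar{B}$ denotes the proposition “$B$ is false”.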
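As a sanity check, the product rule, the sum rule, and the Bayes rule (including the form whose denominator is expanded with the sum rule) can be verified numerically on a small joint probability table. The table below is hypothetical and chosen only for illustration. Since the conditional probabilities are computed from the table as ratios, this is a consistency illustration and not a justification of the rules.

```python
from fractions import Fraction

# Hypothetical joint probabilities P(A = a, B = b | I) for two binary propositions.
joint = {
    (True, True): Fraction(3, 10),
    (True, False): Fraction(1, 10),
    (False, True): Fraction(2, 10),
    (False, False): Fraction(4, 10),
}

def p_A(a):
    # Sum rule: "A = a" splits into the disjoint cases B true / B false.
    return sum(p for (x, _), p in joint.items() if x == a)

def p_B(b):
    # Sum rule: "B = b" splits into the disjoint cases A true / A false.
    return sum(p for (_, y), p in joint.items() if y == b)

def p_B_given_A(b, a):
    return joint[(a, b)] / p_A(a)

def p_A_given_B(a, b):
    return joint[(a, b)] / p_B(b)

# Product rule: P(AB | I) = P(A | I) P(B | A, I)
product_rule_holds = joint[(True, True)] == p_A(True) * p_B_given_A(True, True)

# Bayes rule: P(B | A, I) = P(A | B, I) P(B | I) / P(A | I)
bayes = p_A_given_B(True, True) * p_B(True) / p_A(True)

# Bayes rule with the denominator expanded over "B true" and "B false".
numerator = p_A_given_B(True, True) * p_B(True)
expanded = numerator / (numerator + p_A_given_B(True, False) * p_B(False))

print(product_rule_holds, bayes, expanded, p_B_given_A(True, True))
```

All three expressions for the probability of $B$ given $A$ agree, as the rules require.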
Example 1: Testing and Covid
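Let $\theta$ denote the binary parameter which represents whether I truly have Covid or not ($\theta = 1$ when I have Covid and $\theta = 0$ when I don’t). Let $X$ denote the binary outcome of the Covid test so that $X = 1$ represents the positive test. We need to calculate the probability:
$$P(\theta = 1 \mid \text{test data}, \text{background information}) \tag{4}$$
where test data is simply $X = 1$, and the background information refers to things like “I have been strictly quarantining for the past 3 weeks”, “I do not have symptoms such as fever” etc.
In order to calculate the posterior probability (4), we need to introduce probability assumptions. Consider the following model ($I$ below stands for background information):
$$P(\theta = 1 \mid I) = 0.02, \qquad P(X = 1 \mid \theta = 1, I) = 0.99, \qquad P(X = 1 \mid \theta = 0, I) = 0.04$$
$P(\theta = 1 \mid I)$ represents the probability of Covid based on background information alone (this is the prior). The fact that it is low (0.02) is meaningful when I know that I have been largely isolating myself for the past few weeks.
$P(X = 1 \mid \theta = 1, I)$ (true positive rate) and $P(X = 1 \mid \theta = 0, I)$ (false positive rate) represent the likelihood.
With these probability assignments, we use the Bayes rule to compute (4) as
$$P(\theta = 1 \mid X = 1, I) = \frac{P(X = 1 \mid \theta = 1, I)\, P(\theta = 1 \mid I)}{P(X = 1 \mid \theta = 1, I)\, P(\theta = 1 \mid I) + P(X = 1 \mid \theta = 0, I)\, P(\theta = 0 \mid I)} = \frac{0.99 \times 0.02}{0.99 \times 0.02 + 0.04 \times 0.98} = \frac{0.0198}{0.059} \approx 0.3356$$
Thus, under the assumed probability model, there is a 33.56% chance that I am truly Covid positive given the positive test. Note that 0.3356 ($P(\theta = 1 \mid X = 1, I)$) is not very high even though the test has very good false positive and false negative rates. This is because $P(\theta = 1 \mid I)$ (which can be interpreted as the probability of having Covid without taking into account the test result) is very low (0.02).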
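The calculation can be reproduced with a short Python sketch. The function name `posterior_positive` is hypothetical, and the true positive rate of 0.99 and false positive rate of 0.04 used here are assumed values that reproduce the 33.56% figure when combined with the prior of 0.02.

```python
def posterior_positive(prior, tpr, fpr):
    """Probability of having Covid given a positive test, via the Bayes rule.

    prior: P(theta = 1 | I), probability of Covid before seeing the test
    tpr:   P(X = 1 | theta = 1, I), true positive rate (assumed value below)
    fpr:   P(X = 1 | theta = 0, I), false positive rate (assumed value below)
    """
    numerator = tpr * prior                    # positive test and Covid
    evidence = numerator + fpr * (1 - prior)   # positive test, with or without Covid
    return numerator / evidence

p = posterior_positive(prior=0.02, tpr=0.99, fpr=0.04)
print(round(p, 4))  # 0.3356

# The same test applied with larger priors gives larger posteriors.
for prior in (0.02, 0.1, 0.5):
    print(prior, round(posterior_positive(prior, tpr=0.99, fpr=0.04), 4))
```

The loop illustrates the point made above: with the test characteristics held fixed, the posterior probability is driven largely by the prior, so a low prior keeps the posterior moderate even after a positive result.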