Instructors

Office Hours

Class Overview

Course Description

Bayesian statistics is a collection of methods rooted in the use of Bayes' rule to update prior beliefs given observed evidence. These methods center on recovering, then using, a posterior distribution. In this class we will focus on the three major steps needed to perform Bayesian data analysis.

  1. Model construction. A user intending to implement a Bayesian analysis must specify a complete data-generating process. This often requires intensive mathematical modeling and, in responsible application, a healthy dose of model criticism. We will discuss classical methods for prior specification, the costs of mis-specification, and tools for model fitting, and we will encourage model skepticism throughout.
  2. Posterior evaluation. Posterior evaluation is typically automatic, given an appropriately selected computational tool. We will discuss sampling- and optimization-based methods for performing point estimation, interval estimation, and uncertainty quantification given a model specification. The main practical advantage of Bayesian statistics is that, after specifying a model, posterior evaluation is conceptually unified and may be performed automatically with standard tools. This remains true even for quite complicated models, so it allows a user to coherently integrate many different sources of evidence and to easily update inferences with new data.
  3. Model checking. Finally, we will discuss methods for model checking. These are essential, as posterior probabilities are only valid as model scores when the prior is correctly specified. We will cover posterior predictive checks and sensitivity testing.
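
As a rough illustration of the three steps above, here is a minimal Python sketch using a Beta-Binomial model. The model, the function name `beta_binomial_analysis`, the default Beta(1, 1) prior, and the choice of check statistic are all assumptions made for this example, not course material. It uses conjugacy for the posterior and plain Monte Carlo draws for the interval and the posterior predictive check.

```python
import random


def beta_binomial_analysis(successes, trials, prior_a=1.0, prior_b=1.0,
                           n_draws=20000, n_rep=1000, seed=0):
    """Illustrative three-step Bayesian analysis for a success probability."""
    rng = random.Random(seed)

    # Step 1: model construction (assumed for this sketch).
    #   theta ~ Beta(prior_a, prior_b)
    #   y | theta ~ Binomial(trials, theta)

    # Step 2: posterior evaluation. By conjugacy the posterior is
    # Beta(prior_a + y, prior_b + n - y).
    post_a = prior_a + successes
    post_b = prior_b + trials - successes
    post_mean = post_a / (post_a + post_b)  # point estimate

    # Approximate a central 95% credible interval from posterior draws.
    draws = [rng.betavariate(post_a, post_b) for _ in range(n_draws)]
    ordered = sorted(draws)
    lower = ordered[int(0.025 * n_draws)]
    upper = ordered[int(0.975 * n_draws) - 1]

    # Step 3: model checking with a posterior predictive check.
    # Simulate replicated datasets and compare the replicated success
    # count with the observed one.
    at_least_observed = 0
    for theta in draws[:n_rep]:
        replicated = sum(rng.random() < theta for _ in range(trials))
        if replicated >= successes:
            at_least_observed += 1
    ppc_p_value = at_least_observed / n_rep

    return {
        "posterior_mean": post_mean,
        "interval_95": (lower, upper),
        "ppc_p_value": ppc_p_value,
    }


result = beta_binomial_analysis(successes=7, trials=20)
print(result)
```

A posterior predictive p-value near 0 or 1 would suggest the model struggles to reproduce the observed count. Because the success count is the very statistic the model is fit to, this particular check is weak; it is shown only to make the workflow concrete.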

Advanced Topics: Non-parametric conjugate priors (Gaussian processes), exponential families, expectation-maximization, Hamiltonian Monte Carlo, variational inference, Bayes optimal experimental design, and Bayesian optimization. We will also discuss non-inferential applications including risk-minimization in prediction, coding, compression, and optimal search procedures for surrogate modeling.

Prerequisites

Probability, Frequentist Statistics, Multivariate Calculus, and Linear Algebra proficiency at an upper-division undergraduate level. Coding practice in Python or in R. Coding will be performed in a Jupyter notebook environment.

While we are working to make this class widely accessible, we currently require the prerequisites listed above (or their equivalent). Prerequisites will be enforced in STAT 238. It is your responsibility to know the material in the prerequisites.

Course Philosophy

This course will focus on model formulation, computational methodology, and practical model criticism. We will spend the majority of our time on the first two topics, as these are the most important skills needed to start using Bayesian data analysis. We will attempt to balance theoretical foundations, computation, and applied reasoning. The latter is essential for useful, responsible work. We will mostly avoid philosophical issues, as we believe these are better understood after a user has experience working with real problems. Philosophies of science divorced from lived practice will not be discussed. On the whole, we adopt Gelman’s “hypothetico-deductivism” over the “subjective inductivism” that maligns Bayesian analyses. We will argue that it is worth pursuing objective priors where possible and weakly informative priors where not, and we will emphasize the importance of explainability and robustness in model construction. In particular, we view complex Bayesian models primarily as mathematical modes of argumentation, and demand that those arguments be explainable, that they be accountable to predictive accuracy where possible, and that they introduce justifiable, understandable biases in inference.

Teaching Philosophy: On Struggle

My goal is to provide a welcoming, collaborative learning environment for exploration and growth. My job is to serve your development first. I will work to give you the tools to succeed. Success in a course is relative to your desired outcomes, so you should think about your learning objectives. Note: a grade is not a learning objective. Importantly, your grade is not your worth. I do not judge students’ character from their class performance. Rather, productive struggle is the best hallmark of learning. So, if you are struggling, for any reason, come talk to me. I will do my best to provide clear, achievable routes to success.

Class Procedure

Time and Location

Materials (Text)

Bayesian Data Analysis by Gelman, Carlin, Stern, Dunson, Vehtari, and Rubin. Third Edition. Available online for free and by request as a PDF (email alexstrang@berkeley.edu).

Website

Announcements will be posted on Ed. Please submit assignments to Gradescope.

Announcements

All announcements will be posted on Ed. Regular course communication will happen through Ed. If you need to contact me directly regarding an urgent or individual issue, please email me at alexstrang@berkeley.edu. To ensure that your email is seen and sorted, please start the subject line of all course emails with [STAT 238]. I check email twice a day during the week and once a day on weekends. I do not promise to answer emails after 5:00 pm, so please send your emails with appropriate lead time.

Policies

This is only a survey of the syllabus. For the full syllabus (with all legal details), please go to: Full 238 course syllabus. As policies are subject to change, this document may change during the semester.