Discussion topics and frequently asked questions

Here we collect questions on Bayesian statistics and its application to nuclear physics problems. Some of these are also asked in other Jupyter notebooks. Participants are strongly encouraged to try these questions and propose answers, and also to suggest new questions to be added.

Categories

Bayesian basics

  1. What are the best references for learning more about Bayesian statistics?

  2. Under what assumption(s) do sequential and one-step updating of Bayesian posteriors give the same answer? Demonstrate the equivalence under the appropriate conditions. (Suggestion: start with two data points.)

  3. How should I choose a prior?

  4. What is a “non-informative” prior? What is a “weakly informative” prior?

  5. What are the common or subtle pitfalls that novices to Bayesian methods fall into?

  6. Why can’t I re-use data to update a posterior?
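
As a starting point for question 2, here is a minimal numerical sketch, not a proof. It assumes a coin-flip (Bernoulli) model with a uniform prior on a grid, and it assumes the flips are conditionally independent given the heads probability, so the likelihood factorizes. The names `grid_posterior`, `data1`, and `data2` are hypothetical and chosen only for illustration.

```python
import numpy as np

def grid_posterior(prior, theta, data):
    """Update a gridded prior with Bernoulli data (1 = heads, 0 = tails).

    Assumes the flips are independent given theta, so the likelihood
    is theta**heads * (1 - theta)**tails.
    """
    heads = int(np.sum(data))
    tails = len(data) - heads
    likelihood = theta**heads * (1.0 - theta)**tails
    unnormalized = prior * likelihood
    return unnormalized / unnormalized.sum()  # normalize over the grid

theta = np.linspace(0.0, 1.0, 1001)           # grid of heads probabilities
prior = np.ones_like(theta) / theta.size      # uniform prior (an assumption)

data1 = [1, 0, 1, 1]                          # made-up first batch of flips
data2 = [0, 1, 1]                             # made-up second batch of flips

# One-step update: use all the data at once.
one_step = grid_posterior(prior, theta, data1 + data2)

# Sequential update: the posterior from data1 becomes the prior for data2.
sequential = grid_posterior(grid_posterior(prior, theta, data1), theta, data2)

print(np.allclose(one_step, sequential))
```

Passing `data1` through the update twice would count those flips twice and narrow the posterior artificially, which is one way to start thinking about question 6.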

[Return to Categories]

All about pdfs

  1. Why do we mostly work with logarithms when dealing with pdfs?

  2. Why do Gaussian distributions appear everywhere?

  3. Why is a normal (Gaussian) distribution so often a good statistical model?

  4. How is the sum of two Gaussian variables distributed? E.g., if $X \sim \mathcal{N}(\mu_X, \sigma_X^2)$ and $Y \sim \mathcal{N}(\mu_Y, \sigma_Y^2)$, then how is $X + Y$ distributed? How about $X - Y$? How about $aX + bY$, where $a$ and $b$ are constants (scalars)? How do you prove these?

  5. How is the sum of two vectors of correlated Gaussian variables distributed?

  6. When does the central limit theorem not apply?
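
A conjecture about linear combinations of Gaussian variables can be checked by sampling before attempting a proof. The sketch below assumes two independent Gaussian variables with made-up means and standard deviations, and compares the sample mean and variance of a linear combination with the values expected for independent variables (mean `a*mu_x + b*mu_y`, variance `a**2 * sigma_x**2 + b**2 * sigma_y**2`). All names and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 200_000

# Made-up parameters for two independent Gaussian variables.
mu_x, sigma_x = 1.0, 2.0
mu_y, sigma_y = -3.0, 0.5
a, b = 2.0, -1.0

x = rng.normal(mu_x, sigma_x, n_samples)
y = rng.normal(mu_y, sigma_y, n_samples)
z = a * x + b * y                      # linear combination to test

# Expected moments, valid when x and y are independent.
expected_mean = a * mu_x + b * mu_y
expected_var = a**2 * sigma_x**2 + b**2 * sigma_y**2

print("mean:", z.mean(), "expected:", expected_mean)
print("var: ", z.var(), "expected:", expected_var)
```

Matching moments do not show that the combination is itself Gaussian; a histogram of `z` against the conjectured pdf is a natural next check, and the proof is still left as the exercise.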

[Return to Categories]

Bayesian vs. Frequentist

  1. When a weather forecast says there is a 50% chance of rain today, what does that mean to a Bayesian? What does it mean to a frequentist?

  2. Do Bayesians and frequentists disagree about whether Bayes’ theorem is valid?

  3. What do Bayesian techniques offer that frequentist statistics does not?

  4. What kinds of problems are ill-suited for Bayesian or frequentist approaches?

  5. Doesn’t the use of priors make the Bayesian approach completely subjective?

  6. What is the modern view of the conflict (if any) between Bayesian and frequentist statistics?

[Return to Categories]

Parameter estimation

  1. What are the assumptions (from a Bayesian perspective) underlying the usual application of the least-squares method to fit parameters?

  2. What are some common point estimates for parameters?

  3. What is the best way to present parameter estimates?

  4. How do you identify underfitting / overfitting?

[Return to Categories]

Sampling

  1. When do we need MCMC?

  2. How is MCMC related to the Monte Carlo methods used to solve the Ising model (e.g., in a statistical physics course)?

  3. What is Metropolis-Hastings? How is it different from Metropolis?

  4. What is Gibbs sampling? When would you use it?

  5. What is Hamiltonian Monte Carlo?

  6. When should I use Metropolis-Hastings and when should I use Hamiltonian Monte Carlo?

  7. What is the best sampling software for doing MCMC? E.g., emcee, PyMC3 (or PyMC4?), PyStan, …
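
For experimenting with the questions above, a bare-bones random-walk Metropolis sampler can be written in a few lines. This is a sketch under simple assumptions: a one-dimensional target (here an unnormalized standard normal), a symmetric Gaussian proposal, and hypothetical names `log_target` and `metropolis`. It is not how emcee, PyMC, or PyStan work internally.

```python
import numpy as np

def log_target(x):
    """Log of an unnormalized standard normal pdf (toy target)."""
    return -0.5 * x**2

def metropolis(log_prob, x0, n_steps, step_size, rng):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    samples = np.empty(n_steps)
    x = x0
    log_p = log_prob(x)
    n_accepted = 0
    for i in range(n_steps):
        proposal = x + step_size * rng.normal()
        log_p_new = log_prob(proposal)
        # Accept with probability min(1, p_new / p_old), done in log space.
        if np.log(rng.uniform()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
            n_accepted += 1
        samples[i] = x              # a rejected step repeats the old value
    return samples, n_accepted / n_steps

rng = np.random.default_rng(0)
samples, acceptance = metropolis(log_target, x0=0.0, n_steps=50_000,
                                 step_size=2.0, rng=rng)
kept = samples[5_000:]              # discard an (arbitrary) burn-in period

print("acceptance fraction:", acceptance)
print("sample mean:", kept.mean(), "sample std:", kept.std())
```

Because the proposal is symmetric, the acceptance test needs only the ratio of target densities; Metropolis-Hastings generalizes this by adding the ratio of proposal densities for asymmetric proposals.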

[Return to Categories]

Model selection

  1. What is Bayesian model selection?

  2. Where in nuclear physics would you apply model selection?

  3. What method should I use for calculating the evidence or odds ratios?

  4. How does “PyMultiNest” compute evidences internally?

[Return to Categories]

Model checking

  1. What is Bayesian model checking?

  2. What are examples of model checking?

  3. How can model checking be used to minimize or validate the influence of priors?

[Return to Categories]

Gaussian processes

  1. Where does the name “Gaussian process” come from?

  2. What is the role of the kernel / covariance function?

  3. What properties must be fulfilled by a covariance matrix?

  4. How can you build a Gaussian emulator?
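
Questions 2 and 3 can be explored numerically. The sketch below assumes a squared-exponential (RBF) kernel with hypothetical hyperparameters `length_scale` and `variance`, builds the covariance matrix on a grid of inputs, checks that it is symmetric and positive semidefinite up to round-off, and draws a few functions from the corresponding zero-mean Gaussian process prior.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two sets of 1D inputs."""
    diff = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

x = np.linspace(0.0, 5.0, 50)
K = rbf_kernel(x, x)

# A valid covariance matrix is symmetric and positive semidefinite.
is_symmetric = np.allclose(K, K.T)
eigenvalues = np.linalg.eigvalsh(K)
is_psd = eigenvalues.min() > -1e-8    # allow for floating-point round-off

# Draw three functions from the zero-mean GP prior; the small "jitter"
# on the diagonal is a common numerical safeguard, not part of the model.
rng = np.random.default_rng(1)
jitter = 1e-8 * np.eye(x.size)
draws = rng.multivariate_normal(np.zeros(x.size), K + jitter, size=3)

print(is_symmetric, is_psd, draws.shape)
```

Changing `length_scale` and plotting `draws` against `x` gives a quick feel for how the kernel controls the smoothness of the sampled functions.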

[Return to Categories]

Bayesian machine learning

  1. Does machine learning as commonly practiced use Bayesian ideas?

  2. What is a Bayesian neural network?

  3. What is the main challenge when making Bayesian output predictions using a neural network?

[Return to Categories]

Physicist-friendly references:

Standard statistics references

Good blogs on statistics and/or machine learning (see also the list from Andrew Gelman’s blog):

Machine learning

GitHub repositories

[Return to Categories]