
Statistical Data Analysis: Probability, Hypothesis Testing, and Chi-Square

Lecture notes on statistical data analysis, focusing on probability theory, Bayes' theorem, random variables, probability distributions, hypothesis testing, and the chi-square test. The notes cover functions of random variables, expectation values, error propagation, the Monte Carlo method, p-values, and the significance of an observed peak.

G. Cowan, Lectures on Statistical Data Analysis
Statistical Data Analysis: Lecture 7
1 Probability, Bayes’ theorem, random variables, pdfs
2 Functions of r.v.s, expectation values, error propagation
3 Catalogue of pdfs
4 The Monte Carlo method
5 Statistical tests: general concepts
6 Test statistics, multivariate methods
7 Significance tests
8 Parameter estimation, maximum likelihood
9 More maximum likelihood
10 Method of least squares
11 Interval estimation, setting limits
12 Nuisance parameters, systematic uncertainties
13 Examples of Bayesian approach
14 tba


Testing significance / goodness-of-fit

Suppose hypothesis H predicts pdf f(x|H) for a set of observations x = (x₁, …, xₙ). We observe a single point in this space: x_obs. What can we say about the validity of H in light of the data? Decide what part of the data space represents less compatibility with H than does x_obs itself. (Not unique!)

[Figure: the data space divided into a region less compatible with H and a region more compatible with H, with x_obs between them.]

p-value example: testing whether a coin is ‘fair’

Hypothesis H: the coin is fair (p = 0.5). The probability to observe n heads in N coin tosses is binomial:

  f(n; p, N) = N! / (n! (N − n)!) · pⁿ (1 − p)^(N − n)

Suppose we toss the coin N = 20 times and get n = 17 heads. The region of data space with equal or lesser compatibility with H relative to n = 17 is: n = 17, 18, 19, 20 and n = 0, 1, 2, 3. Adding up the probabilities for these values gives p = 0.0026, i.e., the probability of obtaining such a bizarre result (or one even less compatible) ‘by chance’, under the assumption of H.
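
As a quick numerical check (my addition, not part of the original slides; assumes SciPy is available), the binomial tail sum can be computed directly:

```python
from scipy.stats import binom

N, p_fair = 20, 0.5
# Values of n with equal or lesser compatibility with H than n = 17
region = [0, 1, 2, 3, 17, 18, 19, 20]
p_value = sum(binom.pmf(n, N, p_fair) for n in region)
print(p_value)  # ~0.0026
```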

The significance of an observed signal

Suppose we observe n events; these can consist of:
  n_b events from known processes (background)
  n_s events from a new process (signal)
If n_s, n_b are Poisson r.v.s with means s, b, then n = n_s + n_b is also Poisson, with mean s + b:

  f(n; s, b) = (s + b)ⁿ e^−(s + b) / n!

Suppose b = 0.5, and we observe n_obs = 5. Should we claim evidence for a new discovery? Give the p-value for the hypothesis s = 0:

  P(n ≥ 5; b = 0.5, s = 0) ≈ 1.7 × 10⁻⁴
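
As a check (my addition, assuming SciPy), this p-value is the upper tail of a Poisson distribution with mean b:

```python
from scipy.stats import poisson

b, n_obs = 0.5, 5
p_value = poisson.sf(n_obs - 1, b)  # P(n >= 5 | mean b), background only
print(p_value)  # ~1.7e-4
```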

The significance of a peak

Suppose we measure a value x for each event and plot the resulting histogram. Each observed bin count is a Poisson r.v., with means given by the prediction (dashed lines). In the two bins containing the peak, 11 entries are found with an expected background of b = 3.2. The p-value for the s = 0 hypothesis is

  P(n ≥ 11; b = 3.2) = 5.0 × 10⁻⁴
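
The same kind of one-line check (my addition, assuming SciPy) reproduces this tail probability:

```python
from scipy.stats import poisson

print(poisson.sf(10, 3.2))  # P(n >= 11 | mean 3.2) ~ 5.0e-4
```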

The significance of a peak (2)

But... did we know where to look for the peak?
  → give P(n ≥ 11) in any 2 adjacent bins
Is the observed width consistent with the expected x resolution?
  → take x window several times the expected resolution
How many bins × distributions have we looked at?
  → look at a thousand of them, you’ll find a 10⁻³ effect
Did we adjust the cuts to ‘enhance’ the peak?
  → freeze cuts, repeat analysis with new data
How about the bins to the sides of the peak... (too low!)
Should we publish????


Distribution of the p-value

The p-value is a function of the data, and is thus itself a random variable with a given distribution. Suppose the p-value of H is found from a test statistic t(x) as

  p_H = ∫_{t_obs}^∞ g(t|H) dt

In general, for continuous data and under the assumption of H, the pdf of p_H is g(p_H|H) = 1, i.e., p_H ~ Uniform[0, 1], whereas for some (broad) class of alternatives H′ the distribution g(p_H|H′) is concentrated toward zero.

[Figure: g(p_H|H) is flat on [0, 1], while g(p_H|H′) peaks toward p_H = 0.]

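A small simulation (my addition; the Gaussian test statistic is an illustrative choice, and NumPy/SciPy are assumed) shows both behaviours, along with the property P(p_H ≤ α | H) = α used in the next section:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

# H: x ~ N(0, 1); alternative H': x ~ N(1, 1); test statistic t(x) = x.
# The p-value is the upper Gaussian tail beyond the observed t.
p_H = norm.sf(rng.normal(0.0, 1.0, 100_000))     # under H
p_Halt = norm.sf(rng.normal(1.0, 1.0, 100_000))  # under H'

print(np.mean(p_H <= 0.05))     # ~0.05: p_H is Uniform[0, 1] under H
print(np.mean(p_Halt <= 0.05))  # >> 0.05: p concentrates toward zero under H'
```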

Using a p-value to define a test of H₀

We started by defining the critical region in the original data space (x), then reformulated this in terms of a scalar test statistic t(x). We can take this one step further and define the critical region of a test of H₀ with size α as the set of data space where p₀ ≤ α. So the probability to find the p-value of H₀, p₀, less than α is

  P(p₀ ≤ α | H₀) = α

Formally the p-value relates only to H₀, but the resulting test will have a given power with respect to a given alternative H₁.

Pearson’s χ² test

Pearson’s χ² statistic compares measured values n_i with hypothesized values ν_i:

  χ² = Σ_{i=1}^{N} (n_i − ν_i)² / σ_i²

If the n_i are Gaussian with means ν_i and standard deviations σ_i, i.e., n_i ~ N(ν_i, σ_i²), then Pearson’s χ² will follow the χ² pdf (written here for z = χ²):

  f(z; N) = (1 / (2^{N/2} Γ(N/2))) z^{N/2 − 1} e^{−z/2}

If the n_i are Poisson with ν_i >> 1 (in practice OK for ν_i ≳ 5), then the Poisson distribution becomes approximately Gaussian, and Pearson’s χ² statistic follows the χ² pdf here as well.

The χ² value obtained from the data then gives the p-value:

  p = ∫_{χ²}^∞ f(z; N) dz
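
In practice (my addition, assuming SciPy) this integral is the chi-square survival function:

```python
from scipy.stats import chi2

chi2_obs, N = 29.8, 20  # e.g. the values from the example below
print(chi2.sf(chi2_obs, N))  # ~0.073
```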

The ‘χ² per degree of freedom’

Recall that for the chi-square pdf with N degrees of freedom,

  E[z] = N,  V[z] = 2N

This makes sense: if the hypothesized ν_i are right, the rms deviation of n_i from ν_i is σ_i, so each term in the sum contributes ~ 1.

One often sees χ²/N reported as a measure of goodness-of-fit. But it is better to give χ² and N separately. Consider, e.g., χ² = 15 for N = 10, which gives p = 0.13, versus χ² = 150 for N = 100, which gives p ≈ 9.0 × 10⁻⁴, even though χ²/N = 1.5 in both cases. That is, for large N, even a χ² per dof only a bit greater than one can imply a small p-value, i.e., poor goodness-of-fit.
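
Both p-values in this example can be verified directly (my addition, assuming SciPy):

```python
from scipy.stats import chi2

print(chi2.sf(15, 10))    # ~0.13:   chi2/N = 1.5 with N = 10
print(chi2.sf(150, 100))  # ~9.0e-4: chi2/N = 1.5 with N = 100
```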

Example of a χ² test

[Figure: histogram of x with the hypothesized pdf superimposed.]

This gives χ² = 29.8 for N = 20 dof. We now need to find the p-value, but... many bins have few (or no) entries, so here we do not expect χ² to follow the chi-square pdf.

Using MC to find distribution of the χ² statistic

The Pearson χ² statistic still reflects the level of agreement between data and prediction, i.e., it is still a ‘valid’ test statistic. To find its sampling distribution, simulate the data with a Monte Carlo program. Here the data sample is simulated 10⁶ times. The fraction of times we find χ² ≥ 29.8 gives the p-value: p = 0.11. If we had used the chi-square pdf we would have found p = 0.073.
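
A minimal sketch of the procedure (my addition, assuming NumPy; the predicted bin means ν_i below are illustrative stand-ins, not the histogram from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical predicted means for N = 20 bins; several bins are nearly empty,
# which is exactly the regime where the chi-square pdf is unreliable.
nu = 30.0 * np.exp(-0.5 * np.arange(20))

def pearson_chi2(n, nu):
    # Pearson chi^2 with sigma_i^2 = nu_i, appropriate for Poisson bins
    return np.sum((n - nu) ** 2 / nu, axis=-1)

# Simulate the data sample many times under the hypothesis
# (10^6 times in the slides; fewer here to keep memory modest)
sims = rng.poisson(nu, size=(200_000, nu.size))
t = pearson_chi2(sims, nu)

chi2_obs = 29.8  # the value observed in the data
print(np.mean(t >= chi2_obs))  # MC estimate of the p-value
```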