
Statistics for Experimental Physics

1. Measurement uncertainty
2. Standard Deviation and Uncertainty
3. Significant Figures
4. Error Propagation
5. Agreement and Difference Significance

1. Measurement uncertainty

Unfortunately, no measurement is perfect. Every instrument you ever use will have some margin of error. For example, say you have a ruler with millimeter markings and you are measuring the diameter of a cylindrical object that is between 5 and 6 cm. You may see that the measurement lies between the second and third mm markings past 5 cm, but you cannot precisely measure how far it is between marks. Your measurement could be 5.25 cm, 5.28 cm, 5.23 cm ... or anything between 5.2 cm and 5.3 cm; you just know that it’s somewhere between the two! In this case, you could report your measurement as 5.25 ± 0.05 cm, or 5.25 cm ± 0.95% (note that 0.05/5.25 = 0.0095). In general, a measurement is reported as r ± δr, though sometimes relative uncertainties (given as percentages) are more useful:

$$r \pm \frac{\delta r}{r} \cdot 100\%$$

Here we use a lowercase delta, δ, to indicate measurement uncertainty. δr is one single number, and would be read as “delta r,” or “the uncertainty in r.” When you take a measurement, you’ll look at the instrument you use and estimate how much you may be off. We’ll also use the same symbol to indicate the total error on a calculated quantity, for which the uncertainty had to be found through error propagation (covered below).

Part of being an honest experimenter is knowing and reporting the experimental uncertainties in your equipment. For analog instruments, a good rule of thumb is to take the smallest increment marked on the instrument and divide by two to find the uncertainty. This is what we did with the ruler in the example above.

For digital instruments, the uncertainty (sometimes called the “tolerance”) should be indicated on the instrument or in its user manual. Whoever builds the instrument must do careful calibration to measure how uncertain its readings are. Use this uncertainty when available. In this lab course, if you come across an instrument without a stated uncertainty, you may assume the digital readout was properly chosen and the uncertainty is 1 in the last digit shown. For example, if a bathroom scale reads out to the kilogram, the uncertainty would be ±1 kg.
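The analog rule of thumb and the relative-uncertainty formula from this section can be sketched in a few lines of Python. This is only an illustration: the function names `analog_uncertainty` and `relative_uncertainty_percent` are made up here and are not part of any course software.

```python
def analog_uncertainty(smallest_division):
    """Rule of thumb for analog instruments: half the smallest marked increment."""
    return smallest_division / 2


def relative_uncertainty_percent(value, uncertainty):
    """Relative uncertainty, expressed as a percentage of the measured value."""
    return abs(uncertainty / value) * 100


# Ruler example from the text: millimeter markings (0.1 cm), reading of 5.25 cm.
reading_cm = 5.25
delta_cm = analog_uncertainty(0.1)
percent = relative_uncertainty_percent(reading_cm, delta_cm)

print(f"{reading_cm} ± {delta_cm} cm")        # 5.25 ± 0.05 cm
print(f"{reading_cm} cm ± {percent:.2f}%")    # 5.25 cm ± 0.95%
```

For a digital instrument with a stated tolerance, you would skip `analog_uncertainty` and pass the stated value directly.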

Before continuing, we need to define some terms that are associated with error and error propagation:

In everyday English “error” means “mistake,” but in statistics and science we use it a different way. We’ll use “uncertainty” and “error” interchangeably as terms for the limitation on how well we have measured a value. The smaller the uncertainty/error, the better we have determined the value. One important thing to understand about error is that it cannot be completely eliminated from experiments and analysis of data. Error can be minimized to acceptable levels but cannot be ignored, so we must have a process in place to deal with it. Over the duration of this lab we will use multiple methods of addressing and stating error. These methods begin with an understanding of how to define error in terms of its nature.

There are two main categories of error. Our labs will mostly focus on the first, random error, but it’s important to understand systematic error as well.

Random error is uncertainty caused by unknown and unpredictable (random) changes in the experiment, including the physical setup, the instruments used, and the fundamental nature of what’s being measured. For example, if you were trying to determine the top speed of a certain model of racecar by taking many measurements of how fast you could make it go, the random error would have contributions from the physical setup (like the density of air and temperature of the air when you did the test), the instruments used (the measurement limitations of the speedometer or your radar speed gun), and the nature of what’s being measured (you’d use several cars of the same model, but they would all have very slightly different properties from variations in the manufacturing process). This means you would get many different numbers for the top speed, and the variation of these numbers would tell you how well you determined the true top speed. Remember, none of this variation is about making mistakes, like using a different model car or taking a reading before maximum speed was achieved.

Systematic error is uncertainty resulting from a bias in the measurement or theory, consistently leading to values that are off in one particular direction.

• Systematic errors can arise from instruments that are miscalibrated. For example, if you have two thermometers that are reading different temperatures when measuring the exact same substance, you know that one or both of them are miscalibrated and will likely be off every time you measure a temperature with it. (Unfortunately it’s not always obvious which thermometer is wrong, or by how much!)

• Systematic errors can also arise from an experimental design that doesn’t take all relevant factors into account. For example, if you’re measuring the acceleration due to gravity and you're not working in a vacuum, but you haven’t taken air resistance into account, your measurements for g will be systematically wrong (by an unknown amount).

• Systematic errors also show up when the theory used does not represent the physical system being investigated. An example of this is when there is an error in the formula being used to investigate data. Let’s say a vacuum chamber is used in conjunction with an electronically controlled drop time mechanism to determine height and drop time of an object falling in the chamber. Let’s say that the scientists use the following equation to calculate the acceleration of the object falling in the chamber:

$$a = \frac{\text{drop height}}{2 \cdot (\text{drop time})^2}$$

In this case, no matter how carefully the experimenter takes data, the calculated acceleration will always be about one fourth of the physically realistic gravitational acceleration. This is because the equation used for the calculation does not accurately represent the relationship between gravitational acceleration, distance, and time.
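A short worked check shows where the factor of one fourth comes from. It assumes the object is dropped from rest and falls with constant acceleration g. The symbols h (drop height), t (drop time), and a_wrong (the value given by the mistaken formula) are shorthand introduced here, not notation from the rest of this handout.

```latex
\begin{align}
h &= \tfrac{1}{2}\, g\, t^2 \quad\Longrightarrow\quad g = \frac{2h}{t^2} \\
a_{\text{wrong}} &= \frac{h}{2t^2} = \frac{1}{4} \cdot \frac{2h}{t^2} = \frac{g}{4}
\end{align}
```

Because the factor is built into the formula, taking more data or better data cannot remove it. That is what makes it a systematic error rather than a random one.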

2. Standard Deviation and Uncertainty

Often in science we’ll try to measure the same thing many times (like the mass of an electron, or the speed of light). The key here is that the value itself should be a single, constant number, so any variation in our measurements is related to measurement uncertainty. For example, you can ask “what is my weight right now?”, but not “what is my weight (over my entire lifetime)?” The second number has no single, well-defined value, since it’s changed dramatically from infancy to now.

When we report the value after many measurements, we want to report our best estimate for the true value, along with the uncertainty to show how precise our measurement was. Our best estimate is the mean (average) of our measurements. If we’ve measured the same thing many times, we can use as our uncertainty the standard deviation (represented with the symbol σ, lowercase sigma). This measures how big the scatter in our measurements is. We calculate this with the following formula, where N is the number of trials you ran, and $\bar{x}$ is the average of all your measured values ($x_1, x_2, x_3, \ldots, x_N$),

$$\sigma = \sqrt{\frac{1}{N-1}\left[(\bar{x}-x_1)^2 + (\bar{x}-x_2)^2 + \cdots + (\bar{x}-x_N)^2\right]}$$

We can report a single measurement as, for example, $x_3 \pm \sigma$. The standard deviation is the appropriate uncertainty on a single one of our measurements, but the uncertainty on the average of our measurements is smaller (the point of averaging is to reduce the uncertainty in our final number). So if we average N measurements, the uncertainty on that average value is given by the standard error of the mean,

$$\sigma_{\text{SEM}} = \frac{\sigma}{\sqrt{N}}$$

We would report our average as $\bar{x} \pm \sigma_{\text{SEM}}$, with a high probability of the true value lying between $\bar{x} - \sigma_{\text{SEM}}$ and $\bar{x} + \sigma_{\text{SEM}}$. Notice that as we collect more data and N gets larger, our standard error of the mean will get smaller (although our standard deviation will remain about the same). It can be difficult and time-consuming to collect data, so there’s a balance between completing the experiment in a reasonable time and having an acceptably small error.

Example:

Let’s say we have a jar of jelly beans and we have N = 5 people taking a guess on how many jelly beans are in the jar. The first person’s guess will be represented using the variable x1, the second person’s guess will be represented as x2, and so on.

x1 = 235, x2 = 202, x3 = 215, x4 = 190, and x5 = 185. The average of the guesses is:

$$\bar{x} = \frac{1}{N}\left(x_1 + x_2 + x_3 + x_4 + x_5\right)$$

$$\bar{x} = \frac{1}{5}\left(235 + 202 + 215 + 190 + 185\right) = 205.4$$

The standard deviation for the set of guesses is:

$$\sigma = \sqrt{\frac{1}{N-1}\left[(\bar{x}-x_1)^2 + (\bar{x}-x_2)^2 + (\bar{x}-x_3)^2 + (\bar{x}-x_4)^2 + (\bar{x}-x_5)^2\right]}$$

$$\sigma = \sqrt{\frac{1}{5-1}\left[(205.4-235)^2 + (205.4-202)^2 + (205.4-215)^2 + (205.4-190)^2 + (205.4-185)^2\right]}$$

$$\sigma = 20.2064346$$

Lastly, the uncertainty on the average is:

$$\sigma_{\text{SEM}} = \frac{\sigma}{\sqrt{N}} = \frac{20.2064346}{\sqrt{5}} = 9.03659228$$

At this point there are no more calculations to be done, so we round to the nearest jelly bean. Our best estimate for the number of beans in the jar, together with the uncertainty on that estimate, is:

205 ± 9 jelly beans.
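The jelly bean calculation can be reproduced with Python's standard library, as a rough check on the arithmetic. The sketch below uses `statistics.stdev`, which divides by N − 1 like the formula in this section. The variable names are illustrative only.

```python
import math
import statistics

guesses = [235, 202, 215, 190, 185]   # the five guesses from the example

n = len(guesses)
mean = statistics.mean(guesses)       # best estimate
sigma = statistics.stdev(guesses)     # sample standard deviation (divides by N - 1)
sem = sigma / math.sqrt(n)            # standard error of the mean

print(f"mean  = {mean:.1f}")          # mean  = 205.4
print(f"sigma = {sigma:.4f}")         # sigma = 20.2064
print(f"SEM   = {sem:.4f}")           # SEM   = 9.0366
print(f"result: {round(mean)} ± {round(sem)} jelly beans")
```

Note that `statistics.pstdev` divides by N instead of N − 1, so it would give a slightly smaller number than the formula used in this handout.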

3. Significant Figures

Notice what we did at the end of the example above, rounding our best estimate and the uncertainty on it. The number of significant figures we use on our estimate is determined by the uncertainty. Our rules for significant figures:

1) Round your uncertainty to one significant figure, unless the uncertainty begins with the digit 1. In that case, round to two significant figures.

2) Round your result to agree in decimal place with your uncertainty.

The basic idea behind both of these rules is that you want to state your results to the precision you actually know them, and no farther. In your lecture class you will probably be given some shorthand rule for significant figures. Use that rule in your lecture class, but keep in mind that it’s just a nod to this real principle from experimental science about the limitations of measurements. In lab class, we’ll be explicitly calculating the uncertainties, and can therefore treat significant figures properly.

Consider the following examples and think about whether they are correctly or incorrectly stated.

Ex. 1: Brianna’s height is 165.0 ± 0.5 cm

Ex. 2: Brianna’s height is 165 ± 0.5 cm

Ex. 3: Jose’s height is 173 ± 1 cm

Ex. 4: Jose’s height is 173.3 ± 1.0 cm

Ex. 5: The area of San Francisco is 600. ± 2 km²

Ex. 6: The area of San Francisco is 6.00×10² ± 2 km²

Ex. 7: The mass of an electron is 9.109×10⁻³¹ ± 1.560×10⁻³³ kg

Ex. 8: The mass of an electron is 9.109×10⁻³¹ ± 1.6×10⁻³³ kg

Ex. 9: The mass of an electron is (9.109 ± 0.016)×10⁻³¹ kg

Answers to examples:

Ex. 1: correct

Ex. 2: incorrect (decimal place disagreement between value and uncertainty)

Ex. 3: incorrect (the uncertainty is rounded to one sig fig, but since it starts with “1” it needs two sig figs)

Ex. 4: correct (trailing zeros are significant)

Ex. 5: correct (without the decimal point after the 600, it’s ambiguous whether it has one or three significant figures)

Ex. 6: correct (writing it in scientific notation makes the significant figures clear)

Ex. 7: incorrect (decimal place disagreement between value and uncertainty, which is hard to see because of the scientific notation, and incorrect number of sig figs on uncertainty)

Ex. 8: correct

Ex. 9: correct, and preferable to what’s in example 8 because it’s easier to read

In intermediate calculations, keep some extra digits along (as in the example above), at least two or three digits more than you think you’ll eventually round to. That avoids introducing rounding errors.

4. Error Propagation

If we have values and uncertainties for some quantities, and then use arithmetic to calculate another quantity, we can find the uncertainty on the result by propagating our uncertainties on the input.

Addition/subtraction:

If you calculate a quantity c by either adding or subtracting,

$$c = x + y \quad\text{or}\quad c = x - y$$

then the uncertainty on c can be found by adding the uncertainties on x and y “in quadrature” (squaring them, adding, then taking the square root):

$$\delta c = \sqrt{(\delta x)^2 + (\delta y)^2}$$

If you are adding or subtracting more than two numbers, just add more squared uncertainties under the radical.

Example:

Two people each have a piece of copper wire. The first person measures the length of their wire to be L1 = 2.40 cm and their ruler has tick marks spaced apart by two millimeters. The second person measures their length of wire to be L2 = 10.0 cm and is using a less sophisticated ruler that only has tick marks every half centimeter.

What is the uncertainty on the total length of both wires combined? In other words, what is the error on Ltot, if Ltot = L1 + L2? These are analog devices, so the measurement uncertainty is one half of the smallest unit of measure. The measurement uncertainty for the first person is 0.1 cm and 0.25 cm for the second person. Using these values and the uncertainty formula for added or subtracted values the error on Ltot is:

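$$\delta L_{\text{tot}} = \sqrt{(\delta L_1)^2 + (\delta L_2)^2}$$

$$\delta L_{\text{tot}} = \sqrt{(0.1\ \text{cm})^2 + (0.25\ \text{cm})^2}$$

$$\delta L_{\text{tot}} = 0.27\ \text{cm}$$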

The result for the calculated total length would be 12.4 ± 0.3 cm, following our significant figure rules.
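The same wire calculation can be sketched in Python. The helper name `add_in_quadrature` is illustrative, not from the course. It accepts any number of uncertainties, matching the note above about adding more squared terms under the radical.

```python
import math


def add_in_quadrature(*uncertainties):
    """Square each uncertainty, add the squares, and take the square root."""
    return math.sqrt(sum(u**2 for u in uncertainties))


# Wire example: each uncertainty is half of the ruler's tick spacing, in cm.
L1, dL1 = 2.40, 0.2 / 2    # 2 mm ticks    -> 0.1 cm
L2, dL2 = 10.0, 0.5 / 2    # 0.5 cm ticks  -> 0.25 cm

L_tot = L1 + L2
dL_tot = add_in_quadrature(dL1, dL2)

print(f"L_tot = {L_tot:.1f} ± {dL_tot:.1f} cm")   # L_tot = 12.4 ± 0.3 cm
```

The same function applies unchanged to a difference such as L2 − L1, since the uncertainty formula is identical for addition and subtraction.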

Multiplication/division:

If you calculate a quantity d by either multiplying or dividing,

$$d = \frac{x}{y} \quad\text{or}\quad d = \frac{y}{x} \quad\text{or}\quad d = x \cdot y$$

then the uncertainty on d can be found by adding the relative uncertainties on x and y (each uncertainty divided by its measured value) in quadrature, and then multiplying the result by the calculated value of d:

$$\delta d = d \cdot \sqrt{\left(\frac{\delta x}{x}\right)^2 + \left(\frac{\delta y}{y}\right)^2}$$

If you are multiplying or dividing more than two numbers, just add more squared uncertainty ratios under the radical.

Example:

A ball is rolled along a straight track, measured with a tape measure to be 240 cm long. The tape measure is marked in one-centimeter increments. An electronic timer counts the number of seconds it takes for the ball to roll the entire length of the track. The time interval for the ball to travel this distance is 2 seconds, and the timer has a tolerance of 0.25 seconds.

The velocity of the ball is v = (length of straight path) / (time to travel the path),

$$v = \frac{L}{t} = \frac{2.40\ \text{m}}{2\ \text{s}} = 1.20\ \text{m/s}$$

What is the uncertainty on the calculated value of velocity?

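$$\delta v = v \cdot \sqrt{\left(\frac{\delta L}{L}\right)^2 + \left(\frac{\delta t}{t}\right)^2}$$

The measurement uncertainty for the length is 0.005 meters and for the time is 0.25 seconds.

$$\delta v = 1.2\ \text{m/s} \cdot \sqrt{\left(\frac{0.005\ \text{m}}{2.4\ \text{m}}\right)^2 + \left(\frac{0.25\ \text{s}}{2.0\ \text{s}}\right)^2}$$

$$\delta v = 0.1500208\ \text{m/s}$$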

The result for the calculated value for velocity would be 1.20 ± 0.15 m/s, following our significant figure rules.
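The rolling-ball example can be checked with a short Python sketch. The function name `product_uncertainty` is hypothetical. It takes the absolute value of the result so that the uncertainty stays positive for negative quantities, which is an assumption added here and not something stated in the formula above.

```python
import math


def product_uncertainty(result, *measurements):
    """Uncertainty on a product or quotient.

    `measurements` are (value, uncertainty) pairs, one per input quantity.
    The relative uncertainties are added in quadrature and scaled by the result.
    """
    relative_sq = sum((unc / val) ** 2 for val, unc in measurements)
    return abs(result) * math.sqrt(relative_sq)


L, dL = 2.40, 0.005   # track length in m; half of the 1 cm tick spacing
t, dt = 2.0, 0.25     # travel time in s; stated timer tolerance

v = L / t
dv = product_uncertainty(v, (L, dL), (t, dt))

print(f"v = {v:.2f} ± {dv:.2f} m/s")   # v = 1.20 ± 0.15 m/s
```

In this example the time measurement dominates: its relative uncertainty (12.5%) is far larger than that of the length (about 0.2%), so a better timer would improve the result much more than a better tape measure.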

5. Agreement and Difference Significance

How do you know if two measurements are different? There are situations when experiments will be performed on two or more different systems. It is often useful to compare results between systems to see if they behave similarly or not. Since there is uncertainty on each result, there needs to be a mathematical procedure to determine the degree of disagreement between any two experimental results.

Example:

A paleontologist measures a certain marsupial fossil to be 37.4 million years (Myr) old. Across the continent, another fossil of this species is found and dated to 36.8 million years old. Are these fossils the same age?

Your first instinct is probably to say “no,” since the numbers 36.8 and 37.4 are different. But this neglects the key point that all measurements have uncertainties. So let’s consider four fossils. Specimen A is dated (37.4 ± 0.1) Myr, specimen B (36.8 ± 0.2) Myr, specimen C (37.4 ± 0.9) Myr, and specimen D (36.8 ± 0.8) Myr. The ages are plotted below with error bars that show the uncertainty above and below each value.

If you compared the ages of A and B, the best estimates are 0.6 Myr apart. So are C and D. But these are very different situations. You can see the error bars of the ages of A and B don’t overlap, whereas the error bars for the ages of C and D overlap substantially. Intuitively, it seems unlikely that the ages of A and B are really the same, but it seems quite likely that the ages of C and D could be the same.

We’ll formalize this with a parameter that we’ll call difference significance (DS). The idea is that we want to calculate the difference between our quantities in terms of uncertainties.¹ If the difference is many times larger than the uncertainty, that should be a real difference. If the difference is smaller than the uncertainty, that’s probably not a real difference. We’ll employ a shorthand version of the rigorous statistical treatment of hypothesis tests, which will suffice as a good introduction for this lab course.

We calculate the difference significance by

$$DS = \frac{\text{difference}}{\text{uncertainty in difference}} = \frac{|A - B|}{\sigma_{AB}}$$

The uncertainty on a difference between two numbers is calculated from the individual uncertainties on the numbers added in quadrature, as explained in the Error Propagation section above.

In this course, we’ll categorize difference significance into four possibilities:

¹ If you have a statistics background, you may recognize that this is similar to the t statistic from a Student’s t-test.

[Figure: age (Myr) of each fossil specimen (A, B, C, D), plotted with error bars. Labeled age values: 36.8 and 37.4.]

0 ≤ DS ≤ 1 Our measurements show no difference between the two numbers, and we are very confident in that result.

1 < DS ≤ 2 Our measurements show no difference between the two numbers, but we are not very confident in that result.

2 < DS ≤ 3 Our measurements do show a difference between the two numbers, but we are not very confident in that result.

3 < DS Our measurements do show a difference between the two numbers, and we are very confident in that result.

For our examples above, the difference between the ages of A and B has a total uncertainty of

$$\sqrt{(0.1\ \text{Myr})^2 + (0.2\ \text{Myr})^2} = 0.2236\ \text{Myr}$$

The difference significance is therefore

$$DS = \frac{37.4\ \text{Myr} - 36.8\ \text{Myr}}{0.2236\ \text{Myr}} = 2.7$$

2.7 is large enough that this is probably a real difference between the ages, but it’s not large enough to be very sure about that. (You can report your difference significance to two or three significant figures.)

Doing the same calculation for specimens C and D yields an uncertainty on the difference of 1.204 Myr, and therefore a difference significance of 0.50. We definitely can’t see a difference between the ages of these two specimens.
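Both fossil comparisons can be reproduced with a short Python sketch. The function names `difference_significance` and `interpret` are made up for illustration, and the category strings are paraphrases of the four possibilities listed above.

```python
import math


def difference_significance(a, da, b, db):
    """|a - b| divided by the quadrature sum of the two uncertainties."""
    return abs(a - b) / math.sqrt(da**2 + db**2)


def interpret(ds):
    """Map a DS value onto the four categories used in this course."""
    if ds <= 1:
        return "no difference, very confident"
    if ds <= 2:
        return "no difference, not very confident"
    if ds <= 3:
        return "difference, not very confident"
    return "difference, very confident"


ds_ab = difference_significance(37.4, 0.1, 36.8, 0.2)   # specimens A and B
ds_cd = difference_significance(37.4, 0.9, 36.8, 0.8)   # specimens C and D

print(f"A vs B: DS = {ds_ab:.1f} ({interpret(ds_ab)})")
print(f"C vs D: DS = {ds_cd:.2f} ({interpret(ds_cd)})")
```

Both pairs have best estimates 0.6 Myr apart, so the different verdicts come entirely from the uncertainties in the denominator.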

rev 2020-09-01