Lab03-prelab
School
University of British Columbia *
Course
119
Subject
Computer Science
Date
Feb 20, 2024
Lab 03 Prelab, Part 2 - Analysis preparation and initial data collection
Please complete Part 1 of the prelab on Canvas before working through this notebook.
In [41]:
# Clear all variables, start with a clean environment.
%reset -f
import numpy as np
import data_entry2
This prelab activity introduces a useful feature in our data_entry2 spreadsheet tool and then walks you through how to calculate, using Python, the quantities average, standard deviation and (standard) uncertainty of the mean. It starts by using a hypothetical example data set to guide you through the use of the relevant Python functions. The work done with the hypothetical data set will not be handed in directly, and instead will set you up to perform these same calculations on some real data, also collected in this prelab.
Simple Calculations in data_entry2 cells
It is possible to do some simple calculations directly in the data_entry2 sheet. In general we want you to do calculations using Python, but for some tasks, most notably recording your uncertainties, it is very convenient to use this feature of the sheet.
As an example, if you measure a mass of 497 g, and estimate a 95% confidence interval of 477 -> 516 g, your sheet could look like:
m      u_m
g      g
497    = (516-477)/4
Alternatively, if you have a rectangular PDF on a balance with a 10 g resolution, you might use something like:
m      u_m
g      g
142    = 10/(2 * np.sqrt(3))
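The two sheet expressions above can also be checked in an ordinary code cell. The following is a minimal sketch, not part of the original notebook; the variable names (`ci_low`, `ci_high`, `resolution`, `u_m_ci`, `u_m_rect`) are illustrative assumptions.

```python
import numpy as np

# Example 1: 95% confidence interval of 477 g to 516 g.
# A 95% CI spans roughly 4 standard deviations (average - 2*sigma to average + 2*sigma),
# so the standard uncertainty is the width of the interval divided by 4.
ci_low, ci_high = 477, 516              # g, values from the first example
u_m_ci = (ci_high - ci_low) / 4         # g

# Example 2: rectangular PDF on a balance with a 10 g resolution.
resolution = 10                         # g, value from the second example
u_m_rect = resolution / (2 * np.sqrt(3))  # g

print("u_m from the 95% CI =", u_m_ci, "g")
print("u_m from the rectangular PDF = {:.3f} g".format(u_m_rect))
```

Running this should give the same numbers that the sheet produces when you press Generate Vectors.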
Try it
Use the sheet below to try out both of these styles of uncertainty.
• Enter the variable name m (in grams) in the first column, and u_m in the second column.
• In the next two rows, enter the measurements and the expressions to calculate uncertainties as shown in the two examples above.
• To get rid of unused rows and columns, execute (Shift-Enter) the notebook cell that creates the sheet again.
• Notice that in the sheet interface you see the formulas you've entered, but that when you Generate Vectors, the expressions are evaluated and the generated uncertainty vector contains the results of the calculations.
• Alter one of the expressions in the uncertainty column so that it contains an error (perhaps add an extra ')' at the end of the expression) to see what happens.
In [42]:
de0 = data_entry2.sheet("test_formulas")
Sheet name: test_formulas.csv
Summary of Part 1 of the prelab
Here is a summary of the statistics concepts covered or reviewed in part 1 of this prelab:
a) Average is given by $$x_{ave} = \frac{1}{N} \sum_{i=1}^N x_i$$
b) For variables that follow a Gaussian distribution, approximately 68% of the values lie within the range $x_{ave} - \sigma$ to $x_{ave} + \sigma$ (68% CI)
c) Approximately 95% of the values will lie within the range $x_{ave} - 2\sigma$ to $x_{ave} + 2\sigma$ (95% CI)
d) Standard deviation is given by
$$ \sigma = \frac{95\% \,\mathrm{CI}}{4} = \sqrt{\frac{1}{N-1}\sum_{i=1}^N \left(x_i - x_{ave}\right)^2} $$
e) We use the standard deviation as an indicator of the uncertainty (or the variability) in a single measurement and it does not depend on the number of measurements taken.
f) Uncertainty of the mean (often called standard error of the mean) is given by
$$\sigma_m = u\_x_{ave} = \frac{\sigma}{\sqrt{N}}$$
We use uncertainty of the mean as an indicator of the uncertainty (or the variability) in the average of multiple measurements and it does improve as we increase the number of measurements.
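The points above can be illustrated numerically. The sketch below is not part of the original notebook: it draws simulated Gaussian "measurements" with made-up values for the true mean and standard deviation, checks the 68% and 95% fractions, and shows that the standard deviation stays roughly constant while the uncertainty of the mean shrinks as N grows. All names and numbers here are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)      # fixed seed so the run is repeatable
true_mean, true_sigma = 435.0, 4.0       # made-up values, in mm
samples = rng.normal(true_mean, true_sigma, size=100_000)

x_ave = np.mean(samples)
sigma = np.std(samples, ddof=1)

# Fraction of values within 1 and 2 standard deviations of the average
frac_1sigma = np.mean(np.abs(samples - x_ave) < sigma)
frac_2sigma = np.mean(np.abs(samples - x_ave) < 2 * sigma)
print("fraction within 1 sigma =", frac_1sigma)
print("fraction within 2 sigma =", frac_2sigma)

# Standard deviation vs uncertainty of the mean for increasing N
for N in (25, 100, 400):
    subset = samples[:N]
    s = np.std(subset, ddof=1)
    print("N =", N, " sigma =", s, " sigma_m =", s / np.sqrt(N))
```

The exact numbers depend on the random draw, but the two fractions should come out close to 0.68 and 0.95.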
Developing your Python skills
Let's import a spreadsheet of our data, "prelab03_1".
In [43]:
# Run me to import the spreadsheet, `prelab03_1`, which is found in the same directory as `Lab03-prelab.ipynb`
de = data_entry2.sheet('prelab03_1')
Sheet name: prelab03_1.csv
Below is a table of the hypothetical data in your imported spreadsheet
Your turn #1:
Double-check that you have the correct number of data points. It should be 25, but you need to recall that Python indexing starts at 0!
Hypothetical data
d (mm)
439.3 431.6 434.6 433.3 439.3
442.6 428.6 441.6 431.2 427.6
433.2 441.3 436 437.6 434.7
433.2 433.1 431.3 436 432.9
436.5 437.2 435.7 432.6 434.7
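As a reminder of how the count and the zero-based indexing fit together, here is a small sketch that is not part of the original notebook. The array `dExample` is a hypothetical stand-in for the `dVec` vector that the sheet generates; it is typed in by hand from the table above.

```python
import numpy as np

# The 25 hypothetical distances from the table above, in mm.
# dExample is an illustrative stand-in for the generated vector dVec.
dExample = np.array([439.3, 431.6, 434.6, 433.3, 439.3,
                     442.6, 428.6, 441.6, 431.2, 427.6,
                     433.2, 441.3, 436.0, 437.6, 434.7,
                     433.2, 433.1, 431.3, 436.0, 432.9,
                     436.5, 437.2, 435.7, 432.6, 434.7])

print("number of data points =", len(dExample))
print("first value (index 0) =", dExample[0], "mm")
print("last value (index 24) =", dExample[24], "mm")
```

With 25 data points the valid indices run from 0 to 24, so asking for `dExample[25]` would raise an IndexError.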
Calculating average and standard deviation using Python numpy functions
Your turn #2:
Press the 'Generate Vectors' button at the top of your spreadsheet to transfer the data into the Python environment and then calculate the average and standard deviation in the cell below using the np.mean() and np.std() functions, respectively. Here np.mean() takes a single argument, which is the vector of values over which to calculate the average. We discuss the second argument in np.std() below.
Note: If it is not working correctly, double-check above that you have correctly titled the single spreadsheet column as 'd' and that there is a resulting generated vector 'dVec'.
In [44]:
# Run me to calculate average and standard deviation. Note how we're able to include descriptive text and units in the print commands.
dAve = np.mean(dVec)
print("Average of d =", dAve, "mm")
dStd = np.std(dVec, ddof=1)
print("Standard deviation of d =", dStd, "mm")
Average of d = 435.028 mm
Standard deviation of d = 3.8362872676586677 mm
You should find that the average is 435.028 mm, which is consistent with our earlier estimate of 435 mm from the histogram. The standard deviation should be 3.8362872676586677 mm, which would be 3.8 mm if we were to round it to 2 significant figures when we report it. This is also consistent with our estimate of 4 mm using the 95% Confidence Interval with the histogram earlier.
Note that in 'np.std()' we are supplying a second argument, ddof=1; this additional argument is needed because the np.std() function uses a general formula in its calculation - it can be used for a number of related calculations. In particular the formula it uses is:
$$ \textrm{np.std()} = \sqrt{\frac{1}{N-\textrm{ddof}}\sum_{i=1}^N \left(x_i - x_{ave}\right)^2} $$
We want $N-1$ in the denominator as per our definition of standard deviation, so we need to use ddof = 1:
$$ \sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^N \left(x_i - x_{ave}\right)^2} $$
If you are interested, ddof is an abbreviation for 'delta degrees of freedom.' As discussed in Lab 01, we use one 'degree of freedom' from our dataset when we calculate the average. Since the average is used in the calculation of standard deviation, we control for this in the formula for standard deviation by dividing the sum of the squared differences between each data point and the mean by $N-1$ instead of $N$.
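To see the effect of ddof directly, the sketch below (not part of the original notebook) compares np.std() with its default ddof=0 against ddof=1 and against the two formulas written out by hand. The array `dExample` is a hypothetical stand-in for `dVec`, typed in from the hypothetical data table.

```python
import numpy as np

# Hand-typed copy of the 25 hypothetical distances (mm); stand-in for dVec.
dExample = np.array([439.3, 431.6, 434.6, 433.3, 439.3,
                     442.6, 428.6, 441.6, 431.2, 427.6,
                     433.2, 441.3, 436.0, 437.6, 434.7,
                     433.2, 433.1, 431.3, 436.0, 432.9,
                     436.5, 437.2, 435.7, 432.6, 434.7])

N = len(dExample)
sumSq = np.sum((dExample - np.mean(dExample)) ** 2)

std_ddof0 = np.std(dExample)            # default ddof=0, divides by N
std_ddof1 = np.std(dExample, ddof=1)    # divides by N - 1

print("ddof=0:", std_ddof0, " by hand:", np.sqrt(sumSq / N))
print("ddof=1:", std_ddof1, " by hand:", np.sqrt(sumSq / (N - 1)))
```

The ddof=0 result is always slightly smaller than the ddof=1 result, and the difference becomes less important as N grows.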
If you want to control the number of significant figures displayed you can modify the print statement as follows.
Within the curly braces, the ':.2' tells 'format()' to round the variable it is given (in this case 'dStd', the standard deviation of 'd') to two significant figures.
In [45]:
# Run me to print dStd with 2 significant figures "{:.2}"
print("Standard deviation to 2 sig figs = {:.2} mm".format(dStd))
Standard deviation to 2 sig figs = 3.8 mm
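Significant figures and decimal places are easy to mix up in format strings, so here is a short sketch (not part of the original notebook) contrasting a few format specifications. The two numbers are copied from the outputs above.

```python
dStd = 3.8362872676586677   # mm, standard deviation printed earlier
dAve = 435.028              # mm, average printed earlier

sig2 = "{:.2}".format(dStd)       # 2 significant figures
dec2 = "{:.2f}".format(dStd)      # 2 decimal places (the 'f' makes the difference)
sig2_ave = "{:.2}".format(dAve)   # 2 significant figures of a larger number
dec1_ave = "{:.1f}".format(dAve)  # 1 decimal place

print(sig2, dec2, sig2_ave, dec1_ave)
```

For larger numbers, ':.2' switches to scientific notation, so ':.1f' (one decimal place) is often the more readable choice when reporting an average alongside an uncertainty such as 3.8 mm.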
Let's step back for a moment and think about what the standard deviation represents. Twenty-five measurements were made using the same experimental procedure, so this standard deviation is a method we can use to represent the variability in our measurements. In the language we are using in the lab, this standard deviation is the single-measurement standard uncertainty of the distance, $u\_d_1$. What does this mean? It means that if we wanted to report the value and uncertainty for one of our measurements of $d$, 434.7 mm for example, we would report it as:
$$ d_1 = (434.7 \pm 3.8) \, \textnormal{mm} $$
The subscript '1' is being used here to emphasize that we are talking about a single measurement and not the average. We will look at the uncertainty in the average later.
The variability (the standard deviation) in the 25 measurements that we made tells us how confident we should be in any one of the individual values. Instead of estimating our uncertainty from a single measurement as we did with the height of the spring in the first two labs, the use of repeated measurements can allow us to measure the variability in our measurements in a more rigorous way.
Calculating average and standard deviation the "long way" using Python
In the lab, you do not need to perform your calculations the "long way", but we want you to learn how to do it this way as part of the prelab for the following reasons:
1. Many of the calculations we perform later in this course will not correspond to built-in functions, so it is useful to learn how to do more complicated calculations.
2. Breaking down complicated calculations into several lines of code, as we do in these "long way" calculations, is the strategy that we will be encouraging you to use for most of your coding work going forward in this course.
3. We will also be giving you a few generally useful tips and skills during this process.
4. It is often easier to find problems or errors in your calculations if you can look at intermediate values.
Calculating average the "long way"
Let's revisit our equation for calculating average,
$$x_{ave} = \frac{1}{N} \sum_{i=1}^N x_i$$
We will break the operation of calculating the average into steps. We will first sum up all the $x_i$ values, then count how many values there are ($N$), and then finally calculate the quotient.
Your turn #3a:
Similar to np.mean() and np.std(), there is a NumPy function for calculating a sum, np.sum(). Use this function in the code cell below to define a variable 'dSum' which is the result of the sum over 'dVec'.
In [46]:
# Use this cell to define a variable dSum, which uses np.sum to sum over dVec
dSum = np.sum(dVec)
print("sum of all the distances =", dSum, "mm")
sum of all the distances = 10875.7 mm
Next, the built-in Python function len() calculates how "long" a vector is, i.e. it counts the number of elements within the supplied variable. For instance, if you run the code cell below you can see len() returns 3 when we supply it with the three-element vector foo:
In [47]:
# Run me to see how len() works
foo = np.array([1, 2, 3])
len(foo)
Out[47]:
3
Your turn #3b:
Use len() in the cell below to define another variable 'dCount' which is the result of counting the number of elements in 'dVec'.
In [48]:
# Use this cell to define a variable dCount, which uses len() to count the number of elements in dVec
dCount = len(dVec)
print("number of elements in d=", dCount)
number of elements in d= 25
Your turn #3c:
Finally, define the variable 'dAveLong' which is calculated by dividing 'dSum' by 'dCount' to arrive at the average of 'dVec' the "long way".
In [49]:
# Use this cell to define dAveLong. Add a second line of code to print out the value
dAveLong = dSum / dCount
print("long way average is=", dAveLong, "mm")
long way average is= 435.028 mm
You should find that you calculated an average distance of 435.028 mm just like when using the short way.
Calculating standard deviation the "long way"
This equation is a little more involved, but we want you to have some practice with these methods in addition to having to stop and think a bit about each of the pieces involved in doing the standard deviation calculation.
Let's look again at our equation for the standard deviation,
$$ \sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^N \left(x_i - x_{ave}\right)^2} $$
We need to first find the average (done!), then for each value $x_i$ find the difference between it and the average. Next we need to find the square of that difference for each value and then sum up all of those squared differences. And then finally we need to divide that sum by $N-1$ and take the square root. Let's do it!
Let's start with calculating $x_i - x_{ave}$ for each data point. What we want Python to do is take each data point in 'dVec' and subtract 'dAve'. Thankfully, this can be done in a single, intuitive line of code. If we were to do this in a calculator, we'd have to make 25 calculations - one for each data point in 'dVec'. However, Python (working with NumPy vectors) is smart enough that when we supply it with a 25-element vector like 'dVec' and ask it to subtract off a one-element vector (or scalar) like 'dAve', it knows that you want to subtract 'dAve' from each data point in 'dVec'.
In [50]:
# Run me to see how subtracting a single number from a vector works
bar = np.array([1, 2, 3, 4, 5])
print('Dummy data = ', bar)
barMinusOne = bar - 1
print('Dummy data subtracted by 1 = ', barMinusOne)
Dummy data = [1 2 3 4 5]
Dummy data subtracted by 1 = [0 1 2 3 4]
Your turn #4a:
Using the example above, define a new Python variable 'diffFromAve' below which subtracts 'dAve' from each element of 'dVec'.
In [51]:
# Use this cell to define diffFromAve
diffFromAve = dVec - dAve
print("data subtracted from average=", diffFromAve, "mm")
data subtracted from average= [ 4.272 -3.428 -0.428 -1.728 4.272 7.572 -6.428 6.572 -3.828 -7.428
-1.828 6.272 0.972 2.572 -0.328 -1.828 -1.928 -3.728 0.972 -2.128
1.472 2.172 0.672 -2.428 -0.328] mm
Going back to the standard deviation formula, we see that we now need to square each of these differences from the average. In Python, the operator that raises a number to a power is two stars (**). Again, when we ask Python to square a NumPy vector, it squares each element within the vector. Run the cell below to define the new variable 'diffFromAvgSquared', which squares your previous result.
In [52]:
# Run this cell to define diffFromAvgSquared, the square of each element from the vector diffFromAve
diffFromAvgSquared = diffFromAve**2
Our next step is to sum up these squared differences. You already learned how to perform sums in Python using 'np.sum()' earlier in calculating the average the "long way".
Your turn #4b:
Use np.sum() to define a new variable 'sumSquaredDiffs' which is the result of summing all the elements in the vector 'diffFromAvgSquared'.
In [53]:
# Use this cell to define sumSquaredDiffs
sumSquaredDiffs = np.sum(diffFromAvgSquared)
print("sum of all squared numbers=", sumSquaredDiffs, "mm^2")
sum of all squared numbers= 353.21040000000016 mm^2
Only two steps to go! Recall that because we use one degree of freedom to calculate the average, we divide the sum of the squared differences by $N-1$ instead of $N$ when we calculate the standard deviation.
Your turn #4c:
We already have $N$ calculated and stored in the variable 'dCount', so below define a new variable 'dCountMinusOne' which stores $N-1$.
In [54]:
# Use this cell to define dCountMinusOne
dCountMinusOne = dCount - 1
print("counted numbers minus one is=", dCountMinusOne)
counted numbers minus one is= 24
Finally, we can combine everything together by running the code cell below, which takes the square root of the sum of the squared differences divided by $N-1$:
In [55]:
# Run me to finish the "long way" calculation of the standard deviation and compare it to the "short way"
dStdLong = np.sqrt( sumSquaredDiffs / dCountMinusOne )
print("Standard deviation (long way) =", dStdLong, "mm")
print("Standard deviation (short way) =", dStd, "mm")
Standard deviation (long way) = 3.8362872676586677 mm
Standard deviation (short way) = 3.8362872676586677 mm
If all went well, you should see identical results for calculating the standard deviation of 'dVec' the long or short way.
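Once the individual steps make sense, they can be collected into a small reusable function. The sketch below is one possible way to do this and is not part of the original notebook; the function name `std_long_way` is hypothetical. It is checked against np.std() using the dummy vector `bar` from the earlier example.

```python
import numpy as np

def std_long_way(values):
    """Standard deviation with N-1 in the denominator, computed step by step."""
    values = np.asarray(values, dtype=float)
    count = len(values)                      # N
    ave = np.sum(values) / count             # average the "long way"
    diff_squared = (values - ave) ** 2       # squared difference for each element
    return np.sqrt(np.sum(diff_squared) / (count - 1))

bar = np.array([1, 2, 3, 4, 5])              # dummy data from the earlier example
print("long way  =", std_long_way(bar))
print("short way =", np.std(bar, ddof=1))
```

Keeping the intermediate variables inside the function preserves the main benefit of the "long way": you can print any of them while debugging.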
Collecting your first set of data (approx. 15 min)
For this lab, we are asking you to collect some initial data using a simulation of the experimental equipment.
Notes:
• You may find it helpful to add some notes about your observations in the "Part B - Start of familiarization" section of your Lab03.ipynb notebook.
• All of your calculations should use the "short way" (np.std(dVec, ddof=1), etc). The "long way" was to help you better understand what the equations are doing and to give you some initial practice with doing calculations by column, which will come up again later in the course.
Your turn #5:
Please open the Pendulum simulation (link found on Canvas in this lab's module). Play around with the pendulum simulation so that you understand how the pendulum and the timer work. In this prelab, you'll be taking some initial measurements to determine the period of a pendulum $T$ at a starting amplitude of $15^\circ$. Here are things to consider when planning your first set of measurements:
1. Remember that the period, $T$, is defined as one complete cycle of the pendulum's motion, returning to the same initial position while also travelling in the same initial direction.
2. Once you have figured out how to use the timer and pendulum, you will have a design choice to make: how many swings back and forth (Mswings) will be counted in each of your trials (Ntrials). Be sure to record Mswings as a Python variable (i.e., have something like: Mswings = <value> in a code cell).
3. Start a fresh spreadsheet below for data collection (make sure the name is different from the name used for the earlier spreadsheet above). In the new spreadsheet you will record the time taken for Mswings swings of the pendulum in each trial.
4. Set an external timer and give yourself 7 minutes total to collect data.
   1. Start with a release amplitude of $15^\circ$. Record directly in your spreadsheet the time taken, $t$, for the pendulum to complete $M$ cycles. We will refer to this as your "measured time" or just "time."
   2. Repeat your measurement as many times as you can in 7 minutes. We will refer to the number of data points you collected as your number of trials, Ntrials.
After your 7 minutes of data collection are finished:
5. Press generate_vectors to create a vector with your data
6. In the code cell below the new spreadsheet, calculate the average time for M swings (tave) and the average period (Tave)
7. Calculate utave ($u\_t_{ave}$) and uTave ($u\_T_{ave}$), the uncertainties of the means for tave and Tave.
8. Calculate relutave and reluTave, the relative uncertainties in tave and Tave.
In [56]:
# Use this cell to create a new spreadsheet, prelab03_2, for data collection
de = data_entry2.sheet('prelab03_2')
Sheet name: prelab03_2.csv
In [57]:
# Use this cell (and additional ones if you like) to define and print tave, Tave, utave, uTave, relutave and reluTave
# for your collected data
Mswings = 20 # number of cycles, fixed
Ntrials = 8 # number of trials collected
tave = np.mean(TimeVec) # average of the measured times
print("average time is=", tave, "s")
Tave = np.mean(PeriodVec) # average period
print("average period is=", Tave, "s")
utave = (np.std(TimeVec, ddof=1)) / np.sqrt(Ntrials) # uncertainty in the average time
print("uncertainty in the average time is=", utave)
uTave = (np.std(PeriodVec, ddof=1)) / np.sqrt(Ntrials) # uncertainty in the average period
print("uncertainty in the average period is=", uTave)
relutave = utave / tave # relative uncertainty in tave
print("relative uncertainty in tave is=", relutave)
reluTave = uTave / Tave # relative uncertainty in Tave
print("relative uncertainty in Tave is=", reluTave)
average time is= 36.675 s
average period is= 1.83375 s
uncertainty in the average time is= 0.013758114488755488
uncertainty in the average period is= 0.0006879057244378101
relative uncertainty in tave is= 0.00037513604604650275
relative uncertainty in Tave is= 0.0003751360460465222
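The two relative uncertainties above agree because each period is just the measured time divided by the fixed number of swings. As a cross-check, the sketch below (not part of the original notebook) works from the time column alone, reads the number of trials with len() instead of typing it in, and divides by Mswings at the end. The array `timeExample` is a hypothetical stand-in for `TimeVec`, typed in from the recorded sheet shown at the end of this notebook.

```python
import numpy as np

Mswings = 20                                   # swings counted per trial
# Hand-typed copy of the measured times (s); stand-in for TimeVec.
timeExample = np.array([36.63, 36.70, 36.72, 36.73,
                        36.66, 36.68, 36.63, 36.65])

Ntrials = len(timeExample)                     # count the trials instead of hard-coding
tave = np.mean(timeExample)                    # average time for Mswings swings
utave = np.std(timeExample, ddof=1) / np.sqrt(Ntrials)  # uncertainty of the mean

Tave = tave / Mswings                          # average period
uTave = utave / Mswings                        # dividing by a constant scales the uncertainty too

relutave = utave / tave
reluTave = uTave / Tave

print("tave =", tave, "s,  utave =", utave, "s")
print("Tave =", Tave, "s,  uTave =", uTave, "s")
print("relative uncertainties:", relutave, reluTave)
```

This approach avoids needing a separate Period column in the sheet, at the cost of assuming Mswings really was the same in every trial.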
Share your prelab results
We will use everybody's shared prelab results as the basis for a discussion about measurement design during the lab. Please add your results to the "Spreadsheet for Sharing prelab data" found in this lab's module on Canvas.
Submit
Steps for submission:
1. Click: Run => Run_All_Cells
2. Read through the notebook to ensure all the cells executed correctly and without error.
3. Correct any errors you find.
4. File => Save_and_Export_Notebook_As->HTML
5. Upload the HTML document to the lab submission assignment on Canvas.
In [58]:
display_sheets()
Sheet: de0 File: test_formulas.csv
        m      u_m
Units   g      g
0       497    =(516-477)/4
1       142    =10/(2 * np.sqrt(3))
Sheet: de File: prelab03_2.csv
        Time      Period
Units   seconds   seconds
0       36.63     1.8315
1       36.70     1.8350
2       36.72     1.8360
3       36.73     1.8365
4       36.66     1.8330
5       36.68     1.8340
6       36.63     1.8315
7       36.65     1.8325
In [ ]: