lab09-AgeUniverse

The Age of the Universe

Welcome to Lab 9! Elements of Data Science, adapted from Berkeley Data8.

Sometimes, the primary purpose of regression analysis is to learn something about the slope or intercept of the best-fitting line. When we use a sample of data to estimate the slope or intercept, our estimate is subject to random error, just as in the simpler case of the mean of a random sample.

In this lab, we'll use regression to get an accurate estimate for the age of the universe, using pictures of exploding stars. Our estimate will come from a sample of all exploding stars. We'll compute a confidence interval to quantify the error caused by sampling.

In [1]:
name = "Sam Z."

In [2]:
## import statements
# These lines load the tests.
from gofer.ok import check
import numpy as np
from datascience import *
import pandas as pd
import matplotlib
from matplotlib import patches
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.simplefilter('ignore', FutureWarning)
plt.style.use('fivethirtyeight')
from ipywidgets import interact, interactive, fixed
import ipywidgets as widgets
import os
user = os.getenv('JUPYTERHUB_USER')

The Actual Big Bang Theory

In the early 20th century, the most popular cosmological theory suggested that the universe had always existed at a fixed size. Today, the Big Bang theory prevails: our universe started out very small and is still expanding. A consequence of this is Hubble's Law, which says that the expansion of the universe creates the appearance that every celestial object that's reasonably far away from Earth (for example, another galaxy) is moving away from us at a constant speed. If we extrapolate that motion backwards to the time when everything in the universe was in the same place, that time is (roughly) the beginning of the universe!

Scientists have used this fact, along with measurements of the current location and movement speed of other celestial objects, to estimate when the universe started.

The cell below simulates a universe in which our sun is the center and every other star is moving away from us. Each star starts at the same place as the sun, then moves away from it over time. Different stars have different directions and speeds; the arrows indicate the direction and speed of travel. Run the cell, then move the slider to see how things change over time.
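Before running it, note the key fact Hubble's Law relies on: if a star at distance d has always been receding at speed v, it left the common starting point d/v ago. Here is a minimal standalone sketch (not one of the lab's cells; the variable names are illustrative) verifying that in a simulated universe of this kind, every star gives the same answer:

import numpy as np

t_elapsed = 2.0                          # years since the simulated "big bang" (illustrative)
vx = np.random.normal(size=15)           # per-star velocities, as in the lab's simulation
vy = np.random.normal(size=15)
x, y = vx * t_elapsed, vy * t_elapsed    # positions after t_elapsed years

distance = np.hypot(x, y)                # how far each star is from the sun
speed = np.hypot(vx, vy)                 # how fast each star is receding
print(distance / speed)                  # every entry is 2.0: distance/speed recovers the age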
Question 1 When did the universe start, in this example?

In [3]:
# Just run this cell. (The simulation is actually not
# that complicated; it just takes a lot of code to draw
# everything. So you don't need to read this unless you
# have time and are curious about more advanced plotting.)
num_locations = 15
example_velocities = Table().with_columns(
    "x", np.random.normal(size=num_locations),
    "y", np.random.normal(size=num_locations))
start_of_time = -2

def scatter_after_time(t, start_of_time, end_of_time, velocities, center_name,
                       other_point_name, make_title):
    max_location = 1.1*(end_of_time-start_of_time)*max(
        max(abs(velocities.column("x"))),
        max(abs(velocities.column("y"))))
    new_locations = velocities.with_columns(
        "x", (t-start_of_time)*velocities.column("x"),
        "y", (t-start_of_time)*velocities.column("y"))
    plt.scatter(make_array(0), make_array(0), label=center_name, s=100, c="yellow")
    plt.scatter(new_locations.column("x"), new_locations.column("y"), label=other_point_name)
    for i in np.arange(new_locations.num_rows):
        plt.arrow(
            new_locations.column("x").item(i),
            new_locations.column("y").item(i),
            velocities.column("x").item(i),
            velocities.column("y").item(i),
            fc='black', ec='black', head_width=0.025*max_location, lw=.15)
    plt.xlim(-max_location, max_location)
    plt.ylim(-max_location, max_location)
    plt.gca().set_aspect('equal', adjustable='box')
    plt.gca().set_position(make_array(0, 0, 1, 1))
    plt.legend(bbox_to_anchor=(1.6, .7))
    plt.title(make_title(t))
    plt.show()

interact(
    scatter_after_time,
    t=widgets.FloatSlider(min=start_of_time, max=5, step=.05, value=0, msg_throttle=1),
    start_of_time=fixed(start_of_time),
    end_of_time=fixed(5),
    velocities=fixed(example_velocities),
    center_name=fixed("our sun"),
    other_point_name=fixed("other star"),
    make_title=fixed(lambda t: "The world {:01g} year{} in the {}".format(
        abs(t), "" if abs(t) == 1 else "s", "past" if t < 0 else "future")));

interactive(children=(FloatSlider(value=0.0, description='t', max=5.0, min=-2.0, step=0.05), Output()), _dom_c…
According to this example, the universe started 2 years ago.

Question 2 After 5 years (with the slider all the way to the right), stars with longer arrows are further away from the Sun. Why?

Every star left the sun at the same moment, so by year 5 they have all been traveling for the same amount of time. Since distance traveled equals speed times time, the stars with higher speeds (longer arrows) have covered more distance and are therefore further from the sun.

Analogy: driving

Here's an analogy to illustrate how scientists use information about stars to estimate the age of the universe. Suppose that at some point in the past, our friend Mei started driving in a car going at a steady speed of 60 miles per hour straight east. We're still standing where she started. We want to know how long she's been driving, but we forgot to record the time when she left. If we find out that she's 120 miles away, and she's been going 60 miles per hour the whole time, we can infer that she left 2 hours ago.

One way we can compute that number is by fitting a line to a scatter plot of our locations and speeds. It turns out that the slope of that line is the amount of time that has passed. Run the next cell to see a picture:

In [4]:
# Run this cell to see a picture of Mei's locations over time.
mei_velocity = Table().with_columns("x", make_array(60), "y", make_array(0))
interact(
    scatter_after_time,
    t=widgets.FloatSlider(min=-2, max=1, step=.05, value=0, msg_throttle=1),
    start_of_time=fixed(-2),
    end_of_time=fixed(1),
    velocities=fixed(mei_velocity),
    center_name=fixed("Us"),
    other_point_name=fixed("Mei"),
    make_title=fixed(lambda t: "Mei's position {:01g} hour{} in the {}".format(
        abs(t), "" if abs(t) == 1 else "s", "past" if t < 0 else "future")));

interactive(children=(FloatSlider(value=0.0, description='t', max=1.0, min=-2.0, step=0.05), Output()), _dom_c…

The slope of the line is 2 hours. (The units are vertical-axis units divided by horizontal-axis units, which are $\frac{\texttt{miles}}{\texttt{miles}/\texttt{hour}}$, or hours.) So that's our answer.

Imagine that you don't know Mei's exact distance or speed, only rough estimates. Then if you drew this line, you'd get a slightly bad estimate of the time since she left. But if you measured the distance and speed of hundreds of people who left you at the same time going different speeds, and drew a line through them, the slope of that line would be a pretty good estimate of the time they left, even if the individual measurements weren't exactly right.
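That claim is easy to check numerically. A minimal sketch (illustrative, not one of the lab's cells): simulate 100 drivers who all left 2 hours ago at different speeds, add measurement noise, and confirm that the fitted slope still comes out near 2.

import numpy as np

rng = np.random.default_rng(0)
true_hours = 2.0
speeds = rng.uniform(10, 80, size=100)                    # miles per hour, one per driver
distances = speeds * true_hours + rng.normal(0, 5, 100)   # noisy distance measurements (miles)

slope, intercept = np.polyfit(speeds, distances, 1)       # least-squares fit: distance vs. speed
print(round(slope, 2))                                    # close to 2.0 despite the noise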
"Speed moving away from us (miles per hour)", make_array(0, 60), "Current distance from us (miles)", make_array(0, 120)) x = small_driving_example.column("Speed moving away from us (miles per hour)") y = small_driving_example.column("Current distance from us (miles)") plt.scatter(x,y, s=200) # Now compute the fit and overlay linear_model=np.polyfit(x,y,1) linear_model_fn=np.poly1d(linear_model) x_s=np.arange(0,60) plt.plot(x_s,linear_model_fn(x_s),color="green") plt.show() Question 3 By looking at the fit line, estimate how long ago (in hours) Mei left. In [6]: # Just run this cell. drivers= Table.read_table("drivers.csv") xd,yd = drivers.column("Speed moving away from us (miles per hour)"),drivers.column("Current distance from us (miles)") drivers Out[6]: Speed moving away from us (miles per hour) Current distance from us (miles) 29.5047 47.6285
64.8771                                    | 110.753
15.9152                                    | 22.8718
13.9501                                    | 23.0032
72.5393                                    | 126.281
38.43                                      | 59.2222
29.9099                                    | 51.3957
46.18                                      | 75.077
62.1933                                    | 105.435
23.4237                                    | 40.9736
... (90 rows omitted)

In [7]:
plt.scatter(xd, yd, color='red')
# Second approach to plotting: create a scatter plot with a regression line
# and confidence-interval band using the seaborn module.
sns.regplot(x=xd, y=yd)
plt.show()
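If you'd rather compute Question 3's answer than eyeball it, the same np.polyfit call used for the small example gives the fitted slope of the drivers data directly. A sketch (not one of the lab's cells; xd and yd are the arrays defined in cell 6):

drivers_slope, drivers_intercept = np.polyfit(xd, yd, 1)  # least-squares fit of distance on speed
drivers_slope                                             # estimated hours since the drivers left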
In [8]:
##### Fill in the start time you infer from the above line.
driving_start_time_hours = (100-60)/(60-35)
driving_start_time_hours

Out[8]: 1.6

In [9]: check('tests/q3.py')
Out[9]: All tests passed!

Back to cosmology

To do the same thing for the universe, we need to know the distance-from-Earth and speed-away-from-Earth of many celestial objects. Using pictures taken by very accurate telescopes and a lot of physics, astronomers have been able to estimate both. It turns out that nearby supernovae -- stars that have recently died and exploded -- are among the best sources of this data, because they are very easy to see. This picture taken by the Hubble telescope shows an entire galaxy, with a single supernova -- as bright by itself as billions of stars -- at the bottom left.

Our astronomical data for today will come from the Supernova Cosmology Project at Lawrence Berkeley Lab. The original dataset is here, with (brief) documentation here. Each row in the table corresponds to a supernova near Earth that was observed by astronomers. From pictures like the one above, the astronomers deduced how far away each supernova was from Earth and how fast it was moving away from Earth. Their deductions were good, but not perfect.

Run the cell below to load the data into a table called close_novas and make a scatter plot. (If you prefer, you can also use the name close_novae; both are correct.)

Question 4 Looking at this plot, make a guess at the age of the universe.

Note: Make sure you get the units right! In case you need to know what a parsec is, it's a big unit of distance, equivalent to 30.86 trillion kilometers.
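About those units: the slope of a distance-versus-speed line here has units of

$$\frac{\text{million parsecs}}{\text{parsecs}/\text{year}} = \text{millions of years},$$

so a slope must be multiplied by one million to give an age in years. For instance (illustrative numbers, not taken from the data), a supernova 200 million parsecs away receding at 0.014 parsecs per year suggests an age of roughly $200 \times 10^6 / 0.014 \approx 1.4 \times 10^{10}$ years, i.e. about 14 billion years.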
In [10]:
# Just run this cell.
close_novas = Table.read_table("close_novas.csv")
close_novae = close_novas
xn, yn = (close_novae.column("Speed (parsecs/year)"),
          close_novae.column("Distance (million parsecs)"))
sns.regplot(x=xn, y=yn)
plt.show()
close_novas

Out[10]:
Speed (parsecs/year) | Distance (million parsecs)
0.00873361           | 117.305
0.0153418            | 217.007
0.0162256            | 230.961
0.00528131           | 85.2853
0.0129474            | 185.051
0.0138862            | 212.841
0.0111837            | 151.728
0.0060085            | 82.6121
0.00838228           | 104.029
0.00812078           | 124.778
... (146 rows omitted)

In [11]:
# Fill this in manually by examining the line above.
first_guess_universe_age_years = (10*10**9)

# This just shows your guess as a nice string, in billions of years.
"{:,} billion years".format(round(first_guess_universe_age_years / 1e9, 2))

Out[11]: '10.0 billion years'

In [12]: check('tests/q4.py')
Out[12]: All tests passed!

Fitting the line yourself

Displaying a fit line is visually helpful, but we need to be able to calculate the slope as a number. Recall that the least-squares regression line for our supernova data is the line with the smallest mean squared error over all the supernovae we observe, where the error for each supernova is

$$\text{the supernova's actual distance from Earth} - \text{the height of the line at that supernova's speed.}$$

In [13]:
## TEST Fitting a practice line with slope and intercept that you pick
slope = 4
intercept = 3
predict = close_novas.column('Speed (parsecs/year)')*slope + intercept
tbl = close_novas.with_column('predict', predict/1e6)
tbl

Out[13]:
Speed (parsecs/year) | Distance (million parsecs) | predict
0.00873361           | 117.305                    | 3.03493e-06
0.0153418            | 217.007                    | 3.06137e-06
0.0162256            | 230.961                    | 3.0649e-06
0.00528131           | 85.2853                    | 3.02113e-06
0.0129474            | 185.051                    | 3.05179e-06
0.0138862            | 212.841                    | 3.05554e-06
0.0111837            | 151.728                    | 3.04473e-06
0.0060085            | 82.6121                    | 3.02403e-06
0.00838228           | 104.029                    | 3.03353e-06
0.00812078           | 124.778                    | 3.03248e-06
... (146 rows omitted)

Question 5 Define a function called errors. It should take three arguments:

1. a table like close_novas (with the same column names and meanings, but not necessarily the same data)
2. the slope of a line (a number)
3. the intercept of a line (a number).

It should return an array of the errors made when a line with that slope and intercept is used to predict distance from speed for each supernova in the given table. (The error is the actual distance minus the predicted distance.)

In [14]:
def errors(tbl, slope, intercept):
    ## extract values from table
    X_values = tbl.column("Speed (parsecs/year)")
    Y_values = tbl.column("Distance (million parsecs)")
    ## predicted values from line
    predicted_y = (slope * X_values) + intercept
    ## errors
    error_values = Y_values - predicted_y
    return error_values
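As a quick hand-check of errors (illustrative, not one of the lab's cells): on a two-row table where the line passes exactly through both points, the errors should be zero, and shifting the intercept up by 1 should make both errors -1.

tiny = Table().with_columns(
    "Speed (parsecs/year)",       make_array(0, 1),
    "Distance (million parsecs)", make_array(1, 3))

print(errors(tiny, 2, 1))   # line y = 2x + 1 fits exactly: [0. 0.]
print(errors(tiny, 2, 2))   # line shifted up by 1 overpredicts: [-1. -1.]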
Question 6 Using errors, compute the errors for the line with slope 16000 and intercept 0 on the close_novas dataset. Name that array example_errors. Then make a scatter plot of the errors.

Hint: To make a scatter plot of the errors, plot the error for each supernova in the dataset. Put the actual speed on the horizontal axis and the error on the vertical axis.

In [15]:
example_errors = errors(close_novas, 16000, 0)
plt.scatter(close_novas.column(0), example_errors)

Out[15]: <matplotlib.collections.PathCollection at 0x7fe882bc0d60>

In [16]:
## Check sign of error
np.round(example_errors.item(0), 2)

Out[16]: -22.43

In [17]: check('tests/q6.py')
Out[17]: All tests passed!

You should find that the errors are almost all negative. That means our line is a little bit too steep: it predicts distances larger than the actual ones. Let's find a better one.
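You can quantify "almost all" directly. A one-line check (not one of the lab's cells):

np.mean(example_errors < 0)   # fraction of supernovae the line overpredicts; close to 1.0 here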
Question 7 Define a function called fit_line. It should take a table like close_novas (with the same column names and meanings) as its argument. It should return an array containing the slope (as item 0) and intercept (as item 1) of the least-squares regression line predicting distance from speed for that table. Here we will minimize squared errors.

Note: If you haven't tried to use the minimize function yet, now is a great time to practice. Here's an example from the textbook, Inferential Thinking 15.3.

First we will define an mse function to get the mean squared error.

In [18]:
def mse(tbl, xlabel, ylabel, any_slope, any_intercept):
    xdata, ydata = tbl.column(xlabel), tbl.column(ylabel)
    fitted = any_slope * xdata + any_intercept
    mse = np.mean((ydata - fitted) ** 2)
    print("Root mean squared error:", mse ** 0.5)
    return mse

In [19]:
mse(close_novas, 'Speed (parsecs/year)', 'Distance (million parsecs)', 1600, 0)

Root mean squared error: 129.154916893
Out[19]: 16680.992557632489

In order to use the datascience minimize function, it must be able to vary all of the function's parameters, which is impossible when some of them are a table name and column labels. So we create a related function below that fixes the table and labels and varies only the slope and intercept.

In [20]:
def mse_c(any_slope, any_intercept):
    tbl = close_novas
    xlabel = 'Speed (parsecs/year)'
    ylabel = 'Distance (million parsecs)'
    xdata, ydata = tbl.column(xlabel), tbl.column(ylabel)
    fitted = any_slope * xdata + any_intercept
    mse = np.mean((ydata - fitted) ** 2)
    print("Root mean squared error:", mse ** 0.5)
    return mse

Determine the mse (mean squared error) for trial values of any_slope and any_intercept:

In [21]:
mse_c(slope, intercept)

Root mean squared error: 142.499211976
Out[21]: 20306.025413737752
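The same trick can be written once and reused for any table: a factory function that captures the table and labels in a closure and hands minimize a purely numeric objective. A sketch of that pattern (make_mse is a hypothetical name, not defined in the lab):

def make_mse(tbl, xlabel, ylabel):
    # Capture the table and labels in a closure; minimize will
    # only ever see the two numeric parameters.
    xdata, ydata = tbl.column(xlabel), tbl.column(ylabel)
    def objective(any_slope, any_intercept):
        fitted = any_slope * xdata + any_intercept
        return np.mean((ydata - fitted) ** 2)
    return objective

# Usage: behaves like mse_c, but works for any table.
objective = make_mse(close_novas, 'Speed (parsecs/year)', 'Distance (million parsecs)')
objective(1600, 0)   # same value as the mse(...) call above, without the print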
Now use minimize to find the values of any_slope and any_intercept that minimize mse_c:

In [22]:
minimize(mse_c)

Root mean squared error: 145.306173836
Root mean squared error: 145.296075973
Root mean squared error: 143.49888016
Root mean squared error: 10.5223813535
... (many similar lines omitted; mse_c prints once per trial the optimizer makes)
Root mean squared error: 10.481839294

Out[22]: array([ 1.40945338e+04, 2.40907951e+00])
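As a cross-check (not one of the lab's cells), numpy's built-in least-squares fit should agree closely with what minimize found, since both minimize the same mean squared error:

np.polyfit(xn, yn, 1)   # returns [slope, intercept]; expect roughly [14094.5, 2.41]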
In [23]:
values = minimize(mse_c)
print(values, values[0], values[1])  # print and extract values

Root mean squared error: 145.306173836
... (many similar lines omitted; the same minimization runs again)
Root mean squared error: 10.481839294
[ 1.40945338e+04  2.40907951e+00] 14094.5337865 2.40907950644

Now define the function fit_line below using the minimize function.

In [24]:
def fit_line(tbl):
    # Your code may need more than 1 line below here.
    xdata, ydata = tbl.column(0), tbl.column(1)
    def mse_c(slope, intercept):
        fitted = slope * xdata + intercept
        return np.mean((ydata - fitted)**2)
    values = minimize(mse_c)
    slope = values[0]
    intercept = values[1]
    return make_array(slope, intercept)

# Here is an example call to your function. To test your function,
# figure out the right slope and intercept by hand.
example_table = Table().with_columns(
    "Speed (parsecs/year)", make_array(0, 1),
    "Distance (million parsecs)", make_array(1, 3))
fit_line(example_table)

Out[24]: array([ 2., 1.])

In [25]: check('tests/q7.py')
Out[25]: All tests passed!

Question 8 Use your function to fit a line to close_novas. Then, set new_errors equal to the errors that we get calling errors with our new line. The following code will graph the corresponding residual plot with a best fit line. Make sure that the residual plot makes sense. (Hint: what qualities should the best fit line of a residual plot have?)
In [26]:
best_line = fit_line(close_novas)
best_line_slope = best_line.item(0)
best_line_intercept = best_line.item(1)
new_errors = errors(close_novas, best_line_slope, best_line_intercept)

# This code displays the residual plot, given your values for
# best_line_slope and best_line_intercept.
best = Table().with_columns(
    "Speed (parsecs/year)", close_novas.column("Speed (parsecs/year)"),
    "Distance errors (million parsecs)", new_errors)
xb, yb = (best.column("Speed (parsecs/year)"),
          best.column("Distance errors (million parsecs)"))
sns.regplot(x=xb, y=yb)
plt.show()

# This just shows your answer as a nice string, in billions of years.
"Slope: {:g} (corresponding to an estimated age of {:,} billion years)".format(
    best_line_slope, round(best_line_slope/1000, 4))

Out[26]: 'Slope: 14094.5 (corresponding to an estimated age of 14.0945 billion years)'

That slope (multiplied by 1 million) is an estimate of the age of the universe in years. The current best estimate of the age of the universe (using slightly more sophisticated techniques) is 13.799 billion years. What was our determined age? Did we get close?

Our determined age is 14.0945 billion years, about 2% above the accepted value, so yes, we got close.
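The hint in Question 8 can also be checked numerically: a sensible residual plot has errors averaging about zero, with a best-fit line through the residuals whose slope and intercept are both near zero. A quick sketch using the variables defined above (not one of the lab's cells):

print(np.mean(new_errors))     # mean residual: should be near 0
print(np.polyfit(xb, yb, 1))   # slope and intercept of the residual fit: both near 0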
One reason our answer might be a little off is that we are using a sample of only some of the supernovae in the universe. Our sample isn't exactly random, since astronomers presumably chose the novae that were easiest to measure (or used some other nonrandom criteria). But let's assume it is. How can we produce a confidence interval for the age of the universe?

Question 9 It's time to bootstrap so that we can quantify the variability in our estimate! Simulate 1000 resamples from close_novas. For each resample, compute the slope of the least-squares regression line, and multiply it by 1 million to compute an estimate of the age of the universe. Store these ages in an array called bootstrap_ages, and then use them to compute a 95% confidence interval for the age of the universe.

Note: This might take up to a minute, and more repetitions will take even longer. See text: Inferential Thinking 13.1.

In [46]:
bootstrap_ages = make_array()
for i in np.arange(1000):
    sampled_novas = fit_line(close_novas.sample())
    sampled_age = sampled_novas[0] * 1000000
    bootstrap_ages = np.append(bootstrap_ages, sampled_age)

lower_end = percentile(2.5, bootstrap_ages)
upper_end = percentile(97.5, bootstrap_ages)
Table().with_column("Age estimate", bootstrap_ages*1e-9).hist(bins=np.arange(12, 16, .1), unit="billion years")
print("95% confidence interval for the age of the universe: [{:g}, {:g}] billion years".format(
    lower_end*1e-9, upper_end*1e-9))

95% confidence interval for the age of the universe: [13.614, 14.518] billion years

In [47]: check('tests/q9.py')
Out[47]: All tests passed!
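Note that the datascience percentile function takes the percentile first and the array second; numpy offers an equivalent one-call cross-check (a sketch, not one of the lab's cells):

np.percentile(bootstrap_ages, [2.5, 97.5]) * 1e-9   # both CI endpoints at once, in billions of years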
Nice work, data astronomer! You can compare your result to the Planck project 2015 results, which estimated the age of the universe to be 13.799±0.021 billion years.

Please remember to submit!

In [48]:
# For your convenience, you can run this cell to run all the tests at once!
import glob
from gofer.ok import check
correct = 0
checks = [3, 4, 6, 7, 9]
total = len(checks)
for x in checks:
    print('Testing question {}: '.format(str(x)))
    g = check('tests/q{}.py'.format(str(x)))
    if g.grade == 1.0:
        print("Passed")
        correct += 1
    else:
        print('Failed')
        display(g)
print('Grade: {}'.format(str(correct/total)))
print("Nice work ", name, user)
import time; localtime = time.asctime(time.localtime(time.time()))
print("Submitted @ ", localtime)

Testing question 3: Passed
Testing question 4: Passed
Testing question 6: Passed
Testing question 7: Passed
Testing question 9: Passed
Grade: 1.0
Nice work  Sam Z. tul06187@temple.edu
Submitted @  Fri Nov 17 03:06:35 2023

In [ ]: