lab12

docx

School

University of California, Berkeley *

*We aren’t endorsed by this school

Course

C200

Subject

Statistics

Date

Feb 20, 2024

Type

docx

Pages

Uploaded by DrStar12779

lab12 April 14, 2023 [1]: # Initialize Otter import otter grader = otter . Notebook( "lab12.ipynb" ) 0.0.1 Content Warning This lab includes discusssion about cancer. If you feel uncomfortable with this topic, please contact your GSI or the instructors, or reach out via the Spring 2023 extenuating circumstances form. 1 Lab 12: Logistic Regression In this lab, we will manually construct the logistic regression model and minimize cross-entropy loss using scipy.minimize. This structure mirrors the linear regression labs from earlier in the semester and lets us dive deep into how logistic regression works. We also introduce the sklearn.linear_model.LogisticRegression module that you would use in practice, and we ex plore performance metrics for classification. 1.0.1 Due Date The on-time deadline is Tuesday, April 18th, 11:59 PM PT . Please read the syllabus for the grace period policy. No late submissions beyond the grace period will be accepted. 1.0.2 Collaboration Policy Data science is a collaborative activity. While you may talk with others about this assignment, we ask that you write your solutions individually . If you discuss the assignment with others, please include their names in the cell below. Collaborators: list names here [2]: # Run this cell to set up your notebook import numpy as np import pandas as pd import sklearn 1 import sklearn.datasets import matplotlib.pyplot as plt import seaborn as sns import plotly.offline as py import plotly.graph_objs as go

import plotly.figure_factory as ff % matplotlib inline sns . set() sns . set_context( "talk" ) 1.0.3 Lab Walk-Through In addition to the lab notebook, we have also released a prerecorded walk-through video of the lab. We encourage you to reference this video as you work through the lab. Run the cell below to display the video. Note : The walkthrough video is recorded from Spring 2022. [3]: from IPython.display import YouTubeVideo YouTubeVideo( "75hj59nas-M" ) [3]: 2 1.1 Data Loading We will explore a breast cancer dataset from the University of Wisconsin ( source ). This dataset can be loaded using the sklearn.datasets.load_breast_cancer() method. [4]: # Run this cell to load the data, no further action is needed. data = sklearn . datasets . load_breast_cancer() # Data is a dictionary. print (data . keys()) print (data . DESCR)

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename', 'data_module']) .. _breast_cancer_dataset: Breast cancer wisconsin (diagnostic) dataset -------------------------------------------- **Data Set Characteristics:** :Number of Instances: 569 :Number of Attributes: 30 numeric, predictive attributes and the class :Attribute Information: - radius (mean of distances from center to points on the perimeter) - texture (standard deviation of gray-scale values) - perimeter - area - smoothness (local variation in radius lengths) - compactness (perimeter^2 / area - 1.0) - concavity (severity of concave portions of the contour) - concave points (number of concave portions of the contour) - symmetry - fractal dimension ("coastline approximation" - 1) The mean, standard error, and "worst" or largest (mean of the three worst/largest values) of these features were computed for each image, resulting in 30 features. For instance, field 0 is Mean Radius, field 10 is Radius SE, field 20 is Worst Radius. - class: - WDBC-Malignant 3 - WDBC-Benign :Summary Statistics: ===================================== ====== ====== Min Max ===================================== ====== ====== radius (mean): 6.981 28.11 texture (mean): 9.71 39.28 perimeter (mean): 43.79 188.5 area (mean): 143.5 2501.0 smoothness (mean): 0.053 0.163 compactness (mean): 0.019 0.345 concavity (mean): 0.0 0.427 concave points (mean): 0.0 0.201 symmetry (mean): 0.106 0.304 fractal dimension (mean): 0.05 0.097 radius (standard error): 0.112 2.873

Your preview ends here