Assignment - Regressions - Adi S.

School: Indiana University, Bloomington
Course: S364
Subject: Computer Science
Date: Feb 20, 2024
Type: docx
Pages: 5
Uploaded by: EarlNightingale4120

Name(s): Please insert your name(s) here

K353 Assignment - Linear and Logistic Regressions (Total Points: 40)

1. Conceptual Questions (10 points): These questions pertain to key concepts covered during the class. This will include a series of multiple choice, fill in the blank, short answer, and matching questions. These questions are tightly linked to the learning objectives of that week. Questions are also likely candidates for exams.

2. Hands-on Exercises (15 points): These questions relate to the hands-on activity(ies). These activities are related to the content covered in the chapter and give students hands-on experience, which is highly sought after by employers for entry-level positions in the industry.

3. Custom Code Implementation (15 points): These questions allow students to create their own code to achieve a particular task of their choosing. The problem specification must be clear, the code must be commented, and the code must include concepts that were covered that week. These questions are intended to get students to think creatively about the concepts covered in class and build something of use or of interest to them.

4. Student Feedback (ungraded): These questions allow each student to offer feedback to the instructor if there were particular areas which were difficult or needed additional explanation.

Students may form groups of up to two for assignments. You may also choose to work alone. However, the number of questions and the questions themselves will not change whether you work alone or with someone. If you choose to work with someone, only one of you is required to submit the assignment, with BOTH of your names on it. Both of you will receive the same score for the assignment. You may choose to work individually for certain assignments and in groups for others. However, you are responsible for making these decisions and resolving any potential conflicts (e.g., free-riding); neither I nor the TAs will intervene.

No late assignments will be accepted.

In this course, turnitin.com will be utilized. Turnitin is an automated system which instructors may use to quickly and easily compare each student's assignment with billions of web sites, as well as an enormous database of student papers that grows with each submission. After the assignment is processed, as the instructor I receive a report from turnitin.com that states if and how another author’s work was used in the assignment. For a more detailed look at this process visit http://www.turnitin.com.

Suggestion

The document is tightly styled. After every question, there is space to respond to the question. Questions use the “question” style and the blank space between questions uses the “answer” style. Students should just start typing into the space provided for the answers, and their answers will be distinct from the questions to facilitate grading.
Linear and Logistic Regression

Conceptual Questions (10 points)

Underline the correct answer (possible answers in bold) for the following ...

1. The argument position for an array of predictor(s).
    train_test_split(X, y, 0.2, 0.7, 7)
    Underlined: X

2. The argument position for the portion of the dataset to include in the test split.
    train_test_split(X, y, 0.2, 0.7, 7)
    Underlined: 0.2

3. The argument position to set a random state for reproducible data shuffling.
    train_test_split(X, y, 0.2, 0.7, 7)
    Underlined: 7

4. The argument position for the portion of the dataset to include in the train split.
    train_test_split(X, y, 0.2, 0.7, 7)
    Underlined: 0.7

5. The argument position for a list of labels.
    train_test_split(X, y, 0.2, 0.7, 7)
    Underlined: y

6. The argument position for an array of predictor(s).
    .fit(X, y)
    Underlined: X

7. The argument position for a list of labels.
    .fit(X, y)
    Underlined: y

Please refer to these links for additional resources about the questions:

1. Scikit-Learn train_test_split - https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
2. Scikit-Learn linear_model.LinearRegression - https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression.fit
3. Scikit-Learn linear_model.LogisticRegression - https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression.fit
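The questions above identify each argument by its position. As a minimal sketch, the same split can be written with keyword arguments, which is the form the linked train_test_split documentation uses for test_size, train_size, and random_state; the positional calls in the questions are best read as a labelling aid rather than code to run as-is. The toy arrays below are hypothetical and exist only to make the example self-contained.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples, 2 predictors, labels from a simple linear rule.
X = np.arange(20).reshape(10, 2)
y = 2.0 * X[:, 0] + 1.0

# X: array of predictors, y: labels,
# test_size / train_size: portions of the data for each split,
# random_state: seed that makes the shuffling reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, train_size=0.7, random_state=7
)
print(X_train.shape, X_test.shape)

# .fit takes the predictors first and the labels second.
model = LinearRegression()
model.fit(X_train, y_train)
```

Because test_size and train_size add up to less than 1 here, some rows are left out of both splits; leaving train_size unset lets it default to the complement of test_size.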
Hands-on Exercises (15 points)

Hands-on Exercise prompt

Exercise 1 (number classification): Download the dataset for this exercise with the following code snippet:

    from sklearn.datasets import load_digits

    digits = load_digits()
    X = digits.data    # one row of flattened 8x8 pixel values per image
    y = digits.target  # the digit each image represents

If you want, you can use the following code to visualize the data in this dataset and see what the digits look like:

    import numpy as np
    import matplotlib.pyplot as plt

    # Show the first five images side by side, each titled with its label
    plt.figure(figsize=(20, 4))
    for index, (image, label) in enumerate(zip(digits.data[0:5], digits.target[0:5])):
        plt.subplot(1, 5, index + 1)
        plt.imshow(np.reshape(image, (8, 8)), cmap=plt.cm.gray)
        plt.title('Training: %i\n' % label, fontsize=20)

This is a dataset that is used to train a model to predict digits. Write a program that trains a logistic regression model on the dataset, tests it, and prints its accuracy. The accuracy should be over 90%.

Please include a screenshot of a portion of your OUTPUT along with your Python code.

Please refer to the links listed after this exercise for additional resources about the questions.
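One possible way to approach Exercise 1 is sketched below. It is not the submitted solution, which is not included in this document. The 80/20 split, the random_state value, the division by 16 (the largest pixel value in this dataset), and max_iter=1000 are all assumptions chosen for illustration.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

digits = load_digits()
X = digits.data / 16.0  # assumption: rescale pixel values (0 to 16) into [0, 1]
y = digits.target

# Assumed split: hold out 20% of the images for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=7
)

# A higher max_iter gives the solver room to converge on 64 features.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Accuracy is the fraction of test images whose digit is predicted correctly.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```

The printed accuracy depends on the split and the scikit-learn version, so check it against the 90% target rather than expecting one fixed number.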
1. Scikit-Learn train_test_split - https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
2. Scikit-Learn linear_model.LinearRegression - https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression.fit
3. Scikit-Learn linear_model.LogisticRegression - https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression.fit

Custom Code Implementations (15 points)

For any imaginary company of your choosing, create a 2D array of fabricated data. You can use any method we have gone over for this, including making your fabricated data in Excel, saving it as a csv, and importing it into a Pandas dataframe. Your fabricated data needs to have at least 20 rows, 4 columns of relevant business stats (e.g., number of shirts sold, cost of goods, etc.) and one appropriate target variable (e.g., stock price). For your target variable, choose an appropriate model (linear or logistic regression), train the model on 75% of your data, and test it on 25%. Report the appropriate metric for your selected model (MSE or accuracy).

Please also answer the questions below concisely and precisely.

What is the BDP value or problem you are trying to solve?

We are trying to build a linear regression model that predicts the company’s stock price from other factors such as revenue, expenses, profit, and number of employees.

What is the expected output of your program?

Since we are using a linear regression model, the expected output is the mean squared error, which tells us how well our model can predict the company’s stock price.

What is the value of your program?

The model helps us understand how linear regression can support predictive analysis of a company’s data. We could also run regressions on other variables and in many different business scenarios.

Please paste a screenshot of sample outputs of your program below.
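The submitted program and its screenshot are not part of this document. As a rough sketch of the task described above, the following fabricates 20 rows for an imaginary company, trains a linear regression on 75% of them, and reports the mean squared error on the remaining 25%. The column names follow the answer above; the value ranges, the rule that generates stock_price, and the seed are invented purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)  # fixed seed so the fabricated data is reproducible
n_rows = 20

# Fabricated business stats (hypothetical ranges and units).
revenue = rng.uniform(50, 150, n_rows)
expenses = rng.uniform(30, 100, n_rows)
employees = rng.integers(10, 200, n_rows)
data = pd.DataFrame({
    "revenue": revenue,
    "expenses": expenses,
    "profit": revenue - expenses,
    "employees": employees,
})

# Made-up target: stock price driven by profit and headcount, plus random noise.
data["stock_price"] = (
    20 + 0.5 * data["profit"] + 0.05 * data["employees"] + rng.normal(0, 2, n_rows)
)

X = data[["revenue", "expenses", "profit", "employees"]]
y = data["stock_price"]

# Train on 75% of the rows and test on the remaining 25%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=7
)

model = LinearRegression().fit(X_train, y_train)
mse = mean_squared_error(y_test, model.predict(X_test))
print(f"Mean squared error on the test set: {mse:.2f}")
```

Stock price is a continuous target, which is why linear regression and MSE are the matching pair here; a yes/no target (for example, whether the price rose) would call for logistic regression and accuracy instead.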
Student Feedback (No Points; Ungraded)

On a scale of 1 - 10, how difficult (1 being very easy and 10 being extremely difficult) was this assignment for you?

6

How long did this assignment take you to complete?

45 minutes

Please list any additional feedback you have about this assignment.

More help on the instructions surrounding the creative concepts. I wasn’t exactly sure how to create a NumPy array at first.