TB for Quiz4 and Quiz5

Machine Learning: Classification, Regression and Clustering

15.1 Introduction to Machine Learning

15.1 Q1: Which of the following statements is false?
a. We can make machines learn.
b. The "secret sauce" of machine learning is data, and lots of it.
c. With machine learning, rather than programming expertise into our applications, we program them to learn from data.
d. All of the above statements are true.

15.1 Q2: Which of the following kinds of predictions are happening today with machine learning?
a. Improving weather prediction, cancer diagnoses and treatment regimens to save lives.
b. Predicting customer "churn," what prices houses are likely to sell for, ticket sales of new movies, and anticipated revenue of new products and services.
c. Predicting the best strategies for coaches and players to use to win more games and championships, while experiencing fewer injuries.
d. All of the above.

15.1.1 Scikit-Learn

15.1 Q3: Which of the following statements is false?
a. Scikit-learn conveniently packages the most effective machine-learning algorithms as evaluators.
b. Each scikit-learn algorithm is encapsulated, so you don't see its intricate details, including any heavy mathematics.
c. With scikit-learn and a small amount of Python code, you'll create powerful models quickly for analyzing data, extracting insights from the data and most importantly making predictions.
d. All of the above statements are true.

15.1 Q4: Which of the following statements is false?
a. With scikit-learn, you'll train each model on a subset of your data, then test each model on the rest to see how well your model works.
b. Once your models are trained, you'll put them to work making predictions based on data they have not seen. You'll often be amazed at how accurate your models will be.
c. With machine learning, your computer that you've used mostly on rote chores will take on characteristics of intelligence.
d. Although you can specify parameters to customize scikit-learn models and possibly improve their performance, if you use the models' default parameters for simplicity, you'll often obtain mediocre results.

15.1 Q5: Which of the following statements about scikit-learn and the machine-learning models you'll build with it is false?
a. It's difficult to know in advance which model(s) will perform best on your data, so you typically try many models and pick the one that performs best; scikit-learn makes this convenient for you.
b. You'll rarely get to know the details of the complex mathematical algorithms in the scikit-learn estimators, but with experience, you'll be able to intuit the best model for each new dataset.
c. It generally takes at most a few lines of code for you to create and use each scikit-learn model.
d. The models report their performance so you can compare the results and pick the model(s) with the best performance.

15.1.2 Types of Machine Learning

15.1 Q6: Which of the following statements is false?
a. The two main types of machine learning are supervised machine learning, which works with unlabeled data, and unsupervised machine learning, which works with labeled data.
b. If you're developing a computer vision application to recognize dogs and cats, you'll train your model on lots of dog photos labeled "dog" and cat photos labeled "cat." If your model is effective, when you put it to work processing unlabeled photos it will recognize dogs and cats it has never seen before.
c. The more photos you train with, the greater the chance that your model will accurately predict which new photos are dogs and which are cats.
d. In this era of big data and massive, economical computer power, you should be able to build some pretty accurate machine learning models.

15.1 Q7: Which of the following statements is false?
a. Supervised machine learning falls into two categories: classification and regression.
b. You train machine-learning models on datasets that consist of rows and columns. Each row represents a data feature. Each column represents a sample of that feature.
c. In supervised machine learning, each sample has an associated label called a target (like "spam" or "not spam"). This is the value you're trying to predict for new data that you present to your models.
d. All of the above statements are true.

15.1 Q8: Which of the following statements is false?
a. "Toy" datasets generally have a small number of samples with a limited number of features. In the world of big data, datasets commonly have millions and billions of samples, or even more.
b. There's an enormous number of free and open datasets available for data science studies. Libraries like scikit-learn bundle popular datasets for you to experiment with and provide mechanisms for loading datasets from various repositories (such as openml.org).
c. Governments, businesses and other organizations worldwide offer datasets on a vast range of subjects.
d. All of the above statements are true.

15.1 Q9: Which of the following statements is false?
a. Even though k-nearest neighbors is one of the most complex classification algorithms, because of its superior prediction accuracy we use it to analyze the Digits dataset bundled with scikit-learn.
b. Classification algorithms predict the discrete classes (categories) to which samples belong.
c. Binary classification uses two classes, such as "spam" or "not spam" in an email classification application. Multi-classification uses more than two classes, such as the 10 classes, 0 through 9, in the Digits dataset.
d. A classification scheme looking at movie descriptions might try to classify them as "action," "adventure," "fantasy," "romance," "history" and the like.

15.1 Q10: Which of the following statements is false?
a. Regression models predict a continuous output, such as the predicted temperature output in a weather time series analysis.
b. We can implement simple linear regression using scikit-learn's LinearRegression estimator.
c. The LinearRegression estimator also can perform multiple linear regression.
d. The LinearRegression estimator, by default, uses all the nonnumerical features in a dataset to make more sophisticated predictions than you can with a single-feature simple linear regression.

15.1 Q11: Unsupervised machine learning uses ________ algorithms.
a. classification
b. clustering
c. regression
d. None of the above

15.1 Q12: Which of the following are related to compressing a dataset's large number of features down to two for visualization purposes?
a. dimensionality reduction
b. TSNE estimator
c. both a) and b)
d. neither a) nor b)
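The questions above rely on the idea that scikit-learn wraps classification, regression and clustering algorithms behind a common estimator interface. The following minimal sketch is an added illustration, not quiz code; the tiny lists of numbers are invented purely to show that a classifier and a regressor are created, trained and used the same way:

from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier

# classification: the targets are discrete class labels
X = [[1], [2], [3], [10], [11], [12]]          # made-up single-feature samples
y = ['small', 'small', 'small', 'large', 'large', 'large']
knn = KNeighborsClassifier(n_neighbors=3)      # k is specified in advance
knn.fit(X, y)
print(knn.predict([[2], [11]]))                # ['small' 'large']

# regression: the targets are continuous values
lr = LinearRegression()
lr.fit([[0], [1], [2], [3]], [0.0, 2.0, 4.0, 6.0])
print(lr.predict([[4]]))                       # approximately [8.]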
15.1 Q13: Which of the following statements is false?
a. The simplest supervised machine-learning algorithm we use is k-means clustering.
b. We use scikit-learn's PCA estimator to perform dimensionality reduction to compress a dataset's many features down to two for visualization purposes.
c. In k-means clustering, each cluster's centroid is the cluster's center point.
d. You'll often run multiple clustering estimators to compare their ability to divide a dataset's samples effectively into clusters.

15.1 Q14: Which of the following statements is false?
a. K-means clustering works through the data attempting to divide it into that many clusters.
b. As with many machine learning algorithms, k-means is recursive and gradually zeros in on the clusters to match the number you specify.
c. K-means clustering can find similarities in unlabeled data. This can ultimately help with assigning labels to that data so that supervised learning estimators can then process it.
d. Given that it's tedious and error-prone for humans to have to assign labels to unlabeled data, and given that the vast majority of the world's data is unlabeled, unsupervised machine learning is an important tool.

15.1 Q15: Which of the following statements is false?
a. The amount of data that's available today is already enormous and continues to grow exponentially; the data produced in the world in the last few years alone equals the amount produced up to that point since the dawn of civilization.
b. People used to say "I'm drowning in data and I don't know what to do with it." With machine learning, we now say, "Flood me with big data so I can use machine-learning technology to extract insights and make predictions from it."
c. The big data phenomenon is occurring at a time when computing power is exploding and computer memory and secondary storage are exploding in capacity while costs dramatically decline. This enables us to think differently about solution approaches.
d. All of the above statements are true.

15.1.4 Steps in a Typical Data Science Study

15.1 Q16: Which of the following are not steps in a typical machine-learning case study?
a. loading the dataset and exploring the data with pandas and visualizations
b. transforming your data (converting non-numeric data to numeric data because scikit-learn requires numeric data) and splitting the data for training and testing
c. creating, training and testing the model; tuning the model, evaluating its accuracy and making predictions on live data that the model hasn't seen before
d. All of the above are steps in a typical machine-learning case study.
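Q16's steps can be seen end-to-end in a few lines. The sketch below is a hedged illustration using scikit-learn's bundled Digits dataset (the same dataset the following case study uses); it is not the quiz's own code:

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# step 1: load the dataset and explore its shape
digits = load_digits()
print(digits.data.shape, digits.target.shape)      # (1797, 64) (1797,)

# step 2: split the data for training and testing
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=11)

# step 3: create and train the model
knn = KNeighborsClassifier()
knn.fit(X=X_train, y=y_train)

# step 4: evaluate its accuracy on data it has not seen
print(f'{knn.score(X_test, y_test):.2%}')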
15.2 Case Study: Classification with k-Nearest Neighbors and the Digits Dataset, Part 1

15.2 Q1: Which of the following statements is false?
a. Classification in supervised machine learning attempts to predict the distinct class to which a sample belongs.
b. If you have images of dogs and images of cats, you can classify each image as a "dog" or a "cat." This is a binary classification problem.
c. When classifying digit images from the Digits dataset bundled with scikit-learn, our goal is to predict which digit an image represents. Since there are 10 possible digits (the classes), this is a multi-classification problem.
d. You train a classification model using unlabeled data.

15.2.1 k-Nearest Neighbors Algorithm

15.2 Q2: Which of the following statements is false?
a. Scikit-learn supports many classification algorithms, including the simplest, k-nearest neighbors (k-NN).
b. The k-nearest neighbors algorithm attempts to predict a test sample's class by looking at the k training samples that are nearest (in distance) to the test sample.
c. Always pick an even value of k.
d. In the k-nearest neighbors algorithm, the class with the most "votes" wins.

15.2 Q3: Which of the following statements is false?
a. In machine learning, a model implements a machine-learning algorithm. In scikit-learn, models are called estimators.
b. There are two parameter types in machine learning: those the estimator calculates as it learns from the data you provide and those you specify in advance when you create the scikit-learn estimator object that represents the model.
c. The machine-learning parameters the estimator calculates as it learns from the data are called hyperparameters; in the k-nearest neighbors algorithm, k is a hyperparameter.
d. For simplicity, we use scikit-learn's default hyperparameter values. In real-world machine-learning studies, you'll want to experiment with different values of k to produce the best possible models for your studies; this process is called hyperparameter tuning.

15.2.2 Loading the Dataset

15.2 Q4: Which of the following statements is false?
a. Scikit-learn's machine-learning algorithms require samples to be stored in a one-dimensional array of floating-point values (or one-dimensional array-like collection, such as a list).
b. To represent every sample as one row, multi-dimensional data must be flattened into a one-dimensional array.
c. If you work with a dataset containing categorical features (typically represented as strings, such as 'spam' or 'not-spam'), you have to preprocess those features into numerical values.
d. Scikit-learn's sklearn.preprocessing module provides capabilities for converting categorical data to numeric data.

15.2.3 Visualizing the Data

15.2 Q5: With regard to our code that displays 24 digit images, which of the following statements is false?
a. The following call to function subplots creates a 6-by-4 inch Figure (specified by the figsize=(6, 4) keyword argument) containing 24 subplots arranged in 6 rows and 4 columns:

import matplotlib.pyplot as plt
figure, axes = plt.subplots(nrows=4, ncols=6, figsize=(6, 4))

b. Each subplot has its own Axes object, which we'll use to display one digit image.
c. Function subplots returns the Axes objects in a two-dimensional NumPy array.
d. All of the above are true.

15.2.4 Splitting the Data for Training and Testing

15.2 Q6: Which of the following statements is false?
a. You typically train a machine-learning model with a subset of a dataset.
b. Typically, you should train your model with the smallest amount of data that makes the model perform well.
c. It's important to set aside a portion of your data for testing, so you can evaluate a model's performance using data that it has not yet seen. Once you're confident that the model is performing well, you can use it to make predictions using new data.
d. All of the above statements are true.

15.2 Q7: Which of the following statements is false?
a. You should first break your data into a training set and a testing set to prepare to train and test the model.
b. The function train_test_split from the sklearn.model_selection module simply splits in order the samples in the data array and the target values in the target array into training and testing sets. This helps ensure that the training and testing sets have similar characteristics.
c. A ShuffleSplit object (module sklearn.model_selection) shuffles and splits samples and their targets.
d. All of the above statements are true.

15.2 Q8: Which of the following statements is false?
a. Scikit-learn's bundled classification datasets are not balanced, so you should be sure to balance each dataset before you work with it.
b. Unbalanced classes could lead to incorrect results.
c. In machine-learning studies, this helps others confirm your results by working with the same randomly selected data.
d. Function train_test_split provides the keyword argument random_state for reproducibility. When you run the code in the future with the same seed value, train_test_split will select the same data for the training set and the same data for the testing set.

15.2 Q9: Which of the following statements is false?
a. Looking at the arrays X_train's and X_test's shapes, you can see that, by default, train_test_split reserves 75% of the data for training and 25% for testing.
b. To specify different splits, you can set the sizes of the testing and training sets with the train_test_split function's keyword arguments test_size and train_size. Use floating-point values from 0.0 through 100.0 to specify the percentages of the data to use for each.
c. You can use integer values to set the precise numbers of samples.
d. If you specify one of the keyword arguments test_size and train_size, the other is inferred. For example, the statement

X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=11, test_size=0.20)

specifies that 20% of the data is for testing, so train_size is inferred to be 0.80.

15.2.5 Creating the Model

15.2 Q10: Which of the following statements is false?
a. The KNeighborsClassifier estimator (module sklearn.neighbors) implements the k-nearest neighbors algorithm.
b. The following code creates a KNeighborsClassifier estimator object:

from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier()

c. The internal details of how a KNeighborsClassifier object implements the k-nearest neighbors algorithm are hidden in the object. You simply call its methods.
d. All of the above statements are true.

15.2.6 Training the Model

15.2 Q11: Which of the following statements is false?
a. The following call to the KNeighborsClassifier object's fit method loads the sample training set (X_train) and target training set (y_train) into the estimator:
knn.fit(X=X_train, y=y_train)

b. After the KNeighborsClassifier's fit method loads the data into the estimator, it uses that data to perform complex calculations behind the scenes that learn from the data and train the model.
c. The KNeighborsClassifier estimator is said to be lazy because its work is performed only when you use it to make predictions.
d. All of the above statements are true.

15.2 Q12: Which of the following statements is false?
a. In real-world machine-learning applications, it can often take minutes, hours, days or even months to train your models; special-purpose, high-performance hardware called GPUs and TPUs can significantly reduce model training time.
b. The fit method returns the estimator, so IPython displays its string representation, which includes the estimator's default settings.
c. For simplicity, we generally use the default estimator settings; by default, a KNeighborsClassifier looks at the four nearest neighbors to make its predictions.
d. All of the above statements are true.

15.2.7 Predicting Digit Classes

15.2 Q13: Which of the following statements is false?
a. Once we've loaded our data into the KNeighborsClassifier, we can use it with the test samples to make predictions. Calling the estimator's predict method with X_test as an argument returns an array containing the predicted class of each test image:

predicted = knn.predict(X=X_test)

b. If predicted and expected are arrays containing the predictions and expected target values for digit images, evaluating the following code snippets displays the predicted digits and expected digits for the first 20 test samples:

predicted[:20]
expected[:20]

c. If predicted and expected are arrays containing the predictions and expected target values for digit images, the following list comprehension locates all the incorrect predictions for the entire test set, that is, the cases in which the predicted and expected values do not match:

wrong = [(p, e) for (p, e) in zip(predicted, expected) if p != e]

d. All of the above statements are true.
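For Q13, the prediction snippets can be assembled into one runnable piece. This is a hedged sketch; the split and training lines are repeated only to make the example self-contained:

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=11)
knn = KNeighborsClassifier().fit(X=X_train, y=y_train)

# predict every test image's class, then compare with the known targets
predicted = knn.predict(X=X_test)
expected = y_test
print(predicted[:20])
print(expected[:20])

# collect the (predicted, expected) pairs that do not match
wrong = [(p, e) for (p, e) in zip(predicted, expected) if p != e]
print(wrong)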
15.3 Case Study: Classification with k-Nearest Neighbors and the Digits Dataset, Part 2

15.3.1 Metrics for Model Accuracy

15.3 Q1: Which of the following statements is false?
a. Each estimator has a score method that returns an indication of how well the estimator performs for the test data you pass as arguments.
b. For classification estimators, the score method returns the prediction accuracy for the test data, as in:

print(f'{knn.score(X_test, y_test):.2%}')

c. You can perform hyperparameter tuning to try to determine the optimal value for k, hoping to get even better accuracy.
d. All of the above statements are true.

15.3 Q2: Which of the following statements is false?
a. Another way to check a classification estimator's accuracy is via a confusion matrix, which shows only the incorrect predicted values (also known as the misses) for a given class.
b. To create a confusion matrix, simply call the function confusion_matrix from the sklearn.metrics module, passing the expected classes and the predicted classes as arguments, as in:

from sklearn.metrics import confusion_matrix
confusion = confusion_matrix(y_true=expected, y_pred=predicted)

c. The y_true keyword argument in Part (b) specifies the test samples' actual classes.
d. The y_pred keyword argument in Part (b) specifies the predicted classes for the test samples.

15.3 Q3: Consider the confusion matrix for the Digits dataset's predictions:

array([[45,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0, 45,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0, 54,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0, 42,  0,  1,  0,  1,  0,  0],
       [ 0,  0,  0,  0, 49,  0,  0,  1,  0,  0],
       [ 0,  0,  0,  0,  0, 38,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0, 42,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0, 45,  0,  0],
       [ 0,  1,  1,  2,  0,  0,  0,  0, 39,  1],
       [ 0,  0,  0,  0,  1,  0,  0,  0,  1, 41]])
Which of the following statements is false?
a. The correct predictions are shown on the diagonal from top-left to bottom-right; this is called the principal diagonal.
b. The nonzero values that are not on the principal diagonal indicate incorrect predictions.
c. Each row represents one distinct class, that is, one of the digits 0–9.
d. The columns within a row specify how many of the test samples were classified incorrectly into each distinct class.

15.3 Q4: The sklearn.metrics module provides function classification_report, which produces a table of classification metrics based on the expected and predicted values, as in:

from sklearn.metrics import classification_report
names = [str(digit) for digit in digits.target_names]
print(classification_report(expected, predicted, target_names=names))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        45
           1       0.98      1.00      0.99        45
           2       0.98      1.00      0.99        54
           3       0.95      0.95      0.95        44
           4       0.98      0.98      0.98        50
           5       0.97      1.00      0.99        38
           6       1.00      1.00      1.00        42
           7       0.96      1.00      0.98        45
           8       0.97      0.89      0.93        44
           9       0.98      0.95      0.96        43

   micro avg       0.98      0.98      0.98       450
   macro avg       0.98      0.98      0.98       450
weighted avg       0.98      0.98      0.98       450

Which of the following statements about the report is false?
a. The precision column shows the total number of correct predictions for a given digit divided by the total number of predictions for that digit. You can confirm the precision by looking at each column in the confusion matrix.
b. The recall column is the total number of correct predictions for a given digit divided by the total number of samples that should have been predicted as that digit. You can confirm the recall by looking at each row in the confusion matrix.
c. The f1-score column is the average of the precision and the recall, and the support column is the number of samples with a given expected value; for example, 50 samples were labeled as 4s, and 38 samples were labeled as 5s.
d. All of the above are true.

15.3 Q5: When you display a confusion matrix as a heat map, the principal diagonal and the incorrect predictions stand out nicely. Which of the following statements about the heat map version of a confusion matrix is false?
a. Seaborn's graphing functions work with two-dimensional data such as pandas DataFrames.
b. The following code converts a confusion matrix into a DataFrame, then graphs it:

import pandas as pd
confusion_df = pd.DataFrame(confusion, index=range(10), columns=range(10))

import seaborn as sns
axes = sns.heatmap(confusion_df, annot=True, cmap='nipy_spectral_r')
c. The Seaborn function heatmap creates a heat map from the specified DataFrame.
d. The keyword argument annot=True (short for "annotation") labels the heatmap's rows with row numbers and columns with column numbers.

15.3.2 K-Fold Cross-Validation

15.3 Q6: Which of the following statements is false?
a. K-fold cross-validation enables you to use all of your data at once for training your model.
b. K-fold cross-validation splits the dataset into k equal-size folds.
c. You then repeatedly train your model with k - 1 folds and test the model with the remaining fold.
d. For example, consider using k = 10 with folds numbered 1 through 10. With 10 folds, we'd do 10 successive training and testing cycles: First, we'd train with folds 1–9, then test with fold 10. Next, we'd train with folds 1–8 and 10, then test with fold 9. Next, we'd train with folds 1–7 and 9–10, then test with fold 8. This training and testing cycle continues until each fold has been used to test the model.

15.3 Q7: Which of the following statements is false?
a. Scikit-learn provides the KFold class and the cross_val_score function (both in the module sklearn.model_selection) to help you perform the training and testing cycles.
b. The following code creates a KFold object:

from sklearn.model_selection import KFold
kfold = KFold(n_folds=10, random_state=11, shuffle=True)

c. The keyword argument random_state=11 seeds the random number generator for reproducibility.
d. The keyword argument shuffle=True causes the KFold object to randomize the data by shuffling it before splitting it into folds. This is particularly important if the samples might be ordered or grouped.

15.3 Q8: Which of the following statements is false?
a. The following code uses function cross_val_score to train and test a model:

from sklearn.model_selection import cross_val_score
scores = cross_val_score(estimator=knn, X=digits.data, y=digits.target, cv=kfold)

b. The keyword arguments in Part (a) are:
   estimator=knn, which specifies the estimator you'd like to validate.
   X=digits.data, which specifies the samples to use for training and testing.
   y=digits.target, which specifies the target predictions for the samples.
   cv=kfold, which specifies the cross-validation generator that defines how to split the samples and targets for training and testing.
c. Function cross_val_score returns one accuracy score.
d. All of the above statements are true.

15.3.3 Running Multiple Models to Find the Best One

15.3 Q9: Which of the following statements is false?
a. It's difficult to know in advance which machine learning model(s) will perform best for a given dataset, especially when they hide the details of how they operate from their users.
b. Even though the KNeighborsClassifier predicts digit images with a high degree of accuracy, it's possible that other scikit-learn estimators are even more accurate.
c. Scikit-learn provides many models with which you can quickly train and test your data. This encourages you to run multiple models to determine which is the best for a particular machine learning study.
d. All of the above statements are true.

15.3.4 Hyperparameter Tuning

15.3 Q10: Which of the following statements is false?
a. The k in the k-nearest neighbors algorithm is a hyperparameter of the algorithm.
b. Hyperparameters are set after using the algorithm to train your model.
c. In real-world machine learning studies, you'll want to use hyperparameter tuning to choose hyperparameter values that produce the best possible predictions.
d. To determine the best value for k in the kNN algorithm, try different values of k then compare the estimator's performance with each.

15.3 Q21: Consider the following code and output:

In [57]: for k in range(1, 20, 2):
    ...:     kfold = KFold(n_splits=10, random_state=11, shuffle=True)
    ...:     knn = KNeighborsClassifier(n_neighbors=k)
    ...:     scores = cross_val_score(estimator=knn,
    ...:         X=digits.data, y=digits.target, cv=kfold)
    ...:     print(f'k={k:<2}; mean accuracy={scores.mean():.2%}; ' +
    ...:           f'standard deviation={scores.std():.2%}')
    ...:
k=1 ; mean accuracy=98.83%; standard deviation=0.58%
k=3 ; mean accuracy=98.78%; standard deviation=0.78%
k=5 ; mean accuracy=98.72%; standard deviation=0.75%
k=7 ; mean accuracy=98.44%; standard deviation=0.96%
k=9 ; mean accuracy=98.39%; standard deviation=0.80%
k=11; mean accuracy=98.39%; standard deviation=0.80%
k=13; mean accuracy=97.89%; standard deviation=0.89%
k=15; mean accuracy=97.89%; standard deviation=1.02%
k=17; mean accuracy=97.50%; standard deviation=1.00%
k=19; mean accuracy=97.66%; standard deviation=0.96%

Which of the following statements is false?
a. The loop creates KNeighborsClassifiers with odd k values from 1 through 19 and performs k-fold cross-validation on each.
b. The k value 7 in kNN produces the most accurate predictions for the Digits dataset.
c. You can also see that accuracy tends to decrease for higher k values.
d. Compute time grows rapidly with k, because k-NN needs to perform more calculations to find the nearest neighbors.

15.4 Case Study: Time Series and Simple Linear Regression

15.4 Q1: Which of the following statements is false?
a. The LinearRegression estimator is in the sklearn.linear_model module.
b. By default, LinearRegression uses all the numeric features in a dataset, performing a multiple linear regression.
c. Simple linear regression uses one feature as the independent variable.
d. All of the above statements are true.

15.3 Q2: Which of the following statements is false?
a. Scikit-learn estimators require their training and testing data to be two-dimensional arrays (or two-dimensional array-like data, such as lists of lists or pandas DataFrames).
b. To use one-dimensional data with an estimator, you must transform it from one dimension containing n elements, into two dimensions containing m rows and n columns.
c. In a DataFrame nyc with a Date column, the expression nyc.Date returns the Date column's Series, and the Series values attribute returns the NumPy array containing that Series' values.
d. All of the above statements are true.

15.3 Q3: To transform a one-dimensional array into two dimensions, we call an array's ________ method.
a. transform
b. switch
c. convert
d. reshape
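A small sketch of the reshape idea behind the two questions above. The numbers here are invented (the case study's NYC temperature data isn't reproduced in this excerpt); the point is only that a one-dimensional feature must become a column of values before LinearRegression will accept it:

import numpy as np
from sklearn.linear_model import LinearRegression

years = np.array([2010, 2011, 2012, 2013, 2014, 2015])   # one-dimensional, shape (6,)
temps = np.array([37.3, 38.1, 37.9, 38.6, 39.0, 39.4])   # invented target values

# reshape(-1, 1) turns the n elements into n rows and 1 column
X = years.reshape(-1, 1)                                  # shape (6, 1)

linear_regression = LinearRegression()
linear_regression.fit(X=X, y=temps)

# the fitted slope and intercept define y = mx + b
m, b = linear_regression.coef_[0], linear_regression.intercept_
print(f'predicted for 2016: {m * 2016 + b:.2f}')
print(linear_regression.predict(np.array([[2016]])))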
15.4 Q4: Which of the following statements is false?
a. Scikit-learn has separate classes for simple linear regression and multiple linear regression.
b. To find the best fitting regression line for the data, a LinearRegression estimator iteratively adjusts the slope and intercept values to minimize the sum of the squares of the data points' distances from the line.
c. Once LinearRegression is complete, you can use the slope and intercept in the y = mx + b calculation to make predictions. The slope is stored in the estimator's coeff_ attribute (m in the equation) and the intercept is stored in the estimator's intercept_ attribute (b in the equation).
d. All of the above are true.

15.4 Q5: The following code tests a linear regression model using the data in X_test and checks some of the predictions throughout the dataset by displaying the predicted and expected values for every ________ element:

predicted = linear_regression.predict(X_test)
expected = y_test
for p, e in zip(predicted[::5], expected[::5]):
    print(f'predicted: {p:.2f}, expected: {e:.2f}')

a. second
b. fifth
c. pth
d. eth

15.3 Q24: Consider the following visualization that you studied in Chapter 15, Machine Learning:
Which of the following statements best describes the above visualization?
a. It shows a scatter plot.
b. It shows a linear regression line.
c. Both (a) and (b)
d. None of the above

15.4 Q8: Which of the following statements is false?
a. When creating a model, a key goal is to ensure that it is capable of making accurate predictions for data it has not yet seen. Two common problems that prevent accurate predictions are overfitting and underfitting.
b. Underfitting occurs when a model is too simple to make accurate predictions, based on its training data. An example of underfitting is using a linear model, such as simple linear regression, when in fact, the problem really requires a more sophisticated non-linear model.
c. Overfitting occurs when your model is too complex. In the most extreme case of overfitting, a model memorizes its training data.
d. When you make predictions with an overfit model, the model won't know what to do with new data that matches the training data, but the model will make excellent predictions with data it has never seen.
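Q8's overfitting/underfitting distinction can be made concrete by comparing training and testing scores. This sketch is an added illustration, not quiz code, using invented noisy data; the degrees 1, 3 and 15 are arbitrary choices to show the pattern:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(11)
X = np.sort(rng.uniform(0, 10, 40)).reshape(-1, 1)        # invented samples
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=40)    # noisy non-linear target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=11)

for degree in (1, 3, 15):       # degree 1 tends to underfit, 15 tends to overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(f'degree {degree:>2}: train R^2={model.score(X_train, y_train):.2f}, '
          f'test R^2={model.score(X_test, y_test):.2f}')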
15.5 Case Study: Multiple Linear Regression with the California Housing Dataset

15.5 Q1: Which of the following statements is false?
a. The California Housing dataset (bundled with scikit-learn) has 20,640 samples, each with eight numerical features.
b. The LinearRegression estimator performs multiple linear regression by default, using all of a dataset's numeric features.
c. You should expect more meaningful results from simple linear regression than from multiple linear regression.
d. All of the above statements are true.

15.5.1 Loading the Dataset

15.5 Q2: Which of the following statements is false?
a. According to the California Housing Prices dataset's description in scikit-learn, "This dataset was derived from the 1990 U.S. census, using one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people)."
b. The dataset has 20,640 samples, one per block group, with eight features each:
   median income (in tens of thousands, so 8.37 would represent $83,700)
   median house age (in the dataset, the maximum value for this feature is 52)
   average number of rooms
   average number of bedrooms
   block population
   average house occupancy
   house block latitude
   house block longitude
By combining these features to make predictions, we're more likely to get more accurate predictions.
c. Each sample has as its target a corresponding median house value in hundreds of thousands of dollars, so 3.55 would represent $355,000. In the dataset, the maximum value for this feature is 5, which represents $500,000. It's reasonable to expect that more bedrooms or more rooms or higher income would generally mean higher house value.
d. All of the above statements are true.

15.5 Q3: Which of the following statements is false?
a. You load the California Housing dataset using function fetch_california_housing from the sklearn.datasets module.
b. The Bunch object's data and target attributes are NumPy arrays containing the 20,640 samples and their target values respectively.
c. We can confirm the number of samples (rows) and features (columns) by looking at the data array's shape attribute, which shows that there are 20,640 rows and 8 columns, as in:

In [4]: california.data.shape
Out[4]: (20640, 8)

Similarly, you can see that the number of target values, that is, the median house values, matches the number of samples by looking at the target array's shape, as in:

In [5]: california.target.shape
Out[5]: (20640,)

d. The Bunch's features attribute contains the names that correspond to each column in the data array.

15.5.2 Exploring the Data with Pandas

15.5 Q4: Consider the following code that imports pandas and sets some options:

import pandas as pd
pd.set_option('precision', 4)
pd.set_option('max_columns', 9)
pd.set_option('display.width', None)

Which of the following statements about the set_option calls is false?
a. 'precision' is the maximum number of digits to display to the right of each decimal point.
b. 'max_columns' is the maximum number of columns to display when you output the DataFrame's string representation. By default, pandas displays all of the columns left-to-right. The 'max_columns' setting enables pandas to show all the columns using multiple rows of output.
c. 'display.width' specifies the width in characters of your Command Prompt (Windows), Terminal (macOS/Linux) or shell (Linux). The value None tells pandas to auto-detect the display width when formatting string representations of Series and DataFrames.
d. All of the above statements are true.

15.3 Q5: Which of the following statements is false?
a. The following code creates a DataFrame from a Bunch's data, target and feature_names arrays. The first snippet below creates the initial DataFrame using the data in california.data and with the column names specified by california.feature_names. The second snippet adds a column for the median house values stored in california.target:

In [11]: california_df = pd.DataFrame(california.data,
    ...:     columns=california.feature_names)
    ...:
In [12]: california_df['MedHouseValue'] = pd.Series(california.target)

b. We can peek at some of the data in the DataFrame using the head function:

In [13]: california_df.head()
Out[13]:
   MedInc  HouseAge  AveRooms  AveBedrms  Population  AveOccup  \
0  8.3252      41.0    6.9841     1.0238       322.0    2.5556
1  8.3014      21.0    6.2381     0.9719      2401.0    2.1098
2  7.2574      52.0    8.2881     1.0734       496.0    2.8023
3  5.6431      52.0    5.8174     1.0731       558.0    2.5479
4  3.8462      52.0    6.2819     1.0811       565.0    2.1815

   Latitude  Longitude  MedHouseValue
0     37.88    -122.23          4.526
1     37.86    -122.22          3.585
2     37.85    -122.24          3.521
3     37.85    -122.25          3.413
4     37.85    -122.25          3.422

c. The \ to the right of the column head "AveOccup" in Part (b)'s output indicates that there are more columns displayed below. You'll see the \ only if the window in which IPython is running is too narrow to display all the columns left-to-right.
d. All of the above statements are true.

15.5.3 Visualizing the Features

15.5 Q6: Which of the following statements is false?
a. It's helpful to visualize your data by plotting the target value against each feature; in the case of the California Housing Prices dataset, to see how the median home value relates to each feature.
b. DataFrame method sample can randomly select a percentage of a DataFrame's data (specified by the keyword argument frac), as in:

sample_df = california_df.sample(frac=0.1, random_state=17)

c. The keyword argument random_state in Part (b)'s snippet enables you to seed the random number generator. Each time you use the same seed value, method sample selects a similar random subset of the DataFrame's rows.
d. All of the above statements are true.

15.5.4 Training the Model

15.5 Q7: Which of the following statements is false?
a. By default, a LinearRegression estimator uses all the features in the dataset's data array to perform a multiple linear regression.
b. An error occurs if any of the features passed to a LinearRegression estimator for training are categorical rather than numeric. If a dataset contains categorical data, you must exclude the categorical features from the training process.
c. A benefit of working with scikit-learn's bundled datasets is that they're already in the correct format for machine learning using scikit-learn's models.
d. All of the above statements are true.

15.5 Q8: In the context of the California Housing dataset, which of the following statements is false?
a. The following code creates a LinearRegression estimator and invokes its fit method to train the estimator using X_train and y_train:

from sklearn.linear_model import LinearRegression
linear_regression = LinearRegression()
linear_regression.fit(X=X_train, y=y_train)

b. Multiple linear regression produces separate coefficients for each feature (stored in coeff_) in the dataset and one intercept (stored in intercept).
c. For positive coefficients, the median house value increases as the feature value increases. For negative coefficients, the median house value decreases as the feature value decreases.
d. You can use the coefficient and intercept values with the following equation to make predictions:

y = m₁x₁ + m₂x₂ + … + mₙxₙ + b

where m₁, m₂, …, mₙ are the feature coefficients, b is the intercept, x₁, x₂, …, xₙ are the feature values (that is, the values of the independent variables), and y is the predicted value (that is, the dependent variable).

15.5.5 Testing the Model

15.5 Q9: Which of the following statements is false?
a. The following code tests a model by calling the estimator's predict method with the test samples as an argument:

predicted = linear_regression.predict(X_test)

b. Assuming the array expected contains the expected values for the samples used to make predictions in Part (a)'s snippet, evaluating the following snippets displays the first five predictions and their corresponding expected values:

In [32]: predicted[:5]
Out[32]: array([1.25396876, 2.34693107, 2.03794745, 1.8701254 , 2.53608339])
In [33]: expected[:5]
Out[33]: array([0.762, 1.732, 1.125, 1.37 , 1.856])

c. With classification, we saw that the predictions were distinct classes that matched existing classes in the dataset. With regression, it's tough to get exact predictions, because you have continuous outputs. Every possible value of x₁, x₂, …, xₙ in the calculation y = m₁x₁ + m₂x₂ + … + mₙxₙ + b predicts a different value.
d. All of the above statements are true.

15.5.6 Visualizing the Expected vs. Predicted Prices

No questions.

15.5.7 Regression Model Metrics

15.5 Q10: Which of the following statements is false?
a. Scikit-learn provides many metrics functions for evaluating how well estimators predict results and for comparing estimators to choose the best one(s) for your particular study.
b. Scikit-learn's metrics vary by estimator type.
c. The sklearn.metrics functions confusion_matrix and classification_report we used in the Digits dataset classification case study are two of many metrics functions specifically for evaluating regression estimators.
d. All of the above statements are true.

15.5 Q11: Which of the following statements is false?
a. Among the many metrics for regression estimators is the model's coefficient of determination, which is also called the R² score.
b. To calculate an estimator's R² score, the following code calls the sklearn.metrics module's r2_score function with the arrays representing the expected and predicted results:

In [44]: from sklearn import metrics

In [45]: metrics.r2_score(expected, predicted)
Out[45]: 0.6008983115964333

c. R² scores range from 0.0 to 1.0 with 1.0 being the best. An R² score of 1.0 indicates that the estimator perfectly predicts the independent variable's value, given the dependent variable(s) value(s). An R² score of 0.0 indicates the model cannot make predictions with any accuracy, based on the independent variables' values.
d. All of the above statements are true.

15.5 Q12: Which of the following statements is false?
a. Another common metric for regression models is the mean squared error, which calculates the difference between each expected and predicted value (this is called the error), squares each difference and calculates the average of the squared values.
b. To calculate an estimator's mean squared error, call function mean_squared_error (from module sklearn.metrics) with the arrays representing the expected and predicted results, as in:

In [46]: metrics.mean_squared_error(expected, predicted)
Out[46]: 0.5350149774449119

c. When comparing estimators with the mean squared error metric, the one with the value closest to 1 best fits your data.
d. All of the above statements are true.

15.5.8 Choosing the Best Model

15.5 Q13: Which of the following statements is false?
a. We can try several regression estimators to determine whether any produces better results than the LinearRegression estimator.
b. The following code from our example uses a linear_regression estimator we already created and creates ElasticNet, Lasso and Ridge regression estimators (all from the sklearn.linear_model module):

In [47]: from sklearn.linear_model import ElasticNet, Lasso, Ridge

In [48]: estimators = {
    ...:     'LinearRegression': linear_regression,
    ...:     'ElasticNet': ElasticNet(),
    ...:     'Lasso': Lasso(),
    ...:     'Ridge': Ridge()
    ...: }

c. The following code from our example runs the estimators using k-fold cross-validation with a KFold object and the cross_val_score function. The code passes to cross_val_score the additional keyword argument scoring='r2', which indicates that the function should report the R² scores for each fold; again, 1.0 is the best, so it appears that LinearRegression and Ridge are the best models for this dataset:

In [49]: from sklearn.model_selection import KFold, cross_val_score

In [50]: for estimator_name, estimator_object in estimators.items():
    ...:     kfold = KFold(n_splits=10, random_state=11, shuffle=True)
    ...:     scores = cross_val_score(estimator=estimator_object,
    ...:         X=california.data, y=california.target, cv=kfold,
    ...:         scoring='r2')
    ...:     print(f'{estimator_name:>16}: ' +
    ...:           f'mean of r2 scores={scores.mean():.3f}')
    ...:
LinearRegression: mean of r2 scores=0.599
      ElasticNet: mean of r2 scores=0.423
           Lasso: mean of r2 scores=0.285
           Ridge: mean of r2 scores=0.599

d. All of the above statements are true.

15.6 Case Study: Unsupervised Machine Learning, Part 1 – Dimensionality Reduction

15.6 Q1: Which of the following statements is false?
a. Unsupervised machine learning and visualization can help you get to know your data by finding patterns and relationships among unlabeled samples.
b. For datasets like the univariate time series we used earlier in this chapter, visualizing the data is easy. In that case, we had two variables, date and temperature, so we plotted the data in two dimensions with one variable along each axis.
c. Using Matplotlib, Seaborn and other visualization libraries, you also can plot datasets with three variables using 3D visualizations.
d. In the Digits dataset, every sample has 64 features (and a target value), so we cannot visualize the dataset.

15.6 Q2: Which of the following statements is false?
a. In big data, samples can have hundreds, thousands or even millions of features.
b. To visualize a dataset with many features (that is, many dimensions), you must first reduce the data to two or three dimensions. This requires a supervised machine learning technique called dimensionality reduction.
c. When you graph the resulting data after dimensionality reduction, you might see patterns in the data that will help you choose the most appropriate machine learning algorithms to use. For example, if the visualization contains clusters of points, it might indicate that there are distinct classes of information within the dataset.
d. All of the above statements are true.

15.6 Q3: Which of the following statements is false?
a. It's difficult for humans to think about data with large numbers of dimensions. This is called the curse of dimensionality.
b. If data has closely correlated features, some could be eliminated via dimensionality reduction to improve the training performance.
c. Eliminating features with dimensionality reduction improves the accuracy of the model.
d. All of the above statements are true.

15.6 Q4: Which of the following statements is false?
a. We can use the TSNE estimator (from the sklearn.manifold module) to perform dimensionality reduction. This estimator analyzes a dataset's features and reduces them to the specified number of dimensions.
b. The following code creates a TSNE object for reducing a dataset's features to two dimensions, as specified by the keyword argument n_components:

In [3]: from sklearn.manifold import TSNE

In [4]: tsne = TSNE(n_components=2, random_state=11)

c. The random_state keyword argument in Part (b) ensures the reproducibility of the "render sequence" when we display the digit clusters.
d. All of the above statements are true.

15.6 Q5: Which of the following statements is false?
a. Dimensionality reduction in scikit-learn typically involves two steps: training the estimator with the dataset, then using the estimator to transform the data into the specified number of dimensions.
b. The steps mentioned in Part (a) can be performed separately with the TSNE methods fit and transform, or they can be performed in one statement using the fit_transform method, as in:

In [5]: reduced_data = tsne.fit_transform(digits.data)

c. TSNE's fit_transform method takes some time to train the estimator then perform the reduction. When the method completes its task, it returns an array with the same number of rows as digits.data, but only two columns. You can confirm this by checking reduced_data's shape.
d. All of the above statements are true.

15.7 Case Study: Unsupervised Machine Learning, Part 2 – k-Means Clustering

15.7 Q1: Which of the following statements is false?
a. k-means clustering is perhaps the simplest unsupervised machine learning algorithm.
b. The k-means clustering algorithm analyzes unlabeled samples and attempts to place them in clusters that appear to be related.
c. The k in "k-means" represents the number of clusters you'd like to see imposed on your data.
d. The k-means clustering algorithm organizes samples into the number of clusters you specify in advance, using distance calculations similar to the k-nearest neighbors clustering algorithm.
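A hedged sketch of the TSNE reduction described in Q4 and Q5, followed by a scatter plot of the reduced Digits data; coloring the points by their known digit class and the choice of colormap are illustrative additions, not quiz code:

import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

digits = load_digits()
tsne = TSNE(n_components=2, random_state=11)

# fit_transform trains the estimator and reduces the 64 features to 2
reduced_data = tsne.fit_transform(digits.data)
print(reduced_data.shape)                    # (1797, 2)

# color each point by its digit class to see whether the clusters match the labels
dots = plt.scatter(reduced_data[:, 0], reduced_data[:, 1],
                   c=digits.target, cmap='nipy_spectral_r')
plt.colorbar(dots)
plt.show()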
15.7 Q2: Which of the following statements about the k-means clustering algorithm is false?
a. Each cluster of samples is grouped around a centroid, the cluster's center point.
b. Initially, the algorithm chooses k centroids at random from the dataset's samples. Then the remaining samples are placed in the cluster whose centroid is the closest.
c. The centroids are iteratively recalculated and the samples re-assigned to clusters until, for all clusters, the distances from a given centroid to the samples in its cluster are maximized.
d. The algorithm's results are a one-dimensional array of labels indicating the cluster to which each sample belongs, and a two-dimensional array of centroids representing the center of each cluster.

15.7 Q3: Which of the following statements is false?
a. The Iris dataset bundled with scikit-learn is commonly analyzed with both classification and clustering.
b. Although the Iris dataset is labeled, we can ignore those labels to demonstrate clustering. Then, we can use the labels to determine how well the k-means algorithm clustered the samples.
c. The Iris dataset is referred to as a "toy dataset" because it has only 150 samples and four features. The dataset describes 50 samples for each of three Iris flower species: Iris setosa, Iris versicolor and Iris virginica.
d. All of the above statements are true.

15.7.3 Visualizing the Dataset with a Seaborn pairplot

15.7 Q4: Which of the following statements is false?
a. One way to learn more about your data is to see how the features relate to one another.
b. The samples in the Iris dataset each have four features.
c. We cannot graph one feature against the other three in a single graph. But we can plot pairs of features against one another in a pairplot.
d. All of the above statements are true.

15.7 Q5: Which of the following statements is false?
a. The following code uses Seaborn function pairplot to create a grid of graphs plotting each feature against itself and the other specified features:

import seaborn as sns
grid = sns.pairplot(data=iris_df, vars=iris_df.columns[0:4], hue='species')

b. The pairplot keyword argument data is the DataFrame containing the data to plot.
c. The pairplot keyword argument vars is a sequence containing the names of the variables to plot. For a DataFrame, these are the names of the columns to plot. Here, we use the first five DataFrame columns, representing the sepal length, sepal width, petal length and petal width, respectively.
d. The pairplot keyword argument hue is the DataFrame column that's used to determine colors of the plotted data.

15.7.4 Using a KMeans Estimator

15.7 Q6: Which of the following statements is false?
a. We can use k-means clustering via scikit-learn's KMeans estimator (from the sklearn.cluster module) to place each sample in a dataset into a cluster. The KMeans estimator hides from you the algorithm's complex mathematical details, making it straightforward to use.
b. The following code creates a KMeans object:

from sklearn.cluster import KMeans
kmeans = KMeans(n_clusters=3, random_state=11)

c. The keyword argument n_clusters specifies the k-means clustering algorithm's hyperparameter k (in this case, 3), which KMeans requires to calculate the clusters and label each sample. The default value for n_clusters is 8.
d. All of the above statements are true.

15.7 Q7: Which of the following statements is false?
a. We train the KMeans estimator by calling the object's fit method; this performs the k-means algorithm.
b. As with the other estimators, the fit method returns the estimator object.
c. When the training completes, the KMeans object contains a labels_ array with values from 0 to n_clusters - 1 (in the Iris dataset example, 0–2), indicating the clusters to which the samples belong, and a cluster_centers_ array in which each row represents a cluster.
d. All of the above statements are true.

15.7 Q8: Which of the following statements is false?
a. Because the Iris dataset is labeled, we can look at its target array values to get a sense of how well the k-means algorithm clustered the samples for the three Iris species.
b. In the Iris dataset, the first 50 samples are Iris setosa, the next 50 are Iris versicolor, and the last 50 are Iris virginica.
c. If the KMeans estimator chose the Iris dataset clusters perfectly, then each group of 50 elements in the estimator's labels_ array should have mostly the same label.
d. All of the above statements are true.
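A short sketch tying Q6 through Q8 together, assuming the bundled Iris dataset; printing label counts per 50-sample group is just a convenient way to eyeball how the clusters line up with the three species:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

iris = load_iris()
kmeans = KMeans(n_clusters=3, random_state=11)
kmeans.fit(iris.data)                        # fit returns the estimator itself

# the first 50 samples are setosa, the next 50 versicolor, the last 50 virginica
for start in (0, 50, 100):
    labels, counts = np.unique(kmeans.labels_[start:start + 50], return_counts=True)
    for label, count in zip(labels, counts):
        print(f'samples {start}-{start + 49}: label={label}, count={count}')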
15.7.5 Dimensionality Reduction with Principal Component Analysis

15.7 Q9: Which of the following statements is false?
a. The PCA estimator (from the sklearn.decomposition module), like TSNE, performs dimensionality reduction. The PCA estimator uses an algorithm called principal component analysis to analyze a dataset's features and reduce them to the specified number of dimensions.
b. Like TSNE, a PCA estimator uses the keyword argument n_components to specify the number of dimensions, as in:

from sklearn.decomposition import PCA
pca = PCA(n_components=2, random_state=11)

c. The following snippets train the PCA estimator and produce the reduced data by calling the PCA estimator's fit and transform methods:

pca.fit(iris.data)
iris_pca = pca.transform(iris.data)

d. All of the above statements are true.

15.7 Q10: Which of the following statements is false?
a. You can use Seaborn's scatterplot function to plot the reduced Iris data in two dimensions.
b. The following code transforms the reduced data into a DataFrame and adds a species column:

iris_pca_df = pd.DataFrame(iris_pca, columns=['Component1', 'Component2'])
iris_pca_df['species'] = iris_df.species

c. The following code scatterplots the DataFrame in Part (b) using Seaborn:

In [39]: axes = sns.scatterplot(data=iris_pca_df, x='Component1',
    ...:     y='Component2', hue='species', legend='brief',
    ...:     palette='cool')

d. All of the above statements are true.

15.7 Q11: Which of the following statements is false?
a. Each centroid in the KMeans object's cluster_centers_ array has the same number of features as the original dataset (four in the case of the Iris dataset).
b. To plot the centroids in two dimensions, you must reduce their dimensions.
c. You can think of a centroid as the "mode" sample in its cluster.
d. Each centroid should be transformed using the same PCA estimator used to reduce the other samples in that cluster.
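A hedged sketch combining Q9 through Q11: reduce the Iris samples with PCA, reduce the KMeans centroids with the same PCA estimator, and overlay them. The marker size and color for the centroids are arbitrary illustrative choices:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

iris = load_iris()
kmeans = KMeans(n_clusters=3, random_state=11).fit(iris.data)

pca = PCA(n_components=2, random_state=11)
pca.fit(iris.data)
iris_pca = pca.transform(iris.data)                  # 150 rows, 2 columns

iris_pca_df = pd.DataFrame(iris_pca, columns=['Component1', 'Component2'])
iris_pca_df['species'] = [iris.target_names[t] for t in iris.target]

axes = sns.scatterplot(data=iris_pca_df, x='Component1', y='Component2',
                       hue='species', legend='brief', palette='cool')

# reduce the centroids with the same PCA estimator, then overlay them
centers = pca.transform(kmeans.cluster_centers_)
plt.scatter(centers[:, 0], centers[:, 1], s=100, c='k')
plt.show()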
15.7.6 Choosing the Best Clustering Estimator

15.7 Q12: Which of the following statements is false?
a. You can run multiple clustering algorithms and compare how well they cluster the three species of Iris flowers.
b. The following code creates a dictionary of clustering estimators:

from sklearn.cluster import DBSCAN, MeanShift,\
    SpectralClustering, AgglomerativeClustering

estimators = {
    'KMeans': kmeans,
    'DBSCAN': DBSCAN(),
    'MeanShift': MeanShift(),
    'SpectralClustering': SpectralClustering(n_clusters=3),
    'AgglomerativeClustering': AgglomerativeClustering(n_clusters=3)
}

c. Like KMeans, you specify the number of clusters in advance for the SpectralClustering and AgglomerativeClustering, DBSCAN and MeanShift estimators.
d. All of the above statements are true.

15.7 Q17: Which of the following statements is false?
a. Each iteration of the following loop calls one clustering estimator's fit method with iris.data as an argument, then uses NumPy's unique function to get the cluster labels and counts for the three groups of 50 samples and displays the results.

In [45]: import numpy as np

In [46]: for name, estimator in estimators.items():
    ...:     estimator.fit(iris.data)
    ...:     print(f'\n{name}:')
    ...:     for i in range(0, 101, 50):
    ...:         labels, counts = np.unique(
    ...:             estimator.labels_[i:i+50], return_counts=True)
    ...:         print(f'{i}-{i+50}:')
    ...:         for label, count in zip(labels, counts):
    ...:             print(f'   label={label}, count={count}')
    ...:

KMeans:
0-50:
   label=1, count=50
50-100:
   label=0, count=48
   label=2, count=2
100-150:
   label=0, count=14
   label=2, count=36

DBSCAN:
0-50:
   label=-1, count=1
   label=0, count=49
50-100:
   label=-1, count=6
   label=1, count=44
100-150:
   label=-1, count=10
   label=1, count=40

MeanShift:
0-50:
   label=1, count=50
50-100:
   label=0, count=49
   label=1, count=1
100-150:
   label=0, count=50

SpectralClustering:
0-50:
   label=2, count=50
50-100:
   label=1, count=50
100-150:
   label=0, count=35
   label=1, count=15

AgglomerativeClustering:
0-50:
   label=1, count=50
50-100:
   label=0, count=49
   label=2, count=1
100-150:
   label=0, count=15
   label=2, count=35

b. Interestingly, the output in Part (a) shows that DBSCAN correctly predicted three clusters (labeled -1, 0 and 1), though it placed 84 of the 100 Iris virginica and Iris versicolor samples in the same cluster.
c. The output in Part (a) shows that the MeanShift estimator, on the other hand, predicted only two clusters (labeled as 0 and 1), and placed 99 of the 100 Iris virginica and Iris versicolor samples in the same cluster.
d. All of the above statements are true.
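Beyond eyeballing label counts as in Q17, one common way to quantify how closely each clustering matches the known species labels is the adjusted Rand index from sklearn.metrics. This comparison is an added sketch, not part of the quiz:

from sklearn.cluster import (AgglomerativeClustering, DBSCAN, KMeans,
                             MeanShift, SpectralClustering)
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score

iris = load_iris()
estimators = {
    'KMeans': KMeans(n_clusters=3, random_state=11),
    'DBSCAN': DBSCAN(),
    'MeanShift': MeanShift(),
    'SpectralClustering': SpectralClustering(n_clusters=3),
    'AgglomerativeClustering': AgglomerativeClustering(n_clusters=3)
}

# adjusted Rand index: 1.0 means the clustering matches the species labels exactly,
# values near 0.0 mean the agreement is no better than chance
for name, estimator in estimators.items():
    estimator.fit(iris.data)
    print(f'{name:>23}: {adjusted_rand_score(iris.target, estimator.labels_):.3f}')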