Useful MLS-C01 Dumps (2024 V11.02) - Best Materials for Exam Preparation
AWS Certified Machine Learning - Specialty
1.A Machine Learning Specialist is working with multiple data sources containing
billions of records that need to be joined.
What feature engineering and model development approach should the Specialist
take with a dataset this large?
A. Use an Amazon SageMaker notebook for both feature engineering and model
development
B. Use an Amazon SageMaker notebook for feature engineering and Amazon ML for
model development
C. Use Amazon EMR for feature engineering and Amazon SageMaker SDK for model
development
D. Use Amazon ML for both feature engineering and model development.
Answer: C
Explanation:
Amazon EMR is a service that can process large amounts of data efficiently and cost-
effectively. It can run distributed frameworks such as Apache Spark, which can
perform feature engineering on big data. Amazon SageMaker SDK is a Python library
that can interact with Amazon SageMaker service to train and deploy machine
learning models. It can also use Amazon EMR as a data source for training data.
References:
Amazon EMR
Amazon SageMaker SDK
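To illustrate the second half of this pattern, here is a minimal sketch (not taken from the original question set) of launching a SageMaker training job on features that a Spark job on EMR has already written to Amazon S3, assuming the SageMaker Python SDK v2 and the built-in XGBoost image; the bucket paths and role ARN are placeholders.

import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder role ARN

# Features written to S3 by the EMR/Spark feature-engineering step (placeholder path)
train_s3 = "s3://my-bucket/emr-output/train/"

estimator = Estimator(
    image_uri=sagemaker.image_uris.retrieve("xgboost", session.boto_region_name, version="1.5-1"),
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/model-artifacts/",  # placeholder output location
    sagemaker_session=session,
)
estimator.fit({"train": TrainingInput(train_s3, content_type="text/csv")})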
2.A Machine Learning Specialist has completed a proof of concept for a company
using a small data sample and now the Specialist is ready to implement an end-to-
end solution in AWS using Amazon SageMaker. The historical training data is stored
in Amazon RDS.
Which approach should the Specialist use for training a model using that data?
A. Write a direct connection to the SQL database within the notebook and pull data in
B. Push the data from Microsoft SQL Server to Amazon S3 using an AWS Data
Pipeline and provide the S3 location within the notebook.
C. Move the data to Amazon DynamoDB and set up a connection to DynamoDB
within the notebook
to pull data in
D. Move the data to Amazon ElastiCache using AWS DMS and set up a connection
within the notebook to pull data in for fast access.
Answer: B
Explanation:
Pushing the data from Microsoft SQL Server to Amazon S3 using an AWS Data
Pipeline and providing the S3 location within the notebook is the best approach for
training a model using the data stored in Amazon RDS. This is because Amazon
SageMaker can directly access data from Amazon S3 and train models on it. AWS
Data Pipeline is a service that can automate the movement and transformation of
data between different AWS services. It can also use Amazon RDS as a data source
and Amazon S3 as a data destination. This way, the data can be transferred
efficiently and securely without writing any code within the notebook.
References:
Amazon SageMaker
AWS Data Pipeline
3.Which of the following metrics should a Machine Learning Specialist generally use
to compare/evaluate machine learning classification models against each other?
A. Recall
B. Misclassification rate
C. Mean absolute percentage error (MAPE)
D. Area Under the ROC Curve (AUC)
Answer: D
Explanation:
Area Under the ROC Curve (AUC) is a metric that measures the performance of a
binary classifier across all possible thresholds. It is also known as the probability that
a randomly chosen positive example will be ranked higher than a randomly chosen
negative example by the classifier. AUC is a good metric to compare different
classification models because it is independent of the class distribution and the
decision threshold. It also captures both the sensitivity (true positive rate) and the
specificity (true negative rate) of the model.
References:
AWS Machine Learning Specialty Exam Guide
AWS Machine Learning Specialty Sample Questions
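As a quick illustration, AUC can be computed from predicted scores with scikit-learn; the labels and scores below are made up.

from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.30, 0.90]  # predicted probabilities for the positive class
print(roc_auc_score(y_true, y_score))  # 1.0 is perfect ranking, 0.5 is random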
4.A Machine Learning Specialist is using Amazon SageMaker to host a model for a highly available customer-facing application.
The Specialist has trained a new version of the model, validated it with historical data, and now wants to deploy it to production. To limit any risk of a negative customer experience, the Specialist wants to be able to monitor the model and roll it back, if needed.
What is the SIMPLEST approach with the LEAST risk to deploy the model and roll it
back, if needed?
A. Create a SageMaker endpoint and configuration for the new model version.
Redirect production traffic to the new endpoint by updating the client configuration.
Revert traffic to the last version if the model does not perform as expected.
B. Create a SageMaker endpoint and configuration for the new model version.
Redirect production traffic to the new endpoint by using a load balancer. Revert traffic to the last version if the model does not perform as expected.
C. Update the existing SageMaker endpoint to use a new configuration that is
weighted to send 5% of the traffic to the new variant. Revert traffic to the last version
by resetting the weights if the model does not perform as expected.
D. Update the existing SageMaker endpoint to use a new configuration that is
weighted to send 100% of the traffic to the new variant. Revert traffic to the last version by resetting the weights if the model does not perform as expected.
Answer: C
Explanation:
Updating the existing SageMaker endpoint to use a new configuration that is weighted
to send 5% of the traffic to the new variant is the simplest approach with the least risk
to deploy the model and roll it back, if needed. This is because SageMaker supports
A/B testing, which allows the Specialist to compare the performance of different
model variants by sending a portion of the traffic to each variant. The Specialist can
monitor the metrics of each variant and adjust the weights accordingly. If the new
variant does not perform as expected, the Specialist can revert traffic to the last
version by resetting the weights to 100% for the old variant and 0% for the new
variant. This way, the Specialist can deploy the model without affecting the customer
experience and roll it back easily if needed.
References:
Amazon SageMaker
Deploying models to Amazon SageMaker hosting services
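A minimal boto3 sketch of shifting and resetting variant weights (the endpoint and variant names are placeholders); this is the mechanism behind the 5%/95% split described above.

import boto3

sm = boto3.client("sagemaker")

# Send 5% of traffic to the new variant for A/B testing
sm.update_endpoint_weights_and_capacities(
    EndpointName="my-endpoint",                       # placeholder
    DesiredWeightsAndCapacities=[
        {"VariantName": "existing-variant", "DesiredWeight": 95.0},
        {"VariantName": "new-variant", "DesiredWeight": 5.0},
    ],
)

# Roll back by resetting all traffic to the existing variant
sm.update_endpoint_weights_and_capacities(
    EndpointName="my-endpoint",
    DesiredWeightsAndCapacities=[
        {"VariantName": "existing-variant", "DesiredWeight": 100.0},
        {"VariantName": "new-variant", "DesiredWeight": 0.0},
    ],
)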
5.A manufacturing company has a large set of labeled historical sales data. The
manufacturer would like to predict how many units of a particular part should be
produced each quarter.
Which machine learning approach should be used to solve this problem?
A. Logistic regression
B. Random Cut Forest (RCF)
C. Principal component analysis (PCA)
D. Linear regression
Answer: D
Explanation:
Linear regression is a machine learning approach that can be used to solve this
problem. Linear regression is a supervised learning technique that can model the
relationship between one or more
input variables (features) and an output variable (target). In this case, the input
variables could be the historical sales data of the part, such as the quarter, the
demand, the price, the inventory, etc. The output variable could be the number of
units to be produced for the part. Linear regression can learn the coefficients
(weights) of the input variables that best fit the output variable, and then use them to
make predictions for new data. Linear regression is suitable for problems that involve
continuous and numeric output variables, such as predicting house prices, stock
prices, or sales volumes.
References:
AWS Machine Learning Specialty Exam Guide
Linear Regression
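A tiny scikit-learn sketch of the idea; the quarterly features and values below are illustrative only.

import numpy as np
from sklearn.linear_model import LinearRegression

# features: [quarter index, price, inventory]; target: units produced (made-up values)
X = np.array([[1, 9.5, 120], [2, 9.7, 100], [3, 9.4, 90],
              [4, 9.9, 130], [5, 10.1, 110], [6, 10.0, 95]])
y = np.array([1500, 1620, 1580, 1710, 1690, 1750])

model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)
print(model.predict([[7, 10.2, 105]]))  # forecast for the next quarter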
6.A manufacturing company has structured and unstructured data stored in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries
on this data.
Which solution requires the LEAST effort to be able to query this data?
A. Use AWS Data Pipeline to transform the data and Amazon RDS to run queries.
B. Use AWS Glue to catalogue the data and Amazon Athena to run queries
C. Use AWS Batch to run ETL on the data and Amazon Aurora to run the queries
D. Use AWS Lambda to transform the data and Amazon Kinesis Data Analytics to run
queries
Answer: B
Explanation:
AWS Glue is a serverless data integration service that can catalogue, clean, enrich,
and move data between various data stores. Amazon Athena is an interactive query
service that can run SQL queries on data stored in Amazon S3. By using AWS Glue
to catalogue the data and Amazon Athena to run queries, the Machine Learning
Specialist can leverage the existing data in Amazon S3 without any additional data
transformation or loading. This solution requires the least effort compared to the other
options, which involve more complex and costly data processing and storage
services.
References: AWS Glue, Amazon Athena
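A minimal boto3 sketch of the query step, assuming an AWS Glue crawler has already catalogued the S3 data; the database, table, and output location are placeholders.

import boto3

athena = boto3.client("athena")

response = athena.start_query_execution(
    QueryString="SELECT product_id, COUNT(*) AS n FROM sales_data GROUP BY product_id",
    QueryExecutionContext={"Database": "glue_catalog_db"},                      # placeholder Glue database
    ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},   # placeholder
)
print(response["QueryExecutionId"])  # poll get_query_execution with this ID for status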
7.A Machine Learning Specialist is packaging a custom ResNet model into a Docker
container so the company can leverage Amazon SageMaker for training. The
Specialist is using Amazon EC2 P3 instances to train the model and needs to
properly configure the Docker container to leverage the NVIDIA GPUs.
What does the Specialist need to do?
A. Bundle the NVIDIA drivers with the Docker image
B. Build the Docker container to be NVIDIA-Docker compatible
C. Organize the Docker container's file structure to execute on GPU instances.
D. Set the GPU flag in the Amazon SageMaker CreateTrainingJob request body
Answer: B
Explanation:
To leverage the NVIDIA GPUs on Amazon EC2 P3 instances, the Machine Learning
Specialist needs to build the Docker container to be NVIDIA-Docker compatible.
NVIDIA-Docker is a tool that enables GPU-accelerated containers to run on Docker. It
automatically configures the container to access the NVIDIA drivers and libraries on
the host system. The Specialist does not need to bundle the NVIDIA drivers with the
Docker image, as they are already installed on the EC2 P3 instances. The Specialist
does not need to organize the Docker container’s file structure to execute on GPU
instances, as this is not relevant for GPU compatibility. The Specialist does not need
to set the GPU flag in the Amazon SageMaker CreateTrainingJob request body, as
this is only required for using Elastic Inference accelerators, not EC2 P3 instances.
References: NVIDIA-Docker, Using GPU-Accelerated Containers, Using Elastic
Inference in Amazon SageMaker
8.A large JSON dataset for a project has been uploaded to a private Amazon S3
bucket. The Machine Learning Specialist wants to securely access and explore the
data from an Amazon SageMaker notebook instance. A new VPC was created and assigned to the Specialist.
How can the privacy and integrity of the data stored in Amazon S3 be maintained
while granting access to the Specialist for analysis?
A. Launch the SageMaker notebook instance within the VPC with SageMaker-provided internet access enabled. Use an S3 ACL to open read privileges to the everyone group.
B. Launch the SageMaker notebook instance within the VPC and create an S3 VPC endpoint for the notebook to access the data. Copy the JSON dataset from Amazon S3 into the ML storage volume on the SageMaker notebook instance and work against the local dataset.
C. Launch the SageMaker notebook instance within the VPC and create an S3 VPC endpoint for the notebook to access the data. Define a custom S3 bucket policy to only allow requests from your VPC to access the S3 bucket.
D. Launch the SageMaker notebook instance within the VPC with SageMaker-provided internet access enabled. Generate an S3 pre-signed URL for access to data in the bucket.
Answer: C
Explanation:
The best way to maintain the privacy and integrity of the data stored in Amazon S3 is
to use a combination of VPC endpoints and S3 bucket policies. A VPC endpoint
allows the SageMaker notebook instance to access the S3 bucket without going
through the public internet. A bucket policy allows the S3 bucket owner to specify
which VPCs or VPC endpoints can access the bucket. This way, the data is protected
from unauthorized access and tampering. The other options are either insecure (A
and D) or inefficient (B).
References: Using Amazon S3 VPC Endpoints, Using Bucket Policies and User
Policies
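A hedged boto3 sketch of the bucket-policy half of option C; the bucket name and VPC endpoint ID are placeholders, and a real deny-unless-endpoint policy should be reviewed carefully so that administrators are not locked out.

import json
import boto3

s3 = boto3.client("s3")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAccessExceptFromVpcEndpoint",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": ["arn:aws:s3:::my-ml-bucket", "arn:aws:s3:::my-ml-bucket/*"],
        "Condition": {"StringNotEquals": {"aws:sourceVpce": "vpce-0123456789abcdef0"}},
    }],
}
s3.put_bucket_policy(Bucket="my-ml-bucket", Policy=json.dumps(policy))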
9.Given the following confusion matrix for a movie classification model, what is the
true class frequency for Romance and the predicted class frequency for Adventure?
A. The true class frequency for Romance is 77.56% and the predicted class frequency for Adventure is 20.85%.
B. The true class frequency for Romance is 57.92% and the predicted class frequency for Adventure is 13.12%.
C. The true class frequency for Romance is 0.78 and the predicted class frequency for Adventure is (0.47 - 0.32).
D. The true class frequency for Romance is 77.56% * 0.78 and the predicted class frequency for Adventure is 20.85% * 0.32.
Answer: B
Explanation:
The true class frequency for Romance is the percentage of movies that are actually
Romance out of all the movies. This can be calculated by dividing the sum of the true
values for Romance by the total number of movies. The predicted class frequency for
Adventure is the percentage of movies that are predicted to be Adventure out of all
the movies. This can be calculated by dividing the sum of the predicted values for
Adventure by the total number of movies. Based on the confusion matrix, the true
class frequency for Romance is 57.92% and the predicted class frequency for
Adventure is 13.12%.
References: Confusion Matrix, Classification Metrics
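Since the confusion matrix itself is not reproduced here, the calculation can be illustrated with a hypothetical matrix; rows are true classes and columns are predicted classes.

import numpy as np

labels = ["Romance", "Adventure", "Other"]
cm = np.array([
    [58, 10, 12],   # true Romance
    [ 5, 11,  4],   # true Adventure
    [ 7,  6, 25],   # true Other
])

total = cm.sum()
true_freq = cm.sum(axis=1) / total   # row sums / total = true class frequency
pred_freq = cm.sum(axis=0) / total   # column sums / total = predicted class frequency
for name, t, p in zip(labels, true_freq, pred_freq):
    print(f"{name}: true {t:.2%}, predicted {p:.2%}")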
10.A Machine Learning Specialist is building a supervised model that will evaluate
customers' satisfaction with their mobile phone service based on recent usage. The
model's output should infer whether or not a customer is likely to switch to a
competitor in the next 30 days.
Which of the following modeling techniques should the Specialist use?
A. Time-series prediction
B. Anomaly detection
C. Binary classification
D. Regression
Answer: C
Explanation:
The modeling technique that the Machine Learning Specialist should use is binary
classification. Binary classification is a type of supervised learning that predicts
whether an input belongs to one of two possible classes. In this case, the input is the
customer’s recent usage data and the output is whether or not the customer is likely
to switch to a competitor in the next 30 days. This is a binary outcome, either yes or
no, so binary classification is suitable for this problem. The other options are not
appropriate for this problem. Time-series prediction is a type of supervised learning
that forecasts future values based on past and present data. Anomaly detection is a
type of unsupervised learning that identifies outliers or abnormal patterns in the data.
Regression is a type of supervised learning that estimates a continuous numerical
value based on the input features.
References: Binary Classification, Time Series Prediction, Anomaly Detection,
Regression
11.A web-based company wants to improve its conversion rate on its landing page. Using a large historical dataset of customer visits, the company has repeatedly trained a multi-class deep learning network algorithm on Amazon SageMaker. However, there is an overfitting problem: training data shows 90% accuracy in predictions, while test data shows only 70% accuracy.
The company needs to boost the generalization of its model before deploying it into production to maximize conversions of visits to purchases.
Which action is recommended to provide the HIGHEST accuracy model for the
company's test and validation data?
A. Increase the randomization of training data in the mini-batches used in training.
B. Allocate a higher proportion of the overall data to the training dataset
C. Apply L1 or L2 regularization and dropouts to the training.
D. Reduce the number of layers and units (or neurons) from the deep learning
network.
Answer: C
Explanation:
Regularization and dropouts are techniques that can help reduce overfitting in deep
learning models. Overfitting occurs when the model learns too much from the training
data and fails to generalize well to new data. Regularization adds a penalty term to
the loss function that penalizes the model for having large or complex weights. This
prevents the model from memorizing the noise or irrelevant features in the training
data. L1 and L2 are two types of regularization that differ in how they calculate the
penalty term. L1 regularization uses the absolute value of the weights, while L2
regularization uses the square of the weights. Dropouts are another technique that
randomly drops out some units or neurons from the network during training. This
creates a thinner network that is less prone to overfitting. Dropouts also act as a form
of ensemble learning, where multiple sub-models are combined to produce a better
prediction. By applying regularization and dropouts to the training, the web-based
company can improve the generalization and accuracy of its deep learning model on
the test and validation data.
References:
Regularization: A video that explains the concept and benefits of regularization in
deep learning.
Dropout: A video that demonstrates how dropout works and why it helps reduce
overfitting.
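A minimal Keras sketch of option C, assuming TensorFlow 2; the layer sizes, regularization strength, dropout rates, and number of classes are illustrative only.

import tensorflow as tf
from tensorflow.keras import layers, regularizers

model = tf.keras.Sequential([
    layers.Dense(128, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),  # L2 penalty on the weights
    layers.Dropout(0.5),                                     # randomly drop 50% of units during training
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.3),
    layers.Dense(10, activation="softmax"),                  # multi-class output
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])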
12.A Machine Learning Specialist was given a dataset consisting of unlabeled data.
The Specialist must create a model that can help the team classify the data into
different buckets.
What model should be used to complete this work?
A. K-means clustering
B. Random Cut Forest (RCF)
C. XGBoost
D. BlazingText
Answer: A
Explanation:
K-means clustering is a machine learning technique that can be used to classify
unlabeled data into different groups based on their similarity. It is an unsupervised
learning method, which means it does not require any prior knowledge or labels for
the data. K-means clustering works by randomly assigning data points to a number of
clusters, then iteratively updating the cluster centers and reassigning the data points
until the clusters are stable. The result is a partition of the data into distinct and
homogeneous groups. K-means clustering can be useful for exploratory data
analysis, data compression, anomaly detection, and feature extraction.
References:
K-Means Clustering: A tutorial on how to use K-means clustering with Amazon
SageMaker.
Unsupervised Learning: A video that explains the concept and applications of
unsupervised learning.
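A short scikit-learn sketch of clustering unlabeled records into buckets; the data is synthetic and the number of clusters is illustrative.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))            # unlabeled feature vectors

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)           # cluster assignment ("bucket") for each record
print(labels[:10])
print(kmeans.cluster_centers_.shape)     # (4, 5)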
13.A retail company intends to use machine learning to categorize new products. A labeled dataset of current products was provided to the Data Science team. The dataset includes 1,200 products. The labeled dataset has 15 features for each product, such as title, dimensions, weight, and price. Each product is labeled as belonging to one of six categories, such as books, games, electronics, and movies.
Which model should be used for categorizing new products using the provided
dataset for training?
A. An XGBoost model where the objective parameter is set to multi:softmax
B. A deep convolutional neural network (CNN) with a softmax activation function for
the last layer
C. A regression forest where the number of trees is set equal to the number of
product categories
D. A DeepAR forecasting model based on a recurrent neural network (RNN)
Answer: A
Explanation:
XGBoost is a machine learning framework that can be used for classification,
regression, ranking, and other tasks. It is based on the gradient boosting algorithm,
which builds an ensemble of weak learners (usually decision trees) to produce a
strong learner. XGBoost has several advantages over other algorithms, such as
scalability, parallelization, regularization, and sparsity handling. For categorizing new
products using the provided dataset, an XGBoost model would be a suitable choice,
because it can handle multiple features and multiple classes efficiently and
accurately. To train an XGBoost model for multi-class classification, the objective
parameter should be set to multi:softmax, which means that the model will compute a softmax probability distribution over the classes and output the class with the highest probability. Alternatively, the objective parameter can be set to multi:softprob, which
means that the model will output the raw probability of each class instead of the
predicted class label. This can be useful for evaluating the model performance or for
post-processing the predictions.
References:
XGBoost: A tutorial on how to use XGBoost with Amazon SageMaker.
XGBoost Parameters: A reference guide for the parameters of XGBoost.
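A minimal sketch with the open-source xgboost package; the 1,200 x 15 feature matrix and six category labels are random stand-ins for the product dataset.

import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1200, 15))
y = rng.integers(0, 6, size=1200)        # six product categories, labeled 0-5

dtrain = xgb.DMatrix(X, label=y)
params = {"objective": "multi:softmax", "num_class": 6, "max_depth": 6, "eta": 0.2}
booster = xgb.train(params, dtrain, num_boost_round=50)

preds = booster.predict(xgb.DMatrix(X[:5]))   # predicted class indices
print(preds)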
14.A Machine Learning Specialist is building a model to predict future employment
rates based on a wide range of economic factors. While exploring the data, the Specialist notices that the magnitudes of the input features vary greatly. The Specialist does not want variables with a larger magnitude to dominate the model.
What should the Specialist do to prepare the data for model training?
A. Apply quantile binning to group the data into categorical bins to keep any
relationships in the data by replacing the magnitude with distribution
B. Apply the Cartesian product transformation to create new combinations of fields
that are independent of the magnitude
C. Apply normalization to ensure each field will have a mean of 0 and a variance of 1
to remove any significant magnitude
D. Apply the orthogonal sparse bigram (OSB) transformation to apply a fixed-size sliding window to generate new features of a similar magnitude.
Answer: C
Explanation:
Normalization is a data preprocessing technique that can be used to scale the input
features to a common range, such as [-1, 1] or [0, 1]. Normalization can help reduce
the effect of outliers, improve the convergence of gradient-based algorithms, and
prevent variables with a larger magnitude from dominating the model. One common
method of normalization is standardization, which transforms each feature to have a
mean of 0 and a variance of 1. This can be done by subtracting the mean and dividing
by the standard deviation of each feature. Standardization can be useful for models
that assume the input features are normally distributed, such as linear regression,
logistic regression, and support vector machines.
References:
Data normalization and standardization: A video that explains the concept and
benefits of data normalization and standardization.
Standardize or Normalize?: A blog post that compares different methods of scaling
the input
features.
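A small scikit-learn sketch of standardization; the feature values are made up to show very different magnitudes.

import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[50000.0, 3.2, 0.01],
              [82000.0, 2.9, 0.04],
              [61000.0, 3.8, 0.02]])      # features on very different scales

X_scaled = StandardScaler().fit_transform(X)
print(X_scaled.mean(axis=0))              # approximately 0 for every column
print(X_scaled.std(axis=0))               # approximately 1 for every column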
15.A Machine Learning Specialist prepared the following graph displaying the results
of k-means for k = [1:10]
Considering the graph, what is a reasonable selection for the optimal choice of k?
A. 1
B. 4
C. 7
D. 10
Answer: B
Explanation:
The elbow method is a technique that we use to determine the number of centroids
(k) to use in a k-means clustering algorithm. In this method, we plot the within-cluster
sum of squares (WCSS) against the number of clusters (k) and look for the point
where the curve bends sharply. This point is called the elbow point and it indicates
that adding more clusters does not improve the model significantly. The graph in the
question shows that the elbow point is at k = 4, which means that 4 is a reasonable
choice for the optimal number of clusters.
References:
Elbow Method for optimal value of k in KMeans: A tutorial on how to use the elbow
method with Amazon SageMaker.
K-Means Clustering: A video that explains the concept and benefits of k-means
clustering.
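A sketch of how such a graph is produced with scikit-learn: fit k-means for k = 1..10 and plot the within-cluster sum of squares (inertia_); the synthetic data has four blobs, so the elbow appears near k = 4.

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=500, centers=4, n_features=2, random_state=0)

ks = range(1, 11)
wcss = [KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_ for k in ks]

plt.plot(ks, wcss, marker="o")
plt.xlabel("k")
plt.ylabel("WCSS")
plt.show()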
16.A company is using Amazon Polly to translate plaintext documents to speech for automated company announcements. However, company acronyms are being
mispronounced in the current documents.
How should a Machine Learning Specialist address this issue for future documents?
A. Convert current documents to SSML with pronunciation tags
B. Create an appropriate pronunciation lexicon.
C. Output speech marks to guide in pronunciation
D. Use Amazon Lex to preprocess the text files for pronunciation
Answer: B
Explanation:
A pronunciation lexicon is a file that defines how words or phrases should be
pronounced by Amazon Polly. A lexicon can help customize the speech output for
words that are uncommon, foreign, or have multiple pronunciations. A lexicon must
conform to the Pronunciation Lexicon Specification (PLS) standard and can be stored
in an AWS region using the Amazon Polly API. To use a lexicon for synthesizing
speech, the lexicon name must be specified in the <speak> SSML tag.
For example, the following lexicon defines how to pronounce the acronym W3C:
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon" alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>W3C</grapheme>
    <alias>World Wide Web Consortium</alias>
  </lexeme>
</lexicon>
To use this lexicon, the text input must include the following SSML tag:
<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice name="Joanna">
    <lexicon name="w3c_lexicon"/>
    The <say-as interpret-as="characters">W3C</say-as> is an international community that develops open standards to ensure the long-term growth of the Web.
  </voice>
</speak>
Reference: Customize pronunciation using lexicons in Amazon Polly: A blog post that
explains how to use lexicons for creating custom pronunciations.
Managing Lexicons: A documentation page that describes how to store and retrieve
lexicons using the Amazon Polly API.
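A hedged boto3 sketch of registering the lexicon above and applying it at synthesis time; the lexicon name, voice, and output file are placeholders.

import boto3

polly = boto3.client("polly")

lexicon_xml = """<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
         alphabet="ipa" xml:lang="en-US">
  <lexeme><grapheme>W3C</grapheme><alias>World Wide Web Consortium</alias></lexeme>
</lexicon>"""

polly.put_lexicon(Name="w3c_lexicon", Content=lexicon_xml)

response = polly.synthesize_speech(
    Text="The W3C is an international community.",
    VoiceId="Joanna",
    OutputFormat="mp3",
    LexiconNames=["w3c_lexicon"],
)
with open("announcement.mp3", "wb") as f:
    f.write(response["AudioStream"].read())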
17.A Machine Learning Specialist is using Apache Spark for pre-processing training data. As part of the Spark pipeline, the Specialist wants to use Amazon SageMaker for
training a model and hosting it.
Which of the following would the Specialist do to integrate the Spark application with
SageMaker? (Select THREE)
A. Download the AWS SDK for the Spark environment
B. Install the SageMaker Spark library in the Spark environment.
C. Use the appropriate estimator from the SageMaker Spark Library to train a model.
D. Compress the training data into a ZIP file and upload it to a pre-defined Amazon
S3 bucket.
E. Use the SageMakerModel.transform method to get inferences from the model
hosted in SageMaker
F. Convert the DataFrame object to a CSV file, and use the CSV file as input for
obtaining inferences from SageMaker.
Answer: B, C, E
Explanation:
The SageMaker Spark library is a library that enables Apache Spark applications to
integrate with Amazon SageMaker for training and hosting machine learning models.
The library provides several features, such as:
Estimators: Classes that allow Spark users to train Amazon SageMaker models and host them on Amazon SageMaker endpoints using the Spark MLlib Pipelines API. The library supports various built-in algorithms, such as linear learner, XGBoost, K-means, etc., as well as custom algorithms using Docker containers.
Model classes: Classes that wrap Amazon SageMaker models in a Spark MLlib Model abstraction. This allows Spark users to use Amazon SageMaker endpoints for inference within Spark applications.
Data sources: Classes that allow Spark users to read data from Amazon S3 using the Spark Data Sources API. The library supports various data formats, such as CSV, LibSVM, RecordIO, etc.
To integrate the Spark application with SageMaker, the Machine Learning Specialist
should do the following:
Install the SageMaker Spark library in the Spark environment. This can be done by
using Maven, pip, or downloading the JAR file from GitHub.
Use the appropriate estimator from the SageMaker Spark Library to train a model.
For example, to train a linear learner model, the Specialist can use the following code:
Use the SageMakerModel.transform method to get inferences from the model hosted in SageMaker.
For example, to get predictions for a test DataFrame, the Specialist can use the
following code:
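The code samples referenced above are not reproduced in this preview. As a stand-in, here is a minimal PySpark sketch of the estimator fit/transform pattern, assuming the sagemaker_pyspark package and its KMeansSageMakerEstimator (the linear learner estimators follow the same pattern); the role ARN, instance types, and S3 path are placeholders.

from pyspark.sql import SparkSession
from sagemaker_pyspark import IAMRole, classpath_jars
from sagemaker_pyspark.algorithms import KMeansSageMakerEstimator

spark = (SparkSession.builder
         .config("spark.driver.extraClassPath", ":".join(classpath_jars()))
         .getOrCreate())

# Training DataFrame produced by the Spark pre-processing pipeline (placeholder path)
training_df = spark.read.format("libsvm").option("numFeatures", "784").load(
    "s3a://my-bucket/train/data.libsvm")

estimator = KMeansSageMakerEstimator(
    sagemakerRole=IAMRole("arn:aws:iam::123456789012:role/SageMakerRole"),  # placeholder
    trainingInstanceType="ml.m5.xlarge",
    trainingInstanceCount=1,
    endpointInstanceType="ml.m5.large",
    endpointInitialInstanceCount=1)
estimator.setK(10)
estimator.setFeatureDim(784)

# fit() trains on SageMaker and deploys an endpoint; transform() calls that endpoint for inference
model = estimator.fit(training_df)
predictions = model.transform(training_df)
predictions.show()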
References:
[SageMaker Spark]: A documentation page that introduces the SageMaker Spark
library and its features.
[SageMaker Spark GitHub Repository]: A GitHub repository that contains the source
code, examples, and installation instructions for the SageMaker Spark library.
18.A Machine Learning Specialist is working with a large cybersecurity company that manages security events in real time for companies around the world. The cybersecurity company wants to design a solution that will allow it to use machine learning to score malicious events as anomalies on the data as it is being ingested. The company also wants to be able to save the results in its data lake for later processing and analysis.
What is the MOST efficient way to accomplish these tasks?
A. Ingest the data using Amazon Kinesis Data Firehose, and use Amazon Kinesis Data Analytics Random Cut Forest (RCF) for anomaly detection. Then use Kinesis Data Firehose to stream the results to Amazon S3.
B. Ingest the data into Apache Spark Streaming using Amazon EMR, and use Spark MLlib with k-means to perform anomaly detection. Then store the results in an Apache Hadoop Distributed File System (HDFS) using Amazon EMR with a replication factor of three as the data lake.
C. Ingest the data and store it in Amazon S3. Use AWS Batch along with the AWS Deep Learning AMIs to train a k-means model using TensorFlow on the data in Amazon S3.
D. Ingest the data and store it in Amazon S3. Have an AWS Glue job that is triggered on demand transform the new data. Then use the built-in Random Cut Forest (RCF) model within Amazon SageMaker to detect anomalies in the data.
Answer: A
Explanation:
Amazon Kinesis Data Firehose is a fully managed service that can capture, transform,
and load streaming data into AWS data stores, such as Amazon S3, Amazon
Redshift, Amazon Elasticsearch Service, and Splunk. It can also invoke AWS Lambda
functions to perform custom transformations on the data. Amazon Kinesis Data
Analytics is a service that can analyze streaming data in real time using SQL or
Apache Flink applications. It can also use machine learning algorithms, such as
Random Cut Forest (RCF), to perform anomaly detection on streaming data. RCF is
an unsupervised learning algorithm that assigns an anomaly score to each data point
based on how different it is from the rest of the data. By using Kinesis Data Firehose
and Kinesis Data Analytics, the cybersecurity company can ingest the data in real
time, score the malicious events as anomalies, and stream the results to Amazon S3,
which can serve as a data lake for later processing and analysis. This is the most
efficient way to accomplish these tasks, as it does not require any additional
infrastructure, coding, or training.
References:
Amazon Kinesis Data Firehose - Amazon Web Services
Amazon Kinesis Data Analytics - Amazon Web Services
Anomaly Detection with Amazon Kinesis Data Analytics - Amazon Web Services
[AWS Certified Machine Learning - Specialty Sample Questions]
19.A Machine Learning Specialist works for a credit card processing company and
needs to predict which transactions may be fraudulent in near-real time. Specifically,
the Specialist must train a model that returns the probability that a given transaction
may be fraudulent.
How should the Specialist frame this business problem?
A. Streaming classification
B. Binary classification
C. Multi-category classification
D. Regression classification
Answer: B
Explanation:
Binary classification is a type of supervised learning problem where the goal is to
predict a categorical label that has only two possible values, such as Yes or No, True
or False, Positive or Negative. In this case, the label is whether a transaction is
fraudulent or not, which is a binary outcome. Binary classification can be used to
estimate the probability of an observation belonging to a certain class, such as the
probability of a transaction being fraudulent. This can help the business to make
decisions based on the risk level of each transaction.
References:
Binary Classification - Amazon Machine Learning
AWS Certified Machine Learning - Specialty Sample Questions
20.Amazon Connect has recently been rolled out across a company as a contact call center. The solution has been configured to store voice call recordings on Amazon S3. The content of the voice calls is being analyzed for the incidents being discussed by the call operators. Amazon Transcribe is being used to convert the audio to text, and the output is stored on Amazon S3.
Which approach will provide the information required for further analysis?
A. Use Amazon Comprehend with the transcribed files to build the key topics
B. Use Amazon Translate with the transcribed files to train and build a model for the
key topics
C. Use the AWS Deep Learning AMI with Gluon Semantic Segmentation on the
transcribed files to train and build a model for the key topics
D. Use the Amazon SageMaker k-Nearest-Neighbors (kNN) algorithm on the
transcribed files to generate a word embeddings dictionary for the key topics
Answer: A
Explanation:
Amazon Comprehend is a natural language processing (NLP) service that uses
machine learning to find insights and relationships in text. It can analyze text
documents and identify the key topics, entities, sentiments, languages, and more. In
this case, Amazon Comprehend can be used with the transcribed files from Amazon
Transcribe to extract the main topics that are being discussed by the call operators.
This can help to understand the common issues and concerns of the customers, and
provide insights for further analysis and improvement.
References:
Amazon Comprehend - Amazon Web Services
AWS Certified Machine Learning - Specialty Sample Questions
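A hedged boto3 sketch of launching a Comprehend topic-detection job over the Transcribe output in S3; the bucket paths, role ARN, and job settings are placeholders.

import boto3

comprehend = boto3.client("comprehend")

response = comprehend.start_topics_detection_job(
    InputDataConfig={
        "S3Uri": "s3://call-center-bucket/transcripts/",       # placeholder transcript location
        "InputFormat": "ONE_DOC_PER_FILE",
    },
    OutputDataConfig={"S3Uri": "s3://call-center-bucket/topics-output/"},      # placeholder
    DataAccessRoleArn="arn:aws:iam::123456789012:role/ComprehendS3AccessRole", # placeholder
    NumberOfTopics=10,
    JobName="call-transcript-topics",
)
print(response["JobId"])  # results are written to the output S3 location when the job completes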
21.A Machine Learning Specialist is building a prediction model for a large number of features using linear models, such as linear regression and logistic regression. During exploratory data analysis, the Specialist observes that many features are highly correlated with each other. This may make the model unstable.
What should be done to reduce the impact of having such a large number of features?
A. Perform one-hot encoding on highly correlated features
B. Use matrix multiplication on highly correlated features.
C. Create a new feature space using principal component analysis (PCA)
D. Apply the Pearson correlation coefficient
Answer: C
Explanation:
Principal component analysis (PCA) is an unsupervised machine learning algorithm
that attempts to reduce the dimensionality (number of features) within a dataset while
still retaining as much information as possible. This is done by finding a new set of
features called components, which are composites of the original features that are
uncorrelated with one another. They are also constrained so that the first component
accounts for the largest possible variability in the data, the second component the
second most variability, and so on. By using PCA, the impact of having a large
number of features that are highly correlated with each other can be reduced, as the
new feature space will have fewer dimensions and less redundancy. This can make
the linear models more stable and less prone to overfitting.
References:
Principal Component Analysis (PCA) Algorithm - Amazon SageMaker
Perform a large-scale principal component analysis faster using Amazon SageMaker |
AWS Machine Learning Blog
Machine Learning - Principal Component Analysis | i2tutorials
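A short scikit-learn sketch of replacing correlated features with principal components; the synthetic data deliberately duplicates columns so the correlation is obvious.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base = rng.normal(size=(500, 5))
noise = rng.normal(scale=0.01, size=(500, 5))
X = np.hstack([base, 1.5 * base + noise])   # ten features, five of them nearly duplicated

pca = PCA(n_components=0.95)                # keep components explaining 95% of the variance
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)
print(pca.explained_variance_ratio_)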
22.A Machine Learning Specialist wants to determine the appropriate SageMaker
Variant Invocations Per Instance setting for an endpoint automatic scaling
configuration. The Specialist has performed a load test on a single instance and
determined that peak requests per second (RPS) without service degradation is about 20 RPS. As this is the first deployment, the Specialist intends to set the invocation safety factor to 0.5.
Based on the stated parameters and given that the invocations per instance setting is
measured on a per-minute basis, what should the Specialist set as the SageMakerVariantInvocationsPerInstance setting?
A. 10
B. 30
C. 600
D. 2,400
Answer: C
Explanation:
The SageMaker Variant Invocations Per Instance setting is the target value for the
average number of invocations per instance per minute for the model variant. It is
used by the automatic scaling policy to add or remove instances to keep the metric
close to the specified value. To determine this value, the following equation can be
used in combination with load testing: SageMakerVariantInvocationsPerInstance =
(MAX_RPS * SAFETY_FACTOR) * 60
Where MAX_RPS is the maximum requests per second that the model variant can
handle without service degradation, SAFETY_FACTOR is a factor that ensures that
the clients do not exceed the maximum RPS, and 60 is the conversion factor from
seconds to minutes. In this case, the given parameters are:
MAX_RPS = 20
SAFETY_FACTOR = 0.5
Plugging these values into the equation, we get:
SageMakerVariantInvocationsPerInstance = (20 * 0.5) * 60
SageMakerVariantInvocationsPerInstance = 600
Therefore, the Specialist should set the SageMaker Variant Invocations Per Instance
setting to 600.
Reference: Load testing your auto scaling configuration - Amazon SageMaker
Configure model auto scaling with the console - Amazon SageMaker
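The same calculation as a short Python check, using the values from the question's load test:

max_rps = 20            # peak requests per second from the load test
safety_factor = 0.5     # first-deployment safety factor
invocations_per_instance = int(max_rps * safety_factor * 60)  # convert to a per-minute target
print(invocations_per_instance)  # 600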
23.A Machine Learning Specialist deployed a model that provides product recommendations on a company's website. Initially, the model was performing very well and resulted in customers buying more products on average. However, within the past few months the Specialist has noticed that the effect of product recommendations has diminished and customers are starting to return to their original habits of spending less. The Specialist is unsure of what happened, as the model has not changed from its initial deployment over a year ago.
Which method should the Specialist try to improve model performance?
A. The model needs to be completely re-engineered because it is unable to handle
product inventory changes
B. The model's hyperparameters should be periodically updated to prevent drift
C. The model should be periodically retrained from scratch using the original data
while adding a regularization term to handle product inventory changes
D. The model should be periodically retrained using the original training data plus new
data as product inventory changes
Answer: D
Explanation:
The problem that the Machine Learning Specialist is facing is likely due to concept
drift, which is a phenomenon where the statistical properties of the target variable
change over time, making the model less accurate and relevant. Concept drift can
occur due to various reasons, such as changes in customer preferences, market
trends, product inventory, seasonality, etc. In this case, the product recommendations
model may have become outdated as the product inventory changed over time,
making the recommendations less appealing to the customers. To address this issue,
the model should be periodically retrained using the original training data plus new
data as product inventory changes. This way, the model can learn from the latest data
and adapt to the changing customer behavior and preferences. Retraining the model
from scratch using the original data while adding a regularization term may not be
sufficient, as it does not account for the new data. Updating the model’s
hyperparameters may not help either, as it does not address the underlying data
distribution change. Re-engineering the model completely may not be necessary, as
the model may still be valid and useful with periodic retraining.
Reference: Concept Drift - Amazon SageMaker
Detecting and Handling Concept Drift - Amazon SageMaker
Machine Learning Concepts - Amazon Machine Learning
24.A manufacturer of car engines collects data from cars as they are being driven.
The data collected includes timestamp, engine temperature, rotations per minute
(RPM), and other sensor readings. The company wants to predict when an engine is
going to have a problem so it can notify drivers in advance to get engine
maintenance. The engine data is loaded into a data lake for training.
Which is the MOST suitable predictive model that can be deployed into production?
A. Add labels over time to indicate which engine faults occur at what time in the future to turn this into a supervised learning problem. Use a recurrent neural network (RNN) to train the model to recognize when an engine might need maintenance for a certain fault.
B. This data requires an unsupervised learning algorithm. Use Amazon SageMaker k-means to cluster the data.
C. Add labels over time to indicate which engine faults occur at what time in the future to turn this into a supervised learning problem. Use a convolutional neural network (CNN) to train the model to recognize when an engine might need maintenance for a certain fault.
D. This data is already formulated as a time series. Use Amazon SageMaker seq2seq to model the time series.
Answer: A
Explanation:
A recurrent neural network (RNN) is a type of neural network that can process
sequential data, such as time series, by maintaining a hidden state that captures the
temporal dependencies between the inputs. RNNs are well suited for predicting future
events based on past observations, such as forecasting engine failures based on
sensor readings. To train an RNN model, the data needs to be labeled with the target
variable, which in this case is the type and time of the engine fault. This makes the
problem a supervised learning problem, where the goal is to learn a mapping from the
input sequence (sensor readings) to the output sequence (engine faults). By using an
RNN model, the manufacturer can leverage the temporal information in the data and
detect patterns that indicate when an engine might need maintenance for a certain
fault.
Reference: Recurrent Neural Networks - Amazon SageMaker
Use Amazon SageMaker Built-in Algorithms or Pre-trained Models
Recurrent Neural Network Definition | DeepAI
What are Recurrent Neural Networks? An Ultimate Guide for Newbies!
Lee and Carter go Machine Learning: Recurrent Neural Networks - SSRN
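A minimal Keras sketch of the supervised RNN framing, assuming TensorFlow 2; the window length, sensor count, and labels are synthetic placeholders for the labeled engine data.

import numpy as np
import tensorflow as tf

# 1,000 windows of 60 timesteps x 3 sensor readings, labeled 0/1 for "fault expected soon"
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 60, 3)).astype("float32")
y = rng.integers(0, 2, size=1000)

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(60, 3)),       # recurrent layer over the sensor sequence
    tf.keras.layers.Dense(1, activation="sigmoid"),      # probability of an upcoming fault
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2, batch_size=32)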
25.A Data Scientist is working on an application that performs sentiment analysis. The
validation accuracy is poor and the Data Scientist thinks that the cause may be a rich
vocabulary and a low average frequency of words in the dataset.
Which tool should be used to improve the validation accuracy?
A. Amazon Comprehend syntax analysis and entity detection
B. Amazon SageMaker BlazingText allow mode
C. Natural Language Toolkit (NLTK) stemming and stop word removal
D. Scikit-learn term frequency-inverse document frequency (TF-IDF) vectorizers
Answer: D
Explanation:
Term frequency-inverse document frequency (TF-IDF) is a technique that assigns a
weight to each word in a document based on how important it is to the meaning of the
document. The term frequency (TF) measures how often a word appears in a
document, while the inverse document frequency (IDF) measures how rare a word is
across a collection of documents. The TF-IDF weight is the product of the TF and IDF
values, and it is high for words that are frequent in a specific document but rare in the
overall corpus. TF-IDF can help improve the validation accuracy of a sentiment
analysis model by reducing the impact of common words that have little or no
sentiment value, such as “the”, “a”, “and”, etc. Scikit-learn is a popular Python
library for machine learning that provides a TF-IDF vectorizer class that can transform
a collection of text documents into a matrix of TF-IDF features. By using this tool, the
Data Scientist can create a more informative and discriminative feature representation
for the sentiment analysis task.
Reference: TfidfVectorizer - scikit-learn
Text feature extraction - scikit-learn
TF-IDF for Beginners | by Jana Schmidt | Towards Data Science
Sentiment Analysis: Concept, Analysis and Applications | by Susan Li | Towards Data
Science
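A minimal scikit-learn TF-IDF sketch; the example documents are made up.

from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "the service was great and the staff was friendly",
    "terrible support, the app keeps crashing",
    "great app, friendly support",
]
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)            # sparse matrix of TF-IDF weights
print(vectorizer.get_feature_names_out())
print(X.shape)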
26.A Machine Learning Specialist is developing a recommendation engine for a photography blog. Given a picture, the recommendation engine should show a picture that captures similar objects. The Specialist would like to create a numerical representation feature to perform nearest-neighbor searches.
What actions would allow the Specialist to get relevant numerical representations?
A. Reduce image resolution and use reduced resolution pixel values as features
B. Use Amazon Mechanical Turk to label image content and create a one-hot
representation indicating the presence of specific labels
C. Run images through a neural network pre-trained on ImageNet, and collect the feature vectors from the penultimate layer
D. Average colors by channel to obtain three-dimensional representations of images.
Answer: C
Explanation:
A neural network pre-trained on ImageNet is a deep learning model that has been
trained on a large dataset of images containing 1000 classes of objects. The model
can learn to extract high-level features from the images that capture the semantic and
visual information of the objects. The penultimate layer of the model is the layer
before the final output layer, and it contains a feature vector that represents the input
image in a lower-dimensional space. By running images through a pre-trained neural
network and collecting the feature vectors from the penultimate layer, the Specialist
can obtain relevant numerical representations that can be used for nearest-neighbor
searches. The feature vectors can capture the similarity between images based on
the presence and appearance of similar objects, and they can be compared using
distance metrics such as Euclidean distance or cosine similarity. This approach can
enable the recommendation engine to show a picture that captures similar objects to
a given picture.
Reference: ImageNet - Wikipedia
How to use a pre-trained neural network to extract features from images | by Rishabh
Anand | Analytics Vidhya | Medium
Image Similarity using Deep Ranking | by Aditya Oke | Towards Data Science
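A hedged Keras sketch of extracting a feature vector from a network pre-trained on ImageNet (ResNet50 is used here as one such network); the image path is a placeholder.

import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.preprocessing import image

# include_top=False removes the 1000-class output layer; pooling="avg" yields one vector per image
model = ResNet50(weights="imagenet", include_top=False, pooling="avg")

img = image.load_img("photo.jpg", target_size=(224, 224))      # placeholder image path
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))

features = model.predict(x)    # (1, 2048) feature vector usable for nearest-neighbor search
print(features.shape)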
27.A gaming company has launched an online game where people can start playing
for free but they need to pay if they choose to use certain features. The company
needs to build an automated system to predict whether or not a new user will become
a paid user within 1 year. The company has gathered a labeled dataset from 1 million users.
The training dataset consists of 1,000 positive samples (from users who ended up
paying within 1 year) and 999,000 negative samples (from users who did not use any paid features). Each data sample consists of 200 features including user age, device, location, and play patterns.
Using this dataset for training, the Data Science team trained a random forest model that converged with over 99% accuracy on the training set. However, the prediction results on a test dataset were not satisfactory.
Which of the following approaches should the Data Science team take to mitigate this
issue? (Select TWO.)
A. Add more deep trees to the random forest to enable the model to learn more
features.
B. Include a copy of the samples in the test dataset in the training dataset
C. Generate more positive samples by duplicating the positive samples and adding a
small amount of noise to the duplicated data.
D. Change the cost function so that false negatives have a higher impact on the cost
value than false positives
E. Change the cost function so that false positives have a higher impact on the cost
value than false negatives
Answer: C, D
Explanation:
The Data Science team is facing a problem of imbalanced data, where the positive
class (paid users) is much less frequent than the negative class (non-paid users).
This can cause the random forest model to be biased towards the majority class and
have poor performance on the minority class. To mitigate this issue, the Data Science
team can try the following approaches:
C) Generate more positive samples by duplicating the positive samples and adding a
small amount of noise to the duplicated data. This is a technique called data
augmentation, which can help increase the size and diversity of the training data for
the minority class. This can help the random forest model learn more features and
patterns from the positive class and reduce the imbalance ratio.
D) Change the cost function so that false negatives have a higher impact on the cost
value than false positives. This is a technique called cost-sensitive learning, which
can assign different weights or costs to different classes or errors. By assigning a
higher cost to false negatives (predicting non-paid when the user is actually paid), the
random forest model can be more sensitive to the minority class and try to minimize
the misclassification of the positive class.
Reference: Bagging and Random Forest for Imbalanced Classification
Surviving in a Random Forest with Imbalanced Datasets
machine learning - random forest for imbalanced data? - Cross Validated
Biased Random Forest For Dealing With the Class Imbalance Problem
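A sketch of both mitigations with scikit-learn: noisy duplication of the positive class (C) and a cost-sensitive class weighting (D); the data and weights are synthetic and illustrative.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 200))
y = np.zeros(10000, dtype=int)
y[:10] = 1                                   # heavily imbalanced positive class

# (C) duplicate the positive samples with a small amount of Gaussian noise
X_pos, y_pos = X[y == 1], y[y == 1]
X_aug = np.vstack([X, X_pos + rng.normal(scale=0.01, size=X_pos.shape)])
y_aug = np.concatenate([y, y_pos])

# (D) make false negatives costlier by weighting the positive class higher
clf = RandomForestClassifier(n_estimators=100, class_weight={0: 1, 1: 50}, random_state=0)
clf.fit(X_aug, y_aug)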
28.While reviewing the histogram for residuals on regression evaluation data, a Machine Learning Specialist notices that the residuals do not form a zero-centered
bell shape as shown.
What does this mean?
A. The model might have prediction errors over a range of target values.
B. The dataset cannot be accurately represented using the regression model
C. There are too many variables in the model
D. The model is predicting its target values perfectly.
Answer: A
Explanation:
Residuals are the differences between the actual and predicted values of the target
variable in a regression model. A histogram of residuals is a graphical tool that can
help evaluate the performance and assumptions of the model. Ideally, the histogram
of residuals should have a zero-centered bell shape, which indicates that the
residuals are normally distributed with a mean of zero and a constant variance. This
means that the model has captured the true relationship between the input and output
variables, and that the errors are random and unbiased. However, if the histogram of
residuals does not have a zero-centered bell shape, as shown in the image, this
means that the model might have prediction errors over a range of target values. This
is because the residuals do not form a symmetrical and homogeneous distribution
around zero, which implies that the model has some systematic bias or
heteroscedasticity. This can affect the accuracy and validity of the model, and indicate
that the model needs to be improved or modified.
References:
Residual Analysis in Regression - Statistics By Jim
How to Check Residual Plots for Regression Analysis - dummies
Histogram of Residuals - Statistics How To
29.During mini-batch training of a neural network for a classification problem, a Data
Scientist notices that training accuracy oscillates.
What is the MOST likely cause of this issue?
A. The class distribution in the dataset is imbalanced
B. Dataset shuffling is disabled
C. The batch size is too big
D. The learning rate is very high
Answer: D
Explanation:
Mini-batch gradient descent is a variant of gradient descent that updates the model
parameters using a subset of the training data (called a mini-batch) at each iteration.
The learning rate is a hyperparameter that controls how much the model parameters
change in response to the gradient. If the learning rate is very high, the model
parameters may overshoot the optimal values and oscillate around the minimum of
the cost function. This can cause the training accuracy to fluctuate and prevent the
model from converging to a stable solution. To avoid this issue, the learning rate
should be chosen carefully, such as by using a learning rate decay schedule or an
adaptive learning rate algorithm1. Alternatively, the batch size can be increased to
reduce the variance of the gradient estimates2. However, the batch size should not
be too big, as this can slow down the training process and reduce the generalization
ability of the model3. Dataset shuffling and class distribution are not likely to cause
oscillations in training accuracy, as they do not affect the gradient updates directly.
Dataset shuffling can help avoid getting stuck in local minima and improve the
convergence speed of mini-batch gradient descent4. Class distribution can affect the
performance and fairness of the model, especially if the dataset is imbalanced, but it
does not necessarily cause fluctuations in training accuracy.
30.A Machine Learning Specialist observes several performance problems with the
training portion of a machine learning solution on Amazon SageMaker. The solution
uses a large training dataset, 2 TB in size, and is using the SageMaker k-means algorithm. The observed issues include the unacceptable length of time it takes before the training job launches and poor I/O throughput while training the model.
What should the Specialist do to address the performance issues with the current
solution?
A. Use the SageMaker batch transform feature
B. Compress the training data into Apache Parquet format.
C. Ensure that the input mode for the training job is set to Pipe.
D. Copy the training dataset to an Amazon EFS volume mounted on the SageMaker
instance.
Answer: C
Explanation:
The input mode for the training job determines how the training data is transferred
from Amazon S3 to the SageMaker instance. There are two input modes: File and
Pipe. File mode copies the entire training dataset from S3 to the local file system of
the instance before starting the training job. This can cause a long delay before the
training job launches, especially if the dataset is large. Pipe mode streams the data
from S3 to the instance as the training job runs. This can reduce the startup time and
improve the I/O throughput, as the data is read in smaller batches. Therefore, to
address the performance issues with the current solution, the Specialist should
ensure that the input mode for the training job is set to Pipe. This can be done by
using the SageMaker Python SDK and setting the input_mode parameter to Pipe
when creating the estimator or the fit method12. Alternatively, this can be done by
using the AWS CLI and setting the InputMode parameter to Pipe when creating the
training job3.
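A minimal sketch of enabling Pipe mode with the SageMaker Python SDK (v2-style parameter names) might look like the following; the IAM role ARN, S3 path, instance type, and hyperparameter values are placeholders, and parameter names differ slightly in older SDK versions.

```python
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()
region = session.boto_region_name

# Placeholder role and data location -- replace with real values.
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"
train_data = "s3://example-bucket/kmeans/train/"

# Built-in k-means image for the current region.
image_uri = sagemaker.image_uris.retrieve("kmeans", region)

estimator = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    input_mode="Pipe",          # stream data from S3 instead of copying it first
    sagemaker_session=session,
)
estimator.set_hyperparameters(k=10, feature_dim=784)   # illustrative values
estimator.fit({"train": train_data})
```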
Reference: Access Training Data - Amazon SageMaker
Choosing Data Input Mode Using the SageMaker Python SDK - Amazon SageMaker
CreateTrainingJob - Amazon SageMaker Service
31.A Machine Learning Specialist is building a convolutional neural network (CNN)
that will classify 10 types of animals. The Specialist has built a series of layers in a
neural network that will take an input image of an animal, pass it through a series of
convolutional and pooling layers, and then finally pass it through a dense and fully
connected layer with 10 nodes. The Specialist would like to get an output from the
neural network that is a probability distribution of how likely it is that the input image
belongs to each of the 10 classes
Which function will produce the desired output?
A. Dropout
B. Smooth L1 loss
C. Softmax
D. Rectified linear units (ReLU)
Answer: C
Explanation:
The softmax function is a function that can transform a vector of arbitrary real values
into a vector of real values in the range (0,1) that sum to 1. This means that the
softmax function can produce a valid probability distribution over multiple classes. The
softmax function is often used as the activation function of the output layer in a neural
network, especially for multi-class classification problems. The softmax function can
assign higher probabilities to the classes with higher scores, which allows the network
to make predictions based on the most likely class. In this case, the Machine Learning
Specialist wants to get an output from the neural network that is a probability
distribution of how likely it is that the input image belongs to each of the 10 classes of
animals. Therefore, the softmax function is the most suitable function to produce the
desired output.
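A minimal NumPy sketch of the softmax function applied to the 10 output scores (the scores here are made up for illustration):

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into a probability distribution that sums to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Example: raw scores from the 10-node output layer for one image.
scores = np.array([2.0, 1.0, 0.1, -1.2, 0.5, 0.0, 3.1, -0.4, 1.7, 0.3])
probs = softmax(scores)
print(probs, probs.sum())   # probabilities for the 10 classes, summing to 1.0
```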
References:
Softmax Activation Function for Deep Learning: A Complete Guide
What is Softmax in Machine Learning? - reason.town
machine learning - Why is the softmax function often used as activation …
Multi-Class Neural Networks: Softmax | Machine Learning | Google for …
32.A Machine Learning Specialist is building a model that will perform time series
forecasting using Amazon SageMaker. The Specialist has finished training the model
and is now planning to perform load testing on the endpoint so they can configure
Auto Scaling for the model variant.
Which approach will allow the Specialist to review the latency, memory utilization, and
CPU utilization during the load test?
A. Review SageMaker logs that have been written to Amazon S3 by leveraging
Amazon Athena and Amazon QuickSight to visualize logs as they are being produced
B. Generate an Amazon CloudWatch dashboard to create a single view for the
latency, memory utilization, and CPU utilization metrics that are outputted by Amazon
SageMaker
C. Build custom Amazon CloudWatch Logs and then leverage Amazon ES and
Kibana to query and visualize the data as it is generated by Amazon SageMaker
D. Send Amazon CloudWatch Logs that were generated by Amazon SageMaker to
Amazon ES and use Kibana to query and visualize the log data.
Answer: B
Explanation:
Amazon CloudWatch is a service that can monitor and collect various metrics and
logs from AWS resources, such as Amazon SageMaker. Amazon CloudWatch can
also generate dashboards to create a single view for the metrics and logs that are of
interest. By using Amazon CloudWatch, the Machine Learning Specialist can review
the latency, memory utilization, and CPU utilization during the load test, as these are
some of the metrics that are outputted by Amazon SageMaker. The Specialist can
create a custom dashboard that displays these metrics in different widgets, such as
graphs, tables, or text. The dashboard can also be configured to refresh automatically
and show the latest data as the load test is running. This approach will allow the
Specialist to monitor the performance and resource utilization of the model variant
and adjust the Auto Scaling configuration accordingly.
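As a hedged sketch of option B, the following code creates such a dashboard with boto3; the endpoint name, variant name, region, and widget layout are assumptions, and the exact metric names and namespaces should be confirmed against the SageMaker CloudWatch metrics documentation.

```python
import json
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

endpoint = "my-forecast-endpoint"   # placeholder endpoint name
variant = "AllTraffic"              # placeholder production variant name

dashboard_body = {
    "widgets": [
        {
            "type": "metric",
            "width": 12, "height": 6,
            "properties": {
                "title": "Latency and utilization during load test",
                "region": "us-east-1",
                "stat": "Average",
                "period": 60,
                "metrics": [
                    ["AWS/SageMaker", "ModelLatency", "EndpointName", endpoint, "VariantName", variant],
                    ["/aws/sagemaker/Endpoints", "CPUUtilization", "EndpointName", endpoint, "VariantName", variant],
                    ["/aws/sagemaker/Endpoints", "MemoryUtilization", "EndpointName", endpoint, "VariantName", variant],
                ],
            },
        }
    ]
}

cloudwatch.put_dashboard(
    DashboardName="sagemaker-load-test",
    DashboardBody=json.dumps(dashboard_body),
)
```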
References:
Monitoring Amazon SageMaker with Amazon CloudWatch - Amazon SageMaker
Using Amazon CloudWatch Dashboards - Amazon CloudWatch
Create a CloudWatch Dashboard - Amazon CloudWatch
33.An Amazon SageMaker notebook instance is launched into Amazon VPC. The
SageMaker notebook references data contained in an Amazon S3 bucket in another
account. The bucket is encrypted using SSE-KMS. The instance returns an access
denied error when trying to access data in Amazon S3.
Which of the following are required to access the bucket and avoid the access denied
error? (Select THREE)
A. An AWS KMS key policy that allows access to the customer master key (CMK)
B. A SageMaker notebook security group that allows access to Amazon S3
C. An IAM role that allows access to the specific S3 bucket
D. A permissive S3 bucket policy
E. An S3 bucket owner that matches the notebook owner
F. A SageMaker notebook subnet ACL that allows traffic to Amazon S3.
Answer: A, B, C
Explanation:
To access an Amazon S3 bucket in another account that is encrypted using SSE-
KMS, the following are required:
A) An AWS KMS key policy that allows access to the customer master key (CMK).
The CMK is the encryption key that is used to encrypt and decrypt the data in the S3
bucket. The KMS key policy defines who can use and manage the CMK. To allow
access to the CMK from another account, the key policy must include a statement
that grants the necessary permissions (such as kms:Decrypt) to the principal from the
other account (such as the SageMaker notebook IAM role).
B) A SageMaker notebook security group that allows access to Amazon S3. A
security group is a virtual firewall that controls the inbound and outbound traffic for the
SageMaker notebook instance. To allow the notebook instance to access the S3
bucket, the security group must have a rule that allows outbound traffic to the S3
endpoint on port 443 (HTTPS).
C) An IAM role that allows access to the specific S3 bucket. An IAM role is an identity
that can be assumed by the SageMaker notebook instance to access AWS
resources. The IAM role must have a policy that grants the necessary permissions
(such as s3:GetObject) to access the specific S3 bucket. The policy must also include
a condition that allows access to the CMK in the other account.
The following are not required or correct:
D) A permissive S3 bucket policy. A bucket policy is a resource-based policy that
defines who can access the S3 bucket and what actions they can perform. A
permissive bucket policy is not required and not recommended, as it can expose the
bucket to unauthorized access. A bucket policy should follow the principle of least
privilege and grant the minimum permissions necessary to the specific principals that
need access.
E) An S3 bucket owner that matches the notebook owner. The S3 bucket owner and
the notebook owner do not need to match, as long as the bucket owner grants cross-
account access to the notebook owner through the KMS key policy and the bucket
policy (if applicable).
F) A SageMaker notebook subnet ACL that allows traffic to Amazon S3. A subnet ACL
is a network access control list that acts as an optional layer of security for the
SageMaker notebook instance’s subnet. A subnet ACL is not required to access the
S3 bucket, as the security group is sufficient to control the traffic. However, if a subnet
ACL is used, it must not block the traffic to the S3 endpoint.
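To make the key policy described in option A more concrete, the following is a hedged sketch of the shape of a cross-account KMS key policy statement, expressed as a Python dictionary; the account ID, role name, and the exact set of KMS actions are illustrative assumptions to adapt to the actual environment.

```python
import json

# Placeholder principal: the IAM role assumed by the SageMaker notebook in the other account.
notebook_role_arn = "arn:aws:iam::111122223333:role/SageMakerNotebookRole"

cross_account_statement = {
    "Sid": "AllowSageMakerNotebookToDecrypt",
    "Effect": "Allow",
    "Principal": {"AWS": notebook_role_arn},
    "Action": [
        "kms:Decrypt",
        "kms:DescribeKey",
        "kms:GenerateDataKey",   # needed only if the notebook also writes encrypted objects
    ],
    "Resource": "*",
}

# This statement would be appended to the existing key policy of the CMK
# in the bucket owner's account (shown here only as pretty-printed JSON).
print(json.dumps(cross_account_statement, indent=2))
```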
34.A monitoring service generates 1 TB of scale metrics record data every minute. A
Research team performs queries on this data using Amazon Athena. The queries run
slowly due to the large volume of data, and the team requires better performance.
How should the records be stored in Amazon S3 to improve query performance?
A. CSV files
B. Parquet files
C. Compressed JSON
D. RecordIO
Answer: B
Explanation:
Parquet is a columnar storage format that can store data in a compressed and
efficient way. Parquet files can improve query performance by reducing the amount of
data that needs to be scanned, as only the relevant columns are read from the files.
Parquet files can also support predicate pushdown, which means that the filtering
conditions are applied at the storage level, further reducing the data that needs to be
processed. Parquet files are compatible with Amazon Athena, which can leverage the
benefits of the columnar format and provide faster and cheaper queries. Therefore,
the records should be stored in Parquet files in Amazon S3 to improve query
performance.
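As a hedged, stand-alone sketch of the conversion itself (independent of how the records are ingested), the snippet below rewrites a CSV file as Snappy-compressed Parquet partitioned by hour using pandas and pyarrow; the file names, column names, and bucket are assumptions, and writing directly to an s3:// path requires the s3fs package.

```python
import pandas as pd

# Illustrative input file with a timestamp column.
df = pd.read_csv("scale_metrics.csv", parse_dates=["timestamp"])
df["ingest_hour"] = df["timestamp"].dt.strftime("%Y-%m-%d-%H")

# Write columnar, compressed Parquet partitioned by hour; Athena can then
# prune both columns and partitions instead of scanning raw CSV.
df.to_parquet(
    "s3://example-bucket/metrics-parquet/",   # writing to s3:// needs s3fs installed
    engine="pyarrow",
    compression="snappy",
    partition_cols=["ingest_hour"],
)
```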
References:
Columnar Storage Formats - Amazon Athena
Parquet SerDe - Amazon Athena
Optimizing Amazon Athena Queries - Amazon Athena
Parquet - Apache Software Foundation
35.A Machine Learning Specialist needs to create a data repository to hold a large
amount of time-based training data for a new model. In the source system, new files
are added every hour. Throughout a single 24-hour period, the volume of hourly
updates will change significantly. The Specialist always wants to train on the last 24
hours of the data.
Which type of data repository is the MOST cost-effective solution?
A. An Amazon EBS-backed Amazon EC2 instance with hourly directories
B. An Amazon RDS database with hourly table partitions
C. An Amazon S3 data lake with hourly object prefixes
D. An Amazon EMR cluster with hourly hive partitions on Amazon EBS volumes
Answer: C
Explanation:
An Amazon S3 data lake is a cost-effective solution for storing and analyzing large
amounts of time-based training data for a new model. Amazon S3 is a highly scalable,
durable, and secure object storage service that can store any amount of data in any
format. Amazon S3 also offers low-cost storage classes, such as S3 Standard-IA and
S3 One Zone-IA, that can reduce the storage costs for infrequently accessed data. By
using hourly object prefixes, the Machine Learning Specialist can organize the data
into logical partitions based on the time of ingestion. This can enable efficient data
access and management, as well as support incremental updates and deletes. The
Specialist can also use Amazon S3 lifecycle policies to automatically transition the
data to lower-cost storage classes or delete the data after a certain period of time.
This way, the Specialist can always train on the last 24 hours of the data and optimize
the storage costs.
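A minimal sketch of the hourly-prefix layout and an optional lifecycle rule, using boto3; the bucket name, prefix pattern, and retention period are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone
import boto3

s3 = boto3.client("s3")
bucket = "example-training-data"          # placeholder bucket name

# Hourly object prefixes, e.g. training/2024/05/08/13/
def hourly_prefix(ts):
    return ts.strftime("training/%Y/%m/%d/%H/")

# The last 24 hours of data is simply the last 24 hourly prefixes.
now = datetime.now(timezone.utc)
last_24h_prefixes = [hourly_prefix(now - timedelta(hours=h)) for h in range(24)]
print(last_24h_prefixes[:3])

# Optional lifecycle rule: expire raw training objects after 7 days.
s3.put_bucket_lifecycle_configuration(
    Bucket=bucket,
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-old-training-data",
                "Filter": {"Prefix": "training/"},
                "Status": "Enabled",
                "Expiration": {"Days": 7},
            }
        ]
    },
)
```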
Reference: What is a data lake? - Amazon Web Services
Amazon S3 Storage Classes - Amazon Simple Storage Service Managing your
storage lifecycle - Amazon Simple Storage Service Best Practices Design Patterns:
Optimizing Amazon S3 Performance
36.A retail chain has been ingesting purchasing records from its network of 20,000
stores to Amazon S3 using Amazon Kinesis Data Firehose. To support training an
improved machine learning model, training records will require new but simple
transformations, and some attributes will be combined. The model needs to be
retrained daily.
Given the large number of stores and the legacy data ingestion, which change will
require the LEAST amount of development effort?
A. Require the stores to switch to capturing their data locally on AWS Storage
Gateway for loading into Amazon S3, then use AWS Glue to do the transformation
B. Deploy an Amazon EMR cluster running Apache Spark with the transformation
logic, and have the cluster run each day on the accumulating records in Amazon S3,
outputting new/transformed records to Amazon S3
C. Spin up a fleet of Amazon EC2 instances with the transformation logic, have them
transform the data records accumulating on Amazon S3, and output the transformed
records to Amazon S3.
D. Insert an Amazon Kinesis Data Analytics stream downstream of the Kinesis Data
Firehose stream that transforms raw record attributes into simple transformed values
using SQL.
Answer: D
Explanation:
Amazon Kinesis Data Analytics is a service that can analyze streaming data in real
time using SQL or Apache Flink applications. It can also use machine learning
algorithms, such as Random Cut Forest (RCF), to perform anomaly detection on
streaming data. By inserting a Kinesis Data Analytics stream downstream of the
Kinesis Data Firehose stream, the retail chain can transform the raw record attributes
into simple transformed values using SQL queries. This can be done without
changing the existing data ingestion process or deploying additional resources. The
transformed records can then be outputted to another Kinesis Data Firehose stream
that delivers them to Amazon S3 for training the machine learning model. This
approach will require the least amount of development effort, as it leverages the
existing Kinesis Data Firehose stream and the built-in SQL capabilities of Kinesis
Data Analytics.
Reference: Amazon Kinesis Data Analytics - Amazon Web Services
Anomaly Detection with Amazon Kinesis Data Analytics - Amazon Web Services
Amazon Kinesis Data Firehose - Amazon Web Services Amazon S3 - Amazon Web
Services
37.A city wants to monitor its air quality to address the consequences of air pollution.
A Machine Learning Specialist needs to forecast the air quality in parts per million of
contaminants for the next 2 days in the city. As this is a prototype, only daily data from
the last year is available.
Which model is MOST likely to provide the best results in Amazon SageMaker?
A. Use the Amazon SageMaker k-Nearest-Neighbors (kNN) algorithm on the single
time series consisting of the full year of data with a predictor_type of regressor.
B. Use Amazon SageMaker Random Cut Forest (RCF) on the single time series
consisting of the full year of data.
C. Use the Amazon SageMaker Linear Learner algorithm on the single time series
consisting of the full year of data with a predictor_type of regressor.
D. Use the Amazon SageMaker Linear Learner algorithm on the single time series
consisting of the full year of data with a predictor_type of classifier.
Answer: A
Explanation:
The Amazon SageMaker k-Nearest-Neighbors (kNN) algorithm is a supervised
learning algorithm that can perform both classification and regression tasks. It can
also handle time series data, such as the air quality data in this case. The kNN
algorithm works by finding the k most similar instances in the training data to a given
query instance, and then predicting the output based on the average or majority of the
outputs of the k nearest neighbors. The kNN algorithm can be configured to use
different distance metrics, such as Euclidean or cosine, to measure the similarity
between instances.
To use the kNN algorithm on the single time series consisting of the full year of data,
the Machine Learning Specialist needs to set the predictor_type parameter to
regressor, as the output variable (air quality in parts per million of contaminants) is a
continuous value. The kNN algorithm can then forecast the air quality for the next 2
days by finding the k most similar days in the past year and averaging their air quality
values.
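A hedged sketch of configuring the built-in kNN algorithm as a regressor with the SageMaker Python SDK; the role ARN, S3 path, and hyperparameter values (k, feature_dim, sample_size) are illustrative assumptions.

```python
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"   # placeholder

image_uri = sagemaker.image_uris.retrieve("knn", session.boto_region_name)

knn = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    sagemaker_session=session,
)
# predictor_type="regressor" because parts-per-million is a continuous target.
knn.set_hyperparameters(
    k=7,                 # illustrative neighbor count
    predictor_type="regressor",
    feature_dim=3,       # illustrative number of features per record
    sample_size=365,     # illustrative sample size (one year of daily records)
)
knn.fit({"train": "s3://example-bucket/air-quality/train/"})
```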
References:
Amazon SageMaker k-Nearest-Neighbors (kNN) Algorithm - Amazon SageMaker
Time Series Forecasting using k-Nearest Neighbors (kNN) in Python | by …
Time Series Forecasting with k-Nearest Neighbors | by Nishant Malik …
38.For the given confusion matrix, what is the recall and precision of the model?
A. Recall = 0.92 Precision = 0.84
B. Recall = 0.84 Precision = 0.8
C. Recall = 0.92 Precision = 0.8
D. Recall = 0.8 Precision = 0.92
Answer: C
Explanation:
Recall and precision are two metrics that can be used to evaluate the performance of
a classification model. Recall is the ratio of true positives to the total number of actual
positives, which measures how well the model can identify all the relevant cases.
Precision is the ratio of true positives to the total number of predicted positives, which
measures how accurate the model is when it makes a positive prediction.
Based on the confusion matrix in the image, we can calculate the recall and precision
as follows:
Recall = TP / (TP + FN) = 12 / (12 + 1) = 0.92
Precision = TP / (TP + FP) = 12 / (12 + 3) = 0.8
Where TP is the number of true positives, FN is the number of false negatives, and
FP is the number of false positives. Therefore, the recall and precision of the model
are 0.92 and 0.8, respectively.
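The same calculation in a few lines of Python, using the counts stated above:

```python
# Counts read off the confusion matrix in the question.
tp, fn, fp = 12, 1, 3

recall = tp / (tp + fn)       # 12 / 13 ≈ 0.92
precision = tp / (tp + fp)    # 12 / 15 = 0.80

print(f"Recall = {recall:.2f}, Precision = {precision:.2f}")
```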
39.A Machine Learning Specialist is working with a media company to perform
classification on popular articles from the company's website. The company is using
random forests to classify how popular an article will be before it is published. A
sample of the data being used is below.
Given the dataset, the Specialist wants to convert the Day-Of_Week column to binary
values.
What technique should be used to convert this column to binary values?
A. Binarization
B. One-hot encoding
C. Tokenization
D. Normalization transformation
Answer: B
Explanation:
One-hot encoding is a technique that can be used to convert a categorical variable,
such as the Day-Of_Week column, to binary values. One-hot encoding creates a new
binary column for each unique value in the original column, and assigns a value of 1
to the column that corresponds to the value in the original column, and 0 to the rest.
For example, if the original column has values Monday, Tuesday, Wednesday,
Thursday, Friday, Saturday, and Sunday, one-hot encoding will create seven new
columns, each representing one day of the week. If the value in the original column is
Tuesday, then the column for Tuesday will have a value of 1, and the other columns
will have a value of 0.
One-hot encoding can help improve the performance of machine learning models, as
it eliminates the ordinal relationship between the values and creates a more
informative and sparse representation of the data.
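A minimal pandas sketch of one-hot encoding the Day-Of_Week column (the sample values are made up, since the original data sample is not reproduced here):

```python
import pandas as pd

# Illustrative sample with the categorical Day-Of_Week column.
df = pd.DataFrame({"Day-Of_Week": ["Monday", "Tuesday", "Tuesday", "Sunday"]})

# One binary column per unique day; exactly one of them is 1 per row.
encoded = pd.get_dummies(df, columns=["Day-Of_Week"], dtype=int)
print(encoded)
```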
References:
One-Hot Encoding - Amazon SageMaker
One-Hot Encoding: A Simple Guide for Beginners | by Jana Schmidt …
One-Hot Encoding in Machine Learning | by Nishant Malik | Towards …
40.A company has raw user and transaction data stored in Amazon S3, a MySQL
database, and Amazon Redshift. A Data Scientist needs to perform an analysis by
joining the three datasets from Amazon S3, MySQL, and Amazon Redshift, and then
calculating the average of a few selected columns from the joined data.
Which AWS service should the Data Scientist use?
A. Amazon Athena
B. Amazon Redshift Spectrum
C. AWS Glue
D. Amazon QuickSight
Answer: A
Explanation:
Amazon Athena is a serverless interactive query service that can analyze data in
Amazon S3 using standard SQL. Amazon Athena can also query data from other
sources, such as MySQL and Amazon Redshift, by using federated queries.
Federated queries allow Amazon Athena to run SQL queries across data sources,
such as relational and non-relational databases, data warehouses, and data lakes. By
using Amazon Athena, the Data Scientist can perform an analysis by joining the three
datasets from Amazon S3, MySQL, and Amazon Redshift, and then calculating the
average of a few selected columns from the joined data. Amazon Athena can also
integrate with other AWS services, such as AWS Glue and Amazon QuickSight, to
provide additional features, such as data cataloging and visualization.
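A hedged sketch of running such a federated query from Python with boto3; the catalog, database, table, and column names and the S3 output location are assumptions, and the MySQL and Redshift connectors are assumed to already be registered as Athena data sources.

```python
import boto3

athena = boto3.client("athena", region_name="us-east-1")

# Assumed catalog/database/table names for the three sources.
query = """
SELECT AVG(t.amount) AS avg_amount,
       AVG(u.lifetime_value) AS avg_ltv
FROM "AwsDataCatalog"."analytics"."s3_transactions" t
JOIN "mysql_catalog"."appdb"."users" u ON t.user_id = u.user_id
JOIN "redshift_catalog"."dw"."user_segments" s ON u.user_id = s.user_id
"""

response = athena.start_query_execution(
    QueryString=query,
    ResultConfiguration={"OutputLocation": "s3://example-bucket/athena-results/"},
)
print(response["QueryExecutionId"])
```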
References:
What is Amazon Athena? - Amazon Athena
Federated Query Overview - Amazon Athena
Querying Data from Amazon S3 - Amazon Athena
Querying Data from MySQL - Amazon Athena
Querying Data from Amazon Redshift - Amazon Athena
41.A Mobile Network Operator is building an analytics platform to analyze and
optimize a company's operations using Amazon Athena and Amazon S3.
The source systems send data in CSV format in real time. The Data Engineering team
wants to transform the data to the Apache Parquet format before storing it on Amazon
S3.
Which solution takes the LEAST effort to implement?
A. Ingest .CSV data using Apache Kafka Streams on Amazon EC2 instances and use
Kafka Connect S3 to serialize data as Parquet
B. Ingest .CSV data from Amazon Kinesis Data Streams and use AWS Glue to
convert data into Parquet.
C. Ingest .CSV data using Apache Spark Structured Streaming in an Amazon EMR
cluster and use Apache Spark to convert data into Parquet.
D. Ingest .CSV data from Amazon Kinesis Data Streams and use Amazon Kinesis
Data Firehose to convert data into Parquet.
Answer: D
Explanation:
Amazon Kinesis Data Streams is a service that can capture, store, and process
streaming data in real time. Amazon Kinesis Data Firehose is a service that can
deliver streaming data to various destinations, such as Amazon S3, Amazon Redshift,
or Amazon Elasticsearch Service. Amazon Kinesis Data Firehose can also transform
the data before delivering it, such as converting the data format, compressing the
data, or encrypting the data. One of the supported data formats that Amazon Kinesis
Data Firehose can convert to is Apache Parquet, which is a columnar storage format
that can improve the performance and cost-efficiency of analytics queries. By using
Amazon Kinesis Data Streams and Amazon Kinesis Data Firehose, the Mobile
Network Operator can ingest the .CSV data from the source systems and use
Amazon Kinesis Data Firehose to convert the data into Parquet before storing it on
Amazon S3. This solution takes the least effort to implement, as it does not require
any additional resources, such as Amazon EC2 instances, Amazon EMR clusters, or
AWS Glue jobs. The solution can also leverage the built-in features of Amazon
Kinesis Data Firehose, such as data buffering, batching, retry, and error handling.
References:
Amazon Kinesis Data Streams - Amazon Web Services
Amazon Kinesis Data Firehose - Amazon Web Services
Data Transformation - Amazon Kinesis Data Firehose
Apache Parquet - Amazon Athena
42.An e-commerce company needs a customized training model to classify images of
its shirts and pants products. The company needs a proof of concept in 2 to 3 days
with good accuracy.
Which compute choice should the Machine Learning Specialist select to train and
achieve good accuracy on the model quickly?
A. m5.4xlarge (general purpose)
B. r5.2xlarge (memory optimized)
C. p3.2xlarge (GPU accelerated computing)
D. p3.8xlarge (GPU accelerated computing)
Answer: C
Explanation:
Image classification is a machine learning task that involves assigning labels to
images based on their content. Image classification can be performed using various
algorithms, such as convolutional neural networks (CNNs), which are a type of deep
learning model that can learn to extract high-level features from images. To train a
customized image classification model, the e-commerce company needs a compute
choice that can support the high computational demands of deep learning and provide
good accuracy on the model quickly. A GPU accelerated computing instance, such as
p3.2xlarge, is a suitable choice for this task, as it can leverage the parallel processing
power of GPUs to speed up the training process and reduce the training time. A
p3.2xlarge instance has one NVIDIA Tesla V100 GPU, which can provide up to 125
teraflops of mixed-precision performance and 16 GB of GPU memory. A p3.2xlarge
instance can also use various deep learning frameworks, such as TensorFlow,
PyTorch, MXNet, etc., to build and train the image classification model. A p3.2xlarge
instance is also more cost-effective than a p3.8xlarge instance, which has four
NVIDIA Tesla V100 GPUs, as the latter may not be necessary for a proof of concept
with a small dataset. Therefore, the Machine Learning Specialist should select
p3.2xlarge as the compute choice to train and achieve good accuracy on the model
quickly.
Reference:
Amazon EC2 P3 Instances - Amazon Web Services
Image Classification - Amazon SageMaker
Convolutional Neural Networks - Amazon SageMaker
Deep Learning AMIs - Amazon Web Services
43.A Marketing Manager at a pet insurance company plans to launch a targeted
marketing campaign on social media to acquire new customers.
Currently, the company has the following data in Amazon Aurora:
• Profiles for all past and existing customers
• Profiles for all past and existing insured pets
• Policy-level information
• Premiums received
• Claims paid
What steps should be taken to implement a machine learning model to identify
potential new customers on social media?
A. Use regression on customer profile data to understand key characteristics of
consumer segments. Find similar profiles on social media.
B. Use clustering on customer profile data to understand key characteristics of
consumer segments. Find similar profiles on social media.
C. Use a recommendation engine on customer profile data to understand key
characteristics of consumer segments. Find similar profiles on social media.
D. Use a decision tree classifier engine on customer profile data to understand key
characteristics of consumer segments. Find similar profiles on social media.
Answer: B
Explanation:
Clustering is a machine learning technique that can group data points into clusters
based on their similarity or proximity. Clustering can help discover the underlying
structure and patterns in the data, as well as identify outliers or anomalies. Clustering
can also be used for customer segmentation, which is the process of dividing
customers into groups based on their characteristics, behaviors, preferences, or
needs. Customer segmentation can help understand the key features and needs of
different customer segments, as well as design and implement targeted marketing
campaigns for each segment. In this case, the Marketing Manager at a pet insurance
company plans to launch a targeted marketing campaign on social media to acquire
new customers. To do this, the Manager can use clustering on customer profile data
to understand the key characteristics of consumer segments, such as their
demographics, pet types, policy preferences, premiums paid, claims made, etc. The
Manager can then find similar profiles on social media, such as Facebook, Twitter,
Instagram, etc., by using the cluster features as filters or keywords. The Manager can
then target these potential new customers with personalized and relevant ads or
offers that match their segment’s needs and interests. This way, the Manager can
implement a machine learning model to identify potential new customers on social
media.
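As a hedged illustration of option B, the following scikit-learn sketch clusters a tiny, made-up customer profile table and summarizes each segment; the feature names and values are assumptions, not data from the scenario.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Illustrative customer profile features pulled from Aurora (column names are assumptions).
profiles = pd.DataFrame({
    "age": [34, 29, 58, 45, 23, 51],
    "num_pets": [1, 2, 3, 1, 1, 2],
    "annual_premium": [480, 650, 1200, 700, 300, 980],
    "claims_paid": [0, 1, 4, 1, 0, 2],
})

features = StandardScaler().fit_transform(profiles)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
profiles["segment"] = kmeans.fit_predict(features)

# Per-segment averages describe the key characteristics used to find
# similar (look-alike) profiles on social media.
print(profiles.groupby("segment").mean())
```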
44.A company is running an Amazon SageMaker training job that will access data
stored in its Amazon S3 bucket. A compliance policy requires that the data never be
transmitted across the internet.
How should the company set up the job?
A. Launch the notebook instances in a public subnet and access the data through the
public S3 endpoint
B. Launch the notebook instances in a private subnet and access the data through a
NAT gateway
C. Launch the notebook instances in a public subnet and access the data through a
NAT gateway
D. Launch the notebook instances in a private subnet and access the data through an
S3 VPC endpoint.
Answer: D
Explanation:
A private subnet is a subnet that does not have a route to the internet gateway, which
means that the resources in the private subnet cannot access the internet or be
accessed from the internet. An S3 VPC endpoint is a gateway endpoint that allows
the resources in the VPC to access the S3 service without going through the internet.
By launching the notebook instances in a private subnet and accessing the data
through an S3 VPC endpoint, the company can set up the job in a secure and
compliant way, as the data never leaves the AWS network and is not exposed to the
internet. This can also improve the performance and reliability of the data transfer, as
the traffic does not depend on the internet bandwidth or availability.
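A minimal boto3 sketch of creating the S3 gateway endpoint for the VPC; the region, VPC ID, and route table ID are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Placeholder VPC and route table IDs for the private subnet used by the notebook.
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],
)
print(response["VpcEndpoint"]["VpcEndpointId"])
```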
References:
Amazon VPC Endpoints - Amazon Virtual Private Cloud
Endpoints for Amazon S3 - Amazon Virtual Private Cloud
Connect to SageMaker Within your VPC - Amazon SageMaker
Working with VPCs and Subnets - Amazon Virtual Private Cloud
45.A Machine Learning Specialist is preparing data for training on Amazon
SageMaker. The Specialist has transformed the data into a numpy.array, which
appears to be negatively affecting the speed of the training.
What should the Specialist do to optimize the data for training on SageMaker?
A. Use the SageMaker batch transform feature to transform the training data into a
DataFrame
B. Use AWS Glue to compress the data into the Apache Parquet format
C. Transform the dataset into the Recordio protobuf format
D. Use the SageMaker hyperparameter optimization feature to automatically optimize
the data
Answer: C
Explanation:
The Recordio protobuf format is a binary data format that is optimized for training on
SageMaker. It allows faster data loading and lower memory usage compared to other
formats such as CSV or numpy arrays. The Recordio protobuf format also supports
features such as sparse input, variable-length input, and label embedding. To use the
Recordio protobuf format, the data needs to be serialized and deserialized using the
appropriate libraries. Some of the built-in algorithms in SageMaker support the
Recordio protobuf format as a content type for training and inference.
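A minimal sketch of the conversion using the helper in the SageMaker Python SDK; the array shapes, bucket, and key are illustrative assumptions.

```python
import io
import boto3
import numpy as np
from sagemaker.amazon.common import write_numpy_to_dense_tensor

# Illustrative feature matrix and labels already held as numpy arrays.
features = np.random.rand(1000, 50).astype("float32")
labels = np.random.randint(0, 2, size=1000).astype("float32")

# Serialize to the RecordIO-protobuf format expected by many built-in algorithms.
buf = io.BytesIO()
write_numpy_to_dense_tensor(buf, features, labels)
buf.seek(0)

boto3.client("s3").upload_fileobj(buf, "example-bucket", "train/data.rec")
```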
References:
Common Data Formats for Training
Using RecordIO Format
Content Types Supported by Built-in Algorithms
46.A Machine Learning Specialist is training a model to identify the make and model
of vehicles in images. The Specialist wants to use transfer learning and an existing
model trained on images of general objects. The Specialist collated a large custom
dataset of pictures containing different vehicle makes and models.
What should the Specialist do to initialize the model to re-train it with the custom
data?
A. Initialize the model with random weights in all layers including the last fully
connected layer
B. Initialize the model with pre-trained weights in all layers and replace the last fully
connected layer.
C. Initialize the model with random weights in all layers and replace the last fully
connected layer
D. Initialize the model with pre-trained weights in all layers including the last fully
connected layer
Answer: B
Explanation:
Transfer learning is a technique that allows us to use a model trained for a certain
task as a starting point for a machine learning model for a different task. For image
classification, a common practice is to use a pre-trained model that was trained on a
large and general dataset, such as ImageNet, and then customize it for the specific
task. One way to customize the model is to replace the last fully connected layer,
which is responsible for the final classification, with a new layer that has the same
number of units as the number of classes in the new task. This way, the model can
leverage the features learned by the previous layers, which are generic and useful for
many image recognition tasks, and learn to map them to the new classes. The new
layer can be initialized with random weights, and the rest of the model can be
initialized with the pre-trained weights. This method is also known as feature
extraction, as it extracts meaningful features from the pre-trained model and uses
them for the new task.
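A hedged Keras sketch of option B: load a network pre-trained on ImageNet without its original classification head, freeze the pre-trained layers, and attach a new, randomly initialized fully connected layer for the vehicle classes (the number of classes and input size are assumptions).

```python
import tensorflow as tf

NUM_CLASSES = 40   # illustrative number of vehicle make/model classes

# Pre-trained convolutional base (ImageNet weights), original classifier removed.
base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3)
)
base.trainable = False   # keep the pre-trained weights for feature extraction

# New, randomly initialized classification head for the vehicle classes.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```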
References:
Transfer learning and fine-tuning
Deep transfer learning for image classification: a survey
47.A Machine Learning Specialist is developing a custom video recommendation
model for an application. The dataset used to train this model is very large with
millions of data points and is hosted in an Amazon S3 bucket. The Specialist wants to
avoid loading all of this data onto an Amazon SageMaker notebook instance because
it would take hours to move and will exceed the attached 5 GB Amazon EBS volume
on the notebook instance.
Which approach allows the Specialist to use all the data to train the model?
A. Load a smaller subset of the data into the SageMaker notebook and train locally.
Confirm that the training
code is executing and the model parameters seem reasonable. Initiate a SageMaker
training job using the full dataset from the S3 bucket using Pipe input mode.
B. Launch an Amazon EC2 instance with an AWS Deep Learning AMI and attach the
S3 bucket to the instance. Train on a small amount of the data to verify the training
code and hyperparameters. Go back to Amazon SageMaker and train using the full
dataset
C. Use AWS Glue to train a model using a small subset of the data to confirm that the
data will be compatible with Amazon SageMaker. Initiate a SageMaker training job
using the full dataset from the S3 bucket using Pipe input mode.
D. Load a smaller subset of the data into the SageMaker notebook and train locally.
Confirm that the training code is executing and the model parameters seem
reasonable. Launch an Amazon EC2 instance with an AWS Deep Learning AMI and
attach the S3 bucket to train the full dataset.
Answer: A
Explanation:
Pipe input mode is a feature of Amazon SageMaker that allows streaming large
datasets from Amazon S3 directly to the training algorithm without downloading them
to the local disk. This reduces the startup time, disk space, and cost of training jobs.
Pipe input mode is supported by most of the built-in algorithms and can also be used
with custom training algorithms. To use Pipe input mode, the data needs to be in a
binary format such as protobuf recordIO or TFRecord. The training code needs to use
the PipeModeDataset class to read the data from the named pipe provided by
SageMaker. To verify that the training code and the model parameters are working as
expected, it is recommended to train locally on a smaller subset of the data before
launching a full-scale training job on SageMaker. This approach is faster and more
efficient than the other options, which involve either downloading the full dataset to an
EC2 instance or using AWS Glue, which is not designed for training machine learning
models.
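Inside the training script, reading the streamed channel might look like the following hedged sketch; it assumes a SageMaker TensorFlow container that ships the sagemaker-tensorflow extension, a channel named train, records written as TFRecords, and a made-up feature schema.

```python
# Inside the training script run by SageMaker (TensorFlow container with the
# sagemaker-tensorflow extension installed).
import tensorflow as tf
from sagemaker_tensorflow import PipeModeDataset

# "train" must match the channel name passed to estimator.fit(); records are
# assumed to have been written as TFRecords.
ds = PipeModeDataset(channel="train", record_format="TFRecord")

def parse(example_proto):
    # Illustrative feature spec -- adapt to the real schema.
    spec = {"features": tf.io.FixedLenFeature([50], tf.float32),
            "label": tf.io.FixedLenFeature([], tf.int64)}
    parsed = tf.io.parse_single_example(example_proto, spec)
    return parsed["features"], parsed["label"]

ds = ds.map(parse).batch(256).prefetch(1)
```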
References:
Using Pipe input mode for Amazon SageMaker algorithms
Using Pipe Mode with Your Own Algorithms
PipeModeDataset Class
48.A Machine Learning Specialist is creating a new natural language processing
application that processes a dataset comprised of 1 million sentences. The aim is to
then run Word2Vec to generate embeddings of the sentences and enable different
types of predictions.
Here is an example from the dataset:
"The quck BROWN FOX jumps over the lazy dog "
Which of the following are the operations the Specialist needs to perform to correctly
sanitize and prepare the data in a repeatable manner? (Select THREE)
A. Perform part-of-speech tagging and keep the action verb and the nouns only
B. Normalize all words by making the sentence lowercase
C. Remove stop words using an English stopword dictionary.
D. Correct the typography on "quck" to "quick."
E. One-hot encode all words in the sentence
F. Tokenize the sentence into words.
Answer: B, C, F
Explanation:
To prepare the data for Word2Vec, the Specialist needs to perform some
preprocessing steps that can help reduce the noise and complexity of the data, as
well as improve the quality of the embeddings.
Some of the common preprocessing steps for Word2Vec are:
Normalizing all words by making the sentence lowercase: This can help reduce the
vocabulary size and treat words with different capitalizations as the same word. For
example, “Fox” and “fox” should be considered as the same word, not two different
words.
Removing stop words using an English stopword dictionary: Stop words are words
that are very common and do not carry much semantic meaning, such as “the”, “a”,
“and”, etc. Removing them can help focus on the words that are more relevant and
informative for the task.
Tokenizing the sentence into words: Tokenization is the process of splitting a
sentence into smaller units, such as words or subwords. This is necessary for
Word2Vec, as it operates on the word level and requires a list of words as input.
The other options are not necessary or appropriate for Word2Vec:
Performing part-of-speech tagging and keeping the action verb and the nouns only:
Part-of-speech tagging is the process of assigning a grammatical category to each
word, such as noun, verb, adjective, etc. This can be useful for some natural
language processing tasks, but not for Word2Vec, as it can lose some important
information and context by discarding other words.
Correcting the typography on “quck” to “quick”: Typo correction can be helpful for
some tasks, but not for Word2Vec, as it can introduce errors and inconsistencies in
the data. For example, if the typo is intentional or part of a dialect, correcting it can
change the meaning or style of the sentence. Moreover, Word2Vec can learn to
handle typos and variations in spelling by learning similar embeddings for them.
One-hot encoding all words in the sentence: One-hot encoding is a way of
representing words as vectors of 0s and 1s, where only one element is 1 and the rest
are 0. The index of the 1 element corresponds to the word’s position in the
vocabulary. For example, if the vocabulary is [“cat”, “dog”, “fox”], then “cat” can be
encoded as [1, 0, 0], “dog” as [0, 1, 0], and “fox” as [0, 0, 1]. This can be useful for
some machine learning models, but not for Word2Vec, as it does not capture the
semantic similarity and relationship between words. Word2Vec aims to learn dense
and low-dimensional embeddings for words, where similar words have similar vectors.
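A minimal NLTK sketch of the three selected operations (lowercasing, stop-word removal, and tokenization) applied to the example sentence:

```python
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download("punkt")       # newer NLTK versions may also need "punkt_tab"
nltk.download("stopwords")

sentence = "The quck BROWN FOX jumps over the lazy dog"

lowered = sentence.lower()                       # normalize case
tokens = word_tokenize(lowered)                  # split into word tokens
stop_words = set(stopwords.words("english"))
cleaned = [t for t in tokens if t not in stop_words]

print(cleaned)   # ['quck', 'brown', 'fox', 'jumps', 'lazy', 'dog']
```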
49.This graph shows the training and validation loss against the epochs for a neural
network.
The network being trained is as follows:
• Two dense layers, one output neuron
• 100 neurons in each layer
• 100 epochs
• Random initialization of weights
Which technique can be used to improve model performance in terms of accuracy in
the validation set?
A. Early stopping
B. Random initialization of weights with appropriate seed
C. Increasing the number of epochs
D. Adding another layer with the 100 neurons
Answer: A
Explanation:
Early stopping is a technique that can be used to prevent overfitting and improve
model performance on the validation set. Overfitting occurs when the model learns
the training data too well and fails to generalize to new and unseen data. This can be
seen in the graph, where the training loss keeps decreasing, but the validation loss
starts to increase after some point. This means that the model is fitting the noise and
patterns in the training data that are not relevant for the validation data. Early stopping
is a way of stopping the training process before the model overfits the training data. It
works by monitoring the validation loss and stopping the training when the validation
loss stops decreasing or starts increasing. This way, the model is saved at the point
where it has the best performance on the validation set. Early stopping can also save
time and resources by reducing the number of epochs needed for training.
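A minimal Keras sketch of early stopping on this kind of network; the synthetic data and the patience value are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

# Synthetic data standing in for the real training/validation sets.
rng = np.random.default_rng(0)
x_train, y_train = rng.normal(size=(800, 20)), rng.integers(0, 2, size=(800, 1))
x_val, y_val = rng.normal(size=(200, 20)), rng.integers(0, 2, size=(200, 1))

# The network described in the question: two dense layers of 100 neurons, one output neuron.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(100, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(100, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",          # stop when the validation loss stops improving
    patience=5,
    restore_best_weights=True,   # keep the weights from the best validation epoch
)

model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=100, callbacks=[early_stop], verbose=0)
```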
References:
Early Stopping
How to Stop Training Deep Neural Networks At the Right Time Using Early Stopping
50.A manufacturing company asks its Machine Learning Specialist to develop a
model that classifies defective parts into one of eight defect types. The company has
provided roughly 100,000 images per defect type for training. During the initial training
of the image classification model, the Specialist notices that the validation accuracy is
80%, while the training accuracy is 90%. It is known that human-level performance for
this type of image classification is around 90%.
What should the Specialist consider to fix this issue?
A. A longer training time
B. Making the network larger
C. Using a different optimizer
D. Using some form of regularization
Answer: D
Explanation:
Regularization is a technique that can be used to prevent overfitting and improve
model performance on unseen data. Overfitting occurs when the model learns the
training data too well and fails to generalize to new and unseen data. This can be
seen in the question, where the validation accuracy is lower than the training
accuracy, and both are lower than the human-level performance. Regularization is a
way of adding some constraints or penalties to the model to reduce its complexity and
prevent it from memorizing the training data. Some common forms of regularization
for image classification are:
Weight decay: Adding a term to the loss function that penalizes large weights in the
model. This can help reduce the variance and noise in the model and make it more
robust to small changes in the input.
Dropout: Randomly dropping out some units or connections in the model during
training. This can help reduce the co-dependency among the units and make the
model more resilient to missing or corrupted features.
Data augmentation: Artificially increasing the size and diversity of the training data by
applying random transformations, such as cropping, flipping, rotating, scaling, etc.
This can help the model learn more invariant and generalizable features and reduce
the risk of overfitting to specific patterns in the training data.
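A hedged Keras sketch combining two of these forms of regularization (L2 weight decay and dropout) in a small CNN for the eight defect classes; the architecture, input size, and regularization strengths are illustrative assumptions.

```python
import tensorflow as tf

NUM_CLASSES = 8   # the eight defect types from the question

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(128, 128, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(
        128, activation="relu",
        kernel_regularizer=tf.keras.regularizers.l2(1e-4),   # weight decay
    ),
    tf.keras.layers.Dropout(0.5),                            # dropout
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```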
The other options are not likely to fix the issue of overfitting, and may even worsen it:
A longer training time: This can lead to more overfitting, as the model will have more
chances to fit the noise and details in the training data that are not relevant for the
validation data.
Making the network larger: This can increase the model capacity and complexity,
which can also lead
to more overfitting, as the model will have more parameters to learn and adjust to the
training data.
Using a different optimizer: This can affect the speed and stability of the training
process, but not necessarily the generalization ability of the model. The choice of
optimizer depends on the characteristics of the data and the model, and there is no
guarantee that a different optimizer will prevent overfitting.
References:
Regularization (machine learning)
Image Classification: Regularization
How to Reduce Overfitting With Dropout Regularization in Keras
51.Example Corp has an annual sale event from October to December. The company
has sequential sales data from the past 15 years and wants to use Amazon ML to
predict the sales for this year's upcoming event.
Which method should Example Corp use to split the data into a training dataset and
evaluation dataset?
A. Pre-split the data before uploading to Amazon S3
B. Have Amazon ML split the data randomly.
C. Have Amazon ML split the data sequentially.
D. Perform custom cross-validation on the data
Answer: C
Explanation:
A sequential split is a method of splitting data into training and evaluation datasets
while preserving the order of the data records. This method is useful when the data
has a temporal or sequential structure, and the order of the data matters for the
prediction task. For example, if the data contains sales data for different months or
years, and the goal is to predict the sales for the next month or year, a sequential split
can ensure that the training data comes from the earlier period and the evaluation
data comes from the later period. This can help avoid data leakage, which occurs
when the training data contains information from the future that is not available at the
time of prediction. A sequential split can also help evaluate the model performance on
the most recent data, which may be more relevant and representative of the future
data.
In this question, Example Corp has sequential sales data from the past 15 years and
wants to use Amazon ML to predict the sales for this year’s upcoming annual sale
event. A sequential split is the most appropriate method for splitting the data, as it can
preserve the order of the data and prevent data leakage. For example, Example Corp
can use the data from the first 14 years as the training dataset, and the data from the
last year as the evaluation dataset. This way, the model can learn from the historical
data and be tested on the most recent data.
Amazon ML provides an option to split the data sequentially when creating the
training and evaluation datasources. To use this option, Example Corp can specify
the percentage of the data to use for training and evaluation, and Amazon ML will use
the first part of the data for training and the remaining part of the data for evaluation.
For more information, see Splitting Your Data - Amazon Machine Learning.
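Outside of Amazon ML, the same idea in a few lines of pandas (the file and column names are assumptions): keep the records in time order and cut once, rather than sampling randomly.

```python
import pandas as pd

# Illustrative sequential sales data ordered by date (oldest first).
sales = pd.read_csv("sales_history.csv", parse_dates=["date"]).sort_values("date")

# Keep the order: the first 14 years for training, the most recent year for evaluation.
split_index = int(len(sales) * 14 / 15)
train_df = sales.iloc[:split_index]
eval_df = sales.iloc[split_index:]

print(len(train_df), "training rows;", len(eval_df), "evaluation rows")
```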
52.A company is running a machine learning prediction service that generates 100 TB
of predictions every day. A Machine Learning Specialist must generate a visualization
of the daily precision-recall curve from the predictions, and forward a read-only
version to the Business team.
Which solution requires the LEAST coding effort?
A. Run a daily Amazon EMR workflow to generate precision-recall data, and save the
results in Amazon S3. Give the Business team read-only access to S3.
B. Generate daily precision-recall data in Amazon QuickSight, and publish the results
in a dashboard shared with the Business team
C. Run a daily Amazon EMR workflow to generate precision-recall data, and save the
results in Amazon S3. Visualize the arrays in Amazon QuickSight, and publish them in
a dashboard shared with the Business team
D. Generate daily precision-recall data in Amazon ES, and publish the results in a
dashboard shared with the Business team.
Answer: C
Explanation:
A precision-recall curve is a plot that shows the trade-off between the precision and
recall of a binary classifier as the decision threshold is varied. It is a useful tool for
evaluating and comparing the performance of different models. To generate a
precision-recall curve, the following steps are needed:
Calculate the precision and recall values for different threshold values using the
predictions and the true labels of the data.
Plot the precision values on the y-axis and the recall values on the x-axis for each
threshold value. Optionally, calculate the area under the curve (AUC) as a summary
metric of the model performance (a minimal scikit-learn sketch of these steps appears
at the end of this explanation).
Among the four options, option C requires the least coding effort to generate and
share a visualization of the daily precision-recall curve from the predictions.
This option involves the following steps:
Run a daily Amazon EMR workflow to generate precision-recall data: Amazon EMR is
a service that allows running big data frameworks, such as Apache Spark, on a
managed cluster of EC2 instances. Amazon EMR can handle large-scale data
processing and analysis, such as calculating the precision and recall values for
different threshold values from 100 TB of predictions. Amazon EMR supports various
languages, such as Python, Scala, and R, for writing the code to perform the
calculations.
Amazon EMR also supports scheduling workflows using Apache Airflow or AWS Step
Functions, which can automate the daily execution of the code.
Save the results in Amazon S3: Amazon S3 is a service that provides scalable,
durable, and secure object storage. Amazon S3 can store the precision-recall data
generated by Amazon EMR in a cost-effective and accessible way. Amazon S3
supports various data formats, such as CSV, JSON, or Parquet, for storing the data.
Amazon S3 also integrates with other AWS services, such as Amazon QuickSight, for
further processing and visualization of the data.
Visualize the arrays in Amazon QuickSight: Amazon QuickSight is a service that
provides fast, easy-to-use, and interactive business intelligence and data
visualization. Amazon QuickSight can connect to Amazon S3 as a data source and
import the precision-recall data into a dataset. Amazon QuickSight can then create a
line chart to plot the precision-recall curve from the dataset. Amazon QuickSight also
supports calculating the AUC and adding it as an annotation to the chart.
Publish them in a dashboard shared with the Business team: Amazon QuickSight
allows creating and publishing dashboards that contain one or more visualizations
from the datasets. Amazon QuickSight also allows sharing the dashboards with other
users or groups within the same AWS account or across different AWS accounts. The
Business team can access the dashboard with read-only permissions and view the
daily precision-recall curve from the predictions.
The other options require more coding effort than option C for the following reasons:
Option A: This option requires writing code to plot the precision-recall curve from the
data stored in Amazon S3, as well as creating a mechanism to share the plot with the
Business team. This can involve using additional libraries or tools, such as matplotlib,
seaborn, or plotly, for creating the plot, and using email, web, or cloud services, such
as AWS Lambda or Amazon SNS, for sharing the plot.
Option B: This option requires
transforming the predictions into a format that Amazon QuickSight can recognize and
import as a data source, such as CSV, JSON, or Parquet. This can involve writing
code to process and convert the predictions, as well as uploading them to a storage
service, such as Amazon S3 or Amazon Redshift, that Amazon QuickSight can
connect to.
Option D: This option requires writing code to generate precision-recall data in
Amazon ES, as well as creating a dashboard to visualize the data. Amazon ES is a
service that provides a fully managed Elasticsearch cluster, which is mainly used for
search and analytics purposes. Amazon ES is not designed for generating precision-
recall data, and it requires using a specific data format, such as JSON, for storing the
data. Amazon ES also requires using a tool, such as Kibana, for creating and sharing
the dashboard, which can involve additional configuration and customization steps.
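A minimal scikit-learn sketch of the precision-recall computation described at the start of this explanation; the labels and scores are synthetic stand-ins for the real predictions, and the output file name is an assumption.

```python
import numpy as np
from sklearn.metrics import auc, precision_recall_curve

# Illustrative stand-ins for one day of predictions: true labels and scores.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=10_000)
y_score = np.clip(y_true * 0.6 + rng.normal(0.2, 0.3, size=10_000), 0, 1)

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
pr_auc = auc(recall, precision)

# The (recall, precision) pairs can be written to Amazon S3 as a small CSV
# and plotted as a line chart in Amazon QuickSight.
np.savetxt("daily_pr_curve.csv",
           np.column_stack([recall, precision]),
           delimiter=",", header="recall,precision", comments="")
print(f"PR AUC = {pr_auc:.3f}")
```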
References:
Precision-Recall
What Is Amazon EMR?
What Is Amazon S3?
What Is Amazon QuickSight?
What Is Amazon Elasticsearch Service?
53.A Machine Learning Specialist has built a model using Amazon SageMaker built-in
algorithms and is not getting the expected accuracy. The Specialist wants to use
hyperparameter optimization to increase the model's accuracy.
Which method is the MOST repeatable and requires the LEAST amount of effort to
achieve this?
A. Launch multiple training jobs in parallel with different hyperparameters
B. Create an AWS Step Functions workflow that monitors the accuracy in Amazon
CloudWatch Logs and relaunches the training job with a defined list of
hyperparameters
C. Create a hyperparameter tuning job and set the accuracy as an objective metric.
D. Create a random walk in the parameter space to iterate through a range of values
that should be used for each individual hyperparameter
Answer: C
Explanation:
A hyperparameter tuning job is a feature of Amazon SageMaker that allows
automatically finding the best combination of hyperparameters for a machine learning
model. Hyperparameters are high-level parameters that influence the learning
process and the performance of the model, such as the learning rate, the number of
layers, the regularization factor, etc. A hyperparameter tuning job works by launching
multiple training jobs with different hyperparameters, evaluating the results using an
objective metric, and choosing the next set of hyperparameters to try based on a
search strategy. The objective metric is a measure of the quality of the model, such
as accuracy, precision, recall, etc. The search strategy is a method of exploring the
hyperparameter space, such as random search, grid search, or Bayesian
optimization.
Among the four options, option C is the most repeatable and requires the least
amount of effort to use hyperparameter optimization to increase the model’s
accuracy. This option involves the following steps:
Create a hyperparameter tuning job: Amazon SageMaker provides an easy-to-use
interface for creating a hyperparameter tuning job, either through the AWS
Management Console, the AWS CLI, or the AWS SDKs. To create a hyperparameter
tuning job, the Machine Learning Specialist needs to specify the following information:
The name and type of the algorithm to use, either a built-in algorithm or a custom algorithm.
The ranges and types of the hyperparameters to tune, such as categorical, continuous, or integer.
The name and type of the objective metric to optimize, such as accuracy, and whether to maximize or minimize it.
The resource limits for the tuning job, such as the maximum number of training jobs
and the maximum parallel training jobs.
The input data channels and the output data location for the training jobs.
The configuration of the training instances, such as the instance type, the instance
count, the volume size, etc.
Set the accuracy as an objective metric: To use accuracy as an objective metric with a
custom algorithm, the Machine Learning Specialist defines a metric name and a
regular expression (a metric definition) in the tuning job configuration and ensures
that the training algorithm prints the accuracy value to stdout or stderr in a format the
regular expression can match. (Built-in algorithms already emit their metrics, so only
the metric name needs to be selected.)
During training, Amazon SageMaker scans the algorithm's log output with the regular
expression, extracts the accuracy value, and uses it to evaluate and compare the
training jobs.
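As an illustration, the following is a minimal sketch of such a tuning job written with the SageMaker Python SDK. The image URI, IAM role, S3 paths, metric regex, and hyperparameter ranges are hypothetical placeholders, not values taken from the question.

from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter, IntegerParameter

estimator = Estimator(
    image_uri="<training-image-uri>",          # placeholder custom training image
    role="<execution-role-arn>",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://<bucket>/output/",
    # SageMaker applies this regex to the algorithm's stdout/stderr logs, so the
    # training script should print a line such as "validation-accuracy: 0.92"
    metric_definitions=[{"Name": "validation-accuracy",
                         "Regex": "validation-accuracy: ([0-9\\.]+)"}],
)

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation-accuracy",
    objective_type="Maximize",
    hyperparameter_ranges={
        "learning_rate": ContinuousParameter(0.001, 0.1),
        "num_layers": IntegerParameter(2, 10),
    },
    metric_definitions=estimator.metric_definitions,
    max_jobs=20,            # resource limits for the tuning job
    max_parallel_jobs=2,
)

tuner.fit({"train": "s3://<bucket>/train/",
           "validation": "s3://<bucket>/validation/"})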
The other options are not as repeatable and require more effort than option C for the
following reasons:
Option A: This option requires manually launching multiple training jobs in parallel
with different hyperparameters, which can be tedious and error-prone. It also requires
manually monitoring and comparing the results of the training jobs, which can be time-
consuming and subjective.
Option B: This option requires writing code to create an AWS Step Functions
workflow that monitors the accuracy in Amazon CloudWatch Logs and relaunches the
training job with a defined list of hyperparameters, which can be complex and
challenging. It also requires maintaining and updating the list of hyperparameters,
which can be inefficient and suboptimal.
Option D: This option requires writing code to create a random walk in the parameter
space to
iterate through a range of values that should be used for each individual
hyperparameter, which can
be unreliable and unpredictable. It also requires defining and implementing a stopping
criterion,
which can be arbitrary and inconsistent.
Reference: Automatic Model Tuning - Amazon SageMaker
Define Metrics to Monitor Model Performance
54.IT leadership wants to transition a company's existing machine learning data
storage environment to AWS as a temporary ad hoc solution. The company currently
uses a custom software process that heavily leverages SQL as a query language and
exclusively stores generated .csv documents for machine learning.
The ideal state for the company would be a solution that allows it to continue to use
the current workforce of SQL experts. The solution must also support the storage of
csv and JSON files, and be able to query over semi-structured data.
The following are high priorities for the company:
• Solution simplicity
• Fast development time
• Low cost
• High flexibility
What technologies meet the company's requirements?
A. Amazon S3 and Amazon Athena
B. Amazon Redshift and AWS Glue
C. Amazon DynamoDB and DynamoDB Accelerator (DAX)
D. Amazon RDS and Amazon ES
Answer: A
Explanation:
Amazon S3 and Amazon Athena are technologies that meet the company’s
requirements for a temporary ad hoc solution for machine learning data storage and
query. Amazon S3 and Amazon Athena have the following features and benefits:
Amazon S3 is a service that provides scalable, durable, and secure object storage for
any type of data. Amazon S3 can store csv and JSON files, as well as other formats,
and can handle large volumes of data with high availability and performance. Amazon
S3 also integrates with other AWS services, such as Amazon Athena, for further
processing and analysis of the data.
Amazon Athena is a service that allows querying data stored in Amazon S3 using
standard SQL. Amazon Athena can query over semi-structured data, such as JSON,
as well as structured data, such as csv, without requiring any loading or
transformation. Amazon Athena is serverless, meaning that there is no infrastructure
to manage and users only pay for the queries they run. Amazon Athena also supports
the use of AWS Glue Data Catalog, which is a centralized metadata repository that
can store and manage the schema and partition information of the data in Amazon
S3.
Using Amazon S3 and Amazon Athena, the company can achieve the following high priorities:
Solution simplicity: Amazon S3 and Amazon Athena are easy to use and
require minimal configuration and maintenance. The company can simply upload the
csv and JSON files to Amazon S3 and use Amazon Athena to query them using SQL.
The company does not need to worry about provisioning, scaling, or managing any
servers or clusters.
Fast development time: Amazon S3 and Amazon Athena can enable the company to
quickly access and analyze the data without any data preparation or loading. The
company can use the existing workforce of SQL experts to write and run queries on
Amazon Athena and get results in seconds or minutes.
Low cost: Amazon S3 and Amazon Athena are cost-effective and offer pay-as-you-go
pricing models. Amazon S3 charges based on the amount of storage used and the
number of requests made. Amazon Athena charges based on the amount of data
scanned by the queries. The company can also reduce the costs by using
compression, encryption, and partitioning techniques to optimize the data storage and
query performance.
High flexibility: Amazon S3 and Amazon Athena are flexible and can support various
data types, formats, and sources. The company can store and query any type of data
in Amazon S3, such as csv, JSON, Parquet, ORC, etc. The company can also query
data from multiple sources in Amazon S3, such as data lakes, data warehouses, log
files, etc.
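For example, a minimal sketch of running a SQL query against the S3 data with Amazon Athena through boto3 is shown below; the database, table, column, and bucket names are hypothetical placeholders.

import boto3

athena = boto3.client("athena", region_name="us-east-1")

response = athena.start_query_execution(
    QueryString="""
        SELECT customer_id, label, COUNT(*) AS events
        FROM ml_training_data           -- table defined over s3://<bucket>/data/
        WHERE label IS NOT NULL
        GROUP BY customer_id, label
    """,
    QueryExecutionContext={"Database": "ml_catalog"},
    ResultConfiguration={"OutputLocation": "s3://<bucket>/athena-results/"},
)
print("Query execution id:", response["QueryExecutionId"])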
The other options are not as suitable as option A for the company’s requirements for
the following reasons:
Option B: Amazon Redshift and AWS Glue are technologies that can be used for data
warehousing and data integration, but they are not ideal for a temporary ad hoc
solution. Amazon Redshift is a service that provides a fully managed, petabyte-scale
data warehouse that can run complex analytical queries using SQL. AWS Glue is a
service that provides a fully managed extract, transform, and load (ETL) service that
can prepare and load data for analytics. However, using Amazon Redshift and AWS
Glue would require more effort and cost than using Amazon S3 and Amazon Athena.
The company would need to load the data from Amazon S3 to Amazon Redshift using
AWS Glue, which can take time and incur additional charges. The company would
also need to manage the capacity and performance of the Amazon Redshift cluster,
which can be complex and expensive.
Option C: Amazon DynamoDB and DynamoDB Accelerator (DAX) are technologies
that can be used for fast and scalable NoSQL database and caching, but they are not
suitable for the company’s data storage and query needs. Amazon DynamoDB is a
service that provides a fully managed, key-value and document database that can
deliver single-digit millisecond performance at any scale.
DynamoDB Accelerator (DAX) is a service that provides a fully managed, in-memory
cache for DynamoDB that can improve the read performance by up to 10 times.
However, using Amazon DynamoDB and DAX would not allow the company to
continue to use SQL as a query language, as Amazon DynamoDB does not support
SQL. The company would need to use the DynamoDB API or the AWS SDKs to
access and query the data, which can require more coding and learning effort. The
company would also need to transform the csv and JSON files into DynamoDB items,
which can involve additional processing and complexity.
Option D: Amazon RDS and Amazon ES are technologies that can be used for
relational database and search and analytics, but they are not optimal for the
company’s data storage and query scenario. Amazon RDS is a service that provides
a fully managed, relational database that supports various database engines, such as
MySQL, PostgreSQL, Oracle, etc. Amazon ES is a service that provides a fully
managed, Elasticsearch cluster, which is mainly used for search and analytics
purposes. However, using Amazon RDS and Amazon ES would not be as simple and
cost-effective as using Amazon S3 and Amazon Athena. The company would need to
load the data from Amazon S3 to Amazon RDS, which can take time and incur
additional charges. The company would also need to manage the capacity and
performance of the Amazon RDS and Amazon ES clusters, which can be complex
and expensive. Moreover, Amazon RDS and Amazon ES are not designed to handle
semi-structured data, such as JSON, as well as Amazon S3 and Amazon Athena.
Reference:
Amazon S3
Amazon Athena
Amazon Redshift
AWS Glue
Amazon DynamoDB
[DynamoDB Accelerator (DAX)]
[Amazon RDS]
[Amazon ES]
55.A Machine Learning Specialist is working for a credit card processing company
and receives an unbalanced dataset containing credit card transactions. It contains
99,000 valid transactions and 1,000 fraudulent transactions. The Specialist is asked
to score a model that was run against the dataset. The Specialist has been advised
that identifying valid transactions is equally as important as identifying fraudulent
transactions
What metric is BEST suited to score the model?
A. Precision
B. Recall
C. Area Under the ROC Curve (AUC)
D. Root Mean Square Error (RMSE)
Answer: C
Explanation:
Area Under the ROC Curve (AUC) is a metric that is best suited to score the model
for the given scenario. AUC is a measure of the performance of a binary classifier,
such as a model that predicts whether a credit card transaction is valid or fraudulent.
AUC is calculated based on the Receiver Operating Characteristic (ROC) curve,
which is a plot that shows the trade-off between the true positive rate (TPR) and the
false positive rate (FPR) of the classifier as the decision threshold is varied. The TPR,
also known as recall or sensitivity, is the proportion of actual positive cases
(fraudulent transactions) that are correctly predicted as positive by the classifier. The
FPR, also known as the fall-out, is the proportion of actual negative cases (valid
transactions) that are incorrectly predicted as positive by the classifier. The ROC
curve illustrates how well the classifier can distinguish between the two classes,
regardless of the class distribution or the error costs. A perfect classifier would have a
TPR of 1 and an FPR of 0 for all thresholds, resulting in a ROC curve that goes from
the bottom left to the top left and then to the top right of the plot. A random classifier
would have a TPR and an FPR that are equal for all thresholds, resulting in a ROC
curve that goes from the bottom left to the top right of the plot along the diagonal line.
AUC is the area under the ROC curve, and it ranges from 0 to 1. A higher AUC
indicates a better classifier, as it means that the classifier has a higher TPR and a
lower FPR for all thresholds. AUC is a useful metric for imbalanced classification
problems, such as the credit card transaction dataset, because it is insensitive to the
class imbalance and the error costs. AUC can capture the overall performance of the
classifier across all possible scenarios, and it can be used to compare different
classifiers based on their ROC curves.
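A minimal sketch of computing AUC on an imbalanced dataset with scikit-learn follows; the synthetic data mirrors the 99%/1% class split in the question but is otherwise illustrative.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=100_000, weights=[0.99, 0.01], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# AUC is computed from predicted scores, so it is independent of any single
# decision threshold and robust to the 99:1 class imbalance.
scores = model.predict_proba(X_test)[:, 1]
print("AUC:", roc_auc_score(y_test, scores))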
The other options are not as suitable as AUC for the given scenario for the following reasons:
Precision: Precision is the proportion of predicted positive cases (fraudulent
transactions) that are actually positive. Precision is a useful metric when the cost of a
false positive is high, such as in spam detection or medical diagnosis. However,
precision alone is not a good metric for imbalanced classification problems, because it
ignores false negatives. For example, a classifier that flags only one obviously
fraudulent transaction and labels everything else as valid would have a precision of 1
while missing 999 of the 1,000 fraudulent transactions, yet it would still report roughly
99% accuracy. Precision is also dependent on the decision threshold and the error
costs, which may vary for different scenarios.
Recall: Recall is the same as the TPR, and it is the proportion of actual positive cases
(fraudulent transactions) that are correctly predicted as positive by the classifier.
Recall is a useful metric when the cost of a false negative is high, such as in fraud
detection or cancer diagnosis. However, recall alone is not a good metric for
imbalanced classification problems, because it can be trivially maximized by over-
predicting the positive class. For example, a classifier that predicts all transactions as
fraudulent would have a recall of 1, but an accuracy of only 1%. Recall is also
dependent on the decision threshold and the error costs, which may vary for different
scenarios.
Root Mean Square Error (RMSE): RMSE is a metric that measures the average
difference between the predicted and the actual values. RMSE is a useful metric for
regression problems, where the goal is to predict a continuous value, such as the
price of a house or the temperature of a city. However, RMSE is not a good metric for
classification problems, where the goal is to predict a discrete value, such as the
class label of a transaction. RMSE is not meaningful for classification problems,
because it does not capture the accuracy or the error costs of the predictions.
Reference:
ROC Curve and AUC
How and When to Use ROC Curves and Precision-Recall Curves for Classification in Python
Precision-Recall
Root Mean Squared Error
56.A bank's Machine Learning team is developing an approach for credit card fraud
detection. The company has a large dataset of historical data labeled as fraudulent.
The goal is to build a model to take the information from new transactions and predict
whether each transaction is fraudulent or not
Which built-in Amazon SageMaker machine learning algorithm should be used for
modeling this problem?
A. Seq2seq
B. XGBoost
C. K-means
D. Random Cut Forest (RCF)
Answer: B
Explanation:
XGBoost is a built-in Amazon SageMaker machine learning algorithm that should be
used for modeling the credit card fraud detection problem. XGBoost is an algorithm
that implements a scalable and distributed gradient boosting framework, which is a
popular and effective technique for supervised learning problems. Gradient boosting
is a method of combining multiple weak learners, such as decision trees, into a strong
learner, by iteratively fitting new models to the residual errors of the previous models
and adding them to the ensemble. XGBoost can handle various types of data, such
as numerical, categorical, or text, and can perform both regression and classification
tasks. XGBoost also supports various features and optimizations, such as
regularization, missing value handling, parallelization, and cross-validation, that can
improve the performance and efficiency of the algorithm.
XGBoost is suitable for the credit card fraud detection problem for the following
reasons:
The problem is a binary classification problem, where the goal is to predict whether a
transaction is fraudulent or not, based on the information from new transactions.
XGBoost can perform binary classification by using a logistic regression objective
function and outputting the probability of the positive class (fraudulent) for each
transaction.
The problem involves a large and imbalanced dataset of historical data labeled as
fraudulent. XGBoost can handle large-scale and imbalanced data by using distributed
and parallel computing, as well as techniques such as weighted sampling, class
weighting, or stratified sampling, to balance the classes and reduce the bias towards
the majority class (non-fraudulent).
The problem requires a high accuracy and precision for detecting fraudulent
transactions, as well as a low false positive rate for avoiding false alarms. XGBoost
can achieve high accuracy and precision by using gradient boosting, which can learn
complex and non-linear patterns from the data and reduce the variance and overfitting
of the model. XGBoost can also achieve a low false positive rate by using
regularization, which can reduce the complexity and noise of the model and prevent it
from fitting spurious signals in the data.
The other options are not as suitable as XGBoost for the credit card fraud detection
problem for the following reasons:
Seq2seq: Seq2seq is an algorithm that implements a sequence-to-sequence model,
which is a type of neural network model that can map an input sequence to an output
sequence. Seq2seq is mainly used for natural language processing tasks, such as
machine translation, text summarization, or dialogue generation. Seq2seq is not
suitable for the credit card fraud detection problem, because the problem is not a
sequence-to-sequence task, but a binary classification task. The input and output of
the problem are not sequences of words or tokens, but vectors of features and labels.
K-means: K-means is an algorithm that implements a clustering technique, which is a
type of unsupervised learning method that can group similar data points into clusters.
K-means is mainly used for exploratory data analysis, dimensionality reduction, or
anomaly detection. K-means is not suitable for the credit card fraud detection
problem, because the problem is not a clustering task, but a classification task. The
problem requires using the labeled data to train a model that can predict the labels of
new data, not finding the optimal number of clusters or the cluster memberships of the
data.
Random Cut Forest (RCF): RCF is an algorithm that implements an anomaly
detection technique, which is a type of unsupervised learning method that can identify
data points that deviate from the normal behavior or distribution of the data. RCF is
mainly used for detecting outliers, frauds, or faults in the data. RCF is not suitable for
the credit card fraud detection problem, because the problem is not an anomaly
detection task, but a classification task. The problem requires using the labeled data
to train a model that can predict the labels of new data, not finding the anomaly
scores or the anomalous data points in the data.
Reference:
XGBoost Algorithm
Use XGBoost for Binary Classification with Amazon SageMaker
Seq2seq Algorithm
K-means Algorithm
[Random Cut Forest Algorithm]
57.While working on a neural network project, a Machine Learning Specialist
discovers that some features in the data have very high magnitude resulting in this
data being weighted more in the cost function.
What should the Specialist do to ensure better convergence during backpropagation?
A. Dimensionality reduction
B. Data normalization
C. Model regularization
D. Data augmentation for the minority class
Answer: B
Explanation:
Data normalization is a data preprocessing technique that scales the features to a
common range, such as [0, 1] or [-1, 1]. This helps reduce the impact of features with
high magnitude on the cost function and improves the convergence during
backpropagation. Data normalization can be done using different methods, such as
min-max scaling, z-score standardization, or unit vector normalization. Data
normalization is different from dimensionality reduction, which reduces the number of
features; model regularization, which adds a penalty term to the cost function to
prevent overfitting; and data augmentation, which increases the amount of data by
creating synthetic samples.
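A minimal sketch of min-max scaling and z-score standardization with scikit-learn follows; the small feature matrix is an illustrative placeholder with one high-magnitude column.

import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 20000.0],
              [2.0, 50000.0],
              [3.0, 80000.0]])   # second column has a much larger magnitude

X_minmax = MinMaxScaler().fit_transform(X)       # scales each feature to [0, 1]
X_zscore = StandardScaler().fit_transform(X)     # zero mean, unit variance per feature

print(X_minmax)
print(X_zscore)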
References:
Data processing options for AI/ML | AWS Machine Learning Blog
Data preprocessing - Machine Learning Lens
How to Normalize Data Using scikit-learn in Python
Normalization | Machine Learning | Google for Developers
58.An online reseller has a large, multi-column dataset with one column missing 30%
of its data A Machine Learning Specialist believes that certain columns in the dataset
could be used to reconstruct the missing data.
Which reconstruction approach should the Specialist use to preserve the integrity of
the dataset?
A. Listwise deletion
B. Last observation carried forward
C. Multiple imputation
D. Mean substitution
Answer: C
Explanation:
Multiple imputation is a technique that uses machine learning to generate multiple
plausible values for each missing value in a dataset, based on the observed data and
the relationships among the variables. Multiple imputation preserves the integrity of
the dataset by accounting for the uncertainty and variability of the missing data, and
avoids the bias and loss of information that may result from other methods, such as
listwise deletion, last observation carried forward, or mean substitution. Multiple
imputation can improve the accuracy and validity of statistical analysis and machine
learning models that use the imputed dataset.
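A minimal sketch approximating multiple imputation with scikit-learn's IterativeImputer (an approach inspired by MICE) is shown below; running it several times with sample_posterior=True yields multiple plausible completed datasets. The DataFrame contents are illustrative placeholders, not data from the question.

import numpy as np
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

df = pd.DataFrame({
    "price":    [10.0, 12.5, np.nan, 14.0, np.nan],
    "quantity": [100,  80,   90,     np.nan, 70],
    "rating":   [4.1,  3.9,  4.4,    4.0,  3.8],
})

imputations = []
for seed in range(5):  # five imputed versions of the dataset
    imputer = IterativeImputer(sample_posterior=True, random_state=seed)
    imputations.append(pd.DataFrame(imputer.fit_transform(df), columns=df.columns))

# Downstream analysis can be run on each completed dataset and the results pooled.
print(imputations[0])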
References:
Managing missing values in your target and related datasets with automated
imputation support in Amazon Forecast
Imputation by feature importance (IBFI): A methodology to impute missing data in large datasets
Multiple Imputation by Chained Equations (MICE) Explained
59.A Machine Learning Specialist discovers the following statistics while experimenting
on a model.
What can the Specialist conclude from the experiments?
A. The model in Experiment 1 had a high variance error that was reduced in
Experiment 3 by regularization. Experiment 2 shows that there is minimal bias error in
Experiment 1.
B. The model in Experiment 1 had a high bias error that was reduced in Experiment 3
by regularization. Experiment 2 shows that there is minimal variance error in
Experiment 1.
C. The model in Experiment 1 had a high bias error and a high variance error that
were reduced in Experiment 3 by regularization. Experiment 2 shows that high bias
cannot be reduced by increasing layers and neurons in the model.
D. The model in Experiment 1 had a high random noise error that was reduced in
Experiment 3 by regularization. Experiment 2 shows that random noise cannot be
reduced by increasing layers and neurons in the model.
Answer: A
Explanation:
The model in Experiment 1 had a high variance error because it performed well on
the training data (train error = 5%) but poorly on the test data (test error = 8%). This
indicates that the model was overfitting the training data and not generalizing well to
new data. The model in Experiment 3 had a lower variance error because it
performed similarly on the training data (train error = 5.1%) and the test data (test
error = 5.4%). This indicates that the model was more robust and less sensitive to the
fluctuations in the training data. The model in Experiment 3 achieved this
improvement by implementing regularization, which is a technique that reduces the
complexity of the model and prevents overfitting by adding a penalty term to the loss
function. The model in Experiment 2 had a minimal bias error because it performed
similarly on the training data (train error = 5.2%) and the test data (test error = 5.7%)
as the model in Experiment 1. This indicates that the model was not underfitting the
data and capturing the true relationship between the input and output variables. The
model in Experiment 2 increased the number of layers and neurons in the model,
which is a way to increase the complexity and flexibility of the model. However, this
did not improve the performance of the model, as the variance error remained high.
This shows that increasing the complexity of the model is not always the best way to
reduce the bias error, and may even increase the variance error if the model becomes
too complex for the data.
References:
Bias Variance Tradeoff - Clearly Explained - Machine Learning Plus
The Bias-Variance Trade-off in Machine Learning - Stack Abuse
60.A Machine Learning Specialist needs to be able to ingest streaming data and store
it in Apache Parquet files for exploration and analysis.
Which of the following services would both ingest and store this data in the correct
format?
A. AWSDMS
B. Amazon Kinesis Data Streams
C. Amazon Kinesis Data Firehose
D. Amazon Kinesis Data Analytics
Answer: C
Explanation:
Amazon Kinesis Data Firehose is a service that can ingest streaming data and store it
in various destinations, including Amazon S3, Amazon Redshift, Amazon
Elasticsearch Service, and Splunk. Amazon Kinesis Data Firehose can also convert
the incoming data to Apache Parquet or Apache ORC format before storing it in
Amazon S3. This can reduce the storage cost and improve the performance of
analytical queries on the data. Amazon Kinesis Data Firehose supports various data
sources, such as Amazon Kinesis Data Streams, Amazon Managed Streaming for
Apache Kafka, AWS IoT, and custom applications. Amazon Kinesis Data Firehose
can also apply data transformation and compression using AWS Lambda functions.
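As a sketch, a delivery stream with Parquet conversion might be created with boto3 roughly as follows; the stream name, role and bucket ARNs, and the AWS Glue database/table that supplies the schema are hypothetical placeholders, and format conversion requires a larger buffer size.

import boto3

firehose = boto3.client("firehose", region_name="us-east-1")

firehose.create_delivery_stream(
    DeliveryStreamName="clickstream-to-parquet",
    DeliveryStreamType="DirectPut",
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::<account-id>:role/<firehose-role>",
        "BucketARN": "arn:aws:s3:::<bucket>",
        "Prefix": "parquet/",
        "BufferingHints": {"SizeInMBs": 64, "IntervalInSeconds": 300},
        "DataFormatConversionConfiguration": {
            "Enabled": True,
            # The Glue table defines the schema used to write Parquet files.
            "SchemaConfiguration": {
                "DatabaseName": "ml_catalog",
                "TableName": "clickstream",
                "RoleARN": "arn:aws:iam::<account-id>:role/<firehose-role>",
            },
            "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
            "OutputFormatConfiguration": {"Serializer": {"ParquetSerDe": {}}},
        },
    },
)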
AWSDMS is not a valid service name. AWS Database Migration Service (AWS DMS)
is a service that can migrate data from various sources to various targets, but it does
not support streaming data or Parquet format.
Amazon Kinesis Data Streams is a service that can ingest and process streaming
data in real time, but it does not store the data in any destination. Amazon Kinesis
Data Streams can be integrated with Amazon Kinesis Data Firehose to store the data
in Parquet format.
Amazon Kinesis Data Analytics is a service that can analyze streaming data using
SQL or Apache Flink, but it does not store the data in any destination. Amazon
Kinesis Data Analytics can be integrated with Amazon Kinesis Data Firehose to store
the data in Parquet format.
References:
Amazon Kinesis Data Firehose - Amazon Web Services
What Is Amazon Kinesis Data Firehose? - Amazon Kinesis Data Firehose
Amazon Kinesis Data Firehose FAQs - Amazon Web Services
61.A Machine Learning Specialist needs to move and transform data in preparation
for training. Some of the data needs to be processed in near-real time and other data
can be moved hourly. There are existing Amazon EMR MapReduce jobs to clean and
perform feature engineering on the data.
Which of the following services can feed data to the MapReduce jobs? (Select TWO)
A. AWSDMS
B. Amazon Kinesis
C. AWS Data Pipeline
D. Amazon Athena
E. Amazon ES
Answer: B, C
Explanation:
Amazon Kinesis and AWS Data Pipeline are two services that can feed data to the
Amazon EMR MapReduce jobs. Amazon Kinesis is a service that can ingest,
process, and analyze streaming data in real time. Amazon Kinesis can be integrated
with Amazon EMR to run MapReduce jobs on streaming data sources, such as web
logs, social media, IoT devices, and clickstreams. Amazon Kinesis can handle data
that needs to be processed in near-real time, such as for anomaly detection, fraud
detection, or dashboarding. AWS Data Pipeline is a service that can orchestrate and
automate data movement and transformation across various AWS services and on-
premises data sources. AWS Data Pipeline can be integrated with Amazon EMR to
run MapReduce jobs on batch data sources, such as Amazon S3, Amazon RDS,
Amazon DynamoDB, and Amazon Redshift. AWS Data Pipeline can handle data that
can be moved hourly, such as for data warehousing, reporting, or machine learning.
AWSDMS is not a valid service name. AWS Database Migration Service (AWS DMS)
is a service that can migrate data from various sources to various targets, but it does
not support streaming data or MapReduce jobs.
Amazon Athena is a service that can query data stored in Amazon S3 using standard
SQL, but it does not feed data to Amazon EMR or run MapReduce jobs.
Amazon ES is a service that provides a fully managed Elasticsearch cluster, which
can be used for search, analytics, and visualization, but it does not feed data to
Amazon EMR or run MapReduce jobs.
References:
Using Amazon Kinesis with Amazon EMR - Amazon EMR
AWS Data Pipeline - Amazon Web Services
Using AWS Data Pipeline to Run Amazon EMR Jobs - AWS Data Pipeline
62.An insurance company is developing a new device for vehicles that uses a camera
to observe drivers' behavior and alert them when they appear distracted. The
company created approximately 10,000 training images in a controlled environment
that a Machine Learning Specialist will use to train and evaluate machine learning
models
During the model evaluation, the Specialist notices that the training error rate
diminishes faster as the number of epochs increases, and the model is not accurately
inferring on the unseen test images.
Which of the following should be used to resolve this issue? (Select TWO)
A. Add vanishing gradient to the model
B. Perform data augmentation on the training data
C. Make the neural network architecture complex.
D. Use gradient checking in the model
E. Add L2 regularization to the model
Answer: B, E
Explanation:
The issue described in the question is a sign of overfitting, which is a common
problem in machine learning when the model learns the noise and details of the
training data too well and fails to generalize to new and unseen data. Overfitting can
result in a low training error rate but a high test error rate, which indicates poor
performance and validity of the model. There are several techniques that can be used
to prevent or reduce overfitting, such as data augmentation and regularization. Data
augmentation is a technique that applies various transformations to the original
training data, such as rotation, scaling, cropping, flipping, adding noise, changing
brightness, etc., to create new and diverse data samples. Data augmentation can
increase the size and diversity of the training data, which can help the model learn
more features and patterns and reduce the variance of the model. Data augmentation
is especially useful for image data, as it can simulate different scenarios and
perspectives that the model may encounter in real life. For example, in the question,
the device uses a camera to observe drivers’ behavior, so data augmentation can
help the model deal with different lighting conditions, angles, distances, etc. Data
augmentation can be done using various libraries and frameworks, such as
TensorFlow, PyTorch, Keras, OpenCV, etc.
Regularization is a technique that adds a penalty term to the model’s objective
function, which is typically based on the model’s parameters. Regularization can
reduce the complexity and flexibility of the model, which can prevent overfitting by
avoiding learning the noise and details of the training data. Regularization can also
improve the stability and robustness of the model, as it can reduce the sensitivity of
the model to small fluctuations in the data. There are different types of regularization,
such as L1, L2, dropout, etc., but they all have the same goal of reducing overfitting.
L2 regularization, also known as weight decay or ridge regression, is one of the most
common and effective regularization techniques. L2 regularization adds the squared
norm of the model’s parameters multiplied by a regularization parameter (lambda) to
the model’s objective function. L2 regularization can shrink the model’s parameters
towards zero, which can reduce the variance of the model and improve the
generalization ability of the model. L2 regularization can be implemented using
various libraries and frameworks, such as TensorFlow, PyTorch, Keras, Scikit-learn,
etc.
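A minimal sketch of combining image data augmentation with L2 (weight decay) regularization in Keras follows; the layer sizes and augmentation settings are illustrative assumptions, not taken from the question.

import tensorflow as tf
from tensorflow.keras import layers, regularizers

data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.2),
])

model = tf.keras.Sequential([
    data_augmentation,                                   # active only during training
    layers.Rescaling(1.0 / 255),
    layers.Conv2D(32, 3, activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dense(1, activation="sigmoid"),               # distracted vs. not distracted
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])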
The other options are not valid or relevant for resolving the issue of overfitting. Adding
vanishing gradient to the model is not a technique, but a problem that occurs when
the gradient of the model’s objective function becomes very small and the model
stops learning. Making the neural network architecture complex is not a solution, but a
possible cause of overfitting, as a complex model can have more parameters and
more flexibility to fit the training data too well. Using gradient checking in the model is
not a technique, but a debugging method that verifies the correctness of the gradient
computation in the model. Gradient checking is not related to overfitting, but to the
implementation of the model.
63.The Chief Editor for a product catalog wants the Research and Development team
to build a machine learning system that can be used to detect whether or not
individuals in a collection of images are wearing the company's retail brand. The team
has a set of training data
Which machine learning algorithm should the researchers use that BEST meets their
requirements?
A. Latent Dirichlet Allocation (LDA)
B. Recurrent neural network (RNN)
C. K-means
D. Convolutional neural network (CNN)
Answer: D
Explanation:
A convolutional neural network (CNN) is a type of machine learning algorithm that is
suitable for image classification tasks. A CNN consists of multiple layers that can
extract features from images and learn to recognize patterns and objects. A CNN can
also use transfer learning to leverage pre-trained models that have been trained on
large-scale image datasets, such as ImageNet, and fine-tune them for specific tasks,
such as detecting the company’s retail brand. A CNN can achieve high accuracy and
performance for image classification problems, as it can handle complex and diverse
images and reduce the dimensionality and noise of the input data. A CNN can be
implemented using various frameworks and libraries, such as TensorFlow, PyTorch,
Keras, MXNet, etc.
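For instance, a minimal sketch of a binary image classifier that fine-tunes a CNN pre-trained on ImageNet (transfer learning) is shown below; the image size, base network choice, and training settings are illustrative placeholders.

import tensorflow as tf
from tensorflow.keras import layers

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pre-trained convolutional features

model = tf.keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.2),
    layers.Dense(1, activation="sigmoid"),  # wearing the retail brand or not
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_dataset, validation_data=val_dataset, epochs=5)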
The other options are not valid or relevant for the image classification task. Latent
Dirichlet Allocation (LDA) is a type of machine learning algorithm that is suitable for
topic modeling tasks. LDA can discover the hidden topics and their proportions in a
collection of text documents, such as news articles, tweets, reviews, etc. LDA is not
applicable for image data, as it requires textual input and output. LDA can be
implemented using various frameworks and libraries, such as Gensim, Scikit-learn,
Mallet, etc.
Recurrent neural network (RNN) is a type of machine learning algorithm that is
suitable for sequential data tasks. RNN can process and generate data that has
temporal or sequential dependencies, such as natural language, speech, audio,
video, etc. RNN is not optimal for image data, as it does not capture the spatial
features and relationships of the pixels. RNN can be implemented using various
frameworks and libraries, such as TensorFlow, PyTorch, Keras, MXNet, etc.
K-means is a type of machine learning algorithm that is suitable for clustering tasks. K-
means can partition a set of data points into a predefined number of clusters, based
on the similarity and distance between the data points. K-means is not suitable for
image classification tasks, as it does not learn to label the images or detect the
objects of interest. K-means can be implemented using various frameworks and
libraries, such as Scikit-learn, TensorFlow, PyTorch, etc.
64.A Machine Learning Specialist kicks off a hyperparameter tuning job for a tree-
based ensemble model using Amazon SageMaker with Area Under the ROC Curve
(AUC) as the objective metric. This workflow will eventually be deployed in a pipeline
that retrains and tunes hyperparameters each night to model click-through on data
that goes stale every 24 hours.
With the goal of decreasing the amount of time it takes to train these models, and
ultimately to decrease costs, the Specialist wants to reconfigure the input
hyperparameter range(s).
Which visualization will accomplish this?
A. A histogram showing whether the most important input feature is Gaussian.
B. A scatter plot with points colored by target variable that uses t-Distributed
Stochastic Neighbor Embedding (t-SNE) to visualize the large number of input
variables in an easier-to-read dimension.
C. A scatter plot showing the performance of the objective metric over each training
iteration
D. A scatter plot showing the correlation between maximum tree depth and the
objective metric.
Answer: D
Explanation:
A scatter plot showing the correlation between maximum tree depth and the objective
metric is a visualization that can help the Machine Learning Specialist reconfigure the
input hyperparameter range(s) for the tree-based ensemble model. A scatter plot is a
type of graph that displays the relationship between two variables using dots, where
each dot represents one observation. A scatter plot can show the direction, strength,
and shape of the correlation between the variables, as well as any outliers or clusters.
In this case, the scatter plot can show how the maximum tree depth, which is a
hyperparameter that controls the complexity and depth of the decision trees in the
ensemble model, affects the AUC, which is the objective metric that measures the
performance of the model in terms of the trade-off between true positive rate and
false positive rate. By looking at the scatter plot, the Machine Learning Specialist can
see if there is a positive, negative, or no correlation between the maximum tree depth
and the AUC, and how strong or weak the correlation is. The Machine Learning
Specialist can also see if there is an optimal value or range of values for the
maximum tree depth that maximizes the AUC, or if there is a point of diminishing
returns or overfitting where increasing the maximum tree depth does not improve or
even worsens the AUC. Based on the scatter plot, the Machine Learning Specialist
can reconfigure the input hyperparameter range(s) for the maximum tree depth to
focus on the values that yield the best AUC, and avoid the values that result in poor
AUC. This can decrease the amount of time and cost it takes to train the model, as
the hyperparameter tuning job can explore fewer and more promising combinations of
values. A scatter plot can be created using various tools and libraries, such as
Matplotlib, Seaborn, Plotly, etc.
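A minimal sketch of producing such a plot from a completed SageMaker tuning job is shown below; the tuning job name is a hypothetical placeholder, and it assumes max_depth was one of the tuned hyperparameters.

import matplotlib.pyplot as plt
from sagemaker.analytics import HyperparameterTuningJobAnalytics

analytics = HyperparameterTuningJobAnalytics("nightly-ctr-tuning-job")
df = analytics.dataframe()  # one row per training job, with hyperparameters and metric

plt.scatter(df["max_depth"].astype(float), df["FinalObjectiveValue"])
plt.xlabel("max_depth")
plt.ylabel("AUC (objective metric)")
plt.title("Objective metric vs. maximum tree depth")
plt.show()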
The other options are not valid or relevant for reconfiguring the input hyperparameter
range(s) for the tree-based ensemble model. A histogram showing whether the most
important input feature is Gaussian is a visualization that can help the Machine
Learning Specialist understand the distribution and shape of the input data, but not
the hyperparameters. A histogram is a type of graph that displays the frequency or
count of values in a single variable using bars, where each bar represents a bin or
interval of values. A histogram can show if the variable is symmetric, skewed, or
multimodal, and if it follows a normal or Gaussian distribution, which is a bell-shaped
curve that is often assumed by many machine learning algorithms. In this case, the
histogram can show if the most important input feature, which is a variable that has
the most influence or predictive power on the output variable, is Gaussian or not.
However, this does not help the Machine Learning Specialist reconfigure the input
hyperparameter range(s) for the tree-based ensemble model, as the input feature is
not a hyperparameter that can be tuned or optimized. A histogram can be created
using various tools and libraries, such as Matplotlib, Seaborn, Plotly, etc.
A scatter plot with points colored by target variable that uses t-Distributed Stochastic
Neighbor Embedding (t-SNE) to visualize the large number of input variables in an
easier-to-read dimension is a visualization that can help the Machine Learning
Specialist understand the structure and clustering of the input data, but not the
hyperparameters. t-SNE is a technique that can reduce the dimensionality of high-
dimensional data, such as images, text, or gene expression, and project it onto a
lower-dimensional space, such as two or three dimensions, while preserving the local
similarities and distances between the data points. t-SNE can help visualize and
explore the patterns and relationships in the data, such as the clusters, outliers, or
separability of the classes. In this case, the scatter plot can show how the input
variables, which are the features or predictors of the output variable, are mapped onto
a two-dimensional space using t-SNE, and how the points are colored by the target
variable, which is the output or response variable that the model tries to predict.
However, this does not help the Machine Learning Specialist reconfigure the input
hyperparameter range(s) for the tree-based ensemble model, as the input variables
and the target variable are not hyperparameters that can be tuned or optimized. A
scatter plot with t-SNE can be created using various tools and libraries, such as Scikit-
learn, TensorFlow, PyTorch, etc.
A scatter plot showing the performance of the objective metric over each training
iteration is a visualization that can help the Machine Learning Specialist understand
the learning curve and convergence of the model, but not the hyperparameters. A
scatter plot is a type of graph that displays the relationship between two variables
using dots, where each dot represents one observation. A scatter plot can show the
direction, strength, and shape of the correlation between the variables, as well as any
outliers or clusters. In this case, the scatter plot can show how the objective metric,
which is the performance measure that the model tries to optimize, changes over
each training iteration, which is the number of times that the model updates its
parameters using a batch of data. A scatter plot can show if the objective metric
improves, worsens, or stagnates over time, and if the model converges to a stable
value or oscillates or diverges. However, this does not help the Machine Learning
Specialist reconfigure the input hyperparameter range(s) for the tree-based ensemble
model, as the objective metric and the training iteration are not hyperparameters that
can be tuned or optimized. A scatter plot can be created using various tools and
libraries, such as Matplotlib, Seaborn, Plotly, etc.
65.A Machine Learning Specialist is configuring automatic model tuning in Amazon
SageMaker.
When using the hyperparameter optimization feature, which of the following
guidelines should be followed to improve optimization?
A. Choose the maximum number of hyperparameters supported by Amazon
SageMaker to search the largest number of combinations possible
B. Specify a very large hyperparameter range to allow Amazon SageMaker to cover
every possible value.
C. Use log-scaled hyperparameters to allow the hyperparameter space to be
searched as quickly as possible
D. Execute only one hyperparameter tuning job at a time and improve tuning through
successive rounds of experiments
Answer: C
Explanation:
Using log-scaled hyperparameters is a guideline that can improve the automatic
model tuning in Amazon SageMaker. Log-scaled hyperparameters are
hyperparameters that have values that span several orders of magnitude, such as
learning rate, regularization parameter, or number of hidden units. Log-scaled
hyperparameters can be specified by using a log-uniform distribution, which assigns
equal probability to each order of magnitude within a range. For example, a log-
uniform distribution between 0.001 and 1000 can sample values such as 0.001, 0.01,
0.1, 1, 10, 100, or 1000 with equal probability. Using log-scaled hyperparameters can
allow the hyperparameter optimization feature to search the hyperparameter space
more efficiently and effectively, as it can explore different scales of values and avoid
sampling values that are too small or too large. Using log-scaled hyperparameters
can also help avoid numerical issues, such as underflow or overflow, that may occur
when using linear-scaled hyperparameters. Using log-scaled hyperparameters can be
done by setting the ScalingType parameter to Logarithmic when defining the
hyperparameter ranges in Amazon SageMaker.
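For example, a minimal sketch of log-scaled hyperparameter ranges with the SageMaker Python SDK follows; the image URI, role, S3 paths, and the XGBoost-style parameter names (eta, lambda) and bounds are illustrative assumptions.

from sagemaker.estimator import Estimator
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter

estimator = Estimator(
    image_uri="<algorithm-image-uri>",
    role="<execution-role-arn>",
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://<bucket>/output/",
)

hyperparameter_ranges = {
    # These values span several orders of magnitude, so a logarithmic scale
    # samples each order of magnitude with roughly equal probability.
    "eta": ContinuousParameter(1e-5, 1e-1, scaling_type="Logarithmic"),
    "lambda": ContinuousParameter(1e-3, 1e3, scaling_type="Logarithmic"),
}

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:auc",
    objective_type="Maximize",
    hyperparameter_ranges=hyperparameter_ranges,
    max_jobs=20,
    max_parallel_jobs=4,
)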
The other options are not valid or relevant guidelines for improving the automatic
model tuning in Amazon SageMaker. Choosing the maximum number of
hyperparameters supported by Amazon SageMaker to search the largest number of
combinations possible is not a good practice, as it can increase the time and cost of
the tuning job and make it harder to find the optimal values. Amazon SageMaker
supports up to 20 hyperparameters for tuning, but it is recommended to choose only
the most important and influential hyperparameters for the model and algorithm, and
use default or fixed values for the rest. Specifying a very large hyperparameter range
to allow Amazon SageMaker to cover every possible value is not a good practice, as
it can result in sampling values that are irrelevant or impractical for the model and
algorithm, and waste the tuning budget. It is recommended to specify a reasonable
and realistic hyperparameter range based on the prior knowledge and experience of
the model and algorithm, and use the results of the tuning job to refine the range if
needed. Executing only one hyperparameter tuning job at a time and improving
tuning through successive rounds of experiments is not a good practice, as it can limit
the exploration and exploitation of the hyperparameter space and make the tuning
process slower and less efficient. It is recommended to use parallelism and
concurrency to run multiple training jobs simultaneously and leverage the Bayesian
optimization algorithm that Amazon SageMaker uses to guide the search for the best
hyperparameter values.
66.A large mobile network operating company is building a machine learning model to
predict customers who are likely to unsubscribe from the service. The company plans
to offer an incentive for these customers as the cost of churn is far greater than the
cost of the incentive.
The model produces the following confusion matrix after evaluating on a test dataset
of 100 customers:
Based on the model evaluation results, why is this a viable model for production?
A. The model is 86% accurate and the cost incurred by the company as a result of
false negatives is less than the false positives.
B. The precision of the model is 86%, which is less than the accuracy of the model.
C. The model is 86% accurate and the cost incurred by the company as a result of
false positives is less than the false negatives.
D. The precision of the model is 86%, which is greater than the accuracy of the
model.
Answer: C
Explanation:
Based on the model evaluation results, this is a viable model for production because
the model is 86% accurate and the cost incurred by the company as a result of false
positives is less than the false negatives. The accuracy of the model is the proportion
of correct predictions out of the total predictions, which can be calculated by adding
the true positives and true negatives and dividing by the total number of observations.
In this case, the accuracy of the model is (10 + 76) / 100 = 0.86, which means that the
model correctly predicted 86% of the customers’ churn status. The cost incurred by
the company as a result of false positives and false negatives is the loss or damage
that the company suffers when the model makes incorrect predictions. A false positive
is when the model predicts that a customer will churn, but the customer actually does
not churn. A false negative is when the model predicts that a customer will not churn,
but the customer actually churns. In this case, the cost of a false positive is the
incentive that the company offers to the customer who is predicted to churn, which is
a relatively low cost. The cost of a false negative is the revenue that the company
loses when the customer churns, which is a relatively high cost. Therefore, the cost of
a false positive is less than the cost of a false negative, and the company would
prefer to have more false positives than false negatives. The model has 10 false
positives and 4 false negatives, which means that the company’s cost is lower than if
the model had more false negatives and fewer false positives.
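A minimal sketch of the arithmetic described above, using the counts given in the explanation (TP = 10, TN = 76, FP = 10, FN = 4), is shown below; the per-unit cost figures are illustrative assumptions only.

tp, tn, fp, fn = 10, 76, 10, 4

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"Accuracy: {accuracy:.0%}")   # 86%

# With a cheap incentive per false positive and a larger loss per false negative,
# the expected cost favors a model that errs toward false positives.
incentive_cost, churn_loss = 10, 100          # illustrative cost assumptions
print("False positive cost:", fp * incentive_cost)   # 100
print("False negative cost:", fn * churn_loss)       # 400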
67.A Machine Learning Specialist is designing a system for improving sales for a
company. The objective is to use the large amount of information the company has on
users' behavior and product preferences to predict which products users would like
based on the users' similarity to other users.
What should the Specialist do to meet this objective?
A. Build a content-based filtering recommendation engine with Apache Spark ML on
Amazon EMR.
B. Build a collaborative filtering recommendation engine with Apache Spark ML on
Amazon EMR.
C. Build a model-based filtering recommendation engine with Apache Spark ML on
Amazon EMR.
D. Build a combinative filtering recommendation engine with Apache Spark ML on
Amazon EMR.
Answer: B
Explanation:
A collaborative filtering recommendation engine is a type of machine learning system
that can improve sales for a company by using the large amount of information the
company has on users’ behavior and product preferences to predict which products
users would like based on the users’ similarity to other users. A collaborative filtering
recommendation engine works by finding the users who have similar ratings or
preferences for the products, and then recommending the products that the similar
users have liked but the target user has not seen or rated. A collaborative filtering
recommendation engine can leverage the collective wisdom of the users and discover
the hidden patterns and associations among the products and the users. A
collaborative filtering recommendation engine can be implemented using Apache
Spark ML on Amazon EMR, which are two services that can handle large-scale data
processing and machine learning tasks. Apache Spark ML is a library that provides
various tools and algorithms for machine learning, such as classification, regression,
clustering, recommendation, etc. Apache Spark ML can run on Amazon EMR, which
is a service that provides a managed cluster platform that simplifies running big data
frameworks, such as Apache Spark, on AWS. Apache Spark ML on Amazon EMR
can build a collaborative filtering recommendation engine using the Alternating Least
Squares (ALS) algorithm, which is a matrix factorization technique that can learn the
latent factors that represent the users and the products, and then use them to predict
the ratings or preferences of the users for the products. Apache Spark ML on Amazon
EMR can also support both explicit feedback, such as ratings or reviews, and implicit
feedback, such as views or clicks, for building a collaborative filtering
recommendation engine.
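A minimal sketch of a collaborative filtering recommender built with the ALS algorithm in Apache Spark ML (for example, on an Amazon EMR cluster) follows; the ratings path and column names are hypothetical placeholders.

from pyspark.sql import SparkSession
from pyspark.ml.recommendation import ALS

spark = SparkSession.builder.appName("product-recommendations").getOrCreate()

# Expected columns: user_id, product_id, rating (explicit feedback).
ratings = spark.read.parquet("s3://<bucket>/ratings/")

als = ALS(
    userCol="user_id",
    itemCol="product_id",
    ratingCol="rating",
    rank=10,
    regParam=0.1,
    coldStartStrategy="drop",   # skip users/items unseen during training
    implicitPrefs=False,        # set True for clicks/views instead of ratings
)
model = als.fit(ratings)

# Recommend the top 5 products for every user.
recommendations = model.recommendForAllUsers(5)
recommendations.show(truncate=False)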
68.A Data Engineer needs to build a model using a dataset containing customer
credit card information.
How can the Data Engineer ensure the data remains encrypted and the credit card
information is secure?
A. Use a custom encryption algorithm to encrypt the data and store the data on an
Amazon SageMaker
instance in a VPC. Use the SageMaker DeepAR algorithm to randomize the credit
card numbers.
B. Use an IAM policy to encrypt the data on the Amazon S3 bucket and Amazon
Kinesis to automatically
discard credit card numbers and insert fake credit card numbers.
C. Use an Amazon SageMaker launch configuration to encrypt the data once it is
copied to the SageMaker instance in a VPC. Use the SageMaker principal component
analysis (PCA) algorithm to reduce the length of the credit card numbers.
D. Use AWS KMS to encrypt the data on Amazon S3 and Amazon SageMaker, and
redact the credit card numbers from the customer data with AWS Glue.
Answer: D
Explanation:
AWS KMS is a service that provides encryption and key management for data stored
in AWS services and applications. AWS KMS can generate and manage encryption
keys that are used to encrypt and decrypt data at rest and in transit. AWS KMS can
also integrate with other AWS services, such as Amazon S3 and Amazon
SageMaker, to enable encryption of data using the keys stored in AWS KMS. Amazon
S3 is a service that provides object storage for data in the cloud. Amazon S3 can use
AWS KMS to encrypt data at rest using server-side encryption with AWS KMS-
managed keys (SSE-KMS). Amazon SageMaker is a service that provides a platform
for building, training, and deploying machine learning models. Amazon SageMaker
can use AWS KMS to encrypt data at rest on the SageMaker instances and volumes,
as well as data in transit between SageMaker and other AWS services. AWS Glue is
a service that provides a serverless data integration platform for data preparation and
transformation. AWS Glue can use AWS KMS to encrypt data at rest on the Glue
Data Catalog and Glue ETL jobs. AWS Glue can also use built-in or custom
classifiers to identify and redact sensitive data, such as credit card numbers, from the
customer data.
The other options are not valid or secure ways to encrypt the data and protect the
credit card information. Using a custom encryption algorithm to encrypt the data and
store the data on an Amazon SageMaker instance in a VPC is not a good practice, as
custom encryption algorithms are not recommended for security and may have flaws
or vulnerabilities. Using the SageMaker DeepAR algorithm to randomize the credit
card numbers is not a good practice, as DeepAR is a forecasting algorithm that is not
designed for data anonymization or encryption. Using an IAM policy to encrypt the
data on the Amazon S3 bucket and Amazon Kinesis to automatically discard credit
card numbers and insert fake credit card numbers is not a good practice, as IAM
policies are not meant for data encryption, but for access control and authorization.
Amazon Kinesis is a service that provides real-time data streaming and processing,
but it does not have the capability to automatically discard or insert data values. Using
an Amazon SageMaker launch configuration to encrypt the data once it is copied to
the SageMaker instance in a VPC is not a good practice, as launch configurations are
not meant for data encryption, but for specifying the instance type, security group, and
user data for the SageMaker instance. Using the SageMaker principal component
analysis (PCA) algorithm to reduce the length of the credit card numbers is not a good
practice, as PCA is a dimensionality reduction algorithm that is not designed for data
anonymization or encryption.
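As an illustration of the S3 side of option D, the boto3 sketch below sets a default SSE-KMS policy on a bucket and uploads an already-redacted file. The bucket name, file name, and KMS key ARN are assumptions made for the example.

import boto3

KMS_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"  # assumed key
s3 = boto3.client("s3")

# Encrypt every new object in the bucket with the customer managed KMS key by default
s3.put_bucket_encryption(
    Bucket="my-secure-bucket",
    ServerSideEncryptionConfiguration={
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": KMS_KEY_ARN,
            }
        }]
    },
)

# Upload the dataset after the credit card numbers have been redacted (for example by a Glue job)
with open("customers_redacted.csv", "rb") as f:
    s3.put_object(
        Bucket="my-secure-bucket",
        Key="train/customers_redacted.csv",
        Body=f,
        ServerSideEncryption="aws:kms",
        SSEKMSKeyId=KMS_KEY_ARN,
    )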
69.A Machine Learning Specialist is using an Amazon SageMaker notebook instance
in a private subnet
of a corporate VPC. The ML Specialist has important data stored on the Amazon
SageMaker notebook instance's Amazon EBS volume, and needs to take a snapshot
of that EBS volume. However the ML Specialist cannot find the Amazon SageMaker
notebook instance's EBS volume or Amazon EC2 instance within the VPC.
Why is the ML Specialist not seeing the instance visible in the VPC?
A. Amazon SageMaker notebook instances are based on the EC2 instances within
the customer account, but they run outside of VPCs.
B. Amazon SageMaker notebook instances are based on the Amazon ECS service
within customer accounts.
C. Amazon SageMaker notebook instances are based on EC2 instances running
within AWS service accounts.
D. Amazon SageMaker notebook instances are based on AWS ECS instances
running within AWS service accounts.
Answer: C
Explanation:
Amazon SageMaker notebook instances are fully managed environments that provide
an integrated Jupyter notebook interface for data exploration, analysis, and machine
learning. Amazon SageMaker notebook instances are based on EC2 instances that
run within AWS service accounts, not within customer accounts. This means that the
ML Specialist cannot find the Amazon SageMaker notebook instance’s EC2 instance
or EBS volume within the VPC, as they are not visible or accessible to the customer.
However, the ML Specialist can still take a snapshot of the EBS volume by using the
Amazon SageMaker console or API. The ML Specialist can also use VPC interface
endpoints to securely connect the Amazon SageMaker notebook instance to the
resources within the VPC, such as Amazon S3 buckets, Amazon EFS file systems, or
Amazon RDS databases.
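A hedged boto3 sketch of creating such an interface endpoint so traffic to the SageMaker API stays private; the region, VPC, subnet, and security group IDs are placeholders.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0abc1234example",
    ServiceName="com.amazonaws.us-east-1.sagemaker.api",  # interface endpoint for the SageMaker API
    SubnetIds=["subnet-0abc1234example"],
    SecurityGroupIds=["sg-0abc1234example"],
    PrivateDnsEnabled=True,
)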
70.A manufacturing company has structured and unstructured data stored in an
Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries
on this data.
Which solution requires the LEAST effort to be able to query this data?
A. Use AWS Data Pipeline to transform the data and Amazon RDS to run queries.
B. Use AWS Glue to catalogue the data and Amazon Athena to run queries.
C. Use AWS Batch to run ETL on the data and Amazon Aurora to run the queries.
D. Use AWS Lambda to transform the data and Amazon Kinesis Data Analytics to run
queries.
Answer: B
Explanation:
Using AWS Glue to catalogue the data and Amazon Athena to run queries is the
solution that requires the least effort to be able to query the data stored in an Amazon
S3 bucket using SQL. AWS Glue is a service that provides a serverless data
integration platform for data preparation and transformation. AWS Glue can
automatically discover, crawl, and catalogue the data stored in various sources, such
as Amazon S3, Amazon RDS, Amazon Redshift, etc. AWS Glue can also use AWS
KMS to encrypt the data at rest on the Glue Data Catalog and Glue ETL jobs. AWS
Glue can handle both structured and unstructured data, and support various data
formats, such as CSV, JSON, Parquet, etc. AWS Glue can also use built-in or custom
classifiers to identify and parse the data schema and format. Amazon Athena is a
service that provides an interactive query engine that can run SQL queries directly on
data stored in Amazon S3. Amazon Athena can integrate with AWS Glue to use the
Glue Data Catalog as a central metadata repository for the data sources and tables.
Amazon Athena can also use AWS KMS to encrypt the data at rest on Amazon S3
and the query results. Amazon Athena can query both structured and unstructured
data, and support various data formats, such as CSV, JSON, Parquet, etc. Amazon
Athena can also use partitions and compression to optimize the query performance
and reduce the query cost.
The other options are not valid or require more effort to query the data stored in an
Amazon S3 bucket using SQL. Using AWS Data Pipeline to transform the data and
Amazon RDS to run queries is not a good option, as it involves moving the data from
Amazon S3 to Amazon RDS, which can incur additional time and cost. AWS Data
Pipeline is a service that can orchestrate and automate data movement and
transformation across various AWS services and on-premises data sources. AWS
Data Pipeline can be integrated with Amazon EMR to run ETL jobs on the data stored
in Amazon S3. Amazon RDS is a service that provides a managed relational
database service that can run various database engines, such as MySQL,
PostgreSQL, Oracle, etc. Amazon RDS can use AWS KMS to encrypt the data at rest
and in transit. Amazon RDS can run SQL queries on the data stored in the database
tables. Using AWS Batch to run ETL on the data and Amazon Aurora to run the
queries is not a good option, as it also involves moving the data from Amazon S3 to
Amazon Aurora, which can incur additional time and cost. AWS Batch is a service
that can run batch computing workloads on AWS. AWS Batch can be integrated with
AWS Lambda to trigger ETL jobs on the data stored in Amazon S3. Amazon Aurora is
a service that provides a compatible and scalable relational database engine that can
run MySQL or PostgreSQL. Amazon Aurora can use AWS KMS to encrypt the data at
rest and in transit. Amazon Aurora can run SQL queries on the data stored in the
database tables. Using AWS Lambda to transform the data and Amazon Kinesis Data
Analytics to run queries is not a good option, as it is not suitable for querying data
stored in Amazon S3 using SQL. AWS Lambda is a service that can run serverless
functions on AWS. AWS Lambda can be integrated with Amazon S3 to trigger data
transformation functions on the data stored in Amazon S3. Amazon Kinesis Data
Analytics is a service that can analyze streaming data using SQL or Apache Flink.
Amazon Kinesis Data Analytics can be integrated with Amazon Kinesis Data Streams
or Amazon Kinesis Data Firehose to ingest streaming data sources, such as web
logs, social media, IoT devices, etc. Amazon Kinesis Data Analytics is not designed
for querying data stored in Amazon S3 using SQL.
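A rough boto3 sketch of the Glue-plus-Athena flow; the crawler name, database, table, query, and result location are all assumptions for illustration.

import time
import boto3

glue = boto3.client("glue")
athena = boto3.client("athena")

# Crawl the S3 bucket so its schema is registered in the Glue Data Catalog
glue.start_crawler(Name="s3-data-crawler")

# Run standard SQL against the catalogued table; Athena reads the data in place on S3
query_id = athena.start_query_execution(
    QueryString="SELECT machine_id, AVG(temperature) FROM factory_db.sensor_readings GROUP BY machine_id",
    QueryExecutionContext={"Database": "factory_db"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)["QueryExecutionId"]

# Wait for the query to finish, then fetch the rows
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]
    if state not in ("QUEUED", "RUNNING"):
        break
    time.sleep(2)
rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]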
71.A Machine Learning Specialist receives customer data for an online shopping
website. The data includes demographics, past visits, and locality information. The
Specialist must develop a machine learning approach to identify the customer
shopping patterns, preferences and trends to enhance the website for better service
and smart recommendations.
Which solution should the Specialist recommend?
A. Latent Dirichlet Allocation (LDA) for the given collection of discrete data to identify
patterns in the
customer database.
B. A neural network with a minimum of three layers and random initial weights to
identify patterns in the customer database
C. Collaborative filtering based on user interactions and correlations to identify
patterns in the customer database
D. Random Cut Forest (RCF) over random subsamples to identify patterns in the
customer database
Answer: C
Explanation:
Collaborative filtering is a machine learning technique that recommends products or
services to users based on the ratings or preferences of other users. This technique is
well-suited for identifying customer shopping patterns and preferences because it
takes into account the interactions between users and products.
72.A Machine Learning Specialist is working with a large company to leverage
machine learning within its products. The company wants to group its customers into
categories based on which customers will and will not churn within the next 6 months.
The company has labeled the data available to the Specialist.
Which machine learning model type should the Specialist use to accomplish this
task?
A. Linear regression
B. Classification
C. Clustering
D. Reinforcement learning
Answer: B
Explanation:
The goal of classification is to determine to which class or category a data point
(customer in our case) belongs. For classification problems, data scientists would
use historical data with predefined target variables, also known as labels
(churner/non-churner), which are the answers to be predicted, to train an algorithm.
With classification, businesses can answer the following questions:
Will this customer churn or not?
Will a customer renew their subscription?
Will a user downgrade a pricing plan?
Are there any signs of unusual customer behavior?
Reference: https://www.kdnuggets.com/2019/05/churn-prediction-machine-learning.html
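As a small illustration of framing churn as binary classification, the scikit-learn sketch below assumes a hypothetical churn.csv file with numeric features and a 0/1 "churned" label column.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

df = pd.read_csv("churn.csv")                        # assumed labeled dataset
X, y = df.drop(columns=["churned"]), df["churned"]   # 1 = churner, 0 = non-churner

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))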
73.The displayed graph is from a forecasting model for testing a time series.
Considering the graph only, which conclusion should a Machine Learning Specialist
make about the behavior of the model?
A. The model predicts both the trend and the seasonality well.
B. The model predicts the trend well, but not the seasonality.
C. The model predicts the seasonality well, but not the trend.
D. The model does not predict the trend or the seasonality well.
Answer: D
74.A company wants to classify user behavior as either fraudulent or normal. Based
on internal research, a Machine Learning Specialist would like to build a binary
classifier based on two features: age of account and transaction month.
The class distribution for these features is illustrated in the figure provided.
Based on this information which model would have the HIGHEST accuracy?
A. Long short-term memory (LSTM) model with scaled exponential linear unit (SELU)
B. Logistic regression
C. Support vector machine (SVM) with non-linear kernel
D. Single perceptron with tanh activation function
Answer: C
Explanation:
Based on the figure provided, the data is not linearly separable. Therefore, a non-
linear model such as SVM with a non-linear kernel would be the best choice. SVMs
are particularly effective in high-dimensional spaces and are versatile in that they can
be used for both linear and non-linear data. Additionally, SVMs have a high level of
accuracy and are less prone to overfitting.
Reference: https://docs.aws.amazon.com/sagemaker/latest/dg/svm.html
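To make the idea concrete, the scikit-learn sketch below uses make_circles as a synthetic stand-in for the circular class distribution described in the question; it is not the actual exam dataset.

from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic two-feature data where one class surrounds the other, as in the figure
X, y = make_circles(n_samples=1000, factor=0.3, noise=0.1, random_state=0)

# The RBF kernel lets the SVM learn a non-linear (circular) decision boundary
svm = SVC(kernel="rbf", C=1.0, gamma="scale")
print("Mean CV accuracy:", cross_val_score(svm, X, y, cv=5).mean())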
75.A Machine Learning Specialist at a company sensitive to security is preparing a
dataset for model training. The dataset is stored in Amazon S3 and contains
Personally Identifiable Information (PII).
The dataset:
* Must be accessible from a VPC only.
* Must not traverse the public internet.
How can these requirements be satisfied?
A. Create a VPC endpoint and apply a bucket access policy that restricts access to
the given VPC endpoint and the VPC.
B. Create a VPC endpoint and apply a bucket access policy that allows access from
the given VPC endpoint and an Amazon EC2 instance.
C. Create a VPC endpoint and use Network Access Control Lists (NACLs) to allow
traffic between only the given VPC endpoint and an Amazon EC2 instance.
D. Create a VPC endpoint and use security groups to restrict access to the given VPC
endpoint and an Amazon EC2 instance.
Answer: A
Explanation:
A VPC endpoint is a logical device that enables private connections between a VPC
and supported AWS services. A VPC endpoint can be either a gateway endpoint or
an interface endpoint. A gateway endpoint is a gateway that is a target for a specified
route in the route table, used for traffic destined to a supported AWS service. An
interface endpoint is an elastic network interface with a private IP address that serves
as an entry point for traffic destined to a supported service.
In this case, the Machine Learning Specialist can create a gateway endpoint for
Amazon S3, which is a supported service for gateway endpoints. A gateway endpoint
for Amazon S3 enables the VPC to access Amazon S3 privately, without requiring an
internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.
The traffic between the VPC and Amazon S3 does not leave the Amazon network.
To restrict access to the dataset stored in Amazon S3, the Machine Learning
Specialist can apply a bucket access policy that allows access only from the given
VPC endpoint and the VPC. A bucket access policy is a resource-based policy that
defines who can access a bucket and what actions they can perform. A bucket
access policy can use various conditions to control access, such as the source IP
address, the source VPC, the source VPC endpoint, etc. In this case, the Machine
Learning Specialist can use the aws:sourceVpce condition to specify the ID of the
VPC endpoint, and the aws:sourceVpc condition to specify the ID of the VPC. This
way, only the requests that originate from the VPC endpoint or the VPC can access
the bucket that contains the dataset.
The other options are not valid or secure ways to satisfy the requirements. Creating a
VPC endpoint and applying a bucket access policy that allows access from the given
VPC endpoint and an Amazon EC2 instance is not a good option, as it does not
restrict access to the VPC. An Amazon EC2 instance is a virtual server that runs in
the AWS cloud. An Amazon EC2 instance can have a public IP address or a private
IP address, depending on the network configuration. Allowing access from an
Amazon EC2 instance does not guarantee that the instance is in the same VPC as
the VPC endpoint, and may expose the dataset to unauthorized access. Creating a
VPC endpoint and using Network Access
Control Lists (NACLs) to allow traffic between only the given VPC endpoint and an
Amazon EC2 instance is not a good option, as it does not restrict access to the VPC.
NACLs are stateless firewalls that can control inbound and outbound traffic at the
subnet level. NACLs can use rules to allow or deny traffic based on the protocol, port,
and source or destination IP address. However, NACLs do not support VPC
endpoints as a source or destination, and cannot filter traffic based on the VPC
endpoint ID or the VPC ID. Therefore, using NACLs does not guarantee that the
traffic is from the VPC endpoint or the VPC, and may expose the dataset to
unauthorized access. Creating a VPC endpoint and using security groups to restrict
access to the given VPC endpoint and an Amazon EC2 instance is not a good option,
as it does not restrict access to the VPC. Security groups are stateful firewalls that
can control inbound and outbound traffic at the instance level. Security groups can
use rules to allow or deny traffic based on the protocol, port, and source or
destination. However, security groups do not support VPC endpoints as a source or
destination, and cannot filter traffic based on the VPC endpoint ID or the VPC ID.
Therefore, using security groups does not guarantee that the traffic is from the VPC
endpoint or the VPC, and may expose the dataset to unauthorized access.
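A hedged example of such a bucket policy applied with boto3; the bucket name and VPC endpoint ID are placeholders. The policy denies all S3 actions unless the request arrives through the specified gateway endpoint.

import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AccessOnlyViaVpcEndpoint",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::my-pii-bucket",
            "arn:aws:s3:::my-pii-bucket/*",
        ],
        "Condition": {"StringNotEquals": {"aws:sourceVpce": "vpce-0123456789abcdef0"}},
    }],
}
boto3.client("s3").put_bucket_policy(Bucket="my-pii-bucket", Policy=json.dumps(policy))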
76.An employee found a video clip with audio on a company's social media feed. The
language used in the video is Spanish. English is the employee's first language, and
they do not understand Spanish. The employee wants to do a sentiment analysis.
What combination of services is the MOST efficient to accomplish the task?
A. Amazon Transcribe, Amazon Translate, and Amazon Comprehend
B. Amazon Transcribe, Amazon Comprehend, and Amazon SageMaker seq2seq
C. Amazon Transcribe, Amazon Translate, and Amazon SageMaker Neural Topic
Model (NTM)
D. Amazon Transcribe, Amazon Translate, and Amazon SageMaker BlazingText
Answer: A
Explanation:
Amazon Transcribe, Amazon Translate, and Amazon Comprehend are the most
efficient combination of services to accomplish the task of sentiment analysis on a
video clip with audio in Spanish. Amazon Transcribe is a service that can convert
speech to text using deep learning. Amazon Transcribe can transcribe audio from
various sources, such as video files, audio files, or streaming audio. Amazon
Transcribe can also recognize multiple speakers, different languages, accents,
dialects, and custom vocabularies. In this case, Amazon Transcribe can transcribe
the audio from the video clip in Spanish to text in Spanish. Amazon Translate is a
service that can translate text from one language to another using neural machine
translation. Amazon Translate can translate text from various sources, such as
documents, web pages, chat messages, etc. Amazon Translate can also support
multiple languages, domains, and styles. In this case, Amazon Translate can translate
the text from Spanish to English. Amazon Comprehend is a service that can analyze
and derive insights from text using natural language processing. Amazon
Comprehend can perform various tasks, such as sentiment analysis, entity
recognition, key phrase extraction, topic modeling, etc. Amazon Comprehend can
also support multiple languages and domains. In this case, Amazon Comprehend can
perform sentiment analysis on the text in English and determine whether the feedback
is positive, negative, neutral, or mixed.
The other options are less efficient for accomplishing the task of sentiment analysis
on a video clip with audio in Spanish. Replacing Amazon Translate with Amazon
SageMaker seq2seq would require building and training a custom translation model on
a parallel corpus, which is far more effort than using the managed translation
service. Replacing Amazon Comprehend with Amazon SageMaker Neural Topic Model (NTM)
or Amazon SageMaker BlazingText would not provide ready-made sentiment analysis,
which is the main goal of the task: NTM is designed for topic modeling, and
BlazingText trains text classification and word embedding models, so a sentiment
classifier would still have to be built and trained from labeled data rather than
obtained directly from a managed service.
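A hedged boto3 sketch of the three-step pipeline; the S3 media location, job name, and the shortcut of hard-coding the extracted transcript are assumptions made to keep the example short.

import boto3

transcribe = boto3.client("transcribe")
translate = boto3.client("translate")
comprehend = boto3.client("comprehend")

# Step 1: transcribe the Spanish audio track of the clip stored in S3
transcribe.start_transcription_job(
    TranscriptionJobName="social-clip-sentiment-1",
    LanguageCode="es-ES",
    MediaFormat="mp4",
    Media={"MediaFileUri": "s3://my-media-bucket/clip.mp4"},
)
# ...poll the job, download the transcript JSON, and pull out the Spanish text...
spanish_text = "El producto es excelente"  # placeholder for the extracted transcript

# Step 2: translate the transcript to English
english_text = translate.translate_text(
    Text=spanish_text, SourceLanguageCode="es", TargetLanguageCode="en"
)["TranslatedText"]

# Step 3: run sentiment analysis on the English text
result = comprehend.detect_sentiment(Text=english_text, LanguageCode="en")
print(result["Sentiment"], result["SentimentScore"])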
77.A Machine Learning Specialist is packaging a custom ResNet model into a Docker
container so the company can leverage Amazon SageMaker for training. The
Specialist is using Amazon EC2 P3 instances to train the model and needs to
properly configure the Docker container to leverage the NVIDIA GPUs.
What does the Specialist need to do?
A. Bundle the NVIDIA drivers with the Docker image.
B. Build the Docker container to be NVIDIA-Docker compatible.
C. Organize the Docker container's file structure to execute on GPU instances.
D. Set the GPU flag in the Amazon SageMaker CreateTrainingJob request body
Answer: B
Explanation:
To leverage the NVIDIA GPUs on Amazon EC2 P3 instances for training a custom
ResNet model using Amazon SageMaker, the Machine Learning Specialist needs to
build the Docker container to be NVIDIA-Docker compatible. NVIDIA-Docker is a tool
that enables GPU-accelerated containers to run on Docker. NVIDIA-Docker can
automatically configure the Docker container with the necessary drivers, libraries, and
environment variables to access the NVIDIA GPUs. NVIDIA-Docker can also isolate
the GPU resources and ensure that each container has exclusive access to a GPU.
To build a Docker container that is NVIDIA-Docker compatible, the Machine Learning
Specialist needs to follow these steps:
Install the NVIDIA Container Toolkit on the host machine that runs Docker. This toolkit
includes the NVIDIA Container Runtime, which is a modified version of the Docker
runtime that supports GPU hardware.
Use the base image provided by NVIDIA as the first line of the Dockerfile. The base
image contains the NVIDIA drivers and CUDA toolkit that are required for GPU-
accelerated applications. The base image can be specified as FROM
nvcr.io/nvidia/cuda:tag, where tag is the version of CUDA and the operating system.
Install the required dependencies and frameworks for the ResNet model, such as
PyTorch, torchvision, etc., in the Dockerfile.
Copy the ResNet model code and any other necessary files to the Docker container in
the Dockerfile.
Build the Docker image using the docker build command.
Push the Docker image to a repository, such as Amazon Elastic Container Registry
(Amazon ECR), using the docker push command.
Specify the Docker image URI and the instance type (ml.p3.xlarge) in the Amazon
SageMaker CreateTrainingJob request body.
The other options are not valid or sufficient for building a Docker container that can
leverage the NVIDIA GPUs on Amazon EC2 P3 instances. Bundling the NVIDIA
drivers with the Docker image is not a good option, as it can cause driver conflicts and
compatibility issues with the host machine and the NVIDIA GPUs. Organizing the
Docker container’s file structure to execute on GPU instances is not a good option,
as it does not ensure that the Docker container can access the NVIDIA GPUs and the
CUDA toolkit. Setting the GPU flag in the Amazon SageMaker CreateTrainingJob
request body is not a good option, as it does not apply to custom Docker containers,
but only to built-in algorithms and frameworks that support GPU instances.
78.A Machine Learning Specialist is building a logistic regression model that will
predict whether or not a person will order a pizza. The Specialist is trying to build the
optimal model with an ideal classification threshold.
What model evaluation technique should the Specialist use to understand how
different classification
thresholds will impact the model's performance?
A. Receiver operating characteristic (ROC) curve
B. Misclassification rate
C. Root Mean Square Error (RMSE)
D. L1 norm
Answer: A
Explanation:
A receiver operating characteristic (ROC) curve is a model evaluation technique that
can be used to understand how different classification thresholds will impact the
model’s performance. A ROC curve plots the true positive rate (TPR) against the
false positive rate (FPR) for various values of the classification threshold. The TPR,
also known as sensitivity or recall, is the proportion of positive instances that are
correctly classified as positive. The FPR, also known as the fall-out, is the proportion
of negative instances that are incorrectly classified as positive. A ROC curve can
show the trade-off between the TPR and the FPR for different thresholds, and help
the Machine Learning Specialist to select the optimal threshold that maximizes the
TPR and minimizes the FPR. A ROC curve can also be used to compare the
performance of different models by calculating the area under the curve (AUC), which
is a measure of how well the model can distinguish between the positive and negative
classes. A higher AUC indicates a better model.
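The scikit-learn sketch below shows how the ROC curve exposes the threshold trade-off on synthetic data; the dataset and model are stand-ins, not the pizza-ordering data from the question.

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
probs = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1]

fpr, tpr, thresholds = roc_curve(y_te, probs)   # one (FPR, TPR) point per threshold
print("AUC:", roc_auc_score(y_te, probs))

plt.plot(fpr, tpr, label="logistic regression")
plt.plot([0, 1], [0, 1], linestyle="--", label="chance")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()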
79.An interactive online dictionary wants to add a widget that displays words used in
similar contexts. A Machine Learning Specialist is asked to provide word features for
the downstream nearest neighbor model powering the widget.
What should the Specialist do to meet these requirements?
A. Create one-hot word encoding vectors.
B. Produce a set of synonyms for every word using Amazon Mechanical Turk.
C. Create word embedding factors that store edit distance with every other word.
D. Download word embeddings pre-trained on a large corpus.
Answer: D
Explanation:
Word embeddings are a type of dense representation of words, which encode
semantic meaning in a vector form. These embeddings are typically pre-trained on a
large corpus of text data, such as a large set of books, news articles, or web pages,
and capture the context in which words are used. Word embeddings can be used as
features for a nearest neighbor model, which can be used to find words used in
similar contexts. Downloading pre-trained word embeddings is a good way to get
started quickly and leverage the strengths of these representations, which have been
optimized on a large amount of data. This is likely to result in more accurate and
reliable features than other options like one-hot encoding, edit distance, or using
Amazon Mechanical Turk to produce synonyms.
Reference: https://aws.amazon.com/blogs/machine-learning/amazon-sagemaker-object2vec-adds-new-features-that-support-automatic-negative-sampling-and-speed-up-training/
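As one possible illustration, the sketch below loads publicly available pre-trained GloVe vectors through gensim's downloader and queries the embedding space for neighbors; the specific model name is just one commonly available choice, not a requirement.

import gensim.downloader as api

# Pre-trained 100-dimensional GloVe embeddings trained on a large Wikipedia/Gigaword corpus
vectors = api.load("glove-wiki-gigaword-100")

# Nearest neighbors in embedding space correspond to "words used in similar contexts"
print(vectors.most_similar("dictionary", topn=5))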
80.A Machine Learning Specialist is configuring Amazon SageMaker so multiple Data
Scientists can access notebooks, train models, and deploy endpoints. To ensure the
best operational performance, the Specialist needs to be able to track how often the
Scientists are deploying models, GPU and CPU utilization on the deployed
SageMaker endpoints, and all errors that are generated when an endpoint is invoked.
Which services are integrated with Amazon SageMaker to track this information?
(Select TWO.)
A. AWS CloudTrail
B. AWS Health
C. AWS Trusted Advisor
D. Amazon CloudWatch
E. AWS Config
Answer: A, D
Explanation:
The services that are integrated with Amazon SageMaker to track the information that
the Machine Learning Specialist needs are AWS CloudTrail and Amazon
CloudWatch. AWS CloudTrail is a service that records the API calls and events for
AWS services, including Amazon SageMaker. AWS CloudTrail can track the actions
performed by the Data Scientists, such as creating notebooks, training models, and
deploying endpoints. AWS CloudTrail can also provide information such as the
identity of the user, the time of the action, the parameters used, and the response
elements returned. AWS CloudTrail can help the Machine Learning Specialist to
monitor the usage and activity of Amazon SageMaker, as well as to audit and
troubleshoot any issues. Amazon CloudWatch is a service that collects and analyzes
the metrics and logs for AWS services, including Amazon SageMaker. Amazon
CloudWatch can track the performance and utilization of the Amazon SageMaker
endpoints, such as the CPU and GPU utilization, the inference latency, the number of
invocations, etc. Amazon CloudWatch can also track the errors and alarms that are
generated when an endpoint is invoked, such as the model errors, the throttling
errors, the HTTP errors, etc. Amazon CloudWatch can help the Machine Learning
Specialist to optimize the operational performance and reliability of Amazon
SageMaker, as well as to set up notifications and actions based on the metrics and
logs.
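A boto3 sketch of pulling two of those endpoint metrics; the endpoint name, variant name, and time window are assumptions, and the namespaces shown are the ones SageMaker currently publishes endpoint metrics to.

from datetime import datetime, timedelta

import boto3

cw = boto3.client("cloudwatch")
dims = [
    {"Name": "EndpointName", "Value": "churn-endpoint"},
    {"Name": "VariantName", "Value": "AllTraffic"},
]
window = {"StartTime": datetime.utcnow() - timedelta(hours=1),
          "EndTime": datetime.utcnow(), "Period": 300}

# How often the endpoint is being invoked
invocations = cw.get_metric_statistics(
    Namespace="AWS/SageMaker", MetricName="Invocations",
    Dimensions=dims, Statistics=["Sum"], **window,
)

# CPU utilization of the instances behind the endpoint
cpu = cw.get_metric_statistics(
    Namespace="/aws/sagemaker/Endpoints", MetricName="CPUUtilization",
    Dimensions=dims, Statistics=["Average"], **window,
)
print(invocations["Datapoints"], cpu["Datapoints"])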
81.A Machine Learning Specialist trained a regression model, but the first iteration
needs optimizing. The Specialist needs to understand whether the model is more
frequently overestimating or underestimating the target.
What option can the Specialist use to determine whether it is overestimating or
underestimating the target value?
A. Root Mean Square Error (RMSE)
B. Residual plots
C. Area under the curve
D. Confusion matrix
Answer: B
Explanation:
Residual plots are a model evaluation technique that can be used to understand
whether a regression model is more frequently overestimating or underestimating the
target. Residual plots are graphs that plot the residuals (the difference between the
actual and predicted values) against the predicted values or other variables. Residual
plots can help the Machine Learning Specialist to identify the patterns and trends in
the residuals, such as the direction, shape, and distribution. Residual plots can also
reveal the presence of outliers, heteroscedasticity, non-linearity, or other problems in
the model.
To determine whether the model is overestimating or underestimating the target, the
Machine Learning Specialist can use a residual plot that plots the residuals against
the predicted values. This type of residual plot is also known as a prediction error plot.
A prediction error plot can show the magnitude and direction of the errors made by
the model. If the model is overestimating the target, the residuals will be negative, and
the points will be below the zero line. If the model is underestimating the target, the
residuals will be positive, and the points will be above the zero line. If the model is
accurate, the residuals will be close to zero, and the points will be scattered around
the zero line. A prediction error plot can also show the variance and bias of the model.
If the model has high variance, the residuals will have a large spread, and the points
will be far from the zero line. If the model has high bias, the residuals will have a
systematic pattern, such as a curve or a slope, and the points will not be randomly
distributed around the zero line. A prediction error plot can help the Machine Learning
Specialist to optimize the model by adjusting the complexity, features, or parameters
of the model.
The other options are not valid or suitable for determining whether the model is
overestimating or underestimating the target. Root Mean Square Error (RMSE) is a
model evaluation metric that measures the average magnitude of the errors made by
the model. RMSE is the square root of the mean of the squared residuals. RMSE can
indicate the overall accuracy and performance of the model, but it cannot show the
direction or distribution of the errors. RMSE can also be influenced by outliers or
extreme values, and it may not be comparable across different models or datasets.
Area under the curve (AUC) is a model evaluation metric that measures the ability of
the model to distinguish between the positive and negative classes. AUC is the area
under the receiver operating characteristic (ROC) curve, which plots the true positive
rate against the false positive rate for various classification thresholds. AUC can
indicate the overall quality and performance of the model, but it is only applicable for
binary classification models, not regression models. AUC cannot show the magnitude
or direction of the errors made by the model. Confusion matrix is a model evaluation
technique that summarizes the number of correct and incorrect predictions made by
the model for each class. A confusion matrix is a table that shows the counts of true
positives, false positives, true negatives, and false negatives for each class. A
confusion matrix can indicate the accuracy, precision, recall, and F1-score of the
model for each class, but it is only applicable for classification models, not regression
models. A confusion matrix cannot show the magnitude or direction of the errors
made by the model.
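A matplotlib sketch of such a prediction error plot on simulated values; here the simulated model deliberately overestimates so the residuals sit below zero.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
y_true = rng.normal(50, 10, 500)
y_pred = y_true + rng.normal(2, 5, 500)   # simulated predictions that run high on average

residuals = y_true - y_pred               # negative residuals mean the model overestimated
plt.scatter(y_pred, residuals, alpha=0.4)
plt.axhline(0, linestyle="--")
plt.xlabel("Predicted value")
plt.ylabel("Residual (actual - predicted)")
plt.show()

print("Mean residual:", residuals.mean())  # below zero here, i.e. overestimation on average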
82.A company wants to classify user behavior as either fraudulent or normal. Based
on internal research, a Machine Learning Specialist would like to build a binary
classifier based on two features: age of account and transaction month. The class
distribution for these features is illustrated in the figure provided.
Based on this information, which model would have the HIGHEST recall with respect
to the fraudulent class?
A. Decision tree
B. Linear support vector machine (SVM)
C. Naive Bayesian classifier
D. Single Perceptron with sigmoidal activation function
Answer: A
Explanation:
Based on the figure provided, a decision tree would have the highest recall with
respect to the fraudulent class. Recall is a model evaluation metric that measures the
proportion of actual positive instances that are correctly classified by the model.
Recall is calculated as follows: Recall = True Positives / (True Positives + False
Negatives)
A decision tree is a type of machine learning model that can perform classification
tasks by splitting the data into smaller and purer subsets based on a series of rules or
conditions. A decision tree can handle both linear and non-linear data, and can
capture complex patterns and interactions among the features. A decision tree can
also be easily visualized and interpreted.
In this case, the data is not linearly separable, and has a clear pattern of seasonality.
The fraudulent class forms a large circle in the center of the plot, while the normal
class is scattered around the edges. A decision tree can use the transaction month
and the age of account as the splitting criteria, and create a circular boundary that
separates the fraudulent class from the normal class. A decision tree can achieve a
high recall for the fraudulent class, as it can correctly identify most of the black dots
as positive instances, and minimize the number of false negatives. A decision tree
can also adjust the depth and complexity of the tree to balance the trade-off between
recall and precision. The other options are not valid or suitable for achieving a high
recall for the fraudulent class. A linear support vector machine (SVM) is a type of
machine learning model that can perform classification tasks by finding a linear
hyperplane that maximizes the margin between the classes. A linear SVM can handle
linearly separable data, but not non-linear data. A linear SVM cannot capture the
circular pattern of the fraudulent class, and may misclassify many of the black dots as
negative instances, resulting in a low recall. A naive Bayesian classifier is a type of
machine learning model that can perform classification tasks by applying the Bayes’
theorem and assuming conditional independence among the features. A naive
Bayesian classifier can handle both linear and non-linear data, and can incorporate
prior knowledge and probabilities into the model. However, a naive Bayesian classifier
may not perform well when the features are correlated or dependent, as in this case.
A naive Bayesian classifier may not capture the circular pattern of the fraudulent
class, and may misclassify many of the black dots as negative instances, resulting in
a low recall. A single perceptron with sigmoidal activation function is a type of
machine learning model that can perform classification tasks by applying a weighted
linear combination of the features and a non-linear activation function. A single
perceptron with sigmoidal activation function can handle linearly separable data, but
not non-linear data. A single perceptron with sigmoidal activation function cannot
capture the circular pattern of the fraudulent class, and may misclassify many of the
black dots as negative instances, resulting in a low recall.
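For illustration only, the scikit-learn sketch below scores a shallow decision tree on recall, using make_circles as a synthetic stand-in for the circular fraudulent region; it is not the exam's dataset.

from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Class 1 (the inner circle) plays the role of the fraudulent class
X, y = make_circles(n_samples=2000, factor=0.4, noise=0.1, random_state=0)

tree = DecisionTreeClassifier(max_depth=5, random_state=0)
print("Recall (fraudulent class):", cross_val_score(tree, X, y, cv=5, scoring="recall").mean())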
83.When submitting Amazon SageMaker training jobs using one of the built-in
algorithms, which common parameters MUST be specified? (Select THREE.)
A. The training channel identifying the location of training data on an Amazon S3
bucket.
B. The validation channel identifying the location of validation data on an Amazon S3
bucket.
C. The IAM role that Amazon SageMaker can assume to perform tasks on behalf of
the users.
D. Hyperparameters in a JSON array as documented for the algorithm used.
E. The Amazon EC2 instance class specifying whether training will be run using CPU
or GPU.
F. The output path specifying where on an Amazon S3 bucket the trained model will
persist.
Answer: A, C, F
Explanation:
When submitting Amazon SageMaker training jobs using one of the built-in
algorithms, the common parameters that must be specified are:
The training channel identifying the location of training data on an Amazon S3 bucket.
This parameter tells SageMaker where to find the input data for the algorithm and
what format it is in. For example, TrainingInputMode: File means that the input data is
in files stored in S3.
The IAM role that Amazon SageMaker can assume to perform tasks on behalf of the
users. This parameter grants SageMaker the necessary permissions to access the S3
buckets, ECR repositories, and other AWS resources needed for the training job. For
example, RoleArn: arn:aws:iam::123456789012:role/service-role/AmazonSageMaker-
ExecutionRole-
20200303T150948 means that SageMaker will use the specified role to run the
training job. The output path specifying where on an Amazon S3 bucket the trained
model will persist. This parameter tells SageMaker where to save the model artifacts,
such as the model weights and parameters, after the training job is completed. For
example, OutputDataConfig: {S3OutputPath:
s3://my-bucket/my-training-job} means that SageMaker will store the model artifacts in
the specified
S3 location.
The validation channel identifying the location of validation data on an Amazon S3
bucket is an optional parameter that can be used to provide a separate dataset for
evaluating the model performance during the training process. This parameter is not
required for all algorithms and can be omitted if the validation data is not available or
not needed.
The hyperparameters in a JSON array as documented for the algorithm used is
another optional parameter that can be used to customize the behavior and
performance of the algorithm. This parameter is specific to each algorithm and can be
used to tune the model accuracy, speed, complexity, and other aspects. For example,
HyperParameters: {num_round: "10", objective: "binary:logistic"} means that the
XGBoost algorithm will use 10 boosting rounds and the logistic loss function for binary
classification.
The Amazon EC2 instance class specifying whether training will be run using CPU or
GPU is not a parameter that is specified when submitting a training job using a built-in
algorithm. Instead, this parameter is specified when creating a training instance,
which is a containerized environment that runs the training code and algorithm. For
example, ResourceConfig: {InstanceType: ml.m5.xlarge,
InstanceCount: 1, VolumeSizeInGB: 10} means that SageMaker will use one
m5.xlarge instance with 10 GB of storage for the training instance.
References:
Train a Model with Amazon SageMaker
Use Amazon SageMaker Built-in Algorithms or Pre-trained Models
CreateTrainingJob - Amazon SageMaker Service
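The trimmed boto3 sketch below highlights the three mandatory pieces (training channel, execution role, output path); the job name, image URI, role ARN, and bucket names are placeholders.

import boto3

sm = boto3.client("sagemaker")
sm.create_training_job(
    TrainingJobName="xgboost-churn-001",
    AlgorithmSpecification={
        "TrainingImage": "<region-specific-built-in-algorithm-image-uri>",
        "TrainingInputMode": "File",
    },
    RoleArn="arn:aws:iam::111122223333:role/SageMakerExecutionRole",     # required IAM role
    InputDataConfig=[{                                                   # required training channel
        "ChannelName": "train",
        "ContentType": "text/csv",
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/train/",
            "S3DataDistributionType": "FullyReplicated",
        }},
    }],
    OutputDataConfig={"S3OutputPath": "s3://my-bucket/output/"},         # required output path
    ResourceConfig={"InstanceType": "ml.m5.xlarge", "InstanceCount": 1, "VolumeSizeInGB": 10},
    StoppingCondition={"MaxRuntimeInSeconds": 3600},
    HyperParameters={"num_round": "10", "objective": "binary:logistic"}, # optional, algorithm-specific
)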
84.A Data Scientist is developing a machine learning model to predict future patient
outcomes based on information collected about each patient and their treatment
plans. The model should output a continuous value as its prediction. The data
available includes labeled outcomes for a set of 4,000 patients. The study was
conducted on a group of individuals over the age of 65 who have a particular disease
that is known to worsen with age.
Initial models have performed poorly. While reviewing the underlying data, the Data
Scientist notices
that, out of 4,000 patient observations, there are 450 where the patient age has been
input as 0. The other features for these observations appear normal compared to the
rest of the sample population.
How should the Data Scientist correct this issue?
A. Drop all records from the dataset where age has been set to 0.
B. Replace the age field value for records with a value of 0 with the mean or median
value from the dataset.
C. Drop the age feature from the dataset and train the model using the rest of the
features.
D. Use k-means clustering to handle missing features.
Answer: B
Explanation:
The best way to handle the missing values in the patient age feature is to replace
them with the mean or median value from the dataset. This is a common technique
for imputing missing values that preserves the overall distribution of the data and
avoids introducing bias or reducing the sample size. Dropping the records or the
feature would result in losing valuable information and reducing the accuracy of the
model. Using k-means clustering would not be appropriate for handling missing
values in a single feature, as it is a method for grouping similar data points based on
multiple features.
References:
Effective Strategies to Handle Missing Values in Data Analysis
How To Handle Missing Values In Machine Learning Data With Weka.
How to handle missing values in Python - Machine Learning Plus
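A short pandas sketch of the median imputation; the file name and the age column name are assumptions.

import numpy as np
import pandas as pd

df = pd.read_csv("patients.csv")                    # assumed dataset with an "age" column

# Treat the invalid zero ages as missing, then fill them with the median of the valid ages
df["age"] = df["age"].replace(0, np.nan)
df["age"] = df["age"].fillna(df["age"].median())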