TB#3
School
SUNY at Albany *
*We aren’t endorsed by this school
Course
438
Subject
Computer Science
Date
Apr 3, 2024
Type
Pages
38
Uploaded by BarristerWorld7634
Chapter 16, Deep Learning
Deep Learning
1
Introduction
16.1 Q1: Which of the following statements is false?
a. Keras offers a friendly interface to Google’s TensorFlow, the most widely used deep-learning library.
b. François Chollet of the Google Mind team developed Keras to make deep-learning capabilities more accessible.
c. Keras enables you to define deep-learning models conveniently with one statement.
d. Google has thousands of TensorFlow and Keras projects underway internally, and that number is growing quickly.
Answer: c. Keras enables you to define deep-learning models conveniently
with one statement. Actually, deep learning models require more
sophisticated setups than scikit-learn machine learning models, typically
using several statements to connect multiple objects, called layers.
16.1 Q2: Which of the following statements is false?
a. Keras is to deep learning as Scikit-learn is to machine learning.
b. Deep learning models are complex and require an extensive mathematical background to understand their inner workings.
c. Both Keras and Scikit-learn encapsulate the sophisticated mathematics of their models, so developers need only define, parameterize and manipulate objects.
d. With Keras, you build your models primarily from custom components you develop to meet your unique requirements.
Answer: d. With Keras, you build your models primarily from custom components you develop to meet your unique requirements. Actually, with Keras, you build your models from pre-existing components and quickly parameterize those components to your unique requirements.
16.1 Q3: Which of the following statements is false?
a. Keras facilitates experimenting with many deep-learning models and
tweaking them in various ways until you find the models that perform best for
your applications.
b. Deep learning works well only when you have lots of data.
c. Transfer learning uses existing knowledge from a previously trained model as the foundation for a new model.
d. Data augmentation adds data to a dataset by deriving new data from existing data. For example, in an image dataset, you might rotate the images left and right so the model can learn about objects in different orientations.
Answer: b. Deep learning works well only when you have lots of data.
Actually, deep learning works well when you have lots of data, but it also
can be effective for smaller datasets, especially when combined with
techniques like transfer learning and data augmentation.
16.1 Q4: Which of the following statements a), b) or c) is false?
a. Deep learning can require significant processing power.
b. Complex models trained on big-data datasets can take hours, days or even more to train.
c. Special high-performance hardware called GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units) was developed by NVIDIA and Google, respectively, to meet the extraordinary processing demands of edge-of-the-practice deep-learning applications.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.1 Q5: ________ neural networks are especially appropriate for computer vision tasks, such as recognizing handwritten digits and characters or recognizing objects (including faces) in images and videos.
a. LSTM
b. Recurrent
c. Convolutional
d. None of the above
Answer: c. Convolutional
16.1 Q6: Which of the following is not an automated deep-learning capability?
a. Auto-Keras from Texas A&M University’s DATA Lab
b. Baidu’s EZDL
c. Google’s AutoML
d. Scikit-learn
Answer: d. Scikit-learn
1
Deep Learning Applications
16.1 Q7: Which of the following are popular deep learning applications?
a. Game playing, computer vision, self-driving cars, robotics, improving customer experiences and chatbots
b. Diagnosing medical conditions, Google Search, facial recognition, automated image captioning, video closed captioning, enhancing image resolution
c. Speech recognition, language translation, predicting election results, predicting earthquakes and weather
d. All of the above
Answer: d. All of the above
1
Deep Learning Demos
16.1 Q8: Which of the following deep-learning demos translates a line drawing into a picture?
a. DeepArt.io.
b. DeepWarp Demo.
c. Image-to-Image Demo.
d. Google Translate Mobile App.
Answer: c. Image-to-Image Demo.
1
Keras Resources
16.1 Q9: If you’re looking for term projects, directed study projects, capstone
course projects or thesis topics, visit ________. People post their research papers
here in parallel with going through peer review for formal publication, hoping
for fast feedback. So, this site gives you access to extremely current research.
a. https://kerasteam.slack.com
b. https://blog.keras.io
c. http://keras.io
d. https://arXiv.org
Answer: d. https://arXiv.org
1
Keras Built-In Datasets
16.2 Q1: Which of the following Keras datasets for practicing deep learning is used for sentiment analysis?
b. MNIST.
c. Fashion-MNIST.
d. IMDb.
e. CIFAR10.
f. CIFAR100.
Answer: d. IMDb.
1
Custom Anaconda Environments
16.3 Q1: Which of the following statements about Anaconda environments is false?
a. The Anaconda Python distribution makes it easy to create custom environments.
b. These are separate configurations in which you can install different libraries and different library versions. This can help with reproducibility if your code depends on specific Python or library versions.
c. The default environment in Anaconda is called the root environment. This is created for you when you install Anaconda. All the Python libraries that come with Anaconda are installed into the root environment and, unless you specify otherwise, any additional libraries you install also are placed there.
d. Custom environments give you control over the specific libraries you wish to install for your specific tasks.
Answer: c. The default environment in Anaconda is called the root environment. This is created for you when you install Anaconda. All the Python libraries that come with Anaconda are installed into the root environment and, unless you specify otherwise, any additional libraries you install also are placed there. Actually, the default environment in Anaconda is called the base environment.
16.3 Q2: Which of the following statements a), b) or c) is false?
a. To use a custom Anaconda environment named tf_env, execute the following command, which affects only the current Terminal, shell or Anaconda Command Prompt:
conda activate tf_env
b. When a custom environment is activated and you install more libraries, they become part of the activated environment, not the base environment.
c. If you open separate Terminals, shells or Anaconda Command Prompts, they’ll use Anaconda’s base environment by default.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
1
Neural Networks
16.4 Q1: Which of the following statements a), b) or c) is false?
a. Deep learning is a form of machine learning that uses artificial neural networks to learn.
b. An artificial neural network is a software construct that operates similarly to how scientists believe our brains work.
c. Our biological nervous systems are controlled via neurons that communicate with one another along pathways called synapses. As we learn, the specific neurons that enable us to perform a given task, like walking, communicate with one another more efficiently.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.4 Q2: In supervised deep learning, we aim to predict the ________ labels
supplied with data samples.
a. goal
b. mark
c. object
d. target
Answer: d. target
16.4 Q3: The following diagram shows a three-layer neural network. Each circle represents a neuron, and the lines between them simulate the synapses. The output of a neuron becomes the input of another neuron, hence the term neural network. This particular diagram shows a ________.
a. partially connected network
b. crossover network
c. fully connected network
d. None of the above
Answer: c. fully connected network
16.4 Q4: Which of the following statements a), b) or c) about neural networks is false?
a. During the training phase, the network calculates values called weights for every connection between the neurons in one layer and those in the next.
b. On a neuron-by-neuron basis, each of its inputs is multiplied by that connection’s weight, then the maximum of those weighted inputs is passed to the neuron’s activation function.
c. The activation function’s output determines which neurons to activate based on the inputs, just like the neurons in your brain passing information around in response to inputs coming from your eyes, nose, ears and more.
d. All of the above statements are true.
Answer: b. On a neuron-by-neuron basis, each of its inputs is multiplied by that connection’s weight, then the maximum of those weighted inputs is passed to the neuron’s activation function. Actually, on a neuron-by-neuron basis, each of its inputs is multiplied by that connection’s weight, then the sum of those weighted inputs is passed to the neuron’s activation function.
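The corrected statement can be illustrated with a minimal sketch of a single neuron. The input values, the weights and the choice of ReLU as the activation function are assumptions made for illustration; they do not come from the chapter.

```python
def relu(x):
    """Rectified Linear Unit: pass positive values through, clamp negatives to 0."""
    return max(0.0, x)


def neuron_output(inputs, weights, activation=relu):
    # Multiply each input by its connection's weight, then SUM the products
    # (not take their maximum) before applying the activation function.
    weighted_sum = sum(i * w for i, w in zip(inputs, weights))
    return activation(weighted_sum)


inputs = [1.0, 2.0, 3.0]       # hypothetical input values
weights = [0.5, -0.25, 0.25]   # hypothetical connection weights
output = neuron_output(inputs, weights)
print(output)  # 0.5 - 0.5 + 0.75 = 0.75
```

A real network performs this calculation for every neuron in a layer at once using tensor operations, but the per-neuron arithmetic is the same idea.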
16.4 Q5: Which of the following statements about the diagram below is false?
a. The diagram shows a neuron receiving three inputs (the black dots) and producing an output (the hollow circle) that would be passed to all or some of the neurons in the next layer, depending on the types of the neural network’s layers.
b. The values w1, w2 and w3 are weights.
c. In a new model that you train from scratch, these values are initialized to zero by the model.
d. As the network trains, it tries to minimize the error rate between the network’s predicted labels and the samples’ actual labels.
Answer: c. In a new model that you train from scratch, these values are initialized to zero by the model. Actually, in a new model that you train from scratch, these values are initialized randomly by the model.
16.4 Q6: Which of the following statements a), b) or c) is false?
a. The error rate is known as the loss, and the calculation that determines the loss is called the loss function.
b. Throughout training, the network determines the amount that each neuron contributes to the overall loss, then goes back through the layers and adjusts the weights in an effort to minimize that loss.
c. The technique mentioned in Part (b) is called backpropagation. Optimizing these weights occurs gradually, typically via a process called gradient descent.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
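To make the idea of gradually minimizing a loss concrete, here is a minimal sketch of gradient descent on a single weight. It is a deliberate simplification: there is one weight and one sample, so no backpropagation through layers is involved. The squared-error loss, the sample, the starting weight and the learning rate are all hypothetical choices.

```python
def loss(w, x, y):
    """Squared error between the prediction w * x and the target y."""
    return (w * x - y) ** 2


def loss_gradient(w, x, y):
    """Derivative of the squared error with respect to the weight w."""
    return 2 * x * (w * x - y)


w = 0.0                # hypothetical starting weight
x, y = 2.0, 4.0        # one hypothetical sample; the ideal weight is 2.0
learning_rate = 0.1    # hypothetical step size

for step in range(20):
    # Move the weight a small step in the direction that reduces the loss.
    w -= learning_rate * loss_gradient(w, x, y)

print(w, loss(w, x, y))
```

Each pass shrinks the remaining error, so the weight approaches 2.0 gradually rather than jumping there in one step.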
1
Tensors
16.5 Q1: Which of the following statements is false?
a. Deep learning frameworks generally manipulate data in the form of tensors.
b. A tensor is basically a one-dimensional array.
c. Frameworks like TensorFlow pack all your data into one or more tensors, which they use to perform the mathematical calculations that enable neural networks to learn.
d. These tensors can become quite large as the number of dimensions increases and as the richness of the data increases (for example, images, audios and videos are richer than text).
Answer: b. A tensor is basically a one-dimensional array. Actually, a tensor is basically a multidimensional array.
16.5 Q2: Chollet discusses the types of tensors typically encountered in deep
learning: A 0D (0-dimensional) tensor is one value and is known as a scalar. A
1D tensor is similar to a one-dimensional array and is known as a vector. A 1D
tensor might represent a sequence, such as hourly temperature readings from a
sensor or the words of one movie review. A 2D tensor is similar to a two-
dimensional array and is known as a matrix. A 2D tensor could represent a
grayscale image in which the tensor’s two dimensions are the image’s width and
height in pixels, and the value in each element is the intensity of that pixel.
Which of the following statements a), b) or c) about additional types of tensors is false?
a. A 3D tensor is similar to a three-dimensional array and could be used to represent a color image. The first two dimensions would represent the width and height of the image in pixels and the depth at each location might represent the red, green and blue (RGB) components of a given pixel’s color. A 3D tensor also could represent a collection of 2D tensors containing grayscale images.
b. A 4D tensor could be used to represent a collection of color images in 3D tensors. It also could be used to represent one video. Each frame in a video is essentially a color image.
c. A 5D tensor could be used to represent a collection of 4D tensors containing videos.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.5 Q3: A tensor’s ________ typically is represented in Python as a tuple of values
in which the number of elements specifies the tensor’s number of dimensions
and each value in the tuple specifies the size of the tensor’s corresponding
dimension.
a. shape
b. frame
c. pattern
d. format
Answer: a. shape.
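As a rough illustration of shapes, the sketch below builds NumPy arrays standing in for tensors of 0 through 4 dimensions. The specific sizes (24 readings, 28-by-28 images, a batch of 10) are assumptions chosen for illustration.

```python
import numpy as np

scalar = np.array(7.0)                     # 0D tensor: a single value
vector = np.zeros(24)                      # 1D tensor: e.g., 24 hourly readings
gray_image = np.zeros((28, 28))            # 2D tensor: one grayscale image
color_image = np.zeros((28, 28, 3))        # 3D tensor: one RGB image
image_batch = np.zeros((10, 28, 28, 3))    # 4D tensor: 10 RGB images

tensors = [scalar, vector, gray_image, color_image, image_batch]
for tensor in tensors:
    # The shape tuple's length equals the tensor's number of dimensions.
    print(tensor.ndim, tensor.shape)
```

In each case the number of elements in the shape tuple matches ndim, and each element gives the size of the corresponding dimension.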
16.5 Q4: Which of the following statements a), b) or c) is false?
a. Powerful processors are needed for real-world deep learning because the size of tensors can be enormous and large-tensor operations can place crushing demands on processors.
b. NVIDIA GPUs (Graphics Processing Units), originally developed for computer gaming, are optimized for the mathematical matrix operations typically performed on tensors, an essential aspect of how deep learning works “under the hood.”
c. Recognizing that deep learning is crucial to its future, Google developed TPUs (Tensor Processing Units). Google now uses TPUs in its Cloud TPU service, which can perform quadrillions of floating-point operations per second.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
1
Convolutional Neural Networks for Vision; Multi-Classification with the MNIST
Dataset
16.6 Q1: Which of the following statements a), b) or c) is false?
a. In the “Machine Learning” chapter, we classified handwritten digits using the 8-by-8-pixel, low-resolution images from the Digits dataset bundled with Scikit-learn.
b. The Digits dataset is based on a subset of the higher-resolution MNIST handwritten digits dataset.
c. Recurrent neural networks are common in computer-vision applications, such as recognizing handwritten digits and characters, and recognizing objects in images and video.
d. All of the above statements are true.
Answer: c. Recurrent neural networks are common in computer-vision applications, such as recognizing handwritten digits and characters, and recognizing objects in images and video. Actually, convolutional neural networks are common in computer-vision applications, such as recognizing handwritten digits and characters, and recognizing objects in images and video.
16.6 Q2: Which of the following statements a), b) or c) is false?
a. Reproducibility is crucial in scientific studies.
b. In deep learning, reproducibility is more difficult because the libraries sequentialize operations that perform floating-point calculations.
c. Getting reproducible results in Keras requires a combination of environment settings and code settings that are described in the Keras FAQ.
d. All of the above statements are true.
Answer: b. In deep learning, reproducibility is more difficult because the libraries sequentialize operations that perform floating-point calculations. Actually, in deep learning, reproducibility is more difficult because the libraries heavily parallelize operations that perform floating-point calculations. Each time operations execute, they may execute in a different order. This can produce differences in your results.
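A small sketch of why execution order matters: floating-point addition is not associative, so summing the same values in a different order can give a slightly different result. The three values below are an arbitrary illustration, not data from the chapter.

```python
# The same three numbers, added in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right)                    # 0.6000000000000001
print(right_to_left)                    # 0.6
print(left_to_right == right_to_left)   # False
```

The difference here is tiny, but across the very large number of operations in training a network, such rounding differences can accumulate into visibly different results from run to run.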
16.6 Q3: Which of the following statements about Keras neural network components is false?
a. The network is a sequence of layers containing the neurons used to learn from the samples. Each layer’s neurons receive inputs, process them via an optimizer function, and produce outputs.
c. The data is fed into the network via an input layer that specifies the dimensions of the sample data.
d. The input layer is followed by hidden layers of neurons that implement the learning and an output layer that produces the predictions. The more layers you stack, the deeper the network is (hence the term deep learning).
Answer: a. The network is a sequence of layers containing the neurons used to learn from the samples. Each layer’s neurons receive inputs, process them via an optimizer function, and produce outputs. Actually, the network is a sequence of layers containing the neurons used to learn from the samples. Each layer’s neurons receive inputs, process them via an activation function, and produce outputs.
16.6 Q4: A ________ function produces a measure of how well a neural network
predicts the target values.
a. activation
b. optimizer
c. loss
d. None of the above
Answer: c. loss
1
Loading the MNIST Dataset
16.6 Q5: Which of the following statements is false?
a. The following code imports the tensorflow.keras.datasets.mnist module so we can load the dataset:
from tensorflow.keras.datasets import mnist
b. When we use the version of Keras built into TensorFlow, the Keras module names begin with "tensorflow.".
c. TensorFlow uses Keras to execute the deep-learning models.
d. The mnist module’s load_data function loads the MNIST training and testing sets:
(X_train, y_train), (X_test, y_test) = mnist.load_data()
When you call load_data it will download the MNIST data to your system. The function returns a tuple of two elements containing the training and testing sets. Each element is itself a tuple containing the samples and labels, respectively.
Answer: c. TensorFlow uses Keras to execute the deep-learning models. Actually, Keras uses TensorFlow to execute the deep-learning models.
1
Data Exploration
16.6 Q6: Which of the following statements a), b) or c) is false?
a. You should always get to know the data before working with it.
b. The following snippets check the dimensions of the MNIST training set images (X_train), training set labels (y_train), testing set images (X_test) and testing set labels (y_test):
[3]: X_train.shape
[3]: (60000, 28, 28)
[4]: y_train.shape
[4]: (60000,)
[5]: X_test.shape
[5]: (10000, 28, 28)
[6]: y_test.shape
[6]: (10000,)
c. You can see from X_train’s and X_test’s shapes that the MNIST images are the same resolution as those in Scikit-learn’s Digits dataset.
d. All of the above statements are true.
Answer: c. You can see from X_train’s and X_test’s shapes that the MNIST images are the same resolution as those in Scikit-learn’s Digits dataset. Actually, you can see from X_train’s and X_test’s shapes that the images are higher resolution than those in Scikit-learn’s Digits dataset (which are 8-by-8).
16.6 Q7: The IPython magic ________ indicates that Matplotlib-based graphics
should be displayed in a Jupyter notebook rather than in separate windows.
a. %matplotlib notebook
b. %matplotlib inline
c. %matplotlib Jupyter
d. None of the above
Answer: b. %matplotlib inline
16.6 Q8: Consider the following code:
import matplotlib.pyplot as plt
import numpy as np
index = np.random.choice(np.arange(len(X_train)), 24, replace=False)
figure, axes = plt.subplots(nrows=4, ncols=6, figsize=(16, 9))
for item in zip(axes.ravel(), X_train[index], y_train[index]):
    axes, image, target = item
    axes.imshow(image, cmap=plt.cm.gray_r)
    axes.set_xticks([])  # remove x-axis tick marks
    axes.set_yticks([])  # remove y-axis tick marks
    axes.set_title(target)
plt.tight_layout()
Which of the following statements a), b) or c) is false?
a. NumPy’s choice function (from the numpy.random module) selects the number of elements specified in its second argument from the front of the array of values in its first.
b. The choice function returns an array containing the selected values, which we store in index.
c. The expressions X_train[index] and y_train[index] use index to get the corresponding elements from both arrays.
d. All of the above statements are true.
Answer: a. NumPy’s choice function (from the numpy.random module) selects the number of elements specified in its second argument from the front of the array of values in its first. Actually, NumPy’s choice function (from the numpy.random module) randomly selects the number of elements specified in its second argument from the array of values in its first argument.
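The behavior of choice can be checked with a small sketch that needs no MNIST download. The range of 100 values is a stand-in for np.arange(len(X_train)), and the seed is a hypothetical value used only so repeated runs select the same elements.

```python
import numpy as np

np.random.seed(11)  # hypothetical seed, only to make reruns repeatable

population = np.arange(100)  # stand-in for np.arange(len(X_train))

# Randomly select 24 distinct values; replace=False prevents duplicates.
index = np.random.choice(population, 24, replace=False)

print(index)
print(len(index), len(set(index.tolist())))  # 24 selections, all unique
```

Because the selection is random rather than taken from the front of the array, the values generally are not simply 0 through 23.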
1
Data Preparation
16.6 Q9: Which of the following statements a), b) or c) is false?
a. Scikit-learn’s bundled datasets were preprocessed into the shapes its models require.
b. In real-world studies, you’ll generally have to do some or all of the data preparation.
c. The MNIST dataset requires some preparation for use in a Keras convnet.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.6 Q10: Which of the following statements is false?
a. Keras convnets require NumPy array inputs in which each sample has the shape:
(width, height, channels)
b. For MNIST, each image’s width and height are 28 pixels, and each pixel has one channel (the grayscale shade of the pixel from 0 to 255), so each sample’s shape will be:
(28, 28, 1)
c. Full-color images with RGB (red/green/blue) values for each pixel would have three channels, one channel each for the red, green and blue components of a color.
d. As the neural network learns from the images, it reduces the number of channels.
Answer: d. As the neural network learns from the images, it reduces the number of channels. Actually, as the neural network learns from the images, it creates many more channels.
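A minimal sketch of giving each sample the (28, 28, 1) shape uses NumPy’s reshape method. The array of zeros below is a hypothetical stand-in for a few MNIST images, so the sketch runs without loading the dataset.

```python
import numpy as np

# Stand-in for image data: 5 grayscale images of 28-by-28 pixels.
X = np.zeros((5, 28, 28))

# Add a trailing channel dimension so each sample is (28, 28, 1).
X = X.reshape((5, 28, 28, 1))

print(X.shape)     # (5, 28, 28, 1)
print(X[0].shape)  # (28, 28, 1)
```

Reshaping changes only how the existing values are indexed; the number of elements stays the same.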
16.6 Q11: Which of the following statements a), b) or c) is false?
a. Numeric features in data samples may have value ranges that vary widely. Deep learning networks perform better on data that is scaled either into the range 0.0 to 1.0, or to a range for which the data’s mean is 1.0 and its standard deviation is 0.0. Getting your data into one of these forms is known as normalization.
b. In MNIST, each pixel is an integer in the range 0–255.
c. The following code converts the values to 32-bit (4-byte) floating-point numbers using the NumPy array method astype, then divides every element in the resulting array by 255, producing normalized values in the range 0.0–1.0:
[16]: X_train = X_train.astype('float32') / 255
[17]: X_test = X_test.astype('float32') / 255
d. All of the above statements are true.
Answer: a. Numeric features in data samples may have value ranges that vary widely. Deep learning networks perform better on data that is scaled either into the range 0.0 to 1.0, or to a range for which the data’s mean is 1.0 and its standard deviation is 0.0. Getting your data into one of these forms is known as normalization. Actually, deep learning networks perform better on data that is scaled either into the range 0.0 to 1.0, or to a range for which the data’s mean is 0.0 and its standard deviation is 1.0.
16.6 Q12: The MNIST convnet’s prediction for each MNIST digit will be an array
of 10 probabilities, indicating the likelihood that the digit belongs to a particular one of the classes 0 through 9. When we evaluate the model’s accuracy, Keras compares the model’s predictions to the labels. To do that, Keras requires both to have the same ________.
a. profile
b. aspect
c. shape
d. frame
Answer: c. shape
16.6 Q13: Which of the following statements a), b) or c) is false?
a. One-hot encoding converts data into arrays of 1.0s and 0.0s in which only one element is 1.0 and the rest are 0.0s.
b. For MNIST targets, the one-hot-encoded values will be 10 x 10 arrays representing the categories 0 through 9.
c. We know precisely which category each digit belongs to, so the categorical representation of a digit label will consist of a 1.0 at that digit’s index and 0.0s for all the other elements.
d. All of the above statements are true.
Answer: b. For MNIST targets, the one-hot-encoded values will be 10 x 10 arrays representing the categories 0 through 9. Actually, the one-hot encoded values will be one-dimensional 10-element arrays.
16.6 Q14: Which of the following statements a), b) or c) is false?
a. The one-hot encoded representation of the digit 7 is:
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
b. The tensorflow.keras.utils module provides function to_categorical to perform one-hot encoding.
c. The function counts the unique categories then, for each item being encoded, creates an array of that length with a 1.0 in the correct position.
d. All of the above statements are true.
Answer: a. The one-hot encoded representation of the digit 7 is:
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
Actually, the one-hot encoded representation of the digit 7 is:
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
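The encoding itself can be sketched in plain NumPy. This is not the implementation of to_categorical; the helper one_hot is a hypothetical function written for illustration, and it takes the number of classes as an explicit argument rather than counting the categories.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Return one row per label with 1.0 at the label's index and 0.0 elsewhere."""
    encoded = np.zeros((len(labels), num_classes), dtype='float32')
    # For row i, set the column given by labels[i] to 1.0.
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded


labels = np.array([7, 0, 3])   # hypothetical digit labels
encoded = one_hot(labels, 10)

print(encoded[0])     # the row for digit 7 has its 1.0 at index 7
print(encoded.shape)  # (3, 10): one 10-element array per label
```

Because indexes start at 0, the 1.0 for the digit 7 lands in the eighth position of its row.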
1
Creating the Neural Network
16.6 Q15: Which of the following statements a), b) or c) is false?
a. The following code begins configuring a convolutional neural network with a Keras Sequential model from the tensorflow.keras.models module:
[24]: from tensorflow.keras.models import Sequential
[25]: cnn = Sequential()
b. The network resulting from the code in Part (a) will execute its layers sequentially: the output of one layer becomes the input to the next.
c. Networks that operate as described in Part (b) are called feed-forward networks. All neural networks operate this way.
d. All of the above statements are true.
Answer: c. Networks that operate as described in Part (b) are called feed-forward networks. All neural networks operate this way. Actually, not all neural networks are feed forward; we also discuss recurrent neural networks.
16.6 Q16: A typical convolutional neural network consists of several layers: an input layer that receives the training samples, ________ layers that learn from the samples and an output layer that produces the prediction probabilities.
a. intermediate
b. study
c. training
d. hidden
Answer: d. hidden
16.6 Q17: Which of the following statements a), b) or c) is false?
a. The following code imports from the tensorflow.keras.layers module popular layer classes we can use to construct a basic convnet:
from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPooling2D
b. We can begin our network with a convolution layer, which uses the relationships between pixels that are close to one another to learn useful features (or patterns) in large areas of each sample.
c. The areas that convolution learns from are called kernels or patches.
d. All of the above statements are true.
Answer: b. We can begin our network with a convolution layer, which uses the relationships between pixels that are close to one another to learn useful features (or patterns) in large areas of each sample. Actually, we can begin our network with a convolution layer, which uses the relationships between pixels that are close to one another to learn useful features (or patterns) in small areas of each sample.
16.6 Q18: Consider the following diagram in which the 3-by-3 shaded square represents the kernel in a convolution layer:
Which of the following statements a), b) or c) is false?
a. The areas that convolution learns from are called kernels or patches.
b. When the kernel reaches the right edge, the convolution layer moves the kernel down three pixels and repeats this left-to-right process.
c. Kernels typically are 3-by-3, though larger kernels can be used for higher-resolution images.
d. All of the above statements are true.
Answer: b. When the kernel reaches the right edge, the convolution layer moves the kernel down three pixels and repeats this left-to-right process. Actually, when the kernel reaches the right edge, the convolution layer moves the kernel one pixel down and repeats this left-to-right process.
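The left-to-right, one-pixel-at-a-time movement can be sketched by listing where the kernel’s top-left corner sits at each step. The helper kernel_positions is hypothetical, and the sketch assumes a step of one pixel and no padding around the image; the 6-by-6 image size is chosen only to keep the output short.

```python
def kernel_positions(height, width, kernel_size=3, stride=1):
    """Return the (row, column) of the kernel's top-left corner at each position."""
    return [(row, col)
            for row in range(0, height - kernel_size + 1, stride)
            for col in range(0, width - kernel_size + 1, stride)]


positions = kernel_positions(6, 6)   # hypothetical 6-by-6 image
print(positions[:5])    # [(0, 0), (0, 1), (0, 2), (0, 3), (1, 0)]
print(len(positions))   # 16 positions: 4 across by 4 down

# For a 28-by-28 MNIST image, a 3-by-3 kernel fits in 26 * 26 positions.
print(len(kernel_positions(28, 28)))  # 676
```

After the fourth position in a row of the 6-by-6 image, the kernel moves down one row (not three) and starts again from the left.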
16.6 Q19: Which of the following statements a), b) or c) about convolution is false?
a. Kernel-size is a tunable hyperparameter.
b. For each kernel position, the convolution layer performs mathematical calculations using the kernel features to “learn” about them, then outputs one new feature to the layer’s output.
c. By looking at features near one another, the network begins to recognize features like edges, straight lines and curves.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.6 Q20: Which of the following statements is false?
a. The number of filters depends on the image dimensions: higher-resolution images have more features, so they require more filters.
b. If you study the code the Keras team used to produce their pretrained convnets, you’ll find that they used 64, 128 or even 256 filters in their first convolutional layers. Based on their convnets and the fact that the MNIST images are small, we used 64 filters in our first convolutional layer.
c. The set of filters produced by a convolution layer is called a feature map. Subsequent convolution layers combine features from previous feature maps to recognize larger features and so on. If we were doing facial recognition, early layers might recognize lines, edges and curves, and subsequent layers might begin combining those into larger features like eyes, eyebrows, noses, ears and mouths.
d. Once the network learns a feature, because of convolution, it no longer needs to recognize that feature elsewhere in the image.
Answer: d. Once the network learns a feature, because of convolution, it no longer needs to recognize that feature elsewhere in the image. Actually, once the network learns a feature, because of convolution, it can recognize that feature anywhere in the image. This is one of the reasons that convnets are used for object recognition in images.
16.6 Q21: Which of the following statements is false?
a. The following code adds a Conv2D convolution layer to a model named cnn:
cnn.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
b. The Conv2D layer in Part (a) is configured with the following arguments:
• filters=64: The number of filters in the resulting feature map.
• kernel_size=(3, 3): The size of the kernel used in each filter.
• activation='relu': The 'relu' (Rectified Linear Unit) activation function is used to produce this layer’s output. 'relu' is the most widely used activation function in today’s deep learning networks and is good for performance because it’s easy to calculate. It’s commonly recommended for convolutional layers.
c. Assuming the Conv2D layer in Part (a) is the first layer of the network, we also pass the input_shape=(28, 28, 1) argument to specify the shape of each sample. This automatically creates an input layer to load the samples and pass them into the Conv2D layer, which is actually the first hidden layer.
d. In Keras, for each subsequent layer you must explicitly specify the layer’s input_shape to match the previous layer’s output shape, making it possible to stack layers.
Answer: d. In Keras, for each subsequent layer you must explicitly specify the layer’s input_shape to match the previous layer’s output shape, making it possible to stack layers. Actually, in Keras, each subsequent layer infers its input_shape from the previous layer’s output shape, making it easy to stack layers.
16.6 Q22: Which of the following statements is false?
a. Overfitting can occur when your model is too simple compared to what it is
modeling—in the most extreme overfitting case, a model memorizes its training
data.
b. When you make predictions with an overfit model, they will be accurate if
new data matches the training data, but the model could perform poorly with
data it has never seen.
c. Overfitting tends to occur in deep learning as the dimensionality of the layers
becomes too large.
d. Some techniques to prevent overfitting include training for fewer epochs, data
augmentation, dropout and L1 or L2 regularization.
Answer: a. Overfitting can occur when your model is too simple compared
to what it is modeling—in the most extreme overfitting case, a model
memorizes its training data. Actually, overfitting can occur when your
model is too complex compared to what it is modeling—in the most
extreme overfitting case, a model memorizes its training data.
16.6 Q23: Which of the following statements a), b) or c) is false?
a. To reduce overfitting and computation time, a convolution layer is often
followed by one or more layers that increase the dimensionality of the
convolution layer’s output.
b. A pooling layer compresses (or down-samples) the results by discarding
features, which helps make the model more general.
c. The most common pooling technique is called max pooling, which examines a
2-by-2 square of features and keeps only the maximum feature.
d. All of the above statements are true.
Answer: a. To reduce overfitting and computation time, a convolution layer
is often followed by one or more layers that increase the dimensionality of
the convolution layer’s output. Actually, to reduce overfitting and
computation time, a convolution layer is often followed by one or more
layers that reduce the dimensionality of the convolution layer’s output.
16.6 Q24: Consider the following diagram in which the numeric values in the
6-by-6 square represent the features that we wish to compress and the 2-by-2
blue square in position 1 represents the initial pool of features to examine:
Which of the following statements a), b) or c) is false?
a. The max pooling layer first looks at the pool in position 1 above, then outputs
the maximum feature from that pool—9 in our diagram.
b. Unlike convolution, there’s no overlap between pools. Once the pool reaches
the right edge, the pooling layer moves the pool down by its height—2 rows—
then continues from left-to-right. Because of the feature reduction in each group,
2-by-2 pooling compresses the number of features by 50%.
c. The following code adds a MaxPooling2D layer to a model named cnn:
cnn.add(MaxPooling2D(pool_size=(2, 2)))
d. All of the above statements are true.
Answer: b. Unlike convolution, there’s no overlap between pools. Once the
pool reaches the right edge, the pooling layer moves the pool down by its
height—2 rows—then continues from left-to-right. Because of the feature
reduction in each group, 2-by-2 pooling compresses the number of features
by 50%. Actually, because every group of four features is reduced to one,
2-by-2 pooling compresses the number of features by 75%.
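The pooling walk described in this question can be sketched in plain Python. This is only an illustration: the 6-by-6 grid holds made-up values (the question's diagram is not reproduced here), and max_pool_2x2 is a hypothetical helper, not a Keras function.

```python
def max_pool_2x2(features):
    """Keep the maximum of each non-overlapping 2-by-2 pool."""
    pooled = []
    # Move down by the pool's height (2 rows) after each pass
    for row in range(0, len(features) - 1, 2):
        pooled_row = []
        # Move right by the pool's width (2 columns), so pools never overlap
        for col in range(0, len(features[row]) - 1, 2):
            pool = [features[row][col], features[row][col + 1],
                    features[row + 1][col], features[row + 1][col + 1]]
            pooled_row.append(max(pool))
        pooled.append(pooled_row)
    return pooled

# Hypothetical 6-by-6 feature values, chosen only for illustration
grid = [
    [2, 9, 1, 4, 0, 3],
    [5, 7, 6, 8, 2, 1],
    [3, 0, 4, 4, 7, 5],
    [1, 2, 9, 3, 6, 0],
    [8, 1, 0, 2, 5, 5],
    [4, 6, 3, 7, 1, 9],
]

pooled = max_pool_2x2(grid)
features_before = sum(len(row) for row in grid)
features_after = sum(len(row) for row in pooled)
reduction = 1 - features_after / features_before
print(pooled)              # 3-by-3 grid of maximums
print(f'{reduction:.0%}')  # share of features discarded
```

Each pool of four features yields one output, so 36 features become 9, which is the 75% compression given in the answer.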
16.6 Q25: Which of the following statements is false?
a. Convnets often have many convolution and pooling layers.
b. The Keras team’s convnets tend to double the number of filters in subsequent
convolutional layers to enable the model to learn more relationships between
the features.
c. The following snippets add a convolution layer with 128 filters, followed by a
pooling layer to reduce the dimensionality by 50%:
cnn.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu'))
cnn.add(MaxPooling2D(pool_size=(2, 2)))
d. For odd dimensions like 11-by-11, Keras pooling layers round down by
default.
Answer: c. The following snippets add a convolution layer with 128 filters,
followed by a pooling layer to reduce the dimensionality by 50%:
cnn.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu'))
cnn.add(MaxPooling2D(pool_size=(2, 2)))
Actually, the pooling layer in the preceding code reduces the dimensionality
of the Conv2D layer’s output by 75%.
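The layer sizes in these questions can be checked with a little arithmetic. The sketch below assumes the layers use no padding, a convolution stride of 1 and non-overlapping pools; the two helper functions are hypothetical names, not part of Keras.

```python
def conv_output_size(size, kernel):
    """Width/height after a convolution with no padding and stride 1."""
    return size - kernel + 1

def pool_output_size(size, pool):
    """Width/height after non-overlapping pooling; odd sizes round down."""
    return size // pool

sizes = [28]                                    # MNIST images are 28-by-28
sizes.append(conv_output_size(sizes[-1], 3))    # first Conv2D, 3-by-3 kernel
sizes.append(pool_output_size(sizes[-1], 2))    # first MaxPooling2D, 2-by-2 pool
sizes.append(conv_output_size(sizes[-1], 3))    # second Conv2D
sizes.append(pool_output_size(sizes[-1], 2))    # second MaxPooling2D (11 rounds down)

# Share of features discarded by the first pooling layer
first_pool_reduction = 1 - (sizes[2] ** 2) / (sizes[1] ** 2)
print(sizes)
print(first_pool_reduction)
```

Halving both the width and the height leaves one quarter of the features, which is why 2-by-2 pooling reduces dimensionality by 75% rather than 50%.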
16.6 Q26: A Keras ________ layer reshapes its input to one dimension.
a. Dropout
b. Masking
c. Dense
d. Flatten
Answer: d. Flatten
16.6 Q27: Which of the following statements a), b) or c) is false?
a. Learning the relationships among features and performing classification is
accomplished with partially connected Dense layers.
b. The following Dense layer creates 128 neurons (units) that learn from the
outputs of the previous layer:
cnn.add(Dense(units=128, activation='relu'))
c. Many convnets contain at least one Dense layer like the one above. Convnets
geared to more complex image datasets with higher-resolution images like
ImageNet (http://www.image-net.org)—a dataset of over 14 million images—
often have several Dense layers, commonly with 4096 neurons.
d. All of the above statements are true.
Answer: a. Learning the relationships among features and performing
classification is accomplished with partially connected Dense layers.
Actually, learning the relationships among features and performing
classification is accomplished with fully connected Dense layers.
16.6 Q28: Consider the following code:
cnn.add(Dense(units=10, activation='softmax'))
Which of the following statements a), b) or c) about our convolutional neural net
that recognizes MNIST digits is false?
a. Our final layer in the preceding snippet is a Dense layer that classifies the
inputs into neurons representing the classes 0 through 9.
b. The softmax activation function converts the values of these remaining 10
neurons into categorical string labels.
c. The neuron that produces the highest probability represents the prediction
for a given digit image.
d. All of the above statements are true.
Answer: b. The softmax activation function converts the values of these
remaining 10 neurons into categorical string labels. Actually, the softmax
activation function converts the values of these remaining 10 neurons into
classification probabilities in the range 0.0 – 1.0.
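A minimal pure-Python sketch of what softmax computes may help here. The ten raw neuron values are made up for illustration, and this softmax function is a simplified stand-in rather than Keras's own implementation.

```python
import math

def softmax(values):
    """Convert raw neuron outputs into probabilities that sum to 1.0."""
    shift = max(values)  # subtracting the maximum avoids overflow in exp
    exponentials = [math.exp(value - shift) for value in values]
    total = sum(exponentials)
    return [e / total for e in exponentials]

# Hypothetical raw outputs of the 10 neurons for one digit image
raw_outputs = [1.2, 0.3, -0.8, 2.5, 0.0, -1.1, 0.7, 4.0, 0.1, -0.5]

probabilities = softmax(raw_outputs)
predicted_digit = probabilities.index(max(probabilities))
print(predicted_digit)
```

The result is a list of ten probabilities, not labels; the prediction is the index of the largest one.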
16.6 Q29: Consider the output of the following snippet:
[34]: cnn.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 26, 26, 64)        640
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 64)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 11, 11, 128)       73856
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 5, 5, 128)         0
_________________________________________________________________
flatten_1 (Flatten)          (None, 3200)              0
_________________________________________________________________
dense_1 (Dense)              (None, 128)               409728
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1290
=================================================================
Total params: 485,514
Trainable params: 485,514
Non-trainable params: 0
_________________________________________________________________
Which of the following statements is false?
a. A model’s summary method shows you the model’s layers.
b. The parameters are the weights that the network learns during training. Our
relatively small convnet needs to learn nearly 500,000 parameters.
c. In the Output Shape, None simply means that the model does not know in
advance how many training samples you’re going to provide—this is known only
when you start the training.
d. By default, Keras trains only the parameters that most affect prediction accuracy.
Answer: d. By default, Keras trains only the parameters that most affect
prediction accuracy. Actually, by default, Keras trains all parameters.
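The Param # column in the summary can be reproduced by hand. The sketch below assumes each filter or neuron has one weight per input plus one bias; conv2d_params and dense_params are hypothetical helper names, not Keras functions.

```python
def conv2d_params(kernel_height, kernel_width, input_channels, filters):
    """Weights in every filter plus one bias per filter."""
    return (kernel_height * kernel_width * input_channels + 1) * filters

def dense_params(inputs, units):
    """One weight per input plus one bias, for every neuron."""
    return (inputs + 1) * units

params = {
    'conv2d_1': conv2d_params(3, 3, 1, 64),     # 28-by-28-by-1 input
    'conv2d_2': conv2d_params(3, 3, 64, 128),   # 64 channels from the first block
    'dense_1': dense_params(5 * 5 * 128, 128),  # 3200 flattened features
    'dense_2': dense_params(128, 10),           # 10 output classes
}
total_params = sum(params.values())
print(params)
print(f'{total_params:,}')
```

The pooling and Flatten layers contribute nothing because they have no weights to learn, which matches the 0 entries in the summary.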
16.6 Q30: Which of the following statements a), b) or c) is false?
a. You can visualize the model summary using the plot_model function from
the module tensorflow.keras.utils, as in:
from tensorflow.keras.utils import plot_model
from IPython.display import Image
plot_model(cnn, to_file='convnet.png', show_shapes=True,
           show_layer_names=True)
b. Module IPython.display’s Image class can be used to load an image into
a Jupyter Notebook and display the image in the notebook.
c. Keras assigns the layer names in the image:
d. All of the above statements are true.
Answer: d. All of the above statements are true.
Compiling the Model
16.6 Q31: Once you’ve added all the layers to a Keras neural network, you
complete the Keras model by calling its compile method, as in:
cnn.compile(optimizer='adam',
            loss='categorical_crossentropy',
            metrics=['accuracy'])
Which of the following statements about the arguments is false?
a. optimizer='adam' specifies the optimizer this model will use to adjust the
weights throughout the neural network as it learns.
b. There are many optimizers—'adam' performs well across a wide variety of
models.
c. loss='categorical_crossentropy' specifies the loss function used by
the optimizer in multi-classification networks like our convnet, which predicts
10 classes. As the neural network learns, the optimizer attempts to maximize
the values returned by the loss function. The greater the loss, the better the
neural network is at predicting what each image is.
d. metrics=['accuracy']—This is a list of the metrics that the network will
produce to help you evaluate the model. We use the accuracy metric to check
the percentage of correct predictions.
Answer: c. loss='categorical_crossentropy' specifies the loss
function used by the optimizer in multi-classification networks like our
convnet, which predicts 10 classes. As the neural network learns, the
optimizer attempts to maximize the values returned by the loss function.
The greater the loss, the better the neural network is at predicting what
each image is. Actually, as the neural network learns, the optimizer
attempts to minimize the values returned by the loss function. The lower the
loss, the better the neural network is at predicting what each image is.
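To see why a lower loss means a better prediction, here is a simplified sketch of categorical cross-entropy for a single sample. The two probability lists are made up for illustration, and the function omits details of the real Keras loss, such as clipping and averaging over a batch.

```python
import math

def categorical_crossentropy(one_hot_label, probabilities):
    """Negative log of the probability assigned to the correct class."""
    return -sum(target * math.log(probability)
                for target, probability in zip(one_hot_label, probabilities)
                if target > 0)

label = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0]   # one-hot encoding of the digit 7

# Hypothetical predictions: one confident and correct, one undecided
confident = [0.01] * 7 + [0.91] + [0.01] * 2
undecided = [0.1] * 10

loss_confident = categorical_crossentropy(label, confident)
loss_undecided = categorical_crossentropy(label, undecided)
print(loss_confident, loss_undecided)
```

The confident, correct prediction produces the smaller loss, which is the direction the optimizer pushes the weights during training.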
Training and Evaluating the Model
16.6 Q32: You train a Keras model by calling its fit method. Which of the
following statements about the fit method is false?
a. As in Scikit-learn, the first two arguments are the training data and the
categorical target labels.
b. The iterations argument specifies the number of times the model should
process the entire set of training data.
c. batch_size specifies the number of samples to process at a time during
each epoch. Most models specify a power of 2 from 32 to 512. Larger batch sizes
can decrease model accuracy.
d. In general, some samples should be used to validate the model. If you specify
validation data, after each epoch, the model will use it to make predictions and
display the validation loss and accuracy. You can study these values to tune your
layers and the fit method’s hyperparameters, or possibly change the layer
composition of your model.
Answer: b. The iterations argument specifies the number of times the
model should process the entire set of training data. Actually, the epochs
argument specifies the number of times the model should process the
entire set of training data.
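The relationship between epochs and batch_size can be sketched with simple arithmetic. The sample count, batch size and epoch count below are hypothetical values chosen for illustration, and batches_per_epoch is not a Keras function.

```python
import math

def batches_per_epoch(sample_count, batch_size):
    """Number of batches needed to process every sample once."""
    return math.ceil(sample_count / batch_size)

sample_count = 54000   # hypothetical number of training samples
batch_size = 64        # samples processed at a time
epochs = 5             # complete passes through the training data

steps = batches_per_epoch(sample_count, batch_size)
total_batches = steps * epochs
print(steps, total_batches)
```

The last batch of each epoch is smaller than the rest whenever the sample count is not a multiple of batch_size.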
16.6 Q33: Which of the following statements a), b) or c) is false?
a. TensorBoard is a TensorFlow tool for visualizing data from your deep-learning
models as they execute.
b. You can view TensorFlow charts showing how the training and validation
accuracy and loss values change through the epochs.
c. Andrej Karpathy’s ConvnetJS tool trains convnets in your web browser and
dynamically visualizes the layers’ outputs, including what each convolutional
layer “sees” as it learns.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.6 Q34: Consider the following code:
[38]: loss, accuracy = cnn.evaluate(X_test, y_test)
10000/10000 [==============================] - 4s 366us/step
[39]: loss
[39]: 0.026809450998473768
[40]: accuracy
[40]: 0.9917
Which of the following statements a), b) or c) is false?
a. You can check the accuracy of a model on data the model has not yet seen. To
do so, call the model’s evaluate method, which displays, as its output, how
long it took to process the test samples.
b. According to the output of the preceding snippet, our convnet model is
99.17% accurate when predicting the labels for unseen data.
c. With a little online research, you can find models that can predict MNIST with
nearly 100% accuracy.
d. Each of the above statements is true.
Answer: d. Each of the above statements is true.
16.6 Q35: Which of the following statements a), b) or c) is false?
a. Calling the cnn model’s predict method as shown below predicts the
classes of the digit images in its argument array (X_test):
predictions = cnn.predict(X_test)
b. You can check what the first sample digit should be by looking at y_test[0]:
[42]: y_test[0]
[42]: array([0., 0., 0., 0., 0., 0., 0., 1., 0., 0.], dtype=float32)
According to this output, the first sample is the digit 7, because the categorical
representation of the test sample’s label specifies a 1.0 at index 7—we created
this representation via one-hot encoding.
c. The following code outputs the probabilities returned by the predict
method for the first test sample:
[43]: for index, probability in enumerate(predictions[0]):
          print(f'{index}: {probability:.10%}')
0: 0.0000000201%
1: 0.0000001355%
2: 0.0000186951%
3: 0.0000015494%
4: 0.0000000003%
5: 0.0000000012%
6: 0.0000000000%
7: 99.9999761581%
8: 0.0000005577%
9: 0.0000011416%
According to the output, predictions[0] indicates that our model believes
this digit is a 7 with nearly 100% certainty. Not all predictions have this level of
certainty.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
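Comparing a prediction with its one-hot label comes down to finding the index of the largest value in each. The sketch below copies the label and the percentages from the outputs shown in this question into plain lists; argmax is a hypothetical helper written for this example.

```python
def argmax(values):
    """Index of the largest value in a list."""
    return max(range(len(values)), key=values.__getitem__)

# One-hot label from y_test[0] in the question
expected_label = [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]

# Percentages printed for predictions[0] in the question
predicted_percentages = [
    0.0000000201, 0.0000001355, 0.0000186951, 0.0000015494, 0.0000000003,
    0.0000000012, 0.0000000000, 99.9999761581, 0.0000005577, 0.0000011416,
]

expected_digit = argmax(expected_label)
predicted_digit = argmax(predicted_percentages)
is_correct = expected_digit == predicted_digit
print(expected_digit, predicted_digit, is_correct)
```

Applying the same comparison to every test sample is one way to locate the images the model predicts incorrectly.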
Saving and Loading a Model
16.6 Q36: Which of the following statements a), b) or c) is false?
a. Neural network models can require significant training time. Once you’ve
designed and tested a model that suits your needs, you can save its state. This
allows you to load it later to make more predictions. Sometimes models are
loaded and further trained for new problems. For example, layers in our model
already know how to recognize features such as lines and curves, which could be
useful in handwritten character recognition as well. This process is called
transfer learning—you transfer an existing model’s knowledge into a new
model.
b. A Keras model’s save method stores the model’s architecture and state
information in a format called Hierarchical Data Format (HDF5). Such files use
the .h5 file extension:
[51]: cnn.save('mnist_cnn.h5')
c. You can load a saved model with the load_model function from the
tensorflow.keras.models module, as in:
from tensorflow.keras.models import load_model
cnn = load_model('mnist_cnn.h5')
You can then invoke its methods. For example, if you’ve acquired more data, you
could call predict to make additional predictions on new data, or you could
call fit to start training with the additional data.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
Visualizing Neural Network Training with TensorBoard
16.7 Q1: Which of the following statements a), b) or c) is false?
a. With deep learning networks, there’s so much complexity and so much going
on internally that’s hidden from you that it’s difficult to know and fully
understand all the details. This creates challenges in testing, debugging and
updating models and algorithms.
b. Deep learning learns the features but there may be enormous numbers of
them, and they may not be apparent to you.
c. Google provides the TensorBoard tool for visualizing neural networks
implemented in TensorFlow and Keras. Just as a car’s dashboard visualizes data
from your car’s sensors, such as your speed, engine temperature and the amount
of gas remaining, a TensorBoard dashboard visualizes data from a deep learning
model that can give you insights into how well your model is learning and
potentially help you tune its hyperparameters.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.7 Q2: Which of the following statements a), b) or c) is false?
a. TensorBoard monitors a folder you specify looking for files output by models
during training.
b. TensorBoard loads the data from that folder into a browser-based dashboard,
similar to the following:
c. TensorBoard can load data from multiple models at once and you can choose
which to visualize. This makes it easy to compare several different models or
multiple runs of the same model.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.7 Q3: To use TensorBoard, before you fit the model, you need to configure a
TensorBoard object, which the model will use to write data into a specified
folder that TensorBoard monitors. This object is known as a ________ in Keras.
a. callforward
b. entry point
c. callback
d. None of the above.
Answer: c. callback
16.7 Q4: The following code creates a TensorBoard object:
from tensorflow.keras.callbacks import TensorBoard
import time
tensorboard_callback = TensorBoard(log_dir=f'./logs/mnist{time.time()}',
    histogram_freq=1, write_graph=True)
Which of the following statements a), b) or c) about the above code is false?
a. The log_dir argument is the name of the folder in which this model’s log
files will be written. The notation './logs/' indicates that we’re creating a new
folder within the logs folder you created previously. The preceding code follows
that with 'mnist' and the current time. Using the time ensures that each new
execution of the notebook will have its own log folder. That will enable you to
compare multiple executions in TensorBoard.
b. The histogram_freq argument is the frequency in epochs that Keras will
output to the model’s log files. In this case, we’ll write data to the logs for every
epoch.
c. When the write_graph argument is True, a graph of the model will be
output. You can view the graph in the GRAPHS tab in TensorBoard.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
ConvnetJS: Browser-Based Deep-Learning Training and Visualization
No questions.
Recurrent Neural Networks for Sequences; Sentiment Analysis with the IMDb Dataset
16.9 Q1: Which of the following statements a), b) or c) is false?
a. Our convnet used stacked layers that were applied sequentially. Non-sequential
models are possible with recurrent neural networks.
b. A recurrent neural network (RNN) processes sequences of data, such as time
series or text in sentences.
c. The term “recurrent” comes from the fact that the neural network contains
loops in which the output of a given layer becomes the input to that same layer
in the next time step.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.9 Q2: Which of the following statements is false?
a. In a time series, a time step is the next point in time.
b. In a text sequence, a “time step” would be the next word in a sequence of
words.
c. The looping in convolutional neural networks enables them to learn and
remember relationships among the data in the sequence.
d. The word “good” on its own has positive sentiment. However, when preceded
by “not,” which appears earlier in the sequence, the sentiment becomes negative.
Answer: c. The looping in convolutional neural networks enables them to
learn and remember relationships among the data in the sequence.
Actually, the looping in recurrent neural networks enables them to learn
and remember relationships among the data in the sequence.
16.9 Q3: Which of the following statements a), b) or c) is false?
a. RNNs for text sequences take into account the relationships among the earlier
and later parts of a sequence.
b. When determining the meaning of text there can be many words to consider
and an arbitrary number of words in between them.
c. A Long Short-Term Memory (LSTM) layer makes a neural network
convolutional and is optimized to handle learning from sequences.
d. All of the above statements are true.
Answer: c. A Long Short-Term Memory (LSTM) layer makes a neural
network convolutional and is optimized to handle learning from
sequences. Actually, a Long Short-Term Memory (LSTM) layer makes a
neural network recurrent and is optimized to handle learning from
sequences.
Loading the IMDb Movie Reviews Dataset
16.9 Q4: Which of the following statements is false?
a. The IMDb movie reviews dataset included with Keras contains 25,000 training
samples and 25,000 testing samples, each labeled with its positive (1) or
negative (0) sentiment.
b. The following code imports the tensorflow.keras.datasets.imdb
module so we can load the dataset:
from tensorflow.keras.datasets import imdb
c. The imdb module’s load_data function returns the IMDb training and
testing sets. The load_data function enables you to specify the number of
unique words to import as part of the training and testing data. The following
code loads only the top 10,000 most frequently occurring words:
number_of_words = 10000
(X_train, y_train), (X_test, y_test) = imdb.load_data(
    num_words=number_of_words)
d. The load_data call in Part (c) returns a tuple of two elements containing
the samples and labels, respectively.
Answer: d. The load_data call in Part (c) returns a tuple of two elements
containing the samples and labels, respectively. Actually, the load_data
call in Part (c) returns a tuple of two elements containing the training and
testing sets. Each element is itself a tuple containing the samples and
labels, respectively.
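The nested structure described in this answer can be illustrated without downloading the dataset. In the sketch below, load_data_stand_in is a hypothetical function that only mimics the shape of the value returned by load_data; the tiny samples and labels are made up.

```python
def load_data_stand_in():
    """Hypothetical stand-in mimicking the nesting of load_data's return value."""
    training_set = ([[1, 14, 22], [1, 4, 9, 30]], [1, 0])   # (samples, labels)
    testing_set = ([[1, 6, 87]], [1])                        # (samples, labels)
    return training_set, testing_set

result = load_data_stand_in()

# The outer tuple holds the two sets; each set unpacks into samples and labels
(X_train, y_train), (X_test, y_test) = result
print(len(result), len(X_train), len(X_test))
```

The parenthesized targets on the left side of the assignment mirror the nesting, which is why the call in Part (c) can unpack all four names in one statement.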
Data Exploration
16.9 Q5: Which of the following statements a), b) or c) is false?
a. Assuming the IMDb training set samples, training set labels, testing set
samples and testing set labels are stored in X_train, y_train, X_test, and
y_test, respectively, the following code snippets check their dimensions:
[4]: X_train.shape
[4]: (25000,)
[5]: y_train.shape
[5]: (25000,)
[6]: X_test.shape
[6]: (25000,)
[7]: y_test.shape
[7]: (25000,)
b. The arrays y_train and X_test are one-dimensional arrays containing 1s
and 0s, indicating whether each review is positive or negative.
c. Based on the outputs from the snippets in Part (a), X_train and X_test
appear to be one-dimensional. However, their elements actually are lists of
integers, each representing one review’s contents, as shown in the code below:
[8]: %pprint
[8]: Pretty printing has been turned OFF
[9]: X_train[123]
[9]: [1, 307, 5, 1301, 20, 1026, 2511, 87, 2775, 52, 116, 5, 31, 7, 4, 91, 1220, 102, 13, 28, 110, 11, 6, 137, 13, 115, 219, 141, 35, 221, 956, 54, 13, 16, 11, 2714, 61, 322, 423, 12, 38, 76, 59, 1803, 72, 8, 2, 23, 5, 967, 12, 38, 85, 62, 358, 99]
d. All of the above statements are true.
Answer: b. The arrays y_train and X_test are one-dimensional arrays
containing 1s and 0s, indicating whether each review is positive or
negative. Actually, the arrays y_train and y_test are one-dimensional
arrays containing 1s and 0s, indicating whether each review is positive or
negative.
16.9 Q6: Which of the following statements a), b) or c) is false?
a. Because IMDb movie reviews are numerically encoded in the dataset bundled
with Keras, to view their original text, you need to know the word to which each
number corresponds.
b. Keras’s IMDb dataset provides a dictionary that maps the words to their
indexes. Each word’s corresponding value is its frequency ranking among all the
words in the entire set of reviews.
c. In the dictionary mentioned in Part (b), the word with the ranking 1 is the
most frequently occurring word (calculated by the Keras team from the dataset),
the word with ranking 2 is the second most frequently occurring word, and so
on. Though the dictionary values begin with 1 as the most frequently occurring
word, in each encoded review, the ranking values are offset by 3. So any review
containing the most frequently occurring word will have the value 4 wherever
that word appears in the review.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.9 Q7: Which of the following statements a), b) or c) is false regarding
decoding IMDb movie reviews?
a. The following snippet gets the word-to-index dictionary by calling the
function get_word_index from the tensorflow.keras.datasets.imdb module:
[10]: word_to_index = imdb.get_word_index()
b. The word 'great' might appear in a positive movie review, so the following
code checks whether it’s in the dictionary:
[11]: word_to_index['great']
[11]: 84
c. According to the Part (b) output, 'great' is the dataset’s 84th most frequent
word. If you look up a word that’s not in the dictionary, you’ll get an exception.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
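The offset of 3 described in Q6 matters when decoding a review. The sketch below uses a tiny stand-in dictionary rather than the real one: only the ranking of 'great' (84) comes from the output above, and the other rankings and the encoded review are assumptions made for illustration.

```python
# Hypothetical stand-in for the dictionary returned by get_word_index
word_to_index = {'the': 1, 'and': 2, 'a': 3, 'movie': 17, 'great': 84}

# Reverse the mapping so a ranking can be looked up to get its word
index_to_word = {index: word for word, index in word_to_index.items()}

# Made-up encoded review; values below 4 are not word rankings
encoded_review = [1, 4, 87, 20]

# Subtract the offset of 3 before each lookup; show '?' for anything unmapped
decoded = ' '.join(index_to_word.get(value - 3, '?') for value in encoded_review)
print(decoded)
```

Using the dictionary's get method with a default avoids the exception mentioned in Part (c) when a value has no matching word.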
Data Preparation
16.9 Q8: Which of the following statements a), b) or c) is false?
a. The number of words per review varies, but Keras requires all samples to
have the same dimensions.
b. To use the IMDb dataset for deep learning, we need to restrict every review to
the same number of words.
c. When performing the data preparation in Part (b), some reviews will need to
be padded with additional data and others will need to be truncated.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.9 Q9: Which of the following statements is false?
a. The pad_sequences utility function (module
tensorflow.keras.preprocessing.sequence) reshapes the rows in an
array to the number of features specified by the maxlen argument (200) and
returns a two-dimensional array:
[16]: words_per_review = 200
[17]: from tensorflow.keras.preprocessing.sequence import pad_sequences
[18]: X_train = pad_sequences(X_train, maxlen=words_per_review)
b. If a sample has more features, pad_sequences truncates it to the specified
length.
c. If a sample has fewer features, pad_sequences adds blanks to the beginning
of the sequence to pad it to the specified length.
d. Let’s confirm X_train’s new shape:
[19]: X_train.shape
[19]: (25000, 200)
Answer: c. If a sample has fewer features, pad_sequences adds blanks to
the beginning of the sequence to pad it to the specified length. Actually, if a
sample has fewer features, pad_sequences adds 0s to the beginning of the
sequence to pad it to the specified length.
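The padding and truncating behavior can be mimicked in plain Python. This is a minimal sketch that assumes the default settings of pad_sequences (pad with 0s at the beginning, truncate from the beginning); pad_or_truncate is a hypothetical helper, not the Keras function, and it returns a list rather than an array.

```python
def pad_or_truncate(sequence, maxlen, value=0):
    """Return a copy of sequence with exactly maxlen elements."""
    if len(sequence) >= maxlen:
        return sequence[-maxlen:]   # keep the last maxlen elements
    padding = [value] * (maxlen - len(sequence))
    return padding + sequence       # 0s go in front of the original values

short_review = [5, 6, 7]
long_review = [1, 2, 3, 4, 5, 6, 7]

padded = pad_or_truncate(short_review, maxlen=5)
truncated = pad_or_truncate(long_review, maxlen=5)
print(padded, truncated)
```

After this step every review has the same length, which is what allows the samples to be stored in a two-dimensional array.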
Creating the Neural Network
16.9 Q10: Which of the following statements a), b) or c) is false?
a. We’ve used one-hot encoding to convert the MNIST dataset’s integer labels
into categorical data. The result for each label was a vector in which all but one
element was 0. We could also do that for the index values that represent our
words.
b. For our IMDb example that processes 10,000 unique words, we’d need a
10,000-by-10,000 array to represent all the words. That’s 100,000,000
elements, and almost all the array elements would be 0. This is not an efficient
way to encode the data.
c. If we were to process all 88,000+ unique words in the IMDb dataset, we’d
need an array of nearly eight billion elements.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
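The array sizes quoted in this question follow from simple arithmetic, sketched below. The function name is hypothetical, and 88,000 is used as a round stand-in for the "88,000+" unique words mentioned in Part (c).

```python
def one_hot_elements(vocabulary_size):
    """Elements needed for one one-hot vector per word in the vocabulary."""
    return vocabulary_size * vocabulary_size

elements_10k = one_hot_elements(10_000)
elements_88k = one_hot_elements(88_000)

# In each one-hot vector, every element except one is 0
nonzero_fraction = 1 / 10_000

print(f'{elements_10k:,}')
print(f'{elements_88k:,}')
```

Only one element per row is nonzero, so almost all of that storage would hold zeros.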
16.9 Q11: Which of the following statements a), b) or c) is false?
a. To reduce ambiguity, RNNs that process text sequences typically begin with an embedding layer that encodes each word in a compact dense-vector representation.
b. The vectors produced by the embedding layer also capture the word’s context, that is, how a given word relates to the words around it.
c. An embedding layer enables the RNN to learn word relationships among the training data.
d. All of the above statements are true.
Answer: a. To reduce ambiguity, RNNs that process text sequences typically begin with an embedding layer that encodes each word in a more compact dense-vector representation. Actually, to reduce dimensionality, RNNs that process text sequences typically begin with an embedding layer that encodes each word in a more compact dense-vector representation.
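Conceptually, an embedding layer acts like a lookup table that maps each word index to a short dense vector. The sketch below shows only that lookup step in plain Python; it is not the Keras Embedding layer. The table sizes are hypothetical and kept small, and the vectors are random here, whereas a real layer learns them during training.

```python
import random

random.seed(0)
vocabulary_size = 1_000  # hypothetical; kept small so the sketch runs quickly
embedding_dim = 8        # hypothetical dense-vector length

# One row of floats per word index; a trained layer would learn these values.
embedding_table = [
    [random.uniform(-0.05, 0.05) for _ in range(embedding_dim)]
    for _ in range(vocabulary_size)
]


def embed(word_indexes):
    """Replace each word index with its dense vector from the table."""
    return [embedding_table[i] for i in word_indexes]


review = [0, 0, 14, 22, 16]  # made-up padded sequence of word indexes
vectors = embed(review)
print(len(vectors), len(vectors[0]))  # 5 8
```

Each word is now described by 8 numbers instead of a one-hot vector with one element per vocabulary word.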
16.9 Q12: Which of the following are popular predefined word embeddings?
a. GloVe
b. Word2Vec
c. a) and b)
d. None of the above.
Answer: c. a) and b)
16.9 Q13: Which of the following statements is false?
a. The following snippet adds an LSTM layer to an RNN named rnn:
rnn.add(LSTM(units=128, dropout=0.2, recurrent_dropout=0.2))
b) The units argument in Part (a)’s snippet specifies the number of neurons in the layer. The more neurons, the more the network can remember. As a guideline, you can start with a value between the length of the sequences you’re processing and the number of classes you’re trying to predict.
c) The dropout argument specifies the percentage of neurons to randomly disable when processing the layer’s input and output. Like the pooling layers in our convnet, dropout is a proven technique that reduces underfitting. Keras provides a Dropout layer that you can add to your models.
d) The recurrent_dropout argument specifies the percentage of neurons to randomly disable when the layer’s output is fed back into the layer again to allow the network to learn from what it has seen previously.
Answer: c) The dropout argument specifies the percentage of neurons to randomly disable when processing the layer’s input and output. Like the pooling layers in our convnet, dropout is a proven technique that reduces underfitting. Keras provides a Dropout layer that you can add to your models. Actually, the dropout argument specifies the percentage of neurons to randomly disable when processing the layer’s input and output. Like the pooling layers in our convnet, dropout is a proven technique that reduces overfitting. Keras provides a Dropout layer that you can add to your models.
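To make "randomly disable" concrete, here is a minimal plain-Python sketch of dropout applied to a list of activations during training. The dropout function below is a hypothetical helper, not the Keras layer; it assumes the common "inverted dropout" convention, in which the surviving values are scaled by 1 / (1 - rate) so their expected sum stays the same.

```python
import random


def dropout(values, rate, rng):
    """Zero each value with probability rate; scale survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError('rate must be in the range [0.0, 1.0)')
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


rng = random.Random(42)
activations = [1.0] * 10
dropped = dropout(activations, rate=0.2, rng=rng)
print(dropped)  # each item is either 0.0 (disabled) or about 1.25 (kept, rescaled)
```

Because a different random subset is disabled on each pass, the network cannot rely too heavily on any single neuron, which is how dropout helps reduce overfitting.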
16.9 Q14: The ________ activation function, which is preferred for binary classification, reduces arbitrary values into the range 0.0–1.0, producing a probability.
a. softmax
b. relu
c. sigmoid
d. softplus
Answer: c. sigmoid
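The sigmoid function is 1 / (1 + e^(-x)). A minimal plain-Python sketch follows; the two-branch form is an implementation choice made here to avoid overflow for large negative inputs, and it is not taken from the source.

```python
import math


def sigmoid(x):
    """Squash any real number into the range 0.0-1.0."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe for large negative x: underflows toward 0.0
    return z / (1.0 + z)


for x in (-5.0, 0.0, 5.0):
    print(x, round(sigmoid(x), 4))
# -5.0 0.0067
# 0.0 0.5
# 5.0 0.9933
```

In binary classification the output is read as the probability of the positive class, so values above 0.5 are typically treated as a positive prediction.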
16.9 Q15: With only two possible outputs, we use the ________ loss function.
a. mean_squared_error
b. binary_compression
c. categorical_crossentropy
d. binary_crossentropy
Answer: d. binary_crossentropy
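Binary cross-entropy averages -(y log p + (1 - y) log(1 - p)) over the samples, where y is the true label (0 or 1) and p is the predicted probability. The plain-Python sketch below illustrates that formula only; the function and the eps clipping value are assumptions made for this example, not the Keras implementation.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy for 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() never receives 0
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]
confident_and_right = [0.9, 0.1, 0.8, 0.2]
confident_and_wrong = [0.1, 0.9, 0.2, 0.8]

print(binary_crossentropy(labels, confident_and_right))  # small loss
print(binary_crossentropy(labels, confident_and_wrong))  # much larger loss
```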
Training and Evaluating the Model
16.9 Q16: Keras function ________ returns the loss and accuracy values of a
trained model.
a. assess
b. account
c. grade
d. evaluate
Answer: d. evaluate
Tuning Deep Learning Models
16.10 Q1: Which of the following are variables that affect model performance?
a. having more or less data to train with, having more or less to test with
b. having more or less to validate with, having more or fewer layers
c. the types of layers you use, the order of the layers
d. All of the above
Answer: d. All of the above.
16.10 Q2: Which of the following statements is false?
a. The compute time required to train models multiple times is significant so, in deep learning, you generally tune hyperparameters with techniques like k-fold cross-validation and grid search.
b. There are various tuning techniques, but one particularly promising area is automated machine learning (AutoML).
c. Auto-Keras (https://autokeras.com/) is geared to automatically choosing the best configurations for your Keras models.
d. Google’s Cloud AutoML and Baidu’s EZDL are among various other automated machine learning efforts.
Answer: a. The compute time required to train models multiple times is significant so, in deep learning, you generally tune hyperparameters with techniques like k-fold cross-validation and grid search. Actually, the compute time required to train models multiple times is significant so, in deep learning, you generally do not tune hyperparameters with techniques like k-fold cross-validation or grid search.
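A rough count shows why exhaustive tuning gets expensive: grid search trains one model per hyperparameter combination, and k-fold cross-validation multiplies that by k. The grid values and the fold count below are hypothetical, chosen only to illustrate the arithmetic.

```python
from itertools import product

# Hypothetical hyperparameter grid for a small RNN.
grid = {
    'units': [64, 128, 256],
    'dropout': [0.1, 0.2, 0.3],
    'batch_size': [32, 64],
}
k_folds = 10  # hypothetical number of cross-validation folds

combinations = list(product(*grid.values()))
training_runs = len(combinations) * k_folds

print(len(combinations), training_runs)  # 18 180
```

Even this small grid calls for 180 complete training runs, and each run of a deep-learning model can itself take a long time.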
Convnet Models Pretrained on ImageNet
16.11 Q1: Moving the weights learned by a deep-learning model for a similar problem into a new model is called ________ learning.
a. assignment
b. transfer
c. relegation
d. relocation
Answer: b. transfer
16.11 Q2: Which of the following statements is false?
a. ImageNet is limited in size, so it can be trained efficiently on most computers.
b. You can reuse just the architecture of each model and train it with new data, or you can reuse the pretrained weights.
c. ImageNet now has a continuously running challenge on the Kaggle competition site called the ImageNet Object Localization Challenge. The goal is to identify “all objects within an image, so those images can then be classified and annotated.”
d. There’s no obvious optimal solution for many machine learning and deep learning tasks.
Answer: a. ImageNet is limited in size, so it can be trained efficiently on most computers. Actually, ImageNet is too big for efficient training on most
computers, so most people interested in using it start with one of the smaller
pretrained models.
16.11 Q3: Which of the following statements is false?
a. On Kaggle, companies and organizations fund competitions where they encourage people worldwide to develop better-performing solutions than they’ve been able to do for something that’s important to their business or organization.
b. Sometimes companies offer prize money, which has been as high as $1,000,000 on the famous Netflix competition.
c. Netflix wanted to get a 100% or better improvement in their model for determining whether people will like a movie, based on how they rated previous ones. They used the results to help make better recommendations to members.
d. Even if you do not win a Kaggle competition, it’s a great way to get experience working on challenging problems of current interest.
Answer: c. Netflix wanted to get a 100% or better improvement in their model for determining whether people will like a movie, based on how they rated previous ones. They used the results to help make better recommendations to members. Actually, Netflix wanted to get a 10% or better improvement in their model for determining whether people will like a movie, based on how they rated previous ones. They used the results to help make better recommendations to members.
Reinforcement Learning
16.12 Q1: Which of the following statements is false?
a. Reinforcement learning is a form of machine learning in which algorithms learn from their environment, similar to how humans learn, for example, a video game enthusiast learning a new game, or a baby learning to walk or recognize its parents.
b. Reinforcement learning implements an agent that learns by trying to perform a task, receiving feedback about success or failure, making adjustments then trying again. The goal is to minimize the loss function.
c. The agent receives a positive reward for doing a right thing and a negative reward (that is, a punishment) for doing a wrong thing.
d. The agent uses this information to determine the next action to perform and must try to maximize the reward.
Answer: b. Reinforcement learning implements an agent that learns by trying to perform a task, receiving feedback about success or failure, making adjustments then trying again. The goal is to minimize the loss function. Actually, the algorithm implements an agent that learns by trying to perform a task, receiving feedback about success or failure, making adjustments then trying again. The goal is to maximize the reward.
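The try, get feedback, adjust, try again loop can be illustrated with a very small agent. The sketch below is a simplification chosen for this example, not a method described in the source: it uses an epsilon-greedy strategy on a made-up task with three possible actions, where each action succeeds with a hypothetical probability and earns +1 on success or -1 on failure.

```python
import random


def run_agent(success_probabilities, steps=5_000, epsilon=0.1, seed=0):
    """Learn by trial and error which action earns the most reward on average."""
    rng = random.Random(seed)
    n_actions = len(success_probabilities)
    counts = [0] * n_actions       # how often each action has been tried
    estimates = [0.0] * n_actions  # running average reward per action
    total_reward = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: try a random action
        else:
            # Exploit: choose the action that currently looks best.
            action = max(range(n_actions), key=lambda a: estimates[a])
        # Feedback: positive reward for success, negative reward for failure.
        reward = 1 if rng.random() < success_probabilities[action] else -1
        # Adjust: move the running average toward the observed reward.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
        total_reward += reward
    return estimates, total_reward


estimates, total_reward = run_agent([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, total_reward)
```

Nothing in the loop minimizes a loss function; the agent simply shifts toward the actions that have earned the highest reward so far.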