TB#3

Chapter 16, Deep Learning

1 Introduction

16.1 Q1: Which of the following statements is false?
a. Keras offers a friendly interface to Google's TensorFlow—the most widely used deep-learning library.
b. François Chollet of the Google Mind team developed Keras to make deep-learning capabilities more accessible.
c. Keras enables you to define deep-learning models conveniently with one statement.
d. Google has thousands of TensorFlow and Keras projects underway internally, and that number is growing quickly.
Answer: c. Keras enables you to define deep-learning models conveniently with one statement. Actually, deep learning models require more sophisticated setups than scikit-learn machine learning models, typically using several statements to connect multiple objects, called layers.

16.1 Q2: Which of the following statements is false?
a. Keras is to deep learning as Scikit-learn is to machine learning.
b. Deep learning models are complex and require an extensive mathematical background to understand their inner workings.
c. Both Keras and Scikit-learn encapsulate the sophisticated mathematics of their models, so developers need only define, parameterize and manipulate objects.
d. With Keras, you build your models primarily from custom components you develop to meet your unique requirements.
Answer: d. With Keras, you build your models primarily from custom components you develop to meet your unique requirements. Actually, with Keras, you build your models from pre-existing components and quickly parameterize those components to your unique requirements.

16.1 Q3: Which of the following statements is false?
a. Keras facilitates experimenting with many deep-learning models and tweaking them in various ways until you find the models that perform best for your applications.
b. Deep learning works well only when you have lots of data.
c. Transfer learning uses existing knowledge from a previously trained model as the foundation for a new model.
d. Data augmentation adds data to a dataset by deriving new data from existing data. For example, in an image dataset, you might rotate the images left and right so the model can learn about objects in different orientations.
Answer: b. Deep learning works well only when you have lots of data. Actually, deep learning works well when you have lots of data, but it also can be effective for smaller datasets, especially when combined with techniques like transfer learning and data augmentation.

16.1 Q4: Which of the following statements a), b) or c) is false?
a. Deep learning can require significant processing power.
b. Complex models trained on big-data datasets can take hours, days or even more to train.
c. Special high-performance hardware called GPUs (Graphics Processing Units) and TPUs (Tensor Processing Units) was developed by NVIDIA and Google, respectively, to meet the extraordinary processing demands of edge-of-the-practice deep-learning applications.
d. All of the above statements are true.
Answer: d. All of the above statements are true.

16.1 Q5: ________ neural networks are especially appropriate for computer vision tasks, such as recognizing handwritten digits and characters or recognizing objects (including faces) in images and videos.
a. LSTM
b. Recurrent
c. Convolutional
d. None of the above
Answer: c. Convolutional

16.1 Q6: Which of the following is not an automated deep-learning capability?
a. Auto-Keras from Texas A&M University's DATA Lab
b. Baidu's EZDL
c. Google's AutoML
d. Scikit-learn
Answer: d. Scikit-learn

1 Deep Learning Applications

16.1 Q7: Which of the following are popular deep learning applications?
a. Game playing, computer vision, self-driving cars, robotics, improving customer experiences and chatbots
b. Diagnosing medical conditions, Google Search, facial recognition, automated image captioning, video closed captioning, enhancing image resolution
c. Speech recognition, language translation, predicting election results, predicting earthquakes and weather
d. All of the above
Answer: d. All of the above
1 Deep Learning Demos

16.1 Q8: Which of the following deep-learning demos translates a line drawing into a picture?
a. DeepArt.io
b. DeepWarp Demo
c. Image-to-Image Demo
d. Google Translate Mobile App
Answer: c. Image-to-Image Demo

1 Keras Resources

16.1 Q9: If you're looking for term projects, directed study projects, capstone course projects or thesis topics, visit ________. People post their research papers here in parallel with going through peer review for formal publication, hoping for fast feedback. So, this site gives you access to extremely current research.
a. https://kerasteam.slack.com
b. https://blog.keras.io
c. http://keras.io
d. https://arXiv.org
Answer: d. https://arXiv.org

1 Keras Built-In Datasets

16.2 Q1: Which of the following Keras datasets for practicing deep learning is used for sentiment analysis?
a. MNIST
b. Fashion-MNIST
c. IMDb
d. CIFAR10
e. CIFAR100
Answer: c. IMDb
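The datasets above all load the same way. A minimal sketch (not part of the test bank) showing how two of them can be pulled in through tensorflow.keras.datasets; the num_words value is an illustrative choice:

from tensorflow.keras.datasets import imdb, mnist

# MNIST: 28-by-28 grayscale digit images with integer labels 0-9
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# IMDb: integer-encoded movie reviews with 0/1 sentiment labels;
# keep only the 10,000 most frequent words (illustrative value)
(rev_train, sent_train), (rev_test, sent_test) = imdb.load_data(num_words=10000)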
1 Custom Anaconda Environments

16.3 Q1: Which of the following statements about Anaconda environments is false?
a. The Anaconda Python distribution makes it easy to create custom environments.
b. These are separate configurations in which you can install different libraries and different library versions. This can help with reproducibility if your code depends on specific Python or library versions.
c. The default environment in Anaconda is called the root environment. This is created for you when you install Anaconda. All the Python libraries that come with Anaconda are installed into the root environment and, unless you specify otherwise, any additional libraries you install also are placed there.
d. Custom environments give you control over the specific libraries you wish to install for your specific tasks.
Answer: c. The default environment in Anaconda is called the root environment. This is created for you when you install Anaconda. All the Python libraries that come with Anaconda are installed into the root environment and, unless you specify otherwise, any additional libraries you install also are placed there. Actually, the default environment in Anaconda is called the base environment.

16.3 Q2: Which of the following statements a), b) or c) is false?
a. To use a custom Anaconda environment named tf_env, execute the following command, which affects only the current Terminal, shell or Anaconda Command Prompt:
conda activate tf_env
b. When a custom environment is activated and you install more libraries, they become part of the activated environment, not the base environment.
c. If you open separate Terminals, shells or Anaconda Command Prompts, they'll use Anaconda's base environment by default.
d. All of the above statements are true.
Answer: d. All of the above statements are true.

1 Neural Networks

16.4 Q1: Which of the following statements a), b) or c) is false?
a. Deep learning is a form of machine learning that uses artificial neural networks to learn.
b. An artificial neural network is a software construct that operates similarly to how scientists believe our brains work.
c. Our biological nervous systems are controlled via neurons that communicate with one another along pathways called synapses. As we learn, the specific neurons that enable us to perform a given task, like walking, communicate with one another more efficiently.
d. All of the above statements are true.
Answer: d. All of the above statements are true.

16.4 Q2: In supervised deep learning, we aim to predict the ________ labels supplied with data samples.
a. goal
b. mark
c. object
d. target
Chapter 16, Deep Learning 5 Answer: d. target 16.4 Q3: The following diagram shows a three- layer neural network. Each circle represents a neuron, and the lines between them simulate the synapses. The output of a neuron becomes the input of another neuron, hence the term neural network: This particular diagram shows a ________. a. partially connected network b. crossover network c. fully connected network d. None of the above Answer: c. fully connected network 16.4 Q4: Which of the following statements a), b) or c) about neural networks is false ? a. During the training phase, the network calculates values called weights for every connection between the neurons in one layer and those in the next. b. On a neuron-by-neuron basis, each of its inputs is multiplied by that connection’s weight, then the maximum of those weighted inputs is passed to the neuron’s activation function. c. The activation function’s output determines which neurons to activate based on the inputs—just like the neurons in your brain passing information around in response to inputs coming from your eyes, nose, ears and more. d. All of the above statements are true . Answer: b. On a neuron-by-neuron basis, each of its inputs is multiplied by that connection’s weight, then the maximum of those weighted inputs is passed to the neuron’s activation function. Actually, on a neuron-by-neuron basis, each of its inputs is multiplied by that connection’s weight, then the sum of those weighted inputs is passed to the neuron’s activation function. 16.4 Q5: Which of the following statements about the diagram below is false ?
a. The diagram shows a neuron receiving three inputs (the black dots) and producing an output (the hollow circle) that would be passed to all or some of the neurons in the next layer, depending on the types of the neural network's layers.
b. The values w1, w2 and w3 are weights.
c. In a new model that you train from scratch, these values are initialized to zero by the model.
d. As the network trains, it tries to minimize the error rate between the network's predicted labels and the samples' actual labels.
Answer: c. In a new model that you train from scratch, these values are initialized to zero by the model. Actually, in a new model that you train from scratch, these values are initialized randomly by the model.

16.4 Q6: Which of the following statements a), b) or c) is false?
a. The error rate is known as the loss, and the calculation that determines the loss is called the loss function.
b. Throughout training, the network determines the amount that each neuron contributes to the overall loss, then goes back through the layers and adjusts the weights in an effort to minimize that loss.
c. The technique mentioned in Part (b) is called backpropagation. Optimizing these weights occurs gradually—typically via a process called gradient descent.
d. All of the above statements are true.
Answer: d. All of the above statements are true.

1 Tensors

16.5 Q1: Which of the following statements is false?
a. Deep learning frameworks generally manipulate data in the form of tensors.
b. A tensor is basically a one-dimensional array.
c. Frameworks like TensorFlow pack all your data into one or more tensors, which they use to perform the mathematical calculations that enable neural networks to learn.
Chapter 16, Deep Learning 7 d. These tensors can become quite large as the number of dimensions increases and as the richness of the data increases (for example, images, audios and videos are richer than text). Answer: b. A tensor is basically a one-dimensional array. Actually, A tensor is basically a multidimensional array . 16.5 Q2: Chollet discusses the types of tensors typically encountered in deep learning: A 0D (0-dimensional) tensor is one value and is known as a scalar. A 1D tensor is similar to a one-dimensional array and is known as a vector. A 1D tensor might represent a sequence, such as hourly temperature readings from a sensor or the words of one movie review. A 2D tensor is similar to a two- dimensional array and is known as a matrix. A 2D tensor could represent a grayscale image in which the tensor’s two dimensions are the image’s width and height in pixels, and the value in each element is the intensity of that pixel. Which of the following statements a), b) or c) about additional types of tensors is false ? a. A 3D tensor is similar to a three-dimensional array and could be used to represent a color image. The first two dimensions would represent the width and height of the image in pixels and the depth at each location might represent the red, green and blue (RGB) components of a given pixel’s color. A 3D tensor also could represent a collection of 2D tensors containing grayscale images. b. A 4D tensor could be used to represent a collection of color images in 3D tensors. It also could be used to represent one video. Each frame in a video is essentially a color image. c. A 5D tensor could be used to represent a collection of 4D tensors containing videos. d. All of the above statements are true . Answer: d. All of the above statements are true . 16.5 Q3: A tensor’s ________ typically is represented in Python as a tuple of values in which the number of elements specifies the tensor’s number of dimensions and each value in the tuple specifies the size of the tensor’s corresponding dimension. a. shape b. frame c. pattern d. format Answer: a. shape. 16.5 Q4: Which of the following statements a), b) or c) is false ? a. Powerful processors are needed for real-world deep learning because the size of tensors can be enormous and large-tensor operations can place crushing demands on processors.
8 Chapter 16, Deep Learning b. NVIDIA GPUs (Graphics Processing Units)—originally developed for computer gaming—are optimized for the mathematical matrix operations typically performed on tensors, an essential aspect of how deep learning works “under the hood.” c. Recognizing that deep learning is crucial to its future, Google developed TPUs (Tensor Processing Units). Google now uses TPUs in its Cloud TPU service, which can perform quadrillions of floating-point operations per second. d. All of the above statements are true . Answer: d. All of the above statements are true . 1 Convolutional Neural Networks for Vision; Multi-Classification with the MNIST Dataset 16.6 Q1: Which of the following statements a), b) or c) is false ? a. In the “Machine Learning” chapter, we classified handwritten digits using the 8-by-8-pixel, low-resolution images from the Digits dataset bundled with Scikit- learn. b. The Digits dataset is based on a subset of the higher-resolution MNIST handwritten digits dataset. c. Recurrent neural networks are common in computer-vision applications, such as recognizing handwritten digits and characters, and recognizing objects in images and video. d. All of the above statements are true . Answer: c. Recurrent neural networks are common in computer-vision applications, such as recognizing handwritten digits and characters, and recognizing objects in images and video. Actually, convolutional neural networks are common in computer-vision applications, such as recognizing handwritten digits and characters, and recognizing objects in images and video. 16.6 Q2: Which of the following statements a), b) or c) is false ? a. Reproducibility is crucial in scientific studies. b. In deep learning, reproducibility is more difficult because the libraries sequentialize operations that perform floating-point calculations. c. Getting reproducible results in Keras requires a combination of environment settings and code settings that are described in the Keras FAQ. d. All of the above statements are true . Answer: b. In deep learning, reproducibility is more difficult because the libraries sequentialize operations that perform floating-point calculations. Actually, in deep learning, reproducibility is more difficult because the libraries heavily parallelize operations that perform floating-point
calculations. Each time operations execute, they may execute in a different order. This can produce differences in your results.

16.6 Q3: Which of the following statements about Keras neural network components is false?
a. The network is a sequence of layers containing the neurons used to learn from the samples. Each layer's neurons receive inputs, process them via an optimizer function, and produce outputs.
c. The data is fed into the network via an input layer that specifies the dimensions of the sample data.
d. The input layer is followed by hidden layers of neurons that implement the learning and an output layer that produces the predictions. The more layers you stack, the deeper the network is (hence the term deep learning).
Answer: a. The network is a sequence of layers containing the neurons used to learn from the samples. Each layer's neurons receive inputs, process them via an optimizer function, and produce outputs. Actually, the network is a sequence of layers containing the neurons used to learn from the samples. Each layer's neurons receive inputs, process them via an activation function, and produce outputs.

16.6 Q4: A ________ function produces a measure of how well a neural network predicts the target values.
a. activation
b. optimizer
c. loss
d. None of the above
Answer: c. loss

1 Loading the MNIST Dataset

16.6 Q5: Which of the following statements is false?
a. The following code imports the tensorflow.keras.datasets.mnist module so we can load the dataset:
from tensorflow.keras.datasets import mnist
b. When we use the version of Keras built into TensorFlow, the Keras module names begin with "tensorflow.".
c. TensorFlow uses Keras to execute the deep-learning models.
d. The mnist module's load_data function loads the MNIST training and testing sets:
(X_train, y_train), (X_test, y_test) = mnist.load_data()
When you call load_data, it will download the MNIST data to your system. The function returns a tuple of two elements containing the training and testing sets. Each element is itself a tuple containing the samples and labels, respectively.
Answer: c. TensorFlow uses Keras to execute the deep-learning models. Actually, Keras uses TensorFlow to execute the deep-learning models.

1 Data Exploration

16.6 Q6: Which of the following statements a), b) or c) is false?
a. You should always get to know the data before working with it.
b. The following snippets check the dimensions of the MNIST training set images (X_train), training set labels (y_train), testing set images (X_test) and testing set labels (y_test):
[3]: X_train.shape
[3]: (60000, 28, 28)
[4]: y_train.shape
[4]: (60000,)
[5]: X_test.shape
[5]: (10000, 28, 28)
[6]: y_test.shape
[6]: (10000,)
c. You can see from X_train's and X_test's shapes that the MNIST images are the same resolution as those in Scikit-learn's Digits dataset.
d. All of the above statements are true.
Answer: c. You can see from X_train's and X_test's shapes that the MNIST images are the same resolution as those in Scikit-learn's Digits dataset. Actually, you can see from X_train's and X_test's shapes that the images are higher resolution than those in Scikit-learn's Digits dataset (which are 8-by-8).

16.6 Q7: The IPython magic ________ indicates that Matplotlib-based graphics should be displayed in a Jupyter notebook rather than in separate windows.
a. %matplotlib notebook
b. %matplotlib inline
c. %matplotlib Jupyter
d. None of the above
Answer: b. %matplotlib inline
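The snippets above appear as separate notebook cells; a minimal stand-alone sketch that combines them (assuming TensorFlow 2.x is installed, so Keras is available as tensorflow.keras):

from tensorflow.keras.datasets import mnist

# load_data downloads MNIST on the first call and caches it locally
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# confirm the dimensions discussed in Q6
print(X_train.shape)  # (60000, 28, 28)
print(y_train.shape)  # (60000,)
print(X_test.shape)   # (10000, 28, 28)
print(y_test.shape)   # (10000,)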
16.6 Q8: Consider the following code:
import numpy as np
index = np.random.choice(np.arange(len(X_train)), 24, replace=False)
figure, axes = plt.subplots(nrows=4, ncols=6, figsize=(16, 9))
for item in zip(axes.ravel(), X_train[index], y_train[index]):
    axes, image, target = item
    axes.imshow(image, cmap=plt.cm.gray_r)
    axes.set_xticks([])  # remove x-axis tick marks
    axes.set_yticks([])  # remove y-axis tick marks
    axes.set_title(target)
plt.tight_layout()
Which of the following statements a), b) or c) is false?
a. NumPy's choice function (from the numpy.random module) selects the number of elements specified in its second argument from the front of the array of values in its first.
b. The choice function returns an array containing the selected values, which we store in index.
c. The expressions X_train[index] and y_train[index] use index to get the corresponding elements from both arrays.
d. All of the above statements are true.
Answer: a. NumPy's choice function (from the numpy.random module) selects the number of elements specified in its second argument from the front of the array of values in its first. Actually, NumPy's choice function (from the numpy.random module) randomly selects the number of elements specified in its second argument from the array of values in its first argument.

1 Data Preparation

16.6 Q9: Which of the following statements a), b) or c) is false?
a. Scikit-learn's bundled datasets were preprocessed into the shapes its models require.
b. In real-world studies, you'll generally have to do some or all of the data preparation.
c. The MNIST dataset requires some preparation for use in a Keras convnet.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
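Returning to the choice call in Q8's code, a quick check (with illustrative values, not from the test bank) that it samples randomly and without repetition when replace=False:

import numpy as np

values = np.arange(10)
sample = np.random.choice(values, 5, replace=False)
print(sample)            # e.g., [7 2 9 0 4] -- order and contents vary per run
print(len(set(sample)))  # 5 -- no duplicates because replace=False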
16.6 Q10: Which of the following statements is false?
a. Keras convnets require NumPy array inputs in which each sample has the shape: (width, height, channels)
b. For MNIST, each image's width and height are 28 pixels, and each pixel has one channel (the grayscale shade of the pixel from 0 to 255), so each sample's shape will be: (28, 28, 1)
c. Full-color images with RGB (red/green/blue) values for each pixel would have three channels—one channel each for the red, green and blue components of a color.
d. As the neural network learns from the images, it reduces the number of channels.
Answer: d. As the neural network learns from the images, it reduces the number of channels. Actually, as the neural network learns from the images, it creates many more channels.

16.6 Q11: Which of the following statements a), b) or c) is false?
a. Numeric features in data samples may have value ranges that vary widely. Deep learning networks perform better on data that is scaled either into the range 0.0 to 1.0, or to a range for which the data's mean is 1.0 and its standard deviation is 0.0. Getting your data into one of these forms is known as normalization.
b. In MNIST, each pixel is an integer in the range 0–255.
c. The following code converts the values to 32-bit (4-byte) floating-point numbers using the NumPy array method astype, then divides every element in the resulting array by 255, producing normalized values in the range 0.0–1.0:
[16]: X_train = X_train.astype('float32') / 255
[17]: X_test = X_test.astype('float32') / 255
d. All of the above statements are true.
Answer: a. Numeric features in data samples may have value ranges that vary widely. Deep learning networks perform better on data that is scaled either into the range 0.0 to 1.0, or to a range for which the data's mean is 1.0 and its standard deviation is 0.0. Getting your data into one of these forms is known as normalization. Actually, deep learning networks perform better on data that is scaled either into the range 0.0 to 1.0, or to a range for which the data's mean is 0.0 and its standard deviation is 1.0.

16.6 Q12: The MNIST convnet's prediction for each MNIST digit will be an array of 10 probabilities, indicating the likelihood that the digit belongs to a particular one of the classes 0 through 9. When we evaluate the model's accuracy, Keras compares the model's predictions to the labels. To do that, Keras requires both to have the same ________.
a. profile
Chapter 16, Deep Learning 3 b. aspect c. shape d. frame Answer: c. shape 16.6 Q13: Which of the following statements a), b) or c) is false? a. One-hot encoding, which converts data into arrays of 1.0s and 0.0s in which only one element is 1.0 and the rest are 0.0s. b. For MNIST targets, the one-hot-encoded values will be 10 x 10 arrays representing the categories 0 through 9. c. We know precisely which category each digit belongs to, so the categorical representation of a digit label will consist of a 1.0 at that digit’s index and 0.0s for all the other elements. d. All of the above statements are true . Answer: b. For MNIST targets, the one-hot-encoded values will be 10 x 10 arrays representing the categories 0 through 9. Actually, the one-hot encoded values will be one-dimensional 10-element arrays. 16.6 Q14: Which of the following statements a), b) or c) is false ? a. The one-hot encoded representation of the digit 7 is: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0] b. The tensorflow.keras.utils module provides function to_categorical to perform one-hot encoding. c. The function counts the unique categories then, for each item being encoded, creates an array of that length with a 1.0 in the correct position. d. All of the above statements are true . Answer: a. The one-hot encoded representation of the digit 7 is: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0] Actually, the one-hot encoded representation of the digit 7 is: [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0] . 1 Creating the Neural Network 16.6 Q15: Which of the following statements a), b) or c) is false ? a. The following code begins configuring a convolutional neural network with a Keras Sequential model from the tensorflow.keras.models module: [24]: from tensorflow.keras.models import Sequential [25]: cnn = Sequential() b. The network resulting from the code in Part (a) will execute its layers sequentially—the output of one layer becomes the input to the next.
4 Chapter 16, Deep Learning c. Networks that operate as described in Part (b) are called feed-forward networks. All neural networks operate this way. d. All of the above statements are true . Answer: c. Networks that operate as described in Part (b) are called feed- forward networks. All neural networks operate this way. Actually, not all neural networks are feed forward—we also discuss recurrent neural networks . 16.6 Q16: A typical convolutional neural network consists of several layers—an input layer that receives the training samples, ________ layers that learn from the samples and an output layer that produces the prediction probabilities. a. intermediate b. study c. training d. hidden Answer: d. hidden 16.6 Q17: Which of the following statements a), b) or c) is false ? a. The following code imports from the tensorflow.keras.layers module popular layer classes we can use to construct a basic convnet: from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPooling2D b. We can begin our network with a convolution layer, which uses the relationships between pixels that are close to one another to learn useful features (or patterns) in large areas of each sample. c. The areas that convolution learns from are called kernels or patches. d. All of the above statements are true . Answer: b. We can begin our network with a convolution layer, which uses the relationships between pixels that are close to one another to learn useful features (or patterns) in large areas of each sample. Actually, we can begin our network with a convolution layer, which uses the relationships between pixels that are close to one another to learn useful features (or patterns) in small areas of each sample. 16.6 Q18: Consider the following diagram in which the 3-by-3 shaded square represents the kernel in a convolution layer:
Which of the following statements a), b) or c) is false?
a. The areas that convolution learns from are called kernels or patches.
b. When the kernel reaches the right edge, the convolution layer moves the kernel down three pixels and repeats this left-to-right process.
c. Kernels typically are 3-by-3, though larger convnets can be used for higher-resolution images.
d. All of the above statements are true.
Answer: b. When the kernel reaches the right edge, the convolution layer moves the kernel down three pixels and repeats this left-to-right process. Actually, when the kernel reaches the right edge, the convolution layer moves the kernel one pixel down and repeats this left-to-right process.

16.6 Q19: Which of the following statements a), b) or c) about convolution is false?
a. Kernel-size is a tunable hyperparameter.
b. For each kernel position, the convolution layer performs mathematical calculations using the kernel features to "learn" about them, then outputs one new feature to the layer's output.
c. By looking at features near one another, the network begins to recognize features like edges, straight lines and curves.
d. All of the above statements are true.
Answer: d. All of the above statements are true.

16.6 Q20: Which of the following statements is false?
a. The number of filters depends on the image dimensions—higher-resolution images have more features, so they require more filters.
b. If you study the code the Keras team used to produce their pretrained convnets, you'll find that they used 64, 128 or even 256 filters in their first convolutional layers. Based on their convnets and the fact that the MNIST images are small, we used 64 filters in our first convolutional layer.
c. The set of filters produced by a convolution layer is called a feature map. Subsequent convolution layers combine features from previous feature maps to recognize larger features and so on. If we were doing facial recognition, early layers might recognize lines, edges and curves, and subsequent layers might begin combining those into larger features like eyes, eyebrows, noses, ears and mouths.
d. Once the network learns a feature, because of convolution, it no longer needs to recognize that feature elsewhere in the image.
Answer: d. Once the network learns a feature, because of convolution, it no longer needs to recognize that feature elsewhere in the image. Actually, once the network learns a feature, because of convolution, it can recognize that feature anywhere in the image—this is one of the reasons that convnets are used for object recognition in images.

16.6 Q21: Which of the following statements is false?
a. The following code adds a Conv2D convolution layer to a model named cnn:
cnn.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
b. The Conv2D layer in Part (a) is configured with the following arguments:
filters=64—The number of filters in the resulting feature map.
kernel_size=(3, 3)—The size of the kernel used in each filter.
activation='relu'—The 'relu' (Rectified Linear Unit) activation function is used to produce this layer's output. 'relu' is the most widely used activation function in today's deep learning networks and is good for performance because it's easy to calculate. It's commonly recommended for convolutional layers.
c. Assuming the Conv2D layer in Part (a) is the first layer of the network, we also pass the input_shape=(28, 28, 1) argument to specify the shape of each sample. This automatically creates an input layer to load the samples and pass them into the Conv2D layer, which is actually the first hidden layer.
d. In Keras, for each subsequent layer you must explicitly specify the layer's input_shape to match the previous layer's output shape, making it possible to stack layers.
Answer: d. In Keras, for each subsequent layer you must explicitly specify the layer's input_shape to match the previous layer's output shape, making it possible to stack layers. Actually, in Keras, each subsequent layer infers its input_shape from the previous layer's output shape, making it easy to stack layers.

16.6 Q22: Which of the following statements is false?
a. Overfitting can occur when your model is too simple compared to what it is modeling—in the most extreme overfitting case, a model memorizes its training data.
b. When you make predictions with an overfit model, they will be accurate if new data matches the training data, but the model could perform poorly with data it has never seen.
c. Overfitting tends to occur in deep learning as the dimensionality of the layers becomes too large.
d. Some techniques to prevent overfitting include training for fewer epochs, data augmentation, dropout and L1 or L2 regularization.
Answer: a. Overfitting can occur when your model is too simple compared to what it is modeling—in the most extreme overfitting case, a model memorizes its training data. Actually, overfitting can occur when your model is too complex compared to what it is modeling—in the most extreme overfitting case, a model memorizes its training data.

16.6 Q23: Which of the following statements a), b) or c) is false?
a. To reduce overfitting and computation time, a convolution layer is often followed by one or more layers that increase the dimensionality of the convolution layer's output.
b. A pooling layer compresses (or down-samples) the results by discarding features, which helps make the model more general.
c. The most common pooling technique is called max pooling, which examines a 2-by-2 square of features and keeps only the maximum feature.
d. All of the above statements are true.
Answer: a. To reduce overfitting and computation time, a convolution layer is often followed by one or more layers that increase the dimensionality of the convolution layer's output. Actually, to reduce overfitting and computation time, a convolution layer is often followed by one or more layers that reduce the dimensionality of the convolution layer's output.
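Pulling together the pieces covered so far (the Sequential model from Q15 and the Conv2D layer from Q21), a minimal sketch of the start of the convnet; it assumes the import statements shown in those questions:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D

cnn = Sequential()

# first hidden layer: 64 filters, 3-by-3 kernels, relu activation;
# input_shape creates an input layer for 28-by-28, single-channel samples
cnn.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu',
               input_shape=(28, 28, 1)))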
16.6 Q24: Consider the following diagram in which the numeric values in the 6-by-6 square represent the features that we wish to compress and the 2-by-2 blue square in position 1 represents the initial pool of features to examine:
Which of the following statements a), b) or c) is false?
a. The max pooling layer first looks at the pool in position 1 above, then outputs the maximum feature from that pool—9 in our diagram.
b. Unlike convolution, there's no overlap between pools. Once the pool reaches the right edge, the pooling layer moves the pool down by its height—2 rows—then continues from left-to-right. Because of the feature reduction in each group, 2-by-2 pooling compresses the number of features by 50%.
c. The following code adds a MaxPooling2D layer to a model named cnn:
cnn.add(MaxPooling2D(pool_size=(2, 2)))
d. All of the above statements are true.
Answer: b. Unlike convolution, there's no overlap between pools. Once the pool reaches the right edge, the pooling layer moves the pool down by its height—2 rows—then continues from left-to-right. Because of the feature reduction in each group, 2-by-2 pooling compresses the number of features by 50%. Actually, because every group of four features is reduced to one, 2-by-2 pooling compresses the number of features by 75%.

16.6 Q25: Which of the following statements is false?
a. Convnets often have many convolution and pooling layers.
b. The Keras team's convnets tend to double the number of filters in subsequent convolutional layers to enable the model to learn more relationships between the features.
c. The following snippets add a convolution layer with 128 filters, followed by a pooling layer to reduce the dimensionality by 50%:
cnn.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu'))
cnn.add(MaxPooling2D(pool_size=(2, 2)))
d. For odd dimensions like 11-by-11, Keras pooling layers round down by default.
Answer: c. The following snippets add a convolution layer with 128 filters, followed by a pooling layer to reduce the dimensionality by 50%:
cnn.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu'))
cnn.add(MaxPooling2D(pool_size=(2, 2)))
Actually, the pooling layer in the preceding code reduces the dimensionality of the Conv2D layer's output by 75%.

16.6 Q26: A Keras ________ layer reshapes its input to one dimension.
a. Dropout
b. Masking
c. Dense
d. Flatten
Answer: d. Flatten

16.6 Q27: Which of the following statements a), b) or c) is false?
a. Learning the relationships among features and performing classification is accomplished with partially connected Dense layers.
b. The following Dense layer creates 128 neurons (units) that learn from the outputs of the previous layer:
cnn.add(Dense(units=128, activation='relu'))
c. Many convnets contain at least one Dense layer like the one above. Convnets geared to more complex image datasets with higher-resolution images like ImageNet (http://www.image-net.org)—a dataset of over 14 million images—often have several Dense layers, commonly with 4096 neurons.
d. All of the above statements are true.
Answer: a. Learning the relationships among features and performing classification is accomplished with partially connected Dense layers. Actually, learning the relationships among features and performing classification is accomplished with fully connected Dense layers.

16.6 Q28: Consider the following code:
cnn.add(Dense(units=10, activation='softmax'))
Which of the following statements a), b) or c) about our convolutional neural net that recognizes MNIST digits is false?
a. Our final layer in the preceding snippet is a Dense layer that classifies the inputs into neurons representing the classes 0 through 9.
b. The softmax activation function converts the values of these remaining 10 neurons into categorical string labels.
c. The neuron that produces the highest probability represents the prediction for a given digit image.
d. All of the above statements are true.
Answer: b. The softmax activation function converts the values of these remaining 10 neurons into categorical string labels. Actually, the softmax activation function converts the values of these remaining 10 neurons into classification probabilities in the range 0.0–1.0.
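Continuing the sketch started after Q23 with the remaining layers from Q24–Q28 (pooling, a second convolution–pooling pair, Flatten and the two Dense layers):

from tensorflow.keras.layers import Dense, Flatten, MaxPooling2D

cnn.add(MaxPooling2D(pool_size=(2, 2)))    # 2-by-2 max pooling
cnn.add(Conv2D(filters=128, kernel_size=(3, 3), activation='relu'))
cnn.add(MaxPooling2D(pool_size=(2, 2)))
cnn.add(Flatten())                         # reshape the output to one dimension
cnn.add(Dense(units=128, activation='relu'))
cnn.add(Dense(units=10, activation='softmax'))  # 10 classification probabilities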
16.6 Q29: Consider the output of the following snippet:
[34]: cnn.summary()
Layer (type)                 Output Shape              Param #
=================================================================
conv2d_1 (Conv2D)            (None, 26, 26, 64)        640
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 64)        0
conv2d_2 (Conv2D)            (None, 11, 11, 128)       73856
max_pooling2d_2 (MaxPooling2 (None, 5, 5, 128)         0
flatten_1 (Flatten)          (None, 3200)              0
dense_1 (Dense)              (None, 128)               409728
dense_2 (Dense)              (None, 10)                1290
=================================================================
Total params: 485,514
Trainable params: 485,514
Non-trainable params: 0

Which of the following statements is false?
a. A model's summary method shows you the model's layers.
b. The parameters are the weights that the network learns during training. Our relatively small convnet needs to learn nearly 500,000 parameters.
c. In the Output Shape, None simply means that the model does not know in advance how many training samples you're going to provide—this is known only when you start the training.
d. By default, Keras trains only the parameters that most affect prediction accuracy.
Answer: d. By default, Keras trains only the parameters that most affect prediction accuracy. Actually, by default, Keras trains all parameters.

16.6 Q30: Which of the following statements a), b) or c) is false?
a. You can visualize the model summary using the plot_model function from the module tensorflow.keras.utils, as in:
from tensorflow.keras.utils import plot_model
from IPython.display import Image
plot_model(cnn, to_file='convnet.png', show_shapes=True, show_layer_names=True)
b. Module IPython.display's Image class can be used to load an image into a Jupyter Notebook and display the image in the notebook.
c. Keras assigns the layer names in the image:
2 Chapter 16, Deep Learning d. All of the above statements are true . Answer: d. All of the above statements are true . Compiling the Model 16.6 Q31: Once you’ve added all the layers to a Keras neural network, you complete the Keras model by calling its compile method, as in: cnn.compile(optimizer= 'adam' , loss= 'categorical_crossentropy' , metrics=[ 'accuracy' ]) Which of the following statements about the arguments is false ? a. optimizer='adam' specifies the optimizer this model will use to adjust the weights throughout the neural network as it learns. b. There are many optimizers — 'adam' performs well across a wide variety of models. c. loss='categorical_crossentropy' specifies the loss function used by the optimizer in multi-classification networks like our convnet, which predicts 10 classes. As the neural network learns, the optimizer attempts to maximize
the values returned by the loss function. The greater the loss, the better the neural network is at predicting what each image is.
d. metrics=['accuracy']—This is a list of the metrics that the network will produce to help you evaluate the model. We use the accuracy metric to check the percentage of correct predictions.
Answer: c. loss='categorical_crossentropy' specifies the loss function used by the optimizer in multi-classification networks like our convnet, which predicts 10 classes. As the neural network learns, the optimizer attempts to maximize the values returned by the loss function. The greater the loss, the better the neural network is at predicting what each image is. Actually, as the neural network learns, the optimizer attempts to minimize the values returned by the loss function. The lower the loss, the better the neural network is at predicting what each image is.

1 Training and Evaluating the Model

16.6 Q32: You train a Keras model by calling its fit method. Which of the following statements about the fit method is false?
a. As in Scikit-learn, the first two arguments are the training data and the categorical target labels.
b. The iterations argument specifies the number of times the model should process the entire set of training data.
c. batch_size specifies the number of samples to process at a time during each epoch. Most models specify a power of 2 from 32 to 512. Larger batch sizes can decrease model accuracy.
d. In general, some samples should be used to validate the model. If you specify validation data, after each epoch, the model will use it to make predictions and display the validation loss and accuracy. You can study these values to tune your layers and the fit method's hyperparameters, or possibly change the layer composition of your model.
Answer: b. The iterations argument specifies the number of times the model should process the entire set of training data. Actually, the epochs argument specifies the number of times the model should process the entire set of training data.
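A minimal sketch combining the compile call from Q31 with a fit call using the arguments Q32 describes; the epoch count, batch size and validation split are illustrative values, not ones given in the questions, and X_train/y_train are assumed to be the prepared (reshaped, normalized, one-hot encoded) arrays from earlier:

cnn.compile(optimizer='adam',
            loss='categorical_crossentropy',
            metrics=['accuracy'])

# epochs: passes over the full training set; batch_size: samples per weight update;
# validation_split: fraction of the training data held out for validation
cnn.fit(X_train, y_train, epochs=5, batch_size=64, validation_split=0.1)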
16.6 Q33: Which of the following statements a), b) or c) is false?
a. TensorBoard is a TensorFlow tool for visualizing data from your deep-learning models as they execute.
b. You can view TensorFlow charts showing how the training and validation accuracy and loss values change through the epochs.
c. Andrej Karpathy's ConvnetJS tool trains convnets in your web browser and dynamically visualizes the layers' outputs, including what each convolutional layer "sees" as it learns.
d. All of the above statements are true.
Answer: d. All of the above statements are true.

16.6 Q34: Consider the following code:
[38]: loss, accuracy = cnn.evaluate(X_test, y_test)
10000/10000 [==============================] - 4s 366us/step
[39]: loss
[39]: 0.026809450998473768
[40]: accuracy
[40]: 0.9917
Which of the following statements a), b) or c) is false?
a. You can check the accuracy of a model on data the model has not yet seen. To do so, call the model's evaluate method, which displays, as its output, how long it took to process the test samples.
b. According to the output of the preceding snippet, our convnet model is 99.17% accurate when predicting the labels for unseen data.
c. With a little online research, you can find models that can predict MNIST with nearly 100% accuracy.
d. Each of the above statements is true.
Answer: d. Each of the above statements is true.

16.6 Q35: Which of the following statements a), b) or c) is false?
a. Calling the cnn model's predict method as shown below predicts the classes of the digit images in its argument array (X_test):
predictions = cnn.predict(X_test)
b. You can check what the first sample digit should be by looking at y_test[0]:
[42]: y_test[0]
[42]: array([0., 0., 0., 0., 0., 0., 0., 1., 0., 0.], dtype=float32)
According to this output, the first sample is the digit 7, because the categorical representation of the test sample's label specifies a 1.0 at index 7—we created this representation via one-hot encoding.
c. The following code outputs the probabilities returned by the predict method for the first test sample:
Chapter 16, Deep Learning 5 [43]: for index, probability in enumerate(predictions[ 0 ]): print(f ' {index} : {probability: .10 %} ' ) 0: 0.0000000201% 1: 0.0000001355% 2: 0.0000186951% 3: 0.0000015494% 4: 0.0000000003% 5: 0.0000000012% 6: 0.0000000000% 7: 99.9999761581% 8: 0.0000005577% 9: 0.0000011416% According to the output, predictions[0] indicates that our model believes this digit is a 7 with nearly 100% certainty. Not all predictions have this level of certainty. d. All of the above statements are true . Answer: d. All of the above statements are true . 1 Saving and Loading a Model 16.6 Q36: Which of the following statements a), b) or c) is false? a. Neural network models can require significant training time. Once you’ve designed and tested a model that suits your needs, you can save its state. This allows you to load it later to make more predictions. Sometimes models are loaded and further trained for new problems. For example, layers in our model already know how to recognize features such as lines and curves, which could be useful in handwritten character recognition as well. This process is called transfer learning—you transfer an existing model’s knowledge into a new model. b. A Keras model’s save method stores the model’s architecture and state information in a format called Hierarchical Data Format (HDF5). Such files use the .h5 file extension: [51]: cnn.save( 'mnist_cnn.h5' ) c. You can load a saved model with the load_model function from the tensorflow.keras.models module, as in: from tensorflow.keras.models import load_model cnn = load_model( 'mnist_cnn.h5' )
6 Chapter 16, Deep Learning You can then invoke its methods. For example, if you’ve acquired more data, you could call predict to make additional predictions on new data, or you could call fit to start training with the additional data. d. All of the above statements are true. Answer: d. All of the above statements are true . 1 Visualizing Neural Network Training with TensorBoard 16.7 Q1: Which of the following statements a), b) or c) is false? a. With deep learning networks, there’s so much complexity and so much going on internally that’s hidden from you that it’s difficult to know and fully understand all the details. This creates challenges in testing, debugging and updating models and algorithms. b. Deep learning learns the features but there may be enormous numbers of them, and they may not be apparent to you. c. Google provides the TensorBoard tool for visualizing neural networks implemented in TensorFlow and Keras. Just as a car’s dashboard visualizes data from your car’s sensors, such as your speed, engine temperature and the amount of gas remaining, a TensorBoard dashboard visualizes data from a deep learning model that can give you insights into how well your model is learning and potentially help you tune its hyperparameters. d. All of the above statements are true . Answer: d. All of the above statements are true . 16.7 Q2: Which of the following statements a), b) or c) is false ? a. TensorBoard monitors a folder you specify looking for files output by models during training. b. TensorBoard loads the data from that folder into a browser-based dashboard, similar to the following:
Chapter 16, Deep Learning 7 c. TensorBoard can load data from multiple models at once and you can choose which to visualize. This makes it easy to compare several different models or multiple runs of the same model. d. All of the above statements are true . Answer: d. All of the above statements are true . 16.7 Q3: To use TensorBoard, before you fit the model, you need to configure a TensorBoard object, which the model will use to write data into a specified folder that TensorBoard monitors. This object is known as a ________ in Keras. a. callforward b. entry point c. callback d. None of the above. Answer: c. callback 16.7 Q4: The following code creates a TensorBoard object: from tensorflow.keras.callbacks import TensorBoard import time tensorboard_callback = TensorBoard(log_dir=f './logs/mnist {time.time()} ' , histogram_freq= 1 , write_graph= True ) Which of the following statements a), b) or c) about the above code is false ?
a. The log_dir argument is the name of the folder in which this model's log files will be written. The notation './logs/' indicates that we're creating a new folder within the logs folder you created previously, and the preceding code follows that with 'mnist' and the current time. Using the time ensures that each new execution of the notebook will have its own log folder. That will enable you to compare multiple executions in TensorBoard.
b. The histogram_freq argument is the frequency in epochs that Keras will output to the model's log files. In this case, we'll write data to the logs for every epoch.
c. When the write_graph argument is True, a graph of the model will be output. You can view the graph in the GRAPHS tab in TensorBoard.
d. All of the above statements are true.
Answer: d. All of the above statements are true.

1 ConvnetJS: Browser-Based Deep-Learning Training and Visualization

No questions.

1 Recurrent Neural Networks for Sequences; Sentiment Analysis with the IMDb Dataset

16.9 Q1: Which of the following statements a), b) or c) is false?
a. Our convnet used stacked layers that were applied sequentially. Non-sequential models are possible with recurrent neural networks.
b. A recurrent neural network (RNN) processes sequences of data, such as time series or text in sentences.
c. The term "recurrent" comes from the fact that the neural network contains loops in which the output of a given layer becomes the input to that same layer in the next time step.
d. All of the above statements are true.
Answer: d. All of the above statements are true.

16.9 Q2: Which of the following statements is false?
a. In a time series, a time step is the next point in time.
b. In a text sequence, a "time step" would be the next word in a sequence of words.
c. The looping in convolutional neural networks enables them to learn and remember relationships among the data in the sequence.
Chapter 16, Deep Learning 9 d. The word “good” on its own has positive sentiment. However, when preceded by “not,” which appears earlier in the sequence, the sentiment becomes negative. Answer: c. The looping in convolutional neural networks enables them to learn and remember relationships among the data in the sequence. Actually, the looping in recurrent neural networks enables them to learn and remember relationships among the data in the sequence. 16.9 Q3: Which of the following statements a), b) or c) is false ? a. RNNs for text sequences take into account the relationships among the earlier and later parts of a sequence. b. When determining the meaning of text there can be many words to consider and an arbitrary number of words in between them. c. A Long Short-Term Memory (LSTM) layer makes a neural network convolutional and is optimized to handle learning from sequences. d. All of the above statements are true . Answer: c. A Long Short-Term Memory (LSTM) layer makes a neural network convolutional and is optimized to handle learning from sequences. Actually, a Long Short-Term Memory (LSTM) layer makes a neural network recurrent and is optimized to handle learning from sequences. 1 Loading the IMDb Movie Reviews Dataset 16.9 Q4: Which of the following statements is false ? a. The IMDb movie reviews dataset included with Keras contains 25,000 training samples and 25,000 testing samples, each labeled with its positive (1) or negative (0) sentiment. b. The following code imports the tensorflow.keras.datasets.imdb module so we can load the dataset: from tensorflow.keras.datasets import imdb c. The imdb module’s load_data function returns the IMDb training and testing sets. The load_data function enables you to specify the number of unique words to import as part of the training and testing data. The following code loads only the top 10,000 most frequently occurring words: number_of_words = 10000 (X_train, y_train), (X_test, y_test) = imdb.load_data( num_words=number_of_words) d. The load_data call in Part (c) returns a tuple of two elements containing the samples and labels, respectively.
1 Data Exploration
16.9 Q5: Which of the following statements a), b) or c) is false?
a. Assuming the IMDb training set samples, training set labels, testing set samples and testing set labels are stored in X_train, y_train, X_test and y_test, respectively, the following code snippets check their dimensions:
[4]: X_train.shape
[4]: (25000,)
[5]: y_train.shape
[5]: (25000,)
[6]: X_test.shape
[6]: (25000,)
[7]: y_test.shape
[7]: (25000,)
b. The arrays y_train and X_test are one-dimensional arrays containing 1s and 0s, indicating whether each review is positive or negative.
c. Based on the outputs from the snippets in Part (a), X_train and X_test appear to be one-dimensional. However, their elements actually are lists of integers, each representing one review's contents, as shown in the code below:
[8]: %pprint
[8]: Pretty printing has been turned OFF
[9]: X_train[123]
[9]: [1, 307, 5, 1301, 20, 1026, 2511, 87, 2775, 52, 116, 5, 31, 7, 4, 91, 1220, 102, 13, 28, 110, 11, 6, 137, 13, 115, 219, 141, 35, 221, 956, 54, 13, 16, 11, 2714, 61, 322, 423, 12, 38, 76, 59, 1803, 72, 8, 2, 23, 5, 967, 12, 38, 85, 62, 358, 99]
d. All of the above statements are true.
Answer: b. The arrays y_train and X_test are one-dimensional arrays containing 1s and 0s, indicating whether each review is positive or negative. Actually, the arrays y_train and y_test are one-dimensional arrays containing 1s and 0s, indicating whether each review is positive or negative.
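A short exploration sketch in the spirit of Parts (a) and (c); it assumes the arrays loaded in Q4 and simply confirms that the "one-dimensional" sample arrays actually hold Python lists of varying lengths:

print(X_train.shape, y_train.shape)        # (25000,) (25000,)
print(type(X_train[123]))                  # <class 'list'>
print(len(X_train[123]), len(X_train[0]))  # review lengths vary
print(sorted(set(y_train)))                # [0, 1] -- the sentiment labels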
16.9 Q6: Which of the following statements a), b) or c) is false?
a. Because IMDb movie reviews are numerically encoded in the dataset bundled with Keras, to view their original text, you need to know the word to which each number corresponds.
b. Keras's IMDb dataset provides a dictionary that maps the words to their indexes. Each word's corresponding value is its frequency ranking among all the words in the entire set of reviews.
c. In the dictionary mentioned in Part (b), the word with the ranking 1 is the most frequently occurring word (calculated by the Keras team from the dataset), the word with ranking 2 is the second most frequently occurring word, and so on. Though the dictionary values begin with 1 as the most frequently occurring word, in each encoded review, the ranking values are offset by 3. So any review containing the most frequently occurring word will have the value 4 wherever that word appears in the review.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.9 Q7: Which of the following statements a), b) or c) is false regarding decoding IMDb movie reviews?
a. The following snippet gets the word-to-index dictionary by calling the function get_word_index from the tensorflow.keras.datasets.imdb module:
[10]: word_to_index = imdb.get_word_index()
b. The word 'great' might appear in a positive movie review, so the following code checks whether it's in the dictionary:
[11]: word_to_index['great']
[11]: 84
c. According to the Part (b) output, 'great' is the dataset's 84th most frequent word. If you look up a word that's not in the dictionary, you'll get an exception.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
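Pulling Q6 and Q7 together, a hedged sketch of decoding an encoded review back to text. The reversed dictionary, the '?' placeholder and the use of X_train[123] are illustrative choices, not code quoted from the questions:

from tensorflow.keras.datasets import imdb

word_to_index = imdb.get_word_index()
index_to_word = {index: word for word, index in word_to_index.items()}

# rankings in each encoded review are offset by 3 (indices 0-2 are reserved)
encoded_review = X_train[123]
decoded = ' '.join(index_to_word.get(i - 3, '?') for i in encoded_review)
print(decoded)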
1 Data Preparation
16.9 Q8: Which of the following statements a), b) or c) is false?
a. The number of words per review varies, but Keras requires all samples to have the same dimensions.
b. To use the IMDb dataset for deep learning, we need to restrict every review to the same number of words.
c. When performing the data preparation in Part (b), some reviews will need to be padded with additional data and others will need to be truncated.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.9 Q9: Which of the following statements is false?
a. The pad_sequences utility function (module tensorflow.keras.preprocessing.sequence) reshapes the rows in an array to the number of features specified by the maxlen argument (200) and returns a two-dimensional array:
[16]: words_per_review = 200
[17]: from tensorflow.keras.preprocessing.sequence import pad_sequences
[18]: X_train = pad_sequences(X_train, maxlen=words_per_review)
b. If a sample has more features, pad_sequences truncates it to the specified length.
c. If a sample has fewer features, pad_sequences adds blanks to the beginning of the sequence to pad it to the specified length.
d. Let's confirm X_train's new shape:
[19]: X_train.shape
[19]: (25000, 200)
Answer: c. If a sample has fewer features, pad_sequences adds blanks to the beginning of the sequence to pad it to the specified length. Actually, if a sample has fewer features, pad_sequences adds 0s to the sequence to pad it to the specified length.
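A brief sketch of the preparation Q9 describes, applied to both the training and testing samples; padding X_test the same way is an assumption that mirrors what the snippet does to X_train:

from tensorflow.keras.preprocessing.sequence import pad_sequences

words_per_review = 200
X_train = pad_sequences(X_train, maxlen=words_per_review)
X_test = pad_sequences(X_test, maxlen=words_per_review)   # same treatment for the test samples
print(X_train.shape, X_test.shape)   # (25000, 200) (25000, 200)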
1 Creating the Neural Network
16.9 Q10: Which of the following statements a), b) or c) is false?
a. We've used one-hot encoding to convert the MNIST dataset's integer labels into categorical data. The result for each label was a vector in which all but one element was 0. We could also do that for the index values that represent our words.
b. For our IMDb example that processes 10,000 unique words, we'd need a 10,000-by-10,000 array to represent all the words. That's 100,000,000 elements, and almost all the array elements would be 0. This is not an efficient way to encode the data.
c. If we were to process all 88,000+ unique words in the IMDb dataset, we'd need an array of nearly eight billion elements.
d. All of the above statements are true.
Answer: d. All of the above statements are true.
16.9 Q11: Which of the following statements a), b) or c) is false?
a. To reduce ambiguity, RNNs that process text sequences typically begin with an embedding layer that encodes each word in a compact dense-vector representation.
b. The vectors produced by the embedding layer also capture the word's context, that is, how a given word relates to the words around it.
c. An embedding layer enables the RNN to learn word relationships among the training data.
d. All of the above statements are true.
Answer: a. To reduce ambiguity, RNNs that process text sequences typically begin with an embedding layer that encodes each word in a compact dense-vector representation. Actually, to reduce dimensionality, RNNs that process text sequences typically begin with an embedding layer that encodes each word in a more compact dense-vector representation.
16.9 Q12: Which of the following are popular predefined word embeddings?
a. GloVe
b. Word2Vec
c. a) and b)
d. None of the above.
Answer: c. a) and b)
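As a sketch of the point made in Q11, the following begins an RNN named rnn with an embedding layer; the output_dim value of 128 is an illustrative choice, not a requirement from the question:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding

rnn = Sequential()
# map each of the 10,000 word indexes to a dense 128-dimensional vector
rnn.add(Embedding(input_dim=number_of_words, output_dim=128))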
16.9 Q13: Which of the following statements is false?
a. The following snippet adds an LSTM layer to an RNN named rnn:
rnn.add(LSTM(units=128, dropout=0.2, recurrent_dropout=0.2))
b. The units argument in Part (a)'s snippet specifies the number of neurons in the layer. The more neurons, the more the network can remember. As a guideline, you can start with a value between the length of the sequences you're processing and the number of classes you're trying to predict.
c. The dropout argument specifies the percentage of neurons to randomly disable when processing the layer's input and output. Like the pooling layers in our convnet, dropout is a proven technique that reduces underfitting. Keras provides a Dropout layer that you can add to your models.
d. The recurrent_dropout argument specifies the percentage of neurons to randomly disable when the layer's output is fed back into the layer again to allow the network to learn from what it has seen previously.
Answer: c. The dropout argument specifies the percentage of neurons to randomly disable when processing the layer's input and output. Like the pooling layers in our convnet, dropout is a proven technique that reduces underfitting. Keras provides a Dropout layer that you can add to your models. Actually, the dropout argument specifies the percentage of neurons to randomly disable when processing the layer's input and output. Like the pooling layers in our convnet, dropout is a proven technique that reduces overfitting. Keras provides a Dropout layer that you can add to your models.
16.9 Q14: The ________ activation function, which is preferred for binary classification, reduces arbitrary values into the range 0.0–1.0, producing a probability.
a. softmax
b. relu
c. sigmoid
d. softplus
Answer: c. sigmoid
16.9 Q15: With only two possible outputs, we use the ________ loss function.
a. mean_squared_error
b. binary_compression
c. categorical_crossentropy
d. binary_crossentropy
Answer: d. binary_crossentropy
1 Training and Evaluating the Model
16.9 Q16: Keras function ________ returns the loss and accuracy values of a trained model.
a. assess
b. account
c. grade
d. evaluate
Answer: d. evaluate
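Tying Q13-Q16 together, a hedged sketch that finishes, compiles and evaluates the model; it assumes rnn already contains the Embedding and LSTM layers discussed above, and the optimizer, epochs and batch size are illustrative choices:

from tensorflow.keras.layers import Dense

rnn.add(Dense(units=1, activation='sigmoid'))   # probability of positive sentiment
rnn.compile(optimizer='adam',
            loss='binary_crossentropy',         # two possible outputs
            metrics=['accuracy'])
rnn.fit(X_train, y_train, epochs=10, batch_size=32)
results = rnn.evaluate(X_test, y_test)          # returns the loss and accuracy values
print(results)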
1 Tuning Deep Learning Models
16.10 Q1: Which of the following are variables that affect model performance?
a. having more or less data to train with, having more or less to test with
b. having more or less to validate with, having more or fewer layers
c. the types of layers you use, the order of the layers
d. All of the above
Answer: d. All of the above.
16.10 Q2: Which of the following statements is false?
a. The compute time required to train models multiple times is significant so, in deep learning, you generally tune hyperparameters with techniques like k-fold cross-validation and grid search.
b. There are various tuning techniques, but one particularly promising area is automated machine learning (AutoML).
c. Auto-Keras (https://autokeras.com/) is geared to automatically choosing the best configurations for your Keras models.
d. Google's Cloud AutoML and Baidu's EZDL are among various other automated machine learning efforts.
Answer: a. The compute time required to train models multiple times is significant so, in deep learning, you generally tune hyperparameters with techniques like k-fold cross-validation and grid search. Actually, the compute time required to train models multiple times is significant, so in deep learning you generally do not tune hyperparameters with techniques like k-fold cross-validation or grid search.
1 Convnet Models Pretrained on ImageNet
16.11 Q1: Moving the weights learned by a deep-learning model for a similar problem into a new model is called ________ learning.
a. assignment
b. transfer
c. relegation
d. relocation
Answer: b. transfer
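A minimal transfer-learning sketch along the lines of 16.11 Q1; the choice of MobileNetV2, the input shape and the 10-class head are assumptions for illustration only:

from tensorflow.keras import Sequential
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense

base = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False   # keep the ImageNet-pretrained weights fixed

model = Sequential([
    base,
    GlobalAveragePooling2D(),
    Dense(10, activation='softmax')   # new classification head for your own task
])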
16.11 Q2: Which of the following statements is false?
a. ImageNet is limited in size, so it can be trained efficiently on most computers.
b. You can reuse just the architecture of each model and train it with new data, or you can reuse the pretrained weights.
c. ImageNet now has a continuously running challenge on the Kaggle competition site called the ImageNet Object Localization Challenge. The goal is to identify "all objects within an image, so those images can then be classified and annotated."
d. There's no obvious optimal solution for many machine learning and deep learning tasks.
Answer: a. ImageNet is limited in size, so it can be trained efficiently on most computers. Actually, ImageNet is too big for efficient training on most computers, so most people interested in using it start with one of the smaller pretrained models.
16.11 Q3: Which of the following statements is false?
a. On Kaggle, companies and organizations fund competitions where they encourage people worldwide to develop better-performing solutions than they've been able to do for something that's important to their business or organization.
b. Sometimes companies offer prize money, which has been as high as $1,000,000 on the famous Netflix competition.
c. Netflix wanted to get a 100% or better improvement in their model for determining whether people will like a movie, based on how they rated previous ones. They used the results to help make better recommendations to members.
d. Even if you do not win a Kaggle competition, it's a great way to get experience working on challenging problems of current interest.
Answer: c. Netflix wanted to get a 100% or better improvement in their model for determining whether people will like a movie, based on how they rated previous ones. They used the results to help make better recommendations to members. Actually, Netflix wanted to get a 10% or better improvement in their model for determining whether people will like a movie, based on how they rated previous ones. They used the results to help make better recommendations to members.
1 Reinforcement Learning
16.12 Q1: Which of the following statements is false?
a. Reinforcement learning is a form of machine learning in which algorithms learn from their environment, similar to how humans learn, for example, a video game enthusiast learning a new game, or a baby learning to walk or recognize its parents.
b. Reinforcement learning implements an agent that learns by trying to perform a task, receiving feedback about success or failure, making adjustments then trying again. The goal is to minimize the loss function.
c. The agent receives a positive reward for doing a right thing and a negative reward (that is, a punishment) for doing a wrong thing.
d. The agent uses this information to determine the next action to perform and must try to maximize the reward.
Answer: b. Reinforcement learning implements an agent that learns by trying to perform a task, receiving feedback about success or failure, making adjustments then trying again. The goal is to minimize the loss function. Actually, the algorithm implements an agent that learns by trying to perform a task, receiving feedback about success or failure, making adjustments then trying again. The goal is to maximize the reward.
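To illustrate the agent/reward loop 16.12 Q1 describes, a toy sketch with an invented "guess the target number" environment; nothing here comes from the text, and real reinforcement-learning problems are far richer:

import random

TARGET = 7                                           # the "right thing" the agent must discover
estimates = {action: 0.0 for action in range(10)}    # the agent's estimate of each action's reward

for episode in range(1000):
    # mostly exploit the best-known action, occasionally explore a random one
    if random.random() < 0.1:
        action = random.randrange(10)
    else:
        action = max(estimates, key=estimates.get)
    reward = 1.0 if action == TARGET else -1.0       # positive reward or punishment
    estimates[action] += 0.1 * (reward - estimates[action])   # adjust, then try again

print(max(estimates, key=estimates.get))             # the agent should converge on 7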