import numpy as np import json img_codes = np.load("data/image_codes.npy") captions = json.load(open('data/captions_tokenized.json'))   for img_i in range(len(captions)): for caption_i inrange(len(captions[img_i])): sentence = captions[img_i][caption_i] captions[img_i][caption_i] = ["#START#"] + sentence.split(' ') + ["#END#"]   # Build a Vocabulary from collections import Counter word_counts = Counter() # Compute word frequencies for each word in captions. See code above for data structure # YOUR CODE HERE   #Testing condition:- vocab = ['#UNK#', '#START#', '#END#', '#PAD#'] vocab += [k for k, v in word_counts.items() if v >= 5 if k not in vocab] n_tokens = len(vocab) assert 10000 <= n_tokens <= 10500 word_to_index = {w: i for i, w in enumerate(vocab)} #for reference and more detail go to ---> https://colab.research.google.com/github/hse-aml/intro-to-dl-pytorch/blob/main/week06/week06_final_project_image_captioning.ipynb#scrollTo=x_wH1bYu-QK8

Computer Networking: A Top-Down Approach (7th Edition)
7th Edition
ISBN:9780133594140
Author:James Kurose, Keith Ross
Publisher:James Kurose, Keith Ross
Chapter1: Computer Networks And The Internet
Section: Chapter Questions
Problem R1RQ: What is the difference between a host and an end system? List several different types of end...
icon
Related questions
Question
import numpy as np
import json

img_codes = np.load("data/image_codes.npy")
captions = json.load(open('data/captions_tokenized.json'))
 
for img_i in range(len(captions)):
for caption_i inrange(len(captions[img_i])):
sentence = captions[img_i][caption_i]
captions[img_i][caption_i] = ["#START#"] + sentence.split(' ') + ["#END#"]
 
# Build a Vocabulary
from collections import Counter
word_counts = Counter()

# Compute word frequencies for each word in captions. See code above for data structure

# YOUR CODE HERE
 
#Testing condition:-
vocab = ['#UNK#', '#START#', '#END#', '#PAD#']
vocab += [k for k, v in word_counts.items() if v >= 5 if k not in vocab]
n_tokens = len(vocab)

assert 10000 <= n_tokens <= 10500

word_to_index = {w: i for i, w in enumerate(vocab)}

#for reference and more detail go to ---> https://colab.research.google.com/github/hse-aml/intro-to-dl-pytorch/blob/main/week06/week06_final_project_image_captioning.ipynb#scrollTo=x_wH1bYu-QK8

Expert Solution
trending now

Trending now

This is a popular solution!

steps

Step by step

Solved in 3 steps

Blurred answer
Similar questions
Recommended textbooks for you
Computer Networking: A Top-Down Approach (7th Edi…
Computer Networking: A Top-Down Approach (7th Edi…
Computer Engineering
ISBN:
9780133594140
Author:
James Kurose, Keith Ross
Publisher:
PEARSON
Computer Organization and Design MIPS Edition, Fi…
Computer Organization and Design MIPS Edition, Fi…
Computer Engineering
ISBN:
9780124077263
Author:
David A. Patterson, John L. Hennessy
Publisher:
Elsevier Science
Network+ Guide to Networks (MindTap Course List)
Network+ Guide to Networks (MindTap Course List)
Computer Engineering
ISBN:
9781337569330
Author:
Jill West, Tamara Dean, Jean Andrews
Publisher:
Cengage Learning
Concepts of Database Management
Concepts of Database Management
Computer Engineering
ISBN:
9781337093422
Author:
Joy L. Starks, Philip J. Pratt, Mary Z. Last
Publisher:
Cengage Learning
Prelude to Programming
Prelude to Programming
Computer Engineering
ISBN:
9780133750423
Author:
VENIT, Stewart
Publisher:
Pearson Education
Sc Business Data Communications and Networking, T…
Sc Business Data Communications and Networking, T…
Computer Engineering
ISBN:
9781119368830
Author:
FITZGERALD
Publisher:
WILEY