
Question
Suppose that when performing attention, we have the following keys and values:
Keys: {[-3 0 1], [1 1 -1], [0 1 1], [1 0 0]}
Values: {[2 1 1], [1 2 2], [0 3 1], [-1 0 2]}
We want to compute the attention embedding using these keys and values for the following query:
[2 1 -1]
Which of the following is the correct attention embedding?
To simplify calculations, replace softmax with argmax. For example, softmax([-1, 1, 0]) would instead be argmax([-1, 1, 0]) = [0, 1, 0].
[2 1 1]
[1 2 2]
[0 3 1]
[-3 0 1]
Expert Solution
Step 1

Solution:

To compute the attention embedding, first calculate the attention scores as the dot product between the query and each key vector. Ordinarily a softmax would turn these scores into normalized weights; here the problem replaces softmax with argmax, so the key with the highest score gets weight 1 and all others get weight 0. The embedding is then the weighted sum of the value vectors, which under argmax is simply the value paired with the highest-scoring key. For the query [2 1 -1], the scores against the four keys are [-7, 4, 0, 2]; the second key [1 1 -1] scores highest, so the attention embedding is its value, [1 2 2].
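
As a check, here is a minimal NumPy sketch of the same computation; the variable names are ours and not part of the original question:

```python
import numpy as np

# Keys and values from the question, one row per vector.
keys = np.array([[-3, 0,  1],
                 [ 1, 1, -1],
                 [ 0, 1,  1],
                 [ 1, 0,  0]])
values = np.array([[ 2, 1, 1],
                   [ 1, 2, 2],
                   [ 0, 3, 1],
                   [-1, 0, 2]])
query = np.array([2, 1, -1])

# Attention scores: dot product of the query with each key.
scores = keys @ query                 # [-7, 4, 0, 2]

# Argmax in place of softmax: a one-hot weight on the highest score.
weights = np.zeros(len(scores))
weights[np.argmax(scores)] = 1        # [0, 1, 0, 0]

# The attention embedding is the weighted sum of the values,
# which under argmax is just the value of the best-matching key.
embedding = weights @ values
print(embedding)                      # [1. 2. 2.]
```

Running this confirms that the one-hot weights select the second value vector, so the correct option is [1 2 2].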
