Your Question:

Given disas.c:

void disassemble(const unsigned char *raw_instr)
{

}
Homework 4
System Skill
2 Basic disassembly
Your final task is to pick apart a binary-encoded x86-64 machine instruction and print its
human-readable assembly equivalent. This is the job of a disassembler (the inverse of the assembler),
such as the objdump tool or the gdb disassemble command. The instructions you must decode are the five
variants of the pushq instruction shown in the table below.

Variant (what is pushed?)                     Disassembled instruction   Binary-encoded machine instruction (num bytes in parens)   In hex
immediate constant                            pushq $0x3f10              01101000 00010000 00111111 00000000 00000000 (5)           68 10 3f 00 00
register                                      pushq %rbx                 01010011 (1)                                               53
indirect                                      pushq (%rdx)               11111111 00110010 (2)                                      ff 32
indirect with displacement                    pushq 0x8(%rax)            11111111 01110000 00001000 (3)                             ff 70 08
indirect with displacement and scaled-index   pushq 0xff(%rbp,%rcx,4)    11111111 01110100 10001101 11111111 (4)                    ff 74 8d ff

As we discussed in class, x86-64 uses a variable-length encoding for machine instructions; some
instructions are encoded as a single byte, and others require more than a dozen bytes. The pushq
instructions to be decoded vary from 1 to 5 bytes in length. The first byte of a machine instruction
contains the opcode, which indicates the type of instruction. The opcode is followed by 0 or more
additional bytes that encode further details about the instruction, such as the operands, register
selector, displacement and so on. These additional bytes vary based on the particular instruction variant.
The table above shows various pushq assembly instructions, each with its binary-encoded machine
equivalent. To interpret the binary encoding:
• The black bits identify the opcode and instruction variant. They are constant for all instructions
of a given variant type.
• The red, green, and blue bits vary depending on the chosen register, amount of displacement, and
immediate values.
• The red bits used in register/indirect come in groups of 3. It takes 3 bits to select a register from
the lower 8 registers. The selected register is encoded using the mapping:
%rax=000 %rcx=001 %rdx=010 %rbx=011 %rsp=100 %rbp=101 %rsi=110 %rdi=111.
Note that in the third byte of the scaled-index variant there are two register selectors side-by-side.
The left group of three bits encodes the register for the index, the right group encodes the register
for the base address. (To access any of the upper 8 registers, a different instruction encoding is used
that you are not responsible for disassembling.)
• The blue bits encode the scale factor for the scaled-index variant. The legal values for the scale
are {1,2,4,8}, thus 2 bits are required to encode a scale factor. The values are encoded with the bit
patterns 00, 01, 10, 11 respectively.
• The green bits encode an unsigned value of size 1-byte (displacement) or 4-byte (immediate).
The 4-byte immediate is stored in little-endian order.
2.1 Your Task
You are to implement a function
void disassemble(const unsigned char *raw_instr)
The disassemble function takes a pointer to a sequence of raw bytes that represent a single
machine instruction. The number of bytes the instruction occupies is not passed in; it is determined
during disassembly. The idea is to read the first byte(s) and use the opcode and variant to determine
how many additional bytes need to be examined. You may wish to write a helper function
print_hex_bytes to print the raw bytes, then use bitwise manipulation to extract and convert the
operands and print the assembly language instruction.
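As a rough illustration of the bitwise extraction described here, the sketch below pulls each field out of a byte. The names REG_NAMES, low_reg, index_reg, scale_factor and read_imm32 are made up for this example and are not part of the handout. The colour coding did not survive transcription, so the bit positions are an assumption based on the worked examples in the table and the usual x86-64 layout: register or base selector in the low three bits, index selector in bits 3 to 5, and scale in the top two bits of the scaled-index byte.

```c
#include <stdint.h>

/* Register names indexed by their 3-bit selector (%rax=000 ... %rdi=111). */
static const char *const REG_NAMES[8] = {
    "%rax", "%rcx", "%rdx", "%rbx", "%rsp", "%rbp", "%rsi", "%rdi"
};

/* Low three bits of a byte: the register (or base register) selector. */
static const char *low_reg(unsigned char byte)
{
    return REG_NAMES[byte & 0x7];
}

/* Bits 3..5 of the scaled-index byte: the index register selector. */
static const char *index_reg(unsigned char byte)
{
    return REG_NAMES[(byte >> 3) & 0x7];
}

/* Top two bits of the scaled-index byte: 00, 01, 10, 11 map to 1, 2, 4, 8. */
static unsigned scale_factor(unsigned char byte)
{
    return 1u << (byte >> 6);
}

/* Combine four little-endian bytes, least significant byte first. */
static uint32_t read_imm32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}
```

For the last row of the table, the byte 0x8d (10001101) splits into scale 4, index %rcx and base %rbp, and the bytes 10 3f 00 00 combine into 0x3f10.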
For constants (immediate or displacement), use the printf format %#x to show hex digits prefixed
with 0x and no leading zeros.
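A small sketch of what %#x produces may help here. The helper format_constant is a hypothetical name used only for this example; it writes into a buffer with snprintf so the result can be inspected, whereas the assignment itself would print directly.

```c
#include <stdio.h>

/* Hypothetical helper: format a constant with the %#x conversion. */
static int format_constant(char *out, size_t n, unsigned value)
{
    /* The # flag adds the 0x prefix; no width is given, so no zero padding. */
    return snprintf(out, n, "%#x", value);
}
```

With this, 0x3f10 comes out as 0x3f10 and 8 as 0x8. One quirk of the C library worth knowing: the prefix is added only to nonzero results, so the value 0 is printed as plain 0.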
As an example, if the disassemble function were passed a pointer to the bytes for the last instruction
in the table above, it would print this line of output:
ff 74 8d ff pushq 0xff(%rbp,%rcx,4)
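One possible way to arrive at such a line is sketched below. It is not the handout's reference solution. The function format_pushq and the table REGS are hypothetical names, and the function builds the line in a buffer rather than printing it so that it can be checked. It assumes the opcode patterns implied by the table's examples (0x68 for the immediate, 0x50 to 0x57 for a register, 0xff followed by a second byte that selects the indirect variants), standard AT&T operand syntax with no spaces inside the operand, and a single space between the hex bytes and the instruction text.

```c
#include <stdint.h>
#include <stdio.h>

static const char *const REGS[8] = {
    "%rax", "%rcx", "%rdx", "%rbx", "%rsp", "%rbp", "%rsi", "%rdi"
};

/*
 * Hypothetical helper: writes the output line for one pushq instruction into
 * out and returns the instruction length in bytes. Only the five variants
 * from the table are handled; anything else is outside this sketch.
 */
static int format_pushq(const unsigned char *raw, char *out, size_t n)
{
    char operand[48];
    int len;

    if (raw[0] == 0x68) {                    /* immediate constant */
        uint32_t imm = (uint32_t)raw[1] | (uint32_t)raw[2] << 8 |
                       (uint32_t)raw[3] << 16 | (uint32_t)raw[4] << 24;
        snprintf(operand, sizeof operand, "$%#x", (unsigned)imm);
        len = 5;
    } else if ((raw[0] & 0xf8) == 0x50) {    /* register: 01010rrr */
        snprintf(operand, sizeof operand, "%s", REGS[raw[0] & 0x7]);
        len = 1;
    } else if ((raw[1] & 0xf8) == 0x30) {    /* 0xff then 00110rrr: indirect */
        snprintf(operand, sizeof operand, "(%s)", REGS[raw[1] & 0x7]);
        len = 2;
    } else if (raw[1] == 0x74) {             /* 0xff 0x74: scaled-index */
        snprintf(operand, sizeof operand, "%#x(%s,%s,%u)", (unsigned)raw[3],
                 REGS[raw[2] & 0x7], REGS[(raw[2] >> 3) & 0x7],
                 1u << (raw[2] >> 6));
        len = 4;
    } else {                                 /* 0xff then 01110rrr: displacement */
        snprintf(operand, sizeof operand, "%#x(%s)", (unsigned)raw[2],
                 REGS[raw[1] & 0x7]);
        len = 3;
    }

    /* Raw bytes as two hex digits each, every byte followed by a space. */
    char hex[16] = "";
    for (int i = 0; i < len; i++)
        snprintf(hex + 3 * i, sizeof hex - 3 * i, "%02x ", raw[i]);

    snprintf(out, n, "%spushq %s", hex, operand);
    return len;
}
```

The check for 0x74 comes before the general displacement case because, in the table, that second byte marks the scaled-index variant. A disassemble function could call a helper like this with a local buffer and pass the result to printf.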
We will only test your function on well-formed instructions of the 5 pushq variants listed above.
There is no requirement that you gracefully handle any other inputs, such as other variants of push or
other instruction types or malformed inputs.
For slightly obscure reasons, there is a small anomaly in the encoding used when %rbp or %rsp is
the base register in indirect/indirect-with-displacement, or when %rsp is the index register for
scaled-index. You do not need to allow for this or make a special case for it; just disassemble as
though the standard encoding is used for all registers without exception.
2.2 Logistics & Advice
Your implementation will go inside disas.c. The file disas.c must not contain any main function.
You may wish to write a separate file to test your implementation.
Design/style/readability. Bitwise manipulation is known for its obscurity, so take extra care to keep
the code clean and be sure to comment any dense expressions. Use macros wisely.
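As one reading of this advice, named macros can stand in for bare masks and shifts so that each expression says what it extracts. The macro names below are hypothetical, and the bit positions are the same assumption as elsewhere: selector in the low three bits, index in bits 3 to 5, scale in the top two bits of the byte.

```c
/* Hypothetical field macros; every argument and result is parenthesized. */
#define REG_MASK     0x07u  /* three bits select one of the lower 8 registers */
#define INDEX_SHIFT  3      /* index selector sits just above the base selector */
#define SCALE_SHIFT  6      /* scale occupies the top two bits of the byte */

#define LOW_REG(b)   ((b) & REG_MASK)
#define INDEX_REG(b) (((b) >> INDEX_SHIFT) & REG_MASK)
#define SCALE(b)     (1u << (((b) >> SCALE_SHIFT) & 0x3u))
```

Small inline functions would do the same job with type checking. The macros assume the argument is an unsigned byte value, as with the unsigned char bytes passed to disassemble.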