Question

Source code:

import java.io.*;

public class Main {
    // Input file read by the analyzer
    static File text = new File("/Users/.../Desktop/sourceCode.txt");
    static FileInputStream keyboard;

    static int charClass;
    static char[] lexeme = new char[100]; // characters of the current lexeme
    static char nextChar;
    static int lexLen; // length of the current lexeme
    static int token;
    static int nextToken;

    // Character classes
    static final int LETTER = 0;
    static final int DIGIT = 1;
    static final int UNKNOWN = 99;

    // Token codes
    static final int INT_LIT = 10;
    static final int IDENT = 11;
    static final int ASSIGN_OP = 20;
    static final int ADD_OP = 21;
    static final int SUB_OP = 22;
    static final int MULT_OP = 23;
    static final int DIV_OP = 24;
    static final int LEFT_PAREN = 25;
    static final int RIGHT_PAREN = 26;

    public static void main(String[] args) {
        // try-with-resources closes the stream even if lexing fails
        try (FileInputStream in = new FileInputStream(text)) {
            keyboard = in;
            getChar();
            getNonBlank();
            // Lex until the lookahead character reaches end of file
            while (charClass != -1) {
                lex();
                getNonBlank();
            }
            lex(); // emit the EOF token
        } catch (IOException e) {
            System.out.println("Cannot read file: " + e.getMessage());
        }
    }

    // Lookup table for non-alphanumeric characters
    public static int lookup(char ch) {
        switch (ch) {
            case '(':
                addChar();
                nextToken = LEFT_PAREN;
                break;

            case ')':
                addChar();
                nextToken = RIGHT_PAREN;
                break;

            case '+':
                addChar();
                nextToken = ADD_OP;
                break;

            case '-':
                addChar();
                nextToken = SUB_OP;
                break;

            case '*':
                addChar();
                nextToken = MULT_OP;
                break;

            case '/':
                addChar();
                nextToken = DIV_OP;
                break;

            default: // character is not in the table
                addChar();
                nextToken = -1;
                break;
        }
        return nextToken;
    }

    public static void addChar() {
        if (lexLen <= 98) { // room for at most 99 characters plus the terminating zero
            lexeme[lexLen++] = nextChar; // store the character in the lexeme array
            lexeme[lexLen] = 0; // terminate the lexeme with a zero
        } else {
            System.out.println("Error - lexeme is too long \n");
        }
    }

    public static void getChar() throws IOException {
        // Keep the result as an int: read() returns -1 at end of file,
        // and casting to char first would hide that value.
        int c = keyboard.read();
        if (c != -1) {
            nextChar = (char) c;
            if (Character.isLetter(nextChar))
                charClass = LETTER;
            else if (Character.isDigit(nextChar))
                charClass = DIGIT;
            else
                charClass = UNKNOWN;
        } else {
            nextChar = '\0'; // not whitespace, so getNonBlank() stops at end of file
            charClass = -1;
        }
    }

    public static void getNonBlank() throws IOException {
        while (Character.isWhitespace(nextChar)) { // skip whitespace
            getChar();
        }
    }

    public static int lex() throws IOException {
        lexLen = 0;
        getNonBlank();

        switch (charClass) {
            case LETTER:
                addChar(); // store the first character of the identifier
                getChar(); // advance the lookahead

                // Keep consuming letters and digits until a
                // non-alphanumeric character appears
                while (charClass == LETTER || charClass == DIGIT) {
                    addChar();
                    getChar();
                }

                nextToken = IDENT; // token for the stored lexeme
                break;

            case DIGIT:
                addChar();
                getChar();
                while (charClass == DIGIT) {
                    addChar();
                    getChar();
                }
                nextToken = INT_LIT; // token for the stored lexeme
                break;

            case UNKNOWN:
                lookup(nextChar); // resolve the character through the lookup table
                getChar();
                break;

            case -1: // end of file
                nextToken = -1;
                lexeme[0] = 'E';
                lexeme[1] = 'O';
                lexeme[2] = 'F';
                lexeme[3] = 0;
                break;
        }
        System.out.printf("Next token is: %d, Next lexeme is ", nextToken);
        for (char n : lexeme) {
            if (n == 0) {
                break;
            }
            System.out.print(n);
        }
        System.out.println();
        return nextToken;
    }

}

**Lexical Analyzer Assignment Outline**

1. **Overview of Lexical Analyzer Program:**
   - **Identifies Tokens:**
     - Alphanumeric lexemes (variables) as IDENT tokens.
     - Numeric lexemes (integer constants) as INT_LIT tokens.
     - Other lexemes such as `+`, `*`, `-`, `/`, `(` are classed as "UNKNOWN".
   - Utilizes a lookup table to retrieve token values for these characters; symbols outside this table trigger errors.
   - Program processes a line of code from an external file, returning lexemes and token numbers.

2. **Report Submission:**
   - Submit a one-page report highlighting your work and any additional features.
   - Reports must be uploaded in digital format (no handwritten submissions).

3. **Task Instructions:**
   - **Code Writing & Lexical Analysis:**
     - Create a line of code and take a screenshot of the lexical analyzer's output.
     - Example format: `smith = (65 + john) - norfolk85ny`

4. **Code Modification:**
   - Modify the code to handle unrecognized characters by printing the error message “Invalid lexeme, character not found.”
   - Test with characters like `$`, `%`, or `#`.
   - Provide screenshots of code and output.
   - **Hint:** Add a case for the `=` sign.

5. **Bonus Points Opportunities:**
   - Explore enhancements for up to 100% additional bonus points.
   - Include screenshots of code modifications.
   - Considerations for enhancements:
     - Handling erroneous variable names, e.g., starting with numerals (`8varX`).
     - Detecting semicolons at line end.
     - Identifying reserved words like `IF`, `WHILE`, `FOR`, `INT`.
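
For the code modification in item 4, one possible shape of the lookup step is sketched below. It is only an illustration, not the expected solution: the class name `LookupSketch` and the `INVALID` code are assumptions that do not appear in the assignment, while the other token codes are copied from the source listing. The sketch adds the hinted case for `=` and prints the required message for any character outside the table.

```java
// Hypothetical sketch: a lookup step that reports characters it does not know.
class LookupSketch {
    // Token codes as in the source listing.
    static final int ASSIGN_OP = 20;
    static final int ADD_OP = 21;
    static final int SUB_OP = 22;
    static final int MULT_OP = 23;
    static final int DIV_OP = 24;
    static final int LEFT_PAREN = 25;
    static final int RIGHT_PAREN = 26;
    // Assumed extra code for unrecognized characters (not in the original listing).
    static final int INVALID = -2;

    static int lookup(char ch) {
        switch (ch) {
            case '=': return ASSIGN_OP; // the case suggested by the hint
            case '+': return ADD_OP;
            case '-': return SUB_OP;
            case '*': return MULT_OP;
            case '/': return DIV_OP;
            case '(': return LEFT_PAREN;
            case ')': return RIGHT_PAREN;
            default:
                // Character is not in the table: report it instead of failing silently.
                System.out.println("Invalid lexeme, character not found.");
                return INVALID;
        }
    }

    public static void main(String[] args) {
        // Try the assignment's test characters alongside two known symbols.
        for (char ch : "=+$%#".toCharArray()) {
            System.out.println("'" + ch + "' -> token " + lookup(ch));
        }
    }
}
```

Using a code that differs from the -1 reserved for end of file lets the caller tell an unknown character apart from the end of input.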

This outline summarizes the task: understanding and modifying a lexical analyzer in Java, with an emphasis on error handling and on expanding the set of tokens it recognizes for more robust code parsing.
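
The bonus enhancements in item 5 could be approached along the following lines. This is a rough sketch under stated assumptions, not part of the assignment: the class `BonusLexerSketch`, its method names, and the category strings are all hypothetical, and reserved words are matched case-sensitively in upper case as they are written in the outline. Unlike the original `lex()`, which would split `8varX` into an integer and an identifier, the sketch takes the whole alphanumeric run as one lexeme so that it can be flagged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the three bonus checks; all names here are assumptions.
class BonusLexerSketch {
    static final Set<String> RESERVED = Set.of("IF", "WHILE", "FOR", "INT");

    // Splits a line into lexemes: maximal alphanumeric runs, or single other characters.
    static List<String> scan(String line) {
        List<String> lexemes = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
                continue;
            }
            int start = i;
            if (Character.isLetterOrDigit(c)) {
                while (i < line.length() && Character.isLetterOrDigit(line.charAt(i))) {
                    i++;
                }
            } else {
                i++;
            }
            lexemes.add(line.substring(start, i));
        }
        return lexemes;
    }

    // Classifies one lexeme produced by scan().
    static String classify(String lexeme) {
        if (lexeme.isEmpty()) return "OTHER";
        if (lexeme.equals(";")) return "SEMICOLON";
        if (RESERVED.contains(lexeme)) return "RESERVED";
        boolean allDigits = lexeme.chars().allMatch(Character::isDigit);
        boolean allAlnum = lexeme.chars().allMatch(Character::isLetterOrDigit);
        if (allDigits) return "INT_LIT";
        if (allAlnum && Character.isLetter(lexeme.charAt(0))) return "IDENT";
        if (allAlnum) return "INVALID_IDENT"; // starts with a digit, e.g. 8varX
        return "OTHER"; // left to the lookup table
    }

    // True when the last non-blank character of the line is a semicolon.
    static boolean endsWithSemicolon(String line) {
        return line.trim().endsWith(";");
    }

    public static void main(String[] args) {
        String line = "INT 8varX = total1 ;";
        for (String lexeme : scan(line)) {
            System.out.println(lexeme + " -> " + classify(lexeme));
        }
        System.out.println("ends with semicolon: " + endsWithSemicolon(line));
    }
}
```

In the character-at-a-time analyzer the same checks would go into `lex()`: compare the finished identifier against the reserved-word set, and in the `DIGIT` case keep consuming letters to detect a malformed name.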
