**Lexical Analyzer Assignment Outline**

1. **Overview of Lexical Analyzer Program:**
   - **Identifies Tokens:**
     - Alphanumeric lexemes (variables) as IDENT tokens.
     - Numeric tokens (constant integers) as INT_LIT tokens.
     - Other lexemes such as `+`, `*`, `\`, `/`, `(` are marked as "UNKNOWN".
   - Utilizes a lookup table to retrieve token values; symbols outside this table trigger errors.
   - Program processes a line of code from an external file, returning lexemes and token numbers.
2. **Report Submission:**
   - Submit a one-page report highlighting work and any additional features.
   - Reports must be uploaded in digital format (no handwritten submissions).
3. **Task Instructions:**
   - **Code Writing & Lexical Analysis:**
     - Create a line of code and take a screenshot of the lexical analyzer's output.
     - Example format: `smith = (65 + john) - norfolk85ny`
4. **Code Modification:**
   - Modify code to handle unrecognized characters, printing an error message, “Invalid lexeme, character not found.”
   - Test with characters like `$`, `%`, or `#`.
   - Provide screenshots of code and output.
   - **Hint:** Add a case for the `=` sign.
5. **Bonus Points Opportunities:**
   - Explore enhancements for up to 100% additional bonus points.
   - Include screenshots of code modifications.
   - Considerations for enhancements:
     - Handling erroneous variable names, e.g., starting with numerals (`8varX`).
     - Detecting semicolons at line end.
     - Identifying reserved words like `IF`, `WHILE`, `FOR`, `INT`.

This outline aids in completing the task for understanding and modifying a lexical analyzer in Java, with an emphasis on handling exceptions and expanding token capabilities for more robust code parsing.
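The bonus considerations in item 5 can be illustrated with a small standalone classifier. This is only a sketch under stated assumptions, not the assignment's expected solution: the class name `LexemeClassifierSketch`, the `classify` method, and the token numbers `SEMICOLON` (27) and `RESERVED` (30) are hypothetical, and reserved words are assumed to match case-insensitively. It works on a complete lexeme string rather than on the character-by-character input the lexer reads.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the assignment's source code.
final class LexemeClassifierSketch {
    static final int INT_LIT = 10;   // same value as in the assignment code
    static final int IDENT = 11;     // same value as in the assignment code
    static final int SEMICOLON = 27; // assumed token number
    static final int RESERVED = 30;  // assumed token number
    static final int INVALID = -1;

    // Reserved words named in the assignment outline.
    static final Set<String> RESERVED_WORDS =
            new HashSet<>(Arrays.asList("IF", "WHILE", "FOR", "INT"));

    /** Classifies one complete lexeme and returns its token number. */
    static int classify(String lexeme) {
        if (lexeme.isEmpty()) {
            return INVALID;
        }
        if (lexeme.equals(";")) {
            return SEMICOLON;
        }
        // Assumption: reserved words are matched regardless of case.
        if (RESERVED_WORDS.contains(lexeme.toUpperCase(Locale.ROOT))) {
            return RESERVED;
        }
        if (lexeme.chars().allMatch(Character::isDigit)) {
            return INT_LIT;
        }
        boolean alphanumeric = lexeme.chars().allMatch(Character::isLetterOrDigit);
        if (Character.isLetter(lexeme.charAt(0)) && alphanumeric) {
            return IDENT;
        }
        // Covers names that start with a digit, such as 8varX.
        return INVALID;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"norfolk85ny", "65", "8varX", "WHILE", ";"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

In the lexer itself, the same check for names like `8varX` would most likely belong in the `DIGIT` case of `lex()`: if a letter follows the digits, the whole run can be reported as invalid instead of being split into an INT_LIT and an IDENT.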
Source code:
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class Main {
    static File text = new File("/Users/.../Desktop/sourceCode.txt");
    static FileInputStream keyboard;
    static int charClass;
    static char[] lexeme = new char[100]; // characters of the current lexeme, terminated by 0
    static char nextChar;
    static int lexLen; // length of the current lexeme
    static int token;
    static int nextToken;

    // Character classes
    static final int LETTER = 0;
    static final int DIGIT = 1;
    static final int UNKNOWN = 99;
    static final int EOF = -1;

    // Token numbers
    static final int INT_LIT = 10;
    static final int IDENT = 11;
    static final int ASSIGN_OP = 20;
    static final int ADD_OP = 21;
    static final int SUB_OP = 22;
    static final int MULT_OP = 23;
    static final int DIV_OP = 24;
    static final int LEFT_PAREN = 25;
    static final int RIGHT_PAREN = 26;

    public static void main(String[] args) {
        try (FileInputStream in = new FileInputStream(text)) {
            keyboard = in;
            getChar();
            getNonBlank();
            // Tokenize until only the end of input remains. Skipping blanks after
            // each token keeps the last lexeme from being dropped and keeps
            // trailing whitespace from being reported as a lexeme.
            while (charClass != EOF) {
                lex();
                getNonBlank();
            }
            lex(); // charClass is EOF here, so this reports the EOF token
        } catch (IOException e) {
            System.out.println("Cannot read file: " + e.getMessage());
        }
    }

    // Lookup table for non-alphanumeric characters
    public static int lookup(char ch) {
        switch (ch) {
            case '(':
                addChar();
                nextToken = LEFT_PAREN;
                break;
            case ')':
                addChar();
                nextToken = RIGHT_PAREN;
                break;
            case '+':
                addChar();
                nextToken = ADD_OP;
                break;
            case '-':
                addChar();
                nextToken = SUB_OP;
                break;
            case '*':
                addChar();
                nextToken = MULT_OP;
                break;
            case '/':
                addChar();
                nextToken = DIV_OP;
                break;
            default:
                // Any character not listed above (including '=') ends up here
                addChar();
                nextToken = -1;
                break;
        }
        return nextToken;
    }

    public static void addChar() {
        if (lexLen <= 98) { // at most 99 characters plus the terminating 0
            lexeme[lexLen++] = nextChar; // store the character in the lexeme array
            lexeme[lexLen] = 0; // terminate the lexeme with a zero
        } else {
            System.out.println("Error - lexeme is too long \n");
        }
    }

    public static void getChar() throws IOException {
        // read() returns an int; compare it with -1 before casting to char,
        // otherwise the end of input is never detected
        int c = keyboard.read();
        if (c != -1) {
            nextChar = (char) c;
            if (Character.isLetter(nextChar))
                charClass = LETTER;
            else if (Character.isDigit(nextChar))
                charClass = DIGIT;
            else
                charClass = UNKNOWN;
        } else {
            nextChar = '\0'; // not whitespace, so getNonBlank() stops here
            charClass = EOF;
        }
    }

    public static void getNonBlank() throws IOException {
        // Skip whitespace; Character.isSpace is deprecated
        while (charClass != EOF && Character.isWhitespace(nextChar)) {
            getChar();
        }
    }

    public static int lex() throws IOException {
        lexLen = 0;
        getNonBlank();
        switch (charClass) {
            case LETTER:
                addChar(); // add the char to the lexeme array
                getChar(); // get the next char
                // Keep scanning for all characters in a variable/identifier
                // until a non-alphanumeric character is reached
                while (charClass == LETTER || charClass == DIGIT) {
                    addChar();
                    getChar();
                }
                nextToken = IDENT; // token for the stored lexeme
                break;
            case DIGIT:
                addChar();
                getChar();
                while (charClass == DIGIT) {
                    addChar();
                    getChar();
                }
                nextToken = INT_LIT; // token for the stored lexeme
                break;
            case UNKNOWN:
                lookup(nextChar); // look up in the lookup table
                getChar();
                break;
            case EOF:
                nextToken = -1;
                lexeme[0] = 'E';
                lexeme[1] = 'O';
                lexeme[2] = 'F';
                lexeme[3] = 0;
                break;
        }
        System.out.printf("Next token is: %d, Next lexeme is ", nextToken);
        for (char n : lexeme) {
            if (n == 0) {
                break;
            }
            System.out.print(n);
        }
        System.out.println();
        return nextToken;
    }
}
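The modification requested in item 4 of the outline centres on the `default` branch of `lookup`. The following is a minimal standalone sketch of that idea, assuming the token numbers from the source code above. The class name `LookupSketch` and the constant name `INVALID` are hypothetical, and the method returns the token directly instead of updating `nextToken` and the lexeme array as the original does.

```java
// Hypothetical standalone version of the lookup table, for illustration only.
final class LookupSketch {
    static final int ASSIGN_OP = 20;
    static final int ADD_OP = 21;
    static final int SUB_OP = 22;
    static final int MULT_OP = 23;
    static final int DIV_OP = 24;
    static final int LEFT_PAREN = 25;
    static final int RIGHT_PAREN = 26;
    static final int INVALID = -1; // same value the original default case assigns

    /** Returns the token number for a single non-alphanumeric character. */
    static int lookup(char ch) {
        switch (ch) {
            case '=': return ASSIGN_OP; // the case suggested by the hint
            case '+': return ADD_OP;
            case '-': return SUB_OP;
            case '*': return MULT_OP;
            case '/': return DIV_OP;
            case '(': return LEFT_PAREN;
            case ')': return RIGHT_PAREN;
            default:
                // Error message wording taken from the assignment text
                System.out.println("Invalid lexeme, character not found.");
                return INVALID;
        }
    }

    public static void main(String[] args) {
        // Characters the assignment suggests testing, plus two valid ones
        for (char ch : "=+$%#".toCharArray()) {
            System.out.println("'" + ch + "' -> " + lookup(ch));
        }
    }
}
```

In the original program the same two changes would go into the existing `switch`: a `case '='` that calls `addChar()` and sets `nextToken = ASSIGN_OP`, and a print statement in the `default` branch.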

