Java Code: For Lexer.java Make a HashMap of in your Lex class. Below is a list of the keywords that you need. Make token types and populate the hash map in your constructor (I would make a helper method that the constructor calls). while, if, do, for, break, continue, else, return, BEGIN, END, print, printf, next, in, delete, getline, exit, nextfile, function Modify “ProcessWord” so that it checks the hash map for known words and makes a token specific to the word with no value if the word is in the hash map, but WORD otherwise. For example, Input: for while hello do Output: FOR WHILE WORD(hello) DO Make a token type for string literals. In Lex, when you encounter a “, call a new method (I called it HandleStringLiteral() ) that reads up through the matching “ and creates a string literal token ( STRINGLITERAL(hello world) ). Be careful of two things: make sure that an empty string literal ( “” ) works and make sure to deal with escaped “ (String quote = “She said, \”hello there\” and then she left.”;) Make a new token type, a new method (HandlePattern) and call it from Lex when you encounter a backtick. The last thing that we need to deal with in our lexer is symbols. Most of these will be familiar from Java, but a few I will detail a bit more. We will be using two different hash maps – one for two-character symbols (like ==, &&, ++) and one for one character symbols (like +, -, $). Why? Well, some two-character symbols start with characters that are also symbols (for example, + and +=). We need to prioritize the += and only match + if it is not a +=. Create a method called “ProcessSymbol” – it should use PeekString to get 2 characters and look them up in the two-character hash map. If it exists, make the appropriate token and return it. Otherwise, use PeekString to get a 1 character string. Look that up in the one-character hash map. If it exists, create the appropriate token and return it. Don’t forget to update the position in the line. If no symbol is found, return null. Call ProcessSymbol in your lex() method. If it returns a value, add the token to the token list. Make sure all the functionality of the unit test are tested. Below are lexer.java and token.java. Make sure to show full code with screenshot of the output. Attached is checklist. Lexer.java import java.util.LinkedList; public class Lexer { private StringHandler stringHandler; private int lineNumber; private int charPosition; public Lexer(String input) { stringHandler = new StringHandler(input); lineNumber = 1; charPosition = 0; } public LinkedList lex() { LinkedList tokens = new LinkedList<>(); while (!stringHandler.isDone()) { char c = stringHandler.peek(0); if (c == ' ' || c == '\t') { stringHandler.swallow(1); charPosition++; } else if (c == '\n') { tokens.add(new Token(TokenType.SEPARATOR, lineNumber, charPosition)); stringHandler.swallow(1); lineNumber++; charPosition = 0; } else if (Character.isLetter(c)) { tokens.add(processWord()); } else if (Character.isDigit(c)) { tokens.add(processNumber()); } else { throw new RuntimeException("Unrecognized character: " + c); } } return tokens; } private Token processWord() { StringBuilder value = new StringBuilder(); while (!stringHandler.isDone() && (Character.isLetterOrDigit(stringHandler.peek(0)) || stringHandler.peek(0) == '_' || stringHandler.peek(0) == ',')) { value.append(stringHandler.getChar()); charPosition++; } return new Token(TokenType.WORD, value.toString(), lineNumber, charPosition - value.length()); } private Token processNumber() { StringBuilder value = new StringBuilder(); while (!stringHandler.isDone() && (Character.isDigit(stringHandler.peek(0)) || stringHandler.peek(0) == '.')) { value.append(stringHandler.getChar()); charPosition++; } return new Token(TokenType.NUMBER, value.toString(), lineNumber, charPosition - value.length()); } } Token.java enum TokenType { WORD, NUMBER, SEPARATOR } public class Token { private TokenType type; private String value; private int lineNumber; private int charPosition; public Token(TokenType type, int lineNumber, int charPosition) { this.type = type; this.lineNumber = lineNumber; this.charPosition = charPosition; } public Token(TokenType type, String value, int lineNumber, int charPosition) { this(type, lineNumber, charPosition); this.value = value; } public String toString() { if (value == null) { return type.toString(); } else { return type.toString() + "(" + value + ")"; } } }
Java Code: For Lexer.java
Make a HashMap of <String, TokenType> in your Lex class. Below is a list of the keywords that you need. Make token types and populate the hash map in your constructor (I would make a helper method that the constructor calls).
while, if, do, for, break, continue, else, return, BEGIN, END, print, printf, next, in, delete, getline, exit, nextfile, function
Modify “ProcessWord” so that it checks the hash map for known words and makes a token specific to the word with no value if the word is in the hash map, but WORD otherwise.
For example,
Input: for while hello do
Output: FOR WHILE WORD(hello) DO
Make a token type for string literals. In Lex, when you encounter a “, call a new method (I called it HandleStringLiteral() ) that reads up through the matching “ and creates a string literal token ( STRINGLITERAL(hello world) ). Be careful of two things: make sure that an empty string literal ( “” ) works and make sure to deal with escaped “ (String quote = “She said, \”hello there\” and then she left.”;)
Make a new token type, a new method (HandlePattern) and call it from Lex when you encounter a backtick.
The last thing that we need to deal with in our lexer is symbols. Most of these will be familiar from Java, but a few I will detail a bit more. We will be using two different hash maps – one for two-character symbols (like ==, &&, ++) and one for one character symbols (like +, -, $). Why? Well, some two-character symbols start with characters that are also symbols (for example, + and +=). We need to prioritize the += and only match + if it is not a +=.
Create a method called “ProcessSymbol” – it should use PeekString to get 2 characters and look them up in the two-character hash map. If it exists, make the appropriate token and return it. Otherwise, use PeekString to get a 1 character string. Look that up in the one-character hash map. If it exists, create the appropriate token and return it. Don’t forget to update the position in the line. If no symbol is found, return null. Call ProcessSymbol in your lex() method. If it returns a value, add the token to the token list.
Make sure all the functionality of the unit test are tested. Below are lexer.java and token.java. Make sure to show full code with screenshot of the output. Attached is checklist.
Lexer.java
Trending now
This is a popular solution!
Step by step
Solved in 6 steps with 1 images
Where are the unit test? I need to see various junit test cases being tested.