Java Code:
In this project, we will begin our lexer. Our lexer will start by reading the strings of the sample.txt file that the user wants to run. It will break the text file into “words” or tokens and build a collection of these tokens. We can consider the lexer complete when it can take any AWK file and output a list of the tokens being generated.
Use Files.readAllBytes. This is a much simpler way of dealing with files.
Example of readAllBytes:
Path myPath = Paths.get("someFile.awk");
String content = new String(Files.readAllBytes(myPath));
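The file-reading step might look like the following minimal sketch. The class name ReadFileExample and the helper readDocument are hypothetical, and UTF-8 encoding is an assumption (the assignment does not name a charset).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ReadFileExample {
    // Reads the whole file into one String.
    // Assumption: the file is UTF-8 text; the assignment does not say.
    static String readDocument(String fileName) throws IOException {
        Path myPath = Paths.get(fileName);
        return new String(Files.readAllBytes(myPath), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReadFileExample <file>");
            return;
        }
        // Echo the document so the read can be checked by eye.
        System.out.println(readDocument(args[0]));
    }
}
```

Passing an explicit charset avoids depending on the platform default, which can differ between machines.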
Start by creating the “StringHandler” class. It should have a private string to hold the document (sample.txt file) and a private integer index (the finger position). It should have methods:
char Peek(i) – looks “i” characters ahead and returns that character; doesn’t move the index
String PeekString(i) – returns a string of the next “i” characters but doesn’t move the index
char GetChar() – returns the next character and moves the index
void Swallow(i) – moves the index ahead “i” positions
boolean IsDone() – returns true if we are at the end of the document
String Remainder() – returns the rest of the document as a string
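One possible shape for the methods listed above, as a sketch rather than a reference solution. It assumes Peek(0) means the next unread character and that Swallow should stop at the end of the document; neither detail is stated in the assignment.

```java
class StringHandler {
    private final String document; // the full text being lexed
    private int index = 0;         // "finger" position of the next unread character

    StringHandler(String document) {
        this.document = document;
    }

    // Looks i characters ahead without moving the index.
    // Assumption: Peek(0) is the next unread character.
    // Throws StringIndexOutOfBoundsException when looking past the end.
    char Peek(int i) {
        return document.charAt(index + i);
    }

    // Returns up to the next i characters without moving the index.
    String PeekString(int i) {
        int end = Math.min(index + i, document.length());
        return document.substring(index, end);
    }

    // Returns the next character and moves the index past it.
    char GetChar() {
        return document.charAt(index++);
    }

    // Moves the index ahead i positions, stopping at the end (assumption).
    void Swallow(int i) {
        index = Math.min(index + i, document.length());
    }

    // True once every character has been consumed.
    boolean IsDone() {
        return index >= document.length();
    }

    // Returns the unread rest of the document.
    String Remainder() {
        return document.substring(index);
    }
}
```

The method names keep the capitalized form used in the assignment, although Java convention would normally use lowerCamelCase.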
Create the Token class. It needs an enum of TokenType (values: WORD, NUMBER, SEPERATOR) and a string to hold the value of the token (for example, “hello” and “goodbye” are both WORD, but with different values). The token will also hold the line number and character position of the start of the token. Create two constructors – one for TokenType, line number and position, and one that also has a value; some tokens don’t have a value because it doesn’t matter (for example, a new line). Make sure to add a ToString method. Your exact format isn’t critical; I output the token type and the value in parentheses if it is set.
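A sketch of a Token class along these lines follows. It overrides Java's standard toString() in place of a method literally named ToString, nests the enum inside Token, and adds getters; those three choices are assumptions, not requirements from the assignment.

```java
class Token {
    // Spelling of SEPERATOR follows the assignment text.
    enum TokenType { WORD, NUMBER, SEPERATOR }

    private final TokenType type;
    private final String value;   // null when the token carries no value
    private final int lineNumber; // line on which the token starts
    private final int position;   // character position of the token's start

    // For tokens whose value does not matter, such as a new line.
    Token(TokenType type, int lineNumber, int position) {
        this(type, lineNumber, position, null);
    }

    Token(TokenType type, int lineNumber, int position, String value) {
        this.type = type;
        this.lineNumber = lineNumber;
        this.position = position;
        this.value = value;
    }

    TokenType getType() { return type; }
    String getValue() { return value; }
    int getLineNumber() { return lineNumber; }
    int getPosition() { return position; }

    // Prints the type, plus the value in parentheses when one is set.
    @Override
    public String toString() {
        return value == null ? type.toString() : type + "(" + value + ")";
    }
}
```

For example, new Token(Token.TokenType.WORD, 1, 0, "hello") prints as WORD(hello), while a SEPERATOR token built with the three-argument constructor prints as SEPERATOR.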
Next, we will create the Lexer class. It should take a string as a parameter to the constructor. The constructor should create a StringHandler. All access to the document will be done through the StringHandler’s methods. The Lexer will also keep track of what line of the document we are on and what character position we are on within the line. This is different from the index in the StringHandler.
The Lexer needs a “Lex” method – this is the “main” that will break the data from StringHandler into a linked list of tokens. While there is still data in StringHandler, we want to peek at the next character to get an idea of what to do with it.
If the character is a space or tab, we will just move past it (increment position).
If the character is a linefeed (\n), we will create a new SEPERATOR token with no “value” and add it to the token list. We should also increment the line number and set the line position to 0.
If the character is a carriage return (\r), we will ignore it.
If the character is a letter, we need to call ProcessWord (see below) and add the result to our list of tokens.
If the character is a digit, we need to call ProcessNumber (see below) and add the result to our list of tokens.
Throw an exception if you encounter a character you don’t recognize.
ProcessWord is a method that returns a Token. It accepts letters, digits and underscores (_) and makes a String of them. When it encounters a character that is NOT one of those, it stops and makes a “WORD” token, setting the value to be the String it has accumulated. Make sure that you are using “peek” so that whatever character does NOT belong to the WORD stays in the StringHandler class. Also remember to increment the position value.
ProcessNumber is similar to ProcessWord, but accepts 0-9 and one “.”. It does not accept plus or minus – we will handle those next time.
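The rules above can be sketched as one self-contained class. LexerSketch is a hypothetical name, and the nested Token and StringHandler are compact stand-ins so the example compiles on its own. Numbering lines from 1, treating only ASCII 0-9 as digits, and throwing IllegalArgumentException are assumptions.

```java
import java.util.LinkedList;
import java.util.List;

class LexerSketch {
    enum TokenType { WORD, NUMBER, SEPERATOR }

    // Compact stand-in for the Token class.
    static class Token {
        final TokenType type; final String value; final int line; final int position;
        Token(TokenType type, int line, int position, String value) {
            this.type = type; this.line = line; this.position = position; this.value = value;
        }
        @Override public String toString() {
            return value == null ? type.toString() : type + "(" + value + ")";
        }
    }

    // Compact stand-in for StringHandler with only the methods used here.
    static class StringHandler {
        private final String document; private int index = 0;
        StringHandler(String document) { this.document = document; }
        char Peek(int i) { return document.charAt(index + i); }
        char GetChar() { return document.charAt(index++); }
        void Swallow(int i) { index += i; }
        boolean IsDone() { return index >= document.length(); }
    }

    private final StringHandler handler;
    private int lineNumber = 1; // assumption: lines are numbered from 1
    private int position = 0;   // character position within the current line

    LexerSketch(String document) { handler = new StringHandler(document); }

    private static boolean isAsciiDigit(char c) { return c >= '0' && c <= '9'; }

    List<Token> Lex() {
        List<Token> tokens = new LinkedList<>();
        while (!handler.IsDone()) {
            char c = handler.Peek(0);
            if (c == ' ' || c == '\t') {          // whitespace: move past it
                handler.Swallow(1);
                position++;
            } else if (c == '\n') {               // linefeed: valueless SEPERATOR
                tokens.add(new Token(TokenType.SEPERATOR, lineNumber, position, null));
                handler.Swallow(1);
                lineNumber++;
                position = 0;
            } else if (c == '\r') {               // carriage return: ignore
                handler.Swallow(1);
            } else if (Character.isLetter(c)) {
                tokens.add(ProcessWord());
            } else if (isAsciiDigit(c)) {
                tokens.add(ProcessNumber());
            } else {
                throw new IllegalArgumentException("Unrecognized character '" + c
                        + "' at line " + lineNumber + ", position " + position);
            }
        }
        return tokens;
    }

    // Accumulates letters, digits and underscores; peeks so the first
    // non-word character stays in the StringHandler.
    private Token ProcessWord() {
        int start = position;
        StringBuilder word = new StringBuilder();
        while (!handler.IsDone()
                && (Character.isLetterOrDigit(handler.Peek(0)) || handler.Peek(0) == '_')) {
            word.append(handler.GetChar());
            position++;
        }
        return new Token(TokenType.WORD, lineNumber, start, word.toString());
    }

    // Accumulates digits 0-9 and at most one '.'; a second '.' ends the number.
    private Token ProcessNumber() {
        int start = position;
        boolean seenDot = false;
        StringBuilder number = new StringBuilder();
        while (!handler.IsDone()) {
            char c = handler.Peek(0);
            if (c == '.' && !seenDot) seenDot = true;
            else if (!isAsciiDigit(c)) break;
            number.append(handler.GetChar());
            position++;
        }
        return new Token(TokenType.NUMBER, lineNumber, start, number.toString());
    }
}
```

Under these rules, punctuation such as a comma is not a recognized character, so this sketch throws on any input that contains one. The sample.txt shown in this post does contain a comma ("this story, the message"), which is worth checking when running against it.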
Create JUnit tests for your lexer. Test with single and multi-line strings. Test with words followed by numbers, and with numbers followed by words.
Finally, create a “main” class. It should take a command line parameter of a filename. It should call Files.readAllBytes and pass the result to the lexer. It should print out the resultant tokens.
Below is sample.txt. Make sure to show the full code for all 3 parts and take screenshots of the output of sample.txt being printed out as tokens. The checklist is attached.
sample.txt
My inspiration for writing this piece would be how patriarchy is still a detrimental factor in society and females are not treated with respect Atrocities towards females are the highest in East Asian countries because most people have this belief that females are not considered equal to males I want to focus on the patriarchal mindset in East Asian countries because it allows me to see the differences between the levels of female inequality in different countries By writing this story, the message of female inequality is emphasized and how it can have a negative impact on kids
I ran the code in Eclipse, but this is what I got (image attached). Please run the same code in Eclipse and show its output. Make sure to take screenshots of the code and of the tokens being printed out in Eclipse.