Question

Java Code: 

In this project, we will begin our lexer. Our lexer will start by reading the strings of the sample.txt file that the user wants to run. It will break the text file into “words” or tokens and build a collection of these tokens. We can consider the lexer complete when it can take any AWK file and output a list of the tokens being generated. 

Use Files.readAllBytes. This is a much simpler way of dealing with files.

Example of readAllBytes:

Path myPath = Paths.get("someFile.awk");

String content = new String(Files.readAllBytes(myPath));

Start by creating the “StringHandler” class. It should have a private string to hold the document (sample.txt file) and a private integer index (the finger position). It should have methods:

char Peek(i) – looks “i” characters ahead and returns that character; doesn’t move the index

String PeekString(i) – returns a string of the next “i” characters but doesn’t move the index

char GetChar() – returns the next character and moves the index

void Swallow(i) – moves the index ahead “i” positions

boolean IsDone() – returns true if we are at the end of the document

String Remainder() – returns the rest of the document as a string
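
The method list above could be sketched roughly as follows. This is only one possible reading of the description: the assignment does not say whether Peek(0) means the next unread character or the one after it, nor what should happen when peeking or swallowing past the end of the document, so those choices are assumptions marked in the comments.

```java
// A minimal sketch of the StringHandler described above, not a reference solution.
class StringHandler {
    private final String document; // the full contents of the file
    private int index = 0;         // the "finger position"

    StringHandler(String document) {
        this.document = document;
    }

    // Assumption: Peek(0) is the next unread character. Does not move the index.
    // Throws StringIndexOutOfBoundsException if it looks past the end.
    char Peek(int i) {
        return document.charAt(index + i);
    }

    // Returns up to i upcoming characters without moving the index.
    // Assumption: a shorter string is returned if the document ends first.
    String PeekString(int i) {
        int start = Math.min(index, document.length());
        int end = Math.min(document.length(), start + i);
        return document.substring(start, end);
    }

    // Returns the next character and advances the index by one.
    char GetChar() {
        return document.charAt(index++);
    }

    // Advances the index by i positions without returning anything.
    void Swallow(int i) {
        index += i;
    }

    // True once every character has been consumed.
    boolean IsDone() {
        return index >= document.length();
    }

    // Returns everything from the index to the end of the document.
    String Remainder() {
        return document.substring(Math.min(index, document.length()));
    }
}
```

The method names are capitalized here only to match the assignment text; standard Java style would use peek, getChar, and so on.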

 

Create the Token class. It needs an enum of TokenType (values: WORD, NUMBER, SEPERATOR) and a string to hold the value of the token (for example, “hello” and “goodbye” are both WORD, but with different values). The token will also hold the line number and character position of the start of the token. Create two constructors: one that takes the TokenType, line number and position, and one that also takes a value. Some tokens don’t have a value because it doesn’t matter (a new line, for example). Make sure to add a ToString method. Your exact format isn’t critical; I output the token type, followed by the value in parentheses if it is set.
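
A Token along these lines might look like the sketch below. The field names, the getters, and the exact toString format are assumptions; the assignment only fixes the enum values, the four members, and the two constructors. In Java the method has to be spelled toString (lowercase t) to override Object's version.

```java
// A minimal sketch of the Token class described above, not a reference solution.
class Token {
    // Spelling of SEPERATOR follows the assignment text.
    enum TokenType { WORD, NUMBER, SEPERATOR }

    private final TokenType type;
    private final String value;   // null when the token carries no value
    private final int lineNumber; // line on which the token starts
    private final int position;   // character position of the token's start within that line

    // Constructor for tokens without a value, such as a new line.
    Token(TokenType type, int lineNumber, int position) {
        this(type, lineNumber, position, null);
    }

    // Constructor for tokens that carry a value, such as WORD or NUMBER.
    Token(TokenType type, int lineNumber, int position, String value) {
        this.type = type;
        this.lineNumber = lineNumber;
        this.position = position;
        this.value = value;
    }

    // Hypothetical getters, added so tests can inspect a token.
    TokenType getType() { return type; }
    String getValue() { return value; }
    int getLineNumber() { return lineNumber; }
    int getPosition() { return position; }

    // Prints the type, followed by the value in parentheses if one is set.
    @Override
    public String toString() {
        return value == null ? type.toString() : type + "(" + value + ")";
    }
}
```

With this format, a WORD token holding hello prints as WORD(hello) and a valueless SEPERATOR token prints as SEPERATOR.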

 

Next, we will create the Lexer class. It should take a string as a parameter to the constructor. The constructor should create a StringHandler. All access to the document will be done through the StringHandler’s methods. The Lexer will also keep track of what line of the document we are on and what character position we are on within the line. This is different from the index in the StringHandler.

 

The Lexer needs a “Lex” method. This is the “main” that will break the data from StringHandler into a linked list of tokens. While there is still data in StringHandler, we want to peek at the next character to get an idea of what to do with it.

If the character is a space or tab, we will just move past it (increment position).

If the character is a linefeed (\n), we will create a new SEPERATOR token with no “value” and add it to token list. We should also increment the line number and set line position to 0.

If the character is a carriage return (\r), we will ignore it.

If the character is a letter, we need to call ProcessWord (see below) and add the result to our list of tokens.

If the character is a digit, we need to call ProcessNumber (see below) and add the result to our list of tokens.

Throw an exception if you encounter a character you don’t recognize.

 

ProcessWord is a method that returns a Token. It accepts letters, digits and underscores (_) and makes a String of them. When it encounters a character that is NOT one of those, it stops and makes a “WORD” token, setting the value to be the String it has accumulated. Make sure that you are using “peek” so that whatever character does NOT belong to the WORD stays in the StringHandler class. Also remember to increment the position value.

 

ProcessNumber is similar to ProcessWord, but accepts 0-9 and one “.”. It does not accept plus or minus – we will handle those next time.
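
The Lex loop and the two helper methods described above could be sketched as follows. To keep the example self-contained, a small MiniToken class and an inlined cursor stand in for the Token and StringHandler classes; both are hypothetical simplifications, and a real solution would use the actual classes. Starting line numbers at 1, limiting digits to ASCII 0-9, and throwing IllegalArgumentException are also assumptions.

```java
import java.util.LinkedList;

// A compact sketch of the lexer logic, not a reference solution.
class LexerSketch {
    enum TokenType { WORD, NUMBER, SEPERATOR }

    // Hypothetical stand-in for the assignment's Token class.
    static class MiniToken {
        final TokenType type; final String value; final int line; final int position;
        MiniToken(TokenType type, int line, int position, String value) {
            this.type = type; this.line = line; this.position = position; this.value = value;
        }
        @Override public String toString() {
            return value == null ? type.toString() : type + "(" + value + ")";
        }
    }

    private final String document;
    private int index = 0;      // stand-in for the StringHandler index
    private int lineNumber = 1; // assumption: lines are counted from 1
    private int position = 0;   // character position within the current line

    LexerSketch(String document) { this.document = document; }

    private boolean isDone() { return index >= document.length(); }
    private char peek() { return document.charAt(index); }
    // Consumes one character and advances the in-line position.
    private char getChar() { position++; return document.charAt(index++); }
    private static boolean isAsciiDigit(char c) { return c >= '0' && c <= '9'; }

    LinkedList<MiniToken> lex() {
        LinkedList<MiniToken> tokens = new LinkedList<>();
        while (!isDone()) {
            char c = peek();
            if (c == ' ' || c == '\t') {
                getChar();                      // skip whitespace, position moves on
            } else if (c == '\n') {
                tokens.add(new MiniToken(TokenType.SEPERATOR, lineNumber, position, null));
                index++;
                lineNumber++;
                position = 0;
            } else if (c == '\r') {
                index++;                        // carriage returns are ignored
            } else if (Character.isLetter(c)) {
                tokens.add(processWord());
            } else if (isAsciiDigit(c)) {
                tokens.add(processNumber());
            } else {
                throw new IllegalArgumentException("Unrecognized character '" + c
                        + "' at line " + lineNumber + ", position " + position);
            }
        }
        return tokens;
    }

    // Accumulates letters, digits and underscores; the first other character is left unread.
    private MiniToken processWord() {
        int start = position;
        StringBuilder word = new StringBuilder();
        while (!isDone() && (Character.isLetterOrDigit(peek()) || peek() == '_')) {
            word.append(getChar());
        }
        return new MiniToken(TokenType.WORD, lineNumber, start, word.toString());
    }

    // Accumulates digits and at most one '.'; a second '.' is left unread.
    private MiniToken processNumber() {
        int start = position;
        boolean seenPoint = false;
        StringBuilder number = new StringBuilder();
        while (!isDone()) {
            char c = peek();
            if (c == '.' && !seenPoint) {
                seenPoint = true;
            } else if (!isAsciiDigit(c)) {
                break;
            }
            number.append(getChar());
        }
        return new MiniToken(TokenType.NUMBER, lineNumber, start, number.toString());
    }
}
```

Because both helpers peek before consuming, the character that ends a word or number stays in the input and is classified by the next pass of the loop. Note that under these rules any punctuation, such as a comma, reaches the final branch and raises the exception.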

 

Create JUnit tests for your lexer. Test with single-line and multi-line strings. Test with words followed by numbers and with numbers followed by words.

 

Finally, create a “main” class. It should take a filename as a command line parameter. It should call Files.readAllBytes (the checklist calls this GetAllBytes) and pass the result to the lexer. It should print out the resultant tokens.

 

Below is sample.txt. Make sure to show the full code for all 3 parts and take screenshots of the output of sample.txt being printed out as tokens. The checklist is attached.

 

sample.txt

My inspiration for writing this piece would be how patriarchy is still a detrimental factor in society and females are not treated with respect Atrocities towards females are the highest in East Asian countries because most people have this belief that females are not considered equal to males I want to focus on the patriarchal mindset in East Asian countries because it allows me to see the differences between the levels of female inequality in different countries By writing this story, the message of female inequality is emphasized and how it can have a negative impact on kids

 

Checklist (points in parentheses):

Code Style: Few comments, bad names (0) | Some good naming, some necessary comments (3) | Mostly good naming, most necessary comments (6) | Good naming, non-trivial methods well commented, static only when necessary, private members (10)

Unit Tests: Don't exist (0) | At least one (3) | Missing tests (6) | All functionality tested (10)

Main: Doesn't exist (0) | At least one of: Exists, reads file with GetAllBytes, calls lex, prints tokens (3) | Three of: Exists, reads file with GetAllBytes, calls lex, prints tokens (6) | Exists, reads file with GetAllBytes, calls lex, prints tokens (10)

Token class: Doesn't exist (0) | Has enum and string members (3) | Has enum, string members, line number and start position (5)

Token constructors: Don't exist (0) | One is correct (3) | Both exist and are correct (5)

Token ToString: Doesn't exist (0) | Exists and outputs type and value members clearly (5)

Token members: Don't exist (0) | Wrong types (1) | All correct, but not private (3) | All four are correct and private (5)

StringHandler: Doesn't exist (0) | Exists and holds string and index (3) | Exists, members correct, constructor correct (6) | All methods, members, constructor correct (10)

Lexer class: Doesn't exist (0) | Exists and holds StringHandler (3) | Exists, holds StringHandler, line number and position (5)

Lexer - constructor: Doesn't exist (0) | Instantiates StringHandler (3) | Instantiates StringHandler and sets line number and position (5)

Lexer - lex: Doesn't exist (0) | Exists and loops over the string (3) | Exists, loops over the string, returns a Linked List of tokens (6) | Skips appropriate values, calls ProcessWord and ProcessNumber and adds their return values to the list (10)

Lexer - ProcessWord: Doesn't exist (0) | One of: Accepts required characters, creates a token, doesn't accept characters it shouldn't (3) | Two of: Accepts required characters, creates a token, doesn't accept characters it shouldn't (6) | Accepts required characters, creates a token, doesn't accept characters it shouldn't (10)

Lexer - ProcessNumber: Doesn't exist (0) | One of: Accepts required characters, creates a token, doesn't accept characters it shouldn't (3) | Two of: Accepts required characters, creates a token, doesn't accept characters it shouldn't (6) | Accepts required characters, creates a token, doesn't accept characters it shouldn't (10)
Follow-up Question

I ran the code in Eclipse, but this is what I got (the image is attached). Please run the same code in Eclipse and show the output being printed there. Make sure to take screenshots of the code and of the tokens being printed out in Eclipse.

package mypack;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedList;

public class Main {
    public static void main(String[] args) {
        // Exactly one argument is expected: the file to lex.
        if (args.length != 1) {
            System.out.println("Usage: java Main <filename>");
            return;
        }
        try {
            // Read the whole file into one string and hand it to the lexer.
            String content = new String(Files.readAllBytes(Paths.get(args[0])));
            Lexer lexer = new Lexer(content);
            LinkedList<Token> tokens = lexer.lex();
            // Print each token on its own line using Token's toString.
            for (Token token : tokens) {
                System.out.println(token);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Console output:
<terminated> Main (7) [Java Application] C:\Users\subri\.p2\pool\plugins\org.eclipse.justj.openjdk.hotspot.jre.full.win32.x86_64_17.0.1.v20211116-1657\jre\bin\java
Usage: java Main <filename>