Develop the C or C++ source code required to solve the following problem
CS3361 - Assignment 3.pdf
CS 3361 | Fall 2020 | Assignment #3 Lexical Analyzer
Assignment #3
Lexical Analyzer
Problem
Develop a lexical analyzer in C or C++ that can identify lexemes and tokens found in a source code file provided
by the user. Once the analyzer has identified the lexemes of the language and matched them to a token group,
the program should print each lexeme / token pair to the screen.
The source code file provided by the user will be written in a new programming language called “DanC” and is
based upon the following grammar (in BNF):
P ::= S
S ::= V:=E | read(V) | write(V) | while C do S od | S;S
C ::= E < E | E > E | E = E | E <> E | E <= E | E >= E
E ::= T | E + T | E - T
T ::= F | T * F | T / F
F ::= (E) | N | V
V ::= a | b | … | y | z | aV | bV | … | yV | zV
N ::= 0 | 1 | … | 8 | 9 | 0N | 1N | … | 8N | 9N
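The V and N productions above are right-recursive ways of saying "one or more lowercase letters" and "one or more digits". A minimal sketch of that reading in C follows; the function names is_variable and is_integer are hypothetical and not part of the assignment.

```c
#include <stdbool.h>
#include <stddef.h>

/* V ::= a | ... | z | aV | ... | zV
 * Read as: a non-empty run of lowercase letters a-z. */
bool is_variable(const char *text)
{
    if (text == NULL || *text == '\0')
        return false;
    for (const char *p = text; *p != '\0'; p++) {
        if (*p < 'a' || *p > 'z')
            return false;
    }
    return true;
}

/* N ::= 0 | ... | 9 | 0N | ... | 9N
 * Read as: a non-empty run of decimal digits (leading zeros allowed). */
bool is_integer(const char *text)
{
    if (text == NULL || *text == '\0')
        return false;
    for (const char *p = text; *p != '\0'; p++) {
        if (*p < '0' || *p > '9')
            return false;
    }
    return true;
}
```

Under this reading, "abc" is a valid variable name and "007" is a valid integer, while "a1" and "While" match neither production as a whole.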
Your analyzer should accept the source code file as a required command line argument and display an
appropriate error message if the argument is not provided or the file does not exist. The command to run your
application will look something like this:
Form: danc_analyzer <path_to_source_file>
Example: danc_analyzer test_file.danc
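One possible way to handle the required command line argument is sketched below. The helper name open_source_file and the wording of both error messages are assumptions; the assignment only asks that an appropriate message be shown.

```c
#include <stdio.h>

/* Hypothetical helper: returns an open FILE* for the source file named in
 * argv[1], or NULL after printing an error message to stderr. */
FILE *open_source_file(int argc, char *argv[])
{
    /* The path is a required argument, so argc must be at least 2. */
    if (argc < 2) {
        fprintf(stderr, "Usage: danc_analyzer <path_to_source_file>\n");
        return NULL;
    }

    /* fopen returns NULL if the file does not exist or cannot be read. */
    FILE *source = fopen(argv[1], "r");
    if (source == NULL) {
        fprintf(stderr, "Error: cannot open file '%s'\n", argv[1]);
    }
    return source;
}
```

A main function could call open_source_file(argc, argv) first and return a non-zero exit code when the result is NULL.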
Lexeme formation is guided using the BNF rules / grammar above. Your application should output each lexeme
and its associated token. Invalid lexemes should output UNKNOWN as their token group. The following token
names should be used to identify each valid lexeme:
Lexeme  Token        Lexeme  Token        Lexeme            Token
:=      ASSIGN_OP    +       ADD_OP       do                KEY_DO
<       LESSER_OP    -       SUB_OP       od                KEY_OD
>       GREATER_OP   *       MULT_OP      <variable name>   IDENT
=       EQUAL_OP     /       DIV_OP       <integer>         INT_LIT
<>      NEQUAL_OP    read    KEY_READ     (                 LEFT_PAREN
<=      LEQUAL_OP    write   KEY_WRITE    )                 RIGHT_PAREN
>=      GEQUAL_OP    while   KEY_WHILE    ;                 SEMICOLON
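The fixed lexemes in the table above (everything except variable names and integers) can be stored in a small lookup table. This is only a sketch: the struct TokenEntry, the array FIXED_TOKENS, and the function lookup_fixed_token are hypothetical names, and a linear search is just one simple option for a table this small.

```c
#include <stddef.h>
#include <string.h>

/* One row of the lexeme/token table. */
struct TokenEntry {
    const char *lexeme;
    const char *token;
};

/* Every lexeme with a fixed spelling; IDENT and INT_LIT are handled elsewhere. */
static const struct TokenEntry FIXED_TOKENS[] = {
    { ":=", "ASSIGN_OP" },  { "<",  "LESSER_OP" },  { ">",  "GREATER_OP" },
    { "=",  "EQUAL_OP" },   { "<>", "NEQUAL_OP" },  { "<=", "LEQUAL_OP" },
    { ">=", "GEQUAL_OP" },  { "+",  "ADD_OP" },     { "-",  "SUB_OP" },
    { "*",  "MULT_OP" },    { "/",  "DIV_OP" },     { "read",  "KEY_READ" },
    { "write", "KEY_WRITE" }, { "while", "KEY_WHILE" }, { "do", "KEY_DO" },
    { "od", "KEY_OD" },     { "(",  "LEFT_PAREN" }, { ")",  "RIGHT_PAREN" },
    { ";",  "SEMICOLON" }
};

/* Returns the token name for a fixed lexeme, or NULL if it is not in the table. */
const char *lookup_fixed_token(const char *lexeme)
{
    size_t count = sizeof FIXED_TOKENS / sizeof FIXED_TOKENS[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(lexeme, FIXED_TOKENS[i].lexeme) == 0)
            return FIXED_TOKENS[i].token;
    }
    return NULL;
}
```

A caller that gets NULL back would then decide between IDENT, INT_LIT, and UNKNOWN based on the characters in the lexeme.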
Additional Solution Rules
Your solution must conform to the following rules:
1) Your solution should treat spaces, tabs, and end-of-line characters as delimiters between
lexemes. It should ignore these characters rather than report them as lexemes, and it should
not require them to delimit lexemes of different types.
a. Example: “while i<=n do”
i. This line will generate 5 lexemes “while”, “i”, “<=”, “n”, and “do”.
ii. This means the space between “while” and “i” separated the two lexemes but wasn’t a
lexeme itself.
iii. This also means that no space is required between the lexemes “i”, “<=”, and “n”.
2) Your solution should print out “DanC Analyzer :: R<#>” on the first line of output. The double colon “::”
is required for correct grading of your submission.
3) Your solution must be tested to ensure compatibility with the GNU C/C++ compiler version 5.4.0.
4) Lexemes that do not match a known token should be reported as an “UNKNOWN” token. This should
not stop execution of your program or generate an error message.
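Rules 1 and 4 above can be combined in a single scanning routine. The sketch below is not the analyzer distributed in class: the names next_lexeme, keyword_or_ident, and MAX_LEXEME are hypothetical, it scans an in-memory string rather than a file, and it assumes that any character outside the grammar (including an uppercase letter or a lone ":") is reported as a one-character UNKNOWN lexeme.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_LEXEME 64 /* assumed cap on lexeme length; longer runs are truncated */

/* Maps a run of lowercase letters to its keyword token, or IDENT otherwise. */
static const char *keyword_or_ident(const char *word)
{
    if (strcmp(word, "read") == 0)  return "KEY_READ";
    if (strcmp(word, "write") == 0) return "KEY_WRITE";
    if (strcmp(word, "while") == 0) return "KEY_WHILE";
    if (strcmp(word, "do") == 0)    return "KEY_DO";
    if (strcmp(word, "od") == 0)    return "KEY_OD";
    return "IDENT";
}

/* Copies the next lexeme at *cursor into lexeme[], advances *cursor past it,
 * and returns its token name. Returns NULL once the input is exhausted. */
const char *next_lexeme(const char **cursor, char lexeme[MAX_LEXEME])
{
    static const char *const TWO_CHAR[][2] = {
        { ":=", "ASSIGN_OP" }, { "<>", "NEQUAL_OP" },
        { "<=", "LEQUAL_OP" }, { ">=", "GEQUAL_OP" }
    };
    static const char ONE_CHAR[] = "<>=+-*/();";
    static const char *const ONE_CHAR_TOKENS[] = {
        "LESSER_OP", "GREATER_OP", "EQUAL_OP", "ADD_OP", "SUB_OP",
        "MULT_OP", "DIV_OP", "LEFT_PAREN", "RIGHT_PAREN", "SEMICOLON"
    };
    const char *p = *cursor;
    const char *token = "UNKNOWN";
    size_t length = 0;

    /* Spaces, tabs, and newlines separate lexemes but are never reported. */
    while (isspace((unsigned char)*p))
        p++;
    if (*p == '\0') {
        *cursor = p;
        return NULL;
    }

    if (*p >= 'a' && *p <= 'z') {            /* keyword or identifier */
        for (; *p >= 'a' && *p <= 'z'; p++)
            if (length < MAX_LEXEME - 1) lexeme[length++] = *p;
        lexeme[length] = '\0';
        token = keyword_or_ident(lexeme);
    } else if (isdigit((unsigned char)*p)) { /* integer literal */
        for (; isdigit((unsigned char)*p); p++)
            if (length < MAX_LEXEME - 1) lexeme[length++] = *p;
        lexeme[length] = '\0';
        token = "INT_LIT";
    } else {                                 /* operator, punctuation, or unknown */
        const char *hit;
        size_t i;
        length = 1;
        /* Try two-character operators first so "<=" is not split into "<" and "=". */
        for (i = 0; i < sizeof TWO_CHAR / sizeof TWO_CHAR[0]; i++) {
            if (p[0] == TWO_CHAR[i][0][0] && p[1] == TWO_CHAR[i][0][1]) {
                token = TWO_CHAR[i][1];
                length = 2;
                break;
            }
        }
        hit = strchr(ONE_CHAR, p[0]);
        if (length == 1 && hit != NULL)
            token = ONE_CHAR_TOKENS[hit - ONE_CHAR];
        memcpy(lexeme, p, length);
        lexeme[length] = '\0';
        p += length;
    }
    *cursor = p;
    return token;
}
```

A driver would print the "DanC Analyzer :: R<#>" line first, then call next_lexeme in a loop and print each lexeme / token pair until it returns NULL. Because the scanner switches character classes on its own, "i<=n" splits into three lexemes without any spaces.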
Hints
1) Draw inspiration by looking at the lexical analyzer code discussed and distributed in class.
2) Start by focusing on writing the program in your usual C/C++ development environment.
3) Once your solution is correct, then work on testing it in Linux using the appropriate version of the GNU
compiler (gcc).
4) Linux/Makefile tutorials:
a. Linux Video walkthrough: http://www.depts.ttu.edu/hpcc/about/training.php#intro_linux
b. Linux Text walkthrough: http://www.ee.surrey.ac.uk/Teaching/Unix/
c. Makefile tutorial: https://www.tutorialspoint.com/makefile/index.htm
What to turn in to BlackBoard
A zip archive (.zip) containing the following files:
• <FirstName>_<LastName>_<R#>_Assignment3.c / <FirstName>_<LastName>_<R#>_Assignment3.cpp
o C/C++ Source code file
o Example: Eric_Rees_R123456_Assignment3.c
• Makefile
o A makefile for compiling your C/C++ file.
o This makefile must work in the HPCC environment to compile your source code file and output
an executable named danc_analyzer.
Example Execution
The example execution below was run on Quanah, one of the HPCC clusters. It shows all the commands used to
compile and execute my analyzer. Bolded text is from the Linux OS, red text shows the commands I typed
and executed, and blue text shows the output from each step.
quanah:/assignment_3$ make clean
rm -f danc_analyzer
quanah:/assignment_3$ make
gcc -o danc_analyzer Eric_Rees_R123456_Assignment3.c
quanah:/assignment_3$ ./danc_analyzer test.danc
DanC Analyzer :: R123456
f IDENT
:= ASSIGN_OP
1 INT_LIT
; SEMICOLON
i IDENT
:= ASSIGN_OP
1 INT_LIT
; SEMICOLON
read KEY_READ
( LEFT_PAREN
n IDENT
) RIGHT_PAREN
; SEMICOLON
while KEY_WHILE
i IDENT
<= LEQUAL_OP
n IDENT
do KEY_DO
f IDENT
:= ASSIGN_OP
f IDENT
* MULT_OP
i IDENT
; SEMICOLON
i IDENT
:= ASSIGN_OP
i IDENT
+ ADD_OP
1 INT_LIT
od KEY_OD
; SEMICOLON
CS3361 Programming Assignment Grading Rubric.pdf
CS3361 Programming Assignment Grading Rubric
The purpose of this document is to lay out the common criteria used to grade CS 3361 programming assignments. Each
criterion listed has four different levels of mastery, with a description of how a submission will attain each level and the
number of points awarded for achieving it.
Criteria
Program Specifications & Correctness
This is the most important criterion as any submitted program must function correctly and meet the specifications given.
Submitted programs should always behave as desired and produce correct output and results for a variety of inputs.
This criterion also includes the need to meet all specifications laid out in the problem statement by writing a program in
a particular way, using a particular language feature, or not using a particular language feature.
If you believe a specification to be ambiguous or unclear, you should consult with the instructor instead of making any
assumptions.
Readability
Code should be readable to both you as well as a knowledgeable third party. This involves:
• Using indentation consistently.
• Adding whitespace (blank lines or spaces) where appropriate to help distinguish parts of the program.
o Examples:
▪ Space after commas in a list
▪ Blank lines between functions or between blocks of related lines within a function.
• Using meaningful variable names. Variables with names like A, B, C, foo, or bar give the reader no information
regarding the variable’s purpose or what information it may contain. Names like maximum, counter or
inputString are considerably more useful and make their purpose known. Loop variable names are an exception
to this rule and names like x, y, i, or j are allowed for loop variables.
• Properly organizing code to increase readability and reusability. Code should be organized into functions so that
blocks of code that can be re-used are contained within functions that enable that behavior.
Documentation
Every file you submit that contains source code should start with the following header comments:
1) The name of the code’s author (you), R#, the date, and the assignment commented along the top.
2) A comment explaining the problem being solved.
Example:
## Eric Rees (R#123456) | Homework #1 | 09/01/2020
##
## This program accepts a temperature in Fahrenheit (floating point value) as input and outputs the
## integer form of that temperature in degrees Celsius.
##
All code in your program should also be well-commented. This requires that you strike a balance between:
1) Being overly verbose and commenting everything – which adds a great deal of unneeded noise to the code, and
2) Writing no comments in the code – which adds a great deal of unnecessary complexity to the code by not giving
any assistance to future readers
In general, you should aim to put a comment on any line of code that you might not understand yourself if you came
back to it in a month without having thought about it in the interim. Much like code organization and elegance,
appropriate commenting is a skill we will be learning as we write code this semester. As such, adherence to
documentation guidelines covered in class will be held to higher standards as the semester progresses.
Code Efficiency
There are many ways to write the same functionality into your code, and many of them may be poor choices. They may
be poor choices because they require more lines of code than are necessary, they take considerably longer to execute
than necessary, or they consume considerably more system resources (such as RAM) during execution than necessary.
Whenever possible, code should be concise, avoid overly complicated formulas, and use the most efficient
algorithms for solving the problem at hand.
Assignment Specifications
Programming assignments in this course will often contain specifications or requirements beyond those required to
solve the problem itself. These include, but are not limited to, writing certain information as comments in your code or
the name of the file(s) you submit.
Grading Rubric
Each criterion will make up an approximate percentage of the grade given to a programming assignment as indicated by the “%” column. Points will be assigned
for a criterion based on the guidelines listed below the “Excellent”, “Adequate”, “Poor”, and “Not Met” evaluations.
For example, an assignment that is marked as “Adequate” in the Program Specifications & Correctness criterion, “Poor” for readability and “Excellent” for
all other criteria would receive a score of:
(0.8 * 0.6) + (0.5 * 0.15) + (1 * .1) + (1 * .05) + (1 * .1) = .805 = 80.5%
Criterion | % | Excellent (100%) | Adequate (80%) | Poor (50%) | Not Met (0%)
Program Specifications & Correctness* | 60% | No errors, program always works correctly and meets the specification(s). | Minor details of the program specification are violated, program functions incorrectly for some inputs. | Significant details of the specification are violated, program often exhibits incorrect behavior. | Program only functions correctly in very limited cases or not at all.
Readability | 15% | No errors, code is clean, understandable, and well-organized. | Minor issues with consistent indentation, use of whitespace, variable naming, or general organization. | At least one major issue with indentation, whitespace, variable names, or organization. | Major problems with three or four of the readability subcategories.
Documentation | 10% | No errors, code is well-commented. | One or two places that could benefit from comments are missing them or the code is overly commented. | File header missing, complicated lines or sections of code uncommented or lacking meaningful comments. | No file header or comments present.
Code Efficiency | 5% | No errors, code uses the best approach in every case. | N/A | Code uses poorly-chosen approaches in at least one place. | Many things in the code could have been accomplished in an easier, faster, or otherwise better fashion.
Assignment Specifications | 10% | No errors. | N/A | Minor details of the assignment specification are violated, such as files named incorrectly or extra instructions slightly misunderstood. | Significant details of the specification are violated, such as extra instructions ignored or entirely misunderstood.
* As a special case, if a program does not meet the specifications at all or is entirely incorrect, no credit will be received for the other criteria either.
Adapted from Mark Liffiton’s Programming Rubric
makefile (1)
#I would strongly recommend NOT changing any lines below except the CC and MYFILE lines.
#Before running this file, run the command: module load gnu

EXECS=danc_analyzer

#Replace the g++ with gcc if using C
CC=g++

#Replace with the name of your C or C++ source code file.
MYFILE=Eric_Rees_R123456_Assignment3.cpp

all: ${EXECS}

${EXECS}: ${MYFILE}
	${CC} -o ${EXECS} ${MYFILE}

clean:
	rm -f ${EXECS}
test.danc
f:=1;
i:=1;
read(n);
while i <= n do
    f:=f*i;
    i:=i+1
od;