A1: Implementation of CYK Algorithm (you have studied in ‘Theory of Computation’).
Outcome: It gives the students exposure to implement dynamic programming based parser for Chomsky Normal Form (CNF) of context free grammars.
A2: Write a program to design Lexical Analyzer in C/C++ Language (to recognize any five keywords, identifiers, numbers, operators and punctuations).
Outcome: It gives the students a feel of how a lexical analyzer for a typical regular language is written.
A3: Write a program in LEX to recognize Floating Point Numbers
Outcome: It gives the students an idea about how to specify regular expressions for covering all possible formats of floating point numbers.
A4: Write a program in LEX to recognize different tokens: Keywords, Identifiers, Constants, Operators and Punctuation symbols.
Outcome: It gives the students, a knowledge of using LEX tool to recognize all kinds of tokens of C language.
A5: Write a LEX program that copies file, replacing each nonempty sequence of white spaces by a single blank.
Outcome: It makes the students to know how a LEX tool can be used in general to process the text.
A6: Write a LEX program that converts a file to "Pig Latin". Specifically, assume the file is a sequence of words (groups of letters) separated by whitespace. Every time you encounter a word:
- If the first letter is a constant, move it to the end of the word and then add ay.
- If the first letter is a vowel, just add ay to the end of the word. All non-letters are copied intact to the output.
Outcome: It makes the students to know how a LEX tool may be used to process the content of files.
A7: Write a LEX program to recognize the following tokens over the alphabets {0,1,..,9}
- The set of all string ending in 00.
- The set of all strings with three consecutive 222's.
- The set of all string such that every block of five consecutive symbols contains at least two 5's.
- The set of all strings beginning with a 1 which, interpreted as the binary representation of an integer, is congruent to zero modulo 5.
- The set of all strings such that the 10th symbol from the right end is 1.
- The set of all four digits numbers whose sum is 9
- The set of all four digital numbers, whose individual digits are in ascending order from left to right.
Outcome: It will provide students with how to write regular expressions for different patterns.
A8: “The auxiliary procedure part of a LEX program can be complied separately and loaded with lexical analyzer”. Test and show that, the given statement is valid.
Outcome: This assignment provides the insight of how a LEX compiler works and how to manage the lengthy code written in LEX language.
A9: Implement Question A7 using JFLEX: http://jflex.de/manual.html
Outcome: Students can learn and understand how JFLEX tool works and it is very important for them as Java is one of the programming languages which is commonly used in IT industry.
A10: Suppose an image is encodes as an n x m matrix M of ''light intensifies.'' Mij is a number from 0 to 15, with 0=black and 15=white. M may be stored row by row, with rows terminated by newline (n) characters. Call this string vM. For example, if M is
0 1 4 7
1 8 8 6
then vM is 0026 n 0147 n 1886. We can also encode M by its differences along rows, dM. For Example, for M above dM is 0+2+4 n +1+3+3 n+70-2. If we replace positive numbers by + and negative numbers by - in dM we get the sequence of changes, d'M. In this case, d'M=0++ n +++ n +0-. Suppose a ‘feature’ is defined to be a nonempty sequence of increasing value of intensity, followed by zero to three unchanging value of intensity followed by at least one decreasing value of intensity, all on one row.
- Write a LEX program to find maximal features in d'M and surround them by parentheses. A maximal feature is not a proper substring of any other feature.
- Display the longest maximal feature of each row.
Outcome: Students can learn how to apply the knowledge about various components of a compiler in other domains as well. Here it is shown how a lexical analysis process can be used to solve an image processing problem.
Back Tracking | Trying with other alternative | |
---|---|---|
Brute Force Method | ✓ | ✓ |
Recursive Descent without Backtracking | ✕ | ✓ |
LL(1) | ✕ | ✕ |
The First three programs (B1, B2 and B3) gives the feeling of understanding ‘Backtracking’ and ‘Trying with other alternatives’. To understand this see the video given for Question-B2 and the detailed output given for Question B3.
B1: Write a program to implement
- Recursive Descent Parsing with back tracking (Brute Force Method)
S → cAd
A → ab / a - Recursive Descent Parsing with back tracking (Brute Force Method)
S → cAd
A → a / ab
Outcome: This assignment gives students an insight of how a typical parser is developed and how it works. Further, with the help of this assignment they can understand the limitation of Brute Force Method of parsing practically.
B2: Write a program to implement: Recursive Descent Parsing with back tracking (Brute Force Method)
- S → aaSaa | aa
- S → aaaSaaa | aa
- S → aaaaSaaaa | aa
- S → aaaSaaa | aSa | aa
Outcome: This assignment also provides the implementation of Recursive-Descent parser however it uses a different context-free grammar with more number of productions compared to earlier assignment B1. Here the students can understand the difference between ‘Backtracking’ and ‘Trying with other alternatives in a production’.
B3: Write a program to implement: Recursive Descent Parsing without back tracking.
Give with detailed output, to show that it is working with other alternatives but no backtracking.
Outcome: This assignment provides the insight of how a parser works without backtracking and without trying with other alternatives in a production.
B4: Write a program to implement
- Write a program to find FIRST
- Write a program to find FOLLOW
- Implementation of LL(1) Parsing Table
Outcome: This assignment provides implementation of computing two important sets First and Follow, and also the implementation of LL (1) parser. These functions are very important for developing not only LL (1) parser but also for LR (1) parsers, that are used extensively in practical compilers.
B5: An RR(1) parser is similar to an LL(1) parser except that it works from right-to-left, tracing out a top-down, rightmost derivation. The parser continuously consults the rightmost symbol of the derivation and the rightmost symbol of the input string that has not yet been matched to make its decisions about what production to use. The construction and operation of this parser is analogous to the LL(1) case, except that instead of having FIRST and FOLLOW sets we have LAST and PRECEDE sets, where LAST(A) is the set of terminals that could come at the end of a production of A (plus ε if A can derive the empty string), and PRECEDE(A) is the set of the terminals that could come before a production of A in some sentential form. Similarly, instead of having an end-of-input marker, we have a beginning-of-input marker, which we denote $. Write a program to implement RR(1) parsing.
Outcome: Here RR(1) parser is implemented which is very similar to LL(1) parser but works from right-to-left rather than the traditional way of using left-to-right. The RR(1) parser is not commonly used in practice, however it will help students to understand and differentiate the two scanning approaches – left-to-right and right-to-left. Also they will be able to implement two more functions - Last and Precede.
B6: Write a program to implement operator precedence parsing. (Precedence table construction and parsing from precedence table)
Outcome: This assignment covers one more parsing method - Operator-precedence parsing which is a bottom-up approach and very useful for parsing the construct – expression. The expression is a very common syntactic construct present in almost all of the programming languages.
B7: Write a program to implement Simple Precedence. (Precedence table construction and parsing from precedence table)
Outcome: This assignment is a kind of extension parsing w.r.t. B6. Here in addition to terminals, the precedence relations among both Terminals and Non Terminals of the grammar, are used for getting the parsing actions.
B8: Write a program to design SLR parsing.
Outcome: The last three assignments in this set are very much related to each other as these provide the solutions for all three approaches – SLR(1), CLR(1) and LALR(1) parsing. Students will get to know that how to obtain LR(0) items, FIRST, FOLLOW, Parsing Table Construction and usage of the table for parsing a given input.
B9: Write a program to design CLR parsing.
Outcome: This assignment is a variant of LR(1) parsing. Students can learn about how to obtain LR(1) items, FIRST, FOLLOW, Parsing Table Construction and usage of the table for parsing a given input using CLR parser.
B10: Write a program to design LALR parsing.
Outcome: This assignment is a variant of LR(1) parsing. Students can learn about how to obtain LR(1) items, grouping of LR(1) items with common core, FIRST, FOLLOW, Parsing Table Construction and usage of the table for parsing a given input using LALR parser.
C1: Write a program to design LALR parsing using YACC.
Outcome: This assignment is same as given in the assignment B10 but here the YACC tool is used. Students find this assignment very useful as they can feel the power of the YACC tool and how it reduces the efforts of writing a parser.
C2: Use YACC to Convert Binary to Decimal (including fractional numbers).
Outcome: This assignment shows how a problem which is not directly related to the Compiler can also be solved conveniently using the YACC tool.
C3: Use YACC to implement, evaluator for arithmetic expressions (Desktop calculator).
Outcome: This assignment gives how an expression is parsed and how syntax-directed translation is carried out on the same. It is found to be a very useful problem as an expression is a very common construct in programming languages.
C4: Use YACC to convert: Infix expression to Postfix expression.
Outcome: The students can learn about the conversion of infix to postfix using YACC tool. With this knowledge they are ready for other conversion i.e. infix to prefix form also.
C5: Use YACC to generate Syntax tree for a given expression.
Outcome: The students can obtain syntax tree for a given sequence of statements based on the grammar. The syntax tree is one of the intermediate form used in the development of a typical compiler. It is very much similar to the parse tree and therefore students can correlate both these tree structures.
C6: Use YACC to generate 3-Address code for a given expression.
Outcome: The students can obtain intermediate code: three-address code, which is commonly used intermediate code in practical implementation of compilers.
C7: Use YACC to generate the following 3-Address code which contains Arrays.
Outcome: This assignment is an extension to the previous assignment C6 by including the array construct along with the expressions. With this assignment the students are able to handle the arrays also while generating three-address code.
C8: Construct a Syntax-Directed Translation scheme that takes strings of a’s, b’s and c’s as input and produces as output the number of substrings in the input string that correspond to the pattern a(a|b)*c+(a|b)*b. For example the translation of the input string “abbcabcababc” is “3”.
- A context-free grammar that generates all strings of a’s, b’s and c’s
- Semantic attributes for the grammar symbols
- For each production of the grammar a set of rules for evaluation of the semantic attributes.
Outcome: This assignment deals with the problem of string and pattern matching. It is shown here how this kind of problem can also be solved using the method of syntax-directed translation.
C9: YACC to generate DAG for expression grammar.
Outcome: With the help of this assignment students can know about, how DAG (Directed Acyclic Graph) construct is used and generated practically and further enhance their knowledge and practical skills for writing an optimized compiler.
C10: Write a YACC “desk calculator” program that will evaluate Boolean expressions, consists of ¬, ∧, ∨ → and ↔ where, ¬ is highest precedence and ↔ is with lowest precedence.
INPUT : ¬T ∧ F → (T ∨ F) OUTPUT : T
Outcome: This assignment gives how a Boolean expression is parsed and how syntax-directed translation is carried out on the same.