What is lexical, syntactic, and semantic analysis?
In lexical analysis, we recognize the tokens of a language (the "words") by reading a stream of characters from left to right and grouping them into tokens. Syntactic analysis is used to recognize the structure of the language constructs, and the result is usually a tree showing their relationships. Semantic analysis determines if the constructs make sense in the language.
Why are context-free grammars important for languages?
CFGs can be used to express the syntax of most languages. They can then be translated into PDAs, and then to tables than can be used by the compiler to recognize a program expressed in the language. In addition, it is easy to extend a language that has a CFG.
What can't context-free grammars capture (with respect to programming language syntax)?
CFGs can not recognize languages such as L = {wcw | w is in
(0+1)*}, which corresponds to identifiers being declared before being
used. Likewise, they can not capture L =
{anbmcndm | n,m
1}, which can be
interpreted as procedure declarations a and b with parameters n and m, and
procedure calls of c and d with the same number of parameters.
What is a symbol table, and why do we use them in compilers?
A symbol table holds information related to the various identifiers in a program. Information may be collected for attributes such as the storage allocated, its type, its scope, and in the case of procedures, the number, type, and calling method of the parameters. This information is gathered during the early phases of compilation and passed on to the later phases, such as semantic analysis and optimization.
The main reason is that symbol information is not context free, and thus can not be determined using simple parsers. A secondary reason is that there is a lack of locality in that the location where a symbol is defined is rarely where it is used.
What is the problem that separate compilation causes for type checking?
If one has separate compilation, then type information needs to be kept until link time to verify that parameter types to functions match, as well as the types of external variables. Early compilers for Fortran and Pascal did no such checking between compilable units.
Describe some of the limitations that a one-pass compiler forces upon a language.
A one-pass compiler requires that variables and functions be defined before their use, which separates the code in an unnatural way and results in programs that are read "bottom up" (main declared last, e.g.).
What is "backpatching"?
Some languages, such as PL/I and Algol 68 allow variables to be used before they are declared, and most languages allow goto statements to jump forward in the code. Such constructs make it very difficult to implement a one-pass compiler. In some cases, however, it is possible to leave a blank slot that will be filled in later when the necessary information becomes available.
Outline the structure of a compiler.
|
|
What is parsing?
Parsing, in the context of compilers, is the recognition of a valid program as well as the construction of a parse tree from the input stream of tokens.
What is the difference between bottom-up and top-down parsing?
In bottom-up parsing, one starts with the input and tries to reduce it to the start symbol of the language. In top-down parsing, one begins with the start symbol of the language and tries to find a derivation for the input.
Why would you want to avoid left recursion in a top-down parser?
Left recursion occurs when a rule has its nonterminal on the left side
appearing as the first nonterminal on the right side. It is possible to get into
an infinite loop, in which the parser continually calls the recursive procedure
because the input does not change. An example production with left recursion
would be expr
expr +
term. All left recursion can be eliminated from a grammar by producing an
equivalent right recursive grammar.
How do LL and LR parsers recognize languages?
An LL parser is predictive -- assumes the input is an instance of the language and then attempts to predict the major structures of the instance based on the next few characters (as read left to right). Then it attempts to predict the substructures. (Predictive parsing is a form of recursive-descent parsing in which the lookahead symbol unambiguously determines the procedure selected for the non-terminal.)
An LR parser reads and stacks characters until it recognizes the right hand side of a production on the top of the stack. It then reduces the stack by replacing the characters that it recognized with the left hand side of the production. When the parser runs out of input, the only thing left on the stack should be the start symbol.
What are the relative benefits of predictive and reductive parsers?
A predictive parser is usually easier to write by hand, and the resulting tables are usually smaller. The grammar for the language must be able to disambiguate between productions based on the lookahead symbol. A reductive parsers can efficiently recognize a larger class of languages, and are usually better at handling errors. Also, some languages naturally lend themselves to one method or the other.
What is the difference between LL, LR, LALR, and SLR parsing?
An LL parser is a top-down parser that scans the input from left to right, and creates a leftmost derivation of the parse tree. An LL(1) grammar restricts the language such that each rule with the same left-hand nonterminal has a different terminal or nonterminal as the first symbol on the right-hand side. We need this to be able to determine the next structure based on only one character of input.
An LR parser is a bottom-up parser that shifts input symbols on to a stack until the topmost symbols match the right hand side of a production, at which time the symbols are reduced to the left hand side of the production. The result is a rightmost derivation.
SLR, LALR, and LR parsers are variants of shift-reduce parsers that shift whenever possible, and differ in the method that they resolve shift-reduce conflicts. LR parsers are the most powerful set of non-backtracking parsers that there are, but the resulting tables can get quite large. An SLR parser has a more simplified conflict resolution mechanism, but can not be used for some grammars. A compromise between the two is an LALR parser, in which the number of states is the same as that of an SLR parser, but the reduction rule is a little stronger. (This is the most common, and is the algorithm used by YACC.)
List the following in order of increasing power: LALR(1), LL(1), LR(1), SLR(1), LR(0)
Warning: this problem's answer could be wrong.
The first letter means "left-to-right", the second letter means leftmost or rightmost derivation, and the integer tells the number of lookahead symbols. In general, LL are the weaker than all the LR variants, and of those, SLR is weakest, LALR is next, and LR is the most powerful. The greater the number of lookahead symbols, the more powerful the parser.
LL(1), LR(0), SLR(1), LALR(1), LR(1)
What is a shift-reduce conflict?
A shift reduce conflict occurs in an LR parser when, given a particular state, it can not decide whether to shift another input symbol on to the stack or to reduce the symbols currently on the stack.
What is a parse tree?
It is a derivation of valid string of a language using its production rules.
The internal nodes are all non-terminals in the language, and the leaves are
terminals (tokens). The ordering of the leaves is such that if A is a
parent node, and X1, X2, and
X3 are children, then A
X1X2X3 is a production in the grammar. It
shows how the input stream of tokens should be grouped to form "phrases" of the
language.
What is an abstract syntax tree, and why are they used?
It is a tree in which each internal node is an operator, and the children are the operands. It is a syntax-independent way of representing the operations of a language. (It's basically a compressed version of a parse tree.) ASTs allow for some independence between the front and back ends, and provide an easy way to do manipulation or optimization of the program.
What's the difference between an lvalue and an rvalue?
An lvalue is a part of an expression that represents an location, and a rvalue is an expression that yields a value. If we think of an assignment statement, the left side is really an address, and the right is a value.
(Spring 1994) Describe 5 compiler optimizations that save time, and 2 that save space.
To save time:To save space:
What is dataflow analysis?
Dataflow analysis is a static analysis of the possible uses of variables by a program when it executes. This information is useful for determining when certain optimizations can be performed.
What machines are used to perform lexical analysis and syntactic analysis?
A finite state machine can be used for lexical analysis and a push-down automaton for syntactic analysis. We need the power of a PDA to verify context-sensitive statements such as "every procedure invocation must have the same number of parameters as the procedure definition". In general, the rules for syntax are generally recursive, which can not be captured by a finite state machine.
(Spring 1995) Suppose you want to write a compiler that compiles language L. Eventually, you might want that compiler to be written in language L, and to be able to compile itself. Why? Describe the "bootstrapping" problem that results.
Being able to write a compiler for a language in the language that it compiles is something of a "fire test" for the language. Compilers are a good way to find the ambiguities of a language, and they are nontrivial endeavors that exercise the language's capabilities. Of course, this is something of a chicken-and-the-egg problem since one can not compile the source code for the compiler if no such compiler yet exists.
What is one common way that compiler writers deal with problem (assume this is the first compiler that has ever been written for language L)?
Typically, a very small subset of the language is implemented into a compiler by hand, and then the compiler code is changed to allow a new feature. The old compiler then compiles the new code, and then the resulting compiler has the new feature.
How many versions of the compiler are necessary before the compiler is stable, i.e. it can compile itself and produce the exact binary that is being run to do the compiling?
After the last change to the source, one would need to compile the compiler to add the new change to the binary, and then compile the source again to create a compiler that was made using the latest change.
Consider, for example, an optimization. First we edit the source to allow a certain optimization. Then we compile it to produce an unoptimized compiler that can do the new optimization. Finally, we compile the compiler again to create an optimized compiler that has the new optimization.
(Spring 1994) What is rate monotonic scheduling, and in what sense is it optimal?
Rate monotonic scheduling is a method of scheduling tasks whose requests for service occur at standard intervals. The priorities are given such that the tasks with the highest rate of request have higher priority, and if a request occurs with a higher priority than the current task, the current task is preempted. It has been shown that the algorithm is optimal in the sense that there is no other fixed priority assignment rule that can schedule a task set that it can not.
For a set of m tasks with a fixed priority order that all meet their deadlines, rate monotonic scheduling provides a least upper bound on the processor utilization of U = m(21/m-1), which is about 70% for large m.
How does one determine the utilization of a CPU?
For a set of n processes with running times Ci and period of Ti, the utilization is the sum over all the processes of Ci/Ti.
(Spring 1994) What is the deadline driven scheduling algorithm, and in what sense is it optimal?
Earliest deadline first is a dynamic scheduling algorithm in which priority is given to the tasks whose deadlines are nearest. It is optimal in the sense that if any set of tasks can be scheduled by any algorithm, the deadline driven scheduling algorithm can do it.
What is the worst case request scenario?
A task requests service at exactly the same time that all tasks of higher priority request service. (This is called the critical instant, the instant in which the request for a task will have the longest response time.)
Explain the concept behind Lamport's logical clocks.
Lamport's logical clocks attempt to mathematically model the general concept that people have about one event "happening before" another. Because of the variation in physical clocks, he formalizes the concept with respect to a logical view of time. Since it is often impossible in a distributed system to say which of two event happened first, the "happened before" relation only provides a partial ordering.
What are the benefits of logical clocks?
Logical clocks are a useful way for implementing a distributed system. They can be used to solve a variety of synchronization problems, from mutual exclusion to correcting discrepancies between physical clocks.
What do Lamport's logical clocks not do?
They do not provide a total ordering for events, although they can be extended to do so by breaking ties based on PID. They also assume that no lost messages occur, that all processes know each other, and that processes in the distributed system never fail.
What is the "happened-before" relation?
The happened-before relation, written as "
", is defined
by three conditions:
b.
b.
b, b
c,
then a
c. When are two events concurrent?
Two events A and B are concurrent when A
B and B
A.
What is the clock condition?
For any events a, b: if a
b, then
C<a> < C<b>, where C<x> is the clock value of event x.
What conditions must hold for the clock condition to be true?
What implementation methodology will guarantee that the clock condition holds?
How can fully distributed mutual exclusion be implemented using logical clocks?
When a process wishes to enter its critical section, it sends a request containing its PID and timestamp to all the other processes. It then only enters the critical section if it receives a reply from the other processes. If a process receives a request, one of three actions are taken:Distributed mutual exclusion using Lamport's logical clocks is fully distributed, with no central coordinator. The drawbacks are that there is high message overhead, and it assumes reliable message passing, that all processes know about each other, that no processes ever crash.
Explain the apparently anomalous behavior of a system using logical clocks, as well as the process of correcting it.
An example of the anomalous behavior occurs on a widely distributed system in which a person at site A issues a request to site B, and then telephones a friend at site B and asks him to issue another request locally. It is very likely that the second request will actually get executed before the first, even though it was issued later. The problem is that communication happened outside of the computer system, in the form of the telephone call. One way to fix this is to timestamp every form of communication, including the telephone call.
(Winter 1999) Give an example of where the logical time and physical time would not have the same order.
(Winter 1999) Does Lamport's solution work for multicast?
Two multicast messages may be received in different orders by different processors.
What were the design goals of Haskell?
Expressive power and improved verifiability (same as other functional languages).
What are some of the features of Haskell?
What two things are decoupled in Haskell that were coupled in most other languages of its time period?
Functions and parameter types were decoupled -- a programmer can write a function without worrying about its types, and the compiler will infer them.
Explain Haskell's method of function definition.
Haskell uses pattern matching to compare a set of patterns against the actual parameters. If a match succeeds, the associated expression is evaluated, and if no match is found a runtime error results.
What kind of typing does Haskell use?
Static, with type inferencing and polymorphic types, as well as a strong type system.
What value do all Haskell types have?
, "bottom",
which is returned on an error.
What two kinds of polymorphism are used in Haskell?
Ad hoc, in the form of overloading, and parametric.
How do type classes relate to polymorphism?
Type classes provide a structured way to control ad hoc polymorphism by allowing types to be declared as instances of a particular class, and to allow definitions of overloaded operations to be specified for that class.
How are Haskell's type classes different from traditional OOP?
Types are not objects, and do not have first class status. They also have no notion of internal state. An advantage over OOP is that methods in Haskell are completely type-safe, since methods are not looked up but rather passed as higher-order functions.
How does Haskell handle coercion?
There is no implicit coercion. Numerals are overloaded and may require explicit type coercion.
How does Haskell avoid the name clashes that can occur with multiple inheritance?
A particular operation can be a member of at most one class in any given scope.
According to Wegner, what were the four most important milestones of the first 25 years of programming languages?
What would you consider to be the most important developments of programming languages in recent years?
Object-oriented programming provides a decent method for attacking the complexity of programming. Parallel languages will become more important as our computational needs grow faster than computer chips. Maybe "very high-level languages" like the "non-deterministic" Unity will become popular.
Last changed January 30, 1997. David Coppit, david@coppit.org