CMSC330 - Advanced Programming Languages

Week1 Discussion

What programming languages have you used? What is your favorite?

How familiar are you with attribute grammars, syntax analyzers, and semantic descriptions of programming languages?

Describe with specific examples, how an attribute grammar can be used to describe the structure of a programming language.

Describe the function of a lexical analyzer vs. a syntax analyzer and some of the different parsing algorithms available to implement a syntax analyzer.

What programming languages have you used? What is your favorite?

I have learned C, C++, Java, and CASL II in schools. In my job, I have used PHP, Ruby, and SQL.

Among them, my favorite language is Java because it has many of object-oriented features, doesn't depend on OS, and behaves the same way on all platforms.

How familiar are you with attribute grammars, syntax analyzers, and semantic descriptions of programming languages?

Honestly, I haven't heard these terms before. If I took a Compiler Construction class in previous school, I would be more familiar with them.

Describe with specific examples, how an attribute grammar can be used to describe the structure of a programming language.

An attribute grammer is used for giving semantics to context-free grammars by adding attributes to nonterminal symbols of the grammars. It checks the correctness of the static semantics rules of a program.

An example is shown in Sebesta (2012) Chapter 3 Example 3.6 as follows.

Derivation:
    1. <assign> -> <var> = <expr>
    2.          -> A = <expr>
    3.          -> A = <var>[2] + <var>[3]
    4.          -> A = A + B

Evaluation of attributes:
    1. <var>.actual_type <- look-up(A)
    2. <expr>.expected_type <- <var>.actual_type
    3. <var>[2].actual_type <- look-up(A)
       <var>[3].actual_type <- look-up(B)
    4. <expr>.actual_type <- either int or real
    5. <expr>.expected_type == <expr>.actual_type is either TRUE or FALSE

Attributes:
    1. <var>.actual_type = real_type
    2. <expr>.expected_type = real_type
    3. <var>[2].actual_type = real_type
       <var>[3].actual_type = int_type
    4. <expr>.actual_type = real_type

Describe the function of a lexical analyzer vs. a syntax analyzer and some of the different parsing algorithms available to implement a syntax analyzer.

A lexical analyser is a morphological alanyser which extracts each token from sentences. A word or a number can be expressed by a regular expression.

A syntax analyzer, or a parser is a program which inputs a sentence according to the grammar of the sentence and translated the sentence in to the parse tree, or the derivation tree. Some parsers output the intermediate expression of the program or execute the program directry whithout outputing the parse tree.

Week1 Homework1 (6 points)

(Adapted from Sebasta (2012) Chapter 3 Problem 7)

Using the grammar in text Example 3.4 (with terminals X, Y, and Y vs. A, B, and C), show a leftmost derivation and a parse tree for the following statement:

X = (X * Y) + ((X + Y) * Z)

fileCMSC330_Homework1.pdf

Week2 Writing Recursive Descent Parsing Methods

Choose one of the following productions and write a Java method that will parse that production.

<program> ->
   <statement_list>.

<statement_list> ->
   <statement> |
   <statement> ; <statement_list>

<statement> ->
   <assignment_statement> |
   <write_statement> |
   <read_statement> |
   <if_statement> |
   <while_statement>

<assignment_statement> ->
   ASSIGN VARIABLE TO <expression>

<write_statement> ->
   WRITE <expression>

<read_statement> ->
   READ VARIABLE

<if_statement> ->
   IF <expression> THEN <statement_list> [ELSE <statement_list>]

<while_statement> ->
   WHILE <expression> DO <statement_list>

<expression> ->
   VARIABLE |
   LITERAL

Use the examples in section V-B of Module 1 as a guide.

<assignment_statement> ->
   ASSIGN VARIABLE TO <expression>
boolean assignmentStatement() {
   if (token == ASSIGN) {
      token = getNextToken();
      if (token == VARIABLE) {
         token == getNextToken();
         if (token == TO) {
         token == getNextToken();
            if (expression()) {
            return true;
            }
         }
      }
   }
   return false;
}

Week2 Grammar Restrictions for Recursive Descent Parsers

There are two restrictions on the type of grammars that can be used with a recursive descent parser. The first is that the grammar cannot have any left recursive productions. Give an example of a left recursive production and explain why such productions would be a problem.

The second restriction is that the grammar must not require more than one token look ahead. Give an example of a production that does not have this property. Explain why this restriction is necessary for recursive descent parsing.

The first is that the grammar cannot have any left recursive productions. Give an example of a left recursive production and explain why such productions would be a problem.

A recursive descent parser is a kind of top-down parser. LL parsing is used for it, which parses the input from left to right, and uses a leftmost derivation. The parser cannot parse a grammar that contains left recursion because the grammar will go into an infinite loop.

For example, a simple LL(1) parser written in C is as follows.

int term () 
{
    int d;
    token();                 /* look ahead a token */
    switch(last_token) {
    case '0':                /* if last token is a number */
        d = value;           /* put the number to d */
        token();             /* look ahead a token */
        return d;            /* return d */
    }
}

By using term() above, a multiplying function is as below.

int mexpr()
{
    int d;
    d = term();             /* calculate term */
    switch(last_token) {    /* if last token */
    case '*':               /* is '*' */
        d *= mexpr();       /* recurse mexpr() and the result multiply d */
        return d;           /* return d */
        
    }
}

The grammar above can be expressed as "<mexpr> -> term * mexpr" in CFG; however, in case of "<mexpr> -> mexpr * term", the grammar will go into an infinite loop as following the function.

int mexpr()
{
    int d;                 
    d = mexpr();            /* calculate mexpr. This is a problem. */
    switch(last_token) {    /* if last token */
        case '*':               /* is '*' */
        d *= term();        /* recurse mexpr() and the result multiply d */
        return d;           /* return d */
        
    }
}

The second restriction is that the grammar must not require more than one token look ahead. Give an example of a production that does not have this property. Explain why this restriction is necessary for recursive descent parsing.

The grammar requires only one token look ahead to prevent from backtracking that lowers execution efficiency. You can avoid backtracking by changing expression (a) to (b) as follows.

(a)

-> if exp then state |
                  if exp then state else state

(b)

-> if exp then state SD
-> else state | ε

LL(k) parsing (for k > 1) which requires two or more tokens look ahead, but the method is impractical because of inefficiency.

Week2 Homework2 (6 points)

(Adapted from Sebasta (2012) Chapter 4 Problem 7)

Show a complete parse, including the parse stack contents, input string, and action for the string id + (id * id), using the grammar and parse table in text Section 4.5.3.

fileCMSC330_Homework2.pdf

Week3 Examining the Scope Rules of Java

The scope rules of modern programming languages are rooted in rules developed by the earliest block structured languages like Algol. Adding object orientation to languages adds one more layer of complexity to this issue.

Let's consider Java, a language with which everyone should be familiar, and explore its scope rules. One aspect of the scope rules of any language is when data or methods can have the same name. Give examples in Java to illustrate a case where declaring two local variables with the same name in same method is permitted and one where it is prohibited. Are the rules governing redeclaration of local names the same as those governing redeclaring local names that rename class-level names?

One consequence of scope rules in most languages is that forward references (referring to names before they are declared) are prohibited. Are such forward references always prohibited within a single Java class? Are forward references of local names always prohibited within a single method? If not, provide an example that demonstrates your claim.

Here is my code.

public class ScopeRulesTest {

    public static void main(String[] args) {
        ScopeRulesTest scopeRulesTest = new ScopeRulesTest();
        scopeRulesTest.legalSameNameVariables(false);
        scopeRulesTest.legalForwardReferences();
        scopeRulesTest.illegalForwardReferences();
    }
    
    /** Two local variables with same name exit in same method. **/

    public void legalSameNameVariables(boolean b) {
        boolean isTrue = b;
        if (isTrue == true) {
            int x = 1; // Adds 1 to local variable x.
            System.out.println("x is " + x);
        } else {
            int x = 0; // Adds 0 to local variable x.
            System.out.println("x is " + x);
        }
    }
    
    /** Forward references are not always illegal. **/
    
    public void legalForwardReferences() {
        int x = y; // Adds global variable y to local variable x.
        System.out.println("x is " + x);
    }
    private static int y = 1; // Defines static global variable y.
    
    public void illegalForwardReferences() {
        int y = x; // x is unresolved. You get a compile error.
        System.out.println("y is " + y);
    }
    
}

Week3 Pointers, References, and Object Allocation Strategies

The trend in newer languages is to replace explicit pointers with implicit references. Which languages take this approach? Take a position on whether is this is a good language design strategy or not. Support your argument with an explanation that discusses the advantages and possible limitations of the alternate approaches.

Another related issue is whether the language allows objects to be allocated on the stack or only on the heap. Among C++, Ada, Java and C#, which languages give the programmer the freedom to choose? Is allowing the programmer this freedom a good language design strategy or not? Take a position and defend it.

The trend in newer languages is to replace explicit pointers with implicit references. Which languages take this approach? Take a position on whether is this is a good language design strategy or not. Support your argument with an explanation that discusses the advantages and possible limitations of the alternate approaches.

C/C++ and Ada can have pointer variables, which are called access types in Ada. Actually, by using pointers, programmers can pursue high execution efficiency and fast performance. Therefore, explicit pointers are very powerful especially for embedded system development and game programming, which programmers have to develop with limited resources or keep high frame rates. However, it is very difficult for many programmers especially for beginners to understand the concept of pointers. In addition, it can increase the risk of generating bugs and security holes such as buffer overflows.

On the other hand, Java, for example, doesn't have pointer types because Java VM completely manages memory allocation instead.

From a perspective of program readability, I think that all programmers does not necessarily need pointers except for some people.

Another related issue is whether the language allows objects to be allocated on the stack or only on the heap. Among C++, Ada, Java and C#, which languages give the programmer the freedom to choose? Is allowing the programmer this freedom a good language design strategy or not? Take a position and defend it.

Java and C# support garbage collection, which automatically frees dynamically allocated memory on the heap. When you create a instance of an object by using new operator, memory is automatically allocated on the heap.

On the other hand, C++ and Ada don't support garbage collection. Actually, in C++, you can use new operator, but you explicitly have to free memory by using delete operator which calls destructor.

Garbage collection is very convenient and it allows programmers to develop with little awareness of memory management; however, if you create and destroy instances repeatedly, execution efficiency might decrease due to overheads. Also, there are risks of memory fragmentation and memory leak on the heap.

Therefore, using both stack and heap flexibly is important.

In C++ and C#, you can make structures by using struct operator and make them behave like class. One big difference between the structure and the class is that when you create a structure, memory is allocated on the stack.

In conclusion, from a perspective of freedom to choose, I think C++ is the best because you can free memory yourself and store object data itself on the stack if needed.

Week3 Homework3 (6 points)

1. Scope and lifetime are distinct yet related issues in programming languages. Languages can sometimes make design decisions that cause a conflict between the scope and the lifetime of variables. Java's decision to allow classes to be defined inside a method illustrates this conflict. Consider the following example:

class AnonymousInnerClassInMethod
{
    public static void main(String[] args)
    {
        int local = 1;
        
        Comparable compare = new Comparable ()
            {
            public int compareTo(Object value)
                {
                    return (Integer)value - local;
                }
            };
        System.out.println(compare.compareTo(5)); 
    }
}

Why does this code fail to compile? What could it be modified so it would compile? Explain the reason behind this failure in terms of scope and lifetime.

2. C++ allows pointers to stack-dynamic variables. Consider the following C++ function:

int twice(int x)
{
    int *y;
    *y = x * 2;
    return *y;
}

Will the above function compile in C++? Is it correct? If not, under what circumstances will it fail and how should it be corrected? Consider one other language that has pointers. Does that language have the same problem? Explain.

fileCMSC330_Homework3.pdf

Week4 Operand Evaluation Order

Although Java has the rule that the left operand of every binary operator is evaluated before the right operand, most languages give the compiler the freedom to choose which operand is evaluated first. When expressions have side effects, the value of the expression can be different depending upon which order is used. Give an example in C++ of an expression whose value depends upon the evaluation order. Show the orders that produce different values and the values they produce. Explain what side effect is the expression contains.

Week4 Should Indentation Have Semantic Significance?

Most modern programming languages are free format. Aside from acting as token separators, whitespace, which includes spaces, tabs and new lines, has no effect on the meaning of the program. Python is an exception. In Python, the way a program is indented affects its meaning. Do some investigating and report how the formatting of a Python program affects its meaning. Take a position as to whether Python's approach is a good language design or not. Support your position by illustrating the potential problems with the other approach.

Week4 Project1 (25 points)

The first programming project involves completing a program that parse, using recursive descent, a GUI definition language defined in an input file and generates the GUI that it defines. The grammar for this language is defined below:

gui ::=
    Window STRING '(' NUMBER ',' NUMBER ')' layout widgets End '.'

layout ::=
    Layout layout_type ':'

layout_type ::=
    Flow |
    Grid '(' NUMBER ',' NUMBER [',' NUMBER ',' NUMBER] ')'

widgets ::=
    widget widgets |
    widget

widget ::=
    Button STRING ';' |
    Group radio_buttons End ';' |
    Label STRING ';' |
    Panel layout widgets End ';' |
    Textfield NUMBER ';'

radio_buttons ::=
    radio_button radio_buttons |
    radio_button

radio_button ::=
    Radio STRING ';'

In the above grammar, the red symbols are nonterminals, the blue symbols are tokens and the black punctuation symbols are BNF metasymbols. Among the tokens those in title case are keywords. The character literals are punctuation tokens.

Below is an explanation of the meaning of some of the symbols in the above productions that should help you understand the actions that are to be performed when each of the productions is parsed:

You parser should properly handle the fact that panels can be nested in other panels. Recursive productions must be implemented using recursion. Syntactically incorrect input files should detect and report the first error.

Below is an example of an input file:

Window "Calculator" (200, 200) Layout Flow:
  TextField 20;
  Panel Layout Grid(4, 3, 5, 5):
    Button "7";
    Button "8";
    Button "9";
    Button "4";
    Button "5";
    Button "6";
    Button "1";
    Button "2";
    Button "3";
    Label "";
    Button "0";
  End;
End.

The above input file should produce the GUI shown below:

Week5 Parameter Passing

One characteristic of programming languages that varies widely from language to language is how parameters are passed. Among ALGOL, Pascal, Ada, C, C++, Java, and C#, no two languages pass parameters in exactly the same way. Among these languages, choose the one that you believe has adopted the best approach to parameter passing. Defend your decision by outlining the advantages of the approach of the language that you chose and the disadvantages of the approaches taken by other languages.

I believe Ada is the best approach to parameter passing because it has the highest reliability by defining parameter semantics in, in/out, and out. Languages such as C and C++ which support pointers can be unsafe when passing pointer by value. Java basically adopts passing by value, so it is safe but less flexible. I don't think passing by name adopted in ALGOL is a versatile approach.

Week5 Nested Subprograms vs. Nested Blocks

All the languages in the C family of languages allow nested expressions, nested statements and nested classes. None allow nested subprograms (functions or methods), but all allow nested blocks. Using examples, illustrate the difference between a nested subprogram and a nested block.

What is the rationale for prohibiting nested subprograms? Is it a good language design decision? What languages allow them?

Scopes of variables in if-statements, loop-statements, and functions in Python is completely identified by its indentation level.

This language specification strongly promotes standardization of formating of Python programming to improve efficiency, readability, and code reusability.

I don't think all the programming languages should adopt this specification because if the number of lines in a statement is quite big, its readability is not necessarily high.

However, I agree with the idea in scripting languages including Python. In many of scripting languages, using MVC Web application frameworks are recommended. Therefore, each class or module can be relatively small, so that programmers easily keep the readability.

Week5 Homework4 (6 points)

1. What is the output of the following Java program? Explain in terms of how parameters are passed in Java.

import java.awt.*;

class PointParameters
{
    public static void main(String [] args)
    {
        int x = 1, y = 1;
        Point p = new Point(x, y), q = new Point(x, y);
        doubleScale(x, y, p, q);
        System.out.println( "(x,y) = " + new Point(x, y) +
            " p = " + p + " q = " + q);
    }

    private static void doubleScale(int x, int y, Point p, Point q)
    {
        x *= 2;
        y *= 2;
        p.x *= 2;
        p.y *= 2;
        q = new Point(x, y);
    }
}

Suppose a similar program were written in C# in which all the parameters were ref parameters. What would the output of that program be?

2. Examine the following C++ program, in which a IntList class is defined that contains an overloaded [] operator. What is the output of this program?

#include <iostream>
using name  std;

class IntList 
{ 
private: 
    int list[1]; 
public:
    IntList() {list[0] = 0;}
    int& operator[] (const int index) {return list[index];} 
}; 
  
int main()
{
    IntList list;
    
    cout << list[0] << endl;
    list[0] = 1;
    cout << list[0] << endl;
    return 0;
}

Notice that the overloaded [] operator returns a reference. Change the [] operator to return a value instead and explain what happens in that case. Explain why the ability to return by reference is essential to be able to implement such an operator. Can anything similar be done in Java?

fileCMSC330_Homework4.pdf

Week6 Separating the Specification and Implementation

One important difference between languages that provide syntax to encapsulate the definition of user defined data types is whether the syntax requires the specification details to be separated from the implementation details. Ada requires such a separation. In Ada, the specification information must be placed in the package specification and the implementation details in the package body. Where must the representation details be placed?

Compare Ada with both C++ and Java in this regard. Take and defend a position as to whether requiring separation of the specification and representation information for a data type is a good language design decision.

Week6 Type Parameterization

Implementing type parameterization in Java was a much simpler task than implementing it in a language like C++ and Ada because, in Java, all objects are allocated on the heap. Explain why allocating objects on the stack complicates the implementation of parameterized types.

What about C#? Do some investigation and explore whether C# requires that all objects be allocated from the heap and how it approaches type parameterization.

Week6 Homework5 (6 points)

1. Consider the following Java definition of a mutable string class.

class MutableString
{
    private char[] chars = new char[200];
    private int size = 0;
    public boolean set(char aChar, int index)
    {
        if (index < 0 || index >= chars.length)
            return false;
        chars[index] = aChar;
        return true;
    }
    public String toString()
    {
        String result = "";
        for (int i = 0; i < size; i++)
            result += chars[i];
        return result;
    }
}

Suppose this class was rewritten in C++ in two ways, the first with the array that represents the list as a stack-dynamic variable, and then with the list as a heap dynamic variable. Explain when constructors and destructors would be needed. Explain why no constructors or destructors are needed in the Java implementation.

2. Consider the following C++ template class.

template <typename T, int length>
class Vector
{
public:
    Vector(T values[length])
    {
        for (int i = 0; i < length; i++)
            list[i] = values[i];
    }
    friend bool operator<(const Vector<T, length>& left, const Vector<T, length>& right)
    {
        bool result = true;
        for (int i = 0; i < length; i++)
            result &= left.list[i] < right.list[i];
    return result;
    }
private:
    T list[length];
};


int main()
{
    int first[] = {1, 2}, second[] = {2, 3}; 
    Vector<int, 2> vector1(first), vector2(second);
    cout << (vector1 < vector2) << endl;
    return 0;
}

The class Vector cannot be instantiated for any arbitrary type. For example, consider the following instantiation for a wrapper integer class.

class Int
{
public:
    Int(int i = 0) {this->i = i;}
private:
    int i;
};

int main()
{
    Int first[] = {Int(1), Int(2)}, second[] = {Int(2), Int(3)}; 
    Vector<Int, 2> vector1(first), vector2(second);
    cout << (vector1 < vector2) << endl;
    return 0;
}

Explain why the second implementation fails. What must be added to that class so this program will compile? Suppose this program were written in Java. Explain how Java allows the constraints on a generic type parameter to be specified and how they would be specified in this case

Java does have one limitation, however. Although wrapper classes can be used to instantiate generic type parameters, primitive types cannot. Explain why.

fileCMSC330_Homework5.pdf

Week7 Homework6 (6 points)

fileCMSC330_Homework6.pdf

Week8 Project2 (25 points)

Flash Videos

Week1 Videos

Week2 Videos

Week3 Videos

Week4 Videos

Week5 Videos

Week6 Videos

Week7 Videos

Week8 Videos


トップ   新規 一覧 単語検索 最終更新   ヘルプ   最終更新のRSS