20CS122001  Computer Science II  Spring 2012 

Brownie Point Assignment 
Evaluate Simple Arithmetic Expressions
For 15 brownie points (not required)
(submit instructions: here)
Simple use of the Stack class.
Use two stacks, one for holding operators and one for holding
operands, to evaluate a simple arithmetic expression. Operands are
numbers. Operators include plus (+), minus (),
multiply (*), divide (/). Input should be taken
from stdin to allow the user to apply the evaluator to a file
containing a large expression or to an argument on the command
line.
Consider a simple arithmetic expression using parens, arithmetic operators, and numbers. The following is an example
(7 + (3 * (4 + 4) * 2 * 2 * (1 + 2)) * 3)
How would you evaluate this expression? Before you answer, in order to simplify code later, let us suppose + and * have the same level of precedence and that operations are evaluated from left to right. So, for example, the expression
(8 + 7 * 6)
would be evaluated by first adding 8 and 7 to get 15, then multiplying 15 by 6 to get 90. This is different from what one normally does which would be to multiply first then add. Due to the simplification stated above, operations on the same nested level may be paired from left to right. Doing so to the first expression above gives
((7 + (((((3 * (4 + 4)) * 2) * 2) * (1 + 2)))) * 3)
This expression can be evaluated easily. First, find the most deeply nested parenthesized subexpression and evaluate that. In the above example its (4 + 4). Evaluating and substituting gives
((7 + (((((3 * 8) * 2) * 2) * (1 + 2)))) * 3)
Next, find the most deeply nested subexpression to be (3 * 8). Evaluating and substituting gives
((7 + ((((24 * 2) * 2) * (1 + 2)))) * 3)
Repeating (realizing that (number) can be replaced by number) gives the following sequence
((7 + (((48 * 2) * (1 + 2)))) * 3) ((7 + ((96 * (1 + 2)))) * 3) ((7 + ((96 * 3))) * 3) ((7 + (288)) * 3) ((7 + 288) * 3) ((295) * 3) (295 * 3) (885) 885
Try this
Now consider how a computer program might be written to find this answer. It is desireable to scan the input from left to right, reading every token just once. This means that some scanned operations must be delayed until they become the currently most deeply nested. Accordingly, operands associated with delayed operators are also delayed. The delays are easily implemented using two stacks: one for the operators, called Operator, and one for the operands, called Operands. Stacks are used because it will be easy to push items onto the stacks in such a way that the next operands to use can be found quickly at the top of the Operands stack and the next operator to be applied can be found quickly at the top of the Operator stack.
Let's express the rules for pushing and popping in English by considering the possibilities one at a time:
First, observe when reading a token from the input line there are four possibilities: namely, the token can be a number, a left paren, a right paren, or an operator. Let's start with the easiest one...
The remaining cases are more complex and involve popping data from the stacks. Consider first the case of ...
The next case is reading a number. But a number may be read before
or after the operator it associates with, or there may be no
operator associated with the number (for example, consider the case
(23)). However, when reading a number, there is no way to tell
whether or not an associated operator exists for that number if it is read
before an associated operator is read. Hence, we will ignore the latter case
for now and say the case (
The second subcase is...
The last case is the hardest...
4 * (6 + (2 * 3) + 2)In this case it is unnecessary to do anything with the (. Let's go through it slowly. From the rules given above, 4 is pushed onto the Operand stack, then * is pushed onto the Operator stack. Suppose we do nothing with the (. Then 6 is pushed onto the Operand stack and + is pushed onto the Operator stack. Again, do nothing with the (. Then 2 is pushed onto the Operand stack and * pushed onto the Operator stack. When 3 is read, the 2 is popped and multiplied by the 3. The result is 6 which is pushed onto the Operand stack. Reading ) causes the two 6s to be popped and added and the resulting 12 pushed onto the Operand stack. Observe that, up to this point, the precedence of the parens has been respected. Now consider the input
4 * ((2 * 3) + 2)Push 4 and *. If we do nothing with the two ( symbols, when 2 is read there is not enough information on both stacks to tell whether the 2 is before or after its associated operator  we threw that information out! But we can fix this pretty easily in several ways. One way is to keep track of a "depth" of nesting. But a simpler way is just to push a + onto the Operator stack and 0 onto the Operand stack every time a ( is read. This does not change the result (because we are adding 0 to a number) and eliminates two ( in a row.
Now that we understand the above rules, the question is how to organize the code needed to solve the problem. From the rules above it seems there will be sections in the code where some action is invoked and there will be points where decisions (if or switch statements) will be made about which action section to invoke next, given the type of token that is read and the action section most recently invoked. For example, we can identify four actions:
We can also identify four points at which decisions are made:
It helps considerably to visualize the above. Let's use circles to represent decision points and lines between circles to represent actions. Each line will have a label indicating input token type and one end will contain an arrowhead. The orientation of a line, indicated by the arrowhead, specifies the next decision point, after the action associated with the line is completed. The picture below visually describes the relationship between action and decision which is described above in terms of circles and oriented lines. The numbers in square brackets refer to the actions as numbered in the list above. On lines, the labels Number, Operator, and ) refer to read tokens that are numbers, operators, and right parentheses, respectively. In circles, the labels Number, Operator, ( and ) refer to previously read token types that are numbers, operators, left parentheses and right parentheses, respectively.
What is interesting is that the action taken depends not only on the currently read token type (shown as a line in the Figure), but also on the previously read token type (shown as a circle in the Figure). In the discussion of sample code, given below, we will refer to this figure in the comments. The lower left circle of the Figure corresponds to previous type Number, the top circle to previous type Operator, and the lower right circle to previous type RParen. The previous type LParen turns out not to be necessary in this case. 
Look at this powerpoint presentation to observe the movement between decision points, the decisions made, and the actions invoked as tokens are read, left to right, from the input:
(7 + (3 * (4 + 4) * 2 * 2 * (1 + 2)) * 3)
Alternatively, the following progression shows the contents of both stacks as tokens are read, left to right, from the above input expression.
Input Token  Stacks  Notes  

Initially, stack + and 0, previous type is Operator.  
( 

Stack + and 0, previous type is Operator.  
7 

Pop the 0 and +, add to the 7. Push the resulting 7 onto Operands. Previous type is Number.  
+ 

Push the + onto Operator. Previous type is Operator.  
( 

Push + and 0, previous type is Operator.  
3 

Pop the + and 0, add the 3 to the 0 and push the resulting 3 onto Operands. Previous type is Number.  
* 

Push the * on Operator. Previous type is Operator.  
( 

Stack + and 0, previous type is Operator.  
4 

Pop + and 0, add 0 to 4, push the resulting 4 onto Operands. Previous type is Number.  
+ 

Push the plus on Operator. Previous type is Operator.  
4 

Pop the plus and 4, add the popped 4 to the read 4, push the result onto Operands. Previous type is Number.  
) 

Pop the 8 and the 3 from Operands, pop the * from Operator, multiply 3 and 8 and push the resulting 24 onto Operand. Previous type is Operator.  
* 

Push the * onto Operator. Previous type is Operator.  
2 

Pop the 24 from Operands, pop the * from Operator, multiply the read 2 by the popped 24, push the result onto Operands. Previous type is Number.  
* 

Push the * onto Operator. Previous type is Operator.  
2 

Pop the 48 from Operands, pop the * from Operator, multiply the read 2 by the popped 48, push the resulting 96 onto Operands. Previous type is Number.  
* 

Push the * onto Operator. Previous type is Operator.  
( 

Stack + and 0, previous type is Operator.  
1 

Pop the 0 and +, add the 0 to the read 1, push the resulting 1 onto Operands. Previous type is Number.  
+ 

Push the + onto Operator. Previous type is Operator.  
2 

Pop the 1 from Operands and + from Operator, add the read 2 and popped 1 and push the result onto Operands. Previous type is Number.  
) 

Pop the 3, and 96 from Operands, pop the * from Operator, multiply the popped 3 and popped 96, and push the resulting 288 onto Operands. Previous type is RParen.  
) 

Pop the 288, and 7 from Operands, pop the + from Operator, add the popped 288 and popped 7, and push the resulting 295 onto Operands. Previous type is RParen.  
* 

Push the * onto Operator. Previous type is Operator.  
3 

Pop the 295 from Operands, pop the * from Operator, multiply the popped 295 and read 3, and push the resulting 885 onto Operands. Previous type is Number.  
) 

Pop the 885 and 0 from Operands, and + from Operator, add the 885 to 0, and push the resulting 885 onto Operands. Done. 
The code can be organized around the decision points and action sections as follows:
// Action sections of code are implemented as functions whose // prototypes are given below. // All input tokens are treated as strings even if they are numbers. // Hence, arithmetic operations can be performed only through // appropriate conversions to C++ type int or long void pushOperator(char *op) { /* push op onto Operator stack */ } void pushOperand(char *operand) { /* push operand onto Operand stack */ } void computeWithInputToken(char *number) { /* performs task of action 3) above where <number> is the token read from the input line */ } void computeWithOperandsOnly() { /* performs task of action 4 above */ } // Use this type to distinguish token and previous token (circle) types enum Type { LParen, Number, Operator, RParen }; // This function determines the type of input token. Token types are // LParen (that is, '('), Number, Operator, and RParen (that is, ')') Type tokenType(char *token) { ... } // The following outlines the decision points. // Comments refer to circles of the figure above. The value of variable // "previous_type" signifies both the current decision point and a circle. // Values are Number, Operator, RParen representing, respectively, lower // left, top, lower right circles in the above Figure. The variable // is never given the value LParen since there is no such state. Type previous_type = Operator; // The initial circle (top circle in the figure) ... Type type = tokenType (token); // From the lower right circle or the lower left circle, if the token is an // operator, perform action [1] and enter the top circle if ((previous_type == RParen  previous_type == Number) && type == Operator) pushOperator(token); // action [1] // From the top circle, if the token is a Number, // perform action [3] and enter the lower left circle. else if (previous_type == Operator && type == Number) computeWithInputToken(token); // action [3] // From the lower right circle or the lower left circle, if the token is a // right paren, perform action [4] and enter the lower right circle. else if ((previous_type == RParen  previous_type == Number) && type == RParen) computeWithOperandsOnly(); // action [4] // If there is a left paren, stack 0 and + and bypass the // LParen state, landing in the Operator state. else if (previous_type == Operator && type == LParen) { pushOperand("0"); // action [2] pushOperator("+"); // action [1] type = Operator; } // Otherwise there is some parse error. else return ...; // The following line "moves" us from the previous circle to the new one. previous_type = type; ...
Write a C++ program implementing the rules above and able to evaluate
simple arithmetic expression as described above.
arith "(7+(3*(4+4)*2*2*(1+2))*3)"
#include "tokenizer.h" ... int main (int argc, char **argv) { Parser *parser = new Parser(); Tokenizer *tokenizer = new Tokenizer(argv[1]); char *token; while ((token = tokenizer>nextToken()) != NULL) parser>parse(token); }where class Parser is described below as is the method parse of Parser. The purpose of a tokenizer object is to separate input arithmetic expression tokens, which are all in argv[1] due to the double quotes surrounding the expression, and operate on each one in turn by passing them to method parse, one at a time. All tokens returned by a tokenizer object are of C++ type char * for convenience.
g++ c tokenizer.cc g++ c stacker.cc g++ c arith.cc g++ arith.o stacker.o tokenizer.o o arithwhich results in an executable named arith. Do not forget to add #include "tokenizer.h" and #include "stacker.h" to arith.cc.
When linked to stacker.o you can create a stack using:
Stack *stack = new Stack(disp);where disp is defined below. To push an item onto the stack use:
stack>push("A string item");An example of popping an item from a stack is the following:
char *item = (char *)stack>pop();
void disp (void *object) { cout << (char *)object << " "; }for this purpose.
Stack *operator_stack = new Stack(disp); Stack *operands_stack = new Stack(disp);
enum Type { LParen, Number, Operator, RParen };to define token types.
char *compute (char *opr, char *opnd1, char *opnd2);which converts opnd1 and opnd2 to numbers, applies the operator opr to the numbers (for example, opnd1 * opnd2), converts the result back to a character string, and returns that string. Important: opnd2 should always be the operand on the right side of the operator opr and opnd1 should always be the left operand. Otherwise, this might work OK for + and * but not for  and /.
Also, do not use something like the following:
char *compute (char *opr, char *opnd1, char *opnd2) { ... char number[1024]; ... sprintf(number, "%d", opnd1+opnd2); ... return number; }This is a classic error made by novice C++ coders. This cannot give predictable results because the memory taken by number is returned to the operating system as the invocation of compute is completed. Hence, there is no telling what value is returned.