Lecture notes for 08/28/97 o Matching Parentheses 1 Handout o I/O in C++ * HWK 1 o Recursion If you're missing handouts: see web page http://www.cs.cmu.edu/afs/andrew/scs/cs/15-211/www/home.html For those of you unfamiliar with UNIX, there's a nice primer can get to from the course web page. ============================================================================ Start: C++ review -- do with a kindof fun example: -- Let's work out together a program to see if parentheses are matched in some file. Eg: a * (b + (c - d) - (e - f)) is OK. ( ) ) ( is not. Before we start to write code, let's work out the basic plan of action, or ALGORITHM. RULE#1: Write down algorithm FIRST before you start to program. Alg = high level plan of what you're going to do. So, can make sure basic plan is correct before you get to details. I'll start: -- read in 1 character at a time. -- have a counter starts with 0. -- Now what? -- increment when see left paren. Decrement when see right. -- when do we know there's a problem? -- now we're ready to write it. -- start with INCLUDES. These are standard "header files". Contain certain standard definitions. Say more about later. In particular, has i/o definitions. -- then define CONSTANTS. Reason for using defined constants: (who can think of good reasons) * easy to change. * readability. -- eg, "strcmp" compares 2 strings and returns 0 if they're the same. Can define SAME to equal 0. * maybe several things are this quantity and want to keep track. MINS_PER_HOUR, SECS_PER_MIN RULE #2. Give names to any "magic values" -- All C procedures are functions. C program just a list of functions. Which one gets executed? -- int main(void)... -> when run program, looks for function called "main". the "int" means will be returning an integer. Convention, main routine returns 0 on successful completion. Void means no arguments. =========================================================================== // // Matching parentheses: tell if parens don't match, and give position in // file if the problem is an extra right paren. "count" keeps track of how // deeply nested we currently are. // #include const char LEFT = '('; const char RIGHT = ')'; int main(void) { int count = 0; char letter; while (cin >> letter) { // expression evaluates to 0 on EOF if (letter == LEFT) ++count; if (letter == RIGHT) --count; if (count < 0) { cout << "extra right parenthesis at position " << position << endl; return 0; } } if (count > 0) cout << "too many left parentheses" << endl; else cout << "OK." << endl; return 0; } ========================================================================== * >> is called the input operator, << is the output operator * >>, << point in direction data is moving * >> operator ignores blank spaces (blanks, newlines, tabs) in input stream * cin, cout are like stdin, stdout in C. Note: do not need "&" like scanf. * "=" is assignment operator. "==" is 'is equal to' operator. a==b returns 1 if true and 0 if not true. * While keeps executing contents as long as expression is non-zero. Do test first, then execute contents, then test, then contents, etc. * ++i is shorthand for (i=i+1). * "..." denotes a string. '\n' is return character, so is endl in C++ * return 0. E.g., run program on itself -- OK. Can we change program so it is still a correct program but parens are not matched? ============== modify program: -- Say if extra right we want to output position, how could we do that? -> have "position", and need to change input to not skip blankspaces. Several ways to do this. #1: cin.unsetf(ios::skipws); // tell cin not to skip whitespace while(cin >> letter) { What's going on: cin is a class. unsetf is a member function. "cin.unsetf(ios::skipws)" tells cin to un-set the flag that tells it to skip whitespace. #2: while (cin.get(letter)) { // call by reference This version is call-by-reference. Talk more about later. reads a character and puts it into its argument. Returns 0 on EOF and nonzero otherwise #3: (more "C" like) int letter; while((letter = cin.get()) != EOF) { if we don't give "get" any arguments, then it behaves like getchar() in C. It returns the next byte read, or EOF (-1) on end of file. =========================================== GOTCHAS if and while loops. syntax: * if (expression) statement * while (expression) statement where * is something that has a value (like a==b or 5+2) * is ; or {....} These work as follows: expression is evaluated, and IF IT IS NONZERO, then execute the statement (and we repeat if we're in a while loop). So, we can say: if (-7) a=5; else a=6; and since -7 is not 0, will treat as meaning "true" and set a to 5. All the "relational operators" have value 1 if true and 0 if false. e.g., (a == b) returns 1 if true and 0 if not true. Danger: if (a=7) cout << "hello\n"; else cout << "goodbye\n"; ============================================ RECURSION: functions that call themselves. Very useful. Will talk about lot more later. But, thinking recursively is something that takes practice to get used to. What I want to do now is give an interesting example of recursion. This one's pretty tricky, but let's try it. What if wanted to match parens (), brackets [], and braces {}? Eg: legal: ..(..{..}..(..)..[..(..)..]..)..[ () ] How could we write a program to check if the delimiters here are well matched??? Might think: 3 counters, but this won't work. Eg, bad: ( { [ ] ) } Not so easy. Answer: we're going to use recursion. To save time, suppose we've already written the following support routines. define START, END to equal 0. (Will say in a minute what they're for). char read_to_next(void); // returns next delimeter or END on end of file. int is_left(char c); // returns 1 if c is {,(, or [, and 0 otherwise. int is_match(char left, char right); // returns 1 if match, 0 otherwise (,) [, ] {, } START, END RULE #3: put details into subroutines. More generally, effort spent figuring out how to separate into pieces pays off. (book has list of some useful rules like this on p.190) You could imagine writing these. Not hiding anything magic here. Eg, "is_left" would just see if its input was a left paren, left square bracket, or left curly brace. I won't, because I want to focus on the more interesting part: how are we going to do the match. Let's write: int find_match(char left); Goal: read up to matching right marker. Eg: feed in '(' and it should read up to ')' here. if successful andinside matches OK, it should return 1. If problem inside, return 0. What should this routine do? ---read to next marker. If it's a left marker -> call recursively. --- if recursive says is a problem (outputs 0) we output 0. --- if recursive OK, go back to above. We have a right marker. --- if matches output 1 (good) --- if doesn't then output 0 (no match). There's the algorithm for find_match. If you're not used to recursion, here's the trick for thinking recursively: First, ask: does it do the right think on the simplest cases? You call it with a left delimiter and the next delimiter you see is a right delimiter. (..), (...} Now, look at general case, BUT: when call recursively can now assume it does the right thing. Does it do the right thing given that it's correct recursively? Answer: YES. More generally, works because what it means for this to be well matched is you have a left del., followed by a bunch of well-matched regions, and then the matching right del. More formally, INDUCTION: When you get some input, there are some number of levels of nesting. 0, 1, 2, 3, 4,... What we first did was show this works if there are 0 levels. Now in general case we showed would work for i levels GIVEN that works for i-1 and fewer, (when call recursively, we have one level less of nesting). That means, working for 1 level follows from fact it works for 0. This implies works for 2, and so on. Now, let's write C++ code. Reason for introducing START is can now do: int main(void) { if (find_match(START)) cout << "matched\n"; else cout << "doesn't match"; return 0; } C++ code for alg: int find_match(char left) { char c; for(c = read_to_next(); is_left(c); c = read_to_next()) { if (find_match(c) == 0) return 0; } /* c is a right marker */ if(is_match(left, c)) return 1; else return 0; } Even better: int find_match(char left) { char c; c = read_to_next(); if (!is_left(c)) return is_match(left,c); if (find_match(c) == 0) return 0; return find_match(left); // "tail" recursion here. }