CS3101-1 Programming Languages – C Lecture 4 Matthew P. Johnson Columbia University Fall 2003 10/17/08
CS3101-1, Lecture 4
1
Important Reminder • Do not let me fail to repeat Qs • Remind me if I forget – And before I complete the answer
Win valuable prizes
10/17/08
CS3101-1, Lecture 4
2
Agenda • • • •
10/17/08
Hw2 due last night Hw3 assigned tonight Last time: complex programs This time: arrays & ptrs – Ptr arithmetic – 2-D arrays – qsort
CS3101-1, Lecture 4
3
Data types •
Used several data types: – ints ~ whole numbers – floating pts ~ “decimal” (non-whole) numbers – chars ~ letters
•
Will (next time) use generic complex data types: – structs ~ multiple-field records
•
In general: – a data type ~ class of poss vals, – corresponding to real-life concept (letter, number, customer record) – Indy var means/refers to/denotes today’s date., # pages, etc. – But not always
10/17/08
CS3101-1, Lecture 4
4
• •
Pointers
Do not correspond to real-life concept Indy var means … another var – Somewhat like quotes in natural language – “Snow is white,” v. “’Snow’ has four letters.”
•
Corresponds to computer hardware concept, memory address – Some (e.g., Prof. Stolfo) argue: this is bad language design! – But: It’s C’s design
•
Theoretically, very interesting self-reference Halting Problem: interp nums as TMs Godel’s Incompleteness Theorem: interp nums as sentences – ~ “I am Lying,” “This sentence is unprovable.” – Here: interp nums as var addresses – See 3261/4236, Godel, Escher, Bach, Douglass Hofstadter
10/17/08
CS3101-1, Lecture 4
5
Pointers • • •
Q: What is a pointer? A: “a var that contains the address of a var.” For regular (“first-order”) vars, we have var name, and value: – char c = 10;
•
Remember: var declar/def/init does: – obtains mem for data – attaches the name c to that mem location – writes 10 there
10/17/08
CS3101-1, Lecture 4
6
Var creation • After declar/def/init, can use that data with the identifier c. • When we say c, it remembers where to look • The data lives somewhere in memory. • Don’t have to know where the data actually is— just remember c. • Could we find out the location? (Why?) • Turns out: yes • Ptrs are vars that take on these locations as vals 10/17/08
CS3101-1, Lecture 4
7
Pointers • p = &c • & op applied to a var (lvalue) evals to its address; • p is now set to the address of c • We interp val of p as a memory location p now “points to” c • & applies only to lvalues – Not consts, literals, const exprs
• Draw picture of p & c in mem 10/17/08
CS3101-1, Lecture 4
8
& and * •
Exists: inverse operator to &: * – the dereferencing or indirection op. – Applied to a pointer – evals to the pointer’s referent • what it points to:
int x = 1, y int *ip; ip = &x; y = *ip; *ip = 0; ip = &z[0];
• 10/17/08
= 2, z[10]; /* init */ /* declar */ /* ip points to x */ /* y set to val of ip’s referent, 0*/ /* ip’s referent, y, set to 0 */ /* ip now points to z[0]
After execution, x == 0, y == 1 CS3101-1, Lecture 4
9
& and * • & and * (as ptr ops) are inverses – Cancel each other out
• Examples • *&x “the value at the address of x” x – Who is buried in Grant’s tomb?
• &*xp “the address of the value at the address xp” xp – Where is the contents of Grant’s tomb? 10/17/08
CS3101-1, Lecture 4
10
Pointer declars • To declar ptr-to-type var, similar to type declar, plus *: – int *ip;
• Interp 1: ip is an int-pointer: (int*) ip • Interp 2: dereferenced ip is an int: int (*ip) • Notice: primitives, arrays, ptrs, ftns, can all be declared together: – int x, *ip, y, z[10], myftn(); 10/17/08
CS3101-1, Lecture 4
11
Dereferenced ptrs • Dereferenced ptr gives the actual lvalue, not simply a passive value • Can be used just as the corresponding var could be – – – – 10/17/08
Both accessed and modified *ip = *ip + 1; *, &, other unaries have high precedence *ip += 1; CS3101-1, Lecture 4
12
Dereferenced ptrs • What about ++? • Unaries associate R-to-L – ++*ip ++(*ip)
• But: – – – – 10/17/08
*ip++ *(ip++) Need parenths ip++ is valid! “pointer arithmetic” CS3101-1, Lecture 4
13
Modifying ptrs vals • Pointers can be modified, like other vars: – iq = ip;
• Example: • int *ip, *iq, x; x = 5; ip = &x; iq = ip; *ip = 10; x == *ip == *iq == 10 10/17/08
CS3101-1, Lecture 4
14
Uses of ptrs • •
Limitation mentioned before: ftns have single return value Sometimes nice to have multiple return values – e.g., some value, plus error/success.
•
getchar() returns next char read, EOF (-1) if error – chars are non-neg, so chars, EOF are distinct values
•
• 10/17/08
consider a getint(): return next int read, or EOF (-1) if error – problem: EOF == -1 is an int! – Soln: let return value be error/success; write next int at location of a ptr param: – int getint(int *);
Or consider: e computation from hw CS3101-1, Lecture 4
15
Passing by ref • Remember: all args passed-by-value – Local var created, inited to val of arg – Actual arg never changes
• Alternative method is pass-byreference (C++, Java) – Don’t pass the value, pass the location of the value – Don’t send the webpage, send the URL
• Ptrs can simulate pass-by-ref 10/17/08
CS3101-1, Lecture 4
16
Passing by ref: swap • • •
• • 10/17/08
Swap two ints Naïve swap ftn: void swap(int x, int y) { int temp = x; x = y; y = temp; } … int a = 5, b = 10; swap(a,b); CS3101-1, Lecture 4
• Effect: nothing! • x inited to val of a == 5, y inited to val of b == 10 • x and y swapped • a and b never change • NB: couldn’t be otherwise • Consider: swap(5,10); – Can’t change literals 17
Passing by ref: swap’ • • •
•
Soln 1: write a macro, obviating real params (hw) Soln 2: Don’t pass a and b’s vals, but references to a and b void swap(int *px, int *py) { int temp = *px; *px = *py; *py = temp; } … int a = 5, b = 10; swap(&a,&b); NB: px, py are passed by value! – These vals happen to be (interped as) refeences
• • 10/17/08
Draw picture! Soln 3: Arrays… CS3101-1, Lecture 4
18
Pointers & Arrays • What do they have to do w/each other? • In Java, not much – arrays are special sorts of objs, with many mems – arrays, like all objs, are passed by reference – two distinct ideas
• In C: Ptrs and arrays turn out to be almost identical 10/17/08
CS3101-1, Lecture 4
19
Pointers & Arrays
• Consider int a[10]; – What happens? • The id a is attached to a block of mem with room for 10 ints • These ints accessed with a and a subscript: a[0], a[5] • Important: the 10 ints are continguous, in a row: • a: •
0 1 2 3 4 5 6 7 8 9
• a[0] gives an int • &a[0] gives the address of that int 10/17/08
CS3101-1, Lecture 4
20
Pointers & Arrays • “Ordinary” ptrs can point to arrays: • pa = &a[0]; pa: a[0] a:
0 1 2 3 4 5 6 7 8 9 • Now can get other array elms” – – – – 10/17/08
pa+1 points to next pa+2 pa++ pa-1 points to prev CS3101-1, Lecture 4
21
Pointers & Arrays • Q: How does it know they’re ints? • A: pa is an int* – why ptrs need to know referent type
• Turns out: array name evals to ptr to 0th elm – a == &a[0] – & not applied array name 10/17/08
CS3101-1, Lecture 4
22
Pointers & Arrays • •
• • •
Also turns out: a[i] (array index notation) evals to same as *(a+i) (ptr + offset notation) array[index] is really short-cut notation array[index] converted to *(array+index) in compilation Also, array[index] notation applicable to regular pointers: – ip[20] *(ip+20) who knows? – As usual, be careful!
• • • 10/17/08
array[index] converted to *(array+index) in compilation Q: What if 2[a]? A: 2[a] *(2+a) *(2+a) (by symmetry) a[2] CS3101-1, Lecture 4
23
Array/ptr example
10/17/08
• Consider a string-length ftn: • int strlen(char *s) { int n; for (n = 0; *s != ‘\0’; s++) n++; return n; } • Now change to: • int strlen(char s[]) { … Effects? • None! CS3101-1, Lecture 4 24
Arrs/ptrs as ftn args • Can call: – strlen(“hi”), strlen(a), strlen(pa) – Local var s set to pt to arg passed
• Same behavior: – strlen(a+2), strlen(&a[2]) 10/17/08
CS3101-1, Lecture 4
25
Passing by ref: swap’’ • •
• • 10/17/08
Soln 3: Pass an array containing a and b void swap(int AandB[]) { int temp = AandB[0]; AandB[0] = AandB[1]; AandB[1] = temp; } … int AandB = {5, 10}; swap(AandB); NB: We pass one ptr (by value!), with the understanding that a and b are contiguous Draw picture! CS3101-1, Lecture 4
26
Arrays v. pointers • One important difference: • Defined-array values are immutable – immutable = cannot be changed – ~ “cannot mutate” – Defined-array value = id used to define/create array
• Illegal: – int a[10]; – a++, a = …
• Restriction does not apply to: – Array members – Array params in a ftn (“formal params”) 10/17/08
CS3101-1, Lecture 4
27
Pointer arithmetic • Addition/subtraction with integers: – Ptr +/- n ptr to val distance n away
• Subtraction of two pointers: – p – q distance bet. p and q – p, q should point to mems of same array/data block or NULL
• No other legal ops (apart from ) 10/17/08
CS3101-1, Lecture 4
28
Pointer arithmetic e.g. • int strlen(char *s) { char *p = s; while (*p != ‘\0’) p++; • Loops until ‘\0’ return p-s; • Consider “Hi” ~ }
• • • • 10/17/08
{‘H’,’i’,’\0’}
Initly, p == s ‘H’ ‘H’ != ‘\0’ p++ /* p-s == 1 */ ‘i’ != ‘\0’ p++ /* p-s == 2 */ ‘\0’ == ‘\0’ return 2 CS3101-1, Lecture 4
29
Pointer arithmetic • Here: – p-as-number – s-as-number == p-s == 2
• For other elm types: – p-as-number – s-as-number == (ps)*sizeof(type)
• Sizeof gives # in var/type: – sizeof(int), sizeof(x)
10/17/08
• No effect here since sizeof(char) == 1 • Show sizeof.c CS3101-1, Lecture 4
30
String literals • “Hi” {‘H’, ‘i’, ‘\0’} • len-n str literal is a len-n+1 char[] – Signal ‘\0’ indicates the end
• Same effect: – char am[] = “hi”; /* am is immutable */ – char am[] = {‘H’, ‘i’, ‘\0’};
• Slightly different effect: – char *pm = “hi”; /* pm is mutable */
• pm: am[0] • am: 10/17/08
0 1 2 CS3101-1, 3 4Lecture 54 6 7 8 9
31
Copying strings • • • • • • •
• • 10/17/08
How to copy strings? s = t;? No: simply makes s and t pt to same chars Want: copy actual chars, so indy, from source t to dest s For now: assume s pts to enough free space Copy strings as arrays: void strcpy(char s[], char t[]) { int i = 0; while ((s[i] = t[i]) != ‘\0’) i++; } Each time around, 1) Assign s char to t char; 2) check for \0 NB: Assign expr evals to char assigned CS3101-1, Lecture 4
32
Copying strings’ • Copy strings as pointers: • void strcpy(char *s, char *t) { while ((*s = *t) != ‘\0’) { s++; t++; } } • Essentially the same as prev • No index needed, but now inc both ptrs 10/17/08
CS3101-1, Lecture 4
33
Copying strings’’ •
Mix & match:
• void strcpy(char s[], char *t) { /* s arr, t ptr */ int i = 0; while ((*s = t[i]) != ‘\0’) { s++; /* s ptr */ i++; /* t arr */ } } 10/17/08
CS3101-1, Lecture 4
34
Copying strings’’’ • Shorter version: • void strcpy(char *s, char *t) { while ((*s++ = *t++) != ‘\0’) ; } • Each time around: *s set to *t, check for ‘\0’ • But: unaries assoc R-to-L • So, Q: why do we get *t instead of *(t+1) • A: in *t++, ++ is eval-ed before * – But postfix ++ *lval++ evals to simply lval – After eval, lval is inc-ed – So *s set to *t; both inc-ed 10/17/08
CS3101-1, Lecture 4
35
Copying strings’’’’ • • • • •
Q: What is the value of the null char ‘\0’? A: 0; any other char is != 0 Can take advantage of this in if Shortest (?) version: void strcpy(char *s, char *t) { while ((*s++ = *t++)) ; } • Real strcpy lives in string.h 10/17/08
CS3101-1, Lecture 4
36
Comparing strings • Return <0 if s < t, 0 if s==t, >0 o.w. • int strcmp(char *s, char *t) { int i; for (i = 0; s[i] == t[i]; i++) if (s[i] == ‘\0’) return 0; return s[i] – t[i]; } • NB: two poss outcomes: – Get through all of s w/o discrepancy 0 – Find discrepancy amt greater s[i] than t[i] 10/17/08
CS3101-1, Lecture 4
37
Comparing strings’ • With arrays: • int strcmp(char *s, char *t) { for(; *s == *t; s++, t++) if (*s == ‘\0’) return 0; return *s - *t; } • Omits i but needs , to combine inc stmts 10/17/08
– Very rarely CS3101-1, used Lecture 4
38
*/++/-- and stacks • Lend themselves to elegant stack pushing and popping: – p points to 1 place above top of stack – pal is value pushed/popped
• Push: *p++ = val; – *(p++) evals to *p, so val is written to open slot – then p is inc-ed
• Pop: val = *--p; – *(--p) evals to p-1, the last written slot – Val in last written slot is stored in val – Last written slot now considered open, for pushing 10/17/08
CS3101-1, Lecture 4
39
Arrays of ptrs/ptrs to ptrs • • • • • • •
Consider problem of sorting, say, ints int a[] = {5,23,8,22,99,53,3,77,4,7}; a: 5 23 8 22 99 53 3 77 4 7 If sort, get: a: 3 4 5 7 8 22 23 53 77 99 Q: How much work? A: W/good alg, O(n*logn) int swaps/compars – Read/write 4*O(n*logn) bytes
10/17/08
CS3101-1, Lecture 4
40
Arrays of ptrs/ptrs to ptrs • a: 5 23 8 22 99 53 3 77 4 7 • What if the nums are the values being sorted, but just they’re keys • Q: What if we’re sorting records, or mp3s? How much work then? • Q: O(n*logn) mp3 swaps/compars! – Read/write sizeof(mp3)*O(n*logn) bytes – A lot more work! 10/17/08
CS3101-1, Lecture 4
41
Arrays of ptrs/ptrs to ptrs • •
What to do? Soln: in general, don’t sort vals; – Sort pointers (to vals) – Values stay put – Only arrows move
• •
Q: How much work is this? A: O(n*logn) ptr swaps/compars! – Read/write sizeof(ptr)*O(n*logn) bytes – Q: How large is a ptr? – A: It’s just a memory addres • i.e., the integer that refers to some mem location • 4 bytes, say
– Same as sorting ins – A lot less work!
• 10/17/08
See qsort, which takes an array of ptrs to be sorted CS3101-1, Lecture 4
42
Array review • Arrays are complex (=nonprimitive) data structures • Ordered, fixed-length seq of vars • All elms same type • Access by index: ar[4] • Indices begin at 0 10/17/08
CS3101-1, Lecture 4
43
qsort • Header (simplified, from stdlib.h): • void qsort(void *base, int n, int size, int (*cmp)(const void *, void *)); • Params: – – – – 10/17/08
base ~ ptr to beginning of array n ~ array length (= # elements) size ~ arr elm size, i.e., “sizeof(base[0])” cmp ~ comparison ftn to be used for sort CS3101-1, Lecture 4
44
qsort’s cmp • Comparison ftn ops on 2 elm of array: – cmp(a,b) <0 “a < b” – cmp(a,b)==0 “a == b” – cmp(a,b) >0 “a > b”
• Remember (from 3203/intution): alg for deciding less/=/greater for any 2 elms of set ~ ordering of set • cmp is used as ==/> would be if sorting ints • Only complication: cmp takes ptrs, so must * 10/17/08
CS3101-1, Lecture 4
45
Example cmp • Simple case: int *’s • int cmp(int *a, int *b) { if (*a == *b) return 0; if (*a < *b) return -1; return +1; }
10/17/08
CS3101-1, Lecture 4
More succinct? int cmp(int *a, int *b) { return *a - *b; } For other types, modify compars
46
void * • • • •
void * is the generic ptr type Points to nothing? No, points to anything In some cases, code doesn’t know what referent will be interped as – E.g.: malloc acquires mem, sends back void * – Caller then casts to appropriate type (int *, etc.)
•
NB: we can cast ptrs, like all vars – void * ptr = …; – Int * ip = (int*)ptr;
• •
Q: Does ip’s referent change? A: No, *ip is not cast, just ip – Draw picture
10/17/08
CS3101-1, Lecture 4
47
Multi-D arrays • Given any type t, t[] is a legal type: – int int[] – double double[] – int[] int[][]
• 2-dim array = array of 1-dim arrays 10/17/08
CS3101-1, Lecture 4
48
Multi-D arrays e.g. •
• •
•
static char days[][12] = { {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31}, {31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31} }; Also: static char days[2][12] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 }; /* with sizes, inner brackets optional */ access: – –
• • 10/17/08
days[row][col] not days[row,col]!
Notice: 0th row ~ non-leap year 1st row ~ leap year CS3101-1, Lecture 4
49
Leap year redux • int leapyear(int year) { return year %4 == 0 && (year%100 != 0 || year%400 == 0); } • int leapyear(int year) { return year %4 == 0 && year%100 != 0 || year %4 == 0 && year%400 == 0; } 10/17/08
CS3101-1, Lecture 4
50
Leap year redux • int leapyear(int year) { return year %4 == 0 && year%100 ! = 0 || year%400 == 0; } • int leapyear(int year) { return !(year %4) && year%100 || ! (year%400); } 10/17/08
CS3101-1, Lecture 4
51
Compute day of year • /* month in {0..11} int dayofyear(int year, int month, int day) { int i, leap = leapyear(year); for (i = 0; i < month; i++) day += days[leap][i]; return day; • } • example: 10/17/08
– dayofyear(2002, 0, 15) 15 (16th day) – (2002, 1, 15) 15 + 31 == 46 (47th day) CS3101-1, Lecture 4
52
Multi-D arrays as ftn args •
Any dims after the first (0th) must be specified: – int ftn(int days[2][12]) { … } – int ftn(int days[][12]) { … } – int ftn(int (*days)[12]) { … }
• • •
Q: Why do others need to be specified? Q: Why needn’t the first? A: 2 in days[2] ignored, so days[] is equiv to *days – passing array same as passing ptr
•
So: passing a ptr—to what? – Passing a ptr to int[12]’s
• 10/17/08
Remember: referent type tells ptr how much mem to interp as its referent CS3101-1, Lecture 4
53
Array-ptrs v. 2-dim arrays • •
Almost the same but not quite int a[10][20]; – 10x20 d-dim array 10*20 == 200 ints – each row same length – a[5][6] *a[5]+6*sizeof(int) a+5*sizeof(int[20]) + 6*sizeof(int) a + 5*20*sizeof(int) + 6*sizeof(int) a + 106*sizeof(int) – why we need the length of the rows!
•
int *b[10]; – No ints created, but 10 int *s – Each can be set to point to an int[20] – Could be set to diff length arrays
10/17/08
CS3101-1, Lecture 4
54
Array-ptrs v. 2-dim arrays • Also, we wrote: – int (*days)[12] as the same as int days[2][12] – int *b[10]; the ptr analog of int a[10][20]
• Q: Why not *b[20]? • A: These look analogous, but are not • int (*days)[12] is one ptr, a ptr that points to a length-12 int array • int *b[20] is 20 ptrs, each of which points to int(s) – Interped as a table, this gives a 20x? table
• we had a 10x20 table ~ 10 rows 10 ptrs to ints 10/17/08
CS3101-1, Lecture 4
55
Array-ptrs v. 2-dim arrays • Diff between int (*d)[10] and int *d[10]? • [] have higher precedence than * • In second, [10] attaches to d, then * to result ptr to array • In first, * attaches to d, then [10] to result array of ptrs 10/17/08
CS3101-1, Lecture 4
56
Str arrs v. 2-D char arrs • char *names[] = {“Abc”, “d”, “efgh”}; – names array of char*, each pts to justlarge-enough string ({‘A’,’b’,’c’,’\0’})
• char names[][5] = {“Abc”, “d”, “efgh”}; – len-15 array, with empty space between entries: – Abc\0 d\0 efgh\0 10/17/08
CS3101-1, Lecture 4
57
Command-line args main ftn params • Canonical main ftn: – int main(int argc, char *argv[]) { … }
• • • • • 10/17/08
argv: array of strings arg[0] = … exec name! Other elms: cmd-line args, in order type Args sep-ed by spaces To include space in string, surround by quotes CS3101-1, Lecture 4
58
Command-line args • Q: What is **++argv? • A: **(++argv) **(argv+1) *argv[1] argv[1][0] char 0 of arg 1 • Q: What is *++argv[0]? • A: something of arg 0 something of char 1 of arg 0 val of char 1 of arg 0 • Q: What is (*++argv)[0]? • A: 0th of *(argv+1) 0th of arg 1 char 0 of arg 1 **++argv 10/17/08
CS3101-1, Lecture 4
59
Ftn pointers • Ftns aren’t values; vars can’t be set to them • But: ptrs can point to them! Can pass one ftn to another, so second can call first! • Ftn name simpliciter (no parens) evals to ptr to the ftn • To call, dereference ftn, in parens, followed by args: – res = (*cmp)(pa, pb);
10/17/08
• Dereferencing ftn ptr is actually not required, but reminds us that cmp is a ptr CS3101-1, Lecture 4
60
Ftn pointers • Example: qsort – Sorts according to some ordering of items – Ordering specified by implementation of cmp ftn – How does qsort find out which cmp to call?
10/17/08
• Cmp (or a ptr to it) is passed to qsort as a param • NB: type depends on param-list types and return type • Show ftnptrs.c CS3101-1, Lecture 4
61
* v. () • You should know diff bet int *f() and int (*pf)() (without helpful names) • Be careful with *’s and []’s • Q: What does the following mean? – char (*(*x())[])()
• A: left as an exercise for the listener 10/17/08
CS3101-1, Lecture 4
62
Proper main ftns •
• • •
int main(int argc, char argv[][]) { … return 0; } NB: Ftn signature is similar to Java equiv, minus OO: public static void main(String[] args) {…} Since arrays don’t “know” their length, this info is sent separately: – argc ~ argument count – argv ~ argument vector
• 10/17/08
Also, argv[0] is always the exec name (e.g., “a.out”) CS3101-1, Lecture 4
63
Meta-pointers • • • • • • • 10/17/08
Given any type, exists ptr to type: int int* char char* int* int** “ptr to a ptr to an int” Ever used? Yes! -qsort CS3101-1, Lecture 4
64
Pointer op review • Q: suppose say: • int x = 10; int *ip = &x; int **ipp = &ip; **ipp = ? • A: 10 • Q: suppose say: • int *ip; int **ipp = &ip; int x = 10; ip = &x; • Any problem, since ip is not initiated? • A: No, ip not set, but it has mem reserved for it - picture 10/17/08
CS3101-1, Lecture 4
65
Pointer op review • • • • • • •
• •
Q: suppose say: int *ip; *ip = 10; Any problem? A: Yes! ip not set *ip == who knows? Q: suppose say: int *ftn() { int x = 10l return &x; } Any problem? A: Yes! X is popped off stack on return – “dangling pointer”
10/17/08
CS3101-1, Lecture 4
66
Next time • • • • • 10/17/08
Next time: pointers & arrays Hw2 assigned tonight Reading on web Come to OHs if nec! Sign in! CS3101-1, Lecture 4
67