C & Data Structures
Course Overview: This course begins with introduction to C, delves deep into it and finishes up with a roundup of basic data structures. Estimated duration of the course is 15 classes followed by project guidance sessions. The intended audiences are the 1st and 2nd year students of College of Engineering, Guindy, AC College of Technology and SAP. At the end of the course there will be a project presentation where the students shall present their projects. These projects shall be graded based on their ingenuity, relevance to the course and other key points. The grading will be done by judges who are experts in the field. The students will then be awarded merit certificates and participation certificates based on their projects and participation in the course. The topics to be covered by the course in the 15 classes is given below 1) Introduction to C programming 2) Variables, Constants, Printf and Scanf along with a few operators. 3) Operators in C and flow control statements. 4) Pointers in C 5) Advanced pointers and Arrays 6) Structures and Enumerations 7) Typedefs and their usage 8) Unions in C 9) An overview of graphics in C 10) Introduction to Data Structures 11) Stacks 12) Queues 13) Linked Lists 14) Binary Trees 15) Overview of advanced data structures and introduction to the next course on algorithms. After this there will also be a project guidance class followed by a project presentation event, where the students shall exhibit the skills they have acquired through their projects. These projects will be judged and they shall be awarded grades based on their projects. The students shall be awarded with certificates upon completing the course, based on their levels of participation in the course and their projects. The course material is intended to be as brief as possible and it shall be sufficiently backed up by classroom notes. This material can be used as a collection of key points to be discussed in depth in each class.
Course Material 1) Introduction to C Programming What is programming? Programming is the art of converting a solution in the mind of a person to a formal language. It helps in transfer of knowledge from humans to computers. The art of representing solutions in a clear and systematic form is called programming. An Introduction to C C language is one of the most loved languages in the history of computing. The power and flexibility of C are amazing. It has the pride of being the language behind the most famous UNIX and LINUX systems. C is almost always used for Systems Programming and Operating Systems; such is the power of this magnificent language. Why use Structured Programming Languages? C is a structured programming language. Structured programming languages help us avoid sphagetti code, which was one of the most prominent problems during the early ages of assembly programming. A person reading the code must be able to cleanly understand the logic used in the code, this is possible only if the code follows a particular structure. The idea of structured programming is the main reason behind the success of C. C does include support for “goto” statements but they can almost always be avoided by the use of good programming skills, resulting in clean structured code. The main principle of good programming is making the code as clean as possible so that other people can read and understand the logic with little effort. The Structure of C programs can be generally divided into blocks of code enclosed by flower braces (this need not be always true, there can be global sections but these are special cases). Each block generally follows a same structure in the sense that each block can be divided into two parts. The first part is the declarations section where the declaration statements appear then comes the general section which consists of statements that use the stuff declared in the declaration section and user-defined constants to do some useful operations. 2) Variables, Constants, Printf and Scanf along with a few operators Data Types The basic data types in C are • char a character • int an integer, in the range -32,767 to 32,767 • long int a larger integer (up to +-2,147,483,647) • float a floating-point number • double a floating-point number, with more precision and perhaps greater range than float
There can be signed and unsigned modifiers to these basic types. The specified range of values that can be taken by a basic data type must be guaranteed by any implementation of the language, whereas the actual size of the data type may vary from implementation to implementation. You might wonder how the computer stores characters. The answer involves a character set, which is simply a mapping between some set of characters and some set of small numeric codes. Most machines today use the ASCII character set, in which the letter A is represented by the code 65, the ampersand & is represented by the code 38, the digit 1 is represented by the code 49, the space character is represented by the code 32, etc. (Most of the time, of course, you have no need to know or even worry about these particular code values; they're automatically translated into the right shapes on the screen or printer when characters are printed out, and they're automatically generated when you type characters on the keyboard. Eventually, though, we'll appreciate, and even take some control over, exactly when these translations--from characters to their numeric codes--are performed.) Character codes are usually small--the largest code value in ASCII is 126, which is the ~ (tilde or circumflex) character. Characters usually fit in a byte, which is usually 8 bits. In C, type char is defined as occupying one byte, so it is usually 8 bits. The advantage of ASCII format is that the characters representing the digits ‘0’ to ‘9’occupy consecutive positions in the scheme thus facilitating easy conversions between numeric strings and integers.
Variable Names These names (the formal term is ``identifiers'') consist of letters, numbers, and underscores. For our purposes, names must begin with a letter. Theoretically, names can be as long as you want, but extremely long ones get tedious to type after a while, and the compiler is not required to keep track of extremely long ones perfectly. The capitalization of names in C is significant: the variable names variable, Variable, and VARIABLE (as well as silly combinations like variable) are all distinct. A final restriction on names is that you may not use keywords (the words such as int and for which are part of the syntax of the language) as the names of variables or functions (or as identifiers of any kind).
Constants Character Constants Character constants are enclosed within single quotes like ‘a’, ’3’, ’)’ etc. They can involve escape sequences, the most common of which are
\n \b \r \' \" \\
a ``newline'' character a backspace a carriage return (without a line feed) a single quote (e.g. in a character constant) a double quote (e.g. in a string constant)
a single backslash
Integer Constants Octal constants have a leading 0, Hexadecimal constants have a leading 0x or 0X prefix while decimals are written simply as such. They can be modified by u or U suffix for unsigned counterparts and l or L in case we need a long int. Float constants The float constants have a decimal point in them, in which case they will automatically considered as float. We can have an optional f suffix to explicitly use a constant as a float constant. We could also represent floating point constants in the exponential form like 3e-4 etc. String Constants These are generally called literals in some languages. They are enclosed by double quotes ( “ ) and can have escape characters in between them. printf and scanf Statements The printf statement is the output function provided in <stdio.h> for outputting data, this has a lot of interesting variations which we deal with later. The general format of printf is printf(formatstring, arguments); The scanf statement is the input function provided as a part of the C library. This has the general format scanf(formatstring, arguments); 3) Operators in C and flow control statements Operators in C fall into five main classes: arithmetic operators, unary operators, relational and logical operators, assignment operators, and the conditional operator.
The basic operators for performing arithmetic are the same in many computer languages: + * / %
addition subtraction multiplication division modulus (remainder)
The - operator can be used in two ways: to subtract two numbers (as in a - b), or to negate one number (as in -a + b or a + -b). When applied to integers, the division operator / discards any remainder, so 1 / 2 is 0 and 7 / 4 is 1. But when either operand is a floating-point quantity (type float or double), the division operator yields a floating-point result, with a potentially nonzero fractional part. So 1 / 2.0 is 0.5, and 7.0 / 4.0 is 1.75. The modulus operator % gives you the remainder when two integers are divided: 1 % 2 is 1; 7 % 4 is 3. (The modulus operator can only be applied to integers.) There are also prefix and postfix increment and decrement operators ++ and -The relational and logical operators include < , > , >=, <= , == , &&, ||, ! etc The bitwise operators are &, |, ^, ~ , <<, >> etc The assignment operators include =, +=, -=, *=, /= etc. Simple conditional operations can be carried out with the conditional operator (? :). An expression that makes use of the conditional operator is called a conditional
expression. Such an expression takes the general form expression 1 ? expression 2 : expression 3
This is also referred to as the ternary operator in C. It is useful sometimes in compact code for simple if-else statements. If expression1 is true then expression2 is returned else expression 3 is returned.
Flow Control Statements Most of the statements in a C program are expression statements. An expression statement is simply an expression followed by a semicolon. The lines i = 0; i = i + 1; printf("Hello, world!\n"); c = sqrt(a*a + b*b);
The basic flow control statements are If-else statements The simple case is just the “if” statement of the form if( expression ) statement
These are of the form if( expression ) statement1 else statement2
In C a block of statements enclosed within flower braces can be used in the place of a single statement so we can have if( expression ) { statement1; statement2; ………………………… statementn; }
Here is an example of another common arrangement of if and else. Suppose we have a variable grade containing a student's numeric grade, and we want to print out the corresponding letter grade. Here is code that would do the job: if(grade >= 90) printf("A"); else if(grade >= 80) printf("B"); else if(grade >= 70) printf("C"); else if(grade >= 60) printf("D"); else printf("F");
Switch Statements The general form of a switch statement is: switch (variable) { case expression1: statement 1; break; case expression2: statement 2; break; .... default: do default processing; }
It is used in multiway branching where writing multiple if else statements is simply too tedious. While Statement The general syntax of a while loop is while( expression ) statement
A while loop starts out like an if statement: if the condition expressed by the expression is true, the statement is executed. However, after executing the statement, the condition is tested again, and if it's still true, the statement is executed again. (Presumably, the condition depends on some value, which is changed in the body of the loop.) As long as the condition remains true, the body of the loop is executed over and over again. (If the condition is false right at the start, the body of the loop is not executed at all.) More generally, the syntax of a for loop is for( expr1 ; expr2 ; expr3 ) statement
(Here we see that the for loop has three control expressions. As always, the statement can be a brace-enclosed block.) When the compiler sees a for loop, first, expr1 is evaluated. Then, expr2 is evaluated, and if it is true, the body of the loop (statement) is executed. Then, expr3 is evaluated to go to the next step, and expr2 is evaluated again, to see if there is a next step. During the execution of a for loop, the sequence is: expr1 expr2 statement expr3 expr2 statement expr3 ... expr2 statement expr3 expr2
Do while Loops do { statement1; statement2; ………………………… statementn;
} while (condition is satisfied)
The do loop also executes a block of code as long as a condition is satisfied. The difference between a "do" loop and a "while" loop is that the while loop tests its condition at the top of its loop; the "do" loop tests its condition at the bottom of its loop. As noted above, if the test condition is false as the while loop is entered the block of code is skipped. Since the condition is tested at the bottom of a do loop, its block of code is always executed at least once. Break And Continue Statements Sometimes, due to an exceptional condition, you need to jump out of a loop early, that is, before the main controlling expression of the loop causes it to terminate normally. Other times, in an elaborate loop, you may want to jump back to the top of the loop (to test the controlling expression again, and perhaps begin a new trip through the loop) without playing out all the steps of the current loop. The break and continue statements allow you to do these two things. (They are, in fact, essentially restricted forms of goto.) Ex:- The following program prints all the prime numbers from 1 to 1000 #include <stdio.h> #include <math.h> int main(){ int i, j; printf("%d\n", 2); for(i = 3; i <= 1000; i = i + 1){ for(j = 2; j < ((i/2)+1); j = j + 1){ if(i % j == 0) break; if(j > sqrt(i)) { printf("%d\n", i); break; } } } return 0; }
4) Pointers in C The first things to do with pointers are to declare a pointer variable, set it to point somewhere, and finally manipulate the value that it points to. A simple pointer declaration looks like this: int *p;
Pointers (that is, pointer values) are generated with the ``address-of'' operator &, which we can also think of as the ``pointer-to'' operator. We demonstrate this by declaring (and initializing) an int variable i, and then setting p to point to it: int i = 500; p = &i;
The contents-of operator * does not merely fetch values through pointers; it can also set values through pointers. We can write something like *p = 70;
To change the contents of the pointer variable itself use p = &j;
If we declare a pointer variable and include an initializer int *p = &i;
We're setting the initial value for p, which is where p will point, so that initial value is a pointer. (In other words, the * in the declaration int *p = &i; is not the contentsof operator, it's the indicator that p is a pointer.) When we write int *ip;
although the asterisk affects ip's type, it goes with the identifier name ip, not with the type int on the left. To declare two pointers at once, the declaration looks like int *ip1, *ip2;
This works for one pointer, because C essentially ignores whitespace. But if you ever write int* ip1, ip2;
/* PROBABLY WRONG */
5) Advanced pointers and Arrays Pointers do not have to point to single variables. They can also point at the contents of an array. For example, we can write int *p; int a[100]; p = &a[3];
Once we have a pointer pointing into an array, we can start doing pointer arithmetic. Given that p is a pointer to a[3], we can add 1 to p: p+1
In C, this gives a pointer to the cell one farther on, which in this case is a[4]. To make this clear, let's assign this new pointer to another pointer variable: p2 = p + 1;
If we now do *p2 = 4;
We’ve set a[4] to 4. But it's not necessary to assign a new pointer value to a pointer variable in order to use it; we could also compute a new pointer value and use it immediately: *(p + 1) = 5;
Given that we can add 1 to a pointer, it's not surprising that we can add and subtract other numbers as well. If p still points to a[3], then
*(p + 3) = 7;
sets a[6] to 7, and *(p - 2) = 4;
sets a[1] to 4. You can both compare two pointers as well as subtract two pointers, adding two pointers is illegal because it might point to some illegal memory location. There are a number of similarities between arrays and pointers in C. If you have an array int a[10];
you can refer to a[0], a[1], a[2], etc., or to a[i] where i is an int. If you declare a pointer variable ip and set it to point to the beginning of an array: int *ip = &a[0];
you can refer to *ip, *(ip+1), *(ip+2), etc., or to *(ip+i) where i is an int. There are also differences, of course. You cannot assign two arrays; the code int a[10], b[10]; a = b;
/* WRONG */
is illegal. As we've seen, though, you can assign two pointer variables: int *ip1, *ip2; ip1 = &a[0]; ip2 = ip1; Pointer assignment is straightforward; the pointer on the left is simply made to point wherever the pointer on the right does. We haven't copied the data pointed to (there's still just one copy, in the same place); we've just made two pointers point to that one place. The similarities between arrays and pointers end up being quite useful, and in fact C builds on the similarities, leading to what is called ``the equivalence of arrays and pointers in C.'' When we speak of this ``equivalence'' we do not mean that arrays and pointers are the same thing (they are in fact quite different), but rather that they can be used in related ways, and that certain operations may be used between them. The first such operation is that it is possible to (apparently) assign an array to a pointer: int a[10]; int *ip; ip = a;
Functions in C receive copies of their arguments. (This means that C uses call by value; it means that a function can modify one of its arguments without modifying the value in the caller.) When a function is called, the copies of the arguments are made as if by assignment. But since arrays can't be assigned they are passed by reference as the address to the memory location being pointed to. Strings are actually arrays. Any function that manipulates a string will actually accept it as a char * argument. The caller may pass an array containing a string, but the function will receive a pointer to the array's (string's) first element (character). The strings are terminated by ‘\0’ character therefore we can just use that sentinel character to detect the end of string when traversing a string by using character pointers. The most straightforward way to ``get'' a null pointer in your program is by using the predefined constant NULL, which is defined for you by several standard header files, including <stdio.h>, <stdlib.h>, and <string.h>. To initialize a pointer to a null pointer, you might use code like #include <stdio.h> int *ip = NULL;
We can use this NULL to check for errors when dealing with pointers to check if they have valid values or not. 6) Structures and Enumerations Structures The basic user-defined data type in C is the structure, or struct. (C structures are analogous to the records found in some other languages.) As a simple example, suppose we wanted to define our own type for representing complex numbers. A complex number consists of a real and imaginary part, we need a way of holding both these quantities in one data type, and a structure will do just the trick. Here is how we might declare our complex type: struct complex { double real; double imag; };
We declare variables of our new complex type with declarations like these: struct complex c1; or struct complex c2, c3;
We can also use the following instead:
struct complex { double real; double imag; } c1, c2, c3;
In which case, we would have defined the type struct complex, and right away declared three variables c1, c2, and c3 all of type struct complex. Like subscripted array references, references to the members of structure variables (using the structure selection operator) can appear anywhere, either on the right or left side of assignment operators. We could say c1.real = 1
To set the real part of c1 (that is, the real member within c1) to 1, or c1.imag = c2.imag
To fetch the imaginary part of c2 and assign it to the imaginary part of c1, or c1.real = c2.real + c3.real
There is a relatively small number of operations which C directly supports on structures. As we've seen, we can define structures, declare variables of structure type, and select the members of structures. We can also assign entire structures: the expression c1 = c2
Pointers in C are general; we can have pointers to any type. It turns out that pointers to structures are particularly useful. We declare pointers to structures the same way we declare any other pointers: by preceding the variable name with an asterisk in the declaration. We could declare two pointers to struct complex with struct complex *p1, *p2;
And, as before, we could set these pointers to point to actual variables of type complex: p1 = &c2; p2 = &c3;
Then, *p1 = *p2;
would copy the structure pointed to by p2 to the structure pointed to by p1 (i.e. c3 to c2), and p1 = p2;
would set p1 to point wherever p2 points If we wanted to access the member of a pointed-to structure, it would be a tiny bit messy--first we'd have to use * to get the structure pointed to, then . to access the desired member. Furthermore, since . has higher precedence than *, we'd have to use parentheses: (*p1).real
(Without the parentheses, i.e. if we wrote simply *p1.real, we'd be taking the structure p1, selecting its member named real, and accessing the value that p1.real points to, which would be doubly nonfunctional, since the real member is in our ongoing example not a pointer, and p1 is not a structure but rather a pointer to a structure, so the . operator won't work on it.) Since pointers to structures are common, and the parentheses in the example above are a nuisance, there's another structure selection operator that works on pointers to structures. If p is a pointer to a structure and m is a member of that structure, then p->m
selects that member of the pointed-to structure. The expression p->m is therefore exactly equivalent to (*p).m
Enumerations An enumeration can be declared like this: enum weekdays {sunday, monday, tuesday, wednesday, thursday, friday, saturday};
First use the keyword enum followed by the enumeration's name (or tag). The body of the definition is enclosed in curly brackets. The actual constants are then listed, separated by commas. The whole thing is closed with a semicolon. The code above does not actually create any variables; it just creates the enumeration type called weekdays. To create a variable of type weekdays do the following: enum weekdays myWeekday;
As you can see, this closely follows the way structures are used. In fact, as far as the enumeration name goes, all the ways for naming structure types, and creating variables of those structures, are also valid for enumerations.
The names in the body of the enumeration are all actually integers. In fact, once declared the enumeration values can be used anywhere an integer is used. By default, the values in the enum, start with zero, and increment by one. So the weekdays are: sunday = 0, monday = 1, tuesday = 2 etc.. In the definition you can specify if you want them to be different. Simply set each item in the enum equal to what you want. For example: enum weekdays { sunday = 1, monday = 2, tuesday = 3, wednesday = 4, thursday = 5, friday = 6, saturday = 7 };
This will adjust the values to more intuitive numbers. You could put in any values you want though. An easier way to do it is this: enum weekdays { sunday = 1, monday, tuesday, wednesday, thursday, friday, saturday };
This will start the numbering at one, and the rest will fall in line. The only other thing to mention about enum declarations is that once a symbolic value is declared for one enumerated type, it cannot be used in another enumerated type (at least not in the same scope). 7) Typedefs and their usage The commonest way to define structures is with a typedef, as shown below. typedef struct country { char name[20]; int population; char language[10]; } Country;
This defines a structure which can be referred to either as 'struct country' or Country, whichever you prefer. Strictly speaking, you don't need a tag name both before and after the braces if you're not going to use one or the other. But it's standard practice to put them both in and to give them the same name, but with the one after the braces starting with an uppercase letter.
The typedef statement doesn't occupy storage: it simply defines a new type. Variables that are declared with the typedef above will be of type 'struct country', just like population is of type integer. We can also create aliases for existing types like #include <stdio.h> int main () { typedef int my_type; my_type var1, var2, var3; var1 = 10; var2 = 20; var3 = 30; return 0; }
Once a typedef statement has been used, within the scope we can use the newly defined type as good as a primitive type, though the usual restrictions based on the operations supported by the primitive datatype or the structure we have created still hold good. 8) Union Union is another mechanism to make user defined data-types. The difference between unions and structures is that in the union all the variables share the same memory location and therefore the size of a union is the size of the largest datatype in the union, whereas in a structure the variables contained in the structure are assigned different memory locations and therefore the size of a structure is the sum of the sizes of its constituents. The concept of union is very useful in handling register operations in C which was the original reason for introducing the concept. Unions can be useful when we do not know the datatype we are going to deal with but have an idea of the domain of datatypes, which we will be using. #include<stdio.h> union temp_union { int a; char b; }; int main() { union temp_union c; c.a=10; printf("%d",c.a); c.b='a'; printf("%c %d",c.b,c.a); return 0; }
9) Graphics in C The graphics.h library provides preliminary support for graphics in C. Initializing the graphics: initgraph(&gdriver, &gmode, "c:\\tcplus\\bgi");
The gdriver variable contains the driver number, this is set to DETECT ( a macro defined in graphics.h) The gmode variable is left uninitialized this will be set to the graphics mode being used errorcode = graphresult();
If the errorcode value is not equal to grOk, that means an error occurred in initializing the graphics driver. A most common error would be that the egavga.bgi might not be in the path specified as the third argument to the initgraph, otherwise things would be fine. And remember that when you are done use closegraph();
This will close the graphics mode. This is important because if the graphics mode is not exited properly with this function then the command shell might still remain in the same mode causing problems for other programs running in the same shell. The maximum resolution available using graphics.h and the EGAVGA.BGI file would be about 640X480 with 16 colors. There is also a mode called mode 13 of the graphics which is available , this mode has a 320X240 resolution but supports 256 colors. Better graphics is not supported in C running in dos shell, WIN32 APIs is needed for better graphics. Some open source libraries like Allegro (available in the internet) can be used for better and easy graphics. The best graphics available in C would be at your disposal in the windows environment. There are many device driver level APIs like those provided by DIRECTX and the OPEN GL (open source) that can be made available easily when using the WIN32 API.