Introduction Data Structures

  • November 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Introduction Data Structures as PDF for free.

More details

  • Words: 2,178
  • Pages: 38
Data Structures with C++ Fakhar Lodhi

Software Design Quality • What is good design? – Is it efficient code? – compact implementation? – Most maintainable? • For most large, long life time software systems, maintenance cost normally exceeds development cost by factors ranging from 2 to 3. • Boehm (1975) quotes a pathological case where the development cost of an avionics system was $30 per line of code but the maintenance cost was $4000 per instruction

Maintainable Design • Cost of system changes is minimal • readily adaptable to modify existing functionality and enhance functionality • design is understandable • changes should be local in effect • Design should be modular

Abstraction • Abstraction is a technique in which we construct a model of an entity based upon its essential characteristics while ignoring the inessential details. • The principle of abstraction also helps in handling the inherent complexity of a system by allowing looking at its important external characteristic, and hiding its inner complexity at the same time. • Hiding the internal details is called encapsulation. • Engineers of all fields, including computer science, have been practicing abstraction for mastering complexity. • Code and Data abstraction

void selectionSort(int a[], int size) { int i, j, min, temp; for(i = 0; i < size –1; i++) { // find minimum from i to size - 1 min = i; for (j = i; j < size; j++) { if (a[j] < a[min]) min = j; } // swap element with the ith location // with the minimum value temp = a[i]; a[i] = a[min]; a[min] = temp; } }

void selectionSort(int a[], int size) { int i, j, min, temp; for(i = 0; i < size –1; i++) { // find minimum from i to size - 1 min = i; for (j = i; j < size; j++) { if (a[j] < a[min]) min = j; } // swap element with the ith location // with the minimum value temp = a[i]; a[i] = a[min]; a[min] = temp; } }

int minimum (int a[ ], int from, int to) { int min = from; for (int i = from; i <= to; i++) if (a[i] < a[min]) min = i; return min; }

void swap (int &x, int &y) { int temp = x; x = y; y = temp; }

void selectionSort(int a[ ], int size) { int i, j, min; for (i=0; i < size-1; i++) { min = minimum (a, i, size-1); swap(a[i], a[min]); } }

int minimum (int a[ ], int from, int to) { int min = from; for (int i = from; i <= to; i++) if (a[i] < a[min]) min = i; return min; }

void swap (int &x, int &y) { int temp = x; x = y; y = temp; }

void selectionSort(int a[ ], int size) { int i, j, min; for (i=0; i < size-1; i++) { min = minimum (a, i, size-1); swap(a[i], a[min]); } }

void selectionSort(int a[], int size) { int i, j, min, temp; for(i = 0; i < size –1; i++) { // find minimum from i to size - 1 min = i; for (j = i; j < size; j++) { if (a[j] < a[min]) min = j; } // swap element with the ith location // with the minimum value temp = a[i]; a[i] = a[min]; a[min] = temp; } }

Data Abstraction and Abstract Data Types (ADT) • A data type is a template for objects and a set of operations that define the behavior of the objects (or instances) of that type. • An Abstract Data Type (ADT) is a data type in which the implementation details are hidden and the user is concerned with the properties (or behavior) of that type. • An ADT has two components: – Interface – the behavior – Implementation

• Example: – int, float

Abstract Data Type • The data structures used to implement the data type can only be accessed through the interface. • Any change in the implementation does not change the user programs as long as the interface remains the same. • This is also known as data encapsulation or data abstraction

Interface 1

Interface 2

Implementation Interface 3

Interface 4

An ADT

Abstraction VS. Implementation • x  (01000001)2 •x=? • If x is character then x  ‘A’ • If x is integer then x  65

int main(int argc, char* argv[]) { int i, *pi; float f, *pf; i = 1024; pi = &i; pf = (float *) pi; f = *pf; cout << " i = " << i

<< "

f = " << f << "\n \n";

<< "

f = " << f << "\n \n";

f = i; cout << " i = " << i return 0; }

WHY?

Abstraction VS. Implementation Two dimensional array 1

5

3

6

3

2

38

64

22

76

82

99

0

106

345

54

User’s view (abstraction) 1

5

3

6

3

2

38

64

22

76

82

System’s view (implementation)

99

0

106

345

54

ADTs and C++ Classes • A class is used to define (and implement) an ADT in C++. • A class consists of data and functions that operate on that data. • A class in C++ has two parts – public and private (let’s ignore the protected members for now). • The data and functions defined in the class are called the members of the class.

ADTs and C++ Classes • Users of the class can only access and manipulate the class state through the public members of the class. • Private members can only be used by other members of the class (let’s also ignore the friends for now). • Data encapsulation is achieved by declaring all data members of a class to be private. • The interface to the class is provided through the use of public member functions.

Data Structures • The primary objective of programming is to efficiently process the input to generate the desired output. • We can achieve this objective in an efficient and neat style if the input data is organized in a way to help us meet our goal. • Data Structures is nothing but ways and means of organizing data so that it can be processed easily and efficiently. • Data structures dictate the manner in which the data can be processed. In other words, the choice of an algorithm depends upon the underlying data organization.

Problem: Determine if a key is present in a collection of 10 integers Organization 1: Data are stored in 10 disjoint (as opposed to composite) variables: A0, A1, A2, A3, …, A9

Algorithm found = false; if (key = = A0) else if (key = = A1) else if (key = = A2) else if (key = = A3) else if (key = = A4) else if (key = = A5) else if (key = = A6) else if (key = = A7) else if (key = = A8) else if (key = = A9)

found = true; found = true; found = true; found = true; found = true; found = true; found = true; found = true; found = true; found = true;

Problem: Determine if a key is present in a collection of 10 integers

Organization 2: Data are stored in an array A of 10 elements Algorithm found = false; for (int i = 0; i < 10; i++) if (A[i] = = key) { found = true; break; }

• Average number of comparisons in both cases is the same so both algorithms are equally efficient (or in efficient) • Organization 2 is better because it yields an algorithms which is more maintainable. For example, if the collection contains 100 elements. In general, number of elements could be N.

Problem: Determine if a key is present in a collection of 10 integers

Organization 3: Data are stored in an array A in ascending order

Algorithm found = false; low = 0; high = N – 1; while (( ! found) && ( low <= high)) { mid = (low + high)/2; if (A[mid] = = key) else if (A[mid] > key) else }

found = true; high = mid – 1; low = mid + 1;

• Average number of comparisons ? • Order of “log(N)” as compared to N. Animate

found = false; low = 0; high = N – 1; while (( ! found) && ( low <= high)) { mid = (low + high)/2; if (A[mid] = = key) else if (A[mid] > key) else }

5

6

7

8

9

found = true; high = mid – 1; low = mid + 1;

0

1

2

3

4

10 11 12 13 14 15

2

3

5

7

10 12 15 22 28 29 32 47 48 50 55 73 key = 29 Back

Performance Comparison • NADRA database: ~80,000,000 records • Computer which can perform 10,000 comparisons per second – Linear search: – Binary search:

~2.22 hours ~0.005 seconds

– Roughly 1.6 million times less

• Why?

Performance Analysis 1. Does the program efficiently use primary and secondary storage? 2. Is the program’s running time acceptable for the task? 3. Space Complexity: •

The space complexity of a program is the measure of the amount of memory that it needs to run to completion.

4. Time Complexity: •

The time complexity of a program is the measure of the amount of computer time it needs to run to completion.

Performance Estimation • How to determine which algorithm is better? • We need some mechanism to predict the performance without actually executing the program. • Mechanism should be independent of the compiler and underlying hardware.

Step Count • •

Program Step: A program step is a meaningful program segment. We can consider each statement as a single step. a = 2; a = 2 * b + c + 3 * c / d – e;

Step Count To count total number of steps, we must determine: • • • •

Step count for each statement – steps/execution or s/e Frequency of each statement Total steps for each statement Finally sum these counts to get the total step count

Example 1 – Summing of a list of numbers Statement float sum (float list[], int n) { float temp = 0; int i; for (i=0; i
Total

S/ e 0 0 1 0 1 1 1 0

Frequency Total Steps 0 0 0 0 1 1 0 0 n+1 n+1 n n 1 1 0 0 2n+3

Step Count table for summing of a list of numbers

Example 2 – Matrix addition Statement void add(int a[][MAX_ SI ZE], int b[][MAX_ SI ZE], int c[][MAX_ SI ZE], int rows, int cols) { int i,j; for (i=0; i
S/ e 0

Frequency 0

Total Steps 0

0 0 1 1 1

0 0 row+1 rows.(cols+1) rows.cols

0 0 row+1 rows.(cols+1) rows.cols

0

0

0 2.rows.(cols+1)+1

Step Count Table for matrix addition

Big Oh (O) • Determining the exact step count of a program can be a very difficult task • Because of the inexactness of the definition of a step, exact step count is not very useful for comparative purposes. e.g., which one is better 45n+3 or 100n+10 • We use some asymptotic notations as measure of growth.

Big Oh (O) Big Oh is defined as: f(n)=O(g(n)) iff there exists positive constants c and n0 such that f(n) ≤ cg(n) for all values of n ≥ n0 No matter what the value of c1, c2, c3, there will be an n beyond which the program with complexity c3n will be faster than the one with complexity c1n2+c2n Theorem: If f(n) = amnm+….+a1n+a0, then f(n) = O(nm)

Big Oh (O) An estimate of how will the running time grow as a function of the problem size.

Growth Functions log(n)

n

n log(n)

n**2

n**3

2**n

2500

growth

2000 1500 1000 500 0 1

2 3

4

5 6

7 8

9 10 11 12 13 14 15 16 17 18 19 20 n

Big Oh (O) • • • •

Summing of list of numbers O(n) Matrix addition O(rows.cols Searching a key in an array ) Binary Search O(n) O(log n)

Big Oh (O) int search_min(int list[], int from, int to) { int i; int min= from; for (i=from; i<=to; i++) If (list[i] < list[min]) min = i;

return min; }

// O(n)

void swap(int &a, int &b) { int temp=a; a = b; b = temp; }

// O(1)

Big Oh (O) void selection_sort(int list[], int size) { int i,j; for (i=0; i<size-1; i++) { // O(Size) or O(n) j= search_min(list, i+1, size-1); //O(n).O(n)=O(n2)

swap (list[i], list[j]); //O(1).O(n)=O(n) }

// O(n)+O(n2)+O(n)+O(n)=O(n2) }

Big Oh (O) • • • • • • •

O(1) O(log n) O(n) O(n log n) O(n2) O(n3) O(2n) -

constant logarithmic linear log linear quadratic cubic exponential

Related Documents

Data Structures
October 2019 38
Data Structures
June 2020 21
Data Structures
April 2020 34
Data Structures
May 2020 22