
CPSC 211 Data Structures & Implementations (c) Texas A&M University [ 0 ] 

About These Slides 

These slides were developed by 

Prof. Jennifer Welch 

Department of Computer Science 

Texas A&M University 

College Station, TX 77843-3112 

welch@cs.tamu.edu 

during Spring 1999. Comments and suggestions for 

improvements are welcome.


What are Data Structures? 

Data structures are ways to organize data (information). 

Examples: 

simple variables — 

objects — 

arrays — 

linked lists — 

Typically, algorithms go with the data structures to 

manipulate the data (e.g., the methods of a class). 

This course will cover some more complicated data 

structures: 

how 

what


Abstract Data Types 

An abstract data type (ADT) defines 

 

 

Similar to a 

This course will cover 

specifications of 

pros and cons of 

how the


Specific ADTs 

The ADTs to be studied (and some sample applications) 

are:


How Does C Fit In? 

Although data structures are universal (can be implemented 

in any programming language), this course will 

use Java and C: 

 

 

We will learn how to gain the advantages of 

Reasons to learn C: 

learn 

useful 

ubiquitous and 

Unix 

C code can be very 

very efficient


Other Topics 

Course will emphasize good software development 

practice: 

 

 

 

 

Course will touch on several more advanced computer 

science topics that appear later in the curriculum, and 

fit in with our topics this semester:


Principles of Computer Science 

Computer Science is like: 

engineering: 

science: 

math: 

However, CS studies 

Recurring concepts in computer science are: 

layers, hierarchies, information-hiding, abstraction, 

interfaces 

efficiency, tradeoffs, resource usage 

reliability, affordability, correctness


Introduction to Data Structures 

Data structures are one of the enduring principles 

in computer science. Why? 

1. Data structures are based on the notion of information 

hiding: 

2. A number of data structures are useful in a wide 

range of applications.


Efficiency Considerations 

Since these data structures are so widespread, it’s important 

to implement them efficiently. Measures of 

efficiency: 

 

 

in 

 

 

We will study tradeoffs, such as 

 

 

Efficiency will be measured using


Asymptotic Analysis 

Actual (wall-clock) time of a program is affected by: 

 

 

 

 

 

 

Instead of wall-clock time, look at the pattern of the 

program’s behavior as the problem size increases. This 

is called asymptotic analysis.


Big-Oh Notation 

Big-oh notation is used to capture the generic 

From a practical point of view, you can get the big-oh 

notation for a function by 

1. 

2. 

Which terms are lower order than others? In increasing 

order: 

Examples: 

4302 = 

n^3 + n log n + n^5 + n = 

34n^3 − 2n log n + .0004n^5 + 5.2n = 

See Appendix B, Section 4 of Standish, or CPSC 311, 

for mathematical definitions and justifications.


Why Multiplicative Constants are Unimportant 

An example showing how multiplicative constants become 

unimportant as n gets very large: 

n          1000 log n          .0001 n^2 

2 

256 

4096 

8192 

16,384 

32,768 

1,048,576 
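
The two columns of the table can be computed with a short program. This is only a sketch: it assumes that "log" means the base-2 logarithm, and the class and method names (GrowthTable, thousandLog, smallQuadratic) are made up for the illustration.

```java
// Prints 1000 log n next to .0001 n^2 for the values of n in the table.
// Assumption: "log" is the base-2 logarithm.
class GrowthTable {
    static double thousandLog(long n) {
        return 1000.0 * (Math.log(n) / Math.log(2));
    }

    static double smallQuadratic(long n) {
        return 0.0001 * n * n;
    }

    public static void main(String[] args) {
        long[] sizes = {2, 256, 4096, 8192, 16384, 32768, 1048576};
        System.out.printf("%10s %14s %18s%n", "n", "1000 log n", ".0001 n^2");
        for (long n : sizes) {
            System.out.printf("%10d %14.0f %18.4f%n",
                    n, thousandLog(n), smallQuadratic(n));
        }
    }
}
```

Under the base-2 assumption, the quadratic column first overtakes the logarithmic column somewhere between n = 8192 and n = 16,384, even though its constant is ten million times smaller.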

Big-oh notation is not always appropriate! If your 

program is working on small input sizes,


Generic Steps 

How can you figure out the running time of an algorithm 

without implementing it, running it on various 

inputs, plotting the results, and fitting a curve to the 

data? And even if you did that, how would you know 

you fit the right curve? 

We count generic steps of the algorithm. Each generic 

step that we count should be 

Classifying an assignment statement as a generic step 

is 

Classifying a statement “sort the entire array” as a generic 

step is


Stack vs. Heap 

Memory used by an executing program is partitioned: 

the stack: 

– When a method begins executing, a piece of the 

stack (stack frame) is devoted to it. 

– There is an entry in the stack frame for 

 

 

 

– For variables of primitive type, the data itself is 

stored 

For variables of object type, 

– When the method finishes, the method’s stack frame 

is 

the heap: Dynamically allocated memory goes here, 

including the actual data for objects. Lifetime is


Stack Frames Example 

q 

p 

p 

main main main 

main calls p 

p calls q 

s 

r 

r 

p p p 


q returns p calls r r calls s 

r 

p 

p 


s returns r returns p returns


Objects 

An object is an entity (e.g., a ball) that has 

state — 

behavior — 

A class is the 

Analogy: a class is like an 

an object is like an 

class defines important 

construction is required to 

many objects/houses can be created


Data Abstraction 

The class concept supports 

Similar principles apply as for procedural abstraction: 

group 

group 

separate the issue of 

separate the issue of


References 

The class of an object is its 

Objects are declared differently than are variables of 

primitive types. 

Suppose there is a class called Person. 

int total; 

Person neighbor; 

Declaration of total allocates storage on the 

Declaration of neighbor allocates storage on the


Creating Objects 

A constructor is a special method of the class that 

When a constructor is called, 

storage space is allocated 

each object gets 

the object’s state is 

The name of the constructor for class X is X(). Ex: 

neighbor = new Person(); 

The operator new must be put in front of the call to the 

constructor. 

Summary: Declaring a variable of an object type produces


Creating Objects (cont’d) 

You can combine the declaration and initialization: 

Person neighbor = new Person(); 

just as you can for primitive types: 

int total = 25;


Object Assignment & Aliases 

The meaning of assignment is different for objects than 

it is for primitive types. 

int num1 = 5; 

int num2 = 12; 

num2 = num1; 

At the end, num2 holds 5. 

Person neighbor = new Person(); // creates object 1 

Person friend = new Person(); // creates object 2 

friend = neighbor; 

At the end, friend and neighbor both refer to object 1 

(they are aliases of each other) and nothing refers 

to object 2 (it is inaccessible).
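
A small experiment makes the difference visible. In this sketch the age field of Person and the class AliasDemo are assumptions added for the illustration; the slides do not show the contents of Person.

```java
class Person {
    int age;   // hypothetical field, used only to observe aliasing
}

class AliasDemo {
    public static void main(String[] args) {
        int num1 = 5;
        int num2 = 12;
        num2 = num1;     // copies the value 5 into num2
        num1 = 7;        // has no effect on num2

        Person neighbor = new Person();   // creates object 1
        Person friend = new Person();     // creates object 2
        friend = neighbor;                // copies the reference, not the object
        friend.age = 30;                  // changes object 1

        System.out.println(num2);               // still 5
        System.out.println(neighbor.age);       // 30, seen through the alias
        System.out.println(friend == neighbor); // true: same object
    }
}
```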


Data Abstraction Revisited 

As a rule of thumb, referring to instance variables outside 

the class is 

For instance, the implementor of the Person class 

might decide to store the age 

In this case, getAgeInYears must change: 

Code that got the age using this method need not change, 

but code that got the age using .age directly 

Moral:


Public vs. Private 

You can tailor the ability to access methods and variables 

from outside the class, using visibility modifiers. 

public: the variable or method can 

private: the variable or method can 

Visibility modifiers go at the beginning of the line that 

declares the variable or method. Ex: 

public static void main(... 

private int age; 

Rules of thumb: 

make instance variables 

make instance methods that are part of the public 

interface of the class 

make instance methods that help with internal work 

of a class


Public vs. Private (cont’d) 

Instance variables should be accessible only indirectly 

via public "get" and "set" methods. Ex: 

getAgeInYears() 

Group together all the private variables/methods, and 

all the public ones when you format your program.


Specification vs. Implementation 

Users of a class should rely only on the specification of 

the class. They are allowed to 

declare 

create 

invoke 

Implementors of a class should 

define 

hide 

protect 

feel free to


Inheritance 

Inheritance lets a programmer derive a new class from 

an existing class. New class can 

use 

modify 

have 

Thus inheritance promotes software reuse. It is a defining 

characteristic of 

Terminology: 

Class A is derived from (or, inherits from) another 

class B 

A is called subclass or child class. 

B is called superclass or parent class.


Benefits of Inheritance 

Inheritance is particularly useful in large software projects: 

 

– saves 

– provides 

– supports


Costs of Inheritance 

 

– Usually this disadvantage is outweighed by 

– Once system is working,


Inheritance in Java 

To declare that a class is a subclass of another class: 

class <child-class> extends <parent-class> { 

... // define the child-class 

} 

child class inherits 

child class inherits 

child class does NOT inherit 

child class does NOT inherit 

Inherited variables and methods can be used in the 

child class 

Inheritance is a one-way street!


Protected Visibility 

private: 

public: 

This makes it dangerous to inherit variables, since normally 

instance variables should not be made accessible 

outside the class. 

The solution is 

protected:


Overriding Methods 

When a child class defines a method with the same 

name and signature (sequence of parameters) as the 

parent, the child’s version overrides the parent’s version. 

Useful when 

Polymorphism means that 

These are not necessarily the same, since a variable can 

refer to any object whose class is a descendant of the 

variable’s class. 

When in doubt, draw a memory diagram!
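
A minimal sketch of overriding, using made-up classes Animal and Dog (they are not from the course materials). The variable is declared with the parent's class but refers to an object of the child's class, and the child's version of the method is the one that runs.

```java
class Animal {
    String sound() {
        return "...";
    }
}

class Dog extends Animal {
    @Override
    String sound() {        // same name and signature as the parent's method
        return "Woof";
    }
}

class OverrideDemo {
    public static void main(String[] args) {
        Animal a = new Dog();            // variable's class: Animal; object's class: Dog
        System.out.println(a.sound());   // the Dog version is chosen at run time
    }
}
```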


Abstract Classes — Motivation 

Consider a database for a veterinarian to keep track of 

medical and billing information for each patient. 

Each patient is someone’s pet (e.g., dog, bird). 

Some aspects of the vet’s business are independent 

of the particular species (e.g., billing, owner info). 

Some aspects depend critically on the species (e.g., 

the vaccination schedule, diet recommendations). 

An obvious organization is to have a 

Note that it does not make sense to create a Pet object 

— 

The Pet class is used to


Rules for Abstract Classes and Methods 

Only instance methods can be declared 

Any class with an abstract method must be declared 

A class may be declared abstract 

An abstract class cannot 

A non-abstract subclass of an abstract class must 

If a subclass of an abstract class does not implement 

all of the abstract methods that it inherits, then 

Since an abstract class cannot be instantiated, its variables 

and methods are not directly used. But they can 

be


Declaring an Interface 

An interface is an abstract class taken to the extreme. 

It is like an abstract class in which 

interface { 

// public final 

// public abstract 

} 

An interface provides 

a collection of 

a collection of 

For example:


Implementing an Interface 

The syntax for “inheriting from” (called implementing) 

an interface I is: 

class B implements I { ... } 

For example: 

The class Account 

can access the 

must provide an implementation of


Abstract Classes vs. Interfaces 

An abstract class can be used as a repository of 

A class can implement 

Both abstract classes and interfaces can be used to


Object-Oriented Design 

The design of a software system is an iterative process. 

choose 

develop 

previous step may indicate that 

develop 

etc. 

As the design matures, objects are abstracted into classes: 

group 

put 

determine 

Initial design effort focuses on the overall structure of 

the program. The algorithms for the methods are specified 

using pseudocode. Actual coding begins


Deciding on Objects and Classes 

Make some guesses about what the objects in the system 

are and try to arrange them into groups (which 

will be the classes). Although you should put serious 

thought into this, don’t try to do this perfectly on the 

first pass. 

Rule of Thumb: 

Later you may need 

As you come up with the objects, some details (variables 

and methods) will be obvious. Document these 

and test them out with scenarios — 

A scenario is a


Linked List 

Linked lists are useful when 

Linked lists are an example of 

Separate blocks of storage are 

Linked representations are an important alternative to 

Many key abstract data types (lists, stacks, queues, sets, 

trees, tables) can be represented with either 

Important to understand the


Pointers 

Pointers in Java are called 

However, you cannot


Linear Linked Lists 

The list consists of a series of 

Each node contains 

 

 

To realize this idea in Java: 

each 

class 

– 

– 

another class


Linear Linked Lists (cont’d) 

Here is a diagram of the heap: 

Space complexity:


Linked List Example — Node Class 

For a linked list of books, first define a class that represents 

individual list elements (nodes). 

The type of the link variable is the same as the class 

being defined —


Linked List Example — List Class 

Then define a class that represents


Linked List Operations 

What should be the operations on a linked list? 

– 

– 

– 

– 

– 

 

– 

Add some instance methods to the BookList class:


Using a Linked List 

Example:


Inserting at the Front of a Linked List 

Pseudocode: 

1. 

2. 

In Java (assuming the parameter is not null):
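
A minimal sketch of front insertion, assuming a node class BookNode with a title field and a list class BookList with a first reference (these names and fields are assumptions, not the course's code):

```java
class BookNode {
    String title;     // hypothetical data field
    BookNode link;    // next node in the list, or null at the end

    BookNode(String title) {
        this.title = title;
    }
}

class BookList {
    BookNode first;   // null when the list is empty

    // Assumes newNode is not null.
    void insertFront(BookNode newNode) {
        newNode.link = first;   // new node points at the old first node
        first = newNode;        // list now begins at the new node
    }
}
```

The same two assignments also handle the empty list, since first is null in that case and the new node's link simply becomes null.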


Inserting at the Front of a Linked List (cont’d) 

What happens if we do step 1 and step 2 in the opposite 

order? 

Time Complexity:


Inserting at the End of a Linked List 

First, assume the list is empty (i.e., first equals null). 

1. 

2. 

Now, assume the list is not empty (i.e., first does 

not equal null). 

1. 

2. 

How do we do step 1?


Inserting at the End of a Linked List (cont’d) 

Time Complexity:


Using a Last Pointer 

To improve running time, keep a pointer to the last 

node in the list class, as well as the first node. 
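
One way this might look, as a sketch: TailNode and TailList are hypothetical names, and the list keeps both a first and a last reference so that no traversal is needed to find the end.

```java
class TailNode {
    String title;     // hypothetical data field
    TailNode link;

    TailNode(String title) {
        this.title = title;
    }
}

class TailList {
    TailNode first;   // null when the list is empty
    TailNode last;    // null when the list is empty

    // Assumes newNode is not null.
    void insertEnd(TailNode newNode) {
        newNode.link = null;        // new node will be the end of the list
        if (first == null) {
            first = newNode;        // empty list: new node is also the first
        } else {
            last.link = newNode;    // hook new node after the old last node
        }
        last = newNode;
    }
}
```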

Time Complexity:


Using a Last Pointer (cont’d)


Deleting Last Node from Linked List 

Suppose we want to delete the node at the end of the 

list and return the deleted node. 

First, let’s handle the boundary conditions: 

If the list is empty, 

If the list has only one element


Deleting Last Node from Linked List (cont’d) 

Suppose the list has at least two elements. 

First attempt: 

1. 

2. 

3. 

... 

Step 1 can be done as before. 

return this 

What about step 2?


Deleting Last Node from Linked List (cont’d) 

Time Complexity: 

Would it help to keep a last pointer?


Linked Lists Pitfalls 

Check that a link is not null before following it! 

Example: 

Mark end of list 

Be careful with boundary cases! 

Draw memory diagrams! 

Don’t lose access to needed objects!


Linked Lists vs. Arrays 

Space complexity: 

Time Complexity (n data items): 

insert front 

singly singly doubly doubly array 

linked linked, linked linked, 

last ptr last ptr 

insert end 

delete first 

delete last 

search


Linked Lists vs. Arrays (cont’d) 

Suppose the items in the sequence are in sorted order. 

Then data items must be inserted in the correct place. 

But perhaps this will make searching for an item easier. 

Break the insertion process into two parts: 

1. search 

2. insert 

search 

singly singly doubly doubly array 

linked linked, linked linked, 

last ptr last ptr 

insert


Linked Lists vs. Arrays (cont’d) 

Tradeoff: 

linked list: 

– insert is 

– search is 

because nodes 

arrays: 

– insert is 

– search is 

because nodes 

Binary search cannot be used on 

Later we will see some other data structures that try to


Other Linked Structures 

We don’t have to restrict ourselves to just having one 

link instance variable per node. We can get arbitrarily 

complicated linked structures. 

Some of the more common and useful ones are: 

doubly linked list — 

rings — 

trees — 

general graphs —


Recursion 

Idea of recursion is closely related to the principle of 

Figure out how to 

Assume you have a 

Figure out how to 

This is also an application of 

Rules for recursive programs: 

There must be 

Recursive call(s) must


Stack Frames for Recursive Methods 

When a recursive method is executed, 

Example: 

The factorial of n, represented n!, is calculated as 

n × (n − 1) × (n − 2) × ... × 2 × 1. 

To compute n!:
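
A sketch of a recursive method for this, named fact to match the call fact(4) used with the stack frames (the class name Factorial and the choice of long are assumptions):

```java
class Factorial {
    // Assumes n >= 0. The result overflows a long for n > 20.
    static long fact(int n) {
        if (n <= 1) {
            return 1;               // stopping case
        }
        return n * fact(n - 1);     // recursive call on a smaller problem
    }
}
```

Each call waits for the call below it to return, so the frames for fact(4), fact(3), fact(2) and fact(1) are all on the stack at the deepest point.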


Stack Frames for Factorial Example 

Stack frames when calling fact(4) :


Reversing a Linked List Recursively 

To find a recursive solution, break the problem down 

into a smaller problem. Let the list consist of nodes 

x_1, x_2, ..., x_n. 

One idea: 

1. Reverse 

2. Put 

Step 1 solves a smaller problem; step 2 does a little 

more work to solve the larger problem. 

(A similar idea: 

1. Reverse 

2. Put 

Stopping case?


Reversing a Linked List Recursively (cont’d) 

abstract class Node { 
    Node link; 
} 

class LinkedList { 
    Node first; 
    ... 
    void reverseList() { 
        first = reverse(first); 
    } 
} 

reverseList is an instance method that 

Note a common occurrence:


Reversing a Linked List Recursively (cont’d) 

reverse takes as a parameter 

reverse returns


Concatenating Two Lists 

Method concat appends the list starting with node b 

to the end of the list starting with node a. It returns a 

reference to the first node in the resulting list. 
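
A sketch of how reverse and concat could fit together, following the idea of reversing the rest of the list and then putting the first node at the end. The Node class here is a concrete stand-in with an int data field, and the class name ReverseSketch is made up; both are assumptions for the illustration.

```java
class Node {
    int data;     // hypothetical payload so the order can be checked
    Node link;

    Node(int data, Node link) {
        this.data = data;
        this.link = link;
    }
}

class ReverseSketch {
    // Appends the list starting at b to the end of the list starting at a.
    static Node concat(Node a, Node b) {
        if (a == null) {
            return b;
        }
        a.link = concat(a.link, b);
        return a;
    }

    // Returns a reference to the first node of the reversed list.
    static Node reverse(Node head) {
        if (head == null || head.link == null) {
            return head;                      // stopping case: 0 or 1 nodes
        }
        Node rest = head.link;
        head.link = null;                     // detach the first node
        return concat(reverse(rest), head);   // put it at the end
    }
}
```

Note that concat walks to the end of its first list on every call, which matters when counting the steps taken by reverse.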

Time Complexity: To reverse a list of n nodes takes


Figure for Reversing a Linked List Recursively


Reversing an Array Recursively 

Let A be an array of size n. To reverse A, we must 

change which indexes are occupied by which data, so 

that at the end: 

A[0] contains 

A[1] contains 

etc. 

We can follow the ideas from the linked list: 

1. save 

2. recursively cause 

3. store 

This breaks the problem of size n down into a subproblem 

of size n − 1. 

Stopping case:


Reversing an Array Recursively (cont’d) 

The following reverses the elements of A starting at 

index start: 

The top level call is:
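
One possible reading of the save / recurse / store steps, as a sketch (ArrayReverse is a made-up class name, and the method is meant to be started with start = 0). Every element is saved in its own stack frame on the way down, and stored into its mirrored position on the way back up.

```java
class ArrayReverse {
    static void reverse(int[] A, int start) {
        if (start >= A.length) {
            return;                        // stopping case: nothing left to save
        }
        int saved = A[start];              // 1. save this element in the stack frame
        reverse(A, start + 1);             // 2. recursively handle the rest
        A[A.length - 1 - start] = saved;   // 3. store it in the mirrored index
    }
}
```

A top-level call under these assumptions would be ArrayReverse.reverse(A, 0).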


Figure for Reversing an Array Recursively


Towers of Hanoi 

Towers of Hanoi is an example of a problem that 

is much easier to solve using recursion than not using 

recursion. 

There are 3 pegs and n disks, all of different sizes 

Initially all disks are on the start peg, stacked in 

decreasing size, with largest on bottom and smallest 

on top. 

We must move all the disks to the end peg 

The third peg 

Example: n = 2. Solution is: 

1. Move 

2. Move 

3. Move 

For larger n, it becomes difficult to figure out.


Recursive Solution to Towers of Hanoi 

Using recursion can help. Suppose someone gives us a 

method M to move n − 1 disks. We can use it to solve 

the problem for n disks as follows: 

1. Move 

2. Move 

3. Move 

Steps 1 and 3 will be done 

Stopping case?


Figure for Towers of Hanoi


Recursive Solution to Towers of Hanoi (cont’d) 

The output of the program will be a list of instructions. 

To call this method, suppose you have 4 disks and you 

want to use peg 1 as the start peg, peg 3 as the finish 

peg, and peg 2 as the spare peg:
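
A sketch of the recursive method and the call described above. The names Hanoi and move, the wording of the instructions, and the choice to collect them in a list before printing are all assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

class Hanoi {
    // Appends to out the instructions that move n disks from start to finish,
    // using spare as the extra peg.
    static void move(int n, int start, int finish, int spare, List<String> out) {
        if (n == 0) {
            return;                                  // stopping case: nothing to move
        }
        move(n - 1, start, spare, finish, out);      // clear the top n - 1 disks
        out.add("Move disk " + n + " from peg " + start + " to peg " + finish);
        move(n - 1, spare, finish, start, out);      // put them back on top
    }

    public static void main(String[] args) {
        List<String> steps = new ArrayList<>();
        move(4, 1, 3, 2, steps);   // 4 disks, start peg 1, finish peg 3, spare peg 2
        for (String s : steps) {
            System.out.println(s);
        }
    }
}
```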


Time Complexity of Towers of Hanoi Solution 

Time Complexity: Asymptotically proportional to the 

number of 

Each instantiation of the method 

To count the number of instantiations, draw a 

Number of vertices in the tree is 

Therefore time complexity is


Parsing Arithmetic Expressions 

An important part of a compiler is the parser, which 

checks whether 

An important part of this problem is to check whether 

a + (b − (x / y)) 

a + + b / z 

(a)) c 

To simplify the problem: 

Assume that the operands are 

Only consider operators 

The correct syntax for arithmetic expressions can be 

described using


A Grammar for Arithmetic Expressions 

Sample Rules: (| means “or”) 

1. 

2. 

3. 

Here are some derivations:


Recursive Parsing Algorithm 

Idea is to try to obtain an expression from the input. To 

do this, try to obtain from the input 

 

 

 

To obtain a term from the input (starting at the current 

position), try to obtain 

 

 

 

To obtain a factor from the input (starting at the current 

position), try to obtain


Recursive Parsing Algorithm (cont’d) 

At the top level: 

boolean valid(String input) { 
    String remainder = getExpr(input); 
    return ((remainder != null) && 
            (remainder.length() == 0)); 
} 

getExpr recognizes an expression at the beginning 

of input and returns the rest of the string, which will 

be the empty string if nothing is left over. If a syntax 

error is encountered, it returns null. (Does not handle 

white space in the input.)
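
A sketch of how getExpr and its helpers might be written. The grammar used here is an assumption, since the rules are not reproduced in these notes: an expression is a term optionally followed by + or - and another expression; a term is a factor optionally followed by * or / and another term; a factor is a single letter or a parenthesized expression. The helper names getTerm and getFactor and the class name ExprParser are also assumptions.

```java
class ExprParser {
    static boolean valid(String input) {
        String remainder = getExpr(input);
        return remainder != null && remainder.length() == 0;
    }

    // Each method consumes one construct from the front of s and returns
    // the rest of the string, or null if a syntax error is found.
    static String getExpr(String s) {
        String rest = getTerm(s);
        if (rest != null && !rest.isEmpty()
                && (rest.charAt(0) == '+' || rest.charAt(0) == '-')) {
            return getExpr(rest.substring(1));
        }
        return rest;
    }

    static String getTerm(String s) {
        String rest = getFactor(s);
        if (rest != null && !rest.isEmpty()
                && (rest.charAt(0) == '*' || rest.charAt(0) == '/')) {
            return getTerm(rest.substring(1));
        }
        return rest;
    }

    static String getFactor(String s) {
        if (s == null || s.isEmpty()) {
            return null;
        }
        char c = s.charAt(0);
        if (Character.isLetter(c)) {
            return s.substring(1);            // operand: a single letter
        }
        if (c == '(') {
            String rest = getExpr(s.substring(1));
            if (rest != null && !rest.isEmpty() && rest.charAt(0) == ')') {
                return rest.substring(1);     // matched closing parenthesis
            }
        }
        return null;                          // syntax error
    }
}
```

This sketch only decides whether the input is well formed; it makes no attempt to evaluate the expression or to respect left-to-right grouping of operators.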


Recursive Parsing Algorithm (cont’d)


Abstract Data Types 

An abstract data type (ADT) defines entities that have 

 

 

ADTs provide the benefits of 

There is a strict separation between 

This separation facilitates 

ADTs are easily achieved in


ADT Example: Priority Queue Specification 

The priority queue ADT is useful in many situations. 

Here is its specification: 

The state is 

The operations on a priority queue are: 

– 

– 

– 

Note that there is no operation to 

Example applications: 

Pay 

Provide


Using a Priority Queue to Sort a List of Integers 

Even without knowing anything about how a priority 

queue might be implemented, we can take advantage 

of its operations to solve other problems. 

For example, to sort a list of numbers: 

Insert 

Successively 

Store


Implementing a Priority Queue with an Array


Implementing a Priority Queue with a Linked List 

Pseudocode: 

To insert an element: 

To remove the highest priority element: 

– Scan 

– When 

Time is 

Asymptotic running times are 

Time to sort is 

Can we do things faster by keeping the array, or linked 

list, elements in sorted order? 

Warning:


Implementing a PQ with a Sorted Array 

Keep the array elements in increasing order of priority. 

(If highest priority is smallest element, then elements 

will be in decreasing order). 

Pseudocode: 


To remove the highest priority element:


Implementing a PQ with a Sorted Linked List 

Pseudocode: 


To remove the highest priority element: 

Asymptotic times are


Generic PQ Implementation Using Java 

To avoid rewriting the priority queue implementation 

for every different kind of element (integer, double, 

String, user-defined classes, etc.), we can use Java’s 

interface feature. 

All that is required is


Using the ComparisonKey Interface 

Change the specification of the PriorityQueue 

class to consist of a collection of 

Any class that 

Define a class called PQItem that 

sortPQ, the sorting algorithm that uses a priority 

queue, can


Generic Implementation of PQ with Array 

class PriorityQueue { 
    private ComparisonKey[] A = 
        new ComparisonKey[100];              // int -> CK 
    private int next;                        // index of the next free slot 

    PriorityQueue() { 
        next = 0; 
    } 

    // Appends x to the unsorted array; no check is made for a full array. 
    public void insert(ComparisonKey x) {    // int -> CK 
        A[next] = x; 
        next++; 
    } 

    // Scans for the highest-priority element, removes it, and returns it. 
    // Assumes the queue is not empty. 
    public ComparisonKey remove() {          // int -> CK 
        ComparisonKey high = A[0];           // int -> CK 
        int highLoc = 0; 
        for (int cur = 1; cur < next; cur++) { 
            if (high.compareTo(A[cur]) == 
                    ComparisonKey.LOWER) {   // use compareTo method 
                high = A[cur]; 
                highLoc = cur; 
            } 
        } 
        A[highLoc] = A[next-1];              // fill the hole with the last element 
        next--; 
        return high; 
    } 
}


Implementing the Generic PQItem 

Here is a possible PQItem class for integers. Note 

For a PQItem class for strings: 

make 

make 

the method
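
A sketch of what the interface and an integer PQItem could look like. Only the names ComparisonKey, compareTo, LOWER and PQItem appear in these notes; the constants EQUAL and HIGHER, the numeric values, the getValue method and the rule that a larger integer means higher priority are assumptions.

```java
interface ComparisonKey {
    int LOWER = -1;    // this key has lower priority than the other
    int EQUAL = 0;     // hypothetical constant
    int HIGHER = 1;    // hypothetical constant

    // Returns LOWER, EQUAL or HIGHER according to how this key
    // compares with other.
    int compareTo(ComparisonKey other);
}

class PQItem implements ComparisonKey {
    private final int value;

    PQItem(int value) {
        this.value = value;
    }

    int getValue() {
        return value;
    }

    // Assumes other is also a PQItem; larger value means higher priority.
    public int compareTo(ComparisonKey other) {
        int otherValue = ((PQItem) other).value;
        if (value < otherValue) {
            return LOWER;
        }
        if (value > otherValue) {
            return HIGHER;
        }
        return EQUAL;
    }
}
```

With this version, the test high.compareTo(A[cur]) == ComparisonKey.LOWER in the array implementation is true exactly when A[cur] holds a larger integer than the current best.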


Generic PQItem’s (cont’d) 

This approach is particularly powerful since we can 

Suppose the items are 

One form of priority might be 

Another form might be 

All those decisions will be encapsulated inside the


Sorting with Generic PQ 

Finally, here is the sorting algorithm: 

void sortPQ (ComparisonKey[] A) { 
    int n = A.length; 
    PriorityQueue pq = 
        new PriorityQueue(); 
    for (int i = 0; i < n; i++) 
        pq.insert(A[i]); 
    for (int i = 0; i < n; i++) 
        A[i] = pq.remove(); 
} 

The only difference from before is 

IMPORTANT TO NOTICE: 

The PriorityQueue class 

The sortPQ method


Importance of Modularity and Information Hiding 

Why is it valuable to be able to do these kinds of things? 

The public/private visibility modifiers of Java, and the 

discipline of not making the internal details be available 

outside are forms of 

Information hiding promotes modular programming 

— you can 

The key to abstraction is


Compiling and Running a C Program in Unix 

Simple scenario in which your program is in a single 

file: Suppose you want to name your program test. 

1. edit 

2. compile 

3. if 

4. run 

5. if


Structure of a C Program 

A C program is a list of 

Every C program must contain 

Functions are 

The 

For 

 

The \n is 

Comments


A Useful Library 

See the Reek book (especially Chapter 16) for a description 

of what you can do with built-in libraries. In 

addition to stdio.h, 

stdlib.h lets you use functions for, e.g., 

– 

– 

– 

– 

math.h provides 

string.h has


Printf 

The function printf is used to print to the standard output 

(screen): 

It can take a 

The first argument must 

The first argument might 

A 

Following the first argument is a 

Example: 

Output is:


Variables and Arithmetic Expressions 

The main numeric data types that we will use are: 

 

 

 

Variables are declared and manipulated in arithmetic 

expressions pretty much as in Java. For instance, 

However, in C,


Reading from the Keyboard 

The function scanf reads in data from the keyboard. 

scanf takes a 

The first argument is 

Each 

After the first argument is a 

The subsequent arguments must each be 

The code for an 

When you run this program, it will wait for you to enter 

two integers, and then continue. The integers can be on 

the same line separated by a space, or on two lines.


Functions 

Functions in C are pretty much like methods in Java 

(dealing only with primitive types). Example: 

#include <stdio.h> 

double times2 (double x) { 
    x = 2*x; 
    return x; 
} 

int main (void) { 
    double y = 301.4; 
    printf("Original value is %f; final value is %f.\n", 
           y, times2(y)); 
    return 0; 
} 

Functions must be 

As in Java, parameters are 

As in Java, if the function does not return any value, 

Parameters and local variables of functions


Recursive Functions 

Recursion is essentially the same as in Java. 

The only difference is if you have mutually recursive 

functions, also called indirect recursion: for instance, 

if function A calls function B, while B calls A. 

Then you have a problem with the requirement that 

functions be defined before they are used. 

You can get around this problem with


Global Variables and Constants 

C also provides global variables. 

A global variable is defined 

A global variable can be used 

Generally, global variables that can be changed are frowned 

upon, as contributing to errors. However, global variables 

are very appropriate for constants. Constants are 

defined using macros:


Boolean Expressions 

The operators to compare two values are the same 

as in Java: 

However, instead of returning a boolean value, they 

return 

Actually, C interprets 

Thus the analog in C of a boolean expression in Java 

is any expression that produces 

As in Java, boolean expressions can be operated on 

with the logical operators &&, || and !. Some examples: 

(10 == 3) evaluates to 

!(10 == 3) evaluates to 

!( (x < 4) || (y == 5) ) : if x is 10 and 

y is 5, then this evaluates to


If Statements and Loops 

Given the preceding interpretation of “boolean expression”, 

the following statements are the same in C as in 

Java: 

 

 

 

 

Since Boolean expressions are essentially integers, you 

can have a for statement like this in C: 

for (int count = 99; count; count--) { 

... 

} 

count is initialized to 

the loop is executed 

count is 

This loop is executed


Switch 

C has a switch statement that is like that in Java: 

switch ( <integer-expression> ) { 
    case <value> : 
        <statements> 
        break; 
    case <value> : 
        <statements> 
        break; 
    ... 
    default : 
        <statements> 
} 

Don’t forget the break statements! 

The integer expression must produce a value belonging 

to any of the integral data types (various size integers 

and characters).


Enumerations 

This is something neat that Java does not have. 

An enumeration is a way to give 

For instance, suppose you need to have some codes 

in your program to indicate whether a library book is 

checked in, checked out, or lost. Instead of 

#define CHECKED_IN 0 

#define CHECKED_OUT 1 

#define LOST 2 

you can use an enumeration declaration:


Using an Enumeration in a Switch Statement 

int status; 

/* some code to give status a value */ 

switch (status) { 
    case CHECKED_IN : 
        /* handle a checked in book */ 
        break; 
    case CHECKED_OUT : 
        /* handle a checked out book */ 
        break; 
    case LOST : 
        /* handle a lost book */ 
        break; 
}


Enumeration Data Type 

You can give a name to an enumeration and thus create 

an enumeration data type. The syntax is: 

enum 

For example: 

enum book_status { CHECKED_IN, CHECKED_OUT, LOST }; 

Why bother to do this?


Type Synonyms 

The enumeration type is our first example of a user 

defined type in C. 

It’s rather unpleasant to have to carry around the word 

enum all the time for this type. 

Instead, you can give a name to this type you have 

created, and subsequently just use that type – without 

having to keep repeating enum. For example:


Structures 

C also gives you a way to create more general types of 

your own, as structures. These are essentially like objects 

in Java, if you just consider the instance variables. 

A structure groups together related data items that can 

be of different types. 

The syntax to define a structure is:


Storage on the Stack 

The statement 

struct student stu; 

causes the entire stu structure to be stored


Using typedef with Structures 

When using the structure type, you have to carry along 

the word struct. 

To avoid this, you can use a 

A more concise way to do this is: 

Now you can create a Student variable:


Using a Structure 

You can access the pieces of a structure using dot notation 

(analogous to accessing instance variables of an 

object in Java) : 

You can also have the entire struct on either the left or 

the right side of the assignment operator:


Figure for Copying a Structure


Passing a Structure to a Function 

Structures can be passed as parameters to functions: 

Then you can call the function: 

But if you put the following line of code after the printf 

in print_info:


Returning a Structure From a Function 

You can return a structure from a function also. Suppose 

you have the following function: 

Now you can call the function:


Figure for Returning a Structure from a Function 

The copying of formal parameters and return values 

can be avoided by


Arrays 

To define an array: 

For example: 

Unlike Java, 

Unlike Java, 

Unlike Java, 

As in Java, 

As in Java,


Arrays (cont’d) 

Two things you CAN do: 

If you have an array of structures, 

You can declare a two-dimensional array (and higher): 

e.g., 

Two things you CANNOT do: 

 

 

We’ll see how to accomplish these tasks


Pointers in C 

Pointers are used in C to 

circumvent 

– copying of parameters and return values 

– lasting changes 

access 

allow 

For each data type T, 

For instance, 

declares iptr to be of type “pointer to int”. iptr 

refers to a 

Actually, most C programmers write it as:


Addresses and Indirection 

Computer memory is 

Each variable is 

The address of the variable is 

iptr refers to 

*iptr refers to 

Applying the * operator is called


The Address-Of Operator 

We saw the & operator in scanf. It 

int i; 

int* iptr; 

i = 55; 

iptr = &i; 

*iptr = *iptr + 1; 

Last line gets data out of location whose address is in 

iptr, adds 1 to that data, and stores result back in 

location whose address is in iptr.


Comparing Indirection and Address-Of Operators 

As a rule of thumb: 

Indirection: 

– It CANNOT 

– It CAN 

Address-Of: 

– It CAN 

– It CANNOT


Pointers and Structures 

Remember the struct type Student, which has an 

int age and a double grade_point: 

Student stu; 

Student* sptr; 

sptr = &stu; 

To access variables of the structure: 

There is a “shorthand” for this notation:


Passing Pointer Variables as Parameters 

You can pass pointer variables as parameters. 

void printAge(Student* sp) { 
    printf("Age is %i",sp->age); 
} 

When this function is called, 

1. a Student* variable: 

or 

2. apply the & operator to a Student variable: 

C still uses call by value to pass pointer parameters, but 

because they are pointers, what gets copied are 

Data coming in to the function is not copied.


Passing Pointer Variables as Parameters (cont’d) 

Now we can 

void changeAge(Student* sp, int newAge) { 

sp->age = newAge; 

} 

You can also 

Old initialize with copying: 

Student initialize(int old, double gpa) { 

Student st; 

st.age = old; 

st.grade_point = gpa; 

return st; 

} 

More efficient initialize using pointers:


Passing Pointer Variables as Parameters (cont’d) 

Using pointers is an optimization in previous case. But 

it is 

void swapAges (Student* sp1, Student* sp2) { 

int temp; 

temp = sp1->age; 

sp1->age = sp2->age; 

sp2->age = temp; 

} 

To call this function:
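A call might look like the following sketch. The Student layout (an int age and a double grade_point) is assumed from the earlier slides, and demoSwapAges is a hypothetical driver, not part of the course code.

```c
#include <stdio.h>

/* Assumed layout of the Student type used in these slides. */
typedef struct {
    int age;
    double grade_point;
} Student;

void swapAges(Student* sp1, Student* sp2) {
    int temp;
    temp = sp1->age;
    sp1->age = sp2->age;
    sp2->age = temp;
}

/* Hypothetical driver showing both ways to supply a Student* argument. */
int demoSwapAges(void) {
    Student s1 = {20, 3.4};
    Student s2 = {25, 3.9};
    Student* sptr = &s2;      /* a Student* variable */

    swapAges(&s1, sptr);      /* & applied to a Student, and a pointer variable */
    printf("s1.age = %d, s2.age = %d\n", s1.age, s2.age);
    return s1.age;            /* s1 now holds the age that s2 started with */
}
```

Passing s1 and s2 by value instead would swap only the copies, which is why pointers are needed here.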


Pointers and Arrays 

The name of an array is 

It is a 

To reference array elements, you can use 

 

or 

 

What is going on with the pointer notation? 

a refers to 

*a refers to 

a+1 refers to 

*(a+1) refers to


Pointers and Arrays (cont’d) 

You can also refer to array elements with 

For example, 

int a[5]; 

int* p; 

p = a; /* p = &a[0]; is same */ 

p refers to 

*p refers to 

p+1 refers to 

*(p+1) refers to 

Since p is a non-constant pointer, you can also 

Warning: NO BOUNDS CHECKING IS DONE IN 

C!
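The three ways of reaching array elements can be compared side by side. The function names below are hypothetical, and each function trusts the caller to pass a correct length n, since C does no bounds checking.

```c
#include <stddef.h>

/* Sum using the brackets notation a[i]. */
int sumWithIndex(const int a[], size_t n) {
    int total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Sum using pointer notation: *(a+i) is the same element as a[i]. */
int sumWithOffset(const int* a, size_t n) {
    int total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        total += *(a + i);
    return total;
}

/* Sum by advancing a non-constant pointer through the array. */
int sumWithIncrement(const int* p, size_t n) {
    int total = 0;
    const int* end = p + n;   /* one past the last element; never dereferenced */
    while (p != end) {
        total += *p;
        p++;
    }
    return total;
}
```

All three return the same result for the same array; passing an n larger than the real array length would silently read memory past the end.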


Passing an Array as a Parameter 

To pass an array to a function: 

void printAllAges(int a[], int n) {
    int i;
    for (i = 0; i < n; i++) {
        printf("%i \n", a[i]);
    }
}

The “array” parameter indicates 

Alternative definition: 

void printAllAges(int* p, int n) {
    int i;
    for (i = 0; i < n; i++) {
        printf("%i \n", *p);
        p++;
    }
}

The formal array parameter is a 

You can call the function like this:


Dynamic Memory Allocation in Java 

Java 

That means that 

This happens whenever 

In Java there is strict distinction between 

Every variable is either 

memory for variables is 

This memory 

memory for variables of primitive type 

memory that holds the actual contents of an object 

is 

This memory goes away


Dynamic Memory Allocation in C 

In C, 

Every type has the possibility of being allocated statically 

(on the stack) or dynamically (on the heap). 

To allocate space statically, you 

Space is allocated 

To allocate space dynamically, use 

It takes one integer parameter indicating the 

Use sizeof operator to get the length; 

It returns a 

The pointer has type void*. These slides cast it to 

the appropriate type (ANSI C also converts void* implicitly). If malloc fails to allocate the 

space,


malloc Example 

To dynamically allocate space for an int: 


int* p; 

p = (int*) malloc(sizeof(int)); /* cast result to int* */ 

if (p == NULL) { /* to be on the safe side */ 

printf("malloc failed!"); 

} else { 

*p = 33; 

printf("%i", *p); 

} 

Normally, you don’t need to allocate a single integer at 

a time. Typically, you would use malloc to: 

allocate 

allocate


Another malloc Example 

To dynamically allocate space for a structure: 


Student* sptr; 

sptr = (Student*) malloc(sizeof(Student)); 

sptr->age = 20; 

sptr->grade_point = 3.4;


Allocating a Linked List Node Dynamically 

For a singly linked list of students, use this type: 

typedef struct Stu_Node{ 

int age; 

double grade_point; 

struct Stu_Node* link; 

} StuNode; 

To allocate a node for the list: 

To insert the node pointed to by sptr after the node 

pointed to by some other node, say cur:
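Allocation and insertion can be sketched as follows. The helper names makeNode and insertAfter are hypothetical; only the StuNode type comes from the slide.

```c
#include <stdlib.h>

typedef struct Stu_Node {
    int age;
    double grade_point;
    struct Stu_Node* link;
} StuNode;

/* Hypothetical helper: allocate and fill one node; returns NULL if malloc fails. */
StuNode* makeNode(int age, double gpa) {
    StuNode* sptr = (StuNode*) malloc(sizeof(StuNode));
    if (sptr != NULL) {
        sptr->age = age;
        sptr->grade_point = gpa;
        sptr->link = NULL;
    }
    return sptr;
}

/* Insert the node pointed to by sptr right after the node pointed to by cur. */
void insertAfter(StuNode* cur, StuNode* sptr) {
    sptr->link = cur->link;   /* new node takes over cur's old successor */
    cur->link = sptr;         /* then cur points at the new node */
}
```

The order of the two assignments matters: setting cur->link first would lose the rest of the list.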


Allocating an Array Dynamically 

To allocate an array dynamically, 



int* p; 

int i; 

p = (int*) malloc(100*sizeof(int)); /* 100 elt array */ 

/* now p points to the beginning of the array */ 

for (i = 0; i < 100; i++) /* initialize the array */ 

p[i] = 0; /* access the elements */ 

Similarly, you can allocate an array of structures:
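An array of structures can be allocated the same way as the int array. The sketch below assumes the usual Student layout; makeStudentArray is a hypothetical helper.

```c
#include <stdlib.h>

/* Assumed layout of the Student type used in these slides. */
typedef struct {
    int age;
    double grade_point;
} Student;

/* Hypothetical helper: allocate an array of n Students (n > 0), set every
   age to startAge and every grade point to 0. Returns NULL if malloc fails. */
Student* makeStudentArray(int n, int startAge) {
    Student* sp;
    int i;

    sp = (Student*) malloc((size_t) n * sizeof(Student));
    if (sp == NULL)
        return NULL;
    for (i = 0; i < n; i++) {   /* elements are reached with brackets, as usual */
        sp[i].age = startAge;
        sp[i].grade_point = 0.0;
    }
    return sp;
}
```

The caller owns the returned space and should pass the pointer to free when done with it.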


Deallocating Memory Dynamically 

When memory is allocated using malloc, 

You can get 

void sub() { 

int *p; 

p = (int*) malloc(100*sizeof(int)); 

return; 

} 

Although the space for the pointer variable p goes away 

when sub finishes executing, 

But they are completely useless after sub is done, 

If you had wanted them to be accessible outside of 

sub,


Using free 

To deallocate memory when you are through with it, 

It takes as an argument a 

and returns nothing. The result of free is that all the 

space starting at the designated location will be 

In the function void sub above, just before the return, 

you should say: 

DO NOT DO THE FOLLOWING:


Saving Space with Arrays of Pointers 

Suppose you need an array of structures, where each 

structure is fairly large. But you are not sure at compile 

time how big the array needs to be. 

1. Allocate 

2. Find out 

3. Allocate


Array of Pointers Example 

To implement with the usual Student struct:


Information Hiding in C 

Java provides support for information hiding by 

 

 

Advantages of data abstraction, including the use of 

constructor and accessor (set and get) functions: 

push 

easier 

easy 

easy 

C does not provide the same level of compiler support 

as Java, but you can achieve the same effect with some


Information Hiding in C (cont’d) 

A “constructor” in C would be a function that 

calls 

initializes 

returns 

For example:



The analog of a Java instance method in C would be 

a function whose first parameter is the “object” to be 

operated on. 

You can write set and get functions in C:



You can use the set and get functions to swap the 

ages for two student objects: 

When should you provide set and get functions and 

when should you not? They obviously impose some 

overhead in terms of additional function calls.
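One possible shape for the constructor and the set and get functions is sketched below. The function names and the choice to return NULL when malloc fails are assumptions, not the course's official code.

```c
#include <stdlib.h>

/* Assumed layout of the Student type used in these slides. */
typedef struct {
    int age;
    double grade_point;
} Student;

/* "Constructor": calls malloc, initializes the fields, returns a pointer. */
Student* constructStudent(int age, double gpa) {
    Student* sp = (Student*) malloc(sizeof(Student));
    if (sp != NULL) {
        sp->age = age;
        sp->grade_point = gpa;
    }
    return sp;
}

/* Accessors: the first parameter is the "object" being operated on. */
int getAge(Student* sp) { return sp->age; }
void setAge(Student* sp, int newAge) { sp->age = newAge; }

/* Swap ages using only the accessors, never touching the fields directly. */
void swapAges(Student* sp1, Student* sp2) {
    int temp = getAge(sp1);
    setAge(sp1, getAge(sp2));
    setAge(sp2, temp);
}
```

If the representation of Student later changes, only the constructor and accessors need editing; swapAges stays as it is.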


Strings in C 

There is no explicit string type in C. 

A string in C is an array of characters that is 

terminated with the null character. 

The length 

The null character 

A sequence of characters enclosed in double quotes


Strings in C (cont’d) 

You can also declare a 

To initialize name, do not assign to a string literal! 

Instead, either 

Access elements using the brackets notation: 

char firstLetter; 

name[3] = 'a'; 

firstLetter = name[0]; 

namePtr[3] = 'b'; 

firstLetter = namePtr[0];


Passing Strings to and from Functions 

To pass a string into a function or return one from a 

function, you must 

Passing in a string: 

Returning a string: 

You can call these functions like this:


Reading in a String from the User 

To read in a string from the user, call: 

scanf("%s", name); 

Notice the use of %s in scanf. The corresponding 

data must be a 

scanf reads a string from the input stream up to 

The letters are read into 

You must make sure that you have a large enough 

array to hold the string. How much space is needed? 

If you don’t have enough space, whatever follows 

the array will be


String Manipulation Functions 

There are some useful string manipulation functions 

provided for you in C. These include: 

strlen, which takes a string as an argument and 

returns the length of the string, not counting the 

null character at the end. I.e., it counts how many 

characters it encounters before reaching '\0'. 

strcpy, which takes two strings as arguments and 

copies its second argument to its first argument. 

First, to use them, you need to include headers for the 

string handling library: 

#include <string.h> 

To demonstrate the use of strlen and strcpy, suppose 

you want to add a name component to the Student 

structure and change the constructor so that it asks the 

user interactively for the name:


String Manipulation Functions Example 

typedef struct {
    char* name;
    int age;
    double grade_point;
} Student;

Student* constructStudent(int age, double gpa) {
    char inputBuffer[100]; /* read name into this */
    Student* sptr;

    sptr = (Student*) malloc(sizeof(Student));
    sptr->age = age;
    sptr->grade_point = gpa;
    /* here's the new part: */
    printf("Enter student's name: ");
    scanf("%99s", inputBuffer); /* at most 99 characters plus the null */
    /* allocate just enough space for the name */
    sptr->name = (char*) malloc (
        (strlen (inputBuffer) + 1)*sizeof(char) );
    /* copy name into new space */
    strcpy (sptr->name, inputBuffer);
    return sptr;
}

When constructor returns, inputBuffer goes away. 

Space allocated for Student object is an int, a double 

and just enough space for the actual name.


Other Kinds of Character Arrays 

Not every character array has to be used to represent a 

string. You may want a character array that holds all 

possible letter grades, for instance: 

char grades[5]; 

grades[0] = 'A'; 

grades[1] = 'B'; 

grades[2] = 'C'; 

grades[3] = 'D'; 

grades[4] = 'F'; 

In this case, there is no reason for the last array entry 

to be the null character, and in fact, it is not.


File Input and Output 

File I/O is much simpler than in Java. 

Include 

Declare 

Call 

Writing to a file is done with 

Reading from a file is done with 

Call


File I/O Example 

/* to use the built in file functions */
#include <stdio.h>

int main(void) {
    /* create a pointer to a struct called FILE; */
    /* it is system dependent */
    FILE* fp;
    char line[80];
    int i;

    /* open the file for writing */
    fp = fopen("testfile", "w");
    if (fp == NULL) return 1; /* could not open the file */
    /* write into the file */
    fprintf(fp,"Line %i ends \n", 1);
    fprintf(fp,"Line %i ends \n", 2);
    /* close the file */
    fclose(fp);
    /* open the file for reading */
    fp = fopen("testfile", "r");
    if (fp == NULL) return 1;
    /* read six strings from the file */
    for (i = 1; i < 7; i++) {
        fscanf(fp,"%79s", line);
        printf("got from the file: %s \n", line);
    }
    /* close the file */
    fclose(fp);
    return 0;
}


Motivation for Stacks 

Some examples of last-in, first-out (LIFO) behavior: 

Web browser’s 

Text editors 

The most recent pending method/function call 

To evaluate an arithmetic expression, 

A stack is a sequence of elements, to which elements 

can be added (push) and removed (pop):


Specifying an ADT with an Abstract State 

We would like a specification to be as independent of 

any particular implementation as possible. 

But since people naturally think in terms of state, a 

popular way to specify an ADT is


Specifying the Stack ADT with an Abstract State 

1. A stack’s state is modeled as 

2. Initially the state of the stack is 

3. The effect of a push(x) operation is to 

4. The effect of a pop operation is to


Specifying an ADT with Operation Sequences 

But a purist might complain that a state-based specification 

is, implicitly, suggesting a particular implementation. 

To be even more abstract, one can specify an 

ADT 

For instance: 

push(a) pop(a): 

pop(a): 

push(a) push(b) push(c) pop(c) pop(b) push(d) pop(d): 

push(a) push(b) pop(a):


Additional Stack Operations 

Other operations that you sometimes want to provide: 

peek: 

size: 

empty:


Balanced Parentheses 

Recursive definition of a sequence of parentheses that 

is balanced: 

the sequence 

if the sequence 

According to this definition: 

(): 

(()(())): 

(()))(): 

())(:


Algorithm to Check for Balanced Parentheses 

Key observations: 

1. There must be 

2. In any prefix, the number of 

Pseudocode:


Java Method to Check for Balanced Parentheses 

Using java.util.Stack class (which manipulates 

objects): 

import java.util.*; 

boolean isBalanced(char[] parens) { 

Stack S = new Stack(); 

try { // pop might throw an exception 

for (int i = 0; i < parens.length; i++) { 

if ( parens[i] == '(' ) 

S.push(new Character('(')); 

else 

S.pop(); // discard popped object 

} 

return S.empty(); 

} 

catch (EmptyStackException e) { 

return false; 

} 

}


Checking for Multiple Kinds of Balanced Parens 

Suppose there are 3 different kinds of parentheses: 

( and ), [ and ], { and }. 

Modify the program: 

boolean isBalanced3(char[] parens) { 

Stack S = new Stack(); 

try { 

for (int i = 0; i < parens.length; i++) { 

if (leftParen(parens[i])) // ( or [ or { 

S.push(new Character(parens[i])); 

else { 

char leftp = ((Character)S.pop()).charValue(); 

if (!match(leftp,parens[i])) return false; 

} 

} 

return S.empty(); 

} // end try 

catch (EmptyStackException e) { 

return false; 

} 

}


Multiple Kinds of Parentheses (cont’d) 

boolean leftParen(char c) { 

return ((c == '(') || (c == '[') || (c == '{')); 

} 

boolean match(char lp, char rp) { 

if ((lp == '(') && (rp == ')')) return true; 

if ((lp == '[') && (rp == ']')) return true; 

if ((lp == '{') && (rp == '}')) return true; 

return false; 

}


Postfix Expressions 

We normally write arithmetic expressions using infix 

notation: 

Another way to write arithmetic expressions is to use 

postfix notation: 

For example, 

3 4 + is same as 

1 2 - 5 - 6 5 / + is same as 

One advantage of postfix is that 

For instance, 

(1 + 2) * 3 becomes 

1 + (2 * 3) becomes


Using a Stack to Evaluate Postfix Expressions 

Pseudocode:


StringTokenizer Class 

Java’s StringTokenizer class is very helpful to 

break up the input string into operators and operands 

— called 

Create a StringTokenizer object out of the input 

string. It 

Use instance method hasMoreTokens to test 

Use instance method nextToken to 

Second argument to constructor indicates that, 

Third argument to constructor indicates that


Java Method to Evaluate Postfix Expressions 

public static double evalPostFix(String postfix) 

throws EmptyStackException { 


Stack S = new Stack(); 

StringTokenizer parser = new StringTokenizer 

(postfix, " \n\t\r+-*/", true); 

while (parser.hasMoreTokens()) { 

String token = parser.nextToken(); 

char c = token.charAt(0); 

if (isOperator(c)) { 

double y = ((Double)S.pop()).doubleValue(); 

double x = ((Double)S.pop()).doubleValue(); 

switch (c) { 

case '+': 

S.push(new Double(x+y)); break; 

case '-': 

S.push(new Double(x-y)); break; 

case '*': 

S.push(new Double(x*y)); break; 

case '/': 

S.push(new Double(x/y)); break; 

} // end switch 

} // end if 

else if (!isWhiteSpace(c)) // token is operand 

S.push(Double.valueOf(token)); 

} // end while 

return ((Double)S.pop()).doubleValue(); 

}


Evaluating Postfix (cont’d) 

public static boolean isOperator(char c) { 

return ( (c == '+') || (c == '-') || 

(c == '*') || (c == '/') ); 

} 

public static boolean isWhiteSpace(char c) { 

return ( (c == ' ') || (c == '\n') || 

(c == '\t') || (c == '\r') ); 

} 

Does not 

Does no


Implementing a Stack with an Array 

Since Java supplies a Stack class, why bother? 

Idea: 

Issues for Java implementation: 

elements in the array are to be of type 

throw exception if 

dynamically increase the size of the array to avoid 

To handle the last point, we’ll do the following: 

initially, 

if array is full and a push occurs,


Implementing a Stack with an Array in Java 

class Stack { 

private Object[] A; 

private int next; 

public Stack () { 

A = new Object[16]; 

next = 0; 

} 

public void push(Object obj) { 

if (next == A.length) { 

// array is full, double its size 

Object[] newA = new Object[2*A.length]; 

for (int i = 0; i < next; i++) // copy 

newA[i] = A[i]; 

A = newA; // old A can now be garbage collected 

} 

A[next] = obj; 

next++; 

} 

public Object pop() throws EmptyStackException { 

if (next == 0) 

throw new EmptyStackException(); 

else { 

next--; 

return A[next]; 

} 

}


Implementing a Stack with an Array in Java (cont’d) 

public boolean empty() { 

return (next == 0); 

} 

public Object peek() throws EmptyStackException { 

if (next == 0) 

throw new EmptyStackException(); 

else 

return A[next-1]; 

} 

} // end Stack class 

class EmptyStackException extends Exception { 

public EmptyStackException() { 

super(); 

} 

}


Time Performance of Array Implementation 

push: 

pop: 

empty: 

peek:


Implementing a Stack with a Linked List in Java 

Idea: 

class StackNode { 

Object item; 

StackNode link; 

} 

class Stack { 

private StackNode top; // first node in list, the top 

public Stack () { 

top = null; 

} 

public void push(Object obj) { 

StackNode node = new StackNode(); 

node.item = obj; 

node.link = top; 

top = node; 

}


Implementing a Stack with a Linked List in Java 

(cont’d) 

public Object pop() throws EmptyStackException {
    if (top == null)
        throw new EmptyStackException();
    else {
        StackNode temp = top;
        top = top.link;
        return temp.item;
    }
}

public boolean empty() {
    return (top == null);
}

public Object peek() throws EmptyStackException {
    if (top == null)
        throw new EmptyStackException();
    else
        return top.item;
}

} // end Stack class


Time Performance of Linked List Implementation 

push: 

pop: 

empty: 

peek:


Interchangeability of Implementations 

If you have done things right, you can: 

write a program using the built-in Stack class 

compile and run that program 

then make available your own Stack class, using 

the array implementation (e.g., put Stack.class 

in the same directory) 

WITHOUT CHANGING OR RECOMPILING YOUR 

PROGRAM, run your program — it will use the local 

Stack implementation and will still be correct! 

then replace the array-based Stack.class file with 

your own linked-list-based Stack.class file 

again, WITHOUT CHANGING OR RECOMPILING 

YOUR PROGRAM, run your program: it 

will use the local Stack implementation and will 

still be correct!


Motivation for Queues 

Some examples of first-in, first-out (FIFO) behavior: 

 

 

 

A queue is a


Specifying the Queue ADT 

Using the abstract state style of specification: 

The state of a queue is modeled as a 

Initially the state of the queue is the 

The effect of an enqueue(x) operation is to 

The effect of a dequeue operation is to


Specifying the Queue ADT (cont’d) 

Alternative specification using allowable sequences would 

give some rules (an “algebra”). Some specific examples: 

enqueue(a) dequeue(a): 

dequeue(a): 

enqueue(a) enqueue(b) enqueue(c) dequeue(a) enqueue(d) 

dequeue(b): 

enqueue(a) enqueue(b) dequeue(b): 

Other popular queue operations:


Applications of Queues in Operating Systems 

The text discusses some applications of queues in operating 

systems: 

to buffer data coming from a running process going 

to a printer: 

a printer may be shared between several computers 

that are networked together.


Application of Queues in Discrete Event Simulators 

A simulation program is a program that mimics, or 

“simulates”, the behavior of some complicated real-world 

situation, such as 

 

 

 

These systems are typically too complicated to be modeled 

exactly mathematically, so instead, they are simulated: 

events take place in them according to some 

random number generator. For instance, 

at random times, 

at random times, 

at random times,


Using a Queue to Convert Infix to Postfix 

First attempt: Assume infix expression is 

For example: 

(((22/7) + 4) * (6 - 2)) 

(7 - (((2 * 3) + 5) * (8 - (4/2)))) 

Pseudocode:


Converting Infix to Postfix (cont’d) 

Examples: 

(((22/7) + 4) * (6 - 2)) 

Q: 

S: 

(7 - (((2 * 3) + 5) * (8 - (4/2)))) 

Q: 

S:


Converting Infix to Postfix with Precedence 

It is too restrictive to require parentheses around everything. 

Instead, precedence conventions tell 

For instance, 4 * 3 + 2 equals 

We need to modify the above algorithm to handle operator 

precedence.


Converting Infix to Postfix with Precedence (cont’d) 

create queue Q to hold postfix expression 

create stack S to hold operators not yet 

added to the postfix expression 

while there are more tokens do 

get next token t 

if t is a number then enqueue t on Q 

else if S is empty then push t on S 

else if t is ( then push t on S 

else if t is ) then 

while top of S is not ( do 

pop S and enqueue result on Q 

endwhile 

pop S // get rid of ( that ended while 

else // t is real operator and S not empty 

while prec(t)


Converting Infix to Postfix with Precedence (cont’d) 

For example: 

(22/7 + 4) * (6 - 2) 

Q: 

S: 

7 - (2 * 3 + 5) * (8 - 4/2) 

Q: 

S:
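The precedence-based conversion can be sketched in C as follows. This is an illustration under stated assumptions, not the course's reference solution: the output string plays the role of queue Q, a small char array plays the role of stack S, operators are left-associative, an operator stays on S only while the incoming operator has strictly higher precedence, and the input is well formed with non-negative integer operands. The names infixToPostfix and prec are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Precedence: * and / bind tighter than + and -. */
static int prec(char op) {
    return (op == '*' || op == '/') ? 2 : 1;
}

/* Writes the postfix form of in[] to out[] as space-separated tokens.
   Assumes fewer than 64 pending operators and that out has room for
   2*strlen(in)+1 characters. */
void infixToPostfix(const char* in, char* out) {
    char ops[64];      /* stack S of operators and left parentheses */
    int top = 0;       /* number of entries on S */
    size_t o = 0;      /* next free position in out (the queue Q) */
    size_t i = 0;

    while (in[i] != '\0') {
        char t = in[i];
        if (isdigit((unsigned char) t)) {          /* number: enqueue it */
            while (isdigit((unsigned char) in[i]))
                out[o++] = in[i++];
            out[o++] = ' ';
            continue;
        }
        if (t == '(') {
            ops[top++] = t;
        } else if (t == ')') {
            while (top > 0 && ops[top - 1] != '(') {
                out[o++] = ops[--top];
                out[o++] = ' ';
            }
            if (top > 0) top--;                    /* discard the ( */
        } else if (t == '+' || t == '-' || t == '*' || t == '/') {
            /* pop operators that must be applied before t */
            while (top > 0 && ops[top - 1] != '('
                   && prec(t) <= prec(ops[top - 1])) {
                out[o++] = ops[--top];
                out[o++] = ' ';
            }
            ops[top++] = t;
        }
        /* any other character, such as a space, is skipped */
        i++;
    }
    while (top > 0) {                              /* flush what is left on S */
        out[o++] = ops[--top];
        out[o++] = ' ';
    }
    if (o > 0) o--;                                /* drop the trailing space */
    out[o] = '\0';
}
```

The "S is empty" case needs no separate branch here, because the popping loop simply does nothing when S is empty.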


Implementing a Queue with an Array 

State is represented with: 

array A 

integer head that holds 

integer tail that holds 

Operation implementations: 

enqueue(x): 

dequeue(x): 

empty: 

peek: 

size: 

Problem:


Implementing a Queue with a Circular Array 

Wrap around to reuse the vacated space at the beginning 

of the array in a circular fashion, using mod operator 

%. 

enqueue(x): 

dequeue(x): 

empty: 

The problem is that


Expanding Size of Queue Dynamically 

To avoid overflow problem in circular array implementation 

of a queue, use same idea as for array implementation 

of stack: 

If array is discovered to be full during an enqueue, 

allocate 

copy 

enqueue 

free 

One complication with the queue, though, is that the 

contents of the queue might be in two sections: 

1. from 

2. then from 

Copying the new array must take this into account.
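A circular array queue that grows when full might be sketched like this. It departs from the head/tail representation in one assumed detail: it stores head plus a count of items instead of a tail index, so a full queue and an empty queue are easy to tell apart. All names are hypothetical.

```c
#include <stdlib.h>

typedef struct {
    int* A;         /* array holding the items */
    int capacity;   /* current length of A */
    int head;       /* index of the oldest item */
    int count;      /* number of items stored */
} Queue;

/* Returns 1 on success, 0 if malloc fails. Assumes capacity > 0. */
int queueInit(Queue* q, int capacity) {
    q->A = (int*) malloc((size_t) capacity * sizeof(int));
    q->capacity = capacity;
    q->head = 0;
    q->count = 0;
    return q->A != NULL;
}

/* Returns 1 on success, 0 if a needed expansion fails. */
int enqueue(Queue* q, int x) {
    if (q->count == q->capacity) {            /* full: allocate a bigger array */
        int newCap = 2 * q->capacity;
        int* newA = (int*) malloc((size_t) newCap * sizeof(int));
        int i;
        if (newA == NULL)
            return 0;
        /* copy in queue order: the % makes the copy run from head to the
           end of the old array, then wrap around to the front section */
        for (i = 0; i < q->count; i++)
            newA[i] = q->A[(q->head + i) % q->capacity];
        free(q->A);                           /* free the old array */
        q->A = newA;
        q->capacity = newCap;
        q->head = 0;
    }
    q->A[(q->head + q->count) % q->capacity] = x;
    q->count++;
    return 1;
}

/* Assumes the queue is not empty. */
int dequeue(Queue* q) {
    int x = q->A[q->head];
    q->head = (q->head + 1) % q->capacity;    /* wrap around */
    q->count--;
    return x;
}
```

After an expansion the items sit at indices 0 through count-1 of the new array, so the two-section layout disappears until the next wrap-around.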


Performance of Circular Array 

Performance of the circular array implementation of a 

queue: 

Time: 

space:


Implementing a Queue with a Linked List 

State representation: 

Data items are kept in 

Pointer head points to 

Pointer tail points to 

Operation implementations: 

To enqueue an item, 

To dequeue an item,


Implementing a Queue with a Linked List (cont’d) 

class Queue { 

private QueueNode head; 

private QueueNode tail; 

public Queue() { 

head = null; 

tail = null; 

} 


public boolean empty() { 

return (head == null); 

} 

public void enqueue(Object obj) { 

QueueNode node = new QueueNode(obj); 

if (empty()) { 

head = node; 

tail = node; 

} else { 

tail.link = node; 

tail = node; 

} 

} 

// continued on next slide


Implementing a Queue with a Linked List (cont’d) 

// continued from previous slide 

public Object dequeue() {
    if ( empty() )
        return null; // or throw an EmptyQueueException
    else {
        Object returnItem = head.item;
        head = head.link; // remove first node from list
        if (head == null) // fix tail pointer if needed
            tail = null;
        return returnItem;
    }
}

} // end Queue class

Every operation always takes


Motivation for the List ADT 

This ADT is good for modeling 

Some sample applications:


Specifying the List ADT 

The state of a list object is 

Typical operations on a list are: 

create: 

empty: 

length: 

select(i): 

replace(i,x): 

delete(x): 

insert(x):


Implementing the List ADT 

Array implementation: 

Keep a counter 

To select or replace at some location, 

To insert at some location, items down. 

To delete at some location, 

Linked list implementation: 

Keep a count of 

To select, replace, delete or insert an item,


Comparing the Times of List Implementations 

Time for various operations, on a list of n data items: 

list operation    singly linked list    array 

empty 

length 

select(i) 

replace(i) 

delete(i) 

insert(i) 

The time for insert in an array assumes no overflow 

occurs. If overflow occurs,


Comparing the Space of List Implementations 

Space requirements: 

If the array holds pointers to the items, then there is 

the space overhead of 

If the array holds the items themselves, then there is 

the space overhead of 

In both kinds of arrays, there is also the overhead of 

If you use a linked list, then the space overhead is 

for 

To quantify the space tradeoffs between the array of 

items and linked list representations: 

Let p be the number of 

Let q be the number of 

Let m be the number of


Comparing the Space (cont’d) 

To hold n items, 

the array representation uses 

the linked list representation uses 

The tradeoff point is when 

When nqm=(p + q), 

When the item size, q, is much larger than the pointer 

size, p, 

When the item size, q, is closer to the pointer size, 

p,


Generalized Lists 

A generalized list is 

Example: (a, b, (c, (d, e), f), g, (h, i)). 

There are five elements in the (top level) list: 

1. 

2. 

3. 

4. 

5. 

Items which are not lists are called atoms (they cannot 

be further subdivided).


Sample Java Code for Generalized List 

class Node { 

Object item; 

Node link; 

Node (Object obj) { item = obj; } 

} 

class GenList { 

private Node first; 

GenList() { first = null; } 

void insert(Object newItem) { 

Node node = new Node(newItem); 

node.link = first; 

first = node; 

} 

void print() { 

System.out.print("( "); 

Node node = first; 

while (node != null) { 

if (node.item instanceof GenList) 

((GenList)node.item).print(); 

else System.out.print(node.item); 

node = node.link; 

if (node != null) System.out.print(", "); 

} 

System.out.print(" )"); 

} 

}


Sample Java Code (cont’d) 

Notice: 

o instanceof C returns true if 

– object o 

– object o 

– object o 

– object o 

casts node.item to type GenList, if appropriate 

recursive call of the GenList method print 

implicit use of the toString method of every class, 

in the call to System.out.print 

(Don’t confuse the print method of System.out 

with the print method we are defining for class 

GenList.)


Sample Java Code (cont’d) 

How do we know that print is well-defined and won’t 

get into an infinite loop? 

The print method is recursive and uses a while loop. 

The while loop 

If an item is not a generalized list, then it 

If an item is itself a generalized list, then 

The while loop stops when 

Each recursive call takes you deeper into the nesting of 

the generalized list. 

Assume 

The stopping case for the recursion is 

Each recursive call takes you closer to a stopping 

case.


Generalized List Pitfalls 

Warning! If there is a cycle in the generalized list, 

print will go into an infinite loop. For instance: 

Be careful about shared sublists. For instance,


Application of Generalized Lists: LISP 

Generalized lists are 

highly 

good for applications where 

the key structuring paradigm in 

LISP is a functional language: 

Each function call is represented as a list, with the 

name of the function coming first, and the arguments 

coming after it:


LISP-like Approach to Arithmetic Expressions 

Apply this approach to evaluating arithmetic expressions: 

Use prefix notation (as opposed to postfix), with parentheses 

to delimit the sublists:


Strings and StringBuffers 

Java differentiates between 

There are no methods that change an existing String. 

If you want to change the characters in a string, use a 

StringBuffer. Some key features are: 

change 

append 

insert 

The StringBuffer class can be implemented using 

an array of characters. The ideas are not complicated.


The Heap 

When you use new or malloc to dynamically allocate 

some space, the run-time system handles the mechanics 

of actually finding the required free space of 

the necessary size. 

When you make an object inaccessible (in Java) or use 

free (in C), again the run-time system handles the 

mechanics of reclaiming the space. 

We are now going to look at HOW one could implement 

dynamic allocation of objects from the heap. The 

reasons are:


What is the Heap? 

The heap is an area of memory used to store objects 

that will be dynamically allocated and deallocated. 

Memory can be viewed as one long array of memory 

locations, where the address of a memory location is 

the index of the location in the array. 

Thus we can view the heap as 

Contiguous locations in the heap (array) are grouped 

together into 

When a request arrives to allocate n bytes, the system 

finds 

allocates 

returns 

Blocks are classified as either 

Initially,


Heap Data Structures 

Once blocks are allocated, the heap might get chopped 

up into alternating allocated and free blocks of varying 

sizes. 

We need a way to locate all the free blocks. 

This will be done by keeping the free blocks in a 

The linked list is implemented using 

Each block has some


Allocation 

When a request arrives to allocate n bytes, 

There are two strategies for choosing the block to use: 

 

 

If the block found is bigger than n, then 

If the block found is exactly of size n, then 

If no block large enough is found, then


Deallocation 

When a block is deallocated, as a first cut, simply insert 

the block at the front of the free list. 

[Figure: a heap occupying locations 0 to 79 and its free list, shown initially and after each of p := alloc(10), q := alloc(20), free(p), r := alloc(40) and free(q).]
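The sequence p := alloc(10), q := alloc(20), free(p), r := alloc(40), free(q) can be replayed with a small simulation. This is a simplified sketch under several assumptions: it uses the first-fit rule, it keeps the free list in separately malloc'ed records instead of headers inside the blocks, it ignores header overhead, and the caller must supply the block size when freeing. The names heapReset, heapAlloc and heapFree are hypothetical.

```c
#include <stdlib.h>

/* One free block: where it starts in the heap and how many bytes it has. */
typedef struct FreeBlock {
    int start;
    int size;
    struct FreeBlock* next;
} FreeBlock;

static FreeBlock* freeList = NULL;

/* Start over with one free block covering the whole heap. Returns 1 on success. */
int heapReset(int heapSize) {
    while (freeList != NULL) {
        FreeBlock* b = freeList;
        freeList = b->next;
        free(b);
    }
    freeList = (FreeBlock*) malloc(sizeof(FreeBlock));
    if (freeList == NULL)
        return 0;
    freeList->start = 0;
    freeList->size = heapSize;
    freeList->next = NULL;
    return 1;
}

/* First fit: use the first free block with at least n bytes.
   Returns the start address, or -1 if no block is large enough. */
int heapAlloc(int n) {
    FreeBlock* prev = NULL;
    FreeBlock* cur = freeList;
    int addr;

    while (cur != NULL && cur->size < n) {
        prev = cur;
        cur = cur->next;
    }
    if (cur == NULL)
        return -1;
    addr = cur->start;
    if (cur->size > n) {          /* bigger than needed: split off the front */
        cur->start += n;
        cur->size -= n;
    } else {                      /* exact fit: unlink the block */
        if (prev == NULL) freeList = cur->next;
        else prev->next = cur->next;
        free(cur);
    }
    return addr;
}

/* First cut at deallocation: insert the block at the front of the free list. */
void heapFree(int start, int size) {
    FreeBlock* b = (FreeBlock*) malloc(sizeof(FreeBlock));
    if (b == NULL)
        return;                   /* sketch only: the block is simply lost */
    b->start = start;
    b->size = size;
    b->next = freeList;
    freeList = b;
}
```

With an 80-byte heap, replaying the sequence leaves free blocks of sizes 20, 10 and 10, so a later request for 30 bytes fails even though 40 bytes are free in total.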


Fragmentation 

[Figure: the heap after free(q), with block boundaries at locations 0, 10, 30, 70 and 79: blocks of sizes 10, 20 and 10 on the free list, and the 40-byte block of r at location 30.]

Problem with previous example: If a request comes in 

for 30 bytes, the system will check the free list, and 

find


Coalescing 

A solution to fragmentation is to 

physical neighbor: 

virtual neighbor: 

To facilitate this operation, we will need additional space 

overhead in the header, and it will also help to keep 

“footer” information at the end of each block to: 

make 

indicate 

replicate


More Insidious Fragmentation 

[Figure: the same heap after free(q), with block boundaries at locations 0, 10, 30, 70 and 79: blocks of sizes 10, 20 and 10 on the free list, and the 40-byte block of r at location 30.]

However, coalescing will not accommodate a request 

for


Compaction 

The solution to this problem is called 

The difficulty though is that


Master Pointers 

A solution is to use 

A special area of the heap contains 

The addresses 

The address returned by the allocate procedure is 

The contents of a master pointer 

But the user,


Master Pointers (cont’d) 

[Figure: master pointers for p, q and r, kept in a special area, each pointing to a block in the rest of the heap.]

Costs: 

Additional 

Additional 

Unpredictable


Garbage Collection 

The above discussion of deallocation assumes the memory 

allocation algorithm is somehow informed about 

which blocks are no longer in use: 

In C, this is done 

In Java, 

This process is part of garbage collection: 

 

 

One of the challenging aspects of garbage collection is 

how to


Trees 

Important terminology: 

Some uses of trees: 

model 

model 

a clever implementation of


Trees (cont’d) 

Some more terms: 

path: 

length of path: 

height of a node: 

height of tree: 

depth (or level) of a node: 

depth of tree: 

Fact: The depth of a tree equals the height of the tree.


Binary Trees 

Binary tree: a tree in which 

Complete binary tree: tree in which 

Important Facts: 

A complete binary tree with L levels contains 

A complete binary tree with n nodes has


Binary Trees (cont’d) 

Leftmost binary tree: like a complete binary tree, 

except that 

however, all leaves at bottom level are 

Important Facts: 

A leftmost binary tree with L levels contains 

A leftmost binary tree with n nodes has


Binary Heap 

Now suppose that there is a data item, called 

inside each node of a tree. 

A binary heap (or min-heap) is a 

leftmost binary tree 

satisfies the 

Do not confuse this use of “heap” with its usage in 

memory management! 

Important Fact: The same set of keys 

There is no


Using a Heap to Implement a Priority Queue 

To implement the priority queue operation insert(x): 

1. 

2. 

3. 

Time: 

To implement the priority queue operation remove(): 

Tricky part is how to remove the root without messing 

up the tree structure. 

1. 

2. 

3. 

Time:


Using a Heap to Implement a PQ (cont’d) 

PQ operation    sorted array or linked list    unsorted array or linked list    heap 

insert 

remove (min) 

No longer have the severe tradeoffs of the array and 

linked list representations of priority queue.


Heap Sort 

Recall the sorting algorithm that used a priority queue: 

1. insert the elements to be sorted, one by one, into a 

priority queue. 

2. remove the elements, one by one, from the priority 

queue; they will come out in sorted order. 

If the priority queue is implemented with a heap, the 

running time is
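The two steps above can be sketched in Java as follows. This is only an illustration: the class name PqSortSketch is hypothetical, and it relies on the library class java.util.PriorityQueue, which is backed by a binary min-heap, instead of a hand-written heap. With n elements, each insert and each remove costs O(log n), so the whole sort takes O(n log n) time.

```java
// Sketch: sort by pushing every element through a heap-based priority queue.
class PqSortSketch {
    static int[] sort(int[] input) {
        // java.util.PriorityQueue is implemented as a binary min-heap
        java.util.PriorityQueue<Integer> pq = new java.util.PriorityQueue<>();
        for (int x : input) {
            pq.add(x);              // step 1: insert the elements one by one
        }
        int[] out = new int[input.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = pq.remove();   // step 2: remove them; smallest comes out first
        }
        return out;
    }
}
```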


Linked Structure Implementation of Heap 

To implement a heap with a linked structure, each node 

of the tree will be represented with an object containing 

 

 

 

 

To find the next available location for insert, or the 

rightmost node on the bottom level for remove, in constant 

time, 

 

 

Then keep a


Array Implementation of Heap 

Fortunately, there’s a nifty way to implement a heap 

using an array, based on an interesting observation: If 

you number the nodes in a leftmost binary tree, starting 

at the root and going across levels and down levels, you 

see a pattern: 

[Figure: leftmost binary tree with nodes numbered level by level: 1; 2, 3; 4, 5, 6, 7; 8, 9] 

Node number i has left child 

Node number i has right child 

If 2i > n, then i has no 

If 2i + 1 > n, then i has no 

Therefore, node number i is a leaf if 

The parent of node i is 

Next available location for insert is index 

Rightmost node on the bottom level is index


Array Implementation of Heap (cont’d) 

Representation consists of 

array A[1..max] (ignore location 0) 

integer n, which is initially 0, holding number of 

elements in heap 

To implement insert(x) (ignoring overflow): 

n := n+1 // make a new leaf node 

A[n] := x // new node’s key is initially x 

cur := n // start bubbling x up 

parent := cur/2 

while (parent != 0) && A[parent] > A[cur] do 

// current node is not the root and its key 

// has not found final resting place 

swap A[cur] and A[parent] 

cur := parent // move up a level in the tree 

parent := cur/2 

endwhile
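The insert pseudocode translates almost line for line into Java. The sketch below assumes integer keys and a fixed capacity; the class name ArrayHeapInsertSketch is hypothetical. As on the slide, overflow is ignored and location 0 of the array is unused.

```java
// Sketch: array-based min-heap supporting insert only.
class ArrayHeapInsertSketch {
    int[] a;      // a[1..max] holds the keys; a[0] is ignored
    int n = 0;    // number of elements currently in the heap

    ArrayHeapInsertSketch(int max) {
        a = new int[max + 1];
    }

    void insert(int x) {
        n = n + 1;                // make a new leaf node
        a[n] = x;                 // new node's key is initially x
        int cur = n;              // start bubbling x up
        int parent = cur / 2;     // integer division gives the parent's index
        while (parent != 0 && a[parent] > a[cur]) {
            int tmp = a[cur];     // swap a[cur] and a[parent]
            a[cur] = a[parent];
            a[parent] = tmp;
            cur = parent;         // move up a level in the tree
            parent = cur / 2;
        }
    }
}
```

Because Java's `&&` short-circuits, `a[parent]` is never read when `parent` is 0, which matches the intent of the loop condition in the pseudocode.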


Array Implementation of Heap (cont’d) 

To implement remove (ignoring underflow): 

minKey := A[1] // smallest key, to be returned 

A[1] := A[n] // replace root’s key with key in 

// rightmost leaf on bottom level 

n := n-1 // delete rightmost leaf on bottom level 

cur := 1 // start bubbling down key in root 

Lchild := 2*cur 

Rchild := 2*cur + 1 

while (Lchild
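The pseudocode on this slide is cut off at the loop. As a hedged sketch of one standard way to finish the bubbling-down step (not necessarily the slide's exact loop), the Java method below swaps the key at the current node with its smaller child until neither child is smaller. The class and method names are hypothetical, and the caller is assumed to track the element count itself.

```java
// Sketch: remove the minimum from a min-heap stored in a[1..n].
class ArrayHeapRemoveSketch {
    // Returns the smallest key; afterwards a[1..n-1] is again a heap.
    static int removeMin(int[] a, int n) {
        int minKey = a[1];        // smallest key, to be returned
        a[1] = a[n];              // move key of rightmost bottom leaf to the root
        n = n - 1;                // delete that leaf
        int cur = 1;              // start bubbling down the key in the root
        while (2 * cur <= n) {    // cur has at least a left child
            int child = 2 * cur;
            if (child + 1 <= n && a[child + 1] < a[child]) {
                child = child + 1;            // right child holds the smaller key
            }
            if (a[cur] <= a[child]) {
                break;                        // key has found its final resting place
            }
            int tmp = a[cur];                 // swap with the smaller child
            a[cur] = a[child];
            a[child] = tmp;
            cur = child;                      // move down a level in the tree
        }
        return minKey;
    }
}
```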


Binary Tree Traversals 

Now consider any kind of binary tree with data in the 

nodes, not just leftmost binary trees. 

In many applications, we need to traverse a tree: “visit” 

each node exactly once. When the node is visited, 

some computation can take place, such as printing the 

key. 

There are three popular kinds of traversals, differing in 

the order in which each node is visited in relation to the 

order in which its left and right subtrees are visited: 

inorder traversal: 

preorder traversal: 

postorder traversal:


Binary Tree Traversals (cont’d) 

preorder(x): 

if x is not empty then 

visit x 

preorder(leftchild(x)) 

preorder(rightchild(x)) 

inorder(x): 

if x is not empty then 

inorder(leftchild(x)) 

visit x 

inorder(rightchild(x)) 

postorder(x): 

if x is not empty then 

postorder(leftchild(x)) 

postorder(rightchild(x)) 

visit x 

[Figure: example binary tree with nodes a through i] 

preorder: 

inorder: 

postorder: 


Binary Tree Traversals (cont’d) 

These traversals are particularly interesting when the 

binary tree is a parse tree for an arithmetic expression: 

Postorder traversal results in the 

Preorder gives 

Does inorder give 

[Figure: parse tree with root *, internal nodes + and -, and leaves 5, 3, 2, 1] 

preorder: 

inorder: 

postorder: 


Representation of a Binary Tree 

The most straightforward representation for an (arbitrary) 

binary tree is a linked structure, where each node 

has 

 

 

 

Notice that the array representation used for a heap 

will not work, because the structure of the tree is not 

necessarily very regular. 

class TreeNode { 
    Object data;     // data in the node 
    TreeNode left;   // left child 
    TreeNode right;  // right child 

    // constructor goes here... 

    void visit() { 
        // what to do when node is visited 
    } 
}


Representation of a Binary Tree (cont’d) 

class Tree { 
    TreeNode root; 

    // other information... 

    void preorderTraversal() { 
        preorder(root); 
    } 

    void preorder(TreeNode t) { 
        if (t != null) {     // stopping case for recursion 
            t.visit();       // user-defined visit method 
            preorder(t.left); 
            preorder(t.right); 
        } 
    } 
} 
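For comparison, here is a self-contained sketch of all three traversals. It is an illustration under assumptions: the names TraversalSketch and Node are hypothetical, the data is a String, and "visiting" a node simply appends its data to a StringBuilder so that the visiting order can be inspected.

```java
// Sketch: preorder, inorder, and postorder traversals of a linked binary tree.
class TraversalSketch {
    static class Node {
        String data;
        Node left, right;
        Node(String data, Node left, Node right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    static void preorder(Node t, StringBuilder out) {
        if (t != null) {
            out.append(t.data);       // visit the node first
            preorder(t.left, out);
            preorder(t.right, out);
        }
    }

    static void inorder(Node t, StringBuilder out) {
        if (t != null) {
            inorder(t.left, out);
            out.append(t.data);       // visit between the two subtrees
            inorder(t.right, out);
        }
    }

    static void postorder(Node t, StringBuilder out) {
        if (t != null) {
            postorder(t.left, out);
            postorder(t.right, out);
            out.append(t.data);       // visit the node last
        }
    }
}
```

On a three-node tree with root b, left child a, and right child c, the three methods produce the strings bac, abc, and acb respectively.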

But we haven’t yet talked about how you actually MAKE 

a binary tree. We’ll do that next, when we talk about


Dictionary ADT Specification 

So far, we’ve seen the abstract data types 

 

 

 

 

Another useful ADT is a dictionary (or table). The 

abstract state of a dictionary is a 

The main operations are: 

 

 

 

Some additional operations are: 

find the 

find the


Dictionary ADT Applications 

The dictionary (or table) ADT is 

For instance, student records at a university can be kept 

in a dictionary data structure: 

When a new student enrolls, 

When a student graduates, 

When information about a student needs to be updated, 

Once the search has located the record for that student, 

When information about student needs to be retrieved, 

The world is full of information databases, many of 

them extremely large (imagine what the IRS has). 

When the number of elements gets very large,


Dictionary Implementations 

We will study a number of implementations: 

Search Trees 

 

: 

 

– 

– 

– 

Hash Tables


Binary Search Tree 

Recall the heap ordering property for binary heaps: 

Another ordering property is the binary search tree 

property: for each node x, 

all keys in the left subtree of x 

all keys in the right subtree of x 

A binary search tree (BST) is


Searching in a BST 

To search for a particular key in a binary search tree, 

we take advantage of the binary search tree property: 

search(x,k): // x is node where search starts 

----------- // k is key searched for 

if x is null then // stopping case for recursion 

return "not found" 

else if k = the key of x then 

return x 

else if k < the key of x then 

return search(leftchild(x),k) // recursive call 

else // k > the key of x 

return search(rightchild(x),k) // recursive call 

endif 

The top level call has x equal to 

In the previous tree, the search path for 17 is 

and the search path for 21 is 

Running Time: 

If BST is a chain, then


Searching in a BST (cont’d) 

Iterative version of search: 

search(x,k): 

------------ 

while x != null do 

if k = the key of x then 

return x 

else if k < the key of x then 

x := leftchild(x) 

else // k > the key of x 

x := rightchild(x) 

endif 

endwhile 

return "not found" 
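A minimal Java rendering of the iterative search, assuming integer keys. The names BstSearchSketch and Node are hypothetical, and null plays the role of "not found".

```java
// Sketch: iterative search in a binary search tree with int keys.
class BstSearchSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Returns the node holding k, or null if k is not in the tree rooted at x.
    static Node search(Node x, int k) {
        while (x != null) {
            if (k == x.key) {
                return x;
            } else if (k < x.key) {
                x = x.left;       // k can only be in the left subtree
            } else {
                x = x.right;      // k can only be in the right subtree
            }
        }
        return null;              // fell off the tree: not found
    }
}
```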

As in the recursive version, 

The comparison of the search key with the node key 

tells you at each level 

Running Time:


Searching in a Balanced BST 

If the tree is a complete binary tree, then the depth is 

and thus the search time is 

Binary trees with O(log n) depth are considered balanced: 

there is balance between 

You can have binary trees that are 

so that the depth is 

but might have a larger constant hidden in the big-oh. 

As an aside, a binary heap does not have 

Since nodes at the same level of the heap have no particular 

ordering relationship to each other, you will need 

to


Inserting into a BST 

To insert a key k into a binary search tree, 

Then 

insert(x,k): 

----------- 

if x = null then 

make a new node containing k 

return new node 

else if k = the key of x then 

return x // key already exists; tree is unchanged 

else if k < the key of x then 

leftchild(x) := insert(leftchild(x),k) 

return x 

else // k > the key of x 

rightchild(x) := insert(rightchild(x),k) 

return x 

endif 

Insert called on node x 

unless x is null, in which case 

As a result, a child of a node 



Inserting into a BST (cont’d)


Finding Min and Max in Binary Search Tree 

Fact: The smallest key in a binary search tree is found by 


Guess how to find the largest key and how long it takes. 

Min is 

and max is


Printing a BST in Sorted Order 

Cute tie-in between tree traversals and BST’s. 

Theorem: Inorder traversal of a binary search tree visits 

the nodes 

Inorder traversal on previous tree gives: 

Proof: Let’s look at some small cases and then use 

induction for the general case. 

Case 1: 

Case 2: 

Case n: Suppose true for trees of size 

Consider a tree of size


Printing a BST in Sorted Order (cont’d) 

L contains at most 

and R contains at most 

Inorder traversal: 

prints out 

then prints out 

then prints out 

2 



Tree Sort 

Does previous theorem suggest yet another sorting algorithm 

to you? 

Tree Sort: Insert all the keys 

then do an 


since each of the n inserts takes
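A sketch of tree sort in Java: build an unbalanced BST by repeated insertion, then read the keys back with an inorder traversal. The names are hypothetical, and the sketch assumes the keys are distinct, since a duplicate key is simply not inserted.

```java
// Sketch: tree sort using a plain (unbalanced) binary search tree.
class TreeSortSketch {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Recursive insert; returns the (possibly new) root of the subtree.
    static Node insert(Node x, int k) {
        if (x == null) {
            return new Node(k);
        }
        if (k < x.key) {
            x.left = insert(x.left, k);
        } else if (k > x.key) {
            x.right = insert(x.right, k);
        }                              // equal key: tree is left unchanged
        return x;
    }

    // Inorder traversal that writes keys into out starting at index next;
    // returns the next free index.
    static int inorder(Node x, int[] out, int next) {
        if (x == null) {
            return next;
        }
        next = inorder(x.left, out, next);
        out[next++] = x.key;
        return inorder(x.right, out, next);
    }

    static int[] sort(int[] keys) {    // assumes distinct keys
        Node root = null;
        for (int k : keys) {
            root = insert(root, k);
        }
        int[] out = new int[keys.length];
        inorder(root, out, 0);
        return out;
    }
}
```

If the input is already sorted, the tree degenerates into a chain, which is the worst case for the insert cost.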


Finding Successor in a BST 

The successor of a node x in a BST is 

Case 1: If x has a right child, then the successor of x 

is the 

follow x’s right pointer, then follow left pointers until 

there are no more. 

Path to find successor of 19 is


Finding Successor in a BST (cont’d) 

[Figure: binary search tree with root 19; next level 10, 22; then 4, 16, 20, 26; then 13, 17, 27] 

Case 2: If x does not have a right child, then find the 

Path to find successor of 17 is 

If you never find an ancestor that is larger than x’s key, 

then 

Path to try to find successor of 27 is 
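The two cases can be sketched in Java as follows. This is an illustration under assumptions: each node carries a parent pointer, keys are distinct integers, and the names BstSuccessorSketch, Node, insert, find, and successor are hypothetical. A null result means that no larger key exists.

```java
// Sketch: successor in a BST whose nodes have parent pointers.
class BstSuccessorSketch {
    static class Node {
        int key;
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    // Iterative insert that also maintains parent pointers; returns the root.
    static Node insert(Node root, int k) {
        Node n = new Node(k);
        if (root == null) return n;
        Node cur = root;
        while (true) {
            if (k < cur.key) {
                if (cur.left == null) { cur.left = n; n.parent = cur; break; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = n; n.parent = cur; break; }
                cur = cur.right;
            }
        }
        return root;
    }

    static Node find(Node x, int k) {
        while (x != null && x.key != k) {
            x = (k < x.key) ? x.left : x.right;
        }
        return x;
    }

    static Node successor(Node x) {
        if (x.right != null) {                 // Case 1: go right, then left as far as possible
            Node cur = x.right;
            while (cur.left != null) cur = cur.left;
            return cur;
        }
        Node cur = x.parent;                   // Case 2: climb until a larger key is reached
        while (cur != null && cur.key < x.key) {
            cur = cur.parent;
        }
        return cur;                            // null if x holds the largest key
    }
}
```

Inserting the keys 19, 10, 22, 4, 16, 20, 26, 13, 17, 27 in that order builds a tree with the same levels as the one pictured on this slide.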



Finding Predecessor in a BST 

The predecessor of a node x in a BST is the node 

whose 

To find it, 

Case 1: If x has a left child, then the predecessor of x 

follow x’s left pointer, then follow right pointers until 

there are no more. 

Case 2: If x does not have a left child, then find the 

lowest ancestor of x 

(I.e., follow parent pointers from x until reaching a key 

smaller than x’s.) 

If you never find an ancestor that is smaller than x’s 

key, then 



Deleting a Node from a BST 

Case 1: x is a leaf. Then 

Case 2: x has only one child. Then 

Case 3: x has two children. Use the same strategy as 

binary heap: Instead of removing the root node, 

1. Find 

2. Delete 

3. Replace 



Deleting a Node from a BST (cont’d)


Balanced Search Trees 

We would like to come up with a way to keep a binary 

search tree “balanced”, so that the depth is 

and thus the running time for the BST operations will 

be 

There are a number of schemes that have been devised. 

We will briefly look at a few of them. 

They all require much more complicated algorithms 

for insertion and deletion, in order to 

The algorithms for searching, finding min, max, predecessor 

or successor, are essentially the same as for 

Next few slides give the main idea for the definitions 

of the trees, but not why the definitions give O(log n) 

depth, and not how the algorithms for insertion and 

deletion work.


AVL Trees 

An AVL tree is a binary search tree such that for each 

node, the heights of the left and right subtrees of the 

node 

Theorem: The depth of an AVL tree is 

When inserting or deleting a node in an AVL tree, if 

you detect that the AVL tree property has been violated, 

then you


Red-Black Trees 

A red-black tree is a binary search tree in which 

every “real” node is given 

every node is colored 

– every leaf node is 

– if a node is red, then both its children are 

– every path from a node to a leaf contains 

From a fixed node, all paths from that node to a leaf 

differ in length by 

Theorem: The depth of a red-black tree is 

Insert and delete algorithms are quite involved.


B-Trees 

The AVL tree and red-black tree allowed some variation 

in 

An alternative idea is to make sure that all root-to-leaf 

paths have 

and allow 

The definition of a B-tree uses a parameter m: 

every leaf 

the root has 

every non-root node has 

Keys are placed into nodes like this: 

Each non-leaf node has 

Each leaf node has 

The keys within a node are


B-Trees (cont’d) 

And we require the extended search tree property: 

For each node x, the i-th key in x is 

and is 

B-trees are extensively used in the real world, for instance, 

database applications. In practice, 

Theorem: The depth of a B-tree is 

Insert and delete algorithms are quite involved.


Tries 

In the previous search trees, each key is 

except for their 

For some kinds of keys, one key might be a 

For example, if the keys are strings, then the key “at” 

is a prefix of the key “atlas”. 

The next kind of tree takes advantage of 

to store them more efficiently. 

A trie is a (not necessarily binary) tree in which 

each node corresponds to 

prefix for each node 

The trie storing “a”, “ale”, “ant”, “bed”, “bee”, “bet”:


Inserting into a Trie 

To insert into a trie: 

insert(x,s): // x is node, s is string to insert 

------------ 

if length(s) = 0 then 

mark x as holding a complete key 

else 

c := first character in s 

if no outgoing edge from x is labeled with c then 

create a new child node of x 

label the edge to the new child node with c 

put the edge in the correct sorted order 

among all of x’s outgoing edges 

endif 

x := child of x reached by edge labeled c 

s := result of removing first character from s 

insert(x,s) 

endif 

Start the recursion 

To insert “an” and “beep”: 

[Figure: the trie storing “a”, “ale”, “ant”, “bed”, “bee”, “bet”, with edges labeled by single characters] 


Searching in a Trie 

To search in a trie: 

search(x,s): // x is node, s is string to search for 

------------ 

if length(s) = 0 then 

if x holds a complete key then return x 

else return null // s is not in the trie 

else 

c := first character in s 

if no outgoing edge from x is labeled with c then 

return null // s is not in the trie 

else 

x := child of x reached by edge labeled c 

s := result of removing first character from s 

search(x,s) 

endif 

endif 
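The insert and search pseudocode can be sketched together in Java. Several choices here are assumptions made for illustration: keys contain only the lowercase letters a to z, each node stores its outgoing edges in a 26-entry array (which keeps them in sorted order automatically), and the recursion is written as a loop. The names TrieSketch, Node, insert, and contains are hypothetical.

```java
// Sketch: a trie over lowercase strings, one array slot per possible edge label.
class TrieSketch {
    static class Node {
        Node[] child = new Node[26];   // child[i] is reached by the edge labeled 'a' + i
        boolean complete;              // true if the path from the root spells a stored key
    }

    final Node root = new Node();

    void insert(String s) {
        Node x = root;
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i) - 'a';
            if (x.child[c] == null) {
                x.child[c] = new Node();   // no edge labeled c yet: create one
            }
            x = x.child[c];                // follow the edge labeled c
        }
        x.complete = true;                 // mark x as holding a complete key
    }

    boolean contains(String s) {
        Node x = root;
        for (int i = 0; i < s.length(); i++) {
            int c = s.charAt(i) - 'a';
            if (x.child[c] == null) {
                return false;              // s is not in the trie
            }
            x = x.child[c];
        }
        return x.complete;                 // a mere prefix of a key does not count
    }
}
```

The array wastes space when most nodes have few children; a sorted list or map of edges is a common alternative.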

Start the recursion 

To search for “art” and “bee”: 

[Figure: the trie storing “a”, “ale”, “ant”, “bed”, “bee”, “bet”, with edges labeled by single characters] 


Hash Table Implementation of Dictionary ADT 

Another implementation of the Dictionary ADT is a 

Hash tables support the operations 

 

 

 

with 

This is a significant advantage over even balanced search 

trees, which have average times of 

The disadvantage of hash tables is that 

and printing all elements in sorted order takes


Main Idea of Hash Table 

Main idea: exploit random access feature of arrays: 

the i-th entry of array A can be accessed 

Simple example: Suppose all keys are in the range 

Then store elements in an array A with 

Initialize all entries to some empty indicator. 

To insert x with key k: 

To search for key k: 

To delete element with key k: 

All times are 

But this idea does not scale well.


Hash Functions 

Suppose 

elements are 

school has 

keys are 

Since there are 1 billion possible SSN’s, we need an 

array of length 1 billion. And most of it will be wasted, 

since only 40,000/1,000,000,000 = 1/25,000 fraction is 

nonempty. 

Instead, we need a way to 

Let M be the size of the array we are willing to provide. 

Use a hash function, h, to 

Then h maps key values to integers in the range


Simple Hash Function Example 

Suppose keys are integers. Let the hash function be 

h(k) = k mod M. Notice that this always gives you 

something in the range 

To insert x with key k: 

To search for element with key k: 

To delete element with key k: 

All times are 

assuming the hash function can be computed in constant 

time. 

The key to making this work is to


Collisions 

In reality, any hash function will have collisions: when 

two different keys 

This is inevitable, since the hash function is squashing 

down a large domain into a small range. 

For example, if h(k) = k mod M, then 

since they both hash to 

What should you do when you have a collision? Two 

common solutions are 

1. 

2.


Chaining 

Keep all data items that hash to the same array location 

in a 

to insert element x with key k: 

to search for element with key k: 

to delete element with key k: 

Worst case times, assuming computing h is constant: 

insert: 

search and delete: 

Worst case is if all n elements
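A minimal sketch of chaining in Java, assuming integer keys, String values, and the hash function h(k) = k mod M. The names ChainedTableSketch and Entry are hypothetical. Insert adds at the front of the chain without searching, so it assumes the key is not already present.

```java
// Sketch: hash table with chaining; each array slot heads a singly linked list.
class ChainedTableSketch {
    static class Entry {
        int key;
        String value;
        Entry next;
        Entry(int key, String value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    final Entry[] table;

    ChainedTableSketch(int m) {
        table = new Entry[m];
    }

    int h(int k) {
        return Math.floorMod(k, table.length);   // always in 0..M-1, even for negative k
    }

    void insert(int k, String v) {
        int i = h(k);
        table[i] = new Entry(k, v, table[i]);     // push onto the front of the chain
    }

    String search(int k) {
        for (Entry e = table[h(k)]; e != null; e = e.next) {
            if (e.key == k) return e.value;
        }
        return null;                              // not found
    }

    boolean delete(int k) {
        int i = h(k);
        Entry prev = null;
        for (Entry e = table[i]; e != null; prev = e, e = e.next) {
            if (e.key == k) {
                if (prev == null) table[i] = e.next;   // unlink the first entry
                else prev.next = e.next;               // unlink a later entry
                return true;
            }
        }
        return false;
    }
}
```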


Good Hash Functions for Chaining 

Intuition: Hash function should 

More formally: 

Impractical to check in practice since 

For example: Suppose the symbol table in a compiler 

is implemented with a hash table. The compiler writer 

cannot know in advance which variable names will appear 

in each program to be compiled. 

Heuristics are used to approximate this condition:


Good Hash Functions for Chaining (cont’d) 

Some issues to consider in choosing a hash function: 

Exploit 

For symbol table example, take into account the kinds 

of variable names that people often choose (e.g., 

x1). 

Hash function should depend on 

For example: if the keys are English words, it is not 

a good idea to hash on the first letter, since many 

words begin with S and few with X.


Average Case Analysis of Chaining 

Define load factor of hash table with M entries and n 

keys to be 

Assume a hash function that is ideal for chaining 

Fact: Average length of each linked list is 

The average running time for chaining: 

Insert: 

Unsuccessful Search: 

O(1) time to compute h(k); $\alpha$ items, on average, in 

the linked list are checked until discovering that k is 

not present. 

Successful Search: 

O(1) time to compute h(k); on average, key being 

sought is in the middle of the linked list, so $\alpha/2$ comparisons 

needed to find k. 

Delete: 

For these times to be O(1), $\alpha$ must be O(1), so n cannot 

be too much larger than


Open Addressing 

With this scheme, there are 

Instead, 

If there is a collision, you have to probe the table – 

You must pick a pattern that you will use to probe the 

table. 

The simplest pattern is to 

and then check 

This is called 

If h(k) = 7, the probe sequence will be


Clustering 

A problem with linear probing: 

If an insert probe sequence begins in a cluster, 

 

 

To reduce clustering, 

to skip over some locations, so locations are not checked 

There are various schemes for how to choose the increments; 

in fact, the increment to use can be


Clustering (cont’d) 

If the probe sequence starts at 7 and the probe increment 

is 4, then the probe sequence will be 

Warning! The probe increment must be 

otherwise you will not search all locations. 

For example, suppose you have table size 9 and increment 

3. You will only search
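The warning can be checked with a few lines of Java. The helper below (a hypothetical name, written for illustration) lists the distinct slots reached by repeatedly adding a fixed increment modulo the table size. All slots are reached exactly when the increment and the table size are relatively prime. The starting slot 7 for the table of size 9 is an assumption carried over from the first example.

```java
// Sketch: which table slots does a fixed probe increment actually reach?
class ProbeCoverageSketch {
    // Returns the distinct slots visited, in order, starting from start.
    static int[] probeSequence(int start, int increment, int tableSize) {
        boolean[] seen = new boolean[tableSize];
        int[] seq = new int[tableSize];
        int count = 0;
        int pos = start % tableSize;
        while (!seen[pos]) {                       // stop when the sequence starts repeating
            seen[pos] = true;
            seq[count++] = pos;
            pos = (pos + increment) % tableSize;   // wrap around the end of the table
        }
        return java.util.Arrays.copyOf(seq, count);
    }
}
```

With table size 9 and increment 3, only three of the nine slots are ever probed; with increment 4, all nine are.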


Double Hashing 

Even when “non-linear” probing is used, it is still true 

that 

To get around this problem, use 

1. One hash function, $h_1$, is used to determine 

2. A second hash function, $h_2$, is used to determine 

If the hash functions are chosen properly,


Double Hashing Example 

Let $h_1(k) = k \bmod 13$ and $h_2(k) = 1 + (k \bmod 11)$. 

To insert 14: start probing at 

Probe increment is 

Probe sequence is 

To insert 27: start probing at 


Probe sequence is 

To search for 18: start probing at 


Probe sequence is
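The probe sequences for this example can be generated mechanically. The Java sketch below uses the two hash functions given on this slide and assumes a table of size 13 (inferred from the modulus in the first hash function) and non-negative keys. The class and method names are hypothetical.

```java
// Sketch: double hashing probe sequences for a table of size 13.
class DoubleHashingSketch {
    static final int M = 13;                        // assumed table size

    static int h1(int k) { return k % 13; }         // where to start probing
    static int h2(int k) { return 1 + (k % 11); }   // probe increment, always in 1..11

    // Returns the M slots in the order they would be probed for key k.
    static int[] probeSequence(int k) {
        int[] seq = new int[M];
        int pos = h1(k);
        int step = h2(k);
        for (int i = 0; i < M; i++) {
            seq[i] = pos;
            pos = (pos + step) % M;                 // advance by the key's own increment
        }
        return seq;
    }
}
```

Because 13 is prime and the increment is between 1 and 11, the increment is always relatively prime to the table size, so every sequence visits all 13 slots. Keys 14 and 27 both start at slot 1 but then diverge because their increments differ.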


Deleting with Open Addressing 

Open addressing has another complication: 

to insert: 

to search: 

Suppose we use linear probing. Consider this sequence: 

Insert $k_1$, where $h(k_1) = 3$, at location 3. 



Delete $k_2$ from location 4 by setting location 4 to 

empty. 

Search for $k_3$. 

Solution: when an element is deleted, instead of marking 

the slot as empty, 

Then the search algorithm needs to continue searching 

if it finds one of those slots.
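One way to realize this solution is sketched below in Java, using linear probing and a special DELETED marker. It is an illustration under assumptions: keys are non-negative integers that are not already in the table when inserted, the hash function is k mod M, and the names TombstoneTableSketch, EMPTY, and DELETED are hypothetical.

```java
// Sketch: open addressing with linear probing and "deleted" markers.
class TombstoneTableSketch {
    static final int EMPTY = -1;      // slot has never held a key
    static final int DELETED = -2;    // slot held a key that was later removed
    final int[] slot;

    TombstoneTableSketch(int m) {
        slot = new int[m];
        java.util.Arrays.fill(slot, EMPTY);
    }

    int h(int k) { return k % slot.length; }   // assumes k >= 0

    boolean insert(int k) {
        for (int i = 0; i < slot.length; i++) {
            int pos = (h(k) + i) % slot.length;
            if (slot[pos] == EMPTY || slot[pos] == DELETED) {
                slot[pos] = k;                 // deleted slots may be reused
                return true;
            }
        }
        return false;                          // table is full
    }

    // Returns the index holding k, or -1 if k is absent.
    int find(int k) {
        for (int i = 0; i < slot.length; i++) {
            int pos = (h(k) + i) % slot.length;
            if (slot[pos] == EMPTY) return -1; // a never-used slot ends the search
            if (slot[pos] == k) return pos;    // DELETED slots are skipped over
        }
        return -1;
    }

    void delete(int k) {
        int pos = find(k);
        if (pos >= 0) slot[pos] = DELETED;     // mark the slot instead of emptying it
    }
}
```

With a table of size 10, inserting 3, 13, and 23 fills locations 3, 4, and 5. After deleting 13, a search for 23 still succeeds because it passes over the marked slot at location 4 instead of stopping there.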


Good Hash Functions for Open Addressing 

An ideal hash function for open addressing would satisfy 

an even stronger property than that for chaining, 

namely: 

This is even harder to achieve in practice than the ideal 

property for chaining. 

A good approximation is double hashing with this scheme: 

 

Generalizes the earlier example.


Average Case Analysis of Open Addressing 

In this situation, the load factor $\alpha = n/M$ is always 

less than 1: 

Assume that there is always at least one empty slot. 

Assume that the hash function ensures that each key is 

equally likely to have each permutation of 

$\{0, 1, \ldots, M-1\}$ as its probe sequence. 

Average case running times: 

Unsuccessful Search: 

Insert: 

Successful Search: 

Delete: 

The reasoning behind these formulas requires more sophisticated 

probability than for chaining.


Sanity Check for Open Addressing Analysis 

The time for searches should 

The formula for unsuccessful search is 

As n gets closer to M, 

so 

so 

At the extreme, when $n = M - 1$, the formula $\frac{1}{1-\alpha} = M$, meaning that


Sorting 

Insertion Sort: 

– Consider 

– Shift 

– Insert 

– Worst-case time is 

Treesort: 

– Insert 

– Then do 

– For a basic BST, worst-case time is 

but average time is 

– For a balanced BST, worst-case time is 

although code is more complicated.


Sorting (cont’d) 

Heapsort: 

– Insert 

– Then 


Mergesort: Apply the idea of 

– Split 

– Recursively 

– Recursively 

– Then 


however, it requires more space.
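A compact mergesort sketch in Java, assuming an int array; the class name MergeSortSketch is hypothetical. It follows the usual divide-and-conquer shape: split the array in half, sort each half recursively, then merge the two sorted halves. The temporary arrays it allocates are the extra space mentioned above.

```java
// Sketch: top-down mergesort that returns a sorted copy of the input.
class MergeSortSketch {
    static int[] sort(int[] a) {
        if (a.length <= 1) {
            return a;                                  // zero or one element: already sorted
        }
        int mid = a.length / 2;                        // split
        int[] left = sort(java.util.Arrays.copyOfRange(a, 0, mid));
        int[] right = sort(java.util.Arrays.copyOfRange(a, mid, a.length));

        int[] out = new int[a.length];                 // merge the two sorted halves
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            out[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) out[k++] = left[i++];  // copy leftovers
        while (j < right.length) out[k++] = right[j++];
        return out;
    }
}
```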


Object-Oriented Software Engineering 

References: 

Standish textbook, Appendix C 

Developing Java Software, by Russel Winder and 

Graham Roberts, John Wiley & Sons, 1998 (ch 8- 

9). 

Outline of material:


Small Scale vs. Large Scale Programming 

Programming in the small: programs done by 

whose length is 

Programming in the large: projects consisting of 

and producing 

Obviously the complications are much greater here. 

The field of software engineering is mostly oriented 

toward 

However, the principles still hold (although simplified) 

for programming in the small. It’s worth understanding 

these principles so that


Object-Oriented Software Engineering 

Software engineering studies 

Object-oriented software engineering uses 

Why object-oriented? 

use of abstractions to 

benefits of encapsulation to 

power of inheritance to 

Experience has shown that object-oriented software engineering 

helps create robust reliable programs with 

promotes the development of programs by


Object-Oriented Software Engineering (cont’d) 

Solutions to specific problems tend to be fragile and 

short-lived: 

To minimize effects of requirement changes 

instead of just focusing on 

Usually the problem domain is fairly stable, whereas a 

If you capture the problem domain as the core of 

your design, then the code is likely to be 

More traditional structured programming tends to lead 

to a


Object-Oriented Software Engineering (cont’d) 

In OO analysis and design, identify 

and model them as 

Leads to 

go downwards to 

go upwards to 

This approach tends to lead to 

and 

For instance, when the requirements change, you may 

have all the basic abstractions right but you 

Aim for 

which are specialized by inheritance to provide


Software Life Cycle 

inception: 

– requirements: 

elaboration: 

– analysis: 

– design: 

– identify reuse: 

implementation 

– 

– 

– 

testing 

delivery and maintenance


Software Life Cycle (cont’d) 

Lifecycle is not followed linearly; 

An ideal way to proceed is by 

implement 

review 

decide 

proceed 

continue 

This supports 

letting you try alternatives and


Requirements 

Decide what the program is supposed to do 

Harder than it sounds. 

Ask the user 

 

 

Involve the user in reviewing the requirements when 

they are produced and the prototypes developed. 

Typically, requirements are organized 

Helpful to construct scenarios, which describe


Requirements (cont’d) 

An example scenario to look up a phone number: 

1. select 

2. enter 

3. 

4. program computes, to 

(do NOT specify data structure to be used at this 

level) 

5. 

Construct as many scenarios as needed until you feel 

comfortable, and have gotten feedback from the user, 

that 

This part of the software life cycle is no different for 

object-oriented software engineering than for non-object-oriented.


Object-Oriented Analysis and Design 

Main objective: 

Analysis and design are two ends of a spectrum: Analysis 

focuses more on the 

while design focuses more on the 

For large scale projects, there might be a real distinction: 

for example, 

might be required to implement 

For small scale projects, there is typically no distinction 

between analysis and design:


Object-Oriented Analysis and Design (cont’d) 

To decide on the classes: 

Study 

Look for nouns in the requirements: 

These will probably turn into 

and/or 

See how the requirements specify interactions between 

things (e.g., each student has a GPA, each 

course has a set of enrolled students). 

Use an analysis method: 

(Particularly aimed at large scale projects.)


An Example OO Analysis Method 

CRC (Class, Responsibility, Collaboration): It clearly 

identifies the Classes, what the Responsibilities are of 

each class, and how the classes Collaborate (interact). 

In the CRC method, you draw class diagrams: 

each class is 

– 

– 

– 

if class 1 is a subclass of class 2, then 

if an object of class 1 is part of (an instance variable 

of) class 2, then 

if objects of class 1 need to communicate with objects 

of class 2, then 

The arrows and lines can be annotated to indicate the 

number of objects involved, the role they play, etc.


CRC Example 

To model a game with several players who take turns 

throwing a cup containing dice, in which some scoring 

system is used to determine the best score: 

This is a diagram of the 

not the 

Object diagrams are trickier since 

Double-check that the class diagram is consistent with 

requirements scenarios.


Object-Oriented Analysis and Design (cont’d) 

While fleshing out the design, after identifying what 

the different methods of the classes should be, figure 

out 

This means deciding what 

Do not fall in love with one particular solution (such as 

the first one that occurs to you). Generate 

and then try to 

Do not commit to a particular solution too early in the 

process. Concentrate on 

The use of ADTs assists in this aspect.


Verification and Correctness Proofs 

Part of the design includes 

You should have some convincing argument as to why 

these algorithms are correct. 

In many cases, it will be obvious: 

 

 

But sometimes you might be coming up with your own 

algorithm, or 

In these cases, it’s important to check what you are 

doing!


Verification and Correctness Proofs (cont’d) 

The Standish book describes one particular way to prove 

correctness of small programs, or program fragments. 

The important lessons are: 

It is possible to 

Formalisms can help you to 

Spending a lot of time thinking about your program, 

no matter what formalism, will 

These approaches are impossible to do 

For large programs, there are research efforts aimed at 

i.e., programs that 

Generally automatic verification is slow and cumbersome, 

and requires some specialized skills.


Verification and Correctness Proofs (cont’d) 

An alternative approach to program verification is 

Instead of trying to verify actual code, 

Represent the algorithm in 

then 

Of course, you might make a mistake when translating 

your pseudocode into Java, but the proving will be 

much more manageable than the verification.


Implementation 

The design is now fleshed out to the level of code: 

 

 

 

 

As the code is written, document the key design decisions, 

implementation choices, and any unobvious 

aspects of the code. 

Software reuse: Use library classes as appropriate (e.g., 

Stack, Vector, Date, Hashtable). Kinds of reuse: 

 

 

 

But sometimes modifications can be more time consuming 

than starting from scratch.


Testing and Debugging: The Limitations 

Testing cannot prove that your program is correct. 

It is impossible to test a program on every single input, 

so 

Even if you could apply some kind of program verification 

to your program, 

And in fact, how do you know that your requirements 

However, testing still serves a worthwhile, pragmatic, 

purpose.


Test Cases, Plans and Logs 

Run the program on various test cases. 

should 

Test cases 

More specifically, 

test on 

test on 

test on 

Organize your test cases according to a 

Purposes: 

make it clear 

ensure that 

The result of running a set of tests is a 

After fixing a bug, you must 

(Winder and Roberts call this the Principle of Maximum 

Paranoia.)


Kinds of Testing 

Unit testing: 

 

 

Integration testing: 

Two approaches to integration testing: 

Bottom-up testing 

Then progress to the next level up: those methods and 

classes that only use the bottom level ones already tested. 

Use a driver to test combinations of the bottom two 

layers. 

Proceed until


Kinds of Testing (cont’d) 

Top down testing proceeds in the opposite direction, 

making 

Reasons to do top down testing: 

to allow software development to 

if you have modules that are mutually dependent, 

e.g., X uses Y, Y uses Z, and Z uses X. You can


Other Approaches to Debugging 

In addition to testing, another approach to debugging a 

program is to 

A third approach is called a 

Some companies give your (group’s) code to another 

group, whose job is to try to make your code break!


Maintenance and Documentation 

Maintenance includes: 

 

 

 

 

Most often, the person (or people) doing the maintenance 

are NOT the one(s) who originally wrote the 

program. 

There are (at least) two kinds of documentation, both 

of which need to be updated during maintenance: 

internal documentation, 

external documentation,


Maintenance and Documentation (cont’d) 

In addition to good documentation, a clean and easily 

modifiable structure is needed for effective maintenance, 

If changes are made in an ad hoc, kludgy way (either because 

the maintainer does not understand the underlying 

design or because the design is poor), the program 

will 

Trying to fix one problem causes something else to 

break, so in desperation you put in some jumps (spaghetti 

code) to try to avoid this, etc. 

Eventually it may be better to replace the program with


Measurement and Tuning 

Experience has shown: 

 

 

These observations suggest that optimizing your program 

can pay big benefits, but that it is smarter to 

How can you figure out where your program is spending 

its time? 

use a tool called an


Measurement and Tuning (cont’d) 

Things you can do to speed up a program: 

find 

replace 

replace 

take advantage of 

Don’t do things that are stupidly slow in your program 

from the beginning. 

On the other hand, don’t go overboard in supposed 

optimizations (that might hurt readability) unless you


Software Reuse and Bottom-up Programming 

The bottom line from section C.7 in Standish is: 

the effort required to build software is 

making use of reusable components can 

So it makes lots of sense to try to reuse software. Of 

course, there are costs associated with reuse: 

 

 

Using lots of reusable components leads to more bottom-up, 

rather than top down, programming. Or perhaps, 

more appropriately,


Design Patterns 

As you gain experience, you will learn to recognize 

good and bad design and build up 

Why not try to exploit other people’s experience in this 

area as well? 

A design pattern captures a component of a complete 

design that has been observed to 

It provides both a solution to a problem and information 

about them. 

There is a growing literature on design patterns, especially 

for object oriented programming. It is worthwhile 

to become familiar with it. For instance, search 

the WWW for “design pattern” and see what you get.


File Structures 

A file is 

Why on mass storage? 

 

 

 

The data is subdivided into 

Each record contains a number of 

One (or more) field is the 

Issue: 

We will discuss sequential files, indexed files, and hashed 

files.


Sequential Files 

Records are conceptually organized in 

The actual storage might or might not be sequential: 

On a tape, 

On a disk, 

Convenient way to batch (group together) a number of 

updates: 

Store the 

Sort the 

Scan through 

Not a convenient organization for accessing a particular 

record quickly.


Indexed Files 

Sequential search is even slower on disk/tape than in 

main memory. Try to improve performance using 

An index for a file is a 

Typically the key field is 

The index can be organized as a list, a search tree, a 

hash table, etc. To find a particular record: 

 

 

 

Multiple indexes, one per key field, allow


Hashed Files 

An alternative to storing the index as a hash table is to 

Instead, hash on the key to find the address of the desired 

record and 

The usual hashing considerations arise.


Databases 

A database is 

 

 

Example: Collection of student records can be viewed 

as a database to be used by: 

 

 

 

 

The advantages of consolidating the data:


Database System Organization 

The “software architecture” of a database system is 

End user calls application software to access the 

data. End user thinks of data 

Application software calls database management system 

(DBMS) software. The applications software 

has a 

DBMS deals with the 

As usual, the advantages of layering are that


Communication with a Database 

Databases usually provide a useful and powerful interface 

for obtaining information from them. So far, 

we’ve just seen requests of the form: 

 

 

 

But suppose you’d like to print out the names of all 

students who are freshmen and either have a 4.0 GPA 

or whose names start with X. 

There are ways to conceptually organize the data to 

allow such queries to be answered efficiently, using 

what are called 

The application software communicates with 

The DBMS must
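
The freshman query described above could be phrased declaratively. The sketch below assumes a relational table and SQL, with Python's standard `sqlite3` module standing in for the DBMS. The table name `students`, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")          # throwaway in-memory database
conn.execute("CREATE TABLE students (name TEXT, year TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Xavier", "freshman", 3.1), ("Alice", "freshman", 4.0),
     ("Bob", "freshman", 3.5), ("Xena", "senior", 4.0)],
)

def freshmen_with_4_or_x(connection):
    """Names of freshmen who have a 4.0 GPA or whose names start with X."""
    rows = connection.execute(
        "SELECT name FROM students "
        "WHERE year = 'freshman' AND (gpa = 4.0 OR name LIKE 'X%') "
        "ORDER BY name"
    )
    return [name for (name,) in rows]
```

The application states what it wants; how the rows are located (scan, index, hash) is left to the DBMS.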


Database Integrity 

Data in a database is typically 

 

 

Thus it must 

Data can be corrupted if 

Example of corrupted data: 

T1 transfers 

T2 inventories 

Suppose this sequence of events occurs: 

T1 subtracts 

T2 gets the 

T2 gets the 

T1 adds 

T2’s total balance is
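
The interleaving above can be replayed step by step. This sketch assumes T1 moves an amount from account A to account B while T2 adds up both balances; the account names and figures are made up for illustration.

```python
def interleaved_run(accounts, amount):
    """Replay the problematic interleaving and return the total that T2 reports."""
    accounts["A"] -= amount          # T1 subtracts the amount from account A
    seen_a = accounts["A"]           # T2 gets the balance of A
    seen_b = accounts["B"]           # T2 gets the balance of B
    accounts["B"] += amount          # T1 adds the amount to account B
    return seen_a + seen_b           # T2's total

accounts = {"A": 500, "B": 300}
reported = interleaved_run(accounts, 100)
actual = accounts["A"] + accounts["B"]
```

Here `reported` is 700 while `actual` is 800: T2 read the data while the transfer was half done, so the amount in flight is missing from its total.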


DB Serializability 

To prevent transactions from interfering with each other, 

the DBMS should 

This property is called 

The DBMS does not have to (and should not) actually 

make the transactions run serially, but if there is a potential 

conflict, 

One solution is 

Before accessing any data item, the transaction must 

Only one transaction at a time can 

If another transaction already has the lock, then 

After accessing all the data items,
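
A minimal sketch of the locking discipline, under stated assumptions: a lock table maps each data item to the transaction holding it, a transaction acquires every lock it needs before touching data, and it releases them all afterwards. In this simplified version a transaction that cannot get a lock gives up and reports failure instead of waiting. The class `LockTable` and the function `transfer` are hypothetical names.

```python
class LockTable:
    def __init__(self):
        self.owner = {}                       # data item -> transaction holding its lock

    def acquire(self, txn, item):
        """Grant the lock if it is free or already held by txn; otherwise refuse."""
        holder = self.owner.setdefault(item, txn)
        return holder == txn

    def release_all(self, txn):
        """Give up every lock the transaction holds."""
        for item in [i for i, t in self.owner.items() if t == txn]:
            del self.owner[item]

def transfer(locks, txn, accounts, src, dst, amount):
    # Phase 1: obtain every lock before accessing any data item.
    for item in (src, dst):
        if not locks.acquire(txn, item):
            locks.release_all(txn)
            return False                      # caller must retry later
    accounts[src] -= amount
    accounts[dst] += amount
    # Phase 2: release the locks only after all accesses are done.
    locks.release_all(txn)
    return True
```

Because no other transaction can read A or B between the subtraction and the addition, the half-finished transfer is never visible.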


Committing and Aborting a Transaction 

Two-phase locking can lead to deadlock, e.g.: 

 

 

 

 

The DBMS must periodically check for deadlock, and 

if one is discovered, it must 

If the aborted transaction has already made changes to 

the database, the DBMS must 

either 

don’t actually 

Once the transaction has successfully completed, then 

it is
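
One common way to check for deadlock is to look for a cycle in a "waits-for" graph. The sketch below assumes the DBMS can produce a mapping from each transaction to the set of transactions whose locks it is waiting on; the function name `find_deadlock` and the graph representation are illustrative assumptions.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none.

    waits_for maps each transaction to the set of transactions it waits on.
    """
    visiting, done = [], set()

    def visit(txn):
        if txn in done:
            return None
        if txn in visiting:                       # came back to a transaction on the path
            return visiting[visiting.index(txn):]
        visiting.append(txn)
        for other in sorted(waits_for.get(txn, ())):
            cycle = visit(other)
            if cycle:
                return cycle
        visiting.pop()
        done.add(txn)
        return None

    for txn in sorted(waits_for):
        cycle = visit(txn)
        if cycle:
            return cycle
    return None
```

Any transaction on the returned cycle is a candidate to abort; aborting it releases its locks and lets the others proceed.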


Artificial Intelligence 

Goal: Develop machines that 

 

 

and proceed “intelligently” 

 

 

 

Distinct but related goals: 

1. 

2. 

3.


8-Puzzle Example 

Given a 3-by-3 box that holds 8 tiles, numbered 1 through 

8. One tile is missing. The goal is to start with the tiles 

scrambled and 

We will try to solve this problem by a machine that has 

a gripper, 

a video camera, 

a computer, 

a “finger”, 

Ideas from mechanical engineering can be used to implement 

the gripper and the finger. We will talk about 

how to “see” where the tiles are, and how to decide 

how to move the tiles.


Computer Vision 

It is not enough to simply store the image obtained 

from the camera. The program must be able to 

figure out which parts of the image are the salient 

objects, called 

and then recognize the objects by comparing them 

to known symbols, called 

For the 8-puzzle, this problem can be highly simplified: 

always expect the digits to 

 

 

 

But in general this is a very difficult problem and one 

where there has been extensive research.


Reasoning 

How can the program solve the puzzle? 

One solution is to 

For example, if the input is 

then the solution is to 

But in this case there are approximately 9! = 362,880 

different inputs, some of which require a long sequence 

of moves to solve, and it would require a lot of space. 

Plus, someone would have to figure out all the answers 

in advance.


Production Systems 

Instead, have the program figure out the solution. One 

approach is the 

First, consider the state graph of the problem: 

 

 

Here is a tiny piece of the state graph for the 8-puzzle: 

Identify the 

The control system figures out how to
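
The state graph never has to be stored explicitly; the edges leaving a state can be generated on demand. A minimal sketch, assuming a state is a tuple of nine entries read row by row with 0 marking the empty cell (this encoding and the name `successors` are assumptions):

```python
def successors(state):
    """Yield every state reachable from `state` by sliding one tile.

    A state is a tuple of 9 entries read row by row; 0 marks the empty cell.
    """
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:            # stay inside the 3-by-3 box
            other = r * 3 + c
            cells = list(state)
            cells[blank], cells[other] = cells[other], cells[blank]
            yield tuple(cells)
```

Each yielded state corresponds to one edge in the state graph; a state has 2, 3, or 4 successors depending on where the empty cell is.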


Solving a Production System 

We must find a path through the state graph from 

Luckily, finding paths in graphs is 

One way is to build a search tree (not to be confused 

with a binary search tree), which 

Two solutions are


Breadth-First Search 

Build the search tree in a breadth-first manner: 

The root 

The next level 

The next level 

For example: 

[Figure: the first levels of the breadth-first search tree for an 8-puzzle instance.]

But the search tree grows exponentially.
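
Breadth-first construction of the search tree can be sketched as follows. The state encoding (a tuple of nine entries, 0 for the empty cell), the goal layout, and the function names are assumptions. Unlike a plain search tree, this version skips states it has already generated, which keeps it from revisiting the same layout.

```python
from collections import deque

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)   # assumed goal layout; 0 is the empty cell

def neighbors(state):
    """All states reachable by sliding one tile into the empty cell."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            cells = list(state)
            cells[blank], cells[r * 3 + c] = cells[r * 3 + c], cells[blank]
            yield tuple(cells)

def bfs(start, goal=GOAL):
    """Return (shortest list of states from start to goal, states generated)."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()                  # oldest state first: level by level
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1], len(parent)
        for nxt in neighbors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None, len(parent)
```

Because states are expanded level by level, the first time the goal comes off the queue the path to it uses the fewest possible moves.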


Depth-First Search 

Another approach is 

Pursue more promising paths to greater depths and 

consider other options only if 

To implement this idea, we need some criterion to decide 

which paths are promising, or appear to be promising. 

Such criteria are called heuristics. A heuristic is 

We need something quantitative so we can


Heuristic for 8-Puzzle 

For the 8-puzzle example, our intuitive rule of thumb 

is to 

A quantitative heuristic measure is: 

For instance, if the input is 

then the heuristic measure is 

This heuristic has two desirable properties: 

1. it is a 

2. it is
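
One quantitative measure that is often used for the 8-puzzle is the sum, over all tiles, of how many rows and columns each tile is away from its destination. Whether this is exactly the measure intended on the slide is an assumption; the goal layout and the name `manhattan` are also assumptions.

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)   # assumed goal layout; 0 is the empty cell

def manhattan(state, goal=GOAL):
    """Sum of row and column distances of every tile from its goal position."""
    total = 0
    for position, tile in enumerate(state):
        if tile == 0:
            continue                             # the empty cell is not a tile
        target = goal.index(tile)
        total += abs(position // 3 - target // 3) + abs(position % 3 - target % 3)
    return total
```

For the layout 1 2 3 / 4 _ 6 / 7 5 8 the measure is 2: tiles 5 and 8 are each one step from home. The solved puzzle has measure 0.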


Using a Heuristic in Depth-First Search 

Repeatedly 

Choose the 

Generate 

Continue 

In the 8-puzzle example above: 

Generate the root. Its heuristic measure is 

Generate all children of the root. They have measures 

Choose the leaf with measure 2 and generate all its 

children. They have measures 

Choose the leaf with measure 1 and generate all its 

children. They have measures 

In this depth-first search, we only had to generate 9 

states, instead of
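
The procedure above (repeatedly expand the leaf with the smallest measure) can be sketched with a priority queue. This is often called greedy best-first search. The heuristic used here, the state encoding, and all names are assumptions, and already generated states are skipped.

```python
import heapq

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)   # assumed goal layout; 0 is the empty cell

def neighbors(state):
    """All states reachable by sliding one tile into the empty cell."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            cells = list(state)
            cells[blank], cells[r * 3 + c] = cells[r * 3 + c], cells[blank]
            yield tuple(cells)

def measure(state):
    """Assumed heuristic: total row plus column distance of tiles from home."""
    return sum(abs(p // 3 - GOAL.index(t) // 3) + abs(p % 3 - GOAL.index(t) % 3)
               for p, t in enumerate(state) if t != 0)

def best_first(start):
    """Return (list of states from start to goal, number of states generated)."""
    frontier = [(measure(start), start)]         # leaves, ordered by measure
    parent = {start: None}
    while frontier:
        _, state = heapq.heappop(frontier)       # most promising leaf
        if state == GOAL:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1], len(parent)
        for nxt in neighbors(state):
            if nxt not in parent:
                parent[nxt] = state
                heapq.heappush(frontier, (measure(nxt), nxt))
    return None, len(parent)
```

The search follows whatever looks best locally, so it usually generates far fewer states than breadth-first search, but the path it finds is not guaranteed to be the shortest.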


Other Applications of Production Systems 

Many problems can be formulated as production systems. 

In addition to the 8-puzzle, 

You can even model the process of drawing logical 

conclusions from a set of given facts as a production 

system. In this case, 

each state is 

a production/rule/move corresponds to 

For instance, part of the state graph might be: 

since there is a rule of logic that says: Given the facts 

1. 

2. 

then you can deduce that


Some Other Areas of AI 

Neural Networks: Try to take advantage of the power 

of parallelism (multiprocessor computer architectures) 

using a paradigm that (roughly) follows the model of 

Robotics: Hardware and software working together, 

e.g., automated manufacturing. Great interest in having 

machines explore and function in uncontrolled and 

unpredictable environments, such as 

 

 

 

Expert Systems: Combine domain specific knowledge 

from human experts with 

For example:


Time Complexity of an Algorithm 

Time complexity of an algorithm: the function T(n) 

that describes the 

Given a particular algorithm, discover this function by 

attacking the problem from two directions: 

find an upper bound U(n) on the function T(n), i.e., 

convince ourselves that the algorithm will 

find a lower bound L(n) on the function T(n), i.e., 

convince ourselves that, for each n, there is 

Try to find smallest U and largest L, so that T is squeezed 

in between and has no room to hide.


Time Complexity of an Algorithm (cont’d) 

(a) No execution on an input of size n_0 takes 

(b) The slowest execution on all inputs of size n_0 takes 

(c) At least one execution on an input of size n_0 takes


Time Complexity of Heapsort 

Let T(n) be the time complexity of heapsort. 

First cut at upper bound: 

First cut at lower bound: 

Refined argument for upper bound: each heap operation 

never 

Refined argument for lower bound: Describe a particular 

input that 

On input n, n − 1, n − 2, ..., 3, 2, 1, running time is at 

least 

Thus T(n) is now precisely identified as
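
The bounds can be checked empirically by counting key comparisons. The sketch below uses the common in-place, array-based variant of heapsort, which may differ from the version presented in the course, and the helper `comparison_bound` encodes a deliberately loose upper bound of 3 n log2 n derived for this particular implementation. Both are assumptions for illustration.

```python
import math

def heapsort(items):
    """Sort with heapsort; return (sorted_list, number_of_key_comparisons)."""
    a = list(items)
    n = len(a)
    comparisons = 0

    def sift_down(root, end):
        nonlocal comparisons
        while 2 * root + 1 < end:
            child = 2 * root + 1
            if child + 1 < end:
                comparisons += 1
                if a[child] < a[child + 1]:
                    child += 1                   # pick the larger child
            comparisons += 1
            if a[root] >= a[child]:
                return                           # heap order restored
            a[root], a[child] = a[child], a[root]
            root = child

    for root in range(n // 2 - 1, -1, -1):       # build a max-heap
        sift_down(root, n)
    for end in range(n - 1, 0, -1):              # move the max to the back, repeat
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a, comparisons

def comparison_bound(n):
    """Loose upper bound: about 1.5n sift-downs, each at most 2*log2(n) comparisons."""
    return 3 * n * math.log2(n)
```

Each sift-down walks at most one root-to-leaf path, which is where the log n factor in the upper bound comes from.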


Time Complexity of a Problem 

Time complexity of a problem: the time complexity 

for 

To show that a problem has time complexity T(n): 

Identify a 

Then prove 

Example: Sorting problem has time complexity O(n log n). 

 

It can be proved that 

Problems can be classified by their time complexity. 

Harder problems are considered to be those


The Class P 

All problems (not algorithms) whose time complexity 

is at most some polynomial are said to be 

Example: 

Not all problems are in P. 

Example: Consider the problem of listing all permutations 

of the integers 1 through n. 

Output size is 

Thus running time is 

n! is larger than 2^n, thus
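
A quick check of the counting argument, using Python's standard `itertools.permutations`; the helper name `list_permutations` is hypothetical. Note that n! exceeds 2^n only from n = 4 onward (3! = 6 is less than 2^3 = 8).

```python
from itertools import permutations
from math import factorial

def list_permutations(n):
    """Return every ordering of the integers 1 through n; there are n! of them."""
    return list(permutations(range(1, n + 1)))

# Compare the output size n! with 2^n for a few small n.
growth = [(n, factorial(n), 2 ** n) for n in range(1, 11)]
```

Since merely writing the output takes time proportional to n!, no algorithm for this problem can run in polynomial time.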


NP-Complete Problems 

There is an important class of problems that 

These problems are called 

These problems have the following characteristic: 

 

 

Many real-world problems in science, math, engineering, 

operations research, etc. are NP-complete.


Traveling Salesman Problem 

An example NP-complete problem is the 

Given a set of cities and the distances between them, 

determine an order in which to 

A candidate solution for TSP is 

To check whether the allowed mileage is exceeded, add 

up the distances between adjacent cities in the listing, 

which will take 

But the total number of different candidate solutions is
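
The contrast between checking one candidate and trying them all can be sketched as follows. This assumes the distances are given as a symmetric matrix, that a candidate is a listing of all the cities, and that only the legs between adjacent cities in the listing are counted (no return leg). The names `path_length`, `within_budget`, and `brute_force`, and the sample matrix, are hypothetical.

```python
from itertools import permutations

def path_length(dist, order):
    """Add up the distances between adjacent cities in the listing: one pass."""
    return sum(dist[order[i]][order[i + 1]] for i in range(len(order) - 1))

def within_budget(dist, order, budget):
    """Verify a single candidate solution quickly."""
    return path_length(dist, order) <= budget

def brute_force(dist, budget):
    """Try every possible listing of the cities; there are n! of them."""
    for order in permutations(range(len(dist))):
        if within_budget(dist, order, budget):
            return order
    return None

# Four cities with made-up distances.
D = [[0, 2, 9, 10],
     [2, 0, 6, 4],
     [9, 6, 0, 3],
     [10, 4, 3, 0]]
```

Verification touches each city once, while the exhaustive search may have to examine every ordering before it can answer "no".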


P vs. NP 

Imagine an (unrealistically) powerful model of computation 

in which the computer first makes a lucky guess 

(a nondeterministic choice) as to a candidate solution 

in constant time, and then behaves as an ordinary computer 

and verifies the solution. 

Problems solvable on this computer in polynomial time 

are 

NP includes 

Having polynomial running time on this funny computer 

would not seem to ensure polynomial running 

time on a real computer. 

That is, it seems likely that 

But no one has yet been able to prove P ≠ NP. Outstanding 

open question in CS since the 1970s.


Computability Theory 

Complexity theory focuses on 

Computability theory focuses on 

We will focus on computing (mathematical) functions, 

with inputs and outputs. 

We would like to know if there exist functions that


Church-Turing Thesis 

First, we have to decide what constitutes an algorithm. 

Assembly languages have 

High-level languages have 

 

Church-Turing thesis: (“thesis” means “conjecture”) 

Anything that can reasonably be considered an algorithm 

can be 

A Turing machine is a 

Thus, for theoretical purposes,


Computing Functions 

Some sample functions: 

f(n) = 3: 

f(n) = 2n: 

f(n) = sin n: 

There exist non-computable functions, functions whose 

input/output relationships are so complicated that there 

is no 

We will assume 

your 

with a 

only consider


Goedel Number of a Program 

Here is a way to convert a program into an integer. 

 

 

Conversely, any integer can be converted 

Most of the time, 

Sometimes it 

Rarely, 

More rarely, 

Use this numbering scheme to
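
One possible numbering scheme, offered only as an assumption since the slide's own scheme is not shown here: treat the bytes of the program text as the digits of one large base-256 integer. The function names are hypothetical, and the scheme assumes the text does not begin with a zero byte.

```python
def goedel_number(program_text):
    """Interpret the program's UTF-8 bytes as one big base-256 integer."""
    return int.from_bytes(program_text.encode("utf-8"), "big")

def program_from_number(n):
    """Convert any non-negative integer back into text.

    Most integers decode to garbage rather than to a legal program;
    bytes that are not valid UTF-8 are shown as replacement characters.
    """
    data = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return data.decode("utf-8", errors="replace")
```

With such a scheme every program has a number, so all programs can in principle be listed in numerical order.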


An Uncomputable Function 

Define a function h called the 

If the program with Goedel number n halts when its 

input is n, then 

If the program with Goedel number n does not halt 

when its input is n, then 

Theorem: h is uncomputable 

Proof: Assume in contradiction that h is computable. 

Then 

Define another program I (which will be in the listing): 

1. n 

2. run program H 

3. let x be 

4. if x = 0 then 

5. else


An Uncomputable Function (cont’d) 

Let n_I be the Goedel number of I.

Case 1: 

Case 2: 

Thus the hypothetical program H 

□ 

Another way to view this result is that
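
The shape of the argument can be imitated in code. The sketch below passes Python function objects around instead of Goedel numbers, and the choice of which branch loops is an assumption about the slide's program I. The names `make_contrary` and `always_says_loops` are hypothetical.

```python
def make_contrary(claimed_h):
    """Build a program that does the opposite of what claimed_h predicts for it."""
    def contrary():
        if claimed_h(contrary):      # the tester predicts "contrary halts" ...
            while True:              # ... so loop forever instead
                pass
        return "halted"              # the tester predicts "never halts", so halt
    return contrary

def always_says_loops(program):
    """A (wrong) candidate halting tester: claims that no program ever halts."""
    return False

contrary = make_contrary(always_says_loops)
```

Calling `contrary()` returns at once, contradicting the tester's prediction. A tester that answered True instead would be contradicted by an infinite loop, so no candidate can be right about the program built from it.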
