Network

  • November 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Network as PDF for free.

More details

  • Words: 59,352
  • Pages: 156
The Core FAQ File By:-- 03MX (2003-2006) This watermark does not appear in the registered version - http://www.clicktoconvert.com

2

Table of Contents:- Pages I) Computer Networks - 3 - 40 II) Database Management System - 41 - 82 III) Operating System - 83 - 129 IV) Object Oriented Programming - 130 - 177 V) Data Structures and Algorithms - 178 - 216 This watermark does not appear in the registered version - http://www.clicktoconvert.com

3

Part – I COMPUTER NETWORKS This watermark does not appear in the registered version - http://www.clicktoconvert.com

4

Topics Covered:i) Introduction to Network and Physical Layer – Nigitha S J ii) Data Link Layer and Medium Access Control Sublayer -- Jeevan Prakaash D iii) Network Layer and Transport Layer – Sameera S, Gohila G iv) Application Layer & Network Security –Raghu shankar R v) TCP/IP - Sree Vidhya B This watermark does not appear in the registered version - http://www.clicktoconvert.com

5

(i) INTRODUCTION TO NETWORKS AND PHYSICAL LAYER 1. What is Computer Networks? Interconnection of autonomous computer . One system cannot force other system to shut down. 2. What is Data Communication? Data Communication is the exchange of data between two devices via some form of transmission medium. 3. What is Protocol?

Set of rules that govern data communication. 4. What is Line Configuration? Line Configuration refers to the way two or more communication devices attach to a link. 5. What is Link? Link is the physical communication pathway that transfers data from one device to another. 6. State the Categories of line configuration & explain them. i. Point-to-Point A Point-to-Point line configuration provides a dedicated link between two devices. ii. Multipoint A multipoint line configuration is one in which more than two specific devices share a single link. 7. What is Topology? Pattern of interconnection of nodes or system.. 8. What is Mesh Topology? State its advantage & disadvantage. In a Mesh Topology, every device has a dedicated point-to-point link to every other device. Advantages: This watermark does not appear in the registered version - http://www.clicktoconvert.com

6 1] The use of dedicated links, eliminates the traffic problems. 2] It is robust. 3] Another advantage is privacy or security. Disadvantages: The main disadvantage of a mesh is related to the amount of cabling & the number of I/O ports required. 9. What is Star Topology? In a star topology, each device has a dedicated point-to-point link to central controller, called a hub. 10. Define the term Transmission mode. It is used to define the direction of signal flow between two linked devices. 11. Explain the three types of transmission modes. 1] In Simplex mode, the communication is unidirectional. Only one of the two stations on a link can transmit, the other can only receive. 2] In Half-duplex, each station can both transmit and receive, but not at the same time. 3] In Full-duplex mode, both stations can transmit and receive simultaneously. 12. Connection-Oriented Service To use connection-oriented service, the user first establishes a connection, uses the connection and then releases the connection. The sender sends the data in one end, and the receiver takes them out in the same order at the other end. 13. Connectionless Service In connectionless service, each message carries the full destination address and each one is routed through the system independent of all the others. Here the Order of injection is not same as order of delivery. 14. Define Gateway.

This watermark does not appear in the registered version - http://www.clicktoconvert.com

7 A device used to connect two separate networks that use different communication protocols. 15. Define Repeater. A device used in a network to strengthen a signal as it is passed along the network cable . It works at the Physical Layer of the OSI model. A device that receives, amplifies, and retransmits a signal. Consists of a transceiver (transmitter and receiver). It is used to boost signal levels to extend the system distance . 16. Define Router. A communications device between networks that determines the best path between them for optimal performance. Routers are used in complex networks of networks such as enterprise-wide networks and the Internet. 17. Define Switch. A network traffic monitoring device that controls the flow of traffic between multiple network nodes. 18. What is the function of Physical Layer? 1] It transmits a bit stream over a physical medium. 2] It defines the characteristics of the interface between the devices and the transmission medium. 3] It also defines the type of transmission medium. 4] It defines the direction of transmission between two devices i,e; transmission mode. 19. Data rate. The number of bits sent each second is known as transmission rate or data rate. Note: Physical Layer is one of the network support layers. Physical Layer is reliable bit pipe. Physical Layer -the first layer of the OSI model, responsible for the mechanical and electrical specifications of the medium. This watermark does not appear in the registered version - http://www.clicktoconvert.com

8

(ii) DATA LINK LAYER AND MEDIUM ACCESS CONTROL SUBLAYER 1) What are the functions of Data Link Layer? The functions of Data Link Layer are to provide 1) Well define interface with Network layer 2) Deal transmission errors 3) Regulate the flow of Data 2) What is a frame? What does it compose of? The Data link layer takes the packets it gets from the network layer and encapsulates them into frames. Each frame consists of 1) Frame Header 2) Payload Field ( for holding the packet) 3) Trailer

3) What are the services provided by Data Link Layer to the Network Layer? When are they used? Three services are provided to the Network Layer which include : 1) Unacknowledged Connectionless service ->Used when error rate is low & when recovery is done by higher layers ->Used for real time traffic where late data is worse than bad data 2) Acknowledged Connectionless Service -> Used for unreliable channels; such as wireless systems 3) Acknowledged Connection oriented systems ->Used for reliable channels 4) How is the error detected in the Data Link Layer? The error is detected by the sequence of steps which are : 1) Break the bit stream into discrete frames and compute check sum 2) At destination recompute checksum when frame reaches 3) If both are the same, error has ocurred 5) What are the different framing methods available with Data link layer? The different framing methods include: 1) Character Count : Uses a field in the header to specify the no of chacracters in a frame. The destination now knows how many characters follow and where the end of each frame is. This watermark does not appear in the registered version - http://www.clicktoconvert.com

9 2) Flag bytes with byte stuffing: Start and end of frames are FLAG bytes. 2 consecutive FLAG bytes denote a frame. Disadvantage : FLAG byte could also occur as data Rectified by: Stuffing ESC character before the byte (byte stuffing) 3) Starting and ending flags with bit stuffing : Here , a unique byte “01111110” is used as frame delimiter. Disadvantage: The pattern of flag could also occur as data Rectified by: Stuffing 0 after 5 consecutive 1’s (Bit Stuffing) 4) Physical layer coding violation : This is applicable only for those n/wks where encoding in physical medium contains some redundancy 6) What are the methods of Flow Control? The various methods of Flow Control are: 1) Feed Back based Flow Control: Receiver sends back information to sender to permitting to send more data. 2) Rate based Flow Control: Here the protocol has a built in mechanism that limits the rate at which sender transmits data without feedback from receiver. 7) What are the strategies in Error handling? When are they used? The strategies are : 1) Error Detection: Used in highly reliable channels ( as retransmission is cheaper) 2) Error Correction: Used in unreliable wireless links that make may errors 8) What is a codeword? A frame of ‘m’ bits of data (message) has ‘r’ redundant bits (check bits). The bit

unit containing data and check bits is called as codeword. Therefore, codeword n = m + r 9) What is Hamming distance? What is the distance required i) to detect ‘d’ errors and ii) to correct ‘d’ errors? The number of bit positions in which 2 codewords differ is called as Hamming Distance.It is obtained by ORing 2 codewords 10) How do you find the Hamming distance of a complete code? From the list of legal codewords, find 2 codewords whose hamming distance is minimum. This is the hamming distance for the complete code. This watermark does not appear in the registered version - http://www.clicktoconvert.com

10 11) Given the list of codewords: 0000000000,0000011111,1111100000,1111111111 Comment on Hamming distance and Error Control Strategy. The Hamming distance of the complete code = 5 (ORing 0000000000 , 0000011111). Now 2d + 1 = 5; Therefore, this can correct double errors. Now , if the codeword 0000000111 arrives, then receiver knows that original code word must have been 0000011111 12) How to decide on the no of redundant bits ‘r’ to be assigned to a message ‘m’? This is decided based on formula m+r+1 <= 2 power r; 13) What is the limitation of Hamming code? They could correct only single bit errors. They can be used to correct burst errors but then they can correct only one error per code word. 14) What does a burst error imply? A burst error does not imply that all the bits are wrong but it implies that at least the first and last are wrong. 15) Brief about Cyclic Redundancy. This is basically a polynomial code. Here we have, Message to be sent : M(x) called the Message polynomial Agreed upon Polynomial : G(x) called the Generator Polynomial Calculated value : T(x) called the Transmission Polynomial. Error : E(x) called the Error Polynomial Receiver Calculates : ( T(x) + E(x)) /G(x) Results 1) If G(x) contains 2 or more terms – single bit errors are detected. 2) if there are 2 single bit errors E(x) = x power i + x power j ; where i> j if G(x) is not divisible by x ,then the sufficient condition that all the double bit errors are detected is : G(x) does not divide x power k + 1 for k up to max value of i-j . 3) If there are odd number of bits in error then E(X) contains odd number of terms. Polynomial with odd number of terms must be divisible by x + 1. This watermark does not appear in the registered version - http://www.clicktoconvert.com

11

4) A polynomial code with r checkbits will detect all the burst errors of length < r; 16) What are the fields in a frame? The frame consists of 2 fields viz., kind,seq,ack,info. Of these the first three contain control info and are called as Frame headers. kind – tells whether there is data in a frame. seq – indicates the sequence number. ack – represents the acknowledgement number info – represents the actual data 17) What does the destination Data link layer do as son as it receives a frame? This extracts the packet from the frame and passes the packet to destination network layer. 18) What are the different elementary Data link layer protocols? What are the assumptions made in each protocol? The different elementary Data link layer protocols are : 1) Unrestricted Simplex Protocol (Unrealistic protocol) Assumptions 1.1) Data Transmission is simplex 1.2) Sender’s and Receiver’s n/w layers are ready 1.3) Process time ignored 1.4) Infinite buffer space. 1.5) Error free communication protocol. 2) Simplex Stop and Wait Protocol Assumptions 2.1) Restricted Buffer Space 2.2) Error prone channel 2.3) Receiver sends dummy frame as ack for synchronization. 3) Simplex Protocol for Noisy Channel Assumptions 3.1) Restricted Buffer. 3.2) Error prone channel 3.3) Sender also adds sequence number for each frame to prevent duplication. 3.4) Retransmissions are done based on time out of the timers 19) What is a PAR or ARQ? This watermark does not appear in the registered version - http://www.clicktoconvert.com

12 PAR stands for Positive Acknowledgement with Retransmission and ARQ stands for Automatic Repeat Request. Both of them actually refer to protocols in which sender waits for a Positive Acknowledgement before advancing to next data item. 20) What is piggy backing? What is its advantage? What is the complication it introduces? The technique of temporarily delaying outgoing acknowledgements so that they can be hooked to next outgoing data frame is called Piggy Backing. The principal advantage it brings is the better use of available channel bandwidth. The complication it introduces is how long the data link layer must wait to send piggy backed acknowledgements. If it is more than sender’s time out period , frame

retransmitted. 21) What are Sliding Window Protocols? They are bidirectional protocols. At any instant of time , sender maintains a set of sequence numbers corresponding to frames it is permitted to send. The sender’s window and the receiver’s window need not have the same lower and upper limit or even have the same size. 22) What is a one-bit sliding window protocol? They refer to the Sliding Window Protocol with maximum window size 1. Such protocols uses stop and wait since sender transmits a frame and waits for its acknowledgement before sending next one. 23) What is Pipelining? The time required for the ack frame to come back to sender is called Round Trip Delay. The product of Bandwidth and Round Trip delay (bandwidth * Round Trip Delay) basically tells what the capacity of pipe is and the sender needs the ability to fill it without stopping in order to operate at higher efficiency. This technique is known as Pipelining. 24) What are the approaches to deal with errors in presence of Pipelining? There are basically 2 approaches which are a) Go Back n : This is for the receiver to simply discard all the subsequent frames, sending no acknowledgement for discarded frames. b) Selective Repeat: The bad frame received is discarded but the good frames received after it is buffered. This watermark does not appear in the registered version - http://www.clicktoconvert.com

13 25) What is NAK? What does NAK 2 signify in a) Go Back n and b) Selective Repeat ? NAK generally stands for Negative Acknowledgement. It is sent by the receiver to the sender whenever the receiver encounters an error like checksum error and frame out of sequence. a) NAK 2 in Go Back n would cause the sender to retransmit all the frames from frame no 2 b) NAK 2 in Selective Repeat would cause the sender to retransmit only frame no 2 26) What is a finite state model? This is a concept n which each protocol machine (sender or receiver) is always in a specific state at every instant of time. The state consists of all values of its variables, including the program counter. The no of states is 2 power n , where n is the number of bits required to represent all the variables combined. 27) When does a transition occur for a protocol machine? Transition occurs when a frame is sent, when a frame is sent, when a frame arrives, when timer expires , when an interrupt occurs etc. 28) What are the events associated with a channel? Typical events are i) Insertion of New Frame on to channel by a protocol machine ii) Delivery of a frame by a protocol machine iii) Loss of frame due to noise 29) What is Reachability Analysis? What is its significance? From the initial state ( ie the state corresponding to the description of the system

when it starts running), it could be possible to determine which states are reachable and which are not. This technique is called Reachability Analysis. Its significance is determining whether a protocol is correct or not. 30) What is Deadlock? This is the situation in which the protocol makes no forward progress (ie deliver packets to n/w layer) no matter what sequence of events happen. In graph model, deadlock is characterized by existence of subset of states that is reachable from initial state and has 2 properties: i) There is no transition out of subset. ii) There is no transition in subset that cause forward progress. This watermark does not appear in the registered version - http://www.clicktoconvert.com

14 31) What are the errors found in Reachability Analysis? The errors found are as follows: 1) Incompleteness : It is possible for a certain frame to occur in a certain state and finite state machine does not say what action should be taken, the specification is an error. 2) Deadlock 3) Extraneous Transition: It says of how to handle event in a state in which event cannot occur 32) What is Petri Net Model? What are its constituents? This model , like finite state machine is a technique for formally specifying protocols. A Petri Net has four elements namely :Place – Represents the states which the system may be in. token – Represents in which state currently the system may be in Transition- indicated by horizontal or vertical line Arcs – Both the input and output arcs. 33) Brief about working of Petri Net Model. The Working of this model is as follows: - Transition is enabled if there is atleast 1 input token - Any enabled transition may fire t will - if 2 or more transition are enabled, any one of them may fire. The choice of the transition to fire is indeterminate. That is why they are useful for modeling protocols. THE MEDIUM ACCESS CONTOL SUBLAYER Broadcast channels are also called as multi-access channels or random access channels. E.g.: A conference call involving connection of several telephone lines. 1. What do you understand by medium access control (MAC) sublayer? The protocols used to determine who goes next on a multi-access channel belong to a sublayer of datalink layer called the MAC. Note: Many of LANs use multi-access channel as basis for communication. WAN uses point-to- point links, except for satellite networks. This watermark does not appear in the registered version - http://www.clicktoconvert.com

15 2. What is frequency division multiplexing (FDM)?

It is a technique of multiple-access channel where if there are N users, the bandwidth is divided into N equal sized portions, each user being assigned a portion. What are the advantages and disadvantages of frequency division multiplexing(FDM)? ADVANTAGES: There is no interference between users. When there is small and constant number of users, FDM is simple and effective mechanism. DISADVANTAGES: When some users are quiescent , their bandwidth is simply lost. They are not applicable to bursty traffics. Permissions will be denied if there are N users. 3. What are the key assumptions in dynamic channel allocation in LANs and MANs. There are 5 key assumptions which are: Station Model: The model consists of N independent stations or terminals. Once frames are generated here, station is blocked and does nothing until frame successfully transmitted. Single channel assumptions. Collision assumption: When transmitted frames overlap in time, resulting signal is garbled. This event is called collision. Carrier Sense: Stations can tell if the channel is in use before using. No carrier sense: Stations cannot sense the channel before using. 4. What is a “Carrier”? It refers to the electric signal on the cable. What are the multiple access protocols you are familiar with? 1. ALOHA – Abramson’s Local Hawaii Networks. 2. Carrier Sense Multiple Access Protocols. 3. Collision free protocols. 4. Limited contention protocols. 5. Wavelength division multiple access protocols. 6. Wireless LAN protocols. 5. What are the two versions of ALOHA? This watermark does not appear in the registered version - http://www.clicktoconvert.com

16 They are pure ALOHA and slotted ALOHA. The former requires global time synchronization and the latter doesn’t. 6. What do you understand from pure ALOHA? Users transmit whenever there is a data to be sent. Whenever there is a collision, colliding frames are damaged. Sender keeps listening to the channel, if frame is destroyed sender waits a random amount of time and sends it again. 7. What are contention systems? Systems in which multiple users share a common channel in a way that can lead to conflicts are widely known as contention systems. 8. What is slotted ALOHA technique? Here the time is divided into discrete interval, each interval corresponding to one frame.

Here a computer is not permitted to send whenever carriage return is typed but is requested to wait for the beginning of next slot. 9. What are carrier sense protocols? In local area networks, it’s possible for stations to detect what other stations are doing and behave accordingly. protocols in which stations listen for a carrier (ie. A transmission) and act accordingly are called carrier sense protocols. 10. What are persistent, non-persistent, p-persistent CSMA? Persistent: When station has data to send , it listens to channel to find whether it is busy. If busy then station waits until it becomes idle. When channel idle, station transmits a frame. If collision occurs, each station waits for a random amount of time and starts all over again. Here the station transmits with a probability 1 when channel is idle. Non-persistent: Same as former but difference is that: Whenever the channel is busy, the station does not continually monitor channel but waits for a random period of time and repeats the algorithm. P-persistent: This appears to slotted channels. When station ready to send and channel idle, transmits with a probability “p”. With probability of q=1-p, it differs until the next slot. This watermark does not appear in the registered version - http://www.clicktoconvert.com

17 11. What is CSMA/ CD? This stands for carrier sense multiple access with collision detection. whenever the collisions occur with CSMA and when they are detected, then the transmission is aborted. The CSMA/CD protocol is widely used in LANs in the MAC sublayer. 12. How are collisions detected? They are detected by looking at power or pulse width of the received signal and comparing it with transmitted signal. This watermark does not appear in the registered version - http://www.clicktoconvert.com

18

(iii) NETWORK LAYER AND TRANSPORT LAYER 1. Which is the lowest layer that deals with end to end transmission? Network Layer. 2. Goals of network layer: The services should be independent of the router technology. The transport layer should be shielded from the number, type and topology of the router present. The network addresses made available to the transport layer should use a uniform numbering plan, even across LAN’s and WAN’s. 3. What is meant by Datagram Subnet? If connectionless service is offered in n/w layer, packets are injected into the

subnet individually and routed independently of each other. No advance setup is needed. These packets are frequently called datagram. And the subnet is called Datagram Subnet. 4. What is known as Virtual Circuit? In connection oriented service, a path from the source router to the destination router must be established before any data packet can be sent. This connection is called a Virtual Circuit. 5. What is meant by Non – Adaptive Algorithm? They do not base their routing decisions on measurements or estimates of the current traffic and topology. Routing information is computer in advance off –line and downloaded to the routers when the n/w is booted. It is said to be Static Routing. Ex: Shortest Path routing. 6. What is meant by Adaptive Routing Algorithm? They change their routing decisions to reflect changes in the topology. It is said to be Dynamic Routing. Ex: Distance Vector Routing, Link State Routing. 7. What are the key differences between Distance Vector Routing and Link State Routing? Distance Vector Routing: a. It has knowledge about the whole network. This watermark does not appear in the registered version - http://www.clicktoconvert.com

19 b. It routes only to its neighbors. c. The information is shared at regular intervals. d. The databases are different for each router. Link State Routing: a. It has knowledge about the neighboring nodes. b. It can route to any routers. c. The information is shared where there is a change in the neighbors. d. The database is same for all routers. 8. What is known as congestion? When too many packets are present in the subnet, performance degrades. It is said to be congestion. 9. Difference between Congestion Control and Flow Control. Congestion Control: Congestion control has to do with making sure the subnet is able to carry the offered traffic. It is a global issue, involving the behavior of all the hosts, all the routers that store and forwarding processing within the routers. Flow Control: Flow Control relates to the pt. to pt. traffic b/w a given sender and receiver. It makes sure that a fast sender cannot automatically transmit data faster than the receiver is able to absorb. It gives some direct feedback from receiver to sender. 10. What are all the Congestion Prevention Policies? Transport Layer: Retransmission Policy.

Out – of order caching Policy. Acknowledgement Policy. Flow Control Policy. Time out determination Policy. Network Layer: Packet queuing and service Policy. Packets discard Policy. Routing algorithm. Packet lifetime management. Data link Layer: Retransmission Policy. Out of order caching Policy. Acknowledgement Policy. Flow control Policy. 11. What is meant by Choke Packet? Packet sent by the source to indicate that threshold had been crossed and the source is going to be congested. That is said to be Choke Packet. This watermark does not appear in the registered version - http://www.clicktoconvert.com

20 12. What are Internet Gateways? Computers that inter connect two networks and pass packets from one to the other is said to be Internet Gateway. 13. What is an IP (or) Internet Address? A scheme analogous to physical n/w addressing in which each host on the internet is assigned a 32 bit integer address is called IP Address. 14. What are Multi Homed Hosts? A Conventional computer having two or more physical connection. They are called Multi Homed Hosts. 15. Difference between ARP & RARP. ARP: Address Resolution protocol is used to associate the 32 bit IP address with the 48 bit physical address, used by a host or a router to find the physical address of another host. It performs Dynamic Address Resolution. RARP: Reverse Address Resolution Protocol allows a host to discover is IP addressing when it knows only its physical address. 16. What is all the protocol type in ARP/RARP message format? • ARP Request. • ARP Response. • RARP Request. • RARP response. 17. What is meant by Fragmentation? The small pieces into which a datagram is divided are called fragments and the process of dividing a datagram is known as Fragmentation. 18. What do you know about TTL? Whenever a node injects a datagram into the internet it sets a maximum time that the datagram should survive. Router and hosts that process datagram must

increment the TTL field as time pass. And it returns the datagram from the internet when the time expires. 19. What is meant by Time Stamp? Timestamps give the time and date at which a router handles the datagram, expressed as milliseconds. This watermark does not appear in the registered version - http://www.clicktoconvert.com

21 20. What is known as Direct and Indirect delivery? Direct Delivery, the transmission of a datagram from one m/c across a single physical network directly to another. Indirect Delivery occurs when the destination is not on a directly attached n/w, forcing the sender to pass the datagram to a router for delivery. 21. Why do we need ICMP? A datagram travels from router to router until it reaches one that can deliver the datagram directly into its final destination. If a router cannot route or deliver a datagram or if the router detects an unusual condition that affects its ability to forward the datagram, the router needs to inform the original source to take action to avoid or correct the problem. 22. What is PING? Packet Internet Groper, it’s a blind search. It’s a message generated by hosts which runs IP application without knowing whether the destination is there or not. PING is followed by an IP address. 23. What for the term Subnet Mask is used? Subnet Mask along with the IP address is used to find the subnet id. 24. What is said to be CIDR? Classless Inter Domain Routing, a technique followed in super netting, which treats IP addresses as arbitrary integers and collapses a lock of contiguous class C number to a given site or a network within a site. Hosts & Routers that use super netting should have routing software that understands ranges of addresses. 25. What are the drawbacks of RARP? a. Operates at low level using it requires direct access to the n/w hardware. b. Requires a packet exchange b/w a client m/c and a computer that answers to its request. c. RARP uses a computer’s hardware address to identify the machine. 26. What is said to be Retransmission Policy? In BOOTP data are carried in IP, hence data may be lost, corrupted, and out of order. IP does not provide a checksum for data. To avoid corruption, BOOTP requires that UDP uses checksum and set don’t fragment bit. Hence it uses the technique Time Out and Retransmission. This watermark does not appear in the registered version - http://www.clicktoconvert.com

22 27. Why do we go for Dynamic Host configuration Protocol? If the number of physical computers exceeds the number of available IP host addresses, static assignment incurs excessive overhead. Hence we go for DHCP allows a computer to acquire all the configuration information to obtain an IP address quickly and dynamically. DHCP allows three types of address assignment

Manual Configuration. Automatic Configuration. Dynamic Configuration. 28. What are the DHCP Address Acquisition States? • Initialize: When a client first boots. • Select: After collecting a request message to the server. • Request: After sending a request message to the server. • Bound: When a client typically uses the IP address it has acquired. • Renew: When 50% of the lease period expires, client into the renew layer. • Rebind: when 87.5% of the lease period expires. 29. By what characteristics we say that a system as an autonomous system? 1). An Autonomous System consists of a group of routers exchanging information via a common routing protocol. 2). An Autonomous System is a set of routers and networks managed by a single organization. 3). Except in times of failures an Autonomous System is connected that is, there is a path between any two pair of nodes. 30. What is known as Interior Router Protocol & Exterior Router Protocol? An Interior Router protocol passes routing information between routers with in an autonomous system. Border Gateway Protocol is used to exchange routing information among participating routers in multiple Autonomous Systems. 31. What are the services of Transport Layer Protocol? End – To – End Delivery. Addressing. Reliable Delivery. Flow Control. Multiplexing. This watermark does not appear in the registered version - http://www.clicktoconvert.com

23 32. Differentiate between Transport Layer and Data link Layer. Data Link Layer Transport Layer 1. Explicit addressing of destination is not needed. 1. Explicit addressing of destination is needed. 2. No idea of storing the data packets. 2. The subnet can store packets for some times and then delivered later. 3. Provides services with in single network. 3. Provides services across an inter network made of many networks. 33. What is said to be End – To – End Delivery? Network Layer does not see any relationship between the packets of data. It treats each as an independent entity where as transport layer, packs all the incoming packets properly (Concatenation) and makes sure that the delivery of an entire message from source destination. It is said to be End – To – End Delivery.

34. What is called TSAP? In Transport layer, the communication is not just from one end machine to other but from one end application to other end application. Data generated by an application in one network must be received not just by the other machine but by the correct application in some other network. So we need to define transport addresses to which processes can listen for this particular application. It is said to be Service Access Points or Ports. This kind of addressing is done at the Transport Layer. It is said to be Transport Service Access Points (TSAP). 35. What are the aspects of reliable delivery of Transport Layer? a. Error Control: Data must be delivered to their destination exactly as they originated from the source. Mechanisms for error handling at this layer are called as error detection and retransmission with the error handling. b. Sequence Control: It packs all the incoming packets properly according to the sequence number and then combines them into a single data unit. c. Loss Control: The Transport Layer ensures that all pieces of a transmission at the destination. Sequence numbers allow the receiver’s transport layer protocol to identify any missing segments and requester delivery. d. Duplication Control: Transport layer guarantee that number of pieces of data arrive at the receiving system duplicated. Sequence numbers allow the receiver to identify and discard duplicate segments. 36. What are the commonly used protocols in the Transport Layer? TCP: Transmission Control Protocol 1. It is connection oriented. 2. Designed to provide reliable communication between pairs of processes across a variety of reliable and unreliable networks. This watermark does not appear in the registered version - http://www.clicktoconvert.com

24 UDP: User Datagram Protocol 1. It is connectionless. 2. Designed to service for application level procedures. Hence it is an unreliable service; delivery and duplicate protection are not guaranteed. 37. What is said to be Active and Passive machines? Clients that take service from the service from the server are said to be Active Machines. They advertise their routes to others. Server, that waits for the connection from the other side is said to be Passive Machine. They listen and update their routes based on advertisement, but they do not advertise. 38. Define Hop Count. The number of hops along a path from a given source to the destination refers to the number of routers that datagram encounters along the path. 39. What is a Segment? Why they are exchanged? The unit of transfer between the TCP software on two machines is called a Segment. They are exchanged to i. Establish connections. ii. To transfer data.

iii. To send acknowledgement. iv. To advertise window size. v. To close connections. 40. What is the purpose of Pseudo Header in the TCP/UDP? Pseudo Header allows the receiver to verify that the segment has reached its correct destination, which includes both a host IP address as well as a protocol number. Protocol number for TCP is 6 and UDP is 17. Pseudo header is used at the time of check sum calculation. 41. How many hand shakes are used in TCP to establish and close connection? To establish a connection: 3 way hand shake. To close a connection: 2 way hand shake. 42. What is known as Window Advertisement? Window Advertisement specifies how many octets of data the receiver is prepared to accept. If the receiver’s buffer begins to become full, it sends a smaller window advertisement. Hence the size of the window changes with time, it is also said to be Zero Window Advertisement. If the buffer is full in receiver, then the size of the This watermark does not appear in the registered version - http://www.clicktoconvert.com

25 window is 0, said to be Zero Window Advertisement, The advantage of this advertisement is to provide flow control as well as reliable transfer. 43. What is “Urgent Mode”? It is sometimes important that one end of a connection to send data out of band, without waiting for the program at the other end of the connection to consume octets already in the stream. The TCP allows the sender to specify data as urgent, meaning that the receiving program should be notified of its arrival. It is said to be “Urgent Mode”. When the Urgent Bit is set, the urgent pointer specifies the position in the segment where urgent data ends. 44. What are the three types of addresses allowed by IPV6? UniCast: An identifier for a single interface. A packet sent to a unicast address is delivered to the interface identified by that address. AnyCast: An identifier for a set of interfaces. A packet sent to any cast address is delivered to one of the interfaces identified by that address. MultiCast: An identifier for a set of interfaces. A packet sent to a multicast address is delivered to all interfaces identified by that address. 46. What is a Link State Packet? In the Link State Routing, when a router floods the network with the information about its neighborhood, it is said to be advertising. The packet send during the advertisement is said to be Link State Packet. 47. When a HELLO Protocol is used? It uses time instead of distance to determine optimal routing. It is an alternative to Routing Information Protocol (RIP). 48. What is a Sliding Window? A sliding window is used to make data transmission more efficient as well as to control the flow of data so that the receiver does not become overwhelmed. Sliding Window used at the transport layer are usually byte oriented rather than frame oriented. This watermark does not appear in the registered version - http://www.clicktoconvert.com

26

(iv) APPLICATION LAYER & NETWORK SECURITY 1. What is DNS? DNS : Domain Name System 2. When many computers were connected together often host name conflicts were used to occur. To solve this problem domain name system was invented. 3. What is Resolver? It is used to map the name of a system to an IP address. It is a library procedure. 4. What is domain? Domain contains many hosts. Domains can be sub divided in to sub- domains. 5. Explain some domain names and their usage ? Com - commercial Edu - educational institutions. Gov - Us federal goverment Mil - armed forces Int – international organizations. 6. What are Resource records? Every domain have resource records associated with it. One example for such an resource record is the IP address. Name servers: A single name server can contain the entire DNS data base. 7. What is Recursive query? When one server donesnt have the required information it searches for other servers and gets back the needed information. 8. What are sub systems of Emails? This watermark does not appear in the registered version - http://www.clicktoconvert.com

27 It contains mainly two sub systems 1) user agents : which allows people to read and send email 2) message transfer agents: which move the messages from the source to the destination. 9. What is MIME ? Multi pur pose Internet mail extension. Explain some MIME types? 10. Some of the mime types are, Text Image Audio Video Application Message Multipart.etc

11. What is Message transfer? Simple Mail Transfer Protocol is used for the mail transfer. It is ASCII based. 12. What is POP? Post office protocol It is used to contact the message transfer agent and allow the email to be copied form the ISP to the user. 13. Explain the actions of POP? 3 actions of pop are, 1> authorization 2> transactions 3> update. 14. What is IMAP? Internet Message access protocol. This watermark does not appear in the registered version - http://www.clicktoconvert.com

28 It doesnot delete the mails from the mail box after download instead it assumes that the mail will exist indefinitely in multiple mail boxes. 15. What are Filters? To filter the incoming mails based on the sender . 16. Explain www. world wide web. It is an architectural frame work for accessing linked documents spread out over millions of machines all over the internet. 17. Details of W3c? World wide web consortium It was formed by CERN and MIT to for further developing the web, standizing the protocols and encourage the inter operability between sites. 18. What is URL? Uniform resource locators. Web pages are named using resource locators. 19. Explain HTTP? Hyper text transfer protocol. It is the webs native protocol . only webs servers use this protocol. 20. What is URN? Uniform resource names. A urn is a taught of generalized URL. 21. Explain Cookies? It is a small text file stored by the server containing the information about that of the client with respect to the server. 22. What is a persistent cookie and non persistent ? When a cookie expires in an certain amount of time specified in the date field then it is said to be persistent cookie. This watermark does not appear in the registered version - http://www.clicktoconvert.com

29 If the field is null then it is non persistent cookie. 23. Explain HTML?

Hypertext mark up language It is an language used to produce web pages that include text, graphics and pointers and other web pages. 24. What is Caching? When one page is viewed often then the some amount of information about the page is stored in the client. This is used for easy access in the subsequent hits. It is called as caching. 25. What is Mirroring? The server replicates its contents to improve the access. This technique is called as mirroring. 26. Explain about Audio compression? Mp3 MPEG audio layer 3 . It is an very popular audio compression technique. 27. What is Cipher? A cipher is a character for character or bit for bit transformation, without regard to the linguistic structure of the message. The general symbol for symbol substitution is called a mono alphabetic substitution. Substitution ciphers preserve the order of the plaintext symbols but disguise them. Transposition ciphers in contrast, re order the letters but do not disguise them. 28. What is One-time pad? Choose an random bit string as key. Then convert the plain text in to bit string. Compute xor between the two strings. The resulting cipher text cannot be broken because in an large amount of cipher text each letter will occurs equally often, as will every diagram and every tri gram and so on. This watermark does not appear in the registered version - http://www.clicktoconvert.com

30 This is called as one – time pad. 29. What is DES? Data encryption standard. Cipher modes. The different types of cipher modes available are, Electronic code book mode Cipher block chaining mode Cipher feed back mode Stream cipher mode Counter mode 30. What is Usage of Digital signature? Using Digital signature The receiver can verify the claimed identity of the sender. The sender cannot later repudiate the contents of the message. The receiver cannot possibly have concocted the message himself. 31. Explain Message digest? This scheme is based on the idea of one way hash function that takes an arbitrarily long piece of plain text and from it computes a fixed length bit string. The hash function is called as message digest. Example

Md5 SHA-1 This watermark does not appear in the registered version - http://www.clicktoconvert.com

31

(v) TCP/IP 1. What is an IP address? An IP address is a unique 32-bit integer address assigned to each host on the internet. Each address is a pair consisting of netted and the hostid. IP addresses stand for connection to a network rather than specifying an individual computer. 2. IP Address: starts with netid (bits) hostid (bits) Class A 0 7 24 Class B 10 14 16 Class C 110 21 8 Class D 1110 Multicast address Class E 11110 Reserved for future use 3. Can a host be part of more than one physical network? Yes. Such hosts are called as Multi-homed hosts. Routers belong to this category. These multi-homed hosts require multiple IP addresses, where each address corresponds to one of the machine's network connections. 4. What are Limited broadcast and directed broadcast addresses? A directed broadcast address uniquely identifies the target network in addition to specifying broadcast on that network. It contains both the valid networkid and the broadcast hostid. It allows a remote system to send a single packet that will be broadcasted on the specified network.. A Limited broadcast address consists of all 1’s (255.255.255.255) and provides a broadcast address for the local network independent of the assigned IP addresses. A host uses this address as a part of start up procedure before it learns its IP address. NOTE: A broadcast address has hostid with all bits set to 1.A network address has all bits of hostid equal to 0. 5. What is address resolution protocol (ARP) and reverse address resolution protocol (RARP)? The address resolution protocol (ARP) allows a host to find the physical address of a target host on the same physical network, given only the target's IP address. This watermark does not appear in the registered version - http://www.clicktoconvert.com

32 ARP request is broadcasted to all the hosts in the network whereas ARP response is sent only to the host who broadcasted the request. Given a physical network address, to map it to an internet address RARP is used. A diskless machine uses RARP to obtain its IP address from a server. The RARP mechanism supplies the target machine's physical hardware address to uniquely identify the processor and broadcasts the RARP request. The RARP servers process the request and send a reply containing the target IP address.

6. What is an IP datagram? The fundamental unit of information passed across any network utilizing Internet protocol. An IP datagram contains source and destination addresses along with data and a number of fields that define such things as the length of the datagram, the header checksum, and flags that indicate whether the datagram can be (or has been) fragmented. 7. What do you mean by tunneling? A method of transmission over internet which works based on differing protocols. In tunneling, a packet based on one protocol is wrapped, or encapsulated, in a second packet based on whatever differing protocol is needed in order for it to travel over an intermediary network. In effect, the second wrapper "insulates" the original packet and creates the illusion of a tunnel through which the wrapped packet travels across the intermediary network. In real-life terms, tunneling is comparable to "encapsulating" a present (the original packet) in a box (the secondary wrapper) for delivery through the postal system. 8. What is a gateway? A gateway is a computer that lies at the intersection of two networks and routes traffic correctly from one network to another, while keeping traffic internal to the two networks separated. 9. What are internet routers? A router is a network device that transmits message packets, routing them over the best route available at the time. Routers are used to connect multiple network segments, including those based on differing architectures and protocols. Routers operate on network layer information and participate in running one or more network layer routing protocols. Routers use routing tables in determining where to forward transmissions. Tables for routers include information about different paths between other routers. 10. What is a subnet? This watermark does not appear in the registered version - http://www.clicktoconvert.com

33 A portion of an IP network that shares a common address component and defined by a subnet mask.. On TCP/IP networks, subnets are defined as all devices whose IP addresses have the same prefix. For example, all devices with IP addresses that start with 100.100.100 would be part of the same subnet. IP networks are divided using a subnet mask. Each subnet has its own unique submitted network ID. Devices on the same subnet have the same subnet mask. 11.What is the significance of TTL field in an IP datagram? TTL-Time to Live. It specifies the time in seconds for which the datagram is allowed to stay in the internet. When a router process the datagram header, the TTL field is decremented by 1.When TTL reaches 0 before the datagram reaches its destination , the datagram is discarded and an error message is sent to the source. 12.What is ICMP? ICMP-Internet Control Message Protocol. ICMP allow routers in an internet to report errors or control messages to other routers or hosts. It is an error reporting mechanism. They require two levels of encapsulation. ICMP depends on IP to move packets around the network on its

behalf. ICMP can be characterized as follows: ICMP uses IP as if ICMP were a higher level protocol. However, ICMP is an integral part of IP and must be implemented by every IP module. ICMP is used to report errors, not to make IP reliable. Datagrams may still be undelivered without any report on their loss. For fragmented datagrams, ICMP messages are only sent about errors with the first fragment. ICMP messages are never sent in response to datagrams with a broadcast or a multicast destination address. ICMP messages are never sent in response to a datagram that does not have a source IP address representing a unique host. 13.What is IP? Internet Protocol (IP) is the central, unifying protocol in the TCP/IP suite. It provides the basic delivery mechanism for packets of data sent between all systems on an internet, regardless of whether the systems are in the same room or on opposite sides of the world. All other protocols in the TCP/IP suite depend on IP to carry out the fundamental function of moving packets across the internet. IP does not guarantee to actually deliver the data to the destination, nor does it guarantee that the data will be delivered undamaged, nor does it guarantee that data packets will be delivered to the destination in the order in which they were This watermark does not appear in the registered version - http://www.clicktoconvert.com

34 sent by the source, nor does it guarantee that only one copy of the data will be delivered to the destination. 14.What is TCP? Transmission Control Protocol (TCP) provides a reliable byte-stream transfer service between two endpoints on an internet. TCP depends on IP to move packets around the network on its behalf. IP is inherently unreliable, so TCP protects against data loss, data corruption, packet reordering and data duplication by adding checksums and sequence numbers to transmitted data and, on the receiving side, sending back packets that acknowledge the receipt of data. Before sending data across the network, TCP establishes a connection with the destination via an exchange of management packets. The connection is destroyed, again via an exchange of management packets, when the application that was using TCP indicates that no more data will be transferred. 15.What is UDP? User Datagram Protocol (UDP) provides an unreliable packet zed data transfer service between endpoints on an internet. UDP depends on IP to move packets around the network on its behalf. UDP does not guarantee to actually deliver the data to the destination, nor does it guarantee that data packets will be delivered to the destination in the order in which they were sent by the source, nor does it guarantee that only one copy of the data will be delivered to the destination. UDP does guarantee data integrity, and it does this by adding a checksum to the data before transmission. 16. What is DHCP?

Dynamic Host Configuration Protocol (DHCP) allows IP addresses to be allocated to hosts on an as-needed basis. DHCP lets a host 'borrow' an IP address from a pool of IP addresses; when the address is no longer needed it is recycled and made available for use by some other host. DHCP also allows a host to retrieve a variety of configuration information at the same time as it acquires an IP address. DCHP depends on UDP to carry packets between the client and server tasks. 17. What is DNS ? The Domain Name System (DNS) provides dynamic on-demand translation between human-readable names and the numeric addresses actually used by IP (like 192.112.170.243). DNS uses both UDP and TCP. It used UDP to carry simple queries and responses but depends on TCP to guarantee the correct and orderly delivery of large amounts of bulk data across the network. This watermark does not appear in the registered version - http://www.clicktoconvert.com

35 18.What is SNMP ? Simple Network Management Protocol (SNMP) provides a means of monitoring and managing systems over a network. SNMP defines a method of sending queries (the GET and GET-NEXT primitives) and commands (the SET primitive) from a management station client to an agent server running on the target system, and collecting responses and unsolicited event notifications (the TRAP primitive).The various things that can be monitored and managed by SNMP are collectively called as the Management Information Base (MIB). SNMP sends traffic through UDP because of its relative simplicity and low overhead. 19.What is Telnet ? Telnet provides a network terminal or "remote login" capability. The Telnet server accepts data from the telnet client and forwards them to the operating system in such a way that the received characters are treated as though they had been typed at a terminal keyboard. Responses generated by the server operating system are passed back to the Telnet client for display. The Telnet protocol provides the ability to negotiate many kinds of terminal-related behavior between the client and server. 20.What are sockets? A socket is an abstraction that represents an endpoint of communication. Most applications that consciously use TCP and UDP do so by creating a socket of the appropriate type and then performing a series of operations on that socket. The operations that can be performed on a socket include control operations (such as associating a port number with the socket, initiating or accepting a connection on the socket, or destroying the socket) data transfer operations (such as writing data through the socket to some other application, or reading data from some other application through the socket) and status operations (such as finding the IP address associated with the socket).The complete set of operations that can be performed on a socket constitutes the Sockets API (Application Programming Interface). 21.What do you mean by source quench? A source quench message is a request for the host to reduce its current state of

datagram transmission. It is an ICMP message reporting congestion to the source. The routers send one source quench message for every datagram that they discard. 22.What is a PING? This watermark does not appear in the registered version - http://www.clicktoconvert.com

36 The ICMP echo request messages are named as ping. A host sends an ICMP echo request message to a specified destination. The machine that receives the echo request formulates an echo reply and returns it to the original sender. This mechanism is used to test whether a destination is reachable and responding. 23.What is a port? What do the end points specify? A port identifies the ultimate destination within a machine. Each port is assigned a small integer value. A port is what an Internet server uses to distinguish between requests for different service uses. An endpoint specifies a pair(host, port)which is the point of connection for a tcp application where host is the IP address of for a host and port is a TCP port on that host. 24. What is the difference between TCP and UDP if they both operate at the Transport layer? Although both TCP and UDP are Transport layer protocols and provide the same basic function, TCP is a connection-oriented protocol, which means a session is established before data is transmitted, and acknowledgments are sent back to the sending computer to verify that the data did arrive and was accurate and complete. UDP is connectionless; no session, or one-to-one connection, is established prior to data transmission. This makes UDP the faster of the two, and TCP the more reliable. 25.What is IPv6? IP Version 6 (IPv6) is the newest version of IP, sometimes called Ping for "IP, Next Generation". IPv6 is fairly well defined but is not yet widely deployed. The main differences between IPv6 and the current widely-deployed version of IP (which is IPv4) are: IPv6 uses larger addresses (128 bits instead of 32 bits in IPv4) and so can support many more devices on the network, and IPv6 includes features like authentication and multicasting that had been bolted on to IPv4 over the years. 26. How do applications coexist over TCP and UDP? Each application running over TCP or UDP distinguishes itself from other applications using the service by reserving and using a 16-bit port number. Destination and source port numbers are placed in the UDP and TCP headers by the originator of the packet before it is given to IP, and the destination port number allows the packet to be delivered to the intended recipient at the destination system. So, a system may have a Telnet server listening for packets on TCP port 23 while an FTP server listens for packets on TCP port 21 and a DNS server listens for packets on port 53. TCP examines the port number in each This watermark does not appear in the registered version - http://www.clicktoconvert.com

37 received frame and uses it to figure out which server gets the data. UDP has its own similar set of port numbers. 27.What is FTP?

FTP stands for File Transfer Protocol. FTP is the best means for moving large files across the Internet. FTP is a client/server protocol that enables a user with an FTP client to log on to a remote machine, navigate the file system of that remote machine, and upload and download files from that machine. There are two basic types of FTP on the Internet: Anonymous FTP and Private FTP. With Anonymous FTP, one logs in as user "anonymous," giving one's email address as a password. With Private FTP, one logs in with the username and password one has established on that particular system. 28.What is the World Wide Web? The World Wide Web, which uses Hyper Text Transfer Protocol (HTTP), is a connectionless client/server protocol that was invented in 1993 by Tim Burners Lee at CERN. Web servers can deliver a wide variety of media files using MIME. Web clients, like Netscape and MS Internet Explorer, make requests (i.e., they send URL's to servers) of Web Servers, to which the Web server responds by delivering the requested file, running the requested script, or generating an appropriate error message 29. What is URL ? A URL is a Uniform Resource Locator. It is a general purpose Internet addressing protocol, used in WWW (HTTP) service. It is of the form: <protocol>:////<moredirectory or file names>. 30.What do you mean by Domain name resolution? The domain name resolution process can be summarized in the following steps: A user program issues a request such as the gethostbyname() system call. The resolver formulates a query to the name serverThe name server checks to see if the answer is in its local authoritative database or cache, and if so, returns it to the client. Otherwise, it will query other available name server(s), starting down from the root of the DNS tree or as high up the tree as possible. The user program will finally be given a corresponding IP address (or hostname, depending on the query) or an error if the query could not be answered. 31.What is an Autonomous system (AS) ? This watermark does not appear in the registered version - http://www.clicktoconvert.com

38 An AS is defined as a logical portion of a larger IP network. An AS is normally comprised of an internetwork within an organization. It is administered by a single management authority. An AS may connect to other autonomous systems managed by the same organization. Alternatively, it may connect to other public or private networks. 32. What are the types of IP routing and IP routing algorithms ? Routing algorithms are used to build and maintain the IP routing table on a evice. There are two primary methods used to build the routing table: • Static routing: Static routing use preprogrammed definitions representing paths through the network. • Dynamic routing: Dynamic routing algorithms allow routers to automatically

discover and maintain awareness of the paths through the network. This automatic discovery can use a number of currently available dynamic routing protocols. The difference between these protocols is the way they discover and calculate new routes to destination networks. They can be classified into three broad categories: - Distance vector protocols - Link state protocols - Hybrid protocols 33. What is Exterior Gateway Protocol (EGP) ? EGP is an exterior gateway protocol , one of the first protocols developed for communication between autonomous systems. EGP assumes the network contains a single backbone and a single path exists between any two autonomous systems. Due to this limitation, the current use of EGP is minimal. In practice, EGP has been replaced by BGP. EGP is based on periodic polling using a hello/I-hear-you message exchange. These are used to monitor neighbor reachability and solicit update responses. The gateway connecting to an AS is permitted to advertise only those destination networks reachable within the local AS. It does not advertise reachability information about its EGP neighbors outside the AS. 34.What is Border Gateway Protocol (BGP) ? The Border Gateway Protocol (BGP) is an exterior gateway protocol. It was originally developed to provide a loop-free method of exchanging routing information between autonomous systems. BGP has since evolved to support aggregation and summarization of routing information. This watermark does not appear in the registered version - http://www.clicktoconvert.com

39 35.What are the BGP packet types ? All BGP packets contain a standard header. The header specifies the BGP packet type. The valid BGP packet types include: • OPEN2: This message type is used to establish a BGP session between two peer nodes. • UPDATE: This message type is used to transfer routing information between BGP peers. • NOTIFICATION: This message is send when an error condition is detected. • KEEPALIVE: This message is used to determine if peers are reachable. 36. What are the criteria for Routing protocol selection ? The protocol chosen for one type of network may not be appropriate for other types of networks. • Scalability to large environments: If support is needed for large, highlyredundant networks, link state or hybrid algorithms should be considered. Distance vector algorithms do not scale into these environments. • Stability during outages: The counting to infinity problems may cause routing loops or other non-optimal routing paths. Link state or hybrid algorithms reduce the potential for these problems. • Speed of convergence: Triggered updates provide the ability to immediately initiate convergence when a failure is detected. One contributing factor to convergence is the time required to detect a failure. In OSPF and EIGRP

networks, a series of hello packets must be missed before convergence begins. In RIP environments, subsequent route advertisements must be missed before convergence in initiated. These detection times increase the time required to restore communication. Others in the list are Ease of implementation, vendor interoperability, metrics etc. 37. What is cryptography? Cryptography is the science of keeping your data and communications secure. To achieve this goal, techniques such as encryption, decryption and authentication are used. The key factor to strong cryptography is the difficulty of reverse engineering. Strong cryptography means that the computational effort needed to retrieve clear text messages without knowing the proper keys makes the retrieval infeasible. 38.Explain : Authentication, integrity, and non-repudiation • Authentication - A method for verifying that the sender of a message is really who he or she claims to be. Any intruder masquerading as someone else is detected by authentication. • Integrity checking - A method for verifying that a message has not been altered along the communication path. Any tampered message sent by an intruder is This watermark does not appear in the registered version - http://www.clicktoconvert.com

40 detected by an integrity check. As a side effect, communication errors are also detected. • Non-repudiation - The possibility to prove that the sender has really sent the message. When algorithms providing non-repudiation are used, the sender is not able to later deny the fact that he or she sent the message in question. 39.What is a firewall? A firewall is a system (or group of systems) that enforces a security policy between a secure internal network and an untrusted network such as the Internet. Firewalls tend to be seen as a protection between the Internet and a private network. But generally speaking, a firewall should be considered as a means to divide the world into two or more networks: one or more secure networks and one or more non-secure networks. 40.Explain HTTP The hypertext transfer protocol is a protocol designed to allow the transfer of ypertext Markup Language (HTML) documents. HTTP is based on requestresponse activity. A client, running an application called a browser, establishes a connection with a server and sends a request to the server in the form of a request method. The server responds with a status line, including the message's protocol version and a success or error code, followed by a message containing server information, entity information and possible body content. An HTTP transaction is divided into four steps: 1. The browser opens a connection. 2. The browser sends a request to the server. 3. The server sends a response to the browser. 4. The connection is closed. This watermark does not appear in the registered version - http://www.clicktoconvert.com

41

Part – II

DATABASE MANAGEMENT SYSTEM This watermark does not appear in the registered version - http://www.clicktoconvert.com

42

Topics Covered:i) Introduction to Database, ERD, Relational Model, Relational Algebra & Calculus, SQL concepts & definitions(Constraints and triggers etc.) Hemasai B ii) Storage, File Structure, Indexing (Hash based & Tree structure) - Geethavahini N iii) Transaction Management (Concurrency & Crash Recovery) -Russia T iv) Database Design & Tunning (Normalization & deNormalization) -Vijith PV v) SQL Refresher – Sandip Roy This watermark does not appear in the registered version - http://www.clicktoconvert.com

43

(i) INTRODUCTION TO DATABASE, ERD, RELATIONAL MODEL, RELATIONAL ALGEBRA & CALCULUS, SQL CONCEPTS & DEFINITIONS 1. What is a Data? Data is a known fact and that have an implicit meaning. 2. What is a Database? Database is a collection of related data. 3. What is DBMS? DBMS - DataBase Management System It is a general purpose software system that facilitates the process of defining, constructing, and manipulating databases for various applications. Defining - involves specifying the datatypes, structures and constraints for the data

to be stored in the database. Constructing - is the process of storing the data itself on some storage medium that is controlled by the DBMS. Manipulating - includes such functions as querying the database to retrieve specific data, updating the database to reflect changes in the miniworld and generating reports from the data. 4. Differentiate between File system and Database Management system? File system DBMS Redundancy of data exists Controlled redundancy exists Data Inconsistency exists. Change in structure of data results in change in related applications Data Independency exists. Physical data independency – Change in physical schema does not affect the application. Logical data independency - Change in logical schema does not affect the application. Contains the data Contains the data and the structure in which it is stored. The structure information is known as meta data. File processing software can access only specific databases DBMS software can access diverse databases by extracting the database definitions from the meta data and then using the definitions. This watermark does not appear in the registered version - http://www.clicktoconvert.com

44 5. What are the characteristics of database? Data independence. Speedy handling of spontaneous request – real time Controlled redundancy Versatility in representing relationships between data items Security protection Real time accessibility 6. What are the advantages of DBMS? Reduction of redundancy Shared data Integrity Security Conflict resolution Data isolation Data independence 7. What are the disadvantages of DBMS?

Centralization of data – sense of ownership lost Inaccurate data – Extensive data integrity/validation Target of security breaches Increases politics/ organizational struggles-conflicts of interest A threat to privacy Improper design leads to problems 8. When not to use a DBMS? It may be more desirable to use regular files in the following circumstances: The database and applications are simple, well defined and not expected to change. There are stringent real-time requirements for some programs that may not be met because of DBMS overhead. Multiple-user access to data is not required. 9. Define the "integrity rules". There are two Integrity rules: Entity Integrity: States that “Primary key cannot have NULL value” Referential Integrity: States that “Foreign Key can be either a NULL value or should be Primary Key value of other relation”. This watermark does not appear in the registered version - http://www.clicktoconvert.com

45 10. What is a data model? What are the different kinds of data models? Data model is a type of abstraction that models data and their relationship. Data model hides storage and implementation details (hence abstraction). The different kinds of data models are Primitive data model – File based Traditional data model – Hierarchical data model Network data model Relational data model Semantic data model - AI based 11. Differentiate the three Traditional data models. Data model Inventor Basic data structure Hierarchical IBM Tree structure Network CODASYL (Conference On DAta SYstem Language) Network/ Graph structure Relational E.F.Codd (IBM labs) Relation/ Table structure 12. List out the categories of End users. There are several categories of end users, Casual end users - They occasionally access the database and use sophisticated query language to specify the request. They are middle or high level managers or other occasional browsers. Naïve or Parametric end users – They make up a sizable portion of the database end users. Their main job function revolves around constantly querying and updating the database, using standard types of queries and updates – called ‘Canned transaction ‘that have been carefully programmed and tested.

Sophisticated end users - They include engineers, scientists, business analyst, and others who thoroughly familiarize themselves with the facilities of the DBMS so as to implement their applications to meet their complex requirement. Stand-alone users – They maintain personal databases by using ready-made program packages that provide easy-to-use or graphics-based interfaces. 13. Explain the Three-Schema Architecture. The goal of the three-schema architecture is to separate the user application and the physical database. In this architecture, schemas can be defined at the following three levels: The Internal level – Has an internal schema, which describes the physical storage structure of the database. The Conceptual level – Has a conceptual schema, which describes the structure of the whole database for a community of users. This watermark does not appear in the registered version - http://www.clicktoconvert.com

46 The External or View level – Includes a number of external schemas or user views. each external schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group. 14. What is Database Tuning? Adjusting a database to improve the performance is referred to as tuning the database. The database administrator or his group is responsible for the tuning and operation of the database, and it is important that he/she should be free to make what changes he/she feels necessary without playing havoc with the application programs. Effective tuning has two requirements. Physical data independence Automatic monitoring of the usage of database so that appropriate adjustments can be made. 15. What is Data Migration? Some data are referred very frequently and others only occasionally. The process of adjusting the storage of data to suit its popularity rating is called Data Migration. In some cases the physical location of data is changed to archive storage. In some cases the reference to them in the indices which are used for addressing them are changed so that hey can be found more quickly. 16. Who are the ‘Actors on the Scene’? Actors on the Scene are the people who use and are interested in the Database. The different actors are Database Administrator (DBA) – The DBA is responsible for authorizing access to the database, for coordinating and monitoring its use and for acquiring software and hardware resources as needed. Database Designer – They are responsible for identifying the data to be store in the database and for choosing appropriate structures to represent and store this data. End users – End users are the people whose job requires access to the database for querying, updating, and generating reports. The database primarily exists for them. 17. Who are the ‘Workers behind the Scene’? People who are associated with the design, development and operation of the DBMS software and system environment but not interested in the database itself are known as

‘Workers behind the Scene’. Some of them are, DBMS system designers and implementers – persons who design and implement the DBMS modules and interfaces as a software package. Tool developers – The persons who design and implement tools – the software packages that facilitates database system design and use, and help improve performance. This watermark does not appear in the registered version - http://www.clicktoconvert.com

47 Operators and maintenance personnel – They are the system administration personnel who are responsible for the actual running and maintenance of the hardware and software environment for the database system. 18. What are Virtual Data and Transparent Data? The Data referred which is not in the main memory at that time of reference is Virtual data. Transparent data appears not to exist but in fact it does exist. (Many of the complex mechanisms used in the data storage and data transmission can be hidden from the programmer so that he does not have to understand or even know about them.) 19. What are the various DBMS languages? The various DBMS languages are: DDL – Used by the DBA and by database designer to define both internal and conceptual schema SDL – Used to specify internal schema (where there is a strict/clear separation between internal and conceptual schema) VDL – Used to specify user views and their mapping to the conceptual schema DML – Used for manipulating the data in the database (insert, delete, update, retrieve) 20. What is Entity-Relationship model? It is a high-level conceptual model used for conceptual design of database applications. 21. How data is represented in ER model? In ER model the data is represented as entities, relationships and attributes. Entity – A “thing” in the real world with an independent existence Attribute – The particular properties that describe entity. Each entity should have a key attribute whose values are unique for an entity Relationship – Implicit relationship among the various entity types 22. Mention and explain the different types of attributes in ER model. The various types of entities are, Simple or Atomic attributes – Attributes that are not further divisible Composite attributes – Attributes that can be divide into smaller parts, which represent more basic attributes with independent meaning. Ex: Address Single valued attributes – Attributes that have a single value for a particular entity Ex: Age of a person This watermark does not appear in the registered version - http://www.clicktoconvert.com

48 Multi-valued attributes – Attributes that can have a set of values for the same entity Ex: Color of car painted in two colors

Derived attribute - Attribute whose value can be derived from another attribute Ex: Age can be derived from date of birth Stored attribute – Attribute from which other attribute values can be derived Ex: Date of birth from which age can be derive Complex attributes – Attributes in which composite and multi-valued attributed are nested Ex: A person having more than one residence and each have more than one phone number 23. Define Entity types and entity sets. Entity types – An entity type defines a collection of entities that have the same attributes Entity sets – The collection of all entities of a particular entity type in the database at any point in time 24. What are value sets or domains of an attribute? An attribute of an entity type is associated with value set (domain of values), which specifies the set of values that may be assigned to that attribute for each individual entity. 25. What is Relationship set and Relationship type? Relationship set – The collection (or set) of similar relationship. Relationship type – Relationship type defines a set of associations or relationship set among a given set of entity types. 26. What is degree of relationship type? It is the number of entity types participating. 27. What are the constraints on the Relationship types? Explain. Relationship types usually have certain constraints that limit the possible combinations of entities that may participate in the corresponding relationship set. Two main types of relationship constraints are: • Cardinality Ratio • Participation Cardinality for Binary Relationships: The cardinality ratio for a binary relationship specifies the number of relationship instances that an entity can participate in. This watermark does not appear in the registered version - http://www.clicktoconvert.com

49 Participation constraints and Existence dependencies: The participation constraint specifies whether the existence of an entity depends on its being related to another entity via the relationship type. 28. What are the types of Participation constraints? Explain. Two types of participation constraints are: • Total participation (or Existence dependency) • Partial participation 29. What are Weak Entity types? Entity types that do not have key attributes of their own are called weak entity types. In contrast, regular entity types that do have a key attribute are called strong entity types. Weak entity types are identified by being related to specific entities from another entity (identifying or owner entity type) in combination with some of their attribute

values. A weak entity type always has a total participation constraints with respect to its identifying relationship, because a weak entity cannot be identified without an owner entity. 30. What is a key? An entity type usually has an attribute whose values are distinct for each individual entity in the collection. Such an attribute is called a key attribute and its values can be used to identify each entity uniquely. 31. What are different types of keys? Explain. (1) Candidate key - If a relation schema has more than one key, each is a candidate key. (2) Primary key - One of candidate keys is arbitrarily designated to be the primary key. (3) Superkey – A superkey of a relation schema R = { A1,A2,….An } is a set of attributes S ⊆R with the property that no two tuples t1 ant t2 in any legal relation state r of R will have t1[S] = t2[S]. A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey anymore. (4) Foreign key – A foreign key specifies a referential integrity constraints between the two relation schema R1 and R2. 32. What is a Discreminent? A group of attributes in a weak entity which helps to identify it is called a Discreminent. 33. What is a relational model? This watermark does not appear in the registered version - http://www.clicktoconvert.com

50 The usage of pointers and high complexity are demerits of hierarchical and network model. The relational model represents the database as a collection of relations. Each relation resembles a table of values or, to some extent, a flat file of records. 34. What are the different aspects of a relation? Attribute: Columns in a table are called attributes. Tuple: Rows are called tuples. Degree: The number of attributes is called as degree of the relation or arity of the relation. Cardinality: The number of rows is called cardinality of relation. Domain: Set of all values that an attribute can have. 35. What is Domain contraint? Domain contraint specify that the value of each attribute A must be an atomic value from the domain of dom(A ). 36. What is a Relation schema? A Relation schema R, denoted by R(A1, A2, A3..,An), is made up of a relation name R and a list of attributes A1, A2,..,An. 37. What are the constraints of a relational model? The two constraints of a relational model are unique identification and nonredundancy. Unique Identification: Each tuple of the relation should be uniquely identifiable by means of a special attribute known as key(s). No two rows can have the same key. Non-Redundancy:

No attribute that a part of a key can be removed without destroying property of unique identification. The key should be minimal. 38. Explain Integrity contraint. Entity Integrity contraint states that no primary key value can be null. Referential Integrity contraint states that a tuple in one relation that refers to another relation must refer to an existing tuple in that relation. 39. What is Relational Algebra? It is procedural query language. It consists of a set of operations that take one or two relations as input and produce a new relation. This watermark does not appear in the registered version - http://www.clicktoconvert.com

51 40. What are the basic operations of relational algebra? Union A binary operator for which arity of both the relations must be same. It gives a relation of the same arity with set of all tuples in both the operand relations. Ex: RUS Difference A binary operator for which arity of both the relations must be same. Ex: R-S. It gives a relation of the same arity that contains all tuples in R but not in S. Cartesian product A binary operator. Ex: RXS. It is a typical cross product o two relations R, S. Thus, if arity( R ) = K1, arity ( S ) = K2, then arity ( RXS ) = K1+K2. Projection The project operation selects certain columns from the table and discards the other columns. Selection The select operator is used to select a subset of the tuples from a relation that satisfy a selection condition. 41. What are the advanced level operations in Relational algebra? Explain. The advanced level operations are : Intersection A binary operator for which arity of both the relations must be same. It gives a relation of the same arity with set of tuples which are comman in both the operand relations. Ex: RnS Quotient A binary operator. Let R be of arity r and S be of arity s where r>s (pre requisite). Then R÷S is a set of all (r-s) tuples t such that for all s tuples u in S, the tuple tu is in R. Join A join operation is used to combine related tuples from two relations into single tuple. Theta join – A join operation with general join condition. [ ó (R X S) ] ( ó – Condition) Equi-join – A join operation with only comparison operators in condition. Natural join – A join operation based on a common column that have same values in both the relations. [ ó (R X S)] This watermark does not appear in the registered version - http://www.clicktoconvert.com

52

42.What is relational calculus? Relational calculus is a formal query language where we write one declarative expression to specify a retrieval request and hence there is no description of how to evaluate a query; a calculus expression specifies what is to be retrieved rather than how to retrieve it, whereas in relational algebra we write a sequence of operations to specify a retrieval request. Relational calculus is considered to be non-procedural language whereas relational algebra is a procedural language. 43. What are the two types of Relational Calculus? Explain. Tuple Relational Calculus - A Query framed in TRC has the following syntax. { t/ p(t) } i.e, the formula retrieves the set of tuples formula p(t) is satisfied. Domain Relational Calculus - DMC has the following syntax { <x1,x2,x3….,xn> | p(x1,x2,x3….xu) } 44. What is Outer join? Explain. A varient of join that keeps all pieces of information from the operands. It “pads with nulls” the tuples that have no counterpart Three pads: left : only tuples of the first operand are padded right : only tuples of the second operand are padded full : tuples of both operands are padded 45. How does Tuple-oriented relational calculus differ from domain-oriented relational calculus? The tuple-oriented calculus uses a tuple variables i.e., variable whose only permitted values are tuples of that relation. Ex: QUEL The domain-oriented calculus has domain variables i.e., variables that range over the underlying domains instead of over relation. Ex: ILL, DEDUCE. 46. Explain existential and universal quantifiers. The ∃ quantifier is called an existential quantifier because a formula (∃ t)(F) is true if “ there exists “ tuple that makes F true. Ex: { t / ∃ s (s ∈ BORROW ∧ s[AMOUNT] > 75000) ∧ t[ CUST_BNAME] = s[CUST_NAME] } The universal quantifier, (∀t)(F) is true if every possible tuple that can be assigned to free occurrences of t in F is substituted for t and F is true for every such substitution. It is called universal quantifier because every tuple in “the universe of “ tuples must make F true to make the quantified formula true. 47. Transforming the universal and existential quantifiers. This watermark does not appear in the registered version - http://www.clicktoconvert.com

53 It is possible to transform a universal quantifier into an existential quantifier and vice-versa and to get an equivalent expression. (∀x) (P(x) ) ≡ not ( ∃ x) ( not (P(x))) (∃ x) ( P(x)) ≡ not (∀x) (not (P(x))) (∀x) ( P(x) and Q(x)) ≡ not ( ∃ x) ( not (P(x)) or not (Q(x)) ) (∀x) ( P(x) or Q(x)) ≡ not ( ∃ x) ( not (P(x)) and not (Q(x)) ) ( ∃ x ) ( P(x) or Q(x)) ≡ not (∀x) ( not (P(x)) and not (Q(x)) )

(∃ x ) ( P(x) and Q(x)) ≡ not (∀x) ( not (P(x)) or not (Q(x)) ) 48. What is a safe expression? An expression is said to be safe only when it guarantees to yield a finite number of tuples as its result. Otherwise, it is said to be unsafe. All expressions used in tuple relational calculus need to be safe expressions. Ex: { t / ¬(EMPLOYEE(t))} – infinite or unsafe expression 49. How domain relational calculus is different from tuple relational calculus? The domain calculus differs from the tuple calculus in the type of variables used in formulae: rather than having variables range over tuples, the variables range over single values from domains of attributes. 50. What is oracle data dictionary? The oracle data dictionary is read-only set of tables that keeps the metadata- i.e , the schema description for a database. 51. What is a schema? The term schema refers to a collection of data definition objects. Schema objects are the individual objects that describe tables, views etc. There is a distinction between the logical schema objects and the physical storage components called tablespaces. 52. What are Dynamic Performance tables? Oracle constantly monitors database activity and records in tables called Dynamic Performance tables. 53. Explain SQL as a query language. • SQL expresses query in a declarative way - queries specify the properties of the result, not the way to obtain it. This watermark does not appear in the registered version - http://www.clicktoconvert.com

54 • Queries are translated by the query optimizer into the procedural language internal to the DBMS • The programmer should focus on reliability, not on effiency. [ include SQL basics here ] 54. What are assertions? Assertions are used to specify more general constraints. Each assertion is a given a constrained name and is specified via a condition similar to where clause in an SQL query. Ex: CREATE ASSERTIONS SALARY_CONSTRAINT CHECK ( NOT EXISTS (SELECT * FROM EMPLOYEE E, EMPLOYEE M, DEPARTMENT D WHERE E.SALARY>M.SALARY AND E.DNO=D.DNUMBER AND D.MGRSSN=M.SSN)); 55. How access control to resources is given to other users? The owner of the resource assigns previleges to other users. A previlege is chraterized by: • the resource • the user who grants the previlege • the user who receives the previege • the action that is allowed on the resource

• whether or not the previlege can be passed to other users 56. What are the various types of previleges? Insert – to insert a new object into the resource Update – to modify the resource contents Delete – to remove an object from the resource Select – to access a resource contents in a query References – to build a referential integrity constraint with the resource (may limt the ability to modify the resource) Usage – to use the resource in a schema definition 57. How to grant and revoke a previlege? To grant a previlege to a user: grant <previleges | all previleges> on resource to users [wit grant option] - grant option specifies whether the previlege of propagating the previlege to other users must be granted. This watermark does not appear in the registered version - http://www.clicktoconvert.com

55 Ex: grant select on DEPARTMENT to stefano To take away previleges: revoke previleges on resources from users [restrict | cascade] 58. Explain Embedded SQL. A. Traditional applications often need to “embed” SQL statements inside the instructions of a procedural programming language ( C, COBOL, etc.) B. Programs with embedded SQL use a precompiler to manage SQL statements C. Embedded statements are preceded by ‘$’ or ‘EXEC SQL’ D. Program variables may be used as parameters in SQL statements (preceded by ‘:’) E. select preducing a single row and update commands may be embedded easily F. The SQL environment offers a predefined variable ‘sqlcode’ which describes the status of the execution of the SQL statements ( zero if the SQL statement executed successfully) Ex: void DisplayDepartmentSalaries(char DeptName[]) { char FirstName[20], Surname[20]; long int Salary; $ declare DeptEmp cursor for select FirstName, Surname, Salary from Employee where Dept = :DeptName; $ open DeptEmp; $ fetch DeptEmp into :FirstName, :Surname, :Salary; printf(“Department %s\n”,DeptName); while (sqlcode == 0) { printf(“Name: %s %s ”,FirstName,Surname); printf(“Salary: %d\n”,Salary);

$ fetch DeptEmp into :FirstName, :Surname, :Salary; } $ close DeptEmp; } 59. Explain Dynamic SQL • When applications do not know at compile-time the SQL statement to execute, they need dynamic SQL • Major problem: managing the transfer of parameters between the program and the SQL environment • For direct execution: This watermark does not appear in the registered version - http://www.clicktoconvert.com

56 execute immediate SQLStatement • For execution preceded by the analysis of the statement: prepare CommandName from SQLStatement – followed by: execute CommandName [ into TargetList ] [ using ParameterList ] 60. What are program units? Explain. The program units are: Functions, Stored procedures and Packages. Stored procedure – It refers to a procedure that is considered to be a part of the data definition and implements some integrity rule or business rule or a policy when it is invoked. Functions return single values. Package – It provides a method of encapsulating and storing related procedures for easier management and control. 61. What is cluster and hash cluster? Cluster – A group of records from one or more tables physically stored in a mixed file. Tables are clustered by the cluster key. Related rows from multiple tables are physically stored together on disk blocks to improve performance. Hash Cluster – Hash clusters also group records; however the cluster key value is hashed first and all rows belonging to this hash value are stored under the same hash bucket address. 62. What are methods in Oracle8 ? A method is a procedure or a function that is part of the definition of a user defined abstract datatype. Methods are written in PL/SQL and stored in the database or written in a language like C and stored externally. Method differ from stored procedure in the following ways: A program invokes a method by referring to an object of its associated type. An Oracle method has complete access to the attributes of its associated object and to the information about its type. 63. What are triggers? A database trigger is a stored procedure ( or rule) that is implicitly executed ( or fired) when the table with which it is associated has an insert, delete or update performed on it. Triggers can be used to enforce additional constraints or to automatically perform additional actions that are required by business rules or policies that go beyond the standard key, entity integrity and referential integrity constraints imposed by the system. This watermark does not appear in the registered version - http://www.clicktoconvert.com

57 64. What is Rollback segment? Each database must contain one or more rollback segments, which are used for “undoing” transactions. A rollback segment records old values of data that are used to provide read consistency to rollback a transaction and for recovering a database. 65. What is PL/SQL? PL/SQL is a block - structured language. That is , the basic units- procedures, functions and anonymous blocks – that make up a PL/SQL program are logical blocks which can contain any number of nested sub-blocks. A block or sub-block groups logically related declarations and statements. The declarations are local to the block and cease to exist when the block completes. The three parts of PL/SQL block are • Declaration part • Executable part • Exception part. 66. What is a cursor in PL/SQL? A cursor is similar to a file variable or file pointer, which points to a single row from the result of a query. Cursors are controlled by three commands – OPEN, FETCH and CLOSE. 67. Explain Stored procedures. • SQL-2 allows for the definition of procedures, also known as stored procedures • Stored procedures are part of the schema procedure AssignCity(:Dep char(20), :City char(20)) update Department set City = :City where Name = :Dep • SQL-2 does not handle the writing of complex procedures • Most systems offer SQL extensions that permit to write complex procedures (e.g., Oracle PL/SQL) Procedure in Oracle PL/SQL Procedure Debit(ClientAccount char(5),Withdrawal integer) is OldAmount integer; NewAmount integer; Threshold integer; begin select Amount, Overdraft into OldAmount, Threshold from BankAccount This watermark does not appear in the registered version - http://www.clicktoconvert.com

58 where AccountNo = ClientAccount for update of Amount; NewAmount := OldAmount - WithDrawal; if NewAmount > Threshold then update BankAccount

set Amount = NewAmount where AccountNo = ClientAccount; else insert into OverDraftExceeded values(ClientAccount,Withdrawal,sysdate); end if; end Debit; This watermark does not appear in the registered version - http://www.clicktoconvert.com

59

(ii) STORAGE, FILE STRUCTURE, INDEXING (HASH BASED & TREE STRUCTURE) 1. Mention & brief categories in storage media. a) Primary storage and b) Secondary storage. a) - can be directly operated on by the CPU. eg., Main memory, Cache memory b) - can't be directly operated on by the CPU hence data has to be first copied onto primary storage. e.g., magnetic disks, optical disks & tapes. 2. What are Dynamic and Static RAM? Dynamic RAM is Main memory whereas Static RAM is Cache memory. 3. What are off-line and on-line data? Off-line data: data stored on tapes, on-line data: data stored in disk. 4. What is a tape juke box? It contains a bank of tapes that are catalogued and can be automatically loaded onto tape drives and are becoming as a Tertiary Storage. For Example: NASA's EOS (Earth Observation Satellite) system stores archived data in this fashion. 5. What is Volatile & Non-volatile storage? Main memory is a Volatile and Secondary memory is a Non-volatile storage. 6. Expand BLOB, SCSI, and RAID. BLOB: Binary Large Object SCSI: Small Computer Storage Interface RAID: Redundant Array of Inexpensive/Independent Disks 7. What is Seek time? “Time taken to position the read/write head on the track containing data". 8. What is Latency time? "The beginning of desired block rotates into position under the read/write head." 9. What is Data Striping? This watermark does not appear in the registered version - http://www.clicktoconvert.com

60 "The concept that makes use of parallelism to improve disk performance". 10. Explain RAID. Its Goal: To make the wide different rates of performance improvements of disks even against those in memory and micro processor. 11. what is an index? An index is an access structure which is used to speed up the retrieval of records in

response to certain search conditions. 12. What is a Hash function? Given a key, the hash function resolves and returns the address of a record. 13. What is an indexing field? An index access structure is usually defined on a single field of a file called an Indexing field. 14. Mention the type of indices? Primary, Secondary, Clustered, Dense, Sparse indices. 15. Explain the type of indices. Primary index: defined on ordering field of an ordered file of records. Secondary index: defined on non-ordering field of an ordered file of records. Clustered index: defined on records having same value for ordering field of an ordered file. Dense index: defined on every search key value in the data file. 16. What is block anchor? The first record in each block of the data file is called the anchor record or Block anchor. 17. What is a multi level index? The multi level index is one which uses a blocking factor to reduce the search on an index and the blocking factor to be larger than 2 and it is referred to as the fan out of the multi level index. 18. What is an Indexed sequential file? This watermark does not appear in the registered version - http://www.clicktoconvert.com

61 It is an ordered file with a multi level primary index on its ordering key field 19. How dynamic multilevel indexing is achieved? B-trees and B+ -trees. 20. What is a search tree? It is slightly different from a multilevel index with order p such that each node contains at most p-1 search values and p pointers. 21. Define B-tree. B-tree when used as an access structure on a key field to search records in a data file can be defined as follows: Each node has almost p tree pointers Each node is of the form ,P2,………….< Kq1,Pprq-1>,Pq> where Pi is a tree pointer and Pri is a data pointer, Ki is the node. Each node, except the root and leaf nodes has at least p/2 tree pointers. All leaf nodes are at the same level. 22. Define B+-tree. B+-tree when used as an access structure on a key field to search records in a data file can be defined as follows: Each node has almost p tree pointers Each internal node of the form where Pi is a tree pointer. Each node, except the root and leaf nodes has at least p/2 tree pointers.

All leaf nodes are at the same level. 23. Why to use B-trees in particular for dynamic multilevel indexing. B-tree is a balanced tree structure in which each node is atleast half full. 24. what is the popular file organization introduced by IBM? ISAM: Indexed Sequential Access Method 25. What is a fully inverted file? A file that has a secondary index on every one of its fields is often called a fully inverted file.

(iii) This watermark does not appear in the registered version - http://www.clicktoconvert.com

62

TRANSACTION MANAGEMENT (CONCURRENCY & CRASH RECOVERY) 1) What is meant by transaction? Collection of operations that for a single logical unit of work are called transactions. In other words, a transaction is a unit of program execution that accesses and possibly updates various data items. 2) Which property of transaction used to ensure integrity of the data? ACID property, where A=Atomicity: either all the operations of the transaction are reflected properly in the db or none. C=Consistency: execution of a transaction in isolation preserves the consistency of the db. I=Isolation: Even though multiple transactions may execute concurrently each transaction is unaware of other transactions executing concurrently in the system. D=Durability: After a transaction completes successfully, the changes it has made to the db persist, even if there are system failures. 3) What is roll back? Any changes that the aborted transaction made to the database will be undone is called roll back. This undo will be done by compensative transaction. 4) Name the five states of transaction. Active Partially committed Failed Aborted Committed During the transaction failure, either we can restart the transaction or kill the transaction. 5) What is shadow copy? In a shadow copy scheme, a transaction that wants to update the db first creates a complete copy of the db. A pointer called db pointer is maintained, it points to the current copy of db. During the transaction, all updates are done on the new db copy, leaving the original copy (shadow copy), untouched. If at any pt of transaction has to be aborted, the system merely deletes the

new copy. This watermark does not appear in the registered version - http://www.clicktoconvert.com

63 If the transaction completes, the db system updates the db-pointer to pt to the new copy of the db, the new copy then becomes the current copy of te db. The old copy of the db is then deleted. 6) What is the roll of recovery management component in db system? Recovery management component provides support for atomicity and durability by a variety of schemes. 7) What is the use of concurrent transaction over serial transaction? Improved throughput and resource utilization Reduced waiting time 8) What is the need of concurrency-control schemes? The db system must control the interaction among the concurrent transactions o prevent them from destroying the consistency of the db. 9) What is recoverable schedule? A recoverable schedule is one where, for each pair of transaction T1 and T2 such that T2 reads a data item previously written by T1 the commit operation of T1 appears before the commit operation of T2. 10) What is cascading rollback? Cascading rollback is a phenomenon, in which a single transaction failure leads to a series of transaction rollbacks. 11) What is a lock in concurrency control? When a transaction needs to update the db, then it can set the lock, and proceed. When the db is locked be one transaction, the other can’t access the db. Once the transaction gets over, the lock will be released and the other transactions can access it. 12) List the lock modes. Shared: If a transaction T has obtained a shared-mode lock on item Q, then T can read but can’t write Q. Exclusive: If a transaction T has obtained an exclusive-mode lock on item T can both read and write Q. 13) What is a locking protocol? Each transaction in the system follows a set of rules called locking protocol, indicating when a transaction may lock and unlock each of the data items. This watermark does not appear in the registered version - http://www.clicktoconvert.com

64 14) List the type of protocols used in concurrency control. Locking protocol Graph-based protocol Timestamp-based protocol Validation-based protocol 15) What is a deadlock state in Transaction? A system is in a deadlock state if there exists a set of transactions such that every transaction in the set is waiting for another transaction in the set. 16) List the two approaches to deadlock prevention in transaction.

One approach ensures that no cyclic waits can occur by ordering the request for locks or requiring all locks to be acquired together. The other approach is closer to deadlock recovery, and performs transaction rollback instead of waiting for a lock, whenever the wait could potentially result in a deadlock. 17) How the deadlock detection and recovery handled in transaction? The transactions are sketched as a wait-for-graph and check for a presence of a cycle. Recovery from Deadlock involves the following steps: Selection of a victim (the transaction which is responsible for deadlock) Rollback 18) List the classification of a db system failure. • Transaction Failure o Logical Error: The transaction can no longer continue with its normal executing because of some internal conditions, such as bad input, data not found, overflow or resource limit exceeded. o System Error: The system has entered an undesirable state like deadlock. • System Crash: o There is a hardware malfunction or a bug in the db software or the os, that causes the lost of the content of volatile storage and brings transaction processing to a halt. • Disk Failure: o A disk block loses its content as a result of either a head crash or failure during a data transfer operation. 19) Which are all the storage types used for db? Based on the relative speed, capacity and resilience to failure I is classified as follows: • Volatile storage • Non volatile storage • Stable storage This watermark does not appear in the registered version - http://www.clicktoconvert.com

65

(iv) DATABASE DESIGN & TUNNING (NORMALIZATION & DENORMALIZATION) 1).What are the Phases involved Database Application System Life Cycle? a. System definition: the scope of database system, its users, and its applications are defined. b. Database design: At the end of this phase, a complete logical and physical design of the database system on the chosen DBMS is ready. c. Database implementation: specify the conceptual, external, and internal database definitions, creating empty database files, and implementing the software applications. d. Loading or data conversion: the database is populated either by loading the data directly or by converting existing files into the database system format. e. Application conversion: any software applications from a previous system are converted to the new system.

f. Testing and validation: the new system is tested and validated. g. Operation: the database system and its applications are put into operation. h. Monitoring and maintenance: the system is constantly monitored and maintained. Growth and expansions can occur in both data content and software applications. 2).What are the Database Design Phases? Six main phases of the database design are: Requirements collection and analysis. Conceptual database design. Choice of a DBMS. Data model mapping (logical database design). Physical database design. Database system implementation and tuning. 3).What is a Conceptual Database Design? This phase of database design involves two parallel activities: 1. Conceptual Schema Design: examines the data requirements resulting from Phase 1 and produces a conceptual database schema. 2. Transaction and Application Design: examines the database applications analyzed in Phase 1 and produces highlevel specifications for these applications. 4).What are the factors considered for discussing the economic and organizational factors that affect the choice of DBMS? a. Software acquisition cost: This is the "up-front" cost of buying the software, including language options, different interfaces such as forms and screens, recovery/backup option, special access methods, and documentation. The correct DBMS for a specific operating system must be selected. This watermark does not appear in the registered version - http://www.clicktoconvert.com

66 b. Maintenance cost: This is the recurring cost of receiving standard maintenance service from the vender and for keeping the DBMS version up to date. c. Hardware acquisition cost: New hardware may be needed, such as additional memory, terminals, disk units, or special DBMS storage. d. Database creation and conversion cost. This is the cost of either creating the database system from scratch or converting an existing system to the new DBMS software. e. Personnel cost: Acquisition of DBMS software for the first time is often accompanied by hiring new database people. f. Training cost. Because DBMSs are often complex systems, personnel must often be trained to use and program the DBMS. g. Operating cost. The cost of continued operation of the database system is typically not worked into an evaluation of alternatives because it is incurred regardless of the DBMS selected. 5).What is Logical database design? Logical database design is called the Data Model Mapping. The Mapping can be carried out in two phases. System independent mapping: the mapping does not consider any specific characteristics or special cases that apply to the DBMS implementation of the data model. Tailoring the schemas to a specific DBMS: we may have to adjust the schemas obtained in stage 1 to conform to the specific implementation features of a data model as used in the selected DBMS.

6).What is a Physical database design? Physical database design is the process of choosing specific storage structures and access paths for the database files to achieve good performance for the various database applications. Each DBMS offers a variety of options for file organization and access paths. These usually include various types of indexing, clustering of related records on disk blocks, linking related records via pointers. Once a specific DBMS is chosen, the physical database design process is restricted to choosing the most appropriate structures for the database from among the options offered by that DBMS. 7).What are the Elements of a Good Database Design ? The Elements of a good Database design are • Relation design • Data Model • Processing Rules • Additional System requirements These models contribute to a proper Database Design 8).List the Database Design Process? Create tables and columns from entities and attributes Select primary keys Represent relationships Specify constraints Re-examine normalization criteria This watermark does not appear in the registered version - http://www.clicktoconvert.com

67 9).Define Response time. This is the elapsed time between submitting a database transaction for execution and receiving a response. A major influence on response time is under the control of the DBMS is the database access time for data items referenced by the transaction. Response time is also influenced by factors not under DBMS control, such as system load, operating system scheduling, or communication delays. 10).Define Space Utilization. Space utilization: This is the amount of storage space used by the database files and their access path structures. 11).Define Transaction throughput. This is the average number of transactions that can be processed per minute by the database system; it is a critical parameter of transaction system such as those used for airline reservations or banking. Transaction throughput must be measured under peak conditions on the system. Typically, average and worst-case limits on the preceding parameters are specified as part of the system performance requirements. Analytical or experimental techniques, which can include prototyping and simulation, are used to estimate the average and worst-case values under different physical design decisions, to determine whether they meet the specified performance requirements. 12).What are surrogate Keys? A surrogate key is a unique, DBMS-supplied identifier used as the primary key of a relation. The values of a surrogate key have no meaning to the users and are normally hidden on forms and reports. DBMS does not allow the value of a surrogate key to be changed.

13).Disadvantages of the Surrogate Keys Disadvantages are i. Foreign keys that are based on surrogate keys have no meaning to the users. ii. When data shared among different databases contain the same ID, merging those tables might yield unexpected results. 14).How Relationships using he Surrogate keys are represented? If the parent in an ID-dependent relationship has a surrogate key as its primary key, but the child has a data key, use the parent’s surrogate key as a primary key. A mixture of a surrogate key with a data key does not create the best design as the composite key will have no meaning to the users. Therefore, whenever any parent of an ID-dependent relationship has a surrogate key, the child should have a surrogate key as well. By using This watermark does not appear in the registered version - http://www.clicktoconvert.com

68 surrogate keys in the child table, the relationship type has changed to 1:N non-identifying relationship 15).Name the 3 principles of Relationship representation Preservation of referential integrity constraints, Specification of referential integrity actions, Representation of minimum cardinality. 16).How the Referential Integrity actions are specified? If default referential integrity constraint is too strong, overriding the default referential integrity enforcement could be defined during database design. The policy will be programmed into triggers during implementation. 17).Two referential integrity overrides 1).Cascading updates: automatically change the value of the foreign key in all related child rows to the new value 2).Cascading deletions: automatically delete all related child rows 18).What type of Referential Integrity actions must be declared during database design? When enforcing the minimum cardinality between the child and the parent, a required child can be represented by creating update and delete referential integrity actions on the child and insert referential integrity actions on the parent. Such referential integrity actions must be declared during database design and trigger codes must be written during implementation 19).How ID-Dependent Relationships are represented? To represent ID-dependent relationships, primary key of the parent relation is added to the child relation. The new foreign key attribute becomes part of the child’s composite primary key. Referential integrity actions should be carefully determined – For cascading updates, data values are updated to keep child rows consistent with parent rows – If the entity represents multi-value attributes, cascading deletions are appropriate – Check user requirements when designing more complex situation 20).How is the Ternary Relationships represented? Ternary and higher-order relationships can be treated as combinations of binary relationships. There are three types of binary constraints: MUST, MUST NOT, and MUST COVER This watermark does not appear in the registered version - http://www.clicktoconvert.com

69 i. MUST NOT constraint: the binary relationship indicates combinations that are not allowed to occur in the ternary relationship ii. MUST COVER constraint: the binary relationship indicates all combinations that must appear in the ternary relationship iii. Because none of these constraints can be represented in the relational design, they must be documented as business rules and enforced in application programs or triggers DATABASE TUNING 1).What is Database Tuning? It is the activity of making a database application run faster. – Faster means higher throughput (or response time) – Avoiding transactions that create bottlenecks or avoiding queries that run for hours unnecessarily is a must. – A 5% improvement is significant. 2).What is the need for Database Tuning? Nearly 80 to 85 percent of database performance problems arise from the application database's design or the application's own code. Good transaction throughput requires an application designed from the database up, with performance and scalability in mind. With optimum design and tuning, all applications' performance will soar. 3).What are the three main reasons for Database Tuning? • Troubleshooting: Make managers and users happy given an application and a DBMS •Capacity Sizing: Buy the right DBMS given application requirements • Application Programming: Coding your application for performance 4).What are the principles for a Database Tuning? • Think globally, fix locally – Localizing the problems • Partitioning breaks bottlenecks (temporal and spatial) – ONE part of the system limits the overall performance – Two approaches: • Fix locally • Partitioning the LOAD – eg. Free list, lock contention due to long transactions • Partitioning in space/logical resources/time. This watermark does not appear in the registered version - http://www.clicktoconvert.com

70 5).What is achieved by tuning a database? Specify the Goals. Improvement in database and application performance helps to support the database application. The overall performance of the Database server is improved by tuning its components. Tuning is begun with clear, quantifiable, real-world performance objectives; for example, the objective to process 10,000 orders per day. The tuning that we implement for response time, reading and writing, or other operations should be geared toward your real-world objectives.

Database server is tuned to achieve certain goals, such as reducing response time, minimizing contention, and optimizing the utilization of CPU and memory resources. 6).What are the Goals of Tuning an Oracle 8i Server? • One of the goals of tuning an Oracle8i server is to ensure that SQL statements access the smallest possible number of Oracle blocks needed to complete their work. This speeds up the response time for the statement and reduces the impact on other queries and data manipulation language (DML) statements being executed in the database. • Another goal of tuning is to ensure that frequently used data blocks are cached in memory. Data blocks cached in memory can be readily accessed. The availability of needed data in memory reduces the number of overall physical reads. • An additional goal of tuning is to ensure that users share code, such as frequently used SQL statements, within memory areas. This helps to avoid reparsing and duplicate storage of such SQL statements. • A further goal of tuning the Oracle8i server is to ensure that reading and writing are performed as rapidly as possible. For example, separating different segment types across tablespaces can minimize write contention. Maximizing the number of blocks that are obtained during a read operation minimizes the number of read operations required. • Minimizing or preventing contention for resources is another goal of Oracle8i server tuning. Good application tuning and database design help ensure that users have access to data when required. • Finally, the database should be tuned to optimize the speed in which backup procedures and housekeeping tasks are performed. This protects data and minimizes the impact of those tasks on database performance. 7).What are the steps involved in tuning the Oracle 8i Server? • Six steps are involved in tuning the Oracle8i database. In the first step, you tune the architecture and design of the data model. • The second step involves tuning the database applications. The most common applications in Oracle8i are Decision Support Systems and Online Transaction Processing. • Decision Support Systems (DSS) applications distill large amounts of information into understandable reports. Decision makers in an organization use this data to determine what strategies the organization should follow. This watermark does not appear in the registered version - http://www.clicktoconvert.com

71 • Online Transaction Processing (OLTP) applications are high-throughput, insert/update intensive systems. They involve large volumes of data that grow continuously. • To tune the applications, it is important to design efficient SQL statements. Efficient SQL statements take less time to process. You must also consider the data access and manipulation operations in detail. • The next step in tuning the Oracle8i server is to tune memory structures. Memory tuning involves tuning the System Global Area (SGA) and user process memory. • Input/output (I/O) tuning is the next step in tuning the Oracle8i server. I/O

tuning involves configuring the distribution of data files, segment types, and tablespaces. I/O tuning also involves monitoring the interaction between data files and memory during reads and writes. • The interaction between files can be tuned by separating the most frequently accessed data into different tablespaces and separating tables from their related indexes. You can also separate rollback segments from tables and data dictionary objects from all other objects. • Another step in tuning the Oracle8i server involves reducing contention between actions in the database. Since an Oracle database is normally used by multiple users, tuning is needed to minimize the time that processes have to wait until a resource is available. Resources that can involve contention include blocks, shared pool, locks, and latches. • The final step in tuning is to tune the operating system to handle the current demands of the database server. 8).What is performance tuning? It is the methodology of making small adjustments that affect database performance. By tuning the system, can tailor its performance to best meet every needs. Performance must be built in! Performance tuning cannot be performed optimally after a system is put into production. To achieve performance targets of response time, throughput, and constraints, application analysis, design, and implementation must be tuned in. 9).Who tunes the System performance? Everyone involved with the system has some role in the tuning process. When people communicate and document the systems characteristics, tuning becomes significantly easier and faster. • Business executives must define and then reexamine business rules and procedures to provide a clear and adequate model for application design. They must identify the specific kinds of rules and procedures that can influence the performance of the whole system. • Application designers must design around potential performance bottlenecks. They must communicate the system design so that everyone can understand the flow of data in an application. This watermark does not appear in the registered version - http://www.clicktoconvert.com

72 • Application developers must communicate the implementation strategies they choose so that modules and SQL statements can be quickly and easily identified during statement tuning. • Database administrators (DBAs) must carefully monitor and document system activity so that they can identify and correct unusual system performance. • The hardware and software administrators must document and communicate the configuration of the system so that everyone can design and administer the system effectively. Decisions made in application development and design have the most impact on performance. Once the application is deployed the database administrator usually has the primary responsibility for tuning--within the limitations of the design. 10).Is it necessary for setting the performance goal of the system while tuning? Brief.

Whether we are designing or maintaining a system, we should set specific performance goals so that we know when to tune. We can needlessly spend time tuning our system without significant gain if we attempt to alter initialization parameters or SQL statements without a specific goal. When designing the system, we should set a specific goal: for example, an order entry response time of less than three seconds. If the application does not meet that goal, identify the bottleneck causing the slowdown (for example, I/O contention), determine the cause, and take corrective action. During development, you should test the application to determine if it meets the designed performance goals before deploying the application. 11).How is the performance of the system is evaluated? With clearly defined performance goals, we can readily determine when performance tuning has been successful. Success depends on the functional objectives we have established with the user community, our ability to objectively measure whether or not the criteria are being met, and our ability to take corrective action to overcome any exceptions. The rest of this tuning manual describes the tuning methodology in detail, with information about diagnostic tools and the types of corrective actions we can take. DBAs who are responsible for solving performance problems must keep a wide view of the all the factors that together determine response time. The perceived area of performance problems is frequently not the actual source of the problem. Users in the preceding example might conclude that there is a problem with the database, whereas the actual problem is with the network. A DBA must monitor the network, disk, CPU, and so on, to find the actual source of the problem--rather than simply assume that all performance problems stem from the database. Ongoing performance monitoring enables you to maintain a well tuned system. Keeping a history of the application's performance over time enables you to make useful comparisons. With data about actual resource This watermark does not appear in the registered version - http://www.clicktoconvert.com

73 consumption for a range of loads, you can conduct objective scalability studies and from these predict the resource requirements for load volumes you may anticipate in the future. 12).What are the basic methodology involved in tuning the Database Server and Clients? Determine the level of tuning - Component-level tuning or system-level tuning? Do you want to tune the database server as an isolated component or as part of a larger application? Understand the end-user community - Gather metrics regarding the manner in which the database will be accessed. What SQL queries will be executed? What business transactions will be executed? How often are transactions executed? Gather performance requirements - Determining the exit criteria for tuning needs to be established in order to know when sufficient testing has occurred. Automate test scripts - Create automated test scripts that issue the necessary SQL queries, updates and deletes. Generate automated test scripts that emulate the business scenarios. Execute & analyze tests - Run the planned tests and collect metrics, such as response times, transaction volumes, operating system statistics, database server statistics. Application Profilers - Implement ancillary tools to profile transaction characteristics. Determine the network characteristics of a transaction, such as bandwidth utilization and conversational chattiness. Ascertain the CPU utilization on the database server and client,

memory utilization, query compilation and execution times. 13).How is the database server capability and the scalability is increased? As a result of Database Tuning the database server capability and scalability is increased by addressing: a. The use of a small packet size between the client and the server. b. Chatty conversation over high latency network links. c. Large amounts of unused data returned to the client. d. Redundant database queries. additional tuning methods. 14).What are the roles of a System Architect involved in Database Tuning? The system architect communicates information about the Oracle8i database design to the database administrator (DBA) and the application developer. The database design includes the logical structure of the database along with database size information. The database must be designed correctly to ensure that database performance does not deteriorate. It is difficult to tune an Oracle8i database that is based on a poor design. Often, an application designer performs some of the system architect's duties, focusing only on the logical structures and sizing involved with the applications as they are This watermark does not appear in the registered version - http://www.clicktoconvert.com

74 designed. By communicating the database design information, the system architect ensures that everyone using the database understands the flow of data and the processes involved in the execution of applications. 15).How an Application Developer ensures application tuning? An application developer communicates the implementation strategies to the system architect and to the DBA. The implementation strategies include information about the effect of applications on databases and the relationship between diverse application processes. Application tuning begins with the design and development of the application using the implementation strategies, modules, and SQL statements that are part of the application programs. Much of performance tuning is achieved through effective application design and development. 16).What are the jobs of a DBA and a h/w, s/w Administrator? A DBA's tuning responsibility is to configure the database and monitor and document system activity, so that system performance problems can be identified and corrected as quickly as possible. A hardware/software administrator documents and communicates the configuration of the system environment in which the database and application reside. The configuration is communicated to the system architect, the DBA, and the application developer. This helps in designing and administering the system effectively. 17).How Queries and Views can be tuned. If a query runs slower than expected, check if an index needs to be re-built, or if statistics are too old. Sometimes, the DBMS may not be executing the plan you had in mind. Common areas of weakness: –Selections involving null values. –Selections involving arithmetic or string expressions. –Selections involving OR conditions. –Lack of evaluation features like index-only strategies orcertain join methods or poor size estimation

Check the plan that is being used then adjust the choices of indexes or rewrite the Query/Views. 18).Mention the guidelines for Query Tuning. a).Minimize the use of GROUP BY and HAVING This watermark does not appear in the registered version - http://www.clicktoconvert.com

75 b).Consider DBMS use of index when writing arithmetic expressions: E.age=2*D.age will benefit from index on E.age, but might not benefit from index on D.age! c).Minimize the use of DISTINCT: Don’t need it if duplicates are acceptable, or if answer contains a key. Avoid using intermediate relations: SELECT E.dno, AVG(E.sal) FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname=‘Joe’ GROUP BY E.do SELECT * INTO Temp FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname=‘Joe’ and SELECT T.dno, AVG(T.sal) FROM Temp T GROUP BY T.dno Does not materialize the Intermediary relations: d).If there is a dense B+ tree index on , an index-only plan can be used to avoid retrieving Emp tuples in the second query. 19).What are the ways of achieving refined conceptual schema in terms of Tuning? The conceptual schema should be refined by considering performance criteria and workload: – May choose 3NF or lower normal form over BCNF. – May choose among alternative decompositions into BCNF (or 3NF) based upon the workload. – May denormalize, or undo some decompositions. – May decompose a BCNF relation further! – May choose a horizontal decomposition of a relation. – Importance of dependency-preservation based upon the dependency to be preserved, and the cost of the IC check. Can add a relation to ensure dep-preservation (for 3NF, not BCNF!); or else, can check dependency using a join. 20).What is Functional Dependency? A Functional dependency is denoted by X -> Y between two sets of attributes X and Y that are subsets of R specifies a constraint on the possible tuple that can form a relation state r of R. The constraint is for any two tuples t1 and t2 in r if t1[X] = t2[X] then they have t1[Y] = t2[Y]. This means the value of X component of a tuple uniquely determines the value of component Y. 21).What is Multivalued Dependency? This watermark does not appear in the registered version - http://www.clicktoconvert.com

76 Multivalued dependency denoted by X -> Y specified on relation schema R, where X and Y are both subsets of R, specifies the following constraint on any relation r of R: if two tuples t1 and t2 exist in r such that t1[X] = t2[X] then t3 and t4 should also exist in r with

the following properties: t3[X] = t4[X] = t1[X] = t2[X] t3[Y] = t1[Y] and t4[Y] = t2[Y] t3[Z] = t2[Z] and t4[Z] = t1[Z] where [Z = (R-(X U Y)) ] 22).What is Normalization? Normalization is a refined process of removing anomalies from the table. With Normalization redundancy is minimized, insertion, deletion and updation anomalies are also minimized. 23).What are the Three Normal Forms? 1NF: The domain of attribute must include only atomic (simple, indivisible) values. 2NF: A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is fully functionally dependent on primary key. 3NF: A relation schema R is in 3NF if it is in 2NF and for every FD X A either of the following is true, X is a Super-key of R. A is a prime attribute of R. In other words, if every non prime attribute is non-transitively dependent on primary key. 24).What is BCNF? A relation schema R is in BCNF if it is in 3NF and satisfies an additional constraint that for every FD X -> A, X must be a candidate key. 25).What is 4NF? A relation schema R is said to be in 4NF if for every Multivalued dependency X -> Y that holds over R, one of following is true X is subset or equal to (or) XY = R. X is a super key. 26).What is 5NF? A Relation schema R is said to be 5NF if for every join dependency {R1, R2, ..., Rn} that holds R, one of the following is true: This watermark does not appear in the registered version - http://www.clicktoconvert.com

77 Ri = R for some i. The join dependency is implied by the set of FD, over R in which the left side is key of R. 27).What is Domain-Key Normal Form? A relation is said to be in DKNF if all constraints and dependencies that should hold on the the constraint can be enforced by simply enforcing the domain constraint and key constraint on the relation. 28).When do we settle for 3NF against BCNF? If the Decomposition is performed but there is no dependency-preserving we go for settling it with 3NF. For example: A Query may require a join on the decomposed schema, but it can be answered by a scan of the original relation got by the 3NF. 29).What is Horizontal Decomposition? Relation is replaced by a collection of relations that are projections. In most important case, sometimes, might want to replace relation by acollection of relations that are

selections. –Each new relation has same schema as the original, but a subset of the rows. –Collectively, new relations contain all rows of the original. Typically, the new relations are disjoint. 30).How will you Denormalize Contracts (Cid, Sid, Jid, Did, Pid, Qty, Val) into BCNF. There are 2 ways to decompose CSJDPQV into BCNF: –SDP and CSJDQV; lossless-join but not dependent-preserving. –SDP, CSJDQV and CJP; dependent-preserving as well This watermark does not appear in the registered version - http://www.clicktoconvert.com

78

(v) SQL REFRESHER 1. What is a concatenation operator? ‘||’ eg: select lastname || ‘works in ‘ || compname from employees. It’s same as concat. 2. What is the difference between SQL and SQL plus? Sql – language, has functions Sql plus – environment, has commands 3. What is a membership operator? IN (set) 4. How to select all names whose pattern in SA_ ? % zero or more character _ single character eg: select name from employees where name like ‘%sa\_%’ escape ‘\’; ‘ \’ -escape sequence 5. How to check for null values? IS NULL Logical operators: AND,OR,NOT Sorting: ORDER BY [DESC} This watermark does not appear in the registered version - http://www.clicktoconvert.com

79 Single row functions: Character Functions: o Case Manipulation: LOWER (‘attr’) UPPER(‘attr’) INITCAP(‘attr’) o Character manipulations CONCAT SUBSTR LENGTH INSTR-instr(column,string,[m],[n])

LPAD/RPAD –lpad/rpad(column,n,string) TRIM-trim(‘H’ from “hello world”) REPLACE – replace(text,search-string,replacement-string) Number functions: o Round(column, n) o Trunk(column, n) o Mod(m, n) 6. What is sysdate - variable/command/function? Sysdate is a function that returns date and time. Date functions: o months_between (date1,date2) o add_months (date,n) o next_day(date,’char’) o last_day(date) o round(date,[format]) o trunc(date,[format]) This watermark does not appear in the registered version - http://www.clicktoconvert.com

80 Conversion functions: o To_char (date,’format_model) Format model: {yyyy,year,month,mon,dy,day,dd} o to_char(number,’format_model’) Format model: {9 L 0 $ . ,} o To_number(char,’format_model’) o To_date(char,’format_model’) General functions: o NVL(expr1,expr2) o NVL2(expr1,expr2,expr3) o NULLIF(expr1,expr2) o Coalesce(expr1,expr2…,exprn) -- return 1st not null in expr list. o NVL(jobid,’No job yet’) Advantage of coalesce over NVL: Can take multiple values. • Case: Eg: case expr when comp_expr1 then return expr1 [when.. when else else – expr] • Decode: Eg: select name,salary decode(trunk(salary, OUTER JOIN: Put (+) on the side that is deficient in information. Eg:

Select tab1.col, tab2.col from table1, table2 where tab1.col=tab2.col(+) SELF JOIN: Eg: select work.last_name ||’works for’|| manager.lastname from employee work,employee manager where work.manager_id=manager.employee_id. NATURAL JOIN: All column with same name. USING clause – to specify some columns. Eg: -- Select deptid,deptname,locid,city from department NATURAL JOIN locations This watermark does not appear in the registered version - http://www.clicktoconvert.com

81 -- Select e.employee_id,e.last_name,d.location_id from employee e JOIN department d USING (dept_id); Multiple row or Group Functions: • AVG • COUNT • MAX • MIN • SUM • STDDEV • VARIANCE Having and Where: Having – having is a clause to restrict groups. Where – clause to restrict in selection of rows and not groups. 7. What are the forms that a transaction can be of? - many DML’s - One DDL - One DCL Rules for naming tables and columns: (pertaining to oracle) - must begin with - can be of length up to 30 chars - allowed characters: a-z,A-Z,0-9,_,$,# Contents of a database object: Table View Sequence Index Synonym The above details can be got by using the query: Select * from user_catalog – display objects specific to user All_catalog – for all Dba_catalog - for dba info U$_catalog - to provide details regarding lock… 8. Difference between date and time stamp data types: Date date type provides us only with the date.

Time stamp data type provides with date as well as time. This watermark does not appear in the registered version - http://www.clicktoconvert.com

82 Some commands: Forms of alter command: Alter table table_name - add - Modify - Drop - Set unused { may be used to set a column unused for a period of time if available in the table but not required currently} - Drop unused { to drop a column marked as unused permanantely} rename table1 to table2 truncate – removes all rows from the table. DDL commands; - create - alter - drop - rename - truncate - comment Constraints: Not null Unique Primary key Foreign key Check For altering constraints: Alter table employee Add constraint Drop constraint Drop primary key cascade Disable emp_pk cascade Enable emp_pk cascade 9. How to find the nth maximum in the given relation? create table stud(rollno number,marks number); select distinct(a.marks) from stud a where &N = (select count(distinct(b.marks))from stud b where b.marks >= a.marks) This watermark does not appear in the registered version - http://www.clicktoconvert.com

83

Part – III

OPERATING SYSTEM This watermark does not appear in the registered version - http://www.clicktoconvert.com

84

Topics Covered:i) Introduction to Operating System, Comparison and File System –Deepti S Babu ii) Process Management –Dinesh K iii) Memory Management –Kamalesh R S iv) Input/Output & Disc Management(Misc) –Vaibhav Dutt Sharma v) Unix FAQs - Kalamadhuri Ch & Aranganathan K P This watermark does not appear in the registered version - http://www.clicktoconvert.com

85

(i) INTRODUCTION TO OPERATING SYSTEM , COMPARISON AND FILE SYSTEM 1. What are the basic functions of an operating system? Operating system controls and coordinates the use of the hardware among the various applications programs for various uses. Operating system acts as resource allocator and manager. Since there are many possibly conflicting requests for resources the operating system must decide which requests are allocated resources to operating the computer system efficiently and fairly. Also operating system is control program which controls the user programs to prevent errors and improper use of the computer. It is especially concerned with the operation and control of I/O devices. 2. What is a file management system? A file management system is a set of system software that provides services to users and applications related to the use of files. 3. What are the objectives of a file management system? • To meet the data management needs and requirements of the user and the ability to perform various operations. • To guarantee, to the extent possible that the data in the files are valid. • To optimize performance • Tp provide I/O support for a variety of types of storage deivces. • To minimize or eliminate the potential for lost or destroyed data. • To provide a standardized set of I/O interface routines. • To provide support for multiple users.

4. Discuss the 3 methods of record blocking. Fixed blocking: Records of fixed length are used. An integral number of records are stored in a block. There may be unused space at the end of each block. Variable-length spanned blocking: Variable length records are used and are packed into blocks with no unused space. Thus, some records may span 2 blocks. Variable-length unspanned blocking: Here the records are of variable length but the records cannot span over a block. Thus, the length of a record cannot be greater than 1 block. This watermark does not appear in the registered version - http://www.clicktoconvert.com

86 5. Differentiate pre-allocation and dynamic allocation. A pre-allocation policy would require that the maximum size of the file be declared at the time of the file creation request. A dynamic allocation policy allocates space to a file in portions as and when required. 6. What is disk inter-leaving? Multiple disks can be used to store a single file by a technique called disk inter-leaving or disk stripping. A group of disks are said inter-leaved if successive portions of a file are stored on different disks. Since a file is stored across many disks, any request for file I/O tends to be uniformly distributed across all the disks, hence yielding a better throughput and improved response time. 7. How is a file system mounted? Just as a file should be opened before it is used, a file system must be mounted before it can be available to the processes on the system. To mount a file system, the name of the device and the location within the file system is given to the operating system. 8. What is a mount point? The location within the file structure at which we attach the file system is called as the mount point. Typically, a mount point is an empty directory at which the mounted file system will be attached. 9. What is Input spooling? Input spooling takes the information from the input device, prepares the job for scheduling, and places an entry in a job queue. Using input spooling, you can usually shorten job run time, increase the number of jobs that can be run sequentially, and improve device throughput. The main elements of input spooling are: Job queue An ordered list of batch jobs submitted to the server for running and from which batch jobs are selected to run. Reader A function that takes jobs from an input device or a database file and places them on a job queue. When a batch job is read from an input source by a reader, the commands in the input stream are stored in the server as requests for the job, the inline data is spooled as inline data files, and an entry for the job is placed on a job queue. The job information remains This watermark does not appear in the registered version - http://www.clicktoconvert.com

87 stored in the server where it was placed by the reader until the job entry is selected from the job queue for processing by a subsystem.

10. What is Output spooling? Output spooling allows the server to produce output on multiple output devices, such as printer and diskette devices, in an efficient manner. It does this by sending the output of a job destined for a printer or diskette to disk storage. This process breaks a potential job limitation imposed by the availability or speed of the output devices. Spooling is especially important in a multiple-user environment where the number of jobs running often exceeds the number of available output devices. Using output spooling, the output can be easily redirected from one device to another. The main elements of output spooling are: Device description A description of the printer or diskette device Spooled file A file containing spooled output records that are to be processed on an output device Output queue An ordered list of spooled files Writer A program that sends files from an output queue to a device Application program A high-level language program that creates a spooled file using a device file with the spooling attribute specified as SPOOL(*YES) Device file A description of the format of the output, and a list of attributes that describe how the server should process the spooled file 11. What is a Job queue? A job queue is an ordered list of jobs waiting to be processed by a particular subsystem. Jobs will not be selected from a job queue by a subsystem unless the subsystem is active and the job queue is not held. You can use job queues to control the order in which jobs are run. A base set of job queues is provided with your server. In addition, you may create additional job queues that you need. 12. Name five major categories of system calls System calls can be roughly grouped into five major categories: This watermark does not appear in the registered version - http://www.clicktoconvert.com

88 • Process & Job Control: with end, abort, load, execute, create/fork or submit, get/set attributes, terminate/exit, wait, dump, trace, etc. system calls which perform controlling over processes, process status, and memory. • File manipulation: with open, create, delete, read, write, reposition, close, and get/set file attributes, etc.. • Device Management: with request, release, read, write, and reposition, which provide additional resources needed by processes. • Information Maintenance: with time, date, version, memory, chmod, get/set process attributes, etc.. • Communication: with get hostid, get processid, open/close connection, accept connection, wait for connection, read message, write message, map memory, etc..

13. What does the API abbreviation stand for? API, standing for Application Program Interfaces, is the interface between a process and the operating system; which provides convenience of interactions between users and an operating system, and is implemented by a bunch of system calls. 14. Explain Priority inversion. The issue of a higher priority thread being blocked from execution because a lower priority thread is holding a resource (e.g., lock) needed to run is described as priority inversion. The dispatcher should address this problem through priority inheritance. 15. Explain the features of some common operating systems. • DOS o Disk Operating System o only really provides a files system – as name implies o virtually no resource protection o no multi-tasking o little fault monitoring o little convenience in input/output o no built-in networking o commercial product • Windows 3.x o added Graphic User Interface (GUI) to DOS o basic operating system is DOS with all its imperfections o improved user convenience o some multitasking o little protection o no built-in networking o commercial product This watermark does not appear in the registered version - http://www.clicktoconvert.com

89 • Windows 95 (98) o different OS from DOS but provides some DOS and Win 3.x compatibility o improved convenience – better GUI o easier addition and management of resources like devices and drivers o integrates some basic networking support built in Internet clients support for Internet protocols – TCP/IP o still basically single user o some crude multi-tasking o little process protection o commercial product • Windows NT o all the advantages of Windows 95 including GUI o different operating system from Windows 95 o fully multi-tasking o more advanced networking than Windows 95 incorporates Internet client/server capabilities supports Internet protocols – TCP/IP

networking

applications based on NETBIOS and the associated NETBUI protocol BIOS – basic input/output system available on PC’s, DOS etc NETBIOS – extends input/output over a network NETBUI NETBIOS Extended User Interface includes NETBIOS plus additions done by IBM and Microsoft o comes as single user for high end PCs multi-user on network servers – limitations – security problems o commercial product • UNIX – LINUX o LINUX is a UNIX clone running on PC and workstations o multi-tasking o multi-user o better user process and resource protection o many arguable advantages over Windows NT o more robust system o better process logging o built in networking – TCP/IP o better device drivers o FREE and down-loadable from Internet o several freely available applications o ideal for research & development environments o disadvantage – less friendly user interface but GUIs under development This watermark does not appear in the registered version - http://www.clicktoconvert.com

90 • we use LINUX extensively for both servers and clients 16. What is a device file? Device files are files that provide access to attached devices such as tapes, diskettes, printers, displays, spools, and other systems that are attached by a communications line. Example- Tape files, which allow access to data files on tape devices. 17. What Is Multics? Multics (Multiplexed Information and Computing Service) is a timesharing operating system begun in 1965 and used until 2000. The system was started as a joint project by MIT's Project MAC, Bell Telephone Laboratories, and General Electric Company's Large Computer Products Division. Prof. Fernando J. Corbató of MIT led the project. Bell Labs withdrew from the development effort in 1969, and in 1970 GE sold its computer business to Honeywell, which offered Multics as a commercial product and sold a few dozen systems. 18. Operating systems can be classified based on the following criteria: GUI - Short for Graphical User Interface a GUI Operating System contains graphics and icons and is commonly navigated using by using a computer mouse Below are some examples of GUI Operating Systems. System7.x

Windows98 Windows CE Multi-user - A multi-user Operating System allows for multiple users to use the same computer at the same time and/or different times. Below are some examples of multi-user Operating Systems. Linux UNIX Windows 2000 Multiprocessing - An Operating System capable of supporting and utilizing more than one computer processor. Below are some examples of multiprocessing Operating Systems. Linux UNIX Windows 2000 This watermark does not appear in the registered version - http://www.clicktoconvert.com

91 Multitasking - An Operating systems that is capable of allowing multiple software processes to be run at the same time. Below are some examples of multitasking Operating Systems. UNIX Windows 2000 Multithreading - Operating systems that allow different parts of a software program to run concurrently. Operating systems that would fall into this category are: Linux UNIX Windows 2000 19. What is the Zeroth Generation of computers? The term zeroth generation is used to refer to the period of development of computing, which predated the commercial production and sale of computer equipment. The period might be dated as extending from the mid-1800s, and Charles Babbage’s Analytical Engine, to the development of the fist commercial computer in 1951. 20. Features of Microsoft Windows 95 Some of the new features that Windows 95 have which Windows 3.x Does not have are: Plug and Play Allows hardware devices to be automatically installed into the computer with the proper software. Does not require jumpers to be played with. 32 Bit 32-Bit operating system allowing the computer to run faster and more efficiently. Registry Combines the power of multiple configuration files into two files allowing the system configurations to be located easier. Memory Windows 95 has an improved memory handling processes compared to Windows 3.11. Right mouse click Allows you new access and text manipulation by utilizing both buttons instead of one. CD-Player Enhanced CD-Player with improved usability and AutoPlay feature. 21. Features of Microsoft Windows 98 This watermark does not appear in the registered version - http://www.clicktoconvert.com

92 Microsoft Windows 98 is the upgrade to Microsoft Windows 95. While this was not as

big as release as Windows 95, Windows 98 has significant updates, fixes and support for new peripherals. The following is a list of some of its new features. Protection - Windows 98 includes additional protection for important files on your computer such as backing up your registry automatically. Improved support - Improved support for new devices such as AGP, Direct X,DVD,USB,MMX. FAT32 - Windows 98 has the capability of converting your drive to FAT32 without loosing any information. Interface - Users of Windows 95 and NT will enjoy the same easy interface. PnP - Improved PnP support, to detect devices even better then Windows 95. Internet Explorer 4.0 - Included Internet Explorer 4.0 Customizable Taskbar - Windows adds many nice new features to the taskbar which 95 and NT do not have. Includes Plus! - Includes features only found in Microsoft Plus! free. Active Desktop - Includes Active Desktop which allows for users to customize their desktop with the look of the Internet. 22. Features of Microsoft Windows Me Windows Millennium also known as Windows ME was introduced September 14, 2000 to the general public as the upgrade for Windows 95 and Windows 98 users and is designed for the end-users. Overall Windows ME has the look and feel of Windows 98 with some additional fixes and features not available in previous operating systems. While Windows ME includes some of the latest fixes and updates and some enticing new features we recommend this update only for users that may find or want some of the new features listed below or for users who are purchasing a new computer with this operating system included on it. This watermark does not appear in the registered version - http://www.clicktoconvert.com

93 NEW FEATURES: Some of the most noticeable new features include: Revert back to backup of computer - Windows Me allows the user to automatically restore an old backup incase files are corrupted or deleted. Protect important system files - Windows Me allows the user to protect important system files and will not allow these files to be modified by any type of other software. Movie editor - Allows users to edit and or combine Microsoft movie files. Importing movies requires additional hardware. Windows Media Player - Includes Media Player 7, which enables users a more advanced way of listening and organizing their media files. 23. Features of Microsoft Windows NT Windows NT 4.0 is the look and feel of Windows 95 however is a completely different Operating System. Windows NT contains advanced security features, advanced network support, Full 32-bit operating system, advanced multitasking, user administration and much more. While NT is a very advanced operating system it does lack the support of drivers, features, and gaming support when compared to Windows 95 / Windows 98 and is why even today Windows NT is still used primarily by businesses and technical users. 24. Features of Microsoft Windows XP Codenamed Whistler, Microsoft Windows XP is short for Windows Experienced and is the convergence of the two major Microsoft operating systems into one.

Windows XP is available in the following versions: HomeEdition Professional Windows XP includes various new features not found in previous versions of Microsoft Windows. Below is a listing of some of these new features. • New interface - a completely new look and ability to change the look. • Updates - new feature that automatically obtains updates fro the Internet. • Internet Explorer 6 - Includes internet explorer 6 and new IM. • Multilingual support - added support for different languages. In addition to the above features, Windows XP does increase reliability when compared to previous versions of Microsoft Windows This watermark does not appear in the registered version - http://www.clicktoconvert.com

94 25. What are the advantages and disadvantages of Monolithic and microkernel? • Advantages and disadvantages – monolithic: » faster switching to kernel functions » simple interfaces between functions i.e. procedure calls » easier access to data structures shared between functions » larger memory resident core » may become too large to maintain easily – micro-kernel » better partitioning - should be easier to implement and maintain » smaller memory resident core » slower switching to system processes » inter-system-process communications may be slower 26. Mention a few differences between Windows and linux. Topic Linux Windows PRICE The majority of Linux variants are available for free or at a much lower price than Microsoft Windows. Microsoft Windows can run between $50.00 - $150.00 US dollars per each license copy. EASE Although the majority Linux variants have improved dramatically in ease of use, Windows is still much easier to use for new computer users. Microsoft has made several advancements and changes that have made it a much

easier to use Operating System, and although arguably it may not be the easiest Operating System, it is still Easier than Linux. RELIABILITY The majority of Linux variants and versions are notoriously reliable and can often run for months and years without needing to be rebooted Although Microsoft Windows has made great improvements in reliability over the last few versions of Windows, it still cannot match the reliability of Linux. SOFTWARE Linux has a large variety of available software programs, utilities, and games. However, Windows has a much larger selection of available software. Because of the large amount of Microsoft Windows users, there is a much larger selection of available software programs, utilities, and games for Windows. SOFTWARE COST Many of the available software programs, utilities, Although Windows does have software programs, This watermark does not appear in the registered version - http://www.clicktoconvert.com

95 and games available on Linux are freeware and/or open source. Even such complex programs such as Gimp, OpenOffice, StarOffice, and wine are available for free or at a low cost. utilities, and games for free, the majority of the programs will cost

anywhere between $20.00 $200.00+ US dollars per copy. HARDWARE Linux companies and hardware manufacturers have made great advancements in hardware support for Linux and today Linux will support most hardware devices. However, many companies still do not offer drivers or support for their hardware in Linux. Because of the amount of Microsoft Windows users and the broader driver support, Windows has a much larger support for hardware devices and a good majority of hardware manufacturers will support their products in Microsoft Windows. SECURITY Linux is and has always been a very secure Operating System. Although it still can be attacked when compared to Windows, it much more secure Although Microsoft has made great improvements over the years with security on their Operating System, their Operating System continues to be the most vulnerable to viruses and other attacks. OPEN SOURCE Many of the Linux variants and many Linux programs are open source and enable users to customize or modify the code however they wish to. Microsoft Windows is not open source and the

majority of Windows programs are not open source. SUPPORT Although it may be more difficult to find users familiar with all Linux variants, there are vast amounts of available online documentation and help, available books, and support available for Linux. Microsoft Windows includes its own help section, has vast amount of available online documentation and help, as well as books on Each of the versions of Windows. 27. Differentiate command line and GUI. Topic Command line GUI Ease Because of the Although new users may This watermark does not appear in the registered version - http://www.clicktoconvert.com

96 memorization and familiarity needed to operate a command line interface new users find it much more difficult to successfully navigate and operate a command line interface. have a difficult at time learning to use the mouse to operate and use a GUI most users pick up this interface much easier when compared to a command line interface Control Users have much more control of their file system and operating system in a command line interface. For example users can easily copy a specific type of file from one location to another with a one-line command. Although a GUI offers

plenty of control of a file system and operating system often advance users or users who need to do specific task may need to resort to a command line to complete that task. Multitasking Although many command line environments are capable of multitasking they do not offer the same ease and ability to view multiple things at once on one screen. GUI users have windows that enable a user to easily view, control, and manipulate multiple things at once and is commonly much faster to do when compared to a command line. Scripting Because command line users only need to use their keyboards to navigate a command line interface and often only need to execute a few lines to perform a task an advanced command line interface user would be able to get something done faster then an advance GUI user. Although A GUI enables a user to create shortcuts, tasks, or other similar actions to complete a task or run a program it doesn't even come close in comparison to what is available through a command line. This watermark does not appear in the registered version - http://www.clicktoconvert.com

97

(ii) PROCESS MANAGEMENT 1. What is a process?

A program under execution that competes for CPU time and other resources. It is an active entity where as program is a passive entity. 2. What is degree of multiprogramming? The number of process running simultaneously and competing for the CPU is known as the degree of multiprogramming. 3. What is a context switch? Switching from one process to another process without changing the state of process that is currently in the running state. 4. What are process states? The state of a process is defined in part by the current activity of that process. New, Running, Ready, Waiting and Terminated. 5. Difference between multiprogramming & multiprocessing. More than one processes running on a processor is former and the later is more than one processes running on more than one processors. 6. Can a multitasking OS also a multiprogramming OS? No, a multiprogramming OS must be a multitasking but not vice versa. 7.How is the concurrent processes related? By two factors 1) Competition 2) Co-operation 8.What does a PCB consist of? It is also called as a task control block. It has Process state Program counter CPU registers CPU scheduling information This watermark does not appear in the registered version - http://www.clicktoconvert.com

98 Memory management information Accounting information I/O status information 9.What is job queue, ready queue, device queue? Processes entering the system are put into job queue. Processes that are ready to run are put into ready queue. List of processes waiting for a particular I/O device is called a device queue. 10. What are types of schedulers? Long term/job scheduler: selects processes from mass storage device, where processes are spooled for later execution and loads into memory for execution. Short term/CPU scheduler: selects from among the processes that are ready to run, and allocates the CPU to one of them. (Frequent) Medium term scheduler: does swapping. 11. What is cascading termination? If a process terminates then all its children must also be terminated. This phenomenon, referred to as cascading termination, is normally initiated by the

operating system. 12. What are benefits of multithreaded programming? Responsiveness, resource sharing, economy, utilization of multiprocessor architecture. 13. What are user level and kernel level threads? User threads are supported above the kernel and are implemented by a thread library at the user level. They are fast to create and manage. If kernel is single threaded then a thread performing a blocking system call will cause the entire process to block. Kernel threads are supported by operating system. 14. What are different multithreading models? Many-to-one: maps many user level threads to one kernel thread. Concurrency is not achieved. One-to-one: maps each user thread to a kernel thread. Has the burden of creating a kernel thread for each user thread. This watermark does not appear in the registered version - http://www.clicktoconvert.com

99 Many-to-many: maps many user threads to a small or equal number of kernel threads. 15. What is thread cancellation? Task of terminating a thread before it has completed. A thread that is to be cancelled is often referred to as the target thread. 16. What are types of cancellation? Asynchronous: one thread immediately terminates the target thread. Deferred: the target thread periodically check if it should terminate. 17. What is a thread pool and what are its benefits? Threads are created during process startup and placed into a pool called thread pool, where they wait for work. i. Faster to service a request ii. Limits no of threads 18. What are two types of scheduling? Preemptive and non-preemptive scheduling. 19. What are functions involved in dispatching? Dispatcher is the module that gives control of the CPU to the process selected by the short-term scheduler. This involves Switching context Switching to user mode Jumping to the proper location in the user program 20. What dispatch latency? The time taken by the dispatcher to stop one process and start another running is known as the dispatch latency. 21. What are scheduling criteria? CPU utilization, Throughput, Turnaround time, Waiting time, Response time

22. Give some scheduling algorithms? This watermark does not appear in the registered version - http://www.clicktoconvert.com

100 FCFS, SJF, priority scheduling, round-robin, multilevel feedback queue. 23. What is convoy effect? All small process waiting for one big process to get off the CPU. It is usually experienced in FCFS scheme. 24. What is time quantum in RR scheduling? The constant time allocated for every process to stay in CPU. 25. Brief about multilevel queue scheduling. It partitions the ready queue into several queues. The processes are permanently assigned to one queue, generally based on some property of the process, such as memory size, process priority, or process type. Each queue has its own scheduling algorithm. 26. What is an idle thread? If no ready thread is found, the dispatcher will execute a special thread called the idle thread. 27. What is a race condition? A situation where several processes access and manipulate the same date concurrently and the outcome of the execution depends on the particular order in which the access takes place, is called a race condition. 28. What is critical section? Each process has a segment of code, in which the process may be changing common variables, updating a table, writing a file, and so on. This section can be executed only by one process at a time. 29. What is mutual exclusion? If one process is in the critical section no other process can enter the critical section. This property is known as mutual exclusion. 30. What is the solution to critical solution problem? Mutual exclusion, Progress, Bounded waiting. This watermark does not appear in the registered version - http://www.clicktoconvert.com

101 31. What is a semaphore? It is a synchronization tool used to solve critical section problem. It is an integer variable and can be accessed only through two atomic operations: wait and signal. 32. What is busy wait? Process checks for CPU for every clock cycle or often when it is waiting to resume its execution. 33. What is deadlock? Two or more processes waiting indefinitely for an event that can be caused only by one of the waiting process leads to dead lock. 34. Counting semaphore Vs Binary semaphore. Counting semaphore: its integer value can range over an unrestricted domain. Binary semaphore: can take values 0 and 1 only. 35. What is a monitor?

High level synchronization construct is the monitor type. It is characterized by a set of programmer-defined operators. The representation of a monitor type consists of declarations of variables whose values define the state of an instance of the type, as well as the bodies of procedures or functions that implement operations on the type. 36. What are the conditions for deadlock? i. Mutual exclusion ii. Hold and wait iii. No preemption iv. Circular wait 37. What is deadlock prevention and deadlock avoidance Deadlock prevention is a set of methods for ensuring that at least one of the necessary conditions cannot hold. Deadlock avoidance on the other hand, requires that the operating system be given in advance additional information concerning which resources a process will request and use during its life time so that avoiding the deadlock well before scheduling the processes. 38. How process termination is done to avoid deadlock? This watermark does not appear in the registered version - http://www.clicktoconvert.com

102 Abort all deadlocked processes Abort one process at a time until the deadlock cycle is eliminated. 39. What are issues to be considered for resource preemption? Selecting a victim process to preempt Rollback the process to a safe state Ensure that starvation does not occur for any process 40. What for the Banker’s algorithm is used? It is a deadlock avoidance algorithm. This watermark does not appear in the registered version - http://www.clicktoconvert.com

103

(iii) MEMORY MANAGEMENT • Selection of memory management system depends mainly on the hardware design of the system. • Input queue: The collection of processes on the disk that is waiting to be brought into memory for execution are placed in the input queue. • Ways of binding instructions and data to memory addresses can be done at - compile time: Deals with absolute code i.e., the compiler code will always start from that location. Eg: MS_DOS .com format programs are absolute code bound at compile time - load time: not known at compile time regarding the location where the process will reside in memory, these codes are relocatable called as relocatable code. Binding is done only during load time.

- execution time : binding is done only during run time. Requires special hardware for this scheme to work.. • Logical Address space: (Virtual Address space with regards to execution time binding) Address generated by the CPU is commonly referred to as logical address. The set of all logical addresses generated by a program is a logical address space. • Physical Address Space: This watermark does not appear in the registered version - http://www.clicktoconvert.com

104 • Addresses seen by the memory unit i.e., the one loaded into the memory address register is called the physical address and the set of all physical addresses corresponding to the logical addresses is a physical address space. • Memory management unit, a hardware device unit is responsible for mapping from virtual to physical addresses.(A relocation register is used during mapping) • What is a stub? And in what way its related to memory management? A stub is a small piece of code that indicates how to locate the appropriate memory-resident library routine, or how to load the library if the routine is not already present. • Overlays: The idea of overlays is to keep in memory only those instructions and data that are needed at any given time. When other instructions are needed, they are loaded into space occupied previously by instructions that are no longer needed. Generally, the memory required by the bigger process is allocated so that the other processes can be overlaid at the same allocated memory. Ex: in a 2 pass assembler the contents used during pass1 may not be required during pass2 so the memory allocated for pass1 can be used to place the instructions of pass2. • Dynamic linking requires support from OS. • Dynamic ling and overlays doesn’t require support from OS. Roll in and Roll out: The task of swapping process in and out of the memory can be termed as roll in and roll out. Ex: A higher priority process may be given service by swapping out (rolling out) a lower priority process under execution. When the higher priority process finishes execution the lower priority process may be swapped back in (rolled in). This watermark does not appear in the registered version - http://www.clicktoconvert.com

105 Work of a dispatcher: - check whether the next process in the queue is in the memory. - If not and if there is no free memory region, the dispatcher swaps out a process currently in memory and swaps in the desired process. - Reload registers as normal and transfer control to the selected process. 1. Where is the swap space actually allocated? Swap space is allocated as a chunk of disk, separate from the file system, so that its use is as fast as possible.

2. When is swapping done? swapping is done only when there are many processes running and were using a threshold amount of memory. The process of swapping will be halted once the load on the system is reduced. In Microsoft windows 3.1 it is the user who decides on whether to swap a process (in or out), rather than the scheduler deciding it. 3. What are the 2 main partitions of memory for? - one for the resident operating system. - one for the user processes. 4. Where is the OS normally placed and why? The operating system is placed normally in the low memory since this is where the interrupt vector is normally located. Interrupt vector is the one that mainly helps in deciding the location of the os. 5. Types of memory protection: - protecting the OS from the user process. - protecting user processes from one another. This watermark does not appear in the registered version - http://www.clicktoconvert.com

106 6. Registers involved in providing memory protection: - Relocation register: contains the value of the smallest physical address. - Limit register: contains the range of logical address. Always the logical address must be less than that of the (upper bound of the) limit register. 7. Transient operating system code: There may be some code loaded in to the memory that may not be required always and which can be allotted for some other processes when not required. So this type of code that may be loaded and removed as needed are called transient operatingsystem code. Eg: operating system may contain code and buffers for device drivers, if a device driver is not commonly used then there is no need to keep the code and data in memory. 8. Memory Allocation: Degree of multiprogramming is bounded by the number of partitions. Partitioning is one of the major techniques used in memory allocation. Discuss the memory partitioning scheme? Fixed size partitioning - equal - unequal • using single queue • using multiple queue Dynamic (variable size partitioning) - first fit - best fit - next fit - worst fit This watermark does not appear in the registered version - http://www.clicktoconvert.com

107

- block list - block header hole: Available block of memory. Memory allocation can also be done using the Buddy System technique. Here division of memory in to two consecutively in the powers of 2 is done during allocation and the parts are recombined during de-allocation if their adjacent node is free. 9. What is 50-percent rule? Statistical analysis of first fit, for instance reveals that, even with some optimization, given N allocated blocks, another 0.5N blocks will be lost due to fragmentation. That is one-third of memory may be unusable. This property is known as 50-percent rule. 10. Fragmentation and its types? Inability to use the available memory is generally termed as fragmentation. Internal Fragmentation: Fragment (part) of Memory that is internal to a partition but is not being used. External Fragmentation: External fragmentation exists when enough total memory space exists to satisfy a request but it is not contiguous; storage is fragmented into a large number of small holes. Solutions: - Compaction: shuffling and placing all free memory together in one large block. - Permitting the logical address space of a process to be non-contiguous. - Paging - Segmentation - Combination of segmentation and paging This watermark does not appear in the registered version - http://www.clicktoconvert.com

108 11. Is compaction always possible? Compaction is only possible if relocation is dynamic, and is done at execution time. 12. What is paging? Paging is a memory management scheme that permits the physical address space of a process to be non-contiguous. Here in paging technique the address generated by the CPU i.e., the logical address is divided into 2 parts - page number : used as index in to page table. - page offset If the size of the logical address space is and the page size is addressing units (bytes or words), then the high-order m-n bits of logical address designate the page number and the n lower-order bits designate the page offset. 13. Fragmentation in paging: No external fragmentation but may have internal fragmentation. User view: user program views that memory as one single contiguous space, containing only that program. System view: the user program is scattered throughout physical memory, which also holds other programs. Contents of the frame table (data structure used in the paging technique): - Allocated frames

- Available frames - Total no of frames This watermark does not appear in the registered version - http://www.clicktoconvert.com

109 An entry for each page is made in the frame table. Page Table Base Register is used to point to the page table. While using PTBR two-memory access is required, this is time consuming so to overcome the extra memory access we go for TLB. What is a TLB? TLB stands for Translation Look-aside Buffer. - it’s a small ,special, fast lookup hardware cache. TLB entries: - A key (tag) - Value 14. What is Wired Down TLB entries ? Those entries that cannot be removed from the TLB are called as wired down entries. 15. What is an ASID? ASID stands for Address-Space Identifiers, which is sometimes stored in each entry of TLB. An ASID uniquely identifies each process and is used to provide address space protection for that process. 16. What is a hit ratio? The percentage of times that a particular page number is found in the TLB is called the hit ratio. 17. What is a Reentrant code or pure code or non-self modifying code? This code never changes during execution, thus 2 or more processes can execute the same code at the same time. 18. What is segmentation? This watermark does not appear in the registered version - http://www.clicktoconvert.com

110 Segmentation is a memory management scheme that supports user view of memory. A logical address space in segmentation is a collection of segments. Each segment has – name – length (offset) (In contrast to paging where the user specified only a single address, which was partitioned by the hardware into a page number and an offset, all invisible to the programmer) 19. Data structure used in segmentation: Segment table. Entries of a segment table are, - Segment base – contains the physical address where the segment resides I memory. - Segment limit – specifies the length of the segment. 20. What is virtual memory? Virtual memory is one that allows the execution of processes that may not be completely in memory.

Advantages: - programs can be larger than the physical memory. - Frees users from the concern of memory storage limitations. - Allows sharing of files and address spaces. - Provides an efficient mechanism for process creation. Disadvantage: - may decrease the performance if used carelessly. 21. Virtual memory is commonly implemented by demand paging. Demand paging: This watermark does not appear in the registered version - http://www.clicktoconvert.com

111 Demand paging is one where a page is loaded in to main memory from the secondary memory only on demand. Lazy swapper: A lazy swapper never swaps a page into memory unless that page will be needed. 22. What is thrashing? Generally high paging activity is called thrashing. A process is thrashing if it spends more time in paging than executing. This watermark does not appear in the registered version - http://www.clicktoconvert.com

112

(iv) INPUT/OUTPUT & DISC MANAGEMENT(MISC) 1) What do you mean by buffer swapping ? Buffer swapping is also know as double buffering and it smoothes out the flow of data between the I/O device and a process. It is an improvement over the single buffering technique and involves assigning two system buffers to the operation. A process now transfers data to(and from ) one buffer while the operating system empties(or fills) the other. 2) What is a circular buffer and when do we use it ? Double buffering is inadequate if the process performs rapid bursts of I/O. In this case the problem can be alleviated by using more than two buffers. When more than two buffers are used , the collection of buffers is itself referred to as circular buffer. 3) What is a device controller ? I/O devices are not directly used to communicate with the computer system. Instead, there is an intermediate electronic device called the device controller. This device controller is really a special purpose computer with a limited instruction set, limited to input/output control. The processor gives it one instruction at a time. However some device controllers do know how to execute programs of more than one instruction. Then they are called as I/O processors (IOPs) or sometimes channels. 4) Which algorithm is called the Elevator algorithm ? The SCAN algo is called the Elevator algo since it behaves like an elevator and goes in one direction until there are no more requests in that direction, and then reverse and go the other direction. 5) Define the following terms : Seek time : The seek time is the time for the disk arm to move the heads to the cylinder containing the desired sector.

Rotational latency : rotational latency is the additional time waiting for the disk to rotate the desired sector to the disk head. Access time : The sum of seek time, if any, and the rotational delay is called as access time. This watermark does not appear in the registered version - http://www.clicktoconvert.com

113 Disk bandwith = total no of bytes transferred -------------------------------( Time of completion of last transfer Time of first req for service ) 6) List the various disk scheduling algorithms : FCFS ( first come,first served ) SSTF ( Shortest-seek-time-first ) SCAN algorithm C-SCAN (Circular Scan algorithm ) LOOK Scheduling C-LOOK Scheduling N-STEP SCAN F-SCAN 7) What is " arm stickiness " and how is avoided ? With SSTF, SCAN and C-SCAN it is possible that the arm may not move for a considerable peroid of time. Arm stickiness occurs when one or more processes have high access rates and monopolise the entire device by repeated request to that track . To avoid this arm stickiness the disk request queue must be segmented. Two examples of this approach is N-step-SCAN and FSCAN. 8) Explain the N-STEP SCAN policy and the F-SCAN policy? The n step scan policy segments the disk request into subqueues of the len L . Sub quesues are processed one at at a time , using the SCAN . While a queue is being processed , new requests must be added to some other queue. If fewer than N requests are avilable at the ned of the scan , then all of them are processed with the next scan.FSCAN policy uses two subqueues. When the scan begins, all new request are in one of the queues , with the other empty. During the scan , all the new requets are put in the other queue. Thus , the serveice of the new request is deferred until all of the old processes have been processed. This watermark does not appear in the registered version - http://www.clicktoconvert.com

114 9) Explain the difference betwen Physical formatting and Logical formatting ? A new disk is simply is just a platter of magnetic material. Before we can store data , it must be divided into sectors that the disk controller can read and write. This process is called low-level formatting or Physical formatting. Logical formatting corressponds to " making the file system " . In this step the operating system stores the initial file system data structures onto the disk. 10) What is "sector sparing" ? Explain ! When the disk is intialised during the low level format at the factory , some spare sectors are set aside and are not visible even by the operating system. When a bad sector is encounterd , the controller replaces each bad sector logically with one of the spare sectors. This scheme is called as sector sparing / Fowarding .

11) What is DRAM? In which form does it store data? DRAM is not the best, but it’s cheap, does the job, and is available almost everywhere you look. DRAM data resides in a cell made of a capacitor and a transistor. The capacitor tends to lose data unless it’s recharged every couple of milliseconds, and this recharging tends to slow down the performance of DRAM compared to speedier RAM types. 12) What is Dispatcher? Dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves: Switching context, Switching to user mode, Jumping to the proper location in the user program to restart that program, dispatch latency – time it takes for the dispatcher to stop one process and start another running. 13) Is disk scheduling, other than FCFS scheduling, useful in a singleuserenvironment? Explain your answer. In a single-user environment, the I/O queue usually is empty.Requests generally arrive from a single process for one block or for a sequence of consecutive blocks. In these cases, FCFS is an economical method of disk scheduling. But LOOK is nearly as easy to program and will give much better performance when multiple processes are performing concurrent I/O, such as when aWeb browser retrieves data in the background while the operating system is paging and another application is active in the foreground. 14) Why is rotational latency usually not considered in disk scheduling? How would you modify SSTF, SCAN, and C-SCAN to include latency optimization? This watermark does not appear in the registered version - http://www.clicktoconvert.com

115 Most disks do not export their rotational position informationto the host. Even if they did, the time for this information to reach the scheduler would be subject to imprecision and the time consumed by the scheduler is variable, so the rotational position information would become incorrect. Further, the disk requests are usually given in terms of logical block numbers, and the mapping between logical blocks and physical locations is very complex. 15) State three advantages of placing functionality in a device controller, rather than in the kernel. State three disadvantages. Three advantages: Easy debugging : Bugs are less likely to cause an operating system crash. Performance Optimization : Performance can be improved by utilizing dedicated hardware and hard-coded algorithms. Thin kernel : The kernel is simplified by moving algorithms out of it. Three disadvantages: Bugs are harder to fix. a new firmware version or new hardware is needed Improving algorithms likewise require a hardware update rather than just a kernel or device-driver update. Embedded algorithms could conflict with application’s use of the device,causing decreased performance. 16) Why might a system use interrupt-driven I/O to manage a single serial port, but polling I/O to manage a front-end processor, such as a terminal concentrator? Polling can be more efficient than interrupt-driven I/O. This is the case when the I/O is

frequent and of short duration. Even though a single serial port will perform I/O relatively infrequently and should thus use interrupts, a collection of serial ports such as those in a terminal concentrator can produce a lot of short I/O operations, and interrupting for each one could create a heavy load on the system. A well-timed polling loop could alleviate that load without wasting many resources through looping with no I/O needed. 17) Distinguish between a STREAMS driver and a STREAMS module. The STREAMS driver controls a physical device that could be involved in a STREAMS operation. The STREAMS module modifies the flow of data between the head (the user interface) and the driver. 18) How does DMA increase system concurrency? How does it complicate hardware design? This watermark does not appear in the registered version - http://www.clicktoconvert.com

116 DMA increases system concurrency by allowing the CPU to perform tasks while the DMA system transfers data via the system and memory buses. Hardware design is complicated because the DMA controller must be integrated into the system, and the system must allow the DMA controller to be a bus master. Cycle stealing may also be necessary to allow the CPU and DMA controller to share use of the memory bus. 19) What problems could occur if a system allowed a file system to be mounted simultaneously at more than one location? There would be multiple paths to the same file, which could confuse users or encourage mistakes (deleting a file with one path deletes the file in all the other paths). 20) Consider a system that supports the strategies of contiguous, linked, and indexed allocation.What criteria should be used in decidingwhich strategy is best utilized for a particular file? • Contiguous—if file is usually accessed sequentially, if file is relatively small. • Linked—if file is large and usually accessed sequentially. • Indexed—if file is large and usually accessed randomly. This watermark does not appear in the registered version - http://www.clicktoconvert.com

117

(v) UNIX FAQs 1) What does fork() do? The fork() function is used to create a new process from an existing process. The new process is called the child process, and the existing process is called the parent. You can tell which is which by checking the return value from fork(). The parent gets the child's pid returned to him, but the child gets 0 returned to him. Thus this simple code illustrate's the basics of it. Inherited by the child from the parent: * process credentials (real/effective/saved UIDs and GIDs) * environment * stack * memory * open file descriptors (note that the underlying file positions are shared between the parent and child, which can be confusing)

* close-on-exec flags * signal handling settings * nice value * scheduler class * process group ID * session ID * current working directory * root directory * file mode creation mask (umask) * resource limits * controlling terminal Unique to the child: * process ID * different parent process ID * Own copy of file descriptors and directory streams. * process, text, data and other memory locks are NOT inherited. * process times, in the tms struct * resource utilizations are set to 0 * pending signals initialized to the empty set * timers created by timer_create not inherited * asynchronous input or output operations not inherited 2) What's the difference between fork() and vfork()? This watermark does not appear in the registered version - http://www.clicktoconvert.com

118 Some systems have a system call vfork(), which was originally designed as a loweroverhead version of fork(). Since fork() involved copying the entire address space of the process, and was therefore quite expensive, the vfork() function was introduced (in 3.0BSD). However, since vfork() was introduced, the implementation of fork() has improved drastically, most notably with the introduction of `copy-on-write', where the copying of the process address space is transparently faked by allowing both processes to refer to the same physical memory until either of them modify it. This largely removes the justification for vfork(); indeed, a large proportion of systems now lack the original functionality of vfork() completely. For compatibility, though, there may still be a vfork() call present, that simply calls fork() without attempting to emulate all of the vfork() semantics. As a result, it is very unwise to actually make use of any of the differences between fork() and vfork(). Indeed, it is probably unwise to use vfork() at all, unless you know exactly why you want to. The basic difference between the two is that when a new process is created with vfork(), the parent process is temporarily suspended, and the child process might borrow the parent's address space. This strange state of affairs continues until the child process either exits, or calls execve(), at which point the parent process continues. This means that the child process of a vfork() must be careful to avoid unexpectedly modifying variables of the parent process. In particular, the child process must not return from the function containing the vfork() call, and it must not call exit() (if it needs to exit,

it should use _exit(); actually, this is also true for the child of a normal fork()). 3) Why use _exit rather than exit in the child branch of a fork? There are a few differences between exit() and _exit() that become significant when fork(), and especially vfork(), is used. The basic difference between exit() and _exit() is that the former performs clean-up related to user-mode constructs in the library, and calls user-supplied cleanup functions, whereas the latter performs only the kernel cleanup for the process. In the child branch of a fork(), it is normally incorrect to use exit(), because that can lead to stdio buffers being flushed twice, and temporary files being unexpectedly removed. In C++ code the situation is worse, because destructors for static objects may be run incorrectly. (There are some unusual cases, like daemons, where the parent should call _exit() rather than the child; the basic rule, applicable in the overwhelming majority of cases, is that exit() should be called only once for each entry into main.) This watermark does not appear in the registered version - http://www.clicktoconvert.com

119 In the child branch of a vfork(), the use of exit() is even more dangerous, since it will affect the state of the parent process. 4) How can I get/set an environment variable from a program? Getting the value of an environment variable is done by using getenv(). #include <stdlib.h> char *getenv(const char *name); Setting the value of an environment variable is done by using putenv(). #include <stdlib.h> int putenv(char *string); The string passed to putenv must not be freed or made invalid, since a pointer to it is kept by putenv(). This means that it must either be a static buffer or allocated off the heap. The string can be freed if the environment variable is redefined or deleted via another call to putenv(). Now suppose you wanted to create a new environment variable called MYVAR, with a value of MYVAL. This is how you'd do it. static char envbuf[256]; sprintf(envbuf,"MYVAR=%s","MYVAL"); if(putenv(envbuf)) { printf("Sorry, putenv() couldn't find the memory for %s\n",envbuf); /* Might exit() or something here if you can't live without it */ } 5) How can I read the whole environment? If you don't know the names of the environment variables, then the getenv() function isn't much use. In this case, you have to dig deeper into how the environment is stored. A global variable, environ, holds a pointer to an array of pointers to environment strings, each string in the form "NAME=value". A NULL pointer is used to mark the end of the array. Here's a trivial program to print the current environment (like printenv): extern char **environ; This watermark does not appear in the registered version - http://www.clicktoconvert.com

120 6) How can I sleep for less than a second?

The sleep() function, which is available on all Unixes, only allows for a duration specified in seconds. If you want finer granularity, then you need to look for alternatives: * Many systems have a function usleep() * You can use select() or poll(), specifying no file descriptors to test; a common technique is to write a usleep() function based on either of these (see the comp.unix.questions FAQ for some examples) * If your system has itimers (most do), you can roll your own usleep() using them (see the BSD sources for usleep() for how to do this) * If you have POSIX realtime, there is a nanosleep() function Of the above, select() is probably the most portable (and strangely, it is often much more efficient than usleep() or an itimer-based method). However, the behaviour may be different if signals are caught while asleep; this may or may not be an issue depending on the application. Whichever route you choose, it is important to realise that you may be constrained by the timer resolution of the system (some systems allow very short time intervals to be specified, others have a resolution of, say, 10ms and will round all timings to that). Also, as for sleep(), the delay you specify is only a minimum value; after the specified period elapses, there will be an indeterminate delay before your process next gets scheduled. 7) How can I get a finer-grained version of alarm()? Modern Unixes tend to implement alarms using the setitimer() function, which has a higher resolution and more options than the simple alarm() function. One should generally assume that alarm() and setitimer(ITIMER_REAL) may be the same underlying timer, and accessing it both ways may cause confusion. Itimers can be used to implement either one-shot or repeating signals; also, there are generally 3 separate timers available: ITIMER_REAL counts real (wall clock) time, and sends the SIGALRM signal ITIMER_VIRTUAL counts process virtual (user CPU) time, and sends the SIGVTALRM signal ITIMER_PROF counts user and system CPU time, and sends the SIGPROF signal; it is intended for interpreters to use for profiling. Itimers, however, are not part of many of the standards, despite having been present since 4.2BSD. The POSIX realtime extensions define some similar, but different, functions. This watermark does not appear in the registered version - http://www.clicktoconvert.com

121 8) How can a parent and child process communicate? A parent and child can communicate through any of the normal inter-process communication schemes (pipes, sockets, message queues, shared memory), but also have some special ways to communicate that take advantage of their relationship as a parent and child. One of the most obvious is that the parent can get the exit status of the child. Since the child inherits file descriptors from its parent, the parent can open both ends of a pipe, fork, then the parent close one end and the child close the other end of the pipe. This is what happens when you call the popen() routine to run another program from within yours, i.e. you can write to the file descriptor returned from popen() and the child process sees it as its stdin, or you can read from the file descriptor and see what the

program wrote to its stdout. (The mode parameter to popen() defines which; if you want to do both, then you can do the plumbing yourself without too much difficulty.) Also, the child process inherits memory segments mmapped anonymously (or by mmapping the special file `/dev/zero') by the parent; these shared memory segments are not accessible from unrelated processes. 9) What is a zombie? When a program forks and the child finishes before the parent, the kernel still keeps some of its information about the child in case the parent might need it -- for example, the parent may need to check the child's exit status. To be able to get this information, the parent calls wait(); when this happens, the kernel can discard the information. In the interval between the child terminating and the parent calling wait(), the child is said to be a `zombie'. (If you do `ps', the child will have a `Z' in its status field to indicate this.) Even though it's not running, it's still taking up an entry in the process table. (It consumes no other resources, but some utilities may show bogus figures for e.g. CPU usage; this is because some parts of the process table entry have been overlaid by accounting info to save space.) This is not good, as the process table has a fixed number of entries and it is possible for the system to run out of them. Even if the system doesn't run out, there is a limit on the number of processes each user can run, which is usually smaller than the system's limit. This is one of the reasons why you should always check if fork() failed, by the way! If the parent terminates without calling wait(), the child is `adopted' by init, which handles the work necessary to cleanup after the child. (This is a special system program with process ID 1 -- it's actually the first program to run after the system boots up). This watermark does not appear in the registered version - http://www.clicktoconvert.com

122 10) How do I prevent them from occuring? You need to ensure that your parent process calls wait() (or waitpid(), wait3(), etc.) for every child process that terminates; or, on some systems, you can instruct the system that you are uninterested in child exit states. Another approach is to fork() twice, and have the immediate child process exit straight away. This causes the grandchild process to be orphaned, so the init process is responsible for cleaning it up. For code to do this, see the function fork2() in the examples section. The other technique is to catch the SIGCHLD signal, and have the signal handler call waitpid() or wait3(). See the examples section for a complete program. 11) How do I get my program to act like a daemon? A daemon process is usually defined as a background process that does not belong to a terminal session. Many system services are performed by daemons; network services, printing etc. Simply invoking a program in the background isn't really adequate for these long-running programs; that does not correctly detach the process from the terminal session that started it. Also, the conventional way of starting daemons is simply to issue the command manually or from an rc script; the daemon is expected to put itself into the background. Here are the steps to become a daemon: 1. fork() so the parent can exit, this returns control to the command line or shell invoking your program. This step is required so that the new process is guaranteed not to be a process group leader. The next step, setsid(), fails if you're a process group leader.

2. setsid() to become a process group and session group leader. Since a controlling terminal is associated with a session, and this new session has not yet acquired a controlling terminal our process now has no controlling terminal, which is a Good Thing for daemons. 3. fork() again so the parent, (the session group leader), can exit. This means that we, as a non-session group leader, can never regain a controlling terminal. 4. chdir("/") to ensure that our process doesn't keep any directory in use. Failure to do this could make it so that an administrator couldn't unmount a filesystem, because it was our current directory. [Equivalently, we could change to any directory containing files important to the daemon's operation.] 5. umask(0) so that we have complete control over the permissions of anything we write. We don't know what umask we may have inherited. [This step is optional] 6. close() fds 0, 1, and 2. This releases the standard in, out, and error we inherited from our parent process. We have no way of knowing where these fds might have been redirected to. Note that many daemons use sysconf() to determine the limit _SC_OPEN_MAX. _SC_OPEN_MAX tells you the maximun open files/process. Then This watermark does not appear in the registered version - http://www.clicktoconvert.com

123 in a loop, the daemon can close all possible file descriptors. You have to decide if you need to do this or not. If you think that there might be file-descriptors open you should close them, since there's a limit on number of concurrent file descriptors. 7. Establish new open descriptors for stdin, stdout and stderr. Even if you don't plan to use them, it is still a good idea to have them open. The precise handling of these is a matter of taste; if you have a logfile, for example, you might wish to open it as stdout or stderr, and open `/dev/null' as stdin; alternatively, you could open `/dev/console' as stderr and/or stdout, and `/dev/null' as stdin, or any other combination that makes sense for your particular daemon. Almost none of this is necessary (or advisable) if your daemon is being started by inetd. In that case, stdin, stdout and stderr are all set up for you to refer to the network connection, and the fork()s and session manipulation should not be done (to avoid confusing inetd). Only the chdir() and umask() steps remain as useful. 12) How can I look at process in the system like ps does? You really don't want to do this. The most portable way, by far, is to do popen(pscmd, "r") and parse the output. (pscmd should be something like `"ps -ef"' on SysV systems; on BSD systems there are many possible display options: choose one.) In the examples section, there are two complete versions of this; one for SunOS 4, which requires root permission to run and uses the `kvm_*' routines to read the information from kernel data structures; and another for SVR4 systems (including SunOS 5), which uses the `/proc' filesystem. It's even easier on systems with an SVR4.2-style `/proc'; just read a psinfo_t structure from the file `/proc/PID/psinfo' for each PID of interest. However, this method, while probably the cleanest, is also perhaps the least well-supported. (On FreeBSD's `/proc', you read a semi-undocumented printable string from `/proc/PID/status'; Linux has something similar.) 13) Given a pid, how can I tell if it's a running program? Use kill() with 0 for the signal number.

There are four possible results from this call: * kill() returns 0 o this implies that a process exists with the given PID, and the system would allow you to send signals to it. It is system-dependent whether the process could be a zombie. * kill() returns @math{-1}, errno == ESRCH o either no process exists with the given PID, or security enhancements are causing the system to deny its existence. (On some systems, the process could be a zombie.) This watermark does not appear in the registered version - http://www.clicktoconvert.com

124 * kill() returns @math{-1}, errno == EPERM o the system would not allow you to kill the specified process. This means that either the process exists (again, it could be a zombie) or draconian security enhancements are present (e.g. your process is not allowed to send signals to anybody). * kill() returns @math{-1}, with some other value of errno o you are in trouble! The most-used technique is to assume that success or failure with EPERM implies that the process exists, and any other error implies that it doesn't. An alternative exists, if you are writing specifically for a system (or all those systems) that provide a `/proc' filesystem: checking for the existence of `/proc/PID' may work. 14) What's the return value of system/pclose/waitpid? The return value of system(), pclose(), or waitpid() doesn't seem to be the exit value of my process... or the exit value is shifted left 8 bits... what's the deal? The man page is right, and so are you! If you read the man page for waitpid() you'll find that the return code for the process is encoded. The value returned by the process is normally in the top 16 bits, and the rest is used for other things. You can't rely on this though, not if you want to be portable, so the suggestion is that you use the macros provided. These are usually documented under wait() or wstat. Macros defined for the purpose (in `<sys/wait.h>') include (stat is the value returned by waitpid()): WIFEXITED(stat) Non zero if child exited normally. WEXITSTATUS(stat) exit code returned by child WIFSIGNALED(stat) Non-zero if child was terminated by a signal WTERMSIG(stat) signal number that terminated child WIFSTOPPED(stat) non-zero if child is stopped WSTOPSIG(stat) number of signal that stopped child WIFCONTINUED(stat) non-zero if status was for continued child WCOREDUMP(stat) If WIFSIGNALED(stat) is non-zero, this is non-zero if the process left behind a core dump. 15) How do I find out about a process' memory usage?

This watermark does not appear in the registered version - http://www.clicktoconvert.com

125 Look at getrusage(), if available. 16) Why do processes never decrease in size? When you free memory back to the heap with free(), on almost all systems that doesn't reduce the memory usage of your program. The memory free()d is still part of the process' address space, and will be used to satisfy future malloc() requests. If you really need to free memory back to the system, look at using mmap() to allocate private anonymous mappings. When these are unmapped, the memory really is released back to the system. Certain implementations of malloc() (e.g. in the GNU C Library) automatically use mmap() where available to perform large allocations; these blocks are then returned to the system on free(). Of course, if your program increases in size when you think it shouldn't, you may have a `memory leak' -- a bug in your program that results in unused memory not being freed. 17) How do I change the name of my program (as seen by `ps')? On BSDish systems, the ps program actually looks into the address space of the running process to find the current argv[], and displays that. That enables a program to change its `name' simply by modifying argv[]. On SysVish systems, the command name and usually the first 80 bytes of the parameters are stored in the process' u-area, and so can't be directly modified. There may be a system call to change this (unlikely), but otherwise the only way is to perform an exec(), or write into kernel memory (dangerous, and only possible if running as root). Some systems (notably Solaris) may have two separate versions of ps, one in `/usr/bin/ps' with SysV behaviour, and one in `/usr/ucb/ps' with BSD behaviour. On these systems, if you change argv[], then the BSD version of ps will reflect the change, and the SysV version won't. Check to see if your system has a function setproctitle(). 18) How can I find a process' executable file? This would be a good candidate for a list of `Frequently Unanswered Questions', because the fact of asking the question usually means that the design of the program is flawed. :-) You can make a `best guess' by looking at the value of argv[0]. If this contains a `/', then it is probably the absolute or relative (to the current directory at program start) path of the executable. If it does not, then you can mimic the shell's search of the PATH variable, looking for the program. However, success is not guaranteed, since it is possible to invoke programs with arbitrary values of argv[0], and in any case the executable may have been renamed or deleted since it was started. This watermark does not appear in the registered version - http://www.clicktoconvert.com

126 If all you want is to be able to print an appropriate invocation name with error messages, then the best approach is to have main() save the value of argv[0] in a global variable for use by the entire program. While there is no guarantee whatsoever that the value in argv[0] will be meaningful, it is the best option available in most circumstances. The most common reason people ask this question is in order to locate configuration files with their program. This is considered to be bad form; directories containing executables should contain nothing except executables, and administrative requirements often make it desirable for configuration files to be located on different filesystems to executables. A less common, but more legitimate, reason to do this is to allow the program to call

exec() on itself; this is a method used (e.g. by some versions of sendmail) to completely reinitialise the process (e.g. if a daemon receives a SIGHUP). 19) So where do I put my configuration files then? The correct directory for this usually depends on the particular flavour of Unix you're using; `/var/opt/PACKAGE', `/usr/local/lib', `/usr/local/etc', or any of several other possibilities. User-specific configuration files are usually hidden `dotfiles' under $HOME (e.g. `$HOME/.exrc'). From the point of view of a package that is expected to be usable across a range of systems, this usually implies that the location of any sitewide configuration files will be a compiled-in default, possibly using a `--prefix' option on a configure script (Autoconf scripts do this). You might wish to allow this to be overridden at runtime by an environment variable. (If you're not using a configure script, then put the default in the Makefile as a `-D' option on compiles, or put it in a `config.h' header file, or something similar.) User-specific configuration should be either a single dotfile under $HOME, or, if you need multiple files, a dot-subdirectory. (Files or directories whose names start with a dot are omitted from directory listings by default.) Avoid creating multiple entries under $HOME, because this can get very cluttered. Again, you can allow the user to override this location with an environment variable. Programs should always behave sensibly if they fail to find any per-user configuration. 20) Why doesn't my process get SIGHUP when its parent dies? Because it's not supposed to. SIGHUP is a signal that means, by convention, "the terminal line got hung up". It has nothing to do with parent processes, and is usually generated by the tty driver (and delivered to the foreground process group). This watermark does not appear in the registered version - http://www.clicktoconvert.com

127 However, as part of the session management system, there are exactly two cases where SIGHUP is sent on the death of a process: * When the process that dies is the session leader of a session that is attached to a terminal device, SIGHUP is sent to all processes in the foreground process group of that terminal device. * When the death of a process causes a process group to become orphaned, and one or more processes in the orphaned group are stopped, then SIGHUP and SIGCONT are sent to all members of the orphaned group. (An orphaned process group is one where no process in the group has a parent which is part of the same session, but not the same process group.) 21) How can I kill all descendents of a process? There isn't a fully general approach to doing this. While you can determine the relationships between processes by parsing ps output, this is unreliable in that it represents only a snapshot of the system. However, if you're lauching a subprocess that might spawn further subprocesses of its own, and you want to be able to kill the entire spawned job at one go, the solution is to put the subprocess into a new process group, and kill that process group if you need to. The preferred function for creating process groups is setpgid(). Use this if possible rather than setpgrp() because the latter differs between systems (on some systems `setpgrp();' is equivalent to `setpgid(0,0);', on others, setpgrp() and setpgid() are identical).

Putting a subprocess into its own process group has a number of effects. In particular, unless you explicitly place the new process group in the foreground, it will be treated as a background job with these consequences: * it will be stopped with SIGTTIN if it attempts to read from the terminal * if tostop is set in the terminal modes, it will be stopped with SIGTTOU if it attempts to write to the terminal (attempting to change the terminal modes should also cause this, independently of the current setting of tostop) * The subprocess will not receive keyboard signals from the terminal (e.g. SIGINT or SIGQUIT) In many applications input and output will be redirected anyway, so the most significant effect will be the lack of keyboard signals. The parent application should arrange to catch at least SIGINT and SIGQUIT (and preferably SIGTERM as well) and clean up any background jobs as necessary. 22) Which address spaces would be accessed by the 2 modes or execution levels ? User mode -> User stack and Kernel mode -> kernel stack This watermark does not appear in the registered version - http://www.clicktoconvert.com

128 23) The swapper is sometimes called a 'scheduler' why ? Swapping contributes to scheduling sometimes. 24) what does 'i' stands for in inode ? Index 25) what does a fork system call do? It duplicates parent process and gives a new PID to child, returned to calling process. 26) How many u areas can kernel access at any point of time ? why ? 1 uarea,using the only 1 register triplet for the currently running process. 27) Relate critical region and processor execution levels ? When critical regions/shared regions are executed, processor execution level is RAISED. 28) "No process can put another process to sleep" True/False True 29) In buffer caching, it's a physical or conceptual cache ? why required ? Conceptual cache. For Faster access of info. in RAM. 30) why is a block read ahead left Unlocked ? It is an advance read that may/may not be required. 31) Difference between unnamed regular file and simple pipe ? For simple pipes: Only direct blocks are used, and data is transient here 32) what is a mount point ? The point in the tree, where a new path/tree is attached to. 33) what are the 3 logical sections of a process in the UNIX system ? Data,Text/Code,Stack. 34) what are user register triples and kernel register triples ? This watermark does not appear in the registered version - http://www.clicktoconvert.com

129 It is used to store Page Table Info, virtual address for kernel & user processes. 35) when and how are signals for a process handled ? When a process gets its slot, the signal bits are checked for setting. 36) "Round robin with multilevel feedback" what does it mean ? It's a mechanism that takes care of levels (or) process groups and an arithmetic for

priority that maintains an order, between processes from different groups also. This watermark does not appear in the registered version - http://www.clicktoconvert.com

130

Part – IV

OBJECT ORIENTED PROGRAMMING This watermark does not appear in the registered version - http://www.clicktoconvert.com

131

Topics Covered:i) Introduction to OOP, Friend function - Soumya Ann Jose ii) Inheritance - Ivy R iii) Polymorphism , STL , Misc -Mukesh Kumar Singh iv) Operator Overloading -Rathika N v) Introduction to Java, Comparison -Vijayarani K This watermark does not appear in the registered version - http://www.clicktoconvert.com

132

(i) INTRODUCTION TO OOP, FRIEND FUNCTION 1. Difference between Object-Based And Object-Oriented. Object-Based Programming usually refers to objects without inheritance and hence without polymorphism, as in '83 Ada and Modula-2.These languages support abstract data types (ADT) and not classes, which provide inheritance and polymorphism. Object Oriented : a language is object-oriented if and only if it satisfies the following requirements: 1.It supports objects that are data abstractions with an interface of named operations and a hidden local state. 2.Objects have an associated type. 3.Types may inherit attributes from supertypes. object-oriented = data abstractions + object types + type inheritance or Object-Oriented = Classes and Objects + Inheritance + Communication with messages 2. Benefits of object-orientation? Reuse Increased quality Emphasis on modeling the real world

Faster development Increased Quality Easier maintenance Enhanced modifiability Systems more change resilient, evolvable Reduced development risks for complex systems 3. Differences between C and C++. This watermark does not appear in the registered version - http://www.clicktoconvert.com

133 1. In C, when a function takes no parameters, its prototype has the word void inside its function parameter list. Eg char f1(void).In C++, the void is optional ie char f1() 2. In C an empty parameter list would mean nothing is said about the parameters to the function. In C++, it means that the function has no parameters. 3. In C prototypes are optional, in C++ they are required. 4. In C++ if a function is declared as returning a value, it must return a value ie if a function has a return type other than void, any return statement within that function must contain a value. In C, a non-void function is not required to actually return a value. 5. In C, if you don’t explicitly specify the return type of a function, an integer type is assumed. In C++ you must explicitly declare the return type of all functions. 6. In C, local variables can be declared only at the start of a block. In C++, local variables can be declared anywhere. 4. What is a class? "A class is a set of objects that share a common structure and a common behavior." (Booch) general format: class class_name{ //private functions and variables public: //public functions and variables } object-list; 5. What Is An Object? Some Of the common definitions are : An object has state, behavior, and identity; the structure and behavior of similar objects are defined in their common class; the terms instance and object are interchangeable. (Booch) An object represents an individual, identifiable item,unit, or entity, either real or abstract, with a well-defined role in the problem domain. ( Smith and Tockey ) An object is an abstraction of a set of real-world things such that: * all of the real-world things in the set - the instances - have the same characteristics * all instances are subject to and conform to the same rules. (Shlaer) 6. What is a method? This watermark does not appear in the registered version - http://www.clicktoconvert.com

134 A method is a function or procedure which is defined in a class and typically can access the internal state of an object of that class to perform some operation. It can be thought

of as a procedure with the first parameter as the object to work on. This object is called the receiver, which is the object the method operates on. An exception exists with C++'s static member functions which do not have a receiver, or "this" pointer. 7. Major Object Oriented Features Abstraction: Encapsulation : mechanism that binds together code and the data It manipulates, and keeps both safe from outside interference and misuse. Polymorphism: Polymorphism(many forms) is the quality that allows one name to be used for two or more related but technically different purposes. Polymorphism allows one name to specify a general class of actions. Within a general class of actions, the specific action to be applied is determined by the type of data. Inheritance : Inheritance is the process by which one object can acquire the properties of another. More specifically, an object can inherit a general set of properties to which it can add those features that are specific only to itself. 8. What is object encapsulation (Or Protection)? Encapsulation is the process of hiding all of the details of an object that do not contribute to its essential characteristics.(Booch) C++ has a very general encapsulation/protection mechanism with public, private and protected members. Public members (member data and member functions) may be accessed from anywhere. A Stack's Push and Pop methods will be public. Private members are only accessible from within a class. A Stack's representation, such as a list or array, will usually be private. Protected members are accessible from within a class and also from within subclasses (also called derived classes). A Stack's representation could be declared protected allowing subclass access. C++ also allows a class to specify friends (other (sub)classes and functions), that can access all members (its representation). 9. Constructor A constructor is a function that is applied automatically to each class object prior to the first use of that object within the program. This watermark does not appear in the registered version - http://www.clicktoconvert.com

135 a class’s constructor is called each time an object of that class is created. Has the same name as the class of which it is a part and has no return type. For global objects, an object’s constructor is called once, when the program first begins execution. For local objects, the constructor is called each time the declaration statement is executed. A constructor can take parameters. 10. Destructor complement of a constructor is the destructor. called when an object is destroyed. Name of the destructor is the name of its class , preceded by a ~. Local objects are destroyed when they go out of scope. Global objects are destroyed when the program ends. Destructor functions cannot have parameters. 11. Difference between structure and a class

By default, the members of a class are private but the members of a structure are public. Syntax of a structure Struct type_name { // public function and data members private: // private function and data members } object-list; 12. Relation between union and class A union defines a class type that can contain both functions and data as members. Union similar to structure in that by default all members are public until the private specifier is used. In union however , all data members share the same memory location The unions ability to link code and data allows you to create class types in which all data uses a shared location. They cannot inherit any other class and they cannot be used as a base class for any other type. They also must not contain any object that has a constructor or destructor. The union itself can have a constructor and destructor. Unions cannot have virtual member functions. 13. Friend Functions This watermark does not appear in the registered version - http://www.clicktoconvert.com

136 a friend function is not a member of a class but still has access to its private members. Enables a function to have access to the private members of two or more different classes. A friend function is not a member and cannot be qualified by an object name. It must be called just like a normal function. Although a friend function has knowledge of the private elements of the class for which it is a friend, it can only access them through an object of the class. A friend can access the variables only in conjunction with an object that is declared within or passed to the friend function. A friend function is not inherited. A friend function can be friends with more than one class. An example of a friend function class myclass{ int n,d; public: myclass(int i, int j) { n = i ; d = j; } // declare a friend of myclass friend int isfactor(myclass ob); }; /* friend function definition. It returns true if d is a factor of n. the keyword friend is not used in the definition of isfactor() */

int isfactor(myclass ob) { if(!(ob.n % ob.d)) return 1; else return 0; } int main() { myclass ob1(10, 2); if(isfactor(ob1)) cout<< “ 2 is a factor of 10”; else cout<<”2 is not a factor of 10”; return 0; } Here , myclass declares its constructor function and the friend isfactor() inside its class declaration. Because isfactor() is a friend of myclass, isfactor() has access This watermark does not appear in the registered version - http://www.clicktoconvert.com

137 to its private members. That is why, within isfactor(), it is possible to directly refer to ob.n and ob.d. A function can be a member of one class and a friend of another. Example class B; //forward declaration class A { public: void function_name(B obj) { cout<< a; } }; class B { public: friend void A:: function_name(B obj); }; int A::function_name(B obj) { // function definition } int main() { A obj1; B obj2; obj1.function_name(obj2); return 0; } This watermark does not appear in the registered version - http://www.clicktoconvert.com

138

(ii) INHERITENCE

1.What is Inheritance? The mechanism of deriving a new class from an old one is called Inheritance. The old class is called as base class and the new class is called as derived class. 2.Why do we go for inheritance? • Reusabilty 3.Types of Inheritance: • Single Inheritance • Multiple Inheritance • Multilevel Inheritance • Hybrid Inheritance B D B1 B2 D S B D S B1 B2 This watermark does not appear in the registered version - http://www.clicktoconvert.com

139 • Hierarchical Inheritance 4.Define Derived Class: A derived class can be defined by specifying its relationship with the base class in addition to its own details. Class derieved_class_name : visibility_mode base_class_name { ------------------}; visibility_mode: • specifies Public, Private, Protected…. • if not specified .. taken as private • is optional VISIBILITY MODES: 5.What happens when a base class is privately inherited? ‘Public’ members of the base class become ‘Private’ members of derived class. Therefore public members of base class can be accessed only through member functions of the derived class. They are not accessible to the objects of the derived class. 6.What happens when a base class is publicly inherited? ‘Public ‘ members of the base class become ‘Public’ members of the derived class and therefore are accessible to the objects of the derived class. 7. NOTE: In both private and public inheritance the private members of the base class are not inherited. 8.How to make a Private member inheritable?

• Inherit as Protected • A member function declared as Protected is accessible by the member functions within the class and any class immediately derived from it. It cannot be accessed by functions outside these two classes. D B D1 D2 D3 This watermark does not appear in the registered version - http://www.clicktoconvert.com

140 9. NOTE1: When a protected member is inherited in Public mode, it becomes protected in the derived class too and is therefore accessible only by the member functions of the derived class. NOTE2: When a protected member is inherited in Private mode, it becomes Private int the derived class. 10. Protected Derivation: Both Public and Protected members of the base class become protected members of the derived class. 11. Visibility of Inherited members: Base Class Visibility Derived Class Visibility Public Derivation Private Derivation Protected Derivation

Private Not inherited Not inherited Not inherited

Protected Protected Private Protected

Public Public Private Protected 12. What are the various functions that can access the private and protected members of a class? • Member functions of the class. • A function that is a friend of the class. • A member function of a class that is a friend of the class. • A member function of a derived class. 13. Difference between Friend functions and member functions of derived class in accessing the private and protected members of a class? * The Friend functions and the member functions of a friend class can access both the private and protected data. * The member functions of a derived class can directly access only the protected data. However, they can access the private data through the member functions of the base class. 14. An object of a derived class can be treated as an object of the base class when manipulated with pointers and references. This watermark does not appear in the registered version - http://www.clicktoconvert.com

141 Base Derived Employee: Manager: /*

Manager is an employee. So a Manager* can be used as an Employee* However all employee cannot be managers, therefore an employee* cannot be used as manager* */ If a class Derived has a public base class Base, then a Derived* can be assigned to a variable of type Base* without the use of explicit type conversion. The reverse needs explicit type conversion. Eg: Void fun(Manager mm, Employee ee) { Employee* pe = &mm; // ok : base class pointer pointing to derived class object Manager* pm = ⅇ // error : derived class pointer cannot point to base class object….need explicit type conversion ……….. ……….. } The explicit type cast can be done by: Manager* pm=static_cast<Manager*>(pe); // ok… since ‘pe’ points to the manager mm; 15. Member Functions: consider the example……..: class Employee { char *first_name, *family_name; char middle_initial; …. Protected: Char *str; …. first_name family_name …

first_name family_name … group level …. This watermark does not appear in the registered version - http://www.clicktoconvert.com

142 Public: Int I; …… Void print() const;

…… Char * fam_name() const { return family_name; } …… }; class manager : public employee { int level; public: ……. void print() const; …. } A member function of the derived class can use the public and protected members of its base class as if it where declared in the derived class it self Void member :: print() const { cout<<”level”<
143 char *first_name,; int dept; Public: Employee(char* fname, int dept); }; class manager : public employee { int level; manager(char * fname, int dep, int lev); }

employee:: employee(char * fname, int dep) : first_name(fname), dept(dep) {} manager :: manager(char * fname,int dep, int lev) : employee(fname,dep), level(lev) {} 17. Order of invoking constructors: 1. base class constructor 2. members 3. derived class constructor 18. They are destroyed in the opposite order. Virtual Functions: Class employee { …. public: virtual void print() const; // virtual function virtual void display()=0; // pure virtual function }; void employee:: print() const { cout<
144 public: void print () const; }; void manager::print() const { employee::print(); cout<<”\t”<
Class task { delay() { … } virtual void pending()=0; virtual int fun(); }; class Displayed { draw() { … } virtual void draw()=0; virtual int fun(); }; class Satellite : public task, public Displayed { transmit() { … } void pending(); //override task::pending void draw(); //override displayed::draw void f(satellite * sp) { int g=sp->fun(); // error: ambiguous g=sp->task::fun(); //ok g=sp->displayed::fun(); //ok}}; This watermark does not appear in the registered version - http://www.clicktoconvert.com

145

(iii) POLYMORPHISM , STANDARD TEMPLATE LIBRARY AND MISC Polymorphism Hiding many implementation behind a common interface. Two types of polymorphism 1. state poplymorphism (compile time) 2. dynamic polymorphism (run time) In c++ overloading of function is an example of static polymorphism. e.g. int A(int); char A(char *); Virtual functions are example for dynamic polymorphism. Virtual Function • A virtual function is sometimes called method also. ( Only with C++ ) • Virtual function is used with the base and derived class.(it's not necessary but without it, not useful). • The declaration of virtual function should be same in base and derived class. • A virtual function must be defined for the class in which it is first declared (exception a pure virtual function) . • To get runtime polymorphism, member function call must be virtual and object must be manipulated through pointers and references.

• A virtual function can be inline. • The class in which a virtual function is declared stores one pointer (hidden). • The traditional and obvious implementation of virtual function is simply an indirect function call. class A{ ... ... virtual print(char *) =0 ; // Pure virtual function }; class A{ ... ... virtual int print(char *); ... ... }; This watermark does not appear in the registered version - http://www.clicktoconvert.com

146 class B:public class A { ... ... int print(char *); // in derive class, no need to write virtual keyword ... ... } It is not necessary that a virtual function should be defined in derived class. Class Hierarchy Inheritance simple inheritance multiple inheritance Inheritance and using declaration. if A & B are two base classes for C and both contains int f(int) and char f(char) respectively. Now with the help of using ( Keyword ) we can declare both. e.g. class A{ ... ... int f(int); char f(char); ... ... }; class B{ public: double f(double);

... ... }; class AB:public A, public B{ publi: using A::f; using B::f; char f(char); hides A::f(char); AB f(AB); This watermark does not appear in the registered version - http://www.clicktoconvert.com

147 }; Virtual Base Class: AA BD C Now C has two copy of members and attributes of A. Let in A we have f(); Now, C.f() // error C.B::f() // ok C.A::f()// > error Solution is Virtual Base Class. It should be defined like this. class A{ ... ... }; class B:public virtual A{ ... ... }; class D:public virtual A{ ... ... }; class C:public B, public D{ ... ... }; Standard Template Library This watermark does not appear in the registered version - http://www.clicktoconvert.com

148 The following function and descriptions are common for all containers differences are specified separately. • The facilities of the Standard Template Library are defined in the Standard namespace(std)

• A container is an object which holds other objects. e.g. : list, vector , stack, queue etc. • There are 2 ways to define container o Define a base class container and extend it.( problem is in two containers here may not be any common interface). o Define them separately ( It follows in c++) • Fat interface means a lot of methods in which all are not useful and necessary present in the interface e.g. : subscripting in list is not necessary. • In c++ all containers implements all the standard container interface. (but there is no common iterator base class) • To be used generally the containers use allocators which is passed as template argument. It is necessary because a vector can contain numbers or a vector of objects of a user defined class. e.g. Template > Class ::vector{ ------} • Iterator: Iterator is used to navigate containers Key member functions of iterator are: 1. iterator begin(); points to first element 2. const-iterator begin() const; 3. iterator end(); points to one past last element 4. const-iterator end() const; 5. reverse-iterator rbegin() points to 1st element of the reverse sequence. 6. reverse-iterator rend(); points to one past last element of reverse sequence 7. const for both also exists. This watermark does not appear in the registered version - http://www.clicktoconvert.com

149 How to use find ? r1= find (c.rbegin (), c.rend (), v); It checks the value v from last to first and if found returns a pointer to that objects *r1 gives object. If not found returns c.rend(); Similar for find(c.begin(),c.end(),v); Vector also provides [ ](script) and at() but at() checks for rangewhere as [ ] doesn’t (like c); It is defined as, reference operator [] (size-type n); // operator overloading Reference at (size-type n); /* Const versions available. */ Reference front(); // first element Reference back(); // last element Only vector and dequeue support subscripting. • begin() returns pointer to first element.

• constructors are ‘explicit’ in vector. // explicit doesn’t support = in initializing e. g. : vector vr (100); If record has default constructor then vr is a vector of 100 records, if not then error we must explicitly call constructor of record. Eg: vector v1(10); ok Vector v2=vector(10); ok This watermark does not appear in the registered version - http://www.clicktoconvert.com

150 Vector v3=v2; ok Vector v4=10; error; explicit not allowed. How to pass containers into a function …the 3 methods are here 3rd shouldn’t be used.. Void f (vectorv2& ); // pass by reference Void f (const vectorv2& ); // pass by reference Void f (vector v ); creates and copy another vector( pass by value) Use of assign. Assign() fnction is used to assign values. It also changes size. e. g. : vn.assign(10,num(0)); // now size is 10 Char s[]=”abcd”; vc.assign(s,&s[sizeof(s)-1]); size=4. // kind of assignment from start address to end address List lb; Vb.assign (lb.begin (), lb.end ()); // kind of assignment Some others functions. Void push_back(const t &x); // add to end Void pop_back(); // remove last element After each push_back() vector size grows by 1. s.back() gives most recently pushed element but doesn’t pops. Size(); // returns the size. Overflow, underflow is undefined. As a list , vector also provides facility to add and delete element at any position. Iterator insert(iterator pos,const T &x); Void insert(iterator pos,size_type n,const T &x); // add n elements Iterator erase(iterator first,iterator last); // erase sequence. It also supports Bool empty(); Sizetype max_size(); Size_type capacity(); resize(size_type sz,T val=t()); // like realloc This watermark does not appear in the registered version - http://www.clicktoconvert.com

151 //added elements only initialized void reserve(size_type n) ; // only reserve not initialized. When vector grows till range of reserve it needn’t be relocated but reserve

has no impact on size. Capacity gives the current no of reserved memory slot. To know the limit to which vector can grow without reallocation. v.capacity () - r.size (); List doesn’t have capacity and reserve. HELPER FUNCTIONS Vectors can be compared using = = and < . = = menas size and contents equal < means lexicographically less. STL also provides <=, !=, >=, > ,= Vector is implemented separately. // doesn’t follow the rules discussed above since it need only 1 bit to represent a element. TABLE from book(pg no: 462-464 stroustrup) Representation: it specify how the containers will be implemented. It is left to implementers but with some constraint. Generally, Vector uses arrays Lists uses link lists Map uses balance tree String uses sequence of array of some chars. Element: the object in container. Comparison: we can write our own comparison function or by default < is used. e.g. : template void sort(A a,A b); // default < used here Template void sort(A a,A b,Cmp cmp); // cmp( ) will specify the comparison rule. Now how a and b i.e., 2 objects of A will be compared is specified in function cmp. it must return bool. This watermark does not appear in the registered version - http://www.clicktoconvert.com

152 To sort names we must write cmp ( ) , if we want alphabetic (dictionary) order. i.e., a A b B. By default (ASCII) AB ab Relation operation !=,<=,>,>= are not included by default. It is defined in std: reloperators namespace. Sequence: Fundemental sequences are vector,list, dequeue. A list provides all the member types and operations offered by vector except subscripting, capacity() and reserve(),at() LIST’s operations: Void splice(iterator pos, list &x’) moves all element from x to before pos in this list without copying. Void splice(iterator pos, list &x,iterator p); moves *p before pos from list x without copying. Similarly Void splice(iterator pos,list x,iterator first,iterator last) Void merge(list &) merge sorted list. Template < class Cmp> void merge (list&,Cmp); Void sort();

Template void sort(Cmp); Above list operations are stable i.e., relative order of equal elements doesn’t change. Before using merge we must use sort(); Reference front(); Void push_front(const t&) ; Reference front()const; Void push_front(const t&) const ; Void remove(const t& val) removes val from list. Template void remove_if(pred p); removes on condition. Void unique(); Template void unique(Binpredb); Void reverse(); reverses the list. Eg., let fruit is a list of fruit names. Fruit.remove(“orange’); This watermark does not appear in the registered version - http://www.clicktoconvert.com

153 Fruit.remove_if(initial(‘l’)); Unique() works only on sorted list because it checks consecutive elements. Unique(initial (‘P’)); will give pear apple apple if list contain peer pear apple apple. A dequeue (it rhymes with check) is double ended queue. operation on ends are efficient like list. subscripting is efficiend like vector. ADAPTOR: Stack, queue are adaptor of containers vector , dequeue and list. Because they can be defined efficiently in terms of 3 containers. By default stack uses dequeue. Eg. Stack,char> s1; dequeue container Stack s2; vector container. • Stack relies on the iterator of container used to implement stack. • queue by default uses dequeue as container to implement. • vector doesn’t provide pop_front() so it cant be used for queue. Associative array also called map or dictionary keeps pairs of keys, given only we can get other. Map is a sequence of (key,value) pair.

Miscellaneous:1. What is an object in C++? An object is a package that contains related data and instructions. The data relates to what the object represents, while the instructions define how this object relates to other objects and itself. 2. What is a message? A message is a signal from one object to another requesting that a computation take place. It is roughly equivalent to a function call in other languages. 3. What is a class? This watermark does not appear in the registered version - http://www.clicktoconvert.com

154 A class defines the characteristics of a certain type of object. It defines what its

members will remember, the messages to which they will respond, and what form the response will take. 4. What is an instance? An individual object that is a member of some class. 5. What is a super-class? Given a class, a super-class is the basis of the class under consideration. The given class is defined as a subset (in some respects) of the super-class. Objects of the given class potentially posses all the characteristics belonging to objects of the super-class. 6. What is inheritance? Inheritance is property such that a parent (or super) class passes the characteristics of itself to children (or sub) classes that are derived from it. The sub-class has the option of modifying these characteristics in order to make a different but fundamentally related class from the super-class. 7. To what does message protocol refer? An object's message protocol is the exact form of the set of messages to which the object can respond. 8. What is polymorphism? Polymorphism refers to the ability of an object to respond in a logically identical fashion to messages of the same protocol, containing differing types of objects. Consider 1 + 5 and 1 + 5.1. In the former, the message "+ 5" is sent to an object of class integer (1). In the later, the message "+ 5.1" is sent to the same integer object. The form of the message (its protocol) is identical in both cases. What differs is the type of object on the right-hand side of these messages. The former is an integer object (5) while the later is a floating point object (5.1). The receiver (1) appears (to other objects) to respond in the same way to both messages. Internally, however, it knows that it must treat the two types of objects differently in order to obtain the same overall response. 9. What are instance variables? These represent an object's private memory. They are defined in an object's class. 10. What are class variables? These represent a class's memory which it shares with each of its instances. This watermark does not appear in the registered version - http://www.clicktoconvert.com

155 11. What is a method? A method is a class's procedural response to a given message protocol. It is like the definition of a procedure in other languages. 12. In C++ what is a constructor? A destructor? A constructors and destructors are methods defined in a class that are invoked automatically when an object is created or destroyed. They are used to initialize a newly allocated object and to cleanup behind an object about to be removed. 13. Compare and contrast C and C++. Comparison: C++ is an extension to the C language. When C++ is used as a procedural language, there are only minor syntactical differences between them. Contrast: When used as a procedural language, C++ is a better C because: o It vigorously enforces data typing conventions. o It allows variables to be defined where they are used. o It allows the definition of real (semantically significant) constants.

It allows for automatic pointer dereferencing. It supports call-by-reference in addition to call-by-value in functions. It supports tentative variable declarations (when the type and location of a variable cannot be known before hand. As an object oriented language, C++ introduces much of the OOP paradigm while allowing a mixture of OOP and procedural styles. 14. What is operator overloading? It is the process of, and ability to redefine the way an object responds to a C++ operator symbol. This would be done in the object's class definition. 15. What is cin and cout? They are objects corresponding to a program's default input and output files. 16. Contrast procedural and object oriented programming. The procedural paradigm performs computation through a step-by-step manipulation of data items. Solving problems this way is akin to writing a recipe. ie: All the ingredients (data items) are defined. Next a series of enumerated steps (statements) are defined to transform the raw ingredients into a finished meal. The object oriented model, in contrast, combines related data and procedural information into a single o o o

This watermark does not appear in the registered version - http://www.clicktoconvert.com

156 package called an object. Objects are meant to represent logically separate entities (like real world objects). Objects are grouped together (and defined by) classes. (This is analogous to user defined data types in procedural languages.) Classes may pass-on their "makeup" to classes derived from them. In this way, Objects that are of a similar yet different nature need not be defined from scratch. Computation occurs though the intercommunication of objects. Programming this way is like writing a play. First the characters are defined with their attributes and personalities. Next the dialog is written so that the personalities interact. The sum total constitutes a drama. 17. How do you link a C++ program to C functions? By using the extern "C" linkage specification around the C function declarations. You should know about mangled function names and type-safe linkages. Then you should explain how the extern "C" linkage specification statement turns that feature off during compilation so that the linker properly links function calls to C functions. Another acceptable answer is "I don't know. We never had to do that." Merely describing what a linker does would indicate to me that you do not understand the issue that underlies the question. 18. Explain the scope resolution operator. The scope resolution operator permits a program to reference an identifier in the global scope that has been hidden by another identifier with the same name in the local scope. The answer can get complicated. It should start with "colon-colon," however. (Some readers had not heard the term, "scope resolution operator," but they knew what :: means. You should know the formal names of such things so that you can understand all communication about them.) If you claim to be well into the design or use of classes that employ inheritance, you tend to address overriding virtual function overrides to explicitly call a function higher in the hierarchy. That's good knowledge to demonstrate, but address your comments specifically to global scope resolution. Describe C++'s ability to override the particular C behavior where identifiers in the global scope are always hidden by similar identifiers in a local

scope. 19. What are the differences between a C++ struct and C++ class? The default member and base class access specifiers are different. This is one of the commonly misunderstood aspects of C++. Believe it or not, many programmers think that a C++ struct is just like a C struct, while a C++ class has inheritance, access specifiers, member functions, overloaded operators, and so on. Some of them have even written books about C++. Actually, the C++ struct has all the features of the class. The only differences are that a struct defaults to public member access and public base class inheritance, and a class defaults to the private access specifier and private base class inheritance. Getting this question wrong does not necessarily This watermark does not appear in the registered version - http://www.clicktoconvert.com

157 disqualify you because you will be in plenty of good company. Getting it right is a definite plus. 20. How many ways are there to initialize an int with a constant? Two. There are two formats for initializers in C++ as shown in Example 1. Example 1(a) uses the traditional C notation, while Example 1(b) uses constructor notation. Many programmers do not know about the notation in Example 1(b), although they should certainly know about the first one. Many old-timer C programmers who made the switch to C++ never use the second idiom, although some wise heads of C++ profess to prefer it. A reader wrote to tell me of two other ways, as shown in Examples 2(a) and 2(b), which made me think that maybe the answer could be extended even further to include the initialization of an int function parameter with a constant argument from the caller. 21. How does throwing and catching exceptions differ from using setjmp and longjmp? The throw operation calls the destructors for automatic objects instantiated since entry to the try block. Exceptions are in the mainstream of C++ now, so most programmers, if they are familiar with setjmp and longjmp, should know the difference. Both idioms return a program from the nested depths of multiple function calls to a defined position higher in the program. The program stack is "unwound" so that the state of the program with respect to function calls and pushed arguments is restored as if the calls had not been made. C++ exception handling adds to that behavior the orderly calls to the destructors of automatic objects that were instantiated as the program proceeded from within the try block toward where the throw expression is evaluated. It's okay to discuss the notational differences between the two idioms. Explain the syntax of try blocks, catch exception handlers, and throw expressions. Then specifically address what happens in a throw that does not happen in a longjmp. Your answer should reflect an understanding of the behavior described in the answer just given. One valid reason for not knowing about exception handling is that your experience is exclusively with older C++ compilers that do not implement exception handling. I would prefer that you have at least heard of exception handling, though. It is not unusual for C and C++ programmers to be unfamiliar with setjmp/ longjmp. Those constructs are not particularly intuitive. A C programmer who has written

recursive descent parsing algorithms will certainly be familiar with setjmp/ longjmp. Others might not, and that's acceptable. In that case, you won't be able to discuss how setjmp/longjmp differs from C++ exception handling, but let the interview turn into a This watermark does not appear in the registered version - http://www.clicktoconvert.com

158 discussion of C++ exception handling in general. That conversation will reveal to the interviewer a lot about your overall understanding of C++. 22. What is your reaction to this line of code? delete this; It's not a good practice. A good programmer will insist that the statement is never to be used if the class is to be used by other programmers and instantiated as static, extern, or automatic objects. That much should be obvious. The code has two built-in pitfalls. First, if it executes in a member function for an extern, static, or automatic object, the program will probably crash as soon as the delete statement executes. There is no portable way for an object to tell that it was instantiated on the heap, so the class cannot assert that its object is properly instantiated. Second, when an object commits suicide this way, the using program might not know about its demise. As far as the instantiating program is concerned, the object remains in scope and continues to exist even though the object did itself in. Subsequent dereferencing of the pointer can and usually does lead to disaster. A reader pointed out that a class can ensure that its objects are instantiated on the heap by making its destructor private. This idiom necessitates a kludgy DeleteMe kind of function because the instantiator cannot call the delete operator for objects of the class. The DeleteMe function would then use "delete this." I got a lot of mail about this issue. Many programmers believe that delete this is a valid construct. In my experience, classes that use delete this when objects are instantiated by users usually spawn bugs related to the idiom, most often when a program dereferences a pointer to an object that has already deleted itself. 23. What is a default constructor? A constructor that has no arguments or one where all the arguments have default argument values. If you don't code a default constructor, the compiler provides one if there are no other constructors. If you are going to instantiate an array of objects of the class, the class must have a default constructor. 24. What is a conversion constructor? A constructor that accepts one argument of a different type. The compiler uses this idiom as one way to infer conversion rules for a class. A constructor with more than one argument and with default argument values can be interpreted by the compiler as a conversion constructor when the compiler is looking for an object of the type and sees an object of the type of the constructor's first argument. This watermark does not appear in the registered version - http://www.clicktoconvert.com

159 25. What is the difference between a copy constructor and an overloaded assignment operator? A copy constructor constructs a new object by using the content of the argument object. An overloaded assignment operator assigns the contents of an existing object to another existing object of the same class. First, you must know that a copy constructor is one that has only one argument,

which is a reference to the same type as the constructor. The compiler invokes a copy constructor wherever it needs to make a copy of the object, for example to pass an argument by value. If you do not provide a copy constructor, the compiler creates a member-by-member copy constructor for you. You can write overloaded assignment operators that take arguments of other classes, but that behavior is usually implemented with implicit conversion constructors. If you do not provide an overloaded assignment operator for the class, the compiler creates a default member-by-member assignment operator. This discussion is a good place to get into why classes need copy constructors and overloaded assignment operators. By discussing the requirements with respect to data member pointers that point to dynamically allocated resources, you demonstrate a good grasp of the problem. 26. When should you use multiple inheritance? There are three acceptable answers: "Never," "Rarely," and "When the problem domain cannot be accurately modeled any other way." There are some famous C++ pundits and luminaries who disagree with that third answer, so be careful. Let's digress to consider this issue lest your interview turn into a religious debate. Consider an Asset class, Building class, Vehicle class, and CompanyCar class. All company cars are vehicles. Some company cars are assets because the organizations own them. Others might be leased. Not all assets are vehicles. Money accounts are assets. Real-estate holdings are assets. Some real-estate holdings are buildings. Not all buildings are assets. Ad infinitum. When you diagram these relationships, it becomes apparent that multiple inheritance is an intuitive way to model this common problem domain. You should understand, however, that multiple inheritance, like a chainsaw, is a useful tool that has its perils, needs respect, and is best avoided except when nothing else will do. Stress this understanding because your interviewer might share the common bias against multiple inheritance that many object-oriented designers hold. 27. What is a virtual destructor? This watermark does not appear in the registered version - http://www.clicktoconvert.com

160 The simple answer is that a virtual destructor is one that is declared with the virtual attribute. The behavior of a virtual destructor is what is important. If you destroy an object through a pointer or reference to a base class, and the base-class destructor is not virtual, the derived-class destructors are not executed, and the destruction might not be complete. 28. Explain the “IS A” and “HAS A” class relationships. How would you mplement each in a class design? A specialized class "is a" specialization of another class and, therefore, has the ISA relationship with the other class. An Employee ISA Person. This relationship is best implemented with inheritance. Employee is derived from Person. A class may have an instance of another class. For example, an Employee "has a" Salary, therefore the Employee class has the HASA relationship with the Salary class. This relationship is best implemented by embedding an object of the Salary class in the Employee class. The answer to this question reveals whether you have an understanding of the

fundamentals of object-oriented design, which is important to reliable class design. There are other relationships. The USESA relationship is when one class uses the services of another. The Employee class uses an object (cout) of the ostream class to display the employee's name onscreen, for example. But if you get ISA and HASA right, you usually don't need to go any further. 29. When is a template a better solution than a base class? When you are designing a generic class to contain or otherwise manage objects of other types, when the format and behavior of those other types are unimportant to their containment or management, and particularly when those other types are unknown (thus the genericity) to the designer of the container or manager class. Prior to templates, you had to use inheritance; your design might include a generic List container class and an application-specific Employee class. To put employees in a list, a ListedEmployee class is multiply derived (contrived) from the Employee and List classes. These solutions were unwieldy and error-prone. Templates solved that problem. 30. What is the result of compiling this program in a C compiler? in a C++ compiler? int __ = 0; main() { int ___ = 2; printf("%d\n", ___ + __); } This watermark does not appear in the registered version - http://www.clicktoconvert.com

161 On GNU compiler both C and C++ give output as 2. I believe other C++ compiler will complain "syntax error", whereas C compiler will give 2. 31. What is the difference between C and C++ ? Would you prefer to use one over the other ? C is based on structured programming whereas C++ supports the object-oriented programming paradigm. Due to the advantages inherent in object-oriented programs such as modularity and reuse, C++ is preferred. However almost anything that can be built using C++ can also be built using C. 32. What are the access privileges in C++ ? What is the default access level ? The access privileges in C++ are private, public and protected. The default access level assigned to members of a class is private. Private members of a class are accessible only within the class and by friends of the class. Protected members are accessible by the class itself and it's sub-classes. Public members of a class can be accessed by anyone. 33. What is data encapsulation ? Data Encapsulation is also known as data hiding. The most important advantage of encapsulation is that it lets the programmer create an object and then provide an interface to the object that other objects can use to call the methods provided by the object. The programmer can change the internal workings of an object but this transparent to other interfacing programs as long as the interface remains unchanged. 34. What is inheritance ? Inheritance is the process of deriving classes from other classes. In such a case, the sub-class has an 'is-a' relationship with the super class. For e.g. vehicle can be a

super-class and car can be a sub-class derived from vehicle. In this case a car is a vehicle. The super class 'is not a' sub-class as the sub- class is more specialized and may contain additional members as compared to the super class. The greatest advantage of inheritance is that it promotes generic design and code reuse. 35. What is multiple inheritance ? What are it's advantages and disadvantages ? Multiple Inheritance is the process whereby a sub-class can be derived from more than one super class. The advantage of multiple inheritance is that it allows a class to inherit the functionality of more than one base class thus allowing for modeling of complex relationships. The disadvantage of multiple inheritance is that it can lead to a lot of confusion when two base classes implement a method with the same name. This watermark does not appear in the registered version - http://www.clicktoconvert.com

162 36. What is polymorphism? Polymorphism refers to the ability to have more than one method with the same signature in an inheritance hierarchy. The correct method is invoked at run-time based on the context (object) on which the method is invoked. Polymorphism allows for a generic use of method names while providing specialized implementations for them. 37. What do the keyword static and const signify ? When a class member is declared to be of a static type, it means that the member is not an instance variable but a class variable. Such a member is accessed using Classname.Membername (as opposed to Object.Membername). Const is a keyword used in C++ to specify that an object's value cannot be changed. 38. How is memory allocated/deallocated in C ? How about C++ ? Memory is allocated in C using malloc() and freed using free(). In C++ the new() operator is used to allocate memory to an object and the delete() operator is used to free the memory taken up by an object. 39. What is UML ? UML refers to Unified Modeling Language. It is a language used to model OO problem spaces and solutions. 40. What is the difference between a shallow copy and a deep copy? A shallow copy simply creates a new object and inserts in it references to the members of the original object. A deep copy constructs a new object and then creates in it copies of each of the members of the original object. 41. What is a mutable member? One that can be modified by the class even when the object of the class or the member function doing the modification is const. Understanding this requirement implies an understanding of C++ const, which many programmers do not have. I have seen large class designs that do not employ the const qualifier anywhere. Some of those designs are my own early C++ efforts. One author suggests that some programmers find const to be such a bother that it is easier to ignore const than to try to use it meaningfully. No wonder many programmers don't understand the power and implications of const. Someone who claims to have enough interest in the This watermark does not appear in the registered version - http://www.clicktoconvert.com

163 language and its evolution to keep pace with the ANSI deliberations should not be ignorant of const, however.

42. What is an explicit constructor? A conversion constructor declared with the explicit keyword. The compiler does not use an explicit constructor to implement an implied conversion of types. Its purpose is reserved explicitly for construction. 43. Describe run-time type identification. The ability to determine at run time the type of an object by using the typeid operator or the dynamic_cast operator. 44. What problem does the namespace feature solve? Multiple providers of libraries might use common global identifiers causing a name collision when an application tries to link with two or more such libraries. The namespace feature surrounds a library's external declarations with a unique namespace that eliminates the potential for those collisions. This solution assumes that two library vendors don't use the same namespace, of course. 45. Are there any new intrinsic (built-in) data types? Yes. The ANSI committee added the bool intrinsic type and its true and false value keywords and the wchar_t data type to support character sets wider than eight bits. Other apparent new types (string, complex, and so forth) are implemented as classes in the Standard C++ Library rather than as intrinsic types. This watermark does not appear in the registered version - http://www.clicktoconvert.com

164

(iv) OPERATOR OVERLOADING 1. Why Operator Overloading? It allows us to provide an intuitive interface to users of our class, plus makes it possible for templates to work equally well with classes and built-in/intrinsic types. Operator overloading allows C/C++ operators to have user-defined meanings on user-defined types (classes). Overloaded operators are syntactic sugar for function calls. 2.What are the benefits of operator overloading? By overloading standard operators on a class, we can exploit the intuition of the users of that class. This lets users program in the language of the problem domain rather than in the language of the machine.The ultimate goal is to reduce both the learning curve and the defect rate. 3. What are some examples of operator overloading? A few of the many examples of operator overloading: • myString + yourString might concatenate two std::string objects • myDate++ might increment a Date object • a * b might multiply two Number objects • a[i] might access an element of an Array object • x = *p might dereference a "smart pointer" that "points" to a disk record — it could seek to the location on disk where p "points" and return the appropriate record into x. 4. Operator Overloading makes classes look clumsy.Comment on this. Operator overloading makes life easier for the users of a class, not for the developer of the class! Consider the following example. class Array {

public: int& operator[] (unsigned i); // Some people don't like this syntax ... }; inline int& Array::operator[] (unsigned i) // Some people don't like this syntax This watermark does not appear in the registered version - http://www.clicktoconvert.com

165 { ... } Some people don't like the keyword operator or the somewhat funny syntax that goes with it in the body of the class itself. But the operator overloading syntax isn't supposed to make life easier for the developer of a class. It's supposed to make life easier for the users of the class: int main() { Array a; [3] = 4; // User code should be obvious and easy to understand... } 5.What operators can/cannot be overloaded? Most can be overloaded. The only C operators that can't be are . and ?: (and sizeof, which is technically an operator). C++ adds a few of its own operators, most of which can be overloaded except :: and .*. 6. Is it possible to overload operator== so it lets us compare two char[] using a string comparison? No: at least one operand of any overloaded operator must be of some user-defined type (most of the time that means a class).But even if C++ allowed you to do this, which it doesn't, you wouldn't want to do it anyway since you really should be using a std::stringlike class rather than an array of char in the first place since arrays are evil. 7.Can we create a operator** for "to-the-power-of" operations? No. The names of, precedence of, associativity of, and arity of operators is fixed by the language. There is no operator** in C++, so we cannot create one for a class type. Consider that x ** y is the same as x * (*y) (in other words, the compiler assumes y is a pointer). Besides, operator overloading is just syntactic sugar for function calls. This particular syntactic sugar doesn't add anything fundamental. It is suggested to overload pow(base,exponent) (a double precision version is in ).By the way, operator^ can work for to-the-power-of, except it has the wrong precedence and associativity. 8.What are some guidelines / "rules of thumb" for overloading operators? If the overloaded operator makes the output easier and safer for the users, do it; otherwise don't. This is the most important guideline. In fact it is, in a very real sense, the only guideline; the rest are just special cases. While defining arithmetic operators, maintain the usual arithmetic identities. This watermark does not appear in the registered version - http://www.clicktoconvert.com

166

You should provide mixed-mode arithmetic operators only when they make logical sense to users. For example, it makes sense to add a duration (say 35 days) to a date (say July 4, 1776), so you might define date + duration to return a Date. Similarly date - duration could also return a Date. But duration - date does not make sense at the conceptual level (what does it mean to subtract July 4, 1776 from 35 days?) so you should not define that operator. 9.What is the rule to be followed when using constructive operators? If you provide constructive operators, they should return their result by value.If it returns by reference, that will probably run into lots of problems figuring out who owns the referent and when the referent will get destructed. Doesn't matter if returning by reference is more efficient; it is probably wrong. If you provide constructive operators, they should not change their operands and allow promotion of the left hand operand. 10.While using x==y, what does it mean by ‘the objects need be behaviourally equivalent’? If you define x == y, then x == y should be true if and only if the two objects are behaviorally equivalent. The term "behaviorally equivalent" means the observable behavior of any operation or sequence of operations applied to x will be the same as when applied to y. The term "operation" means methods, friends, operators, or just about anything else you can do with these objects (except, of course, the address-of operator). You won't always be able to achieve that goal, but you ought to get close, and you ought to document any variances (other than the address-of operator). 11.Is it recommended to overload short-circuiting operatos? Avoid overloading short-circuiting operators: x || y or x && y. The overloaded versions of these do not short-circuit — they evaluate both operands even if the left-hand operand "determines" the outcome, so that confuses users. 12.Why should we avoid overloading the comma operator? Avoid overloading the comma operator: x, y. The overloaded comma operator does not have the same ordering properties that it has when it is not overloaded, and that confuses users. 13.How do I create a subscript operator for a Matrix class? Use operator() rather than operator[]. When there are multiple subscripts, the cleanest way to do it is with operator() rather than with operator[]. The reason is that operator[] always takes exactly one parameter, but operator() can take any number of parameters (in the case of a rectangular matrix, two parameters are needed). 14.Why shouldn't any Matrix class's interface look like an array-of-array? The operator() approach is never worse than, and sometimes better than, the [][] approach. • The operator() approach is never worse because it is easy to implement the dense, row-major physical layout using the operator() approach, so when that This watermark does not appear in the registered version - http://www.clicktoconvert.com

167 configuration happens to be the optimal layout from a performance standpoint, the operator() approach is just as easy as the [][] approach (perhaps the operator() approach is a tiny bit easier, but I won't quibble over minor nits). • The operator() approach is sometimes better because whenever the optimal layout

for a given application happens to be something other than dense, row-major, the implementation is often significantly easier using the operator() approach compared to the [][] approach. 15.How can I overload the prefix and postfix forms of operators ++ and --? Via a dummy parameter. Since the prefix and postfix ++ operators can have two definitions, the C++ language gives us two different signatures.Both are called operator++(), but the prefix version takes no parameters and the postfix version takes a dummy int.(Although this discussion revolves around the ++ operator, the -- operator is completely symmetric, and all the rules and guidelines that apply to one also apply to the other.) However we must *not* make the postfix version return the 'this' object by reference. This watermark does not appear in the registered version - http://www.clicktoconvert.com

168

(v) INTRODUCTION TO JAVA, COMPARISON 1. what is self assignment? Self assignment is when someone assigns an object to itself. For example, void userCode(Fred& x) { x = x; // Self-assignment } Obviously no one ever explicitly does a self assignment like the above, but since more than one pointer or reference can point to the same object (aliasing), it is possible to have self assignment without knowing it: 2. Is Encapsulation a Security device? No. Encapsulation != security. Encapsulation prevents mistakes, not espionage. 3. What's the difference between the keywords struct and class? The members and base classes of a struct are public by default, while in class, they default to private. struct and class are otherwise functionally equivalent. A struct simply feels like an open pile of bits with very little in the way of encapsulation or functionality. A class feels like a living and responsible member of society with intelligent services, a strong encapsulation barrier, and a well defined interface. Since that's the connotation most people already have, you should probably use the struct keyword if you have a class that has very few methods and has public data. 4.Is there any difference between List x; and List x();? A big difference! Suppose that List is the name of some class. Then function f() declares a local List object called x: void f() { List x; // Local object named x (of class List) ... }

But function g() declares a function called x() that returns a List: void g() { List x(); // Function named x (that returns a List) ... } This watermark does not appear in the registered version - http://www.clicktoconvert.com

169 5. Should the constructors use "initialization lists" or "assignment"? Initialization lists. In fact, constructors should initialize as a rule all member objects in the initialization list. One exception is discussed further down. Consider the following constructor that initializes member object x_ using an initialization list: Fred::Fred() : x_(whatever) { }. The most common benefit of doing this is improved performance. For example, if the expression whatever is the same type as as member variable x_, the result of the whatever expression is constructed directly inside x_ - the compiler does not make a separate copy of the object. Even if the types are not the same, the compiler is usually able to do a better job with initialization lists than with assignments. The other (inefficient) way to build constructors is via assignment, such as: Fred::Fred() { x_ = whatever; }. In this case the expression whatever causes a separate, temporary object to be created, and this temporary object is passed into the x_ object's assignment operator. Then that temporary object is destructed at the ;. That's inefficient. As if that wasn't bad enough, there's another source of inefficiency when using assignment in a constructor: the member object will get fully constructed by its default constructor, and this might, for example, allocate some default amount of memory or open some default file. All this work could be for naught if the whatever expression and/or assignment operator causes the object to close that file and/or release that memory (e.g., if the default constructor didn't allocate a large enough pool of memory or if it opened the wrong file). Conclusion: All other things being equal, your code will run faster if you use initialization lists rather than assignment 6.Should the this pointer can be used in the constructor? Here is something that always works: the {body} of a constructor (or a function called from the constructor) can reliably access the data members declared in a base class and/or the data members declared in the constructor's own class. This is because all those data members are guaranteed to have been fully constructed by the time the constructor's {body} starts executing. Here is something that never works: the {body} of a constructor (or a function called from the constructor) cannot get down to a derived class by calling a virtual member function that is overridden in the derived class. If your goal was to get to the overridden function in the derived class, you won't get what you want. Note that you won't get to the override in the derived class independent of how you call the virtual member function: explicitly using the this pointer (e.g., this->method()), implicitly using the this pointer (e.g., method()), or even calling some other function that calls the virtual member function on your this object. The bottom line is this: even if the caller is constructing an object of a derived class, during the constructor of the base class, your object is not yet of that derived class. You have been warned. Here is something that sometimes works: if you pass any of the data members in this object to another data member's initializer, you must make sure that the other data

member has already been initialized. The good news is that you can determine whether the other data member has (or has not) been initialized using some straightforward This watermark does not appear in the registered version - http://www.clicktoconvert.com

170 language rules that are independent of the particular compiler you're using. The bad news it that you have to know those language rules (e.g., base class sub-objects are initialized first (look up the order if you have multiple and/or virtual inheritance!), then data members defined in the class are initialized in the order in which they appear in the class declaration). If you don't know these rules, then don't pass any data member from the this object (regardless of whether or not you explicitly use the this keyword) to any other data member's initializer! And if you do know the rules, please be careful. 7.Why are classes with static data members getting linker errors? Because static data members must be explicitly defined in exactly one compilation unit. If you didn't do this, you'll probably get an "undefined external" linker error. For example: // Fred.h class Fred { public: ... private: static int j_; // Declares static data member Fred::j_ ... }; 1 The linker will holler at you ("Fred::j_ is not defined") unless you define (as opposed to merely declare) Fred::j_ in (exactly) one of your source files: // Fred.cpp #include "Fred.h" int Fred::j_ = some_expression_evaluating_to_an_int; // Alternatively, if you wish to use the implicit 0 value for static ints: // int Fred::j_; The usual place to define static data members of class Fred is file Fred.cpp (or Fred.C or whatever source file ext 8.What's the order that local objects are destructed? In reverse order of construction: First constructed, last destructed. 9.Can I overload the destructor for my class? No. You can have only one destructor for a class Fred. It's always called Fred::~Fred(). It never takes any parameters, and it never returns anything. You can't pass parameters to the destructor anyway, since you never explicitly call a destructor This watermark does not appear in the registered version - http://www.clicktoconvert.com

171 10.What is a friend? Something to allow your class to grant access to another class or function. Friends can be either functions or other classes. A class grants access privileges to its friends. Normally a developer has political and technical control over both the friend and member functions of a class (else you may need to get permission from the owner of the

other pieces when you want to update your own class). 11.Do friends violate encapsulation? No! If they're used properly, they enhance encapsulation. 12.What are some advantages/disadvantages of using friend functions? They provide a degree of freedom in the interface design options. Member functions and friend functions are equally privileged (100% vested). The major difference is that a friend function is called like f(x), while a member function is called like x.f(). Thus the ability to choose between member functions (x.f()) and friend functions (f(x)) allows a designer to select the syntax that is deemed most readable, which lowers maintenance costs. The major disadvantage of friend functions is that they require an extra line of code when you want dynamic binding. To get the effect of a virtual friend, the friend function should call a hidden (usually protected) virtual member function. This is called the Virtual Friend Function 13.What's the deal with inline functions? When the compiler inline-expands a function call, the function's code gets inserted into the caller's code stream (conceptually similar to what happens with a #define macro). This can, depending on a zillion other things, improve performance, because the optimizer can procedurally integrate the called code - optimize the called code into the caller. There are several ways to designate that a function is inline, some of which involve the inline keyword, others do not. No matter how you designate a function as inline, it is a request that the compiler is allowed to ignore: it might inline-expand some, all, or none of the calls to an inline function. 14.What are the benefits of operator overloading? By overloading standard operators on a class, you can exploit the intuition of the users of that class. This lets users program in the language of the problem domain rather than in the language of the machine. The ultimate goal is to reduce both the learning curve and the defect rate. 15.What operators can/cannot be overloaded? This watermark does not appear in the registered version - http://www.clicktoconvert.com

172 Most can be overloaded. The only C operators that can't be are . and ?: (and sizeof, which is technically an operator). C++ adds a few of its own operators, most of which can be overloaded except :: and .*. 16.Can I overload operator== so it lets me compare two char[] using a string comparison? No: at least one operand of any overloaded operator must be of some user-defined type (most of the time that means a class). But even if C++ allowed you to do this, which it doesn't, you wouldn't want to do it anyway since you really should be using a std::string-like class rather than an array of char in the first place since arrays are evil. 17.Which is more efficient: i++ or ++i? ++i is sometimes faster than, and is never slower than, i++. . 18.Is the type of "pointer-to-member-function" different from "pointertofunction"? Yes.

Consider the following function: int f(char a, float b); The type of this function is different depending on whether it is an ordinary function or a non-static member function of some class: • Its type is "int (*)(char,float)" if an ordinary function • Its type is "int (Fred::*)(char,float)" if a non-static member function of class Fred 19.What is a reference? An alias (an alternate name) for an object. References are frequently used for pass-by-reference: 20.What does object.method1().method2() mean? It chains these method calls, which is why this is called method chaining. The first thing that gets executed is object.method1(). This returns some object, which might be a reference to object (i.e., method1() might end with return *this;), or it might be some other object. Let's call the returned object objectB. Then objectB becomes the this object of method2(). The most common use of method chaining is in the iostream library. E.g., cout << x << y works because cout << x is a function that returns cout. 21.When should references, and pointers are used? This watermark does not appear in the registered version - http://www.clicktoconvert.com

173 Use references when you can, and pointers when you have to. References are usually preferred over pointers whenever you don't need "reseating". This usually means that references are most useful in a class's public interface. References typically appear on the skin of an object, and pointers on the inside. The exception to the above is where a function's parameter or return value needs a "sentinel" reference. This is usually best done by returning/taking a pointer, and giving the NULL pointer this special significance . 22.What's the idea behind templates? A template is a cookie-cutter that specifies how to cut cookies that all look pretty much the same (although the cookies can be made of various kinds of dough, they'll all have the same basic shape). In the same way, a class template is a cookie cutter for a description of how to build a family of classes that all look basically the same, and a function template describes how to build a family of similar looking functions. 23.What is a "parameterized type"? Another way to say, "class templates." A parameterized type is a type that is parameterized over another type or some value. List is a type (List) parameterized over another type (int). 24.What is a "virtual member function"? From an OO perspective, it is the single most important feature of C++ A virtual function allows derived classes to replace the implementation provided by the base class. The compiler makes sure the replacement is always called whenever the object in question is actually of the derived class, even if the object is accessed by a base pointer rather than a derived pointer. This allows algorithms in the base class to be replaced in the derived class, even if users don't know about the derived class. The derived class can either fully replace ("override") the base class member function, or the derived class can partially replace ("augment") the base class member function. The latter is accomplished by having the derived class member function call the base class

member function, if desired. 25.How can C++ achieve dynamic binding yet also static typing? When you have a pointer to an object, the object may actually be of a class that is derived from the class of the pointer (e.g., a Vehicle* that is actually pointing to a Car object; this is called "polymorphism"). Thus there are two types: the (static) type of the pointer (Vehicle, in this case), and the (dynamic) type of the pointed-to object (Car, in this case). Static typing means that the legality of a member function invocation is checked at the earliest possible moment: by the compiler at compile time. The compiler uses the static type of the pointer to determine whether the member function invocation is legal. If the type of the pointer can handle the member function, certainly the pointed-to object can This watermark does not appear in the registered version - http://www.clicktoconvert.com

174 handle it as well. E.g., if Vehicle has a certain member function, certainly Car also has that member function since Car is a kind-of Vehicle. Dynamic binding means that the address of the code in a member function invocation is determined at the last possible moment: based on the dynamic type of the object at run time. It is called "dynamic binding" because the binding to the code that actually gets called is accomplished dynamically (at run time). Dynamic binding is a result of virtual functions. 26.what is the difference between overloading and overriding? overloading occurs when two or more methods use the same name (but different parameter lists),such that both methods are available side by side in the sam eclass;overriding occurs when a subclass method and a superclass method use yje same name (and matching parameter lists),such that the subclass mathod blocks out the superclass method. 27.what is the syper keyword is used for? The super keyword gives a class explicit access to the constructors ,methods ,and variables of its superclass. 28.what is a final class? A final class is a classs that is declared with final keyword so that it can never be subclassed. using final keyword in thwe java language has the general meaning that you define an entity once and cannot change it or derve from it later.more specifically: >a final class cannot be subclassed. >a final method cannot be overridden. >a final variable cannot change from its initialized value. 29.what is an interface? In java language,an interface is like a stripped-down class:it specifies a set of methods that an instance must handle,but it omits inheritance relations and method implementations. 30.How is an abtract class different from concrete class and interface? concrete class: i)specifies the full set of methods for an object. ii)implements all of its methods. iii)can have instances. iv)can have subclasses. This watermark does not appear in the registered version - http://www.clicktoconvert.com

175 Abstract class: i)specifies the full set of methods for an object ii)implements none,some ,or all of its methods. iii)can't have instances. iv)must have subclasses;useless without them Interface: i)specifies a subset of methods for an object. ii)implements none of its methods. iii)can't have instances iv)can't have subclasses;must have classes that implement it;useless without them. 31.what is the instanceof keyword ,and what does it do? The instanceof keyword is a two -argument operator that tests whether the run time type of its first argument is assignment compatible with its second argument. 32.what are packages and what are they used for? A package is a collection of classes and inetrfaces that provides a high-level layer of access protection and name space management. 33.why f is put after a floating point constant? The f suffix directs the compiler to create a float value from a sequence of characters representing a floating point number (a float literal) - otherwise the compiler would by default create either a double value or an int values. 34.Are java objects are pointers? no:java objects are objects (typed containers for data plus associated methods ) but the java language lets you access objects only via references. 35.In a method invocation,does java pass arguments by reference or by value? java method invocations pass arguments by value. java methods invocations pass all arguments by value-primitive data types and reference data types alike.This means that the value of each argument is copied to a corresponding method-internal variable before the method's code is executed.The method 's code can act only on this internal copy of the data. 36.what is an exception? An exception is a condition (typically an error condition ) that transfers program execution from a thrower ( at the source of the condition) to a catcher ( handler for the condition ); information about the condition is passed as an exeception or error object. This watermark does not appear in the registered version - http://www.clicktoconvert.com

176 37.what is the difference between a runtime exception and a plain exception -why don't runtime exceptions have to be declared? The java laguage specifies that all runtime exceptions are exempted from the standard method declarations and compiler checks.such exceptions belong more to the system as a whole than to the method that happens to be executing when the exception is thrown. 38.when,and by whom ,is the main method of a class invoked? The java virtua machine invokes the main method of java class as its starting point of execution. 39.what are bytecodes? Java bytecodes encode are a language of machine instructions understood by the java virtual machine and usually generated (compiled) from java language source code.

40.what is an applet? An applet is a java -compatible program that you can embed in a web page. it is usually composed of several pieces : 1.code :-the java class files that represent the executable code of the applet. 2.other resources:-data needed by the applet,including images and sounds. 41.How do applets differ from applications? Applications are stand-alone,full-featured programs ,whereas applets are embeddable,almost full-featured programs. 42.what is an abstract class? An abstract class is a class designed with implementation gaps for subclasses to fill in.an abstract class is deliberately incomplete. it defines a skeleton that various subclasses can flesh out with custoomized implementation details.The different concrete subclasses provide a family of variants. 43.what is an object reference? An object reference is essentially an object pointer with strong restrictions for integrity and safety.An object reference provides two kinds of information to be the run-time system: 1.a pointer to the object's instance information -its data. 2.a pointer to the object's class information -its run-time type and its table of methods. 44. What is the difference between an instance variable and class variable? This watermark does not appear in the registered version - http://www.clicktoconvert.com

177 An instance variable represents a separate data item for each instance of a class, whereas a class variable represents a single data item shared by the class and all its instances. 45. What are the component and container classes? Component and container are the two fundamental classes in the abstract window toolkit(AWT).component instances represent individual user-interface elements and container instances represent groups of components. This watermark does not appear in the registered version - http://www.clicktoconvert.com

178

Part – V

DATA STRUCTURES AND

ALGORITHM This watermark does not appear in the registered version - http://www.clicktoconvert.com

179

Topics Covered:Data Structures and Algorithm - Kiran Thomas and Sumitra K S This watermark does not appear in the registered version - http://www.clicktoconvert.com

180 1. Overview of Data Structure Data Structure encompasses the study in the following areas: Machines that holds data / executing algorithms Languages for describing data manipulation / algorithms Foundation of algorithms Structures for representing data & Analysis of algorithms A data type is a well-defined collection of data with a well-defined set of operations on it. A data structure is an actual implementation of a particular abstract data type. Abstraction can be thought of as a mechanism for suppressing irrelevant details while at the same time emphasizing relevant ones. 2. What is an algorithm? An algorithm is a finite set of instructions, which accomplish a particular task. Every algorithm must satisfy the following criteria: Input – Zero or more quantities which are externally supplied Output – at least one quantity is produced Definiteness – Each instruction must be clear and unambiguous Finiteness – If we trace out the instructions of an algorithm, then for all cases the algorithm will terminate after a finite number of steps Effectiveness – Every instruction must be sufficiently basic that a person using only pencil and paper can in principle carry it out. If is not enough that each operation be definite but it must also be feasible. The logical or mathematical model of organization of data is called a data structure. A data structure describes not just a set of data but also how they are related. Data Structures and algorithms should be thought of as a unit, neither one makes sense without the other. 3. Types of Data Structures There are several data structures that are available and the choice of an appropriate one depends on the requirement. Some of the data structures that are commonly used include: Arrays Linked Lists Stacks Queues Trees Graphs 4. Static vs. Dynamic Data Structures

A static data structure has a fixed size. Arrays are static; once you define the number of elements it can hold, the number doesn’t change. A dynamic data structure grows and This watermark does not appear in the registered version - http://www.clicktoconvert.com

181 shrinks at execution time as required by its contents. A dynamic data structure is implemented using linked lists. 5. Introduction to Arrays The simplest type of data structure is a linear array and most often it is the only data structure that is provided in any programming language. An array can be defined as a collection of homogenous elements, in the form of index/value pairs, stored in consecutive memory locations. An array always has a predefined size and the elements of an array are referenced by means of an index / subscript. 6. Memory Organization for an array If A is an array of 5 integer elements declared as: int A[5] = {1, 2, 3, 4, 5}; Then the memory organization for this array is as shown below: The elements of the array are always stored in consecutive memory locations. The elements are accessed by adding the index with the base address and this being a pointer arithmetic will always fetch the appropriate address depending on the type of data. 7. Multi Dimensional Arrays An array can be of more than one dimension. There are no restrictions to the number of dimensions that we can have. But as the dimensions increase the memory requirements increase drastically which can result in shortage of memory. Hence a multidimensional array must be used with utmost care. For example the following declaration int a[50][50][50] will result in occupying 50 * 50 * 50 = 1,25,000 * 4 = 5,00,000 bytes of memory. 8. Memory Organization The memory arrangement for a multidimensional array is very similar to linear arrays. Only the way in which we access the data varies i.e. if it is a 2 dimensional array for example, then it can be accessed by using 2 indexes one, which represents the row and the other representing the column. If A is a 2 dimensional array defined like int arr[3][3]; The maximum elements that this array can hold is 3 * 3 which is 9 elements and the This watermark does not appear in the registered version - http://www.clicktoconvert.com

182 elements of this array can be referred as arr[0][0], arr[0][1], arr[0][2], arr[1][0], arr[1][1]….arr[2][2]. 9. Advantages Searching is faster as elements are in continuous memory locations Memory management is taken care of by the compiler itself Limitations Number of elements must be known in advance Inserting and Deletions are costlier as it involves shifting the rest of the elements 10. Introduction to Pointers A pointer is nothing but a variable, which can hold the address of another variable.

Memory Representation of a variable int * p; p = &a; This stores the address of the variable ‘a’ in the pointer ‘p’, 1000 in our example. 11. De-referencing a Pointer A pointer is a datatype whose value is used to refer to ("points to") another value stored elsewhere in the computer memory. Obtaining the value that a pointer refers to is called Memory Representation of a Pointer Variable dereferencing the pointer. In the above example the * operator is used to get the value of the pointer p. 12. Assigning a value through pointer This watermark does not appear in the registered version - http://www.clicktoconvert.com

183 Whenever a value is assigned to a pointer the * operator has to be used so that it gets stored in the address pointed by p. Whenever an address is assigned do not use the * as we need to store an address only inside p and not at where p is pointing to. *p = 50; Observe the results of the following printf statements printf(“P = %u”, p); printf(“&P = %u”, &p); printf(“*p = %d”, *p); printf(“a = %d”, a); The results of the first two printf statements are exactly similar to the previous ones. But the third printf statement prints the value 50. Surprisingly the last printf statement also prints 50 and not 10 which was assigned earlier to a. This is because *p refers to the place where p is pointing. In other words *p denotes that the value will be stored on the address pointed by p which is the same as the address of the variable a. Thus with the help of the pointer it is possible to change the value of the variable/memory location it is pointing to. If p is a pointer and a is a variable defined as int *p; int a; then the following assignments are invalid : *p = &a; // as *p can hold only a value and not an address p = a; // as P can hold only an address and not a value Note: *(&a) is equal to a. [ *(&a) = a ] this means the content of the address of ‘a’ is nothing but ’a’ itself. If p is a pointer then, Whenever we refer the pointer as p we are referring to the address that p is pointing to. Whenever we refer the pointer as &p we are referring to the address of the pointer variable. Whenever we refer the pointer as *p we are referring to the value pointed by the address that the pointer contains. Note: When more than one value needs to be returned from a method, pointers are the only way through which it is possible. So from the above example let us look into the definition of Passing by value and passing by reference

13. Pointers and Arrays Having seen how to use pointer with variables let’s see how pointers can be used with arrays. Just like a pointer can point to a variable, it can also point to an array. It really doesn’t matter as to what it is pointing to as long as a valid address is assigned to a pointer. 14. Pointer Arithmetic This watermark does not appear in the registered version - http://www.clicktoconvert.com

184 Just like a normal variable, increment / decrement of a pointer variable is a valid operation. That is if we say P++ or P += 2 what does that it mean? Whenever a pointer variable is incremented the number of bytes that gets incremented depends on the data type. If it is an integer pointer and if we say P++, then the pointer moves by 2 bytes in a turbo C compile and 4 bytes in Microsoft Visual Studio. i.e. it points to the next valid address. Similarly if it is a character pointer, then incrementing the pointer will increment it by one byte. This is referred to as pointer arithmetic. Having understood pointer arithmetic how do we print the subsequent elements from an array? There are various ways for doing it. One way is to increment the pointer and then print the value again. This we can repeat till the end of the array is reached i.e. p++; printf(“%d”, *p); This time it prints the second element of the array, as the pointer is incremented. For printing subsequent elements the above 2 steps can be repeated. The more easier and elegant way of writing it is as follows: for(i = 0; i< 5; i++) { printf(“Value = %d\n”, *(p+i)); } The above loop will print the values of all the elements in the array. Here we keep the base address of the pointer as constant and adding the value of i to p (pointer arithmetic) and hence we will be referring to the elements of the array properly. It is also possible to write the above block as follows for(i =0; i< 5; i++) { printf(“Value = %d\n”, p[i]); } Yes. We can access a pointer just like an array. Similarly we can access an array like a pointer i.e. the following reference to the array is perfectly valid. for(i =0; i< 5; i++) { printf(“Value = %d\n”, *(arr + i)); } In other words, arrays are nothing but constant pointers. The difference between arrays and pointers are: An array has a fixed base address whereas a pointer’s can point to different addresses at different points of time A pointer can be incremented / decremented where as an array cannot be

Incremented/decremented. 15. Pointers and multi dimensional arrays As the internal representation of a multi dimensional array is also linear. A pointer can point to an array of any dimension. Only the way in which we access the pointer varies This watermark does not appear in the registered version - http://www.clicktoconvert.com

185 according to the dimension. For example if arr is a2 dimensional array of 3 rows and 3 cols int arr[3][3] = {1, 2, 3, 4, 5, 6, 7, 8, 9}; then the following assignment is valid to make int* p = arr; The elements can be accessed using the following loop for(i =0; i< 9; i++) { printf(“Value = %d\n”, *(p+i)); } 16. Character Pointers Just like an integer pointer a character pointer is a pointer, which is capable of holding the address of a character variable. Suppose if we have a character array declared as char name[30] = {“Data Structures”}; and if we need to store the address of this array we need a character pointer declared as follows char *p = NULL; Once the pointer is declared we need to assign the address of the array to this pointer. We know that when an array is referenced by its name we are referring to the address of the 0th element i.e. p = name; will assign the address of the 0th element to p. Pointers can be used interchangeably with arrays. 17. Memory Allocation So far we have been discussing about a pointer pointing to another variable or array. Can’t we directly make use of pointers? i.e. int *p; followed by scanf(“%d”, p); Note that there is no ‘&’ in the scanf for the pointer as referring a pointer by its name actually refers to the address it points to. The above statements are treated valid by the compiler and the code might execute properly as well. But this is really a dangerous programming. Because p is a pointer and we have not assigned anything to it, it might be having some value in it. Trying to store a value in that address can cause serious damage to the system, as we are not aware of what p will be pointing to. So it is always a better programming practice to initialize any pointer declaration to NULL i.e. int *p = NULL; This ensures that the pointer is not pointing to anywhere. Even after this we should not accept values from the user because there is no valid memory location for storing this value. In such cases before we start using the pointer we should allocate memory to it. Malloc() is the function used for memory allocation in C. malloc() takes one argument that is the number of bytes to allocate. Suppose P is a pointer then int *p;

p = (int *)malloc(sizeof(int)); This watermark does not appear in the registered version - http://www.clicktoconvert.com

186 The above assignment will allocate memory for holding an integer value. Remember whenever malloc is used for memory allocation the system allocates memory from the heap and not from the stack. So it is the responsibility of the user to free the memory that is allocated. For freeing the allocated memory the function used is, free(<pointer variable>); 18. Allocating memory for arrays int *p = (int*)malloc(10 * sizeof(int)); The above statement will allocate memory for 10 integer elements. 19. Pointer To Pointers Most often in real time situations the number of rows as well as columns might vary dynamically. This being the case the above approach cannot be used. In such situations we can make use of pointer to pointers. A pointer to pointer is nothing but a pointer, which is capable of holding the address of another pointer variable in it. As discussed earlier a pointer is also a variable and hence it is quite possible to hold the address of the pointer also in another pointer. A pointer to pointer can be declared as, int **p; The above declaration means that p is a pointer, which can hold the address of another integer pointer ( a pointer which can hold the address of an integer ). As such there is no restriction imposed by the compiler as to how many levels we can go about in using a pointer. i.e. the following declaration is perfectly valid int *****p; However beyond 3 levels if we make use of pointers then it will make the code highly complex and un-maintainable. 20. Pointer and structures A pointer can be used with a structure just like any other data type. The only difference is in the way we access the data members of the structure. Suppose if we have a structure as typedef struct player{ Memory Organization of Array of Pointers This watermark does not appear in the registered version - http://www.clicktoconvert.com

187 char name[30]; int age; }player; player p1; Then we can access the members as, p1.name and p1.age Suppose if we change the declaration as Player* p1; then for accessing the members of the structure we can follow either one of the following approaches p1->name or (*p1).name p1->age (*p1).age 21. Function Pointers Just like variables have address functions also have address. This means that we can make use of a pointer to represent the address of a function as well. If we use a pointer to

represent the address of an integer we declare it as int* p. Similarly if it is a character pointer we declare it as char* p. But if it is a function pointer how can we declare it. Suppose we have a function as void add(int x, int y) { printf(“Value = %d”, x + y); } if we want to declare a pointer to represent this function we can declare it as void (*p)(int x, int y); The above declaration means that p is a function pointer that can point to any function whose return type is void and takes two integer arguments. The general syntax for a function pointer declaration will look like, () ( [variable name], [variable name]); Variable names in the argument list are optional. It is only the data type that matters. Having seen how we can declare a function pointer the next step is to assign the address of a function to this pointer. But how do we get the address of a function? Just like an array whenever we refer a function by its name we actually refer to the address of the function. This means that we can assign the function address as, p = add; Note that we are not specifying the function parenthesis for add. If we put a function parenthesis it means that we are calling a function. Now how can we use this function pointer? Wherever we can call add we can use this function pointer interchangeably i.e. we can say, (*p)(10, 20); Which will make a call to add and print the result. Since it is a pointer it can be made to point to different functions at different places. i.e. if we have another method sub declared as void sub(int x., int y) { This watermark does not appear in the registered version - http://www.clicktoconvert.com

188 printf(“Value = %d”, x - y); } Then we can say, p = sub and call it as (*p)(20, 10); This time it will make a call to sub and print the result as 10. 22. Advantages It gives direct control over memory and thus we can play around with memory by all possible means. For example we can refer to any part of the hardware like keyboard, video memory, printer, etc directly. As working with pointers is like working with memory it will provide enhanced performance Pass by reference is possible only through the usage of pointers Useful while returning multiple values from a function

Allocation and freeing of memory can be done wherever required and need not be done in advance 23. Limitations If memory allocations are not freed properly it can cause memory leakages If not used properly can make the program difficult to understand 24. What is Recursion? Recursion is one of the wonderful features provided by C. A function making a call to itself is called as recursion. This can be helpful in a lot of situation making the logic compact and easy to understand. When a function is going to call itself how will this recursive call get terminated? There must always be one condition based on which the recursion should terminate, otherwise it will go in an infinite loop resulting in a stack overflow error. Recursion is typically meant for programs where The problem itself is recursively defined, for example factorial The data structure that the algorithm operates is recursively defined, for example binary trees Most importantly whenever we need to describe a backtracking procedure 25. Back Tracking Algorithms Back tracking algorithms are those places where we need to repeat a process going back and forth. One famous example for back tracking is the eight-queen problem. The problem is to place 8 queens on a chessboard without clashing each other. If we do it This watermark does not appear in the registered version - http://www.clicktoconvert.com

189 normally on chessboard how we will go about doing it. First we can place up to 5 queens easily. When we are about to place the 6th queen we might not be having any place for it to be placed. In such situations we will take the 5th queen and place it on some other place and we will try to place the 6th one again. If we still couldn’t locate a place then we will replace the 4th queen also on a different place and then we will be trying all possible permutation combinations. Similarly we will be trying going back and forth till we successfully place all the queens. Such kinds of problems are called as back tracking problems. It is on these places recursion can play its role wonderfully and make the algorithm look compact without much of memory or performance overhead. 26. Introduction To Stacks A stack is an ordered list in which items are inserted and removed at only one end called the TOP. There are only 2 operations that are possible on a stack. They are the ‘Push’ and the ‘Pop’ operations. A Push operation inserts a value into the stack and the Pop operation retrieves the value from the stack and removes it from the stack as well. An example for a stack is a stack of plates arranged on a table. This means that the last item to be added is the first item to be removed. Hence a stack is also called as Last-In-First-Out List or LIFO list. To understand the usage of a stack let us see what does the system does when a method call is executed: Whenever a method call is executed the compiler needs to know the return address for it to resume from the place where it left off in the calling method. 27. Graphical Representation A graphical representation of a stack is shown below: Graphical Representation of a Stack

28. Memory Organization of a Stack This watermark does not appear in the registered version - http://www.clicktoconvert.com

190 The memory organization of stack is very similar to that of a linear list. 29. Programmatic representation of a stack typedef struct node { int iData; struct node* pNext; }node; node* pTop = NULL; The node of a stack is very similar to the node in a list. The difference is only in the way the data is organized. As discussed there are only 2 operations permitted on a stack: the Push and the Pop Memory organization of a Stack 30. Push Operation A push operation is for inserting an element in to a stack. Elements are always inserted in the beginning of a stack. 31. Steps for the push operation Create a new node Make the new node’s next point to the node pointed by Top Make the top point to the new node void push(int iData) { node* pNewNode = (node *)malloc(sizeof(node)); pNewNode->iData = iData; pNewNode->pNext = pTop; pTop = pNewNode; } This watermark does not appear in the registered version - http://www.clicktoconvert.com

191 If you had a careful look over this code this is nothing but equal to inserting an element at the beginning of a list. 32. Pop Operation A pop operation is basically used for retrieving a value from the stack. This also removes the element from the Stack. int pop() { int iData; node* pTempNode = NULL; if (pTop == NULL) { iData = -1; printf(“Stack Empty”); } else {

pTempNode = pTop; iData = pTop->iData; pTop = pTop->pNext; free(pTempNode); } return iData; } 33. Application of Stacks Evaluating Expressions In normal practice any arithmetic expression is written in such a way that the operator is placed between its operands. For example (A + B) * (C + D) Such kind of expressions is called as infix notation. Polish notation refers to the notation in which the operator is placed before the operands. For example the above expression can be written as *+AB+CD The idea is, whenever we write expressions in this notation, parenthesis are not required for determining the priority of the expressions. Let us see the steps involved in the conversion (A+B)*(C+D) = (+AB)*(+CD) = *+AB+CD Reverse polish notation is exactly the opposite of polish notation i.e. the operator is always placed after the operands. For example, the above expression can be written in Reverse Polish Notation as (A+B) * (C+D) = (AB+) * (CD+) = AB+CD+* Whenever an expression is evaluated by the system it usually performs it by means of 2 steps. First it converts any expression into prefix or postfix notation and then it evaluates the expression as it makes the job much simpler. Converting an infix expression to postfix or prefix expression makes use of This watermark does not appear in the registered version - http://www.clicktoconvert.com

192 stacks extensively. The following is the algorithm for doing this conversion. A complete program requires a lot more than what is described in the algorithm. PostFixExpression ConvertToPolishNotation(InfixExpression) { 1. Push “(“ onto the STACK and add “)” to the end of the InfixExpression 2. Scan the InfixExpression from left to right and repeat steps 3 to 6 for each element of InfixExpression until the stack is empty 3. If an operand is encountered, add it to Postfix Expression 4. If a left parenthesis is encountered, push it to STACK 5. If an operator XX is encountered a. Pop from STACK repeatedly and add it to Postfix expression which has the same/ higher precedence than XX. b. Add XX to STACK 6. If a right parenthesis is encountered, then a. Pop from STACK repeatedly and add it to Postfix Expression until a left parenthesis is encountered. b. Remove the left parenthesis 7. Exit } 34. Introduction To Queues A Queue is an ordered list in which all insertions can take place at one end called the rear and all deletions take place at the other end called the front. The two operations that are

possible in a queue are Insertion and Deletion. A real time example for a queue is people standing in a queue for billing on a shop. The first person in the queue will be the first person to get the service. Similarly the first element inserted in the queue will be the first one that will be retrieved and hence a queue is also called as First In First Out or FIFO list. 35. Graphical Representation Graphical Representation of a Queue This watermark does not appear in the registered version - http://www.clicktoconvert.com

193 36. Programmatic representation of a Queue The node of a queue will have 2 parts. The first part contains the data and the second part contains the address of the next node. This is pretty much similar to a linear list representation. But in a queue insertions and deletions are going to occur on two different ends. If we have only one pointer called START then for every insertion we need to traverse the complete queue as insertions are always on the end and hence will be time consuming. To avoid this we are going to have 2 pointers to represent a queue, one pointer is called the FRONT which will always be pointing to the first element and is used for deletions, and the other pointer called REAR which will always be pointing to the last element in the queue and is used in insertions. typedef struct node { int iData; struct node* pNext; }node; node* pFRONT = NULL, *pREAR = NULL; When the queue is empty, both FRONT and REAR will be pointing to NULL. When there is only one element in the queue, then both FRONT and REAR will be pointing to the same element. 37. Steps for inserting an element into a Queue Inserting into a queue can happen only at the REAR end. 1. Create a new node 2. Make the next of new node as NULL as it is the last element always 3. If the queue is empty, then make FRONT point to new node 4. Otherwise make the previous node’s next ( the node pointed by REAR is always the previous node ) point to this node 5. Make Rear point to new node void QInsert(int iData) { This watermark does not appear in the registered version - http://www.clicktoconvert.com

194 node* pTempNode = getNode(iData); if ( pFRONT == NULL ) { //If the queue is empty pFRONT = pTempNode;

} else { pREAR->pNext = pTempNode; } pREAR = pTempNode; } 38. Steps for deleting an element from the queue The deletion is the only way through which we can retrieve values from a queue. Along with retrieving the value this will remove the entry from the queue. Deletions always happen in the FRONT end. 1. Store the node pointed by FRONT in a temporary pointer 2. Make Front as Front’s next 3. Delete the node pointed by temporary pointer 4. If all the nodes are deleted from the queue make the FRONT and REAR as NULL int QDelete() { int iData = -1; node* pDelNode = NULL; if (pFRONT == NULL) { printf(“Queue Empty”); } else { iData = pFRONT->iData; pDelNode = pFRONT; pFRONT = pFRONT->pNext; if (pFRONT == NULL) { pREAR = NULL; } free(pTempNode); } return iData; } 39. Deque This watermark does not appear in the registered version - http://www.clicktoconvert.com

195 A Deque is a queue in which insertions and deletions can happen at both ends of a queue. A deque, or double-ended queue is a data structure, which unites the properties of a queue and a stack. Like the stack, items can be pushed into the deque; once inserted into the deque the last item pushed in may be extracted from one side (popped, as in a stack), and the first item pushed in may be pulled out of the other side (as in a queue). 40. Implementation of Deque The push (insert/assign) and pop operation is done at both the end that is start and end of the deque. The following pictures show how a deque is formed based on this change in

algorithm. Initially the base and end pointer will be pointing to NULL or 0 (zero). We define two pointers, p_base and p_end to keep track of front and back of the deque. Initially when the deque object is created, both p_base and p_end would point to NULL. When the first node is created, p_base assumes the position and p_end starts pointing to p_base. The next and previous pointer of p_base assumes NULL or 0(zero). Further insertions happen at p_end. 41. Priority Queues A priority queue is the one in which each element will have a priority associated with it. The element with the highest priority is the one that will be processed/deleted first. If two or more nodes have the same priority then they will be processed in the same order as they were entered in to the queue. 42. Application of Queues The most common occurrence of a queue in computer applications is for scheduling of print jobs. For example if several people are giving print requests, all the request get queued up in the printer and is processed on first come first serve basis. Priority queues are used for job scheduling by the operating system. The operating system will assign a priority for every process running currently in the system. The jobs with the highest priory are the ones which are taken up by the operating system for processing first and once all the jobs in that priority are over it will try to find the job with the next priority and so on. 43. Evolution of Linked Lists In real time systems most often the number of elements will not be known in advance. The major drawback of an array is that the number of elements must be known in advance. Hence an alternative approach was required. This give rise to the concept called linked lists. A linear list is a linear collection of data elements called nodes, where the linear order is given by means of pointers. The key here is that every node will have two parts: first part contains the information/data and the second part contains the link/address of the next node in the list. Memory is allocated for every node when it is actually required and will be freed when not needed. This watermark does not appear in the registered version - http://www.clicktoconvert.com

196 44. What is a Structure in C? Aggregate data types built using elements of other types. Example: struct Time { int hour; int minute; int second; }; 45. Self-referential structure Self-referential structures contain pointers within the structs that refer to another identical structure. Linked lists are the most basic self-referential structures. Linked lists allow you to have a chain of structs with related data. 46. Graphical Representation of Linked List A graphical representation of a node will look like This means that for traversing through the list we need to know the starting point always. This is achieved by having a START pointer always pointing to the first element in the

list. Programmatic representation of a node typedef struct node This watermark does not appear in the registered version - http://www.clicktoconvert.com

197 { int iData; struct node* pNext; }node; node* pSTART = NULL; The node structure contains the data and the pointer for the next node. The address of the next node must again be of type node and hence we have a pointer to the node within node itself. Such kind of structures which refers to itself are called as Re-Directions or Self Referential1 Structures. 47. Inserting a node into a list Insertion logic varies depending on where in the list the element is going to be inserted, first, middle or at the end. In all situations it is as simple as making the pointer point to different nodes and hence insertion at any place will be much faster. 48. Steps for inserting an element in the beginning of a list 1. Create a new node 2. Make the next part of new node point to the node pointed by START 3. Make the START pointer point to new node void add_begin(int iData) { node* pNewNode = (node*)malloc(sizeof(node)); pNewNode->iData = iData; pNewNode->pNext = pSTART; pSTART = pNewNode; } 49. Steps for inserting an element in the middle 1. Create a new node 2. Make the next part of new node point to the next of previous node 3. Make the previous node point to new node void add_Middle(int iData, int iLoc) { int iPos = 1; node* pTempNode = NULL; node* pNewNode = NULL; if (iLoc == 1) { add_begin(iData); } else { This watermark does not appear in the registered version - http://www.clicktoconvert.com

198 for (pTempNode = pSTART; pTempNode->pNext != NULL && iPos < iLoc;

pTempNode = pTempNode->pNext, iPos++); if (iPos == iLoc) { pNewNode = (node*)malloc(sizeof(node)); pNewNode->iData = iData; pNewNode->pNext = pTempNode->pNext; pTempNode->pNext = pNewNode; } } } 50. Steps for inserting an element in the end 1. Create a new node 2. Make the next of new node point to NULL 3. Make the previous node point to new node void add_End(int iData) { int iPos = 0; node* pTempNode = NULL; node* pNewNode = NULL; for (pTempNode = pSTART; pTempNode->pNext != NULL; pTempNode = pTempNode->pNext); pNewNode = (node*)malloc(sizeof(node)); pNewNode->iData = iData; pNewNode->pNext = NULL; pTempNode->pNext = pNewNode; } 51. Steps for deleting a node in the beginning The next step is deleting a node from a list. Again deletion logic varies depending on from where the node is getting deleted, beginning, middle or at the end. 1. Store the node to be deleted in a temporary pointer 2. Make START point to next node in the list 3. Delete the node pointed by temporary pointer 52. Steps for deleting a node from the middle / end 1. Store the node to be deleted in a temporary pointer 2. Make the previous node’s next point to the next of the node that is being deleted 3. Delete the node pointed by temporary pointer void delete(int iData) { This watermark does not appear in the registered version - http://www.clicktoconvert.com

199 node* pTempNode = NULL; node* pDelNode = NULL; node* pPrevNode = NULL; for(pTempNode = pPrevNode = pSTART; pTempNode != NULL; pPrevNode = pTempNode, pTempNode = pTempNode->pNext) {

if (iData == pTempNode->iData) { pDelNode = pTempNode; pPrevNode->pNext = pDelNode->pNext; if (pDelNode == pSTART) { pSTART = pSTART->pNext; } free(pDelNode); } } } 53. Applications of Linked Lists A linked list is used typically in computer applications for memory allocations. The system keeps tracks of all the memory available in the system with the help of a list. This area of memory is called as the memory pool. As and when the user requests for memory the system allocates memory to the user from this available pool. Once allocated these blocks will be deleted from the available list. Once the user frees the memory again the memory will be added to the list. In general linked lists are used in almost all places whenever a collection of elements are required and when the number of elements in the collection is not known in advance. Advantages The number of elements in the linked lists need not be known in advance. As and when required elements can be added to a list Insertions and deletions are much faster in a linked list as there is no physical movement of elements of a list. Limitations Searching a list is always sequential and hence it is a time consuming operation Traversal is always possible in only one direction. Circular Linked Lists Circular linked lists are very similar to a linear list except that the last node will be pointing to the first node again instead of NULL. So care should be taken while doing the This watermark does not appear in the registered version - http://www.clicktoconvert.com

200 traversal to avoid infinite loops. A graphical representation of a circular linked list is as follows Assuming that someNode is some node in the list, this code iterates through that list starting with someNode: Forwards node := someNode do <do something with node.value> node := node.next while node not someNode 54. Introduction To Doubly Linked Lists With the linked list traversal is possible in only one direction. This limitation is overcome

with the help of doubly linked lists. A doubly linked list is a list in which each node will have 3 parts: one part containing the information / data, one part containing the pointer/address of next node and one part containing the pointer / address of previous node. In addition to the START pointer, which contains the first node there will also be a TAIL pointer pointing to the last node in the list. Memory organization of a Doubly Linked List Programmatic Representation of a node in Doubly Linked Lists typedef struct node { This watermark does not appear in the registered version - http://www.clicktoconvert.com

201 int iData; struct node* pNext; struct node* pPrev; }node; node* pSTART = NULL, *pTAIL = NULL; The node of a doubly linked list will have 2 pointers apart from the data, one for pointing to the previous element and the other for pointing to the next element in the list. There will also be a TAIL pointer pointing to the last node in the list. It is possible for us to traverse from the beginning of a list till the end or from the end of the list to the beginning. The end of the list is identified with the help of a NULL value in the next / previous parts. 55. Inserting an element in a doubly linked list Inserting an element in a doubly linked list involves just changing the pointers unlike arrays where every element after the inserted element needs to be shifted once. Hence inserting an element in a doubly linked list is much faster than inserting a node in an array. 56. Steps for inserting in the beginning of a doubly linked list 1. Create a new node 2. Make the next part of new node equal to START 3. Make the previous part of new node equal to NULL 4. Make the previous part of the node pointed by START to new node 5. Make START point to new node node* getNode(int iData) { node* pNewNode = (node *)malloc(sizeof(node)); pNewNode->iData = iData; pNewNode->pNext = NULL; pNewNode->pPrev = NULL; } void addHead(int iData) { node* pNewNode = getNode(iData); pNewNode->pNext = pSTART; pSTART->pPrev = pNewNode; pSTART = pNewNode; }

57. Steps for inserting an element in the middle of a doubly linked list 1. Create a new node 2. Make the next part of new node equal to next part of the previous node 3. Make the previous part of new node equal to previous part of next node 4. Make the next part of previous node point to new node This watermark does not appear in the registered version - http://www.clicktoconvert.com

202 5. Make the previous part of next node point to new node void insertAt(int iLoc, int iData) { node* pNewNode = NULL; node* pTempNode = NULL; int iPos = 1; if (iLoc == 1) { addHead(iData); } else { for(pTempNode = pSTART; pTempNode->pNext != NULL && iLoc < iPos; pTempNode = pTempNode->pNext, iPos++); if (iLoc == iPos) { pNewNode = getNode(iData); pNewNode->pNext = pTempNode->pNext; pNewNode->pPrev = pTempNode; if (pTempNode->pNext != NULL) { pTempNode->pNext->pPrev = pNewNode; } else // If it is the last node { pTAIL = pNewNode; } pTempNode->pNext = pNewNode; } else { printf(“Invalid Position”); } } } 58. Steps for inserting an element at the end of a doubly linked list 1. Create a new node 2. Make the next part of the new node equal to NULL 3. Make the previous part of the new node equal to TAIL 4. Make the next part of the previous node equal to new node 5. Make TAIL equal to new node

void addTail(int iData) { node* pNewNode = NULL; This watermark does not appear in the registered version - http://www.clicktoconvert.com

203 node* pTempNode = NULL; pNewNode = getNode(iData); pNewNode->pPrev = pTAIL; pTail->pNext = pNewNode; pTAIL = pNewNode; } 59. Deleting an element from a doubly linked list Deleting a node from a list is as simple as changing the links. Hence deleting a node from a list is much faster when compared to arrays. Like insertion the deletion logic also varies depending on from where in the list we are going to delete the node. 60. Deletion in the beginning of a doubly linked list 1. Make the temporary pointer point to the node to be deleted 2. Make the START point to the next node of START 3. Make the previous of the next node equal to previous of the node to be deleted 4. Delete the node pointed to by temporary pointer void removeHead() { node* pDelNode = NULL; pDelNode = pSTART; pSTART = pSTART->pNext; if (pDelNode == pTAIL) // If it is the last element in the list { pTAIL = NULL; } else { pSTART->pPrev = NULL; } free(pTempNode); } 61. Deletion in the middle of a doubly linked list 1. Make the temporary pointer point to the node to be deleted 2. Make the next part of the previous node equal to next of the node to be deleted 3. Make the previous part of the next node equal to previous part of the node to be deleted 4. Delete the node pointed to by temporary pointer void deleteAt(int iLoc) { node* pTempNode = NULL; node* pDelNode = NULL; This watermark does not appear in the registered version - http://www.clicktoconvert.com

204

int iPos = 1; if (iLoc == 1) { removeHead(); } else { for (pTempNode = pSTART; iPos < iLoc && pTempNode->pNext != NULL; pTempNode = pTempNode->pNext, iPos++); if (iLoc == iPos) { pDelNode = pTempNode->pNext; pTempNode->pNext = pDelNode->pNext; if (pDelNode == pTAIL) { pTAIL = pTAIL->pNext; } else { pDelNode->pNext->pPrev = pTempNode; } free(pDelNode); } } } 62. Deletion at the end of a doubly linked list 1. Make the temporary pointer point to the node to be deleted 2. Make the next part of the previous node equal to next of the node to be deleted 3. Make the TAIL equal to the previous part of the node to be deleted 4. Delete the node pointed to by temporary pointer void removeTail() { struct node* pTempNode = NULL; struct node* pDelNode = NULL; if (pSTART == NULL) { printf(“List is Empty”); } else { for (pTempNode = pSTART; pTempNode->pNext != NULL; pTempNode = pTempNode->pNext); pDelNode = pTempNode->pNext; pTempNode->pNext = NULL; This watermark does not appear in the registered version - http://www.clicktoconvert.com

205 pTAIL = pTAIL->pNext;

if (pTAIL == NULL) { pSTART = NULL; } free(pDelNode); } } 63. Traversing a doubly linked list A doubly linked list can be traversed in both directions. void displayFromStart() { node* pTempNode = NULL; for(pTempNode = pSTART; pTempNode != NULL; pTempNode = pTempNode>pNext) { printf(“Data = %d\n”, pTempNode->iData); } } void displayFromTail() { node* pTempNode = NULL; for(pTempNode = pTAIL; pTempNode != NULL; pTempNode = pTempNode>pPrev) { printf(“Data = %d\n”, pTempNode->iData); } } 64. Application of Doubly Linked Lists A doubly linked list can be used in all the places where we use a linear list if the traversal happens frequently in both the directions. Advantages The number of elements in a doubly linked list need not be known in advance. As and when required elements can be added to a list Insertions and deletions are much faster as there is no physical movement of elements of a list. Traversal is possible in both the directions unlike a linear list where the traversal is possible in only one direction Limitations This watermark does not appear in the registered version - http://www.clicktoconvert.com

206 Takes up additional memory for storing the previous pointer in each node Makes the insertion and deletion logic a bit complex 65. Introduction To Trees A tree is a finite set of one or more nodes such that i) There is a specially designated node called the root ii) The remaining nodes are partitioned into n >= 0 disjoint sets T1, ..., Tn where each of

these sets is a tree. T1, ..., Tn are called sub trees of the root 66. Graphical Representation A tree is a non-linear data structure. This structure is mainly used to represent data containing a hierarchical relationship between elements like family tree, organization chart etc. Some of the commonly used terms and their definitions with respect to a tree are as follows: 67. Definitions Degree of a node The number of sub trees of a node is called its degree. In the diagram shown above the degree of A is 3, B is 3, C is 0 , D is 2, ... Terminal or Leaf Nodes Nodes that have degree 0 are called as Terminal or Leaf nodes. In other words the nodes that does not have any sub trees are called Terminal or Leaf nodes Siblings The children of the same parent are called as siblings This watermark does not appear in the registered version - http://www.clicktoconvert.com

207 Level The level of the root node is 1. If a node is at level L then its children are at level L + 1. Height or Depth of a Tree The height or depth of a tree is the maximum level of any node in the tree. Forest A forest is a set of n>= disjoint trees. 68. Introduction to Binary Trees A binary tree T is defined as a finite set of elements, called nodes, such that: T is empty or T contains a distinguished node R, called the root of T, and the remaining nodes of T forms an ordered pair of disjoint binary trees T1 and T2. If T does contain a root R, then the two trees T1 and T2 are called the left and right sub trees of R Graphical Representation A binary tree is said to be complete if all its levels except the last level have the maximum number of possible nodes and if all the nodes at the last level appear as far left as possible. 69. Binary Search Trees This watermark does not appear in the registered version - http://www.clicktoconvert.com

208 A binary tree T is called a binary search tree if each node of T has the property: The value at N is greater than every value in the left sub tree of N and is less than every value in the right sub tree of N. Programmatic Representation of a Binary Tree typedef struct node { int iData; struct node* pLeft; struct node* pRight; }node;

node* pROOT = NULL; 70. Inserting an element in a binary search tree Inserting an element in a binary search tree involves creating a new node and reorganizing the parent nodes link to accommodate the new node. Typically insertions will always be on a terminal node. We will never be inserting a node in the middle of a binary search tree. 71. Steps for inserting an element in a binary search tree 1. If the tree is empty, then make the root node point to this node 2. If the tree contains nodes then compare the data with node’s data and take appropriate path till the terminal node is reached 3. If data < node’s data then take the left sub tree otherwise take the right sub tree 4. When there are no more nodes add the node to the appropriate place of the parent node. node* getNode(int iData) { node* pNewNode = (node *)malloc(sizeof(node)); pNewNode->iData = iData; pNewNode->m_pLeft = NULL; pNewNode->m_pRight = NULL; return pNewNode; } void insert(node* pParent, int iData) { if (pParent == NULL) { pParent = getNode(iData); if (pRoot == NULL) This watermark does not appear in the registered version - http://www.clicktoconvert.com

209 { pRoot = pParent; } } else { if (iData < pParant->iData) { insert(pParent->pLeft, iData); } else { insert(pParent->pRight, iData); } } } // end of function … 72. Deleting an element from a binary tree The logic for deleting an element from a binary tree varies depending on from where we

are going to delete the node, whether it is a leaf node, a node with a single child or a node in the middle. If the node being deleted is a terminal node then deletion is a straightforward procedure of just making the parent node point to NULL. If the node being deleted has only one child then the parent node will have to point to the child node of the node being deleted. If the node being deleted has both the children then we first need to find the inorder successor of the node being deleted and replace the node being deleted with this node. (The inorder successor of a node can be obtained by taking the right node of the current node and traversing in the left till we reach the left most nodes.) 73. Steps for deleting an element from a binary search tree If the node N has no children, then change the location of this node in the parent node to NULL. If the node N has exactly one child, then change the location of this node, in the parent, to child of N. If the node N has 2 children, then a. Store N in a temporary pointer b. Determine the inorder successor of N ( an inorder successor is obtained by moving to the right node and finding the left most node of that node ) c. Make the left of this node point to left of N d. Make the right of this node point to right of N This watermark does not appear in the registered version - http://www.clicktoconvert.com

210 e. Make the parent of N point to this node. f. Delete the node pointed by temporary pointer node* delete(node* pParent, int iData) { node* pTempNode1, *pTempNode2; if (pROOT == NULL) { printf(“Tree is Empty”); return NULL; } else { if (iData == pParent->iData) { if (pParent->pLeft == pParent->pRight) // if it is a terminal node { free(pParent); return NULL; } else if (pParent->pLeft == NULL) // if there is only right child { pTempNode1 = pParent->pRight;

free(pParent); return pTempNode1; } else if (pParent->pRight == NULL) // if there is only left child { pTempNode1 = pParent->pLeft; free(pParent); return pTempNode1; } else //if it has both the child { pTempNode2 = pParent->pRight; // Get the inorder successor of pParent for (pTempNode1 = pParent->pRight; pTempNode1->pLeft != NULL; pTempNode1 = pTempNode1>pLeft); pTempNode1->pLeft = pParent->pLeft; free(pParent); return pTempNode2; } } if (iData < pParent->iData) { This watermark does not appear in the registered version - http://www.clicktoconvert.com

211 pParent->pLeft = delete(pParent->pLeft, iData); } else { pParent->pRight = delete(pParent->pRight, iData); } return pParent; } } // end of function delete … 74. Binary Tree Traversal A binary tree can be traversed in 3 ways namely Preorder Traversal / NLR Traversal a) Process the root b) Traverse the left sub tree in preorder c) Traverse the right sub tree in preorder Inorder Traversal / LNR Traversal a) Traverse the left sub tree in inorder b) Process the root c) Traverse the right sub tree in inorder Postorder Traversal / LRN Traversal a) Traverse the left sub tree in postorder

b) Traverse the right sub tree in postorder c) Process the root From the definition itself it is very clear that the implementation of this traversal algorithm involves recursion. 75. Code Snippet for Inorder Traversal void InOrder(struct node* pParent) { if ( pParent != NULL ) { InOrder(pParent->pLeft); printf("Data = %d\n", pParent->iData ); InOrder(parent->pRight); } } This watermark does not appear in the registered version - http://www.clicktoconvert.com

212 76. Code Snippet for Preorder Traversal void PreOrder(struct node* pParent) { if ( pParent != NULL ) { printf("Data = %d\n", pParent->iData ); PreOrder(pParent->pLeft); PreOrder(parent->pRight); } } 77. Code Snippet for Postorder Traversal void PostOrder(struct node* pParent) { if ( pParent != NULL ) { PostOrder(pParent->pLeft); PostOrder(parent->pRight); printf("Data = %d\n", pParent->iData ); } } 78. Application of trees A tree can be used in a wide variety of applications, which includes set representations, decisionmaking, game trees, etc. Within the context of programming language execution, compilers utilize tree structures to obtain forms of an arithmetic expression, which can be evaluated efficiently. The in-order traversal of the binary tree for an arithmetic expression produces the infix form of the expression, while the pre-order and post-order traversal lead to the prefix and postfix (reverse Polish) forms of the expression respectively. The advantage of reverse Polish notation is that arithmetic expressions can be represented in a way, which can be simply read from left to right without the need for parentheses. For example, consider the expression a * (b + c). This expression can be represented by the binary tree in the following way,

* /\ a+ /\ bc If we perform a post-order traversal in which we traverse the left branch of the tree, the right branch, followed by the node, we immediately recover the reverse Polish form of the expression, namely abc+*. This expression can then be evaluated using a stack. Starting from the left of the expression, each time an operand (one of the numerical values a, b or c in this example) is found, it is placed on the top of the stack. When an This watermark does not appear in the registered version - http://www.clicktoconvert.com

213 operator (* or +) is read, the top two elements are removed form the stack and the appropriate operator applied to them and the result placed on the stack. When the complete expression has been evaluated, the stack will contain a single item, which is the result of evaluating the expression. 79. Introduction To Graph Theory A graph G consists of two things: a) A set of elements called vertices / nodes / points V. b) A set of edges E, such that each edge e in E is identified with a unique pair [u,v] of nodes in V, denoted by e = [u, v] A graph can also be represented as G =(V, E) Before getting into the details of a Graph let’s see few of the commonly used terms with respect to graph and their definitions: 80. Adjacent Nodes If e = [u, v] then the nodes u and v are called endpoints of e and u and v are said to be adjacent nodes or neighbors. 81. Definitions Degree of a node The degree of a node u is the number of edges containing u. If deg(u) = 0, then the node is called as an isolated node. Path A Path P of length n from node u to v is defined as a sequence of n + 1 nodes. P = (V0, V1,....Vn) Closed Path A path P is said to be closed if V0 = Vn Simple Path The path P is said to be simple, if all the nodes are distinct, with the exception that v0 may equal v1, that is, P is simple if the nodes v0, v1, ... vn-1 are distinct and the nodes v0, v1, ...vn are distinct. Cycle A cycle is a closed simple path. Connected Graph A graph G is connected if and only if there is a simple path between any two nodes in G. Complete Graph A graph G is said to be complete if every node u in G is adjacent to every other node v in G. A graph G is said to be labeled, if its edges are assigned data. A This watermark does not appear in the registered version - http://www.clicktoconvert.com

214 graph is said to be weighted if each edge e in G is assigned a non-negative numerical value called the weight or length of e. Multiple edges Distinct edges e and e’ are called multiple edges if they connect the same end points. i.e. if e = [u, v] and e’ = [u, v]. Loops An edge e is called a loop if it has identical endpoints. i.e. e = [u, u] Multi Graph A graph, which contains multiple edges or Loops, is called as Multigraph M. Directed Graphs A directed graph G is the same as a Multigraph except that each edge e in G is assigned a direction. Out Degree The number of edges beginning at a node u is the outdegree of that node and the number of edges ending at the node e is the indegree of e. A node is called a Source if it has positive outdegree and zero indegree. A node is called as a Sink if it has a positive indegree and zero outdegree. 82. Memory Organization of a Graph A graph is represented in memory with the help of a linear list of all the nodes in the graph called as the node list. The adjacent nodes for every node are again maintained by means of a list, 1which is called as the adjacency list. Consider the following graph: This watermark does not appear in the registered version - http://www.clicktoconvert.com

215 Graphical Representation of Graph First prepare an adjacency list for every node which is as shown in the table: Table - Adjacency List for the graph Another way of representing graphs is by means of adjacency matrix Structural Representation of a Graph typedef struct listNode { struct node* pData; struct listNode* pNext; }listNode; The above structure is for the adjacency list. The data part represents a pointer to a node in the node list and the next part contains the link for the subsequent node in the adjacency list typedef struct node { int iData; struct node* pNext; struct listNode* pAdjNode; }node; The above structure is for the node list. It contains 3 parts. The first part contains the data, the second part contains a pointer to the next node in the node list and the third part contains a pointer to the adjacency list. 83. Inserting into graph

We can insert either a node or an edge in a graph. Inserting a node involves inserting an element in the node list and the adjacent part has to be set NULL as this will be an This watermark does not appear in the registered version - http://www.clicktoconvert.com

216 isolated node. Inserting an edge involves inserting an element in to the adjacency list of all the nodes, which are adjacent to that node. 84. Deleting from a graph Deleting an element from a graph again involves deleting an edge in which case the adjacency list of all the nodes adjacent to this node will have to be deleted and the node as such will have to be deleted from the node list. Deleting an edge involves deleting an element from the adjacency list of the node that is getting deleted. 85. Traversing a graph There are 2 ways by means of which a graph can be traversed namely: Breadth First Search Check the starting node A. The check all the neighbors of A. Then examine the neighbors of neighbors of A and so on. Depth First Search Check the starting node A. Then search each node N along the path P, which begins at node A. i.e., search neighbor of A, then neighbor of neighbor of A and so on. Once we reach the dead end then we backtrack on P until we can continue along another path P’ and so on. 86. Application of Graphs Graphs are used in a wide variety of applications. A graph can be used by an airlines for maintaining the flight information such as the various flights, their routes and the distance between the places, etc. With the help of such a graph one will be easily able to find out whether there is any flight for a particular destination and if there are several routes to reach that destination which one is having the optimum distance or cost, etc.

COLLECTED BY --B.RAJ--

Related Documents

Network
November 2019 46
Network
November 2019 47
Network
April 2020 49
Network
November 2019 52
Network
October 2019 48
Network
November 2019 71