Identifying Clones By Gaurav

  • November 2019
Identifying Clones in the Linux Kernel

By Gaurav Taywade

History
• Unix: 1969, Thompson and Ritchie, AT&T Bell Labs.
• BSD: 1978, Berkeley Software Distribution.
• GNU: 1984, Richard Stallman, FSF.
• Minix: 1987, Andrew Tanenbaum.
• Linux: 1991, Linus Torvalds, Intel 386 (i386).

Linux Features
• UNIX-like operating system.
• Preemptive multitasking.
• Virtual memory (protected memory, paging).
• Demand loading, dynamic kernel modules.
• TCP/IP networking.
• Open source.

What’s a Kernel?
• System monitor.
• Controls and mediates access to hardware.
• Schedules and allocates system resources.
• Enforces security and protection.
• Responds to user requests for service (system calls).

Kernel Design Goals
• Performance: efficiency, speed.
• Stability: robustness, resilience.
• Capability: features, flexibility, compatibility.
• Security, protection.
• Portability.

Linux kernels
• Consists of 538 .c and .h files, 279,118 lines of code (LOC).
• 42 file system implementations.
• Layered design.

Clone Code
• Copying code produces duplicated code snippets, called clones.
• Clones typically make up 5% to 10% of the code, and up to 50% in some cases.

Associated Problems
• Errors can be difficult to fix.
• Changes in requirements may be difficult to implement.
• Code size is increased unnecessarily.
• Can lead to unused, dead code.
• Can be indicative of design problems.
• Bugs may be copied as well.

Where do clones occur?
• Duplicated blocks within the same function.
• Cloned blocks across functions, files and directories.
• Similar functions in the same file.
• Functions cloned between files in the same directory.
• Functions cloned across directories.
• Cloned files.
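The locations listed above can be told apart mechanically once each cloned fragment is tagged with its file and enclosing function. The following is a minimal sketch under that assumption; the (file path, function name) tuple layout, the helper name and the example paths are all hypothetical and are not taken from the slides.

```python
from pathlib import PurePosixPath


def classify_clone_pair(a, b):
    """Classify a clone pair by where its two fragments live.

    Each fragment is a (file_path, function_name) tuple. This
    representation is an assumption made for the sketch.
    """
    path_a, func_a = PurePosixPath(a[0]), a[1]
    path_b, func_b = PurePosixPath(b[0]), b[1]
    if path_a == path_b:
        # Same file: either inside one function or across two functions.
        return "same function" if func_a == func_b else "same file"
    if path_a.parent == path_b.parent:
        return "same directory"
    return "different directories"


# Hypothetical fragments, for illustration only.
print(classify_clone_pair(("fs/foo/inode.c", "read_block"),
                          ("fs/foo/inode.c", "read_block")))
print(classify_clone_pair(("fs/foo/inode.c", "read_block"),
                          ("fs/foo/super.c", "read_super")))
print(classify_clone_pair(("fs/foo/inode.c", "read_block"),
                          ("fs/bar/inode.c", "read_block")))
```

Counting the labels over all detected pairs would give a breakdown by location.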

Frequency of Clone Types

The Clone Identification process

Case Study

Duplication Detection Techniques
• String based.
• Token based.
• Parse-tree based.
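To make the string-based technique concrete, here is a minimal sketch of simple line matching. It reports every run of consecutive non-blank lines that appears more than once after whitespace is trimmed. The function name, the window size of 3 and the sample C fragment are assumptions chosen for illustration; real tools use more elaborate normalization and indexing.

```python
from collections import defaultdict


def find_line_clones(source, window=3):
    """Return groups of 1-based starting line numbers whose next
    `window` non-blank lines are identical after trimming whitespace."""
    lines = [(number, text.strip())
             for number, text in enumerate(source.splitlines(), start=1)
             if text.strip()]
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        key = tuple(text for _, text in lines[i:i + window])
        seen[key].append(lines[i][0])
    # Keep only windows that occur at least twice.
    return [starts for starts in seen.values() if len(starts) > 1]


# Hypothetical C fragment with one duplicated three-line block.
SAMPLE = """\
int a(void) {
    x = read();
    if (x < 0)
        return -1;
    return x;
}
int b(void) {
    x = read();
    if (x < 0)
        return -1;
    return 0;
}
"""

print(find_line_clones(SAMPLE))  # [[2, 8]]
```

Token-based and parse-tree-based techniques follow the same idea but compare token sequences or syntax subtrees instead of raw lines.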

The Method
Two feasible approaches to obtaining information for several platforms:
• Preprocess and parse the source code with different configurations.
• Adopt a fictitious reference configuration.

The Method
Values that a preprocessor switch can assume:
• Y: the code is included in the compiled kernel.
• N (commented-out switch): the code is excluded.
• M: a dynamically loadable module is produced.
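The three switch values can be read straight out of a kernel configuration file. The sketch below assumes the usual .config layout, where enabled switches appear as CONFIG_NAME=y or CONFIG_NAME=m and disabled ones as a comment of the form "# CONFIG_NAME is not set". The function name and sample text are illustrative, and string or numeric options are deliberately ignored.

```python
import re

SET_RE = re.compile(r"^(CONFIG_\w+)=(y|m)$")
UNSET_RE = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_switches(text):
    """Map each tristate switch to 'Y', 'M' or 'N'."""
    switches = {}
    for line in text.splitlines():
        line = line.strip()
        match = SET_RE.match(line)
        if match:
            switches[match.group(1)] = match.group(2).upper()
            continue
        match = UNSET_RE.match(line)
        if match:
            # A commented-out switch means the code is excluded.
            switches[match.group(1)] = "N"
    return switches


# Illustrative configuration fragment, not taken from the slides.
SAMPLE_CONFIG = """\
CONFIG_EXT2_FS=y
CONFIG_NFS_FS=m
# CONFIG_MINIX_FS is not set
CONFIG_CMDLINE="quiet"
"""

print(parse_switches(SAMPLE_CONFIG))
```

A clone analysis could use such a mapping to decide which source files belong to a given configuration.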

The Method

1) Function identification.
2) Clone identification.

The Method

1) Function Identification.

The Method
2) Clone Identification
• The number of passed parameters.
• The number of LOC.
• The cyclomatic complexity.
• The number of used/defined local variables.
• The number of used/defined non-local variables.
• The number of arithmetic and logical operators.
• The number of function calls and return/exit points.
• The number of structure pointer access fields.
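One way to use the metrics listed above is to treat functions with identical metric vectors as clone candidates. The slides do not state the exact comparison rule, so exact equality is an assumption here, and the function names and metric values are invented for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Metrics(NamedTuple):
    params: int               # number of passed parameters
    loc: int                  # lines of code
    cyclomatic: int           # cyclomatic complexity
    local_vars: int           # used/defined local variables
    nonlocal_vars: int        # used/defined non-local variables
    operators: int            # arithmetic and logical operators
    calls_and_exits: int      # function calls and return/exit points
    struct_ptr_accesses: int  # structure pointer access fields


def group_by_metrics(functions):
    """Group function names that share exactly the same metric vector."""
    groups = defaultdict(list)
    for name, metrics in functions.items():
        groups[metrics].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


# Hypothetical functions and metric values.
FUNCTIONS = {
    "foo_read": Metrics(2, 14, 3, 4, 1, 6, 5, 2),
    "bar_read": Metrics(2, 14, 3, 4, 1, 6, 5, 2),
    "baz_write": Metrics(3, 20, 5, 2, 0, 9, 4, 1),
}

print(group_by_metrics(FUNCTIONS))  # [['bar_read', 'foo_read']]
```

Because the comparison only looks at counts, two unrelated functions can share a vector, which is why such candidates still need inspection.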

Results: Accuracy
Number of false matches:
• Parameterized suffix tree matching and simple line matching find no false matches.
• Parameterized line matching finds few false matches.
• Metrics-based matching finds many false positives when metrics are applied to block fragments, but only a few when they are applied to methods.

Results: Accuracy
Number of useless matches:
• Both parameterized methods returned few useless matches.
• Metrics-based matching found more useless matches: 133 out of 138 in TextEdit when metrics were applied to methods.
• Simple line matching found many: 229 useless matches in TextEdit.

Results: Accuracy
Number of recognizable matches:
• Parameterized matching techniques return fewer recognizable matches.
• Simple string matching returns the fewest.
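The difference between simple and parameterized line matching can be illustrated with a small normalization step. This sketch replaces each identifier in a line with a placeholder numbered by first appearance, so lines that differ only in naming compare equal. It is a simplification made for illustration, not the suffix-tree algorithm named in the results. The keyword list is a small assumed subset of C, and string literals and comments are not handled.

```python
import re

# Small assumed subset of C keywords that must not be parameterized.
C_KEYWORDS = {"if", "else", "for", "while", "return",
              "int", "char", "void", "struct"}
IDENT_RE = re.compile(r"\b[A-Za-z_]\w*")


def parameterize(line):
    """Replace identifiers with $1, $2, ... in order of first appearance."""
    names = {}

    def replace(match):
        word = match.group(0)
        if word in C_KEYWORDS:
            return word
        if word not in names:
            names[word] = f"${len(names) + 1}"
        return names[word]

    # Collapse whitespace first so layout differences do not matter.
    return IDENT_RE.sub(replace, " ".join(line.split()))


# Hypothetical lines that differ only in identifier names.
first = "total = total + buf->len;"
second = "sum   = sum + pkt->size;"

print(parameterize(first))
print(first.strip() == second.strip())               # simple match fails
print(parameterize(first) == parameterize(second))   # parameterized match succeeds
```

Feeding the normalized lines into a line matcher turns it into a parameterized one, at the cost of also matching lines that share structure by coincidence.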

Kernel analysis

Cloning evolution
Levels of granularity:
1) The overall cloning across the entire Linux kernel.
2) The cloning among major subsystems.
3) The cloning among architecture-dependent code of some subsystems.

Results
• 12% of the Linux kernel file-system code is involved in code duplication.
• 3116 clone pairs were detected, with an average length of 13.5 lines.
• 78% of cloning occurs within the same directory.

Conclusion
• We have begun to build a taxonomy of code clones in software.
• Cloning activity in the Linux kernel file-system subsystem occurs at a non-trivial rate.
• Cloning most commonly occurs within a subsystem.

Conclusion
• Parameterized string matching provides an interesting and powerful method for function duplication detection.
• 3D visualization provided an interesting method of viewing clones among subsystems.

Visualization of Cloning Without Showing Same Directory Clones
