IJCSIS Vol. 6, No. 1, October 2009 ISSN 1947-5500
International Journal of Computer Science & Information Security
© IJCSIS PUBLICATION 2009
IJCSIS Editorial: Message from the Managing Editor

I am pleased to introduce the Volume 6, No. 1, October 2009 issue of IJCSIS, containing 30 papers (an acceptance rate of ~37%) selected after a rigorous, journal-style review process. Our goal is to present high-quality research results from the world's top researchers in the areas of computer science, networking, emerging technologies and information security. This issue again demonstrates that the goal has been achieved. Under our continuing open-access policy, I invite readers to enjoy this exclusive collection of high-quality computer science research.
Special thanks to our reviewers and sponsors for their valuable service. Available at http://sites.google.com/site/ijcsis/ IJCSIS Vol. 6, No. 1, October 2009 Edition ISSN 1947-5500 © IJCSIS 2009-2010, USA.
IJCSIS EDITORIAL BOARD

Dr. Gregorio Martinez Perez, Associate Professor (Professor Titular de Universidad), University of Murcia (UMU), Spain
Dr. M. Emre Celebi, Assistant Professor, Department of Computer Science, Louisiana State University in Shreveport, USA
Dr. Yong Li, School of Electronic and Information Engineering, Beijing Jiaotong University, P. R. China
Dr. Sanjay Jasola, Professor and Dean, School of Information and Communication Technology, Gautam Buddha University, India
Dr. Riktesh Srivastava, Assistant Professor, Information Systems, Skyline University College, University City of Sharjah, Sharjah, PO 1797, UAE
Dr. Siddhivinayak Kulkarni, University of Ballarat, Ballarat, Victoria, Australia
Professor (Dr.) Mokhtar Beldjehem, Sainte-Anne University, Halifax, NS, Canada
TABLE OF CONTENTS

1. A New Fuzzy Approach for Dynamic Load Balancing Algorithm (pp. 001-005)
Abbas Karimi (1,2,3), Faraneh Zarafshan (1,3), Adznan b. Jantan (1), A. R. Ramli (1), M. Iqbal b. Saripan (1) — (1) Department of Computer Systems Engineering, Faculty of Engineering, UPM, Malaysia; (2) Computer Department, Faculty of Engineering, IAU, Arak, Iran; (3) Young Researchers' Club, IAU, Arak, Iran

2. Knowledge Extraction for Discriminating Male and Female in Logical Reasoning from Student Model (pp. 006-015)
A. E. E. ElAlfi, Dept. of Computer Science, Mansoura University, Mansoura, Egypt, 35516; M. E. ElAlami, Dept. of Computer Science, Mansoura University, Mansoura, Egypt, 35516; Y. M. Asem, Dept. of Computer Science, Taif University, Taif, Saudi Arabia

3. A Mirroring Theorem and its Application to a New Method of Unsupervised Hierarchical Pattern Classification (pp. 016-025)
Dasika Ratna Deepthi, Department of Computer Science, Aurora's Engineering College, Bhongir, Nalgonda Dist., A.P., India; K. Eswaran, Department of Computer Science, Srinidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, India

4. Algorithm as Defining Dynamic Systems (pp. 026-028)
Keehang Kwon and Hong Pyo Ha, Department of Computer Engineering, Dong-A University, Busan, Republic of Korea

5. A Wavelet-Based Digital Watermarking for Video (pp. 029-033)
A. Essaouabi and F. Regragui, Department of Physics, LIMIARF Laboratory, Faculty of Sciences, Mohammed V University, Rabat, Morocco; E. Ibnelhaj, Image Laboratory, National Institute of Posts and Telecommunications, Rabat, Morocco

6. A Cost Effective RFID Based Customized DVD-ROM to Thwart Software Piracy (pp. 034-039)
Prof. Sudip Dogra, Ritwik Ray, Saustav Ghosh and Debharshi Bhattacharya, Electronics & Communication Engineering, Meghnad Saha Institute of Technology, Kolkata, India; Prof. Subir Kr. Sarkar, Electronics and Telecommunication Engineering, Jadavpur University, Kolkata, India

7. An O(|E|) Time Shortest Path Algorithm for Non-Negative Weighted Undirected Graphs (pp. 040-046)
Muhammad Aasim Qureshi, Dr. Fadzil B. Hassan, Sohail Safdar and Rehan Akbar, Computer and Information Science Department, Universiti Teknologi PETRONAS, Perak, Malaysia

8. Biologically Inspired Execution Framework for Vulnerable Workflow Systems (pp. 047-051)
Sohail Safdar, Mohd. Fadzil B. Hassan, Muhammad Aasim Qureshi and Rehan Akbar, Department of Computer & Information Sciences, Universiti Teknologi PETRONAS, Malaysia

9. RCFT: Re-Clustering Formation Technique in Hierarchical Sensor Network (pp. 052-055)
Boseung Kim, Joohyun Lee and Yongtae Shin, Computing Department, Soongsil University, Seoul, South Korea

10. An Alternative To Common Content Management Techniques (pp. 056-060)
Rares Vasilescu, Computer Science and Engineering Department, Faculty of Automatic Control and Computers, Politehnica University, Bucharest, Romania

11. Routing Technique Based on Clustering for Data Duplication Prevention in Wireless Sensor Network (pp. 061-065)
Boseung Kim, HuiBin Lim and Yongtae Shin, Computing Department, Soongsil University, Seoul, South Korea

12. An Optimal Method for Wake Detection in SAR Images Using Radon Transformation Combined with Wavelet Filters (pp. 066-069)
Ms. M. Krishnaveni, Lecturer (SG), Department of Computer Science, Avinashilingam University for Women, Coimbatore, India; Mr. Suresh Kumar Thakur, Deputy Director, Naval Research Board-DRDO, New Delhi, India; Dr. P. Subashini, Research Assistant-NRB, Department of Computer Science, Avinashilingam University for Women, Coimbatore, India

13. AES Implementation and Performance Evaluation on 8-bit Microcontrollers (pp. 070-074)
Hyubgun Lee, Kyounghwa Lee and Yongtae Shin, Computing Department, Soongsil University, Seoul, South Korea

14. GoS Proposal to Improve Trust and Delay of MPLS Flows for MCN Services (pp. 075-082)
Francisco J. Rodríguez-Pérez, José-Luis González-Sánchez and Alfonso Gazo-Cervero, Computer Science Dept., Area of Telematics Engineering, University of Extremadura, Cáceres, Spain

15. Novel Intrusion Detection using Probabilistic Neural Network and Adaptive Boosting (pp. 083-091)
Tich Phuoc Tran and Longbing Cao, Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia; Dat Tran, Faculty of Information Sciences and Engineering, University of Canberra, Australia; Cuong Duc Nguyen, School of Computer Science and Engineering, International University, HCMC, Vietnam

16. Building a Vietnamese Language Query Processing Framework for e-Library Searching Systems (pp. 092-096)
Dang Tuan Nguyen and Ha Quy-Tinh Luong, Faculty of Computer Science, University of Information Technology, VNU-HCM, Ho Chi Minh City, Vietnam; Tuyen Thi-Thanh Do, Faculty of Software Engineering, University of Information Technology, VNU-HCM, Ho Chi Minh City, Vietnam

17. Detecting Botnet Activities Based on Abnormal DNS Traffic (pp. 097-104)
Ahmed M. Manasrah, Awsan Hasan, Omar Amer Abouabdalla and Sureswaran Ramadass, National Advanced IPv6 Center of Excellence, Universiti Sains Malaysia, Pulau Pinang, Malaysia

18. SOAP Serialization Performance Enhancement - Design and Implementation of a Middleware (pp. 105-110)
Behrouz Minaei and Parinaz Saadat, Computer Department, Iran University of Science and Technology, Tehran, Iran

19. Breast Cancer Detection Using Multilevel Thresholding (pp. 111-115)
Y. Ireaneus Anna Rejani, Noorul Islam College of Engineering, Kumaracoil, Tamilnadu, India; Dr. S. Thamarai Selvi, Professor & Head, Department of Information and Technology, MIT, Chennai, Tamilnadu, India

20. Energy Efficient Security Architecture for Wireless Bio-Medical Sensor Networks (pp. 116-122)
Rajeswari Mukesh, Dept. of Computer Science & Engg., Easwari Engineering College, Chennai-600 089; Dr. A. Damodaram, Vice Principal, JNTU College of Engineering, Hyderabad-500 072; Dr. V. Subbiah Bharathi, Dean Academics, DMI College of Engineering, Chennai-601 302

21. Software Security Rules: SDLC Perspective (pp. 123-128)
C. Banerjee and S. K. Pandey, Department of Information Technology, Board of Studies, The Institute of Chartered Accountants of India, Noida-201301, India

22. An Entropy Architecture for Defending Distributed Denial-of-Service Attacks (pp. 129-136)
Meera Gandhi, Research Scholar, Department of CSE, Sathyabama University, Chennai, Tamil Nadu; S. K. Srivatsa, Professor, Sathyabama University, ICE, St. Joseph's College of Engineering, Chennai, Tamil Nadu

23. A Context-based Trust Management Model for Pervasive Computing Systems (pp. 137-142)
Negin Razavi, Amir Masoud Rahmani and Mehran Mohsenzadeh, Islamic Azad University, Science and Research Branch, Tehran, Iran

24. Proposed Platform for Improving Grid Security by Trust Management System (pp. 143-148)
Safieh Siadat, Amir Masoud Rahmani and Mehran Mohsenzadeh, Islamic Azad University, Science and Research Branch, Tehran, Iran

25. An Innovative Scheme for Effectual Fingerprint Data Compression Using Bezier Curve Representations (pp. 149-157)
Vani Perumal, Department of Computer Applications, S.A. Engineering College, Chennai-600 077, India; Dr. Jagannathan Ramaswamy, Deputy Registrar (Education), Vinayaka Missions University, Chennai, India

26. Exception Agent Detection System for IP Spoofing Over Online Environments (pp. 158-164)
Al-Sammarraie Hosam and Merza Abbas, Center for IT and Multimedia, Universiti Sains Malaysia, Penang, Malaysia; Adli Mustafa, School of Mathematical Sciences, Universiti Sains Malaysia, Penang, Malaysia; Shakeel Ahmad, School of Mathematical Sciences, Universiti Sains Malaysia, Penang, Malaysia, and Institute of Computing and Information Technology, Gomal University, Pakistan

27. A Trust-Based Cross-Layer Security Protocol for Mobile Ad hoc Networks (pp. 165-172)
A. Rajaram and Dr. S. Palaniswami, Anna University, Coimbatore, India

28. Generalized Discriminant Analysis Algorithm for Feature Reduction in Cyber Attack Detection System (pp. 173-180)
Shailendra Singh, Department of Information Technology, Rajiv Gandhi Technological University, Bhopal, India; Sanjay Silakari, Department of Computer Science and Engineering, Rajiv Gandhi Technological University, Bhopal, India

29. Management of Location Based Advertisement Services using Spatial Triggers in Cellular Networks (pp. 181-185)
M. Irfan, M. M. Tahir N. Baig, Furqan H. Khan, Raheel M. Hashmi, Khurram Shehzad and Assad Ali, Department of Electrical Engineering, COMSATS Institute of Information Technology, Islamabad, Pakistan

30. A Way to Understand Various Patterns of Data Mining Techniques for Selected Domains (pp. 186-191)
Dr. Kanak Saxena, Computer Applications, SATI, Vidisha; D. S. Rajpoot, UIT, RGPV, Bhopal
A New Fuzzy Approach for Dynamic Load Balancing Algorithm

Abbas Karimi (1,2,3), Faraneh Zarafshan (1,3), Adznan b. Jantan (1), A. R. Ramli (1), M. Iqbal b. Saripan (1)

(1) Department of Computer Systems Engineering, Faculty of Engineering, UPM, Malaysia
(2) Computer Department, Faculty of Engineering, IAU, Arak, Iran
(3) Young Researchers' Club, IAU, Arak, Iran
Abstract— Load balancing is the process of improving the performance of a parallel and distributed system by redistributing load among the processors [1-2]. Most previous work on load balancing, and on distributed decision making in general, does not effectively take into account the uncertainty and inconsistency of state information, whereas fuzzy logic allows us to reason over such imprecise information. In this paper, we present a new approach for implementing a dynamic load balancing algorithm with fuzzy logic, which copes with the uncertainty and inconsistency that hamper previous algorithms; furthermore, our algorithm improves response time over the round robin and randomized algorithms by 30.84% and 45.45%, respectively.
Keywords— Load balancing, Fuzzy logic, Distributed systems.

I. INTRODUCTION

Distributed computing systems have become a natural setting in many environments for business and academia. This is due to the rapid increase in processor- and/or memory-hungry applications coupled with the advent of low-cost, powerful workstations [3]. In a typical distributed system, tasks arrive at the different nodes in a random fashion, which causes non-uniform loading across the system nodes. Load imbalance is observed as the existence of nodes that are highly loaded while others are lightly loaded or even idle. Such situations are harmful to system performance in terms of the mean response time of tasks and resource utilization [3]. A system [4-5] of distributed computers, with tens or hundreds of computers connected by high-speed networks, has many advantages over a system of the same standalone computers. A distributed system provides resource sharing as one of its major advantages, which yields better performance and reliability than any traditional system under the same conditions [1].

Section II describes load balancing and the kinds of its models. In Section III, we explain and demonstrate our model; in Section IV, we explain the methodology and fuzzy rules. The performance evaluation is presented in Section V, and finally we draw the conclusion.

II. LOAD BALANCING

In computer networking, load balancing is a technique to spread work between two or more computers, network links, CPUs, hard drives, or other resources, in order to obtain optimal resource utilization, throughput, or response time. Using multiple components with load balancing, instead of a single component, may also increase reliability through redundancy. Load balancing attempts to maximize system throughput by keeping all processors busy. It is done by migrating tasks from overloaded nodes to other, lightly loaded nodes to improve the overall system performance. Load balancing algorithms are typically based on a load index, which provides a measure of the workload at a node relative to some global average, and four policies, which govern the actions taken once a load imbalance is detected [6]. The load index is used to detect a load imbalance state. Qualitatively, a load imbalance occurs when the load index at one node is much higher (or lower) than the load index on the other nodes. The length of the CPU queue has been shown to provide a good load index on timeshared workstations when the performance measure of interest is the average response time [7-8]. In the case of multiple resources (disk, memory, etc.), a linear combination of the lengths of all the resource queues provides an improved measure, as job execution time may be driven by more than CPU cycles [9-10]. The four policies that govern the action of a load-balancing algorithm when a load imbalance is detected deal with information, transfer, location, and selection. The information policy is responsible for keeping up-to-date load information about each node in the system. A global information policy provides access to the load index of every node, at the cost of additional communication for maintaining accurate information [5, 10]. The transfer policy deals with the dynamic aspects of a system. It uses the nodes' load information to decide when a node becomes eligible to act as a sender (transfer a job to another node) or as a receiver (retrieve a job from another node). Transfer policies are typically threshold-based. Thus,
if the load at a node increases beyond a threshold, the node becomes an eligible sender. Likewise, if the load at a node drops below a threshold, the node becomes an eligible receiver. The location policy selects a partner node for a job transfer transaction. If the node is an eligible sender, the location policy seeks out a receiver node to receive the job selected by the selection policy (described below). If the node is an eligible receiver, the location policy looks for an eligible sender node [10]. Once a node becomes an eligible sender, a selection policy is used to pick which of the queued jobs is to be transferred to the receiver node. The selection policy uses several criteria to evaluate the queued jobs. Its goal is to select a job that reduces the local load, incurs as little cost as possible in the transfer, and has good affinity to the node to which it is transferred. A common selection policy is latest-job-arrived, which selects the job currently in last place in the work queue [10].
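To make the interplay of these policies concrete, the sketch below implements a minimal threshold-based transfer policy together with the latest-job-arrived selection policy described above. It is an illustrative reading of the policies in [10], not code from the paper: the node structure, the thresholds t_send and t_recv, and all helper names are hypothetical.

```python
from collections import deque

# Hypothetical node with a FIFO work queue; load index = queue length,
# as suggested for timeshared workstations in [7-8].
class Node:
    def __init__(self, name):
        self.name = name
        self.queue = deque()

    def load_index(self):
        return len(self.queue)

# Transfer policy: threshold-based eligibility (t_send and t_recv are assumed).
def role(node, t_send=5, t_recv=2):
    if node.load_index() > t_send:
        return "sender"
    if node.load_index() < t_recv:
        return "receiver"
    return "neutral"

# Selection policy: latest-job-arrived picks the job at the tail of the queue.
def select_job(sender):
    return sender.queue.pop()

# Location policy: pair each eligible sender with the least-loaded receiver.
def balance_once(nodes):
    senders = [n for n in nodes if role(n) == "sender"]
    receivers = [n for n in nodes if role(n) == "receiver"]
    for s in senders:
        if not receivers:
            break
        r = min(receivers, key=Node.load_index)
        r.queue.append(select_job(s))

if __name__ == "__main__":
    a, b = Node("a"), Node("b")
    a.queue.extend(range(8))               # overloaded node
    balance_once([a, b])
    print(a.load_index(), b.load_index())  # 7 1
```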
There are two types of load balancing algorithms:

A. Static Load-Balancing
In this method, the performance of the nodes is determined at the beginning of execution. Then, depending upon their performance, the workload is distributed at the start by the master node. The slave processors compute their allocated work and submit their results to the master. A task is always executed on the node to which it is assigned; that is, static load balancing methods are non-preemptive. A general disadvantage of all static schemes is that the final selection of a host for process allocation is made when the process is created and cannot be changed during process execution to react to changes in the system load [1]. Major static load balancing algorithms are Round Robin [11], Randomized [12], Central Manager [13] and Threshold [1, 14].

B. Dynamic Load-Balancing
It differs from static algorithms in that the workload is distributed among the nodes at runtime. The master assigns new processes to the slaves based on the newly collected information [4, 15]. Unlike static algorithms, dynamic algorithms allocate processes dynamically when one of the processors becomes underloaded; otherwise, processes are buffered in the queue on the main host and allocated dynamically upon requests from remote hosts [1]. This class consists of the Central Queue Algorithm and the Local Queue Algorithm [16].

Load balancing algorithms are thus distinguished by when the workload is assigned: at compile time or at runtime. Comparison shows that static load balancing algorithms are more stable than dynamic ones, and it is also easier to predict their behavior; at the same time, however, dynamic distributed algorithms are always considered better than static algorithms [1].

III. SYSTEM MODEL

We have a distributed network consisting of n nodes, where every node may be a complex combination of multiple types of resources (CPUs, memory, disks, switches, and so on), and the physical configuration of resources at each node may be heterogeneous. This heterogeneity can be manifested in two ways [17]. The amount of a given resource at one node may be quite different from the configuration of a node at another site. Additionally, nodes may have a different balance of each resource; for example, one node may have a (relatively) large memory with respect to its number of CPUs, while another node may have a large number of CPUs with less memory [18-19]. As illustrated in Fig. 1, our system model involves a routing table, a load index, a cost table and a fuzzy controller, which manages the load balancing of the system.

Fig. 1: System model (routing table, load index and cost table feeding a fuzzy controller that drives the load balancer).

The routing table presents the communication links among the nodes in the system. The load index indicates the load of its related node, which is used by the policies in Section II. In order to determine the node status as a sender, receiver or neutral by using the fuzzy controller and based on fuzzy rules, we need a cost table that provides the nodes' communication costs and the number of heavily loaded nodes. The cost table is obtained from the load index and the routing table, while the number of heavily loaded nodes can be extracted from the cost table.
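A minimal sketch of how such a cost table might be derived is given below; the paper does not specify the exact construction, so the shortest-path cost metric, the load threshold s, and all names here are assumptions for illustration.

```python
import heapq

# Hypothetical inputs: routing table as an adjacency dict of link costs,
# and a load vector (load index per node).
routing = {0: {1: 2, 2: 5}, 1: {0: 2, 2: 1}, 2: {0: 5, 1: 1}}
load = {0: 9, 1: 2, 2: 4}

def shortest_costs(src):
    """Dijkstra over the routing table: communication cost from src to all nodes."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in routing[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

# Cost table: per-node communication costs plus that node's load index.
cost_table = {n: {"costs": shortest_costs(n), "load": load[n]} for n in routing}

# Number of heavily loaded nodes, extracted from the cost table
# (the threshold s is an assumption).
s = 5
heavy = sum(1 for n in cost_table if cost_table[n]["load"] > s)
print(heavy)  # 1
```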
IV. METHODOLOGY

The load index value, based on a given threshold, is classified into five categories; it is defined between 0 and w, and the threshold is s. Five fuzzy sets (Fig. 2) are used to describe the load index value: very lightly loaded, lightly loaded, moderately loaded, heavily loaded and very heavily loaded. The grade values of these fuzzy variables are uncertain and can change depending on the network situation.

Fig. 2: Fuzzy load index chart (membership functions over the breakpoints p, q, r, s, t, u, v, w).

The membership functions are piecewise linear over the breakpoints p < q < r < s < t < u < v < w:

μ_verylightlyloaded(load) = { 1, load < p; (q − load)/(q − p), p ≤ load ≤ q; 0, load > q }

μ_lightlyloaded(load) = { (load − p)/(q − p), p ≤ load ≤ q; 1, q ≤ load ≤ r; (s − load)/(s − r), r ≤ load ≤ s; 0, otherwise }

μ_moderatelyloaded(load) = { (load − r)/(s − r), r ≤ load ≤ s; (t − load)/(t − s), s ≤ load ≤ t; 0, otherwise }

μ_heavilyloaded(load) = { (load − s)/(t − s), s ≤ load ≤ t; 1, t ≤ load ≤ u; (v − load)/(v − u), u ≤ load ≤ v; 0, otherwise }

μ_veryheavilyloaded(load) = { (load − u)/(v − u), u ≤ load ≤ v; 1, load > v }

For input 2, the number of heavy nodes, the fuzzy sets are defined as "less" and "more equal" (N is the number of heavy nodes):

μ_less(N) = { 1, N < p; (q − N)/(q − p), p ≤ N ≤ q; 0, N > q }

μ_moreequal(N) = { 0, N < q; (N − q)/(r − q), q ≤ N ≤ r; 1, N > r }

Fig. 3: Fuzzy input for the number of heavy nodes (membership functions over the breakpoints p, q, r).

Assuming a sender-initiated load balancing algorithm, the proposed knowledge base is as follows:

Rule [1]. If (load is very_lightly_loaded) then (node_status is receiver)
Rule [2]. If (load is very_heavily_loaded) then (node_status is sender)
Rule [3]. If (load is heavily_loaded) and (number_of_heavy_nodes is more) then (node_status is receiver)
Rule [4]. If (load is heavily_loaded) and (number_of_heavy_nodes is less) then (node_status is sender)
Rule [5]. If (load is lightly_loaded) and (number_of_heavy_nodes is less) then (node_status is sender)
Rule [6]. If (load is lightly_loaded) and (number_of_heavy_nodes is more) then (node_status is receiver)
Rule [7]. If (load is moderately_loaded) and (number_of_heavy_nodes is more) then (node_status is receiver)
Rule [8]. If (load is moderately_loaded) and (number_of_heavy_nodes is less) then (node_status is sender)
Rule [9]. If the node is a sender, then select a receiver as a migration partner.
Rule [10]. If the node fails to find a migration partner, then the node is neutral.
Rule [11]. If the node is a sender, then select a suitable task to transfer.
Rule [12]. If the node fails to select a suitable task to transfer, then select another migration partner.
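The following sketch shows one way to evaluate these membership functions and Rules 1-8 for a single node. The numeric breakpoint values and the min/max (Mamdani-style) rule combination are assumptions for illustration; the paper does not fix them numerically.

```python
# Assumed breakpoints for the load index (0..w) and for N, the heavy-node count.
p, q, r, s, t, u, v, w = 10, 20, 30, 40, 50, 60, 70, 80
pn, qn, rn = 1, 2, 3

def ramp_down(x, a, b):
    # 1 before a, linear descent on [a, b], 0 after b
    return 1.0 if x < a else 0.0 if x > b else (b - x) / (b - a)

def ramp_up(x, a, b):
    return 1.0 - ramp_down(x, a, b)

def trapezoid(x, a, b, c, d):
    return min(ramp_up(x, a, b), ramp_down(x, c, d))

# Five fuzzy sets for the load index (Fig. 2).
load_sets = {
    "very_lightly": lambda x: ramp_down(x, p, q),
    "lightly":      lambda x: trapezoid(x, p, q, r, s),
    "moderately":   lambda x: trapezoid(x, r, s, s, t),
    "heavily":      lambda x: trapezoid(x, s, t, u, v),
    "very_heavily": lambda x: ramp_up(x, u, v),
}
# Two fuzzy sets for the number of heavy nodes (Fig. 3).
n_sets = {"less": lambda n: ramp_down(n, pn, qn),
          "more": lambda n: ramp_up(n, qn, rn)}

def node_status(load, n_heavy):
    """Rules 1-8: min for AND within a rule, max across rules per status."""
    mu_l = {k: f(load) for k, f in load_sets.items()}
    mu_n = {k: f(n_heavy) for k, f in n_sets.items()}
    receiver = max(mu_l["very_lightly"],
                   min(mu_l["heavily"], mu_n["more"]),
                   min(mu_l["lightly"], mu_n["more"]),
                   min(mu_l["moderately"], mu_n["more"]))
    sender = max(mu_l["very_heavily"],
                 min(mu_l["heavily"], mu_n["less"]),
                 min(mu_l["lightly"], mu_n["less"]),
                 min(mu_l["moderately"], mu_n["less"]))
    if abs(sender - receiver) < 1e-9:
        return "neutral"
    return "sender" if sender > receiver else "receiver"

print(node_status(load=65, n_heavy=1))  # sender
print(node_status(load=12, n_heavy=4))  # receiver
```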
The fuzzy sets for the output are shown in Fig. 4.

Fig. 4: Fuzzy output.

V. PERFORMANCE EVALUATION

Simulation was performed in MATLAB and NS2 to verify our approach. We evaluate our fuzzy load balancer in a system with five nodes, using a randomly generated network graph and a randomly generated load vector; the load vector consists of the number of tasks on each node and the load index for each node. The edge connectivity in the network graph is generated with probability 0.2 and task allocation with a uniform distribution U[0, 1]; each generated task is assigned to the node corresponding to the interval of the generated random variable. Inter-arrival times are taken from the exponential distribution, and processor speeds for all nodes are taken from a uniform distribution. Our proposed fuzzy algorithm refreshes the cost table in real time while updating the node loads. We generated the cost table according to the network graph and load vector; the load of each node is equal to the number of the node's tasks, and from the cost table we can calculate the number of heavy nodes. In the fuzzy system, according to the number of heavily loaded nodes and the amount of node load, and based on the fuzzy rule base, we can determine the status of each node, which can be in one of three states: sender, receiver and neutral. The results of our fuzzy load balancing algorithm are presented in Table 1.

Table 1: Response time (msec) of the load balancing algorithms for different numbers of tasks (5 nodes).

Number of Tasks | Randomize | Round Robin | Fuzzy
2  | 3  | 2  | 1
4  | 4  | 3  | 2
6  | 7  | 6  | 4
8  | 11 | 9  | 7
10 | 16 | 13 | 11

Fig. 5 shows that the fuzzy approach has a significantly better response time.

Fig. 5: Response time (msec) of the Randomize, Round Robin and Fuzzy load balancing algorithms versus the number of tasks for 5 nodes.

Table 2 shows the improvement percentage of our algorithm over the Round Robin and Randomize algorithms for different numbers of tasks. This table shows that the performance of the fuzzy algorithm is better than the RR and Randomize algorithms.

Table 2: Improvement percentage of the proposed fuzzy algorithm vs. the Round Robin and Randomize load balancing algorithms.

Number of Tasks | Fuzzy vs. Round Robin | Fuzzy vs. Randomize
2  | 50%   | 66.7%
4  | 33.3% | 50%
6  | 33.3% | 42.9%
8  | 22.2% | 36.4%
10 | 15.4% | 31.25%

In Table 3 the total improvement of our fuzzy approach is shown. This table confirms that the fuzzy load balancing algorithm has better response time and performance than the Round Robin and Randomize load balancing algorithms, by 30.84% and 45.45% respectively.

Table 3: Overall improvement percentage of our novel algorithm.

Fuzzy vs. Round Robin | Fuzzy vs. Randomize
30.84% | 45.45%
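The overall figures in Table 3 are the averages of the per-row improvements in Table 2, which can in turn be recomputed from the response times in Table 1; the short check below is illustrative only.

```python
# Response times from Table 1 (msec), for 2, 4, 6, 8 and 10 tasks.
randomize   = [3, 4, 7, 11, 16]
round_robin = [2, 3, 6, 9, 13]
fuzzy       = [1, 2, 4, 7, 11]

def improvement(baseline, ours):
    """Percentage reduction in response time relative to a baseline."""
    return [100.0 * (b - o) / b for b, o in zip(baseline, ours)]

vs_rr   = [round(x, 1) for x in improvement(round_robin, fuzzy)]
vs_rand = [round(x, 2) for x in improvement(randomize, fuzzy)]
print(vs_rr)    # [50.0, 33.3, 33.3, 22.2, 15.4]        (Table 2)
print(vs_rand)  # [66.67, 50.0, 42.86, 36.36, 31.25]    (Table 2)

# Table 3 is the mean of the rounded per-row improvements.
print(round(sum(vs_rr) / 5, 2))                           # 30.84
print(round(sum([66.7, 50, 42.9, 36.4, 31.25]) / 5, 2))   # 45.45
```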
VI. CONCLUSION AND FUTURE WORKS

Fuzzy logic systems can produce definite outputs from uncertain inputs. In this paper, we presented a new approach for implementing a dynamic load balancing algorithm with fuzzy logic, and we have shown that its response time is significantly better than the round robin and randomized algorithms.

In future work, we will pursue the load balancing issue in parallel systems to find out whether the load balancing action can be made quicker than in previous works. Moreover, we will present a new load balancing approach for predicting node status as sender, receiver or neutral with less time complexity, by using genetic algorithms and neuro-fuzzy techniques.
REFERENCES
[1] S. Sharma, S. Singh, and M. Sharma, "Performance Analysis of Load Balancing Algorithms," World Academy of Science, Engineering and Technology, vol. 38, 2008.
[2] G. R. Andrews, D. P. Dobkin, and P. J. Downey, "Distributed allocation with pools of servers," in Proceedings of the First ACM SIGACT-SIGOPS Symposium on Principles of Distributed Computing, Ottawa, Canada: ACM, 1982, pp. 73-83.
[3] A. E. El-Abd, "Load balancing in distributed computing systems using fuzzy expert systems," presented at the International Conference on Modern Problems of Radio Engineering, Telecommunications and Computer Science (TCSET 2002), Lviv-Slavsko, Ukraine, 2002.
[4] S. Malik, "Dynamic Load Balancing in a Network of Workstations," 19 November 2000.
[5] D. L. Eager, E. D. Lazowska, and J. Zahorjan, "Adaptive load sharing in homogeneous distributed systems," IEEE Trans. Softw. Eng., vol. 12, pp. 662-675, 1986.
[6] N. G. Shivaratri, P. Krueger, and M. Singhal, "Load Distributing for Locally Distributed Systems," Computer, vol. 25, pp. 33-44, 1992.
[7] D. L. Eager, E. D. Lazowska, and J. Zahorjan, "A comparison of receiver-initiated and sender-initiated adaptive load sharing (extended abstract)," SIGMETRICS Perform. Eval. Rev., vol. 13, pp. 1-3, 1985.
[8] M. Livny and M. Melman, "Load balancing in homogeneous broadcast distributed systems," in Proceedings of the Computer Network Performance Symposium, College Park, Maryland, United States: ACM, 1982, pp. 47-55.
[9] D. Ferrari and S. Zhou, "An empirical investigation of load indices for load balancing applications," presented at the 12th International Symposium on Computer Performance Modeling, Measurement, and Evaluation, North-Holland, Amsterdam, 1987.
[10] W. Leinberger, G. Karypis, and V. Kumar, "Load Balancing Across Near-Homogeneous Multi-Resource Servers," presented at the 9th Heterogeneous Computing Workshop (HCW 2000), Cancun, Mexico, 2000.
[11] Z. Xu and R. Huang, "Performance Study of Load Balancing Algorithms in Distributed Web Server Systems," CS213 Parallel and Distributed Processing Project Report.
[12] R. Motwani and P. Raghavan, "Randomized algorithms," ACM Comput. Surv., vol. 28, pp. 33-37, 1996.
[13] P. L. McEntire, J. G. O'Reilly, and R. E. Larson, Distributed Computing: Concepts and Implementations. New York: IEEE Press, 1984.
[14] W. I. Kim and C. S. Kang, "An adaptive soft handover algorithm for traffic-load shedding in the WCDMA mobile communication system," presented at WCNC 2003, 2003.
[15] Y.-T. Wang and R. J. T. Morris, "Load Sharing in Distributed Systems," IEEE Transactions on Computers, vol. 34, pp. 204-217, 1985.
[16] W. Leinberger, G. Karypis, and V. Kumar, "Load Balancing Across Near-Homogeneous Multi-Resource Servers," presented at the 9th Heterogeneous Computing Workshop (HCW 2000), Cancun, Mexico, 2000.
[17] A. Kumar, M. Singhal, and T. L. Ming, "A model for distributed decision making: An expert system for load balancing in distributed systems," presented at the 11th Symposium on Operating Systems, 1987.
[18] S. Darbha and D. P. Agrawal, "Optimal Scheduling Algorithm for Distributed-Memory Machines," IEEE Transactions on Parallel and Distributed Systems, vol. 9, pp. 87-95, 1998.
[19] S. A. Munir, Y. W. Bin, R. Biao, and M. Man, "Fuzzy Logic based Congestion Estimation for QoS in Wireless Sensor Network," in Wireless Communications and Networking Conference (WCNC 2007), IEEE, Kowloon, 2007, pp. 4336-4341.
Abbas Karimi received his Bachelor's degree in computer hardware engineering and his M.S. in computer software engineering in Iran. He is a Ph.D. candidate at UPM, Malaysia, in the field of computer system engineering. He has been working as a lecturer and faculty member in the Department of Computer Engineering at IAU, Arak Branch, and as a lecturer in several universities. He has been involved in several research projects, has authored one textbook in Persian, and has held several management posts. His research interests are in load balancing algorithms and real-time, distributed, parallel and fault-tolerant systems.
Knowledge Extraction for Discriminating Male and Female in Logical Reasoning from Student Model

A. E. E. ElAlfi, Dept. of Computer Science, Mansoura University, Mansoura, Egypt, 35516
M. E. ElAlami, Dept. of Computer Science, Mansoura University, Mansoura, Egypt, 35516
Y. M. Asem, Dept. of Computer Science, Taif University, Taif, Saudi Arabia
Abstract: The learning process is a process of communication and interaction between the teacher and his students on one side, and among the students themselves on the other. The interaction of the teacher with his students has a great importance in the process of learning and education. The pattern and style of this interaction are determined by the educational situation, trends and concerns, and educational characteristics.

Classroom interaction has an important role in increasing the efficiency of the learning process and raising the achievement levels of students. Students need to learn skills and habits of study, especially at the university level. The effectiveness of learning is affected by several factors, including the prevailing patterns of interactive behavior in the classroom; these patterns are reflected in the activities of teacher and learners during the learning process. The effectiveness of learning is also influenced by the cognitive and non-cognitive characteristics of the teacher that help him to succeed, the characteristics of the learners, the teaching subject, and the teaching methods.

This paper presents a machine learning algorithm for extracting knowledge from a student model. The proposed algorithm utilizes the inherent characteristics of genetic algorithms and neural networks for extracting comprehensible rules from the student database. The knowledge is used for discriminating male and female levels in logical reasoning as a part of an expert system course.

Keywords: Knowledge extraction, Student model, Expert system, Logical reasoning, Classroom interaction, Genetic algorithm, Neural network.

I. INTRODUCTION

The learning environment is one of the major task variables that has received special concern from researchers for a long time, in order to identify the factors that may affect its efficiency. The process of interaction within the classroom has had a large share of these studies, and they have concluded that classroom interaction is the essence of the learning process [1].

Classroom interaction, which is represented by the communication patterns between the parties of the education and learning process, plays an important role in the learners' performance, their achievements and their behavioral patterns. It is the way to establish ties of understanding between teacher and learners and between the learners themselves, and it facilitates understanding of the goals of education strategies and how to achieve them [2].

Learning skills are indispensable to every student in any area of science. They are inherent in the learner because of their significant impact on his level of achievement, which depends on the quality of the manner or method used in the learning process [3]. Learning skills allow the learner to acquire patterns of behavior that will be associated with him during the course of study. These patterns become study habits and acquire a relative stability with respect to the learner [4].

Students in the university have the responsibility to identify their goals and pursue strategies that lead to the achievement of these objectives. Therefore, these strategies should include the study habits, which lead to developing the composition of the student's knowledge [5].

The importance of following good habits of study — which result in reducing students' level of concern about their examinations, a high level of self-confidence, and the development of positive attitudes towards the faculty members and the materials — was presented in [6]. As a result, the students' achievement will increase, as well as their self-satisfaction [7].

Motivation is also of great importance in raising the tendency towards individual learning. It is one of the basic conditions for achieving the goal of the learning process, the learning of ways of thinking, the formation of attitudes and values, the collection of information and problem solving [8].

The achievement motivation is one of the main factors that may be linked to the objectives of the school system. Assisting students to achieve this motivation will lead to revitalizing the level of performance and motivation in order to achieve the most important aspects of school work [9].

Logical reasoning lets individuals think logically to solve problems, which proves the logical ability of each individual. Induction, or inductive reasoning, sometimes called inductive logic, is reasoning which takes us beyond the confines of our current evidence or knowledge to conclusions about the unknown [10]. If the variables of classroom interaction, the learning and studying skills, and the motivation are the whole factors affecting learning, is it possible for them to compensate for each other?
II. PROBLEM AND OBJECTIVES OF STUDY

Most researchers agree that classroom interaction is the essence of the quality of the teaching process, and its results are often positive. Also, the pattern and quality of this interaction determine not only the learning situation but also the trends, the concerns, and some aspects of the students' personality.

In Saudi universities, the educational environments of male and female students are different. Male students have successful interaction, because the teacher is allowed to observe the students and what they do in the classroom. In the female environment it is not permissible to watch what happens in the classroom. Logically, this difference may be considered an advantage for male students. However, female students' achievements have shown superiority over those of the male students. This prompted the following questions:

1. Are there other intermediate variables among the learning environment, the classroom interaction and the student achievement?
2. Do these variables affect the student achievement and compensate for the classroom interaction?
3. Can we extract knowledge by data mining from the student model?

Accordingly, the problem of the current study determines the following hypotheses:

1. There are statistical differences between the female and male students in logical reasoning in the faculty of information and computer science at Taif University, Saudi Arabia.
2. There are statistical differences between the female and male degrees in learning skills.
3. There are statistical differences between the female and male degrees in achievement motivation.
4. There are statistical differences between the female and male degrees in understanding the efficiency of classroom interaction.
5. There are statistical differences between the female and male degrees in learning skills and logical reasoning when the efficiency of classroom interaction is fixed.
6. There are statistical differences between the female and male degrees in achievement motivation and logical reasoning when the efficiency of classroom interaction is fixed.

The current study aims to:

1. Identify the differences between female and male students in logical reasoning, learning skills, achievement motivation, and their understanding of the efficiency of classroom interaction.
2. Determine the relation between the learning skills, the achievement motivation and the logical reasoning.
3. Present a method for knowledge extraction from the student module in an e-learning system.

The study then discusses the following questions:

1. Can the extracted knowledge from the students' data discriminate between the male and female students in the logical reasoning score?
2. Can we provide a machine learning algorithm to extract useful knowledge from the available students' data?

III. EFFECTIVE STUDENT ATTRIBUTES

The student model plays an important role in the process of teaching and learning. If the elements of this model are chosen properly, we can get an important students database, which can provide useful knowledge when using data mining techniques. Learning skills, achievement motivation, classroom interaction and logical reasoning are the main effective dimensions in the student model presented in this study. The following subsections explain these features.

A. Learning skills
A set of behaviors or practices used by the learner while studying the school material. It is determined by the degree which the student obtained through the measure used in the present study.

B. Achievement motivation
Achievement motivation has been looked at as a personality trait that distinguishes persons based on their tendency or aspiration to do things well and compete against a standard of excellence [11].

Motivation is the internal condition that activates behavior and gives it direction; it energizes and directs goal-oriented behavior. Motivation in education can have several effects on how students learn and how they behave towards subject matter [12]. It is composed of several internal and external motives that affect the behavior of students, orienting and activating the individual in different positions to achieve excellence.
C. Classroom Interaction
During classroom lessons, teachers promote, direct, and mediate discussions to develop learners' understanding of the lesson topics. In addition, teachers need to assess the learners' understanding of the material covered, monitor participation and record progress. Discussions in classrooms are frequently dominated by the more outgoing learners, while others, who may not understand, are silent. The teacher is typically only able to obtain feedback from a few learners before determining the next course of action, possibly resulting in an incomplete assessment of the learners' grasp of the concepts involved. Learners who are not engaged by the discussion are not forming opinions or constructing understanding, and may not be learning. Classroom distractions can become the focus of attention unless there is some compelling reason for the learner to participate in the discussion [13].

D. Logical Reasoning
Reasoning is the process of using existing knowledge to draw conclusions, make predictions, or construct explanations. Three methods of reasoning are the deductive, inductive, and abductive approaches. Deductive reasoning starts with the assertion of a general rule and proceeds from there to a guaranteed specific conclusion. Inductive reasoning begins with observations that are specific and limited in scope, and proceeds to a generalized conclusion that is likely, but not certain, in light of accumulated evidence; one could say that inductive reasoning moves from the specific to the general. Abductive reasoning typically begins with an incomplete set of observations and proceeds to the likeliest possible explanation for the set [10].

IV. APPLICATIONS AND RESULTS

A. Sample of Study
The population of students participating in the experiment was 95 (47 of them female and 48 male). These students had studied an expert system course using the CLIPS language [14].

B. Tools of Study
Three measures were prepared: learning skills, achievement motivation and classroom interaction.

1. Learning skills
A set of 47 clauses reflecting the learning skills was presented to the students during their study. These clauses dealt with 7 skills: management of dispersants, management of the study time, summing and taking notes, preparing for examinations, organization of information, continuation of study, and the use of computer and Internet. The student has to choose one of three alternatives (always, sometimes, or never), evaluated as 3, 2, or 1 respectively. Psychometric measures of the indicator were calculated as follows.

- Criteria Validity
This measure was applied on a student sample of 40 male and female students in the faculty of computers and information systems, Taif University, Saudi Arabia. The correlation between their total degrees was 0.82, which is statistically significant at the 0.01 level, so it indicates the validity of the measure.

- Internal Consistency Validity
The correlations between each item and its indicator were calculated. The correlation values vary between 0.37 and 0.65, which are significant at the 0.01 and 0.05 levels. Also, the correlations between the total degree and the degrees of each measure were calculated, as shown in Table I.

TABLE I. THE CORRELATION COEFFICIENT VALUES

Learning skills | Correlation coefficient
Management of dispersants | 0.76
Management of the study time | 0.70
Summing and taking notes | 0.81
Preparing for examinations | 0.69
Organization of information | 0.82
Continuation of study | 0.88
The use of computer & Internet | 0.79
(All significant at the 0.05 level.)

- Indicator reliability
The indicator reliability was measured by two methods, as shown in Table II.

TABLE II. THE RELIABILITY VALUES OF THE LEARNING SKILLS MEASURE

Learning skills | Re-application correlation coefficient | Cronbach's α
Management of dispersants | 0.77 | 0.75
Management of the study time | 0.66 | 0.61
Summing and taking notes | 0.67 | 0.62
Preparing for examinations | 0.71 | 0.70
Organization of information | 0.77 | 0.75
Continuation of study | 0.68 | 0.81
The use of computer & Internet | 0.69 | 0.79
(All significant at the 0.01 level.)

This table shows high reliability values for the learning skills measure.
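Cronbach's α, used in Table II, measures internal consistency as α = (k/(k−1))(1 − Σσ²_item/σ²_total) over k items. The sketch below computes it for a toy response matrix; the data are invented for illustration and are not the study's responses.

```python
import numpy as np

def cronbach_alpha(scores):
    """scores: (n_students, n_items) matrix of item responses (e.g. 1-3)."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)       # per-item variances
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Invented 6-student x 4-item response matrix on the 1-3 scale used in the paper.
responses = [[3, 2, 3, 3],
             [2, 2, 2, 3],
             [3, 3, 3, 3],
             [1, 1, 2, 1],
             [2, 2, 1, 2],
             [3, 3, 3, 2]]
print(round(cronbach_alpha(responses), 2))
```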
2. Achievement motivation
A set of 71 clauses reflecting the achievement motivation was classified into internal and external pivots. The internal achievement motivation includes challenge, desire to work, ambition, and self-reliance. The external achievement motivation includes fear of failure, social motivations, awareness of time importance, and competition. Psychometric measures of the indicator were calculated as follows.

- Criteria Validity
This measure was applied on the same student sample (40 male and female students) in the same faculty. The correlation between their total degrees was 0.79, which is statistically significant at the 0.01 level, so it indicates the validity of the measure.

- Internal Consistency Validity
The correlations between each item and its indicator were calculated. The correlation values vary between 0.37 and 0.74, which are significant at the 0.01 and 0.05 levels. Table III shows the calculated correlation coefficients.

TABLE III. CORRELATION COEFFICIENTS AND THEIR SIGNIFICANT LEVEL

Achievement motivation | The indicator | Correlation coefficient
Internal | Challenge | 0.68
Internal | Desire to work | 0.69
Internal | Ambition | 0.66
Internal | Self-reliance | 0.56
External | Fear of failure | 0.58
External | Social motivations | 0.71
External | Awareness of time importance | 0.64
External | Competition | 0.56
(All significant at the 0.01 level.)

- Indicator reliability
The indicator reliability was measured by two methods, as shown in Table IV.

TABLE IV. THE RELIABILITY VALUES OF THE ACHIEVEMENT MOTIVATION MEASURE

The indicator | Re-application correlation coefficient ρ | Cronbach's α
Challenge | 0.74 | 0.71
Desire to work | 0.81 | 0.78
Ambition | 0.71 | 0.72
Self-reliance | 0.66 | 0.65
Fear of failure | 0.73 | 0.70
Social motivations | 0.68 | 0.66
Awareness of time importance | 0.62 | 0.64
Competition | 0.59 | 0.61
(All significant at the 0.01 level.)

This table shows high reliability values for the achievement motivation measure.

3. Classroom interaction
A set of 27 clauses measuring the level of classroom interaction was prepared. Psychometric measures were calculated as follows.

- Criteria Validity
This measure was applied on the same student sample (90 male and female students). The principal components method was used for factor analysis. The Guttman criterion was used to determine the number of factors, and Varimax orthogonal rotation was also used. These two methods yielded the extraction of three factors (saturation ≥ ±0.3), each factor retaining at least three items. Table V shows the results of the factor analysis.

TABLE V. THE RESULTS OF THE FACTOR ANALYSIS

Clause No. | First | Second | Third
1  | 0.45 |      |
2  |      | 0.52 |
3  | 0.51 |      |
4  |      | 0.63 |
5  | 0.41 |      |
6  |      | 0.66 |
7  |      | 0.46 |
8  |      | 0.44 |
9  | 0.44 |      |
10 |      | 0.55 |
11 | 0.61 |      |
12 |      | 0.63 |
13 | 0.55 |      |
14 | 0.35 |      |
15 | 0.40 |      |
16 | 0.51 |      |
17 | 0.38 |      |
18 | 0.47 |      |
19 | 0.46 |      |
20 | 0.45 |      |
21 |      | 0.44 |
22 |      | 0.51 |
23 |      |      | 0.39
24 |      | 0.61 |
25 |      |      | 0.55
26 |      |      | 0.59
Eigen values | 3.84 | 3.8 | 2.32
Variance | 14.23 | 14.08 | 8.59

This table shows that the measure is saturated by 3 factors:

The first factor is saturated with 13 individual items. These items revolve around the lecturer's ability to manage the classroom interaction. This factor may be defined as teacher's positivity.

The second factor has a saturation of 10 items that revolve around the student's ability to interact with the lecturer on the basis of the lecture theme. This factor may be defined as student's positivity.

The third factor has a saturation of 3 items only. It revolves around the potential of the classroom to facilitate the process of interaction between the student and lecturer. This might be called the potential of the classroom. The factor analysis deleted clause number 27.
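A minimal sketch of this kind of factor extraction (principal components, Kaiser-Guttman retention, Varimax rotation) is shown below on invented data; it illustrates the procedure only, not the paper's actual item responses.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Invented responses: 90 students x 27 clauses.
X = rng.integers(1, 4, size=(90, 27)).astype(float)
X -= X.mean(axis=0)

pca = PCA()
pca.fit(X)
# Kaiser-Guttman criterion: keep components with eigenvalue > 1.
eig = pca.explained_variance_
k = int((eig > 1).sum())
loadings = pca.components_[:k].T * np.sqrt(eig[:k])

def varimax(L, iters=100):
    """Varimax rotation of a loading matrix L (items x factors)."""
    p, m = L.shape
    R = np.eye(m)
    for _ in range(iters):
        LR = L @ R
        u, s, vt = np.linalg.svd(
            L.T @ (LR**3 - LR @ np.diag((LR**2).sum(axis=0)) / p))
        R = u @ vt
    return L @ R

rotated = varimax(loadings)
# Items "saturate" a factor when |loading| >= 0.3, as in Table V.
print((np.abs(rotated) >= 0.3).sum(axis=0))
```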
- Internal Consistency Validity
The correlations between each item and its indicator were calculated. Table VI shows the calculated correlation coefficients. This table indicates that the individual items are correlated to their main factors (1st, 2nd and 3rd), which proves the internal consistency of the measure.

TABLE VI. THE CORRELATION COEFFICIENTS

Factor | Clause No. | ρ | Significant level
1st | 1, 3, 5, 9, 11, 13, 14, 15, 16, 17, 18, 19, 20 | 0.49, 0.61, 0.39, 0.38, 0.45, 0.42, 0.35, 0.61, 0.52, 0.46, 0.59, 0.52, 0.42 | 0.01
2nd | 2, 4, 6, 7, 8, 10, 12, 21, 22, 24 | 0.45, 0.46, 0.44, 0.52, 0.51, 0.59, 0.43, 0.51, 0.60, 0.42 | 0.01
3rd | 23, 25, 26 | 0.45, 0.36, 0.35 | 0.01

- Indicator reliability
The indicator reliability was measured by two methods, as shown in Table VII.

TABLE VII. THE RELIABILITY VALUES OF THE CLASSROOM INTERACTION MEASURE

Dimension | Correlation coefficient ρ | Cronbach's α
Teacher's positivity | 0.79 | 0.80
Student's positivity | 0.77 | 0.75
Potential of the classroom | 0.69 | 0.68
(All significant at the 0.01 level.)

So, the above table shows that the measure of classroom interaction has an acceptable degree of consistency.

C. Testing the study hypotheses

- The first hypothesis
There are statistical differences between the mean scores of the female and male students in logical reasoning in the faculty of information and computer science at Taif University, Saudi Arabia. To verify this hypothesis, the t-test was used to measure the differences between the means of the independent groups. The results are shown in Table VIII.

TABLE VIII. THE T-TEST VALUE FOR THE DIFFERENCES OF MALE AND FEMALE STUDENTS IN LOGICAL REASONING

Gender | Number | Mean | Standard deviation | T | Significant level
Male | 49 | 11.84 | 2.86 | 3.99 | 0.01
Female | 48 | 13.73 | 1.67 | |

The above table shows that there are statistical differences between the mean scores of the males and females in logical reasoning, in favor of females. This result also indicates the superiority of females in the logical reasoning ability to understand the linkage between precondition and conclusion.
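The t statistic in Table VIII can be approximately reproduced from the reported group sizes, means and standard deviations with a two-sample t-test; the sketch below uses scipy's summary-statistics variant. Whether the paper used the pooled or Welch form is not stated, so both are shown.

```python
from scipy.stats import ttest_ind_from_stats

# Group summaries from Table VIII.
m_mean, m_std, m_n = 11.84, 2.86, 49
f_mean, f_std, f_n = 13.73, 1.67, 48

pooled = ttest_ind_from_stats(f_mean, f_std, f_n, m_mean, m_std, m_n,
                              equal_var=True)
welch = ttest_ind_from_stats(f_mean, f_std, f_n, m_mean, m_std, m_n,
                             equal_var=False)
print(round(pooled.statistic, 2), pooled.pvalue)  # t ~ 3.96, close to 3.99
print(round(welch.statistic, 2), welch.pvalue)    # t ~ 3.98
```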
- The second hypothesis
There are statistical differences between the female and male degrees in learning skills. To verify this hypothesis, the multivariate analysis of variance (MANOVA) was used. Both the Box test for homogeneity of the covariance matrix and the Levene test of equal variances were insignificant for all dimensions. The Wilks' Lambda test value is 0.68, which is significant, and the ETA value is 0.32. These results indicate the validity of the test and give an indication of the existence of differences in accordance with the type of learning skills. Table IX shows the results of the analysis of variance test.

TABLE IX. THE ANALYSIS OF VARIANCE OF THE MULTI-VARIABLES (MANOVA) IN LEARNING SKILLS

Type (degrees of freedom = 1 for each dimension):
Dimension | Sum of squares | Mean square | F | Significant level | η
Management of dispersants | 18.05 | 18.045 | 1.46 | Insignificant | 0.02
Management of the study time | 114.19 | 114.186 | 16.2 | 0.01 | 0.15
Summing and taking notes | 106.23 | 106.231 | 23.2 | 0.01 | 0.2
Preparing for examinations | 20.77 | 20.716 | 9.88 | 0.01 | 0.1
Organization of information | 43.08 | 43.079 | 12.9 | 0.01 | 0.12
Continuation of study | 36.03 | 36.029 | 10.5 | 0.01 | 0.10
The use of computer & Internet | 177.97 | 177.968 | 15.5 | 0.01 | 0.14
Total | 3102.3 | 3102.31 | 24.5 | 0.01 | 0.21

Error (degrees of freedom = 95 for each dimension):
Dimension | Sum of squares | Mean square
Management of dispersants | 1177.3 | 12.393
Management of the study time | 669.36 | 7.046
Summing and taking notes | 436.02 | 4.590
Preparing for examinations | 199.30 | 2.098
Organization of information | 317.29 | 3.340
Continuation of study | 325.72 | 3.429
The use of computer & Internet | 1088.9 | 11.462
Total | 12041.8 | 126.76

The above table shows that there are statistical differences between males and females in the learning skills in all dimensions except the first dimension (management of dispersants). To measure the differences, the means and standard deviations were calculated, as shown in Table X.

TABLE X. THE MEAN AND STANDARD DEVIATION OF DEGREES IN LEARNING SKILLS

Dimension | Gender | Number | Mean | Standard deviation
Management of dispersants | M | 49 | 25.408 | 0.503
Management of dispersants | F | 48 | 26.271 | 0.508
Management of study time | M | 49 | 15.163 | 0.379
Management of study time | F | 48 | 17.333 | 0.383
Summing and taking notes | M | 49 | 14.469 | 0.306
Summing and taking notes | F | 48 | 16.563 | 0.309
Preparing for examinations | M | 49 | 10.367 | 0.207
Preparing for examinations | F | 48 | 11.292 | 0.209
Organization of information | M | 49 | 10.980 | 0.261
Organization of information | F | 48 | 12.313 | 0.264
Continuation of study | M | 49 | 9.510 | 0.265
Continuation of study | F | 48 | 10.729 | 0.267
Use of computer & Internet | M | 49 | 12.041 | 0.484
Use of computer & Internet | F | 48 | 14.750 | 0.489
Total | M | 49 | 97.939 | 1.608
Total | F | 48 | 109.250 | 1.625

This table shows that the statistical differences in the learning skills are in favor of females. Females are more likely to use the correct methods of learning, and more able to manage time and plan to take advantage of it. They are more able to take observations, notes and summaries; to prepare well for exams throughout the semester; and to organize information and use it correctly, more than males. Also, they do not delay studying until the end of the year, and they use the computer and the Web to get and exchange information. Females are better in general. The results did not show differences between males and females in management of dispersants: everyone makes an effort to overcome distractions, but what is important is what happens after that.
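A MANOVA of this kind can be run with statsmodels; the sketch below is a generic illustration on invented data (the paper's raw per-student scores are not published), reporting Wilks' Lambda as in the text.

```python
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(1)
n_m, n_f = 49, 48

# Invented per-student scores for two of the learning-skill dimensions.
df = pd.DataFrame({
    "gender": ["M"] * n_m + ["F"] * n_f,
    "time_mgmt": np.r_[rng.normal(15.2, 2.6, n_m), rng.normal(17.3, 2.6, n_f)],
    "note_taking": np.r_[rng.normal(14.5, 2.1, n_m), rng.normal(16.6, 2.1, n_f)],
})

# Multivariate test of the gender effect; the output includes Wilks' lambda.
fit = MANOVA.from_formula("time_mgmt + note_taking ~ gender", data=df)
print(fit.mv_test())
```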
- The third hypothesis
There are statistical differences between the mean scores of the female and male students in achievement motivation. To verify this hypothesis, the multivariate analysis of variance (MANOVA) test was used. The Box test for homogeneity of the covariance matrix was insignificant, and the value of the Levene test of equal variances was also insignificant. The Wilks' Lambda test value is 0.56, which is significant, and the value of ETA is 0.44. All these results indicate the validity of the test and indicate that the differences are affected by the achievement motivation. Table XI shows the results of the analysis of variance test.

TABLE XI. THE ANALYSIS OF VARIANCE OF THE MULTI-VARIABLES (MANOVA) IN ACHIEVEMENT MOTIVATION

Type (degrees of freedom = 1 for each dimension):
Dimension | Sum of squares | Mean square | F | Significant level | η
Challenge | 184.22 | 184.22 | 24.89 | 0.01 | 0.21
Desire to work | 20.28 | 20.28 | 1.46 | Insignificant | 0.02
Ambition | 57.63 | 57.63 | 13.1 | 0.01 | 0.12
Self-reliance | 19.22 | 19.22 | 4.85 | 0.05 | 0.05
Fear of failure | 14.33 | 14.33 | 1.65 | Insignificant | 0.02
Social motivations | 177.97 | 177.97 | 16.5 | 0.01 | 0.15
Awareness of time importance | 155.23 | 155.23 | 29.79 | 0.01 | 0.24
Competition | 7.49 | 7.49 | 0.59 | Insignificant | 0.06
Total | 3890.4 | 3890.4 | 12.37 | 0.01 | 0.12

Error (degrees of freedom = 95 for each dimension):
Dimension | Sum of squares | Mean square
Challenge | 703.22 | 7.4
Desire to work | 1321.8 | 13.91
Ambition | 419.22 | 4.42
Self-reliance | 376.8 | 3.97
Fear of failure | 825.18 | 8.69
Social motivations | 1024.92 | 10.79
Awareness of time importance | 495.1 | 5.21
Competition | 1197.26 | 12.6
Total | 29880.6 | 314.53

The above table shows that there are statistical differences between males and females in achievement motivation in five dimensions. To measure the differences, the means and standard deviations were calculated, as shown in Table XII.

TABLE XII. THE MEAN AND STANDARD DEVIATION OF DEGREES IN ACHIEVEMENT MOTIVATION

Dimension | Gender | Number | Mean | Standard deviation
Challenge | M | 49 | 21.306 | 0.389
Challenge | F | 48 | 24.063 | 0.393
Desire to work | M | 49 | 24.898 | 0.533
Desire to work | F | 48 | 25.812 | 0.538
Ambition | M | 49 | 13.000 | 0.300
Ambition | F | 48 | 14.542 | 0.303
Self-reliance | M | 49 | 12.735 | 0.285
Self-reliance | F | 48 | 13.625 | 0.287
Fear of failure | M | 49 | 17.898 | 0.421
Fear of failure | F | 48 | 18.667 | 0.425
Social motivations | M | 49 | 21.041 | 0.469
Social motivations | F | 48 | 23.750 | 0.474
Awareness of time importance | M | 49 | 18.449 | 0.326
Awareness of time importance | F | 48 | 20.979 | 0.330
Competition | M | 49 | 21.673 | 0.507
Competition | F | 48 | 22.229 | 0.512
Total | M | 49 | 151.000 | 2.534
Total | F | 48 | 163.667 | 2.560

The above table shows that the statistical differences in achievement motivation are in favor of females in all dimensions except the desire to work, the fear of failure, and competition. This means that females have internal incentives which lead them to exert effort, such as the motive of the desire to challenge the male society significantly, as if to prove a kind of self-motivation, ambition and self-reliance. Also, it seems that they needed to change their society's perception that they must rely only on men in everything, and they are driven by external motivations such as the satisfaction of parents, acquiring others' admiration and attracting their attention, and awareness of the importance of time, achieving success in managing it.
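The η column in Tables IX and XI is consistent with (partial) eta squared, η² = SS_effect / (SS_effect + SS_error); a quick check against two rows of Table XI is shown below.

```python
# Eta squared from the sums of squares reported in Table XI.
def eta_squared(ss_effect, ss_error):
    return ss_effect / (ss_effect + ss_error)

# Challenge: SS_type = 184.22 (df 1), SS_error = 703.22 (df 95).
print(round(eta_squared(184.22, 703.22), 2))  # 0.21, as reported
# Awareness of time importance: 155.23 vs 495.1.
print(round(eta_squared(155.23, 495.1), 2))   # 0.24, as reported
```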
THE ANALYSIS OF VARIANCE OF THE MULTI-VARIABLES (MANOVA) IN ACHIEVEMENT MOTIVATION.
Sum of Degrees of Mean squares freedom square challenge 184.22 1 184.22 Desire to work 20.28 1 20.28 ambition 57.63 1 57.63 self-reliance 19.22 1 19.22 fear of failure 14.33 1 14.33 social 177.97 1 177.97 motivations awareness of 155.23 1 155.23 time importance competition 7.49 1 7.49 Total 3890.4 1 3960.4 challenge 703.22 95 7.4 Desire to work 1321.8 95 13.91 ambition 419.22 95 4.42 self-reliance 376.8 95 3.97 fear of failure 825.18 95 8.69 social 1024.92 95 10.79 motivations awareness of 495.1 95 5.21 time importance competition 1197.26 95 12.6 Total 29880.6 95 314.53 Dimensions
THE MEAN AND STANDARD DEVIATION OF DEGREES IN
η
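The figures in Table XI can be checked for internal consistency: with one degree of freedom the mean square equals the sum of squares, F is the ratio of the Type mean square to the Error mean square, the per-dimension η equals SS_Type / (SS_Type + SS_Error), and the reported multivariate eta equals 1 - Wilks Lambda (0.44 = 1 - 0.56). The short Python sketch below, using the challenge row, is our own illustration of this arithmetic, not part of the study's tooling.

    # Consistency check for one row of Table XI (the challenge dimension).
    ss_type, df_type = 184.22, 1        # between-groups (Type) sum of squares
    ss_error, df_error = 703.22, 95     # within-groups (Error) sum of squares

    ms_type = ss_type / df_type         # mean square = SS / df  -> 184.22
    ms_error = ss_error / df_error      # -> about 7.40

    f_value = ms_type / ms_error        # -> about 24.89, as reported
    eta = ss_type / (ss_type + ss_error)  # effect size -> about 0.21

    wilks_lambda = 0.56                 # reported multivariate statistic
    multivariate_eta = 1 - wilks_lambda # -> 0.44, as reported

    print(round(f_value, 2), round(eta, 2), round(multivariate_eta, 2))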
- The fourth hypothesis
There are statistical differences between the female and male degrees in understanding the efficiency of classroom interaction.

To verify this hypothesis, the multivariate analysis of variance (MANOVA) test was used. The Box test for homogeneity of the covariance matrices was insignificant, and the Levene test of equality of variances was also insignificant. The Wilks Lambda value is 0.82, which is significant, and the corresponding eta value is 0.18. All these results indicate the validity of the test and show that the differences are attributable to the type of classroom interaction. The following table shows the results of the analysis of variance test.
TABLE XIII. THE ANALYSIS OF VARIANCE OF THE MULTI-VARIABLES (MANOVA) IN CLASSROOM INTERACTION

Type (df = 1 per dimension):
Dimension | Sum of squares | Mean square | F | Significance level | η
Potential of the classroom | 14.937 | 14.937 | 5.06 | 0.05 | 0.051
Student's positivity | 11.376 | 11.376 | 0.88 | not significant | 0.009
Teacher's positivity | 127.436 | 127.44 | 6.57 | 0.05 | 0.065
Total | 138.786 | 138.79 | 2.38 | not significant | 0.024

Error (df = 95 per dimension):
Dimension | Sum of squares | Mean square
Potential of the classroom | 280.692 | 2.955
Student's positivity | 1229.08 | 12.938
Teacher's positivity | 1842.585 | 19.396
Total | 5537.17 | 58.286

The above table shows that the F values are significant in the dimensions of teacher's positivity and the potential of the classroom. To measure the differences, the mean and standard deviation were calculated, as shown in the following table.

TABLE XIV. THE MEAN AND STANDARD DEVIATION OF DEGREES IN CLASSROOM INTERACTION (males: N = 49; females: N = 48)

Dimension | Mean (M) | SD (M) | Mean (F) | SD (F)
Potential of the classroom | 5.327 | 0.246 | 4.542 | 0.248
Student's positivity | 22.878 | 0.514 | 23.563 | 0.519
Teacher's positivity | 29.959 | 0.629 | 27.667 | 0.636
Total | 58.163 | 1.091 | 55.771 | 1.102

The above table shows statistically significant differences in the dimensions of potential of the classroom and teacher's positivity, in favor of males. This means that males interact better than females in the classroom, particularly in these two dimensions. This can be attributed to the nature of teaching for male students, where the interaction is direct and face to face. In the absence of direct interaction, females feel that the learning environment is not adequate and that the lecturer does not do his utmost in the commentary; hence, from their point of view, the problem does not lie with the women themselves.

- The fifth hypothesis
There are statistical differences between the female and male degrees in learning skills and logical reasoning when the efficiency of classroom interaction is fixed. To verify this hypothesis, the partial correlation coefficient between the degrees in learning skills and logical reasoning was used, with the efficiency of classroom interaction held fixed. The results are shown in the following table.

TABLE XV. THE CORRELATION BETWEEN THE LEARNING SKILLS AND LOGICAL REASONING

Dimension | ρ (M) | Significance (M) | ρ (F) | Significance (F)
Management of dispersants | 0.70 | 0.01 | 0.64 | 0.01
Management of study time | 0.45 | 0.01 | 0.40 | 0.01
Summing and taking notes | 0.26 | not significant | 0.35 | 0.01
Preparing for examinations | 0.23 | not significant | 0.24 | not significant
Organization of information | 0.35 | 0.05 | 0.39 | 0.01
Continuation of study | 0.30 | 0.05 | 0.45 | 0.01
Use of computer & Internet | 0.27 | not significant | 0.31 | 0.05
Total | 0.57 | 0.01 | 0.64 | 0.01

The above table shows the following:

For males: there is a positive correlation between the degree of learning skills and the levels of logical reasoning in the dimensions of management of dispersants, management of study time, organization of information, continuation of study, and the total degree. This means that the more a student is able to focus, manage time, organize information, and study continuously without delay, the higher his expected degree of logical reasoning. This result agrees with the nature of the material, which demands a high degree of focus, organization, and effort, unlike other materials.

For females: there is a significant correlation between the degree of learning skills and the levels of logical reasoning in all dimensions except preparing for examinations. In addition to the dimensions significant for males, correlations appear in the dimensions of summing and taking notes and the use of computers and the Web to access information. Consequently, the logical reasoning degrees of females are affected by the same factors as in the male case, plus these latter two dimensions.
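In both the fifth and the sixth hypotheses, the statistic involved is a first-order partial correlation: the correlation between two scores after the linear influence of the classroom interaction score has been removed. As an illustrative sketch (our own, not the study's code, and with invented input values), it can be computed from the zero-order correlations as follows:

    from math import sqrt

    def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
        """First-order partial correlation r_xy.z: correlation between x and y
        with the (linear) influence of z held fixed."""
        return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

    # Hypothetical zero-order correlations, for illustration only:
    # x = learning skills, y = logical reasoning, z = classroom interaction.
    print(round(partial_corr(r_xy=0.70, r_xz=0.30, r_yz=0.25), 3))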
- The sixth hypothesis
There are statistical differences between female and male degrees in achievement motivation and logical reasoning when the efficiency of classroom interaction is fixed. To verify this hypothesis, the partial correlation coefficient between the degrees of achievement motivation and logical reasoning was used, with classroom interaction held fixed. This was done because of the differences present among achievement motivation, logical reasoning, and understanding of the efficiency of classroom interaction. The results are shown in the following table.
TABLE XVI. THE CORRELATION COEFFICIENT VALUES AND THE SIGNIFICANCE BETWEEN ACHIEVEMENT MOTIVATION AND LOGICAL REASONING

Dimension | ρ (M) | Significance (M) | ρ (F) | Significance (F)
Challenge | 0.52 | 0.01 | 0.66 | 0.01
Desire to work | 0.54 | 0.01 | 0.63 | 0.01
Ambition | 0.40 | 0.01 | 0.52 | 0.01
Self-reliance | 0.41 | 0.01 | 0.75 | 0.01
Fear of failure | 0.44 | 0.01 | 0.55 | 0.01
Social motivations | 0.53 | 0.01 | 0.60 | 0.01
Awareness of time importance | 0.47 | 0.01 | 0.63 | 0.01
Competition | 0.56 | 0.01 | 0.61 | 0.01
Total | 0.61 | 0.01 | 0.79 | 0.01
The above table shows a correlation between the degree of achievement motivation and the levels of logical reasoning for both males and females. This is very important, since the nature of the course demands considerable motivation from students: they need to exert effort regardless of what lies behind that effort. This result agrees with the majority of studies that have established a positive relationship between achievement motivation and achievement.

D. Knowledge Extraction
This section illustrates the students database description, in addition to discussing the study questions.

- Students database description
The student model database used for knowledge extraction is composed of four main predictive measures and one target measure. The first measure is learning skills, which includes 7 attributes: management of dispersants, management of study time, summing and taking notes, preparing for examinations, organization of information, continuation of study, and use of computer & Internet. The second measure is achievement motivation, which is divided into internal and external motivations. Internal motivation includes 4 attributes: challenge, desire to work, ambition, and self-reliance. External motivation includes 4 attributes: fear of failure, social motivations, awareness of time importance, and competition. The third measure is classroom interaction, which includes potential of the classroom, student's positivity, and teacher's positivity. The final measure is the student's score in the expert system course, which is divided into 5 test units. The target measure is logical reasoning.
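For concreteness, the layout just described can be summarized schematically as follows (a sketch of the database description above; the snake_case field names are our own shorthand, not the study's actual column names):

    # Schematic layout of the student model database (names are illustrative).
    STUDENT_MODEL = {
        "learning_skills": [          # first predictive measure (7 attributes)
            "management_of_dispersants", "management_of_study_time",
            "summing_and_taking_notes", "preparing_for_examinations",
            "organization_of_information", "continuation_of_study",
            "use_of_computer_and_internet",
        ],
        "achievement_motivation": {   # second predictive measure (4 + 4 attributes)
            "internal": ["challenge", "desire_to_work", "ambition", "self_reliance"],
            "external": ["fear_of_failure", "social_motivations",
                         "awareness_of_time_importance", "competition"],
        },
        "classroom_interaction": [    # third predictive measure (3 attributes)
            "potential_of_the_classroom", "students_positivity", "teachers_positivity",
        ],
        "course_score": ["unit_1", "unit_2", "unit_3", "unit_4", "unit_5"],  # final measure
        "target": "logical_reasoning",
    }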
- The first question: Can we provide a machine learning algorithm to extract useful knowledge from the available students data?
Data mining (DM), or in other words "the extraction of hidden predictive information from data", is a powerful technology with great potential to help users focus on the most important information in large data sets. The general goal of DM is to discover knowledge that is not only correct, but also comprehensible and interesting for the user. Among the various DM tasks, such as clustering, association rule finding, and data generalization and summarization, classification is gaining significant attention [15]. Classification is the process of finding a set of models or functions which describe and distinguish data classes or concepts, for the purpose of using the model to predict the class of objects whose class label is unknown. In classification, discovered knowledge is generally represented in the form of IF-THEN rules. Classification methods can be categorized into two groups, non-rule-based and rule-based [16]. Non-rule-based methods include artificial neural networks (ANN) [17-18] and support vector machines [19]; rule-based methods include C4.5 [20] and decision tables [21]. Rule-based classification methods directly extract hidden knowledge from the data; however, non-rule-based methods are generally more accurate. This section presents the proposed algorithm for extracting a set of accurate and comprehensible rules from the input database via a trained ANN using a genetic algorithm (GA). The details of the proposed algorithm are explained in previous work [22]. A concise version of the algorithm is given in the following steps:

1. Assume that:
1.1 The input database has N predictive attributes plus one target attribute.
1.2 Each predictive attribute has a number of values and can be encoded into a binary sub-string of fixed length.
1.3 Each element of a binary sub-string equals one if its corresponding attribute value exists, while all the other elements equal zero.
1.4 Steps (1.2) and (1.3) are repeated for each predictive attribute, in order to construct the encoded input attribute vectors.
1.5 The target attribute has a number of different classes and can be encoded as a bit vector of fixed length, as explained in step (1.3).
2. The ANN is trained on the encoded vectors of the input attributes and the corresponding vectors of the output classes until the required convergence between the actual and the desired output is achieved.
3. The exponential function of each output node of the ANN can then be constructed as a function of the values of the input attributes and the extracted weights between the layers.
4. To find the rule belonging to a certain class, the GA is used to find the optimal chromosome (values of input attributes) which maximizes the output function of the corresponding node (class) of the ANN.
5. The extracted chromosome must be decoded to find the corresponding rule, as follows:
5.1 The optimal chromosome is divided into N segments.
5.2 Each segment represents one attribute and has a corresponding number of bits representing its values.
5.3 An attribute value is present if the corresponding bit in the optimal chromosome equals one, and absent otherwise.
5.4 The operators "OR" and "AND" are used to correlate the existing values of the same attribute and of different attributes, respectively.
5.5 The extracted rules must be refined to remove redundant attributes.
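To make steps 1 and 5 concrete, the toy sketch below shows the bit-vector encoding of attribute values and the decoding of an optimal chromosome back into an IF-THEN rule. It is illustrative only: the two-attribute set is invented, the GA search of step 4 is taken as given, and the helper names are our own.

    # Illustrative encoding (steps 1.2-1.4) and decoding (steps 5.1-5.5).
    ATTRIBUTES = {                    # toy predictive attributes and their values
        "Gender": ["Ma", "Fe"],
        "Ambition": ["L", "M", "H"],
    }

    def encode(sample: dict) -> list:
        """One bit per attribute value; a bit is 1 iff that value occurs."""
        bits = []
        for attr, values in ATTRIBUTES.items():
            bits += [1 if sample[attr] == v else 0 for v in values]
        return bits

    def decode(chromosome: list) -> str:
        """Split the chromosome into one segment per attribute, OR the set
        bits within a segment, AND across attributes, drop empty segments."""
        clauses, i = [], 0
        for attr, values in ATTRIBUTES.items():
            segment = chromosome[i:i + len(values)]
            i += len(values)
            present = [v for v, bit in zip(values, segment) if bit == 1]
            if present:               # refinement: skip redundant attributes
                clauses.append("(" + " OR ".join(f"{attr} = {v}" for v in present) + ")")
        return "IF " + " AND ".join(clauses)

    print(encode({"Gender": "Fe", "Ambition": "L"}))  # -> [0, 1, 1, 0, 0]
    print(decode([0, 1, 1, 0, 0]))    # -> IF (Gender = Fe) AND (Ambition = L)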
- The second question: Can the extracted knowledge from the students data discriminate between the male and female students in the logical reasoning score?
This question is dealt with through the following extracted rules and their interpretations. Assume the following abbreviations: F: Fail, P: Pass, G: Good, V.G: Very Good, L: Low, M: Medium, H: High, Ma: Male, Fe: Female.

1. If Unit 1 = F, then Reasoning = F.
2. If Unit 2 = F, then Reasoning = F.
3. If Unit 5 = F, then Reasoning = F.
4. If Unit 1 = V.G or Unit 2 = V.G, and Maintaining learning = H, then Reasoning = V.G.
5. If Unit 1 = V.G or Unit 2 = V.G, and Fear of failure = H, then Reasoning = V.G.

From the above rules one can conclude that units number 1, 2, and 5 are the most effective attributes in the final results. This is because they cover, respectively, the principles, inductive reasoning, and object-oriented programming in CLIPS.

6. If Unit 1 = P, and Unit 3 = F or Unit 4 = F, then Reasoning = P.
7. If Ambition = H, and Unit 3 = F or Unit 4 = F, then Reasoning = P.
8. If Self-reliance = M, and Unit 3 = F or Unit 4 = F, then Reasoning = P.

Rules 6, 7, and 8 indicate that units number 3 and 4 are not effective: high ambition and medium self-reliance lead to a pass in reasoning despite a failing score in unit 3 or unit 4.

9. If Gender = Ma and Ambition = L, then Reasoning = F.
10. If Gender = Fe and Ambition = L, then Reasoning = P.
11. If Gender = Ma and Management of dispersants = L, then Reasoning = F.
12. If Gender = Fe and Management of dispersants = L, then Reasoning = P.
13. If Gender = Ma and Self-reliance = L, then Reasoning = F.
14. If Gender = Fe and Self-reliance = L, then Reasoning = P.
15. If Gender = Fe and Desire to work = M and Organization of information = H, then Reasoning = G.
16. If Gender = Fe and Fear of failure = M and Self-reliance = H, then Reasoning = G.
17. If Gender = Fe and Organization of information = H and Maintaining learning = M, then Reasoning = G.
18. If Gender = Fe and Potential of the classroom = L and Unit 2 = P, then Reasoning = G.
19. If Gender = Fe and Time management = H and Organization of information = H and Unit 3 = V.G, then Reasoning = V.G.

The previous rules clarify the effect of the attributes on the reasoning results, taking into consideration the effect of the gender attribute.

V. CONCLUSIONS
Our intent has been to explore how data mining can be used in education services at Taif University in Saudi Arabia. Educational data mining is the process of converting raw data from educational systems into useful information that can be used to inform design decisions and answer research questions.

The importance of the study can be stated as follows. It deals with the learning environment of Saudi Arabia, which has a special nature in the education of females and the factors affecting it. The study combines variables related to personality, mental, and environmental aspects in order to reach an integrated view of the nature of the learning process and the factors affecting it. It addresses the subject of study habits and achievement motivation, which are important issues affecting the educational process: good study habits help students in the acquisition of knowledge, while achievement motivation pushes them to challenge the obstacles standing between them and their goals. Finally, the study presents an efficient technique that utilizes an artificial neural network and a genetic algorithm for extracting comprehensible rules from a student database.
The extracted knowledge highlights the attributes that are most influential on the final score of logical reasoning.

VI. FUTURE WORKS
E-learning represents a great challenge in education, in that large amounts of information are continuously generated and made available. Using data mining to extract knowledge from this information is the best approach to processing it in order to identify student needs. Tracking student behavior in a virtual e-learning environment makes web mining of the resulting databases possible, which encourages educationalists and curricula designers to create learning contents. We therefore aim to introduce a novel rule extraction method that depends on the Fuzzy Inductive Reasoning methodology, driven by a data set obtained from a virtual campus e-learning environment. To gain the best benefit from this knowledge, the results should be described in terms of a set of logical rules that trace the different levels of student performance.

References
[1] Kudret Ozkal, Ceren Tekkaya, Jale Cakiroglu, Semra Sungur, "A conceptual model of relationships among constructivist learning environment perceptions, epistemological beliefs, and learning approaches", Learning and Individual Differences, Volume 19, Issue 1, Pages 71-79, 1st Quarter 2009.
[2] Shun Lau, Youyan Nie, "Interplay Between Personal Goals and Classroom Goal Structures in Predicting Student Outcomes: A Multilevel Analysis of Person-Context Interactions", Journal of Educational Psychology, Volume 100, Issue 1, Pages 15-29, February 2008.
[3] Karin Tweddell Levinsen, "Qualifying online teachers - Communicative skills and their impact on e-learning quality", Education and Information Technologies, Volume 12, Number 1, March 2007.
[4] Richards, L. G., "Further studies of study habits and study skills", Frontiers in Education Conference, 31st Annual, Volume 3, Pages S3D-S13, 10-13 Oct. 2001.
[5] Nneji, L. M., "Study habits of Nigerian University Students", Nigerian Educational Research and Development Council, Abuja, Nigeria, Pages 490-495, 2002.
[6] Okapala, A., Okapala, C., Ellis, R., "Academic efforts and study habits among students in a principles of macroeconomics course", Journal of Education for Business, 75(4), Pages 219-224, 2000.
[7] Marcus Credé, Nathan R. Kuncel, "Study Habits, Skills, and Attitudes: The Third Pillar Supporting Collegiate Academic Performance", Perspectives on Psychological Science, Volume 3, Issue 6, Pages 425-453, November 2008.
[8] Weiqiao Fan, Li-Fang Zhang, "Are achievement motivation and thinking styles related? A visit among Chinese university students", Learning and Individual Differences, Volume 19, Issue 2, Pages 299-303, June 2009.
[9] Ricarda Steinmayr, Birgit Spinath, "The importance of motivation as a predictor of school achievement", Learning and Individual Differences, Volume 19, Issue 1, Pages 80-90, 1st Quarter 2009.
[10] Yuichi Goto, Takahiro Koh, Jingde Cheng, "A General Forward Reasoning Algorithm for Various Logic Systems with Different Formalizations", 12th International Conference on Knowledge-Based Intelligent Information & Engineering Systems, Proceedings Part II, Pages 526-535, September 3-5, 2008.
[11] Wigfield, A., & Eccles, J. S., "Development of achievement motivation", San Diego: Academic Press, 2002.
[12] David C. McClelland, "Methods of Measuring Human Motivation", in John W. Atkinson, ed., Motives in Fantasy, Action and Society, Princeton, N.J.: D. Van Nostrand, Pages 12-13, 1958.
[13] Timothy W. Pelton & Leslee Francis Pelton, "The Classroom Interaction System (CIS): Neo-Slates for the Classroom", in W.-M. Roth (ed.), CONNECTIONS '03, Pages 101-110, 2003.
[14] Joseph C. Giarratano, "CLIPS User's Guide", Version 6.2, March 31st, 2002.
[15] Li Liu, Murat Kantarcioglu, Bhavani Thuraisingham, "The applicability of the perturbation based privacy preserving data mining for real-world data", Data & Knowledge Engineering, Volume 65, Issue 1, Pages 5-21, April 2008.
[16] Tan, C., Yu, Q., & Ang, J. H., "A dual-objective evolutionary algorithm for rules extraction in data mining", Computational Optimization and Applications, 34, Pages 273-294, 2006.
[17] Humar Kahramanli, Novruz Allahverdi, "Rule extraction from trained adaptive neural networks using artificial immune systems", Expert Systems with Applications, 36, Pages 1513-1522, 2009.
[18] Richi Nayak, "Generating rules with predicates, terms and variables from the pruned neural networks", Neural Networks, 22, Pages 405-414, 2009.
[19] J. L. Castro, L. D. Flores-Hidalgo, C. J. Mantas, J. M. Puche, "Extraction of fuzzy rules from support vector machines", Fuzzy Sets and Systems, Volume 158, Issue 18, Pages 2057-2077, 16 September 2007.
[20] Kemal Polat, Salih Güneş, "A novel hybrid intelligent method based on C4.5 decision tree classifier and one-against-all approach for multi-class classification problems", Expert Systems with Applications, Volume 36, Issue 2, Part 1, Pages 1587-1592, March 2009.
[21] Yuhua Qian, Jiye Liang, Chuangyin Dang, "Converse approximation and rule extraction from decision tables in rough set theory", Computers & Mathematics with Applications, Volume 55, Issue 8, Pages 1754-1765, April 2008.
[22] A. Ebrahim ELAlfi, M. Esmail ELAlami, R. Haque, "Extracting Rules From Trained Neural Network Using GA For Managing E-Business", Applied Soft Computing, 4, Pages 65-77, 2004.
A Mirroring Theorem and its Application to a New Method of Unsupervised Hierarchical Pattern Classification

Dasika Ratna Deepthi
Department of Computer Science, Aurora's Engineering College, Bhongir, Nalgonda Dist., A.P., India

K. Eswaran
Department of Computer Science, Srinidhi Institute of Science and Technology, Yamnampet, Ghatkesar, Hyderabad, India
Abstract— In this paper, we prove a crucial theorem called the "Mirroring Theorem", which affirms that, given a collection of samples with enough information in it that it can be classified into classes and subclasses, then (i) there exists a mapping which classifies and subclassifies these samples, and (ii) there exists a hierarchical classifier, constructed from Mirroring Neural Networks (MNNs) in combination with a clustering algorithm, that can approximate this mapping. Thus, the proof of the Mirroring Theorem provides a theoretical basis for the existence, and the practical feasibility of constructing, hierarchical classifiers, given such maps. Our proposed Mirroring Theorem can also be considered an extension of Kolmogorov's theorem, in that it provides a realistic solution for unsupervised classification. The techniques we develop are general in nature and have led to the construction of learning machines which are (i) tree-like in structure, (ii) modular, (iii) with each module running on a common algorithm (the Tandem Algorithm), and (iv) self-supervised. We have actually built the architecture, developed the Tandem Algorithm of such a hierarchical classifier, and demonstrated it on an example problem.

Keywords- Hierarchical Unsupervised Pattern Recognition; Mirroring Theorem; classifier; Mirroring Neural Networks; feature extraction; Tandem Algorithm; self-supervised learning.

I. INTRODUCTION
There have been various ways in which the fields of artificial intelligence and machine learning have been furthered: starting with experimentation [1], abstraction [2], [3] and the study of locomotion [4]. Many techniques have been developed to learn patterns [5] & [6] as well as to reduce large dimensional data [7] & [8] so that relevant information can be used for the classification of patterns [9] & [10]. Investigators have tackled, to varying degrees of success, pattern recognition problems like face detection [11], gender classification [12], human expression recognition [13], object learning [14] & [15] and unsupervised learning of new tasks [16], and have also studied complex neuronal properties of higher cortical areas [17], to name but a few. However, most of the above techniques did not employ automatic feature extraction as a pre-processing step to pattern classification. In our approach, we developed a self-learning machine (based on our proposed Mirroring Theorem) which performs feature extraction and pattern learning simultaneously to recognize/classify the patterns in an unsupervised mode. This automatic feature extraction step, prior to unsupervised classification, fulfills one more crucial requirement: dimensionality reduction. Furthermore, by proving our stated Mirroring Theorem, we demonstrate that such unsupervised hierarchical classifiers mathematically exist. It is also proved that hierarchical classifiers that perform a level-by-level unsupervised classification can be approximated by a network of "nodes" forming a tree-like architecture. What we term a "node" in this architecture is an abstraction of an entity which executes two procedures: the "Mirroring Neural Network" (MNN) algorithm coupled with a clustering algorithm. The MNN performs automatic data reduction and feature extraction (see [18] for more details on the MNN) and clustering performs the subsequent step, the unsupervised classification of the features extracted by the MNN; these two steps are performed in tandem, hence our term Tandem Algorithm. The Mirroring Theorem provides a proof that this technique will always work, provided sufficient information is contained in the ensemble of samples for it to be classified and subclassified, and certain continuity conditions on the mappings are satisfied. The Mirroring Theorem we prove in this paper may be considered an extension of Kolmogorov's theorem [19] in that it provides a practical method for unsupervised classification. The details of the theorem and how it can be used for developing an unsupervised hierarchical classifier are discussed in the next sections.

Our main contribution in this paper is that we propose and prove a theorem called the "Mirroring Theorem", which provides a mathematical basis for constructing a new kind of architecture that performs an unsupervised hierarchical classification of its inputs by implementing a single common algorithm (which we call the "Tandem Algorithm"); this is demonstrated on an example set of image patterns. That is, the proposed hierarchical classifier is mathematically proved to exist, and for it we develop a new common algorithm that performs the two machine steps, namely automatic feature extraction and clustering, to execute a level-by-level unsupervised
classification of the given inputs. Hence, we can say that this paper proposes a new method of building a hierarchical classifier (with a mathematical basis), a new kind of common algorithm which is implemented throughout the hierarchy of the classifier, and its demonstration on an example problem.
We find it necessary to discuss a few points about the MNN before moving on to the details of the proposed theorem and the Tandem Algorithm. An MNN is simply a neural network (NN) with a converging-diverging architecture which is trained to produce an output that equals its input as closely as possible (i.e., to mirror the input at its output layer). The training process proceeds through repeated presentations of all the input samples and stops when the MNN can mirror at least 95% of its input samples; the MNN is then said to be successfully trained on the given input samples. At that point, the best extracted features of the inputs are obtained automatically at the MNN's least-dimensional hidden layer, and these features are used for unsupervised input classification by a clustering algorithm. See Figure 1 for an illustration of an MNN architecture in which the input 'X' of dimension 'n' is reduced to 'Y' of dimension 'm' (m much less than n). Since Y is capable of mirroring X at the output, Y contains as much information as X even though it has a lower dimension; the components of Y can therefore be thought of as features that capture the patterns in X, and hence Y can be used for classification. More details on the MNN architecture can be found in [20] & [21].
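In modern terminology, an MNN is structurally an autoencoder. The minimal sketch below (our own illustration, not the authors' implementation) shows the converging-diverging mapping: an input X of dimension n is encoded to Y of dimension m and decoded back to an approximation of X, with Y serving as the reduced feature vector.

    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 8, 3                        # input dimension n, bottleneck dimension m

    # Converging (n -> m) and diverging (m -> n) halves of the 'mirror'.
    W_enc = rng.normal(0.0, 0.25, (m, n))
    W_dec = rng.normal(0.0, 0.25, (n, m))
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    def mirror(x):
        y = sigmoid(W_enc @ x)         # reduced feature vector Y (dimension m)
        x_hat = sigmoid(W_dec @ y)     # attempted reconstruction of X
        return y, x_hat

    # Training by back-propagation on the reconstruction error ||x - x_hat||^2
    # would adjust W_enc and W_dec until x_hat mirrors x for ~95% of samples;
    # the Y vectors are then handed to the clustering step.
    x = rng.random(n)
    y, x_hat = mirror(x)
    print(y.shape, x_hat.shape)        # (3,) (8,)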
Before proceeding to the proof of the main theorem and the presentation of the actual computer simulation, it is perhaps appropriate to write a few lines on the ideas that motivated this paper.

It is presently well known that the neural architecture in the human neocortex is hierarchical [22], [23], [24] & [25] and constituted by neurons at different levels, with information exchanged between these levels via these neurons [26], [27], [28] & [29] when initiated by the receipt of data coming in from sensory receptors in the eyes, ears, etc. The organization of the various regions within each level of the neo-cortical system is not completely understood, but there is much evidence that regions of neurons in one level are connected with regions of neurons in another level, thus forming many tree-like structures [25] & [30] (also see [31]). Various intelligent tasks, for example "image recognition", are performed by numerous neurons firing and sending signals back and forth across these levels [32]. Many researchers working in the field of artificial intelligence have sought to imitate the human brain in their attempts to build learning machines [33] & [34] and have employed a tree-like architecture at different levels for performing recognition tasks [35]. As described above, our attempt here is to demonstrate that a hierarchical classifier which addresses the tasks of feature extraction (/data reduction) and recognition can be constructed, and that such an architecture can perform intelligent recognition tasks in an unsupervised manner.

The plan of the paper is as follows. In the next section we prove the proposed Mirroring Theorem of pattern recognition. In section 3, based on the proof of the Mirroring Theorem, we show how to build pattern classifiers which possess the ability to automatically extract features, have a tree-like architecture, and can be used to develop the proposed architecture for unsupervised pattern learning and classification (including the proposed Tandem Algorithm). In section 4, we report the results of the demonstration of such a classifier when applied to an unsupervised pattern recognition problem wherein real images of faces, flowers and furniture are automatically classified and sub-classified in an unsupervised manner. In section 5, we discuss the future possibilities of this kind of architecture.

II. MIRRORING THEOREM
We now prove what we term the Mirroring Theorem of pattern recognition.

Statement of the Theorem: If a hierarchical invertible map exists that (i) maps a set of n-dimensional data from X-space into m-dimensional data in Y-space (m ≤ n) which fall into j distinct classes, and (ii) if for each of the above j classes, in turn, maps exist which take each class in Y-space to an r-dimensional Z-space, into t subclasses, then such a map can be approximated by a set of j + 1 nodes (each of which is an MNN with an associated clustering algorithm) forming a tree-like structure.

Proof: The very fact that invertible maps exist indicates that there exist j 'vector' functions which map the points (x1, x2, x3,…,xn) into some d different regions Rd in Y-space. These 'vector' functions may be denoted F1, F2,…, Fj. We clarify our notation here by cautioning that F1, F2,…, Fj should not be treated as the vectoral components of F. What we mean by F1 is the collection of maps that carry points in X-space to the set of points contained in S1; hence F1 can be thought of as a collection of 'rays', where each 'ray' denotes a map starting from some point in X-space and culminating in a point belonging to S1 in Y-space. Similarly, F2 is the collection of 'rays' each of which starts from some point in X-space and culminates in a point belonging to S2 in Y-space. Thus we define the map F as F ≡ F1 ∪ F2 ∪ … ∪ Fj.

Now we argue as follows: since the first map F1 takes X-space into an image in Y-space, say a set of points S1, and similarly F2 takes X-space into an image in Y-space, say a set of points S2, and so on to the set Sj, and since, by assumption, the target (image) region in Y-space contains j distinct classes, we can conclude that the sets of points S1, S2, …, Sj are distinct and non-overlapping (for otherwise the assumption of there being j distinct classes is violated). These regions are separable and distinct from one another, the maps are all distinct, and we can renumber the regions Rd in such a manner that the union of the first k1 sets belongs to S1, i.e., S1 = R1 ∪ R2 ∪ … ∪ Rk1, the union of the next k2 sets belongs to S2, i.e., S2 = Rk1+1 ∪ Rk1+2 ∪ … ∪ Rk1+k2, etc., till Sj = Rd-kj+1 ∪ Rd-kj+2 ∪ … ∪ Rd. It also implies, since each
of the image sets can again be reclassified into t patterns (by assumption), that there exists a 'vector' function set G11, G12, G13, …, G1t which takes S1 to t distinct regions in Z-space; these t distinct sets are denoted c11, c12, c13, …, c1t. Again, G11, G12, G13, …, G1t can be thought of as collections of 'rays' denoting maps from points in S1 in Y-space to points in the sets c11, c12, c13, …, c1t in Z-space. Thus G11 is the collection of 'rays' leading to the set c11 from S1, and G12 is the collection of 'rays' from S1 to the set c12. Hence, similar to the definition of F, we can denote the map G1 as G1 ≡ G11 ∪ G12 ∪ G13 ∪ … ∪ G1t. In order not to clutter the diagram, the possible sub-regions within each of the sets c11, c12, c13, …, c1t have not been drawn in Figure 2, and we assume, without prejudicing the theorem's generality, that the number of subsets t is the same in all maps.

Similarly, G21, G22, G23, …, G2t take S2 to t distinct sets c21, c22, c23, …, c2t, and so on to the function set Gj1, Gj2, Gj3, …, Gjt which maps Sj to the respective sets cj1, cj2, cj3, …, cjt in Z-space.

The existence of the function maps F1, F2,…, Fj, which map points from the set in the n-dimensional space to j distinct classes, implies that the points are separable into j distinct classes in Y-space, which is of m dimensions. (Strictly speaking, it is necessary to assume that these functions Fi, i = 1, 2, …, j, have the property of being invertible (i.e., are bijective) in order to exclude many-to-one maps; this property is also necessary to prove that a function such as F can be approximated by an MNN along with a clustering algorithm. Further, it is implicitly assumed that all maps considered in this theorem are at least continuous up to first order: points which are close to one another are expected to have images close to one another.)

To proceed with the proof, we will first show that it is possible to approximate the set of maps that take X-space to Y-space by a single MNN. To do this we will show that an MNN can explicitly be constructed and trained to perform this map. We will assume that sufficient samples are available for training. Now consider the MNN to have a converging set of layers of adalines (PEs); the converging set consists of 'n' inputs in the first layer and ends with a layer of 'm' adalines, as shown in Figure 1(a).

This set from 'n' to 'm' can be thought of as a simple NN which can be trained such that if X = (x1, x2, x3,…,xn) is the input then Y = (y1, y2, y3,…,ym) is the output; the weights of this network can be obtained by using the back-propagation algorithm to impose the following condition on the output: Y = Fk(x1, x2, x3,…,xn), where k is the class to which the input vector (x1, x2, x3,…,xn) belongs; obviously k is known beforehand because the F's are known. Thus we can train this converging part of the NN. Similarly, we can assume that there exists a diverging NN (depicted in Figure 1(b)) starting from 'm' adalines and ending in 'n' adalines; for this second neural network we assume that the input is the set (y1, y2, y3,…,ym) and the output is the original point in the n-dimensional X-space whose image is Y. By imposing the latter condition, the second (diverging) neural network can be trained with sufficient samples and the weights obtained. At this stage we have a diverging neural network which takes as input Y and outputs the corresponding X. Now, by combining the converging and diverging networks so that the first leads into the second (without changing the weights), we have nothing but an MNN (pictorially represented in Figure 1(c)); this MNN mirrors the input vector X and outputs Y from its middle layer of 'm' adalines. We have thus proved the existence of an MNN which maps points from the n-dimensional X-space to the m-dimensional Y-space and then back to the original points in the n-dimensional X-space. The points in Y-space can then be classified into j classes using a suitable clustering algorithm. Thus, we have proved that a node of the hierarchical classifier is approximated by the combination of an MNN and a clustering algorithm.

The proof that the second set of maps, from m-space to r-space, exists uses similar arguments. We can immediately see that there will be j such maps because there are j classes in Y-space; hence there will be j nodes, one for each class, each constructed by a similar procedure. So we see that the set of maps assumed can be approximated by j + 1 nodes, whose existence we have just proved, all of which form a tree-like structure, shown in Figure 3. QED. It may be noted that each node in our architecture is depicted in Figure 4.

We will now illustrate the use of the Mirroring Theorem to develop a hierarchical classifier which performs the task of unsupervised pattern recognition.

III. UNSUPERVISED HIERARCHICAL PATTERN RECOGNITION
This section describes the architecture of a self-learning engine; in the next section, we report its application to an example problem wherein a set of input images is automatically classified and then sub-classified.

Our intent is to build a learning engine which has the following characteristics: it is (i) hierarchical, (ii) modular, (iii) unsupervised, and (iv) runs on a single common algorithm (an MNN associated with clustering). The advantage of developing a recognition system with these 4 characteristics is that the learning method does not depend on the problem size, and the learning network can simply be extended as the recognition task becomes more complex. It has been surmised by investigators that the architecture of the human neo-cortex does, loosely speaking, possess the above 4 characteristics (except that instead of (iv) there is some kind of analog classification process performed by sets of neurons, which seemingly behave in a similar manner). We are also reinforced by the conviction that, since our architecture imitates the neural architecture (though admittedly in a crude manner), it is reasonable to expect that we will meet with greater successes as we make improvements in our techniques and as we deal with problems of larger size using increasingly powerful computers. In fact, it is this prospect that has been the prime motive force behind our work.
Figure 1. (a) Converging NN (b) Diverging NN (c) Mirroring Neural Network (combining (a) and (b)). [Figure: layers of adalines (PEs) mapping the input X(x1, x2, …, xn) to Y(y1, y2, …, ym) and back to X.]
Figure 2. An Illustration of the hierarchical invertible map
Figure 3. Organized collection of Nodes (blocks) containing MNN’s and their corresponding Forgy’s algorithm – Forming a treelike hierarchical structure
Figure 4. A Node (block) of the Hierarchical Classifier constructed with MNN and Forgy’s clustering
The Tandem Algorithm, which we devised for pattern recognition tasks, uses the hierarchical tree-like architecture depicted in Figure 3. It may be noted that each block in the hierarchical architecture is trained through the implementation of a single common algorithm (the Tandem Algorithm). This tandem process is done (at each node) in two steps: the first step is data reduction and feature extraction by an MNN, and the second step is the classification of the extracted features (the outputs of the MNN) using a clustering algorithm. The MNN at the first level trains itself to extract the features through repeated presentations of the data samples, after which the extracted features of the data are sent to the clustering procedure for classification. The modules at the second level again undergo this tandem process of feature extraction and classification (/sub-classification). In this way a single common algorithm is implemented throughout the hierarchy (at each module), resulting in a level-by-level unsupervised classification. In section 4, we show that our method actually works: we apply our classifier to a collage of images of flowers, faces and furniture, and this collection is automatically classified and sub-classified.

We will now develop the Tandem Algorithm and actually implement it, by writing a computer program through which such a learning engine can classify the patterns by itself, and report the results. The technique used for the development of this algorithm is based upon the application of two procedures, (i) mirroring for automatic feature extraction and (ii) unsupervised classification, in a tandem manner, as described by the following algorithm, at each module (block) of the hierarchy, level by level. (In our computer program we have used it on a two-level hierarchy.) Continuing the discussion of the Tandem Algorithm, consider Figure 3, which is a pictorial representation of the hierarchical architecture; the details of each block or node are shown in Figure 4 and the structure of an MNN in Figure 1. The Tandem Algorithm proceeds block (node) by block (node) at each level, starting from the first level (see Figure 3).

The Tandem Algorithm for a hierarchical classifier:
1. Train the MNN of the topmost block, i.e. Node-M (of the hierarchy, see Figure 3), with an ensemble of samples such that the inputs given to the MNN are mirrored at its output with minimum loss of information (for which a suitable threshold is fixed), and mark the topmost node as the "present node". This is an iterative step, which stops when the output of Figure 1(c) almost equals the input, i.e., when the network is able to reconstruct the input.
2. After due training of the MNN of the present node (i.e., when the MNN can accurately reconstruct above 95% of its inputs within the specified threshold limit), the collection of outputs of the MNN's least-dimensional hidden layer (the number of extracted features equals the dimension of Y of the MNN, see Figure 1(c)) is taken for classifying the input of the present node.
3. The features extracted in step 2 are given as the "input data set" to the Forgy's clustering algorithm (subroutine) of the present node for unsupervised classification, explained in step 4.
4. The steps involved in the clustering procedure are:
a. Select initial seed points (as many in number as the number of classes the inputs are to be divided into) from the "input data set".
b. For all the data samples in the input data set, repeat this step:
(i) Calculate the distance between each sample in the input data set and each of the seed points representing a cluster.
(ii) Place the input data sample into the group associated with the closest seed point (the least of the distances in step 4b(i)).
c. When all the samples are clustered, the centroids of each of the clusters are taken as the new seed points.
d. Repeat steps 4b and 4c as long as data samples leave one cluster to join another in step 4b(ii).
5. To proceed further with sub-classification, repeat step 6 for all the nodes in the next level down in the hierarchy.
6. Mark one of these nodes as the "present node" and train the MNN of the present node with the samples (the extracted features from the level immediately above) belonging to a particular cluster of step 4, such that the samples given to the MNN are mirrored at its output with minimum loss of information (for which a threshold is fixed by trial and error). Repeat steps 2, 3 and 4.
7. Repeat steps 5 and 6 until there is not enough data present in the samples to be further sub-classified (at the level immediately below).

In this Tandem Algorithm, the feature extraction (concurrent with data reduction) is carried out by steps 1, 2 and 3, and the automatic data classification (based on the reduced units of data) by step 4. This tandem process of data reduction and classification is extended to the next lower levels of the hierarchy for sub-classifying the ensemble, through steps 5 and 6, till the stated condition is met in step 7. More details on the MNN architecture and the MNN's training through self-learning are given in [20] & [21].

We now illustrate this concept of hierarchical architecture for unsupervised classification using Figure 3. If we assume, for the purpose of illustration, that there are only 4 categories of images, say faces, flowers, furniture and trees (j = 4), then at its broadest level the MNN-M at Node-M is trained with these 4 categories of images. On successful training, MNN-M can reduce the amount of its input data, and based on the reduced units of data, Node-M categorizes the pattern into one of the classes using Forgy's algorithm. The reduced units (which represent the input data) of the pattern from the present node (Node-M) are fed to one of the next-level (Level II) nodes. (Alternatively, the input vector could be fed to the appropriate MNN at the next level (Level II) instead of the reduced vector, in cases where too large an amount of data reduction at the present level (Level I) is expected to lose information required for the finer classification at Level II.) The selection of a node (module) from the next level depends upon the classification of the input pattern at the present level. For example, Node-1 is selected if Node-M classifies the input as a face; Node-2 is selected if Node-M classifies the same input as a flower; and so on for Node-3 (furniture) or Node-4 (tree). Then, the respective node (module) at Level II reduces its input and does a sub-classification (we denote it as Level II classification) based only on its reduced units (at Level II). The gender classification which distinguishes a male face from a female face is a typical Level II classification by Node-1. In the pictorial representation, the Level II classification contains 't' subcategories in each of the j categories. Assuming that there are some more lower levels (identical to Level I and/or Level II) containing nodes to further classify the patterns, the reduced units at Level II are, for instance, given as input to one of the appropriate modules at Level III for more detailed classification, which, as an example, sub-categorizes 'k' different persons within the male/female face group. This tandem procedure of (i) mirroring followed by (ii) classification, performed at each level, can be extended to other lower levels, say levels IV, V and so on. That is how the proposed architecture does level-by-level unsupervised pattern learning and recognition.

As explained earlier, the hierarchical architecture implements a common algorithm for data reduction and extracted-feature classification at each of its nodes. And as the data reduction precedes the classification, the accuracy of classification depends on the efficiency of the data reduction algorithm, so there is a need to evaluate the performance of the MNN's data reduction. The fact that the MNN dimensional reduction technique is an efficient method of removing the irrelevant parts of the data was amply demonstrated over extensive trials (details are in [20] & [21]). It is because of this that we used the MNN (along with a clustering algorithm) as the data reduction and feature extraction tool for hierarchical pattern classification. For our demonstration, we use Forgy's algorithm for clustering the reduced units (of the input, at each module), wherein the number of clusters for the classification/sub-classification is provided by the user. Instead, without prejudice to the generality of our technique, one could use a more sophisticated clustering algorithm wherein the number of classes (clusters) is determined by the algorithm. We leave this as a future enhancement, which would then result in a completely automated unsupervised classification algorithm. A condensed sketch of the whole procedure follows.
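The control flow of the Tandem Algorithm can be condensed into the following sketch (our own Python rendering; the MNN is abstracted behind an assumed train_mnn/reduce interface, and the stopping tests are simplified):

    import numpy as np

    def forgy(points, k, iters=50):
        """Forgy's clustering (step 4): seed with k random samples, assign
        each point to its nearest seed, recompute centroids, repeat."""
        seeds = points[np.random.choice(len(points), k, replace=False)]
        for _ in range(iters):
            labels = np.argmin(((points[:, None] - seeds[None]) ** 2).sum(-1), axis=1)
            new = np.array([points[labels == j].mean(0) if (labels == j).any()
                            else seeds[j] for j in range(k)])
            if np.allclose(new, seeds):   # no sample changed cluster
                break
            seeds = new
        return labels

    def tandem(samples, k, depth, train_mnn):
        """One node of the hierarchy: mirror, reduce, cluster, recurse."""
        if depth == 0 or len(samples) < k:   # step 7: stop when data runs out
            return samples
        mnn = train_mnn(samples)             # steps 1-2: train until mirrored
        reduced = mnn.reduce(samples)        # extracted feature vectors Y
        labels = forgy(reduced, k)           # steps 3-4: cluster the features
        return [tandem(reduced[labels == j], k, depth - 1, train_mnn)
                for j in range(k)]           # steps 5-6: one child per class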
IV. DEMONSTRATION AND RESULTS
We now show, by explicit construction, that such a hierarchical architecture can actually be built and used for the classification and sub-classification of images, using an example case.

Example: We took a collection of 360 images for training, with equal numbers of faces (see the Feret [36], Manchester [37] and Jaffe [38] databases in the references), tables and flowers. We built a two-level classifier constructed out of MNNs (associated with Forgy's clustering), which at the first level automatically classifies the 360 images of the training set into three classes; one of them would be the "face class" and the other two the "table class" and "flower class". The automatic procedure which does this is as follows. A 4-layer MNN (676-60-47-676), consisting of an input layer of 676 inputs representing a 26 x 26 image, with 60 processing elements in the first hidden layer and 47 and 676 processing elements in the
other two layers, is used to mirror the input data onto itself. The training is done automatically and stops when the output vector of 676 dimensions closely matches the corresponding input vector for each image; at this point, the MNN can be said to satisfactorily mirror all 360 input images.

Then the output of the layer with the least number of processing elements (in this case 47) is taken as a reduced feature vector for the corresponding input image. We thus have a set of 360 vectors (each of 47 dimensions) representing the input data of 360 images. This set of 360 reduced vectors is then classified into three classes using Forgy's algorithm (see [39] & [40]). The actual classification varies somewhat with the choice of the initial seed points, which are randomly chosen. The program chooses three distinct initial random seed points and uses Forgy's algorithm to cluster the reduced vectors of the input images. This clustering is done over many iterations till convergence, and the classes are then frozen; after this the data is clustered a second time (starting from a second set of seed points), again using Forgy's algorithm, till another set of three classes is obtained. After this, the average of each pair of nearest cluster centroids from the two runs is taken as a new cluster centroid, based on which the reduced feature vectors are once again classified to obtain three distinct classes; these are then treated as the final three classes (if everything works out well, one of them is the face class and the remaining two are the table class and the flower class).
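The double run of Forgy's algorithm with averaged centroids, as just described, can be sketched as follows (our own illustration; forgy_centroids is an assumed helper returning the converged centroids of a single run started from fresh random seeds):

    import numpy as np

    def stabilized_classes(reduced, k, forgy_centroids):
        """Two independent Forgy runs, average the nearest centroid pairs,
        then classify once more against the averaged centroids."""
        c1 = forgy_centroids(reduced, k)   # first converged run
        c2 = forgy_centroids(reduced, k)   # second run, new random seeds
        # Pair each centroid of run 1 with its nearest centroid of run 2.
        nearest = np.argmin(((c1[:, None] - c2[None]) ** 2).sum(-1), axis=1)
        final = (c1 + c2[nearest]) / 2.0   # averaged cluster centroids
        # Final classification of every reduced vector.
        return np.argmin(((reduced[:, None] - final[None]) ** 2).sum(-1), axis=1)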
After this first-level classification, the program proceeds to the next level for sub-classifying the three classes identified at Level I. The procedure of reduction and classification at Level II is similar to that carried out at Level I, except that now three MNNs have to be trained, one receiving inputs from the face class, another from the table class and the third from the flower class. These MNNs at Level II use the architecture (47-37-30-47). After the three MNNs are suitably trained to mirror their respective inputs to an acceptable accuracy, the program proceeds to classify the inputs into sub-categories for each of the MNNs separately. This time, of course, the feature vector (reduced vector) has 30 dimensions. Once again Forgy's algorithm is used, following a procedure similar to that described above for Level I, except that this time the classification is done on the reduced vectors of MNN-1 at Node-1, which renders the sub-categories male face and female face; on the reduced vectors of MNN-2 at Node-2, obtaining the sub-categories centrally supported table and four-legged table; and on the reduced vectors of MNN-3 at Node-3, obtaining the sub-categories flower bud and open flower.

Because the MNNs are initiated with randomly chosen weights, and random seed points are again chosen while executing Forgy's algorithm, it was our intention to demonstrate that the classification is not overly dependent on these random choices. So we ran the program over and over again, each time starting ab initio. We performed 10 trials, meaning 10 different training and classification sessions at Level I followed by Level II. Averaged over these 10 trials, considering the training and test sets, the error at Level I is 7%, and the average error of the three nodes at Level II (for subcategorizing a "face" as "male" or "female", a "table" as "centrally supported" or "four-legged", and a "flower" as "flower bud" or "open flower") is an additional 7%. Actually, this is not too bad at all, because the whole exercise is unsupervised, and the errors made in the first-level classification remain undiscovered and are actually uncorrected by the classifier, which indiscriminately feeds all the data into the second level as inputs.

See the sample illustration for the example in Figure 5. The summary of the results for the example is given in Table I. The various parameters used in the MNN training and classification are given in Table II. The brute-force (obvious) procedure of training the MNN at each node of the hierarchical classifier by using a Newton-Raphson method is beyond the capability of the PCs available to us and was not tried. Instead, we adopted an approximate procedure and trained the MNNs by using the back-propagation algorithm ([41] & [42]), which tries to determine the best MNN by changing the weights at each presentation of an image; ideally, a "best MNN" should be obtained for the entire ensemble of input images (or reduced units of images) at each MNN of a node, which again would involve a Newton-Raphson procedure and was avoided. The techniques used and reported here were very efficient in terms of the time and space taken for execution, and they were all performed on a PC.

V. SIGNIFICANCE OF OUR WORK & CONCLUSIONS
In this paper we have proved a crucial theorem called the "Mirroring Theorem"; based on its mathematical proof, we developed an architecture for a hierarchical classifier which implements our proposed Tandem Algorithm to perform unsupervised pattern recognition. We have also written a computer code specifically to demonstrate the technique of building such a self-supervising classifier and applied it to an example. These classifiers have the characteristics of being hierarchical, modular and unsupervised, and they run on a single common algorithm; they therefore mimic (admittedly in a crude manner) the collective behavior of neurons in the neo-cortex. It is expected that they can be expanded to analyze much more complex data; such "super classifiers" could employ many structures (each being of the type shown in Figure 3) working in parallel.

In our experimentation (within the available resources), we have found that it is not possible to have too many classes at the first level (Figure 3), i.e. j cannot be too large a value (at best j = 4). Therefore, for large problems involving many classes, we need a network of "structures" (each being of the type shown in Figure 3 but with j limited to 2, 3 or 4) working in parallel, each structure trained to recognize its own set of classes (e.g. face classes, alphabet classes, etc.). Thus a binary or tertiary "super-tree", with each "node" itself being a structure of the type shown in Figure 3, can be envisaged for the construction of a "super classifier".
Figure 5. Pictorial representation of the hierarchical classifier implemented on the example images. (S1 (face), S2 (flower), S3 (table): classification at Level I by MNN-M using Forgy's algorithm on the 47 reduced dimensional units of the 676-variable input image; c11 (male face), c12 (female face), c21 (flower bud), c22 (open flower), c31 (centrally supported table), c32 (four-legged table): sub-classification at Level II by MNN-1, MNN-2 and MNN-3 using Forgy's algorithm on 30 reduced dimensional units.)
TABLE I. RESULTS OF THE HIERARCHICAL CLASSIFIER FOR EXAMPLE IMAGES

Input type             | Dimension of the input | Dimension of the reduced units | No. of samples for training | No. of samples for testing
Image                  | 676 (26 x 26)          | 47                             | 360                         | 150
Reduced units of image | 47                     | 30                             | ≈ 120 (for each category)   | ≈ 50 (for each category)

No. of categories                    | Success rate of clustering on reduced units (averaged over 10 trials): Training samples | Average of training & test sets
3 (face, table & flower)             | 94.0% (efficiency of the Level I node)            | 93.4% (efficiency of the Level I node)
2 (sub-categories for each category) | 88.5% (average efficiency of the Level II nodes)  | 86.3% (average efficiency of the Level II nodes)

TABLE II. VARIOUS PARAMETERS USED FOR THE MNN AND FORGY'S ALGORITHM

Type of MNN architecture     | Distance between input and output | Seed points for Forgy's algorithm                    | Learning rate parameter | Weights & bias terms
Level I MNN (676-60-47-676)  | 0.8                               | Threshold of 1.0 between the random seed points      | 0.025                   | -0.25 to +0.25 (random selection)
Level II MNNs (47-37-30-47)  | 0.8                               | Threshold of 0.8 between any two random seed points  | 0.01                    | -0.25 to +0.25 (random selection)
It is expected that the techniques we have developed and presented in this paper will be used by future researchers for building advanced and highly sophisticated pattern classifiers. It is also hoped that these procedures will be used for building models of associative memories [43] where, say, a voice signal (e.g. "Mary", a spoken word) can be associated with a picture (an image of Mary). These developments could, in the near future, lead to very versatile machine learning systems which can possibly ape the human brain in at least its elemental functions.
ACKNOWLEDGMENT

We thank the managements of Srinidhi and the group of Aurora Educational Institutions for their encouragement and Altech Imaging and Computing for providing the facilities for research.

REFERENCES

[1] B. G. Buchanan, "The role of experimentation in A.I.", Phil. Trans. R. Soc. A, Vol. 349, pp. 153-166, 1994.
[2] J. D. Zucker, "A grounded theory of abstraction in artificial intelligence", Phil. Trans. R. Soc. B, Vol. 358, pp. 1293-1309, 2003.
[3] R. C. Holte and B. Y. Choueiry, "Abstraction and reformulation in A.I.", Phil. Trans. R. Soc. B, Vol. 358, pp. 1197-1204, 2003.
[4] H. Cruze, V. Durr and J. Schmitz, "Insect walking is based on a decentralized architecture revealing a simple and robust controller", Phil. Trans. R. Soc. A, Vol. 365, pp. 221-250, 2007.
[5] Y. Freund and R. E. Schapire, "Experiments with a new boosting algorithm", Proc. 13th International Conference on Machine Learning, pp. 148-156, 1996.
[6] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features", Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. I-511 to I-518, 2001.
[7] G. E. Hinton and R. R. Salakhutdinov, "Reducing the dimensionality of data with neural networks", Science, Vol. 313, pp. 504-507, 2006.
[8] H. C. Law, Clustering, Dimensionality Reduction and Side Information, Ph.D. Thesis, Michigan State University, 2006.
[9] T. Joachims, "Text categorization with support vector machines: learning with many relevant features", Proc. 10th European Conference on Machine Learning, pp. 137-142, 1998.
[10] M. Craven, D. DiPasquo, D. Freitag, A. K. McCallum and T. M. Mitchell, "Learning to construct knowledge bases from the World Wide Web", Artificial Intelligence, Vol. 118, pp. 69-113, 2000.
[11] C. Garcia and M. Delakis, "Convolutional face finder: A neural architecture for fast and robust face detection", IEEE Trans. Pattern Anal. Mach. Intell., Vol. 26, pp. 1408-1423, 2004.
[12] S. M. Phung and A. Bouzerdoum, "A pyramidal neural network for visual pattern recognition", IEEE Transactions on Neural Networks, Vol. 18, pp. 329-343, 2007.
[13] M. Rosenblum, Y. Yacoob and L. S. Davis, "Human expression recognition from motion using a radial basis function network architecture", IEEE Trans. Neural Networks, Vol. 7, pp. 1121-1138, 1996.
[14] P. Baldi and K. Hornik, "Neural networks and principal component analysis: learning from examples without local minima", Neural Networks, Vol. 2, pp. 53-58, 1989.
[15] D. DeMers and G. Cottrell, "Non-linear dimensionality reduction", Advances in Neural Information Processing Systems, Vol. 5, Morgan Kaufmann, pp. 580-587, 1993.
[16] J. J. Hopfield and C. D. Brody, "Learning rules and network repair in spike-timing-based computation networks", Proc. Natl. Acad. Sci. U.S.A., Vol. 101, pp. 337-342, 2004.
[17] B. Lau, G. B. Stanley and Y. Dan, "Computational subunits of visual cortical neurons revealed by artificial neural networks", Proc. Natl. Acad. Sci. U.S.A., Vol. 99, pp. 8974-8979, 2002.
[18] K. Eswaran, System and method of identifying patterns. Patents filed in the Indian Patent Office on 20/7/06 and 19/03/07 and in the U.S. Patent and Trademark Office vide Nos. 20080019595 and 20080232682 respectively, 2006.
[19] A. N. Kolmogorov, "On the representation of continuous functions of several variables by superposition of continuous functions of one variable and addition", Doklady Akademii Nauk SSSR, 114(5), pp. 953-956, 1957.
[20] D. R. Deepthi, S. Kuchibhotla and K. Eswaran, "Dimensionality reduction and reconstruction using mirroring neural networks and object recognition based on reduced dimension characteristic vector", IEEE International Conference on Advances in Computer Vision and Information Technology (IEEE, ACVIT-07), pp. 348-353, 2007.
[21] D. R. Deepthi, Automatic pattern recognition for applications in image processing and robotics, Ph.D. Thesis, Osmania University, Hyderabad, India, 2009.
[22] D. O. Creutzfeldt, "Generality of the functional structure of the neocortex", Naturwissenschaften, Vol. 64, pp. 507-517, 1977.
[23] B. V. Mountcastle, "An organizing principle for cerebral function: The unit model and the distributed system", in The Mindful Brain, G. M. Edelman and V. B. Mountcastle, Eds., Cambridge, Mass.: MIT Press, 1978.
[24] D. J. Felleman and D. C. Van Essen, "Distributed hierarchical processing in the primate cerebral cortex", Cerebral Cortex, Vol. 1, pp. 1-47, 1991.
[25] R. P. Rao and D. H. Ballard, "Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects", Nature Neuroscience, Vol. 2, pp. 79-87, 1999.
[26] S. M. Sherman and R. W. Guillery, "The role of the thalamus in the flow of information to the cortex", Phil. Trans. R. Soc. London B, Vol. 357, pp. 1695-1708, 2002.
[27] M. Kaiser, "Brain architecture: A design for natural computation", Phil. Trans. R. Soc. A, Vol. 365, pp. 3033-3045, 2007.
[28] G. Buzsaki, C. Geisler, D. A. Henze and X. J. Wang, "Interneuron diversity series: Circuit complexity and axon wiring economy of cortical interneurons", Trends Neurosci., Vol. 27, pp. 186-193, 2004.
[29] J. D. Johnsen, V. Santhakumar, R. J. Morgan, R. Huerta, L. Tsimring and I. Soltesz, "Topological determinants of epileptogenesis in large-scale structural and functional models of the dentate gyrus derived from experimental data", J. Neurophysiol., Vol. 97, pp. 1566-1587, 2007.
[30] D. C. Van Essen, C. H. Anderson and D. J. Felleman, "Information processing in the primate visual system: an integrated systems perspective", Science, Vol. 255, pp. 419-423, 1992.
[31] J. Hawkins, On Intelligence, Owl Books, Henry Holt & Co., New York, pp. 110-125, 2005.
[32] B. G. Bell, "Levels and loops: the future of artificial intelligence and neuroscience", Phil. Trans. R. Soc. B, Vol. 354, pp. 2013-2030, 1999.
[33] J. Hawkins and D. George, "Hierarchical Temporal Memory: Concepts, Theory, and Terminology", Numenta Inc., pp. 1-20, 2007, www.numenta.com.
[34] D. George, How the brain might work: A hierarchical and temporal model for learning and recognition, Ph.D. Thesis, Stanford University, 2008.
[35] J. Herrero, A. Valencia and J. Dopazo, "A hierarchical unsupervised growing neural network for clustering gene expression patterns", Bioinformatics, Vol. 17, pp. 126-136, 2001.
[36] FERET database: www.frvt.org/FERET/.
[37] MANCHESTER database: www.ecse.rpi.edu/cvrl/database/.
[38] JAFFE database: www.kasrl.org/jaffe.html.
[39] E. Gose, R. Johnsonbaugh and S. Jost, Pattern Recognition and Image Analysis, Prentice Hall of India, New Delhi, pp. 211-213, 2000.
[40] D. R. Deepthi, G. R. A. Krishna and K. Eswaran, "Automatic pattern classification by unsupervised learning using dimensionality reduction of data with mirroring neural networks", IEEE International Conference on Advances in Computer Vision and Information Technology (IEEE, ACVIT-07), pp. 354-360, 2007.
[41] D. E. Rumelhart, G. E. Hinton and R. J. Williams, "Learning representations by back-propagating errors", Nature, Vol. 323, pp. 533-536, 1986.
[42] B. Widrow and M. A. Lehr, "30 years of adaptive neural networks: Perceptron, Madaline, and backpropagation", Proceedings of the IEEE, Vol. 78, No. 9, 1990.
[43] D. R. Deepthi and K. Eswaran, "Pattern recognition and memory mapping using mirroring neural networks", IEEE International Conference on Emerging Trends in Computing (IEEE, ICETiC 2009), India, pp. 317-321, 2009.

AUTHORS PROFILE

Author 1: Working as an Associate Professor (CSE Dept.), Aurora's Engineering College. Submitted Ph.D. (CSE) thesis on "Automatic pattern recognition for applications in image processing and robotics" to Osmania University, Hyderabad in Feb. 2009. M.Tech. (Software Engineering) from J.N.T. University, Hyderabad.

Author 2: Working as a Professor (CSE Dept.), Srinidhi Institute of Science and Technology. Ph.D. (Mathematical Physics) on "On Phase and Coherence in Quantum Systems" from the University of Madras, Jan. 1973. 36 years of research experience in the application of computers in the areas of industrial image processing, pattern recognition, neural networks, electromagnetics, fluid mechanics, structural mechanics and artificial intelligence. He has more than 40 papers in international journals and international conferences on the above subjects.
Algorithm as Defining Dynamic Systems Keehang Kwon Department of Computer Engineering Dong-A University Busan, Republic of Korea
Hong Pyo Ha Department of Computer Engineering Dong-A University Busan, Republic of Korea
Abstract—This paper proposes a new view of algorithms: algorithms as defining dynamic systems. This view extends the traditional, deterministic view of an algorithm as a step-by-step procedure by admitting nondeterminism. Just as a dynamic system can be designed by a set of its defining laws, it is also desirable to design an algorithm by a (possibly nondeterministic) set of defining laws. This observation requires some changes to algorithm development. We propose a two-step approach: the first step is to design an algorithm via a set of defining laws of a dynamic system. The second step is to translate these laws (written in a natural language) into a formal language such as linear logic.

Keywords: dynamic systems, algorithm, nondeterminism, linear logic.

I. INTRODUCTION

Designing an algorithm is central to the development of software. For this reason, many algorithms have been developed. However, few guidelines for designing an algorithm have been provided so far, and lacking such guidance, algorithms are being designed in an ad-hoc fashion. As a consequence, designing algorithms has been quite cumbersome and error-prone. What are software and algorithms? Computer science is still looking for an answer to this question. One attempt is based on the view that software is a function and an algorithm is a sequence of instructions for implementing the function. This view has been popular and is adopted in many algorithm textbooks [6]. Despite some popularity, this view of sequential algorithms stands for deterministic computation and lacks devices for handling nondeterminism. Lacking such devices as nondeterministic transitions, dealing with nondeterminism in this view is really difficult and relies on extra devices such as stacks (for DFS) and queues (for BFS). Those extra devices greatly reduce the readability and modifiability of the algorithm. This paper proposes another view of software and algorithms, i.e., software as (possibly nondeterministic) dynamic systems and algorithms as defining dynamic systems. This paper also considers its effects on the algorithm development process. To be precise, we consider algorithm design to be the process of finding a set of defining laws of a dynamic system. An attractive feature of this view is that it enhances the readability and modifiability of the algorithm for nondeterministic problems.

The remainder of this paper is structured as follows. We discuss a new way of describing algorithms in the next section. In Section 3, we present some examples. Section 4 concludes the paper.

II. ALGORITHMS AS DEFINING DYNAMIC SYSTEMS

Our interest is in a process for developing algorithms based on the observation described in the previous section. The traditional, sequential algorithm process models provide a useful structure for such a process, but some changes are needed. The first problem arises from the machine-dependent, deterministic view of algorithms. A standard definition is that an algorithm is a sequence of instructions. This definition requires algorithms to be deterministic. However, it is easily observed that this deterministic view makes an algorithm (sequential-)machine-dependent and unnecessarily complicated. In algorithm design, nondeterministic algorithms are quite often desirable. This is natural when there are multiple ways to reach a goal and we simply do not know in advance which of them will be chosen. Such examples include graph algorithms, backtracking algorithms, and AI planning problems. To ensure that algorithms are described as simply and machine-independently as possible, it is desirable to express an algorithm via a set of governing laws, stated in natural language, in the form of initial resources and transition rules. In fact, the above approach to defining algorithms has been used for centuries in other fields such as physics and mechanics. The second problem arises from the choice of a specification language into which to translate these laws. In choosing a language, there is an aspect that requires special attention. First, we observe that translating the laws into sequential pseudo code makes the resulting description much bigger, leading to extra complications. An acceptable language should not expand the resulting description too much, but rather support a reasonable translation of the laws. An ideal language would support an optimal translation of the laws. Unfortunately, it is a never-ending task to develop this ideal language, as there are too many dynamic systems with too many different features: autonomous systems, open systems with
interactions, stochastic systems, etc. We argue that a reasonable, high-level translation of the laws can be achieved via linear logic [3]. An attractive feature of linear logic over other formalisms such as nondeterministic Turing machines, recursive functions, sequential pseudo code, etc., is that it can optimally encode a number of essential characteristics of dynamic systems: nondeterminism, updates (also called state change), etc. Hence, the main advantage of linear logic over other formalisms is the minimum (linear) size of the encoding of the governing laws of most dynamic systems. The basic operator in linear logic is the linear implication of the form A⊸B. This expression means that the resource A can be transformed into another resource B. The expression A⊗B means the two resources A and B. The expression !A means that the resource A is reusable. We point the reader to [3] for the whole calculus of linear logic.
We sum up our observation in the following equations:

software = dynamic system.
algorithm design = a set of defining laws.
algorithm writing = translation of the defining laws into linear logic.

III. EXAMPLES
The view of "software-as-dynamic-systems" makes algorithms simpler and more versatile than the traditional approach. As an example, we present the factorial algorithm to help understand this notion. The factorial algorithm can be seen as a dynamic system consisting of two laws described below in English:

(1) Initial resource: (0, 1).
(2) Transition: (X, Y) can be replaced by (X+1, XY+Y).

This algorithm discards the old resource to produce the new resource and is, therefore, more efficient in terms of space usage than its Prolog counterpart. It is shown below that the above laws can be translated into linear logic formulas of the same size. A state is described by a collection of resources. A resource a under a directory d is represented by a linear logic formula of the form d(a); for example, fact(0,1) represents the fact that there exists a resource (0, 1) under the directory fact. The following is a linear logic translation of the above algorithm, where the reusable action is preceded with !:

fact(0,1).
!(fact(X,Y) ⊸ fact(X+1, XY+Y)).

A final state is typically given by a user in the form of a query, and computation tries to solve the query. As an example, solving the query fact(5,X) would result in the initial resource fact(0,1) being transformed to fact(1,1), then to fact(2,2), and so on. It will finally produce the desired result fact(5,120) using the second law five times.
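To make this reading concrete, here is a minimal Python sketch (ours, not part of the paper) of an interpreter that keeps the state as a set of resources and applies the transition law, consuming the old resource at each step:

def run_factorial(n):
    # State = set of resources; the transition law consumes fact(X, Y)
    # and produces fact(X+1, X*Y + Y), mirroring !(fact(X,Y) ⊸ fact(X+1,XY+Y)).
    state = {("fact", 0, 1)}                  # initial resource fact(0,1)
    while True:
        (_, x, y), = state                    # the single resource in the state
        if x == n:                            # the query fact(n, Y) is solved
            return y
        state = {("fact", x + 1, x * y + y)}  # old resource discarded

print(run_factorial(5))    # -> 120, i.e. the final resource fact(5,120)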
We now consider the problem of finding the maximum value of n elements. Suppose they are 5, 10, 9, 2. The standard algorithm creates a new directory max where it keeps track of the maximum value of the elements. An alternative, more dynamic algorithm is shown below:

(1) Initial resources: 4 elements consisting of 5, 10, 9, 2.
(2) Transition: pick two elements p and q, and discard the smaller one.

This algorithm produces the desired output by repeatedly discarding the smaller input resources. The following is a linear logic translation of the above algorithm:

i(5)⊗i(10)⊗i(9)⊗i(2).
!((i(X)⊗i(Y)⊗<(X,Y)) ⊸ i(Y)).
!((i(X)⊗i(Y)⊗≥(X,Y)) ⊸ i(X)).

Note that the fact that 3 is an item would be encoded as the proposition i(3); that is, there is a file whose name is 3 under the directory i. We assume that, in dealing with <(X,Y), each file (X,Y) such that X is smaller than Y is created dynamically under the directory <. A final state is a state in which only one element remains. Hence, solving the query i(X) will produce i(10), after deleting i(5), i(9) and i(2) by three applications of the transition laws. It is observed that this kind of algorithm is not easily translated into sequential pseudo code, as pseudo code has no construct for discarding input resources.
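The max-finding laws can be executed in the same style; the sketch below (ours) makes the nondeterminism explicit by picking the pair of resources at random, and the final resource is the same no matter which choices are made:

import random

def run_max(items):
    # State = multiset of item resources i(v); the transition law consumes
    # two items and restores only the larger, so resources genuinely disappear.
    state = [("i", v) for v in items]         # i(5) ⊗ i(10) ⊗ i(9) ⊗ i(2)
    while len(state) > 1:                     # final state: one element left
        p, q = random.sample(state, 2)        # nondeterministic choice of a pair
        state.remove(p if p[1] < q[1] else q) # discard the smaller one
    return state[0][1]

print(run_max([5, 10, 9, 2]))   # -> 10, answering the query i(X)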
A good motivation for introducing nondeterminism is graph algorithms. An example of a nondeterministic problem is provided by the following puzzle, which amounts to computing connectivity over an infinite, directed graph: we try to determine whether the string miuiuiu can be produced from mi with the following four rules:

(a) If you possess a string of the form Xi, you can replace it by Xiu.
(b) Suppose you have mX. Then you can replace it by mXX.
(c) A string of the form XiiiY can be replaced by XuY.
(d) A string of the form XuuY can be replaced by XY.

This problem requires both nondeterminism (there are multiple paths from a node) and updates (an old node is replaced by a new one). For example, the string mi can become either miu or mii. An algorithm for this problem based on functions would be awkward, as functions are too weak: they support neither nondeterminism nor updates. On the other hand, an algorithm for this problem can be easily formulated as a nondeterministic dynamic system with the following five laws:

(1) Initial resource: mi.
(2) Transition: if Xi, you can replace it by Xiu.
(3) Transition: if mX, you can replace it by mXX.
(4) Transition: if XiiiY, you can replace it by XuY.
(5) Transition: if XuuY, you can replace it by XY.

Note that this algorithm does not specify whether DFS or BFS will be used when the graph is explored. The following is a linear logic translation of the above algorithm:
s(mi).
!∀X(s(Xi) ⊸ s(Xiu)).
!∀X(s(mX) ⊸ s(mXX)).
!∀X∀Y(s(XiiiY) ⊸ s(XuY)).
!∀X∀Y(s(XuuY) ⊸ s(XY)).
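Because several laws can fire on the same string, solving the query s(miuiuiu) amounts to a reachability search over this rewriting system. The breadth-first sketch below is ours (the formulation above deliberately leaves the search strategy to the linear logic interpreter); a length bound keeps the infinite state space manageable:

from collections import deque

def miu_reachable(target, limit=8):
    # Breadth-first exploration of the five laws, starting from the
    # initial resource mi; strings longer than `limit` are not expanded.
    seen, queue = {"mi"}, deque(["mi"])
    while queue:
        s = queue.popleft()
        if s == target:
            return True
        if len(s) > limit:
            continue
        succs = ([s + "u"] if s.endswith("i") else [])   # law (2): Xi -> Xiu
        succs.append("m" + s[1:] * 2)                    # law (3): mX -> mXX
        succs += [s[:k] + "u" + s[k + 3:]
                  for k in range(len(s) - 2) if s[k:k + 3] == "iii"]  # law (4)
        succs += [s[:k] + s[k + 2:]
                  for k in range(len(s) - 1) if s[k:k + 2] == "uu"]   # law (5)
        for t in succs:
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return False

print(miu_reachable("miuiuiu"))

The search reports that miuiuiu is in fact not derivable: the laws preserve the property that the number of i's is never divisible by 3, and miuiuiu contains three i's.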
Now solving the query s(miuiuiu) would decide whether miuiuiu can be produced in the above puzzle. Another example of a nondeterministic problem is provided by the following menu at a fast-food restaurant, where we try to determine what can be obtained for four dollars:

(a) three dollars for a hamburger set consisting of a hamburger and a coke,
(b) four dollars for a fishburger set consisting of a fishburger and a coke,
(c) three dollars for a hamburger, four dollars for a fishburger, one dollar for a coke (with unlimited refills), and one dollar for a fry.
The following is a linear logic translation of the above algorithm.
p(4).
!∀X((p(X) ⊗ ≥(X,3)) ⊸ (p(h) ⊗ p(c) ⊗ p(X-3))).
!∀X((p(X) ⊗ ≥(X,4)) ⊸ (p(fi) ⊗ p(c) ⊗ p(X-4))).
!∀X((p(X) ⊗ ≥(X,3)) ⊸ (p(h) ⊗ p(X-3))).
!∀X((p(X) ⊗ ≥(X,4)) ⊸ (p(fi) ⊗ p(X-4))).
!∀X((p(X) ⊗ ≥(X,1)) ⊸ (p(c) ⊗ !p(c) ⊗ p(X-1))).
!∀X((p(X) ⊗ ≥(X,1)) ⊸ (p(f) ⊗ p(X-1))).
The proposition p(4) represents that a person has four dollars. Now solving the query p(h) ⊗ p(c) ⊗ p(f) would succeed, as we can obtain a hamburger and a coke for three dollars, and a fry for a dollar. Solving the query p(h) ⊗ p(c) ⊗ p(c) would also succeed, as we can obtain a hamburger for three dollars, and a coke and a (refilled) coke for one dollar.
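As an illustration (ours) of how the reusable resource !p(c) behaves, the derivation behind the second query can be traced as follows; only one refill is shown, although !p(c) licenses any number:

state = {"dollars": 4, "items": []}

def buy_hamburger(s):
    # p(X) ⊗ ≥(X,3) ⊸ p(h) ⊗ p(X-3): three dollars are consumed.
    assert s["dollars"] >= 3
    s["dollars"] -= 3
    s["items"].append("hamburger")

def buy_coke(s):
    # p(X) ⊗ ≥(X,1) ⊸ p(c) ⊗ !p(c) ⊗ p(X-1): one dollar buys a coke plus
    # a reusable refill resource (a single use of it is shown here).
    assert s["dollars"] >= 1
    s["dollars"] -= 1
    s["items"] += ["coke", "coke (refill)"]

buy_hamburger(state)
buy_coke(state)
print(state)   # a hamburger and two cokes for four dollars: the query succeeds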
The examples presented here have been of a simple nature. They are, however, sufficient for appreciating the attractiveness of the algorithm development process proposed here. We point the reader to [1], [4], [5] for more examples.

IV. CONCLUSION

A proposal for designing algorithms has been given. It is based on the view that software systems are dynamic systems simulated on a machine and that an algorithm is a constructive definition of a dynamic system. The advantage of our approach is that it simplifies the process of designing and writing algorithms for problems that require nondeterministic updates. Our ultimate interest is in a procedure for carrying out computations of the kind described above; hence it is important to realize this linear logic interpreter in an efficient way, as discussed in [2], [4]. In the future, we are also interested in adopting an extension of linear logic, computability logic [7], [8], to express algorithms.

ACKNOWLEDGMENT

This paper was supported by Dong-A University Research Fund in 2009.

REFERENCES
[1] M. Banbara, Design and Implementation of Linear Logic Programming Languages, Ph.D. Dissertation, Kobe University, 2002.
[2] I. Cervesato, J. S. Hodas, and F. Pfenning, "Efficient resource management for linear logic proof search", in Proceedings of the 1996 Workshop on Extensions of Logic Programming, LNAI 1050, pp. 67-81.
[3] J.-Y. Girard, "Linear logic", Theoretical Computer Science, 50:1-102, 1987.
[4] J. Hodas and D. Miller, "Logic programming in a fragment of intuitionistic linear logic", Information and Computation, 1994. Invited to a special issue of submissions to the 1991 LICS conference.
[5] P. Kungas, Linear Logic Programming for AI Planning, Master's Thesis, Tallinn Technical University, 2002.
[6] R. Neapolitan and K. Naimipour, Foundations of Algorithms, Heath, Amsterdam, 1997.
[7] G. Japaridze, "The logic of tasks", Ann. Pure Appl. Logic 117 (2002), pp. 263-295.
[8] G. Japaridze, "Introduction to computability logic", Ann. Pure Appl. Logic 123 (2003), pp. 1-99.
A Wavelet-Based Digital Watermarking for Video
A. Essaouabi and F. Regragui, Department of Physics, LIMIARF Laboratory, Faculty of Sciences, Mohammed V University, Rabat, Morocco
E. Ibnelhaj, Image Laboratory, National Institute of Posts and Telecommunications, Rabat, Morocco
Abstract—A novel video watermarking system operating in the three-dimensional wavelet transform is here presented. Specifically, the video sequence is partitioned into spatio-temporal units and the single shots are projected onto the 3D wavelet domain. First, a gray-scale watermark image is decomposed into a series of bitplanes that are preprocessed with a random location matrix. After that, the preprocessed bitplanes are adaptively spread-spectrum modulated and added to the 3D wavelet coefficients of the video shot. Our video watermarking algorithm is robust against the attacks of frame dropping, averaging and swapping. Furthermore, it allows blind retrieval of the embedded watermark, which does not need the original video, and the watermark is perceptually invisible. The algorithm design, evaluation, and experimentation of the proposed scheme are described in this paper.

Keywords: video watermarking; copyright protection; security; wavelet transform

I. INTRODUCTION

We have seen an explosion of data exchange on the Internet and the extensive use of digital media. Consequently, digital data owners can transfer multimedia documents across the Internet easily. Therefore, there is increasing concern over the copyright protection of digital content [1, 2, 3]. In the early days, encryption and access control techniques were employed to protect the ownership of media. They do not, however, protect against unauthorized copying after the media have been successfully transmitted and decrypted. More recently, watermark techniques have been utilized to maintain copyright [4, 5, 6]. Digital watermarking, one of the popular approaches considered as a tool for providing copyright protection, is a technique based on embedding a specific mark or signature into the digital product. It focused on still images for a long time, but nowadays this trend seems to vanish: more and more watermarking algorithms are proposed for other multimedia data and in particular for video content. However, even if watermarking still images and watermarking video are similar problems, they are not identical. New problems and new challenges show up and have to be addressed. Watermarking digital video introduces some issues that generally do not have a counterpart in images and audio. Due to large amounts of data and the inherent redundancy between frames, video signals are highly susceptible to pirate attacks, including frame averaging, frame dropping, frame swapping, collusion, statistical analysis, etc. Many of these attacks may be accomplished with little or no damage to the video signal. However, the watermark may be adversely affected. Scenes must be embedded with a consistent and reliable watermark that survives such pirate attacks. Applying an identical watermark to each frame in the video leads to problems of maintaining statistical invisibility. Applying independent watermarks to each frame is also a problem: regions in each video frame with little or no motion remain the same frame after frame, and motionless regions in successive video frames may be statistically compared or averaged to remove independent watermarks [7][8]. In order to solve such problems, many algorithms based on the 3D wavelet transform have been adopted, but most of them use a binary image as the watermark. In this paper we propose a new blind watermarking scheme based on the 3D wavelet transform and video scene segmentation [8][9]. First, by a still-image decomposition technique, a gray-scale watermark image is decomposed into a series of bitplanes which are correlative with each other and preprocessed with a random location matrix. After that, the preprocessed bitplanes are adaptively spread-spectrum modulated and added to the 3D wavelet coefficients of the video shot. As the 1-D multiresolution temporal representation of the video covers only the temporal axis of the video, each frame is also decomposed along the spatial axes into a 2D discrete wavelet multiresolution representation, for watermarking the spatial detail of the frame as well as the motion and motionless regions of the video. Experimental results show that the proposed techniques are robust enough against frame dropping, averaging and MPEG lossy compression. The rest of this paper is organized as follows: in Section II we explain the decomposition procedure for the watermark image and the video. Section III describes the watermark embedding and extraction procedures. Finally, Section IV gives the simulation results and Section V the conclusion.

II. DECOMPOSITION OF THE WATERMARK IMAGE AND VIDEO
A. Watermark process
The watermark gray-scale image W(i,j) is decomposed into 8 bitplanes for watermarking [10]. For robustness against common picture-cropping processing, a fast two-dimensional
pseudo-random number traversing method is used to permute each bitplane of the watermark image, dispersing its spatial location for the sake of spreading the watermarking information (first key). Finally, each bitplane is changed into a pseudo-random matrix Wdk by the disorder processing (second key). Wdk is a series of binary images with values 1 and -1.
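A minimal Python sketch of this preprocessing (our illustration; the function name and the use of permutation seeds for the two keys are our assumptions, not the paper's specification):

import numpy as np

def preprocess_watermark(w, key1, key2):
    # Split the 8-bit gray-scale watermark W(i,j) into 8 bitplanes, scramble
    # each plane with two keyed pseudo-random permutations (traversal and
    # disorder steps), and map {0,1} to {-1,+1} to obtain the Wdk matrices.
    planes = [((w >> k) & 1).astype(np.int8) for k in range(8)]
    p1 = np.random.default_rng(key1).permutation(w.size)   # first key
    p2 = np.random.default_rng(key2).permutation(w.size)   # second key
    out = []
    for plane in planes:
        flat = plane.ravel()[p1][p2]                       # disperse spatial location
        out.append(np.where(flat == 1, 1, -1).reshape(w.shape))
    return out

w = np.random.default_rng(0).integers(0, 256, size=(42, 42), dtype=np.uint8)
wd = preprocess_watermark(w, key1=1234, key2=5678)   # 8 planes of ±1 values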
B. Decomposition of video
For watermarking the motion content of video, we decompose the video sequence into a multiresolution temporal representation with a 2-band or 3-band perfect reconstruction filter bank by a 1-D Discrete Wavelet Transform (DWT) along the temporal axis of the video. To enhance the robustness against attacks on an identical watermark in each frame, the video sequence is broken into scenes, and the length of the 1-D DWT depends on the length of each scene. Let N be the length of a video scene, Fk the k-th frame in a video scene, and WFk the k-th wavelet coefficient frame. The wavelet frames are ordered from lowest frequency to highest frequency, i.e., WF0 is a DC frame. The procedure of multiresolution temporal representation is shown in Fig. 1.

Figure 1. Procedure of multiresolution temporal representation

The multiresolution temporal representation mentioned above is only along the temporal axis of the video. The robustness of spatial watermarking for each frame (especially for I-frames) should be considered in addition, for the sake of surviving MPEG video lossy compression. Hence, the wavelet coefficient frame WFk is decomposed into a multiresolution representation by the 2D discrete wavelet transform (2D-DWT). Fig. 2 shows the three-scale discrete wavelet transform with 3 levels using the Haar filter. Rk denotes the 3D wavelet coefficient frames.

Figure 2. Three-scale wavelet decomposition with three levels of the k-th wavelet coefficient frame in a video
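To illustrate the decomposition, here is a plain-NumPy sketch (ours, not the paper's implementation) of one Haar analysis step applied along the temporal axis of a scene, and of the three-level spatial recursion of Fig. 2 applied to each wavelet coefficient frame; even lengths are assumed throughout:

import numpy as np

def temporal_haar(frames):
    # One level of the 1-D Haar DWT along the temporal axis: low-frequency
    # (average) frames, of which WF0 is the DC frame, and detail frames.
    f = np.asarray(frames, dtype=float)
    return (f[0::2] + f[1::2]) / 2, (f[0::2] - f[1::2]) / 2

def haar2d(frame):
    # One level of the 2-D Haar transform: LL, LH, HL, HH subbands.
    lo = (frame[0::2, :] + frame[1::2, :]) / 2
    hi = (frame[0::2, :] - frame[1::2, :]) / 2
    return ((lo[:, 0::2] + lo[:, 1::2]) / 2, (lo[:, 0::2] - lo[:, 1::2]) / 2,
            (hi[:, 0::2] + hi[:, 1::2]) / 2, (hi[:, 0::2] - hi[:, 1::2]) / 2)

def spatial_dwt_3level(wf):
    # Three-scale decomposition as in Fig. 2: recurse on the LL band;
    # the LH3 band of the result is the one watermarked in Section III.
    bands, ll = [], wf
    for _ in range(3):
        ll, lh, hl, hh = haar2d(ll)
        bands.append((lh, hl, hh))
    return ll, bands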
III. VIDEO EMBEDDING AND EXTRACTING PROCEDURE

Fig. 3 shows the watermark embedding procedure. Assume that the original video is a series of gray-level images of size 352x288 and that the watermark image is an 8-bit gray-scale image of size 42x42.

Figure 3. Digital watermarking embedding scheme diagram

A. Watermark embedding procedure
The main steps of the digital watermark embedding process are as follows:

1) Video segmentation: the host video is segmented into scene shots, and some shots are then selected at random for embedding the watermark. For each selected scene shot, the following steps are repeated.

2) The shot is projected by the 1-D DWT and 2-D DWT into a multiresolution representation of three levels. Denote by Rk the 3D wavelet coefficient frames.

3) Each bitplane is adaptively spread spectrum and embedded in one original wavelet coefficient frame (subband LH3); hence 8 original wavelet coefficient frames are watermarked. For each pixel (i,j) of the selected area in Rk (k = 1, 2, ..., 8), the value is compared with the maximum of its eight neighbors, denoted t. The watermark is embedded by changing the corresponding coefficient value as shown in Eq. (1):

R'k(i,j) = Rk(i,j) + α Wk(i,j) Rk(i,j)    (1)

where α is an intensity factor, R'k denotes the watermarked 3D-DWT coefficient frames, and Wk (k = 1, 2, ..., 8) is the spread-spectrum watermark image sequence, which is the third key of our video watermarking scheme; it is given by Eq. (2) and illustrated in Fig. 4:

Wk(i,j) = 1 if (t > Rk(i,j) and Wdk(i,j) = 1) or (t ≤ Rk(i,j) and Wdk(i,j) = -1); Wk(i,j) = -1 otherwise.    (2)

Figure 4. The detail of the watermark embedding

4) By inverting the watermarked 2D-DWT and 1-D DWT wavelet coefficient frames, we obtain the watermarked video.

B. Watermark extracting procedure
1) We first parse the watermarked video into shots with the same algorithm as in watermark embedding; the 3D wavelet transform is then performed on each selected test video shot, giving the wavelet coefficient frames R'k (k = 1, 2, ..., n).

2) For each pixel R'k(i,j), its value is compared with the maximum of its eight neighbors, denoted t', and the corresponding bitplane value is extracted as shown in Fig. 5 and Eq. (3):

Wd'k(i,j) = 1 if (t' > R'k(i,j) and Wk(i,j) = 1) or (t' ≤ R'k(i,j) and Wk(i,j) = -1); Wd'k(i,j) = -1 otherwise.    (3)

Figure 5. The detail of the watermark detecting

3) The preprocessing and the pseudo-random permutation are reversed according to the predefined pseudo-random order for these bitplanes.

4) By composing these bitplanes into the gray-level image G0, the extracted watermark is reconstructed.

IV. EXPERIMENTAL RESULTS

The "foreman" and "Stefan" sequences, 100 frames long (about 4 seconds) with 352x288 pixels per frame, as shown in Fig. 6(a) and (b), were used in our experiments. The watermark, the 42x42 image "tire", is shown in Fig. 6(c). The corresponding experimental results for various possible attacks, such as frame dropping, frame averaging, frame swapping, and MPEG compression, are given in the following subsections. A similarity measurement between the extracted and the reference watermarks is used for objective judgment of the extraction fidelity; it is defined as

NC = (Σi Σj W(i,j) W'(i,j)) / (Σi Σj [W(i,j)]²)    (4)

which is the cross-correlation normalized by the reference watermark energy, giving unity as the peak correlation. We use this measurement to evaluate our scheme in our experiments. The peak signal-to-noise ratio (PSNR), a common image quality metric, is defined as

PSNR = 20 log10 (255 / RMSE)    (5)

where the root-mean-square error (RMSE) is computed between the original and the watermarked frame.
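Both quality measures are easy to compute; a short sketch (ours), assuming ±1 watermark bitplanes and 8-bit frames:

import numpy as np

def nc(w, w_ext):
    # Eq. (4): cross-correlation normalized by the reference watermark energy.
    w, w_ext = np.asarray(w, float), np.asarray(w_ext, float)
    return (w * w_ext).sum() / (w ** 2).sum()

def psnr(orig, marked):
    # Eq. (5): PSNR of the watermarked frame against the original.
    err = np.asarray(orig, float) - np.asarray(marked, float)
    rmse = np.sqrt(np.mean(err ** 2))
    return 20 * np.log10(255.0 / rmse)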
Figure 6. The two original scenes and the watermark image used in the experiments: (a) foreman scene; (b) Stefan scene; (c) original watermark.
Figure 7. The watermarked scenes: (a) watermarked foreman video (PSNR = 36.8211 dB); (b) watermarked Stefan video (PSNR = 35.4263 dB).

Figure 8. The watermark extracted from each scene: (a) from the foreman scene (NC = 0.9736); (b) from the Stefan scene (NC = 0.9587).

A. Frame dropping attack
There is little change between the frames within a shot, so frame dropping, in which some frames (the even-indexed frames) are removed from the video shot and replaced by the corresponding original frames, is an effective video watermark attack. The experimental result is plotted in Fig. 9.

Figure 9. NC values under frame dropping, plotted against the number of frames dropped: (a) foreman scene; (b) Stefan scene. From the experiment, we found that our scheme achieves good performance.

Frame averaging is also a significant video watermarking attack that can remove the dynamic composition of the watermarked video. In our experiment we use the average of the current frame and its two nearest neighbors to replace the current frame, for k = 2, 3, 4, ..., n-1. The corresponding results are presented in Fig. 10.

Figure 10. NC values under statistical averaging, plotted against the number of frames averaged: (a) foreman scene; (b) Stefan scene. It is found that the proposed scheme resists statistical averaging quite well.

B. Frame swapping attack
Frame swapping can also destroy some of the dynamic composition of the video watermark. We define the swapping mode by Fk(i,j) = Fk-1(i,j), k = 1, 3, 5, ..., n-1. The corresponding results are presented in Fig. 11.

Figure 11. NC values under frame swapping, plotted against the number of frames swapped: (a) foreman scene; (b) Stefan scene. From the experiment, we found that our scheme achieves good performance.
C. MPEG compression
MPEG compression is one of the basic attacks on a video watermark, and a video watermarking scheme should be robust against it. Fig. 12 shows the watermark extracted from the foreman scene after MPEG-2 compression.

Figure 12. Extracted watermark after MPEG-2 compression

V. CONCLUSION

This paper proposes an innovative blind video watermarking scheme in the 3D wavelet transform using a gray-scale image as the watermark. The process of this video watermarking scheme, including watermark preprocessing, video preprocessing, watermark embedding, and watermark detection, is described in detail. Experiments are performed to demonstrate that our scheme is robust against attacks by frame dropping, frame averaging, and lossy compression.

REFERENCES

[1] A. Piva, F. Bartolini, and M. Barni, "Managing copyright in open networks", IEEE Internet Computing, Vol. 6, Issue 3, pp. 18-26, May-June 2002.
[2] C.-S. Lu and H.-Y. Mark Liao, "Multipurpose watermarking for image authentication and protection", IEEE Transactions on Image Processing, Vol. 10, Issue 10, pp. 1579-1592, Oct. 2001.
[3] C. S. Lu, S. K. Huang, C. J. Sze, and H. Y. M. Liao, "Cocktail watermarking for digital image protection", IEEE Transactions on Multimedia, Vol. 2, pp. 209-224, Dec. 2000.
[4] J. Lee and S.-H. Jung, "A survey of watermarking techniques applied to multimedia", Proceedings of the 2001 IEEE International Symposium on Industrial Electronics (ISIE 2001), Vol. 1, pp. 272-277, 2001.
[5] M. Barni, F. Bartolini, R. Caldelli, A. De Rosa, and A. Piva, "A robust watermarking approach for raw video", Proceedings of the 10th International Packet Video Workshop (PV2000), Cagliari, Italy, 1-2 May 2000.
[6] M. Eskicioglu and J. Delp, "An overview of multimedia content protection in consumer electronics devices", Signal Processing: Image Communication, Vol. 16, pp. 681-699, 2001.
[7] G. Doërr and J.-L. Dugelay, "Video watermarking: overview and challenges", in Handbook of Video Databases: Design and Applications, B. Furht, Ed., CRC Press, September 2003.
[8] M. D. Swanson, B. Zhu, and A. H. Tewfik, "Multiresolution scene-based video watermarking using perceptual models", IEEE Journal on Selected Areas in Communications, Vol. 16, No. 4, pp. 540-550, May 1998.
[9] X. Niu and S. Sun, "A new wavelet-based digital watermarking for video", 9th IEEE Digital Signal Processing Workshop, Texas, USA: IEEE, 2000.
[10] X. Niu, Z. Lu, and S. Sun, "Digital watermarking of still images with gray-level digital watermarks", IEEE Trans. on Consumer Electronics, Vol. 46, No. 1, pp. 137-145, Feb. 2000.
A Cost Effective RFID Based Customized DVD-ROM to Thwart Software Piracy

Prof. Sudip Dogra, Electronics & Communication Engineering, Meghnad Saha Institute of Technology, Kolkata, India
Prof. Subir Kr. Sarkar, Electronics and Telecommunication Engineering, Jadavpur University, Kolkata, India
Ritwik Ray, Student, Electronics & Communication Engineering, Meghnad Saha Institute of Technology, Kolkata, India
Saustav Ghosh, Student, Electronics & Communication Engineering, Meghnad Saha Institute of Technology, Kolkata, India
Debharshi Bhattacharya, Student, Electronics & Communication Engineering, Meghnad Saha Institute of Technology, Kolkata, India
Abstract—Software piracy has been a very perilous adversary of the software-based industry from the very beginning of the latter's development into a significant business. No foolproof system has yet been developed to appropriately tackle this vile issue. In our scheme, we have tried to develop a way to embark upon this problem using the recently developed technology of RFID.

Keywords: DVD, DVD-ROM, Piracy, RFID, Reader, Software, Tag

I. INTRODUCTION

Over the years, the software industry has developed into a multi-billion dollar business, spreading its wings throughout the world. Not only in the commercial field: software is now being applied in almost all spheres of our life. Ranging from defense activities to health monitoring, there is software for every purpose. As a result, these software packages come with varying price tags. Software used in scholarly, medical or defense activities is generally highly priced because of its significance. The utmost peril that has been menacing this exceptionally vital industry is the act of software piracy. In our present work, we have tried to develop a DVD-ROM which will be capable of reading only authorized DVDs containing software, and will be used only for the purpose of storing costly, sensitive data. For this purpose, we have taken the help of the latest RFID technology. We discuss RFID and the functioning of a DVD-ROM in Sections II and III respectively. Following this, a brief discussion of software piracy is given in Sections IV and V. After this, we describe our scheme and list its advantages in Sections VI and VII respectively.

II. RFID: RADIO FREQUENCY IDENTIFICATION

RFID stands for Radio Frequency IDentification, a term that describes any system of identification wherein an electronic device that uses radio frequency or magnetic field variations to communicate is attached to an item. The two most talked-about components of an RFID system are the tag, which is the identification device attached to the item we want to track, and the reader, which is a device that can recognize the presence of RFID tags and read the information stored on them. The reader can then inform another system about the presence of the tagged items. The system with which the reader communicates usually runs software that stands between readers and applications; this software is called RFID middleware. In a typical RFID system [2], passive tags are attached to objects such as goods, vehicles, humans, animals, and shipments, while a vertical/circular polarization antenna is connected to the RFID reader. The RFID reader and tag can radio-communicate with each other using a number of different frequencies, and currently most RFID systems use unlicensed spectrum. The common frequencies used are low
frequency (125 kHz), high frequency (13.56 MHz), ultra high frequency (860-960 MHz), and microwave frequency (2.4 GHz). Typical RFID readers are able to read (or detect) tags of only a single frequency, but multimode readers, which are capable of reading tags of different frequencies, are becoming cheaper and more popular [3].

III. OPERATION OF A DVD-ROM

A DVD-ROM is very similar to a CD-ROM. It has a laser assembly that shines a laser beam onto the surface of the disc to read the pattern of bumps. The DVD player decodes the encoded data, turning it into a standard composite digital signal. The DVD player has the job of finding and reading the data stored as bumps on the DVD. Considering how small the bumps are, the DVD player has to be an exceptionally precise piece of equipment. The drive consists of three fundamental components:

• A drive motor to spin the disc. The drive motor is precisely controlled to rotate between 200 and 500 rpm, depending on which track is being read.

• A laser and a lens system to focus in on the bumps and read them. The light from this laser has a smaller wavelength (640 nanometers) than the light from the laser in a CD player (780 nanometers), which allows the DVD laser to focus on the smaller DVD pits.

• A tracking mechanism that can move the laser assembly so the laser beam can follow the spiral track. The tracking system has to be able to move the laser at micron resolutions.

Fig. 1. Functional Diagram of a DVD-ROM

Inside the DVD player, there is a good bit of computer technology involved in forming the data into understandable data blocks and sending them either to the DAC, in the case of audio or video data, or directly to another component in digital format, in the case of digital video or data. The fundamental job of the DVD player is to focus the laser on the track of bumps. The laser can focus either on the semi-transparent reflective material behind the closest layer or, in the case of a double-layer disc, through this layer and onto the reflective material behind the inner layer. The laser beam passes through the polycarbonate layer, bounces off the reflective layer behind it and hits an opto-electronic device, which detects changes in light. The bumps reflect light differently than the "lands", the flat areas of the disc, and the opto-electronic sensor detects that change in reflectivity. The electronics in the drive interpret the changes in reflectivity in order to read the bits that make up the bytes.

IV. SOFTWARE PIRACY: A MODERN MENACE

The copyright infringement of software (often referred to as software piracy) refers to several practices which involve the unauthorized copying of computer software. Copyright infringement of this kind is extremely common. Most countries have copyright laws which apply to software, but the degree of enforcement varies. After a dispute over membership between Iran and the USA led to the legalization in Iran of the unconstrained distribution of software (see Iran and copyright issues), there have been fears that world governments might use copyright politically. When software is pirated, customers, software developers, and resellers are harmed. Software piracy increases the risk that consumers' computers will be corrupted by malfunctioning software and infected with viruses. Those who supply defective and illegal software do not tend to provide sales and technical support. Pirated software usually has insufficient documentation, which prevents consumers from enjoying the full benefits of the software package. In addition, consumers are not able to take advantage of technical support and product upgrades, which are typically available to legitimate registered users of the software. Pirated software can cost consumers lost time and additional money.
Fig. 2. Rate of Software Piracy across the countries (Courtesy: IDC)

Developers lose revenue from pirated software, from current products as well as from future programs. When software is sold, most developers invest a portion of the revenue in future development and superior software packages. When software is pirated, software developers lose revenue from the sale of their products, which hinders the development of new software and stifles the growth of the software company.

V. SOFTWARE PIRACY: TYPES AND PREVENTIVE MEASURES

There are numerous kinds of software piracy. The bottom line is that once software is pirated, the developer does not receive compensation for their toil. We mention below a few methods which have been used contemporarily to check this despicable practice.

A. End User Piracy
Using multiple copies of a single software package on several different systems, or distributing registered or licensed copies of software to others. Another common form of end-user piracy is when a cracked version of the software is used: hacking into the software and disabling the copy protection, or illegally generating key codes that unlock the trial version, making the software a registered version, creates a cracked version.

B. Reseller Piracy
Reseller piracy occurs when an unscrupulous reseller distributes multiple copies of a single software package to different customers; this includes preloading systems with software without providing original manuals and diskettes. Reseller piracy also occurs when resellers knowingly sell counterfeit versions of software to unsuspecting customers.

C. Trademark/Trade Name Infringement
Infringement occurs when an individual or dealer claims to be authorized either as a technician, support provider or reseller, or is improperly using a trademark or trade name. Indications of reseller piracy are multiple users with the same serial number, lack of original documentation or an incomplete set, and non-matching documentation.

D. BBS/Internet Piracy
BBS/Internet piracy occurs when there is an electronic transfer of copyrighted software, i.e., when system operators and/or users upload or download copyrighted software and materials onto or from bulletin boards or the Internet for others to copy and use without the proper license. Often hackers will distribute or sell the hacked software or cracked keys. The developer does not receive any money for the software the hacker distributed; this is an infringement of the developer's copyright. Another technique used by software pirates is to illegally obtain a registered copy of software: pirates acquire the software once and use it on multiple computers. Purchasing software with a stolen credit card is another form of software piracy. Usually, software is sold in the market on secondary memory devices like CDs and DVDs, and necessary measures are taken so that the disks are copy-protected and there is little likelihood of replicating the valuable software stored on them. Table I lists some of the present technologies available for this purpose.

TABLE I. Various existing technologies used in the prevention of software piracy

1. Alkatraz - Copy protection for CD and DVD based on a "watermark" system.
2. CD-Cops - An envelope protection which is added to the CD's main executable.
3. CDShield - CDSHiELD protects a CD (before burning it) by inserting deliberate sector errors to prevent copying by unauthorized third parties.
4. HexaLock - HexaLock CD-RX media are specially made CD-Rs that contain a pre-compiled session, which includes security elements that make the discs copy-protectable.
5. LaserLock - Uses a combination of encryption software and a unique laser marking (a "physical signature") on the CD surface, made during the special LaserLock glass-mastering procedure, in order to make copying virtually impossible.
6. Roxxe - Roxxe CD protection is a combination of hardware and software protection that makes it impossible to run software from illegally copied CDs.
7. SafeDisc - An antipiracy solution for software publishers and developers to protect their intellectual property from copying, hacking and Internet distribution, while still ensuring a high quality experience for consumers.
8. SmarteCD - Smarte Solutions ("Smarte") provides piracy management solutions that secure and control the use of software and digital information while enhancing the distribution and marketing-related capabilities of those products.
9. StarForce - StarForce Technologies is well known to the games and software world for its copy protection systems for applications distributed on CD, DVD and CD-R.

VI. DESCRIPTION OF OUR PROPOSED SCHEME

A. Our Consideration
In our scheme, we have proposed a modified DVD drive in which only modified DVDs can be read. The basic architecture of both devices has been kept nearly the same; only the working of the devices has been changed. The list of items used for our scheme is given in Table II.

TABLE II. Components used in our scheme

1. DVD-ROM - 1
2. Short range RFID reader - 1
3. RFID passive tag - 4
4. Computer - 1
5. DVD - 4
6. Basic Stamp Microcontroller - 1

Each of the DVDs will be fitted with an RFID tag on the non-readable surface. The reader will be connected to the DVD-ROM; the interfacing will be done using a Basic Stamp microcontroller. The power supply will provide the necessary power to run the reader, the microcontroller and the DVD-ROM at the same time.

B. Functioning of our Scheme
The basic principle underlying the mechanism of this scheme is that of authentication of two parties before the transfer of information actually begins. In our case, the authentication process is carried out using RFID technology. Each of the DVDs will be provided with a set of two serial numbers. One will be written on the DVD and will be visible to the user. The second code will be stored inside the RFID tag and can be read only by the reader; this code will have to be stored in a database inside the computer. If the process is carried out by a software company, then the second code will be given out on the Internet in an encrypted form along with the serial number written on the DVD. The user will have to get this code first before he can run the DVD. A schematic diagram of the arrangement is shown in Fig. 3.
Fig. 3. Schematic Diagram of our arrangement
When the DVD is inserted into the drive, the reader antenna will first read the code stored in the tag and send it to the microcontroller. The microcontroller will match this code against the ones existing in the computer's database. If the code does not match any of the previously stored codes, it will eject the DVD and no data transfer will take place. It will send the signal to run the DVD only if it finds a match. Hence, the DVD-ROM won't be able to read any DVDs other than those having authenticated RFID tags. The flowchart of the working is shown in Fig. 4.
Fig. 4. Flowchart of the working of our scheme: start; read the code from the DVD and store it in C; if the code is present in the database, send the signals for running the DVD, otherwise send the signals for ejecting it; stop.
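A minimal sketch of this decision logic (ours; the actual system runs on a Basic Stamp microcontroller, and the codes shown are the four example codes used in the simulation of the next section):

AUTHORIZED_CODES = {"1000", "1001", "1010", "1011"}   # the computer's database

def on_dvd_inserted(tag_code):
    # Match the code read from the RFID tag against the database and
    # return the control signals of Fig. 4: run only on a match.
    if tag_code in AUTHORIZED_CODES:
        return {"run": 1, "eject": 0}     # authenticated: start the DVD
    return {"run": 0, "eject": 1}         # unknown code: eject, no data transfer

print(on_dvd_inserted("1010"))   # {'run': 1, 'eject': 0}
print(on_dvd_inserted("1111"))   # {'run': 0, 'eject': 1}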
Moreover, the DVD will be suitably encrypted so that it cannot be run on any other DVD-ROM, and the material stored on it cannot be copied even by the modified DVD-ROM. We have simulated the signals that would be sent by the microcontroller using Verilog HDL in MicroSim; the simulations are shown in Fig. 5.
Fig. 5. Simulation of the various signals used in our scheme, made in Verilog

As shown in the simulation, the codes stored in the four DVDs were 1000, 1001, 1010 and 1011. Whenever a DVD carrying a false code is encountered, the eject signal goes high and the run signal goes low, making the DVD-ROM eject the DVD. When the code matches one in the database, the run signal goes high and the eject signal goes low.

VII. ADVANTAGES OF OUR SCHEME

Over the years, piracy rackets in the software industry have taken a huge toll in terms of losses incurred in the sale of software. Numerous costly software products, such as operating systems and antivirus packages, are available on cheap CDs/DVDs in illegal markets in many parts of the world. Our scheme offers a cost-effective solution to this problem. The following advantages can easily be pointed out.

1) Since the special RFID DVD can only be run using an RFID optical drive, there is very little possibility of the content being copied, as the DVD will not start running without proper authentication.

2) As the DVD will be made especially for the purpose of selling costly software, the hardware will be properly configured so that there is neither any chance of transferring the software data onto any computer nor any chance of ripping the DVD.

3) New and advanced software is launched every day, eventually taking the place of older versions in the market. If our scheme is implemented by software companies, it will prevent newer versions of existing software from becoming cheaply available through piracy. Hence, customers using an older version will be forced to buy the newer version only from sources selling original copies.

4) In view of the decreasing prices of RFID readers and tags, a cheaper version of the modified DVD and its reader will be easily realizable for customers of limited financial means.

5) The scheme will also provide enhanced security for confidential data of high importance, and hence can be used in places where sensitive, high-priority data is handled.

VIII. ACKNOWLEDGEMENT

We would like to take this opportunity to show our gratitude to the faculty of the Electronics and Communication Department of our college, including our Head of the Department, Prof. Sudip Dogra, who provided invaluable contributions to the present work. This achievement is also dedicated to our administrator, Mr. Satyen Mitra, who provided continuous support for this work. A special mention goes to our friend Ms. Emon Dastider, who helped us with the composition of this document. Finally, we would like to thank Prof. Subir Kr. Sarkar for guiding us through this project.

IX. CONCLUSION

The basic advantage of our scheme lies in its cost-effectiveness and its simple design. Once implemented on a commercial basis, it will pose a great hindrance to the degraded practice of software piracy. There is also scope for further development of the design, which will enhance its efficiency and security.

AUTHORS PROFILE
Sudip Dogra received the B.Tech and M.Tech degrees from the Institute of Radio Physics and Electronics, University of Calcutta, in 1996 and 2003, respectively. He is doing his PhD at Jadavpur University. He served Andrew Yule & Company Limited (a Govt. of India enterprise) as a Development Engineer (R&D Dept.) for about 6 years before coming to the teaching profession. He joined as a faculty member in the Dept. of Electronics and Communication Engineering, Meghnad Saha Institute of Technology, Kolkata, in 2003. Presently he is Assistant Professor and Head of the Department in the Electronics & Communication Engineering Department of Meghnad Saha Institute of Technology, Kolkata. He has published more than 25 technical research papers in journals and peer-reviewed conferences. His most recent research focus is in the areas of 4th Generation Mobile Communication, MIMO, OFDM, WiMax, UWB, and RFID and its applications.

Ritwik Ray is pursuing his Bachelor's degree in Electronics & Communication Engineering at Meghnad Saha Institute of Technology. He has published more than 6 technical research papers in journals and peer-reviewed national and international conferences. His earlier works were done in the fields of 4G mobile communications, co-operation in mobile communication, mobile security and WiMAX. His present field of interest is RFID and its applications.

Saustav Ghosh is pursuing his Bachelor's degree in Electronics & Communication Engineering at Meghnad Saha Institute of Technology. He has published more than 6 technical research papers in journals and peer-reviewed national and international conferences. His earlier works were done in the fields of 4G mobile communications, co-operation in mobile communication, mobile security and WiMAX. His present field of interest is RFID and its applications.

Debharshi Bhattacharya is pursuing his Bachelor's degree in Electronics & Communication Engineering at Meghnad Saha Institute of Technology. He has published more than 6 technical research papers in journals and peer-reviewed national and international conferences. His earlier works were done in the fields of 4G mobile communications, co-operation in mobile communication, mobile security and WiMAX. His present field of interest is RFID and its applications.

Subir Kumar Sarkar completed his B.Tech and M.Tech from the Institute of Radio Physics and Electronics, University of Calcutta, in 1981 and 1983, respectively. He was in industry for about 10 years before coming to the teaching profession. He completed his Ph.D. (Tech) degree from the University of Calcutta in Microelectronics. Currently he is a professor in the Department of Electronics and Telecommunication Engineering, Jadavpur University. His present fields of interest include nano, single-electron and spintronic device-based circuit modeling, wireless mobile communication and data security in computer networks.
A O(|E|) Time Shortest Path Algorithm for Non-Negative Weighted Undirected Graphs
Muhammad Aasim Qureshi, Dr. Fadzil B. Hassan, Sohail Safdar, Rehan Akbar
Computer and Information Sciences Department, Universiti Teknologi PETRONAS, Perak, Malaysia
Abstract— In most shortest path problems, such as vehicle routing and network routing, we only need an efficient path between two points (source and destination), and it is not necessary to calculate the shortest path from the source to all other nodes. This paper concentrates on this idea and presents algorithms for calculating the shortest path for (i) non-negative weighted undirected graphs and (ii) unweighted undirected graphs. The algorithm completes its execution in O(|E|) for all graphs except a few in which a longer path (in terms of number of edges) from the source to some node makes it the best selection for that node. The main advantages of the algorithm are its simplicity and that it does not need complex data structures for implementation.

Keywords— Shortest Path, Directed Graphs, Undirected Graphs, Algorithm, Theoretical Computer Science

I. INTRODUCTION

Theoretical Computer Science (TCS) is one of the most important and hardest areas of computer science [17][18][19]. The single-source shortest paths problem (SSSP) is one of the classic problems in the algorithmic graph theory of TCS. Since 1959, all theoretical developments in SSSP for general directed and undirected graphs have been based on Dijkstra's algorithm, visiting the vertices in order of increasing distance from s. Many real-life problems can be represented as SSSP; as such, SSSP has been extensively applied in communication, computer systems, transportation networks and many other practical problems [1].

The Shortest Path Problem can formally be defined as follows. Let G be a graph such that G = (V, E), where V = {v1, v2, v3, ..., vn} and E = {e1, e2, e3, ..., em}, with |V| = n and |E| = m. G is an undirected weighted connected graph having no negative-weight edge, with a pre-specified source vertex s and destination vertex t such that s ∈ V and t ∈ V. We have to find a simple path from s to t with minimum total edge weight.

The complexity of Dijkstra's algorithm [10] has been determined as O(n^2 + m) if linear search is used to calculate the minimum [2]. A new heap data structure introduced by [4][5] to calculate the minimum reduced the complexity to O(m log n). The complexity was further improved [9] when Fredman and Tarjan developed the Fibonacci heap; the work in [9] was an optimal implementation of Dijkstra's algorithm in a comparison model, since Dijkstra's algorithm visits the vertices in sorted order. Using the fusion trees of [8], we get an O(m (log n)^(1/2)) randomized bound; their later atomic heaps give the O(m + n log n / log log n) bound presented in [7]. Afterwards, the priority queues of [11][12][16] gave an O(m log log n) bound and an O(m + n (log^(1+ε) n)^(1/2)) bound; these bounds are randomized, assuming that we want linear space. Later, [14] reduced the bound to O(m + n (log n log log n)^(1/2)), and the next year [15] improved it with a randomized bound of O(m + n (log^(1+ε) n)^(1/3)).

The priority queue presented in [6] for SSSP improved the shortest path cost, giving a running time of O(m + n (log C)^(1/2)), where C is the cost of the heaviest edge. The next work, [13], reduced the complexity to O(m + n (3 log C log log C)^(1/3)) expected time, and [15] presented a further improvement to O(m + n (log C)^(1/4+ε)). [3] presented an algorithm and claimed that it would outclass Dijkstra's algorithm.

Contrary to Dijkstra and many others, this algorithm attacks the problem from both ends (source and destination). It searches for the source node (i.e., s) starting from the destination node (i.e., t), and on the other side searches for the destination node starting from the source node, in parallel.

II. BASIC IDEA

This algorithm is basically an extension of the work done in [20]. The basic idea can best be described using an analogy of two distinct persons involved in the task of searching for a path between the starting point (Point1) and finish point (Point2) of a labyrinth. The first person, A, starts from Point1 and the second person starts from Point2, as illustrated in fig. 1. A explores all possible paths, searching for either B or Point2, and in the same way the second man, B, explores all the paths starting from Point2, looking for Point1 or A, as illustrated in fig. 2. They meet on their way (see fig. 3) to their destinations, and as soon as they meet they exchange and combine their information about the paths they have traversed; a complete path, along with its total cost, can then easily be made.
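For an unweighted graph, the two-searcher idea above corresponds to a bidirectional breadth-first search that stops when the two frontiers meet. The following Python sketch is a minimal rendering of that idea under our own assumptions (an adjacency-list dict, illustrative function names); it is not the paper's full weighted algorithm.

from collections import deque

def bidirectional_search(adj, s, t):
    """Expand level by level from both s and t; splice the halves on meeting."""
    if s == t:
        return [s]
    parents = {s: {s: None}, t: {t: None}}    # parent maps double as visited sets
    frontier = {s: deque([s]), t: deque([t])}
    while frontier[s] and frontier[t]:
        for side in (s, t):                   # alternate, one full level per turn
            other = t if side == s else s
            for _ in range(len(frontier[side])):
                u = frontier[side].popleft()
                for v in adj[u]:
                    if v in parents[side]:
                        continue
                    parents[side][v] = u
                    if v in parents[other]:   # the two searchers have met
                        return splice(parents, s, t, v)
                    frontier[side].append(v)
    return None                               # no path: graph is disconnected

def splice(parents, s, t, meet):
    """Walk back from the meeting node to s, then forward to t."""
    path, u = [], meet
    while u is not None:
        path.append(u)
        u = parents[s][u]
    path.reverse()
    u = parents[t][meet]
    while u is not None:
        path.append(u)
        u = parents[t][u]
    return path

# Example: bidirectional_search({'s': ['a'], 'a': ['s', 't'], 't': ['a']}, 's', 't')
# returns ['s', 'a', 't'].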
Figure 1: (step 1) Person A starts from Point1 and Person B starts from Point2
Figure 2: At the next levels, both A and B are exploring different paths in search of one another
Figure 3: Both A and B meet at some point and, by interchanging information, can make the whole path
III. ALGORITHM

A. Algorithm Input Constraints

This algorithm runs well for all graphs except the few having the following properties:

Len(P_i(s,w)) > Len(P_j(s,w))  and  ∑_{i=0}^{k−1} w(x_i, x_{i+1}) < ∑_{i=0}^{l−1} w(y_i, y_{i+1})

where k ≠ l, x_0 = y_0 = s, x_k = y_l = w, and P_i and P_j are paths from s to w such that P_i = {x_0, x_1, x_2, ..., x_k} and P_j = {y_0, y_1, y_2, ..., y_l}. That is, the exceptional graphs are those in which a path with more edges from s to some node w is cheaper than a path with fewer edges.

B. Algorithm Definition

This algorithm has three main parts, namely PartA, PartB and PartC. PartA and PartB are identical; each searches for the footmarks of the other. PartA is concerned with the search for the shortest path from the source node s to the destination node t, while PartB, targeting s, starts its search from t. Both parts perform similar actions and run in pseudo-parallel fashion, exploring nodes of the graph level by level.

First of all, the data structures are initialized as

Π_u ← NIL, CST_u ← ∞, CLR_u ← GREEN, DST_u ← −1,  ∀ u ∈ V(G)\{s, t}

while s and t are initialized with NIL, 0, YELLOW and 0 respectively.

Initially all nodes (except the source and destination) are painted GREEN; as soon as a node is explored during the traversal it is painted YELLOW, and as soon as a node completes its traversal (i.e., all its neighbors are explored, i.e., painted YELLOW) it is painted RED.

Each part (PartA and PartB) starts its investigation from its respective starting node (s for PartA and t for PartB) and explores all its neighboring nodes. Let us say s and t are level-0 nodes (NL0); all nodes explored from these nodes are level-1 nodes (NL1), all nodes explored from a level-1 node are level-2 nodes (NL2), and so forth.

Other than the NL0 nodes, every node has to calculate and keep track of its best cost (CST_x) along with the parent node (Π_x) giving that best cost. The status of each node is tracked by coloring it with a specific color (CLR), as follows:

GREEN: the node is neither explored nor traversed; the algorithm has not yet come across this node.
YELLOW: the node has been explored by some node; it can still be explored by other node(s).
RED: the node is explored and traversed. Each YELLOW node is picked and its neighbors are explored, and then it is painted RED.

While exploring the neighbors of any node (say p), the algorithm calculates the cost of the node being explored (the cost of the path from the NL0 node to the node at hand, h) in order to maintain the best cost so far (CST_h) and the best parent (Π_h) making that cost minimum. CST_h and Π_h are calculated as:

old_CST_h ← CST_h;  CST_h ← min(CST_h, CST_p + e(p,h))   (1)
if CST_h = CST_p + e(p,h) then Π_h ← p   (2)

Until all the YELLOW nodes of a level are converted to RED, no node of the next level is selected for traversal. As soon as one level is completed, control is switched to the other part of the algorithm, which proceeds by performing the same steps.

During these traversals, if some node is found that was marked RED by the other part, then the two nodes p and h are stored along with the total cost of the complete path, calculated as:

old_SPCST ← SPCST;  SPCST ← min(SPCST, CST_h + CST_p + e(p,h))   (3)
if SPCST = CST_h + CST_p + e(p,h) then SP ← (p, h)   (4)

In this way all possible connecting paths are covered and their costs are stored. The algorithm continues as long as there is any YELLOW node in the graph. When all nodes are colored RED, PartA and PartB stop and PartC is invoked. PartC, using a simple linear search, finds the minimum-cost path among the stored costs using the stored node pairs.
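Equations (1) to (4) amount to a single per-edge update. The Python fragment below is a minimal sketch of that update under an assumed state layout of our own (dicts for CST and Π, a set of nodes painted RED by the other part, and a dict holding SPCST/SP); it is illustrative, not the paper's full pseudocode.

import math

def relax(p, h, w_ph, CST, PI, other_part_red, best):
    """One edge update while node p is being traversed by one part."""
    if h in other_part_red:
        # Equations (3)-(4): edge (p, h) bridges the two searches, so record
        # the candidate end-to-end cost and remember the best meeting edge.
        cand = CST[p] + CST[h] + w_ph
        if cand < best["SPCST"]:
            best["SPCST"] = cand
            best["SP"] = (p, h)
        return
    # Equations (1)-(2): keep the minimum cost seen for h and its parent.
    if CST[p] + w_ph < CST.get(h, math.inf):
        CST[h] = CST[p] + w_ph
        PI[h] = p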
C. Working Example of the Algorithm

NOTE: For this example the color scheme is changed to WHITE, GRAY and BLACK for better display.

The algorithm starts with PartA (i.e., from source s), marking the cost of the node as 0 and painting it GRAY, as shown in fig. 4. On the other end, PartB starts in parallel from t, marking its cost as 0 and painting it GRAY, as shown in fig. 4.
Figure 4: Starting PartA and PartB from s and t respectively

Continuing from NL0, the algorithm starts its investigation from p = s, exploring all of its neighbors one by one, randomly. Assume h = b is picked: its cost and parent are adjusted using (1) and (2) (its cost becomes 0 + 4 = 4), s is marked as its parent, and b is marked as explored by painting it GRAY. Then the next neighbor is chosen and the whole process is repeated, and then the next neighbor is picked. This process continues until all the neighbors are painted GRAY. Upon completion of the exploration process, p is painted BLACK (see fig. 5).

On the other end, PartB starts its processing from NL0 and picks p = t for traversal. All of its neighbors are explored one by one, randomly. Supposing h = n is picked, its cost and parent are adjusted using (1) and (2) (its cost becomes 1), t is marked as the parent of n, and its status is changed to explored by painting it GRAY. Then the next neighbor is chosen, and the whole process continues until all the neighbors are painted GRAY. Upon completion of the traversal process, p is painted BLACK (see fig. 5).

Figure 5: Traversing level-0 nodes from both sides

Next, the NL1 nodes (a and b) are investigated one by one, checking their neighbors and performing the same actions of marking and/or adjusting costs and parents (using (1) and (2)) and painting the neighbors GRAY. All NL1 nodes are painted BLACK one by one (see fig. 6). The same process is repeated in PartB on the NL1 nodes (m and n) (see fig. 6).

Figure 6: Traversing level-1 nodes from both sides

PartA then repeats the same steps on the NL2 nodes (c, d, e and f) (see fig. 7).

Figure 7: Investigating nodes (c, d, e, f) (only the processing of PartA is shown)

PartB does the same on its side, exploring all nodes of NL2 (i, j, k and l) one by one. The notable point here is that node i explores e and f and finds them already traversed (painted BLACK) by the other part (i.e., PartA), so the algorithm stores the pair i and e along with the cost 3 + 6 + 9 (using (3) and (4)), and then stores the pair i and f with the cost 6 + 4 + 9 (using (3) and (4)). These connecting paths are marked in the figure, and the paths found in this step are explored and stored (see fig. 8).

Figure 8: Traversing i, j, k, l

PartA then starts exploring the NL3 nodes (i.e., g and h) that are still GRAY, performing the same steps PartB did in the previous step (see fig. 9). Here three new paths are explored and stored along with their costs.

Figure 9: Collision of the two parts of the algorithm making the SP

PartB has no GRAY nodes left to continue its traversal, so it terminates; and as all the nodes in the graph are now BLACK, PartA and PartB both terminate.

PartC is then invoked and calculates the minimum of all the costs stored so far, determining the shortest path (see fig. 10); in the algorithm listing this is embedded in the PartA and PartB calculations.

Figure 10: Linear search over the calculated paths resulting in the SP
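PartC itself is just a linear scan over the stored collision records. As a minimal sketch (the (cost, p, h) record layout, and the third record in the example, are our own illustration):

def part_c(records):
    """records: list of (cost, p, h) tuples stored when the two parts met."""
    return min(records, key=lambda r: r[0], default=None)

# e.g. part_c([(18, 'i', 'e'), (19, 'i', 'f'), (17, 'g', 'j')]) -> (17, 'g', 'j')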
D. Pseudo Code

In this pseudo-code we use four subroutines: Shortest_Path_Algorithm, Initialize, PartA_B_C and Print_Path. Shortest_Path_Algorithm is the main subroutine that invokes the other subroutines. Initialize performs all the initializations required for the execution of the algorithm. PartA_B_C is then invoked twice with different queues, making it act as PartA and as PartB; PartC is embedded at the end of PartA_B_C. Finally, Print_Path is invoked to print the shortest path.

1) Legend used in the algorithm:
CLR: color, one of GREEN (no processing has yet started on this node), YELLOW (processing has started on this node) and RED (processing on this node has completed)
CLRv: color of v
CLRu: color of u
CSTv: cost of v (i.e., minimum cost from the source to v)
DSTv: number of edges in the path from the source s to the current node v
REDo: RED painted by the other part of the algorithm (e.g., if PartA is currently executing, a node painted RED by PartB)
REDt: RED painted by this part of the algorithm (e.g., if PartA is currently executing, a node painted RED by PartA)
YELLOW_: the node is marked YELLOW and inserted in the next queue, and should not be processed from the current queue
Qs: queue used by PartA
Qt: queue used by PartB
SPCST: shortest path cost
SP: shortest path

Shortest_Path_Algorithm()
(1) Initialize()
(2) while (Qs ≠ ∅ AND Qt ≠ ∅)
(3) do
(4) .. PartA_B_C(Qs)
(5) .. PartA_B_C(Qt)
(6) Print_Path(SP)
---------------------------------------
Initialize()
(1) for each v ∈ V
(2) do
(3) .. CLRv ← GREEN
(4) .. Πv ← ∅
(5) .. CSTv ← ∞
(6) .. DSTv ← ∞
(7) CLRs ← YELLOW
(8) CLRt ← YELLOW
(9) EnQueue(Qs, s)
(10) EnQueue(Qt, t)
---------------------------------------
PartA_B_C(Q)
(1) Qtmp ← ∅
(2) while Q ≠ ∅
(3) .. do u ← DeQueue(Q)
(4) .. .. if CLRu ≠ YELLOW_
(5) .. .. .. then for each v ∈ Adj[u]
(6) .. .. .. .. do if CLRv = GREEN
(7) .. .. .. .. .. then CLRv ← YELLOW
(8) .. .. .. .. .. .. EnQueue(Qtmp, v)
(9) .. .. .. .. .. .. Πv ← u
(10) .. .. .. .. .. .. CSTv ← CSTu + e(u,v)
(11) .. .. .. .. .. .. DSTv ← DSTu + 1
(12) .. .. .. .. else if CLRv = YELLOW
(13) .. .. .. .. .. then if CSTv > CSTu + e(u,v)
(14) .. .. .. .. .. .. then if DSTv = DSTu & CLRv ≠ YELLOW_
(15) .. .. .. .. .. .. .. then EnQueue(Qtmp, v)
(16) .. .. .. .. .. .. .. .. CLRv ← YELLOW_
(17) .. .. .. .. .. .. .. Πv ← u
(18) .. .. .. .. .. .. .. CSTv ← CSTu + e(u,v)
(19) .. .. .. .. .. .. .. DSTv ← DSTu + 1
(20) .. .. .. .. else if CLRv = REDt
(21) .. .. .. .. .. then if CSTv > CSTu + e(u,v)
(22) .. .. .. .. .. .. then print "wrong graph"
(23) .. .. .. .. .. .. .. terminate algorithm
(24) .. .. .. .. else if CLRv = REDo
(25) .. .. .. .. .. then
(26) .. .. .. .. .. .. Πv ← u
(27) .. .. .. .. .. .. if CSTu + CSTv + e(u,v) < SPCST
(28) .. .. .. .. .. .. then
(29) .. .. .. .. .. .. .. SP ← (u, v)
(30) .. .. .. .. .. .. .. SPCST ← CSTu + CSTv + e(u,v)
---------------------------------------
Print_Path(SP)
(1) PTH[1 to DSTv + DSTu + 1]
(2) i ← DSTSP[1]
(3) PTH[i] ← u
(4) while p is not equal to NULL
(5) do
(6) .. i ← i − 1
(7) .. PTH[i] ← p ← Πp
(8) i ← DSTSP[1] + 1
(9) PTH[i] ← v
(10) while p is not equal to NULL
(11) do
(12) .. i ← i + 1
(13) .. PTH[i] ← p ← Πp
(14) for i ← 1 to DSTv + DSTu + 1
(15) do
(16) .. print PTH[i], ","
E. Complexity

The example shows that the algorithm successfully completes its execution for the targeted graphs. The algorithm starts with two parts, both traversing and covering the neighbors using edges, and a part never re-covers a node that it has already covered. Both parts move at the same pace (i.e., covering nodes level by level), so on average both parts cover almost the same number of nodes. In this way each part covers E/2 edges, for a total of E.

Embedded in PartA and PartB, PartC calculates the shortest path using linear search. As there cannot be more than E stored paths (in the worst case), the linear search can take at most E time to complete its execution and find the minimum-cost path. This makes the total complexity E + E = 2E, which is O(E).

The main advantage, as well as the beauty, of this algorithm is that it is very simple, easy to learn and easy to implement. At the same time, it does not require complex data structures.

This algorithm can therefore be applied to problems like vehicle routing, where road maps almost always grow in a hierarchical fashion and a situation in which a longer path gives a smaller cost occurs very rarely.

IV. SAME ALGORITHM FOR DIFFERENT TYPES OF GRAPHS

Applying this algorithm to weighted directed graphs produces a quick result, as it solves the given problem from two ends (i.e., source and destination).

A minor modification is required to calculate the shortest path for unweighted directed/undirected graphs of all types, without any bound and/or condition. The required modification is to terminate the algorithm as soon as one investigating node checks some node that is colored GRAY by the other part of the algorithm; in other words, as soon as the two parts collide for the first time, the algorithm terminates, and combining the paths of the two nodes gives the shortest path. Though this algorithm also works in O(E) in the worst case, which is also the complexity of BFS, results showed that it concludes quite efficiently and calculates the path in less time.

V. CONCLUSION

This algorithm is very efficient and robust for the targeted graphs due to its simplicity, and its constant factor is quite negligible. For all kinds of unweighted graphs, the algorithm showed promising results. Though it does not improve the asymptotic time complexity, in terms of the number of processing steps its results were much better (most of the time) than Breadth First Search. For non-negative weighted undirected graphs (except a few), it is a very fast and efficiently convergent algorithm for the targeted graphs.

REFERENCES

[1] Binwu Zhang, Jianzhong Zhang, Liqun Qi, "The shortest path improvement problems under Hamming distance," Springer Science+Business Media, LLC, 2006 (published online: 20 September 2006).
[2] Mikkel Thorup, "Undirected single-source shortest paths with positive integer weights in linear time," AT&T Labs Research, Florham Park, New Jersey, Journal of the ACM, vol. 46, no. 3, pp. 362–394, May 1999.
[3] Seth Pettie, Vijaya Ramachandran, and Srinath Sridhar, "Experimental evaluation of a new shortest path algorithm (extended abstract)," in D. Mount and C. Stein (Eds.): ALENEX 2002, LNCS 2409, pp. 126–142, Springer-Verlag, Berlin Heidelberg, 2002.
[4] Williams, J. W. J., "Heapsort," Commun. ACM 7, 6 (June), 347–348, 1964.
[5] John Hershberger, Subhash Suri, and Amit Bhosle, "On the difficulty of some shortest path problems," ACM Transactions on Algorithms, vol. 3, no. 1, article 5, 2007.
[6] Ahuja, R. K., Mehlhorn, K., Orlin, J. B., and Tarjan, R. E., "Faster algorithms for the shortest path problem," J. ACM 37, 213–223, 1990.
[7] Fredman, M. L., and Willard, D. E., "Trans-dichotomous algorithms for minimum spanning trees and shortest paths," J. Comput. Syst. Sci. 48, 533–551, 1994.
[8] Fredman, M. L., and Willard, D. E., "Surpassing the information theoretic bound with fusion trees," J. Comput. Syst. Sci. 47, 424–436, 1993.
[9] Fredman, M. L., and Tarjan, R. E., "Fibonacci heaps and their uses in improved network optimization algorithms," J. ACM 34, 3 (July), 596–615, 1987.
[10] Dijkstra, E. W., "A note on two problems in connexion with graphs," Numer. Math. 1, 269–271, 1959.
[11] Thorup, M., "On RAM priority queues," in Proceedings of the 7th Annual ACM-SIAM Symposium on Discrete Algorithms, ACM, New York, pp. 59–67, 1996.
[12] Thorup, M., "Floats, integers, and single source shortest paths," in Proceedings of the 15th Symposium on Theoretical Aspects of Computer Science, Lecture Notes in Computer Science, vol. 1373, Springer-Verlag, New York, pp. 14–24, 1998.
[13] Cherkassky, B. V., Goldberg, A. V., and Silverstein, C., "Buckets, heaps, lists, and monotone priority queues," in Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms, ACM, New York, pp. 83–92, 1997.
[14] Raman, R., "Priority queues: small, monotone, and trans-dichotomous," in Proceedings of the 4th Annual European Symposium on Algorithms, Lecture Notes in Computer Science, vol. 1136, Springer-Verlag, New York, pp. 121–137, 1996.
[15] Raman, R., "Recent results on the single-source shortest paths problem," SIGACT News 28, 81–87, 1997.
[16] Andersson, A., Miltersen, P. B., and Thorup, M., "Fusion trees can be implemented with AC0 instructions only," Theoret. Comput. Sci., 215, 337–344, 1999.
[17] Muhammad Aasim Qureshi, Onaiza Maqbool, "Complexity of Teaching: Computability and Complexity," in International Conference on Teaching and Learning 2007, organized by INTI International University College at Putrajaya, Malaysia, 2007.
[18] Muhammad Aasim Qureshi, Onaiza Maqbool, "Complexity of Teaching: Computability and Complexity," INTI Journal Special Issue on Teaching and Learning, 2007.
[19] Muhammad Aasim Qureshi, Mohd Fadzil Hassan, Sohail Safdar, Rehan Akbar, "Raison D'Être of Students' Plimmet in Comprehending Theoretical Computer Science (TCS) Courses," International Journal on Computer Science and Information Security, vol. 6, no. 1, 2009 (in press).
[20] Muhammad Aasim Qureshi, Mohd Fadzil Hassan, Sohail Safdar, Rehan Akbar, Rabia Sammi, "An Edge-wise Linear Shortest Path Algorithm for Non-Negative Weighted Undirected Graphs," Frontiers of Information Technology, December 2009 (in press).
Biologically Inspired Execution Framework for Vulnerable Workflow Systems
Sohail Safdar, Mohd. Fadzil B. Hassan, Muhammad Aasim Qureshi, Rehan Akbar
Department of Computer & Information Sciences, Universiti Teknologi PETRONAS, Malaysia

Abstract—The main objective of this research is to introduce a biologically inspired execution framework for workflow systems under threat due to an intrusion attack. Usually vulnerable systems need to be stopped and put into a wait state, so as to ensure data security and privacy while the system is recovered. This research ensures the availability of services and data to the end user while keeping data security, privacy and integrity intact. To achieve these goals, the behavior of chameleons and the concept of hibernation have been considered in combination. Workflow systems thereby become more robust through biologically inspired methods and remain safely available to business consumers even in a vulnerable state.

Keywords—IDS (Intrusion Detection System), WFMS (Workflow Management Systems), Chameleon, Hibernation.

I. INTRODUCTION

Nowadays, as the world moves towards economic growth, achieving business goals is of prime importance. The major requirement for achieving business goals is reliable business processes that provide customers with a great deal of satisfaction in terms of quality of service. Customized software is in common use to provide solutions for different business processes, increasing performance and providing quick, in-time, concrete trade results. These business processes are known as business workflow processes, or business workflows, in computing terms.

The major concern of any business is to secure all its data, and hence to keep the customers' as well as the company's privacy intact. Customer satisfaction, in terms of getting good-quality services well in time along with the guarantee of protected and secured transactions, is of prime importance.

Workflow Management Systems (WFMS) are the systems used to automate, manage, monitor and control the execution of workflow processes. A workflow process is a business process for an enterprise; it contains the set of workflow activities that are required to be completed in a specified sequence to finish the process, where each workflow activity is a single instruction or a set of instructions to be executed.

Workflows are currently a very active area of research. Various efforts have been made to provide business process optimization and to improve quality of service. Improving coordination among cooperating workflows, process synchronization, robustness of operational workflows, workflow representation techniques and secure workflows are all very active areas in which much research is going on.

Various mechanisms have accordingly been provided over time to secure workflow transactions, using workflow transaction management and IDS (Intrusion Detection Systems). The current research is motivated by the efforts that have been made to provide secure workflow systems and by the problems associated with those systems. Currently, WFMS rely merely on an IDS for intrusion detection. Once an intrusion is detected, the whole system is put into a wait state, and the running process is undone and redone to recover the faulty parts. This practice may cost customer satisfaction, as customers always want timely and accurate results with all protection provided. So when the system is in a vulnerable state, the following questions arise:

How will data be secured and its privacy maintained when an intrusion is found? How can the current and the remaining activities safely continue their execution? How will the workflow engine be able to execute the workflow process in a robust fashion and ensure the secure availability of the system along with the integrity of data to all customers?

All the above questions outline the problem associated with workflow systems that are in a vulnerable state due to some intrusion detected during their execution: to avoid the system entering a wait state whenever an intrusion is detected. The current research deals with all the concerns associated with this problem to provide the best possible solution. Specifically, the problem statement for the research is:

In the case of an intrusion threat, the system goes into an unsafe state. The workflow management system should ensure in-time availability of services while keeping data integrity intact, and continue the workflow process robustly to provide satisfactory results to the end user/customer.

The main objective of this research is to design a framework that provides data and service availability at all times, keeping data security and privacy intact, when
the intrusion strikes and the system goes into an unsafe state. The proposed framework will utilize biologically inspired mechanisms to provide data protection, security and privacy. The following section explains the background of the related literature in the context of the proposed research area, followed by an overview of the proposed research. The details of the related concepts and the proposed framework are explained under the proposed methodology.

II. BACKGROUND

WFMS is a very active area of research. Various efforts have been made in the areas of workflow representation, adaptive workflows, workflow performance and management issues, workflow security and self-healing in workflows. The current research relates to the areas of security and workflow system recovery, and the existing work involves different approaches for intrusion detection and subsequent system recovery. The Multi-Version Objects approach [1] replaces dirty objects with clean versions to recover the system, the whole system working with more than one version of each object: whenever an intrusion infects a data object, the system is stopped and then recovered to a previous state with the help of the clean versions of the objects. Graph theory from theoretical computer science is also referred to when recovery procedures are applied [18], [19], [20]. The trace-back recovery mechanism [2] is based on the Flow-Back Recovery Model [16], which uses traces of the flow of execution to recover the workflow system. Another approach utilizes the workflow specification to detect intrusions with the help of an independent Intrusion Detection System; it proposes an "Attack Tree Model" [3] to describe the major goal of the attack, splitting it into sub-goals. That work focuses on providing system recovery through dynamic regeneration of the workflow specification. The undo and redo mechanism is utilized to recover and bring the system to a consistent state; this approach deals with the exceptions raised by intrusions and regenerates the workflow specification dynamically so the workflow executes successfully. An architecture consisting of a BPEL (Business Process Execution Language) engine and a Prolog engine for intelligence is utilized to regenerate the workflow dynamically [3]. There is another architecture, named MANET [4], that provides additional features of mobile services, a workflow modeler and a policy decision point to regenerate the workflow specification more effectively [4]. Vulnerabilities are also detected by the use of a workflow layer on any system; a non-intrusive approach based on this architecture has been proposed for survivability in the cyber environment [5]. The overall security is based on the model [6] that a threat agent causes threats, which cause vulnerability; vulnerability causes risks, which can be reduced by a safeguard that protects an asset [6]. There are different approaches, such as the Do-It-All-Up-Front approach, All-or-Nothing, Threat Modeling and the Big Bang approach, for ensuring security on the web, each with its own pros and cons [7]. Ammann et al. [11] deal with transactions made by malicious users and recover the system by cleaning the data items infected by these transactions, undoing all those transactions. Panda et al. [17] provide a number of algorithms to recover the system based on dependency information that is stored separately. Eder and Liebhart [14] also study potential failures in workflows and their possible recovery mechanisms. The problems associated with recovery and rollback in a distributed environment have also been handled [15]. Some further work relates to concurrency control in databases and its transactions [12], [13]. It must be noted that whenever an intrusion strikes a workflow system and is detected, the system must be stopped immediately to avoid any data infection, so as to maintain integrity. All of these recovery methods therefore need a mechanism to undo all the faulty areas, which requires the system to wait, and then to redo the processing once the system gets back to a safe state. But making the system wait for recovery, and redoing the same processing again, annoys the customer. Hence, ensuring the availability of the system even in the unsafe state is very much required, and this has not yet been addressed by anyone.
III. OVERVIEW OF THE PROPOSED RESEARCH

A. Problem Statement

Businesses require 100% availability of their workflow systems, so that services are provided to customers securely and customers are fully satisfied with those services. When an intrusion is detected, the system needs to be stopped so that it can be recovered from the possible threat, and the availability of services during that time may not be possible. Hence, whenever a system goes into an unsafe state due to some intrusion, the workflow management system should provide:
• security and privacy of data;
• in-time availability of correct data, to ensure the completion of the desired transaction;
• robust completion of the workflow process, to provide satisfactory services to the end user/customer.

B. Objectives

The main objective of the research is to design an alternative execution framework for workflows in a vulnerable state such that it:
• provides robust execution of the entire workflow process;
• ensures data security and privacy;
• makes correct data available to customers in time.

C. Concerns

There are certain concerns associated with the methodology to achieve these objectives. How can data be secured and its privacy maintained when an intrusion is found? How can the remaining activities continue their execution? The answer to the first concern lies in the concept of chameleon characteristics: chameleon characteristics can be applied to the database portion used by the specific ongoing activity so that it can carry on. The answer to the second concern is that we can apply the concept of data hibernation. The following section provides the definitions of the concepts of chameleon data sources and data hibernation.
D. Definitions

The following are definitions of the useful concepts in this research paper.

1) Chameleon Data Sources: The term is taken from the chameleon's characteristic of changing color, and is defined as the changing of data values into unreadable data when the data source is found to be under threat. The concept is shown in Figure 1 and Figure 2.

Figure 1. Chameleon and its characteristics
Figure 2. Chameleon data source behavior (a normal data source is encrypted into a chameleon data source)

2) Data Hibernation: The term is derived from the concept of hibernation in animals, in which animals go to sleep for a certain period of time under the soil. It is defined as shifting data from the original data source to multiple dimensions when there is a threat to its integrity, and returning it to the original source when the threat is removed. The concept is shown in Figure 3 and Figure 4.

Figure 3. Behavior of animal hibernation
Figure 4. Behavior of data hibernation (data is shifted from the normal data source to the dimensions)

IV. PROPOSED METHODOLOGY

The proposed methodology is the baseline for the desired framework, which provides the execution of vulnerable workflows so that services and data remain available. The methodology includes designing a mechanism that provides and ensures data security, integrity and privacy in operational workflows. There is also a requirement for a mechanism to make the data available to the customer, retaining its integrity, when the system is in an unsafe state. These two mechanisms lead to the proposed framework for the execution of the vulnerable workflow system. The following is the explanation of the proposed methodology.

A. Explanation

1) Designing a mechanism to provide and ensure data security and privacy in operational workflows: This mechanism is biologically inspired by color-changing behavior, such as the chameleon's, and by the hibernation mechanism in wildlife. There are two milestones to achieve in dealing with this issue: one is handling the ongoing activity, the other is handling the upcoming activities.

Handling the ongoing activity while the system is declared unsafe due to some intrusion requires an implementation of the chameleon data sources concept, as follows.

a) Role of Chameleon Data Sources: The concept is drawn from the natural phenomenon of color changing by anoles and chameleons in case of any threat, providing them with appropriate camouflage. Taking inspiration from this concept and applying it to the portion of the data source being utilized by the ongoing activity lets that data be camouflaged and secured from the threat of intrusion, keeping its privacy and integrity intact. Applying the concept requires that data in the database be changed dynamically from a meaningful state into a meaningless state using encryption rules. It is not merely encryption of data: it is dynamically applying encryption to the data sets whenever the data's privacy seems to be in danger.
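As an illustration of what "dynamically applying encryption" could look like, the following Python sketch toggles a set of records into and out of a chameleon state with a symmetric cipher (the cryptography package's Fernet). The record layout and function names are our own assumptions, not part of the paper's design.

from cryptography.fernet import Fernet

key = Fernet.generate_key()      # held only by the trusted workflow engine
cipher = Fernet(key)

def to_chameleon_state(rows):
    """Encrypt every value: the data source turns meaningless to outsiders."""
    return [{k: cipher.encrypt(str(v).encode()) for k, v in row.items()}
            for row in rows]

def from_chameleon_state(rows):
    """Decrypt for the alternative commands that are allowed to keep working."""
    return [{k: cipher.decrypt(v).decode() for k, v in row.items()}
            for row in rows]

# e.g. to_chameleon_state([{"account": "1234", "balance": 500}]) yields rows
# whose values are unreadable until from_chameleon_state is applied.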
Simultaneously, the data associated with the upcoming activities should also be handled, which can be done by applying the data hibernation concept, as follows.

b) Role of Data Hibernation: This concept is likewise drawn from the behavior of animals residing under the soil for a specific time. During the modeling phase, chunks of the larger database need to be divided and modeled correctly, so that they can easily be integrated with the larger database, in terms of moving the portion of unsafe data to a small dimension and shifting it back when the system returns to a safe state.

The portion of the database schema whose data needs to hibernate should be remodeled using dimensional modeling. Each dimension of the database is one that is referred to during the execution of a specific workflow activity, i.e., the dimension is defined with respect to the context of a workflow activity. Each dimension is a normalized dimension, unlike the dimensions in the data warehousing context. The data is then transformed into its dimension using ETL and is accessed from that area until the system regains its safe state.

2) Designing a mechanism to make the data available in its correct form to the customer even if the system is in an unsafe state: Dealing with the ongoing activity requires continuing to refer to the same portion of the database on which the current transaction is based. Applying dynamic encryption to that portion of the database, giving it a chameleon nature, helps solve the problem in this scenario: not only does the data become meaningless to all external sources, it also remains ready for use by the alternative commands that are able to decrypt and use it. The point of consideration here is to make that portion of the database read-only, so that the encrypted data cannot be overwritten with dirty data by intrusion activity and rendered entirely useless. The changes made by the ongoing activity are therefore stored using caching; once the data has completed its required transformation, it is written to the relevant hibernated dimension. All upcoming activities refer to the hibernated data in the respective dimensions.

V. PROPOSED FRAMEWORK FOR ROBUST EXECUTION OF VULNERABLE WORKFLOWS

The following is the proposed algorithm for robust execution of vulnerable workflows, providing data and service availability to the customers in a non-discrete fashion.

1. An intrusion attack on a workflow process is detected using some IDS.
2. The workflow server signals the flag to the workflow engine.
3. On receiving the flag, the workflow engine interrupts the resource manager.
4. The resource manager forces the active data source to change its state, and hibernates the data into all of the dimensions except that of the currently active data.
5. The current workflow activity accesses the data using the encryption and decryption mechanism, while the upcoming workflow activities in the running system access the data from the hibernated data source.
6. Using these two key phenomena, the workflow transaction does not stop, and even in the unsafe state the whole system robustly keeps operating in a secured, available and manageable fashion.

Figure 5. Workflow system state when intrusion strikes
Figure 6. Workflow system state when intrusion is rectified

A. Explanation

The above working has to be done while the workflow system follows an alternative path due to the intrusion threat, and it needs to be carried on until the system has fully recovered from the threat, as shown in Figure 5 and Figure 6. In Figure 5, when the data in the main database has been encrypted, the data is at the same time made read-only, so that any activity trying to spoil the data by writing garbage onto it is also controlled. After the currently active task finishes its execution, the transformed results in memory and the existing data inside the encrypted database portion are written into the respective dimension. The other dimensions can be populated as a background process during the execution of the current task. When the system is recovered from the possible threat, or the threat is rectified, the data in the dimensions is transferred back to the original database at the appropriate locations. This whole phenomenon can be seen in Figure 6.
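Step 4's shift of data into a per-activity dimension, and the later shift back, can be pictured with a toy relational sketch. Everything below (the orders table, its activity column, the dim_ naming) is an illustrative assumption, not schema from the paper.

import sqlite3

def hibernate(conn, activity):
    """Move the rows an upcoming activity needs into its own dimension table."""
    dim = f"dim_{activity}"
    conn.execute(f"CREATE TABLE IF NOT EXISTS {dim} AS SELECT * FROM orders WHERE 0")
    conn.execute(f"INSERT INTO {dim} SELECT * FROM orders WHERE activity = ?", (activity,))
    conn.execute("DELETE FROM orders WHERE activity = ?", (activity,))
    conn.commit()

def wake_up(conn, activity):
    """Threat rectified: shift the hibernated rows back to the original source."""
    dim = f"dim_{activity}"
    conn.execute(f"INSERT INTO orders SELECT * FROM {dim}")
    conn.execute(f"DROP TABLE {dim}")
    conn.commit()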
Figure 7. Overall view of the execution of a vulnerable workflow in a secured fashion using the proposed framework

Figure 7 shows the overall view of the workflow process robustly executing under the proposed framework's guidelines, providing the availability of services and data to the customers.

B. Strengths & Weaknesses of the Proposed Framework

The framework gives the workflow great strength to continue its execution robustly and in a secured manner by making data and services available to the customers. Due to this robustness and security, end users and customers can rely on the system with more confidence. On the other hand, the proposed framework targets centralized data sources; it does not address the issues related to distributed data sources, which must be taken care of as future work.

CONCLUSION

This research contributes to resolving the issue of service unavailability to the end user or business customer in case of intrusion intervention in a workflow system. The services are not only available, but available in a secured fashion, keeping the privacy and integrity of the data intact. Moreover, the research is a pioneering step in the area of keeping a system working even in the unsafe state, so as to provide maximum satisfaction to the customer. Providing such a framework enables enterprises to run their own customized solutions based on the provided guideline. The work also aims at workflow processes that provide their own security. The framework targets a centralized data source; it may, however, be extended in the future to cater for distributed data sources and services.

REFERENCES

[1] Meng Yu, Peng Liu, Wanyu Zang, "Multi-Version Attack Recovery for Workflow Systems," Proceedings of the 19th Annual Computer Security Applications Conference (ACSAC 2003), IEEE.
[2] Meng Yu, Peng Liu, Wanyu Zang, "Self-Healing Workflow Systems under Attacks," Proceedings of the 24th International Conference on Distributed Computing Systems (ICDCS 2004), IEEE.
[3] Casey K. Fung, Patrick C. K. Hung, "System Recovery through Dynamic Regeneration of Workflow Specification," Proceedings of the Eighth IEEE International Symposium on Object-Oriented Real-Time Distributed Computing (ISORC 2005), IEEE.
[4] Casey K. Fung, Patrick C. K. Hung, William M. Kearns, Stephen A. Uczekaj, "Dynamic Regeneration of Workflow Specification with Access Control Requirements in MANET," IEEE International Conference on Web Services (ICWS 2006), IEEE.
[5] Kun Xiao, Nianen Chen, Shangping Ren, Kevin Kwiat, Michael Macalik, "A Workflow-based Non-intrusive Approach for Enhancing the Survivability of Critical Infrastructures in Cyber Environment," Third International Workshop on Software Engineering for Secure Systems (SESS 2007), IEEE.
[6] Gernot Goluch, Andreas Ekelhart, Stefan Fenz, Stefan Jakoubi, Simon Tjoa, Thomas Mück, "Integration of an Ontological Information Security Concept in Risk-Aware Business Process Management," Proceedings of the 41st Hawaii International Conference on System Sciences, 2008, IEEE.
[7] "Web Application Security Engineering," IEEE Security Magazine, published by the IEEE Computer Society, 2006, pp. 16–24.
[8] Margie Virdell, "Business processes and workflow in the Web services world," 2003, http://www.ibm.com/developerworks/webservices/library/wswork.html (referred in March 2009).
[9] Scott Mitchell, "Encrypting Sensitive Data in a Database," MSDN Spotlight, 2005.
[10] Sung Hsueh, "Database Encryption in SQL Server 2008 Enterprise Edition," SQL Server Technical Article, 2008.
[11] Paul Ammann, Sushil Jajodia, Peng Liu, "Recovery from malicious transactions," IEEE Transactions on Knowledge and Data Engineering, 2002, 14:1167–1185.
[12] P. A. Bernstein, V. Hadzilacos, N. Goodman, "Concurrency Control and Recovery in Database Systems," Addison-Wesley, Reading, MA, 1987.
[13] P. Chrysanthis, "ACTA, a framework for modeling and reasoning about extended transactions," PhD thesis, University of Massachusetts, Amherst, Massachusetts, 1991.
[14] J. Eder, W. Liebhart, "Workflow Recovery," Proceedings of the Conference on Cooperative Information Systems, 1996, pp. 124–134.
[15] M. M. Gore, R. K. Ghosh, "Recovery in Distributed Extended Long-lived Transaction Models," Proceedings of the 6th International Conference on Database Systems for Advanced Applications, 1998, pp. 313–320.
[16] B. Kiepuszewski, R. Muhlberger, M. Orlowska, "FlowBack: Providing backward recovery for workflow systems," Proceedings of the ACM SIGMOD International Conference on Management of Data, 1998, pp. 555–557.
[17] C. Lala, B. Panda, "Evaluating damage from cyber attacks," IEEE Transactions on Systems, Man and Cybernetics, 2001, 31(4):300–3.
[18] Muhammad Aasim Qureshi, Mohd Fadzil Hassan, Sohail Safdar, Rehan Akbar, "Raison D'Être of Students' Plimmet in Comprehending Theoretical Computer Science (TCS) Courses," International Journal on Computer Science and Information Security, vol. 6, no. 1, October 2009.
[19] Muhammad Aasim Qureshi, Mohd Fadzil Hassan, Sohail Safdar, Rehan Akbar, Rabia Sammi, "An Edge-wise Linear Shortest Path Algorithm for Non-Negative Weighted Undirected Graphs," Frontiers of Information Technology, December 2009.
[20] Muhammad Aasim Qureshi, Mohd Fadzil Hassan, Sohail Safdar, Rehan Akbar, "A O(|E|) time Shortest Path Algorithm for Non-Negative Weighted Undirected Graphs," International Journal on Computer Science and Information Security, vol. 6, no. 1, October 2009.
RCFT: Re-Clustering Formation Technique in Hierarchical Sensor Network

Boseung Kim, Joohyun Lee, Yongtae Shin
Dept. of Computing, Soongsil University, Seoul, South Korea

Abstract— Because the nodes of a sensor network are small and their battery capacity is limited, efficient use of energy is an important issue for sensor networks. The clustering technique reduces energy consumption because only the cluster head sends the sensed information to the sink node; for this reason, electing the cluster head is an important element of such networks. This paper proposes RCFT (Re-Clustering Formation Technique), which reconstructs clusters in hierarchical sensor networks. RCFT is a protocol that reorganizes randomly constructed clusters by considering the positions of the cluster heads and of the nodes. Simulations demonstrate that the resulting clusters are composed evenly and, accordingly, that energy consumption is reduced.

Keywords— Wireless Sensor Networks, Clustering

I. INTRODUCTION

As interest in ubiquitous environments has grown recently, much attention has also been paid to the sensor network, one of their components. Sensor nodes, which have limited energy, are mostly placed in areas that are dangerous or not easily accessible [1]. Because it is very difficult to replace sensor nodes once their energy is exhausted, the most important research goal in this field is to prolong the lifetime of the sensor network through proficient use of energy.

Since data aggregation is required to reduce the energy wasted on duplicate transmission of information between adjacent sensor nodes, cluster-based routing protocols have many advantages.

Selecting the cluster head is essential in cluster-based hierarchical routing protocols, and proper selection of the head saves power and disperses the energy consumption.

This paper suggests a technique that re-selects the cluster heads by taking into account the positions of the cluster heads and the distances between sensor nodes. The cluster heads are first selected at random, and the suggested technique is then applied to relocate them in turn. A cluster settled in this way is not re-organized at every round but remains fixed until the end of its lifetime. This research aims at dividing the clusters properly, decreasing the waste of energy by fixing the clusters, and prolonging the lifetime of the sensor network.

II. RELATED STUDY

A. LEACH (Low-Energy Adaptive Clustering Hierarchy)
LEACH [2] is a clustering-based routing technique whose purpose is to disperse the energy load among the sensor nodes. In LEACH, the sensor nodes organize themselves into clusters, in each of which one node plays the part of the head.

While functioning as a cluster head, a sensor node consumes much more energy than the ordinary sensor nodes, because it must collect and summarize the data from the other sensor nodes and transmit them to the BS (base station). So, assuming that all sensor nodes start with an identical level of energy, the nodes selected as cluster heads are exhausted quickly.

Therefore, to prevent this situation, LEACH rotates the cluster-head role among the many sensor nodes within the cluster. LEACH also performs regional aggregation of the data on the way from the cluster to the BS, which helps to save energy and to lengthen the lifetime of the system.

LEACH is composed of rounds, and each round has two stages: a 'set-up' stage, in which the clusters are organized, and a 'steady-state' stage, in which a number of TDMA frames are formed. LEACH is the basic technique for hierarchical sensor networks; LEACH-C [3], TEEN [4], and APTEEN [5], which remove some weak points of LEACH, have since been introduced.
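For reference, the head-rotation rule of LEACH [2] is usually stated as a threshold test (a standard formulation from the LEACH literature, quoted here for orientation rather than taken from this paper): in round $r$, each node $n$ draws a random number in $[0,1]$ and becomes a cluster head if the number is below

$$
T(n) =
\begin{cases}
\dfrac{P}{1 - P\,\bigl(r \bmod \tfrac{1}{P}\bigr)}, & n \in G,\\[2mm]
0, & \text{otherwise,}
\end{cases}
$$

where $P$ is the desired fraction of cluster heads and $G$ is the set of nodes that have not served as head during the last $1/P$ rounds.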
B. LEACH-C
LEACH-C (LEACH-Centralized) is also a clustering-based routing protocol. Although it is similar to LEACH, in LEACH-C the sink selects the cluster heads centrally, using information about each sensor node's position and remaining amount of energy.

During the 'set-up' stage, each sensor node transmits information about its present position and energy level to the BS. On receiving these messages, the BS calculates the average energy level of all the sensor nodes and then decides the cluster heads by minimizing the total sum of the distances between the cluster heads and the non-cluster-head nodes.

When the clusters are established, the BS broadcasts a message containing the IDs of the cluster heads to every sensor node, and the sensor nodes whose ID is identical to an ID in the message become the cluster heads.

The strong point of LEACH-C is that it leads to an even consumption of energy among the sensor nodes by drawing the cluster heads toward the centre of each cluster. However, each sensor node must know its own position, and for this it must be equipped with a GPS receiver. This apparatus raises the price of the sensor nodes considerably, and as the quantity of sensor nodes needed for a network ranges from hundreds to hundreds of thousands, such an increase in price is not acceptable [6].

III. CLUSTERING ALGORITHM SUGGESTED

A. Problem of the established Clustering Algorithm
LEACH re-organizes the clusters at the end of every round, and in this process the cluster heads are selected at random from among the sensor nodes that have not already been selected as heads. Accordingly, the clusters may or may not be divided evenly, as shown in Figure 1 [7]. If the clusters are not divided properly, the energy consumption of the sensor nodes increases and becomes uneven.

Figure 1. Division of clusters in LEACH

B. RCFT (Re-Clustering Formation Technique)
1) Overview of RCFT: RCFT first selects the cluster heads at random and then re-selects them considering the number of hops between the cluster heads and the number of hops to the cluster node farthest away from each head. After the new cluster heads are selected, RCFT re-organizes the clusters, which then remain fixed until the end of the network's lifetime.

2) Operation of RCFT: After broadcasting its advertisement, a sensor node selected as a first-round cluster head waits for responses for a while and, as each response arrives, checks whether it duplicates one already received from the same sensor node. For first-time responses, the head records the responding head with the smallest number of hops, and also records the sensor node with the largest hop count among the responding member nodes. If two or more sensor nodes share the largest hop count, the one that responded last is recorded, since responding last means it is the farthest away. Once a first-round cluster head has received the responses from all the nodes, it subtracts the hop count of the farthest sensor node from the hop count of the closest head. If the result is positive, the head role moves that many hops in the direction of the closest head; if the result is negative, it moves that many hops in the direction of the farthest sensor node. The absolute value of the result is used as a ttl value, and the node at which the ttl reaches 0 is selected as the new cluster head. If the result is 0, the first-round cluster head does not move and becomes the cluster head again. Figure 2 shows an example of the suggested technique.

Figure 2. Example of RCFT

In Figure 2, (a) on the left shows A and B selected as the first-round cluster heads. The clustering in (a) is uneven, which is common when the cluster heads are selected at random. (b) on the right shows the division of the clusters after the suggested technique is applied: head A moved 4 hops in the direction of its nodes, and head B moved 1 hop in the direction of its cluster. The irregularly divided clusters of (a) are thus divided comparatively evenly.
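As a compact restatement of the relocation rule above, consider the following sketch (ours, not the authors'; it assumes the hop counts have already been gathered, and the variable names are illustrative):

```cpp
#include <cstdlib>   // std::abs

// Direction in which the cluster-head role is shifted.
enum class Direction { None, TowardClosestHead, TowardFarthestMember };

struct Relocation {
    Direction dir;
    int ttl;   // hops the head role travels; the node where ttl hits 0 becomes head
};

// Decide how a first-round cluster head relocates (Section III-B-2):
// diff = hops to the closest other head minus hops to the farthest member.
Relocation relocateHead(int hopsToClosestHead, int hopsToFarthestMember) {
    int diff = hopsToClosestHead - hopsToFarthestMember;
    if (diff > 0) return { Direction::TowardClosestHead,    diff };
    if (diff < 0) return { Direction::TowardFarthestMember, std::abs(diff) };
    return { Direction::None, 0 };   // result 0: the head stays and serves again
}
```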
IV. EVALUATION OF EFFICIENCY
To show the superiority of RCFT, we compare and analyze the average number of sensor nodes per cluster in LEACH and in the suggested technique; the average distance between each cluster head and its member nodes in LEACH, LEACH-C, and the suggested technique; and the amount of energy consumed by the sensor nodes as the rounds repeat.
A. Condition of experiment
Table 1 shows the conditions used for evaluating the efficiency. In a field of 100m x 100m, the total number of sensor nodes is 100, and the cluster heads number 5, which is 5% of the total sensor nodes.
TABLE I. EXPERIMENT CONDITIONS

Classification         Factor                   Set-up
Work condition         Language                 Visual C++
                       OS                       Windows XP Professional
Experiment condition   Range of sensor field    100m x 100m
                       Total number of nodes    100 units
                       Number of heads          5
                       Position of BS           (50, 500)
                       Times of experiment      20 rounds x 10 times
                       Size of packet           2000 bit
To analyze whether the clusters are divided evenly, the distances between each cluster head and its member nodes and the number of member nodes are calculated at the end of every round. The average distance is obtained by dividing the sum of the distances between the cluster head and its member nodes by the number of member nodes; the experiment was conducted 10 times, each run lasting 20 rounds.

B. Result of experiment and Analysis
Figure 3 shows the average number of nodes per cluster. The closer the average number of nodes per cluster is to 20, the more evenly the clusters are divided. Figure 3 shows that RCFT comes closer to the average value of 20 than LEACH and that its deviations are smaller.

Figure 3. Average number of nodes per cluster

Figure 4 shows the distribution of the number of nodes per cluster. Because the cluster heads of LEACH are selected randomly, the number of nodes per cluster is irregular; as shown in Figure 4, widely varying cluster sizes occur, and in particular clusters with fewer than 10 or more than 31 nodes appear frequently. On the contrary, in the suggested technique clusters with 16 to 25 nodes, which can be considered a relatively good result, account for over 50% of the cases.

Figure 4. Distribution of the number of nodes per cluster

Figure 5 shows the average distance between the cluster heads and all their member nodes in LEACH, in the suggested technique, and in LEACH-C, which uses separate position information. The longer the average distance, the wider the cluster; the shorter the average distance, the closer the cluster head is to the centre of its cluster. A shorter average distance therefore indicates a more efficient cluster.

In LEACH the measured average distance was 21.11m. By contrast, the average distance was 20.68m in LEACH-C, which uses position information. The suggested technique averaged 20.88m, slightly worse than LEACH-C but of similar capability.

Figure 5. Average distance between clusters and nodes

The energy consumption of LEACH, LEACH-C, and RCFT was analyzed with the same numerical formula used for LEACH.

Figure 6 shows the average amount of energy consumed by the nodes. LEACH-C, using position information, consumed about 20% less energy than LEACH. During the first 20 rounds RCFT consumed almost twice as much energy as the other techniques, because it organizes the clusters one extra time at the beginning, but afterwards its consumption grows at a gradually lower rate. After round 120 the energy consumption of RCFT became smaller than that of LEACH, and it increases at a ratio similar to that of LEACH-C.

Figure 6. Average amount of energy consumption of the nodes
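The "numerical formula used for LEACH" is not restated in the paper; in the LEACH literature [2] energy consumption is commonly modeled by the first-order radio model below (quoted for orientation; $E_{elec}$ and $\varepsilon_{amp}$ are the standard notation, not the paper's):

$$E_{Tx}(k, d) = E_{elec}\,k + \varepsilon_{amp}\,k\,d^2, \qquad E_{Rx}(k) = E_{elec}\,k,$$

where $k$ is the message size in bits and $d$ the transmission distance.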
C. Analysis on experimental results
The results of the experiment show that the suggested technique is more efficient than LEACH with regard to the division of clusters: the average number of nodes per cluster was closer to the target value in the suggested technique than in LEACH, and the average distance between the cluster heads and the nodes was shorter.

In addition, the suggested technique achieved values similar to those of LEACH-C, which uses separate position information. Even though RCFT consumed more energy than the other techniques during the first 20 rounds, its consumption grew at a gradually lower rate than that of LEACH, and after 120 rounds the energy consumption of RCFT became much smaller than that of LEACH.

V. CONCLUSION

This paper suggests a Re-clustering Formation Technique for the hierarchical sensor network. For the sake of an efficient division of clusters, the suggested technique disperses and re-organizes the cluster heads considering the numbers of hops between the randomly organized cluster heads and their member nodes. A network model was implemented to analyze the efficiency of the suggestion. The analysis shows that the division of clusters is more efficient in the suggested technique than in the established techniques, which saves energy. It was also shown that the suggested technique is not much inferior to the one using separate position information. By minimizing the energy consumption of the entire network with the aid of the suggested Re-clustering Formation Technique, it is possible to achieve a more efficient communication environment in the hierarchical sensor network.

REFERENCES

[1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, "A Survey on Sensor Networks," IEEE Communications Magazine, vol. 40, no. 8, pp. 102-114, August 2002.
[2] W. R. Heinzelman, A. Chandrakasan, and H. Balakrishnan, "Energy-Efficient Communication Protocol for Wireless Microsensor Networks," Proceedings of the Hawaii International Conference on System Sciences, January 2000.
[3] W. B. Heinzelman, A. P. Chandrakasan, and H. Balakrishnan, "An Application-Specific Protocol Architecture for Wireless Microsensor Networks," IEEE Transactions on Wireless Communications, vol. 1, no. 4, October 2002.
[4] A. Manjeshwar and D. P. Agrawal, "TEEN: A Routing Protocol for Enhanced Efficiency in Wireless Sensor Networks," Proceedings of the 15th International Parallel and Distributed Processing Symposium (IPDPS'01) Workshops, p. 30189a, 2001.
[5] A. Manjeshwar and D. P. Agrawal, "APTEEN: A Hybrid Protocol for Efficient Routing and Comprehensive Information Retrieval in Wireless Sensor Networks," Proceedings of the 2nd International Workshop on Parallel and Distributed Computing Issues in Wireless Networks and Mobile Computing, Ft. Lauderdale, FL, April 2002.
[6] M. Ilyas and I. Mahgoub, "Handbook of Sensor Networks: Compact Wireless and Wired Sensing Systems," CRC Press, 2006.
[7] M. J. Handy, M. Haase, and D. Timmermann, "Low Energy Adaptive Clustering Hierarchy with Deterministic Cluster-Head Selection," IEEE, 2002.

AUTHORS PROFILE

B. Kim is a Ph.D. student in the Department of Computing, Soongsil University, Seoul, Korea. His current research interests focus on communications in wireless sensor networks (e-mail: [email protected]).

J. Lee is an M.Sc. student in the Department of Computing, Soongsil University, Seoul, Korea. His current research interests focus on communications in wireless sensor networks (e-mail: [email protected]).

Y. Shin received his M.Sc. and Ph.D. from the Computer Science Department, University of Iowa. He is now a professor in the Department of Computing, Soongsil University (e-mail: [email protected]).
An alternative to common content management techniques

Rares Vasilescu
Computer Science and Engineering Department, Faculty of Automatic Control and Computers, Politehnica University, Bucharest, Romania

Abstract— Content management systems use various strategies to store and manage information. One of the most common methods in commercial products is to use the file system to store the raw content, while the associated metadata is kept synchronized in a relational database management system. This strategy has its advantages, but we believe it also has significant limitations which should be addressed and eventually solved. In this paper we propose an alternative method of storing and managing content, aiming to find solutions to current limitations in terms of both functional and nonfunctional requirements.

Keywords— CMS; content management; performance; architecture

I. INTRODUCTION

Content management systems (CMS) can be defined as a set of processes and technologies supporting the digital information management lifecycle. This digital information is usually referred to as "content" and is typically unstructured or semi-structured, such as photographs, images, documents, or XML data. While a CMS can be viewed as a software application, it is more and more often used as a technological software platform on which other end-user applications are built. In turn, CMS are commonly based on other core technologies such as relational database management systems (RDBMS) and file systems, so it is common for information and processes to traverse multiple technical layers to implement a given functionality.

The usage of out-of-the-box components such as an RDBMS helps systems achieve a lower time to market and high reliability. On the other hand, this reuse comes with an inherent mismatch between components which can lead to non-optimal performance, in terms of both functional and nonfunctional needs. Following experiments [3], [6] and practice, we came to the conclusion that a high performance content management system needs to be designed specifically as a core infrastructure technology (as database management systems are), rather than employing multiple layers from applications down to data items.

We identified several key characteristics of CMS; during research and experiments each of them will be addressed and a new architecture implemented [8].

Section 2 presents this list of key functionalities, which should be addressed by a high performance implementation model. Section 3 describes the proposed information storage alternative, while the next section discusses the challenges this approach generates in terms of finding the managed data. The conclusion summarizes experimental results derived from the model implementation experience and from some performance benchmarks, and outlines the open points for further research.

II. CMS SPECIFIC FUNCTIONALITIES

In previous years, several efforts [1, 2] were made to standardize an interface to content management systems. These initiatives still have room to expand, but we can consider their existence a validation of the fact that CMS is becoming an infrastructure service, similar to database management systems and file systems. This supports our approach of designing a high performance implementation model for CMS that is not necessarily based on other infrastructure services.

In order to design a model for a CMS one must look at the key functions these systems provide and aim to implement them. Looking at the CMS functionality set, the following key features were identified:
Data (content and metadata) management
Security management
Ability to ingest content
Ability to process content
Ability to classify content
Retrieve data (metadata and content)
Allow and control concurrent access
Manage storage space
Allow collaboration on content
Allow definition of content enabled flows
We consider that each of these features can be explored from the point of view of high performance. The scope of this paper is not to address all of them, but to present the first steps taken in this direction and to outline the next activities for building a high performance CMS. Understanding how content management systems differ from other systems (such as database management or file management systems) is essential for being able to design and build a high performance variant.

Content management usually needs a specialized approach to data management, since it exhibits a set of characteristics among which we mention the following:

Manages complex data structures (not only data tuples with atomic values)

Shows a high variance in item data size

Nonstructured content processing (e.g. text- or image-based search) is necessary for standard data discovery functions

Security rules and management rules need to act at multiple levels on the complex data structures

A. Complex data structures
Each element managed by such systems is comprised of a set of metadata (key-value(s) pairs) and the content itself (e.g. the binary file representing a photo).

Metadata are not only simple key-value pairs in which the value is an atomic element; they can also contain complex, sometimes repetitive data structures (e.g. a table with many columns and rows). This characteristic is obviously in contradiction with the first normal form [5], and a relational database implementation would most probably not model it in this manner. What we consider essential is that the actual information can be modeled in various ways, and we should identify a method adequate for high performance.

The information also includes the actual content, which needs to be managed in sync with the metadata. There are ways of storing and managing this content inside relational database tuples, but experiments [3], [6] have shown that such methods pose specific performance problems. Adding more challenge, each content item can have multiple versions which need to be tracked and managed. Versioning is not natively supported by common database management systems, so we can expect such models to be less than optimal. Content is not only versioned but can also be represented in multiple formats (each version having multiple binary renditions, such as multiple image representation formats of a picture). The relationship between renditions, versions, and the information item itself should be addressed as core functionality.

B. High variance of item size
Managed items vary substantially in size between CMS implementations and even inside the same system. It is not unusual to encounter a system with item sizes ranging from several bytes to multiple gigabytes or even terabytes. A high performance system should address this characteristic at its core and provide means to efficiently store and manage each and every item, with performance scaling at least linearly with size.

C. Nonstructured content processing
We are used to finding and processing information by applying relational algebra to tuple-based data organizations. The fact that a piece of information is comprised of metadata and content at the same time leads to the need to at least enhance the algebra with operators which can work on content. Since content is unstructured (or semi-structured, in the case of XML data, for example), such operators are different in nature from the common ones. Content processing is an essential function of a CMS, and it is not unusual for it to be one of the most important functionalities evaluated when choosing such a system. It is therefore mandatory that the system architecture embeds these operators at its core.

Another fact is that technology evolves while content does not necessarily change. For example, a photo is taken at a certain moment in time and its original representation remains the same, while the manipulation technologies evolve and can extract and process more and more information based on that representation. Considering this, a CMS must allow this technological evolution without requiring a fundamental change, while still observing the performance topic.

D. Security management
Arguably one of the top performance factors is the security model implementation subsystem. This is because security should govern everything, and that is not a trivial task to fulfill.

Each managed element usually has an associated security set which determines who can perform what kind of operation on it. Characteristic of CMS is that these security rules apply not only at item level but also at sub-item level. For example, one system user could have permission to update some of a document's metadata but not the rest, and could operate on the content only for versioning, not for overwriting it. Moreover, such permissions could address only one item version or format, not all of them (e.g. a user could be authorized to see only the PDF format of an item which also has an editable text format).

III. PROPOSED STORAGE MODEL

The proposed model describes a content management system which stores data in an autonomous, self-descriptive manner, scalable both in terms of functionality and of usage. Individual content items are self-described and stored in a standardized format on generic file systems. The file format (Fig. 1) can follow any container convention (e.g. it can be XML based), but it is essential that it contain all the information necessary to manage that specific piece of content, regardless of the software built for this purpose.
Figure 1. Item file structure

The file is designed to contain multiple segments, each representing a specific data area characterizing the information item. These segments are expected to be stored in fixed-size data pages (e.g. 1KB page increments) so that eventual updates do not trigger a rewrite of the entire file (which would be time consuming). Of course, the paging increases the overhead on the storage space, and this needs to be considered when configuring the segment size. One option is to define the segment size for each item, or to choose it dynamically at system runtime based on item properties.

The header area begins the file and contains the format version and the key content item identifier. It must also contain the version series identifier and the version identifier. This makes each item very easy to identify without reading or processing the whole file. The strategy used to assign series identifiers is designed so that no update of existing version metadata is needed when a new version appears in the series, keeping existing items stable. It is essential not to need modifications to an existing item when related items appear or are modified in turn. The main reason behind this requirement is that items can also be stored on read-only media (such as tapes or optical disks) and are therefore physically not updateable. Also, compliance rules could mandate read-only access, so the requirement stems not only from a technical limitation but also from a legal perspective.

The metadata values are contained in the next section (pairs of metadata names and associated values). A significant decision we took is to define each metadata item completely (including its name and data type) without referencing a global data dictionary. This decision keeps the item self-described and independent of other data collections. The independence comes at the price of a storage overhead, since a metadata item which is present in several items is described in each of them. This overhead would be significant if there were a fixed schema by which to classify items. Instead, we choose not to employ a schema-based classification but to include in each item's metadata only the attributes which are relevant for that particular item. This decision also has an impact on the information retrieval techniques which need to be implemented, since traditional methods are no longer suited.

Another section contains data about the links to other items. Each other item is referenced by its unique identifier or by its version series identifier. Each relation also has a type classification to differentiate between possible link variants. Relations are necessary to link together different versions of the same item and different formats of the same version.

After all these sections, the file records the content itself. This positioning of the data was chosen for two main reasons: any update of the metadata or associated relations can happen without accessing the whole file contents, and the majority of content updates can be handled by a file update rather than an entire rewrite.

In special cases we could choose to add, at the end of the file, certificates ensuring the authenticity of item sections. These certificates can be applied using any kind of technique, but one common method is the IETF standard defined as RFC 3852 [7].

One addition to the above structure would be a header subsection which determines which sections of the file are protected in a different manner than the others. For example, the actual content and a part of the metadata may need to be read-only while some metadata information can still be added or changed. This is particularly useful for CMS compliance and retention needs.
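To make Figure 1 concrete, the sections described above could be rendered as the following sketch (our hypothetical reading, not the paper's actual on-disk format; all type and field names are invented for illustration):

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One metadata entry: fully self-described (name + type + values),
// so no global data dictionary is needed.
struct MetadataEntry {
    std::string name;
    std::string type;                  // e.g. "string", "int64", "table"
    std::vector<std::string> values;   // repeating values are allowed
};

// Typed link to another item (other versions, other renditions).
struct Relation {
    std::uint64_t targetItemId;
    std::uint64_t targetVersionSeriesId;
    std::uint32_t relationType;        // differentiates link variants
};

// Self-described item file, stored in fixed-size pages so that
// metadata updates do not force a rewrite of the whole file.
struct ItemFile {
    // Header: format version and identifiers, readable without
    // processing the rest of the file.
    std::uint32_t formatVersion;
    std::uint64_t itemId;
    std::uint64_t versionSeriesId;
    std::uint64_t versionId;

    std::vector<MetadataEntry> metadata;   // name/value section
    std::vector<Relation> relations;       // links to other items
    std::vector<std::uint8_t> content;     // raw content, stored last
    std::vector<std::uint8_t> signatures;  // optional CMS (RFC 3852) blobs
};
```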
IV. SPECIFIC PROPOSED MODEL CHALLENGES

The proposed model is based on a series of architectural decisions which have a significant impact on the overall system design. We discuss here some of the impacted functionalities and propose ways of mitigating the risk of negative impact while enhancing the benefits.
Content is often subject to retention rules. As information is transformed from physical supports (such as paper) to digital ones (e.g. scanned documents), the regulations extend in similar ways. CMS users expect their system to provide methods of enforcing compliance rules and managing these processes as efficiently as possible. It is not uncommon for regulations to state that certain content must be stored for many years (tens of years, or even permanently) on storage which prevents alterations (accidental or not). Looking back at current software application lifespans, we can see that operating systems evolve significantly every few years, and we hardly see systems which remain stable over a decade. Given this state of things, it is not reasonable to believe that a system built today will remain as-is for the time needed to manage its content. With consideration to the above, we proposed the item storage technique described in the previous section.
Each stored item has the essential ability of being self-described and stable during its lifecycle. Having items self-described empowers the system to employ information retrieval techniques which evolve in time while keeping the initial content unchanged. For example, when a new information processing technique is developed, the system can simply be extended to implement it and then run it over the existing repository of items. Moreover, the items can be stored on Write Once Read Many (WORM) media which can be kept outside the information system itself and processed only when needed (e.g. tape libraries). All of this is possible by keeping the item catalog (index structure) separate from the content. The main difference versus common existing CMS models is that the catalog structure does not have to be synchronized and maintained alongside the content itself: since the content is self-described, such a catalog can be entirely rebuilt in a matter of time.

As previously presented, the self-described characteristic comes with an associated cost: overhead on the needed storage space, and complexity of operations on the content file store generated by the paging algorithm. We believe this cost is reduced because items do not include a fixed schema but are classified by individual characteristics (not even using a schema associated with item types). The approach gives the flexibility to classify an item by an initial set of attributes determined by the top application logic and then eventually add more metadata as the content progresses through its lifecycle. It also helps a lot in cases where an item needs to be perceived differently by various applications (e.g. a content item representing an invoice is used and classified differently by an accounts payable application than by a records management one). Considering that items have multiple versions and formats, this approach significantly reduces the metadata associated with each one, since only the differentiating attributes (e.g. format type) need to be stored on these items, the rest being inherited through the use of relations.

The current large majority of content management systems need to keep a data dictionary describing each type of content item they manage. This might be seen as convenient for a number of system administration tasks, but we actually found that it imposes a lot of restrictions and overhead. It is also inherently inflexible, and a lot of workarounds need to be designed in order to allow concepts like "inheritance" or "aspects".

A challenge of the proposed model is to retrieve and make use of the stored items. Merely storing the self-described items does not provide an efficient manner of accessing them by applying search filters, although this is possible with a full scan-and-filter approach. It is thus necessary to implement a data discovery mechanism which enables applications to use the CMS for fast item retrieval and processing.

The proposed model also embraces the lack of schema. Since there is no enforced schema, the top application is left with the task of choosing how an item is classified and then retrieved. Although this decision differs from the commonly established practice of the database system enforcing a schema which is then obeyed by caller applications, we consider that this enforcement is necessary only while applications are not yet stable enough (e.g. in development phases); afterwards the application itself becomes an enforcer of the schema. This assumption is based on actual solution implementation experience and on the observation that even though database systems have the ability to enforce referential constraints between tables, these features are seldom used when performance is key.

While it can be the task of the application to determine the metadata used for an item, it is still the task of the CMS to manage these data and to provide a way to filter them efficiently. We propose a system which includes a collection of independent agents, each of them processing an atomic part of the data: a metadata field or a piece of content. Once an item is created or updated, these agents are triggered, and each of them processes and indexes the associated data. When search queries are submitted, the filters are split into basic operators and then submitted in parallel to the respective search agents. These agents process the sub-queries and return results as they are found to a query manager, which aggregates them and replies to the top application with partial results as they are collected.

A key advantage is that the top application can receive not only precise results but also results which partially match the search criteria. While this can seem inadequate (having a system which does not return precise matches), it can prove very efficient in practice, since a user may be satisfied to obtain an initial set of results very fast and then, while evaluating the partial set, receive the complete result. One should note that the terms "partial" and "complete" above refer not only to the quantitative measure of the result (number of returned items) but also to the matching of partial or complete filter criteria.

A challenge to this model is the query optimization technique, which cannot be based on traditional relational database models given the lack of schema and of the related statistical information. Solving this challenge requires a change in the approach to optimization itself: not aiming to provide a complete response opens the door to other optimization techniques, focusing on feedback from actual execution rather than on preparing a query plan. This optimization should take into account the fact that, given the vertical organization of the metadata (each agent having its own specific associated metadata item), memory locality of frequently used index structures can help the process a lot. Since memory efficiency tends to grow at a faster pace than disk efficiency, and processors include more and more multi-core elements, we expect that an architecture geared toward memory usage and parallel processing will currently provide the maximum possible performance.
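A minimal sketch of the agent scheme described above could look as follows (our illustration only; the paper does not prescribe data structures, and we reduce "basic operators" to exact-match filters combined by AND):

```cpp
#include <cstdint>
#include <future>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using ItemId = std::uint64_t;

// One agent indexes a single metadata field: value -> item ids.
struct FieldAgent {
    std::map<std::string, std::set<ItemId>> index;
    std::set<ItemId> find(const std::string& value) const {
        auto it = index.find(value);
        return it == index.end() ? std::set<ItemId>{} : it->second;
    }
};

// A query is split into (field, value) basic operators; each agent
// answers its sub-query in parallel, and the query manager combines
// the partial results (here: intersection) as they arrive.
std::set<ItemId> runQuery(
        const std::map<std::string, FieldAgent>& agents,
        const std::vector<std::pair<std::string, std::string>>& filters) {
    std::vector<std::future<std::set<ItemId>>> partials;
    for (const auto& [field, value] : filters) {
        const FieldAgent& agent = agents.at(field);   // throws if field unknown
        partials.push_back(std::async(std::launch::async,
            [&agent, value] { return agent.find(value); }));
    }
    std::set<ItemId> result;
    bool first = true;
    for (auto& f : partials) {
        std::set<ItemId> part = f.get();   // partial result from one agent
        if (first) { result = std::move(part); first = false; continue; }
        std::set<ItemId> merged;
        for (ItemId id : result)
            if (part.count(id)) merged.insert(id);
        result = std::move(merged);        // AND-combination of the criteria
    }
    return result;
}
```

In a real system the manager would stream each partial set to the caller as it arrives, rather than waiting for the full intersection, which is precisely the "partial results" behavior described above.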
Coming back to item storage, a key element is the file organization within the file system itself. Tests [3] have shown that file systems generally provide very good performance, but files need to be organized properly beforehand. For example, it is not advisable to store millions of files in the same "folder" of a file system. While this is perfectly possible in modern file systems, it can cause a major performance impact on accessing that folder, and thus any of the contained files. Although there are a lot of different file management systems available, this behavior is valid for most of them. The proposed solution is to store files in such a way that the location of a file is determined by its unique object identifier and that no more than 256 files exist in the same folder. This is achieved by representing the unique identifier as a hexadecimal number, resulting in 8 pairs of 2 digits. The least significant pair represents the filename; the remaining pairs represent, in order, the folder names on the path to that content file. By applying this simple logic, files do not overwhelm the file system folders, and each item is directly identified on disk, saving a lot of expensive I/O operations.
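Following that description, the identifier-to-path mapping could look like this (a sketch under our reading of the scheme, with the least significant pair as the file name):

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Map a 64-bit item identifier to a storage path: 8 hex pairs, where the
// most significant 7 pairs become nested folder names and the least
// significant pair the file name, so no folder ever holds more than 256 files.
std::string pathFromId(std::uint64_t id) {
    char pair[3];
    std::string path;
    for (int shift = 56; shift >= 8; shift -= 8) {   // 7 folder levels
        std::snprintf(pair, sizeof pair, "%02x",
                      static_cast<unsigned>((id >> shift) & 0xFF));
        path += pair;
        path += '/';
    }
    std::snprintf(pair, sizeof pair, "%02x",
                  static_cast<unsigned>(id & 0xFF));
    path += pair;                                    // file name
    return path;
}
// Example: pathFromId(0x0123456789ABCDEF) -> "01/23/45/67/89/ab/cd/ef"
```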
Other concerns of the proposed architecture are modern challenges such as refreshing digital signatures on fixed content for items which need long retention periods (e.g. over 10 years). For this reason, the content file has a dedicated area at the end of the file to store digital signatures over its various areas (metadata and/or content). Multiple signatures can be stored for any area (e.g. successive signatures for the same content part).

V. CONCLUSION AND NEXT STEPS

Independent studies [4] show that about 80% of stored data is not inside a database management system and that the total volume increases exponentially, reaching over a thousand exabytes by 2011 (ten times more than in 2006).

We believe that designing a CMS able to handle very large structured and semi-structured content is key to keeping pace with this information growth. To validate the content storage techniques presented at a high level in this paper, we are working on an actual implementation of the system and on benchmarking it against other common CMS products.

Since there is no accepted benchmark procedure for content management systems, we will consider the functional elements defined by industry standards such as CMIS [1], but we will also include nonfunctional requirements such as the ability of the system to manage information over extended time periods.

REFERENCES

[1] OASIS, "Content Management Interoperability Services (CMIS) TC", 01.04.2009, http://www.oasis-open.org/committees/cmis, accessed on 25.09.2009.
[2] Java Community Process, "JSR 170 – Content Repository for Java Technology API", 24.04.2006, http://jcp.org/en/jsr/detail?id=170, accessed on 25.09.2009.
[3] M. Petrescu, R. Vasilescu, D. Popeanga, "Performance Evaluation in Databases – Analysis and Experiments", Fourth International Conference on Technical Informatics CONTI'2000, 12-13 October, "Politehnica" University of Timisoara.
[4] J. F. Gantz, "The Diverse and Exploding Digital Universe", IDC, 2008.
[5] E. F. Codd, "A Relational Model of Data for Large Shared Data Banks", Communications of the ACM 13 (6), pp. 377-387, 1970.
[6] S. Stancu-Mara, P. Baumann, V. Marinov, "A Comparative Benchmark of Large Objects in Relational Databases", Proceedings of the 2008 International Symposium on Database Engineering & Applications, 2008.
[7] R. Housley, "RFC 3852 – Cryptographic Message Syntax", July 2004.
[8] R. Vasilescu, "Architectural Model for a High Performance Content Management System", The 4th International Conference for Internet Technology and Secured Transactions (ICITST 2009), London, November 2009, in print.

AUTHORS PROFILE

Dipl. Eng. Rares Vasilescu is a PhD student at Politehnica University, Faculty of Automatic Control and Computers, Computer Science and Engineering Department, Bucharest, Romania. Previous work includes studies and experiments on the performance of database management systems. Current research addresses the area of content management systems in preparation of the PhD thesis conclusion.
Routing Technique Based on Clustering for Data Duplication Prevention in Wireless Sensor Network

Boseung Kim, Huibin Lim, Yongtae Shin
Dept. of Computing, Soongsil University, Seoul, South Korea

Abstract— In wireless sensor networks, the nodes' energy consumption is important for the long activity of the sensor nodes, because the nodes that compose the network are small and their battery capacity is limited. To decrease the energy consumption of the sensor nodes, routing techniques for sensor networks are divided into flat routing and hierarchical routing. In particular, hierarchical routing is an energy-efficient class of protocols that reduces the overall energy consumption of the sensor nodes and disperses it by forming clusters whose members communicate with a cluster head. However, although clustering-based hierarchical routing has advantages over flat routing, it is often not used because it is considered unrealistic: it does not take the actual data transmission radius of the sensor nodes into account. This paper therefore proposes a realistic routing technique based on clustering.

Keywords— Wireless Sensor Networks, Clustering

I. INTRODUCTION

Recent technologies in wireless communication and electronics have made it possible to develop small, multi-functional sensor nodes which can communicate over short distances at low cost and with a relatively small amount of electrical power. The network protocol is one of the technical factors in organizing a wireless network. As a wireless sensor network has constraints that traditional networks do not, it is important to understand these traits before designing such a network.

Among these traits, the requirement of efficient utilization of the energy resources should be regarded as the most important one to reflect in the network protocol. If a network protocol operates in a wireless sensor network where communications occur frequently, without any consideration of the energy resources, it can interfere with the operation of the network by causing separation, isolation, or interruption of the network [1,2,3].

Routing protocols for wireless sensor networks diverge largely into flat routing protocols and hierarchical routing protocols. A flat routing protocol regards the whole network as one region in which all nodes participate, and has multi-hop routing as its trait. A hierarchical routing protocol divides the network into many regions based on clusters and grants the role of head to specific nodes in each region [10].

This paper suggests RTBC (Routing Technique Based on Clustering), a cluster-based routing protocol which organizes the network per cluster, grasps the traits of the communication occurring in a wireless sensor network, and controls the energy resources at the protocol level. RTBC sets up a route between the sink and each cluster head by using the data values of the randomly distributed sensor nodes, and suggests a technique by which each member node efficiently transmits its sensing information within a cluster organized around cluster heads selected randomly as in LEACH [4].

The structure of the paper is as follows: Section 2 discusses the subjects to be considered in hierarchical protocols for sensor networks, and analyzes various traits as well as weak and strong points. Section 3 suggests RTBC, a cluster-based routing protocol which can transmit sensing information efficiently by organizing the network per cluster. Section 4 presents a simulation of RTBC and analyzes the efficiency of the suggested protocol. Finally, Section 5 summarizes the paper and suggests directions for further research.

II. RELATED STUDY

Flooding is the traditional technique used in wireless sensor networks. In flooding, a node that receives a packet repeatedly retransmits it to its adjacent nodes unless the node is the final destination or the packet has reached its maximum hop count. However, flooding has three problems which must be overcome before it can be used in a wireless network: duplicate messages, duplicate sensing, and energy efficiency.

SPIN (Sensor Protocols for Information via Negotiation) [5] is a protocol that transmits sensing information to many nodes via a three-step negotiation, in order to improve on the duplicate messages, duplicate sensing, and energy inefficiency pointed out as weak points of flooding. A SPIN message includes meta-data, which is a concise description of the sensing information. A node detects duplicate messages and duplicate sensing through negotiation over this meta-data before the actual message is transmitted. This use of meta-data to control the network protocol is distinctive for SPIN.
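For orientation, the three-step negotiation of SPIN [5] uses three message types; the following is a schematic summary (our simplification of the protocol, not code from the paper):

```cpp
// Three-step negotiation of SPIN [5] (schematic):
//   1. ADV  - a node advertises new data by broadcasting only its meta-data.
//   2. REQ  - a neighbour that does not yet hold the data requests it.
//   3. DATA - the actual sensing data is transmitted only on request,
//             so duplicate messages and duplicate sensing are avoided.
enum class SpinMessage { ADV, REQ, DATA };
```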
DD (Directed Diffusion) [6] is a data-centric routing technique based on interest broadcasting by the sink, which can deliver sensing information about a specific region to arbitrary nodes. DD transmits the sensing information after setting up a route, in three steps, backwards from the targeted region to the source nodes.

LEACH (Low-Energy Adaptive Clustering Hierarchy) [4] is a clustering-based routing protocol whose purpose is to disperse the energy load of the nodes, which organize the network by themselves. In LEACH, the selected cluster heads collect the sensing information from the member nodes of their clusters and transmit it directly to the sink.

A noted trait of this technique is that LEACH rotates the energy-intensive cluster-head role at random in order to distribute the energy consumption equally over all the sensor nodes in the network, and that the cluster heads collect and aggregate the data of their clusters to save communication cost. However, it is difficult to apply LEACH to real situations, considering that all the nodes selected as cluster heads must communicate directly with the sink.

III. ROUTING TECHNIQUE BASED ON CLUSTERING

A. RTBC
Even though hierarchical clustering-based routing algorithms have more strong points than flat routing algorithms, they are often not used because they are unrealistic. In order to apply a hierarchical routing algorithm to a real model, the data transmission radius of the sensor nodes must be taken into consideration. IEEE 802.15.4, known as the standard for sensor networks, defines the data transmission radius of a sensor node as 10m [7]. MICA2, one of the most commonly used sensor nodes, also specifies a maximum radius of 10m [8]. This paper likewise limits the maximum data transmission radius of the nodes to 10m.

This chapter suggests RTBC (Routing Technique Based on Clustering), which uses sensor nodes with this limited transmission radius. Like LEACH, RTBC selects cluster heads among the nodes with equal probability over time, and organizes the clusters around the selected cluster heads.

1) Selecting Cluster Heads: The first step is to obtain information about the sensor nodes, which are initially distributed at random, in order to select the cluster heads. The sink transmits a query message to the sensor nodes that are one hop away, and it can count the number of randomly distributed sensor nodes as each node reports its hop count and ID in response to the query message. Using the sensor node information obtained this way, the sink, like LEACH [4], selects cluster heads among the nodes with equal probability over time, so that the energy consumption is balanced between the nodes in the network.

2) Organizing Cluster: As in LEACH, the cluster heads are selected randomly by the sink, and the randomly selected heads amount to 5% of the entire set of nodes. In this process, a node selected as cluster head receives its cluster head ID (CHID) from the sink. The selected cluster head then organizes its cluster by notifying the adjacent nodes via an ADV message that it is a cluster head. A node which receives the ADV message joins the cluster by modifying its node information and transmitting a REP message afterwards. The messages used for organizing a cluster are shown in Table 1.

TABLE I. MESSAGES FOR ORGANIZING A CLUSTER

Figure 1. Cluster organizing and defined route within cluster (1)

Figure 2. Cluster organizing and defined route within cluster (2)
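The body of Table 1 did not survive in the extracted text; from the examples in Figures 1 and 2, the messages can be sketched as below (the field names CHID, DNID, and Hopcnt are the paper's; the struct layout and widths are our illustration):

```cpp
#include <cstdint>

// Advertisement flooding outward from a cluster head, e.g. ADV(CH1, A15, 1):
// the cluster-head ID, the forwarding node's ID, and its hop count.
struct AdvMessage {
    std::uint16_t chid;    // CHID: cluster head the sender belongs to
    std::uint16_t dnid;    // DNID: ID of the node forwarding the ADV
    std::uint8_t  hopcnt;  // Hopcnt: hops from the cluster head so far
};

// Reply travelling back toward the cluster head, e.g. REP(CH1, 1).
struct RepMessage {
    std::uint16_t target;  // node (or head) the reply is addressed to
    std::uint8_t  hopcnt;  // hop count of the replying node
};

// A node receiving ADV(chid, dnid, hopcnt) joins cluster `chid`,
// records `dnid` as its upstream node, and re-broadcasts the ADV
// with hopcnt + 1.
```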
In [Figure 1], the selected cluster head CH1 transmits the ADV message (CH1, CH1, 0), shown as ①, to the nodes A15, A16, and A22, which are one hop away. The nodes which receive the message set the CHID of the ADV message, CH1, as the ID of the cluster head they belong to, and also set the sender node, CH1, as their DNID. They set their own Hopcnt value to 1 by adding 1 to the received Hopcnt value of 0. As shown in ②, in response to the ADV message each node transmits a REP message (CH1, 1), which is received by the cluster head CH1. In this way, the sensor nodes obtain a direction toward the cluster head.

[Figure 2] shows how the nodes A15, A16, and A22 retransmit the ADV message to their adjacent nodes. A15 retransmits the ADV message to its neighboring nodes A10, A6, and A11; as in ① of [Figure 1], the ADV message now carries the values (CH1, A15, 1). Just as A15, A16, and A22 did in [Figure 1], the nodes A10, A6, and A11, which receive this message, set their own sensor node information and transmit a REP message (A15, 2) to A15. A15 then checks whether its own hop count value (Hopcnt) is 0. Since it is not 0, meaning that the node plays the role of an intermediate node, A15 retransmits the REP message (CH1, 1), shown as ②. Finally, the cluster head recognizes the nodes within its cluster by receiving these values.

[Figure 3] shows the third stage of ADV message transmission. ③ shows a competition between sensor nodes, or the suppression of a message caused by nodes having already received the ADV message. In the case of A10, the ADV message can be transmitted to A5 and A14; in the case of A6, however, its neighbors can have the same hop count as A6 through the ADV message transmitted from A15, so no ADV messages are exchanged between them, because they are judged to be nodes of the same level. Even though A6 and A11 transmit the ADV message simultaneously, the one that arrives first can bring A3 into its route; as shown in the figure, A6 preempts A3 because it is closer than A11.

The sensor nodes extending the ADV message eventually meet sensor nodes with a different cluster head. In ④ of [Figure 4], A31, A32, and A33 do not accept the ADV message that A27 transmitted, as they have a different cluster head from A27. This part forms the boundary between clusters, completing the organization of the cluster of CH1 as shown in the figure.

Figure 3. Cluster organizing and defined route within cluster (3)

Figure 4. Cluster organizing and defined route within cluster (4)

3) Routing Within Cluster: After the member nodes have organized the cluster, the cluster head CH1 defines virtual routes from the nodes toward itself based on the data of each member node. Although node CH1 directly recognizes only the nodes one hop away, the nodes are all connected together by this low-level information. Therefore, a node at which an event occurs can transmit its data to CH1 following the virtual route set up as above. In this case the cluster head does not need to define a route by transmitting a query message to the node with the event. The reason for defining the route only once is that the cluster heads change regularly; it is more efficient to maintain the data transmission within the cluster via the defined route than to re-define the route for every event.

4) Routing out of Cluster: The cluster head merges the duplicate data received from its member nodes into one, checks the condition of each node, and transmits the data to the sink over multiple hops. For the communication between the cluster heads and the sink, the sink node regularly transmits an interest message to the network. This interest message is propagated over the whole network from the sink node, and each node in the network learns the energy and hop counts of its neighboring nodes by means of this message. When transmitting data to the sink node, a cluster head chooses as receiving node, from its neighbor table, a node whose energy condition is good and whose number of hops is small, and then transmits the data. The nodes receiving the data of the cluster head forward it in the same manner. Considering that it is difficult for a cluster head to communicate directly with the sink node, the routing technique suggested in this paper routes via neighbors based on their energy and their number of hops toward the sink. The suggested technique does not maintain a special routing route; it is easy to use, as it simply routes to the neighboring node having the minimum number of hops to the sink node. Additionally, the cluster head can use the shortest distance to transmit data to the sink node.
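The neighbor-table selection of 4) could be sketched as follows (a minimal illustration; the paper does not specify how remaining energy and hop count are weighted, so we simply prefer the minimum hop count and break ties by energy):

```cpp
#include <cstddef>
#include <vector>

struct Neighbor {
    int id;
    int hopsToSink;       // hop count toward the sink, learned from the
                          // periodic interest messages
    double energyLeft;    // remaining energy reported by the neighbour
};

// Pick the next hop for data travelling from a cluster head to the sink:
// fewest hops to the sink first, higher remaining energy as tie-breaker.
// Returns -1 if the neighbour table is empty.
int selectNextHop(const std::vector<Neighbor>& table) {
    int best = -1;
    for (std::size_t i = 0; i < table.size(); ++i) {
        if (best < 0 ||
            table[i].hopsToSink < table[static_cast<std::size_t>(best)].hopsToSink ||
            (table[i].hopsToSink == table[static_cast<std::size_t>(best)].hopsToSink &&
             table[i].energyLeft > table[static_cast<std::size_t>(best)].energyLeft))
            best = static_cast<int>(i);
    }
    return best < 0 ? -1 : table[static_cast<std::size_t>(best)].id;
}
```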
IV. REALIZATION AND ANALYSIS ON EFFICIENCY
A. Evaluation Model for RTBC Efficiency
For evaluating the efficiency, the routing technique based on clustering with a limited transmission radius was implemented in C++, and the related factors were decided to define the simulation environment. For the simulation set-up, N sensor nodes are assumed to form in a square coordinate space; mobility and additional nodes were not considered. Each node has the same traits and begins from the same condition, and the nodes selected as cluster heads are the same kind of nodes as well. In the course of the experiment, it was notable that duplicate data can be screened out through clustering. To measure the efficiency of energy utilization, the average amount of energy consumed by the entire network as a function of the number of event nodes was compared with that of the established flat routing, while changing the cycle of cluster organization. The amount of energy consumption was measured accordingly.

1) Definition of Environment Factors: As shown in Table 2, the size of the network was limited to 100m x 100m. The number of sensor nodes is used to determine how many nodes the simulation can accommodate in this network size without errors. The number of event nodes was varied at random from 100 to 500. The sensing range was defined as 10m, based on the limited radio range of the nodes, and the maximum distance between nodes was limited to 5m. With the sink placed at the coordinates (50, 0), the initial energy of each node and the energies for transmitting and receiving were defined as in Table 2.
Figure 5. Number of nodes within cluster (nodes per Cluster Head ID)

TABLE II. SURROUNDING FACTORS FOR SIMULATION

Surrounding factor              | Value of setting up
Size of network                 | 100m x 100m
Number of nodes                 | 50, 100, 150, 200, 250, 300
Number of event nodes           | 100, 200, 300, 400, 500
Unit of round                   | 50, 100, 200
Range of sensing                | 10m (= 1 hop)
Minimum distance between nodes  | 5m
Sink position                   | (50, 0)
Initial energy per node         | 100 unit
Transmitting energy             | 1 unit (data), 0.25 unit (interest)
Receiving energy                | 1 unit (data), 0.25 unit (interest)

Figure 6. Comparison of number of interest messages (DD vs. RTBC(50), RTBC(100), RTBC(200))
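The energy bookkeeping implied by Table II can be expressed compactly. The following C sketch uses the table's values (100 initial units, 1 unit per data message, 0.25 units per interest message); the struct and function names are illustrative, not from the paper.

/* Energy accounting per the simulation factors of Table II. */
#define INITIAL_ENERGY  100.0
#define COST_DATA       1.0
#define COST_INTEREST   0.25

struct sensor_node {
    double energy;   /* starts at INITIAL_ENERGY */
};

/* Charge a node for one transmit (or transmit+receive) operation.
 * Returns 0 if the node has exhausted its energy. */
int spend(struct sensor_node *n, int is_data, int tx_and_rx)
{
    double cost = (is_data ? COST_DATA : COST_INTEREST)
                  * (tx_and_rx ? 2.0 : 1.0);
    if (n->energy < cost)
        return 0;
    n->energy -= cost;  /* summed network-wide to compare DD vs. RTBC */
    return 1;
}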
B. Evaluation and Analysis on RTBC
In the experiment, the number of nodes attached to each cluster head was compared in a network of 300 nodes with no isolated nodes. On average, each cluster forms a stable shape with about 20 nodes.
Figure 7. Comparison of number of data messages (DD vs. RTBC(50), RTBC(100), RTBC(200))
In the simulation, DD and RTBC were each compared 10 times under the same conditions in a sensor network of 300 nodes. The frequency of cluster re-organization was also examined by setting it to 50, 100 and 200 occurrences of event nodes, to determine which re-organization frequency is the most efficient; this matters because organizing clusters requires more energy than non-hierarchical techniques.
The numbers of each kind of message and the total energy consumption were compared. The horizontal axis stands for the number of events, and the vertical axis for the number of interest messages according to the occurrence of events. Looking at RTBC(50), RTBC(100) and RTBC(200), which set the cluster re-organization frequency to 50, 100 and 200 respectively, the number of interest messages is higher than for DD, the non-hierarchical routing technique, when the RTBC period is set as low as 50; but when it is set to 100 or more, the number of interest messages is lower than for DD, or at almost the same level.

Comparing the numbers of messages presented by the simulation, the message technique using clustering is shown to be effective in suppressing duplicate data. Its overall effectiveness can be seen in Figures 6 to 8.
Figure 8. Comparison of the total amount of energy consumption (units; DD vs. RTBC(50), RTBC(100), RTBC(200))
C. Result of Evaluating Efficiency of RTBC
For RTBC, the hierarchical routing technique for the wireless sensor network, the transmission and reception range was set to 10 m, and a realistic and practical technique was suggested through routing within and outside the cluster. Several consequences can be drawn from the experiments above.
First, in the case of using the hierarchical technique applied with clustering in the sensor network, it was possible to reduce the energy consumption of the whole network, as well as to consume energy efficiently through an even distribution of the energy load.
Second, the number of interest messages to nodes decreased with the occurrence of events. This improved the energy efficiency by over 18% on average in the experiments. By forwarding the interest message received from the sink, the cluster head can also prevent duplicate messages.
Third, it can not only improve the credibility of data transmission, but also help to save energy network-wide by preventing duplicate data within the cluster. RTBC was proven to be an efficient routing technique, preventing about 58% of duplicate message transmissions. Fourth, it is possible to organize realistic clustering by using sensor-nodes with a limited range, which means that the traits of sensor-nodes based on low-power communication can be exploited. But the frequency of re-organizing each cluster needs to be defined properly so that this remains possible.

V. CONCLUSION
In a wireless sensor network, preserving the energy of nodes to keep the network operating continuously is more important than raw efficiency, owing to the traits of the application programs and the limitations of the hardware. Also, collecting the sensed information should be easy. These traits can be applied to the network protocol, and the Flooding, SPIN, DD and LEACH protocols were suggested by former research. But even though LEACH, which uses hierarchical routing, has many strong points such as detecting duplicate data and managing data transmission regionally, it is not efficient because it is not appropriate for sensor-nodes with a limited range. Therefore, a network protocol is required that realistically has a limited transmission range, and that can detect duplicate data and manage it regionally, in contrast with non-hierarchical routing. So this paper suggests RTBC, a routing technique based on clustering for preventing duplicate data, which recognizes the diachronic traits of the wireless sensor network environment. For this purpose, a comparison between RTBC and the established non-hierarchical routing technique was made by defining the process of organizing the cluster, routing within the cluster, and routing out of the cluster.
Through the simulation, the prevention of duplicate data by RTBC was tested and the efficiency of managing data regionally was analyzed; it was concluded that the realistic routing technique based on clustering is feasible, and superior to DD in the comparison and evaluation. Based on the results of this research, it is expected that this realistic routing technique will be widely usable, by preventing duplicate data through clustering and managing data regionally.

AUTHORS PROFILE
B. Kim is with the Department of Computing, Ph.D. course, Soongsil University, Seoul, Korea. His current research interests focus on communications in wireless sensor networks (e-mail: [email protected]).
H. Lim is with the Department of Computing, M.Sc. course, Soongsil University, Seoul, Korea. His current research interests focus on communications in wireless sensor networks (e-mail: [email protected]).
Y. Shin received his M.Sc. and Ph.D. from the Computer Science Department, University of Iowa. He is now a Professor in the Department of Computing, Soongsil University (e-mail: [email protected]).
An optimal method for wake detection in SAR images using Radon transformation combined with wavelet filters

* Ms. M. Krishnaveni, ** Mr. Suresh Kumar Thakur, *** Dr. P. Subashini
* Lecturer (SG), Department of Computer Science, Avinashilingam University for Women, Coimbatore, India
** Deputy Director, Naval Research Board-DRDO, New Delhi, India
*** Research Assistant-NRB, Department of Computer Science, Avinashilingam University for Women, Coimbatore, India
Abstract - A novel method for ship wake detection in synthetic aperture radar (SAR) images is explored here. Most detection procedures apply the Radon transform, as its properties suit the detection purpose more than any other transformation. But it still has problems when the transform is applied to an image with a high level of noise. This paper articulates the combination of the Radon transformation with shrinkage methods, which improves the wake detection process. The shrinkage method combined with the Radon transform maximizes the signal-to-noise ratio, and hence leads to optimal detection of lines in SAR images. The originality lies mainly in the denoising segment of the proposed algorithm. Experimental work is carried out on both simulated and real SAR images. The detection process is more adequate with the proposed method and improves on the conventional methods.

Keywords: SAR images, threshold, Radon transformation, signal-to-noise ratio, denoising

I. INTRODUCTION

In navy radar applications, the presentation of the radar image has traditionally been the way for the radar operator to interpret the information manually. The large increase in the computation capacity of image processing in modern radar systems has great effects on the detection and extraction of targets [5]. With powerful image processing techniques and algorithms, modern radar systems have the possibility of extracting targets and their velocity from the surrounding background. A condition for this automatic detection is that the radar image should be relatively free from undesired signals [2]. Such undesired signals can be rain clutter, sea clutter, measuring noise, landmasses, birds, etc. Conventional filtering such as Doppler, median and Wiener filtering is often used to remove these undesired signals and extract the interesting part of the radar image. Image processing techniques improve the radar image and enable automatic classification and presentation of the objects. The analysis of ship wakes in SAR imagery with specialized algorithms can provide significant information about a wake's associated vessel, including the approximate size and heading of the ship [13]. The velocity of the vessel can be estimated by measuring the displacement of the ship relative to the height of the wake. Image processing algorithms such as the Fourier transform and the Radon transform allow the user to manipulate SAR images in a way that dramatically increases the chance of detecting ship wakes [2]. The paper is organized as follows: Section 2 deals with image localization (SAR images); Section 3 deals with wavelet denoising methods and their metrics; Section 4 comprises the comparison of the Radon transformation and its performance; Section 5 discusses the experimental results of the shrinkage methods and the Radon transformation. The paper concludes with remarks on achievable prospects in this area.

II. IMAGE LOCALIZATION

This is the first and lowest-level operation to be done on images. The input and the output are both intensity images. The main idea of the preprocessing is to suppress information in the image that is not relevant for its purpose or for the subsequent analysis of the image. The pre-processing techniques use the fact that neighboring pixels have essentially the same brightness. There are many different pre-processing methods developed for different purposes; the interesting area of pre-processing for this work is image filtering for noise suppression. Conservative methods based on wavelet transforms have emerged for removing Gaussian random noise from images [1]. This local preprocessing speckle reduction technique is necessary prior to the processing of SAR images. Here we identify wavelet shrinkage, or thresholding, as the denoising method [3]. It is well known that increasing the redundancy of wavelet transforms can significantly improve the denoising performance [7][8].
Thus a thresholding process which passes the coarsest approximation sub-band and attenuates the rest of the sub-bands should decrease the amount of residual noise in the overall signal after the denoising process [4].

III. IMAGE DENOISING USING WAVELET

The two main limits on image accuracy are categorized as blur and noise. Blur is intrinsic to image acquisition systems, as digital images have a finite number of samples and must respect the sampling conditions. The second main image perturbation is noise. Image denoising is used to remove additive noise while retaining as much as possible of the important signal features [1]. Currently a reasonable amount of research is done on wavelet thresholding and threshold selection for signal denoising, because the wavelet provides an appropriate basis for separating the noisy signal from the image signal [3]. Two shrinkage methods are used here to calculate new pixel values in a local neighborhood. Shrinkage is a well-known and appealing denoising technique [9][10]. The use of shrinkage is known to be optimal for Gaussian white noise, provided that sparsity of the signal's representation is enforced using a unitary transform [6]. Here a new approach to image denoising is proposed, based on the image-domain minimization of an estimate of the mean squared error, Stein's unbiased risk estimate (SURE), as specified in equation (1). The Surelet method directly parameterizes the denoising process as a sum of elementary nonlinear processes with unknown weights. Unlike most existing denoising algorithms, using the SURE makes it needless to hypothesize a statistical model for the noiseless image. A key point is that, although the (nonlinear) processing is performed in a transformed domain (typically an undecimated discrete wavelet transform, but non-orthonormal transforms are also addressed), this minimization is performed in the image domain [6].

$$\mathrm{sure}(t;x) = d - 2\,\#\{i : |x_i| \le t\} + \sum_{i=1}^{d} \left(|x_i| \wedge t\right)^2 \qquad (1)$$

where d is the number of elements in the noisy data vector and the x_i are the wavelet coefficients. This procedure is smoothness-adaptive, meaning that it is suitable for denoising a wide range of functions, from those that have many jumps to those that are essentially smooth.
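A minimal sketch of how equation (1) can be evaluated follows; it is illustrative only, not the authors' implementation, and the function names are hypothetical.

#include <stdio.h>
#include <math.h>

/* Evaluate Stein's unbiased risk estimate, equation (1), for a
 * candidate threshold t over d wavelet coefficients x[]. */
double sure_estimate(const double *x, int d, double t)
{
    double risk = (double)d;
    for (int i = 0; i < d; i++) {
        double a = fabs(x[i]);
        if (a <= t)
            risk -= 2.0;                /* -2 * #{i : |x_i| <= t} */
        double m = (a < t) ? a : t;     /* |x_i| ^ t  (minimum)   */
        risk += m * m;
    }
    return risk;
}

int main(void)
{
    double x[] = {0.1, -2.5, 0.3, 4.0, -0.2, 1.1};
    /* Scan a few thresholds; the SURE-optimal t minimizes the risk. */
    for (double t = 0.0; t <= 3.0; t += 0.5)
        printf("t = %.1f  sure = %.3f\n", t, sure_estimate(x, 6, t));
    return 0;
}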
Surelet has high merit, as it outperforms the NeighShrink method. A comparison is made between these two methods to demonstrate the superiority of Surelet shrinkage for denoising SAR images. The experimental results, presented in graph form, show that Surelet shrinkage minimizes the objective function the fastest, while being as cheap as NeighShrink [15]. Measuring the amount of noise by its standard deviation, $\sigma(n)$, one can define the signal-to-noise ratio (SNR) as in equation (2):

$$\mathrm{SNR} = \frac{\sigma(\mu)}{\sigma(n)} \qquad (2)$$

where $\sigma(\mu)$, given in equation (3), denotes the empirical standard deviation of the grey levels $u(i)$,

$$\sigma(\mu) = \left( \frac{1}{I} \sum_{i} \big(u(i) - \mu\big)^2 \right)^{1/2} \qquad (3)$$

and $\mu = \frac{1}{I}\sum_{i \in I} u(i)$ is the average grey level value.

The standard deviation of the noise can also be obtained as an empirical measurement, or formally computed when the noise model and parameters are known. This parameter measures the degree of filtering applied to the image [5]. It also demonstrates that the PSNR rises faster using the proposed method than the former. Hence the resulting denoised image is passed to the next segment for the transformation to be applied, which is also proved to improve the detection process.

IV. RADON TRANSFORMATION

Detection of ships and estimation of their velocities are the major tasks in SAR image analysis. The proposed method takes advantage of two thresholding techniques and adds some innovation by using the Radon transform to detect the ship wake and estimate the range velocity component [12]. The proposed technique was applied to synthetic raw data containing a moving vessel and its respective wake. The Radon transform calculates the angle that a straight line perpendicular to the track makes with the x-axis in the center of the image. Knowing this, simply add 90° to the value obtained to find the angle of the wake arm. If an image I with dimensions M x M is considered, the Radon transform $\hat{I}$ is given in equation (4):

$$\hat{I}(x_\theta, \theta) = \sum_{y_\theta = -M/2}^{M/2} I(x_\theta \cos\theta - y_\theta \sin\theta,\; x_\theta \sin\theta + y_\theta \cos\theta) \qquad (4)$$

where $(x_\theta, y_\theta) \in \mathbb{Z}$ and $\theta \in [0, \pi]$.
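The discrete sum of equation (4) can be sketched directly in C. This is an illustrative nearest-neighbour implementation under stated assumptions (the 120 x 120 image size is taken from the experiments below; the function name is hypothetical).

#include <math.h>

#define M 120   /* image size used in the paper's experiments */

/* Discrete Radon transform per equation (4): for each angle theta
 * and offset x_theta, sum image samples along the rotated line.
 * Nearest-neighbour rounding; out-of-range samples are skipped. */
void radon(const double img[M][M], int n_angles, double out[][M])
{
    const double PI = 3.14159265358979323846;
    for (int a = 0; a < n_angles; a++) {
        double theta = PI * a / n_angles;    /* theta in [0, pi) */
        double c = cos(theta), s = sin(theta);
        for (int xt = -M / 2; xt < M / 2; xt++) {
            double sum = 0.0;
            for (int yt = -M / 2; yt < M / 2; yt++) {
                int x = (int)lround(xt * c - yt * s) + M / 2;
                int y = (int)lround(xt * s + yt * c) + M / 2;
                if (x >= 0 && x < M && y >= 0 && y < M)
                    sum += img[y][x];
            }
            out[a][xt + M / 2] = sum;
        }
    }
}

A straight wake appears as a peak in out[][]; its angle index gives theta, to which 90° is added as described above.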
Several definitions of the Radon transform exist; this one expresses lines in the form rho = x*cos(theta) + y*sin(theta), where theta is the angle and rho the smallest distance to the origin of the coordinate system [12]. The Radon transform for a set of parameters (rho, theta) is the line integral through the image g(x, y), where the line is positioned corresponding to the value of (rho, theta). delta() is the Dirac delta function, which is infinite for argument 0 and zero for all other arguments [14].

This function is applied to the original image and to the denoised images from the two methods [11]. The detection of line segments in the SAR images is more accurate with Surelet denoising and Radon transformation than with the former and the conventional method. Experiments are carried out with the proposed method to verify and validate the results. With the angle of both arms of the wake calculated, the equation of the line that passes through each of them can be estimated.

V. RESULTS AND DISCUSSION

To verify the validity of the proposed method, the results are compared based on the PSNR and time parameters for the shrinkage methods, as given in Figure 1. As an extension to the next segment of work, the detection of the angle is also compared based on the radial coordinates (rho). Noise (sigma) is the main phenomenon for the comparison. Surelet is the latest method based on the SURE. The DWT was used with the Daubechies least-asymmetric compactly-supported wavelet with eight vanishing moments and four scales. SAR images of 120 x 120 pixels are used for applying the Radon transformation. They were contaminated with Gaussian random noise of sigma 10, 20, 30, 50, 75 and 100.

Figure 1: Comparison of PSNR values and time for the two methods (NeighShrink and Surelet) for two SAR images

For the wake detection, the angle obtained by applying the Radon transformation results in the same angle value with variations in the rho values, as shown in Figure 2.

Figure 2: (a) Original image (b) Angle using RT (c) Angle using first denoising method and RT (d) Angle using second denoising method and RT

Table 1 explicates the radial coordinate values and the angle values for two SAR images with the corresponding change of noise values for each method.

Table 1: Comparison of three methods with change of noise values (radial / angle)

SAR image | Noise values | Original image with RT | Denoised (first method) with RT | Denoised (second method) with RT
Image 1   | 10, 100      | 48 / 85                | 50 / 85                         | 195 / 85
Image 2   | 10, 100      | 85 / 45                | 85 / 45                         | 155 / 45

VI. CONCLUSION

In the proposed method, the originality of the technique allows wake detection and estimation of the velocity of vessels more effectively. The projected method proves that Surelet, compared with NeighShrink, can produce optimal results by using the finest threshold instead of the suboptimal universal threshold in all bands. It exhibits excellent performance for wake detection, and the experimental results signify that it produces both higher PSNRs and enhanced visual quality compared with the former and conventional methods. The Radon transform is used to detect the ship wake and estimate the range velocity component. The key
advantage is that it has low computational requirements. Further enhancement of the work can concentrate on the neighbouring window size for every wavelet subband, which helps in difficult cases where the ship wake is not clearly visible in the image. It is therefore concluded that the method achieves better detection with a lower probability of false alarm.

References
[1] R. Sivakumar, "Denoising of Computer Tomography Images using Curvelet Transform," ARPN Journal of Engineering and Applied Sciences, February 2007.
[2] P. Marques and J. Dias, "Moving Target Trajectory Estimation in SAR Spatial Domain Using a Single Sensor," IEEE Trans. on Aerospace and Electronic Systems, Vol. 43, No. 3, pp. 864-874, July 2007.
[3] S. M. Ali, M. Y. Javed and N. S. Khattak, "Wavelet based despeckling of synthetic aperture radar images using adaptive and mean filters," Int. J. Computer Sci. Eng., 1(2):108-112, 2007.
[4] A. Gupta, S. D. Joshi, and S. Prasad, "A new approach for estimation of statistically matched wavelet," IEEE Transactions on Signal Processing, 53:1778-1793, May 2005.
[5] S. Lopez and R. Cumplido, "A Hybrid Approach for Target Detection Using CFAR Algorithm and Image Processing," Fifth Mexican International Conference on Computer Science, 2004.
[6] D. K. Hammond and E. P. Simoncelli, "Image denoising with an orientation-adaptive Gaussian scale mixture model," Center for Neural Science and Courant Institute of Mathematical Sciences, New York University.
[7] S. Durand and J. Froment, "Reconstruction of wavelet coefficients using total variation minimization," SIAM Journal on Scientific Computing, 24(5), pp. 1754-1767, 2003.
[8] G. Y. Chen and T. D. Bui, "Multi-wavelet De-noising using Neighboring Coefficients," IEEE Signal Processing Letters, vol. 10, no. 7, pp. 211-214, 2003.
[9] P. Mrazek and J. Weickert, "Rotationally Invariant Wavelet Shrinkage," in B. Michaelis and G. Krell (Eds.): DAGM 2003, LNCS 2781, pp. 156-163, Springer-Verlag Berlin Heidelberg, 2003.
[10] A. Achim, P. Tsakalides and A. Bezerianos, "SAR Image Denoising via Bayesian Wavelet Shrinkage Based on Heavy-Tailed Modeling," IEEE Trans. Geosci. Remote Sensing, 41(8):1773-1784, 2003.
[11] G. Chang, B. Yu, and M. Vetterli, "Adaptive wavelet thresholding for image denoising and compression," IEEE Transactions on Image Processing, 9:1532-1546, September 2000.
[12] A. C. Copeland, G. Ravichandran, and M. M. Trivedi, "Localized Radon transform-based detection of ship wakes in SAR images," IEEE Trans. on Geoscience and Remote Sensing, 33, 35-45, 1995.
[13] H. Li, Y. He and W. Wang, "Improving Ship Detection with Polarimetric SAR based on Convolution between Co-polarization Channels," Sensors (www.mdpi.com/journal/sensors).
[14] M. T. Rey, J. K. Tunaley, J. T. Folinsbee, P. A. Jahans, J. A. Dixon, and M. R. Vant, "Application of Radon transform techniques to wake detection in Seasat-A SAR images," IEEE Transactions on Geoscience and Remote Sensing, vol. 28, July 1990.
[15] T. Nabil, "SAR Image Filtering in Wavelet Domain by Subband Dependent Shrink," Int. J. Open Problems Comp. Math., Vol. 2, No. 1, March 2009.
AES Implementation and Performance Evaluation on 8-bit Microcontrollers

Hyubgun Lee, Kyounghwa Lee, Yongtae Shin
Dept. of Computing, Soongsil University, Seoul, South Korea

Abstract— The sensor network is a network technique for the implementation of the ubiquitous computing environment. It is a wireless network environment that consists of many lightweight, low-power sensors. Though the sensor network provides various capabilities, it is unable to ensure secure authentication between nodes, which eventually causes a loss of reliability of the entire network and many security problems. Therefore, an encryption algorithm for the implementation of a reliable sensor network environment is required for the applicable sensor network. In this paper, we propose a solution for a reliable sensor network by analyzing the communication efficiency, through measuring the performance of the AES encryption algorithm by plaintext size, and the cost of operation per hop according to the network scale.

Keywords-component; Wireless Sensor Networks; AES algorithm; 8-bit Microcontroller

I. INTRODUCTION

The sensor network is a network technique for the implementation of the ubiquitous computing environment. It is a wireless network environment that consists of many lightweight, low-power sensors. It is being researched and developed by various standards and research organizations. As a result, various fields such as logistics, environmental control and home networking have applied the sensor network [1]. In these environments, the data collected by sensors is used, through systematic analysis and cross-linking between services, in a variety of services. Therefore, common security requirements (integrity, confidentiality, authentication, non-repudiation) are required for security services and applications.

The public key encryption algorithm is a fundamental and widely used technology around the world. But sensor nodes have hardware limitations such as memory and battery, so it is not applied to the sensor network [2]. Therefore, a symmetric key encryption algorithm with low energy consumption is used in sensor networks.

In this paper, we describe Rijndael's AES encryption algorithm in symmetric key encryption. We measure the encryption and decryption performance on the 8-bit microcontroller. Then, we analyse the communication efficiency through the total delay per hop in the sensor network.

The structure of the paper is organized as follows: Section 2 describes Rijndael's AES encryption algorithm in symmetric key encryption; Section 3 measures the encryption and decryption performance on the 8-bit microcontroller; Section 4 analyzes the communication efficiency in the sensor network through the total delay per hop; and Section 5 concludes this paper.

II. AES (ADVANCED ENCRYPTION STANDARD)

A. Rijndael's algorithm
The AES (Advanced Encryption Standard) [3] is an encryption standard comprising a symmetric block cipher. It was announced by the National Institute of Standards and Technology (NIST) as U.S. FIPS PUB 197 (FIPS 197) on November 26, 2001. The central design principle of the AES algorithm is the adoption of symmetry at different platforms and the efficiency of processing. After a 5-year standardization process, the NIST adopted the Rijndael algorithm as the AES.

The AES operates on 128-bit blocks of data. The algorithm can encrypt and decrypt blocks using secret keys. The key size can be either 128, 192 or 256 bits; the actual key size depends on the desired security level. The different versions are most often denoted AES-128, AES-192 or AES-256. The cipher Rijndael [4] consists of an initial Round Key addition, Nr-1 rounds, and a final round. Figure 1 shows the pseudo C code of the Rijndael algorithm:

Rijndael(State, CipherKey) {
  KeyExpansion(CipherKey, ExpandedKey);
  AddRoundKey(State, ExpandedKey);
  For(i=1; i<Nr; i++) Round(State, ExpandedKey + Nb*i);
  FinalRound(State, ExpandedKey + Nb*Nr);
}

Figure 1. Rijndael algorithm
The key expansion can be done beforehand, and Rijndael can be specified in terms of the Expanded Key. The Expanded Key shall always be derived from the Cipher Key and never be specified directly. There are, however, no restrictions on the selection of the Cipher Key itself. Figure 2 shows the pseudo C code of Rijndael's Expanded Key algorithm:

Rijndael(State, ExpandedKey) {
  AddRoundKey(State, ExpandedKey);
  For(i=1; i<Nr; i++) Round(State, ExpandedKey + Nb*i);
  FinalRound(State, ExpandedKey + Nb*Nr);
}

Figure 2. Rijndael's Expanded Key algorithm

B. AES round transformation
The round transformation [5] modifies the 128-bit State. The initial State is the input plaintext and the final State is the output ciphertext. The State is organised as a 4 x 4 matrix of bytes. The round transformation scrambles the bytes of the State either individually, row-wise, or column-wise by applying the functions SubBytes, ShiftRows, MixColumns and AddRoundKey sequentially. Figure 3 shows how the AES iterates the round transformation.

Figure 3. AES iterates a round transformation

An initial AddRoundKey operation precedes the first round. The last round differs slightly from the others: the MixColumns operation is omitted. SubBytes is a substitution function in the cipher round. In the SubBytes step, each byte in the State is replaced with its entry in a nonlinear byte substitution table (S-box) that operates on each of the State bytes independently. Figure 4 shows how SubBytes applies the S-box to each byte of the State.

Figure 4. SubBytes applies the S-box to each byte of the State

ShiftRows is a permutation function in the cipher round. In the ShiftRows step, bytes in each row of the State are shifted cyclically to the left. The number of places each byte is shifted differs for each row, so that each row of the output is composed of bytes from each column of the input State. Figure 5 shows how ShiftRows cyclically shifts the last three rows of the State.

Figure 5. ShiftRows cyclically shifts the last three rows in the State

MixColumns is a mixing function in the cipher round. In the MixColumns step, the four bytes of each column of the State are combined using an invertible linear transformation. The MixColumns function takes four bytes as input and outputs four bytes, where each input byte affects all four output bytes. Together with ShiftRows, MixColumns provides diffusion in the cipher. Figure 6 shows how MixColumns operates on the State column by column.

Figure 6. MixColumns operates on the State column-by-column

AddRoundKey is a key-adding function in the cipher round. In the AddRoundKey step, the subkey is combined with the State. For each round, a subkey is derived from the main key using Rijndael's key schedule; each subkey is the same size as the State. The subkey is added by combining each byte of the State with the corresponding byte of the subkey using bitwise XOR. Figure 7 shows how AddRoundKey XORs each column of the State with a word from the key schedule.

Figure 7. AddRoundKey XORs each column of the State with a word from the key schedule

AES decryption computes the original plaintext from an encrypted ciphertext. During decryption, the AES algorithm reverses encryption by executing inverse round transformations in reverse order. The round transformation of decryption uses the functions AddRoundKey, InvMixColumns, InvShiftRows and InvSubBytes.
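Two of the four round functions just described are simple enough to sketch in C. The following is an illustrative sketch, not the paper's code; the state layout and function names are assumptions.

#include <stdint.h>
#include <string.h>

/* The 128-bit State as a 4x4 byte matrix, state[row][col]. */
typedef uint8_t state_t[4][4];

/* ShiftRows: row r is rotated r positions to the left. */
static void shift_rows(state_t s)
{
    for (int r = 1; r < 4; r++) {
        uint8_t tmp[4];
        for (int c = 0; c < 4; c++)
            tmp[c] = s[r][(c + r) % 4];
        memcpy(s[r], tmp, 4);
    }
}

/* AddRoundKey: bitwise XOR of the State with the round subkey. */
static void add_round_key(state_t s, const state_t round_key)
{
    for (int r = 0; r < 4; r++)
        for (int c = 0; c < 4; c++)
            s[r][c] ^= round_key[r][c];
}

SubBytes (a table lookup per byte) and MixColumns (a fixed linear map per column) follow the same per-byte and per-column pattern.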
III. IMPLEMENTATION AND PERFORMANCE EVALUATION

A. Experiment and Device
For the performance analysis of the AES encryption algorithm in the sensor network, we use the ATmega644p [6], an 8-bit microcontroller, as the hardware device. AVR Studio 4 and Programmer's Notepad in WinAVR are used as development tools. The JTAG (Joint Test Action Group) emulator is used as a debugging tool. Figure 8 shows the devices used for the performance analysis of the AES encryption algorithm.

Figure 8. Device for the performance analysis of AES

The ATmega644p 8-bit microcontroller is made by Atmel. Its main function is to ensure correct program execution; it must therefore be able to access memories, perform calculations, control peripherals and handle interrupts. It has a 20 MHz system clock, prescalers of 8, 64, 256 or 1024, and an advanced RISC architecture.

AVR Studio allows execution and debugging without an AVR microcontroller board, and the compiled programs are applied to the AVR. Programmer's Notepad with the WinAVR GCC compiler compiles the written C language; the compiled programs are loaded into AVR Studio. The JTAG emulator, following the JTAG standard, is an I/O device using the JTAG port which receives information from the PCB or IC.

B. The implementation principle
For the performance measurement of the AES encryption algorithm, we apply the AES-128 CBC (Cipher Block Chaining) mode on the ATmega644p's EEPROM. In CBC mode, each block of plaintext is XORed with the previous ciphertext block before being encrypted. Also, to make each message unique, an initialization vector must be used in the first block. Figure 9 shows CBC mode encryption [7].

Figure 9. CBC mode encryption
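The CBC chaining of Figure 9 can be sketched in a few lines of C. This is a minimal illustration, not the paper's implementation; the single-block primitive aes128_encrypt_block is a hypothetical stand-in for any AES core such as the Rijndael routine of Figure 1.

#include <stdint.h>
#include <stddef.h>

#define BLOCK 16  /* AES operates on 128-bit (16-byte) blocks */

/* Placeholder single-block primitive: NOT real AES, only a stub
 * so the sketch compiles; a real AES-128 core goes here. */
static void aes128_encrypt_block(const uint8_t key[16], uint8_t b[BLOCK])
{
    for (int i = 0; i < BLOCK; i++)
        b[i] ^= key[i];
}

/* CBC chaining as in Figure 9: each plaintext block is XORed with
 * the previous ciphertext block (the IV for the first block) and
 * then encrypted in place. len must be a multiple of 16. */
void aes128_cbc_encrypt(const uint8_t key[16], const uint8_t iv[BLOCK],
                        uint8_t *buf, size_t len)
{
    const uint8_t *prev = iv;
    for (size_t off = 0; off < len; off += BLOCK) {
        for (int i = 0; i < BLOCK; i++)
            buf[off + i] ^= prev[i];    /* XOR with previous block */
        aes128_encrypt_block(key, buf + off);
        prev = buf + off;               /* chain the new ciphertext */
    }
}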
The timer mode for the time measurement uses the Timer/Counter CTC (Clear Timer on Compare Match) mode. The CTC mode generates the compare interrupt only if the counter value (TCNT), which is then cleared to zero, matches the OCR. The timer measurement counts the number P of compare interrupts per 1 ms.

The operation time per clock, $T_C$, is:

$$T_C = \frac{1}{\mathrm{Frequency}} = \frac{1}{20 \times 10^6} \qquad (1)$$

The ATmega644P has a system clock prescaler, and the system clock can be divided by setting the Clock Prescale Register. The prescaled time per system clock, $T_P$, is:

$$T_P = \mathrm{prescaler} \times T_C \qquad (2)$$

The Timer/Counter (TCNT) and Output Compare Registers (OCR) are 8-bit registers. The OCR value for generating the compare interrupt is:

$$OCR0A = 0\mathrm{xFF} - \left(0\mathrm{xFF} - (P / T_P) + 1\right) \qquad (3)$$

C. Result
For the comparison between encryption and decryption performance, we use the AES-128 CBC mode. The operation time of encryption and decryption is measured for data sizes of 16, 32, 64, 128, 256 and 512 bytes. Table I and Figure 10 show the encryption and decryption operation times and CPU cycles according to the data size.

TABLE I. THE COMPARISON BETWEEN ENCRYPTION AND DECRYPTION PERFORMANCE BY DATA SIZES

Data size (byte)  | 16    | 32     | 64     | 128    | 256     | 512
Enc time (ms)     | 449   | 898    | 1,796  | 3,592  | 7,184   | 14,368
Enc CPU cycles    | 8,980 | 17,960 | 35,920 | 71,840 | 143,680 | 287,360
Dec time (ms)     | 456   | 912    | 1,825  | 3,649  | 7,297   | 14,592
Dec CPU cycles    | 9,120 | 18,240 | 36,500 | 72,980 | 145,940 | 291,840
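Equations (1)-(3) can be checked numerically. The sketch below assumes a prescaler value of 1024 (the paper lists 8, 64, 256 and 1024 as available); the variable names are illustrative.

#include <stdio.h>

/* Worked example of equations (1)-(3) for the ATmega644p's 20 MHz clock. */
int main(void)
{
    double f_cpu = 20e6;
    double t_c = 1.0 / f_cpu;             /* eq. (1): time per clock    */
    double prescaler = 1024.0;            /* assumed divisor            */
    double t_p = prescaler * t_c;         /* eq. (2): prescaled period  */

    /* Timer ticks needed for a 1 ms compare-match interval. */
    double ticks_per_ms = 1e-3 / t_p;     /* corresponds to P / T_P     */
    int ocr0a = (int)(ticks_per_ms + 0.5) - 1;  /* eq. (3) reduces to
                                                   (ticks - 1)          */

    printf("T_C = %.1f ns, T_P = %.1f us\n", t_c * 1e9, t_p * 1e6);
    printf("ticks per ms = %.2f -> OCR0A = %d\n", ticks_per_ms, ocr0a);
    return 0;
}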
Figure 10. The operation time and CPU cycles by data size

As the results show, the operation time and CPU cycles approximately double as the data size doubles. At 512 bytes, encryption and decryption each take approximately 14 seconds.

IV. APPLICATION SCENARIO

A. Network model
Figure 11 shows a general node (Ni) sending a secured data packet to the cluster head (CH) in the same subnet.

Figure 11. Sensor Network Application Model

For the measurement of the data encryption and decryption transmission delay by the number of communication hops, the following assumptions are established: every node within the subnet has the same performance, and there is no interference or packet loss in the data communication. Each node shares a common key with its neighbor nodes in advance, and performs encryption and decryption once per hop. The communication for generating the pair-wise shared key is similar to the μTESLA (Micro Timed Efficient Stream Loss-tolerant Authentication) protocol of the sensor network [8].

B. Communication delay in sensor network
In the communication process of the sensor network, the Beacon Request command and the Association Request command are exchanged between a new node and the cluster head. The general node (N1) encrypts the data using the pre-deployed security key and sends the secured data to its neighbor node (N2). Node N2 decrypts the encrypted message (msg_E) using the pre-deployed security keys to obtain the plaintext. Node N2 then repeats the same process with the private key shared with its neighbor node (N3).

The data delivery process by hop communication is as follows:

$$\forall N_i \in \mathrm{subnet}, \quad i = 0 \ldots n:$$
$$N_i \rightarrow N_{i+1} : msg_E = E\langle K_{prv}, \mathrm{plaintext}\rangle$$
$$N_{i+1} \rightarrow N_{i+n} : msg_E = E\langle K_{prv}, D\langle K_{prv}, msg_E\rangle\rangle \qquad (4)$$

If the delay per hop includes the encryption delay, the decryption delay and the data transfer delay, the per-hop delay is:

$$T_{hop\text{-}by\text{-}hop} = t_{Enc} + t_{Transmission} + t_{Dec} + \Delta t \qquad (5)$$

The Δt in equation (5) represents the delay for allocation and channel access; it lies between zero and T_hop-by-hop. When the general node and the cluster head communicate the encrypted packet data, the generated total delay is:

$$T_{tot} = \sum_{i=1}^{n} \left(T_{hop\text{-}by\text{-}hop}\right)_i, \quad (1 < n) \qquad (6)$$

The n in equation (6) represents the total hop count; it is greater than 1 for communication via neighbor nodes.

Figure 12 shows the total delay according to the hop count between CH and Ni. We assume that the encryption delay is 449 ms, the decryption delay is 456 ms and the data transfer delay is 10 ms for 16-byte data, and that the number of nodes in the entire network is 215, which is less than the maximum of 65,535 nodes in the WPAN area. We do not consider the channel access and allocation delay.

Figure 12. Total delay according to the hop count

In Figure 12, 30 hops and 180 hops generate delays of 27,450 ms and 164,700 ms respectively. If the number of nodes in the entire network is 65,535 (the maximum number of nodes in the sensor network [1]), the delay is 59,964,525 ms (about 16 hours). The fundamental reason for the extensive delay is the performance of the equipment used in
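A worked check of equations (5) and (6) with the paper's values for 16-byte data follows; Δt is neglected as stated above, so each hop costs 449 + 10 + 456 = 915 ms.

#include <stdio.h>

int main(void)
{
    long t_hop = 449 + 10 + 456;             /* eq. (5), ms */
    int hops[] = {30, 180, 65535};
    for (int k = 0; k < 3; k++) {
        long total = (long)hops[k] * t_hop;  /* eq. (6), ms */
        printf("%5d hops -> %ld ms\n", hops[k], total);
    }
    /* Prints 27,450 ms, 164,700 ms and 59,964,525 ms (~16 hours),
     * matching the values reported for Figure 12. */
    return 0;
}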
the experiment: the 8-bit microcontroller has a low operation capability. Therefore, as the scale of a sensor network consisting of such equipment increases, the transmission delay and energy consumption will also increase.

V. CONCLUSIONS

In this paper, we analysed the performance of the AES symmetric key encryption algorithm on the ATmega644p 8-bit microcontroller. In the application scenario, we measured the encryption and decryption operation times by plaintext size. As a result, as the scale of the sensor network grows, the delay doubles, and the energy consumption also increases accordingly. In the future, specific research on the performance analysis under plaintext size and hop count is required.

ACKNOWLEDGMENT
This work was supported by the IT R&D program of MKE/IITA [2008-S-041-01, Development of Sensor Network PHY/MAC for the u-City].

REFERENCES
[1] IEEE Std 802.15.4: "Wireless Medium Access Control (MAC) and Physical Layer (PHY) Specifications for Low-Rate Wireless Personal Area Networks (LR-WPANs)", 2003.
[2] Y. Zhou, Y. Fang and Y. Zhang, "Securing wireless sensor networks: a survey," IEEE Communications Surveys and Tutorials, Vol. 10, No. 3, 3rd Quarter, 2008.
[3] FIPS 197: Announcing the Advanced Encryption Standard, Nov. 26, 2001. http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf.
[4] J. Daemen and V. Rijmen, "AES Proposal: Rijndael," AES Algorithm Submission, September 3, 1999.
[5] M. Feldhofer, J. Wolkerstorfer, and V. Rijmen, "AES implementation on a grain of sand," IEE Proc. Inf. Security, vol. 152, pp. 13-20, Oct. 2005.
[6] Atmel, "8-bit Microcontroller with 16/32/64K Bytes In-System Programmable Flash," E ed., Atmel, San Jose, CA, August 2008. http://www.atmel.com/dyn/resources/prod_documents/doc7674S.pdf.
[7] S. Kim and I. Verbauwhede, "AES implementation on 8-bit microcontroller," Department of Electrical Engineering, University of California, Los Angeles, USA, September 2002.
[8] A. Perrig et al., "SPINS: Security Protocols for Sensor Networks," ACM Wireless Networks, vol. 8, no. 5, Sept. 2002.
AUTHORS PROFILE
H. Lee is with the Department of Computing, M.Sc. course, Soongsil University, Seoul, Korea. His current research interests focus on communications in wireless sensor networks (e-mail: [email protected]).
K. Lee is with the Department of Computing, Ph.D. course, Soongsil University, Seoul, Korea. Her current research interests focus on communications in wireless sensor networks (e-mail: [email protected]).
Y. Shin received his M.Sc. and Ph.D. from the Computer Science Department, University of Iowa. He is now a Professor in the Department of Computing, Soongsil University (e-mail: [email protected]).
GoS Proposal to Improve Trust and Delay of MPLS Flows for MCN Services

Francisco J. Rodríguez-Pérez, José-Luis González-Sánchez, Alfonso Gazo-Cervero
Computer Science Dept., Area of Telematics Engineering, University of Extremadura, Cáceres, Spain

This work is supported in part by the Regional Government of Extremadura (Economy, Commerce and Innovation Council) under GRANT PDT07A039.

Abstract—In this article, Guarantee of Service (GoS) is defined as a proposal to improve the integration of Mission Critical Networking (MCN) services in the Internet, analyzing the congestion impact on those privileged flows with high requirements of trust and delay. Multiprotocol Label Switching (MPLS) is a technology that offers flow differentiation and QoS in the Internet. Therefore, in order to improve network performance in case of congested domains, GoS is proposed as a technique that allows the local recovery of lost packets of MPLS privileged flows. To fulfil the GoS requirements for the integration of MCN in MPLS, a minimum set of extensions to RSVP-TE has been proposed to provide GoS capable routes. Moreover, we have carried out an analytical study of GoS scalability and a performance improvement analysis by means of simulations.

Keywords-MPLS, congestion, trust, RSVP-TE, Guarantee of Service, local re-transmissions

I. INTRODUCTION

The integration of Mission Critical Networking (MCN) with the Internet allows enhanced reachability and ubiquity and reduced deployment and maintenance costs. However, efficient network operation is always required for MCN services, and the Internet is a heterogeneous network that typically includes numerous resource-constrained devices [1], which creates bottlenecks that affect network performance. In this context, Multiprotocol Label Switching (MPLS) is currently used to provide policy management for heterogeneous networks and protocols with QoS integration purposes, combining traffic engineering capabilities with the flexibility of IP and class-of-service differentiation [2], [3].

MPLS Label Switched Paths (LSP) let the head-end Label Edge Router (LER) control the path that traffic takes to a particular destination [4]. This method is more flexible than forwarding traffic based on destination address only. LSP tunnels also allow the implementation of a variety of policies related to the optimization of network performance [5]. Moreover, resilience allows LSP tunnels to be automatically routed away from network failures or congestion points [6], [7]. Resource Reservation Protocol with Traffic Engineering (RSVP-TE) is the signalling protocol used to allocate resources for those LSP tunnels across the network [8]; MPLS thus allocates bandwidth on the network when it uses RSVP-TE to build LSPs [9]. When RSVP-TE is used to allocate bandwidth for a particular LSP, the concept of a consumable resource is introduced in the network, in order to allow edge nodes to find paths across the domain with bandwidth available to be allocated. However, there is no forwarding-plane enforcement of a reservation, which is signalled in the control plane only. This means that, for instance, if a Label Switch Router (LSR) makes an RSVP-TE reservation for 10 Mbps and later needs 100 Mbps, it will congest that LSP [10]. The network attempts to deliver the 100 Mbps, causing lower performance for other flows that may have even higher priority, unless traffic policing is applied using QoS techniques [11]. In this context, extensions of the RSVP-TE protocol are expected to be an important application for performance improvement in such problematic instances, because MPLS-TE provides fast networks but with no local flow control; it is assumed that devices are not going to be congested and will not lose traffic. However, resource failures and unexpected congestion cause traffic losses [12], [13]. In these cases, upper layer protocols will request re-transmissions of lost data at the end points [14], [15], but the time interval to obtain re-transmitted data can be significant for some types of time-critical MCN applications, such as real-time data delivery or synchronized healthcare services, where there are time-deadlines to be met.

The objective of this work is to analyze our Guarantee of Service (GoS) proposal as a resource engineering technique for local recovery of lost packets of MCN services, which need reliable and timely responses. For this purpose, GoS extensions of RSVP-TE [16] are used as a service-oriented technique, offering privileged LSPs to mission critical flows, in order to manage high requirements of delay and reliability. Furthermore, GoS does not propose the replacement of nodes in an MPLS domain, but the incorporation of several GoS
capable MPLS nodes in bottlenecks. This way, in case of packet loss of MCN services in a congested node, there will be a set of upstream nodes from which to request a local re-transmission, increasing the possibility of finding lost packets faster. The remainder of this article is structured as follows: in Section 2, we define the GoS concept to be applied to MPLS flows for MCN services, and how to signal the local recovery messages. In Section 3, the proposed RSVP-TE extensions are studied, with the aim of minimizing the forwarding of GoS information across the MPLS domain. An analysis of GoS scalability is given in Section 4. In Section 5, end-to-end (E-E) and GoS recovery performances are compared by means of simulations [17], [18]. Finally, we draw some conclusions, results and contributions of our research.

II. GUARANTEE OF SERVICE IN AN MPLS DOMAIN

Our GoS technique can be defined as the possibility of resilience improvement in congested networks for flows with high requirements of delay and reliability. In particular, GoS for the MPLS protocol provides LSR nodes with the capacity to locally recover lost packets of an MPLS flow for MCN services. The GoS proposal is provided by a limited RSVP-TE protocol extension, to achieve GoS capacity in intermediate nodes, in order to obtain faster re-transmissions of lost packets. Furthermore, our proposal lets RSVP-TE obtain local recoveries in case of LSP failures by means of the Fast Reroute point-to-point technique. In [6] the efficiency of this technique was studied and compared with other E-E failure recovery techniques.

In order to get the GoSP from a GoS node when an MCN flow packet is lost, we consider a domain G(U), with a set of nodes U and a data flow φ(G) = φ(x_i, x_n) in G(U) across a path LSP_{i,n}, with origin in node x_i and destination in node x_n, with {x_i, x_n} ⊂ U. Possibly x_n only knows the incoming port and incoming label of any arrived packet of flow φ(G), i.e., x_n only knows that x_{n−1} is the sender of φ(x_i, x_n). It could determine which node is the sender of a packet by using label information. However, this is not a reliable strategy because, in case of flow aggregates, an RSVP-TE aggregator could perform reservation aggregation to merge k flows, in the form:

$$\varphi(x_{n-1}, x_n) = \sum_{i=1}^{k} \varphi_i(x_{n-1}, x_n) \qquad (1)$$

Furthermore, x_n may not be able to satisfy the Flow Conservation Law due to congestion:

$$\sum_{i=1}^{k} p_{il} > \sum_{j=1}^{k} p_{lj} \qquad (2)$$

The parameter p_{ij} is the traffic volume sent from x_i to x_j across x_l. Therefore, one or more packets are being discarded in x_l, because the number of outgoing packets from x_l is lower than the number of incoming packets. In this case, upper layer protocols will have to detect lost packets and re-transmit them from the head-end.
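The congestion condition of equation (2) is easy to check at a node. The following is an illustrative C sketch; the function name and the sample counts are hypothetical, not from the paper.

#include <stdio.h>

/* Check the Flow Conservation Law of equation (2) at node x_l:
 * if the packets received from the k upstream flows exceed the
 * packets forwarded downstream, x_l is discarding traffic. */
int node_is_congested(const long *in_pkts, const long *out_pkts, int k)
{
    long in_sum = 0, out_sum = 0;
    for (int i = 0; i < k; i++) {
        in_sum  += in_pkts[i];   /* sum over i of p_il */
        out_sum += out_pkts[i];  /* sum over j of p_lj */
    }
    return in_sum > out_sum;     /* eq. (2): conservation violated */
}

int main(void)
{
    long in[]  = {120, 300, 80};
    long out[] = {118, 295, 80};
    printf("congested: %s\n", node_is_congested(in, out, 3) ? "yes" : "no");
    return 0;
}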
GoS characterization information of an MCN flow packet consists of the GoSP, the GoS Level and the Packet ID. The GoSP is the most generic information: it is a constant value for every packet of the flows in a same LSP. Therefore, it is related to the LSP, but neither to flows nor to packets. The GoS Level is a constant value for every packet of a flow, i.e., it is flow-specific information. A greater GoS level implies a greater probability that a packet can be re-transmitted from a previous hop, because a flow with a higher GoS level is signalled across an LSP with more GoS capable nodes. Moreover, more memory is allocated in GoS buffers for flows with the highest GoS level. It allows classifying the GoS priority level with respect to other MCN flows of the LSP or of other paths in the domain. This value stays constant only in packets belonging to the same MPLS Forwarding Equivalence Class (FEC). Finally, the Packet ID is necessary to request local re-transmissions in case of packet loss of an MCN service. It is packet-specific information, with a unique value per packet of a flow.

Therefore, a buffer in GoS nodes to temporarily store only packets of an MCN service is needed. However, a particular packet only needs to be buffered for a short interval of time, because the time within which a local recovery request for such a packet can be received is very limited, due to the low packet delay in MPLS backbones. So a GoS node only needs to store a limited number of packets per flow, allowing very efficient buffer searches. The set of GoS nodes which have switched the packets of a GoS flow is called the GoS Plane (GoSP), and the number of hops necessary to achieve a successful local recovery is called the Diameter (d) of the local re-transmission. This way, a greater GoS level gives a higher probability of achieving a local re-transmission with a lower diameter. Therefore, the diameter is the key parameter of a GoS re-transmission. In this paper we focus on an analysis of the diameter scalability.

In Fig. 1, the operation of GoS is shown when a packet of an MCN service is discarded, for instance in intermediate node x4, and three feasible diameters can be used to recover the lost packet locally.

Figure 1. GoSP from node x4, with diameter = 3 hops

In order to request local re-transmissions when a packet of an MCN service is lost, it is necessary for GoS to know the set of nodes that forward the GoS packets. Thus, x_n would know that discarded traffic has been stored in the upstream GoS nodes of LSP_{i,n}. The first node from which to request a local re-transmission is the previous GoS capable neighbour. For this purpose, RSVP-TE has been extended to allow signalling the GoS re-transmission requests, even across non-GoS nodes. This proposal avoids re-transmission requests to the head-end and brings a smaller increment of the global φ(G) in the congested domain. Moreover, the deployment of GoS does not
The table includes a first column for FEC or flow identification, a second column for flow GoS level and, finally, a third column is used to know the previous GoS hop address, to send it a request in case of GoS packet loss.
imply the replacement of a lot of routers in a MPLS domain, but only the insertion of several GoS capable nodes in bottlenecks. For this purpose, a study of distribution of GoS nodes in the domain has been carried out in order to get the optimal placement of GoS nodes. It has been carried out basing on several parameters, such as domain topology, links capacity, RSVP-TE reservations, network load and GoS level of the flows. The main benefit of this study is to minimize the diameter of local recoveries in case of MCN service data loss.
B. Guarantee of Service States Diagram In Fig. 2 a states diagram of the operation of a GoS node is shown. In the FP, the state of a GoS node is Data Forwarding, switching labels and forwarding data packets to the next node. There are only two events that change this state in the GoS node. The first event is the detection of a GoS packet loss. In this case, the GoS capable node gets FEC and GoS packet identification and change its state to Local recovery request, sending a local re-transmission request (GoSReq) to the first node of GoSP (the closest upstream GoS node). When a response (GoSAck) is received, it changes to the initial state.
A. A Connection-Oriented GoSP The throughput of a flow could be lower if GoS characterization information was carried with data packets. To avoid this, GoS information carried into data packets has been minimized, signalling the GoSP when the LSP is being signalled by RSVP-TE. This task is only carried out at the beginning, before data packets forwarding. Therefore, a GoS integrated with the MPLS Control Plane (CP), avoids that GoS information must be forwarded with every MPLS data packet. This way, GoS characterization info (GoS Level and GoSP previous hop) is only sent when LSP is being signalled, adding a new row in a table of the GoS nodes. This is similar to the operation of RSVP-TE protocol when an LSP is signalled across the domain, considering the GoSP as a connectionoriented subset of nodes of the LSP with GoS capability. The LSP that supports a GoSP to forward a MCN service with high requirements of delay and reliability is named privileged LSP.
The other event that changes the state is reception of a GoSReq from any downstream GoS node, which is requesting a local re-transmission. In this case, the node changes its state to Buffer Access, to search the requested packet according to the information received in the GoSReq. If the requested packet is found in the GoS buffer, a GoSAck is sent in response to the GoSReq, indicating that requested packet was found and it will be re-transmitted locally. Therefore, it changes to Local Retransmission state to get the GoS packet from the GoS buffer and re-forward it. Next, it will return to initial Forwarding state. In case of not find the packet in GoS buffer, it will send a GoSAck message, indicating that packet was not found and changing to Local Recovery Request state, sending a new GoSReq to its previous GoS node in the GoSP, if it is not the last one.
This way, GoS proposal extends the RSVP-TE protocol to let GoSP signalling as a subset of nodes of a privileged LSP. In the CP, when a node receives an RSVP-TE message requesting a new LSP, it inserts a new row in the Forwarding Information Base (FIB), about how to forward data packets across nodes of the LSP that is being signalled. Therefore, this is the info to be used by an LSR in the MPLS Forwarding Plane (FP) when it receives a MPLS packet to be switched. With FIB information it will know how to make the label swapping and how to forward it to the next hop. Therefore, with a connectionoriented GoSP, a GoS node that in FP detects an erroneous or discarded privileged packet, it only needs to get the FEC and GoS packet ID of the lost packet, because the GoS table already has all it needs to initiate a local re-transmission request. When RSVP-TE signals a new LSP for a MCN flow, then every GoS capable node of the LSP will add a new row to the FIB table, but also to the GoS Table. Flows information in that table is very simple, as in Table 1 is shown. TABLE I.
TABLE I. AN EXAMPLE OF GOS TABLE VALUES

FEC   GoS Level           GoSP PHOP
35    0000000000001011    x.x.160.12
36    0000000000000001    x.x.160.73
37    0000000000010010    x.x.160.17
38    0000000000000001    x.x.160.35

B. Guarantee of Service States Diagram

Fig. 2 shows the states diagram of the operation of a GoS node. In the FP, a GoS node is in the Data Forwarding state, switching labels and forwarding data packets to the next node. Only two events change this state in a GoS node. The first event is the detection of a GoS packet loss. In this case, the GoS-capable node obtains the FEC and GoS packet identification and changes its state to Local Recovery Request, sending a local re-transmission request (GoSReq) to the first node of the GoSP (the closest upstream GoS node). When a response (GoSAck) is received, it returns to the initial state.

The other event that changes the state is the reception of a GoSReq from a downstream GoS node requesting a local re-transmission. In this case, the node changes its state to GoS Buffer Access to search for the requested packet according to the information received in the GoSReq. If the requested packet is found in the GoS buffer, a GoSAck is sent in response to the GoSReq, indicating that the requested packet was found and will be re-transmitted locally. The node then changes to the Local Retransmission state to take the GoS packet from the GoS buffer and re-forward it, after which it returns to the initial Data Forwarding state. If the packet is not found in the GoS buffer, the node sends a GoSAck indicating that the packet was not found and changes to the Local Recovery Request state, sending a new GoSReq to its previous GoS node in the GoSP, if it is not the last one.

Figure 2. States diagram of a GoS capable node (states: Data Forwarding, Local Recovery Request, GoS Buffer Access, Local Retransmission)
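To make the transitions of Fig. 2 and the table above concrete, here is a minimal Python sketch of a GoS node; all class and method names (GoSNode, on_packet_loss, etc.) are illustrative inventions, not an API defined by the proposal.

```python
from dataclasses import dataclass

@dataclass
class GoSTableEntry:
    fec: int          # Forwarding Equivalence Class of the privileged LSP
    gos_level: int    # GoS level signalled for the flow (bit field in Table I)
    gosp_phop: str    # previous hop of this node in the GoSP

class GoSNode:
    """Toy model of the state machine in Fig. 2 (hypothetical API)."""
    def __init__(self):
        self.state = "DATA_FORWARDING"
        self.gos_table = {}   # fec -> GoSTableEntry, filled at LSP signalling time
        self.gos_buffer = {}  # (fec, packet_id) -> buffered privileged packet

    def on_packet_loss(self, fec, packet_id):
        # Event 1: a privileged packet loss is detected in the FP.
        self.state = "LOCAL_RECOVERY_REQUEST"
        phop = self.gos_table[fec].gosp_phop
        return ("GoSReq", phop, fec, packet_id)   # sent to the GoSP PHOP

    def on_gos_ack(self):
        # A GoSAck answers our GoSReq: return to the initial state.
        self.state = "DATA_FORWARDING"

    def on_gos_req(self, fec, packet_id):
        # Event 2: a downstream GoS node requests a local re-transmission.
        self.state = "GOS_BUFFER_ACCESS"
        packet = self.gos_buffer.get((fec, packet_id))
        if packet is not None:
            self.state = "LOCAL_RETRANSMISSION"   # re-forward the buffered packet
            self.state = "DATA_FORWARDING"        # then return to forwarding
            return ("GoSAck", True, packet)
        # Not found: answer negatively; a new GoSReq goes one hop upstream.
        self.state = "LOCAL_RECOVERY_REQUEST"
        return ("GoSAck", False, None)
```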
III. GUARANTEE OF SERVICE MESSAGES
GoS levels can easily be mapped to MPLS FECs, which are commonly used to describe a packet-destination mapping. A FEC is a set of packets to be forwarded in the same way (e.g. using the same path or Quality of Service criteria). One of the reasons to use the FEC is that it allows grouping packets into classes. It can be used for packet routing and also for efficient QoS support; for instance, a high-priority FEC can be mapped to a healthcare service and a low-priority FEC to a web service.

The label is used by MPLS to establish the mapping between FEC and packet, because an incoming and outgoing label combination identifies a particular FEC. With different classes of service, different FECs with mapped labels will be used. In our proposal, the GoS FEC concept is used to classify the different GoS levels, giving more priority to the most privileged FEC. Therefore, the GoS FEC allows giving different treatment to GoS packets belonging to flows with different privileges, even though they are forwarded along the same path. In order to minimize GoS signalling in the MPLS FP, the GoS characterization info (GoS Level, Packet ID and GoSP) can be signalled by RSVP-TE in the MPLS CP. When a privileged LSP is being established, extended RSVP-TE Path and Resv messages can carry the GoS Level and GoSP info (see Figs. 3 and 4).
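As a toy illustration of this FEC-based grouping, the snippet below maps invented FEC values to GoS levels and serves the more privileged class first; the numeric levels and service names are assumptions for the example only.

```python
# Hypothetical FEC table: a higher gos_level means a more privileged class.
FEC_TABLE = {
    35: {"service": "healthcare MCN flow", "gos_level": 3},
    36: {"service": "web service",         "gos_level": 0},
}

def forwarding_priority(fec: int) -> int:
    """Packets of a more privileged FEC are treated first."""
    return FEC_TABLE[fec]["gos_level"]

queue = sorted([36, 35], key=forwarding_priority, reverse=True)
print(queue)  # [35, 36]: the healthcare FEC is served before the web FEC
```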
When an LSP tunnel is being signalled in the CP, a GoS node that receives a GoS-extended Path message will access this GoS info to update its GoS Table. Then, it will record its own IP address in the GoSP PHOP field of the GoSPath object, because it will be the previous hop of the next downstream GoS node that detects a packet loss. It is not necessary to transport the entire GoSP in the GoSPath message, but only the last GoS node, because a node that detects a packet loss only sends a local re-transmission request to its PHOP in the GoSP. If the PHOP cannot find the requested packet, it will request a local re-transmission from the GoS PPHOP of the point of loss (if it is not the last one). Finally, following the RSVP-TE mode of operation, when an LSP is being signalled the GoS information is confirmed with the reception of a GoS-extended Resv message, confirming the requested GoS level.
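A minimal sketch of this GoSPath handling, assuming a message carrying just the two fields named above (GoS Level and GoSP PHOP); the function and field names are illustrative, not the actual RSVP-TE object layout.

```python
def process_gos_path(node_ip, gos_table, fec, gos_path):
    """Handle a GoS-extended Path message at a GoS-capable node.

    gos_path is a dict {"gos_level": int, "gosp_phop": str} carrying only
    the last GoS node of the GoSP, as explained above.
    """
    # Record the flow: the sender of the GoSPath object is our GoSP PHOP.
    gos_table[fec] = {"gos_level": gos_path["gos_level"],
                      "gosp_phop": gos_path["gosp_phop"]}
    # Rewrite the GoSP PHOP field: this node becomes the previous hop
    # for the next downstream GoS node on the LSP.
    gos_path["gosp_phop"] = node_ip
    return gos_path  # forwarded downstream with the Path message

# Example: the GoSPath object traverses two GoS nodes of a signalled LSP
table_a, table_b = {}, {}
msg = {"gos_level": 3, "gosp_phop": "x.x.160.12"}  # from the ingress GoS node
msg = process_gos_path("x.x.160.73", table_a, fec=35, gos_path=msg)
msg = process_gos_path("x.x.160.17", table_b, fec=35, gos_path=msg)
print(table_b[35]["gosp_phop"])  # x.x.160.73: each node stores only its PHOP
```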
Therefore, in case of packet loss at a GoS node, this LSR sends a local re-transmission request to the upstream GoSP PHOP. For this purpose, the RSVP-TE Hello message has been extended. In particular, the Hello Request message (see Fig. 5) has been extended with a GoSReq object, in order to allow requesting from the upstream GoSP PHOP the re-transmission of the lost packet specified in the Packet ID field of the flow (specified in the Privileged Flow ID field). The upstream GoS node that receives the GoSReq message sends a response in an extended Hello Ack message (see Fig. 6), with a GoSAck object to notify whether the requested packet has been found in the GoS buffer. Furthermore, following the RSVP-TE mode of operation, the Source Instance and Destination Instance of the Hello object are used to test connectivity between GoSP neighbour nodes.
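The GoSReq and GoSAck object bodies of Figs. 5 and 6 can be summarised as follows; the field sizes come from the figures, while the packing code itself is only an illustrative sketch.

```python
import struct

# GoSReq object body (Fig. 5): Privileged Flow ID (4 octets), Packet ID (4 octets).
def pack_gos_req(flow_id: int, packet_id: int) -> bytes:
    return struct.pack("!II", flow_id, packet_id)

# GoSAck object body (Fig. 6): Privileged Flow ID, Packet ID and GoS Ack
# (4 octets each); the ack word says whether the packet was found in the buffer.
def pack_gos_ack(flow_id: int, packet_id: int, found: bool) -> bytes:
    return struct.pack("!III", flow_id, packet_id, 1 if found else 0)

req = pack_gos_req(flow_id=35, packet_id=0x0B)
print(len(req))  # 8 octets of GoSReq body carried in the extended Hello message
```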
Figure 3. GoS extended Path message format with GoS Path object
A. Signalling of GoS Local Re-transmissions

It is not necessary to send the GoSP in every GoSReq message, because GoS nodes have an entry in the GoS Table with the GoSP PHOP for every flow. Therefore, if a GoSP PHOP node cannot satisfy a local re-transmission request, it takes its own GoS PHOP from the GoS Table and sends a new GoSReq to forward the request. Thus, it is not necessary for the node that initiates a GoSReq to send more requests to nodes beyond its GoSP PHOP. This technique reduces the LSP overhead when sending GoSReq messages, and it is the reason to buffer only one address in the GoSP PHOP column, instead of the entire GoSP.
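A sketch of this hop-by-hop recovery chain, where each node keeps a single GoSP PHOP per flow and forwards the request upstream when its buffer misses; the node representation and function names are hypothetical.

```python
def handle_gos_req(node, fec, packet_id):
    """node: dict with 'buffer' {(fec, pid): pkt} and 'phop' {fec: upstream node}."""
    pkt = node["buffer"].get((fec, packet_id))
    if pkt is not None:
        return pkt                      # GoSAck(found) + local re-transmission
    upstream = node["phop"].get(fec)    # single PHOP kept per flow (GoS Table)
    if upstream is None:
        return None                     # last GoSP node: local recovery fails
    # GoSAck(not found) would be sent downstream; the request moves one hop up.
    return handle_gos_req(upstream, fec, packet_id)

# Example GoSP X1 <- X2 <- X3; only X1 still buffers the lost packet.
x1 = {"buffer": {(35, 11): b"pkt"}, "phop": {35: None}}
x2 = {"buffer": {}, "phop": {35: x1}}
x3 = {"buffer": {}, "phop": {35: x2}}
print(handle_gos_req(x3, 35, 11))  # b'pkt' recovered with diameter d=3
```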
Figure 4. GoS extended Resv message format with GoS Resv object

Figure 5. GoS extended Hello message format, with GoS Request object after the Hello object
IV. SCALABILITY OF THE GOSP DIAMETER

In this section we analyze the scalability of the connection-oriented GoSP. An MPLS domain G(U) will be considered, with a set X of n nodes and a set U of links. Let $\delta_{ij}$ be the delay of link $(x_i, x_j) \in U$ and let $\delta(x_i, x_j)$ be the delay of a path between any two nodes $x_i$ and $x_j$. Finally, let $\delta_{GoS}$ be the delay proportion used for transmission of GoS characterization information in the FP (GoS packet ID). The main objective is to analyze the scalability of the GoSP when lost packets are re-transmitted between any two nodes of $LSP_{i,n}$ in G(U). This way, the minimum delay used by a packet when it is forwarded between two nodes of the path $LSP_{i,n}$ of G(U) is:
Figure 6. GoS extended Hello message format, with GoS Ack object after the Hello object

$$\min\ \delta(x_i, x_j) = \sum_{i=1}^{n} \sum_{j=1}^{n} \delta_{ij}\, x_{ij} \qquad (3)$$
subject to:
$$\sum_{l=2}^{n} x_{1l} = 1 \qquad (4)$$

$$\sum_{i=1}^{n} x_{il} - \sum_{j=1}^{n} x_{lj} = 0, \quad l = 2, 3, \dots, n-1 \qquad (5)$$

$$\sum_{l=1}^{n-1} x_{ln} = 1 \qquad (6)$$
where $x_{i,j} = 1\ \forall (x_i, x_j) \in LSP_{i,n}$, $x_{i,j} = 0\ \forall (x_i, x_j) \notin LSP_{i,n}$, and $\delta_{i,i} = 0\ \forall i$.

Fig. 7 shows the operation of the GoS when a packet being forwarded from X1 to X5 (with delay $\delta_{1,5}$) is discarded at the intermediate node X4. In this case, 3 GoSP diameters (d=1, d=2 and d=3) can be used to achieve a successful local re-transmission. First, X4 sends a local re-transmission request (GoS_Req) to the first node of the GoSP (X3). That node then sends a response (GoS_Ack) to indicate whether or not it has found the requested packet in its GoS buffer. If the packet is found (d=1), X3 sends the locally recovered packet (LRP) towards its destination. If it is not found, X3 sends a new GoS_Req message to its PHOP in the GoSP (X2). If X2 finds the requested packet, the successful diameter is d=2. Finally, if X1, the last node of the GoSP, finds the lost MCN packet, then a diameter d=3 achieves a successful local re-transmission. Furthermore, this local recovery process is compared below with both the end-to-end re-transmission request (EERR) and the end-to-end re-transmission packet (EERP).

A. End-to-End Retransmissions

Let $x_n$ be a non-GoS congested end node. In case of packet discarding by $x_n$, the Discarding Detection Time ($DDT_{E-E}$) between two nodes of $LSP_{i,n}$ is:

$$DDT_{E-E}(x_i, x_n) = \sum_{l=i}^{n-1} \delta_{l,l+1}\, x_{l,l+1} \qquad (7)$$
The minimal delay of the end-to-end (E-E) retransmission is:

$$\delta_{E-E}(x_i, x_n) = 2 \sum_{l=i}^{n-1} \delta_{l,l+1}\, x_{l,l+1} \qquad (8)$$
Therefore, the total delay $\Delta_{E-E}(x_i, x_n)$ needed to recover the discarded flow at $x_n$ follows from Eqs. (7) and (8):

$$\Delta_{E-E}(x_i, x_n) = 3 \sum_{l=i}^{n-1} \delta_{l,l+1}\, x_{l,l+1} \qquad (9)$$

Figure 7. Local re-transmission operation when a GoS packet is discarded in an intermediate node
B. GoS-based Local Re-transmissions

Let $x_n$ be a GoS congested end node. In case of packet discarding by $x_n$, the Discarding Detection Time ($DDT_d$) between the source and sink nodes of path $LSP_{i,n}$ is:

$$DDT_d(x_i, x_n) = \sum_{l=i}^{n-1} \delta_{l,l+1}\, \delta_{GoS}\, x_{l,l+1} \qquad (10)$$

The minimal delay of a local retransmission using a GoSP with diameter d ($\delta_d$) is:

$$\delta_d(x_i, x_n) = 2 \sum_{l=n-d}^{n-1} \delta_{l,l+1}\, \delta_{GoS}\, x_{l,l+1} \qquad (11)$$

subject to: $0 < d < n - i$.

If the diameter in Eq. (11) were $n - i$, then with $l = n - d = n - (n - i) = n - n + i = i$ we would get:

$$2 \sum_{l=n-d}^{n-1} \delta_{l,l+1}\, \delta_{GoS}\, x_{l,l+1} = 2 \sum_{l=i}^{n-1} \delta_{l,l+1}\, \delta_{GoS}\, x_{l,l+1} \qquad (12)$$

i.e., it would be an E-E retransmission. Moreover, if in Eq. (11) the GoSP diameter were bigger than $n - i$, it would be trying to get a retransmission from a node previous to $x_i$; but that node is the source of the data flow, so this is unfeasible. Thus, the total delay $\Delta_d(x_i, x_n)$ needed to recover discarded traffic from the initial instant of transmission follows from Eqs. (10) and (11):

$$\Delta_d(x_i, x_n) = \sum_{l=i}^{n-1} \delta_{l,l+1}\, \delta_{GoS}\, x_{l,l+1} + 2 \sum_{l=n-d}^{n-1} \delta_{l,l+1}\, \delta_{GoS}\, x_{l,l+1} \qquad (13)$$

At this point we test whether Eq. (13) < Eq. (9):

$$\sum_{l=i}^{n-1} \delta_{l,l+1}\, \delta_{GoS}\, x_{l,l+1} + 2 \sum_{l=n-d}^{n-1} \delta_{l,l+1}\, \delta_{GoS}\, x_{l,l+1} < 3 \sum_{l=i}^{n-1} \delta_{l,l+1}\, x_{l,l+1} \qquad (14)$$

$$3 \sum_{l=i}^{n-1} \delta_{l,l+1}\, x_{l,l+1} > \delta_{GoS} \sum_{l=i}^{n-1} \delta_{l,l+1}\, x_{l,l+1} + 2\, \delta_{GoS} \sum_{l=n-d}^{n-1} \delta_{l,l+1}\, x_{l,l+1} \qquad (15)$$

$$\sum_{l=i}^{n-1} \delta_{l,l+1}\, x_{l,l+1} > \frac{2\, \delta_{GoS} \sum_{l=n-d}^{n-1} \delta_{l,l+1}\, x_{l,l+1}}{3 - \delta_{GoS}} \qquad (16)$$

In Eq. (16) the half-plane of solutions has been obtained for the case of a local recovery with diameter d that has lower delay than an E-E re-transmission. Therefore, expressing the GoSP diameter scalability with respect to the number of nodes of the privileged LSP and $\delta_{GoS}$, we get parameter d:

$$d < \frac{(n - 1 - i)\, (3 - \delta_{GoS})}{2\, \delta_{GoS}} + 1 \qquad (17)$$

This proof can easily be extended to include the case where an intermediate node $X_{DD}$ requests the re-transmission, obtaining the same half-plane of solutions for the GoSP diameter, as shown in Eq. (17).

Fig. 8 shows the scalability of the GoSP diameter for different LSP sizes (parameters i and n). The chart shows a linear rise when increasing the number of nodes of the LSP, up to a maximum LSP size of 251 nodes. After this point, the maximum feasible diameter that would allow a successful local re-transmission has a value of 250 hops.

Figure 8. Scalability of GoSP diameter for different LSP sizes (x-axis: number of nodes of the LSP; y-axis: diameter)
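Eq. (17), together with the constraint 0 < d < n - i, yields the maximum useful GoSP diameter directly; the sketch below computes it for an assumed value of δGoS (the paper does not state the δGoS used to produce Fig. 8, so the printed numbers are illustrative only).

```python
def max_gos_diameter(n: int, i: int, delta_gos: float) -> int:
    """Largest integer d satisfying Eq. (17) and the constraint 0 < d < n - i."""
    bound_17 = (n - 1 - i) * (3.0 - delta_gos) / (2.0 * delta_gos) + 1.0
    d = min(n - i - 1, int(bound_17 - 1e-9))  # Eq. (17) is a strict inequality
    return max(d, 0)

# With an assumed delta_gos, the feasible diameter grows roughly linearly
# with the LSP size, matching the trend of Fig. 8.
for n in (10, 100, 251, 300):
    print(n, max_gos_diameter(n, i=0, delta_gos=1.0))
```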
V. SIMULATION RESULTS

In order to evaluate the performance of the GoS approach, we have carried out a series of simulations focused on the AT&T backbone network topology (see Fig. 9), which is MPLS-enabled to provide QoS for customers who require value-added services. In our simulations, the AT&T core topology is characterized by 120 LER nodes, 30 LSR nodes and 180 links, with capacities in the range [45 Mbps, 2.5 Gbps]. A GoS-enabled node has been located at the eight routers with the highest connectivity. In the scenarios, signalled LSPs are unidirectional and the bandwidth demanded by each flow is drawn from a distribution over the range [64 Kbps, 4 Mbps]. In order to analyze the effect that GoS re-transmissions have on transport layer protocols, several MCN services over TCP/IP that use LSPs crossing a different number of GoS-capable nodes have been compared with non-privileged TCP/IP flows across the same paths. LSP congestion has also been considered, in the range [0.01%, 4%].
Figure 9. AT&T core topology characterization

Figure 11. Packets received in sink in GoS re-transmission cases and E-E case at different time samples
Therefore, the more GoS-capable nodes crossed by the LSP, the higher the probability of local re-transmissions with the optimal diameter d=1. Hence an MPLS service provider would assign the flows with the highest GoS level to an LSP that crosses more GoS nodes.
Fig. 10 shows a throughput comparison between an E-E case, where lost packets need TCP re-transmissions from the head-end, and a GoS case, where dropped packets are recovered locally. Due to the GoS level assigned to the MCN service, 91.04% of discarded packets were recovered with diameter d=1, 8.96% with d=2, and no packets were re-transmitted with d>2. Trend functions are also shown in the chart to allow a performance comparison, with a confidence interval of 12.5 Kbps at a 95% confidence level. The average difference between the trend functions is 4.84%.
Fig. 11 shows a comparison between the percentages of packets received at different time samples of a particular flow when dropped packets are E-E recovered by the transport level protocol and when they are re-transmitted locally with diameters d=1, d=2, d=4 and d=8. For instance, at 35000 s, 55.79% of the E-E traffic has been received; at the lowest GoS level (d=8), 58.12% of packets have already been received; in the d=4 case, 60.04% of packets; in the d=2 case, 61.83% of packets; and in the best GoS level case, d=1, 62.91% of packets have been received.

Fig. 12 shows a packet loss comparison between a no-GoS case, where a lost packet needs a TCP re-transmission from the head-end, and a GoS case, where discarded packets can be recovered locally and therefore are not considered lost packets at the head-end. Trend functions are also shown, with a confidence interval of 0.21% at a 95% confidence level and an average difference between trend functions of 1.32%.
This way, we conclude that a significant part of discarded traffic does not have to be recovered end-to-end by the transport layer protocol, thanks to GoS local re-transmissions. Furthermore, by including GoS-capable nodes in bottlenecks we obtain an improvement in the number of packets delivered for MCN services in the Internet, with a better use of network resources.
Figure 10. Throughput sampling comparison between GoS and E-E re-transmissions

Figure 12. Percentage of packet loss of GoS and E-E flows
VI. CONCLUDING REMARKS

This article discusses GoS as a local traffic recovery technique in an MPLS domain, with the aim of improving network performance for MCN services in the face of congestion. We have first defined and discussed the requirements for GoS over MPLS. Then, we have explained that GoS signalling for MCN services with requirements of low delay and high reliability is possible. The scalability of the proposal has been analytically studied and, finally, the benefits of local re-transmissions of discarded traffic with respect to end-to-end re-transmissions have been evaluated. Further work should include the evaluation and comparison of different network scenarios under different real traffic distributions.
AUTHORS PROFILE

Fco. Javier Rodríguez-Pérez received his Engineering degree in Computer Science Engineering from the University of Extremadura (Spain) in 2000, where he is currently a professor and a Ph.D. candidate in the GITACA group. His research is mainly focused on QoS and traffic engineering, packet classification and signalling development over IP/MPLS systems.

José-Luis González-Sánchez is a full-time associate professor of the Computing Systems and Telematics Engineering department at the University of Extremadura, Spain. He received his Engineering degree in Computer Science and his Ph.D. degree in Computer Science (2001) from the Polytechnic University of Cataluña, Barcelona, Spain. He has worked for years at several private enterprises and public organizations, carrying out the functions of System and Network Manager. He is the main researcher of the Advanced and Applied Communications Engineering Research Group (GÍTACA) of the University of Extremadura. He has published many articles, books and research projects related to computing and networking.

Alfonso Gazo-Cervero received his PhD in computer science and communications from the University of Extremadura. He is currently a member of the research and teaching staff as an assistant professor in the GITACA group. His research interests are related mainly to QoS provision over heterogeneous networks, capacity planning, routing protocols and overlay networks.
Novel Intrusion Detection using Probabilistic Neural Network and Adaptive Boosting Tich Phuoc Tran, Longbing Cao Faculty of Engineering and Information Technology University of Technology, Sydney, Australia {tiptran, lbcao}@it.uts.edu.au
Dat Tran
Faculty of Information Sciences and Engineering
University of Canberra, Australia
[email protected]

Cuong Duc Nguyen
School of Computer Science and Engineering
International University, HCMC, Vietnam
[email protected]
Abstract— This article applies Machine Learning techniques to solve Intrusion Detection problems within computer networks. Due to the complex and dynamic nature of computer networks and hacking techniques, detecting malicious activities remains a challenging task for security experts; that is, currently available defense systems suffer from low detection capability and a high number of false alarms. To overcome such performance limitations, we propose a novel Machine Learning algorithm, namely the Boosted Subspace Probabilistic Neural Network (BSPNN), which integrates an adaptive boosting technique and a semi-parametric neural network to obtain a good trade-off between accuracy and generality. As a result, learning bias and generalization variance can be significantly minimized. Substantial experiments on the KDD-99 intrusion benchmark indicate that our model outperforms other state-of-the-art learning algorithms, with significantly improved detection accuracy, minimal false alarms and relatively small computational complexity.
The majority of currently existing IDS face a number of challenges, such as low detection rates, which can miss serious intrusion attacks, and high false alarm rates, which falsely classify a normal connection as an attack and therefore obstruct legitimate user access to network resources [1]. These problems are due to the sophistication of the attacks and their intended similarity to normal behavior. More intelligence is brought into IDS by means of Machine Learning (ML). Theoretically, it is possible for a ML algorithm to achieve the best performance, i.e. it can minimize the false alarm rate and maximize the detection accuracy. However, this normally requires infinite training sample sizes [2]. In practice, this condition is impossible to meet, due to limited computational power and the real-time response requirement of IDS. IDS must be active in real time and cannot allow much delay, because this would cause a bottleneck in the whole network.
Keywords- Intrusion Detection, Neural Network, Adaptive Boosting
To overcome the above limitations of currently existing IDS, we propose an efficient Boosted Subspace Probabilistic Neural Network (BSPNN) to enhance the performance of intrusion detection for rare and complicated attacks. BSPNN combines and improves a Vector Quantized-Generalized Regression Neural Network (VQ-GRNN) with an ensemble technique to improve detection accuracy while minimizing computational overheads through model tuning. Because this method combines the virtues of boosting and neural network technologies, it has both high data-fitting capability and high system robustness. To evaluate our approach, substantial experiments are conducted on the KDD-99 intrusion detection benchmark. The proposed algorithm clearly demonstrates superior classification performance compared with other well-known techniques, in terms of bias and variance, for real-life problems.
I. INTRODUCTION

As more and more corporations rely on computers and networks for communications and critical business transactions, securing digital information has become one of the largest concerns of the business community. A powerful security system is not only a requirement but essential to the livelihood of enterprises. In recent years, a great deal of research has been conducted in this area to develop intelligent and automated security tools which can fight the latest cyber attacks. Alongside static defense mechanisms, such as keeping operating systems up-to-date or deploying firewalls at critical network segments for access control, more advanced defense systems, namely Intrusion Detection Systems (IDS), are becoming an important part of today's network security architectures. In particular, IDS can be used to monitor computers or networks for unauthorized activities based on network traffic or system usage behaviors, and thereby detect if a system is targeted by a network attack such as a denial of service attack.
II. NETWORK INTRUSION DETECTION AND RELATED WORKS

Because most computers today are connected to the Internet, network security has become a major concern for organizations throughout the world. Alongside the existing techniques for preventing intrusions, such as encryption and firewalls, Intrusion Detection technology has established itself as an emerging research field concerned with detecting unauthorized access and abuse of computer systems by both internal users and external offenders. An Intrusion Detection System (IDS) is defined as a protection system that monitors computers or networks for unauthorized activities based on network traffic or system usage behaviors, thereby detecting if a system is targeted by a network attack such as a denial of service attack [4]. In response to those identified adversarial transactions, IDS can inform relevant authorities to take corrective actions.
There are a large number of IDS available on the market to complement firewalls and other defense techniques. These systems fall into two types: (1) misuse-based detection, in which events are compared against pre-defined patterns of known attacks, and (2) anomaly-based detection, which relies on detecting activities deviating from "normal" system operations.

In addition to the overwhelming volume of generated network data, rapidly changing technologies present a great challenge for today's security systems with respect to attack detection speed, accuracy and system adaptability. In order to overcome such limitations, there has been considerable research on applying ML algorithms to achieve a generalization capability from limited training data. That is, given known intrusion signatures, a security system should be able to detect similar or new attacks. Various techniques such as association rules, clustering, Naïve Bayes, Support Vector Machines, Genetic Algorithms, Neural Networks, and others have been developed to detect intrusions. This section provides a brief literature review of these technologies and related frameworks.

One of the rule-based methods commonly used by early IDS is the Expert System (ES) [3, 4]. In such a system, the knowledge of human experts is encoded into a set of rules. This allows more effective knowledge management than that of a human expert in terms of reproducibility, consistency and completeness in identifying activities that match the defined characteristics of misuse and attacks. However, ES suffers from low flexibility and robustness. Unlike ES, data mining approaches derive association rules and frequent episodes from available sample data, not from human experts. Using these rules, Lee et al. developed a data mining framework for the purpose of intrusion detection [5, 6]. In particular, system usage behaviors are recorded and analyzed to generate rules which can recognize misuse attacks. The drawback of such frameworks is that they tend to produce a large number of rules and, thereby, increase the complexity of the system.

Decision trees are one of the most commonly used supervised learning algorithms in IDS [7-11], due to their simplicity, high detection accuracy and fast adaptation. Another high-performing method is Artificial Neural Networks (ANN), which can model both linear and non-linear patterns. ANN-based IDS [12-15] have achieved great success in detecting difficult attacks. For unsupervised intrusion detection, data clustering methods can be applied [16, 17]. These methods involve computing a distance between numeric features, so they cannot easily deal with symbolic attributes, resulting in inaccuracy.

Another well-known ML technique used in IDS is the Naïve Bayes classifier [7]. Because Naïve Bayes assumes that features are independent, which is often not the case for intrusion detection, correlated features may degrade its performance. In [18], the authors apply a Bayesian network for IDS. The network appears to be attack specific, and its size grows rapidly as the number of features and attack types increases.

Besides the popular decision trees and ANN, Support Vector Machines (SVMs) are also a good candidate for intrusion detection systems [14, 19]; they can provide real-time detection capability and deal with the large dimensionality of the data. SVMs plot the training vectors in a high-dimensional feature space through non-linear mapping, labeling each vector by its class. The data is then classified by determining a set of support vectors, which are members of the set of training inputs that outline a hyperplane in the feature space.

Several other AI paradigms, including linear genetic programming [20], Hidden Markov Models [21], the Columbia Model [22] and Layered Conditional Random Fields [23], have been applied to the design of IDS.

III. BOOSTED SUBSPACE PROBABILISTIC NEURAL NETWORK (BSPNN)
A. Bias-Variance-Computation Dilemma

Several ML techniques have been adopted in the Network Security domain with certain success; however, severe limitations remain. Firstly, we consider the Artificial Neural Network (ANN) because of its wide popularity and well-known characteristics. As a flexible "model-free" learning method, ANN can fit training data very well and thus provide a low learning bias. However, it is susceptible to overfitting, which can cause instability in generalization [24]. Recent remedies try to improve model stability by reducing generalization variance at the cost of worse learning bias, i.e. allowing underfitting. However, underfitting is not acceptable for applications requiring high classification accuracy. Therefore, a system which can achieve both stable generalization and accurate learning is imperative for applications such as Intrusion Detection [19]. Mathematically, both bias and variance may be reduced at the same time given infinitely sized models. However, this is infeasible, since computing resources are limited in real life. Motivated by the need for an accurate detection system for Intrusion Detection, we develop a learning algorithm which provides a good trade-off between learning bias, generalization variance and computational requirements.
B. Objectives

This paper is inspired by a light-weight ANN model, namely the Vector Quantized-Generalized Regression Neural Network (VQ-GRNN) [25], which reduces the non-parametric GRNN [26] to a semi-parametric model by applying vector quantization techniques to the training data, i.e. clustering the input space into a smaller subspace. Compared with the GRNN method, which incorporates every training vector into its structure, VQ-GRNN applies only to a smaller number of clusters of input data. This significantly improves the robustness of the algorithm (low variance), but also limits its learning accuracy to some extent [24]. To make VQ-GRNN suitable for Intrusion Detection problems, i.e. to enhance its accuracy, we propose the Boosted Subspace Probabilistic Neural Network (BSPNN), which combines VQ-GRNN with an Ensemble Learning technique. Ensemble methods such as Boosting [27] iteratively learn multiple classifiers (base classifiers) on different distributions of the training data. Boosting, in particular, guides changes of the training data to direct further classifiers toward more "difficult" cases, i.e. putting more weight on previously misclassified instances. It then combines the base classifiers in such a way that the composite boosted learner outperforms the single classifiers. Amongst popular boosting variants, we choose Adaptive Boosting, or AdaBoost [28], to improve the performance of VQ-GRNN. AdaBoost is the most widely adopted method, allowing the designer to continue adding weak learners, whose accuracy is only moderate, until some desired low training error has been achieved. AdaBoost is "adaptive" in the sense that it does not require prior knowledge of the accuracy of these hypotheses [27]. Instead, it measures the accuracy of a base hypothesis at each iteration and sets its parameters accordingly.

Although classifier combinations (as in boosting) can improve generalization performance, correlation between the individual classifiers can be harmful to the final composite model. Moreover, it is widely accepted that the generalization performance of a combined classifier is not necessarily achieved by combining classifiers with better individual performance but by including independent classifiers in the ensemble [9]. Therefore, an independence condition among individual classifiers, normally termed orthogonality, diversity or disagreement, is required to obtain a good ensemble.

C. Model description

As shown in Figure 1, the proposed BSPNN algorithm has two major modules: the Adaptive Booster and the Modified Probabilistic Classifier. Given input data $S = \{(x_i, y_i)\ |\ i = 1 \dots N\}$ with output labels $y_i \in \{1, \dots, K\}$, the BSPNN algorithm aims to produce a classifier $F: X \rightarrow Y$.

In this research, we implement F (referred to as the Adaptive Booster) using the SAMME algorithm [29]. F learns by iteratively training a Modified Probabilistic Classifier f on weighted data samples S, whose weights are updated by the Distribution Generator according to previously created models of f. This base learner f is a modified version of the emerging VQ-GRNN model [25] (called the Modified GRNN Base Learner), in which the input data space is reduced significantly (by the Weighted Vector Quantization module) and the output is computed by a linearly weighted mixture of Radial Basis Functions (RBF). This process is repeated until F reaches a desired number of iterations or its Mean Squared Error (MSE) reaches an appropriate level. The base hypotheses returned by f are finally combined by the Hypothesis Aggregator:

$$F(x) = \arg\max_k \sum_{t=1}^{T} KW_t \cdot h_t^k(x)$$

This combination depends not only on the misclassification error of the previously added hypothesis but also on the diversity of the ensemble at that time. The Diversity Checker measures ensemble diversity using the Kohavi-Wolpert variance [30] (denoted by the hypothesis weighting coefficient $KW_t$). To avoid any confusion, the adaptive booster F is called the master algorithm, while f refers to the base learner. They are described in greater detail in the next sections.

1) Adaptive Booster
The Adaptive Booster iteratively produces base hypotheses on a weighted training dataset. The weights are updated adaptively based on the classification performance of the component hypotheses. The generated hypotheses are then integrated via a weighted sum based on their diversity.
Figure 1. BSPNN high-level design view
TABLE I. ADAPTIVE BOOSTER ALGORITHM

Input: S = {(x_1, y_1), ..., (x_N, y_N)} and associated distribution W
Initialize w_i = 1/N for all i = 1...N
Do for t = 1...T:
  1. Train a classifier on the weighted sample {S, w^(t)} using the Modified
     Probabilistic Classifier and obtain the hypothesis h_t : X -> [0,1]^K.
  2. Compute the Kohavi-Wolpert variance (KW_t) of the current ensemble:
        KW_t = (1 / (N * L^2)) * sum_j l(x_j) * (L - l(x_j)),
     where L is the number of base classifiers generated so far (L = t) and
     l(x_j) is the number of classifiers that correctly classify x_j.
  3. Compute class probability estimates:
        h_t^k(x) = (K - 1) * ( log p_k - (1/K) * sum_{k'} log p_{k'} ),  k = 1,...,K,
     where p_k = Prob_w(h_t(x) = k) is the weighted class probability of class k.
  4. Update the weights:
        w_i <- w_i * exp( -((K - 1)/K) * y_i * log p(x_i) ),  i = 1,...,N
  5. Renormalize: w_i <- w_i / sum_j w_j,  i = 1...N
End for
Output: F(x) = argmax_k sum_{t=1}^{T} KW_t * h_t^k(x)
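For readers who prefer code, the listing above can be rendered as the following schematic sketch, with the base learner abstracted behind a fit_base callable returning class-probability estimates; this is a SAMME.R-style rendering of the table, not the authors' implementation.

```python
import numpy as np

def boost(X, y, K, T, fit_base):
    """X: (N, d) data; y: (N,) labels in 0..K-1; fit_base(X, y, w) -> prob fn."""
    N = len(y)
    w = np.full(N, 1.0 / N)
    hyps, kws, correct = [], [], np.zeros(N)
    for t in range(1, T + 1):
        predict_proba = fit_base(X, y, w)           # Modified Probabilistic Classifier
        P = np.clip(predict_proba(X), 1e-12, 1.0)   # (N, K) class probabilities
        # Confidence-rated hypothesis h_t^k(x) from the class probabilities
        H = (K - 1) * (np.log(P) - np.log(P).mean(axis=1, keepdims=True))
        correct += (P.argmax(axis=1) == y)          # l(x_j) in the KW formula
        kw = np.sum(correct * (t - correct)) / (N * t * t)  # Kohavi-Wolpert variance
        hyps.append(H); kws.append(kw)
        # Emphasise examples with low probability on the true class
        w *= np.exp(-(K - 1) / K * np.log(P[np.arange(N), y]))
        w /= w.sum()                                # renormalize
    F = sum(k * H for k, H in zip(kws, hyps))       # diversity-weighted aggregation
    return F.argmax(axis=1)
```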
2) Modified Probabilistic Classifier (Base Learner)

The Modified Probabilistic Classifier serves as the base learner, which can be trained on $\{S, w^{(t)}\}$ repeatedly by the Adaptive Booster to obtain the hypothesis $h_t: X \rightarrow [-1, +1]$. In each boosting iteration, a base hypothesis is created with associated accuracy and diversity measures. From this information, the data weights are updated for the next iteration, and the final weighting of that hypothesis in the joint classification is computed.

We adapt VQ-GRNN [25] as the base learner in our BSPNN model. VQ-GRNN is closely related to Specht's GRNN [26] and PNN [31] classifiers. Our adaptation of VQ-GRNN can produce confidence-rated outputs, and it is modified so that it utilizes the weights associated with the training examples (to compute cluster center vectors and to find a single smoothing factor) and incorporates these weights as penalties for misclassifications (e.g. a weighted MSE). This modified version of VQ-GRNN is similar to the original one in that a single kernel bandwidth is tuned to achieve satisfactory learning. Both cluster close training vectors according to a very simple procedure related to vector quantization. A number of equally sized radial basis functions are placed at each center vector location. These functions are approximated as:

$$\sum_{i} \phi(x - x_i, \sigma) \approx c_k\, \phi(x - z_k, \sigma)$$

This approximation is reasonable because the vectors are close to each other in the input vector space. Using this idea, the VQ-GRNN equation can be generalized [25]:

$$y(x) = \frac{\sum_{i} y_i\, c_i\, \phi(x - z_i, \sigma)}{\sum_{i} c_i\, \phi(x - z_i, \sigma)}$$

where $z_i$ is the center vector for cluster i in the input space, $\phi(x, \sigma)$ is the radial basis function with centre x and width parameter $\sigma$, $y_i$ is the output related to $z_i$, and $c_i$ is the number of vectors $x_j$ associated with centre $z_i$; $\sum_i c_i = N$ is the total number of training vectors.

The above formula can be extended to a multiclass classification problem by redefining the output vector as a K-dimensional vector (K is the number of classes):

$$y = (y_1, \dots, y_K)$$

where $y_k$ is the class membership probability of the k-th class of the vector x. If the vector is of class k, then $y_k = 1.0$ and $y_{k'} = 0$ for the remaining vector elements ($k' \neq k$). An input vector x is classified to class k if the k-th element of the output vector has the highest magnitude. To suit ensemble learning, VQ-GRNN is adapted so that it incorporates the weights associated with each training vector into the learning process, i.e. it uses them in cluster center formation and in the Mean Square Error (MSE) calculation for realizing the smoothing factor $\sigma$.

Such modifications make VQ-GRNN especially suited for boosting. In particular, the center vector $z_k$ is computed as:

$$z_k = \frac{\sum_{i} w_i\, x_i}{\sum_{i} w_i}$$

where the sums run over the $c_k$ training vectors belonging to cluster k and $w_i$ is the weight associated with $x_i$. VQ-GRNN's learning involves finding the optimal bandwidth $\sigma$ giving the minimum MSE. In our implementation, a Weighted MSE (WMSE) is used instead:

$$WMSE = \sum_{i=1}^{N} w_i\, (y_i - \hat{y}_i)^2$$

where $w_i$ and $\hat{y}_i$ are the associated weight and prediction of an example $(x_i, y_i)$, i = 1...N.

3) Remarks on BSPNN

The high accuracy of BSPNN can be attributed to the boosting effects of the SAMME method implemented in the Adaptive Booster module. By sufficiently handling the multiclass problem and using confidence-rated predictions, SAMME can maximize the distribution margins of the training data [32]. Also, our implementation of the Kohavi-Wolpert variance (KW) [30] in the reweighting of hypotheses in the joint classification can effectively enforce ensemble diversity. The Modified Probabilistic Classifier has very fast adaptation, and it is modified to better integrate with the Adaptive Booster module. In particular, after being modified, it can produce confidence-rated outputs and fully utilize the weights given by the booster in the learning process. In the next sections, we apply BSPNN to specific Intrusion Detection problems.
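The prediction rule above is only a few lines of code; the cluster centres, counts and bandwidth σ are assumed as given here (in the model they come from weighted vector quantization and WMSE minimisation):

```python
import numpy as np

def vq_grnn_predict(x, centers, outputs, counts, sigma):
    """y(x) = sum_i y_i c_i phi(x - z_i, sigma) / sum_i c_i phi(x - z_i, sigma).

    centers: (M, d) cluster centres z_i; outputs: (M, K) class vectors y_i;
    counts: (M,) number of vectors per cluster c_i; sigma: RBF bandwidth.
    """
    d2 = np.sum((centers - x) ** 2, axis=1)
    phi = np.exp(-d2 / (2.0 * sigma ** 2))   # Gaussian radial basis function
    wts = counts * phi
    return wts @ outputs / wts.sum()         # K class-membership estimates

centers = np.array([[0.0, 0.0], [1.0, 1.0]])
outputs = np.array([[1.0, 0.0], [0.0, 1.0]])  # one-hot class vectors
probs = vq_grnn_predict(np.array([0.9, 1.1]), centers, outputs,
                        counts=np.array([3.0, 5.0]), sigma=0.5)
print(probs.argmax())  # 1: the input is classified to the nearer cluster's class
```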
IV. APPLICATION TO NETWORK INTRUSION DETECTION

Current IDS suffer from low detection accuracy and insufficient system robustness for new and rare security breaches. In this section, we apply our BSPNN to identify known and novel attacks in the KDD-99 dataset [1], which contains TCP/IP connection records. Each record consists of 41 attributes (features) and one target value (labeled data) which indicates whether a connection is Normal or an attack. There are 40 types of attacks, classified into four major categories: Probing (Probe) (collecting information on the target system prior to an attack), Denial of Service (DoS) (preventing legitimate requests to a network resource by consuming the bandwidth or overloading computational resources), User-to-Root (U2R) (attackers with normal user-level access gain the privileges of the root user), and Remote-to-Local (R2L) (unauthorized users gain the ability to execute commands locally).
Table 2 describes the components of the KDD-99 dataset (referred to as Whole KDD): 10% KDD, containing 26 known attack types (for training), and Corrected KDD, containing 14 novel attacks (for testing).

TABLE II. KDD-99 COMPONENT DATASETS [1]

Dataset        DoS      Probe  U2R  R2L    Total Attack  Total Normal
Whole KDD      3883370  41102  52   1126   3925650       972780
10% KDD        391458   4107   52   1126   396743        97277
Corrected KDD  229853   4166   70   16347  250436        60593
A. Experiment Setup

1) Cost-Sensitive Evaluation: Because an error on a particular class may not be as serious as errors on other classes, we should consider misclassification cost for intrusion detection. Given a test set, the average cost of a classifier is calculated as below [1]:
In our experiments, we first created 13 datasets S1, ..., S13, as shown in Table 3, by incrementally adding each cluster C_k to the normal dataset (Norm) to simulate the evolution of new intrusions:
$$AvgCost = \frac{1}{N} \sum_{i} \sum_{j} ConfM(i,j) \times CostM(i,j) \qquad (4)$$
$$S_1 = Norm \cup C_1, \qquad S_k = S_{k-1} \cup C_k, \quad k = 2, \dots, 13$$
where:
The BSPNN and other learning methods are then tested against the “Corrected KDD” testing set, containing both known and unknown attacks.
N: total number of connections in the dataset; ConfM(i,j): the entry at row i, column j in the confusion matrix;
B. Experiment Result

1) Anomaly Detection: We train BSPNN on the pure normal dataset (Norm) to detect anomalies in the "Corrected KDD" testing set. Table 4 shows that our BSPNN obtains a competitive detection rate compared with [33] while achieving a significantly lower false alarm rate (1.12%), minimizing a major drawback of anomaly detection.
CostM(i,j): the entry at row i, column j in the cost matrix. 2) Datasets Creation First, we consider anomaly detection where only normal connection records are available for training. Any connections that differ from these normal records are classified as “abnormal” without further specifying which attack categories it actually belongs to. For this purpose, we filter all known intrusions from the 10% KDD to form a pure normal dataset (Norm).
2) Misuse Detection To test the effect of having known intrusions in the training set on the overall performance, we run BSPNN on the 13 training sets: v , … , vu . Its detection rates (DR) on different attack categories are displayed in Figure 2. We could discover a general trend of increasing performance as more intrusions are added into training set. In particular, detection of R2L attacks requires less known intrusion data (DR starts rising at vx) than that of other classes.
For misuse detection, we inject the 26 known attacks into Norm to classify 14 novel ones. For example, from the Probe attacks that appeared in the training set (ipsweep, nmap, portsweep, satan), we aim to detect unseen Probe attacks that were only included in the testing data (mscan, saint). In [33], artificial anomalies are added to the training data to help the learner discover a boundary around the available training data. That method changes the value of one feature of a connection while leaving the other features unaltered. However, we do not adopt it, due to its high false alarm rate and its unconfirmed assumption that the boundary is very close to the known data and that the classes do not intersect one another. Instead, we group the 26 known intrusions into 13 clusters C1, ..., C13 (note that these clusters are not artificially generated but real incidents, available in the "10% KDD" set) and use them for classification. Each cluster contains intrusions that require similar features for effective detection, and this method, as detailed in [33], is not influenced by cluster orders.
Using the full training set (S13), we test our BSPNN against other existing methods, including the KDD-99 winner [8], the rule-based PNrule approach [34], the multi-class Support Vector Machine [19], the Layered Conditional Random Fields framework (LCRF) [23], the Columbia Model [22] and the Decision Tree method [11]. Their Detection Rate (DR) and False Alarm Rate (FAR) are reported in Table 5, with the highest DR and lowest FAR for each class in bold.
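For reference, the cost-sensitive metric of Eq. (4) amounts to a weighted sum over the confusion matrix; a sketch with an invented 2-class cost matrix follows.

```python
import numpy as np

def average_cost(conf_m, cost_m):
    """AvgCost = (1/N) * sum_ij ConfM(i, j) * CostM(i, j)."""
    conf_m = np.asarray(conf_m, dtype=float)
    return float((conf_m * cost_m).sum() / conf_m.sum())

conf = [[90, 10],   # e.g. Normal classified as Normal / as attack
        [5, 95]]    # attack classified as Normal / as attack
cost = [[0, 1],     # invented costs: missing an attack is the expensive cell
        [4, 0]]
print(average_cost(conf, cost))  # (10*1 + 5*4) / 200 = 0.15
```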
TABLE III. CLUSTERS OF KNOWN INTRUSION

C1: back                                   C2: buffer_overflow, loadmodule, perl, rootkit
C3: ftp_write, warezclient, warezmaster    C4: guess_passwd
C5: imap                                   C6: land
C7: portsweep, satan                       C8: ipsweep, nmap
C9: multihop                               C10: neptune
C11: phf                                   C12: pod, teardrop
C13: spy, smurf

TABLE IV. ANOMALY DETECTION RATE (DR) AND FALSE ALARM RATE (FAR) FOR ANOMALY DETECTION

                 DR      FAR
Fan et al. [33]  94.26   2.02
BSPNN            94.31   1.12
Figure 2. Detection Rate on Datasets for misuse detection
For Probe and DoS attacks, BSPNN achieves slightly better DR than the other algorithms, with very competitive FAR. Though the improvement in detection of the Normal class is not significant, our model does obtain a remarkably low FAR. In addition, a clear performance superiority is achieved by BSPNN for the U2R and R2L classes.
It is also important to note that, since the KDD-99 dataset is unbalanced (U2R and R2L appear rarely), the baseline models can only classify the major classes and perform poorly on the minor ones, while our BSPNN exhibits superior detection power for all classes. Significant improvement in the detection of the more dangerous attacks (U2R, R2L) leads to a lower total weight of misclassification of 0.1523, compared with 0.2332 for the KDD-99 winner.
TABLE V. DETECTION RATE (DR) AND FALSE ALARM RATE (FAR) FOR MISUSE DETECTION (DR/FAR, %)

Method                                   Normal       Probe         DoS           U2R           R2L
KDD 99 winner [8]                        99.5 / 27.0  83.3 / 35.2   97.1 / 0.1    13.2 / 28.6   8.4 / 1.2
PNrule [34]                              99.5 / 27.0  73.2 / 7.5    96.9 / 0.05   6.6 / 89.5    10.7 / 12.0
Multi-class SVM [19]                     99.6 / 27.8  75 / 11.7     96.8 / 0.1    5.3 / 47.8    4.2 / 35.4
Layered Conditional Random Fields [23]   -            98.60 / 0.91  97.40 / 0.07  86.30 / 0.05  29.60 / 0.35
Columbia Model [22] (DR only)            -            96.7          24.3          81.8          5.9
Decision Tree [11] (DR only)             -            81.4          60.0          58.8          24.2
BSPNN                                    99.8 / 3.6   99.3 / 1.1    98.1 / 0.06   89.7 / 0.03   48.2 / 0.19
V. CONCLUSION

This research is inspired by the need for a highly performing but computationally light classifier for applications in Network Security. In particular, the Boosted Subspace Probabilistic Neural Network (BSPNN) is proposed, which combines two emerging algorithms, an adaptive boosting method and a probabilistic neural network. BSPNN retains the semi-parametric characteristics of VQ-GRNN and therefore obtains low generalization variance, while receiving an accuracy boost from the SAMME method (low bias). Though BSPNN requires more processing power due to the effect of boosting, the increased computation is still lower than that of GRNN or other boosted algorithms.

Experiments on the KDD-99 network intrusion dataset show that our approach obtains superior performance in comparison with other state-of-the-art detection methods, achieving low learning bias and improved generalization at an affordable computational cost.

REFERENCES

[1] C. Elkan, "Results of the KDD’99 Classifier Learning," ACM SIGKDD Explorations, vol. 1, pp. 63-64, 2000.
[2] I. Kononenko and M. Kukar, Machine Learning and Data Mining: Introduction to Principles and Algorithms, Horwood Publishing Limited, 2007.
[3] D. S. Bauer and M. E. Koblentz, "NIDX – an expert system for real-time network intrusion detection," in Proc. of the Computer Networking Symposium, Washington, D.C., 1988, pp. 98-106.
[4] K. Ilgun, R. Kemmerer, and P. Porras, "State transition analysis: a rule-based intrusion detection approach," IEEE Transactions on Software Engineering, pp. 181-199, 1995.
[5] W. Lee, S. Stolfo, and K. Mok, "Mining Audit Data to Build Intrusion Detection Models," Proc. Fourth International Conference on Knowledge Discovery and Data Mining, pp. 66-72, 1999.
[6] W. Lee, S. Stolfo, and K. Mok, "A Data Mining Framework for Building Intrusion Detection Model," Proc. IEEE Symp. Security and Privacy, pp. 120-132, 1999.
[7] N. B. Amor, S. Benferhat, and Z. Elouedi, "Naive Bayes vs. Decision Trees in Intrusion Detection Systems," Proc. ACM Symp. Applied Computing, pp. 420-424, 2004.
[8] B. Pfahringer, "Winning the KDD99 Classification Cup: Bagged Boosting," SIGKDD Explorations, vol. 1, pp. 65-66, 2000.
[9] V. Miheev, A. Vopilov, and I. Shabalin, "The MP13 Approach to the KDD’99 Classifier Learning Contest," SIGKDD Explorations, vol. 1, pp. 76-77, 2000.
[10] I. Levin, "KDD-99 Classifier Learning Contest: LLSoft’s Results Overview," SIGKDD Explorations, vol. 1, pp. 67-75, 2000.
[11] J.-H. Lee, J.-H. Lee, S.-G. Sohn, J.-H. Ryu, and T.-M. Chung, "Effective Value of Decision Tree with KDD 99 Intrusion Detection Datasets for Intrusion Detection System," in 10th International Conference on Advanced Communication Technology, vol. 2, 2008, pp. 1170-1175.
[12] Z. Zhang, J. Li, C. N. Manikopoulos, J. Jorgenson, and J. Ucles, "HIDE: A Hierarchical Network Intrusion Detection System Using Statistical Preprocessing and Neural Network Classification," Proc. IEEE Workshop Information Assurance and Security, pp. 85-90, 2001.
[13] J. Cannady, "Artificial neural networks for misuse detection," in Proc. of the National Information Systems Security Conference, Arlington, VA, 1998.
[14] S. Mukkamala, G. Janoski, and A. Sung, "Intrusion detection using neural networks and support vector machines," in International Joint Conference on Neural Networks (IJCNN), vol. 2, IEEE, 2002, pp. 1702-1707.
[15] C. Jirapummin, N. Wattanapongsakorn, and P. Kanthamanon, "Hybrid neural networks for intrusion detection system," in Proc. of The 2002 International Technical Conference On Circuits/Systems, Computers and Communications, 2002.
[16] L. Portnoy, E. Eskin, and S. Stolfo, "Intrusion Detection with Unlabeled Data Using Clustering," Proc. ACM Workshop Data Mining Applied to Security (DMSA), 2001.
[17] H. Shah, J. Undercoffer, and A. Joshi, "Fuzzy Clustering for Intrusion Detection," Proc. 12th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE ’03), vol. 2, pp. 1274-1278, 2003.
[18] C. Kruegel, D. Mutz, W. Robertson, and F. Valeur, "Bayesian Event Classification for Intrusion Detection," Proc. 19th Annual Computer Security Applications Conference, pp. 14-23, 2003.
[19] T. Ambwani, "Multi class support vector machine implementation to intrusion detection," in Proc. of IJCNN, 2003, pp. 2300-2305.
[20] D. Song, M. I. Heywood, and A. N. Zincir-Heywood, "Training Genetic Programming on Half a Million Patterns: An Example from Anomaly Detection," IEEE Trans. Evolutionary Computation, vol. 9, pp. 225-239, 2005.
[21] W. Wang, X. H. Guan, and X. L. Zhang, "Modeling Program Behaviors by Hidden Markov Models for Intrusion Detection," Proc. International Conference on Machine Learning and Cybernetics, vol. 5, pp. 2830-2835, 2004.
[22] W. Lee and S. Stolfo, "A Framework for Constructing Features and Models for Intrusion Detection Systems," Information and System Security, vol. 4, pp. 227-261, 2000.
[23] K. K. Gupta, B. Nath, and R. Kotagiri, "Layered Approach using Conditional Random Fields for Intrusion Detection," IEEE Transactions on Dependable and Secure Computing, vol. 5, 2008.
[24] A. Zaknich, Neural Networks for Intelligent Signal Processing, Sydney: World Scientific Publishing, 2003.
[25] A. Zaknich, "Introduction to the modified probabilistic neural network for general signal processing applications," IEEE Transactions on Signal Processing, vol. 46, pp. 1980-1990, 1998.
[26] D. F. Specht, "A general regression neural network," IEEE Transactions on Neural Networks, vol. 2, pp. 568-576, 1991.
[27] R. E. Schapire, "A brief introduction to boosting," in Proc. of the Sixteenth International Joint Conference on Artificial Intelligence, San Francisco, CA, 1999, pp. 1401-1406.
[28] Y. Freund and R. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," Journal of Computer and System Sciences, vol. 55, pp. 119-139, 1997.
[29] J. Zhu, S. Rosset, H. Zhou, and T. Hastie, "Multiclass AdaBoost," The Annals of Applied Statistics, vol. 2, pp. 1290-1306, 2005.
[30] R. Kohavi and D. Wolpert, "Bias plus variance decomposition for zero-one loss functions," in Proc. of the International Conference on Machine Learning, Italy, 1996, pp. 275-283.
[31] D. F. Specht, "Probabilistic neural networks," Neural Networks, vol. 3, pp. 109-118, 1990.
[32] J. Huang, S. Ertekin, Y. Song, H. Zha, and C. L. Giles, "Efficient Multiclass Boosting Classification with Active Learning," ICDM, 2007.
[33] W. Fan, M. Miller, S. Stolfo, W. Lee, and P. Chan, "Using artificial anomalies to detect unknown and known network intrusions," Knowledge and Information Systems, vol. 6, pp. 507-527, 2004.
[34] R. Agarwal and M. V. Joshi, "PNrule: A New Framework for Learning Classifier Models in Data Mining (A Case-Study in Network Intrusion Detection)," 2000.
91
http://sites.google.com/site/ijcsis/ ISSN 1947-5500
(IJCSIS) International Journal of Computer Science and Information Security, Vol. 6, No. 1, 2009
Building a Vietnamese language query processing framework for e-library searching systems

Dang Tuan Nguyen, Ha Quy-Tinh Luong
Faculty of Computer Science
University of Information Technology, VNU-HCM
Ho Chi Minh City, Vietnam

Tuyen Thi-Thanh Do
Faculty of Software Engineering
University of Information Technology, VNU-HCM
Ho Chi Minh City, Vietnam
Abstract—With the objective of building intelligent search systems for e-libraries and online bookstores, we have proposed a search system model based on a Vietnamese language query processing component. Document search systems built on this model allow users to submit Vietnamese queries that express the content they are looking for, instead of entering keywords to be matched against specific fields in a database. To simplify the realization of systems based on this model, we set the target of building a framework that supports the rapid development of Vietnamese language query processing components. Such a framework lets the Vietnamese language query processing component of similar systems in this domain be implemented more easily.
Keywords—natural language processing; document retrieval; search engine.

I. INTRODUCTION

With the objective of building intelligent search systems for e-libraries and online bookstores, we have proposed a search system model based on a Vietnamese language query processing component. Document search systems built on this model allow users to submit Vietnamese queries that express the content they are looking for, instead of entering keywords to be matched against specific fields in a database. The model includes a restricted parser for analyzing a Vietnamese query, a transformer for transforming the syntactic structure of the query into its semantic representation, a generator for generating queries on a relational database from the semantic model, and an answer constructor. In fact, this search system model inherits the idea of our earlier document retrieval system, which supports users in posing English queries to search for e-books in the Gutenberg e-library [1], [2], [3], [4], [5], [6], [7], [8].

To simplify the realization of systems based on this model, we set the target of building a framework, VLQP, that supports the rapid development of Vietnamese language query processing components. Such a framework lets the Vietnamese language query processing component of similar systems in this domain be implemented more easily.

II. FRAMEWORK ARCHITECTURE

The VLQP framework has a two-tier architecture: a restricted parser that analyzes Vietnamese queries from users based on a class of predefined syntactic rules, and a transformer that transforms the syntactic structure of a query into its semantic representation. In brief, the main features of these components are:

- The parser analyzes the syntax of a Vietnamese query and outputs the syntactic components found in it; during analysis, the parts of speech and the sub-categories of these components are determined. The parser operates on a set of syntactic rules that covers the various forms of Vietnamese queries arising in the e-book searching application for e-libraries, and new syntactic rules can be added to the set to enrich it.

- The transformer relies on predefined transforming rules to map the syntactic structure of a Vietnamese query to its semantic representation. These rules are defined specifically for a determined application domain. A semantic representation model is also built to represent the semantics of all the query forms covered by the syntactic rules.

The architecture of the framework is illustrated in Figure 1.

Figure 1. Framework architecture

The VLQP framework is delivered as a complete Java package. The Vietnamese language query processing component of a search system built on VLQP takes Vietnamese queries as input and produces their semantic representations as output; the search system itself must add further components that process these semantic representations and return results to the user. A rough Java sketch of this two-tier flow follows.
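As an illustration of the two-tier design, the following Java sketch chains a restricted parser and a transformer. This is a minimal sketch under assumed names: RestrictedParser, SyntacticStructure, Transformer, SemanticRepresentation, and VlqpPipeline are all hypothetical, since the paper does not spell out the framework's actual API.

    // Hypothetical sketch of the two-tier VLQP pipeline; all names are
    // illustrative assumptions, not the framework's published API.
    interface SyntacticStructure { }         // output of tier 1 (the parser)
    interface SemanticRepresentation { }     // output of tier 2 (the transformer)

    interface RestrictedParser {
        // Analyzes a Vietnamese query against the predefined syntactic rules.
        SyntacticStructure parse(String vietnameseQuery);
    }

    interface Transformer {
        // Maps a syntactic structure to its semantic representation
        // by applying the predefined transforming rules.
        SemanticRepresentation transform(SyntacticStructure structure);
    }

    final class VlqpPipeline {
        private final RestrictedParser parser;
        private final Transformer transformer;

        VlqpPipeline(RestrictedParser parser, Transformer transformer) {
            this.parser = parser;
            this.transformer = transformer;
        }

        // Tier 1 parses the query; tier 2 transforms the result.
        SemanticRepresentation process(String vietnameseQuery) {
            return transformer.transform(parser.parse(vietnameseQuery));
        }
    }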
III. RESTRICTED PARSER

A. Description of syntactic rules

The parser is built to analyze the syntax of Vietnamese queries in a determined application domain. For example, consider the following query forms:

- Ai đã viết cuốn sách B vào năm 2000? (Who wrote book B in 2000?)
- Nhà xuất bản nào đã phát hành cuốn B trong năm 2008? (Which publisher published book B in 2008?)
- Sách B được tác giả A viết vào năm nào? (What year did author A write book B?)
- Trong năm 2009, tác giả A có viết sách nào thuộc chủ đề T không? (In 2009, did author A write any book on subject T?)

The syntax of such Vietnamese question forms can be described in BNF (Backus–Naur Form) notation. The set of syntactic rules contains about 60 forms of Vietnamese queries involving titles, authors, years of publication, publishers, subjects, and so on. For example, the following query is analyzed into syntactic components:

- S1 := Tác giả A có viết sách B vào năm 2008 không? (S1 := Did author A write book B in 2008?)

In this query, the words “có” and “không” are interrogative words. The query is analyzed into the following components:

- author: tác giả A (author A)
- interrogative1: có
- verb_write: viết (write)
- book: sách B (book B)
- adverbial phrase of time (APT): vào năm 2008 (in 2008)
- interrogative2: không

In BNF notation, this query form is represented as:

S1_BNF := <author> [<interrogative1>] <verb_write> <book> [<APT>] [<interrogative2>] “?”
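To make the analysis above concrete, the following Java sketch holds the six components identified for S1 in one small container class. The class and its field names are illustrative only; they mirror the component names in the text rather than any type actually shipped with the framework.

    // Illustrative container for the syntactic components of query S1;
    // the class itself is hypothetical, not part of the published framework.
    final class S1Components {
        final String author;          // "tác giả A"
        final String interrogative1;  // "có"
        final String verbWrite;       // "viết"
        final String book;            // "sách B"
        final String apt;             // adverbial phrase of time: "vào năm 2008"
        final String interrogative2;  // "không"

        S1Components(String author, String interrogative1, String verbWrite,
                     String book, String apt, String interrogative2) {
            this.author = author;
            this.interrogative1 = interrogative1;
            this.verbWrite = verbWrite;
            this.book = book;
            this.apt = apt;
            this.interrogative2 = interrogative2;
        }

        // The analysis of S1 from the text, captured as data.
        static S1Components exampleS1() {
            return new S1Components("tác giả A", "có", "viết",
                                    "sách B", "vào năm 2008", "không");
        }
    }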
B. Syntactic rules

The parser works on a set of predefined syntactic rules. Table 1 presents the full list of syntactic rules, in BNF form, included in VLQP framework version 1.0; before the table, the sketch below shows one possible encoding of such rules.
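The following Java sketch shows one possible way a BNF rule from Table 1 could be encoded and matched. The RuleToken/SyntacticRule encoding and the greedy matching strategy are assumptions for illustration, not the framework's actual implementation.

    import java.util.List;

    // A rule token is a terminal or category name; bracketed tokens
    // such as [<APT>] in Table 1 are optional.
    final class RuleToken {
        final String category;   // e.g. "what_author", "prep_time", "APT"
        final boolean optional;

        RuleToken(String category, boolean optional) {
            this.category = category;
            this.optional = optional;
        }
    }

    final class SyntacticRule {
        final List<RuleToken> tokens;

        SyntacticRule(List<RuleToken> tokens) {
            this.tokens = tokens;
        }

        // Greedy left-to-right match of a tagged query against the rule:
        // every required token must match the next tag in sequence,
        // while optional tokens may be skipped.
        boolean matches(List<String> tags) {
            int i = 0;
            for (RuleToken token : tokens) {
                if (i < tags.size() && tags.get(i).equals(token.category)) {
                    i++;
                } else if (!token.optional) {
                    return false;    // a required token is missing
                }
            }
            return i == tags.size(); // all tags of the query were consumed
        }
    }

For example, the S1_BNF rule of Section III.A could be encoded as:

    SyntacticRule s1Rule = new SyntacticRule(List.of(
        new RuleToken("author", false),
        new RuleToken("interrogative1", true),
        new RuleToken("verb_write", false),
        new RuleToken("book", false),
        new RuleToken("APT", true),
        new RuleToken("interrogative2", true),
        new RuleToken("?", false)));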
TABLE 1. SYNTACTIC RULES

No | Syntactic rule
1  | = <what_author> [] [] {[] } [] “?”
2  | = [] [“,”] <what_author> [] [] {[] } “?”
3  | = {[] } [] <what_author> [] “?”
4  | = [] [“,”] {[] } [] <what_author> “?”
5  | = [] [<possessive>] {[] } [] “?”
6  | = [] [<possessive>] {[] } [] “?”
7  | = [] [<possessive>] {[] } [] “?”
8  | = [] [] [] {[] } [] [] “?”
9  | = [] [“,”] [] {[] } [] [] “?”
10 | = [] [] {[] } [<prep_time>] <what_time> “?”
11 | = {[] } [] [<prep_time>] <what_time> “?”
12 | = <what_publisher> [] [] {[] } [] “?”
13 | = [] [“,”] <what_publisher> [] [] {[] } “?”
14 | = {[] } [] <what_publisher> [] “?”
15 | = [] [“,”] {[] } [] <what_publisher> “?”
16 | = [] [] [] {[] } [] [] “?”
17 | = [] [“,”] [] [] [] {[] } [] “?”
18 | = [] {[] } [] [] [] “?”
19 | = [] [“,”] [] {[] } [] [] “?”
20 | = [] [] {[] } [<prep_time>] <what_time> “?”
21 | = [<prep_time>] <what_time> [] []
…  | … <what_subject> “?”
35 | = [] [<interrogative1>] [] <what_subject> [] “?”
36 | = [] [] [] <what_subject> [] “?”
37 | = [] [] [] [] <what_subject> “?”
38 | = [] [] [] <what_subject> [] “?”
39 | = [] [] [] [] <what_subject> “?”
40 | = [<plural>] [<book_type>] [<subject>] [] [] “?”
41 | = [] [“,”] [<plural>] [<book_type>] [<subject>] [] [<interrogative4>] “?”
42 | = [<plural>] [<book_type>] [<subject>] [] [] “?”
43 | = [] [“,”] [<plural>] [<book_type>] [<subject>] [] “?”