International Journal of Computer Science Issues
Volume 3, August 2009 ISSN (Online): 1694-0784 ISSN (Printed): 1694-0814
© IJCSI PUBLICATION 2009 www.IJCSI.org
EDITORIAL
There are several journals in the areas of computer science, each with different policies. IJCSI is among the few that believe that giving free access to scientific results will help advance computer science research and help fellow scientists.
IJCSI takes particular care in ensuring wide dissemination of its authors’ works. Apart from being indexed in several databases (Google Scholar, DOAJ, CiteSeerX, etc.), IJCSI makes articles available for free download to increase their chance of being cited. Furthermore, unlike most journals, IJCSI sends a printed copy of its issue to the concerned authors free of charge, irrespective of geographic location.
The IJCSI Editorial Board is pleased to present IJCSI Volume Three (IJCSI Vol. 3, 2009). The paper acceptance rate for this issue is 37.5%, set after all submitted papers had been received, together with important comments and recommendations from our reviewers.
We sincerely hope you will find important ideas, concepts, techniques, or results in this special issue.
As final words, PUBLISH, GET CITED and MAKE AN IMPACT.
IJCSI Editorial Board August 2009 www.ijcsi.org
IJCSI EDITORIAL BOARD
Dr Tristan Vanrullen, Chief Editor
LPL, Laboratoire Parole et Langage - CNRS - Aix en Provence, France
LABRI, Laboratoire Bordelais de Recherche en Informatique - INRIA - Bordeaux, France
LEEE, Laboratoire d'Esthétique et Expérimentations de l'Espace - Université d'Auvergne, France

Dr Mokhtar Beldjehem, Professor
Sainte-Anne University, Halifax, NS, Canada

Dr Pascal Chatonnay, Assistant Professor (Maître de Conférences)
Université de Franche-Comté (University of Franche-Comté)
Laboratoire d'informatique de l'université de Franche-Comté (Computer Science Laboratory of the University of Franche-Comté)

Prof N. Jaisankar
School of Computing Sciences, VIT University, Vellore, Tamilnadu, India
IJCSI REVIEWERS COMMITTEE
• Mr. Markus Schatten, University of Zagreb, Faculty of Organization and Informatics, Croatia
• Mr. Forrest Sheng Bao, Texas Tech University, USA
• Mr. Vassilis Papataxiarhis, Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Panepistimiopolis, Ilissia, GR-15784, Athens, Greece
• Dr Modestos Stavrakis, University of the Aegean, Greece
• Prof Dr. Mohamed Abdelall Ibrahim, Faculty of Engineering, Alexandria University, Egypt
• Dr Fadi KHALIL, LAAS-CNRS Laboratory, France
• Dr Dimitar Trajanov, Faculty of Electrical Engineering and Information Technologies, Ss. Cyril and Methodius University - Skopje, Macedonia
• Dr Jinping Yuan, College of Information System and Management, National Univ. of Defense Tech., China
• Dr Alexios Lazanas, Ministry of Education, Greece
• Dr Stavroula Mougiakakou, University of Bern, ARTORG Center for Biomedical Engineering Research, Switzerland
• Dr DE RUNZ, CReSTIC-SIC, IUT de Reims, University of Reims, France
• Mr. Pramodkumar P. Gupta, Dept of Bioinformatics, Dr D Y Patil University, India
• Dr Alireza Fereidunian, School of ECE, University of Tehran, Iran
• Mr. Fred Viezens, Otto-Von-Guericke-University Magdeburg, Germany
• Mr. J. Caleb Goodwin, University of Texas at Houston: Health Science Center, USA
• Dr. Richard G. Bush, Lawrence Technological University, United States
• Dr. Ola Osunkoya, Information Security Architect, USA
• Mr. Kotsokostas N.Antonios, TEI Piraeus, Hellas
• Prof Steven Totosy de Zepetnek, U of Halle-Wittenberg & Purdue U & National Sun Yat-sen U, Germany, USA, Taiwan
• Mr. M Arif Siddiqui, Najran University, Saudi Arabia
• Ms. Ilknur Icke, The Graduate Center, City University of New York, USA
• Prof Miroslav Baca, Associated Professor/Faculty of Organization and Informatics/University of Zagreb, Croatia
• Dr. Elvia Ruiz Beltrán, Instituto Tecnológico de Aguascalientes, Mexico
• Mr. Moustafa Banbouk, Engineer du Telecom, UAE
• Mr. Kevin P. Monaghan, Wayne State University, Detroit, Michigan, USA
• Ms. Moira Stephens, University of Sydney, Australia
• Ms. Maryam Feily, National Advanced IPv6 Centre of Excellence (NAV6), Universiti Sains Malaysia (USM), Malaysia
• Dr. Constantine YIALOURIS, Informatics Laboratory Agricultural University of Athens, Greece
• Dr. Sherif Edris Ahmed, Ain Shams University, Fac. of agriculture, Dept. of Genetics, Egypt
• Mr. Barrington Stewart, Center for Regional & Tourism Research, Denmark
• Mrs. Angeles Abella, U. de Montreal, Canada
• Dr. Patrizio Arrigo, CNR ISMAC, Italy
• Mr. Anirban Mukhopadhyay, B.P.Poddar Institute of Management & Technology, India
• Mr. Dinesh Kumar, DAV Institute of Engineering & Technology, India
• Mr. Jorge L. Hernandez-Ardieta, INDRA SISTEMAS / University Carlos III of Madrid, Spain
• Mr. AliReza Shahrestani, University of Malaya (UM), National Advanced IPv6 Centre of Excellence (NAv6), Malaysia
• Mr. Blagoj Ristevski, Faculty of Administration and Information Systems Management - Bitola, Republic of Macedonia
• Mr. Mauricio Egidio Cantão, Department of Computer Science / University of São Paulo, Brazil
• Mr. Thaddeus M. Carvajal, Trinity University of Asia - St Luke's College of Nursing, Philippines
• Mr. Jules Ruis, Fractal Consultancy, The Netherlands
• Mr. Mohammad Iftekhar Husain, University at Buffalo, USA
• Dr. Deepak Laxmi Narasimha, VIT University, India
• Dr. Paola Di Maio, DMEM University of Strathclyde, UK
• Dr. Bhanu Pratap Singh, Institute of Instrumentation Engineering, Kurukshetra University Kurukshetra, India
• Mr. Sana Ullah, Inha University, South Korea
• Mr. Cornelis Pieter Pieters, Condast, The Netherlands
• Dr. Amogh Kavimandan, The MathWorks Inc., USA
• Dr. Zhinan Zhou, Samsung Telecommunications America, USA
• Mr. Alberto de Santos Sierra, Universidad Politécnica de Madrid, Spain
• Dr. Md. Atiqur Rahman Ahad, Department of Applied Physics, Electronics & Communication Engineering (APECE), University of Dhaka, Bangladesh
• Dr. Charalampos Bratsas, Lab of Medical Informatics, Medical Faculty, Aristotle University, Thessaloniki, Greece
• Ms. Alexia Dini Kounoudes, Cyprus University of Technology, Cyprus
• Mr. Anthony Gesase, University of Dar es salaam Computing Centre, Tanzania
• Dr. Jorge A. Ruiz-Vanoye, Universidad Juárez Autónoma de Tabasco, Mexico
• Dr. Alejandro Fuentes Penna, Universidad Popular Autónoma del Estado de Puebla, México
• Dr. Ocotlán Díaz-Parra, Universidad Juárez Autónoma de Tabasco, México
• Mrs. Nantia Iakovidou, Aristotle University of Thessaloniki, Greece
• Mr. Vinay Chopra, DAV Institute of Engineering & Technology, Jalandhar
• Ms. Carmen Lastres, Universidad Politécnica de Madrid - Centre for Smart Environments, Spain
• Dr. Sanja Lazarova-Molnar, United Arab Emirates University, UAE
• Mr. Srikrishna Nudurumati, Imaging & Printing Group R&D Hub, Hewlett-Packard, India
• Dr. Olivier Nocent, CReSTIC/SIC, University of Reims, France
• Mr. Burak Cizmeci, Isik University, Turkey
• Dr. Carlos Jaime Barrios Hernandez, LIG (Laboratory Of Informatics of Grenoble), France
• Mr. Md. Rabiul Islam, Rajshahi university of Engineering & Technology (RUET), Bangladesh
• Dr. LAKHOUA Mohamed Najeh, ISSAT - Laboratory of Analysis and Control of Systems, Tunisia
• Dr. Alessandro Lavacchi, Department of Chemistry - University of Firenze, Italy
• Mr. Mungwe, University of Oldenburg, Germany
• Mr. Somnath Tagore, Dr D Y Patil University, India
• Mr. Nehinbe Joshua, University of Essex, Colchester, Essex, UK
• Ms. Xueqin Wang, ATCS, USA
• Dr. Borislav D Dimitrov, Department of General Practice, Royal College of Surgeons in Ireland, Dublin, Ireland
• Dr. Fondjo Fotou Franklin, Langston University, USA
• Mr. Haytham Mohtasseb, Department of Computing - University of Lincoln, United Kingdom
• Dr. Vishal Goyal, Department of Computer Science, Punjabi University, Patiala, India
• Mr. Thomas J. Clancy, ACM, United States
• Dr. Ahmed Nabih Zaki Rashed, Dr. in Electronic Engineering, Faculty of Electronic Engineering, Menouf 32951, Electronics and Electrical Communication Engineering Department, Menoufia University, Egypt
• Dr. Rushed Kanawati, LIPN, France
• Mr. Koteshwar Rao, K G Reddy College of Engg. & Tech, Chilkur, RR Dist., AP, India
• Mr. M. Nagesh Kumar, Department of Electronics and Communication, J.S.S. Research Foundation, Mysore University, Mysore-6, India
• Dr. Babu A Manjasetty, Research & Industry Incubation Center, Dayananda Sagar Institutions, India
• Mr. Saqib Saeed, University of Siegen, Germany
• Dr. Ibrahim Noha, Grenoble Informatics Laboratory, France
• Mr. Muhammad Yasir Qadri, University of Essex, UK
TABLE OF CONTENTS

1. Pharmaco-Cybernetics as an Interactive Component of Pharma-Culture: Empowering Drug Knowledge through User-, Experience- and Activity-Centered Designs
Kevin Yi-Lwern YAP, Xuejin CHUANG, Alvin Jun Ming LEE, Raemarie Zejin LEE, Lijuan LIM, Jeanette Jiahui LIM, Ranasinghe NIMESHA, NM5206 Project Team, Communications and New Media Programme, Faculty of Arts & Social Sciences, National University of Singapore

2. Similarity Matching Techniques For Fault Diagnosis In Automotive Infotainment Electronics
Mashud Kabir, Department of Computer Science, University of Tuebingen, D-72027 Tuebingen, Germany

3. Prototype System for Retrieval of Remote Sensing Images based on Color Moment and Gray Level Co-Occurrence Matrix
Priti Maheshwary and Namita Sricastava, Department of Mathematics, Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, India

4. Performing Hybrid Recommendation in Intermodal Transportation – the FTMarket System’s Recommendation Module
Alexis Lazanas, Industrial Management and Information Systems Lab, University of Patras, Rion Patras, 26500, Greece

5. Geometric and Signal Strength Dilution of Precision (DoP) Wi-Fi
Soumaya Zirari, Philippe Canalda and François Spies, Computer Science Laboratory of the University of Franche-Comté, France

6. Implementation of Rule Based Algorithm for Sandhi-Vicheda Of Compound Hindi Words
Priyanka Gupta and Vishal Goyal, Department of Computer Science, Punjabi University Patiala
IJCSI International Journal of Computer Science Issues, Vol. 3, 2009 ISSN (Online): 1694-0784 ISSN (Printed): 1694-0814
Pharmaco-Cybernetics as an Interactive Component of Pharma-Culture: Empowering Drug Knowledge through User-, Experience- and Activity-Centered Designs

Kevin Yi-Lwern YAP1,2, Xuejin CHUANG2, Alvin Jun Ming LEE2, Raemarie Zejin LEE2, Lijuan LIM2, Jeanette Jiahui LIM2 and Ranasinghe NIMESHA2

1 Department of Pharmacy, Faculty of Science, National University of Singapore, Block S4, 18 Science Drive 4, Singapore 117543, Singapore
[email protected]

2 NM5206 Project Team, Communications and New Media Programme, Faculty of Arts & Social Sciences, National University of Singapore
Abstract
The advent of the World Wide Web (WWW) has led to the creation of many web publishing platforms. Patients are becoming better informed through drug and health-related information on the internet. The integration of interactive media technologies and the WWW provides an opportunity to improve the pharmaceutical care of patients on anticoagulant therapy. In this paper, the concept of ‘pharmaco-cybernetics’ is introduced through the creation of an interactive tool consisting of a pill-catching game and a hangman game, designed to enable users to learn about warfarin tablet strengths and drug interactions, based on user-centered (UCD), experience-centered (ECD), and activity-centered design (ACD) approaches. Currently, this tool is largely based on UCD and ECD. However, incorporating the ACD approach into the tool’s design is an attractive prospect. Pharmaco-cybernetics can empower patients with the appropriate knowledge regarding their therapy so that they can better participate in the management of their health.
Key words: Drug Information, Interactive Games, Pharmaco-Cybernetics, User Interaction, Warfarin.
1. Introduction
Anticoagulation therapy involves the use of drugs to help prevent and treat blood clots in the arteries or veins. Anticoagulants, also known as ‘blood thinners’, work in various ways to inhibit blood-clotting factors in the body. Warfarin is an oral anticoagulant which works by blocking the action of vitamin K in the liver. It is usually prescribed for people with certain types of cardiovascular conditions or those suffering from deep vein thrombosis [1]. Patients on warfarin therapy are usually treated for a period ranging from a few months to long-term chronic therapy. The dose of warfarin taken by the patient is adjusted according to the results of a blood test known as the International Normalized Ratio (INR), which is a measure of how long a patient’s blood takes to clot. An INR above the set target means that the patient is at a higher risk of bleeding, while an INR below it means a higher risk of clotting. Thus, the dose of warfarin has to be individualized according to the patient’s response to the drug.

Warfarin comes in many brands. Patients are advised not to switch among brands, as different brands have slightly different efficacy. In Singapore, the brand Marevan® is used, and it comes as tablets in three strengths, each identified by its color: 1 mg (brown), 3 mg (blue) and 5 mg (pink). Patients on warfarin therapy may need adjustment of their dosages until their INR stabilizes, and this may be confusing for some patients, especially during the initial stages. Hence, it is important to educate them to recognize the tablets which they are taking and to remember the dosages of their therapy. It is easier for patients to remember the dosage if they can correlate it with the strength of the tablets, which in turn can be identified by their colors. Warfarin also has many drug interactions. In the broad sense used in this paper, these include interactions with other medicines, nutritional supplements, traditional herbs, and foods which are rich in vitamin K. It is prudent that patients on warfarin therapy also know some of its common interactions so that they can adapt to any changes in their dietary habits and lifestyles.

In traditional medical practice, healthcare professionals have always played active roles in the care of patients. For example, doctors tell their patients what is wrong and how to get better, and pharmacists counsel patients with regard to their medications. For warfarin therapy, patients currently attend a pharmacist-run clinic for counseling, where they are educated about the drug itself and how to recognize and manage signs and symptoms of adverse effects and drug interactions. In addition, they are given supplementary materials such as pamphlets as part of their education. However, the patients’ understanding of warfarin therapy is limited by the duration of each counseling session and the frequency with which they revisit the clinic for follow-up. Thus, their knowledge of warfarin may be limited, particularly for those who are on this medication for the first time. The lack of knowledge or misinterpretation of information about the drug or its
use can affect their compliance with their medication, which may consequently lead to the patients suffering from drug-related problems (DRPs) such as under- or overdosing, or potential drug-drug, drug-food or drug-herb interactions [2].

Human-computer interaction (HCI) has become a norm in society. The roles of patients and healthcare professionals have evolved with the information age. Internet and informatics technologies brought about by the cyber era have been critical in transforming the public’s attitudes towards healthcare and medicine. The interface between HCI and health services has led to the birth of medical informatics, which aims to develop studies and instruments to solve clinical issues in the practical setting [3]. Its ultimate goal is to improve the healthcare of patients. As such, genetic, social, economic and environmental factors, as well as issues from the cognitive, emotional and behavioral domains, can also play a role [4].

The emergence of the World Wide Web (WWW) is one of the most significant developments in the history of the internet [5]. The internet is rapidly gaining importance not just for healthcare professionals, but for patients as well. While healthcare professionals access information on the internet to help them make decisions regarding patient care, patients are also becoming better informed about their health and health-related issues through the information which they can get over the internet. Patients are now just as likely to be able to highlight the risks, various therapies and available treatments to their healthcare providers [6]. As traditional therapy is being translated to the internet, laypeople are now more aware of their health and are better able to understand the science behind various illnesses through information they get from the WWW.
Despite the uncertainty as to whether cybermedicine will ever be comparable to non-cybermedicine [7], the WWW has nevertheless changed the way healthcare is practiced today. The challenge is for both healthcare professionals and patients to critically evaluate the vast amounts of available information so as to provide the best care for the patients’ well-being.
1.1 The Roles of the Internet and Interactive Media in Healthcare
The traditional role of media in healthcare has involved the use of audio and video programs in public health education, such as with psychiatric diseases, cancer and smoking. Film and photography were used as forms of ‘Edutainment’ (an Education-Entertainment strategy) to address the stigma of people experiencing depression [8] and schizophrenia [9], while the American Cancer Society leveraged the use of movies as an educational tool for the public on cancer in the 1920s [10]. In fact, popular Hollywood films in the 1930s to 1970s also used this strategy to portray some cancers as being more ‘favorable’ since they were more photogenic and less offensive [11]. Furthermore, a recent trial also showed the usefulness of digital media in improving the knowledge and awareness of prostate cancer screening among African-American men [12]. However, the two most pressing health-related issues involving digital media at present are its effects on views and attitudes towards sexuality [13] and on smoking among youths [14,15].

In recent years, the internet has become a very popular HCI tool in a person’s daily life. It is not uncommon nowadays for patients to search for health-related information online. The World Wide Web Consortium (W3C) [16] and the Internet Engineering Task Force (IETF) [17] have not only provided common standards for data, information and software applications for the WWW, but also encouraged users to discuss various internet-related operational and technical problems. Users can now navigate through a vast and complex web of linked computer documents through an inexpensive, easy-to-use, cross-platform, graphic interface which supports items like buttons, scroll lists, tables and pop-up menus for user interaction.

However, the current interest in healthcare centers not only on the use of IT and the WWW, but also on the integration of interactive media technologies. Interactive media not only establishes two-way communication among its users, but allows active participation as well. An opportunity exists for web users to gain information and knowledge in a more interesting manner. Internet interactivity can exist in both digital and multimedia forms, and is most commonly represented by means of text, audio, video, graphics, images and animation [18]. As long as one has the hardware, software, talent and skills for developing an interactive application, it can be mounted on the WWW through inexpensive browsers.
1.2 Animation as an Interactive Tool in Healthcare
Animations have always been promoted as a way to showcase the dynamics of user interface actions. People encounter animations frequently since they have been used for various purposes, particularly in web pages and online advertisements. Animations are useful for presenting highly abstract or dynamic processes, or when the user is involved in an action or process [19]. It is known that user satisfaction with animations is usually quite high, unless they distract the user from focusing on key issues [20]. The applications of animation are widespread, normally involving the entertainment and advertising industries. However, this form of interactivity is also getting more widely accepted in the healthcare world. There are many examples of animation applications in the medical sciences, such as in medicine and dentistry [21],
orthopedics [22,23], and aesthetic surgery [24,25]. A virtual human simulation using a 3D phantom was developed by Oak Ridge National Laboratory and its collaborators [26] at the beginning of the century as a computer representation of the human anatomy. Animated films can also be used in the field of psychology for teaching purposes, such as characterizing personality types. An example can be drawn from the animated film ‘Who Framed Roger Rabbit’ [27], in which Roger exhibits a whole range of personality traits, from being extroverted and aggressive to being insecure and anxious. However, film animation is only one of the animation techniques that can be used in the health sciences.

Advancements in computer technology have revolutionized the way healthcare is practiced. As computers become more affordable and newer technologies emerge, the traditional animation techniques of tweening and morphing have been transformed into computerized versions created with two- (2D) and three-dimensional (3D) bitmap and vector graphics. The development of the WWW has led to the creation of many web publishing platforms, including HyperText Markup Language (HTML) and its variants, Java applets, Flash and Shockwave, among others. Web technologies have also enabled the generation of other forms of web pages such as Hypertext Preprocessor (PHP) and Active Server Pages (ASP). HTML has been the well-known standard format for publishing content on the WWW, but its limitation lies in the management of interactive and animated content. However, the WWW has now managed to successfully integrate Flash technology for this purpose because it does not suffer from cross-platform and cross-browser compatibility problems, and the ‘Flash everywhere’ phenomenon is getting very popular with website developers [28]. Websites can now be created using a combination of HTML and Flash, or created entirely in Flash.
A recent small-scale usability study by Piyasirivej reported that users generally enjoy Flash sites more than HTML sites [28]. Examples of interactive healthcare applications include the ‘Virtual Knee Surgery’ and ‘Choose the Prosthetic’ games developed by Edheads + COSI, in which the user takes on the role of a virtual surgeon to diagnose knee replacement patients and carry out a total knee replacement surgery [29]. However, despite the attractiveness of such technologies in the various areas of healthcare, their progress in the pharmaceutical arena is still slow.
1.3 Pharmaco-Cybernetics as Part of Pharma-Culture
The objectives, roles and added value of clinical pharmacists have long been under debate. Nevertheless, many organizations such as the World Health Organization (WHO) and the Nuffield Foundation
have recognized pharmacists as essential health care providers [30]. The practice of pharmaceutical care forms the cornerstone of clinical pharmacy, and its concept revolves around identifying, solving and preventing drug-related problems (DRPs) with regard to a patient’s drug therapy [31]. Although this area has significantly contributed to new approaches in pharmacy education, several ‘driving forces’ that will impact the value of pharmacists have been identified [30]. These include: (a) improved care and protection for patients, especially the chronically ill or those with particular types of diseases (e.g. acquired immune deficiency syndrome or AIDS); (b) training new pharmacy professionals to be more patient-oriented; and (c) the need for advanced pharmaceutical expertise and new skills to keep up with accelerated information technology so as to be able to manage new treatments.

Pharmaco-cybernetics is an upcoming area of pharmacy which involves advanced skills and expertise to deal with HCI concepts and technologies in relation to medicines and drugs. The term ‘pharmaco’ is derived from the Greek term ‘pharmakon’ meaning drugs or poisons [32], and ‘cybernetics’ comes from the Greek term ‘kubernetes’, which can be translated to mean ‘the art of steering’ [33,34]. The term ‘cybernetics’ was originally defined by Norbert Wiener, in his book of the same title, as the science or study of ‘control and communication in the animal and the machine’ [33-35]. Described by the American Society for Cybernetics (ASC) as the design, discovery and application of principles of regulation and communication [35], it is a multi-disciplinary area which has been applied to many fields such as system theory, psychology, anthropology, sociology, and more recently, biology, engineering and computer science [34]. The single characteristic that defines a cybernetic system is the relationship between endogenous goals and the external environment [36].
In fact, this was not a new concept in healthcare: it was already applied in the 1970s by Maltz as a means of setting goals of positive outcomes for his patients who were not satisfied with their plastic surgery procedures [37]. However, the traditional concept of cybernetics has evolved into a modern theory known as ‘new cybernetics’ or ‘second-order cybernetics’, in which information is viewed as constructed and reconstructed by individuals interacting with the environment [38,39]. This means that the system is not only dependent on the observer or person interacting with it, but also links the individual with society as a whole. The science of cybernetics further led to the term ‘cyberspace’ being coined by Gibson in his famous book Neuromancer, which identified a virtual representation of information in varying states of accessibility, linked to various people and organizations [40-42]. A similar concept was brought up in the movie ‘The Matrix’ and its sequels
in which Neo, a computer programmer, lives in a future world that humans perceive as reality but that is actually a simulated matrix created by sentient machines to subdue the human race [43]. The term ‘cyberspace’ is now ubiquitously used to describe anything which is associated with computers, information technology, and the internet. It also incorporates the elements of social experiences and interaction of individuals through the exchange of ideas and the sharing of information [44]. Thus, ‘pharmaco-cybernetics’ or ‘pharma-cybernetics’ aptly describes the science of dealing with medicines or drugs through applications of HCI concepts and technologies so as to reduce or prevent DRPs and, ultimately, improve pharmaceutical care in patients. It involves communication and feedback with the users, and connects control (i.e. actions taken in the hope of achieving goals) with communication (i.e. the flow of drug information and knowledge between the user and the cybernetic system or environment).

In this paper, we attempt to introduce the concept of ‘pharmaco-cybernetics’ through the creation of a simple interactive tool aimed at improving the knowledge of users on anticoagulation therapy. In particular, two prototype games which are targeted at students in the pharmaceutical sciences and patients on warfarin therapy will be discussed. Ten web animation principles [45], as well as the user- (UCD), experience- (ECD) and activity-centered design (ACD) approaches which can be considered in the design of pharmaco-cybernetic systems, will also be elaborated through a critique of the tool based on a pilot usability survey. Due to space constraints, only important concepts related to the design frameworks will be discussed. The reader is referred to Appendices 1, 2 and 3 for more detailed application summaries.
2. Creation and Evaluation of WarfarINT
The WarfarINT interactive tool was created as an information resource for patients, students and members of the general public who are interested in learning about anticoagulation therapy. WarfarINT stands for ‘Warfarin INTeractive’, and consists of 2 games (Fig. 1) which provide the interactive component for users. The first is a pill-catching game in which users have to catch different colored warfarin tablets dropping from the top of the screen by moving a pill bag horizontally with their mouse. Their scores are correlated with the strengths of the tablets that are caught, which in turn are reflected by the different colors. The second is a hangman game in which users have to guess a drug, food or herb that interacts with warfarin. The objectives of this tool are to enable users, in an interesting manner, to correlate the tablet colors with their strengths and to learn which drugs, herbs or foods interact with warfarin.
Fig. 1 Screenshots of the interaction tool which consists of 2 games: (a) Warfarin Game, and (b) Warfarin Hangman.
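The scoring rule of the pill-catching game can be illustrated with a minimal Python sketch. It is not the original implementation; it only assumes what the paper states, namely the Marevan color-strength convention (1 mg brown, 3 mg blue, 5 mg pink) and that each caught tablet awards points equal to its strength. The function names and the horizontal hit test are hypothetical.

```python
# Strength (mg) of each Marevan tablet colour, as given in the Introduction.
TABLET_STRENGTH_MG = {"brown": 1, "blue": 3, "pink": 5}


def score_catches(caught_colours):
    """Total score: each caught tablet is worth its strength in mg."""
    return sum(TABLET_STRENGTH_MG[colour] for colour in caught_colours)


def bag_catches_tablet(bag_x, bag_width, tablet_x):
    """Hypothetical hit test: the tablet falls within the bag's horizontal span."""
    return bag_x <= tablet_x <= bag_x + bag_width


if __name__ == "__main__":
    # One brown, one blue and two pink tablets: 1 + 3 + 5 + 5 = 14 points.
    print(score_catches(["brown", "pink", "blue", "pink"]))
```

Tying the score directly to the strength table means that a player who aims for high scores is implicitly rehearsing the color-strength association that the tool sets out to teach.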
A pilot usability study was also carried out on a group of pharmaceutical science students at a local educational institution to evaluate how well the interactive tool helped in improving their knowledge of the anticoagulant drug. Participants were given 15 minutes to answer a questionnaire which consisted of questions categorized into 3 parts: (a) user demographics, (b) general knowledge and views on anticoagulation therapy and online interaction tools, and (c) feedback and experiences on using the interactive tool (warfarin games). A fifth of the time (3 minutes) was dedicated to playing the games. The results were then evaluated based on descriptive statistics and participants’ responses. A total of 25 participants were recruited in the study, with a response rate of 92% (23 of 25). Two responses were excluded from analysis due to incomplete submissions. The mean age of the respondents was 19.7 ± 0.8 years, and the majority were female (87%). All respondents had heard of warfarin before participating in the study, but did not know about its tablet strengths and interactions.
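The figures reported for the pilot study can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the helper names are hypothetical, and the last check assumes (the paper does not say so explicitly) that the 87% figure was computed over 23 respondents.

```python
from fractions import Fraction

RECRUITED = 25
RESPONSE_RATE = Fraction(92, 100)
SESSION_MINUTES = 15
PLAY_SHARE = Fraction(1, 5)


def respondents(recruited, rate):
    """Number of respondents implied by the stated response rate."""
    return recruited * rate


def counts_matching_percentage(n, pct):
    """Whole-number counts out of n that round to the reported percentage."""
    return [k for k in range(n + 1) if round(100 * k / n) == pct]


if __name__ == "__main__":
    print(respondents(RECRUITED, RESPONSE_RATE))  # 92% of 25 participants
    print(SESSION_MINUTES * PLAY_SHARE)           # minutes spent on the games
    # Assumption: the 87% female share refers to 23 respondents.
    print(counts_matching_percentage(23, 87))
```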
3. Human-Computer Interaction Frameworks in Pharmaco-Cybernetics
3.1 The User-Centered Design (UCD) Approach
User-centered design (UCD) is a broad term used to describe design processes in which end-users play a role in influencing how a product’s design takes shape. Users are
placed at the center of the design process throughout the planning, creation and development phases of the product. The concepts of visibility, mapping and feedback play crucial roles in the UCD approach [46]. The principle of visibility states that the user should be able to figure out the use of a product based on the visibility of its components. In other words, the product’s parts or components should convey a correct message regarding its usage [46]. This can be correlated with the animation principles proposed by Weir and Heeps (Appendix 1) [45]. The product, in this case the tool consisting of the games, should not distract users’ attention from salient information, but rather convey its intended message. Users should be drawn to the essential features of the animation so that they can focus on the relevant aspects. The graphical user interfaces (GUIs) of the tool (Fig. 2) are located in the middle of the webpages so that the user’s attention will be focused on the games. The white backgrounds of the webpages are meant to contrast with the backgrounds of the games, and the titles of the games are kept simple and self-explanatory so that first-time users know what to expect of the tool.
Fig. 2 Graphical user interfaces of the (a) warfarin pill-catching and (b) hangman games.
In addition, visibility was demonstrated in the games through short and concise instructions to users on what the games entail and how to play:
“Collect as many warfarin tablets as you can! Move your mouse to shift the pill bag left and right. Each tablet color awards you points equivalent to its strength.” – Instructions of the pill-catching game.
“Choose a letter by clicking on it… The letter changes to green if your guess is correct, and red if your guess is wrong.” – Instructions of the hangman game.
The use of ‘backup’ text to provide additional details can help users understand the rationale of the animation better, provided it is used sparingly. Animations combined with text and sound can reduce the likelihood of ambiguity in interpretation by the user. However, when used inappropriately, they may cause distraction and cognitive overload. Besides textual information, sounds can also help resolve ambiguity and provide feedback to the users regarding certain results. However, they should only be used to enhance the purpose of the animation. When used inappropriately, sounds can confuse users instead of enhancing their information-retrieval experience. In the pill-catching game, users hear a ‘boing’ when they manage to catch a tablet, but if they miss, a ‘splash’ is heard instead. This enables the users to discriminate between a score and a miss, which is important since users strive to hear more ‘boings’ than ‘splashes’ to gain higher scores.
The use of appropriate colors and adherence to color conventions are also important for visibility of the product. Like sounds, irrelevant color differences can distract and mislead users of the product. Colors are more than just a cosmetic effect. They not only help convey messages to users, but also affect the users’ perceptions of depth and space. The colors of the animated tablets follow the actual color convention of warfarin tablets with regard to their strengths. A 3D aspect is also achieved in the hangman animation through the use of different colors.
A brown surface with red diagonal lines gives the ground a horizontal effect, and the pole and stool seem to be situated on the ground. The background is green to distinguish it from the other objects in the animation, and to give a sense of calm to the user playing the game, since green is often associated with safety (e.g. traffic lights) or nature (e.g. trees). Humans have limited visual processing capability. When faced with a visually cluttered display, users tend to ignore some components in their perceptual field, and this often impedes the delivery of the intended message. To avoid clutter in our online tool, the animation screens are centered on the webpages (Fig. 2). In the pill-catching game, the title, instructions and scores are placed in the top left and top right corners. The button to start and restart the game, indicated by ‘Play Again’, is placed below the ‘Game Over’ message so that users can click on it to play the game. Similarly, the title
and instructions of the hangman game occupy the top half of the screen, and the animation of the hangman is located just beside the words that users are supposed to guess, so that they know how many wrong guesses they have made. Clutter is also minimized as users are allowed to expand or collapse the categories of drug interactions as appropriate.
Mapping [46], the second principle of UCD, describes the link between one’s intended actions (what one wants to do) and the actual operations (what appears to be possible). In animated products, it is crucial for the designer to appreciate the insights of semiotics. Users will be able to play the games if the games can be mapped to processes or objects that are known or familiar to them. The target audiences of the games are pharmacy/pharmaceutical science students and patients on warfarin therapy, who are expected to be familiar with the drug. Furthermore, users can guess the interactions based on their previous experience of playing the hangman game. Proper positioning and organization of objects in the games can help users understand how to play the games. The tool uses a natural mapping of left-right mouse movements and mouse clicks, controls that are familiar to users. This leads to an immediate understanding of how to use these controls to play the games. Incorporating these controls in the games allows for easier manipulation of the various animated components, such as moving the pill bag to catch the dropping warfarin tablets, and selecting the letters of the interacting drug’s name. The Gestalt law of proximity, which states that ‘related items should be placed closer together than non-related items’, also applies here. Similarly, information deemed to be of greater importance should appear in positions of greater importance on the screen from the user’s perspective.
Related items in the games are grouped together in time, space and shape, such as the warfarin tablets dropping vertically while the pill bag moves horizontally, and the hangman animation being placed side-by-side with the word of the interacting drug. Users who play the games will then be better able to remember the warfarin interactions, as well as the tablet strengths. For animations, the duration of exposure to users also affects their ability to interpret and understand the information about the product. Too short an exposure time will leave the viewer confused, but too long a time can lead to boredom and fatigue. Both games provide an adequate amount of exposure time to users: the pill-catching game lasts less than a minute so that users do not get bored, yet have enough time to learn and correlate the tablets’ colors with their strengths, while the hangman game gives users the option to end it in the middle of gameplay or when they give up guessing the word, since frustration would otherwise result and discourage the user from playing the game again. Generally, if the correct amount of
information exposure cannot be determined, the common rule of ‘too-much is better than too-little’ can be applied. A principle that deserves special mention in this paper is that of complying with the Co-operative Maxims. Originally based on Grice’s Cooperative Principle, these maxims have been defined by Weir and Heeps with regard to animation in terms of (a) quality (the animator tells/portrays the truth), (b) quantity (the intended message is adequately conveyed without use of excess animation), (c) relation (the animations are organized in a meaningful order), and (d) manner (the animations are clear and natural, avoiding ambiguity and disorder). The warfarin tool follows these principles in the form of simple instructions and information that is easily understood by the layman, with the exception of drug names which cannot be simplified, so as to avoid misinterpretation and ambiguity. Similarly, these principles can and should be applied in any tool/product that is designed for the purpose of providing drug information. The explanations of these ‘Four Pharmaco-cybernetic Maxims’ are provided in Table 1.
Table 1: The ‘Four Pharmaco-cybernetic Maxims’ for designing pharmacy and/or pharmaceutical science tools.
Design principle: Explanation of principle with regard to pharmacy and/or pharmaceutical sciences
Quality: Drug information content provided by the informatics or internet tool(s) should be accurate and follow appropriate resources for evidence-based therapies (e.g. research articles, established databases or product information).
Quantity: Adequate information about the drug or drug therapy is provided so that users of the tool know enough to minimize the likelihood of drug-related problems (e.g. underdose, overdose, drug interactions).
Relation: Drug information provided by the tool(s) is relevant to what the target audience needs to know, and should clarify their doubts instead of making them more confused.
Manner: Drug information provided by the tool(s) is conveyed clearly in an appropriate manner which avoids ambiguity and misinterpretation (e.g. layman language for the patient and medical jargon for healthcare professionals).
In UCD of products, feedback is a crucial component as it informs the user of what action has been done and what result has been achieved [46]. Feedback is accomplished in the warfarin tool by the user seeing the pill bag move in response to his mouse movements, and by parts of the hangman animation or letters of the word appearing when he selects wrong or correct letters respectively. Feedback in animated tools should also follow the traditional principles developed by Walt Disney Studios, which aim to make animations as realistic and entertaining as possible. The ‘Squash and Stretch’ and ‘Timing and Motion’ aspects are the most commonly accepted by the public. The former defines an object’s rigidity and mass by distorting its shape during an action, and the latter follows the natural motion of an object such as
acceleration and deceleration, moving in curved paths, or experiencing color and texture changes. Potentially ‘unreal’ aspects of an animated object’s behavior could hinder users from interpreting the correct message. ‘Squash and Stretch’ in the games (Fig. 3) is demonstrated by the distorting/shrinking of the pill bag when the user catches the tablet and the rope becoming taut when the hangman is no longer supported by the stool. On the other hand, ‘Timing and Motion’ is seen through the acceleration of the dropping tablets and the hangman and his feet dropping lower when the stool topples. These give users the perceptions of gravity and friction in the animations, which conveys a sense of virtual reality when playing the games.
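The feedback behavior described in this section can be illustrated with a minimal sketch. It is not the implementation of the warfarin tool: the function and class names are hypothetical, the color-to-strength mapping is an assumption made for illustration (the tool's actual colors are not listed here), and the example word is chosen only because aspirin is a known warfarin interaction.

```python
from dataclasses import dataclass, field

# Assumed mapping of tablet color to strength in mg (illustrative only).
TABLET_STRENGTH_MG = {"brown": 1, "blue": 3, "pink": 5}


def pill_feedback(color: str, caught: bool) -> tuple[int, str]:
    """Return (points awarded, sound cue) for one falling tablet."""
    if caught:
        # A catch awards points equal to the tablet strength and plays 'boing'.
        return TABLET_STRENGTH_MG[color], "boing"
    # A miss awards nothing and plays 'splash'.
    return 0, "splash"


@dataclass
class HangmanRound:
    """Hypothetical state of one hangman round."""
    word: str
    guessed: set = field(default_factory=set)
    wrong: int = 0

    def guess(self, letter: str) -> str:
        """Record a guess and return the color the clicked letter turns."""
        letter = letter.lower()
        correct = letter in self.word.lower()
        if letter not in self.guessed:
            self.guessed.add(letter)
            if not correct:
                # Each new wrong guess reveals another part of the hangman.
                self.wrong += 1
        return "green" if correct else "red"

    def masked(self) -> str:
        """Show guessed letters and hide the rest with underscores."""
        return "".join(c if c.lower() in self.guessed else "_" for c in self.word)
```

In this sketch every user action returns an immediate, observable result (points and a sound cue, or a letter color and an updated word), which is the property the feedback principle asks for.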
The results showed that although 75-85% of the respondents deemed the instructions of the games to be clear, one respondent actually commented “Give some instructions on playing the games” as free-response feedback. This situation could not have been predicted or detected if a usability study had not been carried out on the games. The participants in our pilot study had different requirements and experiences with the games, and this proved to be one of the major limitations of UCD, which can be accounted for by experience-centered design (ECD) and activity-centered design (ACD), discussed in later sections. Thus, there is a need to involve potential users in the environment in which the interactive tool would be used so as to increase its effectiveness, and consequently, its acceptance and success.
3.2 The Experience-Centered Design (ECD) Approach
Fig. 3 ‘Squash and Stretch’ aspect in the pill-catching game, and ‘Timing and Motion’ aspect in the hangman game.
Users are a central part of the UCD developmental process. Although UCD is about engineering usability, it fails to take into account other important elements such as environmental and socio-cultural factors. In the creation of the interactive tool, it was assumed that all users would be familiar with the mouse even though some users might be more familiar and comfortable playing the games with the keyboard instead. The games also did not take into account the varying educational levels, or the settings and/or situations in which potential users would be using this online interactive tool. This is a condition known as ‘design myopia’ which is characterized by the shortsightedness of the designer. To the designer, the product may appear suitable, even ideal. Yet, to the layman, the same product may seem unobvious and obscure. This can result in an ‘adverse outcome’ of breaking the user’s focus in the games and hindering his learning potential. One approach to solving this problem is to seek ‘fresh eyes’ on the product by means of user testing to ensure that a suitable product is produced for the intended purpose, and is also efficient and effective during its development [47]. In this case, the pilot study was intended to minimize possible misinterpretations and potential problems before the product is released on a larger scale to patients and pharmacy undergraduate students.
Norman’s principles on emotional design stem from our varied responses towards everyday things. The variables that deliver a positive emotional experience vary greatly with the appearance or functioning of a tool [48], and can be matched with the visceral, behavioral and reflective levels of design [49]. At the visceral level, the physical features of a product (e.g. look, feel, sound) dominate over an otherwise usable but plain looking product [49]. The current designs of the warfarin tool are meant to pique the users’ interest in playing the games. However, from our results, 10-20% of the respondents rated the visual appeal as ‘fair’ even though the majority (45-70%) rated it ‘good’ to ‘excellent’. This suggests that both games could be improved with more aesthetically pleasing designs so as to give users a thrill during gameplay which will enhance their overall experience [49]. The behavioral level sees functionality as being paramount [49]. The pill-catching game affords function and usability through the user’s mouse movements as an ‘instinctive’ extension of his hand to move the pill bag to catch the dropping tablets, while the hangman game does this by leveraging the user’s prior experience of playing the ‘pen-and-paper’ version. Feedback is present through real-time score updates in the pill-catching game, and the various stages of hanging in the hangman game. However, the underlying objectives of the games are not explicitly made known to the user. Users may find it difficult to keep track of their scores while simultaneously trying to relate them to the strengths of the tablets. Similarly, users who do not know any warfarin interactions would not find the game useful. To further improve on the behavioral aspects, immediate feedback on the scores and the tablet strengths can be expressed through a storyline, such as a better health-related outcome of a virtual patient,
and increasing the sizes and color intensities of the tablets with higher strengths. Providing the interaction effects of the drug, herb or food will also allow the user to understand the need to know the drug interactions. The reflective level [49] is related to the ‘emotional thread of experience’ by McCarthy and Wright, which describes personal meaning derived from use of a product [50]. Sixty-five percent of the survey respondents thought that the interactive tool did help them learn about warfarin, even though it took a while for the learning to be assimilated. The factors that could probably keep them motivated in playing the games are the high scores in the pill-catching game, since they indicate the user’s level of accomplishment and motivate him to better his scores and learn about the tablet strengths; and the congratulatory message indicating “[the hangman] is alive!” when the user guesses the word correctly. This gives meaning and satisfaction to the user when he saves the hangman. However, if he loses, the words of encouragement “Don’t give up!” appear to motivate him to play another round. The ‘sensual thread’ describes the involvement of the human senses in shaping an experience [50]. Both games currently focus on sight and utilize the user’s experience of moving and clicking the mouse to play. Sound effects which provide feedback when the user catches (‘boing’) or misses (‘splash’) a tablet cater to his sense of hearing. However, the user plays the hangman game in silence. Short MIDI, WAV or MP3 files to indicate a win or loss in the game can further enhance the user’s experience in this case. Mounting the games on other platforms such as personal digital assistants (PDAs) or iPhones can also provide touch-alternatives and a completely different experience to mouse-clicking. The ‘compositional thread’ describes how one frames the many parts that make up one’s whole experience [50].
According to this principle, the games should be considered in relation to the rest of the WarfarINT website. A common piece of feedback from the survey was the lack of adequate information about the drug. Although this could be due to the limited time given in the pilot study to explore the rest of the website, this was seen as a ‘breakdown’ by the respondents as the games seemed to be relatively disjointed from the rest of the website. Questions such as “how do these things go together” and “I wonder what will happen if [action occurs]” could not have been answered by the users. Thus, an improvement would be to include the warfarin dosing information on the same page as the pill-catching game instead of a separate page, as is the current case. Another suggestion from the respondents was to “show image[s] of the food interaction with the correct word” in the hangman game for a more positive and added visceral feel to the experience.
The ‘spatio-temporal’ thread describes one entering a state of ‘flow’ as he becomes engrossed in his experience [50]. Both games managed to keep the respondents engrossed in gameplay, with 55% and 70% of the respondents indicating that their levels of concentration increased during continuous gameplay of the pill-catching and hangman games respectively. However, some comments from the respondents also suggested that we “make the pill catching game more interesting.” This can be done by splitting the game into varying difficulty levels and adding an animated storyline, for example, a virtual patient whose blood vessels become less blocked due to the blood-thinning effect of warfarin, resulting in the patient recovering from his medical condition. On the other hand, only users who have adequate drug vocabulary knowledge of the warfarin interactions (e.g. pharmacy or medical students) are immersed in a state of flow when playing the hangman game. Patients who might not be as well-versed in the interactions might suffer from a ‘disruption of flow’ due to the frustration of not getting the correct word. Hints can be provided in this case to ease the current steep learning curve of the game. The design of interactive systems requires an understanding of how a person experiences the product from an interaction-centered viewpoint [51]. Cognitive user-product interactions require users to focus on the product at hand, thus users of both games have to learn what their actions will lead to during gameplay. It was suggested in the survey that the warfarin tablets drop too quickly in the pill-catching game, and that users could not keep track of their scores without compromising their gameplay. Increasing tablet sizes and/or color intensities can improve the cognitive interaction as users will find it easier to relate the animated tablets to their strengths, since bigger and more intensely-colored tablets would be worth more points.
Furthermore, the games currently do not account for the fact that users will gain competence over time and probably stop playing. To improve users’ scalability of experience, splitting the games into varying difficulty levels will continually challenge users and provide a different experience each time they play the games. Additional features to allow for customization of the backgrounds and interfaces to suit users’ preferences, or mounting the games on a variety of platforms like PDAs, mobile phones, and social networking sites (e.g. Facebook or MySpace) will not only facilitate expressive interactions and co-experience, but also reinforce the reflective and emotional threads of users’ overall experiences.
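The two improvements suggested above for the pill-catching game, difficulty levels and tablet sizes that grow with strength, can be sketched as follows. This is only an illustration under stated assumptions: the level names, drop speeds, score threshold and color-to-strength mapping are all hypothetical values, not taken from the tool.

```python
from dataclasses import dataclass

# Assumed mapping of tablet color to strength in mg (illustrative only).
TABLET_STRENGTH_MG = {"brown": 1, "blue": 3, "pink": 5}


@dataclass(frozen=True)
class Level:
    name: str
    drop_speed: float  # pixels per frame (hypothetical values)
    duration_s: int    # round length in seconds, kept under a minute


# Hypothetical levels; the slowest one addresses the survey comment
# that the tablets drop too quickly.
LEVELS = [
    Level("easy", 2.0, 45),
    Level("medium", 4.0, 45),
    Level("hard", 6.0, 45),
]


def tablet_radius(color: str, base_radius: float = 10.0) -> float:
    """Scale the drawn tablet with its strength so that tablets
    worth more points look bigger."""
    strongest = max(TABLET_STRENGTH_MG.values())
    return base_radius * (1 + TABLET_STRENGTH_MG[color] / strongest)


def next_level(current: Level, score: int, threshold: int = 50) -> Level:
    """Move up one level once the score reaches the threshold,
    staying on the last level when there is nothing harder."""
    i = LEVELS.index(current)
    if score >= threshold and i < len(LEVELS) - 1:
        return LEVELS[i + 1]
    return current
```

Keeping the levels in a single table makes it straightforward to tune drop speeds after further user testing without touching the game logic.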
3.3 The Activity-Centered Design (ACD) Approach
The ECD approach gives designers an insight into users’ experiences of the interaction tool. However, it does not explain how the activity of playing these games affects the user. Activity Theory (AT) describes a framework for understanding how people operate in the world, taking ‘activity’ rather than ‘person’ or ‘mind’ as the central unit of analysis [52-54]. Several other interpretations of AT exist, but we will discuss the online tool based on the principles described by Kaptelinin (Appendix 2) [53]. The principle of unity of consciousness and activity states that the human mind (consciousness) is inseparable from its interaction with the environment (activity) [52,53]. Users of the online tool know that the tablet colors in the pill-catching game are related to their strengths, and that the objective of the hangman game is to learn about the warfarin drug interactions. However, they may not see the relevance of knowing the strengths and interactions. Thus, providing a form of text or storyline would help make users aware of the consequences of DRPs such as under- and overdosing, and the severity of a drug interaction with warfarin. Object-orientedness, in this case, is to educate users on the warfarin tablet strengths and drug interactions. In a broad sense, the object in this principle need not be related to physical objects, but includes socially/culturally defined properties as well [52,54]. Although the tool fulfils its objectives, the significance of the activity itself can be enhanced by making explicit to the user why it is important to know about the tablet strengths and the consequences of the drug interactions. The hierarchical structure of activity is associated with a tri-level scheme describing activities, actions and operations which are oriented towards the goals and motive of the whole activity [52-54]. This hierarchy differs between patients and students playing the games.
Students would want to know the tablet strengths and drug interactions to better prepare for exams, instead of improving their health. Based on Leontiev’s principles [52], the relationship between higher and lower objectives of a patient who undergoes anticoagulant therapy and uses the online tool is illustrated in Fig. 4. The smooth transition of conscious actions to subconscious operations when playing the games orients the user towards the objectives of learning about warfarin. A breakdown, however, will disrupt the user’s game playing activity, and may lead to disorientation of the user or even frustration. An example would be the shift in letter positions when the browser is resized, resulting in the user trying to find out where to click the letters. The concept of internalization-externalization states that our mental processes are derived from external actions
through the course of internalization, and is related to the socio-cultural environment [52-54]. There is currently no means of knowing whether the user has assimilated the learning objectives of the games. Feedback mechanisms such as short quizzes on simple warfarin interactions or doses of different colored tablet combinations can be incorporated so that the user is able to ‘internalize’ the knowledge he has gained from the tool and ‘externalize’ this knowledge by correctly answering the questions. The principle of tool mediation is the most significant concept in AT, and it describes how a tool reflects the accumulation and transmission of social knowledge, and experiences of others who have tried to solve similar problems before to make the tool more efficient [52-54]. Improvements of the ‘tools’ in the games would also improve the users’ cognitive skills and knowledge of warfarin. For example, a pill-box, cupped hand or a mouth to simulate catching the warfarin tablets would better mediate the process of how a patient takes the medication in reality. Similarly, an animated form of the traditional ‘pen-and-paper’ hangman can probably provide a more familiar and fun way of learning the warfarin drug interactions. Lastly, the principle of development is used to understand how tools are developed into their existing form [52-54]. The underlying concepts of why the games were developed have been explained throughout the various sections of this paper, but this principle can also be used to further develop and improve the games. Voice reporting of the user’s score status can improve his gameplay so that he does not need to simultaneously focus on the rapidly changing scores and correlate the strengths of the different colored tablets. Similarly, having different difficulty levels in the hangman game can also ease the user’s learning curve.
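The short quiz on doses of different colored tablet combinations proposed above could, for instance, be marked as in the following sketch. The function names and the color-to-strength mapping are assumptions made purely for illustration; they are not part of the existing tool.

```python
# Assumed mapping of tablet color to strength in mg (illustrative only).
TABLET_STRENGTH_MG = {"brown": 1, "blue": 3, "pink": 5}


def total_dose_mg(tablets):
    """Sum the strengths of a combination of colored tablets."""
    return sum(TABLET_STRENGTH_MG[color] for color in tablets)


def mark_quiz(questions, answers):
    """Return the fraction of dose questions answered correctly.

    questions: list of tablet-color lists shown to the user.
    answers:   list of doses in mg entered by the user, in the same order.
    """
    if not questions:
        return 0.0
    correct = sum(
        1
        for tablets, answer in zip(questions, answers)
        if total_dose_mg(tablets) == answer
    )
    return correct / len(questions)
```

A score returned by such a function would give the designer a simple signal of whether the user has ‘externalized’ the tablet strengths learned in the pill-catching game.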
Fig. 4 Hierarchy of objectives of a patient on anticoagulant therapy, and how they are affected by socio-cultural factors.
3.4 Pharmaco-Cybernetics from an Ecological Perspective
The Ecological Systems Theory by Urie Bronfenbrenner describes how users interact with their immediate environments (micro, meso, exo, macro, chrono), and how these environments affect the user in a wider context [55]. From a pharmaco-cybernetics perspective, this theory can be applied in the context of users learning about anticoagulant therapy from the interaction tool (Appendix 3). The bi-directional influences of each individual system on the others can help identify possible avenues for improvement, as well as the pitfalls and disturbances in the activity of using the tool. This warfarin tool also allows the possibility of creating other larger-scale and more complex interactive tools that will not only encompass the magnitude of influences across the various environments, but also reduce DRPs by empowering patients with the appropriate drug knowledge so that they can better participate in their therapies and management strategies with their healthcare professionals, and ultimately improve their health.
4. Conclusion
Developers of healthcare interactive tools often overlook relevant user characteristics, tasks, preferences and usability issues, thus resulting in systems or tools that decrease productivity or simply remain unusable [56]. Medical tools need to be robust and easy to use in a wide variety of environments [57]. Thus, healthcare applications must be carefully crafted to ensure that they meet the standards and models outlined by their target users. The integration of interactive media and informatics technologies with the WWW has enabled computational tools to play an important role in pharma-culture. In this paper, the concept of ‘pharmaco-cybernetics’ is introduced through the creation of an interactive tool on oral anticoagulation therapy. Interactivity was developed in the form of two games for users to learn about warfarin tablet strengths and drug interactions. Currently, this tool is largely based on the principles of UCD and ECD. However, the potential of incorporating the ACD approach in the design of this tool is definitely attractive, and can lead to better quality healthcare tools for other chronic medication therapies. Prototype sketches of how the games can be improved in future versions are provided in Fig. 5. It is hoped that these improved versions will enhance not only the user’s experience, but also his interactions with the tool. In conclusion, pharmaco-cybernetics can empower patients with the appropriate knowledge regarding their therapy so that they can better participate in the management of their health. This can potentially help them to adapt to any changes in their dietary habits and lifestyles, as well as improve compliance, and ultimately, improve the pharmaceutical care of patients who are on anticoagulant therapy. Healthcare providers, patients and developers of health information systems should recognize the importance of, and be familiar with, the relevant concepts and principles when designing pharmaco-cybernetics applications. However, understanding how users structure their individual experiences, immediate environments, and tasks is just the beginning when designing such products. Designers should also take into account how external forces such as socio-cultural and inter-personal factors shape a user’s overall experience, attitude and goals in using the applications, adopt an ecological perspective so as to tailor the interactive tools to a wider audience, and consider how these principles can be applied to the design of other pharmaco-cybernetics products involving medication therapies.
Fig. 5 Prototype sketches of improved versions of the interactive tool consisting of (a) the warfarin pill-catching game and (b) the warfarin hangman game.
Acknowledgments
The authors would like to thank Asst. Prof. Timothy Marsh, lecturer for the NM5206 module, and Ms. Cecilia Chua from the Republic Polytechnic, Singapore, for their support for the WarfarINT tool and contributing to the success of this article.
References
[1] L. H. Lee, M. C. Kong, Y. M. Wong, et al., Clinical Pharmacy Practice Guidelines. Anticoagulation Warfarin, March 2006 ed., Singapore: Ministry of Health, Singapore, 2006.
[2] American Society of Hospital Pharmacists, “ASHP statement on pharmaceutical care”, American Journal of Hospital Pharmacy, Vol. 50, 1993, pp. 1720-1723.
[3] E. H. Shortliffe, and A. M. Garber, “Training synergies between medical informatics and health services research”, Journal of the American Medical Informatics Association, Vol. 9, No. 2, 2002, pp. 133-139.
[4] D. R. Masys, P. F. Brennan, J. G. Ozbolt, et al., “Are medical informatics and nursing informatics distinct disciplines? The 1999 ACMI debate”, Journal of the American Medical Informatics Association, Vol. 7, No. 3, 2000, pp. 304-312.
[5] H. J. Lowe, E. C. Lomax, and S. E. Polonkey, “The World Wide Web: a review of an emerging internet-based technology for the distribution of biomedical information”, Journal of the American Medical Informatics Association, Vol. 3, No. 1, 1996, pp. 1-14.
[6] S. Nettleton, “The emergence of e-scaped medicine?”, Sociology, Vol. 38, No. 4, 2004, pp. 661-679.
[7] G. Collste, “The internet doctor and medical ethics: ethical implications of the introduction of the internet into medical encounters”, Medicine, Health Care, and Philosophy, Vol. 5, No. 2, 2002, pp. 121-125.
[8] B. Chung, C. E. Corbett, B. Boulet, et al., “Talking Wellness: a description of a community-academic partnered project to engage an African-American community around depression through the use of poetry, film, and photography”, Ethnicity and Disease, Vol. 16, No. 1 Suppl 1, 2006, pp. S67-S78.
[9] U. Ritterfeld, and S. A. Jin, “Addressing media stigma for people experiencing mental illness using an entertainment-education strategy”, Journal of Health Psychology, Vol. 11, No. 2, 2006, pp. 247-267.
[10] D. Cantor, “Uncertain enthusiasm: the American Cancer Society, public education, and the problems of the movie, 1921-1960”, Bulletin of the History of Medicine, Vol. 81, No. 1, 2007, pp. 39-69.
[11] S. E. Lederer, “Dark victory: cancer and popular Hollywood film”, Bulletin of the History of Medicine, Vol. 81, No. 1, 2007, pp. 94-115.
[12] K. L. Taylor, J. L. Davis 3rd, R. O. Turner, et al., “Educating African American men about the prostate cancer screening dilemma: a randomized intervention”, Cancer Epidemiology, Biomarkers and Prevention, Vol. 15, No. 11, 2006, pp. 2179-2188.
[13] G. Wallmyr, and C. Welin, “Young people, pornography, and sexuality: sources and attitudes”, Journal of School Nursing, Vol. 22, No. 5, 2006, pp. 290-295.
[14] C. Jackson, J. D. Brown, and K. L. L'Engle, “R-rated movies, bedroom televisions, and initiation of smoking by white and black adolescents”, Archives of Pediatrics and Adolescent Medicine, Vol. 161, No. 3, 2007, pp. 260-268.
[15] J. D. Sargent, “Smoking in movies: impact on adolescent smoking”, Adolescent Medicine Clinics, Vol. 16, No. 2, 2005, pp. 345-370.
[16] Jean-Guilhem, “World Wide Web Consortium (W3C)”; http://www.w3.org/ [accessed 3 July 2009].
[17] Internet Society (ISOC), “The Internet Engineering Task Force (IETF)”; http://www.ietf.org/ [accessed 3 July 2009].
[18] J. Liaskos, and M. Diomidus, “Multimedia technologies in education”, Studies in Health Technology and Informatics, Vol. 65, 2002, pp. 359-372.
[19] G. G. Moghaddam, and M. Moballeghi, “Human-computer interaction: guidelines for web animation”, East/West Interaction: An Online English Chinese Journal, 2006; http://eprints.rclis.org/archive/00007823/ [accessed 3 July 2009].
[20] R. E. Weiss, D. S. Knowlton, and G. R. Morrison, “Principles for using animation in computer-based instruction: theoretical heuristics for effective design”, Computers in Human Behavior, Vol. 18, No. 4, 2002, pp. 465-477.
[21] A. Demirjian, and B. David, “Learning medical and dental sciences through interactive multi-media”, Medinfo, Vol. 8 Pt 2, 1995, p. 1705.
[22] R. W. Pho, S. Y. Lim, and B. P. Pereira, “Computer applications in orthopaedics”, Annals of the Academy of Medicine Singapore, Vol. 19, No. 5, 1990, pp. 691-698.
[23] K. Penska, L. Folio, and R. Bunger, “Medical applications of digital image morphing”, Journal of Digital Imaging, Vol. 20, No. 3, 2007, pp. 279-283.
[24] D. M. Smith, S. J. Aston, C. B. Cutting, et al., “Applications of virtual reality in aesthetic surgery”, Plastic and Reconstructive Surgery, Vol. 116, No. 3, 2005, pp. 898-904.
[25] D. M. Smith, S. J. Aston, C. B. Cutting, et al., “Designing a virtual reality model for aesthetic surgery”, Plastic and Reconstructive Surgery, Vol. 116, No. 3, 2005, pp. 893-897.
[26] R. C. Ward, M. W. Yambert, R. J. Toedte, et al., “Creating a human phantom for the virtual human program”, Studies in Health Technology and Informatics, Vol. 70, 2000, pp. 368-374.
[27] J. E. Champoux, “Animated films as a teaching resource”, Journal of Management Education, Vol. 25, No. 1, 2001, pp. 79-100.
[28] P. Piyasirivej, “Towards usability evaluation of Flash web sites”, in World Forum Proceedings of the International Research Foundation for Development, Tunis, Tunisia, 2005.
[29] Edheads + COSI, “Virtual Knee Surgery and Choose the Prosthetic games”; http://www.edheads.org/activities/knee/ [accessed 3 July 2009].
[30] H. Leufkens, Y. Hekster, and S. Hudson, “Scenario analysis of the future of clinical pharmacy”, Pharmacy World and Science, Vol. 19, No. 4, 1997, pp. 182-185. [31] T. Westerlund, A. B. Almarsdóttir, and A. Melander, “Factors influencing the detection rate of drug-related problems in community pharmacy”, Pharmacy World and Science, Vol. 21, No. 6, 1999, pp. 245-250. [32] T. L. Stedman, Stedman's Medical Dictionary, 28th ed., Baltimore, Maryland: Lippincott Williams & Wilkins, 2005. [33] P. Pangaro, “"Getting started" guide to cybernetics”; http://pangaro.com/published/cyber-macmillan.html [accessed 3 July 2009]. [34] Wikipedia, “Cybernetics”; http://en.wikipedia.org/wiki/Cybernetics [accessed 3 July 2009]. [35] American Society for Cybernetics, “Foundations - the subject of cybernetics: defining 'cybernetics'”; http://www.asc-cybernetics.org/foundations/definitions.htm [accessed 3 July 2009]. [36] P. A. Corning, “Synergy, cybernetics, and the evolution of politics”, International Political Science Review, Vol. 17, No. 1, 1996, pp. 91-119. [37] 'Psycho-cybernetics' Author, “Plastic surgeon tries to heal 'inner scars'”, Los Angeles TimesB5, 1973, pp. B5-1. [38] Wikipedia, “New cybernetics”; http://en.wikipedia.org/wiki/New_cybernetics [accessed 3 July 2009]. [39] F. Geyer, “The challenge of sociocybernetics”, Kybernetes: The International Journal of Systems & Cybernetics, Vol. 24, No. 4, 1995, pp. 6-32. [40] P. C. Adams, “Cyberspace and virtual places”, Geographical Review, Vol. 87, No. 2, 1997, pp. 155-171. [41] R. Trappl, “The cybernetics and systems revival”, in 14th European Meeting on Cybernetics and Systems Research (EMCSR'98), University of Vienna, Austrian Society for Cybernetic Studies, February 1998, p. Preface. [42] W. Gibson, Neuromancer (Special 20th Anniversary Edition), Hardcover ed.: Ace Books, 2004. [43] Wikipedia, “The Matrix”; http://en.wikipedia.org/wiki/Matrix_movie [accessed 3 July 2009]. 
[44] Wikipedia, “Cyberspace”; http://en.wikipedia.org/wiki/Cyberspace [accessed 3 July 2009]. [45] G. R. S. Weir, and S. Heeps, “Getting the message across: ten principles for web animation. ”; http://eprints.cdlr.strath.ac.uk/2513/ [accessed 3 July 2009]. [46] D. A. Norman, “The psychopathology of everyday things”, The Design of Everyday Things, pp. 1-33, USA: Perseus Publishing, 2002. [47] J. Preece, Y. Rogers, and H. Sharp, Interaction design: beyond human-computer interaction, 1st ed., New York, NY: John Wiley & Sons Inc., 2002. [48] D. A. Norman, “Emotion and design: attractive things work better”, Interactions, Vol. 9, No. 4, 2002, pp. 36-42. [49] D. A. Norman, “Three levels of design: visceral, behavioral, and reflective”, Emotional Design: Why We
Love (or Hate) Everyday Things, pp. 63-98, New York, NY: Basic Books, 2004. [50] J. McCarthy, and P. Wright, “The threads of experience”, Technology as Experience, pp. 79-104, Cambridge, MA: MIT Press, 2004. [51] J. Forlizzi, and K. Battarbee, “Understanding experience in interactive systems”, Symposium on Designing Interactive Systems. Proceedings of the 5th Conference on Designing Interactive Systems: Processes, Practices, Methods, and Techniques. DIS '04, 2004, pp. 261-268; http://doi.acm.org/10.1145/1013115.1013152 [accessed 3 July 2009]. [52] V. Kaptelinin, K. Kuutti, and L. Bannon, “Activity theory: basic concepts and applications. A summary of a tutorial given at the east west HCI95 conference”, Lecture Notes in Computer Science, Human-Computer Interaction, Vol. 1015, 1995, pp. 189-201. [53] V. Kaptelinin, “Activity theory: implications for human-computer interaction”, Context and Consciousness: Activity Theory and Human-Computer Interaction, B. A. Nardi, ed., pp. 103-116, Cambridge, MA: MIT Press, 1996. [54] V. Kaptelinin, and B. A. Nardi, “Activity theory in a nutshell”, Acting with Technology: Activity Theory and Interaction Design, pp. 29-72, Cambridge, MA: MIT Press, 2006. [55] D. Paquette, and J. Ryan, “Bronfenbrenner’s Ecological Systems Theory”; http://pt3.nl.edu/paquetteryanwebquest.pdf [accessed 3 July 2009]. [56] C. M. Johnson, T. R. Johnson, and J. Zhang, “A user-centered framework for redesigning health care interfaces”, Journal of Biomedical Informatics, Vol. 38, No. 1, 2005, pp. 75-87. [57] R. Gagnier, “User centered design of medical devices: managing use related hazards”, Macadamian White Papers, n.d.; http://www.macadamianusability.com/resources/whitepapers/Maskerydesigning_medical_devices.pdf [accessed 10 November 2008].
Kevin Y.-L. Yap (B.Sc. in Pharmacy (Hons), M.Eng., Sp. Dip. Digital Media Creation) is currently a Ph.D. candidate in the Department of Pharmacy, National University of Singapore, and a registered pharmacist in Singapore. He has worked as a pharmacist in the hospital and community settings, as well as an academic facilitator in the biomedical sciences, based on the problem-based learning pedagogy. His research interests lie in the application of informatics, digital media, interactive and web technologies in clinical pharmacy practice, particularly with regard to pharmaceutical care and the solving of drug-related problems; he has presented at various international conferences and published several papers in this area. He is a member of the Pharmaceutical Society of Singapore, the American Association for the Advancement of Science, and the Healthcare Information and Management Systems Society. He has also been featured in the Marquis Who’s Who in Science and Engineering (10th ed.) and in Medicine and Healthcare (7th ed.). Xuejin Chuang, Alvin J.M. Lee, Raemarie Z. Lee, Lijuan Lim and Jeanette J. Lim were undergraduates, while R. Nimesha and Kevin Yap were postgraduates in the National University of Singapore during the time in which the pilot usability study was carried out. The WarfarINT tool was originally designed and created by Kevin Yap. All authors were members of the project team in the module NM5206 Emerging Media Interaction Design offered by the Communications and New Media (CNM) Programme, Faculty of Arts and Social Science, in the first semester of the Academic Year 2008-2009.
ISSN (Online): 1694-0784 ISSN (Printed): 1694-0814
SIMILARITY MATCHING TECHNIQUES FOR FAULT DIAGNOSIS IN AUTOMOTIVE INFOTAINMENT ELECTRONICS
Dr. Mashud Kabir
Department of Computer Science, University of Tuebingen, D-72027 Tuebingen, Germany
[email protected]
Abstract Fault diagnosis has become a very important area of research during the last decade due to the advancement of mechanical and electrical systems in industries. The automobile is a crucial field where fault diagnosis is given a special attention. Due to the increasing complexity and newly added features in vehicles, a comprehensive study has to be performed in order to achieve an appropriate diagnosis model. A diagnosis system is capable of identifying the faults of a system by investigating the observable effects (or symptoms). The system categorizes the fault into a diagnosis class and identifies a probable cause based on the supplied fault symptoms. Fault categorization and identification are done using similarity matching techniques. The development of diagnosis classes is done by making use of previous experience, knowledge or information within an application area. The necessary information used may come from several sources of knowledge, such as from system analysis. In this paper similarity matching techniques for fault diagnosis in automotive infotainment applications are discussed. Key words: similarity, fault, diagnosis, matching, automotive, infotainment, cosine.
1. Introduction
First, feature selection is discussed, for which a stop word list and word stemming are used. Then pattern recognition is explained. Ranking algorithms are used to rank items such as words and web pages; the PageRank algorithm is discussed with our application in mind. Similarity algorithms are discussed in the subsequent sections. Finally, the similarity matching algorithm that best fits fault diagnosis in automotive infotainment systems is presented. The algorithm is analyzed with real field data and the results are evaluated.
2. Feature Selection
Feature selection is one of the main steps in similarity matching of faults. We apply a stop word [1] list to filter out words that carry no meaning for the diagnosis. This list has been built with the existing standard fault description language of automotive infotainment systems in mind. Word stemming is a method in which lexically similar words are grouped together: words with prefixes and suffixes are converted into their root words. This overcomes the limitation that words with the same meaning would otherwise be categorized into different classes. A list of stemming words has been created for automotive infotainment systems. Both the stop word list and the stemming list were developed with the help of experienced system engineers in automotive infotainment systems.
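The two filtering steps described above can be sketched in a few lines of Python. This is only an illustration: the stop word set and the stemming table below are hypothetical stand-ins, since the lists compiled with the system engineers are not reproduced in the paper, and the function name `extract_features` is likewise an assumption.

```python
import re

# Hypothetical lists for illustration only; the paper's actual lists are not given.
STOP_WORDS = {"the", "a", "is", "will", "be", "but", "of"}
STEM_MAP = {"remains": "remain", "sent": "send", "messages": "message"}

def extract_features(description):
    """Lower-case a fault description, drop stop words, map words to root forms."""
    tokens = re.findall(r"[a-z0-9]+", description.lower())
    return [STEM_MAP.get(tok, tok) for tok in tokens if tok not in STOP_WORDS]

features = extract_features("Display ON Signal will be sent, but Display remains dark")
print(features)
```

A dictionary lookup is used for stemming here because the text describes a hand-built list of stemming words rather than a rule-based stemmer.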
3. Pattern Recognition
Humans sense their surroundings and, drawing on highly developed skills, act according to their observations. By observing the nature of human intelligence, a machine can be built to do the same job, such as identifying handwriting, post codes, voices, fingerprints, DNA or human faces. A pattern is an abstract object, such as a set of measurements describing a physical object. It is an entity with a given name, such as handwriting, a sentence or a human face. Pattern recognition consists of several steps: observing the inputs, learning how to distinguish different patterns, and making rational decisions in categorizing patterns. Shmuel Brody [3] has summarized the concepts of pattern recognition and their uses in similarity matching. Patterns detected by humans contain both relevant and irrelevant data. The most important task in pattern recognition is to find the meaningful patterns and to disregard the irrelevant subject matter. The areas of pattern recognition include data analysis, feature extraction, error estimation, error removal, cluster analysis, grammatical inference and parsing.
Faramarz Valafar [4] has discussed pattern recognition techniques in data analysis. Clustering, in which data are grouped into clusters, is one of the most commonly used recognition techniques. K-means clustering [2] is a widely used algorithm for data clustering. In the k-means algorithm, similar patterns are partitioned into the same group. Every data item is assigned to one of the k clusters or classes. Then the mean inter-class and intra-class distances are determined. The last step is to minimize the intra-class distance and maximize the inter-class distance. This is an iterative procedure in which data are moved from one cluster to another, and it continues until optimized intra-class and inter-class distances are found. In pattern recognition, different techniques are applied for similarity matching. For this work it is necessary to discover optimized techniques and algorithms for similarity matching, fault classifying and fault cause detecting.
4. Ranking Algorithms
The PageRank algorithm is widely used to rank web pages according to their importance. PageRank is a link analysis algorithm that ranks a web page within a set of pages according to its relative importance. It assigns a numerical weight to each element of the set; the weight of an element E is called the PageRank of E and is denoted by PR(E). PageRank was introduced by Larry Page at Stanford University in the course of developing a new search engine for the web. The rank of a page depends on the number of links from other pages to that page. PageRank is a probability distribution that expresses the likelihood that a user randomly clicking on links arrives at a specific page. This probability ranges from 0 to 1. A PageRank of 0.8 means that the probability of reaching a specific page by randomly clicking on links is 80%.

A set of five web pages is assumed: A, B, C, D, E. The initial probability is distributed evenly among these pages. Therefore, each of the pages gets a PageRank of 1.0/5, that is,

\[ PR(A) = PR(B) = PR(C) = PR(D) = PR(E) = 0.2 \tag{i} \]

Now consider the scenario depicted in figure 1: page A has inbound links from pages C, D and E.

Fig. 1 Inbound link of page A.

Thus, the PageRank of page A is

\[ PR(A) = PR(C) + PR(D) + PR(E) \tag{ii} \]

Page C has one other outbound link, to page E, and page D has other outbound links to B, C and E, as depicted in figure 2.

Fig. 2 The Outbound links of page C and Page D.

The value of the link-votes is divided among all the outbound links of a page. Thus, page C contributes a vote weight of 0.2/2, i.e. 0.1, and page D contributes a vote weight of 0.2/4, i.e. 0.05. The equation therefore takes the following form:

\[ PR(A) = \frac{PR(C)}{2} + \frac{PR(D)}{4} + \frac{PR(E)}{1} \tag{iii} \]

The above equation can be generalized by assuming that the PageRank conferred by an outbound link of a page is the page’s own PageRank divided by the number of its outbound links L:

\[ PR(A) = \frac{PR(C)}{L(C)} + \frac{PR(D)}{L(D)} + \frac{PR(E)}{L(E)} \tag{iv} \]

The PageRank of any page i can be expressed in the following form:

\[ PR(i) = \sum_{j \in S_i} \frac{PR(j)}{N_j} \tag{v} \]

where
\(PR(i)\) = PageRank of page i
\(PR(j)\) = PageRank of a page j that links to page i
\(N_j\) = number of outbound links of page j
\(S_i\) = set of inbound pages linking to page i

The PageRank algorithm is mainly used in internet applications to find the rank of a page. The basis of the algorithm is that the rank of a page depends on the inbound links from other pages. To apply this technique, the links among pages would have to be mapped to links among the features of a fault. This study, however, requires the features to be ranked according to their importance. This makes the PageRank algorithm inappropriate for this project.
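The vote-splitting rule of equation (v) can be illustrated with a small Python sketch of one update step on the five-page example. The link structure for pages C, D and E follows the text; the text does not say where pages A and B link to, so they are given no outbound links here, which is an assumption. The function name `pagerank_step` is hypothetical.

```python
def pagerank_step(ranks, outlinks):
    """One application of PR(i) = sum over inbound pages j of PR(j) / N_j."""
    new_ranks = {page: 0.0 for page in ranks}
    for page, targets in outlinks.items():
        if not targets:
            continue  # a page without outbound links casts no votes
        share = ranks[page] / len(targets)  # vote split evenly over outbound links
        for target in targets:
            new_ranks[target] += share
    return new_ranks

# Initial ranks from equation (i).
ranks = {page: 0.2 for page in "ABCDE"}

# Links taken from the example; A and B are assumed to have no outbound links.
outlinks = {
    "A": [],
    "B": [],
    "C": ["A", "E"],
    "D": ["A", "B", "C", "E"],
    "E": ["A"],
}

updated = pagerank_step(ranks, outlinks)
print(round(updated["A"], 2))  # 0.2/2 + 0.2/4 + 0.2/1 = 0.35
```

The sketch performs a single step without a damping factor, matching the simplified form used in this section.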
5. Similarity Matching
This section describes similarity matching techniques for strings. Using these techniques, a concept is proposed to search for similar faults when the symptoms of a fault are provided. Edit distance is a common term in matching algorithms. The word distance is used to compare different data for similarity: edit distance is a measure of the differences between input elements. Several methods exist to calculate edit distance.

Levenshtein Distance
Levenshtein distance is named after the Russian scientist Vladimir Levenshtein, who devised the algorithm in 1965. The Levenshtein distance between two strings is the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character. Levenshtein distance (LD) is thus a measure of the similarity between two inputs, the source s and the target t. For example:
If s is "math" and t is "math", then LD(s,t) = 0, because no transformations are needed.
If s is "math" and t is "mats", then LD(s,t) = 1, because one substitution (change "h" to "s") is sufficient to transform s into t.
The more different the inputs are, the greater the Levenshtein distance is. Insertion, deletion and substitution are the operations that determine the Levenshtein distance, and the position of a character plays an important role in determining the distance. This study deals with the description of a fault. If Levenshtein distance were applied to find the similarity of faults, it would not give a meaningful result, because the positions of the strings should not matter. That is why this technique is not used in this study.

Damerau-Levenshtein Distance
Damerau-Levenshtein distance is a variant of Levenshtein distance that counts the transposition of two adjacent characters as a single edit operation. The Damerau-Levenshtein distance is equal to the minimal number of insertions, deletions, substitutions and transpositions needed to transform one string into the other. Kukich [5] described several edit distance algorithms which use Damerau-Levenshtein distance. Calculating the similarity between two words with the Damerau-Levenshtein metric has been found to be a slow process. For this reason this method is not well suited for similarity matching in this project.

Needleman-Wunsch Distance
The Levenshtein distance algorithm assumes that the cost of all insertions, deletions and substitutions is equal. In some scenarios, however, this may not be desirable and may mask the acceptable distances between inputs. Needleman and Wunsch modified the Levenshtein distance algorithm to take a cost matrix as an extra input. This matrix contains two costs for each pair of characters: the cost of inserting the character and the cost of converting between the characters. This approach is not appropriate for similarity matching in this study, for the same reason stated for the Levenshtein approach.

Hamming Distance
The Hamming distance [6] H is defined for inputs of the same length. For two inputs s and t, H(s, t) is the number of places in which the two strings differ, i.e., have different characters. Hamming distance is used in information theory. This method cannot be applied to similarity matching of automotive faults, since Hamming distance only considers the differences between the two inputs.

Weighted Edit Distance
This algorithm differs from the edit distance in its weighting: a particular weight is imposed on each insertion, deletion and substitution operation. The main goal of similarity matching of faults is to find faults with similar behavior. Weighted edit distance focuses on weighting the operations, so this kind of approach is inappropriate for finding similar faults.

Hamming Distance
The Hamming distance is the number of positions at which the corresponding characters differ. It is simply the number of differences between two strings of the same length. For example, the Hamming distance between GERMANY and IRELAND is 5. To apply this distance to two error features, they must be of equal length, which is rarely the case. For this reason Hamming distance is not used for similarity matching in this study.
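The Levenshtein and Hamming examples of this section can be checked with a short Python sketch. The dynamic-programming formulation below is the usual textbook one and is not taken from the paper; the function names are illustrative.

```python
def levenshtein(s, t):
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def hamming(s, t):
    """Number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance requires inputs of equal length")
    return sum(a != b for a, b in zip(s, t))

print(levenshtein("math", "math"))    # 0
print(levenshtein("math", "mats"))    # 1
print(hamming("GERMANY", "IRELAND"))  # 5
```

Both measures are position dependent: `levenshtein("radio hu", "hu radio")` is non-zero although the two descriptions contain the same words, which is the limitation noted above for fault descriptions.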
6. Similarity Determination
The aim of this section is to propose an algorithm for similarity matching in text queries. The procedure of this algorithm is as follows.

A text (query) T is represented by a multidimensional occurrence vector:

\[ F(T) = (F_1(T), F_2(T), \ldots, F_k(T)) \]

where k is the number of distinct non-stop-word terms occurring in the database. Each component is a function of the frequency of the i-th term in T:

\[ F_i(T) = \frac{1}{2}\left(1 + \frac{tf_i^T}{\max_i tf_i^T}\right)\log\frac{N}{n_i} \]

where
\(tf_i^T\) = frequency of the i-th term in T
\(\max_i tf_i^T\) = frequency of the most frequent term of T
\(N\) = number of database entries
\(n_i\) = number of entries where the i-th term occurs

The cosine similarity measure between a query (A) and a stored document (B) is defined as:

\[ \text{Cos similarity}(A, B) = \frac{\sum_{i=1}^{k} \alpha_i F_i(A) F_i(B)}{\sqrt{\sum_{i=1}^{k} \alpha_i^2 F_i(A)^2 \, \sum_{i=1}^{k} \alpha_i^2 F_i(B)^2}} \]

where \(\alpha_i\) = user-determined parameters (weights), each approximately 1.

The cosine similarity method compares two documents through the terms they contain. With this method the highest-frequency words within a document have the largest influence on its similarity with other documents. Documents with many occurrences of an unusual word, or with many different unusual words, have low cosine similarity with most other documents. Weighting schemes are frequently used to modify the standard cosine measure; these typically lower the importance of common words. Below are the results for some input data and their similarities with the existing input database, obtained with this algorithm.

Input Database: This is the database which is already stored in the system. It is compared with the user-provided fault symptoms.
Defect ID   Fault Characteristics
32          Display ON Signal will be sent, but Display remains dark
40          Preconditions: radio hu message
41          radio: radio. message; audio hu radio message message
42          … hu -> audio Y
44          … message message radio radio
45          radio hu message message message
46          … message
47          … message no sds
48          no message sds …
49          radio hu
50          radio
51          radio radio
52          radio radio radio
53          radio does not receive message from headunit
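The term weighting and the cosine measure of this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: all weights \(\alpha_i\) are fixed to 1, texts are split on whitespace, the small database is made up for the example, and terms that do not occur in the database are ignored. The names `term_weights` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter

def term_weights(text, database):
    """F_i(T) = 0.5 * (1 + tf_i / max tf) * log(N / n_i) for each term of the text."""
    tf = Counter(text.lower().split())
    if not tf:
        return {}
    max_tf = max(tf.values())
    n_entries = len(database)
    weights = {}
    for term, freq in tf.items():
        n_i = sum(1 for entry in database if term in entry.lower().split())
        if n_i == 0:
            continue  # assumption: terms unknown to the database carry no weight
        weights[term] = 0.5 * (1 + freq / max_tf) * math.log(n_entries / n_i)
    return weights

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse weight vectors (all alpha_i = 1)."""
    numerator = sum(a[t] * b[t] for t in set(a) & set(b))
    denominator = math.sqrt(sum(v * v for v in a.values()) *
                            sum(v * v for v in b.values()))
    return numerator / denominator if denominator else 0.0

# Made-up database entries, for illustration only.
database = ["radio hu", "radio", "message sds", "display dark"]
query = term_weights("radio hu", database)
scores = [cosine_similarity(query, term_weights(entry, database)) for entry in database]
print([round(score, 2) for score in scores])
```

An identical entry scores 1.0, an entry sharing only the more common term scores lower, and entries without a shared term score 0. The percentages reported below come from the real fault database and are not reproduced by this toy example.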
Result Analysis: The graphs below show the fault similarities determined for user-provided fault symptoms.

For the fault symptom radio hu, the similarity is 100% with the database fault radio hu (fault id 49) and 68% with the database fault Preconditions: radio hu message (fault id 40). The result of this fault matching is shown in figure 3.

Fig. 3 Fault similarities with symptom radio hu (similarity % against fault id).

For the fault symptom radio hu message, the similarity is 100% with the database fault Preconditions: radio hu message (id 40), 84% with the database fault radio: radio. message; audio hu radio message message (id 41) and 84% with the database fault radio hu message message message (id 45). The result of this fault matching is shown in figure 4.

Fig. 4 Fault similarities with symptom radio hu message (similarity % against fault id).

For the fault symptom radio dvd message, the similarity is 61% with the database fault radio: radio. message; audio hu radio message message (id 41) and 56% with the database fault radio hu message message message (id 45). The result of this fault matching is shown in figure 5.

Fig. 5 Fault similarities with symptom radio dvd message (similarity % against fault id).

For the fault symptom radio dvd, the similarity is 58% with each of the database faults radio (id 50), radio radio (id 51) and radio radio radio (id 52). The result of this fault matching is shown in figure 6.

Fig. 6 Fault similarities with symptom radio dvd (similarity % against fault id).
Based on the above result analysis it can be concluded that the similarity of a user provided fault is higher if the symptom of the fault matches more closely with any database fault. It satisfies the requirement of finding similar faults for a fault symptom.
7. Conclusion
In this paper feature selection, pattern recognition and page ranking algorithms have been discussed for processing the input data and the system database. Different similarity matching algorithms have also been explained. The cosine similarity algorithm has been chosen for our special application of automotive infotainment systems. Real field automotive faults have been used to analyse the cosine similarity method. After comparison with the existing fault database, the similarity of each user-provided fault to the stored faults has been determined and the results have been discussed.
References
[1] Eric Brill, Jimmy Lin, Michele Banko, Susan Dumais, Andrew Ng. Data-Intensive Question-Answering. In Proceedings of the Tenth Text Retrieval Conference (TREC 2001), Maryland, November 2001.
[2] Tapas Kanungo, David Mount, Nathan Netanyahu, Christine Piatko, Ruth Silverman, Angela Wu. A Local Search Approximation Algorithm for k-Means Clustering. 18th Annual ACM Symposium on Computational Geometry (SoCG’02), Barcelona, Spain, June 2002.
[3] Shmuel Brody. Cluster-Based Pattern Recognition in Natural Language Text. Master Thesis. August 2005.
[4] Faramarz Valafar. Pattern Recognition Techniques in Microarray Data Analysis: A Survey. Special issue of Annals of New York Academy of Sciences, Techniques in Bioinformatics and Medical Informatics. (980) 41-64, December 2002.
[5] Karen Kukich. Techniques for automatically correcting words in text. ACM Computing Surveys, 24(4):377–439, December 1992.
[6] Yu Tao, V. Muthukkumarasamy, B. Verma, M. Blumenstein. A texture extraction technique using 2D-DFT and Hamming distance. Fifth International Conference on Computational Intelligence and Multimedia Applications (ICCIMA 2003), 2003.
Mashud Kabir. I was born in Narayanganj, Bangladesh in 1976. I completed my Bachelor of Science (BSc) in Electrical & Electronic Engineering at Bangladesh University of Engineering & Technology (BUET) in 2000 and was awarded a board scholarship from 1995 to 2000. I earned my Master of Science (MSc) in Communication Engineering from the University of Stuttgart, Germany in 2003, and held a STIEBET German Government scholarship during my Master's studies. My Master's thesis was “Region-Based Adaptation of Diffusion Protocols in MANETs”, in which up to 21% of broadcasts can be saved. I worked at the Mercedes-Benz Technology Center, Germany from 2003 to 2005 as a PhD student. I have worked in research & development projects of BMW, Land-Rover and Audi in the Automotive Infotainment Network area for more than four years. I received my Doctoral degree from the Department of Computer Science, University of Tuebingen, Germany in 2008. My dissertation topic was “Intelligent System for Fault Diagnosis in Automotive Applications”.
ISSN (Online): 1694-0784 ISSN (Printed): 1694-0814
Prototype System for Retrieval of Remote Sensing Images based on Color Moment and Gray Level Co-Occurrence Matrix
Priti Maheshwary1, Namita Srivastava2
1 Department of Computer Application, Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, India
[email protected]
2 Department of Mathematics, Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, India
[email protected]
Abstract The remote sensing image archive is growing day by day. The storage, organization and retrieval of these images pose a challenge to the scientific community. In this paper we have developed a system for the retrieval of remote sensing images on the basis of color moment and gray level co-occurrence matrix feature extractors. The results obtained with the prototype system are encouraging. Key words: Remote Sensing Image Retrieval, Color Moment, Gray Level Co-occurrence Matrix, Clustering index.
1. Introduction
Content-based image retrieval (CBIR) technology was proposed in the 1990s. It is an image retrieval technology that searches images using visual contents such as color, texture, shape and spatial relationship, rather than image annotations. It resolves some problems of traditional image retrieval; for example, manual annotation of images imposes a large workload on users and yields inaccurate, subjective descriptions. After more than one decade, it has developed into a content-based visual information retrieval technology covering both image and video information. Great progress has been made in theory and applications. At present, CBIR technology has been applied successfully in fields such as face recognition, fingerprint recognition, medical image databases and trademark registration, for example in the QBIC system of IBM Corporation, the Photobook system of the MIT Media Laboratory and the Virage system of Virage Corporation. It is difficult to apply these systems to a massive remote sensing image archive, because remote sensing images have many distinguishing features, including various data types, a large volume of data, different resolution scales and different data sources, which restrict the application of CBIR technology in the remote sensing image field. In order to change the current situation, the following problems must be resolved:
1) Storing massive remote sensing image data.
2) Designing a reasonable physical and logical pattern for the remote sensing image database.
3) Adopting adaptive image feature extraction algorithms.
4) Adopting an indexing structure for search.
5) Designing a reasonable content-based searching system for a massive remote sensing image database.
The rest of the paper is arranged as follows. In Sec. 2, we discuss the methodology. In Sec. 3, the experimental setup and the results obtained are discussed. We conclude in Sec. 4.
2. Methodology
For practical applications, users are often interested in partial regions or targets in a remote sensing image, such as military targets, public targets and ground resource targets, rather than in the entire image. For example, small-scale important targets and regions attract more attention in applications than the entire remote sensing image. The image slice features of these important targets and regions, extracted by color, texture, shape, spatial relationship, etc., are stored in a feature database. Efficient indexing technology is a key factor in applying content-based image retrieval successfully to a massive image database. Indexing technology was developed for traditional databases and has subsequently been applied in the content-based image retrieval field. Fig. 1 shows an architectural framework for content-based remote sensing image retrieval. Traditionally, satellite image classification has been done at the pixel level. For a typical LISS III image with 23.5 m resolution, a 100 × 100 pixel image patch covers roughly 5.5 km². This is too large an area to represent precise ground segmentation, but our focus is more on building a querying and browsing system than on showing exact boundaries between classes. Dividing the image into rectangular patches makes it very convenient for training as well as browsing. Since users of such systems are generally more interested in getting an overview of the location, zooming and panning are allowed optionally as part of the interface.
Figure 1: Architectural Framework of CBIR system

We have developed a prototype system for image retrieval. A query image is taken, and images similar to it are found on the basis of color and texture similarity. The four main tasks of the system are:
1. Color moment feature extraction.
2. GLCM texture feature extraction.
3. K-means clustering to form the index.
4. Retrieval between the query image and the database.

2.1 Color moment
We define the value of the ith color channel at the jth image pixel as $p_{ij}$, with $N$ the number of pixels in the image. The three color moments can then be defined as:
MOMENT 1 – Mean: $E_i = \frac{1}{N}\sum_{j=1}^{N} p_{ij}$
The mean can be understood as the average color value in the image.
MOMENT 2 – Standard Deviation: $\sigma_i = \sqrt{\frac{1}{N}\sum_{j=1}^{N}\left(p_{ij} - E_i\right)^2}$
The standard deviation is the square root of the variance of the distribution.
MOMENT 3 – Skewness: $s_i = \sqrt[3]{\frac{1}{N}\sum_{j=1}^{N}\left(p_{ij} - E_i\right)^3}$
Skewness can be understood as a measure of the degree of asymmetry in the distribution.

2.2 Grey-level co-occurrence matrix texture
Grey-Level Co-occurrence Matrix (GLCM) texture measurements have been the workhorse of image texture analysis since they were proposed by Haralick in the 1970s. To many image analysts, they are a button pushed in the software that yields a band whose use improves classification, or not. The original works are necessarily condensed and mathematical, making the process difficult to understand for the student or front-line image analyst. The selected features are calculated using only the values in the GLCM:
i) Contrast
ii) Correlation
iii) Energy
iv) Homogeneity
These features are calculated with distance 1 and angles of 0, 45 and 90 degrees.
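The feature extraction of Sections 2.1 and 2.2 can be sketched in a few lines. The code below is a minimal illustration, not the implementation used in the prototype: it computes the three color moments of one channel, builds a non-symmetric normalised GLCM for a given pixel offset, and derives the four texture features using their commonly used textbook definitions. The function names, the choice of a non-symmetric matrix, the `1 + |i - j|` form of homogeneity and the value returned for correlation on a constant image are all assumptions of this sketch.

```python
import math

def color_moments(channel):
    """Mean, standard deviation and skewness of one color channel (flat list of pixel values)."""
    n = len(channel)
    mean = sum(channel) / n
    variance = sum((p - mean) ** 2 for p in channel) / n
    third = sum((p - mean) ** 3 for p in channel) / n
    # Signed cube root, so that a negative third moment gives negative skewness.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, math.sqrt(variance), skew

def glcm(image, dx, dy, levels):
    """Normalised co-occurrence matrix for pixel pairs separated by (dx, dy).

    image is a list of rows of integer grey levels in range(levels).
    Assumption: pairs are counted in one direction only (non-symmetric GLCM).
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("image too small for this offset")
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, correlation, energy and homogeneity of a normalised GLCM p."""
    levels = len(p)
    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + abs(i - j)) for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    if sd_i == 0 or sd_j == 0:
        correlation = 0.0  # assumption: correlation is undefined here, report 0
    else:
        correlation = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells) / (sd_i * sd_j)
    return {"contrast": contrast, "correlation": correlation,
            "energy": energy, "homogeneity": homogeneity}

# (dx, dy) offsets for distance 1 at 0, 45 and 90 degrees; image rows grow downwards.
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1)}
```

A texture vector for one patch could then be assembled by calling `glcm_features(glcm(patch, dx, dy, levels))` for each entry of `OFFSETS` and concatenating the results with the color moments of each band.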
2.3 K-Means Clustering
A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters. The k-means algorithm is well suited to data mining because of its efficiency in processing large data sets. It is built upon four basic operations:
1. Selection of the initial k means for the k clusters.
2. Calculation of the dissimilarity between an object and the mean of a cluster.
3. Allocation of an object to the cluster whose mean is nearest to the object.
4. Re-calculation of the mean of a cluster from the objects allocated to it, so that the intra-cluster dissimilarity is minimized.
The advantage of the K-means algorithm is that it works well when clusters are not well separated from each other, a situation frequently encountered in images. The cluster number allotted to each image is considered its class or group.
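The four operations listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the initial means are simply the first k feature vectors (the text does not say how they are selected), Euclidean distance is used as the dissimilarity, and the function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, max_iter=100):
    """Return (labels, means) for a list of feature vectors."""
    # Operation 1: initial means (assumption: the first k points).
    means = [list(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Operations 2 and 3: assign each object to the cluster with the nearest mean.
        new_labels = [min(range(k), key=lambda c: euclidean(p, means[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the means will not change either
        labels = new_labels
        # Operation 4: recompute each mean from the objects allocated to it.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                means[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, means
```

For example, `kmeans([(0, 0), (10, 10), (0, 1), (10, 11)], 2)` groups the first and third vectors into cluster 0 and the other two into cluster 1; the returned label plays the role of the class number used later for retrieval.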
2.4 Similarity Matching
Many similarity measures have been developed for image retrieval based on empirical estimates of the feature extraction. We have used the Euclidean distance for similarity matching. The Euclidean distance between two points P = (p1, p2, …, pn) and Q = (q1, q2, …, qn) in Euclidean n-space is defined as:
$d(P, Q) = \sqrt{\sum_{i=1}^{n} \left(p_i - q_i\right)^2}$
For retrieval, the user selects a query patch; on the basis of its class number, the distance between the query patch and the other images of that class is calculated, and the images are retrieved.
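The retrieval step can be illustrated with a short sketch. It assumes that each database entry is a hypothetical tuple of patch identifier, class number (the k-means cluster label) and feature vector, and that results are returned in order of increasing distance and cut off at `top_n`; the text does not specify the ordering or the cut-off, so both are assumptions.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def retrieve(query_vec, query_class, database, top_n=5):
    """Return the ids of the nearest patches that share the query's class.

    database: list of (patch_id, class_number, feature_vector) tuples.
    """
    candidates = [(euclidean(query_vec, vec), patch_id)
                  for patch_id, cls, vec in database
                  if cls == query_class]
    candidates.sort()  # nearest first (assumption)
    return [patch_id for _, patch_id in candidates[:top_n]]
```

Restricting the comparison to one class means that only a fraction of the database is scanned per query, which is the purpose of using the cluster number as an index.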
Figure 2: Water bodies
3. Experimental Plan
For our experiments, we use 3 LISS III + multi-spectral satellite images with 23.5 m resolution. We choose to support 4 semantic categories in our experimental system, namely mountain, water bodies, vegetation, and residential area. In consultation with an expert in satellite image analysis, we choose the near-IR (infra-red), red and green bands as the three spectral channels for classification as well as display. The reasons for this choice are as follows. The near-IR band is selected over the blue band because of a somewhat inverse relationship between a healthy plant’s reflectivity in near-IR and red, i.e., healthy vegetation reflects high in near-IR and low in red. The near-IR and red bands are key to differentiating between vegetation types and states. Blue light is very abundant in the atmosphere and is scattered in all directions, so the blue band is very noisy and its use is often avoided. Visible green is used because it is less noisy and provides unique information compared to near-IR and red.
The pixel dimensions of each satellite image used in our experiments are 720 × 540, with geographic dimensions of approximately 51.84 km × 38.88 km. The choice of patch size is critical. A patch should be large enough to encapsulate the visual features of a semantic category, while being small enough to include only one semantic category in most cases. We choose a patch size of 100 × 100 pixels. We obtain 80 patches from all the images in this manner. These patches are stored in a database along with the identity of their parent images and their relative location within them. Ground truth categorization is not readily available for our patches.
Figure 3: Open Land with vegetation
Figure 4: Buildings
The four major classes of images are shown in Figures 2 to 5. Figures 6 and 7 show the content-based retrieval system. We obtain 80% to 83% accuracy in our results.
Figure 5: Vegetation and Mountain
Figure 6: CBIR System
Figure 7: Screen 2 of CBIR System

4. Conclusions
We have developed a prototype system for retrieving images similar to a given query image. We obtain fruitful results on the example images used in the experiments. This technique can be used for mining similar images based on content and a knowledge base, for finding vegetation, water or building areas.

5. References
[1] Li, J., Wang, J. Z. and Wiederhold, G., “Integrated Region Matching for Image Retrieval,” ACM Multimedia, 2000, pp. 147-156.
[2] Flickner, M., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., Gorkani, M., Hafner, J., Lee, D., Petkovic, D., Steele, D. and Yanker, P., “Query by image and video content: The QBIC system,” IEEE Computer, 28(9), 1995, pp. 23-32.
[3] Pentland, A., Picard, R. and Sclaroff, S., “Photobook: Content-based manipulation of image databases,” International Journal of Computer Vision, 18(3), 1996, pp. 233-254.
[4] Smith, J. R. and Chang, S. F., “Single color extraction and image query,” in Proceedings of the IEEE International Conference on Image Processing, 1997, pp. 528-531.
[5] Gupta, A. and Jain, R., “Visual information retrieval,” Comm. Assoc. Comp. Mach., 40(5), 1997, pp. 70-79.
[6] Eka Aulia, “Hierarchical Indexing for Region based image retrieval,” a thesis submitted to the Graduate Faculty of the Louisiana State University and Agricultural and Mechanical College.
[7] Shi, J. and Malik, J., “Normalized Cuts and Image Segmentation,” Proceedings Computer Vision and Pattern Recognition, June 1997, pp. 731-737.
[8] Smith, J., “Color for Image Retrieval,” Image Databases: Search and Retrieval of Digital Imagery, John Wiley & Sons, New York, 2001, pp. 285-311.
[9] Zhang, R. and Zhang, Z., “A Clustering Based Approach to Efficient Image Retrieval,” Proceedings of the 14th IEEE International Conference on Tools with Artificial Intelligence, 2002, pp. 339.
Performing Hybrid Recommendation in Intermodal Transportation – the FTMarket System’s Recommendation Module
Alexis Lazanas
Industrial Management and Information Systems Lab, University of Patras, Rion Patras, 26500, Greece
[email protected]
Abstract
Diverse recommendation techniques have already been proposed and encapsulated into several e-business applications, aiming to perform a more accurate evaluation of the existing information and accordingly augment the assistance provided to the users involved. This paper reports on the development and integration of a recommendation module in an agent-based transportation transactions management system. The module is built according to a novel hybrid recommendation technique, which combines the advantages of collaborative filtering and knowledge-based approaches. The proposed technique and supporting module assist customers in considering in detail alternative transportation transactions that satisfy their requests, as well as in evaluating completed transactions. The related services are invoked through a software agent that constructs the appropriate knowledge rules and performs a synthesis of the recommendation policy.
Key words: Data mining, Knowledge Association Rules, Recommender systems, Intermodal Transportation.
1. Introduction
Transportation management involves diverse decision-making issues, which are basically related to the selection of the appropriate route and carrier. Such issues arise mainly from the variety of the customer’s preferences (e.g. cost limitations, loading preferences, delivery dates) and the carrier’s service resources (e.g. transportation media, available itineraries, capacity). The matching between these preferences and the offered services cannot easily be handled manually, as in most cases a plethora of alternative options exists, while time and money limitations are ubiquitous. Generally speaking, transportation transactions management requires quick and cost-effective solutions to the customers’ demands for both distribution and shipping operations. In cases where many alternatives exist, there is an urgent need for recommendations: the customer should be assisted in properly evaluating the proposed alternatives and making his/her final decision.
Recommendation systems have been described as systems that produce individualized recommendations, or that guide the user in a personalized way, in environments where the amount of on-line information vastly outstrips any individual’s capability to survey it [2]. Generally speaking, such systems represent the users’ preferences for the purpose of suggesting elements to purchase or evaluate. Fundamental applications can be found in the fields of electronic commerce and information retrieval, where they provide suggestions that effectively direct users to the elements that best satisfy their needs and preferences [21].
This paper reports on the development of an innovative recommendation module that provides valuable assistance to the users of a transportation transactions management system, namely FTMarket (Freight Transportation Market). FTMarket is fully implemented and handles various types of transportation transactions [14, 10]. It exploits a series of dedicated software agents that represent and act for any type of user involved in a transportation scenario (such as customers who look for efficient ways to ship their products and transport companies that may, fully or partially, carry out such requests), while they cooperate and obtain the related information in real time [24]. Our overall approach is based on flexible models that achieve efficient communication among all parties involved, coordinate the overall process, construct possible alternative solutions and perform the required decision-making [10, 12]. In addition, FTMarket is able to handle the complexity that is inherent in such environments [6], which concerns freighting and fleet scheduling processes, as well as “modular transportation solutions”¹. FTMarket provides the customer with a set of alternative solutions for each requested transaction. These solutions are constructed through the use of a specially developed algorithm for retrieving optimal and sub-optimal solutions. Moreover, through a dedicated recommender agent [9, 22], which builds on Web Services concepts [26], the system further assists the customer in making the appropriate decisions.
¹ To further explain this concept, consider the case where a customer wants to convey some goods from place A to place B, while there is no transport company acting directly between these two places. Supposing that two available carriers X and Y have some scheduled itineraries from A to C and from C to B, respectively, it is obvious that a possible solution to the above customer’s request is to involve both X and Y and fragment the intended overall itinerary into the related sub-routes. It is also noted that these carriers may be associated with diverse transportation means, such as trains, trucks, ships and airplanes.
The remainder of this paper is structured as follows: Section 2 reports on background issues from the area of recommender systems, paying particular attention to recommendation approaches. Section 3 describes the basic aspects of our approach, which concern the selection of transportation plans and the evaluation of alternative solutions. Section 4 focuses on issues raised during the integration of the recommendation module, the formulation of the recommendation policy, and the exploitation of software agents and Web Services technologies. Finally, Section 5 concludes the paper and highlights future work directions.

2. Related Work
The most widely adopted recommendation techniques are Collaborative Filtering (CF) and Knowledge Based Recommendation (KBR), each one possessing its own strengths and weaknesses. Collaborative Filtering [17, 18] is the most commonly used recommendation technique to date. The basic idea of CF-based algorithms is to provide item recommendations or predictions based on the opinions of other like-minded users. In a typical CF scenario, there is a list of m users U = {u1, u2, …, um} and a list of n items I = {i1, i2, …, in}. Each user ui is associated with a list of items Iui, for which the user has expressed his/her opinion. Opinions can be explicitly given by the user as a rating score (within a certain numerical scale), or implicitly derived from transaction records (by analyzing timing logs, mining web hyperlinks and so on). For a particular user ua, the task of a collaborative filtering algorithm is to find an item likeness that can be of two forms:
• Prediction: this is a numerical value, Pi, expressing the predicted likeness of item i (i does not belong to Iua) for the user. The predicted value is within the same scale (e.g. from 1 to 5) as the opinion values provided by ua [19].
• Recommendation: this is a list of N items Ir (Ir is a subset of I) that the user will like most (the recommended list must contain items not already selected by the user). This outcome of CF algorithms is also known as Top-N recommendation [20].
On the other hand, KBR attempts to suggest objects based on inferences about a user’s needs and preferences. In some sense, all recommendation techniques could be described as doing some kind of inference. Knowledge-based approaches are distinguished in that they utilize functional knowledge; in other words, they have knowledge about how a particular item meets a particular user need and can therefore reason about the relationship between a need and a possible recommendation. The user profile can be any knowledge structure that supports this inference. In the simplest case, as in Google, it may simply be the query that the user has formulated. The Entrée system and several other recent systems [23] employ techniques from case-based reasoning for knowledge-based recommendations. The knowledge used by a knowledge-based recommender system can take many forms. Google uses information about the links between web pages to infer popularity and authoritative value [1]. Entrée uses knowledge of cuisines to infer similarity between restaurants. Utility-based approaches calculate a utility value for objects to be recommended; in principle, such calculations could be based on functional knowledge. However, existing systems do not use such inference mechanisms, thus requiring users to do their own mapping between their needs and the features of products, either in the form of preference functions for each feature, as in the case of Tête-à-Tête, or answers to a detailed questionnaire, as in the case of PersonaLogic [2]. Knowledge-based recommender systems are prone to the drawback of all knowledge-based systems: the need for knowledge acquisition. More specifically, three types of knowledge are involved in such systems:
• Catalog knowledge: Knowledge about the objects being recommended and their features. For example, the system should know that “Gasoline” is a type of “Fuel”.
• Functional knowledge: The system must be able to match the user’s needs with the object that might satisfy those needs. For example, a recommendation module should know that the transportation of toxics requires a higher safety level.
• User knowledge: To provide good recommendations, the system must have some knowledge about the user. This might take the form of general demographic information or specific information about the need for which a recommendation is sought.
Of these knowledge types, the last one is the most challenging, as it is an instance of the general user-modelling problem [25].
Despite this drawback, knowledge-based recommendation has some beneficial characteristics. First of all, it is appropriate for casual exploration, because it demands less from the user (compared to utility-based recommendation). Moreover, it does not involve a start-up period during which its suggestions are of low quality. On the other hand, a knowledge-based recommender cannot “discover” user niches the way collaborative systems can. However, it can make recommendations as wide-ranging as its knowledge base allows. Alternative techniques have been proposed in the literature in order to handle the above issues [11]. Having thoroughly considered their pros and cons, our approach follows a hybrid recommendation technique. Generally speaking, CF and KBR techniques can be combined in hybrid recommendation systems in order to improve their performance. Most commonly, CF is combined with some other technique in an attempt to minimize or avoid the ramp-up problem [3].
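To make the two CF outputs described earlier in this section concrete, the following minimal sketch computes a prediction Pi as a similarity-weighted average of other users’ ratings and derives a Top-N list from those predictions. The cosine similarity over co-rated items, the function names and the data layout (a dictionary of per-user rating dictionaries) are assumptions chosen for illustration; they are not the CF algorithm used in FTMarket.

```python
import math

def cosine(ra, rb):
    """Cosine similarity of two users over the items both have rated."""
    common = set(ra) & set(rb)
    if not common:
        return 0.0
    dot = sum(ra[i] * rb[i] for i in common)
    na = math.sqrt(sum(ra[i] ** 2 for i in common))
    nb = math.sqrt(sum(rb[i] ** 2 for i in common))
    return dot / (na * nb) if na and nb else 0.0

def predict(ratings, user, item):
    """Prediction: similarity-weighted average of other users' ratings for item."""
    num = den = 0.0
    for other, r in ratings.items():
        if other != user and item in r:
            s = cosine(ratings[user], r)
            num += s * r[item]
            den += abs(s)
    return num / den if den else None  # None: no like-minded user rated the item

def top_n(ratings, user, n):
    """Recommendation: the n unseen items with the highest predicted rating."""
    seen = set(ratings[user])
    items = {i for r in ratings.values() for i in r} - seen
    scored = [(predict(ratings, user, i), i) for i in items]
    scored = [(p, i) for p, i in scored if p is not None]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:n]]
```

Because every rating is a positive score, the weighted average stays within the rating scale, which matches the requirement that a prediction lie in the same scale as the opinions provided by the user.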
3. The Proposed System
3.1 Transportation plans and evaluation of alternative solutions
The recommendation procedure adopted in our approach is closely associated with the user’s selection of the appropriate transportation plan. A transportation plan typically defines the user preferences for the upcoming transactions. The five alternative plans offered are:
• Express
• Economic
• Safe
• Dependable
• User Defined
Each of the first four plans declares a specific tendency in the recommendation strategy to be followed by the system: it either minimizes the overall duration or cost (first two plans), or retains a high level of safety or dependability (third and fourth plans) of the suggested itineraries. The last choice offers the possibility of a user-customized plan definition. Such a plan may combine parameters from all four of the above plans. The selection of one of these plans will influence the recommendation process of our approach for the particular user.
Figure 1: Transaction’s request interface
As shown in Figure 1, which depicts the system’s interface for handling a user’s request, the user provides input about the loading and delivery terminals and the quantity to be transported, expresses his/her preferences concerning the maximum cost and duration of the transaction, and selects the desired transportation plan. When the “user-defined” plan is selected, a new window appears, allowing the user to adjust the criteria (cost, duration, safety, dependability) of his/her transportation request.

Table 1: Selection criteria for the alternative transportation plans (safety and dependability take values from the set {very low, low, average, high, very high}).

Plan          Cost            Duration        Safety          Dependability
Express       Any             Min             Any             Any
Economic      Min             Any             Any             Any
Safe          Any             Any             > Average       ≥ Low
Dependable    Any             Any             ≥ Low           > Average
Hybrid        User Defined    User Defined    User Defined    User Defined
During the construction of the available transportation solutions, our approach excludes solutions that do not comply with the customer’s requirements. More specifically, a set of predefined rules is employed to exclude the alternative solutions that do not correspond to the specific freight transportation’s requirements and customer preferences. Table 1 summarizes the constraints to be met for each transportation plan (for the “User Defined” plan, this process takes into account the constraints set by the user). In all cases, solutions that do not satisfy these constraints are discarded.
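The exclusion step can be sketched as a simple filter over candidate solutions. The thresholds below are transcribed from the Safe and Dependable rows of Table 1; everything else is an assumption made for illustration: the dictionary layout of a solution, the function names, the treatment of the “Min” entries as ranking objectives rather than exclusion rules, and the optional `max_cost` and `max_duration` limits standing in for the bounds the user enters in the request interface.

```python
# Ordinal scale for safety and dependability, as given with Table 1.
LEVELS = ["very low", "low", "average", "high", "very high"]

# (operator, level) thresholds per plan, transcribed from Table 1.
# "Min" entries are treated here as ranking objectives, not as filters (assumption).
PLAN_RULES = {
    "Express": {},
    "Economic": {},
    "Safe": {"safety": (">", "average"), "dependability": (">=", "low")},
    "Dependable": {"safety": (">=", "low"), "dependability": (">", "average")},
}

def satisfies(solution, plan, max_cost=None, max_duration=None):
    """True if a candidate solution meets the user's bounds and the plan's thresholds."""
    if max_cost is not None and solution["cost"] > max_cost:
        return False
    if max_duration is not None and solution["duration"] > max_duration:
        return False
    for criterion, (op, level) in PLAN_RULES[plan].items():
        have = LEVELS.index(solution[criterion])
        need = LEVELS.index(level)
        if op == ">" and not have > need:
            return False
        if op == ">=" and not have >= need:
            return False
    return True

def filter_solutions(solutions, plan, **limits):
    """Discard every solution that violates the constraints of the selected plan."""
    return [s for s in solutions if satisfies(s, plan, **limits)]
```

A user-defined plan could be handled in the same way by adding an entry to `PLAN_RULES` built from the criteria the user adjusts in the dedicated window.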
3.2 A Methodology for the Selection of Alternative Route Paths
In our former work [10, 27], we presented an algorithm for constructing optimal (direct or modular) solutions for a requested transportation transaction. This algorithm took into account the cost and duration of each sub-route, as well as the cost and duration upper bounds set by the user. If no optimal solution could be constructed, the algorithm terminated without providing any solutions. To better handle such cases, our approach uses an elaborated version of Dijkstra’s shortest path algorithm [4] to construct sub-optimal solutions. Even if such solutions cannot be characterized as optimal, they represent acceptable alternatives for a specific transportation request.
As described in the related literature [4], shortest path algorithms use a bidirectional, single-weighted graph to represent a connected set of vertices (Vi) through a number of arcs Aij (from Vi to Vj). Our algorithm takes into consideration each Aij and its corresponding weight (Wij) in order to produce a route path from a starting point (S) to an ending point (E) that minimizes the total weight (WSE). The difficulty in our case lies in the presence of a pair of variables that affect each arc’s weight, namely the cost and the duration. Because there are two weights for each arc, we faced the problem of unifying them into a single one in order to proceed with the ranking of the solutions. As shown in Figure 2, the weight (Wij) of each arc Aij consists of a cost weight (Wcost-ij) and a duration weight (Wduration-ij):
$W_{ij} = W_{\text{cost-}ij} + W_{\text{duration-}ij}$ (1)
Figure 2: A hypothetical 2-weighted graph.
Having defined the total weight for each arc (Aij), we encountered the problem of adding two parameters that are measured in different units (Euros and hours, respectively). This problem was addressed by applying a normalization technique that divides the cost and the duration of an arc by the corresponding maximum cost and duration of the sub-route:
$W_{\text{duration-}ij} = \frac{duration_{ij}}{\max(duration_{ij})}$ (2)
$W_{\text{cost-}ij} = \frac{cost_{ij}}{\max(cost_{ij})}$ (3)
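Equations (1) to (3) can be illustrated with a short sketch that normalizes each arc, sums the two components into one weight and runs a plain Dijkstra search on the result. It is not the elaborated algorithm of the paper. The assumptions are: arcs are stored as a hypothetical dictionary mapping `(from, to)` to `(cost, duration)`; the maxima in Eqs. (2) and (3) are taken over all arcs of the graph; arcs are directed, so a two-way link is entered twice; and the optional `cost_coef` and `duration_coef` arguments are a convenience of this sketch that reproduce Eq. (1) when left at 1.0.

```python
import heapq

def unified_weights(arcs, cost_coef=1.0, duration_coef=1.0):
    """Map {(u, v): (cost, duration)} to {(u, v): single normalized weight}."""
    max_cost = max(cost for cost, _ in arcs.values())
    max_dur = max(dur for _, dur in arcs.values())
    return {arc: cost_coef * cost / max_cost + duration_coef * dur / max_dur
            for arc, (cost, dur) in arcs.items()}

def shortest_path(weights, start, end):
    """Dijkstra's algorithm; returns (total weight, path) or None if unreachable."""
    graph = {}
    for (u, v), w in weights.items():
        graph.setdefault(u, []).append((v, w))
    heap = [(0.0, start, [start])]
    done = set()
    while heap:
        dist, node, path = heapq.heappop(heap)
        if node == end:
            return dist, path
        if node in done:
            continue
        done.add(node)
        for nxt, w in graph.get(node, []):
            if nxt not in done:
                heapq.heappush(heap, (dist + w, nxt, path + [nxt]))
    return None
```

With the arcs A to B at (100 Euros, 10 hours), A to C at (40, 2) and C to B at (50, 4), the direct arc has unified weight 2.0 while the modular route through C totals about 1.5, so the search prefers the two-leg itinerary.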
Another issue that came up after the weight normalization procedure concerned the ranking of the solutions. To address this problem, our approach provides the user with different solutions by using a pair of weight coefficients (costCoef and durationCoef) and by calculating solutions corresponding to alternative combinations of the weights of the cost and duration criteria (see Figure 3), according to the formula:
$W_{ij} = (costCoef \cdot W_{\text{cost-}ij}) + (durationCoef \cdot W_{\text{duration-}ij})$ (4)
The cost and duration coefficients take values from the set {0, 0.1, 0.2, …, 1}. The main idea of this process is to provide the algorithm with alternative weights (Wij), each one expressing a different combination of cost and duration parameters. At the beginning of this procedure, we calculate the weight of each sub-route by taking into consideration only the duration parameter (we set the cost coefficient to 0 and the duration coefficient to 1). Then, in a step-wise way, we decrease the duration coefficient by 0.1 (and, at the same time, increase the cost coefficient by 0.1). Finally, we calculate the sub-route’s weight taking into consideration only the cost parameter (the duration coefficient has become 0).

Figure 3: Weight coefficients’ variation. (Plot of the coefficients’ significance, from 0% to 100%, against the number of iteration, 1 to 11, for the duration and cost coefficients.)

This process is described in pseudo-code as follows:
{
  costCoef ← 0.0;
  durationCoef ← 1.0;
  step ← 0.0;
  while step ≤ 1.0 calculate
  {
    costCoef ← step;
    durationCoef ← 1 - step;
    weight[i][j] ← costCoef*Wcost + durationCoef*Wduration;
    perform shortest path algorithm;
    step ← step + 0.1;
  }
}

The outcome of the above process is then presented to the user. As shown in Figure 4 (which depicts an instance of the related system interface), the optimal routes for a transportation request from Athens to Patra have been retrieved. The basic characteristics of each route are presented in the main table of the web interface. By selecting the “View Details” option, the user is able to receive an analytical description of the sub-routes contained in each itinerary, as well as their corresponding characteristics. Solutions at this phase are ranked by default according to the cost; in any case, users may request alternative rankings by clicking on the corresponding column header.

Figure 4: Solutions produced by the system.

4. Integrating a Recommendation Module
4.1 A Hybrid Recommendation Methodology
The recommendation procedure begins immediately after the abovementioned construction of the alternative solutions. It is a complex process which is carried out in three basic phases:
• the evaluation of the carriers and the transactions data;
• the exploitation of transaction data through a data mining process, and
• the recommendation methodology selection or synthesis.
At the beginning of the process, the system stores all the appropriate data that are submitted by the user and are related to pending or completed transportation transactions. These data are of significant importance and will be further exploited by the data mining process. Moreover, in this phase the user evaluates (i.e. assigns a score to) the carrier(s) involved in a transaction through an appropriate interface.
The second phase of recommendation concerns the data mining process. Data mining is a useful decision support technique, which can be used to find trends and regularities in big volumes of data. At this phase, transactions data are gathered through knowledge construction processes. In our case, the data mining process constructs a model from the recommendation module’s database that may produce well-defined knowledge rules. This procedure is performed through SQL queries on the transactions’ tables. After the completion of this process, the constructed knowledge-based rules participate in the production of knowledge-based recommendation data that will be evaluated and synthesized in the last phase of recommendation.
The last phase of recommendation refers to the selection or synthesis of the appropriate recommendation technique. This objective will be reached through the definition of well structured rules that will be applied for each transaction. The Recommender Agent of our system takes the initiative to select the most appropriate recommendation technique. For example, for a particular itinerary from point i to point j, taking into consideration that the customer has selected a certain plan, a rule for the specific itinerary could lead to the recommendation of a carrier that is different than the one suggested by the CF technique, based on the carriers’ evaluation process described earlier in this section. The recommendation methodology described above is graphically presented in Figure 5, through a data flow diagram.
Figure 5: The data flow diagram for the recommendation methodology

Due to the large amount of data the recommendation module takes into account in order to provide knowledge-based recommendations, the database model has been thoroughly considered. The system’s database has been designed through the use of SQL Server 2005 Management Console, in order to better meet the customers’ needs. Much attention has been paid to the reorganization of the data tables’ fields, as well as to the representation of the entities’ relationships [16]. The database model that participates in the knowledge construction of the recommendation phase is presented in Table 2.

Table 2: Recommendation Module’s Database Model

Table Name                      Description
Transactions                    Transactions in progress
Transaction’s Subroutes         Transactions sub-routes in progress
Transactions_Rating             Completed Transactions’ evaluation
Carriers_Rating                 Carrier evaluation with completed
Users_Reliability               Customers reliability evaluation
Temp_Transactions               Proposed transaction itineraries
Temp_Transactions_Subroutes     Subroutes of the proposed itineraries

4.2 Calculation of Recommendation Score
After the ranking phase, each alternative route retrieved is evaluated. Our system retrieves all possible transportation routes that can be constructed for a given transaction request. These routes are presented to the user through an appropriately designed user interface, which enables the user either to select one of the proposed routes (in which case he/she will be asked to complete the transaction), or to be redirected to a user-friendly interface where he/she can receive recommendations for each separate route. The evaluation of a transaction is based on various criteria, such as:
• Cost
• Duration
• Safety
• Reliability
• Average scores of the above carriers’ elements
• Average scores of the sub-routes contained in the transaction
• The number of times that the specific route has been selected by other customers (popularity)
• Number of transloadings
The recommendation procedure is implemented through the evaluation of both the transactions and the transportation companies involved. It is a complex
IJCSI
IJCSI International Journal of Computer Science Issues, Vol. 3, 2009
30
ISSN (Online): 1694-0784 ISSN (Printed): 1694-0814
procedure, basically due to the fact that a modular solution may involve two or more carriers. It is obvious that a transaction can receive an overall negative evaluation, while - at the same time - a specific part could have been completed quite satisfactorily. The evaluation of a transaction is based on a set of criteria such as cost, duration, safety, dependability, average score of a carrier, itinerary’s popularity and number of transloadings [15]. Taking into consideration all the above issues, we define total the calculation formula of the overall score Oi, j of each transaction from point i to point j (for each sub-route of the itinerary). It is:
(
)
Oi,total = Oi,t j + Oi,s j + Oi,r j j
(5)
Ci,t j = The carrier’s score according to time, for the transportation from point i to j.
C si, j = The carrier’s score according to safety, for the transportation from point i to j.
Ci,r j = The carrier’s score according to dependability, for the transportation from point i to j.
Tt = The transaction’s score according to time.
Ts = The transaction’s score according to safety. Tr = The transaction’s score according to dependability. The expression avg(x) refers to the average value of the element x in the database, and the variables a,b,c are
O i, j = final
n
2 (O i,total - Ο i,cost j j )
i, j = 1
f S, E
∑
coefficients related with the user’s preferences according (6)
where Oi , j , Oi , j , Oi , j represent the score of the time, safety and dependability, respectively, for the transportation from point i to point j . The variable fS,E represents the number of transloadings of each proposed solution and is considered as a negative factor, assuming that a large number of transloadings could evoke damage in the product and increase the transaction’s completion time. The number of transloadings is related to the number of sub-routes (n) of each itinerary. It is: t
s
r
fS, E = n - 1, n > 1
(7)
Each one of the detailed scores is calculated according to the score that has been assigned to the carrier and each sub-route. It is:
⎡ avg(Ci,t j * ur) + avg(Tt * ur) ⎤⎦ a (8) Oi,t j = ⎣ 2
⎡ avg(Ci, j * ur) + avg(Ts * ur) ⎤⎦ b Osi, j = ⎣ s
2
O where
r i, j
(9)
⎡⎣avg(Ci,r j * ur) + avg(Tr * ur) ⎤⎦ c (10) = 2
to time, safety and dependability respectively. Having defined the detailed scores for each sub-route, we calculate the overall score
(O ) total S, E
for the proposed
itinerary from point S (start) to point E (end).
O
to ta l S,E
n
=
∑ i, j = 1
⎧ O i,t j + O i,s j + O i,r j ⎫ ⎨ ⎬ (11) ⎩ (a + b + c ) * n ⎭
For the calculation of $O^{total}_{S,E}$ we do not take into consideration the proposed cost of a transaction, due to the fact that the system evaluates it through its normalization. The evaluation of the cost is performed through the formula:
$$O^{cost}_{i,j} = \frac{\left( cost_{i,j} \right)}{\min\left( cost_{i,j} \right)} \tag{12}$$
where $\min(cost_{i,j})$ represents the minimum cost for the specific route. At this point we encapsulate into the overall score the cost’s score in order to recalculate a final score $O^{final}_{i,j}$ for the transaction, which will be the system’s final recommendation to the user. It is:
$$O^{final}_{i,j} = \sum_{i,j=1}^{n} \left[ \frac{\left( O^{total}_{i,j} - O^{cost}_{i,j} \right)^{2}}{f_{S,E}} \right] \tag{13}$$
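The scoring chain of Eqs. (7) to (13) can be illustrated with a minimal Python sketch. It is not the platform's implementation: the function names, the representation of ratings as (score, ur) pairs and the reading of Eq. (12) as cost divided by minimum cost are assumptions made here for illustration only.

```python
from statistics import mean


def detailed_score(carrier_ratings, transaction_ratings, weight):
    """Eqs. (8)-(10): half the sum of two reliability-weighted averages,
    multiplied by the user preference coefficient (a, b or c).

    Each rating is a (score, ur) pair, ur being the rating user's
    reliability coefficient (hypothetical data layout).
    """
    avg_carrier = mean(score * ur for score, ur in carrier_ratings)
    avg_transaction = mean(score * ur for score, ur in transaction_ratings)
    return (avg_carrier + avg_transaction) / 2 * weight


def itinerary_total(sub_route_scores, a, b, c):
    """Eq. (11): sum over the n sub-routes of (O_t + O_s + O_r) / ((a+b+c) * n)."""
    n = len(sub_route_scores)
    return sum((o_t + o_s + o_r) / ((a + b + c) * n)
               for o_t, o_s, o_r in sub_route_scores)


def cost_score(cost, min_cost):
    """Eq. (12), read here as the cost normalised by the minimum route cost."""
    return cost / min_cost


def final_score(total_scores, cost_scores):
    """Eq. (13): squared (total - cost) terms divided by f = n - 1 (Eq. (7))."""
    n = len(total_scores)
    if n <= 1:
        raise ValueError("Eq. (7) defines f only for n > 1")
    f = n - 1
    return sum((t - c) ** 2 / f for t, c in zip(total_scores, cost_scores))
```

For instance, `final_score([3.0, 2.0], [1.0, 1.0])` covers an itinerary with two sub-routes, so f = 1 and the result is 2² + 1² = 5.0.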
4.3 An Example
This subsection presents an example of the recommendation process and its runtime environment. Having performed the optimal routes retrieval algorithm [4, 15], the user is transferred to the recommendation interface, where the results of the recommendation process are presented (Figure 6). At this phase, the evaluation of the itineraries is executed. More specifically, for every solution that has been retrieved for a requested transaction, the user may further consider its sub-routes. For each sub-route, the system calculates the average score that the carrier has received for its reliability during the transaction, as well as the average score for the transaction’s duration. During the calculation of the above averages, the scores that each carrier (or each route) has received are multiplied by a user’s reliability coefficient. This gives more weight to a reliable user’s opinion than to a less reliable one. Reliability refers to the number of times that a user has rated an itinerary, not to whether his/her evaluations were considered strict. In addition to the above evaluation, a similar procedure takes place with respect to the safety and the overall carrier’s reliability during the transaction. Both the average score of the specific elements (duration, reliability, safety, general reliability) and the overall score are stored in the system’s database. When this procedure is completed for all itineraries’ sub-routes, an average of all scores is extracted. The final score of the itinerary is the sum of the carriers’ and the sub-routes’ overall score, normalized by the overall cost and the number of intermediate transloadings. Moreover, the system retrieves information related to the completion of the above itineraries and their corresponding frequency. This procedure aims at checking whether a specific itinerary is constantly selected by other users.
The popularity of each route is presented to the user later, in order not to affect his/her decision. Initially, the recommended solutions are shown to the user according to their final score (top table of the interface shown in Figure 6). The user may then see each solution’s details; by clicking on the “View Details” link (which appears at each entry of the top table), the interface expands dynamically and a second table appears (entitled “Sub-Route Details”), containing information about the sub-routes of the selected itinerary and the overall scores of each sub-route. Clicking on the “More Details” link, the user is provided with additional information about each sub-route (such as scores for its duration, safety and
reliability). Moreover (by exploiting the “Show” link at the “Top-10 Carriers” column), the user is given the opportunity to compare a sub-route’s carrier with any of the Top-10 carriers that exist for the particular sub-route (this is a common practice in CF techniques). In such a case, the interface of Figure 6 expands further and a third table, entitled “Top-10 Carriers”, appears. When selecting a carrier from this table, by clicking on the “Select” link, the corresponding differences (in terms of cost, duration and carrier’s rating) are presented in the bottom right part of the window (under the header “Additional Features”).
Figure 6: The recommendation module interface.
4.4 Implementation Issues
A new software agent, namely the Recommender Agent (RA), has been implemented and interconnected with a corresponding Web Service, in order to coordinate the overall recommendation process. The main tasks of the RA concern the coordination of the recommendation module, depending on the characteristics of each transaction. Through these formally modeled tasks, RA provides continuous assistance to customers, while it remains active and able to adapt its “behavior” to a rapidly changing environment. RA is responsible for the coordination of the whole process, as it interacts with the other software agents of the system [10]. Moreover, the recommendation policy of our system builds on Web Services concepts [26]. A Web Service is a URL-addressable software resource that performs functions and provides answers. It is constructed by taking a set of software functionality and wrapping it up so that the
services it performs are visible and accessible to other software applications. A Web Service can be discovered and leveraged by other Web Services, applications, clients, or agents. In other words, Web Services can request services from other Web Services, and they can expect to receive the results or responses from those requests. Moreover, Web Services communicate using an easy-to-implement standard protocol (SOAP). Web Services may interoperate in a loosely-coupled manner; they can request services across the Internet and wait for a response [5]. Due to the fact that external applications could exploit the proposed recommendation services, the implementation of the FTMarket’s recommendation module was performed according to Web Services concepts and standards.
Figure 7: The recommendation module architecture.
The overall architecture of the FTMarket’s recommendation module is illustrated in Figure 7. As shown, the module is appropriately wrapped in order to describe the kind of service to be provided. To be easily located by users, such descriptions of services are placed in a shared public registry. It is through this registry that users may look up the services they need each time (in any case, a Web Service can be directly accessed if one knows its URL and WSDL). The corresponding agent that needs functions provided by the specific Web Service sends the appropriate request as an XML document in a SOAP envelope. This protocol can work across a variety of mechanisms, either asynchronously or synchronously. Web Services may make requests of multiple services in parallel and wait for their responses. The set of services to be provided in the FTMarket platform will be increased in the future (it will constitute a services repository). It is noted that it is not necessary for all these services to be provided through a single server; multiple servers, located in distinct providers, may be used. Finally, our system’s Web Services are message-based. Interaction via message exchange means that instead of a client invoking functionality exposed as a Web Service, it sends a request to the Web Service to have the functionality invoked [7, 8]. In other words, what a Web Service exposes is the functionality of receiving a message. We have adopted a generic message interchange, which means that delivery of message content is independent of its format.
5. Conclusions
This paper has elaborated a series of issues related to the integration of hybrid recommendation techniques into an agent-based transportation transactions management platform. We proposed a hybrid recommendation module that combines different recommendation techniques in order to provide the user with more accurate and efficient suggestions. The overall recommendation process is coordinated by a software agent, which is responsible for carrying out multiple tasks, such as coordination of the recommendation module, selection of alternatives and knowledge synthesis through the exploitation of different recommendation techniques and algorithms. The presence of the Recommender Agent guarantees that the user will be provided with continuous recommendations, which are dynamically updated. Finally, we have exploited concepts related to Web Services in order to make the proposed recommendation functionalities accessible from external applications. Future work plans mainly concern the consideration of additional recommendation techniques, such as content-based or model-based techniques, and the exploitation of data mining algorithms in order to enhance the overall quality of the recommendations provided. The development of additional (local or remote) Web Services, which will be capable of carrying out more complex requests for recommendation techniques synthesis, is another major concern.
References
[1] S. Brin, L. Page, The anatomy of a large-scale hypertextual Web search engine, Computer Networks and ISDN Systems 30(1-7) (1998), pp. 107-117.
[2] R. Burke, Hybrid Recommender Systems: Survey and Experiments, User Modelling and User-Adapted Interaction 12 (2002), pp. 331-370.
[3] R. Burke, Integrating Knowledge-Based and Collaborative-Filtering Recommender Systems, Artificial Intelligence for Electronic Commerce, AAAI Technical Report WS-99-01 (1999), pp. 69-72.
[4] A. Crauser, K. Mehlhorn, U. Meyer, P. Sanders, A Parallelization of Dijkstra's Shortest Path Algorithm, Proceedings of 23rd International Symposium, MFCS'98 (1998), Brno, Czech Republic.
[5] G. Fox, W. Wu, A. Uyar, H. Bulut, A Web services framework for collaboration and audio videoconferencing, International Multiconference in Computer Science and Computer Engineering, Internet Computing (2002), Las Vegas, USA.
[6] G. Froehlich, H. J. Hoover, W. Liew, P. Sorenson, Application Framework Issues when Evolving Business Applications for Electronic Commerce, Information Systems 24(6) (1999), pp. 457-473.
[7] D. Greenwood, M. Calisti, Engineering Web Service - Agent Integration, IEEE Conference of Systems, Man and Cybernetics (2004), The Hague.
[8] E. Hanson, P. Nandi, D. Levine, Conversation Enabled Web Services for Agents and e-Business, Proceedings of International Conference of Internet Computing, Computer Science Research, Education and Applications (CSREA) Press (2002), pp. 791-796.
[9] N. Jennings, P. Faratin, T.J. Norman, P. O'Brien, B. Odgers, Autonomous Agents for Business Process Management, International Journal of Applied Artificial Intelligence 14(2) (2000), pp. 145-189.
[10] N. Karacapilidis, A. Lazanas, G. Megalokonomos, P. Moraitis, On the Development of a Web-based System for Transportation Services, Information Sciences, 176(13) (2006), pp. 1801-1828.
[11] N. Karacapilidis, L. Hatzieleftheriou, A hybrid framework for similarity-based recommendations, Business Intelligence and Data Mining, 1(1) (2005), pp. 107-121.
[12] N. Karacapilidis, P. Moraitis, Building an Agent-Mediated Electronic Commerce System with Decision Analysis Features, Decision Support Systems 32(1) (2001), pp. 53-69.
[13] M. Klusch, K. Sycara, Brokering and Matchmaking for Coordination of Agent Societies: A Survey, A. Omicini et al. (eds.), Coordination of Internet Agents, Springer (2001), pp. 197-224.
[14] A. Lazanas, C. Evangelou, N. Karacapilidis, Ontology-Driven Decision Making in Transportation Transactions Management, Witold Abramowicz (ed.), Proceedings of the 8th International Conference on Business Information Systems (2005), Poznan, Poland, pp. 228-241.
[15] A. Lazanas, N. Karacapilidis, Y. Pirovolakis, Providing Recommendations in an Agent-Based Transportation Transactions Management Platform, Proceedings of the 8th International Conference on Enterprise Information Systems (2006), Paphos, Cyprus.
[16] U. Nahm, R. Mooney, Text Mining with Information Extraction, Proceedings of the AAAI Spring 2002 Symposium on Mining Answers from Texts and Knowledge Bases (2002), Stanford, pp. 60-68.
[17] M. O’Mahony, N. Hurley, C. Silvestre, Promoting Recommendations: An attack on Collaborative Filtering, Proceedings of DEXA 2002 Conference, Springer-Verlag (2002), Berlin, pp. 494-503.
[18] P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, J. Riedl, GroupLens: An Open Architecture for Collaborative Filtering of Netnews, Proceedings of CSCW (1994), Chapel Hill, NC.
[19] B.M. Sarwar, G. Karypis, J.A. Konstan, J. Riedl, Analysis of Recommender Algorithms for E-Commerce, Proceedings of the ACM Conference on e-Commerce, Minneapolis, ACM Press (2000), pp. 158-167.
[20] B.M. Sarwar, G. Karypis, J.A. Konstan, J. Riedl, Item-Based Collaborative Filtering Recommendation Algorithms, Proceedings of the 10th International World Wide Web Conference (2001), Hong Kong, ACM Press.
[21] J.B. Schafer, J. Konstan, J. Riedl, Electronic Commerce Recommender Applications, Journal of Data Mining and Knowledge Discovery, 5(1-2) (2000), pp. 115-152.
[22] W. Shen, D.H. Norrie, An Agent-Based Approach for Dynamic Manufacturing Scheduling, Proceedings of Autonomous Agents'98 Workshop on Agent-Based Manufacturing, Minneapolis (1998), pp. 117-128.
[23] S. Schmitt, R. Bergmann, Applying case-based reasoning technology for product selection and customization in electronic commerce environments, 12th Bled Electronic Commerce Conference (1999), Bled, Slovenia.
[24] K. Sycara, D. Zeng, Coordination of Multiple Intelligent Software Agents, International Journal of Cooperative Information Systems, 5(2-3) (1996), pp. 546-563.
[25] B. Towle, C. Quinn, Knowledge Based Recommender Systems Using Explicit User Models, Knowledge-Based Electronic Markets, AAAI Technical Report WS-00-04, AAAI Press (2000), pp. 74-77.
[26] H. Wang, J. Huang, Y. Qu, J. Xie, Web services: problems and future directions, Journal of Web Semantics, 1 (2004), pp. 309-320.
[27] A. Lazanas, G. Megalokonomos, Optimizing Alternative Routes Retrieval in an Agent-based Transportation Management System, Proceedings of the International Conference on Service Systems and Service Management (ICSSSM 2006), Troyes, France, pp. 1525-1530.
Dr. Alexis Lazanas studied Applied Informatics at the Athens University of Economics and Business (B.Sc. 1996) and received his Ph.D. from the University of Patras (Greece) in the field of Recommender Systems, Data Mining and Intermodal Transportation (2008). He worked at the Technological Educational Institute (T.E.I.) of Patras as Scientific Collaborator and as Software Developer and Special Analyst in various major companies. He currently works as a Teacher of Informatics in Greek Public Education. His research interests are in the areas of Agent-based Information Systems, Data Mining, Web Technologies, Hybrid Recommender Systems and Intermodal Transportation Management.
IJCSI International Journal of Computer Science Issues, Vol. 3, 2009
ISSN (Online): 1694-0784 ISSN (Printed): 1694-0814
Geometric and Signal Strength Dilution of Precision (DoP) Wi-Fi
Soumaya ZIRARI*, Philippe CANALDA and François SPIES
1 Computer Science Laboratory of the University of Franche-Comté, France Numerica, 1 cours, Louis Leprince Ringuet, 25200 Montbéliard E-Mail: [email protected]
2 Computer Science Laboratory of the University of Franche-Comté, France Numerica, 1 cours, Louis Leprince Ringuet, 25200 Montbéliard E-Mail: [email protected]
3 Computer Science Laboratory of the University of Franche-Comté, France Numerica, 1 cours, Louis Leprince Ringuet, 25200 Montbéliard E-Mail: [email protected]
Abstract
The democratization of wireless networks, combined with the emergence of increasingly autonomous and efficient mobile devices, leads to new services. Positioning services are becoming overcrowded. Accuracy is the main quality criterion in positioning, but a coefficient is needed to assess it properly. In this paper we present the Geometric and Signal Strength Dilution of Precision (DOP) for positioning systems based on Wi-Fi and Signal Strength measurements.
Keywords: Wireless LAN, Radio position measurement, Indoor radio communication.
1. Introduction
The world population is currently growing, which implies a remarkable increase in buildings and skyscrapers. These are obstacles for Global Navigation Satellite Systems (GNSS) such as the Global Positioning System (GPS). New networks have emerged (UMTS, GSM, ...), which does not help to reduce the impact of interference. These factors, among others, contribute to a loss in GPS [1] accuracy of up to 20 meters, especially in urban and peri-urban environments. During the last ten years the number of users of the IEEE 802.11x community has grown remarkably, and a new positioning solution based on Wi-Fi was born. Some positioning algorithms guarantee an accuracy of 5 meters, such as RADAR [2], Viterbi-like algorithm [3], Friis and Reference Based Hybrid Model [4] (FRBHM) [5].
The GPS is limited in given environments and Wi-Fi is becoming a viable positioning method. The authors think that the Wi-Fi network can be adapted by learning from the GPS. In this paper, we present a mathematical approach to a new version of the known GPS Dilution of Precision [6], which is better adapted to Wi-Fi networks and uses other elements to estimate the precision. We also present a model that allows the precision to be estimated from criteria other than the geometric one alone. The third section presents and analyzes some results.
2. GEOMETRIC CRITERIA
The evolution of the IEEE 802.11 standard increasingly fulfils the constraints that allow its efficiency to be improved in large and more complex environments. The efficiency of such networks is measured by different criteria. Some of those criteria are focused on the network geometry, others on the throughput [7] or on the interference [8]. A. Gondran et al. [9] provide a geometric indicator for WLAN planning. This indicator is based on the study of the area covered by a Basic Service Set (BSS), where a cell relative to one antenna is a set of pixels associated with a given base station. The cell C is defined by:
$$C = \{\, b_{i,j} \mid F_{i,j} \geq q \,\} \tag{1}$$
where $b_{i,j}$ is the pixel of coordinates (i, j) and $F_{i,j}$ is the signal strength received at $b_{i,j}$, exceeding a given quality threshold q. Considering the 2-D space, each pixel has 8 neighbours, with the exception of the pixels on the space borders. Mabed et al. [10] define the geometrical criteria as below:
(2)
A. Gondran et al. adapted this formula to 3-D space, which can be an indoor environment such as a building.
(3)
where k represents the floor.
The geometric indicator regrouping all floor indicators is defined by the following equation:
(4)
3. Propagation Models
When the signal transmitted by a transmitter travels in space, it loses power. Part of the signal's energy is dissipated. The environment in which the carrier signal travels and the distance covered have an important impact on the signal attenuation. Several equations have been developed.
3.1 FRIIS
The Friis [13] equation is:
$$P_R = P_T \, G_T \, G_R \left( \frac{\lambda}{4 \pi d} \right)^{2} \tag{5}$$
where:
• $P_R$ and $P_T$ are respectively the Signal Strength (SS) received and the SS emitted;
• $G_R$ and $G_T$ are respectively the receiver and transmitter antenna gains;
• $\lambda$ is the carrier wavelength;
• d is the distance between the receiver and the transmitter.
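As an illustration of the free-space model, the short Python sketch below evaluates the standard Friis transmission formula, $P_R = P_T G_T G_R (\lambda / (4 \pi d))^2$. The function name, the linear (non-dB) units and the 2.4 GHz example values are assumptions chosen for this sketch, not values taken from the experiments.

```python
import math


def friis_received_power(p_t, g_t, g_r, wavelength, d):
    """Free-space Friis formula with linear powers (W) and linear gains.

    wavelength and d must use the same length unit (metres here).
    """
    if d <= 0:
        raise ValueError("distance must be positive")
    return p_t * g_t * g_r * (wavelength / (4 * math.pi * d)) ** 2


# Hypothetical example: 100 mW emitter, unit gains, 2.4 GHz carrier.
wavelength = 299_792_458 / 2.4e9  # about 0.125 m
p_at_10_m = friis_received_power(0.1, 1.0, 1.0, wavelength, 10.0)
p_at_20_m = friis_received_power(0.1, 1.0, 1.0, wavelength, 20.0)
```

Doubling the distance divides the received power by four, which is the inverse-square behaviour that the indoor models of the following subsections modify.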
3.2 Interlink Networks
The Interlink Networks [14] approach proposes replacing the power 2 in the Friis formula with the power 3.5, because the wave attenuates quickly inside a building owing to the high number of obstacles. The Interlink Networks formula is:
(6)
where:
• $P_R$ and $P_T$ are respectively the Signal Strength (SS) received and the SS emitted;
• $G_R$ and $G_T$ are respectively the receiver and transmitter antenna gains;
• $\lambda$ is the carrier wavelength;
• d is the distance between the receiver and the transmitter.
3.3 SNAP-WPS
Y. Wang shows in [15] that the target position can be approximated by measuring the signal strength. In fact, the signal attenuation between the transmitter and the receiver makes it possible to determine the mobile position. However, the Friis equation estimates the distance between the receiver and the transmitter only in an environment without any obstacles. Thus, Y. Wang suggests an empirical model based on regression. By comparing the residuals among polynomials of different degrees, he decides that a cubic regression equation is adequate for the empirical model EM:
$$d_i = 0.000198 \, S_i^{3} - 0.025 \, S_i^{2} + 1.14 \, S_i - 14.8 \tag{7}$$
where S is the signal strength (SS) in dBm, normally between 15 and 90 dBm.
3.4 Analysis
The results in Table 1 [16] present the comparison between Wi-Fi positioning systems.
Table 1: Comparison Between The Positioning Algorithms
Positioning System     Mean Error    Standard Deviation
Friis                  9.86          6.3
SNAP-WPS               8.76          5.87
Interlink Networks     9.58          5.11
FBCM                   7.77          3.03
Radar                  4.62          2.98
FRBHM                  5.98          3.22
4. Contribution
The contribution of this paper consists in presenting a dilution of precision model for wireless networks. This model aims at giving an idea about the position estimation accuracy. This model can be described in three steps:
1- The first step consists in constituting the set of all visible access points (Fig. 1). The number of visible access points is one of the decisive elements for the accuracy of a positioning system. The number of visible access points needed depends on the dimension of the positioning system: at least three APs for a two-dimensional positioning system and at least four APs for a three-dimensional one. If the number of APs is not sufficient, we automatically set the value of the dilution of precision coefficient to infinite. The optimal value is equal to one.
2- The second step concerns the signal strength of the visible access points (Fig. 2). We assume that access points with a signal strength under a given threshold may induce errors in the position estimation of the target. An access point with a bad signal strength can be near or far from the user. In fact, the signal strength may be attenuated either because of the distance or because of the number of obstacles. If only three access points have a good signal strength (we are in a 3D positioning system) we predict that the coefficient value will be higher.
3- The final step deals with the geometry of the positioning system architecture, i.e. the third step verifies whether the visible access points are geometrically well distributed with respect to the user. For this step we propose a Wi-Fi DOP (Dilution Of Precision), which is calculated as below.
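As a brief aside before the derivation, steps 1 and 2 act as a gate in front of the geometric computation of step 3. The Python sketch below shows one possible reading. The function name, the dictionary of signal strengths in (negative) dBm, the threshold value and the choice to pass only the strong access points on to step 3 are all assumptions of this sketch; `compute_dop` is a hypothetical callable standing in for the geometric step.

```python
import math


def gated_wifi_dop(rss_by_ap, threshold_dbm, dimensions, compute_dop):
    """Apply steps 1 and 2, then delegate step 3 to compute_dop.

    rss_by_ap   : dict mapping an AP identifier to its received signal strength
    dimensions  : 2 or 3, the dimension of the positioning system
    compute_dop : callable taking the retained APs and returning the DOP
    """
    if dimensions not in (2, 3):
        raise ValueError("dimensions must be 2 or 3")
    required = 3 if dimensions == 2 else 4

    # Step 1: too few visible APs, the coefficient is set to infinite.
    if len(rss_by_ap) < required:
        return math.inf

    # Step 2: keep the APs whose signal strength reaches the threshold
    # (assumption: weak APs are simply left out of the geometric step).
    strong = {ap: rss for ap, rss in rss_by_ap.items() if rss >= threshold_dbm}

    # Step 3: geometric DOP on the retained APs.
    return compute_dop(strong)
```

A caller would supply its own geometric function for `compute_dop`; with two visible access points in a 3-D system the gate returns infinity without calling it.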
Let us suppose $|S_{AP}| = N_{AP}$ the number of visible access points. We assume that:
$$S_{AP} = \{ AP_1, AP_2, \ldots, AP_{N_{AP}} \}$$
where $AP_i$ are the visible access points. The radius of circle $d_i$ ($i \in \{1, \ldots, N_{AP}\}$, the number of calculation) is defined by:
$$d_i = \sqrt{(X_{c,i} - X_u)^{2} + (Y_{c,i} - Y_u)^{2} + (Z_{c,i} - Z_u)^{2}} \tag{8}$$
$X_{c,i}$, $Y_{c,i}$, $Z_{c,i}$ are the $AP_i$ coordinates and $X_u$, $Y_u$, $Z_u$ the unknown user coordinates.
We obtain:
(9)
4.1 Friis equation
The Friis equation [13], as seen before, is:
$$P_{R,i} = P_{T,i} \, G_{T,i} \, G_R \left( \frac{\lambda}{4 \pi d_i} \right)^{2}$$
The Friis equation allows us to compute the distance as below:
$$d_i = \frac{\lambda}{4\pi} \sqrt{\frac{P_{T,i} \, G_R \, G_{T,i}}{P_{R,i}}} \tag{10}$$
where $P_{R,i}$, $P_{T,i}$, $G_R$ and $G_{T,i}$ are respectively the receiver and $AP_i$ data.
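The two distance expressions above, the geometric range of Eq. (8) and the Friis-derived range of Eq. (10), can be checked against each other numerically. The following Python sketch is only an illustration; the function names and the linear units are assumptions.

```python
import math


def geometric_distance(ap_xyz, user_xyz):
    """Eq. (8): Euclidean distance between an AP and the user."""
    return math.dist(ap_xyz, user_xyz)


def friis_distance(p_t, g_t, g_r, wavelength, p_r):
    """Eq. (10): d = (lambda / (4*pi)) * sqrt(P_T * G_R * G_T / P_R).

    Powers and gains are linear; the result has the unit of wavelength.
    """
    if p_r <= 0:
        raise ValueError("received power must be positive")
    return wavelength / (4 * math.pi) * math.sqrt(p_t * g_r * g_t / p_r)
```

Under free-space conditions the two values coincide: feeding `friis_distance` with the power predicted by the Friis equation at 10 m returns 10 m. Indoors they generally differ, and this difference is what the linearisation that follows works with.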
The distance $d_i$ can be approximated by a Taylor expansion:
(11)
The Taylor expansion at the first order is:
(12)
where
$$b_{i,x} = \frac{X_{c,i} - X_u}{r_i}, \quad b_{i,y} = \frac{Y_{c,i} - Y_u}{r_i}, \quad b_{i,z} = \frac{Z_{c,i} - Z_u}{r_i}$$
and
$$r_i = \sqrt{(X_{c,i} - X_u)^{2} + (Y_{c,i} - Y_u)^{2} + (Z_{c,i} - Z_u)^{2}}$$
We obtain:
(13)
The linear system is:
$$\Delta d = H \, \Delta X \tag{14}$$
We suppose that $P_{T,i}$, $G_R$ and $G_{T,i}$ are fixed parameters. Only $P_{R,i}$, the Signal Strength (SS) received from $AP_i$, is unknown and then estimated (the estimate is written $\hat{P}_{R,i}$). Thus from equation (5), we have:
$$\Delta d_i = \frac{\lambda}{4\pi} \sqrt{P_{T,i} \, G_{T,i} \, G_R} \left( \frac{1}{\sqrt{\hat{P}_{R,i}}} - \frac{1}{\sqrt{P_{R,i}}} \right) \tag{15}$$
We obtain:
$$C \, \Delta P_R = H \, \Delta X \tag{16}$$
where C is a known matrix equal to:
$$C = \mathrm{diag}(c_1, \ldots, c_{N_{AP}})$$
and
$$c_i = \frac{\lambda}{4\pi} \sqrt{P_{T,i} \, G_{T,i} \, G_R}, \qquad \Delta P_{R,i} = \frac{1}{\sqrt{\hat{P}_{R,i}}} - \frac{1}{\sqrt{P_{R,i}}}$$
The G matrix is defined by:
$$G = \left( H^{T} H \right)^{-1} \tag{17}$$
The Wi-Fi GDOP follows the equation below:
$$DOP = \sqrt{\mathrm{Tr}[G]} \tag{18}$$
We conclude from the model that we can estimate the positioning accuracy, and measure the error of the wireless positioning system, by analysing the following elements:
• $P_R$, the Signal Strength (SS) received from the AP, which emits a SS $P_T$;
• $G_R$, the user antenna gain, and $G_T$, the AP antenna gain;
• $\lambda$, the carrier wavelength;
• the number of visible APs.
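To make the geometric part of the coefficient concrete, the Python sketch below computes $\sqrt{\mathrm{Tr}[(H^T H)^{-1}]}$ for a 3-D system. It assumes that each row of H holds the direction cosines $(b_{i,x}, b_{i,y}, b_{i,z})$ evaluated at an estimate of the user position; the exact form of H is not legible in this copy, so this layout, the function name and the treatment of degenerate geometry as an infinite DOP are assumptions.

```python
import math


def wifi_dop_3d(ap_positions, user_estimate):
    """Return sqrt(trace((H^T H)^-1)) for H built from unit AP directions."""
    if len(ap_positions) < 4:
        return math.inf  # fewer than four APs in 3-D: coefficient is infinite

    rows = []
    for ap in ap_positions:
        r = math.dist(ap, user_estimate)
        if r == 0:
            raise ValueError("user estimate coincides with an access point")
        rows.append([(c - u) / r for c, u in zip(ap, user_estimate)])

    # a = H^T H, a symmetric 3x3 matrix.
    a = [[sum(row[i] * row[j] for row in rows) for j in range(3)]
         for i in range(3)]

    det = (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
           - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
           + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    if abs(det) < 1e-12:
        return math.inf  # degenerate geometry, H^T H is not invertible

    # trace(inverse) = (sum of the diagonal cofactors) / determinant.
    trace_adj = ((a[1][1] * a[2][2] - a[1][2] * a[2][1])
                 + (a[0][0] * a[2][2] - a[0][2] * a[2][0])
                 + (a[0][0] * a[1][1] - a[0][1] * a[1][0]))
    return math.sqrt(trace_adj / det)
```

With access points at (1,0,0), (0,1,0), (0,0,1) and (-1,0,0) and the user at the origin, $H^T H = \mathrm{diag}(2, 1, 1)$, so the sketch returns $\sqrt{2.5}$. Access points that all lie in one plane with the user make the matrix singular and yield infinity.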
4.2 Interlink Networks
The Interlink Networks formula is:
The distance is:
(19)
The linear system becomes equivalent to:
$$C \, \Delta P_R = H \, \Delta X$$
where C is a known matrix equal to:
The G matrix is defined by:
$$G = \left( H^{T} H \right)^{-1}$$
The Wi-Fi GDOP follows the equation below:
$$DOP = \sqrt{\mathrm{Tr}[G]}$$
4.3 SNAP-WPS
The distance in the SNAP-WPS system is equal to:
$$d_i = 0.000198 \, S_i^{3} - 0.025 \, S_i^{2} + 1.14 \, S_i - 14.8$$
The linear system is:
$$\Delta S = H \, \Delta X \tag{20}$$
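The SNAP-WPS cubic regression is easy to evaluate directly. In the Python sketch below, the sign of the linear term is taken as positive: that operator is not legible in this copy, and a positive sign is the only choice that gives positive distances over the stated 15 to 90 range, so it should be read as an assumption. The function name is also hypothetical.

```python
def snap_wps_distance(s):
    """Empirical cubic model: distance as a function of signal strength S.

    S is the signal strength magnitude in dBm (roughly 15 to 90).
    The '+' in front of 1.14 * S is an assumption (see the lead-in).
    """
    return 0.000198 * s ** 3 - 0.025 * s ** 2 + 1.14 * s - 14.8
```

For S = 60 the terms are 42.768 − 90 + 68.4 − 14.8, that is about 6.37, and for S = 90 the model gives about 29.6.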
5. Experiments
Experiments have been carried out to validate our dilution of precision model for wireless networks. The Open Wireless Positioning System (OWLPS) [17], an indoor positioning system based on the Wi-Fi wireless network, was the positioning system used to calculate the mobile position. The experiments were carried out in our laboratory, Laboratoire d'Informatique de Franche-Comté (LIFC).
5.1 OWLPS Architecture
Open Wireless Positioning System (OWLPS) implements several positioning techniques and algorithms such as FBCM [4] or FRBHM [5]. The system is infrastructure-centred, i.e., the mobile asks the infrastructure for its position (see Fig. 3). The main task of the system is to provide an adequate environment for the creation and testing of new techniques and propagation models, and for the development of hybrid techniques combining existing algorithms.
Fig. 3. : The environment of experimentation
5.2 The experimentation scenario
As we can see in Fig. 4, the experimentation scenario consists of a mobile moving during an interval of time. During the whole interval, the user is located through the OWLPS system, and the algorithms used for the positioning are Friis, Interlink Networks and FRBHM.
Along the whole mobile trajectory, we know the exact mobile coordinates and the estimated ones, which allows us to analyse the results. The positioning system is a 3D one.
5.3 Analysis
The first experiments were done in order to verify the impact of the number of access points on our model and to check whether the Wi-Fi DOP is consistent with this information.
Fig. 6. : The DOP cartography when the mobile is moving in the first floor
Fig. 6 shows how the Geometric and Signal Strength Dilution of Precision (DoP) Wi-Fi evolves as the mobile moves on the first floor of the building.
Analysing the results presented in Fig. 7, we deduce that the DOP is consistent with the number of visible access points. In fact, as shown in Fig. 7, when DOP ∈ [10,15], the number of visible access points is equal to three, thus the Wi-Fi DOP values become infinite. However, the Wi-Fi DOP reaches good values when the number of visible access points is at least four, although we observe some peaks when the number of access points is at the minimum acceptable for a 3D positioning system (i.e. four access points). The second step of our experiments was done in order to verify the impact of the signal strength of each visible access point on our model and how this information makes the DOP vary.
Fig. 8 shows that the Wi-Fi DOP is strongly influenced by the signal strength of the access points. When DOP ∈ [10,15] and DOP ∈ [35,44], the Wi-Fi DOP values vary from seven to infinity. If we look at the signal strength in those cases, we note that the signal cannot be received or is too weak. This means that the model can in fact predict the system accuracy by analysing the access point signal strength.
The third step of our experiments has been carried out to analyse the efficiency of our model by comparing the real trajectory and the estimated one with Wi-Fi DOP values (see Fig. 9). The analysis shows that the trajectories (the real one and the estimated one) are more or less similar except when the Wi-Fi DOP reaches eight.
The fourth and last step is to verify whether the Wi-Fi DOP is a good indicator of the positioning accuracy. Fig. 10 shows that when DOP ∈ [10,15], the error reaches eleven. When the Wi-Fi DOP value is equal to three (when the Wi-Fi DOP value is in [1,5], we consider that the system has a good accuracy) the mean error is equal to four.
6. Conclusions
Nowadays, Wi-Fi positioning algorithms and systems are becoming a new means of positioning mobile terminals within a heterogeneous environment. The quality of service of such systems may be improved in order to guarantee the integrity and the continuity of service. This paper describes a model for dilution of precision and a mathematical description of the coefficient that expresses the weakening of the accuracy, the Wi-Fi DOP.
The model presented in this paper may provide the guarantee we need. In fact, as shown in the results obtained in the previous section, our model illustrates the positioning system accuracy. The idea consists in observing the results of the model: when its values reach a given threshold, we inform the user that the position accuracy is not sufficient and then anticipate a solution to guarantee the quality and continuity of service.
7. Future Trends
Our model opens and leads to numerous extensions and perspectives. The coefficient of dilution of precision, or rather the Wi-Fi DOP, is a good candidate to specify the most adequate access point distribution. It is possible to extend the Wi-Fi DOP to the OWLPS system. It could provide continuity of positioning, but also assistance with the optimal positioning of access points. The aim of this study is to offer the user, most of the time, four access points in sight with a DOP of the order of 2.
Acknowledgments
We thank all the reviewers for their detailed feedback and suggestions, especially Matteo Cypriani.
References [1] US Army Corps of Engineer, Engineering and Design NAVSTAR Global Positioning System Surveying, Department of the Army, 2003, Washington, DC, July. [2] Paramvir Bahl, Venkata N. Padmanabhan. RADAR: An In-Building RF-Based User Location and Tracking System. In Proceedings of the IEEE Infocom 2000, Tel-Aviv, Israel, vol. 2, Mar. 2000, pp. 775--784. [3] P.Bahl, A. Balachandran, V. N. Padmanabha. Enhancements to the RADAR User Location and Tracking System. Microsoft Research Technical Report, February 2000. [4] F. Lassabe and O. Baala and P. Canalda and P. Chatonnay and F. Spies, A Friis-based Calibrated Model for Wi-Fi Terminals Positioning, Proceedings of IEEE Int. Symp. on a World of Wireless, Mobile and Multimedia Networks (WoWMoM 2005), 2005 [5] Frédéric Lassabe. Géolocalisation et prédiction dans les réseaux Wi-Fi en intérieur, Rapport de thèse. 2009
IJCSI
IJCSI International Journal of Computer Science Issues, Vol. 3, 2009
[6] Yarlagadda R., Ali I., Al-Dhahir N., Hershey J., "GPS GDOP metric", IEE Proceedings - Radar, Sonar and Navigation, Volume 147, Issue 5, pp. 259-264, Oct. 2000.
[7] Ling X., Yeung K.L., "Joint access point placement and channel assignment for 802.11 wireless LANs", IEEE Wireless Communication and Networking Conference, pp. 1583-1588, 2005.
[8] Amaldi E., Capone A., Cesana M., Malucelli F., "Optimizing WLAN Radio Coverage", IEEE International Conference on Communications 2004, 1, pp. 180-184, 2004.
[9] Gondran A., Baala O., Caminada A., Mabed H., "3-D BSS geometric indicator for WLAN planning", 15th International Conference on Software, Telecommunications and Computer Networks (SoftCOM 2007), 27-29 Sept. 2007, pp. 1-5.
[10] H. Mabed, A. Caminada, "Geometric criteria to improve the interference performances of cellular network", IEEE Vehicular Technology Conference, Montreal, Sept. 2006.
[11] S. Zirari, P. Canalda, and F. Spies, "Modelling and Emulation of an Extended GDOP For Hybrid And Combined Positioning System", in ENC-GNSS'09, European Navigation Conference - Global Navigation Satellite Systems, Naples, Italy, May 2009.
[12] S. Zirari, P. Canalda, and F. Spies, "A Very First Geometric Dilution Of Precision Proposal For Wireless Access Mobile Networks", in SPACOMM'09, The First International Conference on Advances in Satellite and Space Communications, Colmar, France, July 2009.
[13] H. T. Friis, "A note on a simple transmission formula", Proc. IRE, pp. 254-256, 1946.
[14] Interlink Networks, Inc., "A practical approach to identifying and tracking unauthorized 802.11 cards and access points", Technical report, 2002.
[15] Y. Wang, X. Jia, and H.K. Lee, "An indoors wireless positioning system based on wireless local area network infrastructure", in 6th Int. Symp. on Satellite Navigation Technology Including Mobile Positioning and Location Services, paper 54, Melbourne, July 2003. CD-ROM proc.
[16] Matteo Cypriani, Frédéric Lassabe, Soumaya Zirari, Philippe Canalda, François Spies, "Open Wireless Positioning System : un système de géopositionnement par Wi-Fi en intérieur", JDIR, Belfort, France, 2009.
[17] M. Cypriani, F. Lassabe, S. Zirari, P. Canalda, and F. Spies, "Open wireless positioning system", Université de Franche-Comté, Tech. Rep. RT2008-02.
Soumaya Zirari was born in 1981. She received her diploma in engineering in 2006. She is preparing her Ph.D. thesis at the Computer Science Laboratory of the University of Franche-Comté in France, to be defended in the first semester of 2010. She focuses on hybrid location-based services and service continuity.
Dr Philippe Canalda received M.Sc. and Ph.D. degrees in computer science from the University of Orléans (France) in 1991 and 1997, respectively. He worked at INRIA Rocquencourt from 1991 to 1996 on the automatic generation of optimizing and parallel n-to-n cross-compilers. From 1996 to 1998, he worked as a Research Engineer at the Associated Compiler Expert start-up in Amsterdam, The Netherlands. He then worked for two years at LORIA on the synchronisation of cooperative process fragments, based on a workflow model and applied to ephemeral enterprises. Since 2001, he has been an Associate Professor at the Computer Science Laboratory (LIFC, EA 4269) of the University of Franche-Comté in France. His research topics are, on the one hand, mobility services and wireless positioning and, on the other hand, robust and flexible optimizing algorithms based on graph, automata and rewriting theories.
Prof. François Spies received his Ph.D. and the French "Accreditation to supervise research" degrees in 1994 and 1999, respectively. He was an Associate Professor at the Computer Science Laboratory of the University of Franche-Comté in France from 1996 to 1999. Since 1999, he has held a Professor position at the University of Franche-Comté. He currently focuses on managing video streams on wireless and mobile architectures. His main research topics are cooperative video cache strategies that account for mobility and video quality levels, and the transport, congestion control and quality of service of video streams.
IJCSI International Journal of Computer Science Issues, Vol. 3, 2009 ISSN (Online): 1694-0784 ISSN (Printed): 1694-0814
Implementation of Rule Based Algorithm for Sandhi-Vicheda of Compound Hindi Words
Priyanka Gupta (1), Vishal Goyal (2)
(1) M.Tech. (ICT) Student, (2) Lecturer, Department of Computer Science, Punjabi University, Patiala
[email protected] ,
[email protected]
Abstract
Sandhi means joining two or more words to coin a new word. Literally, sandhi means "putting together" or combining (of sounds); it denotes all the combinatory sound changes that arise spontaneously for ease of pronunciation. Sandhi-vicheda describes [5] the process by which one letter (whether single or conjoined) is broken to form two words: part of the broken letter remains as the last letter of the first word, and part forms the first letter of the next word. Sandhi-Vicheda is an easy and interesting technique that can add a new dimension to the traditional approach to teaching Hindi. In this paper, using a rule-based algorithm, we report an accuracy of 60-80%, depending on the number of rules implemented.
Keywords: Rule Based Algorithm, Sandhi-Vicheda, Compound Hindi Words
I INTRODUCTION
Natural Language Processing (NLP) refers to techniques that attempt to make computers analyze, understand and generate natural languages, so that one can address a computer in the same manner as one addresses a human being. NLP is both a modern computational technology and a method of investigating and evaluating claims about human language itself. It is a subfield of artificial intelligence and computational linguistics that studies the problems of automated generation and understanding of natural human languages.
In written text, a word can be defined as a sequence of characters delimited by spaces, punctuation marks, etc. A compound word (also known as a co-joined word) can be broken up into two or more independent words. A Sandhi-Vicheda module breaks each compound word in a sentence into its constituent words. Sandhi takes place in the presence of a swara (a vowel), a consonant with a halanta, or a visarga. Sanskrit has a well-defined set of rules for Sandhi-vicheda. Hindi has its own rules of Sandhi-vicheda, but they are less well defined and far fewer in number than the Sanskrit rules.
1.1 The Hindi Language
Hindi is spoken in northern and central India. Linguists often treat Hindi and Urdu as the same language, the difference being that Hindi [5] is written in the Devanagari script and draws much of its vocabulary from Sanskrit, while Urdu is written in the Persian script and draws a great deal of its vocabulary from Persian and Arabic. More than 180 million people in India regard Hindi as their mother tongue, and another 300 million use it as a second language. Hindi is an official language of India; it is spoken by almost half a billion people in India and throughout the world, which places it among the world's most widely spoken languages. It allows communication with a far wider variety of people in India than English, which is spoken by only around five percent of the population. It is written in an easy-to-learn phonetic script called "Devanagari", which is also used to write Sanskrit, Marathi and Nepali. Hindi is normally spoken using a combination of 52 sounds: ten vowels, 40 consonants, nasalisation and a kind of aspiration. These sounds are represented in the Devanagari script by 52 symbols: ten vowels, two modifiers and 40 consonants.
II RELATED WORK
Sandhi (in linguistics) [1] is a cover term for a wide variety of phonological processes that occur at morpheme or word boundaries, such as the fusion of sounds across word boundaries and the alteration of sounds due to neighboring sounds or due to the grammatical function of adjacent words. Internal sandhi features the alteration of sounds within words at morpheme boundaries, as in sympathy (syn- + pathy). External sandhi refers to changes found at word boundaries, such as the pronunciation [tɛm bʊks] for ten books, although this does not hold in all dialects of English. The linking R of some dialects of English is a kind of external sandhi, as is the process called liaison in French. While it may be extremely common in speech, sandhi (especially external sandhi) is typically ignored in spelling, as is the case in English, with the exception of the distinction between "a" and "an". Sandhi is, however, reflected in the writing systems of Sanskrit and Hindi. External sandhi effects can sometimes become morphologized.
Most tonal languages have tone sandhi, in which the tones of words alter according to pre-determined rules. For example, Mandarin has four tones: a high level tone, a rising tone, a falling-rising tone, and a falling tone. In the common greeting nǐ hǎo, both words in isolation would normally have the falling-rising tone. However, this is difficult to say, so nǐ is pronounced as ní (but still written nǐ in Hanyu Pinyin).
The Sanskrit Sandhi engine software is not currently available as a standalone application, since its local use demands the installation of an HTTP server on the user's host. The Sandhi module [1] was developed by RCILTS (Sanskrit, Japanese, Chinese) at Jawaharlal Nehru University (JNU), New Delhi. RCILTS, JNU is a resource center for the Sanskrit language under DIT, Government of India; work at JNU started in three languages, viz. Sanskrit, Japanese, and Chinese. Using this module, the user can get information about Sandhi rules and processes. The sutra number in the Ashtadhyayi and its description are displayed. The user can learn three types of Sandhi (Svara Sandhi, Vyanjan Sandhi, and Hal Sandhi) through this module. The data is in Unicode, and Sandhi exceptions and options are also incorporated. The module takes two words as input: the first word cannot be null, but the second can be. The user enters the two words and submits the form to get the result.
On Chinese tone sandhi [2], Cheng Chin-Chuan of the Phonology Laboratory, University of California, Berkeley, studied how English stresses are interpreted by Chinese speakers when they speak Chinese with English words inserted, as Chinese speakers in the United States usually do. In Mandarin Chinese, a tone-sandhi rule changes a third tone preceding another third tone to a second tone. Using this tone-sandhi rule, the experiment was designed to find out how English stresses are interpreted in Chinese sentences. Stress does not exist in the underlying representations of English phonology, but in studying bilingual phenomena the phonetic level is also important. Fry (1955) found that when a vowel was long and of high intensity, listeners agreed that the vowel was strongly stressed. The results of his experiments indicate that the duration ratio has a stronger influence on judgements of stress than the intensity ratio. Lehiste and Peterson (1959) also reported experiments on stress.
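The third-tone rule mentioned above can be written as a single pass over a sequence of tone numbers. The Python sketch below is illustrative only: the function name and the encoding of tones as integers 1-4 are assumptions made here, not part of the cited study, and the rule is applied to every adjacent pair of underlying third tones, which ignores the prosodic grouping that governs longer sequences in real speech.

```python
def apply_third_tone_sandhi(tones):
    """Return surface tones for a list of underlying Mandarin tones (1-4).

    Simplified rule: a third tone that directly precedes another
    underlying third tone is realised as a second tone.
    """
    surface = list(tones)
    for i in range(len(tones) - 1):
        # Compare underlying tones, so earlier rewrites do not affect later ones.
        if tones[i] == 3 and tones[i + 1] == 3:
            surface[i] = 2
    return surface


# "ni hao": both syllables carry an underlying third tone.
print(apply_third_tone_sandhi([3, 3]))  # [2, 3]
```

As with written Pinyin, the underlying tones are kept as the input and only the returned surface form changes.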
English l-sandhi [3] involves an allophonic alternation in alveolar contact for word-final /l/ in connected speech [4]. Electropalatography (EPG) data for five Scottish Standard English and five Southern Standard British English speakers show that there is individual and dialectal variation in contact patterns.
III PROBLEM DEFINITION
Developing programs that understand a natural language is a difficult task. Natural languages are large: they contain infinitely many different sentences, and no matter how many sentences a person has heard or seen, new ones can always be produced. Natural language is also highly ambiguous: many words have several meanings, and sentences can have different meanings in different contexts. Compound words are created by joining an arbitrary number of existing words together, which can greatly increase the vocabulary size and thus lead to sparse-data problems. Compound words therefore pose challenges for many NLP applications.
The problem addressed in this paper is breaking Hindi compound words into their constituent words. In Hindi, words are sequences of characters formed by combining swar (vowels), vyanjan (consonants) and matras (vowel signs). Hindi has its own rules of Sandhi-vicheda; they are, however, less well defined and far fewer in number than the Sanskrit rules. Our problem is therefore to break a compound word into its constituent words with the help of the Sandhi-vicheda rules of Hindi grammar. Concretely, we design a Graphical User Interface that accepts a Hindi word (source text) from the keyboard or mouse and breaks it into its constituent words (target text). The source text is converted into the target text in Unicode format.
Compound Word    Sandhi-vicheda
पराधीन    पर + अधीन
भावार्थ    भाव + अर्थ
शिवालय    शिव + आलय
कवीन्द्र    कवि + इन्द्र
गणेश    गण + ईश
परमेश्वर    परम + ईश्वर
एकैक    एक + एक
यथैक    यथा + एक
परोपकार    पर + उपकार
सन्धिच्छेद    सन्धि + छेद
विच्छेद    वि + छेद
Table 1: Sandhi-Vicheda of Hindi Compound Words
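The pairs in Table 1 suggest a simple pattern: the vowel sign (matra) at the junction is removed and a full vowel letter starts the second word. The Python sketch below illustrates that idea on Unicode Devanagari strings. It is not the authors' Visual Basic.NET program: the rule table is a small illustrative subset, the names `JUNCTION_RULES`, `candidate_splits`, `split_compound` and `LEXICON` are hypothetical, and the lexicon lookup used to choose between competing candidates is an assumption standing in for the paper's word database.

```python
# Junction vowel sign -> candidate (ending restored on the first word,
# vowel letter that starts the second word). Illustrative subset only.
JUNCTION_RULES = {
    "\u093e": [("", "अ"), ("", "आ")],  # sign AA: भावार्थ -> भाव + अर्थ, शिवालय -> शिव + आलय
    "\u0940": [("\u093f", "इ")],       # sign II: कवीन्द्र -> कवि + इन्द्र
    "\u0947": [("", "ई")],             # sign E:  गणेश -> गण + ईश
    "\u094b": [("", "उ")],             # sign O:  परोपकार -> पर + उपकार
    "\u0948": [("", "ए")],             # sign AI: एकैक -> एक + एक
}

# Hypothetical mini-lexicon of known constituent words.
LEXICON = {"भाव", "अर्थ", "शिव", "आलय", "कवि", "इन्द्र",
           "गण", "ईश", "पर", "उपकार", "एक"}


def candidate_splits(word):
    """Yield every (first, second) split that some junction rule allows."""
    for i, ch in enumerate(word):
        for left_ending, right_vowel in JUNCTION_RULES.get(ch, []):
            yield word[:i] + left_ending, right_vowel + word[i + 1:]


def split_compound(word, lexicon):
    """Return the first candidate whose two parts are both known words, else None."""
    for first, second in candidate_splits(word):
        if first in lexicon and second in lexicon:
            return first, second
    return None


for compound in ["परोपकार", "शिवालय", "कवीन्द्र", "गणेश"]:
    print(compound, "->", split_compound(compound, LEXICON))
```

Because one sign can map to more than one vowel (sign AA yields either अ or आ), the sketch generates all candidates and lets the lexicon decide; without such a check a purely rule-driven split is ambiguous.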
IV IMPLEMENTATION
We implemented the rule-based algorithm by first manually collecting compound words and then developing a program that uses this database to display the correct Sandhi-Vicheda of each word according to the Sandhi-Vicheda rules of Hindi grammar.
4.1 Algorithm
hword = Hindi word to be entered
cur = variable that holds the current character of the string
Step 1: Repeat for every word of the input string.
Step 2: Count the length of the string.
Step 2.1: Store the length of the string in a variable and scan the characters.
    For i = 1 To Len(hword)
        cur = Mid$(hword, i, 1)
    Next
Step 3: Find the position of the Matra (b presumably denotes this position).
    hword.Substring(b - 1, 1)
Step 4: Apply the rules for Sandhi-vicheda.
Step 4.1: (Rule for "Sign-AA (ा)" replaced with Swar "Letter-A (अ)" in Sandhi-vicheda)
स्वार्थी    स्व + अर्थी
भावार्थ    भाव + अर्थ
सत्यार्थी    सत्य + अर्थी
यथार्थ    यथा + अर्थ
Table 2: Rule I Implemented Word List
Step 4.2: (Rule for "Sign-AA (ा)" replaced with Swar "Letter-AA (आ)" in Sandhi-vicheda)
विद्यालय    विद्या + आलय
शिवालय    शिव + आलय
पुस्तकालय    पुस्तक + आलय
भोजनालय    भोजन + आलय
Table 3: Rule II Implemented Word List
Step 4.3: (Rule for "Sign-E (ी)" replaced with Swar "Letter-E (इ)" in Sandhi-vicheda)
नरेन्द्र    नर + इन्द्र
सुरेन्द्र    सुर + इन्द्र
कवीन्द्र    कवि + इन्द्र
शचीन्द्र    शची + इन्द्र
Table 4: Rule III Implemented Word List
Step 4.4: (Rule for "Sign-E (ी)" replaced with Swar "Letter-E (ई)" in Sandhi-vicheda)
गिरीश    गिरि + ईश
रजनीश    रजनी + ईश
गणेश    गण + ईश
परमेश्वर    परम + ईश्वर
Table 5: Rule IV Implemented Word List
Step 4.5: (Rule for "Sign-U (ो)" replaced with "Letter-U (उ)" in Sandhi-vicheda)
परोपकार    पर + उपकार
महोदधि    महा + उदधि
आत्मोत्सर्ग    आत्म + उत्सर्ग
सागरोर्मि    सागर + उर्मि
Table 6: Rule V Implemented Word List
Step 4.6: (Rule for "Sign-EE (ै)" replaced with Vowel "Letter-E (ए)" in Sandhi-vicheda)
सदैव    सदा + एव
महैव    महा + एव
यथैव    यथा + एव
एकैक    एक + एक
Table 7: Rule VI Implemented Word List
Step 4.7: (Rule for "Sign-EE (ै)" replaced with "Letter-EE (ऐ)" in Sandhi-vicheda)
महैश्वर्य    महा + ऐश्वर्य
देवैश्वर्य    देव + ऐश्वर्य
परमैश्वर्य    परम + ऐश्वर्य
यथैतिहासिक    यथा + ऐतिहासिक
Table 8: Rule VII Implemented Word List
Step 4.8: (Rule for eliminating the half letter in Sandhi-vicheda) If the half letter CH (च्) is found, eliminate it and decompose the word.
सन्धिच्छेद    सन्धि + छेद
विच्छेद    वि + छेद
परिच्छेद    परि + छेद
लक्ष्मीच्छाया    लक्ष्मी + छाया
Table 9: Rule VIII Implemented Word List
Step 4.9: (Rule of Visarga in Sandhi-vicheda) If a half letter is found at the junction, replace it with the visarga sign (ः).
निश्चल    निः + चल
निश्तेज    निः + तेज
दुस्साहस    दुः + साहस
निस्तार    निः + तार
Table 10: Rule IX Implemented Word List
Step 5: Repeat Steps 4.1 to 4.9 for the next word, checking for a Vyanjan combined with a Matra, and replace the Matra with the corresponding Swar.
Step 6: Find the Unicode value of each Hindi character and of the additional characters, and use those values to implement the above rules.
Step 7: Display the results.
Our module was developed in Visual Basic.NET (2005). The text is encoded in Unicode, which also makes it suitable for use by other applications. Unicode was originally designed around a 16-bit encoding that provides for 65,536 characters. The Unicode standard [18] assigns each character a unique numeric value and name. Presently it provides codes for 49,194 characters. In the Hindi language:
Total Swar = 13
Total Vyanjan = 33
Total Matra = 13
V RESULTS AND DISCUSSION
We tested our software on more than 200 words. Using the rule-based algorithm, we obtained an accuracy of 60-80%, depending on the number of rules implemented. Sandhi-Vicheda is an easy and interesting technique that can add a new dimension to the traditional approach to teaching Hindi.
VI CONCLUSION AND FUTURE WORK
In this paper, we presented a technique for the Sandhi-Vicheda of compound Hindi words. Using the rule-based algorithm, we obtained an accuracy of 60-80%, depending on the number of rules implemented. As future work, the database can be extended with more entries to improve the accuracy. This software can be used as a teaching aid for students from Class V to the highest level of education. With it, one can learn a very important aspect of Hindi grammar, namely Sandhi-Vicheda. By adding more features, it could be extended to cover all aspects of Hindi grammar. It can also be used to solve and test problems related to Hindi grammar.
ACKNOWLEDGEMENT
We would like to thank Dr. G.S. Lehal, Professor and Head, Department of Computer Science, Punjabi University, Patiala, for many helpful suggestions and comments.
REFERENCES
[1] Bharati, Akshar, Vineet Chaitanya & Rajeev Sangal, 1991, A Computational Grammar for Indian languages processing, Indian Linguistics Journal, 52, pp. 91-103.
[2] Bharati A., Chaitanya V. and Sangal R., "Natural Language Processing: A Paninian Perspective", Prentice Hall of India, 1995.
[3] Cheng, Chin-Chuan, "English Stresses and Chinese Tones in Chinese Sentences", California University, Berkeley, Phonology Laboratory.
[4] Dan W. Patterson, "Introduction to Artificial Intelligence and Expert Systems", Prentice Hall, p. 227.
[5] Elaine Rich, Kevin Knight, "Artificial Intelligence", Tata McGraw-Hill, Second Edition, p. 377.
[6] Jain Vinish, 2004, Sanskrit-English Anusāraka: Morphological Analyzer and Dictionary Component, IIIT-Hyderabad.
[7] James M. Scobbie (Queen Margaret University), Marianne Pouplier (Edinburgh University), Alan A. Wrench (Articulate Instruments Ltd.), "Conditioning Factors in External Sandhi: An EPG Study of English /l/ Vocalisation".
[8] Jha, Girish N., 2004, The system of Panini, Language in India, volume 4:2.
[9] Jha, Girish N. et al., 2006, Towards a Computational analysis system for Sanskrit, Proc. of first National symposium on Modeling and Shallow parsing of Indian Languages at Indian Institute of Technology Bombay, pp. 25-34.
[10] Jurafsky Daniel and James H. Martin, 2000, Speech and Language Processing, Prentice-Hall, New Delhi.
[11] Kasturi Venkateswara Rao, "A Web-Based Simple Sentence Level GB Translator from Hindi to Sanskrit", M.Tech (CS) Dissertation, School of Computer Systems Sciences, Jawaharlal Nehru University, New Delhi.
[12] Mitkov Ruslan, The Oxford Handbook of Computational Linguistics, Oxford University Press.
[13] Peng, Shu-hui (1994), "Effects of prosodic position and tonal context on Taiwanese Tones", Ohio State University Working Papers in Linguistics, 44, pp. 166-190.
[14] Resource Centre For Indian Language Technology Solutions (Sanskrit, Japanese, Chinese), Jawaharlal Nehru University, New Delhi, "Achievements".
[15] Scobbie, J. & Wrench, A., 2003, "An articulatory investigation of word-final /l/ and /l/-sandhi in three dialects of English", Proc. XVth ICPhS, pp. 1871-1874.
[16] Suraj Bhan Singh, Hindi bhasha: Sandharbh aur Sanrachna, Sahitya Sahakar, 1991.
[17] Whitney, W.D., 2002, History of Sanskrit Grammar, Sanjay Prakashan, Delhi.
IJCSI CALL FOR PAPERS JANUARY 2010 ISSUE
The topics suggested by this issue can be discussed in terms of concepts, surveys, state of the art, research, standards, implementations, running experiments, applications, and industrial case studies. Authors are invited to submit complete unpublished papers, which are not under review in any other conference or journal, in the following (but not limited to) topic areas. See the authors' guide for manuscript preparation and submission guidelines. Accepted papers will be published online and indexed by Google Scholar, CiteSeerX, Directory for Open Access Journal (DOAJ), Bielefeld Academic Search Engine (BASE), SCIRUS and more, and authors will be provided with printed copies.
Deadline: 15th December 2009
Notification: 15th January 2010
Online Publication: 31st January 2010
• Evolutionary computation
• Industrial systems
• Autonomic and autonomous systems
• Bio-technologies
• Knowledge data systems
• Mobile and distance education
• Intelligent techniques, logics, and systems
• Knowledge processing
• Information technologies
• Internet and web technologies
• Digital information processing
• Cognitive science and knowledge agent-based systems
• Mobility and multimedia systems
• Systems performance
• Networking and telecommunications
• Software development and deployment
• Knowledge virtualization
• Systems and networks on the chip
• Context-aware systems
• Networking technologies
• Security in network, systems, and applications
• Knowledge for global defense
• Information Systems [IS]
• IPv6 Today - Technology and deployment
• Modeling
• Optimization
• Complexity
• Natural Language Processing
• Speech Synthesis
• Data Mining
All submitted papers will be judged based on their quality by the technical committee and reviewers. Papers that describe research and experimentation are encouraged. All paper submissions will be handled electronically, and detailed instructions on the submission procedure are available on the IJCSI website (http://www.ijcsi.org). For other information, please contact the IJCSI Managing Editor ([email protected]).
Website: http://www.ijcsi.org
IJCSI
The International Journal of Computer Science Issues (IJCSI) is a refereed journal for scientific papers dealing with any area of computer science research. The journal was established to assist the development of science, to provide fast publication and archiving of research materials and results, and to represent the scientific views of the community. It also provides a venue for researchers, students and professionals to submit ongoing research and developments in these areas. Authors are encouraged to contribute to the journal by submitting articles that present new research results, projects, surveys and industrial experiences describing significant advances in the field of computer science.
Indexing of IJCSI:
1. Google Scholar
2. Directory for Open Access Journals (DOAJ)
3. Bielefeld Academic Search Engine (BASE)
4. CiteSeerX
5. SCIRUS
Frequency of Publication: Monthly