CMPS-MA02 Final Report: Data Mining and Intrusion Detection
Matthew Sparkes
February, 2004

1 Introduction

I propose to conduct research into the topic of data mining and its specific use in detecting network intrusions. I will then write a literature survey of approximately 3,500 words covering the different approaches to this concept and the current state of the art.

1.1 What is Data Mining?

Data mining is a data analysis technique that allows the user to discover previously unknown relationships in data that can be usefully applied to a problem.

1.2 What is Intrusion Detection?

1.3 How is Data Mining Applied to Intrusion Detection?

Intrusion detection systems fall into one of two types, which are technically quite similar but follow different methodologies: anomaly detection and misuse detection. [Florez]

1.3.1 Misuse Detection

Misuse detection presumes that an intrusion can be discovered by matching the current state of the system against a set of known intrusion states. Network intrusion detection requires that the network is set up to record certain metrics about its current state. These metrics are then used to compare the current state with saved states that represent possible intrusions. One example of such metric matching is examining logs to find unusual login patterns, such as an irregular login time for a particular user, which can often help to uncover illegal intrusions into the network. When the current state matches a sample intrusion state, the system automatically alerts a human operator, who can differentiate between a false positive and a genuine intrusion alarm. [Bloedorn 01]
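
The matching step described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `LoginEvent` fields, the two signature names, and the cut-off values are hypothetical and are not taken from [Bloedorn 01] or any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginEvent:
    """One recorded metric sample (hypothetical fields)."""
    user: str
    hour: int             # hour of day, 0-23
    failed_attempts: int  # failed logins before this one succeeded


# Hypothetical intrusion signatures: a name mapped to a predicate that
# returns True when the event matches that saved intrusion state.
SIGNATURES = {
    "off_hours_login": lambda e: e.hour < 6 or e.hour >= 22,
    "repeated_failures": lambda e: e.failed_attempts >= 5,
}


def match_signatures(event):
    """Return the names of all signatures the event matches."""
    return [name for name, matches in SIGNATURES.items() if matches(event)]


def raise_alerts(events):
    """Pair each matching event with its signatures for a human to review."""
    alerts = []
    for event in events:
        matched = match_signatures(event)
        if matched:
            alerts.append((event, matched))
    return alerts


events = [LoginEvent("alice", 14, 0), LoginEvent("bob", 3, 7)]
alerts = raise_alerts(events)
```

The sketch only produces candidate alerts; deciding whether each one is a false positive is left to the operator, as the text describes.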

1.3.2 Anomaly Detection

Anomaly detection works on the assumption that an intrusion will alter the behaviour of the system, so instead of looking for typical intrusion patterns, as misuse detection does, it looks for deviations from the usual running of the system. [Florez] These systems started out as programmed systems; that is, they were told what constituted anomalous behaviour. Thresholds were set for certain metrics so that, when they were exceeded, the system could detect that an anomalous value had been reached. This information, often drawn from more than one metric, could then be used to determine whether an intrusion was taking place. Some systems have incorporated self-learning, similar to the way speech recognition systems adapt to a person's voice: they analyse traffic for a period to determine what is and is not normal. In this way a system can generate its own model of normal network behaviour, and it is not necessary to program it manually. [?] Self-learning detectors can be further subdivided into those that use a continual time series technique to update their expectations of normal operating parameters, and non-time-series systems that take a subset of the network data to calculate normal behaviour.
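
The two self-learning styles can be contrasted with a short Python sketch. It is an illustration under stated assumptions, not a method from the cited papers: the three-standard-deviation rule, the exponentially weighted update, the warm-up length, and all names (`learn_baseline`, `EwmaDetector`, and so on) are hypothetical choices.

```python
import statistics


def learn_baseline(samples):
    """Non-time-series style: learn normal behaviour once from a subset."""
    return statistics.mean(samples), statistics.pstdev(samples)


def is_anomalous(value, baseline, k=3.0):
    """Flag a value more than k standard deviations from the learned mean."""
    mean, std = baseline
    if std == 0:
        return value != mean
    return abs(value - mean) > k * std


class EwmaDetector:
    """Time-series style: keep updating expectations as data arrives."""

    def __init__(self, alpha=0.1, k=3.0, warmup=5):
        self.alpha = alpha    # weight given to the newest sample
        self.k = k            # allowed deviation, in standard deviations
        self.warmup = warmup  # samples to observe before flagging anything
        self.count = 0
        self.mean = None
        self.var = 0.0

    def update(self, value):
        """Return True if value looks anomalous, then fold it into the model."""
        self.count += 1
        if self.mean is None:
            self.mean = value
            return False
        diff = value - self.mean
        std = self.var ** 0.5
        anomalous = (
            self.count > self.warmup and std > 0 and abs(diff) > self.k * std
        )
        # Exponentially weighted updates of mean and variance. Note that
        # anomalous values are also absorbed; a real system might skip them.
        self.mean += self.alpha * diff
        self.var = (1 - self.alpha) * (self.var + self.alpha * diff * diff)
        return anomalous


baseline = learn_baseline([100, 102, 98, 101, 99])
detector = EwmaDetector()
flags = [detector.update(v) for v in [100, 102, 98, 101, 99, 100, 101, 99, 150]]
```

Here the values stand for any single metric, such as connections per minute. In both styles only the final reading of 150 is flagged, while the readings near 100 are treated as normal.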

1.4 The Need for Human Operatives

Human operatives are still a vital part of a successful intrusion detection system. Their ability to differentiate between a false positive and an actual intrusion alert cannot currently be replicated in software or hardware. Some systems, described later, attempt to cover this part of the process using fuzzy logic and neural networks, but they still do not perform to the same standard as well-trained humans. [Florez] The need for a human element does not stop at categorising alerts as real or false; it also extends to deciding on appropriate action for those alerts classified as genuine.

1.5 What Happens After an Alert?

Many actions can be taken once an alert has been classified as genuine. If the problem is limited to one small part of the network, that part can be shut down, provided it is not a vital system. Another possibility is to contact the ISP from which the problem originates, so that it can track down the individual(s) responsible, and, if possible, also to inform the police. If the intrusion is not serious, or if it has not been possible to classify it as a definite intrusion, the appropriate action may be simply to record the event for future reference. [Bloedorn 01]

2 Topic Overview

2.1 Data Mining

Data mining is the process of finding patterns in data that may represent some previously unknown opportunity or phenomenon. It is used on data sets too large for human analysis to be practical.

2.2 Current Network Intrusion Detection Techniques

The main aim of mining data for computer security is to examine logs to find unusual patterns, such as an irregular login time, which can often help to uncover illegal intrusions into the network. This is a successful application of data mining and, helpfully, the target data is generated by a computer, so no data cleansing needs to be performed.

Research has been conducted into an entirely visual representation of network activity, based on the premise that humans can take in visual data at 150 Mb/s. [Yurcik et al] The idea is that one screen can show the state of many thousands of computers connected to the network, representing each as a two-pixel square of varying colour. The information for this display is taken from network logs that have been data mined for any erroneous data that should be highlighted.

Another visual system, Mining Alarming Incidents from Data Streams (MAIDS), has been developed using clustering techniques on a stream of network information (currently synthesised logs). It uses pie charts to represent traffic and the classification assigned to it by the system (illegal, legal, and so on). [Dora et al] This is a particularly interesting package, as tests have shown it to be well suited not just to detecting illegal intrusions but also to monitoring any constant stream of structured data, which would enable it to be used in a real-time monitoring scenario.
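
As a rough illustration of the one-square-per-computer display, the following Python sketch arranges per-host severity scores into rows of colour names that a renderer could then draw. The three-level severity scale, the colour names, and the `host_grid` function are assumptions for illustration only and do not describe the tools in [Yurcik et al].

```python
# Hypothetical mapping from a mined severity score to a display colour.
COLOURS = {0: "green", 1: "yellow", 2: "red"}


def host_grid(severities, width):
    """Arrange per-host severities row by row into a grid of colour names."""
    rows = []
    for start in range(0, len(severities), width):
        row = severities[start:start + width]
        rows.append([COLOURS[level] for level in row])
    return rows


# Five hosts laid out three per row; the third host has a serious alert.
grid = host_grid([0, 0, 2, 1, 0], width=3)
```

Because every host always occupies the same cell, an operator can learn where a machine sits on screen and spot a change of colour at a glance.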

2.3 Data Mining in Intrusion Detection

3 Existing Intrusion Detection Systems

3.1 Anomaly Detection Systems

3.2 Misuse Detection Systems

3.3 Hybrid Detection Systems

4 Pros and Cons of Implementing Data Mining as Intrusion Detection

Data mining can provide a very reliable detector for intrusions. By using adaptive rule learning, very sophisticated intrusion signatures can be maintained, making misuse detection very accurate. However, these rules need to be input into the system, as misuse systems cannot detect intrusions for which they do not have a signature. Anomaly detectors can cover the gap left by misuse systems, where new attacks are not recognised, so using a combination of both is more secure than either on its own. In both systems it is necessary to set up some way of taking various metrics from the network in order to have a sufficient data set to analyse, which may be costly to implement. It is also necessary to train staff in the software's installation and use. The actual cost of the system can be kept to a minimum by using open source systems or by writing the software in-house.
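
The argument for combining the two approaches can be sketched in a few lines of Python. This is a hedged illustration only: the event fields (`distinct_ports`, `bytes_out`), the `port_scan` signature, and the normal range are hypothetical values chosen for the example.

```python
def hybrid_verdict(event, signatures, normal_range):
    """Classify one event: misuse signatures first, then an anomaly check."""
    # Misuse detection: precise, but only for attacks with a known signature.
    for name, matches in signatures.items():
        if matches(event):
            return "misuse:" + name
    # Anomaly detection: covers the gap left by missing signatures.
    low, high = normal_range
    if not low <= event["bytes_out"] <= high:
        return "anomaly"
    return "normal"


# Hypothetical signature set and learned range of normal outbound traffic.
signatures = {"port_scan": lambda e: e["distinct_ports"] > 100}
normal_range = (0, 5_000_000)

events = [
    {"distinct_ports": 3, "bytes_out": 120_000},
    {"distinct_ports": 450, "bytes_out": 90_000},
    {"distinct_ports": 2, "bytes_out": 80_000_000},
]
verdicts = [hybrid_verdict(e, signatures, normal_range) for e in events]
```

The third event matches no signature, so a misuse-only system would let it pass; the anomaly check is what flags its unusually large outbound transfer.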

References

[Yurcik et al] "Two Visual Computer Network Security Monitoring Tools Incorporating Operator Interface Requirements", William Yurcik, James Barlow, Kiran Lakkaraju, Mike Haberman, National Center for Supercomputing Applications (NCSA)

[Dora et al] "MAIDS: Mining Alarming Incidents from Data Streams", Y. Dora Cai, David Clutter, Greg Pape, Jiawei Han, Michael Welge, Loretta Auvil, University of Illinois at Urbana-Champaign, U.S.A., Demonstration Proposal

[Florez] "An Improved Algorithm for Fuzzy Data Mining for Intrusion Detection", German Florez, Susan M. Bridges and Rayford B. Vaughn

[Axelsson 00] ""

[Lee 98] "Data Mining Approaches for Intrusion Detection", Wenke Lee and Salvatore Stolfo, Proceedings of the 7th USENIX Security Symposium, San Antonio, TX, 1998

[Bloedorn 01] "Data mining for network intrusion detection: How to get started.", Eric Bloedorn, Alan D. Christiansen, William Hill, Clement Skorupka, Lisa M. Talbot, and Jonathan Tivel, Technical report, The MITRE Corporation, 2001
