Data Mining

  • Uploaded by: sunnynnus
  • May 2020


• What is Data Mining?
• What is Data Warehousing?
• What is the need for Data Mining?
• Data Mining Architecture
• Data Mining Algorithms

Data Mining
• Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses.

Data Warehousing
• A system for storing and delivering massive quantities of data.
• The critical factor leading to the use of a data warehouse is that a data analyst can perform complex queries and analysis (such as data mining) on the information without slowing down the operational systems.

Data Mining
• Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions.
• Data mining tools can answer business questions that traditionally were too time-consuming to resolve.

Uses
• Retailing
• Weather Forecasting
• Traffic Congestion

Algorithms
• Data mining algorithms traditionally fall into one of four broad categories:
  • Classification
  • Clustering
  • Association
  • Sequence discovery

• Classification, or supervised induction, is perhaps the most common of all data mining activities. The objective of classification is to analyze the historical data stored in a database and to automatically generate a model that can predict future behavior.
• This induced model consists of generalizations over the records of a training data set, which help distinguish predefined classes.
• The hope is that this model can then be used to predict the classes of other unclassified records.
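As a rough illustration of the train-then-predict idea, here is a minimal sketch in Python. It uses a nearest-centroid classifier, which is only one simple way to "generalize over training records" and is not a method named in the slides. The record layout (monthly visits, average basket value) and the class names are made up for the example.

```python
from collections import defaultdict
import math

def train_centroids(records):
    """Build a model: one mean feature vector (centroid) per predefined class.

    `records` is a list of (features, label) pairs. The field layout is
    hypothetical and chosen only for this illustration.
    """
    sums = {}
    counts = defaultdict(int)
    for features, label in records:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}

def predict(model, features):
    """Assign an unclassified record to the class with the closest centroid."""
    return min(model, key=lambda label: math.dist(model[label], features))

# Made-up historical data: (monthly visits, average basket value) -> customer class.
training = [
    ((1, 10.0), "occasional"),
    ((2, 12.0), "occasional"),
    ((8, 40.0), "loyal"),
    ((9, 45.0), "loyal"),
]
model = train_centroids(training)
label = predict(model, (7, 38.0))  # a new, unclassified record
```

The model here is just two averaged vectors; real classifiers generalize in richer ways, but the workflow (fit on labelled history, then label new records) is the same.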

• Common tools used for classification are neural networks, decision trees and if-then-else rules that need not have a tree structure.

• Neural networks involve the development of mathematical structures with the ability to learn.

• Decision trees classify data into a finite number of classes, based on the values of the variables. A decision tree is essentially a hierarchy of if-then statements, so it is typically much faster to train and to evaluate than a neural network.
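To make the "hierarchy of if-then statements" concrete, the sketch below stores a small hand-written tree as nested dictionaries and walks it to classify a record. The feature names, thresholds, and class labels are hypothetical; in practice the tree would be induced from training data rather than written by hand.

```python
# Each internal node tests one variable against a threshold;
# each leaf is a class label (a plain string).
TREE = {
    "feature": "visits",
    "threshold": 5,
    "below": {
        "feature": "basket",
        "threshold": 30.0,
        "below": "occasional-low",
        "at_or_above": "occasional-high",
    },
    "at_or_above": {
        "feature": "basket",
        "threshold": 30.0,
        "below": "loyal-low",
        "at_or_above": "loyal-high",
    },
}

def classify(node, record):
    """Follow if-then tests from the root until a leaf (class label) is reached."""
    while isinstance(node, dict):
        if record[node["feature"]] >= node["threshold"]:
            node = node["at_or_above"]
        else:
            node = node["below"]
    return node

result = classify(TREE, {"visits": 8, "basket": 45.0})
```

Classifying a record costs one comparison per tree level, which is one reason tree evaluation tends to be cheap.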

• Rule induction is the extraction of useful if-then rules from data based on statistical significance. The if-then statements used here need not be hierarchical.
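A minimal sketch of flat rule extraction follows. As a stand-in for a proper significance test, it keeps a rule "if attribute = value then class" only when the rule covers enough records (`min_support`) and is right often enough (`min_accuracy`). Both thresholds, the function name, and the toy weather data are assumptions made for this example.

```python
from collections import Counter, defaultdict

def induce_rules(records, min_support=2, min_accuracy=0.8):
    """Return flat rules as (attribute, value, predicted_class, accuracy) tuples."""
    outcomes = defaultdict(Counter)
    for attributes, label in records:
        for attribute, value in attributes.items():
            outcomes[(attribute, value)][label] += 1

    rules = []
    for (attribute, value), counts in outcomes.items():
        label, hits = counts.most_common(1)[0]   # majority class for this test
        total = sum(counts.values())             # records the rule covers
        if total >= min_support and hits / total >= min_accuracy:
            rules.append((attribute, value, label, hits / total))
    return sorted(rules)

# Made-up records: attribute values -> observed outcome.
records = [
    ({"outlook": "sunny", "windy": False}, "play"),
    ({"outlook": "sunny", "windy": True}, "play"),
    ({"outlook": "rainy", "windy": True}, "stay"),
    ({"outlook": "rainy", "windy": False}, "stay"),
    ({"outlook": "sunny", "windy": False}, "play"),
]
rules = induce_rules(records)
```

Each surviving rule stands on its own; none is nested under another, which is the difference from a decision tree.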

• Clustering partitions the database into segments whose members share similar qualities.
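One common way to form such segments is k-means, sketched below. The slides do not name a specific clustering algorithm, so this is an illustrative choice; the naive initialization (first k points as starting centroids) and the sample points are assumptions.

```python
import math

def k_means(points, k, iterations=10):
    """Partition points into k segments by repeatedly assigning each point
    to its nearest centroid and recomputing the centroids."""
    centroids = [list(p) for p in points[:k]]  # naive init: first k points
    assignment = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment, centroids

points = [(1.0, 1.0), (9.0, 9.0), (1.5, 2.0), (8.0, 8.5), (1.0, 0.5), (9.5, 8.0)]
assignment, centroids = k_means(points, k=2)
```

Unlike classification, no labels are supplied: the segments emerge from the distances between records alone.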

• Associations establish relationships among items that occur together in a given record.
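For a retail example, co-occurrence can be measured with support (how often two items appear in the same record) and confidence (how often the second item appears given the first). The sketch below handles item pairs only; the function name, the `min_support` threshold, and the baskets are made up for illustration.

```python
from collections import Counter
from itertools import combinations

def pair_rules(transactions, min_support=0.5):
    """Return {(a, b): confidence of 'a implies b'} for frequently co-occurring pairs."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))            # ignore duplicates within a record
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        if count / n >= min_support:           # support: share of records with both
            rules[(a, b)] = count / item_counts[a]
            rules[(b, a)] = count / item_counts[b]
    return rules

baskets = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["bread", "milk"],
    ["butter", "jam"],
]
rules = pair_rules(baskets)
```

In this toy data every basket containing milk also contains bread, so the rule from milk to bread has confidence 1.0, while the reverse direction is weaker.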

• Sequence Discovery can be looked at as the identification of associations over time. When appropriate information is available (for instance, the identity of a customer in a retail shop), a temporal analysis can be conducted to identify behavior over time.
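A small sketch of this temporal analysis: purchases are grouped by customer identity, ordered by time, and every "earlier item, later item" pattern is counted once per customer. The event layout (customer id, timestamp, item) and the sample data are hypothetical.

```python
from collections import Counter, defaultdict

def sequential_pairs(events):
    """Count, across customers, how many bought `earlier` and then `later`.

    `events` is a list of (customer_id, timestamp, item) tuples.
    """
    history = defaultdict(list)
    for customer, _timestamp, item in sorted(events, key=lambda e: (e[0], e[1])):
        history[customer].append(item)          # items in time order per customer

    counts = Counter()
    for items in history.values():
        seen = set()
        for i, earlier in enumerate(items):
            for later in items[i + 1:]:
                if earlier != later:
                    seen.add((earlier, later))
        counts.update(seen)                     # each pattern counted once per customer
    return counts

events = [
    ("c1", 1, "phone"), ("c1", 5, "case"),
    ("c2", 2, "phone"), ("c2", 3, "charger"), ("c2", 9, "case"),
    ("c3", 4, "case"), ("c3", 6, "phone"),
]
patterns = sequential_pairs(events)
```

Without the customer identity the events could not be grouped into per-customer histories, which is why the text singles it out as the information that makes the analysis possible.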
