Data Mining Techniques :=
- Several major data mining techniques have been developed and used in
data mining projects, including association, classification, clustering,
prediction and sequential patterns. We will briefly examine each of
these techniques with an example to give a good overview of them.
1) Association
Association is one of the best-known data mining techniques. In
association, a pattern is discovered based on the relationship between a
particular item and other items in the same transaction. For example,
the association technique is used in market basket analysis to identify
which products customers frequently purchase together. Based on this
data, businesses can run corresponding marketing campaigns to sell more
products and make more profit.
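To make the market basket idea concrete, here is a minimal sketch in Python. The transactions, item names and the helper functions pair_support and confidence are all hypothetical and exist only for illustration. The sketch counts how often two items appear in the same basket (support) and how often one item appears given that another is present (confidence). Real projects would normally use a dedicated algorithm such as Apriori rather than this brute-force count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

def confidence(baskets, antecedent, consequent):
    """Estimate how often `consequent` is bought when `antecedent` is bought."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)

support = pair_support(transactions)
best_pair = max(support, key=support.get)  # most frequently co-purchased pair
print(best_pair, support[best_pair])
print(confidence(transactions, "bread", "butter"))
```

In this made-up data, bread and butter are the pair bought together most often, which is the kind of finding a marketing campaign could act on.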
2) Classification
Classification is a classic data mining technique based on machine
learning. Basically, classification is used to assign each item in a set
of data to one of a predefined set of classes or groups. Classification
methods make use of mathematical techniques such as decision trees,
linear programming, neural networks and statistics. In classification,
we build software that can learn how to classify data items into groups.
For example, we can apply classification to the following problem:
“given all past records of employees who left the company, predict which
current employees are likely to leave in the future.” In this case, we
divide the employee records into two groups, “leave” and “stay”. We can
then ask our data mining software to classify each employee into one of
the two groups.
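The employee example can be sketched with a one-level decision tree (often called a decision stump). Everything below is an assumption made for illustration: the single feature (years at the company), the past records, and the rule direction (short tenure means "leave"). It is not a complete decision tree learner, only a sketch of how a rule is learned from labelled records and then applied to new ones.

```python
# Hypothetical past records: (years_at_company, known outcome).
past_records = [
    (0.5, "leave"), (1.0, "leave"), (1.5, "leave"), (2.0, "stay"),
    (3.0, "stay"), (4.0, "leave"), (5.0, "stay"), (7.0, "stay"),
]

def classify(tenure, threshold):
    """Apply the learned rule: tenure below the threshold means 'leave'."""
    return "leave" if tenure < threshold else "stay"

def train_stump(records):
    """Pick the tenure threshold that misclassifies the fewest past records."""
    values = sorted({tenure for tenure, _ in records})
    # Candidate thresholds are midpoints between neighbouring tenure values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        return sum(
            1 for tenure, label in records
            if classify(tenure, threshold) != label
        )

    return min(candidates, key=errors)

threshold = train_stump(past_records)

# Hypothetical current employees and their tenure in years.
current_employees = {"A": 0.8, "B": 6.0}
predictions = {
    name: classify(tenure, threshold)
    for name, tenure in current_employees.items()
}
print(threshold, predictions)
```

The key point is that the classes ("leave" and "stay") are fixed in advance and the software only learns where the boundary between them lies.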
3) Clustering
Clustering is a data mining technique that uses an automatic technique
to form meaningful or useful clusters of objects that have similar
characteristics. Unlike classification, where objects are assigned to
predefined classes, clustering also defines the classes themselves and
then puts objects into them. To make the concept clearer, we can take a
library as an example. In a library, books cover a wide range of topics.
The challenge is how to arrange those books so that readers can pick up
several books on a specific topic without hassle. By using clustering,
we can keep books that are similar in some way in one cluster, or on one
shelf, and label it with a meaningful name. Readers who want books on a
topic then only need to go to that shelf instead of searching the whole
library.
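As an illustration of how clusters can be found without predefined classes, here is a minimal k-means sketch. k-means is only one of many clustering algorithms and is chosen here as an assumption. The books are represented by two hypothetical numeric scores, and the starting centroids are picked by hand so the result is repeatable.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [
                (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids
            ]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean_x = sum(x for x, _ in cluster) / len(cluster)
                mean_y = sum(y for _, y in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters

# Hypothetical books, each described by two made-up topic scores.
books = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, shelves = kmeans(books, [books[0], books[3]])
for shelf in shelves:
    print(shelf)
```

Each resulting group plays the role of one shelf. Note that nothing told the algorithm what the shelves should be; it derived them from the similarity of the books alone.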
4) Prediction
Prediction, as its name implies, is a data mining technique that
discovers relationships among independent variables and relationships
between dependent and independent variables. For instance, prediction
analysis can be used in sales to predict future profit: if we consider
sales to be the independent variable, profit can be the dependent
variable. Then, based on historical sales and profit data, we can draw a
fitted regression curve that is used for profit prediction.
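The regression idea can be sketched with an ordinary least-squares straight line. The choice of a straight line, the function names fit_line and predict, and the sales and profit figures are all assumptions for illustration; real data may call for a different curve.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Predict the dependent variable for a new value of x."""
    return slope * x + intercept

# Hypothetical history: sales (independent) and profit (dependent).
sales = [100.0, 150.0, 200.0, 250.0, 300.0]
profit = [12.0, 19.0, 24.0, 31.0, 36.0]

slope, intercept = fit_line(sales, profit)
print(slope, intercept)
print(predict(400.0, slope, intercept))  # predicted profit for sales of 400
```

The fitted line summarizes the historical relationship, and evaluating it at a new sales figure gives the predicted profit.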
5) Sequential Patterns
Sequential pattern analysis is a data mining technique that seeks to
discover similar patterns in transaction data over a business period.
The uncovered patterns are used for further business analysis to
recognize relationships among data.
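A very small sketch of the idea, under stated assumptions: each customer has a time-ordered purchase history (the data below is made up), and we count in what fraction of histories one item is bought some time before another. The function name sequential_pair_support and the 0.5 threshold are hypothetical. Established algorithms for this task handle longer sequences and larger data far more efficiently.

```python
from collections import Counter

# Hypothetical purchase histories, one time-ordered list per customer.
sequences = [
    ["phone", "case", "charger"],
    ["phone", "charger"],
    ["laptop", "mouse"],
    ["phone", "case"],
]

def sequential_pair_support(histories):
    """Fraction of histories in which one item occurs before another."""
    counts = Counter()
    for history in histories:
        seen_pairs = set()  # count each ordered pair once per history
        for i, first in enumerate(history):
            for later in history[i + 1:]:
                if first != later:
                    seen_pairs.add((first, later))
        counts.update(seen_pairs)
    return {pair: n / len(histories) for pair, n in counts.items()}

support = sequential_pair_support(sequences)

# Keep only the ordered pairs seen in at least half of the histories.
frequent = {pair: s for pair, s in support.items() if s >= 0.5}
print(frequent)
```

Unlike the association example, order matters here: "phone, then case" is counted, while "case, then phone" never occurs in this data and so has no support.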