Understanding Patterns and Data Mining

StatSoft (n.d.) defines data mining as an “analytic process designed to explore data (usually large amounts of data – typically business or market related – also known as “big data”) in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new subsets of data.” They go on to state that “the ultimate goal of data mining is prediction”.

Figure 1.0 illustrates the multiple disciplines that contribute to data mining (GDi Techno Solutions, 2012).

As mentioned in Brookshear et al. (2012, pp. 414-415), the different types of data mining are as follows:

  • Class description
  • Class discrimination
  • Cluster Analysis
  • Association Analysis
  • Outlier Analysis
  • Sequential Pattern Analysis

Class description

Also known as characterization, class description is a data mining technique that, according to Brookshear et al. (2012), “deals with identifying properties that characterize a given group of data items”. It produces a descriptive summary of the characteristics of a group, such as a set of customers.

An Example: This pattern shows the characteristics of customers who spend more than, say, USD 500 a year at CPJ Market’s online store. The result can be a general profile of those customers, such as age, marital status, employment status or credit rating.
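To make the idea concrete, here is a minimal Python sketch of a class description. The customer records, the field names and the `describe_class` helper are all invented for illustration; a real system would read from the store's database and would probably summarize many more attributes.

```python
from collections import Counter
from statistics import mean

# Hypothetical customer records; every field and value is invented for illustration.
customers = [
    {"age": 34, "marital_status": "married", "employment": "employed", "annual_spend": 820.0},
    {"age": 27, "marital_status": "single", "employment": "employed", "annual_spend": 140.0},
    {"age": 45, "marital_status": "married", "employment": "self-employed", "annual_spend": 610.0},
    {"age": 52, "marital_status": "married", "employment": "employed", "annual_spend": 990.0},
    {"age": 23, "marital_status": "single", "employment": "student", "annual_spend": 75.0},
]


def most_common(values):
    """Return the value that occurs most often."""
    return Counter(values).most_common(1)[0][0]


def describe_class(records, threshold=500.0):
    """Summarize the customers whose annual spend exceeds the threshold."""
    target = [r for r in records if r["annual_spend"] > threshold]
    if not target:
        return {"count": 0}
    return {
        "count": len(target),
        "mean_age": mean(r["age"] for r in target),
        "typical_marital_status": most_common(r["marital_status"] for r in target),
        "typical_employment": most_common(r["employment"] for r in target),
    }


profile = describe_class(customers)
print(profile)
```

The returned dictionary is the "general profile" described above: how many customers fall in the class, their average age and their most common marital and employment status.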

Class discrimination

Class discrimination compares or contrasts two groups: “class discrimination deals with identifying properties that divide two groups” (Brookshear et al., 2012, p. 414). According to Zaiane (1999), “The techniques used for data discrimination are very similar to the techniques used for data characterization with the exception that data discrimination results include comparative measures.”

An Example: This pattern can be used to compare the general characteristics of customers who bought complete albums on iTunes last year against those who bought fewer than three tracks from an album.
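A rough Python sketch of such a comparison follows. The two groups, their attributes and the `discriminate` helper are hypothetical. The "comparative measure" used here is simply the difference between group means, which is only one of many possible choices.

```python
from statistics import mean

# Hypothetical listeners; fields and values are invented for illustration.
album_buyers = [
    {"age": 41, "purchases_per_year": 12},
    {"age": 38, "purchases_per_year": 9},
    {"age": 47, "purchases_per_year": 15},
]
single_track_buyers = [
    {"age": 19, "purchases_per_year": 30},
    {"age": 24, "purchases_per_year": 22},
    {"age": 23, "purchases_per_year": 26},
]


def discriminate(group_a, group_b, attributes):
    """For each attribute, report the mean in each group and the gap between them."""
    report = {}
    for attr in attributes:
        mean_a = mean(r[attr] for r in group_a)
        mean_b = mean(r[attr] for r in group_b)
        report[attr] = {"mean_a": mean_a, "mean_b": mean_b, "difference": mean_a - mean_b}
    return report


report = discriminate(album_buyers, single_track_buyers, ["age", "purchases_per_year"])
for attr, row in report.items():
    print(attr, row)
```

Attributes with a large difference are the ones that best separate the two groups. In this made-up data, album buyers are older and make fewer purchases per year.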

Cluster Analysis

Cluster analysis is also known as ‘unsupervised classification’. It is similar to classification in that it organizes the data into classes; however, unlike classification, the class labels are not known in advance, so it is up to the clustering algorithm to discover the classes.

An Example: Cluster analysis can be performed on Wal-Mart customer data in order to identify homogeneous subpopulations of customers.
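As an illustration, the sketch below runs a plain k-means loop, one common clustering algorithm among many, on made-up customer data. The data points, the choice of two clusters and the starting centroids are all assumptions made for the example. It needs Python 3.8 or later for `math.dist`.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical (store visits per month, average basket in USD) pairs.
customers = [(2, 20), (3, 25), (2, 22), (12, 90), (11, 95), (13, 88)]

# Starting centroids are picked by hand here so that the run is repeatable.
labels, centroids = k_means(customers, centroids=[customers[0], customers[3]])
print(labels)
print(centroids)
```

No labels are supplied. The algorithm groups the occasional small-basket shoppers and the frequent large-basket shoppers purely from the distances between the points.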

Figure 1.1 illustrates a cluster analysis pattern (Smart, 2013).

Association Analysis

From a sales and marketing perspective, association analysis determines which items are frequently sold together within the same transaction or time period.

Example: If a customer buys a fish tank, there is a 50% chance that he or she will also buy an air pump. This pattern is used most often by online stores; a prime example is Amazon and its techniques for up-selling items.
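The "50% chance" in the fish tank example corresponds to what association analysis usually calls the confidence of a rule. The sketch below computes support and confidence over a small, invented list of transactions. The helper names and the basket contents are assumptions made for the example.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


# Hypothetical shopping baskets.
transactions = [
    {"fish tank", "air pump", "gravel"},
    {"fish tank", "fish food"},
    {"fish tank", "air pump"},
    {"fish tank"},
    {"fish food", "gravel"},
    {"air pump"},
]

rule_confidence = confidence(transactions, {"fish tank"}, {"air pump"})
print(f"fish tank -> air pump: confidence {rule_confidence:.0%}")
```

In this data, four baskets contain a fish tank and two of those also contain an air pump, so the rule holds for half of the fish tank buyers.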

Figure 1.2 illustrates an association analysis pattern (Olsen, 2013).

Outlier Analysis

Outliers, also known as ‘exceptions’ or ‘surprises’, are data elements that cannot be grouped into a given class or cluster.

An Example: A well-known use of this is detecting fraudulent use of credit cards by spotting a sudden change in a customer’s buying pattern, especially when the volume of purchases increases very suddenly.
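One simple way to flag such a sudden change is a z-score test, sketched below in Python. The purchase amounts and the threshold of 2.0 are assumptions chosen for the example. Real fraud detection relies on far richer models than this.

```python
from statistics import mean, pstdev


def find_outliers(amounts, threshold=2.0):
    """Return the positions of amounts lying more than `threshold` standard deviations from the mean."""
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:  # all amounts identical, so nothing stands out
        return []
    return [i for i, x in enumerate(amounts) if abs(x - mu) / sigma > threshold]


# Hypothetical card purchases (USD) for one customer; the last one is unusually large.
purchases = [42.0, 38.0, 45.0, 40.0, 39.0, 44.0, 41.0, 400.0]

suspicious = find_outliers(purchases)
print(suspicious)
```

Only the final purchase is flagged. Note that a single extreme value also inflates the standard deviation, which is why a fairly low threshold is used on such a short history.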

Sequential Pattern Analysis

Also called data evolution analysis, this describes and models regularities or trends for objects whose behavior changes over time. It focuses on “characterizing, comparing, classifying or clustering of time-related data.” (Zaiane, 1999).

Example: This pattern can be used to predict future trends in stock market prices, which helps in deciding whether or not to invest in a given stock. It can be used by various financial or investment companies.
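As a very simplified illustration of modelling a trend in time-related data, the sketch below fits a straight line to a short, invented series of closing prices by least squares and extends it one step ahead. The prices and the `fit_linear_trend` helper are hypothetical, and a straight-line fit is not a realistic way to forecast a stock market.

```python
def fit_linear_trend(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ...; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical daily closing prices (USD).
closing_prices = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_linear_trend(closing_prices)
next_price = intercept + slope * len(closing_prices)  # extrapolate one day ahead
print(slope, next_price)
```

The slope summarizes the direction and speed of the trend, and extending the line gives a naive estimate of the next value.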

Database types, the data mining patterns found within them, and why

| Database Type | Data Mining Patterns | Why |
| --- | --- | --- |
| Transactional | Outlier, Association, Class discrimination, Class description, and Cluster | This database contains transactional data that could highlight each of these patterns. |
| Time-Series | Sequential Pattern and Cluster | This database contains stock exchange data and its movement over a period of time. |
| Sequence | Sequential Pattern, Class discrimination, Association, Class description, and Cluster | This type of database contains information about customer shopping sequences or browsing activity. |
| Multimedia | Class discrimination, Class description, and Cluster | A primary example would be Netflix, where these patterns could help improve the user experience and increase sales. |
| Legacy | Class discrimination, Association, Class description, Cluster, and Sequential Pattern | As the name suggests, this contains historical information that can span numerous patterns. It is a grouping of all the major databases. |

In conclusion, data mining helps organizations make informed decisions with regard to the pattern of interest. Not all patterns will suit a given organization, and those that do will provide the most value or become the source of vital information. For marketing companies or supermarkets, this can facilitate an increase in sales and revenue, as well as better management of inventory restocking and the supply chain. Data mining has become an important facet of the information age, and more research is being done to improve its usefulness.


References:

Brookshear, J. G., Smith, D. and Brylow, D. (2012) Computer Science: An Overview, 11th Edition. Reading, MA: Pearson (Addison-Wesley).

GDi Techno Solutions (2012) Data mining – GDi Techno Solutions, Available at: http://www.slideshare.net/gditechnosolutions/data-mining-gdi-techno-solutions (Accessed: May 8, 2016).

Olsen, J. (2013) Shopping for KPIs: Market Basket Analysis for Web Analytics Data, Available at: https://blogs.adobe.com/digitalmarketing/analytics/shopping-for-kpis-market-basket-analysis-for-web-analytics-data/ (Accessed: May 8, 2016).

Smart, F. (2013) Cluster Analysis, Available at: http://www.econometricsbysimulation.com/2013/09/cluster-analysis.html (Accessed: May 8, 2016).

StatSoft (n.d.) What is Data Mining (Predictive Analytics, Big Data), Available at: http://www.statsoft.com/Textbook/Data-Mining-Techniques#mining (Accessed: May 8, 2016).

Zaiane, O. R. (1999) Chapter I: Introduction to Data Mining, Available at: https://webdocs.cs.ualberta.ca/~zaiane/courses/cmput690/notes/Chapter1/ (Accessed: May 8, 2016).
