What is Data Mining?


Data mining is a technology that blends traditional data analysis methods with sophisticated algorithms for processing large volumes of data. It is the process of extracting implicit, previously unknown, and potentially meaningful information from data: the exploration and analysis, by automatic or semi-automatic means, of large quantities of data to discover meaningful patterns.

Tasks and features of Data mining

1) Prediction Methods

  • Learn from some variables, then predict unknown or future values of other variables.
  • X -> y

2) Description Methods

  • Find human-interpretable patterns that describe the data
  • Regression, correlation, and graphs
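To make the two kinds of task concrete, here is a minimal sketch in plain Python. The data (hours studied vs. exam score) and every variable name are made up for illustration. The correlation coefficient is a description of the data, while the fitted line is used for prediction (X -> y).

```python
import math

# Hypothetical data: hours studied (x) and exam score (y).
x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

# Description: Pearson correlation summarizes how strongly x and y move together.
r = sxy / math.sqrt(sxx * syy)

# Prediction: a least-squares line learned from the known (x, y) pairs.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

def predict_y(new_x):
    """Predict y for an x value that was not in the data."""
    return intercept + slope * new_x

print(f"correlation r = {r:.3f}")
print(f"predicted y at x = 6: {predict_y(6):.2f}")
```

The same numbers feed both tasks. What differs is the goal: summarizing what is already in the data versus estimating a value that has not been observed.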

Supervised vs. unsupervised learning

Supervised learning – predicting a given target variable from the other features of the dataset.

1) Regression problems

  Target: continuous/numeric variables

  Example of regression:

  • Multivariate regression
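As a rough sketch of regression with several input variables (which is how "multivariate regression" is read here), the code below fits ordinary least squares by solving the normal equations with a small Gaussian elimination routine. The function names and the toy data are assumptions for illustration. In practice a numerical library would normally do this step.

```python
def fit_linear_regression(X, y):
    """Ordinary least squares: solve (X^T X) beta = X^T y for beta."""
    rows = [[1.0] + list(r) for r in X]  # prepend a column of 1s for the intercept
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(p)
    ]
    # Forward elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            factor = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution
    beta = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (A[i][p] - tail) / A[i][i]
    return beta  # [intercept, coefficient of x1, coefficient of x2, ...]

def predict(beta, row):
    """Apply the fitted coefficients to one new observation."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))

# Toy data generated from y = 1 + 2*x1 + 3*x2 (continuous target).
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [9, 8, 19, 18, 26]

beta = fit_linear_regression(X, y)
print([round(b, 6) for b in beta])
print(round(predict(beta, (6, 2)), 6))
```

Because the toy target was built without noise, the fitted coefficients should recover the generating values up to floating-point error.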

2) Classification problems

  Target: categorical variables

  Examples of classification:

  • Logistic Regression
  • Decision tree
  • Rule-based
  • Ensemble
  • Artificial Neural Networks (ANN)
  • Support Vector Machine
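To show what a categorical target looks like in practice, here is a minimal sketch of a decision stump, a decision tree with a single split. It is a deliberately simplified stand-in for the decision tree item above, not a full tree learner. The feature values, the labels, and the function names are all hypothetical.

```python
from collections import Counter

def majority(labels):
    """Most frequent label in a list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(X, y):
    """Find the single (feature, threshold) split with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        # Candidate thresholds: midpoints between neighbouring distinct values
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab, r_lab = majority(left), majority(right)
            errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return {"feature": f, "threshold": t, "left": l_lab, "right": r_lab}

def predict_stump(stump, row):
    """Classify one observation with the fitted split."""
    if row[stump["feature"]] <= stump["threshold"]:
        return stump["left"]
    return stump["right"]

# Hypothetical measurements (length, width) with a categorical target.
X = [(1.4, 0.2), (1.3, 0.3), (1.5, 0.2), (4.7, 1.4), (4.5, 1.5), (5.0, 1.7)]
y = ["small", "small", "small", "large", "large", "large"]

stump = fit_stump(X, y)
print(stump)
print(predict_stump(stump, (1.6, 0.4)), predict_stump(stump, (4.9, 1.6)))
```

A full decision tree repeats this search recursively inside each branch. Ensemble methods combine many such trees.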

Unsupervised learning – a set of statistical tools for better understanding and describing the data; the analysis is performed without a target variable.

1) Association analysis

  Finding relationships between categorical variables

  Example of association analysis:

  • ‘Apriori’ Algorithm
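The core idea of Apriori can be sketched in a few lines: count which itemsets appear in at least a minimum share of transactions, and only build larger candidates from itemsets that are already frequent. The shopping baskets and the 0.6 support threshold below are made up for illustration, and the sketch stops at frequent itemsets without going on to generate association rules.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Share of transactions that contain every item in the itemset
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent

# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

frequent = apriori(baskets, min_support=0.6)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

The prune step is what keeps the search manageable: an itemset cannot be frequent if any of its subsets is infrequent, so those candidates are dropped before they are counted.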

2) Dimension reduction

  Grouping by columns (variables)

  • Principal Components Analysis
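As an illustration of reducing columns, the sketch below finds the first principal component of a small two-column dataset with power iteration on the covariance matrix. This is one simple way to get the leading component, chosen so the example needs no external library. The data and the function name are assumptions.

```python
import math

def first_principal_component(data, iterations=100):
    """Return (direction, explained variance ratio, 1-D scores) for the top component."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix of the columns
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along v, and its share of the total variance
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    explained = eigenvalue / sum(cov[i][i] for i in range(d))
    # Project each row onto the component: two columns become one
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, explained, scores

# Hypothetical data where the second column is roughly twice the first.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]

direction, explained, scores = first_principal_component(data)
print([round(x, 3) for x in direction])
print(round(explained, 4))
```

Since the two columns are almost perfectly related, a single component captures nearly all of the variance, which is the situation where dimension reduction loses little information.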

3) Clustering

  Grouping by rows (observations)

  • K-means clustering
  • Hierarchical clustering
  • DBSCAN
  • Graph-based clustering
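To illustrate grouping rows, here is a minimal k-means sketch. The points are invented, and the initial centroids are simply the first k points so that the run is reproducible. That is a simplification: random or k-means++ initialization is more common in practice.

```python
def kmeans(points, k, iterations=100):
    """Alternate between assigning points to the nearest centroid and updating centroids."""
    centroids = [list(p) for p in points[:k]]  # simplistic init: first k points
    labels = None
    for _ in range(iterations):
        # Assignment step: index of the closest centroid (squared Euclidean distance)
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical 2-D observations forming two loose groups.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]

labels, centroids = kmeans(points, k=2)
print(labels)
print([[round(v, 2) for v in c] for c in centroids])
```

No target variable is involved: the labels are produced by the algorithm from the distances between rows.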

Warmest regards,

This is the first post in a series covering the large body of material on data mining. I hope you enjoyed it, and please stay tuned for the next uploads!
