What is Data Mining?


What is Data Mining?

Data mining is a technology that blends traditional data analysis methods with sophisticated algorithms for processing large volumes of data. It is the process of extracting implicit, previously unknown, and potentially meaningful information from data. In other words, it is the exploration and analysis of large quantities of data, by automatic or semi-automatic means, in order to discover meaningful patterns.

Tasks and features of Data mining

1) Prediction Methods

Mainly for prediction. Learn from some variables, then predict unknown or future values of other variables.
X -> y

2) Description Methods

Find human-interpretable patterns that describe the data.
Regression, correlation, and graphs

Supervised learning – prediction of a given target variable using the other features of the dataset.

1) Regression problems
Target: Continuous/numeric variables
Example of regression:

Multivariate regression
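To make the regression idea concrete, here is a minimal sketch of a multivariate linear regression fitted by ordinary least squares through the normal equations. The function names (`solve_linear_system`, `fit_linear_regression`) and the toy data are made up for this illustration; in real projects a library would normally handle this step.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations apply to both.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Use the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_regression(X, y):
    """Return [intercept, coef_1, ..., coef_p] minimizing squared error."""
    # Prepend a constant 1 to every row so the intercept is learned too.
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical, noise-free toy data generated from y = 1 + 2*x1 + 3*x2.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [1 + 2 * a + 3 * b for a, b in X]
beta = fit_linear_regression(X, y)
print([round(v, 6) for v in beta])
```

Because the toy target is an exact linear function of the two features, the fitted coefficients should recover the intercept and slopes used to generate it, up to floating-point error.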

2) Classification problems
Target: Categorical variables
Examples of classification:

Logistic Regression
Decision tree
Rule-based
Ensemble
Artificial Neural Networks (ANN)
Support Vector Machine
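As a small illustration of a classification problem, the sketch below fits a logistic regression with plain batch gradient descent. The helper names, the learning rate, the number of epochs, and the "hours studied vs. pass/fail" data are all assumptions chosen for this example, not a recommended setup.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_regression(X, y, lr=0.1, epochs=5000):
    """Fit weights [bias, w_1, ..., w_p] by batch gradient descent."""
    p = len(X[0])
    n = len(X)
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for row, target in zip(X, y):
            feats = [1.0] + list(row)  # leading 1.0 pairs with the bias
            pred = sigmoid(sum(wi * fi for wi, fi in zip(w, feats)))
            err = pred - target
            for j, fj in enumerate(feats):
                grad[j] += err * fj
        # Step against the average gradient of the log-loss.
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def predict(w, row):
    """Return class 1 if the predicted probability is at least 0.5."""
    feats = [1.0] + list(row)
    prob = sigmoid(sum(wi * fi for wi, fi in zip(w, feats)))
    return 1 if prob >= 0.5 else 0


# Hypothetical data: hours studied -> fail (0) or pass (1).
X = [[1], [2], [3], [4], [6], [7], [8], [9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
w = fit_logistic_regression(X, y)
print([predict(w, row) for row in X])
```

The target here is categorical (0 or 1), which is what separates this from the regression case, where the target is numeric.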

Unsupervised learning – a set of statistical tools for better understanding and describing the data, where the analysis is performed without a target variable.

1) Association analysis
Finding relationships between categorical variables
Examples of association analysis:

‘Apriori’ Algorithm
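The core of Apriori is that an itemset can only be frequent if all of its subsets are frequent, so candidates are built level by level and pruned early. Below is a minimal sketch of the frequent-itemset step only (no rule generation); the `apriori` function name, the shopping baskets, and the 0.6 support threshold are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item in the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    current = {s for s in current if support(s) >= min_support}
    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {c for c in candidates if support(c) >= min_support}
    return frequent


# Hypothetical market baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer", "eggs"],
    ["milk", "diapers", "beer", "cola"],
    ["bread", "milk", "diapers", "beer"],
    ["bread", "milk", "diapers", "cola"],
]
frequent = apriori(baskets, min_support=0.6)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

With this threshold an itemset has to appear in at least three of the five baskets, so pairs such as diapers and beer survive while bread and beer do not.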

2) Dimension reduction
Grouping by columns

Principal Components Analysis
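To show what "grouping by columns" can look like, the sketch below estimates the first principal component of a two-column dataset with power iteration on the covariance matrix, then projects each row onto it so two columns become one. Power iteration is just one simple way to get the leading component and is swapped in here for brevity; the function name and the data are assumptions for this example.

```python
import math


def first_principal_component(data, iterations=200):
    """Estimate the first principal component by power iteration."""
    n, p = len(data), len(data[0])
    # Center each column so variance is measured around the mean.
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    # Sample covariance matrix (p x p).
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance captured along v (Rayleigh quotient, since v has unit length).
    variance = sum(v[i] * cov[i][j] * v[j] for i in range(p) for j in range(p))
    return v, variance, centered


# Hypothetical data: two columns that move almost together.
data = [[1, 1.1], [2, 1.9], [3, 3.2], [4, 3.8], [5, 5.0]]
v, variance, centered = first_principal_component(data)
# One score per row: the two original columns reduced to a single column.
scores = [sum(c * vi for c, vi in zip(row, v)) for row in centered]
print(v, variance)
print(scores)
```

Since the two columns are strongly correlated, the component points roughly along the diagonal and a single score per row keeps most of the variance.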

3) Clustering
Grouping by rows

K-means clustering
Hierarchical clustering
DBSCAN
Graph-based clustering
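As an example of "grouping by rows", here is a minimal sketch of K-means (Lloyd's algorithm) from the list above. The `k_means` function, the starting centroids, and the six toy points are assumptions for illustration; real implementations usually pick starting centers more carefully and run several restarts.

```python
import math


def k_means(points, centroids, max_iter=100):
    """Lloyd's algorithm; `centroids` are caller-supplied starting centers."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each row goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned rows.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep the center of an empty cluster
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Hypothetical data: two visibly separate groups of rows.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

No target variable is involved here: the labels come only from how close the rows are to each other.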

Warmest regards,

This is the first post to explain the massive amount of information on data mining. I hope you enjoyed it, and please stay tuned for the next uploads!


Juhyun Lee
