What is Data Mining?

Data mining is a technology that blends traditional data analysis methods with sophisticated algorithms for processing large volumes of data. It is the process of extracting implicit, previously unknown, and potentially meaningful information from data. Put another way, it is the exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns.
Tasks and features of Data mining
1) Prediction Methods
- Mainly for prediction. Learn from some variables, then predict unknown or future values of other variables.
- X -> y
2) Description Methods
- Find human-interpretable patterns that describe the data
- Regression, correlation, and graphs
Supervised learning – prediction of a given target variable using the other features of the dataset.
1) Regression problems
Target: Continuous/numeric variables
Examples of Regression
- Multivariate regression
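As a rough illustration of a regression problem, the sketch below fits a linear model with two predictor variables by ordinary least squares. It assumes "multivariate regression" here means regression on several input variables; the toy data, the variable names, and the `predict_regression` helper are all made up for this example, and NumPy is assumed to be installed.

```python
import numpy as np

# Hypothetical toy data, built so that y = 1 + 2*x1 + 3*x2 exactly (no noise).
X = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0],
              [4.0, 3.0],
              [5.0, 5.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of ones so the first coefficient acts as the intercept.
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimise ||X_design @ coef - y||^2.
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)

def predict_regression(rows):
    """Predict the continuous target for new rows of (x1, x2)."""
    rows = np.asarray(rows, dtype=float)
    return np.column_stack([np.ones(len(rows)), rows]) @ coef

print(coef)                              # intercept, then one slope per feature
print(predict_regression([[6.0, 1.0]]))  # prediction for an unseen row
```

Because the toy target was generated without noise, the fitted coefficients should recover the intercept and slopes used to build it; with real data they would only be estimates.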
2) Classification problems
Target: Categorical variables
Examples of Classification
- Logistic Regression
- Decision tree
- Rule-based
- Ensemble
- Artificial Neural Networks (ANN)
- Support Vector Machine
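To make the classification idea concrete, here is a minimal sketch of the simplest possible decision tree: a single split (a "decision stump") on one numeric feature, chosen by Gini impurity. The function names and the hours/outcome data are hypothetical, and a real decision tree would split recursively on many features.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_stump(xs, ys):
    """Pick the threshold on one feature that minimises weighted Gini impurity.

    Assumes xs contains at least two distinct values.
    """
    best = None
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            # Each side predicts its majority class.
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            best = (score, t, left_label, right_label)
    _, t, left_label, right_label = best
    return t, left_label, right_label

def predict_stump(stump, x):
    """Classify one value with a fitted stump."""
    t, left_label, right_label = stump
    return left_label if x <= t else right_label

# Hypothetical toy data: hours studied -> categorical outcome.
hours = [1, 2, 3, 4, 6, 7, 8, 9]
outcome = ["fail", "fail", "fail", "fail", "pass", "pass", "pass", "pass"]

stump = fit_stump(hours, outcome)
print(stump)
print(predict_stump(stump, 2), predict_stump(stump, 10))
```

The target here is a category ("pass" or "fail") rather than a number, which is what separates classification from regression.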
Unsupervised learning – a set of statistical tools for better understanding and describing the data; the analysis is performed without a target variable.
1) Association analysis
Finding relationships between categorical variables
Examples of Association analysis
- ‘Apriori’ Algorithm
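The following is a small sketch of the level-wise idea behind Apriori: count frequent single items first, then build larger candidate itemsets only from smaller ones that were already frequent. The `apriori` function, the `min_count` parameter, and the shopping baskets are illustrative assumptions, not a reference implementation.

```python
from itertools import combinations

def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset found in >= min_count transactions."""
    transactions = [frozenset(t) for t in transactions]

    def count(itemset):
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items if count(frozenset([i])) >= min_count}

    frequent = {}
    k = 1
    while current:
        for s in current:
            frequent[s] = count(s)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, k - 1))}
        current = {c for c in candidates if count(c) >= min_count}
    return frequent

# Hypothetical market-basket data.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

frequent = apriori(baskets, min_count=3)

# Confidence of the rule {diapers} -> {beer}:
# count(diapers and beer) / count(diapers).
confidence = frequent[frozenset({"diapers", "beer"})] / frequent[frozenset({"diapers"})]
print(len(frequent), confidence)
```

The prune step is what keeps the search manageable: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded without ever being counted.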
2) Dimension reduction
Grouping by columns
- Principal Components Analysis
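One way to picture "grouping by columns" is the sketch below, which computes principal components with a singular value decomposition and projects three columns down to two. The `pca` helper and the toy matrix (whose third column is the sum of the first two) are assumptions for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its first n_components principal directions."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # centre each column
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]                # rows are principal directions
    explained_variance = S ** 2 / (len(X) - 1)
    ratio = explained_variance / explained_variance.sum()
    scores = Xc @ components.T                    # coordinates in the reduced space
    return scores, components, ratio[:n_components]

# Hypothetical toy data: the third column is the sum of the first two,
# so two components are enough to describe all three columns.
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 1.0, 3.0],
              [3.0, 4.0, 7.0],
              [4.0, 3.0, 7.0],
              [5.0, 6.0, 11.0]])

scores, components, ratio = pca(X, n_components=2)
print(scores.shape)   # rows are kept, columns are reduced
print(ratio)          # share of total variance captured by each component
```

The number of rows stays the same while the number of columns shrinks, which is the sense in which dimension reduction groups columns rather than rows.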
3) Clustering
Grouping by rows
- K-means clustering
- Hierarchical clustering
- DBSCAN
- Graph-based clustering
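As a sketch of how clustering groups rows, here is a bare-bones version of the K-means loop: assign each point to its nearest centroid, then move each centroid to the mean of its points. The `kmeans` function, the fixed starting centroids, and the six toy points are assumptions for this example; practical implementations usually pick the starting centroids more carefully.

```python
import numpy as np

def kmeans(X, init_centroids, n_iter=100):
    """Minimal K-means starting from the given centroids."""
    X = np.asarray(X, dtype=float)
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(n_iter):
        # Assignment step: distance from every point to every centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned points
        # (a centroid with no points stays where it is).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(new_centroids, centroids):
            break                                  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical toy data: two well-separated groups of rows.
points = np.array([[1.0, 1.0], [1.0, 2.0], [2.0, 1.0],
                   [8.0, 8.0], [8.0, 9.0], [9.0, 8.0]])

labels, centroids = kmeans(points, init_centroids=points[[0, 3]])
print(labels)
print(centroids)
```

No target variable is used anywhere: the cluster labels come only from how close the rows are to one another.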
This is the first post in which I try to explain the massive amount of material that data mining covers. I hope you enjoyed it, and please stay tuned for the next uploads!
Warmest regards,