Kirti's Algorithm Infographic
Important Numerical Algorithms
Key algorithms used in data analysis and machine learning for various tasks.
- Supervised: Decision Tree, KNN, Naive Bayes, SVM
Learning from labeled data to make predictions on new, unseen data.
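As a concrete illustration of supervised learning, here is a minimal k-nearest-neighbours classifier sketch in pure Python; the two-dimensional points and the labels "A"/"B" are toy data assumed for the example, not part of the original infographic.

```python
# Minimal KNN sketch (pure Python, hypothetical toy data).
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """train: list of ((x, y), label); returns the majority label of the k nearest points."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((5, 6), "B"), ((6, 5), "B")]
print(knn_predict(train, (0.5, 0.5)))  # query near the "A" cluster
```

A query near (0.5, 0.5) is surrounded by "A" points, so the majority vote returns "A".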
- Unsupervised: K-means, Hierarchical
Finding patterns in unlabeled data without predefined outputs.
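For the unsupervised side, a minimal K-means sketch in pure Python; the points and the fixed initial centroids are assumptions chosen so the toy run is deterministic.

```python
# Minimal K-means sketch (pure Python, fixed initial centroids for determinism).
import math

def kmeans(points, centroids, iters=10):
    for _ in range(iters):
        # assignment step: each point joins its nearest centroid's cluster
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[i].append(p)
        # update step: each centroid moves to the mean of its cluster
        centroids = [tuple(sum(c) / len(pts) for c in zip(*pts)) if pts else ctr
                     for pts, ctr in zip(clusters, centroids)]
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
cents, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

With these starting centroids the algorithm converges in one iteration: the three low points form one cluster, the three high points the other.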
- Neural Networks:
Algorithms inspired by the human brain's structure and function.
- Recurrent Neural Network → Long Short-Term Memory (text, time series)
- Feed Forward → Deep Feed Forward (Image Recognition)
- Markov model → Hopfield Network → Boltzmann Machine → Restricted BM → Deep Belief Network
- Deep Convolutional Network → Deconvolutional Network
- Generative Adversarial Network; Liquid State Machine → Extreme Learning Machine → Echo State Network (reservoir computing)
- Filtering: Content-based → Collaborative → Association Rule Mining
Techniques for recommender systems and pattern discovery in data.
- Reinforcement Learning
Learning through interaction with an environment to maximize cumulative reward.
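Reinforcement learning can be sketched with tabular Q-learning on a hypothetical one-dimensional corridor: states 0..4, actions left/right, reward 1 for reaching state 4. The environment and hyperparameters here are assumptions for illustration.

```python
# Tabular Q-learning sketch on a toy corridor MDP (hypothetical environment).
def step(s, a):
    s2 = max(0, min(4, s + a))
    return s2, (1.0 if s2 == 4 else 0.0), s2 == 4   # next state, reward, done

Q = {(s, a): 0.0 for s in range(5) for a in (-1, 1)}
alpha, gamma = 0.5, 0.9
for _ in range(200):                 # repeated exploratory sweeps
    for s in range(4):               # every non-terminal state
        for a in (-1, 1):            # every action (pure exploration)
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(Q[(s2, -1)], Q[(s2, 1)])
            Q[(s, a)] += alpha * (target - Q[(s, a)])   # temporal-difference update

policy = {s: max((-1, 1), key=lambda a: Q[(s, a)]) for s in range(4)}
```

The learned greedy policy moves right from every state, since moving right always gets closer to the reward.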
- Hidden Markov Model
Statistical model for systems with unobserved (hidden) states.
Some Important Problems in Stats
Fundamental concepts and challenges in probability and statistics.
- Markov Inequality → Chebyshev Inequality → Weak Law of Large Numbers (WLLN) → Central Limit Theorem
Progression of key probability theorems leading to the foundation of statistical inference.
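The Central Limit Theorem in this chain can be checked by simulation: means of uniform samples concentrate around the population mean 0.5 with spread ≈ σ/√n. The sample sizes and seed below are arbitrary choices for the demonstration.

```python
# CLT simulation sketch: distribution of sample means of Uniform(0, 1) draws.
import random
import statistics

random.seed(42)
n, reps = 100, 2000
sample_means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(reps)]

print(statistics.fmean(sample_means))   # close to the population mean 0.5
print(statistics.stdev(sample_means))   # close to (1/sqrt(12))/sqrt(100) ≈ 0.0289
```

The standard deviation of the sample means shrinks like 1/√n, which is exactly what makes statistical inference from averages possible.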
- Estimator Properties (unbiasedness, consistency, efficiency)
- ANOVA (Analysis of Variance)
Statistical method to analyze differences among group means in a sample.
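The ANOVA F-statistic is the ratio of between-group to within-group mean squares; a minimal pure-Python sketch, with the three groups below assumed as toy data:

```python
# One-way ANOVA F-statistic sketch (pure Python, hypothetical group data).
import statistics

def anova_f(groups):
    k = len(groups)                                   # number of groups
    n = sum(len(g) for g in groups)                   # total observations
    grand = statistics.fmean(x for g in groups for x in g)
    ss_between = sum(len(g) * (statistics.fmean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

groups = [[1, 2, 3], [2, 3, 4], [8, 9, 10]]
f = anova_f(groups)    # large F: group means differ far more than within-group noise
```

Here the third group's mean is far from the others while within-group spread is small, so F is large (43 for this data) and the null of equal means would be rejected.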
- PCA Proof (Principal Component Analysis)
Technique for dimensionality reduction and feature extraction.
- Rank-Order Statistics
- Poisson Process
Stochastic process that models the occurrence of random events over time.
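A Poisson process can be simulated from its defining property that inter-arrival times are exponential with rate λ; the count of events in [0, T] then averages λT. Rate, horizon, and seed below are illustrative assumptions.

```python
# Poisson process sketch: exponential gaps, count events in a fixed window.
import random

random.seed(0)
lam, T, reps = 2.0, 10.0, 2000
counts = []
for _ in range(reps):
    t, n = 0.0, 0
    while True:
        t += random.expovariate(lam)   # exponential inter-arrival time
        if t > T:
            break
        n += 1
    counts.append(n)

avg = sum(counts) / reps               # should be close to lam * T = 20
```

Averaged over many runs, the event count matches the theoretical mean λT, as expected for a Poisson(λT) count.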
- Regression via Generalized Inverse; Logistic Regression
Advanced regression technique for categorical outcome variables.
- Sample Size Calculations
Dualities in Data Analysis
Important trade-offs and contrasts in machine learning and statistics.
- Underfitting vs. Overfitting (Bias-Variance Trade-off); Confidence vs. Power (Type I vs. Type II error: finding an imaginary pattern vs. finding no pattern)
Balance between model complexity and generalization ability.
- Generative vs. Discriminative
Approaches to modeling probability distributions in machine learning.
- Parametric vs. Non-parametric
Distinction between models with fixed vs. flexible number of parameters.
- Tests of Difference vs Tests of Associations
Statistical methods for comparing groups vs. examining relationships between variables.
- Normalization vs. Standardization
Techniques for scaling features to a common range or distribution.
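The contrast is easy to show in code: min-max normalization rescales to [0, 1], while z-score standardization centers to mean 0 and unit standard deviation. The four-point dataset is a toy assumption.

```python
# Normalization vs. standardization sketch (pure Python, toy data).
import statistics

def normalize(xs):
    """Min-max scaling to the range [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def standardize(xs):
    """Z-score scaling to mean 0 and (population) standard deviation 1."""
    mu, sd = statistics.fmean(xs), statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]

data = [2, 4, 6, 8]
norm = normalize(data)      # endpoints map to 0 and 1
std = standardize(data)     # mean 0, population std 1
```

Normalization bounds the range (useful for distance-based methods), while standardization preserves the shape of the distribution around zero (useful when algorithms assume centered inputs).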
- Imputation vs. Weight of Evidence
Methods for handling missing data in statistical analysis.
- Long Table vs. Wide Table
Data formatting considerations for analysis and modeling.
Non-Parametric Tests
Statistical tests that don't assume a specific distribution of the data.
- Durbin-Watson: Autocorrelation
Test for serial correlation in regression analysis.
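The Durbin-Watson statistic itself is simple to compute: DW = Σ(eₜ − eₜ₋₁)² / Σeₜ². Values near 2 suggest no autocorrelation, near 0 positive autocorrelation, near 4 negative autocorrelation. The residual sequences below are illustrative assumptions.

```python
# Durbin-Watson statistic sketch (pure Python, toy residuals).
def durbin_watson(resid):
    num = sum((b - a) ** 2 for a, b in zip(resid, resid[1:]))
    den = sum(e ** 2 for e in resid)
    return num / den

trending = [1, 1, 1, -1, -1, -1]       # runs of same sign: positive autocorrelation, DW near 0
alternating = [1, -1, 1, -1, 1, -1]    # sign flips every step: negative autocorrelation, DW near 4
```

For `trending` the statistic is about 0.67 (well below 2) and for `alternating` about 3.33 (well above 2), matching the interpretation rule.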
- Kolmogorov-Smirnov: Fitting
Test for the equality of continuous, one-dimensional probability distributions.
- Run Test: Randomness
Test to check if a sequence of observations is random.
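The core of the run test is counting runs (maximal blocks of identical symbols) and comparing with the expected count 2·n₁·n₂/n + 1 under randomness; the coin-flip sequence is a toy assumption.

```python
# Runs-test sketch: observed vs. expected number of runs in a binary sequence.
def count_runs(seq):
    # a new run starts at every position where the symbol changes
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)

seq = list("HHTTHTHHTT")
runs = count_runs(seq)                     # HH | TT | H | T | HH | TT -> 6 runs
n1, n2 = seq.count("H"), seq.count("T")
expected = 2 * n1 * n2 / len(seq) + 1      # 2*5*5/10 + 1 = 6.0
```

Here the observed count equals the expectation, consistent with randomness; a full test would also use the variance of the run count to form a z-statistic.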
- Cohen's d: Effect Size
Measure of the strength of a phenomenon or relationship between variables.
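Cohen's d is the mean difference divided by the pooled standard deviation; a minimal sketch with two hypothetical samples:

```python
# Cohen's d sketch: standardized mean difference with pooled standard deviation.
import math
import statistics

def cohens_d(a, b):
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)   # sample variances
    pooled = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (statistics.fmean(a) - statistics.fmean(b)) / pooled

a = [5, 6, 7, 8]
b = [1, 2, 3, 4]
d = cohens_d(a, b)    # mean difference of 4 in units of pooled std: a very large effect
```

By the conventional rule of thumb, |d| ≈ 0.2 is a small effect, 0.5 medium, and 0.8 large; the toy value here (about 3.1) is far beyond "large".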
- Bartlett's Test: Same Variance
Test for the equality of variances across groups.
Other Statistical Tests
- Z/t-test and non-parametric counterparts: Sign Test, Signed-Rank Test (paired: Wilcoxon; unpaired: Mann-Whitney U)
Tests for comparing means or medians between groups.
- ANOVA: Kruskal-Wallis (reject H₀ when p < α)
Non-parametric method for testing whether samples originate from the same distribution.
- Correlation: Pearson, Spearman
Measures of the strength and direction of association between variables.
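The practical difference between the two is that Spearman is just Pearson applied to ranks, so it captures monotone but non-linear association that Pearson understates. A pure-Python sketch on toy data (the simple ranking helper assumes no ties):

```python
# Pearson vs. Spearman sketch (pure Python, toy data; rank helper assumes no ties).
import statistics

def pearson(xs, ys):
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return num / den

def ranks(xs):
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman(xs, ys):
    return pearson(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]     # perfectly monotone but non-linear relationship
```

For this cubic relationship Spearman is exactly 1 while Pearson is noticeably below 1, illustrating why the choice between them matters.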
- ARIMA (Autoregressive Integrated Moving Average)
Time series analysis and forecasting model.
- Chi-square (reject H₀ when the calculated statistic exceeds the critical value)
Test for independence between categorical variables.
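The decision rule above can be made concrete with a 2x2 contingency table: χ² = Σ(observed − expected)²/expected, with expected counts from the row and column margins. The counts below are hypothetical.

```python
# Chi-square test-of-independence sketch for a contingency table (toy counts).
def chi_square(table):
    rows = [sum(r) for r in table]           # row totals
    cols = [sum(c) for c in zip(*table)]     # column totals
    total = sum(rows)
    chi2 = 0.0
    for i, r in enumerate(table):
        for j, obs in enumerate(r):
            exp = rows[i] * cols[j] / total  # expected count under independence
            chi2 += (obs - exp) ** 2 / exp
    return chi2

table = [[20, 30], [30, 20]]
stat = chi_square(table)     # compare against the critical value for df = 1
```

For this table the statistic is 4.0; with df = 1 the 5% critical value is about 3.84, so the calculated statistic exceeds the critical value and independence would be rejected.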
Some Important Concepts to Remember
Key ideas and considerations in statistical analysis and machine learning.
- Correlation ≠ Causation
Principle that correlation between variables doesn't imply a causal relationship.
- Confounding in Experiments
Presence of variables that affect both the dependent and independent variables.
- Regression Error
Difference between predicted and actual values in regression analysis.
- Endogeneity
Correlation between a variable and the error term in a regression model.
- Paradox of Unanimity
Phenomenon where unanimous agreement can signal systematic error rather than stronger evidence.
- Actual Effect, Confounding Factor, Type 1 Error, Sampling Bias
Various factors affecting the validity and interpretation of statistical results.