Algorithm Infographic

Kirti's Algorithm Infographic

Important Numerical Algorithms

Key algorithms used in data analysis and machine learning for various tasks.

Supervised: Decision Tree, KNN, Naive Bayes, SVM
Learning from labeled data to make predictions on new, unseen data.
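
To make the idea of predicting from labeled data concrete, here is a minimal sketch of one of the listed algorithms, k-nearest neighbours (KNN). The function name `knn_predict`, the toy data, and the choice of Euclidean distance with `k=3` are assumptions made for illustration, not something the infographic specifies.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy labeled data (made up for illustration): two well-separated clusters.
train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b")]

print(knn_predict(train, (1.1, 1.0)))  # the three nearest neighbours are all "a"
print(knn_predict(train, (5.1, 5.0)))  # the three nearest neighbours are all "b"
```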
Unsupervised: K-means, Hierarchical
Finding patterns in unlabeled data without predefined outputs.
Neural Networks:
Algorithms inspired by the human brain's structure and function.
- Recurrent Neural Network → Long Short-Term Memory (Words, time series)
- Feed Forward → Deep Feed Forward (Image Recognition)
- Markov model → Hopfield Network → Boltzmann Machine → Restricted BM → Deep Belief Network
- Deep Convolutional Network → Deconvolutional Network
- Generative Adversarial Network → Liquid State Machine → Extreme Learning Machine → Echo State Network (LLM)
Filtering: Content-based → Collaborative → Association Rule Mining
Techniques for recommender systems and pattern discovery in data.
Reinforcement Learning
Learning through interaction with an environment to maximize cumulative reward.
Hidden Markov Model
Statistical model for systems with unobserved (hidden) states.
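
As a rough illustration of working with hidden states, the sketch below computes the probability of an observation sequence with the forward algorithm, which sums over every possible hidden-state path. The weather states, activities, and all probabilities are made-up toy values, and `forward_likelihood` is a hypothetical helper name.

```python
def forward_likelihood(obs, states, start_p, trans_p, emit_p):
    """Return P(obs) under an HMM by summing over all hidden-state paths."""
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {
            s: emit_p[s][o] * sum(alpha[prev] * trans_p[prev][s] for prev in states)
            for s in states
        }
    return sum(alpha.values())

# Toy model (assumed values): the weather is hidden, the activity is observed.
states = ("rainy", "sunny")
start_p = {"rainy": 0.6, "sunny": 0.4}
trans_p = {"rainy": {"rainy": 0.7, "sunny": 0.3},
           "sunny": {"rainy": 0.4, "sunny": 0.6}}
emit_p = {"rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

likelihood = forward_likelihood(("walk", "shop"), states, start_p, trans_p, emit_p)
print(likelihood)  # probability of observing "walk" then "shop"
```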

Fundamental concepts and challenges in probability and statistics.

Markov Inequality → Chebyshev Inequality → Weak Law of Large Numbers (WLLN) → Central Limit Theorem
Progression of key probability theorems leading to the foundation of statistical inference.
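
One common way to present this progression is sketched below. It assumes i.i.d. variables $X_1, \dots, X_n$ with mean $\mu$ and finite variance $\sigma^2$; the finite-variance assumption is what makes the Chebyshev-based argument for the WLLN work, and it is an assumption added here rather than something the infographic states.

```latex
% Markov: for a non-negative random variable X and any a > 0
P(X \ge a) \le \frac{E[X]}{a}

% Chebyshev: apply Markov to (X - \mu)^2 with threshold k^2
P(|X - \mu| \ge k) = P\big((X - \mu)^2 \ge k^2\big) \le \frac{\sigma^2}{k^2}

% WLLN: apply Chebyshev to the sample mean, whose variance is \sigma^2 / n
P\big(|\bar{X}_n - \mu| \ge \varepsilon\big) \le \frac{\sigma^2}{n \varepsilon^2}
  \xrightarrow[n \to \infty]{} 0

% CLT: the rescaled fluctuation of the sample mean converges in distribution
\frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} \mathcal{N}(0, 1)
```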
Estimate Properties
ANOVA (Analysis of Variance)
Statistical method to analyze differences among group means in a sample.
PCA Proof (Principal Component Analysis)
Technique for dimensionality reduction and feature extraction.
Rank Order Statistic
Poisson Process
Stochastic process that models the occurrence of random events over time.
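
A minimal simulation sketch, assuming a homogeneous Poisson process with a constant rate: the gaps between consecutive events are then independent exponential random variables, so event times can be generated by accumulating exponential draws. The function name and the rate and horizon values are illustrative assumptions.

```python
import random

def simulate_poisson_process(rate, horizon, rng):
    """Return event times on [0, horizon) for a homogeneous Poisson process."""
    times = []
    t = rng.expovariate(rate)      # waiting time until the first event
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)  # independent exponential gap to the next event
    return times

rng = random.Random(0)  # fixed seed so the run is repeatable
events = simulate_poisson_process(rate=5.0, horizon=1000.0, rng=rng)
print(len(events) / 1000.0)  # empirical rate, which should be close to 5
```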
Regression Generalized Inverse (Logistic)
Advanced regression technique for categorical outcome variables.
Sample Size Calculations
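
As one example of a sample size calculation, the sketch below uses the normal-approximation formula for comparing two group means, n = 2((z_{1-α/2} + z_{power}) σ / δ)² per group. The two-sided test, equal group sizes, and known common standard deviation are all assumptions chosen for illustration; the infographic does not say which formula it has in mind.

```python
import math
from statistics import NormalDist

def sample_size_two_means(delta, sigma, alpha=0.05, power=0.80):
    """Per-group n to detect a mean difference `delta` (normal approximation).

    Assumes a two-sided test, equal group sizes, and a known common SD `sigma`.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the target power
    n = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return math.ceil(n)

print(sample_size_two_means(delta=0.5, sigma=1.0))  # medium standardized effect
print(sample_size_two_means(delta=1.0, sigma=1.0))  # larger effect needs fewer samples
```

Halving the detectable difference roughly quadruples the required sample size, because `delta` enters the formula squared.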

Important trade-offs and contrasts in machine learning and statistics.

Underfitting vs. Overfitting (Bias-Variance Trade-off); compare Confidence vs. Power, Type 1 vs. Type 2 Error, Finding an Imaginary Pattern vs. Finding No Pattern
Balance between model complexity and generalization ability.
Generative vs. Discriminative
Approaches to modeling probability distributions in machine learning.
Parametric vs. Non-parametric
Distinction between models with fixed vs. flexible number of parameters.
Tests of Difference vs Tests of Associations
Statistical methods for comparing groups vs. examining relationships between variables.
Normalization vs. Standardization
Techniques for scaling features to a common range or distribution.
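
A minimal sketch of the contrast, assuming "normalization" means min-max scaling to the range [0, 1] and "standardization" means z-scoring to mean 0 and standard deviation 1 (the usual reading of the two terms, though usage varies). Both helper names are hypothetical.

```python
from statistics import mean, pstdev

def min_max_normalize(values):
    """Rescale values linearly to [0, 1]. Assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift and scale to mean 0 and population SD 1. Assumes a non-zero SD."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

data = [2.0, 4.0, 6.0, 8.0]
normalized = min_max_normalize(data)
standardized = standardize(data)
print(normalized)
print(standardized)
```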
Imputing vs. Weight of Evidence
Methods for handling missing data in statistical analysis.
Long Table vs. Wide Table
Data formatting considerations for analysis and modeling.
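
The two layouts hold the same information. A long table has one row per (id, variable) pair, while a wide table has one row per id and one column per variable. The sketch below converts between them using plain dictionaries; the column names `id`, `variable`, and `value` and the sample measurements are assumptions for illustration.

```python
# Long format: one row per (id, variable) observation.
long_rows = [
    {"id": "s1", "variable": "height", "value": 170},
    {"id": "s1", "variable": "weight", "value": 65},
    {"id": "s2", "variable": "height", "value": 180},
    {"id": "s2", "variable": "weight", "value": 80},
]

def long_to_wide(rows):
    """Collect each id's variables into a single row with one column per variable."""
    wide = {}
    for row in rows:
        wide.setdefault(row["id"], {"id": row["id"]})[row["variable"]] = row["value"]
    return list(wide.values())

def wide_to_long(rows):
    """Emit one (id, variable, value) row for every non-id column."""
    return [{"id": row["id"], "variable": key, "value": val}
            for row in rows for key, val in row.items() if key != "id"]

wide_rows = long_to_wide(long_rows)
print(wide_rows)
```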

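Statistical tests and measures for checking model assumptions, followed by parametric tests paired with non-parametric alternatives that don't assume a specific distribution of the data.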

Durbin-Watson: Autocorrelation
Test for serial correlation in regression analysis.
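
The Durbin-Watson statistic is the sum of squared differences between consecutive residuals divided by the sum of squared residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation. The sketch below computes only the statistic; the residual sequences are made-up examples.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

alternating = [1.0, -1.0, 1.0, -1.0]            # sign flips every step
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]  # long runs of the same sign

print(durbin_watson(alternating))  # above 2: negative autocorrelation
print(durbin_watson(persistent))   # below 2: positive autocorrelation
```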
Kolmogorov-Smirnov: Fitting
Test for the equality of continuous, one-dimensional probability distributions.
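
A sketch of the two-sample version of the statistic: the largest vertical gap between the two empirical cumulative distribution functions. It returns only the statistic, not a p-value, and the helper name `ks_two_sample_statistic` is an assumption for illustration.

```python
from bisect import bisect_right

def ks_two_sample_statistic(xs, ys):
    """Maximum absolute difference between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)

    def ecdf(sample, t):
        # Fraction of the (sorted) sample that is <= t.
        return bisect_right(sample, t) / len(sample)

    # The gap can only change at observed data points, so checking those suffices.
    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in xs + ys)

print(ks_two_sample_statistic([1, 2, 3, 4], [1, 2, 3, 4]))  # identical samples
print(ks_two_sample_statistic([1, 2, 3, 4], [3, 4, 5, 6]))  # partially overlapping
print(ks_two_sample_statistic([1, 2, 3], [4, 5, 6]))        # completely separated
```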
Run Test: Randomness
Test to check if a sequence of observations is random.
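
One common form is the Wald-Wolfowitz runs test on a binary sequence, sketched below under the usual large-sample normal approximation. A run is a maximal block of identical symbols; far too many or far too few runs, relative to what randomness would produce, yields a large |z|. The function name and the example sequences are assumptions for illustration.

```python
import math

def runs_test_z(sequence):
    """z-score of the observed number of runs in a binary sequence.

    Assumes both symbols occur at least once.
    """
    flags = [bool(x) for x in sequence]
    n1 = sum(flags)
    n2 = len(flags) - n1
    n = n1 + n2
    runs = 1 + sum(1 for a, b in zip(flags, flags[1:]) if a != b)
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return (runs - expected) / math.sqrt(variance)

alternating = [0, 1] * 10          # 20 runs: switches far too often
clustered = [0] * 10 + [1] * 10    # 2 runs: switches far too rarely

print(runs_test_z(alternating))  # large positive z
print(runs_test_z(clustered))    # large negative z
```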
Cohen's d: Effect Size
Standardized difference between two group means, used to measure the strength of an effect.
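
A minimal sketch of Cohen's d for two independent groups, using the pooled sample standard deviation as the denominator. The pooled-SD variant is one common convention and is assumed here; the sample values are made up.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

d = cohens_d([2, 4, 6], [1, 3, 5])
print(d)  # means differ by 1 and the pooled SD is 2
```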
Bartlett's Test: Same Variance
Test for the equality of variances across groups.

Z/t-test: Sign Test, Signed-Rank Test (Paired: Wilcoxon), Rank-Sum Test (Unpaired: Mann-Whitney)
Tests for comparing means or medians between groups.
ANOVA: Kruskal-Wallis (reject the null hypothesis when p < α)
Non-parametric method for testing whether samples originate from the same distribution.
Correlation: Pearson, Spearman
Measures of the strength and direction of association between variables.
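
The two coefficients answer slightly different questions: Pearson measures linear association, while Spearman is Pearson applied to ranks and so measures monotone association. The sketch below shows the difference on a monotone but nonlinear relationship. The helper names and the cubic example are assumptions for illustration; tied values receive averaged ranks.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotone but nonlinear relationship

print(pearson(xs, ys))   # below 1 because the relationship is not linear
print(spearman(xs, ys))  # perfect monotone association
```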
ARIMA (Autoregressive Integrated Moving Average)
Time series analysis and forecasting model.
Chi-square (calculated statistic greater than the critical value: reject the null hypothesis)
Test for independence between categorical variables.
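
The decision rule can be sketched as follows: compute the statistic from observed and expected counts, then compare it with the critical value for the chosen significance level. The 2×2 table is made-up data, and 3.841 is the standard table value for 1 degree of freedom at α = 0.05.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

CRITICAL_DF1_ALPHA_05 = 3.841  # chi-square critical value, df = 1, alpha = 0.05

table = [[30, 10],
         [10, 30]]
stat = chi_square_statistic(table)
reject_independence = stat > CRITICAL_DF1_ALPHA_05
print(stat, reject_independence)
```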

Key ideas and considerations in statistical analysis and machine learning.

Correlation ≠ Causation
Principle that correlation between variables doesn't imply a causal relationship.
Confounding Experiment
Presence of variables that affect both the dependent and independent variables.
Regression Error
Difference between predicted and actual values in regression analysis.
Endogeneity
Correlation between an explanatory variable and the error term in a regression model.
Paradox of Unanimity
Phenomenon where unanimous agreement can indicate less reliability than disagreement.
Actual Effect, Confounding Factor, Type 1 Error, Sampling Bias
Various factors affecting the validity and interpretation of statistical results.