Algorithm Infographic

Kirti's Algorithm Infographic

Important Numerical Algorithms

Key algorithms used in data analysis and machine learning for various tasks.

  • Supervised: Decision Tree, KNN, Naive Bayes, SVM

    Learning from labeled data to make predictions on new, unseen data.

  • Unsupervised: K-means, Hierarchical

    Finding patterns in unlabeled data without predefined outputs.

  • Neural Networks:

    Algorithms inspired by the human brain's structure and function.

    • Recurrent Neural Network → Long Short-Term Memory (Words, time series)
    • Feed Forward → Deep Feed Forward (Image Recognition)
    • Markov model → Hopfield Network → Boltzmann Machine → Restricted BM → Deep Belief Network
    • Deep Convolutional Network → Deconvolutional Network
    • Generative Adversarial Network → Liquid State Network → Extreme Learning Machine → Echo State Network (LLM)
  • Filtering: Content-based → Collaborative → Association Rule Mining

    Techniques for recommender systems and pattern discovery in data.

  • Reinforcement Learning

    Learning through interaction with an environment to maximize cumulative reward.

  • Hidden Markov Model

    Statistical model for systems with unobserved (hidden) states.
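
As a rough illustration of the supervised idea above (learn from labeled points, then predict a label for an unseen point), here is a minimal K-nearest-neighbors sketch. The function name `knn_predict` and the toy data are assumptions made for this example, not part of the original infographic.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled points."""
    # Sort the labeled points by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (features, label) pairs forming two clusters.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((4.0, 4.0), "b"), ((4.2, 3.9), "b"), ((3.8, 4.1), "b"),
]
prediction = knn_predict(train, (1.1, 1.0))
```

KNN stores the training data and defers all work to prediction time, which is why it is often described as a "lazy" learner.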

Some Important Problems in Stats

Fundamental concepts and challenges in probability and statistics.

  1. Markov Inequality → Chebyshev Inequality → Weak Law of Large Numbers (WLLN) → Central Limit Theorem

    Progression of key probability theorems leading to the foundation of statistical inference.

  2. Estimate Properties
  3. ANOVA (Analysis of Variance)

    Statistical method to analyze differences among group means in a sample.

  4. PCA Proof (Principal Component Analysis)

    Technique for dimensionality reduction and feature extraction.

  5. Rank Order Statistic
  6. Poisson Process

    Stochastic process that models the occurrence of random events over time.

  7. Regression Generalized Inverse (Logistic)

    Advanced regression technique for categorical outcome variables.

  8. Sample Size Calculations
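
The progression in item 1 can be written out as follows. This is a sketch under the usual textbook assumptions (independent, identically distributed variables with mean \mu and finite variance \sigma^2); the symbols are chosen for this example and do not come from the infographic.

```latex
\begin{align*}
&\text{Markov (for } X \ge 0,\ a > 0\text{):}
  && P(X \ge a) \le \frac{E[X]}{a} \\
&\text{Chebyshev (Markov applied to } (X-\mu)^2 \text{ with } a = k^2\text{):}
  && P(|X - \mu| \ge k) \le \frac{\sigma^2}{k^2} \\
&\text{WLLN (Chebyshev applied to } \bar{X}_n,\ \operatorname{Var}(\bar{X}_n) = \sigma^2/n\text{):}
  && P(|\bar{X}_n - \mu| \ge \varepsilon) \le \frac{\sigma^2}{n\varepsilon^2} \xrightarrow[n \to \infty]{} 0 \\
&\text{CLT:}
  && \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \xrightarrow{d} \mathcal{N}(0, 1)
\end{align*}
```

The first three steps each follow from the one before. The Central Limit Theorem does not follow from the WLLN in the same way; it refines it by describing the shape of the fluctuations of the sample mean around \mu.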

Dualities in Data Analysis

Important trade-offs and contrasts in machine learning and statistics.

  • Underfitting vs. Overfitting (Bias-Variance Trade-off); related pairs: Confidence vs. Power, Type 1 vs. Type 2 Error, Finding an Imaginary Pattern vs. Finding No Pattern

    Balance between model complexity and generalization ability.

  • Generative vs. Discriminative

    Approaches to modeling probability distributions in machine learning.

  • Parametric vs. Non-parametric

    Distinction between models with a fixed number of parameters and models whose complexity can grow with the amount of data.

  • Tests of Difference vs Tests of Associations

    Statistical methods for comparing groups vs. examining relationships between variables.

  • Normalization vs. Standardization

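    Techniques for scaling features: normalization typically rescales values to a fixed range such as [0, 1], while standardization rescales them to zero mean and unit variance.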

  • Imputing vs. Weight of Evidence

    Methods for handling missing data in statistical analysis.

  • Long Table vs. Wide Table

    Data formatting considerations for analysis and modeling.
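
To make the Normalization vs. Standardization contrast above concrete, here is a small sketch using only the Python standard library. It assumes "normalization" means min-max scaling to [0, 1] and "standardization" means z-scoring; the terms are used inconsistently in practice. The function names and the sample data are invented for this example.

```python
import statistics

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    # Assumes hi > lo; a constant feature would cause a division by zero.
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    # Assumes sd > 0, i.e. the values are not all identical.
    return [(v - mean) / sd for v in values]

# Hypothetical feature column.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
normalized = min_max_normalize(heights_cm)
standardized = standardize(heights_cm)
```

Min-max scaling is sensitive to outliers because a single extreme value sets the range. Z-scoring is usually the safer default when the feature is roughly bell-shaped.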

Non-Parametric Tests

Statistical tests that don't assume a specific distribution of the data.

  • Durbin-Watson: Autocorrelation

    Test for serial correlation in regression analysis.

  • Kolmogorov-Smirnov: Fitting

    Test for the equality of continuous, one-dimensional probability distributions.

  • Runs Test: Randomness

    Test to check if a sequence of observations is random.

  • Cohen's d: Effect Size

    Standardized measure of the difference between two group means; an effect size rather than a hypothesis test.

  • Bartlett's Test: Same Variance

    Test for the equality of variances across groups; note that it assumes normally distributed data.
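
As an illustration of the Cohen's d entry above, here is a minimal sketch of the common pooled-standard-deviation form of the statistic. The function name `cohens_d` and the two sample groups are assumptions made for this example.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    # Pool the two variances, weighting each by its degrees of freedom.
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd

# Hypothetical measurements for two groups.
treatment = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [3.0, 4.0, 5.0, 6.0, 7.0]
d = cohens_d(treatment, control)
```

A common rule of thumb reads d near 0.2 as a small effect, 0.5 as medium, and 0.8 as large, though what counts as meaningful depends on the field.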

Other Statistical Tests

  • Z/t-test: Sign Test, Wilcoxon Signed-Rank Test (paired), Mann-Whitney U Test (unpaired)

    Tests for comparing means (z/t-test) or medians (the non-parametric counterparts) between groups.

  • ANOVA: Kruskal-Wallis (reject the null hypothesis when p < α)

    Non-parametric alternative to one-way ANOVA for testing whether samples originate from the same distribution.

  • Correlation: Pearson, Spearman

    Measures of the strength and direction of association between variables.

  • ARIMA (Autoregressive Integrated Moving Average)

    Time series analysis and forecasting model.

  • Chi-square (reject the null hypothesis when the calculated statistic exceeds the critical value)

    Test for independence between categorical variables.
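
The Chi-square decision rule above (reject when the calculated statistic exceeds the critical value) can be sketched for a 2×2 contingency table as follows. The function name and the observed counts are invented for this example. The critical value 3.841 is the standard table value for 1 degree of freedom at α = 0.05, and no continuity correction is applied.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical 2x2 table of observed counts.
observed = [[30, 10], [20, 40]]

# Critical value for df = (2 - 1) * (2 - 1) = 1 at alpha = 0.05.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841

stat = chi_square_statistic(observed)
reject_null = stat > CRITICAL_VALUE_DF1_ALPHA_05
```

For larger tables the degrees of freedom are (rows − 1) × (columns − 1), so the critical value has to be looked up again.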

Some Important Concepts to Remember

Key ideas and considerations in statistical analysis and machine learning.

  • Correlation ≠ Causation

    Principle that correlation between variables doesn't imply a causal relationship.

  • Confounding Experiment

    Presence of variables that affect both the dependent and independent variables.

  • Regression Error

    Difference between predicted and actual values in regression analysis.

  • Endogeneity

    Correlation between an explanatory variable and the error term in a regression model.

  • Paradox of Unanimity

    Phenomenon where unanimous agreement can indicate less reliability than disagreement.

  • Actual Effect, Confounding Factor, Type 1 Error, Sampling Bias

    Various factors affecting the validity and interpretation of statistical results.
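
The "Correlation ≠ Causation" and confounding entries above can be illustrated with a small simulation. In this sketch a hidden variable `z` drives both `x` and `y`, which have no direct effect on each other. All names, the seed, and the noise levels are assumptions chosen for the example.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable

# z is the confounder; x and y each depend on z plus independent noise.
z = [rng.gauss(0, 1) for _ in range(2000)]
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

# x and y are strongly correlated even though neither causes the other.
raw_r = pearson(x, y)

# Removing z's contribution leaves only the independent noise terms.
# (Subtracting z directly works here because its coefficient is known to be 1;
# with real data one would estimate it, for example by regression.)
adjusted_r = pearson(
    [xi - zi for xi, zi in zip(x, z)],
    [yi - zi for yi, zi in zip(y, z)],
)
```

In this setup `raw_r` should come out well above zero while `adjusted_r` should sit near zero, which is the signature of a confounded relationship.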
