06 - AI, Machine Learning & LLM

Foundations of AI & Machine Learning

Core AI Paradigms

  • AI kinds
    • Symbolic AI - The collection of all methods in artificial intelligence research that are based on high-level symbolic (human-readable) representations of problems, logic and search
    • Generative AI - A subset of artificial intelligence that uses generative models to produce text, images, videos, or other forms of data
    • Causal AI - A technique in artificial intelligence that builds a causal model and can thereby make inferences using causality rather than just correlation
  • AI Programming Languages
    • Mojo - The programming language for all AI developers

Classical Machine Learning

  • Paradigms
    • Supervised learning - A paradigm in machine learning where algorithms learn from labeled data
      • Decision tree learning - A method that uses a decision tree as a predictive model to go from observations about an item to conclusions about the item's target value
      • Ensemble learning - A method that combines multiple learning algorithms to obtain better predictive performance than any of the constituent algorithms could achieve alone
        • Random forest - An ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time
      • Support vector machine - A family of supervised learning models, with associated learning algorithms, that analyze data for classification and regression
      • Classification - The problem of identifying which of a set of categories (sub-populations) a new observation belongs to, on the basis of a training set of data containing observations
        • Logistic regression - A statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent variables
        • ROC curve - A graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied
        • Naive Bayes classifier - A family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features
      • Regression - A set of statistical processes for estimating the relationships between a dependent variable and one or more independent variables
        • Ordinary least squares - A type of linear least squares method for choosing the unknown parameters in a linear regression model
        • Generalized linear model - A flexible generalization of ordinary linear regression that allows the response variable to have a non-normal error distribution
        • ARIMA model - A generalization of an autoregressive moving average (ARMA) model, fitted to time series data either to better understand the data or to predict future points in the series
    • Unsupervised learning - A type of machine learning in which models learn patterns from unlabeled data, without labeled examples to guide them
      • K-means clustering - A method of vector quantization that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean
    • Reinforcement learning - An area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize cumulative reward
      • Markov decision process - A mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker
      • Multi-armed bandit - A problem in which a fixed, limited set of resources must be allocated between competing (alternative) choices in a way that maximizes the expected gain
      • Value function - A function used in mathematical optimization and reinforcement learning that assigns a measure of desirability to states or actions
  • Concepts & Techniques
    • Hyperparameter - A parameter whose value is used to control the learning process
    • Hyperparameter optimization - The problem of choosing a set of optimal hyperparameters for a learning algorithm
    • Embedding - A representation learning technique that maps complex, high-dimensional data to numerical vectors in a lower-dimensional space
    • Early stopping - A form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent
    • Cross-validation - Any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set
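    • Transfer learning - A technique in which knowledge learned on one task is reused to improve performance on a related task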
  • Applications & Problem Domains
    • Anomaly detection - The identification of rare items, events or observations which raise suspicions by differing significantly from the majority of the data
      • One-class classification - A technique that tries to identify objects of a specific class amongst all objects, primarily by learning from a training set containing only objects of that class
    • Recommender system - An information filtering system that seeks to predict the 'rating' or 'preference' a user would give to an item
  • Related Fields
    • Mathematical model - An abstract description of a concrete system using mathematical concepts and language
    • Mathematical optimization - The selection of a best element, with regard to some criteria, from some set of available alternatives
  • Frameworks, Platforms & Tools
    • scikit-learn - A free software machine learning library for the Python programming language
      • libsvm - A Library for Support Vector Machines
    • ML.NET - An open-source, cross-platform machine learning framework for .NET developers
    • Crab - A Python library for building recommender systems
    • Gradio - The fastest way to demo your machine learning model with a friendly web interface so that anyone can use it, anywhere
    • mlxtend - A Python library of useful tools for the day-to-day data science tasks
    • Prophet - A forecasting procedure for time series data that is fast and provides completely automated forecasts
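
The logistic regression entry above describes a model in which the log-odds of an event are a linear combination of the independent variables. A minimal sketch of that relationship in plain Python follows; the function name and the coefficient values are hypothetical and chosen only for illustration, not taken from any library listed here.

```python
import math


def predict_probability(weights, bias, features):
    """Return P(event) when the log-odds are linear in the features."""
    # Log-odds: bias plus a weighted sum of the independent variables.
    log_odds = bias + sum(w * x for w, x in zip(weights, features))
    # The logistic (sigmoid) function maps log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficients, for illustration only.
weights = [0.8, -0.5]
bias = 0.1

p = predict_probability(weights, bias, [1.0, 2.0])
# Recovering the log-odds from p gives back the linear combination:
# 0.1 + 0.8 * 1.0 - 0.5 * 2.0 = -0.1
recovered_log_odds = math.log(p / (1.0 - p))
```

In practice the coefficients would be fitted from labeled data, for example with scikit-learn; the sketch only shows how a fitted model turns features into a probability.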

Deep Learning

  • Neural Network Fundamentals
    • Neural network - A computational model used in machine learning for finding patterns in data
    • Tensor - A mathematical object represented as a multidimensional array, used in machine learning
    • Sigmoid function - A mathematical function having a characteristic 'S'-shaped curve or sigmoid curve
    • Softmax function - A function that converts a vector of K real numbers into a probability distribution of K possible outcomes
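    • Backpropagation - A widely used algorithm for training neural networks that computes the gradient of the loss with respect to each weight by applying the chain rule backward through the network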
    • Autoencoder - A type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning)
    • Vanishing gradient problem - The difficulty encountered when training artificial neural networks with gradient-based learning methods and backpropagation, where gradients shrink as they back-propagate
  • Deep Learning Concepts & Training
    • Deep Learning - A part of a broader family of machine learning methods based on artificial neural networks with representation learning
    • Stochastic gradient descent - An iterative method for optimizing an objective function with suitable smoothness properties
    • Fine tuning - An approach to transfer learning in which the weights of a pre-trained model are trained on new data
    • LoRA (machine learning) - A parameter-efficient fine-tuning technique for adapting pre-trained models to specific tasks with significantly fewer computational resources
  • Key Architectures
    • Recurrent neural network - A class of artificial neural networks where connections between nodes can create cycles, allowing output from some nodes to affect subsequent input to the same nodes
      • LSTM - A recurrent neural network architecture (long short-term memory) that uses gated memory cells to retain information over long sequences and mitigate the vanishing gradient problem
    • Convolutional neural network (CNN) - A class of artificial neural network, most commonly applied to analyze visual imagery
    • Attention - A technique in the context of neural networks that mimics cognitive attention, enhancing the important parts of the input data and fading out the rest
      • Transformer - A deep learning architecture based on the multi-head attention mechanism
  • Core Frameworks
    • TensorFlow - An end-to-end open source platform for machine learning
      • TFDS - A collection of datasets ready to use with TensorFlow or other Python ML frameworks such as JAX
      • Keras - The Python Deep Learning API designed for human beings, not machines
    • PyTorch - An open source machine learning framework that accelerates the path from research prototyping to production deployment
  • Textbooks & Visualization
    • Neural Networks and Deep Learning - A free online book explaining the core ideas behind neural networks and deep learning
    • Deep Learning, MIT Press - A textbook intended to help students and practitioners enter the field of machine learning in general and deep learning in particular
    • AttentionViz - A Global View of Transformer Attention
    • BertViz - A tool for visualizing Attention in NLP Models
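
The sigmoid and softmax entries above are defined by short formulas, so a small worked example may help. The following is a minimal sketch in plain Python, assuming list inputs; the function names are illustrative and the code is not the implementation used by TensorFlow, Keras, or PyTorch.

```python
import math


def sigmoid(x):
    """S-shaped function mapping any real number into the interval (0, 1)."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softmax(values):
    """Convert K real numbers into a probability distribution over K outcomes."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([1.0, 2.0, 3.0])
```

The outputs of `softmax` are positive and sum to 1 (up to floating-point rounding), and larger inputs receive larger probabilities, which is why it is commonly used as the final layer of a classifier.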

AI Applications & Modalities

Natural Language Processing (NLP)

Large Language Models (LLMs)

  • Model Providers & Aggregators
  • Open Models
    • Llama - The open-source AI models you can fine-tune, distill and deploy anywhere
    • Gemma - A family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models
    • Mistral - A family of open-source and commercial generative AI models
    • OLMo - A state-of-the-art, truly open language model and framework to build and study the science of language models
  • Techniques & Methods
  • Application Frameworks & SDKs
    • Unified SDKs
      • OmniAI - A minimalist library for interfacing with LLMs
      • LiteLLM - A Python SDK and Proxy Server to call over 100 LLM APIs using the OpenAI format
      • RubyLLM - The one beautiful Ruby API for GPT, Claude, Gemini, and more
    • Single-Provider SDKs
      • Go OpenAI - Go client libraries for the OpenAI API
      • Ruby OpenAI - A Ruby wrapper for the OpenAI API
      • Google Gen AI SDK - The Python SDK for Google's generative AI models
      • RedCandle - A Ruby gem for running state-of-the-art language models locally (via Rust's Candle)
    • Application Frameworks
      • Genkit - An open-source framework for building AI-powered apps, built and used in production by Google
      • LangChain - A framework for developing applications powered by language models
      • FastMCP v2 - The standard framework for building MCP applications
  • Dev Tools & Evaluation
    • LLM - A CLI utility and Python library for interacting with Large Language Models
    • lootbox - A CLI inspired by "Code Mode", in which LLMs write TypeScript code to call APIs rather than using tool invocation
    • Chatbot Arena - A crowdsourced open platform for evaluating LLMs
  • Chatbot Services

Agentic AI

  • Agent Frameworks
    • LangGraph - A library for building stateful, multi-actor applications with LLMs
    • Agno - A multi-agent framework, runtime and control plane
    • Fantasy - A unified interface for interacting with various AI language models
    • Semantic Kernel - A lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models
  • LLM App Platforms
    • Dify - An open-source LLM app development platform
  • Workflow Automation
    • n8n - A fair-code licensed workflow automation tool that combines AI capabilities with business process automation
  • Protocols
    • A2A Protocol - An open protocol (Agent2Agent) for communication and interoperability between AI agents built on different frameworks
  • Supporting Services
    • Firecrawl - An API service that takes a URL, crawls it, and converts it into clean markdown or structured data
    • Tavily Search - A search engine optimized for LLMs, aimed at efficient, quick and persistent search results

Computer Vision

  • Core Concepts
  • Software, Libraries and Tools
    • General computer vision
      • OpenCV - An open source computer vision and machine learning software library
        • GoCV - A package for the Go programming language with bindings for OpenCV 4
    • Optical Character Recognition (OCR)
      • Tesseract OCR - An open source text recognition (OCR) Engine
        • gosseract OCR - A Go package for OCR (Optical Character Recognition) that uses the Tesseract C++ library
      • EasyOCR - A ready-to-use OCR with 80+ supported languages and all popular writing scripts
      • OCRmyPDF - A tool to add a searchable OCR text layer to PDF files

MLOps & Productionalization

ML Lifecycle & Versioning

  • DVC - Open-source Data Version Control for machine learning projects
  • CML - An open-source tool for implementing continuous integration & delivery (CI/CD) in machine learning projects
  • MLflow - An open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry
  • Kubeflow - A machine learning toolkit for Kubernetes, dedicated to making deployments of ML workflows on Kubernetes simple, portable and scalable

Model Deployment & Serving

  • Cloud Platforms
    • Vertex AI - A machine learning (ML) platform for training and deploying ML models and AI applications
    • Amazon Bedrock - A fully managed service offering a choice of high-performing foundation models
    • Microsoft Foundry - A platform for building and deploying AI applications, with a portfolio of services and models
    • Azure OpenAI Service - A service providing REST API access to OpenAI's language models
    • Azure Machine Learning - An enterprise-grade machine learning service to build and deploy models faster
    • Amazon SageMaker - A service to build, train, and deploy machine learning (ML) models for any use case with fully managed infrastructure, tools, and workflows
  • Local LLM Deployment
    • Ollama - A tool designed for deploying and managing large language models (LLMs) locally
    • LM Studio - A desktop app for developing and experimenting with LLMs locally on your computer
    • LocalAI - The free, Open Source OpenAI alternative
  • Standards
    • Model Formats
      • GGUF - A file format for storing models for inference with GGML and executors based on GGML
      • ONNX - An open format built to represent machine learning models
      • Safetensors - A simple format for storing tensors safely
    • Protocols