06 - AI, Machine Learning & LLM

Foundations of AI & Machine Learning

Core AI Paradigms

Kinds of AI
- Symbolic AI - The collection of all methods in artificial intelligence research that are based on high-level symbolic (human-readable) representations of problems, logic and search
- Generative AI - A subset of artificial intelligence that uses generative models to produce text, images, videos, or other forms of data
- Causal AI - A technique in artificial intelligence that builds a causal model and can thereby make inferences using causality rather than just correlation
AI Programming Languages
- Mojo - The programming language for all AI developers

Classical Machine Learning

Paradigms
- Supervised learning - A paradigm in machine learning where algorithms learn from labeled data
  - Decision tree learning - The method using a decision tree as a predictive model to go from observations about an item to conclusions about the item's target value
  - Ensemble learning - The method using multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone
    - Random forest - An ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time
  - Support vector machine - A family of supervised learning models with associated learning algorithms that analyze data for classification and regression analysis
  - Classification - The problem of identifying which of a set of categories (sub-populations) a new observation belongs to, on the basis of a training set of data containing observations
    - Logistic regression - A statistical model that models the probability of an event taking place by having the log-odds for the event be a linear combination of one or more independent variables
    - ROC curve - A graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied
    - Naive Bayes classifier - A family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features
  - Regression - A set of statistical processes for estimating the relationships between a dependent variable and one or more independent variables
    - Ordinary least squares - A type of linear least squares method for choosing the unknown parameters in a linear regression model
    - Generalized linear model - A flexible generalization of ordinary least squares regression
    - ARIMA model - A generalization of an autoregressive moving average (ARMA) model, fitted to time series data either to better understand the data or to predict future points in the series
- Unsupervised learning - A type of machine learning in which models are trained on an unlabeled dataset and find structure in that data without labeled examples
  - K-means clustering - A method of vector quantization that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean
- Reinforcement learning - An area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize cumulative reward
  - Markov decision process - The mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker
  - Multi-armed bandit - A problem in which a fixed limited set of resources must be allocated between competing (alternative) choices in a way that maximizes their expected gain
  - Value function - A function used in mathematical optimization and reinforcement learning that assigns a measure of desirability to states or actions
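
The K-means entry above (each observation belongs to the cluster with the nearest mean) can be sketched as follows. This is a minimal illustration of Lloyd's algorithm using only the standard library; the function name `kmeans` and the choice to seed centroids with the first k points are assumptions made for the example, not how any particular library does it.

```python
from math import dist

def kmeans(points, k, iterations=20):
    """Return k centroids found by alternating assignment and update steps."""
    # Assumption: seed with the first k points (libraries use smarter seeding).
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest mean.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[i])  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # assignments no longer change the means
        centroids = new_centroids
    return centroids

# Two well-separated groups of 2-D points.
data = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
centers = kmeans(data, k=2)
```

With this data, one centroid should settle near the origin group and the other near the (10, 10) group.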
Concepts & Techniques
- Hyperparameter - A parameter whose value is used to control the learning process
- Hyperparameter optimization - The problem of choosing a set of optimal hyperparameters for a learning algorithm
- Embedding - A representation learning technique that maps complex, high-dimensional data into a lower-dimensional vector space of numerical vectors
- Early stopping - A form of regularization used to avoid overfitting when training a learner with an iterative method, such as gradient descent
- Cross-validation - Any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent data set
- Transfer learning
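
As a rough illustration of the Cross-validation entry above, the sketch below splits sample indices into k folds so that each fold serves once as the held-out test set. The helper name `k_fold_indices` is hypothetical, and no shuffling is done here, which a real workflow would usually add.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

for train, test in k_fold_indices(n_samples=5, k=2):
    # A model would be fitted on `train` and scored on `test` here.
    print(train, test)
```

Averaging the per-fold scores gives an estimate of how the model might generalize to independent data.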
Applications & Problem Domains
- Anomaly detection - The identification of rare items, events or observations which raise suspicions by differing significantly from the majority of the data
  - One-class classification - A technique that tries to identify objects of a specific class among all objects, primarily by learning from a training set containing only objects of that class
- Recommender system - An information filtering system that seeks to predict the 'rating' or 'preference' a user would give to an item
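
One very simple way to flag observations that differ significantly from the majority, as in the Anomaly detection entry above, is a z-score threshold. The sketch below is only one of many possible approaches; the function name `zscore_outliers` and the default threshold of 2.0 are assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def zscore_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical: nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

readings = [10, 11, 9, 10, 12, 10, 50]
print(zscore_outliers(readings))  # → [50]
```

Note that a single extreme value also inflates the mean and standard deviation, so robust statistics (for example the median) are often preferred in practice.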
Related Fields
- Mathematical model - An abstract description of a concrete system using mathematical concepts and language
- Mathematical optimization - The selection of a best element, with regard to some criteria, from some set of available alternatives
Frameworks, Platforms & Tools
- scikit-learn - A free software machine learning library for the Python programming language
  - libsvm - A Library for Support Vector Machines
- ML.NET - An open-source, cross-platform machine learning framework for .NET developers
- Crab - A Python library for building recommender systems
- Gradio - The fastest way to demo your machine learning model with a friendly web interface so that anyone can use it, anywhere
- mlxtend - A Python library of useful tools for the day-to-day data science tasks
- Prophet - A forecasting procedure for time series data that is fast and provides completely automated forecasts

Deep Learning

Neural Network Fundamentals
- Neural network - The computational models used in machine learning for finding patterns in data
- Tensor - The mathematical objects represented as multidimensional arrays used in machine learning
- Sigmoid function - A mathematical function having a characteristic 'S'-shaped curve or sigmoid curve
- Softmax function - A function that converts a vector of K real numbers into a probability distribution of K possible outcomes
- Backpropagation - A widely used algorithm for training feedforward neural networks
- Autoencoder - A type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning)
- Vanishing gradient problem - The difficulty encountered when training artificial neural networks with gradient-based learning methods and backpropagation, where gradients shrink as they back-propagate
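
The Sigmoid and Softmax entries above can be written out in a few lines. This is a minimal standard-library sketch; the branching in `sigmoid` and the max-subtraction in `softmax` are common numerical-stability tricks added here, not part of the definitions themselves.

```python
import math

def sigmoid(x):
    """Logistic sigmoid: maps any real number into the interval (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, use an equivalent form that avoids overflow in exp().
    e = math.exp(x)
    return e / (1.0 + e)

def softmax(values):
    """Convert K real numbers into a probability distribution over K outcomes."""
    shift = max(values)  # subtracting the max leaves the result unchanged
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))          # → 0.5
print(softmax([0.0, 0.0]))   # → [0.5, 0.5]
```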
Deep Learning Concepts & Training
- Deep Learning - A part of a broader family of machine learning methods based on artificial neural networks with representation learning
- Stochastic gradient descent - An iterative method for optimizing an objective function with suitable smoothness properties
- Fine tuning - An approach to transfer learning in which the weights of a pre-trained model are trained on new data
- LoRA (machine learning) - A parameter-efficient fine-tuning technique for adapting pre-trained models to specific tasks with significantly fewer computational resources
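
To make the Stochastic gradient descent entry above concrete, the sketch below fits a single slope `w` in the model y = w * x by updating on one sample at a time. The function name `sgd_fit_slope`, the squared-error loss, and the learning rate are assumptions picked for a toy example.

```python
import random

def sgd_fit_slope(xs, ys, learning_rate=0.01, epochs=50, seed=0):
    """Estimate w in y = w * x by per-sample gradient steps on squared error."""
    rng = random.Random(seed)
    samples = list(zip(xs, ys))
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(samples)  # the "stochastic" part: visit samples in random order
        for x, y in samples:
            gradient = 2 * (w * x - y) * x  # d/dw of (w*x - y)**2
            w -= learning_rate * gradient
    return w

# Data generated from y = 2x, so the estimate should approach 2.
w = sgd_fit_slope([1, 2, 3, 4], [2, 4, 6, 8])
```

Too large a learning rate makes the updates overshoot and diverge, which is one reason the learning rate is treated as a hyperparameter.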
Key Architectures
- Recurrent neural network - A class of artificial neural networks where connections between nodes can create cycles, allowing output from some nodes to affect subsequent input to the same nodes
  - LSTM - An artificial neural network used in the fields of artificial intelligence and deep learning, distinguished by feedback connections
- Convolutional neural network (CNN) - A class of artificial neural network, most commonly applied to analyze visual imagery
- Attention - A technique in the context of neural networks that mimics cognitive attention, enhancing the important parts of the input data and fading out the rest
  - Transformer - A deep learning architecture based on the multi-head attention mechanism
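
The Attention entry above can be illustrated with single-head scaled dot-product attention, the building block that multi-head attention repeats in parallel. This is a plain-Python sketch over lists of floats, assuming no masking and no learned projections; the function names are illustrative.

```python
import math

def softmax(row):
    shift = max(row)
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention: a weighted average of values per query."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)  # larger weight = more "important" input
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs

# Identical keys give uniform weights, so the output is the mean of the values.
out = attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
print(out)  # → [[2.0, 4.0]]
```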
Core Frameworks
- TensorFlow - An end-to-end open source platform for machine learning
  - TFDS - The collection of datasets ready to use with TensorFlow or other Python ML frameworks like Jax
  - Keras - The Python Deep Learning API designed for human beings, not machines
- PyTorch - An open source machine learning framework that accelerates the path from research prototyping to production deployment
Textbooks & Visualization
- Neural Networks and Deep Learning - A free online book explaining the core ideas behind neural networks and deep learning
- Deep Learning, MIT Press - The textbook intended to help students and practitioners enter the field of machine learning in general and deep learning in particular
- AttentionViz - A Global View of Transformer Attention
- BertViz - A tool for visualizing Attention in NLP Models

AI Applications & Modalities

Natural Language Processing (NLP)

Foundational Linguistics Fields
Core NLP Concepts & Techniques
Vector Representations (Embeddings)
- Word embedding
  - Word2vec
  - fastText - Library for efficient text classification and representation learning
  - GloVe - Global Vectors for Word Representation
- Sentence embedding
Libraries & tools
- General Purpose
  - Natural Language Toolkit - A leading platform for building Python programs to work with human language data
  - Gensim - A free open-source Python library for representing documents as semantic vectors
  - wego - From-scratch implementations of word embedding (a.k.a. word representation) models in Go
- Morphological Analyzers / Tokenizers
  - Kuromoji - An open source Japanese morphological analyzer written in Java
  - Kagome - An open source Japanese morphological analyzer written in pure Go
  - mecab-python3 - A Python wrapper for the MeCab morphological analyzer for Japanese text
  - jieba - A Python module for Chinese text segmentation

Large Language Models (LLMs)

Model Providers & Aggregators
- Anthropic - The API providing access to Anthropic's Claude models
- OpenAI - The platform for building applications with OpenAI's models
- Gemini Developer APIs - The API that gives you access to the latest Gemini models from Google
- Hugging Face Serverless Inference API - The API allowing inference on models hosted on the Hugging Face Hub
- OpenRouter - A unified interface for LLMs
Open Models
- Llama - The open-source AI models you can fine-tune, distill and deploy anywhere
- Gemma - A family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models
- Mistral - A family of open-source and commercial generative AI models
- OLMo - A state-of-the-art, truly open language model and framework to build and study the science of language models
Techniques & Methods
- Retrieval-augmented generation (RAG)
- GraphRAG - A data pipeline and transformation suite that is designed to extract meaningful, structured data from unstructured text using the power of LLMs
- Prompt Engineering
  - Prompt Engineering Guide
  - CRAFT framework
- ReAct Prompting - A prompting technique synergizing reasoning and acting in language models
  - Reason, Act, Thought, Observation
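
The Thought, Action, Observation cycle named in the ReAct entry above can be sketched as a small loop. Everything here is hypothetical: `scripted_model` is a stand-in for a real LLM call, the `length` tool is a toy, and the dictionary format for a step is an assumption made for this example rather than any library's API.

```python
def run_react(model, tools, question, max_steps=5):
    """Alternate model reasoning with tool calls until the model gives an answer."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = model(transcript)
        transcript += f"Thought: {step['thought']}\n"
        if "answer" in step:
            return step["answer"], transcript
        # Act: run the chosen tool, then feed the result back as an observation.
        observation = tools[step["action"]](step["input"])
        transcript += f"Action: {step['action']}[{step['input']}]\n"
        transcript += f"Observation: {observation}\n"
    return None, transcript  # gave up after max_steps

def scripted_model(transcript):
    """Hypothetical stand-in for an LLM: first calls a tool, then answers."""
    if "Observation:" not in transcript:
        return {"thought": "I should measure the word.", "action": "length", "input": "hello"}
    last_observation = transcript.split("Observation: ")[-1].strip()
    return {"thought": "I have the result.", "answer": last_observation}

tools = {"length": lambda text: len(text)}  # toy tool for the example
answer, transcript = run_react(scripted_model, tools, "How long is 'hello'?")
print(answer)  # → 5
```

In a real setup the model call would be replaced by a prompt to an LLM and the step would be parsed from its text output.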
Application Frameworks & SDKs
- Unified SDKs
  - OmniAI - A minimalist library for interfacing with LLMs
  - LiteLLM - A Python SDK and Proxy Server to call over 100 LLM APIs using the OpenAI format
  - RubyLLM - The one beautiful Ruby API for GPT, Claude, Gemini, and more
- Single-Provider SDKs
  - Go OpenAI - The Go client libraries for OpenAI API
  - Ruby OpenAI - A Ruby wrapper for the OpenAI API
  - Google Gen AI SDK - The Python SDK for Google's generative AI models
  - RedCandle - A Ruby gem for running state-of-the-art language models locally (via Rust's Candle)
- Application Frameworks
  - Genkit - An open-source framework for building AI-powered apps, built and used in production by Google
  - LangChain - A framework for developing applications powered by language models
  - FastMCP v2 - The standard framework for building MCP applications
Dev Tools & Evaluation
- LLM - A CLI utility and Python library for interacting with Large Language Models
- lootbox - A CLI inspired by "Code Mode", in which LLMs write TypeScript code to call APIs rather than using tool invocation
- Chatbot Arena - A crowdsourced open platform for evaluating LLMs
Chatbot Services
- Character.ai

Agentic AI

Agent Frameworks
- LangGraph - A library for building stateful, multi-actor applications with LLMs
- Agno - A multi-agent framework, runtime and control plane
- Fantasy - A unified interface for interacting with various AI language models
- Semantic Kernel - A lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models
LLM App Platforms
- Dify - An open-source LLM app development platform
Workflow Automation
- n8n - A fair-code licensed workflow automation tool that combines AI capabilities with business process automation
Protocols
- A2A Protocol - An open protocol (Agent2Agent) for communication and interoperability between AI agents
Supporting Services
- Firecrawl - An API service that takes a URL, crawls it, and converts it into clean markdown or structured data
- Tavily Search - A search engine optimized for LLMs, aimed at efficient, quick and persistent search results

Computer Vision

Core Concepts
- Vision Language Models (VLM) - A class of models that can understand both images and text
- Diffusion model
- Multimodal learning
Software, Libraries and Tools
- General computer vision
  - OpenCV - An open source computer vision and machine learning software library
    - GoCV - A package for the Go programming language with bindings for OpenCV 4
- Optical Character Recognition (OCR)
  - Tesseract OCR - An open source text recognition (OCR) Engine
    - gosseract OCR - A Go package for OCR (Optical Character Recognition) using the Tesseract C++ library
  - EasyOCR - A ready-to-use OCR with 80+ supported languages and all popular writing scripts
  - OCRmyPDF - A tool to add a searchable OCR text layer to PDF files

MLOps & Productionalization

ML Lifecycle & Versioning

- DVC - Open-source Data Version Control for machine learning projects
- CML - An open-source tool for implementing continuous integration & delivery (CI/CD) in machine learning projects
- MLflow - An open source platform to manage the ML lifecycle, including experimentation, reproducibility, deployment, and a central model registry
- Kubeflow - The Machine Learning Toolkit for Kubernetes, dedicated to making deployments of ML workflows on Kubernetes simple, portable and scalable

Model Deployment & Serving

Cloud Platforms
- Vertex AI - A machine learning (ML) platform for training and deploying ML models and AI applications
- Amazon Bedrock - A fully managed service offering a choice of high-performing foundation models
- Microsoft Foundry - A platform for building and deploying AI applications, with a portfolio of services and models
- Azure OpenAI Service - The service providing REST API access to OpenAI's powerful language models
- Azure Machine Learning - An enterprise-grade machine learning service to build and deploy models faster
- Amazon SageMaker - The service to build, train, and deploy machine learning (ML) models for any use case with fully managed infrastructure, tools, and workflows
Local LLM Deployment
- Ollama - A tool designed for deploying and managing large language models (LLMs) locally
- LM Studio - A desktop app for developing and experimenting with LLMs locally on your computer
- LocalAI - The free, Open Source OpenAI alternative
Standards
- Model Formats
  - GGUF - A file format for storing models for inference with GGML and executors based on GGML
  - ONNX - An open format built to represent machine learning models
  - Safetensors - A simple format for storing tensors safely
- Protocols
  - Model Context Protocol (MCP) - An open protocol that standardizes how applications provide context to LLMs
    - Elicitation
