02 - Web Application Development
Web Platform Fundamentals
Relevant DSS-P Skills
- 3. Technology > 3.1 Software Development > Web Application Core Technology
Web Concepts
- World Wide Web - An information space where documents and other web resources are identified by Uniform Resource Locators (URLs), interlinked by hypertext links, and accessible via the Internet
- Hypertext - Text displayed on a computer display or other electronic device, with references (hyperlinks) to other text that the reader can immediately access
- Semantic Web - An extension of the World Wide Web that allows Internet data to be machine-readable through standards set by the W3C, enabling automated agents to process information more intelligently
- URI - A unique sequence of characters that identifies a logical or physical resource
- URL - A URI that identifies a resource by its location and the scheme used to retrieve it; the WHATWG URL Standard defines URLs, domains, IP addresses, the application/x-www-form-urlencoded format, and their API
- Core Web Protocols & Languages
- HTTP - An application protocol for distributed, collaborative, hypermedia information systems
- HTTP cookie - A small piece of data that a server sends to a user's web browser
- HTML - The World Wide Web's core markup language
- CSS - A simple mechanism for adding style (e.g., fonts, colors, spacing) to Web documents
- Real-time & Messaging Protocols
- WebSockets - A technology that makes it possible to open a two-way interactive communication session between the user's browser and a server
- WebRTC - A free and open-source project providing web browsers and mobile applications with real-time communication (RTC)
- Server-sent events - A technology to enable servers to push data to web pages over HTTP or using dedicated server-push protocols
- MQTT - A lightweight, publish-subscribe, machine-to-machine network protocol for message queuing services
- AMQP - An open standard application layer protocol for message-oriented middleware
- Data & Event Specifications
- CloudEvents - A specification for describing event data in a common way
- JSON Merge Patch - A JSON format that describes changes to be made to a target JSON document
- OpenAPI spec - A standard, language-agnostic interface to HTTP APIs
- TypeSpec - A minimal language that helps developers describe API shapes in a familiar way
- API Tooling
- Redocly CLI - An open-source command-line tool that helps you lint, bundle, and preview OpenAPI definitions
- Web Performance Concepts
- DNS Prefetching - A mechanism to resolve domain names before a user tries to follow a link
- Web Application Types
- Progressive web app - A type of application software delivered through the web, built using common web technologies including HTML, CSS, JavaScript, and WebAssembly
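The JSON Merge Patch entry above describes a patch document that is merged into a target document. A minimal sketch of that merge, following the algorithm in RFC 7396, might look like the following. The function name `merge_patch` and the sample documents are illustrative assumptions, not part of any listed tool.

```python
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply a JSON Merge Patch (RFC 7396) and return the patched value."""
    # A patch that is not a JSON object replaces the target wholesale.
    if not isinstance(patch, dict):
        return patch
    # A target that is not a JSON object is treated as an empty object.
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # null in the patch means "remove this member from the target".
            result.pop(key, None)
        else:
            # Objects are merged recursively; other values replace the old one.
            result[key] = merge_patch(result.get(key), value)
    return result


# Hypothetical sample data for illustration only.
doc = {
    "title": "Hello",
    "author": {"name": "Ann", "email": "ann@example.com"},
    "tags": ["a", "b"],
}
patch = {"title": "Hi", "author": {"email": None}, "tags": ["c"]}
patched = merge_patch(doc, patch)
print(patched)
```

Note that arrays are replaced as a whole rather than merged element by element, and that a member cannot be set to `null` with this format because `null` is reserved for deletion.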
Browser Technologies & DOM
- Browsers
- Chrome - A freeware, cross-platform web browser developed by Google
- Chromium - An open-source browser project that aims to build a safer, faster, and more stable way for all users to experience the web
- Firefox - A free and open-source web browser developed by the Mozilla Foundation
- w3m - A text-based web browser as well as a pager
- EWW - The Emacs Web Wowser, a web browser for Emacs
- Rendering Engines
- Scripting Engines
- V8 (JavaScript engine) - Google's open source high-performance JavaScript and WebAssembly engine, written in C++
- JavaScriptCore - The JavaScript engine that powers Safari and other apps on Apple platforms
- Client Scripting APIs
- XMLHttpRequest (XHR) - An API that provides scripted client functionality for transferring data between a client and a server
- Fetch Standard - A living standard that defines requests, responses, and the process that binds them: fetching
- Canvas API - The means for drawing graphics via JavaScript and the HTML <canvas> element
- WebGL API - A JavaScript API for rendering high-performance interactive 3D and 2D graphics within any compatible web browser without the use of plug-ins
- Site Analyzers
- Wappalyzer - A technology profiler that shows you what websites are built with
Web Application Architectures
- Single-page application - A web application or website that interacts with the user by dynamically rewriting the current web page with new data from the web server
- Multi-page application - A traditional web structure with multiple pages that are independently downloaded from the server, each having its own URL and loaded separately when requested by the user
- Microfrontend - An architectural pattern for web development where independently developed frontends are composed into a greater whole
- Islands Architecture - A frontend pattern that renders pages to fast, static HTML with selective "islands" of JavaScript added only where interactivity is needed
- Backend for Frontend - An architectural pattern where separate backend services are created specifically for different frontend applications
- Multitier architecture - A client-server architecture where different levels of software architecture are physically separated into presentation, application processing, and data management functions
- Server-side rendering - An approach where HTML is rendered on the server and sent to the client, and client-side JavaScript then makes the web page interactive by attaching event handlers in a process called hydration
- Incremental Static Regeneration - A technique that enables static generation on a per-page basis without needing to rebuild the entire site, allowing updates to static content after deployment
- JAMstack - An architectural approach that decouples the web experience layer from data and business logic, improving flexibility, scalability, performance, and maintainability
Frontend Development
Relevant DSS-P Skills
- 3. Technology > 3.1 Software Development > Front-end System Development
- 1. Business Transformation > 1.4 Design > Digital Product Design
UI Frameworks & Core Libraries
- Core SPA Frameworks
- React - The library for web and native user interfaces
- Core Concepts
- Component - A fundamental building block used to create user interfaces
- Props - A mechanism for passing data from a parent component to a child component
- Children - A special prop that allows components to be composed
- Key Props - A special string attribute you need to include when creating lists of elements
- Rendering - The process of React asking your components to describe what they want to look like
- Event Handler - A function that is executed in response to an event
- State - A JavaScript object that stores a component's dynamic data
- Controlled Component - A component where React state controls the value of an input element
- Hooks - A set of functions that let you "hook into" React state and lifecycle features from function components
- Strict Mode - A tool for highlighting potential problems in an application
- Side-effect - A term that refers to any operation that affects something outside of the function being executed
- Refs - A feature that provides a way to access DOM nodes or React elements created in the render method
- Context - A way to pass data through the component tree without having to pass props down manually at every level
- Portals - A feature that provides a first-class way to render children into a DOM node that exists outside the DOM hierarchy of the parent component
- Suspense - A component that lets you specify a loading indicator for a part of the component tree
- Error Boundary - A React component that catches JavaScript errors anywhere in its child component tree
- Preact - A fast 3kB alternative to React with the same modern API
- Vue.js - A JavaScript framework for building user interfaces
- Angular - A web framework that empowers developers to build fast, reliable applications
- Svelte - A UI framework that uses a compiler to let you write concise components that do minimal work in the browser, using languages you already know: HTML, CSS and JavaScript
- Ember.js - A framework for ambitious web developers
- HTML-First Frameworks
- Framework-agnostic Core Libraries
- TanStack - A collection of high-quality, framework-agnostic open-source libraries for web development
- TanStack Query - A powerful asynchronous state management library for TS/JS, React, Solid, Vue, Svelte and Angular
- TanStack Router - A powerful, type-safe, and framework-agnostic router for building modern web applications
- TanStack Table - A headless UI for building powerful tables & datagrids for TS/JS, React, Vue, Solid and Svelte
- TanStack Form - A type-safe, framework-agnostic form state management library for React, Vue, Solid, and Svelte
- TanStack Virtual - A headless UI for virtualizing large lists and grids in React, Vue, Svelte, Solid and JS
State, Routing & Logic
- State Management
- Redux - A JS library for predictable and maintainable global state management
- React-Redux - The official React binding for Redux
- Zustand - A small, fast, and scalable barebones state-management solution using simplified flux principles
- Recoil - A state management library for React
- XState - A library for creating, interpreting, and executing finite state machines and statecharts
- Routing
- React Router - A user-obsessed, standards-focused, multi-strategy router you can deploy anywhere
- Syntax & Templating
- WASM Runtimes
- PyScript - A free, open-source platform that facilitates the creation, deployment, and sharing of Python applications that run in the browser
Styling & UI Components
- CSS Ecosystem
- Frameworks and UI Kits
- Bootstrap - The world's most popular front-end open source toolkit
- Tailwind CSS - A utility-first CSS framework packed with classes
- Oat - An ultra-lightweight, semantic, zero-dependency HTML UI component library that provides minimal, standards-based CSS and JS
- Tailwind Component Libraries
- daisyUI - The most popular component library for Tailwind CSS
- CSS-in-JS
- Preprocessors
- Sass language - A stylesheet language that's compiled to CSS
- Transforms
- CSS Transforms 1 - A CSS module that allows elements to be transformed in two-dimensional space
- CSS Transforms 2 - A CSS module that allows elements to be transformed in three-dimensional space
- UI Component Libraries
- templUI - A growing collection of beautifully designed UI components for Go and templ
- Material UI - An open-source React component library that implements Google's Material Design
- Chakra UI - A component system for building products with speed
- Vuetify - An open-source UI library of handcrafted Vue components that requires no design skills
- Specialized UI Widgets
Build & Development Tooling
- Development Environments
- Storybook - A frontend workshop for building UI components and pages in isolation
- Bundlers
- Vite - A build tool that aims to provide a faster and leaner development experience for modern web projects
- Parcel - The zero configuration build tool
- webpack - A static module bundler for modern JavaScript applications
- Rspack - A high performance JavaScript bundler written in Rust
- Rsbuild - The Rspack-based web build tool
- Transpilers
- babel - A JavaScript compiler
- Minifiers
- JSMin - A minification tool that removes comments and unnecessary whitespace from JavaScript files
- Linters & Formatters
Full-stack & Static Site Frameworks
Relevant DSS-P Skills
- 3. Technology > 3.1 Software Development > Front-end System Development
- 3. Technology > 3.1 Software Development > Back-end System Development
Full-stack Frameworks
- JS/TS Full-stack Frameworks
- Next.js - A React framework for building full-stack web applications
- Nuxt.js - A free and open-source framework with an intuitive and extendable way to create type-safe, performant and production-grade full-stack web applications and websites with Vue.js
- Astro - The web framework for content-driven websites
- Fresh - A next generation web framework, built for speed, reliability, and simplicity
- Rust Full-Stack Frameworks
- Leptos - A cutting-edge Rust framework for the modern web
Static Site Generators
- Docusaurus - A static-site generator that builds a single-page application with fast client-side navigation, using React to make the site interactive
- mdBook - A utility to create modern online books from Markdown files
- VuePress - A Vue-powered Static Site Generator
- Hugo - The world's fastest framework for building websites
- Docsy - A Hugo theme for technical documentation sites, providing easy site navigation, structure, and more
- Jekyll - A simple, blog-aware, static site generator perfect for personal, project, or organization sites
- Eleventy - A simpler static site generator written in JavaScript
- Sphinx - A tool that makes it easy to create intelligent and beautiful documentation
- MkDocs - A fast, simple and downright gorgeous static site generator that's geared towards building project documentation
- Material for MkDocs - A powerful and beautiful theme for the MkDocs static site generator
- Nanoc - A static-site generator, fit for building anything from a small personal blog to a large corporate website
- gitmal - A static page generator designed for Git repositories
Headless CMS
- Cloud-native & API-first CMS
- Contentful - A headless content management system that provides a content-first approach to building digital products
- Strapi - The leading open-source headless CMS
- Sanity - A platform for structured content that lets you build better digital experiences
Backend Development
Relevant DSS-P Skills
- 3. Technology > 3.1 Software Development > Back-end System Development
API Architectural Styles
- REST - A software architectural style that was created to guide the design and development of the architecture for the World Wide Web
- SOAP (legacy) - A messaging protocol specification for exchanging structured information in the implementation of web services
- GraphQL - A query language for APIs and a runtime for fulfilling those queries with your existing data
- gRPC - A modern open source high performance Remote Procedure Call (RPC) framework that can run in any environment
- json-rpc - A stateless, light-weight remote procedure call (RPC) protocol
- Webhook - A method of augmenting or altering the behavior of a web page or web application with custom callbacks
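To make the json-rpc entry above more concrete, here is a minimal sketch of a JSON-RPC 2.0 request handler. The method table `METHODS`, the `add` method, and the function names are hypothetical and exist only for illustration. The sketch covers single requests only; notifications, batch requests, and invalid-parameter errors are left out.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical method table; a real service would register its own methods.
METHODS: Dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
}


def _error(code: int, message: str, req_id: Any = None) -> str:
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps(
        {"jsonrpc": "2.0", "error": {"code": code, "message": message}, "id": req_id}
    )


def handle(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 request string and return the response string."""
    try:
        req = json.loads(raw)
    except json.JSONDecodeError:
        return _error(-32700, "Parse error")
    if (
        not isinstance(req, dict)
        or req.get("jsonrpc") != "2.0"
        or not isinstance(req.get("method"), str)
    ):
        return _error(-32600, "Invalid Request")
    req_id = req.get("id")
    func = METHODS.get(req["method"])
    if func is None:
        return _error(-32601, "Method not found", req_id)
    params = req.get("params", [])
    # Params may be positional (array) or named (object).
    result = func(**params) if isinstance(params, dict) else func(*params)
    return json.dumps({"jsonrpc": "2.0", "result": result, "id": req_id})


print(handle('{"jsonrpc": "2.0", "method": "add", "params": [2, 3], "id": 1}'))
```

The protocol is transport-agnostic, so the same `handle` function could sit behind an HTTP endpoint or a WebSocket connection.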
Backend Frameworks
- JS/TS Backend Frameworks
- Fastify - A fast and low-overhead web framework for Node.js, designed for optimal performance and developer experience
- Express.js - A minimal and flexible Node.js web application framework
- Koa - A new web framework designed by the team behind Express
- Nest.js - A progressive Node.js framework for building efficient, reliable and scalable server-side applications
- Hono - A small, simple, and ultrafast web framework for the Edges
- API Tools
- tRPC - A tool that allows you to easily build & consume fully typesafe APIs without schemas or code generation
- Go Backend Frameworks
- Echo - A high performance, extensible, minimalist Go web framework
- Fiber - An Express inspired web framework built on top of Fasthttp, the fastest HTTP engine for Go, designed to ease development with performance in mind
- Gin Web Framework - A web framework written in Go
- Gorilla web toolkit - A helpful toolkit that provides useful, composable packages for writing HTTP-based applications
- Yokai - A simple, modular and observable Go framework for backend applications
- Python Backend Frameworks & Servers
- WSGI - The Web Server Gateway Interface
- ASGI - A spiritual successor to WSGI, the long-standing Python standard for compatibility between web servers, frameworks, and applications
- Uvicorn - A lightning-fast ASGI server implementation for Python, using uvloop and httptools for high performance
- Hypercorn - An ASGI and WSGI web server based on the sans-io hyper, h11, h2, and wsproto libraries with support for HTTP/1, HTTP/2, and HTTP/3
- FastAPI - A modern, fast (high-performance), web framework for building APIs with Python based on standard Python type hints
- SlowAPI - A small library to rate limit your ASGI applications
- Ruby Backend Frameworks & Servers
- Ruby on Rails - A web-application framework that includes everything needed to create database-backed web applications according to the Model-View-Controller (MVC) pattern
- Rack - A modular Ruby web server interface
- Sidekiq - A simple, efficient background processing tool for Ruby
- Perl Backend Frameworks (legacy)
- Java Backend Frameworks
- Jakarta EE - A set of specifications that define Java APIs for enterprise software development
- Apache Tomcat - An open-source web server and servlet container
- Spring - A project that makes Java simple, modern, productive, reactive, and cloud-ready
- Spring Boot - A tool that takes an opinionated view of the Spring platform and third-party libraries so you can get started with minimum fuss
- .NET Backend Frameworks
- ASP.NET - A free, cross-platform, open source framework for building web apps and services with .NET and C#
- Elixir Backend Frameworks
- Phoenix - A web framework for building rich, interactive web applications quickly with less code and fewer moving parts, used to craft APIs, HTML5 apps, and more at scale
- GraphQL Servers
- Apollo Server - An open-source, spec-compliant GraphQL server that's compatible with any GraphQL client
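The WSGI entry above names the interface between Python web servers and applications. As a rough sketch, a WSGI application is a callable that receives the request environment and a `start_response` callback and returns an iterable of bytes. The names `application` and `call_app` below are illustrative assumptions; the example uses only the standard-library `wsgiref` package.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI application that echoes the request path."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(path):
    """Invoke the application in-process, the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    chunks = application(environ, start_response)
    return captured["status"], b"".join(chunks)


status, body = call_app("/ping")
print(status, body)
```

To serve the same callable over HTTP during development, it could be passed to `wsgiref.simple_server.make_server`; production deployments would normally use a dedicated WSGI server instead.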
Web Infrastructure
Relevant DSS-P Skills
- 3. Technology > 3.1 Software Development > Cloud Infrastructure Utilization
Web Server & Proxy
- Web Servers & Reverse Proxy Servers
- NGINX - An open source software for web serving, reverse proxying, caching, load balancing, media streaming, and more
- Apache HTTP Server - A project to develop and maintain an open-source HTTP server for modern operating systems including UNIX and Windows
- Caddy - A powerful, extensible platform to serve your sites, services, and apps, written in Go
- HAProxy - A free, very fast and reliable reverse-proxy offering high availability, load balancing, and proxying for TCP and HTTP-based applications
- nodejs http-server - A simple static HTTP server
- goshs - A feature-rich single-binary file server for red teamers and developers supporting HTTP/S, WebDAV, SFTP, SMB, LDAP/S, NTLM hash capture, DNS/SMTP callbacks, TLS, authentication, and share links
- API Management
- Unkey - An open-source API management platform designed to help developers secure, manage, and scale their APIs
- Kong API gateway - A lightweight, fast, and flexible cloud-native API gateway
- Azure API Management - A hybrid, multicloud management platform for APIs across all environments
- Amazon API Gateway - A fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale
- Google Cloud Apigee - The platform for developing and managing API services
- Gravitee - A unified API visibility and governance platform that provides a single pane of glass for managing, securing, and governing APIs across any infrastructure
CDN & Edge Computing
- Concepts
- Web cache - An information technology for the temporary storage (caching) of web documents, such as HTML pages and images, to reduce bandwidth usage, server load, and perceived lag
- Content delivery network - A geographically distributed network of proxy servers and their data centers
- Point of presence - An artificial demarcation point or interface point between communicating entities
- Forward Proxy Servers
- Squid - A caching proxy for the Web supporting HTTP, HTTPS, FTP, and more
- CDN Providers
- Cloudflare - A global network designed to make everything you connect to the Internet secure, private, fast, and reliable
- Cloudflare Workers - A serverless execution environment that allows you to create entirely new applications or augment existing ones without configuring or maintaining infrastructure
- Cloudflare Workers Bindings - A mechanism that allows your Worker to interact with resources on the Cloudflare Developer Platform, providing better performance and fewer restrictions than REST APIs for accessing resources from Workers
- Amazon CloudFront - A content delivery network (CDN) service built for high performance, security, and developer convenience
- Lambda@Edge - A feature of Amazon CloudFront that lets you run code closer to users of your application
- Google Cloud CDN - A content delivery network (CDN) that accelerates delivery of your web and video content
- Azure Front Door - A modern cloud content delivery network (CDN) that provides a secure and scalable entry point for fast delivery of your global web applications and content
- JAMstack Hosting
- GitLab Pages - A feature that allows you to publish static websites directly from a repository in GitLab
- Cloudflare Pages - A JAMstack platform for frontend developers to collaborate and deploy websites
Decentralized Web
Relevant DSS-P Skills
- 3. Technology > 3.2 Digital Technology > Other Advanced Technologies
Blockchain Technology
- Web3 - An idea for a new iteration of the World Wide Web which incorporates concepts such as decentralization, blockchain technologies, and token-based economics
- Blockchain - A distributed ledger with growing lists of records
- Hashcash - A proof-of-work system used to limit email spam and denial-of-service attacks
- Proof of work - A form of cryptographic proof in which one party proves to others that a certain amount of a specific computational effort has been expended
- Smart contract - A computer program or transaction protocol designed to automatically execute, control, or document events and actions according to contract terms
- Bitcoin - A decentralized digital currency that can be transferred on the peer-to-peer bitcoin network
- Ethereum - A global, decentralized network that provides direct ownership of digital assets, data, and identity without requiring permission from any central authority
- Non-fungible token - A unique digital identifier that is recorded on a blockchain and is used to certify ownership and authenticity
- Decentralized autonomous organization - A member-owned community without centralized leadership managed by decentralized computer programs with voting and finances handled through a blockchain
- Solidity - A programming language for implementing smart contracts on various blockchain platforms, most notably Ethereum
- Web3.js - A TypeScript/JavaScript library that enables developers to connect to and interact with Ethereum and other EVM-compatible blockchains
- ethers.js - A simple, compact and complete JavaScript library for all your Ethereum needs
- MetaMask - A crypto wallet that enables users to buy, sell, swap, and store cryptocurrencies while maintaining control over their data and assets
- WalletConnect - An open-source protocol that establishes encrypted connections between mobile cryptocurrency wallets and desktop-based decentralized applications
- Hardhat - A development environment for Ethereum and EVM-compatible blockchains that helps developers compile, deploy, test, and debug Solidity smart contracts
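The Hashcash and Proof of work entries above describe proving that computational effort was spent. The following is a simplified sketch of the idea, not the actual Hashcash stamp format or any blockchain's consensus rules: it searches for a nonce whose SHA-256 hash has a given number of leading zero bits. The function names and the `message:nonce` encoding are assumptions made for illustration.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the number of leading zero bits in a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def _digest(message: str, nonce: int) -> bytes:
    # Hypothetical encoding: message and nonce joined by a colon.
    return hashlib.sha256(f"{message}:{nonce}".encode("utf-8")).digest()


def mine(message: str, difficulty: int) -> int:
    """Find the first nonce whose hash meets the difficulty (expensive)."""
    nonce = 0
    while leading_zero_bits(_digest(message, nonce)) < difficulty:
        nonce += 1
    return nonce


def verify(message: str, nonce: int, difficulty: int) -> bool:
    """Check a claimed nonce with a single hash (cheap)."""
    return leading_zero_bits(_digest(message, nonce)) >= difficulty


nonce = mine("hello", 12)
print(nonce, verify("hello", nonce, 12))
```

The asymmetry is the point of the design: finding a valid nonce takes about 2^difficulty hash attempts on average, while checking one takes a single hash.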
Decentralized Social
- ActivityPub - A decentralized social networking protocol based on the ActivityStreams 2.0 data format
- AT Protocol - An open data network for building social applications where users own their identities and content is represented as interlinked JSON records
- Fediverse - An ensemble of interconnected servers that are used for web publishing and file hosting, but which can communicate with each other
- Mastodon - A free, open-source, decentralized social media platform that puts users in control of their feeds without algorithms or ads, allowing independent servers to interoperate through the ActivityPub protocol
- Bluesky - A microblogging social media service and a public benefit corporation based in the United States
- Nostr - An open, decentralized social protocol that uses cryptographic signatures to enable censorship-resistant communication across multiple independent servers called relays
- Matrix - An open standard and communication protocol for real-time communication that enables seamless communication between different service providers
- PeerTube - A free, open-source tool for creating independent video hosting platforms that connect to form a decentralized network, offering an alternative to centralized services
- Lemmy - A decentralized discussion platform that allows users to control their experience without corporate tracking or advertising
- Diaspora - A nonprofit, user-owned, distributed social network consisting of independently owned nodes called pods that interoperate to form the network
- Secure Scuttlebutt - A decentralized social network platform that enables local community development free from corporate data harvesting
Development & Testing Tools
Relevant DSS-P Skills
- 3. Technology > 3.1 Software Development > Computer Science
- 3. Technology > 3.1 Software Development > Team Development
Web/HTTP Clients
- HTTP CLI Tools
- cURL - A command line tool and library for transferring data with URLs
- Wget - A free software package for retrieving files using HTTP, HTTPS, FTP and FTPS
- curlie - The power of curl, the ease of use of httpie
- hurl - A command line tool that runs HTTP requests defined in a simple plain text format
- httpie cli - A simple yet powerful command-line HTTP and API testing client for the API era
- wuzz - An interactive CLI tool for HTTP inspection
- httptap - A tool to view the HTTP and HTTPS requests made by any Linux program
- HTTP Client Libraries
- Python Requests - An elegant and simple HTTP library for Python, built for human beings
- JS Axios - A promise-based HTTP Client for node.js and the browser
- Go Resty - A simple HTTP and REST client library for Go
- Go FastHTTP - A fast HTTP package for Go
- Surf - An advanced Go HTTP client with Chrome/Firefox browser impersonation, HTTP/3 with QUIC fingerprinting, JA3/JA4 TLS emulation, and anti-bot bypass
- Typhoeus - A library that wraps libcurl in order to make fast and reliable requests
- Ruby Net - A collection of classes that implement client-side internet protocols
- httpx - An HTTP client library for the Ruby programming language
- wreq-ruby - An easy and powerful Ruby HTTP client with advanced browser fingerprinting that accurately emulates various browsers with precise TLS/HTTP2 signatures
- Rust reqwest - An ergonomic, async HTTP client
- GraphQL Libraries
- URQL - The highly customizable and versatile GraphQL client for React, Svelte, Vue, or plain JavaScript
- API Testing Platforms
- Bruno - A Git-integrated, fully offline, and open-source API client
- Postman/Newman - An API platform for building and using APIs
- Classic Web Automation
- Mechanize - A module that helps you automate interaction with a website
- Mechanize (Ruby) - A ruby library that makes automated web interaction easy
Web Debugging Tools
- Chrome DevTools - A set of web developer tools built directly into the Google Chrome browser
- Firefox Developer Tools - A set of web developer tools built into Firefox that allow you to examine, edit, and debug HTML, CSS, and JavaScript
- React Developer Tools - A browser extension and standalone debugger that allows developers to inspect React components, edit props and state, and identify performance problems in React applications
- Vue.js devtools - A browser extension for debugging Vue.js applications that provides component inspection and state management debugging
- Redux DevTools - A development tool that provides power-ups for Redux development workflow, including hot reloading, action replay, and customizable UI
- Lighthouse - An open-source, automated tool that helps improve web page quality by auditing performance, accessibility, SEO, and best practices
- Fiddler - A free web debugging proxy for any browser, system or platform
- Charles Proxy - An HTTP proxy/monitor that enables developers to view all HTTP and SSL/HTTPS traffic between their machine and the Internet, including requests, responses, and headers
- mitmproxy - A free and open source interactive HTTPS proxy that can intercept, inspect, modify, and replay web traffic for debugging, testing, and penetration testing purposes
- Requestly - An HTTP interceptor that allows developers to modify URLs, headers, and API responses in real-time for debugging and testing
Web Test Automation Frameworks
- Browser Automation & Testing
- Puppeteer - A Node.js library which provides a high-level API to control Chrome/Chromium over the DevTools Protocol
- Playwright - A framework for reliable end-to-end testing for modern web apps with a single API for Chromium, Firefox, and WebKit
- Playwright for Go - A Go library to automate Chromium, Firefox and WebKit with a single API
- Cypress - An open-source, JavaScript-based testing framework that enables developers to write, run, and debug end-to-end and component tests directly in the browser for modern web applications
- WebDriver - A remote control interface that enables introspection and control of user agents
- Selenium WebDriver - A tool that drives a browser natively, as a user would, either locally or on a remote machine
- WebDriver BiDi - The BiDirectional WebDriver Protocol, a mechanism for remote control of user agents
- Selenium IDE - An open-source record-and-playback test automation tool for the web
- Chrome DevTools Protocol (CDP) - A low-level API that allows external tools to instrument, inspect, debug, and profile Chromium-based browsers
- Karma - A test runner that spawns a web server and executes source code against test code for each of the connected browsers
- Supporting Tools
- Chrome for Testing - A new flavor of Chrome that specifically targets web app testing and automation use cases
- Accessibility Testing
- axe-core - An accessibility testing engine for websites and other HTML-based user interfaces
- AI-powered Web Automation
- browser-use - An open-source Python library that allows AI agents to interact with web browsers using natural language
- Web Scraping
- Crawlee - A web scraping and browser automation library
- BeautifulSoup - A Python library designed for quick turnaround projects like screen-scraping
- Scrapy - An open source and collaborative framework for extracting the data you need from websites
- Colly - A Golang framework for building web scrapers
- Katana - A next-generation crawling and spidering framework
- Trafilatura - A Python package and command-line tool to gather text on the Web
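The scraping tools listed above all build on the same basic step: parsing HTML and pulling out the parts of interest. As a rough illustration of that step, the sketch below extracts absolute link URLs from a page using only the Python standard library rather than any of the listed libraries. The class name `LinkExtractor`, the helper `extract_links`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> List[str]:
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample page for illustration only.
page = '<p><a href="/docs">Docs</a> <a href="https://example.org/x">X</a> <a name="n">no</a></p>'
print(extract_links(page, "https://example.com/start"))
```

A real crawler would add fetching, politeness rules such as robots.txt handling and rate limiting, and deduplication of visited URLs, which is where the dedicated frameworks come in.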