Architecture

QuickCrawl uses a layered architecture that combines HTTP fetching, headless browser automation, and LLM-powered structured extraction.

Layers

Client (HTTP / MCP / CLI)
         ↓
   Gin Router
         ↓
   API Handlers
         ↓
┌────────┴────────┐
│    Renderer     │
│  (HTTP / CDP)   │
└────────┬────────┘
         ↓
┌─────────────────┐
│   Extractor     │
│ (Markdown, HTML,│
│  Links, JSON)   │
└─────────────────┘
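
The hand-off between the handler, renderer, and extractor layers can be pictured with a small Go sketch. Every name here (`Page`, `Renderer`, `Extractor`, `handle`, and the two stand-in implementations) is hypothetical and chosen for illustration; QuickCrawl's real types in `internal/core/` and `internal/extractor/` may look quite different.

```go
package main

import (
	"fmt"
	"strings"
)

// Page is a hypothetical container for what the renderer hands to the extractor.
type Page struct {
	URL  string
	HTML string
}

// Renderer is an assumed interface for the HTTP / CDP layer.
type Renderer interface {
	Render(url string) (Page, error)
}

// Extractor is an assumed interface for the output-format layer.
type Extractor interface {
	Extract(p Page) (string, error)
}

// staticRenderer returns canned HTML so the example runs without network access.
type staticRenderer struct{ html string }

func (r staticRenderer) Render(url string) (Page, error) {
	return Page{URL: url, HTML: r.html}, nil
}

// titleExtractor returns the text between <title> tags.
// It stands in for a real Markdown/Links/JSON extractor.
type titleExtractor struct{}

func (titleExtractor) Extract(p Page) (string, error) {
	start := strings.Index(p.HTML, "<title>")
	end := strings.Index(p.HTML, "</title>")
	if start < 0 || end < start {
		return "", fmt.Errorf("no title in %s", p.URL)
	}
	return p.HTML[start+len("<title>") : end], nil
}

// handle mirrors the handler layer: render first, then extract.
func handle(r Renderer, e Extractor, url string) (string, error) {
	page, err := r.Render(url)
	if err != nil {
		return "", fmt.Errorf("render %s: %w", url, err)
	}
	return e.Extract(page)
}

func main() {
	r := staticRenderer{html: "<html><title>Hello</title></html>"}
	out, err := handle(r, titleExtractor{}, "https://example.com")
	fmt.Println(out, err)
}
```

Keeping the renderer and extractor behind separate interfaces means a handler can swap HTTP for CDP, or Markdown for JSON, without touching the other layer.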

Components

Component	File	Responsibility
HTTP Fetcher	`internal/core/http.go`	Plain HTTP GET with stealth headers, retries
CDP Browser	`internal/core/renderer.go`	Headless Chrome via chromedp
Extractor	`internal/extractor/`	HTML → Markdown, Plain Text, Links
Crawler	`internal/crawler/crawl.go`	BFS site crawling with robots.txt
Sitemap	`internal/crawler/map.go`	URL discovery via sitemap.xml
Search	`internal/search/`	SearXNG + optional result scraping
LLM Extraction	`internal/core/llm.go`	JSON schema-based extraction
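
To illustrate the Crawler row, here is a minimal breadth-first crawl sketch in Go. It is not taken from `internal/crawler/crawl.go`: `bfsCrawl`, the `links` callback (which stands in for fetching a page and extracting its links), and the `allowed` callback (which stands in for a robots.txt check) are all assumptions made for this example.

```go
package main

import "fmt"

// bfsCrawl visits URLs breadth-first from seed, following at most maxDepth hops.
// links and allowed are hypothetical callbacks: links returns the outgoing URLs
// of a page, allowed reports whether robots.txt would permit fetching a URL.
func bfsCrawl(seed string, maxDepth int, links func(string) []string, allowed func(string) bool) []string {
	type item struct {
		url   string
		depth int
	}
	if !allowed(seed) {
		return nil
	}
	visited := map[string]bool{seed: true}
	queue := []item{{seed, 0}}
	var order []string

	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		order = append(order, cur.url)

		// Do not expand pages that already sit at the depth limit.
		if cur.depth == maxDepth {
			continue
		}
		for _, next := range links(cur.url) {
			if visited[next] || !allowed(next) {
				continue
			}
			visited[next] = true
			queue = append(queue, item{next, cur.depth + 1})
		}
	}
	return order
}

func main() {
	// A tiny in-memory site graph replaces real HTTP fetches.
	graph := map[string][]string{
		"/":  {"/a", "/b", "/private"},
		"/a": {"/c", "/"},
		"/b": {"/c"},
	}
	links := func(u string) []string { return graph[u] }
	allowed := func(u string) bool { return u != "/private" }

	fmt.Println(bfsCrawl("/", 2, links, allowed))
}
```

Marking a URL as visited when it is enqueued, rather than when it is dequeued, prevents the same page from entering the queue twice.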

Render Modes

Every fetch goes through one of three render strategies:

Mode	Description
`http`	Plain HTTP GET, no JavaScript
`browser`	Headless Chrome via CDP (full JS rendering)
`auto`	HTTP first, escalating to the browser when needed
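
The three modes can be sketched as a small dispatcher in Go. The page does not say what triggers escalation in `auto` mode, so the `needsBrowser` heuristic below (empty body or a `<noscript>` tag) is purely an assumption, as are the names `Mode`, `fetchFunc`, and `fetch`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Mode names one of the three render strategies from the table above.
type Mode string

const (
	ModeHTTP    Mode = "http"
	ModeBrowser Mode = "browser"
	ModeAuto    Mode = "auto"
)

// fetchFunc abstracts a render strategy so the sketch runs without a network.
type fetchFunc func(url string) (string, error)

// needsBrowser is an ASSUMED heuristic, not QuickCrawl's actual rule:
// escalate when the HTTP body is empty or hints that JavaScript is required.
func needsBrowser(html string) bool {
	return strings.TrimSpace(html) == "" || strings.Contains(html, "<noscript>")
}

// fetch returns the HTML and the mode that actually produced it.
func fetch(mode Mode, url string, viaHTTP, viaBrowser fetchFunc) (string, Mode, error) {
	switch mode {
	case ModeHTTP:
		html, err := viaHTTP(url)
		return html, ModeHTTP, err
	case ModeBrowser:
		html, err := viaBrowser(url)
		return html, ModeBrowser, err
	case ModeAuto:
		// Try the cheap path first; fall back on error or a thin response.
		html, err := viaHTTP(url)
		if err == nil && !needsBrowser(html) {
			return html, ModeHTTP, nil
		}
		html, err = viaBrowser(url)
		return html, ModeBrowser, err
	default:
		return "", mode, errors.New("unknown render mode: " + string(mode))
	}
}

func main() {
	viaHTTP := func(string) (string, error) {
		return "<noscript>Enable JavaScript</noscript>", nil
	}
	viaBrowser := func(string) (string, error) {
		return "<main>rendered</main>", nil
	}
	html, used, err := fetch(ModeAuto, "https://example.com", viaHTTP, viaBrowser)
	fmt.Println(html, used, err)
}
```

Returning the mode that was actually used lets a caller log how often `auto` falls back to the browser, which is the more expensive path.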

Sections

Scrape — Single URL fetching with HTTP/browser/auto modes
Crawl — Async BFS website crawling
Map — URL discovery without content extraction
Search — SearXNG search with optional result scraping
