Configuration

QuickCrawl loads configuration in this order (later layers override earlier):

  1. Defaults — hardcoded sensible defaults
  2. quickcrawl.toml — TOML config file
  3. Environment variables — always override (including .env loaded at startup)
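
The layering above can be pictured as a simple ordered merge. The following is a minimal sketch, not QuickCrawl's actual loader: the Layer type, the Merge function, the flat dotted keys, and every value shown are assumptions made for illustration.

```go
package main

import "fmt"

// Layer is a flat map from a dotted config path to a raw value,
// e.g. "server.port" -> "3000". This flat shape is hypothetical and
// chosen only to keep the example short.
type Layer map[string]string

// Merge applies layers in order, so a key set in a later layer
// replaces the same key from an earlier layer.
func Merge(layers ...Layer) Layer {
	out := Layer{}
	for _, layer := range layers {
		for key, value := range layer {
			out[key] = value
		}
	}
	return out
}

func main() {
	// All values below are made up for the example.
	defaults := Layer{"server.port": "8080", "crawler.default_max_depth": "2"}
	file := Layer{"server.port": "3000"} // from quickcrawl.toml
	env := Layer{"server.port": "4000"}  // from SERVER__PORT

	cfg := Merge(defaults, file, env)
	fmt.Println(cfg["server.port"], cfg["crawler.default_max_depth"]) // prints "4000 2"
}
```

Keys that only appear in an earlier layer survive the merge, which is why a partial quickcrawl.toml or a single environment variable is enough to change one setting.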

TOML Config File

Place quickcrawl.toml in the same directory as the binary, or set the CONFIG env var to point to a custom path.

CONFIG=/path/to/custom.toml quickcrawl server
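
As a rough sketch of the lookup rule described above, the path could be resolved as follows. ResolveConfigPath is a hypothetical helper, and using os.Executable to find the binary's directory is an assumption about how "the same directory as the binary" might be determined.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ResolveConfigPath returns the CONFIG override when it is set,
// otherwise quickcrawl.toml next to the given executable path.
// Hypothetical helper, not part of QuickCrawl's documented API.
func ResolveConfigPath(configEnv, exePath string) string {
	if configEnv != "" {
		return configEnv
	}
	return filepath.Join(filepath.Dir(exePath), "quickcrawl.toml")
}

func main() {
	exe, err := os.Executable()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot locate executable:", err)
		os.Exit(1)
	}
	fmt.Println(ResolveConfigPath(os.Getenv("CONFIG"), exe))
}
```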

Example quickcrawl.toml

[server]
host = "0.0.0.0"
port = 3000
request_timeout_secs = 60
rate_limit_rps = 10

[renderer]
page_timeout_ms = 45000
pool_size = 4
render_mode = "auto"
browser = "cloak"

[renderer.chrome]
ws_url = ""

[crawler]
max_concurrency = 40
requests_per_second = 40.0
respect_robots_txt = false
default_max_depth = 2
default_max_pages = 100

[crawler.stealth]
enabled = true
strategy = "modern_browser"

[extraction.llm]
api_key = ""
model = "gpt-4o-mini"

[cache]
enabled = true
ttl_default_secs = 3600

[search]
base_url = ""
timeout_secs = 30

Environment Variables

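Environment variables always override TOML values. Names use the format SECTION__KEY, with a double underscore between each nesting level and single underscores kept inside key names. For example, ws_url under [renderer.chrome] becomes RENDERER__CHROME__WS_URL.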
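
To make the naming rule concrete, here is a small sketch that maps a variable name back to the TOML path it overrides. EnvToPath is a hypothetical helper written for this example, not a QuickCrawl function.

```go
package main

import (
	"fmt"
	"strings"
)

// EnvToPath converts a SECTION__KEY variable name into a dotted TOML
// path. Double underscores separate nesting levels; single underscores
// remain part of the key name.
func EnvToPath(name string) string {
	parts := strings.Split(name, "__")
	for i, part := range parts {
		parts[i] = strings.ToLower(part)
	}
	return strings.Join(parts, ".")
}

func main() {
	fmt.Println(EnvToPath("SERVER__PORT"))             // server.port
	fmt.Println(EnvToPath("RENDERER__CHROME__WS_URL")) // renderer.chrome.ws_url
	fmt.Println(EnvToPath("CACHE__TTL_DEFAULT_SECS"))  // cache.ttl_default_secs
}
```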

Server

Variable                       Description
SERVER__HOST                   Listen address
SERVER__PORT                   Listen port
SERVER__REQUEST_TIMEOUT_SECS   Request timeout (seconds)
SERVER__RATE_LIMIT_RPS         Rate limit (requests/sec)

Renderer

Variable                    Description
RENDERER__PAGE_TIMEOUT_MS   Page load timeout (milliseconds)
RENDERER__POOL_SIZE         Browser pool size
RENDERER__RENDER_MODE       Default render mode: auto, browser, http
RENDERER__BROWSER           Browser type: cloak, lightpanda, browserless
RENDERER__CHROME__WS_URL    Chrome CDP WebSocket URL

Crawler

Variable                       Description
CRAWLER__MAX_CONCURRENCY       Max concurrent crawls
CRAWLER__REQUESTS_PER_SECOND   Rate limit (requests/sec)
CRAWLER__RESPECT_ROBOTS_TXT    Follow robots.txt (true/false)
CRAWLER__DEFAULT_MAX_DEPTH     Default crawl depth
CRAWLER__DEFAULT_MAX_PAGES     Default max pages
CRAWLER__STEALTH__ENABLED      Enable stealth mode
CRAWLER__STEALTH__STRATEGY     Header strategy: modern_browser, mobile_device, bot_friendly

LLM Extraction

Variable                    Description
EXTRACTION__LLM__API_KEY    OpenAI API key
EXTRACTION__LLM__MODEL      Model name
EXTRACTION__LLM__BASE_URL   Custom API base URL

Search

Variable           Description
SEARCH__BASE_URL   SearXNG instance URL

Cache

Variable                  Description
REDIS_URL                 Redis connection URL
CACHE__ENABLED            Enable Redis cache
CACHE__TTL_DEFAULT_SECS   Default cache TTL (seconds)

.env File

QuickCrawl loads .env at startup via godotenv.Load(). Variables in .env follow the same SECTION__KEY format.

# .env
SERVER__PORT=3000
EXTRACTION__LLM__API_KEY=sk-...
SEARCH__BASE_URL=https://searx.example.com
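
For readers unfamiliar with the file format, the sketch below parses lines like the ones above using only the standard library. It is a simplified stand-in written for illustration, not godotenv's implementation: ParseDotEnv is a hypothetical name, and quoting, export prefixes, and variable expansion are deliberately left out.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// ParseDotEnv reads KEY=VALUE lines, skipping blank lines and lines
// that start with '#'. Each line is split at the first '=' only, so
// values may themselves contain '='.
func ParseDotEnv(text string) map[string]string {
	out := map[string]string{}
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=VALUE line
		}
		out[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return out
}

func main() {
	vars := ParseDotEnv("# .env\nSERVER__PORT=3000\nSEARCH__BASE_URL=https://searx.example.com\n")
	fmt.Println(vars["SERVER__PORT"], vars["SEARCH__BASE_URL"])
}
```

Note that godotenv.Load leaves variables that are already set in the process environment untouched, so a variable exported in the shell takes precedence over the same name in .env.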

Render Mode

Render mode controls how pages are fetched. Set it globally (render_mode under [renderer], or RENDERER__RENDER_MODE) or per request:

Value     Behavior
auto      HTTP first, escalate to browser on anti-bot / SPA signals (default)
browser   Always use browser (Chrome/LightPanda)
http      HTTP only, never use browser
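
The three modes can be sketched as a small dispatch function. This is an illustration under stated assumptions, not QuickCrawl's code: Fetcher, Fetch, and ErrNeedsBrowser are hypothetical names, and the real anti-bot / SPA detection is reduced here to a single sentinel error returned by the HTTP fetcher.

```go
package main

import (
	"errors"
	"fmt"
)

// Fetcher retrieves a page body for a URL (hypothetical type).
type Fetcher func(url string) (string, error)

// ErrNeedsBrowser stands in for whatever anti-bot or SPA signal the
// HTTP path detects (hypothetical sentinel).
var ErrNeedsBrowser = errors.New("page needs a browser")

// Fetch dispatches on the render mode. In "auto" it tries HTTP first
// and falls back to the browser only when the HTTP path reports
// ErrNeedsBrowser.
func Fetch(mode, url string, viaHTTP, viaBrowser Fetcher) (string, error) {
	switch mode {
	case "http":
		return viaHTTP(url)
	case "browser":
		return viaBrowser(url)
	case "auto":
		body, err := viaHTTP(url)
		if errors.Is(err, ErrNeedsBrowser) {
			return viaBrowser(url)
		}
		return body, err
	default:
		return "", fmt.Errorf("unknown render mode %q", mode)
	}
}

func main() {
	// Stub fetchers so the example runs without network access.
	httpFetch := func(url string) (string, error) { return "", ErrNeedsBrowser }
	browserFetch := func(url string) (string, error) { return "<html>rendered</html>", nil }

	body, err := Fetch("auto", "https://example.com", httpFetch, browserFetch)
	fmt.Println(body, err) // prints "<html>rendered</html> <nil>"
}
```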