Layout analysis, motion detection, and quality evaluation -- vectorized into 768-dimensional embeddings and searchable via 19 MCP tools with HNSW cosine similarity at 10.66ms P95.
10.66ms vector search P95
768d embedding space
37 database models
10 HNSW indexes
Chapter I
The terrain we survey
0 total embeddings across 4 survey domains
01 domain
Layout Topology
0 section embeddings
Every webpage has terrain. Hero sections rise like plateaus. Feature grids tile the valley floor. Footers anchor the southern boundary. We survey each formation and record its coordinates in 768-dimensional vector space -- indexed via HNSW for fast approximate retrieval (10.66ms P95).
CSS transitions ripple through the DOM like seismic waves. GSAP tweens pulse at specific frequencies. WebGL shaders paint at 60fps. We capture these vibrations across three detection channels, recording each pattern's duration, easing curve, and trigger conditions.
Beneath the visible surface lie layers of gradients, textures, and color fields. Linear, radial, conic, mesh -- each geological layer tells the story of a design decision. We catalog the full depth across seven distinct strata types.
Every design carries weather. Some radiate professional calm. Others crackle with artistic energy. We measure the atmospheric pressure of mood, tone, and emotional register -- classifying each site into one of sixteen distinct atmospheric profiles.
Unified terrain survey in a single pass. Analyzes layout structure, motion patterns, background designs, and quality scores simultaneously. Vision analysis via Ollama (llama3.2-vision) is enabled by default. Supports async mode for heavy sites and auto_timeout for dynamic complexity detection.
Four stages transform unstructured web pages into a searchable vector space -- a cartographic process of surveying, decomposing, encoding, and indexing across 37 models, 10 HNSW indexes, and 768-dimensional vectors.
01 Ingest
Capture the territory
37 DB models
Playwright captures full-page HTML and screenshots. DOMPurify 3.3 sanitizes the captured HTML with a 3-layer strategy. SSRF protection blocks private IPs and cloud metadata services. External CSS is resolved and inlined.
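The SSRF check described above can be sketched roughly as follows. This is a minimal illustration using Python's standard library, not the project's actual implementation: the function names, the `METADATA_HOSTS` set, and the injectable `resolve` parameter are all assumptions made for the example.

```python
import ipaddress
import socket
from urllib.parse import urlparse

# Hypothetical, non-exhaustive list of metadata hostnames to refuse outright.
METADATA_HOSTS = {"metadata.google.internal"}


def is_blocked_ip(ip: str) -> bool:
    """Return True for addresses a crawler should never fetch."""
    addr = ipaddress.ip_address(ip)
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) before checking.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return (
        addr.is_private        # 10/8, 172.16/12, 192.168/16, fc00::/7, ...
        or addr.is_loopback    # 127/8, ::1
        or addr.is_link_local  # 169.254/16 (includes 169.254.169.254)
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )


def is_url_allowed(url: str, resolve=socket.getaddrinfo) -> bool:
    """Allow only http(s) URLs whose every resolved address is public."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    host = parsed.hostname.lower()
    if host in METADATA_HOSTS:
        return False
    try:
        infos = resolve(host, None)
    except socket.gaierror:
        return False
    # Each getaddrinfo entry ends with a sockaddr tuple; its first item is the IP.
    return all(not is_blocked_ip(info[4][0]) for info in infos)
```

A check like this only covers the initial URL. A real crawler would also need to re-validate redirects and pin the resolved address for the actual request, since DNS answers can change between the check and the fetch.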
Sections classified by type (hero, feature, cta, pricing ...). Motion patterns detected via CSS static parsing, JS runtime (CDP + Web Animations API), and WebGL frame diff analysis. Video mode captures at 15px/frame scroll resolution. Background designs decomposed from inline styles and stylesheet sources.
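To make the 15px/frame scroll resolution concrete, the sketch below computes the scroll positions at which a frame would be captured for a given page. The function name and parameters are hypothetical; only the 15-pixel step comes from the description above.

```python
def scroll_offsets(page_height: int, viewport_height: int, step_px: int = 15) -> list[int]:
    """Return the scroll-top positions (in px) to capture, one frame per step."""
    # The furthest the page can scroll; zero if it fits in the viewport.
    max_scroll = max(page_height - viewport_height, 0)
    offsets = list(range(0, max_scroll + 1, step_px))
    # Always include the very bottom, even if it is not a multiple of the step.
    if offsets[-1] != max_scroll:
        offsets.append(max_scroll)
    return offsets
```

Under these assumptions, a 1000px page in a 900px viewport yields eight frames: offsets 0 through 90 in 15px steps, plus a final frame at 100.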
multilingual-e5-base via ONNX Runtime encodes each pattern into a 768-dimensional vector. Vectors are L2-normalized, so cosine similarity reduces to a dot product. Batch processing handles 100 patterns in under 10 seconds. Average embedding generation: 46ms.
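The reason for L2 normalization can be shown in a few lines: once two vectors have unit length, their dot product equals their cosine similarity. The sketch below uses plain Python lists and short toy vectors in place of the 768-dimensional embeddings; the helper names are illustrative only.

```python
import math


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed directly from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy 2-d vectors standing in for real embeddings.
a, b = [3.0, 4.0], [4.0, 3.0]
unit_a, unit_b = l2_normalize(a), l2_normalize(b)

# After normalization the dot product matches the cosine similarity.
assert abs(dot(unit_a, unit_b) - cosine_similarity(a, b)) < 1e-12
```

Normalizing once at encoding time means each query-time comparison needs only a dot product, with no per-comparison norm computation.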