.cvextract.json Schema v1: VLM → cv-web Bridge
Status: v1 (CV-336, Sprint 22).
Producer: `cv-blueprint-mcp` (CV-332), which wraps Qwen2.5-VL / Qwen3-VL via Ollama.
Consumer: cv-web import (CV-180 follow-up), eventually cv-cad ingestion.
Source of truth: `webpage/Simplestruct/cv-web/src/types/cvextract.ts`.
Why this exists
The blueprint-ingestion story for ConstructiVision is “scan a tilt-up panel drawing PDF → land structured panel data in cv-web (or cv-cad) without re-typing.” A local VLM does the heavy lift; the schema below is the contract between the model and the rest of the toolchain.
The schema is deliberately scoped to what a VLM can plausibly extract from a single drawing in one pass:
- panel mark
- overall dimensions
- openings (doors, windows, louvers)
- block-outs / recesses
- embeds (weld plates, lifting inserts, anchor bolts)
- coarse reinforcement call-outs (free text, not a full rebar layout)
Engineered detail that requires structural calculation — full rebar schedule, slab dowels, bracing inserts with hold-down loads, footing-pile geometry — stays out of v1. Those belong to the engineer’s review pass downstream of ingestion.
What this is not
Not a port of the legacy .pnl format. The .pnl schema (decoded by
src/x32/TB11-01x32/makepan.LSP) carries ~1,700 fields across 21 var-groups —
because it describes a fully engineered panel that’s already been through
CV. .cvextract.json describes what was seen on a scanned drawing, with
extraction provenance baked in. The two schemas overlap conceptually but
intentionally don’t share field names.
What the legacy format taught us, and what v1 carries forward:
- Version stamp at the root. `.pnl` opens with `mpv#` = "V3.60"; we use `schema: "cvextract-v1"`. A mismatched stamp aborts import.
- Source-of-truth integrity check. `.pnl` rejects a file whose `mppn` (project name) doesn't match the path it was loaded from. v1's `source.pdf` + optional `source.sha256` play the same role.
- Consistent, predictable field naming. `.pnl` field names are terse but always follow `<group><field><index>`. v1 prefers full property names but keeps grouping (`dimensions`, `openings`, `blockouts`, `embeds`).
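The stamp and source checks described above could look roughly like this on the consumer side. This is a minimal sketch, not the code in `cvextract.ts`: `checkStamp`, `EnvelopeHead`, and the message strings are hypothetical, and comparing `source.pdf` against the loaded file's basename is an assumed reading of the integrity check.

```typescript
// Hypothetical helper, for illustration only; not part of cv-web.
interface EnvelopeHead {
  schema?: unknown;
  source?: { pdf?: unknown; sha256?: unknown };
}

const EXPECTED_SCHEMA = "cvextract-v1";

// Returns a list of problems; an empty list means both checks passed.
function checkStamp(doc: EnvelopeHead, loadedPdfPath: string): string[] {
  const problems: string[] = [];

  // Version stamp: must be the exact literal.
  if (doc.schema !== EXPECTED_SCHEMA) {
    problems.push(`unexpected schema stamp: ${String(doc.schema)}`);
  }

  // Integrity check: source.pdf is a bare filename, so compare basenames.
  const baseName = loadedPdfPath.split(/[\\/]/).pop() ?? "";
  if (doc.source?.pdf !== baseName) {
    problems.push(`source.pdf does not match loaded file "${baseName}"`);
  }
  return problems;
}

const stampOk = checkStamp(
  { schema: "cvextract-v1", source: { pdf: "AAB001.pdf" } },
  "drawings/AAB001.pdf",
);
const stampBad = checkStamp(
  { schema: "cvextract-v2", source: { pdf: "AAB001.pdf" } },
  "drawings/AAB002.pdf",
);
// stampOk is empty; stampBad reports both the stamp and the filename mismatch.
```

A hash comparison against `source.sha256` would slot in as a third check when the field is present; it is left out here to keep the sketch dependency-free.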
What v1 adds that .pnl can't carry:
- Evidence. Every datum may include a `bbox`, a `page`, and the raw OCR/VLM text chunk that produced it. This is essential for a human reviewer to audit a VLM extraction before promoting it to PanelData.
- Confidence. A 0..1 score per panel, per opening, per measurement: the reviewer's triage signal.
- Extractor metadata. Model tag, prompt version, timestamp. Lets us re-run on a new model without losing the trail.
- Raw VLM response. The optional `raw` field stores the un-parsed model output for re-parse / audit.
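As an illustration of confidence as a triage signal, a reviewer queue might be built like this. It is a sketch under assumptions: `Reviewable`, `triage`, and the 0.9 threshold are hypothetical, and treating a missing score as least trusted is a choice made for this example, not something the schema prescribes.

```typescript
// Hypothetical shape: any extracted item that may carry a confidence score.
interface Reviewable {
  label: string;
  confidence?: number; // 0..1 when present
}

// Keep items below the threshold and list the least trusted first.
// Items without a score are treated as 0 (an assumption of this sketch).
function triage(items: Reviewable[], threshold = 0.9): Reviewable[] {
  return items
    .filter((it) => (it.confidence ?? 0) < threshold)
    .sort((a, b) => (a.confidence ?? 0) - (b.confidence ?? 0));
}

const queue = triage([
  { label: "panel AAB001", confidence: 0.92 },
  { label: "opening D1", confidence: 0.85 },
  { label: "thickness" },
]);
// queue holds "thickness" (no score) first, then "opening D1" (0.85).
```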
Envelope
```json
{
  "schema": "cvextract-v1",
  "source": {
    "pdf": "AAB001.pdf",
    "sha256": "…",
    "totalPages": 1,
    "pagesExtracted": [1]
  },
  "extractor": {
    "model": "qwen2.5vl:7b",
    "promptVersion": "cv-blueprint-mcp@0.1",
    "timestamp": "2026-05-27T21:00:00Z",
    "durationMs": 8420
  },
  "panels": [
    {
      "mark": "AAB001",
      "page": 1,
      "dimensions": {
        "width": { "raw": "24'-0\"", "feet": 24, "inches": 0, "unit": "ft-in" },
        "height": { "raw": "32'-6\"", "feet": 32, "inches": 6, "unit": "ft-in" },
        "thickness": { "raw": "7-1/4\"", "inches": 7.25, "unit": "in" }
      },
      "openings": [
        {
          "type": "door",
          "mark": "D1",
          "width": { "raw": "3'-0\"", "feet": 3, "inches": 0, "unit": "ft-in" },
          "height": { "raw": "7'-0\"", "feet": 7, "inches": 0, "unit": "ft-in" },
          "position": {
            "x": { "raw": "5'-0\"", "feet": 5, "inches": 0, "unit": "ft-in" },
            "y": { "raw": "0'-0\"", "feet": 0, "inches": 0, "unit": "ft-in" },
            "origin": "panel-bl"
          },
          "evidence": { "page": 1, "bbox": [0.10, 0.40, 0.22, 0.70] },
          "confidence": 0.85
        }
      ],
      "blockouts": [],
      "embeds": [],
      "reinforcement": {
        "bars": ["#5@12 EW", "(4)#7 vert ea face"],
        "notes": ["Provide additional rebar at openings per detail 7/S3.1"]
      },
      "evidence": { "page": 1 },
      "confidence": 0.92
    }
  ]
}
```
Field reference
Root
| Field | Type | Required | Notes |
|---|---|---|---|
| `schema` | | yes | Exact literal. New fields are additive within v1; breaking changes bump the version. |
| `source` | | yes | Where the data came from. |
| `extractor` | | yes | Which model produced it. |
| `panels` | | yes | One entry per panel detected. Empty array means “no panels found.” |
| | | no | Extractor-level warnings (truncated page, unreadable region, etc). |
| `raw` | | no | Full unparsed VLM response; keep for audit. |
SourceRef
| Field | Type | Required | Notes |
|---|---|---|---|
| `pdf` | | yes | Filename only, no path. |
| `sha256` | | no | SHA-256 of the source PDF, hex-encoded. |
| `totalPages` | | no | Convenience for downstream tools. |
| `pagesExtracted` | | yes | 1-based page indices the extractor actually processed. |
ExtractorMeta
| Field | Type | Required | Notes |
|---|---|---|---|
| `model` | | yes | Ollama model tag (e.g. `qwen2.5vl:7b`). |
| `promptVersion` | | yes | Identifies the prompt template. |
| `timestamp` | | yes | ISO-8601 UTC. |
| `durationMs` | | no | Total extraction wall-clock time. |
ExtractedPanel
| Field | Type | Required | Notes |
|---|---|---|---|
| `mark` | | yes | Panel mark, e.g. `AAB001`. |
| `page` | | yes | 1-based primary page where the panel was detected. |
| `dimensions` | | yes | width / height / optional thickness. |
| `openings` | | yes | May be empty. |
| `blockouts` | | yes | May be empty. |
| `embeds` | | yes | May be empty. |
| `reinforcement` | | no | Free-text call-outs only. |
| `evidence` | | yes | Where on the PDF this panel sits. |
| `confidence` | | yes | 0..1. |
Measurement
Every dimensioned value uses this shape.
| Field | Type | Required | Notes |
|---|---|---|---|
| `raw` | | yes | Verbatim text from the drawing. Single source of truth on disagreement. |
| `feet` | | no | Parsed, best-effort. |
| `inches` | | no | Parsed, best-effort. May be fractional. |
| `unit` | | yes | The unit the measurement is expressed in, e.g. `ft-in` or `in`. |
| `evidence` | | no | Where on the page the value was read from. |
| `confidence` | | no | 0..1. |
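Downstream code usually wants a single number rather than a feet/inches pair. The following sketch collapses the parsed parts into total inches; the `Measurement` interface here is a trimmed, assumed version of the real type, and `toInches` is a hypothetical helper. Because `raw` remains the source of truth, the result is a convenience value only.

```typescript
// Trimmed, assumed shape; the authoritative type lives in cvextract.ts.
interface Measurement {
  raw: string;
  feet?: number | null;
  inches?: number | null;
  unit: string;
}

// Returns total inches, or null when neither parsed part is available.
function toInches(m: Measurement): number | null {
  const hasFeet = typeof m.feet === "number";
  const hasInches = typeof m.inches === "number";
  if (!hasFeet && !hasInches) return null;
  return (m.feet ?? 0) * 12 + (m.inches ?? 0);
}

// Values taken from the envelope example.
const panelHeightIn = toInches({ raw: "32'-6\"", feet: 32, inches: 6, unit: "ft-in" }); // 32 * 12 + 6 = 390
const thicknessIn = toInches({ raw: "7-1/4\"", inches: 7.25, unit: "in" }); // 7.25
const unparsedIn = toInches({ raw: "illegible", unit: "ft-in" }); // null
```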
Opening
| Field | Type | Required | Notes |
|---|---|---|---|
| `type` | | yes | |
| `mark` | | no | Panel-local mark, e.g. `D1`. |
| | | yes | |
| | | no | Distance from panel base to opening bottom. |
| `position` | | no | Within-panel position. |
| | | no | |
| | | no | |
Blockout
Recessed area on the panel face: electrical panel recess, plumbing chase, etc.
| Field | Type | Required | Notes |
|---|---|---|---|
| | | no | Free-text intent. |
| | | yes | |
| | | no | |
| | | no | |
| | | no | |
| | | no | |
Embed
Weld plates, lifting inserts, bracing inserts, anchor bolts. Geometry is typically too small for a VLM to fully resolve in one pass, so v1 captures existence + mark + approximate position and leaves detailed sizing to engineer review.
| Field | Type | Required | Notes |
|---|---|---|---|
| | | no | Free-text type. |
| | | no | Panel-local mark. |
| | | no | |
| | | no | |
| | | no | |
Reinforcement
Coarse-grained, free text only. Detailed bar-by-bar layout is out of scope for v1.
| Field | Type | Required | Notes |
|---|---|---|---|
| `bars` | | no | Raw call-outs, e.g. `#5@12 EW`. |
| `notes` | | no | Free-text drawing notes. |
| | | no | |
Position
| Field | Type | Required | Notes |
|---|---|---|---|
| `x`, `y` | | no | Distance from origin. |
| `origin` | | no | Coordinate origin convention. |
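When a position is drawn on screen, the origin convention usually has to be flipped. The sketch below assumes that `"panel-bl"` means the panel's bottom-left corner with x to the right and y upward; that reading, the `Rect` type, and `blToTopLeft` are all assumptions for illustration, so check the origin values defined in `cvextract.ts` before relying on it.

```typescript
// All lengths in inches. Assumed convention: "panel-bl" = bottom-left origin, y upward.
interface Rect {
  x: number;      // distance from the panel's left edge
  y: number;      // distance from the reference edge (bottom or top)
  width: number;
  height: number;
}

// Convert a bottom-left-origin rectangle to a top-left-origin one
// (the convention most 2D canvases use), given the panel height.
function blToTopLeft(r: Rect, panelHeightIn: number): Rect {
  return { ...r, y: panelHeightIn - (r.y + r.height) };
}

// Door D1 from the envelope example: 3'-0" x 7'-0" at x = 5'-0", y = 0'-0",
// on a panel that is 32'-6" (390 in) tall.
const doorTopLeft = blToTopLeft({ x: 60, y: 0, width: 36, height: 84 }, 390);
// doorTopLeft.y is 390 - (0 + 84) = 306; x, width, and height are unchanged.
```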
Evidence
| Field | Type | Required | Notes |
|---|---|---|---|
| `page` | | no | 1-based page index. |
| `bbox` | | no | Normalized 0..1 image coords. |
| | | no | Raw OCR/VLM chunk supporting the datum. |
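To highlight evidence in a reviewer UI, the normalized box has to be scaled to the rendered page image. This sketch assumes the four numbers are ordered `[x0, y0, x1, y1]` with the origin at the top-left of the page image; the ordering, the `BBox` and `PixelRect` types, and `bboxToPixels` are assumptions, not something this document specifies.

```typescript
// Assumed ordering: [x0, y0, x1, y1], normalized 0..1, origin at top-left.
type BBox = [number, number, number, number];

interface PixelRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Scale each corner to pixels first, then derive the size,
// so rounding never produces a box that drifts off its corners.
function bboxToPixels(bbox: BBox, pageWidthPx: number, pageHeightPx: number): PixelRect {
  const [x0, y0, x1, y1] = bbox;
  const left = Math.round(x0 * pageWidthPx);
  const top = Math.round(y0 * pageHeightPx);
  const right = Math.round(x1 * pageWidthPx);
  const bottom = Math.round(y1 * pageHeightPx);
  return { left, top, width: right - left, height: bottom - top };
}

// The door's bbox from the envelope example, on a hypothetical 1000 x 2000 px render.
const doorRect = bboxToPixels([0.10, 0.40, 0.22, 0.70], 1000, 2000);
// doorRect: { left: 100, top: 800, width: 120, height: 600 }
```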
Versioning policy
- New optional fields are additive within v1. Consumers MUST ignore unknown fields.
- Renames, type changes, or required-field additions bump the schema to v2.
- The parser will accept both `cvextract-v1` and `cvextract-v2` once v2 ships, with a migration helper.
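One way a consumer can honor the "ignore unknown fields" rule is to copy only the keys it understands. The sketch below is illustrative: `pickKnown` is a hypothetical helper, the key list is taken from the `source` object in the envelope example, and `dpi` stands in for some future additive field that does not exist in v1.

```typescript
// Keys this consumer understands for the source object (from the envelope example).
const KNOWN_SOURCE_KEYS = ["pdf", "sha256", "totalPages", "pagesExtracted"] as const;

// Copy only recognized keys; anything else is silently dropped.
function pickKnown(
  obj: Record<string, unknown>,
  keys: readonly string[],
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const k of keys) {
    if (Object.prototype.hasOwnProperty.call(obj, k)) out[k] = obj[k];
  }
  return out;
}

// "dpi" is a made-up future field used only to show the behavior.
const knownSource = pickKnown(
  { pdf: "AAB001.pdf", totalPages: 1, pagesExtracted: [1], dpi: 300 },
  KNOWN_SOURCE_KEYS,
);
// knownSource keeps pdf, totalPages, and pagesExtracted; dpi is dropped.
```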
Validation contract
```ts
import { parseCVExtract, validateCVExtract } from '../types/cvextract';

const errs = validateCVExtract(jsonFromVLM);
if (errs.length > 0) {
  // surface to the reviewer; do not auto-import
}

// throws on invalid input:
const extract = parseCVExtract(jsonFromVLM);
```
Validation is intentionally permissive — producers may include additional
fields, and null is the explicit “I asked but couldn’t tell” signal for any
optional dimension. The reviewer UI MUST surface null differently from a
missing field, even though both pass validation.
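The null-versus-missing distinction can be made explicit with an own-property check. This is a sketch only: `FieldState` and `fieldState` are hypothetical names, and how the reviewer UI renders each state is up to the consumer.

```typescript
type FieldState = "present" | "unknown" | "missing";

// "unknown" = the extractor looked and could not tell (explicit null);
// "missing" = the key was never emitted.
function fieldState(obj: Record<string, unknown>, key: string): FieldState {
  if (!Object.prototype.hasOwnProperty.call(obj, key)) return "missing";
  return obj[key] === null ? "unknown" : "present";
}

const dimsWithNull = { width: { raw: "24'-0\"" }, height: { raw: "32'-6\"" }, thickness: null };
const dimsWithout = { width: { raw: "24'-0\"" }, height: { raw: "32'-6\"" } };

const stateWidth = fieldState(dimsWithNull, "width");     // "present"
const stateNull = fieldState(dimsWithNull, "thickness");  // "unknown"
const stateAbsent = fieldState(dimsWithout, "thickness"); // "missing"
```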