.cvextract.json Schema v1 — VLM → cv-web Bridge

Status: v1 (CV-336, Sprint 22).

Producer: cv-blueprint-mcp (CV-332), which wraps Qwen2.5-VL / Qwen3-VL via Ollama.

Consumer: cv-web import (CV-180 follow-up), eventually cv-cad ingestion.

Source of truth: webpage/Simplestruct/cv-web/src/types/cvextract.ts.

Why this exists

The blueprint-ingestion story for ConstructiVision is “scan a tilt-up panel drawing PDF → land structured panel data in cv-web (or cv-cad) without re-typing.” A local VLM does the heavy lifting; the schema below is the contract between the model and the rest of the toolchain.

The schema is deliberately scoped to what a VLM can plausibly extract from a single drawing in one pass:

  • panel mark

  • overall dimensions

  • openings (doors, windows, louvers)

  • block-outs / recesses

  • embeds (weld plates, lifting inserts, anchor bolts)

  • coarse reinforcement call-outs (free text — not a full rebar layout)

Engineered detail that requires structural calculation — full rebar schedule, slab dowels, bracing inserts with hold-down loads, footing-pile geometry — stays out of v1. Those belong to the engineer’s review pass downstream of ingestion.

What this is not

Not a port of the legacy .pnl format. The .pnl schema (decoded by src/x32/TB11-01x32/makepan.LSP) carries ~1,700 fields across 21 var-groups because it describes a fully engineered panel that has already been through CV. By contrast, .cvextract.json describes what was seen on a scanned drawing, with extraction provenance baked in. The two schemas overlap conceptually but intentionally don’t share field names.

What the legacy format taught us, and what v1 carries forward:

  • Version stamp at the root. .pnl opens with mpv# = “V3.60”; we use schema: "cvextract-v1". A mismatched stamp aborts import.

  • Source-of-truth integrity check. .pnl rejects a file whose mppn (project name) doesn’t match the path it was loaded from. v1’s source.pdf + optional source.sha256 play the same role.

  • Consistent, predictable field naming. .pnl field names are terse but always follow <group><field><index>. v1 prefers full property names but keeps grouping (dimensions, openings, blockouts, embeds).
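
The first two lessons can be sketched as a pre-import check. This is a minimal illustration, not the cv-web implementation: the function name `checkStampAndSource`, the `EnvelopeHead` shape, and the idea that the caller supplies an already-computed hex digest are all assumptions made for the example.

```typescript
// Hypothetical pre-import check; names are illustrative, not taken from cvextract.ts.
interface EnvelopeHead {
  schema: string;
  source: { pdf: string; sha256?: string };
}

// Returns a list of problems; an empty list means the file may proceed to validation.
function checkStampAndSource(
  doc: EnvelopeHead,
  expectedPdf: string,
  actualSha256?: string, // hex digest computed by the caller, if available
): string[] {
  const problems: string[] = [];
  if (doc.schema !== "cvextract-v1") {
    problems.push(`unsupported schema stamp: ${doc.schema}`);
  }
  if (doc.source.pdf !== expectedPdf) {
    problems.push(`source.pdf is ${doc.source.pdf}, expected ${expectedPdf}`);
  }
  // Compare hashes only when both sides have one; sha256 is optional in v1.
  if (doc.source.sha256 !== undefined && actualSha256 !== undefined) {
    if (doc.source.sha256.toLowerCase() !== actualSha256.toLowerCase()) {
      problems.push("source.sha256 does not match the supplied digest");
    }
  }
  return problems;
}
```

Returning a list rather than throwing lets an importer show every mismatch at once; a stricter importer could abort on the first entry.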

What v1 adds that .pnl can’t carry:

  • Evidence. Every datum may include a bbox, a page, and the raw OCR/VLM text chunk that produced it. This is essential for a human reviewer to audit a VLM extraction before promoting it to PanelData.

  • Confidence. 0..1 score per panel, per opening, per measurement — the reviewer’s triage signal.

  • Extractor metadata. Model tag, prompt version, timestamp. Lets us re-run on a new model without losing the trail.

  • Raw VLM response. Optional raw field stores the un-parsed model output for re-parse / audit.
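
As an illustration of confidence as a triage signal, the sketch below collects low-scoring panels and openings for review. The helper name `lowConfidenceItems`, the trimmed-down shapes, and the 0.8 default threshold are assumptions for the example, not project settings.

```typescript
// Minimal shapes for the sketch; the real types live in cvextract.ts.
interface Scored {
  confidence?: number;
}
interface PanelLike extends Scored {
  mark: string | null;
  openings: (Scored & { mark?: string })[];
}
interface ReviewItem {
  path: string;
  confidence: number;
}

// Collects every scored item below the threshold, lowest confidence first.
// The 0.8 default is an arbitrary illustration.
function lowConfidenceItems(panels: PanelLike[], threshold = 0.8): ReviewItem[] {
  const items: ReviewItem[] = [];
  panels.forEach((panel, p) => {
    if (panel.confidence !== undefined && panel.confidence < threshold) {
      items.push({ path: `panels[${p}]`, confidence: panel.confidence });
    }
    panel.openings.forEach((opening, o) => {
      if (opening.confidence !== undefined && opening.confidence < threshold) {
        items.push({ path: `panels[${p}].openings[${o}]`, confidence: opening.confidence });
      }
    });
  });
  return items.sort((a, b) => a.confidence - b.confidence);
}
```

Items without a score are skipped here; a reviewer UI might instead treat a missing score as needing attention.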

Envelope

{
  "schema": "cvextract-v1",
  "source": {
    "pdf": "AAB001.pdf",
    "sha256": "…",
    "totalPages": 1,
    "pagesExtracted": [1]
  },
  "extractor": {
    "model": "qwen2.5vl:7b",
    "promptVersion": "cv-blueprint-mcp@0.1",
    "timestamp": "2026-05-27T21:00:00Z",
    "durationMs": 8420
  },
  "panels": [
    {
      "mark": "AAB001",
      "page": 1,
      "dimensions": {
        "width":  { "raw": "24'-0\"", "feet": 24, "inches": 0, "unit": "ft-in" },
        "height": { "raw": "32'-6\"", "feet": 32, "inches": 6, "unit": "ft-in" },
        "thickness": { "raw": "7-1/4\"", "inches": 7.25, "unit": "in" }
      },
      "openings": [
        {
          "type": "door",
          "mark": "D1",
          "width":  { "raw": "3'-0\"", "feet": 3, "inches": 0, "unit": "ft-in" },
          "height": { "raw": "7'-0\"", "feet": 7, "inches": 0, "unit": "ft-in" },
          "position": {
            "x": { "raw": "5'-0\"",  "feet": 5, "inches": 0, "unit": "ft-in" },
            "y": { "raw": "0'-0\"",  "feet": 0, "inches": 0, "unit": "ft-in" },
            "origin": "panel-bl"
          },
          "evidence": { "page": 1, "bbox": [0.10, 0.40, 0.22, 0.70] },
          "confidence": 0.85
        }
      ],
      "blockouts": [],
      "embeds": [],
      "reinforcement": {
        "bars":  ["#5@12 EW", "(4)#7 vert ea face"],
        "notes": ["Provide additional rebar at openings per detail 7/S3.1"]
      },
      "evidence": { "page": 1 },
      "confidence": 0.92
    }
  ]
}

Field reference

Root

Field      Type              Required  Notes
---------  ----------------  --------  -----
schema     "cvextract-v1"    yes       Exact literal. New fields are additive within v1; breaking changes bump the version.
source     SourceRef         yes       Where the data came from.
extractor  ExtractorMeta     yes       Which model produced it.
panels     ExtractedPanel[]  yes       One entry per panel detected. Empty array means “no panels found.”
warnings   string[]          no        Extractor-level warnings (truncated page, unreadable region, etc).
raw        string            no        Full unparsed VLM response; keep for audit.

SourceRef

Field           Type      Required  Notes
--------------  --------  --------  -----
pdf             string    yes       Filename only, no path.
sha256          string    no        SHA-256 of the source PDF, hex-encoded.
totalPages      number    no        Convenience for downstream tools.
pagesExtracted  number[]  yes       1-based page indices the extractor actually processed.
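
A consumer may want a sanity check on the page indices before trusting them. The sketch below is one possible check, under the assumption that a valid entry is an integer of at least 1 and, when totalPages is present, no greater than it. Whether validateCVExtract enforces this is not stated here, and `invalidPages` is a hypothetical helper.

```typescript
// Mirrors the SourceRef fields listed above.
interface SourceRef {
  pdf: string;
  sha256?: string;
  totalPages?: number;
  pagesExtracted: number[];
}

// Returns the entries of pagesExtracted that cannot be valid 1-based page indices.
function invalidPages(source: SourceRef): number[] {
  return source.pagesExtracted.filter(
    (page) =>
      !Number.isInteger(page) ||
      page < 1 ||
      (source.totalPages !== undefined && page > source.totalPages),
  );
}
```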

ExtractorMeta

Field          Type    Required  Notes
-------------  ------  --------  -----
model          string  yes       Ollama model tag (e.g. qwen2.5vl:7b).
promptVersion  string  yes       Identifies the prompt template: <package>@<semver>.
timestamp      string  yes       ISO-8601 UTC.
durationMs     number  no        Total extraction wall-clock time.

ExtractedPanel

Field          Type             Required  Notes
-------------  ---------------  --------  -----
mark           string | null    yes       Panel mark, e.g. "AAB001". null = VLM couldn’t read it.
page           number           yes       1-based primary page where the panel was detected.
dimensions     PanelDimensions  yes       width / height / optional thickness.
openings       Opening[]        yes       May be empty.
blockouts      Blockout[]       yes       May be empty.
embeds         Embed[]          yes       May be empty.
reinforcement  Reinforcement    no        Free-text call-outs only.
evidence       Evidence         yes       Where on the PDF this panel sits.
confidence     number           yes       0..1.

Measurement

Every dimensioned value uses this shape.

Field       Type                               Required  Notes
----------  ---------------------------------  --------  -----
raw         string                             yes       Verbatim text from the drawing. Single source of truth on disagreement.
feet        number                             no        Parsed, best-effort.
inches      number                             no        Parsed, best-effort. May be fractional.
unit        "ft-in" | "in" | "mm" | "unknown"  yes       The unit the raw string is in.
evidence    Evidence                           no        Where on the page the value was read from.
confidence  number                             no        0..1.
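
To show how the parsed fields and raw relate, here is a hedged sketch that converts a Measurement to total inches and flags cases where a re-parse of raw disagrees with the parsed fields. The `Measurement` interface is a trimmed copy of the fields listed above (evidence and confidence omitted). The helper names and the decision to re-parse only the simple F'-I" form are assumptions for the example.

```typescript
type Unit = "ft-in" | "in" | "mm" | "unknown";
interface Measurement {
  raw: string;
  feet?: number;
  inches?: number;
  unit: Unit;
}

// Returns total inches, or null when the parsed fields cannot support a conversion.
function toInches(m: Measurement): number | null {
  switch (m.unit) {
    case "ft-in":
      if (m.feet === undefined && m.inches === undefined) return null;
      return (m.feet ?? 0) * 12 + (m.inches ?? 0);
    case "in":
      return m.inches ?? null;
    default:
      // "mm" has no parsed field in this shape, and "unknown" is not convertible.
      return null;
  }
}

// Re-parses the simple F'-I" form only (e.g. 24'-0"); anything else returns null.
function parseFeetInches(raw: string): number | null {
  const match = /^(\d+)'-(\d+(?:\.\d+)?)"$/.exec(raw.trim());
  if (match === null) return null;
  return Number(match[1]) * 12 + Number(match[2]);
}

// True when raw can be re-parsed and contradicts the parsed fields; raw wins in that case.
function disagrees(m: Measurement): boolean {
  const fromRaw = parseFeetInches(m.raw);
  const fromFields = toInches(m);
  return fromRaw !== null && fromFields !== null && fromRaw !== fromFields;
}
```

Fractional forms such as 7-1/4" are deliberately left unparsed by `parseFeetInches`; in that case `disagrees` reports false rather than guessing.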

Opening

Field           Type                                    Required  Notes
--------------  --------------------------------------  --------  -----
type            "door" | "window" | "louver" | "other"  yes
mark            string                                  no        Panel-local mark, e.g. "W1", "D1".
width / height  Measurement | null                      yes
sillHeight      Measurement | null                      no        Distance from panel base to opening bottom.
position        Position                                no        Within-panel position.
evidence        Evidence                                no
confidence      number                                  no

Blockout

Recessed area on the panel face — electrical panel recess, plumbing chase, etc.

Field           Type                Required  Notes
--------------  ------------------  --------  -----
description     string              no        Free-text intent.
width / height  Measurement | null  yes
depth           Measurement | null  no
position        Position            no
evidence        Evidence            no
confidence      number              no

Embed

Weld plates, lifting inserts, bracing inserts, anchor bolts. Geometry is typically too small for a VLM to fully resolve in one pass — v1 captures existence + mark + approximate position, leaving detailed sizing to engineer review.

Field       Type      Required  Notes
----------  --------  --------  -----
type        string    no        Free-text type, e.g. "weld plate", "lifting insert".
mark        string    no        Panel-local mark.
position    Position  no
evidence    Evidence  no
confidence  number    no

Reinforcement

Coarse-grained, free text only. Detailed bar-by-bar layout is out of scope for v1.

Field     Type      Required  Notes
--------  --------  --------  -----
bars      string[]  no        Raw call-outs, e.g. "#5@12 EW", "(4)#7 vert ea face".
notes     string[]  no        Free-text drawing notes.
evidence  Evidence  no

Position

Field   Type                                     Required  Notes
------  ---------------------------------------  --------  -----
x, y    Measurement                              no        Distance from origin.
origin  "panel-bl" | "panel-center" | "unknown"  no        Coordinate origin convention.

Evidence

Field  Type              Required  Notes
-----  ----------------  --------  -----
page   number            no        1-based page index.
bbox   [x1, y1, x2, y2]  no        Normalized 0..1 image coords.
text   string            no        Raw OCR/VLM chunk supporting the datum.
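
A reviewer UI that overlays evidence on the rendered page has to turn the normalized bbox into pixels. The sketch below shows one way, assuming (x1, y1) is the top-left corner and (x2, y2) the bottom-right in image coordinates. The corner convention is not stated here, so confirm it against the producer. `bboxToPixels` and `PixelRect` are hypothetical names.

```typescript
type BBox = [number, number, number, number]; // [x1, y1, x2, y2], normalized 0..1

interface PixelRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Scales a normalized bbox to pixel units for a page image of the given size.
// Assumes (x1, y1) is the top-left corner; this convention is an assumption.
function bboxToPixels(bbox: BBox, imageWidth: number, imageHeight: number): PixelRect {
  const [x1, y1, x2, y2] = bbox;
  return {
    left: Math.round(x1 * imageWidth),
    top: Math.round(y1 * imageHeight),
    width: Math.round((x2 - x1) * imageWidth),
    height: Math.round((y2 - y1) * imageHeight),
  };
}
```

Because the coordinates are normalized, the same evidence record works at any render resolution.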

Versioning policy

  • New optional fields are additive within v1. Consumers MUST ignore unknown fields.

  • Renames, type changes, or required-field additions bump the schema to v2.

  • The parser will accept both cvextract-v1 and cvextract-v2 once v2 ships, with a migration helper.
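
A version dispatch consistent with this policy might look like the sketch below. It is an illustration only: `schemaVersion` is a hypothetical helper, and the "cvextract-v2" stamp appears solely because the policy anticipates it. Nothing about the v2 shape or the migration helper is assumed.

```typescript
type SchemaVersion = 1 | 2;

// Maps the root "schema" stamp to a version number; throws on anything else.
// "cvextract-v2" is listed only because the versioning policy anticipates it.
function schemaVersion(doc: { schema?: unknown }): SchemaVersion {
  switch (doc.schema) {
    case "cvextract-v1":
      return 1;
    case "cvextract-v2":
      return 2;
    default:
      throw new Error(`unsupported schema stamp: ${String(doc.schema)}`);
  }
}
```

Dispatching on the stamp alone, and reading only known fields afterwards, keeps a consumer tolerant of the additive fields the first bullet allows.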

Validation contract

import { parseCVExtract, validateCVExtract } from '../types/cvextract';

const errs = validateCVExtract(jsonFromVLM);
if (errs.length > 0) {
  // surface to the reviewer; do not auto-import
}

// throws on invalid input:
const extract = parseCVExtract(jsonFromVLM);

Validation is intentionally permissive — producers may include additional fields, and null is the explicit “I asked but couldn’t tell” signal for any optional dimension. The reviewer UI MUST surface null differently from a missing field, even though both pass validation.
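
The null-versus-missing distinction can be made explicit with a small helper. This is a sketch under the assumption that the reviewer UI works on the parsed JSON object; `fieldState` and the three state labels are hypothetical names.

```typescript
type FieldState = "missing" | "unreadable" | "present";

// Distinguishes an absent key from an explicit null on a parsed JSON object.
function fieldState(obj: Record<string, unknown>, key: string): FieldState {
  if (!Object.prototype.hasOwnProperty.call(obj, key) || obj[key] === undefined) {
    return "missing";
  }
  return obj[key] === null ? "unreadable" : "present";
}
```

For an opening parsed from `{"type": "door", "sillHeight": null}`, `fieldState(opening, "sillHeight")` is "unreadable" while `fieldState(opening, "position")` is "missing", so the UI can render the two cases differently.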