
Artifacts, evaluated.
Quality Management Tools and Benchmarking for the Digital Visual Arts.
Pass/fail scorecards for AI deliverables. Automated vetting and culling for high-volume production pipelines.
Phase 1 build in progress. First public benchmarks scheduled for 2026. Pilot inquiries welcome.
Use cases
What this does for you
Two primary jobs the framework does today, both grounded in AI image and video workflows that have moved past the pilot stage.
Creators & their clients
Objective acceptance criteria
Stop arguing about whether the AI output is good enough. Score the deliverable against agreed thresholds and ship the scorecard alongside it. Both sides see exactly what passed, what didn't, and why.
- Subjective sign-off → Documented pass / fail
- Scope-creep arguments → Threshold-anchored revisions
- “It looks off” → “Frame 47 failed identity at 0.78”
Studios & high-volume pipelines
Automated batch vetting & culling
Score every generation in a batch automatically. Reject obvious failures — wrong codec, severe artifacts, identity drift — at the gate, before they ever reach a reviewer. Humans see only the assets worth their time.
- 1,000 generations → The 50 worth reviewing
- Manual triage → Automated gating
- “Something looks bad” → Frame-level flags with bounding boxes
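The vetting-and-culling step described above can be sketched in a few lines: cheap gate checks reject obvious failures (wrong codec, identity drift) before anything reaches a reviewer. The field names, thresholds, and helper functions here are illustrative assumptions, not the shipped design.

```python
# Hypothetical batch-culling sketch. Each asset is a dict of precomputed
# metrics; gate() runs only the cheap checks and returns a reason string.
EXPECTED_CODEC = "h264"       # assumed delivery spec
IDENTITY_THRESHOLD = 0.85     # assumed agreed threshold

def gate(asset):
    """Return (passed, reason) using cheap checks only."""
    if asset["codec"] != EXPECTED_CODEC:
        return False, f"wrong codec: {asset['codec']}"
    if asset["identity_score"] < IDENTITY_THRESHOLD:
        return False, f"identity drift: {asset['identity_score']:.2f}"
    return True, "passed"

def cull(batch):
    """Split a batch into assets worth reviewing and gated-out rejects."""
    keep, rejected = [], []
    for asset in batch:
        ok, reason = gate(asset)
        (keep if ok else rejected).append((asset["id"], reason))
    return keep, rejected

batch = [
    {"id": "gen-001", "codec": "h264",   "identity_score": 0.92},
    {"id": "gen-002", "codec": "prores", "identity_score": 0.95},
    {"id": "gen-003", "codec": "h264",   "identity_score": 0.78},
]
keep, rejected = cull(batch)
```

Only `gen-001` survives the gate; the other two are rejected with a machine-readable reason, which is what makes the "frame-level flags" style of reporting possible.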
Bonus
Independent benchmarking as a by-product
Same metrics, same thresholds, applied across models. The framework is model-agnostic by design — making it equally useful as the basis for neutral, published comparisons. No marketing claims; just the scores.
The framework
Nine dimensions. Three gates. Automated scorecards.
A model-agnostic scoring framework for AI-generated images and video. Each dimension is a distinct axis of output quality. A gated pipeline rules out fast failures cheaply, then runs deeper analysis only on what survives.
Local-first. No cloud dependencies. Designed to run on Apple Silicon using open-source components.
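The gated pipeline can be sketched as an ordered list of gates, cheapest first: an asset that fails any check is scored and dropped without ever running the later, more expensive analysis. Which dimensions sit in which gate, and the toy checks below, are assumptions for illustration only.

```python
# Illustrative three-gate pipeline. Each gate is (name, [(dimension, check)]);
# checks return (passed, score). Failing any check short-circuits the rest.
def run_pipeline(asset, gates):
    scorecard = {}
    for gate_name, checks in gates:
        for dimension, check in checks:
            passed, score = check(asset)
            scorecard[dimension] = {"pass": passed, "score": score}
            if not passed:
                return False, gate_name, scorecard  # stop early, keep scores so far
    return True, None, scorecard

# Toy stand-ins for the real metrics.
def codec_ok(asset):     # D01 Technical Delivery: trivial spec check
    return asset["codec"] == "h264", 1.0
def artifact_ok(asset):  # D02 Spatial & Texture: assumed artifact score
    return asset["artifact_score"] >= 0.8, asset["artifact_score"]
def identity_ok(asset):  # D06 Character & Identity: assumed drift score
    return asset["identity_score"] >= 0.85, asset["identity_score"]

gates = [
    ("gate-1-specs",  [("D01", codec_ok)]),
    ("gate-2-visual", [("D02", artifact_ok)]),
    ("gate-3-deep",   [("D06", identity_ok)]),
]

ok, failed_at, card = run_pipeline(
    {"codec": "h264", "artifact_score": 0.9, "identity_score": 0.78}, gates
)
```

Here the asset clears the cheap spec and visual gates but fails identity in the deep gate, and the returned scorecard records exactly which dimension failed and at what score.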
D01
Technical Delivery
Specs, codecs, VMAF
D02
Spatial & Texture
Artifacts, banding, noise
D03
Temporal & Motion
Flicker, optical flow, stability
D04
Audio Quality
LUFS, clipping, sync offset
D05
Lip Sync Precision
MAR, DTW, phoneme timing
D06
Character & Identity
Identity drift, hands, anatomy
D07
Lighting & Scene
Shadows, color temperature
D08
Brand & Client Compliance
Palette, talent, LUTs
D09
Prompt & Action Adherence
VLM-evaluated framing