Technical Architecture

The Calibra Protocol v4.8

A five-layer forensic attestation architecture designed to intercept Invalid Traffic, audit behavioral integrity, and deliver a cryptographically verifiable provenance record for every survey response.

Version 4.8 · Live · Last updated: March 2026 · Avg. latency: 4.1ms
Architecture Overview

How the Calibra Protocol works.

The Calibra Protocol is a sequential, multi-gate validation architecture. Unlike post-survey data cleaning tools — which operate on already-collected data — CalibraSync intercepts respondents before they enter your survey environment. Each gate applies a distinct enforcement logic; respondents who fail any gate are disposed before consuming your quota, your server resources, or your client-side incentive budget.

Respondent Flow
Traffic Source → Layer 01: SIVT Scan → Layer 02: Fingerprint → Layer 03: Behavioral → Layer 04: Semantic → Layer 05: Attestation → VALID COMPLETE
Exit dispositions: SECURITY (IVT blocked), PRE-TERM (behavioral fail), DUPLICATE (fingerprint match)

Every respondent who passes all five layers receives a Calibra-ID — a cryptographic attestation token embedded in the completion callback. This token serves as the audit anchor for the delivered dataset, allowing research operations teams to verify the provenance of any individual response at any point after fieldwork closes.
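
The sequential gating described above can be sketched as a short-circuiting pipeline. This is a minimal illustration under stated assumptions, not CalibraSync's implementation: the `Respondent` fields, the gate functions, and both thresholds are hypothetical. Only the disposition names come from the flow above.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical respondent record; every field name here is an assumption.
@dataclass
class Respondent:
    ip_flagged: bool = False
    fingerprint_seen: bool = False
    anomaly_score: float = 0.0   # higher means more suspicious
    semantic_score: float = 1.0  # higher means more trustworthy

# Illustrative thresholds, not documented defaults.
ANOMALY_THRESHOLD = 0.7
SEMANTIC_THRESHOLD = 0.5

# A gate returns a disposition code on failure, or None to pass the respondent on.
Gate = Callable[[Respondent], Optional[str]]

def sivt_gate(r: Respondent) -> Optional[str]:
    return "SECURITY" if r.ip_flagged else None

def fingerprint_gate(r: Respondent) -> Optional[str]:
    return "DUPLICATE" if r.fingerprint_seen else None

def behavioral_gate(r: Respondent) -> Optional[str]:
    return "PRE-TERM" if r.anomaly_score > ANOMALY_THRESHOLD else None

def semantic_gate(r: Respondent) -> Optional[str]:
    return "PRE-TERM" if r.semantic_score < SEMANTIC_THRESHOLD else None

GATES: List[Gate] = [sivt_gate, fingerprint_gate, behavioral_gate, semantic_gate]

def run_protocol(r: Respondent) -> str:
    """Apply gates in order and stop at the first failure."""
    for gate in GATES:
        disposition = gate(r)
        if disposition is not None:
            return disposition
    return "COMPLETE"  # all gates passed; attestation would follow
```

Because the loop returns at the first failing gate, a respondent who would fail several gates is disposed by the earliest one, which is what keeps later, more expensive checks from running on traffic already rejected.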

01
Active · Pre-Entry SIVT Classification
SIVT Entry Filtration
// module: sivt_classifier · threshold: <1ms · intercept_rate: avg 4.8%

The first gate classifies traffic at the network and device layer before any survey interaction occurs. Sophisticated Invalid Traffic (SIVT), including TOR exit nodes, known residential proxy networks, and IP addresses appearing on our 4.1-billion-record behavioral blacklist, is intercepted and disposed as SECURITY at this stage. General Invalid Traffic (GIVT) patterns, such as datacenter IP ranges and bot-flagged user agents, are also resolved here.

Unlike basic IP blacklists that rely on static databases, our SIVT classifier is updated in near real time using the signals we observe across 250+ sample sources. When a bad actor is identified on one panel, that fingerprint propagates across the entire CalibraSync enforcement network within minutes.

GIVT Detection
Datacenter IPs, known bot user agents, automated crawlers, and spider signatures. Standard enforcement with near-zero false-positive rate.
SIVT Classification
Residential proxy networks, TOR nodes, VPN endpoints masking commercial intent, and IP addresses operating at suspicious geographic velocity.
Behavioral Blacklist
A 4.1-billion-entry database of flagged respondent identifiers aggregated across all monitored sources. Updated continuously via reconciliation feedback loops.
No-Fly List
Confirmed bad actors are permanently excluded from all projects running through CalibraSync, across all panels and exchanges, regardless of source.
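
The four checks above can be illustrated with a small classifier built on Python's standard `ipaddress` module. This is a sketch under stated assumptions: the network ranges are RFC 5737 documentation blocks standing in for real datacenter and proxy feeds, and the names `classify_traffic`, `NO_FLY_LIST`, and the class labels are hypothetical, not part of the CalibraSync API.

```python
import ipaddress

# Placeholder data: RFC 5737 documentation ranges, not real threat feeds.
DATACENTER_RANGES = [ipaddress.ip_network("192.0.2.0/24")]
PROXY_RANGES = [ipaddress.ip_network("198.51.100.0/24")]
BOT_UA_MARKERS = ("bot", "crawler", "spider")
NO_FLY_LIST = {"203.0.113.7"}  # hypothetical permanently excluded identifiers

def classify_traffic(ip: str, user_agent: str) -> str:
    """Return NO_FLY, GIVT, SIVT, or CLEAN for one inbound request."""
    addr = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if ip in NO_FLY_LIST:
        return "NO_FLY"
    ua = user_agent.lower()
    if any(marker in ua for marker in BOT_UA_MARKERS):
        return "GIVT"
    if any(addr in net for net in DATACENTER_RANGES):
        return "GIVT"
    if any(addr in net for net in PROXY_RANGES):
        return "SIVT"
    return "CLEAN"
```

In this sketch any result other than `CLEAN` would map to the SECURITY disposition. A production classifier would replace the static lists with continuously refreshed lookups.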
02
Forensic · Cookie-Free Hardware-Layer ID
Forensic Fingerprinting
// module: calibra_fingerprint · uniqueness_rate: 99.97% · dedup_db: 18.4M entries

Respondents who clear the SIVT gate are fingerprinted at the hardware and protocol layer — not the cookie layer. Cookie-based identification is trivially circumvented by browser resets, private browsing, and device cycling. Our fingerprinting approach instead evaluates hardware architecture signatures, Canvas API rendering noise patterns, WebGL capabilities, audio context characteristics, font enumeration, and TCP/IP stack timing to construct a device identity that persists across browser resets and standard evasion techniques.

The resulting identifier, the Calibra-ID draft, is checked against our deduplication database. Respondents whose fingerprints match an entry on the current project, on sibling projects running through the same client account, or in our global cross-panel record are disposed as DUPLICATE before entering the screener.

// Calibra-ID Generation — Pseudocode Representation
fingerprint = collect('canvas_noise', 'webgl_vendor', 'audio_context',
                  'hw_concurrency', 'font_set', 'tcp_timing')
calibra_id = sha3_256(fingerprint + project_salt + timestamp_bucket)
dupe_check = query_dedup_db(calibra_id, scope='global')
// if dupe_check.hit → disposition: DUPLICATE
// if dupe_check.miss → proceed to Layer 03
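
A runnable approximation of the pseudocode above, using `hashlib.sha3_256` from the Python standard library. Several details are assumptions: the JSON canonicalization, the `|` separator, the in-memory `seen` set standing in for the deduplication database, and all function names. One further assumption deserves attention: a hash salted with a project salt and timestamp bucket would differ across projects for the same device, so this sketch keys the duplicate lookup on an unsalted digest of the fingerprint. The source does not say how cross-project matching is actually done.

```python
import hashlib
import json

def canonical_fingerprint(signals: dict) -> bytes:
    """Serialize signals deterministically so key order cannot change the hash."""
    return json.dumps(signals, sort_keys=True, separators=(",", ":")).encode("utf-8")

def device_key(signals: dict) -> str:
    """Unsalted digest used for cross-project duplicate lookups (assumption)."""
    return hashlib.sha3_256(canonical_fingerprint(signals)).hexdigest()

def calibra_id_draft(signals: dict, project_salt: str, timestamp_bucket: int) -> str:
    """Salted per-project identifier, mirroring the pseudocode's inputs."""
    payload = b"|".join([
        canonical_fingerprint(signals),
        project_salt.encode("utf-8"),
        str(timestamp_bucket).encode("utf-8"),
    ])
    return hashlib.sha3_256(payload).hexdigest()

def check_duplicate(signals: dict, seen: set) -> str:
    """Return DUPLICATE on a repeat device, otherwise record it and proceed."""
    key = device_key(signals)
    if key in seen:
        return "DUPLICATE"
    seen.add(key)
    return "PROCEED"
```

Usage: call `check_duplicate(signals, seen)` once per entrant with a shared `seen` set; the second entrant presenting identical signals is disposed as DUPLICATE.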
03
Active · Mid-Survey Real-Time Scoring
Behavioral Velocity Engine
// module: bve_runtime · scoring_vectors: 87 · evaluation_window: continuous

Once a respondent enters the survey, the Behavioral Velocity Engine monitors their interaction continuously across 87 scoring dimensions. This layer is designed to catch two distinct populations that evade identity-level checks: automated scripts that have successfully spoofed their fingerprint, and incentive-gaming human respondents who are completing surveys at industrial pace across multiple panels simultaneously.

Respondents who accumulate a behavioral anomaly score beyond the project-configured threshold are disposed as PRE-TERM — before their response enters the quota count. This protects both the client's budget and the study's n-size integrity, as PRE-TERM respondents are not charged as completes.

Answer Velocity Analysis
Page-level dwell time and question-response cadence are scored against a calibrated baseline for study type and length to detect implausibly fast, straight-through completions.
LLM & Agent Blocking
Detection of OpenAI Operator, Selenium-based automation frameworks, and LLM-proxy agents — including those that self-identify during navigation events.
Straight-Line Detection
Matrix question response patterns, consistent extreme-scale selections, and cross-question logical inconsistencies all contribute to the behavioral anomaly score.
Mouse & Keystroke Biometrics
Micro-movement variance, click timing distribution, and inter-keystroke interval analysis distinguish human from programmatic interaction with high confidence.
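
To make the scoring idea concrete, the sketch below combines two of the signals described above (answer velocity and straight-lining) into a single anomaly score and compares it with a threshold. It is a toy model with two dimensions rather than 87; the equal weights, the baseline values, the 0.8 threshold, and every function name are assumptions for illustration.

```python
from statistics import mean
from typing import List

def speed_signal(dwell_seconds: List[float], baseline_mean: float, baseline_std: float) -> float:
    """0..1 signal that grows as average dwell time falls below the baseline."""
    z = (baseline_mean - mean(dwell_seconds)) / baseline_std
    return max(0.0, min(1.0, z / 3.0))  # 3 standard deviations faster maps to 1.0

def straight_line_signal(matrix_answers: List[int]) -> float:
    """Share of matrix answers equal to the single most common answer."""
    if not matrix_answers:
        return 0.0
    most_common = max(set(matrix_answers), key=matrix_answers.count)
    return matrix_answers.count(most_common) / len(matrix_answers)

def anomaly_score(dwell_seconds: List[float], matrix_answers: List[int],
                  baseline_mean: float = 12.0, baseline_std: float = 3.0) -> float:
    """Equal-weight blend of the two signals (weights are illustrative)."""
    return (0.5 * speed_signal(dwell_seconds, baseline_mean, baseline_std)
            + 0.5 * straight_line_signal(matrix_answers))

def behavioral_disposition(score: float, threshold: float = 0.8) -> str:
    return "PRE-TERM" if score > threshold else "CONTINUE"
```

A respondent averaging 3 seconds per page against a 12-second baseline who also gives the same answer to every matrix row scores 1.0 here and is disposed as PRE-TERM, while one at baseline pace with varied answers scores far below the threshold.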
04
Proprietary · NLP Post-Response Audit
Semantic AI Audit
// module: semantic_audit_engine v2.0 · scoring_vectors: 43 · model: calibra-nlp-2026

The most consequential emerging threat to research data quality in 2026 is AI-generated open-end responses — text that passes surface-level quality checks but carries no genuine human perspective. Our Semantic AI Audit engine runs NLP-based scoring across all open-ended responses, evaluating semantic coherence, lexical diversity, cross-question consistency, and statistically anomalous phrasing patterns that characterize both LLM-generated and professionally farmed content.

Responses that score below the semantic integrity threshold are flagged, contributing to the respondent's overall attestation score. Depending on project configuration, low-scoring responses may trigger an immediate PRE-TERM or be flagged in the delivery audit log for researcher review.

// Semantic Audit Output — Example Response Record
{
  "calibra_id": "F9E2-8201-3A4C",
  "semantic_score": 0.91,
  "llm_probability": 0.04,
  "lexical_diversity": 0.78,
  "cross_q_consistency": 0.94,
  "disposition": "PROCEED_TO_ATTESTATION"
}
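
The record above can be consumed with a few lines of Python. The sketch below shows one common lexical diversity measure (type-token ratio) and a routing function for the threshold behavior described earlier. Neither is claimed to match the audit engine: the type-token ratio is a stand-in for whatever the engine computes, and `route`, the 0.5 threshold, the `strict` flag, and the `FLAG_FOR_REVIEW` label are hypothetical.

```python
import re

def type_token_ratio(text: str) -> float:
    """Unique words divided by total words; 0.0 for empty text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def route(record: dict, threshold: float = 0.5, strict: bool = False) -> str:
    """Decide the next step for a scored response record.

    strict=True models a project configured for immediate termination;
    otherwise a low score is only flagged for researcher review.
    """
    if record["semantic_score"] >= threshold:
        return "PROCEED_TO_ATTESTATION"
    return "PRE-TERM" if strict else "FLAG_FOR_REVIEW"
```

For the example record, `route({"semantic_score": 0.91})` returns `PROCEED_TO_ATTESTATION`, matching its `disposition` field.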
05
Cryptographic · Audit-Ready S2S Callback
Attestation & Delivery
// module: attestation_layer · callback: S2S · log_retention: 36 months

Respondents who pass all four prior layers receive their finalized Calibra-ID: a SHA-3 hash computed over the respondent fingerprint, project identifier, timestamp bucket, and aggregate attestation score. This token is passed in the server-to-server (S2S) completion callback to the client's survey platform and simultaneously logged to the CalibraSync audit ledger.

Every delivered dataset includes a machine-readable audit manifest: a structured record of each respondent's Calibra-ID, their per-layer scores, disposition code, and callback timestamp. This manifest is the documentation asset that satisfies enterprise procurement audits, legal disclosure requirements, and ESOMAR methodological transparency standards.
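
One way an auditor might use such a manifest is to recompute a token from its inputs and compare it with the logged value. The sketch below assumes a manifest entry carries the four inputs named above; the field names, the `|` separator, the four-decimal score formatting, and both function names are hypothetical, and the real token layout may differ.

```python
import hashlib
import hmac

def attestation_token(fingerprint_digest: str, project_id: str,
                      timestamp_bucket: int, score: float) -> str:
    """SHA-3 (256-bit) hash over the four inputs, joined with an assumed separator."""
    payload = "|".join([
        fingerprint_digest,
        project_id,
        str(timestamp_bucket),
        f"{score:.4f}",  # fixed formatting so the same score always hashes the same
    ])
    return hashlib.sha3_256(payload.encode("utf-8")).hexdigest()

def verify_entry(entry: dict) -> bool:
    """Recompute the token for a manifest entry and compare in constant time."""
    expected = attestation_token(
        entry["fingerprint_digest"],
        entry["project_id"],
        entry["timestamp_bucket"],
        entry["attestation_score"],
    )
    return hmac.compare_digest(expected, entry["calibra_id"])
```

If any logged field is altered after delivery, the recomputed hash no longer matches and `verify_entry` returns `False`, which is the property that makes the token usable as an audit anchor.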

Disposition Code | Meaning | Charged to Client? | Logged to Audit?
COMPLETE | Passed all 5 layers. Calibra-ID issued. | Yes | Yes (full record)
SECURITY | SIVT intercept at Layer 01 or Layer 02 | No | Yes (IP/fingerprint log)
PRE-TERM | Behavioral or semantic threshold failure | No | Yes (score breakdown)
DUPLICATE | Fingerprint match in deduplication DB | No | Yes (reference ID logged)
QUOTA-FULL | Quota cell closed before entry processed | No | Timestamp only
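
The charging rules in the table above translate directly into a lookup. The sketch below tallies dispositions for a project and counts billable completes; the `CHARGED` mapping mirrors the table, while the function name and the output shape are assumptions.

```python
from collections import Counter
from typing import Iterable

# Mirrors the "Charged to Client?" column of the disposition table.
CHARGED = {
    "COMPLETE": True,
    "SECURITY": False,
    "PRE-TERM": False,
    "DUPLICATE": False,
    "QUOTA-FULL": False,
}

def summarize(dispositions: Iterable[str]) -> dict:
    """Count each disposition code and total the billable ones."""
    counts = Counter(dispositions)
    unknown = set(counts) - set(CHARGED)
    if unknown:
        raise ValueError(f"unknown disposition codes: {sorted(unknown)}")
    billable = sum(n for code, n in counts.items() if CHARGED[code])
    return {"counts": dict(counts), "billable": billable}
```

Rejecting unknown codes rather than silently treating them as non-billable keeps a typo in upstream data from distorting an invoice.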