benchmarks & methodology

How we score it, before we tell you a number.

The detector strip on the workbench combines two real signals with three simulated ones. We’d rather say that plainly than dress all five cells up as live vendor calls.

1. In-house score (live)

A Haiku-class critic prompted to assess stylistic AI tells: vocabulary density (delve / tapestry / paradigm shift), sentence-length uniformity, em-dash and transition stacking, balanced phrasing, hedge density, contraction absence, and rhythm monotony. Returns a structured 0–1 score with a verdict bucket (low / medium / high) and a list of top flags.

This is the “Humanize” cell — it’s always live and runs on every request. It is not a reimplementation of GPTZero or any other third-party detector.

2. GPTZero (live when configured)

When GPTZERO_API_KEY is configured server-side, the GPTZero v2 endpoint is called directly and its score is reported as-is. When the key isn’t configured, the cell falls back to a deterministic estimate derived from the in-house score, clearly labeled sim.

The fallback estimate is calibrated against our offline corpus (synthetic-only, never your prose) so the simulated number tracks the real GPTZero number reasonably closely — but it is still a simulation and we label it as such.

3. Originality, Turnitin, Copyleaks (simulated)

These vendors don’t offer practical public APIs for consumer apps — Originality is enterprise-only at the price we’d need, Turnitin is institutional-only, Copyleaks gated. We expose simulated cells anyway because the in-house score correlates well enough that the simulation is a useful sanity check, and we always label them sim.

If you need an actual Originality/Turnitin/Copyleaks reading, we can’t give it to you yet. Honesty beats fake numbers.

4. Pangram — noted separately

Pangram is a newer, stricter classifier. We don’t reliably bypass Pangram and we won’t pretend otherwise. We’ll publish our Pangram pass rate alongside the others — without spinning the number — once we stabilize the live Pangram integration.

Roadmap

Now: in-house critic + GPTZero (when configured) + clearly simulated cells for Originality / Turnitin / Copyleaks.
Next: live ZeroGPT and Sapling cells (their APIs are accessible to small developers).
Then: a paid “bring your own detector key” mode on Pro for users who already have Originality or Copyleaks access — the cell goes live with their key.
Always: the cell shows whether the number you’re looking at came from a live vendor call, a calibrated estimate, or a simulation. No guessing.

See also: ethics for what we won’t do; privacy for what we don’t store.