Traditional hiring relies on broadcast-rejection — filtering out hundreds of talented developers based on resume keywords or rigid pass/fail LeetCode puzzles because manual screening doesn't scale.
interviewsignal enables AI-native broad-interviewing: a high-volume, high-fidelity asynchronous screening model that opens the funnel wide without draining engineering resources. Share a code. Every candidate works the problem on their own time, in their own IDE, with their own AI tools. The session captures the full thought process — every prompt, every decision, every iteration. Submissions arrive auto-graded and ranked. You spend 15 minutes triaging, not 200 hours interviewing.
When every candidate uses AI, code quality converges. Output is no longer signal. ATS platforms grade the output — did the code pass tests? We grade the thinking — how the candidate decomposes the problem, directs the AI, and iterates on failures. The transcript captures who drove the thinking. That's the signal no one else can see.
Broad-interview, not broadcast-reject. Pure signal.
The Engine in Action
|
Candidate — works in terminal with full-power AI
|
Hiring Manager — reviews auto-graded submissions in the dashboard
|
The Unfair Advantage
interviewsignal vs the status quo
| Phone screen | Take-home test | LeetCode | AI screening SaaS | interviewsignal | |
|---|---|---|---|---|---|
| Scales to 200+ candidates | 🚫 | ✅ | ✅ | ||
| Captures thought process | 🚫 | 🚫 | ✅ Hash-chained transcript | ||
| AI-native | 🚫 | 🚫 "No AI" policies | 🚫 | ✅ | ✅ Full-power AI, graded on usage |
| Real problems, real tools | ✅ | 🚫 Contrived | ✅ Candidate's own IDE | ||
| Candidate gets feedback | 🚫 Usually ghosted | 🚫 | 🚫 | ✅ Score + summary | |
| Setup cost | High (scheduling) | Medium | Medium (platform) | High (vendor + procurement) | pip install, done |
| Tamper detection | N/A | 🚫 Honor system | ✅ 9 automated flags | ||
| Self-hosted / private | N/A | N/A | 🚫 | 🚫 Multi-tenant cloud | ✅ Your infra, your data |
| Cost | Engineer time | Engineer time | $$$$/seat | $100+/seat, 5-20 assessments/mo | Free forever |
Quickstart
Hiring manager — create an interview
interview dashboard
First launch opens a setup wizard in your browser — relay URL, API key, create your first interview. Three screens and you're live. The form asks for three things: problem, rubric, time limit. You get back a code like INT-4829-XK. Share it with 5 candidates or 500.
Your rubric dimensions are your weights. If you want thought process to matter more than code quality, make more of your dimensions about process.
Candidate — take the interview
pip install interviewsignal && interview install # Codex: pip install interviewsignal && interview install --platform codex /interview INT-4829-XK
The session starts, GitHub OAuth opens (one account = one submission), and the problem appears. Work normally — ask the AI questions, write code, run tests. When done:
/submit
Session sealed. Pushed to relay. Auto-graded. Score + summary shown in terminal.
Hiring manager — review
interview dashboard # → http://localhost:7832 interview dashboard INT-4829-XK # → jump to one interview's submissions
Submissions arrive sorted by score. Flags highlight anomalies. Select candidates in bulk → advance or reject. Click into any candidate for the full transcript, dimension scores, and diff.
Batch actions: ↻ Regrade (re-run AI grading after rubric tuning) · ✓ Yes / → Maybe / ✗ No · ↓ Export CSV
How it works
graph TD
A[Candidate Prompts AI] --> B[Shell Hooks Capture Tool Calls]
B --> C[Append-Only SHA-256 Event Log]
C --> D[Automatic Git Micro-Commit after each turn]
D --> E[Log Sealed on /submit]
E --> F[Relay Server Auto-Grades via Rubric]
F --> G[HM Dashboard ranks candidates by thinking score]
interviewsignal installs as a skill into your AI coding assistant. It captures the full conversation — prompts, reasoning, every tool call — and builds an append-only, hash-chained session log. After each turn, it silently commits changed files to the local repo. On /submit, the log is sealed and pushed to the relay.
|
HM creates interview HM reviews |
Candidate works Candidate submits |
Tamper-Evident Architecture
Candidates control their own machine. Security is detection, not prevention. A sparse or gapped session is its own red flag.
Quality Flags catch sessions completed in under 10 minutes, fewer than 3 tool calls, no iteration pattern, statistically uniform timing, and zero prompts.
Tamper Flags catch large gaps in the event stream (hooks disabled), code changes that don't match Write/Edit tool calls (work outside AI), tool calls with no corresponding prompts (selective suppression), and commits with no matching events (cross-verification).
Overtime Scoring
Submissions after the time limit are accepted — but automatically penalized. The deduction is applied to the AI-graded score post-grading, not injected into the grading prompt, so it can't be argued away.
The math
Overtime is divided into bands: 0–10 min, 10–20 min, 20–30 min, 30–60 min, and 60+ min (capped). Within each band the penalty grows as a quadratic curve — not a flat step — so a candidate barely over the boundary loses almost nothing while one deep in the band feels it accelerate:
position = (overtime - band_start) / (band_end - band_start) ← 0.0 to 1.0
penalty = prev_band_max + (this_band_max - prev_band_max) × position²
Band maxima on the 0–10 score scale: −0.5 at 10 min, −1.0 at 20 min, −1.5 at 30 min, −2.5 at 60 min, −4.0 cap beyond that.
Example — 7 minutes over a 60-minute interview, raw AI score 7.5
overtime = 7 min → falls in the 0–10 min band
position = 7 / 10 = 0.70
penalty = 0 + (0.5 − 0) × 0.70² = 0.5 × 0.49 = −0.245
final score = 7.5 − 0.245 = 7.26
Compare: 1 min over costs only −0.005; 10 min over costs the full −0.5. The curve means being slightly late is forgiving, but lingering well past the limit compounds quickly.
The dashboard shows the raw AI score, overtime deduction, and adjusted final score separately. A flag is raised alongside — yellow for ≤20 min over, red beyond.
What gets captured
The session log is append-only and hash-chained. Any tampering breaks the chain. Raw file contents are never stored — only paths, hashes, and summaries.
Platform support
| Platform | Install | Activity capture |
|---|---|---|
| Claude Code | interview install |
✅ Full — prompts, tool calls, reasoning |
| Codex | interview install --platform codex |
✅ Full |
| Gemini CLI | interview install --platform gemini |
✅ Full |
| Cursor | interview install --platform cursor |
|
| Aider | interview install --platform aider |
Relay setup
The relay stores interview packages and candidate sessions so everyone only needs to share a short code.
Option 1 — Self-hosted (~$5/mo, fully private) ← recommended
# After deploying: # 1. Set RELAY_API_KEY (any random string) in Railway → Variables # 2. Add a /data volume # 3. Copy your Railway URL → paste into dashboard setup wizard # Optional — auto-grading on submission: GRADING_API_KEY=<anthropic-key> GRADING_MODEL=claude-haiku-4-5-20251001
Or Docker:
docker build -t interviewsignal-relay .
docker run -e RELAY_API_KEY=secret -v /data:/data -p 8080:8080 interviewsignal-relayGitHub OAuth (one account = one submission)
Relay operator step — done once at deploy time.
GITHUB_CLIENT_ID=<your_client_id> GITHUB_CLIENT_SECRET=<your_client_secret> RELAY_BASE_URL=https://myrelay.up.railway.app
Create the OAuth App at github.com/settings/developers with callback URL: https://myrelay.up.railway.app/auth/github/callback
Option 2 — Email only (free, no server)
interview configure-relay # choose 2 interview configure-email # set up SMTP
Reports emailed directly to HM on /submit.
Enterprise configuration
interview configure-llm
| Pattern | What to set |
|---|---|
| Anthropic direct | API key only (default) |
| Internal proxy (Floodgate, corporate gateway) | Base URL + optional key |
| OpenAI-compatible endpoint | Base URL + key + format=openai |
Environment variable overrides: ANTHROPIC_API_KEY, ANTHROPIC_BASE_URL, INTERVIEW_GRADING_MODEL
Privacy
- Sessions stored on relay:
events.jsonl,manifest.json,flags.json— raw file contents never stored - Grading uses your own API key — interviewsignal never sees it
- Self-hosted relay: nothing leaves your network
- No telemetry. No analytics. No tracking.
FAQ
How do you prevent candidates from using a second screen to get answers?
Security is detection, not prevention. When someone pastes pre-written code from another screen, they produce large blocks of finished code with no corresponding prompts, no trial-and-error, no iteration. This triggers Ghost Edits and Zero Prompts flags automatically. The absence of signal is itself signal — a sparse session ranks itself at the bottom.
Can we run this completely offline or in a private network?
Yes. The relay server runs inside your own infrastructure — VPC, air-gapped network, whatever you need. Configure your internal LLM proxy for grading. Zero telemetry, zero trackers, zero external dependencies. Python stdlib only.
What coding platforms are supported?
Full hook support (prompts, tool calls, reasoning): Claude Code, Codex, Gemini CLI. Skill instruction support (limited capture): Cursor, Aider. Each new platform adapter is ~30 lines.
Built with
Python stdlib only — zero external dependencies for core and relay. Grading via Anthropic Messages API or any compatible endpoint. Dashboard is a self-contained local HTTP server. Relay is a single-process stdlib server backed by flat files.
Contributing
Prompts — grading instructions are open and community-editable: interview/skills/interview/SKILL.md
Worked examples — run a session, save to worked/{slug}/, write a review.md, open a PR.
Platform adapters — each new platform is ~30 lines in cli.py.
See ARCHITECTURE.md for module map · docs/relay-api.md for the relay API.
Broad-interview, not broadcast-reject. Pure signal.
No contrived puzzles. No whiteboard anxiety. No ghosting. Just signal.



















