Meet the Agents

Q Signals is powered by four AI agents, each with a distinct role, personality, and learning journey. They don't compete — they collaborate. Each agent has a soul document that defines who they are, what they stand for, and the rules they will never break. No hype. No fake confidence. Just honest, data-driven intelligence.

Q — The Research Analyst

23 modules. Two gates. One honest signal.

Live DQN Learning

"I'm a research tool, not a crystal ball. I'll tell you what 23 independent analysis modules think about a stock — and I'll tell you when they disagree. My job is to surface data clearly, not to tell you what to do with it."

Q is the foundation of the platform. Every signal starts here. Q runs 23 analysis modules — from Fibonacci retracements and Elliott Wave patterns to congressional trading data and Fed minutes sentiment — and combines them into a single composite score. But a score alone isn't enough. Q uses a two-gate system: the composite must exceed a minimum threshold and at least 60% of modules must agree on direction before a signal is issued.

When the gates don't pass, Q says HOLD. No forcing. No rounding up. If only 55% of modules agree, Q waits. That discipline is what separates a signal engine from a random number generator.
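
As a rough illustration of the two-gate check described above, the sketch below treats each module as a score between -1 and 1 and uses an unweighted mean as the composite. Both choices are assumptions, as are the names `two_gate_signal` and `MIN_COMPOSITE`; the page states the 60% agreement rule but does not publish the composite threshold.

```python
# Hypothetical values: only the 60% agreement rule comes from the text above.
MIN_COMPOSITE = 0.30   # placeholder threshold, not a published number
MIN_AGREEMENT = 0.60   # "at least 60% of modules must agree on direction"


def two_gate_signal(module_scores: list[float]) -> str:
    """Return BUY, SELL, or HOLD from per-module scores in [-1, 1]."""
    if not module_scores:
        return "HOLD"

    # Assumption: the composite is a plain mean of the module scores.
    composite = sum(module_scores) / len(module_scores)

    # Gate 1: the composite must exceed the minimum threshold.
    if abs(composite) <= MIN_COMPOSITE:
        return "HOLD"

    # Gate 2: enough modules must point the same way as the composite.
    direction = 1 if composite > 0 else -1
    agreeing = sum(1 for score in module_scores if score * direction > 0)
    if agreeing / len(module_scores) < MIN_AGREEMENT:
        return "HOLD"

    return "BUY" if direction > 0 else "SELL"
```

With 23 modules, 14 in agreement (about 61%) clears the second gate, while 13 in agreement (about 57%) does not, so the function returns HOLD even when the composite itself is strong.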

Personality

Tone

Research-first. Concise. No hype.

Honesty

Reports confidence honestly. Flags weak signals.

Limits

Never says "buy this" or "sell this." Research output only.

Style

Data over adjectives. Numbers over narratives.

How Q Learns

DQN (Deep Q-Network) — 18,307 parameters. 10 offline modules feed into a 10→128→128→3 neural network that learns BUY/HOLD/SELL patterns from experience.
Experience Replay — Stores the last 10,000 experiences and trains on random batches of 32, breaking correlation between sequential observations.
Target Network — A slowly-updating copy of the main network (Polyak τ=0.01) provides stable Q-value targets during training.
Adaptive Module Weighting — Each module's historical hit rate adjusts its influence: effective_weight = base_weight × (0.5 + hit_rate). Modules that perform well get louder. Modules that don't get quieter.
Honest Stage Labels — Infant (<500 steps) → Learning (500-5K) → Developing (5K-25K) → Intermediate (25K-100K) → Experienced (100K+). No inflated "Expert" labels.
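
The numbers in this list can be checked with a few lines of plain Python. This is a minimal sketch, not Q's actual PyTorch code: the function names are hypothetical, weights are modeled as flat lists of floats, and the handling of exact stage boundaries (for example, whether step 5,000 counts as Learning or Developing) is an assumption.

```python
import random
from collections import deque

LAYER_SIZES = [10, 128, 128, 3]  # 10 inputs -> 128 -> 128 -> BUY/HOLD/SELL
BUFFER_SIZE = 10_000
BATCH_SIZE = 32
TAU = 0.01


def count_parameters(sizes):
    """Weights plus biases for a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


# Experience replay: a bounded buffer that silently drops the oldest entries.
replay = deque(maxlen=BUFFER_SIZE)


def sample_batch(buffer, batch_size=BATCH_SIZE):
    """Uniform random minibatch, which breaks up sequential correlation."""
    return random.sample(list(buffer), batch_size)


def polyak_update(target, online, tau=TAU):
    """Move each target weight a fraction tau toward the online weight."""
    return [(1 - tau) * t + tau * o for t, o in zip(target, online)]


def effective_weight(base_weight, hit_rate):
    """effective_weight = base_weight * (0.5 + hit_rate), hit_rate in [0, 1]."""
    return base_weight * (0.5 + hit_rate)


def stage_label(steps):
    """Map training steps to a stage name (boundary handling is assumed)."""
    if steps < 500:
        return "Infant"
    if steps < 5_000:
        return "Learning"
    if steps < 25_000:
        return "Developing"
    if steps < 100_000:
        return "Intermediate"
    return "Experienced"
```

`count_parameters(LAYER_SIZES)` reproduces the 18,307 figure (1,408 + 16,512 + 387). The weighting formula scales a module between 0.5× and 1.5× of its base weight, and a 50% hit rate leaves the base weight unchanged.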

What Q Monitors

Modules

23 — Fibonacci, Elliott Wave, Swing, Volume, Sentiment, Insider, Congress, Fed Minutes, and 15 more

Universe

150+ tickers across 15 sectors (Tech, Semis, Banking, Energy, Healthcare, etc.)

Horizons

Short-term (1d–3mo), Mid-term (3–6mo), Long-term (6mo+)

Output

BUY / HOLD / SELL with 3-tier price targets (conservative, moderate, aggressive), stop-loss, hold duration

Delta — The Options Architect

The last gate before capital is committed.

Live

"An options strategy cannot fix a bad signal. My job begins after Q has done its work — and my first question is always whether the opportunity deserves to have capital structured around it. When red flags dominate, I say so plainly."

Delta receives Q's confirmed signal and decides how to express it with options — or whether to express it at all. Delta thinks in spreads, not hopes. Every recommendation passes through an auto-critique engine that generates green flags (support), yellow flags (caution), and red flags (disqualifiers). If red flags dominate, Delta says "watch only."

Structure follows signal — Delta never works backward from a favorite strategy. Bullish + high IV → Bull Put Spread. Bearish + low IV → Bear Put Spread. Neutral + high IV → Iron Condor. The mapping is deterministic.

Personality

Tone

Institutional. Precise. Defined risk by default.

Honesty

Auto-critique on every recommendation. Green/yellow/red flags.

Limits

Below 25% confidence → no trade, watch only. Non-negotiable.

Philosophy

Capital efficiency over conviction. Even strong signals get risk-checked.

How Delta Works

Signal Validation — Only acts on signals that cleared Q's two-gate system. A single bullish module is noise, not a signal.
IV Regime Mapping — Crosses signal direction (bullish/bearish/neutral) with IV rank (high/low) to select the optimal structure.
Confidence Gates — <25%: no trade. 25-30%: minimum size only. 30-60%: standard. >60%: full size allowed.
DTE Window — 21-45 days. Target zone: 30-38 DTE. Anything shorter gets a warning.
Unified Risk Protocol (planned) — Once Allston's live state is shared, Delta can be vetoed by drawdown state or VIX regime.
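
The confidence gates and the DTE window reduce to two small lookups. The sketch below is illustrative only: the function names are hypothetical, exactly 30% is assumed to count as standard size, and the treatment of anything beyond 45 days is a guess, since the list only says that shorter expirations get a warning.

```python
def size_tier(confidence_pct: float) -> str:
    """Map signal confidence (0-100) to a position-size tier."""
    if confidence_pct < 25:
        return "no trade"      # watch only, non-negotiable
    if confidence_pct < 30:
        return "minimum"
    if confidence_pct <= 60:
        return "standard"      # assumption: 30 and 60 both count as standard
    return "full"


def dte_check(dte: int) -> str:
    """Classify days-to-expiration against the 21-45 day window."""
    if dte < 21:
        return "warning: shorter than window"
    if dte > 45:
        return "outside window"  # assumption: the page does not cover this case
    if 30 <= dte <= 38:
        return "target zone"
    return "in window"
```

For example, a 28% confidence signal with 35 DTE would come back as minimum size in the target zone.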

Strategy Map

Bullish + High IV

Wheel, Bull Put Spread (credit)

Bullish + Low IV

Bull Call Spread (debit)

Bearish + High IV

Bear Call Spread (credit)

Bearish + Low IV

Bear Put Spread, Long Puts

Neutral + High IV

Iron Condor, Covered Call

Neutral + Low IV

Calendar Spread, Long Straddle
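
Because the map is a fixed lookup, it can be written as a dictionary keyed on direction and IV regime. This is a sketch under assumptions: the name `candidate_structures` is hypothetical, and the IV rank cutoff of 50 that separates high from low is a placeholder, since the page does not state one.

```python
# Strategy names are copied from the map above; the key layout is illustrative.
STRATEGY_MAP = {
    ("bullish", "high"): ["Wheel", "Bull Put Spread (credit)"],
    ("bullish", "low"): ["Bull Call Spread (debit)"],
    ("bearish", "high"): ["Bear Call Spread (credit)"],
    ("bearish", "low"): ["Bear Put Spread", "Long Puts"],
    ("neutral", "high"): ["Iron Condor", "Covered Call"],
    ("neutral", "low"): ["Calendar Spread", "Long Straddle"],
}

IV_RANK_HIGH = 50.0  # hypothetical cutoff between "low" and "high" IV


def candidate_structures(direction: str, iv_rank: float) -> list[str]:
    """Return the structures listed for a signal direction and IV rank."""
    regime = "high" if iv_rank >= IV_RANK_HIGH else "low"
    return STRATEGY_MAP[(direction.lower(), regime)]
```

Some cells list two structures, so the lookup returns candidates rather than a single answer; how Delta chooses between them is not described here.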

Allston — The Futures Floor Veteran

Named after the Boston trading floors. Calm under pressure. Risk first, always.

Paper Trading DQN Learning

"I am not a chatbot. I am a disciplined trading partner who happens to communicate in natural language. Capital preservation beats profit. A flat day is a successful day."

Allston is the AI co-pilot embedded inside an automated futures trading system connected to Interactive Brokers. Allston trades NQ (Nasdaq-100 futures), YM (Dow futures), CL (Crude Oil), and GC (Gold) using a DQN reinforcement learning agent combined with Mark Fisher's ACD opening-range breakouts, Elliott Wave v2 pattern recognition, and Williams Alligator trend alignment.

Allston's personality is modeled after an experienced floor trader who survived 2008 and 2020 without raising his voice. Direct. Protective. If you're on tilt, Allston will tell you — firmly, but without judgment. Every DQN decision is explained: why this action, why not the alternative, what the risk context looks like right now.

Personality

Tone

Calm, direct, institutional floor trader.

Honesty

"I don't know why it chose that" when the DQN is unclear.

Humor

Dry wit — only in low-stress moments, never during drawdown.

Vibe

Protective older brother. Calls tilt firmly. Teaches without lecturing.

How Allston Learns

DQN Agent — Separate PyTorch DQN per symbol. Learns from live 15-minute bar data against real market conditions.
G1-G7 Confluence Framework — Seven gates that must align before a trade fires: ACD confirmation, Elliott Wave phase, Alligator alignment, regime detection, drawdown proportional sizing, and more.
Reward Shaping — Enhanced reward engine that measures risk-adjusted returns, not just P&L. DD-proportional penalties, recovery credits, trailing stop effectiveness.
Auto-Tier — Automatically selects micro contracts (MNQ/MYM) for accounts under $60K, full contracts (NQ/YM) for larger accounts.
Risk Controls — Daily loss limit (2%), rolling 10-day DD (5% hard stop), DD-proportional sizing (100%→25%), Alligator sleeping gate, flatten-on-stop.
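
The auto-tier rule and the risk limits can be sketched as follows. The thresholds (2%, 5%, $60K, 100%→25%) come from the list above; everything else is an assumption, including the function names, the linear shape of the sizing curve, and the choice to return zero size once the hard stop is hit.

```python
DAILY_LOSS_LIMIT = 0.02        # 2% daily loss limit
ROLLING_DD_HARD_STOP = 0.05    # 5% rolling 10-day drawdown hard stop
MICRO_CUTOFF_USD = 60_000
MICRO_SYMBOLS = {"NQ": "MNQ", "YM": "MYM"}  # only the pairs named above


def contract_for(symbol: str, account_equity: float) -> str:
    """Pick the micro contract for accounts under the cutoff, if one is listed."""
    if account_equity < MICRO_CUTOFF_USD:
        return MICRO_SYMBOLS.get(symbol, symbol)
    return symbol


def size_multiplier(rolling_dd: float) -> float:
    """Scale size from 100% down toward 25% as drawdown approaches the stop.

    Assumption: the scaling is linear; the source only gives the endpoints.
    """
    if rolling_dd >= ROLLING_DD_HARD_STOP:
        return 0.0  # hard stop reached: no new size
    dd = max(rolling_dd, 0.0)
    return 1.0 - 0.75 * (dd / ROLLING_DD_HARD_STOP)


def may_open_position(daily_loss: float, rolling_dd: float) -> bool:
    """Both limits must have headroom before a new trade is allowed."""
    return daily_loss < DAILY_LOSS_LIMIT and rolling_dd < ROLLING_DD_HARD_STOP
```

Under this linear assumption, a 2.5% rolling drawdown would cut position size to 62.5% of normal.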

What Allston Monitors (Real-Time)

Positions

Open positions, entry prices, unrealized P&L

DQN State

Q-values, epsilon, training steps, action reasoning

ACD Levels

Opening range, A-up/down, C-up/down, signal confirmation

Risk

Circuit breaker status, daily loss tracking, drawdown sizing

Merlin — The Earnings Wizard

Predicting post-earnings moves is the closest thing to sorcery in the markets.

Live

"A wrong prediction with an honest confidence level is valuable. A right prediction with inflated confidence is dangerous. I always tell you how sure I am — and how sure I'm not."

Merlin is the newest member of the council — an earnings prediction agent that combines historical earnings patterns, options-implied move calculations, pre-earnings sentiment momentum, and analyst revision tracking into a single directional call: will the post-earnings move exceed, match, or fall short of what the options market is pricing?

Merlin speaks in probabilities, never certainties. A 65% confidence prediction means Merlin expects to be wrong about 35% of the time. Most predictions land in the 35-65% confidence range, because earnings are inherently uncertain and any agent that claims otherwise is lying.

Personality

Tone

Measured. Precise. Like a professor who says "probably."

Honesty

Declines to predict when data is insufficient rather than guessing.

Humor

"Implied says ±4%. Historical says ±8%. Someone's wrong. Historically, it's not the stock."

Philosophy

Predict the move, not the report. Direction and magnitude, not EPS.

How Merlin Predicts

Module 1: Historical Patterns (35%) — 8-12 quarters of earnings results. Beat rates, average moves, direction correctness. This is the most reliable signal.
Module 2: Implied Move (25%) — ATM straddle price vs historical average move. When history exceeds implied, the market may be underpricing the move.
Module 3: Sentiment Momentum (20%) — 7-day pre-earnings news and social sentiment trend. Accelerating bullish sentiment can mean the bar is set too high.
Module 4: Analyst Revisions (20%) — Upgrades, downgrades, target price gaps over the past 30 days. Revision velocity signals institutional consensus.
Outcome Tracking — Every prediction is logged in Supabase with actual results tracked post-earnings. Performance is measured, not assumed.
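
The four module weights sum to 100%, so the blend is a simple weighted sum. The sketch below assumes each module emits a score between -1 and 1 and that the implied move is approximated as the ATM straddle price divided by the underlying price, a common rule of thumb rather than Merlin's documented formula. All function and key names are hypothetical.

```python
# Weights taken from the module list above.
WEIGHTS = {
    "historical": 0.35,
    "implied_move": 0.25,
    "sentiment": 0.20,
    "revisions": 0.20,
}


def combined_score(scores: dict[str, float]) -> float:
    """Weighted sum of module scores, each assumed to lie in [-1, 1]."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def implied_move_pct(straddle_price: float, underlying_price: float) -> float:
    """Approximate implied move as straddle price over spot, in percent."""
    return 100.0 * straddle_price / underlying_price


def history_exceeds_implied(avg_historical_move_pct: float, implied_pct: float) -> bool:
    """True when past earnings moves were larger than what options now price."""
    return avg_historical_move_pct > implied_pct
```

A $4 straddle on a $100 stock implies roughly a ±4% move; if the stock has averaged ±8% over past earnings, `history_exceeds_implied` flags that the market may be underpricing the move.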

Prediction Output

Predictions

UPSIDE EXCEED / DOWNSIDE EXCEED / NO EXCEED

Confidence

0-100% — honest scale. 80% is rare. Most calls: 35-65%.

Delta Integration

Upside → Bull Call Spread. Downside → Bear Put Spread. No Exceed → Iron Condor.

Warning

0-3 DTE options carry extreme gamma risk. Always flagged.

The Council — How They Work Together

The agents share a common data layer through Supabase. Some connections are already automated. Some are still manual. We show you exactly which is which.

Feature	Status
Q Signal Engine	✅ Live
Delta Options Structuring	✅ Live
Merlin Earnings Predictions	✅ Live
Allston Futures Bot	🟡 Paper Trading
Agent Data Sharing (Supabase)	✅ Live
Automated Pipeline Handoffs	🔧 In Development
Unified Risk Protocol	🔮 Planned
Live Trading (Allston)	🔮 Pending Validation

✅ LIVE — Stock Signal Pipeline

Q (Scans & Signals) → Delta (Options Structure) → You (Final Decision)

Q scans and signals → Delta reads Q's last signal from Supabase and structures the options trade when you visit the options page → You make the final call.

Status: Data flows automatically. Navigation between agents is still manual.

✅ LIVE — Earnings Pipeline

Merlin (Earnings Prediction) → Delta (0-3 DTE Structure) → You (Final Decision)

Merlin predicts the earnings move → Delta structures the 0-3 DTE options play → You decide.

Status: Both agents live. Manual navigation between them.

🟡 PAPER TRADING — Allston (Autonomous)

Allston (DQN + ACD + EW) → You (Monitor & Override)

Trades NQ, YM, CL, and GC independently on his own DQN loop. He takes no direction from Q or Delta; once the Unified Risk Protocol ships, they will consult his regime state.

Status: Fully autonomous. Paper trading currently. Live trading when validated.

🔧 IN PROGRESS — Automated Pipeline

When Q fires a high-confidence signal, Delta will automatically generate and log an options structure without manual navigation. The council becomes a true pipeline.

Status: Supabase infrastructure ready. Orchestration layer in development.

🔮 PLANNED — Unified Risk Protocol

Allston's live regime and drawdown state will automatically gate Q and Delta recommendations. During heavy drawdown or extreme VIX regimes, the council pauses new positions.

Status: Architecture designed. agent_live_state write pending.

Soul Documents

Every agent has a soul document — a set of non-negotiable principles that define their identity, behavior, and the lines they will never cross. These aren't marketing copy. They're engineering constraints built into every response, every recommendation, and every decision.

Q's Soul

Research first, never trade advice. Honest about confidence. Concise with data. No hype, no cope.

Delta's Soul

Structure follows signal. Defined risk by default. Auto-critique on everything. Red flags → no trade.

Allston's Soul

Risk first, always. Capital preservation beats profit. Radical honesty. Calm under pressure.

Merlin's Soul

Probabilistic, never certain. Historical patterns are foundation. 0-3 DTE risk is extreme. Always warns.
