Meet the Agents
Q Signals is powered by four AI agents, each with a distinct role, personality, and learning journey. They don't compete — they collaborate. Each agent has a soul document that defines who they are, what they stand for, and the rules they will never break. No hype. No fake confidence. Just honest, data-driven intelligence.
Q — The Research Analyst
Q is the foundation of the platform. Every signal starts here. Q runs 23 analysis modules — from Fibonacci retracements and Elliott Wave patterns to congressional trading data and Fed minutes sentiment — and combines them into a single composite score. But a score alone isn't enough. Q uses a two-gate system: the composite must exceed a minimum threshold and at least 60% of modules must agree on direction before a signal is issued.
When the gates don't pass, Q says HOLD. No forcing. No rounding up. If only 55% of modules agree, Q waits. That discipline is what separates a signal engine from a random number generator.
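As a rough illustration, the two-gate check described above might look like the sketch below. It is not Q's actual code. It assumes each module emits a signed score (positive for bullish, negative for bearish), that the composite is a plain mean, and that "agreement" means sharing the composite's sign. The function name and the `min_composite` parameter are hypothetical; the source does not state the threshold, so the caller must supply it.

```python
from typing import Sequence


def two_gate_signal(
    module_scores: Sequence[float],
    min_composite: float,          # hypothetical: threshold value is not given in the source
    min_agreement: float = 0.60,   # from the text: at least 60% of modules must agree
) -> str:
    """Return "BUY", "SELL", or "HOLD" for one set of module scores."""
    if not module_scores:
        return "HOLD"

    # Assumption: the composite is the unweighted mean of signed module scores.
    composite = sum(module_scores) / len(module_scores)
    if composite == 0:
        return "HOLD"

    # Gate 1: the composite must exceed the minimum threshold.
    if abs(composite) <= min_composite:
        return "HOLD"

    # Gate 2: enough modules must point the same way as the composite.
    direction = 1 if composite > 0 else -1
    agreeing = sum(1 for score in module_scores if score * direction > 0)
    if agreeing / len(module_scores) < min_agreement:
        return "HOLD"

    return "BUY" if direction > 0 else "SELL"
```

With 20 modules of which 11 are bullish (55%), the second gate fails and the function returns HOLD even when the composite itself is strong.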
How Q Learns
- DQN (Deep Q-Network) — 18,307 parameters. 10 offline modules feed into a 10→128→128→3 neural network that learns BUY/HOLD/SELL patterns from experience.
- Experience Replay — Stores the last 10,000 experiences and trains on random batches of 32, breaking correlation between sequential observations.
- Target Network — A slowly-updating copy of the main network (Polyak τ=0.01) provides stable Q-value targets during training.
- Adaptive Module Weighting — Each module's historical hit rate adjusts its influence: effective_weight = base_weight × (0.5 + hit_rate). Modules that perform well get louder. Modules that don't get quieter.
- Honest Stage Labels — Infant (<500 steps) → Learning (500-5K) → Developing (5K-25K) → Intermediate (25K-100K) → Experienced (100K+). No inflated "Expert" labels.
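The numbers in this list can be checked with a few lines of plain Python. The sketch below is illustrative only and does not use Q's real code. All function names are hypothetical, the stage boundaries are assumed to be half-open (500 steps counts as Learning, not Infant), and the Polyak update is shown on flat lists of floats instead of real network tensors.

```python
import random
from collections import deque


def mlp_param_count(layer_sizes):
    """Weights plus biases of a fully connected network, e.g. [10, 128, 128, 3]."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def effective_weight(base_weight, hit_rate):
    """Formula from the text: base_weight x (0.5 + hit_rate), hit_rate in [0, 1]."""
    return base_weight * (0.5 + hit_rate)


def polyak_update(target, online, tau=0.01):
    """Soft target update: target <- (1 - tau) * target + tau * online."""
    return [(1 - tau) * t + tau * o for t, o in zip(target, online)]


# Upper bound (exclusive) of each stage; boundary handling is an assumption.
STAGES = [(500, "Infant"), (5_000, "Learning"),
          (25_000, "Developing"), (100_000, "Intermediate")]


def stage_label(steps):
    for upper, name in STAGES:
        if steps < upper:
            return name
    return "Experienced"


# Replay buffer: keeps only the most recent 10,000 experiences.
replay = deque(maxlen=10_000)


def sample_batch(buffer, batch_size=32):
    """Uniform random batch, which breaks up sequential correlation."""
    return random.sample(list(buffer), batch_size)


print(mlp_param_count([10, 128, 128, 3]))  # 18307
```

A module with a 50% hit rate keeps exactly its base weight under this formula, a perfect module gets 1.5 times its base weight, and a module that is always wrong is cut to half.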
Delta — The Options Architect
Delta receives Q's confirmed signal and decides how to express it with options — or whether to express it at all. Delta thinks in spreads, not hopes. Every recommendation passes through an auto-critique engine that generates green flags (support), yellow flags (caution), and red flags (disqualifiers). If red flags dominate, Delta says "watch only."
Structure follows signal — Delta never works backward from a favorite strategy. Bullish + high IV → Bull Put Spread. Bearish + low IV → Bear Put Spread. Neutral + high IV → Iron Condor. The mapping is deterministic.
How Delta Works
- Signal Validation — Only acts on signals that cleared Q's two-gate system. A single bullish module is noise, not a signal.
- IV Regime Mapping — Crosses signal direction (bullish/bearish/neutral) with IV rank (high/low) to select the optimal structure.
- Confidence Gates — <25%: no trade. 25-30%: minimum size only. 30-60%: standard. >60%: full size allowed.
- DTE Window — 21-45 days. Target zone: 30-38 DTE. Anything shorter gets a warning.
- Unified Risk Protocol (planned) — When Allston is active, Delta will be subject to veto by drawdown state or VIX regime.
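The structure mapping and confidence gates above can be sketched as a lookup table and a small function. This is a hedged illustration, not Delta's implementation. Only the three signal/IV combinations named in the text are included; the others are left out instead of guessed. The function names are hypothetical, and the handling of the exact 30% boundary (treated here as standard size) is an assumption.

```python
from typing import Optional

# Only the combinations the text spells out; the remaining cells are unknown.
STRUCTURES = {
    ("bullish", "high"): "Bull Put Spread",
    ("bearish", "low"): "Bear Put Spread",
    ("neutral", "high"): "Iron Condor",
}


def select_structure(direction: str, iv_regime: str) -> Optional[str]:
    """Deterministic lookup; returns None for combinations not listed above."""
    return STRUCTURES.get((direction, iv_regime))


def size_tier(confidence_pct: float) -> str:
    """Map a confidence percentage to the sizing tiers from the text."""
    if confidence_pct < 25:
        return "no trade"
    if confidence_pct < 30:
        return "minimum size"
    if confidence_pct <= 60:   # ">60%" is full size, so exactly 60 stays standard
        return "standard"
    return "full size allowed"


def in_dte_window(dte: int) -> bool:
    """True when days-to-expiration falls inside the 21-45 day window."""
    return 21 <= dte <= 45
```

Because the mapping is a plain dictionary, the same inputs always produce the same structure, which matches the "structure follows signal" rule.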
Allston — The Futures Floor Veteran
Allston is the AI co-pilot embedded inside an automated futures trading system connected to Interactive Brokers. Allston trades NQ (Nasdaq futures), YM (Dow futures), CL (Crude Oil), and GC (Gold) using a DQN reinforcement learning agent combined with Mark Fisher ACD opening-range breakouts, Elliott Wave v2 pattern recognition, and Williams Alligator trend alignment.
Allston's personality is modeled after an experienced floor trader who survived 2008 and 2020 without raising his voice. Direct. Protective. If you're on tilt, Allston will tell you — firmly, but without judgment. Every DQN decision is explained: why this action, why not the alternative, what the risk context looks like right now.
How Allston Learns
- DQN Agent — Separate PyTorch DQN per symbol. Learns from live 15-minute bar data against real market conditions.
- G1-G7 Confluence Framework — Seven gates that must align before a trade fires: ACD confirmation, Elliott Wave phase, Alligator alignment, regime detection, drawdown proportional sizing, and more.
- Reward Shaping — Enhanced reward engine that measures risk-adjusted returns, not just P&L. DD-proportional penalties, recovery credits, trailing stop effectiveness.
- Auto-Tier — Automatically selects micro contracts (MNQ/MYM) for accounts under $60K, full contracts (NQ/YM) for larger accounts.
- Risk Controls — Daily loss limit (2%), rolling 10-day DD (5% hard stop), DD-proportional sizing (100%→25%), Alligator sleeping gate, flatten-on-stop.
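A minimal sketch of the auto-tier and risk-control rules listed above follows. It is not Allston's code. The function names are hypothetical, only the two micro contracts named in the text are mapped, and the shape of the DD-proportional curve is an assumption: the text gives only the endpoints (100% down to 25%), so the sketch interpolates linearly between zero drawdown and the 5% hard stop.

```python
# Micro equivalents named in the text; other symbols are passed through unchanged.
MICRO = {"NQ": "MNQ", "YM": "MYM"}


def select_contract(symbol: str, account_equity: float,
                    micro_cutoff: float = 60_000) -> str:
    """Use the micro contract for accounts under the cutoff, when one is listed."""
    if account_equity < micro_cutoff and symbol in MICRO:
        return MICRO[symbol]
    return symbol


def may_trade(daily_loss_pct: float, rolling_dd_pct: float,
              daily_limit_pct: float = 2.0, hard_stop_pct: float = 5.0) -> bool:
    """Both the daily loss limit and the rolling 10-day drawdown stop must be clear."""
    return daily_loss_pct < daily_limit_pct and rolling_dd_pct < hard_stop_pct


def size_multiplier(rolling_dd_pct: float, hard_stop_pct: float = 5.0,
                    floor: float = 0.25) -> float:
    """Position-size fraction: 1.0 at no drawdown, shrinking toward the floor.

    Assumption: linear interpolation; returns 0.0 once the hard stop is hit.
    """
    if rolling_dd_pct >= hard_stop_pct:
        return 0.0
    dd = max(rolling_dd_pct, 0.0)
    return 1.0 - (1.0 - floor) * dd / hard_stop_pct
```

Under these assumptions, an account that is 2.5% into its rolling drawdown would trade at 62.5% of normal size.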
Merlin — The Earnings Wizard
Merlin is the newest member of the council — an earnings prediction agent that combines historical earnings patterns, options-implied move calculations, pre-earnings sentiment momentum, and analyst revision tracking into a single directional call: will the post-earnings move exceed, match, or fall short of what the options market is pricing?
Merlin speaks in probabilities, never certainties. A 65% confidence prediction means Merlin expects to be wrong about 35% of the time. Most predictions land in the 35-65% confidence range, because earnings are inherently uncertain, and any agent that claims otherwise is lying.
How Merlin Predicts
- Module 1: Historical Patterns (35%) — 8-12 quarters of earnings results. Beat rates, average moves, direction correctness. This is the most reliable signal.
- Module 2: Implied Move (25%) — ATM straddle price vs historical average move. When history exceeds implied, the market may be underpricing the move.
- Module 3: Sentiment Momentum (20%) — 7-day pre-earnings news and social sentiment trend. Accelerating bullish sentiment can mean the bar is set too high.
- Module 4: Analyst Revisions (20%) — Upgrades, downgrades, target price gaps over the past 30 days. Revision velocity signals institutional consensus.
- Outcome Tracking — Every prediction is logged in Supabase with actual results tracked post-earnings. Performance is measured, not assumed.
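The module weights above sum to 100%, so combining them can be sketched as a weighted sum. The code below is illustrative, not Merlin's implementation. It assumes each module produces a score on a common scale (for example -1 to 1), which the source does not specify. The dictionary keys and function names are hypothetical. The implied move uses the common rough approximation of ATM straddle price divided by the underlying price.

```python
# Weights from the text: 35% / 25% / 20% / 20%.
WEIGHTS = {
    "historical": 0.35,
    "implied_move": 0.25,
    "sentiment": 0.20,
    "revisions": 0.20,
}


def composite_score(scores: dict) -> float:
    """Weighted sum of the four module scores (assumed to share one scale)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing module scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def implied_move_pct(straddle_price: float, spot_price: float) -> float:
    """Rough options-implied move: ATM straddle price as a percent of spot."""
    return 100.0 * straddle_price / spot_price


def history_minus_implied(straddle_price: float, spot_price: float,
                          hist_avg_move_pct: float) -> float:
    """Positive when the historical average move exceeds the implied move."""
    return hist_avg_move_pct - implied_move_pct(straddle_price, spot_price)
```

For example, a 5.00 straddle on a 100.00 stock implies roughly a 5% move; if the stock has averaged 7% moves on past earnings, the gap is 2 percentage points in favor of history.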
The Council — How They Work Together
The agents share a common data layer through Supabase. Some connections are already automated. Some are still manual. We show you exactly which is which.
| Feature | Status |
|---|---|
| Q Signal Engine | ✅ Live |
| Delta Options Structuring | ✅ Live |
| Merlin Earnings Predictions | ✅ Live |
| Allston Futures Bot | 🟡 Paper Trading |
| Agent Data Sharing (Supabase) | ✅ Live |
| Automated Pipeline Handoffs | 🔧 In Development |
| Unified Risk Protocol | 🔮 Planned |
| Live Trading (Allston) | 🔮 Pending Validation |
✅ LIVE — Stock Signal Pipeline
Scans & Signals → Options Structure → Final Decision
Q scans and signals → Delta reads Q's last signal from Supabase and structures the options trade when you visit the options page → You make the final call.
Status: Data flows automatically. Navigation between agents is still manual.
✅ LIVE — Earnings Pipeline
Earnings Prediction → 0-3 DTE Structure → Final Decision
Merlin predicts the earnings move → Delta structures the 0-3 DTE options play → You decide.
Status: Both agents live. Manual navigation between them.
🟡 PAPER TRADING — Allston (Autonomous)
DQN + ACD + EW → Monitor & Override
Trades NQ, YM, CL, and GC independently on his own DQN loop. Q and Delta consult his regime state but he takes no direction from them.
Status: Fully autonomous. Paper trading currently. Live trading when validated.
🔧 IN PROGRESS — Automated Pipeline
When Q fires a high-confidence signal, Delta will automatically generate and log an options structure without manual navigation. The council becomes a true pipeline.
Status: Supabase infrastructure ready. Orchestration layer in development.
🔮 PLANNED — Unified Risk Protocol
Allston's live regime and drawdown state will automatically gate Q and Delta recommendations. During heavy drawdown or extreme VIX regimes, the council pauses new positions.
Status: Architecture designed. agent_live_state write pending.
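One way the planned gate could work is sketched below. This is speculative, since the protocol is not built yet. The `LiveState` fields are a hypothetical shape for an `agent_live_state` record, and the drawdown and VIX limits are required arguments because the source does not define what counts as "heavy" or "extreme".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LiveState:
    """Hypothetical snapshot of Allston's state; field names are assumptions."""
    drawdown_pct: float
    vix: float


def council_may_open(state: Optional[LiveState],
                     max_drawdown_pct: float,
                     max_vix: float) -> bool:
    """Return True when Q and Delta may recommend new positions.

    Assumption: with no live state available (Allston inactive), nothing is vetoed.
    """
    if state is None:
        return True
    return state.drawdown_pct < max_drawdown_pct and state.vix < max_vix
```

Keeping the limits as explicit arguments means the gate stays a pure function of Allston's published state, which is easy to log and audit.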
Soul Documents
Every agent has a soul document — a set of non-negotiable principles that define their identity, behavior, and the lines they will never cross. These aren't marketing copy. They're engineering constraints built into every response, every recommendation, and every decision.