Markitel · advisor brief

v1 · 2026-05-14 · prepared for trading-signals advisor review

What we've built, what it's doing, and where it actually has edge.

Markitel is a real-time multi-asset signal platform with an integrated automation rail, an MT5 broker bridge, and a daily LLM analyst → critic → auto-tuner loop. The infrastructure is unusually thorough for a stage-zero company. The honest open question — the one we want your sharpest read on — is whether classical multi-indicator TA on liquid markets can clear costs at all, and if not, which of the additions on our roadmap actually moves the needle.

Stage: paper + bridged-live, single-user
Universe: FX · metals · indices · crypto
Cadence: signals every 2 min · v2 agent every 1 min
Stack: Next.js · Supabase · Vercel · MT5 EA · Claude Opus 4.7

7-day net P&L · all rails

−$30,067

877 closed · 43.4 % WR · profit factor 0.59

7-day net P&L · v2 rail only

+$5,550

188 closed · 52.7 % WR · profit factor 1.46

Only profitable asset (60d)

XAUUSD

25 % WR · n=13 · everything else bleeds

Active open paper exposure

−$135k

floating, 329 open positions (30-d slice)

Numbers above are real-data snapshots from production Supabase. The v2 rail isn't magic — its advanced features (ranker, regime, bandit, adaptive sizing/exits, reentry) are all flag-off. v2 wins the slice mostly because it didn't take London-session signals during a bad London week. Audit detail in section 5.

What Markitel is, in one screen

A signal-first trading product. Users see ranked, calibrated trade ideas across FX/metals/indices/crypto, can copy them manually, or enroll them in an automation agent with their own rule set (per-rule confidence floors, asset filters, schedule, risk & cooldown caps). Paper trading is the default; live trading routes via an MT5 Expert Advisor bridge. Everything is paper-gated, promotion-gated, and circuit-breaker-gated before any user ever risks live capital.

Intel terminal — the signal feed

/intel

Agent settings + skills

/automation

Portfolio & journal — paper + live unified

/portfolio

Signals feed (mobile)

/feed

The signal pipeline, end-to-end

A signal flows through six bounded stages before a user can see it; a candidate trade flows through nineteen-plus risk gates before any order fires. Every stage and every gate writes an audit row (signal_gate_events, automation_decisions.gates_evaluated[]). Reversibility is a first-class property — every model change ships behind an env flag, a parameter row, or a model-version pointer.

01 · ingest

Market data

Per-asset H4 / H1 / M15 candles, currency strength state, economic calendar, news embeddings.

→

02 · classify

Confluence engine

10-factor weighted vote → setup type, direction, confidence, structure SL/TP.

→

03 · gate

Session + regime

Session-quality matrix, HTF veto, currency-strength prescan, calibrated confidence.

→

04 · shadow

Parallel A/B

baseline · p1-blacklists · learned-v1: each variant runs in parallel into signals_shadow.

→

05 · emit

Publish + audit

Row into signals, factor scores persisted, narrative cached, gate events logged.

→

06 · review

Analyst · critic · tuner

Daily Opus 4.7 review, second-LLM critic, walk-forward replay, bounded parameter delta.

The honest claim isn't that the model is exceptional — the literature is clear that classical multi-indicator TA on liquid markets has near-zero edge after costs. The honest claim is that the rails are. Each step in 02–06 is reusable for any future model: drop a learned classifier into 02, the shadow rail in 04 measures it, the critic in 06 rejects bad parameter changes, and the live publish in 05 is a single config flag away.

The engine, what it actually computes

Confluence — the 10-factor classifier

A weighted heuristic vote. Min-confluence threshold defaults to 52; per-asset overrides live in engine_parameters.

Factor	Weight
Trend structure (HH/HL or LH/LL)	20%
EMA ribbon alignment (8/21/55/200)	15%
RSI regime	12%
MACD histogram & signal cross	12%
ADX strength	10%
Volume / relative volume	10%
Bollinger band position & squeeze	8%
Stoch-RSI cross	5%
S/R proximity	4%
Liquidity-sweep / wick reject	4%

Six setup types: TREND_CONTINUATION · TREND_BREAKOUT · MEAN_REVERSION · MOMENTUM_SURGE · DIVERGENCE_REVERSAL · SQUEEZE_BREAKOUT. Each can be disabled per-asset by the auto-tuner.
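The weighted vote can be sketched as follows. This is a minimal illustration, not the code in lib/signals/confluence-engine.ts: the weights and the default threshold of 52 come from the table above, while the factor key names and the assumption that each factor reports agreement as a number in [0, 1] are hypothetical.

```typescript
// Factor weights copied from the table above (percent, summing to 100).
// Key names are hypothetical labels chosen for this sketch.
const FACTOR_WEIGHTS = {
  trendStructure: 20,
  emaRibbon: 15,
  rsiRegime: 12,
  macd: 12,
  adx: 10,
  volume: 10,
  bollinger: 8,
  stochRsi: 5,
  srProximity: 4,
  liquiditySweep: 4,
} as const;

type Factor = keyof typeof FACTOR_WEIGHTS;

// Assumption: each factor reports how strongly it agrees with the proposed
// direction as a number in [0, 1]. The real score scale is not documented here.
type FactorScores = Record<Factor, number>;

const DEFAULT_MIN_CONFLUENCE = 52;

// Weighted sum of factor agreement, on a 0..100 scale.
function confluenceScore(scores: FactorScores): number {
  let total = 0;
  for (const factor of Object.keys(FACTOR_WEIGHTS) as Factor[]) {
    const clamped = Math.min(1, Math.max(0, scores[factor]));
    total += FACTOR_WEIGHTS[factor] * clamped;
  }
  return total;
}

// A signal qualifies when the weighted score reaches the threshold.
// Per-asset overrides would be passed in as minConfluence.
function passesConfluence(
  scores: FactorScores,
  minConfluence: number = DEFAULT_MIN_CONFLUENCE,
): boolean {
  return confluenceScore(scores) >= minConfluence;
}
```

Under these assumptions, full agreement on the four heaviest factors alone scores 59 and clears the default threshold, while the top three alone score 47 and do not.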

Validation toolchain

walk-forward: Replay last 60d against any proposed parameter delta; reject if winners drop faster than losers.
calibration: Isotonic regression (PAV) on confidence → realized hit rate, fit monthly on 90d closed signals.
stats: Paired bootstrap (p<0.05) + Bonferroni, applied by the critic before any promotion.
sanity: Rule guards (n≥8 per asset, 7d anti-thrash, hard envelopes) then a second-LLM skeptic.
shadow: Every model change runs into signals_shadow for ≥ 14 days before the live pointer flips.
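For the calibration step, a pool-adjacent-violators (PAV) fit can be sketched as below. It is a simplified illustration rather than the code in lib/services/signals/calibration.ts: the type and function names are hypothetical, outcomes are reduced to win/loss, and the fitted curve is read as a step function.

```typescript
interface CalibrationPoint {
  confidence: number; // raw engine confidence
  won: boolean;       // realized outcome of the closed signal
}

interface Block {
  minConf: number;
  maxConf: number;
  wins: number;
  count: number;
}

// Fit a non-decreasing step function from confidence to realized hit rate.
function fitIsotonic(points: CalibrationPoint[]): Block[] {
  // 1. Pool signals that share the same confidence value.
  const byConf = new Map<number, Block>();
  for (const p of points) {
    const existing = byConf.get(p.confidence);
    const block = existing ?? { minConf: p.confidence, maxConf: p.confidence, wins: 0, count: 0 };
    block.wins += p.won ? 1 : 0;
    block.count += 1;
    byConf.set(p.confidence, block);
  }
  const groups = Array.from(byConf.values()).sort((a, b) => a.minConf - b.minConf);

  // 2. PAV: merge a block into its left neighbour while the neighbour's
  //    hit rate is higher, so the fitted rates never decrease.
  const blocks: Block[] = [];
  for (const group of groups) {
    let current: Block = { ...group };
    let prev = blocks[blocks.length - 1];
    while (prev !== undefined && prev.wins / prev.count > current.wins / current.count) {
      blocks.pop();
      current = {
        minConf: prev.minConf,
        maxConf: current.maxConf,
        wins: prev.wins + current.wins,
        count: prev.count + current.count,
      };
      prev = blocks[blocks.length - 1];
    }
    blocks.push(current);
  }
  return blocks;
}

// Look up the calibrated hit rate: the rate of the last block that starts
// at or below the given confidence (or the first block if none does).
function calibratedHitRate(blocks: Block[], confidence: number): number {
  let chosen = blocks[0];
  if (chosen === undefined) throw new Error("no calibration data");
  for (const b of blocks) {
    if (b.minConf <= confidence) chosen = b;
  }
  return chosen.wins / chosen.count;
}
```

The monotonicity constraint is what makes the output usable as a confidence floor: a higher raw confidence can never map to a lower calibrated hit rate.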

Self-tuning loop (daily)

00:10 UTC engine-analyst — Opus 4.7 reads 14d cohort + factor weights + bounds; returns markdown analysis + JSON recs.
00:30 UTC engine-auto-tune — rule gate → challenger critic → walk-forward → bounded delta (max 2 pts/day).
every 6h engine-critic — second-pass review of analyst recs.
monthly calibrate-confidence — re-fit isotonic regression.
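The "bounded delta" step amounts to two clamps: one on the size of a single day's move and one on the hard envelope. A minimal sketch, assuming a numeric parameter such as the min-confluence threshold; the function name and the envelope values in the usage note are hypothetical, and only the 2-point daily cap comes from the loop described above.

```typescript
interface ParameterBounds {
  min: number; // hard envelope floor
  max: number; // hard envelope ceiling
}

const MAX_DAILY_DELTA = 2; // "max 2 pts/day"

// Move `current` toward `proposed`, but by no more than `maxDelta` per call
// and never outside the hard envelope.
function applyBoundedDelta(
  current: number,
  proposed: number,
  bounds: ParameterBounds,
  maxDelta: number = MAX_DAILY_DELTA,
): number {
  const delta = Math.max(-maxDelta, Math.min(maxDelta, proposed - current));
  return Math.max(bounds.min, Math.min(bounds.max, current + delta));
}
```

With an illustrative envelope of 40 to 70, a recommendation to jump from 52 to 60 would land at 54 on day one, so a bad analyst recommendation needs several consecutive days of agreement to do real damage.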

Engine analyst output (markdown + factor scoring)

/admin/engine

Live signal cards + race bars

/intel · /signals/[id]

The automation rails — v1 today, v2 in promotion

Users do not auto-trade by ticking a box. They subscribe to skills (rule rows) that each carry an explicit match_dsl (which signals?) and action_dsl (what risk, what exits?). Every candidate goes through 19+ ordered gates and writes an automation_decisions row whether it's accepted or rejected.

v1 — production rail

cron: automation-tick · */2 * * * *
partition: users with agent_version IS NULL OR agent_version = 'v1'
dispatch: first-match candidate → gate stack → execute
state: fully live, ~1.8k closed trades / 30 d

30-day v1: −$50,886 · 38.3 % WR · profit factor 0.56. This is the slice we need to fix.

v2 — parallel test rail flag-gated

cron: automation-tick-v2 · * * * * *
partition: users with agent_version = 'v2'
features: ranker · regime · bandit · adaptive sizing · adaptive exits · reentry · dynamic cooldown
flag state: master ON · all 7 sub-features OFF
guards: paper-gated 14 d · ≥ 100 closed trades · promotion-criteria · admin live-flip · circuit breaker

v2 (7d): +$5,550 · 52.7 % WR · pf 1.46. Caveat: the lift is mostly cohort & session selection, not yet the smart features. They're still flag-off.

Risk-gate stack (in order)

Defined in lib/risk/automation-gates.ts. First reject short-circuits. All evaluated gates are recorded with latency in automation_decisions.gates_evaluated[].

kill_switch_active

User-level emergency stop, admin-flippable.

broker_disconnected

MT5 EA heartbeat stale > threshold; live blocked.

concurrent_per_class_cap

Hard ceiling on open positions per asset class.

margin_cap

Projected margin vs free-margin guard.

automation_paused

User-scheduled pause window.

schedule_quiet_hours

Time-of-day filter, per user.

news_blackout

±30 min around high-impact events.

asset_filter / allowlist / denylist

Class & symbol level, settings + per-rule.

crypto_live_disabled · daily · weekly cap

Restrictive defaults, opt-in only.

confidence_below_threshold

Per-rule floor + global approval threshold.

rule_throttle_minute / hour / day

Prevent over-fire on broad-match rules.

cooldown

Same-asset cooldown, N seconds per rule.

post_loss_cooldown

Tiered: 5/30/90 min after 1/2/3 losses (proposal: raise tier 1 from 5 to 15 min).

daily_loss_cap

Realized P&L vs balance × limit %.

daily_loss_cap_per_asset

Per-asset daily loss limit.

duplicate_detection

Same asset + side within 60 s. Fixed a paper-blind bug: the query now unions bridge_orders + trades.

volatility_filter

ATR shift vs ATR at signal emit.

stale_market_data / stale_signal

Candle age + adverse-move-since-emit checks.

direction_bias_breaker (v2)

Thompson-sampling veto when an arm goes cold.
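The ordered, short-circuiting evaluation described at the top of this section can be sketched as below. This is an illustration under stated assumptions, not the contents of lib/risk/automation-gates.ts: the Candidate fields, the GateEvent shape, and the treatment of four or more losses as the 90-minute tier are all hypothetical. Only two of the gates are shown.

```typescript
interface GateVerdict {
  pass: boolean;
  reason?: string;
}

// Hypothetical candidate shape, reduced to what the two sample gates need.
interface Candidate {
  killSwitchActive: boolean;
  consecutiveLosses: number;
  minutesSinceLastLoss: number;
}

interface Gate {
  name: string;
  check: (c: Candidate) => GateVerdict;
}

// One entry per evaluated gate, as it might appear in gates_evaluated[].
interface GateEvent {
  gate: string;
  passed: boolean;
  latencyMs: number;
  reason?: string;
}

// Tiers 5/30/90 min after 1/2/3 losses; 3+ losses is assumed to stay at 90.
function postLossCooldownMinutes(losses: number): number {
  if (losses <= 0) return 0;
  if (losses === 1) return 5;
  if (losses === 2) return 30;
  return 90;
}

const SAMPLE_GATES: Gate[] = [
  {
    name: "kill_switch_active",
    check: (c) => (c.killSwitchActive ? { pass: false, reason: "kill switch on" } : { pass: true }),
  },
  {
    name: "post_loss_cooldown",
    check: (c) =>
      c.minutesSinceLastLoss < postLossCooldownMinutes(c.consecutiveLosses)
        ? { pass: false, reason: "cooling down after loss" }
        : { pass: true },
  },
];

// Evaluate gates in order; stop at the first reject, but always return
// the audit trail of every gate that actually ran.
function runGates(
  candidate: Candidate,
  gates: Gate[] = SAMPLE_GATES,
): { accepted: boolean; gatesEvaluated: GateEvent[] } {
  const gatesEvaluated: GateEvent[] = [];
  for (const gate of gates) {
    const started = Date.now();
    const verdict = gate.check(candidate);
    const event: GateEvent = { gate: gate.name, passed: verdict.pass, latencyMs: Date.now() - started };
    if (verdict.reason !== undefined) event.reason = verdict.reason;
    gatesEvaluated.push(event);
    if (!verdict.pass) return { accepted: false, gatesEvaluated };
  }
  return { accepted: true, gatesEvaluated };
}
```

Because the audit trail is returned on both paths, a rejected candidate and an accepted one produce the same kind of decision row, which is what makes the rejects analyzable after the fact.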

Live metrics — what the production data actually says

7-day realized P&L (snapshot 2026-05-11)

Slice	Closed	WR	Net P&L	Avg / trade	Profit factor
All rails	877	43.4 %	−$30,067.86	−$34.28	0.59
v1	689	40.9 %	−$35,617.90	−$51.70	0.41
v2	188	52.7 %	+$5,550.04	+$29.52	1.46

30-day realized P&L

Slice	Closed	WR	Net P&L	Avg / trade	Profit factor
All rails	1,967	39.7 %	−$45,336.43	−$23.05	0.64
v1	1,779	38.3 %	−$50,886.47	−$28.60	0.56
v2	188	52.7 %	+$5,550.04	+$29.52	1.46

The realized headline is harsh and the floating book makes it harsher: 329 active positions across the 30-day slice with stored floating P&L of −$135,349. Any funded-style equity curve must include unrealized P&L — the realized line alone is misleading.
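The table columns can be reproduced from a list of closed-trade P&Ls as sketched below. The function names are hypothetical; profit factor uses the standard definition (gross profit divided by gross loss), and counting breakeven trades as non-wins is an assumption of this sketch.

```typescript
interface SliceStats {
  closed: number;
  winRate: number;      // fraction in [0, 1]
  netPnl: number;
  avgPerTrade: number;
  profitFactor: number; // gross profit / gross loss
}

// Summarize one slice (e.g. a rail over 7 d) from per-trade realized P&L.
function summarizeSlice(pnls: number[]): SliceStats {
  let grossProfit = 0;
  let grossLoss = 0;
  let wins = 0;
  for (const pnl of pnls) {
    if (pnl > 0) {
      grossProfit += pnl;
      wins += 1;
    } else {
      grossLoss += -pnl; // breakeven trades count as non-wins (assumption)
    }
  }
  const closed = pnls.length;
  const netPnl = grossProfit - grossLoss;
  return {
    closed,
    winRate: closed === 0 ? 0 : wins / closed,
    netPnl,
    avgPerTrade: closed === 0 ? 0 : netPnl / closed,
    profitFactor: grossLoss === 0 ? Infinity : grossProfit / grossLoss,
  };
}

// Equity-style P&L: realized plus the floating P&L of still-open positions.
function equityPnl(realizedPnl: number, floatingPnl: number): number {
  return realizedPnl + floatingPnl;
}
```

If the 30-day realized figure and the floating book cover the same cohort, equityPnl(-45336.43, -135349) comes to roughly −$180,685, about four times the realized loss alone.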

Per-asset edge — last 60 days (raw signal level)

Only positive expectancy

XAUUSD

25 % WR on n=13 closed signals · 3 of top-5 winners · LIVE_EMISSION_WHITELIST target

Every other asset class

−1.6k avg pips

26.5 % blended WR · 79 % expiry rate · classical TA dragging through costs

This is the diagnosis we need pressure-tested. The roadmap (Section 7) restricts live emission to XAUUSD only while everything else moves to shadow-only, then rebuilds the classifier on labels rather than intuition.
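A 25 % win rate with positive expectancy is only possible when winners are much larger than losers. The arithmetic, before costs and in units of R (the amount risked per trade), can be sketched as follows; the function names are hypothetical.

```typescript
// Expectancy per trade in R, before costs:
// winRate * payoffRatio (average win in R) minus the loss probability.
function expectancyR(winRate: number, payoffRatio: number): number {
  return winRate * payoffRatio - (1 - winRate);
}

// Smallest average win/loss ratio at which a given win rate breaks even.
function breakEvenPayoffRatio(winRate: number): number {
  if (winRate <= 0 || winRate > 1) throw new RangeError("winRate must be in (0, 1]");
  return (1 - winRate) / winRate;
}

// Smallest win rate at which a given payoff ratio breaks even.
function breakEvenWinRate(payoffRatio: number): number {
  if (payoffRatio <= 0) throw new RangeError("payoffRatio must be positive");
  return 1 / (1 + payoffRatio);
}
```

At a 25 % win rate the break-even payoff ratio is 3, so the XAUUSD result depends on winners averaging more than three times the losers. With n=13, that means a handful of trades carry the whole result.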

Cron map — what runs, and how often

29 scheduled jobs, all UTC. The two that matter most: generate-signals (the engine) and automation-tick-v2 (the agent).

Path	Schedule	Purpose
generate-signals	*/2 * * * *	Scan + emit new signals
update-signals	*/2 * * * *	Price ticks, TP/SL transitions
automation-tick	*/2 * * * *	v1 agent dispatch
automation-tick-v2	* * * * *	v2 agent dispatch
automation-position-tick	*/2 * * * *	Mark-to-market + exits
engine-watchdog	*/5 * * * *	Engine-silence detector
broker-heartbeat	*/5 * * * *	MT5 EA liveness check
reconcile-trades	*/5 * * * *	Bridge vs trades reconciliation

Path	Schedule	Purpose
engine-analyst	10 0 * * *	Daily Opus 4.7 review
engine-auto-tune	30 0 * * *	Bounded parameter delta
engine-critic	0 */6 * * *	Second-LLM critic
calibrate-confidence	0 4 1 * *	Monthly isotonic re-fit
refit-session-policy	0 22 * * 0	Sunday session-quality refit
refit-payoff-distributions	0 23 * * 0	Sunday TP-distribution refit
agent-v2-promotion-eval	0 6 * * *	v2 paper→live gate eval
detect-anomalies	*/5 * * * *	Drift & tail-event sniffer

The honest postmortem — three incidents, three lessons

2026-05-04 · catastrophic spiral

Trade volume 76× normal · 9 % WR

Trade count rose from 5/day on Apr 28 to 380/day on May 4. 974 closed trades, −$52,483 on an $8,150 paper account. Hour 06:00 UTC alone: 12 trades, −$39,346.

Root cause: a loss spiral on broad-match rules, with no post-loss throttle that worked in paper. 96 % of trades opened within 30 min of a loss also lost (823 of them within 5 min).

Fixed: per-asset daily-loss cap; gate-cap on max-decisions-per-cron-tick; engine-silence watchdog (engine silence was, paradoxically, also caused by this episode).

2026-05-11 · alextfx accounting artifact

+$17.9k → +$323 after normalization

BNBUSD & SUIUSD were absent from the tracker contract-size spec table → fell through to the FX default of 100,000. Eight BNB trades displayed +$17,759 but the real economic gain was +$0.18.

Real edge on that day: small clusters in GBPAUD, EURCHF, XAGUSD, US30. The crypto outlier was an accounting bug, not a strategy.

Fixed: crypto symbols seeded into pnl.ts (BNBUSD pip 0.1, SUIUSD pip 0.0001, …). Future-symbol guard rejects unknown crypto from automation until spec exists.
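The failure mode and the guard can be sketched as below. This is an illustration, not the code in lib/services/tracker/pnl.ts: the contract sizes for the crypto symbols, the prices in the usage note, and the function names are hypothetical, and the quote currency is assumed to be USD so no conversion is needed.

```typescript
const FX_CONTRACT_SIZE = 100000;

// Hypothetical spec table. The crypto sizes are illustrative only;
// the brief does not state the real values.
const CONTRACT_SIZES = new Map<string, number>([
  ["EURUSD", FX_CONTRACT_SIZE],
  ["BNBUSD", 1],
  ["SUIUSD", 1],
]);

// Fail loudly on an unknown symbol instead of falling through to the
// FX default, which is what inflated the crypto P&L.
function contractSizeFor(symbol: string): number {
  const size = CONTRACT_SIZES.get(symbol);
  if (size === undefined) {
    throw new Error(`No contract-size spec for ${symbol}; refusing to price it`);
  }
  return size;
}

// P&L in quote currency (assumed USD here).
function tradePnl(
  symbol: string,
  side: "buy" | "sell",
  lots: number,
  entry: number,
  exit: number,
): number {
  const direction = side === "buy" ? 1 : -1;
  return direction * (exit - entry) * lots * contractSizeFor(symbol);
}
```

With made-up prices, a 0.01-lot BNBUSD buy from 600 to 618 is worth about $0.18 at a contract size of 1, but would display as $18,000 at the FX default of 100,000. That is the same order of inflation as the +$17,759 versus +$0.18 gap above.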

2026-05-13 · XAUUSD over-trading

8 same-side trades in 14 min · paper-blind dedup

130 XAU trades / 14 d across two test users · 45.3 % WR · −$2,056. Both rules had empty match_dsl so "High Confidence Only" wasn't enforcing confidence at all.

Root cause: (1) empty match_dsl on broad-preset rules; (2) v1+v2 cron lanes double-firing the same signal pre-coexistence patch; (3) duplicate-detection gate queried only bridge_orders, so paper trades never tripped it.

Fixed: dedup gate now unions bridge_orders + trades. v1/v2 partition tightened. Still open: server-side default min_confidence=70 on empty match_dsl, and dollar-risk cap (not lot cap) per trade.
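Both the dedup fix and the still-open default can be sketched in a few lines. The row shape, the match_dsl shape, and the function names are hypothetical; the 60 s window and the proposed default of 70 come from this brief.

```typescript
// Hypothetical common shape for rows from bridge_orders and trades.
interface OrderRow {
  userId: string;
  asset: string;
  side: "buy" | "sell";
  openedAt: number; // epoch milliseconds
}

const DEDUP_WINDOW_MS = 60000; // "same asset + side within 60 s"

// Duplicate check over the union of live bridge orders and paper trades.
// Checking only bridgeOrders is the paper-blind bug described above.
function isDuplicate(
  candidate: OrderRow,
  bridgeOrders: OrderRow[],
  paperTrades: OrderRow[],
  windowMs: number = DEDUP_WINDOW_MS,
): boolean {
  const recent = bridgeOrders.concat(paperTrades);
  return recent.some(
    (row) =>
      row.userId === candidate.userId &&
      row.asset === candidate.asset &&
      row.side === candidate.side &&
      Math.abs(candidate.openedAt - row.openedAt) <= windowMs,
  );
}

// Hypothetical match_dsl shape: only the field this sketch needs.
interface MatchDsl {
  min_confidence?: number;
}

const DEFAULT_MIN_CONFIDENCE = 70; // proposed server-side default, still open

// An empty or missing match_dsl falls back to the default floor
// instead of matching every signal.
function effectiveMinConfidence(matchDsl: MatchDsl | null | undefined): number {
  const configured = matchDsl?.min_confidence;
  return configured === undefined ? DEFAULT_MIN_CONFIDENCE : configured;
}
```

Both functions are instances of the same idea: the safe behaviour is the default, and the permissive behaviour has to be requested explicitly.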

The pattern across all three: the engineering scaffolding is genuinely modern, but a one-line oversight (missing spec / empty DSL / wrong table union) produces an outsized symptom precisely because the surface area is large. The lesson is to invest in invariants — guarantees that hold across the whole surface — not heroic per-bug fixes.

The 12-week engine roadmap

Sequenced for reversibility. Each phase ships behind an env flag or model-version pointer; each phase can be reverted by the critic if shadow loses to live. No flag-day cutover at any point.

PHASE 1

Stop the bleeding

week 1

LIVE_EMISSION_WHITELIST=XAUUSD; per-asset caps weighted by WR; engine-heartbeat alert if no signals for > 4 h during market hours; calibrated WR fed into analyst prompt.

PHASE 2

Learned classifier

week 2–4

Replace the 10-factor vote with XGBoost on the same features + class one-hot + session + minute-of-day cyclical. Walk-forward CV. Promote only via shadow win on net pips, WR, drawdown with bootstrap p<0.05.

PHASE 3

Predictive features

week 5–10

One at a time: news embeddings → CFTC COT positioning (weekly) → options skew (FX risk reversals / VIX / Deribit) → ETF + on-chain flows. Each must lift shadow AUC by ≥ 0.01 and expectancy by ≥ 30 pips/signal within 14 d.

PHASE 4

Calibrated sizing

week 11–12

Half-Kelly with realized-vol scaling. Correlation-aware portfolio VaR budget at the dispatch gate. Goal: Sharpe +0.3, max DD −25 % vs fixed-lot baseline.

PHASE 5

Hardening

parallel

Public /admin/engine/dashboard; drift alerts on WR / calibration / regime; weekly auto-retrain with shadow → critic → promotion. Mean-time-to-detect: 8 d → 1 h.
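The Phase 4 sizing rule can be sketched as follows. The Kelly formula is standard; the specific volatility scaling (shrink risk when realized volatility exceeds a target, never scale up), the hard cap, and all names are assumptions of this sketch rather than a committed design.

```typescript
// Kelly fraction for a binary bet: f* = p - (1 - p) / b,
// where p is the win rate and b the average win/loss ratio.
function kellyFraction(winRate: number, payoffRatio: number): number {
  if (payoffRatio <= 0) throw new RangeError("payoffRatio must be positive");
  return winRate - (1 - winRate) / payoffRatio;
}

// Fraction of equity to risk on one trade.
// Assumptions: half-Kelly, scaled down when realized vol exceeds target vol,
// floored at 0 for negative-edge inputs, and capped at maxFraction.
function halfKellyRiskFraction(
  winRate: number,
  payoffRatio: number,
  realizedVol: number,
  targetVol: number,
  maxFraction: number = 0.02,
): number {
  const halfKelly = 0.5 * Math.max(0, kellyFraction(winRate, payoffRatio));
  const volScale = realizedVol > 0 ? Math.min(1, targetVol / realizedVol) : 0;
  return Math.min(maxFraction, halfKelly * volScale);
}
```

A calibrated 65 % win rate at 1:2 R:R gives a full-Kelly fraction of 0.475, far above any sane per-trade risk, so in practice the cap binds. A negative-edge input such as 25 % at 1:2 sizes to zero, which is why the win-rate estimate fed into this function has to be the calibrated one.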

Out-of-scope (deliberately)

Deep-learning sequence models on raw price. Tabular XGBoost is competitive and 100× cheaper to interpret. Skip until classical ML is exhausted.
Tick-data / order-book features. Requires a paid feed + a real-time pipeline. Maybe phase 7+.
RL for execution. Poor sample efficiency on non-stationary data; Kelly + heuristic execution covers 80 %.
Custom matching engine. We're a signal product, not a venue. Don't blur the line.

Funded-trading pivot — conditional go, not yet

We have a private-preview /challenges page (10k / 25k / 50k / 100k / 200k tiers, 8 % target, 5 % daily / 10 % max drawdown, 80–90 % payout split). We are not launching it publicly. Two reasons.

Why not yet — product

Paper engine is not yet broker-grade — no quote conversion, no swap accrual on paper close, no stop-loss slippage, no spread expansion on news, no margin-call model.
Active floating book sits at −$135k on the test cohort. A funded engine must score realized and unrealized together.
Loss-after-loss is too aggressive: a 45-loss streak was observed in the 7-d slice. Challenge-grade products need hard throttles.

Why not yet — regulatory

CFTC enforcement is active in this exact category (Traders Global / My Forex Funds complaint focused on "live funds" misrepresentation).
NFA registration thresholds (CTA / CPO / FCM / IB / RFED / AP) depend on actual business activity; these need counsel review before any public launch.
FTMO's own terms explicitly frame Challenge accounts as simulated — not real-instrument trading.

The v1 model we'd ship (after gates)

Do

Simulated capital only · no user funds pooled · no custody.
Rewards = contractually defined, simulated-performance based.
Markitel does not execute trades on user behalf; the challenge is a discipline-and-skill layer on top of Markitel intelligence.
Audited methodology + published payout math.

Don't

No real-funded accounts, no hedged prop book, no pooled trading capital.
No unverified trader-count / payout / pass-rate claims.
No marketing that mixes "Markitel executes no trades" with "funded with firm capital".

Open questions — where we want your hand on the wheel

Predictive edge. Is multi-indicator TA on liquid markets ever an edge over a 14 d / 60 d window with realistic costs? Or is the only honest path to add features the literature actually validates (COT, options skew, flows, microstructure)?
Sample size for promotion. We use bootstrap p<0.05 paired against live on net pips + WR + max DD over 14 d. Is that enough for a flag flip, or do you want 30 d / per-regime stratification?
Calibrated 65 % WR at 1:2 R:R — realistic stretch, fantasy, or wrong target entirely? Should we be optimizing expectancy at lower R:R, or selectivity at higher R:R?
Gating vs pricing edge. Most of our recent wins came from not trading the bad slices (London, broad-match, missing specs). How much further can gating take us before the engine itself has to improve?
Funded layer. Is a simulated-capital discipline product on top of a signal engine a defensible business — or does it inherit FTMO's regulatory tail without the brand equity? Where would you draw the line?
Auto-tune autonomy. The analyst→critic→walk-forward→tuner loop runs daily. Should the bound on a single day's parameter delta stay at 2 points, tighten, or vary by regime?
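For the sample-size question, the promotion test can be sketched roughly as below so the discussion has something concrete to point at. This is a simplified one-sided percentile bootstrap, not the code in lib/services/signals/stats.ts. Assumed here: the pairing of shadow and live results per signal or per day, the requirement that every metric pass the Bonferroni-adjusted threshold, and all names.

```typescript
type Rng = () => number; // uniform in [0, 1)

// One-sided paired bootstrap. diffs[i] is (shadow - live) for the same
// signal or day, oriented so that positive means shadow is better.
// Returns the share of resamples in which shadow fails to beat live.
function pairedBootstrapPValue(
  diffs: number[],
  resamples: number = 10000,
  rng: Rng = Math.random,
): number {
  if (diffs.length === 0) throw new Error("need at least one paired difference");
  let notBetter = 0;
  for (let r = 0; r < resamples; r++) {
    let sum = 0;
    for (let i = 0; i < diffs.length; i++) {
      const idx = Math.floor(rng() * diffs.length);
      sum += diffs[idx] ?? 0; // resample with replacement
    }
    if (sum / diffs.length <= 0) notBetter++;
  }
  return notBetter / resamples;
}

// Bonferroni: with m metrics tested, each must clear alpha / m.
// Assumption: promotion requires every metric to pass.
function passesBonferroni(pValues: number[], alpha: number = 0.05): boolean {
  if (pValues.length === 0) return false;
  const threshold = alpha / pValues.length;
  return pValues.every((p) => p < threshold);
}
```

With three metrics (net pips, WR, max DD) the per-metric threshold drops from 0.05 to about 0.0167. Over a 14-day window the number of paired observations may simply be too small to reach that, which is one concrete way to frame the 14 d versus 30 d question.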

Reference index — files you can pull up

Signal engine

lib/signals/confluence-engine.ts
lib/signals/live-signal-engine.ts
lib/signals/shadow-variants.ts
lib/signals/technical-analysis.ts
lib/services/signals/engine-analyst.ts
lib/services/signals/engine-tuner.ts
lib/services/signals/calibration.ts
lib/services/signals/walk-forward.ts
lib/services/signals/sanity-check.ts
lib/services/signals/stats.ts

Automation & risk

lib/risk/automation-gates.ts
lib/risk/user-trade-gates.ts
lib/automation/match-evaluator.ts
lib/automation/v2/dispatch.ts
lib/automation/v2/ranker.ts · regime.ts · bandit.ts
lib/automation/paper-fill.ts
lib/automation/paper-cost-model.ts
lib/automation/floating-pnl.ts
lib/services/orders/execute.ts
lib/services/tracker/pnl.ts

Scheduling

vercel.json (29 crons)
app/api/cron/generate-signals/route.ts
app/api/cron/automation-tick-v2/route.ts
app/api/cron/engine-analyst/route.ts
app/api/cron/engine-auto-tune/route.ts
app/api/cron/engine-watchdog/route.ts
app/api/cron/agent-v2-promotion-eval/route.ts

Reading material

docs/SIGNAL_ENGINE_ROADMAP.md
docs/markitel-paper-agent-funded-decision-memo-2026-05-11.md
docs/agent-performance-alextfx-2026-05-12.md
docs/agent-performance-xauusd-analysis-2026-05-13.md
docs/agent-risk-audit-2026-05-04.md
docs/agent-v2-handoff-2026-05-08.md

Prepared by alextfx · 2026-05-14 · For external trading-signals advisor · Scope show-and-tell briefing, no code or decisions in this document.

All numerics are pulled from production Supabase as of 2026-05-11 (decision memo snapshot) and verified against the analysis scripts in scripts/analyze-agent-performance.js and scripts/probe-trades.ts. File references are live paths in the working tree. This brief is intentionally short on opinion and long on substance — we want the advisor's judgment, not our own re-stated.