Market Maker & Liquidity Provider Evaluation Framework
A structured, evidence-based framework that answers the critical questions of liquidity readiness and market maker accountability — grounded in real-time orderbook data from the Indonesian crypto market.
// The decision exchange operators ignore
One of the least-discussed operational decisions a newly launched crypto exchange faces is knowing when to engage a dedicated market maker — and once engaged, how to hold them accountable. The answer is neither obvious nor static. Engage too early and you’re paying for liquidity your volume doesn’t justify. Engage too late and thin books repel the very traders you need to grow. Evaluate them poorly and SLA violations go undetected until the damage is already done.
This project builds the decision framework I would have wanted as an exchange operator: a tool that answers both questions — readiness and evaluation — using a three-mode classification system and five core liquidity diagnostics, applied to real orderbook data from OJK-regulated Indonesian exchanges.
// The Indonesian liquidity landscape
In Indonesia, OJK-regulated venues (Pedagang Aset Keuangan Digital) operate under specific constraints around pair availability, IDR settlement, and counterparty structure. This creates a fragmented liquidity landscape where global benchmarks often fail to capture the local reality — a venue may appear functional by volume metrics while being structurally illiquid for any order above retail size.
The framework targets this gap directly. Data is collected by a continuous polling pipeline running on an Oracle Cloud VM, which captures a snapshot of the top 20 bid and ask levels every 60 seconds. The 60-second cadence was chosen deliberately: it captures the persistent state of the book (the liquidity that actually exists for a typical retail or small institutional order) without the noise of sub-second quote flickering, which would require co-located infrastructure to measure meaningfully. Flat CSV storage was chosen over a time-series database to prioritize reproducibility; anyone with Python and the raw files can verify every output independently.
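As a rough illustration of the flat-CSV design, the sketch below flattens one snapshot into a fixed-width row. The column layout and the names `csv_header`, `snapshot_to_row`, and `append_snapshot` are assumptions made for this example, not the project's actual schema, and the network fetch is deliberately left out.

```python
import csv
import io
from datetime import datetime, timezone

LEVELS = 20  # top-of-book levels per side, matching the description above


def csv_header(levels=LEVELS):
    """Column names: timestamp, then price/quantity for each bid and ask level."""
    cols = ["ts_utc"]
    for side in ("bid", "ask"):
        for i in range(1, levels + 1):
            cols += [f"{side}{i}_px", f"{side}{i}_qty"]
    return cols


def snapshot_to_row(ts, bids, asks, levels=LEVELS):
    """Flatten one snapshot into a row of constant width.

    bids/asks are lists of (price, quantity), best level first.
    Missing levels are written as empty strings so every row lines up
    with the header, even when the book is thin.
    """
    row = [ts.isoformat()]
    for side in (bids, asks):
        kept = list(side[:levels])
        padded = kept + [("", "")] * (levels - len(kept))
        for px, qty in padded:
            row += [px, qty]
    return row


def append_snapshot(fh, ts, bids, asks, write_header=False):
    """Append one snapshot (and optionally the header) to an open CSV handle."""
    writer = csv.writer(fh)
    if write_header:
        writer.writerow(csv_header())
    writer.writerow(snapshot_to_row(ts, bids, asks))


# Usage with made-up prices, writing to an in-memory buffer instead of a file.
buf = io.StringIO()
ts = datetime(2026, 3, 11, 5, 41, tzinfo=timezone.utc)
append_snapshot(buf, ts, bids=[(100.0, 1.0)], asks=[(101.0, 2.0)], write_header=True)
```

Padding thin books to a constant width keeps every file readable with a single header, which is what makes independent re-verification from the raw files straightforward.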
These snapshots are processed into five diagnostics, each chosen because it maps directly to an operational decision rather than an academic metric:
1. Bid-ask spread. Time-weighted average as a % of mid-price. This is the primary cost-of-trading signal — the minimum round-trip cost a trader pays regardless of order size.
2. Orderbook depth. Measured at 0.5%, 1.0%, and 2.0% price thresholds. It quantifies exactly how much capital the book can absorb before significant price movement. Depth at 1% is the primary SLA anchor because it represents the realistic execution window for retail-to-small-institutional flow.
3. Quote presence ratio (QPR). The fraction of snapshots where a valid two-sided quote exists within the target spread. Unlike fill rate — which requires trade data the exchange controls — QPR is independently measurable from public orderbook data, making it the most reliable external MM performance signal available to a third-party evaluator.
4. Price impact / slippage. The cost of walking the book for standardized trade sizes: $10K, $50K, $100K, and $250K. This translates abstract depth into a functional execution cost, and exposes book exhaustion points that raw depth figures obscure.
5. Spread resilience. The recovery speed following volatility events, flagged via automated high-volatility alerts. An MM that widens or disappears during stress is operationally worse than no MM at all — it creates false confidence in normal conditions.
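The first three diagnostics can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the `Snapshot` type and function names are hypothetical, depth is summed across both sides of the book, and the price thresholds are measured from mid-price.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    ts: float    # seconds since an arbitrary epoch
    bids: list   # [(price, qty)], best (highest) bid first
    asks: list   # [(price, qty)], best (lowest) ask first


def mid_price(s):
    return (s.bids[0][0] + s.asks[0][0]) / 2


def spread_bps(s):
    """Quoted spread in basis points of mid-price (1 bp = 0.01%)."""
    return (s.asks[0][0] - s.bids[0][0]) / mid_price(s) * 10_000


def time_weighted_spread_bps(snaps):
    """Weight each snapshot's spread by the time until the next snapshot."""
    num = den = 0.0
    for cur, nxt in zip(snaps, snaps[1:]):
        if not cur.bids or not cur.asks:
            continue  # one-sided book: no spread is defined for this interval
        dt = nxt.ts - cur.ts
        num += spread_bps(cur) * dt
        den += dt
    return num / den if den else float("nan")


def depth_within(s, pct):
    """Quote-currency notional resting within +/- pct percent of mid (both sides)."""
    mid = mid_price(s)
    lo, hi = mid * (1 - pct / 100), mid * (1 + pct / 100)
    bid_notional = sum(p * q for p, q in s.bids if p >= lo)
    ask_notional = sum(p * q for p, q in s.asks if p <= hi)
    return bid_notional + ask_notional


def quote_presence_ratio(snaps, target_bps):
    """Fraction of snapshots with a two-sided quote no wider than target_bps."""
    ok = sum(1 for s in snaps if s.bids and s.asks and spread_bps(s) <= target_bps)
    return ok / len(snaps) if snaps else float("nan")
```

Note that a one-sided snapshot counts against QPR but is skipped in the time-weighted spread, since no spread exists to average over that interval.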
// System Metrics (March 2026 Baseline)
The cross-exchange comparison reveals a sharp liquidity gap within the regulated market. The 54-hour observation window (March 11–13, 2026) captured both normal and mildly stressed market conditions, with one high-volatility event flagged on March 12.
| Metric (BTC/IDR) | Tokocrypto | Upbit ID | Gap / Differential |
|---|---|---|---|
| Avg. Spread (bps) | ~4.2 | ~17.1 | 4x Spread Gap |
| Depth within 1% (IDR) | ~820M | ~82M | 10x Depth Gap |
| $10K Slippage | 0.08% | EXHAUSTED | Non-Executable |
| Resilience Score | High | Low | Structural Deviation |
Baseline derived from observation period: 2026-03-11 05:41 to 2026-03-13 12:01 UTC.
The numbers tell a clear operational story. Tokocrypto’s 4.2 bps spread and 820M IDR depth (roughly $51K) at the 1% threshold indicate an actively managed book; these figures are consistent with dedicated MM participation keeping quotes tight. Upbit ID’s 17.1 bps spread and 82M IDR depth (roughly $5K) tell the opposite story: a passive, organic-only book where the cost of trading is structurally about 4x higher and where even a $10K order exhausts available liquidity.
The depth heatmap reinforces this finding visually. Tokocrypto’s book shows dense, consistent concentration around mid-price across the full observation window. Upbit ID shows a fragmented, checkerboard pattern — intermittent quote presence with notable asymmetry on the ask side, potentially indicating structural imbalance in book composition, though snapshot frequency may partially account for this observation.
Critically, both exchanges exhaust their books above $50K order size. This is the framework’s most operationally significant finding: even the stronger of the two venues cannot support institutional flow without dedicated liquidity infrastructure.
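Book exhaustion of this kind falls out of a simple walk-the-book calculation. The sketch below is a hypothetical illustration (the names `walk_book` and `buy_slippage_pct` are not from the project); it assumes the order notional has already been converted into the book's quote currency, so a USD trade size would first need an IDR/USD rate that the source does not specify.

```python
def walk_book(levels, notional):
    """Fill `notional` (quote currency) against one side of the book.

    levels: [(price, qty)], best price first.
    Returns (avg_fill_price, filled_notional, exhausted).
    """
    remaining = notional
    base_filled = 0.0
    for price, qty in levels:
        take = min(remaining, price * qty)  # notional taken from this level
        base_filled += take / price
        remaining -= take
        if remaining <= 0:
            break
    filled = notional - remaining
    avg = filled / base_filled if base_filled else float("nan")
    return avg, filled, remaining > 0


def buy_slippage_pct(asks, bids, notional):
    """Slippage of a market buy versus mid-price, in percent.

    Returns None when the visible asks cannot absorb the full order,
    which corresponds to the EXHAUSTED entries in the table above.
    """
    mid = (bids[0][0] + asks[0][0]) / 2
    avg, _, exhausted = walk_book(asks, notional)
    if exhausted:
        return None
    return (avg - mid) / mid * 100
```

Returning `None` rather than a large number keeps "expensive" and "not executable" distinct, which matters when the result feeds an SLA check.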
// How it works: The Three-Mode Classification
The core of the framework is the LiquidityReadinessAssessor, which classifies an exchange’s liquidity into three modes based on the observed diagnostics. The classification is not a score — it is an operational recommendation that determines what kind of MM engagement, if any, is warranted.
Mode 1: Passive. Standard orderbook presence without active quote management. Books are thin, spreads are elevated, and depth is structurally asymmetric. The exchange is relying entirely on organic flow to populate the book. MM engagement at this stage would deliver immediate, measurable improvement in retail trade execution — tighter spreads and deeper books are achievable with even a single active provider. (Observed: Upbit ID)
Mode 2: Integrated. Active LP participation, with quotes that stay within a tracking-error tolerance of global reference prices such as Binance. The book is healthy enough for retail flow and moderate order sizes, but vulnerable under stress and unable to support institutional blocks without exhaustion. The priority at this stage shifts from whether to engage an MM to how to structure the SLA: minimum depth floors, spread band obligations, and recovery time requirements become the negotiating levers. (Observed: Tokocrypto)
Mode 3: Institutional. High-resilience depth with guaranteed fill rates at >$50K notional. This requires dedicated liquidity infrastructure, typically involving multiple competing MMs with overlapping obligations. Neither exchange in this dataset has reached this threshold, which defines the ceiling that the Indonesian market has not yet cleared.
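One way the three-mode decision could be expressed in code is shown below. This is a sketch only: the actual logic inside LiquidityReadinessAssessor is not described here, so `classify_mode`, the `Diagnostics` fields, and every cut-off in `Thresholds` (other than the >$50K line named above) are hypothetical placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    PASSIVE = 1
    INTEGRATED = 2
    INSTITUTIONAL = 3


@dataclass(frozen=True)
class Diagnostics:
    avg_spread_bps: float
    largest_fill_usd: float  # largest standardized size filled without exhaustion
    resilient: bool          # True if spreads recover after volatility events


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical cut-offs chosen for illustration only.
    integrated_max_spread_bps: float = 10.0
    integrated_min_fill_usd: float = 10_000
    institutional_fill_usd: float = 50_000  # Mode 3 requires fills above this


def classify_mode(d, t=Thresholds()):
    """Map observed diagnostics to an operational mode."""
    actively_quoted = (
        d.avg_spread_bps <= t.integrated_max_spread_bps
        and d.largest_fill_usd >= t.integrated_min_fill_usd
    )
    if not actively_quoted:
        return Mode.PASSIVE
    if d.resilient and d.largest_fill_usd > t.institutional_fill_usd:
        return Mode.INSTITUTIONAL
    return Mode.INTEGRATED


# Inputs taken from the baseline table: spread, the $10K slippage row, resilience.
tokocrypto = Diagnostics(avg_spread_bps=4.2, largest_fill_usd=10_000, resilient=True)
upbit_id = Diagnostics(avg_spread_bps=17.1, largest_fill_usd=0, resilient=False)
```

Under these placeholder thresholds the table's figures reproduce the observed designations: Tokocrypto lands in Mode 2 and Upbit ID in Mode 1.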
// Building for Accountability (SLA Design)
The framework is not just for observation — it is designed for audit. Beyond diagnostics, it includes an MM evaluation rubric that an exchange operator would use to structure a new liquidity agreement or audit an existing provider against contractual obligations.
Key SLA components include:
- Minimum QPR thresholds: 95%+ uptime during normal market conditions, with defined degradation allowances during flagged volatility events.
- Maximum spread bands: Configurable by volatility regime — tighter obligations in calm markets, wider tolerance during stress, with hard ceilings regardless of conditions.
- Depth floor requirements: Guaranteed minimum liquidity at the 0.5% and 1.0% thresholds, expressed in IDR notional rather than BTC units to avoid price-level gaming.
- Recovery time obligations: Maximum allowable downtime following a circuit-breaker event, measurable directly from QPR data.
The LiquidityReadinessAssessor allows operators to model different contractual structures and stress-test these thresholds against observed market behavior before committing to a liquidity agreement — turning the framework from a diagnostic tool into a negotiation instrument.
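Stress-testing a proposed contract against observed data can be as simple as replaying the samples through the SLA terms. The sketch below is an assumption-laden example rather than the project's rubric: `SlaTerms`, `recovery_seconds`, and `audit` are hypothetical names, only the 95% QPR figure comes from the list above, and the other limits must be supplied by the operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaTerms:
    max_spread_bps: float       # hard spread ceiling
    min_depth_1pct_idr: float   # depth floor at the 1.0% threshold, IDR notional
    max_recovery_s: float       # allowed time to re-quote after a flagged event
    min_qpr: float = 0.95       # 95%+ quote presence in normal conditions


def recovery_seconds(samples, event_ts, max_spread_bps):
    """Seconds from event_ts to the first in-band two-sided quote, or None."""
    for ts, spread, _depth in samples:
        if ts >= event_ts and spread is not None and spread <= max_spread_bps:
            return ts - event_ts
    return None


def audit(terms, samples, event_ts=None):
    """Check each SLA component against observed samples.

    samples: [(ts_seconds, spread_bps or None if one-sided, depth_1pct_idr)].
    Returns a dict mapping component name to True (met) or False (violated).
    """
    in_band = [s is not None and s <= terms.max_spread_bps for _, s, _ in samples]
    report = {
        "qpr": sum(in_band) / len(in_band) >= terms.min_qpr,
        "depth_floor": min(d for _, _, d in samples) >= terms.min_depth_1pct_idr,
    }
    if event_ts is not None:
        rec = recovery_seconds(samples, event_ts, terms.max_recovery_s and terms.max_spread_bps)
        report["recovery"] = rec is not None and rec <= terms.max_recovery_s
    return report
```

Because every component is derived from public orderbook samples, the same replay works both before signing (to see whether proposed thresholds are realistic) and afterwards (to audit a provider).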
// What is not modeled
A model that knows what it cannot do is more useful than one that does not. These limitations are noted not as disclaimers but as design boundaries that inform how the findings should be used.
- High-frequency dynamics. The 60-second polling cadence captures persistent book state but misses sub-second quote flickering. This means QPR and spread figures may be slightly more favorable than what a high-frequency trader would observe — but are representative of the retail and small-institutional experience this framework targets.
- Order-flow toxicity. The model measures the state of the book, not the intent of the trades hitting it. An MM quoting tight spreads into a toxic flow environment will widen or withdraw — the resilience metric captures the result of this but not the cause.
- Cross-exchange aggregation. Smart order routing across Tokocrypto and Upbit ID simultaneously is out of scope. The 10x depth gap between them suggests aggregation would be meaningful, but modeling it requires trade-level data beyond public orderbook access.
These boundaries do not affect the core classification findings — Tokocrypto’s Mode 2 and Upbit ID’s Mode 1 designations are robust to all three limitations. They define the scope for the next version of the framework.
// Stack
Python · Pandas · NumPy · Matplotlib · Seaborn · systemd · Makefile
| Component | Detail |
|---|---|
| Data Pipeline | Concurrent poller (60s interval), VPS-deployed systemd daemon |
| Assessment Engine | LiquidityReadinessAssessor class with mode-switching logic |
| Analytics | Rolling spread, heatmaps (Seaborn), and slippage exhaustion detection |
| Environment | Isolated Python environment managed via uv |
Synthetic and empirical data baseline. Does not represent the internal state of any exchange.
Built by Gilang Fajar Wijayanto — Senior Treasury & Finance Operations