Advanced Recipes for Market Data and Strategy Management
Core idea
A live trading system is only as good as the data flowing in, the storage it writes to, the alerts that fire when something is wrong, and the audit trail that lets you reconstruct what happened. This topic fills the production-grade gaps left by the IB-API workflow with four recipes that compose orthogonally: ThetaData streams firehose-volume options data (1.4 million active contracts, 3 TB/day) at millisecond latency via a Theta Terminal intermediary; ArcticDB persists those streams in DataFrame-native form at petabyte scale; the CVaR alert thread fires email or SMS when risk metrics breach defined thresholds; execution-detail capture writes openOrder and execDetails callbacks into SQL tables that make end-of-day reconciliation a query, not an archaeology dig. Together they turn the previous topics' prototype into infrastructure.
Author's framing: Real-time data handling, careful risk management, and detailed trade records are instrumental for algorithmic traders. The recipes here build the surrounding production discipline.
Why it matters
Options data has a different scale and shape
Equities have one symbol per security; OPRA (Options Price Reporting Authority) tracks ~1.4 million active option contracts simultaneously and pushes more than 3 TB of trades and quotes per day. A normal market-data API would drown trying to deliver that. ThetaData's solution is the Theta Terminal, a local Java process that connects to ThetaData's nearest distribution server, decompresses the feed (~30:1), and exposes a streamlined interface to the Python client. Your Python code registers a callback via connect_stream(callback), then requests either the full firehose (req_full_trade_stream_opt) or a specific contract (req_trade_stream_opt(ticker, expiration, strike, OptionRight.CALL)). The same pattern composes: subscribe to the two legs of a straddle separately, store the latest Quote for each, and recompute the combined bid/mid/ask in the callback. Real-time multi-leg pricing then needs no extra machinery.
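The straddle idea can be sketched without a live feed. The snippet below is a minimal illustration, not ThetaData's API: LegQuote and StraddlePricer are hypothetical names, and LegQuote stands in for whatever quote object the stream callback actually receives. It assumes the package is priced by summing the two legs' bids and asks.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class LegQuote:
    """Hypothetical stand-in for the quote payload of one stream message."""
    bid: float
    ask: float


class StraddlePricer:
    """Keeps the latest quote per leg and prices the two-leg package."""

    def __init__(self) -> None:
        self.latest: Dict[str, LegQuote] = {}

    def on_quote(self, right: str, quote: LegQuote) -> Optional[Tuple[float, float, float]]:
        """Call from the stream callback; `right` is "CALL" or "PUT"."""
        self.latest[right] = quote
        if "CALL" not in self.latest or "PUT" not in self.latest:
            return None  # still waiting for the other leg's first quote
        call, put = self.latest["CALL"], self.latest["PUT"]
        bid = call.bid + put.bid  # proceeds from selling both legs
        ask = call.ask + put.ask  # cost of buying both legs
        mid = (bid + ask) / 2
        return bid, mid, ask


pricer = StraddlePricer()
first = pricer.on_quote("CALL", LegQuote(bid=2.0, ask=2.5))
second = pricer.on_quote("PUT", LegQuote(bid=1.5, ask=2.0))
print(first)   # None: only one leg has quoted so far
print(second)  # (3.5, 4.0, 4.5)
```

In a real callback, the `right` argument and the bid/ask values would be read from the incoming message rather than passed in by hand.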
Embedded DataFrame storage scales further than SQLite
A SQLite database is fine for thousands of bid-ask ticks per day on twenty symbols. It is not fine for billions of options trades per day. ArcticDB, built by Man Group and slated for Bloomberg's BQuant platform, is an embedded, serverless database designed for pandas-shaped data at petabyte scale. It treats every symbol as an independent storage entity (no joins, no shared schema), runs on top of LMDB locally or S3 / Azure Blob remotely, supports schema-less data and streaming ingestion, and provides bitemporal access (every historical version of every symbol). The core API is small: lib = arctic.get_library("trades") opens a library, lib.write(symbol, df) stores a DataFrame, and lib.read(symbol).data retrieves it. QueryBuilder lets you filter, group, and aggregate at the storage layer before pulling results into memory.
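A rough sketch of how those calls might be wrapped is shown below. The helper names (open_trades_library, persist_trades, read_large_trades) and the "size" column are assumptions made for illustration; the ArcticDB calls themselves (Arctic, get_library, update with upsert, read with a QueryBuilder) are used as described in the text. The arcticdb import is deferred into the functions that need it so the helpers can be imported on a machine where the package is not installed.

```python
from typing import Any


def open_trades_library(uri: str = "lmdb://arcticdb_options", name: str = "trades") -> Any:
    """Open (or create) an ArcticDB library. Requires `pip install arcticdb`."""
    import arcticdb as adb  # deferred import: only needed when actually connecting

    arctic = adb.Arctic(uri)
    return arctic.get_library(name, create_if_missing=True)


def persist_trades(lib: Any, symbol: str, frame: Any) -> None:
    """Store a batch of trades under `symbol`.

    `frame` is expected to be a pandas DataFrame with a sorted DatetimeIndex.
    With upsert=True the symbol is created on first use; on later calls the
    rows inside the frame's index range are replaced by the new data.
    """
    lib.update(symbol, frame, upsert=True)


def read_large_trades(lib: Any, symbol: str, min_size: int) -> Any:
    """Filter at the storage layer; "size" is a hypothetical column name."""
    import arcticdb as adb

    q = adb.QueryBuilder()
    q = q[q["size"] >= min_size]
    return lib.read(symbol, query_builder=q).data
```

A streaming callback would then call persist_trades(lib, "QQQ", one_row_frame) for each trade, and research code would call read_large_trades(lib, "QQQ", 100) to pull back only the rows it needs.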
Risk alerts are decoupled background watchers
Continuously monitoring the live cvar property and emailing yourself when it crosses a threshold takes ten lines of code — a watch_cvar(threshold, interval) method spawned on its own daemon thread inside IBApp.__init__. Email goes through Python's stdlib smtplib and email.message; SMS goes through Twilio's Client.messages.create. The choice of channel is left to the operator; the alerting infrastructure is one method. The pattern composes: spawn a thread per metric you want to watch (max drawdown, exposure breach, position concentration), each polling its own property and firing its own notifications.
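The watcher pattern might look roughly like the sketch below. It is written as a standalone class rather than as a method on IBApp, so CvarWatcher, build_alert, read_cvar, and notify are illustrative names, and the email addresses are placeholders. It assumes dollar CVaR is reported as a negative number, so a breach means the value has fallen below the threshold. A stop event replaces the bare while True loop so the thread can be shut down cleanly.

```python
import threading
import time
from email.message import EmailMessage
from typing import Callable, List


def build_alert(cvar: float, threshold: float) -> EmailMessage:
    """Format a breach notification; the addresses are placeholders."""
    msg = EmailMessage()
    msg["Subject"] = f"CVaR breach: {cvar:,.0f} vs limit {threshold:,.0f}"
    msg["From"] = "alerts@example.com"
    msg["To"] = "operator@example.com"
    msg.set_content(f"Dollar CVaR is {cvar:,.2f}; the limit is {threshold:,.2f}.")
    return msg


class CvarWatcher:
    """Polls a CVaR reader on a daemon thread and notifies on every breach."""

    def __init__(self, read_cvar: Callable[[], float], threshold: float,
                 interval: float, notify: Callable[[EmailMessage], None],
                 warmup: float = 60.0) -> None:
        self.read_cvar = read_cvar
        self.threshold = threshold
        self.interval = interval
        self.notify = notify
        self.warmup = warmup
        self._stop = threading.Event()
        self.thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self.thread.start()

    def stop(self) -> None:
        self._stop.set()

    def _run(self) -> None:
        # Give returns time to accumulate before the first check.
        if self._stop.wait(self.warmup):
            return
        while not self._stop.is_set():
            cvar = self.read_cvar()
            if cvar < self.threshold:  # more negative than the limit
                self.notify(build_alert(cvar, self.threshold))
            self._stop.wait(self.interval)  # sleep, but wake early on stop()


# Demo with a fixed CVaR reading and a list standing in for the mail server.
sent: List[EmailMessage] = []
watcher = CvarWatcher(read_cvar=lambda: -6200.0, threshold=-5000.0,
                      interval=0.01, notify=sent.append, warmup=0.0)
watcher.start()
time.sleep(0.2)
watcher.stop()
watcher.thread.join(timeout=1)
print(sent[0]["Subject"])
```

In production the notify argument would be a function that opens smtplib.SMTP("localhost") and calls send_message(msg), or one that calls Twilio's messages.create. Note that this sketch re-alerts on every poll while the breach persists; a real deployment would probably want a cooldown.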
Audit trails are SQL tables populated by callbacks
openOrder and execDetails from the order-callback layer (in the previous topic) carry rich object hierarchies — Contract, Order, OrderState, Execution — each with dozens of fields. Routing them to two SQLite tables (open_orders and trades) via parameterized INSERT statements gives you the substrate every end-of-day report needs: which orders were placed, when, at what price, where they routed, who got the liquidity rebate, how long until they filled. Moving the schema strings (CREATE_BID_ASK_DATA, CREATE_OPEN_ORDERS, CREATE_TRADES) into utils.py keeps app.py clean. The pattern generalizes: every callback worth caring about deserves a table.
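As a minimal sketch of that routing, the snippet below creates the two tables and inserts one row into each with parameterized statements. The column choices are an assumption, a small subset of what the callbacks expose, and record_open_order and record_execution are hypothetical helpers that the openOrder and execDetails callbacks would call with fields pulled from their Contract, Order, OrderState, and Execution arguments. The sample values are made up.

```python
import sqlite3

# Schema strings of the kind that would live in utils.py (columns are illustrative).
CREATE_OPEN_ORDERS = """
CREATE TABLE IF NOT EXISTS open_orders (
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
    order_id INTEGER,
    symbol TEXT,
    action TEXT,
    order_type TEXT,
    quantity REAL,
    status TEXT
)"""

CREATE_TRADES = """
CREATE TABLE IF NOT EXISTS trades (
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
    order_id INTEGER,
    exec_id TEXT,
    symbol TEXT,
    side TEXT,
    shares REAL,
    price REAL,
    exchange TEXT
)"""

INSERT_OPEN_ORDER = (
    "INSERT INTO open_orders (order_id, symbol, action, order_type, quantity, status) "
    "VALUES (?, ?, ?, ?, ?, ?)"
)
INSERT_TRADE = (
    "INSERT INTO trades (order_id, exec_id, symbol, side, shares, price, exchange) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)"
)


def create_tables(conn: sqlite3.Connection) -> None:
    conn.execute(CREATE_OPEN_ORDERS)
    conn.execute(CREATE_TRADES)
    conn.commit()


def record_open_order(conn, order_id, symbol, action, order_type, quantity, status) -> None:
    """Would be called from openOrder with fields of contract, order, and orderState."""
    conn.execute(INSERT_OPEN_ORDER, (order_id, symbol, action, order_type, quantity, status))
    conn.commit()


def record_execution(conn, order_id, exec_id, symbol, side, shares, price, exchange) -> None:
    """Would be called from execDetails with fields of contract and execution."""
    conn.execute(INSERT_TRADE, (order_id, exec_id, symbol, side, shares, price, exchange))
    conn.commit()


conn = sqlite3.connect(":memory:")  # the text uses a file such as strategy_1.sqlite
create_tables(conn)
record_open_order(conn, 101, "SPY", "BUY", "LMT", 100, "Submitted")
record_execution(conn, 101, "0001.01", "SPY", "BOT", 100, 471.25, "ARCA")
fill = conn.execute("SELECT order_id, side, shares, price FROM trades").fetchone()
print(fill)  # (101, 'BOT', 100.0, 471.25)
```

Binding values with ? placeholders, rather than formatting them into the SQL string, keeps quoting correct for any symbol or exchange name the callbacks deliver.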
These four pieces are independent and additive
You can adopt ThetaData without ArcticDB. You can adopt ArcticDB without alerts. You can add alerts without changing data storage. Each recipe is a slot on the trading app; production systems compose them according to which slots matter for the strategy. The composition is the point — none of this is required for the strategy to function; all of it is required for the operation to be trustworthy.
Practical application
The four recipes layer onto the existing trading-app skeleton without restructuring it.
- Wire ThetaData streaming. Install thetadata via pip and ensure Java is installed (a Theta Terminal requirement). Construct ThetaClient(username, passwd), register a callback(msg) that switches on msg.type (TRADE vs QUOTE), call client.connect_stream(callback), then client.req_full_trade_stream_opt() for the firehose or client.req_trade_stream_opt(ticker, expiration, strike, OptionRight.CALL) for a specific contract. Verify the subscription with client.verify(req_id) against StreamResponseType.SUBSCRIBED.
- Persist to ArcticDB. Install arcticdb (pip on Linux/Windows, conda on macOS). Construct arctic = adb.Arctic("lmdb://arcticdb_options") (or an s3://... URI for remote). Get a library: lib = arctic.get_library("trades", create_if_missing=True). In the streaming callback, build a one-row DataFrame from the trade message, then either lib.update(symbol, df, upsert=True) or lib.write(symbol, df) depending on whether the symbol already exists. For retrieval: lib.read("QQQ").data returns the full DataFrame; lib.read("QQQ", query_builder=q).data filters at storage with a QueryBuilder.
- Add a CVaR alert thread. Add watch_cvar(self, threshold, interval) to IBApp: a method that sleeps 60 seconds (to let returns accumulate), then enters a while True loop checking self.cvar[1] (dollar CVaR) against the threshold and firing an alert on breach. Spawn it as a daemon thread in IBApp.__init__, taking cvar_threshold from **kwargs. Replace the print statement with smtplib.SMTP("localhost").send_message(msg) for email or twilio.rest.Client(...).messages.create(...) for SMS.
- Refactor schemas and capture execution detail. Move the CREATE TABLE strings for bid_ask_data, open_orders, and trades from app.py into utils.py as named constants (CREATE_BID_ASK_DATA, CREATE_OPEN_ORDERS, CREATE_TRADES). Import them in app.py and execute all three in create_table. Replace the print statements in openOrder and execDetails with parameterized SQL INSERT statements binding the relevant fields. Rename the SQLite file from tick_data.sqlite to strategy_1.sqlite since it now stores more than ticks.
Example
A multi-strategy desk runs the monthly factor portfolio, an intraday crack-spread mean-reversion, and three options-volatility strategies — all on one IBApp instance. Without the recipes from this topic, the operator would be flying blind: no record of which fills came from which strategy, no early warning when CVaR drifts past the daily loss limit, no way to research yesterday's options-flow without re-downloading from IB.
With the four pieces wired in, the ThetaData stream populates ArcticDB with every options trade tagged by expiry, strike, and call/put. At end of day, an analyst runs lib.read("SPY", query_builder=q).data.groupby("days_to_expiration").bid_size.sum() and gets total bid size by days to expiration in seconds; the same query against the IB callback log would take ten minutes of SQL. The watch_cvar thread fires an SMS at 11:32 AM when CVaR crosses the configured daily loss limit of -$5,000; the operator pauses new orders, investigates, finds an unexpected gap in HO futures, and resumes with a halved position size. The open_orders and trades tables let the operator answer "which strategy placed that order at 9:47 ET?" with a one-line SQL query that joins on order_id.
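The 9:47 question could be answered along the lines of the sketch below. It assumes a simplified schema in which open_orders carries a hypothetical strategy column and a placed_at timestamp stored in Eastern time; neither is specified in the text, and the rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE open_orders (order_id INTEGER, placed_at TEXT, strategy TEXT, symbol TEXT);
CREATE TABLE trades (order_id INTEGER, filled_at TEXT, shares REAL, price REAL);
""")

# Illustrative rows only.
conn.executemany(
    "INSERT INTO open_orders VALUES (?, ?, ?, ?)",
    [
        (101, "2024-01-05 09:47:03", "crack_spread", "HO"),
        (102, "2024-01-05 10:15:40", "monthly_factor", "SPY"),
    ],
)
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?)",
    [
        (101, "2024-01-05 09:47:05", 2, 2.61),
        (102, "2024-01-05 10:15:41", 50, 471.25),
    ],
)

# Join fills back to the orders that produced them, filtered to one minute.
QUERY = """
SELECT o.strategy, o.symbol, t.shares, t.price
FROM open_orders AS o
JOIN trades AS t ON t.order_id = o.order_id
WHERE strftime('%H:%M', o.placed_at) = ?
"""

rows = conn.execute(QUERY, ("09:47",)).fetchall()
print(rows)  # [('crack_spread', 'HO', 2.0, 2.61)]
```

How the strategy label gets into the table is a design choice; one option is to set a reference string on each order when it is placed and store it alongside the order ID.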
None of the strategies themselves changed. The wrapping infrastructure is what turned a collection of scripts into a production operation.
Related concepts
- Options Data
- Tick Storage
- Risk Alerting
- Audit Trail
- Embedded Database
- Conditional Value-at-Risk