Systematic Commodity Trading: Definitive Research Bibliography

The canonical reading list for building a quantitative commodity ETF trading system. Organized by domain. Each entry includes full citation, core findings, and platform relevance.

1. Foundational Commodity Futures Theory

1.1 Theory of Storage & Normal Backwardation

Kaldor, N. (1939). "Speculation and Economic Stability." Review of Economic Studies, 7(1), 1-27. - Originator of the Theory of Storage. Introduced convenience yield as the implicit benefit of holding physical inventory. Storage costs, convenience yield, and interest rates jointly determine the futures basis. - Platform relevance: Core theoretical basis for carry/basis signals. Convenience yield is unobservable but proxied by the basis -- the foundation of our term structure features.

Keynes, J.M. (1930). A Treatise on Money, Vol. 2. London: Macmillan. - Theory of Normal Backwardation: hedgers (producers) pay a risk premium to speculators by selling futures below expected spot prices. Commodity futures should have positive expected excess returns on average. - Platform relevance: Theoretical justification for long commodity exposure. Explains why systematic long-biased strategies can earn a structural premium.

Working, H. (1949). "The Theory of Price of Storage." American Economic Review, 39(6), 1254-1262. - Formalized convenience yield and demonstrated that the basis (futures minus spot) reflects storage costs net of convenience yield. Solved the "reverse carrying charge" puzzle. - Platform relevance: Basis decomposition is how we parse carry signals into storage cost vs. convenience yield components.
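
As a rough illustration of how the basis can be turned into a carry number, the sketch below annualizes the log slope between two points on a futures curve. It assumes carry is proxied by the nearby-versus-deferred price ratio, which is a common convention and not necessarily the platform's exact definition; the function name `annualized_carry` and the prices are hypothetical.

```python
import math


def annualized_carry(near_price: float, far_price: float, years_between: float) -> float:
    """Annualized log roll yield implied by two points on the futures curve.

    Positive when the curve is backwardated (near > far), negative in contango.
    Hypothetical helper for illustration only.
    """
    if near_price <= 0 or far_price <= 0:
        raise ValueError("prices must be positive")
    if years_between <= 0:
        raise ValueError("years_between must be positive")
    return math.log(near_price / far_price) / years_between


# Made-up prices: front contract at 80.0, a contract three months further out at 78.0.
carry = annualized_carry(80.0, 78.0, 0.25)  # positive: backwardated curve
```

Splitting this number into storage cost and convenience yield needs extra inputs (financing rates, storage fees) that the sketch deliberately leaves out.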

1.2 Landmark Empirical Studies

Gorton, G. & Rouwenhorst, K.G. (2006). "Facts and Fantasies about Commodity Futures." Financial Analysts Journal, 62(2), 47-68. - Constructed an equally-weighted commodity futures index (1959-2004). Found commodity futures delivered equity-like Sharpe ratios with negative correlation to stocks and bonds. Positively correlated with inflation and unexpected inflation. The foundational empirical case for commodity allocation. - Platform relevance: Establishes the diversification rationale. Our portfolio construction module uses commodity-equity correlation dynamics for allocation sizing.

Bhardwaj, G., Gorton, G. & Rouwenhorst, K.G. (2015). "Facts and Fantasies about Commodity Futures Ten Years Later." NBER Working Paper No. 21243. - Updated the original study through the 2008 crisis and financialization era. Found the core diversification benefits persisted but commodity index returns were lower, partly due to increased financialization and persistent contango in energy markets. - Platform relevance: Warns against naive long-only indexing. Supports our active approach -- term structure awareness and tactical allocation rather than buy-and-hold commodity beta.

Erb, C.B. & Harvey, C.R. (2006). "The Strategic and Tactical Value of Commodity Futures." Financial Analysts Journal, 62(2), 69-97. - Average individual commodity excess returns are approximately zero; portfolio rebalancing and tactical strategies (momentum, term structure) generate equity-like returns. Diversification return from rebalancing is a key source of alpha. - Platform relevance: Directly motivates our multi-signal approach. Individual commodity betas are noisy -- portfolio construction and signal combination are where value is created.

2. Commodity Risk Premia & Factor Structure

2.1 Decomposing Commodity Returns

Szymanowska, M., de Roon, F., Nijman, T. & van den Goorbergh, R. (2014). "An Anatomy of Commodity Futures Risk Premia." Journal of Finance, 69(1), 453-482. - Decomposes commodity futures returns into spot premia (5-14% p.a.) and term premia (1-3% p.a.). A single high-minus-low basis factor explains the cross-section of spot premia. Basis, momentum, volatility, hedging pressure, and liquidity are the key sorting variables. - Platform relevance: Defines the factor taxonomy our signal engine targets. Carry (basis), momentum, and volatility are the primary features.

Bakshi, G., Gao, X. & Rossi, A.G. (2019). "Understanding the Sources of Risk Underlying the Cross Section of Commodity Returns." Management Science, 65(2), 619-641. - A three-factor model (average commodity, carry, momentum) explains the cross-section of commodity returns. Global equity volatility innovations price carry portfolios; speculative activity innovations price momentum portfolios. - Platform relevance: Validates our factor model architecture. Carry and momentum are not redundant -- they load on different risk sources.

2.2 Carry as a Universal Factor

Koijen, R.S.J., Moskowitz, T.J., Pedersen, L.H. & Vrugt, E.B. (2018). "Carry." Journal of Financial Economics, 127(2), 197-225. - Extends carry beyond currencies to all asset classes (equities, bonds, commodities, treasuries, credit, options). Carry predicts returns cross-sectionally and in time series. Not explained by known predictors. Exposed to global recession, liquidity, and volatility risks. - Platform relevance: Carry is our primary factor. This paper provides the theoretical framework for computing carry across all commodity ETF positions and using it as both a cross-sectional sort and a timing signal.

2.3 The Fundamentals of Commodity Returns

Gorton, G., Hayashi, F. & Rouwenhorst, K.G. (2013). "The Fundamentals of Commodity Futures Returns." Review of Finance, 17(1), 35-105. (NBER WP 13249) - Inventory levels are the fundamental driver. Low inventories produce backwardation and higher risk premia. High inventories produce contango and lower/negative risk premia. Bridges the Theory of Storage to empirical cross-section. - Platform relevance: Inventory data feeds directly into our fundamental overlay signals. When USDA or EIA inventory reports shift, our models should update carry expectations.

3. Momentum & Trend Following

3.1 Time-Series Momentum

Moskowitz, T.J., Ooi, Y.H. & Pedersen, L.H. (2012). "Time Series Momentum." Journal of Financial Economics, 104(2), 228-250. - Documents significant time-series momentum across 58 liquid instruments (equities, currencies, commodities, bonds) at 1-12 month horizons. Partial reversal over longer horizons. Diversified TSMOM portfolio delivers substantial alpha with near-zero exposure to standard factors. Performs best in extreme markets. - Platform relevance: The foundational signal for our trend-following module. We implement lookback windows from 1 to 12 months and scale positions by exponentially weighted volatility, following this paper's methodology.
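
A minimal sketch of a time-series momentum position, assuming monthly returns as input: the sign of the trailing cumulative return sets the direction, and the size is scaled inversely to volatility. The 40% volatility scale mirrors the convention used in the paper; the sample-volatility estimate over the same window is a simplification (the paper uses an exponentially weighted estimator on daily returns), and the function name `tsmom_position` is hypothetical.

```python
import math
import statistics


def tsmom_position(monthly_returns, lookback=12, target_vol=0.40):
    """Signed, volatility-scaled position from trailing monthly returns.

    Hypothetical helper: direction is the sign of the lookback return,
    size is target_vol divided by annualized sample volatility.
    """
    if lookback < 2:
        raise ValueError("lookback must be at least 2 to estimate volatility")
    if len(monthly_returns) < lookback:
        raise ValueError("need at least `lookback` monthly returns")
    window = list(monthly_returns[-lookback:])
    # Cumulative simple return over the lookback window.
    cumulative = math.prod(1.0 + r for r in window) - 1.0
    direction = (cumulative > 0) - (cumulative < 0)  # +1, 0 or -1
    # Simplification: annualized sample volatility of the same window.
    annual_vol = statistics.stdev(window) * math.sqrt(12)
    if annual_vol == 0:
        return 0.0
    return direction * target_vol / annual_vol


# Made-up monthly returns, for illustration only.
up_trend = [0.02, 0.01, 0.03, -0.01, 0.02, 0.01, 0.00, 0.02, 0.01, 0.03, -0.01, 0.02]
position = tsmom_position(up_trend)  # long, sized inversely to realized volatility
```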

Hurst, B., Ooi, Y.H. & Pedersen, L.H. (2017). "A Century of Evidence on Trend-Following Investing." Journal of Portfolio Management, 44(1), 15-29. - Extends time-series momentum evidence back to 1880 using novel historical data. Positive returns in every decade. Performed well in 8 of 10 largest crisis periods (60/40 drawdowns). Robust to transaction costs, capacity constraints, and look-ahead bias. - Platform relevance: Out-of-sample validation across 137 years. Confirms trend following is not a data-mined artifact. Supports our conviction in TSMOM as a core signal.

Baltas, N. & Kosowski, R. (2013). "Demystifying Time-Series Momentum Strategies: Volatility Estimators, Trading Rules and Pairwise Correlations." Journal of Derivatives & Hedge Funds, 19(4). - Yang-Zhang range-based volatility estimator and TREND trading rule reduce turnover by 35% without statistically significant performance degradation. Trading costs amount to ~10% of gross TSMOM returns over 30 years. - Platform relevance: Directly applicable to our execution engine. We use range-based vol estimators and optimized signal smoothing to minimize turnover costs.

3.2 Cross-Sectional Momentum

Miffre, J. & Rallis, G. (2007). "Momentum Strategies in Commodity Futures Markets." Journal of Banking & Finance, 31(6), 1863-1886. - 12-month momentum sorting of commodity futures generates 10.7% annual returns (t=3.75). Profitable across almost all ranking/holding period combinations up to 12 months. Survives standard risk adjustments. - Platform relevance: Validates cross-sectional momentum as a standalone commodity signal. Our model combines this with TSMOM for signal diversification.

3.3 Multi-Signal Commodity Strategies

Fuertes, A.M., Miffre, J. & Fernandez-Perez, A. (2015). "Commodity Strategies Based on Momentum, Term Structure, and Idiosyncratic Volatility." Journal of Futures Markets, 35(3), 274-297. - Triple-screen strategy: buy high-momentum / high-roll-yield / low-idiosyncratic-vol; short the opposite. Sharpe ratio 5x the S&P-GSCI. The three signals are non-overlapping, meaning combination is genuinely additive. - Platform relevance: Blueprint for our multi-factor signal combination. Confirms that momentum + carry + vol are complementary, not redundant. The triple-screen architecture maps directly to our signal pipeline.

Fuertes, A.M., Miffre, J. & Rallis, G. (2010). "Tactical Allocation in Commodity Futures Markets: Combining Momentum and Term Structure Signals." Journal of Banking & Finance, 34(10), 2530-2548. - Combining momentum and term structure (roll yield) signals outperforms either signal alone. Higher returns, lower risk than long-only commodity exposure. - Platform relevance: Validates our signal combination approach at the portfolio level, not just individual positions.

4. Cross-Asset Factor Investing

Asness, C.S., Moskowitz, T.J. & Pedersen, L.H. (2013). "Value and Momentum Everywhere." Journal of Finance, 68(3), 929-985. - Documents consistent value and momentum premia across eight diverse markets and asset classes. Value and momentum returns correlate more strongly across asset classes than the passive exposures to those asset classes do. Value and momentum are negatively correlated with each other. A three-factor global model captures the common structure. - Platform relevance: Establishes the cross-asset factor framework. Commodity momentum and carry are manifestations of global factors -- our risk model accounts for factor crowding across asset classes.

Brooks, J. (2017). "A Half Century of Macro Momentum." AQR White Paper. - Systematic macro investing (long improving fundamentals, short deteriorating) has performed consistently since 1970. Highly diversifying to traditional assets. Hedges equity drawdowns and rising real yield environments. - Platform relevance: Macro momentum overlays our commodity signals. When fundamental macro trends (growth, inflation) align with commodity TSMOM, conviction increases.

Babu, A., Levine, A., Ooi, Y.H., Pedersen, L.H. & Stamelos, E. (2020). "Trends Everywhere." Journal of Investment Management (JOIM), 18(1). - Out-of-sample evidence on trend-following across 82 new securities and 16 equity factors. Extends TSMOM to emerging market equities, swaps, exotic commodities, CDS, and vol futures. - Platform relevance: Broadens our signal universe beyond the standard 20-30 commodity futures, validating trend in thinner markets.

5. Financialization & Market Microstructure

Tang, K. & Xiong, W. (2012). "Index Investment and the Financialization of Commodities." Financial Analysts Journal, 68(6), 54-74. - Since the early 2000s, non-energy commodity prices became increasingly correlated with oil, driven by index investment flows (not emerging market demand). Effect was absent in Chinese markets. - Platform relevance: Critical for correlation modeling. Index rebalancing dates and flows create predictable correlation spikes that our risk model must account for. Also creates mean-reversion opportunities post-rebalance.

Singleton, K.J. (2014). "Investor Flows and the 2008 Boom/Bust in Oil Prices." Management Science, 60(2), 300-318. - Index positions and managed-money spread positions had the largest impact on oil futures prices. Speculative flows can push prices away from fundamental values. Informational frictions enable boom/bust dynamics. - Platform relevance: CFTC Commitments of Traders positioning data is a signal input. This paper validates that flow data has predictive content, especially for energy commodities.

6. Commodity Pricing Models & Term Structure

Schwartz, E.S. (1997). "The Stochastic Behavior of Commodity Prices: Implications for Valuation and Hedging." Journal of Finance, 52(3), 923-973. - Develops one-, two-, and three-factor models for commodity price dynamics: a mean-reverting spot price, then a stochastic convenience yield, then stochastic interest rates. The two-factor model (spot price + stochastic convenience yield) became the workhorse for commodity derivatives. - Platform relevance: Our term structure models for roll optimization are rooted in Schwartz's framework. Mean-reversion speed estimates inform our holding period decisions.

Schwartz, E.S. & Smith, J.E. (2000). "Short-Term Variations and Long-Term Dynamics in Commodity Prices." Management Science, 46(7), 893-911. - Reformulates the Gibson-Schwartz stochastic convenience yield model as an equivalent latent factor model. Short-term mean-reverting component + long-term random walk. Easier econometric estimation from futures curves. - Platform relevance: Decomposition of price moves into transitory vs. permanent components directly informs our signal horizon calibration.
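
For reference, the short-term/long-term decomposition described above is usually written as follows. The symbols are the conventional ones for this model and are given here as a summary sketch, not as the platform's calibration.

```latex
\begin{aligned}
\ln S_t &= \chi_t + \xi_t, \\
d\chi_t &= -\kappa\,\chi_t\,dt + \sigma_\chi\,dW^{\chi}_t
  && \text{(short-term deviation, mean-reverting to zero)}, \\
d\xi_t  &= \mu_\xi\,dt + \sigma_\xi\,dW^{\xi}_t
  && \text{(long-term equilibrium level)}, \\
dW^{\chi}_t\,dW^{\xi}_t &= \rho\,dt .
\end{aligned}
```

Under these dynamics a short-term deviation decays with half-life $\ln 2 / \kappa$, which is one way a mean-reversion speed estimate can be mapped to a signal horizon.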

Geman, H. (2005). Commodities and Commodity Derivatives: Modeling and Pricing for Agriculturals, Metals and Energy. Wiley Finance. - Comprehensive treatment of commodity-specific dynamics: seasonality in agriculturals, convenience yield in energy, inventory-price relationships. Covers both physical and financial markets. - Platform relevance: Reference for sector-specific model calibration. Energy, agriculture, and metals have different return-generating processes.

7. Commodity ETF-Specific Research

Bessembinder, H. (2018). "The Roll Yield Myth." Financial Analysts Journal, 74(2), 41-53. - Roll yield is not a separate source of return; it is a component of total return already captured by price changes. The common narrative that contango "destroys" ETF value is misleading -- what matters is the total return including collateral. - Platform relevance: Corrects a common misconception. Our return attribution must properly decompose spot return, roll return, and collateral return.

Two Sigma Venn (2017). "Inside: The Commodity Futures Roll Return 'Tax'." Two Sigma Working Paper. - Quantifies the magnitude and persistence of roll costs across commodity sectors. Energy suffers the worst contango drag; precious metals are nearly flat. Roll timing can mitigate but not eliminate the cost. - Platform relevance: Roll timing algorithms are a first-order concern for commodity ETF strategies. Our execution engine optimizes roll windows using term structure slope.

Columbia University Working Paper. "Understanding the Tracking Errors of Commodity Leveraged ETFs." - Leveraged commodity ETFs suffer from volatility drag (variance drain), daily rebalancing costs, and compounding effects that cause severe long-term tracking error. 2x and 3x products can lose money even when the underlying rises. - Platform relevance: Confirms that leveraged commodity ETFs should only be used for short-term tactical trades, never strategic holds. Our position limits enforce this.
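
The compounding effect can be seen with a few lines of arithmetic. The sketch below assumes a product that resets its leverage every day and ignores fees and financing; the return path is made up, and the helper `compound` is hypothetical.

```python
def compound(daily_returns, leverage=1.0):
    """Growth of 1 unit in a product that resets to `leverage` every day (fees ignored)."""
    value = 1.0
    for r in daily_returns:
        value *= 1.0 + leverage * r
    return value


# Made-up path: the underlying alternates +10% and -9% for 20 days and ends slightly higher.
path = [0.10, -0.09] * 10
underlying = compound(path)         # above 1.0: the underlying rose
levered_2x = compound(path, 2.0)    # below 1.0: the 2x product lost money
levered_3x = compound(path, 3.0)    # lower still
```

Each two-day cycle multiplies the underlying by 1.10 x 0.91 = 1.001, but the 2x product by 1.20 x 0.82 = 0.984, so the loss grows with both leverage and holding period.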

8. Machine Learning & Quantitative Methods

8.1 Financial Machine Learning

Lopez de Prado, M. (2018). Advances in Financial Machine Learning. Wiley. - Introduces meta-labeling, triple barrier method, fractional differentiation, and the combinatorial purged cross-validation (CPCV) framework. Addresses why most ML funds fail: backtest overfitting, data leakage through time-series structure, and lack of proper validation. - Platform relevance: Our ML pipeline implements CPCV, purged k-fold, and the triple barrier labeling method. This is the methodological backbone for any ML signal we deploy.

Lopez de Prado, M. (2018). "The 10 Reasons Most Machine Learning Funds Fail." Journal of Portfolio Management, 44(6), 120-133. - When ML tools are applied without proper methodology, multiple testing, backtest overfitting, walk-forward leakage, and non-IID data all produce spurious results. - Platform relevance: Checklist for our research process. Every ML signal must pass the deflated Sharpe ratio test and combinatorial symmetric cross-validation before promotion.

8.2 ML Applied to Commodities

Wang, T. & Zhang, S. (2024). "Predictability of Commodity Futures Returns with Machine Learning Models." Journal of Futures Markets, 44(2), 302-322. - Tests ML models (gradient boosting, random forest, neural nets) on 22 commodities with commodity-specific and macro predictors. LightGBM long-short portfolios outperform linear benchmarks in Sharpe, return, and max drawdown. Shapley values identify dominant predictors per commodity. - Platform relevance: Directly applicable. Our feature engineering uses the same predictor categories. Shapley-based feature importance is how we maintain model interpretability.

Guida, T. (2025). "Machine Learning in Commodity Futures: Bridging Data, Theory, and Return Predictability." CFA Institute Research Foundation, AI in Asset Management. - ML grounded in commodity theory (carry, basis, momentum, skewness) can uncover persistent return patterns. Commodities remain an underexplored frontier for ML. Cross-sectional rankings and multi-horizon diversification are key. - Platform relevance: Validates our approach of grounding ML features in economic theory rather than pure data mining. Feature engineering should start from carry, momentum, and term structure.

9. Risk Management & Portfolio Construction

9.1 Position Sizing

Thorp, E.O. (2006). "The Kelly Criterion in Blackjack, Sports Betting, and the Stock Market." In Handbook of Asset and Liability Management, Vol. 1, Elsevier. - Kelly criterion maximizes long-term geometric growth rate and minimizes time to reach a wealth target. Overbetting is more harmful than underbetting. Practical trading should use fractional Kelly (typically 0.25-0.5x) due to estimation error. - Platform relevance: Our position sizing module uses fractional Kelly (default 0.25x full Kelly). The Kelly fraction adapts to signal confidence -- higher confidence signals get larger fractions.
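
A minimal sketch of fractional Kelly sizing, assuming the continuous-time approximation in which the full Kelly leverage is expected excess return divided by variance. The 0.25 default matches the fraction stated above; the function name and the inputs are hypothetical, and in practice the expected return estimate is the dominant source of error.

```python
def fractional_kelly(expected_excess_return, volatility, fraction=0.25):
    """Leverage implied by a fraction of the continuous-time Kelly bet.

    Full Kelly is mu / sigma^2; the result is scaled down by `fraction`
    to allow for estimation error. Hypothetical helper for illustration.
    """
    if volatility <= 0:
        raise ValueError("volatility must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    full_kelly = expected_excess_return / volatility ** 2
    return fraction * full_kelly


# Made-up inputs: 4% expected excess return, 20% annualized volatility.
leverage = fractional_kelly(0.04, 0.20)  # full Kelly is about 1.0x, quarter Kelly about 0.25x
```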

Carver, R. (2015). Systematic Trading: A Unique New Method for Designing Trading and Investing Systems. Harriman House. - Comprehensive practitioner framework: volatility targeting, signal combination via inverse-variance weighting, forecast scaling, and instrument diversification multipliers. Author is former AHL (Man Group) portfolio manager. - Platform relevance: Our volatility targeting (10% annualized default), forecast combination, and position scaling follow Carver's framework directly. The "pysystemtrade" open-source implementation is a reference codebase.
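
The volatility-targeting idea can be sketched as follows, assuming annualized volatilities and a forecast scaled so that its average absolute value is 10 and capped at 20, which is my reading of the convention in Carver's book. The 10% target is the default stated above; the function name `target_notional` and the capital figure are hypothetical.

```python
def target_notional(capital, instrument_annual_vol, forecast=10.0,
                    target_vol=0.10, average_forecast=10.0, forecast_cap=20.0):
    """Notional exposure that hits `target_vol` at an average-strength forecast.

    Stronger or weaker forecasts scale the exposure up or down, subject to a cap.
    Hypothetical helper; single instrument, no diversification multiplier.
    """
    if capital <= 0 or instrument_annual_vol <= 0:
        raise ValueError("capital and volatility must be positive")
    capped = max(-forecast_cap, min(forecast_cap, forecast))
    vol_scalar = target_vol / instrument_annual_vol
    return capital * vol_scalar * (capped / average_forecast)


# Made-up inputs: 1,000,000 of capital, an instrument with 20% annualized volatility.
long_notional = target_notional(1_000_000, 0.20)                  # about 500,000
short_notional = target_notional(1_000_000, 0.20, forecast=-10.0)  # about -500,000
```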

9.2 Tail Risk & Drawdown Control

Rockafellar, R.T. & Uryasev, S. (2000). "Optimization of Conditional Value-at-Risk." Journal of Risk, 2(3), 21-41. - CVaR can be optimized using linear programming -- a breakthrough for practical tail-risk management. Simultaneously computes VaR and minimizes CVaR. Enabling technology for large-scale risk-constrained portfolio optimization. - Platform relevance: CVaR is our primary tail risk metric. Portfolio optimization runs CVaR constraints alongside return objectives.

Rockafellar, R.T. & Uryasev, S. (2002). "Conditional Value-at-Risk for General Loss Distributions." Journal of Banking & Finance, 26(7), 1443-1471. - Extends CVaR to discrete distributions (scenario-based), critical for practical implementation with finite samples and Monte Carlo simulations. - Platform relevance: Our Monte Carlo stress testing uses this discrete CVaR formulation.
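
A minimal sketch of scenario-based CVaR, assuming equally likely scenarios and losses expressed as positive numbers. It averages the worst (1 - alpha) probability mass and splits the scenario that straddles the VaR boundary, in the spirit of the discrete formulation; it does not perform the linear-programming optimization itself. The function name `scenario_cvar` is hypothetical.

```python
def scenario_cvar(losses, alpha=0.95):
    """Average loss over the worst (1 - alpha) probability mass of equally likely scenarios."""
    if not losses:
        raise ValueError("need at least one scenario")
    if not 0 <= alpha < 1:
        raise ValueError("alpha must be in [0, 1)")
    per_scenario = 1.0 / len(losses)
    tail_mass = 1.0 - alpha
    remaining = tail_mass
    weighted_sum = 0.0
    for loss in sorted(losses, reverse=True):
        # The boundary scenario only contributes the mass still needed.
        weight = min(per_scenario, remaining)
        weighted_sum += weight * loss
        remaining -= weight
        if remaining <= 1e-12:
            break
    return weighted_sum / tail_mass


# Made-up scenario losses 1, 2, ..., 100.
cvar_95 = scenario_cvar(list(range(1, 101)), alpha=0.95)  # mean of the five worst losses
```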

Chekhlov, A., Uryasev, S. & Zabarankin, M. (2005). "Drawdown Measure in Portfolio Optimization." International Journal of Theoretical and Applied Finance, 8(1), 13-58. - Introduces Conditional Drawdown at Risk (CDaR) -- a family of risk measures based on the drawdown curve. CDaR generalizes between maximum drawdown and average drawdown. Analogous to CVaR but in the drawdown domain. - Platform relevance: Our drawdown control system uses CDaR as the constraint metric. Maximum drawdown limits trigger position reduction; CDaR informs the gradual risk reduction pathway.
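
The drawdown analogue can be sketched in the same way. The version below assumes percentage drawdowns from the running peak and takes a plain average of the worst (1 - alpha) share of observations, which is a simplification of the paper's definition; `drawdowns` and `cdar` are hypothetical helper names and the equity curve is made up.

```python
import math


def drawdowns(equity_curve):
    """Fractional drawdown from the running peak at each point of the curve."""
    peak = equity_curve[0]
    result = []
    for value in equity_curve:
        peak = max(peak, value)
        result.append((peak - value) / peak)
    return result


def cdar(equity_curve, alpha=0.90):
    """Mean of the worst (1 - alpha) share of drawdown observations.

    alpha = 0 gives the average drawdown; alpha close to 1 gives the maximum drawdown.
    """
    if not 0 <= alpha < 1:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(drawdowns(equity_curve), reverse=True)
    # round() guards against floating-point noise before taking the ceiling.
    count = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    return sum(ordered[:count]) / count


# Made-up equity curve with two 10% drawdowns.
curve = [100.0, 110.0, 99.0, 104.5, 121.0, 108.9]
worst_half = cdar(curve, alpha=0.5)
```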

9.3 Portfolio Construction

Levine, A., Ooi, Y.H., Richardson, M. & Sasseville, C. (2018). "Commodities for the Long Run." Financial Analysts Journal, 74(2), 55-68. (NBER WP 22793) - Long-run (1877-present) analysis of commodity futures returns. Equal-weight and risk-balanced approaches dominate production-weighted indices. Active strategies (momentum, carry, value) substantially improve risk-adjusted returns. - Platform relevance: Validates risk-balanced over production-weighted commodity allocation. Our equal-risk-contribution weighting scheme follows this evidence.
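
As a simple illustration of risk-balanced weighting, the sketch below uses inverse-volatility weights, which equalize each sleeve's standalone risk and coincide with equal risk contribution only when all pairwise correlations are equal. That is an assumption made here for brevity; a full equal-risk-contribution solver would use the covariance matrix. The sector names and volatilities are made up.

```python
def inverse_vol_weights(annual_vols):
    """Weights proportional to 1 / volatility, normalized to sum to one."""
    if not annual_vols:
        raise ValueError("need at least one asset")
    if any(vol <= 0 for vol in annual_vols.values()):
        raise ValueError("volatilities must be positive")
    inverse = {name: 1.0 / vol for name, vol in annual_vols.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


# Made-up annualized volatilities per sector.
vols = {"energy": 0.30, "metals": 0.15, "grains": 0.20}
weights = inverse_vol_weights(vols)  # the least volatile sector gets the largest weight
```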

Ilmanen, A. (2011). Expected Returns: An Investor's Guide to Harvesting Market Rewards. Wiley. - Comprehensive framework for understanding return sources across all asset classes. Covers risk premia (equity, term, credit, commodity), strategy premia (value, momentum, carry, volatility), and macro factors (growth, inflation, liquidity). - Platform relevance: Intellectual backbone for our multi-factor, multi-asset approach. The commodity chapters directly inform how we decompose expected returns into structural and tactical components.

10. Commodity-Specific Dynamics

10.1 Energy Markets

Hamilton, J.D. (2009). "Understanding Crude Oil Prices." Energy Journal, 30(2), 179-206. (NBER WP 14492) - Comprehensive analysis of oil price drivers: supply disruptions, OPEC behavior, demand growth, speculation, and resource depletion. Statistical properties of oil prices reviewed against theoretical predictions. - Platform relevance: Oil is the dominant commodity by weight in most indices. Understanding its unique supply-side dynamics (OPEC, geopolitics, shale) is essential for our energy sector models.

10.2 Seasonality & Calendar Effects

Li, Y., Liu, Q., Miao, D. & Tse, Y. (2024). "Return Seasonality in Commodity Futures." International Review of Economics & Finance. - Documents persistent seasonal patterns in commodity futures returns driven by production cycles, weather, and consumption patterns. Agricultural commodities show the strongest seasonality. - Platform relevance: Seasonality features feed into our signal engine as conditioning variables. Planting/harvest cycles for grains, heating/cooling demand for energy.

Boos, D. et al. (2024). "Risky Times: Seasonality and Event Risk of Commodities." Journal of Futures Markets. - Commodity event risk (USDA reports, OPEC meetings, weather events) clusters seasonally. Risk management must account for calendar-dependent volatility. - Platform relevance: Our event calendar module adjusts position sizing around known high-volatility windows (WASDE reports, FOMC, OPEC meetings).

10.3 Weather Effects

Various (2023). "Impacts of Weather Conditions on the US Commodity Markets Systemic Interdependence Across Multi-Timescales." ScienceDirect. - Weather conditions create cross-commodity spillovers at multiple timescales. Drought affects grains, which affects livestock feed costs, which affects meat prices. Non-linear transmission mechanisms. - Platform relevance: Weather data pipeline feeds agricultural and energy commodity signals. Cross-commodity spillover modeling prevents concentrated weather risk.

11. Practitioner Research & Working Papers

11.1 AQR Capital Management

AQR (Various). Trend Following Research Bibliography. - AQR maintains the most comprehensive practitioner research library on systematic strategies: trend following, carry, value, momentum, and risk parity across commodities and other asset classes. - Key papers: "Demystifying Managed Futures," "Building a Better Commodities Portfolio," "Trend Following and Rising Rates" (2023), "Risk Parity in Commodity Portfolios." - Platform relevance: AQR's research is the gold standard for practitioner implementation. Their commodity datasets (available on the AQR data library) are used for our backtesting.

11.2 Man AHL

Man AHL Research (Various). "Gaining Momentum: Where Next for Trend-Following?" - Man AHL runs multiple trend-following programs across the full spectrum of markets, models, and risk budgets. Research shows simple pure trend strategies on the largest futures markets have been the strongest performers. Maintains a proprietary commodity database from 1946-present. - Platform relevance: Man AHL's research informs our model simplicity bias -- complex models rarely beat well-diversified simple trend systems in production.

11.3 Winton Group

Winton Capital Management. "Trend Following Working Papers." - Backtested fast, medium, and slow trend-following strategies across 20 futures markets over 30 years. Sharpe ratios: fast (0.87), medium (1.12), slow (0.81). Medium-speed trend captures the best risk-adjusted returns. - Platform relevance: Calibrates our default lookback window. Medium-speed (3-6 month) trend is the core, with fast and slow as diversifying signals.

12. Post-2020 Frontier Research

Hao, J. (2025). "Predicting Commodity Returns Through Image-Based Price Patterns." Journal of Futures Markets. - Uses convolutional neural networks on price chart images to predict commodity futures returns. Novel computer-vision approach to technical analysis. - Platform relevance: Exploratory signal. We monitor chart-pattern recognition as an alternative to traditional TSMOM lookback signals.

Various (2024-2025). Deep Learning for Commodity Price Forecasting. - BiLSTM-Attention-CNN models with wavelet transforms forecast crude oil futures. LSTM and GRU models outperform traditional approaches for agricultural commodities. Ensemble methods (gradient boosting + neural nets) generate the strongest out-of-sample performance. - Platform relevance: Next-generation signal research. Deep learning models are candidates for our "experimental" signal bucket, subject to rigorous out-of-sample validation before production deployment.

AQR (2023). "Trend Following and Rising Rates." - Analyzes trend-following performance specifically in rising rate environments (relevant post-2022). Trend strategies naturally position short fixed income and long commodities during inflationary regimes. - Platform relevance: Regime-conditional analysis for our macro overlay. In rising-rate environments, the system should increase commodity trend allocation.

Reading Order Recommendation

For someone building a commodity trading system from scratch:

Phase 1: Foundations (Week 1-2)

Gorton & Rouwenhorst (2006) -- Why commodities?
Erb & Harvey (2006) -- Strategic vs. tactical
Keynes (1930) / Working (1949) -- Normal backwardation and theory of storage
Gorton, Hayashi & Rouwenhorst (2013) -- Inventory fundamentals
Ilmanen (2011), Chapters on commodities -- Expected returns framework

Phase 2: Signals (Week 3-4)

Moskowitz, Ooi & Pedersen (2012) -- Time-series momentum
Hurst, Ooi & Pedersen (2017) -- Century of trend evidence
Koijen et al. (2018) -- Carry everywhere
Szymanowska et al. (2014) -- Risk premia anatomy
Fuertes, Miffre & Fernandez-Perez (2015) -- Triple-screen strategy
Asness, Moskowitz & Pedersen (2013) -- Cross-asset factors

Phase 3: Implementation (Week 5-6)

Carver (2015) -- Systematic Trading (entire book)
Lopez de Prado (2018) -- ML methodology
Baltas & Kosowski (2013) -- Practical TSMOM optimization
Levine et al. (2018) -- Long-run commodity allocation

Phase 4: Risk Management (Week 7-8)

Rockafellar & Uryasev (2000, 2002) -- CVaR optimization
Chekhlov, Uryasev & Zabarankin (2005) -- Drawdown control
Thorp (2006) -- Kelly criterion
Bhardwaj, Gorton & Rouwenhorst (2015) -- Post-crisis reality check

Phase 5: Advanced & Frontier (Ongoing)

Wang & Zhang (2024) -- ML for commodity futures
Guida (2025) -- ML meets commodity theory
Hamilton (2009) -- Oil market fundamentals
Tang & Xiong (2012) -- Financialization effects
Singleton (2014) -- Flow-based signals
Bakshi, Gao & Rossi (2019) -- Factor model validation

Key Datasets & Data Sources

Source	Coverage	Use
AQR Data Library	TSMOM factors, carry, value (public)	Backtesting, factor replication
CFTC Commitments of Traders	Weekly positioning data	Flow/sentiment signals
EIA (Energy Information Admin)	Oil, gas inventories & production	Energy fundamental signals
USDA WASDE Reports	Agricultural supply/demand	Grain/livestock signals
Bloomberg Commodity Indices	Real-time term structures	Live signal computation
Quandl/Nasdaq Data Link	Historical futures curves	Backtesting roll strategies
FRED (Federal Reserve)	Macro variables, inflation	Regime classification

Last updated: 2026-04-10
Maintained by: QGTM Research