Financial Markets Analyst at eClerx | India Financials | 8 stocks across
private banks, NBFCs, gold finance. Building systematic quant strategies
with real market data, rigorous mathematics, and industry-standard code.
Why skip last month (t-1)?
→ Short-term reversal effect (Jegadeesh 1990)
→ Stocks tend to partially reverse their most recent 1M move
→ Skipping that month keeps reversal noise out of the momentum signal
In plain English: At month T, look at how each stock performed
over the PAST 12 months (not including last month). Stocks that went up the most
are "momentum winners" — put them in the LONG portfolio. Stocks that fell the
most are "momentum losers" — put them in the SHORT portfolio. Each month you
rebalance and collect the spread.
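The ranking step described above can be sketched in a few lines. This is a minimal illustration, not the page's actual implementation: the function names, the 25% bucket size, and the tickers are all hypothetical, and the signal is taken as the price at t-1 over the price at t-12 (other 12-1 conventions exist).

```python
def momentum_12_1(prices):
    """12-1 momentum from month-end prices, oldest first.

    Return from t-12 to t-1: the latest month (t-1 to t) is skipped.
    """
    if len(prices) < 13:
        raise ValueError("need at least 13 month-end prices")
    return prices[-2] / prices[-13] - 1.0


def long_short(signals, fraction=0.25):
    """Rank tickers by signal; top fraction goes long, bottom fraction short."""
    ranked = sorted(signals, key=signals.get, reverse=True)
    n = max(1, int(len(ranked) * fraction))
    return ranked[:n], ranked[-n:]


# Hypothetical signal values for eight made-up tickers (not real data).
signals = {"A": 0.31, "B": 0.18, "C": 0.12, "D": 0.05,
           "E": -0.02, "F": -0.09, "G": -0.15, "H": -0.27}
longs, shorts = long_short(signals)
```

With eight names and a 25% bucket, two stocks land in each leg; the monthly rebalance simply repeats this ranking with fresh prices.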
Information Coefficient (IC)
IC(t) = SpearmanCorr(Signal(t), Return(t+1))
IC > 0 → signal ranks stocks in the right direction on average
Mean IC > 0.02 → signal is useful (rule of thumb)
ICIR = Mean(IC) / Std(IC)
ICIR > 0.5 → signal is deployable (rule of thumb)
Why IC matters more than backtest returns: A strategy can show
positive returns by LUCK, for example from one or two outsized positions. IC tests
whether the signal GENUINELY ranks stocks correctly across the whole cross-section.
If IC is positive consistently, month after month, the factor is more likely to have
real predictive power and less likely to be a backtest artifact. Quants commonly use
IC to validate signals before putting capital at risk.
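A rough sketch of how IC and ICIR could be computed with the standard library alone. The helper names are illustrative, ties are given average ranks, and the sample standard deviation is assumed for Std(IC); a constant input makes the correlation undefined and will raise a division error.

```python
from statistics import mean, stdev


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_ic(signal, next_return):
    """IC(t): Pearson correlation of the ranks of signal and next-period return."""
    rx, ry = average_ranks(signal), average_ranks(next_return)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def icir(ics):
    """Mean(IC) / Std(IC) over a series of per-period ICs."""
    return mean(ics) / stdev(ics)
```

One IC value is produced per rebalance date; ICIR then summarises how stable those values are through time.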
Project 02 — Black-Scholes Options Pricer
Real-time options pricing. Adjust any input — prices and all Greeks update instantly. Implied volatility solved via Newton-Raphson iteration. Put-Call Parity validated.
Model Inputs
Stock Price (S) ₹22,450
Current market price of the underlying stock
Strike Price (K) ₹22,500
Price at which you can buy (call) or sell (put)
Time to Expiry (days) 30
NSE weekly=7d · Monthly=30d · Long-dated=365d
Volatility (σ) 18.0%
Historical/implied vol. Nifty 50 typical: 12-25%
Risk-Free Rate (r) 6.50%
RBI repo rate proxy. Currently ~6.5% in India
Market Price (for IV) ₹320
Observed market price → we back-solve for Implied Vol
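Outputs: option prices and Greeks, computed live from the inputs above.
Δ Delta (call, put) · Γ Gamma (same for call and put) · ν Vega (same for call and put) · θ Theta (call, put) · ρ Rho (call, put)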
Put-Call Parity: C - P = S - K·e^(-rT)
P&L at Expiry vs Stock Price
Delta Profile — Call & Put
The Mathematics — From First Principles
Black-Scholes Formula
C = S·N(d₁) - K·e^(-rT)·N(d₂)
P = K·e^(-rT)·N(-d₂) - S·N(-d₁)
d₁ = [ln(S/K) + (r + σ²/2)·T] / (σ√T)
d₂ = d₁ - σ√T
N(x) = standard normal CDF; T = time to expiry in years
What N(d₂) means: The risk-neutral probability that the option
expires in-the-money. If d₂ = 0, N(0) = 0.5 — 50% chance. If d₂ = 2,
N(2) ≈ 0.977 — 97.7% chance the call expires ITM.
S·N(d₁): Present value of the stock you receive on exercise, weighted by the risk-neutral chance of exercising.
K·e^(-rT)·N(d₂): Present value of paying strike K, weighted by the risk-neutral probability of exercise.
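The formulas above translate almost line for line into code. The sketch below assumes a European option with no dividends and converts days to years with a 365-day year (an assumption; the page does not state its day count). Function names are illustrative.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes(S, K, T, r, sigma):
    """Return (call, put) prices; T in years, r and sigma as decimals."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    call = S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)
    put = K * exp(-r * T) * norm_cdf(-d2) - S * norm_cdf(-d1)
    return call, put


# Inputs from the Model Inputs panel; 30 days taken as 30/365 years.
S, K, T, r, sigma = 22450.0, 22500.0, 30 / 365, 0.065, 0.18
call, put = black_scholes(S, K, T, r, sigma)

# Put-call parity check: C - P should equal S - K*exp(-rT).
parity_gap = (call - put) - (S - K * exp(-r * T))
```

The parity gap should be zero up to floating-point error, which is a quick sanity check on any pricer of this kind.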
Implied Vol: Newton-Raphson
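σ_new = σ_old - (BS(σ_old) - Price) / Vega(σ_old)
Repeat until |BS(σ) - Price| < 0.0001
Typically converges in a handful of iterations near the money; can stall when Vega is close to zero (deep ITM/OTM or very short expiry)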
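A minimal sketch of the Newton-Raphson loop for a call option. The starting guess of 20%, the iteration cap, and the floor that keeps σ positive are assumptions added for robustness, not details from the page; the function names are illustrative.

```python
from math import erf, exp, log, pi, sqrt


def _call_and_vega(S, K, T, r, sigma):
    """Black-Scholes call price and vega (per 1.00 change in sigma)."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    cdf = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    price = S * cdf(d1) - K * exp(-r * T) * cdf(d2)
    vega = S * exp(-0.5 * d1 ** 2) / sqrt(2.0 * pi) * sqrt(T)
    return price, vega


def implied_vol_call(price, S, K, T, r, sigma0=0.2, tol=1e-4, max_iter=50):
    """Solve BS(sigma) = price for sigma by Newton-Raphson."""
    sigma = sigma0
    for _ in range(max_iter):
        model, vega = _call_and_vega(S, K, T, r, sigma)
        diff = model - price
        if abs(diff) < tol:
            return sigma
        if vega < 1e-8:
            break  # the Newton step would blow up
        sigma = max(sigma - diff / vega, 1e-4)  # keep sigma positive
    raise ValueError("Newton-Raphson did not converge")


# Round trip: price a call at 18% vol, then recover the vol from the price.
target, _ = _call_and_vega(22450.0, 22500.0, 30 / 365, 0.065, 0.18)
recovered = implied_vol_call(target, 22450.0, 22500.0, 30 / 365, 0.065)
```

A round trip like this is a useful self-test; for observed market prices a bracketing fallback such as bisection is often added in case Newton fails.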
Delta (Δ): If stock goes up ₹1, call gains Δ rupees. ATM call ≈ 0.5.
Gamma (Γ): How fast delta changes. Highest ATM near expiry. Long gamma = benefit from big moves.
Vega (ν): Change in option value per 1 percentage point rise in vol. Long options always have positive vega.
Theta (θ): Daily time decay, i.e. how much value is lost each day. Negative for buyers.
Rho (ρ): Sensitivity to the interest rate. Matters more for long-dated options.
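For reference, the closed-form call Greeks can be sketched as below. The scaling choices (vega and rho per one percentage point, theta per calendar day on a 365-day year) are assumptions about how the page displays them, and the dictionary keys are illustrative.

```python
from math import erf, exp, log, pi, sqrt


def call_greeks(S, K, T, r, sigma):
    """Black-Scholes Greeks for a European call (no dividends), T in years."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    cdf = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
    pdf_d1 = exp(-0.5 * d1 ** 2) / sqrt(2.0 * pi)
    theta_year = (-S * pdf_d1 * sigma / (2.0 * sqrt(T))
                  - r * K * exp(-r * T) * cdf(d2))
    return {
        "delta": cdf(d1),
        "gamma": pdf_d1 / (S * sigma * sqrt(T)),
        "vega_per_vol_point": S * pdf_d1 * sqrt(T) / 100.0,
        "theta_per_day": theta_year / 365.0,
        "rho_per_rate_point": K * T * exp(-r * T) * cdf(d2) / 100.0,
    }


greeks = call_greeks(S=22450.0, K=22500.0, T=30 / 365, r=0.065, sigma=0.18)
```

Gamma and vega are identical for the put with the same strike and expiry; put delta is the call delta minus 1.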
Project 03 — ML Return Predictor (XGBoost)
XGBoost gradient boosting model predicting next-month stock returns using 15 price and fundamental features. Walk-forward out-of-sample validation. SHAP feature importance.
0.127 · Information Coeff
0.84 · IC / Std(IC)
0.61 · OOS factor
0.043 · Explained variance
58.3% · Direction accuracy
Feature Importance (SHAP Values) — Top 15 Features
Walk-Forward: Predicted vs Actual Returns
The Mathematics
XGBoost Objective
L(φ) = Σᵢ l(yᵢ, ŷᵢ) + Σₖ Ω(fₖ)
Ω(f) = γT + ½λ||w||²   [T = number of leaves, w = leaf weights]
Each tree k is fitted to the errors left by the previous trees.
ŷᵢ = Σₖ fₖ(xᵢ) [sum of K trees]
In plain English: XGBoost builds trees one by one. Each new tree
tries to fix the errors of the previous ones. The regularisation term Ω prevents
overfitting — critical in finance where data is scarce and noisy.
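The "each new tree fixes the errors of the previous ones" idea can be shown with a toy booster built from one-split stumps. This is plain gradient boosting for squared loss, not XGBoost itself: it leaves out the regularisation term Ω and the second-order updates, and the data are made up for illustration.

```python
def fit_stump(x, residuals):
    """Best one-split regression stump on a single feature (squared error).

    Needs at least two distinct x values.
    """
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lv) ** 2 for r in left)
               + sum((r - rv) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lv, rv)
    _, threshold, lv, rv = best
    return lambda xi: lv if xi <= threshold else rv


def boost(x, y, n_trees=20, learning_rate=0.3):
    """Each stump is fitted to the residuals of the ensemble so far."""
    base = sum(y) / len(y)
    trees, preds = [], [base] * len(y)
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        tree = fit_stump(x, residuals)
        trees.append(tree)
        preds = [pi + learning_rate * tree(xi) for xi, pi in zip(x, preds)]
    return lambda xi: base + learning_rate * sum(t(xi) for t in trees)


def mse(model, x, y):
    return sum((model(xi) - yi) ** 2 for xi, yi in zip(x, y)) / len(x)


x = [float(i) for i in range(1, 11)]
y = [0.1 * xi ** 2 for xi in x]  # toy target, not market data
error_one_tree = mse(boost(x, y, n_trees=1), x, y)
error_many_trees = mse(boost(x, y, n_trees=20), x, y)
```

Training error falls as trees are added, which is exactly why the regularisation and out-of-sample checks described here matter: the same mechanism will happily fit noise.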
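Why better than random forest for finance: A random forest averages
independently grown trees, while gradient boosting fits each new tree to the errors
that remain. In return prediction, the observations the current ensemble predicts
worst, often stocks with ambiguous signals, get the most attention in each
subsequent tree.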
Walk-Forward Validation
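Train on months 1 → 36
Test on months 37 → 48
Roll forward by 6 months. Repeat.
OOS IC = mean of all test-period ICs
Each test window comes strictly after its training window, which guards against
look-ahead bias as long as features use only data known at the time.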
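The schedule above can be generated mechanically. The sketch assumes a rolling (fixed-length) training window rather than an expanding one, which the text does not specify; the function name and defaults are illustrative.

```python
def walk_forward_splits(n_months, train=36, test=12, step=6):
    """Yield (train_months, test_months) as 1-based ranges, rolling window."""
    start = 1
    while start + train + test - 1 <= n_months:
        train_months = range(start, start + train)
        test_months = range(start + train, start + train + test)
        yield train_months, test_months
        start += step


splits = list(walk_forward_splits(n_months=60))
```

With 60 months of data this gives three folds (tests on months 37-48, 43-54, 49-60). Because the step is shorter than the test window, consecutive test windows overlap by six months, so the per-fold ICs are not fully independent.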
Why walk-forward beats train/test split: A single split gives one
OOS estimate — which could be lucky or unlucky. Walk-forward gives many OOS periods,
testing whether the model works across different market regimes (bull, bear, sideways,
high-vol, low-vol). If IC is positive across all regimes, the model is robust.
Bias-Variance tradeoff: In finance, overfitting is the dominant
failure mode. OOS R² of 4.3% sounds small but is actually meaningful — finance
signal-to-noise is extremely low.
Project 04 — Portfolio Risk Dashboard
VaR and CVaR at 95% and 99% confidence. Correlation heatmap. Rolling volatility. Stress testing against 3 historical crash scenarios. Portfolio: HDFC Bank, ICICI, Bajaj Finance, Reliance, Infosys.
-2.31% · 1-day loss limit (95% VaR)
-3.45% · 1-day loss limit (99% VaR)
-3.12% · Expected shortfall (95% CVaR)
-4.21% · Expected shortfall (99% CVaR)
18.4% · Portfolio
0.74 · Portfolio
Daily Return Distribution with VaR Lines
Correlation Heatmap — Portfolio Holdings
Stress Test — Historical Crash Scenarios
-42.3% · Peak to trough (Nifty -60%)
-28.7% · Feb-Mar 2020 (Nifty -38%)
-16.2% · Jan-Jun 2022 (Nifty -17%)
The Mathematics
Value at Risk (VaR)
VaR_α = -Quantile(Returns, 1-α)
VaR₉₅ = loss at the 5th percentile of the return dist.
VaR₉₉ = loss at the 1st percentile of the return dist.
Interpretation: "With 95% confidence,
we will not lose more than X% in one day."
VaR limitation, and why CVaR is better: VaR tells you NOTHING about
losses beyond the threshold. Two portfolios can have identical VaR but very different
tail risk. CVaR (Expected Shortfall) asks: IF we breach VaR, how bad is it on average?
CVaR is a coherent risk measure; VaR is not (it can violate sub-additivity).
Under the Basel market-risk framework (FRTB), internal-model capital is based on
97.5% Expected Shortfall rather than 99% VaR.
CVaR / Expected Shortfall
CVaR_α = E[Loss | Loss > VaR_α]
= average loss over all returns beyond the VaR threshold
CVaR ≥ VaR always
CVaR is sub-additive: CVaR(A+B) ≤ CVaR(A) + CVaR(B)
→ combining positions never makes CVaR worse than the sum of the parts
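A minimal historical-simulation sketch of both measures. It reports losses as positive fractions (the dashboard shows them as negative returns), uses a simple "worst n observations" quantile rule, and includes the VaR observation itself in the tail average; all three are conventions chosen for illustration.

```python
def historical_var_cvar(returns, alpha=0.95):
    """Return (VaR, CVaR) as positive loss fractions from a return history."""
    ordered = sorted(returns)  # worst returns first
    n_tail = max(1, int(round(len(ordered) * (1 - alpha))))
    tail = ordered[:n_tail]
    var = -tail[-1]                # loss at the edge of the tail
    cvar = -sum(tail) / len(tail)  # average loss inside the tail
    return var, cvar


# Made-up history: 5 bad days and 95 quiet ones (not real portfolio data).
history = [-0.10, -0.05, -0.04, -0.03, -0.02] + [0.01] * 95
var_95, cvar_95 = historical_var_cvar(history, alpha=0.95)
```

In this toy history the 95% VaR is a 2% loss while the CVaR is 4.8%, which shows how two portfolios with the same VaR can hide very different tails.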
Portfolio VaR formula: σ_p = √(w'Σw), where Σ is the covariance
matrix and w is the weight vector. Under a normal, zero-mean assumption,
VaR_α ≈ z_α·σ_p (z ≈ 1.645 at 95%, 2.326 at 99%). This is why correlation
matters: two assets with high individual VaR but negative correlation can have
LOW portfolio VaR. The correlation heatmap shows where diversification benefit
exists in your portfolio.
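The effect of correlation on portfolio risk can be checked with a small calculation. The sketch assumes normally distributed, zero-mean returns for the VaR step; the volatilities, weights, and correlations are hypothetical.

```python
from math import sqrt


def portfolio_vol(weights, vols, corr):
    """sigma_p = sqrt(w' Σ w), with Σ_ij = corr_ij * vol_i * vol_j."""
    n = len(weights)
    variance = sum(weights[i] * weights[j] * corr[i][j] * vols[i] * vols[j]
                   for i in range(n) for j in range(n))
    return sqrt(variance)


def parametric_var(weights, vols, corr, z=1.645):
    """Normal, zero-mean VaR as a positive loss fraction (z=1.645 is 95%)."""
    return z * portfolio_vol(weights, vols, corr)


weights = [0.5, 0.5]
vols = [0.02, 0.02]  # hypothetical 2% daily volatility for each asset
vol_negative = portfolio_vol(weights, vols, [[1.0, -0.5], [-0.5, 1.0]])
vol_perfect = portfolio_vol(weights, vols, [[1.0, 1.0], [1.0, 1.0]])
```

With a correlation of -0.5 the 50/50 portfolio has half the volatility of either asset; with a correlation of +1 there is no reduction at all.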
Project 05 — Pairs Trading Engine
Statistical arbitrage via Engle-Granger cointegration test. HDFCBANK vs ICICIBANK. Z-score based entry/exit signals (±2σ). Dynamic hedge ratio. Half-life 18 days.
0.023 · <0.05 = cointegrated
1.300 · ln(HDFC)/ln(ICICI)
18 days · Mean reversion speed
|Z| > 2 · Current z-score
|Z| < 0.5 · Close position
Price Series — HDFCBANK vs ICICIBANK (Normalised)
Z-Score of Spread — Entry/Exit Signals
Current Z-Score
0.00
Z < -2: LONG HDFC, SHORT ICICI (spread is cheap)
|Z| < 0.5: Exit / No trade
Z > +2: SHORT HDFC, LONG ICICI (spread is rich)
The Mathematics
Cointegration Test (Engle-Granger)
Step 1: Regress ln(S₁) = α + β·ln(S₂) + ε
Step 2: ADF test on residuals ε
H₀: ε has unit root (NOT cointegrated)
p < 0.05 → reject H₀ → cointegrated
Spread = ln(S₁) - β·ln(S₂)
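Step 1 and the spread construction can be sketched without any libraries. The ADF test in Step 2 is not reproduced here; in practice it would come from a statistics package. The sketch uses a single static hedge ratio and full-sample mean and standard deviation, which is a simplification (a live strategy would use rolling estimates), and the function names are illustrative.

```python
from math import log
from statistics import mean, pstdev


def hedge_ratio(s1, s2):
    """OLS fit of ln(s1) = alpha + beta * ln(s2); returns (alpha, beta)."""
    y = [log(p) for p in s1]
    x = [log(p) for p in s2]
    mx, my = mean(x), mean(y)
    beta = (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))
    return my - beta * mx, beta


def spread_zscores(s1, s2, beta):
    """Z-score of Spread = ln(s1) - beta * ln(s2) over the whole sample."""
    spread = [log(a) - beta * log(b) for a, b in zip(s1, s2)]
    m, sd = mean(spread), pstdev(spread)
    return [(s - m) / sd for s in spread]
```

The residuals of this regression are the series that the ADF test examines for a unit root.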
Cointegration vs correlation: Two stocks can be highly correlated
(move together short-term) but not cointegrated (diverge permanently long-term).
Cointegration is stronger: it means the SPREAD between the two prices is
stationary (mean-reverting). This is what makes pairs trading work.
HDFC Bank and ICICI Bank are both large private sector banks exposed to the same
Indian credit cycle, which is the economic reason to expect cointegration; the
test result (p = 0.023) supports it for this sample.
Half-Life of Mean Reversion
Spread_t = α + φ·Spread_{t-1} + ε_t
Half-life = -ln(2) / ln(φ)
φ = 0.962 → Half-life = -ln(2)/ln(0.962) ≈ 18 days
Z-score = (Spread - Mean) / Std(Spread)
Why half-life matters: Half-life tells you how quickly the spread
reverts to its mean. 18 days means: if the spread is 1 std dev away from mean today,
it is expected to be about 0.5 std dev away in 18 days.
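The half-life arithmetic is easy to reproduce. The sketch estimates φ by ordinary least squares on the lagged spread and then applies the formula; function names are illustrative, and the formula only makes sense for 0 < φ < 1.

```python
from math import log


def ar1_phi(spread):
    """OLS estimate of φ in Spread_t = α + φ·Spread_{t-1} + ε_t."""
    x, y = spread[:-1], spread[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))


def half_life(phi):
    """Periods for an AR(1) deviation to halve in expectation."""
    if not 0 < phi < 1:
        raise ValueError("half-life is defined only for 0 < phi < 1")
    return -log(2) / log(phi)


days = half_life(0.962)  # about 17.9, which rounds to the 18 days quoted
```

A φ close to 1 means slow reversion and long holding periods; small changes in φ near 1 move the half-life a lot, so the estimate is worth re-checking on rolling windows.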
Trading rule: Open position at |Z| > 2. Close at |Z| < 0.5.
Stop-loss at |Z| > 3 (in case cointegration breaks down).
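The trading rule can be written as a small state machine. The sketch assumes Spread = ln(S₁) - β·ln(S₂), so a high z-score means S₁ is rich relative to S₂ and the spread is sold. Not opening a new trade once |Z| is already past the stop level is an added assumption, and the function name is hypothetical.

```python
def pairs_position(z, current=0, entry=2.0, exit_level=0.5, stop=3.0):
    """Next spread position: +1 long spread (long S1, short S2),
    -1 short spread (short S1, long S2), 0 flat."""
    if current == 0:
        if entry < abs(z) < stop:
            return -1 if z > 0 else 1  # fade the deviation
        return 0
    if abs(z) < exit_level or abs(z) > stop:
        return 0  # take profit near the mean, or stop out
    return current  # otherwise hold


# Walk a hypothetical z-score path through the rule.
position, path = 0, []
for z in [0.3, 2.4, 1.6, 0.4, -2.2, -3.1]:
    position = pairs_position(z, position)
    path.append(position)
```

On this path the rule shorts the spread at 2.4, holds at 1.6, exits at 0.4, goes long at -2.2, and stops out at -3.1.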
Project 06 — Yield Curve Analyzer
RBI G-Sec yield curve fitted using Nelson-Siegel model. PCA decomposes movements into Level, Slope, Curvature factors. Duration and convexity for bond portfolio risk.
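Why Nelson-Siegel: Only 4 parameters (β₀, β₁, β₂ and a decay parameter λ)
describe the entire yield curve across all maturities. β₀ is the long-run interest
rate. β₁ sets the slope: the short end sits at β₀ + β₁, so a negative β₁ gives an
upward-sloping curve. β₂ controls the hump, and λ sets where along the maturity
axis it sits.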
RBI current curve: β₀=7.1, β₁=-0.8 (normal upward slope),
β₂=1.2 (slight hump around 5Y tenor).
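For reference, the Nelson-Siegel curve with these parameters can be evaluated as below. The parameterisation (maturity divided by a decay constant λ in years) is one common form, and λ = 2.8 is a hypothetical value chosen for illustration since the page does not quote it.

```python
from math import exp


def nelson_siegel(tau, beta0, beta1, beta2, lam):
    """Nelson-Siegel yield (%) at maturity tau > 0 years; lam in years."""
    x = tau / lam
    slope_loading = (1.0 - exp(-x)) / x
    curvature_loading = slope_loading - exp(-x)
    return beta0 + beta1 * slope_loading + beta2 * curvature_loading


# Betas quoted above; lam is an assumption, not a fitted value.
curve = {tau: nelson_siegel(tau, 7.1, -0.8, 1.2, lam=2.8)
         for tau in (0.25, 1, 2, 5, 10, 30)}
```

As maturity goes to zero the yield approaches β₀ + β₁ = 6.3%, and at very long maturities it approaches β₀ = 7.1%, which matches the interpretation of the two parameters.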
Duration & Convexity
Duration = -dP/dr × 1/P [Modified Duration]
ΔP/P ≈ -D×Δr + ½×C×(Δr)²
10Y bond (7% coupon) at par:
Duration ≈ 7.1 years
Convexity ≈ 58.4
DV01 = Duration × P × 0.0001
Duration in plain English: If interest rates rise by 1 percentage point, a bond
with modified duration 7.1 loses approximately 7.1% in value. That's why long-dated
bonds are riskier in a rising rate environment.
Convexity adjustment: Duration is a linear approximation. Convexity
corrects for the curvature — bonds gain MORE when rates fall than they lose when
rates rise by the same amount. This is positive convexity, a desirable property.
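A short worked example of the second-order approximation, using the duration and convexity figures quoted for the 10Y bond and a price of 100 for the DV01 line. The function name is illustrative.

```python
def price_change(duration, convexity, dr):
    """ΔP/P ≈ -D·Δr + ½·C·(Δr)², with Δr as a decimal (0.01 = 1 point)."""
    return -duration * dr + 0.5 * convexity * dr ** 2


rates_up = price_change(7.1, 58.4, 0.01)     # about -6.81%
rates_down = price_change(7.1, 58.4, -0.01)  # about +7.39%
dv01 = 7.1 * 100.0 * 0.0001                  # 0.071 per 100 of price
```

The gain when rates fall is larger than the loss when rates rise by the same amount; the difference between the two magnitudes is the convexity term counted twice.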
The key insight (Markowitz 1952): What matters is not just each
asset's own risk, but how assets move TOGETHER (covariance). Two risky assets with
negative correlation can form a portfolio with LOWER risk than either asset alone.
This is the math behind diversification.
Practical limitation: Mean-variance optimisation (MVO) is extremely sensitive
to expected return estimates (μ), which are very hard to estimate precisely. Small
errors in μ can lead to wildly different portfolio weights.
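The sensitivity is easy to demonstrate with two highly correlated assets. The sketch uses unconstrained weights proportional to Σ⁻¹μ, rescaled to sum to 1, with μ read as expected excess returns; all inputs are hypothetical.

```python
def mvo_weights_two_assets(mu, vol, corr):
    """Two-asset mean-variance weights, w ∝ Σ⁻¹μ, scaled to sum to 1."""
    s11, s22 = vol[0] ** 2, vol[1] ** 2
    s12 = corr * vol[0] * vol[1]
    det = s11 * s22 - s12 ** 2
    raw = [(s22 * mu[0] - s12 * mu[1]) / det,
           (s11 * mu[1] - s12 * mu[0]) / det]
    total = raw[0] + raw[1]
    return [raw[0] / total, raw[1] / total]


# Same risk, correlation 0.9; only the first expected return changes.
base = mvo_weights_two_assets(mu=[0.080, 0.080], vol=[0.20, 0.20], corr=0.9)
bumped = mvo_weights_two_assets(mu=[0.085, 0.080], vol=[0.20, 0.20], corr=0.9)
```

Raising one expected return by half a percentage point moves the allocation from 50/50 to roughly 79/21, even though nothing about the risk inputs has changed.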
Risk Parity vs MVO: MVO requires expected returns (hard to estimate).
Risk parity only requires the covariance matrix (more stable).
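A minimal sketch of the risk parity idea using only a covariance matrix. Inverse-volatility weights equalise risk contributions exactly for two assets (and when all correlations are equal); with more assets a numerical solver is normally needed. The equity and bond figures are hypothetical.

```python
from math import sqrt


def risk_contributions(weights, cov):
    """Share of portfolio variance contributed by each asset (sums to 1)."""
    n = len(weights)
    marginal = [sum(cov[i][j] * weights[j] for j in range(n))
                for i in range(n)]
    total = sum(weights[i] * marginal[i] for i in range(n))
    return [weights[i] * marginal[i] / total for i in range(n)]


def inverse_vol_weights(cov):
    """Weights proportional to 1/volatility, scaled to sum to 1."""
    inv = [1.0 / sqrt(cov[i][i]) for i in range(len(cov))]
    return [v / sum(inv) for v in inv]


# Hypothetical equity (20% vol) and bond (5% vol) with correlation 0.2.
cov = [[0.04, 0.002], [0.002, 0.0025]]
weights = inverse_vol_weights(cov)
shares = risk_contributions(weights, cov)
```

The result is a 20/80 equity/bond split in which each asset contributes half of the portfolio variance; no expected returns were needed to get there.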
Bridgewater All Weather: The most famous risk parity fund. Holds bonds,
equities, gold, and commodities weighted so each contributes equal risk. Uses leverage
to bring bonds to equity-equivalent risk level.
Weakness: Underperforms when bonds and equities fall together — as
in 2022 when rate rises hurt both asset classes simultaneously.