Cointegration and Pairs Trading: When Time Series Move Together
Berkan Sesen · 2026-05-24 · via DEV Community

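Pairs trading rests on a simple idea: find two assets that move together, wait for them to diverge, then bet on their convergence. The hard part is defining "move together." Two ETFs tracking commodity-exporting countries, EWA (Australia) and EWC (Canada), show a correlation of 0.95 over several years. A mean-reversion trader sees that number and assumes the spread must revert. But the spread drifts apart and does not close for months. The correlation was real, yet the strategy still lost money. The problem is that correlation only says the two series tend to move in the same direction, while cointegration says they are tied to a long-run equilibrium, so deviations are temporary and eventually correct themselves.

This distinction matters because most financial time series are non-stationary (they wander without a fixed mean). Two non-stationary series can show high correlation purely by chance (the "spurious regression" problem identified by Granger and Newbold). Cointegration is the formal test that their spread (or some linear combination of them) is stationary, meaning it genuinely reverts to a mean.

By the end of this post, you will test for cointegration with the Engle-Granger and Johansen methods, understand when and why they disagree, and build a simple pairs trading strategy on real ETF data.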

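The Data: Country ETF Pairs

I use two iShares country ETFs: EWA (Australia) and EWC (Canada). Both countries are commodity exporters with similar economic drivers (mining, energy, agriculture), so there is a fundamental reason to expect a long-run relationship. This is also the pair used in the original R analysis that this post ports to Python.

For comparison, I also test GLD (gold) and GDX (gold miners). Although the two are obviously related, gold miners carry idiosyncratic risks (management, costs, leverage) that can break cointegration.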

Dual-axis time series of EWA and EWC ETF prices from 2007 to 2023, showing similar patterns with occasional divergences

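The two ETFs clearly track each other over 17 years. They crashed together in 2008, recovered together, diverged briefly during COVID, and then converged again. But visual similarity is not proof of cointegration. We need a formal test.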

Quick Win: Testing for Cointegration


import numpy as np
import pandas as pd
import yfinance as yf
from statsmodels.tsa.stattools import adfuller

# Download EWA and EWC adjusted close prices.
# .squeeze() turns a one-column DataFrame into a Series; some yfinance
# versions return a DataFrame here even for a single ticker.
ewa = yf.download("EWA", start="2007-01-01", end="2023-12-31",
                   auto_adjust=True, progress=False)["Close"].squeeze()
ewc = yf.download("EWC", start="2007-01-01", end="2023-12-31",
                   auto_adjust=True, progress=False)["Close"].squeeze()

# Align on common trading days
common = ewa.index.intersection(ewc.index)
ewa, ewc = ewa.loc[common], ewc.loc[common]
print(f"{len(ewa)} trading days, {ewa.index[0].date()} to {ewa.index[-1].date()}")


4278 trading days, 2007-01-03 to 2023-12-29


The Engle-Granger test has two steps: regress one series on the other, then test whether the residuals are stationary.

from statsmodels.regression.linear_model import OLS

# Regress EWC on EWA (no intercept, following the original R code)
model = OLS(ewc.values, ewa.values).fit()
spread = model.resid
print(f"Hedge ratio: {model.params[0]:.4f}")

# ADF test on the residuals
adf_stat, adf_pval, _, _, crit_vals, _ = adfuller(spread, regression="n")
print(f"ADF statistic: {adf_stat:.4f}")
print(f"p-value: {adf_pval:.4f}")


Hedge ratio: 1.5674
ADF statistic: -3.1704
p-value: 0.0015


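The ADF test rejects the unit-root null hypothesis at the 1% level (p = 0.0015). The spread between EWC and 1.57 times EWA is stationary, which means the two ETFs are cointegrated. Any deviation from the long-run relationship tends to correct itself.

Here is what the spread looks like: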

Cointegration spread oscillating around zero with 2-sigma bands, showing mean-reverting behaviour

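The spread wanders, but it keeps returning to its mean. It does not drift away permanently in one direction the way a random walk would. This mean-reverting property is exactly what cointegration-based trading exploits.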

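What Just Happened?

Stationarity: The Key Concept

A stationary time series has a constant mean and variance. Take any time window and its summary statistics look roughly the same. Stock prices are almost never stationary (they trend up or down), but the spread between two cointegrated prices can be.

The Augmented Dickey-Fuller (ADF) test checks whether a series has a unit root (is non-stationary). The null hypothesis is "this series has a unit root" (bad for us). A small p-value lets us reject the null and conclude that the series is stationary (good for us).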
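To make the "any window looks roughly the same" idea concrete, here is a minimal sketch on simulated data (not the ETF prices). It compares the mean of four consecutive windows for a random walk and for a stationary AR(1) process. The seed, sample size, and AR coefficient of 0.5 are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
shocks = rng.normal(size=n)

# Random walk: every shock is permanent, so the level never settles.
random_walk = np.cumsum(shocks)

# AR(1) with coefficient 0.5: shocks decay, so the series hovers around zero.
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = 0.5 * ar1[t - 1] + shocks[t]

def window_means(series, n_windows=4):
    """Mean of each of n_windows consecutive, non-overlapping chunks."""
    return np.array([chunk.mean() for chunk in np.array_split(series, n_windows)])

rw_means = window_means(random_walk)
ar_means = window_means(ar1)
print("random walk window means:", np.round(rw_means, 2))
print("AR(1) window means:      ", np.round(ar_means, 2))
```

The AR(1) window means should all sit close to zero, while the random-walk means typically differ a lot from one window to the next. That instability is what a unit-root test formalises.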

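The Engle-Granger Two-Step Method

Proposed by Engle and Granger (1987), the method is straightforward:

  1. Regress one series on the other: $\text{EWC}_t = \beta \cdot \text{EWA}_t + \varepsilon_t$
  2. Test the residuals $\varepsilon_t$ for stationarity with the ADF test

If the residuals are stationary, the two series are cointegrated with cointegrating vector $[1, -\beta]$. The coefficient $\beta = 1.57$ is the hedge ratio. Because the regression is run on price levels, it says that each share of EWC is offset by 1.57 shares of EWA, which neutralises the common trend.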

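There is a subtlety: which series you treat as the dependent variable matters. The original R code runs the regression in both directions (EWC on EWA, and EWA on EWC) and keeps the one with the most negative ADF statistic. In our case, both directions give similar results.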
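The "run it both ways and keep the most negative statistic" idea can be sketched as follows. This is a sketch under stated assumptions, not the original R code: the data are synthetic, `hedge_ratio` and `df_tstat` are hypothetical helper names, and the statistic is a plain Dickey-Fuller t-ratio with no lags and no constant (statsmodels' `adfuller` also selects lags).

```python
import numpy as np

def hedge_ratio(y, x):
    """OLS slope of y on x with no intercept: beta = (x . y) / (x . x)."""
    return float(np.dot(x, y) / np.dot(x, x))

def df_tstat(resid):
    """t-ratio on alpha in diff(resid)_t = alpha * resid_{t-1} + error."""
    lagged, delta = resid[:-1], np.diff(resid)
    alpha = np.dot(lagged, delta) / np.dot(lagged, lagged)
    err = delta - alpha * lagged
    sigma2 = np.dot(err, err) / (len(delta) - 1)
    return float(alpha / np.sqrt(sigma2 / np.dot(lagged, lagged)))

# Synthetic cointegrated pair (illustrative only, not ETF data)
rng = np.random.default_rng(1)
n = 2000
x = 100 + np.cumsum(rng.normal(scale=0.5, size=n))   # random walk
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = 0.9 * noise[t - 1] + rng.normal()      # stationary AR(1)
y = 1.5 * x + noise

# Regress in both directions and record (beta, DF statistic) for each
candidates = {}
for name, (dep, indep) in {"y on x": (y, x), "x on y": (x, y)}.items():
    beta = hedge_ratio(dep, indep)
    candidates[name] = (beta, df_tstat(dep - beta * indep))

best = min(candidates, key=lambda k: candidates[k][1])  # most negative statistic
for name, (beta, stat) in candidates.items():
    print(f"{name}: beta={beta:.4f}, DF t-stat={stat:.2f}")
print("selected:", best)
```

The two slopes are not exact reciprocals of each other, which is why the direction of the regression can change the residuals and, in borderline cases, the test outcome.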

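Why Not Just Use Correlation?

Two series can have a correlation of 0.99 and still not be cointegrated. Imagine two random walks that both happen to trend upward over the same period. Their correlation will be high, but the spread between them will drift without bound. Conversely, two cointegrated series that temporarily diverge and then reconverge can have low short-term correlation. Correlation measures co-movement; cointegration measures whether the two series are on a leash.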
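A small simulation illustrates both halves of that argument. Everything here is an assumption chosen for illustration: the drift of 0.2, the noise scale of 5, and the use of "how often the spread crosses its own mean" as a rough stand-in for being on a leash.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 3000

# Case 1: two INDEPENDENT random walks that share only an upward drift.
a = np.cumsum(0.2 + rng.normal(size=n))
b = np.cumsum(0.2 + rng.normal(size=n))

# Case 2: a cointegrated pair. q tracks p, but with large stationary noise.
p = np.cumsum(rng.normal(size=n))
q = p + rng.normal(scale=5.0, size=n)

def mean_crossings(spread):
    """How many times the spread crosses its own sample mean."""
    signs = np.sign(spread - spread.mean())
    return int(np.sum(signs[1:] != signs[:-1]))

corr_levels_indep = np.corrcoef(a, b)[0, 1]                      # price levels
corr_changes_coint = np.corrcoef(np.diff(p), np.diff(q))[0, 1]   # daily changes

print(f"independent walks: level correlation = {corr_levels_indep:.2f}, "
      f"spread mean-crossings = {mean_crossings(a - b)}")
print(f"cointegrated pair: change correlation = {corr_changes_coint:.2f}, "
      f"spread mean-crossings = {mean_crossings(q - p)}")
```

The independent walks should show a high level correlation but a spread that rarely revisits its mean. The cointegrated pair shows the opposite pattern: weak day-to-day correlation but a spread that keeps snapping back.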

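The Johansen Test: A Multivariate Approach

Engle-Granger only handles pairs. The Johansen test, introduced by Søren Johansen (1991), can handle any number of time series at once. It works within a vector autoregression (VAR) framework and estimates the cointegration rank: how many independent cointegrating relationships exist among the series?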
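For reference, the textbook form of the model behind the test is sketched below. The notation is the standard one and is not taken from the original post: $y_t$ is the vector of prices, $T$ the effective sample size, and $\hat{\lambda}_i$ the ordered eigenvalues produced by the procedure. A single lagged difference and a constant $\mu$ are assumed here.

```latex
% Vector error-correction form with one lagged difference
\Delta y_t = \Pi\, y_{t-1} + \Gamma_1\, \Delta y_{t-1} + \mu + \varepsilon_t,
\qquad \Pi = \alpha \beta^{\top}, \quad \operatorname{rank}(\Pi) = r

% Trace statistic for H_0: rank(\Pi) \le r, with n series
\lambda_{\text{trace}}(r) = -T \sum_{i=r+1}^{n} \ln\!\left(1 - \hat{\lambda}_i\right)
```

With two series, rejecting $r = 0$ means at least one cointegrating relationship exists, and the columns of $\beta$ play the role of the cointegrating vector $[1, -\beta]$ from the Engle-Granger regression.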

from statsmodels.tsa.vector_ar.vecm import coint_johansen

data = np.column_stack([ewa.values, ewc.values])
result = coint_johansen(data, det_order=0, k_ar_diff=1)

print(f"Trace statistic (r=0): {result.lr1[0]:.2f}")
print(f"95% critical value:    {result.cvt[0, 1]:.2f}")


Trace statistic (r=0): 16.66
95% critical value:    15.49


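The trace statistic (16.66) exceeds the 95% critical value (15.49), so Johansen also rejects the null hypothesis of no cointegration. Both methods agree for EWA/EWC.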

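When Tests Disagree

The original R code used a shorter date range, over which Engle-Granger found only weak cointegration (p = 7%) and Johansen did not detect it at all. This highlights an important point: cointegration tests are sensitive to the sample period, structural breaks, and lag selection. An event like the 2008 financial crisis can disrupt the relationship. When the two methods disagree, it usually means the cointegration is weak or time-varying, not that you should pick whichever result you prefer.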

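Going Deeper

Pairs Trading: Exploiting Mean Reversion

If the spread is stationary, we can trade its mean reversion. The strategy is simple:

  1. Compute the rolling z-score of the spread: $z_t = \frac{s_t - \bar{s}_{60}}{\sigma_{60}}$
  2. Buy the spread when $z < -2$ (spread is unusually cheap)
  3. Sell the spread when $z > +2$ (spread is unusually expensive)
  4. Exit when $z$ crosses zero (spread has reverted)

"Buying the spread" means going long EWC and short EWA (in proportion to the hedge ratio). "Selling the spread" is the reverse.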

Rolling z-score of the spread with buy and sell thresholds at negative and positive 2

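The z-score oscillates between roughly -4 and +4, regularly crossing the trading thresholds. Each crossing is a potential trade entry or exit.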
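The entry and exit rules in this section can be sketched with NumPy alone. This is one possible reading of the rules, not the post's own implementation: `rolling_zscore` and `positions_from_z` are hypothetical names, the rolling window is assumed to include the current observation, and the standard deviation is assumed to be the sample version (`ddof=1`).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_zscore(spread, window=60):
    """z_t = (s_t - rolling mean) / rolling std over the trailing `window` points."""
    spread = np.asarray(spread, dtype=float)
    z = np.full(spread.shape, np.nan)            # first window-1 values are undefined
    windows = sliding_window_view(spread, window)
    mean = windows.mean(axis=1)
    std = windows.std(axis=1, ddof=1)
    z[window - 1:] = (spread[window - 1:] - mean) / std
    return z

def positions_from_z(z, entry=2.0):
    """+1 = long the spread, -1 = short the spread, 0 = flat."""
    pos = np.zeros(len(z))
    current = 0
    for t, zt in enumerate(z):
        if np.isnan(zt):
            pass                                  # not enough history yet
        elif current == 0:
            if zt < -entry:
                current = 1                       # spread unusually cheap: buy
            elif zt > entry:
                current = -1                      # spread unusually expensive: sell
        elif current == 1 and zt >= 0:
            current = 0                           # reverted: exit the long
        elif current == -1 and zt <= 0:
            current = 0                           # reverted: exit the short
        pos[t] = current
    return pos

demo_z = np.array([0.0, -2.5, -1.0, 0.5, 2.5, 1.0, -0.2])
print(positions_from_z(demo_z))
```

Keeping the position as explicit state avoids a common mistake with purely vectorised thresholds: a z-score sitting between -2 and 0 must keep an existing long open rather than be treated as flat.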

Backtest Results

Running this simple strategy on 17 years of EWA/EWC data:

Cumulative PnL curve and position indicator for the pairs trading backtest

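The strategy generates a cumulative PnL of about $19 per unit of spread across 135 trades, with an annualised Sharpe ratio of 0.69. The equity curve trends mostly upward, but there is a notable drawdown in 2012-2014, when the spread drifted for an extended period.

This is a toy backtest (no transaction costs, slippage, or financing costs). A real implementation needs far more care, but the core signal (a mean-reverting spread) is real.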
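A minimal sketch of the accounting behind such a backtest is shown below, including a place to plug in a cost. The details are assumptions rather than the post's code: `backtest` and `cost_per_unit` are hypothetical names, positions are applied with a one-bar delay to avoid look-ahead, and the Sharpe ratio is annualised with 252 trading days.

```python
import numpy as np

def backtest(spread, positions, cost_per_unit=0.0):
    """Return (cumulative PnL per unit of spread, annualised Sharpe, number of entries)."""
    spread = np.asarray(spread, dtype=float)
    positions = np.asarray(positions, dtype=float)

    # The position chosen at t-1 earns the spread change from t-1 to t.
    held = np.concatenate(([0.0], positions[:-1]))
    gross = held * np.diff(spread, prepend=spread[0])

    # Charge a cost each time the position changes (hypothetical flat cost).
    turnover = np.abs(np.diff(positions, prepend=0.0))
    net = gross - cost_per_unit * turnover

    sd = net.std(ddof=1)
    sharpe = np.sqrt(252) * net.mean() / sd if sd > 0 else float("nan")
    n_entries = int(np.sum((held == 0) & (positions != 0)))
    return net.cumsum(), sharpe, n_entries

# Tiny hand-checkable example: enter at 11, exit at 12
spread = [10.0, 11.0, 13.0, 12.0]
positions = [0, 1, 1, 0]
cum_free, _, n_entries = backtest(spread, positions)
cum_cost, _, _ = backtest(spread, positions, cost_per_unit=0.1)
print(cum_free[-1], cum_cost[-1], n_entries)
```

Even a small per-trade cost compounds over 135 trades, which is why a frictionless equity curve should be read as an upper bound.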

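When a Pair Fails: GLD vs GDX

To see what non-cointegration looks like, consider GLD (gold) and GDX (gold miners). Despite the obvious fundamental link, gold miners carry company-specific risks that break the long-run equilibrium.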

ADF test comparison showing EWA/EWC clearly rejecting the unit root while GLD/GDX does not

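The ADF test statistic for EWA/EWC (-3.17) is well beyond all the critical values. For GLD/GDX (-1.64), it does not even pass at the 10% level. The Johansen test confirms this: GLD/GDX shows no sign of cointegration (trace statistic 13.38 < critical value 15.49).

This is why fundamental reasoning alone is not enough. You need the statistical test.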

Autocorrelation: Evidence of Stationarity

The autocorrelation function (ACF) of the spread shows its stationarity:

Autocorrelation plot of the spread showing high persistence but gradual decay

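The ACF decays gradually from 1.0, which indicates a stationary but persistent process. For a truly non-stationary series, the autocorrelation would barely decay at all. The gradual decay confirms that the spread reverts to its mean, but slowly (a half-life of about three months, judging by the decay rate).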
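One common way to turn a decay rate into a half-life is to treat the spread as an AR(1) process, $s_t = \phi s_{t-1} + \text{noise}$, and compute $-\ln 2 / \ln \phi$. The sketch below assumes that model, which may not be how the post's figure was derived. The helper names and the simulated series are illustrative.

```python
import numpy as np

def acf(spread, max_lag):
    """Sample autocorrelation at lags 0..max_lag."""
    s = np.asarray(spread, dtype=float) - np.mean(spread)
    denom = np.dot(s, s)
    return np.array([np.dot(s[: len(s) - k], s[k:]) / denom
                     for k in range(max_lag + 1)])

def ar1_coefficient(spread):
    """Least-squares estimate of phi in s_t = phi * s_{t-1} + noise (demeaned)."""
    s = np.asarray(spread, dtype=float) - np.mean(spread)
    return float(np.dot(s[:-1], s[1:]) / np.dot(s[:-1], s[:-1]))

def half_life(phi):
    """Periods for a deviation to shrink by half; requires 0 < phi < 1."""
    return float(-np.log(2.0) / np.log(phi))

# Simulated persistent-but-stationary spread (illustrative only)
rng = np.random.default_rng(7)
true_phi = 0.97
s = np.zeros(20000)
for t in range(1, len(s)):
    s[t] = true_phi * s[t - 1] + rng.normal()

phi_hat = ar1_coefficient(s)
print(f"phi_hat = {phi_hat:.3f}, half-life = {half_life(phi_hat):.1f} periods")
print("ACF at lags 0, 10, 50:", np.round(acf(s, 50)[[0, 10, 50]], 2))
```

Under this model, a half-life of about three months (roughly 63 trading days) corresponds to a daily coefficient near 0.989, which is why the ACF looks so persistent even though the series is stationary.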

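Hyperparameter Choices

| Parameter | Value | Rationale |
| --- | --- | --- |
| ETF pair | EWA/EWC | Pair from the original R code; commodity exporters with a fundamental economic link |
| Date range | 2007-2023 | 17 years covering multiple market regimes (global financial crisis, COVID) |
| ADF regression | No constant | Matches the original R code (type="nc"); the spread should be zero-mean |
| Johansen settings | det_order=0, k_ar_diff=1 | Matches R ecdet="none", K=2 |
| Z-score window | 60 days | About three months of trading days; balances responsiveness and stability |
| Entry threshold | ±2 standard deviations | Standard for pairs trading; about 5% of observations fall in the tails |
| Exit threshold | z crosses zero | The spread has reverted to its mean |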

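Where This Comes From

Engle & Granger (1987): The Nobel Prize Paper

Engle and Granger introduced cointegration theory in their 1987 paper "Co-Integration and Error Correction: Representation, Estimation, and Testing", published in Econometrica. This work earned Granger the 2003 Nobel Prize in Economics (shared with Engle, who was recognised for ARCH models).

The key insight is that although individual economic time series may be non-stationary (integrated of order one, or I(1)), a linear combination of them can be stationary (I(0)). This formalises the intuition that certain economic variables are "tied together" by equilibrium forces, even though each one wanders on its own.

"A test for co-integration can be thought of as a pre-test to avoid 'spurious regression' situations."
-- Engle & Granger (1987)

The method I applied (regress first, then test the residuals) is exactly their original approach. Simple and intuitive, it remains the most widely used method for pairwise cointegration testing.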

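Johansen (1991): The Multivariate Extension

Søren Johansen's 1991 paper "Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models" generalised cointegration testing to any number of variables. Instead of running pairwise regressions, the Johansen trace test estimates the rank of the cointegration matrix directly through an eigenvalue decomposition.

For two variables, Johansen and Engle-Granger usually agree. For three or more variables (such as a basket of commodity ETFs), Johansen is the only practical option.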

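The Dickey-Fuller Foundation

Both methods ultimately rest on the unit-root machinery of the Augmented Dickey-Fuller test (Dickey & Fuller, 1979). The ADF test fits the model $\Delta y_t = \alpha y_{t-1} + \sum \gamma_i \Delta y_{t-i} + \varepsilon_t$ and tests $\alpha = 0$ (unit root) against $\alpha < 0$ (stationary). The test statistic does not follow a standard t-distribution, so it needs special critical values (tabulated by Dickey and Fuller).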
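The regression above can be fitted directly with least squares to see where the test statistic comes from. This is a bare-bones sketch with a fixed lag count and no constant. `adf_tstat` is a hypothetical name, and unlike statsmodels' `adfuller` it does not choose lags automatically or return a p-value.

```python
import numpy as np

def adf_tstat(y, n_lags=1):
    """t-ratio on alpha in  dy_t = alpha*y_{t-1} + sum_i gamma_i*dy_{t-i} + e_t."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)                                   # dy[j] = y[j+1] - y[j]
    target = dy[n_lags:]                              # left-hand side
    cols = [y[n_lags:-1]]                             # lagged level y_{t-1}
    for i in range(1, n_lags + 1):
        cols.append(dy[n_lags - i: len(dy) - i])      # lagged differences dy_{t-i}
    X = np.column_stack(cols)

    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ coef
    sigma2 = resid @ resid / (len(target) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return float(coef[0] / np.sqrt(cov[0, 0]))

# Compare a unit-root series with a stationary one (simulated)
rng = np.random.default_rng(3)
random_walk = np.cumsum(rng.normal(size=2000))
ar1 = np.zeros(2000)
for t in range(1, 2000):
    ar1[t] = 0.9 * ar1[t - 1] + rng.normal()

t_rw, t_ar = adf_tstat(random_walk), adf_tstat(ar1)
print(f"random walk: {t_rw:.2f}   stationary AR(1): {t_ar:.2f}")
```

The number is computed like an ordinary t-ratio, but it has to be compared against Dickey-Fuller critical values (roughly -1.94 at the 5% level for this no-constant form) rather than the usual -1.96 from the normal table.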

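Pairs Trading in Practice

The academic foundation for pairs trading as a strategy comes from Gatev, Goetzmann & Rouwenhorst (2006), "Pairs Trading: Performance of a Relative-Value Arbitrage Rule". They tested pairs trading on US equities from 1962 to 2002 and found annualised returns of about 11% for the best pairs.

For a comprehensive treatment, Vidyamurthy (2004), Pairs Trading: Quantitative Methods and Analysis, covers the full workflow from pair selection to execution.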

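Further Reading

  • The Nobel Prize paper: Engle & Granger (1987), "Co-Integration and Error Correction", Econometrica
  • The multivariate extension: Johansen (1991), "Estimation and Hypothesis Testing of Cointegration Vectors"
  • Unit root foundations: Dickey & Fuller (1979), "Distribution of the Estimators for Autoregressive Time Series"
  • Pairs trading evidence: Gatev et al. (2006), "Pairs Trading: Performance of a Relative-Value Arbitrage Rule"
  • Practical guide: Vidyamurthy (2004), Pairs Trading: Quantitative Methods and Analysis, Wiley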

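Frequently Asked Questions

What is the difference between correlation and cointegration?

Correlation measures whether two series tend to move in the same direction over short periods. Cointegration tests whether a linear combination of the two series is stationary, meaning that deviations from their long-run relationship are temporary and self-correcting. Two highly correlated series can drift apart forever; two cointegrated series are bound by an equilibrium that pulls them back together.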

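Can cointegration break down over time?

Yes. Cointegration is not permanent. Structural changes in the economy, shifts in industry dynamics, or regulatory changes can all destroy a previously stable relationship. That is why practitioners re-test cointegration on rolling windows and watch for changes in the spread's behaviour that signal a regime shift.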
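As a lightweight illustration of rolling-window monitoring, the sketch below tracks the no-intercept hedge ratio over successive windows of simulated data in which the true ratio jumps halfway through. This is a simpler proxy than re-running a full cointegration test in each window. The window length, the step size, and the name `rolling_hedge_ratio` are all assumptions.

```python
import numpy as np

def rolling_hedge_ratio(y, x, window=250, step=50):
    """No-intercept OLS slope of y on x in each rolling window."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    starts = np.arange(0, len(y) - window + 1, step)
    betas = np.array([
        np.dot(x[s:s + window], y[s:s + window]) / np.dot(x[s:s + window], x[s:s + window])
        for s in starts
    ])
    return starts, betas

# Simulated pair whose relationship changes halfway through the sample
rng = np.random.default_rng(11)
n = 1000
x = 100 + np.cumsum(rng.normal(scale=0.5, size=n))
true_beta = np.where(np.arange(n) < n // 2, 1.0, 2.0)
y = true_beta * x + rng.normal(size=n)

starts, betas = rolling_hedge_ratio(y, x)
for s, b in zip(starts, betas):
    print(f"window starting at {s:4d}: beta = {b:.3f}")
```

A hedge ratio that is stable across windows is consistent with a persistent relationship. A sustained shift like the one simulated here is a cue to re-run the formal tests on recent data before trusting the old spread.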

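Why do the Engle-Granger and Johansen tests sometimes disagree?

The two methods take different routes. Engle-Granger runs a single regression and tests its residuals; Johansen works within a vector autoregression framework. When cointegration is weak, when the sample period contains structural breaks, or when the lag choices differ, the two can disagree. Disagreement is usually a sign that the relationship is fragile rather than robust.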

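What is the hedge ratio, and why does it matter for pairs trading?

The hedge ratio is the coefficient from the cointegrating regression. It tells you how much of one asset to hold against the other so that the combined position has a stationary spread. With the wrong hedge ratio, the spread drifts instead of reverting, which defeats the purpose of the strategy.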
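A short derivation shows why a wrong ratio leaves drift in the spread. It assumes the simple two-series model from the Engle-Granger section, with $y_t$ and $x_t$ standing in for the two prices and $b$ a hypothetical, misestimated hedge ratio.

```latex
% True relationship: x_t is a random-walk-like I(1) series, e_t is stationary
y_t = \beta x_t + e_t, \qquad x_t \sim I(1), \quad e_t \sim I(0)

% Spread built with a hedge ratio b instead of \beta
y_t - b\,x_t = (\beta - b)\,x_t + e_t
```

Unless $b = \beta$, the spread inherits a scaled copy of the non-stationary $x_t$, so it wanders with the market instead of reverting to a fixed mean.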

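Is pairs trading still profitable in today's markets?

Academic evidence suggests that pairs trading profits have declined since the strategy became widely known in the early 2000s. It can still be profitable in less liquid markets, when combined with fundamental analysis for pair selection, or when enhanced with more sophisticated signal generation. Transaction costs and execution quality are critical.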

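Do I need to difference my price series before testing for cointegration?

No. Cointegration tests must be run on the original (undifferenced) price levels. The whole point is to find a linear combination of non-stationary I(1) series that is itself stationary, I(0). If you difference first, you remove the very relationship you are trying to detect.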