Where can I find backtest data?

Backtesting requires the proportional adjustment method (return-based adjustment). Please give a simple explanation with examples, explain why the add-subtract method should not be used for adjustment, and recommend Python historical data sources that use the proportional method.

Okay, let’s explain the proportional adjustment method (return-based adjustment) in an easy-to-understand way, show why the add-subtract method is not suitable, and recommend some Python historical data sources that use the proportional method.

Core Concepts: Why Is Price Adjustment Necessary?

In the world of investing, stock prices don’t just fluctuate due to buying and selling. Corporate actions such as cash dividends, stock dividends, and stock splits change the share price directly, but these changes don’t reflect any real growth or decline in the company’s value.

Imagine that your stock closed at 100 yuan yesterday, and today it goes ex-dividend with a cash dividend of 5 yuan per share. Because the payout reduces the company’s value by the same amount, the exchange lowers the reference price by 5 yuan, and the stock opens at 95 yuan.

If you computed the percentage change directly from today’s 95 yuan and yesterday’s 100 yuan, you would conclude that you lost 5%. This is clearly incorrect: you now have 5 yuan of cash in your account, so your total assets haven’t decreased.

Price adjustment aims to remove these price “gaps” caused by non-market events (such as dividends and stock splits) and restore the true trend of the stock price, so that you can calculate returns accurately and backtest strategies.
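The ex-dividend arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function names `naive_return` and `total_return` are made up for illustration, and the numbers are the 100 yuan / 5 yuan example from the text.

```python
def naive_return(prev_close, price):
    """Price-only return, ignoring any cash received."""
    return (price - prev_close) / prev_close


def total_return(prev_close, price, dividend=0.0):
    """Return that also counts a cash dividend received per share."""
    return (price + dividend - prev_close) / prev_close


prev_close, ex_div_open, dividend = 100.0, 95.0, 5.0
print(f"naive: {naive_return(prev_close, ex_div_open):.2%}")            # naive: -5.00%
print(f"total: {total_return(prev_close, ex_div_open, dividend):.2%}")  # total: 0.00%
```

The unadjusted price series reports a loss that never happened, which is exactly the gap that adjustment is meant to remove.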

Proportional Method (Return-Based Adjustment): A Simple Example

Core Idea: The proportional method assumes that every dividend you receive is immediately reinvested in the stock at the prevailing price on the ex-dividend date. It tracks the growth rate of total assets rather than the raw price.

Example: Assume you bought 1 share of “Magic Company” stock for $100 on Day 1. Your total assets are $100.

On Day 2, the market hasn’t changed, but Magic Company goes ex-dividend with a cash dividend of $2 per share.

  • After the dividend, the price automatically drops from $100 to $98.
  • At this point, your holding is: 1 share (valued at $98) + $2 cash.
  • Your total assets remain at $98 + $2 = $100, unchanged.

On Day 3, the market rises, and Magic Company’s stock price increases from $98 to $102.9.

  • What is the increase? It’s (102.9 - 98) / 98 = 5%.
  • How much are your total assets now worth?
    • If you kept the dividend as cash: 1 share (valued at $102.9) + $2 cash = $104.9, a total return of (104.9 - 100) / 100 = 4.9% since Day 1.
    • If you reinvested the $2 on Day 2 at $98: you hold 1 + 2/98 shares, worth (1 + 2/98) × 102.9 = $105, a total return of 5% since Day 1.

Calculation Logic: The proportional method builds a continuous adjusted price series whose day-to-day changes equal the growth rate of total assets: 0% from Day 1 to Day 2, and 5% from Day 2 to Day 3. Taking the latest closing price as the baseline and working backwards:

  • Day 3 Adjusted Price: $102.9 (the baseline, unchanged).
  • Day 2 Adjusted Price: $98 (no corporate action happened after Day 2, so it is unchanged).
  • Day 1 Adjusted Price: the dividend dropped the price from $100 to $98, a ratio of 98/100 = 0.98, so every price before the ex-dividend date is multiplied by 0.98. The adjusted Day 1 price is 100 × 0.98 = $98.

The adjusted series is therefore 98, 98, 102.9. Its daily changes are 0% and then 5%, and the change from Day 1 to Day 3 is (102.9 - 98) / 98 = 5%, exactly the return you would have earned by reinvesting the dividend.

Conclusion: The proportional method adjusts historical prices so that the change between any two adjusted prices equals the total return you would have obtained by reinvesting all dividends, which is why it is the most accurate basis for backtesting.
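The adjustment walked through above can be sketched in Python. This is a simplified illustration, not a data vendor’s actual implementation: the function name `proportional_adjust` is hypothetical, it handles cash dividends only (no splits), and it assumes the adjustment ratio on each ex-dividend date is (previous close - dividend) / previous close.

```python
def proportional_adjust(closes, dividends):
    """Scale earlier prices so the latest close stays unchanged.

    closes[i]    : raw closing price on day i
    dividends[i] : cash dividend per share going ex on day i (0 if none)
    """
    adjusted = [0.0] * len(closes)
    factor = 1.0  # cumulative ratio applied to all days before an ex-date
    for i in range(len(closes) - 1, -1, -1):
        adjusted[i] = closes[i] * factor
        if dividends[i] > 0 and i > 0:
            # Assumed ex-date ratio: (previous close - dividend) / previous close
            factor *= (closes[i - 1] - dividends[i]) / closes[i - 1]
    return adjusted


closes = [100.0, 98.0, 102.9]   # Day 1, Day 2 (ex-dividend), Day 3
dividends = [0.0, 2.0, 0.0]
adj = proportional_adjust(closes, dividends)
print([round(p, 2) for p in adj])       # [98.0, 98.0, 102.9]
print(f"{adj[2] / adj[0] - 1:.2%}")     # 5.00%
```

Because each correction is a ratio, the percentage change between any two adjusted prices is preserved no matter how many dividends lie between them.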

Why Not Use the “Add-Subtract Method”?

Core Idea: The add-subtract method simply adds the dividend amount back onto the prices recorded after the dividend was paid.

Example (reusing the numbers from the previous example):

  • Day 1 Closing Price: 100 yuan
  • Day 2 Dividend payout of 2 yuan, Closing Price: 98 yuan
  • Day 3 Increases by 5%, Closing Price: 102.9 yuan

The Flawed Logic of Add-Subtract:

The method reasons that the price is 98 yuan on Day 2 only because a 2 yuan dividend was deducted, so it adds this 2 yuan “back” to every price from the ex-dividend date onwards.

  • It calculates the second day’s “adjusted price” as 98 + 2 = 100 yuan.
  • It calculates the third day’s “adjusted price” as 102.9 + 2 = 104.9 yuan.

Now, let’s use this “adjusted price” sequence to calculate the percentage change on Day 3:

  • Percentage Change = (104.9 - 100) / 100 = 4.9%

Where Does the Error Lie?

This 4.9% change is incorrect! As we analyzed earlier, the actual increase in the stock price is (102.9 - 98) / 98 = 5%. The add-subtract method understates the stock’s true return.
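The add-subtract calculation above can be reproduced with a short sketch. The function name `additive_adjust` is hypothetical, and it follows the variant used in this example, where each dividend is added to every price from the ex-dividend date onwards.

```python
def additive_adjust(closes, dividends):
    """Add every dividend paid so far back onto the raw closing price."""
    adjusted, paid = [], 0.0
    for close, dividend in zip(closes, dividends):
        paid += dividend
        adjusted.append(close + paid)
    return adjusted


closes = [100.0, 98.0, 102.9]   # Day 1, Day 2 (ex-dividend), Day 3
dividends = [0.0, 2.0, 0.0]
adj = additive_adjust(closes, dividends)

additive_ret = adj[2] / adj[1] - 1        # Day 3 change on the add-subtract series
actual_ret = closes[2] / closes[1] - 1    # Day 3 change of the stock itself
print(f"add-subtract: {additive_ret:.2%}, actual: {actual_ret:.2%}")
# add-subtract: 4.90%, actual: 5.00%
```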

Why Does It Underestimate?

Because the add-subtract method ignores compounding. The proportional method assumes that your 2 yuan dividend is reinvested and also grows by 5%, while the add-subtract method crudely assumes that this 2 yuan stays 2 yuan forever and never takes part in later appreciation. The error grows with time and with every additional dividend, and it can seriously distort your backtesting results, especially for high-dividend stocks.
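To see how the gap widens over time, here is a toy simulation. Everything in it is an assumption for illustration: the function name `compare_methods`, the 5% yearly price rise, and the 2% yearly payout are invented numbers, not market data.

```python
def compare_methods(years, growth=0.05, payout=0.02, start=100.0):
    """Return (reinvested_value, add_subtract_value) after `years` yearly cycles.

    Each year the price rises by `growth`, then a dividend equal to `payout`
    of the price is paid and the price drops by that amount.
    """
    price, shares, cash = start, 1.0, 0.0
    for _ in range(years):
        price *= 1 + growth
        dividend = price * payout
        price -= dividend
        shares += shares * dividend / price  # proportional view: reinvest at the ex-dividend price
        cash += dividend                     # add-subtract view: the dividend stays as idle cash
    return shares * price, price + cash


for years in (1, 10, 20):
    reinvested, add_subtract = compare_methods(years)
    print(f"{years:>2} years: reinvested={reinvested:.2f}, "
          f"add-subtract={add_subtract:.2f}, gap={reinvested - add_subtract:.2f}")
```

After a single dividend the two values still agree, but the gap keeps growing with each further payout, because the idle cash in the add-subtract view never compounds.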

In one sentence: the add-subtract method destroys the “growth rate” information in the price series and leads to incorrect return calculations, while the proportional method preserves the true returns, making it the correct choice for backtesting.

Recommended Python Data Sources for Proportionally Adjusted Historical Data

In practice, we rarely need to calculate the adjustments ourselves. Professional data providers offer adjusted prices directly, and you simply select the correct price type when calling the API. This is commonly referred to as the “Adjusted Price”.

Here are some well-regarded data sources for obtaining adjusted historical data through Python:

  1. yfinance (Yahoo Finance)

    • Advantages: Completely free and easy to use, the preferred choice for individual developers and beginners. Its adjusted prices are computed proportionally: the latest price is left unchanged and earlier prices are scaled for dividends and splits (what Chinese data vendors call forward adjustment).
    • Disadvantages: Data may have occasional quality issues or delays. For rigorous production strategies, a more professional data source might be needed.
  2. TuShare

    • Advantages: A very popular financial data interface in China, providing rich data for A-shares, Hong Kong shares, US stocks, etc. Data quality is high. Access is governed by a points system, but basic data is free. It offers explicit adjustment factors and adjusted market data interfaces.
    • Disadvantages: Requires registration to obtain a token; some advanced data or high-frequency calls require points.
  3. baostock

    • Advantages: Free, open-source Chinese A-share securities data platform. Data is relatively stable and accurate, and it provides both forward- and backward-adjustment options.
    • Disadvantages: Primarily covers the A-share market.
  4. Commercial-Grade Data Sources (Quandl, FactSet, Refinitiv, Bloomberg)

    • Advantages: Highest data quality, widest coverage, and the most timely updates, with professional APIs and technical support.
    • Disadvantages: Expensive, primarily aimed at financial institutions and corporate users.

Recommendations for Beginners: Start with yfinance or TuShare. They fully meet the needs of learning, research, and personal backtesting projects, and will help you understand and apply proportionally adjusted data. Be sure to select the “Adjusted” option when calling the API.
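As a rough sketch of what “selecting the adjusted option” looks like for the three free sources, here are small wrapper functions. The wrapper names are hypothetical. The library calls follow each project’s documented interface as I understand it, but parameters can change between versions, so check the current documentation before relying on them. All three need network access, and TuShare also needs your own token.

```python
def yfinance_adjusted(symbol, start, end):
    """Daily bars from yfinance with dividend- and split-adjusted prices."""
    import yfinance as yf  # third-party: pip install yfinance

    # auto_adjust=True returns OHLC already adjusted, so "Close" is the adjusted close.
    return yf.Ticker(symbol).history(start=start, end=end, auto_adjust=True)


def tushare_adjusted(ts_code, start_date, end_date, token):
    """Daily bars from TuShare Pro, forward-adjusted. Dates use YYYYMMDD."""
    import tushare as ts  # third-party: pip install tushare

    ts.set_token(token)
    # adj: None (raw), "qfq" (forward-adjusted) or "hfq" (backward-adjusted)
    return ts.pro_bar(ts_code=ts_code, adj="qfq",
                      start_date=start_date, end_date=end_date)


def baostock_adjusted(code, start_date, end_date):
    """Daily bars from baostock, forward-adjusted. Dates use YYYY-MM-DD."""
    import baostock as bs  # third-party: pip install baostock
    import pandas as pd

    bs.login()
    try:
        rs = bs.query_history_k_data_plus(
            code, "date,code,open,high,low,close,volume",
            start_date=start_date, end_date=end_date,
            frequency="d",
            adjustflag="2",  # "1" backward-adjusted, "2" forward-adjusted, "3" unadjusted
        )
        rows = []
        while rs.error_code == "0" and rs.next():
            rows.append(rs.get_row_data())
        return pd.DataFrame(rows, columns=rs.fields)
    finally:
        bs.logout()
```

Example calls would be `yfinance_adjusted("AAPL", "2020-01-01", "2021-01-01")`, `tushare_adjusted("000001.SZ", "20200101", "20201231", token="YOUR_TOKEN")`, and `baostock_adjusted("sh.600000", "2020-01-01", "2020-12-31")`. The imports sit inside the functions so that you only need to install the library you actually use.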

Licensed under CC BY-NC-SA 4.0
Last updated on Jun 27, 2025 19:41