The Bias of Ratio-Based Profit Indicators: The Harmonic Mean Theorem

Abstract

This article shows that ratio-based profit level indicators (PLIs)—including return on assets (ROA), Berry ratio, and operating margin—are biased estimators of the structural profit rate when the underlying relationship between profit and the denominator variable includes a nonzero intercept. The bias equals a/H, where a is the intercept and H is the harmonic mean of the denominator. This bias propagates to all order statistics, making the OECD-sanctioned practice of computing interquartile ranges from ratios methodologically unsound. The unbiased remedy is ordinary least squares (OLS) regression of profit on the denominator variable, with generalized least squares (GLS) employed when heteroskedasticity is present.

1. Introduction

Transfer pricing practice relies heavily on ratio-based profit level indicators (PLIs) to establish arm’s length ranges. The OECD Transfer Pricing Guidelines sanction the computation of interquartile ranges from ratios such as return on assets, Berry ratio, and net cost plus. Yet this practice rests on an algebraic assumption that is never tested: that the intercept of the underlying profit–denominator relationship is zero.

This article presents the harmonic mean bias theorem, which shows that when the intercept is nonzero, every ratio-based statistic (mean, median, quartiles) is biased. The bias is not a minor econometric nuisance; it is a first-order distortion under which the interquartile range reflects the dispersion of 1/X across comparables rather than genuine variation in profitability.

2. The Structural Model

Let the levels equation relating profit Y(i) to a denominator variable X(i) be:

(1)\quad Y(i) = a + b \cdot X(i), \quad i = 1, \dots, N

where 𝑎 is the intercept, 𝑏 is the slope (the structural profit ratio), and 𝑖 indexes firms or observations. The denominator 𝑋(𝑖) stands for total assets, operating costs, sales, or any other base relevant to the PLI under consideration.

Define the ratio-based PLI as:

(2)\quad m(i) \equiv \frac{Y(i)}{X(i)}

3. Derivation of the Bias

3.1 Decomposition of the Ratio

Substituting Eq. (1) into Eq. (2):

m(i) = \frac{Y(i)}{X(i)} = \frac{a + b \cdot X(i)}{X(i)}

Distributing the division:

(3)\quad m(i) = b + \frac{a}{X(i)}

This is the fundamental decomposition. The ratio 𝑚(𝑖) equals the structural slope 𝑏 plus a term 𝑎/𝑋(𝑖) that varies inversely with the denominator.

3.2 The Harmonic Mean

The harmonic mean 𝐻 of the 𝑋(𝑖) values is defined as:

(4)\quad H \equiv \frac{N}{\sum \left(\frac{1}{X(i)}\right)}

Equivalently:

\frac{1}{H} = \frac{1}{N} \sum \left(\frac{1}{X(i)}\right)

3.3 The Arithmetic Mean of the Ratios

Taking the arithmetic mean across all observations:

\bar{m} = \frac{1}{N} \sum m(i) = \frac{1}{N} \sum \left[b + \frac{a}{X(i)}\right]

Distributing the summation:

\bar{m} = b + \frac{a}{N} \sum \left(\frac{1}{X(i)}\right)

By the definition of 𝐻:

(5)\quad \bar{m} = b + \frac{a}{H}

3.4 The Bias

The structural (unbiased) profit rate is 𝑏. The arithmetic mean of the ratios is 𝑚̅. The bias is:

(6)\quad \mathrm{Bias}(\bar{m}) = \bar{m} - b = \frac{a}{H}

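The bias is positive when a > 0 and negative when a < 0. Its magnitude, |a|/H, falls as the harmonic mean of the denominators rises. Because H is dominated by the smallest X(i), samples that include small firms carry the largest bias.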
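The identity in Eqs. (5) and (6) can be checked numerically. The sketch below is illustrative only: the values of a, b, and X are hypothetical, and the data are generated from Eq. (1) with no disturbance.

```python
from statistics import harmonic_mean, mean

# Hypothetical structural parameters and denominators (illustrative only).
a, b = 10.0, 0.05
X = [50.0, 100.0, 200.0, 400.0, 800.0]

# Levels equation (1) with no disturbance, then the ratio PLI of Eq. (2).
Y = [a + b * x for x in X]
m = [y / x for y, x in zip(Y, X)]

H = harmonic_mean(X)   # Eq. (4)
m_bar = mean(m)        # arithmetic mean of the ratios
bias = m_bar - b       # Eq. (6)

print(f"H = {H:.4f}")
print(f"mean ratio = {m_bar:.6f}, structural b = {b}")
print(f"bias = {bias:.6f}, a/H = {a / H:.6f}")
```

With these assumed numbers the mean ratio is 0.1275, more than double the structural rate b = 0.05, and the gap matches a/H.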

4. Propagation to All Order Statistics

From Eq. (3), each m(i) = b + a/X(i) is a monotonic transformation of 1/X(i). Assuming every X(i) > 0, as holds for assets, costs, or sales, 𝑚(𝑖) is strictly decreasing in 𝑋(𝑖) when a > 0, so larger firms have smaller ratios. For a < 0, the relationship reverses.

Let Qₚ[m] denote the 𝑝-th quantile of the distribution of 𝑚(𝑖). For a > 0:

(7)\quad Q_p[m] = b + \frac{a}{Q_{1-p}[X]}

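where the subscript shows that the p-th quantile of the ratios corresponds to the (1−p)-th quantile of the X values, because m(i) is decreasing in X(i). For a < 0 the ordering is preserved instead, so Q_p[m] = b + a/Q_p[X].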

The bias at any quantile is:

Q_p[m] - b = \frac{a}{Q_{1-p}[X]}

Consequently, the first quartile (Q₁) and third quartile (Q₃) of the ratios are:

\begin{aligned} Q_1[m] &= b + \frac{a}{Q_3[X]} \\ Q_3[m] &= b + \frac{a}{Q_1[X]} \end{aligned}

The interquartile range (IQR) of the ratios, Q₃[m] − Q₁[m], is therefore (for a > 0; replace a with |a| when a < 0):

(8)\quad \mathrm{IQR}[m] = a \cdot \left(\frac{1}{Q_1[X]} - \frac{1}{Q_3[X]}\right)

This IQR reflects the dispersion of 1/X, not an intrinsic spread of the profit rate 𝑏. The entire interquartile range is an artifact of heteroskedastic bias when a ≠ 0.
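A short numerical check of Eqs. (7) and (8), under stated assumptions: the parameters and the nine X values are hypothetical, a > 0, and the sample size is chosen so that the quartiles fall exactly on order statistics. With interpolated quantiles the relation holds only approximately, because a/X is nonlinear in X.

```python
from statistics import quantiles

# Hypothetical parameters (a > 0) and denominators; N = 9 so that the
# "inclusive" quartiles coincide with the 3rd and 7th order statistics.
a, b = 10.0, 0.05
X = [40.0, 60.0, 90.0, 130.0, 180.0, 250.0, 400.0, 650.0, 1000.0]
m = [b + a / x for x in X]   # Eq. (3)

q1_x, _, q3_x = quantiles(X, n=4, method="inclusive")
q1_m, _, q3_m = quantiles(m, n=4, method="inclusive")

iqr_ratio = q3_m - q1_m                       # IQR computed from the ratios
iqr_eq8 = a * (1.0 / q1_x - 1.0 / q3_x)       # Eq. (8)

print(f"Q1[m] = {q1_m:.6f}  vs  b + a/Q3[X] = {b + a / q3_x:.6f}")
print(f"Q3[m] = {q3_m:.6f}  vs  b + a/Q1[X] = {b + a / q1_x:.6f}")
print(f"IQR[m] = {iqr_ratio:.6f}  vs  Eq. (8) = {iqr_eq8:.6f}")
```

Every ratio here is generated from the same structural rate b, yet the ratios still produce a nonzero interquartile range, driven entirely by the spread of X.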

5. The Unbiasedness Condition

The ratio m(i) = Y(i)/X(i) is unbiased for 𝑏 if and only if a = 0.

If the intercept is zero:

m(i) = \frac{0 + b \cdot X(i)}{X(i)} = b

Every ratio equals 𝑏; all quantiles equal 𝑏; variance is zero (absent measurement error). This zero-intercept condition is implicitly assumed whenever practitioners compute quartiles of ratio-based PLIs. The assumption is:

• Not tested in practice

• Algebraically necessary for the ratios to be unbiased

• Empirically refuted when OLS regression of Y on X yields a statistically significant intercept

6. The Unbiased Remedy

The unbiased estimator of the structural profit rate is the OLS slope 𝑏̂ from the regression of 𝑌 on 𝑋. The fitted value:

(9)\quad \hat{Y} = \hat{a} + \hat{b} \cdot X

provides the arm’s length benchmark for any tested party with denominator X, without ratio-induced distortion.

Inference on 𝑏̂ follows the standard 𝑡-test framework. As Fisher (1925, p. 47) showed, deviations exceeding twice the standard error are formally regarded as significant at the 5% level.
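As a minimal sketch of the remedy, the code below fits the levels equation by OLS and reports t-statistics for both coefficients. The five data points are hypothetical, the helper name ols_with_se is an illustrative assumption, and the standard errors are the classical ones that presume homoskedasticity. With only three degrees of freedom the exact two-sided 5% critical value is about 3.18 rather than 2, so the twice-the-standard-error rule is best read as a large-sample approximation.

```python
import math

def ols_with_se(x, y):
    """OLS of y on x with an intercept, plus classical standard errors."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a_hat = y_bar - b_hat * x_bar
    rss = sum((yi - a_hat - b_hat * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)                                  # residual variance
    se_a = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    se_b = math.sqrt(s2 / sxx)
    return a_hat, b_hat, se_a, se_b

# Hypothetical comparables: Y = 10 + 0.05 X plus small disturbances.
X = [100.0, 200.0, 300.0, 400.0, 500.0]
Y = [16.0, 18.0, 26.0, 30.0, 35.0]

a_hat, b_hat, se_a, se_b = ols_with_se(X, Y)
t_a, t_b = a_hat / se_a, b_hat / se_b
mean_ratio = sum(y / x for x, y in zip(X, Y)) / len(X)

print(f"intercept = {a_hat:.4f} (t = {t_a:.2f})")
print(f"slope     = {b_hat:.4f} (t = {t_b:.2f})")
print(f"mean of ratios Y/X = {mean_ratio:.4f}")

# Arm's length benchmark from Eq. (9) for a tested party with X = 250.
print(f"fitted profit at X = 250: {a_hat + b_hat * 250.0:.2f}")
```

In this example the intercept is clearly distinguishable from zero, and the mean of the ratios sits well above the estimated slope, as the theorem predicts for a > 0.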

Appendix: Conditions for Aitken’s GLS

Extend the structural equation to include a stochastic disturbance:

(A1)\quad Y(i) = a + b \cdot X(i) + \varepsilon(i), \quad i = 1, \dots, N

where ε(i) is a zero-mean error term. OLS is BLUE (Best Linear Unbiased Estimator) under the Gauss–Markov conditions (Draper & Smith, 1998, Ch. 2):

1. Linearity: E[Y(i) | X(i)] = a + bX(i)

2. Exogeneity: E[ε(i) | X(i)] = 0

3. Homoskedasticity: Var(ε(i) | X(i)) = σ² for all i

4. No serial correlation: Cov(ε(i), ε(j)) = 0 for i ≠ j

A.1 When Homoskedasticity Fails

Suppose the error variance depends on X(i):

\mathrm{Var}(\varepsilon(i)\mid X(i)) = \sigma^2 \cdot h(X(i))

where h(·) > 0 is a known or estimable function. The variance–covariance matrix becomes E[εε′] = σ²Ω, where Ω is diagonal but not proportional to the identity matrix.

Consequence: OLS remains unbiased but is no longer efficient. Standard errors computed under homoskedasticity are inconsistent—inference is invalid (Draper & Smith, 1998, §9.1).

A.2 The GLS Estimator

When Ω is known, the efficient estimator is Aitken’s generalized least squares (Draper & Smith, 1998, §9.2):

(A2)\quad \hat{\beta}_{\mathrm{GLS}} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y

Equivalently, GLS is OLS applied to the transformed model:

\Omega^{-1/2} Y = \Omega^{-1/2} X \beta + \Omega^{-1/2} \varepsilon

The transformed errors Ω⁻½ε are homoskedastic with variance σ²I.
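The equivalence between Eq. (A2) and OLS on the transformed model can be illustrated for the diagonal case. The sketch below assumes Ω = diag(h(X(i))) with h(X) = X² purely for illustration; the data and the function names gls_diag, ols_transformed, and solve_2x2 are hypothetical.

```python
import math

def solve_2x2(s11, s12, s22, t1, t2):
    """Solve the symmetric system [[s11, s12], [s12, s22]] @ [a, b] = [t1, t2]."""
    det = s11 * s22 - s12 * s12
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det

def gls_diag(x, y, h):
    """Eq. (A2) with Omega = diag(h): normal equations weighted by 1/h."""
    w = [1.0 / hi for hi in h]
    s11 = sum(w)
    s12 = sum(wi * xi for wi, xi in zip(w, x))
    s22 = sum(wi * xi * xi for wi, xi in zip(w, x))
    t1 = sum(wi * yi for wi, yi in zip(w, y))
    t2 = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    return solve_2x2(s11, s12, s22, t1, t2)

def ols_transformed(x, y, h):
    """Plain least squares after premultiplying every column by Omega^(-1/2)."""
    r = [1.0 / math.sqrt(hi) for hi in h]
    c = r                                      # transformed constant column
    xs = [ri * xi for ri, xi in zip(r, x)]     # transformed X
    ys = [ri * yi for ri, yi in zip(r, y)]     # transformed Y
    s11 = sum(ci * ci for ci in c)
    s12 = sum(ci * xi for ci, xi in zip(c, xs))
    s22 = sum(xi * xi for xi in xs)
    t1 = sum(ci * yi for ci, yi in zip(c, ys))
    t2 = sum(xi * yi for xi, yi in zip(xs, ys))
    return solve_2x2(s11, s12, s22, t1, t2)

# Hypothetical data; h(X) = X^2 is an assumed variance function.
X = [100.0, 200.0, 300.0, 400.0, 500.0]
Y = [16.0, 18.0, 26.0, 30.0, 35.0]
h = [xi ** 2 for xi in X]

a_gls, b_gls = gls_diag(X, Y, h)
a_tr, b_tr = ols_transformed(X, Y, h)
print(f"GLS formula:       a = {a_gls:.6f}, b = {b_gls:.6f}")
print(f"transformed model: a = {a_tr:.6f}, b = {b_tr:.6f}")
```

Both routes give the same estimates up to floating-point rounding. Note that the constant column is transformed along with X, so the transformed regression has no untransformed intercept.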

A.3 Conditions Warranting GLS

Condition 1: Heteroskedasticity is present and systematic. Test using Breusch–Pagan: regress the squared OLS residuals on X; reject homoskedasticity if the slope is significant (Draper & Smith, 1998, §2.12).
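One way to carry out this test is sketched below. It uses the studentized (Koenker) form of the statistic, LM = N·R² from the auxiliary regression of squared residuals on X, compared against the 5% critical value of a chi-square distribution with one degree of freedom (3.841). This variant is a choice made for the illustration, not necessarily the form used in the cited text; the data are hypothetical, with disturbances built to scale with X.

```python
def ols_fit(x, y):
    """Return (intercept, slope) from OLS of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope

def breusch_pagan_lm(x, y):
    """Studentized Breusch-Pagan statistic: N * R^2 of squared residuals on x."""
    n = len(x)
    a_hat, b_hat = ols_fit(x, y)
    u2 = [(yi - a_hat - b_hat * xi) ** 2 for xi, yi in zip(x, y)]
    c0, c1 = ols_fit(x, u2)                    # auxiliary regression
    u2_bar = sum(u2) / n
    ss_tot = sum((v - u2_bar) ** 2 for v in u2)
    ss_res = sum((v - c0 - c1 * xi) ** 2 for xi, v in zip(x, u2))
    r2 = 1.0 - ss_res / ss_tot
    return n * r2, r2

# Hypothetical data: disturbance magnitude is proportional to X.
x = [10.0 * i for i in range(1, 21)]
signs = [1, -1, -1, 1] * 5
y = [10.0 + 0.05 * xi + s * 0.02 * xi for xi, s in zip(x, signs)]

CHI2_1DF_5PCT = 3.841   # 5% critical value, chi-square with 1 degree of freedom
lm, r2 = breusch_pagan_lm(x, y)
print(f"LM = {lm:.2f}, auxiliary R^2 = {r2:.3f}")
print("reject homoskedasticity" if lm > CHI2_1DF_5PCT else "do not reject")
```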

Condition 2: The variance function h(X) is known or estimable. Common specifications:

Variance Structure	h(X(i))	Interpretation
Proportional to X²	X(i)²	Constant coefficient of variation
Proportional to X	X(i)	Error scales with firm size
General power	X(i)ᵞ	Estimated from residuals

Estimation of 𝛾: Regress ln(ε̂²) on ln(X) to identify the variance function (Draper & Smith, 1998, §9.3).
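The log-log regression can be sketched as follows. The function name estimate_gamma is hypothetical, and the residuals are synthetic, constructed so that their squares are exactly proportional to X²; in practice they would come from a first-stage OLS fit, and any exactly zero residual would need to be handled before taking logs.

```python
import math

def estimate_gamma(x, resid):
    """Slope from regressing ln(resid^2) on ln(x); assumes x > 0 and resid != 0."""
    lx = [math.log(xi) for xi in x]
    lu = [math.log(e * e) for e in resid]
    n = len(lx)
    lx_bar, lu_bar = sum(lx) / n, sum(lu) / n
    sxx = sum((v - lx_bar) ** 2 for v in lx)
    sxy = sum((v - lx_bar) * (u - lu_bar) for v, u in zip(lx, lu))
    return sxy / sxx

# Synthetic residuals with |e| proportional to X, so the true gamma is 2.
x = [10.0 * i for i in range(1, 21)]
signs = [1, -1, -1, 1] * 5
resid = [s * 0.02 * xi for xi, s in zip(x, signs)]

gamma_hat = estimate_gamma(x, resid)
print(f"estimated gamma = {gamma_hat:.4f}")
```

An estimate near 2 points to the first row of the table above (constant coefficient of variation), and an estimate near 1 to the second.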

Condition 3: Sample size is sufficient for feasible GLS (FGLS). When Ω is unknown, FGLS proceeds in three steps: (1) run OLS to obtain residuals; (2) estimate h(X) from the residuals; (3) run GLS with the estimated weights. FGLS is consistent and asymptotically efficient. For small samples (N < 50), the efficiency gain may be offset by estimation error in the variance function.
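The FGLS procedure can be sketched end to end as below. This is an illustration under assumptions, not a definitive implementation: the variance function is taken to be the general power form h(X) = X^γ, the data are hypothetical (N = 20, so the small-sample caveat applies), and the helper names ols and wls are illustrative.

```python
import math

def ols(x, y):
    """Return (intercept, slope) from OLS of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - slope * x_bar, slope

def wls(x, y, w):
    """Weighted least squares with weights w(i) = 1/h(X(i))."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data: Y = 10 + 0.05 X with disturbances that scale with X.
x = [10.0 * i for i in range(1, 21)]
signs = [1, -1, -1, 1] * 5
y = [10.0 + 0.05 * xi + s * 0.02 * xi for xi, s in zip(x, signs)]

# Step 1: OLS residuals.
a_ols, b_ols = ols(x, y)
resid = [yi - a_ols - b_ols * xi for xi, yi in zip(x, y)]

# Step 2: estimate h(X) = X^gamma from ln(resid^2) on ln(X).
_, gamma_hat = ols([math.log(xi) for xi in x],
                   [math.log(e * e) for e in resid])

# Step 3: GLS with the estimated weights.
weights = [xi ** (-gamma_hat) for xi in x]
a_fgls, b_fgls = wls(x, y, weights)

print(f"OLS:  a = {a_ols:.4f}, b = {b_ols:.5f}")
print(f"gamma_hat = {gamma_hat:.3f}")
print(f"FGLS: a = {a_fgls:.4f}, b = {b_fgls:.5f}")
```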

A.4 Decision Rule

1. Run OLS on Y = a + bX + ε

2. Test for heteroskedasticity (Breusch–Pagan or White test)

3. If homoskedasticity is not rejected: Use OLS; standard errors are valid

4. If heteroskedasticity is detected:

• If h(X) is well-identified: Use FGLS with weights w(i) = 1/h(X(i))

• If h(X) is poorly identified: Use OLS with heteroskedasticity-consistent (HC) standard errors
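For the last branch of the rule, a minimal sketch of heteroskedasticity-consistent standard errors for the slope is given below. It uses White's HC0 form, which is only one of several HC variants and is chosen here for brevity; the data and the function name slope_standard_errors are hypothetical.

```python
import math

def slope_standard_errors(x, y):
    """OLS slope with classical and HC0 (White) standard errors."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    b_hat = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    a_hat = y_bar - b_hat * x_bar
    u = [yi - a_hat - b_hat * xi for xi, yi in zip(x, y)]
    # Classical: common residual variance s^2 divided by Sxx.
    se_classical = math.sqrt(sum(e * e for e in u) / (n - 2) / sxx)
    # HC0: each squared residual is weighted by its own (x - x_bar)^2.
    se_hc0 = math.sqrt(sum(d * d * e * e for d, e in zip(dx, u))) / sxx
    return b_hat, se_classical, se_hc0

# Hypothetical data with disturbances that scale with X.
x = [10.0 * i for i in range(1, 21)]
signs = [1, -1, -1, 1] * 5
y = [10.0 + 0.05 * xi + s * 0.02 * xi for xi, s in zip(x, signs)]

b_hat, se_classical, se_hc0 = slope_standard_errors(x, y)
print(f"slope = {b_hat:.5f}")
print(f"classical SE = {se_classical:.5f}, HC0 SE = {se_hc0:.5f}")
```

The point estimate is the same under both approaches; only the standard error, and therefore the inference on b, changes.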

Key point: OLS on the levels equation is always unbiased for 𝑏. GLS improves efficiency and inference validity when heteroskedasticity is present. The ratio method m(i) = Y(i)/X(i) is biased for 𝑏 regardless of the error structure—GLS does not rescue ratios; it optimizes estimation in levels.

References

Draper, N. R., & Smith, H. (1998). Applied Regression Analysis (3rd ed.). Wiley. [Chapters 2, 9]

Fisher, R. A. (1925). Statistical Methods for Research Workers. Oliver and Boyd.

OECD (2022). Transfer Pricing Guidelines for Multinational Enterprises and Tax Administrations. OECD Publishing.

Silva, E. (2018). “The Perpetual Inventory Method for Measuring Intangible Capital.” RoyaltyStat Working Paper.
