The Bias of Ratio-Based Profit Indicators

ABSTRACT

When the true relationship between profit Yᵢ and capital (operating assets) Kᵢ is the linear function Yᵢ = a + b · Kᵢ with a nonzero intercept, the return on capital employed (return on assets, ROA) rᵢ = Yᵢ/Kᵢ is not constant across enterprises. We show algebraically that every quantile of the empirical distribution of {rᵢ}, including the median and the quartiles, is a biased estimator of the true return b, and that the interquartile range is distorted as well. The bias of the arithmetic mean of rᵢ equals a/H, where H is the harmonic mean of Kᵢ. The bias of any quantile is a times the corresponding quantile of 1/Kᵢ. All biases vanish if and only if a = 0, i.e., if and only if the true model is a proportional (no-intercept) relationship between profit and capital employed. Testing whether an enterprise’s ROA lies within the interquartile range of comparables therefore constitutes a biased procedure whenever a ≠ 0. The correct approach, for any profit indicator expressed as a ratio, is to estimate the underlying linear regression and use its slope.

1. Setup and Notation

Let i = 1, …, N index a cross-section of enterprises. Denote by Yᵢ the operating profit of enterprise i and by Kᵢ > 0 its asset base (capital employed). Suppose the data-generating process is the linear model:

(1)\quad Y_i = a + b \cdot K_i, \quad i = 1, \dots, N

where b is the return on capital employed (the slope) and a is a fixed profit component (the intercept), which may be positive or negative. With an additive error term, equation (1) is the classical simple linear regression specification; the Gauss–Markov (GM) theorem guarantees that the ordinary least-squares (OLS) estimator of (a, b) is the best linear unbiased estimator when the standard assumptions hold [1, 2]. The GM theorem concerns the sampling variance of the coefficient estimators, not the algebraic determination of the coefficients per se.

The return on assets (ROA) of the enterprise i is the ratio:

(2)\quad r_i = \frac{Y_i}{K_i}

A common practice is to compute quantiles (quartiles, interquartile range) of the empirical distribution of {rᵢ} and to use these as benchmarks for the “normal” or comparable ROA. We examine whether this procedure correctly targets the structural parameter b.

2. Decomposition of the Ratio

Substituting (1) into (2):

(3)\quad r_i = \frac{a + b \cdot K_i}{K_i} = b + \frac{a}{K_i}

Equation (3) is the central result. The ratio rᵢ equals the slope b plus an enterprise-specific correction term a/Kᵢ. Because Kᵢ varies across enterprises or firms, rᵢ is not constant even when model (1) holds exactly. The correction term is a decreasing function of Kᵢ when a > 0, and an increasing function when a < 0. The ratio model rᵢ = b (constant ROA) holds identically across enterprises or firms only when a = 0, i.e., only when profit is strictly proportional to the capital employed with no intercept.
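The decomposition in equation (3) can be checked numerically. The following is a minimal sketch; the values of a, b, and the capital bases are hypothetical and chosen only for illustration.

```python
# Hypothetical cross-section: a, b and the capital values are illustrative only.
a, b = 5.0, 0.10                      # intercept and slope of model (1)
capital = [10.0, 20.0, 40.0, 80.0, 160.0]

profit = [a + b * k for k in capital]            # model (1), no noise
roa = [y / k for y, k in zip(profit, capital)]   # definition (2)

for k, r in zip(capital, roa):
    # Equation (3): each ratio is the slope b plus the correction a/K.
    assert abs(r - (b + a / k)) < 1e-12
    print(f"K = {k:6.1f}   r = {r:.4f}   a/K = {a / k:.4f}")
```

In this sketch the ratio falls from 0.60 for the smallest enterprise to about 0.13 for the largest, although the slope b is 0.10 for every enterprise.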

3. Bias of the Arithmetic Mean of ROA

The arithmetic mean of the N ratios is:

(4)\quad \bar{r} = \frac{1}{N} \sum_i r_i

Substituting (3):

(5)\quad \bar{r} = \frac{1}{N} \sum_i \left(b + \frac{a}{K_i}\right) = b + a \cdot \frac{1}{N} \sum_i \frac{1}{K_i}

Recall the harmonic mean of {Kᵢ}:

(6)\quad H = \frac{N}{\sum_i \frac{1}{K_i}}

so that

(7)\quad \frac{1}{N} \sum_i \frac{1}{K_i} = \frac{1}{H}

Substituting (7) into (5):

(8)\quad \bar{r} = b + \frac{a}{H} \quad \text{[central result]}

Proposition 1 — Bias of the Mean

Under the linear model (1), the arithmetic mean of ROA satisfies r̅ = b + a/H, where H is the harmonic mean of {Kᵢ}. The bias r̅ − b = a/H is zero if and only if a = 0.

Because H > 0 whenever all Kᵢ > 0, the bias a/H is strictly positive when a > 0 (mean ROA overstates b) and strictly negative when a < 0 (mean ROA understates b).
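Proposition 1 can be illustrated with a short sketch using Python's standard `statistics` module. The helper name `mean_roa_bias` and all numerical values are assumptions made for the example, not part of the original analysis.

```python
from statistics import fmean, harmonic_mean

def mean_roa_bias(a, b, capital):
    """Return (mean ROA minus b, a/H) under model (1) without noise."""
    roa = [(a + b * k) / k for k in capital]
    H = harmonic_mean(capital)          # equation (6)
    return fmean(roa) - b, a / H        # left and right side of the bias in (8)

capital = [10.0, 20.0, 40.0, 80.0, 160.0]   # hypothetical capital bases
b = 0.10                                     # hypothetical slope

for a in (5.0, -5.0, 0.0):                   # positive, negative and zero intercept
    bias, a_over_H = mean_roa_bias(a, b, capital)
    assert abs(bias - a_over_H) < 1e-12
    print(f"a = {a:5.1f}   bias of mean ROA = {bias:+.5f}   a/H = {a_over_H:+.5f}")
```

With a = 5 the mean ROA in this example is about 0.294, nearly three times the slope of 0.10; with a = 0 the bias disappears.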

The relationship between the harmonic and arithmetic means follows from Jensen’s inequality applied to the function f(x) = 1/x, which is convex for x > 0 [3]:

(9)\quad \frac{1}{N} \sum_i \frac{1}{K_i} \ge \frac{1}{\bar{K}}

where K̅ is the arithmetic mean of the capital employed, with equality only when all Kᵢ are equal. Equivalently, H ≤ K̅, and therefore |a/H| ≥ |a/K̅|: the bias computed using the harmonic mean is at least as large in absolute value as the bias that would be obtained by naively replacing H with K̅.
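The inequality H ≤ K̅ and its consequence for the size of the bias can be verified on sample data. This is a minimal sketch with hypothetical capital values and a hypothetical intercept.

```python
from statistics import fmean, harmonic_mean

capital = [10.0, 20.0, 40.0, 80.0, 160.0]   # hypothetical capital bases
a = 5.0                                      # hypothetical intercept

H = harmonic_mean(capital)
K_bar = fmean(capital)

assert H <= K_bar                            # inequality (9), restated as H <= K-bar
assert abs(a / H) >= abs(a / K_bar)          # true bias versus the naive a/K-bar
print(f"H = {H:.2f}, K-bar = {K_bar:.2f}")
print(f"a/H = {a / H:.4f} versus a/K-bar = {a / K_bar:.4f}")

# Equality case: when all capital bases coincide, the two means agree.
equal = [50.0] * 5
assert abs(harmonic_mean(equal) - fmean(equal)) < 1e-9
```

In this example H is roughly 25.8 while K̅ is 62, so replacing H with K̅ would understate the bias by more than half.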

4. Bias of an Arbitrary Quantile of ROA

Let the order statistics of {Kᵢ} be K₍₁₎ ≤ K₍₂₎ ≤ ⋯ ≤ K₍ₙ₎. Since rᵢ = b + a/Kᵢ, the ordering of {rᵢ} by Kᵢ yields:

• If a > 0: rᵢ is strictly decreasing in Kᵢ, so the largest ratios correspond to the smallest capitals.

• If a < 0: rᵢ is strictly increasing in Kᵢ.

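Denote the p-th quantile of a variable Z by Qₚ[Z]. For a > 0, the map x ↦ b + a·x is increasing, so quantiles pass through it unchanged (recall rᵢ = b + a·(1/Kᵢ)). For a < 0 the map is decreasing, so Qₚ[1/K] below must be replaced by Q₍₁₋ₚ₎[1/K], which turns the final expression (12) into Qₚ[r] = b + a/Qₚ[K]. For a > 0: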

(10)\quad Q_p[r] = b + a \cdot Q_p\left(\frac{1}{K}\right)

Since 1/Kᵢ is a strictly decreasing function of Kᵢ, its quantiles satisfy:

(11)\quad Q_p\left(\frac{1}{K}\right) = \frac{1}{Q_{1-p}(K)}

Substituting (11) into (10):

(12)\quad Q_p[r] = b + \frac{a}{Q_{1-p}(K)} \quad \text{[central result]}

Proposition 2 — Bias of an Arbitrary Quantile

Under the linear model (1) with a > 0, the p-th quantile of the empirical ROA distribution satisfies Qₚ[r] − b = a / Q₍₁₋ₚ₎[K], where Q₍₁₋ₚ₎[K] is the (1−p)-th quantile of the capital distribution {Kᵢ}. With a < 0, the ordering of rᵢ follows that of Kᵢ and the bias is a / Qₚ[K]. In either case the bias is zero if and only if a = 0.

For the median (p = 1/2), the bias is a/med(K) for either sign of a. Taking a > 0, the bias of the lower quartile (p = 1/4) is a/Q_{3/4}[K], i.e., the intercept divided by the upper quartile of capital, and the bias of the upper quartile (p = 3/4) is a/Q_{1/4}[K], i.e., the intercept divided by the lower quartile of the capital employed.
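The link between the quantiles of r and those of K can be illustrated with order statistics. The sketch below rests on several assumptions: the data are hypothetical, N = 7 is chosen so that the 2nd, 4th, and 6th order statistics serve as lower quartile, median, and upper quartile, and the helper `generating_capital` is introduced only for this example. Quantile conventions that interpolate between observations would reproduce the identity only approximately, because 1/K is nonlinear.

```python
def generating_capital(a, capital, j):
    """Capital value behind the j-th smallest ROA (j is 1-indexed)."""
    ks = sorted(capital)
    n = len(ks)
    # a > 0: r falls as K rises, so the j-th smallest r comes from the
    # (n + 1 - j)-th smallest K.  a < 0: r rises with K, order is preserved.
    return ks[n - j] if a > 0 else ks[j - 1]

b = 0.10                                                  # hypothetical slope
capital = [10.0, 15.0, 20.0, 40.0, 80.0, 120.0, 160.0]    # hypothetical, N = 7
positions = {"lower quartile": 2, "median": 4, "upper quartile": 6}

biases = {}
for a in (5.0, -5.0):                                     # both signs of the intercept
    roa = sorted(b + a / k for k in capital)
    for label, j in positions.items():
        k_src = generating_capital(a, capital, j)
        assert roa[j - 1] == b + a / k_src                # quantile of r from quantile of K
        biases[(a, label)] = roa[j - 1] - b
        print(f"a = {a:+.0f}  {label:14s}  K = {k_src:6.1f}  bias = {biases[(a, label)]:+.4f}")
```

For a > 0 the lower quartile of r is generated by the upper quartile of K, as in equation (12); for a < 0 the same-order quantile of K applies.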

5. The Interquartile Range Is Also Biased

The interquartile range (IQR) of ROA is IQR[r] = Q_{3/4}[r] − Q_{1/4}[r]. From Proposition 2:

(13a)\quad \mathrm{IQR}[r] = \left(b + \frac{a}{Q_{1/4}(K)}\right) - \left(b + \frac{a}{Q_{3/4}(K)}\right)

(13b)\quad \mathrm{IQR}[r] = a \cdot \left(\frac{1}{Q_{1/4}(K)} - \frac{1}{Q_{3/4}(K)}\right)

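When a > 0, the IQR of r equals a times the spread of 1/K between its lower and upper quartiles. This spread is positive whenever Q_{1/4}[K] < Q_{3/4}[K] (which implies 1/Q_{1/4}[K] > 1/Q_{3/4}[K]). When a < 0, the ordering of r reverses, the two quartiles of K trade places, and the same derivation gives −a times the spread, so in general IQR[r] = |a|·(1/Q_{1/4}[K] − 1/Q_{3/4}[K]). Under a constant-b model every rᵢ equals b and the IQR is zero; a nonzero intercept of either sign therefore inflates the IQR of ROA, and the resulting dispersion reflects differences in capital rather than differences in return. Using [Q_{1/4}[r], Q_{3/4}[r]] as the “normal” range for a firm with capital K* is a biased procedure: the correct benchmark is a + b·K* for profit (equivalently, b + a/K* for the ratio), computed from the regression.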
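The behaviour of the IQR can be checked for positive, negative, and zero intercepts. The sketch below is hedged in the same way as any small numerical example: the data are hypothetical, the quartiles are taken as the 2nd and 6th order statistics of N = 7 observations, and the helper names are introduced only for illustration.

```python
def quartiles(values):
    """Lower and upper quartile as order statistics 2 and 6 of 7 values."""
    s = sorted(values)
    assert len(s) == 7, "this sketch is written for N = 7"
    return s[1], s[5]

def iqr_of_roa(a, b, capital):
    """IQR of r = b + a/K under model (1) without noise."""
    q1, q3 = quartiles([b + a / k for k in capital])
    return q3 - q1

b = 0.10                                                  # hypothetical slope
capital = [10.0, 15.0, 20.0, 40.0, 80.0, 120.0, 160.0]    # hypothetical capital bases

k_q1, k_q3 = quartiles(capital)
spread = 1.0 / k_q1 - 1.0 / k_q3     # spread of 1/K between the quartiles of K

for a in (5.0, -5.0, 0.0):
    iqr = iqr_of_roa(a, b, capital)
    # The IQR is never negative; it equals |a| times the spread.
    assert abs(iqr - abs(a) * spread) < 1e-12
    print(f"a = {a:+.1f}   IQR[r] = {iqr:.4f}   |a| * spread = {abs(a) * spread:.4f}")
```

In this example the IQR is about 0.29 for a = 5 and for a = −5, and exactly zero for a = 0, even though the slope b is identical in all three cases.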

6. Unified Summary of Biases

Table 1 collects the bias expressions for the principal summary statistics of the ROA distribution, under the maintained hypothesis that the true model is (1):

Statistic of rᵢ	Value under model (1)	Bias vs. b
Arithmetic mean r̅	b + a/H	a/H
Median Q_{1/2}[r]	b + a/med(K)	a/med(K)
Lower quartile Q_{1/4}[r]	b + a/Q_{3/4}[K]	a/Q_{3/4}[K]
Upper quartile Q_{3/4}[r]	b + a/Q_{1/4}[K]	a/Q_{1/4}[K]
IQR[r]	|a|·(1/Q_{1/4}[K] − 1/Q_{3/4}[K])	not 0 if a ≠ 0

Table 1. Biases of ROA summary statistics under Yᵢ = a + b·Kᵢ. H = harmonic mean of {Kᵢ}; quantiles of K are of the capital distribution. The quartile rows are written for a > 0; for a < 0 the quartile indices of K are interchanged.

7. Implication: The Intercept Test

The unbiased approach is to estimate model (1) by OLS, obtaining estimates â and b̂, and then:

1. Test whether â is statistically different from zero using the standard t-statistic with N−2 degrees of freedom [1]. If â is not significantly different from zero, the proportional (no-intercept) model may be appropriate, and the bias of a ratio-based analysis is likely to be small. Failing to reject a = 0 does not prove that the intercept is zero, particularly in small samples.

2. If the hypothesis a = 0 is rejected, use the regression-fitted value Ŷ* = â + b̂·K* as the benchmark for the enterprise with capital K*, rather than a quantile of {rᵢ}.

3. Construct confidence intervals for the regression coefficients, and prediction intervals around the fitted value Ŷ*, using standard OLS inference; these correctly account for the nonzero intercept [2].
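The steps above can be sketched in plain Python using the closed-form OLS formulas for a single regressor. Everything numerical here is an assumption made for illustration: the eight capital values, the true parameters a = 5 and b = 0.10, and a deterministic noise pattern constructed to sum to zero and to be uncorrelated with capital, so that the estimates reproduce a and b up to rounding. The critical value 2.447 is the two-sided 5% point of Student's t with 6 degrees of freedom.

```python
import math

def ols_with_intercept(x, y):
    """Closed-form OLS for y = a + b*x; returns (a_hat, b_hat, se_a, se_b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    resid = [yi - (a_hat + b_hat * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - 2)        # residual variance, N - 2 df
    se_a = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    se_b = math.sqrt(s2 / sxx)
    return a_hat, b_hat, se_a, se_b

# Hypothetical comparables: capital, and profit from model (1) plus noise.
capital = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0]
noise = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]   # illustrative pattern
profit = [5.0 + 0.10 * k + e for k, e in zip(capital, noise)]

a_hat, b_hat, se_a, se_b = ols_with_intercept(capital, profit)

# Step 1: t-test of the intercept (N - 2 = 6 degrees of freedom).
t_a = a_hat / se_a
T_CRIT = 2.447
intercept_significant = abs(t_a) > T_CRIT

# Step 2: regression benchmark for a tested enterprise with capital K*.
k_star = 45.0                                    # hypothetical tested enterprise
benchmark = a_hat + b_hat * k_star

print(f"a_hat = {a_hat:.3f} (t = {t_a:.2f}), b_hat = {b_hat:.3f}")
print(f"intercept significant at 5%: {intercept_significant}")
print(f"benchmark profit at K* = {k_star}: {benchmark:.3f}")
```

Step 3 (interval estimates) would build on `se_a`, `se_b`, and the residual variance in the same way; it is omitted to keep the sketch short.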

In brief: whenever the comparable data-generating process has a nonzero intercept, the linear regression slope b̂ — not any quantile of the ratio rᵢ — is the correct estimator of the return on capital employed (operating assets). The algebraic considerations above apply to any profit indicator defined as a ratio.

¹ The condition intercept a = 0 is equivalent to saying that the regression line passes through the origin. This can be formally tested; forcing the origin restriction when a ≠ 0 yields biased estimates of all regression coefficients (see [2], §5.3).

References

[1] Aitken, A. (1935). “On least squares and linear combinations of observations.” Proceedings of the Royal Society of Edinburgh, 55, 42–48 (introduced the Aitken estimator (or GLS) to solve linear regression models when observation errors are not independent or identically distributed).

[2] Greene, W. (2018). Econometric Analysis (8th ed.). New York: Pearson. Chapters 2–4 (classical linear regression model, OLS, inference).

[3] Jensen, J. (1906). “Sur les fonctions convexes et les inégalités entre les valeurs moyennes.” Acta Mathematica, 30(1), 175–193 (Jensen’s inequality, used in §3 to establish H ≤ K̅).

[4] Serfling, R. (1980). Approximation Theorems of Mathematical Statistics. New York: Wiley. §2.3 (quantile functions and order statistics of transformed random variables).

[5] Cochran, W. (1977). Sampling Techniques (3rd ed.). New York: Wiley. §6.4–6.5 (bias of the ratio estimator and the role of the harmonic mean).

[6] Mood, A., Graybill, F., & Boes, D. (1974). Introduction to the Theory of Statistics (3rd ed.). New York: McGraw-Hill. §5.2 (order statistics and quantile transformations).
