Revisiting Power Functions in Transfer Pricing

The contrived linear function $Y = \beta \cdot X$, with the zero intercept specified in the U.S. and OECD transfer pricing guidelines, is the special case, with the exponent fixed at one, of a much more general structural form: the power function,

(1)\quad Y = \alpha \cdot X^{\beta}

Two questions arise the moment one fits (1) as a double-log regression. First, how do we translate the estimated coefficients back into the original scale of Y and X? Second, why do we run the regression in logs at all — because the underlying relationship is multiplicative, or because the log transformation stabilizes the variance of the disturbance? These are two distinct rationales for the same regression, answering to different theoretical foundations. The two are routinely conflated, with the result that readers of a double-log fit are unsure what they are reading. This note treats each rationale in turn and shows the back-transformation explicitly.

1. The Power Function as a Structural Model

The data-generating process underlying (1) is multiplicative:

(2)\quad Y(i) = \alpha \cdot X(i)^{\beta} \cdot \varepsilon(i), \qquad i=1,\dots,N

where $\varepsilon(i) > 0$ is a multiplicative disturbance with $\mathbb{E}[\ln \varepsilon(i) \mid X(i)] = 0$. Taking natural logarithms of both sides (which requires X(i) > 0 and Y(i) > 0) linearizes (2) in the parameters:

(3)\quad\ln Y(i) = \ln \alpha + \beta \cdot \ln X(i) + \ln \varepsilon(i)

Let $\alpha^{*} = \ln \alpha$ and $u(i) = \ln \varepsilon(i)$. Then (3) becomes a classical regression in logs:

(4)\quad\ln Y(i) = \alpha^{*} + \beta \cdot \ln X(i) + u(i)

OLS on (4) recovers consistent estimates $(\hat{\alpha}^{*}, \hat{\beta})$. To restore the original scale of Y and X, exponentiate the intercept and read the slope directly as the exponent:

(5)\quad\hat{\alpha} = \exp(\hat{\alpha}^{*}), \qquad \hat{Y}(X) = \hat{\alpha}\cdot X^{\hat{\beta}}

This is the back-transformation. The fitted equation in logs translates one-to-one into a fitted power curve in levels: the intercept of the log-log regression becomes a multiplicative scale in the original units, and the slope becomes an exponent. A subtle issue deserves a parenthetical note: when u is symmetric about zero, the back-transformed curve $\exp(\hat{\alpha}^{*}) \cdot X^{\hat{\beta}}$ estimates the conditional median of Y given X, not the conditional mean. Under log-normal ε with $\mathrm{Var}(u) = \sigma^2$, the conditional mean is

(6)\quad\mathbb{E}[Y\mid X] = \alpha \cdot X^{\beta}\cdot \exp\left(\frac{\sigma^2}{2}\right)

and Duan’s (1983) smearing factor $\frac{1}{N}\sum_{i=1}^{N} \exp(\hat{u}(i))$, which takes the place of $\exp(\sigma^2/2)$ without assuming log-normality, is the nonparametric analogue. Either correction is invoked when the object of inference is the predicted mean of Y rather than the structural parameters (α, β). For transfer pricing applications, the parameters of (1) and the elasticity β are what we typically report, so the smearing correction need not detain us further here.
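For readers who want to see the mechanics of (4) through (6), the following is a minimal sketch on synthetic data. The parameter values (`TRUE_ALPHA`, `TRUE_BETA`, `SIGMA`), the sample size, and the helper names `fit_loglog` and `median_curve` are assumptions made for illustration; none of them come from the article's sample. The sketch fits the double-log regression by closed-form OLS, exponentiates the intercept, and computes both retransformation factors.

```python
import math
import random


def fit_loglog(x, y):
    """Closed-form OLS of ln(y) on ln(x); returns (alpha_star, beta, residuals)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_lx = sum(lx) / n
    mean_ly = sum(ly) / n
    sxx = sum((a - mean_lx) ** 2 for a in lx)
    sxy = sum((a - mean_lx) * (b - mean_ly) for a, b in zip(lx, ly))
    beta = sxy / sxx
    alpha_star = mean_ly - beta * mean_lx
    resid = [b - alpha_star - beta * a for a, b in zip(lx, ly)]
    return alpha_star, beta, resid


# Synthetic data from equation (2) with log-normal disturbance (hypothetical values).
random.seed(0)
TRUE_ALPHA, TRUE_BETA, SIGMA = 0.05, 1.02, 0.3
x = [math.exp(random.uniform(4.0, 10.0)) for _ in range(2000)]
y = [TRUE_ALPHA * xi ** TRUE_BETA * math.exp(random.gauss(0.0, SIGMA)) for xi in x]

alpha_star, beta, resid = fit_loglog(x, y)
alpha_hat = math.exp(alpha_star)  # equation (5): intercept becomes a scale


def median_curve(x_new):
    """Back-transformed power curve; estimates the conditional median of Y."""
    return alpha_hat * x_new ** beta


# Two retransformation factors for the conditional mean.
sigma2_hat = sum(r * r for r in resid) / (len(resid) - 2)
lognormal_factor = math.exp(sigma2_hat / 2.0)                    # equation (6)
smearing_factor = sum(math.exp(r) for r in resid) / len(resid)   # Duan (1983)

print(f"alpha_hat={alpha_hat:.4f}  beta_hat={beta:.4f}")
print(f"log-normal factor={lognormal_factor:.4f}  smearing factor={smearing_factor:.4f}")
```

Multiplying `median_curve(x_new)` by either factor turns the median prediction into a mean prediction; with log-normal disturbances the two factors should be close to each other.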

2. Why Logs? — Rationale One: The Relationship Is Multiplicative

The first rationale is substantive. Many economic relationships — Cobb-Douglas production, demand and supply with constant elasticity, royalty rates as a fraction of a non-linear base — are multiplicative in their natural specification. For such relationships, the additive linear form $Y = \alpha + \beta \cdot X$ is a misspecification, and OLS in levels will distort both slope and inference. The double-log regression is then not a transformation undertaken for statistical convenience but the correct algebraic representation of the structural model.

This rationale is testable through the elasticity. Differentiating (1):

(7)\quad\frac{dY}{dX} = \beta \cdot \alpha \cdot X^{\beta-1}

The proportional rate of change is

(8)\quad\frac{dY/Y}{dX/X} = \left(\frac{dY}{dX}\right) \left(\frac{X}{Y}\right) = \beta

That is, β is the elasticity of Y with respect to X: the percentage change in Y for a 1 percent change in X. Elasticities are invariant to the units of measurement, scale, and currency in which the variables are denominated. Ratios of levels (Y/X) are not, unless both variables are flows rather than a flow divided by a stock.
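The invariance claim can be checked numerically. The sketch below evaluates (8) by a central finite difference at several values of X, then re-expresses the same relationship in different units. The scale and exponent are borrowed from (13) purely for illustration, and the conversion factors `KX` and `KY` are hypothetical.

```python
ALPHA, BETA = 0.0530, 1.0219  # illustrative values borrowed from equation (13)


def power_fn(x):
    """Power function Y = ALPHA * X**BETA, equation (1)."""
    return ALPHA * x ** BETA


def point_elasticity(f, x, h=1e-6):
    """Numerical (dY/dX) * (X/Y), equation (8), via a central difference."""
    dx = x * h
    dydx = (f(x + dx) - f(x - dx)) / (2.0 * dx)
    return dydx * x / f(x)


# The elasticity is the same at every point on the curve.
elasticities = [point_elasticity(power_fn, x) for x in (10.0, 1_000.0, 100_000.0)]

# Hypothetical change of units: X in thousands, Y converted to another currency.
KX, KY = 1_000.0, 0.92


def power_fn_new_units(x_new):
    """Same relationship, with X and Y expressed in the new units."""
    return KY * power_fn(x_new * KX)


elasticity_new_units = point_elasticity(power_fn_new_units, 5.0)

# The ratio of levels, by contrast, picks up the conversion factors.
ratio_old_units = power_fn(5_000.0) / 5_000.0
ratio_new_units = power_fn_new_units(5.0) / 5.0

print(elasticities, elasticity_new_units)
print(ratio_old_units, ratio_new_units)
```

The elasticity stays at β in both sets of units, while the ratio of levels changes by the factor `KX * KY` because X and Y were rescaled by different amounts.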

3. Why Logs? — Rationale Two: Variance Stabilization

The second rationale is statistical. Even if the substantive relationship between Y and X is genuinely linear, $Y = \alpha + \beta \cdot X + u$, the conditional variance Var(u | X) is rarely constant across comparable firms. Larger firms may display larger absolute fluctuations in operating profits than smaller firms; the standard deviation of the disturbance scales with the level of the variable. This is heteroskedasticity. When Var(u | X) is approximately proportional to E[Y | X]², the log transformation stabilizes the variance, since by the delta method

(9)\quad\mathrm{Var}(\ln Y\mid X) \approx \frac{\mathrm{Var}(Y\mid X)} {\mathbb{E}[Y\mid X]^2} \approx \text{constant}

This is the classical motivation in Aitchison and Brown (1957) and the Box-Cox literature. OLS on the logged equation then satisfies the constant-variance condition of the Gauss-Markov theorem that OLS on the levels does not, so it is efficient among linear unbiased estimators, and its conventional standard errors are valid rather than distorted (in either direction) by heteroskedasticity in the disturbance.
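A small simulation makes the approximation in (9) concrete. It is a sketch under stated assumptions: the margin `MARGIN`, the coefficient of variation `CV`, the three revenue levels, and the bounded uniform disturbance (chosen so that simulated profits stay positive and the logarithm is defined) are all hypothetical.

```python
import math
import random
import statistics

random.seed(1)
MARGIN, CV = 0.07, 0.20                  # hypothetical margin and coefficient of variation
REVENUES = (100.0, 1_000.0, 10_000.0)    # hypothetical firm sizes


def simulate_profits(revenue, n=5000):
    """Linear model in levels whose disturbance sd is proportional to the mean."""
    half_width = CV * math.sqrt(3.0)  # uniform(-w, w) has standard deviation w / sqrt(3)
    mean_profit = MARGIN * revenue
    return [mean_profit * (1.0 + random.uniform(-half_width, half_width))
            for _ in range(n)]


sd_levels, sd_logs = [], []
for revenue in REVENUES:
    profits = simulate_profits(revenue)
    sd_levels.append(statistics.stdev(profits))
    sd_logs.append(statistics.stdev([math.log(p) for p in profits]))

for revenue, s_lev, s_log in zip(REVENUES, sd_levels, sd_logs):
    print(f"revenue={revenue:>8.0f}  sd(Y)={s_lev:8.2f}  sd(ln Y)={s_log:.3f}")
```

In levels the standard deviation grows in step with firm size, while in logs it stays close to `CV` at every size, which is the pattern (9) predicts.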

In transfer pricing applications, these two rationales are usually not in competition. Comparable companies span a wide range of sizes; the disturbance scales with size; and the underlying intensity (operating profits per unit of revenue) is approximately constant in a percentage sense. Both rationales point to the same regression. The point of separating them is that each can be defended on its own terms, and a reader who understands only one of the two will misread the output.

4. Empirical Illustration

Using historical annual data on five U.S. retailers (167 paired observations of Revenue and Operating Profits drawn from Compustat), the linear-without-intercept fit is

(10)\quad\hat{Y} = 0.071 \cdot X

with Newey-West t-statistic on the slope of 7.82 and R² = 0.80. The intercept is suppressed by assumption, as specified in the U.S. and OECD guidelines. The double-log fit on the same data is

(11)\quad\ln \hat{Y} = -2.938 + 1.0219 \cdot \ln X

with Newey-West t-statistic on the intercept of −8.62, t-statistic on the slope of 26.04, and R² = 0.90. Back-transforming via (5),

(12)\quad\hat{\alpha} = \exp(-2.938) = 0.0530

and the fitted power function in levels is

(13)\quad\hat{Y} = 0.0530 \cdot X^{1.0219}

Three observations follow.

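First, the double-log specification fits substantially better (R² rises from 0.80 to 0.90), suggesting that the substantive relationship is closer to a power function than to a proportional line through the origin. The comparison is indicative only, because the two R² values are computed on different dependent variables (Y in levels and ln Y).

Second, the estimated elasticity β̂ = 1.0219 is very close to 1, indicating that operating profits scale roughly proportionally with revenue. The slope t-statistic of 26.04 implies a standard error of about 1.0219/26.04 ≈ 0.039, so β̂ lies only about 0.56 standard errors above one.

Third, the log-scale intercept α̂* = −2.938 is precisely estimated (t = −8.62). That t-statistic tests α* = 0, that is α = 1, so it establishes that the scale α̂ = 0.0530 is far below one; it is not a test of a zero intercept in levels. Within the power function (1), the linear-without-intercept model in (10) imposes β = 1 and leaves α free, while its zero-intercept restriction belongs to the additive form $Y = \alpha + \beta \cdot X$. The regression in (10) is therefore doubly constrained, and each constraint has to be tested in the specification that nests it rather than assumed.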
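The arithmetic behind (12) and (13) can be reproduced from the reported coefficients alone. The sketch below is not a re-estimation: it uses only the numbers printed in (10) and (11) and the reported slope t-statistic, and the revenue levels in the loop are hypothetical values expressed in whatever units the original sample used. It also backs out the slope's standard error, assuming the reported t-statistic is the ratio of the estimate to its standard error.

```python
import math

ALPHA_STAR, BETA_HAT = -2.938, 1.0219   # reported in equation (11)
LINEAR_SLOPE = 0.071                    # reported in equation (10)
T_SLOPE = 26.04                         # reported t-statistic on the log-log slope

alpha_hat = math.exp(ALPHA_STAR)        # equation (12)


def power_fit(x):
    """Fitted power function in levels, equation (13)."""
    return alpha_hat * x ** BETA_HAT


def implied_margin(x):
    """Fitted profit per unit of revenue: alpha_hat * x**(BETA_HAT - 1)."""
    return power_fit(x) / x


# Standard error implied by the reported t-statistic, and distance of the slope from one.
se_slope = BETA_HAT / T_SLOPE
t_unit_exponent = (BETA_HAT - 1.0) / se_slope

print(f"alpha_hat = {alpha_hat:.4f}")
for revenue in (100.0, 10_000.0, 1_000_000.0):   # hypothetical revenue levels
    print(f"X={revenue:>11.0f}  power fit={power_fit(revenue):12.2f}  "
          f"linear fit={LINEAR_SLOPE * revenue:12.2f}  "
          f"implied margin={implied_margin(revenue):.4f}")
print(f"implied se(beta) = {se_slope:.4f}, t for beta = 1: {t_unit_exponent:.2f}")
```

Because β̂ is slightly above one, the implied margin rises slowly with revenue instead of staying fixed at the slope in (10).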

5. Why This Matters for the Operating Margin

The contrived linear model $Y = \beta \cdot X$ places the operating margin $m = \frac{Y}{X}$ at the center of the comparison: it asserts that m is a constant equal to β. But under the more general relationship $Y = \alpha + \beta \cdot X$, the margin is

(14)\quad m(i) = \beta + \frac{\alpha}{X(i)}

For α ≠ 0, m is not constant across comparable firms; it varies with the reciprocal of size, falling as X grows when α > 0 and rising when α < 0. The implication is that summary statistics of the operating margin distribution, including its quartiles, are biased away from β: averaging (14) over firms gives a mean margin of β + α / H, where H is the harmonic mean of X, so the bias in the mean is α / H. This is a separate algebraic point that the EdgarStat blog treats in detail in companion articles on the harmonic mean bias theorem. The double-log regression sidesteps the issue at its root by estimating the elasticity β rather than the margin m.
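The role of the harmonic mean follows from averaging (14) over firms, and a few lines of arithmetic confirm it. The intercept, slope, and revenue figures below are hypothetical and chosen only to make the effect visible.

```python
import statistics

ALPHA, BETA = 50.0, 0.06                         # hypothetical intercept and slope
revenues = [500.0, 2_000.0, 8_000.0, 40_000.0]   # hypothetical firm sizes

# Firm-level margins from equation (14): m(i) = BETA + ALPHA / X(i).
margins = [BETA + ALPHA / x for x in revenues]

mean_margin = statistics.mean(margins)
harmonic_mean_x = statistics.harmonic_mean(revenues)
predicted_mean_margin = BETA + ALPHA / harmonic_mean_x

print("margins:", [round(m, 5) for m in margins])
print(f"mean margin = {mean_margin:.6f}")
print(f"BETA + ALPHA / H = {predicted_mean_margin:.6f}")
```

With a positive intercept every firm's margin sits above β, the smallest firm's by the widest gap, and the average margin exceeds β by exactly α / H.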

Takeaway

Two practical conclusions for the transfer pricing practitioner.

First, when reading a double-log regression output, the back-transformation is simply α̂ = exp(intercept) and $\hat{Y} = \hat{\alpha} \cdot X^{\hat{\beta}}$. The intercept becomes a multiplicative scale; the slope becomes an exponent and, equivalently, an elasticity.

Second, the case for running the regression in logs rests on two independent foundations: the substantive case (the relationship is multiplicative) and the statistical case (the disturbance variance stabilizes in logs). Either alone justifies the specification; in transfer pricing applications, both usually hold simultaneously, which is why the double-log fit so often dominates the linear fit in goodness-of-fit and inference. The linear-without-intercept regression specified by the U.S. and OECD guidelines is doubly constrained: relative to the power function (1) it fixes the exponent at β = 1, and relative to the additive form $Y = \alpha + \beta \cdot X$ it fixes the intercept at zero. The data, more often than not, reject one or both restrictions.

References

Aitchison, J., and J. A. C. Brown (1957). The Lognormal Distribution. Cambridge University Press.

Box, G. E. P., and D. R. Cox (1964). “An Analysis of Transformations.” Journal of the Royal Statistical Society, Series B, 26(2): 211–252.

Duan, N. (1983). “Smearing Estimate: A Nonparametric Retransformation Method.” Journal of the American Statistical Association, 78(383): 605–610.

Newey, W. K., and K. D. West (1987). “A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” Econometrica, 55(3): 703–708.
