Artificial increasing returns to scale and the problem of sampling from lognormals
We show how increasing returns to scale in urban scaling can artificially emerge, systematically and predictably, without any sorting or positive externalities. We employ a model where individual productivities are independent and identically distributed lognormal random variables across all cities. We use extreme value theory to demonstrate analytically the paradoxical emergence of increasing returns to scale when the variance of log-productivity is larger than twice the log-size of the population size of the smallest city in a cross-sectional regression. Our contributions are to derive an analytical prediction for the artificial scaling exponent arising from this mechanism and to develop a simple statistical test to try to tell whether a given estimate is real or an artifact. Our analytical results are validated analyzing simulations and real microdata of wages across municipalities in Colombia. We show how an artificial scaling exponent emerges in the Colombian data when the sizes of random samples of workers per municipality are 1% or less of their total size.