Abstract: The number of extant individuals within a lineage, as exemplified by counts of species numbers across genera in a higher taxonomic category, is known to follow a highly skewed distribution. Because the sublineages, such as genera in a clade, themselves follow a random birth process, deriving the distribution of lineage sizes involves averaging the solutions to a birth and death process over the distribution of time intervals separating the origin of the lineages. In this paper, we show that the resulting distributions can be represented by hypergeometric functions of the second kind. We also provide approximations of these distributions up to the second order, and compare these results to the asymptotic distributions and numerical approximations used in previous studies. For two limiting cases, one with a relatively high rate of lineage origin and one with a low rate, the cumulative probability densities and percentiles are compared to show that the approximations are robust over a wide range of parameters. It is proposed that the probability density distributions of lineage size may have a number of relevant applications to biological problems, such as the coalescence of genetic lineages and the prediction of the number of species in living and extinct higher taxa, as these systems are special instances of the underlying process analyzed in this paper.

Author: Panagis Moschopoulos, Max Shpak
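
The averaging step described in the abstract can be illustrated with a minimal sketch of the simplest special case, which is an assumption here and not the paper's general result: a pure birth process (no extinction) with per-lineage birth rate `lam`, and lineage ages drawn from an exponential distribution with rate `rho`. In that case a lineage of age t has a geometrically distributed size with parameter exp(-lam*t), and averaging over ages gives the classical Yule distribution, P(n) = (rho/lam) B(n, rho/lam + 1). The names `rho`, `lam`, `yule_pmf`, and `mixture_pmf_numeric` are hypothetical and chosen only for this illustration; the full birth and death case treated in the paper is not reproduced.

```python
import math


def yule_pmf(n, rho, lam):
    """Closed form for the pure-birth case (assumed, no extinction):
    P(N = n) = a * B(n, a + 1) with a = rho / lam."""
    a = rho / lam
    # Beta function evaluated through log-gamma for numerical stability.
    log_beta = math.lgamma(n) + math.lgamma(a + 1.0) - math.lgamma(n + a + 1.0)
    return a * math.exp(log_beta)


def mixture_pmf_numeric(n, rho, lam, steps=200_000):
    """Average the geometric size distribution at age t over an
    exponential age density rho * exp(-rho * t), using the midpoint rule."""
    # The integrand decays like exp(-(rho + lam) * t), so truncate far in the tail.
    t_max = 50.0 / (rho + lam)
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        p = math.exp(-lam * t)                 # geometric parameter at age t
        size_prob = p * (1.0 - p) ** (n - 1)   # P(N = n | age t)
        total += rho * math.exp(-rho * t) * size_prob
    return total * h


if __name__ == "__main__":
    for n in (1, 2, 5, 20):
        print(n, yule_pmf(n, 2.0, 0.5), mixture_pmf_numeric(n, 2.0, 0.5))
```

Comparing the two columns printed by the script shows the numerical average over lineage ages agreeing with the closed form, which is the same kind of check the abstract describes between exact results and numerical approximations.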


