To determine whether the power law distribution provides an appropriate model for flake-size distributions, I obtained experimentally-generated lithic assemblages and fit a number of common distributions to the data. The initial experiments comprised the reduction of cores to produce flakes that could be further shaped into a tool. I used the maximum likelihood method to find the optimal parameter values for each distribution. The following table shows some initial results of this work, focusing on the flake-size data generated during a single core reduction episode.

**Results of Fitting Various Distributions to a Single Experimentally-Generated Flake Assemblage**

The models selected for comparison are distributions commonly used to model heavy-tailed data. Other model-fitting efforts, for example, have found that the Weibull distribution fits flake-size distributions. While the maximum likelihood method provides a means of fitting each of these models to the same data, it must be supplemented with some method for comparing the results.
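To make the fitting step concrete, here is a minimal sketch of maximum likelihood estimation for three candidate distributions whose estimators have closed forms. It is not the code behind the table above: the function names, the size cutoff `X_MIN`, and the flake sizes are all hypothetical, and the lognormal is fit without truncation at the cutoff, which is a simplification.

```python
import math

# Hypothetical flake sizes (say, maximum dimension in mm); not the experimental data.
FLAKE_SIZES = [5.2, 5.9, 6.4, 7.1, 8.3, 9.8, 12.5, 16.0, 23.7, 41.2]
X_MIN = 5.0  # assumed lower size cutoff; every observation must be >= X_MIN


def fit_power_law(sizes, x_min):
    """MLE for a continuous power law p(x) = ((a-1)/x_min) * (x/x_min)**(-a), x >= x_min."""
    n = len(sizes)
    log_sum = sum(math.log(x / x_min) for x in sizes)
    alpha = 1.0 + n / log_sum
    log_lik = n * math.log((alpha - 1.0) / x_min) - alpha * log_sum
    return {"alpha": alpha}, log_lik


def fit_shifted_exponential(sizes, x_min):
    """MLE for an exponential shifted to start at x_min: p(x) = r * exp(-r * (x - x_min))."""
    n = len(sizes)
    excess = sum(x - x_min for x in sizes)
    rate = n / excess
    log_lik = n * math.log(rate) - rate * excess
    return {"rate": rate}, log_lik


def fit_lognormal(sizes):
    """MLE for a lognormal on (0, inf); not truncated at x_min (a simplification)."""
    n = len(sizes)
    logs = [math.log(x) for x in sizes]
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    log_lik = (-sum(logs) - n * math.log(sigma)
               - 0.5 * n * math.log(2.0 * math.pi) - 0.5 * n)
    return {"mu": mu, "sigma": sigma}, log_lik


for name, (params, log_lik) in {
    "power law": fit_power_law(FLAKE_SIZES, X_MIN),
    "shifted exponential": fit_shifted_exponential(FLAKE_SIZES, X_MIN),
    "lognormal": fit_lognormal(FLAKE_SIZES),
}.items():
    print(f"{name}: params={params}, log-likelihood={log_lik:.3f}")
```

Each function returns the estimated parameters together with the maximized log-likelihood, which is the quantity a model-comparison method needs. Distributions without closed-form estimators, such as the Weibull, would require numerical optimization instead.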

One approach to model comparison, the Akaike Information Criterion (AIC), comes from information theory. The information-theoretic approach quantifies the expected distance between a particular model and the “true” model. The distance reflects the loss of information resulting from use of the focal model, which naturally fails to capture all of the variability in the data. The best model among a series of candidate models is the model that minimizes the distance or information loss. Obviously, the true model is unknown, but the distance can be estimated using some clever math. The derivation of AIC is beyond my powers to explain in any additional detail. After much hairy math, involving various approximations and simplifications and matrix algebra, a very simple formula emerges. AIC quantifies the distance using each model’s likelihood and a penalty term. AIC for a particular model can be defined as follows:

AIC = -2*L* + 2*k*, where *L* is the maximized log-likelihood and *k* is the number of estimated parameters in the model.

For small samples, a corrected version of AIC – termed AICc – is sometimes used:

AICc = AIC + 2*k*(*k*+1)/(*n*-*k*-1), where *n* is the sample size.
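The two formulas above translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the log-likelihoods and sample size in the example are made-up placeholders rather than values from the flake assemblage.

```python
def aic(log_lik, k):
    """AIC = -2L + 2k, with L the maximized log-likelihood and k the parameter count."""
    return -2.0 * log_lik + 2.0 * k


def aicc(log_lik, k, n):
    """Small-sample correction: AICc = AIC + 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic(log_lik, k) + (2.0 * k * (k + 1)) / (n - k - 1)


def rank_by_aicc(candidates, n):
    """candidates maps model name -> (log_lik, k).

    Returns (name, AICc, difference from the best AICc) tuples, best model first.
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in candidates.items()}
    best = min(scores.values())
    return sorted(((name, s, s - best) for name, s in scores.items()),
                  key=lambda row: row[1])


# Placeholder inputs for illustration only.
example = {"model A": (-100.0, 1), "model B": (-99.5, 2)}
for name, score, delta in rank_by_aicc(example, n=30):
    print(f"{name}: AICc={score:.2f}, delta={delta:.2f}")
```

In this placeholder example, model B has the slightly higher log-likelihood, but its extra parameter costs more than the improvement in fit is worth, so model A ranks first. That trade-off is what the penalty term is for.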

The best-fitting model among a series of candidates will have the lowest value of AIC or AICc. Unlike some other approaches, the information-theoretic approach can simultaneously compare any set of models fit to the same data. Likelihood ratio tests, another common model-comparison technique, are limited to pairwise comparisons of nested models; models are nested when the more complex model can be reduced to the simpler model by setting some parameters to a particular value, such as zero. The models compared in this example are not all nested, since the lognormal, for example, cannot be reduced to any of the other models by setting a parameter to a particular value.

The AIC and AICc for the power law distribution are both lower than the values for any of the other modeled distributions. The power law distribution thus fits these experimentally produced flake-size data better than the other distributions considered. These preliminary results support the work of Brown (2001) and others.

Note that the best-fitting model among all candidates may still provide a poor fit to the data, so the power law distribution could still describe these flake sizes badly. A couple of options exist for evaluating model fit. A simple approach is to plot the data against the theoretical distribution. The fit of the model to the data can also be measured, which I will detail in a subsequent post.
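As a rough sketch of the plotting approach, the code below tabulates the points one might plot: the empirical complementary cumulative distribution (the fraction of flakes at or above each size) next to the value implied by a fitted power law. The data, the cutoff `X_MIN`, and the function names are hypothetical, and the power law is assumed to be the continuous form with a fixed lower bound.

```python
import math

# Hypothetical flake sizes and size cutoff; not the experimental data.
FLAKE_SIZES = [5.2, 5.9, 6.4, 7.1, 8.3, 9.8, 12.5, 16.0, 23.7, 41.2]
X_MIN = 5.0


def empirical_ccdf(sizes):
    """Pair each sorted value with the fraction of observations at or above it.

    Tied values receive consecutive ranks rather than a shared fraction.
    """
    ordered = sorted(sizes)
    n = len(ordered)
    return [(x, (n - i) / n) for i, x in enumerate(ordered)]


def power_law_ccdf(x, alpha, x_min):
    """P(X >= x) for a continuous power law with exponent alpha and lower bound x_min."""
    return (x / x_min) ** (1.0 - alpha)


# Maximum likelihood estimate of the exponent for a fixed x_min.
alpha_hat = 1.0 + len(FLAKE_SIZES) / sum(math.log(x / X_MIN) for x in FLAKE_SIZES)

print(f"alpha = {alpha_hat:.3f}")
print("size  empirical  power law")
for x, observed in empirical_ccdf(FLAKE_SIZES):
    print(f"{x:5.1f}  {observed:9.3f}  {power_law_ccdf(x, alpha_hat, X_MIN):9.3f}")
```

On log-log axes, a power law's complementary cumulative distribution is a straight line with slope 1 - alpha, so systematic curvature in the empirical points suggests that the model fits poorly even if it beat the other candidates.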

© Scott Pletka and *Mathematical Tools, Archaeological Problems*, 2014