Posts Tagged ‘power law distribution’

Comparing the Fit of Distributions to Experimental Flake Size Data I

January 28, 2014

To determine whether the power law distribution provides an appropriate model for flake-size distributions, I obtained experimentally-generated lithic assemblages and fit a number of common distributions to the data. The initial experiments comprised the reduction of cores to produce flakes that could be further shaped into a tool. I used the maximum likelihood method to find the optimal parameter values for each distribution. The following table shows some initial results of this work, focusing on the flake-size data generated during a single core reduction episode.

Results of Fitting Various Distributions to a Single Experimentally-Generated Flake Assemblage

The models selected for comparison comprise a list of distributions commonly used to model heavy-tailed data. Studies using other model-fitting approaches, for example, have found that the Weibull distribution fits flake-size distributions. While the maximum likelihood method provides a means by which to fit each of these models to the same data, this approach must be supplemented with some method for comparing the results.

One approach to model comparison, the Akaike Information Criterion (AIC), comes from information theory. The information-theoretic approach quantifies the expected distance between a particular model and the “true” model. The distance reflects the loss of information resulting from use of the focal model, which naturally fails to capture all of the variability in the data. The best model among a series of candidate models is the model that minimizes the distance or information loss. Obviously, the true model is unknown, but the distance can be estimated using some clever math. The derivation of AIC is beyond my powers to explain in any additional detail. After much hairy math, involving various approximations and simplifications and matrix algebra, a very simple formula emerges. AIC quantifies the distance using each model’s likelihood and a penalty term. AIC for a particular model can be defined as follows:

AIC = -2L + 2k, where L is the log-likelihood and k refers to the number of parameters in the model.

For small samples, a corrected version of AIC – termed AICc – is sometimes used:

AICc = AIC + 2k(k+1)/(n-k-1), where n is the sample size.
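The two formulas can be sketched in a few lines of Python. This is only an illustration: the function names, the two candidate models, and their log-likelihoods are hypothetical and are not taken from the flake data. The correction term uses the standard form 2k(k+1)/(n-k-1).

```python
def aic(log_likelihood, k):
    """AIC = -2L + 2k, with L the maximized log-likelihood and k the parameter count."""
    return -2.0 * log_likelihood + 2.0 * k


def aicc(log_likelihood, k, n):
    """AICc adds the small-sample correction 2k(k+1)/(n-k-1) to AIC."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_likelihood, k) + (2.0 * k * (k + 1)) / (n - k - 1)


# Hypothetical log-likelihoods for two candidate models fit to the same 50 observations.
candidates = {
    "model_a": {"log_likelihood": -120.0, "k": 1},
    "model_b": {"log_likelihood": -119.5, "k": 2},
}
n = 50

scores = {name: aicc(m["log_likelihood"], m["k"], n) for name, m in candidates.items()}
best = min(scores, key=scores.get)  # the lowest AICc marks the preferred model
print(scores, best)
```

In this made-up case the second model has a slightly higher likelihood, but the gain is too small to offset the penalty for its extra parameter.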

The best-fitting model among a series of candidates will have the lowest value of AIC or AICc. Unlike some other approaches, the information-theoretic approach can simultaneously compare any set of models. Likelihood ratio tests, another common model-comparison technique, are limited to pairwise comparisons of nested models; models are nested when the more complex model can be reduced to the simpler model by setting some parameters to a particular value, such as zero. The models compared in this example are not all nested, since the lognormal, for example, cannot be reduced to any of the other models by setting a parameter to a particular value.

The AIC and AICc for the power law distribution are both lower than the values for any of the other modeled distributions. The power law distribution thus fits this experimentally-produced flake size data better than other common distributions. These preliminary results support the work of Brown (2001) and others.

Note that the best-fitting model among all candidates may still provide a poor fit to the data, and the power law distribution is no exception. A couple of options exist for evaluating model fit. A simple approach would be to plot the data against the theoretical distribution. There is also a way to measure the fit of the model to the data, which I will detail in a subsequent post.
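One way to set up such a plot is to tabulate the empirical and theoretical complementary cumulative distributions side by side. The sketch below is a rough illustration. The flake sizes are invented, the fitted exponent and minimum value are assumed, and the theoretical curve assumes a continuous power law with density proportional to x^(-alpha) above x_min. The resulting pairs could be passed to any plotting tool, typically on log-log axes.

```python
def empirical_ccdf(data):
    """Return (value, fraction of observations >= value) pairs, assuming distinct values."""
    xs = sorted(data)
    n = len(xs)
    return [(x, (n - i) / n) for i, x in enumerate(xs)]


def power_law_ccdf(x, alpha, x_min):
    """P(X >= x) for a continuous power law with density proportional to x**(-alpha)."""
    return (x / x_min) ** (-(alpha - 1.0))


# Hypothetical flake sizes and fitted parameters, for illustration only.
flakes = [9.0, 9.4, 10.1, 10.8, 11.5, 12.9, 14.2, 17.0, 21.3, 40.5]
alpha_hat, x_min = 3.6, 8.9

for x, observed in empirical_ccdf(flakes):
    expected = power_law_ccdf(x, alpha_hat, x_min)
    print(f"size {x:6.1f}  empirical {observed:.2f}  theoretical {expected:.2f}")
```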

© Scott Pletka and Mathematical Tools, Archaeological Problems, 2014

Power Law Distributions and Model-fitting Approaches

November 18, 2013

A power law distribution has a heavy tail. The following simulation results depict the form of a power law distribution. In this example, the value of $\alpha$ is 3.6, the minimum value is 8.9, and the sample size is 75. The histogram shows the values that resulted from the simulation run, while the red line shows the theoretical distribution. Note that a few simulation values are much, much larger than the rest. To preview some results to be presented later, the distribution of debitage size in lithic reduction experiments looks similar. Such similarity is suggestive but not definitive.
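A simulation along these lines can be sketched with inverse-transform sampling. The code is an assumption about how such a run might be set up, not the code behind the figure: it treats the distribution as a continuous power law with density proportional to x^(-alpha) for x at or above the minimum value, and the function name and seed are hypothetical.

```python
import random


def sample_power_law(alpha, x_min, n, rng):
    """Draw n values from a continuous power law with density ~ x**(-alpha), x >= x_min."""
    # Inverse-transform sampling: the CDF is 1 - (x / x_min) ** (-(alpha - 1)),
    # so solving for x at a uniform draw u gives the expression below.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


rng = random.Random(42)  # fixed seed so the run is repeatable
sample = sample_power_law(alpha=3.6, x_min=8.9, n=75, rng=rng)
sample.sort()

median = sample[len(sample) // 2]
print("min:", min(sample), "median:", median, "max:", max(sample))
```

Comparing the maximum to the median gives a quick sense of the heavy tail: most values sit close to the minimum, while a few fall far out.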

As discussed previously, linear regression analysis has sometimes been used to evaluate the fit of the power law distribution to data and to estimate the value of the exponent, $\alpha$. This technique produces biased estimates, which the next simulation results illustrate. In the simulation, a random number generator produced a sample of numbers drawn from a power law distribution at a particular value of $\alpha$. I then analyzed this artificial data set using the linear regression approach described by Brown (2001) and using maximum likelihood estimation through a direct search of the parameter space. For a simple power law distribution, the maximum likelihood estimate could also be found analytically. I used a direct search approach, however, in anticipation of using this approach for more complex mixture models. I repeated the random number generation and analysis 35 separate times for several different combinations of $\alpha$ and sample sizes. The following histograms show the estimates for $\alpha$ using the linear regression analysis and maximum likelihood estimation to find that value. In this particular case, $\alpha$ was set to 3.6 in the random number generator, the minimum value was set to 8.9, and the sample size was 500.
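The maximum likelihood step can be sketched as follows, comparing a direct search over a grid of candidate exponents with the analytic estimate. Several assumptions apply: the density is taken to be proportional to x^(-alpha) above a known minimum value, the grid bounds and step size are arbitrary, and the function names and seed are hypothetical. This is not the original simulation code.

```python
import math
import random


def sample_power_law(alpha, x_min, n, rng):
    """Inverse-transform draws from a density proportional to x**(-alpha), x >= x_min."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def log_likelihood(alpha, n, log_sum):
    """Power law log-likelihood, dropping terms that do not involve alpha.

    log_sum is the sum of log(x / x_min) over the sample.
    """
    return n * math.log(alpha - 1.0) - alpha * log_sum


def mle_closed_form(data, x_min):
    """Analytic maximum likelihood estimate: 1 + n / sum(log(x / x_min))."""
    log_sum = sum(math.log(x / x_min) for x in data)
    return 1.0 + len(data) / log_sum


def mle_direct_search(data, x_min, lo=1.5, hi=6.0, step=0.001):
    """Evaluate the log-likelihood on a grid of alpha values and keep the best one."""
    n = len(data)
    log_sum = sum(math.log(x / x_min) for x in data)
    n_steps = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n_steps + 1)]
    return max(grid, key=lambda a: log_likelihood(a, n, log_sum))


rng = random.Random(2013)  # fixed seed so the run is repeatable
data = sample_power_law(alpha=3.6, x_min=8.9, n=500, rng=rng)

alpha_closed = mle_closed_form(data, x_min=8.9)
alpha_search = mle_direct_search(data, x_min=8.9)
print("closed form:", alpha_closed, "direct search:", alpha_search)
```

For a single power law the two estimates should agree to within the grid step. The direct search is the slower route here, but the same pattern extends to likelihoods that have no analytic solution.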

The histograms clearly suggest that the maximum likelihood estimates center closely around the true value of $\alpha$, while the regression estimates skew toward a lower value. The simulations that I performed at other parameter values and sample sizes displayed similar results. Other, more extensive simulation work also supports these impressions (Clauset et al. [2009] provide these results as part of a detailed, comprehensive technical discussion). Consequently, I used maximum likelihood estimation to fit probability distributions to data in the subsequent analyses.

References cited

Brown, Clifford T.
2001 The Fractal Dimensions of Lithic Reduction. Journal of Archaeological Science 28: 619-631.

Clauset, Aaron; Cosma Rohilla Shalizi; and Mark E. J. Newman
2009 Power-Law Distributions in Empirical Data. SIAM Review 51(4): 661-703.

© Scott Pletka and Mathematical Tools, Archaeological Problems, 2013

Identification of Lithic Reduction Strategies from Mixed Assemblages

November 11, 2013

This post is the first in a series that will try to characterize lithic debitage assemblages formed from more than one reduction strategy. The primary goals are to estimate the proportions of the various reduction strategies represented within these mixed assemblages and to quantify the uncertainty of these estimates. I plan to use mixture models and the method of maximum likelihood to identify the distinct components of such assemblages.

Brown (2001) suggests that the distribution of debitage size follows a power law. Power law distributions have the following probability density function:

$f(x\vert \alpha) = C x^{-\alpha}$,

where C is a constant that normalizes the distribution, so that the density integrates to one. The value of C thus depends on the exponent $\alpha$ and on the minimum value at which the power law begins.
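The normalizing constant can be worked out directly under two assumptions that go beyond the text above: the density has the form $C x^{-\alpha}$ with $\alpha > 1$, and it applies only at or above some minimum value, written here as $x_{min}$ (a symbol introduced for this illustration).

```latex
\int_{x_{min}}^{\infty} C x^{-\alpha} \, dx
  = C \left[ \frac{x^{1-\alpha}}{1-\alpha} \right]_{x_{min}}^{\infty}
  = \frac{C \, x_{min}^{1-\alpha}}{\alpha - 1}
  = 1
\quad \Longrightarrow \quad
C = (\alpha - 1) \, x_{min}^{\alpha - 1}
```

Under these assumptions the density can be written as $f(x\vert \alpha) = \frac{\alpha - 1}{x_{min}} \left( \frac{x}{x_{min}} \right)^{-\alpha}$.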

Based on analysis of experimentally-produced assemblages, Brown further suggests that the exponent, $\alpha$, of these power law distributions varies among different reduction strategies. Thus, different reduction strategies produce distinctive debitage size distributions. This result could be very powerful, allowing reduction strategies from a wide variety of contexts to be characterized and distinguished. The technique used by Brown to estimate the value of the exponent, however, has some technical flaws.

Brown (2001) fits a linear regression to the relationship between the log of flake size grade and the log of the cumulative count of flakes in each size grade. In its favor, this approach seemingly reduces the effects of small sample sizes and can be easily replicated. The regression approach, on the other hand, also produces biased estimates of the exponent and does not allow the fit of the power law model to be compared to other probability density functions.
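The general shape of such a regression can be sketched as below. This is a loose illustration and may differ in detail from Brown's procedure. It assumes that the cumulative count for a size grade is the number of flakes at least as large as that grade's lower bound, and that the slope of the log-log line approximates -(alpha - 1) for a density proportional to x^(-alpha). The function names and the toy data are hypothetical.

```python
import math


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def regression_exponent(sizes, grade_bounds):
    """Estimate alpha from log cumulative counts regressed on log size-grade bounds."""
    points = []
    for bound in grade_bounds:
        count = sum(1 for s in sizes if s >= bound)
        if count > 0:  # log(0) is undefined, so empty grades are skipped
            points.append((math.log(bound), math.log(count)))
    xs, ys = zip(*points)
    slope = ols_slope(xs, ys)
    return 1.0 - slope  # slope is roughly -(alpha - 1) under the stated assumption


# Toy data: 16 flakes reach grade 1, 4 reach grade 2, and 1 reaches grade 4.
sizes = [1.5] * 12 + [3.0] * 3 + [5.0]
print(regression_exponent(sizes, grade_bounds=[1.0, 2.0, 4.0]))
```

Because the regression sees only the grade counts, flakes that differ in size within a grade contribute identically, which is one reason the individual-flake likelihood can be more informative.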

Maximum likelihood estimation, using data on the size of each piece of debitage, produces more reliable estimates of the exponent of a power law. The likelihoods of different distributions fit to the same data can also be readily compared, to evaluate whether a power law is the best model to describe debitage size distributions. The next post will illustrate the use of the linear regression approach and the maximum likelihood approach on simulated data drawn from a power law distribution.

Reference cited

Brown, Clifford T.
2001 The Fractal Dimensions of Lithic Reduction. Journal of Archaeological Science 28: 619-631.
