At the invitation of the fine folks at the Center for Regional Heritage Research at Stephen F. Austin State University, I wrote a piece that evaluated Woodland-period radiocarbon dates from East Texas. You can find the results of this work here.
To determine whether the power law distribution provides an appropriate model for flake-size distributions, I obtained experimentally-generated lithic assemblages and fit a number of common distributions to the data. The initial experiments comprised the reduction of cores to produce flakes that could be further shaped into a tool. I used the maximum likelihood method to find the optimal parameter values for each distribution. The following table shows some initial results of this work, focusing on the flake-size data generated during a single core reduction episode.
Results of Fitting Various Distributions to a Single Experimentally-Generated Flake Assemblage
The models selected for comparison comprise a list of distributions commonly used to model heavy-tailed data. Other studies, using different model-fitting approaches, have found that the Weibull distribution fits flake-size distributions. While the maximum likelihood method provides a means by which to fit each of these models to the same data, this approach must be supplemented with some method for comparing the results.
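To make the fitting step concrete, here is a minimal Python sketch that fits several candidate distributions by maximum likelihood using scipy's generic `fit` method. The flake-size data here are simulated stand-ins (drawn from a Pareto with made-up parameters), not the experimental assemblage, and the candidate list is illustrative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical stand-in for an experimental flake-size sample (not the real data)
flake_sizes = stats.pareto.rvs(b=2.6, scale=8.9, size=75, random_state=rng)

candidates = {
    "power law (Pareto)": stats.pareto,
    "lognormal": stats.lognorm,
    "exponential": stats.expon,
    "Weibull": stats.weibull_min,
}

for name, dist in candidates.items():
    # Maximum likelihood estimation, with the location parameter fixed at zero
    params = dist.fit(flake_sizes, floc=0)
    loglik = np.sum(dist.logpdf(flake_sizes, *params))
    print(f"{name}: log-likelihood = {loglik:.1f}")
```

The log-likelihoods produced this way feed directly into the model-comparison step discussed next.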
One approach to model comparison, the Akaike Information Criterion (AIC), comes from information theory. The information-theoretic approach quantifies the expected distance between a particular model and the “true” model. The distance reflects the loss of information that results from using the focal model, which naturally fails to capture all of the variability in the data. The best model among a series of candidates is the one that minimizes this distance or information loss. The true model is, of course, unknown, but the distance can be estimated using some clever math. A full derivation of AIC is beyond my powers to explain in any detail; after much hairy math, involving various approximations, simplifications, and matrix algebra, a very simple formula emerges. AIC quantifies the distance using each model’s likelihood and a penalty term. AIC for a particular model can be defined as follows:
AIC = -2L + 2k, where L is the model’s log-likelihood and k is the number of parameters in the model.
For small samples, a corrected version of AIC – termed AICc – is sometimes used:
AICc = AIC + 2k(k+1)/(n-k-1)
The best-fitting model among a series of candidates will have the lowest value of AIC or AICc. Unlike other approaches, the information-theoretic approach can simultaneously compare any set of models. Likelihood ratio tests, another common model-comparison technique, are limited to pairwise comparisons of nested models; models are nested when the more complex model can be reduced to the simpler model by setting some parameters to a particular value, such as zero. The models compared in this example are not all nested, since the lognormal – for example – cannot be reduced to any of the other models by setting a parameter to a particular value.
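The two formulas above are simple enough to sketch directly; the log-likelihood value in the example below is made up for illustration, not taken from the flake analysis.

```python
def aic(loglik, k):
    """Akaike Information Criterion: -2L + 2k."""
    return -2.0 * loglik + 2.0 * k

def aicc(loglik, k, n):
    """Small-sample corrected AIC: AIC + 2k(k+1)/(n-k-1)."""
    return aic(loglik, k) + (2.0 * k * (k + 1)) / (n - k - 1)

# Hypothetical one-parameter model fit to a sample of 75 flakes
print(aic(-310.2, k=1))         # -> approximately 622.4
print(aicc(-310.2, k=1, n=75))  # slightly larger, reflecting the penalty
```

Because the same data enter every candidate's likelihood, the model with the smallest AIC (or AICc, for small samples) wins the comparison.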
The AIC and AICc for the power law distribution are both lower than the values for any of the other modeled distributions. The power law distribution thus fits this experimentally-produced flake size data better than other common distributions. These preliminary results support the work of Brown (2001) and others.
Note that the best-fitting model among all candidates may still provide a poor fit to the data; the power law distribution could thus still fit these data poorly. A couple of options exist for evaluating model fit. A simple approach would be to plot the data against the theoretical distribution. There is also a way to measure the fit of the model to the data, which I will detail in a subsequent post.
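A numeric version of the plotting check can be sketched as follows: compare the empirical complementary CDF of the sample with the CCDF of the fitted distribution at each observed value. The data are again simulated stand-ins, and scipy's `fit` stands in for whatever routine produced the fits above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical flake-size sample, sorted for the empirical CCDF
data = np.sort(stats.pareto.rvs(b=2.6, scale=8.9, size=75, random_state=rng))

b, loc, scale = stats.pareto.fit(data, floc=0)
# Empirical CCDF: proportion of observations above each sorted value
empirical_ccdf = 1.0 - np.arange(1, len(data) + 1) / len(data)
theoretical_ccdf = stats.pareto.sf(data, b, loc=loc, scale=scale)

# A well-fitting model keeps the two curves close over the whole range
max_gap = np.max(np.abs(empirical_ccdf - theoretical_ccdf))
print(f"largest CCDF discrepancy: {max_gap:.3f}")
```

Plotting both curves on log-log axes makes departures in the tail especially easy to spot.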
© Scott Pletka and Mathematical Tools, Archaeological Problems, 2014
A power law distribution has a heavy tail. The following simulation results depict the form of a power law distribution. In this example, the value of α is 3.6, the minimum value is 8.9, and the sample size is 75. The histogram shows the values that resulted from the simulation run, while the red line shows the theoretical distribution. Note that a few simulation values are much, much larger than the rest. To preview some results to be presented later, the distribution of debitage size in lithic reduction experiments looks similar. Such similarity is suggestive but not definitive.
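A simulation like this can be sketched with inverse-transform sampling: for a continuous power law with exponent α and minimum value x_min, a uniform draw u maps to x = x_min · (1 - u)^(-1/(α - 1)). The snippet below uses the parameter values quoted above; the random seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
alpha, x_min, n = 3.6, 8.9, 75  # values from the example above

# Inverse-transform sampling from a continuous power law
u = rng.random(n)                                   # uniform on [0, 1)
sample = x_min * (1.0 - u) ** (-1.0 / (alpha - 1.0))

# The minimum is bounded below by x_min; the maximum can be far larger,
# which is the heavy tail in action
print(sample.min(), sample.max())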
As discussed previously, linear regression analysis has sometimes been used to evaluate the fit of the power law distribution to data and to estimate the value of the exponent, α. This technique produces biased estimates, which the next simulation results illustrate. In the simulation, a random number generator produced a sample of numbers drawn from a power law distribution at a particular value of α. I then analyzed this artificial data set using the linear regression approach described by Brown (2001) and using maximum likelihood estimation through a direct search of the parameter space. For a simple power law distribution, the maximum likelihood estimate could also be found analytically. I used a direct search approach, however, in anticipation of using this approach for more complex mixture models. I repeated the random number generation and analysis 35 separate times for several different combinations of α and sample sizes. The following histograms show the estimates of α obtained using the linear regression analysis and maximum likelihood estimation. In this particular case, α was set to 3.6 in the random number generator, the minimum value was set to 8.9, and the sample size was 500.
The histograms clearly suggest that the maximum likelihood estimates center closely around the true value of α, while the regression estimates skew to a lower value. The simulations that I performed at other parameter values and sample sizes displayed similar results. Other, more extensive simulation work also supports these impressions (Clauset et al. 2009 provide these results as part of a detailed, comprehensive technical discussion). Consequently, I used maximum likelihood estimation to fit probability distributions to data in the subsequent analyses.
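The estimator comparison can be sketched as below. Two substitutions relative to the post: the maximum likelihood estimate uses the closed-form formula for the simple power law rather than a direct search, and the regression is a least-squares fit of the log CCDF on log size (one of several regression recipes in the literature; details of Brown's version may differ).

```python
import numpy as np

rng = np.random.default_rng(3)
alpha, x_min, n, reps = 3.6, 8.9, 500, 35  # settings quoted in the post

reg_estimates, mle_estimates = [], []
for _ in range(reps):
    # Draw a power-law sample by inverse transform and sort it
    x = np.sort(x_min * (1.0 - rng.random(n)) ** (-1.0 / (alpha - 1.0)))
    # (a) Regression: slope of log CCDF vs. log x estimates -(alpha - 1)
    ccdf = 1.0 - np.arange(n) / n
    slope = np.polyfit(np.log(x), np.log(ccdf), 1)[0]
    reg_estimates.append(1.0 - slope)
    # (b) Closed-form MLE: alpha_hat = 1 + n / sum(log(x / x_min))
    mle_estimates.append(1.0 + n / np.sum(np.log(x / x_min)))

# The MLE average should land near the true value of 3.6
print(np.mean(reg_estimates), np.mean(mle_estimates))
```

Plotting histograms of the two estimate lists reproduces the kind of comparison described above.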
Brown, Clifford T.
2001 The Fractal Dimensions of Lithic Reduction. Journal of Archaeological Science 28:619-631.
Clauset, Aaron, Cosma Rohilla Shalizi, and Mark E. J. Newman
2009 Power-law Distributions in Empirical Data. SIAM Review 51(4):661-703.
© Scott Pletka and Mathematical Tools, Archaeological Problems, 2013
This post begins to explore additional patterning in mound size, refining some of my earlier observations and offering some hypotheses for evaluation. Suppose mound-building groups occupied stable territories over the span of several generations or longer. Within the territory held by such groups, they built burial mounds. Many burial mounds within a given area may thus have been produced by the same group or lineage. Under these circumstances, burial mounds located in close proximity should be more likely to be the product of a single group or lineage. If the group traits that influenced mound volume were also relatively stable through time, burial mounds located near to each other should be similar in size. As an initial attempt to evaluate these claims, I looked at the relationship in mound size between mounds that were nearest neighbors and between randomly-paired mounds.
Recall that most mounds have been affected by modern plowing and other disturbances, but some mounds have largely been spared such damage. The museum records that I used characterized these undamaged mounds as “whole”; the records documented 287 whole mounds. To keep the comparisons fair, I limited the sample of nearest neighbors to those whole mounds that had another whole mound as their nearest neighbor. I eliminated duplicate pairings, so each pair of nearest neighbors was considered only once. These constraints shrank the nearest-neighbor sample to 49 pairs. Finally, I ran a simple linear regression to evaluate the relationship between the sizes of the mounds in these nearest-neighbor pairings. Because the distribution of mound volume can be modeled as an exponential distribution, I used the log of mound volume in the regression analysis. Without this transformation, any relationship in mound size between the nearest neighbors would be unlikely to be well approximated by a straight line.
I then sampled without replacement from the 287 whole mounds to obtain 49 randomly-selected pairs. As with the nearest neighbors, I performed a simple linear regression, using log volume. I repeated this procedure 500 times. The repeated sampling and analysis allowed me to develop a null hypothesis for the values of the regression coefficients.
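The randomization procedure can be sketched as follows. The mound volumes here are simulated from an exponential distribution (with a made-up scale), since the museum data are not reproduced in the post; the sample sizes and repetition count match the description above.

```python
import numpy as np

rng = np.random.default_rng(11)
# Hypothetical stand-in for the 287 whole mounds: exponential volumes, logged
log_volume = np.log(rng.exponential(scale=50.0, size=287))

null_slopes = []
for _ in range(500):
    # 49 random pairs, sampled without replacement within each repetition
    pair = rng.choice(287, size=(49, 2), replace=False)
    x, y = log_volume[pair[:, 0]], log_volume[pair[:, 1]]
    slope = np.polyfit(x, y, 1)[0]  # simple linear regression on log volume
    null_slopes.append(slope)

# For unrelated pairs the slopes should scatter around zero
print(np.mean(null_slopes))
```

The 500 slopes (and the matching intercepts) form the null distribution against which the nearest-neighbor regression is judged.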
I expected that the randomly-selected pairs would not have a meaningful relationship. The slope of the regression line should be close to zero for these samples. In contrast, the size of the nearest neighbor pairs should be positively correlated, so the slope of the regression line should be significantly larger than zero. The following two figures show the distribution of the regression coefficients, the intercept and slope, for the randomly-selected pairs.
Notice, in particular, that the distribution of the slope clusters near zero as predicted. This result indicates that the randomly-selected pairs do not have a meaningful relationship with each other, at least with respect to size.
These distributions contrast with the regression coefficients calculated for the nearest neighbors. The intercept is 0.90, and the slope is 0.75. These values lie completely beyond the range of values estimated for the randomly-selected pairs. This experiment shows that the sizes of nearest neighbors are significantly and positively correlated. The results lend some support to the notion that stable groups produced these mounds. At the very least, the results provide encouragement to further explore the relationship between mound size and mound spatial distribution. Such work should probably make use of the spatial analysis tools available in GIS programs.
© Scott Pletka and Mathematical Tools, Archaeological Problems, 2013.
Monumental architecture, by virtue of its scale, implies something about the organizational capabilities of the groups that produced it. The previous analysis of burial mound size further implies something about the variation in those capabilities. I explore some of those implications at greater length here.
The following thoughts should be considered preliminary. The original goal of this particular analysis was very modest, concerned with establishing a reliable measure of monument scale or prominence. My hope was that mound volume had stayed reasonably constant despite the effects of weathering and other processes. The analysis was being done as part of a project regarding monument function and social organization. While the analysis showed that many mounds lost volume as a result of modern plowing, it also showed that the volume of plowed mounds and whole mounds varied in very similar fashion. Variation in mound volume can be modeled with the exponential distribution. I did not expect this result at the outset.
I have often regarded mounds as potentially reflecting the “strength” of the groups that built them. Group strength might be a function of many different factors, such as group size, the productivity of the territory that the group occupies, the group’s organizational capabilities and the size of the social network upon which the group could call. Groups that scored higher on these variables should have been capable of building larger mounds. Groups that scored lower on these variables should have been limited to building smaller mounds. A large list of qualities could thus contribute to group strength and to burial mound size. I assumed that each factor would have a small additive effect on strength. Consequently, I supposed that variation in group strength and mound size should take the form of a normal distribution.
Clearly, this intuition was wrong. Upon further reflection, I think that I’ve underestimated the contribution of social networks. Their contribution is probably not minor. Ethnographic studies of leadership in small-scale societies illustrate the hard work and emphasis that group leaders often put on the maintenance of their networks. The effect of each additional ally is probably not merely additive, since each ally that gets incorporated has the potential to contribute their own unique allies to the network. Modern studies of social networks indicate that variability among individuals in network size has a heavy-tailed distribution, where most individuals have a relatively small network and a few individuals have very large networks. The mound data is suggestive of similar processes at play.
Before getting too carried away, let me emphasize again that this interpretation is very preliminary. It is, nevertheless, consistent with other archeological evidence for the operation of long-distance exchange networks at the time. The results also illustrate the potential value of this form of statistical modeling. The type of probability model that can be fit to the data, whether normal, exponential, or some other model, reflects the type of processes that operated in the past. The modeling thus constrains the set of possible interpretations that should be considered.
© Scott Pletka and Mathematical Tools, Archaeological Problems, 2013.