In this post, I will highlight some of the technical issues that I encountered while trying to model the variability in the sizes of fish vertebrae from a midden deposit. As described elsewhere, my goal was to distinguish fish caught by nets from fish caught by other gear. The use of these gear types should produce different distributions of fish size. Mixture models are appropriate for cases where variability in a characteristic results from the combination of two or more different distributions. I fit mixture models to the data using the maximum likelihood method. This approach is common in modern statistics but has not been widely employed within archaeology.

The maximum likelihood method addresses the question: “what parameter values make the observed data most likely to occur?” The likelihood is the product of the probabilities of observing each case in the data, given a particular set of parameter values. In practice the log likelihood is usually calculated instead, because log probabilities can be summed: the product of many small probabilities can underflow to zero, while the sum of their logs remains numerically stable. The best parameter estimates are those with the highest likelihood or, equivalently, the lowest negative log likelihood. Algorithms that search the parameter space are typically used to find those values.
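To make the numerical point concrete, here is a minimal sketch in Python (the analysis in this post was done in R, but the issue is language-independent); the per-observation probabilities below are made up for illustration:

```python
import math

# Hypothetical per-observation probabilities under some candidate parameter
# values (invented for illustration): 1,000 smallish probabilities.
probs = [0.02, 0.015, 0.03, 0.01] * 250

# Multiplying many small probabilities underflows to zero in double precision...
likelihood = math.prod(probs)

# ...while summing log probabilities remains numerically stable.
log_likelihood = sum(math.log(p) for p in probs)
neg_log_likelihood = -log_likelihood
```

Minimizing the negative log likelihood is equivalent to maximizing the likelihood, which is why optimization routines usually work with the former.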

To use this method, a probability distribution must be selected that is appropriate for the variability in the data. A simple linear regression, for example, is essentially a maximum likelihood analysis which assumes that the data are normally distributed with a mean of μ = *a* + *b*·*x* and a constant variance of σ^{2}. In this example, the maximum likelihood analysis finds the values of *a*, *b*, and σ that best account for the variability in the data.
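The regression example can be sketched in Python (the toy data, names, and perturbation below are all invented for illustration): under normal errors, the familiar least-squares line is also the maximum likelihood solution, so the negative log likelihood is lowest at the least-squares estimates.

```python
import math

def normal_nll(a, b, sigma, xs, ys):
    """Negative log likelihood of data under y ~ Normal(a + b*x, sigma)."""
    nll = 0.0
    for x, y in zip(xs, ys):
        mu = a + b * x
        nll += 0.5 * math.log(2 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2 * sigma ** 2)
    return nll

# Toy data (assumed): y is roughly 1 + 2x plus noise.
xs = [0, 1, 2, 3, 4, 5]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.2]

# Closed-form least-squares estimates, which coincide with the MLE of a and b.
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
b_hat = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sum((x - xbar) ** 2 for x in xs)
a_hat = ybar - b_hat * xbar
sigma_hat = math.sqrt(sum((y - (a_hat + b_hat * x)) ** 2 for x, y in zip(xs, ys)) / n)

# The least-squares fit has a lower negative log likelihood than a nearby
# (perturbed) set of parameter values.
best = normal_nll(a_hat, b_hat, sigma_hat, xs, ys)
worse = normal_nll(a_hat + 0.5, b_hat, sigma_hat, xs, ys)
```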

The application of a mixture model to my data on fish bone size provides another example of this approach. The mixture model that I used assumes that the size-frequency distribution of fish in each assemblage resulted from the combination of two lognormal distributions. Such distributions are appropriate for these data for a couple of reasons. First, lognormal distributions can have long tails to the right, and the histograms of caudal vertebrae height for my assemblages also have long right tails.
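A two-component lognormal mixture density has the form f(x) = p·f₁(x) + (1 − p)·f₂(x), where p is the proportion of fish in the first mode. The following Python sketch shows that form (the original fitting used R; the parameter values in the test are invented):

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Lognormal density with log-scale mean mu and log-scale sd sigma."""
    if x <= 0:
        return 0.0
    return math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2)) / (
        x * sigma * math.sqrt(2 * math.pi))

def mixture_pdf(x, p, mu1, sigma1, mu2, sigma2):
    """Two-component lognormal mixture: p weights the first (smaller) mode."""
    return p * lognormal_pdf(x, mu1, sigma1) + (1 - p) * lognormal_pdf(x, mu2, sigma2)
```

Because each component integrates to one and the weights sum to one, the mixture is itself a proper probability density.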

Second, consider the average size of those modern fish species that are also found at archaeological sites in the region. A histogram of the average size of these fish also displays a long tail to the right as shown in the following histogram.

This distribution is probably not lognormal. Nevertheless, smaller fish species are clearly more common than larger fish. The distribution of fish sizes from which fishers obtained individual fish was likely affected by the abundance of these fish species, habitat, climate, and other factors. Better data on the modern distribution of fish sizes in my study area are not available, so this discussion will have to suffice for now.

I used the mixdist package for R to find the maximum likelihood estimates of the parameter values, including the proportion of the fish in the two modes, the mean size of fish in each mode, and the standard deviation of each mode. This package uses its own fitting algorithm to arrive at those estimates. I also searched the parameter space directly, writing a simple program in R that loops over a plausible range of values for each parameter and records the likelihood of every combination.
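The direct search can be sketched as follows, here in Python rather than R (the synthetic data, the fixed and shared log-scale standard deviation, and the grid ranges are all simplifying assumptions made to keep the example small):

```python
import math
import random

def lognormal_pdf(x, mu, sigma):
    """Lognormal density with log-scale mean mu and log-scale sd sigma."""
    if x <= 0:
        return 0.0
    return math.exp(-(math.log(x) - mu) ** 2 / (2 * sigma ** 2)) / (
        x * sigma * math.sqrt(2 * math.pi))

def mixture_nll(data, p, mu1, mu2, sigma=0.25):
    """Negative log likelihood of a two-lognormal mixture (shared, fixed sigma)."""
    nll = 0.0
    for x in data:
        density = p * lognormal_pdf(x, mu1, sigma) + (1 - p) * lognormal_pdf(x, mu2, sigma)
        nll -= math.log(density)
    return nll

# Synthetic "assemblage": 75% small-mode fish, 25% large-mode fish.
random.seed(1)
data = ([random.lognormvariate(1.0, 0.25) for _ in range(150)]
        + [random.lognormvariate(2.0, 0.25) for _ in range(50)])

# Loop over a plausible grid of parameter values and keep the best combination.
best = None
for p in [i / 10 for i in range(1, 10)]:
    for mu1 in [0.8 + 0.1 * i for i in range(5)]:
        for mu2 in [1.8 + 0.1 * i for i in range(5)]:
            nll = mixture_nll(data, p, mu1, mu2)
            if best is None or nll < best[0]:
                best = (nll, p, mu1, mu2)
```

Storing the full grid of parameter combinations and their likelihoods, rather than only the minimum, is what makes it possible to see whether the likelihood surface is well behaved.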

The direct search of the parameter space proved to be the more informative approach. I could examine the results to see how the likelihood varied with the parameter values. This examination showed that the likelihood was not converging smoothly: wildly different combinations of parameter values had very similar likelihoods.

The problem turned out to be the large fish at the extreme end of the distributions in my assemblages. Too many of these fish occurred in the samples for the models to readily converge on parameter estimates. These difficulties were largely hidden when I used the mixdist package to fit the mixture models. The very large fish probably belong to a third mode and may therefore have been obtained in a different manner from the techniques used to acquire the fish in the two smaller modes. Once I removed these large fish from the analysis, the likelihood varied smoothly with the parameter values.

© Scott Pletka and *Mathematical Tools, Archaeological Problems*, 2009.

Tags: archaeology, faunal analysis, mathematical modeling in archaeology, middle-level theory, middle-range theory, statistics in archaeology

March 19, 2011 at 8:48 pm |

I find it refreshing to see some statistics being done correctly in archaeology. The social sciences in general butcher statistics for many reasons. I have been told that this is done for practical reasons, as opposed to pursuing something that can give some kind of stable, tentative answer. I suppose this is true when grants are awarded and seemingly “good” answers are required.

I have many valid complaints. Here are a few.

Virtually no one in archaeology can obtain a representative sample, but many act as if they do. What about independence assumptions?

Statistical procedures were developed with small error variability in mind. Scientists address this error and estimate its effects so that the outcomes can be qualified as to precision. This becomes very important when comparing results among similar studies. Systematic error and random error occur through multiple observers who code the data. They occur through inadequate sampling and lack of representativeness. These problems have been brought up many times in archaeology, but they are ignored. Errors are neither analyzed nor published. Some resources in scientific ethics label these omissions as incompetence.

Many misuse statistical procedures on data that do not meet the assumptions of those procedures.

Even when they do use appropriate statistics, they often overlook better and simpler ways to perform the analysis. D. H. Thomas analyzed Great Basin points and used discriminant analysis to distinguish arrow and dart points. I used his data and obtained a slightly better level of prediction using a one-dimensional statistic from the start.

Some have used techniques, such as principal components analysis, without checking for linearity. Even when they do establish linearity among the variables, with very small samples, they are dealing with nonlinear, complex, chaotic phenomena that are immune to prediction. Developing appropriate linear approximations in several equations is not practiced by any archaeologists whom I have known. Even if they did, their approximations might well be moderately good short-term predictors and poor predictors over a long time series.

The ecological inference (statistical) problem is another difficulty that no one addresses.

Constructing arguments from analogy and from the HRAF files is just too inadequate to be of scientific value, due to the ecological problem and the historical nature of archaeology.

The late David Freedman, a statistician and member of the American Academy of Arts and Sciences, wrote many papers about what statistics can and cannot do. These should be required reading for all graduate students in archaeology who would use statistics in their work.

There is much more to be mentioned, but that is all for now.

April 7, 2011 at 10:49 pm |

Archaeologists clearly vary in their mathematical “literacy”. Michael makes some good points. In general, statistical methods have advanced beyond the limits and assumptions posed by classical statistics, but those methods have not yet been widely employed in archaeology. Resampling methods, for example, provide a useful alternative when the assumptions of classical statistics cannot be met or a population cannot be clearly defined.