Posts Tagged ‘mathematical modeling in archaeology’

On Monument Volume I

March 26, 2013

This post introduces an approach for evaluating the original size of round burial mounds. In one of the places where I’ve worked, burial mounds comprise a prominent feature of the landscape, as illustrated in the following photograph.

This prominence may be amenable to explanation through formal high-level theory. Mound size, for example, may reflect the labor used to produce it, suggesting something about the size and organizational capabilities of the group that produced the mound. In order to use this feature of the monuments to evaluate high-level theory, the modern size should be an accurate reflection of the original size.

Such monuments may erode over time, making them less conspicuous and also less reliable as an index of the characteristics of the group that produced them. Natural weathering may take its toll, but modern agricultural practices probably affected burial mounds to a greater extent. Burial mounds were sometimes plowed repeatedly. These modern practices came later to the region where my case study is located, by which time laws protecting the mounds had been enacted. Nevertheless, various processes leveled many mounds, perhaps decreasing their height and increasing their diameter. Despite these depredations, the original volume of the mounds may be preserved.

Mound shape can be modeled as a spherical cap, a geometric form representing the portion of a sphere above its intersection with a plane. Spherical caps are thus dome-shaped. The following figure illustrates a spherical cap. In the figure, h is the height of the dome, a is the radius of the dome’s base, and R is the radius of the sphere.

The total volume of a spherical cap depends on the maximum dome height, h, and on the radius of the circle where the plane intersects the sphere, a. The formula for the volume, V, of a spherical cap is:

$V=(\frac{1}{6})\pi h(3a^2+h^2)$

Importantly, the calculation of the volume of a spherical cap does not depend on the radius of the sphere of which it is a part. The maximum possible original height of a mound, however, should be equal to the radius of that sphere, which is the case when the mound forms a full hemisphere. This height can be calculated by holding the volume constant and finding the value at which the height and the base radius are equal (h = a = R). Subsequent posts will explore these ideas further and play with some data on mound size.
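The calculation described above can be sketched in R. This is only an illustration under the spherical cap assumption: the function names and the example mound dimensions are hypothetical and are not taken from any data set. Setting h = a in the volume formula gives V = (2/3)πh³, which can be solved for h.

```r
# Volume of a spherical cap with height h and base radius a
cap_volume <- function(h, a) (1 / 6) * pi * h * (3 * a^2 + h^2)

# Height at which a cap of volume V is a hemisphere (h = a = R),
# obtained by solving V = (2 / 3) * pi * h^3 for h
max_original_height <- function(V) (3 * V / (2 * pi))^(1 / 3)

# Base radius implied by a given volume and height (the volume formula solved for a)
base_radius <- function(V, h) sqrt((6 * V / (pi * h) - h^2) / 3)

# Hypothetical eroded mound: 1.5 m high with a 10 m base radius
v <- cap_volume(h = 1.5, a = 10)
h_max <- max_original_height(v)

h_max                 # upper bound on the original height, given constant volume
base_radius(v, h_max) # equals h_max, because the cap is then a hemisphere
```

The sketch assumes that erosion redistributes the mound fill without removing it, so that the modern volume equals the original volume.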

© Scott Pletka and Mathematical Tools, Archaeological Problems, 2013.

Mixture Models, Maximum Likelihood Methods, and Confidence Intervals

December 19, 2009

In an earlier post, I noted that the parameter estimates for a mixture model supplied by maximum likelihood methods were only part of the story. A measure of the precision of those estimates is also needed. Confidence intervals provide this measure. This post details a method for determining them.

Packages for the analysis of mixture models in R like mixdist generate confidence intervals automatically. The direct search approach, however, has proven to be more reliable for the data sets that I have been examining. In a direct search, the likelihood is calculated for each combination of parameter values over a plausible range of those parameter values.

The likelihood value of each combination is calculated by looping over a sequence of parameter values for each parameter. The interval between the values of a parameter used in the calculations should be relatively small. When small intervals are used, however, the number of combinations of parameter values for which likelihood values must be calculated increases rapidly. Direct search of the parameter space may not be practical for some applications. The direct search approach requires that a balance be struck between precision and manageability.
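As a rough illustration of how a direct search can yield an interval, the sketch below evaluates a small grid on simulated data and then profiles the negative log likelihood over the mixing proportion. The likelihood-ratio cutoff based on the chi-square distribution is an assumption made here for illustration, not necessarily the method used for the results reported in these posts, and the data, grid ranges, and names such as `mix_nll` are hypothetical.

```r
# Negative log likelihood for a mixture of two lognormal distributions
mix_nll <- function(x, p, mean1, sd1, mean2, sd2) {
  -sum(log(p * dlnorm(x, meanlog = mean1, sdlog = sd1) +
             (1 - p) * dlnorm(x, meanlog = mean2, sdlog = sd2)))
}

# Simulated stand-in data (not real measurements)
set.seed(1)
x <- c(rlnorm(150, 1.0, 0.2), rlnorm(50, 1.5, 0.2))

# Every combination of parameter values; the row count is the product
# of the lengths of the five sequences, so it grows quickly as steps shrink
grid <- expand.grid(p     = seq(0.40, 0.95, by = 0.05),
                    mean1 = seq(0.9, 1.1, by = 0.05),
                    sd1   = seq(0.1, 0.3, by = 0.05),
                    mean2 = seq(1.4, 1.6, by = 0.05),
                    sd2   = seq(0.1, 0.3, by = 0.05))
grid$nll <- mapply(mix_nll, p = grid$p, mean1 = grid$mean1, sd1 = grid$sd1,
                   mean2 = grid$mean2, sd2 = grid$sd2,
                   MoreArgs = list(x = x))

# Profile over p: the smallest negative log likelihood at each value of p
profile_p <- tapply(grid$nll, grid$p, min)

# Assumed cutoff: values of p within qchisq(0.95, 1) / 2 of the minimum
cutoff <- min(profile_p) + qchisq(0.95, df = 1) / 2
ci_p <- range(as.numeric(names(profile_p)[profile_p <= cutoff]))
ci_p
```

The interval can only be as fine as the grid step for p, and it is unreliable if it runs into the edge of the searched range, which is one face of the trade-off between precision and manageability.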

I have provided some R code that can be used, with some additional work, to generate confidence intervals for parameters of a simple mixture model. The mixture model assumes that the data comprises two lognormal distributions. Confidence intervals for the proportion of cases in the first model can be determined from the results produced by the code as written; the code can be modified to generate results relevant to other variables. The code follows:

Identifying and Explaining Intensification in Prehistoric Fishing Practices XII: Specific Hypotheses to Explain Variation in Net-Use

November 28, 2009

The previous post in this series presented some high-level theory that might account for variation in fishing intensification and, thus, net use. This theory will now be tailored for my study area to generate some specific expectations. As noted elsewhere, I will not be dwelling on the details of the site and study area except as necessary to explain how I arrived at particular predictions.

The archaeological assemblages that I have been analyzing derive from a single shell midden site. The analyzed deposit ranges from 15 to 55 centimeters below the ground surface. The site was excavated in five-centimeter arbitrary levels. I have treated each arbitrary level as a distinct analytic unit, which seems reasonable because spatial analysis of radiocarbon dates and chronologically diagnostic artifacts shows very little vertical mixing or movement of artifacts. Poor environmental conditions appear to have occurred during the time period in which three levels, located between 35 and 50 centimeters below the ground surface, were deposited.

The period is characterized by widespread drought. These conditions disrupted settlement at many other sites. My site is one of the few sites in the region to have been continuously occupied during the period. The site lies near the mouth of a creek, an important source of fresh water.

Marine productivity may also have declined during the period of poor environmental conditions. Paleoenvironmental data regarding ocean conditions are complex and not entirely consistent. Proxy data derived directly from archaeological sites within the region, however, show that sea surface temperatures during the period of drought were unusually high. These conditions may have affected the distribution and abundance of fish.

A variety of social and economic responses to the challenges of the period of poor environmental conditions have been documented. Economic specialization in artifact production emerged at my site and across the region. Local manufacturers produced trade goods. In exchange for these goods, these specialists presumably received food and other items that could not be produced locally as easily. Fishers at my site may have responded by changing their fishing strategies. The number of fish caught by the site’s inhabitants seems to peak during the interval of poor environmental conditions before declining.

This observation is consistent with other faunal analyses of the site’s deposits, but it could be attributable to a number of different factors. The peak in density of fish remains could be due to an increase in population at the site during the period of poor environmental conditions. The site may have served as a refuge for groups from elsewhere in this period, since fresh water was more readily available at the site. The increase in the density of fish remains could reflect a more widespread emphasis on fishing by the site’s inhabitants as other foods normally taken by them became less abundant. It could be attributable to increased economic specialization. Fish may have subsidized on-site specialized artifact production. Workers at the site may have specialized in both artifact production and fish procurement, as the local inhabitants had comparative advantages in these activities and exchanged their wares and fish for other goods. The greater density of fish could also be due to some quirk of cultural transmission, as fishers made choices about the appropriate gear to use and effort to undertake based on the work being done by their neighbors. A more detailed examination of the data will allow these possibilities to be distinguished.

© Scott Pletka and Mathematical Tools, Archaeological Problems, 2009.

More Thoughts on Mixture Models and Maximum Likelihood Methods

November 11, 2009

In my previous post on this topic, I discussed two techniques for finding the combination of mixture distribution parameters that has the lowest negative log likelihood: direct search and the mixdist package for R. I suggested that direct search of the parameter space allowed the effects of outliers in my data to be identified more clearly. I have done additional work since that time, comparing direct search and the mixdist package. As a result of this work, I have concluded that direct search is much more effective at finding the optimal combination of parameter values.

Mixdist returned parameter values that consistently produced higher negative log likelihoods (that is, poorer fits) than I found using direct search. The differences were substantial. I cannot fully explain the observed differences, but they were consistent among all of my data sets.

Direct search of the parameter space is obviously not the most convenient approach. My model involved only two lognormal distributions with a total of five parameters. Direct search of the optimal parameters for more complex mixture models may not be feasible, as the number of parameter value combinations that need to be searched is too large. For simple mixture models, however, the direct search may be preferable.

The following code is the very simple program that I wrote for R to find the lowest negative log likelihood and corresponding parameter values. The program was written specifically for a mixture model of two lognormal distributions. The parameter value search space included the range of likely values of the log mean of fish vertebra height and the log standard deviation of fish vertebra height in both of the two distributions. I am not a professional programmer, as you will see, so any suggestions for improving and extending the code will be greatly appreciated.

#vcdata is a numeric vector of the data (vertebra heights) for which the parameter estimates and likelihood will be calculated; it must be defined before the program is run

#pvec provides the sequence of values for the proportion of vertebrae in the mode of smaller (net-caught) fish over which the program loops.

pvec=seq(0.10, 0.99, by=0.015)

#mean1vec provides the sequence of values for the log mean of vertebra height in the mode of smaller (net-caught) fish over which the program loops.

mean1vec = seq(0.80, 1.20, by = 0.03)

#sd1vec provides the sequence of values for the log standard deviation of vertebra height in the mode of smaller (net-caught) fish over which the program loops.

sd1vec=seq(0.02, 0.44, by=0.03)

#mean2vec provides the sequence of values for the log mean of vertebra height in the mode of larger (hook- or spear-caught) fish over which the program loops.

mean2vec=seq(1.25, 1.70, by=0.03)

#sd2vec provides the sequence of values for the log standard deviation of vertebra height in the mode of larger (hook- or spear-caught) fish over which the program loops.

sd2vec=seq(0.02, 0.44, by=0.03)

#loglike is the negative log likelihood, which is calculated for each combination of parameter values.

loglike=100000

#p stores the parameter value for the proportion of vertebrae in the mode of smaller (net-caught) fish.

p=0

#mean1 stores the parameter value for the log mean of vertebra height in the mode of smaller (net-caught) fish.

mean1=0

#sd1 stores the parameter value for the log standard deviation of vertebra height in the mode of smaller (net-caught) fish.

sd1=0

#mean2 stores the parameter value for the log mean of vertebra height in the mode of larger (hook- or spear-caught) fish.

mean2=0

#sd2 stores the parameter value for the log standard deviation of vertebra height in the mode of larger (hook- or spear-caught) fish.

sd2=0

#the result data frame returns the value of the log likelihood at a particular combination of parameter values

result<-data.frame(loglike, p, mean1, sd1, mean2, sd2)

#the finalresult data frame stores the log likelihood and parameter values for the combination of parameter values that returns a log likelihood that is smaller than all other log likelihoods generated.

finalresult<-data.frame(loglike, p, mean1, sd1, mean2, sd2)

#looping over the combination of parameter values

for (j in 1:length(mean1vec)) {

for (k in 1:length(pvec)) {

for (m in 1:length(sd1vec)) {

for (n in 1:length(mean2vec)){

for (q in 1:length(sd2vec)){

#the following function returns the negative log likelihood value at a particular combination of parameter values

L= -sum(log(pvec[k]*(dlnorm(vcdata, meanlog=mean1vec[j], sdlog=sd1vec[m]))+(1-pvec[k])*(dlnorm(vcdata, meanlog=mean2vec[n], sdlog=sd2vec[q]))))

result$loglike=L

result$p=pvec[k]

result$mean1=mean1vec[j]

result$sd1=sd1vec[m]

result$mean2=mean2vec[n]

result$sd2=sd2vec[q]

#the following comparison stores the lowest log likelihood and the corresponding combination of parameter values seen up to this point

if (L<finalresult$loglike) finalresult=result

}

}

}

}

}

finalresult

When I employed the direct search, I ran it twice for each data set. The first time, I looped over a wider range of parameter values. The sequence of parameter values searched within each variable was spaced sufficiently far apart so the direct search would not bog down. The one exception was the proportion of fish vertebrae in each mode. For this variable, the step-size between the values of proportion for which I calculated the log likelihood was fairly small from the start. Experience with my data sets showed that this variable had the biggest effect on the log likelihood.

The second time that I ran the direct search for each data set, I focused on a narrower range of parameter values. The range of parameter values that I searched centered around the values found in the initial run. The sequence of values searched within each variable was spaced closer together in the second run.
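The two-pass procedure described above might be organized along the following lines. This is a sketch on simulated data, not the program used for the reported results: the helper names (`mix_nll`, `grid_search`, `around`), the grid ranges, and the step sizes are all hypothetical choices made for illustration.

```r
# Negative log likelihood for a mixture of two lognormal distributions
mix_nll <- function(x, p, mean1, sd1, mean2, sd2) {
  -sum(log(p * dlnorm(x, meanlog = mean1, sdlog = sd1) +
             (1 - p) * dlnorm(x, meanlog = mean2, sdlog = sd2)))
}

# Evaluate every row of a parameter grid and return the best-fitting row
grid_search <- function(x, grid) {
  grid$nll <- mapply(mix_nll, p = grid$p, mean1 = grid$mean1, sd1 = grid$sd1,
                     mean2 = grid$mean2, sd2 = grid$sd2,
                     MoreArgs = list(x = x))
  grid[which.min(grid$nll), ]
}

# Closely spaced values centered on a first-pass estimate, kept within bounds
around <- function(center, half_width, by, lower = -Inf, upper = Inf) {
  v <- seq(center - half_width, center + half_width, by = by)
  v[v > lower & v < upper]
}

# Simulated stand-in for vcdata (not real measurements)
set.seed(42)
x <- c(rlnorm(300, 1.0, 0.15), rlnorm(100, 1.5, 0.15))

# First pass: wide ranges, coarse steps (finer steps for the proportion)
coarse <- expand.grid(p     = seq(0.10, 0.95, by = 0.05),
                      mean1 = seq(0.8, 1.2, by = 0.1),
                      sd1   = seq(0.05, 0.45, by = 0.1),
                      mean2 = seq(1.3, 1.7, by = 0.1),
                      sd2   = seq(0.05, 0.45, by = 0.1))
best1 <- grid_search(x, coarse)

# Second pass: narrow ranges centered on the first-pass values, finer steps
fine <- expand.grid(p     = around(best1$p, 0.05, 0.01, lower = 0, upper = 1),
                    mean1 = around(best1$mean1, 0.1, 0.025),
                    sd1   = around(best1$sd1, 0.1, 0.025, lower = 0.001),
                    mean2 = around(best1$mean2, 0.1, 0.025),
                    sd2   = around(best1$sd2, 0.1, 0.025, lower = 0.001))
best2 <- grid_search(x, fine)

rbind(first_pass = best1, second_pass = best2)
```

Building the grid with `expand.grid` replaces the five nested loops with one table of combinations, at the cost of holding the whole table in memory.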

Additional code, of course, needs to be written in order to determine the standard errors of the parameter estimates.


Identifying and Explaining Intensification in Prehistoric Fishing Practices XI: High-level Theories for the Explanation of Variation in Net Use

November 1, 2009

The ten previous posts in this series developed the middle-level theory needed to quantify the intensification of fishing. This theory linked archaeological data to interpretations of past human behavior. Fishing effort intensifies as fishers rely more extensively on nets to catch fish. Net use can be identified and distinguished from the use of other gear as a distinctive mode within a size-frequency distribution of fish bone size. To explain variation in fishing practices, high-level theory must be developed. For this theory, I turn to some formal economic models.

Formal economic theory provides expectations for the relationships among fish size, net use, and environmental conditions. Many different kinds of formal economic theory exist, and some of this diversity will be explored in the following paragraphs. I begin with a discussion of a technological intensification model. This model is both simple and directly relevant for understanding the decisions made by fishers faced with a choice of gear to use.

Recall that nets are more expensive to produce compared to other types of fishing gear. These fixed costs affect the circumstances under which different gear was employed. This insight can be formalized in a model for technological intensification, following Bettinger et al. (2006: 541). Let:

$m_i$ = the hours required to manufacture a particular gear type, $i$, such as nets or spears;

$f_i(m_i)$ = the return for using the $i$th gear type (in kcal or some other currency) as a function of gear manufacturing time; and

$p$ = the hours spent procuring fish.

Assume that nets are more expensive to produce than another type of gear (i.e., $m_{nets} > m_{other}$) and that nets provide a greater return [$f_{nets}(m_{nets}) > f_{other}(m_{other})$]. Nets would never be adopted if they were both more expensive and had a lower return rate. Under these assumptions, fishers will adopt nets when:

$\frac{f_{nets}(m_{nets})}{m_{nets}+p} > \frac{f_{other}(m_{other})}{m_{other}+p}$

The foregoing equation can be used to derive the length of time that fishers would have to be engaged in procurement for nets to produce better returns than another gear type. This threshold is given by the following equation:

$p=\frac {[f_{other}(m_{other}) \cdot m_{nets}]-[f_{nets}(m_{nets}) \cdot m_{other}]}{f_{nets}(m_{nets})-f_{other}(m_{other})}$

Some useful insights can be derived from the model. Because nets are expensive to make or acquire, they have to be used extensively in order for the benefits to outweigh the costs. Fishers should prefer to use nets once some threshold level of fishing effort has been reached. The model cannot, however, explain why fishers might choose to fish extensively. Declining environmental conditions would seem a plausible reason to redouble fishing efforts. Despite the plausibility of this intuition, any drop in environmental productivity that affects the return rates of different gear types to the same extent does not alter the threshold value of fishing effort. Any constant that changes the value of return rates equally can be divided out of the model. The model was not intended to evaluate the effects of such factors on investment in technology.
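A small numerical sketch may make the threshold and the scaling argument more concrete. The function name `threshold_hours` and all of the input values are hypothetical and chosen only so that the arithmetic is easy to follow; they are not estimates for any real gear.

```r
# Threshold procurement time at which nets and another gear type give equal returns
threshold_hours <- function(f_nets, f_other, m_nets, m_other) {
  # The model assumes nets cost more to make and provide a greater return
  stopifnot(f_nets > f_other, m_nets > m_other)
  (f_other * m_nets - f_nets * m_other) / (f_nets - f_other)
}

# Hypothetical values: returns in kcal, manufacturing time in hours
p_star <- threshold_hours(f_nets = 3000, f_other = 1000, m_nets = 40, m_other = 2)
p_star  # 17

# At the threshold the two sides of the adoption condition are equal
3000 / (40 + p_star)
1000 / (2 + p_star)

# Halving both returns (a uniform drop in productivity) leaves the threshold unchanged
threshold_hours(f_nets = 1500, f_other = 500, m_nets = 40, m_other = 2)  # 17
```

With these numbers, nets give the better return only when more than 17 hours are spent procuring fish, whatever the common scaling of the two returns.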

The model also assumes that the costs of finding, chasing, and processing fish are constant across gear types (Bettinger et al. 2006: 541), so it does not explicitly include them. Like return rates, the values of these variables are likely to be affected by environmental changes. The technological investment model can be revised slightly to show the effect of search and pursuit costs.

The revision will allow variability in the abundance of fish to be evaluated. To extend the model, let:

o = the hours spent searching and pursuing prey.

Note that in this revision, as in the original technological intensification model, the hours spent searching and pursuing do not vary among technologies.

With this addition to the original model, the condition under which fishers would adopt nets is:

$\frac{f_{nets}(m_{nets})}{m_{nets}+p+o} > \frac{f_{other}(m_{other})}{m_{other}+p+o}$

The threshold number of hours spent in procurement at which the return rate of nets is the same as another gear type is thus given by:

$p=\frac {[f_{other}(m_{other}) \cdot (m_{nets}+o)]-[f_{nets}(m_{nets}) \cdot (m_{other}+o)]}{f_{nets}(m_{nets})-f_{other}(m_{other})}$

The revised model does not have qualitatively different implications for the adoption of different gear types. The revised model does show that an increase in the hours spent searching for fish would reduce the time at which nets would be preferred relative to another gear type, given the assumption that nets have a higher return rate. An environmental change that altered the abundance of fish, increasing search times, could lead to greater use of nets.
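The size of this reduction can be worked out by comparing the two thresholds directly. Writing $p_0$ for the threshold from the original model (a symbol introduced here only for this illustration), the numerator of the revised threshold can be regrouped as:

$[f_{other}(m_{other}) \cdot (m_{nets}+o)]-[f_{nets}(m_{nets}) \cdot (m_{other}+o)] = [f_{other}(m_{other}) \cdot m_{nets}]-[f_{nets}(m_{nets}) \cdot m_{other}]-o \cdot [f_{nets}(m_{nets})-f_{other}(m_{other})]$

Dividing both sides by $f_{nets}(m_{nets})-f_{other}(m_{other})$ gives:

$p = p_0 - o$

Under the model's assumptions, then, each additional hour of search and pursuit lowers the threshold procurement time by one hour, so the sum $p+o$ at the threshold stays fixed.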

Poor conditions may have other observable effects on archeological fish assemblages. Fish size may be sensitive to climate and to predation. The prey choice model speaks to the relationship between fish size and fishing practices. This model may therefore provide a better context for understanding net use.

Prey choice models show that foragers who seek to optimize their returns should preferentially take certain kinds of prey. All other things being equal, fishers using hook-and-line or spears should focus their efforts on prey that is large, readily caught, and easily processed. Such prey provides a greater return for the effort expended. Archaeological applications of prey choice models typically assume that larger prey is preferred to smaller prey. In these applications, the cost of handling and processing larger prey is presumed to not be commensurately bigger as well. When large prey is abundant, fishers will forgo opportunities to catch other types of fish. Fishers will become less selective, however, as the density of preferred prey decreases.

Thus, fishers should target large fish, unless such fish become scarce due to overexploitation, reduction in favorable habitat, poor marine productivity, or other circumstances. Fishers will still take large fish whenever they are available, even as those fish become less abundant. They should just be more willing to take smaller fish in the face of scarcity. Mean caudal vertebra height among fish caught by hook or by spear may thus serve as an index of environmental conditions. Fish size should also be correlated with other environmental indices.

The discussion of the prey choice and technological intensification models can now be integrated to provide additional predictions. Net use provides better returns than other gear only if fishing effort exceeds a threshold number of hours, in order to offset the high costs of making those nets. The threshold number of hours is the same for all fishers, so these fishers should respond identically when faced with changes in search costs or gear production costs. Shifts in environmental conditions that decrease fish abundance and increase search costs will lower the threshold number of hours for all fishers. The frequency of net use may only change (but change rapidly) once environmental perturbations have altered this threshold value sufficiently. Using mean fish size among hook- and spear-caught fish as an index of environmental conditions, net use may only change once mean fish size has reached certain levels. Net use may predominate when mean fish size reaches a particularly low value, and it may be rare when mean fish size attains a particularly high value.

In practice, however, individuals may vary in return rates and in their opportunity costs to using various types of gear. Children, for example, may be better suited for simple hook-and-line fishing than for the production and use of large nets. This variability may engender a more gradual response among fishers to changing conditions. Some fishers may be quite sensitive to environmental changes and quickly switch to nets, while other individuals may not be so sensitive. Mean fish size among hook- and spear-caught fish, again serving as an index of environmental conditions, may therefore be negatively correlated with net use, provided that the assumptions of these models hold true.

The technological intensification and prey choice models employ a number of assumptions. The models assume, for example, that individuals possess perfect information about their environment and the return rates to fishing with various gear types. The models also assume that the relevant currency is the nutrients that the prey would provide upon consumption. To the extent that these models fail to fit particular real-world cases, other models that use different assumptions should be explored.

Suppose, for example, that fish are valuable to fishers as a good to be exchanged for other products. The value of fish would thus be a function of both their return rate and the demand for fish among consumers. To the extent that fishers have a comparative advantage in fish procurement and demand for fish is relatively strong, net fishing may be worthwhile even if it is costly. The high cost of fish procurement would be offset by the goods received in exchange under these circumstances.

Specialized fishing need not have developed for fishing to be affected by the emergence of exchange systems. Fishing and the specialized production of other goods could be alternative strategies for the acquisition of desirable products. Under this scenario, fishing poses an opportunity cost to other specialized production. Fishers may therefore be less inclined to spend large amounts of time fishing if they can more easily satisfy their needs through the production and exchange of other goods. While the microeconomic theory underlying both of these proposals is well-established, archeological evidence for the operation of such processes may be less obvious.

If fishing develops into a specialized activity, the total amount of fish caught ought to be positively correlated with other evidence for the volume of exchange. Net use may increase dramatically once some threshold volume of exchange has been reached, as the number of hours spent fishing increases to the point at which net use becomes viable. Alternatively, net use may increase more gradually as exchange grows in importance due to variability among fishers regarding the threshold value at which they would adopt nets.

If fishing is a lesser alternative to the specialized production of other goods, net use may be negatively correlated with evidence for the volume of exchange. Net use may then drop precipitously once a threshold volume of exchange has been attained. Of course, net use may decline more gradually due to the same variability among fishers regarding the threshold level of effort that has been discussed previously.

Like the technological intensification and prey choice models, these microeconomic models of net use assume that fishers have perfect information about return rates, environmental conditions, and demand for fish. Assumptions of this sort may be appropriate as an approximation for simple adaptive problems. Information may, however, be very difficult to gather or evaluate. Return rates for the use of different gear types and search costs may be difficult to estimate, for example. Experimental studies, ethnographic evidence, and theoretical considerations suggest that individuals acquire relatively few norms through their own trial-and-error.

Models of cultural transmission allow the effects of imperfect information to be incorporated. Individuals acquire many of their norms through a mechanism of cultural transmission that includes some type of imitation. Most fishers may prefer to take their cues about the type of gear to employ from someone else, like a particularly successful fisher. Transmission rules of this sort can lead to the spread of adaptive norms.

The utility of explicit models of cultural transmission often lies in their ability to account for cases where culture change appears maladaptive or unrelated to adaptation. Simpler economic models may provide an adequate account of shifts in adaptive norms when such changes have obvious adaptive consequences, as may be the case with many changes in subsistence. Models like the prey choice and technological intensification models usefully draw connections between key variables such as diet breadth and environmental conditions. They do not address the processes by which norms regarding subsistence behavior change. The details of these processes can be significant for understanding changes with less obvious adaptive consequences.

Cultural transmission models may therefore provide insight into cases where change in subsistence does not conform to the predictions of the prey choice or technological intensification models. In many cultural transmission models, random factors like sampling effects and imperfect copying work alongside more focused imitation processes to elevate (or decrease) the popularity of particular cultural traits. Individuals, for example, may select a subset of the available population before choosing the “best” model or set of models to copy. These random factors work most powerfully among small groups. Sampling effects within small groups can eventually cause a particular variant to predominate within the group. Such changes typically occur only after many false starts with substantial swings in the frequency of the trait within the population. The operation of sampling effects may thus be identifiable as a pattern of gradual change that does not closely correspond to other environmental or economic trends.

Other effects resulting from the mechanics of cultural transmission are possible. Cultural transmission processes may sometimes lead to the development of exaggerated cultural traits featured in prestige competition or to within-group homogeneity and among-group heterogeneity in certain characteristics. The applicability of alternative models depends on the details of a particular case.


Identifying and Explaining Intensification in Prehistoric Fishing Practices X: Quantifying the Relative Importance of Netted Fish

October 23, 2009

In the previous post in this series, I showed that mixture models of two lognormal distributions provide a reasonable fit to the size-frequency data for fish caudal vertebrae from my midden assemblages. I have interpreted these distributions as the result of the use of nets and other gear types. Nets should take smaller fish than other gear types, and both nets and other gear types may also take large fish. During the course of my model-fitting, I determined that a few vertebrae in each assemblage were too large to fit the mixture model. Such fish were excluded from the data used to fit the mixture models. These very large fish may be attributable to a gear type other than those used to take the other fish. The mixture models show that most fish, quantified in terms of minimum number of individuals, were caught by nets in each assemblage. Archaeological analysis should not just end, however, with an identification of the number of fish in an assemblage that were caught by nets or by other gear.

The overall contribution to the diet of fish caught by nets in comparison to fish caught by other gear is of particular interest. Smaller fish produce a lower return on the work invested in fishing, all other things being equal. Many small fish may have to be caught to provide the contribution to the diet that a single, large fish would provide. The proportion of “net-caught” or “hook/spear-caught” fish bone in an assemblage thus does not by itself accurately reflect that return.

This contribution can be determined by calculating the total live weight of fish represented by the modeled distributions of net-caught fish and fish caught with other gear. The total live weight of fish in an assemblage has a more obvious relationship to the potential dietary contribution of the fish than the count of those fish. The positive correlation between caudal vertebra height and fish live weight allows these amounts to be inferred, using a simple transformation of the data.

The total live weight of “net-caught” and “hook/spear-caught” fish can be calculated from the mixture model results. The mixture model provides parameters that can be employed to create an idealized size-frequency distribution for each population. These distributions are scaled using the inferred number of fish from each population and the live weight equation. Remember that the relationship between live weight of fish and caudal vertebrae height can be represented by the following equation:

$y=4.54x^{2.77}$

where y is the live weight of the fish, x is the caudal vertebra height, and the parameters were estimated from modern data.

The next equation illustrates the calculation of total live fish weight from one of the modeled distributions represented in an assemblage, where N is the inferred number of fish in the assemblage that belongs to that population and f(x, µ, σ) represents the lognormal probability density function:

$wt=\int_0^{14.2}(4.54x^{2.77})(N)f(x, \mu, \sigma)\,\mathrm{d}x \,$

The parameters N, µ, and σ are estimated from the maximum likelihood analysis of the mixture models.
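The integral can be evaluated numerically in R. The following is a sketch only: the function names are hypothetical, and the values of N, µ, and σ are made up for illustration rather than taken from the fitted models.

```r
# Live weight as a function of caudal vertebra height, using the power law above
live_weight <- function(x) 4.54 * x^2.77

# Total live weight represented by one lognormal component of the mixture,
# integrated from zero up to the maximum observed vertebra height
total_weight <- function(N, mu, sigma, upper = 14.2) {
  integrand <- function(x) live_weight(x) * N * dlnorm(x, meanlog = mu, sdlog = sigma)
  integrate(integrand, lower = 0, upper = upper)$value
}

# Hypothetical parameter values for the two components
wt_small <- total_weight(N = 120, mu = 1.0, sigma = 0.2)  # smaller (net-caught) fish
wt_large <- total_weight(N = 30, mu = 1.5, sigma = 0.2)   # larger (hook- or spear-caught) fish

# Proportion of the combined weight contributed by the smaller fish
wt_small / (wt_small + wt_large)
```

Because weight rises steeply with vertebra height, a component with fewer but larger fish can account for a disproportionate share of the total weight.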

The scaled distributions are integrated over the range of observed vertebra heights to obtain the total weight of fish. For this study, the equation was integrated from zero to the maximum observed caudal vertebra height among all of the assemblages, which was 14.2 mm. Remember that some fish vertebrae were so large that they were considered outliers and possibly part of a third mode, caught using a different technique than was used to catch fish from the other two distributions. The weight represented by these very large fish was calculated directly from the live weight (power law) equation. Once these mathematical operations have been completed for both of the populations that comprise the mixture distribution, assemblages can be compared for patterns in the amount of fish caught by net and by other gear. The following table provides the results.

Total weight of fish by distribution from each level

The table shows the weight of net-caught fish from the distribution of smaller fish, the weight of larger fish from the second distribution, and the weight of very large fish. The weight of net-caught fish was compared to the combined weight of other fish to calculate the proportion of net-caught fish in each assemblage. These results contrast with my earlier results based on the number of fish from these distributions. Net-caught fish are much less important by weight in all of the assemblages.

At this point, sufficient middle-level theory has been developed to consider the variation among these assemblages. High-level theory provides possible explanations for the patterns observed through the application of middle-level theory. In subsequent posts in this series, I will present and apply some formal high-level theories that may explain variation in the intensification of fishing.

© Scott Pletka and Mathematical Tools, Archaeological Problems, 2009.

Identifying and Explaining Intensification in Prehistoric Fishing Practices IX: Model Evaluation and Parameter Estimates

October 16, 2009

The previous post in this series introduced the use of mixture models to fit two lognormal distributions to my data on the frequencies of fish caudal vertebra sizes. I have also discussed some technical issues that I encountered while fitting the models. This post presents some results. The following table gives the estimated parameter values for the two lognormal distributions, including the proportion of the assemblage comprised by fish from each distribution, the log means of the two distributions, and the log standard deviations of the two distributions.

Parameter Estimates for Mixture Models

The log mean and log standard deviation parameters describe the distribution of caudal vertebra height for the fish bone in the modes of smaller (net-caught) and larger (hook- or spear-caught) fish. The estimates of all the parameters also have associated standard errors, but I am still calculating those errors. They will be reported in a later installment in the series.

A future post in the series will also present several theories by which these estimates might be interpreted. For now, I will make a few general observations. Note that the standard deviation of the distribution of smaller fish is consistently greater than the standard deviation of the distribution of larger fish. The standard deviation is scale-dependent, but I can offer an explanation for this observation without standardizing the standard deviations. As explained elsewhere, nets should capture both small and large fish. Small fish were likely more common than large fish. Thus, the distribution of net-caught fish should have a small mean and a relatively large standard deviation. The distribution of fish caught by hook or by spear should have a larger mean and a smaller standard deviation. Hooks and spears would not be likely to catch fish smaller than some threshold size. The estimates support these assumptions. The estimated proportion of net-caught fish is also consistently higher than the proportion of fish caught with other gear. Some variability exists, however, and this variability may be significant.

Having fit the models, I must now resolve another issue: whether these models provide an appropriate fit to the data. In particular, I need to evaluate whether a simpler model might also explain the observed patterns. In this case, an example of a simpler model than the mixture model of two lognormal distributions might be a single lognormal distribution. I fit such single lognormal distributions to the size data from each assemblage using the maximum likelihood method. The negative log likelihoods of the mixture models were consistently lower (showing that they fit the data better) than the negative log likelihoods of the single lognormal distribution models, but this result is not surprising.

Models with many parameters can generally be made to fit data better than models with fewer parameters. Models with fewer parameters, however, should generally be preferred to models with more parameters, following the principle that simpler explanations are better than more complex explanations. Models with many parameters may also be worse at predicting the variability in new data sets. In essence, more complex models may be finely tuned to match the particular, random factors that affected one data set. The next data set will have been affected by those random factors differently. Thus, a simpler model that does not try to “explain” random variation may often do better at predicting additional data. Such models focus on the deterministic factors that pattern variation. These observations motivate a formal comparison of the mixture models with the simpler single-distribution models.

The likelihood ratio test provides a way to compare “nested” models. Models are nested when more complex models can be reduced to simpler models by setting parameters to particular values. In the case of my fish data, I can reduce my mixture model of two lognormal distributions to a single lognormal distribution by setting p=1 or p=0. Recall that p is the proportion of fish in the assemblage that derive from the distribution of mainly smaller (and presumably net-caught) fish.

As the name implies, the likelihood ratio test compares the likelihood values of a complex model and a simpler model. A theorem states that, in large samples, twice the difference between the log likelihoods of the two models (equivalently, minus twice the log of the likelihood ratio) approximately follows a chi-square distribution with degrees of freedom equal to the difference in the number of parameters between the models being compared. Using this theorem, I want to know whether the more complex model improves the fit to the data enough to justify the added complexity. The following table shows the results of this analysis for my mixture models and the corresponding single lognormal distribution models.

Likelihood Ratio Test Results for Mixture Model and Single Lognormal Distribution
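The calculation behind each row of such a table can be sketched in Python. This is an illustration under stated assumptions rather than the code used for the analysis: the function names are hypothetical, the negative log likelihood values in the example are made up, and the chi-square reference distribution is a large-sample approximation. The mixture of two lognormal distributions has five parameters (p, two log means, two log standard deviations) and the single lognormal has two, so the test has three degrees of freedom.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability of a chi-square distribution with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term = 1.0
        total = 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 case and add one term per step of 2 in df.
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total

def likelihood_ratio_test(nll_simple, nll_complex, extra_params):
    """Return the test statistic and its approximate p-value.

    nll_simple and nll_complex are negative log likelihoods of the
    nested (simpler) model and the more complex model.
    """
    statistic = 2.0 * (nll_simple - nll_complex)
    return statistic, chi_square_sf(statistic, extra_params)

# Made-up negative log likelihoods, for illustration only.
statistic, p_value = likelihood_ratio_test(120.0, 115.0, extra_params=3)
print(statistic, p_value)
```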

The results provide some support for the use of the mixture models on my data. Many of the p-values for my likelihood ratio tests exceed the arbitrary 0.05 value often employed in studies, although some are lower than this value. Notice that the p-values are generally lower when the sample size is higher. The following scatterplot illustrates this relationship.

Sample size of fish bone and p-value for likelihood ratio tests

P-values often reflect such sample size effects. In addition, no universal threshold exists at which a p-value can be said to be truly “significant”. For these reasons, I am comfortable applying the mixture models to all of my assemblages. The mixture models seem sufficiently better at explaining the variability among all of the assemblages to justify the added complexity. I intend to use the mixture models to determine the importance of net-caught fish in each assemblage.


Mixture Models and Maximum Likelihood Methods

October 7, 2009

In this post, I will highlight some of the technical issues that I encountered while trying to model the variability in the sizes of fish vertebrae from a midden deposit. As described elsewhere, my goal was to distinguish fish caught by nets from fish caught by other gear. The use of these gear types should produce different distributions of fish size. Mixture models are appropriate for cases where variability in a characteristic results from the combination of two or more different distributions. I fit mixture models to the data using the maximum likelihood method. This approach is common in modern statistics but has not been widely employed within archaeology.

The maximum likelihood method addresses the question: “what are the parameter values that make the observed data most likely to occur?” The parameter estimates can be determined from the corresponding likelihood value. The likelihood is calculated from the product of the probability of observing each case in the data given a particular set of parameter values. In practice the log likelihood is usually calculated because the log probabilities can be summed. Calculating the product of many small numbers can be computationally more difficult than summing the log of those numbers. The best parameter estimates have the highest likelihood value or the lowest negative log likelihood value. Algorithms that search the parameter space are typically used to determine those values.
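As a minimal sketch of these ideas, the following Python code computes the negative log likelihood of a single lognormal distribution by summing log probabilities. The vertebra heights are hypothetical values invented for the example, and the function names are placeholders. For a single lognormal distribution no search is needed, because the maximum likelihood estimates are simply the mean and standard deviation of the logged data.

```python
import math

def lognormal_nll(data, mu, sigma):
    """Negative log likelihood of a lognormal with log mean mu and log sd sigma."""
    nll = 0.0
    for x in data:
        z = (math.log(x) - mu) / sigma
        # Minus the log of the lognormal density at x.
        nll += math.log(x * sigma * math.sqrt(2.0 * math.pi)) + 0.5 * z * z
    return nll

def lognormal_mle(data):
    """Closed-form maximum likelihood estimates for a single lognormal."""
    logs = [math.log(x) for x in data]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma

# Hypothetical vertebra heights in mm, for illustration only.
heights = [1.8, 2.1, 2.4, 2.6, 3.0, 3.3, 3.9, 4.7, 5.5, 7.2]
mu_hat, sigma_hat = lognormal_mle(heights)
best_nll = lognormal_nll(heights, mu_hat, sigma_hat)
print(mu_hat, sigma_hat, best_nll)
```

Any other pair of parameter values gives a negative log likelihood at least as large as `best_nll`, which is what "best parameter estimates" means here.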

To use this method, a probability distribution has to be selected that is appropriate for the variability in the data. A simple linear regression, for example, is essentially a maximum likelihood analysis which assumes that the data are normally distributed with a mean of μ = a + bx and a variance of σ². In this example, the maximum likelihood analysis finds the values of a, b, and σ that best account for the variability in the data.
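One way to see the connection is to write out the log likelihood for the regression example. This is a sketch under the usual assumption of independent, normally distributed errors; the index i and the sample size n are notation introduced here.

$\ln L(a,b,\sigma)=-\frac{n}{2}\ln(2\pi\sigma^2)-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-a-bx_i)^2$

For any fixed value of σ, the log likelihood is largest when the sum of squared residuals is smallest, so the maximum likelihood estimates of a and b coincide with the least-squares estimates.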

The application of a mixture model to my data on fish bone size provides another example of this approach. The mixture model that I used assumes that the size-frequency distribution of fish in each assemblage was a result of the combination of two lognormal distributions. Such distributions are appropriate to the data for a couple of reasons. First, those distributions can have long tails to the right, and the histograms of caudal vertebrae height for my assemblages also have long tails.

Histograms of Caudal Vertebra Height (mm) by Level from the Midden Deposit

Second, consider the average size of those modern fish species that are also found at archaeological sites in the region. A histogram of the average size of these fish also displays a long tail to the right as shown in the following histogram.

Average live weight of modern fish species in study region

This distribution is probably not lognormal. Nevertheless, smaller fish species are clearly more common than larger fish. The distribution of fish sizes from which fishers obtained individual fish was likely affected by the abundance of these fish species, habitat, climate, and other factors. Better data on the modern distribution of fish size for my study area are not available, so this discussion will have to be sufficient for now.

I used the mixdist package for R to find the maximum likelihood estimate of the parameter values, including the proportion of the fish in the two modes, the mean size of fish in each mode, and the standard deviation of each mode. According to its documentation, this package combines a Newton-type method with the EM algorithm to arrive at those estimates. I also searched the parameter space directly, writing a simple program in R to loop over the plausible range of values for my parameters and find the best parameter estimates.

The direct search of the parameter space proved to be the most informative approach. I could examine the results to see how the likelihood varied with parameter values. Examination of this variation showed that the likelihood value was not converging smoothly with those parameter values. Wildly different combinations of parameter values had very similar likelihoods.
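A direct search of this kind can be sketched as follows. The original program was written in R and is not reproduced here; this Python version is an illustration only, with synthetic data, coarse hypothetical grids, and placeholder function names. It evaluates the negative log likelihood of the two-lognormal mixture at every grid combination and keeps all of the results, so that near-ties among very different parameter combinations can be inspected.

```python
import itertools
import math
import random

def lognormal_pdf(x, mu, sigma):
    """Lognormal density with log mean mu and log standard deviation sigma."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def mixture_nll(data, p, mu1, sigma1, mu2, sigma2):
    """Negative log likelihood of a two-component lognormal mixture."""
    return -sum(
        math.log(p * lognormal_pdf(x, mu1, sigma1)
                 + (1.0 - p) * lognormal_pdf(x, mu2, sigma2))
        for x in data
    )

def grid_search(data, p_grid, mu_grid, sigma_grid):
    """Evaluate every grid combination with mu1 < mu2; return sorted results."""
    results = []
    for p, mu1, s1, mu2, s2 in itertools.product(
            p_grid, mu_grid, sigma_grid, mu_grid, sigma_grid):
        if mu1 >= mu2:
            continue  # keep component 1 as the smaller-fish mode
        nll = mixture_nll(data, p, mu1, s1, mu2, s2)
        results.append((nll, (p, mu1, s1, mu2, s2)))
    results.sort()  # lowest negative log likelihood first
    return results

# Synthetic data standing in for vertebra heights (not the study data).
random.seed(1)
data = ([random.lognormvariate(1.0, 0.4) for _ in range(60)]
        + [random.lognormvariate(2.0, 0.2) for _ in range(20)])

results = grid_search(
    data,
    p_grid=[0.25, 0.5, 0.75],
    mu_grid=[0.5, 1.0, 1.5, 2.0, 2.5],
    sigma_grid=[0.2, 0.3, 0.4, 0.5],
)
best_nll, best_params = results[0]
# Combinations whose fit is nearly as good as the best one.
near_best = [params for nll, params in results if nll - best_nll < 2.0]
print(best_params, best_nll, len(near_best))
```

Listing `near_best` is the part that a packaged fitting routine tends to hide: if it contains combinations that differ widely from one another, the likelihood surface is flat or irregular around the optimum.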

The problem turned out to be the large fish at the extreme end of the distributions in my assemblages. Too many of these fish occurred in the samples for the models to readily converge on parameter estimates. These difficulties were largely hidden when I used the mixdist package to fit the mixture models. The very large fish probably belong to a third mode and may therefore have been obtained in a different manner from the techniques used to acquire the fish in the two smaller modes. Once I removed these large fish from the analysis, the likelihood varied smoothly with the parameter values.


Identifying and Explaining Intensification in Prehistoric Fishing Practices VIII: Establishing the Number of Fish Caught by Nets or Other Gear through Mixture Models

October 2, 2009

The previous post in this series provided the middle-level theory needed to quantify the number and size of fish represented in an archaeological assemblage. Recall that I am interested in determining how many of these fish were caught by nets and how many were caught by other gear. These gear types should differ in the size range of fish that they are likely to capture.

The next step is to look at histograms of the data, which show the number of fish bones that occur within a particular size interval. Modes, or peaks, in the fish size-frequency histograms should reflect the use of different fishing gear. The mode of smaller fish should represent fish taken by nets, and the mode of larger fish should reflect fish taken by hooks and line or by spears. The following histograms show data from my archaeological site. As noted in an earlier post, the fish bone assemblages derive from different levels within a single site, where minimal mixing has occurred among the levels.

Histograms of Fish Caudal Vertebra Height by Level from an Archaeological Site

The histograms for some of the levels show distinct modes. In particular, two modes seem to be present in the 50-55 cm level, while three modes apparently occur in the 30-35 cm level. The other levels are harder to interpret.

Clearly, the identification of these modes is not straightforward. The lack of more clear-cut patterning likely results from a heavy reliance on nets. Nets may catch both large and small fish. Other gear like hook and line or spears is much more likely to catch large fish. Prehistoric fishers used spears tipped with large stone points or sharpened bone. They employed hooks made from shells. Hooks and line or spears may not be able to catch fish smaller than some threshold value of size. In assemblages where net-caught fish predominate, the prevalence of net-caught fish may obscure any mode in the fish size distribution formed by fish caught with other gear.

Fortunately, statistical techniques exist which may help to distinguish separate populations which are mixed together in a single distribution. Finite mixture distributions model such situations. Such distributions can be analyzed using the mixdist package for R. This package allows the parameters of the contributing populations to be estimated, including the proportion of each population represented in the distribution and the mean vertebra size in each separate population. The following graph illustrates the application of a mixture model to data from the 50-55 cm level at the site.

Example of the Mixture Distribution Fit to Data from the 50-55 cm Level

For the mixture model, I fit two lognormal distributions to the data. The histogram depicts the original data. Note that the histogram interval differs from the interval used in the previous graph. The two dotted lines show the separate lognormal distributions fit to the data, and the black triangles identify the means of those distributions. The solid line shows the mixture model prediction that results from combining the two individual lognormal distributions. The gray bars at the bottom of the graphic show the deviations of the model from the observed distribution. The scale of the deviations is depicted in relative terms. This model appears to fit the data reasonably well.
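In symbols, the solid line can be sketched as a weighted sum of the two dotted curves. Here p is the proportion of fish from the distribution of smaller fish and f is the lognormal probability density function, as used elsewhere in this series; the name g and the subscripts 1 and 2 are labels introduced here for illustration.

$g(x)=p\,f(x,\mu_1,\sigma_1)+(1-p)\,f(x,\mu_2,\sigma_2)$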

I have also been working on a more rigorous analysis of the mixture models and their fit. This analysis is ongoing and has been plagued by some problems that I may have finally resolved. I will present some of the results and issues in the next post in this series.


Identifying and Explaining Intensification in Prehistoric Fishing Practices VII: Estimating the Minimum Number of Individuals

September 28, 2009

In the previous post in this series, I demonstrated that caudal vertebra height can be used to estimate live fish weight. The use of vertebrae, however, introduces an additional issue of quantification that requires resolution. Bony fish have two main types of vertebra: abdominal and caudal. Predictable variation in form occurs among the abdominal and caudal vertebrae in the vertebral column of an individual fish. Despite this predictability, an undifferentiated pile of fish vertebrae in an archaeological collection is usually just separated into these main types because the variation can be subtle. Multiple specimens of a single vertebra type (like caudal vertebrae) from a particular species thus could be attributable to a single individual or to multiple individuals. Many statistical tests, however, require that each observation be independent of the others. This assumption is particularly critical for the analysis of size-frequency distributions. The assumption of independent observations would be violated if multiple bone specimens derived from the same individual. This violation could dramatically affect inferences regarding the shape of that distribution. Some method must be used to eliminate potentially redundant specimens.

Two criteria can be used to identify vertebrae from separate individuals. First, the size of vertebrae within an individual bony fish (excluding the length of the centrum) typically varies only a little. The vertebral centra of sharks, skates, and rays (elasmobranchs) seem to vary to a much greater extent within an individual. Subsequent analysis focused on bony fish for this reason. Second, each species of bony fish has a characteristic number of abdominal and caudal vertebrae, and this number varies modestly among individuals. A small number of caudal vertebrae within a narrow size range from a particular taxon, for example, may well have come from the same individual. Vertebrae from a particular taxon that span a large size range or that occur in large number within a small size range are likely to have derived from multiple fish.

In the sample of fish specimens, vertebra height typically varied less than 0.3 mm when comparing abdominal and caudal vertebrae. The size difference within individuals was not strongly correlated with the overall size of the caudal vertebra.

Size Difference Between Caudal and Abdominal Vertebrae and Caudal Vertebra Height

A simple linear regression returned estimates of -0.19 for the y-intercept and 0.13 for the slope of the line, while r² = 0.49 and p < 0.01. Two of the large vertebrae appear to be outliers, however, and may be unduly influencing these results. With these two cases removed, the simple linear regression estimates the y-intercept to be 0.05 and the slope to be 0.06, while r² = 0.22 and p = 0.01. This rule of thumb may therefore be generally applicable.

When the number of caudal vertebrae within a 0.3-mm size interval exceeds the typical number for that taxon, more than one individual from that size range may be represented. Additional work should be undertaken, using a larger sample of fish, to confirm and refine this observation. In the interim, the foregoing principles and observations can be used to calculate the minimum number of individuals represented in an archaeological assemblage and to estimate the size of each fish.
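One possible way to turn these principles into a calculation is sketched below in Python. The grouping rule is an assumption made for this illustration, not necessarily the procedure used in the study: heights for a single taxon are sorted, consecutive specimens are grouped while they fall within 0.3 mm of the first specimen in the group, and each group contributes as many individuals as are needed to account for its vertebrae. The function name, the example heights, and the caudal vertebra count are hypothetical.

```python
import math

def minimum_number_of_individuals(heights_mm, caudal_count, window_mm=0.3):
    """Greedy MNI estimate for one taxon from caudal vertebra heights.

    heights_mm   -- caudal vertebra heights for a single taxon
    caudal_count -- typical number of caudal vertebrae in one fish of the taxon
    window_mm    -- size range treated as compatible with a single individual
    """
    group_sizes = []
    start = None
    for h in sorted(heights_mm):
        if start is None or h - start > window_mm:
            start = h  # open a new size group at this specimen
            group_sizes.append(0)
        group_sizes[-1] += 1
    # A group needs another individual each time the typical count is exceeded.
    return sum(math.ceil(size / caudal_count) for size in group_sizes)

# Hypothetical example: three vertebrae near 2 mm and two near 5 mm,
# for a taxon assumed to have two caudal vertebrae per fish.
print(minimum_number_of_individuals([2.0, 2.1, 2.2, 5.0, 5.1], caudal_count=2))
```

Real taxa have many more caudal vertebrae than the toy count used here; the small number only keeps the example easy to follow by hand.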
