
ARH 312Y - Archaeological Laboratory

Spatial Analyses


Apart from stratigraphy, the prime archaeological evidence for context comes from spatial relationships, and archaeologists depend on these relationships very heavily for their interpretations of ancient human behavior, site-formation processes, and the meaning of the archaeological record. Here we will explore spatial analysis at various spatial scales. This summary will be longer than usual, because this is a topic not covered in the published version of the text. I apologize that I have not yet had time to add any graphics.

The cultural significance of spatial patterning

Archaeologists often assume that patterns in the spatial distribution of and relationships between artifacts, features and other observable data have meaning in terms of activity areas, the organization of households, camps and larger settlements, and human use of landscapes. As we have seen (chapter 12), non-cultural site-formation processes may also contribute to, or blur, these patterns. Archaeologists have adopted a number of implicit and explicit models for what cultural patterns in space should look like at various scales and in various cultural and economic circumstances.

At relatively small scales, archaeologists talk about tool kits and activity areas. An activity area represents the place where one person or a few people carried out a single activity, such as removing flakes from a core. Sometimes combinations of tool types and other materials found within small clusters should provide clues to the activities with which they were associated - the recurring combinations of artifacts are sometimes called tool kits - but usually identification of activity areas and tool kits depends on the assumption that items were dropped where they were used and never substantially moved prior to their discovery by archaeologists. This occasionally happens, but usually site-formation processes are much more complicated:

The traditional concept of "activity area," although perhaps useful in terms of observable activity performance (e.g., in an ethnographic context), is not necessarily a valid concept in terms of deposition. Simply put, people might well perform "activities" in "areas," but there is no reason to expect them to map those areas with their garbage; material products of activities may often be collected in dump locations along with the products of other activities performed in other areas (Rigaud and Simek 1991: 217).

Consequently, although it is still useful to attempt to detect spatial patterns at this scale, analysts must take these problems into account. In general, activity areas are easiest to identify in cases where there was only one brief occupation and little later disturbance, or where there was repeated, intensive activity of a particular kind within a well-bounded area, such as repeated food preparation within a room we might call a "kitchen."

At somewhat larger scales, archaeologists often attempt to find the boundaries of work areas or households or to understand how households may have been organized into household clusters and neighborhoods.

At larger scales still, archaeologists attempt to discover spatial structure in settlements. The archaeological correlates of these communities are often assumed to be "sites," but in practice most archaeologists count as sites any apparently concentrated cluster of artifacts on the landscape, including small artifact scatters that could be the residues of isolated single-activity areas as well as large cities. Defining which sites should be considered the remains of communities, such as villages, can be difficult in some instances, and fairly trivial in others.

At the largest scales, archaeologists try to understand why settlements and other types of sites are distributed the way they are on the landscape, to discover what relationships there may have been between sites, and generally how the people who made and used the sites may have exploited the environment or even conceptualized the cosmos. The pattern with which sites are distributed on the landscape is called a settlement pattern, while the way in which the sites interacted and jointly operated within the society (or societies) that used them, economically, politically, socially and ideologically, is called a settlement system.

At the regional scale, geographers have provided us with some models for settlement patterns with which we can attempt to compare the distributions of known sites. This kind of modelling has a much longer history than you would expect. Most current models owe something to Central Place Theory, as formulated by von Thünen (1826), and expanded by Christaller (1933).

Von Thünen hypothesized that, other things being equal, market forces and transport costs would encourage different land uses at varying distances away from a city. Extensive land use would take place only on the periphery, where transport costs to the city were high, while intensive land use, such as market gardening, would take place in a ring closer to the city. This was an economic model for land use, with close links to decision theory and cost-benefit analysis.

Christaller (1933) had noticed that, in reasonably flat plains in parts of Germany, contemporary settlements tended to be very regularly spaced. Furthermore, they tended to be arranged in a hexagonal lattice, and in a hierarchy so that a large town would occupy the center of a hexagon, six smaller towns would occupy the hexagon's corners, and small villages would be located about halfway between each pair of towns. He attributed this pattern principally to the economic efficiencies that resulted from it under a market economy. For example, farmers were able to locate themselves so that they had three markets for their produce within a short distance of their farms, and transport costs between towns could be minimized. Services that consumers expected to use frequently were distributed thinly and evenly over the landscape, while more costly services they would use less often were located in central places, still within a reasonable distance, but not as convenient as the more critical services. Among the kinds of services that Christaller examined were churches, post offices, telegraph offices, and government agencies.

Central Place Theory seems applicable in many places outside Germany as long as the terrain approximates a flat, undifferentiated plain and settlement is heavily influenced by market forces. In other situations we would at least expect distortion of the classic hexagonal lattice. For example, where there is a navigable river, transport costs are much reduced near the river and the lattice becomes "stretched" linearly along it. Johnson (1972) attempted to fit such a distorted lattice to settlement patterns in ancient Mesopotamia, where extensive canal networks would have been the least costly transport routes, as well as the lifeblood of the settlements' agriculture.

In general, however, Central Place Theory has not had as large an impact on archaeology as you might expect simply because it is often difficult to satisfy two more of its assumptions: that all of the settlement locations are known and that all of them were occupied simultaneously. Central Place Theory is particularly ill-suited to analysis of data from spatial samples; because typical archaeological surveys only examine a patchy sample of the landscape and therefore miss a great many sites, it is usually unreasonable to assume that we know where all the sites are. In future, however, and particularly with the aid of GIS (below), it is possible that Central Place Theory will begin to flourish in archaeology as a model with which we might predict the location of undiscovered sites, and then do new fieldwork to test the predictions.

Kinds of spatial analysis

No matter what the scale of archaeologists' spatial investigations, they can choose among quite a number of approaches to the problem of uncovering spatial patterns. We may group many of these into the broad categories of point-pattern analyses, grid-based distributional analyses, graph-theory approaches, and Geographic Information Systems (GIS). In addition, we can distinguish analyses that deal with one kind of phenomenon at a time from ones aimed at discovering the relationship between different phenomena over space.

In point-pattern analyses, the data involve the locations of individual artifacts, features, sites or some other observations in two-dimensional or three-dimensional space. The purpose of the analyses can be to discern clustering of these objects that might be related to activity areas (in the case of artifacts) or to social boundaries (in the case of sites); to reveal patterns in the way various classes of artifacts or various attributes co-occur, which might tell us something about tool kits and activities; to discover the way in which artifacts of known source were distributed to consumers; or to clarify site-formation processes. In sites where the only evidence for architecture consists of post holes, we may try to discern patterned groups of post holes belonging to individual structures (Bradley and Small 1985).

Point-pattern analysis

Point-pattern analysis has a very long history in archaeology. In the late 19th century, many archaeologists produced maps showing the locations of artifact finds in their attempts to delineate the geographical boundaries of what they regarded as archaeological cultures. In the 1960s, French prehistorians made great strides toward the identification of spatial patterns in the debris on Palaeolithic sites. As a student, I remember being quite impressed by a film in which the overlapping distributions of lithics, ash, cobbles, bones and even bear claws were used to infer the outlines of huts and even (on the assumption that phalanges and claws were attached to furs) sleeping areas. Until recently, however, the delineation of such patterns involved the subjective, visual examination of point-patterns on maps. These inferences can be creative but, as with all analyses, are influenced by the implicit preconceptions and explicit models of researchers.

Carr (1991) shows how archaeologists with two different conceptual models, one explicit, the other implicit, obtain quite different interpretations of the debris at Pincevent, a Magdalenian site in northern France that has produced radiocarbon dates of about 11,000 bp. Faunal remains suggest that the site was a reindeer-hunting camp occupied in late winter and spring. The site's excavators (Leroi-Gourhan and Brézillon 1966) interpret the spatial patterns at "habitation no. 1" as the residue of three interconnected, teepee-like huts, each with an indoor hearth and a sleeping area (figure 17.3). Binford (1983: 156-60) instead interprets hearths 2 and 3 as outdoor hearths that were used sequentially, probably when the users moved in response to a change in wind direction. One might be tempted to conclude that this is a relatively minor distinction, but many avenues of research on the site, including estimates of population size based on floor area and number of contemporary hearths, depend on which interpretation (if either) is right.

Binford uses as a model for interpreting the Pincevent site a pattern he observed among Alaskan Nunamiut, which he calls the "Men's" Outside Hearth Model (figure 17.4). According to this model, several men sat in an arc around the upwind side of the hearth (to avoid smoke), and dropped small waste, such as small flint chips from flintknapping and small bone fragments, in their immediate vicinity, but threw larger debris that would get in their way as they worked or make sitting uncomfortable either over their shoulders or across the fire in front of them. This pattern of behavior created a "drop zone" in an arc near the hearth and two "toss zones" farther from the hearth. When Binford overlaid a scaled version of this model on the distribution maps from Pincevent habitation no. 1, he concluded that the model's drop zone "fits exactly" with the distribution of lithic debris (figure 17.5). Certainly the largest concentrations of such debris are in arcs around the hearths, as the model would predict, although Carr (1991: 231) draws our attention to other, smaller but well-defined arcs that Binford's model does not address. Binford found much poorer fit of his model to the bone distributions, and Carr (1991: 232) notes that this probably results from the fact that the distribution includes both large bones and small fragments. Binford instead opts to explain the poor fit as the result of overlapping toss zones from different episodes of hearth use, oriented differently because of changes in wind direction. Even this does not account for the bone distribution very well, and some of the backward toss zones are nearly empty of bone. 
As Carr notes (1991: 234), often large bones and lithics occur in the drop zones, and not in the toss zones where the model predicts they should be, while the boundaries of the toss zones appear too crisp to be the result of casual tossing (figure 17.6), which should result in a gradual diminishing density of debris away from the hearth area (1991: 235-36).

Leroi-Gourhan and Brézillon (1966), by contrast, studied the debris patterns, noticed the abrupt changes in the density of debris, and implicitly fit them to a hut model. A very satisfying way to account for the fact that concentrations of debris seemed to form very distinct arcs was to infer that the debris had been kicked or swept against some kind of barrier, such as a tent wall. A number of different lines of evidence help to corroborate this interpretation. Red ocher appears to have been sprinkled on the floor just prior to occupation, and its stains also stop abruptly at the arcs defined by chipping debris, while the areas that Leroi-Gourhan and Brézillon suppose were swept, as indicated by very low debris density, also lack ocher (figure 17.7). Refitting burin spalls to the burins from which they were struck shows that the spalls found in an arc of debris for one of the putative huts often fit burins found in the drop zone of another alleged hut, or vice versa (figure 17.8). The same can be said of refit flakes and cores (figure 17.9). Although there are other ways this could happen, this distribution of refits is consistent with the idea that debris in all three "hut" areas was swept together, in several different sweeping episodes. "This pattern would have been generated if work around one hut's hearth had been followed by the sweeping of the resulting debris against the walls of another hut, which would have been standing at the same time" (Carr 1991: 246). The orientation of debris also supports the hut model. The long axes of many of the larger bones and lithics run parallel to the arcs that mark the possible hut walls (Carr 1991: 246).
The distribution of large flint nodules and a hummock of sediment at fairly regular intervals along the arcs, particularly on the western side that would have been exposed to the prevailing wind, makes sense if they were used to anchor tent poles or weigh down a tent skirt. It is also important to remember that the site was occupied in winter during a very cold phase of the Pleistocene, not a very good time to do work requiring manual dexterity, such as burin production, out-of-doors (Carr 1991: 246-47). Even the microstratigraphy of the three hearths matches, with two carbon-rich lenses separated by a thin lens of sediment, which would be unlikely if the hearths were not used simultaneously.

Certainly, careful visual examination of point patterns, in combination with other sources of information, can sometimes result in quite vivid reconstructions of some of the activities that produced them. As in the Pincevent example, we can sometimes say that one reconstruction seems more plausible than another, but the measure of plausibility is rather subjective. How clustered do the items have to be to be considered "clustered"? How dense do the clusters have to be to be "concentrations"? How abrupt does the falloff in density have to be to be considered "crisp"? At what scale should the clusters exist to be culturally meaningful? All these questions pose difficulties for point-pattern analyses even when site-formation processes are fairly straightforward and well understood.

Some people have tried to refine point-pattern analysis and address at least some of these questions by taking a more quantitative approach. During the last two decades one of the principal uses of point-pattern analysis has been in the attempt to recognize clustering of the points quantitatively.

The most common technique that archaeologists have used in this attempt is Nearest Neighbor Analysis. This technique is very easy to apply in cases where its assumptions are valid. We must assume that our point-map does not omit any points (therefore we cannot use a sample of a larger population unless it is a fairly large and spatially contiguous cluster sample), and that all the points (whether artifacts, features or sites) are contemporaneous. Then, for each point, we simply measure the linear distance (r) to the nearest neighboring point, and we average all these distances to obtain the mean distance to nearest neighbor and the standard deviation on this distance. When the points are highly clustered, this mean distance to nearest neighbor is relatively low; when they are randomly distributed, it is intermediate; and when they are evenly distributed, it is high. But how high is high and how low is low? To standardize our measure we then divide it by the theoretical mean for a random distribution of points, which is 1 / (2 * the square root of rho), where rho is the density of points on the map, or (n-1)/A, and A is the area.

Our measure of the degree of randomness in the distribution (R) then simply becomes the ratio of the observed and expected mean distances: mean r (observed)/mean r (expected). The result consistently ranges between zero (highly clustered), through 1.0 (random) to a little over 2 (evenly distributed).
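The calculation of R can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the expected mean distance under randomness is taken as 1 / (2 * the square root of rho), with rho = (n-1)/A as defined above; the function names and the two small point sets are hypothetical, invented for the example; and no correction is made for edge effects at the boundary of the map.

```python
import math

def nearest_neighbor_distances(points):
    """Distance from each point to its nearest neighbouring point."""
    distances = []
    for i, (xi, yi) in enumerate(points):
        nearest = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        distances.append(nearest)
    return distances

def nearest_neighbor_ratio(points, area):
    """R = observed mean nearest-neighbour distance / mean expected under randomness."""
    n = len(points)
    observed_mean = sum(nearest_neighbor_distances(points)) / n
    density = (n - 1) / area                      # rho, as defined in the text
    expected_mean = 1.0 / (2.0 * math.sqrt(density))
    return observed_mean / expected_mean

# Hypothetical data: 16 points on a regular grid within a 4 x 4 map ...
grid = [(x + 0.5, y + 0.5) for x in range(4) for y in range(4)]
# ... and 16 points packed into one tight cluster on the same map.
cluster = [(2.0 + 0.01 * x, 2.0 + 0.01 * y) for x in range(4) for y in range(4)]

r_grid = nearest_neighbor_ratio(grid, area=16.0)        # well above 1: evenly spaced
r_cluster = nearest_neighbor_ratio(cluster, area=16.0)  # close to 0: clustered
print(round(r_grid, 2), round(r_cluster, 3))
```

Note that the result depends heavily on the area A that is chosen: the same cluster measured against a smaller map would appear less clustered, so the boundary of the study area has to be justified on independent grounds.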

Where the assumptions of Central Place Theory can reasonably be applied, and where our analyses indicate a relatively even distribution of sites, one of the tools we can use to detect the hexagonal (or some other) structure is to construct Thiessen polygons. To do this we simply draw line segments between each pair of settlements (figure 17.11), and then draw more line segments that bisect the first ones at a 90° angle. We then erase the first set of line segments as well as any parts of the second set that extend past the point of intersection with others. In a more complicated scenario, we can attempt to account for differential weight of settlements (e.g., large central places might be expected to have more territory than small villages) by intersecting the first set of line segments, not at their halfway points, but at a length away from each settlement that is proportional to the settlement's relative "importance." We can measure this importance in a number of ways - population, site size, number of services, proportion of elite goods - and if, for example, we are trying to find the boundary between two sites with an importance ratio of 2:1, the perpendicular would be placed two-thirds of the way from the more important site (Hodder and Orton 1976: 59-60, 78-80).
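The placement of a single weighted boundary can be illustrated with a short Python sketch. It handles only one pair of settlements at a time and does not assemble complete polygons; the function names, coordinates and weights are hypothetical, and the rule that each site's share of the connecting line equals its share of the combined importance is my reading of the 2:1 example above.

```python
def boundary_point(site_a, site_b, weight_a=1.0, weight_b=1.0):
    """Point on the line from site_a to site_b where the perpendicular boundary crosses.

    With equal weights this is the midpoint (the ordinary Thiessen case).
    """
    fraction = weight_a / (weight_a + weight_b)   # share of the distance given to site_a
    ax, ay = site_a
    bx, by = site_b
    return (ax + fraction * (bx - ax), ay + fraction * (by - ay))

def owning_site(location, site_a, site_b, weight_a=1.0, weight_b=1.0):
    """Return 'a' or 'b' according to which side of the boundary a location falls on."""
    px, py = boundary_point(site_a, site_b, weight_a, weight_b)
    dx, dy = site_b[0] - site_a[0], site_b[1] - site_a[1]
    # Sign of the projection onto the a-to-b direction, measured from the boundary.
    side = (location[0] - px) * dx + (location[1] - py) * dy
    return "a" if side < 0 else "b"

# Hypothetical example: a town (importance 2) and a village (importance 1), 9 km apart.
town, village = (0.0, 0.0), (9.0, 0.0)
print(boundary_point(town, village))              # unweighted: halfway, at 4.5 km
print(boundary_point(town, village, 2.0, 1.0))    # weighted: about 6 km from the town
print(owning_site((5.0, 3.0), town, village, 2.0, 1.0))  # falls in the town's territory
```

A location 5 km along the line would belong to the village under the unweighted rule but to the town once the 2:1 weighting is applied, which is the practical difference the weighting makes.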

Another form of simple point-pattern analysis has been popular among archaeologists studying regional settlement systems, but has recently been largely displaced by GIS (see below). This analysis involves patterns, not between the locations of the points, but in the relationships between the points and various environmental attributes, such as soil type, elevation above sea level, and distance to permanent water sources. Even quite early archaeologists noticed these kinds of environmental associations, such as the apparent tendency for Linearbandkeramik (LBK) sites to be located on loess soils in Europe (Buttler 1938). In the more modern form of these analyses, the actual associations between site locations and various environmental types are compared with the distribution you would expect if the sites were located randomly on the landscape. In other words, the question in the case of LBK sites is, "are LBK sites located on loess soils more often than we would expect to happen by chance?" If the association is purely by chance, we would expect, on average, that the proportion of sites on loess would be the same as the proportion of space that is covered by loess. Since the environmental categories constitute a nominal scale, we can compare the observed and the expected site distributions with a one-sample chi-square test. Essentially, the value of chi-square is high when there are very large differences between the observed and expected values; in the example here, LBK sites are found so much more often on loess soils than we would expect to happen by chance that we would tend to conclude that there really is a preference for the sites to be located on loess soils. Of course we should be careful about the possibility that our sample of sites could be biased by factors of differential preservation or by the research habits of their discoverers, and should ensure that we are not violating any of the chi-square test's assumptions.

The key here is to be sure to compare the observed distribution with the distribution expected from a random pattern of dots. Archaeologists sometimes forget this when, during exploratory analysis of point patterns, they notice what seems to be an interesting pattern.

For example, Coinman et al. (1988) notice extreme clustering of Palaeolithic sites at low and high elevations in the tributaries of Wadi al-Hasa, in southern Jordan, and attempt to explain them by fitting them to a general model of a settlement system with small camps in part of the year, seasonally coalescing into large "aggregation camps" or base camps to take advantage of a seasonally available resource while participating in large-group social activities. Coinman et al. (1988) notice that, when you plot site size against elevation for the Wadi al-Hasa tributaries, you find a few, mainly small, sites at low elevations, usually no sites at intermediate elevations, and mixtures of large and small sites at high elevations. This seems to satisfy the model if the large sites at high elevations represent the aggregation camps and the small sites at high and low elevations represent dispersal camps at different seasons. However, they do not compare this distribution to an expected distribution under a random model. The clear separation between low and high sites is easily explicable by the substantial cliffs that separate the narrow valley bottoms of Wadi al-Hasa's drainage from the ridge-tops and broad plateaux above them (1988: 61, 64) and make it virtually impossible for open-air sites to occur at intermediate elevations, which occupy only a small percentage of the research area. It is equally impossible for large sites to occupy the narrow valley bottoms unless they are extremely linear in shape. The large sites at higher elevations are probably actually palimpsests of overlapping, deflated, small sites that wind has collapsed into a surface that is nearly continuously "paved" with lithics (Banning 1988: 17). Coinman et al. (1988: 51, 54, 61) recognize that these are problems, and make some attempt to take the availability of different elevation zones into account (1988: 58, 63).
The point of using this exploratory study as an example is that a comparison to a random distribution would have been an easy way to see whether the apparent pattern had anything to do with prehistoric cultural practices.

In another case, Alan Zarky (1976) attempts to determine whether prehistoric sites at Ocós, Guatemala, were located to take particular advantage of certain resource zones by comparing the known site distributions with "expected" ones, much as just suggested. To do this he uses the one-sample chi-square test that we already saw, however briefly, in connection with grouping methods (above, pp. 41-43). He applies the chi-square test in a way that violates the test's assumptions, but we can use this example to show how one could compare the environmental contexts of point-patterns with those expected under a random model before going on to illustrate a better method based on spatial sample elements.

Zarky's analysis is predicated on a number of assumptions. First is the assumption that the 36 archaeological sites to which the analysis pertains constitute a random sample of the population of sites for the periods under study. In fact, however, the data come from a random sample of spatial units (Plog 1968), not a random sample of sites, so Zarky is treating these sites as a cluster sample (see above, pp. xx-xx). Second, the chi-square test requires a sample size large enough that no more than about one-fifth of the cells in the table have expected frequencies below 5. Zarky finds that his analysis, which employed a large number of cells so that he could test for several environmental variables at once, had far too many cells with low expected values (1976: 127). Third, the chi-square test that Zarky selects to compare the known distribution of sites with a random distribution assumes that the observations consist of counts, and presumably this is one reason that Zarky has decided to use numbers of sites, rather than measures on some spatial unit, for his analysis. However, although he recognizes this limitation, he also recognizes that it is the proportion of some resource area that lies within the site's catchment that is really of interest, not just the presence or absence of that resource at or close to each site (Zarky 1976: 120-119). Consequently, he attempts to modify the chi-square test by counting sites with two resource zones within their half-kilometer catchment areas as half-a-site for each, those with three resource zones as three one-third-sites, and so on. He suggests that this is "a good approximation" and satisfies the assumptions of the chi-square test (Zarky 1976: 126) but, in fact, this tinkering with the method is not statistically valid. Rather than try to adapt the chi-square test, he should have used a test that was well suited to measures on spatial areas.

But first, let us assume that the 36 sites do represent a random sample to illustrate how the chi-square test could have been used to evaluate a simpler hypothesis without such tinkering. Let our hypothesis be that the location of Middle and Late Formative sites gave them preferential access to the resources of mangrove forests. This way we only need to record how often these sites include mangrove forest within their catchments and not the proportion of the catchment that consists of mangrove forest. I have combined the Middle and Late Formative periods in order to obtain a reasonable sample size (27). I will also assume that the samples of Middle and Late Formative sites are independent of one another, although in this case they probably are not.

The number of Middle and Late Formative sites in this example that lie within 0.5 km of mangrove forest is 29, while only seven do not. Since mangrove forest makes up only 22.5% of the surface area of the region surveyed for these sites, we would expect only 36 * 0.225 = 8.1 sites to lie within mangrove forest if their distribution were random. Here, however, the situation is a little more complicated, as we are interested in the probability of having mangrove forest within 0.5 km. A random sample of 100 circles of 500 m radius imposed on the map results in 57 that include mangrove forest, 35 that do not, and 8 that fall outside the boundaries of the study area. This means that about 61% of such circles can be expected to include mangrove forest purely by chance, so that we would expect 36 * 0.61 = 22 sites in such locations by chance. The chi-square test on the comparison of the observed and expected distributions results in a chi-square value of 5.73, which has a probability of less than 0.025 of happening by chance. Consequently, it appears possible that the sites are indeed located so as to provide preferential access to mangrove forests.
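The arithmetic of this test can be checked with a short Python sketch. It uses the figures quoted above (29 observed sites near mangrove forest, 7 not, and an expected proportion of 0.61), rounds the expected count to 22 as the text does, and assumes one degree of freedom because there are only two categories. The variable and function names are my own, and the p-value shortcut through the complementary error function is valid only for one degree of freedom.

```python
import math

def chi_square(observed, expected):
    """One-sample chi-square statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

n_sites = 36
observed = [29, 7]                  # sites with / without mangrove forest within 0.5 km
p_by_chance = 0.61                  # share of random 500 m circles that include mangrove

expected_with = round(n_sites * p_by_chance)          # 21.96, rounded to 22 as in the text
expected = [expected_with, n_sites - expected_with]   # 22 and 14

# Rule of thumb from the text: expected frequencies should not fall below 5.
smallest_expected = min(expected)

chi2 = chi_square(observed, expected)
# With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2.0))

CRITICAL_0_025_DF1 = 5.024          # tabled critical value for alpha = 0.025, 1 d.f.
print(round(chi2, 2), round(p_value, 3), chi2 > CRITICAL_0_025_DF1)
```

Using the unrounded expected value of 21.96 instead of 22 gives a slightly larger statistic, but the conclusion at the 0.025 level is the same.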

Another problem with associating point patterns with landscape variables is spatial autocorrelation. Usually the points generated by an archaeologist's regional survey do not constitute a random sample from a population of sites. They are in fact observations made during a random sample of spaces (quadrats, transects, circles, etc.) drawn from some population of spaces. If we then treat the points as though they were sample elements, we are cluster sampling. Furthermore, it is probably not a very good cluster sample (see above, pp. 68-69). This is a problem when we are interested in the environmental factors that may have affected site location because these environmental phenomena are themselves distributed fairly uniformly at the scales on which we survey (typically 1 km or smaller). Within a single 1 km x 1 km quadrat, for example, we might find five or six small sites of various types, but it is highly likely that the quadrat will only contain one or two soil types and possibly only one geological bed will occur near the surface. It is further likely that all five or six sites will have exploited exactly the same water source, and so on. Consequently, the sites are not independent of one another, and it is fair to say that sites that are located near one another in space are highly likely to be closely similar in their environmental associations, simply because soils, geology, drainage, rainfall, aspect and many other variables do not change very abruptly over the short distances between sites. This is a problem known as spatial autocorrelation. The best way to avoid its effects in these cases is to give up point-pattern analysis of cluster samples and instead use quadrat methods.

When we add to the distribution of points some measurement on them, such as the proportion of a particular pottery type in the assemblages of sites, we can attempt to discover how that pottery type is distributed over space in a more instructive way than simply mapping the points. A very crude way to do this is simply to present a map with small pie charts, instead of points, marking site locations, the size of the pie chart being proportional to the sample size from each and the pie slices indicating the proportions of each pottery type, faunal taxon, or the like. In some cases, and especially when the source of raw material or manufacturing center for the artifact type is known, we can plot the proportion against distance from source, to produce "falloff curves." For example, Colin Renfrew and his colleagues used falloff in the proportion of obsidian among the chipped stone from sites in the Near East and eastern Mediterranean in an attempt to determine whether the obsidian was distributed through "down-the-line" exchange or through centralized intermediaries (Dixon et al. 1968; Renfrew xxx). Clearly we would expect less obsidian far from its source than near and, if the obsidian were distributed by individuals who gave a portion of their own stock to trading partners a little farther away, and who in turn passed a portion of that obsidian on to still farther trading partners, the "falloff curve" would be exponential (cf. figure 17.14). By transforming the y-axis of the graph to a log scale, we can change the shape of the curve so that the sites cluster about a regression line (cf. figure 17.15). If there are large residuals about this line - that is, if some sites have a lot more obsidian than predicted by the regression line while others have considerably less than predicted - this may mean that some other kind of exchange system was operating, perhaps with the obsidian being distributed through central places.

In figure 17.16 we see such a regression for the distribution of Oxfordshire pottery among sites in southeastern England. Note that relatively few sites actually lie on or near the regression line; instead there seems to be quite a large group of sites with positive residuals (more Oxfordshire pottery than predicted) and a smaller number of sites with negative residuals (less than predicted). Quite often in this type of analysis the residuals are much more interesting than the regression itself. In essence, the presence of many large residuals, if they are distributed in a patterned way, means that mere distance from the source is not the only factor affecting the likelihood of finding a particular kind of artifact. The sites with positive residuals in this case are all close to the River Thames, and thus able to take advantage of water transport. Sites that had to get the pottery overland, by contrast, often show less of it than the simple regression model would predict.

A more sophisticated approach, however, is to fit a trend surface to the map. This is analogous to the regression line in figure 17.15, but models a surface, rather than a line, which thus makes it more appropriate for spatial data. Computers allow us to do what is essentially a multiple regression in three dimensions, showing a surface that indicates the general trend in the distribution, in this case, of Oxfordshire pottery over space. The kind of polynomial one uses to fit such a surface mathematically affects the outcome: a third-order polynomial will fit a more complicated surface, while a second-order polynomial will fit a smoother one. Unfortunately there is no simple way to decide what kind of polynomial to use, but, given a reasonable choice, we can then superimpose circles whose color and size indicate the direction and size of the residuals. In one study, for example, Reece (1973) uses the residuals for particular kinds of Roman coins on a trend surface over a European coin distribution to show how certain denominations were attracted to frontier regions where there were troops, while others were attracted to important ports.
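As a rough sketch of what fitting a trend surface involves, the following Python fragment fits a polynomial surface of a chosen order by least squares and returns the residuals that one would then plot as circles. The function name and the coordinates are hypothetical, and NumPy's general least-squares solver is simply one convenient way to do the arithmetic; it is not the method of any study cited here.

```python
import numpy as np

def fit_trend_surface(x, y, z, order=2):
    """Fit z as a polynomial in x and y (all terms x^i * y^j with i + j <= order).

    Returns a dict mapping (i, j) to the fitted coefficient, plus the residuals
    (observed minus fitted) for each site.
    """
    x, y, z = (np.asarray(v, dtype=float) for v in (x, y, z))
    powers = [(i, j) for i in range(order + 1) for j in range(order + 1 - i)]
    design = np.column_stack([x**i * y**j for i, j in powers])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)
    residuals = z - design @ coeffs
    return dict(zip(powers, coeffs)), residuals

# Hypothetical site coordinates (km) and percentage of a pottery type.
east = [0, 10, 20, 0, 10, 20, 0, 10, 20]
north = [0, 0, 0, 10, 10, 10, 20, 20, 20]
percent = [30, 24, 15, 26, 28, 12, 20, 14, 9]

coefficients, residuals = fit_trend_surface(east, north, percent, order=1)
for e, n, r in zip(east, north, residuals):
    print(f"site at ({e:>2}, {n:>2}): residual {r:+.1f}")
```

Raising `order` lets the surface bend more, so the residuals shrink; that is exactly why the choice of polynomial affects the apparent pattern.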

Rank-size analysis involves yet another approach in which the actual sizes of sites are compared with their predicted size under the log-rank model. This is a model that predicts that the largest site will be twice as large as the second-largest, three times the size of the third-largest, and so on, a pattern observed in many modern urbanized settlement systems. Departures from the model are thought to be informative about the nature of societies undergoing the urban transformation. For example, "convex" distributions (figure 17.17), with sites that are closer in size to the largest site than predicted, seem to occur in cases in which there are several competing centers, while "concave" distributions, with subsidiary sites much smaller than predicted and most of the population apparently aggregated into the biggest site, seem typical of the first highly successful urban states, such as Ur in Early Dynastic Mesopotamia (Johnson 1980).
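The comparison of observed and predicted sizes can be illustrated with a short Python sketch. The site sizes are invented, and the summary statistic (the mean log ratio of observed to predicted size for the sites below the largest) is only a crude device assumed here for illustration; published rank-size studies usually inspect the log-log plot itself.

```python
import math

def rank_size_table(site_sizes):
    """Return (rank, observed size, predicted size) for each site.

    The predicted size of the site of rank r is the largest size divided by r.
    """
    ordered = sorted(site_sizes, reverse=True)
    largest = ordered[0]
    return [(rank, size, largest / rank)
            for rank, size in enumerate(ordered, start=1)]

def mean_log_departure(site_sizes):
    """Mean of ln(observed / predicted) over sites of rank 2 and below.

    Positive values mean subsidiary sites are larger than predicted ("convex");
    negative values mean they are smaller than predicted ("concave").
    Needs at least two sites, all with sizes greater than zero.
    """
    table = rank_size_table(site_sizes)
    logs = [math.log(obs / pred) for rank, obs, pred in table if rank > 1]
    return sum(logs) / len(logs)

# Hypothetical site areas in hectares.
competing_centers = [100, 90, 80, 20]
primate_center = [100, 10, 5, 2]
print(mean_log_departure(competing_centers))  # positive: convex
print(mean_log_departure(primate_center))     # negative: concave
```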

Grid-based analysis

For a number of reasons, it is often better or more practical for archaeologists to record data as densities or other measurements on a grid that they superimpose on space than to record individual points. If nothing else, it often happens that the archaeological data available to us were collected or recorded only by quadrat. Even where point provenience is available, furthermore, we sometimes should reduce it to quadrats to avoid cluster sampling. We have already seen this approach in the spatial histogram (above, p. xx), but grid-based approaches have problems as well as advantages.

The principal problem with isopleth mapping in archaeology, in which we try to model the spatial variations in artifact density or the like along the lines of a topographic map, is that archaeological data are much more chaotic over space than are natural topography or variations in natural magnetic fields and other phenomena on which the method is modelled. With either topography or a magnetic field, there is gradual change in value (elevation or intensity) over space, so that we can interpolate values based on only a few points on the map. With archaeological data, by contrast, variations over space are "spiky" and any apparent pattern is extremely dependent on where we take measurements and on the scale of the distances between measurement points. Consequently, archaeological data typically have to be "smoothed" by a "filter" or grid-generalization technique that reduces the noisiness of the data. The most common way to do this is to plot the averages of values for adjacent sets of four quadrats, rather than the actual values of each quadrat (figure 17.19; Orton 1980: 124-27). The most unsettling thing about isopleth maps, however, is that the size of quadrat used in the grid can have a startling effect on the apparent pattern in the map.
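A minimal sketch of the four-quadrat averaging filter follows. It assumes that the "adjacent sets of four" are overlapping 2 x 2 windows, so that each smoothed value sits at the corner shared by four quadrats; that reading, the function name and the counts are assumptions made for illustration.

```python
def smooth_by_fours(grid):
    """Average each overlapping 2 x 2 window of a rectangular grid of values.

    A grid of R rows and C columns yields (R - 1) rows and (C - 1) columns.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        [(grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4
         for c in range(cols - 1)]
        for r in range(rows - 1)
    ]

# Hypothetical artifact counts per quadrat, with one isolated "spike".
counts = [
    [0, 1, 0, 0],
    [1, 12, 1, 0],
    [0, 1, 0, 1],
]
for row in smooth_by_fours(counts):
    print(row)
```

The spike of 12 is spread over its four neighboring windows, which is what makes the contours drawn afterwards less jagged, at the cost of blurring genuinely sharp boundaries.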

The grid-based technique that is somewhat analogous to Nearest Neighbor Analysis involves the variance-to-mean ratio. With data reduced to counts of microrefuse, artifacts, sites, or whatever in each square of a grid imposed on space, and assuming that the Poisson distribution is an appropriate model for the random assignment of observations to each quadrat, we merely measure the mean density (λ, observations per square) and the variance of the counts (s²) and calculate the ratio s²/λ. For a random pattern, this ratio should be near 1.0, as in Poisson distributions the mean and variance are equal. A very low value of s²/λ would indicate an even distribution and a high value would indicate clustering at the scale of the grid. The trouble is that the result depends heavily on the size of quadrat used. In figure 17.21, for example, the smallest quadrat might not detect the obvious clustering, the next couple probably would, while the largest quadrat would indicate even distribution.
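The calculation itself is short enough to show in full. In this sketch the function name and the quadrat counts are hypothetical, and the variance is the usual sample variance (divisor n - 1), which is an assumption about which estimator is intended.

```python
def variance_to_mean_ratio(counts):
    """Sample variance of the quadrat counts divided by their mean."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean

# Hypothetical counts for eight quadrats, 16 artifacts in each case.
even_pattern = [2, 2, 2, 2, 2, 2, 2, 2]
clustered_pattern = [8, 7, 1, 0, 0, 0, 0, 0]
print(variance_to_mean_ratio(even_pattern))       # well below 1: even
print(variance_to_mean_ratio(clustered_pattern))  # well above 1: clustered
```

Re-running the same function on counts pooled into larger quadrats is the simplest way to see how strongly the ratio depends on quadrat size.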

Dimensional Analysis of Variance (DAV), which can also be used to detect clustering, takes advantage of that dependence on quadrat size. The name of this approach is somewhat unfortunate, since it has nothing to do with the well known statistical test, Analysis of Variance (ANOVA). With DAV (figure 17.22), we look at the way patterns change as we gradually increase quadrat size through doubling (Whallon 1973; Hodder and Orton 1976: 34-36; Orton 1980: 146-49). For each quadrat size (1, 2, 4, 8, 16, ..., T), we calculate the sum of squares (S) by squaring the number of artifacts or sites counted in each quadrat, summing these squares over all quadrats, and dividing by the quadrat size (the number of smallest grid squares that make up each quadrat). Then we calculate "mean square between blocks" (M) as the difference between S at the current quadrat size and S at the next larger (double) quadrat size, divided by the degrees of freedom (D). D is the size of the largest quadrat (T) divided by twice the current quadrat size. In a grid of 8 x 16 quadrats, then, T would equal 128 and for the starting quadrat size (1), D would be T/2 = 64.

If there is clustering at a particular quadrat size, the value of M should be relatively high because, holding the total number of observations constant, squaring a large number in a few quadrats results in a higher sum than squaring small numbers in many quadrats, while moving to the next larger quadrat size reduces the variability between quadrats. Consequently M will involve subtracting a small number from a large one before dividing by D, and will therefore be high. If we then plot the value of M against quadrat size for the various sizes of quadrat (figure 17.22), we can find the peak that marks the scale at which clustering occurs in the data.
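The DAV procedure can be sketched as follows. Several details here are assumptions rather than anything stated in the sources cited: S is taken as the sum of squared block totals divided by the number of smallest grid squares per block (the reading under which S can only stay the same or shrink as blocks are doubled), blocks are doubled alternately across columns and rows, and the function names and counts are invented.

```python
def block_totals(grid, block_rows, block_cols):
    """Total count in each non-overlapping block of the given shape."""
    rows, cols = len(grid), len(grid[0])
    return [
        sum(grid[r][c]
            for r in range(r0, r0 + block_rows)
            for c in range(c0, c0 + block_cols))
        for r0 in range(0, rows, block_rows)
        for c0 in range(0, cols, block_cols)
    ]

def dav_mean_squares(grid):
    """Return {block size: mean square M} for block sizes 1, 2, 4, ..., T/2."""
    rows, cols = len(grid), len(grid[0])
    for n in (rows, cols):
        if n < 1 or n & (n - 1):
            raise ValueError("grid dimensions must be powers of 2")
    total_units = rows * cols  # T, the size of the largest block

    # Build the sequence of block shapes by repeated doubling.
    shapes = [(1, 1)]
    br, bc = 1, 1
    while br * bc < total_units:
        if bc <= br and bc < cols:
            bc *= 2
        elif br < rows:
            br *= 2
        else:
            bc *= 2
        shapes.append((br, bc))

    # S for each block size: sum of squared block totals / block size.
    s_values = [sum(t * t for t in block_totals(grid, r, c)) / (r * c)
                for r, c in shapes]

    result = {}
    for k in range(len(shapes) - 1):
        size = shapes[k][0] * shapes[k][1]
        degrees_of_freedom = total_units / (2 * size)  # D = T / (2 * size)
        result[size] = (s_values[k] - s_values[k + 1]) / degrees_of_freedom
    return result

# Hypothetical 4 x 4 grid with a single 2 x 2 cluster of artifacts.
counts = [
    [4, 4, 0, 0],
    [4, 4, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
for size, m in dav_mean_squares(counts).items():
    print(f"block size {size}: M = {m}")
```

In this toy grid M is zero at block sizes 1 and 2 and first rises at block size 4, the size of the cluster, which is the kind of signal one looks for when plotting M against quadrat size.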

Arguably, this cluster size has something to do with the size of activity areas in sites or the size of household groups or communities at the regional level, particularly if it can be shown that the same scale pertains for different categories of data (e.g., lithics, pottery, faunal remains). When this happens, Whallon (1973) suggests, we know what scale we should be using to search for the co-occurrences of artifacts and other materials that should help us identify what activities took place within each cluster.

The DAV method does have its problems, however. First we are faced with the fact that we can only apply it to rectangular areas gridded in such a way that their dimensions are some power of 2 of the smallest quadrat. Second, the smallest scale at which we can reliably detect any patterning is about twice the size of this smallest quadrat. Third, the method will miss unusually shaped clusters, such as linear or arced ones. Finally, as a result of the doubling, the precision with which we detect patterning at the larger scales is rather poor (Orton 1980: 149).

Ebert (1992) goes so far as to advocate an "antisite" archaeology or a "distributional archaeology" that uses what he calls dimensional analysis of variance, but which is really just an application of the variance-to-mean ratio, as its principal tool. Having noted the extreme dependence on scale that characterizes any attempt to identify activity areas and the like on the basis of artifact densities, he recalculates the variance-to-mean ratio for a variety of quadrat sizes and finds the quadrat size for each category of material that maximizes the ratio, which indicates the scale at which clusters exist. By comparing the scales of clustering for different kinds of material, he hopes to differentiate groups of materials that were manufactured, used and discarded in the same place from ones that were probably made in one place and discarded elsewhere (Ebert 1992: 193, 213). On the basis of this evidence, Ebert suggests that he can distinguish the remnants of expedient activities from ones that involved curation. He recognizes that the two are not mutually exclusive, and goes on to propose a classification of clusters at different scales that he identifies, for example, as foraging/logistic locations or foraging or multifamily bases (Ebert 1992: 229-33).

As we already saw in our discussion of variance-to-mean ratio and DAV, once archaeologists have discovered the scale at which spatial patterning occurs, they usually find this relatively uninteresting unless the pattern involves more than one kind of artifact, feature, or site. It is the regular co-occurrence of certain kinds of artifacts and other materials that defines a tool kit, and pattern in the relationships between various artifacts and materials in discrete locations that helps us discern the activities that took place in activity areas. Even in Central Place Theory, we could not conclude that we have a Christallerian lattice solely on the basis of regular spacing of sites; we need to know that the small towns are distributed between the central places and the villages between the small towns.

To address these kinds of problems, archaeologists use a number of methods that are intended to detect the way that various categories of materials are "mixed" in space. Once again, we can attempt to do this either with point patterns or with counts or other measures on grid quadrats. For illustration, however, let us consider the situation in figure 17.23, with two artifact types that occur in clusters. The types can be found in discrete, non-overlapping clusters, or their clusters can overlap, or both types can be well mixed in a single cluster. How do we measure this?

A simple grid-based method is based on the chi-square test that we have already considered in connection with artifact typology. We can easily construct a 2 x 2 paradigm with the four classes, A but no B, A and B, B but no A, and neither A nor B. If the types are well separated, we would expect most of the non-empty quadrats to contain either A, or B, but not both. If they are well mixed, we would expect most non-empty quadrats to have both A and B, but few to have only one or the other. In the example given here, chi-square is high enough, although not extremely high, that we would probably conclude that the two types are fairly well segregated.
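The following Python sketch builds the 2 x 2 presence/absence table from two lists of quadrat counts and computes the chi-square statistic without a continuity correction. The function names and counts are invented for illustration. With one degree of freedom, the conventional 0.05 critical value is 3.84.

```python
def cooccurrence_table(counts_a, counts_b):
    """Cross-tabulate presence/absence of types A and B over the same quadrats.

    Returns [[A and B, A but no B], [B but no A, neither A nor B]].
    """
    both = only_a = only_b = neither = 0
    for a, b in zip(counts_a, counts_b):
        if a > 0 and b > 0:
            both += 1
        elif a > 0:
            only_a += 1
        elif b > 0:
            only_b += 1
        else:
            neither += 1
    return [[both, only_a], [only_b, neither]]

def chi_square_2x2(table):
    """Pearson chi-square for a 2 x 2 table (no continuity correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("a row or column total is zero; the test is undefined")
    return n * (a * d - b * c) ** 2 / (margins[0] * margins[1] * margins[2] * margins[3])

# Hypothetical counts of two artifact types in the same twelve quadrats.
type_a = [3, 5, 2, 4, 0, 0, 0, 0, 1, 0, 0, 0]
type_b = [0, 0, 0, 1, 2, 6, 3, 4, 0, 0, 0, 2]

table = cooccurrence_table(type_a, type_b)
print(table, chi_square_2x2(table))
```

The statistic alone does not say whether the types are segregated or associated. That comes from the table: when "A and B" times "neither" is smaller than "A but no B" times "B but no A", the types tend to avoid one another.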

This approach, once again, is sensitive to the way in which we arrange our grid. If our quadrats are small enough, or the artifact types rare enough, that many of the quadrats are empty, we will get quite different results than we would if we enlarged the quadrats or restricted the size of the area analysed to ensure that there are very few empty quadrats.

A similar method that uses point provenience rather than grid quadrats is analogous to Nearest Neighbor Analysis. Here we would count how many times each artifact's nearest neighbor is of type A, and how many times it is of type B, and, again, use a 2 x 2 contingency table and a chi-square test of association between the two types (Orton 1980: 145-46). The problem with this approach, as with nearest neighbor analysis generally, is that it is very sensitive to the smallest scale of clustering in the data, and even tiny changes in the position of a few artifacts can have substantial effect on the result.
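Building the table for this nearest-neighbor version can be sketched as below. The point coordinates and the function name are hypothetical, ties in distance are resolved by taking whichever point comes first in the list (an arbitrary choice), and the brute-force search is only practical for modest numbers of artifacts.

```python
import math

def nearest_neighbor_type_table(points):
    """Count (type of artifact, type of its nearest neighbor) pairs.

    points is a list of (x, y, kind) tuples, where kind is "A" or "B".
    """
    table = {("A", "A"): 0, ("A", "B"): 0, ("B", "A"): 0, ("B", "B"): 0}
    for i, (x, y, kind) in enumerate(points):
        others = (p for j, p in enumerate(points) if j != i)
        nearest = min(others, key=lambda p: math.dist((x, y), (p[0], p[1])))
        table[(kind, nearest[2])] += 1
    return table

# Hypothetical piece-plotted artifacts (coordinates in meters).
artifacts = [
    (0.0, 0.0, "A"), (0.3, 0.1, "A"), (0.2, 0.4, "A"),
    (5.0, 5.0, "B"), (5.2, 5.3, "B"), (0.5, 0.5, "B"),
]
print(nearest_neighbor_type_table(artifacts))
```

The four counts form the 2 x 2 contingency table to which the chi-square test of association is then applied.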

Johnson (1977) and Hodder (Hodder and Okell 1978) have contributed other techniques of spatial association that are based on local densities of artifacts and the average distance to all points of each type, respectively. The former has some promise, although it is highly dependent on the size of the circles used to measure local densities around artifacts, and the latter works well but is computationally tedious. Both seem preferable to more common methods in a number of ways, however (Orton 1980: 150-55).

Another set of methods for the analysis of grid-based spatial data is called image segmentation. These methods take the data and partition it into zones thought to have different meanings. For example, we might want to distinguish zones of "background noise" from zones associated with house structures or settlements on the basis of artifact densities, soil conductivity, or phosphate concentrations. The methods are intended to take the fuzzy and noisy image from the raw data and "clean it up" to improve its definition. As with the smoothing methods mentioned above, the algorithms that accomplish this take into account the values of neighboring squares on the grid. In a Bayesian approach to image segmentation, the similarity of neighboring values increases our confidence in them, while a large contrast between a value and its neighbors decreases our confidence in that value. Buck et al. (1996: 276-91) give examples of Bayesian image segmentation for soil phosphate maps of two sites, one in Greece, and one in the United Kingdom.

Ring-and-Sector Analysis

The previous methods generally involved rectangular grids, but Ring-and-Sector Analysis involves a kind of radiating grid appropriate for analysis of spatial patterning that we might expect to be circular, as in Binford's "Men's Hearth Model," mentioned near the beginning of this chapter. It has been applied to the distribution of remains within teepee rings in the Prairies of North America as well as to Arctic and Maglemosian sites (Stapert 1994a; 1994b; Stapert and Johansen 1996).
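A radiating grid of this kind amounts to classifying each find by its distance from a chosen center (the ring) and its compass direction from that center (the sector). The sketch below shows only that bookkeeping; the function name, the ring width, the number of sectors and the find coordinates are all assumptions for illustration, not the specific procedure of the studies cited.

```python
import math
from collections import Counter

def ring_and_sector_counts(points, center, ring_width, n_sectors):
    """Count points per (ring, sector) cell around a center such as a hearth.

    Ring 0 is the innermost ring. Sector 0 starts at the positive x-axis and
    sectors are numbered counterclockwise.
    """
    cx, cy = center
    sector_angle = 2 * math.pi / n_sectors
    counts = Counter()
    for x, y in points:
        dx, dy = x - cx, y - cy
        ring = int(math.hypot(dx, dy) // ring_width)
        angle = math.atan2(dy, dx) % (2 * math.pi)
        sector = int(angle // sector_angle) % n_sectors
        counts[(ring, sector)] += 1
    return counts

# Hypothetical find spots (meters) around a hearth at the origin.
finds = [(0.5, 0.1), (0.4, 0.3), (-1.5, 0.1), (0.2, -2.5), (0.7, 0.2)]
cells = ring_and_sector_counts(finds, center=(0.0, 0.0), ring_width=1.0, n_sectors=4)
for (ring, sector), n in sorted(cells.items()):
    print(f"ring {ring}, sector {sector}: {n}")
```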

Edge effects

One of the problems with most kinds of spatial analysis is that the spaces we analyse have edges, and observations that are omitted because they fall just outside these boundaries might have had substantial effects on our results. For example, with Nearest Neighbor Analysis, we might assume that a particular point's nearest neighbor is quite a long way toward the center of the studied area when in fact there is a point considerably nearer that falls just outside the study area's boundary. If this happens very often, we will overestimate the mean distance to nearest neighbor considerably, and a conceivable result of this bias is that we fail to recognize clustering or mistake a random distribution for an even one. One way to compensate for edge effects in a case like this is to nest the analysed area within a larger study area. We calculate mean distance to nearest neighbor only for points within the inner area, but take points in the outer area into account when searching for nearest neighbors.
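The nested-area correction can be sketched as follows: only points inside the inner area contribute a distance, but the search for each nearest neighbor ranges over every recorded point, including those in the surrounding outer area. The function name, the rectangular inner area and the coordinates are assumptions for illustration.

```python
import math

def mean_nn_distance_with_guard(points, inner_bounds):
    """Mean nearest-neighbor distance for points inside inner_bounds.

    points is a list of (x, y) tuples covering the whole study area.
    inner_bounds is (xmin, ymin, xmax, ymax) for the nested inner area.
    """
    xmin, ymin, xmax, ymax = inner_bounds
    distances = []
    for i, (x, y) in enumerate(points):
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue  # outer-area points serve only as potential neighbors
        distances.append(min(math.dist((x, y), other)
                             for j, other in enumerate(points) if j != i))
    if not distances:
        raise ValueError("no points fall inside the inner area")
    return sum(distances) / len(distances)

# Hypothetical sites: two inside a 10 x 10 inner area, two just outside it.
sites = [(1.0, 1.0), (9.0, 9.0), (-1.0, 1.0), (9.0, 11.0)]
print(mean_nn_distance_with_guard(sites, (0, 0, 10, 10)))
```

If the two outer sites were dropped, each inner site's nearest neighbor would be the other inner site, more than 11 units away instead of 2, which is the overestimate that edge effects produce.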

References cited

Banning, E. B. (1988). Methodology. Pp. 13-25 in B. McDonald (ed.), The Wadi el Hasa Archaeological Survey 1979-1983, West-Central Jordan. Waterloo, Canada: Wilfrid Laurier University Press.

Binford, L. R. (1978). Dimensional analysis of behavior and site structure: Learning from an Eskimo hunting stand. American Antiquity 43: 330-61.

Bradley, R., and C. Small (1985). Looking for circular structures in post hole distributions: Quantitative analysis of two settlements from Bronze Age England. Journal of Archaeological Science 12: 285-97.

Buttler, W. (1938). Der Donauländische und der westliche Kulturkreis der jüngeren Steinzeit. Berlin: de Gruyter.

Carr, C. (1991). Left in the dust: Contextual information in model-focused archaeology. Pp. 221-56 in E. M. Kroll and T. D. Price (eds.), The Interpretation of Archaeological Spatial Patterning. New York: Plenum Press.

Christaller, W. (1933). Die zentralen Orte in Süddeutschland: eine ökonomisch-geographische Untersuchung über die Gesetzmäßigkeit der Verbreitung und Entwicklung der Siedlungen mit städtischen Funktionen. Jena.

Coinman, N., G. A. Clark, and J. Lindly (1988). A diachronic study of Paleolithic and early Neolithic site placement patterns in the southern tributaries of the Wadi el Hasa. Pp. 48-86 in B. McDonald (ed.), The Wadi el Hasa Archaeological Survey 1979-1983, West-Central Jordan. Waterloo, Canada: Wilfrid Laurier University Press.

Cribb, R. (1991). Nomads in Archaeology. Cambridge: Cambridge University Press.

Dixon, J. E., J. R. Cann, and C. Renfrew (1968). Obsidian and the origins of trade. Scientific American 218 (3): 38-46.

Ebert, J. I. (1992). Distributional Archaeology. Albuquerque: University of New Mexico Press.

Falconer, S. (1994). Village economy and society in the Jordan Valley: A study of Bronze Age rural complexity. Pp. 121-42 in G. Schwartz and S. Falconer (eds.), Archaeological Views from the Countryside. Washington: Smithsonian Institution.

Galanidou, N. (1993). Quantitative methods for spatial analysis at rockshelters: The case of Klithi. Pp. 357-66 in J. Andresen, T. Madsen and I. Scollar (eds.), Computing the Past. Computer Applications and Quantitative Methods in Archaeology, CAA 92. Aarhus, Denmark: Aarhus University Press.

Hillier, B., and J. Hanson (1984). The Social Logic of Space. Cambridge: Cambridge University Press.

Hodder, I., and C. Orton (1976). Spatial Analysis in Archaeology. Cambridge: Cambridge University Press.

Johnson, G. A. (1972). A test of the utility of Central Place Theory in archaeology. Pp. 769-85 in P. Ucko, R. Tringham and G. Dimbleby (eds.), Man, Settlement, and Urbanism. London: Duckworth.

Johnson, G. A. (1980). Rank-size convexity and system integration: A view from archaeology. Economic Geography 56: 234-47.

Kellogg, D. C. (1987). Statistical relevance and site locational data. American Antiquity 52: 143-50.

Kroll, E. M., and T. D. Price (1991). The Interpretation of Archaeological Spatial Patterning. New York: Plenum Press.

Kvamme, K. L. (1991a). Geographic Information Systems and archaeology. Pp. 77-84 in G. Lock and J. Moffett (eds.), CAA 91, Computer Applications and Quantitative Methods in Archaeology 1991. BAR International Series S577. Oxford: Tempus Reparatum.

Kvamme, K. L. (1991b). Terrain form analysis of archaeological location through Geographic Information Systems. Pp. 127-36 in G. Lock and J. Moffett (eds.), CAA 91, Computer Applications and Quantitative Methods in Archaeology 1991. BAR International Series S577. Oxford: Tempus Reparatum.

Kvamme, K. L. (1993). Spatial statistics and GIS: An integrated approach. Pp. 91-103 in J. Andresen, T. Madsen and I. Scollar (eds.), Computing the Past. Computer Applications and Quantitative Methods in Archaeology, CAA 92. Aarhus, Denmark: Aarhus University Press.

Ladefoged, T. N., S. M. McLachlan, S. C. L. Ross, P. J. Sheppard, and D. G. Sutton (1995). GIS-based image enhancement of conductivity and magnetic susceptibility data from Ureturituri Pa and Fort Resolution, New Zealand.

Leroi-Gourhan, A., and M. Brézillon (1966). L'habitation Magdalénienne no. 1 de Pincevent près Montereau (Seine-et-Marne). Gallia Préhistoire 9: 263-385.

van Leusen, P. M. (1993). Cartographic modelling in a cell-based GIS. Pp. 105-23 in J. Andresen, T. Madsen and I. Scollar (eds.), Computing the Past. Computer Applications and Quantitative Methods in Archaeology, CAA 92. Aarhus, Denmark: Aarhus University Press.

McDonald, B. (1992). Settlement patterns along the southern flank of Wadi al-Hasa: Evidence from "The Wadi al-Hasa Archaeological Survey." Pp. 73-76 in M. Zaghloul, K. 'Amr, F. Zayadine, R. Nabeel, and N. R. Tawfiq (eds.), Studies in the History and Archaeology of Jordan IV. Amman: Department of Antiquities and Maison de l'Orient Méditerranéen.

Metcalfe, D., and K. M. Heath (1990). Microrefuse and site structure: The hearths and floors of the Heartbreak Hotel. American Antiquity 55: 781-96.

Monmonier, M. (1991). How to Lie with Maps. Chicago: University of Chicago Press.

O'Connell, J. F. (1987). Alyawara site structure and its archaeological implications. American Antiquity 52: 74-108.

Orton, C. (1980). Mathematics in Archaeology. Cambridge: Cambridge University Press.

Peregrine, P. (1991). A graph-theoretic approach to the evolution of Cahokia. American Antiquity 56: 66-75.

Peterson, J. (1992). Fourier analysis of field boundaries. Pp. 149-56 in G. Lock and J. Moffett (eds.), CAA 91, Computer Applications and Quantitative Methods in Archaeology 1991. BAR International Series S577. Oxford: Tempus Reparatum.

Plog, F. (1968). Archeological surveys: A new perspective. Unpublished M.A. thesis, University of Chicago.

Reece, R. (1973). Roman coinage in the western empire. Britannia 4: 227-51.

Ridings, R., and C. G. Sampson (1990). There's no percentage in it: Intersite spatial analysis of Bushman (San) pottery decorations. American Antiquity 55: 766-80.

Stapert, D. (1994a). Intrasite spatial analysis and the Maglemosian site of Barmose I. Palaeohistoria 33/34: 31-51.

Stapert, D. (1994b). Inside or outside: That is the question. Some comments on the article by H. P. Blankholm. Palaeohistoria 33/34: 59-61.

Stapert, D., and L. Johansen (1996). Ring and sector analysis, and site 'IT' on Greenland. Palaeohistoria 37/38: 29-69.

Sullivan, A. P. (1992). Investigating the archaeological consequences of short-duration occupations. American Antiquity 57: 99-115.

von Thünen, J. H. (1826). Der Isolierte Staat in Beziehung auf Landwirtschaft und Nationalökonomie. Hamburg.

Whallon, R. (1973). Spatial analysis of occupation floors I: Application of dimensional analysis of variance. American Antiquity 38: 266-77.

Zarky, A. (1976). Statistical analysis of site catchments at Ocós, Guatemala. Pp. 117-30 in K. Flannery (ed.), The Early Mesoamerican Village. New York: Academic Press.

 

© 2007 Ted Banning, all rights reserved.