Sampling Strategies

One of the most important tasks an archaeologist faces is discovering sites in the landscape. Unless a site is clearly visible, it is necessary to use various methods of subsurface testing, from simple test pits dug by hand to technologically advanced methods of remote sensing such as ground-penetrating radar and magnetic surveying. However, because it is generally infeasible to test an entire survey area, the archaeologist must decide on the sampling strategy that best suits his or her purposes.

Sampling strategies can be classified as either non-probabilistic or probabilistic. Non-probabilistic sampling is used when the archaeologist is most interested in already visible or suspected sites and does not need to sample elsewhere. Probabilistic sampling is used when it is necessary to have a representative sample of the sites in a region (the "sample universe"), but it is possible to sample only a small percentage of the whole. By employing statistical methods, probabilistic sampling attempts to increase the probability that generalizations derived from the sample will be correct.

The following aerial photographs of western Montana illustrate various sampling strategies. For this hypothetical archaeological survey, the illustrated region has been divided into 999 equal sampling units, and in each example subsurface tests have been carried out on approximately 5% of the units. Note that the percentage of the sample universe that is sampled depends upon the circumstances of the survey and can be influenced by variables including the nature of the environment, the types of sites, and the budget and timeframe of the project. For example, a different project with a larger budget could sample 20% of the units (200 squares) instead of approximately 5% (48 squares).
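The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the function name `units_to_sample` is hypothetical, and rounding to the nearest whole unit is an assumption, since the lab does not say how its figure of 48 squares was arrived at.

```python
def units_to_sample(total_units: int, fraction: float) -> int:
    """Number of whole sampling units for a given sampling fraction.

    Rounding to the nearest unit is an assumption made for this sketch.
    """
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    return round(total_units * fraction)


TOTAL_UNITS = 999  # sampling units in the lab's sample universe

print(units_to_sample(TOTAL_UNITS, 0.05))  # units for a nominal 5% sample
print(units_to_sample(TOTAL_UNITS, 0.20))  # units for a nominal 20% sample
print(f"{48 / TOTAL_UNITS:.1%}")           # share of the universe covered by 48 units
```

The last line shows why the text says "approximately 5%": 48 of 999 units is slightly under one unit in twenty.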

Exposed archaeological sites are coded yellow to illustrate how many have been found with each strategy.

	Illustration of All Sites. If it were possible to excavate the entire sample universe (composed of 999 equal sampling units) shown in this aerial photograph, eight prehistoric and historic sites would be found: a historic wagon trail, three archaic burial sites, a Palaeoindian quarry, a historic homestead, and two archaic settlements. Unfortunately, it would be extremely unusual for an archaeologist to have the opportunity to engage in such complete testing.

	Non-probabilistic sampling. This type of sampling would be useful to an archaeologist interested in known sites. Their locations may be recorded in documentary sources, may be part of local knowledge, or may simply be visible in the landscape. For example, here the historic wagon trail is clearly visible on the aerial photo. In this case, the only areas excavated are those around the two historic sites that were already known to the archaeologist. The six prehistoric sites have not been discovered.

	Simple random sampling. This strategy is the simplest form of probabilistic sampling. Sampling units are selected on a completely random basis. The greatest drawback to this strategy is that, depending on the dispersion of the randomly selected numbers, large parts of the region may be left out of the sampling completely. For example, note the concentration of units towards the bottom of the aerial photo and the relatively sparse areas in the center. In this case, the sampling uncovered five of the eight sites: the historic wagon trail, the Palaeoindian quarry, an archaic burial site and settlement, and the historic homestead. However, note that if different random numbers had been used to determine sampling units, there could have been an entirely different result.
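Simple random selection of units can be sketched in Python as follows. This is a minimal illustration, not the procedure used to produce the lab's images: numbering the units 0 to 998 and the function name `simple_random_sample` are assumptions made here. The two seeds stand in for "different random numbers" and show that each draw yields a different set of units.

```python
import random

N_UNITS = 999      # sampling units in the sample universe (from the lab)
SAMPLE_SIZE = 48   # approximately 5% of the units (from the lab)


def simple_random_sample(n_units, sample_size, seed=None):
    """Draw unit IDs 0..n_units-1 at random, without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_units), sample_size))


# Two different seeds give two different sets of units to test.
first = simple_random_sample(N_UNITS, SAMPLE_SIZE, seed=1)
second = simple_random_sample(N_UNITS, SAMPLE_SIZE, seed=2)
print(first[:5])
print(len(set(first) & set(second)), "units in common")
```

Nothing in this procedure forces the chosen units to spread evenly, which is the drawback described above.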

	Stratified random sampling. Also a form of probabilistic sampling, stratified random sampling attempts to minimize variability within different zones (or "strata") in the sample universe. The sample universe is divided into large natural zones, and each zone is allocated a number of sample units proportional to its area. The position of units within each zone is determined by random sampling. In this case, the sample universe can easily be divided into a sloping riverbank zone (above the dotted blue line) and a flatter prairie zone (below the dotted blue line). As the riverbank zone makes up approximately two-fifths of the sample universe, it is allocated two-fifths of the sample units. The sampling uncovered four of the eight sites: the historic wagon trail, the Palaeoindian quarry, and an archaic burial site and settlement.
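Proportional allocation followed by random selection within each stratum might look like the Python sketch below. The stratum sizes (400 riverbank units and 599 prairie units) are hypothetical stand-ins for "approximately two-fifths"; the lab does not give exact counts. The function names and the largest-remainder rounding rule are likewise assumptions of this sketch.

```python
import random


def allocate(sample_size, stratum_sizes):
    """Split sample_size across strata in proportion to their size.

    Uses largest-remainder rounding so the allocations sum to sample_size.
    """
    total = sum(stratum_sizes.values())
    exact = {name: sample_size * size / total
             for name, size in stratum_sizes.items()}
    alloc = {name: int(share) for name, share in exact.items()}
    leftover = sample_size - sum(alloc.values())
    # Give the remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda n: exact[n] - alloc[n], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


def stratified_random_sample(strata, sample_size, seed=None):
    """strata maps a stratum name to the list of unit IDs it contains."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    alloc = allocate(sample_size, sizes)
    return {name: sorted(rng.sample(units, alloc[name]))
            for name, units in strata.items()}


# Hypothetical strata: unit IDs 0-399 are riverbank, 400-998 are prairie.
strata = {
    "riverbank": list(range(0, 400)),
    "prairie": list(range(400, 999)),
}
sample = stratified_random_sample(strata, sample_size=48, seed=1)
print({name: len(units) for name, units in sample.items()})
```

With these assumed sizes, the riverbank stratum receives 19 of the 48 units and the prairie stratum 29, roughly the two-fifths to three-fifths split described in the text.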

	Systematic sampling. In this probabilistic strategy, sample units are evenly distributed throughout the sample universe. The areas of low sample concentration that can be a problem in random sampling are avoided in systematic sampling. However, in an unusual situation in which the sites are regularly spaced in a pattern approximating the layout of the sample units but slightly offset from it, it is possible to miss every site. In this case, the sampling uncovered four of the eight sites: the historic wagon trail, an archaic burial site and settlement, and the historic homestead.
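An evenly spaced selection, together with the offset failure case, can be sketched in Python. The lab does not state the shape of its grid; 27 rows by 37 columns is assumed here only because it multiplies to 999. The spacing of five units, the offsets, and the regularly spaced "sites" are all hypothetical.

```python
ROWS, COLS = 27, 37  # assumed grid layout; 27 * 37 = 999 units


def systematic_sample(rows, cols, step, offset=0):
    """Return (row, col) units spaced `step` apart in both directions."""
    return [(r, c)
            for r in range(offset, rows, step)
            for c in range(offset, cols, step)]


units = systematic_sample(ROWS, COLS, step=5, offset=1)
print(len(units))  # number of units selected

# Hypothetical sites on the same 5-unit spacing but shifted by two units:
# the sample grid never lands on any of them.
sites = systematic_sample(ROWS, COLS, step=5, offset=3)
print(len(set(units) & set(sites)))  # number of sites hit
```

Under these assumptions the spacing selects 48 units, and none of them coincides with the offset sites, which is the unusual situation the paragraph above warns about.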

	Stratified systematic or systematic unaligned sampling. This probabilistic strategy combines the characteristics of simple random sampling and systematic sampling into a single strategy that limits their drawbacks. The sample universe is divided into smaller, regularly spaced regions, then a sample unit is chosen randomly from each of these regions. The sample units are evenly dispersed, but not so regularly positioned as to miss equivalently positioned but offset sites. In this case, the sampling uncovered five of the eight sites: the historic wagon trail, an archaic burial site and two settlements, and the historic homestead.
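The procedure as described here (one randomly placed unit per regular region) can be sketched in Python. As before, the 27 by 37 grid is an assumption chosen because it multiplies to 999, and the block size of five and the function name are hypothetical. Other published variants of systematic unaligned sampling constrain the random offsets differently; this sketch follows only the description in the text.

```python
import random

ROWS, COLS = 27, 37  # assumed grid layout; 27 * 37 = 999 units


def stratified_systematic_sample(rows, cols, block, seed=None):
    """Pick one random unit from each block x block region of the grid."""
    rng = random.Random(seed)
    chosen = []
    for r0 in range(0, rows, block):
        for c0 in range(0, cols, block):
            # Edge regions are clipped to the grid, so they may be smaller.
            r = rng.randrange(r0, min(r0 + block, rows))
            c = rng.randrange(c0, min(c0 + block, cols))
            chosen.append((r, c))
    return chosen


units = stratified_systematic_sample(ROWS, COLS, block=5, seed=1)
print(len(units))  # one unit per region
```

Because every region contributes exactly one unit, coverage stays even across the grid, while the random position within each region breaks up the fixed spacing that lets a purely systematic layout miss regularly spaced sites.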

Conclusions

Of the many types of sampling strategies discussed here, all are useful in certain situations, but none is perfect. For example, a significant danger of using only probabilistic sampling techniques in field survey is that a major site may be overlooked, resulting in a skewed analysis of the archaeology of the sample universe. The solution to this problem is for a good field survey to also consider features that lie outside the sampled units. Even so, unless the archaeologist is very lucky, he or she is unlikely to discover all the sites in the sample universe.

This lab illustrates the use of sampling strategies in an archaeological survey, but remember that the probabilistic sampling strategies presented above are independent of both survey methods and the archaeological field conditions. Sampling strategies are used by archaeologists in many other situations, including: site survey, site excavation, and artifact analysis. In all of these cases, even though the sample universe is different, the basic sampling strategies remain the same. Use your knowledge of sampling strategies to answer the following questions:

Lab Questions

1. Looking at the illustration of all known sites and the illustrations of sampling strategies above, first comment on the types of sites that are most visible through sampling and the sites that are frequently missed. Then explain the differences. Remember that any probabilistic sample units are assigned according to idealized strategies that are independent of the archaeological field conditions.

2. You need to derive a representative sample of artifacts from a collection of archaic artifacts that contains lithics, groundstone, and ceramics. Which sampling strategy would you use to make sure that each type of artifact was equally well represented? Note that this question has nothing to do with field survey.

3. If you are working on a large Mesoamerican collection of ceramic artifacts which contains both plainware (unpainted) and painted vessels, and you are only interested in the painted vessels, which sampling strategy would you use? Note that this question has nothing to do with field survey.

4. What are the two probabilistic sampling strategies that you could use if you wanted to ensure that sample units were dispersed across your entire sample universe?

5. Which sampling strategy would you use in a field survey in which you already have documentary evidence of the position of the site that interests you?

6. Of the probabilistic approaches outlined here, which would you think is the least useful to the archaeologist engaged in a field survey of the region in western Montana illustrated in this lab? Explain why.

*Title illustration modified from an image in Field Methods in Archaeology: Seventh Edition, Hester et al. Mayfield Publishing Co., 1997.