Matthews' digital data sets of global vegetation and land use rely on traditional sources of data, such as maps and statistics, which contain a large amount of underexploited information. The information content of these data bases can be leveraged by combining series of data sets developed from traditional sources with each other or with remote sensing data. In particular, the traditionally-based data sets can provide detailed information about surface features whose gross characteristics are determined by remote sensing, as well as provide longer time histories than the ~20-year histories possible from satellite-based instruments.
Information on the current and past status of the global land surface is needed for many types of investigations, but the information required varies with the research area. For global carbon cycle studies, it is important to know more than just the current status of land cover; information is needed regarding historical conversions and dispositions of disturbed lands, including length and intensity of use. Because the character and level of detail of required land surface information are a function of the questions asked, it is crucial to define data needs, including thematic detail and spatial and temporal resolution. For example, for questions regarding surface albedo, a high level of land surface detail is not needed because albedos do not vary a great deal among vegetation types. In contrast, studies of the methane cycle, in which emissions often originate from point sources, require great detail about selected areas, and essentially no information exists about most of the Earth. While it is possible to design flexible data sets that can be tailored to a wide variety of research areas, no data set will fulfill all needs.
Vegetation and Land Use Data Bases
Matthews' interest in global vegetation data arose in the late 1970s, primarily in order to improve prescriptions of land surface boundary conditions in the GISS GCM (Goddard Institute for Space Studies' General Circulation Model). At that time, land cover in GCMs was divided into a few types such as forest, grassland, desert, and ice. There was no global digital data set of land cover, no globally consistent series of vegetation maps, and no internationally accepted classification system in use. The closest thing to such an international system was that of UNESCO but the only UN FAO vegetation map in existence (Mediterranean Zone) did not use the UNESCO system; later FAO maps covering South America and Africa also did not use the system. At the time, detailed, digital information was a major improvement over what was available.
For the global vegetation data base published in 1983, Matthews compiled information from about 100 sources (primarily maps) at one degree latitude/longitude resolution. Legends from these maps were translated into, and recorded with, the 5-tier hierarchical UNESCO classification system. Information from the maps was then recorded using the UNESCO codes. Classification criteria include lifeform, density, seasonality, climate, plant architecture, altitude, and environmental setting. The philosophy underlying this approach was to provide for flexibility both in recording the variable amount of detail available from sources, and subsequently in tailoring the data sets to a variety of research areas.
In the UNESCO classification scheme used in Matthews' vegetation data base, five lifeform categories comprise the first level of the hierarchy; these are closed forest, woodland, shrubland, dwarf scrub and related communities, and herbaceous vegetation. Each of these is further broken down into subcategories primarily on the basis of seasonality. For example, closed forest may be evergreen, deciduous or xeromorphic. Evergreen forests are broken down into ten classes, such as tropical rainforest; tropical rainforest is further refined into eight categories, and so on. Globally, Matthews' vegetation data base identifies 178 vegetation types. For GCM applications involving, e.g., surface reflectance features, these types are condensed to about 10; for carbon studies, about 30 types are used. Other applications dictate alternative divisions.
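The condensing of a detailed hierarchy into fewer types can be sketched as follows. This is only an illustration, not the procedure used for the data base: the cell list is hypothetical, the labels are plain strings rather than UNESCO codes, and condense and count_types are assumed names.

```python
from collections import Counter

# Hypothetical cells, each labeled with a path through the hierarchy.
# The real data base records UNESCO codes; plain labels are used here.
CELLS = [
    ("closed forest", "evergreen", "tropical rainforest"),
    ("closed forest", "deciduous"),
    ("closed forest", "evergreen", "tropical rainforest"),
    ("closed forest", "xeromorphic"),
    ("woodland",),
    ("herbaceous vegetation",),
]


def condense(path, depth):
    """Truncate a hierarchical label to its first `depth` levels."""
    return path[:depth]


def count_types(cells, depth):
    """Count cells per vegetation type at the requested level of detail."""
    return Counter(condense(path, depth) for path in cells)


coarse = count_types(CELLS, 1)  # lifeform only
fine = count_types(CELLS, 3)    # as much detail as each source provided
```

Because each label keeps its full path, the same records can be regrouped at any depth, which is the flexibility the hierarchical recording is meant to provide. Groupings that cut across the hierarchy would need an explicit lookup table instead of a simple truncation.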
A land use data set was compiled at the same time, and at the same spatial resolution, as the vegetation data. Because there was no existing land use classification scheme appropriate for a global study, a triple-tier hierarchical system was developed by Matthews for the project. In a manner similar to the UNESCO system for vegetation classification, major categories of farming systems are further broken down to reflect a greater level of detail. Classification of major farming systems into, e.g., intensive subsistence farming or large-scale commercial farming, is based on their permanence and extent of disturbance. At the most detailed level, the data set identifies 119 land use types, many of which include crop combinations. For example, the system distinguishes three categories of subsistence farming: rudimentary, extensive, and intensive (i.e., rice). Other categories are extensive and intensive grazing, plantations, Mediterranean, large- and small-scale commercial, dairying, lumbering, and no use.
Because land use maps generally do not provide precise boundaries, Matthews developed a relationship between farming systems and cultivated area to integrate the vegetation and land use data sets in order to define land cover. The major farming systems were qualitatively scaled with respect to their impacts on the natural vegetation. Each system was assumed to replace a fraction of the natural vegetation. The assumptions are broad and the scale is simple. This procedure resulted in simple categories of cultivated fractions for 1° cells that range from 20% to 100% of cell area; subsistence agriculture is associated with low cultivation intensities, while large-scale commercial agriculture is associated with 100% cultivated area. Problems remain, particularly in characterizing the impact of less intensive land use systems.
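A minimal sketch of how a cultivated fraction might be applied to a 1° cell is given below. The fraction table is a placeholder: only the 20% to 100% range and the 100% value for large-scale commercial agriculture come from the text, and the other entries, together with the names cell_area_km2 and split_cell, are assumptions made for illustration. The cell area uses the standard formula for a latitude/longitude cell on a sphere.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

# Placeholder fractions. Only the 20%-100% range and the 100% value for
# large-scale commercial agriculture come from the text; the rest are assumed.
CULTIVATED_FRACTION = {
    "no use": 0.0,
    "subsistence": 0.2,
    "small-scale commercial": 0.6,
    "large-scale commercial": 1.0,
}


def cell_area_km2(lat_south, dlat=1.0, dlon=1.0):
    """Area of a lat/lon cell on a sphere, given its southern edge in degrees."""
    lat1 = math.radians(lat_south)
    lat2 = math.radians(lat_south + dlat)
    return EARTH_RADIUS_KM ** 2 * math.radians(dlon) * (math.sin(lat2) - math.sin(lat1))


def split_cell(lat_south, farming_system):
    """Return (cultivated, natural) area in km2 for one 1-degree cell."""
    area = cell_area_km2(lat_south)
    cultivated = CULTIVATED_FRACTION[farming_system] * area
    return cultivated, area - cultivated
```

Weighting by cell area matters because a 1° cell at 60° latitude covers roughly half the area of a cell at the equator, so global totals cannot be obtained by counting cells.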
The vegetation and land use data sets provide snapshots of pre-agricultural and present land cover conditions but do not reflect historical trends in change. These data sets have been used to estimate anthropogenically imposed biomass reductions from replacement of natural vegetation with agriculture. Estimates of pre-agricultural and present biomass suggest that about 12% of the original global biomass was replaced by agriculture. Penetration of large- and small-scale commercial agriculture into temperate forests between 30°N and 55°N accounts for about 60% of the global reduction in biomass. Due to poor land use data for the tropics, tropical reductions are probably under-estimated.
Global data bases such as those described here can be considered in three levels. Primary data bases are directly compiled, extrapolated or interpolated. They define topically detailed, large-scale characteristics of the Earth; many are nominal data (e.g., vegetation, soils, countries). Secondary data bases provide global distributions of source categories identified with properties important for exchanges of particular constituents. They are derived from primary data bases and supplemental information (e.g., animal densities = country + land use + population statistics); they contain nominal, interval, or ordinal data; and they may have a seasonal component. Tertiary data bases provide global distributions of gas exchanges for individual sources and are derived from combinations of primary and secondary data bases and emission factors.
For example, methane emission from wetlands is derived by combining vegetation, soils and inundation data to produce the wetland distribution; emission factors and inundation periods are applied to the wetland ecosystems to estimate the global methane emission from this source. Uncertainty increases from the primary through the tertiary level, and the tertiary level is most subject to revision as new emissions measurements become available. Information at each stage can be leveraged by coupling the data sets.
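The tertiary step for wetlands amounts to multiplying an area by an emission factor and an inundation period. The sketch below shows only that arithmetic; the wetland categories, areas, fluxes and periods are placeholders and not measurements, and annual_emission_tg is an assumed name.

```python
MG_PER_TG = 1e15   # 1 Tg = 1e12 g = 1e15 mg
M2_PER_KM2 = 1e6


def annual_emission_tg(area_km2, flux_mg_m2_day, inundation_days):
    """Annual CH4 emission in Tg from area, daily flux and inundation period."""
    return area_km2 * M2_PER_KM2 * flux_mg_m2_day * inundation_days / MG_PER_TG


# Placeholder records: (label, area in km2, flux in mg CH4 per m2 per day,
# inundation period in days). None of these values are measurements.
WETLANDS = [
    ("wetland type A", 1.0e6, 100.0, 100),
    ("wetland type B", 5.0e5, 50.0, 200),
]

total_tg = sum(annual_emission_tg(area, flux, days) for _, area, flux, days in WETLANDS)
```

Since the result is a simple product, an error in any one input carries through in proportion, which is one way to see why uncertainty grows toward the tertiary level.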
Characterizing Disturbance
A key issue is distinguishing areally extensive disturbances (e.g., agriculture) from point-intensive disturbances (e.g., mineral extraction) and characterizing the ecosystem disturbances that result from them. In addition to knowing the area of land converted, for which remote sensing is especially valuable, it is crucial to know what ecosystems are disturbed by various activities as well as the disposition of disturbed lands. More predictable mechanisms of anthropogenic disturbance include cultivated land and pastures; less predictable mechanisms include subsistence agriculture, human relocations, forest fires, extractive activities, and dams. Techniques to characterize disturbances include remote sensing (success varies with ecosystem), determining which types of land use are present and changing, and developing indices for extensive and intensive disturbances.
Current vegetation-land use associations confirm differential penetration of land uses into ecosystems. Such relationships vary among regions and can be useful for historical reconstructions of global land use. For example, in South America, 40% of grazing is concentrated in xeromorphic woods/shrublands which occupy 25% of the land area; 25% of grazing is in wooded grasslands which occupy 15% of the area; and 7% of grazing is in tropical/subtropical rainforests which occupy 35% of the area.
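One simple way to read the South American figures is to divide each ecosystem's share of grazing by its share of land area. The ratio below is offered only as an illustration and is not a measure defined in the source; concentration_ratio is an assumed name, and the percentages are those quoted above.

```python
# (share of grazing in %, share of land area in %), as quoted in the text
SOUTH_AMERICA = {
    "xeromorphic woods/shrublands": (40, 25),
    "wooded grasslands": (25, 15),
    "tropical/subtropical rainforests": (7, 35),
}


def concentration_ratio(use_share, area_share):
    """Ratio above 1 means the land use is over-represented in the ecosystem."""
    return use_share / area_share


ratios = {
    ecosystem: concentration_ratio(*shares)
    for ecosystem, shares in SOUTH_AMERICA.items()
}
```

On these numbers grazing is over-represented in xeromorphic woods/shrublands and wooded grasslands (ratios of about 1.6 and 1.7) and strongly under-represented in rainforests (0.2).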
Information on extent and type of disturbance can also be deduced from historical map series showing progressions of anthropogenic features. One such attempt to characterize the distribution of human impacts is a landscape disturbance data set, developed by Matthews and compatible with the vegetation and land use data sets. The disturbance index, which reflects the prevalence of anthropogenic features, is derived from Operational Navigation Charts (ONCs) dating mostly from the 1980s. These maps, designed for use by pilots, are at a scale of 1:1 M and include a great deal of information on anthropogenic features ranging from urban areas and dense road networks to oases and isolated airstrips. Global coverage comprises about 250 map sheets. An advantage of the ONCs is that the first sheets were published in the 1960s and areas in transition are frequently updated.
The landscape intensity index, determined for each 1° cell, is a qualitative ranking from zero (no human structures at all) to five (maximum disturbance). Of the total global, ice-free land area, about 25% is ranked completely undisturbed (index = 0), and nearly 50% is ranked minimally disturbed (index = 1). A test case was carried out in which a historical series of the landscape intensity index was developed for a portion of the Brazilian Amazon. Based on this study, it appears that two indices are probably required to characterize land use status and changes: one reflecting areally extensive features (e.g., ranches), and the other reflecting intensive, point uses (e.g., extractive activities).
Global Litter Pools
There are uncertainties in the background conditions against which land use changes are occurring and must be measured. An example of one of these background conditions is global litter production and litter pools, which have implications for modeling carbon-related impacts of human disturbances. Historically, global estimates of litter production have ranged from 25-70 Pg C/yr (1 Pg = 10^15 g), and estimates of the fine litter pool have ranged from ~50-200 Pg C. (The atmosphere contains ~700 Pg C; ~600 Pg C are held in live vegetation; and ~1500 Pg C are in soils. Under steady state conditions, another 50-60 Pg C are exchanged annually between the atmosphere and biosphere through photosynthesis and respiration.)
Litter measurements from >1000 sites were collected. Global distributions of litter production were developed using the published measurements stratified by ecosystem and simple regression models of litter production and its proxy, net primary production. The series of ten distributions of litter production from the models and measurements reveals a range of 25-70 Pg C for litter production; measurement- and model-based estimates for the litter pool range from 40 to 200 Pg C. Similar global totals of litter production are commonly underlain by large discrepancies in geographic litter distributions. The measurement compilation led to a survey of how well ecosystems are represented. Measurements of litter production represent ecosystems that cover ~60% of the Earth's ice-free surface; litter pool measurements represent ecosystems that cover only ~50% of the land surface. Climate-based litter production or pool models tend to overestimate values in arid regions, which occupy ~30% of the land surface. The availability of systematic validation data for ecosystem/biochemistry models, such as the litter data presented here, will reduce uncertainties in fluxes and pools, and hopefully reduce uncertainties in identifying the "missing carbon sink", which is equal to a small imbalance in large fluxes and stores.
Reference
Matthews, E., 1983. Global vegetation and land use: New high-resolution data bases for climate studies. J. Climate and Applied Meteorology, 22:474-487.