Which indicator to map?
Count of cases
Counts are used to display the burden of the disease in the population. This helps policymakers and control program managers target programs and allocate resources to the most affected areas. However, because population size varies across geographical areas, case counts alone do not allow identifying areas with an increased risk of transmission.
Crude rates
Crude rates are a summary measure of the incidence of a disease in a population. They are calculated by dividing the number of cases (or deaths) of a disease that occurred in a given period (often one year) by the average population of the area during that period. Depending on the disease frequency, rates are expressed per 1,000 or per 100,000 inhabitants. Rates allow comparison between geographical areas by accounting for varying population sizes.
In outbreak investigations, rates are usually expressed for the epidemic period and referred to as attack rates.
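The crude rate calculation described above can be sketched in a few lines of Python. The function name `crude_rate`, its parameters, and the figures used (67 cases in one year among an average population of 1,400,000) are illustrative assumptions, not part of any standard library.

```python
def crude_rate(cases: int, population: float, per: int = 100_000) -> float:
    """Number of cases divided by the average population, scaled to `per` inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * per


# Illustrative figures: 67 cases in one year, average population of 1,400,000.
rate = crude_rate(67, 1_400_000)
print(f"{rate:.1f} per 100,000")  # 4.8 per 100,000
```

For a rarer or more frequent disease, the `per` argument can be changed, for example `per=1_000`.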
Age and/or sex-specific rates
Crude rates may be confounded by age and/or sex if the distribution of the disease is known to be associated with age and/or sex and if the population structure by age and/or sex varies across geographical areas. In some countries, for instance, tuberculosis is known to occur at an increased rate among elderly people, and elderly people make up a larger share of the population in rural areas than in urban areas. Summarizing the incidence of tuberculosis using a crude rate will tend to over-represent rural areas with large elderly populations, even though the risk of being infected at a given age is not necessarily higher there.
Mapping age and/or sex-specific rates controls for these potential confounders. However, a single map cannot easily represent several age and/or sex-specific rates, so maps need to be repeated for each age and/or sex group.
Standardized rates
Visual inspection of age and/or sex-specific rates across geographical areas is a prerequisite to mapping data. Whenever there are large variations of rates between age and/or sex categories, summarizing the incidence through standardized rates may not be indicated. However, there are instances where such a summary incidence is useful to assess transmission risks across geographical areas after controlling for potential confounding by age and/or sex. This is achieved by a method called standardization of rates.
When age-specific incidence and population structure both differ between areas, as in Table 1, crude rates can mislead: the overall crude rate in district B is greater than that in district A (5.0 vs. 4.8), while both age-specific rates in district B are smaller than in district A (6.9 vs. 7.0 and 2.5 vs. 3.1). This paradox, called Simpson's paradox, results from confounding by age.
Table 1: Distribution of cases, population, and rates of disease by age group in 2 hypothetical districts
District A | Cases | Population | Rate* | District B | Cases | Population | Rate*
---|---|---|---|---|---|---|---
0-39 years | 42 | 600,000 | 7.0 | 0-39 years | 55 | 800,000 | 6.9
40 years & + | 25 | 800,000 | 3.1 | 40 years & + | 15 | 600,000 | 2.5
Total | 67 | 1,400,000 | 4.8 | Total | 70 | 1,400,000 | 5.0

* cases per 100,000 population
In these instances, standardization of rates is the technique required to control for this confounder if a single summary incidence value is desired.
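The paradox can be checked numerically. The sketch below recomputes crude and age-specific rates from the case counts of Table 1, assuming age-group populations of 600,000 and 800,000 (the values consistent with rates per 100,000 and with Table 3). The variable names are illustrative.

```python
PER = 100_000

# (cases, population) by age group; populations assumed to be 600,000 / 800,000.
districts = {
    "A": {"0-39": (42, 600_000), "40+": (25, 800_000)},
    "B": {"0-39": (55, 800_000), "40+": (15, 600_000)},
}


def rate(cases, population):
    """Rate per 100,000 population."""
    return cases / population * PER


crude = {}
specific = {}
for name, groups in districts.items():
    total_cases = sum(c for c, _ in groups.values())
    total_pop = sum(p for _, p in groups.values())
    crude[name] = rate(total_cases, total_pop)
    specific[name] = {age: rate(c, p) for age, (c, p) in groups.items()}

# The crude rate is higher in district B ...
print(crude["B"] > crude["A"])  # True
# ... although every age-specific rate is lower in district B.
print(all(specific["B"][age] < specific["A"][age] for age in specific["A"]))  # True
```

The reversal arises because district B has more of its population in the younger age group, where the rate is higher in both districts.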
Direct standardization
Direct standardization involves weighting age-specific rates by applying them to a reference population. Age-specific rates from districts A and B are applied to the same reference population to calculate the number of cases expected in each age group; the age-standardized rate is the total number of expected cases divided by the total reference population. Controlling for age by direct standardization, as presented in Table 2, shows that district B has a smaller age-standardized rate than district A, as expected when inspecting age-specific rates for both districts. The reference population can be an external population, such as the national population, used to standardize several indicators within a country, or an international reference population to allow for international comparisons. If the objective is simply to compare the 2 areas, it can be the combined population of the 2 districts, as in our example (which yields the same age weights as their average population).
Table 2: Calculation of age-standardized rates in 2 hypothetical districts by direct standardization
Age group | Reference population | District A: observed rate* | District A: expected cases | District B: observed rate* | District B: expected cases
---|---|---|---|---|---
0-39 years | 1,400,000 | 7.0 | 98 | 6.9 | 96
40 years & + | 1,400,000 | 3.1 | 44 | 2.5 | 35
Total | 2,800,000 | | 142 | | 131
Age-standardized rate* | | | 5.1 | | 4.7

* cases per 100,000 population
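As a minimal sketch of the calculation in Table 2, the following Python code applies each district's age-specific rates to the reference population and divides the expected cases by the reference total. It assumes the case counts of Table 1 with age-group populations of 600,000 and 800,000; the function and variable names are hypothetical. Because it works from unrounded rates, intermediate expected cases differ slightly from the rounded values in the table, but the standardized rates agree to one decimal.

```python
PER = 100_000

# (cases, population) by age group; populations assumed to be 600,000 / 800,000.
districts = {
    "A": {"0-39": (42, 600_000), "40+": (25, 800_000)},
    "B": {"0-39": (55, 800_000), "40+": (15, 600_000)},
}

# Reference population by age group, as in Table 2.
reference = {"0-39": 1_400_000, "40+": 1_400_000}


def direct_standardized_rate(groups, reference, per=PER):
    """Apply age-specific rates to the reference population, then rescale."""
    expected_cases = sum(
        cases / population * reference[age]
        for age, (cases, population) in groups.items()
    )
    return expected_cases / sum(reference.values()) * per


for name, groups in districts.items():
    print(name, f"{direct_standardized_rate(groups, reference):.1f}")
# A 5.1
# B 4.7
```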
Indirect standardization
When the age distribution of the cases is not available in districts A and B, or if age-specific rates are unstable because they are based on small numbers, indirect standardization is indicated. It consists of applying reference age-specific rates to the study populations. This yields the number of cases expected in each district if its incidence followed the reference rates. The age-standardized incidence ratio (labelled standardized rate ratio, SRR, in Table 3) is calculated by dividing the number of observed cases by the number of expected cases. It is sometimes multiplied by 100 and expressed as a percentage. Table 3 shows, in our theoretical example, that the incidence in district A is 1.02 times the reference incidence and that in district B is 0.95 times the reference incidence, which shows that, after standardizing for age, the incidence is lower in district B than in district A.
Table 3: Calculation of age-standardized rate ratios in 2 hypothetical districts by indirect standardization
Age group | Reference rate* | District A: population | District A: expected cases | District B: population | District B: expected cases
---|---|---|---|---|---
0-39 years | 7.0 | 600,000 | 42 | 800,000 | 56
40 years & + | 3.0 | 800,000 | 24 | 600,000 | 18
Total | | 1,400,000 | 66 | 1,400,000 | 74
Observed cases | | | 67 | | 70
Age-standardized rate ratio (SRR) | | | 1.02 | | 0.95

* cases per 100,000 population
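The calculation in Table 3 can be sketched as follows: reference age-specific rates are applied to each district's population to obtain expected cases, and the observed total is divided by the expected total. The data are those of Table 3; the function and variable names are illustrative assumptions.

```python
PER = 100_000

# Reference age-specific rates per 100,000, from Table 3.
reference_rates = {"0-39": 7.0, "40+": 3.0}

# District populations by age group and total observed cases, from Table 3.
populations = {
    "A": {"0-39": 600_000, "40+": 800_000},
    "B": {"0-39": 800_000, "40+": 600_000},
}
observed = {"A": 67, "B": 70}


def expected_cases(population, rates, per=PER):
    """Cases expected if the district experienced the reference rates."""
    return sum(rates[age] * population[age] / per for age in population)


def standardized_ratio(observed_cases, population, rates):
    """Observed cases divided by expected cases."""
    return observed_cases / expected_cases(population, rates)


for name in populations:
    ratio = standardized_ratio(observed[name], populations[name], reference_rates)
    print(name, f"{ratio:.2f}")
# A 1.02
# B 0.95
```

Only the total number of observed cases per district is needed, which is why this approach works when the age distribution of cases is unknown.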
Strategy for standardization
When considering whether standardization is indicated, the first step is to consider whether the mapped data can be confounded by variables such as age and/or sex. If the disease is not associated with age or sex, standardization on these variables is not required. Similarly, if the age and/or sex structure of the population is identical across geographical areas, standardization on age and/or sex is not required. In all other instances, standardization is required to control for the resulting confounding if a single summary value of the incidence of the disease is desired.
When the mapped data can potentially be confounded by age and/or sex, age and/or sex-specific rates allow accurate comparisons of the geographical distribution of the disease. However, whenever summary incidence information is preferred, age and/or sex-standardized rates are indicated.
Direct standardization allows better comparability across geographical areas but may be unreliable if age-specific rates are based on small numbers. In addition, age-standardized rates are hypothetical values that do not correspond to any rate actually observed. Indirect standardization requires less detailed information on cases. It is expressed as a percentage of a reference situation, which is easily understood. However, indirect standardization of rates is less robust for comparing different geographical areas when the population structure is very heterogeneous.
FEM Editor 2007
- Denis Coulombier
Original FEM Authors
- Christophe Paquet
- Arnold Tarantola
- Philippe Quenel
- Nada Ghosn
FEM Contributors
- Lisa Lazareck
- Denis Coulombier
- Vladimir Prikazsky