Common errors in surveillance data analysis

10 common errors in secondary analyses of surveillance data

1. Lack of focus on one specific disease or health problem

Description of the error The report lacks focus on a specific disease or health problem and reviews many diseases under surveillance superficially.

Rationale to change Surveillance data analysis is a careful, systematic exercise that requires focus to generate information useful for decision-making.

2. Failure to report the methods used

Description of the error The report does not mention which methods were used to analyze the surveillance data.

Rationale to change The "count, divide and compare" approach cannot be assumed to be obvious or intuitive to the reader. The author must explicitly describe all the steps taken in the data analysis. A description of the methods is all the more necessary when sophisticated analysis techniques are used.

3. Failure to calculate population-based incidence

Description of the error The report presents absolute numbers of cases without calculating rates.

Rationale to change The "count, divide and compare" approach is key to surveillance data analysis. Skipping the "divide" step prevents any meaningful comparison. A count of cases over time does not account for population growth. A map of case counts by geographical area does not adjust for differences in population density. A distribution of cases by age and sex does not reflect the population structure.
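As a minimal sketch of the count, divide and compare steps, the Python snippet below turns case counts into incidence per 100,000 population for two areas. The area names, case counts and population figures are invented for illustration, and the helper `incidence_per_100k` is a hypothetical name, not part of any surveillance toolkit.

```python
def incidence_per_100k(cases, population):
    """Divide step: number of cases per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * 100_000 / population

# Count step: hypothetical case counts and populations for two areas.
areas = {
    "Area A": {"cases": 120, "population": 600_000},
    "Area B": {"cases": 45, "population": 90_000},
}

# Divide step: convert each count into a rate.
rates = {
    name: incidence_per_100k(d["cases"], d["population"])
    for name, d in areas.items()
}

# Compare step: the area with the most cases is not the area with the highest rate.
highest_count = max(areas, key=lambda name: areas[name]["cases"])
highest_rate = max(rates, key=rates.get)

print("Most cases:", highest_count)
print("Highest incidence:", highest_rate)
```

In this made-up example the area with the larger count has the lower incidence, which is the kind of reversal that raw counts hide.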

4. Failure to use maps to display geographical observations

Description of the error The distribution of cases by geographical area is presented as a table or a graph rather than as a map.

Rationale to change A map is the primary tool for showing the spatial distribution of cases. It is the only way to present in two dimensions how cases are distributed in space.

5. Failure to use graphs to present time series

Description of the error Tables of numbers are used to present incidence over time.

Rationale to change Time series are best presented as line graphs showing the rates over time.

6. Display of raw data / insufficient data reduction

Description of the error The report displays insufficiently analyzed data in the form of large, complex tables from which no trend can be seen.

Rationale to change Surveillance data analysis is about data reduction so that raw data can be processed into information that can be used for decision-making. This systematic, careful and scientific process must generate outputs in the form of graphs (e.g., time series), tables (e.g., incidence by age and sex) and figures (e.g., maps) that display the message in a clear, summarized, explicit and scientifically honest manner.

7. Misuse of statistical tests

Description of the error Statistical tests are used excessively and inappropriately, including for testing hypotheses on the data that generated them.

Rationale to change Surveillance data are mainly analyzed to generate hypotheses. They can sometimes be used to test hypotheses, but only with care. A test can be used to determine whether a specific distribution may have occurred by chance. However, if statistical testing is needed at all, the author must always go through the following quick checklist:

  • Is it the right test?
  • Is the test calculated correctly?
  • Is the interpretation of the results of the test appropriate?

8. Analysis by more than one criterion at a time

Description of the error The analysis immediately breaks down the data by more than one criterion (e.g., by time and space or by person and time).

Rationale to change Data analysis proceeds like peeling an onion, one layer at a time. Initially, when looking at the data for one of the three criteria (time, place and person), the two others must be kept constant. When examining the incidence over time, use all population subgroups and the whole geographical area. When examining the incidence by area, use an average of the whole study period (or the last year) and all population subgroups. When considering the incidence by population subgroup, use an average yearly incidence (or the last year) and include the whole geographical area. Only when the data have been examined systematically through these steps can more advanced analyses be made to understand the patterns that emerge (e.g., if the incidence goes up, an analysis by population group over time or by area over time can point to where the increase in the number of cases comes from).
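The stepwise approach can be sketched in a few lines of Python. The line list below is invented for illustration, and `count_by` is a hypothetical helper. The sketch only counts cases. In a real analysis each count would then be divided by the matching population denominator.

```python
from collections import Counter

# Hypothetical line list: one record per reported case.
cases = [
    {"year": 2005, "area": "North", "age_group": "0-14"},
    {"year": 2005, "area": "South", "age_group": "15-44"},
    {"year": 2006, "area": "North", "age_group": "0-14"},
    {"year": 2006, "area": "North", "age_group": "0-14"},
    {"year": 2006, "area": "South", "age_group": "45+"},
]

def count_by(records, key):
    """Count cases along one criterion, pooling over the two others."""
    return Counter(record[key] for record in records)

# First layer: one criterion at a time.
by_time = count_by(cases, "year")         # all areas, all age groups
by_place = count_by(cases, "area")        # all years, all age groups
by_person = count_by(cases, "age_group")  # all years, all areas

# Second layer, only after the three views above: where does the
# increase seen in by_time come from? Here, area over time.
by_area_and_year = Counter((r["area"], r["year"]) for r in cases)

print(by_time, by_place, by_person, by_area_and_year, sep="\n")
```

In this toy data set the rise from 2005 to 2006 seen in `by_time` is traced to one area by the second-layer breakdown, which is the order of questions the text recommends.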

9. Over-interpretation of surveillance data

Description of the error The analysis is over-interpreted, with final conclusions not supported by the data.

Rationale to change In most cases, surveillance data are analyzed to generate hypotheses. Thus, in most cases, they cannot be used to test hypotheses. Above all, they should never be used to test the hypotheses that they generated. That would be the worst possible error. For example, there is a peak of disease in the summer, and the hypothesis is generated that the disease is more common in the summer. The same data are then used in a chi-square test to compare rates in the summer with rates during other seasons. This reasoning is circular: the season was chosen because the data showed a peak there, so the test cannot provide independent confirmation.
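A small simulation can make the circularity concrete. The sketch below is an illustration under stated assumptions, not a method from the source. Cases are scattered at random over four equally long seasons, so no season is truly different. A simple one-sided z-test on a proportion stands in for the chi-square test mentioned above, and all names and parameters (`simulate`, `n_sims`, `n_cases`, `seed`) are hypothetical.

```python
import math
import random

def one_sided_z(k, n, p0=0.25):
    """z statistic for k cases out of n against an expected proportion p0."""
    return (k - n * p0) / math.sqrt(n * p0 * (1 - p0))

def simulate(n_sims=2000, n_cases=200, seed=1):
    """Return the share of 'significant' results for two testing strategies
    when cases are in fact spread evenly over four seasons."""
    rng = random.Random(seed)
    critical = 1.645  # one-sided 5% threshold of the standard normal
    a_priori_hits = 0
    post_hoc_hits = 0
    for _ in range(n_sims):
        counts = [0, 0, 0, 0]
        for _ in range(n_cases):
            counts[rng.randrange(4)] += 1
        # A priori: the season to test (index 0) is fixed before seeing the data.
        if one_sided_z(counts[0], n_cases) > critical:
            a_priori_hits += 1
        # Post hoc: the peak season is picked from the data, then tested on them.
        if one_sided_z(max(counts), n_cases) > critical:
            post_hoc_hits += 1
    return a_priori_hits / n_sims, post_hoc_hits / n_sims

a_priori_rate, post_hoc_rate = simulate()
print("False-positive share, season chosen in advance:", a_priori_rate)
print("False-positive share, peak season chosen from the data:", post_hoc_rate)
```

Under these assumptions the post hoc strategy should flag a "significant" seasonal peak far more often than the nominal 5%, even though the simulated data contain no seasonal effect at all.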

10. Poor recommendations

Description of the error The recommendations are absent or not based on the data presented.

Rationale to change Recommendations must be focused, based on the presented results, specific, feasible, ethical, and practical. In field epidemiology, it is best to frame recommendations as (a) additional investigations and/or (b) public health action.

FEM PAGE CONTRIBUTORS 2007

Original author
Yvan Hutin
Contributor
Vladimir Prikazsky
