Difference between revisions of "Choosing a method of data display"

m
m (General recommendation)
 
(5 intermediate revisions by the same user not shown)
Line 1: Line 1:
General recommendation
+
==General recommendation==
Before constructing any display of epidemiologic data, it is important to first determine the point to be conveyed. Are you highlighting a change from past patterns in the data? Are you showing a difference in incidence by geographic area or by some predetermined risk factor? What is the interpretation you want he reader to reach? Your answer to these questions will help to determine the choice of display <Ref>U.S. Dept. of Health and Human Services - Centers for Disease Control and Prevention (CDC). Self-study course 3030-G. Principles of epidemiology. An introduction to applied epidemiology and biostatistics. 2nd ed.</ref>.
+
Before constructing any display of epidemiologic data, it is important first to determine the point to be conveyed. Are you highlighting a change from past patterns in the data? Are you showing a difference in incidence by geographic area or by some predetermined risk factor? What is the interpretation you want the reader to reach? Your answer to these questions will help to determine the choice of display <Ref>U.S. Dept. of Health and Human Services - Centers for Disease Control and Prevention (CDC). Self-study course 3030-G. Principles of epidemiology. An introduction to applied epidemiology and biostatistics. 2nd ed.</ref>.
  
As a general recommendation, use a table for precise numbers, for large amounts of numbers, and if there is a great range between the largest and smallest figures. Use a graph for showing trends and relationships, displaying changes over time, and for explaining a point vividly. Use either a table or a graph for comparisons, and for showing parts of a whole <Ref>Bigwood S, Spore M. Presenting numbers, tables and charts. Oxford University Press, New York, 2003 p. 84</ref>.
+
As a general recommendation, use a table for precise numbers, for large amounts of numbers, and if there is a great range between the largest and smallest figures. Use a graph to show trends and relationships, displaying changes over time and explaining a point vividly. Use a table or a graph for comparisons and showing parts of a whole <Ref>Bigwood S, Spore M. Presenting numbers, tables and charts. Oxford University Press, New York, 2003 p. 84</ref>.
  
Often the choice between presenting data in a table or graph is arbitrary as both will work. In general when presenting lots of data e.g. in an annual report it is best to vary how the data is presented by making use of both tables and graphs.  A graphical presentation of data has the advantage of enabling a person to visualise a relationship between data i.e. proportions in groups. There is also a subjective nature to this as some individuals find it easier to interpret tables while others find a visual representation more easy to interpret.
+
The choice between presenting data in a table or graph is often arbitrary as both will work. Generally, when presenting lots of data, e.g. in an annual report, it is best to vary how the data is presented using tables and graphs.  A graphical presentation of data can enable a person to visualise a relationship between data, i.e. proportions in groups. There is also a subjective nature to this as some individuals find it easier to interpret tables while others find a visual representation easier to interpret.
  
If you decide that a graph is the best way to present your information, then no matter what type of graph you use, you need to keep in mind the following 10 tips <Ref>Statistics Canada, Statistics: Power from data! - Summary</ref>:
+
If you decide that a graph is the best way to present your information, then no matter what type of graph you use, you need to keep in mind the following 10 tips:<Ref>Statistics Canada, Statistics: Power from data! - Summary</ref>
  
* convey an important message
+
# convey an important message
* decide on a clear purpose
+
# decide on a clear purpose
* draw attention to the message, not the source
+
# draw attention to the message, not the source
* experiment with various options and graph styles
+
# experiment with various options and graph styles
* use simple design for complex data
+
# use simple design for complex data
* make the data 'speak'
+
# make the data 'speak'
* adapt graph presentation to suit the target audience
+
# adapt graph presentation to suit the target audience
* ensure that the visual perception process is easy and accurate
+
# ensure that the visual perception process is easy and accurate
* avoid distortion and ambiguity
+
# avoid distortion and ambiguity
* optimize design and integrate style with text and tables
+
# optimize design and integrate style with text and tables
* describing one variable
+
# describing one variable
The first step is to describe one variable which is crucial before one starts to compare two or more variables. The table below summarises the most common presentation formats for the different types of variables, from the "simplest" to the more "complex".  
+
The first step is to describe one variable, which is crucial before comparing two or more variables. The table below summarises the most common presentation formats for the different types of variables, from the "simplest" to the more "complex".  
  
 
{| class="wikitable"  
 
{| class="wikitable"  
Line 33: Line 33:
 
| style="vertical-align:bottom;" | Describe with proportions
 
| style="vertical-align:bottom;" | Describe with proportions
 
| style="text-decoration:underline; color:#0563C1;" | [[Tables#Frequency_distribution_table|Frequency distribution table]]
 
| style="text-decoration:underline; color:#0563C1;" | [[Tables#Frequency_distribution_table|Frequency distribution table]]
| style="vertical-align:bottom; font-weight:bold; color:#06D;" | Bar graph, Pie graph
+
| style="vertical-align:bottom; font-weight:bold; color:#06D;" | [[Bar graphs|Bar graph]], Pie graph
 
|-
 
|-
 
| style="vertical-align:bottom;" | Nominal  (categorical not ordered)
 
| style="vertical-align:bottom;" | Nominal  (categorical not ordered)
 
| style="vertical-align:bottom;" | Describe with proportions
 
| style="vertical-align:bottom;" | Describe with proportions
 
| style="text-decoration:underline; color:#0563C1;" | [[Tables#Frequency_distribution_table|Frequency distribution table]]
 
| style="text-decoration:underline; color:#0563C1;" | [[Tables#Frequency_distribution_table|Frequency distribution table]]
| style="vertical-align:bottom; font-weight:bold; color:#06D;" | Bar graph, Pie graph
+
| style="vertical-align:bottom; font-weight:bold; color:#06D;" | [[Bar graphs|Bar graph]], Pie graph
 
|- style="vertical-align:bottom;"
 
|- style="vertical-align:bottom;"
 
| Ordinal  (categorical (ordered)
 
| Ordinal  (categorical (ordered)
 
| Describe with proportions
 
| Describe with proportions
| style="font-weight:bold; color:#06D;" | [[Tables#Frequency_distribution_table|Frequency distribution table]] (also cumulative)
+
| style="font-weight:bold; color:#06D;" | [[Tables#Frequency_distribution_table|Frequency distribution table]] (also [[Tables#Cumulative_frequency_distribution_table|cumulative]])
| style="font-weight:bold; color:#06D;" | Bar graph(also   cumulative), Pie graph
+
| style="font-weight:bold; color:#06D;" | [[Bar graphs|Bar graph]](also cumulative), Pie graph
 
|- style="vertical-align:bottom;"
 
|- style="vertical-align:bottom;"
 
| Numerical  discrete
 
| Numerical  discrete
 
| Describe with proportions, means  and standard deviation
 
| Describe with proportions, means  and standard deviation
| style="font-weight:bold; color:#06D;" | [[Tables#Frequency_distribution_table|Frequency distribution table]] (also cumulative), Table of descriptive statistics
+
| style="font-weight:bold; color:#06D;" | [[Tables#Frequency_distribution_table|Frequency distribution table]] (also [[Tables#Cumulative_frequency_distribution_table|cumulative]]), Table of descriptive statistics
| style="font-weight:bold; color:#06D;" | Bar graph (also   cumulative), Histogram (if large number of values)
+
| style="font-weight:bold; color:#06D;" | [[Bar graphs|Bar graph]](also cumulative), Histogram (if large number of values)
 
|- style="vertical-align:bottom;"
 
|- style="vertical-align:bottom;"
 
| Numerical  continuous
 
| Numerical  continuous
 
| Describe with means, medians,  standard deviation, quartiles
 
| Describe with means, medians,  standard deviation, quartiles
 
| style="font-weight:bold; color:#06D;" | [[Tables#Frequency_distribution_table|Frequency distribution table]] (group frequencies or cumulative), Table of descriptive statistics
 
| style="font-weight:bold; color:#06D;" | [[Tables#Frequency_distribution_table|Frequency distribution table]] (group frequencies or cumulative), Table of descriptive statistics
| style="font-weight:bold; color:#06D;" | Histogram (also   cumulative), Frequency polygon, Box-and-whisker plot, Violin plot, One-way scatter   plot
+
| style="font-weight:bold; color:#06D;" | Histogram (also cumulative), Frequency polygon, Box-and-whisker plot, Violin plot, One-way scatter plot
 
|}
 
|}
Describing two variables together
 
There are potentially 5x5 = 25 combinations of the types of variables mentioned in the table above; there are many potential graphs and tables to describe these.  The important thing is that you understand what you wants to show in those tables or graphs. For describing two variables (X and Y) together the strategy is basically the following:
 
  
First, consider one variable as the "outcome" (Y) and the other as the "factor" (X), i.e. explanatory variable. Then describe the outcome (Y) in each group that you can make with the factor (X). Remember that the outcome will be described according to its nature as explained above (univariate description).
+
==Describing two variables together==
 +
There are potentially 5x5 = 25 combinations of the types of variables mentioned in the table above; there are many potential graphs and tables to describe these. The important thing is that you understand what you want to show in those tables or graphs. For describing two variables (X and Y) together, the strategy is basically the following:
  
Below you find a simple summary of describing two variables together.
+
First, consider one variable as the "outcome" (Y) and the other as the "factor" (X), i.e. explanatory variable. Then describe each group's outcome (Y) that you can make with the factor (X). Remember that the outcome will be described according to its nature as explained above (univariate description).
  
{| class="wikitable"  
+
Below you will find a simple summary of describing two variables together.
|- style="font-weight:bold; text-align:center; vertical-align:bottom;"
+
 
! colspan="4" | Describing   TWO variables together
+
{| class="wikitable" style="color:#222;"
 +
|- style="font-weight:bold; text-align:center;"
 +
! colspan="4" | Describing TWO variables together
 
|- style="font-weight:bold; vertical-align:bottom;"
 
|- style="font-weight:bold; vertical-align:bottom;"
 
| Variable
 
| Variable
Line 70: Line 71:
 
| Table
 
| Table
 
| Graph
 
| Graph
|-
+
|- style="vertical-align:bottom;"
| style="vertical-align:bottom;" | Two   categorical variables
+
| Two categorical variables
| style="vertical-align:bottom;" | Identify relationships,   patterns in the data
+
| rowspan="3" style="text-align:center; vertical-align:middle;" | Identify relationships, patterns in the   data
| style="text-decoration:underline; color:#0563C1;" | Contingency table
+
| style="text-decoration:underline; color:#0563C1;" | Contingency   table
| style="vertical-align:bottom; font-weight:bold; color:#06D;" | Grouped bar graph, Stacked bar graph, Component bar graph, Mosaic plot
+
| style="font-weight:bold; color:#0645AD;" | Grouped   bar graph, Stacked bar   graph, Component   bar graph, Mosaic plot
|-
+
|- style="vertical-align:bottom;"
| style="vertical-align:bottom;" | Two   numerical variables
+
| Two numerical variables
| style="text-decoration:underline; color:#0563C1;" | Contingency table (group   frequencies)
+
| style="text-decoration:underline; color:#0563C1;" | Contingency   table (group frequencies)
| style="vertical-align:bottom; font-weight:bold; color:#06D;" | Line graph (also   cumulative), Scatter plot (with or without regression line)
+
| style="font-weight:bold; color:#06D;" | Line   graph (also cumulative), Scatter plot (with or without regression line)
| style="vertical-align:bottom;" |
+
|- style="vertical-align:bottom;"
|-
+
| One categorical and one   numerical variable
| style="vertical-align:bottom;" | One   categorical and one numerical variable
+
| style="text-decoration:underline; color:#0563C1;" | Contingency   table, Table of descriptive statistics (mean, median, mode, etc)
| style="text-decoration:underline; color:#0563C1;" | Contingency table, Table of descriptive   statistics (mean, median, mode, etc)
+
| style="font-weight:bold; color:#06D;" | Scatter   plot, Box-and-whisker plot, Bar graph (showing mean or median with ± standard   deviation)
| style="vertical-align:bottom; font-weight:bold; color:#06D;" | Scatter plot, Box-and-whisker plot, Bar graph (showing mean or   median with ± standard deviation)
 
| style="vertical-align:bottom;" |
 
 
|}
 
|}
  
There are typical table formats for presenting results of cohort and case-control studies.  
+
There are typical table formats for presenting the results of cohort and case-control studies.  
  
Time series
+
==Time series==
 
Time series is a special case of describing two variables where the factor (X) variable is always the "time". Selecting a method of displaying time series data is based on certain conditions <Ref>U.S. Dept. of Health and Human Services - Centers for Disease Control and Prevention (CDC). Self-study course 3030-G. Principles of epidemiology. An introduction to applied epidemiology and biostatistics. 2nd ed.  p. 264 </ref>.
 
Time series is a special case of describing two variables where the factor (X) variable is always the "time". Selecting a method of displaying time series data is based on certain conditions <Ref>U.S. Dept. of Health and Human Services - Centers for Disease Control and Prevention (CDC). Self-study course 3030-G. Principles of epidemiology. An introduction to applied epidemiology and biostatistics. 2nd ed.  p. 264 </ref>.
  
Line 125: Line 124:
  
  
==FEM PAGE CONTRIBUTORS==
+
<div style="display: inline-block; width: 25%; vertical-align: top; border: 1px solid #000; background-color: #d7effc; padding: 10px; margin: 5px;">
 +
'''FEM PAGE CONTRIBUTORS 2007'''
 
;Editor
 
;Editor
 
:Agnes Hajdu
 
:Agnes Hajdu
Line 134: Line 134:
 
:Lisa Lazareck
 
:Lisa Lazareck
 
:Agnes Hajdu
 
:Agnes Hajdu
 
+
</div>
  
 
[[Category:Informing Action / Improving Knowledge]]
 
[[Category:Informing Action / Improving Knowledge]]

Latest revision as of 05:33, 14 April 2023

General recommendation

Before constructing any display of epidemiologic data, it is important first to determine the point to be conveyed. Are you highlighting a change from past patterns in the data? Are you showing a difference in incidence by geographic area or by some predetermined risk factor? What is the interpretation you want the reader to reach? Your answer to these questions will help to determine the choice of display [1].

As a general recommendation, use a table for precise numbers, for large amounts of numbers, and when there is a great range between the largest and smallest figures. Use a graph to show trends and relationships, to display changes over time, and to explain a point vividly. Use either a table or a graph for comparisons and for showing parts of a whole [2].

The choice between a table and a graph is often arbitrary, as both will work. Generally, when presenting a lot of data, e.g. in an annual report, it is best to vary the presentation by using both tables and graphs. A graph lets the reader visualise a relationship in the data, e.g. proportions in groups. The choice is also partly subjective: some readers find tables easier to interpret, while others prefer a visual representation.

If you decide that a graph is the best way to present your information, then no matter what type of graph you use, you need to keep in mind the following 10 tips:[3]

  1. convey an important message
  2. decide on a clear purpose
  3. draw attention to the message, not the source
  4. experiment with various options and graph styles
  5. use simple design for complex data
  6. make the data 'speak'
  7. adapt graph presentation to suit the target audience
  8. ensure that the visual perception process is easy and accurate
  9. avoid distortion and ambiguity
  10. optimize design and integrate style with text and tables

Describing one variable

The first step is to describe one variable, which is crucial before comparing two or more variables. The table below summarises the most common presentation formats for the different types of variables, from the "simplest" to the more "complex".

{| class="wikitable"
|-
! colspan="4" | Describing ONE variable
|-
! Variable !! Aims !! Table !! Graph
|-
| Binary / dichotomous
| Describe with proportions
| Frequency distribution table
| Bar graph, Pie graph
|-
| Nominal (categorical, not ordered)
| Describe with proportions
| Frequency distribution table
| Bar graph, Pie graph
|-
| Ordinal (categorical, ordered)
| Describe with proportions
| Frequency distribution table (also cumulative)
| Bar graph (also cumulative), Pie graph
|-
| Numerical discrete
| Describe with proportions, means and standard deviation
| Frequency distribution table (also cumulative), Table of descriptive statistics
| Bar graph (also cumulative), Histogram (if large number of values)
|-
| Numerical continuous
| Describe with means, medians, standard deviation, quartiles
| Frequency distribution table (group frequencies or cumulative), Table of descriptive statistics
| Histogram (also cumulative), Frequency polygon, Box-and-whisker plot, Violin plot, One-way scatter plot
|}
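
As an illustration of the frequency distribution table recommended above for categorical variables, the following is a minimal Python sketch. The function name `frequency_table` and the severity data are hypothetical and serve only to show how counts, proportions and cumulative proportions might be derived for an ordinal variable.

```python
from collections import Counter


def frequency_table(values, order):
    """Return rows of (category, count, proportion, cumulative proportion).

    `order` lists the categories in their natural (ordinal) order, so the
    cumulative column is meaningful.
    """
    counts = Counter(values)
    total = sum(counts[category] for category in order)
    if total == 0:
        raise ValueError("no observations in the listed categories")
    rows = []
    cumulative = 0
    for category in order:
        n = counts[category]
        cumulative += n
        rows.append((category, n, n / total, cumulative / total))
    return rows


# Hypothetical severity data for ten cases (illustrative only).
severity = ["mild", "mild", "moderate", "severe", "mild",
            "moderate", "mild", "severe", "moderate", "mild"]

table = frequency_table(severity, order=["mild", "moderate", "severe"])
for category, n, proportion, cumulative in table:
    print(f"{category:<10}{n:>3}{proportion:>8.0%}{cumulative:>8.0%}")
```

For a nominal variable the cumulative column has no natural meaning and would normally be dropped.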

Describing two variables together

There are potentially 5x5 = 25 combinations of the types of variables mentioned in the table above; there are many potential graphs and tables to describe these. The important thing is that you understand what you want to show in those tables or graphs. For describing two variables (X and Y) together, the strategy is basically the following:

First, consider one variable as the "outcome" (Y) and the other as the "factor" (X), i.e. the explanatory variable. Then describe the outcome (Y) within each group defined by the factor (X). Remember that the outcome is described according to its nature, as explained above (univariate description).

Below you will find a simple summary of describing two variables together.

{| class="wikitable"
|-
! colspan="4" | Describing TWO variables together
|-
! Variable !! Aims !! Table !! Graph
|-
| Two categorical variables
| rowspan="3" | Identify relationships, patterns in the data
| Contingency table
| Grouped bar graph, Stacked bar graph, Component bar graph, Mosaic plot
|-
| Two numerical variables
| Contingency table (group frequencies)
| Line graph (also cumulative), Scatter plot (with or without regression line)
|-
| One categorical and one numerical variable
| Contingency table, Table of descriptive statistics (mean, median, mode, etc)
| Scatter plot, Box-and-whisker plot, Bar graph (showing mean or median with ± standard deviation)
|}
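
The strategy described above, describing the outcome (Y) within each group of the factor (X), can be sketched for two categorical variables as a contingency table with row proportions. This is a minimal Python sketch; the function names and the exposure data are hypothetical and do not come from the cited sources.

```python
from collections import Counter


def contingency_table(records, factor_levels, outcome_levels):
    """Count (factor, outcome) pairs into a nested dict: table[x][y] = count."""
    counts = Counter(records)
    return {
        x: {y: counts[(x, y)] for y in outcome_levels}
        for x in factor_levels
    }


def row_proportions(table):
    """Describe the outcome within each factor group as proportions."""
    result = {}
    for x, row in table.items():
        total = sum(row.values())
        result[x] = {y: (n / total if total else 0.0) for y, n in row.items()}
    return result


# Hypothetical data: 100 people classified by exposure (X) and illness (Y).
records = (
    [("exposed", "ill")] * 30 + [("exposed", "not ill")] * 20
    + [("unexposed", "ill")] * 10 + [("unexposed", "not ill")] * 40
)

table = contingency_table(
    records,
    factor_levels=["exposed", "unexposed"],
    outcome_levels=["ill", "not ill"],
)
proportions = row_proportions(table)
for x, row in proportions.items():
    print(x, {y: f"{p:.0%}" for y, p in row.items()})
```

The row proportions are what a grouped or stacked bar graph of the same data would display.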

There are typical table formats for presenting the results of cohort and case-control studies.

Time series

A time series is a special case of describing two variables in which the factor (X) is always time. The method chosen to display time series data depends on certain conditions [4].

Time series data
Conditions Aims Table Graph
Numbers of cases (epidemic or secular trend) 1 or 2 sets Display frequency distribution, trends in numbers over time Frequency table Histogram
2 or more sets Frequency polygon
Rates Range of values ≤ 2 orders of magnitude Display trends in rates over time Arithmetic scale line graph
Range of values ≥ 2 orders of magnitude Display rate of change over time Semi-logarithmic scale line graph
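
The conditions in the table above can be read as a simple decision rule. The following Python sketch is one possible encoding, not a definitive one. The function names are hypothetical, and because the listed conditions overlap at their boundaries, two assumptions are made: exactly two data sets are shown as a histogram, and a range of exactly two orders of magnitude uses the arithmetic scale.

```python
import math


def orders_of_magnitude(values):
    """Return log10(max / min) over the positive values."""
    positive = [v for v in values if v > 0]
    if not positive:
        raise ValueError("at least one positive value is required")
    return math.log10(max(positive) / min(positive))


def choose_time_series_graph(kind, n_sets=1, rates=None):
    """Suggest a graph type for time series data.

    kind   -- "counts" (numbers of cases) or "rates"
    n_sets -- number of data sets to display (used for counts)
    rates  -- the rate values to plot (used for rates)
    """
    if kind == "counts":
        # Assumption: the boundary case of two sets is treated as a histogram.
        return "histogram" if n_sets <= 2 else "frequency polygon"
    if kind == "rates":
        if rates is None:
            raise ValueError("rates must be supplied when kind is 'rates'")
        # Assumption: a range of exactly two orders stays on the arithmetic scale.
        if orders_of_magnitude(rates) <= 2:
            return "arithmetic scale line graph"
        return "semi-logarithmic scale line graph"
    raise ValueError(f"unknown kind: {kind!r}")


# Hypothetical inputs (illustrative only).
print(choose_time_series_graph("counts", n_sets=1))
print(choose_time_series_graph("counts", n_sets=3))
print(choose_time_series_graph("rates", rates=[0.5, 5, 40]))
print(choose_time_series_graph("rates", rates=[0.1, 50, 900]))
```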

References

  1. U.S. Dept. of Health and Human Services - Centers for Disease Control and Prevention (CDC). Self-study course 3030-G. Principles of epidemiology. An introduction to applied epidemiology and biostatistics. 2nd ed.
  2. Bigwood S, Spore M. Presenting numbers, tables and charts. Oxford University Press, New York, 2003 p. 84
  3. Statistics Canada, Statistics: Power from data! - Summary
  4. U.S. Dept. of Health and Human Services - Centers for Disease Control and Prevention (CDC). Self-study course 3030-G. Principles of epidemiology. An introduction to applied epidemiology and biostatistics. 2nd ed. p. 264


FEM PAGE CONTRIBUTORS 2007

Editor
Agnes Hajdu
Original Author
Alain Moren
Contributors
Maarten Hoek
Lisa Lazareck
Agnes Hajdu
