DataViz Makeover 2

A makeover of a data visualisation on the public's willingness to receive a Covid-19 vaccination. Data was obtained from the Imperial College London YouGov Covid 19 Behaviour Tracker Data Hub hosted on GitHub (https://github.com/YouGov-Data/covid-19-tracker), focusing on survey data collected in January 2021.

Selene Choong https://www.linkedin.com/in/selenechoong/
02-12-2021

1.0 Critiques of Existing Visualisation with Suggested Improvements

Clarity

| S/N | Comments | Suggested Improvement |
| --- | --- | --- |
| 1 | The order of the countries on the y-axis is not synchronised between the two charts, making it difficult for readers to match the corresponding country bars. However, sorting the chart “% of strongly agreed to vaccination” makes it easier for readers to compare the relative results across countries. | Align the country order so that the values for each country are shown on the same row. Countries can be sorted in decreasing % of respondents who selected “Strongly agree”. |
| 2 | The charts show the % of survey respondents who selected each response value. As the survey polled only a subset of the citizens within each country (i.e. a sample) instead of all citizens (i.e. the population), there is some uncertainty associated with the values shown. However, the existing chart does not depict any uncertainty, and readers might interpret it as a precise representation of each country’s true acceptance of the vaccine. | Explore using statistical values to reflect the uncertainty associated with the survey results. |
| 3 | The current chart title does not clearly describe the question asked, which covers only willingness to get the vaccine if it were available this week. The research also mainly used survey data from interviews conducted in January 2021, but this timeframe is not shown in the data visualisation. | Revise the chart title to better represent the question asked and to reflect the January 2021 timeframe. |
| 4 | Value labels for the left chart were kept as the original data labels, which give a textual label only for ratings 1 and 5 and leave ratings 2 to 4 as bare numbers. Readers may not find it intuitive what each number represents. | Recode the values to use textual labels instead (i.e. 1 as “Strongly agree”, 2 as “Agree”, 3 as “Neutral”, 4 as “Disagree” and 5 as “Strongly disagree”). |
| 5 | The x-axis scales of the two charts are not synchronised, so a bar on the right chart appears longer than a bar of the same value on the left chart. This might mislead readers. | Synchronise the axes for all charts, particularly those placed side by side, so that the scales are aligned. |

Aesthetics

| S/N | Comments | Suggested Improvement |
| --- | --- | --- |
| 6 | The current position of the legend makes it difficult for readers to match each value with its corresponding colour. Also, the legend title shows the variable name instead of a name related to the chart, which might confuse readers about whether the legend applies to the left chart. | Place the legend near the chart to allow easy matching of the values. If a legend title is required, rename it to something related to the chart. |
| 7 | While the colour coding used for the chart “Which country is more pro-vaccine” lets readers distinguish the different categories, it does not convey that the labels form an ordered rating scale. | Revise the colours so that “Strongly agree” and “Agree” share one colour in different gradations. Likewise, “Strongly disagree” and “Disagree” share another colour in different gradations. |
| 8 | The coordinated use of blue for “Strongly agree” across the two charts helps readers infer that the right chart shows a value related to the blue bar within the stacked chart on the left. | Coordinated use of colours can be adopted where applicable. |
| 9 | The x-axis labels are inconsistent across the two charts: the left chart shows whole numbers while the right chart shows values to 1 decimal place. | Standardise the number of decimal places across all axes. |
| 10 | Country labels are currently shown in full lowercase with hyphens in place of spaces. This might not be the best way to depict the country names. | Change the country names to proper case and replace hyphens with spaces. |

2.0 Ways to Improve Current Visualisation

Sketch of Proposed Design

3.0 Step-by-step Description on Preparation

3.1 Data Source

Data used for this visualisation can be downloaded from the Imperial College London YouGov Covid 19 Behaviour Tracker Data Hub. While the full data hosted on GitHub contains survey results for 30 countries, only the datasets for the following 14 countries were used: Australia, Canada, Denmark, Finland, France, Germany, Italy, Japan, Netherlands, Norway, Singapore, South Korea, Sweden and the United Kingdom.

3.2 Data Preparation

Data Inspection of the Data Files for 14 Countries

Concatenating Multiple Datasets in Tableau

Filtering Master Data to Keep Relevant Data

Data Cleaning for Extracted Data in Tableau

| S/N | Field Name | Formula | Purpose |
| --- | --- | --- | --- |
| 1 | Country | REPLACE(REPLACE([Table Name], ".csv", ""), "-", " ") | Remove the file extension and replace hyphens with spaces in the Table Name field, which contains the country labels |
| 2 | Country Format | TRIM(UPPER(LEFT(SPLIT([Country], " ", 1), 1)) + MID(SPLIT([Country], " ", 1), 2) + " " + UPPER(LEFT(SPLIT([Country], " ", 2), 1)) + MID(SPLIT([Country], " ", 2), 2)) | Proper case the country names, i.e. capitalise the first letter of each word |
| 3 | Age Band | IF [Age] <= 29 THEN "18 - 29" ELSEIF [Age] <= 39 THEN "30 - 39" ELSEIF [Age] <= 49 THEN "40 - 49" ELSEIF [Age] <= 59 THEN "50 - 59" ELSEIF [Age] <= 69 THEN "60 - 69" ELSEIF [Age] > 69 THEN "70 and above" END | Create age bands for comparison |
| 4 | Household Size | IF [Household Size] = 1 THEN "1 pax" ELSEIF [Household Size] <= 4 THEN "2 - 4 paxes" ELSEIF [Household Size] <= 7 THEN "5 - 7 paxes" ELSEIF [Household Size] = 8 THEN "8 or more paxes" ELSEIF [Household Size] = 9 THEN "Don't know" ELSEIF [Household Size] = 10 THEN "Prefer not to say" ELSE "Not answered" END | Create household size bands for comparison |
| 5 | Employment Status Format | IF [Employment Status 1] = "Yes" THEN "Full time employment" ELSEIF [Employment Status 2] = "Yes" THEN "Part time employment" ELSEIF [Employment Status 3] = "Yes" THEN "Full time student" ELSEIF [Employment Status 4] = "Yes" THEN "Retired" ELSEIF [Employment Status 5] = "Yes" THEN "Unemployed" ELSEIF [Employment Status 6] = "Yes" THEN "Not working" ELSEIF [Employment Status 7] = "Yes" THEN "Other" ELSE [Employment Status] END | Standardise the format and recode all employment statuses into 1 consolidated field |
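
For readers who want to check the cleaning logic outside Tableau, the Country and Age Band fields can be approximated in Python. This is a minimal sketch, not the Tableau implementation: the function names are hypothetical, and unlike the Country Format formula (which handles at most two words) it proper-cases any number of words.

```python
def country_from_table_name(table_name: str) -> str:
    """Drop the .csv extension, swap hyphens for spaces, capitalise each word."""
    name = table_name.replace(".csv", "").replace("-", " ")
    return " ".join(word.capitalize() for word in name.split())


def age_band(age: int) -> str:
    """Map an age in years to the bands used by the Age Band field."""
    upper_bounds = [
        (29, "18 - 29"),
        (39, "30 - 39"),
        (49, "40 - 49"),
        (59, "50 - 59"),
        (69, "60 - 69"),
    ]
    for upper, label in upper_bounds:
        if age <= upper:
            return label
    return "70 and above"


print(country_from_table_name("united-kingdom.csv"))  # United Kingdom
print(age_band(45))  # 40 - 49
```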

3.3 Creating Diverging Stacked Bar Chart

Pivoting Data into Required Format

Creating the Required Fields

| S/N | Field Name | Formula | Purpose |
| --- | --- | --- | --- |
| 1 | Number of Records | 1 | Create an artificial column of 1s so that records can be counted by summing |
| 2 | Total Count | TOTAL(SUM([Number of Records])) | Create a total variable as the base |
| 3 | Count Negative | IF [Question Response] = "5 – Strongly disagree" OR [Question Response] = "4" THEN 1 ELSEIF [Question Response] = "3" THEN 0.5 ELSE 0 END | Calibrate the diverging scale |
| 4 | Total Count Negative | TOTAL(SUM([Count Negative])) | Create a total negative variable for generating the starting point |
| 5 | Percentage | SUM([Number of Records]) / [Total Count] | Compute the percentages |
| 6 | Gantt Start | -[Total Count Negative] / [Total Count] | Create a starting point for the Gantt chart |
| 7 | Gantt Percentage | PREVIOUS_VALUE([Gantt Start]) + ZN(LOOKUP([Percentage], -1)) | Create the percent for the Gantt chart |
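
The arithmetic behind these fields can be sketched in Python to show where each bar segment starts. This is an illustrative approximation under assumptions: the function name and the segment order (most negative response first) are hypothetical, and the response list passed in the example is made up.

```python
from collections import Counter

# Assumed segment order, from the far left of the bar to the far right.
ORDER = ["5 – Strongly disagree", "4", "3", "2", "1 - Strongly agree"]
# Share of each response that sits left of zero (neutral is split in half).
NEGATIVE_WEIGHT = {"5 – Strongly disagree": 1.0, "4": 1.0, "3": 0.5}


def diverging_segments(responses):
    """Return (response, start, width) for each segment of one bar."""
    counts = Counter(responses)
    total = sum(counts[r] for r in ORDER)                # Total Count
    negative = sum(counts[r] * NEGATIVE_WEIGHT.get(r, 0.0) for r in ORDER)
    start = -negative / total                            # Gantt Start
    segments = []
    for response in ORDER:
        width = counts[response] / total                 # Percentage
        segments.append((response, start, width))
        start += width    # previous start + previous percentage
    return segments


# Made-up responses for a single country.
example = (["5 – Strongly disagree"] * 2 + ["4"] + ["3"] * 2
           + ["2"] + ["1 - Strongly agree"] * 4)
for response, start, width in diverging_segments(example):
    print(response, round(start, 3), round(width, 3))
```

Because half of the neutral responses are counted as negative, the neutral segment straddles zero, which is what makes the bars diverge around a common centre line.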

Generating the Chart

Creating a Dynamic Sorting Field

| S/N | Field Name | Formula | Purpose |
| --- | --- | --- | --- |
| 1 | Response Value | IF [Question Response] = "5 – Strongly disagree" THEN 5 ELSEIF [Question Response] = "4" THEN 4 ELSEIF [Question Response] = "3" THEN 3 ELSEIF [Question Response] = "2" THEN 2 ELSEIF [Question Response] = "1 - Strongly agree" THEN 1 END | Assign a numeric label to all responses for ease of computation |
| 2 | Percent Strongly Agree | SUM(IF [Response Value] = 1 THEN 1 ELSE 0 END) / SUM([Number of Records]) | Proportion of respondents who selected Strongly agree |
| 3 | Percent Agree | SUM(IF [Response Value] <= 2 THEN 1 ELSE 0 END) / SUM([Number of Records]) | Proportion of respondents who selected Strongly agree or Agree |
| 4 | Percent Neutral | SUM(IF [Response Value] = 3 THEN 1 ELSE 0 END) / SUM([Number of Records]) | Proportion of respondents who selected Neutral |
| 5 | Percent Disagree | SUM(IF [Response Value] >= 4 THEN 1 ELSE 0 END) / SUM([Number of Records]) | Proportion of respondents who selected Disagree or Strongly disagree |
| 6 | Percent Strongly Disagree | SUM(IF [Response Value] = 5 THEN 1 ELSE 0 END) / SUM([Number of Records]) | Proportion of respondents who selected Strongly disagree |
| 7 | Sort Order | IF [Sort Descending by] = 1 THEN [Percent Strongly Agree] ELSEIF [Sort Descending by] = 2 THEN [Percent Agree] ELSEIF [Sort Descending by] = 4 THEN [Percent Disagree] ELSEIF [Sort Descending by] = 5 THEN [Percent Strongly Disagree] ELSE [Percent Neutral] END | Extract the relevant field based on the selection in the parameter |
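
The same parameter-driven sort can be mimicked in Python. The sketch below is only an illustration: the function names, the country labels and the response values are all made up, and the parameter is assumed to take the values 1 to 5 as in the Sort Order formula.

```python
def sort_value(values, sort_descending_by):
    """Return the share used for sorting, mirroring the Sort Order field."""
    n = len(values)
    if sort_descending_by == 1:
        hits = sum(v == 1 for v in values)   # Percent Strongly Agree
    elif sort_descending_by == 2:
        hits = sum(v <= 2 for v in values)   # Percent Agree
    elif sort_descending_by == 4:
        hits = sum(v >= 4 for v in values)   # Percent Disagree
    elif sort_descending_by == 5:
        hits = sum(v == 5 for v in values)   # Percent Strongly Disagree
    else:
        hits = sum(v == 3 for v in values)   # Percent Neutral
    return hits / n


def sort_countries(data, sort_descending_by):
    """Order country names by the selected share, largest first."""
    return sorted(
        data,
        key=lambda country: sort_value(data[country], sort_descending_by),
        reverse=True,
    )


# Hypothetical Response Value lists, one per country.
sample = {
    "Country A": [1, 1, 2, 3, 5],
    "Country B": [1, 2, 2, 4, 5],
    "Country C": [1, 1, 1, 4, 4],
}
print(sort_countries(sample, 1))  # ['Country C', 'Country A', 'Country B']
```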

Customising the Tooltip

3.4 Creating a Dot Plot with Error Bars

Creating the Required Fields

| S/N | Field Name | Formula | Purpose |
| --- | --- | --- | --- |
| 1 | Prop_SE | SQRT(([Percent Strongly Agree]*(1-[Percent Strongly Agree]))/SUM([Number of Records])) | Compute the standard error of the proportion |
| 2 | Z_95% | 1.959964 | Z-value for 95% confidence interval (CI) |
| 3 | Z_99% | 2.575829 | Z-value for 99% CI |
| 4 | Prop_Margin of Error 95% | [Z_95%]*[Prop_SE] | Compute the margin of error for 95% CI |
| 5 | Prop_Margin of Error 99% | [Z_99%]*[Prop_SE] | Compute the margin of error for 99% CI |
| 6 | Prop_Lower Limit 95% | [Percent Strongly Agree] - [Prop_Margin of Error 95%] | Lower limit of 95% CI |
| 7 | Prop_Lower Limit 99% | [Percent Strongly Agree] - [Prop_Margin of Error 99%] | Lower limit of 99% CI |
| 8 | Prop_Upper Limit 95% | [Percent Strongly Agree] + [Prop_Margin of Error 95%] | Upper limit of 95% CI |
| 9 | Prop_Upper Limit 99% | [Percent Strongly Agree] + [Prop_Margin of Error 99%] | Upper limit of 99% CI |
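
As a cross-check on these fields, the interval for a proportion can be computed in a few lines of Python. This is a minimal sketch using the same normal-approximation formula; the function name is hypothetical and the counts in the example (400 of 1,000 respondents) are made up.

```python
import math

Z_95 = 1.959964  # z-value for a 95% confidence interval
Z_99 = 2.575829  # z-value for a 99% confidence interval


def proportion_ci(successes, n, z):
    """Return (lower, upper) limits of the CI for a sample proportion."""
    p = successes / n
    standard_error = math.sqrt(p * (1 - p) / n)   # Prop_SE
    margin = z * standard_error                   # Prop_Margin of Error
    return p - margin, p + margin


lower_95, upper_95 = proportion_ci(400, 1000, Z_95)
lower_99, upper_99 = proportion_ci(400, 1000, Z_99)
print(round(lower_95, 4), round(upper_95, 4))
print(round(lower_99, 4), round(upper_99, 4))
```

The 99% interval is always wider than the 95% interval for the same data, which is why the two sets of error bars in the dot plot differ in length.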

Generating the Chart

3.5 Adding Dynamic Parameters

3.6 Creating Similar Charts for Other Fields

Duplicating the Diverging Stacked Bar Chart

| Member | Value (Alias) |
| --- | --- |
| Vac2_1 | I am worried about getting Covid-19 |
| Vac2_2 | I am worried about potential side effects of a Covid-19 vaccine |
| Vac2_3 | I believe government health authorities in my country will provide me with an effective Covid-19 vaccine |
| Vac2_6 | If I do not get a Covid-19 vaccine when it is available, I will regret it |

Duplicating the Dot Plot with Error Bars

3.7 Dashboard

4.0 Final Data Visualisation Output

Link to Tableau Dashboard: https://public.tableau.com/profile/selenechoong#!/vizhome/ISSS608_DataViz_Makeover_02/Dashboard

5.0 Major Observations

Observation 1:
A higher proportion of respondents expressed strong willingness to receive the Covid-19 vaccination when it is offered within the next year than when it is offered this week.

Observation 2:
Willingness to receive the Covid-19 vaccination increases with age. A higher proportion of older respondents agreed that they would definitely get the vaccine if it were made available, compared with younger respondents. This holds for both timeframes: availability this week and a year from now.

Observation 3:
While Japan has the highest proportion of respondents among the 14 countries who indicated that they are worried about getting Covid-19, it has the lowest proportion of respondents who agreed that they would get the Covid-19 vaccination if it were made available to them this week or even a year from now.

Observation 4:
The higher willingness of UK and Denmark respondents to receive the Covid-19 vaccination may be partly explained by their perceptions of the vaccination. A similarly higher proportion of respondents from these countries believed that their government health authorities would provide them with an effective Covid-19 vaccine, and a lower proportion were worried about the potential side effects of the vaccine.