Bringing Understanding to the COVID-19 Outbreak in Ontario

Student Capstone Project

After emerging in December 2019, the coronavirus disease (COVID-19) quickly evolved into a global pandemic.

By May 2020, COVID-19 had spread worldwide to more than 180 countries, with more than 3 million confirmed cases and over 230,000 confirmed deaths. Across Canada, there were 60,000 confirmed cases, with Ontario alone accounting for more than 19,000 cases and 1,400 deaths.

As the pandemic continued to evolve in Canada, seniors and people with underlying health conditions were found to be at high risk. By one estimate, close to 80 per cent of deaths in Ontario occurred among residents of long-term care (LTC) homes. In response, students from the University of British Columbia’s Master of Data Science Okanagan (MDS-O) program worked with Statistics Canada to integrate a number of open source databases and provide insight into the COVID-19 outbreak in Ontario.

With a focus on the spread of COVID-19 among LTC homes and the relationship between various measures of proximity and COVID-19 disease rates, UBCO students analyzed health and accessibility factors associated with seniors across different Public Health Unit (PHU) regions of Ontario. Over the course of the capstone project, students also explored and modeled LTC home characteristics and quality indicators associated with COVID-19 outbreaks.

The team scoured Statistics Canada and Government of Ontario websites for data, using Python to scrape, wrangle and clean what they found. Students performed statistical analysis, generated maps, and used JavaScript and D3 to implement visualizations that are now hosted on GitHub Pages.

Using clustering methods, principal component analysis and principal component regression, the team of MDS-O students helped to show that Ontario PHU regions with higher measures of proximity to amenities such as transit, healthcare and employment had different rates of COVID-19 cases. Amenity-richness also appeared to be an influential factor in the proportion of COVID-19 cases. The team’s work also highlighted several health-related factors associated with COVID-19 proportions within a PHU, such as physical activity and having a regular healthcare provider.
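For readers unfamiliar with principal component regression, the idea can be sketched in a few lines of Python. This is an illustration only, not the team's code: the data are synthetic, the three "proximity" columns and the case-rate outcome are hypothetical stand-ins, and only the first principal component is kept. To stay free of dependencies, the sketch finds that component by power iteration rather than with a library routine.

```python
import math
import random


def standardize(rows):
    """Scale each column to mean 0 and (population) standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def first_component(z, iterations=200):
    """Leading eigenvector of the covariance matrix, via power iteration."""
    n, p = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(p)] for i in range(p)]
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def pcr_fit(rows, y):
    """Regress y on the scores of the first principal component."""
    z = standardize(rows)
    v = first_component(z)
    scores = [sum(a * b for a, b in zip(r, v)) for r in z]
    intercept = sum(y) / len(y)  # scores are centred, so the intercept is mean(y)
    slope = (sum(s * (t - intercept) for s, t in zip(scores, y))
             / sum(s * s for s in scores))
    return v, slope, intercept, scores


# Synthetic data (assumption): 30 hypothetical regions whose three proximity
# measures (transit, healthcare, employment) share one latent "amenity" factor.
random.seed(0)
latent = [random.gauss(0, 1) for _ in range(30)]
X = [[a + random.gauss(0, 0.3) for _ in range(3)] for a in latent]
case_rate = [5 + 2 * a + random.gauss(0, 0.5) for a in latent]  # made-up outcome

v, slope, intercept, scores = pcr_fit(X, case_rate)

# Share of the outcome's variance explained by the one-component model.
predicted = [intercept + slope * s for s in scores]
ss_res = sum((t - q) ** 2 for t, q in zip(case_rate, predicted))
ss_tot = sum((t - intercept) ** 2 for t in case_rate)
r_squared = 1 - ss_res / ss_tot

print("component loadings:", [round(x, 3) for x in v])
print("slope:", round(slope, 3), "R^2:", round(r_squared, 3))
```

Collapsing correlated proximity measures into a single component before regressing is what lets this kind of model cope with predictors that move together. In practice a library such as scikit-learn would typically handle the decomposition and regression, and more than one component might be retained.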

In addition to providing greater insight into the spread of COVID-19 among LTC homes and the relationship between various measures of proximity and COVID-19 disease rates, this project showcased how leveraging various open data sources can produce comprehensive and meaningful results.
