Using ‘Random Forest’ To Gain New Insights On Public Health
A UC-led public health study with city-wide participation generated a mountain of raw data, and researchers turned to machine learning to analyse it all.
Healthy lifestyles are central to our wellbeing and it’s no secret that wider societal factors can play a big role in how we live, work, eat and exercise. Less clear is how things like the weather or location of leisure facilities may influence people’s activity decisions.
A recent international study led by Dr Matthew Hobbs, Senior Lecturer and Co-Director of the GeoHealth Laboratory in the Geospatial Research Institute at Te Whare Wānanga o Waitaha | University of Canterbury (UC), sought to throw some light on these underlying factors. Joining him in the endeavour were UK researchers from the University of Essex, the University of Derby and Leeds Beckett University, as well as the West Yorkshire & Harrogate Cancer Alliance.
“I originally completed my PhD in Leeds and have strong connections there,” explains Dr Hobbs, whose study platform was a community-based public health intervention called Leeds Let’s Get Active (LLGA) that gave residents free access to Leeds City Council pools and gyms on specified days and times.
More than 19,000 participants were recruited over a year, clocking up 185,245 leisure centre visits. Sifting through all this data was a mammoth task.
“University of Canterbury colleague Professor Elena Moltchanova applied a machine learning technique – Random Forest – to analyse the effects of weather and demographic factors on leisure centre attendance. Random Forests can deal quickly and efficiently with large datasets and can provide exciting insights into effects of different variables on the outcome of interest,” says Dr Hobbs.
Just like real forests, Random Forests have trees too – in this case decision trees, each acting like a flowchart that asks a question, which leads to further questions until an answer is reached. The forest then combines the answers from its many trees into a single prediction.
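For readers curious about the mechanics, the flowchart idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code or the exact method used in the study: the function names are hypothetical, and the toy rows (distance in km, age in years, and whether the person came back) are invented purely to give the trees something to learn from. Each tree is grown on a random resample of the rows and considers a random subset of the variables at each question; the forest then takes a majority vote.

```python
import random
from collections import Counter

# Invented toy data (NOT study data): (distance_km, age_years) -> returned?
ROWS = [(0.5, 23), (1.2, 67), (2.0, 45), (2.8, 31), (3.5, 72), (4.0, 19),
        (4.5, 58), (1.8, 38), (6.0, 25), (7.5, 64), (8.0, 47), (9.2, 33),
        (10.5, 70), (12.0, 21), (6.8, 55), (11.0, 40)]
LABELS = [True] * 8 + [False] * 8


def gini(labels):
    """Gini impurity: 0 when every label in the group is the same."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, features):
    """Pick the yes/no question 'is feature f <= t?' that best separates labels."""
    best = None  # (weighted impurity, feature index, threshold)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # this question does not divide the group
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(rows, labels, rng, depth=0, max_depth=4):
    """Grow one decision tree; a leaf is a label, a node is (f, t, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    n_features = len(rows[0])
    # Each node only looks at a random subset of the variables.
    features = rng.sample(range(n_features), max(1, int(n_features ** 0.5)))
    split = best_split(rows, labels, features)
    if split is None:
        return majority
    _, f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       rng, depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       rng, depth + 1, max_depth))


def predict_tree(node, row):
    """Follow the flowchart from the top question down to an answer."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def build_forest(rows, labels, n_trees=25, seed=0):
    """Grow each tree on a bootstrap resample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng))
    return forest


def predict_forest(forest, row):
    """Majority vote across all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]


forest = build_forest(ROWS, LABELS)
print(predict_forest(forest, (0.4, 50)), predict_forest(forest, (15.0, 50)))
```

In practice, analysts would normally reach for an established library rather than hand-rolled code like this; the sketch is only meant to show how many simple flowcharts, each built from slightly different data, can be pooled into one prediction.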
The study results, published in Preventive Medicine, showed that the most influential determinant in predicting a return visit to a leisure centre was distance, with return visits tailing off if the centre was more than 5km away. The closer the centre, the more visits – and this was particularly evident for older women. Overall, older people had more return visits than younger people. The calendar month proved the most significant predictor of attendance, with summer being more popular than winter.
“We demonstrate how geography and place have a significant impact on people’s physical activity levels,” says Dr Hobbs.
Following this study, a Healthy Location Index for New Zealand was developed that maps features of where we live that influence public health. The index is being updated this year. “We are currently investigating how the environment around us impacts on our mental health,” he adds.
Dr Matthew Hobbs is the lead author on the research paper titled “Investigating the environmental, behavioural, and sociodemographic determinants of attendance at a city-wide public health physical activity intervention: Longitudinal evidence over one year from 185,245 visits”.