Challenge 3 projects

From Epidemium

Challenge 3 (FR): read this page in French.

Challenge 3:
Prediction of cancer mortality in developing countries in time and space

October 2017 - March 2018

This Challenge aims to match cancer data with population factors (from developing countries, excluding Africa) that are thought to induce or protect against cancer, in order to improve cancer models that have rarely been explored in these regions of the world.
The focus will be on the most prevalent cancers. According to GLOBOCAN 2012, the three most prevalent cancers are lung cancer (1.8 million cases, 13.0% of all cancers), breast cancer (1.7 million cases, 11.9% of the total) and colorectal cancer (1.4 million cases, 9.7% of the total). These are aggregate figures, so cancer incidence may differ considerably among developing countries. An approach by continent, and possibly by sub-continent, would be appreciated.

With growth in developing countries (excluding Africa), cancer has become one of the major causes of mortality, overtaking the infectious diseases that used to be the leading cause of death on those continents. Knowing more about cancer and its root causes, and projecting its evolution in time and space, is therefore a decisive issue for both medical research and public health.

Given the particular socio-economic contexts and development models of southern countries, cancer epidemiology undoubtedly has specific components that depend on the region of the world in which it is expressed. To date, improving medical knowledge in this area remains a major challenge. Although cancer epidemiology is widely investigated in northern countries, it is still a largely uncharted scientific field in the southern regions. Moreover, the approach to the disease in these regions is largely borrowed from the existing model, adjusted by a North-South gradient.


Participants will build their analysis on three datasets:

  • An epidemiology_dataset file of epidemiological data, with three sub-folders, one per database collected by Epidemium (WorldBank, Faostat, Ilostat). For Challenge 3, you can restrict yourself to the WorldBank database.
  • An incidence_dataset file of cancer incidence data by type of cancer. The data come from the WHO.
  • A mortality_dataset file of mortality data by type of cancer. The data come from the WHO.
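
As one possible starting point for matching cancer data with population factors, the sketch below joins mortality records with a single World Bank-style indicator on country and year. The field names (country, year, deaths, value), the indicator name and the toy rows are all assumptions made for illustration; the actual dataset files may be laid out differently.

```python
# Hypothetical rows: the real dataset files may use other field names.
mortality_rows = [
    {"country": "A", "year": 2000, "cancer": "lung", "deaths": 120},
    {"country": "A", "year": 2001, "cancer": "lung", "deaths": 130},
    {"country": "B", "year": 2000, "cancer": "lung", "deaths": 80},
]
indicator_rows = [
    {"country": "A", "year": 2000, "indicator": "urban_pop_pct", "value": 40.0},
    {"country": "A", "year": 2001, "indicator": "urban_pop_pct", "value": 41.5},
    {"country": "C", "year": 2000, "indicator": "urban_pop_pct", "value": 55.0},
]


def join_on_country_year(mortality, indicators):
    """Inner-join mortality rows with one indicator on (country, year).

    Assumes `indicators` holds a single indicator, so each
    (country, year) pair maps to at most one value.
    """
    lookup = {(r["country"], r["year"]): r["value"] for r in indicators}
    joined = []
    for row in mortality:
        key = (row["country"], row["year"])
        if key in lookup:  # keep only rows present in both tables
            joined.append({**row, "indicator_value": lookup[key]})
    return joined


matched = join_on_country_year(mortality_rows, indicator_rows)
```

Rows without a counterpart in the other table (countries B and C here) are dropped, which is a deliberate simplification; a real analysis would need to decide how to handle missing country-years.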

Areas of technology
  • Statistics, Machine Learning, Big Data, Time Series
  • Python, R and other languages and software, depending on the approach adopted (the "forecast" package, TensorFlow if neural networks are used, etc.)
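
To illustrate the time-series side, here is a minimal sketch that fits a linear trend to a yearly mortality series by ordinary least squares and extrapolates it to later years. The numbers are made up, and a linear trend is only a baseline assumption, not a method prescribed by the Challenge; dedicated forecasting tools would normally replace it.

```python
def fit_linear_trend(years, values):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def project(years, values, future_years):
    """Extrapolate the fitted linear trend to the given future years."""
    slope, intercept = fit_linear_trend(years, values)
    return [slope * year + intercept for year in future_years]


# Hypothetical mortality rates (per 100,000) for one country and cancer type.
years = [2010, 2011, 2012, 2013]
rates = [10.0, 12.0, 14.0, 16.0]
forecast = project(years, rates, [2014, 2015])
```

A baseline like this is mainly useful as a reference point against which richer models (seasonal, multivariate or neural) can be compared.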

The registered projects:

No projects yet, but coming soon!