In mid-2022 we set out to build a new tool for collecting, sharing and publishing data, one specifically aimed at African journalists.

The motivation grew out of our experience tracking Covid-19 data, first in South Africa and then across the continent.

For two years from March 2020 we collected Covid-19 data in South Africa. No official time-series datasets were made available, so anyone trying to track the evolution of Covid-19 had to collect the data themselves. Eventually global repositories like Our World in Data began to publish data, but they too had to collect it piecemeal and, because of their global focus, were unable to dig deeper into individual data sources.

We collected the data every day for two years from press statements, TV interviews and, later, weekly and monthly health reports.

We learned how much good reporting relies on good data, and how often that data has to be collected manually, in small increments, over time, until the dataset is big enough to deliver insight.

Initially we had a very small dataset, but after more than a year the advantages began to compound and we could start to write stories backed by a substantial body of data.

DataDesk is our attempt to turn this hard-won experience into a tool that we and others in the African media can use.

Core concepts

As data trainers on the continent we know the resource constraints most newsrooms face. Most can barely afford enough reporters and very few can afford the technology skills really needed to build useful datasets.

DataDesk is built on a number of core ideas:

1) Reliable datasets are a superpower for newsrooms and enable better and broader reporting

2) Data collection should not be the preserve of the “data team”; everyone in a newsroom should be able to collect and manage data that helps their reporting

3) Datasets usually yield greater insight into issues if they are combined with one another

4) Global databases exist but often need supplementing to be relevant for African audiences

5) Data should be shareable

That’s a lot to achieve, but we’ve been working on it for a while and already use many of the underlying ideas and technologies in our own work, so we believe these goals are achievable.

Over the coming months we’ll be writing up some of the lessons learned and ideas behind the project.