One of the most common complaints from journalists who analyze data is that the tables they receive from entities or organizations arrive “dirty” or in unsuitable formats. The team at the Open Knowledge Foundation, a global nonprofit network that promotes open content and open data, listened to these problems and created a solution: Open Data Editor (ODE), a free, open-source tool designed to detect errors in data sets.
“When I joined the team, one of the things we decided was not to write a single line of code until we talked to the people who actually work with data: activists, journalists, NGOs,” Romina Colman, product owner of ODE, told LatAm Journalism Review (LJR). “People told us again and again that they spent too much time looking at tables because none of the data came clean. In other words, they lost time exploring the data to detect errors, then cleaning, and finally telling stories.”
To use ODE, users must download the application to their devices, which can run macOS, Windows or Ubuntu.
The app allows users to upload tables as Excel or CSV (comma-separated values) files, or through a link to a Google Sheets document. After the data is uploaded, ODE automatically generates a report of the errors it finds.
“ODE tells the user what problems the tables have. They can be duplicate names in columns, rows completely empty of data, or formatting issues. For example, a column of dates with one cell containing a link,” Colman said.
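The three kinds of problems Colman describes can be illustrated with a short Python sketch. This is not ODE's code or the way ODE performs its checks. It is a minimal illustration that uses only Python's standard library. The function name `find_table_problems`, the sample table and the assumption that dates are written as YYYY-MM-DD are all hypothetical and were chosen for this example.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample table: a repeated column name, a blank row,
# and a link sitting in a column that should hold dates.
SAMPLE = """name,date,name
Ana,2024-10-01,A.
,,
Luis,https://example.org/report,L.
"""


def find_table_problems(csv_text, date_columns=()):
    """Return (row_number, message) pairs for a few common table errors.

    Row numbers are 1-based and count the header as row 1.
    Assumption: valid dates use the YYYY-MM-DD format.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return [(0, "file is empty")]
    header, body = rows[0], rows[1:]
    problems = []

    # 1. Duplicate column names in the header row.
    counts = Counter(name.strip() for name in header)
    for name, count in counts.items():
        if count > 1:
            problems.append((1, f"duplicate column name: {name!r}"))

    # Positions of the columns the caller says should contain dates.
    date_indexes = [i for i, name in enumerate(header)
                    if name.strip() in date_columns]

    for row_number, row in enumerate(body, start=2):
        # 2. Rows in which every cell is empty.
        if all(cell.strip() == "" for cell in row):
            problems.append((row_number, "blank row"))
            continue
        # 3. Non-empty cells in a date column that do not parse as a date.
        for i in date_indexes:
            value = row[i].strip() if i < len(row) else ""
            if value == "":
                continue
            try:
                datetime.strptime(value, "%Y-%m-%d")
            except ValueError:
                problems.append(
                    (row_number, f"not a date in {header[i]!r}: {value!r}")
                )
    return problems


for row_number, message in find_table_problems(SAMPLE, date_columns=("date",)):
    print(f"row {row_number}: {message}")
```

Run on the sample table, the sketch reports the repeated “name” column, the blank third row and the link in the date column. It mirrors the kind of report the article describes, though a real validator covers many more cases.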
ODE is available in English, Spanish, French and Portuguese.
For Colman, one of ODE’s main values is that there’s no need to understand technical language. Since the launch of its pilot in October 2024, the team has been improving the tool thanks to feedback from organizations, media outlets and journalists who have been integrating ODE into their workflows.
One of those outlets is Data Crítica in Mexico, which investigates issues of gender, climate and anticolonial struggles in Latin America.
Gibran Mena, its founder and director, told LJR that the team has been testing ODE to clean its databases and update its investigations into land use and environmental rights.
“The tool has a lot of potential, particularly in its artificial intelligence component, to become a good assistant in data cleaning for journalists,” Mena said. “ODE does a great job of coloring in red the spaces where values are missing and guiding journalists through the process of cleaning their own databases.”
Other Latin American organizations, such as the Civil Association for Equality and Justice (ACIJ for its initials in Spanish), which works to defend rights and strengthen democracy in Argentina and maintains an active relationship with the media there, have also used ODE in their processes.
“We decided to use ODE because we found a simple, lightweight and very powerful tool that helps us work better with complex data and produce reliable information for public debate,” Eduardo Ferreyra, co-director of ACIJ, told LJR. “ODE gave us exactly that: an agile way to detect errors, navigate databases and standardize processes, saving time and improving the quality of our analysis.”
According to Ferreyra, a clear example of ODE’s impact on their processes was the Permanent Household Survey (EPH), which brings together more than two decades of quarterly data with more than 200 columns and variables that change names depending on the year. Before ODE, processing that information required weeks of manual work and carried a high risk of errors for the team.
Open Data Editor relies on the Frictionless Framework, a set of standards and utilities designed to make tabular data easier to handle. Thanks to this foundation, the application can review the structure of files, flag common errors and facilitate corrections without requiring the user to do any programming, Lucas Petri, head of communications at Open Knowledge, told LJR.
Because ODE is open source, its development does not depend solely on a closed team but can be enriched by contributions from an international community of developers. This allows it to evolve collaboratively.
The tool also has a button that allows users to employ artificial intelligence in data processing.
“For example, AI can suggest better names for your tables or your columns,” Colman said.
But it’s not an integration with ChatGPT. ODE uses local AI models, so data is not sent to external services, protecting users’ privacy.
“The fact that ODE works locally, without depending on a permanent internet connection or cloud services, gives us additional guarantees of privacy and security when working with sensitive data, something crucial for an organization that handles information of a social and legal nature,” Ferreyra said.
The Open Knowledge Foundation has created an educational program around ODE. In partnership with the School of Data organization, it has published courses available in English and Spanish.
It has also held in-person workshops aimed not only at journalists but also at activists and government officials. Omar Luna, communications leader for School of Data LATAM, has led these workshops in Mexico and Bolivia.
“It is extremely important to see how efforts can be channeled between civil society, journalists, researchers, as well as those of us who work in data and civic technology, to raise awareness among public officials and strengthen data quality processes,” Luna told LJR.
Mena, who is part of the group of innovators testing ODE, has also introduced the tool in data journalism workshops in Germany and Argentina. In Buenos Aires, he worked with a group of more than 45 journalists from outlets such as La Nación, El Diario AR, Agencia Télam, Salta 12, TV Pública, BigBang, El Destape, Radio Nacional, Diario Castellanos, Diario Digital, Diario Huarpe, Diario de Cuyo and Futurock.
In addition, Open Knowledge has developed a “train the trainers” program aimed at preparing people to teach the material in their own communities and local contexts. As part of this initiative, pilot projects have been launched in different sectors, mainly to facilitate access to basic knowledge of quality data analysis, ensuring that economic or technological limitations do not become obstacles.