Peruvian investigative site Ojo Público develops algorithm to track possible acts of corruption

An anti-corruption algorithm developed by the Peruvian investigative journalism site Ojo Público found that 40 percent of the public contracts awarded in Peru between 2015 and 2018 carry a risk of corruption.

Image: Ojo Público. (Screenshot)

For a year and a half, starting at the beginning of 2018, Ojo Público extracted information from public databases on contracts awarded by the Peruvian State in order to investigate possible corruption risk scenarios and identify the political and financial connections behind them.

This happened thanks to "Funes," an algorithm the site developed based on a Hungarian theoretical model, and which it named in honor of the character of extraordinary memory created by the Argentine writer Jorge Luis Borges in his story "Funes el memorioso" (Funes the Memorious).

Nelly Luna Amancio, project director and co-founder of Ojo Público, said in an interview with the Knight Center that when they began thinking about the project, the idea was to identify the kinds of assets around which there could be a potential risk of corruption. “We realized that corruption was too gigantic. Just in Peru there is a huge amount of titles linked to corruption. So what we did, then, was to narrow the field a little more until we were left with corruption in public contracts,” she said.

When the team started working on the project and looking for what investigations had been done in Latin America regarding corruption issues using artificial intelligence tools, big data, algorithms and machine learning, they realized that there wasn’t much.

"What has been advanced [in the region] is above all to identify red flags, that is, if there is a single competitor in a process, that is risky," Luna said. “But not in combining variables to give you something more automated. And that’s how we found Mihaly Fazekas.”

Fazekas is a Hungarian researcher and economist who is an expert in government contracts and works in programming and statistics, according to Luna. He is the author of the model on which Funes is based. In his model, Fazekas defines corruption risks at each step of public contracting through indicators, the journalist said.

Ernesto Cabral, one of the investigative journalists of Ojo Público who has used the anti-corruption algorithm in his reports, told the Knight Center that they had to make several adjustments to Fazekas’ model as it was based on the European reality.

In Europe, they have "much broader access to public contracting data," Cabral said. In order to adapt the algorithm to the Latin American and Peruvian reality, they had to prioritize other indicators to be able to identify possible patterns of corruption. "For example, here we have put a lot of weight on contributions in political campaigns."

"If there is a link between the politician and the person of the municipality or the government that he is going to hire, if there is a friendly or political bond, that is a pattern," Luna said. "On the way we had to adjust the model a lot because it does not understand the political context, it does not understand the political situation or the patterns," she added.
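One way to picture how indicators can be combined, with some weighted more heavily than others, is a simple weighted score. The sketch below is only an illustration: the indicator names and the numeric weights are assumptions, since the article does not give the values Funes uses.

```python
# Hypothetical indicator weights; the article gives no numeric values.
WEIGHTS = {
    "single_bidder": 1.0,
    "recently_created_supplier": 1.0,
    "campaign_contributor": 2.0,  # weighted more heavily, as Cabral describes
}


def risk_score(flags: dict, weights: dict = WEIGHTS) -> float:
    """Return a score from 0 to 1: the weighted share of indicators raised."""
    total = sum(weights.values())
    raised = sum(w for name, w in weights.items() if flags.get(name, False))
    return raised / total


# A made-up contract: one bidder, an established supplier, and a
# supplier who contributed to a political campaign.
contract_flags = {
    "single_bidder": True,
    "recently_created_supplier": False,
    "campaign_contributor": True,
}
score = risk_score(contract_flags)
print(score)
```

In this toy setup, a campaign contribution alone moves the score as much as the other two indicators together, which is the kind of prioritization the team describes.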

According to Funes' findings, between 2015 and 2018 Peru awarded 110 thousand public contracts to single bidders that faced no competition and to companies created shortly before the bids were made, for a total of S/. 57 billion (about US $16.8 billion), Ojo Público reported.
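The two patterns mentioned here, a single bidder and a supplier created shortly before the bid, can be expressed as simple checks on each contract record. This is a minimal sketch under stated assumptions: the field names, the sample contracts and the 180-day threshold for "shortly before" are all hypothetical, not taken from Funes.

```python
from datetime import date

# Hypothetical threshold: the article says "shortly before" without a number.
RECENT_DAYS = 180


def red_flags(contract: dict) -> list:
    """Return the names of the red flags raised by one contract record."""
    flags = []
    if contract["num_bidders"] == 1:
        flags.append("single_bidder")
    age_days = (contract["bid_date"] - contract["supplier_founded"]).days
    if 0 <= age_days <= RECENT_DAYS:
        flags.append("recently_created_supplier")
    return flags


# Made-up records for illustration only.
contracts = [
    {"id": "A", "num_bidders": 1, "bid_date": date(2017, 6, 1),
     "supplier_founded": date(2017, 3, 1), "amount": 1000.0},
    {"id": "B", "num_bidders": 4, "bid_date": date(2017, 6, 1),
     "supplier_founded": date(2005, 1, 1), "amount": 2500.0},
    {"id": "C", "num_bidders": 1, "bid_date": date(2016, 2, 1),
     "supplier_founded": date(2010, 5, 1), "amount": 400.0},
]

# Keep only contracts that raise at least one flag, and total their amounts.
flagged = {c["id"]: red_flags(c) for c in contracts if red_flags(c)}
flagged_total = sum(c["amount"] for c in contracts if red_flags(c))
print(flagged, flagged_total)
```

Counting the flagged records and adding up their amounts is how figures such as a number of contracts and a total in soles could be derived from a database of this kind.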

Image: Ojo Público. (Screenshot)

Funes analyzed a total of 52 GB of information, covering 245 thousand public contracts at the municipal, regional and national levels. It drew on databases from institutions such as the State Contracting Supervisory Body (OSCE), Infogob and Sunat, as well as business and political records.

Journalistic interpretation has been the determining factor in the reading and analysis of data extracted and selected by Funes. The journalistic team has "collaborated both in the analysis of the information, [as] in the verification of the data and in the field work to carry out the investigations based on this database," Cabral said.

Cabral also stressed that the journalistic investigations that saw the light thanks to Funes not only have a group of journalists behind them, but also experts in statistics and computer scientists who are in charge of the technical part. They perform the "scraping" and massive download of the data.

The statistical model that has been applied in Funes, Luna said, is linear regression. "What this linear regression does is combine all the data and tell you where the anomalies are, which of all behave differently from the others."
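As a rough illustration of how a linear regression can point to anomalies, the sketch below fits a straight line to made-up numbers and flags the points that sit far from it. The variables, the data and the two-standard-deviation cutoff are assumptions chosen for the example. The article does not describe how Funes specifies its regression.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def anomalies(xs, ys, z_threshold=2.0):
    """Return indexes of points whose residual exceeds the threshold.

    The threshold is expressed in standard deviations of the residuals
    (a hypothetical cutoff, not one reported for Funes).
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    spread = pstdev(residuals)
    if spread == 0:
        return []  # every point lies exactly on the line
    return [i for i, r in enumerate(residuals) if abs(r) > z_threshold * spread]


# Made-up data: y follows 2 * x except for one record that behaves differently.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [2, 4, 6, 8, 30, 12, 14, 16, 18]
outliers = anomalies(xs, ys)
print(outliers)
```

The record at index 4 is the only one far from the fitted line, so it is the one a journalist would then check by hand.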

Funes' development, according to Luna, was funded by the Latin American Alliance for Civic Technology (Altec). Ojo Público’s Funes was one of the 20 initiatives selected from eight countries.

The first reports from Ojo Público that used the Funes tool, published between August and October 2019, were “La leche prometida: los millonarios contratos del Grupo NIISA” (The promised milk: the million-dollar contracts of the NIISA Group), which was a transnational report, and “Telefónica del Perú pone en riesgo la privacidad de sus usuarios” (Telefónica of Peru puts privacy of its users at risk).

The media outlets that belong to the Latin American Network of Journalists for Transparency and Anti-Corruption (Red PALTA) participated in "The Promised Milk." The network is formed and promoted by Ojo Público, La Diaria of Uruguay, El Faro of El Salvador, Datasketch of Colombia, La Nación of Argentina, PODER of México and OjoConMiPisto of Guatemala.

"We have developed an internal tool by which we are now coordinating with some local media in Peru to form a mixed team, because there is too much information," Luna said. "The idea is that joint investigations can be formed," both nationally and with international media, mainly with those of the PALTA Network, she said.