Frederik Obermaier, Bastian Obermayer, Vanessa Wormer and Wolfgang Jaschensky, About the Panama Papers (Süddeutsche Zeitung), here. A great report (with links) from the newspaper and reporters that originally obtained the data.
Coming Soon: ICIJ to Release Panama Papers Offshore Companies Data (ICIJ 4/26/16), here.
The International Consortium of Investigative Journalists will release on May 9 a searchable database with information on more than 200,000 offshore entities that are part of the Panama Papers investigation.
The database will likely be the largest ever release of secret offshore companies and the people behind them.
* * * *
While the database opens up a world that has never been revealed on such a massive scale, the application will not be a “data dump” of the original documents – it will be a careful release of basic corporate information.
ICIJ won’t release personal data en masse; the database will not include records of bank accounts and financial transactions, emails and other correspondence, passports and telephone numbers. The selected and limited information is being published in the public interest.
Meanwhile ICIJ, the German newspaper Süddeutsche Zeitung which received the leak, and other global media partners, including several new outlets in countries where ICIJ has not been able to report, will continue to investigate and publish stories in the weeks and months to come.
Meta S. Brown, Why Panama Papers Journalists Use Graph Databases (Forbes 4/30/16), here.
All large organizations, and a whole lot of small ones, use databases. These tools keep names, dates, numbers and other tidbits of information neatly in order, organized into tables by common structure and function, each one laid out in columns and rows that define a single proper place for every little fact.
These same organizations also possess a lot of complex data, such as contracts, email, photographs and many other forms of information that just can’t be neatly organized into uniform columns and rows. Since this stuff is hard to organize, it often remains unorganized, making it hard to find information when it’s needed.
When investigative journalists set out to understand the implications of the Panama Papers, an enormous set of documents leaked from the Panama-based law firm Mossack Fonseca, organization was a primary issue. The leak presented them with a wealth of information, millions of documents, but no guide to structure. Organizing the information, making it searchable, identifying connections among the documents and the people, companies and other facts within them was all up to the journalists.
The International Consortium of Investigative Journalists (ICIJ), a network of journalists from more than 65 countries, foresaw this situation. In anticipation of massive technology-enabled information leaks, ICIJ prepared by developing expertise and resources for data journalism on a grand scale. No one news organization has the resources to fully investigate such a large information leak, but ICIJ can provide technical assistance and coordinate fact-finding by journalists around the world.
Mar Cabra, head of ICIJ’s Data and Research Unit, understands the limitations of the databases used in most business applications, which are known as “relational databases.” They’re not designed for management of lengthy documents, or for the labyrinthine relationships that connect them. For this data journalism effort, she chose another type of database, one that was designed as a natural match for the sort of complex data in the Panama Papers, and well-suited to facilitating journalists’ research. This type of database is called a “graph database,” and it is much like a gigantic diagram of documents and relationships among them.
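To make the "gigantic diagram" idea concrete, here is a minimal sketch in Python of how a graph stores entities as nodes and relationships as labeled edges, and how a connection between two parties can be traced through intermediate records. It is only an illustration of the general concept, not ICIJ's actual tooling or data model. Every name in it (the `Graph` class, the relationship labels, "Person A", "Company X" and so on) is hypothetical and does not come from the Panama Papers.

```python
from collections import defaultdict, deque


class Graph:
    """Toy in-memory graph: nodes are names, edges carry a relationship label."""

    def __init__(self):
        # node -> list of (relationship label, neighboring node)
        self.edges = defaultdict(list)

    def add_edge(self, source, relation, target):
        # Store the link in both directions so a search can walk either way.
        self.edges[source].append((relation, target))
        self.edges[target].append((relation, source))

    def path(self, start, goal):
        """Breadth-first search: return a shortest chain of nodes, or None."""
        queue = deque([[start]])
        seen = {start}
        while queue:
            route = queue.popleft()
            node = route[-1]
            if node == goal:
                return route
            for _relation, neighbor in self.edges[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(route + [neighbor])
        return None


# Hypothetical records, invented purely for illustration.
g = Graph()
g.add_edge("Person A", "shareholder_of", "Company X")
g.add_edge("Intermediary Y", "registered", "Company X")
g.add_edge("Intermediary Y", "registered", "Company W")
g.add_edge("Person B", "director_of", "Company W")

# Trace how Person A and Person B are connected.
chain = g.path("Person A", "Person B")
print(" -> ".join(chain))
```

In a relational layout, answering the same question would typically mean joining a people table to a companies table to an intermediaries table and back again, with one more join for every extra hop. In a graph, the same question is a walk along edges, which is why this structure suits research into who is connected to whom.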