About this catalog
This is a collaborative collection of datasets commonly used in follow-the-money investigations.
The datasets can be easily imported into Aleph or other tools within the followthemoney ecosystem.
This data catalog includes datasets maintained by investigativedata.io, OpenSanctions, and others.
Use the data
You can browse some of the datasets online or download bulk data for non-commercial use as defined by Creative Commons.
We can help to integrate this data into your research infrastructure, e.g. Aleph, or enrich and crossmatch data with other sources. Just get in touch and we can see what we can do.
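As a rough illustration of working with a bulk download, the sketch below counts entities per schema. It assumes the file is in the newline-delimited JSON entity format used across the followthemoney ecosystem (one object per line with `id`, `schema` and `properties`); check the format of the specific dataset before relying on this. The sample records and the helper names `iter_entities` and `count_schemata` are made up for this example and are not part of any library.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one entity dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_schemata(entities: Iterable[dict]) -> Counter:
    """Count entities by their "schema" key (e.g. Person, Company)."""
    return Counter(entity.get("schema", "unknown") for entity in entities)


# Hypothetical sample data standing in for a downloaded file.
SAMPLE = """
{"id": "person-1", "schema": "Person", "properties": {"name": ["Jane Doe"]}}
{"id": "company-1", "schema": "Company", "properties": {"name": ["Example Ltd"]}}
"""

if __name__ == "__main__":
    counts = count_schemata(iter_entities(SAMPLE.splitlines()))
    for schema, number in counts.most_common():
        print(schema, number)
```

For a real download, pass an open file handle to `iter_entities` instead of `SAMPLE.splitlines()`; the file is then read line by line rather than loaded into memory at once.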
Commercial use
Commercial use of this data is not available yet. If you are interested in commercial use of the data, contact us at hi@investigativedata.org.
For the OpenSanctions datasets, please refer directly to their website for how to obtain data for commercial use.
Get involved
This is a collaborative project. There are various ways to get involved, including submitting new datasets or reporting data errors:
Slack
Join the Aleph community Slack and take part in the discussion in the #investigraph channel.
GitHub
Development and issue tracking happen in this repository.
Submit a data issue
If you spot an error or quality issue in a dataset, we prefer a GitHub issue. If you don't have a GitHub account, join the Slack (see above) and discuss the issue there. If you don't want to join Slack, just write to hi@investigativedata.org.
Submit a new dataset
As a data user
Add your dataset with a brief description to the dataset wishlist (Google Spreadsheet)
As a coder
We are happy to add more data sources to the catalog. Here are a few notes on the technology involved.
This project is built on investigraph, an ETL framework for Follow The Money data written in Python.
Head over to the investigraph documentation to learn how the framework works.
Depending on the source data, adding a new dataset might require some Python coding.
Check out the datasets folder in our repo for examples based on the existing datasets.