About this catalog

This is a collaborative collection of datasets commonly used in follow-the-money investigations.

The datasets can be easily imported into Aleph or other tools within the followthemoney ecosystem.

This data catalog includes datasets maintained by investigativedata.io, OpenSanctions, and others.

Use the data

You can browse some of the datasets online or download bulk data for non-commercial use as defined by Creative Commons.
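To get a first overview of a downloaded bulk file, a short script can count entities per schema. The following is a minimal sketch, not part of this project's tooling. It assumes the bulk file is newline-delimited JSON in which each line is one Follow The Money entity with `id`, `schema`, and `properties` keys. The function name `count_schemata` and the sample entities are made up for illustration.

```python
import io
import json
from collections import Counter


def count_schemata(lines):
    """Count entities per schema in an iterable of JSON lines.

    Assumes one JSON-serialized Follow The Money entity per line.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            # Skip blank lines, e.g. a trailing newline at the end of a file
            continue
        entity = json.loads(line)
        counts[entity["schema"]] += 1
    return counts


# Two made-up entities standing in for a downloaded bulk file
sample = io.StringIO(
    '{"id": "a1", "schema": "Person", "properties": {"name": ["Jane Doe"]}}\n'
    '{"id": "b2", "schema": "Company", "properties": {"name": ["Example Ltd"]}}\n'
)
print(count_schemata(sample))
```

For a real download, pass an open file handle instead of the in-memory sample, since the function accepts any iterable of lines.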

We can help integrate this data into your research infrastructure (e.g. Aleph), or enrich and cross-match it with other sources. Get in touch and we will see what we can do.

Commercial use

Commercial use of this data is not available yet. If you are interested in commercial use of the data, contact us at hi@investigativedata.org.

For the OpenSanctions datasets, please refer directly to their website for how to obtain data for commercial use.

Get involved

This is a collaborative project. You can easily submit new datasets or report data errors.

There are various ways to get involved:

Slack

Join the Aleph community Slack and take part in the discussion in the #investigraph channel.

GitHub

Development and issue tracking happen in this repository.

Submit a data issue

If you spot an error or quality issue in a dataset, we prefer a GitHub issue. If you don't have a GitHub account, join the Slack (see above) and discuss the issue there. If you don't want to join Slack, just write to hi@investigativedata.org.

Submit a new dataset

As a data user

Add your dataset with a brief description to the dataset wishlist (Google Spreadsheet)

As a coder

We are happy to add more data sources to the catalog. Here are a few notes on the technology involved.

This project is built on investigraph, an ETL framework for Follow The Money data written in Python.

Head over to the investigraph documentation to learn how the framework works.

Depending on the source data, adding a new dataset might require some Python coding knowledge.

Check out the datasets folder in our repo for examples of existing datasets.