How should I expose data from my app for data scientists?

2017-03-20 23:28:57

I'm the product manager for an online app. I'm currently researching a new feature that will let our users access "all their raw application data". This data is likely to be used by data scientists and loaded into BI tools. The datasets will contain up to a few million rows.

How should I actually expose the data in a practical sense?

For example:

An online data source such as Amazon Redshift

Some other RDBMS available online (e.g. a dedicated PostgreSQL installation)

CSV files available on S3

CSV files available for download in a web interface

Dumps into Google Sheets

It doesn't matter, since decent data scientists can easily handle and automate anything
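
To make the CSV options concrete, here is a rough sketch of what I imagine the export and the consumer side looking like. It uses only the Python standard library. The sample rows, column names, and the helper names `export_csv` and `load_csv` are hypothetical placeholders, not our real schema or code.

```python
import csv
import io

# Hypothetical sample rows standing in for "raw application data".
ROWS = [
    {"user_id": 1, "event": "login", "created_at": "2017-03-01T09:00:00Z"},
    {"user_id": 2, "event": "purchase", "created_at": "2017-03-01T09:05:00Z"},
]


def export_csv(rows, fieldnames):
    """Serialize rows to CSV text with a header row (the app side)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def load_csv(text):
    """Parse CSV text back into a list of dicts (the consumer side)."""
    return list(csv.DictReader(io.StringIO(text)))


csv_text = export_csv(ROWS, ["user_id", "event", "created_at"])
loaded = load_csv(csv_text)
```

One thing this makes visible: every value comes back from the CSV as a string, so the consumer has to re-apply types such as integers and timestamps. A database option would carry that type information with the data.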