Public domain Twitter content dataset



We collected a small dataset (approximately 3,000 tweets) that has been manually annotated. We would like to include a public link to a downloadable spreadsheet in a research paper.
The spreadsheet contains:

  • Tweet ID
  • Tweet Text
  • User ID
  • Manual Annotations

According to the Developer Agreement & Policy,

2. If you provide Content to third parties, including downloadable datasets of Content or an API that returns Content, you will only distribute or allow download of Tweet IDs and/or User IDs.
a. You may, however, provide export via non-automated means (e.g., download of spreadsheets or PDF files, or use of a “save as” button) of up to 50,000 public Tweets and/or User Objects per user of your Service, per day.
b. Any Content provided to third parties via non-automated file download remains subject to this Policy.

it seems that we can share the spreadsheet, provided we limit the number of downloads, without violating the Agreement & Policy.

Is this correct?


That sounds like it should be fine. Ideally, though, you would drop the tweet text column and share the annotation file without any restriction, recommending that people use a tool such as twarc to reconstruct (“hydrate”) the tweet text from the tweet IDs.
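
To make the suggestion concrete, here is a minimal sketch of preparing the shareable file, assuming the spreadsheet is exported as CSV with the column names listed in the question ("Tweet ID", "Tweet Text", "User ID", "Manual Annotations"). The function names are hypothetical, and only the Python standard library is used. IDs are kept as strings because spreadsheet tools can round large numeric tweet IDs.

```python
import csv
import io

# Assumed column names, taken from the spreadsheet described in the question.
TEXT_COLUMN = "Tweet Text"
ID_COLUMN = "Tweet ID"


def strip_text_column(csv_text, text_column=TEXT_COLUMN):
    """Return the CSV content with the tweet-text column removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in reader.fieldnames if name != text_column]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the dropped column in each row.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


def tweet_ids(csv_text, id_column=ID_COLUMN):
    """Return the tweet IDs as strings, one per annotated row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row[id_column] for row in reader]
```

The output of strip_text_column is the file to publish. The list from tweet_ids can be written one ID per line to a text file, which is the kind of input that hydration tools such as twarc typically accept.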