Hi,
I would like to retrieve data from Twitter using the developer/academic API and later make this data publicly available. I am not planning to include any user or personal information (e.g., user location or screen name). Is this allowed?
Thanks in advance!
Yes, you can share sets of IDs, like https://catalog.docnow.io/
The rules for this are in the Developer Policy on the Twitter Developer Platform.
Give twarc a try: it can collect data, hydrate an existing dataset of IDs, or dehydrate collected data to create a new dataset. See the twarc2 documentation.
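In case "dehydrate" is unclear, here is a rough sketch of the idea in plain Python: read collected tweets and keep only their IDs, so the shareable file contains no text or user fields. This is not twarc's implementation, just an illustration. The file layout is an assumption (one JSON object per line, either a single tweet with an `id` field or a page with the tweets under `data`), and the names `iter_tweet_ids` and `dehydrate` are made up for this example.

```python
import json
from typing import Iterable, Iterator


def iter_tweet_ids(lines: Iterable[str]) -> Iterator[str]:
    """Yield unique tweet IDs from JSONL lines (assumed layout, see above)."""
    seen = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: a page wraps its tweets in "data";
        # otherwise the record itself is treated as one tweet.
        data = record.get("data", record)
        tweets = data if isinstance(data, list) else [data]
        for tweet in tweets:
            tweet_id = str(tweet["id"])
            if tweet_id not in seen:
                seen.add(tweet_id)
                yield tweet_id


def dehydrate(in_path: str, out_path: str) -> int:
    """Write one tweet ID per line to out_path and return how many were written."""
    count = 0
    with open(in_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        for tweet_id in iter_tweet_ids(src):
            dst.write(tweet_id + "\n")
            count += 1
    return count
```

Something like `dehydrate("tweets.jsonl", "tweet_ids.txt")` (both file names are placeholders) would then give you an ID-only file that others can hydrate themselves.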
Thanks a lot Igor!
Just to be sure: is it not allowed to publish the actual text content of the tweets? I thought you could publish either the textual content or the tweet IDs (but not both).
Thanks again!
No problem! I’m not Twitter, so I can’t speak with any authority, but as far as I remember, ideally you should share IDs only. That way, tweets that have since been deleted can’t be hydrated. There were also anonymized datasets with just the text and no IDs, but unfortunately I have no idea whether those are approved by Twitter.
Any idea where we can look to double-check this? I have seen hydrated tweets being shared, such as https://www.kaggle.com/datasets/kazanova/sentiment140 and https://www.kaggle.com/datasets/ayushggarg/all-trumps-twitter-insults-20152021 (I don’t know why I can’t post them as links), where both the user ID and the text of the tweet are publicly available.