Is it impossible to get a dataset to test whether Twitter mentions/hashtags predict US primary election outcomes?


I was hoping to get data for individual states and run sentiment analysis to see how well tweets predict the outcome of US presidential primary elections. I’d publish my analysis after the fact.

However, there is no way to get even one state's data given the REST rate limits; that would take forever. And unless I’m mistaken, the Streaming APIs do not provide historical data (though I suppose they could be useful going forward?).


Pretty much - the Streaming API is the way to go, and if your filter queries are good, you’ll get a decent dataset. But remember that only a small percentage of tweets have geo metadata, and locating tweets and users isn’t trivial. Building lists of users per state and streaming or retrieving tweets for that subset of users might be worth trying.
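
As a rough illustration of the per-state user list idea, here is a minimal Python sketch. It assumes tweets arrive as dicts shaped like the v1.1 tweet JSON, with an optional `place` object (`country_code`, `full_name`) and a free-text `user.location` field. The helper names `guess_state` and `users_by_state` and the four-state lookup table are made up for this example, and the matching rules are deliberately crude, so treat it as a starting point rather than a reliable geolocator.

```python
import re
from collections import defaultdict

# Illustrative subset only (an assumption for this sketch); extend to all states.
STATE_ABBREVIATIONS = {
    "IA": "Iowa",
    "NH": "New Hampshire",
    "SC": "South Carolina",
    "NV": "Nevada",
}
STATE_NAMES = {name.lower(): abbr for abbr, name in STATE_ABBREVIATIONS.items()}


def guess_state(tweet):
    """Return a two-letter state code for a tweet dict, or None if no guess is possible."""
    place = tweet.get("place") or {}
    if place.get("country_code") == "US":
        full_name = place.get("full_name") or ""
        # City-level places often look like "Des Moines, IA".
        match = re.search(r",\s*([A-Z]{2})$", full_name)
        if match and match.group(1) in STATE_ABBREVIATIONS:
            return match.group(1)
        # State-level places often look like "Iowa, USA".
        head = full_name.split(",")[0].strip().lower()
        if head in STATE_NAMES:
            return STATE_NAMES[head]

    # Fall back to the free-text profile location, which is noisy and often empty.
    location = (tweet.get("user") or {}).get("location") or ""
    match = re.search(r",\s*([A-Z]{2})\b", location)
    if match and match.group(1) in STATE_ABBREVIATIONS:
        return match.group(1)
    lowered = location.lower()
    for name, abbr in STATE_NAMES.items():
        # Crude substring match; expect false positives such as "Nevada City, CA".
        if name in lowered:
            return abbr
    return None


def users_by_state(tweets):
    """Group user ID strings by guessed state, skipping tweets that cannot be located."""
    groups = defaultdict(set)
    for tweet in tweets:
        state = guess_state(tweet)
        user_id = (tweet.get("user") or {}).get("id_str")
        if state and user_id:
            groups[state].add(user_id)
    return dict(groups)
```

The resulting ID sets could then be used to follow or retrieve tweets for just those users, which sidesteps the problem that most individual tweets carry no geo metadata.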

As for the task of predicting election outcomes with sentiment: Sentiment on Twitter doesn’t seem like a good indicator of support for a party. Twitter is more useful for looking at attention to politics, rather than political support:


I’d say that this is definitely the kind of use case that our Gnip data platform is useful for: it provides both historical data and geo-enriched current streaming data at larger volumes.


However, I was just going to do this for fun, so I don’t think it’s worth wasting the Gnip sales team’s time.

Appreciate the quick response though, Andy.