Twitter4J can do what you need. For example, to download tweets given their IDs:
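import twitter4j.ResponseList;
import twitter4j.Status;
import twitter4j.Twitter;
import twitter4j.TwitterFactory;
import twitter4j.conf.ConfigurationBuilder;

// Replace the placeholder strings with your app's OAuth credentials.
ConfigurationBuilder config = new ConfigurationBuilder();
config.setOAuthConsumerKey("consumerKey");
config.setOAuthConsumerSecret("consumerSecret");
config.setOAuthAccessToken("accessToken");
config.setOAuthAccessTokenSecret("accessSecret");
Twitter twitter = new TwitterFactory(config.build()).getInstance();

// IDs of the tweets to download (note the uppercase L suffix on long literals).
long[] ids = {568363361278296064L, 568378166512726017L, 570544187394772992L};

// lookup() throws TwitterException on failure, so call it from a method that handles or declares it.
ResponseList<Status> statuses = twitter.lookup(ids);
for (Status status : statuses) {
    System.out.println(status.getText());
}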
That code is from http://stackoverflow.com/a/28831943
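If you have more than a handful of IDs, you will probably need to split them up, since the lookup endpoint accepts up to 100 IDs per request. Here is a rough sketch of that batching step. The class name `TweetIdBatcher` and the constant are my own and not part of Twitter4J; the sketch uses only the standard library, so the actual `twitter.lookup(batch)` call is left as a comment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Twitter4J): splits tweet IDs into request-sized batches.
final class TweetIdBatcher {
    // Assumed per-request cap for the lookup endpoint; check the current API docs.
    static final int MAX_IDS_PER_LOOKUP = 100;

    /** Splits ids into consecutive chunks of at most batchSize elements. */
    static List<long[]> batches(long[] ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<long[]> result = new ArrayList<>();
        for (int start = 0; start < ids.length; start += batchSize) {
            int end = Math.min(start + batchSize, ids.length);
            result.add(Arrays.copyOfRange(ids, start, end));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up consecutive IDs, only to show the batch sizes.
        long[] ids = new long[250];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = 568363361278296064L + i;
        }
        for (long[] batch : batches(ids, MAX_IDS_PER_LOOKUP)) {
            // In real use, each batch would be passed to twitter.lookup(batch).
            System.out.println("batch of " + batch.length + " IDs");
        }
    }
}
```

Remember that lookup calls are rate limited as well, so a long list of batches may need pauses between requests.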
Hello, can anybody tell me whether there is any difference between tweets retrieved through the REST API or Streaming API and tweets retrieved through the advanced search web tool (https://twitter.com/search-advanced)?
Thanks in advance.
Yes, there is a difference:
REST API calls to GET search/tweets return different results from web search: REST search only goes back about a week at most, while web search will display older tweets too. Neither REST search nor web search aims to index every single tweet, so you may sometimes find tweets missing.
Streaming is also limited, but it is probably your best bet if you want to gather as much as possible. The limit for streaming is 1% of the firehose (all public tweets posted), so the only time you’ll be missing tweets is when the volume of tweets matching your stream goes above 1% of the firehose. That threshold can vary quite a bit with overall traffic, so you’ll need to keep an eye on the stream messages about rate limits.
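To make "keep an eye on the limit messages" a bit more concrete, here is a small sketch of a counter you could feed from your stream listener. It assumes each limit notice carries a running total of undelivered tweets since the connection opened, and that notices can arrive out of order, which is why it keeps the maximum. The class `StreamCoverage` and its methods are hypothetical names of mine; in Twitter4J you would call them from your `StatusListener` callbacks.

```java
// Hypothetical helper: estimates what fraction of matching tweets a stream delivered.
final class StreamCoverage {
    private long delivered;
    // Highest cumulative "undelivered" count seen so far on this connection.
    private long undelivered;

    /** Call once for every tweet the stream delivers. */
    void onTweet() {
        delivered++;
    }

    /** Call for every limit notice; the count is assumed to be cumulative since connect. */
    void onLimitNotice(long cumulativeUndelivered) {
        undelivered = Math.max(undelivered, cumulativeUndelivered);
    }

    /** Fraction of matching tweets delivered, or 1.0 if nothing has been seen yet. */
    double deliveredFraction() {
        long total = delivered + undelivered;
        return total == 0 ? 1.0 : (double) delivered / total;
    }

    public static void main(String[] args) {
        StreamCoverage coverage = new StreamCoverage();
        for (int i = 0; i < 900; i++) {
            coverage.onTweet();
        }
        coverage.onLimitNotice(100); // pretend the stream reported 100 undelivered tweets
        System.out.println("delivered fraction: " + coverage.deliveredFraction());
    }
}
```

Since the count is assumed to restart with each connection, you would want a fresh counter after every reconnect.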
If you’re gathering a dataset, it might help to use both Streaming and REST. There will be a lot of overlap, but REST Search calls are useful for filling in gaps in the data, especially when a stream disconnects.
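Because of that overlap, you will want to de-duplicate on tweet ID when you combine the two sources. A minimal sketch, using a stand-in `Tweet` class of my own rather than any real library type (with Twitter4J you would key on `status.getId()` instead):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: merges streamed and searched tweets, dropping duplicates by ID.
final class TweetMerger {
    /** Minimal stand-in for a tweet; a real application would use its library's status type. */
    static final class Tweet {
        final long id;
        final String text;

        Tweet(long id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    /** Returns one tweet per ID, preferring the streamed copy when both sources have it. */
    static List<Tweet> merge(List<Tweet> streamed, List<Tweet> searched) {
        Map<Long, Tweet> byId = new LinkedHashMap<>();
        for (Tweet t : streamed) {
            byId.putIfAbsent(t.id, t);
        }
        for (Tweet t : searched) {
            byId.putIfAbsent(t.id, t); // only fills gaps the stream left
        }
        return new ArrayList<>(byId.values());
    }

    public static void main(String[] args) {
        List<Tweet> streamed = List.of(new Tweet(1L, "from stream"), new Tweet(2L, "from stream"));
        List<Tweet> searched = List.of(new Tweet(2L, "from search"), new Tweet(3L, "from search"));
        for (Tweet t : merge(streamed, searched)) {
            System.out.println(t.id + " " + t.text);
        }
    }
}
```

If you store tweets in a database, a unique index on the tweet ID gets you the same effect without holding everything in memory.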
Hello IgorBrigadir, thanks so much for your explanation.
Yes, as I already replied on this other thread: your IP is liable to be blocked if you take this approach.
Hello, I also want to get Twitter data for my research. Can anyone help with how it could be done?

Hi,
I’m a researcher from the University of Melbourne in Australia. The group I work in is scraping Twitter data to analyse aspects of emotion control. After many months we are now collecting tweets and user timelines in a MongoDB database using Docker, pymongo and tweepy. We currently have a dataset of 20 million tweets from 8,000 individual users, and it’s growing daily.
Happy to share information as I think helping other researchers will help us.
Please re-read the developer agreement and policy and the rules on Tweet storage and distribution (Policy section I.F.2). Large-scale sharing like this is explicitly disallowed and would result in termination of your API access. You may share Tweet IDs, or a small downloadable dataset. You have also agreed (under the developer policy) to promptly remove content that is removed from Twitter (Policy section I.B.6).
Thank you.