Hello guys,
I am quite new to the Twitter REST API, so I would like to make sure that I fully understand its limits. As I understand it, I can fetch the most recent tweets of a public Twitter account, up to 3200 tweets, without a date limit. With the Search API, I can search for a specific keyword and retrieve every tweet containing it, but only from the last 7 days. Correct?
With the Python code below I tried to fetch 3200 tweets from a public Twitter profile, but so far I always end up with a different number of tweets. How can I change this?
import tweepy
import json
import pandas as pd
consumer_key = "xxx"
consumer_secret = "xxx"
access_token = "xxx"
access_token_secret = "xxx"
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
results=[]
timeline = tweepy.Cursor(api.user_timeline, screen_name='@realDonaldTrump', tweet_mode="extended").items()
for status in timeline:
    data = (
        status.user.id,
        status.user.screen_name,
        status.user.name,
        status.full_text,
        status.created_at,
        status.lang)
    results.append(data)
cols = "user_id screen_name name text date lang".split()
df = pd.DataFrame(results, columns=cols)
You are correct about the limits.
Unfortunately the 3200 limit is not exact, so you may sometimes end up with slightly fewer or slightly more tweets.
I haven’t tried it but that code looks like it’s using the tweepy Cursor correctly so it should retrieve close to 3200 tweets.
Incidentally, when looking at timelines of very popular or newsworthy accounts, it’s worth looking elsewhere for tweet datasets that others have curated. You can recover the full tweet objects from a list of IDs using GET statuses/lookup.
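To make the ID-based route a bit more concrete, here is a minimal sketch of the batching involved. GET statuses/lookup accepts up to 100 IDs per request, so a longer ID list has to be split up. The names `chunked`, `hydrate` and `lookup` are made up for this example; `lookup` stands for whatever callable you use to hit the endpoint, so the sketch itself does not depend on a particular tweepy version.

```python
from typing import Callable, Iterable, Iterator, List


def chunked(ids: Iterable[int], size: int = 100) -> Iterator[List[int]]:
    """Yield successive lists of at most `size` IDs."""
    batch: List[int] = []
    for tweet_id in ids:
        batch.append(tweet_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def hydrate(lookup: Callable[[List[int]], list], ids: Iterable[int]) -> list:
    """Call `lookup` once per batch and collect the returned tweet objects.

    `lookup` is an assumption: any callable that takes a list of IDs and
    returns the matching tweets, e.g. a thin wrapper around the
    statuses/lookup call of your client library.
    """
    tweets = []
    for batch in chunked(ids):
        tweets.extend(lookup(batch))
    return tweets
```

With tweepy 3.x, something like `hydrate(lambda batch: api.statuses_lookup(batch, tweet_mode="extended"), ids)` should work, although I have not run it against the live API; in tweepy 4 the method is called `lookup_statuses`.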
Also, when working with these kinds of requests, I find it useful to save the entire response from tweepy in a file and then load only the parts I’m interested in. This avoids having to crawl the data again when you decide, for example, to change which parts of the tweets you want in your dataframe.
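A rough sketch of that save-then-load approach, assuming one JSON object per line as the file format. The function names `save_raw` and `load_fields` and the file layout are my own choices for illustration; the only tweepy-specific assumption is that each status object carries the raw API payload in its `_json` attribute.

```python
import json
from typing import Iterable, List


def save_raw(statuses: Iterable, path: str) -> int:
    """Write one raw tweet JSON object per line; return how many were written."""
    count = 0
    with open(path, "w", encoding="utf-8") as fh:
        for status in statuses:
            # Assumption: each status exposes the raw API payload as `_json`,
            # as tweepy's model objects do.
            fh.write(json.dumps(status._json) + "\n")
            count += 1
    return count


def load_fields(path: str, fields: List[str]) -> List[dict]:
    """Re-read the saved file and keep only the requested top-level fields."""
    rows = []
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            tweet = json.loads(line)
            # Missing fields come back as None instead of raising.
            rows.append({name: tweet.get(name) for name in fields})
    return rows
```

After crawling once with `save_raw(timeline, "timeline.jsonl")`, you can rebuild the dataframe as often as you like, e.g. `pd.DataFrame(load_fields("timeline.jsonl", ["id", "full_text", "created_at", "lang"]))`, without touching the API again.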
Thank you for your response and the note on how to fetch tweets more easily. With some changes, basically by also using max_pages, I managed to get almost 3200 tweets.
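For anyone finding this later, here is a minimal sketch of what capping the number of pages can look like. This is not necessarily the code used above. It assumes pages of up to 200 tweets (the maximum `count` for a user timeline request), so 16 pages cover the 3200 limit. The `fetch_page` callable and the function name `collect_timeline` are hypothetical; `max_pages` is simply the name borrowed from the post.

```python
from typing import Callable, List, Optional


def collect_timeline(
    fetch_page: Callable[[int, Optional[int]], List[dict]],
    page_size: int = 200,
    max_pages: int = 16,
) -> List[dict]:
    """Walk a timeline backwards, one page at a time, until it runs dry.

    `fetch_page(count, max_id)` is an assumption: it should return up to
    `count` tweets (as dicts with an "id" key) no newer than `max_id`,
    newest first, or an empty list when nothing is left.
    """
    tweets: List[dict] = []
    max_id: Optional[int] = None
    for _ in range(max_pages):
        page = fetch_page(page_size, max_id)
        if not page:
            break
        tweets.extend(page)
        # Ask for tweets strictly older than the oldest one seen so far.
        max_id = min(t["id"] for t in page) - 1
    return tweets
```

Individual pages may come back with fewer than `page_size` tweets, which is one reason the final count lands near 3200 rather than exactly on it. In tweepy itself, something along the lines of `tweepy.Cursor(api.user_timeline, screen_name=..., count=200, tweet_mode="extended").pages(16)` should give a similar effect.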