Hello,
I am using Twitter API v2 to get tweets that contain specific keywords and a URL, and I am using the Twitter stream endpoints. To do this, I am using the rule: keyword has:links.
However, I see that not all the resulting tweets contain a URL, either in their text or in their entities. I read that tweets may only start matching the query 20-30 seconds after the stream begins; however, I waited much longer than that, and some tweets still did not contain a hyperlink. Is there any solution to this problem? Am I doing something wrong?
Thank you in advance for taking the time to read this.
I can’t check to make sure right now, but I think has:links will also match media, so a tweet may have no URL links but still match that operator because of its media attachments. Maybe that’s it?
I tried to check and it doesn’t seem to be the case to me, but I may be wrong.
Do you have a code sample you could send over here?
I am using the code suggested on the Twitter developer page. The functions that I edited are the following:
```python
import json

import requests


def set_rules(headers, delete, bearer_token):
    # You can adjust the rules if needed
    sample_rules = [
        {"value": "elections has:links", "tag": "ELECTIONS"},
    ]
    payload = {"add": sample_rules}
    response = requests.post(
        "https://api.twitter.com/2/tweets/search/stream/rules",
        headers=headers,
        json=payload,
    )
    if response.status_code != 201:
        raise Exception(
            "Cannot add rules (HTTP {}): {}".format(response.status_code, response.text)
        )
    print(json.dumps(response.json()))


def get_stream(headers, set, bearer_token):
    # entities and attachments are requested so that URLs and media show up in the payload
    tweet_fields = 'tweet.fields=created_at,geo,author_id,entities,conversation_id,id,in_reply_to_user_id,public_metrics,referenced_tweets,source,attachments,context_annotations'
    expansions = 'expansions=author_id,entities.mentions.username,geo.place_id'
    user_fields = 'user.fields=created_at,description,public_metrics,verified,location'
    place_fields = 'place.fields=country,name,place_type'
    response = requests.get(
        "https://api.twitter.com/2/tweets/search/stream?{}&{}&{}&{}".format(
            tweet_fields, expansions, user_fields, place_fields
        ),
        headers=headers,
        stream=True,
    )
    print(response.status_code)
    if response.status_code != 200:
        raise Exception(
            "Cannot get stream (HTTP {}): {}".format(
                response.status_code, response.text
            )
        )
    # Each non-empty line of the stream is one JSON-encoded tweet payload
    for response_line in response.iter_lines():
        if response_line:
            json_response = json.loads(response_line)
            print(json.dumps(json_response, indent=4, sort_keys=True))
```
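To see why a given tweet matched, it may help to summarize each streamed payload instead of printing it whole. The sketch below assumes the usual v2 payload shape (a `data` object with optional `entities.urls`, `entities.mentions` and `attachments.media_keys`, which the `tweet.fields` above request). `summarize_match` is a hypothetical helper name, not part of the Twitter sample code.

```python
def summarize_match(payload):
    """Report which link-like features a streamed tweet payload carries.

    Assumes the v2 shape {"data": {"entities": {...}, "attachments": {...}}};
    any missing key is treated as "feature absent".
    """
    data = payload.get("data") or {}
    entities = data.get("entities") or {}
    attachments = data.get("attachments") or {}
    urls = entities.get("urls") or []
    return {
        "id": data.get("id"),
        # Prefer the expanded URL, fall back to the t.co short URL
        "urls": [u.get("expanded_url") or u.get("url") for u in urls],
        "has_mentions": bool(entities.get("mentions")),
        "has_media": bool(attachments.get("media_keys")),
    }
```

Calling `summarize_match(json_response)` inside the stream loop would show, for each unexpected tweet, whether it has URLs, mentions, or media, which should make it easier to tell which of these the operator is reacting to.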
Thanks! Do you have any examples of what you are expecting to see but aren’t?
Best,
Jessica
I have an example of what I am not expecting to see but that appears anyway.
This is a tweet that, as far as I can tell, contains neither media nor a URL, yet it still shows up in the stream for the rule with ‘has:links’.
An example is this:
There are many other examples like this, although I’d say the majority do contain links. The problem is that, with a maximum of 500,000 tweets per month, every tweet I receive that doesn’t contain a URL is wasted for the purpose of my study. If I were to use API v1, would I still have the upper limit of 500,000 tweets?
Thanks for taking care of this issue!
Hello,
Is it possible that has:links also matches mentions that start with ‘@’?
Because if I use the query:
keyword has:links -has:mentions
then I receive only tweets with hyperlinks.
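
For reference, the workaround above could be expressed as a rule payload like the one below. This is only a sketch: `build_rule` is a hypothetical helper, and whether has:links really matches mentions is my guess from what I observed, not something I have seen confirmed.

```python
def build_rule(keyword, tag, exclude_mentions=True):
    """Build one filtered-stream rule: keyword plus has:links.

    exclude_mentions appends -has:mentions, the workaround described above;
    note that it also drops tweets that have both a link and a mention.
    """
    value = "{} has:links".format(keyword)
    if exclude_mentions:
        value += " -has:mentions"
    return {"value": value, "tag": tag}


# Same shape as the payload posted to the stream rules endpoint in set_rules
payload = {"add": [build_rule("elections", "ELECTIONS")]}
```

The resulting `payload` can be passed as `json=payload` in the `requests.post` call from `set_rules`.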
Thanks 