Hello,
I have an issue with my Twitter app.
I need to extract tweets using the Logstash twitter input plugin. With an app on the standard API v1.1, it worked perfectly.
The app I created now uses API v2 by default, and it is not working (not authorized):
[2022-06-08T22:14:14,280][WARN ][logstash.inputs.twitter ][main][435bb25afab9f4209abf681be00d2246423a14d2c7cc9f3dd9a3062c27577c5a] Twitter client error {:message=>"", :exception=>Twitter::Error::Forbidden, :backtrace=>, :options=>nil}
My questions are:
Is there any way to create a v1.1 app, or is v1.1 deprecated? (I cannot find where to create one; I am only offered v2.)
If it is deprecated, how do I connect Logstash to the Twitter API v2?
Thank you.
Best regards
The v2 API is totally different, so a new Logstash plugin, or a new version of the existing one, will be needed to get v2 to work.
Thank you @IgorBrigadir. So is it possible to create an app for the standard v1.1 API?
Every time I try, I'm redirected straight to v2.
For Standard v1.1 access, you need an Elevated access account, not a Basic one. See: Getting Started with the Twitter API | Docs | Twitter Developer Platform
I'm really confused. Below is a screenshot of my Twitter account.
I have created a project with elevated access:
I also have 2 standalone apps (I think they can access API v1.1):
I set up OAuth 1.0a, but it requires a callback URL and a website URL. I don't need them, because my purpose is to use these apps in the Logstash twitter plugin.
I put arbitrary values in these two URLs, and Logstash still does not work: Forbidden.
What is the solution, please? And what should I provide in these two URLs? I'm not using a website or doing any web development.
Oh, if you have elevated access, the v1.1 API should work. Try https://example.com/ as the URL (it does not matter what this url is, it’s just a UI quirk.)
Remember that changing the App permissions also requires resetting the token.
Exactly, I put arbitrary values in these two URLs and regenerated the tokens.
When I test these credentials with tweepy, it works perfectly.
But when I try them with Logstash, I get Forbidden.
Is this on the exact same server / everything else? Is tweepy definitely calling the same endpoint as logstash (v1.1) ?
Is logstash reading the credentials correctly? (Sometimes Environment variables are not set right when deploying)
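One way to rule out the credentials question is to check, in the same environment Logstash runs in, that all four values are present and free of stray whitespace. This is only a minimal sketch: the environment variable names and the `credential_problems` helper are hypothetical, so substitute whatever the deployment actually uses.

```python
import os

# Hypothetical environment variable names, one per twitter input setting.
REQUIRED = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_OAUTH_TOKEN",
    "TWITTER_OAUTH_TOKEN_SECRET",
)


def credential_problems(env):
    """Return a list of human-readable problems found in the given mapping."""
    problems = []
    for name in REQUIRED:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is missing or empty")
        elif value != value.strip():
            # A trailing newline or space is easy to pick up when copy-pasting.
            problems.append(f"{name} has leading or trailing whitespace")
    return problems


if __name__ == "__main__":
    for problem in credential_problems(os.environ) or ["all four credentials look set"]:
        print(problem)
```

If the values are hard-coded in the pipeline file instead, the same whitespace check can be applied to them by passing a plain dict.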
Tweepy and Logstash are running on the same server.
Moreover, the Logstash pipeline is very simple:
input {
  twitter {
    consumer_key => "40I2hjtuth**"
    consumer_secret => "y1AMno****"
    oauth_token => "9260767336814***"
    oauth_token_secret => "CopTDMX0mbG***"
    keywords => ["engineer"]
    full_tweet => true
    type => "tweet"
  }
}
output {
  stdout {}
}
The same pipeline works perfectly with my older app (which I created in 2020).
Hello again,
After many tests, I decided to use curl requests against the Twitter streaming API.
I defined the rules like this:
curl --location --request POST 'https://api.twitter.com/2/tweets/search/stream/rules' \
--header 'Content-type: application/json' \
--header 'Authorization: Bearer AAAAAAAAAAAAAAAAAAAAAJ****' \
--header 'Cookie: guest_id=v1%3A165697294294313514' \
--data-raw '{
"add":[
{
"tag":"test test",
"value":"hello"
}
]
}'
Then I have to execute a curl GET to fetch, in real time, the tweets containing the keyword hello.
The problem I face is that I have to execute this curl command every xx seconds. It returns the tweets published while it is running, but I lose those published in the dead time between two executions.
My question: is there any way to run this curl command
curl --location --request GET 'https://api.twitter.com/2/tweets/search/stream?tweet.fields=created_at,geo,id,lang,text' \
--header 'Authorization: Bearer AAAAAAAAAAAAAAAAAAAAAJ8Rd***'
as a stream, using Logstash or Python?
Thanks
Well, using Python you can use twarc (see twarc.Client2 - twarc); it will handle all the stream connecting, disconnecting, etc.
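For illustration, here is roughly what holding the stream open looks like with only the Python standard library. This is not twarc and not its API, just a sketch of the idea: one long-lived GET on the same URL as the curl command, read line by line. It assumes the stream sends one JSON object per line, with blank lines as keep-alive signals. The names `parse_stream`, `stream_tweets`, `main` and the `TWITTER_BEARER_TOKEN` variable are hypothetical, and there is no reconnect or backoff logic, which is the part a library like twarc takes care of.

```python
import json
import os
import urllib.request

STREAM_URL = (
    "https://api.twitter.com/2/tweets/search/stream"
    "?tweet.fields=created_at,geo,id,lang,text"
)


def parse_stream(lines):
    """Yield one decoded JSON object per non-empty line of the stream."""
    for raw in lines:
        line = raw.strip()
        if not line:
            # Assumption: blank lines are keep-alive signals, not tweets.
            continue
        yield json.loads(line.decode("utf-8"))


def stream_tweets(bearer_token, timeout=90):
    """Open one long-lived connection and yield tweets as they arrive."""
    request = urllib.request.Request(
        STREAM_URL, headers={"Authorization": f"Bearer {bearer_token}"}
    )
    # The response object is iterable line by line while the socket stays open.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        yield from parse_stream(response)


def main():
    # TWITTER_BEARER_TOKEN is a hypothetical environment variable name.
    for tweet in stream_tweets(os.environ["TWITTER_BEARER_TOKEN"]):
        print(json.dumps(tweet), flush=True)
```

Calling `main()` keeps a single connection open and prints each matching tweet as one JSON line, so nothing is lost between polls as long as the connection holds.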
Hello,
After many attempts, I decided to change the approach.
I have now written a Python script that gathers tweets using Tweepy.
It returns each tweet in a fixed dict format.
This is my Python script:
import tweepy
import json
credentials = {
    "consumer_key": "4qpWxxx",
    "consumer_secret": "maHAYxxx",
    "access_key": "9260767xxx",
    "access_secret": "FYlz7Rxxx"}
auth = tweepy.OAuthHandler(credentials["consumer_key"], credentials["consumer_secret"])
auth.set_access_token(credentials["access_key"], credentials["access_secret"])
############ Tweepy version 4.10.0 ############
from datetime import date
today = date.today()
since = today.strftime("%Y-%m-%d")
api = tweepy.API(auth,
                 retry_count=10,
                 timeout=300)
search = tweepy.Cursor(api.search_tweets,
                       q=["الجزائر"],
                       count=maxTweets,
                       tweet_mode="extended",
                       result_type="mixed",
                       since="2022-07-21",
                       include_entities=True
                       )
val = []
for tweet in search.items(10):
    val.append(tweet)
Tweets = []
for i in range(len(val)):
    model = {
        "text": val[i].full_text,
        "id_str": val[i].id_str,
        "lang": val[i].lang
    }
    Tweets.append(model)
resultat = {"results": Tweets}
print(resultat)
Then I collect the returned value using the Logstash exec input plugin, like this:
input {
  exec {
    command => 'python python_twitter.py'
    interval => 1000
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
I have two problems with this solution:
1- Tweepy automatically prints some messages, e.g. 'unexpected value since'. I want to stop these messages.
2- I get a charset error, returned by Logstash, when the tweet is in Arabic.
Any help, please?
The first one is an error caused by since="2022-07-21" in your code: since is not a valid parameter here. If you need to add it, it belongs in the query itself, like "الجزائر since:2022-07-21". But remember that the Search API only goes back 7 days, so it doesn't make sense to hard-code this at all.
The charset error is something you'll need to fix in Logstash: tweet text is UTF-8, so this is more of a question for the Logstash community.
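On the Python side, one thing that may help is printing strict JSON with non-ASCII characters escaped, instead of printing the dict itself. This is an assumption about the cause, since the exact Logstash error is not shown in the thread. The output is then plain ASCII whatever the console encoding, and any JSON parser restores the Arabic text. In this sketch, `to_ascii_json` is a hypothetical helper and the sample tweet is made up to stand in for the Tweepy results:

```python
import json


def to_ascii_json(tweets):
    """Serialize tweets as one line of ASCII-only JSON.

    ensure_ascii=True (the default, spelled out here) writes non-ASCII
    characters as \\uXXXX escapes, so stdout carries no raw Arabic bytes.
    """
    return json.dumps({"results": tweets}, ensure_ascii=True)


# Made-up sample in the same shape as the script's "model" dict.
sample = [{"text": "الجزائر", "id_str": "1", "lang": "ar"}]
line = to_ascii_json(sample)
print(line)
```

In the exec input, setting `codec => json` would then let Logstash parse that line into fields. Whether this removes the charset error depends on its actual cause.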
Thank you, I really appreciate your help.
On the first question ('since'): even though an error is raised, Tweepy seems to take the date given in since into account, i.e. it returns only tweets with date >= since.
Moreover, when I tested the query as you proposed, the date filter was not applied.
On the second question, I didn't get any reply from the Logstash / Elastic community.
Yes, that is because the search endpoint can only return tweets that are at most 7 days old; it's not the parameter that is setting that. 2022-07-21 is more than 7 days ago, so it won't work.
See the tweepy 4.10.1 API documentation for all the valid parameters for the method (there is since_id but no since), and Using standard search | Docs | Twitter Developer Platform for all the valid parameters for the query.
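To avoid hard-coding a date that falls outside the window, the since: operator can be computed relative to today and capped at 7 days. A small sketch of that idea follows; `build_query` and `SEARCH_WINDOW_DAYS` are hypothetical names, and the 7-day figure is the limit mentioned above:

```python
from datetime import date, timedelta

SEARCH_WINDOW_DAYS = 7  # standard search only reaches back about a week


def build_query(keyword, days_back, today=None):
    """Return a standard-search query string with a since: operator."""
    today = today or date.today()
    # Clamp the request to the window the endpoint can actually serve.
    days_back = max(0, min(days_back, SEARCH_WINDOW_DAYS))
    since = today - timedelta(days=days_back)
    return f"{keyword} since:{since:%Y-%m-%d}"


# Fixed "today" so the example is reproducible.
print(build_query("engineer", 3, today=date(2022, 7, 28)))
```

The resulting string would be passed as q= to api.search_tweets, in place of the since= keyword argument.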