hykstat
#1
Hi,
hashtags can contain non-ASCII (UTF-8) characters, for example the tag åland.
Is it possible to use all UTF-8 characters in APIv2 queries without encoding? The åland hashtag seems to work.
With APIv1 you would encode å (“latin small letter a with ring above”) as %C3%A5.
With APIv2 the percent-encoded form doesn’t work:
Traceback (most recent call last):
  File "search-tags.py", line 151, in <module>
    json_response = connect_to_endpoint(tag_name, latest_id)
  File "search-tags.py", line 96, in connect_to_endpoint
    raise Exception(response.status_code, response.text)
Exception: (400, '{"errors":[{"parameters":{"query":["#%C3%A5land"]},"message":"There were errors processing your request: no viable alternative at input '#' (at position 1), no viable alternative at character '%' (at position 2)"}],"title":"Invalid Request","detail":"One or more parameters to your request was invalid.",…
Do you have a link to that search-tags.py script if it’s open source? Or include it here?
The API is UTF-8, so it should work without any encoding. Using twarc, for example (see the twarc2 (en) page in the twarc documentation):
twarc2 search --limit 100 "åland" example.json
hykstat
#3
This is the relevant part of the code. It loops through a list of hashtags and runs a search for each one; tags_list included %C3%A5land.

search_url = ". . . /2/tweets/search/recent"
for hashtag in tags_list:
    query_params = {'query': '#' + hashtag, 'max_results': tweet_count, 'tweet.fields': 'id,text, etc …'}
    response = requests.get(search_url, auth=bearer_oauth, params=query_params)
Just noticed that APIv2 is UTF-8-friendly so encoding is not necessary, thanks!
If I later find any problem cases, I’ll send logs.
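For anyone landing here later, a minimal sketch of what I think went wrong. It uses urllib.parse from the standard library rather than requests, on the assumption that requests encodes the params dict the same way; normalize_tag is a hypothetical helper of my own, not part of any API.

```python
from urllib.parse import quote, unquote, urlencode

raw_tag = "åland"
pre_encoded_tag = quote(raw_tag)  # "%C3%A5land", the manual APIv1-style encoding

# Raw UTF-8 in the parameter: percent-encoded exactly once when the URL is built.
once = urlencode({"query": "#" + raw_tag})
# Pre-encoded value: the "%" itself is escaped again as "%25" (double encoding).
twice = urlencode({"query": "#" + pre_encoded_tag})

print(once)   # query=%23%C3%A5land
print(twice)  # query=%23%25C3%25A5land

# What a server sees after decoding each query string once.
print(unquote(once.split("=", 1)[1]))   # #åland
print(unquote(twice.split("=", 1)[1]))  # #%C3%A5land (the value quoted in the 400 error)


def normalize_tag(tag: str) -> str:
    """Hypothetical helper: undo manual percent-encoding so tags can be passed raw."""
    return unquote(tag)
```

So the fix on my side is to keep raw tags in tags_list (or run existing entries through something like normalize_tag) and let the HTTP library do the encoding.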