I am using the premium API for Full Archive search. Even though I am using “place_country:us”, I am still getting back lots of Tweets with no geo objects.
I have checked many of them and they have coordinates: null or place: null.
Since I am also using -is:retweet, which is supposed to remove the Retweets, where exactly did the API get the idea that a Tweet with coordinates: null and place: null is from the US? :thinking:
I am trying to clean the JSON files that I saved after my request, and I want to have a place column, but since both geo fields are null I am not sure what I should do.

I was wondering, are there any more operators that I should add?
Here is my rule; “each_item” is each keyword that I am looking for.

search_term = each_item + " place_country:us lang:en -is:retweet"

rule = gen_rule_payload(search_term,
                               from_date="2018-09-01",  # 2018-09-01 12:00am (UTC) 1535760000
                               to_date="2018-12-31",  # 2018-12-31 12:00am 1546214400
                               results_per_call=500)

tweets = collect_results(rule,
                                max_results=500,
                                result_stream_args=premium_search_args)

BTW, since the geo object introduction page mentions that the location in these fields doesn’t mean the Tweet was posted from that exact place, how should one get Tweets that were posted from inside the US?

Hello there… That is not the expected behavior. If a Tweet is geo-tagged, we’d expect the “place” attribute to be non-null (although there can be cases where an exact location cannot be “rolled up” into a Twitter Place).

I see your query is built with : each_item + " place_country:us lang:en -is:retweet"

Can you provide an example of a complete rule and the Tweet JSON ?

If by chance the each_item clause is ORed with the hard-coded clauses, then that would probably explain this behavior.

Well, I have a text file containing the keywords I am looking for, one per line.
Some lines are just a single “word3” and some are of the “(word1 OR word2)” kind.
So technically, I get this issue when I use OR in my keyword?

This is a Tweet that I got back:

{'created_at': 'Sun Dec 30 23:59:59 +0000 2018', 'id': 1079527550437580800, 'id_str': '1079527550437580800', 'text': 'RT @jojoansett: Things to leave in 2018: \n• people that drain you\n• mediocre friendships\n• toxic parts of yourself\n• self sabotaging habits…', 'source': '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>', 'truncated': False, 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 'in_reply_to_screen_name': None, 'user': {'id': 954469107398684672, 'id_str': '954469107398684672', 'name': '👁', 'screen_name': 'litxxz', 'location': 'some place higher', 'url': None, 'description': '愛', 'translator_type': 'none', 'derived': {'locations': [{'country': 'Finland', 'country_code': 'FI', 'full_name': 'Finland', 'geo': {'coordinates': [26.0, 64.0], 'type': 'point'}}]}, 'protected': False, 'verified': False, 'followers_count': 86, 'friends_count': 127, 'listed_count': 0, 'favourites_count': 3670, 'statuses_count': 3904, 'created_at': 'Fri Jan 19 21:42:23 +0000 2018', 'utc_offset': None, 'time_zone': None, 'geo_enabled': True, 'lang': None, 'contributors_enabled': False, 'is_translator': False, 'profile_background_color': '000000', 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_tile': False, 'profile_link_color': 'ABB8C2', 'profile_sidebar_border_color': '000000', 'profile_sidebar_fill_color': '000000', 'profile_text_color': '000000', 'profile_use_background_image': False, 'profile_image_url': 'http://pbs.twimg.com/profile_images/1154538602916802560/QW4qL_5j_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/1154538602916802560/QW4qL_5j_normal.jpg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/954469107398684672/1564297435', 'default_profile': False, 
'default_profile_image': False, 'following': None, 'follow_request_sent': None, 'notifications': None}, 'geo': None, 'coordinates': None, 'place': None, 'contributors': None, 'retweeted_status': {'created_at': 'Sun Dec 30 03:57:03 +0000 2018', 'id': 1079224822561943552, 'id_str': '1079224822561943552', 'text': 'Things to leave in 2018: \n• people that drain you\n• mediocre friendships\n• toxic parts of yourself\n• self sabotagin… https://t.co/vt1lFrm1ya', 'source': '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>', 'truncated': True, 'in_reply_to_status_id': None, 'in_reply_to_status_id_str': None, 'in_reply_to_user_id': None, 'in_reply_to_user_id_str': None, 'in_reply_to_screen_name': None, 'user': {'id': 1026933638510903298, 'id_str': '1026933638510903298', 'name': 'JOJO', 'screen_name': 'jojoansett', 'location': None, 'url': 'https://www.youtube.com/channel/UCU-lh8btMYjFfrliM6NHp5w', 'description': 'Instagram • jojoansett', 'translator_type': 'none', 'protected': False, 'verified': False, 'followers_count': 21162, 'friends_count': 109, 'listed_count': 171, 'favourites_count': 12438, 'statuses_count': 644, 'created_at': 'Tue Aug 07 20:50:33 +0000 2018', 'utc_offset': None, 'time_zone': None, 'geo_enabled': True, 'lang': None, 'contributors_enabled': False, 'is_translator': False, 'profile_background_color': 'F5F8FA', 'profile_background_image_url': '', 'profile_background_image_url_https': '', 'profile_background_tile': False, 'profile_link_color': '1DA1F2', 'profile_sidebar_border_color': 'C0DEED', 'profile_sidebar_fill_color': 'DDEEF6', 'profile_text_color': '333333', 'profile_use_background_image': True, 'profile_image_url': 'http://pbs.twimg.com/profile_images/1130161152032428032/hswSOar1_normal.jpg', 'profile_image_url_https': 'https://pbs.twimg.com/profile_images/1130161152032428032/hswSOar1_normal.jpg', 'profile_banner_url': 'https://pbs.twimg.com/profile_banners/1026933638510903298/1564170689', 'default_profile': True, 
'default_profile_image': False, 'following': None, 'follow_request_sent': None, 'notifications': None}, 'geo': None, 'coordinates': None, 'place': None, 'contributors': None, 'is_quote_status': False, 'extended_tweet': {'full_text': 'Things to leave in 2018: \n• people that drain you\n• mediocre friendships\n• toxic parts of yourself\n• self sabotaging habits \n• situationships\n• abusive relationships \n• boring sex\n• the fear of not being enough\n• doubt/stress/worry \n• inconsistent energy\n• him/her/them', 'display_text_range': [0, 269], 'entities': {'hashtags': [], 'urls': [], 'user_mentions': [], 'symbols': []}}, 'quote_count': 171, 'reply_count': 23, 'retweet_count': 9656, 'favorite_count': 19779, 'entities': {'hashtags': [], 'urls': [{'url': 'https://t.co/vt1lFrm1ya', 'expanded_url': 'https://twitter.com/i/web/status/1079224822561943552', 'display_url': 'twitter.com/i/web/status/1…', 'indices': [117, 140]}], 'user_mentions': [], 'symbols': []}, 'favorited': False, 'retweeted': False, 'filter_level': 'low', 'lang': 'en'}, 'is_quote_status': False, 'quote_count': 0, 'reply_count': 0, 'retweet_count': 0, 'favorite_count': 0, 'entities': {'hashtags': [], 'urls': [], 'user_mentions': [{'screen_name': 'jojoansett', 'name': 'JOJO', 'id': 1026933638510903298, 'id_str': '1026933638510903298', 'indices': [3, 14]}], 'symbols': []}, 'favorited': False, 'retweeted': False, 'filter_level': 'low', 'lang': 'en', 'matching_rules': [{'tag': None}]}

As you can see, coordinates and place are both null.

The query for this Tweet was:

search_term = "worry OR worried" + " place_country:ca lang:en -is:retweet"

rule = gen_rule_payload(search_term,
                               from_date="2018-01-01",  # 2018-01-01 12:00am (UTC) 1514764800
                               to_date="2018-12-31",  # 2018-12-31 12:00am 1546214400
                               results_per_call=500)

I made two small changes compared to the query in my question: I chose Canada as the country, and I changed the from_date to Jan 1st 2018.

What about:

search_term = "(worry OR worried)" + " place_country:ca lang:en -is:retweet"

Does that work as expected?

Yes, that would work as expected, since ANDs (whitespace between clauses) are applied before ORs.

search_term = "worry OR worried" + " place_country:ca lang:en -is:retweet"

The above results in the equivalent of:

(worry) OR (worried place_country:ca lang:en -is:retweet)

So, you would match any Tweet with the keyword “worry” in it.

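For the same reason, your example Tweet was a Retweet: -is:retweet was applied only to the “worried” branch, so anything containing “worry” matched, Retweets included.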
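To make the grouping concrete, here is a minimal sketch of a toy splitter that breaks a query on the ORs that sit outside any parentheses. It is not the API's parser: `split_top_level_or` is a hypothetical helper that only models the "AND binds tighter than OR" behavior described above, and it ignores quoted phrases.

```python
def split_top_level_or(query):
    """Split a query on ' OR ' tokens that are not inside parentheses.

    Toy model only: it mimics ANDs (spaces) being applied before ORs,
    and does not handle quoted phrases.
    """
    branches, depth, start, i = [], 0, 0, 0
    while i < len(query):
        ch = query[i]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and query.startswith(" OR ", i):
            # A top-level OR ends the current branch.
            branches.append(query[start:i].strip())
            start = i + 4
            i += 4
            continue
        i += 1
    branches.append(query[start:].strip())
    return branches


filters = " place_country:ca lang:en -is:retweet"

# Without brackets the filters only attach to the last OR branch.
ungrouped = split_top_level_or("worry OR worried" + filters)
# With brackets the whole query is a single branch.
grouped = split_top_level_or("(worry OR worried)" + filters)

print(ungrouped)
print(grouped)
```

The first call yields two branches, with the filters attached only to "worried"; the second yields a single branch, so the filters apply to both keywords.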

Oh! So in this case, since I am reading “each_item” from a separate keywords text file, what do you recommend I do? For some terms I have lots of ORs in one “each_item”, like:
((thought OR thoughts) (sleep OR tired) quit)
What do you suggest for this kind of case?
Where should the place_country:ca lang:en -is:retweet be added?

I honestly never thought the result would be like the one you have described :frowning:

Brackets to group the ORs should do it, remembering that a space is an AND.

So what you have looks like it will work:

((thought OR thoughts) (sleep OR tired) quit) place_country:ca lang:en -is:retweet

Surrounding the second half with brackets should also work and maybe make things more explicitly defined:

((thought OR thoughts) (sleep OR tired) quit) (place_country:ca lang:en -is:retweet)

I haven’t tested those queries myself; I hope I got them right.
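Since the keyword expressions come from a text file, one way to apply the bracket advice is to wrap every line before appending the filters. This is a minimal sketch: `build_query` and `FILTERS` are hypothetical names, the keyword list stands in for the lines of the file, and it assumes that redundant brackets around an already-bracketed or single-word expression are harmless.

```python
FILTERS = "place_country:ca lang:en -is:retweet"


def build_query(keyword_expr, filters=FILTERS):
    """Wrap the keyword expression in brackets, then AND it with the filters."""
    keyword_expr = keyword_expr.strip()
    if not keyword_expr:
        raise ValueError("empty keyword expression")
    # The outer brackets keep any top-level OR inside the keyword part.
    return f"({keyword_expr}) {filters}"


# Stand-in for the lines read from the keywords text file.
keyword_lines = [
    "word3",
    "worry OR worried",
    "((thought OR thoughts) (sleep OR tired) quit)",
]

queries = [build_query(line) for line in keyword_lines]
for query in queries:
    print(query)
```

Each resulting string can then be passed as the search term in place of each_item + " place_country:ca lang:en -is:retweet". The extra brackets add a few characters to every rule, which matters if a query is already close to the length limit.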

I find that using the Web search, without is:retweet and the other premium-specific operators, generally gives a decent preview if your query is made of ORs, words, and words excluded with -. Web search on Twitter behaves differently from premium, but it’s still useful for checking queries.

Alternatively, if you’re mainly using (and paying for) the fullarchive endpoint, the 30day sandbox is always useful for running experiments for “free”.

Also, using the counts endpoint before running the actual queries can give you a clue as to whether it’s worth running the query at all. See: Premium search APIs | Docs | Twitter Developer Platform

Hope that helps!

Thanks, I got what the issue was.
Unfortunately I have to wait for my account to renew, as I wasted my requests on wrongly built search terms :frowning: I will try it out with the 30day endpoint now, though.

I did try using the 30day endpoint for testing, as it was free, but I never realized that whitespace could have such a major effect, so I didn’t notice anything wrong :frowning:
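The results already saved from the mis-grouped queries may still be partly usable if they are filtered locally. The following is a minimal sketch, assuming each saved Tweet is a dict shaped like the example earlier in the thread: a Retweet carries a retweeted_status key, and a geo-tagged Tweet has a place object with a country_code. The helper names and the three sample dicts are hypothetical.

```python
def is_retweet(tweet):
    """A Retweet carries the original Tweet under 'retweeted_status'."""
    return "retweeted_status" in tweet


def place_country(tweet):
    """Return the place country code, or None when 'place' is null or missing."""
    place = tweet.get("place") or {}
    return place.get("country_code")


def keep_tweet(tweet, country_code="CA"):
    """Keep original Tweets whose place is in the requested country."""
    return not is_retweet(tweet) and place_country(tweet) == country_code


# Hypothetical, heavily trimmed stand-ins for saved Tweets.
saved = [
    {"id": 1, "place": None, "retweeted_status": {"id": 10}},
    {"id": 2, "place": {"country_code": "CA", "full_name": "Toronto, Ontario"}},
    {"id": 3, "place": None},
]

kept = [tweet for tweet in saved if keep_tweet(tweet)]
print([tweet["id"] for tweet in kept])
```

This only salvages the Tweets that happen to satisfy the intended filters; it does not recover matching Tweets that the mis-grouped query never returned.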
