Hello,

I am using postman to do my search and then I use python to flatten the json.

My example search query in Postman:
https://api.twitter.com/2/tweets/search/all?query=x%20OR%20x&max_results=500&start_time=2020-03-15T00%3A00%3A00Z&end_time=2020-06-15T11%3A59%3A59Z&place.fields=contained_within,country,full_name,geo&tweet.fields=author_id,created_at,geo,lang,conversation_id,public_metrics,possibly_sensitive,referenced_tweets&expansions=author_id,entities.mentions.username,geo.place_id,referenced_tweets.id.author_id&user.fields=description,entities,id,location,name,url,username,verified&next_token=xxxxxxxxxxx

Question 1: Is there any way to automate my search in Postman? Manually entering the next token seems endless :) I am sure there is a way and I just could not find it.

Question 2: When I manually entered the next token into my query, I noticed that the columns I am getting were slightly different. For instance, one output starts with the id column while the output from the next page starts with created_at. How can I make them consistent, to avoid matching problems when I want to combine them all into one CSV file?

Question 3: Each of my searches (i.e. my initial search, the next search using next_token, the one after that, etc.) produced a different number of results, such as 490 or 377 tweets. I was expecting to see 500 tweets on each page, because I asked for a maximum of 500 tweets per search. What does this depend on? I would like to understand.

Thank you!!

Yesim

Is there any way to automate my search in Postman? Manually entering the next token seems endless :) I am sure there is a way and I just could not find it.

I don’t know Postman well enough, but it’s usually used as a debugging interface. If you’re already using Python to process the data, there is really no need to use Postman for retrieval…
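If you do move retrieval into Python, pagination becomes a simple loop that follows next_token until the API stops returning one. Here is a minimal sketch; `paginate` and `fetch_page` are hypothetical names of my own, and the `requests` call shows one possible way to hit the v2 full-archive search endpoint (substitute your own bearer token and query):

```python
import requests

SEARCH_URL = "https://api.twitter.com/2/tweets/search/all"

def fetch_page(bearer_token, params, next_token=None):
    # One HTTP request to the v2 full-archive search endpoint.
    if next_token is not None:
        params = {**params, "next_token": next_token}
    resp = requests.get(
        SEARCH_URL,
        headers={"Authorization": f"Bearer {bearer_token}"},
        params=params,
    )
    resp.raise_for_status()
    return resp.json()

def paginate(get_page):
    # get_page(next_token) must return one parsed JSON page; keep
    # following meta.next_token until the API stops sending one.
    tweets, next_token = [], None
    while True:
        page = get_page(next_token)
        tweets.extend(page.get("data", []))
        next_token = page.get("meta", {}).get("next_token")
        if next_token is None:
            return tweets

# Usage (real call, needs a valid token):
# all_tweets = paginate(lambda tok: fetch_page(
#     "YOUR_BEARER_TOKEN", {"query": "x OR x", "max_results": 500}, tok))
```

Note that the full-archive endpoint is rate-limited, so a production version would also sleep between pages.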

one output starts with the id column while the output from the next page starts with created_at

This is normal for JSON-formatted output: the order of the fields in a JSON object is not guaranteed. When parsing JSON to add to a CSV, working with the keys and values will give consistent results in any good JSON implementation; you should not write your own parser for this.
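One way to get consistent columns regardless of key order is to fix the CSV header yourself and write rows with `csv.DictWriter`. A small sketch (the two sample "pages" are made-up data whose dicts deliberately list keys in different orders):

```python
import csv
import io

# Hypothetical pages: same fields, different key order per page.
page1 = [{"id": "1", "created_at": "2020-03-15T00:00:00Z", "text": "hello"}]
page2 = [{"created_at": "2020-03-16T00:00:00Z", "id": "2", "text": "world"}]

fieldnames = ["id", "created_at", "text"]  # decide the column order once

buf = io.StringIO()  # in a real script, open a file instead
writer = csv.DictWriter(buf, fieldnames=fieldnames, restval="")
writer.writeheader()
for page in (page1, page2):
    writer.writerows(page)  # key order in each dict no longer matters

print(buf.getvalue())
```

`restval=""` fills in an empty cell when a row is missing one of the chosen fields, which also helps when some tweets lack optional fields like geo.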

Here is an example of “flattening” the json into a CSV: Converting JSON output into csv or Excel - #17 by FlorineHenaff

Twarc will soon support v2 searches and do everything for you. You can try it with:
pip install https://github.com/DocNow/twarc/archive/v2.zip
Unfortunately the documentation is currently lacking, but running twarc2 --help after installing should give you some instructions.

I think this is mainly due to deletions, private accounts, suspended accounts, etc. Maybe the errors object in the response can help?
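If you want to check this yourself, v2 responses carry a meta.result_count and, when referenced content cannot be returned, an errors array you can inspect. A sketch against a hypothetical page (the error text below is illustrative, not a real API message):

```python
# Hypothetical page illustrating fewer results than max_results:
page = {
    "data": [{"id": "1"}, {"id": "2"}],
    "meta": {"result_count": 2},
    "errors": [
        {"detail": "Could not return referenced tweet (example message)."}
    ],
}

result_count = page.get("meta", {}).get("result_count", 0)
problems = [err.get("detail", "") for err in page.get("errors", [])]
print(result_count, problems)
```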

Hi,

Thank you for the link. I see I must enter my column names myself, then; I was flattening it directly like this:

import json
import pandas as pd

with open('/path') as f:
    data = json.load(f)
data_object = data['data']

df = pd.json_normalize(data_object)

new_df = df['referenced_tweets'].apply(pd.Series)
new_df.columns = ['rt']

norm = new_df['rt'].apply(pd.Series)

df['referenced_tweets.id'] = norm['id']
df['referenced_tweets.type'] = norm['type']
df
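One caveat with the approach above: referenced_tweets is a list that can hold more than one entry (e.g. a quote that is also a reply), and forcing it into a single 'rt' column silently drops the extras. A sketch of an alternative using DataFrame.explode, with made-up sample data shaped like the v2 "data" array:

```python
import pandas as pd

# Hypothetical sample: the second tweet references two tweets,
# and the third has no referenced_tweets key at all.
data_object = [
    {"id": "1", "referenced_tweets": [{"type": "retweeted", "id": "9"}]},
    {"id": "2", "referenced_tweets": [{"type": "quoted", "id": "8"},
                                      {"type": "replied_to", "id": "7"}]},
    {"id": "3"},
]

df = pd.json_normalize(data_object)
# One row per referenced tweet; tweets without references keep one row.
df = df.explode("referenced_tweets", ignore_index=True)
refs = df["referenced_tweets"].apply(
    lambda x: x if isinstance(x, dict) else {}
)
df["referenced_tweets.type"] = refs.apply(lambda d: d.get("type"))
df["referenced_tweets.id"] = refs.apply(lambda d: d.get("id"))
```

This keeps every reference at the cost of duplicating the parent tweet's row, which is usually what you want for a flat CSV.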

I installed twarc, but if you do not mind, can you share code for retrieving tweets? I actually could not find where to put my bearer token, for example.

I have this code:

from twarc import Twarc

# Here I am supposed to write my keys, but the example did not include a
# bearer token, and it did not really work with the academic API without one.
t = Twarc(' ')

# I have more than one search word (I would use OR in Postman), and I would
# need to add dates and the fields I require. I am sorry to ask this much,
# but I am perhaps not familiar enough with this :sob:
for tweet in t.search("mysearchwords"):
    print(tweet["text"])

And about the number of tweets not being 500, thank you, it makes so much sense!

Best,

Y.