I am trying to use the filtered stream to collect real-time tweets posted by specific users. Unfortunately, I am getting errors when I run my code. Could anyone please assist me?

def save_tweets(tweet):
    print(json.dumps(tweet, indent=4, sort_keys=True))
    data = tweet['data']
    includes = tweet['includes']
    user = includes['users']
    for line in user:
        tweet_list.append([data['id'], data['created_at'], data['text'], data['conversation_id'], line['username']])

max_results = 20

save_media_to_disk = False
save_path = ""

search_rules = [
    {
        "value": "-is:retweet (from:NWSNHC OR from:NHC_Atlantic OR from:NWSHouston OR from:NWSSanAntonio OR from:USGS_TexasRain OR from:USGS_TexasFlood OR from:JeffLindner1)"
    }
]

user_fields = "&user.fields=username"
expansions = "?expansions=author_id"
tweet_list = []

bearer_token = ts.BEARER_TOKEN
headers = create_headers(bearer_token)
rules = get_rules(headers, bearer_token)
delete = delete_all_rules(headers, bearer_token, rules)
set = set_rules(headers, delete, bearer_token, search_rules)
get_stream(headers, set, bearer_token, expansions, user_fields, save_media_to_disk, save_path)

df = pd.DataFrame (tweet_list, columns = ['tweetid', 'created_at', 'text', 'conversation_id', 'username'])
df

There’s not enough here to tell what’s wrong: the snippet is missing the relevant code for get_stream.

I would recommend using a library or tool like twarc to do this instead.

As a command-line tool: twarc2

Or as a library: twarc.Client2

Hi Igor

Please see the code below.

# this function starts the stream

def get_stream(headers, set, bearer_token, expansions, fields, save_to_disk, save_path):
    data = []
    response = requests.get(
        "https://api.twitter.com/2/tweets/search/stream" + expansions + fields,
        headers=headers, stream=True,
    )
    print(response.status_code)
    if response.status_code != 200:
        raise Exception(
            "Cannot get stream (HTTP {}): {}".format(
                response.status_code, response.text
            )
        )
    i = 0
    for response_line in response.iter_lines():
        i += 1
        if i == max_results:
            break
        else:
            json_response = json.loads(response_line)
            # print(json.dumps(json_response, indent=4, sort_keys=True))
            try:
                save_tweets(json_response)
                if save_to_disk == True:
                    save_media_to_disk(json_response, save_path)
            except (json.JSONDecodeError, KeyError) as err:
                # In case the JSON fails to decode, we skip this tweet
                print(f"{i}/{max_results}: ERROR: encountered a problem with a line of data... \n")
                continue
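
One thing worth noting about the loop above: json.loads is called outside the try block, so a line that is not valid JSON would raise instead of being skipped. As far as I know, the filtered stream also sends blank keep-alive lines, which iter_lines() yields as empty bytes. Here is a minimal sketch of a more defensive loop. The helper name parse_stream_lines is hypothetical, and it works on any iterable of raw lines so it can be tried without a network connection.

```python
import json


def parse_stream_lines(lines, max_results):
    """Decode up to max_results JSON payloads from an iterable of raw lines.

    Hypothetical helper: `lines` can be response.iter_lines() or a plain list.
    """
    payloads = []
    for raw in lines:
        if not raw:
            # Blank keep-alive line: nothing to decode, so skip it.
            continue
        try:
            payloads.append(json.loads(raw))
        except json.JSONDecodeError:
            # Malformed line: skip it instead of stopping the stream.
            continue
        if len(payloads) >= max_results:
            break
    return payloads


# Made-up sample lines standing in for the HTTP stream.
sample = [
    b'{"data": {"id": "1", "text": "hello"}}',
    b"",
    b"not json",
    b'{"data": {"id": "2", "text": "world"}}',
]
tweets = parse_stream_lines(sample, max_results=20)
print([t["data"]["id"] for t in tweets])  # prints ['1', '2']
```

Counting decoded payloads rather than raw lines also means keep-alive lines do not use up max_results.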

# this function saves a tweet

def save_tweets(tweet):
    print(json.dumps(tweet, indent=4, sort_keys=True))
    data = tweet['data']
    includes = tweet['includes']
    user = includes['users']
    for line in user:
        tweet_list.append([data['id'], data['created_at'], data['text'], data['conversation_id'], line['username']])
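
save_tweets indexes every key directly, so a payload without created_at, conversation_id or includes raises KeyError and the tweet is dropped. A minimal sketch of a more tolerant version follows. The name tweet_to_row is hypothetical, and unlike the original it returns one row per tweet, looking the author up by author_id rather than adding a row for every user in includes.

```python
def tweet_to_row(tweet):
    """Build one row per tweet, using None for any field that is absent.

    Hypothetical helper, not part of the original script.
    """
    data = tweet.get("data", {})
    users = tweet.get("includes", {}).get("users", [])
    # Map user id -> username so the author can be found via author_id.
    usernames = {u.get("id"): u.get("username") for u in users}
    return [
        data.get("id"),
        data.get("created_at"),
        data.get("text"),
        data.get("conversation_id"),
        usernames.get(data.get("author_id")),
    ]


# Made-up payload in the shape the script expects.
sample = {
    "data": {"id": "1", "text": "hi", "author_id": "42"},
    "includes": {"users": [{"id": "42", "username": "NWSHouston"}]},
}
row = tweet_to_row(sample)
print(row)  # prints ['1', None, 'hi', None, 'NWSHouston']
```

Rows built this way can still be appended to tweet_list and passed to pd.DataFrame with the same column names.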

# the max number of tweets that will be returned

max_results = 20

# save to disk

save_media_to_disk = False
save_path = ""

# You can adjust the rules if needed

search_rules = [
    {
        "value": "-is:retweet (from:NWSNHC OR from:NHC_Atlantic OR from:NWSHouston OR from:NWSSanAntonio OR from:USGS_TexasRain OR from:USGS_TexasFlood OR from:JeffLindner1)"
    }
]

user_fields = "&user.fields=username"
expansions = "?expansions=author_id"
tweet_list = []

bearer_token = ts.BEARER_TOKEN
headers = create_headers(bearer_token)
rules = get_rules(headers, bearer_token)
delete = delete_all_rules(headers, bearer_token, rules)
set = set_rules(headers, delete, bearer_token, search_rules)
get_stream(headers, set, bearer_token, expansions, user_fields, save_media_to_disk, save_path)

df = pd.DataFrame (tweet_list, columns = ['tweetid', 'created_at', 'text', 'conversation_id', 'username'])
df
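
A side note on the query string: as far as I know, the v2 endpoints return only a small default set of tweet fields (id and text), so created_at and conversation_id have to be requested through tweet.fields, otherwise save_tweets will hit a KeyError. Below is a small sketch of building the parameters with urllib instead of concatenating strings by hand. The function name build_stream_url is hypothetical; the endpoint URL is the one used in the script.

```python
from urllib.parse import urlencode

STREAM_URL = "https://api.twitter.com/2/tweets/search/stream"


def build_stream_url(base_url=STREAM_URL):
    """Return the stream URL with expansions and field parameters attached.

    Hypothetical helper; the field list mirrors what save_tweets reads.
    """
    params = {
        "expansions": "author_id",
        "user.fields": "username",
        # Assumption: these fields are not returned unless requested here.
        "tweet.fields": "created_at,conversation_id",
    }
    # safe="," keeps the comma-separated field list readable in the URL.
    return base_url + "?" + urlencode(params, safe=",")


url = build_stream_url()
print(url)
```

The resulting URL can be passed straight to requests.get in place of the concatenated string.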

I can’t see anything obviously wrong in the code you posted. If you keep getting errors, it may be easier to use twarc as a command-line tool (twarc2) to run the queries and write output files, since you’re not modifying the original responses.

You can then use twarc-csv to get a DataFrame: GitHub - DocNow/twarc-csv, a plugin for twarc2 that converts tweet JSON into DataFrames and exports to CSV.

This is the error I am getting.

    bearer_token = ts.BEARER_TOKEN
NameError: name 'ts' is not defined

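This depends on what ts is in your code. The NameError means nothing named ts has been imported or assigned before that line runs, so either the import is missing or it failed to load.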
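
If ts was meant to be a module of your own that holds the token (that is only a guess from the name), it needs an import line before it is used. An alternative that avoids the extra module is to read the token from an environment variable. A minimal sketch, where both the helper name load_bearer_token and the variable name BEARER_TOKEN are assumptions:

```python
import os


def load_bearer_token(env_var="BEARER_TOKEN"):
    """Read the bearer token from an environment variable.

    Hypothetical helper; the variable name is an assumption, use your own.
    """
    token = os.environ.get(env_var)
    if not token:
        # Fail early with a clear message instead of a NameError later on.
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return token
```

With this in place, bearer_token = load_bearer_token() would replace bearer_token = ts.BEARER_TOKEN.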