Hello!
I am working on creating some stream rules, and I would like to include a colon in my match request. I’m trying to match bible citations, so I figured a rule like (matthew OR mark OR luke OR john) (1: OR 2: OR 3: etc.) would work because it would be very uncommon to see that pattern “John 3:” in anything beyond a biblical citation “John 3:16”
However, I’m running into an issue where whenever I include a colon that is not descriptive, I get a “missing EOF” error upon posting the rule to the API.
Does anyone have any thoughts here? Do I need to escape the colons somehow? Should I write the rule differently?
Any help would be appreciated.
Thanks!
Colons are not indexed and not matched, unfortunately. See: Search Tweets - How to build a query | Docs | Twitter Developer Platform. The errors you’re seeing probably have something to do with escaping special characters, but that’s a separate issue.
Best thing to do to match John 3:16 is probably to match the phrase:
(john "3 16")
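To illustrate why the phrase form helps, here is a toy model in Python. It is only a sketch: the `tokenize` and `matches` helpers are hypothetical and are not Twitter's actual tokenizer or matching logic. It simply assumes punctuation such as the colon is dropped before matching, as described above, so "3:16" becomes the two adjacent tokens "3" and "16".

```python
import re


def tokenize(text):
    # Toy assumption: lowercase the text and keep only runs of letters/digits,
    # so "John 3:16" becomes ["john", "3", "16"].
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(tokens, phrase_tokens):
    # True if phrase_tokens appear consecutively somewhere in tokens.
    n = len(phrase_tokens)
    return any(tokens[i:i + n] == phrase_tokens
               for i in range(len(tokens) - n + 1))


def matches(text, keywords, phrase):
    # Models a rule like (john "3 16"): every keyword AND the exact phrase.
    tokens = tokenize(text)
    return (all(k in tokens for k in keywords)
            and contains_phrase(tokens, tokenize(phrase)))


print(matches("For God so loved the world (John 3:16)", ["john"], "3 16"))
print(matches("John and I hung out until 3 last night", ["john"], "3 16"))
```

Under this toy model the first text matches and the second does not, because "3" and "16" must sit next to each other for the quoted phrase to match.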
The colon (:) is used in Twitter’s stream filtering rules as a way to specify a value for a particular field. For example, if you want to filter a stream to only include tweets that contain the word “news,” you would use the following rule:
track:news
This rule tells the stream to only include tweets that have the word “news” in them. You can use multiple fields in a stream filter rule, separated by commas. For example, the following rule would filter the stream to only include tweets that are in English and contain the word “politics”:
language:en,track:politics
There are many different fields that you can use in stream filter rules, including language, track, location, and follow. You can find more information about these fields and how to use them in the Twitter API documentation.
Did ChatGPT write this? This is totally wrong and makes no sense.
Ahhh that is very unfortunate. The colon is required for standard bible notation so there isn’t really a way around that. In order to include the “3 16” I’d have to build a search rule for every verse.
So is (john OR …) (1 OR 2 OR …) probably going to be the best rule match? Essentially I’d be looking for any tweet that contains something in (group 1)(a space / and)(something in group 2). However, if I understand correctly, the (something in group 1) and (something in group 2) would make “John and I hung out until 3 last night” match, right? Since it contains both “John” and “3” in the tokenized match?
Thanks for your help so far, knowing that colons are not indexed is very helpful.
Yeah unfortunately.
Since the rules can be somewhat long though, and bible books are fewer than verses, maybe you can try a search like:
"Genesis 28" OR "Exodus 29" OR "Psalm 85" OR "Joshua 22" OR "Psalm 35" OR "Judges 11" OR "Acts 9" OR "Ezekiel 44" OR "Judges 18" OR "Leviticus 24" OR "Exodus 38" OR "Judges 1" OR "2 Corinthians 5" OR "Nahum 3" OR "John 6" OR "Acts 4" OR "2 Thessalonians 3" OR "Luke 15" OR "Luke 19"...
This seems to find a good result for me on first look.
I just used https://openbible.com/textfiles/kjv.txt, extracted the book and chapter names, and joined them into a single giant query, but you will have to split this into about 37 different 512-character queries for v2 Searches:
with open("kjv.txt") as f:
    lines = f.readlines()
# Skip the first two lines; keep the text before the first ":" (e.g. "Genesis 1"), quoted.
book_names = {f'"{line.split(":")[0]}"' for line in lines[2:]}
long_query = " OR ".join(book_names)
print(long_query)
Also just noticed: you can get rid of ones like "2 Thessalonians 3" and just have "Thessalonians 3" as a query, which will match more.
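For the splitting step, something like the following might work. It is a rough sketch, not a tested integration with the API: `strip_leading_number` and `split_into_queries` are made-up helper names, the 512-character limit is taken from the post above (check the limit for your access level), and the short term list stands in for the real list extracted from kjv.txt.

```python
import re

MAX_QUERY_LEN = 512  # assumed limit, taken from the post above


def strip_leading_number(term):
    # "2 Thessalonians 3" -> "Thessalonians 3"; terms without a prefix are unchanged.
    return re.sub(r"^\d+\s+", "", term)


def split_into_queries(terms, max_len=MAX_QUERY_LEN):
    # Greedily pack quoted terms into ' OR '-joined queries of at most max_len characters.
    queries, current = [], ""
    for term in terms:
        quoted = f'"{term}"'
        if len(quoted) > max_len:
            raise ValueError(f"term too long for a single query: {term}")
        candidate = quoted if not current else f"{current} OR {quoted}"
        if len(candidate) <= max_len:
            current = candidate
        else:
            queries.append(current)
            current = quoted
    if current:
        queries.append(current)
    return queries


# Stand-in data; the set removes duplicates created by stripping the prefixes.
raw_terms = ["Genesis 28", "2 Thessalonians 3", "1 Thessalonians 3", "John 6"]
terms = sorted({strip_leading_number(t) for t in raw_terms})

# A small max_len is used here only to show the splitting on a short list.
for query in split_into_queries(terms, max_len=40):
    print(query)
```

Sorting the terms is optional; it just makes the output stable between runs, since a set has no fixed order.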