Why isn't this URL indexed the same?

search
urls

#1

Can someone help me understand what might be going on here? Here is a tweet that I sent a few minutes ago that contains two URLs:

The two URLs are:

If I search via the Twitter website I can find the first one:

url:www.localmemory.org/vis/collections/local-memory-project/queries/usa-63135-protests-10

but I can’t seem to find the tweet using a search for the second url:

url:www.stlamerican.com/news/local_news/protests-resumed-in-ferguson-at-least-one-injured-by-gunfire/article_d3b3b1ea-ee39-11e4-9025-531f05e20e9f.html

I thought maybe only the first one was being indexed so I tweeted them in a different order:

Now only the second URL is searchable? Is it possible that this URL is just not indexable for some reason, such as its length? http://www.stlamerican.com/news/local_news/protests-resumed-in-ferguson-at-least-one-injured-by-gunfire/article_d3b3b1ea-ee39-11e4-9025-531f05e20e9f.html

I’m adding this here because I am seeing the same behavior from the search REST API.
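For anyone who wants to reproduce this against the REST API, here is a minimal sketch of how the two kinds of query (with and without the url: prefix) could be built. It assumes the v1.1 standard search endpoint at https://api.twitter.com/1.1/search/tweets.json with the query in the `q` parameter; the helper names `strip_scheme` and `build_search_request` are hypothetical, and authentication is left out entirely.

```python
from urllib.parse import urlencode, urlsplit

# Assumption: the v1.1 standard search endpoint. Requests to it must be
# authenticated (OAuth), which this sketch does not cover.
SEARCH_ENDPOINT = "https://api.twitter.com/1.1/search/tweets.json"


def strip_scheme(target_url):
    """Return host + path, matching the form used in the url: queries above."""
    parts = urlsplit(target_url)
    return parts.netloc + parts.path


def build_search_request(target_url, use_prefix=True):
    """Build the request URL for a search on target_url.

    With use_prefix=True the query is "url:<host+path>", otherwise it is
    just "<host+path>". urlencode percent-encodes the ':' and '/' characters.
    """
    term = strip_scheme(target_url)
    query = "url:" + term if use_prefix else term
    return SEARCH_ENDPOINT + "?" + urlencode({"q": query})


if __name__ == "__main__":
    article = (
        "http://www.stlamerican.com/news/local_news/"
        "protests-resumed-in-ferguson-at-least-one-injured-by-gunfire/"
        "article_d3b3b1ea-ee39-11e4-9025-531f05e20e9f.html"
    )
    print(build_search_request(article, use_prefix=True))
    print(build_search_request(article, use_prefix=False))
```

Sending both requests for the same URL and comparing the results should show whether the API treats the prefixed and unprefixed forms differently.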


#2

Have you tried tweeting with just the stlamerican article? Maybe something within the URL is causing the issue, or it could be the length of the URL. Try a shorter version like: http://www.stlamerican.com/article_d3b3b1ea-ee39-11e4-9025-531f05e20e9f.html

The situation does seem odd indeed.


#3

To add another layer of strangeness, I just noticed that the first URL is searchable both with and without the url: prefix. The second URL is not searchable with the url: prefix, but it is searchable without it.


#4

Thanks for the idea @DanielCHood. I just tweeted that one URL, and my tweet can be found by searching for the bare URL but not by searching with the url: prefix.

Trying to understand what might be going on here is making me begin to doubt my sanity!


#5

Might it be the underscore(s)? (Does a url: search work for other underscores-in-path_info URLs you can check?)
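As a rough way to pick test candidates for the underscore idea, the sketch below flags which URLs have an underscore in the path. The helper names `url_path` and `path_has_underscore` are hypothetical; it only inspects the strings and says nothing about how the search index actually tokenizes them.

```python
def url_path(url_or_term):
    """Return the path part of a URL, with or without a scheme."""
    # Drop "http://" or "https://" if present, then split host from path.
    without_scheme = url_or_term.split("://", 1)[-1]
    _host, _sep, path = without_scheme.partition("/")
    return path


def path_has_underscore(url_or_term):
    """True if the path (not the host) contains at least one underscore."""
    return "_" in url_path(url_or_term)


if __name__ == "__main__":
    candidates = [
        "www.localmemory.org/vis/collections/local-memory-project/queries/usa-63135-protests-10",
        "http://www.stlamerican.com/news/local_news/protests-resumed-in-ferguson-at-least-one-injured-by-gunfire/article_d3b3b1ea-ee39-11e4-9025-531f05e20e9f.html",
        "http://www.stlamerican.com/article_d3b3b1ea-ee39-11e4-9025-531f05e20e9f.html",
    ]
    for candidate in candidates:
        print(path_has_underscore(candidate), candidate)
```

For the URLs in this thread, the one that is found with url: has no underscore in its path, while both stlamerican URLs do. That is consistent with the underscore guess but does not prove it.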