I’ll report to the search quality team that the search widget isn’t picking up mentions in this case. Do you see any improvement if you use operators such as “from:” or “to:” as hints to the system that you’re looking for Tweets either authored by or directed to the usernames in question? In this case, it looks as if the efforts to improve search quality have instead degraded your search experience.
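To make the operator suggestion concrete, here is a rough sketch of how such a query string might be assembled. The helper name `build_query` and the username `exampleuser` are made up for illustration, and joining terms with `OR` is an assumption about how you would want to combine them; adjust to whatever your widget or client actually accepts.

```python
from urllib.parse import quote


def build_query(usernames):
    """Build a search query using the from: and to: operators.

    Hypothetical helper: for each username it emits one term for Tweets
    authored by the user and one for Tweets directed to the user, then
    joins all terms with OR (an assumed way of combining them).
    """
    terms = []
    for name in usernames:
        handle = name.lstrip("@")  # accept "@name" or "name"
        terms.append(f"from:{handle}")
        terms.append(f"to:{handle}")
    return " OR ".join(terms)


query = build_query(["@exampleuser"])
# URL-encode the query before placing it in a request URL.
encoded_query = quote(query)
print(query)
print(encoded_query)
```

If the results look better with the operators in place, that would be useful detail to include in the report.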
As for the evolution of search:
The search service for Twitter has always been focused on serving relevant results to ad-hoc, user-mediated queries. This means that the main purpose of Search, as we see it, is to satisfy a user who is providing a query at a specific moment in time. Search is not an exhaustive index of all public Tweets on Twitter (and never has been), and its responses adapt to what the service perceives as most relevant for the query. As part of our continued efforts to focus on relevance and quality, we’ve been experimenting with a few result quality levers lately. These experiments have focused on search widgets, but you’re likely to see the same refinements elsewhere on Twitter.
In most cases, if a client wants to provide a more deterministic view of Tweets, they’ll build a Streaming API-based integration and cache Tweets before displaying them to end users. A side benefit is that they no longer need to worry about client-side rate limiting, a problem shared by all of our widgets. When you use the widgets and the Search API, you lose a bit of the control and determinism that you may find in a versioned API like the Streaming API. If you or your clients are after completeness, the Search API and widgets will never be the best options.
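As a minimal sketch of the "stream, then cache, then display" pattern, something along these lines could sit between a streaming connection and your page. Everything here is illustrative: `TweetCache` and `ingest` are hypothetical names, the stream is stood in for by a plain list, and the assumption is that each Tweet arrives as a dict with `id` and `text` keys. The actual Streaming API connection and authentication are left out.

```python
from collections import OrderedDict


class TweetCache:
    """Bounded, de-duplicating cache of Tweets keyed by id (illustrative)."""

    def __init__(self, max_size=500):
        self.max_size = max_size
        self._tweets = OrderedDict()  # insertion order == arrival order

    def add(self, tweet):
        """Store a Tweet; return False if its id was already cached."""
        tweet_id = tweet["id"]
        if tweet_id in self._tweets:
            return False
        self._tweets[tweet_id] = tweet
        # Evict the oldest entries once the cache exceeds its bound.
        while len(self._tweets) > self.max_size:
            self._tweets.popitem(last=False)
        return True

    def latest(self, count=20):
        """Return up to `count` cached Tweets, newest first."""
        if count <= 0:
            return []
        return list(self._tweets.values())[-count:][::-1]


def ingest(stream, cache):
    """Feed every Tweet from an iterable (a stand-in for a stream) into the cache."""
    for tweet in stream:
        cache.add(tweet)


# Stand-in for Tweets arriving from a streaming connection.
sample_stream = [
    {"id": 1, "text": "first"},
    {"id": 2, "text": "second"},
    {"id": 2, "text": "second"},  # duplicate delivery, ignored
    {"id": 3, "text": "third"},
]

cache = TweetCache(max_size=2)
ingest(sample_stream, cache)
for tweet in cache.latest():
    print(tweet["id"], tweet["text"])
```

Because end users read from your cache rather than querying Twitter directly, what they see is determined by what you stored, and their page loads do not count against client-side rate limits.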