Twitter caching your robots.txt file?


#1

It seems like Twitter caches robots.txt settings, or something like that. Whatever I do, previewing my cards always shows the following error:

“Pre-fetching image http://creativeskills.be/cache/flickrjobs/visual-designer-xaop-13-11-12_25.png failed because it’s denied by robots.txt”

While my robots.txt file is exactly this:

User-agent: Twitterbot
Disallow: /cache/
Allow: /cache/images/

User-agent: *
Disallow: /cache/
Allow: /cache/images/

You can check the robots.txt file here: http://creativeskills.be/robots.txt
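
For reference, here’s how a standard parser reads those rules. One thing I notice is that the image in the error lives under /cache/flickrjobs/, which matches the Disallow: /cache/ rule and not the Allow: /cache/images/ exception. This is only a rough local check with Python’s urllib.robotparser, which isn’t necessarily the parser Twitter uses:

import urllib.robotparser

rules = """\
User-agent: Twitterbot
Disallow: /cache/
Allow: /cache/images/

User-agent: *
Disallow: /cache/
Allow: /cache/images/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

image = "http://creativeskills.be/cache/flickrjobs/visual-designer-xaop-13-11-12_25.png"
# /cache/flickrjobs/ matches Disallow: /cache/, so this prints False
print(rp.can_fetch("Twitterbot", image))

# hypothetical path under /cache/images/: this parser applies rules in order
# (first match wins), so Disallow: /cache/ shadows the later Allow line and
# this also prints False; parsers that prefer the longest match would allow it
print(rp.can_fetch("Twitterbot", "http://creativeskills.be/cache/images/example.png"))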

Even when I empty the robots.txt file I get this error. But the weird thing is that sometimes (let’s say in 5% of cases) the preview successfully shows the card, yet when I try again (with the same URL and without making any changes to robots.txt) I get the above error.

So it looks like the Twitter Preview tool is unreliable or just moody.


#2

I’m facing the same issue. I verified my robots.txt using http://tools.seobook.com/robots-txt/analyzer/, which reports:

Multiple robot rules found
Robots allowed: Twitterbot
Robots disallowed: All robots


#3

We cache files (including robots.txt and images) for 24 hours. Try again after that delay; tweeting a link to another page with cards markup (or to the same page through a link shortener) is a good way to trigger a recrawl.
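
While waiting out that delay, it can help to confirm locally that the page will pass once it is recrawled. A rough pre-flight sketch (the page URL is a placeholder, and the meta-tag check is a naive string search rather than anything Twitter actually runs):

import urllib.request
import urllib.robotparser
from urllib.parse import urljoin

page = "http://example.com/some-page/"  # placeholder -- substitute the page you want to validate

# 1. does robots.txt let Twitterbot fetch the page?
rp = urllib.robotparser.RobotFileParser(urljoin(page, "/robots.txt"))
rp.read()
print("Twitterbot allowed:", rp.can_fetch("Twitterbot", page))

# 2. does the page carry card markup at all? (naive check for a twitter:card meta tag)
html = urllib.request.urlopen(page).read().decode("utf-8", errors="replace")
print("twitter:card meta tag found:", 'name="twitter:card"' in html)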


#4

I’m having the same issue.


#5

I allowed everything for Twitterbot and, for the rest, disallowed what was necessary to disallow, and that worked just fine. Thanks!
User-agent: Twitterbot
Disallow:
User-Agent: *
Disallow: /something/
Disallow: /something-else/
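
For what it’s worth, a quick local check with Python’s urllib.robotparser (the paths are the placeholders from above) shows why this works: Twitterbot matches its own group, whose empty Disallow permits everything, while every other agent falls back to the * group:

import urllib.robotparser

rules = """\
User-agent: Twitterbot
Disallow:

User-Agent: *
Disallow: /something/
Disallow: /something-else/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Twitterbot", "/something/page.html"))    # True: empty Disallow allows everything for Twitterbot
print(rp.can_fetch("SomeOtherBot", "/something/page.html"))  # False: other agents fall back to the * rules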


#6

An absurd situation for a test tool.


#7

A 24-hour cache for a test tool?
So weird!


#9

Hi,
I am also getting “Denied by robots.txt” when using the validator tool. Why is this error coming up, and how do I fix it? Please advise.


#10

At the root of your site, you have a robots.txt file that is explicitly preventing our crawler from accessing your content. Review the instructions for robots.txt at https://dev.twitter.com/docs/cards/getting-started to resolve the issue and try to re-submit.

-Sean


#11

This is totally unacceptable for a validator tool. All you need to add is a way to flush the cache. Epic fail for Twitter.


#12

Hi,
I’m trying to validate a link that was already validated successfully in the past. Now I get the error “Denied by robots.txt”. There is no robots.txt on the site. What can I do to solve this?


#13

Epic fail is right.


#14

So weird! It would be interesting to know why Twitter needs to cache it for this dev tool… #fail


#15

What kind of validation tool caches for 24 hours? One mistake and your whole day is wasted.


#16

I agree with the negative sentiment here. When I develop WordPress sites, I disallow search engines from indexing them until I am done. This creates the robots.txt response that is blocking the card validator, and now I can’t do anything but wait 24 hours?

Putting an entire task list on hold because of these little annoyances (like not being able to control the cache of my own robots.txt file) makes even light Twitter integrations very frustrating to deal with.
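
For anyone hitting the same thing: as far as I can tell, that WordPress setting serves a blanket User-agent: * / Disallow: / block, and a quick local check confirms it catches Twitterbot too unless you add a separate Twitterbot group like the one in post #5:

import urllib.robotparser

# what the "discourage search engines" setting serves, as far as I can tell
wp_rules = """\
User-agent: *
Disallow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(wp_rules.splitlines())

print(rp.can_fetch("Twitterbot", "/any-page/"))  # False: Twitterbot falls under * and is blocked with everything else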


#17

Along with the user agent, what HTTP request method does the Twitter bot use? I have everything but GET blocked; do I need to open up another method?


#18

Has Twitter fixed this yet? I tried mine on Thursday evening, so it’s been more than 24 hours now, but it still says “Denied by robots.txt”. Any updates on this?