Robots.txt Tester

Validate your robots.txt file and check crawling rules


About Robots.txt

The robots.txt file tells search engine crawlers which pages they can and cannot access on your site.

Basic Syntax

User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /public/

Sitemap: https://example.com/sitemap.xml

Common Directives

  • User-agent: Specifies which bot the rules apply to (* = all)
  • Disallow: Blocks access to specified paths
  • Allow: Explicitly permits a path, even inside a disallowed directory (see the example after this list)
  • Sitemap: Points to your XML sitemap
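
When Disallow and Allow rules conflict, major crawlers apply the most specific (longest) matching rule. A minimal sketch, using hypothetical paths, of a broad Disallow being re-opened by a more specific Allow:

User-agent: *
# Block the whole downloads directory...
Disallow: /downloads/
# ...but permit one subdirectory; the longer, more specific rule wins
Allow: /downloads/free/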

Best Practices

  • Place at root: yoursite.com/robots.txt
  • Don't block CSS/JS files (search engines need them to render your pages)
  • Use it to keep crawlers out of staging sites and admin areas
  • Include a sitemap reference
  • Test before deploying (see the sketch after this list)
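
A rough way to test rules before deploying is Python's built-in urllib.robotparser. The sketch below checks two placeholder URLs against an inline copy of the rules; example.com and the paths are assumptions, not your real site:

from urllib.robotparser import RobotFileParser

# Inline copy of the rules about to be deployed; swap in your own file's contents.
rules = """\
User-agent: *
Disallow: /admin/
Allow: /public/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Ask whether a generic crawler ("*") may fetch each URL.
for url in ("https://example.com/public/page.html",
            "https://example.com/admin/settings"):
    verdict = "allowed" if parser.can_fetch("*", url) else "blocked"
    print(url, "->", verdict)

This only checks path matching; it does not confirm that the file is actually reachable at yoursite.com/robots.txt.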

Security Note

⚠️ Robots.txt is NOT a security measure. Malicious bots ignore it, and anyone can read the file at /robots.txt, so listing sensitive paths can actually advertise them. Use proper authentication for sensitive content.

Frequently Asked Questions

Is robots.txt required?

No, but it's recommended. Without it, all pages are crawlable by default. Use it to block admin areas, staging sites, or duplicate content.

Does robots.txt stop all bots?

No, it's a guideline that ethical bots follow. Malicious bots ignore it. Never rely on robots.txt for security.

Can I block specific search engines?

Yes, use User-agent directives. For example: "User-agent: Googlebot" for Google, "User-agent: Bingbot" for Bing.
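
For instance, a file like this (a sketch with placeholder paths) shuts out one crawler entirely while everyone else only loses the admin area. A crawler obeys only the group with the most specific matching User-agent, so Bingbot here ignores the * group:

User-agent: Bingbot
Disallow: /

User-agent: *
Disallow: /admin/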

What happens if I block too much?

Blocking CSS/JS files can hurt SEO. Blocking entire sections might hide valuable content from search engines.
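
If a disallowed directory also holds stylesheets or scripts, more specific Allow rules can keep them reachable. A sketch assuming a hypothetical /assets/ directory; the * and $ wildcards are pattern extensions supported by major crawlers such as Googlebot and Bingbot:

User-agent: *
Disallow: /assets/
Allow: /assets/*.css$
Allow: /assets/*.js$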

Want Automated Monitoring?

Get 24/7 monitoring with instant alerts when issues are detected.

Start Free Trial