Generate a clean robots.txt file for Google & SEO in seconds.
Build a perfectly structured robots.txt file that tells search engines exactly what to crawl and what to ignore. Shape crawl budget, hide thin content, and help Googlebot discover your best pages without touching code.
Start from SEO‑ready presets for WordPress, Shopify or Next.js, tweak disallow rules, attach your XML sitemap and run instant robots.txt checks — all locally in your browser. No server upload, no login, no risk.
How to create a robots.txt file in 3 steps
Use this free robots.txt generator to build a clean, SEO‑friendly robots.txt file for Google, Bing and other crawlers. No coding, no FTP and no plugin required.
Pick a preset for your platform
Choose from presets like “Allow all crawlers”, “Block all crawlers”, WordPress, Shopify or Next.js. Each preset loads safe default User‑agent and Disallow rules you can use as a starting point.
Tweak rules for SEO and privacy
Add your XML sitemap URL, hide admin and login paths, exclude search results or tracking parameters, and optionally add advanced directives like crawl‑delay or Googlebot‑specific rules.
Copy or download robots.txt
Once you are happy with the preview, copy the robots.txt content or download it as a .txt file. Upload it to your site root or configure it inside your CMS, then check it with the robots.txt report in Google Search Console.
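The three steps above amount to turning a small set of choices (user-agents, blocked paths, sitemap URLs) into plain text. The sketch below shows one way that could look in Python. The `Group` class, the `build_robots_txt` function and the example paths are hypothetical names chosen for illustration, not this tool's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """One rule block: the user-agents it targets plus its path rules."""
    user_agents: list[str]
    disallow: list[str] = field(default_factory=list)
    allow: list[str] = field(default_factory=list)


def build_robots_txt(groups: list[Group], sitemaps: list[str]) -> str:
    """Render rule groups and sitemap URLs as robots.txt text."""
    lines: list[str] = []
    for group in groups:
        lines += [f"User-agent: {ua}" for ua in group.user_agents]
        rules = [f"Allow: {path}" for path in group.allow]
        rules += [f"Disallow: {path}" for path in group.disallow]
        # A group with no rules is written as an empty Disallow, which blocks nothing.
        lines += rules or ["Disallow:"]
        lines.append("")  # blank line separates groups
    lines += [f"Sitemap: {url}" for url in sitemaps]
    return "\n".join(lines).rstrip("\n") + "\n"


# Hypothetical preset: hide admin and login paths, attach one sitemap.
preset = [Group(user_agents=["*"], disallow=["/admin/", "/login/"])]
robots_txt = build_robots_txt(preset, ["https://example.com/sitemap.xml"])
print(robots_txt)
```

The printed text is what you would copy or save as robots.txt in the final step.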
Robots.txt generator features for real‑world SEO
Everything you need to create, check and fine‑tune your robots.txt file for Google, WordPress, Shopify, Next.js and custom setups — all in one free browser tool.
Smart robots.txt presets
Start from safe, opinionated presets for popular stacks: generic allow all, block all, WordPress, Shopify and Next.js. Each preset uses battle‑tested crawl rules you can trust.
SEO‑friendly default rules
Hide login, admin and system folders, optional search result pages and common tracking parameters while keeping important content accessible so your organic visibility is not hurt.
Sitemap integration
Add one or more XML sitemap URLs to your robots.txt file in a click. Help Google, Bing and other crawlers find your sitemaps faster and index fresh pages more reliably.
Robots.txt check & validation
Paste an existing robots.txt file to run a quick validation. Spot syntax issues, conflicting rules, missing sitemap lines and risky “block all” patterns before they impact SEO.
Google‑style rule testing
See how user‑agents and Disallow paths interact for a specific URL using simplified rule logic. Understand why a URL is blocked or allowed before you change your live robots.txt file.
One‑click copy & download
When your robots.txt file looks right, copy it to clipboard or download a ready‑to‑upload robots.txt file. Perfect for quick WordPress, Shopify and static site deployments.
Why this free robots.txt generator is different
Most online generators only spit out a basic template. This tool focuses on SEO‑safe presets, validation and privacy‑conscious defaults so you don’t accidentally block Google or expose sensitive paths.
| Feature | Typical robots.txt generators | WebToolTrix Robots.txt generator |
|---|---|---|
| SEO‑ready presets (WP, Shopify, Next.js) | Often missing or outdated | ✓ Curated presets for modern stacks |
| Instant robots.txt preview | ✓ Basic text area | ✓ Themed preview with status hints |
| Block all / allow all toggles | ● Sometimes available | ✓ Safe defaults with clear warnings |
| Robots.txt validation & checks | ✕ Rarely included | ✓ Highlights risky rules and mistakes |
| Support for sitemap lines | ● Not always supported | ✓ Multiple sitemap URLs supported |
| Works fully in your browser | ✕ Some send data to server | ✓ 100% browser‑local, privacy‑friendly |
| Price | Free with limits or paywalls | ✓ Free forever, no signup |
What is a robots.txt file?
A robots.txt file is a plain text file that lives at the root of your domain and controls how search engine crawlers navigate your site. It follows the Robots Exclusion Protocol (RFC 9309), the official internet standard for crawler communication. Googlebot checks for this file before crawling any other URL on your domain — making it one of the most powerful (and dangerous) files in your technical SEO toolkit.
The file lives at https://yourdomain.com/robots.txt and contains a series of rule blocks.
Each block opens with a User-agent line that targets a specific crawler — or the wildcard
* for all bots — followed by Allow and Disallow directives that
define exactly which URL paths that bot may or may not visit.
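As a rough illustration of how rule blocks are read, the sketch below parses a two-block file with Python's standard `urllib.robotparser` module. The paths, the `OtherBot` name and the example.com URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot has its own block, so only that block's rules apply to it.
googlebot_drafts = parser.can_fetch("Googlebot", "https://example.com/drafts/post")
googlebot_private = parser.can_fetch("Googlebot", "https://example.com/private/")
# Any other crawler falls back to the wildcard block.
other_private = parser.can_fetch("OtherBot", "https://example.com/private/")

print(googlebot_drafts, googlebot_private, other_private)  # False True False
```

Note that a crawler with its own block does not inherit the wildcard rules, which is why Googlebot may fetch /private/ here.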
Robots.txt syntax: all key directives explained
Robots.txt uses a compact vocabulary. Understanding each directive prevents the most costly mistakes — including accidentally blocking your entire site or leaking admin paths to the public web.
| Directive | What it does | Example | SEO impact |
|---|---|---|---|
| User-agent | Targets a specific bot or all bots (*) | User-agent: Googlebot | Groups all rules below it for that agent |
| Disallow | Blocks the bot from crawling a path | Disallow: /wp-admin/ | Saves crawl budget; can accidentally de-index if misused |
| Allow | Grants access to a path inside a disallowed parent | Allow: /wp-admin/admin-ajax.php | Keeps key scripts crawlable for proper page rendering |
| Sitemap | Points crawlers to your XML sitemap | Sitemap: https://example.com/sitemap.xml | Speeds up discovery of new and updated URLs |
| Crawl-delay | Asks bots to pause between requests (seconds) | Crawl-delay: 2 | Reduces server load; Google ignores it but Bingbot respects it |
| Host | Specifies preferred domain for Yandex | Host: example.com | Yandex-specific; no effect on Google or Bing |
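To see how a crawler-side parser might read the Sitemap and Crawl-delay directives from the table, here is a small sketch using Python's standard `urllib.robotparser`. The file content and the example.com URL are illustrative only.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 2
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

delay = parser.crawl_delay("bingbot")  # seconds requested for this agent
sitemaps = parser.site_maps()          # list of Sitemap URLs, or None if absent

print(delay, sitemaps)  # 2 ['https://example.com/sitemap.xml']
```

The Sitemap line is independent of any User-agent group, so it can sit anywhere in the file.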
Path matching is case-sensitive and prefix-based. Disallow: /Photo/ does not
block /photo/ — a subtle distinction that causes real indexing problems on Linux-hosted sites.
You can also use the wildcard * inside paths and the $ end-of-URL anchor, though
support varies by crawler. Consult
Google's robots.txt specification
for the full matching rules Googlebot applies.
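The matching behaviour described above can be sketched in a few lines of Python. This is a simplified model of Google-style matching (case-sensitive prefix match, `*` wildcard, `$` end anchor, longest matching pattern wins, Allow wins a tie), not Google's actual parser. It ignores percent-encoding, and the function names and sample rules are made up for the example.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern[str]:
    """Translate a robots.txt path pattern into a regex matched from the start."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape everything literally, then turn each "*" back into "match anything".
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """rules: ("allow" | "disallow", pattern) pairs from one User-agent group."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if not pattern or not pattern_to_regex(pattern).match(path):
            continue
        # Longest pattern wins; on a tie, Allow beats Disallow.
        if len(pattern) > best_len or (len(pattern) == best_len and kind == "allow"):
            best_len, allowed = len(pattern), kind == "allow"
    return allowed


rules = [
    ("disallow", "/Photo/"),
    ("disallow", "/*.pdf$"),
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]
print(is_allowed(rules, "/Photo/cat.jpg"))              # False
print(is_allowed(rules, "/photo/cat.jpg"))              # True: case differs
print(is_allowed(rules, "/docs/guide.pdf"))             # False
print(is_allowed(rules, "/docs/guide.pdf?download=1"))  # True: "$" anchors the end
print(is_allowed(rules, "/wp-admin/admin-ajax.php"))    # True: longer Allow wins
```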
Why robots.txt matters for SEO
Search engines allocate a crawl budget to every domain — roughly the number of URLs they
will fetch during a given period. Large e-commerce catalogues and content-heavy sites feel this limit acutely.
If Googlebot spends half its budget crawling /wp-admin/, /?s= search result URLs,
and endless filtered collection pages, it has less time for your most commercially important content.
"A broken robots.txt file is one of the few technical SEO issues that can completely de-index a site overnight. Treat it with the same care as your sitemap."
The flip side is equally dangerous. An overly aggressive robots.txt that blocks large sections of your site
prevents Googlebot from seeing your noindex meta tags — because a blocked page can never be fetched, so the
meta tag inside it is never read. This creates a paradox where pages you intend to de-index actually persist
in search results. The correct pattern is: allow crawling, control indexing with
<meta name="robots" content="noindex"> or the X-Robots-Tag HTTP header.
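For non-HTML responses, or when editing templates is awkward, the X-Robots-Tag header can be set by the server instead. The sketch below is a minimal WSGI application that adds the header on a few paths. The `NOINDEX_PREFIXES` values and the `app` name are hypothetical, and a real site would normally do this in its web server or framework configuration.

```python
NOINDEX_PREFIXES = ("/search", "/tag/")  # hypothetical paths to keep out of the index


def app(environ, start_response):
    """Tiny WSGI app that adds X-Robots-Tag: noindex on selected paths."""
    path = environ.get("PATH_INFO", "/")
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if path.startswith(NOINDEX_PREFIXES):
        # Crawlers may still fetch the page; the header asks them not to index it.
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [b"<!doctype html><title>Example</title>"]
```

To try it locally, the app can be served with `wsgiref.simple_server.make_server` from the standard library. The paths must stay crawlable in robots.txt, otherwise the header is never seen.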
Platform-specific robots.txt examples
WordPress robots.txt example
WordPress typically serves a virtual robots.txt generated by your SEO plugin. A solid WordPress configuration
blocks the admin area, login pages and internal search results while leaving all content, media and REST API
endpoints open for discovery. The critical rule most guides miss is
Allow: /wp-admin/admin-ajax.php. Many themes and plugins load front-end content through that endpoint, so
blocking it along with the rest of /wp-admin/ can leave Google rendering those pages incompletely.
Check the live /robots.txt after any change.
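A WordPress ruleset along the lines described above might look like the string in this sketch, which also checks a few URLs with Python's standard `urllib.robotparser`. The exact paths and the example.com sitemap URL are assumptions, so adjust them to your own install.

```python
from urllib.robotparser import RobotFileParser

# Allow is listed before Disallow because urllib.robotparser applies the first
# matching rule; Google picks the most specific rule regardless of order.
WORDPRESS_ROBOTS_TXT = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
Disallow: /wp-login.php
Disallow: /?s=

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(WORDPRESS_ROBOTS_TXT.splitlines())

base = "https://example.com"
ajax_ok = parser.can_fetch("Googlebot", base + "/wp-admin/admin-ajax.php")    # True
admin_ok = parser.can_fetch("Googlebot", base + "/wp-admin/options.php")      # False
post_ok = parser.can_fetch("Googlebot", base + "/blog/hello-world/")          # True
media_ok = parser.can_fetch("Googlebot", base + "/wp-content/uploads/a.jpg")  # True
```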
Shopify robots.txt example
Shopify generates a default robots.txt for every store and, since mid-2021, allows merchants to customise
it via the robots.txt.liquid theme template. The default ruleset sensibly blocks /admin, /cart,
/checkout and /orders. The most common advanced Shopify use case is extending
the disallow list to cover faceted navigation URLs (collection filters) that produce thousands of
near-duplicate pages and bleed crawl budget.
Next.js robots.txt example
Next.js 13+ has a built-in robots.ts file convention in the app/ directory that
generates robots.txt at build time. For older projects, the popular next-sitemap package
handles both sitemap and robots.txt generation. Either way, the underlying rules you design with this
generator can be directly translated into the configuration your framework expects.
How to block AI crawlers in robots.txt
Since 2023, a new category of bots has changed the robots.txt landscape: AI training crawlers. Companies including OpenAI, Anthropic, Google, Apple and Amazon all deploy bots that collect web content for large language model training. Unlike search crawlers, these bots do not send traffic back to your site — they simply extract content. Many website owners and publishers now actively block them.
| AI bot / User-agent | Company | Purpose | Disallow rule |
|---|---|---|---|
| GPTBot | OpenAI | ChatGPT / GPT-4 training data | Disallow: / |
| ChatGPT-User | OpenAI | ChatGPT browsing plugin | Disallow: / |
| CCBot | Common Crawl | Open dataset (feeds many LLMs) | Disallow: / |
| anthropic-ai | Anthropic | Claude training data | Disallow: / |
| ClaudeBot | Anthropic | Claude web browsing | Disallow: / |
| Google-Extended | Google | Gemini / Vertex AI training | Disallow: / |
| Applebot-Extended | Apple | Apple Intelligence training | Disallow: / |
| Amazonbot | Amazon | Alexa AI training | Disallow: / |
To opt out of Gemini training without changing how Googlebot crawls your site for Search, use Google-Extended as the User-agent with Disallow: /.
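As a sketch of what an AI opt-out can look like in practice, the file below blocks three of the listed agents while leaving regular search crawling alone, then checks the result with Python's standard `urllib.robotparser`. The choice of agents, the /admin/ rule and the example URL are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

AI_OPT_OUT = """\
User-agent: GPTBot
User-agent: CCBot
User-agent: Google-Extended
Disallow: /

User-agent: *
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(AI_OPT_OUT.splitlines())

url = "https://example.com/blog/post"
blocked = {bot: not parser.can_fetch(bot, url) for bot in ("GPTBot", "CCBot", "Googlebot")}
print(blocked)  # {'GPTBot': True, 'CCBot': True, 'Googlebot': False}
```

Several User-agent lines can share one rule block, which keeps the opt-out list short. Compliance is voluntary, so this only affects bots that honour robots.txt.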
Most common robots.txt mistakes that hurt SEO
Most robots.txt problems are not caused by ignorance of the format. They are caused by copy-paste errors, staging configurations being promoted to production, or well-intentioned rules with unintended side effects. Here are the most costly mistakes and how to avoid them.
Accidental "block everything"
Disallow: / under User-agent: * tells every bot to leave immediately.
Left in place after a staging deployment, it can wipe your site from search results within days.
Always validate robots.txt in Search Console before and after any site migration.
Blocking CSS and JavaScript files
Blocking /wp-content/ or theme asset folders was once common practice.
Today, Google needs to fetch and render your CSS and JS to understand how your page looks.
If it cannot, it may misjudge your layout and mobile-friendliness, and your pages may rank poorly.
Never disallow resource folders your front-end depends on.
Using robots.txt to hide sensitive data
Disallow: /private-docs/ does not make a folder private. It just asks bots not to crawl it.
Anyone who knows the URL can still access the content directly.
Use proper authentication, server-side access controls or password protection instead.
Case-sensitive path mismatches
Disallow: /Images/ does not block /images/.
Crawlers compare robots.txt paths case-sensitively, whatever your server's file system does.
Always match the exact capitalisation of your actual URL paths.
Missing sitemap line
Without a Sitemap: line in robots.txt, crawlers must discover your sitemap
through Search Console or by guessing common paths. Adding at least one sitemap URL directly
helps Google and Bing find new content faster — a simple win that many sites overlook.
Robots.txt vs noindex: which to use?
This is one of the most misunderstood distinctions in technical SEO.
Robots.txt controls crawling — whether a bot is allowed to fetch a URL.
The noindex meta tag or X-Robots-Tag HTTP header controls indexing —
whether a fetched page appears in search results.
The critical conflict: if you block a page in robots.txt, Googlebot cannot fetch it, which means it cannot
read your noindex tag. The result is that the page may remain indexed indefinitely based on
external link signals alone. Google's official guidance
recommends allowing crawling and using proper noindex signals for any page you want removed from search results.
Use robots.txt to manage crawling, and noindex to remove specific pages from search results.
Do not use both together on the same URLs.
How to test your robots.txt file
Paste your file into the Check & Validate mode on this tool for an immediate syntax check. The validator flags missing User-agent lines, malformed Disallow rules, dangerous block-all patterns and missing sitemap lines before anything goes live.
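The kinds of checks listed above can be approximated with a short script. The following is a rough sketch of such a linter, not the validator this tool actually runs. The function name, the warning wording and the exact set of checks are assumptions.

```python
def lint_robots_txt(text: str) -> list[str]:
    """Return human-readable warnings for a few common robots.txt problems."""
    warnings: list[str] = []
    agents: list[str] = []  # user-agents of the group being read
    in_rules = False        # True once the current group has a rule line
    has_sitemap = False
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        if ":" not in line:
            warnings.append(f"line {number}: missing ':' separator")
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:  # a new group starts here
                agents, in_rules = [], False
            agents.append(value)
        elif key in ("allow", "disallow"):
            if not agents:
                warnings.append(f"line {number}: rule appears before any User-agent")
            elif value and not value.startswith(("/", "*")):
                warnings.append(f"line {number}: path should start with '/'")
            elif key == "disallow" and value == "/" and "*" in agents:
                warnings.append(f"line {number}: blocks the whole site for all crawlers")
            in_rules = True
        elif key == "sitemap":
            has_sitemap = True
    if not has_sitemap:
        warnings.append("no Sitemap line found")
    return warnings


for warning in lint_robots_txt("Disallow: /tmp/\nUser-agent: *\nDisallow: /\n"):
    print(warning)
```

For that deliberately broken input, the script reports the orphaned rule on line 1, the block-all rule on line 3 and the missing Sitemap line.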
Once you deploy your file, open Google Search Console → Settings → robots.txt report. This shows how Google fetched and parsed your live file, including any warnings or errors. To test a specific URL against your current rules, use the URL Inspection tool. Monitor the Page indexing (formerly Coverage) and Crawl Stats reports over the following days to confirm that Googlebot can still access your highest-priority pages.
How to upload and deploy your robots.txt file
Generate and download
Use this tool to build your robots.txt, then click the download button to save it as
robots.txt (lowercase, exact filename). Keep the file plain UTF-8 text with
no BOM marker — some text editors add invisible headers that corrupt the file.
Upload to the site root
Upload the file to the root directory of your domain so it is accessible at
https://yourdomain.com/robots.txt. For WordPress, this is usually your
public_html/ or www/ folder via FTP, SFTP, or your hosting
file manager. For Shopify, edit the robots.txt.liquid template in your theme's code editor.
Verify it is live
Visit https://yourdomain.com/robots.txt directly in your browser to confirm
the correct version is being served. If you see a cached or old version, your hosting may
be serving a previously generated file — clear the CDN cache or server cache to force a refresh.
Test in Google Search Console
In Search Console, go to Settings → robots.txt to confirm Google has fetched the new file, then use the URL Inspection tool to check that important pages like your homepage, product pages, and blog posts are not accidentally blocked. Monitor the Page indexing (formerly Coverage) and Crawl Stats reports for the next 48–72 hours after any major change.
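The "plain UTF-8, no BOM" requirement from the first step can be checked before uploading. The sketch below writes the file from Python and tests for a byte-order mark. The helper names are hypothetical, and the temporary directory stands in for wherever you stage the file.

```python
import codecs
import tempfile
from pathlib import Path


def write_robots_txt(content: str, directory: Path) -> Path:
    """Save content as robots.txt in plain UTF-8 (no byte-order mark)."""
    target = directory / "robots.txt"
    # Encoding with "utf-8" adds no BOM; "utf-8-sig" would prepend one.
    target.write_bytes(content.encode("utf-8"))
    return target


def has_bom(path: Path) -> bool:
    """True if the file starts with the UTF-8 byte-order mark."""
    return path.read_bytes().startswith(codecs.BOM_UTF8)


with tempfile.TemporaryDirectory() as tmp:
    saved = write_robots_txt("User-agent: *\nDisallow:\n", Path(tmp))
    print(saved.name, has_bom(saved))  # robots.txt False
```

Running `has_bom` on a file exported from a desktop editor is a quick way to catch the invisible prefix mentioned earlier.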
Best practices for SEO-friendly robots.txt files
- Keep it small and commented. A concise, well-commented robots.txt is easier to audit during migrations. Add a short comment above each disallow block explaining why it exists.
- Block low-value, high-volume URL groups. Internal search results, pagination beyond page 5, filter combinations and tracking parameter URLs are common candidates for blocking.
- Never block CSS, JS or image assets Google needs to render pages. Blocking theme files or media libraries prevents Google from assessing your page quality accurately.
- Always include up-to-date sitemap URLs. If you run multiple sitemaps (posts, products, images, news), include each one on a separate Sitemap: line.
- Review robots.txt before every site migration. Staging configurations that reach production, even briefly, can trigger a significant drop in crawl coverage.
- Test after every deployment. Use the validation mode above to do a pre-flight check, then confirm with Search Console after each change.
- Decide your AI bot policy intentionally. If you do not want AI companies collecting your content for model training, add explicit rules for GPTBot, CCBot, Google-Extended and other AI agents that are relevant to your situation.
Need an XML sitemap to pair with your robots.txt? Use the free XML Sitemap Generator on WebToolTrix to build and download a clean sitemap in minutes. And if you want to check what meta tags search engines are reading on any of your pages, the Meta Tag Checker gives you a full breakdown without any browser extension.
Robots.txt generator — frequently asked questions
Common questions about robots.txt syntax, crawl rules, AI bot blocking, platform-specific setup, and using this free generator with WordPress, Shopify and Next.js.
Ship a safe robots.txt file before your next release
Use this free robots.txt generator to double‑check your crawl rules before a redesign, migration or performance optimisation. One quick review today can prevent a major SEO loss tomorrow.