You submitted your sitemap to Google Search Console and got an error. Here is the exact checklist to diagnose 'Couldn't fetch', 'Invalid XML', and 'Redirect error' and get indexed within hours.
A sitemap submission error in Google Search Console is rarely a Google problem. It is almost always a configuration error on your side. The three most common errors are 'Couldn't fetch', 'Invalid XML', and 'Redirect error'. Each has a distinct root cause and a specific fix. This guide walks you through each one, with a diagnostic checklist you can run in under 10 minutes.
In practice, when you see 'Couldn't fetch', the first thing to check is your robots.txt file. Googlebot must be allowed to crawl the sitemap URL. A common situation we see is a site that blocks all bots with Disallow: / and then wonders why the sitemap is not fetched. Google's guidance is clear: your robots.txt must allow access to the sitemap. See Google's official robots.txt documentation for the exact syntax.
Start with the 'Couldn't fetch' error. Verify the sitemap URL is publicly accessible and returns a 200 status code. Use a tool like Noindex Tag Checker to confirm the page is not accidentally blocked by a noindex directive. Next, check your server logs for 4xx or 5xx responses to sitemap requests. A 403 Forbidden means the server, CDN, or firewall is refusing the request. A 503 Service Unavailable means the server is temporarily unable to respond, typically because of overload or maintenance. Fix the server configuration first.
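The log check described above can be scripted. The following is a minimal sketch that assumes an access log in the common or combined log format and a sitemap served at `/sitemap.xml`. The function name `sitemap_statuses` is illustrative, and the plain substring match on "Googlebot" only filters by User-Agent text; it does not verify that a request really came from Google.

```python
import re
from collections import Counter

# Matches the request and status fields of a common/combined log line,
# e.g.  "GET /sitemap.xml HTTP/1.1" 403 162
LOG_RE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def sitemap_statuses(lines, sitemap_path="/sitemap.xml"):
    """Count HTTP status codes returned to Googlebot for the sitemap path."""
    counts = Counter()
    for line in lines:
        # Assumption: the User-Agent string is logged and contains "Googlebot".
        if "Googlebot" not in line:
            continue
        match = LOG_RE.search(line)
        if not match:
            continue
        # Ignore any query string when comparing paths.
        if match.group("path").split("?")[0] == sitemap_path:
            counts[int(match.group("status"))] += 1
    return counts
```

Called with an open log file (for example `sitemap_statuses(open("access.log"))`, where the file name is a placeholder), any 403, 500, or 503 keys in the result point to a server-side block rather than a sitemap problem.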
For 'Invalid XML', validate the file with an XML validator. Common failures include unescaped ampersands, missing closing tags, and incorrect namespace declarations. Use a tool like Pragmatic Bulk URL Index Checker for Google to check if individual URLs in the sitemap are actually indexable. If a URL returns a 404 or 301, remove it from the sitemap. Google expects only canonical, indexable URLs.
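As a rough illustration of the validation step, the sketch below uses only the Python standard library to report the line of the first parse error, flag a UTF-8 BOM, and check the root namespace. The function name `check_sitemap_xml` is hypothetical, and the check covers well-formedness and namespace only, not the full sitemap schema.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap_xml(data: bytes) -> list[str]:
    """Return a list of human-readable problems found in a sitemap file."""
    problems = []
    if data.startswith(b"\xef\xbb\xbf"):
        problems.append("file starts with a UTF-8 BOM")
    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        line, column = exc.position
        problems.append(f"XML parse error at line {line}, column {column}: {exc}")
        return problems
    # ElementTree writes namespaced tags as "{namespace}localname".
    expected = {f"{{{SITEMAP_NS}}}urlset", f"{{{SITEMAP_NS}}}sitemapindex"}
    if root.tag not in expected:
        problems.append(f"unexpected root element or namespace: {root.tag}")
    return problems


def escape_loc(url: str) -> str:
    """Escape &, < and > so the URL is safe inside a <loc> element."""
    return escape(url)
```

For example, `escape_loc("https://example.com/?a=1&b=2")` yields `https://example.com/?a=1&amp;b=2`, which is the form that belongs inside `<loc>`.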
For 'Redirect error', the sitemap URL itself must not redirect. A common mistake is submitting http://example.com/sitemap.xml when the site forces HTTPS and redirects to https://example.com/sitemap.xml. Use the final, canonical URL in Search Console. Check that the redirect chain is zero hops.
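To make the zero-hop requirement concrete, here is a small sketch that sends HEAD requests without following redirects and records each hop, similar in spirit to running `curl -I` by hand. The helper names `head`, `redirect_chain`, and `submittable_url`, as well as the `sitemap-check/0.1` User-Agent string, are assumptions made for illustration.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def head(url, timeout=10.0):
    """Send one HEAD request (no redirect following); return (status, Location)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("HEAD", target, headers={"User-Agent": "sitemap-check/0.1"})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def redirect_chain(url, head_fn=head, max_hops=5):
    """Follow Location headers manually and return [(url, status), ...]."""
    chain = []
    for _ in range(max_hops + 1):
        status, location = head_fn(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be relative
    return chain


def submittable_url(chain):
    """Return the final URL if it answered 200, otherwise None."""
    final_url, final_status = chain[-1]
    return final_url if final_status == 200 else None
```

A chain of length one with status 200 means the URL is safe to submit as is. A longer chain means the last URL, not the first, is the one to enter in Search Console.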
| Error Type | Root Cause | Diagnostic Action | Failure Mode |
|---|---|---|---|
| Couldn't fetch: Googlebot cannot access the sitemap URL | robots.txt blocks Googlebot; server returns 403/503; DNS resolution failure | Test the URL in the Search Console URL Inspection tool; check robots.txt for a Disallow directive | Blocked by CDN or firewall; rate limiting by server; host unreachable from Google IP range |
| Invalid XML: XML parse error | Unescaped special characters (&, <, >); missing closing tags; wrong namespace (e.g., http vs https) | Validate with W3C XML validator; check encoding (must be UTF-8); use `&amp;` for ampersands | A large sitemap with 50,000 URLs may hide errors in the middle; an editor may introduce BOM characters |
| Redirect error: URL returns 3xx status | HTTP to HTTPS redirect; WWW to non-WWW redirect; path redirect (e.g., /sitemap to /sitemap.xml) | Check response headers with curl -I; use the final canonical URL in Search Console; ensure the redirect chain has zero hops | Redirect chain hidden by JS or meta refresh; server-level redirect not visible in browser dev tools |
- Verify sitemap URL returns 200 status (use curl or browser dev tools).
- Check robots.txt for Disallow: / (blocking Googlebot).
- Validate XML syntax with W3C XML validator (no ampersand errors).
- Test each URL in the sitemap for indexability (200, no noindex, no 301).
- Ensure sitemap URL is the final canonical version (no redirects).
- Check server logs for 403, 503, or 500 errors on sitemap requests.
- Use Google Search Console URL Inspection tool to test live URL.
1. Add sitemap URL in Google Search Console Sitemaps section.
2. Check robots.txt and server response. Ensure 200 status.
3. Validate XML. Fix encoding issues and unescaped characters.
4. Use final URL. Remove any 301 or 302 redirects.
5. Wait 24-48 hours. Re-check in Search Console for indexed URLs.
Scenario: A WordPress site with Yoast SEO generates a sitemap at /sitemap_index.xml. The user submits it to Search Console and gets 'Couldn't fetch'.
Step 1: Open the URL in a browser. It returns a 200 status, but the site sits behind a CDN (Cloudflare), so a browser test alone proves little. Check robots.txt: it shows Disallow: / for 'AhrefsBot' but allows 'Googlebot'. Next, review the CDN firewall. One rule blocks requests with a missing User-Agent header, which does not affect Googlebot because it sends a proper header. A second, custom rule blocks by IP range, and it covers 66.249.64.0/19. A whois lookup confirms that this range belongs to Google.
Step 2: Remove the IP block for Googlebot in Cloudflare firewall. Wait 5 minutes. Re-submit sitemap. Error resolved.
Result: 1,247 URLs discovered within 24 hours. Previously, zero URLs were indexed from the sitemap.
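The firewall audit in this scenario can be sketched with Python's `ipaddress` module. The range `66.249.64.0/19` is taken from the scenario above. The list of blocked CIDRs passed to the function is a hypothetical export of firewall rules, and Google's own published list of Googlebot ranges should be treated as the authoritative source rather than this single range.

```python
import ipaddress

# Range quoted in the scenario; an assumption for illustration only.
GOOGLEBOT_RANGE = ipaddress.ip_network("66.249.64.0/19")


def rules_blocking(target, blocked_cidrs):
    """Return the blocked CIDR strings that overlap the target network."""
    hits = []
    for cidr in blocked_cidrs:
        # strict=False tolerates entries with host bits set, e.g. 66.249.66.1/24
        network = ipaddress.ip_network(cidr, strict=False)
        if network.version == target.version and network.overlaps(target):
            hits.append(cidr)
    return hits
```

Any non-empty result identifies firewall rules worth reviewing before re-submitting the sitemap.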
Your browser may bypass CDN, firewall, or geolocation blocks that Googlebot hits. Use Google Search Console URL Inspection tool to test as Googlebot. Check server logs for 403 or 503 responses. Also verify that your robots.txt allows Googlebot to crawl the sitemap URL specifically.
Ensure the sitemap URL you submit is the final canonical version. If your site redirects HTTP to HTTPS, submit the HTTPS version. If www to non-www, submit the non-www version. Check the redirect chain with curl -I. It should return 200 immediately, not 301 or 302. Remove any intermediate redirects.
Most common cause is an unescaped ampersand (&) in a URL. Use an XML validator like W3C Validator to find the line number. For large sitemaps, split into smaller files (max 50,000 URLs each). Also check encoding: must be UTF-8 without BOM. A tool like Notepad++ can show BOM characters.
Yes. Google expects all URLs in a sitemap to use the same protocol as the sitemap file itself. If your sitemap URL is HTTPS, all URLs inside must be HTTPS. If your site uses HTTPS, never include HTTP URLs. Also ensure the sitemap URL matches the protocol that returns 200.
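A quick way to spot protocol or host mismatches is to compare each listed URL against the sitemap's own URL. The sketch below is a minimal illustration; `mismatched_locs` is a hypothetical helper, and it treats a www versus non-www difference as a mismatch as well.

```python
from urllib.parse import urlsplit


def mismatched_locs(sitemap_url, locs):
    """Return listed URLs whose scheme or host differs from the sitemap URL."""
    base = urlsplit(sitemap_url)
    expected = (base.scheme.lower(), base.netloc.lower())
    bad = []
    for loc in locs:
        parts = urlsplit(loc)
        if (parts.scheme.lower(), parts.netloc.lower()) != expected:
            bad.append(loc)
    return bad
```

An empty list means every entry shares the sitemap's protocol and host. Anything returned should be rewritten to the canonical form or removed.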
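Use a robots.txt checker such as the robots.txt report in Search Console. For agencies, script a check: parse each client's robots.txt for 'Disallow: /' or 'Disallow: /sitemap.xml'. If blocked, add 'Allow: /sitemap.xml' to the same user-agent group. Google applies the most specific matching rule regardless of order, but placing the Allow line above the Disallow rule keeps the file safe for parsers that read rules top-down. Also check for a User-agent: * group that blocks all bots.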
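For the scripted check, one possible starting point is Python's built-in `urllib.robotparser`. This is a sketch under stated assumptions: the helper names and the `robots_by_site` mapping (site origin to robots.txt text) are hypothetical, and the standard-library parser applies rules in file order, which can differ from Google's most-specific-match behavior. Treat a "blocked" result as a prompt for manual review rather than a verdict.

```python
from urllib.robotparser import RobotFileParser


def googlebot_can_fetch(robots_txt: str, sitemap_url: str) -> bool:
    """Parse robots.txt text and test whether Googlebot may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", sitemap_url)


def blocked_clients(robots_by_site: dict[str, str],
                    sitemap_path: str = "/sitemap.xml") -> list[str]:
    """Return the site origins whose robots.txt blocks the sitemap URL."""
    blocked = []
    for origin, robots_txt in robots_by_site.items():
        sitemap_url = origin.rstrip("/") + sitemap_path
        if not googlebot_can_fetch(robots_txt, sitemap_url):
            blocked.append(origin)
    return blocked
```

Fetching each client's robots.txt is left out on purpose, so the parsing logic can be run against saved copies as well as live files.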
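Ensure the CDN firewall allows Googlebot IP ranges. Whitelist the Googlebot User-Agent. Also disable any 'Browser Integrity Check' that may block bots. Submit the sitemap URL that the CDN serves, not the origin URL. Test with curl -H 'User-Agent: Googlebot' to verify a 200 response. Note that this only exercises User-Agent rules: the request still comes from your own IP address, so it cannot confirm that Google's IP ranges are allowed.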
Maximum 50,000 URLs per sitemap file, or 50 MB uncompressed. Do not rely on Google processing a file that exceeds either limit; Search Console reports oversized sitemaps as errors. Split into multiple sitemap files and use a sitemap index file. Also compress with gzip to reduce bandwidth.
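Splitting a large URL list can be sketched as follows. The file naming scheme `sitemap-N.xml` and the function names are assumptions for illustration, and the output includes only `<loc>` entries, with no `<lastmod>` or other optional tags.

```python
from xml.sax.saxutils import escape

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50 MB uncompressed
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'


def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_urlset(urls):
    """Build one sitemap file, escaping &, < and > in every URL."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return f'{XML_HEADER}<urlset xmlns="{NS}">\n{entries}</urlset>\n'


def build_index(sitemap_urls):
    """Build a sitemap index that points at the individual sitemap files."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return f'{XML_HEADER}<sitemapindex xmlns="{NS}">\n{entries}</sitemapindex>\n'


def split_sitemap(urls, base_url):
    """Return (index_xml, {filename: sitemap_xml}) for a list of URLs."""
    files = {}
    for number, part in enumerate(chunk_urls(urls), start=1):
        xml = build_urlset(part)
        if len(xml.encode("utf-8")) > MAX_BYTES:
            raise ValueError(f"sitemap-{number}.xml exceeds 50 MB; use smaller chunks")
        files[f"sitemap-{number}.xml"] = xml
    index = build_index([f"{base_url.rstrip('/')}/{name}" for name in files])
    return index, files
```

Only the index file then needs to be submitted in Search Console; the individual sitemap files are discovered through it.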
This is not an error but a low-priority signal. Google may not crawl all URLs if they are low-quality, duplicate, or blocked by noindex. Check each URL's indexability using the URL Inspection tool. Remove URLs with noindex, 301 redirects, or 404 errors. Also ensure internal linking supports these URLs.
Indirectly, yes. If the sitemap itself redirects, it causes a redirect error. But if URLs listed inside the sitemap redirect, Google will follow the redirect but may not treat the listed URL as canonical. This does not cause a fetch error for the sitemap file. The fetch error is specific to the sitemap URL only.
Export the list of URLs from Search Console. Validate each URL for 200 status, no noindex, no redirect. Use a tool like the Pragmatic Bulk URL Index Checker to identify problem URLs. Remove bad URLs from the sitemap. Regenerate the sitemap and re-submit. For agencies, automate this with a Python script using requests and lxml.
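The audit described above could look roughly like the sketch below. To stay self-contained it uses the standard library's `xml.etree.ElementTree` instead of lxml, and it takes the HTTP layer as a caller-supplied `fetch` function returning `(status, headers, body)`. That function, the helper names, and the simple regular expression for the robots meta tag are all assumptions; the regex in particular expects `name` before `content` and will miss other attribute orders.

```python
import re
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Rough pattern: <meta name="robots" content="... noindex ...">
META_NOINDEX = re.compile(
    r"""<meta[^>]+name=["']robots["'][^>]*content=["'][^"']*noindex""",
    re.IGNORECASE,
)


def extract_locs(sitemap_xml: bytes) -> list[str]:
    """Return every <loc> value found in a sitemap or sitemap index."""
    root = ET.fromstring(sitemap_xml)
    return [el.text.strip() for el in root.findall(".//sm:loc", NS) if el.text]


def classify(status, headers, body):
    """Return a reason to drop the URL, or None if it looks indexable."""
    lowered = {key.lower(): value for key, value in headers.items()}
    if 300 <= status < 400:
        return f"redirect ({status})"
    if status != 200:
        return f"status {status}"
    if "noindex" in lowered.get("x-robots-tag", "").lower():
        return "noindex header"
    if META_NOINDEX.search(body):
        return "noindex meta tag"
    return None


def audit(urls, fetch):
    """Map each problem URL to the reason it should leave the sitemap."""
    problems = {}
    for url in urls:
        reason = classify(*fetch(url))
        if reason:
            problems[url] = reason
    return problems
```

With the requests library, `fetch` could wrap `requests.get(url, allow_redirects=False, timeout=10)` and return the response's `status_code`, `headers`, and `text`, so that redirects are reported rather than silently followed.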