What Is a Sitemap.xml File?
A sitemap.xml is an XML file that lists the pages on your website you want search engines to know about. Google, Bing, and other crawlers use it to discover new content faster and to recrawl pages that have changed. Submitting it in Google Search Console tells Google where to find it, although a sitemap is a hint and does not guarantee that every listed URL will be crawled or indexed.
The XML format was introduced by Google in 2005 and adopted jointly by Google, Yahoo!, and Microsoft in 2006 as the Sitemaps Protocol 0.9 (now maintained at sitemaps.org). Each entry can carry four properties: <loc> (the URL), <lastmod> (last modified date), <changefreq> (how often the page changes), and <priority> (0.0–1.0 importance hint). Google today ignores changefreq and priority in favour of its own crawl-importance calculations, but it still uses lastmod for recrawl scheduling when the value is consistently accurate.
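To make the four properties concrete, here is a minimal Python sketch that builds a small sitemap with the standard library. The function name build_urlset and the example.com URLs are illustrative assumptions, not part of any tool described on this page.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Build a <urlset> document from a list of dicts.

    Each dict needs "loc"; "lastmod", "changefreq" and "priority" are optional.
    (Hypothetical helper, for illustration only.)
    """
    # Serialise the sitemap namespace as the default xmlns, without a prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for entry in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                child = ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}")
                child.text = str(entry[tag])
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_urlset([
        {"loc": "https://example.com/", "lastmod": "2024-01-15",
         "changefreq": "daily", "priority": 1.0},
        {"loc": "https://example.com/about"},
    ]))
```

Only loc is mandatory in the protocol, so the sketch simply skips any property that is missing from an entry.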
Important distinction: XML sitemaps are for search-engine crawlers, while HTML sitemaps are clickable pages that help human visitors navigate large sites. Many sites benefit from both, but the XML version is the one that affects search indexing. Beyond the standard format, there are specialised variants: Image sitemaps (list the image URLs that appear on each page), Video sitemaps (thumbnail, duration, content URL), and News sitemaps (for Google News publishers, listing only articles published in the last two days). For very large sites, a sitemap index file references multiple individual sitemaps; it is required once a single file would exceed 50,000 URLs or 50 MB uncompressed.
How the Generator Works
- Paste your homepage URL.
- The tool fetches the page server-side, extracts every internal link, and tries to import URLs from your existing /sitemap.xml if one is published.
- Each URL is auto-categorized as homepage, blog, category, tool, static, or other, and assigned a sensible priority and changefreq based on its path.
- Use Include paths to add extra URLs and Exclude paths to remove anything you don't want indexed (admin, drafts, search pages, etc.).
- Copy the generated XML or click Download to save it as sitemap.xml, ready to upload to your site root.
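The link-extraction and exclude steps above can be sketched roughly as follows. This is not the tool's actual implementation: the names LinkCollector and internal_urls, and the prefix-based exclude rule, are assumptions made for illustration, and the sketch works on an HTML string rather than fetching anything.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def internal_urls(html, base_url, exclude_prefixes=()):
    """Return sorted, de-duplicated same-host URLs found in html.

    exclude_prefixes is a hypothetical stand-in for an "Exclude paths"
    field: any URL whose path starts with one of them is dropped.
    """
    parser = LinkCollector()
    parser.feed(html)
    base_host = urlparse(base_url).netloc
    found = set()
    for href in parser.hrefs:
        # Resolve relative links and strip #fragments.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https") or parts.netloc != base_host:
            continue  # skip mailto:, external hosts, etc.
        if any(parts.path.startswith(prefix) for prefix in exclude_prefixes):
            continue
        found.add(absolute)
    return sorted(found)


if __name__ == "__main__":
    page = '<a href="/blog/">Blog</a> <a href="https://other.example/">Out</a>'
    print(internal_urls(page, "https://example.com/"))
```

A set is used so that a URL linked from several places on the page appears only once in the result.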
Common Use Cases
- E-commerce catalogues — product pages often sit 4–5 clicks deep behind category filters, so crawlers struggle to find them via navigation alone. A sitemap surfaces every product URL directly to Googlebot, improving discovery and reducing time-to-index for new SKUs.
- News and journalism sites: a separate news sitemap with <news:publication_date> per article helps Google News discover new stories quickly. It should list only articles published in the last two days.
- Blogs with priority signalling: setting higher priority for fresh content and lower for archive pages is meant to guide crawl budget toward what matters, although Google itself ignores the priority value.
- Multilingual sites with hreflang: the sitemap can declare hreflang relationships between language variants, which is often easier to maintain at scale than in-page hreflang tags.
- Single-page applications (SPAs) — React, Vue, and Angular apps that rely on client-side rendering can hide URLs from crawlers. A sitemap explicitly lists every route so Googlebot finds the JavaScript-rendered pages.
- Sites with dynamic URL parameters — if your content is reachable through multiple URL combinations, explicitly listing the canonical variant in the sitemap helps avoid duplicate-content confusion.
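For the multilingual case, hreflang relationships are declared in a sitemap with xhtml:link elements inside each url entry. A rough Python sketch, assuming a hypothetical helper build_hreflang_urlset and example.com URLs:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_urlset(variants):
    """Build a sitemap for one page that exists in several languages.

    variants maps a language code to that language's URL. Every <url>
    entry lists all alternates, including itself.
    (Hypothetical helper, for illustration only.)
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url in variants.values():
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = page_url
        for lang, alternate_url in variants.items():
            ET.SubElement(
                entry,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": lang, "href": alternate_url},
            )
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    print(build_hreflang_urlset({
        "en": "https://example.com/en/",
        "de": "https://example.com/de/",
    }))
```

Listing every alternate under every variant keeps the annotations reciprocal, which is what search engines expect from hreflang.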
Smart Priority Inference
- Homepage → priority 1.0, daily
- /blog, /news, /articles → priority 0.9, daily
- Blog posts → priority 0.7, monthly
- Tool / category landings → priority 0.8, weekly
- Tool sub-pages → priority 0.7, monthly
- /about, /contact, /privacy, /terms → priority 0.5, yearly
Pick a single change-frequency or priority in the form to override these per-route defaults for every URL.
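The defaults listed above could be expressed as a small path-based lookup. The sketch below is only an approximation of that table: the function name infer_defaults, the /tools and /category section names, and the fallback for unmatched paths are all assumptions, not the tool's real rules.

```python
from urllib.parse import urlparse

INDEX_SECTIONS = {"blog", "news", "articles"}
STATIC_PAGES = {"about", "contact", "privacy", "terms"}
# Assumed section names for tool and category landings (hypothetical).
LANDING_SECTIONS = {"tools", "category"}


def infer_defaults(url):
    """Return a (priority, changefreq) guess based on the URL path."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        return 1.0, "daily"  # homepage
    first = segments[0]
    if first in INDEX_SECTIONS:
        # /blog is an index page; /blog/some-post is an individual post.
        return (0.9, "daily") if len(segments) == 1 else (0.7, "monthly")
    if first in LANDING_SECTIONS:
        # /tools is a landing page; anything deeper is a sub-page.
        return (0.8, "weekly") if len(segments) == 1 else (0.7, "monthly")
    if first in STATIC_PAGES and len(segments) == 1:
        return 0.5, "yearly"
    # Fallback for everything else: an assumption, not from the table.
    return 0.5, "monthly"


if __name__ == "__main__":
    for url in ("https://example.com/", "https://example.com/blog/hello"):
        print(url, infer_defaults(url))
```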
Sitemap Generator vs Other Tools
Versus XML-sitemaps.com: the classic generator caps its free tier at 500 URLs, with larger sites requiring a paid plan. Tooldit does not impose a URL limit.
Versus Screaming Frog — Screaming Frog is a desktop crawler with sitemap export. Free for up to 500 URLs, paid (~$259/yr) above that. Tooldit is free with no URL limit.
Versus Yoast / RankMath — WordPress plugins auto-generate sitemaps for WP sites only. Tooldit works for any platform (custom, Next.js, static sites).
Versus Next.js sitemap.ts: Next.js can build a sitemap programmatically from an app/sitemap.ts (or .js) file. Tooldit is for sites without that infrastructure, or for quick one-off needs.
Troubleshooting & Common Issues
- Missing pages in generated sitemap — the crawler follows links from your homepage. Pages that aren't linked anywhere (orphan pages) won't appear. Add them manually or fix internal linking.
- Sitemap shows 404 pages — if the crawler hit broken internal links, they end up in the sitemap. Fix the broken links on your site, then re-generate.
- Google Search Console errors — GSC rejects sitemaps with HTTP URLs when your site is HTTPS, or with URLs that don't match the canonical domain (www vs non-www). Make sure your sitemap uses the same scheme/domain.
- Sitemap too large: a single XML sitemap is capped at 50,000 URLs or 50 MB uncompressed. For larger sites, split into multiple sitemap files and reference them from a sitemap-index file.
- Priority values look wrong — Google ignores priority and changefreq values in practice. Don't obsess over them; focus on which URLs are included.
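The scheme and domain mismatch described above is easy to check before uploading. A small sketch, assuming a hypothetical helper named mismatched_urls that compares each URL against the origin you treat as canonical:

```python
from urllib.parse import urlparse


def mismatched_urls(urls, canonical_origin):
    """Return the URLs whose scheme or host differs from canonical_origin.

    canonical_origin is something like "https://www.example.com".
    (Hypothetical helper, for illustration only.)
    """
    canon = urlparse(canonical_origin)
    expected = (canon.scheme, canon.netloc.lower())
    bad = []
    for url in urls:
        parts = urlparse(url)
        if (parts.scheme, parts.netloc.lower()) != expected:
            bad.append(url)
    return bad


if __name__ == "__main__":
    print(mismatched_urls(
        ["https://www.example.com/a", "http://www.example.com/b"],
        "https://www.example.com",
    ))
```

Anything the function returns is either on the wrong scheme (http vs https) or the wrong host (www vs non-www) and is worth fixing before submission.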
Frequently Asked Questions
What is a sitemap.xml file?
A sitemap.xml is a structured list of the URLs on your site that you want search engines to crawl. Submitting one to Google Search Console helps your pages get discovered and re-crawled faster.
How does this sitemap generator find my pages?
It fetches your homepage, parses every internal link in the HTML, and also imports URLs from your existing /sitemap.xml if one is published. You can add extra URLs with the include paths field and remove paths with exclude.
What's a good priority value?
1.0 for the homepage, 0.8–0.9 for major category pages and the blog index, 0.6–0.7 for blog posts and tool pages, 0.5 or lower for legal/static pages. Don't make every page 1.0 — that gives Google no signal.
How often should I update my sitemap?
Whenever you add, remove or significantly update content. For active blogs, daily or weekly is normal. The lastmod date helps Google decide what to recrawl, so keeping it accurate is more important than the changefreq value.
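Since an accurate lastmod matters, it helps to derive it from a real modification time rather than the moment the sitemap was generated. A minimal sketch, assuming a hypothetical helper lastmod_value that takes a Unix timestamp (for example from a file's mtime or a CMS record):

```python
from datetime import datetime, timezone


def lastmod_value(timestamp, date_only=True):
    """Format a Unix timestamp as a W3C datetime string in UTC.

    With date_only=True the result looks like 2024-01-15; otherwise it
    includes the time and a +00:00 offset.
    (Hypothetical helper, for illustration only.)
    """
    moment = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    if date_only:
        return moment.date().isoformat()
    return moment.isoformat(timespec="seconds")


if __name__ == "__main__":
    print(lastmod_value(1705312800))
    print(lastmod_value(1705312800, date_only=False))
```

Both forms are valid lastmod values; the date-only form is usually enough unless pages change several times a day.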
What if my site has more than 50,000 URLs?
Google's per-sitemap limit is 50,000 URLs and 50 MB uncompressed. Split larger sites across multiple sitemaps and reference them from a sitemap index file (sitemap_index.xml).
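Splitting by URL count can be sketched in a few lines of Python. The helper names chunk_urls and build_index and the sitemap-N.xml file names are assumptions for illustration; the sketch also only enforces the 50,000-URL limit, not the 50 MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most size items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_index(sitemap_urls):
    """Build a <sitemapindex> document that points at each child sitemap."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    all_urls = [f"https://example.com/p/{n}" for n in range(120_000)]
    chunks = chunk_urls(all_urls)
    # Hypothetical file naming scheme for the child sitemaps.
    children = [
        f"https://example.com/sitemap-{n}.xml"
        for n in range(1, len(chunks) + 1)
    ]
    print(len(chunks), "child sitemaps")
    print(build_index(children))
```

Each chunk would then be written out as its own sitemap file, and only the index is submitted to the search engine.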
Is the tool free?
Yes. The sitemap.xml generator is 100% free with no signup, no credit card, and no email required. Crawl any URL and generate, copy, and download as many XML sitemaps as you need.
Where do I upload the sitemap.xml?
Place it at your domain root (https://example.com/sitemap.xml), then submit it in Google Search Console & Bing Webmaster Tools. Add a reference in robots.txt: Sitemap: https://example.com/sitemap.xml.
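To confirm that the robots.txt reference is picked up, Python's standard urllib.robotparser can list the Sitemap lines it finds (its site_maps() method needs Python 3.8 or later). The wrapper name declared_sitemaps below is a hypothetical helper, and the robots.txt content is a made-up example.

```python
from urllib.robotparser import RobotFileParser


def declared_sitemaps(robots_txt):
    """Return the sitemap URLs declared in a robots.txt string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    example = (
        "User-agent: *\n"
        "Disallow: /admin/\n"
        "Sitemap: https://example.com/sitemap.xml\n"
    )
    print(declared_sitemaps(example))
```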
How often should I regenerate the sitemap?
After adding or removing significant pages or restructuring URLs. For frequently updated sites (blogs, e-commerce), automate via your build pipeline. Static sites only need updates on content changes.