An XML sitemap is a file that tells search engines like Google which URLs on your website should be indexed (added to its database of possible search results).
It may also provide additional information about each URL, including:
- When the page was last modified
- How often the page is updated
- The relative importance of the page
This information can help search engines crawl (explore) your site more effectively and efficiently. And better match your pages with relevant search queries.
That’s why XML sitemaps are important in search engine optimization (SEO).
An XML sitemap (or sitemap.xml file) looks something like this:
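As a rough illustration, the Python sketch below builds and prints a small sitemap using only the standard library. The example.com URLs, dates, and the `build_sitemap` helper are placeholders made up for this example, not output from any particular platform.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML (as a string) for a list of page dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # The remaining tags are optional, so only emit the ones provided.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    ET.indent(urlset)  # pretty-print; requires Python 3.9+
    xml_bytes = ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
    return xml_bytes.decode("utf-8")


# Hypothetical pages, for illustration only.
example_pages = [
    {"loc": "https://www.example.com/", "lastmod": "2024-01-15",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "https://www.example.com/blog/", "lastmod": "2024-01-10"},
]
print(build_sitemap(example_pages))
```

The printed result is a `<urlset>` element containing one `<url>` entry per page.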
If you’re interested in the details, the main tags used are:
- <urlset>: Encloses all the tags for each sitemap
- <url>: Encloses all the tags for each URL
- <loc>: Specifies the page’s complete URL
- <lastmod>: Specifies when the page was last updated (optional)
- <changefreq>: Specifies how frequently the page is likely to change (optional)
- <priority>: Specifies the relative importance of the page from 0.0 to 1.0 (optional)
Webmasters can also create dedicated image, video, and news sitemaps. To help search engines understand these specific types of content.
If you need to create more than one sitemap, you need a sitemap index. Which essentially acts as a sitemap for your sitemaps.
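A sitemap index uses the same namespace as a regular sitemap, but wraps `<sitemap>` entries in a `<sitemapindex>` element. Here is a minimal Python sketch, assuming two child sitemaps at made-up example.com URLs; `build_sitemap_index` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML (as a string) listing each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    ET.indent(index)  # pretty-print; requires Python 3.9+
    xml_bytes = ET.tostring(index, encoding="utf-8", xml_declaration=True)
    return xml_bytes.decode("utf-8")


# Hypothetical child sitemaps, for illustration only.
print(build_sitemap_index([
    "https://www.example.com/sitemap-pages.xml",
    "https://www.example.com/sitemap-posts.xml",
]))
```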
An XML sitemap is highly recommended if you want your pages to show in search engine results.
If you don’t provide an XML sitemap, search engines have to rely on hyperlinks (on your own site or elsewhere) to discover pages on your site. This is inefficient and it can lead to pages being missed.
Now, let’s learn how to create an XML sitemap.
It’s likely that the platform you use to manage your website’s content automatically generates and updates your XML sitemap.
You may be able to find yours by going to yourdomain.com/sitemap.xml in your browser.
Otherwise, refer to the help center for your website builder or content management system (CMS). Or contact your platform’s support team.
If your platform doesn’t provide an XML sitemap, you can use a sitemap generator tool.
These tools can also prove helpful if you want more control over your sitemap. For example, you can customize your WordPress sitemap with the Yoast SEO plugin.
If you use a tool outside of your platform to create a sitemap, make sure to publish it to your site to make it live.
It’s best practice to submit your sitemap to Google. (Rather than waiting for Google’s website crawlers to discover the file on their own.)
But first, make sure there are no issues with your XML sitemap.
With Semrush’s Site Audit tool, you can check whether your sitemap.xml file:
- Can’t be found
- Has formatting errors
- Contains non-canonical or non-200 URLs
- Isn’t specified in robots.txt
- Is too large
- Contains HTTP rather than HTTPS URLs
The tool also checks whether your SEO sitemap contains orphaned pages, which are URLs that aren’t linked to from anywhere on your site. (It’s best practice to add internal links to pages that should be indexed.)
Simply go to the “Issues” report after setting up your audit. And enter “sitemap” into the search bar.
Rerun the audit after implementing any fixes. So you can check they’re working correctly.
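A few of the checks listed above (formatting errors, file size, and HTTP URLs) can also be approximated with a short script. The Python sketch below is a simplified stand-in written for this example, not a description of how Site Audit works, and `audit_sitemap` is an invented name.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_BYTES = 50 * 1024 * 1024  # 50MB uncompressed
MAX_URLS = 50_000


def audit_sitemap(xml_bytes):
    """Return a list of human-readable issues found in a sitemap file."""
    issues = []
    if len(xml_bytes) > MAX_BYTES:
        issues.append("too large: file exceeds 50MB")
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as error:
        # Malformed XML: nothing else can be checked reliably.
        issues.append(f"formatting error: {error}")
        return issues
    locs = [(el.text or "").strip() for el in root.findall("sm:url/sm:loc", NS)]
    if len(locs) > MAX_URLS:
        issues.append(f"too large: {len(locs)} URLs exceeds 50,000")
    insecure = [loc for loc in locs if loc.startswith("http://")]
    if insecure:
        issues.append(f"{len(insecure)} URL(s) use HTTP rather than HTTPS")
    return issues


# Hypothetical sitemap with one insecure URL, for illustration only.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>http://www.example.com/old/</loc></url>
</urlset>"""
print(audit_sitemap(sample))
```

Checks such as non-200 or non-canonical URLs need a request to each page, which is why a dedicated crawler is the more practical option for them.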
To submit your sitemap, open Google Search Console and go to “Indexing” > “Sitemaps.”
Enter your sitemap URL and click “Submit” when you’re done.
When Google has crawled your sitemap, you’ll see a “Success” notice in the “Status” column.
But if you make major changes that you want to be discovered quickly, you can re-submit your sitemap with a new request.
If you’re using a sitemap.xml file generated by your website platform or a specialized tool, it’ll probably meet XML sitemap best practices.
But if you want to make sure, read and understand these guidelines.
First, your sitemap should only reference URLs that:
- You want to be indexed. For example, you shouldn’t include pages from your staging environment. Or the URL for an order confirmation page.
- Return a 200 status code. You shouldn’t attempt to index pages that return other HTTP status codes. Such as 301 redirects (which indicate permanent redirects) or 404 errors (which indicate a page can’t be found).
- Are fully qualified and absolute. In other words, make sure to specify the entire URL with the scheme, authority, and path (e.g., “https://www.semrush.com/blog/”).
- Are canonicals. Canonical URLs represent the sole version of a page or the primary version of a duplicated page.
And your sitemap file should:
- Be UTF-8 encoded. This is a system that ensures search engines can understand all the characters you’re using. Special characters in URLs also need to be escaped. For example, you’ll need to use “&amp;” in place of a “&” symbol.
- Be no larger than 50MB (uncompressed) and contain no more than 50,000 URLs. If necessary, you can create multiple sitemaps and a sitemap index file.
- Specify the correct namespace. A namespace is like a label that tells the search engine what kinds of rules the sitemap follows. Most sitemaps use the “http://www.sitemaps.org/schemas/sitemap/0.9” namespace to show that the file conforms to standards set by sitemaps.org.
- Include language and region variants for each URL (where applicable). You can learn more in this resource from Google.
Lastly, make sure to link to your sitemap from your robots.txt file. This is a website file that tells search engines which pages they should and shouldn’t crawl.
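Several of the guidelines above can be expressed as small helper functions. The Python sketch below is one possible way to do it. The page records (with `url`, `status`, `canonical`, and `indexable` fields) are a hypothetical data shape invented for this example, as are all the function names. In practice that data would come from your CMS or a crawl.

```python
from urllib.parse import urlsplit
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit from the guidelines


def is_absolute(url):
    """True if the URL has an http(s) scheme and an authority (host)."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def belongs_in_sitemap(page):
    """Apply the URL-level rules to one hypothetical page record."""
    return (
        page["indexable"]                      # you want it indexed
        and page["status"] == 200              # not a redirect or an error
        and is_absolute(page["url"])           # fully qualified
        and page["canonical"] == page["url"]   # the page is its own canonical
    )


def escape_loc(url):
    """Entity-escape a URL for use inside <loc>, e.g. & becomes &amp;."""
    return escape(url, {"'": "&apos;", '"': "&quot;"})


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a URL list into groups small enough for one sitemap each."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def robots_sitemap_line(sitemap_url):
    """Line to add to robots.txt so crawlers can find the sitemap."""
    return f"Sitemap: {sitemap_url}"


# Hypothetical page records, for illustration only.
pages = [
    {"url": "https://www.example.com/blog/", "status": 200,
     "canonical": "https://www.example.com/blog/", "indexable": True},
    {"url": "https://www.example.com/old-page/", "status": 301,
     "canonical": "https://www.example.com/blog/", "indexable": True},
    {"url": "/order-confirmation/", "status": 200,
     "canonical": "/order-confirmation/", "indexable": False},
]
keep = [page["url"] for page in pages if belongs_in_sitemap(page)]
print(keep)
print(escape_loc("https://www.example.com/search?q=seo&page=2"))
print(robots_sitemap_line("https://www.example.com/sitemap.xml"))
```

Most XML libraries escape special characters for you when they write a file, so a helper like `escape_loc` mainly matters if you assemble the XML by hand.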
With Semrush’s Site Audit, you can easily check for issues related to your XML sitemap.
The tool also checks for dozens of other issues that can harm your SEO results.