Ondigitals

A Detailed User Guide for Sitemap Generator Tools

SEO

16/02/2023

What Is A Sitemap?

A sitemap is a file that lists the pages of a website, including all the URLs you want search engines to find. Sitemaps help search engines crawl and index the content on the site more efficiently.

Definition of sitemap and sitemap crawler tool

In addition, sitemaps also inform search engines about which pages are important and which URLs should be prioritized in a website, hence optimizing for results shown on SERPs.

HTML, XML and Other Sitemap Types

A popular way to classify sitemaps is by their format: HTML or XML.

HTML sitemap 

An HTML sitemap is designed to help users find and access information on a website. A better user experience can, in turn, support your site rankings.

XML sitemap

An XML sitemap is meant to help search engine bots crawl the site more easily and quickly.
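To make the XML format concrete, here is a minimal sketch in Python that writes a sitemap with the standard library. The function name `build_sitemap` and the example.com URLs are assumptions for illustration only; the `urlset`, `url`, `loc` and `lastmod` elements and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return an XML sitemap as a string.

    `entries` is an iterable of (url, lastmod) pairs; lastmod may be None.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes special characters such as "&" on output.
        ET.SubElement(url, "loc").text = loc
        if lastmod:
            ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical pages, used only to show the output shape.
    print(build_sitemap([
        ("https://example.com/", "2023-02-16"),
        ("https://example.com/blog/", None),
    ]))
```

The result can be saved as sitemap.xml (UTF-8 encoded) in the root folder of the site.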

Others

Another way to classify sitemaps is by 4 main format types:

  • Image sitemap: Contains information related to images stored on your site. This format is used to optimize for Google Images results.
  • Video sitemap: Contains information and content related to the videos on your site.
  • News sitemap: Contains information about the news articles on your site. This format gives the webmaster control over which content is sent to Google News.
  • Mobile sitemap: Used when your site has specific pages to display on mobile devices.

A pro sitemap generator can help create many types of sitemaps

Besides the aforementioned types, we also have sitemap index, sitemap-category.xml, sitemap-products.xml, sitemap-articles.xml, sitemap-tags.xml, among other types.

Why Does Your Website Need A Sitemap?

SEO benefits of sitemaps

Sitemaps play an essential role in any website. They send signals to Googlebot to quickly and automatically index all content on the site.

When Google has trouble reading a submitted sitemap or crawling the URLs in it, Google Search Console reports the errors so that you can fix them, which makes the whole site more search-engine-friendly.

Note: The above benefits only apply to URLs declared in the sitemap. Also, if the sitemap is incorrect (URLs are missing or redundant), search engines get a wrong picture of the website structure, which can affect Google SERP rankings.

Which websites should use XML sitemaps?

Following are types of websites that should use XML sitemaps:

  • New websites, or websites that have a lot of content but lack an internal linking system. In this case, an XML sitemap helps Google discover and index pages faster.
  • E-commerce websites with multiple categories and sub-categories. An XML sitemap will help the bot crawl and show more accurate search results.
  • Websites whose content is often copied or cited by other websites. A sitemap can help Google discover the original page early, which may help it identify the original publishing site.

How To View The Sitemap Of A Website?

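It is usually easy to view the sitemap of a website. Just add “sitemap.xml” (or, on sites that use plugins such as Yoast SEO, “sitemap_index.xml”) to the end of the site address. If no result is returned, the sitemap may be stored at a different address, which is often listed in the site's robots.txt file, or the website may not have created a sitemap.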

E.g. https://ondigitals.com/sitemap_index.xml
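Because a sitemap does not have to be named sitemap.xml, another place to look is the site's robots.txt file, which may declare sitemap addresses on "Sitemap:" lines. The sketch below reads such lines with Python's standard `urllib.robotparser` module (the `site_maps()` method requires Python 3.8 or later). The helper name `sitemaps_from_robots` and the sample robots.txt content are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_from_robots(robots_txt):
    """Return the sitemap URLs declared in the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no "Sitemap:" line is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    # Hypothetical robots.txt content, not taken from a real site.
    sample = (
        "User-agent: *\n"
        "Disallow: /wp-admin/\n"
        "Sitemap: https://example.com/sitemap_index.xml\n"
    )
    print(sitemaps_from_robots(sample))
```

To check a live site, the text of its robots.txt would first need to be downloaded, for example with `urllib.request.urlopen`.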

Let’s look at the detailed instructions on how to make a sitemap for your website, both in HTML and XML structure.

How to create an HTML sitemap

Create an HTML sitemap for WordPress-based website

  • Step 1: Install and activate the plugin WP Sitemap Page or Hierarchical HTML Sitemap.
  • Step 2: Create a new page with a name containing sitemap, then follow the plugin’s instructions to add the shortcode to it.
  • Step 3: Publish the newly created page and check its HTML sitemap.

An example of a sitemap made by WordPress

Create an HTML sitemap manually

An HTML sitemap can also be created manually, using free generator tools to obtain the website data and page list. Many SEO practitioners opt for this method due to its convenience and utility. If you are interested in creating sitemaps online in this way, refer to check-domains.com/sitemap.

Before the tool starts gathering data, paste your site's URL into it and select the HTML sitemap option. Finally, add the generated sitemap to a newly created page on your website to complete the process.

How to create an XML sitemap

XML sitemaps help search engines such as Google locate the pages on your site, including new pages that are not yet well linked from the rest of the site. It is quite simple to create an XML sitemap, and you can choose one of these two ways.

Create an XML sitemap for WordPress-based website

  • Step 1: Go to the Dashboard, click Plugins, then select “Add new”.
  • Step 2: Search for the plugin Yoast SEO. Click Install Now, then select Activate.
  • Step 3: In the SEO section, select General, then click Features.
  • Step 4: Change the item XML Sitemaps to On. Next, you can adjust other items that have the On status. If there is no further action needed, select Save changes to finish.

Create an XML sitemap using XML-Sitemaps.com

  • Step 1: Access http://www.xml-sitemaps.com/. The free version of the generator covers sites of up to 500 pages, which is enough for many small websites.
  • Step 2: Enter the website URL and click Start.
  • Step 3: Once the crawl finishes, you will be presented with a list of sitemap files. Pay attention to important files such as sitemap.xml, sitemap.html, ror.xml and urllist.txt.
  • Step 4: Download the XML sitemap file and open it in a text editor such as Notepad++. Then make any necessary adjustments, such as the priority of the URLs.
  • Step 5: Upload the XML file to your site. Finally, submit the generated sitemap in Google Search Console.

Key Notes When Building A Sitemap

It is now clear that a sitemap is essential for a site to be crawled and indexed correctly, which directly affects the site ranking. But not every sitemap works as intended, so there are some key notes to keep in mind when building a sitemap for your site.

Key notes to generate the best sitemap

Key notes when building sitemaps:

  • A sitemap file should not exceed 50 MB when uncompressed and should contain no more than 50,000 URLs. If the sitemap is too large, split it into smaller files. Search engines may reject a file that exceeds these limits, and serving very large files also puts extra load on your server.
  • If you have more than one sitemap, list all of them in a single sitemap index file for easier management.
  • URLs in the sitemap should not contain session IDs.
  • The sitemap file should be UTF-8 encoded, and special characters in URLs (such as “&”) should be escaped so that the file is easy for crawlers to read.
  • If your website is accessible in both “www” and “non-www” versions, choose one as the canonical version, redirect the other to it, and list only the canonical URLs in the sitemap.
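The 50,000-URL limit and the "single file" that lists all sitemaps can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `chunk_urls` and `build_sitemap_index` and the example.com addresses are hypothetical, while the `sitemapindex`, `sitemap` and `loc` elements come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # limit quoted in the notes above


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (as a string) that points at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = loc
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical site with 120,001 pages.
    urls = [f"https://example.com/page-{n}" for n in range(120_001)]
    chunks = chunk_urls(urls)
    children = [
        f"https://example.com/sitemap-{i + 1}.xml" for i in range(len(chunks))
    ]
    print(len(chunks), "child sitemaps")
    print(build_sitemap_index(children))
```

Each chunk would then be written to its own sitemap file, and only the index needs to be submitted.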

13 Sitemap Optimization Tactics

Here are 13 tips for you to optimize sitemaps quickly.

Use plugins to create sitemaps automatically

Sitemap creation is easier with the right tools, software and plugins such as Google XML Sitemaps and Yoast SEO.

Utilize free tools like Google XML Sitemaps to generate the best sitemap

You can also create a sitemap manually. It does not even need to be in XML format: a plain text file that lists one URL per line also works. However, a complete XML sitemap is still recommended, and it is required if you want to declare hreflang alternates in the sitemap.

So, if you are new to this, we recommend using an automatic sitemap generator to save time.

Visit the page Build and submit a sitemap of Google Search Central for more information on setting up sitemaps manually.
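For the hreflang case mentioned above, the XML sitemap carries `xhtml:link` elements that list every language version of a page. The following is a rough sketch only: the function name `build_hreflang_sitemap` and the example.com URLs are assumptions, and the exact markup should be checked against Google's documentation on localized versions before use.

```python
import xml.etree.ElementTree as ET

SM_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize the sitemap namespace as the default and XHTML as "xhtml:".
ET.register_namespace("", SM_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(groups):
    """Build a sitemap in which each URL lists all of its language alternates.

    `groups` is a list of dicts mapping a language code to the URL of
    that language version of the same page.
    """
    urlset = ET.Element(f"{{{SM_NS}}}urlset")
    for alternates in groups:
        for url in alternates.values():
            node = ET.SubElement(urlset, f"{{{SM_NS}}}url")
            ET.SubElement(node, f"{{{SM_NS}}}loc").text = url
            # Every version lists all versions, including itself.
            for lang, href in alternates.items():
                ET.SubElement(
                    node,
                    f"{{{XHTML_NS}}}link",
                    rel="alternate",
                    hreflang=lang,
                    href=href,
                )
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical English and Vietnamese versions of one page.
    print(build_hreflang_sitemap([
        {"en": "https://example.com/en/", "vi": "https://example.com/vi/"},
    ]))
```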

Submit the sitemap to Google

Just as you can submit individual URLs, you can also use Google Search Console to submit your sitemaps. But don’t rush. Take your time to check the files before clicking submit.

You need to ensure the sitemaps are free of errors and will not negatively affect Google's indexing process.
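A few of these checks can be automated before submission. The sketch below is an assumption about what "free of errors" could cover (well-formed XML, the expected root element, the size and URL-count limits, absolute and non-duplicated URLs); it is not a replacement for the reports in Google Search Console, and the function name `check_sitemap` is hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_BYTES = 50 * 1024 * 1024  # 50 MB uncompressed
MAX_URLS = 50_000


def check_sitemap(xml_bytes):
    """Return a list of human-readable problems found in a sitemap file."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if len(xml_bytes) > MAX_BYTES:
        problems.append("file is larger than 50 MB")
    if root.tag != NS + "urlset":
        problems.append("root element is not <urlset> in the sitemap namespace")

    locs = [(el.text or "").strip() for el in root.iter(NS + "loc")]
    if len(locs) > MAX_URLS:
        problems.append("more than 50,000 URLs")

    seen = set()
    for loc in locs:
        parts = urlparse(loc)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"not an absolute URL: {loc!r}")
        if loc in seen:
            problems.append(f"duplicate URL: {loc!r}")
        seen.add(loc)
    return problems


if __name__ == "__main__":
    # Hypothetical sitemap with one relative URL.
    sample = (
        b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        b"<url><loc>https://example.com/</loc></url>"
        b"<url><loc>/relative-page</loc></url>"
        b"</urlset>"
    )
    for problem in check_sitemap(sample):
        print(problem)
```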

Use Google Search Console to submit your sitemap

When you submit your sitemaps, Google should understand more clearly how your website is structured. This should accelerate the crawling process and ensure proper indexing.

Prioritize putting quality pages in the sitemap

Page quality plays an important role when creating a sitemap. Moreover, it is one of the factors that directly impact the site ranking on Google SERPs.

If your sitemap contains too many low-quality pages, search engines may conclude that the site as a whole is of low quality and rank it lower. Therefore, when creating a sitemap, try to direct the bot to important pages.

To this end, those pages must fulfill the following requirements:

  • Are optimized on all SEO aspects
  • Feature unique content
  • Contain pictures and videos
  • Show user engagement through comments and reviews

Identify and remove URLs that are not indexed

Google does not necessarily index every page listed in a sitemap. Previously, Google Search Console also did not tell you which pages had problems during indexing.

Therefore, webmasters can assign existing pages to different sitemaps to isolate which group contains the problem pages. Then, remove (or fix) the URLs that are not indexed.

Put the canonical version of the URLs in the sitemap

If your website contains many pages with very similar content, use the tag ‘link rel=canonical’ to help Google identify the main page, and list only that page in the sitemap.

Putting the canonical version in the sitemap makes it easier for search engine bots to locate the main page, hence facilitating the crawling and indexing process.
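As a rough sketch of this check, the snippet below reads the ‘link rel=canonical’ tag from a page's HTML and reports whether the page's own URL is the canonical one. The names `CanonicalFinder` and `is_canonical` and the sample HTML are assumptions for illustration; a real check would also need to normalize URLs (trailing slashes, relative hrefs) before comparing them.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def is_canonical(url, html):
    """True if the page declares no canonical URL or declares itself."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical is None or finder.canonical == url


if __name__ == "__main__":
    # Hypothetical product page whose canonical points at another URL.
    page = (
        '<html><head><link rel="canonical" '
        'href="https://example.com/shoes/"></head></html>'
    )
    print(is_canonical("https://example.com/shoes/?color=red", page))
    print(is_canonical("https://example.com/shoes/", page))
```

Only the URLs for which this returns True would be kept in the sitemap.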

Choose the right sitemap template for highest results

Use robots meta tag instead of robots.txt

If you want to keep a page out of the index, use the robots meta tag ‘noindex, follow’. This keeps the URL out of search results while still letting bots follow the links on the page, which preserves the link value.

robots.txt, by contrast, blocks crawling rather than indexing, so it should only be used for pages that you really do not want bots to fetch at all.

Exclude “noindex” URLs from the sitemap

If you don’t want the bot to crawl and index a particular URL, remove it from the sitemap. Grouping unnecessary pages with important ones in a sitemap file will do more harm than help.

So remember to check the entire sitemap to ensure it only consists of URLs that you want to index.
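This check can be partly automated by reading each page's robots meta tag and dropping the URLs marked noindex. The sketch below assumes the HTML of each page is already available as a string; the names `RobotsMetaFinder` and `indexable_urls` and the sample pages are hypothetical. It does not look at the X-Robots-Tag HTTP header, which can also carry a noindex directive.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> style tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() in ("robots", "googlebot"):
            content = attributes.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def indexable_urls(pages):
    """Return the URLs whose HTML does not ask to be kept out of the index.

    `pages` maps each URL to the HTML of that page.
    """
    keep = []
    for url, html in pages.items():
        finder = RobotsMetaFinder()
        finder.feed(html)
        # "none" is shorthand for "noindex, nofollow".
        if not finder.directives & {"noindex", "none"}:
            keep.append(url)
    return keep


if __name__ == "__main__":
    # Hypothetical pages; only the first should stay in the sitemap.
    pages = {
        "https://example.com/": "<head><title>Home</title></head>",
        "https://example.com/cart": (
            '<head><meta name="robots" content="noindex, follow"></head>'
        ),
    }
    print(indexable_urls(pages))
```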

Automate XML sitemap generation for large sites

With large websites, it is virtually impossible to check each and every sitemap URL by hand. Instead, try setting up automated rules that decide when a page is included in the XML sitemap, for example when its robots tag changes from noindex to ‘index, follow’.

Use an automatic XML sitemap generator for large sites

Use XML sitemaps and RSS/Atom feeds

An RSS/Atom feed is an XML file that summarizes the most recently published or updated content on a site. It notifies search engines whenever the site changes or publishes new content.

This helps search engines like Google stay up to date with the latest information on the website. Google also suggests using both RSS/Atom feeds and sitemaps to help Googlebot identify which pages should be indexed.

Use RSS/Atom Feeds and XML sitemap for your website

Only update the “last edited” time in case of a major change

Do not skate on thin ice by updating the “last edited” time (the lastmod value) of your content in the hope of better indexing results when there is no significant change. The bot is more intelligent than we think or would like to believe.

When it discovers pages that keep updating their date but provide no additional value, Google may stop trusting those dates and may remove the publishing date of the page from search results. In short, don’t touch that button unless your pages really have gone through a notable makeover.
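One way to enforce this rule in a sitemap generator is to change the date only when the page text has changed by more than some threshold. The sketch below uses Python's `difflib.SequenceMatcher` for the comparison. The function name `next_lastmod` and the 0.9 similarity threshold are assumptions chosen for illustration, not values recommended by Google.

```python
from difflib import SequenceMatcher


def next_lastmod(old_text, new_text, old_lastmod, today, threshold=0.9):
    """Return the lastmod value to publish for a page.

    The old date is kept unless the similarity between the old and new
    text drops below `threshold`, that is, unless the change is substantial.
    """
    matcher = SequenceMatcher(None, old_text, new_text, autojunk=False)
    if matcher.ratio() < threshold:
        return today
    return old_lastmod


if __name__ == "__main__":
    old = "A sitemap is a file that lists the pages of a website."
    typo_fix = "A sitemap is a file that lists the pages of a web site."
    rewrite = "Completely new guide covering video, image and news sitemaps."
    print(next_lastmod(old, typo_fix, "2023-02-16", "2023-03-01"))
    print(next_lastmod(old, rewrite, "2023-02-16", "2023-03-01"))
```

Character-level comparison is slow on very long pages, so a production version would more likely compare paragraphs or content hashes.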

Pay little mind to priority settings

Some sitemaps have a Priority column to tell bots which pages are the most important. However, Google has stated that it ignores the priority value, and whether other search engines use it is still open to debate. So don’t let it occupy your mind.

Minimize the size of the sitemap file

The smaller the sitemap file, the less strain on the server. Even if Google increases the maximum sitemap file size for your website, don’t be tempted. Keep it neat and small enough to host your main landing pages.

Optimize your sitemap by using a checker tool

Create multiple sitemaps if your site has over 50,000 URLs

Each sitemap should not exceed 50,000 URLs. In fact, this number is more than enough for a website operating in Vietnam. But if yours consists of over 50,000 URLs, just split them into multiple sitemaps.

Why Should We Split Sitemaps?

Here are the reasons why splitting sitemaps is a recommended practice for your site.

Benefits of splitting sitemaps

Updated content is usually added to a sitemap from newest to oldest, so when Google crawls the sitemap file, it starts with the latest content.

However, if the sitemap contains tens of thousands of links, it will take Google a long time to process the whole file. The problem gets worse when you publish new content continuously, because Google has to download the entire large file again and again to find and evaluate the new entries.

How to make a sitemap that facilitates indexing

How to split sitemaps

The recommended number is 500 URLs per sitemap file if you use plugins for sitemap creation. Another way is to split the big sitemap into smaller parts and classify them by content type such as video sitemap, image sitemap, article sitemap, and so on.
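Splitting by content type can be sketched as grouping URLs by the first segment of their path. This assumes the site's URL structure reflects its content types, which is not true of every site; the function name `group_by_section`, the file naming scheme and the example.com URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_section(urls, default="pages"):
    """Group URLs into sitemap files named after their first path segment.

    URLs with fewer than two path segments (the home page, "/about")
    fall into the `default` group.
    """
    groups = defaultdict(list)
    for url in urls:
        segments = [part for part in urlparse(url).path.split("/") if part]
        section = segments[0] if len(segments) > 1 else default
        groups[f"sitemap-{section}.xml"].append(url)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical URLs from a small shop with a blog.
    urls = [
        "https://example.com/",
        "https://example.com/about",
        "https://example.com/products/red-shoes",
        "https://example.com/products/blue-shoes",
        "https://example.com/articles/what-is-a-sitemap",
    ]
    for filename, members in group_by_section(urls).items():
        print(filename, len(members))
```

Each group can then be written to its own file and listed in a sitemap index.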

Conclusion

We have reached the end of today’s sitemap generator guide. We hope you have obtained valuable information about what a sitemap is, what its benefits are and how to make one that works for your website.

To learn about the latest information, knowledge and services concerning SEO, visit On Digitals. Or contact us today to receive tailored consultancy on your specific business case.

