If you are facing an indexing issue and want to enable a custom robots.txt file on your Blogger website, this article should be useful to you.
Here I explain what a robots.txt file is and why it matters for SEO. I'll also show you how to customize the robots.txt file on your Blogger site so that your articles get indexed faster.
What is a robots.txt file?
The robots.txt file tells crawlers or search engine bots which URLs on your site they may access and crawl.
It is mainly used to avoid overloading your site with crawl requests and to conserve server bandwidth.
With it, you can keep crawlers away from unwanted pages while leaving your important pages open to them.
The robots.txt file is part of the Robots Exclusion Protocol (REP), a group of web standards that regulate how robots or crawlers crawl the Internet, access and index content, and deliver that content to users.
A robots.txt file is usually placed in the root directory of a website and can be accessed with a URL like this:
https://example.com/robots.txt
In the same way, you can check the robots.txt file of your Blogger website by adding /robots.txt after your homepage URL, as shown in the example above.
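As a small sketch of that rule, the robots.txt address can be derived from any page URL with Python's standard library. The helper name `robots_url` and the post URL are made up for illustration; only the "root of the site" rule comes from the text above.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # robots.txt sits at the root of the host, whatever page we start from,
    # so the original path, query string and fragment are dropped.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical post URL on the example domain used in this article.
print(robots_url("https://example.com/2024/01/my-post.html"))
```

Pasting the printed address into a browser should show the live robots.txt file for that site.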
Default Robots.txt file structure
The basic format for a robots.txt file is as follows.
User-agent: [user-agent name]
Disallow: [URL string not to be crawled]
A single robots.txt file can contain multiple user-agent lines and directives (for example, Disallow, Allow, Crawl-delay, etc.).
Each set of user-agent directives is written as a separate group, and groups are separated from each other by a blank line.
There are five common terms used in robots.txt.
User-agent: Specifies the web crawler you are giving crawling instructions to (usually a search engine).
Disallow: Instructs the user agent not to crawl a specific URL. Each URL needs its own "Disallow:" line.
Allow: Tells a crawler that it may access a page or subfolder even if its parent page or subfolder is disallowed. It is not limited to Googlebot: the directive is part of the standardized Robots Exclusion Protocol (RFC 9309).
Crawl-delay: Specifies how many seconds the web crawler should wait before loading and crawling the content of another page. It is used to reduce the load on the hosting server. Note that Googlebot ignores this directive.
Sitemap: Specifies the location of the XML sitemaps associated with the site. Please note that this directive is only supported by Google, Ask, Bing, and Yahoo.
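To see how these five terms behave together, here is a minimal sketch that feeds a small sample file to Python's built-in `urllib.robotparser`. The sample rules and URLs are invented for illustration, and Python's parser is only one interpretation of the protocol (it applies the first matching rule in file order), so real search engines may resolve conflicts differently.

```python
from urllib.robotparser import RobotFileParser

# Invented sample file that uses all five terms described above.
SAMPLE = """\
User-agent: *
Allow: /private/public-note.html
Disallow: /private/
Crawl-delay: 5
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE.splitlines())

# Disallow blocks everything under /private/ ...
print(parser.can_fetch("*", "https://example.com/private/secret.html"))       # False
# ... except the single page that Allow opens up again.
print(parser.can_fetch("*", "https://example.com/private/public-note.html"))  # True
# Crawl-delay and Sitemap are exposed as plain values.
print(parser.crawl_delay("*"))  # 5
print(parser.site_maps())       # ['https://example.com/sitemap.xml']
```

The Allow line is placed before the Disallow line on purpose, so the result is the same under first-match and longest-match interpretations.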
How do I enable a robots.txt file on Blogger?
To add a custom robots.txt file in Blogger, follow these steps.
- Step 1. Go to Blogger Settings and find the "Crawlers and indexing" section.
- Step 2. Here, enable the option "Include custom robots.txt".
- Step 3. Now click on the option below it and paste the code below into your custom robots.txt field.
- Step 4. Now save the code and the robots.txt file will be added to your Blogger website.
Robots.txt CODE
User-agent: *
Disallow: /search
Disallow: /category/
Disallow: /tag/
Allow: /
Sitemap: https://www.example.com/atom.xml?redirect=false&start-index=1&max-results=500
Now you can check whether this is implemented correctly by opening this URL:
https://www.yourdomain.com/robots.txt
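Besides opening the URL in a browser, you can sanity-check the rules offline before relying on them. The sketch below runs the same robots.txt code through Python's `urllib.robotparser`. The helper `is_crawlable` and the sample post and search URLs are assumptions for illustration, and the result reflects how Python's parser reads the rules, not a guarantee of how any particular search engine behaves.

```python
from urllib.robotparser import RobotFileParser

# The custom robots.txt code from this article, with the example.com placeholder.
BLOGGER_RULES = """\
User-agent: *
Disallow: /search
Disallow: /category/
Disallow: /tag/
Allow: /
Sitemap: https://www.example.com/atom.xml?redirect=false&start-index=1&max-results=500
"""


def is_crawlable(url: str, rules: str = BLOGGER_RULES, agent: str = "*") -> bool:
    """Return True if `agent` may crawl `url` according to `rules`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URLs: a normal post, a label page and a search result page.
print(is_crawlable("https://www.example.com/2024/01/my-post.html"))  # True
print(is_crawlable("https://www.example.com/search/label/SEO"))      # False
print(is_crawlable("https://www.example.com/search?q=robots"))       # False
```

If a post URL unexpectedly comes back as False, one of the Disallow lines is probably broader than intended.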
If you have more than 500 pages in your sitemap, you can add multiple Sitemap lines to your robots.txt file, one for each block of 500. Just change the start-index value to 501, 1001, and so on.
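Writing those extra lines by hand is error-prone, so here is a rough sketch that generates them. The function name `sitemap_lines` and its parameters are hypothetical. The URL pattern is copied from the robots.txt code above, and the block size of 500 follows the limit mentioned in this article.

```python
def sitemap_lines(blog_url: str, total_posts: int, page_size: int = 500) -> list[str]:
    """Build one 'Sitemap:' line per block of `page_size` posts."""
    base = blog_url.rstrip("/")
    lines = []
    # start-index is 1-based, so the blocks begin at 1, 501, 1001, ...
    for start in range(1, total_posts + 1, page_size):
        lines.append(
            f"Sitemap: {base}/atom.xml?redirect=false"
            f"&start-index={start}&max-results={page_size}"
        )
    return lines


# Example: a blog with 1200 posts needs three Sitemap lines.
for line in sitemap_lines("https://www.example.com", 1200):
    print(line)
```

Paste the printed lines at the end of your custom robots.txt code in place of the single Sitemap line.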
Now that you've set up a custom robots.txt file on your Blogger website, you can also set up custom robots header tags.
Just enable this option, click on the homepage tags, select all and noodp, and save the settings. Then do the same for the other two groups using the values below.
USE THESE SEO SETTINGS
- Homepage tags: all, noodp
- Archive and search page tags: noindex, noodp
- Post and page tags: all, noodp
Why do you need robots.txt?
- Preventing duplicate content from appearing in search results.
- Keeping crawlers away from private sites such as staging sites.
- Specifying the location of your sitemap(s).
- Preventing search engines from crawling certain files on your website (such as premium images or PDFs).
- Specifying a crawl delay to prevent servers from being overloaded when crawlers load multiple pieces of content at the same time.