What exactly is the Robots.txt file?
Many times, for a number of reasons, the owner of a website might not want search engines to access certain parts of the site.
Some reasons are: to reduce the load on the web server, to keep certain parts of the website out of search engine results, to prevent outdated information from being shown in the search results, etc.
To do this, the website owner uses a protocol known as the Robots Exclusion Standard (the robots.txt file) to advise search engine spiders not to access those parts of the site. The owner creates a robots.txt file (a normal text file that can be created and viewed with Notepad or a similar text editor) and uploads it to the root folder of the website.
Now, when the crawler of any search engine tries to access the website, it will first look for the site’s robots.txt to see if it is being prevented from accessing any part of the website. Suppose it finds something like this:
User-agent: *
Disallow: /example/
The web crawler then won’t access those parts of the website.
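To make this concrete, here is a minimal sketch of how a well-behaved crawler could read that rule, using Python's standard `urllib.robotparser` module. The domain `example.com`, the page paths, and the bot name `AnyBot` are placeholders chosen for illustration only.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule set from the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /example/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# example.com and AnyBot are placeholders, not real crawl targets.
blocked = parser.can_fetch("AnyBot", "https://example.com/example/page.html")
allowed = parser.can_fetch("AnyBot", "https://example.com/blog/post.html")

print(blocked)  # False: the path is under /example/
print(allowed)  # True: no rule matches this path
```

A crawler that follows the standard would skip the first URL and fetch the second.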
It must be noted that robots.txt is purely advisory in nature and is useless if the crawler accessing the site does not adhere to the standard. Thus, this method is no good for hiding websites and web pages from search engines (they can discover pages in other ways as well, such as links from other web pages), and it is also no use against malicious robots (which scan the web for security vulnerabilities) or against email address harvesters (bots used by spammers to collect email addresses).
How to add robots.txt to WordPress?
You can use a WordPress plugin such as the Yoast SEO plugin to edit the WordPress robots.txt file from the dashboard. You should also add your sitemap to the robots.txt file, as it helps search engines find the sitemap file and index your pages quickly.
Follow these steps to edit the robots.txt file:
- Install the Yoast plugin and go to Tools
- When the Tools section opens, click File editor
- Paste your robots.txt directives here and save
If, for some reason, you want to keep search crawlers out of certain folders, such as your images, support, and cgi-bin folders, add a Disallow directive for each of them.
Copy the following code and paste it into the file editor:
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /comments/feed/
Disallow: /trackback/
Disallow: /index.php
Disallow: /xmlrpc.php
Disallow: /wp-content/plugins/

User-agent: NinjaBot
Allow: /

User-agent: Mediapartners-Google*
Allow: /

User-agent: Googlebot-Image
Allow: /wp-content/uploads/

User-agent: Adsbot-Google
Allow: /

User-agent: Googlebot-Mobile
Allow: /

Sitemap: https://dailyblogscoop.com/sitemapindex.xml
Sitemap: https://dailyblogscoop.com/sitemap-image.xml
Remember, the order in which you disallow the folders is not important, but they should always come after the user agent.
If you want to disallow further folders from being accessed, feel free to follow this template and keep adding Disallow directives.
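If you add folders often, the template above can be generated rather than typed by hand. The following is a rough sketch, not a definitive tool: `build_robots_txt` is a hypothetical helper name, it handles folders only (not single files such as /index.php), and it always writes the user agent line first, as the rule above requires.

```python
def build_robots_txt(disallowed, sitemaps=(), user_agent="*"):
    """Return robots.txt text with one Disallow line per folder."""
    # The User-agent line must come before its Disallow lines.
    lines = [f"User-agent: {user_agent}"]
    for folder in disallowed:
        # Normalise so every folder path starts and ends with a slash.
        path = "/" + folder.strip("/") + "/"
        lines.append(f"Disallow: {path}")
    if sitemaps:
        lines.append("")  # blank line before the sitemap entries
        lines.extend(f"Sitemap: {url}" for url in sitemaps)
    return "\n".join(lines) + "\n"


# Folder names taken from the template above.
print(build_robots_txt(["cgi-bin", "/wp-admin/", "trackback"]))
```

Passing a list of sitemap URLs as the second argument appends the matching `Sitemap:` lines at the end of the file.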
Alternatively, you can upload the file over FTP.
If you are running a static site, you need to create the robots.txt file manually. Follow these steps to upload a robots.txt file over FTP:
Go to a robots.txt file generator and generate a file.
After you have created the robots.txt to your satisfaction, upload it to the root folder of your site using any FTP software.
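If you would rather script the upload than use an FTP client, Python's standard `ftplib` module can do it. This is only a sketch under assumptions: the host `ftp.example.com`, the credentials, and the `/public_html` folder are placeholders, and the root folder of your site may be named differently on your host. `upload_robots_txt` is a hypothetical helper name.

```python
from ftplib import FTP


def upload_robots_txt(ftp, local_path="robots.txt", remote_dir="/"):
    """Upload local_path as robots.txt into remote_dir over an open FTP session."""
    ftp.cwd(remote_dir)  # move to the site's root folder
    with open(local_path, "rb") as fh:
        # STOR overwrites any robots.txt that is already there.
        ftp.storbinary("STOR robots.txt", fh)


def main():
    # Placeholder host, credentials, and folder; replace them with your own.
    with FTP("ftp.example.com") as ftp:
        ftp.login("username", "password")
        upload_robots_txt(ftp, "robots.txt", remote_dir="/public_html")
```

Calling `main()` performs the upload. The helper takes an already open connection, so it can be reused with whatever login details your host gives you.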
To make sure that no content on your site has been inadvertently affected by editing and updating the robots.txt file, you can use the “Fetch as Googlebot” feature in Google Webmaster Tools (now called Google Search Console) to check whether your content can still be accessed by Google’s crawler.
Simply log in to your Google Webmaster account, go to ‘Diagnostics’, and select the ‘Fetch as Googlebot’ option.
You can also check for crawl errors with this tool: select the crawl errors option from the Diagnostics menu and then select “restricted by robots.txt”.
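You can also run a quick local check before relying on Google's tools. The sketch below lists which of your important pages a given robots.txt would block. The `find_blocked` helper, the sample rules, and the `example.com` URLs are all illustrative assumptions, and Python's parser may not interpret every rule exactly as Googlebot does (for example, wildcard patterns), so treat the result as a first pass only.

```python
from urllib.robotparser import RobotFileParser


def find_blocked(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that robots_txt forbids user_agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Sample rules and pages, for illustration only.
rules = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /trackback/
"""
pages = [
    "https://example.com/",
    "https://example.com/wp-admin/options.php",
    "https://example.com/about/",
]

blocked_pages = find_blocked(rules, pages)
print(blocked_pages)  # ['https://example.com/wp-admin/options.php']
```

To test a live site instead of a string, `RobotFileParser` also provides `set_url()` and `read()` to download the file first.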
Optimizing WordPress Robots.txt file
Now that we have got those basic concepts out of the way and brought everyone up to speed, let’s cut to the chase.
As we already explained, robots.txt files only work with crawlers that observe the Robots Exclusion Standard, and even then they do not prevent those crawlers from seeing the content.
The file simply advises them what to do with the content, which almost always means asking them not to crawl it so that it stays out of search results.
If you block too much, a search engine may take it that your content is not engaging and vibrant enough, since you yourself don’t want it to be shown, and this may lower your position in its search result listings. Here lies the basic problem of optimizing robots.txt for SEO: finding the right balance so as not to harm your search listing.
If you are using the WordPress SEO plugin, you should also know that it is not absolutely necessary to have a physical robots.txt file on your site, because WordPress already serves a virtual robots.txt. However, if you upload a physical robots.txt file, there will be no compatibility issues: the physical file is simply served instead of the virtual one.
As you will remember from our earlier discussion of robots.txt, we mentioned the following syntax:
User-agent: *
Disallow: /example/
User-agent names the search engine web crawler that the rules below it apply to.
By putting an asterisk after it, we are indicating that we are addressing our robots.txt file to all search engine crawlers. Disallow tells the search engines which paths are to be avoided. In addition to Disallow, there are directives such as Allow, Sitemap, Host, and Crawl-delay.
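Sitemap: It tells crawlers the location of the sitemap of your website.
Host: It defines your preferred domain name if you have a number of similar sites (called mirror sites). It is not part of the original standard, and many crawlers ignore it.
Crawl-Delay: It sets the time interval, in seconds, between the requests made to your server by the search engine. Not every crawler honors it; Googlebot, for example, ignores it.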
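To see how a crawler might pick these values up, here is a small sketch using Python's `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). The rules, the 10-second delay, and the `example.com` sitemap URL are made up for illustration. Host is left out because this parser does not expose it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative file; the values are not recommendations.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# AnyBot is a placeholder name; it falls under the "*" group.
delay = parser.crawl_delay("AnyBot")
sitemaps = parser.site_maps()

print(delay)     # 10
print(sitemaps)  # ['https://example.com/sitemap.xml']
```

The Sitemap line is independent of the user agent groups, which is why it can sit anywhere in the file.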
That’s it, guys. I believe you have now learned about editing and optimizing the WordPress robots.txt file for SEO. It’s important to note that there is no fixed formula for effectively optimizing the WordPress robots.txt. The content on your site is also a significant factor. So use the tips mentioned in this article and tune your robots.txt properly so that your site is well optimized for search engines.