Robots.txt: hiding sensitive pages on your website

Search engines are continually indexing the World Wide Web. They deploy efficient crawler programs to seek out webpages and index them for better search results.

However, a website may have some sensitive pages that we recommend site owners keep out of search engine indexes and results, as displaying them could pose a security risk. One such page is our Content Management System (CMS) login page.

Should a hacker find out the link to your CMS login, they could try to brute-force their way into your CMS and take control of your website.

Fortunately, there is a way to ‘tell’ search engines not to display these sensitive pages: a file called robots.txt. In the file, you can list the webpages you do not want search engines to index and make discoverable.
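To make the file's contents concrete, here is a minimal Python sketch that builds the text of a robots.txt file from a list of paths. The paths `/cms/login/` and `/cms/admin/` and the helper name `build_robots_txt` are made up for illustration; substitute the paths your own site uses.

```python
# Hypothetical paths to keep out of search crawls; replace with your own.
DISALLOWED_PATHS = ["/cms/login/", "/cms/admin/"]


def build_robots_txt(paths, user_agent="*"):
    """Return robots.txt text asking the given user agent to skip each path."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt(DISALLOWED_PATHS)
print(robots_txt)
```

The resulting text is saved as a file named robots.txt at the root of the site (for example, `https://example.com/robots.txt`). `User-agent: *` addresses all crawlers, and each `Disallow` line names a path they are asked to skip.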

The robots.txt file is essential to search engines too. While indexing your website, a search engine's crawler bot will also look for the robots.txt file on your site. It will take a peek into the file to see if there are any webpages it should avoid displaying. If there is nothing in robots.txt, it will, by default, make all pages discoverable.
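As a rough illustration of that check, the sketch below uses `urllib.robotparser` from Python's standard library to read robots.txt rules the way a well-behaved crawler might. The rules, the `example.com` URLs, and the helper name `is_allowed` are assumptions made for the example, not taken from any particular search engine.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cms/login/
"""


def is_allowed(robots_text, url, user_agent="*"):
    """Return True if the rules in robots_text let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


# The listed path is off limits; other pages stay open.
blocked = is_allowed(ROBOTS_TXT, "https://example.com/cms/login/")
open_page = is_allowed(ROBOTS_TXT, "https://example.com/about/")

# With no rules at all, every page is treated as allowed.
default_open = is_allowed("", "https://example.com/cms/login/")

print(blocked, open_page, default_open)
```

Note that robots.txt is a request honored by well-behaved crawlers rather than an access control, so the login page itself still needs a strong password and other protections.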

Displaying a sensitive page to the wrong audience (e.g., a hacker) could result in that page being hacked, leading to a compromised site, something no site owner or search engine wants.

Hence, search engines need your help to keep the Internet a safer place. They need site owners to specifically list the webpages they do not wish to be displayed.

Do reach out to us if you need any assistance in this area.
