lateral Posted October 6, 2014

Hi guys, I have a staging/test domain set up for my shop. I do not want Google or any other crawler to crawl it, or even know about it. Is there a simple way to guarantee that they can't crawl it?
parsifal Posted October 7, 2014

Hello lateral.

1. You could use a robots.txt file: http://www.robotstxt.org/faq/prevent.html. But there is no guarantee here: crawler bots can ignore the robots.txt file if their creators choose to.

2. You could configure your web server to deny HTTP requests whose User-Agent header differs from those of well-known browsers. This is also not a sure way, since a bot can easily masquerade as a browser.

3. You could implement simple username/password authentication at the web server level. Example: http://www.siteground.com/tutorials/cpanel/pass_protected_directories.htm. This would stop virtually all bots.
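For option 1, a minimal robots.txt that asks every crawler to stay out of the whole site would be placed at the domain root (e.g. https://staging.example.com/robots.txt, a hypothetical staging hostname). Well-behaved bots such as Googlebot honor it, but it is advisory only, and it publicly reveals that the site exists:

```
# robots.txt at the site root
# Applies to all crawlers; disallows every path
User-agent: *
Disallow: /
```

Note that because robots.txt is world-readable, some people consider listing a secret staging site in it counterproductive; options 2 and 3 do not have that drawback.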
lateral Posted October 7, 2014 (Author)

Thanks for the ideas.
lateral Posted May 12, 2015 (Author)

I set up password protection via an .htaccess file. Works great!
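For anyone finding this thread later, the usual Apache setup behind that cPanel tutorial is a sketch like the following, assuming Apache with mod_auth_basic enabled and a password file at /home/user/.htpasswd (both the path and the realm name here are illustrative, not from the original post):

```
# .htaccess in the staging site's document root
AuthType Basic
AuthName "Staging site - authorized users only"
# Password file created beforehand with: htpasswd -c /home/user/.htpasswd someuser
AuthUserFile /home/user/.htpasswd
Require valid-user
```

Keep the .htpasswd file outside the document root so it can never be served directly, and note that Basic Auth sends credentials base64-encoded, so the staging domain should be served over HTTPS.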