Blocking all search engines except the big ones

I would love to have some way to block all search engines other than Google, Yahoo & Bing (and their related sites, like Google Images) from crawling my website, since they consume a lot of server resources and bandwidth but don't bring any traffic.

Is this easy or hard to do? It would be great if someone maintained a list of small search engines that could be pasted into a robots.txt file to block them.

Also, I understand I can't block spiders that ignore robots.txt, or sites that surreptitiously scrape and crawl, but that's not what I want. I just want to block all the Altavistas, Hotbots, and Lycoses (do these even still exist?) and the university-experiment crawlers from wasting my time.

2019-05-06 22:29:25
Answers: 3

What have you tried so far?

Using the Webmaster Tools robots.txt generator, I made this:

User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /

But I haven't tested it.
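Since the question also wants Yahoo and Bing allowed, the same pattern extends to their crawlers. Crawlers follow the most specific `User-agent` group that matches them, so the named bots obey their `Allow` while everything else falls through to the blanket `Disallow`. Yahoo's crawler identifies itself as `Slurp` and Bing's as `bingbot`:

```
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: Slurp
Allow: /

User-agent: bingbot
Allow: /
```

Note this is purely advisory: it only keeps out crawlers that choose to honor robots.txt.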

2019-05-08 18:43:43

How big of a problem is it, really?

The crawlers you should actually be worried about are the ones that don't follow the rules and pretend to be normal visitors.

Search engine traffic is legitimate and, as Dan mentioned, Google itself started as a small university project. It isn't really fair to discriminate against the little guys, and it's probably not smart in the long run either.

Kinopiko's answer will work, and Google's Webmaster Tools will let you create and test your robots.txt (Site configuration, Crawler access), but I think that if traffic from legitimate search engines is a problem for you, your current hosting solution may not be a good deal.

2019-05-08 18:41:56

For the ones that don't follow the rules, you can look for them in your logs and then block them by IP.
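As a sketch of the blocking step: on an Apache 2.4 server you can deny an offending address range in `.htaccess` (the IP range below is a made-up example from the documentation block, not a known bad crawler):

```
# Allow everyone except the example range 203.0.113.0/24
<RequireAll>
    Require all granted
    Require not ip 203.0.113.0/24
</RequireAll>
```

Blocking at the firewall (e.g. iptables) works too and saves the web server from handling the request at all.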

Often you can spot a crawler by the fact that it reads pages too quickly to be human.
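The "too fast to be human" check can be sketched in a few lines. This is a hypothetical example, not a hardened tool: the log lines here are simplified to an IP and an ISO timestamp, whereas a real access log (e.g. Apache combined format) would need proper parsing, and the rate threshold is an assumption you would tune:

```python
from collections import defaultdict
from datetime import datetime

# Simplified log: "IP ISO-timestamp" per line (203.0.113.x / 198.51.100.x
# are documentation example addresses, not real visitors).
log_lines = [
    "203.0.113.5 2019-05-06T22:29:00",
    "203.0.113.5 2019-05-06T22:29:01",
    "203.0.113.5 2019-05-06T22:29:02",
    "198.51.100.7 2019-05-06T22:29:00",
    "198.51.100.7 2019-05-06T22:35:00",
]

def fast_requesters(lines, max_rate=0.5):
    """Return IPs averaging more than max_rate requests per second."""
    times = defaultdict(list)
    for line in lines:
        ip, ts = line.split()
        times[ip].append(datetime.fromisoformat(ts))
    flagged = []
    for ip, stamps in times.items():
        if len(stamps) < 2:
            continue  # a single hit tells us nothing about request rate
        span = (max(stamps) - min(stamps)).total_seconds()
        if span > 0 and len(stamps) / span > max_rate:
            flagged.append(ip)
    return flagged

print(fast_requesters(log_lines))  # → ['203.0.113.5']
```

The first IP made three requests in two seconds and gets flagged; the second made two requests six minutes apart and does not.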

2019-05-08 18:36:33