-
Bot/Spider protection?
Any take on how to keep unwanted guests away?
I tried a few automatic ways, but now I rely more and more on just configuring the server and the firewall with data from logwatch.
(i.e. any kind of stupid behavior that I notice gets you blacklisted).
My nightly logwatch can look like this:
Code:
################### Logwatch 7.4.3 (12/07/16) ####################
Processing Initiated: Sat Nov 24 06:25:16 2018
Date Range Processed: yesterday
( 2018-Nov-23 )
Period is day.
Detail Level of Output: 0
Type of Output/Format: mail / text
Logfiles for Host: server.fantasy-freak.com
##################################################################
--------------------- Amavisd-new Begin ------------------------
2 Total messages scanned ------------------ 100.00%
33.508K Total bytes scanned 34,312
======== ==================================================
2 Passed ---------------------------------- 100.00%
2 Clean passed 100.00%
======== ==================================================
2 Ham ------------------------------------- 100.00%
2 Clean passed 100.00%
======== ==================================================
---------------------- Amavisd-new End -------------------------
--------------------- httpd Begin ------------------------
A total of 2 sites probed the server
213.186.170.226
5.188.210.12
Requests with error response codes
400 Bad Request
null: 3 Time(s)
/: 2 Time(s)
http://5.188.210.12/echo.php: 1 Time(s)
mstshash=Administr: 1 Time(s)
403 Forbidden
/: 244 Time(s)
:
499 (undefined)
/index.php?action=dlattach;attach=81;type=avatar: 1 Time(s)
---------------------- httpd End -------------------------
--------------------- Postfix Begin ------------------------
2 Miscellaneous warnings 2
33.509K Bytes accepted 34,313
34.831K Bytes delivered 35,667
34.540K Bytes forwarded 35,369
======== ==================================================
2 Accepted 100.00%
-------- --------------------------------------------------
2 Total 100.00%
======== ==================================================
6 Removed from queue 6
2 Delivered 2
2 Forwarded 2
126 Postscreen 126
---------------------- Postfix End -------------------------
--------------------- SSHD Begin ------------------------
Users logging in through sshd:
rille:
192.168.1.99 (LAPTOP): 3 times
**Unmatched Entries**
syslogin_perform_logout: logout() returned an error : 3 time(s)
---------------------- SSHD End -------------------------
--------------------- Disk Space Begin ------------------------
Filesystem Size Used Avail Use% Mounted on
/dev/root 30G 4.7G 24G 17% /
/dev/mmcblk0p1 43M 22M 21M 52% /boot
---------------------- Disk Space End -------------------------
###################### Logwatch End #########################
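
For anyone who wants to semi-automate the "stupid behavior gets you blacklisted" step, here is a rough Python sketch. It assumes the web server writes a common/combined-format access log; the sample lines, the threshold and the function name `find_offenders` are made up for illustration and are not taken from the server above. It only lists candidate IPs; feeding them to the firewall is left to whatever tool you already use.

```python
import re
from collections import Counter

# Matches the client IP and the status code of a common/combined-format log line.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) ')


def find_offenders(lines, threshold=20, bad_statuses=("400", "403")):
    """Return {ip: count} for clients with at least `threshold` error responses."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group(2) in bad_statuses:
            counts[m.group(1)] += 1
    return {ip: n for ip, n in counts.items() if n >= threshold}


# Invented sample lines (timestamps and sizes are illustrative only).
sample = [
    '5.188.210.12 - - [23/Nov/2018:10:00:00 +0100] "GET / HTTP/1.1" 403 162',
] * 3 + [
    '192.168.1.99 - - [23/Nov/2018:10:00:01 +0100] "GET /index.php HTTP/1.1" 200 5120',
]

if __name__ == "__main__":
    for ip, hits in find_offenders(sample, threshold=3).items():
        print(f"{ip} had {hits} rejected requests; candidate for the blacklist")
```

A threshold well above what a normal visitor could trigger by accident is probably wise, since a blacklisted address may be shared by legitimate users.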
-
English language questions as captchas have always gotten me the most success. I've never cared about preventing access to bots though, just stopping spam.
edit: Every time bots have gotten through the system, it's been because they reposted the question on Yahoo Answers or an equivalent site and copy-pasted the answer given there. You prevent that by asking questions specifically about your own site/forum, like "What's the name of the second forum category on this forum?" They still repost the question on Yahoo Answers, but the humans there can't answer it without knowing which site the question was copied from, so it's safe.
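
A minimal sketch of such a site-specific question check, in Python. The question text, the accepted answer "general discussion" and the helper names are all hypothetical; real forum software usually has this built into its registration settings, so treat this only as an illustration of the idea.

```python
import random

# Hypothetical question and answer; the real ones must come from your own forum.
QUESTIONS = {
    "What is the name of the second forum category on this forum?": {
        "general discussion",
    },
}


def normalize(text):
    """Lowercase and collapse whitespace so small typing differences still match."""
    return " ".join(text.lower().split())


def pick_question(rng=random):
    """Choose one of the configured questions to show on the signup form."""
    return rng.choice(list(QUESTIONS))


def check_answer(question, answer):
    """Return True only if the answer matches one accepted for that question."""
    return normalize(answer) in QUESTIONS.get(question, set())
```

The answer is only discoverable by someone looking at the forum itself, which is the whole point: a reposted question gives outside helpers nothing to go on.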
-
Usually does the trick.
I find that the most annoying scanners etc. come from Ukraine/Russia, China, Korea, Germany and the U.S.
The question is whether to allow indexers or not. They can be hard to keep out without requiring a login.
-
Most indexers obey the robots.txt file, but they're generally harmless even if they don't.
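
To see what a given robots.txt actually tells a well-behaved crawler, Python's standard `urllib.robotparser` can evaluate it. The rules below are an invented example (the `BadBot` name and the `/private/` path are assumptions, not from this thread), and this only predicts what obedient indexers will do; it blocks nothing by itself.

```python
from urllib.robotparser import RobotFileParser

# Example rules: shut out one named crawler, keep everyone else out of /private/.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if the rules permit `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("BadBot", "Googlebot"):
        for path in ("/index.php", "/private/notes.html"):
            print(agent, path, is_allowed(agent, path))
```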