On the Ubuntu users mailing list,
"Server monitoring" could mean just about anything, but to this person, apparently, responding to pings is sufficient. A server could respond to pings while some sub-system, such as the database server, web server, or mail server, is down for any number of reasons. The greater the number of components, the greater the number of points of failure. Most modern web sites have some sort of content management system (CMS) behind them, even if it is not necessarily called a CMS, so proving that the site is running requires that the monitoring tool simulate, as closely as possible, a person hitting the site with a web browser.
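To illustrate the gap between "answers pings" and "serves the page", a check can fetch a page and look for text that only appears when the CMS and database behind it are working. What follows is a minimal sketch for Python 3 (urllib.request is the Python 3 successor to urllib2), not a definitive implementation. The function names, the marker text, and the timeout are assumptions made for illustration.

```python
from urllib.request import urlopen


def body_looks_healthy(status, body, marker):
    """Return True if the status is 200 and the marker text is in the page."""
    return status == 200 and marker in body


def check_page(url, marker, timeout=10):
    """Fetch url and report whether it looks like a working page.

    Returns a (ok, detail) tuple. URLError, HTTPError, and socket
    timeouts are all subclasses of OSError, so one except clause
    treats each of them as a failure.
    """
    try:
        response = urlopen(url, timeout=timeout)
        try:
            status = response.status
            body = response.read().decode('utf-8', 'replace')
        finally:
            response.close()
    except OSError as e:
        return False, '%s failed: %s' % (url, e)
    if body_looks_healthy(status, body, marker):
        return True, '%s looks fine' % url
    return False, '%s responded, but the expected text was missing' % url
```

A good marker is a string that the CMS renders from the database, such as a footer line or a known headline, rather than something in a static error template.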
The number of ways a remote server could be monitored, and the person interested in that server notified of a problem, is virtually limitless. It took me a few minutes to cobble together this simple Python script, which issues an HTTP request to a server (or servers) and puts up an informative KDE dialogue box, which on my system also plays a sound. To use it, you will need to install the package containing kdialog and have Python 2 (I have tested with version 2.5 only). The urllib2 module the script uses is part of the Python 2 standard library. Here is the script.
#Begin code
#! /usr/bin/env python
# Written by Clifford Ilkay
# Shared under the Creative Commons License

# Put your list of sites to monitor here. You could also modify
# the script to read the list of sites from a file or a database.
urls = ['http://apple.com', 'http://this.site.is.broken']

# You should not need to change anything beyond here.
from urllib2 import Request, urlopen, URLError
from os import spawnvp, P_WAIT

def ping_server(the_url):
    request = Request(the_url)
    try:
        response = urlopen(request)
    except URLError, e:
        # No connection, or the host name did not resolve.
        if hasattr(e, 'reason'):
            the_error = '%s could not be reached. Reason: %s' % (the_url, e.reason)
            spawnvp(P_WAIT, '/usr/bin/kdialog', ['kdialog', '--error', str(the_error)])
        # The server answered, but with an HTTP error status.
        elif hasattr(e, 'code'):
            the_error = '%s could not fulfil the request. Error code: %s' % (the_url, e.code)
            spawnvp(P_WAIT, '/usr/bin/kdialog', ['kdialog', '--error', str(the_error)])
    else:
        # No exception: the site responded, so stay silent.
        pass

for the_url in urls:
    ping_server(the_url)
#End code
Copy and paste everything between the "Begin code" and "End code" lines above into a file called servicemon.py (or whatever you like) and then have cron execute it at whatever frequency you deem appropriate. Indentation is significant in Python, so ensure that the whitespace in the code is preserved. If the sites being monitored are working fine, the script will be silent. (Note the "pass" statement for the case where there is no exception.) If a site is having problems, the script will open a dialogue with the URL of the site and the nature of the problem. It will cycle through all the sites in the list rather than stop at the first one that is having a problem. If you are monitoring many sites, this is probably not a good way to be notified, since the script will open one dialogue window for each broken site. This is only one of many possible notification mechanisms. Python has modules for SMS, IRC, email, and various instant messaging protocols, so any of those notification methods is also quite feasible.
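One way around the one-dialogue-per-broken-site problem is to collect all the failures first and then show a single summary. The following is a rough sketch for Python 3 rather than a drop-in replacement for the script above. The function names and the summary wording are my own assumptions, and it calls kdialog with the same --error option the original script uses.

```python
import subprocess
from urllib.request import urlopen
from urllib.error import HTTPError, URLError


def describe_failure(url, timeout=10):
    """Return None if url answers, otherwise a one-line description."""
    try:
        urlopen(url, timeout=timeout).close()
    except HTTPError as e:
        # HTTPError must be caught first because it subclasses URLError.
        return '%s could not fulfil the request. Error code: %s' % (url, e.code)
    except URLError as e:
        return '%s could not be reached. Reason: %s' % (url, e.reason)
    return None


def collect_failures(urls, check=describe_failure):
    """Run check on every URL and keep only the failure descriptions."""
    failures = []
    for url in urls:
        problem = check(url)
        if problem is not None:
            failures.append(problem)
    return failures


def format_summary(failures):
    """Join all failure descriptions into one message for a single dialogue."""
    header = '%d site(s) have problems:' % len(failures)
    return '\n'.join([header] + failures)


def notify(failures):
    """Show one kdialog error box, or stay silent if nothing is wrong."""
    if failures:
        subprocess.call(['kdialog', '--error', format_summary(failures)])
```

Calling notify(collect_failures(urls)) at the end of a script would then produce at most one dialogue per run. Because the notification lives in its own function, swapping kdialog for email or another channel means changing only notify.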
Some ISPs (Rogers in Canada, for example) and some third-party DNS providers like OpenDNS put up a Google page in order to monetize DNS failures, ostensibly with the goal of making it "easier" for their users. This complicates matters for scripts. There is a big difference between a 404 and a domain that doesn't even resolve. OpenDNS returns a 404 for the fictitious http://this.site.is.broken, so beware.
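A script can at least keep the two cases apart when the resolver is honest, and make a guess about whether it is not. The sketch below is for Python 3 and rests on assumptions: the category labels and function names are invented for illustration, and the resolver test is only a heuristic that asks whether a made-up host name gets an address.

```python
import socket
from urllib.error import HTTPError, URLError


def classify_error(error):
    """Sort an exception raised by urlopen into a rough category."""
    if isinstance(error, HTTPError):
        # A server answered; this includes a 404 served by a DNS provider.
        return 'http-%d' % error.code
    if isinstance(error, URLError):
        if isinstance(error.reason, socket.gaierror):
            return 'dns-failure'  # the host name did not resolve
        return 'unreachable'      # resolved, but no usable connection
    return 'unknown'


def resolver_rewrites_failures(bogus_host='this.site.is.broken'):
    """Guess whether the resolver answers for names that should not exist."""
    try:
        socket.gethostbyname(bogus_host)
    except (socket.gaierror, socket.herror):
        return False  # the made-up name failed to resolve, as it should
    return True       # an address came back for a made-up name
```

If resolver_rewrites_failures() returns True, an HTTP error for a monitored site may really be a DNS failure in disguise, so the report should be read with that in mind.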