Server monitoring is something admins put off until, well, it's too late.  Most of us have had a client call in to tell us that a service on our server is not working, which is embarrassing at the very least.  So I thought I'd document a few of the utilities I use to monitor Linux servers.

  • Snort.  Snort is a network intrusion detection and prevention system (NIDS/NIPS), and one of the best-known network security tools.  As a server admin, you should already know Snort.  I usually run BASE, a web front end for browsing its alerts, alongside Snort on my servers.
  • OSSEC.  OSSEC is a host-based intrusion detection system (HIDS) that performs a number of useful monitoring tasks, such as rootkit detection, automated log analysis, and file integrity checking.  It also provides real-time alerting and active response, meaning it can take action automatically as soon as it detects a problem.  A must-have tool in your toolbox, IMO.
  • Fail2ban.  Fail2ban is a small utility which does one thing, and does it well.  It monitors the system authentication logs, detects password-guessing attempts by counting failed authentications, and bans the offending IPs using the kernel firewall.
  • Monit.  Monit is a tool for managing services, processes, devices, files, etc.  It is fully customizable, accepting flexible rules on what to monitor and what to do when something goes wrong.  It goes beyond simple stuff such as monitoring a port or a PID file: it can, for example, send you an alert when a filesystem is about to run out of space, stop a service when it consumes too much CPU, or monitor an external server.
  • Munin.  Munin is an excellent utility which collects data on just about anything happening on the server, stores it using RRDtool, and generates graphs for both online and offline viewing.  A few examples of the graphs it generates include network interface incoming/outgoing traffic, inode usage, swap usage, memory usage, and filesystem usage.  It also includes facilities to generate service-specific graphs, such as MySQL query count and Postfix mail queue size.
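
To make the Fail2ban idea of "counting failed authentications" a bit more concrete, here is a minimal Python sketch of that kind of counting.  It is not Fail2ban's actual code, just an illustration under some assumptions: the function name find_offenders, the regular expression, and the sample log lines are all made up for this example, and the max_retry and find_time parameters are only loosely modeled on Fail2ban's maxretry and findtime settings.  The banning step itself (adding a firewall rule) is left out.

```python
import re
from collections import defaultdict, deque

# Assumed pattern for OpenSSH-style "Failed password" lines; real logs vary.
FAILED_RE = re.compile(
    r"Failed password for (?:invalid user )?\S+ from (\d{1,3}(?:\.\d{1,3}){3})"
)


def find_offenders(events, max_retry=3, find_time=600):
    """Return the set of IPs with at least max_retry failed logins
    inside any find_time-second window.

    events is an iterable of (unix_timestamp, log_line) pairs in
    chronological order.
    """
    recent = defaultdict(deque)  # ip -> timestamps of recent failures
    offenders = set()
    for ts, line in events:
        match = FAILED_RE.search(line)
        if match is None:
            continue  # not a failed-login line
        window = recent[match.group(1)]
        window.append(ts)
        # Forget failures that are older than the time window.
        while ts - window[0] > find_time:
            window.popleft()
        if len(window) >= max_retry:
            offenders.add(match.group(1))
    return offenders


if __name__ == "__main__":
    # Made-up sample events using documentation-only IP ranges.
    sample = [
        (0, "sshd[1]: Failed password for root from 203.0.113.5 port 4242 ssh2"),
        (10, "sshd[1]: Failed password for invalid user bob from 203.0.113.5 port 4243 ssh2"),
        (20, "sshd[1]: Accepted password for alice from 198.51.100.7 port 5000 ssh2"),
        (30, "sshd[1]: Failed password for root from 203.0.113.5 port 4244 ssh2"),
    ]
    print(sorted(find_offenders(sample)))  # prints ['203.0.113.5']
```

A real tool would then hand each offending IP to the firewall and lift the ban after some time; the sketch stops at detection on purpose.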