2005-11-08
Jason Schoonover

Experimental Procfix support

THIS FEATURE IS HIGHLY EXPERIMENTAL AND HAS NOT BEEN TESTED VERY MUCH
AT ALL. PLEASE USE IT AT YOUR OWN RISK.

New to Unnoc 1.0.6 is procfix support. This is an SNMP feature that the
'SNMP manager' (unnoc in this case) has to implement. Procfix support
is a way to have broken processes fixed without having to log in to the
machine and fix them manually. It basically allows you to specify a
command that should be run if a particular process has any problems.
Take a look at the snmpd.conf man page for more information.

The way it works is that unnoc checks the process table. If it finds a
process that has an issue, unnoc first checks whether there is a
command to fix it. If there is, it sets the SNMP OID
1.3.6.1.4.1.2021.2.1.102.index_number (index_number being the index of
the process), which causes snmpd on the monitored server to run
whatever script is listed with procfix.

For example, take ntpd. If you have a really wacky clock, ntpd might
keep dying. If you don't want to have to log in to a particular server
every time ntpd dies just to restart it, you could specify a procfix
command of "/etc/init.d/ntp restart", which would automatically get run
if ntpd has problems.

One big problem I've noticed with this is that snmpd does NOT fork the
command off; it runs it while the SNMP manager (unnoc) waits for the
reply. This means that if you have a command that might take a long
time to run, unnoc is going to wait for it to finish and could possibly
time out. Very disappointing, but I guess it's not supposed to launch
super-duper complex scripts; it was probably designed to just do simple
stuff.

1. Configuration

The procfix configuration itself is all on the monitored server, i.e.,
it's all in that server's snmpd.conf.
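
As a rough illustration of the check-then-fix flow described in the
overview above, here is a small Python sketch. It is NOT unnoc's actual
code. The function names, the dictionary layout, the host name and the
use of SNMP v2c are all assumptions made up for the example. The only
parts taken from the text are the OID prefix and the idea of setting it
for the index of the broken process. It only builds the Net-SNMP
snmpset command lines, it does not run them.

```python
# Rough sketch only, NOT unnoc's actual code. All names are hypothetical.

PR_ENTRY = "1.3.6.1.4.1.2021.2.1"   # UCD-SNMP-MIB process table entry
PR_ERR_FIX = PR_ENTRY + ".102"      # the "fix it" column named in the text


def fix_oid(index):
    """Return the OID to set for the process at the given table index."""
    if index < 1:
        raise ValueError("process table indexes start at 1")
    return "%s.%d" % (PR_ERR_FIX, index)


def snmpset_argv(host, community, index):
    """Build an snmpset command line that sets the fix flag to integer 1.

    Assumes SNMP v2c. The community has to be a read/write one, or the
    agent will refuse the set.
    """
    return ["snmpset", "-v", "2c", "-c", community, host,
            fix_oid(index), "i", "1"]


def plan_fixes(processes, host, community):
    """Pick the broken processes that have a fix command configured.

    processes is a list of dicts with the keys 'index', 'name',
    'error' (bool) and 'fix_cmd' (empty string if there is no procfix
    line for that process). Returns one snmpset command per match.
    """
    commands = []
    for proc in processes:
        if proc["error"] and proc["fix_cmd"]:
            commands.append(snmpset_argv(host, community, proc["index"]))
    return commands


if __name__ == "__main__":
    # Made-up process table: ntpd is broken and has a fix command,
    # sshd is healthy and has none.
    table = [
        {"index": 1, "name": "ntpd", "error": True,
         "fix_cmd": "/etc/init.d/ntp restart"},
        {"index": 2, "name": "sshd", "error": False, "fix_cmd": ""},
    ]
    for argv in plan_fixes(table, "server1.example.com", "private"):
        print(" ".join(argv))
```

Run as a script, this prints a single snmpset command for ntpd (index
1) and nothing for sshd. Keeping the "what to set" logic separate from
actually sending the request makes it easy to look at the command
before letting anything touch a real server.
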

If you have a process ntpd that you want to monitor, and every time it
has a problem ("problem" is unspecified) you want to run the command
/etc/init.d/ntp restart, then your snmpd.conf configuration would look
like this:

    proc ntpd 2 1
    procfix ntpd /etc/init.d/ntp restart

You also *must* use a read/write community instead of just a read-only
community. Otherwise unnoc will not be able to set the bit that tells
snmpd to run the command. Replace the RO community with the RW
community in unnoc.conf:

    host {
        ...
        community = private
        ...
    }

You will get a notification if unnoc found a procfix command for a
broken process, something like: "ntpd is DOWN!! Attempted to fix with
command: /etc/init.d/ntp restart". Then hopefully you will receive an
alert that says: "ntpd is back up".

Again, this is very highly experimental. I've only tested it against
Net-SNMP, and only on simple things like ntpd. Everyone has problem
servers in their environment, and sometimes procfix is just what you
need to keep them working well.

Also, keep in mind that if you have a production web server, you might
want to rethink putting in /etc/init.d/apache restart as your procfix
command for apache, because if it didn't come back up that would kind
of suck.

If anyone out there is using this feature I would love to hear about
it!

vim:tw=72:wm=1