The Kick-off! software can monitor applications for crashes and other failures using "Rebound! Timer" technology, and can automatically take recovery steps when a failure occurs. This document describes how to add Rebound! Timer support to your applications.
Rebound! Timers are the means by which Kick-off! monitors applications for failures. An application uses its Rebound! Timer to periodically "check in" with the Kick-off! software, which lets Kick-off! confirm that the application is still running normally. This application-level failure detection complements Kick-off!'s native ability to detect and respond to system-level failures. Together they provide a robust means of keeping critical applications available.
The Kick-off! software keeps a separate Rebound! Timer for each application. Once set, the timer starts counting down to zero. If any active timer ever actually reaches zero, the Kick-off! software will restart the computer. It is the application's duty to periodically update ("tickle") its timer during normal operation and to close it when the application quits.
For example, a web server application could set its Rebound! Timer to 60 seconds when it starts up. It then updates the timer to its original value once every 15 seconds. Under normal operation, the timer will never go far below 45 seconds before being reset. However, if the server application crashes, it will stop tickling. Its timer will continue counting down and will reach zero ("expire") within 60 seconds. Kick-off! will then restart the computer.
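The arithmetic in this example can be checked with a small model. The following is only an illustrative sketch in C and is not part of the Kick-off! SDK: the function name simulate_timer and its parameters are hypothetical, and the model assumes the timer counts down in whole seconds.

```c
#include <assert.h>

/* Toy model of a single Rebound! Timer (hypothetical, not SDK code).
 *
 * The timer starts at timeout_secs and loses one second per step.
 * The application resets it every tickle_secs seconds until it
 * "crashes" at second crash_at (a negative crash_at means it never
 * crashes).
 *
 * Returns the second at which the timer reaches zero, or -1 if it
 * never does within run_secs. The lowest value the timer reached is
 * stored in *lowest. */
int simulate_timer(int timeout_secs, int tickle_secs, int crash_at,
                   int run_secs, int *lowest)
{
    assert(timeout_secs > 0 && tickle_secs > 0);

    int remaining = timeout_secs;
    *lowest = timeout_secs;

    for (int now = 1; now <= run_secs; now++) {
        remaining--;                        /* one second elapses */
        if (remaining < *lowest)
            *lowest = remaining;
        if (remaining == 0)
            return now;                     /* expired: Kick-off! would restart */

        int alive = (crash_at < 0 || now < crash_at);
        if (alive && now % tickle_secs == 0)
            remaining = timeout_secs;       /* the application tickles its timer */
    }
    return -1;                              /* timer never expired */
}
```

With a 60-second timer tickled every 15 seconds and no crash, simulate_timer(60, 15, -1, 600, &lowest) returns -1 and leaves lowest at 45. If the application crashes at second 100, the last tickle happened at second 90, so the call returns 150: the timer expires 50 seconds after the crash, inside the 60-second bound.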
Rebound! Timers were designed as a "dead-man's switch" that allows Kick-off! to take corrective action when an application fails. You should therefore ensure that your timers will only reach zero if something has actually gone wrong. We recommend the following:
To implement a Rebound! Timer in your application, you must link the "libkickoff.a" static library into your project, and include "ko_sdk.h" in your relevant source files. The header file "ko_sdk.h" includes detailed documentation on the Rebound! Timer functions.
Location of library: /usr/local/lib/libkickoff.a
Location of header file: /usr/local/include/kickoff/ko_sdk.h
int ko_init(void);
To open a new Rebound! Timer, call the 'ko_init()' function. The function will return 0 on success, and -1 on failure.
int ko_set_watchdog(u_int32_t secs);
To set or update ("tickle") an open Rebound! Timer, call the 'ko_set_watchdog(secs)' function, where 'secs' is the number of seconds the timer should count down from.
void ko_finish(void);
When you are finished with the timer, call the 'ko_finish()' procedure.
Note: Again, it is essential to properly close your timers before your application quits. Otherwise, the Kick-off! software will assume the application has crashed and will restart the computer when the timer expires.
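The three calls might fit together as in the sketch below. It is a minimal illustration under stated assumptions, not the definitive way to use the SDK. HAVE_KICKOFF is a hypothetical build flag. When it is not defined, stand-in stubs replace the real functions so that the sketch compiles without libkickoff.a. The names run_monitored, serve_one_request and TIMEOUT_SECS are also hypothetical. The return value of ko_set_watchdog() is not described in this document, so the sketch ignores it; check "ko_sdk.h" for its meaning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef HAVE_KICKOFF              /* hypothetical flag: set when building with the SDK */
#include "ko_sdk.h"
#else
/* Stand-in stubs (NOT the real library). They only record the calls. */
int      stub_timer_open;        /* 1 while a timer is open */
uint32_t stub_last_secs;         /* last value passed to ko_set_watchdog() */
int      stub_set_calls;         /* number of ko_set_watchdog() calls */

int  ko_init(void)                  { stub_timer_open = 1; return 0; }
int  ko_set_watchdog(uint32_t secs) { stub_last_secs = secs; stub_set_calls++; return 0; }
void ko_finish(void)                { stub_timer_open = 0; }
#endif

#define TIMEOUT_SECS 60u         /* timer value from the web server example */

/* Placeholder for one unit of real work; returns 0 on success. */
int serve_one_request(void) { return 0; }

/* Opens a timer, runs `work` for `rounds` iterations while tickling
 * after each one, then closes the timer. Each call to `work` should
 * take well under TIMEOUT_SECS. Returns 0 on success, -1 on failure. */
int run_monitored(int (*work)(void), int rounds)
{
    assert(work != NULL);

    if (ko_init() != 0)
        return -1;                          /* could not open the timer */

    (void)ko_set_watchdog(TIMEOUT_SECS);    /* set the timer: countdown starts */

    int status = 0;
    for (int i = 0; i < rounds; i++) {
        if (work() != 0) {                  /* orderly failure: stop and clean up */
            status = -1;
            break;
        }
        (void)ko_set_watchdog(TIMEOUT_SECS);  /* tickle: back to the full value */
    }

    ko_finish();                            /* close the timer before quitting */
    return status;
}
```

In this sketch the timer is closed on every orderly exit path, including the one where the work function reports an error, so the timer can only expire if the process crashes or hangs. Whether a reported error should instead be left to expire the timer is a design choice for your application.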