Setting Up Watchdog To Automatically Reboot A Hung Raspberry Pi
Published October 2nd, 2015 at 1:18 PM by Joe Prochazka
A watchdog timer is used to detect when a device is hung or unresponsive. Generally it is a hardware-based timer which counts down to zero. To keep the timer from reaching zero, a software daemon running on the device continually resets it. If the timer does reach zero, the hardware assumes the system is hung, because the daemon failed to reset it, and reboots the system.
The Broadcom BCM2835 SoC found on the Raspberry Pi comes with a hardware-based watchdog timer which can be used by the watchdog daemon. The following steps set up your Raspberry Pi to use the watchdog feature.
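The countdown-and-reset behaviour described above can be sketched as a small simulation. This is only an illustration, not how the hardware or the watchdog daemon is implemented: the function name simulate_watchdog and its pattern argument are made up for this example. Each character of the pattern stands for one tick, where "k" means the daemon reset the timer and "." means it did not.

```shell
# Hypothetical simulation of a watchdog countdown (not the real driver).
# $1 = timeout in ticks, $2 = pattern of ticks: "k" = reset, "." = no reset.
simulate_watchdog() {
    timeout=$1
    pattern=$2
    remaining=$timeout
    tick=0
    while [ -n "$pattern" ]; do
        tick=$((tick + 1))
        step=${pattern%"${pattern#?}"}   # first character of the pattern
        pattern=${pattern#?}             # drop that character
        if [ "$step" = "k" ]; then
            remaining=$timeout           # the daemon reset the timer
        else
            remaining=$((remaining - 1)) # the timer keeps counting down
        fi
        if [ "$remaining" -le 0 ]; then
            echo "reboot at tick $tick"
            return 1
        fi
    done
    echo "still running after $tick ticks"
    return 0
}

# A daemon that resets the timer every third tick keeps the system alive.
simulate_watchdog 3 "k..k..k"
# A daemon that stops responding lets the timer reach zero.
simulate_watchdog 3 "k...k" || true
```

With a timeout of 3 ticks, the first call never lets the counter reach zero, while the second one misses three resets in a row and triggers the simulated reboot.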
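First, on a Debian-based system such as Raspbian, make sure that chkconfig is installed. It is used in later steps to enable the watchdog daemon at boot.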
sudo apt-get install chkconfig
You will need to load the watchdog kernel module. The following commands load the module and make sure it is loaded whenever the system boots. Use the echo line that matches your distribution: /etc/modules is the Debian-style location used by Raspbian, and /etc/modules-load.d is the systemd location used by Arch Linux.
sudo modprobe bcm2708_wdog
echo "bcm2708_wdog" | sudo tee -a /etc/modules
echo "bcm2708_wdog" | sudo tee /etc/modules-load.d/bcm2708_wdog.conf
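Once the module is loaded, a quick sanity check is to confirm that the watchdog device node exists. The helper below is a minimal sketch; the function name watchdog_device_status is made up for this example, and /dev/watchdog is assumed because it is the path used in /etc/watchdog.conf. It only tests for the node and does not open it, since opening a watchdog device can start the countdown.

```shell
# Hypothetical helper: report whether a watchdog device node exists.
# Defaults to /dev/watchdog, the path referenced in /etc/watchdog.conf.
watchdog_device_status() {
    device=${1:-/dev/watchdog}
    # -c is true only for character device nodes; the node is not opened.
    if [ -c "$device" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

watchdog_device_status /dev/watchdog
```

If this prints "missing", the kernel module is probably not loaded, and lsmod can be used to check.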
Next we will need to install the watchdog daemon. Run the commands below that match the Linux distribution on your Raspberry Pi: the apt-get and chkconfig pair for Raspbian, or the pacman and systemctl pair for Arch Linux. These commands install the watchdog daemon and ensure it is started each time the device boots.
sudo apt-get install watchdog
sudo chkconfig --add watchdog
sudo pacman -S watchdog
sudo systemctl enable watchdog
Now you will need to configure the watchdog daemon. You can do so by opening the file /etc/watchdog.conf in your favorite text editor, such as nano or vi.
sudo nano /etc/watchdog.conf
Once the file is open, uncomment the line #watchdog-device = /dev/watchdog by removing the hash sign from the front of it. You can also uncomment the line #max-load-1 = 24 and adjust this setting to your liking if you wish.
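If you would rather script this edit than make it by hand, the same change can be sketched with sed. The function name uncomment_watchdog_options is made up for this example. It prints the modified configuration to standard output rather than editing the file in place, so the result can be reviewed before it replaces /etc/watchdog.conf.

```shell
# Hypothetical helper: print the given config file with the
# watchdog-device and max-load-1 lines uncommented.
# The input file itself is left unchanged.
uncomment_watchdog_options() {
    sed -e 's|^#[[:space:]]*\(watchdog-device[[:space:]]*=\)|\1|' \
        -e 's|^#[[:space:]]*\(max-load-1[[:space:]]*=\)|\1|' \
        "$1"
}

# Example use (review the output before copying it into place):
#   uncomment_watchdog_options /etc/watchdog.conf > /tmp/watchdog.conf.new
```

Lines that are already uncommented, and all other settings, pass through untouched.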
The max-load-1 setting is the highest 1-minute load average the daemon will tolerate, so a value of 24 means "reboot if the 1-minute load average goes over 24". Put another way, if you would need 25 Raspberry Pis to complete the pending work within 1 minute, the system has gone over its max load. Feel free to adjust this value to suit your needs and your device.
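To get a feel for how far your system is from this threshold, you can compare the current 1-minute load average against it. The sketch below is only an illustration of the comparison, not what the watchdog daemon runs internally. The function name load_exceeds is made up for this example, and it assumes a Linux system where the first field of /proc/loadavg is the 1-minute load average.

```shell
# Hypothetical helper: succeed (exit status 0) if load $1 is above limit $2.
# awk is used because the shell's [ ] test cannot compare decimal numbers.
load_exceeds() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 > limit + 0) }'
}

# Compare the current 1-minute load average with the max-load-1 value of 24.
if [ -r /proc/loadavg ]; then
    current=$(cut -d ' ' -f 1 /proc/loadavg)
    if load_exceeds "$current" 24; then
        echo "load $current is above 24"
    else
        echo "load $current is within the limit"
    fi
fi
```

On a mostly idle Raspberry Pi the load average is normally far below 24, which is why this setting only catches a badly overloaded system.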
All that is left now is to start the watchdog daemon. Use the chkconfig command on Raspbian or the systemctl command on Arch Linux.
sudo chkconfig watchdog on
sudo systemctl start watchdog.service