# /dev/watchdog mini-HOWTO

## petlab

See http://gentoo-wiki.com/HOWTO_Watchdog_Timer for an updated version.

The watchdog is used to make the machine reboot or shut down when something goes wrong, for example when the kernel hangs or some program starts using 100% of the CPU. This mini-HOWTO is a guideline for setting up your own watchdog, based on my experience. You assume all responsibility for your own actions.

Step 0:  You need to be able to access your watchdog

If you have a watchdog add-on card, you need PCI or ISA support in your kernel. If your watchdog is part of your motherboard, you need to enable support for whatever controls it, for example i2c, SMBus, or maybe just support for a chipset. This HOWTO assumes that you already have this set up and have emerged companion programs where needed, for example the i2c package, which can be used to access on-board chips.

Step 1:  Compile watchdog support into your kernel

```
Device Drivers ->
    Character Devices ->
        Watchdog Cards ->
            [*] Watchdog Timer Support
            <*> Your Watchdog card or chip
```

Note: You may also need i2c support if your watchdog is a chip rather than a card. My watchdog is a W83627HF, which is attached to the SMBus on my motherboard. I use the lm-sensors and i2c packages to access the chip, which also includes a temperature sensor (see Step 0).

Now compile your kernel and reboot.
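Once the kernel is built, it can be worth confirming that the watchdog options from Step 1 actually ended up in the configuration. This is only a sketch: the default path /usr/src/linux/.config is an assumption about where your kernel source lives, and the function name `watchdog_config` is made up for illustration.

```shell
# List the watchdog-related options that are enabled in a kernel .config.
# The default path is an assumption; pass your own as the first argument.
watchdog_config() {
    config="${1:-/usr/src/linux/.config}"
    # Enabled options look like CONFIG_FOO=y (built in) or CONFIG_FOO=m (module).
    grep -E '^CONFIG_[A-Z0-9_]*(WATCHDOG|WDT)[A-Z0-9_]*=[ym]$' "$config"
}
```

Running `watchdog_config` with no argument should print at least `CONFIG_WATCHDOG=y`, plus a line for the driver of your card or chip. If it prints nothing, the options were probably not saved.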

Step 2: Make a device entry

You may or may not need to follow this step.  Check for a /dev/watchdog node with this command:

```
# ls -l /dev/wa*

crw-rw---- 1 root root 10, 130 /dev/watchdog
```

If you don't have this entry, create it:

```
# mknod -m 660 /dev/watchdog c 10 130
```
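The check and the mknod call can be combined so that the node is only created when it is missing. A minimal sketch, using the same major and minor numbers (10, 130) as above; the function name `ensure_watchdog_node` is hypothetical, and creating the node needs root.

```shell
# Create the watchdog character device only if it is missing.
# Major 10, minor 130 are the numbers from the ls output above; run as root.
ensure_watchdog_node() {
    node="${1:-/dev/watchdog}"
    if [ -c "$node" ]; then
        # -c is true only for an existing character device
        echo "$node already exists"
    else
        mknod -m 660 "$node" c 10 130
    fi
}
```

Calling `ensure_watchdog_node` with no argument works on /dev/watchdog; passing another path is mainly useful for trying the function out.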

Step 3: Emerge an initscript and a program to handle it

```
# emerge watchdog
```

Step 4: Add it to your boot runlevel

```
# rc-update add watchdog boot
```

See Note 1* at the end.

Step 5: Reboot or start the service

The preferred way:

```
# /etc/init.d/watchdog start
```

or, if you aren't comfortable with that, just:

```
# reboot
```

Step 6: Experiment with the watchdog setup to see if it works

I suggest copying the included shell script called repair.sh and editing it to your liking. The 'repair binary' is called when the watchdog program detects an error. You may want to customize what gets logged or repaired through repair.sh.
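To give an idea of what a customized repair script might look like, here is a bare-bones sketch. It is not the contents of the shipped repair.sh. It relies on my reading of watchdog(8): the daemon passes the error number as the first argument, and an exit status of 0 means "repaired", while anything else lets the reboot go ahead. The `LOGFILE` variable, the default log path, and the function name `repair` are assumptions made for illustration.

```shell
# Sketch of the core of a custom repair script.
# Assumption (from watchdog(8)): $1 is the error number, and status 0
# tells the daemon the problem was fixed; nonzero lets it reboot.
repair() {
    errcode="$1"
    # LOGFILE and its default path are placeholders, not part of the package.
    logfile="${LOGFILE:-/var/log/watchdog-repair.log}"
    echo "$(date) repair called with error $errcode" >> "$logfile"

    # Put real repair attempts here, keyed on $errcode.
    # Nothing is attempted in this sketch, so report failure.
    return 1
}
```

To use it as a standalone script, add a final line such as `repair "$1"` so that the function's status becomes the script's exit status. Because this sketch always returns 1, the watchdog would still reboot after logging; return 0 only from branches where the repair really succeeded.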

Note 1*

On my machine, a dual Opteron, I was not able to get the program to work correctly when added to the boot runlevel. I ended up starting it from /etc/conf.d/local.start. Not the best, but it works for me for now. I also note that the program doesn't seem to log at the correct intervals; it seems to be only partially functional. Odd, since this is usually an important issue for servers...

----------

Please comment if you like.

Thank you.

----------

