Device monitoring

What do you do when the sensor stops working?

Sometimes devices stop working. Usually you notice because some smart home function, e.g. turning on the lights, is no longer available. The trick is to find out that something is wrong before you or your loved ones notice that the function has stopped working.
So far, the following problems have occurred in my setup:

  1. A sensor stops sending data. This mostly happens when the battery is drained.
  2. A sensor value is stuck. I have one self-made sensor which keeps sending data, but the value is frozen.
  3. Some service stops working, e.g. Syncthing, Baikal.

Sensor monitoring with a script

This is the script I wrote to monitor my devices. It has two parts.

The first part checks whether the monitored devices are still sending data to Domoticz. I use the built-in attribute lastUpdate.minutesAgo, which is available for all devices. When a device hasn't received an update for a defined period of time, I write a log entry and send myself an email. I also set a user variable, named after the device with the suffix _timeout, which can then be used in other scripts. You can set up the variables under Setup - More options - User variables.

variables.jpg

This is what the script looks like.

-- Flag, log and email once when a device has been silent for more than `timeout` minutes.
local function checktimeout(dev, timeout)
    if (dev.lastUpdate.minutesAgo > timeout) then
        -- Notify only on the transition from "ok" (0) to "timed out" (1).
        if (domoticz.variables(dev.name .. '_timeout').value == 0) then
            domoticz.variables(dev.name .. '_timeout').set(1)
            domoticz.log(dev.name .. ' has timed out.', domoticz.LOG_ERROR)
            domoticz.email(dev.name, dev.name .. ' has timed out.', 'pi@homeauto')
        end
    else
        domoticz.log(dev.name .. ' has reported in.', domoticz.LOG_INFO)
        domoticz.variables(dev.name .. '_timeout').set(0)
    end
end

-- check_list_timeout_short and update_time_short are not shown in this excerpt.
check_list_timeout_short.forEach(function(device)
    checktimeout(device, update_time_short)
end)
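To try the notify-once behaviour outside Domoticz, here is a minimal dry-run sketch in plain Lua. The make_fake_domoticz helper is a hypothetical stand-in that I made up for this test; it only mimics the few calls the script uses (variables, log, email) and is not part of dzVents. The 60 minute timeout is an assumed value too.

```lua
-- Hypothetical stand-in for the dzVents `domoticz` object, for a dry run only.
local function make_fake_domoticz()
    local store, sent = {}, {}
    local fake = { LOG_ERROR = 1, LOG_INFO = 3, sent = sent }
    function fake.variables(name)
        if store[name] == nil then store[name] = 0 end
        return {
            value = store[name],
            set = function(v) store[name] = v end,
        }
    end
    function fake.log(message, level) end   -- log output is ignored in the dry run
    function fake.email(subject, message, to)
        sent[#sent + 1] = subject            -- remember every mail that would be sent
    end
    return fake
end

local domoticz = make_fake_domoticz()

-- Same decision logic as checktimeout above.
local function checktimeout(dev, timeout)
    local var = domoticz.variables(dev.name .. '_timeout')
    if dev.lastUpdate.minutesAgo > timeout then
        if var.value == 0 then
            var.set(1)
            domoticz.log(dev.name .. ' has timed out.', domoticz.LOG_ERROR)
            domoticz.email(dev.name, dev.name .. ' has timed out.', 'pi@homeauto')
        end
    else
        var.set(0)
    end
end

local sensor = { name = 'Temperatur_Balkon', lastUpdate = { minutesAgo = 90 } }
checktimeout(sensor, 60)   -- first overdue run: flag set, one email
checktimeout(sensor, 60)   -- still overdue: flag already set, no second email
sensor.lastUpdate.minutesAgo = 5
checktimeout(sensor, 60)   -- device reported again: flag cleared
print(#domoticz.sent, domoticz.variables('Temperatur_Balkon_timeout').value)
```

After the three runs only one email has been recorded and the flag is back at 0, so a sensor that stays silent for days does not flood the inbox.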

The other part of the script checks whether the sensor value is still changing. The script is called every hour, so if the temperature hasn't changed for a couple of hours (implemented with a simple counter), I want to be notified.

-- Count consecutive runs in which the temperature did not change.
local current_value = domoticz.devices('Temperatur_Balkon').temperature
if (current_value == domoticz.data.old_value) then
    if (domoticz.data.error_counter < error_counter_max) then
        domoticz.data.error_counter = domoticz.data.error_counter + 1
    end
else
    domoticz.data.error_counter = 0
end

-- Notify once when the limit is reached; clear the flag when the value changes again.
if (domoticz.data.error_counter >= error_counter_max) then
    if (domoticz.variables('Temperatur_Balkon_stuck').value == 0) then
        domoticz.variables('Temperatur_Balkon_stuck').set(1)
        domoticz.log(domoticz.devices('Temperatur_Balkon').name .. ' is not changing its value.', domoticz.LOG_ERROR)
        domoticz.email(domoticz.devices('Temperatur_Balkon').name, domoticz.devices('Temperatur_Balkon').name .. ' is not changing its value.', 'pi@homeauto')
    end
elseif (domoticz.data.error_counter == 0) then
    domoticz.variables('Temperatur_Balkon_stuck').set(0)
end

-- Remember the reading for the next run.
domoticz.data.old_value = current_value
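To see how the counter behaves over several hourly runs, here is a small simulation in plain Lua. It is only a sketch: the update_counter helper, the threshold of 3 and the list of readings are made up for illustration, and the persistent domoticz.data fields are replaced by ordinary local variables.

```lua
-- Saturating counter: counts consecutive unchanged readings up to `max`
-- and resets to 0 as soon as the value changes.
local function update_counter(counter, old_value, current_value, max)
    if current_value == old_value then
        if counter < max then
            counter = counter + 1
        end
    else
        counter = 0
    end
    return counter
end

local error_counter_max = 3                          -- assumed threshold
local readings = { 4.5, 4.5, 4.5, 4.5, 4.5, 5.0 }    -- made-up hourly temperatures

local counter, old_value = 0, nil
local history = {}
for i, value in ipairs(readings) do
    counter = update_counter(counter, old_value, value, error_counter_max)
    old_value = value
    history[i] = counter
    local stuck = counter >= error_counter_max
    print(i, value, counter, stuck and 'stuck' or 'ok')
end
```

The counter stops growing once it reaches the limit, and a single changed reading resets it to 0, which is the condition the script uses to clear the _stuck variable.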

Service monitoring with Monit

Monit is a great little tool which can be used to monitor all sorts of things. It can restart local services or notify the admin, that is you :), that something is wrong. Here you can find a good explanation of how to set it up.
I am using Monit for several things:

  • Local service monitoring: Domoticz, Syncthing.
  • Remote service monitoring: TVheadend, Baikal.
  • Monitoring if my mounted Samba shares are accessible.
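As a rough idea of what the first two kinds of check can look like, here is a sketch of a Monit configuration. It is not my actual config: the pidfile path, the systemctl path, the IP address and both ports are assumptions that you need to adapt to your own machines. Both checks only test whether the TCP port accepts a connection.

```
# Local service: restart Domoticz when its web port stops accepting connections.
# The pidfile path, systemctl path and port 8080 are assumptions.
check process domoticz with pidfile /var/run/domoticz.pid
    start program = "/usr/bin/systemctl start domoticz"
    stop program  = "/usr/bin/systemctl stop domoticz"
    if failed port 8080 then restart

# Remote service: only send an alert when the port is unreachable.
# The address and port 9981 are placeholders.
check host tvheadend with address 192.168.1.20
    if failed port 9981 then alert
```

Monit can validate the syntax with monit -t before you reload it.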

If you need help setting up monitoring with Monit or modifying the script, feel free to contact me.

Happy hacking!!

License: CC BY-SA 4.0