The other day I got a bug report about check_ntpmon, which was reporting UNKNOWN status back to Nagios even though everything seemed to be working fine. A bit of debugging revealed that it was receiving the following message from ntpq on standard error:
ntpq: write to ::1 failed: Operation not permitted
This was a bit strange, because various links I found indicated that this message is usually caused by a firewall.
But this host was not blocking anything, and check_ntpmon only ever runs ntpq against the loopback interface, which firewalls rarely touch. Further digging showed that the cause was indeed not the firewall but a full conntrack table, with dmesg showing:
Aug 4 03:04:19 hostname kernel: [5226949.016837] nf_conntrack: table full, dropping packet
Increasing the conntrack limit fixed the problem.
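To catch this before packets start getting dropped, the current table usage can be compared with the limit. The following is only a minimal sketch, not part of check_ntpmon: it assumes a Linux host with the nf_conntrack module loaded, where the counters are exposed as nf_conntrack_count and nf_conntrack_max under /proc/sys/net/netfilter. The function names and the 90% warning threshold are my own choices for illustration.

```python
from pathlib import Path

# Assumed location of the conntrack counters on Linux; these files only
# exist when the nf_conntrack module is loaded.
PROC_DIR = Path("/proc/sys/net/netfilter")


def read_int(path):
    """Return the integer stored in a file, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def check_conntrack(proc_dir=PROC_DIR, warn_at=0.9):
    """Return a one-line status string describing conntrack table usage.

    warn_at is a hypothetical threshold (fraction of the table in use).
    """
    count = read_int(Path(proc_dir) / "nf_conntrack_count")
    limit = read_int(Path(proc_dir) / "nf_conntrack_max")
    if count is None or limit is None or limit <= 0:
        return "UNKNOWN: conntrack counters not readable"
    usage = count / limit
    state = "WARNING" if usage >= warn_at else "OK"
    return f"{state}: conntrack {count}/{limit} ({usage:.0%} full)"


if __name__ == "__main__":
    print(check_conntrack())
```

On kernels that expose these files, the limit itself is the net.netfilter.nf_conntrack_max sysctl, so raising it is a matter of setting that value to something appropriate for the host's traffic and memory.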