Monitor your Internet services or devices to ensure they are always online. Tracks Internet connectivity and speeds with useful proof. For Windows, Linux, ARM (Raspberry, Tinker Board, etc).
Learn more by visiting www.outagesio.com
We have been using this hardware agent for about a week now and it is working great. We were able to narrow down a couple of internal and service provider issues. I noticed this morning, after a switch reboot, that the agent is reporting "Your agent is REBOOTING". It is still reporting all metrics properly, but I no longer see the heartbeat/status. Everything else is reporting: speed tests, hops, pings, outages, stats, temp/environmental info. Is this cosmetic, or how should I resolve it?
Let me get a dev on this right now and we'll take a look. It seems cosmetic as the agent appears to be communicating and sending updates.
I'll report back shortly. FYI, the dev may purposely reboot the device in the meantime.
Can you please tell me at what time (in your local timezone) the switch was rebooted?
I am asking since:
- the HW agent reboots every day to check for OTM or firmware updates
- this changes the status to REBOOTING to avoid fake inactive notifications
- when all the checks are over it sends a confirmation message that all is ok
- that confirmation is missing
I need to be sure that this was happening while the switch was rebooted before I dive deeper to check if there is a bug.
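To illustrate the sequence above, here is a minimal Python sketch of the status handling as described. It is not the agent's actual code: the class name `AgentStatus`, the state strings other than REBOOTING, and the 10-minute timeout are all assumptions made for illustration. It shows how a missing confirmation leaves the status stuck at REBOOTING.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentStatus:
    """Hypothetical server-side view of one agent's status."""
    state: str = "ONLINE"                      # assumed name for the normal state
    reboot_started_at: Optional[float] = None  # timestamp in seconds

    def on_reboot_start(self, now: float) -> None:
        # The daily reboot flips the status so no inactive notification is sent.
        self.state = "REBOOTING"
        self.reboot_started_at = now

    def on_confirmation(self) -> None:
        # The "all is ok" message after the checks restores the normal status.
        self.state = "ONLINE"
        self.reboot_started_at = None

    def is_stuck(self, now: float, timeout_s: float = 600.0) -> bool:
        # True when the confirmation never arrived within the assumed timeout.
        return (
            self.state == "REBOOTING"
            and self.reboot_started_at is not None
            and now - self.reboot_started_at > timeout_s
        )
```

If the confirmation is lost, `is_stuck` becomes true once the timeout has passed, which matches a dashboard that keeps showing REBOOTING while metrics still arrive.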
I just checked and now the status is back.
Yes, this is because we sent it another reboot command to see if it would do the same thing again, and it didn't.
What you are seeing should be addressed in the next agent version release.
Initially, we thought that the agent's multi-threaded code, which handles a number of simultaneous functions, might occasionally fail to send some data.
However, situations like this one help confirm what may actually be happening: if the agent is unable to send something for whatever reason, it can give up, and the data is eventually lost because it never re-tries.
It is supposed to always re-try, but something in the libraries or the method we are using prevents this re-try now and then. It could even be CPU overload causing the re-try thread to be lost.
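For readers wondering what "always re-try" can look like, below is a rough Python sketch, not the agent's real implementation. The function `flush_outbox`, its parameters, and the message format are hypothetical. The idea is that a message stays in the queue until the send succeeds, so a failed attempt delays the data instead of losing it.

```python
import time
from collections import deque
from typing import Callable, Deque


def flush_outbox(
    outbox: Deque[dict],
    send: Callable[[dict], bool],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send queued messages in order; return how many were delivered.

    A message is removed from the queue only after a successful send.
    """
    sent = 0
    while outbox:
        message = outbox[0]  # peek, do not remove yet
        for attempt in range(max_attempts):
            try:
                ok = send(message)
            except OSError:  # network-level failure counts as a failed attempt
                ok = False
            if ok:
                outbox.popleft()
                sent += 1
                break
            if attempt < max_attempts - 1:
                # Exponential backoff: 1x, 2x, 4x ... the base delay.
                sleep(base_delay_s * 2 ** attempt)
        else:
            # All attempts failed: stop for now, the message stays queued
            # and is tried again on the next call.
            break
    return sent
```

Calling `flush_outbox` again later, for example on the next heartbeat, picks up whatever is still queued. Running the re-try in the same loop that owns the queue also avoids depending on a separate re-try thread staying alive.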
The new version should be ready for internal testing sometime this week. How long that testing takes depends on what problems come up, as this is a rather heavy re-write.
Excellent! Thank you so much for your help and I'm glad I was able to get you more data to analyze for the bugfix. Let me know if there is anything else you need from me, otherwise I'll be looking forward to the firmware update. Thanks again!