Agent keeps going inactive / disconnecting
-
wrote on Aug 17, 2023, 11:48 AM last edited by
Good morning.....
I ran tracert from the server running the software agent, and it seems all "hops" after the first one, to the router this server is plugged into (192.168.1.1), time out. I am honestly stumped.
-
wrote on Aug 17, 2023, 3:25 PM last edited by OutagesIO_Support Aug 17, 2023, 8:27 AM
Well, at least we have a lead, right?
Now, do you have another server or device you could ping from that is connected to the same network?
Something that's connected to the same switch or router that this server is connected to.
I mention that because sometimes there are multiple switches and/or routers.
If you can test pings / tracert from another server/device on the same network, that might give you another lead. If you see all responses, then it leads us back to the server the agent is running on. If there are no responses, it leads us to an upstream switch or router that is possibly blocking ICMP at some level or another.
Properly secured equipment often limits the amount of ICMP traffic to lower the chance of being scanned and other hack attempts or simply to lower resource usage.
Maybe something upstream is simply blocking ICMP when it thinks there's too much traffic coming from something inside or outside the network.
ICMP rate limiting is common, and it could explain why you see responses at first and then none once the device decides there has been too much traffic.
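As a rough illustration of the pattern described above (replies at first, then nothing), here is a minimal Python sketch. The function name `looks_rate_limited` and the `min_run` threshold are made up for this example, and a match is only a hint, not proof, that something upstream is limiting ICMP.

```python
from typing import Sequence


def looks_rate_limited(replies: Sequence[bool], min_run: int = 3) -> bool:
    """Heuristic: True if a ping series starts with replies and ends with none.

    `replies` holds one boolean per ping sent, in order (True = reply received).
    `min_run` is an arbitrary threshold chosen for this sketch.
    """
    if len(replies) < 2 * min_run:
        return False  # too few samples to say anything
    head = replies[:min_run]
    tail = replies[-min_run:]
    # Replies at the start, silence at the end.
    return all(head) and not any(tail)


# Example: five replies followed by five timeouts.
print(looks_rate_limited([True] * 5 + [False] * 5))
```

A series that fails from the very first ping would not match, which points more toward ICMP being blocked outright than limited.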
-
wrote on Aug 17, 2023, 4:37 PM last edited by
I was able to ping both URLs from another machine on the same switch, but tracert would give the same "time out" result as the server. As far as I know, ICMP isn't being blocked by the firewall, but I will check that.
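To compare tracert results between two machines, the timed-out hops can be pulled out of the saved output. This is only a sketch that assumes English-language Windows tracert output, where a silent hop is reported as "Request timed out."; the function name and the sample text (including the documentation address 192.0.2.10) are illustrative.

```python
import re
from typing import List

# A hop line starts with optional spaces and the hop number.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")


def timed_out_hops(tracert_output: str) -> List[int]:
    """Return the hop numbers that tracert reported as timed out."""
    hops = []
    for line in tracert_output.splitlines():
        match = HOP_LINE.match(line)
        if match and "Request timed out" in match.group(2):
            hops.append(int(match.group(1)))
    return hops


sample = """Tracing route to example.com [192.0.2.10]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  192.168.1.1
  2     *        *        *     Request timed out.
  3     *        *        *     Request timed out.

Trace complete.
"""

print(timed_out_hops(sample))  # [2, 3]
```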
-
wrote on Aug 17, 2023, 4:47 PM last edited by
What's interesting and maybe just a coincidence is restarting the agent service seems to fix the problem temporarily. That implies that whatever is upstream is allowing some ICMP traffic and then blocking it.
FYI, the agent uses standard ports 80/443 and ICMP. While ICMP is a small part of actually detecting an outage, it's an important part because it's not only part of the heartbeat used to know whether the agent is still communicating, but it's also how the agent sends updated hop changes and other tests.
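For anyone wanting to rule out the TCP side, a quick outbound check of ports 80 and 443 can be sketched in Python as below. This is not how the agent itself works; the function name, the `connect` parameter (there so the check can be exercised without a network), and the placeholder host are assumptions for the example. It does not test ICMP, which needs ping or tracert.

```python
import socket
from typing import Callable, Dict, Iterable


def check_tcp_ports(
    host: str,
    ports: Iterable[int] = (80, 443),
    timeout: float = 3.0,
    connect: Callable = socket.create_connection,
) -> Dict[int, bool]:
    """Try a TCP connection to each port and report which ones succeed."""
    results = {}
    for port in ports:
        try:
            conn = connect((host, port), timeout=timeout)
            conn.close()
            results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            results[port] = False
    return results


if __name__ == "__main__":
    # Replace the placeholder with a host you expect to reach.
    print(check_tcp_ports("example.com"))
```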
-
wrote on Aug 18, 2023, 2:32 PM last edited by
If your office has a managed services arrangement with an outside company, it might be worth asking them if they have any ICMP (and/or other) limiters put in place. If it's not them and you can't find anything in the building, then it could also be the provider.
-
wrote on Aug 20, 2023, 2:53 PM last edited by
Hi, as an update, the "Back online" email is correct now.
-
wrote on Aug 21, 2023, 4:38 PM last edited by
Good afternoon.....
I believe I've got everything resolved. So, Cisco disables / blocks ICMP traffic by default to any outside interface. Once I allowed ICMP traffic on my firewall / ASA, my "hops" started showing in the history, and tracert to foxymon.com and tpw.outagesio.com (as well as to any other URL) no longer gave a "time out".
Thank you to everyone for the help!
-
wrote on Aug 21, 2023, 7:04 PM last edited by
Hi,
That's great to hear and I now see hops coming in which means everything should work now.
It also means your hardware agent should function perfectly once you receive it.
-
wrote on Aug 21, 2023, 7:09 PM last edited by
I actually just submitted another question regarding the hardware agent.... it is not showing pings or hops, but is connected to the same switch as the server hosting the software agent that is now working fine.
-
wrote on Aug 21, 2023, 7:17 PM last edited by
That's a bit humorous. We were just sending an email to that agent owner about that then noticed it's the same address.
I've asked our dev to look into this one because now I'm stumped and need more input on why it would see a local outage without ICMP.
-
wrote on Aug 21, 2023, 7:19 PM last edited by
The "outage" may have been my fault... I switched to a different port on the switch hoping it would possibly correct the no ping / no hops issue lol.
-
wrote on Aug 21, 2023, 7:22 PM last edited by
As far as I understand it, that should have created only an Inactive, not an outage. Give us a few minutes to look into this (ID 130432).
-
wrote on Aug 21, 2023, 7:29 PM last edited by
I don’t know if this means anything, but the date and time shown for all the events are incorrect. It’s showing 8-16-23 at 11am, but it’s almost 3:30pm here by me in NJ
-
wrote on Aug 21, 2023, 8:03 PM last edited by OutagesIO_Support Aug 21, 2023, 1:26 PM
Hi,
First, I need to confirm with you that the agent was activated by you on 2023-08-21 11:25:03 UTC, i.e., today at 7:25 am in the NJ time zone.
-
wrote on Aug 21, 2023, 8:13 PM last edited by
It was activated by me at about 2:30pm (NJ time).
-
wrote on Aug 21, 2023, 8:55 PM last edited by mshafrin Aug 21, 2023, 1:55 PM
I activated the HW agent about 2:30pm (NJ time).
-
wrote on Aug 21, 2023, 8:59 PM last edited by
My bad: the time I gave you was MST, not UTC, so all is fine with the 2:25 (almost 2:30) pm.
What I think is happening is that the agent has the wrong date, which is kind of weird!
Pings are coming in and hops came in too, but with wrong dates.
Let me see what I can do
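The MST versus UTC mix-up is easy to check with a few lines of Python. This sketch uses fixed offsets as an assumption (MST as UTC-7 and New Jersey on EDT, UTC-4, in August) rather than a time zone database, and the helper name `to_zone` is made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets assumed for August 2023.
MST = timezone(timedelta(hours=-7), "MST")
EDT = timezone(timedelta(hours=-4), "EDT")


def to_zone(naive: datetime, source: timezone, target: timezone) -> datetime:
    """Interpret a naive timestamp as being in `source` and convert it to `target`."""
    return naive.replace(tzinfo=source).astimezone(target)


stamp = datetime(2023, 8, 21, 11, 25, 3)

as_utc = to_zone(stamp, timezone.utc, EDT)
as_mst = to_zone(stamp, MST, EDT)

print(as_utc.strftime("%H:%M:%S %Z"))  # 07:25:03 EDT (if the stamp were UTC)
print(as_mst.strftime("%H:%M:%S %Z"))  # 14:25:03 EDT (the stamp read as MST)
```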
-
wrote on Aug 21, 2023, 9:25 PM last edited by SBK Aug 21, 2023, 2:29 PM
Is it possible that UDP port 123 (NTP service) has been blocked in your network?
-
wrote on Aug 21, 2023, 9:43 PM last edited by
Not to my knowledge. My workstations and servers are able to sync with time servers, so as far as I know, it’s not blocked.
-
wrote on Aug 21, 2023, 9:53 PM last edited by SBK Aug 21, 2023, 3:10 PM
I have been able to change the date and time just now, but NTP is not working, so the agent's time is not exact (almost 1 min off).
All agents sync with the ntp.org servers, so either the port or the URL is somehow blocked.
All the confusion in hops and pings is in fact caused by the date and time sent by the agent, which remained stuck at Aug 16, when it was prepared and shipped.
Pings are coming in correctly; you should see them, and hops too.
I will change the historical data so it fits the correct timing, but the issue of NTP not working properly on that agent is still unsolved.
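To test from inside the network whether NTP over UDP 123 gets through, a bare-bones SNTP query can be sketched as follows. This is not the agent's own code; the function names are illustrative, `pool.ntp.org` is assumed as the server name, and the offset ignores network delay, so treat the number as approximate. A timeout here would suggest the port or the hostname is being blocked somewhere.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_sntp_request() -> bytes:
    """48-byte client request: LI=0, version 3, mode 3 (client) gives 0x1B."""
    return b"\x1b" + 47 * b"\x00"


def parse_transmit_time(packet: bytes) -> float:
    """Extract the server's transmit timestamp as Unix time."""
    if len(packet) < 48:
        raise ValueError("NTP reply too short")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def clock_offset(server: str = "pool.ntp.org", timeout: float = 5.0) -> float:
    """Approximate (server time - local time) in seconds via UDP port 123."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_sntp_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
    return parse_transmit_time(packet) - time.time()


if __name__ == "__main__":
    try:
        print(f"offset: {clock_offset():+.3f} s")
    except OSError as exc:
        # Includes socket timeouts and DNS failures.
        print(f"NTP query failed: {exc}")
```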