Track Internet disconnections, provider outages with historical data, and automated speed testing.
For Windows, Linux, ARM64, ARMa7. Learn more by visiting www.outagesio.com
2.5Gbps Hardware Agent Repeatedly "Rebooting", Missing Data
-
wrote on Jan 14, 2025, 7:32 PM last edited by
129878 is back online now, downline one switch from 131232.
-
wrote on Jan 14, 2025, 10:45 PM last edited by SBK Jan 14, 2025, 3:46 PM
-
wrote on Jan 14, 2025, 10:53 PM last edited by
The 3 agents recorded a similar situation on Jan 13 at around 13:33 Chicago time, but at the same time did not record any type of outage.
There are two different technologies (Windows, OpenWrt) and three different builds (the MT300 and MT3000 both run OpenWrt but use different binaries), yet they all:
- cannot identify an outage
- are monitoring inactives
- disagree on some minor timing, which may be related to the way the three agents are connected to the LAN
So the next question is: is it possible, without any specific detail, to understand whether all three agents are connected the same way within the LAN (different VLANs, connected directly to the router or through a switch, different rules in the firewall)?
A simple hand-drawn picture is more than enough. As I said, no company detail is needed; I am just trying to see where the problem originates and why they behave this way.
-
wrote on Jan 15, 2025, 1:57 AM last edited by OutagesIO_Support Jan 15, 2025, 7:43 AM
Retracted. Accidentally replied to Ed :)
-
wrote on Jan 15, 2025, 2:11 PM last edited by recalc Jan 15, 2025, 7:14 AM
I see @OutagesIO_Support seeming to quote/reply to @SBK, and I am not sure what portion is directed to @SBK versus the OP (me).
The entire purpose of my getting the new 2.5Gbps hardware agent was to better monitor outages from the new ISP (WeLink) and compare to the old ISP (AT&T Fiber). Unfortunately, there may be something else weird going on with the local network, which may or may not be an artifact of something the new ISP does (e.g. CGNAT, RF at unknown reliability vs Fiber, maybe some filtering they did not tell me about, etc).
Once the bug about certain but not all 2.5Gbps hardware agents not reporting was quickly fixed, I am not sure what's up, but I do know I did NOT keep the LAN and WAN/ISP setup in a steady state suitable for any in-depth monitoring throughout that time, until late evening Chicago time 14 Jan. Work was underway to get internet service back online for users, and that no doubt made any troubleshooting of agents difficult.
Over those couple of days I had to swap back and forth between the old and new ISP a couple of times (eventually switching back to the original ISP, the one I had before the 2.5Gbps agent installation), and I added a new software agent and put the old yellow hardware agent back online. A few, maybe several, of the times when agent(s) ceased communication were likely when: the ISP tech came to attempt to adjust their RF (did not solve), the ISP tech took the connection offline, I had to swap back and forth between ISPs by physically moving cables on 2 different floors and in 2 separate buildings, I moved HW agent 129878 to a different switch on a different floor, routers rebooted, etc.
The current setup, since late evening Chicago time 14 Jan, is this:
ISP ONT/Router BGW320-500 (in passthrough mode) 3GbE port -> TP-Link Deco BE25 (2.5GbE) -> 2.5GbE switch:
  -> HW 2.5Gbps Agent 131232
  -> [other devices]
  -> Win11 PC running SW Agent 131236 (and in live use by a user, so potential PC reboots or disconnects)
  -> ~30-40m cat5e cable -> another 2.5GbE switch:
    -> HW Agent 129878
    -> [other devices]
That setup should now remain stable for a month or two, until WeLink can persuade me they have solved whatever technical issues they were having and want to attempt again to supplant the current ISP (AT&T Fiber), after WeLink failed to perform adequately this month.
-
wrote on Jan 15, 2025, 2:30 PM last edited by OutagesIO_Support Jan 15, 2025, 7:35 AM
Yes sorry, I replied to Ed's post in error. I was tired. You can disregard that. Sorry for the confusion.
By the way, only our dedicated hardware agents are rebooted. On Windows, the service restarts itself nightly only. This is because PCs can also be in use by people and we would not want them to lose any work they were doing.
-
wrote on Jan 16, 2025, 5:10 PM last edited by SBK Jan 16, 2025, 2:11 PM
@recalc
I keep seeing micro disconnections (those inactives you are receiving) which are not related to outages. Even though they last only a few seconds and there are no more than 3 per day, they are still there, and they are not internet outages.
-
wrote 24 days ago last edited by
The yellow HW agent works great. But when I swap it for the 2.5Gbps HW agent, the 2.5Gbps HW agent has problems.
It does seem like perhaps the ISP (WeLink) is doing something weird that may be frustrating the 2.5Gbps hardware agent. After leaving this agent on one LAN and then on the other, I noticed that if I switch the 2.5Gbps agent to the LAN with WeLink as the ISP, it gets stuck not communicating (for hours, until I give up). If I put it on the LAN with AT&T as the ISP, it instantly communicates properly.
Power cycling the HW agent between network swaps does not seem to change this.
Restarting the (WeLink) router also does not seem to change this.
OTOH, when I attach the yellow HW agent (130727) instead to that same WeLink ISP LAN, on the same port on the same WeLink router where the 2.5Gbps agent fails, the yellow HW agent works fine, immediately.
The yellow HW agent works seamlessly and immediately whichever LAN I put it on. The 2.5Gbps HW agent works only on the AT&T LAN. When the yellow HW agent and then the 2.5Gbps agent are tried on the same ports on the same routers, the yellow HW agent works in all cases, but the 2.5Gbps agent seems to always have trouble on the WeLink LAN. The SW agent (131236) seems to work just as well on either LAN.
-
wrote 23 days ago last edited by SBK 23 days ago
I am definitely interested in troubleshooting this, since you have come to the conclusion that the MT3000 is somehow "WeLink intolerant" (joking of course); there must be something that is triggering the different behavior.
Just for the sake of info, the MT300 can only reach a 100Mbit connection, while the MT3000 is able to go beyond that value, up to the nominal 2.5Gbps.
-
wrote 23 days ago last edited by recalc 23 days ago
So, for now, I will leave the 2.5Gbps HW agent (131232) on the WeLink ISP LAN, but it seems it may never connect. If it would be helpful, I can move it to the other LAN to receive updates. It may be best to wait until WeLink claims (again) to have made some sort of fix or adjustment, and then see if the 2.5Gbps HW agent remains frustrated by whatever odd thing WeLink may be doing.
In the current installation, WeLink is providing symmetrical 2Gbps service, and it tests very close to that speed with the gateway router's built-in speed test. However, the gateway router WeLink provides (Eero Pro 6E with 2x Eero 6+) has only one 2.5GbE and one 1GbE port. So, in this setup, WeLink's antenna internet source (DHCP, and they use CGNAT) is connected to the Eero router's 2.5GbE port, and the 2.5Gbps HW agent is connected to the Eero router's 1GbE port. There is no change when it is connected to the 1GbE port of one of the Eero 6+ units (connected to the gateway Eero Pro 6E via wireless backhaul). The Eero router claims to have given the HW agent an IP address via DHCP and claims to be communicating with the HW agent.
FWIW, when I swapped the Eero for a TP-Link Deco BE5000 (i.e. 3x Deco BE25), using the Deco as the gateway for WeLink, the Deco router also seemed to have problems, going offline after less than 1 minute and coming back only occasionally (or not until after my patience was lost). All the gear (HW agents, Deco routers, switches) seems to work well when connected to the AT&T Fiber gateway.
-
wrote 22 days ago last edited by
Definitely interesting.
I will try between tomorrow and Friday to get in touch with you directly using the chat, to see if I can test a few things I have in mind.
If this week is not possible, then I have to ask you to postpone to the week of February 10, since I will be traveling next week.
-
wrote 22 days ago last edited by
Is the behavior similar to this post?
-
wrote 22 days ago last edited by
@OutagesIO_Support Some issues do seem similar to that post. To ensure the 2.5G agent (131232) receives updates and can restart properly once, I've moved it to the LAN that it DOES work on (AT&T Fiber), to get any overnight updates. Then tomorrow I'll put it back on the (troublesome) WeLink LAN again.
I will also leave the yellow HW agent (130727) on the WeLink LAN (where the yellow HW agent already works fine).
Note that WeLink uses CGNAT, so public IPs are in 50.20.112.0/20, but the router sees its own WAN address in 100.64.0.0/10. Not sure if that matters to the HW agent.
-
wrote 18 days ago last edited by
The hardware agent does not care what the LAN/WAN addresses are; it just monitors its own LAN network settings (IP, GW, MASK, DNS) and outgoing paths.
The interesting aspect is that because it does not care about those things, it will always pick up the routes being used, so over time you can get a sense of those routes, even if they go through CGNAT or SD-WAN, for example. Just keep an eye on your hops to see those. Then you can compare against problem hops no matter which route the traffic takes.
I hope I'm explaining this correctly.
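For anyone who wants to sort hops by address type while reading them, here is a minimal Python sketch. It only uses the standard `ipaddress` module and checks each hop against 100.64.0.0/10, the shared address space carriers use for CGNAT (the same range mentioned earlier in the thread). The hop addresses in the example are made-up placeholders, not values taken from any agent, and `classify_hop` is a hypothetical helper, not part of the OutagesIO software.

```python
import ipaddress

# Shared address space (RFC 6598) that carriers use for CGNAT.
CGNAT_NET = ipaddress.ip_network("100.64.0.0/10")

def classify_hop(addr: str) -> str:
    """Label a hop address as 'cgnat', 'private', or 'public'."""
    ip = ipaddress.ip_address(addr)
    if ip in CGNAT_NET:
        return "cgnat"
    if ip.is_private:
        return "private"
    return "public"

# Placeholder hop list for illustration only (not real trace data).
hops = ["192.168.4.1", "100.64.12.1", "50.20.112.1"]
for hop in hops:
    print(hop, classify_hop(hop))
```

A hop labeled "cgnat" sits inside the carrier's NAT layer, so it is a hop the ISP controls rather than something on the local LAN or the open internet.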
-
wrote 18 days ago last edited by
So far, with the 2.5g agent and an older yellow hw agent both connected to the same LAN, the 2.5g hw agent always remains disconnected, despite the router claiming to have given it an IP address. The yellow 100M hardware agent, on the other hand, stays connected just fine. So I'm back to wondering what difference between the two hardware agents makes one perpetually reboot or disconnect while the other works normally.
-
wrote 18 days ago last edited by
Ed will be back soon and will chime in again.
That's something we keep talking about also while monitoring this.
While the hardware is different, the software running on both is the same.
The yellow device is much less powerful than the 2.5g but the problem doesn't seem to be related to performance issues with the devices.
This should not cause any unique behavior but something is obviously happening that eludes us so far.
I'm not sure if this was asked before but is there any chance we could get ssh access into both? Details could be shared in private chat of course.
By running some command line tests, maybe we can find some difference between the two.
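One possible way to organize that comparison, sketched in Python: collect the same settings from each device into a dictionary and print only the keys that differ. The setting names and values below are hypothetical placeholders chosen for illustration (they are not readings from either agent), and `diff_settings` is an assumed helper, not an existing tool.

```python
def diff_settings(a: dict, b: dict) -> dict:
    """Return {key: (a_value, b_value)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}

# Hypothetical placeholder values, to be replaced with what each device reports.
yellow = {"link_speed_mbps": 100, "mtu": 1500, "ipv6": False, "dns": "192.168.4.1"}
fast = {"link_speed_mbps": 1000, "mtu": 1500, "ipv6": True, "dns": "192.168.4.1"}

for key, (y, f) in diff_settings(yellow, fast).items():
    print(f"{key}: yellow={y} 2.5G={f}")
```

Settings that match on both devices drop out of the output, which leaves a short list of candidates to look at first.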
-
wrote 18 days ago last edited by recalc 18 days ago
If I can figure out how to set up ssh then sure.
For the next 10 days or so, I cannot mess with the production LAN (AT&T Fiber 1000Mbps), but I can do whatever with the WeLink 2000Mbps LAN without impacting any users.
Keep in mind that the 2.5Gbps hardware agent seems to work just fine on the AT&T Fiber ISP (1000M) LAN, so there does seem to be something weird related to WeLink:
- Yellow HW agent - works with WeLink and AT&T Fiber LAN equally well
- 2.5Gbps HW agent - works on AT&T Fiber service, but problematic on WeLink
- Deco BE5000 mesh router - works on AT&T Fiber service but problematic on WeLink
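The swap results in the list above can be written down as a small table and checked for the common factor. This is only a sketch of that reasoning in Python; the True/False values restate what is reported in this post, and `failing` is a hypothetical helper name.

```python
# Observations as reported in the post: True = works, False = problematic.
results = {
    ("yellow HW agent", "WeLink"): True,
    ("yellow HW agent", "AT&T Fiber"): True,
    ("2.5Gbps HW agent", "WeLink"): False,
    ("2.5Gbps HW agent", "AT&T Fiber"): True,
    ("Deco BE5000", "WeLink"): False,
    ("Deco BE5000", "AT&T Fiber"): True,
}

def failing(results: dict, index: int) -> list:
    """Collect the distinct values at position `index` among failing pairs."""
    return sorted({key[index] for key, ok in results.items() if not ok})

print("ISPs involved in failures:", failing(results, 1))
print("Devices involved in failures:", failing(results, 0))
```

Every failing pair involves WeLink, while two different devices fail, which is why the ISP side looks like the shared factor. It does not explain why the yellow agent is unaffected.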
-
wrote 18 days ago last edited by
For now, we're testing a new firmware build so as long as you keep it online, we can remotely reboot it to flash the new firmware.
-
wrote 18 days ago last edited by recalc 18 days ago
A couple of reboots of the (WeLink-provided) Eero router and suddenly the 2.5Gbps HW agent is communicating again.
So I will leave it there a while. WeLink did not tell me they fixed anything, but they may have done so, since the last time they wanted to troubleshoot it with me I told them I did not have any more time to devote to that until a few weeks from now.
I tried disabling IPv6 on their router, which forced a reboot. Then I turned IPv6 back on in their router, which caused another reboot. Then I checked the 2.5Gbps HW agent and saw it was communicating again (after reporting being "disconnected" for a couple of days).....
Unfortunately, it went back to "disconnected" again within about 30 minutes :-(
In any event, this does really seem like more of a WeLink issue than a 2.5Gbps HW agent issue.