Agents are running but not reporting
-
@OutagesIO_Support For now the agent is running fine, but it's too early to tell. The problem usually appears after a few days.
-
That might help.
Just a suggestion, but before you do too much work on many servers, maybe stick to one until we have this figured out. Once we do, then do the rest.
The dev noticed he left his test cert on the build. Sometimes, it's hard to fully test new builds. We might not see problems but as members install and report them, we fix them.
The problem is, most people don't report them so it's always nice when someone reports or works with us so we can figure things out, just like in this situation.
We appreciate your patience. We'll get a new build out to you asap.
-
Yes I agree with you and we should do only one change at a time.
However, it will be easier for me to apply this change everywhere as it will let me see faster whether it's working or not. That's why I updated all the servers to .NET 4.7.2. However, I only updated the agent on one server and will not make that change on the other servers for now.
It didn't take too much effort to upgrade all the servers to .NET 4.7.2 as I just had to write a little script and run it on all the servers (Itarian will automate that with a few clicks).
To know which servers have a problem, I just compare the list of online servers in Itarian (a remote management solution) with the list of online servers in outages.io. Then I know which ones have an issue. I recently realized that this problem seems to affect all the servers, and it's random.
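The comparison described above amounts to a set difference. Here is a minimal sketch, assuming both lists can be exported as plain hostnames; the function name and the server names are made up for illustration, and no Itarian or OutagesIO export format or API is implied.

```python
def find_silent_agents(online_in_rmm, reporting_in_outagesio):
    """Return hosts that are online in the RMM tool but not reporting."""
    # Normalize names so "SRV-01 " and "srv-01" compare as equal.
    rmm = {name.strip().lower() for name in online_in_rmm}
    reporting = {name.strip().lower() for name in reporting_in_outagesio}
    # Anything online in the RMM list but missing from the reporting list
    # is a candidate for a stopped agent.
    return sorted(rmm - reporting)


# Hypothetical example data, not real server names.
itarian_online = ["SRV-01", "SRV-02", "SRV-03", "SRV-04"]
outagesio_online = ["srv-01", "srv-03"]

print(find_silent_agents(itarian_online, outagesio_online))
```

With roughly 22 servers, a check like this each morning would show quickly whether a change reduced the number of silent agents.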
So if I test only one server, it may take a long time before I can confirm if the change is stable or not. Whereas if I apply this change on all the servers, I will be quick to see because until now there are always some servers where the agent crashes (I manage about 22 servers).
Tomorrow the servers will be rebooted and the .NET 4.7.2 update will be applied. I will check again if the problem is still there or not.
People often report to me that the internet is slow or not working. With your tool I can check whether it's coming from the ISP or from the local installation. It's very useful for me and I am really keen to solve this agent problem, so thank you for your support.
-
> However it will be easier for me to apply this change everywhere as it will enable me to see faster if it's working or not.
No worries, whatever works for you. I was just a bit concerned that you might spend a lot of time updating a bunch of servers only to find out the agent won't work.
> To know which servers have a problem, I just compare the list of online servers in Itarian (a remote management solution) with the list of online servers in outages.io.
We use something called Zabbix, to monitor servers/health etc but actually use our own hardware agents for connectivity monitoring.
> So if I test only one server, it may take a long time before I can confirm if the change is stable or not. Whereas if I apply this
Yes, that makes sense. I hope we can figure out what the random aspect of it is. While we don't actually support the version of OS you are using, if we can get to a point where it's working right, we could add it to our list of supported platforms.
> change on all the servers, I will be quick to see because until now there are always some servers where the agent crashes
I suspect it's not crashing but the service is stopping for some reason. We've not seen much for actual crashes since even our early versions but definitely see that the service can stop for various reasons.
The problem of course is there are countless variations of software, drivers, packages, updates, etc etc, even when we think they are nearly clones. Sometimes, something really small can affect one server but not another.
The dev is looking into what you've said and testing some ideas.
> People often report to me that the internet is slow or not working. With your tool I can check whether it's coming from the ISP or from the local installation,
That's why we built it actually. Customers would call us thinking what we manage was broken but 99% of the time, it was their own Internet that was not working right.
We thought it would be better to monitor from their own location's perspective so we could see it for ourselves. They didn't have to explain anything to us; we could see it and deal with it.
The service can also be used to gain analytics across all of the ISPs that your organization uses. From your own resources to remote customers/employees, all of that can be consolidated to reveal a lot of interesting information that you'd never get from any ISP. Our Enterprise level gives you that.
For example, you could break down which areas by country, states/provinces, cities, towns, etc, experience the most or least problems.
You could see which ISPs are the most or least reliable, when they have the most problems and where. You could see which are more reliable in terms of the speeds you are paying for and all sorts of other metrics.
You can even see where they have weak or trouble points.
> It's very useful for me and I am really willing to solve this agent problem so thank you for your support.
We love hearing this kind of input and yes, we'll solve it. It just takes some time to find the leads to know what to fix.
-
Ok thank you.
Checking this morning, many servers are reported disconnected from outages.io meaning updating to .NET 4.7.2 did not solve the issue.
-
Hi,
If you re-install yet again, you'll get the same version but with the cert fixed. Here is the answer to your question about the various services.
OtmWinClient is the actual client app. This is the same app as on the other platforms and is built from the same source code.
OtmService is a Windows service that launches and kills OtmWinClient. It is accessible via the Windows service control and starts otm when Windows boots.
OtmServiceApplication is the app that shows the tray icon and lets you start/stop the service via a popup menu. It is perhaps badly named and would be better called OtmTrayApplication.
The rest, we are still working on and are building servers so we can test here as well.
-
Thanks for the clarifications.
Are you building Windows servers to test? Should I wait for your instructions before testing any further?
-
Yes, Windows servers for testing. I'm not sure which yet but one is a very old 2003 server. I don't seem to have a 2016 version otherwise I would give that a try too.
You can try the most recent version just to see if the cert problem is gone, which it should be. Otherwise, it's just a matter of time for us to do some testing, coding, testing.
Here is what I know so far.
The current and previous versions of the installer already install for “all users”.
It will probably work on most server editions as long as it’s 64-bit.
On a server, each user gets a copy of the tray icon app.
That tray icon app just starts and stops the service.
There is only one instance of the agent running but multiple instances of the tray app. That should not be a problem. It should be possible to restrict that to administrators so I will check that.
I know we touched on this but you could try what we talked about. Make sure all instances are removed then re-install just one as full admin using the new version and see what happens.
The new version pulls everything down needed so maybe it would pull something down that got missed before.
-
It's a lot of effort to update all the servers as I need to do it manually. Is there a way to do an unattended installation so I can script it?
You can download the Windows Server Evaluation (180-day trial) here: https://www.microsoft.com/en-us/evalcenter/download-windows-server-2016
-
Thanks, I'll give it a try.
Which version are you using, standard or datacenter?
I'll try standard first since you mentioned a number of people can log into this server.
-
No luck. I'm not a Windows person so cannot get 2016 running right. Just trying to get RDP access to it and cannot.
I'll have to find someone that knows MS.
-
@OutagesIO_Support we use standard edition. You may need to allow RDP in the Windows firewall.
-
Yes, I just can't move around. I built it as a VM and the mouse doesn't work. It didn't work using 2003 either, so maybe those are too old to fully function on VMware 6.7. I'll give it one more try on Proxmox.
BTW, did you try the current version? It's got a change in it that may pull down something missing. I think the dev is waiting to get some feedback from you to know what to look for because we aren't sure why the service might stop on 2016.
-
The current version is deployed on a server but I don't know if it's working properly or not. I need to deploy it to all the servers to notice if there's a problem or not.
Is it possible to do an unattended installation? It is too time-consuming to upgrade all the servers manually.
-
We have no unattended version because each agent has to be assigned to the correct report/owner, which happens at installation time through the codes you enter. The codes are what create the association.
I would just pick one server and see if we can get that one going. You said all of the servers should be identical so if one works, the rest should.
Can you expand on why you're not sure if it's working or not? If you can share the ID, I'll take a look.
-
The agent ID 128994 was updated to the latest version.
I am not sure if it is working because I don't know if the OtmWinClient.exe process has stopped or not since I updated it (my servers are not always on, we shut them down at night).
I just realized I can set a monitor on this process. It will log an alert on Itarian (my remote management tool) when OtmWinClient.exe has stopped. I also configured an auto remediation action that will restart the OtmService. It might solve my problem.
Would it be possible to update existing installations without needing to re-enter the keys? I can't manually update 22 servers every time there is a new release.
-
No, nothing is coming in from that agent, it is not communicating at all.
It's not possible to update agents without the keys. As I mentioned, it's because those keys are what are used to assign the software side that you install to the reports on OutagesIO.
That would be a nice idea for the Enterprise side of our service but I'm not sure how it could be done. The software still has to know which reports/member owns it.
Once we have a solid agent, it usually doesn't change much unless MS changes something that forces us to update.
I'll bring it up as a topic however.
-
The server was shut down yesterday, so it is normal that there was no communication from that agent.
My fix seems to work (restart OtmService if OtmWinClient.exe is not running) and all the agents are now up. It's a bit of a dirty solution but it was at least simple to implement.
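For anyone without a remote management tool, the same remediation can be approximated with a small script. This is only a sketch of the logic, not what Itarian runs: it assumes `OtmService` is the registered service name (the name used in this thread), relies on the standard Windows `tasklist` and `net` commands, and needs administrator rights. The function names are made up for illustration.

```python
import subprocess

PROCESS_NAME = "OtmWinClient.exe"  # client process named in this thread
SERVICE_NAME = "OtmService"        # assumed to be the registered service name


def process_is_running(tasklist_output, process_name=PROCESS_NAME):
    """Return True if process_name appears in the text printed by tasklist."""
    return process_name.lower() in tasklist_output.lower()


def check_and_restart(run=subprocess.run):
    """Restart the service when the client process is missing.

    `run` is injectable so the logic can be exercised without Windows.
    Returns True if a restart was attempted, False otherwise.
    """
    # Ask Windows for processes whose image name matches the client.
    result = run(
        ["tasklist", "/FI", f"IMAGENAME eq {PROCESS_NAME}"],
        capture_output=True,
        text=True,
    )
    if process_is_running(result.stdout):
        return False
    # "net stop" may fail if the service is already stopped; that is fine,
    # so its result is ignored before starting the service again.
    run(["net", "stop", SERVICE_NAME], capture_output=True, text=True)
    run(["net", "start", SERVICE_NAME], capture_output=True, text=True)
    return True
```

Calling `check_and_restart()` from a scheduled task every few minutes would give roughly the same behaviour as the monitor plus auto-remediation described above. It only hides the symptom; it does not explain why the client stops.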
For other software such as the Itarian or TeamViewer agents, it is possible to embed the authentication keys in the installer, and the agents update automatically. When I deploy the Itarian agent, I just double-click the installer and the agent is automatically configured with my account and starts reporting to the management portal.
-
We're going to try testing on a 2016 here as well.
If the same key could be used, that would work but right now, I don't see how agents can be installed without a unique key as they would not know which reports to write to.
I've got it in my notes as something to bring up.
However, something like this would only be available in our Enterprise level service as the lower levels aren't really built to handle so many agents.