How are speed tests calculated?
-
Hello,
Currently in mediation with my ISP regarding intermittent internet connection and slow speeds. Stipulated speed is 150 mpbs, only getting 50-70 mbps on average according to your speed tests. Local governing body mandates 80-90% service reliability and data reliability according to these rules:
Data rate reliability is measured over a period of one (1) day and calculated as:

Data rate reliability (DRR) = (DDR_A / DDR_U) x (UDR_A / UDR_U) x 100%

Where:
* DDR_A is the average downstream data rate during actual usage during the day
* DDR_U is the "up to" downstream data rate
* UDR_A is the average upstream data rate during actual usage during the day
* UDR_U is the "up to" upstream data rate
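To make the formula concrete, here is a quick sketch of the calculation in Python (the plan speeds and usage averages below are made-up numbers for illustration):

```python
def data_rate_reliability(ddr_avg, ddr_upto, udr_avg, udr_upto):
    """Data rate reliability (DRR) per the rules quoted above.

    ddr_avg / udr_avg: average down/upstream rates during actual usage (Mbps)
    ddr_upto / udr_upto: advertised "up to" down/upstream rates (Mbps)
    """
    return (ddr_avg / ddr_upto) * (udr_avg / udr_upto) * 100.0

# Example: a "150 Mbps down / 50 Mbps up" plan averaging 60/40 Mbps in use
drr = data_rate_reliability(60, 150, 40, 50)
print(f"DRR = {drr:.1f}%")  # 0.4 x 0.8 x 100 = 32.0%
```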
The problem arises when doing speed tests. The ISP's tool of choice is Ookla's speedtest.net, though I have a sneaking suspicion it is a best-case test suite rather than real-world payload testing, or even comprehensive testing. It doesn't help that the ISP has local testing servers near my geographical area, which might skew the results in their favor. The closest testing suite I found was testmy.net, which at least tries to emulate real-world payloads of varying sizes.
In the interest of transparency, how does your speed test service interpret or test speed? Is it something close to what testmy.net is doing? Do you think your process or formula is a good representation of real-world payloads rather than best-case?
-
Hi,
This is a tough topic that isn't very easy to explain.
You are absolutely correct that speed testing is not truly real world: these services are highly optimized, using edge-network CDNs that almost always give great results unless there are severe and ongoing problems in your area specifically. Our results are no different, as we use CDN-based speed testing such as speedtest.net, fast.com, etc.
The only differences you might see come from the automated testing, rather than you manually testing at the moment you feel there is a problem. When you enable speed testing, the test will happen on a regular basis, called a baseline, but also based on the following two conditions.
The agent software keeps track of ping averages and runs an ongoing, regular low-bandwidth speed test. If pings go over the average the agent has built up for the connection it is monitoring, this triggers a speed test. If the result of the low-bandwidth test is noticeably slower than the average it has built up, this also triggers a speed test.
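Roughly sketched in Python, those two conditions might look like this (the threshold factors and window size are illustrative guesses, not OutagesIO's actual values):

```python
from collections import deque

PING_FACTOR = 1.5   # hypothetical: how far above average counts as "over"
SLOW_FACTOR = 0.6   # hypothetical: how much slower counts as "fairly slower"
WINDOW = 100        # samples kept for the running averages

ping_history = deque(maxlen=WINDOW)     # recent ping times (ms)
lowband_history = deque(maxlen=WINDOW)  # recent low-bandwidth results (Mbps)

def should_trigger_speed_test(ping_ms, lowband_mbps):
    """Return True if either condition described above fires."""
    trigger = False
    if ping_history and ping_ms > PING_FACTOR * (sum(ping_history) / len(ping_history)):
        trigger = True  # latency well above the built-up average
    if lowband_history and lowband_mbps < SLOW_FACTOR * (sum(lowband_history) / len(lowband_history)):
        trigger = True  # low-bandwidth result fairly slower than average
    ping_history.append(ping_ms)
    lowband_history.append(lowband_mbps)
    return trigger
```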
We used to have (and may still have) a note explaining that speed testing is a complicated problem because people use commercial speed-testing sites that aren't real world. On the other hand, you cannot speed test to every site you can reach on the Internet: packets travel over many different networks and switches, taking different paths, some with application shaping to ensure no one source takes up too much bandwidth.
Speed testing is kind of a joke but it's all we have as consumers.
Also, keep in mind that unless there is a contract called an SLA between the provider and the customer, the service is offered as a 'best effort' one. This means they will guarantee the speed to the street, or to their nearest switch, and no more. The overall bandwidth is shared with any number of neighbors, making it impossible to promise specific speeds.
It's easy for providers to show they are giving you what you pay for because it's only up to the street. When you test your speeds, you're testing directly off the provider's network, over CDNs that are interconnected at very high speeds.
They cannot promise anything inside or past their own network since those networks are not only shared but the packets are then flowing over networks they do not own or control.
We have been working on our own test but it really is complicated. We have all kinds of ideas but the problem will be trying to support it with members that may not understand how traffic flows over the Internet and how different networks affect the flow.
I would say our results are the same, and the only exception might be that ours are automated, trying to catch things that we humans would otherwise have to catch exactly as they happen.
Note that we have also seen something odd lately: the higher the speeds we test, the stranger the results get. We do not yet know why, but we are working on it.
-
I want to understand the differences between the baseline, latency, and outage flags you use for speed testing. Based on your answer, I am fairly informed about what triggers baseline.
> The only differences you might see are because of the automated testing rather than you manually testing at the time you feel there is a problem or not. When you enable speed testing, the test will happen on a regular basis called a baseline but also based on the following two conditions.
> The agent software keeps track of pings averages and does an ongoing regular low bandwidth speed test. If pings go over the average that the agent has built up for the connection it is monitoring, this will trigger a speed test. If the result of the low speed test is fairly slower than the average it has built up, this will trigger a speed test.
I still do have a few questions:
* What are the test parameters? e.g.:
  * How much test data is sent when testing download/upload?
  * Is it also just testing download/upload once?
  * What specific sites/servers are you pinging/hitting when testing?
Like I said in my earlier post, the closest results to your software I have seen are from testmy.net, not the ISP's recommended tool, Ookla's speed test, which I'd like to contest. The SLA with the ISP is indeed best effort, but the local governing body mandates 80-90% reliability, with at least 50% when best effort is considered, which is why I am in mediation with them. I first need to ascertain the authenticity of your speed test methodology versus Ookla and fast.com to build a more solid case. So far, testmy.net is the one that yields results closest to yours. I run it in 5-minute intervals, up to 100 times. The only downside is that it is not constant like your service.
I would also like to know what triggers latency and outage speed tests. Thank you!
-
I think some information has gone missing; there used to be a lot more about how the speed testing works and what the indicators mean. I'll look into that and have this information included in the 'about this page' for speed testing. Parts of this reply will likely be used to update some of the help information.
> I want to understand the differences between the baseline, latency, and outage flags you use for speed testing.
I can tell you there is a bug that shows black, outage-based speed tests being triggered which aren't actually outage based but something else. What those are, the developer still has to figure out. If you don't see an outage around the same time as one of the black-bar tests, it means something else triggered the test.
Green – Baseline test
The agent software runs a speed test on a regular basis in order to establish a baseline, or average. This is a full saturation test and is shown in the results to give a visual sense of how things are going.
This baseline is also how we can nudge members to let them know when they have incorrectly set their service speed in their speed test configuration.
For example, someone might specify that they have a 100Mbps connection, but the baselines constantly show an average closer to the 300Mbps mark, which means the person entered the wrong speed. On the other hand, someone might enter one gigabit as their connection speed, yet the baseline constantly shows around 200Mbps. That person is either experiencing a true bandwidth issue or simply entered the wrong value.
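That nudge could be sketched roughly like this (the tolerance ratio is invented for illustration; the real logic isn't published):

```python
def config_looks_wrong(configured_mbps, baseline_avg_mbps, tolerance=0.5):
    """Flag when the baseline average strays far from the configured speed.

    tolerance is a made-up ratio: flag if the baseline is more than 50%
    above or below what the member entered.
    """
    ratio = baseline_avg_mbps / configured_mbps
    return ratio > 1 + tolerance or ratio < 1 - tolerance

print(config_looks_wrong(100, 300))   # True: entered speed too low
print(config_looks_wrong(1000, 200))  # True: wrong value, or a real issue
print(config_looks_wrong(150, 140))   # False: close enough
```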
Blue – Latency trigger
This test is triggered when the latency of the connection begins to fluctuate outside of the measured averages. The agent runs pings as one of its tests and uses the results to build an ongoing cache of averages. This average gives an idea of latency to a specific destination, since we cannot ping everything on the Internet. If the algorithm notices that ping times are going well above the average it has built up, it triggers a speed test.
Note that, as mentioned above, the pings are not to some nearest destination but to a constant one. The idea is not to know what ping times are to a certain destination, but whether that average changes drastically. We did test picking a destination that was closer, or within the provider's network, but that didn't seem as 'real world' as a destination across the Internet.
Orange – Slowdown trigger
This test is triggered when short burst speed tests are run and the results show slower than usual speeds.
In this case, the agent regularly performs a small download, again to build up an average. The speed itself doesn't really matter since it's a tiny download; the point is the average. If the time it takes to download the small file changes drastically, the algorithm triggers a speed test.
Black – Outage trigger
This test runs moments after an outage ends, to determine whether speed is back to 'normal' or remained slower than the average calculated before the outage. We think low modem/wireless signal levels could potentially also trigger this test erroneously.
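Putting the four flags together, the selection logic might look roughly like this (the threshold factors and names are invented for illustration; the actual algorithm isn't published):

```python
from enum import Enum

class Trigger(Enum):
    GREEN = "baseline"    # regular scheduled full-saturation test
    BLUE = "latency"      # ping times well above the running average
    ORANGE = "slowdown"   # small-download timing drifted above average
    BLACK = "outage"      # run moments after an outage ends

def classify(scheduled, ping_ms, avg_ping_ms, dl_secs, avg_dl_secs,
             outage_just_ended, ping_factor=1.5, slow_factor=1.5):
    """Pick which flag a speed test would carry, checked in priority order."""
    if outage_just_ended:
        return Trigger.BLACK
    if ping_ms > ping_factor * avg_ping_ms:
        return Trigger.BLUE
    if dl_secs > slow_factor * avg_dl_secs:
        return Trigger.ORANGE
    if scheduled:
        return Trigger.GREEN
    return None  # nothing fired

print(classify(False, 80, 20, 0.3, 0.3, False))  # Trigger.BLUE
```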
Limitations, problems with speed testing
There are some limitations when speed testing beyond 100Mbps when those tests are HTTP based. The limitation shows up randomly and is something we intend to address.
That said, we really do not want to run full saturation tests day in and day out, as these tests uselessly waste bandwidth. Speed tests should be based on usable bandwidth, not the total amount of bandwidth, since it's shared to begin with. So long as we have a fair and usable amount of bandwidth, that is what matters most. This has challenged us since we started, and while we have some ideas on the table, nothing has been reliable enough to start using.
> What are the test parameters? e.g.
See above. I do not have the exact algorithm parameters as we are constantly trying to fine tune this.
> How much test data is sent when testing download/upload?
> Is it also just testing download/upload once?
I am not 100% sure of all the details, but we do not test upload, only download. A full saturation test means downloading multiple streams of X amount of data to saturate the connection in order to get a speed result. You can look that up on the net; there are countless articles on how speed testing works.
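The multiple-streams idea can be sketched like this (the URL and per-stream size are placeholders, and the network fetch is stubbed out rather than actually downloading anything):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    """Download one stream and return bytes received (stubbed here).

    A real test would do something like urllib.request.urlopen(url).read().
    """
    return 25_000_000  # pretend each stream pulled 25 MB

def saturation_test(urls):
    """Total bytes across all parallel streams over wall-clock time, in Mbps."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=len(urls)) as pool:
        total_bytes = sum(pool.map(fetch, urls))
    elapsed = max(time.monotonic() - start, 1e-9)  # guard against zero
    return total_bytes * 8 / elapsed / 1e6

mbps = saturation_test(["https://example.test/blob"] * 4)
```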
> What specific sites/servers are you pinging/hitting when testing?
As mentioned, our mission and goal is to try to use real-world scenarios. Practically everything is done against OutagesIO, using our own servers and networks. As we discussed earlier, testing against highly optimized edge-network services is not real world at all. In the real world, packets flow across half a dozen or more networks that the source and destination have no control over. Some of those networks do packet/application shaping; some allow more throughput per source while others limit it. There really is no way to do honest-to-goodness speed testing unless you own the entire network and can control everything about it.
Real world is also near impossible because the path taken by your OutagesIO agent to the OutagesIO network will be completely different from the path your browser takes to Facebook, and to almost all the other sites you have open in other tabs.
Our mission is not to monitor the entire Internet; it is to show whether the provider is experiencing problems they don't know about, or won't admit to until enough people complain. Why would a provider spend money in a neighborhood that doesn't really know how poorly served it is, unless many people in that area started complaining?
Our goal is mainly to give people a way to monitor from a source to a destination, through their provider. When neighbors join in and start comparing their results, there is no way to hide problems, and they must be dealt with.
> SLA with ISP is indeed best effort, but the local governing body mandates 80-90% reliability, with at least 50% when best effort is considered. Which is why I am in mediation with them. I need to ascertain first the authenticity of your speed test methodology
The problem is, how do you prove it? If you have 200 neighbors on the same switch, the reason it works for the provider is that there is little chance everyone will need all of their bandwidth at the same time. The provider can easily claim overall Internet congestion while showing they have more than enough throughput to handle your area. It's a real game.
We cannot guarantee anything. Our service should be used as yet another tool, as part of your overall tool kit. With problems like the ones you describe, human intervention is absolutely required. Your best option would be to find others in your area to install an agent too, and see how their services are performing. Compare with others and see what starts looking like a pattern. It's how we do it here.
In fact, these are the kinds of things we like to get involved with, to try to get some exposure for the service. It is darn near impossible to be found, because big providers and outage sites have much deeper pockets and can outbid us, making it difficult for people to find services like ours that might be able to help with problems like these, including ongoing ones. If you could get more people involved, we would be happy to help explain what we think we are seeing, too.
> So far, testmy.net is the one which yields the closest to your results. I run it in 5 minute intervals up to 100 times. Only downside is it is not constant like your service.
This is the inherent problem with speed testing. Speed testing all day long only uses up that bandwidth, and the problem might not even be a bandwidth issue but a throughput one, either because there is not enough overall bandwidth in the area, or because something is failing and the provider either doesn't know it is limiting customer speeds or doesn't want to admit it, because fixing it could cost money they don't want to spend in that area yet.
I hope this helps to some degree.
-
These are some great answers, which have been factored into the details of the mediation case against our local provider. And yes, I am using all the available open-source tools first to get at the real issue at hand. One big problem is the monopoly of said provider and the lack of competing services; another is being technically competent enough to eloquently address the issues being faced, and to detect them when they happen.
Not a lot of people are aware of these problems because home internet usage had mostly been non-commercial. That all changed when the pandemic hit, forcing everyone who could work remotely to do so. When issues arise, the majority of people I see just accept that the current infrastructure is problematic and bear with it, so there's no movement there. Hence it has become very important for me to provide accurate, or near-accurate, data and records, because frankly, while my case isn't isolated, nobody has the time to call out our service providers; we're all too damn busy making a living.
Thank you for taking the time to address my questions. I'll try to update you guys or ask additional questions in the future.
-
Yes, most consumers just take it on the chin, it's less frustrating than fighting for what is right or what you are owed.