TXCTG – Compilation/Averaging of Testing Results Feb 2016 to Nov 2016

Brainstorm69

So, as you may recall, I recently posted a comparison of TXCTG testing results for the Radenso XP vs. the Uniden LRD950/DFR7 (https://www.rdforum.org/showthread.php?t=56655). In that post, I mentioned that I had thought about doing the same thing for all the detectors we've tested, and had even started it, but that it was a rather large task. Since then, I have made and posted the TXCTG Testing Index (https://www.rdforum.org/showthread.php?t=56827), which made this task a little easier by putting the links to all the TXCTG testing in one place.

So I've spent some time over the last couple of days making charts that compare all the detectors @Jag42 and I have tested since last February, which is when I began making charts showing how each detector did compared to the best detection on each band. In these charts, I have listed each detector, the number of samples for that detector on each band, the range of results across those samples, and the detector's average percentage of the best detection.

I've done that for Ka 34.7, Ka 35.5, Ka 33.8, and K-band, so there are four charts. In each case, I also made sure that one or more Redlines were included in the testing as a control of sorts (save one: I did use some 35.5 testing from the Groesbeck East course where every detector was maxing the course, so we didn't even test a Redline. But I thought those results were relevant, as the Redline would have just maxed out the course as well).

As I mentioned in the XP vs. LRD950/DFR7 post, I haven't done anything to try to normalize the data (probably beyond my level of expertise) for things like the courses the data came from, or whether a course had some feature (terrain, vegetation, obstacles, etc.) that might cause a large variation in detection distances, or no variation at all. I also didn't try to separate the K-band results into filtered vs. unfiltered (that would have been very difficult with the original data, which mixed filtered and unfiltered when the percentages were created). So, again, this is hardly scientific, and for some detectors the sample size is quite small. Take that into account when viewing the data. If you have questions about anything, please ask, and I will do my best to answer.

Here are the charts for each band, sorted from highest to lowest average vs. the best detection on each band in each testing event. You can think of these as charts of each detector vs. the theoretical best detector that has the best detection (100%) all the time.
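For anyone curious about the arithmetic behind the charts, here is a rough sketch of how the numbers get put together. The data layout, course names, and values below are made up purely for illustration; they are not the actual test data or the format of the spreadsheets.

Code:
from collections import defaultdict

# Hypothetical raw results for one band: (test event, detector, alert distance in miles).
# Every value below is invented purely for illustration.
runs = [
    ("Course A 2/2016", "Redline", 1.50),
    ("Course A 2/2016", "Radenso XP", 1.35),
    ("Course B 6/2016", "Redline", 2.10),
    ("Course B 6/2016", "Radenso XP", 2.10),
]

# The longest detection in each test event counts as 100% for that event.
best = defaultdict(float)
for event, detector, dist in runs:
    best[event] = max(best[event], dist)

# Each run becomes a percentage of that event's best detection.
pct = defaultdict(list)
for event, detector, dist in runs:
    pct[detector].append(100.0 * dist / best[event])

# Per detector: number of samples, range of results, and average percentage of best,
# sorted from highest to lowest average (the way the charts are sorted).
for detector, scores in sorted(pct.items(), key=lambda kv: -sum(kv[1]) / len(kv[1])):
    print(f"{detector}: n={len(scores)}, "
          f"range={min(scores):.0f}%-{max(scores):.0f}%, "
          f"avg={sum(scores) / len(scores):.1f}%")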

34.7 Results

34.7 Compilation Chart 11-5-2016.JPG


35.5 Results

35.5 Compilation Chart 11-5-2016.JPG


33.8 Results

33.8 Compilation Chart 11-5-2016.JPG


K-band Results

K-Band Compilation Chart 11-5-2016.JPG
 


Vortex

This is amazing! Thank you!! :thumbsup:
 

Brainstorm69

Just realized I forgot to sort the K-band...will fix that momentarily. [EDIT: Fixed]
 

PointerCone

We need a bigger test sample/more runs on the M360. I find it hard to believe it's that good on 33.8.
 

fitz4321

Wow man. Thanks for your work!

That chart reinforced that my decision to pick up the RPSE was well worth it. The most range, with good filtering (excellent the way mine is set up).


 

n3xus

We need a bigger test sample/more runs on the M360. I find it hard to believe it's that good on 33.8.
Definitely agree! The 360 needs more testing for sure with a mix of different units.
 

Brainstorm69

We need a bigger test sample/more runs on the M360. I find it hard to believe it's that good on 33.8.
I agree that more data/runs are needed. All the 33.8 data for the Max 360 came from one test and one detector - @akamanonthemoon's Max 360.

I know we have given 33.8 somewhat short shrift in our testing in the past. This data highlights that fact. It also highlights that we need more testing of certain models on more than just 33.8. The Max 360 and iX certainly fit in that category.

In general, this also makes me want to re-think how we are testing. We tend to go out and try to get as many bands tested as we can, which leads to one- or two-run tests for each detector due to time constraints. I'm thinking perhaps we should narrow the focus; e.g., get all the right detectors together and focus on one band for an entire test day. That way, we could get maybe 10 runs per detector instead of one or two. Or, if we could get two examples of each, do 5 runs apiece (10 runs per model).

It is interesting how seeing the forest instead of the trees sometimes results from exercises like this.
 

akamanonthemoon

I agree that more data/runs are needed. All the 33.8 data for the Max 360 came from one test and one detector - @akamanonthemoon's Max 360. ...
Always happy to contribute my Max360 for further testing! I also have a 9500ix, Radenso Pro and LRD950 just in case I have a magical RD touch!! LOL!


 

n3xus

I also have an M360 that can be used for future tests.
 

Edwv30

Nice! Question though: in order to obtain a true average of each detector's performance, shouldn't you just include the runs where all of the detectors were included? As an example, one detector has an average of its performance across 30 runs. The second detector has an average of its performance across 20 runs (because it wasn't included in 10 of the first detector's runs). I don't think we should compare the first detector to the second in this case; we should only include the average of the 20 runs where both detectors were tested. The 10 extra runs for the first detector may have had more difficult terrain, FF, etc., and it would be comparing apples to oranges. In any case, I like where you are going. Thanks for the hard work.
 

Brainstorm69

Nice! Question though: in order to obtain a true average of each detector's performance, shouldn't you just include the runs where all of the detectors were included? ...
I see where you are coming from. I know the approach here is not perfect. Unfortunately, nothing is. One difficulty is getting all the same detectors on all the same courses. And even if you get all the same detectors on the same course, that doesn't guarantee a "fair" outcome. Sometimes you are testing on a course that may produce a wide variance in detection distances because of certain terrain features. For instance, we have tested on a course where, if a detector didn't get a detection by a certain spot, it couldn't get one for the next two miles. Is that really a fair test to include? We have also tested on courses where, because of terrain, etc., every detector alerted at the same distance. Is that a fair test to include? I'm hoping that if we include enough tests, it all evens out somewhat in the end. That's why sample size is important. If the sample size is small and a detector either gets bit in the butt by a difficult course, or lucks into a course where a better detector can't really get a better detection, it can certainly skew the results. That's why I included the sample size and range information.

But I also think it's important to include results like those because they represent the real world. The Redline isn't going to crush every detector in every scenario. Sometimes, the crappiest M4 may get a detection that's just as good as a Redline's. Other times, the M4 is going to get its a$$ kicked. In some respects it's up to the reader to make their own determination of how valuable this data is. I'm just putting it out there as another data point, and hopefully a useful one.
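To put made-up numbers on that: a detector with only three samples of 95%, 96%, and 20% (one bad course) averages about 70%, while the same detector with ten samples mostly in the 90s would barely be dented by that one 20% run. That is exactly why sample size matters so much here.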
 

Edwv30

I see where you are coming from. I know the approach here is not perfect. ...
It's a huge undertaking to compile all of this data; I would never want to do it...thank you! I do have a question. In your results you are giving each detector an average against 100%, but no detector ever reaches 100%. What criteria are you using to obtain the end goal (100%)? I don't think it is the top performer in each test, as at least one detector/platform would then reach 100%. Is it the distance from the start of the course until the detector alerts? Thanks again for the hard work.
 

Brainstorm69

In your results you are giving each detector an average against 100%, but no detector ever reaches 100%. What criteria are you using to obtain the end goal (100%)? ...
At least one detector in each test scores 100 percent (i.e., it had the longest alert distance for that test). So far, no single detector has had the longest alert distance in every test across all test events. That is why no detector's average is 100% in the charts.
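To put some made-up numbers on it: if the longest detection on a given run was 1.50 miles and another detector alerted at 1.20 miles, that run counts as 1.20 / 1.50 = 80% for the second detector, while whichever detector(s) hit 1.50 miles get 100% for that run.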
 

Edwv30

At least one detector in each test scores 100 percent (i.e., it had the longest alert distance for that test). ...
Thank you. So, as an example, you have 30 tests where the RL alerted at an average of 99% in test #1, 91% in test #2, 100% in test #3, etc. The RL is the only detector that scored 100% in test #3. Are you then taking an average of those per-test averages across all RL tests and comparing it to detectors that may have only been involved in 10 of those tests (but may have scored 100% on a different course)? Sorry for all of the questions... I am fascinated by trying to figure out the end results, formulas used, etc.

I think I am going back to my original point. You can't compare average performance from one detector to the next unless all detectors were included in the same runs. You can say that the RL performed, on average, at 97% across all runs in which it ran, but not that it performed at 97% against detectors A, B, C and D unless they were all included in the same test runs.
 

Jag42

Awesome, @Brainstorm69. We should concentrate on 34.7 for a full day. I will not be able to test until after the 21st.
 

Brainstorm69

I think I am going back to my original point. You can't compare average performance from one detector to the next unless all detectors were included in the same runs. ...
@Edwv30 - I agree. I started this whole thread off with full disclosure that this is not scientific, that no attempt has been made to normalize the data, etc. Clearly it is/may be skewed in some places where the sample size is small and, for instance, where a particular detector with a small sample size was tested either on a course with large separation or on one with no separation at all. This was not an attempt to create a definitive "this detector is better than that detector by X% in all cases." I just wanted to pull the data together, see what it looks like, and share it with the Community. I do think it has value when looked at while knowing how it was put together.
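That said, for anyone who wants to slice the data the way you describe, something like the sketch below would do it: restrict the averages to test events where both detectors actually ran. The data layout and names here are made up for illustration; they are not how the actual results are stored.

Code:
from collections import defaultdict

def head_to_head(runs, det_a, det_b):
    """Average percent-of-best for two detectors, using only the test events
    in which BOTH of them ran. 'runs' is a list of
    (test_event, detector, alert_distance) tuples (an illustrative layout,
    not the real spreadsheets)."""
    by_event = defaultdict(dict)
    for event, detector, dist in runs:
        # If a detector had multiple runs in an event, keep its longest detection.
        by_event[event][detector] = max(dist, by_event[event].get(detector, 0.0))

    a_scores, b_scores = [], []
    for results in by_event.values():
        if det_a in results and det_b in results:
            best = max(results.values())  # best of all detectors present in that event
            a_scores.append(100.0 * results[det_a] / best)
            b_scores.append(100.0 * results[det_b] / best)

    if not a_scores:
        return None  # the two detectors never shared a test event
    n = len(a_scores)
    return n, sum(a_scores) / n, sum(b_scores) / n

One judgment call in that sketch is whether 100% should mean the best of all detectors present in the shared event or just the better of the two being compared; I used the former above so the numbers stay comparable to the charts.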
 

Duke295

@Brainstorm69, I think you did a marvelous job. The greatest feature to me is that you have compiled data that will grow over time, including the sample sizes. In terms of being scientific, you always show what was measured and what you could control, such as which units were running filtering, TSR, or band segmentation, the course length, the band being used, etc. You can't control weather, attenuation, atmospheric absorption of the band signal, and so forth. So in fact, other than the different courses used, you have indeed created a comparison for the RD units you had to work with at the time. In essence, this shows a strong baseline for the RDs that have been included. I think the grade assignment is useful for the purpose of detection, which is the whole basis of the test. It does not imply quietness of RDs, lockout ability, or anything else; simply performance vs. band. Job well done!!!
 

PointerCone

I agree that more data/runs are needed. All the 33.8 data for the Max 360 came from one test and one detector - @akamanonthemoon's Max 360. ...

Don't get too caught up in 33.8. It's becoming more of a unicorn lately. I typically see 34.7 Stalker and Kustom/Decatur 35.5, and rarely MPH. I'd think most would agree. Anecdotally, Stalker is over 70% of what's out there in most states.
 

Duke295

I don't doubt that statement, but for me it's the reverse. I see 35.5 the most by a large margin, 33.8 next, and 34.7 the least where I live and travel. Maybe I am just in the minority, but 33.8 is alive and well.
 

HotRodEV

Yep, found a couple of 33.8s in my county recently...
 
