Frank, W3LPL, conducted two interesting experiments with WSPRlites on 20m, essentially from the US to Europe.

The first experiment was a calibration run, if you like, to explore the nature of simultaneous WSPR SNR reports for two transmitters using different call signs on slightly different frequencies, simultaneously feeding approximately the same power to the same antenna.

This article is about the second test, which he describes:

The second test uses a WSPRlite directly feeding the same stacked Yagis, and the second WSPRlite feeding nearly identical stacked Yagis that point directly through the other stack located four wavelengths directly in front. Power at each antenna was about 140 milliwatts for each WSPRlite.

The data for the test interval was extracted from DXplorer. The statistic of main interest is the paired SNR difference: the difference between the SNR reports of the two signals from the same station in the same WSPR measurement interval.
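As an illustration of what "paired" means here, the following is a minimal sketch, not the DXplorer export format or the method actually used. The tuple layout, the transmitter labels `TXA` and `TXB`, the reporter names and the SNR values are all made up for the example.

```python
from collections import defaultdict

def paired_snr_differences(spots, tx_a, tx_b):
    """Return SNR(tx_a) - SNR(tx_b) for each (interval, reporter) that decoded both."""
    by_key = defaultdict(dict)
    for interval, reporter, tx, snr in spots:
        by_key[(interval, reporter)][tx] = snr
    return [
        snrs[tx_a] - snrs[tx_b]
        for _, snrs in sorted(by_key.items())
        if tx_a in snrs and tx_b in snrs  # keep only true pairs
    ]

# Hypothetical spots: (interval start, reporter, transmitter, SNR in dB)
spots = [
    ("12:00", "RX1", "TXA", -12),
    ("12:00", "RX1", "TXB", -16),
    ("12:00", "RX2", "TXA", -20),  # only one transmitter decoded: no pair
    ("12:02", "RX1", "TXA", -10),
    ("12:02", "RX1", "TXB", -15),
]
diffs = paired_snr_differences(spots, "TXA", "TXB")
print(diffs)
```

Reports where a station decoded only one of the two transmitters in an interval are discarded, since they cannot contribute a difference.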

There is an immediate temptation to compare the average difference; it is simple and quick. But it is my experience that WSPR SNR data are not normally distributed, and applying parametric statistics (ie statistical methods that depend on knowledge of the underlying distribution) is seriously flawed.

We might expect that whilst the observed SNR varies up and down with fading etc, the SNR measured due to one antenna relative to the other depends on their gain in the direction of the observer. Even though the two identical antennas point in the same direction for this test, the proximity of one antenna to the other is likely to affect their relative gain in different directions.

What of the distribution of the difference data?

Above is a frequency histogram of the distribution about the mean (4.2). If the data were normally distributed, each of the middle bars (0.675σ wide) would contain 25% of the 815 observations (204). The distribution is clearly grossly asymmetric and is most unlikely to be normal. A Shapiro-Wilk test for normality gives p=4.3e-39, which firmly rejects the hypothesis that the data are normally distributed.

So let's forget about parametric statistics based on the normal distribution: means, standard deviations, Student's t-test etc are unsound for making inferences here because they depend on normality.
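The 0.675σ and 25% figures follow from the quartiles of the normal distribution, and can be checked with Python's standard library. This is a sketch only: `quartile_bin_counts` is a hypothetical helper for binning a sample the same way as the histogram, and the four sample values are made up.

```python
from statistics import NormalDist

n = 815                          # number of paired observations in the article
z = NormalDist().inv_cdf(0.75)   # upper quartile of the standard normal, about 0.6745
expected_per_bin = n * (NormalDist().cdf(z) - NormalDist().cdf(0.0))  # 25% of n

def quartile_bin_counts(values, mean, sd):
    """Count values in four bins split at mean - z*sd, mean and mean + z*sd."""
    edges = [mean - z * sd, mean, mean + z * sd]
    counts = [0, 0, 0, 0]
    for v in values:
        counts[sum(v >= e for e in edges)] += 1
    return counts

print(round(z, 4), round(expected_per_bin))
print(quartile_bin_counts([-2.0, -0.5, 0.5, 2.0], mean=0.0, sd=1.0))
```

For a normally distributed sample all four counts would be close to n/4; a gross imbalance between them is the asymmetry visible in the histogram. For the formal test, SciPy provides `scipy.stats.shapiro`, assuming SciPy is installed.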

Unlike the first experiment, where both transmitters fed the same antenna and we might expect simultaneous observations at each station to be approximately equal, in this case there are two apparently identical antennas, one close to and pointing through the other. The question is whether they are in fact identical in performance, or whether there is some measurable interaction.

So, let's look at the data in a way that might expose their behaviour.

Above is a scatter chart of the 815 paired SNR reports (where an individual station simultaneously decoded both transmitters). Note that many of the dots account for scores of observations; all observations are used to calculate the trend line.

In contrast to the previous test, there is quite a spread of data, but a simple least squares linear regression returns an R^2 result that indicates a moderately strong model with a Y intercept of -3.3dB (ie there is a -3.3dB difference between the systems).
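For readers who want to reproduce this kind of fit, here is a minimal sketch of ordinary least squares on paired SNR values using only the standard library. Which transmitter goes on which axis is an assumption, and the sample values are made up (a constant 3 dB offset), not the experiment's data.

```python
def linear_fit(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept; also returns R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # squared Pearson correlation
    return slope, intercept, r_squared

# Hypothetical paired SNR reports in dB: x from one transmitter, y from the other
snr_x = [-20.0, -15.0, -10.0, -5.0]
snr_y = [-23.0, -18.0, -13.0, -8.0]   # exactly 3 dB lower, for illustration only
slope, intercept, r_squared = linear_fit(snr_x, snr_y)
print(slope, intercept, r_squared)
```

The intercept reads directly as the offset between the two systems only when the fitted slope is close to 1; with a different slope the offset depends on the SNR at which it is evaluated.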

We can reasonably draw the conclusion that there is a significant interaction between the otherwise identical antennas.

In fact, when the data were sub-set to select only reports within +/- 5° of boresight, the difference was more like -5dB.

This raises the question of experiment design: stating the hypothesis to be tested, and then designing the experiment to collect unbiased observations that should permit a conclusion to be drawn.

One has little control over the location of observers in WSPR; their appearance is for the most part random. However, one can fairly easily filter the observations collected to excise those outside a given azimuth range and distance range (which might imply elevation of the propagation path). Filtering in this way ensures that the data are more relevant to the hypothesis being tested, and that should result in better correlation and less uncertainty in the result.
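A filter of that kind might look like the following sketch. The record fields (`azimuth`, `distance_km`, `diff_db`), the boresight of 45° and the distance limits are all hypothetical values chosen for illustration; they are not taken from the experiment.

```python
def angular_offset(azimuth, boresight):
    """Smallest absolute angle in degrees between two compass bearings."""
    return abs((azimuth - boresight + 180.0) % 360.0 - 180.0)

def filter_reports(reports, boresight, half_width, min_km, max_km):
    """Keep reports within +/- half_width degrees of boresight and inside the distance range."""
    return [
        r for r in reports
        if angular_offset(r["azimuth"], boresight) <= half_width
        and min_km <= r["distance_km"] <= max_km
    ]

# Hypothetical reports: azimuth is the bearing from the transmitter to the reporter
reports = [
    {"reporter": "RX1", "azimuth": 47.0, "distance_km": 6100, "diff_db": -5},
    {"reporter": "RX2", "azimuth": 62.0, "distance_km": 6400, "diff_db": -2},  # off boresight
    {"reporter": "RX3", "azimuth": 43.5, "distance_km": 1200, "diff_db": -4},  # too close
]
kept = filter_reports(reports, boresight=45.0, half_width=5.0, min_km=5000, max_km=8000)
print([r["reporter"] for r in kept])
```

The modulo arithmetic in `angular_offset` handles bearings either side of north, so a boresight of 358° and a reporter at 2° are correctly treated as 4° apart.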