Evaluation of Volunteer Data Quality
In a previous issue (Winter 2006 issue of The Water Line), we mentioned how some folks question the quality of data collected through volunteer programs. At that time LMVP staff were evaluating the reliability of LMVP data, with plans to present the findings at the National Water Quality Monitoring Conference. The analyses are done, the graphs are made, the slide show has been presented, and the results indicate that data generated by the LMVP are reliable. What follows is a short review of the methods used to evaluate LMVP data and the results.
Comparison of Annual Averages
We compared data collected through the LMVP to data generated by other University of Missouri water quality monitoring projects. To qualify for this analysis, a site had to be sampled at least three times during a summer by both the LMVP and MU. The result was a total of 178 comparisons (each comparison being an individual site during an individual year), representing 41 different sites from 29 lakes. Each comparison was statistically analyzed to determine if the LMVP average for a given site/year differed from the MU average.
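The article does not name the statistical test used for these site/year comparisons, but the general idea can be sketched in a few lines of code. The Python snippet below assumes a two-sample Welch's t-test (via scipy) on the individual summer samples from a single hypothetical site/year; the phosphorus values and the choice of test are illustrative assumptions, not the actual LMVP procedure.

```python
# Illustrative sketch only: compare LMVP and MU summer samples for one
# hypothetical site/year. The test (Welch's t-test) and the values are
# assumptions; the article does not specify the actual procedure.
from scipy import stats

lmvp_tp = [22.0, 25.5, 19.8, 24.1]   # total phosphorus (ug/L), LMVP samples
mu_tp   = [23.4, 21.9, 26.0]         # total phosphorus (ug/L), MU samples

t_stat, p_value = stats.ttest_ind(lmvp_tp, mu_tp, equal_var=False)

# A p-value above 0.05 would be read as "no significant difference"
# between the LMVP and MU averages for this site/year.
print(f"LMVP mean = {sum(lmvp_tp)/len(lmvp_tp):.1f} ug/L, "
      f"MU mean = {sum(mu_tp)/len(mu_tp):.1f} ug/L, p = {p_value:.3f}")
```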
Statistical analysis indicated that 164 (92%) of the total phosphorus comparisons were not significantly different. Results for the other parameters were even better and are presented in Table 1 below.
Figure 1 shows the annual average total phosphorus data plotted out. The horizontal and vertical axes represent the MU and LMVP values, respectively. Each symbol denotes one of the 178 comparisons, and the dashed line is the 1:1 line (if LMVP and MU values were exactly the same the symbol would fall on the line). As you can see, most of the symbols are located near the 1:1 line indicating close agreement between LMVP and MU data. Most of the comparisons that fall farther away from the 1:1 line were sites with higher phosphorus values (>50 ug/L). This is to be expected because lakes that have higher nutrient levels also have higher variability.
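For readers who want to build a similar plot from their own data, here is a minimal matplotlib sketch; the paired annual averages below are invented purely to show the layout of a 1:1 comparison plot.

```python
# Minimal sketch of a 1:1 comparison plot like Figure 1; the paired
# annual averages are invented for illustration.
import matplotlib.pyplot as plt

mu_avg   = [12, 18, 25, 34, 55, 80, 120]    # MU annual averages (ug/L)
lmvp_avg = [13, 17, 27, 33, 60, 74, 131]    # LMVP annual averages (ug/L)

fig, ax = plt.subplots()
ax.scatter(mu_avg, lmvp_avg)

# Dashed 1:1 line: a point on this line means the MU and LMVP averages match.
lims = [0, max(mu_avg + lmvp_avg) * 1.1]
ax.plot(lims, lims, linestyle="--")
ax.set_xlabel("MU total phosphorus (ug/L)")
ax.set_ylabel("LMVP total phosphorus (ug/L)")
plt.show()
```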
Should we be concerned about the comparisons that were not in agreement?
Given that Missouri lakes tend to be quite variable, differences in the timing of sample collection by the LMVP and MU account for some of the differences in average values. Also, on some of the large reservoirs (Lake of the Ozarks, Table Rock, etc.) the locations of LMVP and MU sites were not perfectly matched. We feel that given the gradients in water quality found in large reservoirs, even a few miles' difference between sample sites can lead to differences in average values.
Parameter | # of Comparisons | # in Agreement | % in Agreement |
---|---|---|---|
Phosphorus | 178 | 164 | 92 |
Nitrogen | 178 | 167 | 94 |
Chlorophyll | 178 | 171 | 96 |
Secchi | 178 | 166 | 93 |
Suspended Sediment | 117 | 116 | 99 |
Table 1. Percent agreement between annual values derived from LMVP and MU data collected at the same site during the same season (though on different days).
Comparison of Long-Term Averages
If a lake site used in the annual average comparison was sampled by both the LMVP and MU during four or more years, the site was included in this analysis. The comparison of long-term averages emphasizes year-to-year variation as opposed to the sample-to-sample variation that was the focus of the previous comparisons. Aggregating the data into a long-term average reduces the influence of extreme values and provides a better assessment of a lake site's true water quality.
A total of 23 different sites from 11 lakes were compared, with individual sites being monitored for between 4 and 10 years. None of the comparisons made in this analysis were significantly different for any of the five parameters. This means that once sample-to-sample variability is reduced through aggregation of the data, the LMVP and MU data were extremely comparable. The comparability of the long-term phosphorus averages can be seen in Figure 2 (note how close to the 1:1 line the symbols fall).
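Again, the article does not name the test used for the long-term comparisons. As a rough sketch of the idea, the snippet below pairs invented annual averages by year for one hypothetical site, computes the long-term means, and runs a paired t-test; both the values and the choice of test are assumptions.

```python
# Illustrative sketch: aggregate annual averages into long-term means for one
# hypothetical site, then compare the LMVP and MU records. The paired t-test
# and the values are assumptions, not the LMVP's documented procedure.
from statistics import mean
from scipy import stats

lmvp_yearly = [24.0, 27.5, 22.1, 25.3, 26.8]   # LMVP annual TP averages (ug/L)
mu_yearly   = [23.1, 28.0, 23.5, 24.8, 27.2]   # MU annual TP averages, same years

print(f"Long-term means: LMVP = {mean(lmvp_yearly):.1f}, MU = {mean(mu_yearly):.1f}")

# Pairing annual averages by year smooths out sample-to-sample extremes,
# so the comparison reflects the site's typical water quality.
t_stat, p_value = stats.ttest_rel(lmvp_yearly, mu_yearly)
print(f"paired p = {p_value:.3f}")
```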
Split Samples
This analysis evaluates how the processing and storage methods of the LMVP compare to those of MU. Split sampling involves LMVP staff accompanying volunteers as they sample. The volunteer collects composite samples as they normally would, and after they have filled their sample bottle, LMVP staff fill another bottle from the same composite container. In theory, both the volunteer and the staff person should have the same water. Volunteers process and store their sample as they normally would, while the LMVP staff bring their sample back to the University to process and store following lab procedures.
A total of 27 sites from 18 lakes in the program were involved in split sampling during 1998 and 2005. The statistical analysis involved comparing the 27 Volunteer values to the 27 Staff values for a given parameter. Results for phosphorus, nitrogen and inorganic suspended solids indicated no significant difference between the Volunteer and Staff data. There was a difference in the chlorophyll data. When we look at the graph (Figure 3), we find that Staff values were higher than the Volunteer values for the 8 highest chlorophyll readings (based on the Staff value). Most of the values were very similar, but because all the points on this end of the plot fall on the same side of the 1:1 line, the statistical test indicated a difference.
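To see how a set of otherwise close pairs can still register as "different," consider the sketch below: each invented Volunteer chlorophyll value is paired with its Staff counterpart, and every difference at the high end has the same sign. The paired t-test is an assumption for illustration; the article does not state which test was used.

```python
# Illustrative sketch of a split-sample check. Values are invented; the
# paired t-test is assumed only to show how a one-sided pattern in the
# differences can produce a "significant" result.
from scipy import stats

volunteer = [3.1, 4.8, 6.2, 7.0, 9.5, 12.0, 18.5, 24.0, 31.0, 40.0]  # chlorophyll
staff     = [3.0, 4.9, 6.1, 7.2, 9.4, 12.6, 19.8, 25.9, 33.4, 43.1]  # chlorophyll

t_stat, p_value = stats.ttest_rel(volunteer, staff)
print(f"paired p = {p_value:.3f}")

# Even though each pair agrees closely, the differences for the largest
# readings are all positive (Staff > Volunteer), which is enough to tip
# a paired test toward "significant".
diffs = [round(s - v, 1) for v, s in zip(volunteer, staff)]
print("Staff minus Volunteer:", diffs)
```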
This is not the first time that we have done split sampling. When we look at results from 1995, we find the opposite pattern, with all of the higher chlorophyll values being on the other side of the 1:1 line. When we combine the data, we find no difference between the Volunteer and Staff data sets (Figure 4). We believe the results from the 1998 and 2005 split samples are simply an anomaly (as were the 1995 results) and do not represent a true difference between the Volunteer and Staff data.
Filter Replication
In 1996 we introduced a method for gauging how well volunteers process their samples by looking at the replication of filter pairs (LMVP volunteers process two chlorophyll filters per sample). For this comparison we looked not only at how well LMVP filters replicate, but also at how well MU filters replicate (all MU projects prepare two chlorophyll filters from each sample).
We used percent differences of 5%, 10% and 15% as the cutoffs separating excellent, good, fair and poor ratings. A total of 4,035 filter pairs were compared for MU and 3,947 filter pairs for the LMVP (Figure 5). MU had 93% of filter pairs rated as either excellent or good, while the LMVP had 89% rated in these two categories. Volunteers had 6% of their filter pairs rated as poor replication, a value that is a little higher than we would like to see. In most cases, poor replication occurs because the volunteer fails to record the correct volume of water on the filtration sheet. Error can also occur when samples are analyzed at the lab.
Laboratory error during processing is probably responsible for the bulk of the 3% of MU filter pairs rated as “poor.” It is safe to assume that a similar proportion of LMVP filter pairs are also rated poor due to lab error.
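To make the rating scheme concrete, here is one way it could be computed: take the percent difference between the two filters in a pair and bin it against the 5%, 10% and 15% cutoffs. The exact formula used by the lab is not given in the article, so the percent difference relative to the pair mean and the chlorophyll values below are assumptions.

```python
# Sketch of how a filter pair might be rated from its percent difference,
# using the 5/10/15% cutoffs described above. Percent difference relative to
# the pair mean is an assumption; the article does not give the lab's formula.
def rate_filter_pair(chl_a: float, chl_b: float) -> str:
    pair_mean = (chl_a + chl_b) / 2
    pct_diff = abs(chl_a - chl_b) / pair_mean * 100
    if pct_diff <= 5:
        return "excellent"
    elif pct_diff <= 10:
        return "good"
    elif pct_diff <= 15:
        return "fair"
    return "poor"

print(rate_filter_pair(10.2, 10.5))   # ~2.9% difference -> "excellent"
print(rate_filter_pair(10.2, 13.0))   # ~24% difference  -> "poor"
```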