Two parameters were adjusted to develop the cforest models: mtry and ntree. The former represents the number of randomly selected variables considered as split candidates at each node; the latter is the number of trees in the forest. Model stability was tested using the following procedure: 1) develop the model using the default mtry and ntree values, 2) tabulate and review the variable importance rankings, 3) change the starting seed and rerun the model using the default mtry and ntree values, 4) accept the model if the variable importance rankings are consistent between the first and second runs; if not, proceed to step five, 5) increase ntree by 500 and repeat steps one through four with the larger ntree. The procedure was repeated until the variable importance rankings were consistent between the first and second runs. The party package in R was used to complete the classifications and to obtain the variable importance scores. For each date, two datasets were evaluated as input to the cforest algorithm: 1) a dataset of twelve vegetation indices and 2) a sixteen-band multispectral dataset.
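The stability procedure can be sketched as a short loop. This is an illustrative Python outline only, not the R code used in the study: `fit_importances` is a hypothetical callable standing in for fitting a cforest model with a given seed and ntree and returning its variable importance scores, and the starting value of 500 trees, the two seed values, and the `max_ntree` cap are assumptions added for the sketch.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: fit a forest with (seed, ntree) and return
# a mapping of variable name -> variable importance score.
ImportanceFn = Callable[[int, int], Dict[str, float]]


def ranking(importances: Dict[str, float]) -> List[str]:
    """Order variables from most to least important (ties broken by name)."""
    return sorted(importances, key=lambda name: (-importances[name], name))


def stabilize_ntree(
    fit_importances: ImportanceFn,
    ntree_start: int = 500,      # assumed default number of trees
    ntree_step: int = 500,       # increment stated in the procedure
    seeds: Sequence[int] = (1, 2),  # assumed seed values for the two runs
    max_ntree: int = 10000,      # assumed safety cap, not part of the procedure
) -> int:
    """Return the smallest tested ntree whose rankings agree across two seeds."""
    ntree = ntree_start
    while ntree <= max_ntree:
        first = ranking(fit_importances(seeds[0], ntree))   # steps 1 and 2
        second = ranking(fit_importances(seeds[1], ntree))  # step 3
        if first == second:                                 # step 4
            return ntree
        ntree += ntree_step                                 # step 5
    raise RuntimeError("rankings did not stabilize within max_ntree trees")
```

Comparing full rank orderings is the strictest reading of "consistent"; a looser criterion, such as agreement among only the top-ranked variables, would need a different comparison in step four.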
Overall, user’s, and producer’s accuracies and the kappa coefficient were tabulated to compare the classifications. Identical accuracy results were obtained for the vegetation indices and the multispectral data classifications of the June 30, 2014 dataset. Overall accuracy and the kappa coefficient were 90.8% and 0.878, respectively. User’s and producer’s accuracies ranged from 76.7% to 100%. The highest user’s and producer’s accuracies were obtained for the soybean class. The lowest user’s and producer’s accuracies were observed for the redroot pigweed and Palmer amaranth classes, respectively. For the September 17, 2014 dataset, the vegetation indices achieved higher classification accuracies than the multispectral dataset, with differences ranging from 0.7% to 3.4% for user’s, producer’s, and overall accuracies. Similar trends were observed in the classification accuracies of the vegetation indices and multispectral data. The best classification accuracies were achieved for the soybean class; velvetleaf tied for the highest producer’s accuracy. The lowest user’s and producer’s accuracies were observed for the redroot pigweed and Palmer amaranth classes, respectively. Variable importance rankings indicated that eight and nine of the twelve indices were important to the classification of the June and September vegetation indices datasets, respectively.
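For readers unfamiliar with these measures, the sketch below shows how they are conventionally derived from a confusion matrix. It is a minimal Python illustration, not the study's code; the function name, the rows-as-reference and columns-as-predicted layout, and the example counts are assumptions chosen for the sketch and are not data from this work.

```python
from typing import Dict, List, Sequence


def accuracy_metrics(matrix: Sequence[Sequence[int]], labels: List[str]) -> Dict:
    """Compute standard accuracy measures from a square confusion matrix.

    Assumed layout: matrix[i][j] counts samples whose reference class is
    labels[i] and whose predicted class is labels[j]. Every class is assumed
    to have at least one reference sample and one predicted sample.
    """
    k = len(labels)
    n = sum(sum(row) for row in matrix)
    diag = [matrix[i][i] for i in range(k)]
    ref_totals = [sum(matrix[i]) for i in range(k)]                          # row sums
    pred_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]   # column sums

    overall = sum(diag) / n
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(ref_totals, pred_totals)) / (n * n)
    kappa = (overall - expected) / (1 - expected)

    return {
        "overall": overall,
        "kappa": kappa,
        # Producer's accuracy: correct / reference total (complement of omission error).
        "producers": {labels[i]: diag[i] / ref_totals[i] for i in range(k)},
        # User's accuracy: correct / predicted total (complement of commission error).
        "users": {labels[i]: diag[i] / pred_totals[i] for i in range(k)},
    }


# Hypothetical two-class counts, for illustration only.
example = accuracy_metrics([[9, 1], [2, 8]], ["soybean", "weed"])
```

If the matrix is stored with predicted classes in rows instead, the producer's and user's accuracies swap roles, so the layout should be checked before comparing results.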
Its variable importance score was approximately 1.5 times greater than the second-ranked vegetation index score. The multispectral datasets’ variable importance rankings are summarized in Figure 1. Shortwave infrared bands had strong to moderate variable importance rankings for the June dataset. NIR2, G, RE, and NIR1 bands had moderate to low variable importance scores. All of the spectral bands were important to the September 17, 2014 multispectral dataset classification model because their variable importance scores were distinguishable from the zero line. The highest and lowest rankings were assigned to the G and C bands, respectively. Finally, more trees were needed to obtain stable variable importance rankings for the multispectral datasets than for the vegetation indices datasets. This could be attributed to the multispectral datasets having more variables than the vegetation indices datasets, so more trees were required for the rankings to stabilize. Using vegetation indices as input variables for soybean and weed discrimination, cforest achieved classification accuracies that were equivalent to or slightly better than those obtained with multispectral bands as input variables.
Kappa coefficients for the vegetation indices classifications indicated almost perfect agreement between reference and predicted data. For the multispectral datasets, agreement between reference and predicted data was almost perfect for June 30 and substantial for September 17, 2014. Errors for the soybean and velvetleaf classes arose from each being misclassified as the other. The cforest algorithm, with either vegetation indices or multispectral bands as input, had difficulty distinguishing between Palmer amaranth and redroot pigweed, which suggests combining them into one class. Indices derived from the SWIR and NIR bands, the G and NIR bands, and the G and R bands were consistently considered important in separating the plant species. SIWSI1 and SIWSI2 ranked in the top three vegetation indices for both datasets.
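The terms "almost perfect" and "substantial" match the wording of the widely used Landis and Koch (1977) kappa scale. Assuming that scale is the one intended here, the mapping from a kappa value to an agreement label can be sketched as follows; the function name is hypothetical, and the thresholds are those of the Landis and Koch scale rather than values stated in this text.

```python
def agreement_label(kappa: float) -> str:
    """Map a kappa coefficient to its Landis and Koch (1977) agreement label."""
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    # (upper bound of band, label), checked from lowest band to highest.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"  # not reached for kappa <= 1; kept for completeness
```

Under this scale, the kappa of 0.878 reported for the June 30, 2014 classifications falls in the "almost perfect" band.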