Covariate imputation was based on the distribution of non-missing covariates only

The estimated outdoor PAH concentration represented 7 of the 9 individual PAHs measured in the residential dust. Since both outdoor PAH estimates and traffic density were approximately log-normally distributed, their logged values were used for statistical analyses. The urban indicator variable was coded as 1 for residences in census blocks classified as “urban” and as 0 for those classified as “rural” or “other” by the 2000 U.S. Census.

A multiple-imputation procedure was used to borrow information from available measurements to impute values for missing data. In simulation studies, multiple imputation has been shown to produce unbiased effect estimates and appropriate confidence intervals. The data had three types of missing data: missing residential-dust PAH values, residential-dust PAH values below the limit of detection, and missing covariate data. Overall, 70 residential-dust PAH measurements were missing for 56 subjects. These PAH measurements were missing as a result of interference from co-eluting compounds during GC-MS analysis, which made detection of some individual PAHs impossible. In addition, there were 63 residential-dust PAH measurements below the limit of detection in 44 participant households. Finally, 246 of the subjects had at least one missing covariate, because respondents were either unable or unwilling to complete all of the survey questions.

Because the 9 individual PAHs were correlated in the data, the multiple imputation strategy was particularly useful. Specifically, the joint multivariate normal distribution for the 9 correlated PAHs was estimated using Proc MI. Then, for each missing value, a probability distribution was created conditional upon the values for the non-missing PAHs. Next, five possible imputations for the missing value were randomly drawn from the conditional probability distribution, bounded so that each of the randomly drawn values was greater than the limit of detection. The random sampling addressed uncertainty due to missing values and resulted in more valid statistical inferences than single imputation.
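The conditional draw described above can be illustrated with a minimal Python sketch. This is not the Proc MI procedure used in the analysis: it assumes a known mean vector and covariance matrix, handles a single missing PAH, and enforces the lower bound by simple rejection sampling. The function names and all numeric values are hypothetical.

```python
import numpy as np


def conditional_normal(mu, sigma, obs_idx, mis_idx, x_obs):
    """Mean and covariance of the missing components given the observed ones."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    s_oo = sigma[np.ix_(obs_idx, obs_idx)]
    s_mo = sigma[np.ix_(mis_idx, obs_idx)]
    s_mm = sigma[np.ix_(mis_idx, mis_idx)]
    # weights = s_mo @ inv(s_oo), computed with a linear solve for stability
    weights = np.linalg.solve(s_oo, s_mo.T).T
    cond_mean = mu[mis_idx] + weights @ (np.asarray(x_obs, dtype=float) - mu[obs_idx])
    cond_cov = s_mm - weights @ s_mo.T
    return cond_mean, cond_cov


def draw_above_bound(mean, var, lower, n_draws, rng, max_tries=100_000):
    """Rejection-sample n_draws normal values that exceed `lower`."""
    accepted = []
    sd = float(np.sqrt(var))
    for _ in range(max_tries):
        value = rng.normal(mean, sd)
        if value > lower:
            accepted.append(value)
            if len(accepted) == n_draws:
                return np.array(accepted)
    raise RuntimeError("bound too far in the tail for rejection sampling")


# Hypothetical log-scale values for three correlated PAHs; the third is missing.
mu = [5.0, 4.5, 4.0]
sigma = [[1.0, 0.6, 0.5],
         [0.6, 1.0, 0.7],
         [0.5, 0.7, 1.0]]
cond_mean, cond_cov = conditional_normal(mu, sigma, [0, 1], [2], [5.8, 5.1])

rng = np.random.default_rng(2024)
# Five imputations, each constrained to lie above a hypothetical logged detection limit.
imputations = draw_above_bound(cond_mean[0], cond_cov[0, 0], lower=2.0, n_draws=5, rng=rng)
```

Because the conditional mean depends on the subject's observed PAHs, the imputed values follow that subject's own PAH profile rather than the population average.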

Additionally, the relative magnitude of missing PAH estimates reflected the profile of the corresponding non-missing PAHs for the same subjects. A similar procedure was used to estimate five possible values for each PAH measurement below the limit of detection and each missing covariate of interest. Again, logical bounds were set on the randomly selected values so that the estimates were reasonable. Ultimately, five complete data sets were created with five imputed values for each of the three types of missing data. Regression analyses were performed separately on each data set and the results were combined to produce inferential results.

The goal of the regression analysis was to build a model that would be useful in predicting concentrations of PAH in residential dust given the questionnaire- and GIS-based variables. As such, the deletion-substitution-addition (DSA) algorithm, a tool for model selection written in R, was used to choose an optimal model from the list of candidate variables. All households and all imputed values were included in the DSA procedure. For each model considered, the DSA algorithm performed a 10-fold cross-validation procedure with 10 repeated rounds. Each round of cross-validation involved randomly partitioning the data into 10 complementary subsets, fitting a regression model based on 9/10 of the data, and validating the model by comparing predicted and measured values in the remaining data. This process was repeated 10 times in each round so that each partition was used as the validation set once. Finally, to reduce variability, 10 rounds of cross-validation were performed using different partitions, and the regression coefficients were averaged over the rounds. The ‘best’ model was the one that minimized the mean error between the predicted and observed values across the 100 validation sets. The parameters in this ‘best’ model should be the most useful in predicting residential-dust PAH concentrations in other households from the NCCLS population.
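The cross-validation scheme described above (10 folds, 10 rounds, 100 validation sets) can be sketched as follows. This is an illustrative Python version, not the DSA package itself; it assumes ordinary least squares and mean squared error as the error measure, since the text does not specify the exact error metric. The synthetic data are hypothetical.

```python
import numpy as np


def repeated_kfold_errors(X, y, n_folds=10, n_rounds=10, seed=0):
    """Return one validation error per fold per round (n_folds * n_rounds values)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # intercept plus predictors
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_rounds):
        order = rng.permutation(n)  # a fresh random partition each round
        for fold in np.array_split(order, n_folds):
            train = np.setdiff1d(order, fold)  # the other 9/10 of the data
            beta, *_ = np.linalg.lstsq(design[train], y[train], rcond=None)
            residuals = y[fold] - design[fold] @ beta
            errors.append(np.mean(residuals ** 2))  # assumed metric: MSE
    return np.array(errors)


# Hypothetical data: one informative predictor of a logged concentration.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 1))
y = 6.0 + 1.0 * x[:, 0] + rng.normal(scale=0.7, size=200)

model_errors = repeated_kfold_errors(x, y)
# An empty predictor matrix leaves only the intercept column in the design.
baseline_errors = repeated_kfold_errors(np.empty((200, 0)), y)
```

Comparing the average of `model_errors` with the average of `baseline_errors` gives the kind of model-versus-intercept-only comparison used to judge predictive value.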

The search for the ‘best’ model began with the intercept-only model and proceeded iteratively by comparing the best model at each step with: 1) a deletion step, which removed a term from the model; 2) a substitution step, which replaced one term with another; and 3) an addition step, which added a term to the model. Initially, the DSA algorithm was restricted so that it produced a model with only linear effects and no interaction terms. However, after narrowing the model selection to the most informative variables, the DSA procedure was repeated, and second-order non-linear terms and two-way interactions that improved the model fit were added.

Statistical analyses in this chapter included 277 cases and 306 controls with PAH residential-dust measurements. As shown in Table 21, individual PAH detection rates ranged from 94-100% and individual PAH concentrations ranged from below detection to a maximum of 2,450 ng/g. The sum of the 9 residential-dust PAH concentrations for the 583 residences ranged from 54-11,170 ng/g, with a median value of 479 ng/g. Table 22 shows the Pearson correlation coefficients between individual log-transformed residential-dust PAH concentrations. In general, levels of the 9 PAHs were moderately to highly correlated. Table 23 shows the Pearson correlation coefficients between total log-transformed residential-dust PAH concentrations and covariates of interest for the multiple imputation analysis and for the participants with complete covariate and PAH data. In general, the correlation coefficients were similar regardless of how missing data were treated. In the bivariate analysis, residence age, traffic density, and outdoor PAH concentrations were the covariates most strongly correlated with total PAH concentrations in residential dust. Table 23 also shows the number of subjects with missing values for the variables of interest. Table 24 shows the sum of the 9 PAH concentrations by covariates of interest.

Based on the DSA algorithm that used all homes and included imputed values, six main effects were selected for the optimal model of logged total PAH concentrations in residential dust, and two non-linear terms were subsequently added. Table 25 shows the parameter estimates and 95% confidence intervals for the optimal logged residential-dust PAH concentration model given the uncertainty introduced by the multiple imputation analysis.
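To show how confidence intervals can reflect imputation uncertainty, the sketch below pools one regression coefficient across five imputed data sets using Rubin's rules. The text does not name the combining rule that was used, so Rubin's rules are an assumption here (they are the standard choice for multiple imputation). The function name and input values are hypothetical.

```python
import math

import numpy as np


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    Returns the pooled estimate, its standard error, and the degrees of
    freedom of the reference t distribution.
    """
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    pooled = estimates.mean()
    within = variances.mean()          # average sampling variance
    between = estimates.var(ddof=1)    # variance of estimates across imputations
    total = within + (1.0 + 1.0 / m) * between
    if between == 0.0:
        df = math.inf                  # no imputation uncertainty at all
    else:
        df = (m - 1) * (1.0 + within / ((1.0 + 1.0 / m) * between)) ** 2
    return pooled, math.sqrt(total), df


# Hypothetical coefficient estimates and squared standard errors from five fits.
estimate, std_error, df = pool_rubin([1.0, 1.2, 0.8, 1.1, 0.9], [0.04] * 5)
```

A 95% interval would then be the pooled estimate plus or minus the appropriate t quantile (with `df` degrees of freedom) times `std_error`. The between-imputation term is what widens the interval relative to a single-imputation analysis.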

Restricting the analysis to HVS3-sampled homes yielded a model with similar parameter estimates, but with slightly larger confidence intervals. The variable for size of sampling area was marginally significant in the model with only HVS3-sampled homes. Similarly, restricting the analysis to subjects with complete data yielded a model with parameter estimates similar to those in Model 1, but with slightly larger confidence intervals. The overall fit of Model 1 was R² = 0.15. During cross-validation of Model 1, the average difference between the predicted and measured total PAH concentrations in residential dust was 0.67. For comparison, the average difference between any measured total PAH concentration in residential dust and the average total PAH concentration in residential dust was 0.72. Figure 10 compares the measured and predicted total PAH concentrations in residential dust. Table 26 shows predicted total PAH concentrations in residential dust for various combinations of the six variables using parameter estimates from Model 1. Table 26 demonstrates the added effect of each term in the model on total residential-dust PAH concentration. For example, while holding all other variables constant, the added effect of indoor gas heating increased the predicted total PAH concentration in residential dust from 510 to 600 ng/g.

Two suspected sources of indoor PAHs, i.e., indoor gas heating and estimated outdoor PAH levels, were significant predictors of total residential-dust PAH concentrations in the models. Interestingly, the age of the residence had the most significant effect on total residential-dust PAH concentrations, with older residences having higher PAH concentrations. The age of residence had a similar effect in the previous analysis of nicotine concentrations in residential dust. Previous researchers have shown that only about 5% of the total dust loading present in a 10-year-old carpet is available as surface dust, whereas the larger portion resides deep within the carpet and is not removed by typical cleaning. Taken together, these findings suggest that environmental contaminants can accumulate in household carpets over years or decades.

The child’s age at enrollment was also a significant predictor of PAH concentrations in residential dust. Older children appeared to have higher concentrations of PAHs in their residential dust. In bivariate analyses, a child’s age at enrollment was positively correlated with the amount of time his or her family had lived in the current residence and with the age of the carpet sampled. While duration at residence and carpet age were not significant predictors of PAH levels, child’s age may be a more reliably reported surrogate for the age of the dust collected. If so, the positive regression coefficient for the child’s age variable is further evidence that PAHs accumulate in residential dust over time.

Residence in an apartment/condominium, duplex/townhouse, or mobile home, compared to a single-family home, was also a significant predictor of the PAH concentrations in residential dust, with higher concentrations seen for multiple-family dwellings. In Model 1, if the residence was not a single-family home, the predicted total PAH concentration increased. Because apartments, mobile homes, and townhouses are typically smaller than single-family homes, this result is consistent with a previous finding that concentrations of environmental contaminants in residential dust increased with decreasing square footage of the residence.

Presumably, given a constant number of PAH sources, a smaller residence would have a greater PAH concentration. The mother’s ethnicity was also a significant predictor of PAH concentrations in residential dust. Hispanic mothers appeared to have lower PAH concentrations in their residential dust than non-Hispanic mothers. Notably, Hispanic mothers were also more likely to report that their carpets were vacuumed more than once a week and were less likely to live in an urban census tract. Although these other factors were not selected as variables in the optimal residential-dust PAH model, in bivariate analyses, vacuum frequency was negatively correlated with PAH concentrations and urban location was positively correlated with PAH concentrations.

While the DSA algorithm identified several significant determinants of total PAH concentrations in residential dust, even the optimal model explained only a small portion of the total variability of the data. Moreover, during cross-validation, the optimal model was only marginally better at predicting PAH concentrations in residential dust than the intercept-only model. Ultimately, it seems that even the most relevant self-reported and GIS-based data provided only limited information about residential PAH levels; this underscores the importance of making environmental or biological measurements.

As discussed, dust samples were collected using both the HVS3 and household vacuum cleaners. Restricting the regression analysis to only those homes with dust collected by the HVS3 did little to change the estimates of the parameters used in Model 1. This reinforces previous findings from the NCCLS and suggests that collecting residential dust from household vacuum cleaners is a useful alternative to the more expensive and labor-intensive HVS3 sampling method.

An implicit assumption of the multiple imputation procedure is that the distribution of the missing data depends only on the observed data. This assumption is plausible given the large size and correlation of the set of predictors used for imputation. Moreover, restricting the regression to participants with complete data had little impact on the estimates of the parameters used in Model 1. Indeed, whereas the parameter estimates were similar, the standard errors and confidence intervals were smaller for Model 1 than for Model 3. Thus, it appears that the multiple imputation of missing data was useful. The one variable that was substantially different in Model 3 was the variable identifying the residence as an apartment. However, because this variable had only one missing observation, the discrepancy probably points to data censoring in Model 3 rather than to failure of the imputation process.

The PAH concentrations measured in residential dust in this chapter were generally lower than those previously reported for residences in Durham, NC; in the Rio Grande Valley, TX; in Cape Cod, MA; in Long Island, NY; and in Ottawa, Canada. However, a recent study of dust from residences in Kuwait found PAH concentrations similar to those reported in this chapter.