These farmers generally found that this interview question missed the mark with regard to soil fertility.

These in-depth interviews allowed us to ask the same questions of each farmer so that comparisons between interviews could be made. In-person interviews were conducted in the winter, between December 2019 and February 2020; three interviews were conducted in December 2020. All interviews were recorded with permission from the farmer and lasted about 2 hours. To develop interview questions for the semi-structured interviews, we first established initial topics and thematic sections. We then consulted with two organic farmers to develop the final interview questions. The final format of the semi-structured interviews was designed to encourage deep knowledge sharing. For example, the interview questions were structured so that later questions revisited earlier topics, allowing interviewees to expand on and deepen their answers with each subsequent version of a question. Certain questions attempted to understand farmer perspectives from multiple angles and avoided scientific jargon or frameworks whenever possible. Most questions promoted open-ended responses to elicit the full range of possible responses from farmers. We used an open-ended, qualitative approach that relies on in-depth, in-person interviews to study farmer knowledge. In the semi-structured interview, farmers were asked a range of questions that covered their personal background with farming and the history of their farm operation; their general farm management approaches; their soil management approaches specific to soil health and soil fertility, such as the key nutrients they consider in relation to soil fertility; and their thoughts on soil tests. A brief in-person survey that asked several key demographic questions was administered at the end of the semi-structured interviews. Interviews were transcribed, reviewed for accuracy, and uploaded to NVivo 12, a software tool used to categorize and organize themes systematically based on research questions.

Through structured analysis of the interview transcripts, key themes were identified and a codebook was then constructed to systematically categorize data related to soil health and soil fertility. We summarize these results in table form.

To unpack differences between Fields A and Fields B across all farms, we applied a multi-step approach. We first conducted a preliminary, global comparison between Fields A and Fields B across all farms using a one-way analysis of variance to determine whether Fields A were significantly different from Fields B for each indicator for soil fertility. Then, to develop a basis for further comparison of Fields A and Fields B, we considered potential links between management and soil fertility. To do so, we developed a gradient among the farms using a range of soil management practices detailed during the initial farm visit. These soil management practices were based on interview data from the initial farm visit, and were also emphasized by farmers as key practices linked to soil fertility. The practices used to inform the gradient included cover crop application, amount of tillage, crop rotation patterns, crop diversity, the use of integrated crop and livestock systems (ICLS), and the amount of N-based fertilizer application. Cover crop frequency was determined using the average number of cover crop plantings per year, calculated from cover crop planting counts over the course of two growing years for each field site. Tillage encompassed the number of tillage passes a farmer performed per field site per season. To quantify crop rotation, a rotational complexity index was calculated for each site using the formula outlined by Socolar et al. To calculate crop diversity, we focused on crop abundance: the total number of crops grown per year at the whole-farm level divided by the total acreage farmed.
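As a minimal sketch of the two simplest management metrics defined above, the Python snippet below computes cover crop frequency and crop abundance. The function names and the example numbers are hypothetical and are not taken from the study; the rotational complexity index is omitted because its formula is not given in the text.

```python
def cover_crop_frequency(plantings_over_two_years: int) -> float:
    """Average number of cover crop plantings per year over a two-year window."""
    return plantings_over_two_years / 2.0


def crop_abundance(crops_per_year: int, acres_farmed: float) -> float:
    """Number of crops grown per year at the whole-farm level, per acre farmed."""
    if acres_farmed <= 0:
        raise ValueError("acres_farmed must be positive")
    return crops_per_year / acres_farmed


# Hypothetical farm: 3 cover crop plantings in two years, 30 crops on 15 acres.
frequency = cover_crop_frequency(3)
abundance = crop_abundance(30, 15.0)
```

Expressing crop abundance per acre keeps small, highly diversified farms comparable with larger ones.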

To determine ICLS, an index was created based on the number and type of animals utilized. Lastly, we calculated the amount of additional N-based fertilizer applied to each field. In order to group, visualize, and further explore links with indicators for soil fertility, all soil management variables were standardized and then used in a principal components analysis (PCA) using the factoextra package in R. In short, these independent management variables were combined into a composite of management. Principal components with eigenvalues greater than 1.0 were retained. To establish the gradient in management, we plotted all 13 farms using the first two principal components, and ordered the farms based on the spatial relationships that arose from this visualization using nearest neighbor analysis. To further explore links between management and soil fertility, we used the results from the PCA to formalize a gradient in management across all farms, and then used this gradient as the basis for comparison between Field A and Field B across all indicators for soil fertility. Using the ggplot and tidyverse packages, we displayed the difference in values between Field A and Field B for each indicator for soil fertility sampled at each farm using bar plots. We also included error bars to show the range of uncertainty in these indicators for soil fertility. Lastly, we further compared Field A and Field B for each farm using radar plots. To generate the radar plots, we first scaled each soil indicator from 0 to 1. Using Jenks natural breaks optimization, we then grouped each farm based on low, medium, and high N-based fertilizer application, as this soil management metric had the strongest coefficient loading on the first principal component.
Using the fmsb package in R, we used an averaging approach for each level of N-based fertilizer application to create three radar plots that each compared Field A and Field B across the eight indicators for soil fertility.

Farmer responses describing key aspects of soil health were relatively similar and overlapped considerably in content and language. Specifically, farmers usually emphasized the importance of maintaining soil life and/or soil biology, promoting diversity, limiting soil compaction and minimizing disturbance to soil, and maintaining good soil structure and moisture.
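The two rescaling steps mentioned in this section (standardizing management variables before the PCA, and scaling each soil indicator from 0 to 1 for the radar plots) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' R workflow: it assumes z-scores based on the sample standard deviation and simple min-max scaling, and the function names are hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-score each value: subtract the mean, divide by the sample standard deviation."""
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - centre) / spread for v in values]


def scale_0_1(values):
    """Min-max scale values so the smallest maps to 0 and the largest to 1."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("cannot scale a constant variable")
    return [(v - low) / (high - low) for v in values]


# Hypothetical tillage passes per season for three field sites.
tillage_z = standardize([1.0, 2.0, 3.0])
# Hypothetical values of one soil indicator for three field sites.
indicator_scaled = scale_0_1([2.0, 4.0, 6.0])
```

Standardizing puts management variables measured in different units on a common footing before the PCA, while 0 to 1 scaling lets indicators with different ranges share the axes of a radar plot.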

Several farmers also touched on the importance of using crops as indicators for monitoring soil health and the importance of limiting pests and disease. Discussion of the importance of promoting soil life, soil biology, and microbial and fungal activity had the highest count among farmers, with ten mentions across the 13 farmers interviewed. Minimizing tillage and soil disturbance was the second most discussed topic, with six of 13 farmers highlighting this key aspect of soil health. The importance of crop health as an indicator for soil health also surfaced for five of 13 farmers. In addition to discussing soil health more broadly, farmers also provided in-depth responses to a series of questions related to soil fertility (such as key nutrients of interest on their farm, details about their fertility program, and the usefulness of soil tests in their farm operation), summarized in Table 2. When asked to elaborate on the extent to which they considered key nutrients, a handful of farmers readily listed several nutrients, including nitrogen, phosphorus, potassium, and other general macronutrients, as well as one micronutrient. Among the farmers who responded with a list of key nutrients, some talked about having their nutrients “lined up” as part of their fertility program. This approach involved keeping nutrients “in balance,” for example by monitoring pH to ensure magnesium levels did not impact calcium availability to plants. These farmers also emphasized that though nitrogen represented a key nutrient and was important to consider in their farm operation, tracking soil nitrogen levels was less important than other aspects of soil management, such as promoting soil biological processes, maintaining adequate soil moisture and aeration, or planting cover crops regularly.
As one farmer put it, “if you add nutrients to the soil, and the biology is not right, the plants will not be able to absorb it.” Or, as another farmer emphasized, “It’s not about adding more [nitrogen]… I try to cover crop more too.” A third farmer emphasized that “I don’t use any fertilizers because I honestly don’t believe in adding retroactively to fix a plant from the top down.” This same farmer relied on planting a cover crop once per year in each field, and discing that cover crop into the ground to ensure his crops were provided with adequate nitrogen for the following two seasons. While most farmers readily listed key nutrients, several farmers shifted the conversation away from nutrients. One farmer responded, “I’m not really a nutrient guy.” This same farmer added that he considered “[soil fertility] a soil biology issue as much as a chemistry issue.” The general sentiment among these farmers was that soil fertility was not about measuring and “lining up” nutrients, but about taking a more holistic approach.

This approach focused on facilitating conditions in the soil and on-farm that promoted a soil-plant-microbe environment ideal for crop health and vigor. For example, the same farmer quoted above mentioned the importance of establishing and maintaining crop root systems, emphasizing that “if the root systems of a crop are not well established, that’s not something I can overcome just by dumping more nitrogen on the plants.” Another farmer similarly emphasized that they simply created the conditions for plants to “thrive,” and “have pretty much just stepped back and let our system do what it does; specifically, we feed our chickens whey-soaked wheat berries and then we rotate our chickens on the field prior to planting. And we cover crop.” A third farmer also maintained that their base fertility program (a combination of planting a cover crop two seasons per year, an ICLS chicken rotation program, minimal liquid N-based fertilizer addition, and occasional compost application) all worked together to “synergize with biology in the soil.” This synergy in the soil created by management practices, rather than a focus on nutrient levels, guided this farmer’s approach to building and assessing soil fertility on-farm. Another farmer called this approach “place-based” farming. This particular farmer elaborated on this concept, saying “I think the best style of farming is one where you come up with a routine [meaning like a fertility program] that uses resources you have: cover crops, waste materials beneficial to crops, animals” in order to build organic matter, which “seems to buffer some of the problems” that this farmer encountered on their farm. Similar to other farmers, this farmer asserted that, in their direct experience, adding more nitrogen-based fertilizer did not lead to better soil fertility or increased yields. Regardless of whether farmers listed key nutrients, a majority of farmers voiced that nitrogen was not a big concern for them on their farm.
This sentiment was shared among most farmers in part because they felt the nitrogen additions from the fertilizers they applied were insignificant compared to nitrogen additions by conventional farms. Farmers also emphasized that the amount of nitrogen they were adding was not enough to cause environmental harm; relatedly, a few farmers noted the absurdity and added economic burden of the recent nitrogen management plan requirements, specifically for organic farms with very low N-based fertilizer application. The majority of farmers also expressed that their use of cover crops and the small amount of N-based fertilizer additions in their soil fertility program ensured that on-farm nitrogen demands were met for their crops. Across all farmers interviewed, cover cropping served as the baseline and heart of each fertility program, and was considered more effective than additional N-based fertilizers at maintaining and building soil fertility. Farmers used a range of cover crop species and often applied a mix of cover crops, including vetches and other legumes like red clover and cowpea, and grains and cereals like oats. Farmers cited several reasons for the effectiveness of cover cropping, such as increased organic matter content, more established root systems, greater microbial activity, better aeration and crumble in their soils, greater numbers of earthworms and arthropods, improved drainage in their soils, and more bio-available N. Whereas farmers agreed that “more is not better” with regard to N-based fertilizers, farmers did agree that allocating more fields to cover crops over the course of the year was beneficial in terms of soil fertility.
However, as one farmer pointed out, while cover crops provide the best basis for an effective soil fertility program, this approach is not always economically viable or physically possible. Several farmers expressed concern because they often must allocate more fields to cover crops than to cash crops in any given season, which means that their farm operation requires more land to produce the same amount of vegetables as it would if all their fields were in cash crops.

A Euclidean-based dendrogram analysis was then used to further validate the results of the cluster analysis.

Using the farm typologies identified, we examined the extent to which soil texture and/or soil management practices influenced these measured soil indicators across all working organic farms, using Linear Discriminant Analysis and Variation Partitioning Analysis. We then determined the extent to which gross N cycling rates and other soil N indicators differed across these farm types. Lastly, we developed a linear mixed model to understand the key factors most useful for predicting potential gross N cycling rates along a continuous gradient, incorporating soil indicators, on-farm management practices, and soil texture data. Our study highlights the usefulness of soil indicators for understanding the plant-soil-microbe dynamics that underpin crop N availability on working organic farms. While we found measurable differences among farms based on soil organic matter, strongly influenced by soil texture and management, these differences did not translate to the N cycling indicators measured here. Though N cycling is strongly linked to soil organic matter, indicators for soil organic matter are not strong predictors of N cycling rates.

During the initial field visits in June 2019, two field sites were selected in collaboration with farmers on each participating farm; these sites represented fields in which farmers planned to grow summer vegetables. Therefore, only fields with all summer vegetable row crops were selected for sampling. At this time, farmers also discussed the management practices applied at each field site, including information about crop history and rotations, bed prepping if applicable, tillage, organic fertilizer input, and irrigation. Because of the uniformity of long-term management at the field station, only one treatment was selected in collaboration with the Cropping Systems Manager: a tomato field in the organic corn-tomato-cover crop system.

Since the farms involved in this study generally grew a wide range of vegetable crops, we designed soil sampling to have greater inference space than a single crop, even at the expense of adding variability. Sampling was therefore designed to capture indicators of nitrogen cycling rates and nitrogen pools in the bulk soil at a single time point. Fields were sampled mid-season near peak vegetative growth, when crop nitrogen demand is highest. Using the planting date and anticipated harvest date for each crop, peak vegetative growth was estimated and used to determine the timing of sampling. We collected bulk soil samples that we did not expect to be strongly influenced by the particular crop present. This sampling approach provided a snapshot of on-farm nitrogen cycling. Field sampling occurred over the course of four weeks in July 2019. To sample each site, a random 10 m by 20 m transect area was placed on the field site across three rows of the same crop, away from field edges. Within the transect area, three composite samples, each based on five subsamples, were collected approximately 30 cm from a plant at a depth of 20 cm using an auger. Subsamples were composited on site and mixed thoroughly by hand for 5 minutes before being placed on ice and immediately transported back to the laboratory. To determine bulk density (BD), we hammered a steel bulk density core sampler approximately 30 cm from a plant to a depth of 20 cm below the soil surface and recorded the dry weight of this volume to calculate BD; we sampled three replicates per site and averaged these values to calculate the final BD measurement for each site.

Soil samples were preserved on ice until processed within several hours of field extraction. Each sample was sieved to 4 mm and then either air dried, extracted with 0.5 M K2SO4, or used to measure net and gross N mineralization and nitrification. Air dried samples were measured for gravimetric water content and BD.
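The bulk density and gravimetric water content measurements referred to above reduce to two short calculations, sketched here in Python. The formulas are the standard definitions (oven-dry mass per core volume, and mass of water per oven-dry mass); the function names, core volume, and masses are hypothetical values chosen only for illustration.

```python
from statistics import mean


def bulk_density(dry_mass_g: float, core_volume_cm3: float) -> float:
    """Bulk density in g/cm^3: oven-dry soil mass divided by the core volume."""
    return dry_mass_g / core_volume_cm3


def gravimetric_water_content(fresh_mass_g: float, dry_mass_g: float) -> float:
    """Grams of water per gram of oven-dry soil."""
    return (fresh_mass_g - dry_mass_g) / dry_mass_g


# Three hypothetical replicate cores from one site, each with a 100 cm^3 volume.
replicate_dry_masses = [128.0, 130.0, 132.0]
site_bd = mean(bulk_density(m, 100.0) for m in replicate_dry_masses)

# A hypothetical 120 g fresh subsample that weighs 100 g after oven drying.
gwc = gravimetric_water_content(120.0, 100.0)
```

Averaging the three replicate cores mirrors the per-site averaging described in the text.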

Gravimetric water content was determined by drying fresh soil samples at 105°C for 48 hours. Moist soils were immediately extracted and analyzed colorimetrically for NH4+ and NO3- concentrations using modified methods from Miranda et al. and Forster. Additional volumes of extracted samples were subsequently frozen for future laboratory analyses. To determine soil textural class, air dried samples were sieved to 2 mm and subsequently prepared for analysis using the “micropipette” method. Water holding capacity (WHC) was determined using the funnel method, adapted from Geisseler et al., where a jumbo cotton ball thoroughly wetted with deionized water was placed inside the base of a funnel with 100 g of soil on top. Deionized water was added and allowed to imbibe into the soil until no water dripped from the funnel. The soil was allowed to drain overnight. A subsample of this soil was then weighed and dried for 48 hours at 105°C. The difference in weight between the drained subsample and the oven-dried subsample was defined as 100% WHC. Air dried samples were sieved to 2 mm, ground, and then analyzed for total soil N and total organic C using an elemental analyzer at the Ohio State Soil Fertility Lab; additional soil data, including pH and soil protein, were also measured at this lab. Soil protein was determined using the autoclaved citrate extractable soil protein method outlined by Hurisso et al. Additional air-dried samples were sieved to 2 mm, ground, and then analyzed for permanganate-oxidizable carbon (POXC) using the active carbon method described by Weil et al., but with modifications as described by Culman et al. In brief, 2.5 g of air-dried soil was placed in a 50 mL centrifuge tube with 20 mL of 0.02 mol/L KMnO4 solution, shaken on a reciprocal shaker for exactly 2 minutes, and then allowed to settle for 10 minutes. A 0.5 mL aliquot of supernatant was added to a second centrifuge tube containing 49.5 mL of water for a 1:100 dilution and analyzed at 550 nm.
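To illustrate how an absorbance reading from this procedure becomes a POXC value, the Python sketch below uses the calculation commonly given with the active carbon method, which assumes 9000 mg of C oxidized per mol of permanganate reduced. The standard-curve intercept and slope, the absorbance, and the function name are hypothetical; the soil mass, solution volume, and initial KMnO4 concentration come from the procedure described above.

```python
def poxc_mg_per_kg(
    absorbance: float,
    intercept: float,
    slope: float,
    soil_kg: float = 0.0025,            # 2.5 g of air-dried soil
    solution_l: float = 0.02,           # 20 mL of KMnO4 solution
    initial_mol_per_l: float = 0.02,    # starting KMnO4 concentration
    mg_c_per_mol: float = 9000.0,       # assumed C oxidized per mol MnO4 reduced
) -> float:
    """POXC (mg C per kg soil) from the loss of permanganate during the reaction."""
    # The standard curve converts absorbance to the KMnO4 concentration remaining.
    remaining_mol_per_l = intercept + slope * absorbance
    consumed_mol_per_l = initial_mol_per_l - remaining_mol_per_l
    return consumed_mol_per_l * mg_c_per_mol * (solution_l / soil_kg)


# Hypothetical standard curve (intercept 0, slope 0.1 mol/L per absorbance unit)
# and a hypothetical sample absorbance of 0.15.
example_poxc = poxc_mg_per_kg(absorbance=0.15, intercept=0.0, slope=0.1)
```

In practice the intercept and slope would come from a standard curve of known KMnO4 concentrations read on the same instrument.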

The amount of POXC was determined by the loss of permanganate due to C oxidation.

To measure gross N mineralization and nitrification in soil samples, we applied an isotope pool dilution (IPD) approach, adapted from Braun et al. This method is based on three underlying assumptions listed by Kirkham & Bartholomew: 1) microorganisms in soil do not discriminate between 15N and 14N; 2) rates of the processes measured remain constant over the incubation period; and 3) 15N assimilated during the incubation period is not remineralized. To prepare soil samples for IPD, we adjusted soils to approximately 40% WHC with deionized water prior to incubation. Next, four sets of 40 g of fresh soil per subsample were weighed into specimen cups and covered with parafilm. Based on the initial NH4+ and NO3- concentrations determined above, a maximum of 20% of the initial NH4+ and NO3- concentrations was added as either 15N-NH4+ or 15N-NO3- tracer solution at 10 atom%; the tracer solution also raised each subsample's soil water content to 60% WHC. This approach increased the production pool as little as possible while also ensuring sufficient enrichment of the NH4+ and NO3- pools with 15N-NH4+ and 15N-NO3-, respectively, to facilitate high measurement precision. Due to significant variability in the initial NH4+ and NO3- pool sizes of each soil sample, differing amounts of tracer solution were added to each sample set, evenly across the soil surface. To begin the incubation, each of the four subsamples received the tracer solution via evenly distributed circular drops from a micropipette. The specimen cups were placed in a dark incubation chamber at 20°C. After four hours, two subsample incubations were stopped by extraction with 0.5 M K2SO4, as above for initial NH4+ and NO3- concentrations. Filters were pre-rinsed with 0.5 M K2SO4 and deionized water and dried in a drying oven at 60°C to avoid variable NH4+ contamination from the filter paper.
Soil extracts were frozen at -20°C until further isotopic analysis. Similarly, after 24 hours, two subsample incubations were stopped by extraction as previously detailed, and subsequently frozen at -20°C. At a later date, filtered extracts were defrosted, homogenized, and analyzed for the isotopic composition of NH4+ and NO3- in order to calculate gross production and consumption rates for N mineralization and nitrification. We prepared extracts for isotope ratio mass spectrometry using a microdiffusion approach based on Lachouani et al. Briefly, to determine NH4+ pools, 10 mL aliquots of samples were diffused with 100 mg magnesium oxide into Teflon-coated acid traps for 48 hours on an orbital shaker. The traps were subsequently dried, spiked with 20 μg NH4+-N at natural abundance to achieve optimal detection, and subjected to EA-IRMS for 15N:14N analysis of NH4+. Similarly, to determine NO3- pools, 10 mL aliquots of samples were diffused with 100 mg magnesium oxide into Teflon-coated acid traps for 48 hours on an orbital shaker. After 48 hours, the acid traps were removed and discarded, and each sample was then diffused again with 50 mg Devarda’s alloy into a Teflon-coated acid trap for 48 hours on an orbital shaker. These traps were dried and subjected to EA-IRMS for 15N:14N analysis of NO3-.
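For readers unfamiliar with isotope pool dilution, the Python sketch below shows the classic Kirkham and Bartholomew style calculation of gross production and consumption from pool sizes and 15N enrichment at two time points. It is a sketch under the three assumptions listed above, not necessarily the exact calculation used in this study; the function name and the example numbers are hypothetical.

```python
import math


def gross_rates(pool_t0, pool_t1, ape_t0, ape_t1, days):
    """Return (gross production, gross consumption) in pool units per day.

    pool_t0, pool_t1: N pool size (e.g. ug N per g soil) at the two time points.
    ape_t0, ape_t1:   15N atom percent excess of that pool at the two time points.
    days:             time elapsed between the two extractions.
    """
    dilution = math.log(ape_t0 / ape_t1)  # how strongly the 15N label was diluted
    if math.isclose(pool_t0, pool_t1):
        # Limiting case of an unchanged pool size.
        production = pool_t0 * dilution / days
    else:
        production = ((pool_t0 - pool_t1) / days) * dilution / math.log(pool_t0 / pool_t1)
    net_change = (pool_t1 - pool_t0) / days
    consumption = production - net_change
    return production, consumption


# Hypothetical NH4+ pool falling from 10 to 8 ug N/g while enrichment drops
# from 2.0 to 1.5 atom% excess over one day.
mineralization, nh4_consumption = gross_rates(10.0, 8.0, 2.0, 1.5, 1.0)
```

The label is diluted only by unlabeled N entering the pool, so the drop in enrichment drives the production estimate, and consumption follows from the difference between production and the net change in pool size.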

Twelve dried samples with very low N were spiked with 20 μg NH4+-N at natural abundance to achieve optimal detection.

In order to identify farm typologies based on indicators for soil organic matter levels, we first used several clustering algorithms. First, a k-means cluster analysis based on four key soil indicators (soil organic matter, total soil nitrogen, and available nitrogen) was used to generate three clusters of farm groups using the factoextra and cluster packages in R. The cluster analysis was divisive, nonhierarchical, and based on Euclidean distance, which calculates the straight-line distance between the soil indicator combinations of every farm site in Cartesian space and creates a matrix of these distances. To determine the appropriate number of clusters for the cluster analysis, a scree plot was used to show how the total within-cluster sum of squares decreased as the number of clusters increased. The location of the kink in the curve of this scree plot delineated the optimal number of clusters, in this case three clusters. To further explore the appropriate number of clusters, we used a histogram to determine the structure and spread of data among clusters. In addition to confirming the results of the cluster analysis, the dendrogram plot showed relationships between sites and relatedness across all sites. To visualize the cluster analysis results, the final three clusters were plotted based on the axes produced by the cluster analysis.

One drawback of cluster analyses is that there is no measure of whether the groups identified are the most effective combination to explain the clusters produced by soil indicators, or whether they are statistically different from one another. To address this gap, we used ANOSIM to evaluate and compare the differences between the clusters identified with the cluster analysis above. We calculated the global similarity in addition to pairwise tests of each cluster.
To formally establish the three farm types and also make the functional link between organic matter and management explicit, we used the three clusters that emerged from the k-means cluster analysis based on soil organic matter indicators, and explored differences in management approaches among the clusters. We then created three farm types based on this exploratory analysis. Specifically, we first analyzed management practices among sites within each cluster to determine whether similarities in management approaches emerged for each cluster. Based on this analysis, we used the three clusters from the cluster analysis to create three farm types categorized by soil organic matter levels and informed by the management practices applied.

Using the three farm types from above, we then analyzed whether our classification created strong differences along soil texture and management gradients using a linear discriminant analysis (LDA). LDA is most frequently used as a pattern recognition technique; because LDA is a supervised classification, class membership must be known prior to analysis. The analysis tests the within-group covariance matrix of standardized variables and generates a probability of each farm site being categorized in the most appropriate group based on these variable matrices. To characterize soil texture, we used soil texture class. To characterize soil management, we used crop abundance, tillage frequency, and crop rotational complexity, the three management variables with the strongest gradient of difference among the three farm types. A confusion matrix was first applied to determine whether farm sites were correctly categorized among the three clusters created by the cluster analysis. Additional indicator statistics were also generated to confirm whether the LDA was sensitive to the input variables provided. A plot with axis loadings is provided to visualize the results of the LDA and display differences across farm groups.
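The elbow criterion described above can be illustrated with a small, self-contained Python sketch: a basic k-means (Lloyd's algorithm) using squared Euclidean distance, followed by the total within-cluster sum of squares for k = 1 to 3. This is not the R workflow used in the study; the deterministic initialization, the function names, and the six two-dimensional points are hypothetical simplifications chosen for illustration.

```python
def squared_distance(a, b):
    """Squared straight-line (Euclidean) distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Basic Lloyd's algorithm with evenly spaced data points as initial centroids."""
    n = len(points)
    centroids = [points[i * n // k] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm has converged
        labels = new_labels
        # Move each centroid to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


def within_cluster_ss(points, labels, centroids):
    """Total within-cluster sum of squares, the quantity plotted in a scree plot."""
    return sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))


# Six hypothetical sites described by two standardized soil indicators.
sites = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0), (20.0, 0.0), (20.0, 1.0)]
scree = {}
for k in (1, 2, 3):
    labels, centroids = kmeans(sites, k)
    scree[k] = within_cluster_ss(sites, labels, centroids)
```

Plotting `scree` against k gives the curve whose kink marks the point beyond which adding clusters no longer reduces the within-cluster sum of squares by much.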
The LDA was carried out using the MASS R package.
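The confusion-matrix check mentioned above can be sketched in a few lines of Python. The snippet only tabulates predicted against known class membership; it does not perform the LDA itself, and the farm-type labels and function names are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted, classes):
    """Rows are the known classes, columns the predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def accuracy(matrix):
    """Share of sites on the diagonal, i.e. assigned to their known class."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical farm types for six sites: known cluster vs. class predicted by a classifier.
known = [1, 1, 2, 2, 3, 3]
assigned = [1, 1, 2, 3, 3, 3]
matrix = confusion_matrix(known, assigned, classes=[1, 2, 3])
share_correct = accuracy(matrix)
```

Off-diagonal counts show which farm types the classifier confuses, which is more informative than the overall accuracy alone.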

Knowledge of the mechanism that underlies the increase in abuse is important for theory and policy.

We argue that backlash models reinterpreted in an economic framework do not necessarily “ignore the individual rationality constraints faced by women”, but rather take seriously an additional motive on the part of men: that of restoring a self-image of dominance in the household to which they may feel entitled, for example due to cultural norms. A similar theory, in an instrumental framework, would be that men use violence to attempt to address unwanted female behavior associated with employment. The paper is organized as follows. Section 2 describes the rural Ethiopian context and the experiment. In section 3 the main treatment effects are presented and analyzed in light of existing domestic violence models.

Ethiopia has some of the highest poverty, illiteracy and underemployment rates in Africa, especially for women. Domestic violence is unusually prevalent; for example, 54 percent of women in a provincial site surveyed by the WHO report having been victimized by a partner during the last year. At least until recently, a role for domestic violence was accepted in Ethiopian culture, even by many women. In a nationally representative survey conducted in 2005, 81 percent of Ethiopian women found it justified for a husband to beat his wife if the wife had violated norms. In recent years it has become more common for Ethiopian women to hold formal jobs. In rural areas an important contributing factor has been the explosive rise of the floriculture sector, which mostly employs women. In 2008, 81 flower farms in Ethiopia employed around 50,000 workers. Hiring on Ethiopian flower farms typically takes place in October and November, before the main growing and harvesting season.

The supervisors on five flower farms agreed to randomize job offers during the fall 2008 hiring season because of an unusual situation in the labor market for flower farm workers. At the time, applicants almost always outnumbered the positions to be filled by large margins. Ethiopian flower farms (still getting to grips with cost components significantly larger than labor, and with little ability to predict the productivity of the mostly uneducated, illiterate and inexperienced applicants) did not prioritize optimization of the unskilled workforce. Because supervisors were already allocating job offers relatively arbitrarily when approached by the researchers, explicit randomization was a modest procedural change.

The five farms are located in rural areas two and a half to five hours from Addis Ababa and employ local workers who live in small towns near the farms. On hiring days, supervisors first excluded any unacceptable applicants. A team of enumerators then carried out the baseline survey with the remaining applicants. Finally, the names of the workers to be hired were drawn randomly from a hat. The sample thus consists of 339 households in which a woman applied for a flower farm job and was deemed acceptable for hiring; we focus on the 329 households in which the applicant was married or living with a steady partner. We attempted to re-interview everyone in the treatment and control groups 5 to 7 months after employment commenced. Careful tracking procedures led to a re-interview rate of 88 percent and no statistically significant differential attrition. Summary statistics are displayed in table 1. There are no statistically significant differences between the characteristics of the treatment and control groups. Literacy rates are low. Almost all the applicants are parents. Income and wealth indicators, such as the material that the applicant’s floor is made of, indicate the severe poverty of the sample.
Flower farm employment typically entails six days of full-time work a week, totaling on average 202 hours per month. The alternative for the women in our sample was typically domestic work, and perhaps a few hours of informal paid work per week.

The applicants randomly chosen for employment spent 102 more hours per month working. The income of treated women increased by 154 percent on average, which translates into a 28 percent increase in total household income.

The estimated treatment effects are shown in table 3. The probability of experiencing physical violence increases by 8 percentage points, or 13 percent, when a woman gets employed in rural Ethiopia. There is also a 19 percentage point, or 34 percent, increase in emotional abuse. Finally, the intensive margin of violence is affected: the number of violent incidents experienced per month goes up by 0.31, or 32 percent, following employment. An alternative interpretation of these results is that employment affects women’s willingness to report violence to an enumerator rather than, or in addition to, violence itself. While we cannot rule out a reporting effect, greater willingness to report violence after employment is unlikely to be the primary explanation of our findings. Specific, detailed survey questions were used. As noted above, the majority of both men and women in Ethiopia find domestic violence justifiable in some situations, and 63 percent of women in our sample were comfortable reporting abuse at baseline. The prediction that physical abuse will decrease when women are “empowered” by employment is central to the most-cited domestic violence models. The estimates in table 3 represent strong evidence against such models in the context of rural Ethiopia. In the next two sub-sections we categorize pessimistic models on the basis of the hypothesized male motivation for abuse, and explore the ability of different categories of pessimistic models to explain our findings.

This paper’s primary result is that domestic violence increases significantly when women get employed in rural Ethiopia.
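Because the treatment effects are reported both in absolute terms (percentage points or incidents) and as relative changes (percent), the short Python sketch below shows how the two are related: the relative change is the absolute change divided by the comparison-group level. The implied comparison levels it computes are back-calculated from the rounded figures quoted in the text; they are illustrative only and are not values reported in the paper, and the function name is hypothetical.

```python
def implied_comparison_level(absolute_change: float, relative_change: float) -> float:
    """Comparison-group level implied by an absolute change and a relative change.

    relative_change is a fraction, e.g. 0.13 for a 13 percent increase.
    """
    return absolute_change / relative_change


# Physical violence: +8 percentage points, described as +13 percent.
physical_level = implied_comparison_level(8.0, 0.13)      # in percent
# Emotional abuse: +19 percentage points, described as +34 percent.
emotional_level = implied_comparison_level(19.0, 0.34)    # in percent
# Violent incidents per month: +0.31, described as +32 percent.
incidents_level = implied_comparison_level(0.31, 0.32)    # incidents per month
```

The distinction matters when reading table 3: an 8 percentage point rise is a shift in the share of women affected, whereas the 13 percent figure expresses that shift relative to a comparison level that is already high.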

It appears that there are two categories of models that may be able to explain our results: expressive models in which a husband’s marginal utility from violence is increasing in the economic standing of his wife, and instrumental models in which violence is used to achieve male goals other than control over household resources. We consider these two possibilities in turn. Aizer is an example of an influential class of expressive domestic violence models in which men derive utility directly from violence. Women with better options outside of marriage should be willing to accept less violence at a given “price”: employment is predicted to shift a woman’s violence “supply curve” up and thus decrease violence. Consider, however, that a husband’s violence “demand curve” may also shift up when his wife gets employed, if the husband’s marginal utility from violence is increasing in the wife’s relative or absolute economic standing. The net outcome may be that the couple’s contract curve – the set of feasible bargaining solutions – shifts up in space, and that violence itself therefore increases. Why would the marginal utility that men in Ethiopia derive from violence go up when women get employed? Suppose that there are emotional costs to men of perceived violations of traditional gender roles. In that case “violence may be a means of reinstating [a husband’s] authority over his wife” . If improvements in women’s economic standing carry emotional costs to men, events that symbolize the perceived challenge to traditional gender roles can likely lead to violence. In columns two and four of table 4 we interact the treatment indicator with the wife’s ex ante income as a share of the combined income of the husband and wife. 
The results show that the impact of employment on violence is bigger in households in which the newly employed wife is likely to end up further ahead of her husband in income because her share of baseline income was high relative to that of other women in the sample. The increase in the probability of violence when a wife gets employed is seven percentage points higher for every one standard deviation increase in the wife’s share of baseline income, almost as much as the average effect. There is also a small but marginally significant increase in male labor supply when women get employed in rural Ethiopia. Though alternative explanations are possible, these results are consistent with a plausible story in which improvements in the relative economic standing of women carry emotional costs to men; costs that some men choose to act upon through violence. A similar possibility is that violence serves an instrumental purpose, but is used not to gain control over household resources but instead to influence the behavior of wives. Husbands may see some dimensions of female behavior associated with employment as undesirable and potentially “correctable” through violence. Arguably the most plausible “real” cost to husbands of female employment is that employed wives devote less time to housework. In our sample, however, most of the housework of women randomly chosen for employment is taken over by daughters. This suggests that costs to husbands of a reallocation of women’s time may, if anything, be due to the overturning of traditional responsibilities in the household, rather than housework being left undone.

In sum, the evidence presented here suggests that emotional costs associated with violations of traditional gender roles belong in theories of domestic violence in gender-unequal societies. If so, identity models, in which disutility is associated with a self-image that deviates from the individual’s view of his or her “appropriate” role in the household, are a natural starting point. In the appendix we present an example of a framework in which a husband’s incentive to engage in violence depends on his wife’s economic standing relative to his own – as does, in turn, the wife’s response to violence. The framework allows a male “backlash” when women get employed and predicts how domestic violence responds to female employment in Ethiopia well.

This paper has analyzed the impact of female employment on domestic violence through a field experiment in which women’s long-term job offers on Ethiopian flower farms were randomized. We estimate a significant 13 percent increase in physical violence when women get employed, as well as large increases in emotional abuse and the intensity of physical violence. These results put into question the relevance of conventional economic models of domestic violence in male-dominated developing countries. Like much existing anti-violence policy, conventional models are “optimistic” in the sense of considering labor force participation a promising route to empowering women and reducing the prevalence of domestic violence. Most “pessimistic” models argue that physical abuse can increase when employment enhances wives’ incomes and bargaining power because husbands use violence as a tool to get access to and control over household resources. But we find no significant correlation between levels of violence and control over household resources, nor changes in violence and control when women get employed, and the reason does not appear to be that violence is used to counteract female bargaining power.
Rather than a male quest for control over household resources, it appears that the models that best explain our results would allow men to care about roles in the household deviating from the roles prescribed by traditional norms, and violence being seen as a way to restore a preferred order. We find that the increase in the probability of violence following female employment is greater in households in which the newly employed woman is likely to end up further ahead of her husband in income. The costs to a husband of lost economic dominance are presumably primarily emotional, suggesting that the benefits of turning to violence in response may also be emotional. It may be that men derive “expressive” utility from violence and, while a woman’s “violence supply curve” likely shifts up when her outside option improves, her husband’s “violence demand curve” also shifts up because his marginal utility from violence depends on his wife’s relative economic standing. A similar “instrumental violence” interpretation would be that men abuse their wives not to achieve financial control but rather, for example, to influence their wives’ behavior in the household. We conclude that conventional optimistic economic models of domestic violence are unlikely to accurately describe the situation in most households in male-dominated developing countries such as Ethiopia; and that not all men will passively accept challenges to their economic dominance, so successful models of domestic violence will likely need to account for the male reaction to female economic progress. Finally, it is worth emphasizing that the increase in domestic violence we observe when women get employed does not mean that women are not empowered by employment. For example, it may be that some women previously acquiesced in the face of demands from their husbands but choose not to when emboldened by employment.

The effect of conflict on discriminatory workplace behavior does not decay in the nine months after conflict ended

So far we have seen that output in factory production in Kenya is lower when individuals of different ethnic backgrounds work together, and that the reason appears to be that biased upstream workers undersupply downstream workers of other ethnic groups and misallocate intermediate goods across coethnic and non-coethnic downstream workers. We have also seen that distortionary workplace discrimination is greater during times of conflict, and that firms introduce policies in response in order to reduce workers’ incentive to discriminate. By studying how discriminatory preferences are shaped, and how firms choose their response to distortionary discrimination, researchers can go beyond identifying a source of ethnic diversity effects in production and begin to address why those effects vary across space and time and how profit motives in the private sector can reduce the aggregate effect of ethnic diversity. In the model of taste-based discrimination above, the impact of conflict on output in diverse teams should persist for as long as attitudes towards workers of other ethnic groups are affected. Periods of increased antagonism may entail significant hidden economic costs if “mean reversion” in taste for discrimination is slow. The evolution of output in teams of different ethnicity configurations across the three sample periods was depicted in figure 2. After the introduction of team pay, average output in both homogeneous and mixed teams was steady for the remainder of the sample period, suggesting that the impact of conflict on social preferences was long-lived.

How did the response to conflict of distortionary discrimination at work vary across individuals?

Modeling θC and θNC as parameter values shared by all workers is a simplification: in reality some workers will have a higher taste for discrimination than others. Figure 9 plots the distribution, across individual suppliers, of the difference in output between homogeneous and mixed teams supplied, before and after conflict began. It appears that most suppliers discriminate against non-coethnic processors during the pre-conflict period. Conflict led to an increase in the output gap between homogeneous and mixed teams supplied for most upstream workers, but also to a notable widening of the distribution of the output gap. The figure indicates that some upstream workers respond more to conflict than others, differentially increasing the extent to which they discriminate against non-coethnics downstream. Some workers in the sample were more exposed to the conflict period of early 2008 than others. Though the workers at the plant and their co-habitating family-members were not themselves directly affected, 22 percent of workers report to have “lost a relative” during the conflict. The decrease in output in mixed teams when conflict began was significantly greater in teams supplied by such workers, as seen in columns 1 and 2 of table 10. These results indicate that personal grievances exacerbate individuals’ workplace response to conflict. Younger individuals may have more malleable social preferences. In columns 3 and 4 of table 10 we see that, although output in homogeneous teams led by old and young suppliers was similar, output in mixed teams with young suppliers was significantly higher during the first year of the sample period. Young suppliers were less discriminatory towards noncoethnic co-workers than old suppliers before conflict began, it appears. This finding is consistent with an expectation expressed by many Kenya commentators before 2008.

It was argued that the young coming of age at the time would be the country’s first “post-tribal” generation. The results of table 10 also show that the decrease in output in mixed teams when conflict began was significantly greater in teams with young suppliers, however. Output in mixed teams with young suppliers was no higher than in mixed teams with older suppliers during the conflict period. These results suggest that youth start out relatively tolerant, but that the attitudes of the young towards non-coethnics respond more negatively to conflict. The results discussed in this section paint a consistent picture of how distortionary attitudes towards workers of other ethnic groups respond to ethnic conflict. It appears that conflict may entail significant hidden economic costs because distortionary social preferences are updated in a “Bayesian” fashion when conflict occurs, at least in the Kenyan context. A serious episode of violent, political conflict between the Kikuyu and Luo blocs led to a significant shift in the average weight attached to the well-being of non-coethnics, a shift that did not decay in the nine months after conflict ended. The negative response was greater among those more affected and among those likely to have a less cemented “prior”.

Segregating workers of different ethnic groups would appear to be the profit-maximizing response to distortionary discrimination, from the viewpoint of the econometrician. The results in tables 4 and 8 suggest that segregation would have increased plant productivity by four percent before conflict and by eight percent after conflict began, relative to the status quo of arbitrary assignment to teams. Are these expected benefits of a magnitude that is likely to be salient to supervisors? Consider the output increase expected from optimally assigning workers to teams and positions based on ethnicity, productivity or both.
If we view a worker as having three characteristics – the tercile to which she belongs in the distribution of processor productivity, the tercile to which she belongs in the distribution of supplier productivity, and her ethnicity – then an average output will be associated with teams of each of 3 ethnicity configurations, 18 productivity configurations and 63 ethnicity productivity configurations. 
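The configuration counts quoted above can be reproduced by enumeration. The sketch below assumes that a team is one supplier plus an unordered pair of processors, that the supplier is described by her supplier-productivity tercile and each processor by her processor-productivity tercile, and that ethnicity is coded relative to the supplier (coethnic or non-coethnic). These coding choices and the function name are assumptions made for illustration, not details stated in the text.

```python
from itertools import combinations_with_replacement, product

TERCILES = (1, 2, 3)
# Assumption: a processor's ethnicity is recorded relative to the supplier.
RELATIVE_ETHNICITY = ("coethnic", "non-coethnic")


def count_configurations(supplier_types, processor_types):
    """Count team types: one supplier plus an unordered pair of processors."""
    processor_pairs = list(combinations_with_replacement(processor_types, 2))
    return len(supplier_types) * len(processor_pairs)


# Ethnicity only: the supplier is the reference point, so one supplier type.
ethnicity_configs = count_configurations(("supplier",), RELATIVE_ETHNICITY)

# Productivity only: supplier tercile times unordered processor terciles.
productivity_configs = count_configurations(TERCILES, TERCILES)

# Both: each processor has a tercile and a relative ethnicity.
processor_types = list(product(TERCILES, RELATIVE_ETHNICITY))
joint_configs = count_configurations(TERCILES, processor_types)

print(ethnicity_configs, productivity_configs, joint_configs)  # 3 18 63
```

With these assumptions the enumeration gives 3, 18 and 63 team types, matching the counts in the text.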

In theory, supervisors can then solve the linear programming problem of maximizing total output subject to the expected output associated with a given type of team and the “budget set” of workers available. The optimal assignments and associated expected output gains are shown in table 11. Throughout the period observed, the output gains expected from assigning workers to teams based on ethnicity were larger than those expected from assigning workers based on productivity – twice as large during the conflict period. In fact segregation achieves about half the output gains of the “complete” solution. The complete solution assigns workers optimally to fully specified teams and thus takes into account interactions between the three workers’ ethnicities and productivities – a complicated “general equilibrium” problem that is likely infeasible for supervisors to solve. It thus appears that the expected productivity gain of segregation is sizable relative to the expected effect of changing other comparable factors under supervisors’ control.

It is possible that a similar effect occurs in a Kenyan workplace, although in a situation in which mixed teams are characterized by discriminatory behavior it is also possible that interaction increases tensions and exacerbates ethnic biases. To investigate, I compare the behavior of suppliers with greater versus lower experience working with non-coethnics, in table 12. Focusing on output during the second half of 2007 and the first six weeks of 2008, I contrast teams with suppliers with above-average versus below-average time spent in mixed teams during the first half of 2007. Because most workers at the farm had already spent significant time working with non-coethnics before 2007, columns 3 and 4 restrict the sample to those with below-average tenure.
The results show no significant effect of time spent working with non-coethnics on the output gap between mixed and homogeneous teams supplied, neither before nor after conflict began. Workers who have interacted more with individuals of other ethnic groups thus appear no less discriminatory in production. The results in table 12 do not rule out the possibility that complete segregation between the two ethnic groups over time would have a negative influence on attitudes or behavior towards non-coethnics, however. Carrell, Sacerdote, and West find that implementing an estimated optimal assignment can have unintended consequences due to unforeseen responses on the part of individuals to out-of-sample assignments. In the context of the sample farm, in a country that has experienced periodic violent clashes between ethnic groups, and where workers of different ethnic groups reside in the same quarters, complete segregation at the plant could for example lead to increased social tensions on the farm.

Nevertheless, it is arguably surprising that a supposedly profit-maximizing firm chose to leave large productivity gains “on the table” by not segregating workers of different ethnicities. Ethical considerations add complexity to the issue of team assignment in Kenya, but we would perhaps expect longer-term costs of segregation to be incurred primarily by society, rather than the firm itself, in which case an argument can be made for government intervention to enforce integration within firms. Becker pointed out that discriminatory employers should go out of business as their profits suffer.

A priori, the same argument should hold for flower farms that allow workplace discrimination to influence productivity. However, the floriculture business is not particularly competitive, as evidenced by high profit margins. Moreover, as the literature in macroeconomics on across-firm misallocation has highlighted, it is not necessarily the most productive firms that survive in poor countries’ economies. Further, plant managers did respond to the increase in distortionary discrimination when conflict began, as we have seen. The introduction of team pay for processors was likely motivated by the decrease in productivity in diverse teams in early 2008. It is unsurprising that the dramatic differential decrease in mixed teams’ output when conflict began led managers to respond, even though the lower output observed in diverse teams during the first year of the sample period did not. A doubling of the output gap of diverse teams during a short period of time is likely more salient to managers than potential foregone productivity gains from arbitrary assignment to teams. It appears that managers considered an adjustment to contractual incentives a more desirable response to distortionary discrimination than segregating workers. But note that it is likely not possible to eliminate discrimination through contractual incentives, without entirely breaking the link between workers’ output and pay. At the sample plant, vertical discrimination continued to significantly affect output after the introduction of team pay.

Evidence suggests that ethnic diversity negatively affects public goods provision and the quality of macroeconomic policies. While the possibility of an additional, direct effect on micro-level productivity has long been recognized, corresponding evidence is largely absent. In this paper, I begin by identifying a sizable, negative productivity effect of ethnic diversity in teams in Kenya.
I do so using two years of daily output data for 924 workers, almost equally drawn from two rival tribes, at a flower-packing plant. The packing process takes place in triangular production units, one upstream “supplier” supplying two downstream “processors” who finalize bunches of flowers. I show that an arbitrary position rotation system led to quasi-random variation in teams’ ethnicity configuration. As predicted by a model in which different weight is attached to coethnic and non-coethnic downstream workers’ utility, suppliers discriminate both “vertically” – undersupplying downstream non-coethnics – and “horizontally” – shifting flowers from non-coethnic to coethnic downstream workers. By doing so, upstream workers lower their own pay and total output. I show that less distortionary, non-taste-based ethnic diversity effects are unlikely to explain this paper’s results. As Becker points out, significant aggregate effects “could easily result from the manner in which individual tastes for discrimination allocate resources within a free-enterprise framework”. Discrimination should lead to misallocation of resources in most joint production situations in which individuals influence the output and income of others. I take advantage of two natural experiments during the time period observed to begin to explore how the productivity effects of ethnic diversity are likely to vary across time and space. When contentious presidential election results led to political conflict and violent clashes between the two ethnic groups represented in the sample in early 2008, a dramatic, differential decrease in the output of mixed teams followed, as predicted by the model. The reason appears to be that workers’ taste for discrimination against non-coethnic co-workers increased. I estimate a decrease in the weight attached to non-coethnics’ utility of approximately 35 percent in early 2008, through a reduced form approach.
A back-of-the-envelope calculation suggests that the increase in distortionary workplace discrimination may have cost the plant half a million dollars in annual profit, had it not responded. Six weeks into the conflict period, the plant implemented a new pay system in which downstream workers were paid for their combined output.

The effects of ethnic divisions are of particular importance in the Kenyan context

The best method for determining incorporation timing is to walk the field with a shovel and dig numerous holes and “feel” soil moisture at various depths throughout the field. In medium- and heavy-textured soils you want to be able to form a ball of soil in your hand and then break it apart easily. For this exercise it is important to get soil from at least 8 inches deep. If the soil “ribbons” easily when squeezed between your thumb and index finger it is probably still too wet to work. Optimum soil moisture is critical for good incorporation and breakdown; in average rainfall years early April is commonly the best time for incorporation in the Central Coast region. After flail mowing the residue needs to be mixed with the soil to enhance microbial breakdown and facilitate seedbed formation. The best tool for this is a mechanical spader. Spaders are ideal for cover crop incorporation for many reasons. When operated in optimal soil moisture conditions spaders have minimal impact on soil aggregation and create almost no compaction compared to other primary tillage tools. Spaders are capable of uniformly mixing the cover crop residue into the tilled zone while at the same time leaving the soil lofted and well aerated, allowing for ideal conditions for microbial breakdown of the residue. Spaders also have two major drawbacks: they are expensive, and they require very slow gearing and high horsepower to operate; 10 horsepower per working foot of spader is the basic requirement depending on soil conditions and depth of operation. They run at a very slow ground speed, often in the range of 0.6 to 0.8 mph. Thus a 7-foot wide spader requires 70 HP and takes between 3 and 4 hours to spade an acre.
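The sizing rule in this paragraph can be checked with a short calculation. The sketch below is illustrative only; the function names are hypothetical. It applies the 10 horsepower per working foot rule and computes a theoretical time per acre that assumes continuous travel with no turning, overlap between passes, or stops.

```python
SQ_FT_PER_ACRE = 43_560
FT_PER_MILE = 5_280
HP_PER_WORKING_FOOT = 10  # rule of thumb quoted in the text


def required_horsepower(width_ft):
    """Tractor horsepower needed for a spader of the given working width."""
    return width_ft * HP_PER_WORKING_FOOT


def theoretical_hours_per_acre(width_ft, speed_mph):
    """Hours to cover one acre at constant speed, ignoring turns and overlap."""
    sq_ft_per_hour = width_ft * speed_mph * FT_PER_MILE
    return SQ_FT_PER_ACRE / sq_ft_per_hour


print(required_horsepower(7))                             # 70
print(round(theoretical_hours_per_acre(7, 0.6), 2))       # 1.96
print(round(theoretical_hours_per_acre(7, 0.8), 2))       # 1.47
```

The theoretical figure of roughly 1.5 to 2 hours per acre is a lower bound. The 3 to 4 hours quoted above presumably allow for turning at row ends, overlapping passes, and slower travel in difficult conditions, although that interpretation is an assumption rather than something the text states.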

Although time consuming, the results are impossible to replicate with any other tillage options now available. If a mechanical spader is not available the next best and probably most commonly used tool for cover crop incorporation is a heavy offset wheel disc. Depending on the size and weight of the disc multiple passes are often required for adequate incorporation. Chiseling after the first several passes will facilitate the disc’s ability to turn soil and will also help break up compaction from the disc.

Rapid urbanization, compounded by globalization, has had lasting effects on the agricultural sector and on both urban and rural communities: As urban populations increase, they place more demands on a shrinking group of rurally-based food suppliers. And as the movement for locally based food systems grows to address this and other food system problems, urban agriculture has become a focal point for discussion, creativity, and progress. Indeed, the production of food in and around densely populated cities bears much promise as part of any solution to food supply and access issues for urban populations. Growing Power, Inc., a Milwaukee-based organization, exemplifies the way that urban agriculture can address some of the needs of rapidly growing urban communities, particularly those with poor, minority populations. Will Allen, Growing Power’s founder, born to former sharecroppers in 1949, was drawn back to agriculture after a career in professional basketball. Aside from his love for growing food, he saw that the mostly poor, black community near his roadside stand in North Milwaukee had limited access to fresh vegetables or to vegetables they preferred.
Confirming his observation, a 2006 study found that more diverse food options exist in wealthy and white neighborhoods than in poor and minority neighborhoods. Allen decided that he would serve this unmet demand by growing fresh food in the neighborhood where his customers lived and involve the community in the process. In 1993, long before urban agriculture bloomed into the movement it is today, Growing Power began. The importance of equal access to fresh food cannot be overestimated.

While ever more exotic fruits and vegetables from around the world stock health and natural foods stores in wealthy and predominantly white neighborhoods, poor and minority communities face fewer and less healthy food choices in the form of convenience stores, fast food restaurants, and disappearing supermarkets. This lack of access can lead to higher rates of diet-related illnesses. Growing Power has been working to create an alternative food system based on intensive fruit and vegetable production, fish raising, and composting, in order to make healthy food available and affordable to the surrounding community, and to provide community members with some control over their food choices. But as anyone who has initiated an urban agricultural project knows, fertile, uncontaminated land is often difficult to find in a city. Even if land with soil is available, most empty lots are in former industrial areas where toxic contamination often renders land unusable. In Milwaukee, Growing Power sat on a lot with no soil and five abandoned greenhouses. Compost became the foundation for all of Growing Power’s activities. The raw materials needed to produce it were in abundant and cheap supply in the city: food waste, brewery grains, coffee grounds, newspaper waste, grass clippings, and leaf mold are all by-products of urban life destined, in most places, for the landfill. Businesses will often donate these materials to urban agriculture projects, saving the cost of garbage hauling services. Compost, and vermicompost in particular, also provides a renewable source of fertilizer that doesn’t rely on fossil-fuel inputs and can itself be used as a growing medium. With a healthy compost-based system, Growing Power discovered a low-cost, renewable, and easy-to-duplicate solution to one of the biggest hurdles people face when growing food in cities.
Since 1993, Growing Power has grown in size and scope, starting gardens in Chicago as well as Milwaukee, and training centers in 15 cities, and including youth training, outreach and education, and policy initiatives in its mission. Interest in urban agriculture has also blossomed into a movement that includes commercial urban farms, scores of community farms and gardens, and educational gardens and training programs growing food and flowers and raising chickens, bees, goats, and other livestock for local consumption.

Urban agriculture has grown so rapidly in the last two decades that in 2012, the USDA granted $453,000 to Penn State University and New York University for a nationwide survey of the “State of Urban Agriculture” with an eye toward providing technical assistance, evaluating risk management, and removing barriers for urban farmers. The federal government’s interest in urban agriculture comes on the heels of state and local initiatives to encourage urban agriculture in numerous cities, including Milwaukee, Chicago, New York and San Francisco. The driving force behind these initiatives and the urban agriculture movement as a whole has always been groups of committed individuals in urban communities in search of food, community, opportunity, security, and access. What makes the Growing Power model work is not just its innovative techniques and creative use of urban spaces, but the partnership with its neighbors who not only receive the program’s services, but contribute significantly to its success.

There is evidence to suggest that ethnic heterogeneity may impede economic growth. A negative influence on decision-making in the public sphere has been documented: public goods provision is lower and macroeconomic policies of lower quality in ethnically fragmented societies. The possibility of an additional direct effect on productivity in the private sector has long been recognized, however. Individuals of different ethnicities may have different skill-sets and therefore complement each other in production, but it is also possible that workers of the same ethnic background collaborate more effectively. Evidence from poor countries on the productivity effects of ethnic diversity is largely absent. This paper provides novel microeconometric evidence on the productivity effects of ethnic divisions.
I identify a negative effect of ethnic diversity on output in the context of joint production at a large plant in Kenya where workers were quasi-randomly assigned to teams. I then begin to address how output responds to increased conflict between ethnic groups, how firms respond to lower productivity in diverse teams, and how workplace behavior responds to policies implemented by firms to limit ethnic diversity distortions. A model of taste-based discrimination at work explains my findings across these dimensions. I study a sample of 924 workers working in teams at a plant in Kenya. The workers package flowers and prepare them for shipping: productivity is observed and measured by daily individual output. Tribal competition for political power and economic resources has been a defining character of Kenyan society since independence . Workers at the flower plant are almost equally drawn from two historically antagonistic ethnic blocs – the Kikuyu and the Luo . Production takes place in triangular packing units. One upstream “supplier” supplies and arranges roses that are then passed on to two downstream “processors” who assemble the flowers into bunches, as illustrated in figure 1a. The output of each of the two processors is observed. During the first period of the sample, processors were paid a piece rate based on own output and suppliers a piece rate based on total team output. Inefficiently low supply of roses to downstream workers of the rival ethnic group was thus costly for suppliers. I show that the plant’s system of assigning workers to positions through a rotation process generates quasi-random variation in team composition.

A worker’s past productivity and observable characteristics are orthogonal to those of other workers in her assigned team. The productivity effect of ethnic diversity can thus be identified by comparing the output of teams of different compositions. Two natural experiments during the time period for which I have data allow me to go further. During the second period of the sample, in early 2008, contentious presidential election results led to political and violent conflict between the Kikuyu and Luo ethnic groups, but production at the plant continued as usual. In the third period of the sample, starting six weeks after conflict began, the plant implemented a new pay system in which processors were paid for their combined output. By taking advantage of the three periods observed, I identify the source of productivity effects of ethnic diversity in the context of plant production in Kenya; how the economic costs of ethnic diversity vary with the political and social environment; and how managers responded to ethnic diversity distortions at the plant, and how workplace behavior changed as a consequence of the policies implemented in response. I model ethnic diversity effects as arising from a “taste for discrimination” among upstream workers: suppliers attach a potentially differential weight to coethnics’ and non-coethnics’ utility, a formulation that follows Becker, Charness and Rabin, and others. The model predicts that discriminatory suppliers in mixed teams will “misallocate” flowers both vertically – undersupplying downstream workers of the other ethnic group – and horizontally – shifting flowers from non-coethnic to coethnic downstream workers. The impact of horizontal misallocation on total output will depend on the relative productivity of favored and non-favored downstream workers. If conflict led to a decrease in non-coethnics’ utility-weight, a differential fall in mixed teams’ output in early 2008 is predicted.
Under team pay, a positive output effect of a reduction in horizontal misallocation is expected to offset negative free riding effects, in teams in which the two processors are of different ethnic groups. The reason is that suppliers can no longer influence the relative pay of the two processors through relative supply under team pay. Quasi-random assignment led to teams of three different ethnicity configurations. About a quarter of observed teams are ethnically homogeneous, another quarter are “vertically mixed” teams in which both processors are of a different ethnic group than the supplier, and about half are “horizontally mixed” teams in which one processor is of a different ethnic group than the supplier. The ethnicity configurations are displayed in figure 1b. I test the model’s predictions by comparing the average output of teams of different ethnicity configurations within and across the three sample periods. In the first main result of the paper, I find that vertically mixed teams were eight percent less productive and horizontally mixed teams five percent less productive than homogeneous teams during the first period of the sample.

There are many commercial mixes available that come close to meeting most of the above criteria

Synthetic nitrogen-based fertilizers were made possible because of the Haber-Bosch process, which converts stable, inert nitrogen gas unavailable to plants into the reactive ammonia molecule readily available for plant uptake. Once the process was commercialized, synthetic fertilizer use skyrocketed, as farmers were no longer dependent only on their soil organic matter, compost, cover crops, and livestock manure for nitrogen. Fertilizer use in the United States increased from about 7.5 million tons in 1960 to 21 million tons in 2010. In 2007, California farmers applied 740,000 tons of nitrogen in fertilizers to 6.7 million acres of irrigated farmland. With cheap sources of nitrogen and water available, our current agricultural system is based on the liberal application of synthetic fertilizers and irrigation water to ensure high yields, often at the expense of environmental and public health. California’s Central Valley is home to some of the most heavily fertilized cropland and some of the most polluted water in the United States. Communities there are particularly vulnerable to public health effects of nitrate contamination because groundwater provides drinking water for the majority of residents. Additionally, rural communities in the valley are generally poor and populated by immigrants and minorities least able to afford treatment costs and most vulnerable to discriminatory decision-making. Tulare County, the second most productive agricultural county in California, includes many of these communities. Though it generates nearly $5 billion in revenue from agriculture each year, it has the highest poverty rate in California and is populated mainly by minorities, most of whom are Latino. The average per capita income in the county is $18,021. Here, one in five small public water systems and two in five private domestic wells surpass the maximum contaminant level for nitrates.

As a result, residents of towns like Seville, East Orosi, and Tooleville are paying $60 per month for nitrate-contaminated water they can’t safely use, and must spend an additional $60 to purchase bottled water for drinking and bathing. In contrast, San Francisco water customers pay $26 per month for pristine water from the Hetch Hetchy water system in Yosemite. The economic cost of nitrate contamination in drinking water is not the only cost to these communities. Farm workers make up a significant segment of the population of small towns throughout the Central Valley and are exposed to the hazards of heavy fertilizer use both directly, in the fields and in the air, and through excess nitrogen leached into groundwater drinking supplies. Scientists estimate that 50–80% of nitrogen applied in fertilizer is unused by plants. Of that, about 25% volatilizes into the atmosphere. As a result, approximately 30–50% of nitrogen applied in fertilizer (about 80 pounds per acre in California) leaches into groundwater beneath irrigated lands and into public and private water supplies. High nitrate levels in water can cause a number of health problems, including skin rashes, eye irritation, and hair loss. More severe is “Blue Baby Syndrome,” a potentially fatal blood disorder in infants caused by consumption of nitrate-contaminated water. Direct ingestion, intake through juices from concentrate, and bottle-fed infant formula are all potential threats to children. Nitrate contamination has also been linked to thyroid cancer in women. Widespread contamination of groundwater through leached fertilizer has rendered drinking water in rural communities across the country not only unusable, but dangerously so. While nitrate contamination is an acute problem in California, it exists across the country. The EPA estimates that over half of all community and domestic water wells have detectable levels of nitrates.
Rural communities that rely on private wells, or lack access to adequate water treatment facilities, have the most insecure water supplies. In the short term, municipalities must devise a plan to reduce the disproportionately high cost of water to these communities.

One potential solution is a fee attached to the purchase of fertilizer used to subsidize water costs for communities with contaminated water. Communities with contaminated water could also be added to a nearby water district with access to clean water. In the longer term, the obvious solution is to substantially reduce synthetic fertilizer and water use in agriculture. Treatment, while effective on a small scale, cannot keep up with the vast quantities of nitrates continually entering groundwater supplies through fertilizer application. Similarly, reduced irrigation on farms, drawn mostly from uncontaminated sources, frees up new sources of drinking water for nearby communities. Lastly, to reach a truly sustainable and equitable system of water distribution, residents of rural communities must be included in the planning and decision-making process as members of local water boards, irrigation districts, and planning commissions to establish and safeguard their right to uncontaminated water.

Currently, there is mounting evidence that suggests sustainable agriculture practices, exemplified by those used in agroecological systems, provide an opportunity to achieve the dual goals of feeding a growing population and shrinking agriculture’s carbon footprint, in addition to the social benefits of increased food security and stronger rural economies. This is in contrast with industrial-scale conventional systems that rely on fossil fuel-based fertilizers, pesticides, and heavy tillage and look to genetic engineering to help plants cope with climate change, e.g., by developing drought-resistant crop varieties, which themselves require high inputs of fertilizers and pesticides to produce optimally. Agroecological systems, on the other hand, can mitigate climate change by reducing fossil fuel use and employing farming techniques that reduce GHG emissions by sequestering carbon in the soil.
Of the range of practices in an agroecological system that address issues related to climate change, cover cropping is perhaps the most effective. As climate change continues to affect weather patterns and cause more frequent and severe weather events, protecting against soil erosion will become increasingly important. Cover crops provide an effective mitigation strategy by protecting soil against water- or wind-driven erosion. Cover cropping also provides other climate-related benefits, including: an on-farm source of fertility, less dependence on fossil fuels and their derived products, and adaptability and resilience. Most of all, while the specific species, timing, and primary purpose of a cover crop vary geographically, the principles behind their cultivation are universally applicable and their benefits universally available. The use of a leguminous cover crop to fix nitrogen in the soil over the wet season for the next season’s crop is widely recognized as an effective fertility management tool. According to an FAO report on agriculture in developing countries, using cover crops in a maize/pigeon pea rotation led to increased yields and required less labor for weeding than continuous maize cropping systems with conventional fertilizer use. Nitrogen-fixing cover crops also greatly reduce, and in some cases eliminate, reliance on off-farm sources of fertility, thus reducing the overall carbon footprint of the farm while maintaining high fertility levels in the soil. Note that even organic fertilizers have a high embedded energy cost as they are mostly derived from manure from animals raised in confined feedlots, so the ability to grow one’s fertility needs on farm is important across different agricultural systems. Cover crops are not only a mitigation strategy for climate change, but also a cost-saving measure.
Synthetic fertilizer costs have steadily increased over the last half-century, causing hardship for farmers, especially in developing countries where fertilizer prices are already two to three times the world price. Organic farmers are less vulnerable to price shifts in fertilizer, but can equally benefit from the reduced need for compost as a result of cover cropping. By saving seeds from their cover crops, farmers can close the loop in their cover crop management, save on annually purchased seed, and develop strains well-adapted to local conditions.

Fertility management systems based on cover crops insulate conventional farmers from increasingly frequent spikes in fertilizer prices and provide organic farmers with a cheap and renewable source of fertility. Adaptation and resilience are also crucial to farmers’ long-term success in the face of unpredictable and disruptive effects from a changing climate because so much of agriculture depends on constantly changing climatic conditions. Added to climate change are increasing input prices and a growing demand for food that put pressure on farmers to maintain high yields while paring down on costs. Cover crops can provide farmers with the flexibility they need by protecting topsoil from wind and water erosion, storing a reliable supply of nutrients to the soil, and—if managed correctly— minimizing costly weeding requirements. For many resource poor farmers who maintain livestock, cover crops provide a path to financial independence and food security as they can be grown both for soil fertility and livestock feed. Cover crops as part of a climate mitigation strategy also make sense at every scale of agriculture. Large conventional farms require consistently high yields to stay profitable as they often operate on razor thin margins. To achieve this goal, these farms rely heavily on fossil fuel-based sources of energy and fertility. Whether used on conventional or organic farms, cover cropping not only reduces farm emissions, but also contributes to the biological health of the farm’s aggressively cultivated soils. Many organic farms at all scales already use cover crops as part of their fertility management program, contributing to the sustainability of the overall system. Subsistence and small-scale farmers in developing countries who do not already practice cover cropping can benefit greatly in production and climate-related sustainability from adopting locally relevant techniques. 
And finally, low-cost, locally available sources of fertility are vital to the viability and success of urban agriculture projects that rely on cost minimization and closed-loop systems since external resources are not as readily available or economical in cities.

In areas of the Central Coast where winter rainfall typically exceeds 25 inches per year, and especially on sloped ground, cover cropping in annual vegetable cropping systems is highly advisable to protect non-cropped soil from both erosion and nutrient leaching. Based on numerous studies, the optimum time for planting winter cover crops on the Central Coast is mid-October. In our mild winter climate we can plant cover crops as late as January; however, the best results in terms of weed suppression, stand uniformity, and biomass production are from cover crops planted in mid to late October or early November. Depending on rainfall patterns, it is often critical to get winter cover crops planted prior to the onset of heavy winter rainfall. Cover crop ground preparation and planting are best accomplished when soil is dry enough to work without the risk of compaction, which can result in poor drainage and clod formation. This is especially important on heavier soils. Because timing is critical, growers need accurate long-range weather forecasts to help determine when to prepare ground and plant fall cover crops. Timing these operations is directly related to soil type and rainfall amounts, so each farm will have a different set of criteria on which to base ground preparation and planting schedules: the heavier the soil and the greater the rainfall, the tighter the window for fall-planted cover crops. There is often a very tight window between cover crop planting and harvest of fall crops which, coupled with the potential for significant rain events, can add considerably to the excitement.

Selecting optimum cool season cover crop mixes is challenging since there are so many factors involved.
The optimum mix provides early and uniform stand establishment, good weed competition, and minimal pest and disease pressure. It “catches” potentially leachable nutrients, does not lodge or fall over in high wind and heavy rainfall events, does not set viable seed prior to incorporation, fixes nitrogen, does not get too carbonaceous prior to incorporation, and is relatively easy to incorporate and quick to break down once incorporated. The ideal mix also improves overall soil health and helps form stable soil aggregates by providing adequate amounts of carbon as a food source for the soil microbial communities. A good standard mix that has proven successful at the Center for Agroecology & Sustainable Food Systems Farm on the UC Santa Cruz campus over the past 20 years is a 50/50 mix of bell beans and lana vetch with no more than 7% cayuse oats, planted at a rate of about 175 lbs per acre with a no-till drill.

There are many options available for mid- and late summer cover crops in the Central Coast region. Water use and “land out of production” are the two biggest challenges with summer cover crops, but in a diverse system they can provide good weed suppression and nutrient cycling, and can significantly improve soil tilth and aggregation when planted in rotation with mixed vegetables.
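As a rough illustration of the winter mix arithmetic, the sketch below splits a seeding rate into per-species amounts. It assumes the percentages are by seed weight, that oats are set at the 7% ceiling, and that the remainder is divided evenly between bell beans and lana vetch; the function name `mix_rates` and its parameters are hypothetical and not part of any published recommendation.

```python
def mix_rates(total_lbs_per_acre=175.0, oat_fraction=0.07, acres=1.0):
    """Split a cover crop seeding rate into per-species pounds of seed.

    Assumption (not stated in the text): fractions are by seed weight,
    and the legume portion is divided 50/50 between the two legumes.
    """
    total = total_lbs_per_acre * acres
    oats = total * oat_fraction       # cereal component, capped at 7%
    legumes = total - oats            # remainder goes to the legumes
    return {
        "bell beans": legumes / 2,
        "lana vetch": legumes / 2,
        "cayuse oats": oats,
    }


if __name__ == "__main__":
    for species, lbs in mix_rates(acres=2.5).items():
        print(f"{species}: {lbs:.1f} lbs")
```

Lowering `oat_fraction` shifts seed weight back to the legumes, which is the direction the "no more than 7%" guidance allows.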

It is also interesting to note just how little temporal heterogeneity is found among the nonactive and rumination axes

The resulting data was formatted as a three-dimensional tensor, with cows on the first axis, time on the second axis, and mutually exclusive behaviors on the third axis. In order to more intuitively represent this data as the proportion of time devoted to each behavior, the total minutes a given cow was recorded as engaging in a given behavior on a given day was normalized by the total minutes that a given cow was recorded on a given day of observation.

An ensemble of data mimicries was created for this observed data tensor using the simTimeBudget utility previously developed for the LIT package. Observational error attributable to the precision of the sensor was again simulated by stochastically resampling each observed hourly time budget using the joint Dirichlet-Multinomial sampling strategy. The resulting mimicry of the raw sensor data was then conditionally aggregated by date of observation to create a four-dimensional tensor formatted identically to the observed data tensor but with simulation number on the fourth axis to comply with LIT package formatting standards. Ensemble variance estimates were calculated over the simulation axis for each combination of cow, day, and behavior indices. As with overall time budget, these ensemble variance estimates could subsequently be used to scale dissimilarity estimates for the observed data to account for heterogeneity in multinomial-formatted data both within and between behavioral axes. In previous analysis of overall time budgets, jackknife resampling could be performed within-animal to nonparametrically estimate the reliability of the underlying behavioral signal by leveraging information in the temporal subsamples used to create the aggregate record.
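The normalization and ensemble steps described above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the simTimeBudget implementation from the LIT package: the function names, the `prior` pseudo-count, and the specific resampling scheme (a Dirichlet draw centered on each hourly record followed by a multinomial draw of the same number of minutes) are assumptions made for the example.

```python
import numpy as np


def normalize_daily_budget(minutes):
    """Convert a (cows, days, behaviors) tensor of minutes to proportions."""
    minutes = np.asarray(minutes, dtype=float)
    return minutes / minutes.sum(axis=-1, keepdims=True)


def simulate_ensemble(hourly, n_sims=20, prior=0.5, seed=0):
    """Resample hourly records and aggregate them to daily time budgets.

    hourly: integer tensor of minutes, shape (cows, days, hours, behaviors).
    Assumed scheme: p ~ Dirichlet(counts + prior), then
    counts* ~ Multinomial(total minutes, p) for every hourly record.
    Returns a (cows, days, behaviors, n_sims) tensor of proportions.
    """
    rng = np.random.default_rng(seed)
    cows, days, hours, k = hourly.shape
    flat = hourly.reshape(-1, k)
    out = np.empty((cows, days, k, n_sims))
    for s in range(n_sims):
        sim = np.empty_like(flat)
        for i, counts in enumerate(flat):
            p = rng.dirichlet(counts + prior)
            sim[i] = rng.multinomial(int(counts.sum()), p)
        # Aggregate simulated hours by day, then normalize to proportions.
        daily = sim.reshape(cows, days, hours, k).sum(axis=2)
        out[..., s] = daily / daily.sum(axis=-1, keepdims=True)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Toy data: 3 cows, 2 days, 24 hours, 5 behaviors, 60 minutes per hour.
    hourly = rng.multinomial(60, [0.2, 0.3, 0.3, 0.15, 0.05], size=(3, 2, 24))
    ensemble = simulate_ensemble(hourly, n_sims=10)
    ensemble_var = ensemble.var(axis=-1)  # variance over the simulation axis
    print(ensemble_var.shape)
```

Keeping the simulation index on the last axis means the variance for each cow, day, and behavior is a single reduction over that axis.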

With daily time budget records, however, this strategy could not be employed, as homogeneity of hourly time budgets could not be assumed due to fluctuations in behaviors imposed by the management schedule and the circadian rhythms of the animals themselves. Thus, while the behavioral axis could be rescaled using the precision penalized ensemble weighted distances, systematic differences between days of observation were here accommodated by the empirically-driven iterative re-weighting of the time axis at the heart of the data mechanics algorithm.

In order to encode daily time budgets, the basic data mechanics algorithm was extended to a three-dimensional tensor. As before, column clusters were used to re-weight dissimilarity matrices calculated for row observations, and row clusters were used to re-weight dissimilarity matrices calculated for column observations, allowing structural information to be shared between the two axes. To accommodate a multivariate response, dissimilarity values were aggregated over the third axis at each matrix index prior to aggregation over the remaining matrix index for which the dissimilarity matrix was calculated. The efficacy of three dissimilarity estimators was explored. The first was a simple unweighted Euclidean distance, which is functionally equivalent to the standard two-dimensional implementation of the data mechanics algorithm, except that in the tensor implementation all behaviors for a given time budget observation are forced into the same cluster. The second dissimilarity estimator considered was the Kullback-Leibler Divergence Distance, which is the sum of the asymmetric relative entropy estimates for any two probability distribution vectors that sum to one. Here KLD-distance was calculated at each index for the first two axes and prior to aggregation over the axis for which the dissimilarity matrix was computed.
The third and final dissimilarity estimator explored was the ensemble-weighted Euclidean distance, wherein the squared distances between observed values were normalized by the sum of the ensemble variance estimates calculated over all simulated datasets at the corresponding tensor indices.
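To make the three estimators concrete, the following sketch computes each one for a single pair of time budget vectors (proportions over behaviors for one cow-day each). It is a simplified reading of the descriptions above, not the tensorMechanics code: the function names, the small `eps` guard against zero proportions, and the final square root in the weighted distance are assumptions, and the aggregation over the remaining tensor axes is omitted.

```python
import numpy as np


def euclidean(p, q):
    """Unweighted Euclidean distance between two time budget vectors."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.sqrt(np.sum((p - q) ** 2)))


def kld_distance(p, q, eps=1e-9):
    """Symmetrized KL divergence: KL(p||q) + KL(q||p).

    eps is an assumed guard so that zero proportions do not produce
    infinite values; vectors are renormalized after adding it.
    """
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))


def ensemble_weighted(p, q, var_p, var_q):
    """Euclidean distance with each squared difference divided by the
    sum of the two ensemble variance estimates at that behavior index."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    scale = np.asarray(var_p, float) + np.asarray(var_q, float)
    return float(np.sqrt(np.sum((p - q) ** 2 / scale)))


if __name__ == "__main__":
    a = [0.20, 0.35, 0.30, 0.10, 0.05]
    b = [0.30, 0.30, 0.25, 0.10, 0.05]
    v = [0.004, 0.005, 0.005, 0.002, 0.001]  # toy ensemble variances
    print(euclidean(a, b), kld_distance(a, b), ensemble_weighted(a, b, v, v))
```

Because the variance scaling is applied per behavior, a given difference in a low-variance behavior contributes more to the weighted distance than the same difference in a high-variance behavior such as eating.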

The dissimilarity matrices calculated over the subset of row or column indices were reweighted iteratively using the cluster results of the opposite matrix axes until either cluster sets became stable or a maximum of ten iterations was reached. For all three dissimilarity estimators, clustering results were here calculated on a grid of metaparameter values: from one to ten cow clusters and one to six day clusters. Full details on the implementation of the tensorMechanics algorithm are provided in Supplemental Materials. To visualize the final dendrograms produced from each tensor mechanics optimization, the pheatmap package was used to create heatmaps wherein cows were arranged along the row axis, days along the column axis, and cells colored to represent the proportion of time that a given cow dedicated to a given behavior on a given day. To help identify temporal patterns captured by these clustering results, the column axis was annotated in purple with the day on trial that each set of records was recorded. In order to visualize patterns recovered across all five behavioral axes simultaneously, the ggpubr package was used to create a composite image of the final heatmaps created using equivalent 0 to 1 scales to facilitate direct visual comparisons of behavioral investments.

Visualizations for final clustering results for all three dissimilarity estimators for all candidate metaparameter values are provided in Supplemental Materials. Comparisons of dissimilarity estimators for encodings of daily time budgets largely mirrored the clustering dynamics found with encodings of overall time budgets. Tensor mechanics encodings using the standard unweighted Euclidean norm, which employed no rescaling of behavioral axes, overemphasized patterns in high frequency behaviors such as eating and rumination, recovering few systematic differences in any of the activity axes across days or animals.
KLD distance, on the other hand, produced a more balanced encoding across the five behavioral axes, but again was prone to over-estimate dissimilarity at the extremes of the time budget distribution, which resulted in a number of animals with extremely low or high eating times being classified as outlier in clusters containing only one animal, which would effectively remove these animals from consideration in downstream analyses of bivariate associations using these clustering results.

The encoding created using the ensemble-weighted Euclidean distance is visualized in Figure 1. This dissimilarity estimator provided arguably the most balanced representation of heterogeneity in daily time budgets across all five behavioral axes without over-pruning at the extremes of the distribution. Perhaps the most striking feature of this visualization, however, is the remarkable consistency in daily time budget records across this observational window. As with overall time budget, eating time remains the primary driver of difference between animals in this encoding. At the extremes of these eating time budgets, there are few systematic differences in clusters across the temporal axis, nor even much variability in observations within clusters, suggesting that neither transient environmental fluctuations nor more persistent changes in management or the biology of these animals had much influence on the time investments of these individuals with more extreme behavioral strategies. Among cows with more moderate eating times, there is certainly more variability within clusters, but systematic differences in cells across the temporal axis are still surprisingly subtle. In contrasting what systematic patterns are apparent against the time indices of these records, it appears that time spent eating was slightly elevated during roughly the first 30–60 days on trial relative to the remainder of the observation window, which would have encompassed the transition period for many of these animals and the earliest stages of lactation for nearly all cows enrolled on this trial.
This result is counterintuitive, as we would expect appetites to be suppressed immediately following calving and gradually increase throughout this observational window; however, it has been shown previously that time spent eating is not always well correlated with feed intake, as cows can compensate for reduced time invested in mastication at initial ingestion with increases in time masticating during subsequent rumination. Therefore, it is also possible that this early surge in time spent at the feed bunk might represent an increase in feed sorting behaviors, which might represent latent behavioral strategies cows employ to cope with this period of negative energy balance, or it could simply reflect palatability issues related to the nutritional supplement or some other element of the total mixed ration. It is also interesting to note that, amongst animals with moderate eating times, the drop off in eating times seems to correspond with a slight increase in time spent highly active observed across all cow clusters. While this temporal subperiod would likely correspond to the return to estrus for some animals in this herd, the pervasiveness of this shift would seem more easily attributable to some latent biological or managerial shift, such as an increase in appetite during peak milk that led to a greater number of trips to the feed bunk between milkings. As these cows were understocked with respect to bunk space during this observation window, this may indicate that conditions in this pen support good lying time irrespective of fluctuations in the management environment.

In comparing the results of the tensor mechanics clustering with the results for data aggregated to an overall time budget, which are visualized in Figure 2, we see that the two encodings are in close agreement.
The contingency table reveals that cluster assignments are nearly identical between the two data sets for the coarser branches of the dendrogram nearer the trunk, which reflect differences among more extreme time budgets, and differ only slightly in the cut off values established between branches representing more subtle differences in time budgets. Given the temporal consistency in this daily time budget data, this result is not necessarily surprising, as there is little additional information or complexity to be recovered from disaggregating this information across days for this herd.

Accommodation of temporal heterogeneity in the tensor mechanics encoding appears to have largely only served to more finely distinguish between animals with low eating times. Closer inspection reveals that these distinctions appear to have been made based on differences in tradeoffs between the nonactive and highly active behavioral axes across the early and later phases of this observation window. As a result, the tensor mechanics encoding produces a coarser encoding of animals with more moderate time budgets. This has caused some of the more moderate overall time budget clusters to be consolidated in the tensor mechanics encoding, with some ambiguity in the resulting cutoffs that appears again to be driven by greater weight being placed on how the eating-nonactivity tradeoff and the eating-highly active tradeoff shift over the observation window.

To determine if the temporal heterogeneity found in daily time budget records would modify bivariate associations found in previous analyses between overall time budgets and other farm data streams, bivariate tree tests were conducted using the tensor mechanics results for all three dissimilarity estimators. The first set of tests conducted explored if patterns in daily time budgets differed between animals fed control diets and those whose TMR rations were amended with the Organilac fat supplement. In prior analyses with overall time budgets, no significant bivariate associations were recovered using the bivariate tree test framework, a result that was not necessarily surprising given that control and treatment animals were housed together throughout the duration of the trial except while head locked for the morning feeding and herd check. Significant bivariate associations were, however, recovered between treatment group and all three dissimilarity estimators used to create tensor mechanics encodings.
Visual characterizations of these relationships, which can be found in Supplemental Materials, were created using the compareEncoding utility, wherein contingency table cells were colored by pointwise mutual information estimates that were deemed statistically significant by simulation against the null using multinomial resampling. The relationship recovered using the ensemble-weighted Euclidean distance is visualized in Figure 3. Organilac cows were significantly over-represented among daily time budgets that were consistent throughout the observation window and characterized by relatively high time spent eating, moderate rates of rumination, and low non-activity. Cows in the control group, on the other hand, were over-represented among daily time budgets that were characterized by time spent eating that varied between moderately low and moderately high, rumination rates that were consistently relatively low, elevated rates of nonactivity during the first half of the observation window, and elevated rates of high activity during the second half of the observation period. Collectively, these results might suggest that, amongst cows with more moderate time budgets, control animals may have struggled in the early stages of this study that encompassed much of the transition milk period, whereas fat supplemented cows were able to maintain a robust and well-balanced time budget throughout this early lactation period. Alternatively, Organilac cows might simply have spent more time eating throughout the first phase of this research trial to sort through their TMR ration and avoid the fat supplement.
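The idea of coloring contingency table cells by pointwise mutual information (PMI) and screening them against a multinomial null can be sketched as follows. This is a generic illustration, not the compareEncoding utility itself: the function names, the two-sided comparison on |PMI|, and the choice of null (cell probabilities equal to the product of the observed margins) are assumptions made for the example.

```python
import numpy as np


def pmi_table(counts):
    """Pointwise mutual information (in bits) for each contingency cell."""
    counts = np.asarray(counts, dtype=float)
    joint = counts / counts.sum()
    expected = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.log2(joint / expected)


def significant_cells(counts, n_sims=2000, alpha=0.05, seed=0):
    """Flag cells whose |PMI| is extreme relative to tables simulated
    under independence with the observed margins (assumed null)."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts)
    n = int(counts.sum())
    joint = counts / n
    null_p = (joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)).ravel()
    observed = pmi_table(counts)
    sims = rng.multinomial(n, null_p, size=n_sims).reshape(n_sims, *counts.shape)
    sim_pmi = np.stack([pmi_table(s) for s in sims])
    # Share of simulated tables with a PMI at least as extreme per cell.
    with np.errstate(invalid="ignore"):
        p_values = (np.abs(sim_pmi) >= np.abs(observed)).mean(axis=0)
    return observed, p_values < alpha


if __name__ == "__main__":
    # Toy table: rows = treatment groups, columns = time budget clusters.
    table = np.array([[40, 5], [5, 40]])
    pmi, flagged = significant_cells(table)
    print(np.round(pmi, 2))
    print(flagged)
```

Positive PMI marks cells that are over-represented relative to independence and negative PMI marks under-represented cells, which is the reading applied to the treatment-by-cluster table above.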

Animals enrolled in this feed trial were also fitted with a CowManager ear tag accelerometer

Observations collected on a single animal over extended observation windows at high sampling frequencies can, however, contain a range of complex temporal patterns: cyclicity, non-stationarity, autocorrelation, etc. Further, when sensors are applied to large heterogeneous groups of animals housed socially in spatially restricted environments, recorded behaviors may also contain complex interdependencies between animals at the dyadic, triadic, clique, and herd levels. Failing to accommodate all these complex structural and stochastic features in a conventional model-based approach to statistical inference risks returning spurious insights into the underlying behavioral dynamics. Developing such a model with a single PLF data stream can be challenging. Provided multiple data streams, however, the logistical challenges presented by model-based analytical frameworks can rapidly compound, creating significant barriers to cross-sensor inferences and thereby impeding researchers from extracting more holistic behavioral inferences from increasingly data-rich farm environments. Unsupervised Machine Learning tools may provide a more flexible and forgiving approach to knowledge discovery in the context of large sensor datasets. Such algorithms excel at identifying and characterizing complex nonrandom behavioral patterns lying beneath the stochastic surface of a dataset, while often employing relatively few structural assumptions about the data. Hierarchical clustering-based techniques offer an intuitive and highly adaptable approach to visualizing high dimensional datasets that is particularly well-suited to exploratory data analysis. Indeed, by reducing the complex behavioral signals present in a sensor dataset into a series of discrete clusters, such algorithms may be viewed as an empirical extension of classical ethological techniques.

Discrete data, however, can be challenging to work with in most frequentist and even many Bayesian frameworks. Estimators based on information entropy, on the other hand, are purpose-made to quantify uncertainty in discretely encoded data without knowledge of the underlying distribution, and thus naturally complement hierarchical clustering-based algorithms. In these analyses, data mechanics algorithms were able to recover complex nonstationarity in the order in which cows entered the milking parlor. Some of these changes in queuing patterns could be attributed to the shift to spring pasture access, but other transient and persistent shifts in entry order recovered in these encodings may have been driven by environmental factors not experimentally recorded. Entropy-based non-parametric permutation tests were also successful in recovering preliminary evidence of significant nonlinear associations between encodings of entry order patterns and activity patterns recorded using ear tag accelerometers. In this paper we will explore how novel ensemble simulation techniques that emulate and adjust for the complex sources of error in PLF data streams may be used to produce more balanced encodings of multi-dimensional behavioral data. We also introduce a new dendrogram pruning algorithm that is able to efficiently repurpose these same ensemble simulations to ensure that the power of hierarchical clustering tools does not exceed the resolution of the sensor. Finally, we demonstrate the utility of information decomposition techniques within our existing non-parametric mutual information testing framework to better facilitate visual characterization of complex behavioral patterns across sensor data sets that might be overlooked in more conventional model-based analyses.
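As a generic illustration of an entropy-based non-parametric permutation test between two discrete encodings (for example, cluster labels from two sensor streams), the sketch below estimates mutual information from a contingency table and compares it with a null distribution built by shuffling one label vector. It is not the testing framework used in this work; the function names and the plug-in estimator are assumptions.

```python
import numpy as np


def mutual_information(x, y):
    """Plug-in mutual information (bits) between two label vectors."""
    _, xi = np.unique(np.asarray(x), return_inverse=True)
    _, yi = np.unique(np.asarray(y), return_inverse=True)
    xi, yi = xi.ravel(), yi.ravel()
    table = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(table, (xi, yi), 1)  # build the contingency table
    joint = table / table.sum()
    indep = joint.sum(axis=1, keepdims=True) * joint.sum(axis=0, keepdims=True)
    nz = joint > 0  # empty cells contribute nothing to the sum
    return float(np.sum(joint[nz] * np.log2(joint[nz] / indep[nz])))


def permutation_test(x, y, n_perm=1000, seed=0):
    """Permutation p-value for the observed mutual information.

    Shuffling y breaks any association with x while preserving both
    marginal distributions."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    observed = mutual_information(x, y)
    null = np.array([mutual_information(x, rng.permutation(y)) for _ in range(n_perm)])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(2)
    entry_cluster = rng.integers(0, 3, size=150)     # toy encoding A
    activity_cluster = rng.integers(0, 4, size=150)  # toy encoding B
    print(permutation_test(entry_cluster, activity_cluster, n_perm=500))
```

Because the test works on discrete labels alone, it makes no assumption about the form of the association, which is why it can pick up nonlinear relationships between encodings.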

To demonstrate the efficacy of our analytical approach, data was repurposed from a feed trial assessing the impact of an organic fat supplement on cow health and productivity through the first 150 days of lactation. All animal handling and experimental protocols were approved by the Colorado State University Institutional Animal Care and Use Committee. The study ran from January through July in 2017 on a USDA Certified Organic dairy in Northern Colorado, enrolling a total of 200 cows over a 1.5 month period into a mixed-parity herd of animals with predominantly Holstein genetics. Cows were maintained in a closed herd in an open-sided free stall barn, stocked at roughly half capacity with respect to both feed bunk spaces and stalls. Cows had free access to an adjacent outdoor dry lot while in their home pen, and beginning in April were moved onto pasture at night to comply with Organic grazing standards. Cows were milked three times a day, with free access to TMR between milkings, and were head locked each morning to facilitate data collection and daily health checks. For more details on feed trial protocols see Manriquez et al. and Manriquez et al.

In addition to standard production and health assessments, behavioral data was also obtained from several PLF data streams. Milking order, or the sequence in which cows enter the parlor to be milked, is automatically recorded as metadata in all modern RFID-equipped milking systems. Study cows were here milked in a DelPro™ rotary parlor. At each morning milking, raw milking logs were exported from the parlor software, and the data processed to extract the single-file order in which cows entered the rotary. A total of 80 milk order records (26 recorded while cows remained overnight in a freestall barn, and 54 following the transition to overnight access to spring pasture) were used to create discrete encodings for parlor entry patterns via data mechanics clustering.

The dendrograms summarizing the distribution of cow entry order patterns and subsequent heatmap visualizations will here be subjected to further analysis without modifications to the previously reported encodings. This commercial sensor platform, while designed and optimized for disease and heat detection, also provides hourly time budget estimates for total time engaged in five mutually exclusive discrete behaviors: eating, rumination, non-activity, activity, and high activity. Time budget data was collected on all animals for a contiguous period of 65 days, with the observation window beginning shortly after trial enrollment was completed on February 17th, and ending on April 23rd when the grazing season commenced and cows were moved overnight beyond the range of the receiver antennae. After dropping cows that were removed prematurely from the observation herd due to acute clinical illness, as well as several cows with persistent receiver failure, complete sensor records were available for 179 animals. In order to more fully focus on the logistical challenges in encoding and characterizing the complex multivariate dynamics of this system, we have here chosen to compress this data over the time axis to consider only the overall time budgets of these cows, and will leave explorations of the longitudinal and cyclical complexity of this dataset for future work.

Domain constraints are not, however, the only stochastic feature that need be accommodated when working with time budget data. There is also the measurement error attributable to the sensor itself. Returning to the previous example, suppose that we also know that our rumination records are only accurate to ±1 hr. Is it then still appropriate to give more weight to the one-hour difference between Daisy and Delilah than between Betty and Betsy? Since both observations are within the bounds of error, attempting to enhance the underlying biological signal may instead only succeed in amplifying measurement noise.
A closed form estimator, however, may not be readily generalizable to the wide range of measurement error models encountered with PLF sensors. We therefore propose that a simulation-based approach may offer a more flexible means of accounting for measurement error in dissimilarity estimates. The LIT package provides a built-in simulation utility for time budget data that seeks to mimic the stochastic error structure of the original data while still preserving the underlying behavioral signal. Data is provided as a tensor, with cow indexed on the first axis, time indexed on the second, and the component behaviors on the final axis. The count data at each cow-by-time index is then used to redraw a simulated data point from one of three optional distributions. In the first, the user may sample directly from a multinomial distribution centered around the normalized observed count vector. This model assumes that measurement error should shrink as a cow dedicates larger proportions of an observation window to specific behaviors, and intrinsically prevents estimates from being generated outside the domain of support. Variance can be under-estimated at the extremes of the domain, however, if the probability for a behavior is non-negligible but the observed count is zero due to under-sampling. This issue may be addressed in sampling option two, where samples are redrawn from a Multivariate Beta Distribution, also known as a Dirichlet distribution, again parameterized using the normalized observed count. While this sampling strategy slightly biases the simulation towards the center of the distribution, it prevents undersampling at the extremes of the domain. Finally, users may combine these sampling strategies in sampling option three, wherein the probability vector used to parameterize the multinomial is drawn first from the Dirichlet, in order to further increase the uncertainty in the simulated data.
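The three sampling options described above can be sketched in Python with NumPy. This is a minimal illustration and not the LIT package's own code: the function name `simulate_counts`, the `option` codes, and the `pseudocount` added to the Dirichlet parameters (so that behaviors with zero observed counts keep a valid, positive parameter) are all assumptions introduced here, and the package's actual parameterization may differ.

```python
import numpy as np

def simulate_counts(counts, option=1, pseudocount=0.5, seed=None):
    """Redraw one simulated tensor from observed counts.

    counts: array of shape (cows, time, behaviors) holding observed counts.
    option: 1 = multinomial, 2 = Dirichlet, 3 = Dirichlet then multinomial.
    pseudocount: hypothetical smoothing term for the Dirichlet parameters.
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    sim = np.zeros_like(counts)
    for idx in np.ndindex(counts.shape[:-1]):      # each cow-by-time cell
        obs = counts[idx]
        n = int(obs.sum())
        if n == 0:
            continue                               # nothing recorded in this window
        if option == 1:
            # Multinomial centered on the normalized observed counts.
            sim[idx] = rng.multinomial(n, obs / n)
        elif option == 2:
            # Dirichlet draw, rescaled to the original number of counts.
            sim[idx] = n * rng.dirichlet(obs + pseudocount)
        elif option == 3:
            # Probability vector drawn from the Dirichlet, then multinomial counts.
            p = rng.dirichlet(obs + pseudocount)
            sim[idx] = rng.multinomial(n, p)
        else:
            raise ValueError("option must be 1, 2, or 3")
    return sim
```

Under option 1 a behavior with an observed count of zero can never be redrawn as nonzero, which is the under-estimation of variance at the extremes of the domain noted above; options 2 and 3 allow small nonzero values there.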

After simulation has been completed by redrawing samples at the finest level of temporal granularity supported by the sensor, the data can then be conditionally or fully aggregated along the temporal axis as required for downstream analysis as a time budget. This simulation routine was used to create an ensemble of B = 500 simulated overall time budget matrices that mimicked the stochasticity attributable to a reasonable approximation of the measurement error of the sensor. Stored as a tensor with replication on the last axis, the variance of the ensemble of simulations could then be easily calculated for each combination of cow index and behavioral axis. If the underlying simulation strategy is a reasonable representation of the noise in the sensor, then these variance terms will serve as a sufficient approximation of the relative uncertainty in each data point. We propose that this information can then be incorporated into the calculation of dissimilarity estimates by serving as penalty terms in the calculation of an ensemble-weighted distance estimator defined in Equation 3.

The rescaling strategy employed in our proposed dissimilarity estimator is strongly inspired by traditional Analysis of Variance techniques, thereby providing several insights into its anticipated behavior. First, because the simulations were generated using the multinomial or one of its analogs, we can infer that these penalty terms will not be homogeneous across the domain of support, but should shrink as observations approach the boundary. This will allow the ensemble-weighted distance estimator to emulate the rescaling dynamic achieved with the KL distance, but here rescaling at the extremes of the domain will ultimately be bounded by our simulated measurement error so as not to exceed the precision of the sensor. Second, because we have here emulated measurement error in our simulation using sampling uncertainty, the central limit theorem will apply.
Thus, we can anticipate that as the number of observations per animal increases, the impact of measurement error on our inferences will shrink, allowing progressively more subtle differences between animals to come into resolution. Taking this property to its limit, however, can it be said that with enough observation minutes the differences between cows can be inferred with near certainty? That intuition, of course, is at odds with our characterization of a dairy herd as a complex system, and highlights an additional stochastic element that must be accommodated: the behavioral plasticity of the cows themselves in response to changes in the production environment. Given the extended observation window of this particular data set, it would be possible to recalculate time budget conditional on day of observation, and then use the variance in daily time budget along each behavioral axis as a penalty term. Such estimates would collectively reflect heterogeneity in variance attributed to domain constraints, measurement error, and behavioral plasticity. Such an approach would not, however, be feasible for datasets collected over shorter time intervals with fewer replications or in applications with behavioral responses where there is no clear hierarchy in the temporal structure of the same. We therefore propose that our stochastic simulation model can be extended to also provide a generalizable means to approximate the uncertainty of the underlying behavioral signal. As before, the measurement error was simulated by redrawing samples at the finest temporal granularity provided by the sensor. Prior to compression along the temporal axis, however, a random subsample of observation days was selected across all cows, and only these values were used to calculate overall time budget.
If all cows demonstrated comparable levels of consistency in their daily time budgets, then reducing the effective sample size of our simulated data sets through a subsampling routine would increase the ensemble variance estimates. This in turn would make our approximation of measurement error hyper-conservative, but this increase would be uniform across all cows.
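The subsampling extension can be illustrated with a short NumPy sketch. It makes several simplifying assumptions that are not taken from the text: counts are supplied already summed to one row per cow per day (rather than at the sensor's finest granularity), measurement error is redrawn with the plain multinomial option, and the fraction of days retained (`frac_days`) is a hypothetical parameter. The function name `ensemble_variance` is likewise illustrative.

```python
import numpy as np

def ensemble_variance(daily_counts, n_reps=500, frac_days=0.5, seed=None):
    """Ensemble variance of overall time budgets under day subsampling.

    daily_counts: array of shape (cows, days, behaviors) of observed counts.
    Returns an array of shape (cows, behaviors).
    """
    rng = np.random.default_rng(seed)
    daily_counts = np.asarray(daily_counts, dtype=float)
    n_cows, n_days, n_beh = daily_counts.shape
    k = max(1, int(round(frac_days * n_days)))
    ensemble = np.empty((n_cows, n_beh, n_reps))   # replication on the last axis
    for b in range(n_reps):
        # The same random subset of days is used for every cow.
        days = rng.choice(n_days, size=k, replace=False)
        sub = daily_counts[:, days, :]
        sim = np.empty_like(sub)
        for idx in np.ndindex(sub.shape[:-1]):
            n = int(sub[idx].sum())
            # Multinomial redraw stands in for measurement error.
            sim[idx] = rng.multinomial(n, sub[idx] / n) if n > 0 else 0
        total = sim.sum(axis=1)                    # compress along the time axis
        ensemble[:, :, b] = total / total.sum(axis=1, keepdims=True)
    return ensemble.var(axis=-1, ddof=1)
```

In this sketch a cow whose daily budgets are identical picks up only the multinomial noise, whereas a cow that alternates between very different daily budgets receives a much larger variance, so the penalty grows with behavioral inconsistency rather than uniformly.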

Each data point is then projected into the resulting low dimension linear space

Any days where less than 75% of the herd was successfully recorded in the parlor were also dropped. This left a total of 80 days of milk order observations: 26 recorded while cows remained overnight in their pen, and 54 after the transition to overnight pasture. Finally, cows that were not present in at least 50% of the remaining milkings were excluded from further analysis. Of the 177 cows with sufficient records, 114 had no recorded health events.

With this metric, the more consistently a smaller set of cows is observed in a given segment of the queue, the smaller the entropy value becomes to reflect less stochasticity in the system. In standard statistical models, the nominal value of estimators such as log likelihood and AIC scales with the size of the data set, and must be interpreted relative to the value of equivalent terms assessed against a null model. Analogously, the nominal value of the entropy estimates scales with the number of discrete categories used. The maximum theoretical value occurs when no underlying deterministic structures are present and all categories are equally likely to occur, which algebraically simplifies to the log of the number of discrete categories used. Here the maximum theoretical entropy value would be log2(114) = 6.83. To visually contrast differences in stochasticity across the queue, the observed entropy values were plotted against the median entry quantile of the corresponding queue segment using the ggplot2 package, with maximum theoretical entropy added as a horizontal reference line. Nonrandom patterns in queue formation could also be explored by tracking the entry position of individual cows over time. As entry quantile has a numerical value, we can now also use variance to quantify and contrast stochasticity between animals.
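The entropy calculation and its theoretical maximum can be sketched with the Python standard library. The function names are illustrative, and the example treats the labels passed in as the IDs of cows observed in one queue segment across milkings, which is an assumption about how the segments were encoded.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    counts = Counter(labels)
    n = sum(counts.values())
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def max_entropy(n_categories):
    """Entropy when all categories are equally likely: log2 of their number."""
    return math.log2(n_categories)

# Four cows seen equally often in a segment give the 4-category maximum (2 bits);
# a single cow always occupying the segment gives zero entropy.
print(shannon_entropy(["a", "b", "c", "d"]))
print(shannon_entropy(["a", "a", "a", "a"]))
# Reference line for a herd of 114 cows.
print(round(max_entropy(114), 2))
```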

As with all analytical approaches reviewed in this paper, there are both strengths and shortcomings to either approach. In this system there are two potential drawbacks to this conventional summary statistic. The first is that variance estimates are quite sensitive to outliers, making it difficult to empirically distinguish between cows that occupy a wider range of queue positions and animals who typically occupy a narrower range but might have gotten jostled far from their normal position on one or several occasions. The second drawback is that, because variance quantifies dispersion about a central value, it cannot distinguish between cows that demonstrate little consistency in entry position and cows with multimodal queuing patterns. For example, if a cow always entered the parlor either first or last, we would intuitively determine that this pattern is nonrandom, but the corresponding variance estimate would be the largest in the herd. Having recovered evidence of nonrandom patterns, the next step was to begin characterizing the behavioral mechanisms driving this heterogeneity. The most fundamental question that need be answered to inform further analysis was the degree to which queueing patterns were driven by individual or collective behaviors. Because cows jockey for position with one another in the crowd pen, where they are pushed up to enter the parlor, we know intuitively that entry quantile records cannot be considered truly independent observations. If cows move through this melee as independent agents, such that their position within the queue is determined by individual attributes (preferences, dominance, etc.), then a linear model may still provide a reasonable approximation of the underlying system. Early observational work on milking order, however, has suggested that cows may form consistent associations when entering the milking parlor, particularly when heifers are reared together.
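The second drawback of variance can be made concrete with a small standard-library example. The two cows, the 80 milkings, and the choice of ten equal-width bins for the entropy comparison are all hypothetical values chosen for illustration.

```python
import math
from collections import Counter
from statistics import pvariance

def binned_entropy(quantiles, n_bins=10):
    """Entropy (bits) of entry quantiles after binning into equal-width bins."""
    bins = [min(int(q * n_bins), n_bins - 1) for q in quantiles]
    counts = Counter(bins)
    n = len(bins)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical cow that always enters either first (0.0) or last (1.0).
bimodal = [0.0, 1.0] * 40
# Hypothetical cow whose entry position is spread evenly over the whole queue.
spread = [i / 79 for i in range(80)]

print(pvariance(bimodal), pvariance(spread))            # bimodal has the larger variance
print(binned_entropy(bimodal), binned_entropy(spread))  # but the smaller entropy
```

Variance ranks the bimodal cow as the more erratic of the two, while the entropy of her binned positions is low because she only ever occupies two segments of the queue.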

If cows move into the parlor in cohesive units, such that queue position is more determined by clique-level than individual attributes, then network analyses may be more appropriate. Principal Component Analysis is commonly employed to visualize relationships between observational units in high dimensional datasets. In this approach, redundancy between variables, here each milking record, is captured using either covariance or correlation assessed across all data points, here all animals. An eigenvector decomposition is then used to linearly compress the information contained in the data via rotation of the orthogonal axes. New axes are added iteratively such that each new dimension is pointed in the direction of greatest remaining variability until only noise remains. PCA was here performed only on animals with no recorded health events in order to prevent any anomalous queuing behaviors recorded from acutely or chronically ill animals from obscuring the queuing patterns of the broader herd. The correlation matrix was constructed using all pairwise complete observations, and a scree plot was used to determine the dimensionality of the resulting space. The plotly package was then used to visualize the final embedding. While PCA provides a computationally expedient means of visualizing high dimensional data, the underlying assumption of linearity is not always appropriate. In some data sets complex geometric constraints, such as those commonly found with images or raw accelerometer data, and other latent deterministic features may project data points onto high dimensional geometric surfaces collectively called manifolds.
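The PCA steps described above (correlation matrix, eigendecomposition, projection) can be sketched with NumPy. This sketch assumes a complete matrix with no missing records, so it does not reproduce the pairwise-complete correlation used in the analysis; the function name and the `n_components` default are illustrative.

```python
import numpy as np

def pca_from_correlation(X, n_components=3):
    """PCA on the correlation matrix of X.

    X: array of shape (n_cows, n_milkings) with no missing values.
    Returns (scores, explained), where explained holds the proportion of
    variance per component, i.e. the values shown in a scree plot.
    """
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)               # milkings are the variables
    eigvals, eigvecs = np.linalg.eigh(R)           # returned in ascending order
    order = np.argsort(eigvals)[::-1]              # sort by variance explained
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize each milking
    scores = Z @ eigvecs[:, :n_components]         # project cows onto leading axes
    explained = eigvals / eigvals.sum()
    return scores, explained
```

Plotting `explained` against component index gives the scree plot from which the dimensionality of the embedding would be chosen.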

When these topologies are nonlinear, the spatial relationships between data points cannot always be reliably maintained when projected directly into a linear space, which can lead to incorrect inferences. Imagine, for example, you had a round globe of the world and wanted instead a flat map. Applying PCA to this task would be analogous to smooshing the globe flat on a table. Some of the original geographic relationships would be discernible, but some locations would appear erroneously close, and some landscapes would be entirely obscured. Modern manifold learning algorithms strive to more reliably project the complex geometric relationships between observational units into a standard Euclidean space. Returning to the globe example, some geographic features will still be lost, particularly over sparsely sampled regions like the oceans, but the spatial relationships between landmarks would collectively prove more representative of the original topography. To further explore the underlying structure of this data absent assumptions of linearity, and thereby potentially accommodate any complex geometric constraints imposed on milk order records by latent social structures within the herd, a diffusion map algorithm was implemented using functions provided in base R. This was done here by first calculating the Euclidean distance between temporally aligned vectors of parlor entry quantiles for each pairwise combination of cows, scaled to adjust for missing records, and then inverting these values to create a similarity matrix. From this similarity matrix a weighted network was created by progressively adding links for the k = 10 nearest neighbors surrounding each data point. A spectral decomposition was then performed on the corresponding graph Laplacian matrix. The resulting eigenvalues were used to select the appropriate number of dimensions, and the corresponding eigenvectors visualized using the 3D scatter tools from the plotly package.
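The sequence of steps described above (distances, inverse-distance similarities, a k-nearest-neighbor network, and a decomposition of the graph Laplacian) can be sketched with NumPy. This is a simplified illustration rather than the original base R implementation: it assumes complete records (no scaling for missing data), symmetrizes the neighbor graph, and uses the symmetric normalized Laplacian, all of which are choices made here for the example.

```python
import numpy as np

def spectral_embedding(X, k=10, n_components=3):
    """Embed rows of X using eigenvectors of a k-NN graph Laplacian.

    X: array of shape (n_cows, n_milkings) of entry quantiles, no missing values.
    Returns (eigenvalues, eigenvectors) for the first nontrivial components.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances between cows.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude self-distances when searching for neighbors.
    Dn = D.copy()
    np.fill_diagonal(Dn, np.inf)
    # Weighted k-NN network with inverse-distance similarities.
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(Dn[i])[:k]
        W[i, nbrs] = 1.0 / (Dn[i, nbrs] + 1e-12)
    W = np.maximum(W, W.T)                         # make the network undirected
    deg = W.sum(axis=1)
    # Symmetric normalized graph Laplacian.
    L = np.eye(n) - W / np.sqrt(np.outer(deg, deg))
    eigvals, eigvecs = np.linalg.eigh(L)           # ascending eigenvalues
    # Skip the first (trivial) eigenpair.
    return eigvals[1:n_components + 1], eigvecs[:, 1:n_components + 1]
```

The returned eigenvalues play the role of the scree plot in PCA, and the eigenvector columns give the coordinates that would be passed to a 3D scatter plot.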
Finally, as a means of comparing geometric structures identified in the observed dataset with those of a completely randomized queuing process, the permutated dataset generated in the previous section was also embedded and visualized using plotly graphics.

Having determined from the previous visualizations that a linear model might be a reasonable representation of the underlying deterministic structures of this system, the next step was to explore the temporal dynamics of this dataset. In a standard repeated measures model, multiple observations from the same animal are assumed to be identically and independently sampled, implying that sampling order should not affect the observed value. If the observation period is sufficiently long to allow the underlying process to shift or evolve over time, however, stationarity cannot be assumed. Failure to statistically accommodate a temporal trend can not only lead to spurious inferences due to incorrect estimation of error variance, but also risks overlooking dynamic features of the behaviors under consideration. In practice temporal trends are often assessed by first fitting a stationary model and analyzing the resulting residuals. This may suffice when the temporal trend is uniform across animals, but risks overlooking more complex nonhomogeneous temporal effects. This could occur if only a subset of the larger group displays a non-stationary pattern, a risk that is likely heightened in large socially heterogeneous groups.

In this physically constrained system, where we know that every cow moving forwards in the queue must force other cows backwards, compensatory trends could also be easily overlooked in collective assessment of residuals. We first assessed temporal trend using two conventional EDA techniques. First, the ggplot2 package was used to generate scatter plots of entry quantile values against the corresponding observation date for each individual cow, with pasture access annotated with a vertical line. Plots were visually inspected for non-stationarity, and are provided in supplementary materials. Next, to further explore the impact of the shift from pen to overnight pasture access on morning queueing patterns, median queue positions from the two subperiods were plotted against one another using the ggplot2 package, and Pearson correlation and Kendall Tau were computed using the stats package. While these preliminary visualizations were easy to both generate and interpret, both treat cows as independent and somewhat isolated units. With such a large number of animals to consider, the capacity for human pattern detection is quickly overwhelmed, making it difficult to contextualize trends within the broader herd. Further, this approach fails to leverage non-independence between animals entering the parlor, and thus risks overlooking subtler collective responses. Data mechanics visualizations were implemented to simultaneously explore systematic heterogeneity in milk entry quantiles both between animals and across the temporal axis. This was done by first using entry quantile values to compute two Euclidean distance matrices: one quantifying the similarity between pairwise combinations of cows, the second quantifying similarity between pairwise combinations of daily milking sequences. These distance matrices were then used to generate two independent hierarchical clustering trees using the Ward D2 method.
By cutting both trees at a fixed number of clusters, observation days and cows were both partitioned into empirically defined categories, and a contingency table was then formed with cow clusters as the row variable and day clusters as the column variable. The original distance matrices were then updated, using the clustering structure between cows to create a weighted distance matrix between days and vice versa, thereby allowing mutual information to be shared between the temporal and social axes of the dataset. After several iterations of this algorithm, clusterings converged towards a contingency table with minimal entropy, wherein the entry quantile values within each cell were as homogeneous as possible. When the entry quantile values were subsequently visualized using a heat map, this highly generalizable entropy minimization technique served to visually enhance heterogeneity within the data driven by nonrandom patterns along either axis. Further, by facilitating the transfer of information between axes, interaction effects between the social and temporal dimensions of this system were magnified, which here provided a means to explore nonhomogeneous temporal non-stationarity between subgroups within the herd. The data mechanics pipeline was used to analyze the temporal dynamics present in both the complete milking order dataset and the subset of animals with no recorded health events. Instead, this algorithm was applied on a grid from 1 to 10 clusters for either axis. The resulting 100 heat maps were scanned visually to determine the clustering granularity required to bring into resolution any interactions between social and temporal mechanisms. While this process may be computationally cumbersome, it is empirically analogous to systematically varying the focus of a light microscope to bring into resolution microbes of unknown size, a tedious but effective means of identifying all relevant structures within a sample.
Finally, the RColorBrewer package was used to add color annotations to the column margin, to clarify temporal patterns, and to the row margins, which served to visualize potential relationships between queue position, a selection of individual cow attribute variables, and the onset of recorded health complications.

Having thoroughly characterized the stochastic structures present in this dataset, the insights gleaned from the preceding visualizations were incorporated into a linear model to evaluate the relationship between queue position and several cow attributes. The four days identified as outliers by the data mechanics visualizations were first removed and the dataset converted to long format to be analyzed as a repeated measures model using the nlme package. Cow was fit as a random intercept via the maximum likelihood method. Guided by the results of entropy and data mechanics visualizations, varIdent was used to estimate separate error variance terms for each cow, and the necessity of this data-hungry heterogeneous variance model was confirmed via likelihood ratio test against the null model with homogeneous variance.

We multiplied temperature and soil scores to make a preliminary suitability map

As California’s temperatures get hotter and precipitation becomes increasingly variable with climate change, we expect a further systematic overestimation of suitable areas identified based on the past 30 years of weather data. For the suitability analysis we assigned temperature and soil texture to three categories that were each associated with a score: good, tolerable, and intolerable, while precipitation was divided into ranges that were suitable with no additional irrigation, suitable with additional irrigation, and unsuitable.

For temperature, we considered the average maximum temperature in the three hottest months of the growing season, categorizing them separately with the scores described above. We then multiplied these three categorized scores together and took the cube root to get temperature suitability scores for the state, also excluding any areas whose monthly 30-year minimum temperature was above 59°F. We followed a similar procedure for soil texture, using SSURGO estimates of clay content averaged across soil horizons at a 90 m resolution. Because farmers did not give numeric estimates of how much clay was needed in dry farm soils, we made sure our defined ‘tolerable’ range encompassed the full range of clay content observed in participating farms’ soils. To define the ‘good’ range, we excluded the farm with the lowest clay content, which was also the only farm where farmers stated that they could not grow tomatoes of a high enough quality to consistently market them as “dry farm.” This multiplication reflects the interaction between temperature and soil texture, in which good texture can compensate for higher temperatures by increasing soil water holding capacity, and lower temperatures can lessen the evapotranspirative demand that would be particularly problematic for plants growing in sandier soils with a lower soil water holding capacity.
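The scoring arithmetic described above can be written out as a short Python sketch. The numeric scores attached to each category are not given in this passage, so the values used here (1.0 for good, 0.5 for tolerable, 0.0 for intolerable) are placeholders chosen purely for illustration, as are the function names.

```python
# Hypothetical category scores; the values used in the analysis are not stated here.
SCORES = {"good": 1.0, "tolerable": 0.5, "intolerable": 0.0}

def temperature_suitability(monthly_scores):
    """Cube root of the product of the three monthly temperature scores."""
    a, b, c = monthly_scores
    return (a * b * c) ** (1.0 / 3.0)

def preliminary_suitability(monthly_temp_scores, soil_score):
    """Temperature suitability multiplied by the soil texture score."""
    return temperature_suitability(monthly_temp_scores) * soil_score

# One intolerable month zeroes out the pixel regardless of soil texture.
print(preliminary_suitability(
    (SCORES["good"], SCORES["tolerable"], SCORES["intolerable"]), SCORES["good"]))
# Tolerable temperatures on good soil keep a nonzero, intermediate score.
print(preliminary_suitability((SCORES["tolerable"],) * 3, SCORES["good"]))
```

Because the temperature term is a geometric mean, any single month scored as intolerable forces the result to zero under these placeholder scores, while the product with the soil score lets a strong value on one factor partially offset a weaker value on the other.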

We then separated the dataset into three areas based off of farmers’ understandings of where tomato dry farming could occur with no added irrigation and where it could occur with supplemental irrigation, excluding areas that would not get enough winter rain to grow a suitable winter cover crop. The final map shows suitability scores in all areas that are categorized as ‘cropland’ in the 2019 National Land Cover Database. These areas are superimposed onto groundwater basins categorized as high priority in California’s Sustainable Groundwater Management Act. Crop totals on land that was deemed suitable for tomato dry farm management in these areas were calculated using the 2021 Cropland Data Layer.

By focusing on the characteristics that limited water can give a tomato, these farmers highlight a recurring theme in understanding the functional definition of dry farming tomatoes. As the Central Coast faces increasingly limited water availability, the idea of dry farming has gained traction among policymakers purely by virtue of offering a means to continue farming while maintaining a restricted water budget. However, these farmers are quick to recognize that dry farming is only a management style that they can afford to choose for their operations insofar as it can excite customers and return a reasonable profit. In this way, the product that dry farming creates, which is valuable enough to consumers that they are willing to pay a significant premium for it, is the outcome that defines the management approaches farmers can use. Farmers know that they could alter the schedule for the minimal irrigation they do put on their dry farm tomatoes to increase yields. However, while defining the practice by some maximum threshold of water application, and then choosing to allocate irrigation water to maximize yields, may be appealing from a water savings perspective, farmers recognize that they must define the practice in terms of outcomes and not inputs.

Farmers must produce what consumers have come to expect from a dry farm tomato if they are going to make dry farming an economically viable choice for their operation.

To better understand where tomatoes might conceivably be farmed in California given the environmental constraints identified above, we modeled dry farm suitability on California cropland as a function of precipitation, temperature, and percent clay in soil. The resulting map shows what lands could potentially support a dry farm crop, with and without supplemental irrigation, using constraints that are relaxed to encompass the least restrictive farmer-elicited constraints. The map therefore errs on the side of including land that is not an ideal candidate for dry farming, rather than leaving off land that may potentially be a good fit. With rising temperatures and less reliable rainfall, this map, which is based off of 30-year normals, likely also systematically overestimates what areas might fall into these thresholds when projecting into future climatic conditions. All areas in blue indicate land that meets a threshold where dry farming could be considered in a non-drought year without adding any irrigation. Areas in orange indicate that, while there is likely enough rain to sustain a winter cover crop, some amount of irrigation would often be needed to grow a successful dry farm crop. Areas in darker colors connote land that falls in conditions that are closer to ideal, whereas lighter colors indicate that more conditions are tolerable, rather than ideal, for dry farming.

It is crucial to note that areas that show up as “suitable” on the map–including the most ideal locations–will likely require years of diversified management for soils to build the water holding capacity and fertility that allow for peak dry farm performance. These areas should therefore be considered candidates for long-term dry farm management, rather than ready-to-go dry farm fields.
Because the constraints used to build the model were elicited specifically with regard to tomatoes, this of course is not a comprehensive map of everywhere that might be considered for dry farming non-tomato crops.

Particularly when it comes to grains and perennials, the range of possible locations is likely much broader. In the case of grains, winter varietals can be planted that take advantage of rain in winter months, while tree crops have far more extensive root systems that can reach water well beyond that which might be available to a tomato, in both cases relaxing the temperature and precipitation constraints that tomatoes need to survive without irrigation. Tomatoes are likely a better proxy for other vegetable crops, though each will have its unique requirements. As we imagine a shift towards dry farm agriculture in California, it is also important to consider how land that is suitable for dry farming is currently being used. Combining areas that are suitable for tomato dry farming with and without irrigation, we compiled a list of the top ten crops by area that are currently grown on these lands. Some of them are currently being dry farmed with some regularity in the state and could signal particularly easy targets for a shift to low-water practices. Others are dry farmed in other Mediterranean climates and suggest an important opportunity for management exploration in lands that might be particularly forgiving to experimentation. The remaining crops are some of the most water intensive in the state and would therefore lead to substantial water savings if the land could be repurposed. While unrealistic in the near future, calculating potential water savings from a complete conversion of suitable lands to dry farming allows for comparison with other water saving strategies. Even assuming that an acre-foot of irrigation is added to each acre of dry farm crops every year, if all the land listed in Table 3 were converted to dry farming and irrigated to the statewide averages listed in the table, California would save 700 billion gallons of water per year, or nearly half the volume of Shasta Lake, the largest reservoir in the state.
Given the overlap between suitable dry farm areas and high priority groundwater basins, these potential water savings are especially valuable as water districts scramble to balance their water budgets in light of SGMA. Perhaps the largest caveat to these potential water savings (and to any analysis of dry farm suitability that relies solely on environmental constraints) is the economic reality in which conversions to dry farming currently occur. As discussed above, while a dramatic reduction in irrigation inputs might be feasible from a crop physiological perspective, whether farms can remain profitable through such a transition is an entirely different question. Given a dramatically increased supply of dry farm tomatoes, the profits that current dry farmers rely on could easily crumble. When considering other, less charismatic crops that could be good candidates for dry farming, customers’ likely hesitance to pay as steep a premium for high quality produce as they do for tomatoes also casts doubt on the viability of a large-scale dry farm transition given current profit structures for farmers.

Our suitability map shows potential for vegetable dry farming to be practiced on California croplands that are currently irrigated, though its expansion is inherently limited.
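
The statewide savings figure quoted above can be sanity-checked with a short unit conversion. The sketch below is illustrative only: the Shasta Lake capacity is an assumed round figure that does not come from this study, the function names are hypothetical, and the per-crop example uses placeholder numbers rather than the values in Table 3.

```python
# Unit check for the statewide water-savings figure quoted in the text.
GALLONS_PER_ACRE_FOOT = 325_851          # standard US conversion factor
SHASTA_CAPACITY_ACRE_FEET = 4_552_000    # ASSUMED full-pool capacity, not from the source


def gallons_to_acre_feet(gallons: float) -> float:
    """Convert a volume in US gallons to acre-feet."""
    return gallons / GALLONS_PER_ACRE_FOOT


def net_savings_acre_feet(acres: float,
                          current_af_per_acre: float,
                          dry_farm_af_per_acre: float = 1.0) -> float:
    """Water saved by moving `acres` from current irrigation to dry farming.

    The default of 1.0 mirrors the text's assumption that one acre-foot of
    irrigation is still applied to each dry-farmed acre every year.
    """
    return acres * (current_af_per_acre - dry_farm_af_per_acre)


# 700 billion gallons, the annual savings stated in the text.
savings_af = gallons_to_acre_feet(700e9)
share_of_shasta = savings_af / SHASTA_CAPACITY_ACRE_FEET
print(f"{savings_af:,.0f} acre-feet, {share_of_shasta:.0%} of assumed Shasta capacity")

# Placeholder example (NOT Table 3 data): 1,000 acres irrigated at 3 acre-feet/acre.
example_savings = net_savings_acre_feet(acres=1_000, current_af_per_acre=3.0)
```

Under the assumed reservoir capacity, the stated 700 billion gallons works out to a little under half of Shasta Lake, consistent with the comparison in the text. Substituting the acreages and statewide irrigation averages from Table 3 into `net_savings_acre_feet` and summing across crops would reproduce the calculation crop by crop.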

Even if markets could be adapted to support an influx of dry farmed vegetables, our map indicates that climatic constraints will largely require dry farming to be practiced in coastal regions or other microclimates that can provide cool temperatures and sufficient rainfall. However, the Central Coast’s tomato dry farming offers principles, but not a blueprint, for low water agriculture in other regions. Based on themes from our interviews, these principles show a cycle of water savings that connects reduced inputs, management diversification, and market development. The cycle begins with lower irrigation, which can be accomplished in concert with soil health practices that build soil water holding capacity and increase long-term fertility. Reduced weed pressure and lower biomass production can then lead to reducing other inputs, such as labor and fertilizers, while also allowing for further water savings. The combination of reduced inputs and soil health practices then gives rise to a product that is unique in its water saving potential, and may also be of unusually high quality. By encouraging consumers to appreciate the products, or through novel policy support, farmers can develop markets that will provide a premium for these low-water products, or payment for the practice itself, which in turn creates an opportunity to expand the practice, further lowering inputs.

As we ask how policies may impact dry farm production systems, we find a forking path in what types of expansion may result from different policies. An increase in production can be accomplished through both scaling size and scaling number. Both options can tap into the water saving cycle to decrease water usage; however, the search for just, agroecological transitions has pointed time and again to the need for scaling number.
On the Central Coast, small, diversified farms have used this water saving cycle to both cut water use and develop a specialty product that allows growers to farm in areas with high land values by increasing their land access, profits, and resilience to local water shortages. Through these principles, small-scale operations have differentiated their management from both industrial farms and even other small farms in the region by creating a system based in localized knowledge, soil health practices, and thought-intensive management. However, it cannot be taken as a given that this water saving cycle will continue to uplift the small-scale operations on which it started. Recent work highlights the potential for biophysical and sociopolitical conditions to combine to shrink, rather than grow, the use and viability of agroecological systems. In the case of dry farm tomatoes, sociopolitical attention is already beginning to target the biophysical need to decrease water consumption. If well-intentioned policy interventions designed to decrease irrigation water use build markets that value the fact of dry farming, rather than the high quality fruits it produces, growers will be able to scale the size of dry farm operations without needing to rely on the highly localized knowledge required to produce high quality fruits. As large grocers scale up dry farm produce sales without worrying about quality-based markets that may quickly saturate at industrial scales, the agroecological systems that originally produced dry farm tomatoes may be edged out of the market. On the other hand, if policies build guaranteed markets for small farms growing dry farm produce, dry farming may grow by scaling out to more small-scale operations. Policies focused on water savings may then favor industrial or small-scale farms, depending on how interventions shape the “Market Development” aspect of the cycle.