Each data point is then projected into the resulting low-dimensional linear space.

Any days where fewer than 75% of the herd was successfully recorded in the parlor were also dropped. This left a total of 80 days of milk order observations: 26 recorded while cows remained overnight in their pen, and 54 after the transition to overnight pasture. Finally, cows that were not present in at least 50% of the remaining milkings were excluded from further analysis. Of the 177 cows with sufficient records, 114 had no recorded health events.

With this metric, the more consistently a smaller set of cows is observed in a given segment of the queue, the smaller the entropy value becomes, reflecting less stochasticity in the system. In standard statistical models, the nominal value of estimators such as the log likelihood and AIC scales with the size of the dataset, and must be interpreted relative to the value of equivalent terms assessed against a null model. Analogously, the nominal value of the entropy estimates scales with the number of discrete categories used. The maximum theoretical value occurs when no underlying deterministic structures are present and all categories are equally likely to occur, which algebraically simplifies to the log of the number of discrete categories used. Here the maximum theoretical entropy value would be log2(114) ≈ 6.83. To visually contrast differences in stochasticity across the queue, the observed entropy values were plotted against the median entry quantile of the corresponding queue segment using the ggplot2 package, with the maximum theoretical entropy added as a horizontal reference line. Nonrandom patterns in queue formation could also be explored by tracking the entry position of individual cows over time. As entry quantile has a numerical value, we can also use variance to quantify and contrast stochasticity between animals.
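To make the scaling of the entropy estimate concrete, the following sketch computes Shannon entropy for a uniformly occupied queue segment versus one dominated by a few cows. This is illustrative Python, not the original R pipeline, and the occupancy counts are invented:

```python
import numpy as np

def shannon_entropy(counts, base=2):
    """Shannon entropy (bits by default) of a vector of category counts."""
    p = np.asarray(counts, dtype=float)
    p = p[p > 0] / p.sum()
    return -(p * np.log(p) / np.log(base)).sum()

# Hypothetical example: occupancy counts for 114 cows in one queue segment.
n_cows = 114
h_max = np.log2(n_cows)  # theoretical maximum, ~6.83 bits

# Uniform occupancy -> entropy equals the theoretical maximum log2(114).
uniform_counts = np.full(n_cows, 10)
print(round(shannon_entropy(uniform_counts), 2), round(h_max, 2))

# Concentrated occupancy (a few cows dominate the segment) -> lower entropy.
skewed_counts = np.zeros(n_cows)
skewed_counts[:5] = [500, 300, 100, 80, 60]
print(round(shannon_entropy(skewed_counts), 2))
```

The gap between an observed value and the log2(114) ceiling is what the horizontal reference line in the plots makes visible.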

As with all analytical approaches reviewed in this paper, there are both strengths and shortcomings to this approach. In this system there are two potential drawbacks to this conventional summary statistic. The first is that variance estimates are quite sensitive to outliers, making it difficult to empirically distinguish between cows that occupy a wide range of queue positions and animals that typically occupy a narrower range but were jostled far from their normal position on one or several occasions. The second drawback is that, because variance quantifies dispersion about a central value, it cannot distinguish between cows that demonstrate little consistency in entry position and cows with multimodal queuing patterns. For example, if a cow always entered the parlor either first or last, we would intuitively judge this pattern to be nonrandom, but the corresponding variance estimate would be the largest in the herd.

Having recovered evidence of nonrandom patterns, the next step was to begin characterizing the behavioral mechanisms driving this heterogeneity. The most fundamental question that needed to be answered to inform further analysis was the degree to which queueing patterns were driven by individual or collective behaviors. Because cows jockey for position with one another in the crowd pen, where they are pushed up to enter the parlor, we know intuitively that entry quantile records cannot be considered truly independent observations. If cows move through this melee as independent agents, such that their position within the queue is determined by individual attributes (preferences, dominance, etc.), then a linear model may still provide a reasonable approximation of the underlying system. Early observational work on milking order, however, has suggested that cows may form consistent associations when entering the milking parlor, particularly when heifers are reared together.
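The second shortcoming of variance noted above can be demonstrated with a small simulation (a Python sketch with invented entry quantiles, not herd data): a cow that deterministically alternates between the front and back of the queue receives a larger variance than a cow whose position is genuinely random:

```python
import numpy as np

rng = np.random.default_rng(42)
n_milkings = 80

# Hypothetical entry-quantile records (0 = first into parlor, 1 = last).
random_cow = rng.uniform(0, 1, n_milkings)                          # no consistent position
consistent_cow = np.clip(rng.normal(0.5, 0.05, n_milkings), 0, 1)   # stable mid-queue
bimodal_cow = rng.choice([0.0, 1.0], n_milkings)                    # always first or last

# Variance ranks the deterministic bimodal cow as the *most* erratic,
# even though its pattern is clearly nonrandom.
print(np.var(random_cow), np.var(consistent_cow), np.var(bimodal_cow))
```

The bimodal cow's variance approaches the maximum possible value of 0.25, exceeding that of the uniformly random cow (about 1/12), which is exactly the failure mode described above.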

If cows move into the parlor in cohesive units, such that queue position is determined more by clique-level than individual attributes, then network analyses may be more appropriate. Principal Component Analysis (PCA) is commonly employed to visualize relationships between observational units in high dimensional datasets. In this approach, redundancy between variables (here each milking record) is captured using either covariance or correlation assessed across all data points (here all animals). An eigenvector decomposition is then used to linearly compress the information contained in the data via rotation of the orthogonal axes. New axes are added iteratively such that each new dimension points in the direction of greatest remaining variability until only noise remains. PCA was here performed only on animals with no recorded health events in order to prevent any anomalous queuing behaviors recorded from acutely or chronically ill animals from obscuring the queuing patterns of the broader herd. The correlation matrix was constructed using all pairwise complete observations, and a scree plot was used to determine the dimensionality of the resulting space. The plotly package was then used to visualize the final embedding. While PCA provides a computationally expedient means of visualizing high dimensional data, the underlying assumption of linearity is not always appropriate. In some datasets complex geometric constraints, such as those commonly found with images or raw accelerometer data, and other latent deterministic features may project data points onto high dimensional geometric surfaces collectively called manifolds.
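The PCA workflow described above can be sketched as follows. This is an illustrative Python translation with simulated records (the paper's analysis was run in R); pandas' pairwise-complete correlation and simple mean imputation for the projection step are assumptions of the sketch, not details taken from the original implementation:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical data: rows = cows, columns = daily entry quantiles, with gaps.
n_cows, n_days = 114, 80
X = pd.DataFrame(rng.uniform(0, 1, (n_cows, n_days)))
X = X.mask(rng.uniform(size=X.shape) < 0.1)  # ~10% missing parlor records

# Correlation between milkings, using all pairwise-complete observations.
R = X.corr(min_periods=2)

# Eigendecomposition of the correlation matrix (the core of PCA),
# with components sorted by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eigh(R.to_numpy())
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Scree values: proportion of total variance captured by each component.
scree = eigvals / eigvals.sum()

# Project each cow onto the first three axes (standardized, with missing
# days mean-imputed as zero after standardization -- one simple choice).
Z = ((X - X.mean()) / X.std()).fillna(0).to_numpy() @ eigvecs[:, :3]
print(scree[:5], Z.shape)
```

A scree plot of the `scree` vector against component index is then inspected for the "elbow" that suggests how many dimensions to retain.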

When these topologies are nonlinear, the spatial relationships between data points cannot always be reliably maintained when projected directly into a linear space, which can lead to incorrect inferences. Imagine, for example, you had a round globe of the world and wanted instead a flat map. Applying PCA to this task would be analogous to smooshing the globe flat on a table. Some of the original geographic relationships would be discernible, but some locations would appear erroneously close, and some landscapes would be entirely obscured. Modern manifold learning algorithms strive to more reliably project the complex geometric relationships between observational units into a standard linear space. Some geographic features will still be lost, particularly over sparsely sampled regions like the oceans, but the spatial relationships between landmarks would collectively prove more representative of the original topography. To further explore the underlying structure of this data absent assumptions of linearity, and thereby potentially accommodate any complex geometric constraints imposed on milk order records by latent social structures within the herd, a diffusion map algorithm was implemented using functions provided in base R. This was done by first calculating the Euclidean distance between temporally aligned vectors of parlor entry quantiles for each pairwise combination of cows, scaled to adjust for missing records, and then inverting these values to create a similarity matrix. From this similarity matrix a weighted network was created by progressively adding links for the k = 10 nearest neighbors surrounding each data point. A spectral decomposition was then performed on the corresponding graph Laplacian matrix. The resulting eigenvalues were used to select the appropriate number of dimensions, and the corresponding eigenvectors were visualized using the 3D scatter tools from the plotly package.
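The distance-to-similarity inversion, k-nearest-neighbor graph, and Laplacian spectrum described above can be sketched as follows. This is illustrative Python with simulated complete records (the original implementation used base R); details such as the small constant guarding the inversion are assumptions of the sketch:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical aligned entry-quantile vectors: rows = cows, columns = milkings.
n_cows, n_days, k = 60, 40, 10
X = rng.uniform(0, 1, (n_cows, n_days))

# Pairwise Euclidean distances, then invert to obtain a similarity matrix.
D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
S = 1.0 / (D + 1e-9)  # small constant avoids division by zero (assumption)
np.fill_diagonal(S, 0)

# Keep weights only for each cow's k = 10 nearest neighbors (symmetrized).
W = np.zeros_like(S)
for i in range(n_cows):
    nn = np.argsort(D[i])[1:k + 1]  # skip self at position 0
    W[i, nn] = S[i, nn]
W = np.maximum(W, W.T)

# Unnormalized graph Laplacian: degree matrix minus weight matrix.
L = np.diag(W.sum(axis=1)) - W
eigvals, eigvecs = np.linalg.eigh(L)

# The smallest Laplacian eigenvalue of a connected graph is ~0; the next
# few eigenvectors supply the low-dimensional embedding coordinates.
embedding = eigvecs[:, 1:4]
print(eigvals[:4], embedding.shape)
```

The decay of the eigenvalue sequence plays the same role here as the scree plot does for PCA: a sharp gap suggests how many embedding dimensions carry structure rather than noise.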
Finally, as a means of comparing geometric structures identified in the observed dataset with those of a completely randomized queuing process, the permuted dataset generated in the previous section was also embedded and visualized using plotly graphics.

Having determined from the previous visualizations that a linear model might be a reasonable representation of the underlying deterministic structures of this system, the next step was to explore the temporal dynamics of this dataset. In a standard repeated measures model, multiple observations from the same animal are assumed to be independently and identically sampled, implying that sampling order should not affect the observed value. If the observation period is sufficiently long to allow the underlying process to shift or evolve over time, however, stationarity cannot be assumed. Failure to statistically accommodate a temporal trend can not only lead to spurious inferences due to incorrect estimation of error variance, but also risks overlooking dynamic features of the behaviors under consideration. In practice temporal trends are often assessed by first fitting a stationary model and analyzing the resulting residuals. This may suffice when the temporal trend is uniform across animals, but risks overlooking more complex nonhomogeneous temporal effects. This could occur if only a subset of the larger group displays a non-stationary pattern, a risk that is likely heightened in large socially heterogeneous groups.

In this physically constrained system, where we know that every cow moving forward in the queue must force other cows backwards, compensatory trends could also be easily overlooked in a collective assessment of residuals. We first assessed temporal trends using two conventional EDA techniques. First, the ggplot2 package was used to generate scatter plots of entry quantile values against the corresponding observation date for each individual cow, with the shift to pasture access annotated with a vertical line. Plots were visually inspected for non-stationarity, and are provided in the supplementary materials. Next, to further explore the impact of the shift from pen to overnight pasture access on morning queueing patterns, median queue positions from the two subperiods were plotted against one another using the ggplot2 package, and the Pearson correlation and Kendall's tau were computed using the stats package. While these preliminary visualizations were easy to both generate and interpret, both treat cows as independent and somewhat isolated units. With such a large number of animals to consider, the capacity for human pattern detection is quickly overwhelmed, making it difficult to contextualize trends within the broader herd. Further, this approach fails to leverage the non-independence between animals entering the parlor, and thus risks overlooking subtler collective responses. Data mechanics visualizations were implemented to simultaneously explore systematic heterogeneity in milk entry quantiles both between animals and across the temporal axis. This was done by first using entry quantile values to compute two Euclidean distance matrices: one quantifying the similarity between pairwise combinations of cows, the second quantifying similarity between pairwise combinations of daily milking sequences. These distance matrices were then used to generate two independent hierarchical clustering trees using the Ward D2 method.
By cutting both trees at a fixed number of clusters, observation days and cows were both partitioned into empirically defined categories, and a contingency table was then formed with cow clusters as the row variable and day clusters as the column variable. The original distance matrices were then updated, using the clustering structure between cows to create a weighted distance matrix between days and vice versa, thereby allowing mutual information to be shared between the temporal and social axes of the dataset. After several iterations of this algorithm, the clusterings converged towards a contingency table with minimal entropy, wherein the entry quantile values within each cell were as homogeneous as possible. When the entry quantile values were subsequently visualized using a heat map, this highly generalizable entropy minimization technique served to visually enhance heterogeneity within the data driven by nonrandom patterns along either axis. Further, by facilitating the transfer of information between axes, interaction effects between the social and temporal dimensions of this system were magnified, which here provided a means to explore nonhomogeneous temporal non-stationarity between subgroups within the herd. The data mechanics pipeline was used to analyze the temporal dynamics present in both the complete milking order dataset and the subset of animals with no recorded health events. Rather than fixing a single clustering granularity a priori, the algorithm was applied on a grid from 1 to 10 clusters for either axis. The resulting 100 heat maps were scanned visually to determine the clustering granularity required to bring into resolution any interactions between social and temporal mechanisms. While this process may be computationally cumbersome, it is empirically analogous to systematically varying the focus of a light microscope to bring into resolution microbes of unknown size: a tedious but effective means of identifying all relevant structures within a sample.
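The alternating co-clustering idea behind data mechanics can be sketched minimally as follows. This is illustrative Python with simulated complete data; the actual implementation differs in its distance weighting and its entropy-based convergence criterion, and Ward linkage on Euclidean distances stands in for the Ward D2 method used in R:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

rng = np.random.default_rng(5)

# Simulated complete matrix of entry quantiles: rows = cows, columns = days.
X = rng.uniform(0, 1, (60, 40))
k_rows, k_cols = 4, 3  # one point on the 1-10 x 1-10 granularity grid

# Initial cow clustering from raw Euclidean distances (Ward linkage).
row_labels = fcluster(linkage(pdist(X), method="ward"),
                      k_rows, criterion="maxclust")

for _ in range(5):
    # Describe each day by its mean within the current cow clusters, then
    # recluster the days; cow-axis structure thereby informs the day axis.
    day_profiles = np.array([X[row_labels == g].mean(axis=0)
                             for g in range(1, k_rows + 1)]).T
    col_labels = fcluster(linkage(pdist(day_profiles), method="ward"),
                          k_cols, criterion="maxclust")
    # And vice versa: describe each cow by its mean within day clusters.
    cow_profiles = np.array([X[:, col_labels == g].mean(axis=1)
                             for g in range(1, k_cols + 1)]).T
    row_labels = fcluster(linkage(pdist(cow_profiles), method="ward"),
                          k_rows, criterion="maxclust")

# Mean entry quantile within each cell of the resulting contingency table.
blocks = np.array([[X[np.ix_(row_labels == i, col_labels == j)].mean()
                    for j in range(1, k_cols + 1)]
                   for i in range(1, k_rows + 1)])
print(blocks.round(2))
```

In the full pipeline this loop would be repeated over the 1-10 grid on both axes, with each resulting heat map inspected for block structure.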
Finally, the RColorBrewer package was used to add color annotations to the column margin, to clarify temporal patterns, and to the row margins, which served to visualize potential relationships between queue position, a selection of individual cow attribute variables, and the onset of recorded health complications.

Having thoroughly characterized the stochastic structures present in this dataset, the insights gleaned from the preceding visualizations were incorporated into a linear model to evaluate the relationship between queue position and several cow attributes. The four days identified as outliers by the data mechanics visualizations were first removed and the dataset converted to long format to be analyzed as a repeated measures model using the nlme package. Cow was fit as a random intercept via the maximum likelihood method. Guided by the results of the entropy and data mechanics visualizations, varIdent was used to estimate separate error variance terms for each cow, and the necessity of this data-hungry heterogeneous variance model was confirmed via a likelihood ratio test against the null model with homogeneous variance.
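The final model comparison rests on a standard likelihood ratio test. A minimal sketch of the test logic follows (Python, with hypothetical log-likelihood values and parameter counts rather than fitted ones; the actual comparison was produced by nlme's anova method in R):

```python
from scipy import stats

def likelihood_ratio_test(ll_null, ll_full, df_diff):
    """LRT: 2 * (log-lik full - log-lik null) ~ chi-square with df_diff."""
    lr = 2 * (ll_full - ll_null)
    return lr, stats.chi2.sf(lr, df_diff)

# Hypothetical log-likelihoods: homogeneous-variance null model versus a
# model with a separate residual variance per cow (113 extra parameters
# for 114 cows; all numbers here are invented for illustration).
lr, p = likelihood_ratio_test(ll_null=-1500.0, ll_full=-1350.0, df_diff=113)
print(round(lr, 1), p < 0.05)
```

A significant result indicates that the improvement in fit justifies the many extra variance parameters, which is the justification reported for retaining the heterogeneous variance structure.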