Latent variable models are a widely used class of statistical models. Combining them with neural networks yields deep latent variable models with greatly increased expressivity, which has opened up many applications in machine learning. A drawback of these models is that their likelihood function is intractable, so approximations are required to carry out inference. A standard approach is to maximize an evidence lower bound (ELBO) obtained from a variational approximation of the posterior distribution of the latent variables. The standard ELBO can, however, be a rather loose bound when the variational family is not rich enough. A general strategy for tightening such bounds is to rely on an unbiased, low-variance Monte Carlo estimate of the evidence. This article reviews recent advances in importance sampling, Markov chain Monte Carlo and sequential Monte Carlo methods developed to achieve this. This article is part of the theme issue 'Bayesian inference challenges, perspectives, and prospects'.
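As background for the bounds discussed above, a generic formulation (standard notation, not taken from the article itself) is the following: for data x, latent variables z and a variational family q_phi, the standard ELBO and its importance-weighted tightening with K samples read

\[
\log p_\theta(x) \;\ge\; \mathbb{E}_{q_\phi(z\mid x)}\!\left[\log \frac{p_\theta(x,z)}{q_\phi(z\mid x)}\right],
\qquad
\log p_\theta(x) \;\ge\; \mathbb{E}_{z_1,\dots,z_K \overset{\text{iid}}{\sim} q_\phi(\cdot\mid x)}\!\left[\log \frac{1}{K}\sum_{k=1}^{K}\frac{p_\theta(x,z_k)}{q_\phi(z_k\mid x)}\right].
\]

More generally, any unbiased estimator \(\widehat{Z}\) of the evidence \(p_\theta(x)\) yields a valid lower bound \(\mathbb{E}[\log \widehat{Z}] \le \log p_\theta(x)\) by Jensen's inequality, and the smaller its variance, the tighter the bound; importance sampling, Markov chain Monte Carlo and sequential Monte Carlo provide different ways of constructing such estimators.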
Randomized clinical trials are a cornerstone of clinical research, but they can be prohibitively costly and face substantial obstacles in patient recruitment. Real-world evidence (RWE) from electronic health records, patient registries, claims data and other sources is therefore being actively explored as a possible alternative or complement to controlled clinical trials. Bayesian inference provides a natural framework for the required integration of data from different sources. We review currently available methods and propose a novel Bayesian non-parametric (BNP) approach. BNP priors are needed to understand and adjust for the heterogeneity of the patient populations underlying the different data sources. As a specific application, we consider the use of real-world data (RWD) to construct a synthetic control arm that augments a single-arm, treatment-only study. Central to the proposed approach is a model-based adjustment that aligns the patient population of the current study with that of the (adjusted) real-world data. This is implemented with common atom mixture models, whose structure greatly simplifies inference: differences between the populations are captured by ratios of the mixture weights. This article is part of the theme issue 'Bayesian inference challenges, perspectives, and prospects'.
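To fix ideas, a common atom mixture for two data sources can be sketched in generic notation (illustrative only, not the article's exact specification): observations \(y_{ji}\) from source j (say, j = 1 for the current study and j = 2 for the real-world data) are modelled as

\[
y_{ji} \mid G_j \;\sim\; \int f(y \mid \theta)\, G_j(\mathrm{d}\theta),
\qquad
G_j \;=\; \sum_{k \ge 1} w_{jk}\, \delta_{\theta_k},
\]

where the atoms \(\theta_k\) are shared across sources while the weights \(w_{jk}\) are source-specific; ratios such as \(w_{1k}/w_{2k}\) quantify how strongly each sub-population is over- or under-represented in the real-world data, which is what drives the population adjustment.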
This paper analyses shrinkage priors that impose increasing shrinkage along a sequence of parameters. We review the cumulative shrinkage process (CUSP) prior of Legramanti et al. (2020 Biometrika 107, 745-752; doi:10.1093/biomet/asaa008), a spike-and-slab shrinkage prior whose spike probability increases stochastically and is constructed from the stick-breaking representation of a Dirichlet process prior. As a first contribution, this CUSP prior is extended by allowing arbitrary stick-breaking representations based on beta distributions. As a second contribution, we show that exchangeable spike-and-slab priors, which are widely used in sparse Bayesian factor analysis, can be represented as a finite generalized CUSP prior, conveniently obtained from the decreasing order statistics of the slab probabilities. Hence, exchangeable spike-and-slab shrinkage priors imply increasing shrinkage as the column index in the loading matrix increases, without imposing explicit ordering constraints on the slab probabilities. The usefulness of these findings is illustrated through an application to sparse Bayesian factor analysis. A new exchangeable spike-and-slab shrinkage prior, inspired by the triple gamma prior of Cadonna et al. (2020 Econometrics 8, 20; doi:10.3390/econometrics8020020), is introduced, and a simulation study shows that it is helpful for estimating the unknown number of factors. This article is part of the theme issue 'Bayesian inference challenges, perspectives, and prospects'.
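In the notation of Legramanti et al. (2020), the CUSP prior on a sequence of (e.g. column-specific) scale parameters \(\theta_h\) can be written, up to the choice of slab distribution, as

\[
\theta_h \mid \pi_h \;\sim\; (1-\pi_h)\,P_{\text{slab}} \;+\; \pi_h\,\delta_{\theta_\infty},
\qquad
\pi_h \;=\; \sum_{\ell=1}^{h} \omega_\ell,
\qquad
\omega_\ell \;=\; \nu_\ell \prod_{m<\ell}(1-\nu_m),\;\; \nu_\ell \sim \mathrm{Beta}(1,\alpha),
\]

so that the spike probability \(\pi_h\) is non-decreasing in the index h. The generalization described above replaces the Beta(1, \(\alpha\)) sticks by arbitrary Beta(\(a_\ell\), \(b_\ell\)) sticks.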
Count data applications frequently exhibit an excess of zero counts. The hurdle model accounts for such data by modelling the probability of a zero count explicitly, while assuming a sampling distribution on the positive integers. We consider data arising from multiple counting processes. In this setting, it is of interest to study the patterns of counts and to cluster the subjects accordingly. We introduce a novel Bayesian approach for clustering multiple, possibly related, zero-inflated processes. We propose a joint model for zero-inflated count data by specifying a hurdle model for each process, with a shifted negative binomial sampling distribution. Conditionally on the model parameters, the different processes are assumed independent, which leads to a substantial reduction in the number of parameters compared with traditional multivariate approaches. The subject-specific probabilities of zero inflation and the parameters of the sampling distribution are flexibly modelled through an enriched finite mixture with a random number of components. This induces a two-level clustering of the subjects: an outer clustering based on the zero/non-zero patterns and an inner clustering based on the sampling distribution. Markov chain Monte Carlo schemes tailored for posterior inference are developed. The proposed approach is illustrated in an application involving the use of the messaging service WhatsApp. This article is part of the theme issue 'Bayesian inference challenges, perspectives, and prospects'.
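Written in generic notation (the parametrization here is illustrative), a hurdle model with a shifted negative binomial for the count of subject i on process j takes the form

\[
\Pr(y_{ij}=0) \;=\; 1-\pi_{ij},
\qquad
\Pr(y_{ij}=y) \;=\; \pi_{ij}\, f_{\mathrm{NB}}(y-1 \mid r_{ij}, p_{ij}), \quad y = 1, 2, \dots,
\]

where \(\pi_{ij}\) is the probability of a non-zero count and the shifted negative binomial \(f_{\mathrm{NB}}\) governs the positive part; the zero-inflation probabilities and the sampling-distribution parameters then receive the mixture priors that induce the two-level clustering described above.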
Building on three decades of advances in philosophy, theory, methods and computation, Bayesian approaches have become an integral part of the toolkit of the modern statistician and data scientist. Applied practitioners, whether they embrace Bayesian principles wholeheartedly or use them opportunistically, can now take advantage of what the Bayesian approach has to offer. This paper highlights six contemporary challenges in applied Bayesian statistics: intelligent data collection, new data sources, federated analytics, inference for implicit models, model transfer and purposeful software design. This article is part of the theme issue 'Bayesian inference challenges, perspectives, and prospects'.
We use e-variables to represent a decision-maker's uncertainty. Like the Bayesian posterior, the resulting e-posterior allows for making predictions against arbitrary loss functions that need not be specified in advance. Unlike the Bayesian posterior, it yields risk bounds that have frequentist validity irrespective of the adequacy of the prior. A poor choice of the e-collection (the analogue of the Bayesian prior) leads to bounds that are weaker, but never wrong, making e-posterior minimax decision rules safer than Bayesian ones. The resulting quasi-conditional paradigm is illustrated by re-interpreting the earlier partial Bayes-frequentist unification of the Kiefer-Berger-Brown-Wolpert conditional frequentist tests in terms of e-posteriors. This article is part of the theme issue 'Bayesian inference challenges, perspectives, and prospects'.
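For reference, the basic building block here is standard: an e-variable for a null hypothesis \(\mathcal{H}_0\) is a non-negative statistic E satisfying

\[
E \;\ge\; 0,
\qquad
\mathbb{E}_{P}[E] \;\le\; 1 \quad \text{for every } P \in \mathcal{H}_0,
\]

so that, by Markov's inequality, \(\Pr_P(E \ge 1/\alpha) \le \alpha\) for all \(P \in \mathcal{H}_0\). The e-posterior discussed above builds on collections of such e-variables, which is what allows its risk bounds to remain valid, if weakened, under a poorly chosen e-collection.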
Forensic science plays a crucial role in the United States criminal justice system. Historically, however, feature-based fields of forensic science, such as firearms examination and latent print analysis, have not been shown to be scientifically valid. Black-box studies have recently been proposed as a way to assess the validity, including accuracy, reproducibility, and repeatability, of these feature-based disciplines. In these studies, examiners frequently either do not respond to all test items or select the equivalent of 'uncertain'. Current statistical analyses of black-box studies ignore this large amount of missing data. Unfortunately, the authors of black-box studies typically do not share the data needed to meaningfully adjust estimates for the large number of missing responses. Drawing on work in small area estimation, we propose hierarchical Bayesian models that adjust for non-response without requiring auxiliary data. Using these models, we offer the first formal exploration of the effect that missingness has on error rate estimates in black-box studies. Error rates reported as low as 0.4% could in fact be as high as 84% once non-response is accounted for in the model, and treating inconclusive results as missing data pushes the error rate above 28%. These proposed models do not resolve the missing-data problem in black-box studies; rather, the release of auxiliary information would enable new methodologies that adjust error rate estimates for missing responses. This article is part of the theme issue 'Bayesian inference challenges, perspectives, and prospects'.
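One generic small-area-style formulation (a sketch of the kind of model meant here, not necessarily the authors' exact specification) is a hierarchical logistic model for the correctness of examiner i's response to item j,

\[
y_{ij} \mid \theta_{ij} \;\sim\; \mathrm{Bernoulli}(\theta_{ij}),
\qquad
\operatorname{logit}(\theta_{ij}) \;=\; \mu + a_i + b_j,
\qquad
a_i \sim \mathcal{N}(0,\sigma_a^2),\;\; b_j \sim \mathcal{N}(0,\sigma_b^2),
\]

paired with an analogous hierarchical model for the indicator that a response is observed at all; error rates are then estimated by averaging over the posterior predictive distribution of the unobserved responses rather than by discarding them.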
Bayesian approaches to cluster analysis go beyond algorithmic approaches by providing not only point estimates of the clustering structure but also probabilistic quantification of the uncertainty in the clustering and in the patterns within each cluster. We review Bayesian cluster analysis, covering both model-based and loss-based approaches, and discuss the importance of the choice of kernel or loss and of the prior specification. Advantages are illustrated in an application to single-cell RNA sequencing data, where clustering cells and discovering latent cell types aids the study of embryonic cellular development. This article is part of the theme issue 'Bayesian inference challenges, perspectives, and prospects'.
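A common way to turn the posterior over partitions into a report, written here in generic notation, is decision-theoretic: given a loss L on partitions (e.g. Binder's loss or the variation of information), a point estimate is

\[
\hat{c} \;=\; \operatorname*{arg\,min}_{c} \; \mathbb{E}\!\left[ L(c, c^\ast) \mid \text{data} \right]
\;=\; \operatorname*{arg\,min}_{c} \; \sum_{c^\ast} L(c, c^\ast)\, p(c^\ast \mid \text{data}),
\]

where \(c^\ast\) denotes the unknown true partition and the posterior \(p(c^\ast \mid \text{data})\) comes from the model-based specification (kernel and prior); uncertainty can be reported directly from this posterior, for example through the posterior similarity matrix.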