统计学学术速递[12.17]

2021-12-17 17:40:23

stat统计学,共计34篇

【1】 A new locally linear embedding scheme in light of Hessian eigenmap 标题:一种新的基于Hessian特征映射的局部线性嵌入方案 链接:https://arxiv.org/abs/2112.09086

作者:Liren Lin,Chih-Wei Chen 备注:13 pages 摘要:我们提供了Hessian局部线性嵌入(HLLE)的一种新解释,揭示了它本质上是实现局部线性嵌入(LLE)相同思想的一种变体。基于新的解释,可以进行实质性的简化,其中“黑森”的概念被相当任意的权重所取代。此外,我们通过数值例子表明,当目标空间的维数大于数据流形的维数时,HLLE可能产生类似于投影的结果,因此建议对流形维数进行进一步修改。结合所有观测结果,我们最终实现了一种新的LLE方法,称为切向LLE(TLLE)。它比HLLE更简单、更健壮。 摘要:We provide a new interpretation of Hessian locally linear embedding (HLLE), revealing that it is essentially a variant way to implement the same idea of locally linear embedding (LLE). Based on the new interpretation, a substantial simplification can be made, in which the idea of "Hessian" is replaced by rather arbitrary weights. Moreover, we show by numerical examples that HLLE may produce projection-like results when the dimension of the target space is larger than that of the data manifold, and hence one further modification concerning the manifold dimension is suggested. Combining all the observations, we finally achieve a new LLE-type method, which is called tangential LLE (TLLE). It is simpler and more robust than HLLE.
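下面用纯 NumPy 给出 LLE 类方法共有的"局部线性重构权重"这一步的示意草图(并非论文的 TLLE 实现,近邻数 k、正则参数等均为本文自拟)。对恰好位于低维线性子空间上的数据,这些权重几乎可以精确重构每个点:

```python
import numpy as np

def lle_weights(X, k=6, reg=1e-6):
    """对每个点, 求其 k 近邻上和为 1 的重构权重 (LLE 的第一步)。"""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        d = np.linalg.norm(X - X[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]        # 跳过点自身
        Z = X[nbrs] - X[i]                   # 以该点为中心的近邻差向量
        G = Z @ Z.T                          # 局部 Gram 矩阵
        G += reg * np.trace(G) * np.eye(k)   # 轻微正则化保证可解
        w = np.linalg.solve(G, np.ones(k))
        W[i, nbrs] = w / w.sum()             # 归一化使权重和为 1
    return W

rng = np.random.default_rng(0)
B = rng.normal(size=(2, 5))
X = rng.normal(size=(100, 2)) @ B            # R^5 中二维平面上的点
W = lle_weights(X)
recon_err = np.abs(W @ X - X).max()          # 局部线性数据应近乎精确重构
```

得到权重矩阵 W 后,经典 LLE 通过 (I-W)ᵀ(I-W) 的底部特征向量得到嵌入;论文讨论的正是此处权重可被"相当任意"的选择所替代这一点。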

【2】 Simultaneous Monitoring of a Large Number of Heterogeneous Categorical Data Streams 标题:大量异构分类数据流的同时监控 链接:https://arxiv.org/abs/2112.09077

作者:Kaizong Bai,Jian Li 摘要:本文提出了一个强大的方案来监控大量具有异构参数或性质的分类数据流。所考虑的数据流可以是具有多个属性级别的标称数据流,也可以是在属性级别之间具有某种自然顺序的有序数据流,如good、marginal和bad。对于顺序数据流,假定存在确定它的相应的潜在连续数据流。此外,不同的数据流可以具有不同数量的属性级别和不同的级别概率值。由于高维性,传统的多元分类控制图无法应用。在这里,我们通过一些标准化过程,将来自每个流的局部指数加权似然比检验统计量(无论是标称还是序数)集成到一个强大的拟合优度检验中。最后提出了一种全局监测统计方法。仿真结果表明了该方法的鲁棒性和有效性。 摘要:This article proposes a powerful scheme to monitor a large number of categorical data streams with heterogeneous parameters or nature. The data streams considered may be either nominal with a number of attribute levels or ordinal with some natural order among their attribute levels, such as good, marginal, and bad. For an ordinal data stream, it is assumed that there is a corresponding latent continuous data stream determining it. Furthermore, different data streams may have different number of attribute levels and different values of level probabilities. Due to high dimensionality, traditional multivariate categorical control charts cannot be applied. Here we integrate the local exponentially weighted likelihood ratio test statistics from each single stream, regardless of nominal or ordinal, into a powerful goodness-of-fit test by some normalization procedure. A global monitoring statistic is proposed ultimately. Simulation results have demonstrated the robustness and efficiency of our method.
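作为直观说明,下面是单条标称数据流的一个简化 EWMA 监控草图(并非论文的指数加权似然比统计量,平滑参数与统计量形式均为本文自拟):对观测的独热编码做指数加权平均,再用 Pearson 型距离与在控概率比较,分布漂移后统计量明显抬升:

```python
import numpy as np

def ewma_cat_chart(stream, p0, lam=0.1):
    """stream: 类别标签序列; p0: 在控类别概率; 返回逐时刻监控统计量。"""
    m = len(p0)
    q = np.array(p0, dtype=float)                 # 独热观测的 EWMA
    stats = []
    for x in stream:
        onehot = np.eye(m)[x]
        q = (1 - lam) * q + lam * onehot
        stats.append(np.sum((q - p0) ** 2 / p0))  # Pearson 型距离
    return np.array(stats)

rng = np.random.default_rng(1)
p0 = np.array([0.5, 0.3, 0.2])
p1 = np.array([0.2, 0.3, 0.5])                    # 漂移后的分布
stream = np.concatenate([rng.choice(3, 300, p=p0),
                         rng.choice(3, 300, p=p1)])
stats = ewma_cat_chart(stream, p0)
```

实际使用时控制限需由在控分布下的仿真来校准;论文的贡献在于把大量异构流的此类局部统计量规范化后整合成一个全局拟合优度检验。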

【3】 Local Prediction Pools 标题:本地预测池 链接:https://arxiv.org/abs/2112.09073

作者:Oscar Oelrich,Mattias Villani,Sebastian Ankargren 备注:18 pages, 8 figures, 2 tables 摘要:我们提出局部预测池作为一种方法,用于组合一组专家的预测分布,这些专家的预测能力被认为相对于一组池变量局部变化。为了估计每个专家的局部预测能力,我们引入了简单、快速和可解释的卡尺方法。当本地预测性能数据很少时,来自本地预测池的专家池权重将接近等权重解决方案,从而使预测池具有鲁棒性和自适应性。在宏观经济预测评估和自行车租赁公司的日常自行车使用预测中,本地预测池的表现优于广泛使用的最佳线性预测池。 摘要:We propose local prediction pools as a method for combining the predictive distributions of a set of experts whose predictive abilities are believed to vary locally with respect to a set of pooling variables. To estimate the local predictive ability of each expert, we introduce the simple, fast, and interpretable caliper method. Expert pooling weights from the local prediction pool approaches the equal weight solution whenever there is little data on local predictive performance, making the pools robust and adaptive. Local prediction pools are shown to outperform the widely used optimal linear pools in a macroeconomic forecasting evaluation, and in predicting daily bike usage for a bike rental company.
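下面是卡尺(caliper)思想的一个示意草图(加权方式为本文自拟的简化版本,并非论文的确切估计量):在混合变量 z 的当前值附近取一个卡尺窗口,按各专家在窗口内的平均历史对数预测得分加权;窗口内无历史数据时退回等权重:

```python
import numpy as np

def caliper_weights(z_hist, log_scores, z_now, caliper=0.5):
    """log_scores: (T, J) 各专家的历史对数预测密度; 返回 J 维权重。"""
    J = log_scores.shape[1]
    near = np.abs(z_hist - z_now) <= caliper
    if near.sum() == 0:
        return np.full(J, 1.0 / J)        # 无局部证据时等权重
    local = log_scores[near].mean(axis=0)
    w = np.exp(local - local.max())       # softmax 式加权
    return w / w.sum()

rng = np.random.default_rng(2)
z_hist = rng.uniform(-2, 2, size=500)
# 风格化得分: 专家 0 在 z<0 时预测好, 专家 1 在 z>0 时预测好
log_scores = np.column_stack([np.where(z_hist < 0, -1.0, -3.0),
                              np.where(z_hist < 0, -3.0, -1.0)])
w_neg = caliper_weights(z_hist, log_scores, z_now=-1.0)
w_pos = caliper_weights(z_hist, log_scores, z_now=+1.0)
w_far = caliper_weights(z_hist, log_scores, z_now=100.0)
```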

【4】 Nonparametric empirical Bayes estimation based on generalized Laguerre series 标题:基于广义Laguerre级数的非参数经验Bayes估计 链接:https://arxiv.org/abs/2112.09050

作者:Rida Benhaddou,Matthew Connell 备注:30 pages 摘要:在这项工作中,我们深入研究了非参数经验Bayes理论,通过截断广义Laguerre级数来逼近经典Bayes估计,然后通过最小化估计的先验风险来估计其系数。最小化过程产生一个线性方程组,其大小等于截断水平。当混合分布和先验分布在正实半线或其子区间上时,我们研究了经验贝叶斯估计问题。通过研究几种常见的混合分布,我们提出了一种选择广义拉盖尔函数基参数的策略,使得我们的估计量具有有限方差。我们证明了我们的广义拉盖尔经验贝叶斯方法在极大极小意义下是渐近最优的。最后,我们的收敛速度与文献中的几个结果进行了比较和对比。 摘要:In this work, we delve into the nonparametric empirical Bayes theory and approximate the classical Bayes estimator by a truncation of the generalized Laguerre series and then estimate its coefficients by minimizing the prior risk of the estimator. The minimization process yields a system of linear equations the size of which is equal to the truncation level. We focus on the empirical Bayes estimation problem when the mixing distribution, and therefore the prior distribution, has a support on the positive real half-line or a subinterval of it. By investigating several common mixing distributions, we develop a strategy on how to select the parameter of the generalized Laguerre function basis so that our estimator possesses a finite variance. We show that our generalized Laguerre empirical Bayes approach is asymptotically optimal in the minimax sense. Finally, our convergence rate is compared and contrasted with several results from the literature.

【5】 The Dual PC Algorithm for Structure Learning 标题:用于结构学习的双PC算法 链接:https://arxiv.org/abs/2112.09036

作者:Enrico Giudice,Jack Kuipers,Giusi Moffa 摘要:虽然从观测数据中学习贝叶斯网络的图形结构是描述和帮助理解复杂应用中数据生成过程的关键,但由于其计算复杂性,该任务带来了相当大的挑战。代表贝叶斯网络模型的有向无环图(DAG)通常无法从观测数据中识别,存在多种方法来估计其等价类。在某些假设下,流行的PC算法可以通过测试条件独立性(CI),从边缘独立关系开始,逐步扩展条件集,一致地恢复正确的等价类。在这里,我们提出了双PC算法,这是一种利用协方差和精度矩阵之间的逆关系在PC算法中执行CI测试的新方案。值得注意的是,精度矩阵的元素与高斯数据的偏相关一致。然后,我们的算法利用协方差矩阵和精度矩阵上的块矩阵求逆,同时对互补(或对偶)条件集的偏相关进行测试。因此,双PC算法的多重CI测试首先考虑边缘和全阶CI关系,然后逐步转移到中心阶CI关系。仿真研究表明,双PC算法在运行时间和恢复底层网络结构方面均优于经典PC算法。 摘要:While learning the graphical structure of Bayesian networks from observational data is key to describing and helping understand data generating processes in complex applications, the task poses considerable challenges due to its computational complexity. The directed acyclic graph (DAG) representing a Bayesian network model is generally not identifiable from observational data, and a variety of methods exist to estimate its equivalence class instead. Under certain assumptions, the popular PC algorithm can consistently recover the correct equivalence class by testing for conditional independence (CI), starting from marginal independence relationships and progressively expanding the conditioning set. Here, we propose the dual PC algorithm, a novel scheme to carry out the CI tests within the PC algorithm by leveraging the inverse relationship between covariance and precision matrices. Notably, the elements of the precision matrix coincide with partial correlations for Gaussian data. Our algorithm then exploits block matrix inversions on the covariance and precision matrices to simultaneously perform tests on partial correlations of complementary (or dual) conditioning sets. The multiple CI tests of the dual PC algorithm, therefore, proceed by first considering marginal and full-order CI relationships and progressively moving to central-order ones. Simulation studies indicate that the dual PC algorithm outperforms the classical PC algorithm both in terms of run time and in recovering the underlying network structure.
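双重 PC 算法所利用的核心恒等式可以直接数值验证:对高斯数据,给定其余全部变量时 X_i 与 X_j 的偏相关等于 -Ω_ij/√(Ω_ii Ω_jj),其中 Ω 为精度矩阵(协方差矩阵之逆)。下面的草图用回归残差法复算同一量(数据与维数均为本文自拟的示例):

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 5000, 4
X = rng.normal(size=(n, d)) @ rng.normal(size=(d, d))   # 相关的高斯数据
S = np.cov(X, rowvar=False)
Omega = np.linalg.inv(S)                                # 精度矩阵

i, j = 0, 1
pcor_prec = -Omega[i, j] / np.sqrt(Omega[i, i] * Omega[j, j])

# 用"慢方法"复算: 分别对其余变量回归 X_i、X_j, 再对残差求相关
rest = [k for k in range(d) if k not in (i, j)]
Z = np.column_stack([np.ones(n), X[:, rest]])
ri = X[:, i] - Z @ np.linalg.lstsq(Z, X[:, i], rcond=None)[0]
rj = X[:, j] - Z @ np.linalg.lstsq(Z, X[:, j], rcond=None)[0]
pcor_resid = np.corrcoef(ri, rj)[0, 1]
```

两者在样本层面就精确相等(同一样本协方差),这正是论文得以用协方差与精度矩阵的分块求逆同时检验互补(对偶)条件集的原因。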

【6】 Using Bayesian Evidence Synthesis Methods to Incorporate Real World Evidence in Surrogate Endpoint Evaluation 标题:利用贝叶斯证据合成方法将真实世界证据纳入代理终点评估 链接:https://arxiv.org/abs/2112.08948

作者:Lorna Wheaton,Anastasios Papanikos,Anne Thomas,Sylwia Bujkiewicz 备注:13 pages, 2 figures 摘要:目的:传统上,替代终点的验证是使用RCT数据进行的。然而,RCT数据可能过于有限,无法验证替代终点。在本文中,我们试图通过纳入真实世界证据(RWE)来改进替代终点的验证。研究设计和设置:我们使用比较RWE(cRWE)和单臂RWE(sRWE)的数据,补充RCT证据,以评估无进展生存率(PFS)作为转移性结直肠癌(mCRC)总生存率(OS)的替代终点。通过RCT、cRWE和匹配sRWE的治疗效果评估,将抗血管生成治疗与化疗进行比较,以确定替代关系模式,并根据PFS的治疗效果预测OS的治疗效果。结果:确定了7项随机对照试验、4项cRWE研究和3项匹配的sRWE研究。将RWE添加到RCT中降低了替代关系参数估计的不确定性。在随机对照试验中添加RWE也提高了使用观察到的PFS效应数据预测治疗对OS影响的准确性和精确度。结论:在RCT数据中加入RWE提高了描述PFS与OS治疗效果之间替代关系的参数的精度,以及预测临床获益的精度。 摘要:Objective: Traditionally validation of surrogate endpoints has been carried out using RCT data. However, RCT data may be too limited to validate surrogate endpoints. In this paper, we sought to improve validation of surrogate endpoints with the inclusion of real world evidence (RWE). Study Design and Setting: We use data from comparative RWE (cRWE) and single arm RWE (sRWE), to supplement RCT evidence for evaluation of progression free survival (PFS) as a surrogate endpoint to overall survival (OS) in metastatic colorectal cancer (mCRC). Treatment effect estimates from RCTs, cRWE and matched sRWE, comparing anti-angiogenic treatments with chemotherapy, were used to inform surrogacy patterns and predictions of the treatment effect on OS from the treatment effect on PFS. Results: Seven RCTs, four cRWE studies and three matched sRWE studies were identified. The addition of RWE to RCTs reduced the uncertainty around the estimates of the parameters for the surrogate relationship. Addition of RWE to RCTs also improved the accuracy and precision of predictions of the treatment effect on OS obtained using data on the observed effect on PFS. Conclusion: The addition of RWE to RCT data improved the precision of the parameters describing the surrogate relationship between treatment effects on PFS and OS and the predicted clinical benefit.

【7】 BayesFlow can reliably detect Model Misspecification and Posterior Errors in Amortized Bayesian Inference 标题:BayesFlow可以可靠地检测摊销贝叶斯推理中的模型误设和后验误差 链接:https://arxiv.org/abs/2112.08866

作者:Marvin Schmitt,Paul-Christian Bürkner,Ullrich Köthe,Stefan T. Radev 备注:14 pages, 7 figures 摘要:神经密度估计器在不同的研究领域中被证明在执行基于模拟的贝叶斯推理方面非常强大。特别是,BayesFlow框架使用两步方法,在模拟程序隐式定义似然函数的情况下,实现摊销参数估计。但是,当模拟不能很好地反映现实时,这种推断有多可靠呢?在本文中,我们概念化了基于仿真的推理中出现的模型误设的类型,并系统地研究了BayesFlow框架在这些误设下的性能。我们提出了一个增广优化目标,该目标在潜在数据空间上施加概率结构,并利用最大平均差异(MMD)来检测推理过程中可能破坏所得结果有效性的潜在灾难性误设。我们在大量人工和现实的误设情形上验证了我们的检测标准,从玩具共轭模型到应用于真实数据的决策和疾病爆发动力学的复杂模型。此外,我们还表明,后验推理误差随着真实数据生成分布与潜在摘要空间中典型模拟集之间的距离的增加而增加。因此,我们证明了MMD作为一种检测模型误设的方法和作为一种验证摊销贝叶斯推理可信度的代理的双重效用。 摘要:Neural density estimators have proven remarkably powerful in performing efficient simulation-based Bayesian inference in various research domains. In particular, the BayesFlow framework uses a two-step approach to enable amortized parameter estimation in settings where the likelihood function is implicitly defined by a simulation program. But how faithful is such inference when simulations are poor representations of reality? In this paper, we conceptualize the types of model misspecification arising in simulation-based inference and systematically investigate the performance of the BayesFlow framework under these misspecifications. We propose an augmented optimization objective which imposes a probabilistic structure on the latent data space and utilize maximum mean discrepancy (MMD) to detect potentially catastrophic misspecifications during inference undermining the validity of the obtained results. We verify our detection criterion on a number of artificial and realistic misspecifications, ranging from toy conjugate models to complex models of decision making and disease outbreak dynamics applied to real data. Further, we show that posterior inference errors increase as a function of the distance between the true data-generating distribution and the typical set of simulations in the latent summary space.
Thus, we demonstrate the dual utility of MMD as a method for detecting model misspecification and as a proxy for verifying the faithfulness of amortized Bayesian inference.
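检测所用的核心量可以用几行代码示意(高斯核的有偏 V 统计量版本,带宽与数据均为本文自拟,并非论文在隐空间中的具体设置):观测摘要与"典型模拟"之间的 MMD 平方估计值,取值偏大即提示误设:

```python
import numpy as np

def mmd2(X, Y, bandwidth=1.0):
    """RBF 核下 MMD 平方的有偏 V 统计量估计。"""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * bandwidth ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(4)
sim = rng.normal(0, 1, size=(300, 2))        # "典型"模拟摘要
obs_ok = rng.normal(0, 1, size=(300, 2))     # 设定正确的数据
obs_bad = rng.normal(2, 1, size=(300, 2))    # 发生平移: 模型误设
mmd_ok = mmd2(sim, obs_ok)
mmd_bad = mmd2(sim, obs_bad)
```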

【8】 Classification Under Ambiguity: When Is Average-K Better Than Top-K? 标题:歧义下的分类:Average-K何时优于Top-K? 链接:https://arxiv.org/abs/2112.08851

作者:Titouan Lorieul,Alexis Joly,Dennis Shasha 备注:53 pages, 21 figures 摘要:当可能有多个标签时,选择单个标签可能会导致精度低。一个常见的替代方法,称为top-$K$分类,是选择一些数字$K$(通常约为5),并返回得分最高的$K$个标签。不幸的是,对于明确的情况,$K>1$太多,对于非常模糊的情况,$K \leq 5$(例如)可能太小。另一种明智的策略是使用自适应方法,其中返回的标签数量随计算出的模糊度变化,但在所有样本上的平均值必须等于某个特定的$K$。我们将这种替代方法称为平均-$K$分类。本文正式刻画了平均-$K$分类能够比固定top-$K$分类获得更低错误率时的模糊度分布。此外,它为固定大小和自适应分类器提供了自然的估计过程,并证明了它们的一致性。最后,它报告了对真实世界图像数据集的实验,揭示了在实践中平均-$K$分类相对top-$K$分类的好处。总的来说,当模糊度被精确知道时,平均-$K$永远不会比top-$K$差;在我们的实验中,当模糊度是估计得到的时,这一结论也成立。 摘要:When many labels are possible, choosing a single one can lead to low precision. A common alternative, referred to as top-$K$ classification, is to choose some number $K$ (commonly around 5) and to return the $K$ labels with the highest scores. Unfortunately, for unambiguous cases, $K>1$ is too many and, for very ambiguous cases, $K \leq 5$ (for example) can be too small. An alternative sensible strategy is to use an adaptive approach in which the number of labels returned varies as a function of the computed ambiguity, but must average to some particular $K$ over all the samples. We denote this alternative average-$K$ classification. This paper formally characterizes the ambiguity profile when average-$K$ classification can achieve a lower error rate than a fixed top-$K$ classification. Moreover, it provides natural estimation procedures for both the fixed-size and the adaptive classifier and proves their consistency. Finally, it reports experiments on real-world image data sets revealing the benefit of average-$K$ classification over top-$K$ in practice. Overall, when the ambiguity is known precisely, average-$K$ is never worse than top-$K$, and, in our experiments, when it is estimated, this also holds.
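两种集合式分类器的差别可以用一个小草图说明(阈值选取方式为本文自拟的最简版本,并非论文的估计过程):固定 top-$K$ 对每个样本恒返回 $K$ 个标签;平均-$K$ 用一个全局得分阈值,使返回集合的平均大小等于 $K$,从而对明确样本返回更少、对模糊样本返回更多标签:

```python
import numpy as np

rng = np.random.default_rng(5)
n, L, K = 1000, 20, 3
logits = rng.normal(size=(n, L))
scores = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# 固定 top-K: 每个样本恰好 K 个标签
topk_sets = np.argsort(-scores, axis=1)[:, :K]

# 平均-K: 全局阈值取所有得分中第 n*K 大者
thresh = np.sort(scores.ravel())[::-1][n * K - 1]
avgk_sets = scores >= thresh              # 每个样本一个布尔标签集
sizes = avgk_sets.sum(axis=1)
avg_size = sizes.mean()                   # 平均大小 ≈ K, 但逐样本可变
```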

【9】 How to estimate heritability, a guide for epidemiologists 标题:如何估计遗传力:流行病学家指南 链接:https://arxiv.org/abs/2112.08840

作者:Ciarrah-Jane S Barry,Venexia M Walker,Rosa C G Cheesman,George Davey Smith,Tim T Morris,Neil M Davies 备注:45 pages, 4 figures 摘要:传统上,遗传力是用基于家庭的方法估计的,如双胞胎研究。分子基因组学的进步促进了利用无关或相关个体大样本的替代方法的发展。然而,在遗传力的估计方面仍然存在一些特殊的挑战,如上位性、分类交配和间接遗传效应。在这里,我们概述了遗传流行病学中用于估计遗传力的常用方法,即遗传变异解释的表型变异比例。我们为理解基于家族的设计(双胞胎和家族研究)、基于无关个体的基因组设计(LD评分回归,GREML)和基于家族的基因组设计(兄弟姐妹回归,GREML-KIN,Trio GCTA,MGCTA,RDR)的遗传力估计方法所需的关键遗传概念提供了指南。对于每种方法,我们描述了如何估计遗传力,其估计所依据的假设,并讨论了不满足这些假设时的含义。我们进一步讨论了与相关个体样本相比,在无关个体样本中估计遗传力的益处和局限性。总的来说,本文旨在帮助读者确定每种方法何时合适以及为什么合适。 摘要:Traditionally, heritability has been estimated using family-based methods such as twin studies. Advancements in molecular genomics have facilitated the development of alternative methods that utilise large samples of unrelated or related individuals. Yet, specific challenges persist in the estimation of heritability such as epistasis, assortative mating and indirect genetic effects. Here, we provide an overview of common methods applied in genetic epidemiology to estimate heritability i.e., the proportion of phenotypic variation explained by genetic variation. We provide a guide to key genetic concepts required to understand heritability estimation methods from family-based designs (twin and family studies), genomic designs based on unrelated individuals (LD score regression, GREML), and family-based genomic designs (Sibling regression, GREML-KIN, Trio-GCTA, MGCTA, RDR). For each method, we describe how heritability is estimated, the assumptions underlying its estimation, and discuss the implications when these assumptions are not met. We further discuss the benefits and limitations of estimating heritability within samples of unrelated individuals compared to samples of related individuals. Overall, this article is intended to help the reader determine the circumstances when each method would be appropriate and why.

【10】 Linear Regression, Covariate Selection and the Failure of Modelling 标题:线性回归、协变量选择与建模失败 链接:https://arxiv.org/abs/2112.08738

作者:Laurie Davies 备注:19 pages 4 figures 摘要:有人认为,所有基于模型的线性回归协变量选择方法都失败了。这适用于基于P值的频率学派方法,也适用于贝叶斯方法,尽管原因不同。在论文的第一部分中,13个基于模型的程序与无模型高斯协变量程序在所选协变量和所需时间方面进行了比较。比较基于四个数据集和两个模拟。这些数据集在文献中经常用作示例,它们没有什么特别之处。所有基于模型的过程都失败了。在论文的第二部分中,有人认为这种失败的原因正是模型的使用。如果模型包含所有可用的协变量,则可以使用标准P值。在这种情况下,P值的使用非常简单。一旦模型只指定了协变量的某个未知子集(问题是如何识别该子集),情况就会发生根本性的变化。有许多P值,它们是相关的,并且大多数是无效的。贝叶斯范式也假设了一个正确的模型;虽然大量协变量不存在概念上的问题,但即使对于中等规模的数据集,也会有相当大的开销,导致计算和存储分配问题。 摘要:It is argued that all model based approaches to the selection of covariates in linear regression have failed. This applies to frequentist approaches based on P-values and to Bayesian approaches although for different reasons. In the first part of the paper 13 model based procedures are compared to the model-free Gaussian covariate procedure in terms of the covariates selected and the time required. The comparison is based on four data sets and two simulations. There is nothing special about these data sets which are often used as examples in the literature. All the model based procedures failed. In the second part of the paper it is argued that the cause of this failure is the very use of a model. If the model involves all the available covariates standard P-values can be used. The use of P-values in this situation is quite straightforward. As soon as the model specifies only some unknown subset of the covariates the problem being to identify this subset the situation changes radically. There are many P-values, they are dependent and most of them are invalid. The Bayesian paradigm also assumes a correct model but although there are no conceptual problems with a large number of covariates there is a considerable overhead causing computational and allocation problems even for moderately sized data sets.
The Gaussian covariate procedure is based on P-values which are defined as the probability that a random Gaussian covariate is better than the covariate being considered. These P-values are exact and valid whatever the situation. The allocation requirements and the algorithmic complexity are both linear in the size of the data making the procedure capable of handling large data sets. It outperforms all the other procedures in every respect.
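高斯协变量 P 值的定义可以直接按字面用蒙特卡罗示意(这是本文自拟的模拟版本,并非论文的闭式公式;变量与样本量均为示例):某候选协变量的 P 值,就是一个纯噪声高斯协变量对响应的解释力不低于它的概率:

```python
import numpy as np

def gaussian_cov_pvalue(x, y, n_mc=2000, rng=None):
    """P(随机高斯协变量的 r^2 >= 候选协变量的 r^2) 的蒙特卡罗估计。"""
    if rng is None:
        rng = np.random.default_rng(0)
    r2 = np.corrcoef(x, y)[0, 1] ** 2
    noise = rng.normal(size=(n_mc, len(y)))          # 纯噪声协变量
    yc = y - y.mean()
    nc = noise - noise.mean(axis=1, keepdims=True)
    r2_noise = (nc @ yc) ** 2 / (nc ** 2).sum(axis=1) / (yc ** 2).sum()
    return (r2_noise >= r2).mean()

rng = np.random.default_rng(6)
n = 200
x_true = rng.normal(size=n)
y = 2.0 * x_true + rng.normal(size=n)
x_junk = rng.normal(size=n)
p_true = gaussian_cov_pvalue(x_true, y)   # 真协变量: P 值应极小
p_junk = gaussian_cov_pvalue(x_junk, y)   # 噪声协变量: P 值应不小
```

这样定义的 P 值不依赖于模型正确性,这正是摘要所强调的"无论在何种情况下都精确有效"的含义。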

【11】 Consistency of the maximum likelihood estimator in hidden Markov models with trends 标题:有趋势隐马尔可夫模型极大似然估计的相合性 链接:https://arxiv.org/abs/2112.08731

作者:Luc Lehéricy,Augustin Touron 摘要:具有趋势的隐马尔可夫模型是一种隐马尔可夫模型,其排放分布由依赖于当前隐藏状态和当前时间的趋势来转换。与标准的隐马尔可夫模型相反,这样的过程不是齐次的,不能通过简单的去趋势化步骤使其变得齐次。我们证明了当趋势是多项式时,最大似然估计能够与其他参数一起恢复趋势,并且是强一致的。更准确地说,真实趋势与估计趋势之差的最大范数趋于零。通过模拟研究评估了最大似然估计的数值性质。 摘要:A hidden Markov model with trends is a hidden Markov model whose emission distributions are translated by a trend that depends on the current hidden state and on the current time. Contrary to standard hidden Markov models, such processes are not homogeneous and cannot be made homogeneous by a simple de-trending step. We show that when the trends are polynomial, the maximum likelihood estimator is able to recover the trends together with the other parameters and is strongly consistent. More precisely, the supremum norm of the difference between the true trends and the estimated ones tends to zero. Numerical properties of the maximum likelihood estimator are assessed by a simulation study.

【12】 High-dimensional logistic entropy clustering 标题:高维Logistic熵聚类 链接:https://arxiv.org/abs/2112.08701

作者:Edouard Genetay,Adrien Saumard,Rémi Coulaud 摘要:分类概率(正则化)熵最小化是一类通用的判别聚类方法。分类概率通常是通过使用监督分类中的一些经典损失来定义的,重点是通过优化以观测为条件的标签法则来避免对完整数据分布进行建模。我们通过专门研究逻辑分类概率,首次对此类方法进行了理论研究。我们证明,如果观测是由两组分各向同性高斯混合产生的,那么最小化欧氏球上的熵风险确实可以识别混合的分离向量。此外,如果该分离向量是稀疏的,则通过$\ell_{1}$-正则化项惩罚经验风险可以在高维空间中推断分离,并以稀疏性问题的标准速率恢复其支撑集。我们的方法基于逻辑熵风险的局部凸性,如果分离向量足够大,则会发生局部凸性,其范数条件独立于空间维度。这种局部凸性特性也保证了在经典的低维环境中的快速速率。 摘要:Minimization of the (regularized) entropy of classification probabilities is a versatile class of discriminative clustering methods. The classification probabilities are usually defined through the use of some classical losses from supervised classification and the point is to avoid modelisation of the full data distribution by just optimizing the law of the labels conditioned on the observations. We give the first theoretical study of such methods, by specializing to logistic classification probabilities. We prove that if the observations are generated from a two-component isotropic Gaussian mixture, then minimizing the entropy risk over a Euclidean ball indeed allows to identify the separation vector of the mixture. Furthermore, if this separation vector is sparse, then penalizing the empirical risk by a $\ell_{1}$-regularization term allows to infer the separation in a high-dimensional space and to recover its support, at standard rates of sparsity problems. Our approach is based on the local convexity of the logistic entropy risk, that occurs if the separation vector is large enough, with a condition on its norm that is independent from the space dimension. This local convexity property also guarantees fast rates in a classical, low-dimensional setting.
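所研究的估计量可以用一个玩具草图演示(本文自拟的简化投影梯度版本,并非论文算法;球半径、步长等均为示例):在两组分各向同性高斯混合上,于欧氏球内最小化 sigma(w'x) 的经验熵,检验 w 是否与分离向量对齐:

```python
import numpy as np

rng = np.random.default_rng(7)
d, n = 5, 2000
mu = np.zeros(d); mu[0] = 3.0                  # 分离向量
labels = rng.integers(0, 2, n)
X = rng.normal(size=(n, d)) + np.where(labels[:, None] == 1, mu, -mu)

def entropy_grad(w, X):
    """sigma(w'x) 的平均二元熵关于 w 的梯度。"""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return -((X @ w) * p * (1 - p)) @ X / len(X)

R = 5.0                                        # 欧氏球半径
w = rng.normal(size=d) * 0.1
for _ in range(500):
    w = w - 0.5 * entropy_grad(w, X)           # 梯度下降降低熵
    norm = np.linalg.norm(w)
    if norm > R:
        w = w * (R / norm)                     # 投影回球内

cos = abs(w @ mu) / (np.linalg.norm(w) * np.linalg.norm(mu))
```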

【13】 Model-free Bootstrap Prediction Regions for Multivariate Time Series 标题:多变量时间序列的无模型Bootstrap预测域 链接:https://arxiv.org/abs/2112.08671

作者:Yiren Wang,Dimitris N. Politis 备注:This is an initial version of the paper. A generalization to our setting is under investigation 摘要:Das和Politis(2020)提出了一种无模型自举(MFB)范式,用于生成单变量(局部)平稳时间序列的预测区间。Wang和Politis(2019)在平稳性和弱依赖条件下解决了该算法的理论保证。根据这项工作,我们在多变量时间序列设置下扩展了预测推断的MFB。我们描述了两种算法,第一种算法适用于任意维d下的一类时间序列;第二种方法适用于低维环境下更广义的时间序列。我们通过理论有效性和模拟性能来证明我们的程序。 摘要:In Das and Politis(2020), a model-free bootstrap(MFB) paradigm was proposed for generating prediction intervals of univariate, (locally) stationary time series. Theoretical guarantees for this algorithm was resolved in Wang and Politis(2019) under stationarity and weak dependence condition. Following this line of work, here we extend MFB for predictive inference under a multivariate time series setup. We describe two algorithms, the first one works for a particular class of time series under any fixed dimension d; the second one works for a more generalized class of time series under low-dimensional setting. We justify our procedure through theoretical validity and simulation performance.

【14】 On Gibbs Sampling for Structured Bayesian Models Discussion of paper by Zanella and Roberts 标题:关于结构贝叶斯模型的Gibbs抽样Zanella和Roberts的论文讨论 链接:https://arxiv.org/abs/2112.08641

作者:Xiaodong Yang,Jun S. Liu 备注:18 pages 摘要:本文讨论了Zanella和Roberts的论文:多级线性模型、gibbs采样器和多重网格分解。我们考虑几个扩展,其中多重网格分解将给我们带来有趣的见解,包括向量分层模型,线性混合效应模型和部分定心参数化。 摘要:This article is a discussion of Zanella and Roberts' paper: Multilevel linear models, gibbs samplers and multigrid decompositions. We consider several extensions in which the multigrid decomposition would bring us interesting insights, including vector hierarchical models, linear mixed effects models and partial centering parametrizations.

【15】 A model sufficiency test using permutation entropy 标题:一种基于排列熵的模型充分性检验 链接:https://arxiv.org/abs/2112.08636

作者:Xin Huang,Han Lin Shang,David Pitt 备注:32 pages, 5 figures, to appear at the Journal of Forecasting 摘要:利用排列熵中的有序模式概念,我们提出了一个模型充分性检验来研究给定模型的点预测精度。与一些经典的模型充分性测试(如Broock et al.(1996)测试)相比,我们的建议不需要足够的模型来消除估计残差中显示的所有结构。当调查数据的基本动力学创新显示出某种结构时,例如更高的矩序列相关性,Broock等人(1996)的测试可能会导致关于点预测值充分性的错误结论。由于结构化创新,模型充分性测试和预测精度标准之间可能会出现不一致。我们的建议填补了模型和预测评估方法之间的不一致性,并且在基础过程具有非白色加性创新时仍然有效。 摘要:Using the ordinal pattern concept in permutation entropy, we propose a model sufficiency test to study a given model's point prediction accuracy. Compared to some classical model sufficiency tests, such as the Broock et al.'s (1996) test, our proposal does not require a sufficient model to eliminate all structures exhibited in the estimated residuals. When the innovations in the investigated data's underlying dynamics show a certain structure, such as higher-moment serial dependence, the Broock et al.'s (1996) test can lead to erroneous conclusions about the sufficiency of point predictors. Due to the structured innovations, inconsistency between the model sufficiency tests and prediction accuracy criteria can occur. Our proposal fills in this incoherence between model and prediction evaluation approaches and remains valid when the underlying process has non-white additive innovation.
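检验所依赖的"序模式"要素是自包含的,可以几行实现(标准的排列熵定义,嵌入阶数 m 为示例参数;这只是该要素本身,并非论文的充分性检验统计量):阶数为 m 的排列熵统计 m 个连续值的排序模式频率并取其香农熵,单调序列为 0,完全无序时达到 log(m!):

```python
import numpy as np
from collections import Counter
from math import log, factorial

def permutation_entropy(x, m=3):
    """长度 m 窗口的序模式频率的香农熵。"""
    patterns = Counter(
        tuple(np.argsort(x[i:i + m])) for i in range(len(x) - m + 1)
    )
    total = sum(patterns.values())
    probs = [c / total for c in patterns.values()]
    return -sum(p * log(p) for p in probs)

rng = np.random.default_rng(8)
pe_trend = permutation_entropy(np.arange(200.0))      # 单调序列: 熵为 0
pe_noise = permutation_entropy(rng.normal(size=200))  # 白噪声: 熵接近最大
max_pe = log(factorial(3))
```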

【16】 A New Model-free Prediction Method: GA-NoVaS 标题:一种新的无模型预测方法:GA-NOVAS 链接:https://arxiv.org/abs/2112.08601

作者:Kejin Wu,Sayar Karmakar 备注:arXiv admin note: substantial text overlap with arXiv:2101.02273 摘要:波动率预测在金融计量经济学中占有重要地位。以往的研究主要基于各种GARCH模型的应用。然而,人们很难选择一个适用于一般情况的特定GARCH模型,而且这种传统的方法对于处理高波动周期或使用小样本量是不稳定的。新提出的归一化和方差稳定(NoVaS)方法是一种更稳健、更精确的预测技术。这种无模型方法是利用基于ARCH模型的逆变换建立的。受从ARCH到GARCH模型历史发展的启发,我们提出了一种利用GARCH模型结构的NoVaS型方法。通过进行广泛的数据分析,我们发现我们的模型在预测短期和波动性数据方面比当前最先进的NoVaS方法具有更好的时间聚集预测性能。我们新方法的优胜表现印证了这一点,也开辟了探索其他NoVaS结构以改进现有结构或解决特定预测问题的途径。 摘要:Volatility forecasting plays an important role in the financial econometrics. Previous works in this regime are mainly based on applying various GARCH-type models. However, it is hard for people to choose a specific GARCH model which works for general cases and such traditional methods are unstable for dealing with high-volatile period or using small sample size. The newly proposed normalizing and variance stabilizing (NoVaS) method is a more robust and accurate prediction technique. This Model-free method is built by taking advantage of an inverse transformation which is based on the ARCH model. Inspired by the historic development of the ARCH to GARCH model, we propose a novel NoVaS-type method which exploits the GARCH model structure. By performing extensive data analysis, we find our model has better time-aggregated prediction performance than the current state-of-the-art NoVaS method on forecasting short and volatile data. The victory of our new method corroborates that and also opens up avenues where one can explore other NoVaS structures to improve on the existing ones or solve specific prediction problems.
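NoVaS 的"归一化与方差稳定"思想可以用一个等权简化版示意(这是本文自拟的最简版本,并非 Wu & Karmakar 的 GARCH 型方法;GARCH 仿真参数也是示例):用当前及过去 p 个平方观测的均方根去除收益,得到更接近正态、尾部更轻的序列:

```python
import numpy as np

rng = np.random.default_rng(12)

# 仿真一条厚尾的 GARCH(1,1) 型序列
n, omega, a, b = 3000, 0.1, 0.2, 0.75
x = np.zeros(n)
sig2 = omega / (1 - a - b)
for t in range(1, n):
    sig2 = omega + a * x[t - 1] ** 2 + b * sig2
    x[t] = np.sqrt(sig2) * rng.normal()

def novas_transform(x, p=10):
    """等权 NoVaS: 用当前与过去 p 个平方观测的均方根归一化 x_t。"""
    w = np.empty(len(x) - p)
    for t in range(p, len(x)):
        w[t - p] = x[t] / np.sqrt(np.mean(x[t - p:t + 1] ** 2))
    return w

def kurtosis(z):
    z = z - z.mean()
    return np.mean(z ** 4) / np.mean(z ** 2) ** 2

w = novas_transform(x)   # 变换后序列尾部明显变轻
```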

【17】 Simultaneous Sieve Inference for Time-Inhomogeneous Nonlinear Time Series Regression 标题:时间非齐次非线性时间序列回归的同时筛选推断 链接:https://arxiv.org/abs/2112.08545

作者:Xiucai Ding,Zhou Zhou 备注:57 pages, 8 figures 摘要:本文考虑一类一般平稳时间序列的时间非齐次非线性时间序列回归问题。一方面,我们提出了时变回归函数的筛非参数估计,它可以达到最小-最大最优速率。另一方面,我们发展了一个统一的同时推理理论,可以用来对函数进行结构和精确形式的检验。即使在局部较弱的情况下,我们提出的统计数据也很强大。我们还提出了一个实际实现的乘数引导程序。我们的方法和理论不需要对回归函数进行任何结构假设,我们还允许在无界域中支持函数。我们还建立了无界域中二维函数的筛近似理论,以及高维局部平稳时间序列的仿射和二次型的高斯近似结果,这些结果可能是独立的。数值模拟和真实的财务数据分析支持我们的结果。 摘要:In this paper, we consider the time-inhomogeneous nonlinear time series regression for a general class of locally stationary time series. On one hand, we propose sieve nonparametric estimators for the time-varying regression functions which can achieve the min-max optimal rate. On the other hand, we develop a unified simultaneous inferential theory which can be used to conduct both structural and exact form testings on the functions. Our proposed statistics are powerful even under locally weak alternatives. We also propose a multiplier bootstrapping procedure for practical implementation. Our methodology and theory do not require any structural assumptions on the regression functions and we also allow the functions to be supported in an unbounded domain. We also establish sieve approximation theory for 2-D functions in unbounded domain and a Gaussian approximation result for affine and quadratic forms for high dimensional locally stationary time series, which can be of independent interest. Numerical simulations and a real financial data analysis are provided to support our results.

【18】 The Impact of TV Advertising on Website Traffic 标题:电视广告对网站流量的影响 链接:https://arxiv.org/abs/2112.08530

作者:Lukáš Veverka,Vladimír Holý 摘要:我们提出了一个模型程序,用于估计电视广告的即时反应,并评估影响广告大小的因素。首先,我们使用核平滑方法捕获网站访问的日和季节模式。其次,我们使用最大似然法估计广告发布后网站访问量会逐渐增加。第三,我们使用随机森林方法分析网站访问量估计增长对广告特征的非线性依赖性。所提出的方法适用于一个数据集,该数据集包含2019年一家电子商务公司每分钟的有机网站访问量和电视广告的详细特征。结果表明,人们确实愿意在屏幕和多任务之间切换。此外,一天中的时间、电视频道和广告动机在广告的影响中扮演着重要角色。基于这些结果,营销人员可以量化单个广告点的回报,评估额外付费广告选项(如溢价位置),并优化购买过程。 摘要:We propose a modeling procedure for estimating immediate responses to TV ads and evaluating the factors influencing their size. First, we capture diurnal and seasonal patterns of website visits using the kernel smoothing method. Second, we estimate a gradual increase in website visits after an ad using the maximum likelihood method. Third, we analyze the non-linear dependence of the estimated increase in website visits on characteristics of the ads using the random forest method. The proposed methodology is applied to a dataset containing minute-by-minute organic website visits and detailed characteristics of TV ads for an e-commerce company in 2019. The results show that people are indeed willing to switch between screens and multitask. Moreover, the time of the day, the TV channel, and the advertising motive play a great role in the impact of the ads. Based on these results, marketers can quantify the return on a single ad spot, evaluate the extra-paid ad options (such as a premium position), and optimize the buying process.
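三步流程中的第一步可以简单示意(本文自拟的非周期化简化版本,并非论文的具体实现;带宽与数据均为示例):对逐分钟访问量做高斯核的 Nadaraya-Watson 平滑,以恢复日内模式:

```python
import numpy as np

def nw_smooth(t_grid, t_obs, y_obs, bandwidth=30.0):
    """高斯核 Nadaraya-Watson 平滑; 时间单位为一天内的分钟数。"""
    w = np.exp(-0.5 * ((t_grid[:, None] - t_obs[None, :]) / bandwidth) ** 2)
    return (w @ y_obs) / w.sum(axis=1)

rng = np.random.default_rng(9)
t = np.arange(1440.0)                                # 一天, 按分钟
baseline = 100 + 50 * np.sin(2 * np.pi * t / 1440)   # 真实日内形状
visits = rng.poisson(baseline)                       # 含噪访问计数
smooth = nw_smooth(t, t, visits.astype(float))
err = np.abs(smooth - baseline)[100:-100].max()      # 忽略两端边界效应
```

实际的日内/季节模式是周期性的,严格做法应使用循环核或把跨日邻近分钟也计入;这里为保持草图简短而省略。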

【19】 A Targeted Approach to Confounder Selection for High-Dimensional Data 标题:一种针对高维数据的混杂因素选择方法 链接:https://arxiv.org/abs/2112.08495

作者:Asad Haris,Robert Platt 摘要:我们考虑从潜在的大协变量集合中选择混杂因子的问题,当估计因果效应时。最近,高维倾向评分(hdPS)方法被开发用于此任务;hdPS通过估计每个变量的重要性得分对潜在混杂因素进行排序,并选择前几个变量。然而,这种排序过程是有限的:它要求所有变量都是二进制的。我们建议将hdPS扩展到一般类型的反应和混杂变量。我们进一步发展了一个群体重要性评分,允许我们对潜在混杂因素的群体进行排序。主要的挑战是我们的参数需要倾向评分或反应模型;两者都容易受到模型错误指定的影响。我们提出了一种目标最大似然估计(TMLE),它允许使用非参数机器学习工具来拟合这些中间模型。我们建立了估计量的渐近正态性,从而可以构造置信区间。我们用模拟和真实数据的数值研究来补充我们的工作。 摘要:We consider the problem of selecting confounders for adjustment from a potentially large set of covariates, when estimating a causal effect. Recently, the high-dimensional Propensity Score (hdPS) method was developed for this task; hdPS ranks potential confounders by estimating an importance score for each variable and selects the top few variables. However, this ranking procedure is limited: it requires all variables to be binary. We propose an extension of the hdPS to general types of response and confounder variables. We further develop a group importance score, allowing us to rank groups of potential confounders. The main challenge is that our parameter requires either the propensity score or response model; both vulnerable to model misspecification. We propose a targeted maximum likelihood estimator (TMLE) which allows the use of nonparametric, machine learning tools for fitting these intermediate models. We establish asymptotic normality of our estimator, which consequently allows constructing confidence intervals. We complement our work with numerical studies on simulated and real data.

【20】 On Generalization and Computation of Tukey's Depth: Part II 标题:关于Tukey深度的推广和计算:第二部分 链接:https://arxiv.org/abs/2112.08478

作者:Yiyuan She,Shao Tang,Jingze Liu 摘要:本文研究了如何将Tukey深度推广到约束空间中可能是弯曲的或有边界的问题,以及具有不可微目标的问题。首先,使用流形方法,我们为定义在黎曼流形上的光滑问题提出了一类广泛的黎曼深度,并展示了它在球面数据分析、主成分分析和多元正交回归中的应用。此外,对于非光滑问题,我们引入额外的松弛变量和不等式约束来定义一种新的松弛数据深度,它可以对稀疏学习和降秩回归产生的估计量进行中心向外的排序。实际数据示例说明了一些建议的数据深度的有用性。 摘要:This paper studies how to generalize Tukey's depth to problems defined in a restricted space that may be curved or have boundaries, and to problems with a nondifferentiable objective. First, using a manifold approach, we propose a broad class of Riemannian depth for smooth problems defined on a Riemannian manifold, and showcase its applications in spherical data analysis, principal component analysis, and multivariate orthogonal regression. Moreover, for nonsmooth problems, we introduce additional slack variables and inequality constraints to define a novel slacked data depth, which can perform center-outward rankings of estimators arising from sparse learning and reduced rank regression. Real data examples illustrate the usefulness of some proposed data depths.

【21】 On Generalization and Computation of Tukey's Depth: Part I 标题:关于Tukey深度的推广和计算:第一部分 链接:https://arxiv.org/abs/2112.08475

作者:Yiyuan She,Shao Tang,Jingze Liu 摘要:Tukey的深度为非参数推断和估计提供了强大的工具,但在现代统计数据分析中也遇到了严重的计算和方法困难。本文研究如何在多维空间中推广和计算Tukey型深度。介绍了影响驱动的抛光子空间深度的一般框架,强调了潜在影响空间和差异度量的重要性。新的矩阵公式使我们能够利用最先进的优化技术开发易于实现且保证快速收敛的可伸缩算法。特别是,在大量实验的支持下,现在可以比以前更快地计算半空间深度和回归深度。本期刊同一期还向读者提供了一篇配套论文。 摘要:Tukey's depth offers a powerful tool for nonparametric inference and estimation, but also encounters serious computational and methodological difficulties in modern statistical data analysis. This paper studies how to generalize and compute Tukey-type depths in multi-dimensions. A general framework of influence-driven polished subspace depth, which emphasizes the importance of the underlying influence space and discrepancy measure, is introduced. The new matrix formulation enables us to utilize state-of-the-art optimization techniques to develop scalable algorithms with implementation ease and guaranteed fast convergence. In particular, half-space depth as well as regression depth can now be computed much faster than previously possible, with the support from extensive experiments. A companion paper is also offered to the reader in the same issue of this journal.
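作为背景,被推广与加速的对象本身可以用随机投影粗略近似(这只是说明定义的简单近似,并非论文的优化算法;方向数与数据均为示例):点 x 的 Tukey 半空间深度是经过 x 的所有闭半空间中样本所占比例的最小值:

```python
import numpy as np

def halfspace_depth(x, X, n_dir=500, rng=None):
    """随机方向近似半空间深度: 各方向上 x 某一侧样本比例的最小值。"""
    if rng is None:
        rng = np.random.default_rng(0)
    U = rng.normal(size=(n_dir, X.shape[1]))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    proj = (X - x) @ U.T                     # (样本数, 方向数)
    frac_pos = (proj >= 0).mean(axis=0)
    frac_neg = (proj <= 0).mean(axis=0)
    return np.minimum(frac_pos, frac_neg).min()

rng = np.random.default_rng(10)
X = rng.normal(size=(500, 2))
d_center = halfspace_depth(X.mean(axis=0), X)         # 中心点: 深度高
d_outlier = halfspace_depth(np.array([5.0, 5.0]), X)  # 远点: 深度近 0
```

这种逐方向枚举在高维下很快失效,这正是论文用矩阵化形式与现代优化技术加速计算的动机。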

【22】 Gaining Outlier Resistance with Progressive Quantiles: Fast Algorithms and Theoretical Studies 标题:用递进分位数获得异常值抵抗:快速算法和理论研究 链接:https://arxiv.org/abs/2112.08471

作者:Yiyuan She,Zhifeng Wang,Jiahui Shen 摘要:异常值广泛存在于大数据应用中,可能严重影响统计估计和推断。本文提出了一种抗离群点估计的框架,用于对任意给定的损失函数进行鲁棒化。它与修剪(trimming)方法密切相关,并为所有样本引入显式的outlyingness参数,这反过来有助于计算、理论和参数调整。为了解决非凸性和非光滑性问题,我们开发了易于实现且保证快速收敛的可伸缩算法。特别是,提出了一种新技术来减轻对起点的要求,从而在常规数据集上可以显著减少数据重采样的次数。基于统计与计算相结合的处理方法,我们能够进行超越M-估计的非渐近(nonasymptotic)分析。所得到的抗差估计量虽然不一定是全局最优甚至局部最优的,但在低维和高维下均具有极小极大速率最优性。在回归、分类和神经网络中的实验表明,该方法在出现粗大异常值时具有优异的性能。 摘要:Outliers widely occur in big-data applications and may severely affect statistical estimation and inference. In this paper, a framework of outlier-resistant estimation is introduced to robustify an arbitrarily given loss function. It has a close connection to the method of trimming and includes explicit outlyingness parameters for all samples, which in turn facilitates computation, theory, and parameter tuning. To tackle the issues of nonconvexity and nonsmoothness, we develop scalable algorithms with implementation ease and guaranteed fast convergence. In particular, a new technique is proposed to alleviate the requirement on the starting point such that on regular datasets, the number of data resamplings can be substantially reduced. Based on combined statistical and computational treatments, we are able to perform nonasymptotic analysis beyond M-estimation. The obtained resistant estimators, though not necessarily globally or even locally optimal, enjoy minimax rate optimality in both low dimensions and high dimensions. Experiments in regression, classification, and neural networks show excellent performance of the proposed methodology at the occurrence of gross outliers.
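下面用一个一维位置估计的玩具例子,示意"为每个样本引入显式outlyingness参数并对残差做阈值化"的一般思路:交替地(1)在扣除outlyingness后的数据上估计位置参数,(2)对残差做硬阈值更新outlyingness。这只是示意性草图,并非该文的完整算法;阈值倍数、迭代次数等均为任意假设:

```python
import numpy as np

def resistant_mean(y, threshold=3.0, n_iter=50):
    """Toy outlier-resistant location estimation with explicit
    outlyingness parameters gamma_i (one per sample).

    Alternates between (1) estimating the location mu from the adjusted
    data y - gamma and (2) updating gamma by hard-thresholding residuals:
    samples whose residual exceeds `threshold` robust-scale units absorb
    their own residual into gamma and stop influencing mu."""
    gamma = np.zeros_like(y)
    mu = np.median(y)
    for _ in range(n_iter):
        resid = y - mu
        # MAD-based robust scale estimate of the residuals
        scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))
        gamma = np.where(np.abs(resid) > threshold * scale, resid, 0.0)
        mu = np.mean(y - gamma)
    return mu, gamma

rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 95), np.full(5, 20.0)])  # 5 gross outliers
mu_hat, gamma_hat = resistant_mean(y)   # close to 0; plain mean is pulled to ~1
```

非零的gamma_hat分量直接标出了被判定为离群的样本,这对应摘要中"outlyingness参数有助于计算与调参"的说法。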

【23】 Characterization of causal ancestral graphs for time series with latent confounders 标题:含潜在混杂因素的时间序列因果祖图的刻画 链接:https://arxiv.org/abs/2112.08417

作者:Andreas Gerhardus 备注:55 pages (including appendix), 16 figures 摘要:推广有向最大祖先图,我们引入了一类图形模型,用于表示具有未观测变量的多元时间序列的有限多个定期采样和定期次采样时间步之间的时滞特定因果关系和独立性。我们完全描述了这些图,并表明它们包含的约束超出了先前文献中考虑的约束。这允许在没有附加假设的情况下进行更强的因果推断。在有向部分祖先图的推广中,我们进一步介绍了新类型图的马尔可夫等价类的图形表示,并表明它们比当前最先进的因果发现算法所学的知识更丰富。我们还分析了通过增加观察到的时间步数获得的附加信息。 摘要:Generalizing directed maximal ancestral graphs, we introduce a class of graphical models for representing time lag specific causal relationships and independencies among finitely many regularly sampled and regularly subsampled time steps of multivariate time series with unobserved variables. We completely characterize these graphs and show that they entail constraints beyond those that have previously been considered in the literature. This allows for stronger causal inferences without having imposed additional assumptions. In generalization of directed partial ancestral graphs we further introduce a graphical representation of Markov equivalence classes of the novel type of graphs and show that these are more informative than what current state-of-the-art causal discovery algorithms learn. We also analyze the additional information gained by increasing the number of observed time steps.

【24】 Non-Gaussian Component Analysis via Lattice Basis Reduction 标题:基于格基约简的非高斯分量分析 链接:https://arxiv.org/abs/2112.09104

作者:Ilias Diakonikolas,Daniel M. Kane 摘要:非高斯成分分析(NGCA)是如下分布学习问题:给定$\mathbb{R}^d$上某分布的i.i.d.样本,该分布在隐藏方向$v$上是非高斯的,在正交方向上是独立的标准高斯,目标是近似隐藏方向$v$。先前的工作\cite{DKS17-sq}提供了正式证据,证明在一元非高斯分布$A$满足适当矩匹配条件时,NGCA存在信息-计算权衡。当分布$A$是离散时,后一结果不适用。一个自然的问题是,在这种情形下信息-计算权衡是否仍然存在。本文对这个问题给出了否定回答:在明确定义的技术意义下,当$A$是离散或近似离散时,我们给出了NGCA的样本和计算均高效的算法。我们算法中使用的关键工具是用于格基约简的LLL方法\cite{LLL82}。 摘要:Non-Gaussian Component Analysis (NGCA) is the following distribution learning problem: Given i.i.d. samples from a distribution on $\mathbb{R}^d$ that is non-Gaussian in a hidden direction $v$ and an independent standard Gaussian in the orthogonal directions, the goal is to approximate the hidden direction $v$. Prior work \cite{DKS17-sq} provided formal evidence for the existence of an information-computation tradeoff for NGCA under appropriate moment-matching conditions on the univariate non-Gaussian distribution $A$. The latter result does not apply when the distribution $A$ is discrete. A natural question is whether information-computation tradeoffs persist in this setting. In this paper, we answer this question in the negative by obtaining a sample and computationally efficient algorithm for NGCA in the regime that $A$ is discrete or nearly discrete, in a well-defined technical sense. The key tool leveraged in our algorithm is the LLL method \cite{LLL82} for lattice basis reduction.

【25】 Influence of Pedestrian Collision Warning Systems on Driver Behavior: A Driving Simulator Study 标题:行人碰撞预警系统对驾驶员驾驶行为影响的驾驶模拟器研究 链接:https://arxiv.org/abs/2112.09074

作者:Snehanshu Banerjee,Mansoureh Jeihani,Nashid K Khadem,Md. Muhib Kabir 备注:8 figures and 4 tables 摘要:随着网联与自动车辆(CAV)技术的出现,越来越需要评估驾驶员使用此类技术时的行为。在这项首创性研究中,我们在驾驶模拟器环境中引入了采用CAV技术的行人碰撞预警(PCW)系统,以评估驾驶员在行人横穿马路时的制动行为。本研究共招募了93名来自不同社会经济背景的参与者,并为其创建了巴尔的摩市中心的虚拟路网,同时使用眼动仪观察分心和头部运动。分析采用对数逻辑(log-logistic)加速失效时间(AFT)模型计算减速时间,即从行人进入视野到车辆降至最低速度、让行人通过所经历的时间。PCW系统的存在显著影响了减速时间和减速率:它增加了前者并降低了后者,表明该系统能通过大幅降低车速促成有效的驾驶操作。我们进行了急动度(jerk)分析,以考察制动和加速的突然程度。注视分析表明,该系统能够吸引驾驶员的注意力,因为大多数驾驶员都注意到了显示的警告。驾驶员对路线和网联车辆的熟悉程度会缩短减速时间;性别也有显著影响:男性往往有更长的减速时间,即有更多时间从容制动并让行人通过。 摘要:With the advent of connected and automated vehicle (CAV) technology, there is an increasing need to evaluate driver behavior while using such technology. In this first of a kind study, a pedestrian collision warning (PCW) system using CAV technology, was introduced in a driving simulator environment, to evaluate driver braking behavior, in the presence of a jaywalking pedestrian. A total of 93 participants from diverse socio-economic backgrounds were recruited for this study, for which a virtual network of downtown Baltimore was created. An eye tracking device was also used to observe distractions and head movements. A Log logistic accelerated failure time (AFT) distribution model was used for this analysis, to calculate speed reduction times; time from the moment the pedestrian becomes visible, to the point where a minimum speed was reached, to allow the pedestrian to pass. The presence of the PCW system significantly impacted the speed reduction time and deceleration rate, as it increased the former and reduced the latter, which proves the effectiveness of this system in providing an effective driving maneuver, by drastically reducing speed. A jerk analysis is conducted to analyze the suddenness of braking and acceleration. Gaze analysis showed that the system was able to attract the attention of the drivers, as the majority of the drivers noticed the displayed warning.
The familiarity of the driver with the route and connected vehicles reduces the speed reduction time; gender also can have a significant impact as males tend to have longer speed reduction time, i.e. more time to comfortably brake and allow the pedestrian to pass.
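该研究用对数逻辑AFT模型拟合减速时间。SciPy中的`fisk`分布即log-logistic分布,下面在合成数据上演示这一拟合;数据为虚构,并非该研究的实验数据,协变量(如是否有PCW系统)在完整AFT模型中通常通过尺度参数进入,此处仅演示无协变量的分布拟合:

```python
import numpy as np
from scipy import stats

# Synthetic speed-reduction times in seconds -- the study's data are not
# public, so these are illustrative draws from a log-logistic distribution.
rng = np.random.default_rng(0)
times = stats.fisk.rvs(c=3.0, scale=4.0, size=500, random_state=rng)

# scipy's `fisk` *is* the log-logistic distribution; fixing loc=0 matches
# the usual AFT parameterization (shape c, scale = exp(linear predictor),
# where covariates such as PCW presence would enter the linear predictor).
c_hat, loc0, scale_hat = stats.fisk.fit(times, floc=0)

# For the log-logistic law the median equals the scale parameter,
# a convenient summary of the typical speed-reduction time.
median_time = stats.fisk.median(c_hat, loc=0, scale=scale_hat)
```

log-logistic相比Weibull等备选,允许风险函数先升后降,这常被用作选择该分布族的理由。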

【26】 Deep Reinforcement Learning Policies Learn Shared Adversarial Features Across MDPs 标题:深度强化学习策略学习跨MDP的共享对抗性特征 链接:https://arxiv.org/abs/2112.09025

作者:Ezgi Korkmaz 备注:Published in AAAI 2022 摘要:深度神经网络作为函数逼近器的使用,在强化学习算法和应用方面取得了显著的进展。然而,我们对决策边界几何学和神经策略的损失情况的了解仍然相当有限。在本文中,我们提出了一个框架来研究各州和MDP之间的决策边界和损失景观相似性。我们在Arcade学习环境中的各种游戏中进行实验,发现神经策略的高灵敏度方向在MDP中是相关的。我们认为,这些高灵敏度方向支持以下假设:强化学习代理的训练环境中共享非稳健特征。我们相信,我们的研究结果揭示了深度强化学习训练环境的基本属性,为构建健壮可靠的深度强化学习代理迈出了切实的一步。 摘要:The use of deep neural networks as function approximators has led to striking progress for reinforcement learning algorithms and applications. Yet the knowledge we have on decision boundary geometry and the loss landscape of neural policies is still quite limited. In this paper we propose a framework to investigate the decision boundary and loss landscape similarities across states and across MDPs. We conduct experiments in various games from Arcade Learning Environment, and discover that high sensitivity directions for neural policies are correlated across MDPs. We argue that these high sensitivity directions support the hypothesis that non-robust features are shared across training environments of reinforcement learning agents. We believe our results reveal fundamental properties of the environments used in deep reinforcement learning training, and represent a tangible step towards building robust and reliable deep reinforcement learning agents.
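"高灵敏度方向"的含义可以用一个线性softmax"策略"的玩具例子示意:即策略输出对状态扰动变化最快的方向(此处用有限差分梯度近似)。该文研究的是Atari上的深度神经策略及其跨MDP的相关性,下面的线性模型与随机权重仅为说明概念的假设:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def sensitivity_direction(W, s, eps=1e-5):
    """Finite-difference gradient of the greedy action's probability with
    respect to the state, normalized: the direction in which a small state
    perturbation changes the policy's action preference fastest."""
    a = int(np.argmax(softmax(W @ s)))      # greedy action at state s
    grad = np.zeros_like(s)
    for i in range(s.size):
        sp, sm = s.copy(), s.copy()
        sp[i] += eps
        sm[i] -= eps
        grad[i] = (softmax(W @ sp)[a] - softmax(W @ sm)[a]) / (2 * eps)
    return grad / np.linalg.norm(grad)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))      # toy linear "policy": 4-dim state, 3 actions
s = rng.normal(size=4)
u = sensitivity_direction(W, s)  # unit-norm high-sensitivity direction
```

该文的发现相当于:对深度策略而言,这类方向在不同MDP的策略之间显著相关,提示训练环境共享非稳健特征。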

【27】 Intelli-Paint: Towards Developing Human-like Painting Agents 标题:Intelli-Paint:迈向开发类人绘画智能体 链接:https://arxiv.org/abs/2112.08930

作者:Jaskirat Singh,Cameron Smith,Jose Echevarria,Liang Zheng 摘要:生成设计良好的艺术品通常非常耗时,并且假定人类画家具有高度的熟练程度。为了促进人类的绘画过程,已经在教机器如何“像人类一样绘画”方面进行了大量的研究,然后使用经过训练的代理作为人类用户的绘画辅助工具。然而,当前这方面的研究通常依赖于基于网格的渐进式分割策略,其中代理将整个图像分割为连续的更精细网格,然后并行绘制每个网格。这不可避免地导致人工绘画序列,人类用户不容易理解。为了解决这个问题,我们提出了一种新的绘画方法,它可以学习生成输出画布,同时展示更人性化的绘画风格。建议的绘制管道Intelli Paint由1)渐进分层策略组成,该策略允许代理首先绘制自然背景场景表示,然后以渐进方式添加每个前景对象。2) 我们还介绍了一种新的顺序笔画引导策略,它可以帮助绘画代理以语义感知的方式在不同的图像区域之间转移注意力。3) 最后,我们提出了一种笔画规则化策略,该策略允许所需笔画总数减少约60-80%,而生成画布的质量没有任何明显差异。通过定量和定性结果,我们表明,生成的代理不仅提高了输出画布生成的效率,而且展示了更自然的绘画风格,这将更好地帮助人类用户通过数字艺术品表达他们的想法。 摘要:The generation of well-designed artwork is often quite time-consuming and assumes a high degree of proficiency on part of the human painter. In order to facilitate the human painting process, substantial research efforts have been made on teaching machines how to "paint like a human", and then using the trained agent as a painting assistant tool for human users. However, current research in this direction is often reliant on a progressive grid-based division strategy wherein the agent divides the overall image into successively finer grids, and then proceeds to paint each of them in parallel. This inevitably leads to artificial painting sequences which are not easily intelligible to human users. To address this, we propose a novel painting approach which learns to generate output canvases while exhibiting a more human-like painting style. The proposed painting pipeline Intelli-Paint consists of 1) a progressive layering strategy which allows the agent to first paint a natural background scene representation before adding in each of the foreground objects in a progressive fashion. 2) We also introduce a novel sequential brushstroke guidance strategy which helps the painting agent to shift its attention between different image regions in a semantic-aware manner. 
3) Finally, we propose a brushstroke regularization strategy which allows for ~60-80% reduction in the total number of required brushstrokes without any perceivable differences in the quality of the generated canvases. Through both quantitative and qualitative results, we show that the resulting agents not only show enhanced efficiency in output canvas generation but also exhibit a more natural-looking painting style which would better assist human users express their ideas through digital artwork.

【28】 Graph-wise Common Latent Factor Extraction for Unsupervised Graph Representation Learning 标题:用于无监督图表示学习的图式公共潜在因子提取 链接:https://arxiv.org/abs/2112.08830

作者:Thilini Cooray,Ngai-Man Cheung 备注:Accepted to AAAI 2022 摘要:无监督图级表示学习在分子性质预测和社区分析等各种任务中起着至关重要的作用,尤其是在数据注释费用昂贵的情况下。目前,大多数性能最好的图嵌入方法都基于Infomax原理。这些方法的性能在很大程度上取决于负样本的选择,如果样本选择不当,性能就会受损。如果用于相似性匹配的图集合质量较低,基于图间相似性的方法同样会受到影响。为了解决这个问题,我们只利用当前输入图进行嵌入学习。我们的动机来自对真实世界图生成过程的观察:图是基于一个或多个对图中所有元素都共同的全局因素形成的(例如,讨论串的主题、分子的溶解度水平)。我们假设提取这些共同因素可能非常有益。因此,本文提出了一种新的无监督图表示学习原理:图级公共潜在因子提取(GCFX)。我们进一步基于逆转上述图生成过程的思想提出了GCFX的深度模型deepGCFX,它可以显式地从输入图中提取公共潜在因子,并在下游任务上改进当前最先进的结果。通过大量实验和分析,我们证明:提取公共潜在因子不仅有助于图级任务减轻由单个节点或局部邻域的局部变化引起的干扰,也通过建模长程节点依赖使节点级任务受益,特别是对异配(disassortative)图。 摘要:Unsupervised graph-level representation learning plays a crucial role in a variety of tasks such as molecular property prediction and community analysis, especially when data annotation is expensive. Currently, most of the best-performing graph embedding methods are based on Infomax principle. The performance of these methods highly depends on the selection of negative samples and hurt the performance, if the samples were not carefully selected. Inter-graph similarity-based methods also suffer if the selected set of graphs for similarity matching is low in quality. To address this, we focus only on utilizing the current input graph for embedding learning. We are motivated by an observation from real-world graph generation processes where the graphs are formed based on one or more global factors which are common to all elements of the graph (e.g., topic of a discussion thread, solubility level of a molecule). We hypothesize extracting these common factors could be highly beneficial. Hence, this work proposes a new principle for unsupervised graph representation learning: Graph-wise Common latent Factor EXtraction (GCFX). We further propose a deep model for GCFX, deepGCFX, based on the idea of reversing the above-mentioned graph generation process which could explicitly extract common latent factors from an input graph and achieve improved results on downstream tasks to the current state-of-the-art. 
Through extensive experiments and analysis, we demonstrate that, while extracting common latent factors is beneficial for graph-level tasks to alleviate distractions caused by local variations of individual nodes or local neighbourhoods, it also benefits node-level tasks by enabling long-range node dependencies, especially for disassortative graphs.

【29】 A Statistics and Deep Learning Hybrid Method for Multivariate Time Series Forecasting and Mortality Modeling 标题:多元时间序列预测与死亡率建模的统计与深度学习混合方法 链接:https://arxiv.org/abs/2112.08618

作者:Thabang Mathonsi,Terence L. van Zyl 摘要:在预测任务以及量化预测的相关不确定性(预测区间)方面,混合方法的表现优于纯统计和纯深度学习方法。一个例子是指数平滑递归神经网络(ES-RNN),它是统计预测模型与递归神经网络变体的混合,在Makridakis-4预测竞赛中将绝对误差改善了9.4%。这一改进以及其他混合模型的类似优势此前主要只在单变量数据集上得到证明。将混合预测方法应用于多变量数据的困难包括:($i$)对非简约模型进行超参数调优涉及的高计算成本,($ii$)与数据固有自相关性相关的挑战,以及($iii$)协变量之间可能难以捕捉的复杂依赖(互相关)。本文提出了多元指数平滑长短时记忆网络(MES-LSTM),它是ES-RNN的广义多元扩展,克服了上述挑战。MES-LSTM采用向量化实现。我们在多个聚合的2019冠状病毒病(COVID-19)发病率数据集上测试MES-LSTM,发现我们的混合方法在预测精度和预测区间构造上均比纯统计和纯深度学习方法有一致且显著的改进。 摘要:Hybrid methods have been shown to outperform pure statistical and pure deep learning methods at forecasting tasks and quantifying the associated uncertainty with those forecasts (prediction intervals). One example is Exponential Smoothing Recurrent Neural Network (ES-RNN), a hybrid between a statistical forecasting model and a recurrent neural network variant. ES-RNN achieves a 9.4% improvement in absolute error in the Makridakis-4 Forecasting Competition. This improvement and similar outperformance from other hybrid models have primarily been demonstrated only on univariate datasets. Difficulties with applying hybrid forecast methods to multivariate data include ($i$) the high computational cost involved in hyperparameter tuning for models that are not parsimonious, ($ii$) challenges associated with auto-correlation inherent in the data, as well as ($iii$) complex dependency (cross-correlation) between the covariates that may be hard to capture. This paper presents Multivariate Exponential Smoothing Long Short Term Memory (MES-LSTM), a generalized multivariate extension to ES-RNN, that overcomes these challenges. MES-LSTM utilizes a vectorized implementation. We test MES-LSTM on several aggregated coronavirus disease of 2019 (COVID-19) morbidity datasets and find our hybrid approach shows consistent, significant improvement over pure statistical and deep learning methods at forecast accuracy and prediction interval construction.
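作为混合方法中"统计部分"的最小示意,下面实现简单指数平滑并用平滑水平对序列做归一化——ES-RNN/MES-LSTM这类混合模型的思路是让神经网络只需建模归一化后的残差结构。平滑系数与合成数据均为示意性假设,并非该文的具体配置:

```python
import numpy as np

def simple_exp_smooth(y, alpha=0.3):
    """Simple exponential smoothing level -- the 'statistics' half of an
    ES-RNN / MES-LSTM style hybrid.  The series is divided by its smoothed
    level so the neural half only has to model the normalized residual."""
    level = np.empty(len(y))
    level[0] = y[0]
    for t in range(1, len(y)):
        level[t] = alpha * y[t] + (1 - alpha) * level[t - 1]
    return level

rng = np.random.default_rng(0)
t = np.arange(200)
y = 50 + 0.5 * t + rng.normal(0, 2, 200)   # trending series with noise
level = simple_exp_smooth(y)
normalized = y / level                     # ~1-centered input for the NN
```

多元扩展(MES-LSTM)相当于对每个分量序列做这类平滑归一化,再由一个共享的LSTM建模互相关结构。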

【30】 Trading with the Momentum Transformer: An Intelligent and Interpretable Architecture 标题:与动量转换器交易:一种智能且可解释的架构 链接:https://arxiv.org/abs/2112.08534

作者:Kieran Wood,Sven Giegerich,Stephen Roberts,Stefan Zohren 摘要:深度学习体系结构,特别是深度动量网络(DMN)[1904.04912],已被发现是动量和均值回归交易的有效方法。然而,近年来的一些关键挑战涉及学习长期依赖性、在考虑扣除交易成本后的回报时的绩效下降,以及适应新的市场状态(regime),特别是在SARS-CoV-2危机期间。注意力机制或基于Transformer的架构是解决此类挑战的一种方法,因为它们允许网络关注过去的重要时间步和更长期的模式。我们引入动量Transformer(Momentum Transformer),这是一种基于注意力的架构,其性能优于基准,并且具有内在的可解释性,使我们能够更深入地洞察我们的深度学习交易策略。我们的模型是基于LSTM的DMN的扩展,后者通过按风险调整的绩效指标(如夏普比率)优化网络,直接输出头寸大小。我们发现注意力-LSTM混合、仅解码器(Decoder-Only)的时间融合Transformer(TFT)风格架构是性能最好的模型。在可解释性方面,我们观察到注意力模式中的显著结构,在动量转折点处出现明显的重要性峰值。 摘要:Deep learning architectures, specifically Deep Momentum Networks (DMNs) [1904.04912], have been found to be an effective approach to momentum and mean-reversion trading. However, some of the key challenges in recent years involve learning long-term dependencies, degradation of performance when considering returns net of transaction costs and adapting to new market regimes, notably during the SARS-CoV-2 crisis. Attention mechanisms, or Transformer-based architectures, are a solution to such challenges because they allow the network to focus on significant time steps in the past and longer-term patterns. We introduce the Momentum Transformer, an attention-based architecture which outperforms the benchmarks, and is inherently interpretable, providing us with greater insights into our deep learning trading strategy. Our model is an extension to the LSTM-based DMN, which directly outputs position sizing by optimising the network on a risk-adjusted performance metric, such as Sharpe ratio. We find an attention-LSTM hybrid Decoder-Only Temporal Fusion Transformer (TFT) style architecture is the best performing model. In terms of interpretability, we observe remarkable structure in the attention patterns, with significant peaks of importance at momentum turning points. 
The time series is thus segmented into regimes and the model tends to focus on previous time-steps in alike regimes. We find changepoint detection (CPD) [2105.13727], another technique for responding to regime change, can complement multi-headed attention, especially when we run CPD at multiple timescales. Through the addition of an interpretable variable selection network, we observe how CPD helps our model to move away from trading predominantly on daily returns data. We note that the model can intelligently switch between, and blend, classical strategies - basing its decision on patterns in the data.
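该模型通过直接优化风险调整后的绩效指标(如夏普比率)来输出头寸。下面给出这种损失函数的一个最小numpy草图;年化常数、合成收益序列均为假设,实际中它作为可微损失接入网络端到端训练:

```python
import numpy as np

def negative_sharpe(positions, returns, periods_per_year=252):
    """DMN-style training loss: the network outputs positions, and we
    minimize the negative annualized Sharpe ratio of the strategy returns
    (positions * asset returns)."""
    strat = positions * returns
    return -np.sqrt(periods_per_year) * strat.mean() / (strat.std() + 1e-8)

rng = np.random.default_rng(0)
r = rng.normal(0.0005, 0.01, 1000)      # synthetic daily asset returns
long_only = np.ones_like(r)
short_only = -np.ones_like(r)
loss_long = negative_sharpe(long_only, r)
loss_short = negative_sharpe(short_only, r)   # exactly the negation of loss_long
```

注意夏普比率对头寸的整体缩放不敏感(杠杆不变性),因此这种损失训练的是头寸的方向与相对大小,而非绝对杠杆。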

【31】 Algorithms for Adaptive Experiments that Trade-off Statistical Analysis with Reward: Combining Uniform Random Assignment and Reward Maximization 标题:统计分析与奖励折衷的适应性实验算法--均匀随机分配与奖励最大化相结合 链接:https://arxiv.org/abs/2112.08507

作者:Jacob Nogas,Tong Li,Fernando J. Yanez,Arghavan Modiri,Nina Deliu,Ben Prystawski,Sofia S. Villar,Anna Rafferty,Joseph J. Williams 摘要:像汤普森抽样这样的多臂bandit算法可以用来进行自适应实验:最大化奖励意味着利用数据逐步将更多参与者分配到更有效的臂上。这样的分配策略会增加统计假设检验的风险,即在两臂之间没有差异时错误地发现差异,而在确实存在差异时又无法得出结论。我们对双臂实验进行了模拟,探索了两种算法,它们将均匀随机化对统计分析的好处与汤普森抽样(TS)实现的奖励最大化的好处结合起来。第一种是Top-Two汤普森抽样,加入固定数量、随时间均匀分布的均匀随机分配(UR)。第二种是一种新的启发式算法,称为TS PostDiff(差异的后验概率)。TS PostDiff采用贝叶斯方法混合TS和UR:参与者被UR分配的概率等于两臂差异"小"(低于某个阈值)的后验概率,从而在几乎没有额外奖励可获得时进行更多的UR探索。我们发现TS PostDiff方法在多个效应量下都表现良好,因此无需根据对真实效应量的猜测进行调参。 摘要:Multi-armed bandit algorithms like Thompson Sampling can be used to conduct adaptive experiments, in which maximizing reward means that data is used to progressively assign more participants to more effective arms. Such assignment strategies increase the risk of statistical hypothesis tests identifying a difference between arms when there is not one, and failing to conclude there is a difference in arms when there truly is one. We present simulations for 2-arm experiments that explore two algorithms that combine the benefits of uniform randomization for statistical analysis, with the benefits of reward maximization achieved by Thompson Sampling (TS). First, Top-Two Thompson Sampling adds a fixed amount of uniform random allocation (UR) spread evenly over time. Second, a novel heuristic algorithm, called TS PostDiff (Posterior Probability of Difference). TS PostDiff takes a Bayesian approach to mixing TS and UR: the probability a participant is assigned using UR allocation is the posterior probability that the difference between two arms is `small' (below a certain threshold), allowing for more UR exploration when there is little or no reward to be gained. We find that TS PostDiff method performs well across multiple effect sizes, and thus does not require tuning based on a guess for the true effect size.
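TS PostDiff的核心一步可以很紧凑地示意:用蒙特卡洛估计两臂差异小于阈值的后验概率,以该概率做均匀随机分配,否则做一次汤普森抽样。以下是Beta-Bernoulli双臂情形的示意实现,先验、阈值与蒙特卡洛样本数均为任意假设,并非论文的具体设置:

```python
import numpy as np

def ts_postdiff_assign(successes, failures, epsilon=0.05, n_mc=4000, rng=None):
    """One assignment for a 2-arm Beta-Bernoulli bandit under the TS PostDiff
    heuristic: with probability P(|p0 - p1| < epsilon | data), assign an arm
    uniformly at random; otherwise assign by Thompson Sampling."""
    if rng is None:
        rng = np.random.default_rng()
    # Monte Carlo estimate of the posterior probability that the arm
    # difference is "small" (Beta(1,1) priors on both arms).
    p0 = rng.beta(1 + successes[0], 1 + failures[0], n_mc)
    p1 = rng.beta(1 + successes[1], 1 + failures[1], n_mc)
    p_small_diff = np.mean(np.abs(p0 - p1) < epsilon)
    if rng.random() < p_small_diff:
        return int(rng.integers(2))          # uniform random allocation
    draws = [rng.beta(1 + s, 1 + f) for s, f in zip(successes, failures)]
    return int(np.argmax(draws))             # Thompson Sampling step

rng = np.random.default_rng(0)
# Two well-separated arms: the TS branch dominates and picks arm 0 almost always.
arm = ts_postdiff_assign((90, 10), (10, 90), rng=rng)
```

当两臂数据几乎无差异时,p_small_diff接近1,算法退化为均匀随机化,保留了假设检验的统计性质。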

【32】 Maximum likelihood estimation for randomized shortest paths with trajectory data 标题:具有轨迹数据的随机最短路径的最大似然估计 链接:https://arxiv.org/abs/2112.08481

作者:Ilkka Kivimäki,Bram Van Moorter,Manuela Panzacchi,Jari Saramäki,Marco Saerens 备注:None 摘要:随机最短路径(RSP)是近年来开发的一种工具,用于不同的图形和网络分析应用,如网络中的运动或流动建模。本质上,RSP框架考虑了网络路径上与温度相关的Gibbs-Boltzmann分布。在低温下,分布仅集中在最短或最小成本路径上,而随着温度的升高,分布在网络上的随机游动上。从这个分布可以方便地计算出许多相关的量,这些量通常以合理的方式概括了传统的网络度量。然而,当使用RSP对真实现象建模时,需要一种从数据中估计参数的原则性方法。在这项工作中,当基于运动、流动或扩散过程对现象进行建模时,我们开发了计算模型参数最大似然估计的方法,重点是温度参数。我们使用人工网络上生成的轨迹以及地理景观中野生驯鹿运动的真实数据来测试衍生方法的有效性,这些数据用于估计动物运动的随机程度。这些例子证明了RSP框架作为一种通用模型在各种应用中的吸引力。 摘要:Randomized shortest paths (RSP) are a tool developed in recent years for different graph and network analysis applications, such as modelling movement or flow in networks. In essence, the RSP framework considers the temperature-dependent Gibbs-Boltzmann distribution over paths in the network. At low temperatures, the distribution focuses solely on the shortest or least-cost paths, while with increasing temperature, the distribution spreads over random walks on the network. Many relevant quantities can be computed conveniently from this distribution, and these often generalize traditional network measures in a sensible way. However, when modelling real phenomena with RSPs, one needs a principled way of estimating the parameters from data. In this work, we develop methods for computing the maximum likelihood estimate of the model parameters, with focus on the temperature parameter, when modelling phenomena based on movement, flow, or spreading processes. We test the validity of the derived methods with trajectories generated on artificial networks as well as with real data on the movement of wild reindeer in a geographic landscape, used for estimating the degree of randomness in the movement of the animals. These examples demonstrate the attractiveness of the RSP framework as a generic model to be used in diverse applications.
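RSP框架的核心——路径上的Gibbs-Boltzmann分布——可以在小图上直接枚举演示:P(路径) ∝ exp(-cost/T),低温集中于最小代价路径,高温趋于均匀。以下玩具示例中的图结构与边代价均为假设,真实应用中该分布通过矩阵运算隐式计算而非枚举:

```python
import numpy as np

def simple_paths(adj, s, t, path=None):
    """Enumerate all simple s-t paths in a small graph {node: {nbr: cost}}."""
    path = path or [s]
    if s == t:
        yield path
        return
    for nbr in adj[s]:
        if nbr not in path:
            yield from simple_paths(adj, nbr, t, path + [nbr])

def rsp_distribution(adj, s, t, T):
    """Gibbs-Boltzmann distribution over paths: P(path) ∝ exp(-cost/T)."""
    paths = list(simple_paths(adj, s, t))
    costs = np.array([sum(adj[a][b] for a, b in zip(p, p[1:])) for p in paths])
    w = np.exp(-(costs - costs.min()) / T)   # shift by min cost for stability
    return paths, costs, w / w.sum()

adj = {                                      # a toy 4-node network
    'a': {'b': 1.0, 'c': 3.0},
    'b': {'d': 1.0, 'c': 1.0},
    'c': {'d': 1.0},
    'd': {},
}
paths, costs, p_cold = rsp_distribution(adj, 'a', 'd', T=0.1)   # near shortest-path
_, _, p_warm = rsp_distribution(adj, 'a', 'd', T=100.0)         # near random-walk
```

温度参数T正是该文要从轨迹数据中做最大似然估计的量:观测路径越集中于最小代价路径,估计出的T越低。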

【33】 Real-time Detection of Anomalies in Multivariate Time Series of Astronomical Data 标题:天文数据多变量时间序列异常的实时检测 链接:https://arxiv.org/abs/2112.08415

作者:Daniel Muthukrishna,Kaisey S. Mandel,Michelle Lochner,Sara Webb,Gautham Narayan 备注:9 pages, 5 figures, Accepted at the NeurIPS 2021 workshop on Machine Learning and the Physical Sciences 摘要:天文瞬变是指在不同时间尺度上短暂变亮的恒星天体,它们带来了宇宙学和天文学中一些最重要的发现。其中一些瞬变是被称为超新星的恒星爆炸性死亡,而另一些则是罕见的、奇异的或全新的令人兴奋的恒星爆发。新的天文巡天正在观测数量空前的多波段瞬变,使得通过目视识别新的有趣瞬变的标准方法变得不可行。为满足这一需求,我们提出了两种新方法,旨在实时快速、自动地检测异常的瞬变光变曲线。这两种方法都基于一个简单的想法:如果已知瞬变总体的光变曲线可以精确建模,那么对模型预测的任何偏离都可能是异常。第一种方法是使用时间卷积网络(TCN)构建的概率神经网络,第二种是可解释的瞬变贝叶斯参数模型。我们表明,神经网络的灵活性(使其成为许多回归任务的强大工具的属性)正是使其在异常检测上不如我们的参数模型的原因。 摘要:Astronomical transients are stellar objects that become temporarily brighter on various timescales and have led to some of the most significant discoveries in cosmology and astronomy. Some of these transients are the explosive deaths of stars known as supernovae while others are rare, exotic, or entirely new kinds of exciting stellar explosions. New astronomical sky surveys are observing unprecedented numbers of multi-wavelength transients, making standard approaches of visually identifying new and interesting transients infeasible. To meet this demand, we present two novel methods that aim to quickly and automatically detect anomalous transient light curves in real-time. Both methods are based on the simple idea that if the light curves from a known population of transients can be accurately modelled, any deviations from model predictions are likely anomalies. The first approach is a probabilistic neural network built using Temporal Convolutional Networks (TCNs) and the second is an interpretable Bayesian parametric model of a transient. We show that the flexibility of neural networks, the attribute that makes them such a powerful tool for many regression tasks, is what makes them less suitable for anomaly detection when compared with our parametric model.
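"若能精确建模已知总体的光变曲线,则对模型预测的偏离即为异常"这一思路可以用一个玩具参数模型示意:拟合一个简单的上升-衰减瞬变模型,以残差大小作为异常分数。模型形式与数据均为虚构,并非该文的贝叶斯参数模型或真实巡天数据:

```python
import numpy as np
from scipy.optimize import curve_fit

def transient_flux(t, amp, t0, tau_rise, tau_fall):
    """Toy rise/decay transient model (a stand-in for the paper's Bayesian
    parametric light-curve model).  Exponents are clipped so the optimizer
    never produces overflowing values."""
    fall = np.exp(np.clip(-(t - t0) / tau_fall, -50, 50))
    rise = 1.0 + np.exp(np.clip(-(t - t0) / tau_rise, -50, 50))
    return amp * fall / rise

def anomaly_score(t, flux, sigma=0.05):
    """Fit the parametric model; score = RMS residual in noise units.
    Curves the model cannot fit get large scores, i.e. look anomalous."""
    p0 = [flux.max(), t[np.argmax(flux)], 2.0, 20.0]
    try:
        popt, _ = curve_fit(transient_flux, t, flux, p0=p0, maxfev=20000)
    except RuntimeError:
        return np.inf                      # failed fit: maximally anomalous
    resid = flux - transient_flux(t, *popt)
    return np.sqrt(np.mean(resid ** 2)) / sigma

rng = np.random.default_rng(0)
t = np.linspace(-20.0, 80.0, 120)
normal = transient_flux(t, 1.0, 0.0, 2.0, 20.0) + rng.normal(0, 0.05, t.size)
weird = normal + 0.8 * np.exp(-0.5 * ((t - 50.0) / 3.0) ** 2)   # extra bump
```

典型光变曲线的分数接近1(残差与噪声同量级),带额外峰的曲线因无法被单瞬变模型解释而得到明显更高的分数。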

【34】 Deep Generative Models for Vehicle Speed Trajectories 标题:车辆速度轨迹的深层产生式模型 链接:https://arxiv.org/abs/2112.08361

作者:Farnaz Behnia,Dominik Karbowski,Vadim Sokolov 摘要:生成真实的车速轨迹是评估车辆燃油经济性和自动驾驶汽车预测控制的重要组成部分。传统的生成模型依赖于马尔可夫链方法,可以生成精确的合成轨迹,但会受到维数灾难的影响。它们不允许在生成过程中包含条件输入变量。在本文中,我们展示了对深层生成模型的扩展如何允许精确和可伸缩的生成。所提出的架构涉及循环层和前馈层,并使用对抗性技术进行训练。我们的模型在生成车辆轨迹方面表现良好,使用了一个根据芝加哥大都市区GPS数据训练的模型。 摘要:Generating realistic vehicle speed trajectories is a crucial component in evaluating vehicle fuel economy and in predictive control of self-driving cars. Traditional generative models rely on Markov chain methods and can produce accurate synthetic trajectories but are subject to the curse of dimensionality. They do not allow to include conditional input variables into the generation process. In this paper, we show how extensions to deep generative models allow accurate and scalable generation. Proposed architectures involve recurrent and feed-forward layers and are trained using adversarial techniques. Our models are shown to perform well on generating vehicle trajectories using a model trained on GPS data from Chicago metropolitan area.
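作为对照,摘要中提到的传统马尔可夫链基线可以用几行代码写出:把速度离散成状态、统计转移频率,再从链中采样新轨迹。分箱数、速度上限与合成训练数据均为任意假设,这也直观展示了维数灾难——加入更多条件变量时状态空间会指数膨胀,这正是该文转向深度生成模型的动机:

```python
import numpy as np

def fit_markov(speeds, n_bins=20, v_max=30.0):
    """Classical Markov-chain baseline for speed trajectories: discretize
    speed into bins and count bin-to-bin transitions (+1 smoothing keeps
    every state reachable)."""
    bins = np.clip((np.asarray(speeds) / v_max * n_bins).astype(int), 0, n_bins - 1)
    P = np.ones((n_bins, n_bins))
    for a, b in zip(bins, bins[1:]):
        P[a, b] += 1
    return P / P.sum(axis=1, keepdims=True)

def sample_trajectory(P, start_bin, length, v_max=30.0, rng=None):
    """Generate a synthetic speed trajectory by walking the fitted chain."""
    if rng is None:
        rng = np.random.default_rng()
    n_bins = P.shape[0]
    states = [start_bin]
    for _ in range(length - 1):
        states.append(rng.choice(n_bins, p=P[states[-1]]))
    return (np.array(states) + 0.5) * v_max / n_bins   # bin centers as speeds

# Train on a synthetic smooth speed profile, then generate a new one.
t = np.linspace(0, 60, 600)
train = 15 + 10 * np.sin(0.2 * t)                      # speeds in [5, 25] m/s
P = fit_markov(train)
traj = sample_trajectory(P, start_bin=10, length=100, rng=np.random.default_rng(0))
```

深度生成方案(如该文的对抗训练架构)的优势正在于:无需显式转移矩阵即可纳入条件输入变量,并避免状态空间的指数增长。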
