Machine Learning arXiv Daily Digest [12.10]

2021-12-10 17:01:25


cs.LG: 100 papers today

Graph-related (graph learning | graph neural networks | graph optimization, etc.) (7 papers)

【1】 Wikidated 1.0: An Evolving Knowledge Graph Dataset of Wikidata's Revision History Link: https://arxiv.org/abs/2112.05003

Authors: Lukas Schmelzeisen, Corina Dima, Steffen Staab Affiliations: University of Stuttgart, Germany; University of Southampton, United Kingdom Comments: None Abstract: Wikidata is the largest general-interest knowledge base that is openly available. It is collaboratively edited by thousands of volunteer editors and has thus evolved considerably since its inception in 2012. In this paper, we present Wikidated 1.0, a dataset of Wikidata's full revision history, which encodes changes between Wikidata revisions as sets of deletions and additions of RDF triples. To the best of our knowledge, it constitutes the first large dataset of an evolving knowledge graph, a recently emerging research subject in the Semantic Web community. We introduce the methodology for generating Wikidated 1.0 from dumps of Wikidata, discuss its implementation and limitations, and present statistical characteristics of the dataset.

【2】 KGE-CL: Contrastive Learning of Knowledge Graph Embeddings Link: https://arxiv.org/abs/2112.04871

Authors: Wentao Xu, Zhiping Luo, Weiqing Liu, Jiang Bian, Jian Yin, Tie-Yan Liu Affiliations: School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China; Microsoft Research Asia, Beijing, China; School of Artificial Intelligence, Sun Yat-sen University, Zhuhai, China Abstract: Learning the embeddings of knowledge graphs is vital in artificial intelligence, and can benefit various downstream applications, such as recommendation and question answering. In recent years, many research efforts have been proposed for knowledge graph embedding. However, most previous knowledge graph embedding methods ignore the semantic similarity between the related entities and entity-relation couples in different triples since they separately optimize each triple with the scoring function. To address this problem, we propose a simple yet efficient contrastive learning framework for knowledge graph embeddings, which can shorten the semantic distance of the related entities and entity-relation couples in different triples and thus improve the expressiveness of knowledge graph embeddings. We evaluate our proposed method on three standard knowledge graph benchmarks. It is noteworthy that our method can yield some new state-of-the-art results, achieving 51.2% MRR, 46.8% Hits@1 on the WN18RR dataset, and 59.1% MRR, 51.8% Hits@1 on the YAGO3-10 dataset.
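
To make the contrastive objective concrete, here is a minimal sketch (not the authors' code) of an InfoNCE-style loss that pulls together embeddings of related entities or entity-relation couples from different triples; the pairing scheme and temperature are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_kge_loss(anchor, positive, temperature=0.1):
    """anchor/positive: (batch, dim) embeddings of related entities or couples."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    # Each anchor is scored against every candidate in the batch; the diagonal
    # holds the positive pairs, off-diagonal entries serve as in-batch negatives.
    logits = anchor @ positive.t() / temperature
    targets = torch.arange(anchor.size(0), device=anchor.device)
    return F.cross_entropy(logits, targets)

# Toy usage: 32 pairs of 200-dimensional embeddings.
loss = contrastive_kge_loss(torch.randn(32, 200), torch.randn(32, 200))
```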

【3】 Siamese Attribute-missing Graph Auto-encoder Link: https://arxiv.org/abs/2112.04842

Authors: Wenxuan Tu, Sihang Zhou, Yue Liu, Xinwang Liu Affiliations: National University of Defense Technology Comments: under review Abstract: Graph representation learning (GRL) on attribute-missing graphs, which is a common yet challenging problem, has recently attracted considerable attention. We observe that existing literature: 1) isolates the learning of attribute and structure embedding, thus failing to take full advantage of the two types of information; 2) imposes too strict a distribution assumption on the latent space variables, leading to less discriminative feature representations. In this paper, based on the idea of introducing intimate information interaction between the two information sources, we propose our Siamese Attribute-missing Graph Auto-encoder (SAGA). Specifically, three strategies have been conducted. First, we entangle the attribute embedding and structure embedding by introducing a siamese network structure to share the parameters learned by both processes, which allows the network training to benefit from more abundant and diverse information. Second, we introduce a K-nearest neighbor (KNN) and structural constraint enhanced learning mechanism to improve the quality of latent features of the missing attributes by filtering unreliable connections. Third, we manually mask the connections on multiple adjacent matrices and force the structural information embedding sub-network to recover the true adjacent matrix, thus enforcing the resulting network to be able to selectively exploit more high-order discriminative features for data completion. Extensive experiments on six benchmark datasets demonstrate the superiority of our SAGA against the state-of-the-art methods.

【4】 Transferability Properties of Graph Neural Networks Link: https://arxiv.org/abs/2112.04629

Authors: Luana Ruiz, Luiz F. O. Chamon, Alejandro Ribeiro Affiliations: L. F. O. Chamon is with the Simons Institute Comments: Submitted to IEEE TSP Abstract: Graph neural networks (GNNs) are deep convolutional architectures consisting of layers composed by graph convolutions and pointwise nonlinearities. Due to their invariance and stability properties, GNNs are provably successful at learning representations from network data. However, training them requires matrix computations which can be expensive for large graphs. To address this limitation, we investigate the ability of GNNs to be transferred across graphs. We consider graphons, which are both graph limits and generative models for weighted and stochastic graphs, to define limit objects of graph convolutions and GNNs -- graphon convolutions and graphon neural networks (WNNs) -- which we use as generative models for graph convolutions and GNNs. We show that these graphon filters and WNNs can be approximated by graph filters and GNNs sampled from them on weighted and stochastic graphs. Using these results, we then derive error bounds for transferring graph filters and GNNs across such graphs. These bounds show that transferability increases with the graph size, and reveal a tradeoff between transferability and spectral discriminability which in GNNs is alleviated by the pointwise nonlinearities. These findings are further verified empirically in numerical experiments in movie recommendation and decentralized robot control.

【5】 Prediction of Adverse Biological Effects of Chemicals Using Knowledge Graph Embeddings Link: https://arxiv.org/abs/2112.04605

Authors: Erik B. Myklebust, Ernesto Jiménez-Ruiz, Jiaoyan Chen, Raoul Wolf, Knut Erik Tollefsen Affiliations: a Norwegian Institute for Water Research, Oslo, Norway; b SIRIUS, University of Oslo, Oslo, Norway; c City, University of London, London, United Kingdom; d University of Oxford, Oxford, United Kingdom; e Norwegian University of Life Sciences, Ås, Norway Comments: Accepted for publication in the Semantic Web Journal Abstract: We have created a knowledge graph based on major data sources used in ecotoxicological risk assessment. We have applied this knowledge graph to an important task in risk assessment, namely chemical effect prediction. We have evaluated nine knowledge graph embedding models from a selection of geometric, decomposition, and convolutional models on this prediction task. We show that using knowledge graph embeddings can increase the accuracy of effect prediction with neural networks. Furthermore, we have implemented a fine-tuning architecture which adapts the knowledge graph embeddings to the effect prediction task and leads to a better performance. Finally, we evaluate certain characteristics of the knowledge graph embedding models to shed light on the individual model performance.

【6】 Adaptive Kernel Graph Neural Network Link: https://arxiv.org/abs/2112.04575

Authors: Mingxuan Ju, Shifu Hou, Yujie Fan, Jianan Zhao, Liang Zhao, Yanfang Ye Affiliations: University of Notre Dame, Notre Dame, IN; Case Western Reserve University, Cleveland, OH; Emory University, Atlanta, GA Comments: To appear at AAAI 2022

【7】 Enhancing Column Generation by a Machine-Learning-Based Pricing Heuristic for Graph Coloring Link: https://arxiv.org/abs/2112.04906

Authors: Yunzhuang Shen, Yuan Sun, Xiaodong Li, Andrew Eberhard, Andreas Ernst Affiliations: School of Computing Technologies, RMIT University, Australia; School of Computing and Information Systems, University of Melbourne, Australia; School of Science, RMIT University, Australia; School of Mathematics, Monash University, Australia Comments: Machine learning for column generation and branch-and-price; accepted to AAAI 2022 Abstract: Column Generation (CG) is an effective method for solving large-scale optimization problems. CG starts by solving a sub-problem with a subset of columns (i.e., variables) and gradually includes new columns that can improve the solution of the current subproblem. The new columns are generated as needed by repeatedly solving a pricing problem, which is often NP-hard and is a bottleneck of the CG approach. To tackle this, we propose a Machine-Learning-based Pricing Heuristic (MLPH) that can generate many high-quality columns efficiently. In each iteration of CG, our MLPH leverages an ML model to predict the optimal solution of the pricing problem, which is then used to guide a sampling method to efficiently generate multiple high-quality columns. Using the graph coloring problem, we empirically show that MLPH significantly enhances CG as compared to six state-of-the-art methods, and the improvement in CG can lead to substantially better performance of the branch-and-price exact method.
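
To make the pricing heuristic concrete, here is a hedged sketch (not the authors' implementation) of ML-guided column sampling for graph coloring: per-vertex scores from some model bias a randomized construction of independent sets, which become candidate columns. The scoring model and sampling rule are assumptions of this sketch.

```python
import numpy as np

def sample_columns(adjacency, vertex_scores, n_columns=50, seed=0):
    """Sample independent sets (columns), biased toward high-scoring vertices.

    adjacency: (n, n) boolean matrix; vertex_scores: (n,) predicted scores in [0, 1].
    """
    rng = np.random.default_rng(seed)
    n = len(vertex_scores)
    columns = []
    for _ in range(n_columns):
        # Visit vertices from highest predicted score, with random tie-breaking.
        order = np.argsort(-(vertex_scores + rng.random(n) * 1e-6))
        chosen, forbidden = [], set()
        for v in order:
            # Admit a vertex with probability equal to its score, unless it
            # conflicts with an already-chosen vertex.
            if v not in forbidden and rng.random() < vertex_scores[v]:
                chosen.append(int(v))
                forbidden.update(np.flatnonzero(adjacency[v]).tolist())
        columns.append(sorted(chosen))
    return columns

# Toy usage: a 6-vertex graph with a single edge.
adj = np.zeros((6, 6), dtype=bool)
adj[0, 1] = adj[1, 0] = True
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.5, 0.4])
print(sample_columns(adj, scores, n_columns=3))
```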

Transformer (2 papers)

【1】 Opinion Extraction as A Structured Sentiment Analysis using Transformers Link: https://arxiv.org/abs/2112.05056

Authors: Yucheng Liu, Tian Zhu Affiliations: Natural Language Processing, UC Berkeley School of Information Abstract: Relationship extraction and named entity recognition have always been considered as two distinct tasks that require different input data, labels, and models. However, both are essential for structured sentiment analysis. We believe that both tasks can be combined into a single stacked model with the same input data. We performed different experiments to find the best model to extract multiple opinion tuples from a single sentence. The opinion tuples will consist of holders, targets, and expressions. With the opinion tuples, we will be able to extract the relationship we need.

【2】 PE-former: Pose Estimation Transformer Link: https://arxiv.org/abs/2112.04981

Authors: Paschalis Panteleris, Antonis Argyros Affiliations: Institute of Computer Science, FORTH, Heraklion, Crete, Greece; Computer Science Department, University of Crete, Greece Abstract: Vision transformer architectures have been demonstrated to work very effectively for image classification tasks. Efforts to solve more challenging vision tasks with transformers rely on convolutional backbones for feature extraction. In this paper we investigate the use of a pure transformer architecture (i.e., one with no CNN backbone) for the problem of 2D body pose estimation. We evaluate two ViT architectures on the COCO dataset. We demonstrate that using an encoder-decoder transformer architecture yields state of the art results on this estimation problem.

GAN | Adversarial | Attacks | Generation (6 papers)

【1】 Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic Prior Link: https://arxiv.org/abs/2112.05077

Authors: Davis Rempe, Jonah Philion, Leonidas J. Guibas, Sanja Fidler, Or Litany Affiliations: Stanford University, NVIDIA, University of Toronto, Vector Institute; project page: nv-tlabs.github.io/STRIVE Abstract: Evaluating and improving planning for autonomous vehicles requires scalable generation of long-tail traffic scenarios. To be useful, these scenarios must be realistic and challenging, but not impossible to drive through safely. In this work, we introduce STRIVE, a method to automatically generate challenging scenarios that cause a given planner to produce undesirable behavior, like collisions. To maintain scenario plausibility, the key idea is to leverage a learned model of traffic motion in the form of a graph-based conditional VAE. Scenario generation is formulated as an optimization in the latent space of this traffic model, effected by perturbing an initial real-world scene to produce trajectories that collide with a given planner. A subsequent optimization is used to find a "solution" to the scenario, ensuring it is useful to improve the given planner. Further analysis clusters generated scenarios based on collision type. We attack two planners and show that STRIVE successfully generates realistic, challenging scenarios in both cases. We additionally "close the loop" and use these scenarios to optimize hyperparameters of a rule-based planner.
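
The following toy sketch illustrates the core latent-space optimization loop under stated assumptions: `ToyTrafficModel` stands in for the learned graph-based conditional VAE decoder and `toy_planner` for the planner under attack; only the perturb-the-latent idea mirrors the paper.

```python
import torch

class ToyTrafficModel(torch.nn.Module):
    """Stand-in for the learned traffic-motion decoder."""
    def __init__(self, latent_dim=8, horizon=20):
        super().__init__()
        self.dec = torch.nn.Linear(latent_dim, horizon * 2)
        self.horizon = horizon
    def decode(self, z):
        return self.dec(z).view(-1, self.horizon, 2)  # (agents, T, xy)

def toy_planner(trajectories):
    """Stand-in planner: drives straight regardless of traffic."""
    t = torch.linspace(0, 1, trajectories.shape[1]).unsqueeze(1)
    return t * torch.tensor([[10.0, 0.0]])            # (T, xy)

def optimize_scenario(model, z_init, steps=200, lr=0.05, reg=1.0):
    for p in model.parameters():
        p.requires_grad_(False)                       # only the latent is free
    z = z_init.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        traffic = model.decode(z)                     # (agents, T, 2)
        ego = toy_planner(traffic)                    # (T, 2)
        # Push some agent toward the ego path (minimize closest approach),
        # while a prior term keeps the scene near the original latent.
        min_dist = torch.cdist(ego, traffic.reshape(-1, 2)).min()
        loss = min_dist + reg * (z - z_init).pow(2).sum()
        opt.zero_grad(); loss.backward(); opt.step()
    return z.detach()

z0 = torch.randn(3, 8)                                # three traffic agents
z_adv = optimize_scenario(ToyTrafficModel(), z0)
```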

【2】 Mutual Adversarial Training: Learning together is better than going alone Link: https://arxiv.org/abs/2112.05005

Authors: Jiang Liu, Chun Pong Lau, Hossein Souri, Soheil Feizi, Rama Chellappa Affiliations: Johns Hopkins University, Baltimore, Maryland, USA; University of Maryland, College Park, Maryland, USA Comments: Under submission Abstract: Recent studies have shown that robustness to adversarial attacks can be transferred across networks. In other words, we can make a weak model more robust with the help of a strong teacher model. We ask if instead of learning from a static teacher, can models "learn together" and "teach each other" to achieve better robustness? In this paper, we study how interactions among models affect robustness via knowledge distillation. We propose mutual adversarial training (MAT), in which multiple models are trained together and share the knowledge of adversarial examples to achieve improved robustness. MAT allows robust models to explore a larger space of adversarial samples, and find more robust feature spaces and decision boundaries. Through extensive experiments on CIFAR-10 and CIFAR-100, we demonstrate that MAT can effectively improve model robustness and outperform state-of-the-art methods under white-box attacks, bringing $\sim$8% accuracy gain to vanilla adversarial training (AT) under PGD-100 attacks. In addition, we show that MAT can also mitigate the robustness trade-off among different perturbation types, bringing as much as 13.1% accuracy gain to AT baselines against the union of $l_\infty$, $l_2$ and $l_1$ attacks. These results show the superiority of the proposed method and demonstrate that collaborative learning is an effective strategy for designing robust models.
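
A simplified sketch of one MAT-style update is given below; FGSM stands in for the stronger attacks used in the paper and the knowledge-distillation terms are omitted. The shared idea: each model trains on the adversarial examples crafted against every member.

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    """One-step sign-gradient attack, used here as a cheap stand-in for PGD."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    return (x + eps * grad.sign()).clamp(0, 1).detach()

def mat_step(models, optimizers, x, y):
    adv = [fgsm(m, x, y) for m in models]             # each model's examples
    for m, opt in zip(models, optimizers):
        # Every model sees the adversarial examples of all members.
        loss = sum(F.cross_entropy(m(x_adv), y) for x_adv in adv) / len(adv)
        opt.zero_grad(); loss.backward(); opt.step()

# Toy usage on random data with two small classifiers.
models = [torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
          for _ in range(2)]
optimizers = [torch.optim.SGD(m.parameters(), lr=0.01) for m in models]
mat_step(models, optimizers, torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,)))
```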

【3】 PARL: Enhancing Diversity of Ensemble Networks to Resist Adversarial Attacks via Pairwise Adversarially Robust Loss Function Link: https://arxiv.org/abs/2112.04948

Authors: Manaar Alam, Shubhajit Datta, Debdeep Mukhopadhyay, Arijit Mondal, Partha Pratim Chakrabarti Affiliations: Department of CSE, IIT Kharagpur, Kharagpur, India; Centre of Excellence in AI, IIT Patna, Patna, India Abstract: The security of Deep Learning classifiers is a critical field of study because of the existence of adversarial attacks. Such attacks usually rely on the principle of transferability, where an adversarial example crafted on a surrogate classifier tends to mislead the target classifier trained on the same dataset even if both classifiers have quite different architecture. Ensemble methods against adversarial attacks demonstrate that an adversarial example is less likely to mislead multiple classifiers in an ensemble having diverse decision boundaries. However, recent ensemble methods have either been shown to be vulnerable to stronger adversaries or shown to lack an end-to-end evaluation. This paper attempts to develop a new ensemble methodology that constructs multiple diverse classifiers using a Pairwise Adversarially Robust Loss (PARL) function during the training procedure. PARL utilizes gradients of each layer with respect to input in every classifier within the ensemble simultaneously. The proposed training procedure enables PARL to achieve higher robustness against black-box transfer attacks compared to previous ensemble methods without adversely affecting the accuracy of clean examples. We also evaluate the robustness in the presence of white-box attacks, where adversarial examples are crafted using parameters of the target classifier. We present extensive experiments using standard image classification datasets like CIFAR-10 and CIFAR-100 trained using standard ResNet20 classifier against state-of-the-art adversarial attacks to demonstrate the robustness of the proposed ensemble methodology.
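
To illustrate the diversity objective, here is a hedged sketch of a pairwise penalty on input-gradient alignment between ensemble members; PARL itself uses gradients of each layer with respect to the input, so this input-only version is a simplification.

```python
import torch
import torch.nn.functional as F

def pairwise_gradient_alignment(models, x, y):
    """Penalty that grows when members' input gradients point the same way."""
    grads = []
    for m in models:
        xi = x.clone().requires_grad_(True)
        loss = F.cross_entropy(m(xi), y)
        # create_graph=True so the penalty itself can be backpropagated.
        g, = torch.autograd.grad(loss, xi, create_graph=True)
        grads.append(g.flatten(1))
    penalty = 0.0
    for i in range(len(grads)):
        for j in range(i + 1, len(grads)):
            penalty = penalty + F.cosine_similarity(grads[i], grads[j], dim=1).mean()
    return penalty  # add to the ensemble's classification loss with a weight

# Toy usage with three small classifiers on MNIST-shaped inputs.
models = [torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
          for _ in range(3)]
pen = pairwise_gradient_alignment(models, torch.rand(4, 1, 28, 28),
                                  torch.randint(0, 10, (4,)))
```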

【4】 Self-Supervised Image-to-Text and Text-to-Image Synthesis Link: https://arxiv.org/abs/2112.04928

Authors: Anindya Sundar Das, Sriparna Saha Affiliations: Department of Computer Science and Engineering, Indian Institute of Technology Patna, India Comments: None Abstract: A comprehensive understanding of vision and language and their interrelation are crucial to realize the underlying similarities and differences between these modalities and to learn more generalized, meaningful representations. In recent years, most of the works related to Text-to-Image synthesis and Image-to-Text generation focused on supervised generative deep architectures to solve the problems, where very little interest was placed on learning the similarities between the embedding spaces across modalities. In this paper, we propose a novel self-supervised deep learning based approach towards learning the cross-modal embedding spaces; for both image to text and text to image generations. In our approach, we first obtain dense vector representations of images using a StackGAN-based autoencoder model and also dense vector representations at the sentence level utilizing an LSTM-based text autoencoder; then we study the mapping from the embedding space of one modality to the embedding space of the other modality utilizing GAN and maximum mean discrepancy based generative networks. We also demonstrate that our model learns to generate textual descriptions from image data as well as images from textual data, both qualitatively and quantitatively.

【5】 Amicable Aid: Turning Adversarial Attack to Benefit Classification Link: https://arxiv.org/abs/2112.04720

Authors: Juyeop Kim, Jun-Ho Choi, Soobeom Jang, Jong-Seok Lee Affiliations: Yonsei University Comments: 16 pages (3 pages for appendix) Abstract: While adversarial attacks on deep image classification models pose serious security concerns in practice, this paper suggests a novel paradigm where the concept of adversarial attacks can benefit classification performance, which we call amicable aid. We show that by taking the opposite search direction of perturbation, an image can be converted to another yielding higher confidence by the classification model and even a wrongly classified image can be made to be correctly classified. Furthermore, with a large amount of perturbation, an image can be made unrecognizable by human eyes, while it is correctly recognized by the model. The mechanism of the amicable aid is explained in the viewpoint of the underlying natural image manifold. We also consider universal amicable perturbations, i.e., a fixed perturbation can be applied to multiple images to improve their classification results. While it is challenging to find such perturbations, we show that making the decision boundary as perpendicular to the image manifold as possible via training with modified data is effective to obtain a model for which universal amicable perturbations are more easily found. Finally, we discuss several application scenarios where the amicable aid can be useful, including secure image communication, privacy-preserving image communication, and protection against adversarial attacks.
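
A minimal sketch of the idea, under the assumption that a plain iterative sign-gradient step suffices: stepping against the loss gradient (the reverse of an FGSM-style attack) increases the classifier's confidence in the given label.

```python
import torch
import torch.nn.functional as F

def amicable_perturb(model, x, y, eps=8 / 255, steps=10):
    """Perturb x so that model(x) assigns label y with higher confidence."""
    x_aid = x.clone()
    for _ in range(steps):
        x_aid.requires_grad_(True)
        loss = F.cross_entropy(model(x_aid), y)
        grad, = torch.autograd.grad(loss, x_aid)
        # Descend the loss with respect to the input: the reverse of an attack.
        x_aid = (x_aid - eps / steps * grad.sign()).clamp(0, 1).detach()
    return x_aid

# Toy usage: boost confidence of class 3 for a random image.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x_aid = amicable_perturb(model, torch.rand(1, 3, 32, 32), torch.tensor([3]))
```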

【6】 InvGAN: Invertable GANs Link: https://arxiv.org/abs/2112.04598

Authors: Partha Ghosh, Dominik Zietlow, Michael J. Black, Larry S. Davis, Xiaochen Hu Affiliations: † MPI for Intelligent Systems, Tübingen Abstract: Generation of photo-realistic images, semantic editing and representation learning are a few of many potential applications of high resolution generative models. Recent progress in GANs have established them as an excellent choice for such tasks. However, since they do not provide an inference model, image editing or downstream tasks such as classification can not be done on real images using the GAN latent space. Despite numerous efforts to train an inference model or design an iterative method to invert a pre-trained generator, previous methods are dataset (e.g. human face images) and architecture (e.g. StyleGAN) specific. These methods are nontrivial to extend to novel datasets or architectures. We propose a general framework that is agnostic to architecture and datasets. Our key insight is that, by training the inference and the generative model together, we allow them to adapt to each other and to converge to a better quality model. Our InvGAN, short for Invertable GAN, successfully embeds real images to the latent space of a high quality generative model. This allows us to perform image inpainting, merging, interpolation and online data augmentation. We demonstrate this with extensive qualitative and quantitative experiments.

Semi-/Weakly-/Un-/Fully-Supervised | Uncertainty | Active Learning (6 papers)

【1】 Extending the WILDS Benchmark for Unsupervised Adaptation Link: https://arxiv.org/abs/2112.05090

Authors: Shiori Sagawa, Pang Wei Koh, Tony Lee, Irena Gao, Sang Michael Xie, Kendrick Shen, Ananya Kumar, Weihua Hu, Michihiro Yasunaga, Henrik Marklund, Sara Beery, Etienne David, Ian Stavness, Wei Guo, Jure Leskovec, Kate Saenko, Tatsunori Hashimoto, Sergey Levine, Chelsea Finn, Percy Liang

【2】 The Peril of Popular Deep Learning Uncertainty Estimation Methods Link: https://arxiv.org/abs/2112.05000

Authors: Yehao Liu, Matteo Pagliardini, Tatjana Chavdarova, Sebastian U. Stich Affiliations: EPFL, UC Berkeley, CISPA Comments: Presented at the Bayesian Deep Learning Workshop at NeurIPS 2021 Abstract: Uncertainty estimation (UE) techniques -- such as the Gaussian process (GP), Bayesian neural networks (BNN), Monte Carlo dropout (MCDropout) -- aim to improve the interpretability of machine learning models by assigning an estimated uncertainty value to each of their prediction outputs. However, since too high uncertainty estimates can have fatal consequences in practice, this paper analyzes the above techniques. Firstly, we show that GP methods always yield high uncertainty estimates on out of distribution (OOD) data. Secondly, we show on a 2D toy example that both BNNs and MCDropout do not give high uncertainty estimates on OOD samples. Finally, we show empirically that this pitfall of BNNs and MCDropout holds on real world datasets as well. Our insights (i) raise awareness for the more cautious use of currently popular UE methods in Deep Learning, (ii) encourage the development of UE methods that approximate GP-based methods -- instead of BNNs and MCDropout, and (iii) our empirical setups can be used for verifying the OOD performances of any other UE method. The source code is available at https://github.com/epfml/uncertainity-estimation.
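
For reference, a minimal MCDropout sketch is shown below: dropout is kept active at test time and the variance across stochastic forward passes serves as the uncertainty estimate, which (per the paper's finding) can remain misleadingly low on OOD inputs.

```python
import torch

def mc_dropout_predict(model, x, n_samples=50):
    model.train()  # keep dropout layers stochastic at inference time
    with torch.no_grad():
        probs = torch.stack([model(x).softmax(dim=-1) for _ in range(n_samples)])
    return probs.mean(0), probs.var(0)  # predictive mean and uncertainty

# Toy usage with a small dropout classifier.
model = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.ReLU(),
                            torch.nn.Dropout(0.5), torch.nn.Linear(64, 2))
mean, var = mc_dropout_predict(model, torch.randn(5, 2))
```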

【3】 Ymir: A Supervised Ensemble Framework for Multivariate Time Series Anomaly Detection Link: https://arxiv.org/abs/2112.04704

Authors: Zhanxiang Zhao Abstract: We propose a multivariate time series anomaly detection framework, Ymir, which leverages ensemble learning and supervised learning technology to efficiently learn and adapt to anomalies in real-world system applications. Ymir integrates several currently widely used unsupervised anomaly detection models through an ensemble learning method, and thus can provide robust frontal anomaly detection results in unsupervised scenarios. In a supervised setting, domain experts and system users discuss and provide labels (anomalous or not) for the training data, which reflect their anomaly detection criteria for the specific system. Ymir leverages the aforementioned unsupervised methods to extract rich and useful feature representations from the raw multivariate time series data, then combines the features and labels with a supervised classifier to do anomaly detection. We evaluated Ymir on internal multivariate time series datasets from large monitoring systems and achieved good anomaly detection performance.

【4】 Autoregressive Quantile Flows for Predictive Uncertainty Estimation Link: https://arxiv.org/abs/2112.04643

Authors: Phillip Si, Allan Bishop, Volodymyr Kuleshov Affiliations: Department of Computer Science, Cornell Tech and Cornell University Comments: 9 pages, 4 figures, 6 tables (main body); an additional 4 pages, 2 figures, 4 tables (appendix) Abstract: Numerous applications of machine learning involve predicting flexible probability distributions over model outputs. We propose Autoregressive Quantile Flows, a flexible class of probabilistic models over high-dimensional variables that can be used to accurately capture predictive aleatoric uncertainties. These models are instances of autoregressive flows trained using a novel objective based on proper scoring rules, which simplifies the calculation of computationally expensive determinants of Jacobians during training and supports new types of neural architectures. We demonstrate that these models can be used to parameterize predictive conditional distributions and improve the quality of probabilistic predictions on time series forecasting and object detection.
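
A minimal sketch of the kind of proper-scoring-rule building block involved: the pinball (quantile) loss, which is minimized in expectation when the prediction equals the tau-quantile of the target distribution. The paper's exact objective differs; this shows only the standard component.

```python
import torch

def pinball_loss(pred, target, tau):
    """Quantile loss: asymmetric absolute error, minimized at the tau-quantile."""
    err = target - pred
    return torch.maximum(tau * err, (tau - 1) * err).mean()

# Training against several taus encourages the model to represent the whole
# conditional distribution rather than a single point estimate.
pred, target = torch.randn(16), torch.randn(16)
loss = sum(pinball_loss(pred, target, tau) for tau in (0.1, 0.5, 0.9))
```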

【5】 CoSSL: Co-Learning of Representation and Classifier for Imbalanced Semi-Supervised Learning Link: https://arxiv.org/abs/2112.04564

Authors: Yue Fan, Dengxin Dai, Bernt Schiele Affiliations: Max Planck Institute for Informatics, Saarbrücken, Germany; Saarland Informatics Campus Abstract: In this paper, we propose a novel co-learning framework (CoSSL) with decoupled representation learning and classifier learning for imbalanced SSL. To handle the data imbalance, we devise Tail-class Feature Enhancement (TFE) for classifier learning. Furthermore, the current evaluation protocol for imbalanced SSL focuses only on balanced test sets, which has limited practicality in real-world scenarios. Therefore, we further conduct a comprehensive evaluation under various shifted test distributions. In experiments, we show that our approach outperforms other methods over a large range of shifted distributions, achieving state-of-the-art performance on benchmark datasets ranging from CIFAR-10, CIFAR-100, ImageNet, to Food-101. Our code will be made publicly available.

【6】 Robust Weakly Supervised Learning for COVID-19 Recognition Using Multi-Center CT Images Link: https://arxiv.org/abs/2112.04984

Authors: Qinghao Ye, Yuan Gao, Weiping Ding, Zhangming Niu, Chengjia Wang, Yinghui Jiang, Minhao Wang, Evandro Fei Fang, Wade Menpes-Smith, Jun Xia, Guang Yang Affiliations: Hangzhou Ocean's Smart Boya Co., Ltd; University of California, San Diego, La Jolla, California, USA; Institute of Biomedical Engineering, University of Oxford, UK; Aladdin Healthcare Technologies Ltd; Nantong University, Nantong, China Comments: 32 pages, 8 figures, Applied Soft Computing Abstract: The world is currently experiencing an ongoing pandemic of an infectious disease named coronavirus disease 2019 (i.e., COVID-19), which is caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Computed Tomography (CT) plays an important role in assessing the severity of the infection and can also be used to identify those symptomatic and asymptomatic COVID-19 carriers. With a surge of the cumulative number of COVID-19 patients, radiologists are increasingly stressed to examine the CT scans manually. Therefore, an automated 3D CT scan recognition tool is highly in demand since the manual analysis is time-consuming for radiologists and their fatigue can cause possible misjudgment. However, due to various technical specifications of CT scanners located in different hospitals, the appearance of CT images can be significantly different, leading to the failure of many automated image recognition approaches. The multi-domain shift problem for multi-center and multi-scanner studies is therefore nontrivial; it is crucial for dependable recognition and critical for reproducible and objective diagnosis and prognosis. In this paper, we propose a COVID-19 CT scan recognition model, namely the coronavirus information fusion and diagnosis network (CIFD-Net), that can efficiently handle the multi-domain shift problem via a new robust weakly supervised learning paradigm. Our model can resolve the problem of different appearance in CT scan images reliably and efficiently while attaining higher accuracy compared to other state-of-the-art methods.

Transfer | Zero/Few/One-Shot | Adaptation (2 papers)

【1】 Adaptive Methods for Aggregated Domain Generalization Link: https://arxiv.org/abs/2112.04766

Authors: Xavier Thomas, Dhruv Mahajan, Alex Pentland, Abhimanyu Dubey Affiliations: Manipal Institute of Technology; † Facebook AI Research Abstract: Domain generalization involves learning a classifier from a heterogeneous collection of training sources such that it generalizes to data drawn from similar unknown target domains, with applications in large-scale learning and personalized inference. In many settings, privacy concerns prohibit obtaining domain labels for the training data samples, and instead only have an aggregated collection of training points. Existing approaches that utilize domain labels to create domain-invariant feature representations are inapplicable in this setting, requiring alternative approaches to learn generalizable classifiers. In this paper, we propose a domain-adaptive approach to this problem, which operates in two steps: (a) we cluster training data within a carefully chosen feature space to create pseudo-domains, and (b) using these pseudo-domains we learn a domain-adaptive classifier that makes predictions using information about both the input and the pseudo-domain it belongs to. Our approach achieves state-of-the-art performance on a variety of domain generalization benchmarks without using domain labels whatsoever. Furthermore, we provide novel theoretical guarantees on domain generalization using cluster information. Our approach is amenable to ensemble-based methods and provides substantial gains even on large-scale benchmark datasets. The code can be found at: https://github.com/xavierohan/AdaClust_DomainBed
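
The two-step recipe can be sketched as follows, with stand-in features and a stand-in classifier head (the paper's feature space and domain-adaptive head are more elaborate): cluster training features into pseudo-domains, then condition the classifier on the cluster assignment.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

features = np.random.randn(1000, 128)         # stand-in for backbone features
labels = np.random.randint(0, 10, size=1000)

# Step (a): cluster the feature space to create pseudo-domains.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0).fit(features)
pseudo_domain = np.eye(5)[kmeans.labels_]      # one-hot pseudo-domain id

# Step (b): a domain-adaptive classifier conditions on input AND pseudo-domain.
clf = LogisticRegression(max_iter=1000).fit(
    np.hstack([features, pseudo_domain]), labels)
```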

【2】 STAF: A Spatio-Temporal Attention Fusion Network for Few-shot Video Classification Link: https://arxiv.org/abs/2112.04585

Authors: Rex Liu, Huanle Zhang, Hamed Pirsiavash, Xin Liu Affiliations: Department of Computer Science, University of California, Davis Abstract: We propose STAF, a Spatio-Temporal Attention Fusion network for few-shot video classification. STAF first extracts coarse-grained spatial and temporal features of videos by applying a 3D Convolution Neural Networks embedding network. It then fine-tunes the extracted features using self-attention and cross-attention networks. Last, STAF applies a lightweight fusion network and a nearest neighbor classifier to classify each query video. To evaluate STAF, we conduct extensive experiments on three benchmarks (UCF101, HMDB51, and Something-Something-V2). The experimental results show that STAF improves state-of-the-art accuracy by a large margin, e.g., STAF increases the five-way one-shot accuracy by 5.3% and 7.0% for UCF101 and HMDB51, respectively.

Reinforcement Learning (7 papers)

【1】 JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning Link: https://arxiv.org/abs/2112.04907

Authors: Zichuan Lin, Junyou Li, Jianing Shi, Deheng Ye, Qiang Fu, Wei Yang Affiliations: Tencent AI Lab, Shenzhen, China Comments: The champion solution of the NeurIPS 2021 MineRL research competition (this https URL) Abstract: Learning rational behaviors in open-world games like Minecraft remains to be challenging for Reinforcement Learning (RL) research due to the compound challenge of partial observability, high-dimensional visual perception and delayed reward. To address this, we propose JueWu-MC, a sample-efficient hierarchical RL approach equipped with representation learning and imitation learning to deal with perception and exploration. Specifically, our approach includes two levels of hierarchy, where the high-level controller learns a policy to control over options and the low-level workers learn to solve each sub-task. To boost the learning of sub-tasks, we propose a combination of techniques including 1) action-aware representation learning which captures underlying relations between action and representation, 2) discriminator-based self-imitation learning for efficient exploration, and 3) ensemble behavior cloning with consistency filtering for policy robustness. Extensive experiments show that JueWu-MC significantly improves sample efficiency and outperforms a set of baselines by a large margin. Notably, we won the championship of the NeurIPS MineRL 2021 research competition and achieved the highest performance score ever.

【2】 Real-World Dexterous Object Manipulation based Deep Reinforcement Learning Link: https://arxiv.org/abs/2112.04893

Authors: Qingfeng Yao, Jilong Wang, Shuyu Yang Affiliations: Westlake University Comments: Best Paper Award Runner-Up winner submission for Real Robot Challenge 2021

【3】 VMAgent: Scheduling Simulator for Reinforcement Learning Link: https://arxiv.org/abs/2112.04785

Authors: Junjie Sheng, Shengliang Cai, Haochuan Cui, Wenhao Li, Yun Hua, Bo Jin, Wenli Zhou, Yiqiu Hu, Lei Zhu, Qian Peng, Hongyuan Zha, Xiangfeng Wang Abstract: A novel simulator called VMAgent is introduced to help RL researchers better explore new methods, especially for virtual machine scheduling. VMAgent is inspired by practical virtual machine (VM) scheduling tasks and provides an efficient simulation platform that can reflect the real situations of cloud computing. Three scenarios (fading, recovering, and expansion) are concluded from practical cloud computing and corresponds to many reinforcement learning challenges (high dimensional state and action spaces, high non-stationarity, and life-long demand). VMAgent provides flexible configurations for RL researchers to design their customized scheduling environments considering different problem features. From the VM scheduling perspective, VMAgent also helps to explore better learning-based scheduling solutions.

【4】 DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization Link: https://arxiv.org/abs/2112.04716

Authors: Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron Courville, George Tucker, Sergey Levine Affiliations: UC Berkeley, Google Research, MILA, Stanford University Abstract: Despite overparameterization, deep networks trained via supervised learning are easy to optimize and exhibit excellent generalization. One hypothesis to explain this is that overparameterized deep networks enjoy the benefits of implicit regularization induced by stochastic gradient descent, which favors parsimonious solutions that generalize well on test inputs. It is reasonable to surmise that deep reinforcement learning (RL) methods could also benefit from this effect. In this paper, we discuss how the implicit regularization effect of SGD seen in supervised learning could in fact be harmful in the offline deep RL setting, leading to poor generalization and degenerate feature representations. Our theoretical analysis shows that when existing models of implicit regularization are applied to temporal difference learning, the resulting derived regularizer favors degenerate solutions with excessive "aliasing", in stark contrast to the supervised learning case. We back up these findings empirically, showing that feature representations learned by a deep network value function trained via bootstrapping can indeed become degenerate, aliasing the representations for state-action pairs that appear on either side of the Bellman backup. To address this issue, we derive the form of this implicit regularizer and, inspired by this derivation, propose a simple and effective explicit regularizer, called DR3, that counteracts the undesirable effects of this implicit regularizer. When combined with existing offline RL methods, DR3 substantially improves performance and stability, alleviating unlearning in Atari 2600 games, D4RL domains and robotic manipulation from images.
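
The explicit regularizer can be sketched in a few lines, under the assumption that it penalizes the dot product between the Q-network's penultimate-layer features at the current and next state-action pairs of the Bellman backup; the coefficient and feature extraction here are placeholders.

```python
import torch

def dr3_penalty(phi_sa, phi_next_sa):
    """phi_sa, phi_next_sa: (batch, d) feature vectors from the Q-network."""
    return (phi_sa * phi_next_sa).sum(dim=-1).mean()

# Toy usage: random features for a batch of 32 transitions.
phi = torch.randn(32, 256)
phi_next = torch.randn(32, 256)
penalty = dr3_penalty(phi, phi_next)
# total_loss = td_error_loss + c0 * penalty   # c0: regularization coefficient
```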

【5】 Ambiguous Dynamic Treatment Regimes: A Reinforcement Learning Approach Link: https://arxiv.org/abs/2112.04571

Authors: Soroush Saghafian Affiliations: Harvard Kennedy School, Harvard University, Cambridge, MA Abstract: A main research goal in various studies is to use an observational data set and provide a new set of counterfactual guidelines that can yield causal improvements. Dynamic Treatment Regimes (DTRs) are widely studied to formalize this process. However, available methods in finding optimal DTRs often rely on assumptions that are violated in real-world applications (e.g., medical decision-making or public policy), especially when (a) the existence of unobserved confounders cannot be ignored, and (b) the unobserved confounders are time-varying (e.g., affected by previous actions). When such assumptions are violated, one often faces ambiguity regarding the underlying causal model that is needed to be assumed to obtain an optimal DTR. This ambiguity is inevitable, since the dynamics of unobserved confounders and their causal impact on the observed part of the data cannot be understood from the observed data. Motivated by a case study of finding superior treatment regimes for patients who underwent transplantation in our partner hospital and faced a medical condition known as New Onset Diabetes After Transplantation (NODAT), we extend DTRs to a new class termed Ambiguous Dynamic Treatment Regimes (ADTRs), in which the causal impact of treatment regimes is evaluated based on a "cloud" of potential causal models. We then connect ADTRs to Ambiguous Partially Observable Markov Decision Processes (APOMDPs) proposed by Saghafian (2018), and develop two Reinforcement Learning methods termed Direct Augmented V-Learning (DAV-Learning) and Safe Augmented V-Learning (SAV-Learning), which enable using the observed data to efficiently learn an optimal treatment regime. We establish theoretical results for these learning methods, including (weak) consistency and asymptotic normality. We further evaluate the performance of these learning methods both in our case study and in simulation experiments.

【6】 High-Dimensional Stock Portfolio Trading with Deep Reinforcement Learning Link: https://arxiv.org/abs/2112.04755

Authors: Uta Pigorsch, Sebastian Schäfer Affiliations: Schumpeter School of Business and Economics, University of Wuppertal, Wuppertal, Germany Comments: 14 pages, 5 figures, 2 tables Abstract: This paper proposes a Deep Reinforcement Learning algorithm for financial portfolio trading based on Deep Q-learning. The algorithm is capable of trading high-dimensional portfolios from cross-sectional datasets of any size which may include data gaps and non-unique history lengths in the assets. We sequentially set up environments by sampling one asset for each environment while rewarding investments with the resulting asset's return and cash reservation with the average return of the set of assets. This enforces the agent to strategically assign capital to assets that it predicts to perform above-average. We apply our methodology in an out-of-sample analysis to 48 US stock portfolio setups, varying in the number of stocks from ten up to 500 stocks, in the selection criteria and in the level of transaction costs. The algorithm on average outperforms all considered passive and active benchmark investment strategies by a large margin using only one hyperparameter setup for all portfolios.

【7】 Recent Advances in Reinforcement Learning in Finance Link: https://arxiv.org/abs/2112.04553

Authors: Ben Hambly, Renyuan Xu, Huining Yang Comments: 60 pages, 1 figure Abstract: The rapid changes in the finance industry due to the increasing amount of data have revolutionized the techniques on data processing and data analysis and brought new theoretical and computational challenges. In contrast to classical stochastic control theory and other analytical approaches for solving financial decision-making problems that heavily rely on model assumptions, new developments from reinforcement learning (RL) are able to make full use of the large amount of financial data with fewer model assumptions and to improve decisions in complex financial environments. This survey paper aims to review the recent developments and use of RL approaches in finance. We give an introduction to Markov decision processes, which is the setting for many of the commonly used RL approaches. Various algorithms are then introduced with a focus on value and policy based methods that do not require any model assumptions. Connections are made with neural networks to extend the framework to encompass deep RL algorithms. Our survey concludes by discussing the application of these RL algorithms in a variety of decision-making problems in finance, including optimal execution, portfolio optimization, option pricing and hedging, market making, smart order routing, and robo-advising.

Hierarchical Learning (1 paper)

【1】 Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies Link: https://arxiv.org/abs/2112.05062

Authors: Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar, Josh Merel, Nicolas Heess, Raia Hadsell Affiliations: DeepMind, London, UK Abstract: For robots operating in the real world, it is desirable to learn reusable behaviours that can effectively be transferred and adapted to numerous tasks and scenarios. We propose an approach to learn abstract motor skills from data using a hierarchical mixture latent variable model. In contrast to existing work, our method exploits a three-level hierarchy of both discrete and continuous latent variables, to capture a set of high-level behaviours while allowing for variance in how they are executed. We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours, while retaining the flexibility of a continuous latent variable model. The resulting skills can be transferred and fine-tuned on new tasks, unseen objects, and from state to vision-based policies, yielding better sample efficiency and asymptotic performance compared to existing skill- and imitation-based methods. We further analyse how and when the skills are most beneficial: they encourage directed exploration to cover large regions of the state space relevant to the task, making them most effective in challenging sparse-reward settings.

Autonomous Driving | Vehicles | Lane Detection, etc. (1 paper)

【1】 Does Redundancy in AI Perception Systems Help to Test for Super-Human Automated Driving Performance? Link: https://arxiv.org/abs/2112.04758

Authors: Hanno Gottschalk, Matthias Rottmann, Maida Saltagic Affiliations: University of Wuppertal (equal contribution) Abstract: While automated driving is often advertised with better-than-human driving performance, this work reviews that it is nearly impossible to provide direct statistical evidence on the system level that this is actually the case. The amount of labeled data needed would exceed dimensions of present day technical and economical capabilities. A commonly used strategy therefore is the use of redundancy along with the proof of sufficient subsystems' performances. As it is known, this strategy is efficient especially for the case of subsystems operating independently, i.e. the occurrence of errors is independent in a statistical sense. Here, we give some first considerations and experimental evidence that this strategy is not a free ride as the errors of neural networks fulfilling the same computer vision task, at least for some cases, show correlated occurrences of errors. This remains true, if training data, architecture, and training are kept separate or independence is trained using special loss functions. Using data from different sensors (realized by up to five 2D projections of the 3D MNIST data set) in our experiments is more efficiently reducing correlations, however not to an extent that is realizing the potential of reduction of testing data that can be obtained for redundant and statistically independent subsystems.
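
The independence check underlying this argument can be sketched directly: compare the observed joint error rate of two subsystems with the product of their individual error rates, which is what statistical independence would predict. This is a generic diagnostic, not the paper's exact experimental protocol.

```python
import numpy as np

def error_correlation(pred_a, pred_b, labels):
    err_a = (pred_a != labels).astype(float)
    err_b = (pred_b != labels).astype(float)
    p_joint = np.mean(err_a * err_b)            # both subsystems wrong at once
    p_indep = err_a.mean() * err_b.mean()       # expected if errors independent
    corr = np.corrcoef(err_a, err_b)[0, 1]
    return p_joint, p_indep, corr               # p_joint >> p_indep: correlated

# Toy usage: two classifiers with ~5% independent error rates.
labels = np.random.randint(0, 10, 10_000)
pred_a = np.where(np.random.rand(10_000) < 0.95, labels, 0)
pred_b = np.where(np.random.rand(10_000) < 0.95, labels, 0)
print(error_correlation(pred_a, pred_b, labels))
```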

Point Clouds | SLAM | Radar | LiDAR | Depth/RGBD (1 paper)

【1】 3D-VField: Learning to Adversarially Deform Point Clouds for Robust 3D Object Detection Link: https://arxiv.org/abs/2112.04764

Authors: Alexander Lehner, Stefano Gasperini, Alvaro Marcos-Ramiro, Michael Schmidt, Mohammad-Ali Nikouei Mahani, Nassir Navab, Benjamin Busam, Federico Tombari Affiliations: Technical University of Munich, BMW Group, Johns Hopkins University, Google

Federated Learning | Privacy Protection | Encryption (1 paper)

【1】 Asynchronous Semi-Decentralized Federated Edge Learning for Heterogeneous Clients Link: https://arxiv.org/abs/2112.04737

Authors: Yuchang Sun, Jiawei Shao, Yuyi Mao, Jun Zhang Affiliations: ∗ Dept. of ECE, The Hong Kong University of Science and Technology, Hong Kong, China; † Dept. of EIE, The Hong Kong Polytechnic University, Hong Kong, China

Reasoning | Analysis | Understanding | Explanation (3 papers)

【1】 PTR: A Benchmark for Part-based Conceptual, Relational, and Physical Reasoning Link: https://arxiv.org/abs/2112.05136

Authors: Yining Hong, Li Yi, Joshua B. Tenenbaum, Antonio Torralba, Chuang Gan Affiliations: UCLA, Stanford University, MIT BCS, CBMM, CSAIL, MIT CSAIL, MIT-IBM Watson AI Lab Comments: NeurIPS 2021. Project page: this http URL Abstract: A critical aspect of human visual perception is the ability to parse visual scenes into individual objects and further into object parts, forming part-whole hierarchies. Such composite structures could induce a rich set of semantic concepts and relations, thus playing an important role in the interpretation and organization of visual signals as well as for the generalization of visual perception and reasoning. However, existing visual reasoning benchmarks mostly focus on objects rather than parts. Visual reasoning based on the full part-whole hierarchy is much more challenging than object-centric reasoning due to finer-grained concepts, richer geometry relations, and more complex physics. Therefore, to better serve for part-based conceptual, relational and physical reasoning, we introduce a new large-scale diagnostic visual reasoning dataset named PTR. PTR contains around 70k RGBD synthetic images with ground truth object and part level annotations regarding semantic instance segmentation, color attributes, spatial and geometric relationships, and certain physical properties such as stability. These images are paired with 700k machine-generated questions covering various types of reasoning types, making them a good testbed for visual reasoning models. We examine several state-of-the-art visual reasoning models on this dataset and observe that they still make many surprising mistakes in situations where humans can easily infer the correct answer. We believe this dataset will open up new opportunities for part-based reasoning.

【2】 Automated Side Channel Analysis of Media Software with Manifold Learning 标题:基于流形学习的媒体软件自动旁路分析 链接:https://arxiv.org/abs/2112.04947

作者:Yuanyuan Yuan,Qi Pang,Shuai Wang 机构:The Hong Kong University of Science and Technology 摘要:云计算和机器学习即服务的蓬勃发展,使媒体软件被广泛用于处理机密媒体数据。本文探讨了攻击者针对媒体软件发起侧信道分析(SCA)以重建机密媒体输入的能力。表示学习和感知学习的最新进展启发我们将"从侧信道迹线重建媒体输入"视为一个跨模态流形学习任务,可以用统一的方式处理:训练一个自编码器框架来学习媒体输入与侧信道观测之间的映射。我们进一步在自编码器中引入注意力机制,以定位对SCA贡献最大的程序点,从而自动确定媒体软件中的信息泄漏点。我们还提出了一种新颖且高效的防御技术,称为感知致盲,它可以用感知掩码扰动媒体输入,从而缓解基于流形学习的SCA。我们的评估利用三种流行的媒体软件重建图像、音频和文本格式的输入。我们分析了三种常见的侧信道——缓存库、缓存行和页表——以及由标准Prime+Probe记录的仅限用户空间的缓存集访问。我们的框架成功地从被评估的媒体软件中重建出高质量的机密输入,并自动定位其易受攻击的程序点,其中许多是公众未知的。我们进一步表明,感知致盲能够以可忽略的额外开销缓解基于流形学习的SCA。 摘要:The prosperous development of cloud computing and machine learning as a service has led to the widespread use of media software to process confidential media data. This paper explores an adversary's ability to launch side channel analyses (SCA) against media software to reconstruct confidential media inputs. Recent advances in representation learning and perceptual learning inspired us to consider the reconstruction of media inputs from side channel traces as a cross-modality manifold learning task that can be addressed in a unified manner with an autoencoder framework trained to learn the mapping between media inputs and side channel observations. We further enhance the autoencoder with attention to localize the program points that make the primary contribution to SCA, thus automatically pinpointing information-leakage points in media software. We also propose a novel and highly effective defensive technique called perception blinding that can perturb media inputs with perception masks and mitigate manifold learning-based SCA. Our evaluation exploits three popular media software to reconstruct inputs in image, audio, and text formats. We analyze three common side channels - cache bank, cache line, and page tables - and userspace-only cache set accesses logged by standard Prime Probe. Our framework successfully reconstructs high-quality confidential inputs from the assessed media software and automatically pinpoint their vulnerable program points, many of which are unknown to the public. We further show that perception blinding can mitigate manifold learning-based SCA with negligible extra cost.

【3】 Identification of Twitter Bots based on an Explainable ML Framework: the US 2020 Elections Case Study 标题:基于可解释ML框架的Twitter机器人识别:美国2020年选举案例研究 链接:https://arxiv.org/abs/2112.04913

作者:Alexander Shevtsov,Christos Tzagkarakis,Despoina Antonakaki,Sotiris Ioannidis 机构: Institute of Computer Science, Foundation for Research and Technology, Technical University of Crete, Computer Science Department - University of Crete 摘要:推特是最受欢迎的社交网络之一,吸引了数百万用户,同时捕获了相当大比例的在线话语。它提供了一个使用简单的短消息框架和高效的应用程序编程接口(API),使研究界能够研究和分析这个社交网络的多个方面。然而,Twitter使用上的简单性也可能被各类机器人恶意利用。这种恶意操纵现象在网络话语中不断扩大,在选举期间尤为明显:除了用于信息传播和沟通目的的合法机器人之外,还存在旨在将公众舆论和选民引向特定方向、特定意识形态或政党的恶意机器人。本文的重点是设计一个基于标记Twitter数据的新型Twitter机器人识别系统。为此,我们采用了使用极端梯度提升(XGBoost)算法的有监督机器学习(ML)框架,并通过交叉验证调整超参数。我们的研究还部署了Shapley加性解释(SHAP),通过基于博弈论的Shapley值计算特征重要性,来解释ML模型的预测。在不同Twitter数据集上的实验评估表明,与最新的Twitter机器人检测方法相比,我们的方法在机器人检测准确性方面更具优势。 摘要:Twitter is one of the most popular social networks attracting millions of users, while a considerable proportion of online discourse is captured. It provides a simple usage framework with short messages and an efficient application programming interface (API) enabling the research community to study and analyze several aspects of this social network. However, the Twitter usage simplicity can lead to malicious handling by various bots. The malicious handling phenomenon expands in online discourse, especially during the electoral periods, where except the legitimate bots used for dissemination and communication purposes, the goal is to manipulate the public opinion and the electorate towards a certain direction, specific ideology, or political party. This paper focuses on the design of a novel system for identifying Twitter bots based on labeled Twitter data. To this end, a supervised machine learning (ML) framework is adopted using an Extreme Gradient Boosting (XGBoost) algorithm, where the hyper-parameters are tuned via cross-validation. Our study also deploys Shapley Additive Explanations (SHAP) for explaining the ML model predictions by calculating feature importance, using the game theoretic-based Shapley values. Experimental evaluation on distinct Twitter datasets demonstrate the superiority of our approach, in terms of bot detection accuracy, when compared against a recent state-of-the-art Twitter bot detection method.
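
作为对摘要所述流程的一个最小可运行示意(基于公开的 xgboost、shap 与 scikit-learn 库,使用占位数据,并非论文的原始代码),下面演示交叉验证调参的 XGBoost 分类器与基于 Shapley 值的特征重要性计算:

```python
import numpy as np
import xgboost as xgb
import shap
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV

# 占位数据:实际任务中应为带标签的 Twitter 账户特征
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

param_grid = {"max_depth": [3, 6], "n_estimators": [100, 300], "learning_rate": [0.05, 0.1]}
search = GridSearchCV(xgb.XGBClassifier(eval_metric="logloss"),
                      param_grid, cv=5, scoring="f1")       # 交叉验证调参
search.fit(X, y)

explainer = shap.TreeExplainer(search.best_estimator_)      # 基于博弈论 Shapley 值的解释
shap_values = explainer.shap_values(X)
top = np.abs(shap_values).mean(axis=0).argsort()[::-1][:5]  # 平均|SHAP|最大的特征
print("最重要的 5 个特征索引:", top)
```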

检测相关(4篇)

【1】 A Survey on Echo Chambers on Social Media: Description, Detection and Mitigation 标题:社交媒体上回音室的调查:描述、检测与缓解 链接:https://arxiv.org/abs/2112.05084

作者:Faisal Alatawi,Lu Cheng,Anique Tahir,Mansooreh Karami,Bohan Jiang,Tyler Black,Huan Liu 机构:Arizona State University - DMML Lab 备注:21 pages, 5 figures

【2】 Scalable and Decentralized Algorithms for Anomaly Detection via Learning-Based Controlled Sensing 标题:通过基于学习的受控感知实现可扩展的分散式异常检测算法 链接:https://arxiv.org/abs/2112.04912

作者:Geethu Joseph,Chen Zhong,M. Cenk Gursoy,Senem Velipasalar,Pramod K. Varshney 备注:13 pages, 4 figures. arXiv admin note: substantial text overlap with arXiv:2105.06289 摘要:我们解决的问题是:从给定集合中依次选择并观测过程,以发现其中的异常。决策者在任一给定时刻观测过程的一个子集,并获得相应过程是否异常的噪声二元指示。在此设置下,我们开发了一种异常检测算法,它选择在给定时刻要观测的过程、决定何时停止观测,并宣布对异常过程的判定。检测算法的目标是在识别精度超过期望值的同时,最小化决策延迟。我们设计了一种集中式算法,其中各过程由一个公共代理联合选择;以及一种分散式算法,其中是否选择某个过程由每个过程独立决定。我们的算法依赖于一个马尔可夫决策过程,它由以观测为条件的每个过程正常或异常的边际概率来定义。我们使用深度演员-评论家(actor-critic)强化学习框架实现检测算法。与之前在该主题上复杂性随过程数量呈指数增长的工作不同,我们的算法在计算和内存方面的需求都只是过程数量的多项式。我们通过数值实验并与最新方法进行比较,证明了这些算法的有效性。 摘要:We address the problem of sequentially selecting and observing processes from a given set to find the anomalies among them. The decision-maker observes a subset of the processes at any given time instant and obtains a noisy binary indicator of whether or not the corresponding process is anomalous. In this setting, we develop an anomaly detection algorithm that chooses the processes to be observed at a given time instant, decides when to stop taking observations, and declares the decision on anomalous processes. The objective of the detection algorithm is to identify the anomalies with an accuracy exceeding the desired value while minimizing the delay in decision making. We devise a centralized algorithm where the processes are jointly selected by a common agent as well as a decentralized algorithm where the decision of whether to select a process is made independently for each process. Our algorithms rely on a Markov decision process defined using the marginal probability of each process being normal or anomalous, conditioned on the observations. We implement the detection algorithms using the deep actor-critic reinforcement learning framework. Unlike prior work on this topic that has exponential complexity in the number of processes, our algorithms have computational and memory requirements that are both polynomial in the number of processes. We demonstrate the efficacy of these algorithms using numerical experiments by comparing them with state-of-the-art methods.
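
以下是一个最小示意(记号与参数均为假设性设定),说明摘要中"以观测为条件的每个过程正常或异常的边际概率"如何通过贝叶斯规则随噪声二元观测递推更新:

```python
def update_posterior(p: float, obs: int, fp: float = 0.2, fn: float = 0.2) -> float:
    """给定噪声二元观测 obs(1 表示指示异常),更新某过程为异常的边际概率 p。
    fp / fn 为观测的假阳性 / 假阴性率(示意性参数)。"""
    like_anom = (1 - fn) if obs == 1 else fn      # P(obs | 过程异常)
    like_norm = fp if obs == 1 else (1 - fp)      # P(obs | 过程正常)
    num = like_anom * p
    return num / (num + like_norm * (1 - p))      # 贝叶斯规则归一化

p = 0.5                          # 先验异常概率
for obs in [1, 1, 0, 1, 1]:      # 一串噪声观测
    p = update_posterior(p, obs)
    print(f"观测 {obs} 后的异常后验概率: {p:.3f}")
```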

【3】 Combining Textual Features for the Detection of Hateful and Offensive Language 标题:结合文本特征检测仇恨和攻击性语言 链接:https://arxiv.org/abs/2112.04803

作者:Sherzod Hakimov,Ralph Ewerth 机构:TIB - Leibniz Information Centre for Science and Technology, Hannover, Germany, Leibniz University Hannover, L3S Research Center, Hannover, Germany 备注:HASOC 2021, Forum for Information Retrieval Evaluation, 2021 摘要:由于社交网络中的许多用户每天都面临网络欺凌活动,攻击性、仇恨性和亵渎性语言的检测已成为一项关键挑战。在本文中,我们分析了如何结合不同的文本特征来检测Twitter上的仇恨或攻击性帖子。我们提供了详细的实验评估,以了解神经网络架构中每个构建块的影响。所提出的架构以团队名称TIB-VA在HASOC-2021数据集的英语子任务1A(从帖子中识别仇恨、攻击性和亵渎性内容)上进行了评估。我们比较了上下文词嵌入的不同变体,并将其与字符级嵌入以及所收集仇恨词表的编码相结合。 摘要:The detection of offensive, hateful and profane language has become a critical challenge since many users in social networks are exposed to cyberbullying activities on a daily basis. In this paper, we present an analysis of combining different textual features for the detection of hateful or offensive posts on Twitter. We provide a detailed experimental evaluation to understand the impact of each building block in a neural network architecture. The proposed architecture is evaluated on the English Subtask 1A: Identifying Hate, offensive and profane content from the post datasets of HASOC-2021 dataset under the team name TIB-VA. We compared different variants of the contextual word embeddings combined with the character level embeddings and the encoding of collected hate terms.

【4】 Detecting Potentially Harmful and Protective Suicide-related Content on Twitter: A Machine Learning Approach 标题:检测Twitter上与自杀相关的潜在有害和保护性内容:一种机器学习方法 链接:https://arxiv.org/abs/2112.04796

作者:Hannah Metzler,Hubert Baginski,Thomas Niederkrotenthaler,David Garcia 机构:Complexity Science Hub Vienna, Austria; Section for Science of Complex Systems, Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Austria 摘要:研究表明,接触与自杀相关的新闻媒体内容与自杀率相关,其中一些内容特征可能具有有害影响,而另一些可能具有保护作用。虽然对一些选定的特征已有良好的证据,但总体上仍缺乏系统的大规模调查,对社交媒体数据尤其如此。我们应用机器学习方法自动标注大量Twitter数据。我们开发了一种新的标注方案,将与自杀相关的推文分类为不同的消息类型以及以问题或解决方案为中心的视角。然后,我们训练了一组基准机器学习模型,包括一个多数类分类器、一种基于词频的方法(TF-IDF与线性SVM)以及两个最先进的深度学习模型(BERT、XLNet)。两个深度学习模型在两项分类任务中表现最佳:首先,我们对六个主要内容类别进行分类,包括关于自杀意念与企图或应对的个人故事、旨在传播问题意识或预防相关信息的行动呼吁、自杀案例报道,以及其他与自杀相关和离题的推文。深度学习模型在六个类别上的平均准确率超过73%,除自杀意念与企图类别(55%)外,所有类别的F1得分在69%到85%之间。其次,在将涉及实际自杀的帖子与离题推文分开时,它们正确标记了约88%的推文,其中BERT在这两个类别上的F1得分分别为93%和74%。这些分类性能与类似任务的最新水平相当。通过提高数据标注的效率,这项工作使未来能够大规模调查各类社交媒体内容对自杀率和求助行为的有害与保护作用。 摘要:Research shows that exposure to suicide-related news media content is associated with suicide rates, with some content characteristics likely having harmful and others potentially protective effects. Although good evidence exists for a few selected characteristics, systematic large scale investigations are missing in general, and in particular for social media data. We apply machine learning methods to automatically label large quantities of Twitter data. We developed a novel annotation scheme that classifies suicide-related tweets into different message types and problem- vs. solution-focused perspectives. We then trained a benchmark of machine learning models including a majority classifier, an approach based on word frequency (TF-IDF with a linear SVM) and two state-of-the-art deep learning models (BERT, XLNet). The two deep learning models achieved the best performance in two classification tasks: First, we classified six main content categories, including personal stories about either suicidal ideation and attempts or coping, calls for action intending to spread either problem awareness or prevention-related information, reportings of suicide cases, and other suicide-related and off-topic tweets. The deep learning models reach accuracy scores above 73% on average across the six categories, and F1-scores in between 69% and 85% for all but the suicidal ideation and attempts category (55%). Second, in separating postings referring to actual suicide from off-topic tweets, they correctly labelled around 88% of tweets, with BERT achieving F1-scores of 93% and 74% for the two categories. These classification performances are comparable to the state-of-the-art on similar tasks. By making data labeling more efficient, this work enables future large-scale investigations on harmful and protective effects of various kinds of social media content on suicide rates and on help-seeking behavior.
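
下面给出摘要中基于词频的基线(TF-IDF 特征加线性 SVM)的一个最小可运行示意,使用 scikit-learn;文本与类别标签均为占位示例,并非论文的数据集:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# 占位样本:真实任务中应为人工标注的自杀相关推文
texts = ["reaching out for help saved me",
         "latest news report on the city",
         "feeling hopeless today"]
labels = ["coping", "off-topic", "ideation"]   # 假设性类别名

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
clf.fit(texts, labels)                          # 训练 TF-IDF + 线性 SVM 流水线
print(clf.predict(["news about the event"]))    # 预测新推文的类别
```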

分类|识别(5篇)

【1】 Model Doctor: A Simple Gradient Aggregation Strategy for Diagnosing and Treating CNN Classifiers 标题:模型医生:一种诊断和治疗CNN分类器的简单梯度聚合策略 链接:https://arxiv.org/abs/2112.04934

作者:Zunlei Feng,Jiacong Hu,Sai Wu,Xiaotian Yu,Jie Song,Mingli Song 机构:Zhejiang University 备注:Accepted by AAAI 2022 摘要:最近,卷积神经网络(CNN)在分类任务中取得了优异的性能。众所周知,CNN被视为一个"黑箱",其预测机制难以理解,错误的预测也难以调试。为解决上述缺陷,已有一些模型调试和解释方面的工作被提出。然而,这些方法侧重于解释和诊断模型预测的可能原因,研究人员需在此基础上手动进行后续的模型优化。在本文中,我们提出了第一个完全自动的模型诊断和治疗工具,称为Model Doctor(模型医生)。基于以下两个发现:1)每个类别仅与稀疏且特定的卷积核相关;2)对抗样本在特征空间中是孤立的,而正常样本是连续分布的,我们设计了一个简单的聚合梯度约束,用于有效诊断和优化CNN分类器。聚合梯度策略是适用于主流CNN分类器的通用模块。大量实验表明,所提出的Model Doctor适用于所有现有的CNN分类器,并将16个主流CNN分类器的准确率提高了1%-5%。 摘要:Recently, Convolutional Neural Network (CNN) has achieved excellent performance in the classification task. It is widely known that CNN is deemed as a 'black-box', which is hard for understanding the prediction mechanism and debugging the wrong prediction. Some model debugging and explanation works are developed for solving the above drawbacks. However, those methods focus on explanation and diagnosing possible causes for model prediction, based on which the researchers handle the following optimization of models manually. In this paper, we propose the first completely automatic model diagnosing and treating tool, termed as Model Doctor. Based on two discoveries that 1) each category is only correlated with sparse and specific convolution kernels, and 2) adversarial samples are isolated while normal samples are successive in the feature space, a simple aggregate gradient constraint is devised for effectively diagnosing and optimizing CNN classifiers. The aggregate gradient strategy is a versatile module for mainstream CNN classifiers. Extensive experiments demonstrate that the proposed Model Doctor applies to all existing CNN classifiers, and improves the accuracy of $16$ mainstream CNN classifiers by 1%-5%.

【2】 Differentially Private Ensemble Classifiers for Data Streams 标题:数据流的差分私有集成分类器 链接:https://arxiv.org/abs/2112.04640

作者:Lovedeep Gondara,Ke Wang,Ricardo Silva Carvalho 机构:School of Computing Science, Simon Fraser University, British Columbia, Canada 备注:Accepted at WSDM 2022 摘要:通过分类/回归从连续数据流中学习在许多领域都很普遍。在适应不断变化的数据特征(概念漂移)的同时保护数据所有者的私人信息,是一个开放的挑战。我们针对该问题提出了一个差分隐私集成解决方案,它具有两个显著特点:在固定隐私预算下,它允许无限制(unbounded)次数的集成更新,以处理可能永无止境的数据流;并且它是模型无关的(model agnostic),即把任何预训练的差分隐私分类/回归模型都视为黑箱。在真实世界和模拟数据集上,我们的方法在隐私、概念漂移和数据分布的不同设置下均优于竞争方法。 摘要:Learning from continuous data streams via classification/regression is prevalent in many domains. Adapting to evolving data characteristics (concept drift) while protecting data owners' private information is an open challenge. We present a differentially private ensemble solution to this problem with two distinguishing features: it allows an unbounded number of ensemble updates to deal with the potentially never-ending data streams under a fixed privacy budget, and it is model agnostic, in that it treats any pre-trained differentially private classification/regression model as a black-box. Our method outperforms competitors on real-world and simulated datasets for varying settings of privacy, concept drift, and data distribution.

【3】 The perils of being unhinged: On the accuracy of classifiers minimizing a noise-robust convex loss 标题:"unhinged"的危险:关于最小化噪声鲁棒凸损失的分类器的精度 链接:https://arxiv.org/abs/2112.04590

作者:Philip M. Long,Rocco A. Servedio 机构:Google, Columbia University 摘要:van Rooyen等人引入了凸损失函数对随机分类噪声具有鲁棒性的概念,并证明了"unhinged"损失函数在此意义上是鲁棒的。在本文中,我们研究了通过最小化unhinged损失得到的二元分类器的精度,并观察到:即使对于简单的线性可分数据分布,最小化unhinged损失也可能只能得到精度不优于随机猜测的二元分类器。 摘要:van Rooyen et al. introduced a notion of convex loss functions being robust to random classification noise, and established that the "unhinged" loss function is robust in this sense. In this note we study the accuracy of binary classifiers obtained by minimizing the unhinged loss, and observe that even for simple linearly separable data distributions, minimizing the unhinged loss may only yield a binary classifier with accuracy no better than random guessing.
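
作为补充,unhinged 损失的常用定义如下(依据 van Rooyen 等人的原始文献,记号为示意),由此也可直观看出其局限:

```latex
% unhinged 损失:关于间隔 v = y f(x) 的线性函数
\ell_{\mathrm{unh}}(v) = 1 - v, \qquad v = y\,f(x),\quad y \in \{-1, +1\}
% 与 hinge 损失 \max(0,\, 1 - v) 不同,它去掉了"铰链"处的截断,
% 因而是凸的且对对称标签噪声鲁棒;但最小化其期望等价于最大化 \mathbb{E}[y f(x)],
% 最优解本质上由数据的均值方向决定,这与本文"精度可能不优于随机猜测"的观察相一致。
```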

【4】 Merging Subject Matter Expertise and Deep Convolutional Neural Network for State-Based Online Machine-Part Interaction Classification 标题:融合主题知识和深卷积神经网络的基于状态的在线机械零件交互分类 链接:https://arxiv.org/abs/2112.04572

作者:Hao Wang,Yassine Qamsane,James Moyne,Kira Barton 机构:Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 备注:Published at ASME Manufacturing Science and Engineering Conference (MSEC)

【5】 SoK: Anti-Facial Recognition Technology 标题:SOK:反人脸识别技术 链接:https://arxiv.org/abs/2112.04558

作者:Emily Wenger,Shawn Shan,Haitao Zheng,Ben Y. Zhao 机构:Department of Computer Science, The University of Chicago 备注:13 pages 摘要:近年来,政府和商业实体迅速采用面部识别(FR)技术,引发了人们对公民自由和隐私的担忧。作为回应,一系列所谓的"反面部识别"(AFR)工具被开发出来,以帮助用户避免不必要的面部识别。过去几年提出的AFR工具种类繁多且快速演进,因此有必要退后一步,考虑AFR系统更广泛的设计空间和长期挑战。本文旨在填补这一空白,首次对AFR研究领域进行了全面分析。以FR系统的运行阶段为起点,我们建立了一个系统化框架,用于分析不同AFR方法的收益与权衡。随后,我们考察AFR工具所面临的技术和社会挑战,并提出该领域未来的研究方向。 摘要:The rapid adoption of facial recognition (FR) technology by both government and commercial entities in recent years has raised concerns about civil liberties and privacy. In response, a broad suite of so-called "anti-facial recognition" (AFR) tools has been developed to help users avoid unwanted facial recognition. The set of AFR tools proposed in the last few years is wide-ranging and rapidly evolving, necessitating a step back to consider the broader design space of AFR systems and long-term challenges. This paper aims to fill that gap and provides the first comprehensive analysis of the AFR research landscape. Using the operational stages of FR systems as a starting point, we create a systematic framework for analyzing the benefits and tradeoffs of different AFR approaches. We then consider both technical and social challenges facing AFR tools and propose directions for future research in this field.

表征(5篇)

【1】 Neural Descriptor Fields: SE(3)-Equivariant Object Representations for Manipulation 标题:神经描述符场:用于操作的SE(3)-等变对象表示 链接:https://arxiv.org/abs/2112.05124

作者:Anthony Simeonov,Yilun Du,Andrea Tagliasacchi,Joshua B. Tenenbaum,Alberto Rodriguez,Pulkit Agrawal,Vincent Sitzmann 机构:Massachusetts Institute of Technology, Google Research, University of Toronto 备注:Website: this https URL First two authors contributed equally (order determined by coin flip), last two authors equal advising 摘要:我们提出了神经描述符场(NDF),这是一种对象表示,通过类别级描述符对对象与目标(如机器人夹具或用于悬挂的挂架)之间的点和相对位姿进行编码。我们将该表示用于对象操作:给定一次任务演示,我们希望在同一类别的新对象实例上重复相同的任务。我们建议通过(借助优化)搜索其描述符与演示中观察到的描述符相匹配的位姿来实现这一目标。NDF通过一个不依赖专家标注关键点的3D自编码任务,以自监督方式方便地训练。此外,NDF是SE(3)-等变的,保证其性能可泛化到所有可能的3D对象平移和旋转。我们在模拟和真实机器人上演示了从少量(5-10次)演示中学习操作任务。我们的方法可在不同对象实例和6自由度对象位姿之间泛化,并显著优于最近依赖2D描述符的基线。项目网站:https://yilundu.github.io/ndf/. 摘要:We present Neural Descriptor Fields (NDFs), an object representation that encodes both points and relative poses between an object and a target (such as a robot gripper or a rack used for hanging) via category-level descriptors. We employ this representation for object manipulation, where given a task demonstration, we want to repeat the same task on a new object instance from the same category. We propose to achieve this objective by searching (via optimization) for the pose whose descriptor matches that observed in the demonstration. NDFs are conveniently trained in a self-supervised fashion via a 3D auto-encoding task that does not rely on expert-labeled keypoints. Further, NDFs are SE(3)-equivariant, guaranteeing performance that generalizes across all possible 3D object translations and rotations. We demonstrate learning of manipulation tasks from few (5-10) demonstrations both in simulation and on a real robot. Our performance generalizes across both object instances and 6-DoF object poses, and significantly outperforms a recent baseline that relies on 2D descriptors. Project website: https://yilundu.github.io/ndf/.

【2】 Learning Personal Representations from fMRI by Predicting Neurofeedback Performance 标题:通过预测神经反馈性能从fMRI学习个人表征 链接:https://arxiv.org/abs/2112.04902

作者:Jhonathan Osin,Lior Wolf,Guy Gurevitch,Jackob Nimrod Keynan,Tom Fruchtman-Steinbok,Ayelet Or-Borichev,Shira Reznik Balter,Talma Hendler 机构: School of Computer Science, Tel Aviv University, Sagol Brain Institue, Tel-Aviv Sourasky Medical Center, School of Psychological Sciences, Tel Aviv University, Sagol School of Neuroscience, Tel Aviv University 备注:None 摘要:我们提出了一种深度神经网络方法,用于在功能磁共振成像(fMRI)的引导下,为执行自我神经调节任务的个体学习个人表征。这种神经反馈任务(观察与调节)根据受试者杏仁核信号的下调为其提供持续反馈,学习算法关注该脑区活动的时间进程。该表征由一个自监督的递归神经网络学习:该网络在给定最近若干fMRI帧的条件下预测下一帧中的杏仁核活动,并以学习到的个体表征为条件。结果表明,个体表征显著改善了下一帧预测。此外,这种仅从fMRI图像中学习到的个人表征,在精神特征的线性预测中表现良好,优于基于临床数据和人格测试的预测。我们的代码作为补充材料附后,数据将在获得伦理批准后共享。 摘要:We present a deep neural network method for learning a personal representation for individuals that are performing a self neuromodulation task, guided by functional MRI (fMRI). This neurofeedback task (watch vs. regulate) provides the subjects with a continuous feedback contingent on down regulation of their Amygdala signal and the learning algorithm focuses on this region's time-course of activity. The representation is learned by a self-supervised recurrent neural network, that predicts the Amygdala activity in the next fMRI frame given recent fMRI frames and is conditioned on the learned individual representation. It is shown that the individuals' representation improves the next-frame prediction considerably. Moreover, this personal representation, learned solely from fMRI images, yields good performance in linear prediction of psychiatric traits, which is better than performing such a prediction based on clinical data and personality tests. Our code is attached as supplementary and the data would be shared subject to ethical approvals.

【3】 Next Steps: Learning a Disentangled Gait Representation for Versatile Quadruped Locomotion 标题:下一步:学习多功能四足移动的解缠步态表示 链接:https://arxiv.org/abs/2112.04809

作者:Alexander L. Mitchell,Wolfgang Merkt,Mathieu Geisert,Siddhant Gangapurwala,Martin Engelcke,Oiwi Parker Jones,Ioannis Havoutis,Ingmar Posner 机构:Oxford Robotics Institute, University of Oxford 备注:8 pages, 6 figures, under review at Robotics and Automation Letters (RA-L) 摘要:四足运动正迅速走向成熟,机器人如今已能经常穿越各种非结构化地形。然而,虽然通常可以通过从一系列预先计算的风格中进行选择来改变步态,但当前的规划器无法在机器人运动过程中连续改变关键步态参数。即时合成具有非常规操作特征的步态,甚至混合动态机动,都超出了当前最先进技术的能力。在这项工作中,我们通过学习一个捕捉构成特定步态的关键支撑相的潜在空间来解决这一限制。这是通过在单一小跑风格上训练的生成模型实现的:该模型鼓励解耦,使得对潜在状态的单个维度施加驱动信号时,能诱导出合成连续多样小跑风格的整体计划。我们证明了驱动信号的特定属性直接映射到步频、抬足高度和完整支撑相持续时间等步态参数。由于我们方法的性质,这些合成步态可以在机器人运行期间在线连续变化,并稳健地捕捉到远超训练期间所见相对狭窄行为的丰富运动。此外,生成模型的使用有助于检测和缓解干扰,从而提供一个通用且稳健的规划框架。我们在真实的ANYmal四足机器人上评估了我们的方法,并证明该方法能够连续混合动态小跑风格,同时对外部扰动保持鲁棒性和反应能力。 摘要:Quadruped locomotion is rapidly maturing to a degree where robots now routinely traverse a variety of unstructured terrains. However, while gaits can be varied typically by selecting from a range of pre-computed styles, current planners are unable to vary key gait parameters continuously while the robot is in motion. The synthesis, on-the-fly, of gaits with unexpected operational characteristics or even the blending of dynamic manoeuvres lies beyond the capabilities of the current state-of-the-art. In this work we address this limitation by learning a latent space capturing the key stance phases constituting a particular gait. This is achieved via a generative model trained on a single trot style, which encourages disentanglement such that application of a drive signal to a single dimension of the latent state induces holistic plans synthesising a continuous variety of trot styles. We demonstrate that specific properties of the drive signal map directly to gait parameters such as cadence, foot step height and full stance duration. Due to the nature of our approach these synthesised gaits are continuously variable online during robot operation and robustly capture a richness of movement significantly exceeding the relatively narrow behaviour seen during training. In addition, the use of a generative model facilitates the detection and mitigation of disturbances to provide a versatile and robust planning framework. We evaluate our approach on a real ANYmal quadruped robot and demonstrate that our method achieves a continuous blend of dynamic trot styles whilst being robust and reactive to external perturbations.

【4】 BACON: Band-limited Coordinate Networks for Multiscale Scene Representation 标题:BACON:多尺度场景表示的带限坐标网络 链接:https://arxiv.org/abs/2112.04645

作者:David B. Lindell,Dave Van Veen,Jeong Joon Park,Gordon Wetzstein 机构:Stanford University 摘要:基于坐标的网络已成为三维表示和场景重建的有力工具。这些网络经过训练,可以将连续输入坐标映射到每个点的信号值。尽管如此,当前的体系结构仍然是黑匣子:它们的光谱特征不容易分析,它们在无监督点的行为也很难预测。此外,这些网络通常被训练成以单个尺度表示信号,因此简单的下采样或上采样会导致伪影。我们介绍了带限坐标网络(BACON),一种具有解析傅里叶谱的网络结构。BACON在无监督点上具有可预测的行为,可以根据所表示信号的频谱特征进行设计,并且可以在没有明确监督的情况下在多个尺度上表示信号。我们用符号距离函数演示了BACON对图像、辐射场和3D场景的多尺度神经表示,并表明它在可解释性和质量方面优于传统的单尺度坐标网络。 摘要:Coordinate-based networks have emerged as a powerful tool for 3D representation and scene reconstruction. These networks are trained to map continuous input coordinates to the value of a signal at each point. Still, current architectures are black boxes: their spectral characteristics cannot be easily analyzed, and their behavior at unsupervised points is difficult to predict. Moreover, these networks are typically trained to represent a signal at a single scale, and so naive downsampling or upsampling results in artifacts. We introduce band-limited coordinate networks (BACON), a network architecture with an analytical Fourier spectrum. BACON has predictable behavior at unsupervised points, can be designed based on the spectral characteristics of the represented signal, and can represent signals at multiple scales without explicit supervision. We demonstrate BACON for multiscale neural representation of images, radiance fields, and 3D scenes using signed distance functions and show that it outperforms conventional single-scale coordinate networks in terms of interpretability and quality.

【5】 Deep Molecular Representation Learning via Fusing Physical and Chemical Information 标题:基于物理化学信息融合的深度分子表征学习 链接:https://arxiv.org/abs/2112.04624

作者:Shuwen Yang,Ziyao Li,Guojie Song,Lingsheng Cai 机构:Key Laboratory of Machine Perception and Intelligence (MOE), Center for Data Science, Peking University, Beijing, China 备注:In NeurIPS-2021, 18 pages, 5 figures, appendix included 摘要:分子表征学习是深度学习与分子科学相结合的第一步,也是至关重要的一步。为了突破分子表征学习的界限,我们提出了PhysChem,这是一种通过融合分子的物理和化学信息来学习分子表征的新型神经结构。PhysChem由物理学家网络(PhysNet)和化学家网络(ChemNet)组成。PhysNet是一个神经物理引擎,通过参数化力模拟分子动力学来学习分子构象;ChemNet实现几何感知的深层信息传递,以了解分子的化学/生物医学特性。两个网络专门从事各自的任务,并通过相互提供专业知识进行合作。通过融合物理和化学信息,PhysChem在标准分子机器学习基准MoleculeNet上实现了最先进的性能。PhysChem的有效性在SARS-CoV-2的尖端数据集上得到进一步证实。 摘要:Molecular representation learning is the first yet vital step in combining deep learning and molecular science. To push the boundaries of molecular representation learning, we present PhysChem, a novel neural architecture that learns molecular representations via fusing physical and chemical information of molecules. PhysChem is composed of a physicist network (PhysNet) and a chemist network (ChemNet). PhysNet is a neural physical engine that learns molecular conformations through simulating molecular dynamics with parameterized forces; ChemNet implements geometry-aware deep message-passing to learn chemical / biomedical properties of molecules. Two networks specialize in their own tasks and cooperate by providing expertise to each other. By fusing physical and chemical information, PhysChem achieved state-of-the-art performances on MoleculeNet, a standard molecular machine learning benchmark. The effectiveness of PhysChem was further corroborated on cutting-edge datasets of SARS-CoV-2.

优化|敛散性(5篇)

【1】 A Fully Single Loop Algorithm for Bilevel Optimization without Hessian Inverse 标题:一种无需Hessian逆的双层优化完全单循环算法 链接:https://arxiv.org/abs/2112.04660

作者:Junyi Li,Bin Gu,Heng Huang 机构:University of Pittsburgh, MBZUAI 备注:To appear in AAAI 2022 摘要:在本文中,我们提出了一种新的、无需Hessian逆的完全单循环算法(FSLA),用于求解双层优化问题。经典的双层优化算法采用双循环结构,计算代价高昂。最近,人们提出了几种交替优化内外变量的单循环算法。然而,这些算法尚未完全实现单循环,因为它们忽略了为给定的内外状态计算超梯度所需的循环。为了开发完全单循环算法,我们首先研究了超梯度的结构,并给出了超梯度计算的一般近似公式,它涵盖了以前的几种常用方法,例如时间反向传播、共轭梯度等。基于这一公式,我们引入一个新的状态变量来维护历史超梯度信息。将新公式与内外变量的交替更新相结合,我们提出了一种高效的完全单循环算法。我们从理论上证明了新状态产生的误差是有界的,并且我们的算法以$O(\epsilon^{-2})$的速率收敛。最后,我们通过多个基于双层优化的机器学习任务实证验证了算法的有效性。 摘要:In this paper, we propose a new Hessian inverse free Fully Single Loop Algorithm (FSLA) for bilevel optimization problems. Classic algorithms for bilevel optimization admit a double loop structure which is computationally expensive. Recently, several single loop algorithms have been proposed with optimizing the inner and outer variable alternatively. However, these algorithms not yet achieve fully single loop. As they overlook the loop needed to evaluate the hyper-gradient for a given inner and outer state. In order to develop a fully single loop algorithm, we first study the structure of the hyper-gradient and identify a general approximation formulation of hyper-gradient computation that encompasses several previous common approaches, e.g. back-propagation through time, conjugate gradient, etc. Based on this formulation, we introduce a new state variable to maintain the historical hyper-gradient information. Combining our new formulation with the alternative update of the inner and outer variables, we propose an efficient fully single loop algorithm. We theoretically show that the error generated by the new state can be bounded and our algorithm converges with the rate of $O(\epsilon^{-2})$. Finally, we verify the efficacy of our algorithm empirically through multiple bilevel optimization based machine learning tasks.
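
为帮助理解摘要中的"超梯度",下面给出标准双层优化问题及由隐函数定理导出的超梯度的常见形式(记号为通用示意,并非论文原文;可以看到公式中含有Hessian逆,FSLA正是要避免显式计算它):

```latex
% 双层优化问题
\min_{x}\ \Phi(x) := f\bigl(x,\, y^{*}(x)\bigr), \qquad
y^{*}(x) = \arg\min_{y}\ g(x, y)
% 隐函数定理给出的超梯度
\nabla \Phi(x) = \nabla_x f(x, y^{*})
  - \nabla^2_{xy} g(x, y^{*})\,\bigl[\nabla^2_{yy} g(x, y^{*})\bigr]^{-1}\nabla_y f(x, y^{*})
```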

【2】 Calibration Improves Bayesian Optimization 标题:校准改进了贝叶斯优化 链接:https://arxiv.org/abs/2112.04620

作者:Shachi Deshpande,Volodymyr Kuleshov 机构:Department of Computer Science, Cornell Tech, New York, NY 摘要:贝叶斯优化是一种允许获得黑箱函数全局最优值的过程,在超参数优化等应用中非常有用。目标函数形状的不确定性估计有助于指导优化过程。但是,如果目标函数违反基础模型中的假设(例如高斯性),则这些估计可能不准确。作为贝叶斯优化过程的一部分,我们提出了一个简单的算法来校准目标函数上后验分布的不确定性。我们表明,通过校准改进后验分布的不确定性估计,贝叶斯优化可以做出更好的决策,并以更少的步骤达到全局最优。我们表明,该技术提高了贝叶斯优化在标准基准函数和超参数优化任务上的性能。 摘要:Bayesian optimization is a procedure that allows obtaining the global optimum of black-box functions and that is useful in applications such as hyper-parameter optimization. Uncertainty estimates over the shape of the objective function are instrumental in guiding the optimization process. However, these estimates can be inaccurate if the objective function violates assumptions made within the underlying model (e.g., Gaussianity). We propose a simple algorithm to calibrate the uncertainty of posterior distributions over the objective function as part of the Bayesian optimization process. We show that by improving the uncertainty estimates of the posterior distribution with calibration, Bayesian optimization makes better decisions and arrives at the global optimum in fewer steps. We show that this technique improves the performance of Bayesian optimization on standard benchmark functions and hyperparameter optimization tasks.

【3】 PATO: Producibility-Aware Topology Optimization using Deep Learning for Metal Additive Manufacturing 标题:PATO:基于深度学习的面向金属增材制造的可生产性感知拓扑优化 链接:https://arxiv.org/abs/2112.04552

作者:Naresh S. Iyer,Amir M. Mirzendehdel,Sathyanarayanan Raghavan,Yang Jiao,Erva Ulu,Morad Behandish,Saigopal Nelaturi,Dean M. Robinson 机构:GE Research (GER), Research Circle, Niskayuna, NY, United States, Palo Alto Research Center (PARC), Coyote Hill Rd., Palo Alto, CA, United States 摘要:在本文中,我们提出了PATO——一种可生产性感知的拓扑优化(TO)框架,以帮助高效探索使用金属增材制造(AM)制造的部件的设计空间,同时确保在开裂方面的可制造性。具体而言,通过激光粉末床熔融制造的零件容易出现翘曲或开裂等缺陷,这是由制造过程中陡峭的热梯度产生的高残余应力造成的。使此类零件的设计走向成熟并规划其制造可能需要数月到数年的时间,通常涉及设计工程师和制造工程师之间的多次交接。PATO基于无裂纹设计的先验发现,因此优化后的零件从一开始就可以无缺陷地制造。为确保设计在优化过程中无裂纹,可生产性通过裂纹指数显式编码在TO的标准公式中。我们探索了多种裂纹指数,并通过实验验证,证明最大剪切应变指数(MSSI)是一种准确的裂纹指数。模拟构建过程是一个耦合的多物理场计算,将其并入TO循环在计算上可能难以承受。我们利用深度卷积神经网络的最新进展,提出了一种基于注意力U-Net架构的高保真代理模型,以预测零件域上空间变化的MSSI值。此外,我们采用自动微分直接计算最大MSSI相对于输入设计变量的梯度,并将其与基于性能的灵敏度场相结合,在考虑重量、可制造性和功能性之间权衡的同时优化设计。我们通过3D基准研究和实验验证证明了所提方法的有效性。 摘要:In this paper, we propose PATO - a producibility-aware topology optimization (TO) framework to help efficiently explore the design space of components fabricated using metal additive manufacturing (AM), while ensuring manufacturability with respect to cracking. Specifically, parts fabricated through Laser Powder Bed Fusion are prone to defects such as warpage or cracking due to high residual stress values generated from the steep thermal gradients produced during the build process. Maturing the design for such parts and planning their fabrication can span months to years, often involving multiple handoffs between design and manufacturing engineers. PATO is based on the a priori discovery of crack-free designs, so that the optimized part can be built defect-free at the outset. To ensure that the design is crack free during optimization, producibility is explicitly encoded within the standard formulation of TO, using a crack index. Multiple crack indices are explored and using experimental validation, maximum shear strain index (MSSI) is shown to be an accurate crack index. Simulating the build process is a coupled, multi-physics computation and incorporating it in the TO loop can be computationally prohibitive. We leverage the current advances in deep convolutional neural networks and present a high-fidelity surrogate model based on an Attention-based U-Net architecture to predict the MSSI values as a spatially varying field over the part's domain. Further, we employ automatic differentiation to directly compute the gradient of maximum MSSI with respect to the input design variables and augment it with the performance-based sensitivity field to optimize the design while considering the trade-off between weight, manufacturability, and functionality. We demonstrate the effectiveness of the proposed method through benchmark studies in 3D as well as experimental validation.

【4】 On Convergence of Federated Averaging Langevin Dynamics 标题:关于联邦平均朗之万动力学的收敛性 链接:https://arxiv.org/abs/2112.05120

作者:Wei Deng,Yi-An Ma,Zhao Song,Qian Zhang,Guang Lin 摘要:我们提出了一种用于分布式客户端的不确定性量化和均值预测的联邦平均Langevin算法(FA-LD)。特别地,我们将其推广到正态后验分布之外,并考虑一类一般的模型。我们为具有非独立同分布数据的强对数凹分布建立了FA-LD的理论保证,并研究了注入噪声与随机梯度噪声、数据的异质性以及变化的学习率如何影响收敛性。这样的分析有助于选择最优的本地更新次数,以最小化通信成本。对我们的方法而言,重要的一点是:Langevin算法中的注入噪声不会降低通信效率。此外,我们在FA-LD算法中考察了不同客户端上使用的独立噪声与相关噪声,并注意到联邦化与通信成本之间同样存在权衡。由于本地设备在联邦网络中可能变得不活跃,我们还给出了基于不同平均方案(其中只有部分设备更新可用)的收敛结果。 摘要:We propose a federated averaging Langevin algorithm (FA-LD) for uncertainty quantification and mean predictions with distributed clients. In particular, we generalize beyond normal posterior distributions and consider a general class of models. We develop theoretical guarantees for FA-LD for strongly log-concave distributions with non-i.i.d data and study how the injected noise and the stochastic-gradient noise, the heterogeneity of data, and the varying learning rates affect the convergence. Such an analysis sheds light on the optimal choice of local updates to minimize communication costs. Important to our approach is that the communication efficiency does not deteriorate with the injected noise in the Langevin algorithms. In addition, we examine in our FA-LD algorithm both independent and correlated noise used over different clients. We observe that there is also a trade-off between federation and communication cost there. As local devices may become inactive in the federated network, we also show convergence results based on different averaging schemes where only partial device updates are available.
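
作为背景,单机Langevin动力学(SGLD形式)的一步迭代如下;FA-LD可大致理解为各客户端并行运行此类带噪更新并周期性进行联邦平均(记号为通用示意,非论文原文):

```latex
\theta_{t+1} = \theta_t - \eta_t\,\nabla U(\theta_t) + \sqrt{2\eta_t}\;\xi_t,
\qquad \xi_t \sim \mathcal{N}(0, I)
% 其中 U(\theta) 为负对数后验(势函数), \eta_t 为步长;
% 当步长足够小且迭代充分时, \theta_t 近似采样自 \propto e^{-U(\theta)} 的后验分布,
% 从而可用于摘要所述的不确定性量化与均值预测。
```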

【5】 Continuation Path with Linear Convergence Rate 标题:具有线性收敛速度的连续路径 链接:https://arxiv.org/abs/2112.05104

作者:Eugene Ndiaye,Ichiro Takeuchi 机构:Georgia Institute of Technology (ISyE), †Riken AIP and Nagoya Institute of Technology 摘要:路径跟踪算法常用于复合优化问题,其中一系列具有不同正则化超参数的子问题被依次求解。通过复用前一个解作为初始化,数值上观察到了更快的收敛速度,这使其成为机器学习中加速优化算法执行的一种非常有用的启发式方法。我们对路径跟踪算法进行了原始-对偶分析,探讨了如何设计其超参数,以及如何确定每个子问题的求解精度,以保证在目标问题上的线性收敛速度。此外,考虑带有稀疏诱导惩罚的优化问题,我们分析了活动集随正则化参数的变化,进而可以自适应地校准正则化参数,以精确确定沿解路径将被选择的特征数量。这导出了用于校准活动集方法超参数的简单启发式方法,以降低其复杂性并缩短其执行时间。 摘要:Path-following algorithms are frequently used in composite optimization problems where a series of subproblems, with varying regularization hyperparameters, are solved sequentially. By reusing the previous solutions as initialization, better convergence speeds have been observed numerically. This makes it a rather useful heuristic to speed up the execution of optimization algorithms in machine learning. We present a primal dual analysis of the path-following algorithm and explore how to design its hyperparameters as well as determining how accurately each subproblem should be solved to guarantee a linear convergence rate on a target problem. Furthermore, considering optimization with a sparsity-inducing penalty, we analyze the change of the active sets with respect to the regularization parameter. The latter can then be adaptively calibrated to finely determine the number of features that will be selected along the solution path. This leads to simple heuristics for calibrating hyperparameters of active set approaches to reduce their complexity and improve their execution time.
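
下面用 scikit-learn 给出摘要所述"以前一子问题的解作初始化(warm start)沿正则化路径依次求解"思想的一个最小可运行示意(占位数据,并非论文的算法实现):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=50, noise=1.0, random_state=0)

lasso = Lasso(alpha=1.0, warm_start=True, max_iter=10000)
for alpha in np.logspace(0, -3, 10):          # 正则化参数由大到小
    lasso.set_params(alpha=alpha)
    lasso.fit(X, y)                           # warm_start=True:复用上一解作初始化
    n_active = int(np.sum(lasso.coef_ != 0))  # 活动集随 alpha 减小而增大
    print(f"alpha={alpha:.4f}, 选中特征数={n_active}")
```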

预测|估计(6篇)

【1】 Model-Agnostic Hybrid Numerical Weather Prediction and Machine Learning Paradigm for Solar Forecasting in the Tropics 标题:与模式无关的数值天气预报与机器学习混合范式用于热带太阳能预报 链接:https://arxiv.org/abs/2112.04963

作者:Nigel Yuan Yun Ng,Harish Gopalan,Venugopalan S. G. Raghavan,Chin Chun Ooi 机构:Department of Physics, Block S, Science Drive, National University of Singapore, Singapore; Institute of High Performance Computing, Fusionopolis Way, Connexis, Singapore 摘要:数值天气预报(NWP)和机器学习(ML)方法是太阳预报的常用方法。然而,NWP模型有多种可能的物理参数化,这需要特定于场地的NWP优化。当区域NWP模式与具有不同可能参数化的全球气候模式一起使用时,这就更加复杂了。在这项研究中,提出了一种替代方法,并对四种辐射模型进行了评估。天气研究和预测(WRF)模型在全球和区域模式下运行,以提供太阳辐照度的估计值。然后使用ML对该估计进行后处理,以提供最终预测。使用该ML误差校正模型,WRF的归一化均方根误差可减少40-50%。使用CAM、GFDL、新Goddard和RRTMG辐射模型获得的结果在该校正后具有可比性,无需进行WRF参数化调整。还评估了其他包含附近位置和传感器数据的模型,后者尤其有希望。 摘要:Numerical weather prediction (NWP) and machine learning (ML) methods are popular for solar forecasting. However, NWP models have multiple possible physical parameterizations, which requires site-specific NWP optimization. This is further complicated when regional NWP models are used with global climate models with different possible parameterizations. In this study, an alternative approach is proposed and evaluated for four radiation models. Weather Research and Forecasting (WRF) model is run in both global and regional mode to provide an estimate for solar irradiance. This estimate is then post-processed using ML to provide a final prediction. Normalized root-mean-square error from WRF is reduced by up to 40-50% with this ML error correction model. Results obtained using CAM, GFDL, New Goddard and RRTMG radiation models were comparable after this correction, negating the need for WRF parameterization tuning. Other models incorporating nearby locations and sensor data are also evaluated, with the latter being particularly promising.

【2】 Machine Learning for Utility Prediction in Argument-Based Computational Persuasion 标题:基于论证的计算说服中效用预测的机器学习 链接:https://arxiv.org/abs/2112.04953

作者:Ivan Donadello,Anthony Hunter,Stefano Teso,Mauro Dragoni 机构: Free University of Bozen-Bolzano, Italy, University College London, United Kingdom, Fondazione Bruno Kessler, Italy, University of Trento, Italy 摘要:自动说服系统(APS)旨在通过对话说服用户相信某件事,在对话中双方交换论点和反驳。为了最大化APS成功说服用户的概率,它可以确定一个全局策略,使APS无论用户提出什么论点,都能在对话的每个阶段选择要呈现的最佳论点。然而,在医疗保健等实际应用中,对话结果的效用对APS和用户而言不太可能相同或完全相反。为了处理这种情况,双方决策理论(Bi-party Decision Theory)利用扩展式博弈来进行论证。这开启了我们在本文中要解决的新问题:(1)我们如何使用机器学习(ML)方法来预测不同用户亚群体的效用函数?(2)我们如何从已学得的效用函数中为新用户确定最佳效用函数?为此,我们开发了两种ML方法——EAI和EDS,它们利用来自用户的信息来预测其效用。EAI仅限于固定数量的信息,而EDS可以选择最能区分用户亚群体的信息。我们在模拟环境和一个关于健康饮食习惯的现实案例研究中评估了EAI和EDS。两种情况下的结果都很有希望,但EDS在预测有用的效用函数方面更为有效。 摘要:Automated persuasion systems (APS) aim to persuade a user to believe something by entering into a dialogue in which arguments and counterarguments are exchanged. To maximize the probability that an APS is successful in persuading a user, it can identify a global policy that will allow it to select the best arguments it presents at each stage of the dialogue whatever arguments the user presents. However, in real applications, such as for healthcare, it is unlikely the utility of the outcome of the dialogue will be the same, or the exact opposite, for the APS and user. In order to deal with this situation, games in extended form have been harnessed for argumentation in Bi-party Decision Theory. This opens new problems that we address in this paper: (1) How can we use Machine Learning (ML) methods to predict utility functions for different subpopulations of users? and (2) How can we identify for a new user the best utility function from amongst those that we have learned? To this extent, we develop two ML methods, EAI and EDS, that leverage information coming from the users to predict their utilities. EAI is restricted to a fixed amount of information, whereas EDS can choose the information that best detects the subpopulations of a user. We evaluate EAI and EDS in a simulation setting and in a realistic case study concerning healthy eating habits. Results are promising in both cases, but EDS is more effective at predicting useful utility functions.

【3】 Regularization methods for the short-term forecasting of the Italian electric load 标题:意大利电力负荷短期预测的正则化方法 链接:https://arxiv.org/abs/2112.04604

作者:Alessandro Incremona,Giuseppe De Nicolao 机构:Department of Industrial and Information Engineering, University of Pavia, Via Adolfo Ferrata, Pavia, Italy 摘要:意大利电力负荷全天24小时曲线的预测问题被视为一个多任务学习问题,其复杂性通过多种正则化方法加以控制。鉴于每15分钟一次的采样,共使用96个预测器,每个预测器线性依赖于96个回归量。这96×96个权重构成一个矩阵,可以将其视为并显示为在方形区域上采样的曲面。我们探索了降低该曲面自由度的不同正则化与稀疏化方法,并将得到的预测与意大利输电系统运营商Terna的预测进行了比较。除了在15分钟尺度的平均绝对百分比误差和平均绝对误差方面优于Terna外,预测残差与Terna的残差相关性较弱,这表明对预测进行聚合可能带来进一步改进。事实上,在所考虑的三个测试年份中,聚合预测在15分钟和日尺度的平均绝对百分比误差、平均绝对误差和均方根误差方面带来了进一步的显著下降(最高达30%)。 摘要:The problem of forecasting the whole 24 profile of the Italian electric load is addressed as a multitask learning problem, whose complexity is kept under control via alternative regularization methods. In view of the quarter-hourly samplings, 96 predictors are used, each of which linearly depends on 96 regressors. The 96x96 matrix weights form a 96x96 matrix, that can be seen and displayed as a surface sampled on a square domain. Different regularization and sparsity approaches to reduce the degrees of freedom of the surface were explored, comparing the obtained forecasts with those of the Italian Transmission System Operator Terna. Besides outperforming Terna in terms of quarter-hourly mean absolute percentage error and mean absolute error, the prediction residuals turned out to be weakly correlated with Terna, which suggests that further improvement could ensue from forecasts aggregation. In fact, the aggregated forecasts yielded further relevant drops in terms of quarter-hourly and daily mean absolute percentage error, mean absolute error and root mean square error (up to 30%) over the three test years considered.

【4】 Daily peak electrical load forecasting with a multi-resolution approach 标题:基于多分辨率方法的日高峰电力负荷预测 链接:https://arxiv.org/abs/2112.04492

作者:Yvenn Amara-Ouali,Matteo Fasiolo,Yannig Goude,Hui Yan 机构:Laboratoire de Math´ematiques d’Orsay (LMO), CNRS, Universit´e Paris-Saclay, Facult´e, des Sciences d’Orsay, bat , Orsay, France, CELESTE, Inria Saclay, FRANCE, School of Mathematics, University of Bristol, Bristol, UK 摘要:在智能电网和负载平衡的背景下,每日峰值负载预测已成为能源行业利益相关者的关键活动。了解峰值大小和时间对于实施智能电网战略(如调峰)至关重要。本文提出的建模方法利用高分辨率和低分辨率信息预测每日峰值需求规模和时间。由此产生的多分辨率建模框架可适用于不同的模型类别。本文的主要贡献是:a)对多分辨率建模方法进行了一般和正式的介绍;b)讨论了通过广义加法模型和神经网络实现的不同分辨率下的建模方法;c)对英国电力市场真实数据的实验结果。结果证实,所提出的建模方法的预测性能与低分辨率和高分辨率备选方案的预测性能具有竞争力。 摘要:In the context of smart grids and load balancing, daily peak load forecasting has become a critical activity for stakeholders of the energy industry. An understanding of peak magnitude and timing is paramount for the implementation of smart grid strategies such as peak shaving. The modelling approach proposed in this paper leverages high-resolution and low-resolution information to forecast daily peak demand size and timing. The resulting multi-resolution modelling framework can be adapted to different model classes. The key contributions of this paper are a) a general and formal introduction to the multi-resolution modelling approach, b) a discussion on modelling approaches at different resolutions implemented via Generalised Additive Models and Neural Networks and c) experimental results on real data from the UK electricity market. The results confirm that the predictive performance of the proposed modelling approach is competitive with that of low- and high-resolution alternatives.

【5】 Forecast Evaluation in Large Cross-Sections of Realized Volatility 标题:已实现波动率大横截面中的预测评估 链接:https://arxiv.org/abs/2112.04887

作者:Christis Katsouris 摘要:在本文中,我们使用等预测精度检验程序,考虑横截面相依下已实现波动率度量的预测评估。在预测已实现波动率时,我们评估基于增广横截面的模型的预测精度。在预测精度相等的原假设下,采用的基准模型是标准HAR模型;而在预测精度不相等的备择假设下,预测模型是通过LASSO收缩估计的增广HAR模型。我们通过纳入测量误差校正以及横截面跳跃成分度量,研究预测对模型设定的敏感性。模型的样本外预测评估通过数值实现进行检验。 摘要:In this paper, we consider the forecast evaluation of realized volatility measures under cross-section dependence using equal predictive accuracy testing procedures. We evaluate the predictive accuracy of the model based on the augmented cross-section when forecasting Realized Volatility. Under the null hypothesis of equal predictive accuracy the benchmark model employed is a standard HAR model while under the alternative of non-equal predictive accuracy the forecast model is an augmented HAR model estimated via the LASSO shrinkage. We study the sensitivity of forecasts to the model specification by incorporating a measurement error correction as well as cross-sectional jump component measures. The out-of-sample forecast evaluation of the models is assessed with numerical implementations.
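
摘要中作为基准的标准HAR(异质自回归)模型在文献中通常写作如下形式(记号为通用写法,此处仅作示意):

```latex
RV_{t+1} = \beta_0 + \beta_d\, RV_t
         + \beta_w\, \overline{RV}_{t-4:t}
         + \beta_m\, \overline{RV}_{t-21:t} + \varepsilon_{t+1}
% 其中 \overline{RV}_{t-4:t} 与 \overline{RV}_{t-21:t} 分别为过去一周(5 个交易日)
% 与过去一月(22 个交易日)的已实现波动率均值;备择模型在此基础上
% 引入横截面增广项,并用 LASSO 收缩进行估计。
```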

【6】 Multimodal Pre-Training Model for Sequence-based Prediction of Protein-Protein Interaction 标题:基于序列的蛋白质相互作用预测的多模态预训练模型 链接:https://arxiv.org/abs/2112.04814

作者:Yang Xue,Zijing Liu,Xiaomin Fang,Fan Wang 机构:Baidu Inc., Shenzhen, China 备注:MLCB 2021 Spotlight 摘要:蛋白质相互作用(PPI)是许多生物过程中的基本要素,其中两个或多个蛋白质物理结合在一起以实现其功能。PPI建模对于许多生物医学应用非常有用,例如疫苗设计、抗体治疗和肽药物发现。对蛋白质模型进行预训练以学习有效的表征对于PPI至关重要。大多数PPI的预训练模型都是基于序列的,它们天真地将自然语言处理中使用的语言模型用于氨基酸序列。更高级的工作利用结构感知预训练技术,利用已知蛋白质结构的接触图。然而,无论是序列还是接触图谱都不能完全描述与PPI问题密切相关的蛋白质的结构和功能。受此启发,我们提出了一个具有三种模式的多模式蛋白质预训练模型:序列、结构和功能(S2F)。值得注意的是,我们没有使用接触图来学习氨基酸水平的刚性结构,而是使用重原子点云的拓扑复合体来编码结构特征。它使我们的模型不仅可以了解主干的结构信息,还可以了解侧链的结构信息。此外,我们的模型结合了从文献或手工注释中提取的蛋白质功能描述的知识。我们的实验表明,S2F学习在多种PPI任务中获得良好性能的蛋白质嵌入,包括跨物种PPI、抗体-抗原亲和力预测、SARS-CoV-2抗体中和预测和突变驱动的结合亲和力变化预测。 摘要:Protein-protein interactions (PPIs) are essentials for many biological processes where two or more proteins physically bind together to achieve their functions. Modeling PPIs is useful for many biomedical applications, such as vaccine design, antibody therapeutics, and peptide drug discovery. Pre-training a protein model to learn effective representation is critical for PPIs. Most pre-training models for PPIs are sequence-based, which naively adopt the language models used in natural language processing to amino acid sequences. More advanced works utilize the structure-aware pre-training technique, taking advantage of the contact maps of known protein structures. However, neither sequences nor contact maps can fully characterize structures and functions of the proteins, which are closely related to the PPI problem. Inspired by this insight, we propose a multimodal protein pre-training model with three modalities: sequence, structure, and function (S2F). Notably, instead of using contact maps to learn the amino acid-level rigid structures, we encode the structure feature with the topology complex of point clouds of heavy atoms. It allows our model to learn structural information about not only the backbones but also the side chains. Moreover, our model incorporates the knowledge from the functional description of proteins extracted from literature or manual annotations. Our experiments show that the S2F learns protein embeddings that achieve good performances on a variety of PPIs tasks, including cross-species PPI, antibody-antigen affinity prediction, antibody neutralization prediction for SARS-CoV-2, and mutation-driven binding affinity change prediction.

其他神经网络|深度学习|模型|建模(16篇)

【1】 A Novel Tropical Geometry-based Interpretable Machine Learning Method: Application in Prognosis of Advanced Heart Failure 标题:一种新的基于热带几何的可解释机器学习方法在晚期心力衰竭预后中的应用 链接:https://arxiv.org/abs/2112.05071

作者:Heming Yao,Harm Derksen,Jessica R. Golbus,Justin Zhang,Keith D. Aaronson,Jonathan Gryak,Kayvan Najarian 机构:Department of Internal Medicine, University of Michigan 摘要:模型的可解释性对于临床决策支持系统等许多实际应用至关重要。本文提出了一种新的可解释机器学习方法,该方法可以在人类可理解的规则中对输入变量和响应之间的关系进行建模。该方法将热带几何应用于模糊推理系统,通过监督学习发现变量编码函数和显著规则。使用合成数据集进行了实验,研究了该算法在分类和规则发现方面的性能和容量。此外,所提出的方法被应用于临床应用,确定心力衰竭患者将受益于先进的治疗,如心脏移植或持久的机械循环支持。实验结果表明,该网络在分类任务上取得了良好的性能。除了从数据集中学习人类可以理解的规则外,现有的模糊领域知识可以很容易地转移到网络中,并用于促进模型训练。从我们的结果来看,提出的模型和学习现有领域知识的能力可以显著提高模型的可推广性。该网络的特点使其在需要模型可靠性和合理性的应用中具有广阔的应用前景。 摘要:A model's interpretability is essential to many practical applications such as clinical decision support systems. In this paper, a novel interpretable machine learning method is presented, which can model the relationship between input variables and responses in humanly understandable rules. The method is built by applying tropical geometry to fuzzy inference systems, wherein variable encoding functions and salient rules can be discovered by supervised learning. Experiments using synthetic datasets were conducted to investigate the performance and capacity of the proposed algorithm in classification and rule discovery. Furthermore, the proposed method was applied to a clinical application that identified heart failure patients that would benefit from advanced therapies such as heart transplant or durable mechanical circulatory support. Experimental results show that the proposed network achieved great performance on the classification tasks. In addition to learning humanly understandable rules from the dataset, existing fuzzy domain knowledge can be easily transferred into the network and used to facilitate model training. From our results, the proposed model and the ability of learning existing domain knowledge can significantly improve the model generalizability. The characteristics of the proposed network make it promising in applications requiring model reliability and justification.
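
作为背景,热带几何建立在max-plus半环(也有采用min-plus约定的)之上,其基本运算如下(示意,非论文原文);分段线性的热带多项式正是该方法能与模糊推理规则相衔接的原因:

```latex
% 热带(max-plus)半环上的"加法"与"乘法"
a \oplus b := \max(a, b), \qquad a \otimes b := a + b
% 一元热带多项式是分段线性凸函数,例如
p(x) = \bigoplus_i \left( c_i \otimes a_i x \right) = \max_i\,(c_i + a_i x)
```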

【2】 Gradient-matching coresets for continual learning 标题:用于持续学习的梯度匹配核集 链接:https://arxiv.org/abs/2112.05025

作者:Lukas Balles,Giovanni Zappella,Cédric Archambeau 机构:Amazon Web Services, Berlin 备注:Accepted at the NeurIPS '21 Workshop on Distribution Shifts 摘要:基于梯度匹配的思想,我们设计了一种核集选择方法:核集产生的梯度应尽可能与原始训练数据集产生的梯度相匹配。我们在持续学习的背景下评估该方法,它可用于构建排练记忆(rehearsal memory)。在一系列内存大小设置下,我们的方法优于水库采样等强有力的竞争方法。 摘要:We devise a coreset selection method based on the idea of gradient matching: The gradients induced by the coreset should match, as closely as possible, those induced by the original training dataset. We evaluate the method in the context of continual learning, where it can be used to curate a rehearsal memory. Our method outperforms strong competitors such as reservoir sampling across a range of memory sizes.
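
下面用numpy给出"按梯度匹配选择核集"这一思想的贪心示意(随机"梯度"为占位数据,并非论文的原始算法):逐个挑选样本,使核集的平均梯度尽可能接近全量数据的平均梯度。

```python
import numpy as np

rng = np.random.default_rng(0)
G = rng.normal(size=(500, 32))          # 每个样本的梯度向量(占位数据)
target = G.mean(axis=0)                 # 全量训练集的平均梯度

selected = []
core_sum = np.zeros(G.shape[1])
for _ in range(20):                     # 目标核集大小 = 20
    k = len(selected) + 1
    errs = np.linalg.norm((core_sum + G) / k - target, axis=1)  # 各候选加入后的匹配误差
    errs[selected] = np.inf             # 不重复选取
    best = int(errs.argmin())
    selected.append(best)
    core_sum += G[best]

print("核集样本索引:", selected)
print("平均梯度匹配误差:", np.linalg.norm(core_sum / len(selected) - target))
```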

【3】 Millimeter Wave Localization with Imperfect Training Data using Shallow Neural Networks 标题:基于浅层神经网络的非理想训练数据毫米波定位 链接:https://arxiv.org/abs/2112.05008

作者:Anish Shastri,Joan Palacios,Paolo Casari 机构:DISI, University of Trento, Trento, Italy, North Carolina State University, Raleigh, NC, USA 备注:6 pages, 9 figures. The paper is submitted to IEEE WCNC 2022 摘要:毫米波定位算法利用毫米波信号的准光传播,在接收机处产生稀疏的角谱。基于角度定位的几何方法通常需要知道环境地图和接入点的位置。因此,有几项工作求助于自动学习,以便根据接收到的毫米波信号的特性推断设备的位置。然而,收集此类模型的训练数据是一项重大负担。在这项工作中,我们提出了一个浅层神经网络模型来定位室内的毫米波设备。与文献中提出的模型相比,该模型需要的权重要少得多。因此,它易于在资源受限的硬件中实现,并且需要较少的训练样本来收敛。我们还建议通过从基于几何的毫米波定位算法中检索(固有的不完美)位置估计来减轻训练数据收集工作。即使在这种情况下,我们的结果也表明,所提出的神经网络的性能与最先进的算法相当或更好。 摘要:Millimeter wave (mmWave) localization algorithms exploit the quasi-optical propagation of mmWave signals, which yields sparse angular spectra at the receiver. Geometric approaches to angle-based localization typically require to know the map of the environment and the location of the access points. Thus, several works have resorted to automated learning in order to infer a device's location from the properties of the received mmWave signals. However, collecting training data for such models is a significant burden. In this work, we propose a shallow neural network model to localize mmWave devices indoors. This model requires significantly fewer weights than those proposed in the literature. Therefore, it is amenable for implementation in resource-constrained hardware, and needs fewer training samples to converge. We also propose to relieve training data collection efforts by retrieving (inherently imperfect) location estimates from geometry-based mmWave localization algorithms. Even in this case, our results show that the proposed neural networks perform as good as or better than state-of-the-art algorithms.

【4】 Bringing Atomistic Deep Learning to Prime Time 标题:将原子式深度学习带入黄金时间 链接:https://arxiv.org/abs/2112.04977

作者:Nathan C. Frey,Siddharth Samsi,Bharath Ramsundar,Connor W. Coley,Vijay Gadepally 机构:MIT, Deep Forest Sciences 备注:6 pages, 1 figure, NeurIPS 2021 AI for Science workshop 摘要:人工智能尚未彻底改变材料和分子的设计。从这个角度来看,我们确定了阻碍原子论深度学习、分子科学和高性能计算集成的四个障碍。我们概述了为应对这些挑战带来的机遇而开展的重点研究工作。 摘要:Artificial intelligence has not yet revolutionized the design of materials and molecules. In this perspective, we identify four barriers preventing the integration of atomistic deep learning, molecular science, and high-performance computing. We outline focused research efforts to address the opportunities presented by these challenges.

【5】 Multi-Task Learning on Networks 标题:网络环境下的多任务学习 链接:https://arxiv.org/abs/2112.04891

作者:Andrea Ponti 备注:94 pages, 53 figures, 8 tables 摘要:多任务学习(MTL)范式可以追溯到Caruana(1997)的一篇早期论文,其中指出可以利用多个任务的数据,以期获得比独立学习每个任务更好的性能。求解具有冲突目标的MTL问题需要对目标之间的权衡进行建模,这通常超出了简单线性组合所能实现的范围。一种理论上有原则且计算上有效的策略,是如帕累托分析所述,寻找不被其他解支配的解。多任务学习背景下出现的多目标优化问题具有特定的特点,需要专门的方法。对这些特点的分析和一种新计算方法的提出是这项工作的重点。多目标进化算法(MOEA)可以很容易地纳入支配概念,因而也涵盖帕累托分析。MOEA的主要缺点是函数评估方面的样本效率较低,其关键原因在于大多数进化方法不使用模型来近似目标函数。贝叶斯优化则采用一种截然不同的、基于代理模型(如高斯过程)的方法。在本论文中,输入空间中的解被表示为概率分布,以封装函数评估中包含的知识。在这个以Wasserstein距离为度量的概率分布空间中,可以设计一种新算法MOEA/WST:模型不直接建立在目标函数上,而是建立在一个中间信息空间中,输入空间中的对象在该空间被映射为直方图。计算结果表明,MOEA/WST提供的样本效率和帕累托集质量显著优于标准MOEA。 摘要:The multi-task learning (MTL) paradigm can be traced back to an early paper of Caruana (1997) in which it was argued that data from multiple tasks can be used with the aim to obtain a better performance over learning each task independently. A solution of MTL with conflicting objectives requires modelling the trade-off among them which is generally beyond what a straight linear combination can achieve. A theoretically principled and computationally effective strategy is finding solutions which are not dominated by others as it is addressed in the Pareto analysis. Multi-objective optimization problems arising in the multi-task learning context have specific features and require ad hoc methods. The analysis of these features and the proposal of a new computational approach represent the focus of this work. Multi-objective evolutionary algorithms (MOEAs) can easily include the concept of dominance and therefore the Pareto analysis. The major drawback of MOEAs is a low sample efficiency with respect to function evaluations. The key reason for this drawback is that most of the evolutionary approaches do not use models for approximating the objective function. Bayesian Optimization takes a radically different approach based on a surrogate model, such as a Gaussian Process. In this thesis the solutions in the Input Space are represented as probability distributions encapsulating the knowledge contained in the function evaluations. In this space of probability distributions, endowed with the metric given by the Wasserstein distance, a new algorithm MOEA/WST can be designed in which the model is not directly on the objective function but in an intermediate Information Space where the objects from the input space are mapped into histograms. Computational results show that the sample efficiency and the quality of the Pareto set provided by MOEA/WST are significantly better than in the standard MOEA.

【6】 A New Measure of Model Redundancy for Compressed Convolutional Neural Networks 标题:压缩卷积神经网络模型冗余度的一种新度量 链接:https://arxiv.org/abs/2112.04857

作者:Feiqing Huang,Yuefeng Si,Yao Zheng,Guodong Li 机构:† Department of Statistics and Actuarial Science, University of Hong Kong, China, ‡ Department of Statistics, University of Connecticut 摘要:虽然最近人们提出了许多在固定资源预算下提高卷积神经网络(CNN)模型效率的设计,但对这些设计的理论理解仍然明显不足。本文旨在提供一个新的框架来回答这个问题:压缩后的CNN中是否仍存在剩余的模型冗余?我们首先通过张量分解建立CNN和压缩CNN的一般统计表述,使跨层权重可以汇总为单个张量。然后,通过严格的样本复杂度分析,我们揭示了导出的样本复杂度与朴素的参数计数之间的一个重要差异,它可以直接指示模型冗余。基于这一发现,我们为压缩CNN引入了一种新的模型冗余度量,称为$K/R$比率,它还允许非线性激活。这一新度量的有效性得到了在流行的模块(block)设计和数据集上的消融研究的支持。 摘要:While recently many designs have been proposed to improve the model efficiency of convolutional neural networks (CNNs) on a fixed resource budget, theoretical understanding of these designs is still conspicuously lacking. This paper aims to provide a new framework for answering the question: Is there still any remaining model redundancy in a compressed CNN? We begin by developing a general statistical formulation of CNNs and compressed CNNs via the tensor decomposition, such that the weights across layers can be summarized into a single tensor. Then, through a rigorous sample complexity analysis, we reveal an important discrepancy between the derived sample complexity and the naive parameter counting, which serves as a direct indicator of the model redundancy. Motivated by this finding, we introduce a new model redundancy measure for compressed CNNs, called the $K/R$ ratio, which further allows for nonlinear activations. The usefulness of this new measure is supported by ablation studies on popular block designs and datasets.

【7】 Effective dimension of machine learning models 标题:机器学习模型的有效维度 链接:https://arxiv.org/abs/2112.04807

作者:Amira Abbas,David Sutter,Alessio Figalli,Stefan Woerner 机构:IBM Quantum, IBM Research – Zurich, University of KwaZulu-Natal, Durban, Department of Mathematics, ETH Zurich 备注:17 pages, 2 figures 摘要:在涉及新数据的任务上对经过训练的模型的性能进行说明是机器学习的主要目标之一,即理解模型的泛化能力。各种能力度量都试图捕捉这种能力,但通常无法解释我们在实践中观察到的模型的重要特征。在这项研究中,我们提出了局部有效维数作为容量度量,它似乎与标准数据集上的泛化误差有很好的相关性。重要的是,我们证明了局部有效维数限制了泛化误差,并讨论了这种能力测度对机器学习模型的适用性。 摘要:Making statements about the performance of trained models on tasks involving new data is one of the primary goals of machine learning, i.e., to understand the generalization power of a model. Various capacity measures try to capture this ability, but usually fall short in explaining important characteristics of models that we observe in practice. In this study, we propose the local effective dimension as a capacity measure which seems to correlate well with generalization error on standard data sets. Importantly, we prove that the local effective dimension bounds the generalization error and discuss the aptness of this capacity measure for machine learning models.
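论文的局部有效维数基于模型的Fisher信息谱来定义;下面的代码只是一个简化的示意(并非论文的精确公式),用经验Fisher矩阵特征值的"参与率"作为有效维数的粗略代理,数据与模型均为本文假设:

```python
import numpy as np

def empirical_fisher(grads):
    """grads: (n_samples, n_params),每个样本的对数似然梯度;返回经验Fisher矩阵。"""
    return grads.T @ grads / grads.shape[0]

def participation_ratio(eigvals, eps=1e-12):
    """特征值谱的参与率:谱越均匀,值越接近参数个数;谱越集中,值越小。
    注意:这只是有效维数的一个常见代理量,不是论文的定义。"""
    lam = np.clip(eigvals, eps, None)
    return (lam.sum() ** 2) / (lam ** 2).sum()

rng = np.random.default_rng(0)
# 假设的逐样本梯度,刻意让谱不均匀
grads = rng.normal(size=(200, 10)) @ np.diag([3, 2, 1, 1, 0.5, 0.1, 0.1, 0.01, 0.01, 0.01])
F = empirical_fisher(grads)
print("effective-dimension proxy:", participation_ratio(np.linalg.eigvalsh(F)))
```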

【8】 New Tight Relaxations of Rank Minimization for Multi-Task Learning 标题:多任务学习中一种新的紧致最小秩松弛算法 链接:https://arxiv.org/abs/2112.04734

作者:Wei Chang,Feiping Nie,Rong Wang,Xuelong Li 机构:Northwestern Polytechnical University, Xi’an, Shaanxi, China 摘要:许多研究者研究过多任务学习,它通常假设不同任务可以共享一个低秩的公共潜在子空间,这意味着联合学习多个任务优于独立学习。在本文中,我们基于两个正则化项提出了两种新的多任务学习公式,通过最小化恰好$k$个最小奇异值来学习最优共享潜在子空间。所提出的正则化项是比迹范数更紧的秩极小化近似。但精确秩极小化是一个NP难问题,因此我们设计了一种新的基于重加权的迭代策略来求解我们的模型,通过设置较大的惩罚参数来有策略地处理精确秩极小化问题。在基准数据集上的实验结果表明,我们的方法能够正确地恢复任务间共享的低秩结构,并且优于相关的多任务学习方法。 摘要:Multi-task learning has been observed by many researchers, which supposes that different tasks can share a low-rank common yet latent subspace. It means learning multiple tasks jointly is better than learning them independently. In this paper, we propose two novel multi-task learning formulations based on two regularization terms, which can learn the optimal shared latent subspace by minimizing the exactly $k$ minimal singular values. The proposed regularization terms are the more tight approximations of rank minimization than trace norm. But it's an NP-hard problem to solve the exact rank minimization problem. Therefore, we design a novel re-weighted based iterative strategy to solve our models, which can tactically handle the exact rank minimization problem by setting a large penalizing parameter. Experimental results on benchmark datasets demonstrate that our methods can correctly recover the low-rank structure shared across tasks, and outperform related multi-task learning methods.
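下面的小例子(假设性实现)对比迹范数与"恰好$k$个最小奇异值之和"两种正则项:后者对秩不超过$\min(m,n)-k$的矩阵取值为零,因而是比迹范数更紧的秩松弛:

```python
import numpy as np

def trace_norm(W):
    """迹范数(核范数):全部奇异值之和,会惩罚所有奇异值。"""
    return np.linalg.svd(W, compute_uv=False).sum()

def k_smallest_sv_sum(W, k):
    """恰好 k 个最小奇异值之和:若为0,则 rank(W) <= min(m,n) - k。"""
    s = np.linalg.svd(W, compute_uv=False)
    return np.sort(s)[:k].sum()

rng = np.random.default_rng(1)
# 构造一个秩为2的 6x6 矩阵(假设的共享参数矩阵)
W = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 6))
print("trace norm:", trace_norm(W))                       # 非零
print("sum of 4 smallest SVs:", k_smallest_sv_sum(W, 4))  # 约为0:不惩罚低秩解
```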

【9】 Mimicking the Oracle: An Initial Phase Decorrelation Approach for Class Incremental Learning 标题:模仿Oracle:一种用于类增量学习的初始阶段去相关方法 链接:https://arxiv.org/abs/2112.04731

作者:Yujun Shi,Kuangqi Zhou,Jian Liang,Zihang Jiang,Jiashi Feng,Philip Torr,Song Bai,Vincent Y. F. Tan 机构:Vincent Y.F. Tan, National University of Singapore, ByteDance Inc., Institute of Automation, Chinese Academy of Sciences (CAS), University of Oxford 摘要:类增量学习(CIL)旨在以逐阶段的方式学习多类分类器,每个阶段仅提供类别子集的数据。以往的研究主要集中在缓解初始阶段之后各阶段的遗忘。然而,我们发现在初始阶段改善CIL也是一个有希望的方向。具体而言,我们的实验表明,直接鼓励CIL学习者在初始阶段输出与在所有类别上联合训练的模型相似的表示,可以极大地提高CIL的性能。受此启发,我们研究了朴素训练得到的初始阶段模型与oracle模型之间的差异。具体来说,由于这两个模型之间的一个主要差异是训练类别的数量,我们研究了这种差异如何影响模型表示。我们发现,在训练类别较少时,每个类别的数据表示位于一个狭长的区域;随着训练类别增多,每个类别的表示分布得更均匀。受这一观察结果的启发,我们提出了类级去相关(CwD)方法,它有效地正则化每个类别的表示使其更均匀地散布,从而模仿与所有类别联合训练的模型(即oracle模型)。我们的CwD易于实现,并且易于插入现有方法。在各种基准数据集上进行的大量实验表明,CwD持续且显著地将现有最先进方法的性能提高约1%到3%。代码将被发布。 摘要:Class Incremental Learning (CIL) aims at learning a multi-class classifier in a phase-by-phase manner, in which only data of a subset of the classes are provided at each phase. Previous works mainly focus on mitigating forgetting in phases after the initial one. However, we find that improving CIL at its initial phase is also a promising direction. Specifically, we experimentally show that directly encouraging CIL Learner at the initial phase to output similar representations as the model jointly trained on all classes can greatly boost the CIL performance. Motivated by this, we study the difference between a naively-trained initial-phase model and the oracle model. Specifically, since one major difference between these two models is the number of training classes, we investigate how such difference affects the model representations. We find that, with fewer training classes, the data representations of each class lie in a long and narrow region; with more training classes, the representations of each class scatter more uniformly. Inspired by this observation, we propose Class-wise Decorrelation (CwD) that effectively regularizes representations of each class to scatter more uniformly, thus mimicking the model jointly trained with all classes (i.e., the oracle model). Our CwD is simple to implement and easy to plug into existing methods. Extensive experiments on various benchmark datasets show that CwD consistently and significantly improves the performance of existing state-of-the-art methods by around 1% to 3%. Code will be released.
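按照"类级去相关"的思想,可以写出如下假设性的正则项草图:对每个类的表示计算相关矩阵并惩罚其非对角项,使各类表示散布得更均匀。具体的归一化与加权方式为本文假设,未必与原论文实现一致:

```python
import numpy as np

def classwise_decorrelation_loss(feats, labels, eps=1e-5):
    """feats: (n, d) 表示;labels: (n,) 类别。对每个类惩罚表示维度之间的相关性。"""
    loss, classes = 0.0, np.unique(labels)
    for c in classes:
        Z = feats[labels == c]
        Z = (Z - Z.mean(0)) / (Z.std(0) + eps)      # 按维度标准化
        corr = Z.T @ Z / Z.shape[0]                 # (d, d) 相关矩阵
        off_diag = corr - np.diag(np.diag(corr))
        loss += (off_diag ** 2).sum()               # 惩罚非对角相关项
    return loss / len(classes)

rng = np.random.default_rng(0)
feats = rng.normal(size=(128, 16))
labels = rng.integers(0, 4, size=128)
print("CwD-style loss:", classwise_decorrelation_loss(feats, labels))
```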

【10】 Gaussian Process Constraint Learning for Scalable Chance-Constrained Motion Planning from Demonstrations 标题:基于演示的可伸缩机会约束运动规划的高斯过程约束学习 链接:https://arxiv.org/abs/2112.04612

作者:Glen Chou,Hao Wang,Dmitry Berenson 机构:All authors are affiliated with the University of Michigan 备注:Under review at RA-L ICRA 2022 摘要:我们提出了一种从局部最优演示中学习以高斯过程(GPs)表示的约束的方法。我们的方法使用Karush-Kuhn-Tucker(KKT)最优性条件来确定在演示中约束紧的位置,并在这些状态下缩放约束梯度。然后,我们训练一个约束的GP表示,它与此信息一致,并且概括了此信息。我们进一步证明,GP不确定性可以在kinodynamic RRT中用于规划概率安全的轨迹,并且我们可以利用规划器中的GP结构精确地实现指定的安全概率。我们证明了我们的方法可以学习复杂的非线性约束,这些约束在5D非完整汽车、12D四旋翼和3连杆平面臂上演示,同时需要最少的约束先验信息。我们的结果表明,所学习的GP约束是准确的,优于以前需要更多先验知识的约束学习方法。 摘要:We propose a method for learning constraints represented as Gaussian processes (GPs) from locally-optimal demonstrations. Our approach uses the Karush-Kuhn-Tucker (KKT) optimality conditions to determine where on the demonstrations the constraint is tight, and a scaling of the constraint gradient at those states. We then train a GP representation of the constraint which is consistent with and which generalizes this information. We further show that the GP uncertainty can be used within a kinodynamic RRT to plan probabilistically-safe trajectories, and that we can exploit the GP structure within the planner to exactly achieve a specified safety probability. We demonstrate our method can learn complex, nonlinear constraints demonstrated on a 5D nonholonomic car, a 12D quadrotor, and a 3-link planar arm, all while requiring minimal prior information on the constraint. Our results suggest the learned GP constraint is accurate, outperforming previous constraint learning methods that require more a priori knowledge.
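下面用scikit-learn的高斯过程回归给出一个通用草图:在若干"约束取紧"的状态上拟合约束函数,并用预测标准差作为规划时的不确定性来源。这只是思路示意,并非论文的KKT推断或kinodynamic RRT集成的实现,数据均为假设:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# 假设的数据:演示轨迹上约束取紧的状态 x,对应约束值 g(x) = 0
X_tight = np.array([[0.2], [0.5], [0.9], [1.4]])
g_tight = np.zeros(len(X_tight))

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-4)
gp.fit(X_tight, g_tight)

# 在查询状态上预测约束值及其不确定性,可供规划器做概率安全判断
X_query = np.linspace(0.0, 2.0, 5).reshape(-1, 1)
mean, std = gp.predict(X_query, return_std=True)
for x, m, s in zip(X_query.ravel(), mean, std):
    print(f"x={x:.2f}  g_hat={m:+.3f}  std={s:.3f}")  # 远离数据处 std 增大
```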

【11】 Variational Regularization in Inverse Problems and Machine Learning 标题:反问题中的变分正则化与机器学习 链接:https://arxiv.org/abs/2112.04591

作者:Martin Burger 机构:Department Mathematik and Center for Mathematics of Data, Friedrich-Alexander-Universität Erlangen-Nürnberg, Cauerstr., Erlangen 摘要:本文讨论了反问题变分正则化方法的基本结果和最新进展。在一个典型的设置中,我们回顾了获得收敛正则化方案所需的基本性质,并进一步讨论了定量估计的推导及其所需的要素,例如凸泛函的Bregman距离。除了为反问题开发的方法之外,我们还将讨论机器学习中的变分正则化,并梳理与经典正则化理论的一些联系。特别是,我们将讨论在正则化理论框架下对机器学习问题的重新解释,以及在风险最小化框架下对反问题变分方法的重新解释。此外,我们在Bregman距离下的误差估计和泛化误差之间建立了一些以前未知的联系。 摘要:This paper discusses basic results and recent developments on variational regularization methods, as developed for inverse problems. In a typical setup we review basic properties needed to obtain a convergent regularization scheme, and further discuss the derivation of quantitative estimates and the ingredients needed for them, such as Bregman distances for convex functionals. In addition to the approach developed for inverse problems we will also discuss variational regularization in machine learning and work out some connections to the classical regularization theory. In particular we will discuss a reinterpretation of machine learning problems in the framework of regularization theory and a reinterpretation of variational methods for inverse problems in the framework of risk minimization. Moreover, we establish some previously unknown connections between error estimates in Bregman distances and generalization errors.
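作为补充,文中误差估计所依赖的凸泛函$J$的Bregman距离的标准定义为 $D_J^{p}(u,v)=J(u)-J(v)-\langle p,\,u-v\rangle$,其中$p\in\partial J(v)$为$J$在$v$处的次梯度;当$J$严格凸且可微时,它退化为熟悉的"切线以上的余量",对一般凸正则项(如全变差)则给出一种弱化但可用的误差度量。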

【12】 Application of Artificial Intelligence and Machine Learning in Libraries: A Systematic Review 标题:人工智能和机器学习在图书馆中的应用:系统评述 链接:https://arxiv.org/abs/2112.04573

作者:Rajesh Kumar Das,Mohammad Sharif Ul Islam 摘要:随着人工智能和机器学习等尖端技术的概念和实现变得越来越重要,学者、研究人员和信息专业人员都参与了这一领域的研究。本系统文献综述的目的是提供一个综合的实证研究,探索人工智能和机器学习在图书馆中的应用。为了实现研究目标,根据Kitchenham等人(2009)提出的原始指南,进行了系统的文献综述。数据收集自科学网、Scopus、LISA和LISTA数据库。经过严格/既定的选择过程,最终选择、审查和分析了32篇文章,总结了AI和ML领域的应用以及图书馆中最常用的技术。研究结果表明,与LIS领域相关的AI和ML研究的当前状态主要集中在理论工作上。然而,一些研究人员也强调实施项目或案例研究。这项研究将为研究人员、实践者和教育工作者提供图书馆中AI和ML的全景视图,以进一步推动更面向技术的方法,并预测未来的创新路径。 摘要:As the concept and implementation of cutting-edge technologies like artificial intelligence and machine learning has become relevant, academics, researchers and information professionals involve research in this area. The objective of this systematic literature review is to provide a synthesis of empirical studies exploring application of artificial intelligence and machine learning in libraries. To achieve the objectives of the study, a systematic literature review was conducted based on the original guidelines proposed by Kitchenham et al. (2009). Data was collected from Web of Science, Scopus, LISA and LISTA databases. Following the rigorous/ established selection process, a total of thirty-two articles were finally selected, reviewed and analyzed to summarize on the application of AI and ML domain and techniques which are most often used in libraries. Findings show that the current state of the AI and ML research that is relevant with the LIS domain mainly focuses on theoretical works. However, some researchers also emphasized on implementation projects or case studies. This study will provide a panoramic view of AI and ML in libraries for researchers, practitioners and educators for furthering the more technology-oriented approaches, and anticipating future innovation pathways.

【13】 Deep Q-Learning Market Makers in a Multi-Agent Simulated Stock Market 标题:多智能体模拟股市中的深度Q学习做市商 链接:https://arxiv.org/abs/2112.04494

作者:Oscar Fernández Vicente,Fernando Fernández Rebollo,Francisco Javier García Polo 机构:Universidad Carlos III Madrid, Madrid, Spain 备注:Presented at 2nd ACM International Conference on AI in Finance 摘要:做市商通过提供流动性在金融市场中发挥着关键作用。他们通常在订单簿上挂出买入和卖出限价订单,以便为交易者提供可供选择的价格水平。本文正是从基于智能体的角度研究这些做市商的策略。特别是,我们提出应用强化学习(RL)在模拟股票市场中创建智能做市商。本研究分析了RL做市商智能体在非竞争场景(同一时间只有一个RL做市商在学习)和竞争场景(同一时间有多个RL做市商在学习)中的行为,以及它们如何在Sim2Real范围内调整策略,并取得了有趣的结果。此外,它还涵盖了不同实验之间的策略迁移,描述了竞争环境对RL智能体性能的影响。RL和深度RL技术被证明是有利可图的做市商方法,有助于更好地了解它们在股票市场中的行为。 摘要:Market makers play a key role in financial markets by providing liquidity. They usually fill order books with buy and sell limit orders in order to provide traders alternative price levels to operate. This paper focuses precisely on the study of these market makers' strategies from an agent-based perspective. In particular, we propose the application of Reinforcement Learning (RL) for the creation of intelligent market makers in simulated stock markets. This research analyzes how RL market maker agents behave in non-competitive (only one RL market maker learning at the same time) and competitive scenarios (multiple RL market makers learning at the same time), and how they adapt their strategies in a Sim2Real scope with interesting results. Furthermore, it covers the application of policy transfer between different experiments, describing the impact of competing environments on RL agents' performance. RL and deep RL techniques are proven as profitable market maker approaches, leading to a better understanding of their behavior in stock markets.
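作为背景,下面给出最简的表格型Q学习更新(通用算法示意,并非论文中的深度RL做市商;状态、动作与奖励的含义均为本文假设):

```python
import numpy as np

# 极简表格型Q学习:状态=库存水平档位,动作=报价价差档位,奖励=假设的做市盈亏
n_states, n_actions = 5, 3
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.1, 0.99

def q_update(s, a, r, s_next):
    """标准Q学习更新: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))。"""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

rng = np.random.default_rng(0)
for _ in range(1000):
    s = rng.integers(n_states)
    a = rng.integers(n_actions)   # 这里用随机探索代替epsilon-贪婪,仅作演示
    r = rng.normal()              # 假设的即时盈亏
    q_update(s, a, r, rng.integers(n_states))
print(Q.round(3))
```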

【14】 Fair Structure Learning in Heterogeneous Graphical Models 标题:异构图形模型中的公平结构学习 链接:https://arxiv.org/abs/2112.05128

作者:Davoud Ataee Tarzanagh,Laura Balzano,Alfred O. Hero 机构:Department of Electrical Engineering and Computer Science, University of Michigan 摘要:当节点具有人口统计属性时,概率图形模型中的社区结构推断可能与公平性约束不一致。某些人口统计数据可能在某些检测到的社区中代表性过高,而在其他社区中代表性不足。本文定义了一种新的$\ell_1$正则化伪似然方法,用于公平图形模型选择。特别是,我们假设在真实的基础图中存在某种社区或集群结构,并试图从数据中学习稀疏无向图及其社区,以便人口群体在社区中得到公平的代表。我们的优化方法使用人口均等公平性定义,但该框架很容易扩展到其他公平性定义。我们分别对连续数据和二进制数据建立了高斯图形模型和伊辛模型的统计一致性,证明了我们的方法能够以高概率恢复图及其公平社区。 摘要:Inference of community structure in probabilistic graphical models may not be consistent with fairness constraints when nodes have demographic attributes. Certain demographics may be over-represented in some detected communities and under-represented in others. This paper defines a novel $\ell_1$-regularized pseudo-likelihood approach for fair graphical model selection. In particular, we assume there is some community or clustering structure in the true underlying graph, and we seek to learn a sparse undirected graph and its communities from the data such that demographic groups are fairly represented within the communities. Our optimization approach uses the demographic parity definition of fairness, but the framework is easily extended to other definitions of fairness. We establish statistical consistency of the proposed method for both a Gaussian graphical model and an Ising model for, respectively, continuous and binary data, proving that our method can recover the graphs and their fair communities with high probability.

【15】 Provable Continual Learning via Sketched Jacobian Approximations 标题:基于草图雅可比近似的可证明连续学习 链接:https://arxiv.org/abs/2112.05095

作者:Reinhard Heckel 机构:∗Dept. of Electrical and Computer Engineering, Technical University of Munich, †Dept. of Electrical and Computer Engineering, Rice University 摘要:机器学习中的一个重要问题是以顺序方式学习任务的能力。如果使用标准的一阶方法进行训练,大多数模型在接受新任务训练时会忘记以前学习过的任务,这通常被称为灾难性遗忘。克服遗忘的一种流行方法是对损失函数进行正则化,惩罚在以前任务中表现不佳的模型。例如,弹性权重固结(EWC)用一个二次型进行正则化,其中涉及基于过去数据构建的对角矩阵。虽然EWC在某些设置中工作得很好,但我们表明,即使在其他条件理想的情况下,如果该对角矩阵对以前任务的Hessian矩阵近似较差,它仍可能遭受灾难性遗忘。我们提出了一种简单的方法来克服这一问题:用过去数据雅可比矩阵的草图(sketch)来正则化新任务的训练。这可证明能够克服线性模型和宽神经网络的灾难性遗忘,但要付出内存方面的代价。本文的总体目标是提供关于基于正则化的持续学习算法何时有效以及需要何种内存成本的见解。 摘要:An important problem in machine learning is the ability to learn tasks in a sequential manner. If trained with standard first-order methods most models forget previously learned tasks when trained on a new task, which is often referred to as catastrophic forgetting. A popular approach to overcome forgetting is to regularize the loss function by penalizing models that perform poorly on previous tasks. For example, elastic weight consolidation (EWC) regularizes with a quadratic form involving a diagonal matrix built based on past data. While EWC works very well for some setups, we show that, even under otherwise ideal conditions, it can provably suffer catastrophic forgetting if the diagonal matrix is a poor approximation of the Hessian matrix of previous tasks. We propose a simple approach to overcome this: Regularizing training of a new task with sketches of the Jacobian matrix of past data. This provably enables overcoming catastrophic forgetting for linear models and for wide neural networks, at the cost of memory. The overarching goal of this paper is to provide insights on when regularization-based continual learning algorithms work and under what memory costs.
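文中提到的EWC二次正则项是标准形式,可用numpy示意如下(变量均为假设;注释中说明了它与论文论点的联系):

```python
import numpy as np

def ewc_penalty(theta, theta_star, fisher_diag, lam=1.0):
    """EWC二次惩罚: lam/2 * sum_i F_ii * (theta_i - theta*_i)^2。
    fisher_diag 是基于旧任务数据估计的对角Fisher信息;
    正如论文指出,若它对真实Hessian近似很差,仍可能发生灾难性遗忘。"""
    return 0.5 * lam * np.sum(fisher_diag * (theta - theta_star) ** 2)

theta_star = np.array([1.0, -0.5, 2.0])   # 旧任务最优参数(假设)
fisher = np.array([5.0, 0.01, 3.0])       # 对角Fisher估计(假设)
theta = np.array([1.2, 1.5, 1.9])         # 新任务训练中的当前参数
print("EWC penalty:", ewc_penalty(theta, theta_star, fisher))
```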

【16】 A Training Framework for Stereo-Aware Speech Enhancement using Deep Neural Networks 标题:一种基于深度神经网络的立体感知语音增强训练框架 链接:https://arxiv.org/abs/2112.04939

作者:Bahareh Tolooshams,Kazuhito Koishida 机构:⋆School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, † Microsoft Corporation, One Microsoft Way, Redmond, WA 备注:Submitted to ICASSP 2022 摘要:近年来,基于深度学习的语音增强显示出前所未有的性能。最流行的单声道语音增强框架是端到端网络,将噪声混合映射为干净语音的估计。随着计算能力的提高和多通道麦克风录音的可用性,以前的工作旨在结合空间统计和频谱信息来提高性能。尽管单声道输出的增强性能有所提高,但空间声像的保留和主观评价在文献中并没有得到太多的关注。本文提出了一种新的立体声感知语音增强框架,即基于深度学习的语音增强的训练损失,在增强立体声混合的同时保留空间声像。该框架与模型无关,因此可以应用于任何基于深度学习的体系结构。我们通过听力测试对经过训练的模型进行广泛的客观和主观评估。我们表明,通过对声像保留损失进行正则化,整体性能得到了改善,并且语音的立体声特性得到了更好的保留。 摘要:Deep learning-based speech enhancement has shown unprecedented performance in recent years. The most popular mono speech enhancement frameworks are end-to-end networks mapping the noisy mixture into an estimate of the clean speech. With growing computational power and availability of multichannel microphone recordings, prior works have aimed to incorporate spatial statistics along with spectral information to boost up performance. Despite an improvement in enhancement performance of mono output, the spatial image preservation and subjective evaluations have not gained much attention in the literature. This paper proposes a novel stereo-aware framework for speech enhancement, i.e., a training loss for deep learning-based speech enhancement to preserve the spatial image while enhancing the stereo mixture. The proposed framework is model independent, hence it can be applied to any deep learning based architecture. We provide an extensive objective and subjective evaluation of the trained models through a listening test. We show that by regularizing for an image preservation loss, the overall performance is improved, and the stereo aspect of the speech is better preserved.

其他(22篇)

【1】 PixMix: Dreamlike Pictures Comprehensively Improve Safety Measures 标题:PixMix:梦幻画卷全面提升安全措施 链接:https://arxiv.org/abs/2112.05135

作者:Dan Hendrycks,Andy Zou,Mantas Mazeika,Leonard Tang,Dawn Song,Jacob Steinhardt 机构:UC Berkeley, UIUC, Harvard University 备注:Code and models are available at this https URL 摘要:在机器学习的实际应用中,可靠和安全的系统必须考虑标准测试集精度之外的性能度量。这些其他目标包括分布外(OOD)稳健性、预测一致性、对对抗攻击的抵御能力、校准的不确定性估计以及检测异常输入的能力。然而,朝这些目标提高性能通常需要权衡取舍,如今的方法无法在不牺牲其他安全维度性能的情况下实现这些目标。例如,对抗性训练提高了对抗稳健性,但严重降低了其他分类器性能指标。类似地,强数据增强和正则化技术通常会提高OOD鲁棒性,但会损害异常检测,这就提出了一个问题:是否有可能在所有现有安全度量上实现帕累托改进。为了应对这一挑战,我们设计了一种新的数据增强策略,利用分形等图片的天然结构复杂性,其性能优于众多基线,接近帕累托最优,并全面改进了各项安全度量。 摘要:In real-world applications of machine learning, reliable and safe systems must consider measures of performance beyond standard test set accuracy. These other goals include out-of-distribution (OOD) robustness, prediction consistency, resilience to adversaries, calibrated uncertainty estimates, and the ability to detect anomalous inputs. However, improving performance towards these goals is often a balancing act that today's methods cannot achieve without sacrificing performance on other safety axes. For instance, adversarial training improves adversarial robustness but sharply degrades other classifier performance metrics. Similarly, strong data augmentation and regularization techniques often improve OOD robustness but harm anomaly detection, raising the question of whether a Pareto improvement on all existing safety measures is possible. To meet this challenge, we design a new data augmentation strategy utilizing the natural structural complexity of pictures such as fractals, which outperforms numerous baselines, is near Pareto-optimal, and roundly improves safety measures.
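下面是对这类"与分形等结构复杂图片混合"的增强策略的一个简化示意(多轮加性/乘性混合的框架参考PixMix的公开描述,但轮数与权重采样等细节为本文假设,并非官方实现):

```python
import numpy as np

def pixmix_like(image, mixers, k_max=4, beta=3.0, rng=None):
    """简化的PixMix式混合:随机若干轮,每轮与一张结构复杂图片
    (如分形)做加性或乘性混合。权重采样方式为本示例假设。"""
    rng = rng or np.random.default_rng()
    mixed = image.copy()
    for _ in range(rng.integers(0, k_max + 1)):
        mixer = mixers[rng.integers(len(mixers))]
        a = rng.beta(beta, 1.0)
        if rng.random() < 0.5:
            mixed = (1 - a) * mixed + a * mixer      # 加性混合
        else:
            # 乘性(几何)混合,先截断避免 0 的幂次问题
            mixed = (np.clip(mixed, 1e-3, 1.0) ** (1 - a)
                     * np.clip(mixer, 1e-3, 1.0) ** a)
        mixed = np.clip(mixed, 0.0, 1.0)
    return mixed

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
fractals = [rng.random((32, 32, 3)) for _ in range(4)]  # 以随机图代替真实分形图片
out = pixmix_like(img, fractals, rng=rng)
print(out.shape, float(out.min()), float(out.max()))
```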

【2】 A Bayesian Treatment of Real-to-Sim for Deformable Object Manipulation 标题:用于可变形物体操纵的Real-to-Sim的贝叶斯处理 链接:https://arxiv.org/abs/2112.05068

作者:Rika Antonova,Jingyun Yang,Priya Sundaresan,Dieter Fox,Fabio Ramos,Jeannette Bohg 机构: The University of Sydney 摘要:可变形物体操纵仍然是机器人学研究中一项具有挑战性的任务。传统的参数推断和状态估计技术通常依赖于状态空间及其动力学的精确定义。虽然这适用于刚体对象和机器人状态,但定义可变形对象的状态空间及其随时间演化的方式仍具有挑战性。在这项工作中,我们将可变形物体物理参数的推断问题表述为一个由模拟器定义的概率推理任务。我们提出了一种从图像序列中提取状态信息的新方法,通过一种将可变形对象的状态表示为分布嵌入的技术。这允许以原则性的方式将噪声状态观测直接纳入现代基于贝叶斯模拟的推理工具中。我们的实验证实,我们可以估计物理特性的后验分布,例如高度可变形对象(如布料和绳索)的弹性、摩擦和尺度。总的来说,我们的方法从概率上解决了真实到模拟的问题,并有助于更好地表示可变形对象状态的演化。 摘要:Deformable object manipulation remains a challenging task in robotics research. Conventional techniques for parameter inference and state estimation typically rely on a precise definition of the state space and its dynamics. While this is appropriate for rigid objects and robot states, it is challenging to define the state space of a deformable object and how it evolves in time. In this work, we pose the problem of inferring physical parameters of deformable objects as a probabilistic inference task defined with a simulator. We propose a novel methodology for extracting state information from image sequences via a technique to represent the state of a deformable object as a distribution embedding. This allows to incorporate noisy state observations directly into modern Bayesian simulation-based inference tools in a principled manner. Our experiments confirm that we can estimate posterior distributions of physical properties, such as elasticity, friction and scale of highly deformable objects, such as cloth and ropes. Overall, our method addresses the real-to-sim problem probabilistically and helps to better represent the evolution of the state of deformable objects.

【3】 i-SpaSP: Structured Neural Pruning via Sparse Signal Recovery 标题:I-SpaSP:基于稀疏信号恢复的结构化神经剪枝 链接:https://arxiv.org/abs/2112.04905

作者:Cameron R. Wolfe,Anastasios Kyrillidis 机构:Department of Computer Science, Rice University, Houston, TX, USA. 备注:27 pages, 4 figures 摘要:我们提出了一种新的神经网络结构化剪枝算法——迭代稀疏结构化剪枝算法,称为i-SpaSP。受稀疏信号恢复思想的启发,i-SpaSP通过迭代识别网络中对剪枝和密集网络输出之间的残差贡献最大的一组重要参数组(例如,滤波器或神经元),然后基于更小的预定义剪枝比对这些组进行阈值化来运行。对于具有ReLU激活的两层和多层网络结构,我们展示了由i-SpaSP修剪引起的误差以多项式形式衰减,其中该多项式的次数根据稠密网络隐藏表示的稀疏性变得任意大。在我们的实验中,i-SpaSP在各种数据集(即MNIST和ImageNet)和体系结构(即前馈网络、ResNet34和MobileNetV2)上进行评估,结果表明,i-SpaSP可以发现高性能子网络,并将可证明基线方法的修剪效率提高几个数量级。简单地说,i-SpaSP易于通过自动微分实现,获得了很强的经验结果,具有理论上的收敛保证,并且是高效的,因此,它是为数不多的计算高效、实用且可证明的修剪算法之一。 摘要:We propose a novel, structured pruning algorithm for neural networks -- the iterative, Sparse Structured Pruning algorithm, dubbed as i-SpaSP. Inspired by ideas from sparse signal recovery, i-SpaSP operates by iteratively identifying a larger set of important parameter groups (e.g., filters or neurons) within a network that contribute most to the residual between pruned and dense network output, then thresholding these groups based on a smaller, pre-defined pruning ratio. For both two-layer and multi-layer network architectures with ReLU activations, we show the error induced by pruning with i-SpaSP decays polynomially, where the degree of this polynomial becomes arbitrarily large based on the sparsity of the dense network's hidden representations. In our experiments, i-SpaSP is evaluated across a variety of datasets (i.e., MNIST and ImageNet) and architectures (i.e., feed forward networks, ResNet34, and MobileNetV2), where it is shown to discover high-performing sub-networks and improve upon the pruning efficiency of provable baseline methodologies by several orders of magnitude. Put simply, i-SpaSP is easy to implement with automatic differentiation, achieves strong empirical results, comes with theoretical convergence guarantees, and is efficient, thus distinguishing itself as one of the few computationally efficient, practical, and provable pruning algorithms.

【4】 Assessing Fairness in the Presence of Missing Data 标题:在存在丢失数据的情况下评估公平性 链接:https://arxiv.org/abs/2112.04899

作者:Yiliang Zhang,Qi Long 机构:University of Pennsylvania, Philadelphia, PA , USA 摘要:缺失数据非常普遍,在实际数据分析中带来了严峻的挑战。虽然有越来越多的关于完全观察数据分析中的公平性的文献,但关于不完整数据分析中公平性的研究却很少。在实践中,处理缺失数据的一种流行分析方法是仅使用一组完整的案例,即所有特征均已完全观察到的观测值来训练预测算法。然而,根据缺失数据机制的不同,完整案例的分布和完整数据的分布可能会有很大的不同。当目标是在不存在缺失值的完整数据域中开发公平算法时,在完整案例域中公平的算法可能会对完整数据域中的某些边缘化群体表现出不成比例的偏见。为了填补这一重大空白,我们研究了仅使用完整案例评估的任意模型在完整数据域中的公平性估计问题。我们提供了公平性估计误差的上界和下界,并进行了数值实验来评估我们的理论结果。我们的工作提供了第一个已知的不完全数据分析中公平性保证的理论结果。 摘要:Missing data are prevalent and present daunting challenges in real data analysis. While there is a growing body of literature on fairness in analysis of fully observed data, there has been little theoretical work on investigating fairness in analysis of incomplete data. In practice, a popular analytical approach for dealing with missing data is to use only the set of complete cases, i.e., observations with all features fully observed to train a prediction algorithm. However, depending on the missing data mechanism, the distribution of complete cases and the distribution of the complete data may be substantially different. When the goal is to develop a fair algorithm in the complete data domain where there are no missing values, an algorithm that is fair in the complete case domain may show disproportionate bias towards some marginalized groups in the complete data domain. To fill this significant gap, we study the problem of estimating fairness in the complete data domain for an arbitrary model evaluated merely using complete cases. We provide upper and lower bounds on the fairness estimation error and conduct numerical experiments to assess our theoretical results. Our work provides the first known theoretical results on fairness guarantee in analysis of incomplete data.
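为具体化"完整案例域上的公平度量可能偏离完整数据域",下面用人口均等差异(demographic parity gap)作为公平度量给出一个玩具示例;其中的模型、特征与缺失机制均为本文假设:

```python
import numpy as np

def dp_gap(y_pred, group):
    """人口均等差异: |P(y_hat=1 | A=0) - P(y_hat=1 | A=1)|。"""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

rng = np.random.default_rng(0)
n = 100_000
group = rng.integers(0, 2, n)                  # 敏感属性 A
x = rng.normal(size=n) + 0.5 * group           # 与 A 相关的特征
y_pred = (x > 0.5).astype(float)               # 某个固定模型的预测

# 假设的非随机缺失(MNAR):x 越大越容易缺失,完整案例分布被扭曲
observed = rng.random(n) < 1 / (1 + np.exp(x))

print("complete-case DP gap:", round(dp_gap(y_pred[observed], group[observed]), 3))
print("complete-data DP gap:", round(dp_gap(y_pred, group), 3))  # 两者明显不同
```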

【5】 Latent Space Explanation by Intervention 标题:干预对潜在空间的解释 链接:https://arxiv.org/abs/2112.04895

作者:Itai Gat,Guy Lorberbom,Idan Schwartz,Tamir Hazan 机构: Technion - Israel Institute of Technology, NetApp 备注:Accepted to AAAI22 摘要:深层神经网络的成功在很大程度上依赖于其编码输入和输出之间复杂关系的能力。虽然该属性可以很好地拟合训练数据,但它也掩盖了驱动预测的机制。这项研究的目的是通过使用一种干预机制来揭示隐藏的概念,该机制基于离散变分自动编码器来转移预测类。然后,解释模型将任何隐藏层的编码信息及其相应的中间表示可视化。通过评估原始表示和介入表示之间的差异,可以确定可以改变类的概念,从而提供可解释性。我们在CelebA上展示了我们的方法的有效性,在CelebA中,我们展示了数据中对偏差的各种可视化,并建议不同的干预措施来揭示和改变偏差。 摘要:The success of deep neural nets heavily relies on their ability to encode complex relations between their input and their output. While this property serves to fit the training data well, it also obscures the mechanism that drives prediction. This study aims to reveal hidden concepts by employing an intervention mechanism that shifts the predicted class based on discrete variational autoencoders. An explanatory model then visualizes the encoded information from any hidden layer and its corresponding intervened representation. By the assessment of differences between the original representation and the intervened representation, one can determine the concepts that can alter the class, hence providing interpretability. We demonstrate the effectiveness of our approach on CelebA, where we show various visualizations for bias in the data and suggest different interventions to reveal and change bias.

【6】 GPU backed Data Mining on Android Devices 标题:基于GPU的Android设备数据挖掘 链接:https://arxiv.org/abs/2112.04800

作者:Robert Fritze,Claudia Plant 机构:University of Vienna, Vienna, Austria, ORCID ,-,-,- 备注:11 pages 摘要:为低功耗设备上的高性能计算选择合适的编程范式有助于加快计算速度。许多安卓设备都有一个集成的GPU,尽管没有官方支持,但OpenCL框架可以在安卓设备上用于处理这些GPU。OpenCL支持线程和数据并行。使用GPU的应用程序必须考虑这样一个事实,即用户或Android操作系统可以随时暂停它们。我们已经创建了一个包装器库,允许在Android设备上使用OpenCL。已经编写的OpenCL程序几乎不需要修改就可以执行。我们使用该库比较了在Arm-v7平板电脑的集成GPU上DBSCAN和Kmeans算法与同一设备上的其他单线程和多线程实现的性能。我们已经调查了哪种编程范式和语言允许在执行速度和能耗之间进行最佳权衡。在Android设备上使用GPU for HPC有助于在偏远地区、恶劣环境条件下以及能源供应存在问题的地区执行计算密集型机器学习或数据挖掘任务。 摘要:Choosing an appropriate programming paradigm for high-performance computing on low-power devices can be useful to speed up calculations. Many Android devices have an integrated GPU and - although not officially supported - the OpenCL framework can be used on Android devices for addressing these GPUs. OpenCL supports thread and data parallelism. Applications that use the GPU must account for the fact that they can be suspended by the user or the Android operating system at any moment. We have created a wrapper library that allows to use OpenCL on Android devices. Already written OpenCL programs can be executed with almost no modification. We have used this library to compare the performance of the DBSCAN and Kmeans algorithms on an integrated GPU of an Arm-v7 tablet with other single and multithreaded implementations on the same device. We have investigated which programming paradigm and language allows the best tradeoff between execution speed and energy consumption. Using the GPU for HPC on Android devices can help to carry out computationally intensive machine learning or data mining tasks in remote areas, under harsh environmental conditions and in areas where energy supply is an issue.

【7】 From Good to Best: Two-Stage Training for Cross-lingual Machine Reading Comprehension 标题:从好到好:跨语种机器阅读理解的两阶段训练 链接:https://arxiv.org/abs/2112.04735

作者:Nuo Chen,Linjun Shou,Min Gong,Jian Pei,Daxin Jiang 机构:ADSPLAB, School of ECE, Peking University, Shenzhen, China, NLP Group, Microsoft STCA, School of Computing Science, Simon Fraser University 摘要:由于缺乏低资源语言的训练数据,跨语言机器阅读理解(xMRC)具有挑战性。最近的方法仅使用英语等资源丰富的语言中的训练数据来微调大规模跨语言预训练语言模型。由于语言之间的巨大差异,仅由源语言微调的模型可能无法在目标语言中运行良好。有趣的是,我们观察到,虽然先前方法预测的前1名结果可能经常无法找到基本的真相答案,但正确的答案通常包含在前k名预测结果中。基于这一观察,我们开发了一种两阶段方法来提高模型性能。第一阶段的目标是回忆:我们设计了一个硬学习(HL)算法,以最大化top-k预测包含准确答案的可能性。第二阶段侧重于精确性:开发了一种答案感知对比学习(AA-CL)机制,以了解准确答案与其他候选答案之间的细微差异。我们的大量实验表明,在两个跨语言的MRC基准数据集上,我们的模型明显优于一系列强基线。 摘要:Cross-lingual Machine Reading Comprehension (xMRC) is challenging due to the lack of training data in low-resource languages. The recent approaches use training data only in a resource-rich language like English to fine-tune large-scale cross-lingual pre-trained language models. Due to the big difference between languages, a model fine-tuned only by a source language may not perform well for target languages. Interestingly, we observe that while the top-1 results predicted by the previous approaches may often fail to hit the ground-truth answers, the correct answers are often contained in the top-k predicted results. Based on this observation, we develop a two-stage approach to enhance the model performance. The first stage targets at recall: we design a hard-learning (HL) algorithm to maximize the likelihood that the top-k predictions contain the accurate answer. The second stage focuses on precision: an answer-aware contrastive learning (AA-CL) mechanism is developed to learn the fine difference between the accurate answer and other candidates. Our extensive experiments show that our model significantly outperforms a series of strong baselines on two cross-lingual MRC benchmark datasets.

【8】 Reducing Catastrophic Forgetting in Self Organizing Maps with Internally-Induced Generative Replay 标题:用内诱导生成重放减少自组织映射中的灾难性遗忘 链接:https://arxiv.org/abs/2112.04728

作者:Hitesh Vaidya,Travis Desell,Alexander Ororbia 机构:Rochester Institute of Technology 摘要:终身学习代理能够从潜在的无限模式感知数据流中不断学习。构建以这种方式适应的代理的一个主要历史困难是,神经系统在从新样本学习时难以保留先前获得的知识。这个问题被称为灾难性遗忘(干扰),至今仍是机器学习领域尚未解决的问题。过去几十年来,人们在前馈网络的背景下对遗忘进行了广泛的研究,但在替代体系结构(如古老的自组织映射(SOM))的背景下,遗忘的研究却少得多。SOM是一种无监督的神经模型,通常用于聚类和降维等任务。尽管其内部神经元之间的竞争可能具有改善记忆保留的潜力,但我们观察到,一个固定大小的SOM在任务增量数据上训练,即它以一定的时间增量接收与特定类相关的数据点,会经历显著的遗忘。在这项研究中,我们提出了连续SOM(c-SOM),这是一种能够在处理信息时减少自身遗忘的模型。 摘要:A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data. One major historic difficulty in building agents that adapt in this way is that neural systems struggle to retain previously-acquired knowledge when learning from new samples. This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day. While forgetting in the context of feedforward networks has been examined extensively over the decades, far less has been done in the context of alternative architectures such as the venerable self-organizing map (SOM), an unsupervised neural model that is often used in tasks such as clustering and dimensionality reduction. Although the competition among its internal neurons might carry the potential to improve memory retention, we observe that a fixed-sized SOM trained on task incremental data, i.e., it receives data points related to specific classes at certain temporal increments, experiences significant forgetting. In this study, we propose the continual SOM (c-SOM), a model that is capable of reducing its own forgetting when processing information.
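作为背景,经典SOM的竞争学习更新可示意如下(标准算法,并非论文的c-SOM本身):先找到最佳匹配单元(BMU),再按高斯邻域函数把邻近单元的权重拉向输入:

```python
import numpy as np

def som_step(W, grid, x, lr=0.1, sigma=1.0):
    """W: (n_units, d) 单元权重; grid: (n_units, 2) 单元的网格坐标; x: (d,) 输入。"""
    bmu = np.argmin(((W - x) ** 2).sum(axis=1))       # 最佳匹配单元
    dist2 = ((grid - grid[bmu]) ** 2).sum(axis=1)     # 网格上到BMU的距离平方
    h = np.exp(-dist2 / (2 * sigma ** 2))             # 高斯邻域函数
    W += lr * h[:, None] * (x - W)                    # 把邻近单元拉向输入
    return W

rng = np.random.default_rng(0)
side, d = 5, 3
grid = np.array([(i, j) for i in range(side) for j in range(side)], dtype=float)
W = rng.random((side * side, d))
for x in rng.random((500, d)):                        # 假设的数据流
    W = som_step(W, grid, x)
print(W.shape)
```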

【9】 Trajectory-Constrained Deep Latent Visual Attention for Improved Local Planning in Presence of Heterogeneous Terrain 标题:异质地形下改进局部规划的轨迹约束深度潜伏视觉注意 链接:https://arxiv.org/abs/2112.04684

作者:Stefan Wapnick,Travis Manderson,David Meger,Gregory Dudek 机构: School of Computer Science 备注:Published in International Conference on Intelligent Robots and Systems (IROS) 2021 proceedings. Project website: this https URL 摘要:我们提出了一种奖励预测、基于模型的深度学习方法,该方法具有轨迹约束的视觉注意,可用于无地图(mapless)局部视觉导航任务。我们的方法学习将视觉注意力放置在潜在图像空间中随车辆控制动作所引起轨迹移动的位置,以提高规划过程中的预测精度。注意模型通过特定任务损失和额外的轨迹约束损失进行联合优化,在保持适应性的同时鼓励正则化结构,以提高泛化和可靠性。重要的是,视觉注意被应用于潜在特征图空间而非原始图像空间,以促进高效的规划。我们在视觉导航任务中验证了我们的模型,这些任务包括在越野环境中规划低颠簸、无碰撞的轨迹,以及在湿滑地形下使用锁定差速器爬坡。实验涉及随机程序化生成的模拟环境和真实环境。我们发现,与无注意和自注意的替代方案相比,我们的方法提高了泛化能力和学习效率。 摘要:We present a reward-predictive, model-based deep learning method featuring trajectory-constrained visual attention for use in mapless, local visual navigation tasks. Our method learns to place visual attention at locations in latent image space which follow trajectories caused by vehicle control actions to enhance predictive accuracy during planning. The attention model is jointly optimized by the task-specific loss and an additional trajectory-constraint loss, allowing adaptability yet encouraging a regularized structure for improved generalization and reliability. Importantly, visual attention is applied in latent feature map space instead of raw image space to promote efficient planning. We validated our model in visual navigation tasks of planning low turbulence, collision-free trajectories in off-road settings and hill climbing with locking differentials in the presence of slippery terrain. Experiments involved randomized procedural generated simulation and real-world environments. We found our method improved generalization and learning efficiency when compared to no-attention and self-attention alternatives.

【10】 Clairvoyance: Intelligent Route Planning for Electric Buses Based on Urban Big Data 标题:千里眼:基于城市大数据的电动公交车智能路径规划 链接:https://arxiv.org/abs/2112.04682

作者:Xiangyong Lu,Kaoru Ota,Mianxiong Dong,Chen Yu,Hai Jin 机构: School of Computer Science and Technology, Huazhong University ofScience and Technology 备注:13 pages,12 figures 摘要:如今,世界上许多城市都引进了电动公交车,以优化城市交通,减少当地碳排放。为了减少碳排放并最大限度地利用电动公共汽车,为其选择合适的路线是很重要的。传统上,路线选择基于专用调查,这在时间和人力上都很昂贵。在本论文中,我们主要关注智能规划电动公交线路,根据整个城市各个区域的独特需求进行规划。我们提出了“千里眼”路线规划系统,该系统利用深度神经网络和多层感知器分别预测未来人们的出行和未来整个城市的交通碳排放。考虑到人们出行和交通碳排放的未来信息,我们利用贪婪机制为理想状态下出发的电动公交车推荐公交路线。此外,还从异构的城市数据集中提取了这两个神经网络的代表性特征。我们通过在中国珠海的真实数据源上的大量实验来评估我们的方法。结果表明,我们设计的基于神经网络的算法始终优于典型基线。此外,电动公交车的推荐路线有助于降低碳排放峰值,充分利用城市电动公交车。 摘要:Nowadays many cities around the world have introduced electric buses to optimize urban traffic and reduce local carbon emissions. In order to cut carbon emissions and maximize the utility of electric buses, it is important to choose suitable routes for them. Traditionally, route selection is on the basis of dedicated surveys, which are costly in time and labor. In this paper, we mainly focus attention on planning electric bus routes intelligently, depending on the unique needs of each region throughout the city. We propose Clairvoyance, a route planning system that leverages a deep neural network and a multilayer perceptron to predict the future people's trips and the future transportation carbon emission in the whole city, respectively. Given the future information of people's trips and transportation carbon emission, we utilize a greedy mechanism to recommend bus routes for electric buses that will depart in an ideal state. Furthermore, representative features of the two neural networks are extracted from the heterogeneous urban datasets. We evaluate our approach through extensive experiments on real-world data sources in Zhuhai, China. The results show that our designed neural network-based algorithms are consistently superior to the typical baselines. Additionally, the recommended routes for electric buses are helpful in reducing the peak value of carbon emissions and making full use of electric buses in the city.

【11】 Enhancing Food Intake Tracking in Long-Term Care with Automated Food Imaging and Nutrient Intake Tracking (AFINI-T) Technology 标题:利用自动食物成像和营养素摄取跟踪(AFINI-T)技术加强长期护理中的食物摄入量跟踪 链接:https://arxiv.org/abs/2112.04608

作者:Kaylen J. Pfisterer,Robert Amelard,Jennifer Boger,Audrey G. Chung,Heather H. Keller,Alexander Wong 机构:University of Waterloo, Waterloo, Systems Design Engineering, Waterloo, ON, N,L ,G, Canada, Waterloo AI Institute, Waterloo, ON, N,L ,G, Canada, Schlegel-UW Research Institute for Aging, Waterloo, N,J ,E, Canada 备注:Key words: Automatic segmentation, convolutional neural network, deep learning, food intake tracking, volume estimation, malnutrition prevention, long-term care, hospital 摘要:半数长期护理(LTC)居民营养不良,住院率、死亡率、发病率不断增加,生活质量下降。目前的跟踪方法主观且耗时。本文介绍了为LTC设计的自动食物成像和营养素摄入跟踪(AFINI-T)技术。我们提出了一种用于食物分类的新型卷积自动编码器,在增强的UNIMIB2016数据集上进行训练,并在我们的模拟LTC食物摄入数据集上进行测试(12种膳食场景;每种最多15个类别;top-1分类准确率:88.9%;平均摄入误差:-0.4 mL$\pm$36.7 mL)。按体积计算的营养素摄入量估算值与按质量计算的营养素估算值呈强线性相关($r^2$为0.92至0.99),方法之间具有良好的一致性($\sigma$=-2.7至-0.01;零均落在各一致性界限之内)。AFINI-T方法是一种以深度学习为动力的计算营养素传感系统,可为更准确、客观地跟踪LTC居民的食物摄入量提供一种新手段,以支持营养不良的跟踪与预防策略。 摘要:Half of long-term care (LTC) residents are malnourished increasing hospitalization, mortality, morbidity, with lower quality of life. Current tracking methods are subjective and time consuming. This paper presents the automated food imaging and nutrient intake tracking (AFINI-T) technology designed for LTC. We propose a novel convolutional autoencoder for food classification, trained on an augmented UNIMIB2016 dataset and tested on our simulated LTC food intake dataset (12 meal scenarios; up to 15 classes each; top-1 classification accuracy: 88.9%; mean intake error: -0.4 mL$\pm$36.7 mL). Nutrient intake estimation by volume was strongly linearly correlated with nutrient estimates from mass ($r^2$ 0.92 to 0.99) with good agreement between methods ($\sigma$= -2.7 to -0.01; zero within each of the limits of agreement). The AFINI-T approach is a deep-learning powered computational nutrient sensing system that may provide a novel means for more accurately and objectively tracking LTC resident food intake to support and prevent malnutrition tracking strategies.

【12】 Estimating Divergences in High Dimensions 标题:高维空间中的散度估计 链接:https://arxiv.org/abs/2112.04583

作者:Loong Kuan Lee,Nico Piatkowski,François Petitjean,Geoffrey I. Webb 机构:Department of Data Science and AI, Monash University; Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS 备注:13 pages, 6 Figures. Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence 摘要:有限样本下两个高维分布之间的散度估计是机器学习等领域中的一个重要问题。尽管以前的方法在中等维度的数据上表现良好,但在具有数百个二进制变量的情况下,其精度开始下降。因此,我们建议使用可分解模型来估计高维数据中的散度。这使我们能够将高维分布的估计密度分解为低维函数的乘积。我们进行了形式化和实验分析,以探索在散度估计背景下使用可分解模型的特性。为此,我们从经验上证明,在维度较高且能从可用数据中学到有用的可分解模型的情况下,用最大似然估计的可分解模型来估计Kullback-Leibler散度优于现有的散度估计方法。 摘要:The problem of estimating the divergence between 2 high dimensional distributions with limited samples is an important problem in various fields such as machine learning. Although previous methods perform well with moderate dimensional data, their accuracy starts to degrade in situations with 100s of binary variables. Therefore, we propose the use of decomposable models for estimating divergences in high dimensional data. These allow us to factorize the estimated density of the high-dimensional distribution into a product of lower dimensional functions. We conduct formal and experimental analyses to explore the properties of using decomposable models in the context of divergence estimation. To this end, we show empirically that estimating the Kullback-Leibler divergence using decomposable models from a maximum likelihood estimator outperforms existing methods for divergence estimation in situations where dimensionality is high and useful decomposable models can be learnt from the available data.
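下面的玩具例子演示可分解(因子化)模型带来的一个便利:乘积分布之间的KL散度等于各因子KL之和,这正是"在低维函数上估计高维散度"的基础(示例分布为本文假设):

```python
import numpy as np

def kl(p, q):
    """离散分布的KL散度 KL(p||q) = sum_i p_i * log(p_i / q_i)。"""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum(p * np.log(p / q))

# 两个二元变量的独立乘积分布(最简单的可分解模型)
p1, p2 = np.array([0.7, 0.3]), np.array([0.6, 0.4])
q1, q2 = np.array([0.5, 0.5]), np.array([0.5, 0.5])

joint_p = np.outer(p1, p2).ravel()
joint_q = np.outer(q1, q2).ravel()

# 乘积分布的联合KL等于各因子KL之和,两个打印值应相等
print(kl(joint_p, joint_q))
print(kl(p1, q1) + kl(p2, q2))
```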

【13】 Whose Ground Truth? Accounting for Individual and Collective Identities Underlying Dataset Annotation 标题:谁的地面真相?在数据集注解基础上考虑个人和集体身份 链接:https://arxiv.org/abs/2112.04554

作者:Emily Denton,Mark Díaz,Ian Kivlichan,Vinodkumar Prabhakaran,Rachel Rosen 机构:Google Research, Jigsaw 摘要:人类注释在机器学习(ML)的研究和开发中起着至关重要的作用。然而,围绕构建ML数据集的过程和决策的伦理考虑还没有得到足够的重视。在本文中,我们调查了一系列文献,这些文献提供了关于众包数据集注释的伦理考虑的见解。我们综合了这些见解,并从两个层面阐述了这一领域的挑战:(1)注释者是谁,注释者的生活经历如何影响他们的注释,以及(2)注释者与众包平台之间的关系以及这种关系为他们提供了什么。最后,我们在ML数据管道的各个阶段为数据集开发人员提出了一组具体的建议和考虑事项:任务制定、注释器选择、平台和基础设施选择、数据集分析和评估以及数据集文档和发布。 摘要:Human annotations play a crucial role in machine learning (ML) research and development. However, the ethical considerations around the processes and decisions that go into building ML datasets has not received nearly enough attention. In this paper, we survey an array of literature that provides insights into ethical considerations around crowdsourced dataset annotation. We synthesize these insights, and lay out the challenges in this space along two layers: (1) who the annotator is, and how the annotators' lived experiences can impact their annotations, and (2) the relationship between the annotators and the crowdsourcing platforms and what that relationship affords them. Finally, we put forth a concrete set of recommendations and considerations for dataset developers at various stages of the ML data pipeline: task formulation, selection of annotators, platform and infrastructure choices, dataset analysis and evaluation, and dataset documentation and release.

【14】 A fully-differentiable compressible high-order computational fluid dynamics solver 标题:一种完全可微的可压缩高阶计算流体力学求解器 链接:https://arxiv.org/abs/2112.04979

作者:Deniz A. Bezgin,Aaron B. Buhendwa,Nikolaus A. Adams 机构:Chair of Aerodynamics and Fluid Mechanics, Technical University of Munich, D-, Garching bei Muenchen 摘要:流体流动在自然和工程学科中无处不在。由于多时空尺度上的非线性相互作用,流体的可靠计算一直是一个长期的挑战。可压缩Navier-Stokes方程控制可压缩流动,并允许出现复杂现象,如湍流和冲击。尽管在硬件和软件方面取得了巨大的进步,但捕捉流体流动中的最小长度尺度仍然会给实际应用带来令人望而却步的计算成本。我们目前正在目睹一种范式的转变,即以机器学习支持的数值格式设计作为解决上述问题的手段。虽然之前的工作已经探索了一维或二维不可压缩流体流动的可微算法,但我们提出了一个完全可微的三维框架,用于使用高阶最先进的数值方法计算可压缩流体流动。首先,我们通过计算经典的二维和三维测试用例(包括强激波和向湍流的过渡)来证明我们的解算器的效率。其次,也是更重要的是,我们的框架允许端到端优化,以改进计算流体动力学算法中的现有数值方案。特别是,我们使用神经网络来代替传统的数值通量函数。 摘要:Fluid flows are omnipresent in nature and engineering disciplines. The reliable computation of fluids has been a long-lasting challenge due to nonlinear interactions over multiple spatio-temporal scales. The compressible Navier-Stokes equations govern compressible flows and allow for complex phenomena like turbulence and shocks. Despite tremendous progress in hardware and software, capturing the smallest length-scales in fluid flows still introduces prohibitive computational cost for real-life applications. We are currently witnessing a paradigm shift towards machine learning supported design of numerical schemes as a means to tackle aforementioned problem. While prior work has explored differentiable algorithms for one- or two-dimensional incompressible fluid flows, we present a fully-differentiable three-dimensional framework for the computation of compressible fluid flows using high-order state-of-the-art numerical methods. Firstly, we demonstrate the efficiency of our solver by computing classical two- and three-dimensional test cases, including strong shocks and transition to turbulence. Secondly, and more importantly, our framework allows for end-to-end optimization to improve existing numerical schemes inside computational fluid dynamics algorithms. In particular, we are using neural networks to substitute a conventional numerical flux function.
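文中"传统数值通量函数"的一个经典例子是局部Lax-Friedrichs(Rusanov)通量;下面以一维Burgers方程为例给出其numpy示意(标准格式,用于说明哪类函数可被神经网络替换,并非论文的可微求解器或其神经通量):

```python
import numpy as np

def rusanov_flux(uL, uR):
    """局部Lax-Friedrichs(Rusanov)数值通量,Burgers方程 f(u) = u^2/2。"""
    fL, fR = 0.5 * uL ** 2, 0.5 * uR ** 2
    lam = np.maximum(np.abs(uL), np.abs(uR))      # 局部最大波速
    return 0.5 * (fL + fR) - 0.5 * lam * (uR - uL)

# 一步有限体积更新: u_i^{n+1} = u_i - dt/dx * (F_{i+1/2} - F_{i-1/2}),周期边界
x = np.linspace(0, 2 * np.pi, 200, endpoint=False)
u = np.sin(x)
dx, dt = x[1] - x[0], 0.01
F = rusanov_flux(u, np.roll(u, -1))               # F_{i+1/2}
u = u - dt / dx * (F - np.roll(F, 1))             # 减去 F_{i-1/2}
print(u.shape)
```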

【15】 Measuring Wind Turbine Health Using Drifting Concepts 标题:利用漂移概念测量风力机健康 链接:https://arxiv.org/abs/2112.04933

作者:Agnieszka Jastrzebska,Alejandro Morales-Hernández,Gonzalo Nápoles,Yamisleydi Salgueiro,Koen Vanhoof 机构:Warsaw University of Technology, Poland., Hasselt University, Belgium., Department of Cognitive Science & Artificial Intelligence, Tilburg University, The, Netherlands., Department of Computer Sciences, Universidad de Talca, Campus Curic´o, Chile. 摘要:时间序列处理是风力发电机组健康监测的一个重要方面。尽管在这一领域取得了进展,但仍有改进建模质量的新方法的空间。在本文中,我们提出了两种新的风力发电机组健康分析方法。这两种方法都基于抽象概念,使用模糊集实现,模糊集汇总和聚合底层原始数据。通过观察概念的变化,我们推断出涡轮机健康状况的变化。分别针对不同的外部条件(风速和温度)进行分析。我们提取代表相对低、中、高功率生产的概念。第一种方法旨在评估相对高功率和低功率生产的减少或增加。此任务使用类似回归的模型执行。第二种方法评估提取概念的总体漂移。大漂移表明发电过程在时间上会发生波动。使用语言标签标记概念,从而使我们的模型具有改进的可解释性特征。我们应用所提出的方法来处理描述四个风力涡轮机的公开数据。仿真结果表明,老化过程并非在所有风力涡轮机中都是均匀的。 摘要:Time series processing is an essential aspect of wind turbine health monitoring. Despite the progress in this field, there is still room for new methods to improve modeling quality. In this paper, we propose two new approaches for the analysis of wind turbine health. Both approaches are based on abstract concepts, implemented using fuzzy sets, which summarize and aggregate the underlying raw data. By observing the change in concepts, we infer about the change in the turbine's health. Analyzes are carried out separately for different external conditions (wind speed and temperature). We extract concepts that represent relative low, moderate, and high power production. The first method aims at evaluating the decrease or increase in relatively high and low power production. This task is performed using a regression-like model. The second method evaluates the overall drift of the extracted concepts. Large drift indicates that the power production process undergoes fluctuations in time. Concepts are labeled using linguistic labels, thus equipping our model with improved interpretability features. We applied the proposed approach to process publicly available data describing four wind turbines. The simulation results have shown that the aging process is not homogeneous in all wind turbines.
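下面用三角隶属函数给出"相对低/中/高功率"概念的一个最小示意;隶属函数形状与断点均为本文假设,论文的概念提取细节可能不同:

```python
import numpy as np

def tri(x, a, b, c):
    """三角隶属函数:在 b 处取1,在 [a, c] 之外为0。"""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

power = np.linspace(0.0, 1.0, 5)  # 归一化功率(假设数据)
low = tri(power, -0.01, 0.0, 0.5)
moderate = tri(power, 0.0, 0.5, 1.0)
high = tri(power, 0.5, 1.0, 1.01)
for p, l, m, h in zip(power, low, moderate, high):
    print(f"p={p:.2f}  low={l:.2f}  mod={m:.2f}  high={h:.2f}")
```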

【16】 A More Stable Accelerated Gradient Method Inspired by Continuous-Time Perspective 标题:一种受连续时间透视启发的更稳定的加速梯度法 链接:https://arxiv.org/abs/2112.04922

作者:Yasong Feng,Weiguo Gao 摘要:Nesterov的加速梯度法(NAG)广泛应用于机器学习背景下的问题,包括深度学习,它对应于一个连续时间微分方程。由此,可以研究微分方程的性质及其数值逼近,以改进加速梯度法。在这项工作中,我们提出了一个新的改进NAG的稳定性方面的启发数值分析。我们给出了NAG的精确阶数作为其连续时间极限的数值逼近,然后提出了一种新的高阶方法。我们从理论上证明,对于大步长,我们的新方法比NAG更稳定。矩阵补全和手写数字识别实验表明,该方法具有较好的稳定性。此外,在实验中,更好的稳定性导致更高的计算速度。 摘要:Nesterov's accelerated gradient method (NAG) is widely used in problems with machine learning background including deep learning, and is corresponding to a continuous-time differential equation. From this connection, the property of the differential equation and its numerical approximation can be investigated to improve the accelerated gradient method. In this work we present a new improvement of NAG in terms of stability inspired by numerical analysis. We give the precise order of NAG as a numerical approximation of its continuous-time limit and then present a new method with higher order. We show theoretically that our new method is more stable than NAG for large step size. Experiments of matrix completion and handwriting digit recognition demonstrate that the stability of our new method is better. Furthermore, better stability leads to higher computational speed in experiments.
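文中讨论的NAG对应如下标准迭代,可示意为(经典算法;动量系数取常见的$k/(k+3)$形式之一,示例目标为假设的二次函数):

```python
import numpy as np

def nag(grad, x0, step, n_iter=100):
    """标准Nesterov加速梯度法:
    x_{k+1} = y_k - step * grad(y_k),
    y_{k+1} = x_{k+1} + k/(k+3) * (x_{k+1} - x_k)。"""
    x_prev = x0.copy()
    y = x0.copy()
    for k in range(n_iter):
        x = y - step * grad(y)
        y = x + k / (k + 3) * (x - x_prev)
        x_prev = x
    return x

A = np.diag([1.0, 10.0])                 # 条件数为10的二次目标(假设)
grad = lambda x: A @ x
print(nag(grad, np.array([5.0, 5.0]), step=0.09))  # 收敛到原点;过大的 step 会失稳
```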

【17】 End-to-end Alexa Device Arbitration 标题:端到端Alexa设备仲裁 链接:https://arxiv.org/abs/2112.04914

作者:Jarred Barber,Yifeng Fan,Tao Zhang 机构:Amazon Alexa AI, USA, University of Illinois, Urbana-Champaign, USA 备注:Submitted to ICASSP 2022 摘要:我们介绍了一种不同的说话人定位问题,我们称之为设备仲裁。在设备仲裁问题中,用户发出一个由多个分布式麦克风阵列(智能家居设备)检测到的关键字,我们希望确定哪个设备离用户最近。我们提出了一种端到端的机器学习系统,而不是解决完全的本地化问题。该系统学习在每个设备上独立计算的特征嵌入。然后,将来自每个设备的嵌入聚合在一起,以产生最终仲裁决定。我们使用一个大型房间模拟来生成训练和评估数据,并将我们的系统与信号处理基线进行比较。 摘要:We introduce a variant of the speaker localization problem, which we call device arbitration. In the device arbitration problem, a user utters a keyword that is detected by multiple distributed microphone arrays (smart home devices), and we want to determine which device was closest to the user. Rather than solving the full localization problem, we propose an end-to-end machine learning system. This system learns a feature embedding that is computed independently on each device. The embeddings from each device are then aggregated together to produce the final arbitration decision. We use a large-scale room simulation to generate training and evaluation data, and compare our system against a signal processing baseline.

【18】 Evaluating saliency methods on artificial data with different background types 标题:不同背景类型人工数据的显著性评价方法 链接:https://arxiv.org/abs/2112.04882

作者:Céline Budding,Fabian Eitel,Kerstin Ritter,Stefan Haufe 机构:Department of Industrial Engineering & Innovation Sciences., Eindhoven University of Technology, Eindhoven, The Netherlands, Charité – Universitätsmedizin Berlin, Berlin, Germany;, Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany 备注:6 pages, 2 figures. Presented at Medical Imaging meets NeurIPS 2021 (poster presentation) 摘要:在过去的几年里,许多"可解释人工智能"(xAI)方法已经被开发出来,但这些方法并不总是得到客观评估。为了评估由各种显著性方法生成的热图的质量,我们开发了一个框架,用合成病变和已知的真值图生成人工数据。利用这个框架,我们评估了两个不同背景的数据集(Perlin噪声和2D脑MRI切片),发现热图在不同显著性方法和背景之间差异很大。我们强烈鼓励在将显著性图和xAI方法应用于临床或其他安全关键环境之前,先使用该框架对其进行进一步评估。 摘要:Over the last years, many 'explainable artificial intelligence' (xAI) approaches have been developed, but these have not always been objectively evaluated. To evaluate the quality of heatmaps generated by various saliency methods, we developed a framework to generate artificial data with synthetic lesions and a known ground truth map. Using this framework, we evaluated two data sets with different backgrounds, Perlin noise and 2D brain MRI slices, and found that the heatmaps vary strongly between saliency methods and backgrounds. We strongly encourage further evaluation of saliency maps and xAI methods using this framework before applying these in clinical or other safety-critical settings.

【19】 Evaluation of survival distribution predictions with discrimination measures 标题:用判别测度评价生存分布预测 链接:https://arxiv.org/abs/2112.04828

作者:Raphael Sonabend,Andreas Bender,Sebastian Vollmer 机构:MRC Centre for Global Infectious Disease Analysis, Jameel Institute, Imperial College London, School of Public Health, London, UK, Department of Computer Science, Technische Universität Kaiserslautern, Gottlieb-Daimler-Straße, Kaiserslautern, Germany 摘要:在本文中,我们考虑如何用判别度量(discrimination measures)评估生存分布预测。这是一个非平凡的问题:判别度量是生存分析中最常用的度量,然而从分布预测导出风险预测并没有明确的方法。我们调研了文献和软件中提出的方法,并考虑其各自的优缺点。虽然分布预测经常用判别度量来评估,但我们发现其具体做法很少在文献中描述,并且常常导致不公平的比较。我们发现,将分布归约为风险的最稳健方法是对预测的累积风险(cumulative hazard)求和。我们建议机器学习生存分析软件实现分布预测与风险预测之间清晰的转换,以便模型评估更透明、更易使用。 摘要:In this paper we consider how to evaluate survival distribution predictions with measures of discrimination. This is a non-trivial problem as discrimination measures are the most commonly used in survival analysis and yet there is no clear method to derive a risk prediction from a distribution prediction. We survey methods proposed in literature and software and consider their respective advantages and disadvantages. Whilst distributions are frequently evaluated by discrimination measures, we find that the method for doing so is rarely described in the literature and often leads to unfair comparisons. We find that the most robust method of reducing a distribution to a risk is to sum over the predicted cumulative hazard. We recommend that machine learning survival analysis software implements clear transformations between distribution and risk predictions in order to allow more transparent and accessible model evaluation.
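文中推荐的做法——对预测的累积风险求和得到风险评分——可示意如下(利用$H(t)=-\log S(t)$;生存曲线为假设数据):

```python
import numpy as np

def risk_from_survival(S):
    """S: (n_samples, n_times),离散时间网格上的预测生存函数 S_i(t)。
    风险评分 = 对时间网格求和的预测累积风险 H(t) = -log S(t)。"""
    S = np.clip(S, 1e-12, 1.0)
    return (-np.log(S)).sum(axis=1)

# 两个个体的假设生存曲线:个体0的生存概率下降更快 -> 风险评分更高
times = np.linspace(0.5, 5.0, 10)
S = np.vstack([np.exp(-0.8 * times),    # 高风险个体
               np.exp(-0.2 * times)])   # 低风险个体
print(risk_from_survival(S))            # 第一个值更大
```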

【20】 Regularized Modal Regression on Markov-dependent Observations: A Theoretical Assessment 标题:马尔可夫相依观测值的正则化模式回归:一个理论评估 链接:https://arxiv.org/abs/2112.04779

作者:Tielang Gong,Yuxin Dong,Hong Chen,Bo Dong,Wei Feng,Chen Li 机构:School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an , China, Key Laboratory of Intelligent Networks and Network Security, Ministry of Education, Xi’an , China, College of Science, Huazhong Agriculture University, Wuhan , China 摘要:模态回归是一种广泛使用的回归协议,由于其对异常值和重尾噪声的鲁棒性,在统计和机器学习领域得到了广泛的研究。理解模态回归的理论行为是学习理论的基础。尽管在描述其统计特性方面取得了重大进展,但大多数结果都是基于样本是独立和相同分布(i.i.d.)的假设,这对于实际应用来说限制太大。本文研究了一种重要的依赖结构——马尔可夫依赖结构中正则模态回归(RMR)的统计性质。具体地说,我们在中等条件下建立了RMR估计的上界,并给出了一个明确的学习率。我们的结果表明,马尔可夫依赖性对泛化误差的影响方式是,样本量将根据潜在马尔可夫链的谱间隙通过乘法因子进行折扣。这一结果为描述稳健回归的理论基础提供了新的思路。 摘要:Modal regression, a widely used regression protocol, has been extensively investigated in statistical and machine learning communities due to its robustness to outliers and heavy-tailed noises. Understanding modal regression's theoretical behavior can be fundamental in learning theory. Despite significant progress in characterizing its statistical property, the majority of the results are based on the assumption that samples are independent and identical distributed (i.i.d.), which is too restrictive for real-world applications. This paper concerns the statistical property of regularized modal regression (RMR) within an important dependence structure - Markov dependent. Specifically, we establish the upper bound for RMR estimator under moderate conditions and give an explicit learning rate. Our results show that the Markov dependence impacts on the generalization error in the way that sample size would be discounted by a multiplicative factor depending on the spectral gap of underlying Markov chain. This result shed a new light on characterizing the theoretical underpinning for robust regression.
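作为背景,模态回归通常以残差的核密度作为拟合准则(最大化$\frac{1}{n}\sum_i K_h(y_i - f(x_i))$),下面给出线性模型加高斯核的最小示意;其中岭正则项只是"正则化"的一个假设性占位,并非论文RMR在马尔可夫相依下的设定:

```python
import numpy as np

def modal_objective(w, X, y, h=0.5, lam=0.1):
    """模态回归目标:残差的高斯核密度均值,减去岭正则项(正则化形式为假设)。"""
    r = y - X @ w
    kde = np.exp(-r ** 2 / (2 * h ** 2)).mean()
    return kde - lam * (w ** 2).sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
w_true = np.array([1.5, -2.0])
y = X @ w_true + rng.standard_t(df=2, size=200) * 0.3   # 重尾噪声

# 仅用粗糙的随机搜索演示目标函数的行为(实际可用半二次优化等算法)
cands = [w_true + rng.normal(scale=0.5, size=2) for _ in range(500)]
best = max(cands, key=lambda w: modal_objective(w, X, y))
print("best w:", best.round(2))
```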

【21】 A Note on Comparison of F-measures 标题:关于F-测度比较的一个注记 链接:https://arxiv.org/abs/2112.04677

作者:Wei Ju,Wenxin Jiang 机构: Jiang is with the Department of Statistics 摘要:我们对TKDE最近的一篇论文"不平衡数据集分类算法性能评估的F-测度线性近似"进行了评论,并就两种预测规则之间的F-测度比较做出了两点改进。 摘要:We comment on a recent TKDE paper "Linear Approximation of F-measure for the Performance Evaluation of Classification Algorithms on Imbalanced Data Sets", and make two improvements related to comparison of F-measures for two prediction rules.
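作为背景,被比较的F-测度的标准定义是精确率$P$与召回率$R$的(加权)调和平均,可示意如下(通用定义,并非该评论的两点改进本身):

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R),其中 P=精确率,R=召回率。"""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * p * r / (b2 * p + r)

print(f_measure(tp=80, fp=20, fn=40))  # P=0.8, R=2/3 -> F1 约 0.727
```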

【22】 Building Quantum Field Theories Out of Neurons 标题:用神经元构建量子场理论 链接:https://arxiv.org/abs/2112.04527

作者:James Halverson 机构:The NSF AI Institute for Artificial Intelligence and Fundamental Interactions, Department of Physics, Northeastern University, Boston, MA 摘要:本文研究了一种场论方法,其中场由$N$个随机神经元构成。当神经元独立分布时,根据中心极限定理,高斯理论在无限-$N$极限下出现,而相互作用则来自有限-$N$效应或非独立分布的神经元。我们构造了具有可调两点函数的欧几里德不变神经元系综,从而得到一族欧几里德不变场论。某些高斯的欧几里德不变理论满足反射正性,这允许解析延拓为洛伦兹不变的量子场论。文中给出的例子在无限-$N$时产生对偶理论,但在有限-$N$时具有不同的对称性。经典场位形的景观由参数分布的局部极大值决定。预言来自场-神经元混合关联函数。在大-$N$时表现出近高斯性,这或许可以解释自然界场论的一个特征。 摘要:An approach to field theory is studied in which fields are comprised of $N$ constituent random neurons. Gaussian theories arise in the infinite-$N$ limit when neurons are independently distributed, via the Central Limit Theorem, while interactions arise due to finite-$N$ effects or non-independently distributed neurons. Euclidean-invariant ensembles of neurons are engineered, with tunable two-point function, yielding families of Euclidean-invariant field theories. Some Gaussian, Euclidean invariant theories are reflection positive, which allows for analytic continuation to a Lorentz-invariant quantum field theory. Examples are presented that yield dual theories at infinite-$N$, but have different symmetries at finite-$N$. Landscapes of classical field configurations are determined by local maxima of parameter distributions. Predictions arise from mixed field-neuron correlators. Near-Gaussianity is exhibited at large-$N$, potentially explaining a feature of field theories in Nature.
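下面的小实验(仅为示意,与文中构造的具体系综无关)演示该构造的基本点:把场取为$N$个随机神经元的归一化和$f(x)=\frac{1}{\sqrt N}\sum_i a_i\,\sigma(w_i x+b_i)$,数值估计两点函数,并用Wick关系检验大$N$下的近高斯性:

```python
import numpy as np

def sample_field(xs, N, rng):
    """f(x) = (1/sqrt(N)) * sum_i a_i * tanh(w_i x + b_i),参数独立采样(假设的系综)。"""
    a = rng.normal(size=N)
    w = rng.normal(size=N)
    b = rng.normal(size=N)
    return (a[None, :] * np.tanh(np.outer(xs, w) + b[None, :])).sum(axis=1) / np.sqrt(N)

rng = np.random.default_rng(0)
xs = np.array([0.0, 1.0])
samples = np.array([sample_field(xs, N=1000, rng=rng) for _ in range(2000)])

# 两点函数 G(x1,x2) = E[f(x1) f(x2)];再用四点函数检验Wick定理(高斯性的标志)
G = samples.T @ samples / len(samples)
four = np.mean(samples[:, 0] ** 2 * samples[:, 1] ** 2)
wick = G[0, 0] * G[1, 1] + 2 * G[0, 1] ** 2   # 高斯场应满足 E[f1^2 f2^2] ≈ 该式
print("G =", G.round(3))
print("E[f1^2 f2^2] =", round(four, 3), " Wick prediction =", round(wick, 3))
```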

机器翻译,仅供参考
