Machine Learning Academic Digest [7.27]

2021-07-28 14:53:48


cs.LG: 113 papers today

Graph (graph learning | graph neural networks | graph optimization, etc.) (9 papers)

【1】 Embedding Signals on Knowledge Graphs with Unbalanced Diffusion Earth Mover's Distance

Authors: Alexander Tong, Guillaume Huguet, Dennis Shung, Amine Natik, Manik Kuchroo, Guillaume Lajoie, Guy Wolf, Smita Krishnaswamy Affiliations: Dept. of Comp. Sci., Yale University, New Haven, CT, USA; Dept. of Math. and Stat., Univ. de Montréal and Mila, Montreal, QC, Canada; Dept. of Medicine; Dept. of Neuroscience Comments: 17 pages, 7 figures, 2 tables Link: https://arxiv.org/abs/2107.12334 Abstract: In modern relational machine learning it is common to encounter large graphs that arise via interactions or similarities between observations in many domains. Further, in many cases the target entities for analysis are actually signals on such graphs. We propose to compare and organize such datasets of graph signals by using an earth mover's distance (EMD) with a geodesic cost over the underlying graph. Typically, EMD is computed by optimizing over the cost of transporting one probability distribution to another over an underlying metric space. However, this is inefficient when computing the EMD between many signals. Here, we propose an unbalanced graph earth mover's distance that efficiently embeds the unbalanced EMD on an underlying graph into an $L^1$ space, whose metric we call unbalanced diffusion earth mover's distance (UDEMD). This leads us to an efficient nearest neighbors kernel over many signals defined on a large graph. Next, we show how this gives distances between graph signals that are robust to noise. Finally, we apply this to organizing patients based on clinical notes who are modelled as signals on the SNOMED-CT medical knowledge graph, embedding lymphoblast cells modeled as signals on a gene graph, and organizing genes modeled as signals over a large peripheral blood mononuclear (PBMC) cell graph. In each case, we show that UDEMD-based embeddings find accurate distances that are highly efficient compared to other methods.
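As a point of reference for the idea of embedding an EMD into $L^1$, here is a minimal sketch for the simplest possible graph: a path with unit-length edges. In that special case the balanced EMD with geodesic cost equals the $L^1$ distance between cumulative sums, so the cumulative sum itself acts as the embedding. This is not the UDEMD construction from the paper, which targets general graphs and unbalanced signals; the function names are hypothetical.

```python
import numpy as np


def embed_path_signal(signal):
    """Map a probability signal on a path graph to its cumulative sums."""
    return np.cumsum(np.asarray(signal, dtype=float))


def path_graph_emd(p, q):
    """EMD between two probability signals on a unit-edge path graph.

    Assumption: both signals are non-negative and sum to 1 (balanced case).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("signals must live on the same path graph")
    if not (np.isclose(p.sum(), 1.0) and np.isclose(q.sum(), 1.0)):
        raise ValueError("signals must each sum to 1")
    # The EMD is the L1 distance between the two embedded signals.
    return float(np.abs(embed_path_signal(p) - embed_path_signal(q)).sum())


# Moving all mass from node 0 to node 3 crosses three unit edges.
far_apart = path_graph_emd([1, 0, 0, 0], [0, 0, 0, 1])
# Moving half the mass from node 0 to node 2 costs 0.5 * 2.
half_shift = path_graph_emd([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```

Once every signal is embedded, nearest-neighbor queries over many signals need only $L^1$ distances between vectors rather than one optimal transport solve per pair.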

【2】 HW2VEC: A Graph Learning Tool for Automating Hardware Security

Authors: Shih-Yuan Yu, Rozhin Yasaei, Qingrong Zhou, Tommy Nguyen, Mohammad Abdullah Al Faruque Affiliations: Department of Electrical Engineering and Computer Science, University of California, Irvine, California, USA Link: https://arxiv.org/abs/2107.12328 Abstract: The time-to-market pressure and continuous growing complexity of hardware designs have promoted the globalization of the Integrated Circuit (IC) supply chain. However, such globalization also poses various security threats in each phase of the IC supply chain. Although the advancements of Machine Learning (ML) have pushed the frontier of hardware security, most conventional ML-based methods can only achieve the desired performance by manually finding a robust feature representation for circuits that are non-Euclidean data. As a result, modeling these circuits using graph learning to improve design flows has attracted research attention in the Electronic Design Automation (EDA) field. However, due to the lack of supporting tools, only a few existing works apply graph learning to resolve hardware security issues. To attract more attention, we propose HW2VEC, an open-source graph learning tool that lowers the threshold for newcomers to research hardware security applications with graphs. HW2VEC provides an automated pipeline for extracting a graph representation from a hardware design in various abstraction levels (register transfer level or gate-level netlist). Besides, HW2VEC users can automatically transform the non-Euclidean hardware designs into Euclidean graph embeddings for solving their problems. In this paper, we demonstrate that HW2VEC can achieve state-of-the-art performance on two hardware security-related tasks: Hardware Trojan Detection and Intellectual Property Piracy Detection. We provide the time profiling results for the graph extraction and the learning pipelines in HW2VEC.

【3】 Local2Global: Scaling global representation learning on graphs via local training

Authors: Lucas G. S. Jeub, Giovanni Colavizza, Xiaowen Dong, Marya Bazzi, Mihai Cucuringu Affiliations: The Alan Turing Institute, University of Amsterdam, University of Oxford, University of Warwick Comments: 5 pages, 1 figure, to appear at DLG-KDD '21 Link: https://arxiv.org/abs/2107.12224 Abstract: We propose a decentralised "local2global" approach to graph representation learning, that one can a-priori use to scale any embedding technique. Our local2global approach proceeds by first dividing the input graph into overlapping subgraphs (or "patches") and training local representations for each patch independently. In a second step, we combine the local representations into a globally consistent representation by estimating the set of rigid motions that best align the local representations using information from the patch overlaps, via group synchronization. A key distinguishing feature of local2global relative to existing work is that patches are trained independently without the need for the often costly parameter synchronisation during distributed training. This allows local2global to scale to large-scale industrial applications, where the input graph may not even fit into memory and may be stored in a distributed manner. Preliminary results on medium-scale data sets (up to $\sim$7K nodes and $\sim$200K edges) are promising, with a graph reconstruction performance for local2global that is comparable to that of globally trained embeddings. A thorough evaluation of local2global on large scale data and applications to downstream tasks, such as node classification and link prediction, constitutes ongoing work.
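The alignment step can be illustrated for a single pair of patches. The sketch below assumes the rigid motion is an orthogonal transformation plus a translation and estimates it from the overlap nodes with an orthogonal Procrustes fit. The paper aligns many patches jointly via group synchronization, which this two-patch example does not attempt, and all names and the toy data are illustrative.

```python
import numpy as np


def fit_rigid_motion(source, target):
    """Return (rotation, translation) minimizing
    ||source @ rotation + translation - target||_F over orthogonal rotations."""
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    source_mean = source.mean(axis=0)
    target_mean = target.mean(axis=0)
    # The SVD of the cross-covariance yields the optimal orthogonal matrix.
    u, _, vt = np.linalg.svd((source - source_mean).T @ (target - target_mean))
    rotation = u @ vt
    translation = target_mean - source_mean @ rotation
    return rotation, translation


# Toy data: 8 nodes embedded in 2-D. Patch A holds nodes 0-5, patch B nodes 3-7.
rng = np.random.default_rng(0)
global_points = rng.normal(size=(8, 2))
patch_a = global_points[:6]

# Patch B is trained independently, so it lives in its own rotated, shifted frame.
random_orthogonal, _ = np.linalg.qr(rng.normal(size=(2, 2)))
patch_b = global_points[3:] @ random_orthogonal + np.array([5.0, -2.0])

# Nodes 3, 4, 5 are shared: rows 3-5 of patch A and rows 0-2 of patch B.
rotation, translation = fit_rigid_motion(patch_b[:3], patch_a[3:6])
aligned_b = patch_b @ rotation + translation
```

Because only the overlap rows enter the fit, each patch can be embedded on a separate machine and only the overlap coordinates need to be exchanged for alignment.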

【4】 Provably Accelerated Decentralized Gradient Method Over Unbalanced Directed Graphs

Authors: Zhuoqing Song, Lei Shi, Shi Pu, Ming Yan Affiliations: Shanghai Center for Mathematical Sciences, Fudan University, Shanghai, China; School of Mathematical Sciences, Shanghai Key Laboratory for Contemporary Applied Mathematics, Fudan University, Shanghai, China Link: https://arxiv.org/abs/2107.12065 Abstract: In this work, we consider the decentralized optimization problem in which a network of $n$ agents, each possessing a smooth and convex objective function, wish to collaboratively minimize the average of all the objective functions through peer-to-peer communication in a directed graph. To solve the problem, we propose two accelerated Push-DIGing methods termed APD and APD-SC for minimizing non-strongly convex objective functions and strongly convex ones, respectively. We show that APD and APD-SC respectively converge at the rates $O\left(\frac{1}{k^2}\right)$ and $O\left(\left(1 - C\sqrt{\frac{\mu}{L}}\right)^k\right)$ up to constant factors depending only on the mixing matrix. To the best of our knowledge, APD and APD-SC are the first decentralized methods to achieve provable acceleration over unbalanced directed graphs. Numerical experiments demonstrate the effectiveness of both methods.
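To read the two rates as iteration counts, the short derivation below may help. It assumes a target accuracy $\epsilon \in (0, 1)$, a generic constant $c > 0$ hidden in the $O(\cdot)$ of the sublinear rate, and $0 < C\sqrt{\mu/L} \le 1$; the symbols $\epsilon$ and $c$ are introduced here for illustration and do not come from the abstract.

```latex
% Sublinear rate (APD): the error bound c / k^2 falls below \epsilon once
\frac{c}{k^2} \le \epsilon
\quad\Longleftrightarrow\quad
k \ge \sqrt{\frac{c}{\epsilon}} .

% Linear rate (APD-SC): using 1 - x \le e^{-x},
\left(1 - C\sqrt{\frac{\mu}{L}}\right)^{k}
\le \exp\!\left(-k\,C\sqrt{\frac{\mu}{L}}\right)
\le \epsilon
\quad\text{whenever}\quad
k \ge \frac{1}{C}\sqrt{\frac{L}{\mu}}\,\log\frac{1}{\epsilon} .
```

The iteration count in the strongly convex case therefore scales with $\sqrt{L/\mu}$, the square root of the condition number, which is the usual signature of an accelerated method.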

【5】 GCExplainer: Human-in-the-Loop Concept-based Explanations for Graph Neural Networks

Authors: Lucie Charlotte Magister, Dmitry Kazhdan, Vikash Singh, Pietro Liò Affiliations: Department of Computer Science and Technology, University of Cambridge Comments: Accepted at the 3rd ICML Workshop on Human in the Loop Learning, 2021 Link: https://arxiv.org/abs/2107.11889 Abstract: While graph neural networks (GNNs) have been shown to perform well on graph-based data from a variety of fields, they suffer from a lack of transparency and accountability, which hinders trust and consequently the deployment of such models in high-stake and safety-critical scenarios. Even though recent research has investigated methods for explaining GNNs, these methods are limited to single-instance explanations, also known as local explanations. Motivated by the aim of providing global explanations, we adapt the well-known Automated Concept-based Explanation approach (Ghorbani et al., 2019) to GNN node and graph classification, and propose GCExplainer. GCExplainer is an unsupervised approach for post-hoc discovery and extraction of global concept-based explanations for GNNs, which puts the human in the loop. We demonstrate the success of our technique on five node classification datasets and two graph classification datasets, showing that we are able to discover and extract high-quality concept representations by putting the human in the loop. We achieve a maximum completeness score of 1 and an average completeness score of 0.753 across the datasets. Finally, we show that the concept-based explanations provide an improved insight into the datasets and GNN models compared to the state-of-the-art explanations produced by GNNExplainer (Ying et al., 2019).

【6】 ROD: Reception-aware Online Distillation for Sparse Graphs

Authors: Wentao Zhang, Yuezihan Jiang, Yang Li, Zeang Sheng, Yu Shen, Xupeng Miao, Liang Wang, Zhi Yang, Bin Cui Affiliations: School of EECS & Key Laboratory of High Confidence Software Technologies, Peking University; Center for Data Science, Peking University & National Engineering Laboratory for Big Data Analysis and Applications; Alibaba Group Link: https://arxiv.org/abs/2107.11789 Abstract: Graph neural networks (GNNs) have been widely used in many graph-based tasks such as node classification, link prediction, and node clustering. However, GNNs gain their performance benefits mainly from performing the feature propagation and smoothing across the edges of the graph, thus requiring sufficient connectivity and label information for effective propagation. Unfortunately, many real-world networks are sparse in terms of both edges and labels, leading to sub-optimal performance of GNNs. Recent interest in this sparse problem has focused on the self-training approach, which expands supervised signals with pseudo labels. Nevertheless, the self-training approach inherently cannot realize the full potential of refining the learning performance on sparse graphs due to the unsatisfactory quality and quantity of pseudo labels. In this paper, we propose ROD, a novel reception-aware online knowledge distillation approach for sparse graph learning. We design three supervision signals for ROD: multi-scale reception-aware graph knowledge, task-based supervision, and rich distilled knowledge, allowing online knowledge transfer in a peer-teaching manner. To extract knowledge concealed in the multi-scale reception fields, ROD explicitly requires individual student models to preserve different levels of locality information. For a given task, each student would predict based on its reception-scale knowledge, while simultaneously a strong teacher is established on-the-fly by combining multi-scale knowledge. Our approach has been extensively evaluated on 9 datasets and a variety of graph-based tasks, including node classification, link prediction, and node clustering. The results demonstrate that ROD achieves state-of-the-art performance and is more robust to graph sparsity.

【7】 Graph Convolutional Network with Generalized Factorized Bilinear Aggregation

Authors: Hao Zhu, Piotr Koniusz Affiliations: Australian National University and Data61/CSIRO, Canberra, Australia Link: https://arxiv.org/abs/2107.11666 Abstract: Although Graph Convolutional Networks (GCNs) have demonstrated their power in various applications, the graph convolutional layers, as the most important component of GCN, are still using linear transformations and a simple pooling step. In this paper, we propose a novel generalization of Factorized Bilinear (FB) layer to model the feature interactions in GCNs. FB performs two matrix-vector multiplications, that is, the weight matrix is multiplied with the outer product of the vector of hidden features from both sides. However, the FB layer suffers from the quadratic number of coefficients, overfitting and the spurious correlations due to correlations between channels of hidden representations that violate the i.i.d. assumption. Thus, we propose a compact FB layer by defining a family of summarizing operators applied over the quadratic term. We analyze proposed pooling operators and motivate their use. Our experimental results on multiple datasets demonstrate that the GFB-GCN is competitive with other methods for text classification.
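The quadratic term mentioned in the abstract can be made concrete with a generic bilinear form. The sketch below assumes a single output channel, a weight matrix `w` of shape `(d, d)`, and a low-rank factor `f` of shape `(k, d)` with `w = f.T @ f`. These shapes and names are illustrative and are not taken from the paper, and the code is not the GFB layer itself.

```python
import numpy as np


def bilinear_two_sided(x, w):
    """Quadratic term x^T W x computed as two matrix-vector products."""
    return float(x @ (w @ x))


def bilinear_outer_product(x, w):
    """Same value, written as the inner product of W with the outer product x x^T."""
    return float(np.sum(w * np.outer(x, x)))


def bilinear_factorized(x, f):
    """Same value when W = F^T F: only k * d coefficients instead of d * d."""
    projected = f @ x
    return float(projected @ projected)


rng = np.random.default_rng(0)
d, k = 5, 2
x = rng.normal(size=d)
f = rng.normal(size=(k, d))
w = f.T @ f

full_value = bilinear_two_sided(x, w)
outer_value = bilinear_outer_product(x, w)
factorized_value = bilinear_factorized(x, f)
```

The unfactorized weight matrix has $d^2$ entries per output channel, which is the quadratic coefficient count the abstract refers to; the factorized form needs $k \cdot d$.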

【8】 Combining Graph Neural Networks with Expert Knowledge for Smart Contract Vulnerability Detection

Authors: Zhenguang Liu, Peng Qian, Xiaoyang Wang, Yuan Zhuang, Lin Qiu, Xun Wang Affiliations: Yuan Zhuang is with National University of Singapore; Xiaoyang Wang is with School of Computer and Information Engineering, Zhejiang Gongshang University; Lin Qiu is with Southern University of Science and Technology Comments: This paper has been accepted by TKDE 2021 Link: https://arxiv.org/abs/2107.11598 Abstract: Smart contract vulnerability detection has drawn extensive attention in recent years due to the substantial losses caused by hacker attacks. Existing efforts for contract security analysis heavily rely on rigid rules defined by experts, which are labor-intensive and non-scalable. More importantly, expert-defined rules tend to be error-prone and suffer the inherent risk of being cheated by crafty attackers. Recent research focuses on the symbolic execution and formal analysis of smart contracts for vulnerability detection, but has yet to achieve a precise and scalable solution. Although several methods have been proposed to detect vulnerabilities in smart contracts, there is still a lack of effort that considers combining expert-defined security patterns with deep neural networks. In this paper, we explore using graph neural networks and expert knowledge for smart contract vulnerability detection. Specifically, we cast the rich control- and data-flow semantics of the source code into a contract graph. To highlight the critical nodes in the graph, we further design a node elimination phase to normalize the graph. Then, we propose a novel temporal message propagation network to extract the graph feature from the normalized graph, and combine the graph feature with designed expert patterns to yield a final detection system. Extensive experiments are conducted on all the smart contracts that have source code in Ethereum and VNT Chain platforms. Empirical results show significant accuracy improvements over the state-of-the-art methods on three types of vulnerabilities, where the detection accuracy of our method reaches 89.15%, 89.02%, and 83.21% for reentrancy, timestamp dependence, and infinite loop vulnerabilities, respectively.

【9】 Graph Representation Learning on Tissue-Specific Multi-Omics

Authors: Amine Amor, Pietro Lio', Vikash Singh, Ramon Viñas Torné, Helena Andres Terre Affiliations: Department of Computer Science and Technology, University of Cambridge Comments: This paper was accepted at the 2021 ICML Workshop on Computational Biology Link: https://arxiv.org/abs/2107.11856 Abstract: Combining different modalities of data from human tissues has been critical in advancing biomedical research and personalised medical care. In this study, we leverage a graph embedding model (i.e., VGAE) to perform link prediction on tissue-specific Gene-Gene Interaction (GGI) networks. Through ablation experiments, we prove that the combination of multiple biological modalities (i.e., multi-omics) leads to powerful embeddings and better link prediction performances. Our evaluation shows that the integration of gene methylation profiles and RNA-sequencing data significantly improves the link prediction performance. Overall, the combination of RNA-sequencing and gene methylation data leads to a link prediction accuracy of 71% on GGI networks. By harnessing graph representation learning on multi-omics data, our work brings novel insights to the current literature on multi-omics integration in bioinformatics.

Transformer (2 papers)

【1】 Contextual Transformer Networks for Visual Recognition

Authors: Yehao Li, Ting Yao, Yingwei Pan, Tao Mei Affiliations: JD AI Research, Beijing, China Comments: Rank 1 in open-set image classification task of Open World Vision Challenge @ CVPR 2021; the source code and models are publicly available at https://github.com/JDAI-CV/CoTNet Link: https://arxiv.org/abs/2107.12292 Abstract: Transformer with self-attention has led to the revolutionizing of natural language processing field, and recently inspires the emergence of Transformer-style architecture design with competitive results in numerous computer vision tasks. Nevertheless, most of existing designs directly employ self-attention over a 2D feature map to obtain the attention matrix based on pairs of isolated queries and keys at each spatial location, but leave the rich contexts among neighbor keys under-exploited. In this work, we design a novel Transformer-style module, i.e., Contextual Transformer (CoT) block, for visual recognition. Such design fully capitalizes on the contextual information among input keys to guide the learning of dynamic attention matrix and thus strengthens the capacity of visual representation. Technically, CoT block first contextually encodes input keys via a $3\times3$ convolution, leading to a static contextual representation of inputs. We further concatenate the encoded keys with input queries to learn the dynamic multi-head attention matrix through two consecutive $1\times1$ convolutions. The learnt attention matrix is multiplied by input values to achieve the dynamic contextual representation of inputs. The fusion of the static and dynamic contextual representations is finally taken as the output. Our CoT block is appealing in the view that it can readily replace each $3\times3$ convolution in ResNet architectures, yielding a Transformer-style backbone named as Contextual Transformer Networks (CoTNet). Through extensive experiments over a wide range of applications (e.g., image recognition, object detection and instance segmentation), we validate the superiority of CoTNet as a stronger backbone. Source code is available at https://github.com/JDAI-CV/CoTNet.

【2】 H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences

Authors: Zhenhai Zhu, Radu Soricut Affiliations: Google Research Comments: ACL2021 long paper oral presentation Link: https://arxiv.org/abs/2107.11906 Abstract: We describe an efficient hierarchical method to compute attention in the Transformer architecture. The proposed attention mechanism exploits a matrix structure similar to the Hierarchical Matrix (H-Matrix) developed by the numerical analysis community, and has linear run time and memory complexity. We perform extensive experiments to show that the inductive bias embodied by our hierarchical attention is effective in capturing the hierarchical structure in the sequences typical for natural language and vision tasks. Our method is superior to alternative sub-quadratic proposals by over 6 points on average on the Long Range Arena benchmark. It also sets a new SOTA test perplexity on One-Billion Word dataset with 5x fewer model parameters than that of the previous-best Transformer-based models.

GAN | adversarial | attacks | generation (6 papers)

【1】 Beyond Voice Identity Conversion: Manipulating Voice Attributes by Adversarial Learning of Structured Disentangled Representations

Authors: Laurent Benaroya, Nicolas Obin, Axel Roebel Affiliations: STMS Lab., Ircam, CNRS, Sorbonne Université, place Igor Stravinsky, Paris, France, Rafael Ferro Link: https://arxiv.org/abs/2107.12346 Abstract: Voice conversion (VC) consists of digitally altering the voice of an individual to manipulate part of its content, primarily its identity, while maintaining the rest unchanged. Research in neural VC has accomplished considerable breakthroughs with the capacity to falsify a voice identity using a small amount of data with a highly realistic rendering. This paper goes beyond voice identity and presents a neural architecture that allows the manipulation of voice attributes (e.g., gender and age). Leveraging the latest advances on adversarial learning of structured speech representation, a novel structured neural network is proposed in which multiple auto-encoders are used to encode speech as a set of idealistically independent linguistic and extra-linguistic representations, which are learned adversarially and can be manipulated during VC. Moreover, the proposed architecture is time-synchronized so that the original voice timing is preserved during conversion, which allows lip-sync applications. Applied to voice gender conversion on the real-world VCTK dataset, our proposed architecture can successfully learn gender-independent representation and convert the voice gender with a very high efficiency and naturalness.

【2】 Black-Box Diagnosis and Calibration on GAN Intra-Mode Collapse: A Pilot Study

Authors: Zhenyu Wu, Zhaowen Wang, Ye Yuan, Jianming Zhang, Zhangyang Wang, Hailin Jin Affiliations: Texas A&M University, The University of Texas at Austin Comments: This paper has been accepted by Transactions on Multimedia Computing Communications and Applications (TOMM) for publication in 2021 Link: https://arxiv.org/abs/2107.12202 Abstract: Generative adversarial networks (GANs) nowadays are capable of producing images of incredible realism. One concern raised is whether the state-of-the-art GAN's learned distribution still suffers from mode collapse, and what to do if so. Existing diversity tests of samples from GANs are usually conducted qualitatively on a small scale, and/or depend on access to the original training data as well as the trained model parameters. This paper explores diagnosing GAN intra-mode collapse and calibrating it in a novel black-box setting: no access to training data, nor the trained model parameters, is assumed. The new setting is practically demanded, yet rarely explored and significantly more challenging. As a first stab, we devise a set of statistical tools based on sampling, that can visualize, quantify, and rectify intra-mode collapse. We demonstrate the effectiveness of our proposed diagnosis and calibration techniques, via extensive simulations and experiments, on unconditional GAN image generation (e.g., face and vehicle). Our study reveals that the intra-mode collapse is still a prevailing problem in state-of-the-art GANs and the mode collapse is diagnosable and calibratable in black-box settings. Our codes are available at: https://github.com/VITA-Group/BlackBoxGANCollapse.

【3】 Membership Inference Attack and Defense for Wireless Signal Classifiers with Deep Learning

Authors: Yi Shi, Yalin E. Sagduyu Comments: arXiv admin note: substantial text overlap with arXiv:2006.14576 Link: https://arxiv.org/abs/2107.12173 Abstract: An over-the-air membership inference attack (MIA) is presented to leak private information from a wireless signal classifier. Machine learning (ML) provides powerful means to classify wireless signals, e.g., for PHY-layer authentication. As an adversarial machine learning attack, the MIA infers whether a signal of interest has been used in the training data of a target classifier. This private information incorporates waveform, channel, and device characteristics, and if leaked, can be exploited by an adversary to identify vulnerabilities of the underlying ML model (e.g., to infiltrate the PHY-layer authentication). One challenge for the over-the-air MIA is that the received signals and consequently the RF fingerprints at the adversary and the intended receiver differ due to the discrepancy in channel conditions. Therefore, the adversary first builds a surrogate classifier by observing the spectrum and then launches the black-box MIA on this classifier. The MIA results show that the adversary can reliably infer signals (and potentially the radio and channel information) used to build the target classifier. Therefore, a proactive defense is developed against the MIA by building a shadow MIA model and fooling the adversary. This defense can successfully reduce the MIA accuracy and prevent information leakage from the wireless signal classifier.

【4】 Adversarial training may be a double-edged sword

Authors: Ali Rahmati, Seyed-Mohsen Moosavi-Dezfooli, Huaiyu Dai Affiliations: North Carolina State University, Raleigh, NC; Institute for Machine Learning, ETH Zürich Comments: Presented as a RobustML workshop paper at ICLR 2021 Link: https://arxiv.org/abs/2107.11671 Abstract: Adversarial training has been shown as an effective approach to improve the robustness of image classifiers against white-box attacks. However, its effectiveness against black-box attacks is more nuanced. In this work, we demonstrate that some geometric consequences of adversarial training on the decision boundary of deep networks give an edge to certain types of black-box attacks. In particular, we define a metric called robustness gain to show that while adversarial training is an effective method to dramatically improve the robustness in white-box scenarios, it may not provide such a good robustness gain against the more realistic decision-based black-box attacks. Moreover, we show that even the minimal perturbation white-box attacks can converge faster against adversarially-trained neural networks compared to the regular ones.

【5】 Tail of Distribution GAN (TailGAN): Generative-Adversarial-Network-Based Boundary Formation

Authors: Nikolaos Dionelis Affiliations: The University of Edinburgh, Edinburgh, UK Link: https://arxiv.org/abs/2107.11658 Abstract: Generative Adversarial Networks (GAN) are a powerful methodology and can be used for unsupervised anomaly detection, where current techniques have limitations such as the accurate detection of anomalies near the tail of a distribution. GANs generally do not guarantee the existence of a probability density and are susceptible to mode collapse, while few GANs use likelihood to reduce mode collapse. In this paper, we create a GAN-based tail formation model for anomaly detection, the Tail of distribution GAN (TailGAN), to generate samples on the tail of the data distribution and detect anomalies near the support boundary. Using TailGAN, we leverage GANs for anomaly detection and use maximum entropy regularization. Using GANs that learn the probability of the underlying distribution has advantages in improving the anomaly detection methodology by allowing us to devise a generator for boundary samples, and use this model to characterize anomalies. TailGAN addresses supports with disjoint components and achieves competitive performance on images. We evaluate TailGAN for identifying Out-of-Distribution (OoD) data and its performance evaluated on MNIST, CIFAR-10, Baggage X-Ray, and OoD data shows competitiveness compared to methods from the literature.

【6】 Detecting Adversarial Examples Is (Nearly) As Hard As Classifying Them

Authors: Florian Tramèr
Affiliations: Stanford University
Comments: ICML 2021 Workshop on the Prospects and Perils of Adversarial Machine Learning
Link: https://arxiv.org/abs/2107.11630
Abstract: Making classifiers robust to adversarial examples is hard. Thus, many defenses tackle the seemingly easier task of detecting perturbed inputs. We show a barrier towards this goal. We prove a general hardness reduction between detection and classification of adversarial examples: given a robust detector for attacks at distance $\epsilon$ (in some metric), we can build a similarly robust (but inefficient) classifier for attacks at distance $\epsilon/2$. Our reduction is computationally inefficient, and thus cannot be used to build practical classifiers. Instead, it is a useful sanity check to test whether empirical detection results imply something much stronger than the authors presumably anticipated. To illustrate, we revisit 13 detector defenses. For 11/13 cases, we show that the claimed detection results would imply an inefficient classifier with robustness far beyond the state-of-the-art.
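The detection-to-classification reduction can be pictured on a toy problem. The sketch below is only an illustration under stated assumptions: the abstract does not spell out the construction, so the brute-force search over the half-radius ball, the integer input space, the `detector` function, and the value of `EPS` are all hypothetical choices made for this example.

```python
from typing import Callable, Optional

# Hypothetical toy setting: inputs are integers and distance is |a - b|.
# Clean points are assumed to sit at 0 (class 0) and 10 (class 1).
EPS = 4  # radius at which the toy detector is assumed to be robust


def detector(x: int) -> Optional[int]:
    """Toy detector-classifier: returns a label, or None to reject the input."""
    if x <= 1:
        return 0
    if x >= 9:
        return 1
    return None  # input is flagged as adversarial


def classifier_from_detector(
    x: int, detect: Callable[[int], Optional[int]], eps: int
) -> Optional[int]:
    """Search the eps/2 ball around x and return the label of any accepted point.

    The exhaustive search is what makes this construction inefficient
    in realistic, high-dimensional input spaces.
    """
    radius = eps // 2
    for candidate in range(x - radius, x + radius + 1):
        label = detect(candidate)
        if label is not None:
            return label
    return None  # no accepted point nearby; any fallback label could be used
```

In this toy setting the input 2 is rejected by the detector, yet the derived classifier still labels it as class 0, because the clean point 0 lies inside its half-radius search ball.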

Semi-/Weakly/Un-/Supervised Learning | Uncertainty | Active Learning (8 papers)

【1】 Improve Unsupervised Pretraining for Few-label Transfer

Authors: Suichan Li, Dongdong Chen, Yinpeng Chen, Lu Yuan, Lei Zhang, Qi Chu, Bin Liu, Nenghai Yu
Affiliations: University of Science and Technology of China; Microsoft Research
Comments: ICCV 2021. arXiv admin note: substantial text overlap with arXiv:2012.05899
Link: https://arxiv.org/abs/2107.12369
Abstract: Unsupervised pretraining has achieved great success and many recent works have shown unsupervised pretraining can achieve comparable or even slightly better transfer performance than supervised pretraining on downstream target datasets. But in this paper, we find this conclusion may not hold when the target dataset has very few labeled samples for finetuning, i.e., few-label transfer. We analyze the possible reason from the clustering perspective: 1) The clustering quality of target samples is of great importance to few-label transfer; 2) Though contrastive learning is essential to learn how to cluster, its clustering quality is still inferior to supervised pretraining due to lack of label supervision. Based on the analysis, we interestingly discover that only involving some unlabeled target domain into the unsupervised pretraining can improve the clustering quality, subsequently reducing the transfer performance gap with supervised pretraining. This finding also motivates us to propose a new progressive few-label transfer algorithm for real applications, which aims to maximize the transfer performance under a limited annotation budget. To support our analysis and proposed method, we conduct extensive experiments on nine different target datasets. Experimental results show our proposed method can significantly boost the few-label transfer performance of unsupervised pretraining.

【2】 Uncertainty-Aware Time-to-Event Prediction using Deep Kernel Accelerated Failure Time Models

Authors: Zhiliang Wu, Yinchong Yang, Peter A. Fasching, Volker Tresp
Affiliations: Ludwig Maximilians University Munich; Siemens AG, Technology, Munich; Department of Gynecology and Obstetrics, University Hospital Erlangen, Erlangen
Link: https://arxiv.org/abs/2107.12250
Abstract: Recurrent neural network based solutions are increasingly being used in the analysis of longitudinal Electronic Health Record data. However, most works focus on prediction accuracy and neglect prediction uncertainty. We propose Deep Kernel Accelerated Failure Time models for the time-to-event prediction task, enabling uncertainty-awareness of the prediction by a pipeline of a recurrent neural network and a sparse Gaussian Process. Furthermore, a deep metric learning based pre-training step is adapted to enhance the proposed model. Our model shows better point estimate performance than recurrent neural network based baselines in experiments on two real-world datasets. More importantly, the predictive variance from our model can be used to quantify the uncertainty estimates of the time-to-event prediction: our model delivers better performance when it is more confident in its prediction. Compared to related methods, such as Monte Carlo Dropout, our model offers better uncertainty estimates by leveraging an analytical solution and is more computationally efficient.

【3】 ReDAL: Region-based and Diversity-aware Active Learning for Point Cloud Semantic Segmentation

Authors: Tsung-Han Wu, Yueh-Cheng Liu, Yu-Kai Huang, Hsin-Ying Lee, Hung-Ting Su, Ping-Chia Huang, Winston H. Hsu
Affiliations: National Taiwan University
Comments: Accepted by ICCV 2021
Link: https://arxiv.org/abs/2107.11769
Abstract: Despite the success of deep learning on supervised point cloud semantic segmentation, obtaining large-scale point-by-point manual annotations is still a significant challenge. To reduce the huge annotation burden, we propose a Region-based and Diversity-aware Active Learning (ReDAL), a general framework for many deep learning approaches, aiming to automatically select only informative and diverse sub-scene regions for label acquisition. Observing that only a small portion of annotated regions are sufficient for 3D scene understanding with deep learning, we use softmax entropy, color discontinuity, and structural complexity to measure the information of sub-scene regions. A diversity-aware selection algorithm is also developed to avoid redundant annotations resulting from selecting informative but similar regions in a querying batch. Extensive experiments show that our method highly outperforms previous active learning strategies, and we achieve the performance of 90% fully supervised learning, while less than 15% and 5% annotations are required on S3DIS and SemanticKITTI datasets, respectively.
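Of the three region-information cues named in the abstract, softmax entropy is the simplest to make concrete. The sketch below is a minimal illustration, not the authors' implementation: the function names are hypothetical, and averaging per-point entropy over a region is an assumption about how the scores might be aggregated. The color-discontinuity and structural-complexity terms are not covered.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one point's class logits."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_entropy(logits: Sequence[float]) -> float:
    """Shannon entropy (in nats) of the softmax distribution for one point."""
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def region_entropy(region_logits: Sequence[Sequence[float]]) -> float:
    """Mean per-point entropy of a region (aggregation by mean is an assumption)."""
    return sum(softmax_entropy(row) for row in region_logits) / len(region_logits)
```

A region whose points all receive near-uniform class probabilities scores close to the logarithm of the number of classes, while a confidently predicted region scores close to zero, so ranking regions by this value favors the ones the model is least sure about.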

【4】 Multi-Perspective Content Delivery Networks Security Framework Using Optimized Unsupervised Anomaly Detection

Authors: Li Yang, Abdallah Moubayed, Abdallah Shami, Parisa Heidari, Amine Boukhtouta, Adel Larabi, Richard Brunner, Stere Preda, Daniel Migault
Affiliations: Western University
Comments: Accepted and to appear in IEEE Transactions on Network and Service Management
Link: https://arxiv.org/abs/2107.11514
Abstract: Content delivery networks (CDNs) provide efficient content distribution over the Internet. CDNs improve the connectivity and efficiency of global communications, but their caching mechanisms may be breached by cyber-attackers. Among the security mechanisms, effective anomaly detection forms an important part of CDN security enhancement. In this work, we propose a multi-perspective unsupervised learning framework for anomaly detection in CDNs. In the proposed framework, a multi-perspective feature engineering approach, an optimized unsupervised anomaly detection model that utilizes an isolation forest and a Gaussian mixture model, and a multi-perspective validation method are developed to detect abnormal behaviors in CDNs mainly from the client Internet Protocol (IP) and node perspectives, and thereby to identify the denial of service (DoS) and cache pollution attack (CPA) patterns. Experimental results are presented based on the analytics of eight days of real-world CDN log data provided by a major CDN operator. Through experiments, the abnormal contents, compromised nodes, malicious IPs, as well as their corresponding attack types, are identified effectively by the proposed framework and validated by multiple cybersecurity experts. This shows the effectiveness of the proposed method when applied to real-world CDN data.

【5】 μDARTS: Model Uncertainty-Aware Differentiable Architecture Search

Authors: Biswadeep Chakraborty, Saibal Mukhopadhyay
Affiliations: Georgia Institute of Technology
Comments: 10 pages, 7 tables, 6 figures, submitted to TNNLS
Link: https://arxiv.org/abs/2107.11500
Abstract: We present a Model Uncertainty-aware Differentiable ARchiTecture Search ($\mu$DARTS) that optimizes neural networks to simultaneously achieve high accuracy and low uncertainty. We introduce concrete dropout within DARTS cells and include a Monte-Carlo regularizer within the training loss to optimize the concrete dropout probabilities. A predictive variance term is introduced in the validation loss to enable searching for architecture with minimal model uncertainty. The experiments on CIFAR10, CIFAR100, SVHN, and ImageNet verify the effectiveness of $\mu$DARTS in improving accuracy and reducing uncertainty compared to existing DARTS methods. Moreover, the final architecture obtained from $\mu$DARTS shows higher robustness to noise at the input image and model parameters compared to the architecture obtained from existing DARTS methods.

【6】 HierMUD: Hierarchical Multi-task Unsupervised Domain Adaptation between Bridges for Drive-by Damage Diagnosis

Authors: Jingxiao Liu, Susu Xu, Mario Bergés, Hae Young Noh
Affiliations: Stanford University, USA (Department of Civil & Environmental Engineering); Stony Brook University, USA; Carnegie Mellon University
Link: https://arxiv.org/abs/2107.11435
Abstract: Monitoring bridge health using vibrations of drive-by vehicles has various benefits, such as no need for directly installing and maintaining sensors on the bridge. However, many of the existing drive-by monitoring approaches are based on supervised learning models that require labeled data from every bridge of interest, which is expensive and time-consuming, if not impossible, to obtain. To this end, we introduce a new framework that transfers the model learned from one bridge to diagnose damage in another bridge without any labels from the target bridge. Our framework trains a hierarchical neural network model in an adversarial way to extract task-shared and task-specific features that are informative to multiple diagnostic tasks and invariant across multiple bridges. We evaluate our framework on experimental data collected from 2 bridges and 3 vehicles. We achieve accuracies of 95% for damage detection, 93% for localization, and up to 72% for quantification, which are ~2 times improvements over baseline methods.

【7】 Weakly Supervised Attention Model for RV Strain Classification from volumetric CTPA Scans

Authors: Noa Cahan, Edith M. Marom, Shelly Soffer, Yiftach Barash, Eli Konen, Eyal Klang, Hayit Greenspan
Affiliations: Department of Biomedical Engineering, Tel-Aviv University, Tel-Aviv, Israel; Department of Diagnostic Imaging, Sheba Medical Center, Ramat Gan, Israel, affiliated with the Tel Aviv University, Tel Aviv, Israel
Comments: 12 pages, 6 figures, 5 tables
Link: https://arxiv.org/abs/2107.12009
Abstract: Pulmonary embolus (PE) refers to obstruction of pulmonary arteries by blood clots. PE accounts for approximately 100,000 deaths per year in the United States alone. The clinical presentation of PE is often nonspecific, making the diagnosis challenging. Thus, rapid and accurate risk stratification is of paramount importance. High-risk PE is caused by right ventricular (RV) dysfunction from acute pressure overload, which in turn can help identify which patients require more aggressive therapy. Reconstructed four-chamber views of the heart on chest CT can detect right ventricular enlargement. CT pulmonary angiography (CTPA) is the gold standard in the diagnostic workup of suspected PE. Therefore, it can link between diagnosis and risk stratification strategies. We developed a weakly supervised deep learning algorithm, with an emphasis on a novel attention mechanism, to automatically classify RV strain on CTPA. Our method is a 3D DenseNet model with integrated 3D residual attention blocks. We evaluated our model on a dataset of CTPAs of emergency department (ED) PE patients. This model achieved an area under the receiver operating characteristic curve (AUC) of 0.88 for classifying RV strain. The model showed a sensitivity of 87% and specificity of 83.7%. Our solution outperforms state-of-the-art 3D CNN networks. The proposed design allows for a fully automated network that can be trained easily in an end-to-end manner without requiring computationally intensive and time-consuming preprocessing or strenuous labeling of the data. We infer that unmarked CTPAs can be used for effective RV strain classification. This could be used as a second reader, alerting for high-risk PE patients. To the best of our knowledge, there are no previous deep learning-based studies that attempted to solve this problem.
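For readers less familiar with the two reported metrics, the sketch below shows how sensitivity and specificity are computed from binary labels and predictions. It is a generic illustration; the function name and the example labels are hypothetical and are not taken from the paper's data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels where 1 is positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    sensitivity = tp / (tp + fn)  # fraction of true positives that were caught
    specificity = tn / (tn + fp)  # fraction of true negatives that were cleared
    return sensitivity, specificity


# Hypothetical example: 4 positive and 4 negative cases, one error in each group.
example_true = [1, 1, 1, 1, 0, 0, 0, 0]
example_pred = [1, 1, 1, 0, 0, 0, 0, 1]
example_result = sensitivity_specificity(example_true, example_pred)
```

Sensitivity matters most for a second-reader tool of this kind, since a missed high-risk case is costlier than a false alert.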

【8】 Deep-learning-driven Reliable Single-pixel Imaging with Uncertainty Approximation

Authors: Ruibo Shang, Mikaela A. O'Brien, Geoffrey P. Luke
Affiliations: Thayer School of Engineering, Dartmouth College, Engineering Dr., Hanover, NH, USA
Comments: 19 pages, 4 figures
Link: https://arxiv.org/abs/2107.11678
Abstract: Single-pixel imaging (SPI) has the advantages of high-speed acquisition over a broad wavelength range and system compactness, which are difficult to achieve by conventional imaging sensors. However, a common challenge is low image quality arising from undersampling. Deep learning (DL) is an emerging and powerful tool in computational imaging for many applications and researchers have applied DL in SPI to achieve higher image quality than conventional reconstruction approaches. One outstanding challenge, however, is that the accuracy of DL predictions in SPI cannot be assessed in practical applications where the ground truths are unknown. Here, we propose the use of the Bayesian convolutional neural network (BCNN) to approximate the uncertainty (coming from finite training data and network model) of the DL predictions in SPI. Each pixel in the predicted result from BCNN represents the parameter of a probability distribution rather than the image intensity value. Then, the uncertainty can be approximated with BCNN by minimizing a negative log-likelihood loss function in the training stage and Monte Carlo dropout in the prediction stage. The results show that the BCNN can reliably approximate the uncertainty of the DL predictions in SPI with varying compression ratios and noise levels. The predicted uncertainty from BCNN in SPI reveals that most of the reconstruction errors in deep-learning-based SPI come from the edges of the image features. The results show that the proposed BCNN can provide a reliable tool to approximate the uncertainty of DL predictions in SPI and can be widely used in many applications of SPI.

Transfer | Zero/Few/One-Shot | Adaptation (1 paper)

【1】 Trade When Opportunity Comes: Price Movement Forecasting via Locality-Aware Attention and Adaptive Refined Labeling

Authors: Liang Zeng, Lei Wang, Hui Niu, Jian Li, Ruchen Zhang, Zhonghao Dai, Dewei Zhu, Ling Wang
Affiliations: Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University, China; Huatai Securities Co., Ltd, China
Link: https://arxiv.org/abs/2107.11972
Abstract: Price movement forecasting aims at predicting the future trends of financial assets based on the current market conditions and other relevant information. Recently, machine learning (ML) methods have become increasingly popular and achieved promising results for price movement forecasting in both academia and industry. Most existing ML solutions formulate the forecasting problem as a classification (to predict the direction) or a regression (to predict the return) problem in the entire set of training data. However, due to the extremely low signal-to-noise ratio and stochastic nature of financial data, good trading opportunities are extremely scarce. As a result, without careful selection of potentially profitable samples, such ML methods are prone to capture the patterns of noise instead of real signals. To address the above issues, we propose a novel framework, LARA (Locality-Aware Attention and Adaptive Refined Labeling), which contains the following three components: 1) Locality-aware attention automatically extracts the potentially profitable samples by attending to their label information in order to construct a more accurate classifier on these selected samples. 2) Adaptive refined labeling further iteratively refines the labels, alleviating the noise of samples. 3) Equipped with metric learning techniques, locality-aware attention enjoys task-specific distance metrics and distributes attention on potentially profitable samples in a more effective way. To validate our method, we conduct comprehensive experiments on three real-world financial markets: ETFs, China's A-share stock market, and the cryptocurrency market. LARA achieves superior performance compared with the time-series analysis methods and a set of machine learning based competitors on the Qlib platform. Extensive ablation studies and experiments demonstrate that LARA indeed captures more reliable trading opportunities.

Reinforcement Learning (2 papers)

【1】 DR2L: Surfacing Corner Cases to Robustify Autonomous Driving via Domain Randomization Reinforcement Learning

Authors: Haoyi Niu, Jianming Hu, Zheyu Cui, Yi Zhang
Affiliations: Department of Automation, Tsinghua University, Beijing, China
Comments: 8 pages, 7 figures
Link: https://arxiv.org/abs/2107.11762
Abstract: How to explore corner cases as efficiently and thoroughly as possible has long been one of the top concerns in the context of deep reinforcement learning (DeepRL) autonomous driving. Training with simulated data is less costly and dangerous than utilizing real-world data, but the inconsistency of parameter distribution and the incorrect system modeling in simulators always lead to an inevitable Sim2real gap, which probably accounts for the underperformance in novel, anomalous and risky cases that simulators can hardly generate. Domain Randomization (DR) is a methodology that can bridge this gap with little or no real-world data. Consequently, in this research, an adversarial model is put forward to robustify DeepRL-based autonomous vehicles trained in simulation by gradually surfacing harder events, so that the models could readily transfer to the real world.

【2】 Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose?

Authors: Balázs Kégl, Gabriel Hurtado, Albert Thomas
Affiliations: Huawei Noah's Ark Lab, Paris, France
Comments: Published at International Conference on Learning Representations, 2021: this https URL
Link: https://arxiv.org/abs/2107.11587
Abstract: We contribute to micro-data model-based reinforcement learning (MBRL) by rigorously comparing popular generative models using a fixed (random shooting) control agent. We find that on an environment that requires multimodal posterior predictives, mixture density nets outperform all other models by a large margin. When multimodality is not required, our surprising finding is that we do not need probabilistic posterior predictives: deterministic models are on par, in fact they consistently (although non-significantly) outperform their probabilistic counterparts. We also found that heteroscedasticity at training time, perhaps acting as a regularizer, improves predictions at longer horizons. On the methodological side, we design metrics and an experimental protocol which can be used to evaluate the various models, predicting their asymptotic performance when using them on the control problem. Using this framework, we improve the state-of-the-art sample complexity of MBRL on Acrobot by two to four fold, using an aggressive training schedule which is outside of the hyperparameter interval usually considered.

Medical (3 papers)

【1】 An Argumentative Dialogue System for COVID-19 Vaccine Information

Authors: Bettina Fazzinga, Andrea Galassi, Paolo Torroni
Affiliations: ICAR CNR, Rende, Italy; DISI, University of Bologna, Bologna, Italy
Comments: 20 pages, 2 figures, currently under submission
Link: https://arxiv.org/abs/2107.12079
Abstract: Dialogue systems are widely used in AI to support timely and interactive communication with users. We propose a general-purpose dialogue system architecture that leverages computational argumentation and state-of-the-art language technologies. We illustrate and evaluate the system using a COVID-19 vaccine information case study.

【2】 3D AGSE-VNet: An Automatic Brain Tumor MRI Data Segmentation Framework

Authors: Xi Guan, Guang Yang, Jianming Ye, Weiji Yang, Xiaomei Xu, Weiwei Jiang, Xiaobo Lai
Affiliations: School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University; Cardiovascular Research Centre, Royal Brompton Hospital, London, UK; National Heart and Lung Institute, Imperial College London, London, UK
Comments: 34 pages, 12 figures, accepted by BMC Medical Imaging
Link: https://arxiv.org/abs/2107.12046
Abstract: Background: Glioma is the most common brain malignant tumor, with a high morbidity rate and a mortality rate of more than three percent, which seriously endangers human health. The main method of acquiring brain tumors in the clinic is MRI. Segmentation of brain tumor regions from multi-modal MRI scan images is helpful for treatment inspection, post-diagnosis monitoring, and effect evaluation of patients. However, the common operation in clinical brain tumor segmentation is still manual segmentation, which is time-consuming and leads to large performance differences between operators; a consistent and accurate automatic segmentation method is urgently needed. Methods: To meet the above challenges, we propose an automatic brain tumor MRI data segmentation framework which is called AGSE-VNet. In our study, the Squeeze and Excite (SE) module is added to each encoder, and the Attention Guide Filter (AG) module is added to each decoder, using the channel relationship to automatically enhance the useful information in the channel and suppress the useless information, and using the attention mechanism to guide the edge information and remove the influence of irrelevant information such as noise. Results: We used the BraTS2020 challenge online verification tool to evaluate our approach. The focus of verification is that the Dice scores of the whole tumor (WT), tumor core (TC) and enhanced tumor (ET) are 0.68, 0.85 and 0.70, respectively. Conclusion: Although MRI images have different intensities, AGSE-VNet is not affected by the size of the tumor, and can more accurately extract the features of the three regions; it has achieved impressive results and made outstanding contributions to the clinical diagnosis and treatment of brain tumor patients.
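The Dice score used to report the results measures the overlap between a predicted mask and a reference mask. The sketch below is a generic illustration on flattened binary masks; the function name, the convention of returning 1.0 for two empty masks, and the example masks are assumptions made here and do not come from the paper or the BraTS tooling.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumption)
    return 2.0 * intersection / total


# Hypothetical 4-voxel masks that agree on one of their foreground voxels.
example_dice = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
```

A score of 1.0 means the two masks coincide and 0.0 means they share no foreground voxel, which is how the per-region values for WT, TC and ET should be read.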

【3】 Lung Cancer Risk Estimation with Incomplete Data: A Joint Missing Imputation Perspective

Authors: Riqiang Gao, Yucheng Tang, Kaiwen Xu, Ho Hin Lee, Steve Deppen, Kim Sandler, Pierre Massion, Thomas A. Lasko, Yuankai Huo, Bennett A. Landman
Affiliations: EECS, Vanderbilt University, Nashville, TN, USA; Vanderbilt University Medical Center, Nashville, TN, USA
Comments: Early Accepted by MICCAI 2021. Traveling Award
Link: https://arxiv.org/abs/2107.11882
Abstract: Data from multi-modality provide complementary information in clinical prediction, but missing data in clinical cohorts limits the number of subjects in multi-modal learning context. Multi-modal missing imputation is challenging with existing methods when 1) the missing data span across heterogeneous modalities (e.g., image vs. non-image); or 2) one modality is largely missing. In this paper, we address imputation of missing data by modeling the joint distribution of multi-modal data. Motivated by partial bidirectional generative adversarial net (PBiGAN), we propose a new Conditional PBiGAN (C-PBiGAN) method that imputes one modality combining the conditional knowledge from another modality. Specifically, C-PBiGAN introduces a conditional latent space in a missing imputation framework that jointly encodes the available multi-modal data, along with a class regularization loss on imputed data to recover discriminative information. To our knowledge, it is the first generative adversarial model that addresses multi-modal missing imputation by modeling the joint distribution of image and non-image data. We validate our model with both the national lung screening trial (NLST) dataset and an external clinical validation cohort. The proposed C-PBiGAN achieves significant improvements in lung cancer risk estimation compared with representative imputation methods (e.g., AUC values increase in both NLST (+2.9%) and in-house dataset (+4.3%) compared with PBiGAN, p < 0.05).

Clustering (2 papers)

【1】 EGGS: Eigen-Gap Guided Search Making Subspace Clustering Easy

Authors: Jicong Fan, Yiheng Tu, Zhao Zhang, Mingbo Zhao
Affiliations: Zhao Zhang is with the School of Computer Science and Information Engineering, Hefei University of Technology; Mingbo Zhao is with the School of Information Science, Donghua University
Link: https://arxiv.org/abs/2107.12183
Abstract: The performance of spectral clustering heavily relies on the quality of affinity matrix. A variety of affinity-matrix-construction methods have been proposed but they have hyper-parameters to determine beforehand, which require strong experience and lead to difficulty in real applications especially when the inter-cluster similarity is high or/and the dataset is large. On the other hand, we often have to determine to use a linear model or a nonlinear model, which still depends on experience. To solve these two problems, in this paper, we present an eigen-gap guided search method for subspace clustering. The main idea is to find the most reliable affinity matrix among a set of candidates constructed by linear and kernel regressions, where the reliability is quantified by the relative-eigen-gap of graph Laplacian defined in this paper. We show, theoretically and numerically, that the Laplacian matrix with a larger relative-eigen-gap often yields a higher clustering accuracy and stability. Our method is able to automatically search the best model and hyper-parameters in a pre-defined space. The search space is very easy to determine and can be arbitrarily large, though a relatively compact search space can reduce the highly unnecessary computation. Our method has high flexibility and convenience in real applications, and also has low computational cost because the affinity matrix is not computed by iterative optimization. We extend the method to large-scale datasets such as MNIST, on which the time cost is less than 90s and the clustering accuracy is state-of-the-art. Extensive experiments of natural image clustering show that our method is more stable, accurate, and efficient than baseline methods.

【2】 Invariance-based Multi-Clustering of Latent Space Embeddings for Equivariant Learning

Authors: Chandrajit Bajaj, Avik Roy, Haoran Zhang
Affiliations: Department of Computer Science, The University of Texas at Austin, TX 78712; Department of Physics
Comments: The codebase for MCEVAE is available at this https URL
Link: https://arxiv.org/abs/2107.11717
Abstract: Variational Autoencoders (VAEs) have been shown to be remarkably effective in recovering model latent spaces for several computer vision tasks. However, currently trained VAEs, for a number of reasons, seem to fall short in learning invariant and equivariant clusters in latent space. Our work focuses on providing solutions to this problem and presents an approach to disentangle equivariance feature maps in a Lie group manifold by enforcing deep, group-invariant learning. Simultaneously implementing a novel separation of semantic and equivariant variables of the latent space representation, we formulate a modified Evidence Lower BOund (ELBO) by using a mixture-model pdf, such as a Gaussian mixture, for invariant cluster embeddings, which allows superior unsupervised variational clustering. Our experiments show that this model effectively learns to disentangle the invariant and equivariant representations, with significant improvements in the learning rate and observably superior image recognition and canonical state reconstruction compared to the currently best deep learning models.

Super-resolution | Denoising | Deblurring | Dehazing (2 papers)

【1】 Denoising and Segmentation of Epigraphical Scripts

Authors: P Preethi, Hrishikesh Viswanath
Affiliations: Assistant Professor, Department of Computer Science, PES University, Bangalore
Link: https://arxiv.org/abs/2107.11801
Abstract: This paper presents a new method for denoising images using Haralick features and further segmenting the characters using artificial neural networks. The image is divided into kernels, each of which is converted to a GLCM (Gray Level Co-Occurrence Matrix) on which a Haralick feature generation function is called, the result of which is an array with fourteen elements corresponding to fourteen features. The Haralick values and the corresponding noise/text classification form a dictionary, which is then used to de-noise the image through kernel comparison. Segmentation is the process of extracting characters from a document and can be used when letters are separated by white space, which is an explicit boundary marker. Segmentation is the first step in many Natural Language Processing problems. This paper explores the process of segmentation using neural networks. While there have been numerous methods to segment characters of a document, this paper is only concerned with the accuracy of doing so using neural networks. It is imperative that the characters be segmented correctly, for failing to do so will lead to incorrect recognition by Natural Language Processing tools. Artificial neural networks were used to attain an accuracy of up to 89%. This method is suitable for languages where the characters are delimited by white space.
However, this method will fail to provide acceptable results when the language heavily uses connected letters. An example would be the Devanagari script, which is predominantly used in northern India.
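The GLCM step mentioned in this abstract can be illustrated with a small sketch. It is not taken from the paper: the patch, the number of gray levels, the horizontal offset, and the function names are all assumptions chosen for illustration, and only one of the fourteen Haralick features (contrast) is computed.

```python
def glcm(patch, levels, dx=1, dy=0):
    """Count how often gray level i is followed by gray level j at offset (dx, dy).
    The matrix is directional (not symmetrized) to keep the sketch short."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(patch), len(patch[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[patch[r][c]][patch[r2][c2]] += 1
    return counts


def contrast(counts):
    """Haralick contrast: sum of (i - j)^2 * p(i, j), with p the normalized GLCM."""
    total = sum(map(sum, counts))
    return sum((i - j) ** 2 * n / total
               for i, row in enumerate(counts) for j, n in enumerate(row))


# A hypothetical 3x3 kernel quantized to three gray levels.
patch = [[0, 0, 1],
         [0, 1, 1],
         [2, 2, 2]]
matrix = glcm(patch, levels=3)
```

A feature vector built from such values could then be looked up against labeled noise/text kernels, which is the role the abstract assigns to its dictionary.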

【2】 Discrete Denoising Flows

Authors: Alexandra Lindt, Emiel Hoogeboom
Affiliations: University of Amsterdam
Comments: Accepted to the Third workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models (ICML 2021)
Link: https://arxiv.org/abs/2107.11625
Abstract: Discrete flow-based models are a recently proposed class of generative models that learn invertible transformations for discrete random variables. Since they do not require data dequantization and maximize an exact likelihood objective, they can be used in a straightforward manner for lossless compression. In this paper, we introduce a new discrete flow-based model for categorical random variables: Discrete Denoising Flows (DDFs). In contrast with other discrete flow-based models, our model can be locally trained without introducing gradient bias. We show that DDFs outperform Discrete Flows on modeling a toy example, binary MNIST and Cityscapes segmentation maps, measured in log-likelihood.

Autonomous driving | Vehicles | Lane detection, etc. (2 papers)

【1】 Multimodal Fusion Using Deep Learning Applied to Driver's Referencing of Outside-Vehicle Objects

Authors: Abdul Rafey Aftab, Michael von der Beeck, Steven Rohrhirsch, Benoit Diotte, Michael Feld
Link: https://arxiv.org/abs/2107.12167
Abstract: There is a growing interest in more intelligent natural user interaction with the car. Hand gestures and speech are already being applied for driver-car interaction. Moreover, multimodal approaches are also showing promise in the automotive industry. In this paper, we utilize deep learning for a multimodal fusion network for referencing objects outside the vehicle. We use features from gaze, head pose and finger pointing simultaneously to precisely predict the referenced objects in different car poses. We demonstrate the practical limitations of each modality when used for a natural form of referencing, specifically inside the car. As evident from our results, we overcome the modality-specific limitations, to a large extent, by the addition of other modalities. This work highlights the importance of multimodal sensing, especially when moving towards natural user interaction. Furthermore, our user-based analysis shows noteworthy differences in recognition of user behavior depending upon the vehicle pose.

【2】 Deep Machine Learning Based Egyptian Vehicle License Plate Recognition Systems

Authors: Mohamed Shehata, Mohamed Taha Abou-Kreisha, Hany Elnashar
Affiliations: Systems & Computers Engineering, Al-Azhar University, Egypt; Mathematical & Computer Science Department, Al-Azhar University, Egypt; Project Development Unit, Computers and Artificial Intelligence, Beni-Suef University
Link: https://arxiv.org/abs/2107.11640
Abstract: Automated Vehicle License Plate (VLP) detection and recognition have recently become a significant research topic. VLP localization and recognition are some of the most essential techniques for managing traffic using digital techniques. In this paper, four smart systems are developed to recognize Egyptian vehicle license plates. Two systems are based on character recognition: System1 (Characters Recognition with Classical Machine Learning) and System2 (Characters Recognition with Deep Machine Learning). The other two systems are based on whole-plate recognition: System3 (Whole License Plate Recognition with Classical Machine Learning) and System4 (Whole License Plate Recognition with Deep Machine Learning). We use object detection algorithms and machine learning based object recognition algorithms. The performance of the developed systems has been tested on real images, and the experimental results demonstrate that the best detection accuracy rate for VLP is provided by the deep learning method, where the VLP detection accuracy rate is better than that of the classical system by 32%.
However, the best detection accuracy rate for Vehicle License Plate Arabic Character (VLPAC) is provided by the classical method, where the VLPAC detection accuracy rate is better than that of the deep learning-based system by 6%. Also, the results show that deep learning is better than the classical technique in VLP recognition, where the recognition accuracy rate is better than that of the classical system by 8%. Finally, the paper recommends a robust VLP recognition system based on both statistical and deep machine learning.

Federated learning | Privacy protection | Encryption (7 papers)

【1】 On The Impact of Client Sampling on Federated Learning Convergence

Authors: Yann Fraboni, Richard Vidal, Laetitia Kameni, Marco Lorenzi
Affiliations: Université Côte d’Azur, Inria Sophia Antipolis, Epione Research Group, France; Accenture Labs, Sophia Antipolis, France
Link: https://arxiv.org/abs/2107.12211
Abstract: While client sampling is a central operation of current state-of-the-art federated learning (FL) approaches, the impact of this procedure on the convergence and speed of FL remains under-investigated to date. In this work we introduce a novel decomposition theorem for the convergence of FL, allowing us to clearly quantify the impact of client sampling on the global model update. Contrary to previous convergence analyses, our theorem provides the exact decomposition of a given convergence step, thus enabling accurate considerations about the role of client sampling and heterogeneity. First, we provide a theoretical ground for previously reported results on the relationship between FL convergence and the variance of the aggregation weights. Second, we prove for the first time that the quality of FL convergence is also impacted by the resulting covariance between aggregation weights. Third, we establish that the sum of the aggregation weights is another source of slow-down and should be equal to 1 to improve FL convergence speed. Our theory is general, and is here applied to Multinomial Distribution (MD) and Uniform sampling, the two default client sampling schemes in FL, and demonstrated through a series of experiments in non-iid and unbalanced scenarios.
Our results suggest that MD sampling should be used as the default sampling scheme, due to its resilience to changes in the data ratio during the learning process, while Uniform sampling is superior only in the special case when clients have the same amount of data.
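The point about the sum of the aggregation weights can be made concrete with a short simulation. The two weighting rules below are the commonly used unbiased forms of MD and Uniform sampling and are assumed here, not quoted from the paper. The client data ratios, function names, and seed are illustrative.

```python
import random


def md_weights(p, m, rng):
    """MD sampling: draw m clients with replacement, client i with probability p[i];
    each draw contributes weight 1/m, so the weights always sum to 1."""
    w = [0.0] * len(p)
    for i in rng.choices(range(len(p)), weights=p, k=m):
        w[i] += 1.0 / m
    return w


def uniform_weights(p, m, rng):
    """Uniform sampling: draw m distinct clients and rescale p[i] by n/m,
    so the expected weight of client i equals p[i]."""
    n = len(p)
    w = [0.0] * n
    for i in rng.sample(range(n), m):
        w[i] = n / m * p[i]
    return w


rng = random.Random(0)
p = [0.5, 0.3, 0.1, 0.1]  # data ratios of four hypothetical clients
md_sum = sum(md_weights(p, 2, rng))
uni_sums = {round(sum(uniform_weights(p, 2, rng)), 6) for _ in range(200)}
```

With unbalanced data ratios, `md_sum` is exactly 1 on every draw, whereas the Uniform sums vary from round to round (here between 0.4 and 1.6). If all clients held the same amount of data, the Uniform weights would also sum to 1.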

【2】 Decentralized Federated Learning: Balancing Communication and Computing Costs

Authors: Wei Liu, Li Chen, Wenyi Zhang
Link: https://arxiv.org/abs/2107.12048
Abstract: Decentralized federated learning (DFL) is a powerful framework of distributed machine learning, and decentralized stochastic gradient descent (SGD) is a driving engine for DFL. The performance of decentralized SGD is jointly influenced by communication efficiency and convergence rate. In this paper, we propose a general decentralized federated learning framework to strike a balance between communication efficiency and convergence performance. The proposed framework performs both multiple local updates and multiple inter-node communications periodically, unifying traditional decentralized SGD methods. We establish strong convergence guarantees for the proposed DFL algorithm without assuming a convex objective function. The balance of communication and computation rounds is essential to optimize decentralized federated learning under constrained communication and computation resources. To further improve the communication efficiency of DFL, compressed communication is applied to DFL, named DFL with compressed communication (C-DFL). The proposed C-DFL exhibits linear convergence for strongly convex objectives. Experimental results based on the MNIST and CIFAR-10 datasets illustrate the superiority of DFL over traditional decentralized SGD methods and show that C-DFL further enhances communication efficiency.
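The alternation of several local updates with several inter-node communications can be sketched on a toy problem. Everything here is an assumption made for illustration, not the paper's algorithm: each node holds a scalar model and a quadratic objective, exact gradients replace stochastic ones, and the ring topology, mixing matrix, and step counts are arbitrary.

```python
def dfl_round(x, targets, mixing, lr, local_steps, comm_steps):
    """One period: every node runs `local_steps` gradient steps on its own
    objective 0.5 * (x_i - target_i)^2, then the nodes average with their
    neighbours `comm_steps` times using the mixing matrix."""
    for _ in range(local_steps):
        x = [xi - lr * (xi - t) for xi, t in zip(x, targets)]
    for _ in range(comm_steps):
        x = [sum(w * xj for w, xj in zip(row, x)) for row in mixing]
    return x


# Ring of four nodes; every row and column of the mixing matrix sums to 1.
W = [[0.50, 0.25, 0.00, 0.25],
     [0.25, 0.50, 0.25, 0.00],
     [0.00, 0.25, 0.50, 0.25],
     [0.25, 0.00, 0.25, 0.50]]
targets = [1.0, 2.0, 3.0, 4.0]  # the average objective is minimized at 2.5
x = [0.0] * 4
for _ in range(200):
    x = dfl_round(x, targets, W, lr=0.05, local_steps=2, comm_steps=3)
mean_x = sum(x) / len(x)
```

Because the mixing matrix is doubly stochastic, communication leaves the network average unchanged. More communication steps per period shrink the disagreement between nodes, while more local steps speed up progress but let the nodes drift apart, which is the trade-off the abstract describes.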

【3】 Aggregate or Not? Exploring Where to Privatize in DNN Based Federated Learning Under Different Non-IID Scenes

Authors: Xin-Chun Li, Le Gan, De-Chuan Zhan, Yunfeng Shao, Bingshuai Li, Shaoming Song
Affiliations: Technology, Nanjing University
Link: https://arxiv.org/abs/2107.11954
Abstract: Although federated learning (FL) has recently been proposed for efficient distributed training and data privacy protection, it still encounters many obstacles. One of these is the naturally existing statistical heterogeneity among clients, which makes local data distributions not independent and identically distributed (i.e., non-iid) and poses challenges for model aggregation and personalization. For FL with a deep neural network (DNN), privatizing some layers is a simple yet effective solution for non-iid problems. However, which layers should we privatize to facilitate the learning process? Do different categories of non-iid scenes have preferred privatization ways? Can we automatically learn the most appropriate privatization way during FL? In this paper, we answer these questions via abundant experimental studies on several FL benchmarks. First, we present the detailed statistics of these benchmarks and categorize them into covariate and label shift non-iid scenes. Then, we investigate both coarse-grained and fine-grained network splits and explore whether the preferred privatization ways have any potential relations to the specific category of a non-iid scene.
Our findings are exciting: for example, privatizing the base layers could boost performance even in label shift non-iid scenes, which is inconsistent with some natural conjectures. We also find that none of these privatization ways could improve performance on the Shakespeare benchmark, and we conjecture that Shakespeare may not be a seriously non-iid scene. Finally, we propose several approaches to automatically learn where to aggregate via cross-stitch, soft attention, and hard selection. We suggest that the proposed methods could serve as a preliminary attempt to explore where to privatize for a novel non-iid scene.

【4】 Federated Learning with Fair Worker Selection: A Multi-Round Submodular Maximization Approach

Authors: Fengjiao Li, Jia Liu, Bo Ji
Affiliations: The Ohio State University
Link: https://arxiv.org/abs/2107.11728
Abstract: In this paper, we study the problem of fair worker selection in Federated Learning systems, where fairness serves as an incentive mechanism that encourages more workers to participate in the federation. Considering the achieved training accuracy of the global model as the utility of the selected workers, which is typically a monotone submodular function, we formulate the worker selection problem as a new multi-round monotone submodular maximization problem with cardinality and fairness constraints. The objective is to maximize the time-average utility over multiple rounds subject to an additional fairness requirement that each worker must be selected for a certain fraction of time. While the traditional submodular maximization with a cardinality constraint is already a well-known NP-Hard problem, the fairness constraint in the multi-round setting adds an extra layer of difficulty. To address this novel challenge, we propose three algorithms: Fair Continuous Greedy (FairCG1 and FairCG2) and Fair Discrete Greedy (FairDG), all of which satisfy the fairness requirement whenever feasible. Moreover, we prove nontrivial lower bounds on the achieved time-average utility under FairCG1 and FairCG2. In addition, by giving a higher priority to fairness, FairDG ensures a stronger short-term fairness guarantee, which holds in every round.
Finally, we perform extensive simulations to verify the effectiveness of the proposed algorithms in terms of the time-average utility and fairness satisfaction.
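For orientation, the classical single-round greedy rule for monotone submodular maximization under a cardinality constraint is sketched below. It is the textbook baseline, not FairCG1, FairCG2, or FairDG, and it ignores the fairness constraint entirely. The coverage utility, worker names, and budget are hypothetical.

```python
def greedy_select(coverage, budget):
    """Pick `budget` workers, each time adding the worker whose items give the
    largest marginal gain in the coverage utility |union of covered items|."""
    chosen, covered = [], set()
    for _ in range(budget):
        best = max((w for w in coverage if w not in chosen),
                   key=lambda w: len(coverage[w] - covered))
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered


# Each hypothetical worker "covers" a set of data classes.
coverage = {"w1": {1, 2, 3}, "w2": {3, 4}, "w3": {4, 5, 6, 7}, "w4": {1, 7}}
chosen, covered = greedy_select(coverage, budget=2)
```

Run in every round, this rule would keep choosing the same high-gain workers, which is the kind of starvation that a per-worker selection-fraction requirement is meant to rule out.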

【5】 FedLab: A Flexible Federated Learning Framework

Authors: Dun Zeng, Siqi Liang, Xiangjing Hu, Zenglin Xu
Affiliations: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China; School of Computer Science and Technology, Harbin Institute of Technology Shenzhen, Shenzhen, China
Link: https://arxiv.org/abs/2107.11621
Abstract: Federated learning (FL) is a solution to privacy challenges that allows multiple parties to train a shared model without violating privacy protection regulations. Many excellent works on FL have been proposed in recent years. To help researchers verify their ideas in FL, we designed and developed FedLab, a flexible and modular FL framework based on PyTorch. In this paper, we introduce the architecture and features of FedLab. For the currently popular research topics of optimization and communication compression, FedLab provides functional interfaces and a series of baseline implementations, allowing researchers to implement ideas quickly. In addition, FedLab is scalable in both client simulation and distributed communication.

【6】 Accelerating Federated Edge Learning via Optimized Probabilistic Device Scheduling

Authors: Maojun Zhang, Guangxu Zhu, Shuai Wang, Jiamo Jiang, Caijun Zhong, Shuguang Cui
Affiliations: College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, China; Shenzhen Research Institute of Big Data, Shenzhen, China; Southern University of Science and Technology, Shenzhen, China
Comments: In Proc. IEEE SPAWC2021
Link: https://arxiv.org/abs/2107.11588
Abstract: The popular federated edge learning (FEEL) framework allows privacy-preserving collaborative model training via frequent exchange of learning updates between edge devices and the server. Due to the constrained bandwidth, only a subset of devices can upload their updates at each communication round. This has led to an active research area in FEEL studying the optimal device scheduling policy for minimizing communication time. However, owing to the difficulty in quantifying the exact communication time, prior work in this area can only tackle the problem partially by considering either the communication rounds or per-round latency, while the total communication time is determined by both metrics. To close this gap, we make the first attempt in this paper to formulate and solve the communication time minimization problem. We first derive a tight bound to approximate the communication time through a cross-disciplinary effort involving both learning theory for convergence analysis and communication theory for per-round latency analysis. Building on the analytical result, an optimized probabilistic scheduling policy is derived in closed form by solving the approximate communication time minimization problem.
It is found that the optimized policy gradually shifts its priority from suppressing the remaining communication rounds to reducing per-round latency as the training process evolves. The effectiveness of the proposed scheme is demonstrated via a use case on collaborative 3D object detection in autonomous driving.

【7】 Device Scheduling and Update Aggregation Policies for Asynchronous Federated Learning

Authors: Chung-Hsuan Hu, Zheng Chen, Erik G. Larsson
Affiliations: Department of Electrical Engineering, Linköping University, Sweden
Comments: 5 pages, 4 figures, accepted in 22nd IEEE international workshop on signal processing advances in wireless communications (SPAWC 2021)
Link: https://arxiv.org/abs/2107.11415
Abstract: Federated Learning (FL) is a newly emerged decentralized machine learning (ML) framework that combines on-device local training with server-based model synchronization to train a centralized ML model over distributed nodes. In this paper, we propose an asynchronous FL framework with periodic aggregation to eliminate the straggler issue in FL systems. For the proposed model, we investigate several device scheduling and update aggregation policies and compare their performances when the devices have heterogeneous computation capabilities and training data distributions. From the simulation results, we conclude that the scheduling and aggregation design for asynchronous FL can be rather different from the synchronous case. For example, a norm-based significance-aware scheduling policy might not be efficient in an asynchronous FL setting, and an appropriate "age-aware" weighting design for the model aggregation can greatly improve the learning performance of such systems.

Inference | Analysis | Understanding | Interpretation (7 papers)

【1】 A brief note on understanding neural networks as Gaussian processes

Authors: Mengwu Guo
Affiliations: Applied Analysis, Department of Applied Mathematics, University of Twente
Link: https://arxiv.org/abs/2107.11892
Abstract: As a generalization of the work in [Lee et al., 2017], this note briefly discusses when the prior of a neural network output follows a Gaussian process, and how a neural-network-induced Gaussian process is formulated. The posterior mean functions of such a Gaussian process regression lie in the reproducing kernel Hilbert space defined by the neural-network-induced kernel. In the case of two-layer neural networks, the induced Gaussian processes provide an interpretation of the reproducing kernel Hilbert spaces whose union forms a Barron space.
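As a concrete instance of a neural-network-induced kernel, the sketch below evaluates the layer-wise recursion for fully connected ReLU networks in the infinite-width limit, using the closed-form (arc-cosine) expression for the ReLU expectation. The choice of ReLU, the default variances, and the function name are assumptions for illustration; the note itself treats a more general setting. Inputs are assumed to be nonzero vectors of equal length.

```python
import math


def nngp_relu_kernel(x, y, depth, sigma_w2=2.0, sigma_b2=0.0):
    """Kernel K^depth(x, y) of the Gaussian process induced by a ReLU network
    with `depth` hidden layers, weight variance sigma_w2 / fan_in and bias
    variance sigma_b2. Assumes x and y are nonzero."""
    def base(a, b):
        # Input-layer kernel: sigma_b^2 + sigma_w^2 * <a, b> / d_in.
        return sigma_b2 + sigma_w2 * sum(ai * bi for ai, bi in zip(a, b)) / len(a)

    kxx, kyy, kxy = base(x, x), base(y, y), base(x, y)
    for _ in range(depth):
        norm = math.sqrt(kxx * kyy)
        theta = math.acos(max(-1.0, min(1.0, kxy / norm)))
        # Closed form of E[relu(u) relu(v)] for (u, v) jointly Gaussian.
        kxy = sigma_b2 + sigma_w2 / (2.0 * math.pi) * norm * (
            math.sin(theta) + (math.pi - theta) * math.cos(theta))
        # Same formula with theta = 0 for the diagonal entries.
        kxx = sigma_b2 + sigma_w2 / 2.0 * kxx
        kyy = sigma_b2 + sigma_w2 / 2.0 * kyy
    return kxy


k_orthogonal = nngp_relu_kernel([1.0, 0.0], [0.0, 1.0], depth=1)
k_same = nngp_relu_kernel([1.0, 0.0], [1.0, 0.0], depth=3)
```

With `sigma_w2 = 2` and no bias, the diagonal of the kernel is preserved from layer to layer, so `k_same` stays at 1, while two orthogonal inputs become positively correlated after one ReLU layer (`k_orthogonal` equals 1/π).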

【2】 Federated Causal Inference in Heterogeneous Observational Data

Authors: Ruoxuan Xiong, Allison Koenecke, Michael Powell, Zhu Shen, Joshua T. Vogelstein, Susan Athey
Link: https://arxiv.org/abs/2107.11732
Abstract: Analyzing observational data from multiple sources can be useful for increasing statistical power to detect a treatment effect; however, practical constraints such as privacy considerations may restrict individual-level information sharing across data sets. This paper develops federated methods that only utilize summary-level information from heterogeneous data sets. Our federated methods provide doubly-robust point estimates of treatment effects as well as variance estimates. We derive the asymptotic distributions of our federated estimators, which are shown to be asymptotically equivalent to the corresponding estimators from the combined, individual-level data. We show that to achieve these properties, federated methods should be adjusted based on conditions such as whether models are correctly specified and stable across heterogeneous data sets.

【3】 Efficient inference of interventional distributions

Authors: Arnab Bhattacharyya, Sutanu Gayen, Saravanan Kandasamy, Vedant Raval, N. V. Vinodchandran
Affiliations: National University of Singapore, Cornell University, Indian Institute of Technology Delhi, University of Nebraska-Lincoln
Comments: 16 pages, 2 figures
Link: https://arxiv.org/abs/2107.11712
Abstract: We consider the problem of efficiently inferring interventional distributions in a causal Bayesian network from a finite number of observations. Let $\mathcal{P}$ be a causal model on a set $\mathbf{V}$ of observable variables on a given causal graph $G$. For sets $\mathbf{X},\mathbf{Y}\subseteq \mathbf{V}$, and setting ${\bf x}$ to $\mathbf{X}$, let $P_{\bf x}(\mathbf{Y})$ denote the interventional distribution on $\mathbf{Y}$ with respect to an intervention ${\bf x}$ to variables $\mathbf{X}$. Shpitser and Pearl (AAAI 2006), building on the work of Tian and Pearl (AAAI 2001), gave an exact characterization of the class of causal graphs for which the interventional distribution $P_{\bf x}(\mathbf{Y})$ can be uniquely determined. We give the first efficient version of the Shpitser-Pearl algorithm. In particular, under natural assumptions, we give a polynomial-time algorithm that, on input a causal graph $G$ on observable variables $\mathbf{V}$ and a setting ${\bf x}$ of a set $\mathbf{X} \subseteq \mathbf{V}$ of bounded size, outputs succinct descriptions of both an evaluator and a generator for a distribution $\hat{P}$ that is $\varepsilon$-close (in total variation distance) to $P_{\bf x}(\mathbf{Y})$, where $\mathbf{Y}=\mathbf{V}\setminus \mathbf{X}$, if $P_{\bf x}(\mathbf{Y})$ is identifiable. We also show that when $\mathbf{Y}$ is an arbitrary set, there is no efficient algorithm that outputs an evaluator of a distribution that is $\varepsilon$-close to $P_{\bf x}(\mathbf{Y})$ unless all problems that have statistical zero-knowledge proofs, including the Graph Isomorphism problem, have efficient randomized algorithms.

【4】 A general sample complexity analysis of vanilla policy gradient

Authors: Rui Yuan, Robert M. Gower, Alessandro Lazaric
Comments: ICML 2021 Workshop on "Reinforcement learning theory"
Link: https://arxiv.org/abs/2107.11433
Abstract: The policy gradient (PG) is one of the most popular methods for solving reinforcement learning (RL) problems. However, a solid theoretical understanding of even the "vanilla" PG has remained elusive for a long time. In this paper, we apply recent tools developed for the analysis of SGD in non-convex optimization to obtain convergence guarantees for both REINFORCE and GPOMDP under a smoothness assumption on the objective function and weak conditions on the second moment of the norm of the estimated gradient. When instantiated under common assumptions on the policy space, our general result immediately recovers existing $\widetilde{\mathcal{O}}(\epsilon^{-4})$ sample complexity guarantees, but for wider ranges of parameters (e.g., step size and batch size $m$) with respect to previous literature. Notably, our result includes the single trajectory case (i.e., $m=1$) and it provides a more accurate analysis of the dependency on problem-specific parameters by fixing previous results available in the literature. We believe that the integration of state-of-the-art tools from non-convex optimization may help identify a much broader range of problems where PG methods enjoy strong theoretical guarantees.
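The REINFORCE estimator with batch size $m$ can be sketched on the simplest possible problem, a two-armed bandit with a softmax policy, where every trajectory has length one. The environment, rewards, step size, and function names are assumptions for illustration and are unrelated to the paper's experiments or analysis.

```python
import math
import random


def softmax(theta):
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def reinforce(rewards, steps, batch_size, lr, seed=0):
    """Vanilla policy gradient: average reward * grad log pi(a) over a batch of
    sampled actions, then take one gradient ascent step on the logits."""
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(theta)
        grad = [0.0] * len(theta)
        for _ in range(batch_size):
            a = rng.choices(range(len(theta)), weights=probs)[0]
            for j in range(len(theta)):
                # For a softmax policy, d log pi(a) / d theta_j = 1[j == a] - pi_j.
                indicator = 1.0 if j == a else 0.0
                grad[j] += rewards[a] * (indicator - probs[j]) / batch_size
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return softmax(theta)


# Single-trajectory case (batch size m = 1) on a bandit with rewards 1 and 0.
policy = reinforce(rewards=[1.0, 0.0], steps=500, batch_size=1, lr=0.5)
```

Even with one sample per update, the probability of the rewarding arm increases toward 1 in this toy problem.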

【5】 Inference for Heteroskedastic PCA with Missing Data

Authors: Yuling Yan, Yuxin Chen, Jianqing Fan
Link: https://arxiv.org/abs/2107.12365
Abstract: This paper studies how to construct confidence regions for principal component analysis (PCA) in high dimension, a problem that has been vastly under-explored. While computing measures of uncertainty for nonlinear/nonconvex estimators is in general difficult in high dimension, the challenge is further compounded by the prevalent presence of missing data and heteroskedastic noise. We propose a suite of solutions to perform valid inference on the principal subspace based on two estimators: a vanilla SVD-based approach, and a more refined iterative scheme called $\textsf{HeteroPCA}$ (Zhang et al., 2018). We develop non-asymptotic distributional guarantees for both estimators, and demonstrate how these can be invoked to compute both confidence regions for the principal subspace and entrywise confidence intervals for the spiked covariance matrix. Particularly worth highlighting is the inference procedure built on top of $\textsf{HeteroPCA}$, which is not only valid but also statistically efficient for broader scenarios (e.g., it covers a wider range of missing rates and signal-to-noise ratios). Our solutions are fully data-driven and adaptive to heteroskedastic random noise, without requiring prior knowledge about the noise levels and noise distributions.

【6】 Inference of collective Gaussian hidden Markov models

Authors: Rahul Singh, Yongxin Chen Affiliations: School of Aerospace Engineering, Georgia Institute of Technology Link: https://arxiv.org/abs/2107.11662 Abstract: We consider inference problems for a class of continuous state collective hidden Markov models, where the data is recorded in aggregate (collective) form generated by a large population of individuals following the same dynamics. We propose an aggregate inference algorithm called the collective Gaussian forward-backward algorithm, extending the recently proposed Sinkhorn belief propagation algorithm to models characterized by Gaussian densities. Our algorithm enjoys a convergence guarantee. In addition, it reduces to the standard Kalman filter when the observations are generated by a single individual. The efficacy of the proposed algorithm is demonstrated through multiple experiments.

【7】 Finite-time Analysis of Globally Nonstationary Multi-Armed Bandits

Authors: Junpei Komiyama, Edouard Fouché, Junya Honda Link: https://arxiv.org/abs/2107.11419 Abstract: We consider nonstationary multi-armed bandit problems where the model parameters of the arms change over time. We introduce the adaptive resetting bandit (ADR-bandit), which is a class of bandit algorithms that leverages adaptive windowing techniques from the data stream community. We first provide new guarantees on the quality of estimators resulting from adaptive windowing techniques, which are of independent interest in the data mining community. Furthermore, we conduct a finite-time analysis of ADR-bandit in two typical environments: an abrupt environment where changes occur instantaneously and a gradual environment where changes occur progressively. We demonstrate that ADR-bandit has nearly optimal performance when the abrupt or gradual changes occur in a coordinated manner that we call global changes. We demonstrate that forced exploration is unnecessary when we restrict the interest to the global changes. Unlike the existing nonstationary bandit algorithms, ADR-bandit has optimal performance in stationary environments as well as nonstationary environments with global changes. Our experiments show that the proposed algorithms outperform the existing approaches in synthetic and real-world environments.

Detection (6 papers)

【1】 Are Bayesian neural networks intrinsically good at out-of-distribution detection?

Authors: Christian Henning, Francesco D'Angelo, Benjamin F. Grewe Affiliations: Institute of Neuroinformatics, University of Zürich and ETH Zürich Comments: Published at UDL Workshop, ICML 2021 Link: https://arxiv.org/abs/2107.12248 Abstract: The need to avoid confident predictions on unfamiliar data has sparked interest in out-of-distribution (OOD) detection. It is widely assumed that Bayesian neural networks (BNN) are well suited for this task, as the endowed epistemic uncertainty should lead to disagreement in predictions on outliers. In this paper, we question this assumption and provide empirical evidence that proper Bayesian inference with common neural network architectures does not necessarily lead to good OOD detection. To circumvent the use of approximate inference, we start by studying the infinite-width case, where Bayesian inference can be exact considering the corresponding Gaussian process. Strikingly, the kernels induced under common architectural choices lead to uncertainties that do not reflect the underlying data generating process and are therefore unsuited for OOD detection. Finally, we study finite-width networks using HMC, and observe OOD behavior that is consistent with the infinite-width case. Overall, our study discloses fundamental problems when naively using BNNs for OOD detection and opens interesting avenues for future research.

【2】 AA3DNet: Attention Augmented Real Time 3D Object Detection

Authors: Abhinav Sagar Affiliations: Independent Researcher, Mumbai, India Comments: 12 pages, 8 tables, 6 figures Link: https://arxiv.org/abs/2107.12137 Abstract: In this work, we address the problem of 3D object detection from point cloud data in real time. For autonomous vehicles to work, it is very important for the perception component to detect real-world objects with both high accuracy and fast inference. We propose a novel neural network architecture along with the training and optimization details for detecting 3D objects using point cloud data. We present the anchor design along with the custom loss functions used in this work. A combination of spatial and channel-wise attention modules is used in this work. We use the Kitti 3D Birds Eye View dataset for benchmarking and validating our results. Our method surpasses the previous state of the art in this domain both in terms of average precision and speed, running at > 30 FPS. Finally, we present an ablation study to demonstrate that the performance of our network is generalizable. This makes it a feasible option to be deployed in real-time applications like self-driving cars.

【3】 Decision-forest voting scheme for classification of rare classes in network intrusion detection

Authors: Jan Brabec, Lukas Machlica Affiliations: Cisco Systems, Inc., Charles Square Center, Karlovo Namesti Street, Prague, Czech Republic; Czech Technical University in Prague, Czech Republic Link: https://arxiv.org/abs/2107.11862 Abstract: In this paper, Bayesian based aggregation of decision trees in an ensemble (decision forest) is investigated. The focus is laid on multi-class classification with the number of samples significantly skewed toward one of the classes. The algorithm leverages out-of-bag datasets to estimate prediction errors of individual trees, which are then used in accordance with the Bayes rule to refine the decision of the ensemble. The algorithm takes the prevalence of individual classes into account and does not require setting any additional parameters related to class weights or decision-score thresholds. Evaluation is based on publicly available datasets as well as on a proprietary dataset comprising network traffic telemetry from hundreds of enterprise networks with over a million users overall. The aim is to increase the detection capabilities of an operating malware detection system. While we were able to keep the precision of the system higher than 94%, that is, only 6 out of 100 detections shown to the network administrator are false alarms, we were able to achieve an increase of approximately 7% in the number of detections. The algorithm effectively handles large amounts of data, and can be used in conjunction with most of the state-of-the-art algorithms used to train decision forests.
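The Bayes-rule aggregation described in the entry above can be illustrated with a small sketch. This is not the authors' algorithm: it assumes each tree's votes are conditionally independent given the true class, that per-tree confusion rates have already been estimated (for example on out-of-bag data) and smoothed so that no entry is zero, and the function name `bayes_vote` and the numbers in the usage note are hypothetical.

```python
import math


def bayes_vote(votes, confusions, priors):
    """Combine tree votes into a class posterior with the Bayes rule.

    votes[t]            -- class index predicted by tree t
    confusions[t][c][j] -- estimated P(tree t predicts j | true class c),
                           assumed strictly positive (smoothed)
    priors[c]           -- class prevalence P(c)
    """
    n_classes = len(priors)
    # Work in log space to avoid underflow with many trees
    log_post = [math.log(p) for p in priors]
    for vote, conf in zip(votes, confusions):
        for c in range(n_classes):
            log_post[c] += math.log(conf[c][vote])
    # Normalise with the log-sum-exp trick
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, with priors `[0.99, 0.01]` and three trees that each have a 5% false-alarm rate and a 90% hit rate (`[[0.95, 0.05], [0.10, 0.90]]`), a single vote for the rare class leaves its posterior low, while three agreeing votes push it close to one. This is how class prevalence enters the decision without a hand-tuned threshold.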

【4】 Improving Variational Autoencoder based Out-of-Distribution Detection for Embedded Real-time Applications

Authors: Yeli Feng, Daniel Jun Xian Ng, Arvind Easwaran Affiliations: Nanyang Technological University Link: https://arxiv.org/abs/2107.11750 Abstract: Uncertainties in machine learning are a significant roadblock for its application in safety-critical cyber-physical systems (CPS). One source of uncertainty arises from distribution shifts in the input data between training and test scenarios. Detecting such distribution shifts in real-time is an emerging approach to address the challenge. The high dimensional input space in CPS applications involving imaging adds extra difficulty to the task. Generative learning models are widely adopted for the task, namely out-of-distribution (OoD) detection. To improve the state-of-the-art, we studied existing proposals from both machine learning and CPS fields. In the latter, safety monitoring in real-time for autonomous driving agents has been a focus. Exploiting the spatiotemporal correlation of motion in videos, we can robustly detect hazardous motion around autonomous driving agents. Inspired by the latest advances in the Variational Autoencoder (VAE) theory and practice, we tapped into the prior knowledge in data to further boost OoD detection's robustness. Comparison studies over nuScenes and Synthia data sets show our methods significantly improve detection capabilities of OoD factors unique to driving scenarios, 42% better than state-of-the-art approaches. Our model also generalized near-perfectly, 97% better than the state-of-the-art across the real-world and simulation driving data sets experimented. Finally, we customized one proposed method into a twin-encoder model that can be deployed to resource limited embedded devices for real-time OoD detection. Its execution time was reduced over four times in low-precision 8-bit integer inference, while detection capability is comparable to its corresponding floating-point model.

【5】 WiP Abstract: Robust Out-of-distribution Motion Detection and Localization in Autonomous CPS

Authors: Yeli Feng, Arvind Easwaran Affiliations: Nanyang Technological University Link: https://arxiv.org/abs/2107.11736 Abstract: Highly complex deep learning models are increasingly integrated into modern cyber-physical systems (CPS), many of which have strict safety requirements. One problem arising from this is that deep learning lacks interpretability, operating as a black box. The reliability of deep learning is heavily impacted by how well the model training data represents runtime test data, especially when the input space dimension is as high as that of natural images. In response, we propose a robust out-of-distribution (OOD) detection framework. Our approach detects unusual movements from driving video in real-time by combining classical optic flow operation with representation learning via variational autoencoder (VAE). We also design a method to locate OOD factors in images. Evaluation on a driving simulation data set shows that our approach is statistically more robust than related works.

【6】 Automatic Detection Of Noise Events at Shooting Range Using Machine Learning

Authors: Jon Nordby, Fabian Nemazi, Dag Rieber Affiliations: Soundsensing AS, Norwegian University of Life Sciences, Rieber Prosjekt AS Comments: Accepted at 27th International Congress of Sound and Vibration (ICSV27) Link: https://arxiv.org/abs/2107.11453 Abstract: Outdoor shooting ranges are subject to noise regulations from local and national authorities. Restrictions found in these regulations may include limits on times of activities, the overall number of noise events, as well as limits on the number of events depending on the class of noise or activity. A noise monitoring system may be used to track overall sound levels, but rarely provides the ability to detect activity or count the number of events, which is required to compare directly with such regulations. This work investigates the feasibility and performance of an automatic detection system to count noise events. An empirical evaluation was done by collecting data at a newly constructed shooting range and training facility. The data includes tests of multiple weapon configurations from small firearms to high caliber rifles and explosives, at multiple source positions, and collected on multiple different days. Several alternative machine learning models are tested, using as inputs time-series of standard acoustic indicators such as A-weighted sound levels and 1/3 octave spectrogram, and classifiers such as Logistic Regression and Convolutional Neural Networks. Performance for the various alternatives is reported in terms of the False Positive Rate and False Negative Rate. The detection performance was found to be satisfactory for use in automatic logging of time-periods with training activity.

Classification | Recognition (5 papers)

【1】 Workpiece Image-based Tool Wear Classification in Blanking Processes Using Deep Convolutional Neural Networks

Authors: Dirk Alexander Molitor, Christian Kubik, Ruben Helmut Hetfleisch, Peter Groche Affiliations: Institute for Production Engineering and Forming Machines, Technical University of Darmstadt, Darmstadt, Germany Link: https://arxiv.org/abs/2107.12034 Abstract: Blanking processes belong to the most widely used manufacturing techniques due to their economic efficiency. Their economic viability depends to a large extent on the resulting product quality and the associated customer satisfaction as well as on possible downtimes. In particular, the occurrence of increased tool wear reduces the product quality and leads to downtimes, which is why considerable research has been carried out in recent years with regard to wear detection. While processes have widely been monitored based on force and acceleration signals, a new approach is pursued in this paper. Blanked workpieces manufactured by punches with 16 different wear states are photographed and then used as inputs for Deep Convolutional Neural Networks to classify wear states. The results show that wear states can be predicted with surprisingly high accuracy, opening up new possibilities and research opportunities for tool wear monitoring of blanking processes.

【2】 Joint Direction and Proximity Classification of Overlapping Sound Events from Binaural Audio

Authors: Daniel Aleksander Krause, Archontis Politis, Annamaria Mesaros Affiliations: Tampere University, Korkeakoulunkatu, Tampere, Finland Link: https://arxiv.org/abs/2107.12033 Abstract: Sound source proximity and distance estimation are of great interest in many practical applications, since they provide significant information for acoustic scene analysis. As both tasks share complementary qualities, ensuring efficient interaction between these two is crucial for a complete picture of an aural environment. In this paper, we aim to investigate several ways of performing joint proximity and direction estimation from binaural recordings, both defined as coarse classification problems based on Deep Neural Networks (DNNs). Considering the limitations of binaural audio, we propose two methods of splitting the sphere into angular areas in order to obtain a set of directional classes. For each method we study different model types to acquire information about the direction-of-arrival (DoA). Finally, we propose various ways of combining the proximity and direction estimation problems into a joint task providing temporal information about the onsets and offsets of the appearing sources. Experiments are performed for a synthetic reverberant binaural dataset consisting of up to two overlapping sound events.

【3】 Preliminary Steps Towards Federated Sentiment Classification

Authors: Xin-Chun Li, De-Chuan Zhan, Yunfeng Shao, Bingshuai Li, Shaoming Song Affiliations: State Key Laboratory for Novel Software Technology, Nanjing University; Huawei Noah's Ark Lab Link: https://arxiv.org/abs/2107.11956 Abstract: Automatically mining sentiment tendency contained in natural language is a fundamental research topic for some artificial intelligence applications, where solutions alternate with challenges. Transfer learning and multi-task learning techniques have been leveraged to mitigate the supervision sparsity and collaborate multiple heterogeneous domains correspondingly. In recent years, the sensitive nature of users' private data raises another challenge for sentiment classification, i.e., data privacy protection. In this paper, we resort to federated learning for multiple domain sentiment classification under the constraint that the corpora must be stored on decentralized devices. In view of the heterogeneous semantics across multiple parties and the peculiarities of word embedding, we pertinently provide corresponding solutions. First, we propose a Knowledge Transfer Enhanced Private-Shared (KTEPS) framework for better model aggregation and personalization in federated sentiment classification. Second, we propose KTEPS$^\star$ with the consideration of the rich semantic and huge embedding size properties of word vectors, utilizing Projection-based Dimension Reduction (PDR) methods for privacy protection and efficient transmission simultaneously. We propose two federated sentiment classification scenes based on public benchmarks, and verify the superiorities of our proposed methods with abundant experimental investigations.

【4】 A Model-Agnostic Algorithm for Bayes Error Determination in Binary Classification

Authors: Umberto Michelucci, Michela Sperti, Dario Piga, Francesca Venturini, Marco A. Deriu Affiliations: TOELT llc, Birchlenstr., Dübendorf, Switzerland; PolitoBIOMed Lab, Department of Mechanical and Aerospace Engineering, Politecnico di Torino, Turin, Italy; Institute of Applied Mathematics and Physics, Zurich University of Applied Sciences Comments: 21 pages Link: https://arxiv.org/abs/2107.11609 Abstract: This paper presents the intrinsic limit determination algorithm (ILD Algorithm), a novel technique to determine the best possible performance, measured in terms of the AUC (area under the ROC curve) and accuracy, that can be obtained from a specific dataset in a binary classification problem with categorical features regardless of the model used. This limit, namely the Bayes error, is completely independent of any model used and describes an intrinsic property of the dataset. The ILD algorithm thus provides important information regarding the prediction limits of any binary classification algorithm when applied to the considered dataset. In this paper the algorithm is described in detail, its entire mathematical framework is presented and the pseudocode is given to facilitate its implementation. Finally, an example with a real dataset is given.
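As a rough illustration of why a model-independent accuracy limit exists for categorical features, the sketch below computes the empirical Bayes error of a labelled dataset: within each distinct feature combination, the best any classifier can do is predict the majority label. This is the textbook computation on the empirical distribution, not the ILD algorithm from the paper, and the function name `empirical_bayes_error` is hypothetical.

```python
from collections import Counter, defaultdict


def empirical_bayes_error(rows, labels):
    """Smallest achievable error rate on this dataset for binary labels 0/1.

    rows   -- iterable of categorical feature tuples (or lists)
    labels -- iterable of 0/1 labels, same length as rows
    """
    # Count labels separately for every distinct feature combination
    counts = defaultdict(Counter)
    n = 0
    for x, y in zip(rows, labels):
        counts[tuple(x)][y] += 1
        n += 1
    # In each bucket the minority label is unavoidably misclassified
    unavoidable = sum(min(c[0], c[1]) for c in counts.values())
    return unavoidable / n
```

With `rows = [("a",), ("a",), ("a",), ("b",), ("b",), ("b",)]` and `labels = [0, 0, 1, 1, 1, 0]`, each bucket contains one minority example, so no model can exceed an accuracy of 4/6 on this data.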

【5】 Using Deep Learning Techniques and Inferential Speech Statistics for AI Synthesised Speech Recognition

Authors: Arun Kumar Singh, Priyanka Singh, Karan Nathwani Comments: 13 Pages, 13 Figures, 6 Tables. arXiv admin note: substantial text overlap with arXiv:2009.01934 Link: https://arxiv.org/abs/2107.11412 Abstract: The recent developments in technology have rewarded us with amazing audio synthesis models like TACOTRON and WAVENETS. On the other side, it poses greater threats such as speech clones and deep fakes, that may go undetected. To tackle these alarming situations, there is an urgent need to propose models that can help discriminate a synthesized speech from an actual human speech and also identify the source of such a synthesis. Here, we propose a model based on Convolutional Neural Network (CNN) and Bidirectional Recurrent Neural Network (BiRNN) that helps to achieve both the aforementioned objectives. The temporal dependencies present in AI synthesized speech are exploited using Bidirectional RNN and CNN. The model outperforms the state-of-the-art approaches by classifying the AI synthesized audio from real human speech with an error rate of 1.9% and detecting the underlying architecture with an accuracy of 97%.

Representation (1 paper)

【1】 Facetron: Multi-speaker Face-to-Speech Model based on Cross-modal Latent Representations

Authors: Se-Yun Um, Jihyun Kim, Jihyun Lee, Sangshin Oh, Kyungguen Byun, Hong-Goo Kang Affiliations: Dept. of E.E., Yonsei University, Seoul, Korea Comments: 10 pages (including references), 3 figures Link: https://arxiv.org/abs/2107.12003 Abstract: In this paper, we propose an effective method to synthesize speaker-specific speech waveforms by conditioning on videos of an individual's face. Using a generative adversarial network (GAN) with linguistic and speaker characteristic features as auxiliary conditions, our method directly converts face images into speech waveforms under an end-to-end training framework. The linguistic features are extracted from lip movements using a lip-reading model, and the speaker characteristic features are predicted from face images using cross-modal learning with a pre-trained acoustic model. Since these two features are uncorrelated and controlled independently, we can flexibly synthesize speech waveforms whose speaker characteristics vary depending on the input face images. Therefore, our method can be regarded as a multi-speaker face-to-speech waveform model. We show the superiority of our proposed model over conventional methods in terms of both objective and subjective evaluation results. Specifically, we evaluate the performances of the linguistic feature and the speaker characteristic generation modules by measuring the accuracy of automatic speech recognition and automatic speaker/gender recognition tasks, respectively. We also evaluate the naturalness of the synthesized speech waveforms using a mean opinion score (MOS) test.

Optimization | Convergence (6 papers)

【1】 The Holy Grail of Multi-Robot Planning: Learning to Generate Online-Scalable Solutions from Offline-Optimal Experts

Authors: Amanda Prorok, Jan Blumenkamp, Qingbiao Li, Ryan Kortvelesy, Zhe Liu, Ethan Stump Affiliations: Department of Computer Science and Technology, University of Cambridge, UK; DEVCOM Army Research Laboratory (ARL), Maryland, USA Link: https://arxiv.org/abs/2107.12254 Abstract: Many multi-robot planning problems are burdened by the curse of dimensionality, which compounds the difficulty of applying solutions to large-scale problem instances. The use of learning-based methods in multi-robot planning holds great promise as it enables us to offload the online computational burden of expensive, yet optimal solvers, to an offline learning procedure. Simply put, the idea is to train a policy to copy an optimal pattern generated by a small-scale system, and then transfer that policy to much larger systems, in the hope that the learned strategy scales, while maintaining near-optimal performance. Yet, a number of issues impede us from leveraging this idea to its full potential. This blue-sky paper elaborates some of the key challenges that remain.

【2】 A binary variant of gravitational search algorithm and its application to windfarm layout optimization problem

Authors: Susheel Kumar Joshi, Jagdish Chand Bansal Link: https://arxiv.org/abs/2107.11844 Abstract: In the binary search space, the GSA framework encounters the shortcomings of stagnation, diversity loss, premature convergence and high time complexity. To address these issues, a novel binary variant of GSA called 'A novel neighbourhood archives embedded gravitational constant in GSA for binary search space (BNAGGSA)' is proposed in this paper. In BNAGGSA, the novel fitness-distance based social interaction strategy produces a self-adaptive step size mechanism through which the agent moves towards the optimal direction with the optimal step size, as per its current search requirement. The performance of the proposed algorithm is compared with the two binary variants of GSA over 23 well-known benchmark test problems. The experimental results and statistical analyses prove the supremacy of BNAGGSA over the compared algorithms. Furthermore, to check the applicability of the proposed algorithm in solving real-world applications, a windfarm layout optimization problem is considered. Two case studies with two different wind data sets of two different wind sites are considered for experiments.

【3】 Power of human-algorithm collaboration in solving combinatorial optimization problems

Authors: Tapani Toivonen Affiliations: University of Eastern Finland, School of Computing Comments: 19 pages Link: https://arxiv.org/abs/2107.11784 Abstract: Many combinatorial optimization problems are often considered intractable to solve exactly or by approximation. An example of such a problem is maximum clique which -- under standard assumptions in complexity theory -- cannot be solved in sub-exponential time or be approximated within a polynomial factor efficiently. We show that if a polynomial time algorithm can query informative Gaussian priors from an expert $\mathrm{poly}(n)$ times, then a class of combinatorial optimization problems can be solved efficiently in expectation up to a multiplicative factor $\epsilon$ where $\epsilon$ is an arbitrary constant. While our proposed methods are merely theoretical, they cast new light on how to approach solving these problems that have been usually considered intractable.

【4】 Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition

Authors: Lucas Liebenwein, Alaa Maalouf, Oren Gal, Dan Feldman, Daniela Rus Affiliations: MIT CSAIL, University of Haifa Link: https://arxiv.org/abs/2107.11442 Abstract: We present a novel global compression framework for deep neural networks that automatically analyzes each layer to identify the optimal per-layer compression ratio, while simultaneously achieving the desired overall compression. Our algorithm hinges on the idea of compressing each convolutional (or fully-connected) layer by slicing its channels into multiple groups and decomposing each group via low-rank decomposition. At the core of our algorithm is the derivation of layer-wise error bounds from the Eckart-Young-Mirsky theorem. We then leverage these bounds to frame the compression problem as an optimization problem where we wish to minimize the maximum compression error across layers and propose an efficient algorithm towards a solution. Our experiments indicate that our method outperforms existing low-rank compression approaches across a wide range of networks and data sets. We believe that our results open up new avenues for future research into the global performance-size trade-offs of modern neural networks. Our code is available at https://github.com/lucaslie/torchprune.
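For readers unfamiliar with the theorem named in the entry above, its standard statement is reproduced here. This is the classical result only, not the paper's layer-wise bound; the symbols $W$, $W_k$, $\sigma_i$, $u_i$, $v_i$ are notation chosen for this note. Let $W$ have singular value decomposition $W = \sum_i \sigma_i u_i v_i^\top$ with $\sigma_1 \ge \sigma_2 \ge \dots \ge 0$, and let $W_k = \sum_{i \le k} \sigma_i u_i v_i^\top$ be its rank-$k$ truncation.

```latex
\min_{\operatorname{rank}(B) \le k} \lVert W - B \rVert_2
  = \lVert W - W_k \rVert_2
  = \sigma_{k+1},
\qquad
\min_{\operatorname{rank}(B) \le k} \lVert W - B \rVert_F
  = \lVert W - W_k \rVert_F
  = \sqrt{\sum_{i > k} \sigma_i^2}.
```

In words, the error of the best rank-$k$ approximation of a weight matrix is determined entirely by the singular values that are discarded, which is what makes a per-layer error bound computable from the spectrum of each layer.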

【5】 Enhanced Bilevel Optimization via Bregman Distance 标题:基于Bregman距离的增强型双层优化

Authors: Feihu Huang, Heng Huang Affiliations: University of Pittsburgh, Department of Electrical and Computer Engineering Comments: 15 pages, 2 tables Link: https://arxiv.org/abs/2107.12301 Abstract: Bilevel optimization has been widely applied to many machine learning problems such as hyperparameter optimization, policy optimization and meta learning. Although many bilevel optimization methods have recently been proposed to solve bilevel optimization problems, they still suffer from high computational complexities and do not consider the more general bilevel problems with nonsmooth regularization. In this paper, we therefore propose a class of efficient bilevel optimization methods based on Bregman distance. In our methods, we use the mirror descent iteration to solve the outer subproblem of the bilevel problem by using strongly-convex Bregman functions. Specifically, we propose a bilevel optimization method based on Bregman distance (BiO-BreD) for solving deterministic bilevel problems, which achieves lower computational complexity than the best known results. We also propose a stochastic bilevel optimization method (SBiO-BreD) for solving stochastic bilevel problems based on stochastic approximated gradients and Bregman distance. Further, we propose an accelerated version of the SBiO-BreD method (ASBiO-BreD) by using the variance-reduced technique. Moreover, we prove that ASBiO-BreD outperforms the best known computational complexities with respect to the condition number $\kappa$ and the target accuracy $\epsilon$ for finding an $\epsilon$-stationary point of nonconvex-strongly-convex bilevel problems. In particular, our methods can solve bilevel optimization problems with nonsmooth regularization with a lower computational complexity.
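The mirror descent iteration with a Bregman distance can be illustrated on a much simpler, single-level problem. The sketch below is not the BiO-BreD algorithm; it assumes the negative-entropy mirror map on the probability simplex (whose Bregman distance is the KL divergence), and the function name, step size and toy objective are all hypothetical choices made for illustration.

```python
import math

def entropic_mirror_descent(grad, x0, step, iters):
    """Mirror descent on the probability simplex with the negative-entropy
    mirror map, whose Bregman distance is the KL divergence."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        # Multiplicative update: closed-form solution of
        # argmin_y <g, y> + (1/step) * KL(y || x) over the simplex.
        w = [xi * math.exp(-step * gi) for xi, gi in zip(x, g)]
        total = sum(w)
        x = [wi / total for wi in w]
    return x

# Toy objective f(x) = <costs, x>; over the simplex its minimiser puts
# all mass on the coordinate with the smallest cost (index 1 here).
costs = [3.0, 1.0, 2.0]
x_final = entropic_mirror_descent(lambda x: costs, [1 / 3, 1 / 3, 1 / 3],
                                  step=0.5, iters=200)
print([round(v, 4) for v in x_final])
```

Choosing a different strongly convex Bregman function changes only the update inside the loop; with the squared Euclidean norm the step reduces to ordinary projected gradient descent.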

【6】 Brain Inspired Computing Approach for the Optimization of the Thin Film Thickness of Polystyrene on the Glass Substrates

Authors: Akshansh Mishra, Devarrishi Dixit Affiliations: Centre for Artificial Intelligent Manufacturing Systems, Stir Research Technologies, India, Department of Materials Science Engineering, Christian Albrechts University zu Kiel, Germany Link: https://arxiv.org/abs/2107.12156 Abstract: The advent of machine learning is leaving a deep impact on various sectors, including the materials science domain. The present paper highlights the application of various supervised machine learning regression algorithms, such as polynomial regression, decision tree regression, random forest, support vector regression, and artificial neural networks, to determine the thin film thickness of polystyrene on glass substrates. The results show that the polynomial regression algorithm outperforms all other machine learning models, yielding a coefficient of determination of approximately 0.96 and a mean squared error of 0.04.
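The two metrics quoted in this abstract, the coefficient of determination and the mean squared error, can be computed as in the minimal sketch below. The function names and the sample numbers are made up for illustration and are not data from the paper.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def coefficient_of_determination(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, where SS_tot is measured around the mean."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical measured thicknesses and model predictions (made-up numbers).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(mean_squared_error(y_true, y_pred))
print(coefficient_of_determination(y_true, y_pred))
```

An $R^2$ close to 1 means the model explains most of the variance around the mean, while the mean squared error is expressed in squared units of the target, so the two numbers are not directly comparable across datasets.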

Prediction | Estimation (5 papers)

【1】 Predicting Influential Higher-Order Patterns in Temporal Network Data

Authors: Christoph Gote, Vincenzo Perri, Ingo Scholtes Affiliations: Chair of Systems Design, ETH Zurich, Zurich, Switzerland Comments: 28 pages, 7 figures, 2 tables Link: https://arxiv.org/abs/2107.12100 Abstract: Networks are frequently used to model complex systems comprised of interacting elements. While links capture the topology of direct interactions, the true complexity of many systems originates from higher-order patterns in paths by which nodes can indirectly influence each other. Path data, representing ordered sequences of consecutive direct interactions, can be used to model these patterns. However, to avoid overfitting, such models should only consider those higher-order patterns for which the data provide sufficient statistical evidence. On the other hand, we hypothesise that network models, which capture only direct interactions, underfit higher-order patterns present in data. Consequently, both approaches are likely to misidentify influential nodes in complex networks. We contribute to this issue by proposing eight centrality measures based on MOGen, a multi-order generative model that accounts for all paths up to a maximum distance but disregards paths at higher distances. We compare MOGen-based centralities to equivalent measures for network models and path data in a prediction experiment where we aim to identify influential nodes in out-of-sample data. Our results show strong evidence supporting our hypothesis. MOGen consistently outperforms both the network model and path-based prediction. We further show that the performance difference between MOGen and the path-based approach disappears if we have sufficient observations, confirming that the error is due to overfitting.

【2】 Deep Learning Explicit Differentiable Predictive Control Laws for Buildings

Authors: Jan Drgona, Aaron Tuor, Soumya Vasisht, Elliott Skomski, Draguna Vrabie Affiliations: Pacific Northwest National Laboratory, Richland, Washington, USA Link: https://arxiv.org/abs/2107.11843 Abstract: We present a differentiable predictive control (DPC) methodology for learning constrained control laws for unknown nonlinear systems. DPC poses an approximate solution to multiparametric programming problems emerging from explicit nonlinear model predictive control (MPC). Contrary to approximate MPC, DPC does not require supervision by an expert controller. Instead, a system dynamics model is learned from the observed system's dynamics, and the neural control law is optimized offline by leveraging the differentiable closed-loop system model. The combination of a differentiable closed-loop system and penalty methods for constraint handling of system outputs and inputs allows us to optimize the control law's parameters directly by backpropagating the economic MPC loss through the learned system model. The control performance of the proposed DPC method is demonstrated in simulation using a learned model of multi-zone building thermal dynamics.

【3】 Protein-RNA interaction prediction with deep learning: Structure matters

Authors: Junkang Wei, Siyuan Chen, Licheng Zong, Xin Gao, Yu Li Affiliations: Department of Computer Science and Engineering (CSE), The Chinese University of Hong Kong (CUHK), King Abdullah University of Science and Technology (KAUST), Saudi Arabia, The CUHK Shenzhen Research Institute Link: https://arxiv.org/abs/2107.12243 Abstract: Protein-RNA interactions are of vital importance to a variety of cellular activities. Both experimental and computational techniques have been developed to study the interactions. Due to the limitations of previous databases, especially the lack of protein structure data, most of the existing computational methods rely heavily on sequence data, with only a small portion of the methods utilizing structural information. Recently, AlphaFold has revolutionized the entire protein and biology field. Foreseeably, protein-RNA interaction prediction will also be advanced significantly in the upcoming years. In this work, we give a thorough review of this field, surveying both the binding site and binding preference prediction problems and covering the commonly used datasets, features, and models. We also point out the potential challenges and opportunities in this field. This survey summarizes the development of the RBP-RNA interaction field in the past and foresees its future development in the post-AlphaFold era.

【4】 Robust Regularized Locality Preserving Indexing for Fiedler Vector Estimation

Authors: Aylin Tastan, Michael Muma, Abdelhak M. Zoubir Affiliations: Technische Universität Darmstadt Link: https://arxiv.org/abs/2107.12070 Abstract: The Fiedler vector of a connected graph is the eigenvector associated with the algebraic connectivity of the graph Laplacian and it provides substantial information to learn the latent structure of a graph. In real-world applications, however, the data may be subject to heavy-tailed noise and outliers, which degrade the structure of the Fiedler vector estimate. We design a Robust Regularized Locality Preserving Indexing (RRLPI) method for Fiedler vector estimation that aims to approximate the nonlinear manifold structure of the Laplace-Beltrami operator while minimizing the negative impact of outliers. First, an analysis is conducted of the effects of two fundamental outlier types on the eigen-decomposition of block affinity matrices, which are essential in cluster analysis. Then, an error model is formulated and a robust Fiedler vector estimation algorithm is developed. An unsupervised penalty parameter selection algorithm is proposed that leverages the geometric structure of the projection space to perform robust regularized Fiedler estimation. The performance of RRLPI is benchmarked against existing competitors in terms of detection probability, partitioning quality, image segmentation capability, robustness and computation time using a large variety of synthetic and real data experiments.
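For readers unfamiliar with the Fiedler vector, the sketch below computes the plain, non-robust version for a small graph. It is not the RRLPI estimator; it assumes an unweighted connected graph, uses power iteration on a shifted Laplacian, and the function name and example graph are hypothetical.

```python
def fiedler_vector(adj, iters=500):
    """Approximate the Fiedler vector of a connected undirected graph.

    adj is a symmetric 0/1 adjacency matrix given as a list of lists.
    Runs power iteration on c*I - L with the constant vector projected out,
    so the dominant remaining eigenvector belongs to the second-smallest
    eigenvalue of the Laplacian L.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    c = 2.0 * max(deg)  # upper bound on the largest Laplacian eigenvalue
    v = [float(i) for i in range(n)]  # deterministic start (an assumption)
    for _ in range(iters):
        # Remove the component along the all-ones eigenvector of L.
        mean = sum(v) / n
        v = [vi - mean for vi in v]
        # Apply (c*I - L), using (L v)_i = deg_i * v_i - sum_j adj_ij * v_j.
        v = [c * v[i] - (deg[i] * v[i] - sum(adj[i][j] * v[j] for j in range(n)))
             for i in range(n)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return v

# Two triangles {0,1,2} and {3,4,5} joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

v = fiedler_vector(adj)
print([x > 0 for x in v])  # the sign pattern separates the two triangles
```

The sign of each entry gives a two-way partition of the graph, which is why outliers that perturb this eigenvector matter for clustering. A start vector that happens to be orthogonal to the Fiedler vector would make this simple iteration fail, so a production implementation would use a proper eigensolver.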

【5】 TargetNet: Functional microRNA Target Prediction with Deep Neural Networks

Authors: Seonwoo Min, Byunghan Lee, Sungroh Yoon Affiliations: Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea, Department of Electronic and IT Media Engineering, Seoul National University of Science and Technology, Seoul, South Korea Comments: 7 pages, under review Link: https://arxiv.org/abs/2107.11381 Abstract: MicroRNAs (miRNAs) play pivotal roles in gene expression regulation by binding to target sites of messenger RNAs (mRNAs). While identifying functional targets of miRNAs is of utmost importance, their prediction remains a great challenge. Previous computational algorithms have major limitations. They use conservative candidate target site (CTS) selection criteria mainly focusing on canonical site types, rely on laborious and time-consuming manual feature extraction, and do not fully capitalize on the information underlying miRNA-CTS interactions. In this paper, we introduce TargetNet, a novel deep learning-based algorithm for functional miRNA target prediction. To address the limitations of previous approaches, TargetNet has three key components: (1) relaxed CTS selection criteria accommodating irregularities in the seed region, (2) a novel miRNA-CTS sequence encoding scheme incorporating extended seed region alignments, and (3) a deep residual network-based prediction model. The proposed model was trained with miRNA-CTS pair datasets and evaluated with miRNA-mRNA pair datasets. TargetNet advances the previous state-of-the-art algorithms used in functional miRNA target classification. Furthermore, it demonstrates great potential for distinguishing high-functional miRNA targets.

Other Neural Networks | Deep Learning | Models | Modeling (23 papers)

【1】 Sisyphus: A Cautionary Tale of Using Low-Degree Polynomial Activations in Privacy-Preserving Deep Learning

Authors: Karthik Garimella, Nandan Kumar Jha, Brandon Reagen Affiliations: New York University Comments: 2 figures and 2 tables Link: https://arxiv.org/abs/2107.12342 Abstract: Privacy concerns in client-server machine learning have given rise to private inference (PI), where neural inference occurs directly on encrypted inputs. PI protects clients' personal data and the server's intellectual property. A common practice in PI is to use garbled circuits to compute nonlinear functions privately, namely ReLUs. However, garbled circuits suffer from high storage, bandwidth, and latency costs. To mitigate these issues, PI-friendly polynomial activation functions have been employed to replace ReLU. In this work, we ask: Is it feasible to substitute all ReLUs with low-degree polynomial activation functions for building deep, privacy-friendly neural networks? We explore this question by analyzing the challenges of substituting ReLUs with polynomials, ranging from simple drop-and-replace solutions to novel, more involved replace-and-retrain strategies. We examine the limitations of each method and provide commentary on the use of polynomial activation functions for PI. We find all evaluated solutions suffer from the escaping activation problem: forward activation values inevitably begin to expand at an exponential rate away from stable regions of the polynomials, which leads to exploding values (NaNs) or poor approximations.
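The escaping activation problem can be seen in a deliberately tiny toy model. The sketch below is not one of the paper's experiments: it assumes a scalar "network" whose weights are all fixed to 1, and compares ReLU with the degree-2 polynomial $x^2$ over 12 layers. The function names are hypothetical.

```python
def forward(x, activation, depth):
    """Apply the same scalar activation `depth` times (weights fixed to 1)."""
    for _ in range(depth):
        x = activation(x)
    return x

def relu(x):
    return max(0.0, x)

def square(x):
    return x * x  # degree-2 polynomial activation

print(forward(1.5, relu, 12))    # ReLU leaves a positive input unchanged
print(forward(1.5, square, 12))  # grows as 1.5 ** (2 ** 12) and overflows
print(forward(0.5, square, 12))  # shrinks as 0.5 ** (2 ** 12) and underflows
```

Only inputs of magnitude exactly 1 stay put under repeated squaring, so any small drift away from that point is amplified layer after layer, which is the qualitative behaviour the abstract describes.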

【2】 From Implicit to Explicit feedback: A deep neural network for modeling sequential behaviours and long-short term preferences of online users

Authors: Quyen Tran, Lam Tran, Linh Chu Hai, Linh Ngo Van, Khoat Than Affiliations: Hanoi University of Science and Technology, No. , Dai Co Viet road, Hanoi, Vietnam, VCCorp Corporation, Vietnam Comments: 17 pages Link: https://arxiv.org/abs/2107.12325 Abstract: In this work, we examine the advantages of using multiple types of behaviour in recommendation systems. Intuitively, each user has to perform some implicit actions (e.g., click) before making an explicit decision (e.g., purchase). Previous studies showed that implicit and explicit feedback play different roles in producing a useful recommendation. However, these studies either exploit implicit and explicit behaviour separately or ignore the semantics of sequential interactions between users and items. In addition, we start from the hypothesis that a user's preference at a given time is a combination of long-term and short-term interests. In this paper, we propose several deep learning architectures. The first one is Implicit to Explicit (ITE), which exploits users' interests through the sequence of their actions. The others are two versions of ITE with a Bidirectional Encoder Representations from Transformers based (BERT-based) architecture, called BERT-ITE and BERT-ITE-Si, which combine users' long- and short-term preferences without and with side information to enhance user representation. The experimental results show that our models outperform previous state-of-the-art ones and also support our views on the effectiveness of exploiting the implicit to explicit order as well as combining long- and short-term preferences in two large-scale datasets.

【3】 In Defense of the Learning Without Forgetting for Task Incremental Learning

Authors: Guy Oren, Lior Wolf Affiliations: Tel-Aviv University Comments: 12 pages with 4 figures Link: https://arxiv.org/abs/2107.12304 Abstract: Catastrophic forgetting is one of the major challenges on the road for continual learning systems, which are presented with an on-line stream of tasks. The field has attracted considerable interest and a diverse set of methods have been presented for overcoming this challenge. Learning without Forgetting (LwF) is one of the earliest and most frequently cited methods. It has the advantages of not requiring the storage of samples from the previous tasks, of implementation simplicity, and of being well-grounded by relying on knowledge distillation. However, the prevailing view is that while it shows a relatively small amount of forgetting when only two tasks are introduced, it fails to scale to long sequences of tasks. This paper challenges this view by showing that, using the right architecture along with a standard set of augmentations, the results obtained by LwF surpass the latest algorithms for the task incremental scenario. This improved performance is demonstrated by an extensive set of experiments over CIFAR-100 and Tiny-ImageNet, where it is also shown that other methods cannot benefit as much from similar improvements.
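The knowledge distillation term that LwF relies on can be sketched as follows. This is a generic temperature-softened distillation loss, not the exact loss or hyperparameters used in this paper; the function names and the temperature of 2.0 are assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(old_logits, new_logits, temperature=2.0):
    """Cross-entropy between the softened outputs of the frozen old model
    (used as targets) and the softened outputs of the model being trained."""
    targets = softmax(old_logits, temperature)
    preds = softmax(new_logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(targets, preds))

old = [2.0, 0.5, -1.0]      # hypothetical old-task head outputs before training
drifted = [-1.0, 0.5, 2.0]  # hypothetical outputs after drifting on a new task
print(distillation_loss(old, old))
print(distillation_loss(old, drifted))
```

The loss is smallest when the new outputs reproduce the old ones, so adding it to the new-task loss penalises drift on earlier tasks without storing any of their samples.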

【4】 Thought Flow Nets: From Single Predictions to Trains of Model Thought

Authors: Hendrik Schuff, Heike Adel, Ngoc Thang Vu Affiliations: Bosch Center for Artificial Intelligence, Renningen, Germany, Institut für Maschinelle Sprachverarbeitung, University of Stuttgart Link: https://arxiv.org/abs/2107.12220 Abstract: When humans solve complex problems, they rarely come up with a decision right away. Instead, they start with an intuitive decision, reflect upon it, spot mistakes, resolve contradictions and jump between different hypotheses. Thus, they create a sequence of ideas and follow a train of thought that ultimately reaches a conclusive decision. Contrary to this, today's neural classification models are mostly trained to map an input to one single and fixed output. In this paper, we investigate how we can give models the opportunity of a second, third and $k$-th thought. We take inspiration from Hegel's dialectics and propose a method that turns an existing classifier's class prediction (such as the image class forest) into a sequence of predictions (such as forest $\rightarrow$ tree $\rightarrow$ mushroom). Concretely, we propose a correction module that is trained to estimate the model's correctness as well as an iterative prediction update based on the prediction's gradient. Our approach results in a dynamic system over class probability distributions: the thought flow. We evaluate our method on diverse datasets and tasks from computer vision and natural language processing. We observe surprisingly complex but intuitive behavior and demonstrate that our method (i) can correct misclassifications, (ii) strengthens model performance, (iii) is robust to high levels of adversarial attacks, (iv) can increase accuracy up to 4% in a label-distribution-shift setting and (v) provides a tool for model interpretability that uncovers model knowledge which otherwise remains invisible in a single distribution prediction.

【5】 How to Certify Machine Learning Based Safety-critical Systems? A Systematic Literature Review

Authors: Florian Tambon, Gabriel Laberge, Le An, Amin Nikanjam, Paulina Stevia Nouwou Mindom, Yann Pequignot, Foutse Khomh, Giulio Antoniol, Ettore Merlo, François Laviolette Comments: 69 pages (89 pages with ref.) Link: https://arxiv.org/abs/2107.12045 Abstract: Context: Machine Learning (ML) has been at the heart of many innovations over the past years. However, including it in so-called 'safety-critical' systems such as automotive or aeronautic has proven to be very challenging, since the shift in paradigm that ML brings completely changes traditional certification approaches. Objective: This paper aims to elucidate challenges related to the certification of ML-based safety-critical systems, as well as the solutions that are proposed in the literature to tackle them, answering the question 'How to Certify Machine Learning Based Safety-critical Systems?'. Method: We conduct a Systematic Literature Review (SLR) of research papers published between 2015 and 2020, covering topics related to the certification of ML systems. In total, we identified 229 papers covering topics considered to be the main pillars of ML certification: Robustness, Uncertainty, Explainability, Verification, Safe Reinforcement Learning, and Direct Certification. We analyzed the main trends and problems of each sub-field and provided summaries of the papers extracted. Results: The SLR results highlighted the enthusiasm of the community for this subject, as well as the lack of diversity in terms of datasets and types of models. It also emphasized the need to further develop connections between academia and industry to deepen the domain study. Finally, it also illustrated the necessity to build connections between the above-mentioned main pillars, which are for now mainly studied separately. Conclusion: We highlighted current efforts deployed to enable the certification of ML-based software systems, and discuss some future research directions.

【6】 Compensation Learning

Authors: Rujing Yao, Mengyang Li, Ou Wu Affiliations: Center for Applied Mathematics, Tianjin University, Tianjin, China Link: https://arxiv.org/abs/2107.11921 Abstract: Weighting strategies prevail in machine learning. For example, a common approach in robust machine learning is to assign lower weights to samples which are likely to be noisy or hard. This study reveals another, so far unrecognized strategy, namely compensating, that has also been widely used in machine learning. Learning with compensating is called compensation learning, and a systematic taxonomy is constructed for it in this study. In our taxonomy, compensation learning is divided on the basis of the compensation targets, inference manners, and granularity levels. Many existing learning algorithms, including some classical ones, can be seen as special cases of compensation learning or as partially leveraging compensating. Furthermore, a family of new learning algorithms can be obtained by plugging compensation learning into existing learning algorithms. Specifically, three concrete new learning algorithms are proposed for robust machine learning. Extensive experiments on text sentiment analysis, image classification, and graph classification verify the effectiveness of the three new algorithms. Compensation learning can also be used in various learning scenarios, such as imbalance learning, clustering, regression, and so on.

【7】 Reinforced Imitation Learning by Free Energy Principle

Authors: Ryoya Ogishima, Izumi Karino, Yasuo Kuniyoshi Affiliations: Graduate School of Information Science and Technology, The University of Tokyo Link: https://arxiv.org/abs/2107.11811 Abstract: Reinforcement Learning (RL) requires a large amount of exploration, especially in sparse-reward settings. Imitation Learning (IL) can learn from expert demonstrations without exploration, but it never exceeds the expert's performance and is also vulnerable to distributional shift between demonstration and execution. In this paper, we radically unify RL and IL based on the Free Energy Principle (FEP). FEP is a unified Bayesian theory of the brain that explains perception, action and model learning by a common fundamental principle. We present a theoretical extension of FEP and derive an algorithm in which an agent learns a world model that internalizes expert demonstrations and at the same time uses the model to infer the current and future states and actions that maximize rewards. The algorithm thus reduces exploration costs by partially imitating experts as well as maximizing its return in a seamless way, resulting in higher performance than the suboptimal expert. Our experimental results show that this approach is promising in visual control tasks, especially in sparse-reward environments.

【8】 Character Spotting Using Machine Learning Techniques

Authors: P Preethi, Hrishikesh Viswanath Affiliations: Department of Computer Science and Engineering, PES University Link: https://arxiv.org/abs/2107.11795 Abstract: This work presents a comparison of machine learning algorithms that are implemented to segment the characters of text presented as an image. The algorithms are designed to work on degraded documents with text that is not aligned in an organized fashion. The paper investigates the use of Support Vector Machines, the K-Nearest Neighbor algorithm and an Encoder Network to perform the operation of character spotting. Character spotting involves extracting potential characters from a stream of text by selecting regions bounded by white space.
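The idea of selecting regions bounded by white space can be sketched with a simple column scan. This heuristic is an assumption made here for illustration and is not one of the learned models the paper compares; it only splits a single, well-aligned text line, and the function name and toy image are hypothetical.

```python
def spot_characters(image):
    """Return (start, end) column ranges of ink runs separated by blank columns.

    image is a list of equally long rows; 1 marks ink and 0 marks white space.
    The end index is exclusive.
    """
    width = len(image[0])
    has_ink = [any(row[c] for row in image) for c in range(width)]
    spans, start = [], None
    for c, ink in enumerate(has_ink):
        if ink and start is None:
            start = c                 # a new candidate character begins
        elif not ink and start is not None:
            spans.append((start, c))  # a blank column closes the candidate
            start = None
    if start is not None:
        spans.append((start, width))  # candidate touching the right edge
    return spans

page = [
    [0, 1, 1, 0, 0, 1, 0, 1, 1],
    [0, 1, 0, 0, 0, 1, 0, 0, 1],
]
print(spot_characters(page))
```

On degraded or misaligned documents, blank columns are unreliable separators, which is presumably why the paper turns to classifiers such as SVMs and K-Nearest Neighbor for this step.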

【9】 Learning Risk-aware Costmaps for Traversability in Challenging Environments

Authors: David D. Fan, Ali-akbar Agha-mohammadi, Evangelos A. Theodorou Affiliations: Institute for Robotics and Intelligent Machines, Georgia Institute of Technology, California Institute of Technology Link: https://arxiv.org/abs/2107.11722 Abstract: One of the main challenges in autonomous robotic exploration and navigation in unknown and unstructured environments is determining where the robot can or cannot safely move. A significant source of difficulty in this determination arises from stochasticity and uncertainty, coming from localization error, sensor sparsity and noise, difficult-to-model robot-ground interactions, and disturbances to the motion of the vehicle. Classical approaches to this problem rely on geometric analysis of the surrounding terrain, which can be prone to modeling errors and can be computationally expensive. Moreover, modeling the distribution of uncertain traversability costs is a difficult task, compounded by the various error sources mentioned above. In this work, we take a principled learning approach to this problem. We introduce a neural network architecture for robustly learning the distribution of traversability costs. Because we are motivated by preserving the life of the robot, we tackle this learning problem from the perspective of learning tail-risks, i.e. the Conditional Value-at-Risk (CVaR). We show that this approach reliably learns the expected tail risk given a desired probability risk threshold between 0 and 1, producing a traversability costmap which is more robust to outliers, more accurately captures tail risks, and is more computationally efficient when compared against baselines. We validate our method on data collected by a legged robot navigating challenging, unstructured environments including an abandoned subway, limestone caves, and lava tube caves.
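The Conditional Value-at-Risk named in this abstract has a simple empirical estimate: the mean of the worst fraction of sampled costs. The sketch below shows that estimate only, not the paper's neural network; the function name, the handling of the tail size and the sample costs are assumptions made for illustration.

```python
import math

def empirical_cvar(costs, alpha):
    """Mean of the worst (1 - alpha) fraction of sampled costs.

    alpha is the risk threshold in [0, 1); alpha = 0 gives the plain mean,
    and values closer to 1 focus on an ever smaller tail of high costs.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(costs, reverse=True)  # highest cost first
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))  # tail size
    return sum(ordered[:k]) / k

# Hypothetical traversability cost samples for one map cell.
samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
print(empirical_cvar(samples, 0.0))
print(empirical_cvar(samples, 0.5))
```

A costmap built from such tail averages is more conservative than one built from mean costs, since rare but severe outcomes dominate the value of each cell.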

【10】 Boosting Video Captioning with Dynamic Loss Network

Authors: Nasibullah, Partha Pratim Mohanta Affiliations: Indian Statistical Institute, Kolkata Comments: 10 pages, 3 figures, Preprint Link: https://arxiv.org/abs/2107.11707 Abstract: Video captioning is one of the challenging problems at the intersection of vision and language, having many real-life applications in video retrieval, video surveillance, assisting visually challenged people, human-machine interfaces, and many more. Recent deep learning-based methods have shown promising results but still lag behind other vision tasks (such as image classification and object detection). A significant drawback of existing video captioning methods is that they are optimized over the cross-entropy loss function, which is uncorrelated with the de facto evaluation metrics (BLEU, METEOR, CIDEr, ROUGE). In other words, cross-entropy is not a proper surrogate of the true loss function for video captioning. This paper addresses the drawback by introducing a dynamic loss network (DLN), which provides an additional feedback signal that directly reflects the evaluation metrics. Our results on the Microsoft Research Video Description Corpus (MSVD) and MSR-Video to Text (MSRVTT) datasets outperform previous methods.

【11】 The Impact of Negative Sampling on Contrastive Structured World Models

Authors: Ondrej Biza, Elise van der Pol, Thomas Kipf Affiliations: Northeastern University, USA, University of Amsterdam Comments: This work appeared at the ICML 2021 Workshop: Self-Supervised Learning for Reasoning and Perception Link: https://arxiv.org/abs/2107.11676 Abstract: World models trained by contrastive learning are a compelling alternative to autoencoder-based world models, which learn by reconstructing pixel states. In this paper, we describe three cases where small changes in how we sample negative states in the contrastive loss lead to drastic changes in model performance. In previously studied Atari datasets, we show that leveraging time step correlations can double the performance of the Contrastive Structured World Model. We also collect a full version of the datasets to study contrastive learning under a more diverse set of experiences.

【12】 On the Sample Complexity of Privately Learning Axis-Aligned Rectangles

Authors: Menachem Sadigurschi, Uri Stemmer Link: https://arxiv.org/abs/2107.11526 Abstract: We revisit the fundamental problem of learning Axis-Aligned-Rectangles over a finite grid $X^d \subseteq \mathbb{R}^d$ with differential privacy. Existing results show that the sample complexity of this problem is at most $\min\left\{ d \cdot \log|X| \;,\; d^{1.5} \cdot \left(\log^*|X|\right)^{1.5} \right\}$. That is, existing constructions either require sample complexity that grows linearly with $\log|X|$, or else it grows super-linearly with the dimension $d$. We present a novel algorithm that reduces the sample complexity to only $\tilde{O}\left\{ d \cdot \left(\log^*|X|\right)^{1.5} \right\}$, attaining a dimensionality optimal dependency without requiring the sample complexity to grow with $\log|X|$. The technique used in order to attain this improvement involves the deletion of "exposed" data-points on the go, in a fashion designed to avoid the cost of the adaptive composition theorems. The core of this technique may be of individual interest, introducing a new method for constructing statistically-efficient private algorithms.

【13】 Training multi-objective/multi-task collocation physics-informed neural network with student/teachers transfer learnings 标题:利用学生/教师迁移学习训练多目标/多任务配置物理信息神经网络

作者:Bahador Bahmani,WaiChing Sun 链接:https://arxiv.org/abs/2107.11496 摘要:本文提出了一种PINN训练框架,该框架采用(1)预训练步骤,利用点云存储的辅助数据加速物理信息神经网络的训练并提高其鲁棒性;(2)改进神经网络权值初始化的网到网知识迁移算法;(3)可改进具有竞争约束的物理信息神经网络性能的多目标优化算法。我们将物理信息神经网络(PINN)的训练、迁移和多任务学习看作多目标问题,其中物理约束(如控制方程、边界条件、热力学不等式、对称性和不变性)以及用于预训练的点云有时会导致冲突,需要寻求帕累托最优解。在这种情况下,通常用于处理多个约束的加权范数可能会导致性能较差,而其他多目标算法可能随着维数的增加而扩展性较差。为了克服这一技术障碍,我们采用了向量化目标函数的概念,并改进了梯度下降法来处理梯度冲突问题。将数值实验与用PINN求解的基准边值问题进行了比较,并与经典的等权范数方法进行了性能比较。数值实验表明,该策略可以克服某些PINN实现中存在的脆弱性和鲁棒性不足的问题。 摘要:This paper presents a PINN training framework that employs (1) pre-training steps that accelerate and improve the robustness of the training of physics-informed neural networks with auxiliary data stored in point clouds, (2) a net-to-net knowledge transfer algorithm that improves the weight initialization of the neural network and (3) a multi-objective optimization algorithm that may improve the performance of a physics-informed neural network with competing constraints. We consider the training and transfer and multi-task learning of physics-informed neural networks (PINN) as multi-objective problems where the physics constraints such as the governing equation, boundary conditions, thermodynamic inequality, symmetry, and invariant properties, as well as point clouds used for pre-training, can sometimes lead to conflicts, necessitating the search for a Pareto optimal solution. In these situations, weighted norms commonly used to handle multiple constraints may lead to poor performance, while other multi-objective algorithms may scale poorly with increasing dimensionality. To overcome this technical barrier, we adopt the concept of vectorized objective function and modify a gradient descent approach to handle the issue of conflicting gradients. Numerical experiments are compared against benchmark boundary value problems solved via PINN. The performance of the proposed paradigm is compared against the classical equal-weighted norm approach. Our numerical experiments indicate that the brittleness and lack of robustness demonstrated in some PINN implementations can be overcome with the proposed strategy.

【14】 Using a Cross-Task Grid of Linear Probes to Interpret CNN Model Predictions On Retinal Images 标题:使用线性探针的跨任务网格解释视网膜图像上的CNN模型预测

作者:Katy Blumer,Subhashini Venugopalan,Michael P. Brenner,Jon Kleinberg 机构:Cornell University;Google Research;Harvard University 备注:Extended abstract at Interpretable Machine Learning in Healthcare (IMLH) workshop at ICML 2021 链接:https://arxiv.org/abs/2107.11468 摘要:我们使用线性探针分析视网膜图像数据集:线性探针是在某个“目标”任务上训练的线性回归模型,其输入是在某个“源”任务上训练的深度卷积(CNN)模型的嵌入。我们在英国生物库(UK Biobank)视网膜图像数据集中93个任务的所有可能配对上使用这种方法,得到了约164k个不同的模型。我们按源任务、目标任务以及层深度来分析这些线性探针的性能。我们观察到来自网络中间层的表示更具泛化性。我们发现,不管源任务是什么,一些目标任务都很容易预测,而另一些目标任务使用相关源任务的嵌入比使用在同一任务上训练的嵌入预测得更准确。 摘要:We analyze a dataset of retinal images using linear probes: linear regression models trained on some "target" task, using embeddings from a deep convolutional (CNN) model trained on some "source" task as input. We use this method across all possible pairings of 93 tasks in the UK Biobank dataset of retinal images, leading to ~164k different models. We analyze the performance of these linear probes by source and target task and by layer depth. We observe that representations from the middle layers of the network are more generalizable. We find that some target tasks are easily predicted irrespective of the source task, and that some other target tasks are more accurately predicted from correlated source tasks than from embeddings trained on the same task.

【15】 Non-intrusive reduced order modeling of natural convection in porous media using convolutional autoencoders: comparison with linear subspace techniques 标题:使用卷积自动编码器的多孔介质中自然对流的非侵入式降阶模拟:与线性子空间技术的比较

作者:T. Kadeethum,F. Ballarin,Y. Cho,D. O'Malley,H. Yoon,N. Bouklas 机构:Sibley School of Mechanical and Aerospace Engineering, Cornell University;Department of Mathematics and Physics, Catholic University of the Sacred Heart 链接:https://arxiv.org/abs/2107.11460 摘要:多孔介质中的自然对流是一个高度非线性的多物理问题,与许多工程应用相关(例如$\mathrm{CO_2}$封存过程)。本文提出了一种多孔介质中自然对流的非侵入式降阶模型,该模型采用深度卷积自动编码器进行压缩和重构,并采用径向基函数(RBF)插值或人工神经网络(ANN)将偏微分方程(PDE)的参数映射到相应的非线性流形上。为了对我们的方法进行基准测试,我们还描述了基于本征正交分解(POD)和人工神经网络的线性压缩和重构过程。我们通过三个基准问题对不同模型进行了综合比较。线性和非线性的降阶模型都比有限元模型快得多,最大加速比达到$7 \times 10^{6}$,因为我们的框架不受Courant-Friedrichs-Lewy条件的约束;因此,与有限元模型不同,它可以给出任意给定时刻的目标量。在最坏的情况下,我们模型的精度仍保持在均方误差0.07以内(比有限元结果的最大值低两个数量级)。我们说明,在特定的设置下,非线性方法优于线性方法,反之亦然。我们假设,主成分分析(PCA)和t-分布随机邻域嵌入(t-SNE)之间的可视化比较可以在采用任何特定压缩策略之前指示哪种方法的性能更好。 摘要:Natural convection in porous media is a highly nonlinear multiphysical problem relevant to many engineering applications (e.g., the process of $\mathrm{CO_2}$ sequestration). Here, we present a non-intrusive reduced order model of natural convection in porous media employing deep convolutional autoencoders for the compression and reconstruction and either radial basis function (RBF) interpolation or artificial neural networks (ANNs) for mapping parameters of partial differential equations (PDEs) on the corresponding nonlinear manifolds. To benchmark our approach, we also describe linear compression and reconstruction processes relying on proper orthogonal decomposition (POD) and ANNs. We present comprehensive comparisons among different models through three benchmark problems. The reduced order models, linear and nonlinear approaches, are much faster than the finite element model, obtaining a maximum speed-up of $7 \times 10^{6}$ because our framework is not bound by the Courant-Friedrichs-Lewy condition; hence, it could deliver quantities of interest at any given time contrary to the finite element model. Our model's accuracy still lies within a mean squared error of 0.07 (two orders of magnitude lower than the maximum value of the finite element results) in the worst-case scenario. We illustrate that, in specific settings, the nonlinear approach outperforms its linear counterpart and vice versa. We hypothesize that a visual comparison between principal component analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE) could indicate which method will perform better prior to employing any specific compression strategy.

【16】 Self-Repairing Neural Networks: Provable Safety for Deep Networks via Dynamic Repair 标题:自修复神经网络:深层网络的动态修复可证明安全性

作者:Klas Leino,Aymeric Fromherz,Ravi Mangal,Matt Fredrikson,Bryan Parno,Corina Păsăreanu 机构:Carnegie Mellon University 链接:https://arxiv.org/abs/2107.11445 摘要:神经网络正越来越多地被部署在安全是一个关键问题的环境中。在这项工作中,我们提出了一种方法来构建神经网络分类器,动态修复违反非关系安全约束的行为,称为安全排序属性。安全排序属性将网络输出索引的排序要求与其输入条件联系起来,并足以表示分类器最有用的非关系安全概念。我们的方法是基于一个新的自修复层,它可以产生安全的输出,而不管其输入的特性。我们将这一层与现有的网络组成一个自修复网络(SR-Net),并证明SR-Net除了提供安全的输出外,还保证了原始网络的准确性。值得注意的是,我们的方法与被修复网络的大小和结构无关,仅取决于指定的属性和网络输出的维度;因此,它可以扩展到大型最先进的网络。我们表明,我们的方法可以使用矢量化计算来实现,这些计算在GPU上高效执行,在当前硬件上引入了不到一毫秒的运行时开销——即使在包含数十万个神经元和数百万个参数的广泛使用的大网络上也是如此。 摘要:Neural networks are increasingly being deployed in contexts where safety is a critical concern. In this work, we propose a way to construct neural network classifiers that dynamically repair violations of non-relational safety constraints called safe ordering properties. Safe ordering properties relate requirements on the ordering of a network's output indices to conditions on their input, and are sufficient to express most useful notions of non-relational safety for classifiers. Our approach is based on a novel self-repairing layer, which provably yields safe outputs regardless of the characteristics of its input. We compose this layer with an existing network to construct a self-repairing network (SR-Net), and show that in addition to providing safe outputs, the SR-Net is guaranteed to preserve the accuracy of the original network. Notably, our approach is independent of the size and architecture of the network being repaired, depending only on the specified property and the dimension of the network's output; thus it is scalable to large state-of-the-art networks. 
We show that our approach can be implemented using vectorized computations that execute efficiently on a GPU, introducing run-time overhead of less than one millisecond on current hardware -- even on large, widely-used networks containing hundreds of thousands of neurons and millions of parameters.

【17】 A Realistic Simulation Framework for Learning with Label Noise 标题:一种逼真的标签噪声学习仿真框架

作者:Keren Gu,Xander Masotto,Vandana Bachani,Balaji Lakshminarayanan,Jack Nikodem,Dong Yin 备注:Datasets released at this https URL 链接:https://arxiv.org/abs/2107.11413 摘要:我们提出了一个模拟框架,通过伪标记范式生成真实的实例相关噪声标签。通过与CIFAR10-H数据集的比较,我们证明了该框架生成的合成噪声标签在实际环境中表现出标签噪声的重要特征。配备了可控的标签噪声,我们研究了噪声标签的负面影响跨越几个现实的设置,以了解当标签噪声是更大的问题。我们还测试了几种现有的带噪声标签的学习算法,并比较了它们在合成数据集和带独立随机标签噪声的数据集上的行为。此外,利用我们的模拟框架提供的注释器信息,我们提出了一种新的技术,标签质量模型(LQM),它利用注释器的特性来预测和校正噪声标签。我们证明,在应用现有的噪声标签技术之前,通过添加LQM作为标签校正步骤,可以进一步提高模型的性能。 摘要:We propose a simulation framework for generating realistic instance-dependent noisy labels via a pseudo-labeling paradigm. We show that this framework generates synthetic noisy labels that exhibit important characteristics of the label noise in practical settings via comparison with the CIFAR10-H dataset. Equipped with controllable label noise, we study the negative impact of noisy labels across a few realistic settings to understand when label noise is more problematic. We also benchmark several existing algorithms for learning with noisy labels and compare their behavior on our synthetic datasets and on the datasets with independent random label noise. Additionally, with the availability of annotator information from our simulation framework, we propose a new technique, Label Quality Model (LQM), that leverages annotator features to predict and correct against noisy labels. We show that by adding LQM as a label correction step before applying existing noisy label techniques, we can further improve the models' performance.

【18】 Robust Explainability: A Tutorial on Gradient-Based Attribution Methods for Deep Neural Networks 标题:稳健可解释性:深度神经网络基于梯度的归因方法教程

作者:Ian E. Nielsen,Ghulam Rasool,Dimah Dera,Nidhal Bouaynaya,Ravi P. Ramachandran 机构: Rowan University, The University of Texas Rio Grande Valley 备注:21 pages, 3 figures 链接:https://arxiv.org/abs/2107.11400 摘要:随着深层神经网络的兴起,人们越来越认识到解释这些网络预测的挑战。虽然有许多方法可以解释深层神经网络的决策,但目前对于如何评价它们还没有共识。另一方面,稳健性是深度学习研究的热门话题;然而,直到最近才有人在可解释性方面谈论它。在本教程中,我们首先介绍基于梯度的可解释性方法。这些技术使用梯度信号来分配输入特征的决策负担。之后,我们将讨论如何评估基于梯度的方法的稳健性,以及对抗性稳健性在有意义的解释中所起的作用。我们还讨论了基于梯度的方法的局限性。最后,我们给出了在选择可解释性方法之前应该检查的最佳实践和属性。最后,我们在稳健性和可解释性的收敛性方面提出了该领域未来的研究方向。 摘要:With the rise of deep neural networks, the challenge of explaining the predictions of these networks has become increasingly recognized. While many methods for explaining the decisions of deep neural networks exist, there is currently no consensus on how to evaluate them. On the other hand, robustness is a popular topic for deep learning research; however, it is hardly talked about in explainability until very recently. In this tutorial paper, we start by presenting gradient-based interpretability methods. These techniques use gradient signals to assign the burden of the decision on the input features. Later, we discuss how gradient-based methods can be evaluated for their robustness and the role that adversarial robustness plays in having meaningful explanations. We also discuss the limitations of gradient-based methods. Finally, we present the best practices and attributes that should be examined before choosing an explainability method. We conclude with the future directions for research in the area at the convergence of robustness and explainability.

【19】 End-to-End Deep Learning of Long-Haul Coherent Optical Fiber Communications via Regular Perturbation Model 标题:基于规则微扰模型的长距离相干光纤通信端到端深度学习

作者:Vladislav Neskorniuk,Andrea Carnio,Vinod Bajaj,Domenico Marsella,Sergei K. Turitsyn,Jaroslaw E. Prilepsky,Vahid Aref 机构:Aston Institute of Photonic Technologies, Aston University;Delft University of Technology 备注:4 pages; accepted for presentation at ECOC 2021 in September 2021 链接:https://arxiv.org/abs/2107.12320 摘要:我们提出了一种新的基于自编码器的相干光通信端到端学习方法,该方法使用“可并行”的微扰信道模型。我们联合优化了星座成形和非线性预加重,在采用EDFA的30x80 km G.652 SMF链路上仿真64 GBd双偏振单信道传输时,实现了0.18 bits/sym./pol.的互信息增益。 摘要:We present a novel end-to-end autoencoder-based learning for coherent optical communications using a "parallelizable" perturbative channel model. We jointly optimized constellation shaping and nonlinear pre-emphasis achieving mutual information gain of 0.18 bits/sym./pol. simulating 64 GBd dual-polarization single-channel transmission over 30x80 km G.652 SMF link with EDFAs.

【20】 Combining Maximum-Likelihood with Deep Learning for Event Reconstruction in IceCube 标题:极大似然与深度学习相结合的冰立方事件重构

作者:Mirco Hünnefeld 机构:TU Dortmund University 备注:Presented at the 37th International Cosmic Ray Conference (ICRC 2021). See arXiv:2107.06966 for all IceCube contributions 链接:https://arxiv.org/abs/2107.12110 摘要:深度学习在粒子物理实验中变得越来越重要,带来了大量的进展,主要是在事件分类和重建任务中。其中许多应用是从其他领域借鉴而来的。然而,在机器学习的背景下,物理领域的数据是独特的,因为它们的生成过程以及它们所遵循的规律和对称性通常都被很好地理解。大多数常用的深度学习架构都无法利用这些可用信息。相比之下,更传统的基于似然的方法能够利用领域知识,但它们往往受到计算复杂性的限制。本文提出了一种混合方法,利用生成式神经网络来近似似然,然后可用于传统的最大似然设置。领域知识,如不变性和探测器特性,可以很容易地纳入这种方法。以冰立方(IceCube)事件重建为例说明了该混合方法。 摘要:The field of deep learning has become increasingly important for particle physics experiments, yielding a multitude of advances, predominantly in event classification and reconstruction tasks. Many of these applications have been adopted from other domains. However, data in the field of physics are unique in the context of machine learning, insofar as their generation process and the laws and symmetries they abide by are usually well understood. Most commonly used deep learning architectures fail at utilizing this available information. In contrast, more traditional likelihood-based methods are capable of exploiting domain knowledge, but they are often limited by computational complexity. In this contribution, a hybrid approach is presented that utilizes generative neural networks to approximate the likelihood, which may then be used in a traditional maximum-likelihood setting. Domain knowledge, such as invariances and detector characteristics, can easily be incorporated in this approach. The hybrid approach is illustrated by the example of event reconstruction in IceCube.

【21】 A Study on Speech Enhancement Based on Diffusion Probabilistic Model 标题:基于扩散概率模型的语音增强研究

作者:Yen-Ju Lu,Yu Tsao,Shinji Watanabe 机构:∗ Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan, † Language Technology Institute, Carnegie Mellon University, Pittsburgh, PA, United States 备注:submitted to APSIPA 2021 链接:https://arxiv.org/abs/2107.11876 摘要:扩散概率模型通过成对扩散和反向过程对自然图像和原始音频波形进行了建模。利用逆过程的独特特性(即从高斯噪声和噪声信号中去除非目标信号)可以恢复干净的信号。基于这一特性,我们提出了一种基于扩散概率模型的语音增强(DiffuSE)模型,旨在从噪声信号中恢复干净的语音信号。提出的漫反射模型的基本结构类似于DiffWave,DiffWave是一种高质量的音频波形生成模型,具有相对较低的计算成本和占用空间。为了获得更好的增强效果,我们设计了一种改进的反向过程,称为支持性反向过程,它在预测语音的每一个时间步长中加入带噪语音。实验结果表明,在标准化的语音库语料库SE任务中,DiffuSE产生的效果与相关的音频生成模型相当。此外,相对于一般建议的全抽样计划,所提出的支持性反向过程特别改进了快速抽样,与传统的全步骤推理过程相比,只需很少的步骤就能得到更好的增强效果。 摘要:Diffusion probabilistic models have demonstrated an outstanding capability to model natural images and raw audio waveforms through a paired diffusion and reverse processes. The unique property of the reverse process (namely, eliminating non-target signals from the Gaussian noise and noisy signals) could be utilized to restore clean signals. Based on this property, we propose a diffusion probabilistic model-based speech enhancement (DiffuSE) model that aims to recover clean speech signals from noisy signals. The fundamental architecture of the proposed DiffuSE model is similar to that of DiffWave--a high-quality audio waveform generation model that has a relatively low computational cost and footprint. To attain better enhancement performance, we designed an advanced reverse process, termed the supportive reverse process, which adds noisy speech in each time-step to the predicted speech. The experimental results show that DiffuSE yields performance that is comparable to related audio generative models on the standardized Voice Bank corpus SE task. Moreover, relative to the generally suggested full sampling schedule, the proposed supportive reverse process especially improved the fast sampling, taking few steps to yield better enhancement results over the conventional full step inference process.

【22】 Identifying the fragment structure of the organic compounds by deeply learning the original NMR data 标题:深入学习原始核磁共振数据识别有机化合物的碎片结构

作者:Chongcan Li,Yong Cong,Weihua Deng 机构:School of Mathematics and Statistics, Lanzhou University, Lanzhou , P.R. China, College of Chemistry and Chemical Engineering, State Key Lab Appl Organ Chem, Key, Lab Nonferrous Met Chem and Resources Utilizat, Lanzhou University, Lanzhou 备注:12 pages, 8 figures 链接:https://arxiv.org/abs/2107.11740 摘要:对原始核磁共振波谱进行预处理,采用等距采样和峰值采样两种方法提取关键特征,用于后续的子结构模式识别;同时可以提供另一种策略来解决统计建模数据集采集中经常遇到的NMR数据集的不平衡问题,并分别建立两个常规的SVM和KNN模型来评估两种特征选择的能力。我们的研究结果表明,使用峰值抽样特征的模型优于使用其他特征的模型。然后利用峰值采样得到的数据B建立递归神经网络(RNN)模型。通过与传统的机器学习支持向量机和KNN模型的比较,说明了RNN深度学习模型更易于优化超参数,具有更好的泛化能力。 摘要:We preprocess the raw NMR spectrum and extract key characteristic features by using two different methodologies, called equidistant sampling and peak sampling for subsequent substructure pattern recognition; meanwhile may provide the alternative strategy to address the imbalance issue of the NMR dataset frequently encountered in dataset collection of statistical modeling and establish two conventional SVM and KNN models to assess the capability of two feature selection, respectively. Our results in this study show that the models using the selected features of peak sampling outperform the ones using the other. Then we build the Recurrent Neural Network (RNN) model trained by Data B collected from peak sampling. Furthermore, we illustrate the easier optimization of hyper parameters and the better generalization ability of the RNN deep learning model by comparison with traditional machine learning SVM and KNN models in detail.

【23】 Combining Online Learning and Offline Learning for Contextual Bandits with Deficient Support 标题:支持不足的上下文赌博机的在线学习与离线学习相结合

作者:Hung Tran-The,Sunil Gupta,Thanh Nguyen-Tang,Santu Rana,Svetha Venkatesh 机构:Applied Artificial Intelligence Institute, Deakin University, Australia 链接:https://arxiv.org/abs/2107.11533 摘要:我们研究上下文赌博机中基于日志数据的策略学习。当前的离线策略学习算法大多基于逆倾向得分(IPS)加权,要求日志策略具有“完全支持”(full support),即对评估策略的任何上下文/动作具有非零概率。然而,现实世界中的许多系统并不能保证这样的日志策略,特别是当动作空间很大,并且许多动作的回报很差或缺失时。在这种“支持不足”(support deficiency)的情况下,离线学习无法找到最优策略。我们提出了一种新的方法,将离线学习与在线探索相结合。在线探索用于探索日志数据中不受支持的动作,而离线学习用于利用日志数据中受支持的动作,以避免不必要的探索。我们的方法使用最少的在线探索次数来确定具有理论保证的最优策略。我们在多种数据集上通过实验证明了我们算法的有效性。 摘要:We address policy learning with logged data in contextual bandits. Current offline-policy learning algorithms are mostly based on inverse propensity score (IPS) weighting requiring the logging policy to have \emph{full support} i.e. a non-zero probability for any context/action of the evaluation policy. However, many real-world systems do not guarantee such logging policies, especially when the action space is large and many actions have poor or missing rewards. With such \emph{support deficiency}, the offline learning fails to find optimal policies. We propose a novel approach that uses a hybrid of offline learning with online exploration. The online exploration is used to explore unsupported actions in the logged data whilst offline learning is used to exploit supported actions from the logged data avoiding unnecessary explorations. Our approach determines an optimal policy with theoretical guarantees using the minimal number of online explorations. We demonstrate our algorithms' effectiveness empirically on a diverse collection of datasets.
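
The IPS weighting and the support condition mentioned in this abstract can be illustrated with a small sketch. This is the textbook IPS value estimator, not the hybrid algorithm proposed in the paper; the record layout, the function names (`ips_value`, `unsupported_actions`) and the simplification of checking support per action rather than per context/action pair are all assumptions made here for illustration.

```python
from typing import Callable, Iterable, List, Set, Tuple

# One logged interaction: (context, action, reward, logging propensity).
# This record layout is a hypothetical choice for the sketch.
LoggedRecord = Tuple[str, str, float, float]


def ips_value(
    logs: List[LoggedRecord],
    target_policy: Callable[[str, str], float],
) -> float:
    """Standard inverse propensity score estimate of a target policy's value.

    target_policy(context, action) returns the probability that the
    evaluated policy picks `action` in `context`.
    """
    total = 0.0
    for context, action, reward, propensity in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the target policy is to take the logged action.
        total += target_policy(context, action) / propensity * reward
    return total / len(logs)


def unsupported_actions(
    logs: List[LoggedRecord], actions: Iterable[str]
) -> Set[str]:
    """Actions that never appear in the log (simplified, per-action view).

    IPS carries no information about these, which is the support
    deficiency the abstract refers to.
    """
    seen = {action for _, action, _, _ in logs}
    return set(actions) - seen


if __name__ == "__main__":
    logs = [("c1", "a", 1.0, 0.5), ("c1", "b", 0.0, 0.5)]
    always_a = lambda ctx, act: 1.0 if act == "a" else 0.0
    print(ips_value(logs, always_a))
    print(unsupported_actions(logs, ["a", "b", "c"]))
```

In this toy log the action `"c"` is never taken, so no reweighting of the logged rewards can say anything about it; that is the gap the abstract proposes to close with online exploration.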

其他(16篇)

【1】 AAVAE: Augmentation-Augmented Variational Autoencoders 标题:AAVAE:增广-增广变分自动编码器

作者:William Falcon,Ananya Harsh Jha,Teddy Koker,Kyunghyun Cho 机构: Grid AI Labs, New York University, CIFAR Fellow 备注:15 pages, 4 figures, 1 table 链接:https://arxiv.org/abs/2107.12329 摘要:最近的自我监督学习方法可以分为两种:对比学习和非对比学习。它们的成功很大程度上可以归功于数据增强管道,该管道生成单个输入的多个视图,从而保留了底层语义。在这项工作中,我们介绍了增广增广变分自动编码器(AAVAE),第三种方法自我监督学习的基础上自动编码。我们从传统的变分自动编码器(VAE)出发,通过用数据扩充取代KL散度正则化(对输入域不可知),导出AAVAE,该数据扩充明确鼓励内部表示来编码特定于域的不变性和等变性。我们对所提出的AAVAE在图像分类中的应用进行了实证评估,类似于最近对比和非对比学习算法的评估。我们的实验证实了数据增强作为KL散度正则化的替代方法的有效性。AAVAE在CIFAR-10和STL-10上的性能分别比VAE好30%和40%。AAVAE的结果在很大程度上与最先进的自监督学习方法相当。 摘要:Recent methods for self-supervised learning can be grouped into two paradigms: contrastive and non-contrastive approaches. Their success can largely be attributed to data augmentation pipelines which generate multiple views of a single input that preserve the underlying semantics. In this work, we introduce augmentation-augmented variational autoencoders (AAVAE), a third approach to self-supervised learning based on autoencoding. We derive AAVAE starting from the conventional variational autoencoder (VAE), by replacing the KL divergence regularization, which is agnostic to the input domain, with data augmentations that explicitly encourage the internal representations to encode domain-specific invariances and equivariances. We empirically evaluate the proposed AAVAE on image classification, similar to how recent contrastive and non-contrastive learning algorithms have been evaluated. Our experiments confirm the effectiveness of data augmentation as a replacement for KL divergence regularization. The AAVAE outperforms the VAE by 30% on CIFAR-10 and 40% on STL-10. The results for AAVAE are largely comparable to the state-of-the-art for self-supervised learning.

【2】 MLDev: Data Science Experiment Automation and Reproducibility Software 标题:MLDev:数据科学实验自动化和可重复性软件

作者:Anton Khritankov,Nikita Pershin,Nikita Ukhov,Artem Ukhov 机构:Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russian Federation 备注:11 pages, 2 figures 链接:https://arxiv.org/abs/2107.12322 摘要:本文探讨了数据科学实验自动化的挑战。我们提出了一个可扩展的实验模型,为不同的开源工具进行研究实验的集成奠定了基础。我们在一个原型开源MLDev软件包中实现了我们的方法,并在一系列实验中对其进行了评估,得到了令人满意的结果。与其他最先进的工具相比,我们的方法具有新颖性。 摘要:In this paper we explore the challenges of automating experiments in data science. We propose an extensible experiment model as a foundation for integration of different open source tools for running research experiments. We implement our approach in a prototype open source MLDev software package and evaluate it in a series of experiments yielding promising results. Comparison with other state-of-the-art tools signifies novelty of our approach.

【3】 Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment 标题:随机动态环境中减方差的后见值函数

作者:Jiaming Guo,Rui Zhang,Xishan Zhang,Shaohui Peng,Qi Yi,Zidong Du,Xing Hu,Qi Guo,Yunji Chen 机构:SKL of Computer Architecture, Institute of Computing Technology, CAS, Beijing, China, Cambricon Technologies, University of Chinese Academy of Sciences, China, University of Science and Technology of China 链接:https://arxiv.org/abs/2107.12216 摘要:策略梯度方法在深度强化学习中有很好的应用前景,但梯度估计方差较大。为了减小方差,通常采用状态值函数。然而,在随机动态环境中,状态值函数的作用变得有限,意外的状态动态和报酬会增加方差。在本文中,我们提出用一种新的后见值函数代替状态值函数,它利用来自未来的信息来减少随机动态环境中梯度估计的方差。特别地,为了得到一个理想的无偏梯度估计,我们提出了一种信息论方法,它优化了未来的嵌入,使之独立于以前的行为。在我们的实验中,我们将所提出的后见值函数应用于随机动态环境,包括离散动作环境和连续动作环境。与标准状态值函数相比,本文提出的后知后觉值函数能一致地减小方差,稳定训练,改善最终策略。 摘要:Policy gradient methods are appealing in deep reinforcement learning but suffer from high variance of gradient estimate. To reduce the variance, the state value function is applied commonly. However, the effect of the state value function becomes limited in stochastic dynamic environments, where the unexpected state dynamics and rewards will increase the variance. In this paper, we propose to replace the state value function with a novel hindsight value function, which leverages the information from the future to reduce the variance of the gradient estimate for stochastic dynamic environments. Particularly, to obtain an ideally unbiased gradient estimate, we propose an information-theoretic approach, which optimizes the embeddings of the future to be independent of previous actions. In our experiments, we apply the proposed hindsight value function in stochastic dynamic environments, including discrete-action environments and continuous-action environments. Compared with the standard state value function, the proposed hindsight value function consistently reduces the variance, stabilizes the training, and improves the eventual policy.

【4】 A Shallow Ritz Method for elliptic problems with Singular Sources 标题:具有奇异源的椭圆型问题的一种浅Ritz方法

作者:Ming-Chih Lai,Che-Chia Chang,Wei-Syuan Lin,Wei-Fan Hu,Te-Sheng Lin 机构:Department of Applied Mathematics, National Yang Ming Chiao Tung University, Hsinchu, Taiwan;Department of Mathematics, National Central University, Taoyuan, Taiwan;National Center for Theoretical Sciences, National Taiwan University, Taipei, Taiwan 链接:https://arxiv.org/abs/2107.12013 摘要:本文提出了一种求解界面上具有delta函数奇异源的椭圆问题的浅Ritz型神经网络。本工作有三个新颖的特点:(i)delta函数奇异性被自然消除,(ii)水平集函数被引入作为特征输入,(iii)网络是完全浅层的,只有一个隐藏层。我们首先引入问题的能量泛函,然后将奇异源的贡献转化为沿界面的正则曲面积分。这样就可以自然地消除delta函数的奇异性,而无需引入离散delta函数,这种离散delta函数常用于传统的正则化方法,如著名的浸入边界法。然后将原问题转化为极小化问题。我们提出了一种具有一个隐层的浅Ritz型神经网络来逼近能量泛函的全局极小值。因此,通过最小化作为能量离散形式的损失函数来训练网络。另外,我们将界面的水平集函数作为特征输入,发现它显著提高了训练的效率和准确性。我们进行了一系列的数值试验,以证明该网络的精度以及它在不规则区域和高维问题上的能力。 摘要:In this paper, a shallow Ritz-type neural network for solving elliptic problems with delta function singular sources on an interface is developed. There are three novel features in the present work; namely, (i) the delta function singularity is naturally removed, (ii) level set function is introduced as a feature input, (iii) it is completely shallow consisting of only one hidden layer. We first introduce the energy functional of the problem and then transform the contribution of singular sources to a regular surface integral along the interface. In such a way the delta function singularity can be naturally removed without the introduction of discrete delta function that is commonly used in traditional regularization methods such as the well-known immersed boundary method. The original problem is then reformulated as a minimization problem. We propose a shallow Ritz-type neural network with one hidden layer to approximate the global minimizer of the energy functional. As a result, the network is trained by minimizing the loss function that is a discrete version of the energy. In addition, we include the level set function of the interface as a feature input and find that it significantly improves the training efficiency and accuracy. 
We perform a series of numerical tests to demonstrate the accuracy of the present network as well as its capability for problems in irregular domains and in higher dimensions.

【5】 Stable Dynamic Mode Decomposition Algorithm for Noisy Pressure-Sensitive Paint Measurement Data 标题:噪声压敏涂料测量数据的稳定动态模式分解算法

作者:Yuya Ohmichi,Yosuke Sugioka,Kazuyuki Nakakita 机构:Japan Aerospace Exploration Agency, Tokyo ,-, Japan 链接:https://arxiv.org/abs/2107.11999 摘要:在这项研究中,我们提出了截断全最小二乘动态模式分解(T-TLS-DMD)算法,可以对含噪数据进行DMD分析。通过在传统TLS-DMD算法的基础上加入截断正则化,T-TLS-DMD在保持TLS-DMD精度的同时,提高了计算的稳定性。通过对圆柱后尾迹的分析和对抖振单元现象的压敏漆(PSP)数据的分析,评价了该方法的有效性。结果表明了正则化在DMD算法中的重要性。在特征值方面,T-TLS-DMD受噪声影响较小,能够稳定地获得准确的特征值,而TLS和子空间DMD的特征值受噪声影响较大。此外,还观察到标准和精确DMD的特征值存在向阻尼侧偏移的问题,如先前研究中所报告的那样。对于特征向量,T-TLS和精确DMD即使在有噪声的情况下也能清晰地捕捉到特征流型,而TLS和子空间DMD由于噪声的影响不能清晰地捕捉到特征流型。 摘要:In this study, we proposed the truncated total least squares dynamic mode decomposition (T-TLS DMD) algorithm, which can perform DMD analysis of noisy data. By adding truncation regularization to the conventional TLS DMD algorithm, T-TLS DMD improves the stability of the computation while maintaining the accuracy of TLS DMD. The effectiveness of the proposed method was evaluated by the analysis of the wake behind a cylinder and pressure-sensitive paint (PSP) data for the buffet cell phenomenon. The results showed the importance of regularization in the DMD algorithm. With respect to the eigenvalues, T-TLS DMD was less affected by noise, and accurate eigenvalues could be obtained stably, whereas the eigenvalues of TLS and subspace DMD varied greatly due to noise. It was also observed that the eigenvalues of the standard and exact DMD had the problem of shifting to the damping side, as reported in previous studies. With respect to eigenvectors, T-TLS and exact DMD captured the characteristic flow patterns clearly even in the presence of noise, whereas TLS and subspace DMD were not able to capture them clearly due to noise.

【6】 Dissecting FLOPs along input dimensions for GreenAI cost estimations 标题:沿输入维度解剖Flops,用于GreenAI成本估计

作者:Andrea Asperti,Davide Evangelista,Moreno Marzolla 机构: University of Bologna, Department of Informatics: Science and Engineering (DISI), Department of Mathematics 备注:Article accepted at the 7th International Conference on Machine Learning, Optimization, and Data Science. October 4-8, 2021, Grasmere, Lake District, UK 链接:https://arxiv.org/abs/2107.11949 摘要:GreenAI一词指的是一种新的深度学习方法,它更关注其方法的生态影响和计算效率。GreenAI的倡导者建议使用浮点运算(FLOPs)来衡量神经网络的计算成本;然而,这一指标与配备大规模并行处理单元(如GPU或TPU)的硬件的能耗没有很好的相关性。在这篇文章中,我们对用于计算卷积层浮点运算的公式提出了一个简单的改进,称为$\alpha$-FLOPs,它解释并纠正了传统公式在不同层之间的差异,更接近实际情况。$\alpha$-FLOPs的概念依赖于一个关键的洞察,即在输入具有多个维度的情况下,没有理由相信并行所提供的加速比在所有不同的轴上是一致的。 摘要:The term GreenAI refers to a novel approach to Deep Learning, that is more aware of the ecological impact and the computational efficiency of its methods. The promoters of GreenAI suggested the use of Floating Point Operations (FLOPs) as a measure of the computational cost of Neural Networks; however, that measure does not correlate well with the energy consumption of hardware equipped with massively parallel processing units like GPUs or TPUs. In this article, we propose a simple refinement of the formula used to compute floating point operations for convolutional layers, called $\alpha$-FLOPs, explaining and correcting the traditional discrepancy with respect to different layers, and closer to reality. The notion of $\alpha$-FLOPs relies on the crucial insight that, in case of inputs with multiple dimensions, there is no reason to believe that the speedup offered by parallelism will be uniform along all different axes.
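
For context, the conventional operation count that this abstract sets out to refine can be sketched as follows. The abstract does not give the refined formula, so the code below is only the common baseline count for a 2D convolution, under the assumptions that one multiply-accumulate is counted as one operation and that bias terms are ignored; the function names are hypothetical.

```python
def conv_output_size(size_in: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution along one axis."""
    return (size_in + 2 * padding - kernel) // stride + 1


def conv2d_macs(
    h_in: int,
    w_in: int,
    c_in: int,
    c_out: int,
    k_h: int,
    k_w: int,
    stride: int = 1,
    padding: int = 0,
) -> int:
    """Conventional multiply-accumulate count for one 2D convolutional layer.

    Every output element (h_out * w_out * c_out of them) needs
    k_h * k_w * c_in multiply-accumulates. Counting a multiply-accumulate
    as one operation is a convention, not something the abstract fixes.
    """
    h_out = conv_output_size(h_in, k_h, stride, padding)
    w_out = conv_output_size(w_in, k_w, stride, padding)
    return k_h * k_w * c_in * c_out * h_out * w_out


if __name__ == "__main__":
    # Illustrative layer: 32x32 RGB input, 16 filters of size 3x3, "same" padding.
    print(conv2d_macs(32, 32, 3, 16, 3, 3, stride=1, padding=1))
```

This count treats the spatial axes and the channel axes symmetrically, which is exactly the assumption the abstract questions for massively parallel hardware.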

【7】 Measuring Ethics in AI with AI: A Methodology and Dataset Construction 标题:用人工智能测量人工智能中的伦理:一种方法论和数据集构建

作者:Pedro H. C. Avelar,Rafael B. Audibert,Anderson R. Tavares,Luís C. Lamb 机构:Universidade Federal do Rio Grande do Sul 链接:https://arxiv.org/abs/2107.11913 摘要:最近,在人工智能中使用合理的度量和量度已经成为学术界、政府和工业界感兴趣的课题。衡量不同现象的努力在人工智能界得到了广泛的关注,一些有影响力的实地报告和政策文件的发表就说明了这一点。这些指标的目的是帮助决策者了解人工智能和机器学习领域的关键进展的快速发展和影响。在这篇论文中,我们建议使用人工智能技术的这些新发现的能力来增强我们的人工智能测量能力。我们通过训练一个模型来对与道德问题和关注相关的出版物进行分类。在我们的方法中,我们使用一个专家,手工整理的数据集作为训练集,然后评估一大组研究论文。最后,我们强调了人工智能度量的含义,特别是它们对开发可信和公平的人工智能工具和技术的贡献。关键词:人工智能伦理;AI公平;AI测量。计算机科学中的伦理学。 摘要:Recently, the use of sound measures and metrics in Artificial Intelligence has become the subject of interest of academia, government, and industry. Efforts towards measuring different phenomena have gained traction in the AI community, as illustrated by the publication of several influential field reports and policy documents. These metrics are designed to help decision takers to inform themselves about the fast-moving and impacting influences of key advances in Artificial Intelligence in general and Machine Learning in particular. In this paper we propose to use such newfound capabilities of AI technologies to augment our AI measuring capabilities. We do so by training a model to classify publications related to ethical issues and concerns. In our methodology we use an expert, manually curated dataset as the training set and then evaluate a large set of research papers. Finally, we highlight the implications of AI metrics, in particular their contribution towards developing trustful and fair AI-based tools and technologies. Keywords: AI Ethics; AI Fairness; AI Measurement. Ethics in Computer Science.

【8】 Logspace Reducibility From Secret Leakage Planted Clique

Authors: Jay Mardia
Link: https://arxiv.org/abs/2107.11886
Abstract: The planted clique problem is well-studied in the context of observing, explaining, and predicting interesting computational phenomena associated with statistical problems. When equating computational efficiency with the existence of polynomial time algorithms, the computational hardness of (some variant of) the planted clique problem can be used to infer the computational hardness of a host of other statistical problems. Is this ability to transfer computational hardness from (some variant of) the planted clique problem to other statistical problems robust to changing our notion of computational efficiency to space efficiency? We answer this question affirmatively for three different statistical problems, namely Sparse PCA, submatrix detection, and testing almost k-wise independence. The key challenge is that space efficient randomized reductions need to repeatedly access the randomness they use. Known reductions to these problems are all randomized and need polynomially many random bits to implement. Since we cannot store polynomially many random bits in memory, it is unclear how to implement these existing reductions space efficiently. There are two ideas involved in circumventing this issue and implementing known reductions to these problems space efficiently. 1. When solving statistical problems, we can use parts of the input itself as randomness. 2. Secret leakage variants of the planted clique problem with appropriate secret leakage can be more useful than the standard planted clique problem when we want to use parts of the input as randomness. (abstract shortened due to arxiv constraints)

【9】 Neural Circuit Synthesis from Specification Patterns

Authors: Frederik Schmitt, Christopher Hahn, Markus N. Rabe, Bernd Finkbeiner
Affiliations: CISPA Helmholtz Center for Information Security, Saarbrücken, Germany; Google Research, Mountain View, California, USA
Link: https://arxiv.org/abs/2107.11864
Abstract: We train hierarchical Transformers on the task of synthesizing hardware circuits directly out of high-level logical specifications in linear-time temporal logic (LTL). The LTL synthesis problem is a well-known algorithmic challenge with a long history, and an annual competition is organized to track the improvement of algorithms and tooling over time. New approaches using machine learning might open a lot of possibilities in this area, but suffer from the lack of sufficient amounts of training data. In this paper, we consider a method to generate large amounts of additional training data, i.e., pairs of specifications and circuits implementing them. We ensure that this synthetic data is sufficiently close to human-written specifications by mining common patterns from the specifications used in the synthesis competitions. We show that hierarchical Transformers trained on this synthetic data solve a significant portion of problems from the synthesis competitions, and even out-of-distribution examples from a recent case study.

【10】 Distributional Shifts in Automated Diabetic Retinopathy Screening

Authors: Jay Nandy, Wynne Hsu, Mong Li Lee
Affiliations: School of Computing, National University of Singapore; Institute of Data Science, National University of Singapore
Comments: Accepted at IEEE ICIP 2021
Link: https://arxiv.org/abs/2107.11822
Abstract: Deep learning-based models are developed to automatically detect if a retina image is 'referable' in diabetic retinopathy (DR) screening. However, their classification accuracy degrades as the input images distributionally shift from their training distribution. Further, even if the input is not a retina image, a standard DR classifier produces a highly confident prediction that the image is 'referable'. Our paper presents a Dirichlet Prior Network-based framework to address this issue. It utilizes an out-of-distribution (OOD) detector model and a DR classification model to improve generalizability by identifying OOD images. Experiments on real-world datasets indicate that the proposed framework can eliminate the unknown non-retina images and identify the distributionally shifted retina images for human intervention.

【11】 Go Wider Instead of Deeper

Authors: Fuzhao Xue, Ziji Shi, Yuxuan Lou, Yong Liu, Yang You
Affiliations: Department of Computer Science, National University of Singapore, Singapore
Link: https://arxiv.org/abs/2107.11817
Abstract: The transformer has recently achieved impressive results on various tasks. To further improve the effectiveness and efficiency of the transformer, there are two trains of thought among existing works: (1) going wider by scaling to more trainable parameters; (2) going shallower by parameter sharing or model compressing along with the depth. However, larger models usually do not scale well when fewer tokens are available to train, and advanced parallelisms are required when the model is extremely large. Smaller models usually achieve inferior performance compared to the original transformer model due to the loss of representation power. In this paper, to achieve better performance with fewer trainable parameters, we propose a framework to deploy trainable parameters efficiently, by going wider instead of deeper. Specifically, we scale along model width by replacing the feed-forward network (FFN) with mixture-of-experts (MoE). We then share the MoE layers across transformer blocks using individual layer normalization. Such deployment plays the role of transforming various semantic representations, which makes the model more parameter-efficient and effective. To evaluate our framework, we design WideNet and evaluate it on ImageNet-1K. Our best model outperforms Vision Transformer (ViT) by $1.46\%$ with $0.72\times$ trainable parameters. Using $0.46\times$ and $0.13\times$ parameters, our WideNet can still surpass ViT and ViT-MoE by $0.83\%$ and $2.08\%$, respectively.
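The parameter-sharing idea in the abstract above (one MoE layer reused by every transformer block, each block keeping its own layer normalization) can be made concrete with a rough parameter count. The sketch below is an illustration under stated assumptions, not WideNet's actual configuration: it counts only the FFN/MoE sublayer and its LayerNorm, ignores attention, assumes a bias-free linear router, and uses hypothetical sizes (`d_model=768`, `d_ff=3072`, 12 blocks, 4 experts).

```python
def ffn_params(d_model, d_ff):
    # Two linear maps with biases: d_model -> d_ff -> d_model.
    return d_model * d_ff + d_ff + d_ff * d_model + d_model


def layernorm_params(d_model):
    # One scale and one shift vector.
    return 2 * d_model


def moe_params(d_model, d_ff, n_experts):
    # Each expert is an FFN; the router is assumed to be a bias-free linear map.
    router = d_model * n_experts
    return n_experts * ffn_params(d_model, d_ff) + router


def per_block_ffn_stack(n_blocks, d_model, d_ff):
    # Baseline: every block owns its FFN and its LayerNorm.
    return n_blocks * (ffn_params(d_model, d_ff) + layernorm_params(d_model))


def shared_moe_stack(n_blocks, d_model, d_ff, n_experts):
    # One MoE layer shared by all blocks, plus an individual LayerNorm per block.
    return moe_params(d_model, d_ff, n_experts) + n_blocks * layernorm_params(d_model)


baseline = per_block_ffn_stack(n_blocks=12, d_model=768, d_ff=3072)
shared = shared_moe_stack(n_blocks=12, d_model=768, d_ff=3072, n_experts=4)
print(baseline, shared, shared / baseline)
```

With these made-up sizes the shared stack is wider (four experts) yet holds fewer parameters than twelve independent FFNs, because only the small LayerNorm vectors are duplicated per block. The ratios reported in the abstract come from the authors' own configurations and are not reproduced here.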

【12】 SGD May Never Escape Saddle Points

Authors: Liu Ziyin, Botao Li, Masahito Ueda
Affiliations: Department of Physics, University of Tokyo; Laboratoire de Physique de l'Ecole normale supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université Paris-Diderot, Sorbonne Paris Cité; Institute for Physics of Intelligence, University of Tokyo
Link: https://arxiv.org/abs/2107.11774
Abstract: Stochastic gradient descent (SGD) has been deployed to solve highly non-linear and non-convex machine learning problems such as the training of deep neural networks. However, previous works on SGD often rely on highly restrictive and unrealistic assumptions about the nature of noise in SGD. In this work, we mathematically construct examples that defy previous understandings of SGD. For example, our constructions show that: (1) SGD may converge to a local maximum; (2) SGD may escape a saddle point arbitrarily slowly; (3) SGD may prefer sharp minima over the flat ones; and (4) AMSGrad may converge to a local maximum. Our result suggests that the noise structure of SGD might be more important than the loss landscape in neural network training and that future research should focus on deriving the actual noise structure in deep learning.
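To see how claim (1) in the abstract above can even be possible, consider a one-dimensional toy problem with multiplicative gradient noise. This is an illustrative construction chosen for this digest, not necessarily the one used in the paper: the per-sample loss is $a x^2/2$ with a random curvature $a$, so the SGD update is $x \leftarrow (1 - \eta a)x$, and $|x|$ shrinks to zero almost surely whenever $\mathbb{E}[\log|1 - \eta a|] < 0$, even if $\mathbb{E}[a] < 0$ makes $x = 0$ a maximum of the expected loss. The curvature values, probabilities, and learning rate below are assumptions.

```python
import math
import random


def lyapunov_exponent(lr, curvatures, probs):
    """E[log|1 - lr * a|] for the update x <- (1 - lr * a) * x."""
    return sum(p * math.log(abs(1.0 - lr * a)) for a, p in zip(curvatures, probs))


def run_sgd(lr, curvatures, probs, steps, x0=1.0, seed=0):
    """SGD on per-sample losses a * x**2 / 2 with randomly drawn curvature a."""
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        a = rng.choices(curvatures, weights=probs)[0]
        x -= lr * a * x  # gradient of the sampled loss is a * x
    return x


# Toy setting (assumed): curvature is +1.0 or -1.2 with equal probability.
curvatures, probs, lr = [1.0, -1.2], [0.5, 0.5], 0.9
mean_curvature = sum(a * p for a, p in zip(curvatures, probs))

print(mean_curvature)                            # negative: x = 0 maximizes the expected loss
print(lyapunov_exponent(lr, curvatures, probs))  # negative: |x| contracts on average in log scale
print(abs(run_sgd(lr, curvatures, probs, steps=500)))
```

The mean curvature decides the shape of the expected loss, while the expected log-contraction decides where the iterates go; the two can disagree, which is one way to read the abstract's remark that the noise structure can matter more than the loss landscape.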

【13】 Two Headed Dragons: Multimodal Fusion and Cross Modal Transactions

Authors: Rupak Bose, Shivam Pande, Biplab Banerjee
Affiliations: Centre of Studies in Resources Engineering, Indian Institute of Technology Bombay, India
Comments: Accepted in IEEE International Conference on Image Processing (ICIP), 2021
Link: https://arxiv.org/abs/2107.11585
Abstract: As the field of remote sensing is evolving, we witness the accumulation of information from several modalities, such as multispectral (MS), hyperspectral (HSI), LiDAR, etc. Each of these modalities possesses its own distinct characteristics, and when combined synergistically, they perform very well in recognition and classification tasks. However, fusing multiple modalities in remote sensing is cumbersome due to highly disparate domains. Furthermore, the existing methods do not facilitate cross-modal interactions. To this end, we propose a novel transformer-based fusion method for HSI and LiDAR modalities. The model is composed of stacked autoencoders that harness the cross key-value pairs for HSI and LiDAR, thus establishing communication between the two modalities, while simultaneously using CNNs to extract the spectral and spatial information from HSI and LiDAR. We test our model on the Houston (Data Fusion Contest - 2013) and MUUFL Gulfport datasets and achieve competitive results.

【14】 Imbalanced Big Data Oversampling: Taxonomy, Algorithms, Software, Guidelines and Future Directions

Authors: William C. Sleeman IV, Bartosz Krawczyk
Affiliations: Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
Comments: 52 pages, 9 tables, 13 figures, 15 algorithms
Link: https://arxiv.org/abs/2107.11508
Abstract: Learning from imbalanced data is among the most challenging areas in contemporary machine learning. This becomes even more difficult when considering the context of big data, which calls for dedicated architectures capable of high-performance processing. Apache Spark is a highly efficient and popular architecture, but it poses specific challenges for algorithms to be implemented for it. While oversampling algorithms are an effective way of handling class imbalance, they have not been designed for distributed environments. In this paper, we propose a holistic look at oversampling algorithms for imbalanced big data. We discuss the taxonomy of oversampling algorithms and their mechanisms used to handle skewed class distributions. We introduce a Spark library with 14 state-of-the-art oversampling algorithms implemented and evaluate their efficacy via an extensive experimental study. Using binary and multi-class massive data sets, we analyze the effectiveness of oversampling algorithms and their relationships with different types of classifiers. We evaluate the trade-off between accuracy and time complexity of oversampling algorithms, as well as their scalability when increasing the size of data. This allows us to gain insight into the usefulness of specific components of oversampling algorithms for big data, as well as formulate guidelines and recommendations for designing future resampling approaches for massive imbalanced data. Our library can be downloaded from https://github.com/fsleeman/spark-class-balancing.git.
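As a reminder of what oversampling does in its simplest form, here is a minimal single-machine sketch of random oversampling: minority classes are padded with randomly duplicated samples until every class matches the largest one. It is not the API of the Spark library mentioned above, and it is not claimed to be one of its 14 algorithms; the function name and toy data are assumptions for illustration.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    target = max(len(xs) for xs in by_class.values())  # size of the largest class
    out_x, out_y = list(samples), list(labels)          # originals are kept untouched
    for y, xs in by_class.items():
        for _ in range(target - len(xs)):
            out_x.append(rng.choice(xs))                # draw with replacement
            out_y.append(y)
    return out_x, out_y


# Toy data: 8 samples of class 0, 2 samples of class 1.
X = list(range(10))
y = [0] * 8 + [1] * 2
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

In a distributed setting the per-class grouping and the neighbor searches used by more elaborate methods are the expensive parts, which is presumably why the abstract stresses that existing algorithms were not designed for such environments.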

【15】 6DCNN with roto-translational convolution filters for volumetric data processing

Authors: Dmitrii Zhemchuzhnikov, Ilia Igashov, Sergei Grudinin
Affiliations: Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, Grenoble, France
Link: https://arxiv.org/abs/2107.12078
Abstract: In this work, we introduce the 6D Convolutional Neural Network (6DCNN), designed to tackle the problem of detecting relative positions and orientations of local patterns when processing three-dimensional volumetric data. 6DCNN also includes SE(3)-equivariant message-passing and nonlinear activation operations constructed in the Fourier space. Working in the Fourier space allows us to significantly reduce the computational complexity of our operations. We demonstrate the properties of the 6D convolution and its efficiency in the recognition of spatial patterns. We also assess the 6DCNN model on several datasets from the recent CASP protein structure prediction challenges. Here, 6DCNN improves over the baseline architecture and also outperforms the state of the art.

【16】 Restless Bandits with Many Arms: Beating the Central Limit Theorem

Authors: Xiangyu Zhang, Peter I. Frazier
Affiliations: Department of Operations Research and Information Engineering, Cornell University
Link: https://arxiv.org/abs/2107.11911
Abstract: We consider finite-horizon restless bandits with multiple pulls per period, which play an important role in recommender systems, active learning, revenue management, and many other areas. While an optimal policy can be computed, in principle, using dynamic programming, the computation required scales exponentially in the number of arms $N$. Thus, there is substantial value in understanding the performance of index policies and other policies that can be computed efficiently for large $N$. We study the growth of the optimality gap, i.e., the loss in expected performance compared to an optimal policy, for such policies in a classical asymptotic regime proposed by Whittle in which $N$ grows while holding constant the fraction of arms that can be pulled per period. Intuition from the Central Limit Theorem and the tightest previous theoretical bounds suggest that this optimality gap should grow like $O(\sqrt{N})$. Surprisingly, we show that it is possible to outperform this bound. We characterize a non-degeneracy condition and a wide class of novel practically-computable policies, called fluid-priority policies, in which the optimality gap is $O(1)$. These include most widely-used index policies. When this non-degeneracy condition does not hold, we show that fluid-priority policies nevertheless have an optimality gap that is $O(\sqrt{N})$, significantly generalizing the class of policies for which convergence rates are known. We demonstrate that fluid-priority policies offer state-of-the-art performance on a collection of restless bandit problems in numerical experiments.
