访问www.arxivdaily.com获取含摘要速递,涵盖CS|物理|数学|经济|统计|金融|生物|电气领域,更有搜索、收藏、发帖等功能!
cs.LG 方向,今日共计147篇
Graph相关(图学习|图神经网络|图优化等)(3篇)
【1】 DGL-LifeSci: An Open-Source Toolkit for Deep Learning on Graphs in Life Science 标题:DGL-LifeSci:面向生命科学图形深度学习的开源工具包
作者:Mufei Li,Jinjing Zhou,Jiajing Hu,Wenxuan Fan,Yangkang Zhang,Yaxin Gu,George Karypis 机构: AWS Shanghai AI Lab, King’s College London, East China University of Science and Technology, Zhejiang University, AWS AI 链接:https://arxiv.org/abs/2106.14232 摘要:图神经网络(GNN)是一类对图数据进行深度学习的方法,在分子性质预测、反应预测、药物-靶点相互作用预测等化学和生物学问题中有着广泛的应用。尽管广受关注,基于GNN的建模仍然具有挑战性,因为除了编程和深度学习之外,它还需要对图数据进行预处理和建模。这里我们介绍DGL-LifeSci,一个用于生命科学中图深度学习的开源软件包。DGL-LifeSci是一个基于RDKit、PyTorch和Deep Graph Library(DGL)的Python工具包。DGL-LifeSci支持在自定义数据集上进行基于GNN的建模,用于分子性质预测、反应预测和分子生成。通过它的命令行接口,用户可以在没有任何编程和深度学习背景的情况下完成建模。我们使用标准基准MoleculeNet、USPTO和ZINC测试了命令行接口。与以前的实现相比,DGL-LifeSci实现了高达6倍的加速。在建模灵活性方面,DGL-LifeSci为建模流程的各个阶段提供了优化良好的模块。此外,DGL-LifeSci还提供了预训练模型,用于复现测试实验结果以及免训练地直接应用模型。代码以Apache-2.0许可证分发,可在 https://github.com/awslabs/dgl-lifesci 免费获取。 摘要:Graph neural networks (GNNs) constitute a class of deep learning methods for graph data. They have wide applications in chemistry and biology, such as molecular property prediction, reaction prediction and drug-target interaction prediction. Despite the interest, GNN-based modeling is challenging as it requires graph data pre-processing and modeling in addition to programming and deep learning. Here we present DGL-LifeSci, an open-source package for deep learning on graphs in life science. DGL-LifeSci is a python toolkit based on RDKit, PyTorch and Deep Graph Library (DGL). DGL-LifeSci allows GNN-based modeling on custom datasets for molecular property prediction, reaction prediction and molecule generation. With its command-line interfaces, users can perform modeling without any background in programming and deep learning. We test the command-line interfaces using standard benchmarks MoleculeNet, USPTO, and ZINC. Compared with previous implementations, DGL-LifeSci achieves a speed up by up to 6x. For modeling flexibility, DGL-LifeSci provides well-optimized modules for various stages of the modeling pipeline. In addition, DGL-LifeSci provides pre-trained models for reproducing the test experiment results and applying models without training. The code is distributed under an Apache-2.0 License and is freely accessible at https://github.com/awslabs/dgl-lifesci.
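下面附一段极简示意代码(并非 DGL-LifeSci 官方示例;图结构、特征维度与超参数均为演示用的假设值),用 DGL + PyTorch 展示该工具包所封装的"分子图 + GCN 做图级性质回归"这类流程的大致形态:

# 极简示意:用 DGL + PyTorch 搭一个 GCN 回归器,对单个分子图做图级性质预测
# (DGL-LifeSci 在此类流程外层封装了 RDKit 特征化、数据集与命令行接口)
import torch
import torch.nn as nn
import dgl
from dgl.nn import GraphConv

class GCNRegressor(nn.Module):
    def __init__(self, in_feats, hidden=64):
        super().__init__()
        self.conv1 = GraphConv(in_feats, hidden)
        self.conv2 = GraphConv(hidden, hidden)
        self.readout = nn.Linear(hidden, 1)   # 图级性质(如溶解度)回归头

    def forward(self, g, feats):
        h = torch.relu(self.conv1(g, feats))
        h = torch.relu(self.conv2(g, h))
        g.ndata['h'] = h
        hg = dgl.mean_nodes(g, 'h')           # 节点表示做平均池化,得到图表示
        return self.readout(hg)

# 假设的 4 原子小分子图(无向边以双向边表示),原子特征维度 16
src = torch.tensor([0, 1, 1, 2, 2, 3])
dst = torch.tensor([1, 0, 2, 1, 3, 2])
g = dgl.add_self_loop(dgl.graph((src, dst), num_nodes=4))
feats = torch.randn(4, 16)
model = GCNRegressor(in_feats=16)
print(model(g, feats))                        # 形状 [1, 1] 的性质预测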
【2】 Graph Convolutional Memory for Deep Reinforcement Learning 标题:深度强化学习的图卷积记忆
作者:Steven D. Morad,Stephan Liwicki,Amanda Prorok 机构:Department of Computer Science and Technology, University of Cambridge UK, Toshiba Europe Ltd. 链接:https://arxiv.org/abs/2106.14117 摘要:解决部分可观测马尔可夫决策过程(POMDPs)是将深度强化学习(DRL)应用于现实世界机器人问题的关键,在现实世界中,agent对世界的看法是不完全的。提出了一种利用深度强化学习求解POMDPs的图卷积存储器(GCM)。与递归神经网络(RNN)或变换器不同,GCM通过知识图将特定领域的先验知识嵌入到记忆召回过程中。通过在图中封装先验知识,GCM可以适应特定的任务,但仍然适用于任何DRL任务。利用图卷积,GCM提取层次图特征,类似于卷积神经网络(CNN)中的图像特征。我们发现GCM在控制、长期非顺序回忆和3D导航任务上优于长-短期记忆(LSTM)、强化学习的门控Transformer(GTrXL)和可微神经计算机(DNCs),同时使用的参数明显较少。 摘要:Solving partially-observable Markov decision processes (POMDPs) is critical when applying deep reinforcement learning (DRL) to real-world robotics problems, where agents have an incomplete view of the world. We present graph convolutional memory (GCM) for solving POMDPs using deep reinforcement learning. Unlike recurrent neural networks (RNNs) or transformers, GCM embeds domain-specific priors into the memory recall process via a knowledge graph. By encapsulating priors in the graph, GCM adapts to specific tasks but remains applicable to any DRL task. Using graph convolutions, GCM extracts hierarchical graph features, analogous to image features in a convolutional neural network (CNN). We show GCM outperforms long short-term memory (LSTM), gated transformers for reinforcement learning (GTrXL), and differentiable neural computers (DNCs) on control, long-term non-sequential recall, and 3D navigation tasks while using significantly fewer parameters.
【3】 Spectral-Spatial Graph Reasoning Network for Hyperspectral Image Classification 标题:用于高光谱图像分类的光谱-空间图推理网络
作者:Di Wang,Bo Du,Liangpei Zhang 机构: Zhang is with the State Key Laboratory of Information Engineering in Surveying 链接:https://arxiv.org/abs/2106.13952 摘要:提出了一种用于高光谱图像分类的光谱空间图推理网络(SSGRN)。具体地说,该网络由空间图推理子网(SAGRN)和谱图推理子网(SEGRN)两部分组成,分别捕获空间图和谱图的上下文。与以往对原始图像进行超像素分割或试图在标签图像的指导下获取类别特征的方法不同,本文对网络的中间特征进行超像素分割,自适应地产生同质区域,得到有效的描述子。然后,我们在谱部分采用了相似的思想,合理地聚合通道,生成谱描述符,用于谱图上下文捕获。SAGRN和SEGRN中的所有图推理过程都是通过图卷积实现的。为了保证该方法的全局感知能力,利用非局部自注意机制获得了图推理中的所有邻接矩阵。最后,结合提取的空间和谱图上下文,得到SSGRN,实现了高精度的分类。在三个公共高光谱图像(HSI)基准上进行的大量定量和定性实验表明,与其他最先进的方法相比,所提出的方法具有竞争力。 摘要:In this paper, we propose a spectral-spatial graph reasoning network (SSGRN) for hyperspectral image (HSI) classification. Concretely, this network contains two parts that separately named spatial graph reasoning subnetwork (SAGRN) and spectral graph reasoning subnetwork (SEGRN) to capture the spatial and spectral graph contexts, respectively. Different from the previous approaches implementing superpixel segmentation on the original image or attempting to obtain the category features under the guide of label image, we perform the superpixel segmentation on intermediate features of the network to adaptively produce the homogeneous regions to get the effective descriptors. Then, we adopt a similar idea in spectral part that reasonably aggregating the channels to generate spectral descriptors for spectral graph contexts capturing. All graph reasoning procedures in SAGRN and SEGRN are achieved through graph convolution. To guarantee the global perception ability of the proposed methods, all adjacent matrices in graph reasoning are obtained with the help of non-local self-attention mechanism. At last, by combining the extracted spatial and spectral graph contexts, we obtain the SSGRN to achieve a high accuracy classification. Extensive quantitative and qualitative experiments on three public HSI benchmarks demonstrate the competitiveness of the proposed methods compared with other state-of-the-art approaches.
Transformer(6篇)
【1】 TENT: Tensorized Encoder Transformer for Temperature Forecasting 标题:TENT:用于温度预报的张量化编码器Transformer
作者:Onur Bilgin,Paweł Mąka,Thomas Vergutz,Siamak Mehrkanoon 机构:Department of Data Science and Knowledge Engineering, Maastricht University, The Netherlands 备注:9 pages, 10 figures 链接:https://arxiv.org/abs/2106.14742 摘要:可靠的天气预报在科学、商业和社会中具有重要意义。用于天气预报任务的最佳数据驱动模型依赖于递归或卷积神经网络,其中一些神经网络包含注意机制。在这项工作中,我们提出了一个基于Transformer架构的天气预报新模型。所提出的张量化编码器Transformer(TENT)模型配备了张量化注意力,通过以多维张量格式处理气象数据,充分利用了气象数据的时空结构。结果表明,与原始Transformer的编码器部分和三维卷积神经网络相比,本文提出的TENT模型能更好地建模天气数据的复杂潜在模式,用于温度预报任务。在两个实际气象数据集上进行了实验。数据集包括来自美国、加拿大和欧洲城市的历史测量数据。第一个数据集包含2012年10月至2017年11月美国和加拿大30个城市的每小时天气属性测量值。第二个数据集包含2005年5月至2020年4月欧洲18个城市的每日天气属性测量值。我们利用注意力机制计算出的注意力得分来阐明模型的决策过程,并洞察对该任务最重要的城市。 摘要:Reliable weather forecasting is of great importance in science, business and society. The best performing data-driven models for weather prediction tasks rely on recurrent or convolutional neural networks, where some of which incorporate attention mechanisms. In this work, we introduce a new model based on the Transformer architecture for weather forecasting. The proposed Tensorial Encoder Transformer (TENT) model is equipped with tensorial attention and thus it exploits the spatiotemporal structure of weather data by processing it in multidimensional tensorial format. We show that compared to the encoder part of the original transformer and 3D convolutional neural networks, the proposed TENT model can better model the underlying complex pattern of weather data for the studied temperature prediction task. Experiments on two real-life weather datasets are performed. The datasets consist of historical measurements from USA, Canada and European cities. The first dataset contains hourly measurements of weather attributes for 30 cities in USA and Canada from October 2012 to November 2017. The second dataset contains daily measurements of weather attributes of 18 cities across Europe from May 2005 to April 2020. We use attention scores calculated from our attention mechanism to shed light on the decision-making process of our model and have insight knowledge on the most important cities for the task.
【2】 Complexity-based partitioning of CSFI problem instances with Transformers 标题:使用Transformers对CSFI问题实例进行基于复杂度的划分
作者:Luca Benedetto,Paolo Fantozzi,Luigi Laura 机构: di Informatica e Sistemistica, Universita di Roma ”La Sapienza”; International Telematic University Uninettuno 链接:https://arxiv.org/abs/2106.14481 摘要:本文提出了一种两步的方法,将合取范式(CNF)句法公式同构问题(CSFI)的实例划分为不同复杂度的组。首先,我们建立了一个基于Transformer架构的模型,试图解决CSFI问题。然后,我们利用这种模型的错误,训练第二个基于Transformer的模型,将问题实例划分为不同复杂度的组,从而检测出不需要花费太多资源就能解决的问题实例。我们在一个伪随机生成的数据集上对所提出的方法进行了评估,取得了有希望的结果。最后,我们讨论了将这种方法扩展到基于相同类型文本表示的其他问题的可能性。 摘要:In this paper, we propose a two-steps approach to partition instances of the Conjunctive Normal Form (CNF) Syntactic Formula Isomorphism problem (CSFI) into groups of different complexity. First, we build a model, based on the Transformer architecture, that attempts to solve instances of the CSFI problem. Then, we leverage the errors of such model and train a second Transformer-based model to partition the problem instances into groups of different complexity, thus detecting the ones that can be solved without using too expensive resources. We evaluate the proposed approach on a pseudo-randomly generated dataset and obtain promising results. Finally, we discuss the possibility of extending this approach to other problems based on the same type of textual representation.
【3】 Feature Combination Meets Attention: Baidu Soccer Embeddings and Transformer based Temporal Detection 标题:当特征组合遇上注意力:百度足球嵌入与基于Transformer的时序检测
作者:Xin Zhou,Le Kang,Zhiyu Cheng,Bo He,Jingyu Xin 机构:Baidu Research, Bordeaux Dr, Sunnyvale, CA , USA 备注:Tech Report. Authors Xin Zhou, Le Kang, and Zhiyu Cheng made equal contributions 链接:https://arxiv.org/abs/2106.14447 摘要:随着互联网技术和新兴工具的迅速发展,与体育相关的在线视频以前所未有的速度增长。为了实现体育视频编辑/亮点生成过程的自动化,一个关键的任务是准确地识别和定位长视频中的事件。在这份技术报告中,我们提出了一个两阶段的范例来检测足球广播视频中发生了什么以及什么时候发生的事件。具体来说,我们对足球数据中的多个动作识别模型进行了微调,以提取高层语义特征,并设计了一个基于变换器的时态检测模块来定位目标事件。在CVPR 2021 ActivityNet研讨会的SoccerNet-v2挑战赛中,这种方法在动作捕捉和重放接地这两项任务中都取得了最先进的性能。我们的足球嵌入功能发布于https://github.com/baidu-research/vidpress-sports. 通过与更广泛的社区分享这些特征,我们希望能够加速足球视频理解的研究。 摘要:With rapidly evolving internet technologies and emerging tools, sports related videos generated online are increasing at an unprecedentedly fast pace. To automate sports video editing/highlight generation process, a key task is to precisely recognize and locate the events in the long untrimmed videos. In this tech report, we present a two-stage paradigm to detect what and when events happen in soccer broadcast videos. Specifically, we fine-tune multiple action recognition models on soccer data to extract high-level semantic features, and design a transformer based temporal detection module to locate the target events. This approach achieved the state-of-the-art performance in both two tasks, i.e., action spotting and replay grounding, in the SoccerNet-v2 Challenge, under CVPR 2021 ActivityNet workshop. Our soccer embedding features are released at https://github.com/baidu-research/vidpress-sports. By sharing these features with the broader community, we hope to accelerate the research into soccer video understanding.
【4】 A Reinforcement Learning Approach for Sequential Spatial Transformer Networks 标题:一种序贯空间Transformer网络的强化学习方法
作者:Fatemeh Azimi,Federico Raue,Joern Hees,Andreas Dengel 机构: TU Kaiserslautern, Germany, Smart Data and Knowledge Services, German Research Center for Artificial, Intelligence (DFKI), Germany 链接:https://arxiv.org/abs/2106.14295 摘要:空间变换网络(STN)可以产生几何变换来修改输入图像,从而提高分类器的性能。在这项工作中,我们将STN的思想与强化学习(RL)相结合。为此,我们将仿射变换分解为一系列简单的离散变换。我们将任务描述为一个马尔可夫决策过程(MDP),并使用RL来解决这个顺序决策问题。STN结构通过最小化分类误差和通过次可微采样模块反向传播梯度来学习变换参数。在我们的方法中,我们不受采样模的可微性的限制。此外,我们可以自由地设计目标,而不仅仅是最小化误差;e、 我们可以直接将目标设定为最大化精度。我们设计了多个实验来验证我们的方法的有效性,使用杂乱的MNIST和时尚的MNIST数据集,并表明我们的方法优于STN与适当的MDP组件的定义。 摘要:Spatial Transformer Networks (STN) can generate geometric transformations which modify input images to improve the classifier's performance. In this work, we combine the idea of STN with Reinforcement Learning (RL). To this end, we break the affine transformation down into a sequence of simple and discrete transformations. We formulate the task as a Markovian Decision Process (MDP) and use RL to solve this sequential decision-making problem. STN architectures learn the transformation parameters by minimizing the classification error and backpropagating the gradients through a sub-differentiable sampling module. In our method, we are not bound to the differentiability of the sampling modules. Moreover, we have freedom in designing the objective rather than only minimizing the error; e.g., we can directly set the target as maximizing the accuracy. We design multiple experiments to verify the effectiveness of our method using cluttered MNIST and Fashion-MNIST datasets and show that our method outperforms STN with a proper definition of MDP components.
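下面是一段示意代码(动作集合、角度与步长等均为假设值,并非论文实现),说明"把仿射变换拆成一串离散动作、按 MDP 逐步选择并作用于图像"这一设定的大致形态:

# 极简示意:离散化的仿射变换动作集合,RL 智能体每步选择其一作用于输入图像,
# 奖励可直接设为下游分类器准确率(或负损失)的提升
import torch
import torchvision.transforms.functional as TF

ACTIONS = [
    lambda img: TF.affine(img, angle=5,  translate=(0, 0), scale=1.0,  shear=0),   # 小幅旋转
    lambda img: TF.affine(img, angle=-5, translate=(0, 0), scale=1.0,  shear=0),
    lambda img: TF.affine(img, angle=0,  translate=(2, 0), scale=1.0,  shear=0),   # 平移
    lambda img: TF.affine(img, angle=0,  translate=(0, 0), scale=1.05, shear=0),   # 缩放
    lambda img: img,                                                                # 不变/停止
]

def apply_action_sequence(img, action_ids):
    for a in action_ids:            # MDP 中的一条轨迹:依次应用离散变换
        img = ACTIONS[a](img)
    return img

img = torch.rand(3, 32, 32)         # 假设的输入图像
out = apply_action_sequence(img, [0, 2, 3])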
【5】 SymbolicGPT: A Generative Transformer Model for Symbolic Regression 标题:SymbolicGPT:一种用于符号回归的生成式Transformer模型
作者:Mojtaba Valipour,Bowen You,Maysum Panju,Ali Ghodsi 机构:University of Waterloo 备注:11 pages, 4 figures 链接:https://arxiv.org/abs/2106.14131 摘要:符号回归的任务是识别一个数学表达式,该表达式最适合所提供的输入和输出值数据集。由于数学表达式空间的丰富性,符号回归通常是一个具有挑战性的问题。传统的基于遗传进化算法的方法已经使用了几十年,而基于深度学习的方法是一个相对较新的和活跃的研究领域。在这项工作中,我们提出了一个新的基于Transformer的符号回归语言模型SymbolicGPT。该模型利用了概率语言模型(如GPT)的优点,包括性能上的优势和灵活性。通过综合实验,我们发现我们的模型在精确度、运行时间和数据效率方面都比竞争模型有很强的表现。 摘要:Symbolic regression is the task of identifying a mathematical expression that best fits a provided dataset of input and output values. Due to the richness of the space of mathematical expressions, symbolic regression is generally a challenging problem. While conventional approaches based on genetic evolution algorithms have been used for decades, deep learning-based methods are relatively new and an active research area. In this work, we present SymbolicGPT, a novel transformer-based language model for symbolic regression. This model exploits the advantages of probabilistic language models like GPT, including strength in performance and flexibility. Through comprehensive experiments, we show that our model performs strongly compared to competing models with respect to the accuracy, running time, and data efficiency.
【6】 Self-Attentive Ensemble Transformer: Representing Ensemble Interactions in Neural Networks for Earth System Models 标题:自注意力集合Transformer:在面向地球系统模式的神经网络中表示集合成员间的相互作用
作者:Tobias Sebastian Finn 机构:Meteorological Institute, University of Hamburg, Germany; International Max Planck Research School on Earth System Modelling, Max Planck Institute for Meteorology 备注:6 Pages, 3 Figures, Accepted at the ICML 2021 workshop "Tackling Climate Change with Machine Learning", Code to the paper: this https URL 链接:https://arxiv.org/abs/2106.13924 摘要:来自地球系统模式的集合数据必须经过校准和后处理。本文提出了一种新的基于神经网络的逐成员后处理方法。我将集合数据同化中的想法与自注意力相结合,得到自注意力集合Transformer。在这里,集合成员之间的相互作用被表示为加性且动态的自注意力部分。作为概念验证,将全球ECMWF集合预报回归到ERA5再分析的2米温度场。我证明了集合Transformer可以校准集合离散度,并从集合中提取额外的信息。此外,集合Transformer直接输出多变量且空间相干的集合成员。因此,自注意力与Transformer技术可能是利用神经网络对集合数据进行逐成员后处理所缺失的一环。 摘要:Ensemble data from Earth system models has to be calibrated and post-processed. I propose a novel member-by-member post-processing approach with neural networks. I bridge ideas from ensemble data assimilation with self-attention, resulting in the self-attentive ensemble transformer. Here, interactions between ensemble members are represented as additive and dynamic self-attentive part. As proof-of-concept, global ECMWF ensemble forecasts are regressed to 2-metre-temperature fields from the ERA5 reanalysis. I demonstrate that the ensemble transformer can calibrate the ensemble spread and extract additional information from the ensemble. Furthermore, the ensemble transformer directly outputs multivariate and spatially-coherent ensemble members. Therefore, self-attention and the transformer technique can be a missing piece for a member-by-member post-processing of ensemble data with neural networks.
GAN|对抗|攻击|生成相关(10篇)
【1】 Feature Importance Guided Attack: A Model Agnostic Adversarial Attack 标题:特征重要性引导攻击:一种模型无关的对抗攻击
作者:Gilad Gressel,Niranjan Hegde,Archana Sreekumar,Michael Darling 机构:Darling, Center for Cybersecurity Systems and Networks, Amrita University, Kerala, India, Sandia National Laboratories, Albuquerque, USA 链接:https://arxiv.org/abs/2106.14815 摘要:机器学习模型容易受到敌方攻击,这大大降低了它们的性能。对这些攻击的可靠防御是一个尚未解决的挑战。在这项工作中,我们提出了一种新的规避攻击:特征重要性引导攻击(FIGA),它生成对抗性规避样本。FIGA是一种模型无关的算法,它不需要预先知道模型的学习算法,只需要知道模型的特征表示。FIGA利用特征重要性排名;它沿着我们希望模拟的目标类的方向干扰输入的最重要的特性。我们演示了针对八个网络钓鱼检测模型的FIGA。我们通过干扰对手可以控制的钓鱼网站功能来保持攻击的真实性。使用FIGA,我们能够使钓鱼检测模型的F1得分平均从0.96降低到0.41。最后,我们将对抗性训练作为对抗FIGA的一种防御手段,并证明了虽然对抗性训练有时是有效的,但是可以通过改变FIGA的参数来规避。 摘要:Machine learning models are susceptible to adversarial attacks which dramatically reduce their performance. Reliable defenses to these attacks are an unsolved challenge. In this work, we present a novel evasion attack: the 'Feature Importance Guided Attack' (FIGA) which generates adversarial evasion samples. FIGA is model agnostic, it assumes no prior knowledge of the defending model's learning algorithm, but does assume knowledge of the feature representation. FIGA leverages feature importance rankings; it perturbs the most important features of the input in the direction of the target class we wish to mimic. We demonstrate FIGA against eight phishing detection models. We keep the attack realistic by perturbing phishing website features that an adversary would have control over. Using FIGA we are able to cause a reduction in the F1-score of a phishing detection model from 0.96 to 0.41 on average. Finally, we implement adversarial training as a defense against FIGA and show that while it is sometimes effective, it can be evaded by changing the parameters of FIGA.
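下面给出一个极简示意(合成数据、特征数与步长均为假设值,并非论文官方实现),展示"按特征重要性排序、把最重要的 k 个特征向目标类均值方向扰动"这一 FIGA 式思路:

# 极简示意:用特征重要性引导扰动方向,构造规避样本
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X, y)

def figa_like_perturb(x, target_class=0, k=5, eps=0.5):
    # 将 x 最重要的 k 个特征向 target_class(欲模仿的类)的特征均值方向移动
    top = np.argsort(clf.feature_importances_)[::-1][:k]
    target_mean = X[y == target_class].mean(axis=0)
    x_adv = x.copy()
    x_adv[top] += eps * (target_mean[top] - x[top])
    return x_adv

x = X[y == 1][0]                                 # 取一个"恶意"样本
x_adv = figa_like_perturb(x)
print(clf.predict([x]), clf.predict([x_adv]))    # 扰动前后的预测类别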
【2】 Speech2Properties2Gestures: Gesture-Property Prediction as a Tool for Generating Representational Gestures from Speech 标题:Speech2Properties2Gestures:手势属性预测作为从语音生成代表性手势的工具
作者:Taras Kucherenko,Rajmund Nagy,Patrik Jonell,Michael Neff,Hedvig Kjellström,Gustav Eje Henter 备注:Accepted for publication at the ACM International Conference on Intelligent Virtual Agents (IVA 2021) 链接:https://arxiv.org/abs/2106.14736 摘要:我们提出了一个新的手势生成框架,旨在允许数据驱动的方法生成语义更丰富的手势。我们的方法首先预测是否需要手势,然后预测手势的属性。然后,这些特性被用作现代概率手势生成模型的条件,该模型能够获得高质量的输出。这使得该方法能够生成多样性和代表性的手势。 摘要:We propose a new framework for gesture generation, aiming to allow data-driven approaches to produce more semantically rich gestures. Our approach first predicts whether to gesture, followed by a prediction of the gesture properties. Those properties are then used as conditioning for a modern probabilistic gesture-generation model capable of high-quality output. This empowers the approach to generate gestures that are both diverse and representational.
【3】 Scalable Optimal Classifiers for Adversarial Settings under Uncertainty 标题:不确定条件下对抗性设置的可扩展最优分类器
作者:Patrick Loiseau,Benjamin Roussillon 机构:Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG 备注:24 pages, 9 figures 链接:https://arxiv.org/abs/2106.14702 摘要:我们考虑在一个敌对环境中寻找最优分类器的问题,在这个环境中,一类数据是由攻击者生成的,而攻击者的目标并不为防御者所知——这是现实应用的关键,但迄今为止在文献中一直被忽略。为了模拟这种情况,我们提出了一个贝叶斯博弈框架,在这个框架中,防御者选择一个分类器,而对可能的分类器集没有先验限制。该框架的关键难点在于,可能的分类器集合在可能的数据集合中是指数的,而可能的数据集合本身在用于分类的特征数量上是指数的。为了克服这一点,我们首先证明了贝叶斯纳什均衡可以完全通过具有少量参数的函数阈值分类器来刻画。然后,我们证明了这种低维特征能够开发一种训练方法,以可伸缩的方式计算可证明的近似最优分类器;并提出了一种低后悔度的在线学习算法(与可能数据集的维数无关)。我们通过模拟来说明我们的结果。 摘要:We consider the problem of finding optimal classifiers in an adversarial setting where the class-1 data is generated by an attacker whose objective is not known to the defender -- an aspect that is key to realistic applications but has so far been overlooked in the literature. To model this situation, we propose a Bayesian game framework where the defender chooses a classifier with no a priori restriction on the set of possible classifiers. The key difficulty in the proposed framework is that the set of possible classifiers is exponential in the set of possible data, which is itself exponential in the number of features used for classification. To counter this, we first show that Bayesian Nash equilibria can be characterized completely via functional threshold classifiers with a small number of parameters. We then show that this low-dimensional characterization enables to develop a training method to compute provably approximately optimal classifiers in a scalable manner; and to develop a learning algorithm for the online setting with low regret (both independent of the dimension of the set of possible data). We illustrate our results through simulations.
【4】 GAN-MDF: A Method for Multi-fidelity Data Fusion in Digital Twins 标题:GAN-MDF:一种面向数字孪生的多保真数据融合方法
作者:Lixue Liu,Chao Zhang,Dacheng Tao 机构:School of Mathematical Sciences, Dalian University of Technology, Liaoning, P.R. China., Key Laboratory for Computational Mathematics and Data Intelligence of Liaoning Province, School of 链接:https://arxiv.org/abs/2106.14655 摘要:物联网(IoT)收集智能工厂、智能机器人、医疗保健系统等物理系统的实时数据,为数字孪生提供必要的支持。根据质量和精度,这些多源数据被划分为不同的保真度级别。高保真(HF)响应准确地描述了感兴趣的系统,但计算成本很高。相比之下,低保真(LF)响应具有较低的计算成本,但不能满足所需的精度。多保真度数据融合(MDF)方法的目标是利用大量的LF样本和少量的HF样本,建立一个准确有效的模型,以合理的计算量描述系统。在本文中,我们提出了一种用于数字孪生中MDF的新型生成对抗网络(GAN-MDF)。GAN-MDF的生成器由两个子网络组成:一个子网络从输入中提取LF特征;另一个子网络将输入与提取的LF特征融合,作为后续判别器的输入。GAN-MDF的判别器判断生成器输出是否为HF模型产生的真实样本。为了提高GAN-MDF训练的稳定性,我们还引入了有监督损失技巧,在对抗训练的每次迭代中细化生成器权重。与现有方法相比,本文提出的GAN-MDF具有以下优点:1)无论样本结构是嵌套还是非嵌套,其性能都很好;2)对数据分布没有具体假设;3)即使只提供很少的HF样本,它也具有很高的鲁棒性。实验结果也证实了GAN-MDF的有效性。 摘要:The Internet of Things (IoT) collects real-time data of physical systems, such as smart factory, intelligent robot and healthcare system, and provides necessary support for digital twins. Depending on the quality and accuracy, these multi-source data are divided into different fidelity levels. High-fidelity (HF) responses describe the system of interest accurately but are computed costly. In contrast, low-fidelity (LF) responses have a low computational cost but could not meet the required accuracy. Multi-fidelity data fusion (MDF) methods aim to use massive LF samples and small amounts of HF samples to develop an accurate and efficient model for describing the system with a reasonable computation burden. In this paper, we propose a novel generative adversarial network for MDF in digital twins (GAN-MDF). The generator of GAN-MDF is composed of two sub-networks: one extracts the LF features from an input; and the other integrates the input and the extracted LF features to form the input of the subsequent discriminator. The discriminator of GAN-MDF identifies whether the generator output is a real sample generated from HF model. To enhance the stability of GAN-MDF's training, we also introduce the supervised-loss trick to refine the generator weights during each iteration of the adversarial training. Compared with the state-of-the-art methods, the proposed GAN-MDF has the following advantages: 1) it performs well in the case of either nested or unnested sample structure; 2) there is no specific assumption on the data distribution; and 3) it has high robustness even when very few HF samples are provided. The experimental results also support the validity of GAN-MDF.
【5】 Progressive Open-Domain Response Generation with Multiple Controllable Attributes 标题:具有多个可控属性的渐进式开放域响应生成
作者:Haiqin Yang,Xiaoyuan Yao,Yiqun Duan,Jianping Shen,Jie Zhong,Kun Zhang 机构:Ping An Life Insurance Company of China, Carnegie Mellon University 备注:7 pages, 2 figures, 3 tables, in IJCAI'21 链接:https://arxiv.org/abs/2106.14614 摘要:在开放域对话系统中,为了增强生成响应的多样性,需要包含更多的可控属性。然而,现有的方法只能生成一个可控属性的响应,或者缺乏一种灵活的方法来生成多个可控属性的响应。在本文中,我们提出了一个逐步训练的分层编码器-解码器(PHED)来解决这个问题。更具体地说,PHED在Transformer上部署了条件变分自动编码器(CVAE),以便在一个阶段包含属性的一个方面。CVAE的一个重要特点是将每个阶段的潜在变量分为两类:一类是捕获共同语义特征的全局变量,另一类是吸收该阶段属性信息的特定变量。然后,PHED将CVAE潜变量与Transformer编码器耦合,并通过最小化新导出的ELBO和受控损耗来训练,以产生下一阶段的输入并根据需要产生响应。最后,我们进行了广泛的评估,以表明PHED显著优于最先进的神经生成模型,并产生更多样化的反应。 摘要:It is desirable to include more controllable attributes to enhance the diversity of generated responses in open-domain dialogue systems. However, existing methods can generate responses with only one controllable attribute or lack a flexible way to generate them with multiple controllable attributes. In this paper, we propose a Progressively trained Hierarchical Encoder-Decoder (PHED) to tackle this task. More specifically, PHED deploys Conditional Variational AutoEncoder (CVAE) on Transformer to include one aspect of attributes at one stage. A vital characteristic of the CVAE is to separate the latent variables at each stage into two types: a global variable capturing the common semantic features and a specific variable absorbing the attribute information at that stage. PHED then couples the CVAE latent variables with the Transformer encoder and is trained by minimizing a newly derived ELBO and controlled losses to produce the next stage's input and produce responses as required. Finally, we conduct extensive evaluations to show that PHED significantly outperforms the state-of-the-art neural generation models and produces more diverse responses as expected.
【6】 Non-Exhaustive Learning Using Gaussian Mixture Generative Adversarial Networks 标题:基于高斯混合生成对抗性网络的非穷举学习
作者:Jun Zhuang,Mohammad Al Hasan 机构:Indiana University-Purdue University Indianapolis, Indianapolis, IN, USA 备注:Accepted by ECML-PKDD 2021 链接:https://arxiv.org/abs/2106.14344 摘要:监督学习,虽然部署在现实生活中的场景,经常遇到未知类的实例。传统的训练有监督学习模型的算法没有提供检测实例的选项,因此它们以100%的概率漏检实例。开放集识别(OSR)和非穷举学习(NEL)是解决这一问题的有效方法。大多数现有的OSR方法首先对现有类的成员进行分类,然后识别新类的实例。然而,现有的OSR方法大多只做二元判定,即只识别未知类的存在。因此,这种方法无法区分属于增量不可见类的测试实例。另一方面,由于现实生活中的复杂数据集可能不遵循已知的数据分布,大多数的NEL方法往往对数据分布进行参数化假设,要么不能得到很好的结果。本文提出了一种新的在线非穷举学习模型,即非穷举高斯混合生成对抗网络(NE-GM-GAN)。我们提出的模型综合了基于高斯混合的潜在代表性的深层生成模型,如GAN,增量检测实例的新兴类的测试数据。在多个基准数据集上的大量实验结果表明,NE-GM-GAN在检测流数据中的新类实例方面明显优于现有的方法。 摘要:Supervised learning, while deployed in real-life scenarios, often encounters instances of unknown classes. Conventional algorithms for training a supervised learning model do not provide an option to detect such instances, so they miss-classify such instances with 100% probability. Open Set Recognition (OSR) and Non-Exhaustive Learning (NEL) are potential solutions to overcome this problem. Most existing methods of OSR first classify members of existing classes and then identify instances of new classes. However, many of the existing methods of OSR only makes a binary decision, i.e., they only identify the existence of the unknown class. Hence, such methods cannot distinguish test instances belonging to incremental unseen classes. On the other hand, the majority of NEL methods often make a parametric assumption over the data distribution, which either fail to return good results, due to the reason that real-life complex datasets may not follow a well-known data distribution. In this paper, we propose a new online non-exhaustive learning model, namely, Non-Exhaustive Gaussian Mixture Generative Adversarial Networks (NE-GM-GAN) to address these issues. Our proposed model synthesizes Gaussian mixture based latent representation over a deep generative model, such as GAN, for incremental detection of instances of emerging classes in the test data. Extensive experimental results on several benchmark datasets show that NE-GM-GAN significantly outperforms the state-of-the-art methods in detecting instances of novel classes in streaming data.
【7】 ASK: Adversarial Soft k-Nearest Neighbor Attack and Defense 标题:ASK:对抗性软k近邻攻击与防御
作者:Ren Wang,Tianqi Chen,Philip Yao,Sijia Liu,Indika Rajapakse,Alfred Hero 机构:University of Michigan, Michigan State University 链接:https://arxiv.org/abs/2106.14300 摘要:基于K近邻(kNN)的深度学习方法由于其简单性和几何可解释性而被广泛应用。然而,基于kNN的分类模型的鲁棒性还没有得到充分的研究,kNN攻击策略还不成熟。在本文中,我们提出了一种对抗性的软kNN(ASK)损失,以设计更有效的kNN攻击策略,并开发更好的防御。我们的ASK-loss方法有两个优点。首先,ASK-loss比以往提出的目标更能逼近kNN的分类错误概率。其次,ASK损失是可解释的:它保留了扰动输入和未扰动输入的kNN之间的互信息。我们利用ASK丢失产生了一种新的攻击方法ASK攻击(ASK-Atk),它比以往的kNN攻击具有更高的攻击效率和准确率。在ASK-Atk算法的基础上,我们提出了一种ASK-Def算法来优化ASK-Atk引起的最坏训练损失。 摘要:K-Nearest Neighbor (kNN)-based deep learning methods have been applied to many applications due to their simplicity and geometric interpretability. However, the robustness of kNN-based classification models has not been thoroughly explored and kNN attack strategies are underdeveloped. In this paper, we propose an Adversarial Soft kNN (ASK) loss to both design more effective kNN attack strategies and to develop better defenses against them. Our ASK loss approach has two advantages. First, ASK loss can better approximate the kNN's probability of classification error than objectives proposed in previous works. Second, the ASK loss is interpretable: it preserves the mutual information between the perturbed input and the kNN of the unperturbed input. We use the ASK loss to generate a novel attack method called the ASK-Attack (ASK-Atk), which shows superior attack efficiency and accuracy degradation relative to previous kNN attacks. Based on the ASK-Atk, we then derive an ASK-Defense (ASK-Def) method that optimizes the worst-case training loss induced by ASK-Atk.
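下面是一段示意代码(假设实现,具体形式可能与论文的 ASK 损失不同;温度 τ、k 与步长均为假设值),展示"用 softmax 加权的软 k 近邻概率近似 kNN 分类错误率,并据其梯度构造扰动"的基本思路:

# 极简示意:软 kNN 损失及一步 FGSM 式扰动
import torch
import torch.nn.functional as F

def soft_knn_loss(x, x_train, y_train, y_true, k=5, tau=0.1):
    d = torch.cdist(x.unsqueeze(0), x_train).squeeze(0)   # 到训练点的距离
    nn_d, nn_idx = torch.topk(-d, k)                       # 取 k 个最近邻(负距离最大)
    w = F.softmax(nn_d / tau, dim=0)                       # 距离越近权重越大
    p_true = w[y_train[nn_idx] == y_true].sum() + 1e-12    # 真类的"软投票"概率
    return -torch.log(p_true)

torch.manual_seed(0)
x_train = torch.randn(100, 8); y_train = torch.randint(0, 2, (100,))
x = torch.randn(8, requires_grad=True); y_true = torch.tensor(1)
loss = soft_knn_loss(x, x_train, y_train, y_true)
loss.backward()
x_adv = x + 0.1 * x.grad.sign()     # 沿损失上升方向扰动,使软 kNN 更易出错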
【8】 The Feasibility and Inevitability of Stealth Attacks 标题:论隐形攻击的可行性和必然性
作者:Ivan Y. Tyukin,Desmond J. Higham,Eliyas Woldegeorgis,Alexander N. Gorban 机构:University of Leicester, Leicester, LE,RH, UK, University of Edinburgh, Edinburgh, EH,FD, UK 链接:https://arxiv.org/abs/2106.13997 摘要:我们开发和研究了新的对抗性干扰,使攻击者能够控制包括深度学习神经网络在内的通用人工智能(AI)系统中的决策。与对抗性数据修改不同,我们这里考虑的攻击机制涉及对人工智能系统本身的修改。这样的秘密攻击可能是由软件开发团队中一个调皮、腐败或不满的成员实施的。它也可以由那些希望利用“人工智能民主化”议程的人来实现,在这个议程中,网络架构和经过训练的参数集是公开共享的。在[Tyukin等人,国际神经网络联合会议,2020]的工作基础上,我们开发了一系列新的可实施的攻击策略,并进行了相应的分析,表明隐形攻击很有可能是透明的,在攻击者未知的固定验证集上,系统性能保持不变,同时在感兴趣的触发器输入上引发任何所需的输出。攻击者只需要估计验证集的大小和AI相关潜在空间的分布。在深度学习神经网络的情况下,我们证明了单神经元攻击是可能的-修改了与单个神经元相关的权重和偏差-揭示了由过度参数化引起的脆弱性。我们在现实环境中阐述这些概念。在理论和计算结果的指导下,提出了防范隐身攻击的策略。 摘要:We develop and study new adversarial perturbations that enable an attacker to gain control over decisions in generic Artificial Intelligence (AI) systems including deep learning neural networks. In contrast to adversarial data modification, the attack mechanism we consider here involves alterations to the AI system itself. Such a stealth attack could be conducted by a mischievous, corrupt or disgruntled member of a software development team. It could also be made by those wishing to exploit a "democratization of AI" agenda, where network architectures and trained parameter sets are shared publicly. Building on work by [Tyukin et al., International Joint Conference on Neural Networks, 2020], we develop a range of new implementable attack strategies with accompanying analysis, showing that with high probability a stealth attack can be made transparent, in the sense that system performance is unchanged on a fixed validation set which is unknown to the attacker, while evoking any desired output on a trigger input of interest. The attacker only needs to have estimates of the size of the validation set and the spread of the AI's relevant latent space. In the case of deep learning neural networks, we show that a one neuron attack is possible - a modification to the weights and bias associated with a single neuron - revealing a vulnerability arising from over-parameterization. We illustrate these concepts in a realistic setting. Guided by the theory and computational results, we also propose strategies to guard against stealth attacks.
【9】 Discovering Generalizable Skills via Automated Generation of Diverse Tasks 标题:通过自动生成不同的任务来发现可概括的技能
作者:Kuan Fang,Yuke Zhu,Silvio Savarese,Li Fei-Fei 机构: Stanford University, UT Austin, Nvidia 备注:RSS 2021 链接:https://arxiv.org/abs/2106.13935 摘要:智能体的学习效率和泛化能力可以通过使用一组有用的技能得到很大的提高。然而,机器人技能的设计在现实世界的应用中往往是棘手的,因为它需要大量的努力和专业知识。在这项工作中,我们介绍了在多样化环境中的技能学习(SLIDE),一种通过自动生成一组不同的任务来发现可概括技能的方法。与以往在无监督下发现技能的工作不同,我们的方法鼓励技能在相同的环境中产生不同的结果,我们将每一项技能与一个可训练的任务生成器生成的唯一任务配对。为了鼓励归纳技能的出现,我们的方法训练每一项技能,使之专门化成对的任务,并最大限度地提高生成任务的多样性。根据机器人在生成的任务中的行为,联合训练一个任务鉴别器来估计多样性目标的证据下界。所学的技能,然后可以组成一个分层强化学习算法来解决看不见的目标任务。实验结果表明,该方法能有效地学习两个桌面操作领域的机器人技能。结果表明,与现有的强化学习和技能学习方法相比,所学习的技能能够有效地提高机器人在各种未知目标任务中的性能。 摘要:The learning efficiency and generalization ability of an intelligent agent can be greatly improved by utilizing a useful set of skills. However, the design of robot skills can often be intractable in real-world applications due to the prohibitive amount of effort and expertise that it requires. In this work, we introduce Skill Learning In Diversified Environments (SLIDE), a method to discover generalizable skills via automated generation of a diverse set of tasks. As opposed to prior work on unsupervised discovery of skills which incentivizes the skills to produce different outcomes in the same environment, our method pairs each skill with a unique task produced by a trainable task generator. To encourage generalizable skills to emerge, our method trains each skill to specialize in the paired task and maximizes the diversity of the generated tasks. A task discriminator defined on the robot behaviors in the generated tasks is jointly trained to estimate the evidence lower bound of the diversity objective. The learned skills can then be composed in a hierarchical reinforcement learning algorithm to solve unseen target tasks. We demonstrate that the proposed method can effectively learn a variety of robot skills in two tabletop manipulation domains. Our results suggest that the learned skills can effectively improve the robot's performance in various unseen target tasks compared to existing reinforcement learning and skill learning methods.
【10】 Transflower: probabilistic autoregressive dance generation with multimodal attention 标题:Transflower:具有多模态注意力的概率自回归舞蹈生成
作者:Guillermo Valle-Pérez,Gustav Eje Henter,Jonas Beskow,André Holzapfel,Pierre-Yves Oudeyer,Simon Alexanderson 机构: KTH Royal Institute of Technology 链接:https://arxiv.org/abs/2106.13871 摘要:舞蹈需要复杂动作的巧妙组合,这些动作遵循音乐的节奏、音调和音色特征。从形式上讲,生成以音乐为条件的舞蹈可以表示为以音频信号为条件的高维连续运动信号的建模问题。在这项工作中,我们为解决这个问题作出了两项贡献。首先,我们提出了一种新的概率自回归结构,该结构使用多模态变换器编码器,通过基于先前姿势和音乐背景的规范化流对未来姿势的分布进行建模。其次,我们介绍目前最大的三维舞蹈动作数据集,通过各种动作捕捉技术获得,包括专业和休闲舞者。利用这个数据集,我们通过客观指标和用户研究,将我们的新模型与两个基线进行比较,结果表明,建立概率分布模型的能力,以及能够在大运动和音乐环境中参与,都是产生与音乐相匹配的有趣、多样和真实的舞蹈所必需的。 摘要:Dance requires skillful composition of complex movements that follow rhythmic, tonal and timbral features of music. Formally, generating dance conditioned on a piece of music can be expressed as a problem of modelling a high-dimensional continuous motion signal, conditioned on an audio signal. In this work we make two contributions to tackle this problem. First, we present a novel probabilistic autoregressive architecture that models the distribution over future poses with a normalizing flow conditioned on previous poses as well as music context, using a multimodal transformer encoder. Second, we introduce the currently largest 3D dance-motion dataset, obtained with a variety of motion-capture technologies, and including both professional and casual dancers. Using this dataset, we compare our new model against two baselines, via objective metrics and a user study, and show that both the ability to model a probability distribution, as well as being able to attend over a large motion and music context are necessary to produce interesting, diverse, and realistic dance that matches the music.
半/弱/无/有监督|不确定性|主动学习(9篇)
【1】 A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning 标题:一种理论驱动的对比表征学习自标注求精方法
作者:Pan Zhou,Caiming Xiong,Xiao-Tong Yuan,Steven Hoi 机构:∗ Salesforce Research, † Nanjing University of Information Science & Technology 备注:under review. arXiv admin note: substantial text overlap with arXiv:1903.11680 by other authors 链接:https://arxiv.org/abs/2106.14749 摘要:对于图像查询,无监督对比学习将同一图像的裁剪标记为正片,将其他图像的裁剪标记为负片。尽管这样的本地标签分配策略很直观,但它不能揭示查询及其正负项之间潜在的语义相似性,并且会降低性能,因为某些负项在语义上与查询相似,甚至与查询共享同一语义类。在这项工作中,我们首先证明了在对比学习中,不准确的标签分配严重影响了语义实例识别的泛化,而准确的标签则有利于语义实例识别的泛化。受这一理论的启发,我们提出了一种新的对比学习自标记优化方法。它通过两个互补的模块来提高标签质量:(i)自标记精炼(SLR)来生成准确的标签;(ii)动量混合(MM)来增强查询与其正查询之间的相似性。SLR使用查询的一个正数来估计查询与其正数和负数之间的语义相似度,并将估计的相似度与对比学习中的香草标签分配相结合,以迭代方式生成更准确、信息更丰富的软标签。从理论上证明了SLR能够准确地恢复标签损坏数据的真实语义标签,并监督网络实现分类任务的零预测误差。MM将查询和阳性信息随机组合,以增加生成的虚拟查询和它们的阳性信息之间的语义相似性,从而提高标签的准确性。在CIFAR10、ImageNet、VOC和COCO上的实验结果表明了该方法的有效性。PyTorch代码和模型将在网上发布。 摘要:For an image query, unsupervised contrastive learning labels crops of the same image as positives, and other image crops as negatives. Although intuitive, such a native label assignment strategy cannot reveal the underlying semantic similarity between a query and its positives and negatives, and impairs performance, since some negatives are semantically similar to the query or even share the same semantic class as the query. In this work, we first prove that for contrastive learning, inaccurate label assignment heavily impairs its generalization for semantic instance discrimination, while accurate labels benefit its generalization. Inspired by this theory, we propose a novel self-labeling refinement approach for contrastive learning. It improves the label quality via two complementary modules: (i) self-labeling refinery (SLR) to generate accurate labels and (ii) momentum mixup (MM) to enhance similarity between query and its positive. SLR uses a positive of a query to estimate semantic similarity between a query and its positive and negatives, and combines estimated similarity with vanilla label assignment in contrastive learning to iteratively generate more accurate and informative soft labels. We theoretically show that our SLR can exactly recover the true semantic labels of label-corrupted data, and supervises networks to achieve zero prediction error on classification tasks. MM randomly combines queries and positives to increase semantic similarity between the generated virtual queries and their positives so as to improves label accuracy. Experimental results on CIFAR10, ImageNet, VOC and COCO show the effectiveness of our method. PyTorch code and model will be released online.
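下面附一段极简示意(假设实现,与论文 SLR/MM 的具体形式可能不同;β、温度等均为假设值),展示"用正样本相似度修正 one-hot 对比标签得到软标签"与"动量混合生成虚拟查询"两个组件的形态:

# 极简示意:自标记精炼(SLR)式的软标签 + 动量混合(MM)
import torch
import torch.nn.functional as F

def refined_soft_label(z_pos, dictionary, tau=0.2, beta=0.5):
    # z_pos: 查询的正样本表示 (d,);dictionary: 候选键 (N, d),约定第 0 个为正样本
    sim = F.softmax(dictionary @ z_pos / tau, dim=0)   # 估计的语义相似度分布
    onehot = torch.zeros_like(sim); onehot[0] = 1.0    # 原始的对比学习标签
    return beta * onehot + (1 - beta) * sim            # 更准确、信息更丰富的软标签

def momentum_mixup(x_q, x_pos, lam=0.8):
    return lam * x_q + (1 - lam) * x_pos               # 虚拟查询,与正样本更相似

z_pos = F.normalize(torch.randn(128), dim=0)
dictionary = F.normalize(torch.randn(8, 128), dim=1)
print(refined_soft_label(z_pos, dictionary))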
【2】 Improving Uncertainty Calibration of Deep Neural Networks via Truth Discovery and Geometric Optimization 标题:基于真值发现和几何优化的深度神经网络不确定性校正
作者:Chunwei Ma,Ziyun Huang,Jiayi Xian,Mingchen Gao,Jinhui Xu 机构:Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY, USA, Computer Science and Software Engineering, Penn State Erie, Erie, PA, USA 备注:37th Conference on Uncertainty in Artificial Intelligence (UAI 2021) 链接:https://arxiv.org/abs/2106.14662 摘要:深度神经网络(Deep Neural Networks,DNNs)尽管近年来取得了巨大的成功,但由于其学习过程中固有的不确定性,仍然可能对其预测产生怀疑。集成技术和事后校准是两种在提高DNNs不确定度校准方面有前途的方法。然而,这两种方法的协同效应还没有得到很好的探讨。在这篇论文中,我们提出了一个真理发现框架来整合基于集成和事后校准的方法。利用集合候选者的几何方差作为样本不确定性的一个良好指标,我们设计了一个可证明无精度下降的保精度真值估计器。此外,我们还证明了通过真理发现正则化优化可以增强事后校准。在包括CIFAR和ImageNet在内的大规模数据集上,我们的方法在基于直方图和基于核密度的评价指标上都比最新的校正方法有了一致的改进。我们的代码在https://github.com/horsepurve/truly-uncertain. 摘要:Deep Neural Networks (DNNs), despite their tremendous success in recent years, could still cast doubts on their predictions due to the intrinsic uncertainty associated with their learning process. Ensemble techniques and post-hoc calibrations are two types of approaches that have individually shown promise in improving the uncertainty calibration of DNNs. However, the synergistic effect of the two types of methods has not been well explored. In this paper, we propose a truth discovery framework to integrate ensemble-based and post-hoc calibration methods. Using the geometric variance of the ensemble candidates as a good indicator for sample uncertainty, we design an accuracy-preserving truth estimator with provably no accuracy drop. Furthermore, we show that post-hoc calibration can also be enhanced by truth discovery-regularized optimization. On large-scale datasets including CIFAR and ImageNet, our method shows consistent improvement against state-of-the-art calibration approaches on both histogram-based and kernel density-based evaluation metrics. Our codes are available at https://github.com/horsepurve/truly-uncertain.
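下面给出一段示意代码(非论文实现,省略了真值发现正则化,数据为随机生成的假设值),仅演示其中两个基本组件:用集成成员输出间的方差作为样本不确定性指标,以及在验证集上拟合温度做事后校准:

# 极简示意:集成方差作为不确定性 + 温度缩放事后校准
import torch
import torch.nn.functional as F

def ensemble_uncertainty(logits_list):
    probs = torch.stack([F.softmax(l, dim=-1) for l in logits_list])   # (M, N, C)
    return probs.var(dim=0).sum(dim=-1), probs.mean(dim=0)             # 方差大 → 不确定性高

def fit_temperature(logits, labels, lr=0.01, iters=200):
    T = torch.ones(1, requires_grad=True)
    opt = torch.optim.LBFGS([T], lr=lr, max_iter=iters)
    def closure():
        opt.zero_grad()
        loss = F.cross_entropy(logits / T.clamp(min=1e-3), labels)
        loss.backward()
        return loss
    opt.step(closure)
    return T.detach()

logits_list = [torch.randn(5, 4) for _ in range(3)]    # 假设 3 个成员、5 个验证样本、4 类
labels = torch.randint(0, 4, (5,))
unc, mean_prob = ensemble_uncertainty(logits_list)
T = fit_temperature(torch.log(mean_prob), labels)
print(unc, T)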
【3】 Unsupervised Continual Learning via Self-Adaptive Deep Clustering Approach 标题:基于自适应深度聚类的无监督连续学习
作者:Mahardhika Pratama,Andri Ashfahani,Edwin Lughofer 机构:SCSE, Nanyang Technological University, Singapore, DKBMS, Johanes Kepler University, Linz, Austria 备注:currently under review 链接:https://arxiv.org/abs/2106.14563 摘要:在现有文献中,无监督持续学习仍然是一个相对未被探索的领域,因为现有的绝大多数工作都要求不受限制地获取真实标签,这会产生昂贵的标注成本。另一个问题在于任务边界和任务ID必须在模型更新或模型预测时已知,这妨碍了实时部署的可行性。本文提出了自适应深度持续学习器中的知识保持方法(KIERA)。KIERA是从灵活的深度聚类方法的概念发展而来的,它具有弹性的网络结构,能够及时地应对不断变化的环境。为了克服灾难性遗忘问题,提出了基于质心的经验回放方法。KIERA不利用任何标记样本进行模型更新,同时具有任务无关的优点。KIERA的优势已经在流行的持续学习问题中得到了数值验证,与最先进的方法相比,KIERA具有很强的竞争力。我们的实现见 https://github.com/ContinualAL/KIERA 。 摘要:Unsupervised continual learning remains a relatively uncharted territory in the existing literature because the vast majority of existing works call for unlimited access of ground truth incurring expensive labelling cost. Another issue lies in the problem of task boundaries and task IDs which must be known for model's updates or model's predictions hindering feasibility for real-time deployment. Knowledge Retention in Self-Adaptive Deep Continual Learner, (KIERA), is proposed in this paper. KIERA is developed from the notion of flexible deep clustering approach possessing an elastic network structure to cope with changing environments in the timely manner. The centroid-based experience replay is put forward to overcome the catastrophic forgetting problem. KIERA does not exploit any labelled samples for model updates while featuring a task-agnostic merit. The advantage of KIERA has been numerically validated in popular continual learning problems where it shows highly competitive performance compared to state-of-the-art approaches. Our implementation is available at https://github.com/ContinualAL/KIERA.
【4】 Unsupervised Skill Discovery with Bottleneck Option Learning 标题:具有瓶颈选项学习的无监督技能发现
作者:Jaekyeom Kim,Seohong Park,Gunhee Kim 机构:Equal contribution 1Department of Computer Science andEngineering, Seoul National University 备注:Accepted to ICML 2021. Code at this https URL 链接:https://arxiv.org/abs/2106.14305 摘要:像人类一样,在没有任何外部奖励或监督的情况下,从环境中获得固有技能的能力是一个重要的问题。提出了一种新的无监督技能发现方法&信息瓶颈选择学习(IBOL)。除了环境的线性化可以促进更多不同的和遥远的状态转换之外,IBOL还可以发现不同的技能。它提供了抽象的技能学习与信息瓶颈框架的选择与改进的稳定性和鼓励解开。我们的经验证明,IBOL在MuJoCo环境中,包括Ant、HalfCheetah、Hopper和D'Kitty,在信息论评估和下游任务上优于多种最先进的无监督技能发现方法。 摘要:Having the ability to acquire inherent skills from environments without any external rewards or supervision like humans is an important problem. We propose a novel unsupervised skill discovery method named Information Bottleneck Option Learning (IBOL). On top of the linearization of environments that promotes more various and distant state transitions, IBOL enables the discovery of diverse skills. It provides the abstraction of the skills learned with the information bottleneck framework for the options with improved stability and encouraged disentanglement. We empirically demonstrate that IBOL outperforms multiple state-of-the-art unsupervised skill discovery methods on the information-theoretic evaluations and downstream tasks in MuJoCo environments, including Ant, HalfCheetah, Hopper and D'Kitty.
【5】 Improving Sequential Recommendation Consistency with Self-Supervised Imitation 标题:利用自监督模仿提高序列推荐一致性
作者:Xu Yuan,Hongshen Chen,Yonghao Song,Xiaofang Zhao,Zhuoye Ding,Zhen He,Bo Long 机构:Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China, University of Chinese Academy of Sciences, Beijing, China 链接:https://arxiv.org/abs/2106.14031 摘要:大多数顺序推荐模型捕获用户项交互历史中连续项的特征。虽然有效,但稀疏的学习信号仍然阻碍了它们的表征表达能力。因此,顺序推荐者很容易做出不一致的预测。在本文中,我们提出了一个模型SSI,利用自监督模仿来改善顺序推荐的一致性。准确地说,我们利用三个自我监督的预训练任务来提取一致性知识,其中时间一致性和人物角色一致性分别根据时间顺序和人物角色敏感度来捕获用户交互动态。此外,通过最大化全局和局部交互序列之间的互信息,引入全局会话一致性,为模型提供全局视角。最后,为了综合利用一致性增强知识的三个独立方面,我们建立了一个完整的模仿学习框架。通过模仿传统的预测logit和一致性增强的项目表示,有效地将一致性知识内化并传递到学生模型中。此外,灵活的自我监督模仿框架也可以使其他学生推荐者受益。在四个真实数据集上的实验表明,SSI算法的性能优于现有的序列推荐算法。 摘要:Most sequential recommendation models capture the features of consecutive items in a user-item interaction history. Though effective, their representation expressiveness is still hindered by the sparse learning signals. As a result, the sequential recommender is prone to make inconsistent predictions. In this paper, we propose a model, SSI, to improve sequential recommendation consistency with Self-Supervised Imitation. Precisely, we extract the consistency knowledge by utilizing three self-supervised pre-training tasks, where temporal consistency and persona consistency capture user-interaction dynamics in terms of the chronological order and persona sensitivities, respectively. Furthermore, to provide the model with a global perspective, global session consistency is introduced by maximizing the mutual information among global and local interaction sequences. Finally, to comprehensively take advantage of all three independent aspects of consistency-enhanced knowledge, we establish an integrated imitation learning framework. The consistency knowledge is effectively internalized and transferred to the student model by imitating the conventional prediction logit as well as the consistency-enhanced item representations. In addition, the flexible self-supervised imitation framework can also benefit other student recommenders. Experiments on four real-world datasets show that SSI effectively outperforms the state-of-the-art sequential recommendation methods.
【6】 Semi-Supervised Deep Ensembles for Blind Image Quality Assessment 标题:基于半监督深度集成的盲图像质量评价
作者:Zhihua Wang,Dingquan Li,Kede Ma 机构:Department of Computer Science, City University of Hong Kong, Peng Cheng Laboratory 备注:6 pages, 1 figure, 5 tables 链接:https://arxiv.org/abs/2106.14008 摘要:如果基础学习者被认为是“精确的”和“多样的”,集成方法通常被认为比单一模型更好。在这里,我们研究了一种半监督集成学习策略来产生可推广的盲图像质量评估模型。我们训练了一个用于质量预测的多头卷积网络,通过最大化集合(以及基础学习者)对标记数据的准确性,以及它们之间对未标记数据的不一致性(即多样性),两者都通过保真度损失来实现。我们进行了大量的实验来证明使用未标记数据进行BIQA的优势,特别是在模型泛化和故障识别方面。 摘要:Ensemble methods are generally regarded to be better than a single model if the base learners are deemed to be "accurate" and "diverse." Here we investigate a semi-supervised ensemble learning strategy to produce generalizable blind image quality assessment models. We train a multi-head convolutional network for quality prediction by maximizing the accuracy of the ensemble (as well as the base learners) on labeled data, and the disagreement (i.e., diversity) among them on unlabeled data, both implemented by the fidelity loss. We conduct extensive experiments to demonstrate the advantages of employing unlabeled data for BIQA, especially in model generalization and failure identification.
【7】 Intrinsically Motivated Self-supervised Learning in Reinforcement Learning 标题:强化学习中的内在激励自监督学习
作者:Yue Zhao,Chenzhuang Du,Hang Zhao,Tiejun Li 机构:Peking University, Tsinghua University, Shanghai Qi Zhi Institute 链接:https://arxiv.org/abs/2106.13970 摘要:在基于视觉的强化学习(RL)任务中,为了获得更多的语义表示和提高样本效率,通常采用一种具有代理自监督损失的辅助任务分配方法。然而,由于表示学习部分和决策部分是分离的,自监督辅助任务中的大量信息被忽略了。为了充分利用辅助任务中的信息,我们提出了一种简单而有效的方法,将自我监督损失作为内在奖励,称为强化学习中的内在动机自我监督学习(IM-SSR)。形式化地证明了自监督损失可以分解为新状态的探索和干扰消除的鲁棒性改进。IM-SSR可以毫不费力地插入任何强化学习与自我监督辅助目标几乎没有额外的费用。与IM-SSR相结合,以前的算法在基于视觉的机器人任务中,特别是在奖励信号稀疏的情况下,在样本效率和泛化能力上都有显著的提高。 摘要:In vision-based reinforcement learning (RL) tasks, it is prevalent to assign the auxiliary task with a surrogate self-supervised loss so as to obtain more semantic representations and improve sample efficiency. However, abundant information in self-supervised auxiliary tasks has been disregarded, since the representation learning part and the decision-making part are separated. To sufficiently utilize information in the auxiliary task, we present a simple yet effective idea to employ self-supervised loss as an intrinsic reward, called Intrinsically Motivated Self-Supervised learning in Reinforcement learning (IM-SSR). We formally show that the self-supervised loss can be decomposed as exploration for novel states and robustness improvement from nuisance elimination. IM-SSR can be effortlessly plugged into any reinforcement learning with self-supervised auxiliary objectives with nearly no additional cost. Combined with IM-SSR, the previous underlying algorithms achieve salient improvements on both sample efficiency and generalization in various vision-based robotics tasks from the DeepMind Control Suite, especially when the reward signal is sparse.
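下面是一段极简示意(假设实现;辅助损失的形式与系数 β 仅作举例),展示"把自监督辅助损失直接作为内在奖励加到环境奖励上"的核心想法:

# 极简示意:自监督一致性损失越大 → 状态越"新" → 内在奖励越高
import torch
import torch.nn.functional as F

def ssl_intrinsic_reward(encoder, obs_aug1, obs_aug2, beta=0.1):
    z1 = F.normalize(encoder(obs_aug1), dim=-1)
    z2 = F.normalize(encoder(obs_aug2), dim=-1)
    ssl_loss = 2 - 2 * (z1 * z2).sum(dim=-1)   # 类似 BYOL/SimSiam 的一致性损失
    return beta * ssl_loss.detach()            # 作为内在奖励与环境奖励相加

encoder = torch.nn.Linear(16, 32)              # 假设的观测编码器
obs = torch.randn(4, 16)
aug1, aug2 = obs + 0.1 * torch.randn_like(obs), obs + 0.1 * torch.randn_like(obs)  # 以噪声代替图像增广
total_reward = torch.zeros(4) + ssl_intrinsic_reward(encoder, aug1, aug2)          # 环境奖励此处设为 0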
【8】 Midpoint Regularization: from High Uncertainty Training to Conservative Classification 标题:中点正则化:从高不确定性训练到保守分类
作者:Hongyu Guo 机构:National Research Council Canada, Montreal Road, Ottawa, Ontario 备注:Accepted to ECML-PKDD 2021. arXiv admin note: substantial text overlap with arXiv:2012.01559 链接:https://arxiv.org/abs/2106.13913 摘要:标签平滑(LS)通过惩罚产生过度自信输出分布的模型,提高了模型的泛化能力。对于每个训练样本,LS策略通过将一部分概率质量分配到非真实类上,来平滑one-hot编码的训练信号。我们通过考虑样本对来扩展这一技术,称为PLS。PLS首先通过平均随机样本对来创建中点样本,然后在训练过程中为每个中点样本学习一个平滑分布,从而得到带有高不确定性标签的中点用于训练。实验表明,PLS显著优于LS,实现了高达30%的相对分类误差降低。我们还通过可视化表明,无论是分布内还是分布外样本,PLS都会给出非常低的获胜(最大)softmax分数。 摘要:Label Smoothing (LS) improves model generalization through penalizing models from generating overconfident output distributions. For each training sample the LS strategy smooths the one-hot encoded training signal by distributing its distribution mass over the non-ground truth classes. We extend this technique by considering example pairs, coined PLS. PLS first creates midpoint samples by averaging random sample pairs and then learns a smoothing distribution during training for each of these midpoint samples, resulting in midpoints with high uncertainty labels for training. We empirically show that PLS significantly outperforms LS, achieving up to 30% of relative classification error reduction. We also visualize that PLS produces very low winning softmax scores for both in and out of distribution samples.
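下面附一段示意代码(假设实现;论文中中点样本的平滑分布是在训练中学习得到的,此处简化为样本对软标签的平均),对比经典 LS 与 PLS 式的中点样本构造:

# 极简示意:标签平滑 + 由随机样本对的中点构造高不确定性训练信号
import torch
import torch.nn.functional as F

def label_smoothing(onehot, eps=0.1):
    K = onehot.size(-1)
    return onehot * (1 - eps) + eps / K                 # 经典 LS

def midpoint_batch(x, y_onehot):
    perm = torch.randperm(x.size(0))
    x_mid = 0.5 * (x + x[perm])                         # 随机样本对的中点样本
    y_mid = 0.5 * (y_onehot + y_onehot[perm])           # 对应的高不确定性软标签
    return x_mid, y_mid

def pls_like_loss(model, x, y_onehot):
    x_mid, y_mid = midpoint_batch(x, y_onehot)
    logp = F.log_softmax(model(torch.cat([x, x_mid])), dim=-1)
    target = torch.cat([label_smoothing(y_onehot), y_mid])
    return -(target * logp).sum(dim=-1).mean()          # 软标签交叉熵

model = torch.nn.Linear(10, 5)
x = torch.randn(4, 10)
y = F.one_hot(torch.randint(0, 5, (4,)), num_classes=5).float()
print(pls_like_loss(model, x, y))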
【9】 Scene Uncertainty and the Wellington Posterior of Deterministic Image Classifiers 标题:场景不确定性与确定性图像分类器的惠灵顿后验
作者:Stephanie Tsuei,Aditya Golatkar,Stefano Soatto 机构:Department of Computer Science, UCLA, Los Angeles, CA 链接:https://arxiv.org/abs/2106.13870 摘要:提出了一种在给定输入数据上估计图像分类器结果不确定性的方法。通常用于图像分类的深度神经网络是从输入图像到输出类别的确定性映射。因此,它们在给定数据上的结果本身不涉及不确定性,所以我们在定义、度量和解释"置信度"时,必须明确指出所指的可变性是什么。为此,我们引入了惠灵顿后验(Wellington Posterior),即由产生给定图像的同一场景可能生成的其他数据所得到的结果分布。由于有无限多的场景可以生成给定的图像,惠灵顿后验需要从所描绘场景之外的其他场景进行归纳。我们探讨了使用数据增强、模型集成和模型线性化的替代方法。其他的替代方案包括生成对抗网络、条件先验网络和有监督的单视图重建。我们用通过推断视频中时间相邻帧类别得到的经验后验来检验这些替代方法。这些进展只是朝着以与安全关键应用兼容的方式评估深度网络分类器可靠性迈出的一小步。 摘要:We propose a method to estimate the uncertainty of the outcome of an image classifier on a given input datum. Deep neural networks commonly used for image classification are deterministic maps from an input image to an output class. As such, their outcome on a given datum involves no uncertainty, so we must specify what variability we are referring to when defining, measuring and interpreting "confidence." To this end, we introduce the Wellington Posterior, which is the distribution of outcomes that would have been obtained in response to data that could have been generated by the same scene that produced the given image. Since there are infinitely many scenes that could have generated the given image, the Wellington Posterior requires induction from scenes other than the one portrayed. We explore alternate methods using data augmentation, ensembling, and model linearization. Additional alternatives include generative adversarial networks, conditional prior networks, and supervised single-view reconstruction. We test these alternatives against the empirical posterior obtained by inferring the class of temporally adjacent frames in a video. These developments are only a small step towards assessing the reliability of deep network classifiers in a manner that is compatible with safety-critical applications.
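下面是一段示意代码(假设用 torchvision 的随机增广来近似"同一场景可能生成的其他图像",增广种类与采样次数均为假设值),演示如何经验地估计惠灵顿后验:

# 极简示意:对同一输入做 n 次随机增广,统计分类器输出类别的经验分布
import torch
import torchvision.transforms as T

def wellington_posterior(model, image, n=32, num_classes=10):
    aug = T.Compose([T.RandomResizedCrop(image.shape[-1], scale=(0.8, 1.0)),
                     T.ColorJitter(0.2, 0.2, 0.2)])
    counts = torch.zeros(num_classes)
    with torch.no_grad():
        for _ in range(n):
            pred = model(aug(image).unsqueeze(0)).argmax(dim=-1)
            counts[pred] += 1
    return counts / n        # 各类别的经验概率,即对结果可变性的一个估计

# 用法示意:posterior = wellington_posterior(classifier, img_tensor, n=64)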
迁移|Zero/Few/One-Shot|自适应(6篇)
【1】 Zero-shot learning approach to adaptive Cybersecurity using Explainable AI 标题:基于可解释人工智能的自适应网络安全Zero-Shot学习方法
作者:Dattaraj Rao,Shraddha Mane 备注:arXiv admin note: substantial text overlap with arXiv:2103.07110 链接:https://arxiv.org/abs/2106.14647 摘要:网络安全是一个攻击模式不断变化的领域,我们需要各种方法使我们的网络安全系统更能适应新的攻击,并对其归类以采取适当的行动。我们提出了一种新的方法来处理网络安全系统(如安全信息和事件管理(SIEM)和入侵检测(IDS))面临的警报泛滥问题。我们通过利用对ML模型异常预测所生成的解释,将零样本学习方法应用于机器学习(ML)。这种方法在自动检测SIEM中生成的告警标签并将其与特定的攻击类型相关联方面具有巨大的潜力。在这种方法中,在没有任何攻击先验知识的情况下,我们尝试识别它,破译有助于分类的特征,并尝试使用可解释人工智能将攻击归入某个特定类别。这些解释为我们提供了一些可衡量的因素,比如哪些特征会影响对网络攻击的预测,以及影响程度如何。这些基于博弈论的解释被用来根据特定特征对特定预测的影响为其分配贡献度。利用这种贡献度分配,我们提出了一种基于特征影响、将新型攻击归入特定新类别的零样本方法。结果表明,该系统能够很好地将攻击流量与正常流量分离,并根据导致攻击的特征自动生成攻击标签。这些自动生成的标签可以呈现给SIEM分析人员,并且足够直观,可以判断攻击的性质。我们将此方法应用于网络流数据集,并演示特定攻击类型(如IP扫描、拒绝服务、远程到本地等)的结果。该论文在2021年6月IIT Madras召开的第一届可部署AI会议上发表。 摘要:Cybersecurity is a domain where there is constant change in patterns of attack, and we need ways to make our Cybersecurity systems more adaptive to handle new attacks and categorize for appropriate action. We present a novel approach to handle the alarm flooding problem faced by Cybersecurity systems like security information and event management (SIEM) and intrusion detection (IDS). We apply a zero-shot learning method to machine learning (ML) by leveraging explanations for predictions of anomalies generated by a ML model. This approach has huge potential to auto detect alarm labels generated in SIEM and associate them with specific attack types. In this approach, without any prior knowledge of attack, we try to identify it, decipher the features that contribute to classification and try to bucketize the attack in a specific category - using explainable AI. Explanations give us measurable factors as to what features influence the prediction of a cyber-attack and to what degree. These explanations generated based on game-theory are used to allocate credit to specific features based on their influence on a specific prediction. Using this allocation of credit, we propose a novel zero-shot approach to categorize novel attacks into specific new classes based on feature influence. The resulting system demonstrated will get good at separating attack traffic from normal flow and auto-generate a label for attacks based on features that contribute to the attack. These auto-generated labels can be presented to SIEM analyst and are intuitive enough to figure out the nature of attack. We apply this approach to a network flow dataset and demonstrate results for specific attack types like ip sweep, denial of service, remote to local, etc. Paper was presented at the first Conference on Deployable AI at IIT-Madras in June 2021.
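下面给出一段示意代码(特征名、归类规则与数据均为演示用的假设;shap 不同版本的返回形状略有差异,故做了通用化处理),展示"用博弈论归因(SHAP)解释异常预测,并按主导特征给攻击归类打标签"的思路:

# 极简示意:SHAP 归因 → 按贡献最大的特征给告警归类
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

feature_names = ["dst_port_count", "bytes_per_sec", "failed_logins", "duration"]   # 假设的流量特征
rng = np.random.default_rng(0)
X = rng.random((500, 4)); y = (X[:, 0] + X[:, 2] > 1.2).astype(int)                # 合成数据
clf = RandomForestClassifier(random_state=0).fit(X, y)

explainer = shap.TreeExplainer(clf)
sv = np.array(explainer.shap_values(X[:1]))                          # 对一条被判为异常的样本做归因
contrib = np.abs(sv).reshape(-1, len(feature_names)).sum(axis=0)     # 聚合为逐特征贡献度
top = [feature_names[i] for i in np.argsort(contrib)[::-1][:2]]
label = ("port-scan-like" if "dst_port_count" in top
         else "brute-force-like" if "failed_logins" in top else "unknown")
print(top, label)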
【2】 Domain Adaptation Broad Learning System Based on Locally Linear Embedding 标题:基于局部线性嵌入的领域自适应宽度学习系统
作者:Chao Yuan,Chang-E Ren 链接:https://arxiv.org/abs/2106.14367 摘要:宽度学习系统(BLS)提出已有数年,在许多分类和回归问题上展示了有效的学习能力。然而,BLS及其改进版本主要用于处理单个领域内的无监督、有监督和半监督学习问题。据我们所知,BLS的跨领域学习能力很少受到关注。为此,我们将BLS引入迁移学习领域,提出了一种基于局部线性嵌入的域自适应宽度学习系统(DABLS-LLE)算法。该算法利用目标域的一小部分标记数据和源域的所有标记数据学习鲁棒的分类模型,并继承了BLS的计算效率和学习能力。在基准数据集Office-Caltech-10上的实验验证了该方法的有效性。实验结果表明,与许多现有的迁移学习方法相比,该方法能以更少的运行时间获得更好的分类精度。这表明我们的方法可以为BLS带来新的优势。 摘要:Broad learning system (BLS) has been proposed for a few years. It demonstrates an effective learning capability for many classification and regression problems. However, BLS and its improved versions are mainly used to deal with unsupervised, supervised and semi-supervised learning problems in a single domain. As far as we know, little attention has been paid to the cross-domain learning ability of BLS. Therefore, we introduce BLS into the field of transfer learning and propose a novel algorithm called domain adaptation broad learning system based on locally linear embedding (DABLS-LLE). The proposed algorithm can learn a robust classification model by using a small part of labeled data from the target domain and all labeled data from the source domain. The proposed algorithm inherits the computational efficiency and learning capability of BLS. Experiments on benchmark dataset (Office-Caltech-10) verify the effectiveness of our approach. The results show that our approach can get better classification accuracy with less running time than many existing transfer learning approaches. It shows that our approach can bring a new superiority for BLS.
【3】 Transfer-based adaptive tree for multimodal sentiment analysis based on user latent aspects 标题:基于迁移的自适应树:基于用户潜在方面的多模态情感分析
作者:Sana Rahmani,Saeid Hosseini,Raziyeh Zall,Mohammad Reza Kangavari,Sara Kamran,Wen Hua 备注:Under Review on IEEE Transactions on Pattern Analysis and Machine Intelligence 链接:https://arxiv.org/abs/2106.14174 摘要:多模态情感分析有利于各种应用,如人机交互和推荐系统。它的目的是利用视觉、文本和声音信号来推断用户的两极想法。虽然研究人员肯定了认知线索和情绪表现之间的联系,但目前情绪分析中的大多数多模态方法都忽略了用户特定的方面。为了解决这个问题,我们设计了一种新的方法来进行多模态情绪预测使用认知线索,如个性。该框架通过分层划分用户构造自适应树,训练基于LSTM的子模型,利用基于注意的融合在树内传递面向认知的知识。随后,该框架使用自适应树中的结论性凝聚知识来预测最终情感。我们还设计了一种动态丢包方法,以方便相邻节点之间的数据共享,减少数据稀疏性。在真实数据集上的实证结果表明,我们提出的情绪预测模型能够超越趋势对手。此外,与其他集成方法相比,该算法能更好地利用潜在的认知线索,提高预测结果。基于给定的外部和内部分析结果,我们注意到与其他基于理论的方法相比,所提出的层次聚类方法可以更好地在自适应树中对用户进行分组。 摘要:Multimodal sentiment analysis benefits various applications such as human-computer interaction and recommendation systems. It aims to infer the users' bipolar ideas using visual, textual, and acoustic signals. Although researchers affirm the association between cognitive cues and emotional manifestations, most of the current multimodal approaches in sentiment analysis disregard user-specific aspects. To tackle this issue, we devise a novel method to perform multimodal sentiment prediction using cognitive cues, such as personality. Our framework constructs an adaptive tree by hierarchically dividing users and trains the LSTM-based submodels, utilizing an attention-based fusion to transfer cognitive-oriented knowledge within the tree. Subsequently, the framework consumes the conclusive agglomerative knowledge from the adaptive tree to predict final sentiments. We also devise a dynamic dropout method to facilitate data sharing between neighboring nodes, reducing data sparsity. The empirical results on real-world datasets determine that our proposed model for sentiment prediction can surpass trending rivals. Moreover, compared to other ensemble approaches, the proposed transfer-based algorithm can better utilize the latent cognitive cues and foster the prediction outcomes. Based on the given extrinsic and intrinsic analysis results, we note that compared to other theoretical-based techniques, the proposed hierarchical clustering approach can better group the users within the adaptive tree.
【4】 AdaptCL: Efficient Collaborative Learning with Dynamic and Adaptive Pruning 标题:AdaptCL:动态自适应剪枝的高效协作学习
作者:Guangmeng Zhou,Ke Xu,Qi Li,Yang Liu,Yi Zhao 机构:Department of Computer Science and Technology, Tsinghua University, Beijing, China 链接:https://arxiv.org/abs/2106.14126 摘要:在多方协作学习中,参数服务器向每个数据持有者发送一个全局模型进行局部训练,然后在全局范围内聚合提交的模型以实现隐私保护。然而,同步协作学习的拖沓问题和异步协作学习的陈旧性问题都使得协作学习在现实的异构环境中效率低下。我们提出了一个新的高效的协作学习框架adadcl,该框架不需要任何关于工人能力的先验信息,从全局基础模型中为每个数据持有者动态生成一个自适应子模型。所有工作者(数据持有者)通过为他们配备适应能力的修剪模型,实现与最快工作者大致相同的更新时间。因此,训练过程可以大大加快。此外,我们还针对AdaptCL定制了高效的剪枝率学习算法和剪枝方法。同时,AdaptCL提供了一种机制来处理精度和时间开销之间的折衷,并可以与其他技术相结合,进一步加快训练速度。实验结果表明,AdaptCL引入的计算和通信开销很小。AdaptCL平均节省了41%以上的时间,并在低异构环境中提高了准确性。在高度异构的环境中,AdaptCL实现了6.2倍的训练加速,但精确度略有下降。 摘要:In multi-party collaborative learning, the parameter server sends a global model to each data holder for local training and then aggregates committed models globally to achieve privacy protection. However, both the dragger issue of synchronous collaborative learning and the staleness issue of asynchronous collaborative learning make collaborative learning inefficient in real-world heterogeneous environments. We propose a novel and efficient collaborative learning framework named AdaptCL, which generates an adaptive sub-model dynamically from the global base model for each data holder, without any prior information about worker capability. All workers (data holders) achieve approximately identical update time as the fastest worker by equipping them with capability-adapted pruned models. Thus the training process can be dramatically accelerated. Besides, we tailor the efficient pruned rate learning algorithm and pruning approach for AdaptCL. Meanwhile, AdaptCL provides a mechanism for handling the trade-off between accuracy and time overhead and can be combined with other techniques to accelerate training further. Empirical results show that AdaptCL introduces little computing and communication overhead. AdaptCL achieves time savings of more than 41% on average and improves accuracy in a low heterogeneous environment. In a highly heterogeneous environment, AdaptCL achieves a training speedup of 6.2x with a slight loss of accuracy.
【5】 Domain Conditional Predictors for Domain Adaptation 标题:面向域自适应的域条件预测器
作者:Joao Monteiro,Xavier Gibert,Jianqiao Feng,Vincent Dumoulin,Dar-Shyang Lee 机构:Google 备注:Part of the pre-registration workshop at NeurIPS 2020: this https URL 链接:https://arxiv.org/abs/2106.13899 摘要:学习保证通常依赖于i.i.d.数据的假设,一旦预测者被部署到执行现实任务中,在实践中很可能会违反这些假设。因此,域自适应方法作为一个有用的框架出现,在支持不同的训练和测试数据分布时产生额外的灵活性,前提是满足其他假设,如协变量移位,即期望标签上的条件分布独立于基础数据分布。为了在不同的训练数据源和测试数据源之间进行泛化,引入了几种方法,这些方法通常依赖于域不变性的一般思想,使得预测模型忽略了数据生成分布。在本文中,我们通过从相反的方向来处理跨数据源的泛化问题:我们考虑一种条件建模方法,其中预测除了依赖于输入数据外,还使用与底层数据生成分布相关的信息。例如,该模型有一个明确的机制来适应不断变化的环境和/或新的数据源。我们认为,这种方法比现有的域自适应方法更具普遍适用性,因为它不需要额外的假设,如协变量移位,并进一步产生更简单的训练算法,避免了通常在域不变方法中使用的minimax公式引起的训练不稳定性的共同来源。 摘要:Learning guarantees often rely on assumptions of i.i.d. data, which will likely be violated in practice once predictors are deployed to perform real-world tasks. Domain adaptation approaches thus appeared as a useful framework yielding extra flexibility in that distinct train and test data distributions are supported, provided that other assumptions are satisfied such as covariate shift, which expects the conditional distributions over labels to be independent of the underlying data distribution. Several approaches were introduced in order to induce generalization across varying train and test data sources, and those often rely on the general idea of domain-invariance, in such a way that the data-generating distributions are to be disregarded by the prediction model. In this contribution, we tackle the problem of generalizing across data sources by approaching it from the opposite direction: we consider a conditional modeling approach in which predictions, in addition to being dependent on the input data, use information relative to the underlying data-generating distribution. For instance, the model has an explicit mechanism to adapt to changing environments and/or new data sources. We argue that such an approach is more generally applicable than current domain adaptation methods since it does not require extra assumptions such as covariate shift and further yields simpler training algorithms that avoid a common source of training instabilities caused by minimax formulations, often employed in domain-invariant methods.
【6】 Multimodal Few-Shot Learning with Frozen Language Models 标题:基于冻结语言模型的多模态少样本学习
作者:Maria Tsimpoukelli,Jacob Menick,Serkan Cabi,S. M. Ali Eslami,Oriol Vinyals,Felix Hill 机构:DeepMind, University College London 链接:https://arxiv.org/abs/2106.13884 摘要:在足够大的范围内训练时,自回归语言模型表现出显著的学习新语言任务的能力。在这里,我们提出了一个简单,但有效的方法,将这种少数镜头的学习能力转移到一个多模式的设置(视觉和语言)。使用对齐的图像和字幕数据,我们训练一个视觉编码器,将每幅图像表示为一系列连续的嵌入,这样一个预先训练的、冻结的语言模型就会生成相应的字幕。由此产生的系统是一个多模态的少数镜头学习者,具有惊人的能力,学习各种新的任务时,条件的例子,表现为一个序列的多重交织图像和文本嵌入。我们证明,它可以快速学习新对象和新的视觉类别的单词,只使用少数的例子进行视觉问答,并利用外部知识,通过测量一个单一的模型对各种已建立和新的基准。 摘要:When trained at sufficient scale, auto-regressive language models exhibit the notable ability to learn a new language task after being prompted with just a few examples. Here, we present a simple, yet effective, approach for transferring this few-shot learning ability to a multimodal setting (vision and language). Using aligned image and caption data, we train a vision encoder to represent each image as a sequence of continuous embeddings, such that a pre-trained, frozen language model prompted with this prefix generates the appropriate caption. The resulting system is a multimodal few-shot learner, with the surprising ability to learn a variety of new tasks when conditioned on examples, represented as a sequence of multiple interleaved image and text embeddings. We demonstrate that it can rapidly learn words for new objects and novel visual categories, do visual question-answering with only a handful of examples, and make use of outside knowledge, by measuring a single model on a variety of established and new benchmarks.
强化学习(6篇)
【1】 Causal Reinforcement Learning using Observational and Interventional Data 标题:使用观测和干预数据的因果强化学习
作者:Maxime Gasse,Damien Grasset,Guillaume Gaudron,Pierre-Yves Oudeyer 机构:Polytechnique Montréal, Montréal QC, Canada, IRT Saint Exupéry Canada, Ubisoft La Forge, Bordeaux, France, Inria Bordeaux Sud-Ouest 链接:https://arxiv.org/abs/2106.14421 摘要:有效地学习环境的因果模型是基于模型的RL代理在POMDPs中运行的一个关键挑战。我们在这里考虑这样一个场景:学习代理能够通过与环境的直接交互(干预数据)收集在线经验,但也可以通过观察另一个代理与环境交互(观察数据)获得大量离线经验。一个关键的因素,使这种情况不平凡,是我们允许观察到的代理与环境互动的基础上隐藏的信息,这是没有观察到的学习代理。然后,我们提出以下问题:在线和离线经验可以安全地结合起来学习因果模型吗?我们能期望离线体验提高代理的性能吗?为了回答这些问题,我们从已经建立的do演算因果框架中引入了一些想法,并将基于模型的强化学习表示为一个因果推理问题。然后,我们提出了一种在学习过程中利用离线数据的通用而简单的方法。简言之,该方法依赖于学习一个解释干预和观察机制的基于潜变量的因果转换模型,然后使用恢复的潜变量通过解发现来推断标准的POMDP转换模型。我们证明了我们的方法是正确和有效的,在这个意义上,它获得了更好的推广保证由于离线数据(在渐近的情况下),我们说明了其有效性的经验对合成玩具问题。我们的贡献旨在弥合强化学习和因果关系领域之间的差距。 摘要:Learning efficiently a causal model of the environment is a key challenge of model-based RL agents operating in POMDPs. We consider here a scenario where the learning agent has the ability to collect online experiences through direct interactions with the environment (interventional data), but has also access to a large collection of offline experiences, obtained by observing another agent interacting with the environment (observational data). A key ingredient, that makes this situation non-trivial, is that we allow the observed agent to interact with the environment based on hidden information, which is not observed by the learning agent. We then ask the following questions: can the online and offline experiences be safely combined for learning a causal model ? And can we expect the offline experiences to improve the agent's performances ? To answer these questions, we import ideas from the well-established causal framework of do-calculus, and we express model-based reinforcement learning as a causal inference problem. Then, we propose a general yet simple methodology for leveraging offline data during learning. In a nutshell, the method relies on learning a latent-based causal transition model that explains both the interventional and observational regimes, and then using the recovered latent variable to infer the standard POMDP transition model via deconfounding. We prove our method is correct and efficient in the sense that it attains better generalization guarantees due to the offline data (in the asymptotic case), and we illustrate its effectiveness empirically on synthetic toy problems. Our contribution aims at bridging the gap between the fields of reinforcement learning and causality.
【2】 Regret Analysis in Deterministic Reinforcement Learning 标题:确定性强化学习中的后悔分析
作者:Damianos Tranos,Alexandre Proutiere 机构: School of Electrical Engineering and Computer Science 链接:https://arxiv.org/abs/2106.14338 摘要:考虑具有确定性转移的马尔可夫决策过程(MDPs),研究后悔最小化问题,这是分析和设计最优学习算法的核心。我们提出了显式依赖于系统参数的对数问题特定遗憾下界(与以前的minimax方法相比),从而真正量化了任何学习算法可达到的性能基本极限。确定性mdp可以解释为图,并根据它们的循环进行分析,我们利用这一事实来识别一类确定性mdp,其遗憾下限可以通过数值确定。我们在一个确定性的线搜索问题和一个具有状态相关奖励的确定性MDP上进一步举例说明了这个结果,我们可以显式地说明它的遗憾下界。这些界与已知的多臂bandit问题的特定问题界有相似之处,并表明在确定性MDP上导航不必对学习算法的性能产生影响。 摘要:We consider Markov Decision Processes (MDPs) with deterministic transitions and study the problem of regret minimization, which is central to the analysis and design of optimal learning algorithms. We present logarithmic problem-specific regret lower bounds that explicitly depend on the system parameter (in contrast to previous minimax approaches) and thus, truly quantify the fundamental limit of performance achievable by any learning algorithm. Deterministic MDPs can be interpreted as graphs and analyzed in terms of their cycles, a fact which we leverage in order to identify a class of deterministic MDPs whose regret lower bound can be determined numerically. We further exemplify this result on a deterministic line search problem, and a deterministic MDP with state-dependent rewards, whose regret lower bounds we can state explicitly. These bounds share similarities with the known problem-specific bound of the multi-armed bandit problem and suggest that navigation on a deterministic MDP need not have an effect on the performance of a learning algorithm.
【3】 Concentration of Contractive Stochastic Approximation and Reinforcement Learning 标题:压缩随机逼近的集中度与强化学习
作者:Siddharth Chandak,Vivek S. Borkar 机构:Electrical Engineering, Indian Institute of Technology Bombay, Mumbai, India 备注:15 pages, Submitted to Stochastic Systems 链接:https://arxiv.org/abs/2106.14308 摘要:利用鞅浓度不等式,导出了同时具有压缩映射、鞅差分和Markov噪声的随机逼近算法的浓度界。这些被应用于强化学习算法,特别是异步Q-学习和TD(0)。 摘要:Using a martingale concentration inequality, concentration bounds `from time $n_0$ on' are derived for stochastic approximation algorithms with contractive maps and both martingale difference and Markov noises. These are applied to reinforcement learning algorithms, in particular to asynchronous Q-learning and TD(0).
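编者按:为便于理解"压缩(收缩)映射 + 鞅差/马尔可夫噪声"的随机逼近框架如何覆盖异步Q-学习,下面补充标准的Q-学习更新式。这是通用背景知识而非该论文原文公式,具体的步长条件与常数以论文为准:
$$Q_{n+1}(s,a) = Q_n(s,a) + \alpha_n(s,a)\Big[r(s,a) + \gamma \max_{a'} Q_n(s',a') - Q_n(s,a)\Big]$$
该更新可改写为随机逼近迭代 $x_{n+1} = x_n + \alpha_n\big(F(x_n) - x_n + M_{n+1}\big)$,其中 $F(Q)(s,a) = r(s,a) + \gamma\,\mathbb{E}_{s'}\big[\max_{a'} Q(s',a')\big]$ 在 $\|\cdot\|_\infty$ 下是 $\gamma$-压缩映射,$M_{n+1}$ 为鞅差噪声,因而落入摘要所述的集中不等式分析框架。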
【4】 Model-Advantage Optimization for Model-Based Reinforcement Learning 标题:基于模型的强化学习的模型优势优化
作者:Nirbhay Modhe,Harish Kamath,Dhruv Batra,Ashwin Kalyan 机构:Georgia Tech, Allen Institute for AI 链接:https://arxiv.org/abs/2106.14080 摘要:传统上,基于模型的强化学习(MBRL)算法的设计目标是学习环境的精确动态。这导致了模型学习的目标与寻找最优策略的整体学习问题之间的不匹配。价值感知模型学习是最大似然法的一种替代模型学习范式,它通过学习策略的价值函数为模型学习提供信息。虽然这种模式在理论上是合理的,但它并没有超出玩具设置的范围。在这项工作中,我们提出了一个新的价值意识的目标,这是一个上限的绝对性能差异的政策在两个模型。此外,我们还提出了一个通用算法,该算法修改了标准的MBRL管道——实现具有价值感知目标的学习。我们提出的目标,结合这个算法,是第一个成功的实例价值意识的MBRL在具有挑战性的连续控制环境,优于以往的价值意识的目标和具有竞争力的性能w.r.t.MLE为基础的MBRL方法。 摘要:Model-based Reinforcement Learning (MBRL) algorithms have been traditionally designed with the goal of learning accurate dynamics of the environment. This introduces a mismatch between the objectives of model-learning and the overall learning problem of finding an optimal policy. Value-aware model learning, an alternative model-learning paradigm to maximum likelihood, proposes to inform model-learning through the value function of the learnt policy. While this paradigm is theoretically sound, it does not scale beyond toy settings. In this work, we propose a novel value-aware objective that is an upper bound on the absolute performance difference of a policy across two models. Further, we propose a general purpose algorithm that modifies the standard MBRL pipeline -- enabling learning with value aware objectives. Our proposed objective, in conjunction with this algorithm, is the first successful instantiation of value-aware MBRL on challenging continuous control environments, outperforming previous value-aware objectives and with competitive performance w.r.t. MLE-based MBRL approaches.
【5】 Compositional Reinforcement Learning from Logical Specifications 标题:来自逻辑规范的组合强化学习
作者:Kishor Jothimurugan,Suguman Bansal,Osbert Bastani,Rajeev Alur 机构:University of Pennsylvania 链接:https://arxiv.org/abs/2106.13906 摘要:研究了逻辑规范下复杂任务的学习控制策略问题。最近的方法自动从一个给定的规范生成一个奖励函数,并使用适当的强化学习算法来学习一个使期望奖励最大化的策略。然而,这些方法对于需要高层次规划的复杂任务的扩展性很差。在这项工作中,我们开发了一种组合学习方法,称为DiRL,它将高级规划和强化学习交织在一起。首先,DiRL将规范编码为抽象图;直观地说,图的顶点和边分别对应于状态空间和简单子任务的区域。然后,我们的方法结合强化学习来学习Dijkstra式规划算法中每个边(子任务)的神经网络策略,以计算图中的高级规划。在一组具有连续状态空间和动作空间的具有挑战性的控制基准上对所提出的方法进行了评估,结果表明该方法优于最新的基准。 摘要:We study the problem of learning control policies for complex tasks given by logical specifications. Recent approaches automatically generate a reward function from a given specification and use a suitable reinforcement learning algorithm to learn a policy that maximizes the expected reward. These approaches, however, scale poorly to complex tasks that require high-level planning. In this work, we develop a compositional learning approach, called DiRL, that interleaves high-level planning and reinforcement learning. First, DiRL encodes the specification as an abstract graph; intuitively, vertices and edges of the graph correspond to regions of the state space and simpler sub-tasks, respectively. Our approach then incorporates reinforcement learning to learn neural network policies for each edge (sub-task) within a Dijkstra-style planning algorithm to compute a high-level plan in the graph. An evaluation of the proposed approach on a set of challenging control benchmarks with continuous state and action spaces demonstrates that it outperforms state-of-the-art baselines.
【6】 AutoPipeline: Synthesize Data Pipelines By-Target Using Reinforcement Learning and Search 标题:AutoPipeline:使用强化学习和搜索按目标合成数据管道
作者:Junwen Yang,Yeye He,Surajit Chaudhuri 机构:University of Chicago, Microsoft Research 链接:https://arxiv.org/abs/2106.13861 摘要:最近的工作在帮助用户自动化单个数据准备步骤方面取得了重大进展,如字符串转换和表操作操作符(如Join、GroupBy、Pivot等)。在这项工作中,我们建议通过综合具有字符串转换和表操作操作符的复杂数据管道,实现多个这样的步骤的端到端自动化。我们提出了一种新颖的“按目标”范例,允许用户轻松地指定所需的管道,这与传统的按示例范例有很大的不同。使用by target,用户将提供输入表(例如csv或json文件),并将我们指向一个“目标表”(例如,现有的数据库表或BI仪表板),以演示所需管道的输出是如何示意性地“看起来像”的。虽然这个问题似乎没有得到充分的说明,但我们的独特见解是,可以利用FDs和key等隐式表约束来显著地约束空间,从而使问题易于处理。我们开发了一个自动管道系统,学习综合管道使用强化学习和搜索。对从GitHub抓取的大量真实管道进行的实验表明,自动管道平均可以在10-20秒内成功合成60-70%的复杂管道(最多10步)。 摘要:Recent work has made significant progress in helping users to automate single data preparation steps, such as string-transformations and table-manipulation operators (e.g., Join, GroupBy, Pivot, etc.). We in this work propose to automate multiple such steps end-to-end, by synthesizing complex data pipelines with both string transformations and table-manipulation operators. We propose a novel "by-target" paradigm that allows users to easily specify the desired pipeline, which is a significant departure from the traditional by-example paradigm. Using by-target, users would provide input tables (e.g., csv or json files), and point us to a "target table" (e.g., an existing database table or BI dashboard) to demonstrate how the output from the desired pipeline would schematically "look like". While the problem is seemingly underspecified, our unique insight is that implicit table constraints such as FDs and keys can be exploited to significantly constrain the space to make the problem tractable. We develop an Auto-Pipeline system that learns to synthesize pipelines using reinforcement learning and search. Experiments on large numbers of real pipelines crawled from GitHub suggest that Auto-Pipeline can successfully synthesize 60-70% of these complex pipelines (up to 10 steps) in 10-20 seconds on average.
符号|符号学习(1篇)
【1】 A Neural-symbolic Approach for Ontology-mediated Query Answering 标题:本体介导的查询回答的神经符号方法
作者:Medina Andresel,Csaba Domokos,Daria Stepanova,Trung-Kien Tran 机构:Bosch Center for AI, TU Wien 链接:https://arxiv.org/abs/2106.14052 摘要:近年来,知识图的低维向量空间表示被应用于不完全知识图上合取查询的求解。然而,目前的方法主要集中在归纳推理,即基于从数据中学习到的模式,通过预测事实来回答问题,缺乏应用外部领域知识进行演绎推理的能力。这样的(专家或常识)领域知识是一种宝贵的资源,可以用来提高机器智能。为了解决这一问题,我们引入了一种神经符号方法,用于在嵌入空间操作的不完全KG上进行本体介导的CQ应答。更具体地说,我们提出了各种数据扩充策略,使用基于查询重写的方法生成训练查询,然后利用一种新的损失函数来训练模型。实验结果证明了我们的训练策略和新的损失函数的有效性,即在需要归纳推理和演绎推理的情况下,我们的方法明显优于基线。 摘要:Recently, low-dimensional vector space representations of knowledge graphs (KGs) have been applied to find answers to conjunctive queries (CQs) over incomplete KGs. However, the current methods only focus on inductive reasoning, i.e. answering CQs by predicting facts based on patterns learned from the data, and lack the ability of deductive reasoning by applying external domain knowledge. Such (expert or commonsense) domain knowledge is an invaluable resource which can be used to advance machine intelligence. To address this shortcoming, we introduce a neural-symbolic method for ontology-mediated CQ answering over incomplete KGs that operates in the embedding space. More specifically, we propose various data augmentation strategies to generate training queries using query-rewriting based methods and then exploit a novel loss function for training the model. The experimental results demonstrate the effectiveness of our training strategies and the new loss function, i.e., our method significantly outperforms the baseline in the settings that require both inductive and deductive reasoning.
医学相关(7篇)
【1】 Improving Prediction of Low-Prior Clinical Events with Simultaneous General Patient-State Representation Learning 标题:利用同步的一般患者状态表征学习改进低先验临床事件的预测
作者:Matthew Barren,Milos Hauskrecht 机构:University of Pittsburgh 备注:Accepted at 19th International Conference on Artificial Intelligence in Medicine (AIME 2021) 链接:https://arxiv.org/abs/2106.14838 摘要:低先验目标在许多重要的临床事件中很常见,这带来了难以获得足够数据来支撑预测模型学习的挑战。许多以前的工作都是通过先建立一个通用的病人状态表示模型,再将其适配到一个新的低先验预测目标来解决这个问题。在这个模式中,一般病人状态模型和目标任务之间的不一致可能阻碍预测性能。为了克服这一挑战,我们提出了一种新的方法,通过低先验监督目标和通用病人状态表示(GPSR)的多任务学习同时优化共享模型。更具体地说,我们的方法通过联合优化一个共享模型来提高低先验任务的预测性能,该模型将目标事件的损失与大量一般临床事件的损失相结合。我们在循环神经网络(RNN)的背景下研究了该方法。通过使用MIMIC-III数据对多个临床事件目标进行的大量实验,我们发现在模型训练中加入一般患者状态表示任务可以提高对单个低先验目标的预测能力。 摘要:Low-prior targets are common among many important clinical events, which introduces the challenge of having enough data to support learning of their predictive models. Many prior works have addressed this problem by first building a general patient-state representation model, and then adapting it to a new low-prior prediction target. In this schema, there is potential for the predictive performance to be hindered by the misalignment between the general patient-state model and the target task. To overcome this challenge, we propose a new method that simultaneously optimizes a shared model through multi-task learning of both the low-prior supervised target and general purpose patient-state representation (GPSR). More specifically, our method improves prediction performance of a low-prior task by jointly optimizing a shared model that combines the loss of the target event and a broad range of generic clinical events. We study the approach in the context of Recurrent Neural Networks (RNNs). Through extensive experiments on multiple clinical event targets using MIMIC-III data, we show that the inclusion of general patient-state representation tasks during model training improves the prediction of individual low-prior targets.
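编者按:下面给出一个极简的 PyTorch 示意,说明"低先验目标 + 通用病人状态表示(GPSR)多任务联合训练"这一思路大致如何实现;其中网络结构、事件数量与加权系数均为编者假设,并非论文官方实现。

import torch
import torch.nn as nn

class SharedGPSRModel(nn.Module):
    # 共享的病人状态编码器(GRU),外加一个低先验目标头和若干通用临床事件头
    def __init__(self, n_feats, n_generic_events, hidden=128):
        super().__init__()
        self.encoder = nn.GRU(n_feats, hidden, batch_first=True)
        self.target_head = nn.Linear(hidden, 1)                   # 低先验监督目标
        self.generic_head = nn.Linear(hidden, n_generic_events)   # 通用临床事件(GPSR 任务)

    def forward(self, x):            # x: (B, T, n_feats)
        _, h = self.encoder(x)       # h: (1, B, hidden)
        h = h.squeeze(0)
        return self.target_head(h), self.generic_head(h)

def joint_loss(target_logit, generic_logits, y_target, y_generic, lam=1.0):
    bce = nn.functional.binary_cross_entropy_with_logits
    # 低先验目标损失与通用事件损失相加,多任务联合优化共享编码器
    return bce(target_logit, y_target) + lam * bce(generic_logits, y_generic)

实际使用时,lam 可用于平衡两类任务;摘要的结论是加入通用事件任务有助于低先验目标的预测。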
【2】 RadGraph: Extracting Clinical Entities and Relations from Radiology Reports 标题:RadGraph:从放射学报告中提取临床实体和关系
作者:Saahil Jain,Ashwin Agrawal,Adriel Saporta,Steven QH Truong,Du Nguyen Duong,Tan Bui,Pierre Chambon,Yuhao Zhang,Matthew P. Lungren,Andrew Y. Ng,Curtis P. Langlotz,Pranav Rajpurkar 机构:Stanford University, VinBrain, VinUniversity 链接:https://arxiv.org/abs/2106.14463 摘要:从自由文本放射报告中提取结构化的临床信息可以使放射报告信息用于各种重要的医疗保健应用程序。在我们的工作中,我们提出了RadGraph,一个数据集的实体和关系在全文胸部X射线放射学报告的基础上,我们设计了一个新的信息提取模式结构放射学报告。我们发布了一个开发数据集,该数据集包含来自MIMIC-CXR数据集(14579个实体和10889个关系)的500份放射报告的经委员会认证的放射学家注释,以及一个测试数据集,它包含两组独立的board certified radiologist注解,用于100份放射报告,在MIMIC-CXR和CheXpert数据集中平均分配。利用这些数据集,我们训练并测试了一个深度学习模型RadGraph Benchmark,该模型在MIMIC-CXR和CheXpert测试集上的关系提取的micro F1分别达到0.82和0.73。此外,我们还发布了一个推断数据集,其中包含由RadGraph Benchmark自动生成的注释,这些注释跨越220763份MIMIC-CXR报告(约600万个实体和400万个关系)和500份CheXpert报告(13783个实体和9908个关系),并映射到相关胸片。我们免费提供的数据集可以促进医学自然语言处理方面的广泛研究,以及计算机视觉和多模式学习(与胸片相关)。 摘要:Extracting structured clinical information from free-text radiology reports can enable the use of radiology report information for a variety of critical healthcare applications. In our work, we present RadGraph, a dataset of entities and relations in full-text chest X-ray radiology reports based on a novel information extraction schema we designed to structure radiology reports. We release a development dataset, which contains board-certified radiologist annotations for 500 radiology reports from the MIMIC-CXR dataset (14,579 entities and 10,889 relations), and a test dataset, which contains two independent sets of board-certified radiologist annotations for 100 radiology reports split equally across the MIMIC-CXR and CheXpert datasets. Using these datasets, we train and test a deep learning model, RadGraph Benchmark, that achieves a micro F1 of 0.82 and 0.73 on relation extraction on the MIMIC-CXR and CheXpert test sets respectively. Additionally, we release an inference dataset, which contains annotations automatically generated by RadGraph Benchmark across 220,763 MIMIC-CXR reports (around 6 million entities and 4 million relations) and 500 CheXpert reports (13,783 entities and 9,908 relations) with mappings to associated chest radiographs. Our freely available dataset can facilitate a wide range of research in medical natural language processing, as well as computer vision and multi-modal learning when linked to chest radiographs.
【3】 Tiled sparse coding in eigenspaces for the COVID-19 diagnosis in chest X-ray images 标题:特征空间平铺稀疏编码在胸部X线图像冠状病毒诊断中的应用
作者:Juan E. Arco,Andrés Ortiz,Javier Ramírez,Juan M Gorriz 机构: Department of Signal Theory, Networking and Communications, Universidad de Granada, Department of Communications Engineering, Universidad de Malaga 备注:14 pages, 5 figures 链接:https://arxiv.org/abs/2106.14724 摘要:持续的冠状病毒-19(Coronavirus disease 2019)大流行危机改变了世界。据世界卫生组织(WHO)统计,已有400万人死于该病,而确诊的COVID-19病例已超过1.8亿。许多国家卫生系统的崩溃表明,需要开发工具,从医学影像中自动诊断该病。以前的研究都是用深度学习来达到这个目的。然而,这种方法的性能在很大程度上取决于用于训练算法的数据集的大小。在这项工作中,我们提出了一个基于稀疏编码的分类框架,以识别与不同病理相关的肺炎模式。具体来说,每个胸部X射线(CXR)图像被分割成不同的块。从主成分分析中提取最相关的特征,然后在稀疏编码过程中建立字典。一旦图像从字典的元素转换和重建,分类就从与每幅图像相关联的各个面片的重建误差中执行。在同时区分四种不同病理学(对照组、细菌性肺炎、病毒性肺炎、COVID-19)的真实场景中评估性能。识别肺炎的准确率为93.85%,而在4级分类中获得88.11%。出色的结果和稀疏编码在这个场景中的开创性使用证明了这种方法在现实环境中作为临床医生的辅助手段的适用性。 摘要:The ongoing crisis of the COVID-19 (Coronavirus disease 2019) pandemic has changed the world. According to the World Health Organization (WHO), 4 million people have died due to this disease, whereas there have been more than 180 million confirmed cases of COVID-19. The collapse of the health system in many countries has demonstrated the need of developing tools to automatize the diagnosis of the disease from medical imaging. Previous studies have used deep learning for this purpose. However, the performance of this alternative highly depends on the size of the dataset employed for training the algorithm. In this work, we propose a classification framework based on sparse coding in order to identify the pneumonia patterns associated with different pathologies. Specifically, each chest X-ray (CXR) image is partitioned into different tiles. The most relevant features extracted from PCA are then used to build the dictionary within the sparse coding procedure. Once images are transformed and reconstructed from the elements of the dictionary, classification is performed from the reconstruction errors of individual patches associated with each image. Performance is evaluated in a real scenario where simultaneously differentiation between four different pathologies: control vs bacterial pneumonia vs viral pneumonia vs COVID-19. The accuracy when identifying the presence of pneumonia is 93.85%, whereas 88.11% is obtained in the 4-class classification context. The excellent results and the pioneering use of sparse coding in this scenario evidence the applicability of this approach as an aid for clinicians in a real-world environment.
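编者按:下面用 scikit-learn 给出"图像分块 + PCA 构建字典 + 稀疏编码 + 以重建误差分类"这一思路的极简示意;分块大小、原子数与稀疏度等均为编者假设,并非论文原始实现。

import numpy as np
from sklearn.decomposition import PCA, SparseCoder

def extract_tiles(img, tile=32):
    # 将灰度图像切成不重叠的小块并展平
    h, w = img.shape
    tiles = [img[i:i + tile, j:j + tile].ravel()
             for i in range(0, h - tile + 1, tile)
             for j in range(0, w - tile + 1, tile)]
    return np.stack(tiles)

def class_reconstruction_error(train_imgs, test_img, n_atoms=16, tile=32):
    # 用某一类训练图像的分块构建 PCA 字典,再用稀疏编码重建测试图像的分块
    X = np.vstack([extract_tiles(im, tile) for im in train_imgs])
    pca = PCA(n_components=n_atoms).fit(X)
    D = pca.components_                       # 字典原子(主成分)
    coder = SparseCoder(dictionary=D, transform_algorithm="omp",
                        transform_n_nonzero_coefs=5)
    T = extract_tiles(test_img, tile)
    codes = coder.transform(T)
    recon = codes @ D
    return float(np.mean(np.sum((T - recon) ** 2, axis=1)))  # 平均分块重建误差

分类时可对每个类别(对照、细菌性肺炎、病毒性肺炎、COVID-19)分别计算重建误差,取误差最小者作为预测结果(示意)。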
【4】 Weighted multi-level deep learning analysis and framework for processing breast cancer WSIs 标题:乳腺癌WSIS处理的加权多层深度学习分析框架
作者:Peter Bokor,Lukas Hudec,Ondrej Fabian,Wanda Benesova 机构:Charles University and Thomayer University Hospital 备注:9 pages, 12 images, 3 tables with results, We have an intention to submit this paper to the current journal focused on computer methods/deep learning in biomedicine 链接:https://arxiv.org/abs/2106.14708 摘要:乳腺癌(BC)的预防和早期诊断是选择合适治疗方法的必要前提。对更快、更精确诊断结果日益增长的需求所带来的巨大压力,推动了自动化解决方案的发展。在过去的十年中,深度学习技术已经在多个领域展示了它们的能力,计算机辅助诊断(CAD)就是其中之一。然而,当涉及到全切片图像(WSI)的分析时,现有的大多数工作都是在不同层级上独立地计算预测。这与组织病理学专家的做法形成了对比:专家需要看到在BC分类中至关重要的组织结构的整体架构。我们提出了一个基于深度学习的WSI处理方案与框架,其核心是一种利用图像层级优势的新方法。我们对从多个层级提取的信息进行加权,用于最终的恶性程度分类。我们的结果表明了全局信息的价值:准确率从72.2%提高到84.8%。 摘要:Prevention and early diagnosis of breast cancer (BC) is an essential prerequisite for the selection of proper treatment. The substantial pressure due to the increase of demand for faster and more precise diagnostic results drives for automatic solutions. In the past decade, deep learning techniques have demonstrated their power over several domains, and Computer-Aided (CAD) diagnostic became one of them. However, when it comes to the analysis of Whole Slide Images (WSI), most of the existing works compute predictions from levels independently. This is, however, in contrast to the histopathologist expert approach who requires to see a global architecture of tissue structures important in BC classification. We present a deep learning-based solution and framework for processing WSI based on a novel approach utilizing the advantages of image levels. We apply the weighing of information extracted from several levels into the final classification of the malignancy. Our results demonstrate the profitability of global information with an increase of accuracy from 72.2% to 84.8%.
【5】 Benchmarking convolutional neural networks for diagnosing Lyme disease from images 标题:基于卷积神经网络的图像莱姆病诊断基准
作者:Sk Imran Hossain,Jocelyn de Goër de Herve,Md Shahriar Hassan,Delphine Martineau,Evelina Petrosyan,Violaine Corbain,Jean Beytout,Isabelle Lebert,Elisabeth Baux,Céline Cazorla,Carole Eldin,Yves Hansmann,Solene Patrat-Delon,Thierry Prazuck,Alice Raffetin,Pierre Tattevin,Gwenaël Vourc'H,Olivier Lesens,Engelbert Nguifo 机构:Université Clermont Auvergne, CNRS, ENSMSE, LIMOS, Clermont-Ferrand, France; Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, Saint-Genès-Champanelle 链接:https://arxiv.org/abs/2106.14465 摘要:莱姆病是世界上最常见的媒介传染病之一。在早期阶段,该病在大多数病例中表现为游走性红斑(EM)皮肤病变。更好地诊断这些早期形式,并配合适当的抗生素治疗,将有助于改善预后,防止病情进展为严重的晚期形式。最近的研究表明,卷积神经网络(CNN)能很好地从图像中识别皮肤病变,但从EM病变图像预测莱姆病的工作并不多。本研究的主要目的是系统分析CNN在基于图像的莱姆病诊断中的有效性,并找出最适合该任务的CNN架构。主要出于隐私考虑,目前还没有公开可用的莱姆病EM图像数据集。在这项研究中,我们使用了一个EM数据集,其图像来自法国克莱蒙费朗大学医院中心(CF-CHU)和互联网;CF-CHU从法国几家医院收集了这些图像。该数据集由来自CF-CHU的皮肤科专家和感染科专家标注。首先,我们从预测性能指标、计算复杂度指标和统计显著性检验等方面,在该数据集上对23种著名的CNN架构进行了基准测试。其次,为了提高CNN的性能,我们采用了基于ImageNet预训练模型的迁移学习,并利用皮肤病变数据集"10000张训练图像的人机对抗(HAM1000)"对CNN进行了预训练。在此过程中,我们为每个CNN寻找迁移学习微调时解冻的最佳层数。第三,在模型可解释性方面,我们利用梯度加权类激活映射来可视化对CNN预测有重要贡献的输入区域。第四,我们提供了基于预测性能和计算复杂度的模型选择准则。我们的研究证实了若干轻量级CNN用于莱姆病预筛查移动应用的有效性和潜力。我们还在 https://dappem.limos.fr/download.html 公开了所有训练好的模型,可供他人用于迁移学习和构建莱姆病预筛查工具。 摘要:Lyme disease is one of the most common infectious vector-borne diseases in the world. In the early stage, the disease manifests itself in most cases with erythema migrans (EM) skin lesions. Better diagnosis of these early forms would allow improving the prognosis by preventing the transition to a severe late form thanks to appropriate antibiotic therapy. Recent studies show that convolutional neural networks (CNNs) perform very well to identify skin lesions from the image but, there is not much work for Lyme disease prediction from EM lesion images. The main objective of this study is to extensively analyze the effectiveness of CNNs for diagnosing Lyme disease from images and to find out the best CNN architecture for the purpose. There is no publicly available EM image dataset for Lyme disease prediction mainly because of privacy concerns. In this study, we utilized an EM dataset consisting of images collected from Clermont-Ferrand University Hospital Center (CF-CHU) of France and the internet. CF-CHU collected the images from several hospitals in France. This dataset was labeled by expert dermatologists and infectiologists from CF-CHU. First, we benchmarked this dataset for twenty-three well-known CNN architectures in terms of predictive performance metrics, computational complexity metrics, and statistical significance tests. Second, to improve the performance of the CNNs, we used transfer learning from ImageNet pre-trained models as well as pre-trained the CNNs with the skin lesion dataset "Human Against Machine with 10000 training images (HAM1000)". In that process, we searched for the best performing number of layers to unfreeze during transfer learning fine-tuning for each of the CNNs. Third, for model explainability, we utilized Gradient-weighted Class Activation Mapping to visualize the regions of input that are significant to the CNNs for making predictions. Fourth, we provided guidelines for model selection based on predictive performance and computational complexity. Our study confirmed the effectiveness and potential of even some lightweight CNNs to be used for Lyme disease pre-scanner mobile applications.
We also made all the trained models publicly available at https://dappem.limos.fr/download.html, which can be used by others for transfer learning and building pre-scanners for Lyme disease.
【6】 Learning stochastic object models from medical imaging measurements by use of advanced AmbientGANs 标题:利用先进的AmbientGANs从医学成像测量中学习随机对象模型
作者:Weimin Zhou,Sayantan Bhadra,Frank J. Brooks,Hua Li,Mark A. Anastasio 备注:Submitted to IEEE Transactions on Medical Imaging. arXiv admin note: substantial text overlap with arXiv:2006.00033 链接:https://arxiv.org/abs/2106.14324 摘要:为了通过计算机模拟客观地评估新的医学成像技术,必须考虑促成图像数据的所有变异来源。其中一个可能显著限制观察者表现的重要变异来源,是待成像对象总体自身的变异性。这种变异性可以用随机对象模型(SOM)来描述:SOM是一种生成式模型,可用于从待虚拟成像对象的分布中采样。通常希望利用在特性良好的成像系统上获得的实验成像测量来建立SOM,但这项任务仍然具有挑战性。深度生成神经网络,如生成对抗网络(GAN),在此类任务中具有潜力。为了从成像测量中建立SOM,已有工作提出了在GAN中加入测量算子的AmbientGAN。然而,原始的AmbientGAN无法直接受益于现代的训练流程和GAN结构,这限制了它应用于真实尺寸医学图像数据的能力。为了避免这一问题,本文提出了一种改进的AmbientGAN训练策略,它适用于现代的渐进式或多分辨率训练方法,如渐进式增长GAN(Progressive Growing of GANs)和基于风格的GAN(Style-based GANs)。利用所提训练流程建立的AmbientGAN,通过与程式化成像系统相对应的计算机模拟测量数据,以受控方式进行了系统的验证。最后,利用模拟的单线圈实验磁共振成像数据,在较少程式化的条件下验证了该方法。 摘要:In order to objectively assess new medical imaging technologies via computer-simulations, it is important to account for all sources of variability that contribute to image data. One important source of variability that can significantly limit observer performance is associated with the variability in the ensemble of objects to-be-imaged. This source of variability can be described by stochastic object models (SOMs), which are generative models that can be employed to sample from a distribution of to-be-virtually-imaged objects. It is generally desirable to establish SOMs from experimental imaging measurements acquired by use of a well-characterized imaging system, but this task has remained challenging. Deep generative neural networks, such as generative adversarial networks (GANs) hold potential for such tasks. To establish SOMs from imaging measurements, an AmbientGAN has been proposed that augments a GAN with a measurement operator. However, the original AmbientGAN could not immediately benefit from modern training procedures and GAN architectures, which limited its ability to be applied to realistically sized medical image data. To circumvent this, in this work, a modified AmbientGAN training strategy is proposed that is suitable for modern progressive or multi-resolution training approaches such as employed in the Progressive Growing of GANs and Style-based GANs. AmbientGANs established by use of the proposed training procedure are systematically validated in a controlled way by use of computer-simulated measurement data corresponding to a stylized imaging system. Finally, emulated single-coil experimental magnetic resonance imaging data are employed to demonstrate the methods under less stylized conditions.
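编者按:AmbientGAN 的核心是在判别器前加入测量算子,使判别器比较"真实测量"与"对生成对象施加同一测量算子后的模拟测量"。下面是一个极简的 PyTorch 训练步骤示意,测量算子此处假设为随机像素掩码,网络结构、损失形式与超参数均为编者假设,并非论文原始实现:

import torch
import torch.nn.functional as F

def measure(x, keep_prob=0.7):
    # 程式化的测量算子:随机保留部分像素(仅作示意,实际应与成像系统一致)
    mask = (torch.rand_like(x) < keep_prob).float()
    return x * mask

def ambientgan_step(G, D, opt_g, opt_d, real_measurements, z_dim=64):
    b = real_measurements.size(0)
    z = torch.randn(b, z_dim)
    fake_objects = G(z)                        # 生成"虚拟成像对象"
    fake_measurements = measure(fake_objects)  # 对生成对象施加测量算子

    # 判别器:区分真实测量与模拟测量(非饱和 GAN 损失)
    d_loss = (F.softplus(-D(real_measurements)).mean()
              + F.softplus(D(fake_measurements.detach())).mean())
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 生成器:让模拟测量骗过判别器
    g_loss = F.softplus(-D(measure(G(z)))).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

论文的贡献在于让这一思路兼容渐进式/多分辨率训练,此处不展开。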
【7】 Residual Moment Loss for Medical Image Segmentation 标题:医学图像分割中的残差矩损失
作者:Quanziang Wang,Renzhen Wang,Yuexiang Li,Kai Ma,Yefeng Zheng,Deyu Meng 机构: School of Mathematics and Statistics, Xi’an Jiaotong University, Xi’an, China, Tencent Jarvis Lab, Shenzhen, China 链接:https://arxiv.org/abs/2106.14178 摘要:实验证明,位置信息有助于深度学习模型捕捉目标的流形结构,从而提高医学图像分割的精度。然而,现有的大多数方法都采用隐式的方式对位置信息进行编码,例如距离变换映射,它描述了每个像素到轮廓边界的相对距离,便于网络学习。这些隐式方法不能充分利用目标的位置信息(即绝对位置)。本文提出了一种新的损失函数,即剩余矩损失函数,用于在深度学习网络训练过程中嵌入分割目标的位置信息。特别是在图像矩的激励下,利用坐标信息对分割预测图和地面真值图进行加权。然后,我们的RM-loss鼓励网络保持两个加权映射之间的一致性,这使得分割网络能够很容易地定位目标并提取出多个与结构相关的特征。我们通过在两个公开的数据集上进行广泛的实验,即二维视杯和视盘分割和三维左心房分割,来验证所提出的RM丢失。实验结果证明了RM-loss算法的有效性,显著提高了分割网络的精度。 摘要:Location information is proven to benefit the deep learning models on capturing the manifold structure of target objects, and accordingly boosts the accuracy of medical image segmentation. However, most existing methods encode the location information in an implicit way, e.g. the distance transform maps, which describe the relative distance from each pixel to the contour boundary, for the network to learn. These implicit approaches do not fully exploit the position information (i.e. absolute location) of targets. In this paper, we propose a novel loss function, namely residual moment (RM) loss, to explicitly embed the location information of segmentation targets during the training of deep learning networks. Particularly, motivated by image moments, the segmentation prediction map and ground-truth map are weighted by coordinate information. Then our RM loss encourages the networks to maintain the consistency between the two weighted maps, which promotes the segmentation networks to easily locate the targets and extract manifold-structure-related features. We validate the proposed RM loss by conducting extensive experiments on two publicly available datasets, i.e., 2D optic cup and disk segmentation and 3D left atrial segmentation. The experimental results demonstrate the effectiveness of our RM loss, which significantly boosts the accuracy of segmentation networks.
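编者按:按摘要的描述,RM(残差矩)损失用坐标信息分别对预测图与真值图加权,并约束两张加权图保持一致。下面给出对这一思想的极简 PyTorch 示意;坐标归一化方式与距离度量为编者假设,并非论文官方实现:

import torch

def residual_moment_loss(pred, gt):
    # pred, gt: (B, 1, H, W),pred 为分割概率图,gt 为二值真值图
    B, _, H, W = pred.shape
    ys = torch.linspace(0, 1, H, device=pred.device).view(1, 1, H, 1)
    xs = torch.linspace(0, 1, W, device=pred.device).view(1, 1, 1, W)
    loss = 0.0
    for coord in (ys, xs):
        # 用坐标对两张图加权(类似图像矩),并约束两张加权图一致
        loss = loss + torch.mean((pred * coord - gt * coord) ** 2)
    return loss

实际训练中可将该项与 Dice/交叉熵等常规分割损失加权相加使用(示意)。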
蒸馏|知识提取(3篇)
【1】 PQK: Model Compression via Pruning, Quantization, and Knowledge Distillation 标题:PQK:基于剪枝、量化和知识提炼的模型压缩
作者:Jangho Kim,Simyung Chang,Nojun Kwak 机构:Qualcomm AI Research†, Qualcomm Korea YH, Seoul National University 备注:Proceedings of INTERSPEECH 2021 链接:https://arxiv.org/abs/2106.14681 摘要:随着边缘设备的普及,在边缘设备上部署深度神经网络(DNN)已成为一个关键问题。然而,DNN需要很高的计算资源,这对于边缘设备来说是很少见的。为了解决这个问题,我们提出了一种新的模型压缩方法,用于计算资源有限的设备,称为PQK,包括剪枝、量化和知识提取(KD)过程。与传统的剪枝和KD不同,PQK在剪枝过程中利用剪枝后不重要的权值,使教师网络训练出更好的学生网络,而无需预先训练教师模型。PQK有两个阶段。第1阶段利用迭代剪枝和量化感知训练来建立一个轻量级和节能的模型。在第二阶段中,我们将第一阶段中未使用的不重要权重添加到修剪后的网络中,从而生成一个教师网络。利用这一教师网络,我们将剪枝后的网络训练成学生网络。在这样做时,我们不需要为KD框架预先训练教师网络,因为教师和学生网络共存于同一网络中。我们将我们的方法应用到识别模型中,验证了PQK在关键词识别和图像识别中的有效性。 摘要:As edge devices become prevalent, deploying Deep Neural Networks (DNN) on edge devices has become a critical issue. However, DNN requires a high computational resource which is rarely available for edge devices. To handle this, we propose a novel model compression method for the devices with limited computational resources, called PQK consisting of pruning, quantization, and knowledge distillation (KD) processes. Unlike traditional pruning and KD, PQK makes use of unimportant weights pruned in the pruning process to make a teacher network for training a better student network without pre-training the teacher model. PQK has two phases. Phase 1 exploits iterative pruning and quantization-aware training to make a lightweight and power-efficient model. In phase 2, we make a teacher network by adding unimportant weights unused in phase 1 to a pruned network. By using this teacher network, we train the pruned network as a student network. In doing so, we do not need a pre-trained teacher network for the KD framework because the teacher and the student networks coexist within the same network. We apply our method to the recognition model and verify the effectiveness of PQK on keyword spotting (KWS) and image recognition.
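编者按:下面用 PyTorch 粗略示意 PQK 的两阶段思想:第一阶段按幅值剪枝得到轻量学生模型,第二阶段把剪掉的"不重要权重"加回,与保留权重共同构成教师网络并对剪枝网络做知识蒸馏。阈值选取、量化细节与蒸馏温度均为编者假设,仅供理解流程:

import torch
import torch.nn.functional as F

def magnitude_mask(weight, keep_ratio=0.3):
    # 保留幅值最大的 keep_ratio 比例权重(第一阶段剪枝的示意,未含量化感知训练)
    k = max(1, int(weight.numel() * keep_ratio))
    thresh = weight.abs().flatten().kthvalue(weight.numel() - k + 1).values
    return (weight.abs() >= thresh).float()

def pqk_distill_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # 第二阶段:教师(保留权重 + 不重要权重)指导学生(仅保留权重)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

示意的关键点是:学生前向使用 w * mask,教师前向使用完整的 w,二者共存于同一网络,因此无需额外预训练教师模型。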
【2】 Reward-Based 1-bit Compressed Federated Distillation on Blockchain 标题:区块链上基于奖励的1位压缩联合蒸馏
作者:Leon Witt,Usama Zafar,KuoYeh Shen,Felix Sattler,Dan Li,Wojciech Samek 机构: Tsinghua University, China and Fraunhofer Heinrich Hertz Institute 链接:https://arxiv.org/abs/2106.14265 摘要:最近各种形式的联邦知识提取(FD)的出现为新一代健壮的、通信效率高的联邦学习(FL)铺平了道路,在FL中,只聚合软标签,而不是像以前的FL方案那样聚合深度神经网络(DNN)的整个梯度。这种按设计的安全方法与性能日益提高的物联网(IoT)和移动设备相结合,为利用来自行业以及个人的私有数据作为人工智能模型训练的输入开辟了一个新的可能性领域。然而,在以前的FL系统中,由于工人和中央权力机构之间权力的不平衡、工人无私参与的假设以及无法正确衡量和比较工人的贡献,导致信任的缺乏,阻碍了这项技术从已经被委托的实体的小群体扩展到大规模采用。这项工作的目的是通过引入一个新的去中心化联邦学习框架来缓解上述问题,在这个框架中,大量压缩的1位软标签,类似于1-热标签预测,被聚集在一个智能合约上。在工人的贡献现在很容易比较的情况下,我们修改了FD的众包机制(PTSC)的同侪真相血清(Peer Truth Serum),以激励相容的方式奖励基于同侪一致性的诚实参与。由于计算复杂度和存储量的大幅降低,我们的框架是一个完全基于区块链的FL系统,在简单的智能合约上是可行的,因此区块链是不可知的。我们通过实验测试了我们的新框架并验证了它的理论性质。 摘要:The recent advent of various forms of Federated Knowledge Distillation (FD) paves the way for a new generation of robust and communication-efficient Federated Learning (FL), where mere soft-labels are aggregated, rather than whole gradients of Deep Neural Networks (DNN) as done in previous FL schemes. This security-per-design approach in combination with increasingly performant Internet of Things (IoT) and mobile devices opens up a new realm of possibilities to utilize private data from industries as well as from individuals as input for artificial intelligence model training. Yet in previous FL systems, lack of trust due to the imbalance of power between workers and a central authority, the assumption of altruistic worker participation and the inability to correctly measure and compare contributions of workers hinder this technology from scaling beyond small groups of already entrusted entities towards mass adoption. This work aims to mitigate the aforementioned issues by introducing a novel decentralized federated learning framework where heavily compressed 1-bit soft-labels, resembling 1-hot label predictions, are aggregated on a smart contract. In a context where workers' contributions are now easily comparable, we modify the Peer Truth Serum for Crowdsourcing mechanism (PTSC) for FD to reward honest participation based on peer consistency in an incentive compatible fashion. Due to heavy reductions of both computational complexity and storage, our framework is a fully on-blockchain FL system that is feasible on simple smart contracts and therefore blockchain agnostic. We experimentally test our new framework and validate its theoretical properties.
【3】 Pre-treatment of outliers and anomalies in plant data: Methodology and case study of a Vacuum Distillation Unit 标题:工厂数据中异常值和异常的预处理:方法和减压蒸馏装置的实例研究
作者:Kamil Oster,Stefan Güttel,Jonathan L. Shapiro,Lu Chen,Megan Jobson 机构:Department of Mathematics, The University of Manchester, Alan Turing Building, Oxford Road, Manchester, UK; Process Integration Limited, Station House, Stamford New Road, Altrincham, UK 备注:33 pages, 20 figures, submitted to the Journal of Process Control (ref: JPROCONT-D-21-00332) 链接:https://arxiv.org/abs/2106.14641 摘要:数据预处理对于提高数据质量、从而从原始数据中提取准确信息起着重要作用。常用的数据预处理技术之一是离群点检测,所谓的3$\sigma$方法是识别异常值的常用做法。如文中所示,它没有识别出所有的异常值,可能导致数据的整体统计量失真。这个问题会对后续的数据分析产生重大影响,并会降低预测模型的准确性。异常值检测技术有很多种,但是除了理论工作之外,它们都需要案例研究。本文考虑了两种类型的异常值:短期异常值(错误数据、噪声)和长期异常值(例如较长时间的故障)。使用的数据来自亚洲某炼油厂的减压蒸馏装置(VDU),包括40个物理传感器(温度、压力和流量)。我们使用一种改进的3$\sigma$阈值方法来识别短期异常值,即先按变化点将传感器数据切分为块,再在每个近似服从正态分布的块内计算3$\sigma$阈值。我们证明,分段3$\sigma$方法比对整个时间序列直接使用3$\sigma$方法能更好地检测短期异常值。然而,对于长期异常值(它们可能代表数据中的另一种工况状态),这种方法表现不佳。在这种情况下,我们使用主成分分析(PCA)和Hotelling的$T^2$统计量来识别长期异常值,并对PCA的结果使用DBSCAN聚类方法。那些在视觉上明显且被PCA方法正确检测到的异常值,也被DBSCAN正确识别,这支持了PCA方法的一致性和准确性。 摘要:Data pre-treatment plays a significant role in improving data quality, thus allowing extraction of accurate information from raw data. One of the data pre-treatment techniques commonly used is outliers detection. The so-called 3$\sigma$ method is a common practice to identify the outliers. As shown in the manuscript, it does not identify all outliers, resulting in possible distortion of the overall statistics of the data. This problem can have a significant impact on further data analysis and can lead to reduction in the accuracy of predictive models. There is a plethora of various techniques for outliers detection, however, aside from theoretical work, they all require case study work. Two types of outliers were considered: short-term (erroneous data, noise) and long-term outliers (e.g. malfunctioning for longer periods). The data used were taken from the vacuum distillation unit (VDU) of an Asian refinery and included 40 physical sensors (temperature, pressure and flow rate). We used a modified method for 3$\sigma$ thresholds to identify the short-term outliers, i.e. sensors data are divided into chunks determined by change points and 3$\sigma$ thresholds are calculated within each chunk representing near-normal distribution. We have shown that piecewise 3$\sigma$ method offers a better approach to short-term outliers detection than 3$\sigma$ method applied to the entire time series. Nevertheless, this does not perform well for long-term outliers (which can represent another state in the data). In this case, we used principal component analysis (PCA) with Hotelling's $T^2$ statistics to identify the long-term outliers. The results obtained with PCA were subject to DBSCAN clustering method. The outliers (which were visually obvious and correctly detected by the PCA method) were also correctly identified by DBSCAN which supported the consistency and accuracy of the PCA method.
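编者按:下面用 NumPy 给出"分段 3$\sigma$"短期离群点检测的极简示意:先按(外部给定的)变化点把序列切块,再在每一块内用均值±3倍标准差作为阈值。变化点检测本身与长期离群点的 PCA/Hotelling $T^2$ 部分此处从略:

import numpy as np

def piecewise_three_sigma(x, change_points):
    # x: 一维传感器序列;change_points: 递增的切分下标(不含 0 与 len(x))
    bounds = [0] + list(change_points) + [len(x)]
    outliers = np.zeros(len(x), dtype=bool)
    for s, e in zip(bounds[:-1], bounds[1:]):
        seg = x[s:e]
        mu, sigma = seg.mean(), seg.std()
        outliers[s:e] = np.abs(seg - mu) > 3 * sigma   # 块内 3σ 阈值
    return outliers

例如 x = np.r_[np.random.normal(0, 1, 500), np.random.normal(10, 1, 500)] 时,piecewise_three_sigma(x, [500]) 只会标出各块内部的异常点;而对整条序列直接做 3σ,会因为两个工况混在一起导致阈值失真。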
聚类(1篇)
【1】 Improved Approximation Algorithms for Individually Fair Clustering 标题:改进的个体公平聚类近似算法
作者:Ali Vakilian,Mustafa Yalçıner 链接:https://arxiv.org/abs/2106.14043 摘要:在Jung等人[2020]提出的个体公平性概念下,我们考虑具有$\ell_p$-范数代价的$k$-聚类问题,其中包括$k$-中位数、$k$-均值和$k$-中心等代价函数:给定一个大小为$n$的点集$P$,若$P$中的每个点$v$都能在其$n/k$个最近邻中找到一个中心,则称这组$k$个中心诱导出一个公平聚类。最近,Mahabadi和Vakilian[2020]展示了如何为具有$\ell_p$-范数代价的公平$k$-聚类问题获得$(p^{O(p)},7)$-双准则近似:每个点都能在其到第$(n/k)$个最近邻距离的至多$7$倍范围内找到一个中心,且解的$\ell_p$-范数代价至多是最优公平解代价的$p^{O(p)}$倍。在这项工作中,对于任意$\varepsilon>0$,我们为具有$\ell_p$-范数代价的公平$k$-聚类给出了改进的$(16^p+\varepsilon,3)$-双准则近似。为了实现这一保证,我们扩展了[Charikar et al., 2002, Swamy, 2016]的框架,并设计了一个在拟阵约束下具有$\ell_p$-范数代价的设施选址问题的$16^p$-近似算法,该结果本身可能具有独立的价值。此外,我们的方法给出了从个体公平聚类到Kleindessner等人[2019]提出的带组公平性要求的聚类的一个归约,后者本质上是拟阵中位数问题[Krishnaswamy等人, 2011]。 摘要:We consider the $k$-clustering problem with $\ell_p$-norm cost, which includes $k$-median, $k$-means and $k$-center cost functions, under an individual notion of fairness proposed by Jung et al. [2020]: given a set of points $P$ of size $n$, a set of $k$ centers induces a fair clustering if for every point $v\in P$, $v$ can find a center among its $n/k$ closest neighbors. Recently, Mahabadi and Vakilian [2020] showed how to get a $(p^{O(p)},7)$-bicriteria approximation for the problem of fair $k$-clustering with $\ell_p$-norm cost: every point finds a center within distance at most $7$ times its distance to its $(n/k)$-th closest neighbor and the $\ell_p$-norm cost of the solution is at most $p^{O(p)}$ times the cost of an optimal fair solution. In this work, for any $\varepsilon>0$, we present an improved $(16^p+\varepsilon,3)$-bicriteria approximation for the fair $k$-clustering with $\ell_p$-norm cost. To achieve our guarantees, we extend the framework of [Charikar et al., 2002, Swamy, 2016] and devise a $16^p$-approximation algorithm for the facility location with $\ell_p$-norm cost under matroid constraint which might be of an independent interest. Besides, our approach suggests a reduction from our individually fair clustering to a clustering with a group fairness requirement proposed by Kleindessner et al. [2019], which is essentially the median matroid problem [Krishnaswamy et al., 2011].
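编者按:下面用 NumPy 示意摘要中个体公平性约束的检查方式:每个点 $v$ 若能在其到第 $\lceil n/k\rceil$ 近邻距离的 $\alpha$ 倍范围内找到某个中心,则该点满足(双准则松弛后的)公平约束。代码为编者的示意实现,并非论文的近似算法本身:

import numpy as np

def fairness_radii(P, k):
    # r_v:点 v 到其第 ceil(n/k) 个最近邻的距离
    n = len(P)
    m = min(int(np.ceil(n / k)), n - 1)
    D = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)
    return np.sort(D, axis=1)[:, m]   # 第 0 列是到自身的距离 0,故取下标 m

def is_individually_fair(P, centers, k, alpha=1.0):
    r = fairness_radii(P, k)
    d_to_center = np.min(
        np.linalg.norm(P[:, None, :] - centers[None, :, :], axis=-1), axis=1)
    return bool(np.all(d_to_center <= alpha * r))

取 alpha=3 即对应文中 $(16^p+\varepsilon,3)$-双准则近似里"距离放宽 3 倍"的那一侧保证(示意)。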
自动驾驶|车辆|车道检测等(1篇)
【1】 Realtime Robust Malicious Traffic Detection via Frequency Domain Analysis 标题:基于频域分析的实时鲁棒恶意流量检测
作者:Chuanpu Fu,Qi Li,Meng Shen,Ke Xu 机构:Department of Computer Science and Technology, Tsinghua University, Beijing, China, Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China 备注:To Appear in ACM CCS 2021 链接:https://arxiv.org/abs/2106.14707 摘要:基于机器学习(ML)的恶意流量检测是一种新兴的安全模式,特别是对于零日攻击检测,是对现有基于规则检测的补充。然而,现有的基于ML的流量特征提取方法存在检测精度低、吞吐量低等缺点。因此,它们不能实时检测攻击,特别是在高吞吐量网络中。特别是,这些检测系统类似于现有的基于规则的检测,很容易被复杂的攻击所规避。为此,我们提出了一种基于ML的实时恶意流量检测系统whister,该系统利用频域特征实现了高准确度和高吞吐量。它利用频域特征所代表的序列特征来实现有界的信息丢失,保证了较高的检测精度,同时也限制了特征的规模,实现了较高的检测吞吐量。特别是,攻击者不能轻易干扰频域特性,因此Whisper对各种规避攻击具有鲁棒性。我们对42种攻击类型的实验表明,与现有系统相比,Whisper能够准确地检测各种复杂的、隐蔽的攻击,最多可提高18.36%,同时实现两个数量级的吞吐量。即使在各种规避攻击下,Whisper仍然能够保持90%左右的检测准确率。 摘要:Machine learning (ML) based malicious traffic detection is an emerging security paradigm, particularly for zero-day attack detection, which is complementary to existing rule based detection. However, the existing ML based detection has low detection accuracy and low throughput incurred by inefficient traffic features extraction. Thus, they cannot detect attacks in realtime especially in high throughput networks. Particularly, these detection systems similar to the existing rule based detection can be easily evaded by sophisticated attacks. To this end, we propose Whisper, a realtime ML based malicious traffic detection system that achieves both high accuracy and high throughput by utilizing frequency domain features. It utilizes sequential features represented by the frequency domain features to achieve bounded information loss, which ensures high detection accuracy, and meanwhile constrains the scale of features to achieve high detection throughput. Particularly, attackers cannot easily interfere with the frequency domain features and thus Whisper is robust against various evasion attacks. Our experiments with 42 types of attacks demonstrate that, compared with the state-of-theart systems, Whisper can accurately detect various sophisticated and stealthy attacks, achieving at most 18.36% improvement, while achieving two orders of magnitude throughput. Even under various evasion attacks, Whisper is still able to maintain around 90% detection accuracy.
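编者按:下面给出"把逐包统计序列变换到频域、再用频域特征做检测"这一思路的极简 NumPy 示意;逐包编码方式、窗口长度与检测器(此处用与良性流量质心的距离)均为编者假设,与 Whisper 的具体设计可能不同,细节以论文为准:

import numpy as np

def frequency_features(per_packet_seq, win=64):
    # per_packet_seq: 某条流的逐包特征序列(例如包长与到达间隔的加权组合)
    feats = []
    for s in range(0, len(per_packet_seq) - win + 1, win):
        seg = np.asarray(per_packet_seq[s:s + win], dtype=float)
        spec = np.abs(np.fft.rfft(seg))      # 频域幅值
        feats.append(np.log1p(spec))         # 压缩动态范围
    return np.stack(feats) if feats else np.empty((0, win // 2 + 1))

def anomaly_score(feats, benign_centroid):
    # 与良性流量频域质心的平均距离越大,越可能是恶意流量(示意)
    return float(np.mean(np.linalg.norm(feats - benign_centroid, axis=1)))

这一示意也体现了摘要的直觉:攻击者很难在不改变频域统计特征的情况下规避检测,同时固定长度的频域特征有利于高吞吐处理。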
点云|SLAM|雷达|激光|深度RGBD相关(1篇)
【1】 Closed-form Continuous-Depth Models 标题:闭合形式的连续深度模型
作者:Ramin Hasani,Mathias Lechner,Alexander Amini,Lucas Liebenwein,Max Tschaikowski,Gerald Teschl,Daniela Rus 机构: 3Aalborg University, 4University of Vienna 备注:17 pages 链接:https://arxiv.org/abs/2106.13898 摘要:连续深度神经模型,其中模型隐藏状态的导数由神经网络定义,具有强大的顺序数据处理能力。然而,这些模型依赖于先进的数值微分方程(DE)解算器,在计算成本和模型复杂性方面都产生了巨大的开销。在本文中,我们提出了一个新的模型族,称为闭式连续深度(CfC)网络,该网络描述简单,速度至少快一个数量级,同时与基于ODE的网络模型相比具有同样强大的建模能力。这些模型由此从时间连续模型的表达子集的解析闭式解导出,从而减轻了对所有复杂解算器的需求。在我们的实验评估中,我们证明了CfC网络在一系列不同的时间序列预测任务(包括那些具有长期依赖性和不规则采样数据的任务)上优于先进的递归模型。我们相信,我们的发现为在资源受限的环境中训练和部署丰富的、连续的神经模型提供了新的机会,这些环境对性能和效率都有要求。 摘要:Continuous-depth neural models, where the derivative of the model's hidden state is defined by a neural network, have enabled strong sequential data processing capabilities. However, these models rely on advanced numerical differential equation (DE) solvers resulting in a significant overhead both in terms of computational cost and model complexity. In this paper, we present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster while exhibiting equally strong modeling abilities compared to their ODE-based counterparts. The models are hereby derived from the analytical closed-form solution of an expressive subset of time-continuous models, thus alleviating the need for complex DE solvers all together. In our experimental evaluations, we demonstrate that CfC networks outperform advanced, recurrent models over a diverse set of time-series prediction tasks, including those with long-term dependencies and irregularly sampled data. We believe our findings open new opportunities to train and deploy rich, continuous neural models in resource-constrained settings, which demand both performance and efficiency.
联邦学习|隐私保护|加密(4篇)
【1】 Weight Divergence Driven Divide-and-Conquer Approach for Optimal Federated Learning from non-IID Data 标题:权重发散驱动的非IID数据最优联合学习分治方法
作者:Pravin Chandran,Raghavendra Bhat,Avinash Chakravarthi,Srikanth Chandar 机构:Intel Technology India Pvt. Ltd, Bengaluru, KA, India 链接:https://arxiv.org/abs/2106.14503 摘要:联合学习允许对存储在分布式设备中的数据进行训练,而无需集中训练数据,从而维护数据隐私。解决处理数据异构性(非相同和独立分布或非IID)的能力是更广泛部署联合学习的关键因素。在本文中,我们提出了一种新的分而治之的训练方法,通过克服非IID环境中公认的FedAvg限制,可以使用流行的FedAvg聚合算法。我们提出了一种新的基于余弦距离的权值散度度量的方法来确定深度学习网络的精确点,在这个点上,深度学习网络可以被划分为与类无关的初始层和特定于类的深层来执行分而治之的训练。我们证明了该方法通过FedProx、FedMA等最先进的聚合算法实现了训练模型精度在par(并且在某些情况下超过par),同时,我们还证明了该方法可以在特定的条件下实现计算和带宽优化。 摘要:Federated Learning allows training of data stored in distributed devices without the need for centralizing training data, thereby maintaining data privacy. Addressing the ability to handle data heterogeneity (non-identical and independent distribution or non-IID) is a key enabler for the wider deployment of Federated Learning. In this paper, we propose a novel Divide-and-Conquer training methodology that enables the use of the popular FedAvg aggregation algorithm by overcoming the acknowledged FedAvg limitations in non-IID environments. We propose a novel use of Cosine-distance based Weight Divergence metric to determine the exact point where a Deep Learning network can be divided into class agnostic initial layers and class-specific deep layers for performing a Divide and Conquer training. We show that the methodology achieves trained model accuracy at par (and in certain cases exceeding) with numbers achieved by state-of-the-art Aggregation algorithms like FedProx, FedMA, etc. Also, we show that this methodology leads to compute and bandwidth optimizations under certain documented conditions.
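编者按:下面用 PyTorch 示意"逐层计算全局模型与本地训练模型权重的余弦距离,并据此确定网络切分点(前面的类无关层 / 后面的类相关深层)"这一思路;距离阈值与切分规则均为编者假设,并非论文原始实现:

import torch

def layer_cosine_divergence(global_state, local_state):
    # 对每个权重张量计算 1 - 余弦相似度,按层顺序返回散度列表
    divs = []
    for name, w_g in global_state.items():
        if "weight" not in name:
            continue
        w_l = local_state[name]
        cos = torch.nn.functional.cosine_similarity(
            w_g.flatten(), w_l.flatten(), dim=0)
        divs.append((name, 1.0 - cos.item()))
    return divs

def find_split_point(divs, threshold=0.1):
    # 返回第一个散度超过阈值的层号:其之前视为类无关的初始层,之后视为类相关深层
    for idx, (_, d) in enumerate(divs):
        if d > threshold:
            return idx
    return len(divs)

切分之后,即可按论文思路对两部分分别组织训练与 FedAvg 聚合(示意)。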
【2】 Multi-task Over-the-Air Federated Learning: A Non-Orthogonal Transmission Approach 标题:多任务空中联合学习:一种非正交传输方法
作者:Haoming Ma,Xiaojun Yuan,Dian Fan,Zhi Ding,Xin Wang 机构:Member, IEEE 链接:https://arxiv.org/abs/2106.14229 摘要:在这封信中,我们提出了一个多任务空中联合学习(MOAFL)框架,其中多个学习任务在边缘服务器的协调下共享用于数据收集和学习模型的边缘设备。特别地,所有任务的模型更新通过空中计算在非正交上行链路信道上同时传输和叠加,并且所有任务的聚合结果通过turbo压缩感知算法的扩展版本在ES处重建。收敛性分析和数值结果都表明,MOAFL框架可以显著降低多任务的上行带宽消耗,而不会导致学习性能的显著下降。 摘要:In this letter, we propose a multi-task over-theair federated learning (MOAFL) framework, where multiple learning tasks share edge devices for data collection and learning models under the coordination of a edge server (ES). Specially, the model updates for all the tasks are transmitted and superpositioned concurrently over a non-orthogonal uplink channel via over-the-air computation, and the aggregation results of all the tasks are reconstructed at the ES through an extended version of the turbo compressed sensing algorithm. Both the convergence analysis and numerical results demonstrate that the MOAFL framework can significantly reduce the uplink bandwidth consumption of multiple tasks without causing substantial learning performance degradation.
【3】 Benchmarking Differential Privacy and Federated Learning for BERT Models 标题:BERT模型的基准差分隐私和联合学习
作者:Priyam Basu,Tiasa Singha Roy,Rakshit Naidu,Zumrut Muftuoglu,Sahib Singh,Fatemehsadat Mireshghallah 机构: or in personal journals and well-being appli-Equal contribution 1Manipal Institute of Technology 2CarnegieMellon University 3OpenMined 4Yildiz Technical University 5FordMotor Company 6University of California 备注:4 pages, 3 tables, 1 figure 链接:https://arxiv.org/abs/2106.13973 摘要:自然语言处理(NLP)技术可以用来帮助诊断疾病,如抑郁症,使用一个人的话语集合。抑郁症是一种严重的医学疾病,会对人的感觉、思维和行为产生不良影响,从而导致情绪和身体问题。由于此类数据的敏感性,需要采取隐私措施来处理和训练具有此类数据的模型。在这项工作中,我们研究了差异隐私(DP)的应用,在集中和联合学习(FL)设置中,对训练语境化语言模型(BERT,ALBERT,RoBERTa和DistilBERT)的影响。我们提供关于如何私下训练NLP模型以及什么架构和设置提供了更理想的隐私实用程序权衡的见解。我们设想这项工作将用于未来的医疗保健和心理健康研究,以保持医疗历史的隐私。因此,我们提供了这项工作的开源实现。 摘要:Natural Language Processing (NLP) techniques can be applied to help with the diagnosis of medical conditions such as depression, using a collection of a person's utterances. Depression is a serious medical illness that can have adverse effects on how one feels, thinks, and acts, which can lead to emotional and physical problems. Due to the sensitive nature of such data, privacy measures need to be taken for handling and training models with such data. In this work, we study the effects that the application of Differential Privacy (DP) has, in both a centralized and a Federated Learning (FL) setup, on training contextualized language models (BERT, ALBERT, RoBERTa and DistilBERT). We offer insights on how to privately train NLP models and what architectures and setups provide more desirable privacy utility trade-offs. We envisage this work to be used in future healthcare and mental health studies to keep medical history private. Therefore, we provide an open-source implementation of this work.
【4】 Implicit Gradient Alignment in Distributed and Federated Learning 标题:分布式联合学习中的隐式梯度对齐
作者:Yatin Dandi,Luis Barba,Martin Jaggi 机构:IIT Kanpur, India, EPFL, Switzerland 链接:https://arxiv.org/abs/2106.13897 摘要:在分布式和联合学习中实现全局收敛的一个主要障碍是,由于分布式数据的异构性和随机性,客户端或小批量之间的梯度不一致。缓解这一问题的一种方法是鼓励在整个训练过程中跨不同客户调整梯度。我们的分析表明,这一目标可以通过使用正确的优化方法来实现,该方法复制了SGD的隐式正则化效应,从而实现梯度对齐,并提高了测试精度。由于这种正则化在SGD中的存在完全依赖于训练过程中不同小批量的连续使用,因此在训练大的小批量时,这种正则化是不存在的。为了在提高并行性的同时获得这种正则化的泛化优势,我们提出了一种新的gradallign算法,该算法在每次更新时都允许使用任意大的批处理,同时产生相同的隐式正则化。在不同的分布式和联邦学习环境下,我们通过实验验证了算法的有效性。 摘要:A major obstacle to achieving global convergence in distributed and federated learning is the misalignment of gradients across clients, or mini-batches due to heterogeneity and stochasticity of the distributed data. One way to alleviate this problem is to encourage the alignment of gradients across different clients throughout training. Our analysis reveals that this goal can be accomplished by utilizing the right optimization method that replicates the implicit regularization effect of SGD, leading to gradient alignment as well as improvements in test accuracies. Since the existence of this regularization in SGD completely relies on the sequential use of different mini-batches during training, it is inherently absent when training with large mini-batches. To obtain the generalization benefits of this regularization while increasing parallelism, we propose a novel GradAlign algorithm that induces the same implicit regularization while allowing the use of arbitrarily large batches in each update. We experimentally validate the benefit of our algorithm in different distributed and federated learning settings.
推理|分析|理解|解释(8篇)
【1】 Understanding Dynamics of Nonlinear Representation Learning and Its Application 标题:非线性表征学习的理解动力学及其应用
作者:Kenji Kawaguchi,Linjun Zhang,Zhun Deng 机构:Harvard University, Rutgers University 链接:https://arxiv.org/abs/2106.14836 摘要:世界环境的表示在机器智能中起着至关重要的作用。直接在图像像素值等原始感官表征空间进行推理和推理往往效率低下。表征学习允许我们从原始的感官数据中自动发现合适的表征。例如,给定原始的感官数据,多层感知器在其隐藏层学习非线性表示,随后在其输出层用于分类(或回归)。这是在训练过程中通过最小化有监督或无监督的损失隐式发生的。本文研究了这种内隐非线性表征学习的动力学。我们确定了一对新的假设和一个新的条件,称为公共模型结构假设和数据体系结构对齐条件。在一般模型结构假设下,证明了数据结构对齐条件对全局收敛是充分的,对全局最优性是必要的。我们的结果为模型结构的设计提供了实际指导:例如,公共模型结构假设可以作为使用特定模型结构而不是其他模型结构的理由。作为一个应用,我们推导了一个新的训练框架,该框架通过依赖于每个数据和结构自动修改任何给定的训练算法来满足数据结构对齐条件,而不必假设它。在给定标准训练算法的情况下,运行其修改版本的框架在保持具有竞争力的(实际)测试性能的同时,通过卷积、跳过连接和标准基准数据集(包括MNIST、CIFAR-10、CIFAR-100)的批标准化,为ResNet-18提供全局收敛保证,Semeion、KMNIST和SVHN。 摘要:Representations of the world environment play a crucial role in machine intelligence. It is often inefficient to conduct reasoning and inference directly in the space of raw sensory representations, such as pixel values of images. Representation learning allows us to automatically discover suitable representations from raw sensory data. For example, given raw sensory data, a multilayer perceptron learns nonlinear representations at its hidden layers, which are subsequently used for classification (or regression) at its output layer. This happens implicitly during training through minimizing a supervised or unsupervised loss. In this paper, we study the dynamics of such implicit nonlinear representation learning. We identify a pair of a new assumption and a novel condition, called the common model structure assumption and the data-architecture alignment condition. Under the common model structure assumption, the data-architecture alignment condition is shown to be sufficient for the global convergence and necessary for the global optimality. Our results provide practical guidance for designing a model structure: e.g., the common model structure assumption can be used as a justification for using a particular model structure instead of others. As an application, we then derive a new training framework, which satisfies the data-architecture alignment condition without assuming it by automatically modifying any given training algorithm dependently on each data and architecture. Given a standard training algorithm, the framework running its modified version is empirically shown to maintain competitive (practical) test performances while providing global convergence guarantees for ResNet-18 with convolutions, skip connections, and batch normalization with standard benchmark datasets, including MNIST, CIFAR-10, CIFAR-100, Semeion, KMNIST and SVHN.
【2】 Error analysis for physics informed neural networks (PINNs) approximating Kolmogorov PDEs 标题:物理信息神经网络逼近Kolmogorov偏微分方程的误差分析
作者:Tim De Ryck,Siddhartha Mishra 链接:https://arxiv.org/abs/2106.14473 摘要:物理信息神经网络通过最小化逐点残差来逼近偏微分方程的解。我们导出了PINNs在逼近一大类线性抛物型偏微分方程(即包括热方程和期权定价中的Black-Scholes方程在内的Kolmogorov方程)的解时所产生误差的严格界。我们构造了PINN残差(泛化误差)可以任意小的神经网络。我们还证明了只要使用足够数量的随机选择的训练点(配点),总的$L^2$-误差就可以被泛化误差所限定,而泛化误差又可以被训练误差所限定。此外,我们还证明了PINN的大小和训练样本的数目只随底层维数呈多项式增长,使得PINN在这种情况下能够克服维数灾难。这些结果使我们能够为近似Kolmogorov偏微分方程的PINNs提供一个全面的误差分析。 摘要:Physics informed neural networks approximate solutions of PDEs by minimizing pointwise residuals. We derive rigorous bounds on the error, incurred by PINNs in approximating the solutions of a large class of linear parabolic PDEs, namely Kolmogorov equations that include the heat equation and Black-Scholes equation of option pricing, as examples. We construct neural networks, whose PINN residual (generalization error) can be made as small as desired. We also prove that the total $L^2$-error can be bounded by the generalization error, which in turn is bounded in terms of the training error, provided that a sufficient number of randomly chosen training (collocation) points is used. Moreover, we prove that the size of the PINNs and the number of training samples only grow polynomially with the underlying dimension, enabling PINNs to overcome the curse of dimensionality in this context. These results enable us to provide a comprehensive error analysis for PINNs in approximating Kolmogorov PDEs.
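作为摘要中"通过最小化逐点残差来逼近偏微分方程的解"的一个最小示意(与论文的误差分析本身无关,网络结构与配点数均为示例假设),下面用PyTorch的自动微分在随机配点上构造一维热方程 u_t = u_xx 的PINN残差损失。

```python
import torch

# 一个小的全连接网络,输入 (x, t),输出 u(x, t)
net = torch.nn.Sequential(
    torch.nn.Linear(2, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 32), torch.nn.Tanh(),
    torch.nn.Linear(32, 1),
)

def heat_residual(xt):
    """一维热方程 u_t - u_xx 在配点 xt 上的 PINN 残差。"""
    xt = xt.clone().requires_grad_(True)
    u = net(xt)
    grads = torch.autograd.grad(u, xt, torch.ones_like(u), create_graph=True)[0]
    u_x, u_t = grads[:, 0:1], grads[:, 1:2]
    u_xx = torch.autograd.grad(u_x, xt, torch.ones_like(u_x), create_graph=True)[0][:, 0:1]
    return u_t - u_xx

# 随机配点上的残差均方,即训练中被最小化的逐点残差损失
collocation = torch.rand(256, 2)
loss = heat_residual(collocation).pow(2).mean()
loss.backward()
print("PINN 残差损失:", float(loss))
```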
【3】 Self-paced Principal Component Analysis 标题:自定步主成分分析
作者:Zhao Kang,Hongfei Liu,Jiangxin Li,Xiaofeng Zhu,Ling Tian 机构: School of Computer Science and Engineering, University of Electronic Science and Technology of China 链接:https://arxiv.org/abs/2106.13880 摘要:主成分分析(PCA)在降维和特征提取方面有着广泛的应用。鲁棒主元分析(RPCA)在l1范数、l2,p范数等不同的鲁棒距离度量下,能在一定程度上处理噪声或异常值。然而,现实世界中的数据可能显示这些简单函数无法完全捕获的结构。另外,现有方法对复杂样本和简单样本一视同仁。相比之下,人类通常采用的学习模式是从简单到复杂,从少到多。基于这一原理,我们提出了一种新的方法,称为自步PCA(SPCA),以进一步降低噪声和异常值的影响。值得注意的是,在每次迭代开始时计算每个样本的复杂度,以便将从简单到更复杂的样本集成到训练中。基于交替优化,SPCA找到一个最优的投影矩阵,并迭代地滤除异常值。理论分析证明了SPCA的合理性。在流行数据集上的大量实验表明,该方法能显著提高现有结果。 摘要:Principal Component Analysis (PCA) has been widely used for dimensionality reduction and feature extraction. Robust PCA (RPCA), under different robust distance metrics, such as the l1-norm and l2,p-norm, can deal with noise or outliers to some extent. However, real-world data may display structures that can not be fully captured by these simple functions. In addition, existing methods treat complex and simple samples equally. By contrast, a learning pattern typically adopted by human beings is to learn from simple to complex and less to more. Based on this principle, we propose a novel method called Self-paced PCA (SPCA) to further reduce the effect of noise and outliers. Notably, the complexity of each sample is calculated at the beginning of each iteration in order to integrate samples from simple to more complex into training. Based on an alternating optimization, SPCA finds an optimal projection matrix and filters out outliers iteratively. Theoretical analysis is presented to show the rationality of SPCA. Extensive experiments on popular data sets demonstrate that the proposed method can improve the state-of-the-art results considerably.
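下面是一个把自步学习思想与PCA结合的最小示意(并非论文SPCA的原始算法,阈值调度、损失定义与数据生成均为示例假设):每轮先按当前投影的重构误差衡量样本"难度",只用误差低于阈值的"简单"样本重新估计主成分,并逐轮放宽阈值,从而逐步滤除离群点的影响。

```python
import numpy as np

def self_paced_pca(X, k=2, n_iter=5, quantile_schedule=(0.5, 0.7, 0.8, 0.9, 1.0)):
    """自步 PCA 示意:逐步纳入更难(重构误差更大)的样本。"""
    Xc = X - X.mean(axis=0)
    idx = np.arange(len(Xc))            # 初始用全部样本估计一次
    for t in range(n_iter):
        sub = Xc[idx]
        # 对当前"简单"样本子集做标准 PCA(SVD)
        _, _, Vt = np.linalg.svd(sub - sub.mean(axis=0), full_matrices=False)
        W = Vt[:k].T                    # 投影矩阵 d x k
        # 以重构误差作为每个样本的"难度"
        errs = np.linalg.norm(Xc - Xc @ W @ W.T, axis=1)
        thresh = np.quantile(errs, quantile_schedule[min(t, len(quantile_schedule) - 1)])
        idx = np.where(errs <= thresh)[0]
    return W, errs

rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 10))
X = rng.normal(size=(200, 2)) @ basis + 0.1 * rng.normal(size=(200, 10))  # 近似位于二维子空间
X[:5] = rng.normal(scale=5.0, size=(5, 10))                               # 注入 5 个离群点
W, errs = self_paced_pca(X, k=2)
print("离群点与正常样本的平均重构误差:", errs[:5].mean(), errs[5:].mean())
```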
【4】 Rationale-Inspired Natural Language Explanations with Commonsense 标题:理性启发的常识自然语言解释
作者:Bodhisattwa Prasad Majumder,Oana-Maria Camburu,Thomas Lukasiewicz,Julian McAuley 机构:Department of Computer Science and Engineering, UC San Diego, USA, Department of Computer Science, University of Oxford, UK, Alan Turing Institute, London, UK 链接:https://arxiv.org/abs/2106.13876 摘要:可解释的机器学习模型主要使用提取原理(即输入特征的子集)或自由文本自然语言解释(NLEs)作为抽象证明来证明预测标签的正确性。虽然NLE比提取原理更全面,但机器生成的NLE有时缺乏常识性知识。在这里,我们表明,常识知识可以充当提取原理与自然语言解释之间的桥梁,使这两种类型的解释都变得更好。更准确地说,我们引入了一个统一的框架,称为RExC(理性启发的常识解释),它(1)将原理提取为一组负责机器预测的特征,(2)使用可用的常识资源扩展提取的原理,以及(3)利用扩展后的知识生成自然语言解释。我们的框架在自然语言处理和视觉语言理解的五个任务中生成NLE,大大超过了以前的最新水平,人类注释者一致认为RExC生成的解释更全面,基于常识,与以前的先进模型相比,总体上更受欢迎。此外,我们的工作表明,常识性的基础解释可以提高任务绩效和基本原理提取能力。 摘要:Explainable machine learning models primarily justify predicted labels using either extractive rationales (i.e., subsets of input features) or free-text natural language explanations (NLEs) as abstractive justifications. While NLEs can be more comprehensive than extractive rationales, machine-generated NLEs have been shown to sometimes lack commonsense knowledge. Here, we show that commonsense knowledge can act as a bridge between extractive rationales and NLEs, rendering both types of explanations better. More precisely, we introduce a unified framework, called RExC (Rationale-Inspired Explanations with Commonsense), that (1) extracts rationales as a set of features responsible for machine predictions, (2) expands the extractive rationales using available commonsense resources, and (3) uses the expanded knowledge to generate natural language explanations. Our framework surpasses by a large margin the previous state-of-the-art in generating NLEs across five tasks in both natural language processing and vision-language understanding, with human annotators consistently rating the explanations generated by RExC to be more comprehensive, grounded in commonsense, and overall preferred compared to previous state-of-the-art models. Moreover, our work shows that commonsense-grounded explanations can enhance both task performance and rationales extraction capabilities.
【5】 The Convergence Rate of SGD's Final Iterate: Analysis on Dimension Dependence 标题:SGD最终迭代的收敛速度:维数相关性分析
作者:Daogao Liu,Zhou Lu 机构:University of Washington, Princeton University 链接:https://arxiv.org/abs/2106.14588 摘要:随机梯度下降法(SGD)是最优化中最简单、最流行的方法之一。SGD的收敛速度已经得到了广泛的研究,并对运行平均格式进行了严密的分析,但最终迭代的次优性仍然没有得到很好的理解。shamir2013stochastic给出了SGD最小化非光滑凸函数的最终迭代的最著名上界,Lipschitz凸函数的上界为$O(\log T/\sqrt{T})$,强凸性的附加假设下为$O(\log T/T)$。然而,最为人所知的下界比上界差$\log T$。harvey2019tight给出了匹配的下界,但它们的构造需要维度$d=T$。然后,koren2020open询问了如何在定维环境下刻画SGD的最终迭代收敛性。在本文中,我们在更一般的条件下对任意$d\leq T$回答了这个问题,证明了在标准步长下SGD最终迭代的次优性的$\Omega(\log d/\sqrt{T})$下界和$\Omega(\log d/T)$下界。我们的结果给出了SGD最终迭代收敛的第一个一般维数依赖下界,部分解决了koren2020open提出的COLT开放问题。我们还提供了进一步的证据来证明一维的正确率应该是$\Theta(1/\sqrt{T})$,例如在比koren2020open更一般的设置下,一维特例的紧上界是$O(1/\sqrt{T})$。 摘要:Stochastic Gradient Descent (SGD) is among the simplest and most popular methods in optimization. The convergence rate for SGD has been extensively studied and tight analyses have been established for the running average scheme, but the sub-optimality of the final iterate is still not well-understood. shamir2013stochastic gave the best known upper bound for the final iterate of SGD minimizing non-smooth convex functions, which is $O(\log T/\sqrt{T})$ for Lipschitz convex functions and $O(\log T/T)$ with additional assumption on strongly convexity. The best known lower bounds, however, are worse than the upper bounds by a factor of $\log T$. harvey2019tight gave matching lower bounds but their construction requires dimension $d= T$. It was then asked by koren2020open how to characterize the final-iterate convergence of SGD in the constant dimension setting. In this paper, we answer this question in the more general setting for any $d\leq T$, proving $\Omega(\log d/\sqrt{T})$ and $\Omega(\log d/T)$ lower bounds for the sub-optimality of the final iterate of SGD in minimizing non-smooth Lipschitz convex and strongly convex functions respectively with standard step size schedules. Our results provide the first general dimension dependent lower bound on the convergence of SGD's final iterate, partially resolving a COLT open question raised by koren2020open. We also present further evidence to show the correct rate in one dimension should be $\Theta(1/\sqrt{T})$, such as a proof of a tight $O(1/\sqrt{T})$ upper bound for one-dimensional special cases in settings more general than koren2020open.
【6】 Use of Variational Inference in Music Emotion Recognition 标题:变分推理在音乐情感识别中的应用
作者:Nathalie Deziderio,Hugo Tremonte de Carvalho 机构:Rio de Janeiro, Brasil 链接:https://arxiv.org/abs/2106.14323 摘要:这项工作旨在将统计技术应用于音乐情感识别领域,这是信号处理界公认的一个领域,但从统计角度进行的探索却很少。在这里,我们打开了该领域内的几种可能性,应用现代贝叶斯统计技术和开发有效的算法,重点是获得的结果的适用性。虽然这个项目的动机是开发一个基于情感的音乐推荐系统,但它的主要贡献是一个适应性很强的多元模型,可以用来解释任何有兴趣以有效的方式应用正则化的数据库。广义地说,我们将探讨一个健全的理论统计分析在一个能够理解一个著名数据库的算法建模中能起到什么作用,以及用这种方法能得到什么。 摘要:This work was developed aiming to employ Statistical techniques to the field of Music Emotion Recognition, a well-recognized area within the Signal Processing world, but hardly explored from the statistical point of view. Here, we opened several possibilities within the field, applying modern Bayesian Statistics techniques and developing efficient algorithms, focusing on the applicability of the results obtained. Although the motivation for this project was the development of a emotion-based music recommendation system, its main contribution is a highly adaptable multivariate model that can be useful interpreting any database where there is an interest in applying regularization in an efficient manner. Broadly speaking, we will explore what role a sound theoretical statistical analysis can play in the modeling of an algorithm that is able to understand a well-known database and what can be gained with this kind of approach.
【7】 Interpretable Network Representation Learning with Principal Component Analysis 标题:基于主成分分析的可解释网络表征学习
作者:James D. Wilson,Jihui Lee 机构:Department of Psychiatry, University of Pittsburgh Medical Center, Pittsburgh, PA , USA, Division of Biostatistics, Department of Population Health Sciences, Weill Cornell Medicine, New York, NY , USA, Editor: 备注:33 pages. Submitted and currently under review 链接:https://arxiv.org/abs/2106.14238 摘要:研究了网络值数据样本的可解释网络表示学习问题。我们提出了网络主成分分析(PCAN)算法,通过子图计数统计来识别网络样本的统计意义低维表示。PCAN程序提供了一个可解释的框架,人们可以很容易地可视化、探索和制定网络样本的预测模型。此外,我们还介绍了一种基于快速采样的算法sPCAN,该算法在计算效率上明显高于相应算法,但仍具有可解释性的优点。我们研究了这两种方法之间的关系,并分析了它们在网络样本是基于核的随机图集合的情况下的大样本性质。我们证明了在这种情况下,sPCAN方法的嵌入具有中心极限定理,而且PCAN和sPCAN的总体嵌入是等价的。我们评估PCAN在自然网络样本(包括功能连接网络样本和描述美国参议院政治共同投票习惯的动态网络)中可视化、聚类和分类观察结果的能力。我们的分析表明,我们提出的算法提供了信息和歧视性特征描述网络在每个样本。PCAN和sPCAN方法建立在当前网络表征学习文献的基础上,为网络价值数据的解释性学习研究开辟了一条新的道路。PCAN和sPCAN方法的公开软件可在https://www.github.com/jihuilee/. 摘要:We consider the problem of interpretable network representation learning for samples of network-valued data. We propose the Principal Component Analysis for Networks (PCAN) algorithm to identify statistically meaningful low-dimensional representations of a network sample via subgraph count statistics. The PCAN procedure provides an interpretable framework for which one can readily visualize, explore, and formulate predictive models for network samples. We furthermore introduce a fast sampling-based algorithm, sPCAN, which is significantly more computationally efficient than its counterpart, but still enjoys advantages of interpretability. We investigate the relationship between these two methods and analyze their large-sample properties under the common regime where the sample of networks is a collection of kernel-based random graphs. We show that under this regime, the embeddings of the sPCAN method enjoy a central limit theorem and moreover that the population level embeddings of PCAN and sPCAN are equivalent. We assess PCAN's ability to visualize, cluster, and classify observations in network samples arising in nature, including functional connectivity network samples and dynamic networks describing the political co-voting habits of the U.S. Senate. Our analyses reveal that our proposed algorithm provides informative and discriminatory features describing the networks in each sample. The PCAN and sPCAN methods build on the current literature of network representation learning and set the stage for a new line of research in interpretable learning on network-valued data. Publicly available software for the PCAN and sPCAN methods are available at https://www.github.com/jihuilee/.
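下面是一个示意性的实现草图(并非论文PCAN的官方代码,所用的子图计数特征仅取边数、三角形数、平均聚类系数和密度作为示例),用于说明"先对样本中的每个网络提取子图计数统计量,再做PCA得到低维、可解释的网络表示"这一流程。

```python
import numpy as np
import networkx as nx
from sklearn.decomposition import PCA

def subgraph_count_features(G):
    """对单个网络提取若干可解释的计数型统计量(示例特征)。"""
    return np.array([
        G.number_of_edges(),
        sum(nx.triangles(G).values()) / 3.0,   # 三角形个数
        nx.average_clustering(G),
        nx.density(G),
    ])

# 构造两类随机图样本:稀疏 ER 图与聚类程度更高的小世界图
rng = np.random.default_rng(0)
graphs = [nx.gnp_random_graph(50, 0.05, seed=int(s)) for s in rng.integers(0, 10**6, 30)]
graphs += [nx.watts_strogatz_graph(50, 6, 0.1, seed=int(s)) for s in rng.integers(0, 10**6, 30)]

X = np.stack([subgraph_count_features(G) for G in graphs])
Z = PCA(n_components=2).fit_transform((X - X.mean(0)) / (X.std(0) + 1e-12))
print("前两维嵌入的形状:", Z.shape)   # 每个网络对应一个二维坐标,可直接可视化或聚类
```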
【8】 Functional Classwise Principal Component Analysis: A Novel Classification Framework 标题:功能分类主成分分析:一种新的分类框架
作者:Avishek Chatterjee,Satyaki Mazumder,Koel Das 机构: Department of Mathematics and Statistics 链接:https://arxiv.org/abs/2106.13959 摘要:近年来,功能数据分析(FDA)已成功地应用于高维数据分类领域。在本文中,我们提出了一个新的分类框架,利用功能数据和分类主成分分析(PCA)。该方法适用于高维时间序列数据的小样本问题。该方法提取分段线性函数特征空间,特别适用于硬分类问题,将时间序列数据转化为函数数据,利用分类函数PCA进行特征提取,然后利用贝叶斯线性分类器进行分类。我们将该方法应用于神经科学、食品科学、医学和化学计量学等多个领域的合成数据集和实时序列数据,证明了该方法的有效性。 摘要:In recent times, functional data analysis (FDA) has been successfully applied in the field of high dimensional data classification. In this paper, we present a novel classification framework using functional data and classwise Principal Component Analysis (PCA). Our proposed method can be used in high dimensional time series data which typically suffers from small sample size problem. Our method extracts a piece wise linear functional feature space and is particularly suitable for hard classification problems. The proposed framework converts time series data into functional data and uses classwise functional PCA for feature extraction followed by classification using a Bayesian linear classifier. We demonstrate the efficacy of our proposed method by applying it to both synthetic data sets and real time series data from diverse fields including but not limited to neuroscience, food science, medical sciences and chemometrics.
检测相关(5篇)
【1】 Cheating Detection Pipeline for Online Interviews and Exams 标题:在线面试和考试的作弊检测管道
作者:Azmi Can Özgen,Mahiye Uluyağmur Öztürk,Umut Bayraktar 机构:Huawei Turkey R&D Center, Istanbul, Turkey 链接:https://arxiv.org/abs/2106.14483 摘要:由于流行病和远程工作环境的优势,远程考试和工作面试越来越受欢迎并成为不可或缺的。大多数公司和学术机构利用这些系统进行招聘和在线考试。然而,远程考试系统的关键问题之一是在可靠的环境中进行考试。在这项工作中,我们提出了在线面试和考试作弊分析管道。这套系统只需要考生的一段视频,这段视频是在考试期间录制的。然后利用作弊检测管道来检测另一个人、电子设备使用情况和候选人缺勤状态。该流水线由人脸检测、人脸识别、目标检测和人脸跟踪算法组成。为了评估管道的性能,我们收集了一个私有视频数据集。视频数据集包括作弊活动和干净的视频。最终,我们的管道提供了一个有效和快速的准则,以检测和分析作弊活动在网上面试和考试视频。 摘要:Remote examination and job interviews have gained popularity and become indispensable because of both pandemics and the advantage of remote working circumstances. Most companies and academic institutions utilize these systems for their recruitment processes and also for online exams. However, one of the critical problems of the remote examination systems is conducting the exams in a reliable environment. In this work, we present a cheating analysis pipeline for online interviews and exams. The system only requires a video of the candidate, which is recorded during the exam. Then cheating detection pipeline is employed to detect another person, electronic device usage, and candidate absence status. The pipeline consists of face detection, face recognition, object detection, and face tracking algorithms. To evaluate the performance of the pipeline we collected a private video dataset. The video dataset includes both cheating activities and clean videos. Ultimately, our pipeline presents an efficient and fast guideline to detect and analyze cheating activities in an online interview and exam video.
【2】 Machine Learning Detection Algorithm for Large Barkhausen Jumps in Cluttered Environment 标题:杂波环境下巴克豪森跳跃的机器学习检测算法
作者:Roger Alimi,Amir Ivry,Elad Fisher,Eyal Weiss 机构: Technology Division, Soreq NRC, Yavne , Israel, Technion, Israel Institute of Technology, Haifa , Israel, Jerusalem College of Technology, Jerusalem , Israel 备注:None 链接:https://arxiv.org/abs/2106.14148 摘要:现代磁传感器阵列通常采用最先进的低功率磁强计,如平行和正交磁通门。低功率磁通门往往有大巴克豪森跳跃,出现在直流磁通门输出跳跃。这种现象恶化了信号保真度,并有效地增加了传感器内部的噪声。即使在生产过程中可以筛选出更容易发生直流跳变的传感器,但由于其稀疏性,传统的噪声测量方法并不总能捕捉到直流跳变。此外,几乎所有的传感器核心都存在直流跳跃,尽管速度较慢,但仍然无法忍受。即使直流跳跃可以很容易地在屏蔽环境中检测到,当部署在存在自然噪声和杂波时,也很难准确地检测到它们。这项工作填补了这一空白,并提出了算法,区分直流跳跃嵌入在自然磁场数据。为了提高对噪声的鲁棒性,我们开发了两种基于时间和统计物理特征的机器学习算法。第一种算法采用支持向量机分类器,第二种算法基于神经网络结构。我们将这些新方法与更经典的基于核的方法进行了比较。为此,生成了接收机工作特性曲线,通过比较不同分类器在不同工作点上的性能,实现了不同分类器的诊断能力。与传统方法相比,基于机器学习的算法具有更高的精度。另外,基于相应的接收机工作特性曲线的快速收敛,神经网络具有很高的泛化能力和鲁棒性。 摘要:Modern magnetic sensor arrays conventionally utilize state of the art low power magnetometers such as parallel and orthogonal fluxgates. Low power fluxgates tend to have large Barkhausen jumps that appear as a dc jump in the fluxgate output. This phenomenon deteriorates the signal fidelity and effectively increases the internal sensor noise. Even if sensors that are more prone to dc jumps can be screened during production, the conventional noise measurement does not always catch the dc jump because of its sparsity. Moreover, dc jumps persist in almost all the sensor cores although at a slower but still intolerable rate. Even if dc jumps can be easily detected in a shielded environment, when deployed in presence of natural noise and clutter, it can be hard to positively detect them. This work fills this gap and presents algorithms that distinguish dc jumps embedded in natural magnetic field data. To improve robustness to noise, we developed two machine learning algorithms that employ temporal and statistical physical-based features of a pre-acquired and well-known experimental data set. The first algorithm employs a support vector machine classifier, while the second is based on a neural network architecture. We compare these new approaches to a more classical kernel-based method. To that purpose, the receiver operating characteristic curve is generated, which allows diagnosis ability of the different classifiers by comparing their performances across various operation points. The accuracy of the machine learning-based algorithms over the classic method is highly emphasized. In addition, high generalization and robustness of the neural network can be concluded, based on the rapid convergence of the corresponding receiver operating characteristic curves.
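下面给出一个思路示意(并非论文使用的真实磁力仪数据或特征集,合成信号、特征与参数均为示例假设),展示"从时间窗提取时域/统计特征后,用支持向量机区分含直流跳变与不含跳变的片段"这一基本流程。

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def make_window(has_jump, n=256):
    """合成一段带噪声和慢漂移的信号,可选在随机位置注入一次直流跳变。"""
    x = np.cumsum(rng.normal(scale=0.05, size=n))              # 类似自然磁场的慢漂移
    x += rng.normal(scale=0.2, size=n)                          # 测量噪声
    if has_jump:
        x[rng.integers(20, n - 20):] += rng.uniform(1.0, 3.0)   # 直流跳变
    return x

def features(x):
    """简单的时域/统计特征:标准差、一阶差分最大幅值、差分标准差、极差。"""
    dx = np.diff(x)
    return np.array([x.std(), np.abs(dx).max(), dx.std(), x.max() - x.min()])

X = np.stack([features(make_window(has_jump=i % 2 == 1)) for i in range(400)])
y = np.arange(400) % 2
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(Xtr, ytr)
print("检测准确率:", clf.score(Xte, yte))
```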
【3】 A Machine Learning Model for Early Detection of Diabetic Foot using Thermogram Images 标题:基于热图图像的糖尿病足早期检测的机器学习模型
作者:Amith Khandakar,Muhammad E. H. Chowdhury,Mamun Bin Ibne Reaz,Sawal Hamid Md Ali,Md Anwarul Hasan,Serkan Kiranyaz,Tawsifur Rahman,Rashad Alfkey,Ahmad Ashrif A. Bakar,Rayaz A. Malik 机构:Department of Electrical Engineering, Qatar University, Doha-, Qatar, Dept. of Electrical, Electronics and Systems Engineering, Universiti Kebangsaan, Malaysia, Bangi, Selangor , Malaysia 备注:23 pages, 8 Figures 链接:https://arxiv.org/abs/2106.14207 摘要:糖尿病足溃疡(DFU)和截肢是一个重要的发病原因。预防DFU可通过鉴别DFU高危患者,并通过教育和卸载制定预防措施来实现。有几项研究报告说,热像图图像可能有助于检测足底温度增加之前,DFU。然而,足底温度的分布可能是不均匀的,因此很难量化和利用预测结果。我们将基于机器学习的评分技术、特征选择和优化技术以及学习分类器与几种最先进的卷积神经网络(CNNs)在足部温度图图像上进行了比较,并提出了一种鲁棒的解决方案来识别糖尿病足。MobilenetV2是一个相对浅层的CNN模型,对于一个基于两英尺热像图的分类,它的F1得分达到了95%,AdaBoost分类器使用了10个特征,F1得分达到了97%。对性能最好的网络的推理时间的比较证实,所提出的算法可以部署为智能手机应用程序,以允许用户在家庭环境中监视DFU的进程。 摘要:Diabetes foot ulceration (DFU) and amputation are a cause of significant morbidity. The prevention of DFU may be achieved by the identification of patients at risk of DFU and the institution of preventative measures through education and offloading. Several studies have reported that thermogram images may help to detect an increase in plantar temperature prior to DFU. However, the distribution of plantar temperature may be heterogeneous, making it difficult to quantify and utilize to predict outcomes. We have compared a machine learning-based scoring technique with feature selection and optimization techniques and learning classifiers to several state-of-the-art Convolutional Neural Networks (CNNs) on foot thermogram images and propose a robust solution to identify the diabetic foot. A comparatively shallow CNN model, MobilenetV2 achieved an F1 score of ~95% for a two-feet thermogram image-based classification and the AdaBoost Classifier used 10 features and achieved an F1 score of 97 %. A comparison of the inference time for the best-performing networks confirmed that the proposed algorithm can be deployed as a smartphone application to allow the user to monitor the progression of the DFU in a home setting.
【4】 An XAI Approach to Deep Learning Models in the Detection of Ductal Carcinoma in Situ 标题:基于XAI的深度学习模型在导管原位癌检测中的应用
作者:Michele La Ferla,Matthew Montebello,Dylan Seychell 机构:University of Malta, Msida, Malta 备注:9 pages, 6 figures 链接:https://arxiv.org/abs/2106.14186 摘要:在过去十年左右的时间里,为了解决与健康有关的问题,特别是乳腺癌,深度学习社区出现了一场叛乱。继2016年Camelyon-16挑战赛之后,几位研究人员已投入时间构建卷积神经网络(CNN),以帮助放射科医生和其他临床医生诊断乳腺癌。尤其是导管原位癌(DCIS);早期乳腺癌的临床术语。大公司在这方面的研究中贡献了相当大的一部分,其中谷歌Deepmind公司在2020年开发了一个模型,该模型被证明比放射科医生自己更能正确诊断乳腺癌。我们发现,在存在的问题中,有一个解释系统需要穿过CNN的隐藏层来突出那些有助于乳房X光片分类的像素。然后,我们选择了沈教授开发的一个开源、相当成功的项目,使用CBIS-DDSM图像数据库来运行我们的实验。后来使用Resnet-50和VGG-16补丁分类器对其进行了改进,分析比较了两者的结果。结果表明,Resnet-50在实验中收敛较早。在Montavon和Binder的研究之后,我们使用DeepTaylor分层相关传播(LRP)模型来突出乳腺x线片中对其分类贡献最大的像素和区域。这表示为原始图像中那些像素的映射,这些像素有助于诊断以及它们对最终分类的贡献程度。该算法最显著的优点是在Resnet-50补丁分类器体系结构中表现得非常好。 摘要:During the last decade or so, there has been an insurgence in the deep learning community to solve health-related issues, particularly breast cancer. Following the Camelyon-16 challenge in 2016, several researchers have dedicated their time to build Convolutional Neural Networks (CNNs) to help radiologists and other clinicians diagnose breast cancer. In particular, there has been an emphasis on Ductal Carcinoma in Situ (DCIS); the clinical term for early-stage breast cancer. Large companies have given their fair share of research into this subject, among these Google Deepmind who developed a model in 2020 that has proven to be better than radiologists themselves to diagnose breast cancer correctly. We found that among the issues which exist, there is a need for an explanatory system that goes through the hidden layers of a CNN to highlight those pixels that contributed to the classification of a mammogram. We then chose an open-source, reasonably successful project developed by Prof. Shen, using the CBIS-DDSM image database to run our experiments on. It was later improved using the Resnet-50 and VGG-16 patch-classifiers, analytically comparing the outcome of both. The results showed that the Resnet-50 one converged earlier in the experiments. Following the research by Montavon and Binder, we used the DeepTaylor Layer-wise Relevance Propagation (LRP) model to highlight those pixels and regions within a mammogram which contribute most to its classification. This is represented as a map of those pixels in the original image, which contribute to the diagnosis and the extent to which they contribute to the final classification. The most significant advantage of this algorithm is that it performs exceptionally well with the Resnet-50 patch classifier architecture.
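下面是分层相关传播(LRP)思想的一个极简numpy示意(针对一个随机初始化的两层ReLU全连接网络、采用z+规则,而非论文中作用于Resnet-50补丁分类器的DeepTaylor实现),用于说明"把输出相关度逐层分配回输入像素"这一机制。

```python
import numpy as np

rng = np.random.default_rng(0)
# 一个随机初始化的两层网络,仅用于演示相关度的回传
W1, b1 = rng.normal(size=(64, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)

def lrp_zplus(R, a_prev, W):
    """z+ 规则:只沿正权重贡献把相关度 R 分配回前一层激活 a_prev。"""
    Wp = np.maximum(W, 0.0)
    z = a_prev @ Wp + 1e-9             # 各神经元收到的正贡献总和
    s = R / z                          # 归一化后的相关度
    return a_prev * (s @ Wp.T)         # 按贡献比例回传

x = np.abs(rng.normal(size=64))        # 模拟一张展平的"图像"(非负像素)
a1 = np.maximum(x @ W1 + b1, 0.0)      # 前向传播
out = a1 @ W2 + b2

R2 = out.copy()                        # 以输出作为初始相关度
R1 = lrp_zplus(R2, a1, W2)             # 输出层 -> 隐藏层
R0 = lrp_zplus(R1, x, W1)              # 隐藏层 -> 输入像素
print("输入相关度之和与输出相关度之和近似相等:", R0.sum(), out.sum())
```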
【5】 Score-Based Change Detection for Gradient-Based Learning Machines 标题:基于分数的梯度学习机变化检测
作者:Lang Liu,Joseph Salmon,Zaid Harchaoui 机构: Department of Statistics, University of Washington, Seattle, IMAG, University of Montpellier, CNRS, Montpellier 链接:https://arxiv.org/abs/2106.14122 摘要:机器学习算法的广泛应用需要自动变化检测算法来监控它们的行为。当机器学习算法从一个连续的、可能进化的数据流中学习时,用一个伴随的变化检测算法来补充它以促进它的监视和控制是可取的,而且常常是关键的。我们提出了一种通用的基于分数的变化检测方法,该方法可以检测通过经验风险最小化训练的机器学习模型中任意数量的组件的变化。这种提出的统计假设检验可以很容易地实现在可微编程框架内设计的模型。我们建立了假设检验的一致性,并说明了如何对其进行校正以达到规定的虚警率。我们说明了该方法对合成和真实数据的多功能性。 摘要:The widespread use of machine learning algorithms calls for automatic change detection algorithms to monitor their behavior over time. As a machine learning algorithm learns from a continuous, possibly evolving, stream of data, it is desirable and often critical to supplement it with a companion change detection algorithm to facilitate its monitoring and control. We present a generic score-based change detection method that can detect a change in any number of components of a machine learning model trained via empirical risk minimization. This proposed statistical hypothesis test can be readily implemented for such models designed within a differentiable programming framework. We establish the consistency of the hypothesis test and show how to calibrate it to achieve a prescribed false alarm rate. We illustrate the versatility of the approach on synthetic and real data.
分类|识别(8篇)
【1】 Data Poisoning Won't Save You From Facial Recognition 标题:数据中毒不会将你从面部识别中解救出来
作者:Evani Radiya-Dixit,Florian Tramèr 机构:Stanford University 链接:https://arxiv.org/abs/2106.14851 摘要:数据中毒已被提出作为一种强有力的防御措施,以防人脸识别模型训练在网络上刮图片。通过扰乱他们在网上发布的图片,用户可以愚弄模型,使其对未来(未受干扰的)图片进行错误分类。我们证明,这种策略提供了一种错误的安全感,因为它忽略了双方之间固有的不对称性:用户的图片在发布之前(此时图片被刮走)会受到一次彻底的干扰,之后必须欺骗所有未来的模型——包括针对用户过去的攻击进行自适应训练的模型,或者使用攻击后发现的技术的模型。我们评估了两个针对大规模面部识别的中毒攻击系统,Fawkes(500000 下载量)和LowKey。我们演示了一个“不经意”的模型训练师如何简单地等待计算机视觉的未来发展,从而取消对过去收集的图片的保护。我们进一步证明,具有黑盒访问攻击的对手可以(i)训练一个鲁棒模型,抵抗收集到的图片的干扰,(ii)检测上传到网上的中毒图片。我们警告说,面部识别中毒不会承认攻击者和捍卫者之间的“军备竞赛”。一旦被干扰的图片被刮走,攻击就无法改变,因此任何未来成功的防御都将不可避免地损害用户的隐私。 摘要:Data poisoning has been proposed as a compelling defense against facial recognition models trained on Web-scraped pictures. By perturbing the images they post online, users can fool models into misclassifying future (unperturbed) pictures. We demonstrate that this strategy provides a false sense of security, as it ignores an inherent asymmetry between the parties: users' pictures are perturbed once and for all before being published (at which point they are scraped) and must thereafter fool all future models -- including models trained adaptively against the users' past attacks, or models that use technologies discovered after the attack. We evaluate two systems for poisoning attacks against large-scale facial recognition, Fawkes (500,000 downloads) and LowKey. We demonstrate how an "oblivious" model trainer can simply wait for future developments in computer vision to nullify the protection of pictures collected in the past. We further show that an adversary with black-box access to the attack can (i) train a robust model that resists the perturbations of collected pictures and (ii) detect poisoned pictures uploaded online. We caution that facial recognition poisoning will not admit an "arms race" between attackers and defenders. Once perturbed pictures are scraped, the attack cannot be changed so any future successful defense irrevocably undermines users' privacy.
【2】 Integrate-and-Fire Neurons for Low-Powered Pattern Recognition 标题:用于低功耗模式识别的积分发放神经元
作者:Florian Bacho,Dominique Chu 机构:CEMS, School of Computing, University of Kent, Canterbury CT,NF, UK 备注:12 pages, 5 figures, 2 tables. This paper is the full text of the research, presented at the 20th International Conference on Artificial Intelligence and Soft Computing Web System (ICAISC 2021) 链接:https://arxiv.org/abs/2106.14596 摘要:嵌入式系统从传感器获取关于真实世界的信息,并对其进行处理以做出决策和/或进行传输。在某些情况下,数据和决策之间的关系很复杂和/或要传输的数据量很大(例如在生物记录器中)。人工神经网络(ANNs)能有效地检测输入数据中的模式,使其适合于决策或信息压缩以进行数据传输。然而,人工神经网络需要大量的能量,这会缩短电池供电设备的寿命。因此,通过提供一种在不消耗太多能量的情况下有效地处理感官数据的方法,尖峰神经网络的使用可以改进这样的系统。在这项工作中,我们介绍了一个称为积分发放(Integrate-and-Fire)的低功耗神经元模型,它利用了电容器的充放电特性。利用并联和串联RC电路,我们建立了一个可训练的神经元模型,该模型可以用递归形式表示。最后,我们用一个人工生成的狗姿势数据集来训练它的模拟,并将其实现为显示出良好能量特性的硬件。本文是在第20届国际人工智能与软计算网络系统会议(ICAISC 2021)上发表的这项研究的全文 摘要:Embedded systems acquire information about the real world from sensors and process it to make decisions and/or for transmission. In some situations, the relationship between the data and the decision is complex and/or the amount of data to transmit is large (e.g. in biologgers). Artificial Neural Networks (ANNs) can efficiently detect patterns in the input data which makes them suitable for decision making or compression of information for data transmission. However, ANNs require a substantial amount of energy which reduces the lifetime of battery-powered devices. Therefore, the use of Spiking Neural Networks can improve such systems by providing a way to efficiently process sensory data without being too energy-consuming. In this work, we introduce a low-powered neuron model called Integrate-and-Fire which exploits the charge and discharge properties of the capacitor. Using parallel and series RC circuits, we developed a trainable neuron model that can be expressed in a recurrent form. Finally, we trained its simulation with an artificially generated dataset of dog postures and implemented it as hardware that showed promising energetic properties. This paper is the full text of the research, presented at the 20th International Conference on Artificial Intelligence and Soft Computing Web System (ICAISC 2021)
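下面用numpy给出一个电容充放电式漏电积分发放(leaky integrate-and-fire)神经元的最小仿真示意(时间常数、阈值等参数均为示例假设,并非论文中由并联/串联RC电路导出的可训练模型),以说明摘要中"利用电容充放电特性构造神经元"的基本思想。

```python
import numpy as np

def lif_simulate(inputs, tau=20.0, v_thresh=1.0, v_reset=0.0, dt=1.0):
    """漏电积分发放神经元:膜电位按 RC 电路规律泄漏并累积输入,过阈值则发放。"""
    v, spikes, trace = v_reset, [], []
    decay = np.exp(-dt / tau)          # 每个时间步电容的放电(泄漏)系数
    for x in inputs:
        v = v * decay + x              # 泄漏 + 充电
        if v >= v_thresh:
            spikes.append(1)
            v = v_reset                # 发放后复位
        else:
            spikes.append(0)
        trace.append(v)
    return np.array(spikes), np.array(trace)

rng = np.random.default_rng(0)
inputs = rng.uniform(0.0, 0.15, size=200)   # 随机输入电流
spikes, trace = lif_simulate(inputs)
print("发放次数:", int(spikes.sum()), " 平均膜电位:", trace.mean())
```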
【3】 Deep Learning Image Recognition for Non-images 标题:面向非图像的深度学习图像识别
作者:Boris Kovalerchuk,Divya Chandrika Kalla,Bedant Agarwal 机构:Dept. of Computer Science, Central Washington University, USA, Indian Institute of Technology Kharagpur, India 备注:33 pages, 17 figures, 18 tables 链接:https://arxiv.org/abs/2106.14350 摘要:强大的深度学习算法通过将非图像机器学习问题转化为图像识别问题,为解决非图像机器学习问题提供了机会。本章提出的CPC-R算法通过可视化非图像数据将非图像数据转换为图像。随后由深度学习CNN算法在这些图像上解决学习问题。CPC-R算法的设计允许在二维图像中保留所有高维信息。与替代方法中使用的单值映射相比,成对值映射可以用少一半的视觉元素来编码每个n-D点。一个n-D点的属性被划分为它的值对,并且每一对都被可视化为在相同的2-D笛卡尔坐标中的2-D点。接下来,将灰度或颜色强度值分配给每一对,以编码成对的顺序。由此得到热图图像。我们针对不同的CNN结构以及优化CPC-R图像的方法对CPC-R进行了计算实验,结果表明CPC-R与深度学习CNN相结合的算法能够解决非图像ML问题,在基准数据集上达到很高的精度。本章通过添加更多的实验来测试分类的准确性,探索发现的特征的显著性和信息性来测试其可解释性,并推广了该方法,从而扩展了我们以前的工作。 摘要:Powerful deep learning algorithms open an opportunity for solving non-image Machine Learning (ML) problems by transforming these problems to into the image recognition problems. The CPC-R algorithm presented in this chapter converts non-image data into images by visualizing non-image data. Then deep learning CNN algorithms solve the learning problems on these images. The design of the CPC-R algorithm allows preserving all high-dimensional information in 2-D images. The use of pair values mapping instead of single value mapping used in the alternative approaches allows encoding each n-D point with 2 times fewer visual elements. The attributes of an n-D point are divided into pairs of its values and each pair is visualized as 2-D points in the same 2-D Cartesian coordinates. Next, grey scale or color intensity values are assigned to each pair to encode the order of pairs. This is resulted in the heatmap image. The computational experiments with CPC-R are conducted for different CNN architectures, and methods to optimize the CPC-R images showing that the combined CPC-R and deep learning CNN algorithms are able to solve non-image ML problems reaching high accuracy on the benchmark datasets. This chapter expands our prior work by adding more experiments to test accuracy of classification, exploring saliency and informativeness of discovered features to test their interpretability, and generalizing the approach.
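按摘要所述"把n维点的属性两两配对、在同一二维坐标系中画出各对、并用灰度强度编码配对顺序"的思路,下面给出一个极简的numpy示意(图像分辨率与强度分配方式为示例假设,并非论文CPC-R的官方实现),生成的灰度图可直接作为CNN的输入。

```python
import numpy as np

def cpc_r_image(point, resolution=32):
    """把一个取值在 [0, 1] 的 n 维点编码成单通道灰度图(n 为偶数)。"""
    assert len(point) % 2 == 0, "属性个数需为偶数,必要时可补零"
    img = np.zeros((resolution, resolution), dtype=np.float32)
    pairs = np.asarray(point).reshape(-1, 2)             # 相邻属性两两配对
    for order, (a, b) in enumerate(pairs):
        r = min(int(a * (resolution - 1)), resolution - 1)
        c = min(int(b * (resolution - 1)), resolution - 1)
        # 灰度强度编码配对的顺序:越靠后的配对越亮
        img[r, c] = (order + 1) / len(pairs)
    return img

x = np.array([0.1, 0.8, 0.35, 0.6, 0.9, 0.2])             # 一个 6 维样本
img = cpc_r_image(x)
print("非零像素个数:", int((img > 0).sum()), " 图像形状:", img.shape)
```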
【4】 Reducing numerical precision preserves classification accuracy in Mondrian Forests 标题:降低数值精度可保持蒙德里安森林的分类精度
作者:Marc Vicuna,Martin Khannouz,Gregory Kiar,Yohan Chatelain,Tristan Glatard 机构:Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada, Center for the Developing Brain, Child Mind Institute, New York, NY, USA 备注:6 pages, 3 tables, 2 figures. Keywords: numerical precision, memory footprint, Mondrian Forests, human activity, recognition, data streams, supervised classification, floating-point representation 链接:https://arxiv.org/abs/2106.14340 摘要:Mondrian Forests是一种强大的数据流分类方法,但其巨大的内存占用使其不适合低资源平台,如连接对象。我们探讨了使用减少精度浮点表示来降低内存消耗,并评估了它对分类性能的影响。我们将OrpailleCC(一个数据流算法的C++集合)提供的Mondrian Forest实现应用于人类活动识别中的两个典型数据集:Recofit和Banos et al.。结果表明,树节点使用的浮点值精度可以从64位降低到8位,F1值无显著差异。在某些情况下,降低精度可以提高分类性能,可能是由于其正则化效应。我们的结论是,在蒙德里安森林中,数值精度是一个相关的超参数,通常使用的双精度值可能不是最佳性能所必需的。未来的工作将评估这些发现对其他数据流分类器的普遍性。 摘要:Mondrian Forests are a powerful data stream classification method, but their large memory footprint makes them ill-suited for low-resource platforms such as connected objects. We explored using reduced-precision floating-point representations to lower memory consumption and evaluated its effect on classification performance. We applied the Mondrian Forest implementation provided by OrpailleCC, a C++ collection of data stream algorithms, to two canonical datasets in human activity recognition: Recofit and Banos et al. Results show that the precision of floating-point values used by tree nodes can be reduced from 64 bits to 8 bits with no significant difference in F1 score. In some cases, reduced precision was shown to improve classification performance, presumably due to its regularization effect. We conclude that numerical precision is a relevant hyperparameter in the Mondrian Forest, and that commonly-used double precision values may not be necessary for optimal performance. Future work will evaluate the generalizability of these findings to other data stream classifiers.
【5】 Deep Learning for Technical Document Classification 标题:深度学习在技术文档分类中的应用
作者:Shuo Jiang,Jianxi Luo,Jie Hu,Christopher L. Magee 机构:School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China, Institute for Data, Systems, and Society and SUTD-MIT International Design Centre, Massachusetts, Institute of Technology, Cambridge, MA, USA 备注:34 pages, 7 figures, 10 tables 链接:https://arxiv.org/abs/2106.14269 摘要:在大型技术公司中,工程师和管理人员为支持相关决策而创建的技术文档的管理和组织需求近年来急剧增加,这导致了对更具可伸缩性、更准确和更自动化的文档分类的更高要求。以前的研究主要集中在分类和小规模数据库的文本处理上。本文提出了一种新的多模式深度学习技术文档分类体系结构TechDoc,它利用自然语言和描述性图像来训练层次分类器。该结构通过一个集成的训练过程综合卷积神经网络和递归神经网络。将该体系结构应用于一个大型的多模式技术文档数据库,训练了基于层次化国际专利分类系统的文档分类模型。结果表明,训练后的神经网络比单一模态和几种早期的文本分类方法具有更高的分类精度。经过训练的模型可以扩展到数百万个具有文本和数字的真实世界的技术文档,这对于大型技术公司和组织的数据和知识管理非常有用。 摘要:In large technology companies, the requirements for managing and organizing technical documents created by engineers and managers in supporting relevant decision making have increased dramatically in recent years, which has led to a higher demand for more scalable, accurate, and automated document classification. Prior studies have primarily focused on processing text for classification and small-scale databases. This paper describes a novel multimodal deep learning architecture, called TechDoc, for technical document classification, which utilizes both natural language and descriptive images to train hierarchical classifiers. The architecture synthesizes convolutional neural networks and recurrent neural networks through an integrated training process. We applied the architecture to a large multimodal technical document database and trained the model for classifying documents based on the hierarchical International Patent Classification system. Our results show that the trained neural network presents a greater classification accuracy than those using a single modality and several earlier text classification methods. The trained model can potentially be scaled to millions of real-world technical documents with both text and figures, which is useful for data and knowledge management in large technology companies and organizations.
【6】 Image Classification with CondenseNeXt for ARM-Based Computing Platforms 标题:基于ARM计算平台的CondenseNeXt图像分类
作者:Priyank Kalgaonkar,Mohamed El-Sharkawy 机构:Department of Electrical and Computer Engineering, Purdue School of Engineering and Technology, Indianapolis, Indiana , USA. 备注:6 pages, 7 figures, conference, published IEEE Conference paper 链接:https://arxiv.org/abs/2106.14102 摘要:在这篇论文中,我们展示了我们的超高效深度卷积神经网络架构:CondenseNeXt在NXP BlueBox上的实现,NXP BlueBox是一个为自动驾驶车辆开发的自主驾驶开发平台。我们证明CondenseNeXt在FLOPs方面非常有效,它是为基于ARM的嵌入式计算平台设计的,计算资源有限,可以执行图像分类,而不需要CUDA支持的GPU。CondenseNeXt利用最先进的深度可分离卷积和模型压缩技术来实现显著的计算效率。对CIFAR-10、CIFAR-100和ImageNet数据集进行了广泛的分析,以验证卷积神经网络(CNN)结构的性能。在CIFAR-10(4.79%top-1误差)、CIFAR-100(21.98%top-1误差)和ImageNet(7.91%single model,single cropt top-5误差)三个基准数据集上实现了最先进的图像分类性能。CondenseNeXt最终训练的模型尺寸比CondenseNet提高了2.9 MB,前向触发器减少了59.98%,并且可以在基于ARM的计算平台上执行图像分类,而不需要CUDA支持的GPU,具有出色的效率。 摘要:In this paper, we demonstrate the implementation of our ultra-efficient deep convolutional neural network architecture: CondenseNeXt on NXP BlueBox, an autonomous driving development platform developed for self-driving vehicles. We show that CondenseNeXt is remarkably efficient in terms of FLOPs, designed for ARM-based embedded computing platforms with limited computational resources and can perform image classification without the need of a CUDA enabled GPU. CondenseNeXt utilizes the state-of-the-art depthwise separable convolution and model compression techniques to achieve a remarkable computational efficiency. Extensive analyses are conducted on CIFAR-10, CIFAR-100 and ImageNet datasets to verify the performance of CondenseNeXt Convolutional Neural Network (CNN) architecture. It achieves state-of-the-art image classification performance on three benchmark datasets including CIFAR-10 (4.79% top-1 error), CIFAR-100 (21.98% top-1 error) and ImageNet (7.91% single model, single crop top-5 error). CondenseNeXt achieves final trained model size improvement of 2.9 MB and up to 59.98% reduction in forward FLOPs compared to CondenseNet and can perform image classification on ARM-Based computing platforms without needing a CUDA enabled GPU support, with outstanding efficiency.
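CondenseNeXt依赖的核心构件之一是深度可分离卷积。下面给出其在PyTorch中的一个常见写法作为示意(通道数与卷积核大小为示例假设,并非CondenseNeXt的完整结构),并粗略对比它与普通卷积的参数量,以说明其计算效率优势的来源。

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """深度可分离卷积 = 逐通道(depthwise)卷积 + 1x1 逐点(pointwise)卷积。"""
    def __init__(self, in_ch, out_ch, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                   padding=kernel_size // 2, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

def n_params(m):
    return sum(p.numel() for p in m.parameters())

dsc = DepthwiseSeparableConv(64, 128)
std = nn.Conv2d(64, 128, 3, padding=1, bias=False)
x = torch.randn(1, 64, 32, 32)
print("输出形状:", dsc(x).shape)
print("参数量(可分离 / 普通):", n_params(dsc), "/", n_params(std))
```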
【7】 The Role of Contextual Information in Best Arm Identification 标题:上下文信息在最佳ARM识别中的作用
作者:Masahiro Kato,Kaito Ariu 机构:CyberAgent Inc., KTH 链接:https://arxiv.org/abs/2106.14077 摘要:研究了随机盗贼在有上下文(协变量)信息的情况下,具有固定置信度的最优手臂识别问题。虽然我们可以在每一轮中使用上下文信息,但我们感兴趣的是边缘化的平均报酬超过上下文分布。我们的目标是在给定的错误率下,用最少的采样次数来确定最佳的arm。我们给出了问题的实例特定样本复杂度下界。然后,我们提出了一种上下文感知的跟踪停止策略,其中arm绘制的比例跟踪最优分配集,并证明了arm绘制的期望数目与下界渐近匹配。我们证明,与Garivier&Kaufmann(2016)的结果相比,上下文信息可以提高识别最佳边缘化平均报酬的效率。我们实验证实,上下文信息有助于更快的最佳手臂识别。 摘要:We study the best-arm identification problem with fixed confidence when contextual (covariate) information is available in stochastic bandits. Although we can use contextual information in each round, we are interested in the marginalized mean reward over the contextual distribution. Our goal is to identify the best arm with a minimal number of samplings under a given value of the error rate. We show the instance-specific sample complexity lower bounds for the problem. Then, we propose a context-aware version of the "Track-and-Stop" strategy, wherein the proportion of the arm draws tracks the set of optimal allocations and prove that the expected number of arm draws matches the lower bound asymptotically. We demonstrate that contextual information can be used to improve the efficiency of the identification of the best marginalized mean reward compared with the results of Garivier & Kaufmann (2016). We experimentally confirm that context information contributes to faster best-arm identification.
【8】 A Photonic-Circuits-Inspired Compact Network: Toward Real-Time Wireless Signal Classification at the Edge 标题:一种受光子电路启发的紧凑型网络:走向边缘的实时无线信号分类
作者:Hsuan-Tung Peng,Joshua Lederman,Lei Xu,Thomas Ferreira de Lima,Chaoran Huang,Bhavin Shastri,David Rosenbluth,Paul Prucnal 机构:Princeton University, Queen’s University &, Vector Institute, Lockheed AI center 备注:17 pages, 14 figures 链接:https://arxiv.org/abs/2106.13865 摘要:机器学习(ML)方法在无线通信系统中普遍存在,在射频(RF)指纹识别、自动调制分类和认知无线电等领域有着广泛的应用。然而,ML模型的大尺寸使得它们难以在对延迟敏感的下游任务的边缘设备上实现。在无线通信系统中,以亚毫秒级处理ML数据,可以实现实时网络监控,提高安全性,防止渗透。此外,紧凑的、可集成的硬件平台能够在芯片级实现ML模型,将在无线通信网络中得到更广泛的应用。针对边缘无线信号的实时分类问题,提出了一种由光子硬件激励的递归神经网络模型和一种简化的卷积分类器组成的紧凑型深度网络,并将其应用于射频辐射源的随机传输识别。利用该模型,在使用比现有CNN分类器少50倍的训练参数的情况下,我们在30个相同的ZigBee设备上实现了96.32%的分类准确率。由于网络尺寸大大减小,我们使用小型FPGA板PYNQ-Z1演示了延迟为0.219 ms的实时RF指纹。 摘要:Machine learning (ML) methods are ubiquitous in wireless communication systems and have proven powerful for applications including radio-frequency (RF) fingerprinting, automatic modulation classification, and cognitive radio. However, the large size of ML models can make them difficult to implement on edge devices for latency-sensitive downstream tasks. In wireless communication systems, ML data processing at a sub-millisecond scale will enable real-time network monitoring to improve security and prevent infiltration. In addition, compact and integratable hardware platforms which can implement ML models at the chip scale will find much broader application to wireless communication networks. Toward real-time wireless signal classification at the edge, we propose a novel compact deep network that consists of a photonic-hardware-inspired recurrent neural network model in combination with a simplified convolutional classifier, and we demonstrate its application to the identification of RF emitters by their random transmissions. With the proposed model, we achieve 96.32% classification accuracy over a set of 30 identical ZigBee devices when using 50 times fewer training parameters than an existing state-of-the-art CNN classifier. Thanks to the large reduction in network size, we demonstrate real-time RF fingerprinting with 0.219 ms latency using a small-scale FPGA board, the PYNQ-Z1.
表征(3篇)
【1】 Learning Mesh Representations via Binary Space Partitioning Tree Networks 标题:基于二元空间划分树网络的网格表示学习
作者:Zhiqin Chen,Andrea Tagliasacchi,Hao Zhang 机构: Zhang are with Simon Fraser University 备注:Accepted to TPAMI. This is the journal version of BSP-Net (arXiv:1911.06971) 链接:https://arxiv.org/abs/2106.14274 摘要:多边形网格无处不在,但在深度学习革命中只起到了相对次要的作用。最先进的三维形状神经生成模型学习隐函数,并通过昂贵的iso曲面生成网格。我们通过采用计算机图形学中的经典空间数据结构二进制空间划分(BSP)来促进3D学习,从而克服了这些挑战。BSP的核心运算是对三维空间进行递归细分以获得凸集。利用这一特性,我们设计了BSP网络,该网络通过凸分解学习表示三维形状,而无需监督。该网络被训练成使用从一组平面上建立的BSP树获得的一组凸面来重构形状,其中平面和凸面都由学习的网络权重定义。BSP-Net直接从推断的凸面输出多边形网格。生成的网格是防水的、紧凑的(即低多边形),并且非常适合表示尖锐的几何体。结果表明,BSP网络的重建质量与现有的重建方法相比具有一定的竞争力,但所使用的基元要少得多。我们还探讨了BSP网络的变化,包括使用更通用的解码器进行重建,使用比平面更通用的原语,以及使用变分自动编码器训练生成模型。代码位于https://github.com/czq142857/BSP-NET-original. 摘要:Polygonal meshes are ubiquitous, but have only played a relatively minor role in the deep learning revolution. State-of-the-art neural generative models for 3D shapes learn implicit functions and generate meshes via expensive iso-surfacing. We overcome these challenges by employing a classical spatial data structure from computer graphics, Binary Space Partitioning (BSP), to facilitate 3D learning. The core operation of BSP involves recursive subdivision of 3D space to obtain convex sets. By exploiting this property, we devise BSP-Net, a network that learns to represent a 3D shape via convex decomposition without supervision. The network is trained to reconstruct a shape using a set of convexes obtained from a BSP-tree built over a set of planes, where the planes and convexes are both defined by learned network weights. BSP-Net directly outputs polygonal meshes from the inferred convexes. The generated meshes are watertight, compact (i.e., low-poly), and well suited to represent sharp geometry. We show that the reconstruction quality by BSP-Net is competitive with those from state-of-the-art methods while using much fewer primitives. We also explore variations to BSP-Net including using a more generic decoder for reconstruction, more general primitives than planes, as well as training a generative model with variational auto-encoders. Code is available at https://github.com/czq142857/BSP-NET-original.
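下面用numpy给出"由一组平面构造凸集、再由凸集的并集表示形状"这一思想的极简占用函数示意(平面参数为手工指定的二维示例,并非BSP-Net学到的网络权重):每个凸集由若干半空间的交给出(逐点取最大值),形状的内部即任一凸集的内部(逐点取最小值)。

```python
import numpy as np

def halfspace(normal, offset):
    """返回有符号函数 f(x) = n·x + d,f<=0 表示位于半空间内部。"""
    n = np.asarray(normal, dtype=float)
    return lambda pts: pts @ n + offset

def convex_from_planes(planes):
    """凸集 = 各半空间的交,用逐点取最大值实现(max<=0 即在凸集内)。"""
    return lambda pts: np.max(np.stack([p(pts) for p in planes], axis=-1), axis=-1)

def shape_from_convexes(convexes):
    """形状 = 各凸集的并,用逐点取最小值实现(min<=0 即在形状内)。"""
    return lambda pts: np.min(np.stack([c(pts) for c in convexes], axis=-1), axis=-1)

# 两个二维凸集:一个单位正方形和一个三角形,拼成一个简单形状
square = convex_from_planes([halfspace([ 1, 0], -1), halfspace([-1, 0], 0),
                             halfspace([ 0, 1], -1), halfspace([ 0, -1], 0)])
tri = convex_from_planes([halfspace([-1, 0], 1), halfspace([0, -1], 0),
                          halfspace([1, 1], -2.5)])
shape = shape_from_convexes([square, tri])

pts = np.array([[0.5, 0.5], [1.2, 0.5], [3.0, 3.0]])
print("占用值(<=0 表示在形状内):", shape(pts))
```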
【2】 Time-Series Representation Learning via Temporal and Contextual Contrasting 标题:基于时间和上下文对比的时间序列表征学习
作者:Emadeldeen Eldele,Mohamed Ragab,Zhenghua Chen,Min Wu,Chee Keong Kwoh,Xiaoli Li,Cuntai Guan 机构:School of Computer Science and Engineering, Nanyang Technological University, Singapore, Institute for Infocomm Research, ASTAR, Singapore 备注:Accepted in IJCAI-21 conference ... please cite the conference version 链接:https://arxiv.org/abs/2106.14112 摘要:从时间动态的未标记时间序列数据中学习合适的表示是一项非常具有挑战性的任务。本文提出了一种基于时间和上下文对比的无监督时间序列表示学习框架(TS-TCC),用于从未标记数据中学习时间序列表示。首先,利用弱增广和强增广将原始时间序列数据转换成两种不同但相关的视图。其次,我们提出了一个新的时间对比模块,通过设计一个困难的交叉视图预测任务来学习鲁棒的时间表示。最后,为了进一步学习区分性表征,我们提出了一个基于时间对比模块的上下文对比模块。它试图最大化同一样本的不同上下文之间的相似性,同时最小化不同样本的上下文之间的相似性。在三个真实的时间序列数据集上进行了实验。结果表明,在所提出的TS-TCC学习的特征基础上训练线性分类器与有监督训练的效果相当。此外,我们提出的TS-TCC在较少的标记数据和迁移学习场景中表现出很高的效率。该代码在https://github.com/emadeldeen24/TS-TCC. 摘要:Learning decent representations from unlabeled time-series data with temporal dynamics is a very challenging task. In this paper, we propose an unsupervised Time-Series representation learning framework via Temporal and Contextual Contrasting (TS-TCC), to learn time-series representation from unlabeled data. First, the raw time-series data are transformed into two different yet correlated views by using weak and strong augmentations. Second, we propose a novel temporal contrasting module to learn robust temporal representations by designing a tough cross-view prediction task. Last, to further learn discriminative representations, we propose a contextual contrasting module built upon the contexts from the temporal contrasting module. It attempts to maximize the similarity among different contexts of the same sample while minimizing similarity among contexts of different samples. Experiments have been carried out on three real-world time-series datasets. The results manifest that training a linear classifier on top of the features learned by our proposed TS-TCC performs comparably with the supervised training. Additionally, our proposed TS-TCC shows high efficiency in few-labeled data and transfer learning scenarios. The code is publicly available at https://github.com/emadeldeen24/TS-TCC.
【3】 LiteGEM: Lite Geometry Enhanced Molecular Representation Learning for Quantum Property Prediction 标题:LiteGEM:用于量子性质预测的Lite几何增强型分子表示学习
作者:Shanzhuo Zhang,Lihang Liu,Sheng Gao,Donglong He,Xiaomin Fang,Weibin Li,Zhengjie Huang,Weiyue Su,Wenjin Wang 机构:PaddleHelix Team, Baidu Inc., PGL Team, Baidu Inc. 链接:https://arxiv.org/abs/2106.14494 摘要:在本报告中,我们(SuperHelix团队)提出了我们对KDD Cup 2021-PCQM4M-LSC的解决方案,这是一个预测分子HOMO-LUMO能隙的大规模量子化学数据集。我们的解决方案,Lite几何增强分子表征学习(LiteGEM)在深图神经网络和各种自监督学习任务的帮助下,在测试集上达到了0.1204的平均绝对误差(MAE)。框架的代码可以在中找到https://github.com/PaddlePaddle/PaddleHelix/tree/dev/competition/kddcup2021-PCQM4M-LSC/. 摘要:In this report, we (SuperHelix team) present our solution to KDD Cup 2021-PCQM4M-LSC, a large-scale quantum chemistry dataset on predicting HOMO-LUMO gap of molecules. Our solution, Lite Geometry Enhanced Molecular representation learning (LiteGEM) achieves a mean absolute error (MAE) of 0.1204 on the test set with the help of deep graph neural networks and various self-supervised learning tasks. The code of the framework can be found in https://github.com/PaddlePaddle/PaddleHelix/tree/dev/competition/kddcup2021-PCQM4M-LSC/.
3D|3D重建等相关(1篇)
【1】 Fully Steerable 3D Spherical Neurons 标题:完全可操纵的三维球形神经元
作者:Pavlo Melnyk,Michael Felsberg,Mårten Wadenbäck 机构:Computer Vision Laboratory, Department of Electrical Engineering, Linköping University 链接:https://arxiv.org/abs/2106.13863 摘要:从低级视觉理论中产生的可操纵过滤器在深度学习中找到了对应的过滤器。早期的工作使用了转向定理,并提出了卷积网络等价于刚性变换。在我们的工作中,我们提出了一种可控制的基于前馈学习的方法,该方法由球形决策面和操作点云组成。由于我们理论固有的三维几何结构,我们推导了其原子部分超球神经元的三维可操纵性约束。利用旋转等价性,我们展示了模型参数是如何在推理时完全可控的。所提出的球形滤波器组能够对未知方向的已知合成点集进行等变和在线优化后的不变类预测。 摘要:Emerging from low-level vision theory, steerable filters found their counterpart in deep learning. Earlier works used the steering theorems and presented convolutional networks equivariant to rigid transformations. In our work, we propose a steerable feed-forward learning-based approach that consists of spherical decision surfaces and operates on point clouds. Due to the inherent geometric 3D structure of our theory, we derive a 3D steerability constraint for its atomic parts, the hypersphere neurons. Exploiting the rotational equivariance, we show how the model parameters are fully steerable at inference time. The proposed spherical filter banks enable to make equivariant and, after online optimization, invariant class predictions for known synthetic point sets in unknown orientations.
优化|敛散性(6篇)
【1】 High-probability Bounds for Non-Convex Stochastic Optimization with Heavy Tails 标题:重尾非凸随机优化问题的高概率界
作者:Ashok Cutkosky,Harsh Mehta 机构:Boston University, Google Research 链接:https://arxiv.org/abs/2106.14343 摘要:我们考虑使用梯度估计可能有重尾的一阶算法的非凸随机优化。我们证明了当梯度仅对某个$\mathfrak{p}\in(1,2]$具有有界的$\mathfrak{p}$次矩时,梯度裁剪、动量和归一化梯度下降的组合以高概率按光滑损失下已知最好的速率收敛到临界点。然后我们考虑二阶光滑损失的情况,据我们所知,在这种情况下还没有研究过,并且再次获得了任何$\mathfrak{p}$的高概率界。此外,我们的结果适用于任意光滑范数,而典型的SGD分析需要Hilbert空间范数。进一步地,我们证明了在一个合适的"磨合"期之后,每次迭代的目标值都会单调地减小,直到确定了一个临界点,这为学习率"热身"的流行实践提供了直觉,也产生了最后一次迭代的保证。 摘要:We consider non-convex stochastic optimization using first-order algorithms for which the gradient estimates may have heavy tails. We show that a combination of gradient clipping, momentum, and normalized gradient descent yields convergence to critical points in high-probability with best-known rates for smooth losses when the gradients only have bounded $\mathfrak{p}$th moments for some $\mathfrak{p}\in(1,2]$. We then consider the case of second-order smooth losses, which to our knowledge have not been studied in this setting, and again obtain high-probability bounds for any $\mathfrak{p}$. Moreover, our results hold for arbitrary smooth norms, in contrast to the typical SGD analysis which requires a Hilbert space norm. Further, we show that after a suitable "burn-in" period, the objective value will monotonically decrease for every iteration until a critical point is identified, which provides intuition behind the popular practice of learning rate "warm-up" and also yields a last-iterate guarantee.
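摘要中分析的算法把梯度裁剪、动量与归一化梯度下降组合在一起。下面用numpy给出这一组合更新规则的一个示意实现(裁剪阈值、动量系数与学习率均为示例超参数,具体形式以论文为准),并在一个带重尾梯度噪声的简单二次目标上运行若干步。

```python
import numpy as np

def clipped_momentum_normalized_step(w, grad, m, lr=0.1, beta=0.9, clip=1.0):
    """示意性的单步更新:先裁剪梯度,再做动量平均,最后沿动量方向走归一化步长。"""
    g_norm = np.linalg.norm(grad)
    g = grad * min(1.0, clip / (g_norm + 1e-12))   # 梯度裁剪,抑制重尾噪声
    m = beta * m + (1 - beta) * g                  # 动量(指数滑动平均)
    w = w - lr * m / (np.linalg.norm(m) + 1e-12)   # 归一化梯度下降
    return w, m

rng = np.random.default_rng(0)
w, m = np.array([5.0, -3.0]), np.zeros(2)
for t in range(200):
    # 二次目标 0.5*||w||^2 的随机梯度,叠加重尾(学生 t 分布)噪声
    grad = w + rng.standard_t(df=1.5, size=2)
    w, m = clipped_momentum_normalized_step(w, grad, m)
print("200 步后的参数范数:", np.linalg.norm(w))
```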
【2】 Last-iterate Convergence in Extensive-Form Games 标题:扩展形式对策的末次迭代收敛性
作者:Chung-Wei Lee,Christian Kroer,Haipeng Luo 机构:University of Southern California, Columbia University 链接:https://arxiv.org/abs/2106.14326 摘要:基于遗憾的算法在寻找连续博弈(如扑克博弈)中的近似纳什均衡时非常有效。然而,大多数基于遗憾的算法,包括反事实遗憾最小化(CFR)及其变体,都依赖于迭代平均来实现收敛。受零和范式对策中乐观算法最后一次迭代收敛的最新进展的启发,我们研究了序列对策中的这一现象,并对具有完全召回的零和扩展型对策(EFGs)的最后一次迭代收敛进行了全面的研究,在三元组上使用各种乐观遗憾最小化算法。这包括使用香草熵或平方欧氏范数正则化器的算法,以及允许更有效实现的扩展版本。与CFR相比,我们证明了所有这些算法都具有最后的迭代收敛性,其中一些算法甚至以指数级的速度收敛。我们还提供了实验来进一步支持我们的理论结果。 摘要:Regret-based algorithms are highly efficient at finding approximate Nash equilibria in sequential games such as poker games. However, most regret-based algorithms, including counterfactual regret minimization (CFR) and its variants, rely on iterate averaging to achieve convergence. Inspired by recent advances on last-iterate convergence of optimistic algorithms in zero-sum normal-form games, we study this phenomenon in sequential games, and provide a comprehensive study of last-iterate convergence for zero-sum extensive-form games with perfect recall (EFGs), using various optimistic regret-minimization algorithms over treeplexes. This includes algorithms using the vanilla entropy or squared Euclidean norm regularizers, as well as their dilated versions which admit more efficient implementation. In contrast to CFR, we show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast. We also provide experiments to further support our theoretical results.
【3】 Contextual Inverse Optimization: Offline and Online Learning 标题:上下文逆向优化:离线和在线学习
作者:Omar Besbes,Yuri Fonseca,Ilan Lobel 链接:https://arxiv.org/abs/2106.14015 摘要:我们研究了具有反馈信息的离线和在线上下文优化问题,其中我们不是观察损失,而是在事后观察一个完全了解目标函数的预言者将采取的最佳行动。我们的目标是最大限度地减少遗憾,遗憾是指我们的损失与一个无所不知的先知所造成的损失之间的差异。在离线环境中,决策者拥有过去时期的可用信息,需要做出一个决策;而在在线环境中,决策者根据每个时期的一组新的可行行动和上下文功能,随时间动态优化决策。对于离线设置,我们描述了最优的minimax策略,建立了可以实现的性能作为数据诱导的信息的基本几何结构的函数。在在线设置中,我们利用这种几何特征来优化累积后悔。我们发展了一个算法,产生这个问题的第一个遗憾界是对数的时间范围。 摘要:We study the problems of offline and online contextual optimization with feedback information, where instead of observing the loss, we observe, after-the-fact, the optimal action an oracle with full knowledge of the objective function would have taken. We aim to minimize regret, which is defined as the difference between our losses and the ones incurred by an all-knowing oracle. In the offline setting, the decision-maker has information available from past periods and needs to make one decision, while in the online setting, the decision-maker optimizes decisions dynamically over time based a new set of feasible actions and contextual functions in each period. For the offline setting, we characterize the optimal minimax policy, establishing the performance that can be achieved as a function of the underlying geometry of the information induced by the data. In the online setting, we leverage this geometric characterization to optimize the cumulative regret. We develop an algorithm that yields the first regret bound for this problem that is logarithmic in the time horizon.
【4】 Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits 标题:向探索者学习:土匪的最优报酬估计
作者:Wenshuo Guo,Kumar Krishna Agrawal,Aditya Grover,Vidya Muthukumar,Ashwin Pananjady 机构:⋄Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, †Facebook AI Research, ‡School of Electrical & Computer Engineering and School of Industrial & Systems Engineering, Georgia Institute of Technology 链接:https://arxiv.org/abs/2106.14866 摘要:通过观察一个低后悔演示者的学习过程,我们引入了估计多武装土匪实例回报的“逆土匪”问题。现有的逆强化学习相关问题的解决方法假设执行一个最优策略,因此存在可辨识性问题。相比之下,我们的范例利用了演示者在通往最优的道路上的行为,特别是在探索阶段,以获得一致的报酬估计。我们开发了简单而有效的奖励估计程序,用于一类基于高置信度的算法的演示,表明奖励估计随着算法的遗憾增加而变得越来越容易。我们将这些上界与适用于任何演示算法的信息论下界相匹配,从而描述探索和报酬估计之间的最佳权衡。对自然科学的合成数据和模拟实验设计数据的大量经验评估证实了我们的理论结果。 摘要:We introduce the "inverse bandit" problem of estimating the rewards of a multi-armed bandit instance from observing the learning process of a low-regret demonstrator. Existing approaches to the related problem of inverse reinforcement learning assume the execution of an optimal policy, and thereby suffer from an identifiability issue. In contrast, our paradigm leverages the demonstrator's behavior en route to optimality, and in particular, the exploration phase, to obtain consistent reward estimates. We develop simple and efficient reward estimation procedures for demonstrations within a class of upper-confidence-based algorithms, showing that reward estimation gets progressively easier as the regret of the algorithm increases. We match these upper bounds with information-theoretic lower bounds that apply to any demonstrator algorithm, thereby characterizing the optimal tradeoff between exploration and reward estimation. Extensive empirical evaluations on both synthetic data and simulated experimental design data from the natural sciences corroborate our theoretical results.
【5】 Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning 标题:最优值估计中的实例最优性:基于减方差Q学习的自适应
作者:Koulik Khamaru,Eric Xia,Martin J. Wainwright,Michael I. Jordan 机构: Department of Statistics and Department of Electrical Engineering and Computer Sciences, UC Berkeley, Berkeley, CA 链接:https://arxiv.org/abs/2106.14352 摘要:强化学习中的各种算法在其收敛速度和最终精度上表现出极大的变化,这是问题结构的函数。这种特定于实例的行为不会被现有的全局极大极小边界所捕获,这是本质上最坏的情况。分析了具有离散状态和行为的折扣Markov决策过程的最优$Q$值函数的估计问题,并在$\ell_\infty$范数下确定了一个控制估计难度的实例依赖函数。利用一个局部极大极小框架,我们证明了这个泛函在任何估计过程的精度下界都是成立的。在另一个方向上,我们通过分析$Q$-学习的方差缩减版本,建立了我们的下界的锐度,直到在状态空间和动作空间中因子的对数。我们的理论提供了一种在$Q$-学习环境下区分"容易"问题和"难"问题的精确方法,如一个困难连续体的集合所示。 摘要:Various algorithms in reinforcement learning exhibit dramatic variability in their convergence rates and ultimate accuracy as a function of the problem structure. Such instance-specific behavior is not captured by existing global minimax bounds, which are worst-case in nature. We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete states and actions and identify an instance-dependent functional that controls the difficulty of estimation in the $\ell_\infty$-norm. Using a local minimax framework, we show that this functional arises in lower bounds on the accuracy on any estimation procedure. In the other direction, we establish the sharpness of our lower bounds, up to factors logarithmic in the state and action spaces, by analyzing a variance-reduced version of $Q$-learning. Our theory provides a precise way of distinguishing "easy" problems from "hard" ones in the context of $Q$-learning, as illustrated by an ensemble with a continuum of difficulty.
【6】 Global Convergence of Gradient Descent for Asymmetric Low-Rank Matrix Factorization 标题:非对称低秩矩阵分解的梯度下降全局收敛性
作者:Tian Ye,Simon S. Du 机构:Institute for Interdisciplinary Information Sciences, Tsinghua University, Paul G. Allen School of Computer Science and Engineering, University of Washington 链接:https://arxiv.org/abs/2106.14289 摘要:我们研究了非对称低秩分解问题:$\min_{\mathbf{U}\in\mathbb{R}^{m\times d},\,\mathbf{V}\in\mathbb{R}^{n\times d}} \frac{1}{2}\|\mathbf{U}\mathbf{V}^\top-\mathbf{\Sigma}\|_F^2$,其中$\mathbf{\Sigma}$是一个给定的大小为$m\times n$、秩为$d$的矩阵。这是一个典型的问题,在优化中有两个困难:1)非凸性和2)非光滑性(由于$\mathbf{U}$和$\mathbf{V}$的不平衡性)。这也是一个原型更复杂的问题,如不对称矩阵传感和矩阵完成。尽管随机初始化的梯度下降算法具有非凸性和非光滑性,但经验证明它可以在多项式时间内解决这一问题。现有的解释这一现象的理论都需要对算法进行人工修改,例如在每次迭代中添加噪声,并添加一个平衡正则化器来平衡$\mathbf{U}$和$\mathbf{V}$。本文首先证明了随机初始化梯度下降收敛于多项式率的非对称低秩因子分解问题的全局最小值。为了证明这一点,我们发展了1)一种新的对称化技术来捕捉对称性和非对称性的大小;2)一种定量扰动分析来逼近矩阵导数。我们相信这两种方法对于其他相关的非凸问题都是有用的。 摘要:We study the asymmetric low-rank factorization problem: \[\min_{\mathbf{U} \in \mathbb{R}^{m \times d}, \mathbf{V} \in \mathbb{R}^{n \times d}} \frac{1}{2}\|\mathbf{U}\mathbf{V}^\top -\mathbf{\Sigma}\|_F^2\] where $\mathbf{\Sigma}$ is a given matrix of size $m \times n$ and rank $d$. This is a canonical problem that admits two difficulties in optimization: 1) non-convexity and 2) non-smoothness (due to unbalancedness of $\mathbf{U}$ and $\mathbf{V}$). This is also a prototype for more complex problems such as asymmetric matrix sensing and matrix completion. Despite being non-convex and non-smooth, it has been observed empirically that the randomly initialized gradient descent algorithm can solve this problem in polynomial time. Existing theories to explain this phenomenon all require artificial modifications of the algorithm, such as adding noise in each iteration and adding a balancing regularizer to balance the $\mathbf{U}$ and $\mathbf{V}$. This paper presents the first proof that shows randomly initialized gradient descent converges to a global minimum of the asymmetric low-rank factorization problem with a polynomial rate. For the proof, we develop 1) a new symmetrization technique to capture the magnitudes of the symmetry and asymmetry, and 2) a quantitative perturbation analysis to approximate matrix derivatives. We believe both are useful for other related non-convex problems.
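下面用numpy给出对目标 $\frac{1}{2}\|\mathbf{U}\mathbf{V}^\top-\mathbf{\Sigma}\|_F^2$ 做随机初始化梯度下降的最小示意(矩阵规模、步长与迭代次数均为示例设置,仅用于直观展示摘要所述的收敛现象,并非论文理论分析本身):不加噪声、也不加平衡正则项。

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, d = 30, 20, 3

# 构造一个秩为 d 的目标矩阵 Sigma(做了缩放以便用固定步长稳定收敛)
Sigma = rng.normal(size=(m, d)) @ rng.normal(size=(d, n)) / np.sqrt(m * n)

# 随机初始化
U = 0.1 * rng.normal(size=(m, d))
V = 0.1 * rng.normal(size=(n, d))
lr = 0.05

for t in range(5000):
    R = U @ V.T - Sigma          # 残差
    gU = R @ V                   # 目标对 U 的梯度
    gV = R.T @ U                 # 目标对 V 的梯度
    U -= lr * gU
    V -= lr * gV

print("最终损失 0.5*||UV^T - Sigma||_F^2 =", 0.5 * np.linalg.norm(U @ V.T - Sigma) ** 2)
```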
预测|估计(6篇)
【1】 Nonparametric estimation of continuous DPPs with kernel methods 标题:基于核方法的连续DPP的非参数估计
作者:Michaël Fanuel,Rémi Bardenet 备注:23 pages 链接:https://arxiv.org/abs/2106.14210 摘要:行列式点过程是排斥点模式的统计模型。抽样和推理都适用于DPPs,这是负相关模型中的一个罕见特征,解释了DPPs在机器学习和空间统计中的流行。提出了有限情况下的参数和非参数推理方法,即当点模式位于有限地集中时。在连续情况下,只研究了参数方法,而DPPs的非参数极大似然问题(迹类算子上的优化问题)仍然是一个悬而未决的问题。本文证明了这个极大似然(MLE)问题的一个限制形式在RKHS中非负函数的一个最近的表示中心定理的范围内。这导致了一个有限维问题,与原始极大似然估计有很强的统计联系。此外,我们提出、分析并证明了一个不动点算法来解决这个有限维问题。最后,我们还提供了DPP相关核的受控估计,从而提供了更高的可解释性。 摘要:Determinantal Point Process (DPPs) are statistical models for repulsive point patterns. Both sampling and inference are tractable for DPPs, a rare feature among models with negative dependence that explains their popularity in machine learning and spatial statistics. Parametric and nonparametric inference methods have been proposed in the finite case, i.e. when the point patterns live in a finite ground set. In the continuous case, only parametric methods have been investigated, while nonparametric maximum likelihood for DPPs -- an optimization problem over trace-class operators -- has remained an open question. In this paper, we show that a restricted version of this maximum likelihood (MLE) problem falls within the scope of a recent representer theorem for nonnegative functions in an RKHS. This leads to a finite-dimensional problem, with strong statistical ties to the original MLE. Moreover, we propose, analyze, and demonstrate a fixed point algorithm to solve this finite-dimensional problem. Finally, we also provide a controlled estimate of the correlation kernel of the DPP, thus providing more interpretability.
【2】 On a novel training algorithm for sequence-to-sequence predictive recurrent networks 标题:一种新的序列间预测递归网络训练算法研究
作者:Boris Rubinstein 机构:Stowers Institute for Medical Research, th St., Kansas City, MO , U.S.A. 备注:8 pages, 4 figures 链接:https://arxiv.org/abs/2106.14120 摘要:将序列映射到序列的神经网络(seq2seq)在机器翻译和语音识别方面取得了重大进展。他们的传统结构包括两个递归网络(RNs)和一个线性预测器。在本文中,我们对相应的算法进行了分析,并证明了经过良好训练的预测网络的RNs参数不是相互独立的。它们的依赖性可以用来显著提高网络的有效性。传统的seq2seq算法需要与预测序列长度成比例的短期内存。这一要求在神经科学的背景下很难实现。提出了一种新的seq2seq预测网络无记忆算法,并与传统的无记忆算法进行了比较。结果表明,与传统算法相比,新算法具有更高的鲁棒性和预测精度。 摘要:Neural networks mapping sequences to sequences (seq2seq) lead to significant progress in machine translation and speech recognition. Their traditional architecture includes two recurrent networks (RNs) followed by a linear predictor. In this manuscript we perform analysis of a corresponding algorithm and show that the parameters of the RNs of the well trained predictive network are not independent of each other. Their dependence can be used to significantly improve the network effectiveness. The traditional seq2seq algorithms require short term memory of a size proportional to the predicted sequence length. This requirement is quite difficult to implement in a neuroscience context. We present a novel memoryless algorithm for seq2seq predictive networks and compare it to the traditional one in the context of time series prediction. We show that the new algorithm is more robust and makes predictions with higher accuracy than the traditional one.
【3】 Solar Irradiation Forecasting using Genetic Algorithms 标题:基于遗传算法的太阳辐射预测
作者:V. Gunasekaran,K. K. Kovi,S. Arja,R. Chimata 机构:VzRAM Tech LLC, Lisle, Illinois, United States. 备注:9 pages, 4 figures 链接:https://arxiv.org/abs/2106.13956 摘要:由于可再生能源对电网的贡献不断增加,其预测变得越来越重要。太阳能是可再生能源最重要的贡献者之一,依赖于太阳辐射。为了对电网进行有效的管理,需要建立高精度的太阳辐射预测模型。本文采用线性回归、极值梯度增强和遗传算法优化等机器学习技术对太阳辐射进行预测。用于训练和验证的数据是从美国三个不同地理站记录的,这些地理站是SURFRAD网络的一部分。预测并比较了所建模型的全球水平指数(GHI)。将遗传算法优化应用于XGB,进一步提高了太阳辐射预测的精度。 摘要:Renewable energy forecasting is attaining greater importance due to its constant increase in contribution to the electrical power grids. Solar energy is one of the most significant contributors to renewable energy and is dependent on solar irradiation. For the effective management of electrical power grids, forecasting models that predict solar irradiation, with high accuracy, are needed. In the current study, Machine Learning techniques such as Linear Regression, Extreme Gradient Boosting and Genetic Algorithm Optimization are used to forecast solar irradiation. The data used for training and validation is recorded from across three different geographical stations in the United States that are part of the SURFRAD network. A Global Horizontal Index (GHI) is predicted for the models built and compared. Genetic Algorithm Optimization is applied to XGB to further improve the accuracy of solar irradiation prediction.
【4】 Recurrently Predicting Hypergraphs 标题:递归预测超图
作者:David W. Zhang,Gertjan J. Burghouts,Cees G. M. Snoek 机构:University of Amsterdam, TNO 链接:https://arxiv.org/abs/2106.13919 摘要:这项工作考虑为给定的顶点集预测超图的关系结构,这在粒子物理、生物系统和其他复杂组合问题的应用中很常见。问题在于,对于包含$n$个元素的集合,可能的多路关系(即超边)的数量以$\mathcal{O}(2^n)$的规模增长。对于中等大小的$n$,简单地为所有关系存储一个指示张量已经不可行,这促使以前的方法限制超边所连接的顶点数量。相反,我们提出了一个递归超图神经网络,它通过迭代细化解的初始猜测来预测关联矩阵。我们利用了大多数感兴趣的超图是稀疏连接的特性,将内存需求减少到$\mathcal{O}(nk)$,其中$k$是正边(即实际存在的边)的最大数目。为了抵消因细化步骤序列变长而线性增长的训练内存开销,我们进一步提出了一种在随机采样子序列上应用时间反向传播的算法。实验表明,我们的方法可以在内在复杂度增加时保持性能不下降,并且与最先进的模型相比表现出更优的性能。 摘要:This work considers predicting the relational structure of a hypergraph for a given set of vertices, as common for applications in particle physics, biological systems and other complex combinatorial problems. A problem arises from the number of possible multi-way relationships, or hyperedges, scaling in $\mathcal{O}(2^n)$ for a set of $n$ elements. Simply storing an indicator tensor for all relationships is already intractable for moderately sized $n$, prompting previous approaches to restrict the number of vertices a hyperedge connects. Instead, we propose a recurrent hypergraph neural network that predicts the incidence matrix by iteratively refining an initial guess of the solution. We leverage the property that most hypergraphs of interest are sparsely connected and reduce the memory requirement to $\mathcal{O}(nk)$, where $k$ is the maximum number of positive edges, i.e., edges that actually exist. In order to counteract the linearly growing memory cost from training a lengthening sequence of refinement steps, we further propose an algorithm that applies backpropagation through time on randomly sampled subsequences. We empirically show that our method can match an increase in the intrinsic complexity without a performance decrease and demonstrate superior performance compared to state-of-the-art models.
【5】 Predictive Control Using Learned State Space Models via Rolling Horizon Evolution 标题:基于滚动时域进化学习状态空间模型的预测控制
作者:Alvaro Ovalle,Simon M. Lucas 机构:School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom 备注:Accepted at the Bridging the Gap Between AI Planning and Reinforcement Learning (PRL) Workshop at ICAPS 2021 链接:https://arxiv.org/abs/2106.13911 摘要:基于模型的强化学习的兴趣很大一部分来自于获得一个能够进行战略性长期决策的前向模型的潜在效用。假设一个agent成功地学习了一个有用的预测模型,它仍然需要一种机制来利用它来生成和选择相互竞争的模拟计划。在本文中,我们将进化算法规划技术与通过深度学习和变分推理学习的模型相结合来探索这一主题。我们演示了一个代理的方法,该代理在一组视觉导航任务中可靠地执行在线规划。 摘要:A large part of the interest in model-based reinforcement learning derives from the potential utility to acquire a forward model capable of strategic long term decision making. Assuming that an agent succeeds in learning a useful predictive model, it still requires a mechanism to harness it to generate and select among competing simulated plans. In this paper, we explore this theme combining evolutionary algorithmic planning techniques with models learned via deep learning and variational inference. We demonstrate the approach with an agent that reliably performs online planning in a set of visual navigation tasks.
【6】 Improved Prediction and Network Estimation Using the Monotone Single Index Multi-variate Autoregressive Model 标题:基于单调单指标多元自回归模型的改进预测和网络估计
作者:Yue Gao,Garvesh Raskutti 机构:Department of Statistics, University of Wisconsin Madison, Madison, WI, USA 链接:https://arxiv.org/abs/2106.14630 摘要:多变量点过程或时间序列数据的网络估计是一个非常重要的问题。以前的工作主要集中在需要已知参数模型的参数方法上,这使得估计过程对模型误设、非线性和异质性的鲁棒性较差。本文提出了一种基于单调单指标多变量自回归模型(SIMAM)的半参数方法来应对这些挑战。我们为相依数据提供了理论保证,并给出了一个交替投影梯度下降算法。值得注意的是,我们没有显式地假设过程上的混合条件(尽管我们确实需要类似于受限强凸性的条件),并且我们达到了形如$O(T^{-\frac{1}{3}}\sqrt{s\log(TM)})$的速率(在独立设计情形下是最优的),其中$s$是表征网络最大入度的阈值(反映稀疏程度),$M$是参与者(actor)的数量,$T$是时间点的数量。此外,在模拟数据和两个实际数据的例子中,我们证明了SIMAM方法在预测和网络估计两方面都优于最先进的参数化方法。 摘要:Network estimation from multi-variate point process or time series data is a problem of fundamental importance. Prior work has focused on parametric approaches that require a known parametric model, which makes estimation procedures less robust to model mis-specification, non-linearities and heterogeneities. In this paper, we develop a semi-parametric approach based on the monotone single-index multi-variate autoregressive model (SIMAM) which addresses these challenges. We provide theoretical guarantees for dependent data and an alternating projected gradient descent algorithm. Significantly we do not explicitly assume mixing conditions on the process (although we do require conditions analogous to restricted strong convexity) and we achieve rates of the form $O(T^{-\frac{1}{3}} \sqrt{s\log(TM)})$ (optimal in the independent design case) where $s$ is the threshold for the maximum in-degree of the network that indicates the sparsity level, $M$ is the number of actors and $T$ is the number of time points. In addition, we demonstrate the superior performance both on simulated data and two real data examples where our SIMAM approach out-performs state-of-the-art parametric methods both in terms of prediction and network estimation.
其他神经网络|深度学习|模型|建模(28篇)
【1】 Multi-task curriculum learning in a complex, visual, hard-exploration domain: Minecraft 标题:复杂、视觉化、难探索领域中的多任务课程学习:《我的世界》(Minecraft)
作者:Ingmar Kanitscheider,Joost Huizinga,David Farhi,William Hebgen Guss,Brandon Houghton,Raul Sampedro,Peter Zhokhov,Bowen Baker,Adrien Ecoffet,Jie Tang,Oleg Klimov,Jeff Clune 机构:OpenAI 备注:first submission 链接:https://arxiv.org/abs/2106.14876 摘要:强化学习的一个重要挑战是训练能够解决各种任务的智能体。如果任务相互依赖(例如,在学习跑步之前需要先学会走路),课程学习可以通过关注下一个最佳学习任务来加快学习速度。我们在一个具有许多困难探索挑战的复杂视觉领域——Minecraft(《我的世界》)——中探索课程学习。我们发现,学习进度(定义为任务成功概率的变化)是自动构建有效课程的可学习性的可靠度量。我们介绍了一个基于学习进度的课程,并在一个复杂的强化学习问题(称为“Simon Says”)上进行了测试,其中智能体被指示去获取一个期望的目标物品。许多必要的技能相互依赖。实验表明:(1)针对获得新物品的回合内探索奖励可以提高性能;(2)在整个训练过程中动态调整这个奖励,使它只适用于智能体尚不能可靠获得的物品,可进一步提高性能;(3)基于学习进度的课程优雅地遵循智能体的学习曲线;(4)当基于学习进度的课程与动态探索奖励相结合时,它的学习效率更高,获得的性能远高于均匀采样基线。这些结果表明,将回合内与跨训练的探索奖励同学习进度相结合,是一种很有前景的自动课程生成方法,可能会大大提高我们训练能力更强、更加通用智能的智能体的能力。 摘要:An important challenge in reinforcement learning is training agents that can solve a wide variety of tasks. If tasks depend on each other (e.g. needing to learn to walk before learning to run), curriculum learning can speed up learning by focusing on the next best task to learn. We explore curriculum learning in a complex, visual domain with many hard exploration challenges: Minecraft. We find that learning progress (defined as a change in success probability of a task) is a reliable measure of learnability for automatically constructing an effective curriculum. We introduce a learning-progress based curriculum and test it on a complex reinforcement learning problem (called "Simon Says") where an agent is instructed to obtain a desired goal item. Many of the required skills depend on each other. Experiments demonstrate that: (1) a within-episode exploration bonus for obtaining new items improves performance, (2) dynamically adjusting this bonus across training such that it only applies to items the agent cannot reliably obtain yet further increases performance, (3) the learning-progress based curriculum elegantly follows the learning curve of the agent, and (4) when the learning-progress based curriculum is combined with the dynamic exploration bonus it learns much more efficiently and obtains far higher performance than uniform baselines. These results suggest that combining intra-episode and across-training exploration bonuses with learning progress creates a promising method for automated curriculum generation, which may substantially increase our ability to train more capable, generally intelligent agents.
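下面用一个玩具示例说明“以学习进度(任务成功概率的变化)作为可学习性度量来采样任务”的基本思路(任务名称、滑动窗口与采样方式均为示意性假设,并非OpenAI的原始实现):

```python
# Minimal sketch of a learning-progress-based curriculum (illustrative assumptions, not the paper's code).
import random
from collections import defaultdict, deque

class LearningProgressCurriculum:
    def __init__(self, tasks, window=50, eps=0.05):
        self.tasks = tasks
        self.eps = eps  # exploration floor so no task is starved
        self.history = {t: deque(maxlen=window) for t in tasks}

    def record(self, task, success):
        self.history[task].append(float(success))

    def _progress(self, task):
        h = list(self.history[task])
        if len(h) < 4:
            return 1.0  # unexplored tasks get high priority
        half = len(h) // 2
        old, new = sum(h[:half]) / half, sum(h[half:]) / (len(h) - half)
        return abs(new - old)  # change in success probability

    def sample(self):
        scores = [self._progress(t) + self.eps for t in self.tasks]
        total = sum(scores)
        return random.choices(self.tasks, weights=[s / total for s in scores])[0]

# Usage with a toy "agent" whose success probability slowly improves on each practiced task
curriculum = LearningProgressCurriculum(tasks=["get_wood", "make_planks", "craft_table"])
skill = defaultdict(float)
for step in range(500):
    task = curriculum.sample()
    success = random.random() < skill[task]
    skill[task] = min(1.0, skill[task] + 0.01)   # toy learning dynamics
    curriculum.record(task, success)
```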
【2】 Laplace Redux -- Effortless Bayesian Deep Learning 标题:Laplace Redux--轻松的贝叶斯深度学习
作者:Erik Daxberger,Agustinus Kristiadi,Alexander Immer,Runa Eschenhagen,Matthias Bauer,Philipp Hennig 机构:University of Cambridge, MPI for Intelligent Systems, Tübingen, University of Tübingen, ETH Zurich, Max Planck ETH CLS, DeepMind, London 备注:Source Code: this https URL; Library Documentation: this https URL 链接:https://arxiv.org/abs/2106.14806 摘要:深度学习的贝叶斯公式已被证明具有令人信服的理论特性,并提供实际的功能优势,如改进的预测不确定性量化和模型选择。拉普拉斯近似(LA)是一个经典的、可以说是最简单的一族方法,用于近似深层神经网络难以处理的后验。然而,尽管它很简单,LA并不像变分贝叶斯或深度集成(deep ensembles)等替代方法那样受欢迎。这可能是由于人们假设LA因涉及Hessian计算而昂贵、难以实现,或者效果较差。在这项工作中,我们表明这些是误解:我们(i)回顾了LA的各种变体,包括成本开销最小的版本;(ii)推出了“laplace”,一个易于使用的PyTorch软件库,为用户提供对LA所有主要变体的友好访问;以及(iii)通过大量实验证明,LA在性能方面与更流行的替代方案具有竞争力,同时在计算成本方面表现出色。我们希望这项工作能成为在实际深度学习中更广泛采用LA的催化剂,包括那些目前通常不考虑贝叶斯方法的领域。 摘要:Bayesian formulations of deep learning have been shown to have compelling theoretical properties and offer practical functional benefits, such as improved predictive uncertainty quantification and model selection. The Laplace approximation (LA) is a classic, and arguably the simplest family of approximations for the intractable posteriors of deep neural networks. Yet, despite its simplicity, the LA is not as popular as alternatives like variational Bayes or deep ensembles. This may be due to assumptions that the LA is expensive due to the involved Hessian computation, that it is difficult to implement, or that it yields inferior results. In this work we show that these are misconceptions: we (i) review the range of variants of the LA including versions with minimal cost overhead; (ii) introduce "laplace", an easy-to-use software library for PyTorch offering user-friendly access to all major flavors of the LA; and (iii) demonstrate through extensive experiments that the LA is competitive with more popular alternatives in terms of performance, while excelling in terms of computational cost. We hope that this work will serve as a catalyst to a wider adoption of the LA in practical deep learning, including in domains where Bayesian approaches are not typically considered at the moment.
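为帮助理解拉普拉斯近似的基本流程(MAP训练、曲率近似、后验采样预测),下面给出一个只作用于最后一层、采用对角经验Fisher近似的概念性PyTorch草图。注意这只是示意,并非文中“laplace”库的真实接口;网络、数据与先验精度均为假设。如摘要所述,文中的库对LA的各主要变体提供了更完善、开销更低的实现。

```python
# Conceptual sketch of a last-layer diagonal Laplace approximation (NOT the "laplace" library API).
import torch, torch.nn as nn, torch.nn.functional as F

torch.manual_seed(0)
X, y = torch.randn(256, 10), torch.randint(0, 3, (256,))
model = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 3))

# (1) MAP training
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    F.cross_entropy(model(X), y).backward()
    opt.step()

# (2) Diagonal curvature via per-sample squared gradients (empirical Fisher) on the last layer
last, prior_precision = model[-1], 1.0
diag_fisher = {n: torch.zeros_like(p) for n, p in last.named_parameters()}
for xi, yi in zip(X, y):
    model.zero_grad()
    F.cross_entropy(model(xi.unsqueeze(0)), yi.unsqueeze(0)).backward()
    for n, p in last.named_parameters():
        diag_fisher[n] += p.grad.detach() ** 2
posterior_var = {n: 1.0 / (f + prior_precision) for n, f in diag_fisher.items()}

# (3) Predict by sampling last-layer weights from the Gaussian posterior around the MAP estimate
@torch.no_grad()
def predict(x, n_samples=20):
    mean = {n: p.clone() for n, p in last.named_parameters()}
    probs = torch.zeros(x.shape[0], 3)
    for _ in range(n_samples):
        for n, p in last.named_parameters():
            p.copy_(mean[n] + posterior_var[n].sqrt() * torch.randn_like(p))
        probs += F.softmax(model(x), dim=-1)
    for n, p in last.named_parameters():
        p.copy_(mean[n])
    return probs / n_samples

print(predict(torch.randn(5, 10)))
```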
【3】 PhysiNet: A Combination of Physics-based Model and Neural Network Model for Digital Twins 标题:PhysiNet:结合基于物理的模型与神经网络模型的数字孪生
作者:Chao Sun,Victor Guang Shi 机构: Advanced Manufacturing Research Centre, University of Sheffield, Rotherham, UK 链接:https://arxiv.org/abs/2106.14790 摘要:作为物理系统或过程的实时数字对应物,数字孪生被用于系统仿真和优化。神经网络是利用数据建立数字孪生模型的一种方法,特别是当基于物理的模型不精确甚至不可用时。然而,对于一个新设计的系统来说,需要时间来积累足够的数据以建立神经网络模型,而此时只有一个近似的物理模型可用。为了充分利用这两种模型的优点,本文提出了一种将基于物理的模型和神经网络模型相结合的模型,以提高系统全生命周期的预测精度。该模型能够自动组合两种模型并提升其预测性能。实验表明,该混合模型的性能优于基于物理的模型和神经网络模型。 摘要:As the real-time digital counterpart of a physical system or process, digital twins are utilized for system simulation and optimization. Neural networks are one way to build a digital twins model by using data especially when a physics-based model is not accurate or even not available. However, for a newly designed system, it takes time to accumulate enough data for a neural network model and only an approximate physics-based model is available. To take advantage of both models, this paper proposed a model that combines the physics-based model and the neural network model to improve the prediction accuracy for the whole life cycle of a system. The proposed model was able to automatically combine the models and boost their prediction performance. Experiments showed that the proposed hybrid model outperformed both the physics-based model and the neural network model.
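下面是一个“近似物理模型 + 神经网络”组合预测的极简示意(物理模型、网络结构与可学习的混合权重均为我们的假设,并非论文中PhysiNet的确切结构):

```python
# Minimal sketch of combining an approximate physics model with a neural network (assumptions only).
import torch, torch.nn as nn

class HybridModel(nn.Module):
    def __init__(self, physics_fn, in_dim, hidden=32):
        super().__init__()
        self.physics_fn = physics_fn                  # approximate physics-based model
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.alpha = nn.Parameter(torch.tensor(0.0))  # learnable mixing logit

    def forward(self, x):
        w = torch.sigmoid(self.alpha)                 # weight can shift toward the NN as data accumulates
        return w * self.net(x) + (1 - w) * self.physics_fn(x)

# Toy example: true system y = 2*x0 + 0.5*x1^2; the physics model only knows the linear part
physics = lambda x: 2.0 * x[:, :1]
x = torch.randn(512, 2)
y = 2.0 * x[:, :1] + 0.5 * x[:, 1:2] ** 2

model = HybridModel(physics, in_dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(500):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
print("mixing weight:", torch.sigmoid(model.alpha).item(), "loss:", loss.item())
```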
【4】 HALF: Holistic Auto Machine Learning for FPGAs 标题:HALF:面向FPGA的整体自动机器学习
作者:Jonas Ney,Dominik Loroch,Vladimir Rybalkin,Nico Weber,Jens Krüger,Norbert Wehn 机构:∗ University of Kaiserslautern, Kaiserslautern, Germany, † Fraunhofer ITWM, Kaiserslautern, Germany 备注:Submitted at FPL2021. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible 链接:https://arxiv.org/abs/2106.14771 摘要:深度神经网络(Deep Neural Networks,DNNs)能够解决与嵌入式系统相关领域中的复杂问题,如图像和自然语言处理等。为了在特定的FPGA平台上按照给定的成本标准(如能源效率)高效地实现DNNs,从拓扑结构到最终的硬件实现都需要考虑大量的设计参数。不同设计层之间的相互依赖性必须得到有效的考虑和探索,这使得手动寻找优化的解决方案几乎不可能。一种自动化、整体化的设计方法可以显著提高DNN在FPGA上的实现质量。为此,我们提出了一种跨层设计空间探索方法。它包括从面向DNN的硬件感知拓扑搜索到针对给定FPGA平台的最终优化实现的各项优化。该方法在我们的整体自动机器学习框架HALF(Holistic Auto machine Learning for FPGAs)中实现,该框架结合了进化搜索算法、各种优化步骤和可参数化的硬件DNN模块库。HALF自动化了针对各种应用在目标FPGA平台上的探索过程以及优化方案的实现。我们在一个用于心律失常检测的医疗用例上展示了HALF在三个不同设计目标下的性能,分别是低能耗、低功耗和高吞吐量。我们的FPGA实现在吞吐量和能耗方面都优于Nvidia Jetson平台上经TensorRT优化的模型。 摘要:Deep Neural Networks (DNNs) are capable of solving complex problems in domains related to embedded systems, such as image and natural language processing. To efficiently implement DNNs on a specific FPGA platform for a given cost criterion, e.g. energy efficiency, an enormous amount of design parameters has to be considered from the topology down to the final hardware implementation. Interdependencies between the different design layers have to be taken into account and explored efficiently, making it hardly possible to find optimized solutions manually. An automatic, holistic design approach can improve the quality of DNN implementations on FPGA significantly. To this end, we present a cross-layer design space exploration methodology. It comprises optimizations starting from a hardware-aware topology search for DNNs down to the final optimized implementation for a given FPGA platform. The methodology is implemented in our Holistic Auto machine Learning for FPGAs (HALF) framework, which combines an evolutionary search algorithm, various optimization steps and a library of parametrizable hardware DNN modules. HALF automates both the exploration process and the implementation of optimized solutions on a target FPGA platform for various applications. We demonstrate the performance of HALF on a medical use case for arrhythmia detection for three different design goals, i.e. low-energy, low-power and high-throughput respectively. Our FPGA implementation outperforms a TensorRT optimized model on an Nvidia Jetson platform in both throughput and energy consumption.
【5】 Robust Learning-Augmented Caching: An Experimental Study 标题:健壮学习-增强缓存:一项实验研究
作者:Jakub Chłędowski,Adam Polak,Bartosz Szabucki,Konrad Zolna 机构:Jagiellonian University 备注:ICML 2021 链接:https://arxiv.org/abs/2106.14693 摘要:有效的缓存对现代计算系统的性能至关重要。缓存中出现的一个关键优化问题——逐出哪个项来为新项腾出空间——在不知道未来的情况下无法得到最佳解决。对于这个问题有许多经典的近似算法,但是最近的研究者开始成功地应用机器学习,通过发现隐式输入模式和预测未来来决定要逐出什么。虽然机器学习通常不提供任何最坏情况的保证,但新兴的学习增强算法(learning-augmented algorithms)领域提出了利用经典在线缓存算法使机器学习预测具有鲁棒性的解决方案。我们首次在真实世界缓存数据集和最先进的机器学习预测器上全面评估了这些学习增强算法。我们证明了一个简单的方法——盲目地跟随预测器或经典鲁棒算法之一,并在其中一个变得比另一个更差时进行切换——相对于表现良好的预测器只有很低的额外开销,而当所配合的预测器失效时又能与经典方法相竞争,从而提供了廉价的最坏情况保险。 摘要:Effective caching is crucial for the performance of modern-day computing systems. A key optimization problem arising in caching -- which item to evict to make room for a new item -- cannot be optimally solved without knowing the future. There are many classical approximation algorithms for this problem, but more recently researchers started to successfully apply machine learning to decide what to evict by discovering implicit input patterns and predicting the future. While machine learning typically does not provide any worst-case guarantees, the new field of learning-augmented algorithms proposes solutions that leverage classical online caching algorithms to make the machine-learned predictors robust. We are the first to comprehensively evaluate these learning-augmented algorithms on real-world caching datasets and state-of-the-art machine-learned predictors. We show that a straightforward method -- blindly following either a predictor or a classical robust algorithm, and switching whenever one becomes worse than the other -- has only a low overhead over a well-performing predictor, while competing with classical methods when the coupled predictor fails, thus providing a cheap worst-case insurance.
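下面用一个玩具缓存模拟来说明文中评估的简单元策略——“盲目跟随预测器或经典鲁棒算法之一,当其变得比另一个差时切换”(缓存大小、请求分布、预测器形式与切换规则均为示意性假设,并非论文的评测代码):

```python
# Toy simulation of the switch-between-predictor-and-LRU meta-policy; all details are illustrative assumptions.
from collections import OrderedDict
import random

def serve(cache, item, cache_size, evict_key):
    """Serve one request on `cache`; return 1 on a miss, 0 on a hit."""
    if item in cache:
        cache.move_to_end(item)
        return 0
    if len(cache) >= cache_size:
        cache.pop(evict_key(cache))
    cache[item] = True
    return 1

def lru_key(cache):
    return next(iter(cache))                       # least-recently-used item

def make_predictor_key(requests, t):
    future = requests[t + 1:]                      # stand-in for a learned reuse predictor
    def key(cache):
        def next_use(k):
            return future.index(k) if k in future else float("inf")
        return max(cache, key=next_use)            # evict the item reused furthest in the future
    return key

def run(requests, cache_size, policy):
    """policy in {'lru', 'pred', 'switch'}; 'switch' follows whichever shadow policy has fewer misses."""
    real, misses, follow = OrderedDict(), 0, "pred"
    shadow = {"lru": OrderedDict(), "pred": OrderedDict()}
    shadow_miss = {"lru": 0, "pred": 0}
    for t, item in enumerate(requests):
        pred_key = make_predictor_key(requests, t)
        for name, key_fn in (("lru", lru_key), ("pred", pred_key)):
            shadow_miss[name] += serve(shadow[name], item, cache_size, key_fn)
        other = "lru" if follow == "pred" else "pred"
        if policy == "switch" and shadow_miss[follow] > shadow_miss[other]:
            follow = other                          # switch when the followed policy falls behind
        active = policy if policy != "switch" else follow
        misses += serve(real, item, cache_size, lru_key if active == "lru" else pred_key)
    return misses

random.seed(0)
reqs = [random.randint(0, 30) for _ in range(400)]
for p in ("lru", "pred", "switch"):
    print(p, run(reqs, cache_size=10, policy=p))
```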
【6】 On Locality of Local Explanation Models 标题:关于局部解释模型的局部性
作者:Sahra Ghalebikesabi,Lucile Ter-Minassian,Karla Diaz-Ordaz,Chris Holmes 机构:The London School of Hygiene & Tropical Medicine & The Alan Turing Institute, University of Oxford & The Alan Turing Institute 备注:Submitted to NeurIPS 2021 链接:https://arxiv.org/abs/2106.14648 摘要:Shapley值通过模拟全局总体分布下的特征缺失,为特定实例的模型输出提供与模型无关的特征归因。当我们关心的是局部模型行为时,使用全局总体可能导致潜在的误导性结果。因此,我们考虑构造邻域参考分布,以提高Shapley值的局部可解释性。通过这样做,我们发现Nadaraya-Watson估计——一个研究得很透彻的核回归——可以表示为一个自归一化的重要性采样估计量。在经验上,我们观察到邻域Shapley值能识别有意义的、稀疏的特征相关性归因,从而提供对局部模型行为的洞察,是对传统Shapley分析的补充。它们还提高了流形上(on-manifold)的可解释性以及对对抗性分类器构造的鲁棒性。 摘要:Shapley values provide model agnostic feature attributions for model outcome at a particular instance by simulating feature absence under a global population distribution. The use of a global population can lead to potentially misleading results when local model behaviour is of interest. Hence we consider the formulation of neighbourhood reference distributions that improve the local interpretability of Shapley values. By doing so, we find that the Nadaraya-Watson estimator, a well-studied kernel regressor, can be expressed as a self-normalised importance sampling estimator. Empirically, we observe that Neighbourhood Shapley values identify meaningful sparse feature relevance attributions that provide insight into local model behaviour, complimenting conventional Shapley analysis. They also increase on-manifold explainability and robustness to the construction of adversarial classifiers.
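为直观理解Nadaraya-Watson估计与“自归一化重要性采样/邻域参考分布”的联系,下面给出一个核加权的极简示意(核函数与带宽均为假设):

```python
# Minimal sketch: Nadaraya-Watson weights as a self-normalised neighbourhood distribution (illustrative).
import numpy as np

def gaussian_kernel(x, x0, bandwidth=1.0):
    d2 = np.sum((x - x0) ** 2, axis=1)
    return np.exp(-d2 / (2 * bandwidth ** 2))

def nadaraya_watson(x_train, y_train, x0, bandwidth=1.0):
    w = gaussian_kernel(x_train, x0, bandwidth)
    w = w / w.sum()                      # self-normalised importance weights
    return np.dot(w, y_train), w         # local prediction and neighbourhood weights

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

x0 = np.array([0.5, -0.2])
pred, weights = nadaraya_watson(X, y, x0, bandwidth=0.5)
print("local prediction:", pred)

# The weights define a neighbourhood reference distribution around x0; sampling background
# points with these weights (instead of uniformly) localises Shapley-style feature attribution.
neighbourhood_sample = X[rng.choice(len(X), size=50, p=weights)]
```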
【7】 Expert Q-learning: Deep Q-learning With State Values From Expert Examples 标题:专家Q学习:利用专家示例的状态值进行深度Q学习
作者:Li Meng,Anis Yazidi,Morten Goodwin,Paal Engelstad 机构:University of Oslo, University of Agder, Oslo Metropolitan University 链接:https://arxiv.org/abs/2106.14642 摘要:提出了一种新的专家Q学习算法。专家Q-learning的灵感来源于决斗Q-learning,旨在将Q值分解为状态值和动作优势,将半监督学习的思想融入强化学习。与生成性对抗性模仿学习和演示中的深度Q学习不同,我们使用的离线专家只从{-1,0,1}预测一个状态的值,表明这是一个坏的、中立的还是好的状态。除了Q网络之外,还设计了一个专家网络,每当专家示例缓冲区不为空时,每次都会在常规离线小批量更新之后进行更新。我们的算法还保留了Q网络和专家网络的异步副本,使用与双Q学习相同的方式预测目标值。在Othello博弈中,我们将我们的算法与最新的Q学习算法进行了比较,该算法是双Q学习和决斗Q学习的结合。结果表明,专家Q-学习确实是有用的,更能抵抗Q-学习的高估偏差。基线Q-学习算法表现出不稳定和次优的行为,特别是在与随机玩家比赛时,而专家Q-学习则表现出更稳健的性能和更高的分数。不使用示例的专家Q学习在针对固定玩家进行训练和测试时,也获得了比基线算法更好的结果。另一方面,没有例子的专家Q学习算法在直接博弈中无法战胜基线Q学习算法,尽管它也显示出了减少高估偏差的力量。 摘要:We propose a novel algorithm named Expert Q-learning. Expert Q-learning was inspired by Dueling Q-learning and aimed at incorporating the ideas from semi-supervised learning into reinforcement learning through splitting Q-values into state values and action advantages. Different from Generative Adversarial Imitation Learning and Deep Q-Learning from Demonstrations, the offline expert we have used only predicts the value of a state from {-1, 0, 1}, indicating whether this is a bad, neutral or good state. An expert network is designed in addition to the Q-network and updated each time following the regular offline minibatch update whenever the expert example buffer is not empty. Our algorithm also keeps asynchronous copies of the Q-network and expert network, predicting the target values using the same manner as of Double Q-learning. We compared on the game of Othello our algorithm with the state-of-the-art Q-learning algorithm, which was a combination of Double Q-learning and Dueling Q-learning. The results showed that Expert Q-learning was indeed useful and more resistant to the overestimation bias of Q-learning. The baseline Q-learning algorithm exhibited unstable and suboptimal behavior, especially when playing against a stochastic player, whereas Expert Q-learning demonstrated more robust performance with higher scores. Expert Q-learning without using examples has also gained better results than the baseline algorithm when trained and tested against a fixed player. On the other hand, Expert Q-learning without examples cannot win against the baseline Q-learning algorithm in direct game competitions despite the fact that it has also shown the strength of reducing the overestimation bias.
【8】 Hyperbolic Busemann Learning with Ideal Prototypes 标题:具有理想原型的双曲Busemann学习
作者:Mina Ghadimi Atigh,Martin Keller-Ressel,Pascal Mettes 机构:University of Amsterdam, Technische Universität Dresden 链接:https://arxiv.org/abs/2106.14472 摘要:双曲空间已成为一种流行的流形表示学习的任意数据,从树状结构和文本的图形。基于欧氏空间和超球面空间中的原型深度学习的成功,最近的一些工作提出了双曲原型分类。这种方法能够在低维输出空间中进行有效的学习,并且可以利用类之间的层次关系,但是需要关于类标签的特权信息来定位双曲原型。在这项工作中,我们提出双曲布斯曼学习。我们的方法背后的主要思想是将原型放置在庞加莱球的理想边界上,这不需要先验的标签知识。为了能够计算接近理想原型,我们引入了惩罚布斯曼损失。我们提供了理论支持使用理想原型和拟议损失证明其等价于逻辑回归在一维的情况。从经验上看,我们的方法提供了一个自然的解释分类信心,同时优于最近的超球面和双曲原型方法。 摘要:Hyperbolic space has become a popular choice of manifold for representation learning of arbitrary data, from tree-like structures and text to graphs. Building on the success of deep learning with prototypes in Euclidean and hyperspherical spaces, a few recent works have proposed hyperbolic prototypes for classification. Such approaches enable effective learning in low-dimensional output spaces and can exploit hierarchical relations amongst classes, but require privileged information about class labels to position the hyperbolic prototypes. In this work, we propose Hyperbolic Busemann Learning. The main idea behind our approach is to position prototypes on the ideal boundary of the Poincare ball, which does not require prior label knowledge. To be able to compute proximities to ideal prototypes, we introduce the penalised Busemann loss. We provide theory supporting the use of ideal prototypes and the proposed loss by proving its equivalence to logistic regression in the one-dimensional case. Empirically, we show that our approach provides a natural interpretation of classification confidence, while outperforming recent hyperspherical and hyperbolic prototype approaches.
【9】 R-Drop: Regularized Dropout for Neural Networks 标题:R-Drop:神经网络的正则化Dropout
作者:Xiaobo Liang,Lijun Wu,Juntao Li,Yue Wang,Qi Meng,Tao Qin,Wei Chen,Min Zhang,Tie-Yan Liu 机构:Soochow University,Microsoft Research Asia 链接:https://arxiv.org/abs/2106.14448 摘要:Dropout是一种有效且广泛应用的深度神经网络训练正则化技术。本文在模型训练中提出了一种基于dropout的简单正则化策略,即R-Drop,它迫使由dropout产生的不同子模型的输出分布彼此保持一致。具体地说,对于每个训练样本,R-Drop最小化由dropout采样得到的两个子模型输出分布之间的双向KL散度。理论分析表明,R-Drop减少了模型参数的自由度,是对dropout的补充。在5个广泛使用的深度学习任务(共18个数据集)上的实验,包括神经机器翻译、抽象摘要、语言理解、语言建模和图像分类,表明R-Drop是普遍有效的。特别是,当应用于微调大规模预训练模型(例如ViT、RoBERTa-large和BART)时,它带来了实质性的改进,并在WMT14英语$\to$德语翻译(30.91 BLEU)和WMT14英语$\to$法语翻译(43.95 BLEU)上用原生Transformer模型取得了最先进的(SOTA)性能,甚至超越了使用超大规模数据训练的模型以及专家设计的Transformer高级变体。我们的代码可在 https://github.com/dropreg/R-Drop 获取。 摘要:Dropout is a powerful and widely used technique to regularize the training of deep neural networks. In this paper, we introduce a simple regularization strategy upon dropout in model training, namely R-Drop, which forces the output distributions of different sub-models generated by dropout to be consistent with each other. Specifically, for each training sample, R-Drop minimizes the bidirectional KL-divergence between the output distributions of two sub-models sampled by dropout. Theoretical analysis reveals that R-Drop reduces the freedom of the model parameters and complements dropout. Experiments on $\bf{5}$ widely used deep learning tasks ($\bf{18}$ datasets in total), including neural machine translation, abstractive summarization, language understanding, language modeling, and image classification, show that R-Drop is universally effective. In particular, it yields substantial improvements when applied to fine-tune large-scale pre-trained models, e.g., ViT, RoBERTa-large, and BART, and achieves state-of-the-art (SOTA) performances with the vanilla Transformer model on WMT14 English$\to$German translation ($\bf{30.91}$ BLEU) and WMT14 English$\to$French translation ($\bf{43.95}$ BLEU), even surpassing models trained with extra large-scale data and expert-designed advanced variants of Transformer models. Our code is available at GitHub: https://github.com/dropreg/R-Drop.
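R-Drop 的训练目标可以用几行代码表达:对同一批样本做两次带 dropout 的前向传播,在常规损失之外最小化两个输出分布之间的双向 KL 散度。下面是一个示意实现(模型结构与权重系数 alpha 为假设):

```python
# Sketch of the R-Drop training objective (two dropout forward passes + bidirectional KL).
import torch, torch.nn as nn, torch.nn.functional as F

def r_drop_loss(model, x, y, alpha=1.0):
    logits1 = model(x)            # the two forward passes use different dropout masks
    logits2 = model(x)
    ce = 0.5 * (F.cross_entropy(logits1, y) + F.cross_entropy(logits2, y))
    p1, p2 = F.log_softmax(logits1, dim=-1), F.log_softmax(logits2, dim=-1)
    kl = 0.5 * (F.kl_div(p1, p2, log_target=True, reduction="batchmean")
                + F.kl_div(p2, p1, log_target=True, reduction="batchmean"))
    return ce + alpha * kl

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Dropout(0.3), nn.Linear(64, 5))
x, y = torch.randn(32, 20), torch.randint(0, 5, (32,))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
model.train()
loss = r_drop_loss(model, x, y)
loss.backward()
opt.step()
print(loss.item())
```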
【10】 Co^2L: Contrastive Continual Learning 标题:Co^2L:对比持续学习
作者:Hyuntak Cha,Jaeho Lee,Jinwoo Shin 机构:Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea 备注:14 pages, 5 figures 链接:https://arxiv.org/abs/2106.14413 摘要:最近自监督学习方面的突破表明,这类算法学习到的视觉表征比依赖特定任务监督的联合训练方法学到的表征能够更好地迁移到未见过的任务。在本文中,我们发现在持续学习的情境中也存在类似现象:对比学习得到的表征比联合训练得到的表征更能抵抗灾难性遗忘。基于这一新的观察,我们提出了一种基于排练(rehearsal)的持续学习算法,该算法着重于持续学习并保持可迁移的表征。更具体地说,所提出的方案(1)使用对比学习目标学习表征,(2)使用自监督蒸馏步骤保留所学的表征。我们在流行的基准图像分类数据集上进行了广泛的实验验证,我们的方法取得了新的最先进性能。 摘要:Recent breakthroughs in self-supervised learning show that such algorithms learn visual representations that can be transferred better to unseen tasks than joint-training methods relying on task-specific supervision. In this paper, we found that a similar result holds in the continual learning context: contrastively learned representations are more robust against the catastrophic forgetting than jointly trained representations. Based on this novel observation, we propose a rehearsal-based continual learning algorithm that focuses on continually learning and maintaining transferable representations. More specifically, the proposed scheme (1) learns representations using the contrastive learning objective, and (2) preserves learned representations using a self-supervised distillation step. We conduct extensive experimental validations under popular benchmark image classification datasets, where our method sets the new state-of-the-art performance.
【11】 Integrating topic modeling and word embedding to characterize violent deaths 标题:整合主题建模和词语嵌入来刻画暴力死亡
作者:Alina Arseniev-Koehler,Susan D. Cochran,Vickie M. Mays,Kai-Wei Chang,Jacob Gates Foster 机构:Department of Sociology, University of California-Los Angeles, Los Angeles, CA ,; bDepartment of Epidemiology, UCLA Fielding School of Public Health and 链接:https://arxiv.org/abs/2106.14365 摘要:人们越来越需要在许多领域的文本数据中识别潜在模式的方法。本文提出了一种新的方法来识别语料库中的主题,并将文档表示为主题序列。话语原子主题建模借鉴了机器学习理论的进步,将主题建模和单词嵌入结合起来,充分利用了两者的不同功能。我们首先识别一组向量(“话语原子”),它们提供了嵌入空间的稀疏表示。原子向量可以解释为潜在的主题:通过生成模型,原子映射到词的分布上;人们还可以推断出产生一系列单词的主题。我们用一个未充分利用的文本的突出例子来说明我们的方法:美国国家暴力死亡报告系统(NVDRS)。NVDRS用结构化变量和非结构化叙述总结了暴力死亡事件。我们在叙述中确定了225个潜在的主题(例如,死亡准备和身体攻击);这些主题中的许多并没有被现有的结构化变量所捕获。基于已知的性别自杀和杀人模式,以及最近关于语义空间中性别偏见的研究,我们确定了主题的性别偏见(例如,关于止痛药的主题是女性的)。然后,我们比较了性别偏见的话题,他们的流行率在叙述中的女性与男性受害者。研究结果提供了有关致命暴力及其性别性质报告的详细定量图片。我们的方法为文本数据中的主题建模提供了一种灵活且广泛适用的方法。 摘要:There is an escalating need for methods to identify latent patterns in text data from many domains. We introduce a new method to identify topics in a corpus and represent documents as topic sequences. Discourse Atom Topic Modeling draws on advances in theoretical machine learning to integrate topic modeling and word embedding, capitalizing on the distinct capabilities of each. We first identify a set of vectors ("discourse atoms") that provide a sparse representation of an embedding space. Atom vectors can be interpreted as latent topics: Through a generative model, atoms map onto distributions over words; one can also infer the topic that generated a sequence of words. We illustrate our method with a prominent example of underutilized text: the U.S. National Violent Death Reporting System (NVDRS). The NVDRS summarizes violent death incidents with structured variables and unstructured narratives. We identify 225 latent topics in the narratives (e.g., preparation for death and physical aggression); many of these topics are not captured by existing structured variables. Motivated by known patterns in suicide and homicide by gender, and recent research on gender biases in semantic space, we identify the gender bias of our topics (e.g., a topic about pain medication is feminine). We then compare the gender bias of topics to their prevalence in narratives of female versus male victims. Results provide a detailed quantitative picture of reporting about lethal violence and its gendered nature. Our method offers a flexible and broadly applicable approach to model topics in text data.
【12】 Stabilizing Equilibrium Models by Jacobian Regularization 标题:用雅可比正则化稳定平衡模型
作者:Shaojie Bai,Vladlen Koltun,J. Zico Kolter 机构:Carnegie Mellon University 备注:ICML 2021 Short Oral 链接:https://arxiv.org/abs/2106.14342 摘要:深度平衡网络(DEQs)是一类新的模型,它摒弃了传统意义上的深度,转而求解单个非线性层的不动点。这些模型已被证明在使用明显更少内存的同时,实现了与最先进的深度网络相竞争的性能。然而,它们也比较慢,对架构选择较为敏感,并且会给模型带来潜在的不稳定性。本文提出了一种DEQ模型的正则化方法,它对不动点更新方程的雅可比矩阵进行显式正则化,以稳定平衡模型的学习。我们表明,这种正则化只增加了最小的计算成本,显著地稳定了前向和后向过程中的不动点收敛,并且可以很好地扩展到高维、真实的领域(例如WikiText-103语言建模和ImageNet分类)。使用这种方法,我们首次演示了一种隐式深度模型,它以与ResNet-101等流行的传统深度网络大致相同的速度和性能水平运行,同时仍然保持DEQ恒定的内存占用和架构上的简洁性。代码位于 https://github.com/locuslab/deq 。 摘要:Deep equilibrium networks (DEQs) are a new class of models that eschews traditional depth in favor of finding the fixed point of a single nonlinear layer. These models have been shown to achieve performance competitive with the state-of-the-art deep networks while using significantly less memory. Yet they are also slower, brittle to architectural choices, and introduce potential instability to the model. In this paper, we propose a regularization scheme for DEQ models that explicitly regularizes the Jacobian of the fixed-point update equations to stabilize the learning of equilibrium models. We show that this regularization adds only minimal computational cost, significantly stabilizes the fixed-point convergence in both forward and backward passes, and scales well to high-dimensional, realistic domains (e.g., WikiText-103 language modeling and ImageNet classification). Using this method, we demonstrate, for the first time, an implicit-depth model that runs with approximately the same speed and level of performance as popular conventional deep networks such as ResNet-101, while still maintaining the constant memory footprint and architectural simplicity of DEQs. Code is available at https://github.com/locuslab/deq .
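文中的关键思想是对不动点更新方程 $z \mapsto f(z,x)$ 的雅可比矩阵做显式正则化;下面用 Hutchinson 随机估计近似 $\|\partial f/\partial z\|_F^2$ 给出一个概念性示意(层结构与正则化系数均为假设,并非论文代码库的原始实现):

```python
# Sketch: estimating ||df/dz||_F^2 of a fixed-point layer with a Hutchinson-style estimator.
import torch, torch.nn as nn

class FixedPointLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lin_z = nn.Linear(dim, dim)
        self.lin_x = nn.Linear(dim, dim)
    def forward(self, z, x):
        return torch.tanh(self.lin_z(z) + self.lin_x(x))

def jacobian_frobenius_sq(f, z, x, n_samples=1):
    z = z.requires_grad_(True)
    fz = f(z, x)
    est = 0.0
    for _ in range(n_samples):
        v = torch.randn_like(fz)
        # vector-Jacobian product v^T (df/dz); E||v^T J||^2 = ||J||_F^2 for v ~ N(0, I)
        vjp, = torch.autograd.grad(fz, z, grad_outputs=v, create_graph=True, retain_graph=True)
        est = est + (vjp ** 2).sum() / n_samples
    return est

dim, batch = 16, 8
layer = FixedPointLayer(dim)
x = torch.randn(batch, dim)
z = torch.zeros(batch, dim)
task_loss = layer(z, x).pow(2).mean()          # placeholder task loss
jac_reg = jacobian_frobenius_sq(layer, z, x)   # Jacobian regularization term
loss = task_loss + 0.1 * jac_reg               # regularized objective (coefficient is an assumption)
loss.backward()
print(float(jac_reg))
```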
【13】 Legendre Deep Neural Network (LDNN) and its application for approximation of nonlinear Volterra Fredholm Hammerstein integral equations 标题:勒让德深度神经网络(LDNN)及其在非线性Volterra Fredholm Hammerstein积分方程逼近中的应用
作者:Zeinab Hajimohammadi,Kourosh Parand,Ali Ghodsi 机构:Department of Computer and Data Sciences, Shahid Beheshti University, Tehran, Iran, Institute for Cognitive and Brain Sciences, Shahid Beheshti University, Tehran, Iran, Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada 链接:https://arxiv.org/abs/2106.14320 摘要:生物学、物理学和工程学中的各种现象都是用微分方程来模拟的。这些微分方程包括偏微分方程和常微分方程,都可以转化为积分方程。特别是Volterra-Fredholm-Hammerstein积分方程是这些积分方程的主要类型,研究人员对研究和求解这些方程很感兴趣。本文提出了求解非线性Volterra-Fredholm-Hammerstein积分方程的Legendre深度神经网络(LDNN)。LDNN利用勒让德正交多项式作为深层结构的激活函数。我们提出了如何使用LDNN来解决非线性VFHIEs。我们使用高斯求积配置法结合LDNN的结果,给出了一个新的非线性VFHIEs的数值解。最后给出了几个算例来验证LDNN的性能和精度。 摘要:Various phenomena in biology, physics, and engineering are modeled by differential equations. These differential equations including partial differential equations and ordinary differential equations can be converted and represented as integral equations. In particular, Volterra Fredholm Hammerstein integral equations are the main type of these integral equations and researchers are interested in investigating and solving these equations. In this paper, we propose Legendre Deep Neural Network (LDNN) for solving nonlinear Volterra Fredholm Hammerstein integral equations (VFHIEs). LDNN utilizes Legendre orthogonal polynomials as activation functions of the Deep structure. We present how LDNN can be used to solve nonlinear VFHIEs. We show using the Gaussian quadrature collocation method in combination with LDNN results in a novel numerical solution for nonlinear VFHIEs. Several examples are given to verify the performance and accuracy of LDNN.
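LDNN 的核心是把勒让德正交多项式用作深层结构中的激活;下面按三项递推公式 $(n+1)P_{n+1}(x)=(2n+1)xP_n(x)-nP_{n-1}(x)$ 计算勒让德特征并接入线性层,给出一个示意(具体的网络拼接方式为我们的假设,并非论文的确切结构):

```python
# Sketch: Legendre polynomial features via the three-term recurrence, used as layer activations.
import torch, torch.nn as nn

def legendre_features(x, max_degree):
    # Squash inputs to [-1, 1], the natural domain of Legendre polynomials
    x = torch.tanh(x)
    polys = [torch.ones_like(x), x]
    for n in range(1, max_degree):
        polys.append(((2 * n + 1) * x * polys[n] - n * polys[n - 1]) / (n + 1))
    return torch.cat(polys[: max_degree + 1], dim=-1)

class LegendreBlock(nn.Module):
    def __init__(self, in_dim, out_dim, degree=4):
        super().__init__()
        self.degree = degree
        self.linear = nn.Linear(in_dim * (degree + 1), out_dim)
    def forward(self, x):
        return self.linear(legendre_features(x, self.degree))

net = nn.Sequential(LegendreBlock(1, 16), LegendreBlock(16, 1))
x = torch.linspace(-1, 1, 64).unsqueeze(-1)
print(net(x).shape)
```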
【14】 Pairing Conceptual Modeling with Machine Learning 标题:将概念建模与机器学习配对
作者:Wolfgang Maass,Veda C. Storey 机构:German Research Center for Artificial Intelligence (DFKI), Saarbrücken, Germany, Saarland University, Saarland Informatics Campus, Saarbrücken, Germany, J. Mack Robinson College of Business, Georgia State University 备注:None 链接:https://arxiv.org/abs/2106.14251 摘要:概念建模和机器学习一直被认为是重要的研究领域。随着越来越重视为商业和其他应用数字化和处理大量数据,考虑这些研究领域如何相互补充将是有益的。为了理解它们如何配对,我们提供了机器学习基础和开发周期的概述。然后,我们研究了如何将概念建模应用于机器学习,并提出了一个将概念建模纳入数据科学项目的框架。通过将该框架应用于一个医疗保健应用程序来说明该框架。对于逆配对,机器学习可以通过文本和规则挖掘以及知识图来影响概念建模。以这种方式将概念建模与机器学习结合起来,为以后的研究打下基础。 摘要:Both conceptual modeling and machine learning have long been recognized as important areas of research. With the increasing emphasis on digitizing and processing large amounts of data for business and other applications, it would be helpful to consider how these areas of research can complement each other. To understand how they can be paired, we provide an overview of machine learning foundations and development cycle. We then examine how conceptual modeling can be applied to machine learning and propose a framework for incorporating conceptual modeling into data science projects. The framework is illustrated by applying it to a healthcare application. For the inverse pairing, machine learning can impact conceptual modeling through text and rule mining, as well as knowledge graphs. The pairing of conceptual modeling and machine learning in this this way should help lay the foundations for future research.
【15】 Learning to solve geometric construction problems from images 标题:学习从图像中解决几何构造问题
作者:J. Macke,J. Sedlar,M. Olsak,J. Urban,J. Sivic 机构:Charles University in Prague, Czech Republic, Czech Technical University in Prague, University of Innsbruck 备注:16 pages, 7 figures, 3 tables 链接:https://arxiv.org/abs/2106.14195 摘要:我们描述了一种纯粹基于图像的方法,用于在Euclidea几何游戏中寻找尺规作图构造。该方法改造了最先进的Mask R-CNN图像处理神经网络架构,并在其上加入了基于树的搜索过程。在有监督的设定下,该方法从Euclidea的前六个关卡包中学习求解全部68种几何作图问题,平均准确率为92%。当在新类型的问题上进行评估时,该方法可以解决68类Euclidea问题中的31类。我们相信,这是首次有纯基于图像的学习方法被训练来求解这种难度的几何作图问题。 摘要:We describe a purely image-based method for finding geometric constructions with a ruler and compass in the Euclidea geometric game. The method is based on adapting the Mask R-CNN state-of-the-art image processing neural architecture and adding a tree-based search procedure to it. In a supervised setting, the method learns to solve all 68 kinds of geometric construction problems from the first six level packs of Euclidea with an average 92% accuracy. When evaluated on new kinds of problems, the method can solve 31 of the 68 kinds of Euclidea problems. We believe that this is the first time that a purely image-based learning has been trained to solve geometric construction problems of this difficulty.
【16】 Mitigating severe over-parameterization in deep convolutional neural networks through forced feature abstraction and compression with an entropy-based heuristic 标题:通过基于熵的启发式强制特征提取和压缩来缓解深度卷积神经网络中的严重过度参数化
作者:Nidhi Gowdra,Roopak Sinha,Stephen MacDonell,Wei Qi Yan 机构:School of Engineering, Computer and Mathematical Sciences, Auckland University of Technology, New Zealand 备注:None 链接:https://arxiv.org/abs/2106.14190 摘要:卷积神经网络(CNNs)如ResNet-50、DenseNet-40和ResNeXt-56严重参数化,因此需要增加模型训练所需的计算资源,模型训练随着模型深度的增加呈指数级扩展。本文提出了一种基于熵的卷积层估计(EBCLE)启发式算法,该算法鲁棒性强、简单,但有效地解决了CNN模型中网络深度的过参数化问题。EBCLE启发式算法利用输入数据集熵数据分布的先验知识来确定卷积网络深度的上界,在这个上界之外,身份转换对于提高模型性能的贡献很小。通过强制特征压缩和抽象来限制深度冗余限制了过度参数化,同时减少了24.99%-78.59%的训练时间,而不会降低模型性能。我们提出了经验证据来强调使用EBCLE启发式训练的更广泛但更浅的模型的相对有效性,该启发式保持或优于更窄但更深的模型的基线分类精度。EBCLE启发式是架构不可知的,基于EBCLE的CNN模型限制了深度冗余,从而提高了可用计算资源的利用率。提出的EBCLE启发式是一种令人信服的技术,研究人员分析证明他们的超参数(HP)选择CNNs。在5个基准数据集(ImageNet32、CIFAR-10/100、STL-10、MNIST)和4个网络体系结构(DenseNet、ResNet、ResNeXt和EfficientNet B0-B2)上建立了训练CNN模型的EBCLE启发式的经验验证,并采用适当的统计检验来推断本文提出的任何结论性主张。 摘要:Convolutional Neural Networks (CNNs) such as ResNet-50, DenseNet-40 and ResNeXt-56 are severely over-parameterized, necessitating a consequent increase in the computational resources required for model training which scales exponentially for increments in model depth. In this paper, we propose an Entropy-Based Convolutional Layer Estimation (EBCLE) heuristic which is robust and simple, yet effective in resolving the problem of over-parameterization with regards to network depth of CNN model. The EBCLE heuristic employs a priori knowledge of the entropic data distribution of input datasets to determine an upper bound for convolutional network depth, beyond which identity transformations are prevalent offering insignificant contributions for enhancing model performance. Restricting depth redundancies by forcing feature compression and abstraction restricts over-parameterization while decreasing training time by 24.99% - 78.59% without degradation in model performance. We present empirical evidence to emphasize the relative effectiveness of broader, yet shallower models trained using the EBCLE heuristic, which maintains or outperforms baseline classification accuracies of narrower yet deeper models. The EBCLE heuristic is architecturally agnostic and EBCLE based CNN models restrict depth redundancies resulting in enhanced utilization of the available computational resources. The proposed EBCLE heuristic is a compelling technique for researchers to analytically justify their HyperParameter (HP) choices for CNNs. Empirical validation of the EBCLE heuristic in training CNN models was established on five benchmarking datasets (ImageNet32, CIFAR-10/100, STL-10, MNIST) and four network architectures (DenseNet, ResNet, ResNeXt and EfficientNet B0-B2) with appropriate statistical tests employed to infer any conclusive claims presented in this paper.
【17】 Model-assisted Learning-based Framework for Sensor Fault-Tolerant Building HVAC Control 标题:基于模型辅助学习的传感器容错楼宇暖通空调控制框架
作者:Shichao Xu,Yangyang Fu,Yixuan Wang,Zheng O'Neill,Qi Zhu 机构:Northwestern University, Evanston, USA, Texas A&M University, College Station, USA, th Zheng O’Neill 链接:https://arxiv.org/abs/2106.14144 摘要:由于人们87%的时间都在室内,建筑物中的智能供暖、通风和空调(HVAC)系统对于保持居住者的舒适性和降低能耗至关重要。现代智能建筑中的暖通空调系统依赖于传感器的实时读数,在实际应用中,传感器经常会出现各种故障,并且容易受到恶意攻击。这种错误的传感器输入可能会导致违反室内环境要求(如温度、湿度等)和能源消耗的增加。虽然文献中提出了许多基于模型的方法用于建筑暖通空调控制,但开发精确的物理模型以确保其性能的成本很高,而且解决传感器故障的影响更具挑战性。在这项工作中,我们提出了一个新的基于学习的传感器容错暖通空调控制框架,其中包括三个基于深度学习的组件:1)考虑可能的传感器故障,生成温度方案;2)基于精度评估选择一个方案,在选定的温度方案中应用强化学习。此外,为了解决建筑相关任务中训练数据不足的问题,我们提出了一种利用建筑物理力学抽象模型的模型辅助学习方法。通过大量的数值实验,我们证明了所提出的容错暖通空调控制框架可以在保持能源效率的同时,显著降低各种传感器故障模式下的建筑温度违规。 摘要:As people spend up to 87% of their time indoors, intelligent Heating, Ventilation, and Air Conditioning (HVAC) systems in buildings are essential for maintaining occupant comfort and reducing energy consumption. Those HVAC systems in modern smart buildings rely on real-time sensor readings, which in practice often suffer from various faults and could also be vulnerable to malicious attacks. Such faulty sensor inputs may lead to the violation of indoor environment requirements (e.g., temperature, humidity, etc.) and the increase of energy consumption. While many model-based approaches have been proposed in the literature for building HVAC control, it is costly to develop accurate physical models for ensuring their performance and even more challenging to address the impact of sensor faults. In this work, we present a novel learning-based framework for sensor fault-tolerant HVAC control, which includes three deep learning based components for 1) generating temperature proposals with the consideration of possible sensor faults, 2) selecting one of the proposals based on the assessment of their accuracy, and 3) applying reinforcement learning with the selected temperature proposal. Moreover, to address the challenge of training data insufficiency in building-related tasks, we propose a model-assisted learning method leveraging an abstract model of building physical dynamics. Through extensive numerical experiments, we demonstrate that the proposed fault-tolerant HVAC control framework can significantly reduce building temperature violations under a variety of sensor fault patterns while maintaining energy efficiency.
【18】 PhyCRNet: Physics-informed Convolutional-Recurrent Network for Solving Spatiotemporal PDEs 标题:PhyCRNet:求解时空偏微分方程的物理信息卷积-递归网络
作者:Pu Ren,Chengping Rao,Yang Liu,Jianxun Wang,Hao Sun 机构:Department of Civil and Environmental Engineering, Northeastern University, Boston, MA , USA, Department of Mechanical and Industrial Engineering, Northeastern University, Boston, MA , USA 备注:22 pages 链接:https://arxiv.org/abs/2106.14103 摘要:偏微分方程(PDEs)在建模和仿真问题中起着基础性的作用。深度学习的最新进展显示了物理信息神经网络(PINNs)求解偏微分方程的巨大潜力,它可以作为数据驱动建模和逆分析的基础。然而,大多数现有的基于全连接神经网络的PINN方法本质上局限于低维时空参数化。此外,由于初始/边界条件(I/BCs)是通过惩罚项软施加的,解的质量在很大程度上依赖于超参数调节。为此,我们提出了一种新的物理信息卷积-循环学习结构(PhyCRNet和PhyCRNet-s),用于在没有任何标注数据的情况下求解偏微分方程。具体而言,我们提出了一种用于低维空间特征提取和时间演化学习的编码器-解码器卷积长短时记忆网络。损失函数被定义为聚合的离散化PDE残差,而I/BCs在网络中被硬编码以确保其被强制满足(例如,周期性边界填充)。网络还通过显式模拟时间推进的自回归和残差连接得到进一步增强。我们通过求解三个非线性偏微分方程(二维Burgers方程、$\lambda$-$\omega$与FitzHugh-Nagumo反应扩散方程)评估了所提方法的性能,并与最先进的基线算法进行了比较。数值结果表明,该方法在求解精度、外推能力和可推广性方面具有优越性。 摘要:Partial differential equations (PDEs) play a fundamental role in modeling and simulating problems across a wide range of disciplines. Recent advances in deep learning have shown the great potential of physics-informed neural networks (PINNs) to solve PDEs as a basis for data-driven modeling and inverse analysis. However, the majority of existing PINN methods, based on fully-connected NNs, pose intrinsic limitations to low-dimensional spatiotemporal parameterizations. Moreover, since the initial/boundary conditions (I/BCs) are softly imposed via penalty, the solution quality heavily relies on hyperparameter tuning. To this end, we propose the novel physics-informed convolutional-recurrent learning architectures (PhyCRNet and PhyCRNet-s) for solving PDEs without any labeled data. Specifically, an encoder-decoder convolutional long short-term memory network is proposed for low-dimensional spatial feature extraction and temporal evolution learning. The loss function is defined as the aggregated discretized PDE residuals, while the I/BCs are hard-encoded in the network to ensure forcible satisfaction (e.g., periodic boundary padding). The networks are further enhanced by autoregressive and residual connections that explicitly simulate time marching. The performance of our proposed methods has been assessed by solving three nonlinear PDEs (e.g., 2D Burgers' equations, the $\lambda$-$\omega$ and FitzHugh-Nagumo reaction-diffusion equations), and compared against the state-of-the-art baseline algorithms. The numerical results demonstrate the superiority of our proposed methodology in the context of solution accuracy, extrapolability and generalizability.
【19】 Accelerating Recurrent Neural Networks for Gravitational Wave Experiments 标题:加速递归神经网络在引力波实验中的应用
作者:Zhiqiang Que,Erwei Wang,Umar Marikar,Eric Moreno,Jennifer Ngadiuba,Hamza Javed,Bartłomiej Borzyszkowski,Thea Aarrestad,Vladimir Loncar,Sioni Summers,Maurizio Pierini,Peter Y Cheung,Wayne Luk 机构:† California Institute of Technology, Pasadena, CA, USA, ‡ European Organization for Nuclear Research (CERN), Geneva, Switzerland 备注:Accepted at the 2021 32nd IEEE International Conference on Application-specific Systems, Architectures and Processors (ASAP) 链接:https://arxiv.org/abs/2106.14089 摘要:提出了一种新的可重构结构,用于减少用于探测引力波的递归神经网络(RNN)的延迟。像LIGO探测器这样的引力干涉仪捕捉宇宙事件,例如黑洞合并,这些事件发生在未知时间,持续时间不同,产生时间序列数据。我们已经开发了一种新的架构,能够加速RNN推理来分析LIGO探测器的时间序列数据。该体系结构基于优化多层LSTM(Long-Short-Term-Memory)网络中的启动间隔(II),为每一层确定适当的重用因子。为该体系结构设计了一个可定制的模板,该模板使用高级综合工具生成具有高效资源利用率的低延迟FPGA设计。该方法基于两个LSTM模型,分别针对zynq7045fpga和U250 FPGA进行了评估。实验结果表明,在实现相同的IIs的同时,采用平衡II可以使dsp的数目减少42%。与其他基于FPGA的LSTM设计相比,我们的设计可以实现大约4.92到12.4倍的低延迟。 摘要:This paper presents novel reconfigurable architectures for reducing the latency of recurrent neural networks (RNNs) that are used for detecting gravitational waves. Gravitational interferometers such as the LIGO detectors capture cosmic events such as black hole mergers which happen at unknown times and of varying durations, producing time-series data. We have developed a new architecture capable of accelerating RNN inference for analyzing time-series data from LIGO detectors. This architecture is based on optimizing the initiation intervals (II) in a multi-layer LSTM (Long Short-Term Memory) network, by identifying appropriate reuse factors for each layer. A customizable template for this architecture has been designed, which enables the generation of low-latency FPGA designs with efficient resource utilization using high-level synthesis tools. The proposed approach has been evaluated based on two LSTM models, targeting a ZYNQ 7045 FPGA and a U250 FPGA. Experimental results show that with balanced II, the number of DSPs can be reduced up to 42% while achieving the same IIs. When compared to other FPGA-based LSTM designs, our design can achieve about 4.92 to 12.4 times lower latency.
【20】 Continual Learning via Inter-Task Synaptic Mapping 标题:基于任务间突触映射的持续学习
作者:Mao Fubing,Weng Weiwei,Mahardhika Pratama,Edward Yapp Kien Yee 机构:Kien Yeec, National Engineering Research Center for Big Data Technology and System, Services, Computing Technology and System Lab, Cluster and Grid Computing Lab, School of, Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 备注:None 链接:https://arxiv.org/abs/2106.13954 摘要:从流任务中学习会导致一个模型灾难性地抹去它从以前的片段中吸收的独特经验。虽然LWF、SI、EWC等正则化技术通过限制旧任务的重要参数在接受新概念时的变化,证明了它们是克服这一问题的有效途径,但这些方法没有利用每个任务中可以共享给现有神经元的公共信息。因此,由于参数重要性变量迅速爆炸,它们不能很好地扩展到大规模问题。本文提出了一种任务间突触映射(ISYANA)方法来支撑持续学习中的知识保留。ISYANA结合了任务与神经元的关系以及概念与概念的关系,从而防止一个神经元在仅接受相关概念的同时容纳互不相关的概念。我们在基准持续学习问题上进行了数值研究,并与主流的持续学习算法进行了比较。ISYANA的性能与最先进方法相比具有竞争力。ISYANA的代码可在 https://github.com/ContinualAL/ISYANAKBS 获取。 摘要:Learning from streaming tasks leads a model to catastrophically erase unique experiences it absorbs from previous episodes. While regularization techniques such as LWF, SI, EWC have proven themselves as an effective avenue to overcome this issue by constraining important parameters of old tasks from changing when accepting new concepts, these approaches do not exploit common information of each task which can be shared to existing neurons. As a result, they do not scale well to large-scale problems since the parameter importance variables quickly explode. An Inter-Task Synaptic Mapping (ISYANA) is proposed here to underpin knowledge retention for continual learning. ISYANA combines task-to-neuron relationship as well as concept-to-concept relationship such that it prevents a neuron to embrace distinct concepts while merely accepting relevant concept. Numerical study in the benchmark continual learning problems has been carried out followed by comparison against prominent continual learning algorithms. ISYANA exhibits competitive performance compared to the state of the art. Code for ISYANA is made available at https://github.com/ContinualAL/ISYANAKBS.
【21】 A multi-stage machine learning model on diagnosis of esophageal manometry 标题:食管测压诊断的多阶段机器学习模型
作者:Wenjun Kou,Dustin A. Carlson,Alexandra J. Baumann,Erica N. Donnan,Jacob M. Schauer,Mozziyar Etemadi,John E. Pandolfino 链接:https://arxiv.org/abs/2106.13869 摘要:高分辨率测压(HRM)是诊断食管动力障碍的主要方法。它的解读和分类包括先对吞咽水平的结果进行初步评估,然后基于芝加哥分类(CC)使用树状算法推导研究水平的诊断。这种基于HRM诊断动力障碍的方法通过一个多阶段建模框架得到复现,该框架由多种机器学习方法组合构建。具体而言,该框架在吞咽水平阶段使用深度学习模型,在研究水平阶段使用基于特征的机器学习模型。在吞咽水平阶段,建立了三个基于卷积神经网络(CNNs)的模型来预测吞咽类型、吞咽加压和综合松弛压力(IRP)。在研究水平阶段,对基于专家知识的规则模型、XGBoost模型和人工神经网络(ANN)模型家族进行了模型选择,其中后两类模型的设计与增强借鉴了专家知识。利用贝叶斯原理,提出了一种简单的、与模型无关的模型平衡策略,通过精度得分加权得到模型平均。对平均(混合)模型和单个模型进行了比较和评价,其中在测试数据集上,前1预测的最佳性能为0.81,前2预测的最佳性能为0.92。这是首个能从原始多次吞咽数据自动预测HRM检查的CC诊断的人工智能模型。此外,所提出的建模框架可以很容易地扩展到多模态任务,例如基于HRM和功能性管腔成像探头全景测量(FLIP)的临床数据对食管患者进行诊断。 摘要:High-resolution manometry (HRM) is the primary procedure used to diagnose esophageal motility disorders. Its interpretation and classification includes an initial evaluation of swallow-level outcomes and then derivation of a study-level diagnosis based on Chicago Classification (CC), using a tree-like algorithm. This diagnostic approach on motility disordered using HRM was mirrored using a multi-stage modeling framework developed using a combination of various machine learning approaches. Specifically, the framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage. In the swallow-level stage, three models based on convolutional neural networks (CNNs) were developed to predict swallow type, swallow pressurization, and integrated relaxation pressure (IRP). At the study-level stage, model selection from families of the expert-knowledge-based rule models, xgboost models and artificial neural network (ANN) models were conducted, with the latter two models designed and augmented with motivation from the expert knowledge. A simple model-agnostic strategy of model balancing motivated by Bayesian principles was utilized, which gave rise to model averaging weighted by precision scores. The averaged (blended) models and individual models were compared and evaluated, of which the best performance on test dataset is 0.81 in top-1 prediction, 0.92 in top-2 predictions. This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data. Moreover, the proposed modeling framework could be easily extended to multi-modal tasks, such as diagnosis of esophageal patients based on clinical data from both HRM and functional luminal imaging probe panometry (FLIP).
【22】 POLAR: A Polynomial Arithmetic Framework for Verifying Neural-Network Controlled Systems 标题:POLAR:一种验证神经网络控制系统的多项式算法框架
作者:Chao Huang,Jiameng Fan,Xin Chen,Wenchao Li,Qi Zhu 机构:University of Liverpool, Boston University, University of Dayton, Boston University, Northwestern University 链接:https://arxiv.org/abs/2106.13867 摘要:本文提出了POLAR,一种基于带区间余项的多项式过逼近、用于神经网络控制系统(NNCSs)有界时间可达性分析的多项式算术框架。与现有的基于标准Taylor模型的算术方法相比,该框架采用一种新方法,结合连续激活函数的Bernstein多项式插值与其他运算的Taylor模型算术,逐层迭代地过逼近神经元的输出范围。该方法克服了标准Taylor模型算术的主要缺点,即无法处理不能被Taylor多项式很好逼近的函数,并显著提高了NNCSs可达状态计算的精度和效率。为了进一步收紧过逼近,该方法在估计神经网络输出范围时令Taylor模型余项在线性映射下保持符号形式。我们证明了POLAR可以与现有的Taylor模型流管(flowpipe)构造技术无缝集成,并证明POLAR在一系列基准测试上显著优于当前最先进的技术。 摘要:We propose POLAR, a polynomial arithmetic framework that leverages polynomial overapproximations with interval remainders for bounded-time reachability analysis of neural network-controlled systems (NNCSs). Compared with existing arithmetic approaches that use standard Taylor models, our framework uses a novel approach to iteratively overapproximate the neuron output ranges layer-by-layer with a combination of Bernstein polynomial interpolation for continuous activation functions and Taylor model arithmetic for the other operations. This approach can overcome the main drawback in the standard Taylor model arithmetic, i.e. its inability to handle functions that cannot be well approximated by Taylor polynomials, and significantly improve the accuracy and efficiency of reachable states computation for NNCSs. To further tighten the overapproximation, our method keeps the Taylor model remainders symbolic under the linear mappings when estimating the output range of a neural network. We show that POLAR can be seamlessly integrated with existing Taylor model flowpipe construction techniques, and demonstrate that POLAR significantly outperforms the current state-of-the-art techniques on a suite of benchmarks.
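POLAR 用 Bernstein 多项式插值来过逼近连续激活函数;下面给出在区间 $[a,b]$ 上用 Bernstein 多项式逼近 tanh 并计算经验余项的极简示意(阶数与区间为假设,并非该工具的真实接口):

```python
# Sketch: Bernstein polynomial approximation of a continuous activation on [a, b].
import numpy as np
from math import comb

def bernstein_approx(f, a, b, degree):
    """Return p(x) approximating f(x) on [a, b], built from the Bernstein basis."""
    nodes = [f(a + (b - a) * k / degree) for k in range(degree + 1)]
    def p(x):
        t = (np.asarray(x) - a) / (b - a)
        return sum(nodes[k] * comb(degree, k) * t**k * (1 - t)**(degree - k)
                   for k in range(degree + 1))
    return p

a, b, degree = -2.0, 2.0, 6
p = bernstein_approx(np.tanh, a, b, degree)
xs = np.linspace(a, b, 1001)
remainder = np.max(np.abs(np.tanh(xs) - p(xs)))   # empirical interval-remainder bound
print("max |tanh - p| on [a, b]:", remainder)
```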
【23】 Ladder Polynomial Neural Networks 标题:梯形多项式神经网络
作者:Li-Ping Liu,Ruiyuan Gu,Xiaozhe Hu 机构:Department of Computer Science, Tufts University; Department of Mathematics 备注:The work has been first submitted to ICLR 2019 (submission link). Unfortunately the contribution was not sufficiently appreciated by reviewers 链接:https://arxiv.org/abs/2106.13834 摘要:多项式函数具有许多有用的分析性质,但由于其函数类被认为是受限的,因此很少用作学习模型。这项工作表明,经过适当训练的多项式函数可以成为很强的学习模型。特别地,本文利用乘积激活函数(一种由乘法构造的新激活函数)构造了多项式前馈神经网络。新的神经网络是一个多项式函数,并可以精确控制其多项式阶数。它可以通过标准的训练技术进行训练,如批量归一化(batch normalization)和dropout。这种新的前馈网络涵盖了以前的若干多项式模型作为特例。与普通前馈神经网络相比,多项式前馈网络对若干有趣的量具有闭式计算,这在贝叶斯学习中非常有用。在一系列回归和分类任务的实证研究中,提出的模型优于以往的多项式模型。 摘要:Polynomial functions have plenty of useful analytical properties, but they are rarely used as learning models because their function class is considered to be restricted. This work shows that when trained properly polynomial functions can be strong learning models. Particularly this work constructs polynomial feedforward neural networks using the product activation, a new activation function constructed from multiplications. The new neural network is a polynomial function and provides accurate control of its polynomial order. It can be trained by standard training techniques such as batch normalization and dropout. This new feedforward network covers several previous polynomial models as special cases. Compared with common feedforward neural networks, the polynomial feedforward network has closed-form calculations of a few interesting quantities, which are very useful in Bayesian learning. In a series of regression and classification tasks in the empirical study, the proposed model outperforms previous polynomial models.
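下面给出“乘积激活”多项式网络的一种可能实现:每一层把上一层输出的线性变换与原始输入的线性变换逐元素相乘,从而使整个网络是输入的多项式、且多项式阶数随层数精确增长。这只是我们基于摘要的假设性示意,并非论文的确切结构:

```python
# One plausible sketch of a "ladder" polynomial network with product activations (an assumption, not the paper's exact architecture).
import torch, torch.nn as nn

class ProductLayer(nn.Module):
    """h_{l+1} = (W h_l) * (U x): the elementwise product raises the polynomial degree in x by one."""
    def __init__(self, in_dim, hidden_dim):
        super().__init__()
        self.W = nn.Linear(hidden_dim, hidden_dim)
        self.U = nn.Linear(in_dim, hidden_dim)
    def forward(self, h, x):
        return self.W(h) * self.U(x)

class LadderPolyNet(nn.Module):
    def __init__(self, in_dim, hidden_dim, out_dim, n_layers):
        super().__init__()
        self.first = nn.Linear(in_dim, hidden_dim)           # degree-1 start
        self.layers = nn.ModuleList([ProductLayer(in_dim, hidden_dim) for _ in range(n_layers)])
        self.out = nn.Linear(hidden_dim, out_dim)
    def forward(self, x):
        h = self.first(x)
        for layer in self.layers:                             # each product layer adds one to the degree
            h = layer(h, x)
        return self.out(h)

net = LadderPolyNet(in_dim=3, hidden_dim=32, out_dim=1, n_layers=3)   # overall degree <= 4 polynomial in x
print(net(torch.randn(8, 3)).shape)
```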
【24】 Dynamic Planning and Learning under Recovering Rewards 标题:报酬回收下的动态规划与学习
作者:David Simchi-Levi,Zeyu Zheng,Feng Zhu 机构:Institute for Data, USA; Department of Industrial Engineering and Operations Research, University of California 备注:Accepted by ICML 2021 链接:https://arxiv.org/abs/2106.14813 摘要:基于实时直播电商、促销和推荐等新兴应用,我们引入了一类具有以下两个特点的多臂老虎机(multi-armed bandit)问题:(i)决策者在每个时间段最多可以从$N$个不同的臂中拉动$K$个并收取其奖励;(ii)一个臂被拉动后,其期望奖励立即下降,然后随着空闲时间的增加以非参数方式恢复。以$T$个时间段内期望累积奖励最大化为目标,我们提出并构造了一类“纯周期策略”,并证明了其性能保证。对于所有模型参数都已知的离线问题,我们提出的策略达到$1-\mathcal{O}(1/\sqrt{K})$量级的近似比,当$K$趋于无穷大时渐近最优。针对模型参数未知且需要学习的在线问题,我们设计了一种基于置信上界(UCB)的策略,该策略相对于离线基准大约具有$\widetilde{\mathcal{O}}(N\sqrt{T})$的遗憾。我们的框架和策略设计有可能被推广到其他具有非平稳且可恢复奖励的离线规划和在线学习应用中。 摘要:Motivated by emerging applications such as live-streaming e-commerce, promotions and recommendations, we introduce a general class of multi-armed bandit problems that have the following two features: (i) the decision maker can pull and collect rewards from at most $K$ out of $N$ different arms in each time period; (ii) the expected reward of an arm immediately drops after it is pulled, and then non parametrically recovers as the idle time increases. With the objective of maximizing expected cumulative rewards over $T$ time periods, we propose, construct and prove performance guarantees for a class of "Purely Periodic Policies". For the offline problem when all model parameters are known, our proposed policy obtains an approximation ratio that is at the order of $1-\mathcal{O}(1/\sqrt{K})$, which is asymptotically optimal when $K$ grows to infinity. For the online problem when the model parameters are unknown and need to be learned, we design an Upper Confidence Bound (UCB) based policy that approximately has $\widetilde{\mathcal{O}}(N\sqrt{T})$ regret against the offline benchmark. Our framework and policy design may have the potential to be adapted into other offline planning and online learning applications with non-stationary and recovering rewards.
【25】 Polyconvex anisotropic hyperelasticity with neural networks 标题:基于神经网络的多凸各向异性超弹性
作者:Dominik Klein,Mauricio Fernández,Robert J. Martin,Patrizio Neff,Oliver Weeger 机构:Cyber-Physical Simulation Group, Department of Mechanical Engineering & Centre for Computational, Engineering, Technical University of Darmstadt, Dolivostr. , Darmstadt, Germany, Access e.V., Intzestr. , Aachen, Germany 链接:https://arxiv.org/abs/2106.14623 摘要:本文提出了两种基于机器学习的有限变形本构模型。该模型采用输入凸神经网络,具有超弹性、各向异性和满足多凸性条件,具有椭圆性,保证了材料的稳定性。第一个本构模型基于一组多凸、各向异性和目标不变量。第二种方法是根据变形梯度及其辅因子和行列式,采用群对称化来满足材料对称性条件,通过数据扩充来近似实现客观性。数据扩充方法的数据集扩展是基于力学考虑,不需要额外的实验或模拟数据。这些模型用立方晶格超材料的模拟数据进行了校正,包括有限变形和晶格不稳定性。根据实验研究中常用的变形,使用了适量的标定数据。基于不变量的模型在几种变形模式下都存在缺陷,而仅基于变形梯度的模型能够很好地再现和预测材料的有效行为,具有良好的泛化能力。因此,特别是第二个模型提出了一个高度灵活的本构建模方法,这导致了一个数学上适定的问题。 摘要:In the present work, two machine learning based constitutive models for finite deformations are proposed. Using input convex neural networks, the models are hyperelastic, anisotropic and fulfill the polyconvexity condition, which implies ellipticity and thus ensures material stability. The first constitutive model is based on a set of polyconvex, anisotropic and objective invariants. The second approach is formulated in terms of the deformation gradient, its cofactor and determinant, uses group symmetrization to fulfill the material symmetry condition, and data augmentation to fulfill objectivity approximately. The extension of the dataset for the data augmentation approach is based on mechanical considerations and does not require additional experimental or simulation data. The models are calibrated with highly challenging simulation data of cubic lattice metamaterials, including finite deformations and lattice instabilities. A moderate amount of calibration data is used, based on deformations which are commonly applied in experimental investigations. While the invariant-based model shows drawbacks for several deformation modes, the model based on the deformation gradient alone is able to reproduce and predict the effective material behavior very well and exhibits excellent generalization capabilities. Thus, in particular the second model presents a highly flexible constitutive modeling approach, that leads to a mathematically well-posed problem.
【26】 Towards Model-informed Precision Dosing with Expert-in-the-loop Machine Learning 标题:基于专家在环机器学习的模型信息精确配药
作者:Yihuang Kang,Yi-Wen Chiu,Ming-Yen Lin,Fang-yi Su,Sheng-Tai Huang 机构:Department of Information Management, National Sun Yat-sen University, Kaohsiung, Taiwan, Division of Nephrology, Department of Internal Medicine, Kaohsiung Medical University Hospital, fangyi 链接:https://arxiv.org/abs/2106.14384 摘要:机器学习(ML)及其应用已经改变了我们的生活,但它也创造了与公平、负责、透明和道德人工智能发展相关的问题。由于ML模型还不能完全理解,很明显,我们仍然需要人类参与算法决策过程。在本文中,我们考虑了一个ML框架,它可以通过将人类专家加入到模型学习循环中来加速模型学习并提高模型的可解释性。针对数据标注成本高、缺乏合适的数据来建立目标任务与输入特征之间的关联模型的学习问题,提出了一种新的人在回路ML框架。实验结果表明,该方法可以从数据中学习可解释的规则,并可以用规则表示编辑代替数据注释,从而降低专家的工作量。该方法还可以通过在迭代模型学习过程中引入专家反馈来消除算法偏差。 摘要:Machine Learning (ML) and its applications have been transforming our lives but it is also creating issues related to the development of fair, accountable, transparent, and ethical Artificial Intelligence. As the ML models are not fully comprehensible yet, it is obvious that we still need humans to be part of algorithmic decision-making processes. In this paper, we consider a ML framework that may accelerate model learning and improve its interpretability by incorporating human experts into the model learning loop. We propose a novel human-in-the-loop ML framework aimed at dealing with learning problems that the cost of data annotation is high and the lack of appropriate data to model the association between the target tasks and the input features. With an application to precision dosing, our experimental results show that the approach can learn interpretable rules from data and may potentially lower experts' workload by replacing data annotation with rule representation editing. The approach may also help remove algorithmic bias by introducing experts' feedback into the iterative model learning process.
【27】 Use of Machine Learning Technique to maximize the signal over background for $H \rightarrow \tau\tau$ 标题:利用机器学习技术最大化 $H \rightarrow \tau\tau$ 过程的信号对背景比
作者:Kanhaiya Gupta 机构:Physikalisches Institut, Universität Bonn, Nussallee, Bonn, Germany 备注:6 pages, 12 figures 链接:https://arxiv.org/abs/2106.14257 摘要:近年来,人工神经网络(ANNs)在模式识别和机器学习领域赢得了众多的竞争。人工神经网络已被应用于从语音识别到蛋白质二级结构预测、癌症分类和基因预测等领域。在这里,我们打算利用机器学习技术将记录的事件分类为信号或背景,最大限度地提高在伪数据集中发现希格斯玻色子衰变为两个 $\tau$ 轻子的机会。 摘要:In recent years, artificial neural networks (ANNs) have won numerous contests in pattern recognition and machine learning. ANNs have been applied to problems ranging from speech recognition to prediction of protein secondary structure, classification of cancers, and gene prediction. Here, we intend to maximize the chances of finding the Higgs boson decays to two $\tau$ leptons in the pseudo dataset using a Machine Learning technique to classify the recorded events as signal or background.
【28】 The mbsts package: Multivariate Bayesian Structural Time Series Models in R 标题:MBSTS软件包:R中的多变量贝叶斯结构时间序列模型
作者:Ning Ning,Jinwen Qiu 链接:https://arxiv.org/abs/2106.14045 摘要:多元贝叶斯结构时间序列(MBSTS)模型\citep{qiu2018multivariate,Jammalamadaka2019Predicting}作为许多结构时间序列模型的推广版本,处理多个相关时间序列的推断和预测,其中还可以选择对每个目标序列使用不同的同期预测候选池。MBSTS模型具有广泛的应用,是特征选择、时间序列预测、即时预报、因果影响推断等的理想模型。本文介绍了如何使用R包\pkg{mbsts}进行MBSTS建模,在包中的用户友好函数和开发人员友好函数之间建立桥梁,并给出了相应的方法。对\pkg{mbsts}包中的模拟数据集和面向对象函数进行了说明,使用户能够灵活地添加或删除某些组件,并简化或复杂化某些设置。 摘要:The multivariate Bayesian structural time series (MBSTS) model \citep{qiu2018multivariate,Jammalamadaka2019Predicting} as a generalized version of many structural time series models, deals with inference and prediction for multiple correlated time series, where one also has the choice of using a different candidate pool of contemporaneous predictors for each target series. The MBSTS model has wide applications and is ideal for feature selection, time series forecasting, nowcasting, inferring causal impact, and others. This paper demonstrates how to use the R package \pkg{mbsts} for MBSTS modeling, establishing a bridge between user-friendly and developer-friendly functions in the package and the corresponding methodology. A simulated dataset and object-oriented functions in the \pkg{mbsts} package are explained in the way that enables users to flexibly add or deduct some components, as well as to simplify or complicate some settings.
其他(24篇)
【1】 Doing good by fighting fraud: Ethical anti-fraud systems for mobile payments 标题:通过打击欺诈做好事:移动支付的道德反欺诈系统
作者:Zainul Abi Din,Hari Venugopalan,Henry Lin,Adam Wushensky,Steven Liu,Samuel T. King 机构:∗ University of California, Davis, † Bouncer Technologies 链接:https://arxiv.org/abs/2106.14861 摘要:应用程序开发者通常使用安全挑战(一种逐步验证的形式)来为他们的应用程序增加安全性。然而,这类架构的伦理意涵之前还没有被研究过。在本文中,我们对现有的反欺诈安全挑战Boxer在移动设备上真实应用中的运行情况进行了大规模测量研究。我们发现,虽然Boxer总体上工作得很好,但它无法在以低于每秒一帧(FPS)的速度运行其机器学习模型的设备上进行有效扫描,从而阻碍了使用廉价设备的用户。根据我们的研究成果,我们设计了一个新的反欺诈系统Daredevil,用于扫描支付卡,该系统在现代移动设备上广泛的性能特征和硬件配置下都能良好工作。与Boxer相比,Daredevil将运行速度低于1 FPS的设备数量减少了一个数量级,为打击欺诈提供了一个更公平的系统。我们总共收集了5085444个真实设备的数据,这些设备分布在496个运行生产软件并与真实用户交互的真实应用程序中。 摘要:App builders commonly use security challenges, a form of step-up authentication, to add security to their apps. However, the ethical implications of this type of architecture have not been studied previously. In this paper, we present a large-scale measurement study of running an existing anti-fraud security challenge, Boxer, in real apps running on mobile devices. We find that although Boxer does work well overall, it is unable to scan effectively on devices that run its machine learning models at less than one frame per second (FPS), blocking users who use inexpensive devices. With the insights from our study, we design Daredevil, a new anti-fraud system for scanning payment cards that works well across the broad range of performance characteristics and hardware configurations found on modern mobile devices. Daredevil reduces the number of devices that run at less than one FPS by an order of magnitude compared to Boxer, providing a more equitable system for fighting fraud. In total, we collect data from 5,085,444 real devices spread across 496 real apps running production software and interacting with real users.
【2】 Virtual Agents in Live Coding: A Short Review 标题:实时编码中的虚拟代理:简评
作者:Anna Xambó 备注:Preprint version submitted to eContact! (this https URL) for the special issue 21.1 - Take Back the Stage: Live coding, live audiovisual, laptop orchestra 链接:https://arxiv.org/abs/2106.14835 摘要:人工智能和实时编码很少被探索。本文简要回顾了在实时编码实践中使用虚拟代理的不同观点,回顾了过去和现在,并指出了未来的发展方向。 摘要:AI and live coding has been little explored. This article contributes with a short review of different perspectives of using virtual agents in the practice of live coding looking at past and present as well as pointing to future directions.
【3】 Multi-objective Evolutionary Approach for Efficient Kernel Size and Shape for CNN 标题:细胞神经网络有效核大小和形状的多目标进化方法
作者:Ziwei Wang,Martin A. Trefzer,Simon J. Bale,Andy M. Tyrrell 机构: Department of Electronic Engineering, University ofYork 备注:13 pages paper, plus 17 papers supplementary materials 链接:https://arxiv.org/abs/2106.14776 摘要:虽然CNN拓扑结构的最新发展,如VGGNet和ResNet,已经变得越来越精确,但是这些网络的计算成本很高,涉及数十亿的算术运算和参数。为了提高分类精度,现有的cnn通常包含大量复杂的卷积层。然而,对于某些应用,例如物联网(IoT),这些CNN将在资源受限的平台上实现,CNN架构必须小而高效。为了解决这一问题,降低卷积层的资源消耗成为最重要的解决方案之一。本文提出了一种多目标优化方法,利用多目标进化算法(MOEAs)在计算量和网络精度之间进行折衷。卷积核的数目和大小与CNNs的计算资源消耗成正比。因此,本文考虑通过减少卷积层核的大小和数目来优化计算资源消耗。此外,还研究了非常规核形状的使用,结果表明这些核形状明显优于常用的平方卷积核。因此,本文的主要贡献是一种基于非传统核形状的方法来显著降低CNNs的计算成本,并为特定用例提供不同的权衡。实验结果进一步表明,该方法在不降低网络性能的前提下,极大地提高了资源消耗。与基准CNN相比,在CIFAR-10数据集上,最佳折衷结构的乘法减少了6倍,分类精度略有提高。 摘要:While state-of-the-art development in CNN topology, such as VGGNet and ResNet, have become increasingly accurate, these networks are computationally expensive involving billions of arithmetic operations and parameters. To improve the classification accuracy, state-of-the-art CNNs usually involve large and complex convolutional layers. However, for certain applications, e.g. Internet of Things (IoT), where such CNNs are to be implemented on resource-constrained platforms, the CNN architectures have to be small and efficient. To deal with this problem, reducing the resource consumption in convolutional layers has become one of the most significant solutions. In this work, a multi-objective optimisation approach is proposed to trade-off between the amount of computation and network accuracy by using Multi-Objective Evolutionary Algorithms (MOEAs). The number of convolution kernels and the size of these kernels are proportional to computational resource consumption of CNNs. Therefore, this paper considers optimising the computational resource consumption by reducing the size and number of kernels in convolutional layers. Additionally, the use of unconventional kernel shapes has been investigated and results show these clearly outperform the commonly used square convolution kernels. The main contributions of this paper are therefore a methodology to significantly reduce computational cost of CNNs, based on unconventional kernel shapes, and provide different trade-offs for specific use cases. The experimental results further demonstrate that the proposed method achieves large improvements in resource consumption with no significant reduction in network performance. Compared with the benchmark CNN, the best trade-off architecture shows a reduction in multiplications of up to 6X and with slight increase in classification accuracy on CIFAR-10 dataset.
【4】 Using Issues to Explain Legal Decisions 标题:用问题来解释法律决定
作者:Trevor Bench-Capon 机构:University of Liverpool 备注:Presented at the XAILA workshop 2021 链接:https://arxiv.org/abs/2106.14688 摘要:由于需要解释机器学习系统的输出以预测法律案例的结果,人们对传统人工智能和法律系统提供的解释产生了新的兴趣,特别是那些使用基于因素的推理和判例的系统。在这篇论文中,我们考虑我们应该期望从这些系统中得到什么样的解释,特别关注在案例中使用问题所能提供的结构。 摘要:The need to explain the output from Machine Learning systems designed to predict the outcomes of legal cases has led to a renewed interest in the explanations offered by traditional AI and Law systems, especially those using factor based reasoning and precedent cases. In this paper we consider what sort of explanations we should expect from such systems, with a particular focus on the structure that can be provided by the use of issues in cases.
【5】 Evolutionary Dynamics and Φ-Regret Minimization in Games 标题:进化动力学与博弈中的Φ-后悔最小化
作者:Georgios Piliouras,Mark Rowland,Shayegan Omidshafiei,Romuald Elie,Daniel Hennes,Jerome Connor,Karl Tuyls 机构:SUTD, DeepMind 链接:https://arxiv.org/abs/2106.14668 摘要:后悔(regret)是在线学习中的一个基本概念,在博弈中学习动态的分析中也有重要的应用。后悔量化了学习者的表现与事后发现的基线之间的差异。众所周知,遗憾最小化算法收敛于博弈中的某类平衡点;然而,博弈论中使用的传统后悔形式主要考虑允许偏离确定性行动或策略的基线。在本文中,我们从完全混合策略空间(即纯策略上的概率分布)的划分偏差的角度,在先前建立的$\Phi$-后悔框架的镜头下,重新审视我们对后悔的理解,该框架提供了更强后悔度量的连续统。重要的是,$\Phi$-后悔使学习代理能够考虑到与混合策略之间的偏差,概括了几种现有的后悔概念,如外部后悔、内部后悔和交换后悔,从而拓宽了基于后悔的学习算法分析所获得的见解。本文证明了在一般的$2\times 2$博弈中,经过充分研究的复制子动力学(RD)进化学习算法无缝地最小化了$\Phi$-遗憾的最强可能形式,而不需要对底层算法本身进行任何修改。随后,我们在一组144个$2\times 2$博弈中进行了实验,验证了我们的理论结果,其中RD表现出一系列不同的行为。最后,我们提供了RD在一些大型博弈中使$\Phi$-遗憾最小化的经验证据,暗示了从理论和经验的角度对此类算法进行基于$\Phi$-遗憾的研究的进一步机会。 摘要:Regret has been established as a foundational concept in online learning, and likewise has important applications in the analysis of learning dynamics in games. Regret quantifies the difference between a learner's performance against a baseline in hindsight. It is well-known that regret-minimizing algorithms converge to certain classes of equilibria in games; however, traditional forms of regret used in game theory predominantly consider baselines that permit deviations to deterministic actions or strategies. In this paper, we revisit our understanding of regret from the perspective of deviations over partitions of the full \emph{mixed} strategy space (i.e., probability distributions over pure strategies), under the lens of the previously-established $\Phi$-regret framework, which provides a continuum of stronger regret measures. Importantly, $\Phi$-regret enables learning agents to consider deviations from and to mixed strategies, generalizing several existing notions of regret such as external, internal, and swap regret, and thus broadening the insights gained from regret-based analysis of learning algorithms. We prove here that the well-studied evolutionary learning algorithm of replicator dynamics (RD) seamlessly minimizes the strongest possible form of $\Phi$-regret in generic $2 \times 2$ games, without any modification of the underlying algorithm itself. We subsequently conduct experiments validating our theoretical results in a suite of 144 $2 \times 2$ games wherein RD exhibits a diverse set of behaviors. We conclude by providing empirical evidence of $\Phi$-regret minimization by RD in some larger games, hinting at further opportunity for $\Phi$-regret based study of such algorithms from both a theoretical and empirical perspective.
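下面是一个简短的复制子动力学数值示意(假设性示例,仅演示该学习动态本身,不涉及论文的$\Phi$-后悔分析),以一个$2\times 2$博弈中的行玩家为例。

import numpy as np

# Discretized replicator dynamics for the row player of a 2x2 game (illustrative).
A = np.array([[1.0, -1.0], [-1.0, 1.0]])    # row player's payoff matrix (Matching Pennies)
x = np.array([0.7, 0.3])                     # row player's mixed strategy
y = np.array([0.4, 0.6])                     # column player's strategy, held fixed here
dt = 0.01
for _ in range(1000):
    payoffs = A @ y                          # expected payoff of each pure action
    avg = x @ payoffs                        # average payoff under the current mix
    x = x + dt * x * (payoffs - avg)         # replicator update
print(x)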
【6】 Timestamping Documents and Beliefs 标题:为文档和信仰添加时间戳
作者:Swayambhu Nath Ray 机构:Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India 备注:Master's Report 链接:https://arxiv.org/abs/2106.14622 摘要:我们可以获得的大部分文本信息都是时变的。在一个信息是动态的世界里,给它们加上时间戳是一项非常重要的任务。文档是一个很好的信息源,用于许多任务,如情感分析、评论分类等。文档创建日期的知识有助于完成一些任务,如摘要、事件提取、临时信息提取等。不幸的是,对于web上的大多数文档,时间戳元数据错误或丢失。因此,文档断代是一个具有挑战性的问题,它需要对文档的时间结构以及文档的上下文信息进行推理。以前的文件日期系统主要依赖于手工制作的功能,而忽略了这样的文件内部结构。本文提出了一种基于图卷积网络(GCN)的文档定年方法NeuralDater,它以一种有原则的方式联合利用文档的句法和时态图结构。我们还指出了NeuralDater的一些局限性,并试图以一种更灵活、更直观的方式利用文档中的上下文和时间信息,提出了一种基于注意的文档日期系统AD3:attentient Deep Document Dater。据我们所知,这些都是首次应用深度学习方法的任务。通过对真实世界数据集的大量实验,我们发现我们的模型显著优于最新的基线。 摘要:Most of the textual information available to us are temporally variable. In a world where information is dynamic, time-stamping them is a very important task. Documents are a good source of information and are used for many tasks like, sentiment analysis, classification of reviews etc. The knowledge of creation date of documents facilitates several tasks like summarization, event extraction, temporally focused information extraction etc. Unfortunately, for most of the documents on the web, the time-stamp meta-data is either erroneous or missing. Thus document dating is a challenging problem which requires inference over the temporal structure of the document alongside the contextual information of the document. Prior document dating systems have largely relied on handcrafted features while ignoring such document-internal structures. In this paper we propose NeuralDater, a Graph Convolutional Network (GCN) based document dating approach which jointly exploits syntactic and temporal graph structures of document in a principled way. We also pointed out some limitations of NeuralDater and tried to utilize both context and temporal information in documents in a more flexible and intuitive manner proposing AD3: Attentive Deep Document Dater, an attention-based document dating system. To the best of our knowledge these are the first application of deep learning methods for the task. Through extensive experiments on real-world datasets, we find that our models significantly outperforms state-of-the-art baselines by a significant margin.
【7】 Ensembling Shift Detectors: an Extensive Empirical Evaluation 标题:集成移位检测器:广泛的经验评估
作者:Simona Maggio,Léo Dreyfus-Schmidt 备注:20 pages, 7 figures 链接:https://arxiv.org/abs/2106.14608 摘要:术语dataset shift指用于训练机器学习模型的数据与模型运行的数据不同的情况。虽然有几种类型的移位会自然发生,但现有的移位检测器通常只设计用于处理特定类型的移位。我们提出了一个简单而强大的技术来集成互补移位检测器,同时调整每个检测器的统计检验对数据集的显著性水平。这使得能够进行更健壮的移位检测,能够处理所有不同类型的移位,这在精确移位类型通常未知的实际设置中是必不可少的。这一方法通过对应用于真实世界结构化数据集的各种合成偏移进行的大规模可靠的统计基准研究得到了验证。 摘要:The term dataset shift refers to the situation where the data used to train a machine learning model is different from where the model operates. While several types of shifts naturally occur, existing shift detectors are usually designed to address only a specific type of shift. We propose a simple yet powerful technique to ensemble complementary shift detectors, while tuning the significance level of each detector's statistical test to the dataset. This enables a more robust shift detection, capable of addressing all different types of shift, which is essential in real-life settings where the precise shift type is often unknown. This approach is validated by a large-scale statistically sound benchmark study over various synthetic shifts applied to real-world structured datasets.
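下面给出一个简化的示意代码(假设性示例,阈值与检验组合并非论文的校准方案):对每个特征做KS检验并用Bonferroni校正显著性水平,再与一个简单的均值偏移检查取"或",任一检测器报警即判定发生偏移。

import numpy as np
from scipy.stats import ks_2samp

# Toy ensemble of two complementary shift detectors; not the paper's calibrated setup.
def detect_shift(ref, new, alpha=0.05):
    d = ref.shape[1]
    ks_flag = any(ks_2samp(ref[:, j], new[:, j]).pvalue < alpha / d for j in range(d))
    mean_flag = bool(np.any(np.abs(ref.mean(0) - new.mean(0)) >
                            3 * ref.std(0) / np.sqrt(len(new))))
    return ks_flag or mean_flag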
【8】 Privacy-Preserving Image Acquisition Using Trainable Optical Kernel 标题:基于可训练光学核的隐私保护图像获取
作者:Yamin Sepehri,Pedram Pad,Pascal Frossard,L. Andrea Dunbar 机构:†Centre Suisse d’Electronique et de Microtechnique (CSEM), ‡ École polytechnique fédérale de Lausanne (EPFL) 备注:9 pages, 9 figures 链接:https://arxiv.org/abs/2106.14577 摘要:在传感器和摄像头无处不在的社会中,保护隐私越来越受到关注。在这项工作中,我们首次提出了一种可训练的图像获取方法,在敏感的身份信息到达图像传感器之前去除光域中的敏感信息。该方法利用可训练的光卷积核,在滤除敏感内容的同时传输所需信息。由于敏感内容在到达图像传感器之前被抑制,因此不会进入数字域,因此任何形式的隐私攻击都无法恢复。这与当前所有易受直接访问攻击的数字隐私保护方法形成对比。此外,与以往无法训练的光学隐私保护方法相比,我们的方法是数据驱动的,并针对手头的特定应用进行了优化。此外,由于该处理在光域中被动地发生并且甚至可以在完全数字隐私保护系统的顶部一起使用,因此在采集系统上没有额外的计算、存储器或功率负担。该方法适用于不同的数字神经网络和内容。我们将其应用于一些场景中,例如将微笑检测作为所需属性,而将性别作为敏感内容过滤掉。我们结合两个对抗性神经网络训练光学内核,其中分析网络尝试检测所需属性,而对抗性网络尝试检测敏感内容。实验结果表明,当敏感内容设定为性别时,该方法可以去除65.1%的敏感内容,而所需内容仅损失7.3%。此外,我们利用深度重建方法尝试重建原始人脸,证实了通过重建攻击获取敏感内容是无效的。 摘要:Preserving privacy is a growing concern in our society where sensors and cameras are ubiquitous. In this work, for the first time, we propose a trainable image acquisition method that removes the sensitive identity revealing information in the optical domain before it reaches the image sensor. The method benefits from a trainable optical convolution kernel which transmits the desired information while filters out the sensitive content. As the sensitive content is suppressed before it reaches the image sensor, it does not enter the digital domain therefore is unretrievable by any sort of privacy attack. This is in contrast with the current digital privacy-preserving methods that are all vulnerable to direct access attack. Also, in contrast with the previous optical privacy-preserving methods that cannot be trained, our method is data-driven and optimized for the specific application at hand. Moreover, there is no additional computation, memory, or power burden on the acquisition system since this processing happens passively in the optical domain and can even be used together and on top of the fully digital privacy-preserving systems. The proposed approach is adaptable to different digital neural networks and content. We demonstrate it for several scenarios such as smile detection as the desired attribute while the gender is filtered out as the sensitive content. We trained the optical kernel in conjunction with two adversarial neural networks where the analysis network tries to detect the desired attribute and the adversarial network tries to detect the sensitive content. We show that this method can reduce 65.1% of sensitive content when it is selected to be the gender and it only loses 7.3% of the desired content. Moreover, we reconstruct the original faces using the deep reconstruction method that confirms the ineffectiveness of reconstruction attacks to obtain the sensitive content.
【9】 FreeTickets: Accurate, Robust and Efficient Deep Ensemble by Training with Dynamic Sparsity 标题:FreeTickets:通过动态稀疏性训练实现准确、健壮、高效的深度集成
作者:Shiwei Liu,Tianlong Chen,Zahra Atashgahi,Xiaohan Chen,Ghada Sokar,Elena Mocanu,Mykola Pechenizkiy,Zhangyang Wang,Decebal Constantin Mocanu 机构:Eindhoven University of Technology,University of Texas at Austin, University of Twente 备注:preprint version 链接:https://arxiv.org/abs/2106.14568 摘要:最近关于稀疏神经网络的研究表明,孤立地训练一个稀疏网络是可能的,以便用一小部分参数来匹配相应的稠密网络的性能。然而,这些性能稀疏神经网络(中奖彩票)的识别要么涉及昂贵的迭代训练修剪再训练过程(如彩票假设),要么涉及超长的稀疏训练时间(如动态稀疏训练),这两者都会引起财务和环境问题。在这项工作中,我们试图通过引入免费门票的概念来解决这个成本降低的问题,作为第一个解决方案,稀疏卷积神经网络的性能比其密集网络的性能大大提高,而用于完全训练的只占后者所需计算资源的一小部分。具体来说,我们例举了FreeTickets的概念,提出了两种新的具有动态稀疏性的有效集成方法,在稀疏训练过程中一次生成多个多样的、精确的“免费”票证。将这些免费门票组合成一个集合,与相应的密集(集合)网络相比,在精确度、不确定性估计、鲁棒性和效率方面都有显著的提高。我们的结果为稀疏神经网络的强度提供了新的见解,并表明稀疏性的好处远远超出了通常的训练/推理的预期效率。我们将发布所有的密码https://github.com/Shiweiliuiiiiiii/FreeTickets. 摘要:Recent works on sparse neural networks have demonstrated that it is possible to train a sparse network in isolation to match the performance of the corresponding dense networks with a fraction of parameters. However, the identification of these performant sparse neural networks (winning tickets) either involves a costly iterative train-prune-retrain process (e.g., Lottery Ticket Hypothesis) or an over-extended sparse training time (e.g., Training with Dynamic Sparsity), both of which would raise financial and environmental concerns. In this work, we attempt to address this cost-reducing problem by introducing the FreeTickets concept, as the first solution which can boost the performance of sparse convolutional neural networks over their dense network equivalents by a large margin, while using for complete training only a fraction of the computational resources required by the latter. Concretely, we instantiate the FreeTickets concept, by proposing two novel efficient ensemble methods with dynamic sparsity, which yield in one shot many diverse and accurate tickets "for free" during the sparse training process. The combination of these free tickets into an ensemble demonstrates a significant improvement in accuracy, uncertainty estimation, robustness, and efficiency over the corresponding dense (ensemble) networks. Our results provide new insights into the strength of sparse neural networks and suggest that the benefits of sparsity go way beyond the usual training/inference expected efficiency. We will release all codes in https://github.com/Shiweiliuiiiiiii/FreeTickets.
【10】 Certified Robustness via Randomized Smoothing over Multiplicative Parameters 标题:乘性参数随机平滑的鲁棒性验证
作者:Nikita Muravev,Aleksandr Petiushko 机构:Lomonosov Moscow State University, Huawei Moscow Research Center 链接:https://arxiv.org/abs/2106.14432 摘要:提出了一种新的乘性参数随机平滑方法。利用该方法构造了对gamma校正扰动具有一定鲁棒性的分类器,并将结果与高斯平滑得到的分类器进行了比较。据我们所知,这是第一个工作证明鲁棒性对乘法伽马校正变换。 摘要:We propose a novel approach of randomized smoothing over multiplicative parameters. Using this method we construct certifiably robust classifiers with respect to a gamma-correction perturbation and compare the result with classifiers obtained via Gaussian smoothing. To the best of our knowledge it is the first work concerning certified robustness against the multiplicative gamma-correction transformation.
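作为直观说明,下面是对乘性参数(此处为gamma校正)做随机平滑预测的示意代码(假设性示例,省略了论文中的认证半径计算;base_classifier 为占位函数)。

import numpy as np

# Majority vote over classifications of gamma-corrected copies of the input (illustrative).
def smoothed_predict(base_classifier, x, n=100, sigma=0.25, rng=None):
    rng = rng or np.random.default_rng()
    votes = {}
    for _ in range(n):
        gamma = np.exp(sigma * rng.normal())          # multiplicative perturbation of gamma
        label = base_classifier(np.clip(x, 0.0, 1.0) ** gamma)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)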
【11】 Poisoning the Search Space in Neural Architecture Search 标题:神经结构搜索中的搜索空间毒化
作者:Robert Wu,Nayan Saxena,Rohan Jain 机构:Department of Computer Science, University of Toronto, Department of Statistical Sciences, Departments of Computer Science & Mathematics 备注:All authors contributed equally. Appears in AdvML Workshop @ ICML2021: A Blessing in Disguise: The Prospects and Perils of Adversarial Machine Learning 链接:https://arxiv.org/abs/2106.14406 摘要:深度学习已被证明是一种高效的问题解决工具,可用于跨多个领域(如医疗保健和自动驾驶)的目标检测和图像分割。这种性能的核心在于神经结构设计,它严重依赖于领域知识和研究人员的经验。最近,在给定可能操作的初始搜索空间的情况下,这种寻找最佳体系结构的过程被神经体系结构搜索(NAS)自动化了。在本文中,我们评估了一种称为高效NAS(ENAS)的算法对原始搜索空间中精心设计的无效操作的数据不可知中毒攻击的鲁棒性。通过在CIFAR-10数据集上评估算法性能,我们实证地证明了我们新的搜索空间中毒(SSP)方法和多实例中毒攻击是如何利用ENAS控制器的设计缺陷导致子网络的预测错误率过高的。我们的结果为使用NAS进行更具对抗性的健壮体系结构搜索提供了挑战。 摘要:Deep learning has proven to be a highly effective problem-solving tool for object detection and image segmentation across various domains such as healthcare and autonomous driving. At the heart of this performance lies neural architecture design which relies heavily on domain knowledge and prior experience on the researchers' behalf. More recently, this process of finding the most optimal architectures, given an initial search space of possible operations, was automated by Neural Architecture Search (NAS). In this paper, we evaluate the robustness of one such algorithm known as Efficient NAS (ENAS) against data agnostic poisoning attacks on the original search space with carefully designed ineffective operations. By evaluating algorithm performance on the CIFAR-10 dataset, we empirically demonstrate how our novel search space poisoning (SSP) approach and multiple-instance poisoning attacks exploit design flaws in the ENAS controller to result in inflated prediction error rates for child networks. Our results provide insights into the challenges to surmount in using NAS for more adversarially robust architecture search.
【12】 Habitat 2.0: Training Home Assistants to Rearrange their Habitat 标题:栖息地2.0:训练家务助理重新安排他们的栖息地
作者:Andrew Szot,Alex Clegg,Eric Undersander,Erik Wijmans,Yili Zhao,John Turner,Noah Maestre,Mustafa Mukadam,Devendra Chaplot,Oleksandr Maksymets,Aaron Gokaslan,Vladimir Vondrus,Sameer Dharur,Franziska Meier,Wojciech Galuba,Angel Chang,Zsolt Kira,Vladlen Koltun,Jitendra Malik,Manolis Savva,Dhruv Batra 机构:Facebook AI Research,Georgia Tech,Intel Research,Simon Fraser University,UC Berkeley 链接:https://arxiv.org/abs/2106.14405 摘要:我们介绍了Habitat 2.0(H2.0),一个在交互式3D环境和复杂物理场景中训练虚拟机器人的仿真平台。我们对所有层次的具体化人工智能堆栈(数据、模拟和基准任务)做出了全面的贡献。具体来说,我们提出:(i)ReplicaCAD:艺术家创作、注释、可重构的公寓(匹配真实空间)3D数据集,带有铰接对象(例如可以打开/关闭的橱柜和抽屉);(ii)H2.0:高性能物理支持的3D模拟器,在8-GPU节点上的速度超过每秒25000个模拟步骤(850倍实时),比以前的工作速度提高了100倍;(iii)家庭助理基准(HAB):一套辅助机器人的常见任务(整理房子、准备食品、摆桌子),测试一系列移动操作能力。这些大规模的工程贡献使我们能够在长时程结构化任务中系统地比较大规模深度强化学习(RL)和经典的感知-规划-行动(sense-plan-act, SPA)流水线,重点是对新对象、容器和布局的泛化。我们发现:(1)与分层策略相比,扁平RL策略在HAB上表现不佳;(2)具有独立技能的分层策略存在"交接问题";(3)SPA流水线比RL策略更脆弱。 摘要:We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in interactive 3D environments and complex physics-enabled scenarios. We make comprehensive contributions to all levels of the embodied AI stack - data, simulation, and benchmark tasks. Specifically, we present: (i) ReplicaCAD: an artist-authored, annotated, reconfigurable 3D dataset of apartments (matching real spaces) with articulated objects (e.g. cabinets and drawers that can open/close); (ii) H2.0: a high-performance physics-enabled 3D simulator with speeds exceeding 25,000 simulation steps per second (850x real-time) on an 8-GPU node, representing 100x speed-ups over prior work; and, (iii) Home Assistant Benchmark (HAB): a suite of common tasks for assistive robots (tidy the house, prepare groceries, set the table) that test a range of mobile manipulation capabilities. These large-scale engineering contributions allow us to systematically compare deep reinforcement learning (RL) at scale and classical sense-plan-act (SPA) pipelines in long-horizon structured tasks, with an emphasis on generalization to new objects, receptacles, and layouts. We find that (1) flat RL policies struggle on HAB compared to hierarchical ones; (2) a hierarchy with independent skills suffers from 'hand-off problems', and (3) SPA pipelines are more brittle than RL policies.
【13】 Revelio: ML-Generated Debugging Queries for Distributed Systems 标题:REVIDIO:ML生成的分布式系统调试查询
作者:Pradeep Dogga,Karthik Narasimhan,Anirudh Sivaraman,Shiv Kumar Saini,George Varghese,Ravi Netravali 机构:UCLA, Princeton University, NYU, Adobe Research, India 链接:https://arxiv.org/abs/2106.14347 摘要:调试分布式系统的一个主要困难在于手动确定要使用哪些可用的调试工具以及如何查询其日志。我们自己对生产调试工作流程的研究证实了这种负担的严重性。本文探讨了机器学习模型是否能帮助开发人员进行分布式系统的调试。我们介绍了Revelio,一个调试助手,它将用户报告和系统日志作为输入,并输出调试查询,开发人员可以使用这些查询来查找bug的根本原因。关键的挑战在于(1)结合不同类型的输入(例如,自然语言报告和定量日志)和(2)推广到看不见的故障。Revelio通过使用深度神经网络将不同的输入源和潜在的查询均匀地嵌入到高维向量空间来解决这些问题。此外,它还利用生产系统的观察结果将查询生成分解为两个计算和统计上更简单的学习任务。为了评估Revelio,我们构建了一个包含多个分布式应用程序和调试工具的测试平台。通过注入故障并使用来自800名Mechanical Turk众包工作者的日志和报告进行训练,我们发现Revelio在其预测的前三名相关查询列表中96%的时间包含了最有用的查询。我们的开发者研究证实了Revelio的实用性。 摘要:A major difficulty in debugging distributed systems lies in manually determining which of the many available debugging tools to use and how to query its logs. Our own study of a production debugging workflow confirms the magnitude of this burden. This paper explores whether a machine-learning model can assist developers in distributed systems debugging. We present Revelio, a debugging assistant which takes user reports and system logs as input, and outputs debugging queries that developers can use to find a bug's root cause. The key challenges lie in (1) combining inputs of different types (e.g., natural language reports and quantitative logs) and (2) generalizing to unseen faults. Revelio addresses these by employing deep neural networks to uniformly embed diverse input sources and potential queries into a high-dimensional vector space. In addition, it exploits observations from production systems to factorize query generation into two computationally and statistically simpler learning tasks. To evaluate Revelio, we built a testbed with multiple distributed applications and debugging tools. By injecting faults and training on logs and reports from 800 Mechanical Turkers, we show that Revelio includes the most helpful query in its predicted list of top-3 relevant queries 96% of the time. Our developer study confirms the utility of Revelio.
【14】 How many moments does MMD compare? 标题:MMD比较了几个时刻?
作者:Rustem Takhanov 机构: School of Sciences and Humanities 链接:https://arxiv.org/abs/2106.14277 摘要:本文提出了一种研究Mercer核的新方法:为一个特殊的核 $K$ 对应一个伪微分算子 $p({\mathbf x},D)$,使得 $\mathcal{F} p({\mathbf x},D)^\dag p({\mathbf x},D) \mathcal{F}^{-1}$ 作用于光滑函数的方式与与 $K$ 相关联的积分算子相同(其中 $\mathcal{F}$ 是傅里叶变换)。我们证明了伪微分算子定义的核能够一致逼近紧集上的任意连续Mercer核。符号 $p({\mathbf x},{\mathbf y})$ 封装了许多关于由核 $K$ 定义的最大平均差异(MMD)距离结构的有用信息。我们用 $p$ 奇异值分解的前 $r$ 项之和来近似 $p({\mathbf x},{\mathbf y})$,记为 $p_r({\mathbf x},{\mathbf y})$。如果与 $p({\mathbf x},{\mathbf y})$ 相关的积分算子的有序奇异值迅速衰减,则由新符号 $p_r$ 定义的MMD距离与初始符号仅略有不同。此外,新的MMD距离可以解释为比较两个概率分布的 $r$ 个局部矩的聚合结果。后一结果在与 $p$ 有关的积分算子的右奇异向量一致有界的条件下成立。但即使这一条件不满足,我们仍然可以证明 $p$ 和 $p_r$ 之间的Hilbert-Schmidt距离趋于零。因此,我们报告了一个有趣的现象:MMD距离度量两个概率分布相对于一定数量 $r^\ast$ 的局部矩的差异,而这个数字 $r^\ast$ 取决于 $p$ 的奇异值衰减的速度。 摘要:We present a new way of study of Mercer kernels, by corresponding to a special kernel $K$ a pseudo-differential operator $p({\mathbf x}, D)$ such that $\mathcal{F} p({\mathbf x}, D)^\dag p({\mathbf x}, D) \mathcal{F}^{-1}$ acts on smooth functions in the same way as an integral operator associated with $K$ (where $\mathcal{F}$ is the Fourier transform). We show that kernels defined by pseudo-differential operators are able to approximate uniformly any continuous Mercer kernel on a compact set. The symbol $p({\mathbf x}, {\mathbf y})$ encapsulates a lot of useful information about the structure of the Maximum Mean Discrepancy distance defined by the kernel $K$. We approximate $p({\mathbf x}, {\mathbf y})$ with the sum of the first $r$ terms of the Singular Value Decomposition of $p$, denoted by $p_r({\mathbf x}, {\mathbf y})$. If ordered singular values of the integral operator associated with $p({\mathbf x}, {\mathbf y})$ die down rapidly, the MMD distance defined by the new symbol $p_r$ differs from the initial one only slightly. Moreover, the new MMD distance can be interpreted as an aggregated result of comparing $r$ local moments of two probability distributions. The latter result holds under the condition that right singular vectors of the integral operator associated with $p$ are uniformly bounded. But even if this is not satisfied we can still hold that the Hilbert-Schmidt distance between $p$ and $p_r$ vanishes. Thus, we report an interesting phenomenon: the MMD distance measures the difference of two probability distributions with respect to a certain number of local moments, $r^\ast$, and this number $r^\ast$ depends on the speed with which singular values of $p$ die down.
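为便于理解摘要中反复出现的MMD距离,下面给出一个标准的无偏MMD^2估计量示意(采用高斯核;这只是常规写法,并非论文中基于伪微分算子的构造)。

import numpy as np

def gaussian_kernel(A, B, bw=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

# Standard unbiased MMD^2 estimator between samples X and Y (illustrative).
def mmd2_unbiased(X, Y, bw=1.0):
    Kxx, Kyy, Kxy = gaussian_kernel(X, X, bw), gaussian_kernel(Y, Y, bw), gaussian_kernel(X, Y, bw)
    n, m = len(X), len(Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (n * (n - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
            - 2 * Kxy.mean())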
【15】 AI based Presentation Creator With Customized Audio Content Delivery 标题:基于AI的演示文稿创建器,支持定制音频内容交付
作者:Muvazima Mansoor,Srikanth Chandar,Ramamoorthy Srinath 机构:ECE, PES University, Bengaluru, India, CSE 链接:https://arxiv.org/abs/2106.14213 摘要:在本文中,我们提出了一个架构来解决一个新的问题陈述,这个问题陈述在最近随着COVID-19流行对虚拟内容交付需求的增加而更加突出。所有的教育机构、工作场所、研究中心等都在试图通过使用在线内容传递来弥合这个社会距离遥远的时代的沟通鸿沟。现在的趋势是创建演示文稿,然后使用各种虚拟会议平台进行演示。我们试图通过本文来减少和消除创建演示文稿和交付演示文稿所花费的时间,本论文旨在使用机器学习(ML)算法和自然语言处理(NLP)模块来自动化从文档创建基于幻灯片的演示文稿的过程,然后使用最先进的语音克隆模型,以所需作者的声音传递内容。我们认为结构化文档(如研究论文)是必须呈现的内容。研究论文首先使用BERT摘要技术进行总结,并浓缩成幻灯片中的要点。以Tacotron为灵感的架构,带有编码器、合成器和基于生成对抗网络(GAN)的声码器,用于以作者的声音(或任何定制的声音)来传达幻灯片的内容。现在,几乎所有的学习都转向了在线模式,专业人士现在在舒适的家里工作。由于目前的情况,教师和专业人员已转向介绍,以帮助他们传授信息。在本文中,我们的目标是通过自动化创建演示文稿的过程并随后以定制的声音交付演示文稿,从而减少创建演示文稿所需的大量时间,使用一种可以使用短音频片段克隆任何声音的内容交付机制。 摘要:In this paper, we propose an architecture to solve a novel problem statement that has stemmed more so in recent times with an increase in demand for virtual content delivery due to the COVID-19 pandemic. All educational institutions, workplaces, research centers, etc. are trying to bridge the gap of communication during these socially distanced times with the use of online content delivery. The trend now is to create presentations, and then subsequently deliver the same using various virtual meeting platforms. The time being spent in such creation of presentations and delivering is what we try to reduce and eliminate through this paper which aims to use Machine Learning (ML) algorithms and Natural Language Processing (NLP) modules to automate the process of creating a slides-based presentation from a document, and then use state-of-the-art voice cloning models to deliver the content in the desired author's voice. We consider a structured document such as a research paper to be the content that has to be presented. The research paper is first summarized using BERT summarization techniques and condensed into bullet points that go into the slides. Tacotron inspired architecture with Encoder, Synthesizer, and a Generative Adversarial Network (GAN) based vocoder, is used to convey the contents of the slides in the author's voice (or any customized voice). Almost all learning has now been shifted to online mode, and professionals are now working from the comfort of their homes. Due to the current situation, teachers and professionals have shifted to presentations to help them in imparting information. In this paper, we aim to reduce the considerable amount of time that is taken in creating a presentation by automating this process and subsequently delivering this presentation in a customized voice, using a content delivery mechanism that can clone any voice using a short audio clip.
【16】 Autonomous Deep Quality Monitoring in Streaming Environments 标题:流媒体环境下的自主深度质量监控
作者:Andri Ashfahani,Mahardhika Pratama,Edwin Lughofer,Edward Yapp Kien Yee 机构:SCSE, NTU, Singapore, DKBMS, JKU, Austria, E. Y. K. Yee, SIMTech, ASTAR 备注:None 链接:https://arxiv.org/abs/2106.13955 摘要:在工业中,质量监控的通常做法是依靠手动检查,众所周知,手动检查速度慢、容易出错且依赖于操作员。这一问题对数据驱动方法开发的自动化实时质量监控提出了强烈的需求,从而减轻了对操作员的依赖,并适应了各种过程的不确定性。尽管如此,当前的方法并没有考虑到感官信息的流性质,而严重依赖于手工制作的特性,使其具有特定于应用程序的特性。本文提出了基于最近发展起来的数据流深度学习算法的在线质量监控方法,即动态进化容量神经网络NADINE 。它的特点是集成了1-D和2-D卷积层,以提取时间序列的自然特征和从我们自己的项目中的注塑机的传感器和摄像头捕获的视觉数据流。实时实验中,在线质量监控任务按照prequential(先测试后训练)方式进行动态模拟,这是数据流评估中常用的协议。与最先进的技术相比,NADINE 在流媒体环境中的质量监控任务平均提高了4.68%。为了支持可复制的研究计划,NADINE 的代码、结果以及补充材料和注塑数据集在 \url{https://github.com/ContinualAL/NADINE-IJCNN2021} 提供。 摘要:The common practice of quality monitoring in industry relies on manual inspection well-known to be slow, error-prone and operator-dependent. This issue raises strong demand for automated real-time quality monitoring developed from data-driven approaches thus alleviating from operator dependence and adapting to various process uncertainties. Nonetheless, current approaches do not take into account the streaming nature of sensory information while relying heavily on hand-crafted features making them application-specific. This paper proposes the online quality monitoring methodology developed from recently developed deep learning algorithms for data streams, Neural Networks with Dynamically Evolved Capacity (NADINE), namely NADINE . It features the integration of 1-D and 2-D convolutional layers to extract natural features of time-series and visual data streams captured from sensors and cameras of the injection molding machines from our own project. Real-time experiments have been conducted where the online quality monitoring task is simulated on the fly under the prequential test-then-train fashion - the prominent data stream evaluation protocol. Comparison with the state-of-the-art techniques clearly exhibits the advantage of NADINE with 4.68% improvement on average for the quality monitoring task in streaming environments. To support the reproducible research initiative, codes, results of NADINE along with supplementary materials and injection molding dataset are made available in \url{https://github.com/ContinualAL/NADINE-IJCNN2021}.
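摘要中提到的prequential(先测试后训练)评估可以用如下通用循环来说明(假设性示例,predict/partial_fit等接口名称只是占位,并非NADINE的实际API)。

# Generic prequential test-then-train loop for a streaming model (illustrative).
def prequential_evaluate(model, stream):
    correct, seen = 0, 0
    for x, y in stream:                 # the stream yields one labelled sample at a time
        y_hat = model.predict(x)        # 1) test on the incoming sample first
        correct += int(y_hat == y)
        seen += 1
        model.partial_fit(x, y)         # 2) then train on that same sample
    return correct / max(seen, 1)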
【17】 Core Challenges in Embodied Vision-Language Planning 标题:体验式视觉语言规划的核心挑战
作者:Jonathan Francis,Nariaki Kitamura,Felix Labelle,Xiaopeng Lu,Ingrid Navarro,Jean Oh 机构:Language Technologies Institute, Carnegie Mellon University, Forbes Ave., Pittsburgh, PA, USA, Human-Machine Collaboration, Bosch Research Pittsburgh, Smallman St., Pittsburgh, PA, USA 备注:35 pages 链接:https://arxiv.org/abs/2106.13948 摘要:Recent advances in the areas of multimodal machine learning and artificial intelligence (AI) have led to the development of challenging tasks at the intersection of Computer Vision, Natural Language Processing, and Embodied AI. Whereas many approaches and previous survey pursuits have characterised one or two of these dimensions, there has not been a holistic analysis at the center of all three. Moreover, even when combinations of these topics are considered, more focus is placed on describing, e.g., current architectural methods, as opposed to also illustrating high-level challenges and opportunities for the field. In this survey paper, we discuss Embodied Vision-Language Planning (EVLP) tasks, a family of prominent embodied navigation and manipulation problems that jointly use computer vision and natural language. We propose a taxonomy to unify these tasks and provide an in-depth analysis and comparison of the new and current algorithmic approaches, metrics, simulated environments, as well as the datasets used for EVLP tasks. Finally, we present the core challenges that we believe new EVLP works should seek to address, and we advocate for task construction that enables model generalizability and furthers real-world deployment.
【18】 Low-Precision Training in Logarithmic Number System using Multiplicative Weight Update 标题:基于乘法权值更新的对数系统低精度训练
作者:Jiawei Zhao,Steve Dai,Rangharajan Venkatesan,Ming-Yu Liu,Brucek Khailany,Bill Dally,Anima Anandkumar 机构:Caltech, NVIDIA 链接:https://arxiv.org/abs/2106.13914 摘要:Training large-scale deep neural networks (DNNs) currently requires a significant amount of energy, leading to serious environmental impacts. One promising approach to reduce the energy costs is representing DNNs with low-precision numbers. While it is common to train DNNs with forward and backward propagation in low-precision, training directly over low-precision weights, without keeping a copy of weights in high-precision, still remains to be an unsolved problem. This is due to complex interactions between learning algorithms and low-precision number systems. To address this, we jointly design a low-precision training framework involving a logarithmic number system (LNS) and a multiplicative weight update training method, termed LNS-Madam. LNS has a high dynamic range even in a low-bitwidth setting, leading to high energy efficiency and making it relevant for on-board training in energy-constrained edge devices. We design LNS to have the flexibility of choosing different bases for weights and gradients, as they usually require different quantization gaps and dynamic ranges during training. By drawing the connection between LNS and multiplicative update, LNS-Madam ensures low quantization error during weight update, leading to a stable convergence even if the bitwidth is limited. Compared to using a fixed-point or floating-point number system and training with popular learning algorithms such as SGD and Adam, our joint design with LNS and LNS-Madam optimizer achieves better accuracy while requiring smaller bitwidth. Notably, with only 5-bit for gradients, the proposed training framework achieves accuracy comparable to full-precision state-of-the-art models such as ResNet-50 and BERT. After conducting energy estimations by analyzing the math datapath units during training, the results show that our design achieves over 60x energy reduction compared to FP32 on BERT models.
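下面是对"对数数制中的乘性权重更新"这一思路的示意代码(假设性示例,常数与量化位宽均为演示用途,并非LNS-Madam的精确定义):权重以(符号, log2幅值)存储,更新相当于给指数加上一项与归一化梯度成比例的量,再把指数量化到固定的小数位数。

import numpy as np

# Illustrative multiplicative update on log-domain weights; not the exact LNS-Madam rule.
def lns_multiplicative_update(sign, log2_mag, grad, lr=0.01, frac_bits=3):
    g_norm = grad / (np.sqrt(np.mean(grad ** 2)) + 1e-12)   # scale-free gradient
    log2_mag = log2_mag - lr * sign * g_norm                # multiplicative step on |w|
    scale = 2.0 ** frac_bits
    return sign, np.round(log2_mag * scale) / scale         # quantize the exponent

def to_weights(sign, log2_mag):
    return sign * 2.0 ** log2_mag                           # decode back to real weights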
【19】 Knowledge Infused Policy Gradients with Upper Confidence Bound for Relational Bandits 标题:关系环的具有上界置信度的知识注入政策梯度
作者:Kaushik Roy,Qi Zhang,Manas Gaur,Amit Sheth 机构:Artificial Intelligence Institute, University of South Carolina, Columbia, USA 备注:Accepted for publication in the research track at ECML-PKDD 2021 链接:https://arxiv.org/abs/2106.13895 摘要:Contextual Bandits find important use cases in various real-life scenarios such as online advertising, recommendation systems, healthcare, etc. However, most of the algorithms use flat feature vectors to represent context whereas, in the real world, there is a varying number of objects and relations among them to model in the context. For example, in a music recommendation system, the user context contains what music they listen to, which artists create this music, the artist albums, etc. Adding richer relational context representations also introduces a much larger context space making exploration-exploitation harder. To improve the efficiency of exploration-exploitation knowledge about the context can be infused to guide the exploration-exploitation strategy. Relational context representations allow a natural way for humans to specify knowledge owing to their descriptive nature. We propose an adaptation of Knowledge Infused Policy Gradients to the Contextual Bandit setting and a novel Knowledge Infused Policy Gradients Upper Confidence Bound algorithm and perform an experimental analysis of a simulated music recommendation dataset and various real-life datasets where expert knowledge can drastically reduce the total regret and where it cannot.
【20】 Pastprop-RNN: improved predictions of the future by correcting the past 标题:Pastprop-RNN:通过修正过去改进对未来的预测
作者:André Baptista,Yassine Baghoussi,Carlos Soares,João Mendes-Moreira,Miguel Arantes 机构:Faculdade de Engenharia, Universidade do Porto, INESC TEC, Fraunhofer AICOS and LIACC, Inovretail 链接:https://arxiv.org/abs/2106.13881 摘要:Forecasting accuracy is reliant on the quality of available past data. Data disruptions can adversely affect the quality of the generated model (e.g. unexpected events such as out-of-stock products when forecasting demand). We address this problem by pastcasting: predicting how data should have been in the past to explain the future better. We propose Pastprop-LSTM, a data-centric backpropagation algorithm that assigns part of the responsibility for errors to the training data and changes it accordingly. We test three variants of Pastprop-LSTM on forecasting competition datasets, M4 and M5, plus the Numenta Anomaly Benchmark. Empirical evaluation indicates that the proposed method can improve forecasting accuracy, especially when the prediction errors of standard LSTM are high. It also demonstrates the potential of the algorithm on datasets containing anomalies.
【21】 Approximate Maximum Halfspace Discrepancy 标题:近似最大半空间差异
作者:Michael Matheny,Jeff M. Phillips 机构:Amazon, University of Utah 链接:https://arxiv.org/abs/2106.13851 摘要:Consider the geometric range space $(X, \mathcal{H}_d)$ where $X \subset \mathbb{R}^d$ and $\mathcal{H}_d$ is the set of ranges defined by $d$-dimensional halfspaces. In this setting we consider that $X$ is the disjoint union of a red and blue set. For each halfspace $h \in \mathcal{H}_d$ define a function $\Phi(h)$ that measures the "difference" between the fraction of red and fraction of blue points which fall in the range $h$. In this context the maximum discrepancy problem is to find the $h^* = \arg\max_{h \in (X, \mathcal{H}_d)} \Phi(h)$. We aim to instead find an $\hat{h}$ such that $\Phi(h^*) - \Phi(\hat{h}) \le \varepsilon$. This is the central problem in linear classification for machine learning, in spatial scan statistics for spatial anomaly detection, and shows up in many other areas. We provide a solution for this problem in $O(|X| (1/\varepsilon^d) \log^4 (1/\varepsilon))$ time, which improves polynomially over the previous best solutions. For $d=2$ we show that this is nearly tight through conditional lower bounds. For different classes of $\Phi$ we can either provide an $\Omega(|X|^{3/2 - o(1)})$ time lower bound for the exact solution with a reduction to APSP, or an $\Omega(|X| 1/\varepsilon^{2-o(1)})$ lower bound for the approximate solution with a reduction to 3SUM. A key technical result is an $\varepsilon$-approximate halfspace range counting data structure of size $O(1/\varepsilon^d)$ with $O(\log (1/\varepsilon))$ query time, which we can build in $O(|X| (1/\varepsilon^d) \log^4 (1/\varepsilon))$ time.
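作为参照,下面是二维情形下最大半空间差异的暴力求解示意(假设性示例,复杂度约为O(n^3),并非论文的近似算法):在二维中,组合意义上不同的半空间可由点对确定。

import numpy as np
from itertools import combinations

# Brute-force reference for 2D max halfspace discrepancy (illustrative, not the paper's method).
def max_halfspace_discrepancy(red, blue):
    pts = np.vstack([red, blue])
    best = 0.0
    for i, j in combinations(range(len(pts)), 2):
        direction = pts[j] - pts[i]
        normal = np.array([-direction[1], direction[0]])     # normal of the separating line
        thresh = normal @ pts[i]
        for side in (1.0, -1.0):
            r = np.mean(side * (red @ normal) >= side * thresh)
            b = np.mean(side * (blue @ normal) >= side * thresh)
            best = max(best, abs(r - b))
    return best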
【22】 Variance Reduction for Matrix Computations with Applications to Gaussian Processes 标题:矩阵计算的方差化简方法及其在高斯过程中的应用
作者:Anant Mathur,Sarat Moka,Zdravko Botev 机构: University of New South Wales High Street, Kensington Sydney, NSW , Macquarie University, Balaclava Rd, Macquarie Park, NSW , Australia 备注:20 pages, 3 figures 链接:https://arxiv.org/abs/2106.14565 摘要:In addition to recent developments in computing speed and memory, methodological advances have contributed to significant gains in the performance of stochastic simulation. In this paper, we focus on variance reduction for matrix computations via matrix factorization. We provide insights into existing variance reduction methods for estimating the entries of large matrices. Popular methods do not exploit the reduction in variance that is possible when the matrix is factorized. We show how computing the square root factorization of the matrix can achieve in some important cases arbitrarily better stochastic performance. In addition, we propose a factorized estimator for the trace of a product of matrices and numerically demonstrate that the estimator can be up to 1,000 times more efficient on certain problems of estimating the log-likelihood of a Gaussian process. Additionally, we provide a new estimator of the log-determinant of a positive semi-definite matrix where the log-determinant is treated as a normalizing constant of a probability density.
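下面用一个简短示例说明通过矩阵分解做随机迹估计的基本思路(假设性示例,仅演示原理:对 tr(AB) 既可直接对 A@B 做Hutchinson探测,也可对具有相同迹的对称化矩阵 A^{1/2} B A^{1/2} 做探测;论文中估计量的具体构造与此不同)。

import numpy as np

rng = np.random.default_rng(0)
n = 40
M1 = rng.normal(size=(n, n))
M2 = rng.normal(size=(n, n))
A = M1 @ M1.T                                                # random PSD test matrices
B = M2 @ M2.T
w, V = np.linalg.eigh(A)
A_half = V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T     # symmetric square root of A

def hutchinson(M, probes=500):
    zs = rng.normal(size=(probes, n))
    est = np.array([z @ M @ z for z in zs])                  # tr(M) ~ E[z^T M z]
    return est.mean(), est.std() / np.sqrt(probes)           # estimate and its standard error

print(np.trace(A @ B))
print(hutchinson(A @ B))                                     # plain probe of A @ B
print(hutchinson(A_half @ B @ A_half))                       # factorized probe, same trace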
【23】 Quantum Data Compression and Quantum Cross Entropy 标题:量子数据压缩与量子交叉熵
作者:Zhou Shangnan 机构:Stanford Institute for Theoretical Physics, Stanford University, Stanford, CA , USA 备注:8 pages 链接:https://arxiv.org/abs/2106.13823 摘要:Quantum machine learning is an emerging field at the intersection of machine learning and quantum computing. A central quantity for the theoretical foundation of quantum machine learning is the quantum cross entropy. In this paper, we present one operational interpretation of this quantity, that the quantum cross entropy is the compression rate for sub-optimal quantum source coding. To do so, we give a simple, universal quantum data compression protocol, which is developed based on quantum generalization of variable-length coding, as well as quantum strong typicality.
【24】 Training Saturation in Layerwise Quantum Approximate Optimisation 标题:分层量子近似优化中的训练饱和问题
作者:E. Campos,D. Rabinovich,V. Akshay,J. Biamonte 机构:Skolkovo Institute of Science and Technology, Nobel Street, Moscow, Russia 备注:7 pages; RevTEX 链接:https://arxiv.org/abs/2106.13814 摘要:Quantum Approximate Optimisation (QAOA) is the most studied gate based variational quantum algorithm today. We train QAOA one layer at a time to maximize overlap with an $n$ qubit target state. Doing so we discovered that such training always saturates -- called \textit{training saturation} -- at some depth $p^*$, meaning that past a certain depth, overlap can not be improved by adding subsequent layers. We formulate necessary conditions for saturation. Numerically, we find layerwise QAOA reaches its maximum overlap at depth $p^*=n$. The addition of coherent dephasing errors to training removes saturation, recovering robustness to layerwise training. This study sheds new light on the performance limitations and prospects of QAOA.