人工智能学术速递[12.22]

2021-12-24 08:56:25

cs.AI人工智能,共计40篇

【1】 Max-Margin Contrastive Learning 标题:最大间隔对比学习 链接:https://arxiv.org/abs/2112.11450

作者:Anshul Shah,Suvrit Sra,Rama Chellappa,Anoop Cherian 机构:Johns Hopkins University, Baltimore, MD, Massachusetts Institute of Technology, Cambridge, MA, Mitsubishi Electric Research Labs, Cambridge, MA 备注:Accepted at AAAI 2022 摘要:标准的对比学习方法通常需要大量的负样本才能实现有效的无监督学习,并且往往表现出缓慢的收敛性。我们怀疑这种行为是由于选择了次优的负样本来与正样本形成对比。我们从支持向量机(SVM)中汲取灵感,提出了最大间隔对比学习(MMCL),从而克服了这一困难。我们的方法将通过二次优化问题得到的稀疏支持向量选为负样本,并通过最大化决策间隔来增强对比性。由于支持向量机优化的计算开销可能较大,特别是在端到端的环境中,我们提出了简化方法,以减轻计算负担。我们在标准的视觉基准数据集上验证了我们的方法,证明了与最新技术相比,我们在无监督表示学习方面有更好的性能,同时具有更好的经验收敛特性。 摘要:Standard contrastive learning approaches usually require a large number of negatives for effective unsupervised learning and often exhibit slow convergence. We suspect this behavior is due to the suboptimal selection of negatives used for offering contrast to the positives. We counter this difficulty by taking inspiration from support vector machines (SVMs) to present max-margin contrastive learning (MMCL). Our approach selects negatives as the sparse support vectors obtained via a quadratic optimization problem, and contrastiveness is enforced by maximizing the decision margin. As SVM optimization can be computationally demanding, especially in an end-to-end setting, we present simplifications that alleviate the computational burden. We validate our approach on standard vision benchmark datasets, demonstrating better performance in unsupervised representation learning over state-of-the-art, while having better empirical convergence properties.
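
Below is a minimal, illustrative sketch (not the authors' code) of the negative-selection idea: a linear SVM is fit with the query embedding as the positive class and a pool of candidate embeddings as the negative class, and the candidates that end up as support vectors are kept as the sparse negative set. All shapes, data, and the toy margin penalty are assumptions for illustration only.

```python
# Hedged sketch: pick sparse "support-vector" negatives for one query embedding.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
dim = 128
query = rng.normal(size=(1, dim))           # embedding of the anchor (positive) view
candidates = rng.normal(size=(256, dim))    # pool of candidate negative embeddings

X = np.vstack([query, candidates])
y = np.array([1] + [0] * len(candidates))   # 1 = query/positive, 0 = candidate negatives

svm = SVC(kernel="linear", C=1.0)           # hinge-style objective; C controls margin softness
svm.fit(X, y)

# Support vectors coming from the candidate class are the selected sparse negatives.
support_idx = svm.support_[svm.support_ > 0] - 1
selected = candidates[support_idx]
print(f"kept {len(selected)} of {len(candidates)} candidates as negatives")

# Toy margin-style penalty computed only on the selected negatives.
cos = (selected @ query.T).ravel() / (
    np.linalg.norm(selected, axis=1) * np.linalg.norm(query))
loss = np.maximum(0.0, 0.5 + cos).mean()    # hinge on cosine similarity, margin 0.5
print("toy margin loss:", float(loss))
```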

【2】 Multi-Modality Distillation via Learning the teacher's modality-level Gram Matrix 标题:通过学习教师的模态级Gram矩阵实现多模态蒸馏 链接:https://arxiv.org/abs/2112.11447

作者:Peng Liu 机构:Yunnan University 备注:10 pages 摘要:在多模态知识蒸馏研究的背景下,现有的方法主要集中在只学习教师最终输出的问题上。因此,教师网络和学生网络之间仍然存在着深刻的差异。有必要强制学生网络学习教师网络的模态关系信息。为了有效地将教师的知识传递给学生,本文采用了一种新的模态关系蒸馏范式,通过对不同模态之间的关系信息进行建模,即学习教师的模态级Gram矩阵。 摘要:In the context of multi-modality knowledge distillation research, existing methods mainly focus on the problem of only learning from the teacher's final output. Thus, there are still deep differences between the teacher network and the student network. It is necessary to force the student network to learn the modality relationship information of the teacher network. To effectively transfer knowledge from teachers to students, a novel modality relation distillation paradigm is adopted that models the relationship information among different modalities, that is, learning the teacher's modality-level Gram Matrix.
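
A small hedged sketch of a modality-level Gram-matrix relation loss, assuming teacher and student features are arranged as (batch, modalities, dim) tensors; the shapes and names are illustrative, not taken from the paper's code.

```python
# Illustrative sketch: distill the teacher's modality-level Gram matrix into the student.
import torch
import torch.nn.functional as F

def modality_gram(features: torch.Tensor) -> torch.Tensor:
    """Pairwise inner products between modality embeddings, per sample."""
    f = F.normalize(features, dim=-1)          # (B, M, D)
    return torch.bmm(f, f.transpose(1, 2))     # (B, M, M) Gram matrix

B, M, D = 8, 3, 256                            # e.g. RGB / depth / audio modalities
teacher_feats = torch.randn(B, M, D)
student_feats = torch.randn(B, M, D, requires_grad=True)

# Relation-distillation loss: match the student's modality relations to the teacher's.
gram_loss = F.mse_loss(modality_gram(student_feats), modality_gram(teacher_feats).detach())
gram_loss.backward()
print("modality-level Gram loss:", gram_loss.item())
```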

【3】 Scaling Language Models: Methods, Analysis & Insights from Training Gopher 标题:扩展语言模型:方法、分析与训练Gopher的启示 链接:https://arxiv.org/abs/2112.11446

作者:Jack W. Rae,Sebastian Borgeaud,Trevor Cai,Katie Millican,Jordan Hoffmann,Francis Song,John Aslanides,Sarah Henderson,Roman Ring,Susannah Young,Eliza Rutherford,Tom Hennigan,Jacob Menick,Albin Cassirer,Richard Powell,George van den Driessche,Lisa Anne Hendricks,Maribeth Rauh,Po-Sen Huang,Amelia Glaese,Johannes Welbl,Sumanth Dathathri,Saffron Huang,Jonathan Uesato,John Mellor,Irina Higgins,Antonia Creswell,Nat McAleese,Amy Wu,Erich Elsen,Siddhant Jayakumar,Elena Buchatskaya,David Budden,Esme Sutherland,Karen Simonyan,Michela Paganini,Laurent Sifre,Lena Martens,Xiang Lorraine Li,Adhiguna Kuncoro,Aida Nematzadeh,Elena Gribovskaya,Domenic Donato,Angeliki Lazaridou,Arthur Mensch,Jean-Baptiste Lespiau,Maria Tsimpoukelli,Nikolai Grigorev,Doug Fritz,Thibault Sottiaux,Mantas Pajarskas,Toby Pohlen,Zhitao Gong,Daniel Toyama,Cyprien de Masson d'Autume,Yujia Li,Tayfun Terzi,Vladimir Mikulik,Igor Babuschkin,Aidan Clark,Diego de Las Casas,Aurelia Guy,Chris Jones,James Bradbury,Matthew Johnson,Blake Hechtman,Laura Weidinger,Iason Gabriel,William Isaac,Ed Lockhart,Simon Osindero,Laura Rimell,Chris Dyer,Oriol Vinyals,Kareem Ayoub,Jeff Stanway,Lorrayne Bennett,Demis Hassabis,Koray Kavukcuoglu,Geoffrey Irving 备注:118 pages 摘要:语言建模通过利用大量人类书面知识库更好地预测和理解世界,向智能通信系统迈出了一步。在本文中,我们分析了基于Transformer的语言模型在各种模型尺度上的性能——从具有数千万个参数的模型到被称为Gopher的2800亿个参数的模型。这些模型在152项不同任务中进行评估,在大多数任务中实现了最先进的性能。在阅读理解、事实核查和有毒语言识别等领域,从量表中获得的收益最大,但逻辑和数学推理的收益较小。我们提供了对训练数据集和模型行为的整体分析,涵盖了模型规模与偏差和毒性的交叉点。最后,我们讨论了语言模型在人工智能安全和减少下游危害方面的应用。 摘要:Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gopher. These models are evaluated on 152 diverse tasks, achieving state-of-the-art performance across the majority. Gains from scale are largest in areas such as reading comprehension, fact-checking, and the identification of toxic language, but logical and mathematical reasoning see less benefit. We provide a holistic analysis of the training dataset and model's behaviour, covering the intersection of model scale with bias and toxicity. Finally we discuss the application of language models to AI safety and the mitigation of downstream harms.

【4】 ESAN: Efficient Sentiment Analysis Network of A-Shares Research Reports for Stock Price Prediction 标题:ESAN:面向股价预测的高效A股研究报告情绪分析网络 链接:https://arxiv.org/abs/2112.11444

作者:Tuo Sun,Wanrong Zheng,Shufan Yu,Mengxun Li,Jiarui Ou 摘要:在本文中,我们将开发一个自然语言处理模型来帮助我们预测股票的长期走势。整个网络包括两个模块。第一个模块是一个自然语言处理模型,它从输入报告中寻找可靠的因素。另一种是以因素为输入的时间序列预测模型,旨在预测股票收益率。为了表明我们的模型结合情绪分析模块和时间序列预测模块的效率,我们将我们的方法命名为ESAN。 摘要:In this paper, we are going to develop a natural language processing model to help us to predict stocks in the long term. The whole network includes two modules. The first module is a natural language processing model which seeks out reliable factors from input reports. While the other is a time-series forecasting model which takes the factors as input and aims to predict stocks earnings yield. To indicate the efficiency of our model to combine the sentiment analysis module and the time-series forecasting module, we name our method ESAN.
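
A hedged toy sketch of the two-module pipeline the abstract describes, with synthetic sentiment factors standing in for the NLP module's output and a ridge regressor standing in for the time-series module; all numbers below are fabricated for illustration only.

```python
# Illustrative sketch: sentiment factors (module 1) feed a sliding-window forecaster (module 2).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)

# Module 1 (stand-in): one sentiment factor per trading day, e.g. from a report classifier.
T = 500
sentiment_factor = rng.normal(size=T)

# Synthetic target: next-period yield loosely driven by smoothed sentiment plus noise.
yield_next = 0.3 * np.convolve(sentiment_factor, np.ones(5) / 5, mode="same") \
             + 0.05 * rng.normal(size=T)

# Module 2: predict the yield from a sliding window of past factors.
window = 20
X = np.stack([sentiment_factor[i - window:i] for i in range(window, T)])
y = yield_next[window:]

split = int(0.8 * len(X))
model = Ridge(alpha=1.0).fit(X[:split], y[:split])
print("out-of-sample R^2:", round(model.score(X[split:], y[split:]), 3))
```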

【5】 Automated Drug-Related Information Extraction from French Clinical Documents: ReLyfe Approach 标题:从法国临床文献中自动提取与药物相关的信息:ReLyfe方法 链接:https://arxiv.org/abs/2112.11439

作者:Azzam Alwan,Maayane Attias,Larry Rubin,Adnan El Bakri 机构:R&D Department, ReLyfe - Medical Intelligence, Paris, France, Computer Science, Ecole Polytechnique, Palaiseau, France, BeCareLink, New York, United States 备注:None 摘要:在法国,构建医疗数据仍然是一项挑战,主要是因为出于隐私考虑,缺乏医疗数据,以及缺乏处理法语的方法和途径。这些挑战之一是在法国临床文献中构建药物相关信息。据我们所知,在过去十年中,研究法国处方的相关论文不到五篇。本文提出了一种从法国临床扫描文档中提取药物相关信息的新方法,同时保护患者的隐私。此外,我们在一个健康数据管理平台中部署了我们的方法,用于构建药物医疗数据并帮助患者组织他们的药物计划。它可以在任何web或移动平台上实现。这项工作通过创建适用于实际生产问题的应用程序,缩小了理论工作和实际工作之间的差距。它是基于规则的阶段和深度学习方法的结合。最后,数值结果表明了该方法的优越性和相关性。 摘要:Structuring medical data in France remains a challenge mainly because of the lack of medical data due to privacy concerns and the lack of methods and approaches on processing the French language. One of these challenges is structuring drug-related information in French clinical documents. To our knowledge, over the last decade, there are less than five relevant papers that study French prescriptions. This paper proposes a new approach for extracting drug-related information from French clinical scanned documents while preserving patients' privacy. In addition, we deployed our method in a health data management platform where it is used to structure drug medical data and help patients organize their drug schedules. It can be implemented on any web or mobile platform. This work closes the gap between theoretical and practical work by creating an application adapted to real production problems. It is a combination of a rule-based phase and a Deep Learning approach. Finally, numerical results show the outperformance and relevance of the proposed methodology.

【6】 Toward Explainable AI for Regression Models 标题:回归模型的可解释人工智能 链接:https://arxiv.org/abs/2112.11407

作者:Simon Letzgus,Patrick Wagner,Jonas Lederer,Wojciech Samek,Klaus-Robert Müller,Gregoire Montavon 机构:Gr´egoire Montavon∗, Machine Learning Group, Technische Universit¨at Berlin, Berlin, Germany, Department of Artificial Intelligence, Fraunhofer Heinrich Hertz Institute, Berlin, Germany 备注:17 pages, 10 figures, preprint 摘要:除了机器学习(ML)模型令人印象深刻的预测能力外,最近出现了解释方法,可以解释复杂的非线性学习模型,如深度神经网络。获得更好的理解尤其重要,例如对于安全关键的ML应用程序或医疗诊断等。虽然此类可解释AI(XAI)技术在分类器中已经非常流行,但到目前为止,很少有人关注回归模型(XAIR)的XAI。在这篇综述中,我们阐明了XAI在回归和分类任务中的基本概念差异,为XAIR建立了新的理论见解和分析,提供了XAIR在真实实际回归问题上的演示,最后讨论了该领域仍然存在的挑战。 摘要:In addition to the impressive predictive power of machine learning (ML) models, more recently, explanation methods have emerged that enable an interpretation of complex non-linear learning models such as deep neural networks. Gaining a better understanding is especially important e.g. for safety-critical ML applications or medical diagnostics etc. While such Explainable AI (XAI) techniques have reached significant popularity for classifiers, so far little attention has been devoted to XAI for regression models (XAIR). In this review, we clarify the fundamental conceptual differences of XAI for regression and classification tasks, establish novel theoretical insights and analysis for XAIR, provide demonstrations of XAIR on genuine practical regression problems, and finally discuss the challenges remaining for the field.
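
As a concrete illustration of what an explanation for a regression model can look like, here is a generic gradient-times-input attribution on a toy regressor; this is a standard baseline, not the specific XAIR analyses proposed in the paper.

```python
# Illustrative sketch: input-gradient explanation for a regression network.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 1))  # toy regressor

x = torch.randn(1, 10, requires_grad=True)
y_hat = model(x).squeeze()       # scalar regression output for this sample
y_hat.backward()                 # d y_hat / d x

# Gradient x input: a first-order estimate of each feature's contribution
# to the *predicted value* (not to a class probability, as in classification XAI).
attribution = (x.grad * x).detach().squeeze()
for i, a in enumerate(attribution.tolist()):
    print(f"feature {i}: contribution {a:+.4f}")
```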

【7】 Sports Video: Fine-Grained Action Detection and Classification of Table Tennis Strokes from Videos for MediaEval 2021 标题:体育视频:面向MediaEval 2021的视频乒乓球击球动作细粒度检测与分类 链接:https://arxiv.org/abs/2112.11384

作者:Pierre-Etienne Martin,Jordan Calandre,Boris Mansencal,Jenny Benois-Pineau,Renaud Péteri,Laurent Mascarilla,Julien Morlier 机构:CCP Department, Max Planck Institute for Evolutionary Anthropology, D-, Leipzig, Germany, MIA, La Rochelle University, La Rochelle, France, Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, Talence, France, IMS, University of Bordeaux, Talence, France 备注:MediaEval 2021, Dec 2021, Online, Germany 摘要:由于应用领域的多样性,体育视频分析是一个流行的研究课题,从具有用户定制摘要的多媒体智能设备到运动员表现分析。体育视频任务是MediaEval 2021基准测试的一部分。此任务处理视频中的细粒度动作检测和分类。重点是乒乓球比赛的录像。该任务自2019年开始运行,针对在自然条件下记录的未经修剪的视频(每次击球的时间边界已知)提出了分类挑战。今年,该数据集得到了扩展,此外,它还提供了一个来自未经剪辑的无注释视频的检测挑战。这项工作旨在为体育教练和运动员创建工具,以便分析运动成绩。运动分析和运动员档案可以建立在这些技术的基础上,以丰富运动员的训练经验,提高他们的成绩。 摘要:Sports video analysis is a prevalent research topic due to the variety of application areas, ranging from multimedia intelligent devices with user-tailored digests up to analysis of athletes' performance. The Sports Video task is part of the MediaEval 2021 benchmark. This task tackles fine-grained action detection and classification from videos. The focus is on recordings of table tennis games. Running since 2019, the task has offered a classification challenge from untrimmed video recorded in natural conditions with known temporal boundaries for each stroke. This year, the dataset is extended and offers, in addition, a detection challenge from untrimmed videos without annotations. This work aims at creating tools for sports coaches and players in order to analyze sports performance. Movement analysis and player profiling may be built upon such technology to enrich the training experience of athletes and improve their performance.

【8】 Reasoning About Causal Models With Infinitely Many Variables 标题:关于具有无穷多个变量的因果模型的推理 链接:https://arxiv.org/abs/2112.11362

作者:Joseph Y. Halpern,Spencer Peters 机构: Cornell University 摘要:广义结构方程模型(GSEMs)[Peters和Halpern 2021],顾名思义,是结构方程模型(SEM)的推广。它们可以处理(除其他外)无限范围的无限多变量,这对于捕捉动力系统至关重要。我们在GSEMs中提供了因果推理的合理和完整的公理化,这是Halpern[2000]为SEMs提供的合理和完整公理化的扩展。考虑GSEMs有助于澄清Halpern的公理所捕获的属性。 摘要:Generalized structural equations models (GSEMs) [Peters and Halpern 2021], are, as the name suggests, a generalization of structural equations models (SEMs). They can deal with (among other things) infinitely many variables with infinite ranges, which is critical for capturing dynamical systems. We provide a sound and complete axiomatization of causal reasoning in GSEMs that is an extension of the sound and complete axiomatization provided by Halpern [2000] for SEMs. Considering GSEMs helps clarify what properties Halpern's axioms capture.

【9】 Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling 标题:利用自适应客户端采样解决联邦学习的系统与统计异质性 链接:https://arxiv.org/abs/2112.11256

作者:Bing Luo,Wenli Xiao,Shiqiang Wang,Jianwei Huang,Leandros Tassiulas 机构:∗Shenzhen Institute of Artificial Intelligence and Robotics for Society, China, †School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China, ‡IBM T. J. Watson Research Center, Yorktown Heights, NY, USA 备注:Accepted in IEEE INFOCOM 2022 摘要:当参与者数量较多且服务器的通信带宽有限时,联邦学习(FL)算法通常会在每一轮(部分参与)中对一小部分客户机进行采样。最近关于FL收敛性分析的工作集中于无偏客户抽样,例如,均匀随机抽样,由于高度的系统异质性和统计异质性,其收敛时间较慢。本文旨在设计一种自适应客户机采样算法,该算法同时处理系统和统计异构性,以最小化挂钟收敛时间。对于具有任意客户抽样概率的FL算法,我们得到了一个新的易于处理的收敛界。基于这个界,我们解析地建立了总学习时间和抽样概率之间的关系,从而得到了训练时间最小化的非凸优化问题。我们设计了一个有效的算法来学习收敛界中的未知参数,并开发了一个低复杂度的算法来近似求解非凸问题。硬件原型和仿真的实验结果表明,与几种基线采样方案相比,我们提出的采样方案显著缩短了收敛时间。值得注意的是,在硬件原型中,我们的方案比统一采样基线在达到相同目标损耗方面花费的时间少73%。 摘要:Federated learning (FL) algorithms usually sample a fraction of clients in each round (partial participation) when the number of participants is large and the server's communication bandwidth is limited. Recent works on the convergence analysis of FL have focused on unbiased client sampling, e.g., sampling uniformly at random, which suffers from slow wall-clock time for convergence due to high degrees of system heterogeneity and statistical heterogeneity. This paper aims to design an adaptive client sampling algorithm that tackles both system and statistical heterogeneity to minimize the wall-clock convergence time. We obtain a new tractable convergence bound for FL algorithms with arbitrary client sampling probabilities. Based on the bound, we analytically establish the relationship between the total learning time and sampling probabilities, which results in a non-convex optimization problem for training time minimization. We design an efficient algorithm for learning the unknown parameters in the convergence bound and develop a low-complexity algorithm to approximately solve the non-convex problem. Experimental results from both hardware prototype and simulation demonstrate that our proposed sampling scheme significantly reduces the convergence time compared to several baseline sampling schemes. Notably, our scheme in hardware prototype spends 73% less time than the uniform sampling baseline for reaching the same target loss.
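
A simplified sketch of non-uniform client sampling with importance re-weighting, which is the standard way to keep the aggregated update unbiased when clients are sampled with arbitrary probabilities; the probabilities and local updates below are placeholders, not the optimized solution derived in the paper.

```python
# Illustrative sketch: sample K clients per round with probabilities p and
# re-weight each sampled update by 1/(K * p_i) so the aggregate stays unbiased.
import numpy as np

rng = np.random.default_rng(0)
num_clients, dim, K = 100, 20, 10            # K clients sampled per round

# Hypothetical sampling probabilities, e.g. favouring fast clients with useful data.
p = rng.random(num_clients)
p /= p.sum()

global_model = np.zeros(dim)
for rnd in range(5):
    sampled = rng.choice(num_clients, size=K, replace=True, p=p)
    update = np.zeros(dim)
    for i in sampled:
        local_update = rng.normal(loc=0.1, scale=1.0, size=dim)   # stand-in for local SGD
        update += local_update / (K * p[i])                       # importance weighting
    global_model += 0.1 * update / num_clients                    # averaged, unbiased step
    print(f"round {rnd}: ||model|| = {np.linalg.norm(global_model):.4f}")
```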

【10】 Mind the Gap! A Study on the Transferability of Virtual vs Physical-world Testing of Autonomous Driving Systems 标题:注意差距!自动驾驶系统虚拟测试与物理世界测试的可迁移性研究 链接:https://arxiv.org/abs/2112.11255

作者:Andrea Stocco,Brian Pulfer,Paolo Tonella 机构:Computer Society 备注:12 pages 摘要:自动驾驶汽车(SDC)的安全部署需要彻底的模拟和现场测试。大多数测试技术在仿真环境中考虑虚拟化的SDCS,而较少的努力已经被用于评估这种技术是否转移到物理现实世界的车辆上并且是有效的。在本文中,我们利用驴车开源框架对部署在物理小型车辆上的SDC测试与虚拟模拟车辆上的SDC测试进行了实证比较。在我们的实证研究中,我们调查了在大量腐败和敌对环境中,虚拟环境和现实环境之间行为和失败暴露的可转移性。虽然大量测试结果确实在虚拟环境和物理环境之间传递,但我们也发现了导致虚拟世界和物理世界之间现实差距的关键缺陷,这些缺陷威胁到了现有测试解决方案在应用于物理SDC时的潜力。 摘要:Safe deployment of self-driving cars (SDC) necessitates thorough simulated and in-field testing. Most testing techniques consider virtualized SDCs within a simulation environment, whereas less effort has been directed towards assessing whether such techniques transfer to and are effective with a physical real-world vehicle. In this paper, we leverage the Donkey Car open-source framework to empirically compare testing of SDCs when deployed on a physical small-scale vehicle vs its virtual simulated counterpart. In our empirical study, we investigate the transferability of behavior and failure exposure between virtual and real-world environments on a vast set of corrupted and adversarial settings. While a large number of testing results do transfer between virtual and physical environments, we also identified critical shortcomings that contribute to the reality gap between the virtual and physical world, threatening the potential of existing testing solutions when applied to physical SDCs.

【11】 Hateful Memes Challenge: An Enhanced Multimodal Framework 标题:仇恨模因挑战:一个增强的多模态框架 链接:https://arxiv.org/abs/2112.11244

作者:Aijing Gao,Bingjun Wang,Jiaqi Yin,Yating Tian 机构:Georgia Institute of Technology 摘要:Facebook AI提出的可恶的模因挑战吸引了世界各地的参赛者。挑战的重点是在多模态模因中检测仇恨言语。各种最先进的深度学习模型已应用于此问题,challenge排行榜的表现也在不断提高。在本文中,我们增强了仇恨检测框架,包括利用Detectron进行特征提取,探索具有不同损失函数的VisualBERT和UNITER模型的不同设置,研究仇恨模因与敏感文本特征之间的关联,最后建立集成方法来提高模型性能。我们微调的VisualBERT、UNITER和ensemble方法的AUROC在挑战测试集上分别达到0.765、0.790和0.803,优于基线模型。我们的代码可在https://github.com/yatingtian/hateful-meme 摘要:Hateful Meme Challenge proposed by Facebook AI has attracted contestants around the world. The challenge focuses on detecting hateful speech in multimodal memes. Various state-of-the-art deep learning models have been applied to this problem and the performance on challenge's leaderboard has also been constantly improved. In this paper, we enhance the hateful detection framework, including utilizing Detectron for feature extraction, exploring different setups of VisualBERT and UNITER models with different loss functions, researching the association between the hateful memes and the sensitive text features, and finally building ensemble method to boost model performance. The AUROC of our fine-tuned VisualBERT, UNITER, and ensemble method achieves 0.765, 0.790, and 0.803 on the challenge's test set, respectively, which beats the baseline models. Our code is available at https://github.com/yatingtian/hateful-meme
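
A minimal sketch of the probability-averaging ensemble and AUROC evaluation mentioned above, using random stand-in scores in place of real VisualBERT/UNITER outputs.

```python
# Illustrative sketch: average per-model hateful probabilities and score with AUROC.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1000
labels = rng.integers(0, 2, size=n)                   # 1 = hateful, 0 = benign

# Simulated per-model probabilities that are weakly correlated with the labels.
def fake_model_scores(strength):
    return np.clip(labels * strength + rng.normal(0.4, 0.25, size=n), 0, 1)

probs_visualbert = fake_model_scores(0.30)
probs_uniter = fake_model_scores(0.35)

ensemble = (probs_visualbert + probs_uniter) / 2      # simple probability averaging
for name, probs in [("VisualBERT", probs_visualbert), ("UNITER", probs_uniter), ("ensemble", ensemble)]:
    print(f"{name:10s} AUROC = {roc_auc_score(labels, probs):.3f}")
```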

【12】 Unsupervised deep learning techniques for powdery mildew recognition based on multispectral imaging 标题:基于多光谱成像的无监督深度学习白粉病识别技术 链接:https://arxiv.org/abs/2112.11242

作者:Alessandro Benfenati,Paola Causin,Roberto Oberti,Giovanni Stefanello 机构: Dept. of Environmental Science and Policy, Universita degli Studi di Milano, Milano, Dept. of Mathematics, Universita degli Studi di Milano, Milano, Italy, Dept. of Agricultural and Environmental Sciences - Production, Landscape 摘要:目标。植物病害的可持续管理是一项具有相关经济和环境影响的公开挑战。最佳策略依赖于人类专业知识,在有利条件下进行实地侦察,以评估疾病症状的当前存在和程度。这项劳动密集型任务由于要侦察的大面积区域以及要检测的早期症状的毫米级大小而变得复杂。有鉴于此,基于图像的早期疾病症状检测是自动化该过程的一种有吸引力的方法,能够以可持续的成本实现潜在的高通量监测。方法。深度学习已成功地应用于各个领域,通过训练过程学习滤波器,自动选择相关图像特征。深度学习最近也进入了植物病害检测领域:基于这一思想,我们提出了一种自动识别黄瓜叶片白粉病的深度学习方法。我们专注于应用于多光谱成像数据的无监督深度学习技术,并建议使用自动编码器架构来研究两种疾病检测策略:i)压缩空间中的特征聚类;ii)异常检测。后果这两种方法已通过定量指标进行了评估。聚类方法本身并不能提供准确的预测,但它确实满足相关信息的需要。相反,异常检测具有很大的分辨率潜力,可以进一步利用它作为有监督体系结构(标记样本数量非常有限)的先验知识。 摘要:Objectives. Sustainable management of plant diseases is an open challenge which has relevant economic and environmental impact. Optimal strategies rely on human expertise for field scouting under favourable conditions to assess the current presence and extent of disease symptoms. This labor-intensive task is complicated by the large field area to be scouted, combined with the millimeter-scale size of the early symptoms to be detected. In view of this, image-based detection of early disease symptoms is an attractive approach to automate this process, enabling a potential high throughput monitoring at sustainable costs. Methods. Deep learning has been successfully applied in various domains to obtain an automatic selection of the relevant image features by learning filters via a training procedure. Deep learning has recently entered also the domain of plant disease detection: following this idea, in this work we present a deep learning approach to automatically recognize powdery mildew on cucumber leaves. We focus on unsupervised deep learning techniques applied to multispectral imaging data and we propose the use of autoencoder architectures to investigate two strategies for disease detection: i) clusterization of features in a compressed space; ii) anomaly detection. Results. The two proposed approaches have been assessed by quantitative indices. The clusterization approach is not fully capable by itself to provide accurate predictions but it does cater relevant information. Anomaly detection has instead a significant potential of resolution which could be further exploited as a prior for supervised architectures with a very limited number of labeled samples.
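
A hedged sketch of the anomaly-detection strategy (ii): a small autoencoder is trained on healthy multispectral patches, and patches with unusually high reconstruction error are flagged. Patch size, band count and the threshold rule are arbitrary choices for the example, not the paper's configuration.

```python
# Illustrative sketch: reconstruction-error anomaly detection on multispectral patches.
import torch
import torch.nn as nn

torch.manual_seed(0)
bands, patch = 5, 8                                  # 5 spectral bands, 8x8 patches
in_dim = bands * patch * patch

autoencoder = nn.Sequential(
    nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, 16),   # encoder
    nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, in_dim),   # decoder
)
opt = torch.optim.Adam(autoencoder.parameters(), lr=1e-3)

healthy = torch.rand(512, in_dim)                    # stand-in for healthy training patches
for _ in range(200):
    recon = autoencoder(healthy)
    loss = nn.functional.mse_loss(recon, healthy)
    opt.zero_grad()
    loss.backward()
    opt.step()

# At inference, anomalous (e.g. mildew-covered) patches reconstruct poorly.
test = torch.rand(10, in_dim)
with torch.no_grad():
    errors = ((autoencoder(test) - test) ** 2).mean(dim=1)
threshold = errors.mean() + 2 * errors.std()         # simple statistical threshold
print("flagged as anomalous:", (errors > threshold).nonzero().flatten().tolist())
```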

【13】 A next-generation platform for Cyber Range-as-a-Service 标题:面向网络靶场即服务的下一代平台 链接:https://arxiv.org/abs/2112.11233

作者:Vittorio Orbinato 机构:DIETI, Universita degli Studi di Napoli Federico II, Naples, Italy 摘要:在过去几年中,网络靶场已成为训练应对网络威胁和攻击的专业人员的普遍解决方案。云计算在这方面起着关键作用,因为它能够创建虚拟基础设施,网络范围就是基于这些基础设施的。然而,赛博靶场的设置和管理是昂贵且耗时的活动。在本文中,我们重点介绍下一代Cyber Range平台的新功能。特别是,这些功能包括为实际的公司基础设施创建虚拟克隆、将安全管理人员从训练场景和会话的设置中解放出来、自动监控参与者的活动以及模拟他们的行为。 摘要:In the last years, Cyber Ranges have become a widespread solution to train professionals for responding to cyber threats and attacks. Cloud computing plays a key role in this context since it enables the creation of virtual infrastructures on which Cyber Ranges are based. However, the setup and management of Cyber Ranges are expensive and time-consuming activities. In this paper, we highlight the novel features for the next-generation Cyber Range platforms. In particular, these features include the creation of a virtual clone for an actual corporate infrastructure, relieving the security managers from the setup of the training scenarios and sessions, the automatic monitoring of the participants' activities, and the emulation of their behavior.

【14】 Interpretable Preference-based Reinforcement Learning with Tree-Structured Reward Functions 标题:基于树结构奖励函数的可解释偏好强化学习 链接:https://arxiv.org/abs/2112.11230

作者:Tom Bewley,Freddy Lecue 机构:University of Bristol, Bristol, United Kingdom, CortAIx, Thales, Montréal, Canada 备注:Accepted for publication at the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2022) 摘要:强化学习(RL)提供一致且性能良好的代理的潜力部分受到奖励工程问题的制约。启发式试错法的一种替代方法是基于偏好的RL(PbRL),其中奖励函数是从稀疏的人类反馈推断出来的。然而,以前的PbRL方法缺乏对学习到的奖励结构的可解释性,这妨碍了评估稳健性和一致性的能力。我们提出了一种在线的主动偏好学习算法,该算法利用树的内在可解释的组合结构构造奖励函数。通过使用合成反馈和人工反馈,我们展示了在几种环境中对树结构奖励函数的样本有效学习,然后利用增强的解释能力来探索和调试对齐。 摘要:The potential of reinforcement learning (RL) to deliver aligned and performant agents is partially bottlenecked by the reward engineering problem. One alternative to heuristic trial-and-error is preference-based RL (PbRL), where a reward function is inferred from sparse human feedback. However, prior PbRL methods lack interpretability of the learned reward structure, which hampers the ability to assess robustness and alignment. We propose an online, active preference learning algorithm that constructs reward functions with the intrinsically interpretable, compositional structure of a tree. Using both synthetic and human-provided feedback, we demonstrate sample-efficient learning of tree-structured reward functions in several environments, then harness the enhanced interpretability to explore and debug for alignment.

【15】 Energy-bounded Learning for Robust Models of Code 标题:面向代码鲁棒模型的能量有界学习 链接:https://arxiv.org/abs/2112.11226

作者:Nghi D. Q. Bui,Yijun Yu 机构:Trustworthiness Lab, Huawei Ireland Research Centre 备注:arXiv admin note: text overlap with arXiv:2010.03759 by other authors 摘要:在编程中,学习代码表示有多种应用,包括代码分类、代码搜索、注释生成、错误预测等。已经提出了以标记、语法树、依赖关系图、代码导航路径或其变体的组合表示代码的各种方法,但是,现有的普通学习技术在鲁棒性方面有一个主要限制,即当输入以微妙的方式改变时,模型很容易做出错误的预测。为了增强鲁棒性,现有的方法侧重于识别对抗性样本,而不是特定分布之外的有效样本,我们称之为分布外(out-of-distribution,OOD)样本。识别此类OOD样本是本文研究的新问题。为此,我们建议首先使用分布外样本扩充分布内数据集,以便在一起训练时,它们将增强模型的鲁棒性。我们建议使用能量有界学习目标函数为分布内样本分配较高的分数,为分布外样本分配较低的分数,以便将这种分布外样本纳入源代码模型的训练过程。在OOD检测和对抗性样本检测方面,我们的评估结果表明,现有源代码模型具有更强的鲁棒性,能够更准确地识别OOD数据,同时更能抵抗对抗性攻击。此外,所提出的能量有界分数大大优于所有现有的OOD检测分数,包括softmax置信分数、Mahalanobis分数和ODIN。 摘要:In programming, learning code representations has a variety of applications, including code classification, code search, comment generation, bug prediction, and so on. Various representations of code in terms of tokens, syntax trees, dependency graphs, code navigation paths, or a combination of their variants have been proposed, however, existing vanilla learning techniques have a major limitation in robustness, i.e., it is easy for the models to make incorrect predictions when the inputs are altered in a subtle way. To enhance the robustness, existing approaches focus on recognizing adversarial samples rather than on the valid samples that fall outside a given distribution, which we refer to as out-of-distribution (OOD) samples. Recognizing such OOD samples is the novel problem investigated in this paper. To this end, we propose to first augment the in-distribution datasets with out-of-distribution samples such that, when trained together, they will enhance the model's robustness. We propose the use of an energy-bounded learning objective function to assign a higher score to in-distribution samples and a lower score to out-of-distribution samples in order to incorporate such out-of-distribution samples into the training process of source code models. In terms of OOD detection and adversarial samples detection, our evaluation results demonstrate a greater robustness for existing source code models to become more accurate at recognizing OOD data while being more resistant to adversarial attacks at the same time. Furthermore, the proposed energy-bounded score outperforms all existing OOD detection scores by a large margin, including the softmax confidence score, the Mahalanobis score, and ODIN.
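
A minimal sketch of the energy score that underlies energy-based OOD detection: the energy of the logits, -logsumexp(f(x)), tends to be lower for in-distribution inputs than for OOD ones. The toy logits and threshold below are assumptions, not the paper's source-code model.

```python
# Illustrative sketch of the energy score E(x) = -logsumexp(f(x)) for OOD detection.
import torch

torch.manual_seed(0)
num_classes = 10
logits_in = torch.randn(5, num_classes) * 4.0     # confident, peaked logits (in-distribution)
logits_ood = torch.randn(5, num_classes) * 0.5    # flatter logits, e.g. unfamiliar code

def energy(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    return -temperature * torch.logsumexp(logits / temperature, dim=1)

e_in, e_ood = energy(logits_in), energy(logits_ood)
print("mean energy, in-distribution:", e_in.mean().item())
print("mean energy, out-of-distribution:", e_ood.mean().item())

# Simple decision rule: flag inputs whose energy exceeds a validation-chosen threshold.
threshold = -3.0                                  # hypothetical value
print("in-dist flagged as OOD:", (e_in > threshold).tolist())
print("OOD flagged as OOD:   ", (e_ood > threshold).tolist())
```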

【16】 Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles 标题:具有时变状态和控制约束的基于模型的安全强化学习:在智能车辆中的应用 链接:https://arxiv.org/abs/2112.11217

作者:Xinglong Zhang,Yaoqian Peng,Biao Luo,Wei Pan,Xin Xu,Haibin Xie 机构: NationalUniversity of Defense Technology, Biao Lu is with theSchool of Automation, Central South University, Wei Panis with the Department of Cognitive Robotics, Delft University of Technology 备注:13 pages, 7 figures 摘要:近年来,基于屏障函数的安全强化学习(RL)受到了越来越多的关注,该学习具有参与者-批评家结构,用于连续控制任务。学习具有安全性和收敛性保证的近似最优控制策略仍然是一个挑战。此外,很少有研究涉及时变安全约束下的安全RL算法设计。针对具有时变状态和控制约束的非线性系统的最优控制问题,提出了一种基于模型的安全RL算法。在该方法中,我们构造了一种新的基于屏障的控制策略结构,可以保证控制安全。提出了一种多步策略评估机制,用于预测时变安全约束下的策略安全风险,指导策略安全更新。证明了稳定性和鲁棒性的理论结果。同时,对演员-评论家学习算法的收敛性进行了分析。在模拟的安全健身房环境中,该算法的性能优于几种最新的RL算法。此外,该方法还应用于两个实际智能车辆的集成路径跟踪和碰撞避免问题。使用差速驱动车辆和Ackermann驱动车辆分别验证离线部署性能和在线学习性能。我们的方法在实验中显示了令人印象深刻的模拟真实传输能力和令人满意的在线控制性能。 摘要:Recently, barrier function-based safe reinforcement learning (RL) with the actor-critic structure for continuous control tasks has received increasing attention. It is still challenging to learn a near-optimal control policy with safety and convergence guarantees. Also, few works have addressed the safe RL algorithm design under time-varying safety constraints. This paper proposes a model-based safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints. In the proposed approach, we construct a novel barrier-based control policy structure that can guarantee control safety. A multi-step policy evaluation mechanism is proposed to predict the policy's safety risk under time-varying safety constraints and guide the policy to update safely. Theoretical results on stability and robustness are proven. Also, the convergence of the actor-critic learning algorithm is analyzed. The performance of the proposed algorithm outperforms several state-of-the-art RL algorithms in the simulated Safety Gym environment. Furthermore, the approach is applied to the integrated path following and collision avoidance problem for two real-world intelligent vehicles. A differential-drive vehicle and an Ackermann-drive one are used to verify the offline deployment performance and the online learning performance, respectively. Our approach shows an impressive sim-to-real transfer capability and a satisfactory online control performance in the experiment.

【17】 Value Activation for Bias Alleviation: Generalized-activated Deep Double Deterministic Policy Gradients 标题:用于缓解偏差的价值激活:广义激活的深度双确定性策略梯度 链接:https://arxiv.org/abs/2112.11216

作者:Jiafei Lyu,Yu Yang,Jiangpeng Yan,Xiu Li 机构:• We propose a novel generalized-activated weighting operator for bias alleviation in deep reinforcement learning., • We theoretically and experimentally show that the distance between the max operator and the generalized-activated 备注:13 pages 摘要:在深度强化学习(DRL)中,准确估计值函数是至关重要的,这样agent就可以执行正确的操作,而不是次优的操作。然而,现有的演员批评方法或多或少地受到低估或高估偏差的影响,这对他们的表现产生了负面影响。在本文中,我们揭示了一个简单但有效的原理:适当的值校正有利于减少偏差,我们提出了广义激活加权算子,它使用任何非递减函数,即激活函数,作为更好的值估计的权重。特别地,我们将广义激活加权算子集成到值估计中,并引入了一种新的算法——广义激活深度双确定性策略梯度(GD3)。我们从理论上证明了GD3能够缓解潜在的估计偏差。我们有趣地发现,简单的激活函数可以在不增加额外技巧的情况下获得令人满意的性能,并且有助于更快的收敛。在大量具有挑战性的连续控制任务上的实验结果表明,具有任务特定激活的GD3优于常用的基线方法。我们还发现了一个事实,即微调多项式激活函数可以在大多数任务中获得更好的结果。 摘要:It is vital to accurately estimate the value function in Deep Reinforcement Learning (DRL) such that the agent could execute proper actions instead of suboptimal ones. However, existing actor-critic methods suffer more or less from underestimation bias or overestimation bias, which negatively affect their performance. In this paper, we reveal a simple but effective principle: proper value correction benefits bias alleviation, where we propose the generalized-activated weighting operator that uses any non-decreasing function, namely activation function, as weights for better value estimation. Particularly, we integrate the generalized-activated weighting operator into value estimation and introduce a novel algorithm, Generalized-activated Deep Double Deterministic Policy Gradients (GD3). We theoretically show that GD3 is capable of alleviating the potential estimation bias. We interestingly find that simple activation functions lead to satisfying performance with no additional tricks, and could contribute to faster convergence. Experimental results on numerous challenging continuous control tasks show that GD3 with task-specific activation outperforms the common baseline methods. We also uncover a fact that fine-tuning the polynomial activation function achieves superior results on most of the tasks.
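
A small numerical sketch of the generalized-activated weighting idea: candidate value estimates are aggregated with weights given by a non-decreasing activation of the values themselves, landing between the overestimating max and the underestimating min. The exponential and polynomial activations below are just example choices, not the paper's exact operator.

```python
# Illustrative sketch: weight candidate Q-values by a non-decreasing activation.
import numpy as np

def generalized_activated_value(q_candidates: np.ndarray, activation=np.exp, beta: float = 2.0) -> float:
    """Aggregate candidate Q-values with activation-derived weights."""
    w = activation(beta * q_candidates)
    w = w / w.sum()
    return float((w * q_candidates).sum())

# Candidate Q-values, e.g. from twin critics evaluated on several sampled actions.
q = np.array([1.0, 1.4, 0.6, 2.1])
print("hard max   :", q.max())                                  # prone to overestimation
print("hard min   :", q.min())                                  # prone to underestimation
print("exp-weighted:", generalized_activated_value(q))          # lands in between
print("poly-weighted:", generalized_activated_value(q, activation=lambda x: np.maximum(x, 0.0) ** 3, beta=1.0))
```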

【18】 Interpretable Knowledge Tracing: Simple and Efficient Student Modeling with Causal Relations 标题:可解释知识追踪:具有因果关系的简单有效的学生建模 链接:https://arxiv.org/abs/2112.11209

作者:Sein Minn,Jill-Jenn Vie,Koh Takeuchi,Hisashi Kashima,Feida Zhu 机构: Univ. Lille, Inria, CNRS, Centrale Lille, UMR , - CRIStAL, F-, Lille, France, Universit´e Paris-Saclay, Inria, CEA, Palaiseau, France, Kyoto University, Japan, Singapore Management University, Singapore 备注:AAAI Symposium on Educational Advances in Artificial Intelligence EAAI-22. arXiv admin note: text overlap with arXiv:2012.12218 摘要:智能教学系统在未来的学习环境中变得至关重要。知识追踪(KT)是该系统的关键部分。它是关于推断学生的技能掌握情况并预测他们的表现,从而相应地调整课程。与传统模型相比,基于深度学习的KT模型具有显著的预测性能。然而,很难从神经网络中成千上万个与认知理论相关的参数中提取出有心理学意义的解释。有几种方法可以实现学生成绩预测的高准确性,但诊断和预测推理在学习科学中更为关键。由于KT问题几乎没有可观察的特征(问题ID和学生在每次练习中的正确性),我们使用机器学习和数据挖掘技术从学生的反应数据中提取有意义的潜在特征。在这项工作中,我们提出了可解释知识追踪(IKT),这是一个简单的模型,依赖于三个有意义的潜在特征:个人技能掌握、能力概况(跨技能学习迁移)和问题难度。IKT对未来学生表现的预测是使用树增强朴素贝叶斯分类器(TAN)进行的,因此其预测比基于深度学习的学生模型更容易解释。IKT也显示出比基于深度学习的学生模型更好的学生成绩预测,而不需要大量的参数。我们对每个特征进行消融研究,以检查它们对学生成绩预测的贡献。因此,IKT在现实教育系统中提供具有因果推理的自适应个性化教学方面具有巨大潜力。 摘要:Intelligent Tutoring Systems have become critically important in future learning environments. Knowledge Tracing (KT) is a crucial part of that system. It is about inferring the skill mastery of students and predicting their performance to adjust the curriculum accordingly. Deep Learning-based KT models have shown significant predictive performance compared with traditional models. However, it is difficult to extract psychologically meaningful explanations from the tens of thousands of parameters in neural networks, that would relate to cognitive theory. There are several ways to achieve high accuracy in student performance prediction but diagnostic and prognostic reasoning is more critical in learning sciences. Since KT problem has few observable features (problem ID and student's correctness at each practice), we extract meaningful latent features from students' response data by using machine learning and data mining techniques. In this work, we present Interpretable Knowledge Tracing (IKT), a simple model that relies on three meaningful latent features: individual skill mastery, ability profile (learning transfer across skills), and problem difficulty. IKT's prediction of future student performance is made using a Tree-Augmented Naive Bayes Classifier (TAN), therefore its predictions are easier to explain than deep learning-based student models. IKT also shows better student performance prediction than deep learning-based student models without requiring a huge amount of parameters. We conduct ablation studies on each feature to examine their contribution to student performance prediction. Thus, IKT has great potential for providing adaptive and personalized instructions with causal reasoning in real-world educational systems.
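
A hedged sketch of the final prediction step using the three interpretable features named in the abstract; since scikit-learn has no Tree-Augmented Naive Bayes, a plain Gaussian Naive Bayes stands in, and the synthetic data is only for illustration.

```python
# Illustrative sketch: predict answer correctness from three interpretable latent features.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
n = 2000
skill_mastery = rng.random(n)          # estimated mastery of the current skill
ability_profile = rng.random(n)        # learning-transfer ability across skills
problem_difficulty = rng.random(n)     # difficulty of the attempted problem

# Synthetic labels: easier problems and higher mastery -> more likely correct.
logit = 3 * skill_mastery + 1.5 * ability_profile - 3 * problem_difficulty
correct = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([skill_mastery, ability_profile, problem_difficulty])
clf = GaussianNB().fit(X[:1500], correct[:1500])   # stand-in for the TAN classifier
print("held-out accuracy:", round(clf.score(X[1500:], correct[1500:]), 3))
```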

【19】 Artificial Intelligence Ethics and Safety: practical tools for creating "good" models 链接:https://arxiv.org/abs/2112.11208

作者:Nicholas Kluge Corrêa 机构: Master's degree in Electrical Engineering and doctoral student in Philosophy − PUCRS., Fellow of the Academic Excellence Program (Proex) of the CAPES Foundation, (Coordination for the Improvement of Higher Education Personnel). President of the 备注:English version 摘要:人工智能机器人伦理协会(AIRES)是一个非营利组织,由Aaron Hui于2018年成立,旨在提高对人工智能伦理实施和监管的认识和重要性。AIRES目前已在多所大学设立分会,如加州大学洛杉矶分校(UCLA)、南加州大学(USC)、加州理工学院(Caltech)、斯坦福大学、康奈尔大学、布朗大学以及巴西南里奥格兰德州天主教大学(PUCRS)。PUCRS的AIRES分会是AIRES的第一个国际分会,因此,我们致力于促进和加强AIRES的使命。我们的使命是专注于向未来的人工智能领导者教授道德原则,以确保人工智能以道德和负责任的方式创建。由于关于如何在人工智能系统开发实践中实施伦理原则和规范性指导方针的建议仍然很少,这项工作的目标是试图弥合话语与实践之间、抽象原则与技术实现之间的差距。在这项工作中,我们试图向读者介绍人工智能伦理与安全这一主题。同时,我们提供了一些工具来帮助智能系统开发人员开发"好"的模型。这项工作是一份以英语和葡萄牙语出版、仍在完善中的指南。欢迎提供意见和建议。 摘要:The AI Robotics Ethics Society (AIRES) is a non-profit organization founded in 2018 by Aaron Hui to promote awareness and the importance of ethical implementation and regulation of AI. AIRES is now an organization with chapters at universities such as UCLA (Los Angeles), USC (University of Southern California), Caltech (California Institute of Technology), Stanford University, Cornell University, Brown University, and the Pontifical Catholic University of Rio Grande do Sul (Brazil). AIRES at PUCRS is the first international chapter of AIRES, and as such, we are committed to promoting and enhancing the AIRES Mission. Our mission is to focus on educating the AI leaders of tomorrow in ethical principles to ensure that AI is created ethically and responsibly. As there are still few proposals for how we should implement ethical principles and normative guidelines in the practice of AI system development, the goal of this work is to try to bridge this gap between discourse and praxis. Between abstract principles and technical implementation. In this work, we seek to introduce the reader to the topic of AI Ethics and Safety. At the same time, we present several tools to help developers of intelligent systems develop "good" models. This work is a developing guide published in English and Portuguese. Contributions and suggestions are welcome.

【20】 Building a Decision Support System for Automated Mobile Asthma Monitoring in Remote Areas 标题:构建偏远地区移动哮喘自动监测决策支持系统 链接:https://arxiv.org/abs/2112.11195

作者:Chinazunwa Uwaoma,Gunjan Mansingh 机构: The University of the West Indies, The University of West Indies 摘要:移动计算的进步为使用智能手机作为数据采集、分析和展示平台开发若干健康应用程序铺平了道路。mhealth系统广泛应用的领域包括监测心血管疾病和肺部疾病等长期健康状况,以及检测这些状况基线测量值的变化。由于与哮喘相关的经济、社会和情绪负担,哮喘是全球范围内日益受到关注的呼吸道疾病之一。哮喘的管理和控制可以通过持续实时监测病情得到改善,因为哮喘发作可以随时随地发生。本文提出使用装有嵌入式传感器的智能手机来捕捉和分析运动引发的哮喘的早期症状。该系统的设计基于决策支持系统技术,用于测量和分析患者体力活动的水平和类型以及易患哮喘的天气条件。初步结果表明,智能手机可以在没有其他联网设备的情况下监测和检测哮喘症状。这将提高卫生系统的可用性,同时确保用户数据隐私,并降低系统部署的总体成本。此外,拟议的系统可作为低收入国家哮喘患者快速医疗反应的便捷工具,在这些国家,获得专门医疗设备的机会有限,卫生专业人员短缺。这种监测系统的发展标志着减轻全球哮喘负担的积极反应。 摘要:Advances in mobile computing have paved the way for the development of several health applications using smartphone as a platform for data acquisition, analysis and presentation. Such areas where mhealth systems have been extensively deployed include monitoring of long term health conditions like Cardio Vascular Diseases and pulmonary disorders, as well as detection of changes from baseline measurements of such conditions. Asthma is one of the respiratory conditions with growing concern across the globe due to the economic, social and emotional burden associated with the ailment. The management and control of asthma can be improved by consistent monitoring of the condition in realtime since attack could occur anytime and anywhere. This paper proposes the use of smartphone equipped with embedded sensors, to capture and analyze early symptoms of asthma triggered by exercise. The system design is based on Decision Support System techniques for measuring and analyzing the level and type of patients physical activity as well as weather conditions that predispose asthma attack. Preliminary results show that smartphones can be used to monitor and detect asthma symptoms without other networked devices. This would enhance the usability of the health system while ensuring users data privacy, and reducing the overall cost of system deployment. Further, the proposed system can serve as a handy tool for a quick medical response for asthmatics in low income countries where there are limited access to specialized medical devices and shortages of health professionals. Development of such monitoring systems signals a positive response to lessen the global burden of asthma.

【21】 There is an elephant in the room: Towards a critique on the use of fairness in biometrics 标题:房间里有一头大象:对在生物特征识别中使用公平性的批评 链接:https://arxiv.org/abs/2112.11193

作者:Ana Valdivia,Júlia Corbera-Serrajòrdia,Aneta Swianiewicz 机构:King’s College London (KCL), London, UK 备注:14 pages, 3 figures 摘要:2019年,英国上法庭移民和庇护分庭驳回了一项庇护上诉,该上诉基于生物识别系统的输出以及其他差异。在生物特征数据库中发现了寻求庇护者的指纹,这与上诉人的陈述相矛盾。法庭认为这一证据毫不含糊,并驳回了庇护申请。如今,生物识别系统的扩散正在围绕其政治、社会和道德影响形成公众辩论。然而,尽管对将这项技术用于移民控制的种族化的担忧不断增加,但生物特征识别行业和创新方面的投资却在大幅增加。此外,公平性最近也被生物识别技术所采用,以减轻对生物识别技术的偏见和歧视。然而,算法公平性不能在被破坏的场景中分配正义,或者其目的是歧视,例如部署在边境的生物特征识别。在本文中,我们对最近关于生物特征公平性的争论进行了批判性解读,并利用机器学习中的公平性研究和批判性边界研究显示了其局限性。基于先前的公平性演示,我们证明了生物特征公平性标准在数学上是互斥的。然后,论文继续从经验上说明,通过复制以前作品中的实验,公平的生物识别系统是不可能的。最后,我们将辩论置于边界,讨论生物特征识别中的公平政治。我们声称偏见和错误率对公民和寻求庇护者有不同的影响。公平性掩盖了生物特征识别领域中的大象,关注人口偏见和算法的道德论述,而不是研究这些系统如何再现历史和政治不公平。 摘要:In 2019, the UK's Immigration and Asylum Chamber of the Upper Tribunal dismissed an asylum appeal basing the decision on the output of a biometric system, alongside other discrepancies. The fingerprints of the asylum seeker were found in a biometric database which contradicted the appellant's account. The Tribunal found this evidence unequivocal and denied the asylum claim. Nowadays, the proliferation of biometric systems is shaping public debates around its political, social and ethical implications. Yet whilst concerns towards the racialised use of this technology for migration control have been on the rise, investment in the biometrics industry and innovation is increasing considerably. Moreover, fairness has also been recently adopted by biometrics to mitigate bias and discrimination on biometrics. However, algorithmic fairness cannot distribute justice in scenarios which are broken or intended purpose is to discriminate, such as biometrics deployed at the border. In this paper, we offer a critical reading of recent debates about biometric fairness and show its limitations drawing on research in fairness in machine learning and critical border studies. Building on previous fairness demonstrations, we prove that biometric fairness criteria are mathematically mutually exclusive. Then, the paper moves on illustrating empirically that a fair biometric system is not possible by reproducing experiments from previous works. Finally, we discuss the politics of fairness in biometrics by situating the debate at the border. We claim that bias and error rates have different impact on citizens and asylum seekers. Fairness has overshadowed the elephant in the room of biometrics, focusing on the demographic biases and ethical discourses of algorithms rather than examine how these systems reproduce historical and political injustices.

【22】 Developing a Trusted Human-AI Network for Humanitarian Benefit 标题:发展可信赖的人类-人工智能网络,为人道主义造福 链接:https://arxiv.org/abs/2112.11191

作者:Susannah Kate Devitt,Jason Scholz,Timo Schless,Larry Lewis 机构:Trusted Autonomous Systems, Australia, University of Queensland, Australia, RMIT, Australia, Whiteflag Foundation, Netherlands, Center for Naval Analysis, Arlington, Virginia, USA 备注:30 pages, 7 figures, 2 boxes, submitted for peer review to the Journal of Digital War, My War Special Issue 摘要:人类和人工智能(AI)将越来越多地以数字和物理方式参与冲突,但缺乏跨代理和平台的可信通信。例如,灾难和冲突中的人类已经使用信息和社交媒体来共享信息,然而,国际人道主义救援组织将这些信息视为无法验证和不可信。人工智能可以减少“战争迷雾”并改善结果,但人工智能的实施往往脆弱,应用范围狭窄,道德风险大。与此同时,人为错误甚至对致力于遵守国际人道主义法的战斗人员造成重大平民伤害。大赦国际提供了一个机会,帮助减少战争悲剧,并向需要的人提供人道主义援助。在本文中,我们考虑将通信协议(WieFLAG协议)、分布式分类帐技术和信息融合与人工智能(AI)相结合,以改进称为“受保护的保证理解情况和实体”(暂停)的冲突通信。这样一个可信的人工智能通信网络可以提供有关受保护实体、关键基础设施的可靠信息交换;冲突中人和机器的人道主义信号和状态更新。 摘要:Humans and artificial intelligences (AI) will increasingly participate digitally and physically in conflicts, yet there is a lack of trusted communications across agents and platforms. For example, humans in disasters and conflict already use messaging and social media to share information, however, international humanitarian relief organisations treat this information as unverifiable and untrustworthy. AI may reduce the 'fog-of-war' and improve outcomes, however AI implementations are often brittle, have a narrow scope of application and wide ethical risks. Meanwhile, human error causes significant civilian harms even by combatants committed to complying with international humanitarian law. AI offers an opportunity to help reduce the tragedy of war and deliver humanitarian aid to those who need it. In this paper we consider the integration of a communications protocol (the 'Whiteflag protocol'), distributed ledger technology, and information fusion with artificial intelligence (AI), to improve conflict communications called 'Protected Assurance Understanding Situation and Entities' (PAUSE). Such a trusted human-AI communication network could provide accountable information exchange regarding protected entities, critical infrastructure; humanitarian signals and status updates for humans and machines in conflicts.

【23】 Predicting infections in the Covid-19 pandemic -- lessons learned 标题:预测冠状病毒大流行中的感染--吸取的教训 链接:https://arxiv.org/abs/2112.11187

作者:Sharare Zehtabian,Siavash Khodadadeh,Damla Turgut,Ladislau Bölöni 机构:Department of Computer Science, University of Central Florida, Orlando, Florida 摘要:在整个COVID-19大流行中,已经投入了大量的努力来开发技术,这些技术在各种公共政策和非药物干预假设下预测感染的数量。虽然可用数据、人工智能模型的复杂性和可用计算能力都超过了前几年的可用数据,但预测方法的总体成功率非常有限。在本文中,我们从预测算法提出X大奖奖流行病应对挑战,并考虑几个方向,可能会允许他们的改进。然后,我们调查了他们在几个月的中期预测中的表现。我们发现,通过增加有关建模区域文化的额外信息来增强算法,结合传统的分区模型和最新的深度学习架构,可以提高短期预测的性能,中期预测的准确性仍然很低,要使这些模型成为公共政策工具箱的可靠组成部分,还需要进行大量的未来研究。 摘要:Throughout the Covid-19 pandemic, a significant amount of effort had been put into developing techniques that predict the number of infections under various assumptions about the public policy and non-pharmaceutical interventions. While both the available data and the sophistication of the AI models and available computing power exceed what was available in previous years, the overall success of prediction approaches was very limited. In this paper, we start from prediction algorithms proposed for XPrize Pandemic Response Challenge and consider several directions that might allow their improvement. Then, we investigate their performance over medium-term predictions extending over several months. We find that augmenting the algorithms with additional information about the culture of the modeled region, incorporating traditional compartmental models and up-to-date deep learning architectures can improve the performance for short term predictions, the accuracy of medium-term predictions is still very low and a significant amount of future research is needed to make such models a reliable component of a public policy toolbox.
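
For reference, a minimal SIR compartmental model of the kind the authors combine with learned predictors; the parameters are arbitrary and not fitted to any Covid-19 data.

```python
# Illustrative sketch: forward-Euler simulation of a basic SIR compartmental model.
import numpy as np

def simulate_sir(beta: float, gamma: float, s0: float, i0: float, days: int, dt: float = 0.1):
    s, i, r = s0, i0, 1.0 - s0 - i0
    infected = []
    for step in range(int(days / dt)):
        ds = -beta * s * i              # new infections leave the susceptible pool
        di = beta * s * i - gamma * i   # infections minus recoveries
        dr = gamma * i
        s, i, r = s + ds * dt, i + di * dt, r + dr * dt
        if step % int(1 / dt) == 0:     # record one value per simulated day
            infected.append(i)
    return np.array(infected)

curve = simulate_sir(beta=0.25, gamma=0.1, s0=0.99, i0=0.01, days=120)
print("peak infectious fraction:", round(float(curve.max()), 4), "on day", int(curve.argmax()))
```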

【24】 Task-oriented Dialogue Systems: performance vs. quality-optima, a review 标题:任务型对话系统:性能与质量最优,综述 链接:https://arxiv.org/abs/2112.11176

作者:Ryan Fellows,Hisham Ihshaish,Steve Battle,Ciaran Haines,Peter Mayhew,J. Ignacio Deza 机构:Computer Science Research Centre (CSRC), University of the West of England (UWE), Bristol, United Kingdom, GE Aviation, Cheltenham, United Kingdom, Universidad Atl´antida Argentina, Mar del Plata, Argentina 摘要:面向任务的对话系统(TOD)正在继续流行,因为各个行业都在寻找有效利用其能力的方法,从而节省时间和金钱。然而,即使是最先进的TOD也尚未充分发挥其潜力。TOD的主要设计重点通常是完成手头的任务,因此任务解决的度量应该优先考虑。其他可能表明对话成功或不成功的对话质量属性可能会被忽略。这可能会导致人与对话系统之间的交互,让用户感到不满意或沮丧。本文探讨了有关对话系统的评价框架和对话质量属性在对话系统中的作用的文献,考察了它们是否、如何以及在何处被使用,并考察了它们与对话系统绩效的相关性。 摘要:Task-oriented dialogue systems (TODS) are continuing to rise in popularity as various industries find ways to effectively harness their capabilities, saving both time and money. However, even state-of-the-art TODS are not yet reaching their full potential. TODS typically have a primary design focus on completing the task at hand, so the metric of task-resolution should take priority. Other conversational quality attributes that may point to the success, or otherwise, of the dialogue, may be ignored. This can cause interactions between human and dialogue system that leave the user dissatisfied or frustrated. This paper explores the literature on evaluative frameworks of dialogue systems and the role of conversational quality attributes in dialogue systems, looking at if, how, and where they are utilised, and examining their correlation with the performance of the dialogue system.

【25】 AutoCTS: Automated Correlated Time Series Forecasting -- Extended Version 标题:AutoCTS:自动相关时间序列预测--扩展版 链接:https://arxiv.org/abs/2112.11174

作者:Xinle Wu,Dalin Zhang,Chenjuan Guo,Chaoyang He,Bin Yang,Christian S. Jensen 机构:Aalborg University, Denmark, University of Southern California, USA 备注:to appear in PVLDB 2022 摘要:相关时间序列(CTS)预测在许多网络物理系统中起着至关重要的作用,在这些系统中,多个传感器发出时间序列来捕获相互关联的过程。基于深度学习的解决方案提供最先进的CTS预测性能,采用各种时空(ST)块,能够对时间序列之间的时间依赖性和空间相关性进行建模。然而,仍然存在两个挑战。首先,ST块是手工设计的,这既耗时又昂贵。其次,现有的预测模型只是将相同的ST块多次叠加,这限制了模型的潜力。为了应对这些挑战,我们提出了能够自动识别高度竞争的ST块以及使用不同拓扑连接的异构ST块预测模型的AutoCT,而不是使用简单堆叠连接的相同ST块。具体而言,我们设计了一个微观和宏观搜索空间来模拟ST块的可能结构以及异构ST块之间的连接,并提供了一种搜索策略,能够联合探索搜索空间以确定最佳预测模型。在八个常用CTS预测基准数据集上进行的大量实验证明了我们的设计选择是正确的,并证明AutoCTS能够自动发现性能优于最先进的人类设计模型的预测模型。这是“自动相关时间序列预测”的扩展版本,将出现在PVLDB 2022中。 摘要:Correlated time series (CTS) forecasting plays an essential role in many cyber-physical systems, where multiple sensors emit time series that capture interconnected processes. Solutions based on deep learning that deliver state-of-the-art CTS forecasting performance employ a variety of spatio-temporal (ST) blocks that are able to model temporal dependencies and spatial correlations among time series. However, two challenges remain. First, ST-blocks are designed manually, which is time consuming and costly. Second, existing forecasting models simply stack the same ST-blocks multiple times, which limits the model potential. To address these challenges, we propose AutoCTS that is able to automatically identify highly competitive ST-blocks as well as forecasting models with heterogeneous ST-blocks connected using diverse topologies, as opposed to the same ST-blocks connected using simple stacking. Specifically, we design both a micro and a macro search space to model possible architectures of ST-blocks and the connections among heterogeneous ST-blocks, and we provide a search strategy that is able to jointly explore the search spaces to identify optimal forecasting models. Extensive experiments on eight commonly used CTS forecasting benchmark datasets justify our design choices and demonstrate that AutoCTS is capable of automatically discovering forecasting models that outperform state-of-the-art human-designed models. This is an extended version of ``AutoCTS: Automated Correlated Time Series Forecasting'', to appear in PVLDB 2022.

【26】 PONet: Robust 3D Human Pose Estimation via Learning Orientations Only 标题:PONet:仅通过学习方向实现鲁棒三维人体姿态估计 链接:https://arxiv.org/abs/2112.11153

作者:Jue Wang,Shaoli Huang,Xinchao Wang,Dacheng Tao 摘要:传统的三维人体姿态估计依赖于首先检测二维人体关键点,然后解决二维到三维的对应问题。尽管取得了有希望的结果,但这种学习模式高度依赖于2D关键点检测器的质量,2D关键点检测器不可避免地容易受到遮挡和图像缺失的影响。在本文中,我们提出了一种新的姿态定向网络(PONet),该网络仅通过学习方向就能够可靠地估计三维姿态,从而在没有图像证据的情况下绕过容易出错的关键点检测器。对于具有部分不可见肢体的图像,PONet通过利用局部图像证据来恢复3D姿势来估计这些肢体的3D方向。此外,PONet能够通过利用可见肢体之间的方向相关性来补充估计的姿势,甚至从具有完全不可见肢体的图像推断出完整的3D姿势,从而进一步提高3D姿势估计的鲁棒性。我们在多个数据集上评估我们的方法,包括Human3。6M、MPII、MPI-INF-3DHP和3DPW。我们的方法达到了与理想的设置中的最先进的技术的PAR结果,但显著地消除了对关键点检测器和相应的计算负担的依赖性。在非常具有挑战性的场景中,例如截断和擦除,我们的方法执行非常稳健,并且与最新技术相比产生了更优的结果,显示了其在实际应用中的潜力。 摘要:Conventional 3D human pose estimation relies on first detecting 2D body keypoints and then solving the 2D to 3D correspondence problem.Despite the promising results, this learning paradigm is highly dependent on the quality of the 2D keypoint detector, which is inevitably fragile to occlusions and out-of-image absences.In this paper,we propose a novel Pose Orientation Net (PONet) that is able to robustly estimate 3D pose by learning orientations only, hence bypassing the error-prone keypoint detector in the absence of image evidence. For images with partially invisible limbs, PONet estimates the 3D orientation of these limbs by taking advantage of the local image evidence to recover the 3D pose.Moreover, PONet is competent to infer full 3D poses even from images with completely invisible limbs, by exploiting the orientation correlation between visible limbs to complement the estimated poses,further improving the robustness of 3D pose estimation.We evaluate our method on multiple datasets, including Human3.6M, MPII, MPI-INF-3DHP, and 3DPW. Our method achieves results on par with state-of-the-art techniques in ideal settings, yet significantly eliminates the dependency on keypoint detectors and the corresponding computation burden. In highly challenging scenarios, such as truncation and erasing, our method performs very robustly and yields much superior results as compared to state of the art,demonstrating its potential for real-world applications.

【27】 RepMLPNet: Hierarchical Vision MLP with Re-parameterized Locality 标题:RepMLPNet:具有重新参数化局部性的分层视觉MLP 链接:https://arxiv.org/abs/2112.11081

作者:Xiaohan Ding,Honghao Chen,Xiangyu Zhang,Jungong Han,Guiguang Ding 机构: Beijing National Research Center for Information Science and Technology (BNRist);, School of Software, Tsinghua University, Beijing, China, Institute of Automation, Chinese Academy of Sciences, MEGVII Technology 备注:The code and models are available at this https URL arXiv admin note: text overlap with arXiv:2105.01883 摘要:与卷积层相比,完全连接(FC)层在建模长距离依赖方面更好,但在捕获局部模式方面更差,因此通常不太适合用于图像识别。在本文中,我们提出了一种方法,局部注入,通过将并行conv核的训练参数合并到FC核中,将局部先验合并到FC层。局部注入可以看作是一种新的结构再参数化方法,因为它通过变换参数来等价地转换结构。在此基础上,我们提出了一种多层感知器(MLP)块RepMLP块,该块使用三层FC来提取特征,并提出了一种新的结构RepMLPNet。分层设计将RepMLPNet与其他同时提出的视觉mlp区分开来。当它生成不同层次的特征图时,它可以作为下游任务(如语义分割)的主干模型。我们的结果表明:1)局部注入是MLP模型的一种通用方法;2) 与其他MLP相比,RepMLPNet具有良好的精度-效率权衡;3) RepMLPNet是第一个无缝转移到城市景观语义分割的MLP。有关代码和模型,请访问https://github.com/DingXiaoH/RepMLP. 摘要:Compared to convolutional layers, fully-connected (FC) layers are better at modeling the long-range dependencies but worse at capturing the local patterns, hence usually less favored for image recognition. In this paper, we propose a methodology, Locality Injection, to incorporate local priors into an FC layer via merging the trained parameters of a parallel conv kernel into the FC kernel. Locality Injection can be viewed as a novel Structural Re-parameterization method since it equivalently converts the structures via transforming the parameters. Based on that, we propose a multi-layer-perceptron (MLP) block named RepMLP Block, which uses three FC layers to extract features, and a novel architecture named RepMLPNet. The hierarchical design distinguishes RepMLPNet from the other concurrently proposed vision MLPs. As it produces feature maps of different levels, it qualifies as a backbone model for downstream tasks like semantic segmentation. Our results reveal that 1) Locality Injection is a general methodology for MLP models; 2) RepMLPNet has favorable accuracy-efficiency trade-off compared to the other MLPs; 3) RepMLPNet is the first MLP that seamlessly transfer to Cityscapes semantic segmentation. The code and models are available at https://github.com/DingXiaoH/RepMLP.
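
A compact sketch of the Locality Injection / structural re-parameterization idea: the FC kernel exactly equivalent to a small convolution is obtained by pushing an identity basis through the conv, and is then merged into an existing FC kernel by addition; shapes are tiny so the equivalence check runs instantly. This is a generic re-parameterization sketch under assumed shapes, not the RepMLP code itself.

```python
# Illustrative sketch: merge a conv kernel into an FC kernel (locality injection).
import torch
import torch.nn as nn

torch.manual_seed(0)
C, H, W = 2, 5, 5                          # channels and spatial size of the feature map
conv = nn.Conv2d(C, C, kernel_size=3, padding=1, bias=False)
fc = nn.Linear(C * H * W, C * H * W, bias=False)

# Pass an identity basis through the conv: the response to the i-th one-hot input
# is exactly the i-th column of the equivalent FC kernel.
eye = torch.eye(C * H * W).reshape(C * H * W, C, H, W)
with torch.no_grad():
    conv_as_fc = conv(eye).reshape(C * H * W, C * H * W).t()
    merged = fc.weight + conv_as_fc        # "inject" the conv's locality into the FC

x = torch.randn(1, C, H, W)
parallel = fc(x.flatten(1)) + conv(x).flatten(1)   # FC and conv branches in parallel (training-time)
reparam = x.flatten(1) @ merged.t()                # single merged FC (inference-time)
print("max abs difference:", (parallel - reparam).abs().max().item())
```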

【28】 Robust Recommendation with Implicit Feedback via Eliminating the Effects of Unexpected Behaviors 标题:消除意外行为影响的隐式反馈鲁棒推荐 链接:https://arxiv.org/abs/2112.11023

作者:Jie Chen,Lifen Jiang,Chunmei Ma,Huazhi Sun 机构:SUMMARY 摘要:近年来,在隐式反馈推荐中,将短期偏好引入推荐系统越来越受到人们的关注。然而,历史交互中的意外行为(如意外点击某些项目)并不能很好地反映用户的固有偏好。现有的研究未能对意外行为的影响进行建模,从而导致推荐性能较差。在本文中,我们提出了一个多偏好模型(MPM)来消除意外行为的影响。MPM首先通过细粒度偏好模块从用户最近的历史交互中提取用户的即时偏好。然后训练一个意外行为检测器来判断这些即时偏好是否受到意外行为的影响。我们还将用户的一般偏好集成到MPM中。最后,执行输出模块以消除意外行为的影响,并集成所有信息以做出最终建议。我们在一部电影和一家电子零售的两个数据集上进行了广泛的实验,证明我们的模型比最先进的方法有了显著的改进。实验结果表明,MPM在性能上有了很大的提高HR@10和NDCG@10与AttRec模型相比,平均增长3.643%和4.107%。我们在https://github.com/chenjie04/MPM/. 摘要:In the implicit feedback recommendation, incorporating short-term preference into recommender systems has attracted increasing attention in recent years. However, unexpected behaviors in historical interactions like clicking some items by accident don't well reflect users' inherent preferences. Existing studies fail to model the effects of unexpected behaviors, thus achieve inferior recommendation performance. In this paper, we propose a Multi-Preferences Model (MPM) to eliminate the effects of unexpected behaviors. MPM first extracts the users' instant preferences from their recent historical interactions by a fine-grained preference module. Then an unexpected-behaviors detector is trained to judge whether these instant preferences are biased by unexpected behaviors. We also integrate user's general preference in MPM. Finally, an output module is performed to eliminate the effects of unexpected behaviors and integrates all the information to make a final recommendation. We conduct extensive experiments on two datasets of a movie and an e-retailing, demonstrating significant improvements in our model over the state-of-the-art methods. The experimental results show that MPM gets a massive improvement in HR@10 and NDCG@10, which relatively increased by 3.643% and 4.107% compare with AttRec model on average. We publish our code at https://github.com/chenjie04/MPM/.

【29】 Continuous-Time Video Generation via Learning Motion Dynamics with Neural ODE 标题:基于神经ODE学习运动动力学的连续时间视频生成 链接:https://arxiv.org/abs/2112.10960

作者:Kangyeol Kim,Sunghyun Park,Junsoo Lee,Joonseok Lee,Sookyung Kim,Jaegul Choo,Edward Choi 机构: KAIST AI, Korea, Kakao Enterprise, Naver Webtoon, Seoul National University, Google Research, United States, Xerox PARC, Letsur Inc. 备注:24 pages; Accepted to BMVC 2021 摘要:为了执行无条件视频生成,我们必须了解真实世界视频的分布。为了合成高质量的视频,各种研究试图学习噪声和视频之间的映射函数,包括最近分离运动分布和外观分布的努力。然而,以前的方法是以离散的固定时间间隔学习运动动力学,这与物理体运动的连续性相反。在本文中,我们提出了一种新的视频生成方法,该方法学习运动和外观的独立分布,前者通过神经ODE建模来学习自然运动动力学。具体而言,我们采用两阶段方法,第一阶段将噪声向量转换为任意帧速率的关键点序列,第二阶段基于给定的关键点序列和外观噪声向量合成视频。我们的模型不仅在数量上优于最近的视频生成基线,而且还展示了多种功能,如动态帧速率操纵和两个数据集之间的运动传输,从而为各种视频生成应用打开了新的大门。 摘要:In order to perform unconditional video generation, we must learn the distribution of the real-world videos. In an effort to synthesize high-quality videos, various studies attempted to learn a mapping function between noise and videos, including recent efforts to separate motion distribution and appearance distribution. Previous methods, however, learn motion dynamics in discretized, fixed-interval timesteps, which is contrary to the continuous nature of motion of a physical body. In this paper, we propose a novel video generation approach that learns separate distributions for motion and appearance, the former modeled by neural ODE to learn natural motion dynamics. Specifically, we employ a two-stage approach where the first stage converts a noise vector to a sequence of keypoints in arbitrary frame rates, and the second stage synthesizes videos based on the given keypoints sequence and the appearance noise vector. Our model not only quantitatively outperforms recent baselines for video generation, but also demonstrates versatile functionality such as dynamic frame rate manipulation and motion transfer between two datasets, thus opening new doors to diverse video generation applications.

【30】 Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion 标题:观看这些词语:使用词语条件化面部运动的视频篡改检测 链接:https://arxiv.org/abs/2112.10936

作者:Shruti Agarwal,Liwen Hu,Evonne Ng,Trevor Darrell,Hao Li,Anna Rohrbach 机构:UC Berkeley, Pinscreen 摘要:在当今的数字误报时代,我们日益面临视频篡改技术带来的新威胁。此类伪造包括廉价假货(例如,长相或音频配音)和深度假货(例如,复杂的人工智能媒体合成方法),这些假货在感知上与真实视频无法区分。为了应对这一挑战,我们提出了一种多模态语义取证方法,以发现超出检测视觉质量差异的线索,从而处理更简单的廉价假货和具有视觉说服力的深度假货。在这项工作中,我们的目标是通过检测他们的面部动作和他们所说的话之间的异常对应来验证视频中所看到的所谓的人确实是他们自己。我们利用归因的概念来学习特定于个人的生物特征模式,从而将特定的说话人与其他人区分开来。我们使用可解释动作单位(AUs)来捕捉一个人的面部和头部运动,而不是深度CNN视觉特征,我们是第一个使用词条件面部运动分析的人。与现有的针对特定人的方法不同,我们的方法也能有效地防止针对嘴唇操纵的攻击。我们进一步证明了我们的方法对训练中未发现的一系列假货的有效性,包括未经视频处理的假货,这些假货在之前的工作中未得到解决。 摘要:In today's era of digital misinformation, we are increasingly faced with new threats posed by video falsification techniques. Such falsifications range from cheapfakes (e.g., lookalikes or audio dubbing) to deepfakes (e.g., sophisticated AI media synthesis methods), which are becoming perceptually indistinguishable from real videos. To tackle this challenge, we propose a multi-modal semantic forensic approach to discover clues that go beyond detecting discrepancies in visual quality, thereby handling both simpler cheapfakes and visually persuasive deepfakes. In this work, our goal is to verify that the purported person seen in the video is indeed themselves by detecting anomalous correspondences between their facial movements and the words they are saying. We leverage the idea of attribution to learn person-specific biometric patterns that distinguish a given speaker from others. We use interpretable Action Units (AUs) to capture a persons' face and head movement as opposed to deep CNN visual features, and we are the first to use word-conditioned facial motion analysis. Unlike existing person-specific approaches, our method is also effective against attacks that focus on lip manipulation. We further demonstrate our method's effectiveness on a range of fakes not seen in training including those without video manipulation, that were not addressed in prior work.

【31】 Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting 标题:具有输入独立动态重路由的紧凑型多级稀疏神经网络 链接:https://arxiv.org/abs/2112.10930

作者:Minghai Qin,Tianyun Zhang,Fei Sun,Yen-Kuang Chen,Makan Fardad,Yanzhi Wang,Yuan Xie 机构:Western Digital Research, Cleveland State University, Alibaba DAMO Academy, Syracuse University, Northeastern University 摘要:深度神经网络(DNN)在许多实际应用中表现出卓越的性能,但其巨大的计算成本和存储需求使其难以部署到许多边缘和物联网(IoT)设备上。稀疏深度神经网络的大部分权重参数为零,可以大幅降低模型的计算复杂度和内存消耗。在实际使用场景中,设备可用的计算和内存资源会随环境不同而大幅波动,而且由于存在延迟较大的长尾推理,服务质量(QoS)难以维持。面对这些现实挑战,我们提出训练一个支持多个稀疏级别的稀疏模型:即权重满足一种层次结构,使得较稀疏子模型中非零参数的位置和取值是较不稀疏子模型中非零参数的子集。这样,推理时可以根据可用资源动态选择合适的稀疏级别,而存储开销以最不稀疏的子模型为上限。我们在多种DNN模型和任务上验证了该方法,包括ResNet-50、PointNet、GNMT和图注意力网络。我们得到的稀疏子模型平均仅保留13.38%的权重和14.97%的FLOPs,精度却与其稠密版本相当;在仅损失3.25%相对精度的情况下,还可以得到权重为5.38%、FLOPs为4.47%的更稀疏子模型,且它们是较不稀疏子模型的子集。 摘要:Deep neural networks (DNNs) have shown to provide superb performance in many real life applications, but their large computation cost and storage requirement have prevented them from being deployed to many edge and internet-of-things (IoT) devices. Sparse deep neural networks, whose majority weight parameters are zeros, can substantially reduce the computation complexity and memory consumption of the models. In real-use scenarios, devices may suffer from large fluctuations of the available computation and memory resources under different environment, and the quality of service (QoS) is difficult to maintain due to the long tail inferences with large latency. Facing the real-life challenges, we propose to train a sparse model that supports multiple sparse levels. That is, a hierarchical structure of weights are satisfied such that the locations and the values of the non-zero parameters of the more-sparse sub-model are a subset of the less-sparse sub-model. In this way, one can dynamically select the appropriate sparsity level during inference, while the storage cost is capped by the least sparse sub-model. We have verified our methodologies on a variety of DNN models and tasks, including the ResNet-50, PointNet, GNMT, and graph attention networks. We obtain sparse sub-models with an average of 13.38% weights and 14.97% FLOPs, while the accuracies are as good as their dense counterparts. More-sparse sub-models with 5.38% weights and 4.47% of FLOPs, which are subsets of the less-sparse ones, can be obtained with only 3.25% relative accuracy loss.
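
下面用一个极简示例(非原论文实现)说明"嵌套的多级稀疏掩码"这一核心思想:按权重幅值排序后逐级取前 k 大位置,使更稀疏一级的非零位置必然是较不稀疏一级的子集,推理时即可按可用资源动态切换稀疏级别;其中的保留比例等均为假设。

```python
import numpy as np

def nested_masks(weights, keep_ratios):
    """keep_ratios 例如 [0.5, 0.25, 0.1]:返回按同一幅值排序构造、逐级嵌套的掩码。"""
    order = np.argsort(-np.abs(weights).ravel())        # 幅值从大到小的下标
    masks = []
    for r in sorted(keep_ratios, reverse=True):
        k = int(round(r * weights.size))
        m = np.zeros(weights.size, dtype=bool)
        m[order[:k]] = True                              # 保留前 k 大幅值的位置
        masks.append(m.reshape(weights.shape))
    return masks

w = np.random.randn(8, 8)
m50, m25, m10 = nested_masks(w, [0.5, 0.25, 0.1])
assert np.all(m10 <= m25) and np.all(m25 <= m50)         # 嵌套性质:稀疏级别逐级包含
print(m50.mean(), m25.mean(), m10.mean())                # 约 0.5 / 0.25 / 0.1
```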

【32】 DB-BERT: a Database Tuning Tool that "Reads the Manual" 标题:DB-BERT:一个"阅读手册"的数据库调优工具 链接:https://arxiv.org/abs/2112.10925

作者:Immanuel Trummer 机构:Cornell University, Ithaca, New York 摘要:DB-BERT是一种数据库调优工具,它利用通过手册和其他相关文本文档的自然语言分析获得的信息。它使用文本标识要优化的数据库系统参数以及建议的参数值。DB-BERT将预先训练好的大型语言模型(特别是BERT模型)应用于文本分析。在初始训练阶段,它会微调模型权重,以便将自然语言提示转换为推荐设置。在运行时,DB-BERT学习聚合、调整和优先排序提示,以实现特定数据库系统和基准的最佳性能。这两个阶段都是迭代的,使用强化学习来指导选择要评估的调优设置(惩罚数据库系统拒绝的设置,同时奖励提高性能的设置)。在我们的实验中,我们利用数百个关于数据库调优的文本文档作为DB-BERT的输入。考虑到不同的基准(TPC-C和TPC-H)、度量(吞吐量和运行时)以及数据库系统(Postgres和MySQL),我们将DB-BERT与不同的基线进行比较。在所有情况下,DB-BERT在所有比较方法中找到最佳参数设置。DB-BERT的代码可在线获取,网址为https://itrummer.github.io/dbbert/. 摘要:DB-BERT is a database tuning tool that exploits information gained via natural language analysis of manuals and other relevant text documents. It uses text to identify database system parameters to tune as well as recommended parameter values. DB-BERT applies large, pre-trained language models (specifically, the BERT model) for text analysis. During an initial training phase, it fine-tunes model weights in order to translate natural language hints into recommended settings. At run time, DB-BERT learns to aggregate, adapt, and prioritize hints to achieve optimal performance for a specific database system and benchmark. Both phases are iterative and use reinforcement learning to guide the selection of tuning settings to evaluate (penalizing settings that the database system rejects while rewarding settings that improve performance). In our experiments, we leverage hundreds of text documents about database tuning as input for DB-BERT. We compare DB-BERT against various baselines, considering different benchmarks (TPC-C and TPC-H), metrics (throughput and run time), as well as database systems (Postgres and MySQL). In all cases, DB-BERT finds the best parameter settings among all compared methods. The code of DB-BERT is available online at https://itrummer.github.io/dbbert/.
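
下面用一个与DB-BERT实现无关的极简示意说明"用强化学习指导调优设置的评估"这一思路:被数据库拒绝的设置给予惩罚,提升吞吐量的设置给予奖励;候选设置与评测结果均为假设数据。

```python
import random

candidates = {                       # 假设:由文本分析得到的提示(参数 -> 相对吞吐量)
    "shared_buffers=2GB": 1.30,
    "work_mem=64MB":      1.10,
    "work_mem=64GB":      None,      # None 表示该设置被数据库系统拒绝
}

def evaluate(setting):
    """假设的评测:返回相对吞吐量,被拒绝的设置返回 None。"""
    return candidates[setting]

values = {s: 0.0 for s in candidates}
counts = {s: 0 for s in candidates}
for step in range(200):
    s = (random.choice(list(candidates)) if random.random() < 0.2
         else max(values, key=values.get))               # ε-贪心选择调优设置
    result = evaluate(s)
    reward = -1.0 if result is None else result - 1.0    # 被拒绝惩罚 / 有提升奖励
    counts[s] += 1
    values[s] += (reward - values[s]) / counts[s]        # 增量式均值更新
print(max(values, key=values.get))   # 多数情况下收敛到 shared_buffers=2GB
```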

【33】 Load-balanced Gather-scatter Patterns for Sparse Deep Neural Networks 标题:面向稀疏深度神经网络的负载均衡聚集-分散(gather-scatter)模式 链接:https://arxiv.org/abs/2112.10898

作者:Fei Sun,Minghai Qin,Tianyun Zhang,Xiaolong Ma,Haoran Li,Junwen Luo,Zihao Zhao,Yen-Kuang Chen,Yuan Xie 机构:Alibaba DAMO Academy, Western Digital Research, Cleveland State University, Northeastern University, Fudan University 摘要:深度神经网络(DNN)已被证明能有效解决许多实际问题,但其高昂的计算成本阻碍了这些模型在边缘设备上的部署。剪枝通过在模型权重中引入零元素,已被证明能在模型精度与计算效率之间取得良好折衷,是生成压缩模型的常用方法。然而,剪枝粒度的选择涉及重要的权衡:在相同的稀疏度水平下,粗粒度的结构化稀疏模式在传统硬件上效率更高但精度较差,而细粒度的非结构化稀疏模式可以获得更好的精度,却在现有硬件上效率较低。另一方面,一些现代处理器配备了快速的片上暂存器(scratchpad)存储器和聚集/分散(gather/scatter)引擎,可对这类存储器执行间接加载和存储操作。在这项工作中,我们提出了一组新的稀疏模式,称为聚集-分散(GS)模式,利用片上暂存器存储器和聚集/分散引擎来加速神经网络推理,并相应地提出了一种紧凑的稀疏格式。所提出的这组稀疏模式配合一种新的剪枝方法,解决了负载不均衡问题,得到的模型质量接近非结构化稀疏模型,而计算效率接近结构化稀疏模型。我们的实验表明,与传统的结构化稀疏模式相比,GS模式在精度和计算效率之间始终能取得更好的权衡,并且能在相同精度水平下将DNN组件的运行时间缩短至原来的二分之一到三分之一。这一结论在三种不同的深度学习任务和流行模型上得到了验证,即用于机器翻译的GNMT、用于图像识别的ResNet50和用于声学语音识别的Jasper。 摘要:Deep neural networks (DNNs) have been proven to be effective in solving many real-life problems, but their high computation cost prohibits those models from being deployed to edge devices. Pruning, as a method to introduce zeros to model weights, has shown to be an effective method to provide good trade-offs between model accuracy and computation efficiency, and is a widely-used method to generate compressed models. However, the granularity of pruning makes important trade-offs. At the same sparsity level, a coarse-grained structured sparse pattern is more efficient on conventional hardware but results in worse accuracy, while a fine-grained unstructured sparse pattern can achieve better accuracy but is inefficient on existing hardware. On the other hand, some modern processors are equipped with fast on-chip scratchpad memories and gather/scatter engines that perform indirect load and store operations on such memories. In this work, we propose a set of novel sparse patterns, named gather-scatter (GS) patterns, to utilize the scratchpad memories and gather/scatter engines to speed up neural network inferences. Correspondingly, we present a compact sparse format. The proposed set of sparse patterns, along with a novel pruning methodology, address the load imbalance issue and result in models with quality close to unstructured sparse models and computation efficiency close to structured sparse models. Our experiments show that GS patterns consistently make better trade-offs between accuracy and computation efficiency compared to conventional structured sparse patterns. GS patterns can reduce the runtime of the DNN components by two to three times at the same accuracy levels. This is confirmed on three different deep learning tasks and popular models, namely, GNMT for machine translation, ResNet50 for image recognition, and Jasper for acoustic speech recognition.
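
下面给出一个简化示意(并非论文中的GS格式):采用"每个输出行保留相同数量非零权重"的负载均衡稀疏表示,推理时先按列索引 gather 输入、再做等长的逐行点积,大致对应聚集/分散引擎的使用方式;保留个数 k 等均为假设。

```python
import numpy as np

def to_balanced_sparse(W, k):
    """每个输出行只保留幅值最大的 k 个权重,得到等长的 (值, 列索引) 两个数组。"""
    idx = np.argsort(-np.abs(W), axis=1)[:, :k]             # 每行前 k 大的列索引
    vals = np.take_along_axis(W, idx, axis=1)
    return vals, idx                                         # 形状均为 (out, k)

def balanced_sparse_matvec(vals, idx, x):
    """y[i] = sum_j vals[i, j] * x[idx[i, j]]:先 gather 再逐行点积,各行工作量相同。"""
    return (vals * x[idx]).sum(axis=1)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 16))
x = rng.normal(size=16)
vals, idx = to_balanced_sparse(W, k=4)           # 每行保留 4/16 = 25% 非零
print(balanced_sparse_matvec(vals, idx, x))
print(W @ x)                                     # 稠密结果仅作对比(稀疏计算是近似,两者不相等)
```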

【34】 A Constraint Programming Approach to Weighted Isomorphic Mapping of Fragment-based Shape Signatures 标题:基于片段的形状签名加权同构映射的约束规划方法 链接:https://arxiv.org/abs/2112.10892

作者:Thierry Petit,Randy J. Zauhar 机构:Department of Mathematics, Physics and Statistics, University of the Sciences in Philadelphia, Department of Chemistry and Biochemistry, University of the Sciences in Philadelphia 备注:9 pages 摘要:基于片段的形状签名技术已被证明是计算机辅助药物设计的有力工具。它们使科学家能够搜索与已知活性化合物具有一定相似性的目标分子,并且不需要参考完整的底层化学结构——这对于处理包含数百万种化合物的化学数据库至关重要。然而,为片段化化合物的某一部分找到最佳匹配可能非常耗时。在本文中,我们使用约束规划来解决这一具体问题,它可归结为在连接性约束下寻找片段的加权指派。我们的实验证明了该方法的实际价值,并开辟了新的前景,包括生成多个多样化的解。我们的方法是在实时场景中对约束求解器的一种新颖用法,其中约束传播可以避免对加权路径的枚举。模型还必须对可能使某些实例难以求解的附加约束保持鲁棒性。这一特殊场景要求在选择模型时采用不寻常的标准:轻量级的标准传播算法,以及在缩减搜索空间的同时不带来过高常数开销的数据结构;目标并不是设计新的复杂算法来求解困难实例。 摘要:Fragment-based shape signature techniques have proven to be powerful tools for computer-aided drug design. They allow scientists to search for target molecules with some similarity to a known active compound. They do not require reference to the full underlying chemical structure, which is essential to deal with chemical databases containing millions of compounds. However, finding the optimal match of a part of the fragmented compound can be time-consuming. In this paper, we use constraint programming to solve this specific problem. It involves finding a weighted assignment of fragments subject to connectivity constraints. Our experiments demonstrate the practical relevance of our approach and open new perspectives, including generating multiple, diverse solutions. Our approach constitutes an original use of a constraint solver in a real time setting, where propagation allows to avoid an enumeration of weighted paths. The model must remain robust to the addition of additional constraints making some instances not tractable. This particular context requires the use of unusual criteria for the choice of the model: lightweight, standard propagation algorithms, data structures without prohibitive constant cost while reducing the search space. The objective is not to design new, complex algorithms to solve difficult instances.
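
下面用 OR-Tools CP-SAT 给出一个与论文模型无关的玩具示意:在"相邻位置的片段必须兼容"这一连接性约束下,为每个位置指派一个片段并使总权重最大;其中的片段、权重与兼容关系均为假设数据。

```python
from ortools.sat.python import cp_model

fragments = ["A", "B", "C"]
positions = [0, 1, 2]
weight = {("A", 0): 5, ("A", 1): 1, ("A", 2): 2,        # 假设的片段-位置匹配权重
          ("B", 0): 3, ("B", 1): 4, ("B", 2): 1,
          ("C", 0): 2, ("C", 1): 2, ("C", 2): 6}
compatible = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")}   # 假设的连接性关系

model = cp_model.CpModel()
x = {(f, p): model.NewBoolVar(f"x_{f}_{p}") for f in fragments for p in positions}
for p in positions:                                       # 每个位置恰好放一个片段
    model.Add(sum(x[f, p] for f in fragments) == 1)
for p in positions[:-1]:                                  # 相邻位置的片段必须兼容
    for f in fragments:
        for g in fragments:
            if (f, g) not in compatible:
                model.AddBoolOr([x[f, p].Not(), x[g, p + 1].Not()])
model.Maximize(sum(weight[f, p] * x[f, p] for f in fragments for p in positions))

solver = cp_model.CpSolver()
if solver.Solve(model) in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    print([f for p in positions for f in fragments if solver.Value(x[f, p])])  # ['A', 'B', 'C']
```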

【35】 Fast Algorithms for Poker Require Modelling it as a Sequential Bayesian Game 标题:扑克快速算法要求将其建模为序列贝叶斯博弈 链接:https://arxiv.org/abs/2112.10890

作者:Vojtěch Kovařík,David Milec,Michal Šustr,Dominik Seitz,Viliam Lisý 机构:Artificial Intelligence Center, Czech Technical University in Prague 备注:To appear at Reinforcement Learning in Games workshop at AAAI 2022 摘要:在不完全信息博弈中,许多近期成果仅针对扑克及类似扑克的博弈(如说谎者骰子)提出或评估。我们认为,序贯贝叶斯博弈是推广这些成果的一类自然博弈。特别地,该模型允许对反事实遗憾最小化算法给出优雅的表述,称为公共状态CFR(PS-CFR),并天然适合高效实现。实验上,用公共状态CFR求解一个包含10^7个状态的扑克子博弈需要3分钟和700 MB,而可比版本的原始(vanilla)CFR需要5.5小时和20 GB。此外,CFR的公共状态表述使利用特定领域假设成为可能,在扑克及其他领域中可使渐近复杂度相对原始CFR二次降低(并带来进一步的经验加速)。总体而言,这表明能够将扑克表示为序贯贝叶斯博弈,在基于CFR的方法的成功中发挥了关键作用。最后,我们将公共状态CFR扩展到一般的扩展式博弈,并指出这一扩展具有序贯贝叶斯博弈版本的部分(但并非全部)优点。 摘要:Many recent results in imperfect information games were only formulated for, or evaluated on, poker and poker-like games such as liar's dice. We argue that sequential Bayesian games constitute a natural class of games for generalizing these results. In particular, this model allows for an elegant formulation of the counterfactual regret minimization algorithm, called public-state CFR (PS-CFR), which naturally lends itself to an efficient implementation. Empirically, solving a poker subgame with 10^7 states by public-state CFR takes 3 minutes and 700 MB while a comparable version of vanilla CFR takes 5.5 hours and 20 GB. Additionally, the public-state formulation of CFR opens up the possibility for exploiting domain-specific assumptions, leading to a quadratic reduction in asymptotic complexity (and a further empirical speedup) over vanilla CFR in poker and other domains. Overall, this suggests that the ability to represent poker as a sequential Bayesian game played a key role in the success of CFR-based methods. Finally, we extend public-state CFR to general extensive-form games, arguing that this extension enjoys some - but not all - of the benefits of the version for sequential Bayesian games.
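
作为示意,下面给出CFR类算法在单个信息集上的核心构件——遗憾匹配(regret matching)的最小实现;它与论文中公共状态CFR的具体实现无关,各动作的反事实价值在此用假设的随机数代替。

```python
import numpy as np

def regret_matching(cum_regret):
    """把累计正遗憾按比例归一化为策略;若全为非正,则退化为均匀策略。"""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    return positive / total if total > 0 else np.full_like(cum_regret, 1.0 / len(cum_regret))

# 玩具例子:3 个动作,反事实价值由某个(假设的)对手模型给出
cum_regret = np.zeros(3)
avg_strategy = np.zeros(3)
rng = np.random.default_rng(0)
for t in range(1000):
    strategy = regret_matching(cum_regret)
    action_values = rng.normal([1.0, 0.2, -0.5], 0.1)     # 各动作的反事实价值(假设)
    node_value = strategy @ action_values
    cum_regret += action_values - node_value               # 累计遗憾更新
    avg_strategy += strategy
print(avg_strategy / avg_strategy.sum())                    # 平均策略集中到价值最高的动作
```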

【36】 AGPNet -- Autonomous Grading Policy Network 标题:AGPNet--自主平整作业策略网络 链接:https://arxiv.org/abs/2112.10877

作者:Chana Ross,Yakov Miron,Yuval Goldfracht,Dotan Di Castro 机构:Israel, Charney School of Marine Sciences, University of Haifa 备注:7 pages, paper submitted to IEEE International Conference on Robotics and Automation 摘要:在这项工作中,我们为推土机在布满沙堆的不平整区域上自主完成平整作业建立了启发式与学习相结合的控制策略。我们将该问题形式化为马尔可夫决策过程,设计了一个用于展示智能体与环境交互的仿真环境,并最终将我们的模拟器与真实的推土机原型进行了对比。我们综合运用强化学习、行为克隆和对比学习的方法训练出一个混合策略。经过训练的智能体AGPNet达到了人类水平的表现,并在自主平整任务上优于当前最先进的机器学习方法。此外,我们的智能体能够从随机场景泛化到未见过的真实世界问题。 摘要:In this work, we establish heuristics and learning strategies for the autonomous control of a dozer grading an uneven area studded with sand piles. We formalize the problem as a Markov Decision Process, design a simulation which demonstrates agent-environment interactions and finally compare our simulator to a real dozer prototype. We use methods from reinforcement learning, behavior cloning and contrastive learning to train a hybrid policy. Our trained agent, AGPNet, reaches human-level performance and outperforms current state-of-the-art machine learning methods for the autonomous grading task. In addition, our agent is capable of generalizing from random scenarios to unseen real world problems.
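
下面是一个与论文环境无关的极简示意,说明如何把平整问题粗略形式化为马尔可夫决策过程:状态为高度图,动作为选择一格将其沙土推散,奖励为地面不平整度的下降量;地图尺寸、沙堆数量与"推散"的物理规则均为简化假设。

```python
import numpy as np

class ToyGradingEnv:
    def __init__(self, size=8, seed=0):
        self.rng = np.random.default_rng(seed)
        self.size = size
        self.reset()

    def reset(self):
        self.height = np.zeros((self.size, self.size))
        for _ in range(5):                                  # 随机放置几个"沙堆"
            i, j = self.rng.integers(0, self.size, 2)
            self.height[i, j] += self.rng.uniform(1.0, 3.0)
        return self.height.copy()

    def unevenness(self):
        return float(np.abs(self.height - self.height.mean()).sum())

    def step(self, action):
        """action = (i, j):把该格的沙土均匀推散到整张地图(极度简化的物理规则)。"""
        before = self.unevenness()
        i, j = action
        moved = self.height[i, j]
        self.height[i, j] = 0.0
        self.height += moved / self.height.size
        reward = before - self.unevenness()                 # 不平整度下降越多,奖励越大
        done = self.unevenness() < 1e-2
        return self.height.copy(), reward, done

# 一个贪心基线策略:总是推当前最高的格子
env = ToyGradingEnv()
state = env.reset()
for _ in range(10):
    action = np.unravel_index(np.argmax(state), state.shape)
    state, reward, done = env.step(action)
    if done:
        break
print(round(env.unevenness(), 3))   # 相比初始值,不平整度明显下降
```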

【37】 Demonstration Informed Specification Search 标题:基于演示信息的规约搜索 链接:https://arxiv.org/abs/2112.10807

作者:Marcell Vazquez-Chanlatte,Ameesh Shah,Gil Lederman,Sanjit A. Seshia 机构:University of California, Berkeley 摘要:本文研究从专家演示中学习与历史相关的任务规约(例如自动机和时态逻辑)的问题。不幸的是,候选任务的数量是可数无穷的,而且事先并不知道编码所演示任务需要哪些历史特征,这使得现有的从演示中学习任务的方法不再适用。为了解决这一不足,我们提出了基于演示信息的规约搜索(DISS):这是一族算法,以对 (i) 最大熵规划器和 (ii) 从带标注示例中识别概念(如自动机)的算法的黑盒访问作为参数。DISS在两步之间交替进行:(i) 推测带标注的示例,使演示变得不那么"令人意外";(ii) 采样与当前带标注示例一致的概念。对于由确定性有限自动机描述的任务,我们给出了DISS的具体实现,它能有效结合任务的部分知识与单条专家演示来确定完整的任务规约。 摘要:This paper considers the problem of learning history dependent task specifications, e.g. automata and temporal logic, from expert demonstrations. Unfortunately, the (countably infinite) number of tasks under consideration combined with an a-priori ignorance of what historical features are needed to encode the demonstrated task makes existing approaches to learning tasks from demonstrations inapplicable. To address this deficit, we propose Demonstration Informed Specification Search (DISS): a family of algorithms parameterized by black box access to (i) a maximum entropy planner and (ii) an algorithm for identifying concepts, e.g., automata, from labeled examples. DISS works by alternating between (i) conjecturing labeled examples to make the demonstrations less surprising and (ii) sampling concepts consistent with the current labeled examples. In the context of tasks described by deterministic finite automata, we provide a concrete implementation of DISS that efficiently combines partial knowledge of the task and a single expert demonstration to identify the full task specification.

【38】 TFDPM: Attack detection for cyber-physical systems with diffusion probabilistic models 标题:TFDPM:基于扩散概率模型的网络物理系统攻击检测 链接:https://arxiv.org/abs/2112.10774

作者:Tijin Yan,Tong Zhou,Yufeng Zhan,Yuanqing Xia 机构:Key Laboratory of Intelligent Control Decision of Complex Systems, School of Automation, Beijing Institute of Technology, Beijing, P. R. China. 备注:27 pages, 11 figures 摘要:随着AIoT的发展,数据驱动的网络物理系统(CPS)攻击检测方法受到了广泛关注。然而,现有方法通常采用易于处理的分布来近似数据分布,这并不适用于复杂系统;此外,不同通道数据之间的相关性也未得到足够重视。为了解决这些问题,我们使用基于能量的生成模型(它对数据分布的函数形式限制较少),并使用图神经网络来显式建模不同通道数据之间的相关性。最后,我们提出了一个通用的CPS攻击检测框架TFDPM:在给定历史数据的情况下,它同时提取时间模式和特征模式,然后将提取的特征送入条件扩散概率模型,利用条件生成网络得到预测值,并根据预测值与观测值之间的差异来检测攻击。此外,为了实现实时检测,我们提出了一种条件噪声调度网络来加速预测过程。实验结果表明,TFDPM优于现有最先进的攻击检测方法,并且噪声调度网络将检测速度提高了三倍。 摘要:With the development of AIoT, data-driven attack detection methods for cyber-physical systems (CPSs) have attracted lots of attention. However, existing methods usually adopt tractable distributions to approximate data distributions, which are not suitable for complex systems. Besides, the correlation of the data in different channels does not attract sufficient attention. To address these issues, we use energy-based generative models, which are less restrictive on functional forms of the data distribution. In addition, graph neural networks are used to explicitly model the correlation of the data in different channels. In the end, we propose TFDPM, a general framework for attack detection tasks in CPSs. It simultaneously extracts temporal pattern and feature pattern given the historical data. Then extracted features are sent to a conditional diffusion probabilistic model. Predicted values can be obtained with the conditional generative network and attacks are detected based on the difference between predicted values and observed values. In addition, to realize real-time detection, a conditional noise scheduling network is proposed to accelerate the prediction process. Experimental results show that TFDPM outperforms existing state-of-the-art attack detection methods. The noise scheduling network increases the detection speed by three times.
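
下面用一个与TFDPM无关的最小示意说明"基于预测残差的攻击检测"这最后一步:比较预测值与观测值的差异,超过阈值即报警;这里用滑动均值充当预测器,通道数、阈值与注入的攻击均为假设。

```python
import numpy as np

def detect(observed, predicted, threshold):
    residual = np.linalg.norm(observed - predicted, axis=-1)   # 各时刻的残差范数
    return residual > threshold

rng = np.random.default_rng(0)
observed = rng.normal(0.0, 0.1, (100, 4))          # 4 个通道的传感器读数(假设)
observed[60:70, 2] += 1.5                          # 在第 3 个通道注入一段攻击
predicted = np.zeros_like(observed)                # 用前 5 步均值充当"预测"(假设)
for t in range(5, len(observed)):
    predicted[t] = observed[t - 5:t].mean(axis=0)
alarms = detect(observed, predicted, threshold=0.8)
print(np.where(alarms)[0])                          # 报警时刻集中在 60~70 附近
```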

【39】 Improving Learning-to-Defer Algorithms Through Fine-Tuning 标题:通过微调改进"学习何时移交"(learning-to-defer)算法 链接:https://arxiv.org/abs/2112.10768

作者:Naveen Raman,Michael Yee 机构:Department of Computer Science, University of Maryland, College Park, MD, MIT Lincoln Laboratory, Lexington, MA 摘要:人工智能的普及使人类与人工智能协同工作的场景越来越多,因此需要"学习何时移交"(learning-to-defer)算法来决定如何在人工智能与人类之间分配任务。我们通过引入两种微调算法,改进与特定个体配对时的learning-to-defer算法,并在合成数据集和图像数据集上检验其有效性。我们发现,微调能够捕捉简单的人类技能模式,但难以把握细微差别;我们建议未来的工作采用稳健的半监督方法来进一步改进学习。 摘要:The ubiquity of AI leads to situations where humans and AI work together, creating the need for learning-to-defer algorithms that determine how to partition tasks between AI and humans. We work to improve learning-to-defer algorithms when paired with specific individuals by incorporating two fine-tuning algorithms and testing their efficacy using both synthetic and image datasets. We find that fine-tuning can pick up on simple human skill patterns, but struggles with nuance, and we suggest future work that uses robust semi-supervised to improve learning.
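
作为示意,下面给出learning-to-defer最基本决策规则的玩具实现(并非论文算法):当模型在某样本上的置信度低于对特定个体估计的人类准确率时,就把该任务移交给人;其中的置信度与人类准确率数值均为假设。

```python
import numpy as np

def route(model_confidence, human_accuracy_estimate):
    """返回 'model' 或 'human',表示该样本由谁处理。"""
    return "model" if model_confidence >= human_accuracy_estimate else "human"

rng = np.random.default_rng(0)
confidences = rng.uniform(0.5, 1.0, 10)          # 模型对 10 个样本的置信度(假设)
human_acc = 0.85                                  # 微调后对特定个体估计的准确率(假设)
decisions = [route(c, human_acc) for c in confidences]
print(decisions)
print("移交比例:", decisions.count("human") / len(decisions))
```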

【40】 HarmoFL: Harmonizing Local and Global Drifts in Federated Learning on Heterogeneous Medical Images 标题:HarmoFL:异构医学图像联邦学习中局部和全局漂移的协调 链接:https://arxiv.org/abs/2112.10775

作者:Meirui Jiang,Zirui Wang,Qi Dou 机构: Department of Computer Science and Engineering, The Chinese University of Hong Kong, School of Biological Science and Medical Engineering, Beihang University 摘要:多个医疗机构使用联邦学习(FL)协作训练模型,已成为充分发挥数据驱动模型潜力的一个有前景的方案,但医学图像中的非独立同分布(non-iid)数据仍是现实实践中的突出挑战。不同扫描仪或协议导致的特征异质性会在局部(客户端)和全局(服务器)优化的学习过程中引入漂移,从而损害收敛性和模型性能。许多先前工作试图通过在局部或全局单独处理漂移来解决non-iid问题,但如何联合解决这两种本质上相互耦合的漂移仍不清楚。在这项工作中,我们着眼于同时处理局部和全局漂移,提出了一个新的协调框架HarmoFL。首先,我们提出对变换到频域后的图像幅值进行归一化,以模拟统一的成像设置,从而在各个本地客户端上生成协调一致的特征空间,缓解局部更新漂移。其次,基于协调后的特征,我们设计了一种客户端权重扰动,引导每个本地模型收敛到平坦最优,即局部最优解的邻域内损失一致较低。在不增加任何额外通信开销的情况下,该扰动通过聚合多个局部平坦最优,帮助全局模型朝收敛的最优解方向优化。我们对所提方法进行了理论分析,并在三个医学图像分类和分割任务上开展了大量实验,结果表明HarmoFL优于一系列最新的先进方法,并具有良好的收敛行为。 摘要:Multiple medical institutions collaboratively training a model using federated learning (FL) has become a promising solution for maximizing the potential of data-driven models, yet the non-independent and identically distributed (non-iid) data in medical images is still an outstanding challenge in real-world practice. The feature heterogeneity caused by diverse scanners or protocols introduces a drift in the learning process, in both local (client) and global (server) optimizations, which harms the convergence as well as model performance. Many previous works have attempted to address the non-iid issue by tackling the drift locally or globally, but how to jointly solve the two essentially coupled drifts is still unclear. In this work, we concentrate on handling both local and global drifts and introduce a new harmonizing framework called HarmoFL. First, we propose to mitigate the local update drift by normalizing amplitudes of images transformed into the frequency domain to mimic a unified imaging setting, in order to generate a harmonized feature space across local clients. Second, based on harmonized features, we design a client weight perturbation guiding each local model to reach a flat optimum, where a neighborhood area of the local optimal solution has a uniformly low loss. Without any extra communication cost, the perturbation assists the global model to optimize towards a converged optimal solution by aggregating several local flat optima. We have theoretically analyzed the proposed method and empirically conducted extensive experiments on three medical image classification and segmentation tasks, showing that HarmoFL outperforms a set of recent state-of-the-art methods with promising convergence behavior.
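
下面给出一个与HarmoFL官方实现无关的简化示意:把图像变换到频域后,用统一的目标幅值谱替换其自身幅值谱、保留相位,再逆变换回图像域,以模拟"统一的成像设置";目标幅值谱的取法(这里简单取两张图幅值谱的平均)为假设。

```python
import numpy as np

def harmonize_amplitude(img, target_amp):
    """用目标幅值谱替换图像自身的幅值谱,保留自身相位谱,再逆变换回图像域。"""
    spec = np.fft.fft2(img)
    phase = np.angle(spec)
    harmonized = target_amp * np.exp(1j * phase)
    return np.real(np.fft.ifft2(harmonized))

rng = np.random.default_rng(0)
client_a = rng.normal(0.5, 0.20, (64, 64))          # 两台"成像设置不同"的扫描仪图像(假设)
client_b = rng.normal(0.5, 0.05, (64, 64))
target_amp = (np.abs(np.fft.fft2(client_a)) + np.abs(np.fft.fft2(client_b))) / 2
a_h = harmonize_amplitude(client_a, target_amp)
b_h = harmonize_amplitude(client_b, target_amp)
print(round(client_a.std(), 3), round(client_b.std(), 3))   # 原始图像的对比度差异明显
print(round(a_h.std(), 3), round(b_h.std(), 3))              # 幅值归一化后二者更加接近
```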

机器翻译,仅供参考
