Artificial Intelligence Academic Digest [7.12]

2021-07-27 10:47:40

Visit www.arxivdaily.com for digests with abstracts, covering CS, Physics, Mathematics, Economics, Statistics, Finance, Biology, and Electrical Engineering, with search, bookmarking, and posting features!

cs.AI (Artificial Intelligence): 33 papers in total

【1】 Learning Interaction-aware Guidance Policies for Motion Planning in Dense Traffic Scenarios

Authors: Bruno Brito, Achin Agarwal, Javier Alonso-Mora
Affiliations: Delft University of Technology. This work was supported by the Amsterdam Institute for Advanced Metropolitan Solutions and the Netherlands Organisation for Scientific Research (NWO) domain Applied Sciences (Veni 15916).
Link: https://arxiv.org/abs/2107.04538
Abstract: Autonomous navigation in dense traffic scenarios remains challenging for autonomous vehicles (AVs) because the intentions of other drivers are not directly observable and AVs have to deal with a wide range of driving behaviors. To maneuver through dense traffic, AVs must be able to reason how their actions affect others (interaction model) and exploit this reasoning to navigate through dense traffic safely. This paper presents a novel framework for interaction-aware motion planning in dense traffic scenarios. We explore the connection between human driving behavior and their velocity changes when interacting. Hence, we propose to learn, via deep Reinforcement Learning (RL), an interaction-aware policy providing global guidance about the cooperativeness of other vehicles to an optimization-based planner ensuring safety and kinematic feasibility through constraint satisfaction. The learned policy can reason and guide the local optimization-based planner with interactive behavior to pro-actively merge in dense traffic while remaining safe in case the other vehicles do not yield. We present qualitative and quantitative results in highly interactive simulation environments (highway merging and unprotected left turns) against two baseline approaches, a learning-based and an optimization-based method. The presented results demonstrate that our method significantly reduces the number of collisions and increases the success rate with respect to both learning-based and optimization-based baselines.

【2】 Batch Inverse-Variance Weighting: Deep Heteroscedastic Regression

Authors: Vincent Mai, Waleed Khamies, Liam Paull
Affiliations: Robotics and Embodied AI Lab, Mila - Quebec Institute of Artificial Intelligence, Université de Montréal, Canada; Canada CIFAR AI Chair
Note: Accepted at the Uncertainty in Deep Learning (UDL) workshop at ICML 2021
Link: https://arxiv.org/abs/2107.04497
Abstract: Heteroscedastic regression is the task of supervised learning where each label is subject to noise from a different distribution. This noise can be caused by the labelling process, and impacts negatively the performance of the learning algorithm as it violates the i.i.d. assumptions. In many situations however, the labelling process is able to estimate the variance of such distribution for each label, which can be used as an additional information to mitigate this impact. We adapt an inverse-variance weighted mean square error, based on the Gauss-Markov theorem, for parameter optimization on neural networks. We introduce Batch Inverse-Variance, a loss function which is robust to near-ground truth samples, and allows to control the effective learning rate. Our experimental results show that BIV improves significantly the performance of the networks on two noisy datasets, compared to L2 loss, inverse-variance weighting, as well as a filtering-based baseline.
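Where per-label noise variances are available, the Gauss-Markov-motivated weighting described above amounts to scaling each squared error by the inverse of its label's noise variance. A minimal PyTorch sketch of that idea, assuming the variances `sigma2` are given; the stabilizing constant `eps` is our own choice, and the paper's full BIV loss adds further mechanisms for near-ground-truth samples that are not reproduced here:

```python
import torch

def inverse_variance_weighted_mse(pred, target, sigma2, eps=1e-3):
    """Weight each sample's squared error by 1/(sigma^2 + eps).

    `eps` bounds the weight of near-ground-truth samples (sigma^2 ~ 0),
    which would otherwise dominate the batch; normalizing by the sum of
    weights keeps the effective learning rate stable across batches.
    """
    weights = 1.0 / (sigma2 + eps)
    return (weights * (pred - target) ** 2).sum() / weights.sum()

# usage: loss = inverse_variance_weighted_mse(model(x), y, label_noise_var)
```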

【3】 Offline reinforcement learning with uncertainty for treatment strategies in sepsis

Authors: Ran Liu, Joseph L. Greenstein, James C. Fackler, Jules Bergmann, Melania M. Bembea, Raimond L. Winslow
Affiliations: Institute for Computational Medicine, The Johns Hopkins University; Department of Biomedical Engineering, The Johns Hopkins University School of Medicine & Whiting School of Engineering
Note: 25 pages, 8 figures
Link: https://arxiv.org/abs/2107.04491
Abstract: Guideline-based treatment for sepsis and septic shock is difficult because sepsis is a disparate range of life-threatening organ dysfunctions whose pathophysiology is not fully understood. Early intervention in sepsis is crucial for patient outcome, yet those interventions have adverse effects and are frequently overadministered. Greater personalization is necessary, as no single action is suitable for all patients. We present a novel application of reinforcement learning in which we identify optimal recommendations for sepsis treatment from data, estimate their confidence level, and identify treatment options infrequently observed in training data. Rather than a single recommendation, our method can present several treatment options. We examine learned policies and discover that reinforcement learning is biased against aggressive intervention due to the confounding relationship between mortality and level of treatment received. We mitigate this bias using subspace learning, and develop methodology that can yield more accurate learning policies across healthcare applications.

【4】 ARC: Adversarially Robust Control Policies for Autonomous Vehicles 标题:ARC:自主车辆的对抗性鲁棒控制策略

作者:Sampo Kuutti,Saber Fallah,Richard Bowden 备注:Accepted in IEEE Intelligent Transportation Systems Conference (ITSC) 2021 链接:https://arxiv.org/abs/2107.04487 摘要:深度神经网络已经证明了其学习各种任务控制策略的能力。然而,这些基于神经网络的策略已被证明容易被敌对代理利用。因此,有必要开发技术来学习对对手具有鲁棒性的控制策略。我们引入了对手鲁棒控制(ARC),它在相同的损失下,端到端地训练主角策略和对手策略。主角的目标是最大限度地减少损失,而对手则试图将损失降到最低。我们在一个高速公路驾驶场景中演示了建议的ARC训练,其中主角控制跟随者车辆,而对手控制领头车辆。通过训练主人公对抗一组对手,它学习了一种更为鲁棒的控制策略,该策略可推广到多种对抗策略。结果表明,与原策略相比,该方法减少了90.25%的碰撞次数。此外,通过利用辅助蒸馏损失,我们证明了微调控制策略在其原始训练分布上的性能没有下降。 摘要:Deep neural networks have demonstrated their capability to learn control policies for a variety of tasks. However, these neural network-based policies have been shown to be susceptible to exploitation by adversarial agents. Therefore, there is a need to develop techniques to learn control policies that are robust against adversaries. We introduce Adversarially Robust Control (ARC), which trains the protagonist policy and the adversarial policy end-to-end on the same loss. The aim of the protagonist is to maximise this loss, whilst the adversary is attempting to minimise it. We demonstrate the proposed ARC training in a highway driving scenario, where the protagonist controls the follower vehicle whilst the adversary controls the lead vehicle. By training the protagonist against an ensemble of adversaries, it learns a significantly more robust control policy, which generalises to a variety of adversarial strategies. The approach is shown to reduce the amount of collisions against new adversaries by up to 90.25%, compared to the original policy. Moreover, by utilising an auxiliary distillation loss, we show that the fine-tuned control policy shows no drop in performance across its original training distribution.
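The core of ARC, as the abstract describes it, is a single shared scalar loss on which the two policies take gradient steps in opposite directions. A schematic PyTorch training step under that reading; `shared_loss_fn` and the two optimizers are hypothetical stand-ins, and the paper's actual architectures, objective, and ensemble-of-adversaries training are not reproduced here:

```python
import torch

def arc_step(protagonist_opt, adversary_opt, shared_loss_fn, batch):
    """One ARC update: both policies are trained end-to-end on one scalar loss.

    Per the abstract, the protagonist performs gradient *ascent* on the
    shared loss while the adversary performs gradient *descent* on it.
    """
    loss = shared_loss_fn(batch)
    protagonist_opt.zero_grad()
    (-loss).backward()            # ascent: maximise the shared loss
    protagonist_opt.step()

    loss = shared_loss_fn(batch)  # fresh forward pass after the update
    adversary_opt.zero_grad()
    loss.backward()               # descent: minimise the same loss
    adversary_opt.step()
    return float(loss.detach())
```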

【5】 Adversarial Mixture Density Networks: Learning to Drive Safely from Collision Data

Authors: Sampo Kuutti, Saber Fallah, Richard Bowden
Note: Accepted in IEEE Intelligent Transportation Systems Conference (ITSC) 2021
Link: https://arxiv.org/abs/2107.04485
Abstract: Imitation learning has been widely used to learn control policies for autonomous driving based on pre-recorded data. However, imitation learning based policies have been shown to be susceptible to compounding errors when encountering states outside of the training distribution. Further, these agents have been demonstrated to be easily exploitable by adversarial road users aiming to create collisions. To overcome these shortcomings, we introduce Adversarial Mixture Density Networks (AMDN), which learns two distributions from separate datasets. The first is a distribution of safe actions learned from a dataset of naturalistic human driving. The second is a distribution representing unsafe actions likely to lead to collision, learned from a dataset of collisions. During training, we leverage these two distributions to provide an additional loss based on the similarity of the two distributions. By penalising the safe action distribution based on its similarity to the unsafe action distribution when training on the collision dataset, a more robust and safe control policy is obtained. We demonstrate the proposed AMDN approach in a vehicle following use-case, and evaluate under naturalistic and adversarial testing environments. We show that despite its simplicity, AMDN provides significant benefits for the safety of the learned control policy, when compared to pure imitation learning or standard mixture density network approaches.
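One way to read the AMDN loss is as a similarity penalty between a safe and an unsafe action distribution, applied when training on the collision dataset. A toy PyTorch sketch with single diagonal-Gaussian heads standing in for the mixture components; the use of KL divergence as the similarity measure and the weighting `beta` are our assumptions, not the paper's exact formulation:

```python
import torch
from torch.distributions import Normal, kl_divergence

def amdn_loss(safe_head, unsafe_head, state, action, on_collision_data, beta=0.1):
    """Two action distributions from separate datasets, coupled by similarity.

    safe_head / unsafe_head map a state to (mu, log_std) of an action
    distribution; on collision data, the safe distribution is additionally
    penalised for being similar to the unsafe one.
    """
    mu_s, log_std_s = safe_head(state)
    mu_u, log_std_u = unsafe_head(state)
    p_safe = Normal(mu_s, log_std_s.exp())
    p_unsafe = Normal(mu_u, log_std_u.exp())

    if on_collision_data:
        nll = -p_unsafe.log_prob(action).mean()               # fit unsafe actions
        similarity = -kl_divergence(p_safe, p_unsafe).mean()  # large when close
        return nll + beta * similarity                        # push safe away
    return -p_safe.log_prob(action).mean()                    # fit safe actions
```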

【6】 Aligning an optical interferometer with beam divergence control and continuous action space

Authors: Stepan Makarenko, Dmitry Sorokin, Alexander Ulanov, A. I. Lvovsky
Affiliations: Russian Quantum Center, Moscow, Russia; Moscow Institute of Physics and Technology, Russia; University of Oxford, United Kingdom
Note: 12 pages, 5 figures
Link: https://arxiv.org/abs/2107.04457
Abstract: Reinforcement learning is finding its way to real-world problem application, transferring from simulated environments to physical setups. In this work, we implement vision-based alignment of an optical Mach-Zehnder interferometer with a confocal telescope in one arm, which controls the diameter and divergence of the corresponding beam. We use a continuous action space; exponential scaling enables us to handle actions within a range of over two orders of magnitude. Our agent trains only in a simulated environment with domain randomizations. In an experimental evaluation, the agent significantly outperforms an existing solution and a human expert.

【7】 Multimodal Icon Annotation For Mobile Applications

Authors: Xiaoxue Zang, Ying Xu, Jindong Chen
Affiliations: Google Research, Mountain View, CA, United States
Note: 11 pages, MobileHCI 2021
Link: https://arxiv.org/abs/2107.04452
Abstract: Annotating user interfaces (UIs) that involves localization and classification of meaningful UI elements on a screen is a critical step for many mobile applications such as screen readers and voice control of devices. Annotating object icons, such as menu, search, and arrow backward, is especially challenging due to the lack of explicit labels on screens, their similarity to pictures, and their diverse shapes. Existing studies either use view hierarchy or pixel based methods to tackle the task. Pixel based approaches are more popular as view hierarchy features on mobile platforms are often incomplete or inaccurate, however it leaves out instructional information in the view hierarchy such as resource-ids or content descriptions. We propose a novel deep learning based multi-modal approach that combines the benefits of both pixel and view hierarchy features as well as leverages the state-of-the-art object detection techniques. In order to demonstrate the utility provided, we create a high quality UI dataset by manually annotating the most commonly used 29 icons in Rico, a large scale mobile design dataset consisting of 72k UI screenshots. The experimental results indicate the effectiveness of our multi-modal approach. Our model not only outperforms a widely used object classification baseline but also pixel based object detection models. Our study sheds light on how to combine view hierarchy with pixel features for annotating UI elements.

【8】 A Comparison of Contextual and Non-Contextual Preference Ranking for Set Addition Problems

Authors: Timo Bertram, Johannes Fürnkranz, Martin Müller
Affiliations: Johannes Kepler Universität; University of Alberta
Link: https://arxiv.org/abs/2107.04438
Abstract: In this paper, we study the problem of evaluating the addition of elements to a set. This problem is difficult, because it can, in the general case, not be reduced to unconditional preferences between the choices. Therefore, we model preferences based on the context of the decision. We discuss and compare two different Siamese network architectures for this task: a twin network that compares the two sets resulting after the addition, and a triplet network that models the contribution of each candidate to the existing set. We evaluate the two settings on a real-world task; learning human card preferences for deck building in the collectible card game Magic: The Gathering. We show that the triplet approach achieves a better result than the twin network and that both outperform previous results on this task.
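The two architectures compared can be sketched with a shared permutation-invariant set encoder: the twin network scores the two candidate-augmented sets against each other, while the triplet network embeds the existing set once and scores each candidate's contribution relative to it. A minimal PyTorch sketch under our own layer sizes; the paper's actual encoders and training objective for the Magic: The Gathering task are not reproduced:

```python
import torch
import torch.nn as nn

class SetEncoder(nn.Module):
    """Permutation-invariant encoder: embed items, sum-pool over the set."""
    def __init__(self, n_items, dim=64):
        super().__init__()
        self.embed = nn.Embedding(n_items, dim)
    def forward(self, item_ids):                       # (batch, set_size)
        return self.embed(item_ids).sum(dim=1)         # (batch, dim)

class TwinScorer(nn.Module):
    """Compare the two sets that result after adding candidate a vs b."""
    def __init__(self, encoder, dim=64):
        super().__init__()
        self.encoder, self.head = encoder, nn.Linear(2 * dim, 1)
    def forward(self, set_plus_a, set_plus_b):
        za, zb = self.encoder(set_plus_a), self.encoder(set_plus_b)
        return self.head(torch.cat([za, zb], dim=-1))  # >0 means prefer a

class TripletScorer(nn.Module):
    """Score one candidate's contribution to the existing set (the context)."""
    def __init__(self, encoder, dim=64):
        super().__init__()
        self.encoder, self.head = encoder, nn.Linear(2 * dim, 1)
    def forward(self, existing_set, candidate):        # candidate: (batch, 1)
        z_set, z_cand = self.encoder(existing_set), self.encoder(candidate)
        return self.head(torch.cat([z_set, z_cand], dim=-1))
```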

【9】 How to choose an Explainability Method? Towards a Methodical Implementation of XAI in Practice

Authors: Tom Vermeire, Thibault Laugel, Xavier Renard, David Martens, Marcin Detyniecki
Affiliations: University of Antwerp, Prinsstraat, Antwerp, Belgium; AXA, Paris, France; Sorbonne Université, CNRS, LIP, Paris, France; Polish Academy of Science, IBS PAN, Warsaw, Poland
Link: https://arxiv.org/abs/2107.04427
Abstract: Explainability is becoming an important requirement for organizations that make use of automated decision-making due to regulatory initiatives and a shift in public awareness. Various and significantly different algorithmic methods to provide this explainability have been introduced in the field, but the existing literature in the machine learning community has paid little attention to the stakeholder whose needs are rather studied in the human-computer interface community. Therefore, organizations that want or need to provide this explainability are confronted with the selection of an appropriate method for their use case. In this paper, we argue there is a need for a methodology to bridge the gap between stakeholder needs and explanation methods. We present our ongoing work on creating this methodology to help data scientists in the process of providing explainability to stakeholders. In particular, our contributions include documents used to characterize XAI methods and user requirements (shown in Appendix), which our methodology builds upon.

【10】 An Orchestration Platform that Puts Radiologists in the Driver's Seat of AI Innovation: A Methodological Approach

Authors: Raphael Y. Cohen, Aaron D. Sodickson
Affiliations: Division of Emergency Radiology, Department of Radiology, Brigham and Women's Hospital, Boston, MA; Harvard Medical School
Link: https://arxiv.org/abs/2107.04409
Abstract: Current AI-driven research in radiology requires resources and expertise that are often inaccessible to small and resource-limited labs. The clinicians who are able to participate in AI research are frequently well-funded, well-staffed, and either have significant experience with AI and computing, or have access to colleagues or facilities that do. Current imaging data is clinician-oriented and is not easily amenable to machine learning initiatives, resulting in inefficient, time consuming, and costly efforts that rely upon a crew of data engineers and machine learning scientists, and all too often preclude radiologists from driving AI research and innovation. We present the system and methodology we have developed to address infrastructure and platform needs, while reducing the staffing and resource barriers to entry. We emphasize a data-first and modular approach that streamlines the AI development and deployment process while providing efficient and familiar interfaces for radiologists, such that they can be the drivers of new AI innovations.

【11】 Hoechst Is All You Need: Lymphocyte Classification with Deep Learning

Authors: Jessica Cooper, In Hwa Um, Ognjen Arandjelović, David J Harrison
Affiliations: University of St Andrews
Note: 15 pages, 4 figures
Link: https://arxiv.org/abs/2107.04388
Abstract: Multiplex immunofluorescence and immunohistochemistry benefit patients by allowing cancer pathologists to identify several proteins expressed on the surface of cells, enabling cell classification, better understanding of the tumour micro-environment, more accurate diagnoses, prognoses, and tailored immunotherapy based on the immune status of individual patients. However, they are expensive and time consuming processes which require complex staining and imaging techniques by expert technicians. Hoechst staining is much cheaper and easier to perform, but is not typically used in this case as it binds to DNA rather than to the proteins targeted by immunofluorescent techniques, and it was not previously thought possible to differentiate cells expressing these proteins based only on DNA morphology. In this work we show otherwise, training a deep convolutional neural network to identify cells expressing three proteins (T lymphocyte markers CD3 and CD8, and the B lymphocyte marker CD20) with greater than 90% precision and recall, from Hoechst 33342 stained tissue only. Our model learns previously unknown morphological features associated with expression of these proteins which can be used to accurately differentiate lymphocyte subtypes for use in key prognostic metrics such as assessment of immune cell infiltration, and thereby predict and improve patient outcomes without the need for costly multiplex immunofluorescence.

【12】 Joint Matrix Decomposition for Deep Convolutional Neural Networks Compression

Authors: Shaowu Chen, Jihao Zhou, Weize Sun, Lei Huang
Affiliations: Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen University
Note: Code is publicly available on GitHub: this https URL
Link: https://arxiv.org/abs/2107.04386
Abstract: Deep convolutional neural networks (CNNs) with a large number of parameters requires huge computational resources, which has limited the application of CNNs on resources constrained appliances. Decomposition-based methods, therefore, have been utilized to compress CNNs in recent years. However, since the compression factor and performance are negatively correlated, the state-of-the-art works either suffer from severe performance degradation or have limited low compression factors. To overcome these problems, unlike previous works compressing layers separately, we propose to compress CNNs and alleviate performance degradation via joint matrix decomposition. The idea is inspired by the fact that there are lots of repeated modules in CNNs, and by projecting weights with the same structures into the same subspace, networks can be further compressed and even accelerated. In particular, three joint matrix decomposition schemes are developed, and the corresponding optimization approaches based on Singular Values Decomposition are proposed. Extensive experiments are conducted across three challenging compact CNNs and 3 benchmark data sets to demonstrate the superior performance of our proposed algorithms. As a result, our methods can compress the size of ResNet-34 by 22x with slighter accuracy degradation compared with several state-of-the-art methods.
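The idea of projecting repeated modules' weights into a shared subspace can be illustrated with a joint truncated SVD: concatenating the weight matrices of same-shaped layers and factorizing them together yields one shared basis plus a small per-layer factor. A numpy sketch of this reading only; the paper develops three schemes and their SVD-based optimization, none of which is reproduced here:

```python
import numpy as np

def joint_svd_compress(weights, rank):
    """Jointly factorize same-shaped weight matrices W_i ~ U @ V_i.

    Stacking [W_1 | W_2 | ...] column-wise and truncating the SVD gives a
    shared left basis U (stored once) and a cheap factor V_i per layer,
    instead of a separate rank-r factorization for every layer.
    """
    stacked = np.concatenate(weights, axis=1)           # (m, k*n)
    U, s, Vt = np.linalg.svd(stacked, full_matrices=False)
    U_r = U[:, :rank]                                   # shared basis (m, r)
    SVt = s[:rank, None] * Vt[:rank]                    # (r, k*n)
    n = weights[0].shape[1]
    factors = [SVt[:, i * n:(i + 1) * n] for i in range(len(weights))]
    return U_r, factors                                 # W_i ~ U_r @ factors[i]

# usage: U, Vs = joint_svd_compress([W1, W2, W3], rank=16)
```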

【13】 Rail Topology Ontology: A Rail Infrastructure Base Ontology

Authors: Stefan Bischof, Gottfried Schenner
Affiliations: Siemens AG Österreich, Vienna, Austria
Note: Accepted at the International Semantic Web Conference '21 (ISWC 2021)
Link: https://arxiv.org/abs/2107.04378
Abstract: Engineering projects for railway infrastructure typically involve many subsystems which need consistent views of the planned and built infrastructure and its underlying topology. Consistency is typically ensured by exchanging and verifying data between tools using XML-based data formats and UML-based object-oriented models. A tighter alignment of these data representations via a common topology model could decrease the development effort of railway infrastructure engineering tools. A common semantic model is also a prerequisite for the successful adoption of railway knowledge graphs. Based on the RailTopoModel standard, we developed the Rail Topology Ontology as a model to represent core features of railway infrastructures in a standard-compliant manner. This paper describes the ontology and its development method, and discusses its suitability for integrating data of railway engineering systems and other sources in a knowledge graph. With the Rail Topology Ontology, software engineers and knowledge scientists have a standard-based ontology for representing railway topologies to integrate disconnected data sources. We use the Rail Topology Ontology for our rail knowledge graph and plan to extend it by rail infrastructure ontologies derived from existing data exchange standards, since many such standards use the same base model as the presented ontology, viz., RailTopoModel.

【14】 RGB Stream Is Enough for Temporal Action Detection

Authors: Chenhao Wang, Hongxiang Cai, Yuxin Zou, Yichao Xiong
Affiliations: Media Intelligence Technology Co., Ltd
Link: https://arxiv.org/abs/2107.04362
Abstract: State-of-the-art temporal action detectors to date are based on two-stream input including RGB frames and optical flow. Although combining RGB frames and optical flow boosts performance significantly, optical flow is a hand-designed representation which not only requires heavy computation, but also makes it methodologically unsatisfactory that two-stream methods are often not learned end-to-end jointly with the flow. In this paper, we argue that optical flow is dispensable in high-accuracy temporal action detection and image level data augmentation (ILDA) is the key solution to avoid performance degradation when optical flow is removed. To evaluate the effectiveness of ILDA, we design a simple yet efficient one-stage temporal action detector based on single RGB stream named DaoTAD. Our results show that when trained with ILDA, DaoTAD has comparable accuracy with all existing state-of-the-art two-stream detectors while surpassing the inference speed of previous methods by a large margin and the inference speed is astounding 6668 fps on GeForce GTX 1080 Ti. Code is available at https://github.com/Media-Smart/vedatad.

【15】 An ontology for the formalization and visualization of scientific knowledge

Authors: Vincenzo Daponte, Gilles Falquet
Affiliations: Centre Universitaire d'informatique, Université de Genève, Switzerland
Link: https://arxiv.org/abs/2107.04347
Abstract: The construction of an ontology of scientific knowledge objects, presented here, is part of the development of an approach oriented towards the visualization of scientific knowledge. It is motivated by the fact that the concepts of organization of scientific knowledge (theorem, law, experience, proof, etc.) appear in existing ontologies but that none of them is centered on this topic and presents a simple and easily usable organization. We present the first version built from ontological sources (ontologies of knowledge objects of certain fields, lexical and higher level ones), specialized knowledge bases and interviews with scientists. We have aligned this ontology with some of the sources used, which has allowed us to verify its consistency with respect to them. The validation of the ontology consists in using it to formalize knowledge from various sources, which we have begun to do in the field of physics.

【16】 Attend2Pack: Bin Packing through Deep Reinforcement Learning with Attention

Authors: Jingwei Zhang, Bin Zi, Xiaoyu Ge
Affiliations: China; Australian National University
Note: Reinforcement Learning for Real Life (RL4RealLife) Workshop in the 38th International Conference on Machine Learning, 2021
Link: https://arxiv.org/abs/2107.04333
Abstract: This paper seeks to tackle the bin packing problem (BPP) through a learning perspective. Building on self-attention-based encoding and deep reinforcement learning algorithms, we propose a new end-to-end learning model for this task of interest. By decomposing the combinatorial action space, as well as utilizing a new training technique denoted as prioritized oversampling, which is a general scheme to speed up on-policy learning, we achieve state-of-the-art performance in a range of experimental settings. Moreover, although the proposed approach attend2pack targets offline-BPP, we strip our method down to the strict online-BPP setting where it is also able to achieve state-of-the-art performance. With a set of ablation studies as well as comparisons against a range of previous works, we hope to offer as a valid baseline approach to this field of study.

【17】 Semantic Segmentation on Multiple Visual Domains

Authors: Floris Naber
Affiliations: Department of Electrical Engineering, Eindhoven University of Technology
Note: Graduation project report
Link: https://arxiv.org/abs/2107.04326
Abstract: Semantic segmentation models only perform well on the domain they are trained on and datasets for training are scarce and often have a small label-spaces, because the pixel level annotations required are expensive to make. Thus training models on multiple existing domains is desired to increase the output label-space. Current research shows that there is potential to improve accuracy across datasets by using multi-domain training, but this has not yet been successfully extended to datasets of three different non-overlapping domains without manual labelling. In this paper a method for this is proposed for the datasets Cityscapes, SUIM and SUN RGB-D, by creating a label-space that spans all classes of the datasets. Duplicate classes are merged and discrepant granularity is solved by keeping classes separate. Results show that accuracy of the multi-domain model has higher accuracy than all baseline models together, if hardware performance is equalized, as resources are not limitless, showing that models benefit from additional data even from domains that have nothing in common.
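The label-space construction described above can be pictured as a per-dataset mapping into one unified class list: duplicate classes map onto one shared id, and classes of discrepant granularity stay separate. A toy Python sketch with made-up class names; the actual merged taxonomy for Cityscapes, SUIM and SUN RGB-D is the paper's and is not reproduced here:

```python
# Toy unified label space spanning three datasets: duplicate classes are
# merged onto one shared id; classes with discrepant granularity are kept
# separate rather than forced together. Class names here are illustrative.
UNIFIED = ["road", "person", "vegetation", "fish", "bed", "furniture"]
UID = {name: i for i, name in enumerate(UNIFIED)}

LOCAL_TO_UNIFIED = {
    "cityscapes": {"road": UID["road"], "person": UID["person"],
                   "vegetation": UID["vegetation"]},
    "suim": {"fish": UID["fish"],
             "human_diver": UID["person"]},          # duplicate class merged
    "sun_rgbd": {"bed": UID["bed"],
                 "furniture": UID["furniture"]},     # finer class kept separate
}

def to_unified(dataset: str, local_class: str) -> int:
    """Map a dataset-local class name to its id in the unified label space."""
    return LOCAL_TO_UNIFIED[dataset][local_class]
```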

【18】 Understanding surrogate explanations: the interplay between complexity, fidelity and coverage

Authors: Rafael Poyiadzi, Xavier Renard, Thibault Laugel, Raul Santos-Rodriguez, Marcin Detyniecki
Affiliations: University of Bristol, Bristol, United Kingdom; AXA, Paris, France; Sorbonne Université, CNRS, LIP, Paris, France; Polish Academy of Science, IBS PAN, Warsaw, Poland
Note: 12 pages, 8 figures
Link: https://arxiv.org/abs/2107.04309
Abstract: This paper analyses the fundamental ingredients behind surrogate explanations to provide a better understanding of their inner workings. We start our exposition by considering global surrogates, describing the trade-off between complexity of the surrogate and fidelity to the black-box being modelled. We show that transitioning from global to local - reducing coverage - allows for more favourable conditions on the Pareto frontier of fidelity-complexity of a surrogate. We discuss the interplay between complexity, fidelity and coverage, and consider how different user needs can lead to problem formulations where these are either constraints or penalties. We also present experiments that demonstrate how the local surrogate interpretability procedure can be made interactive and lead to better explanations.

【19】 Integrating Planning, Execution and Monitoring in the presence of Open World Novelties: Case Study of an Open World Monopoly Solver

Authors: Sriram Gopalakrishnan, Utkarsh Soni, Tung Thai, Panagiotis Lymperopoulos, Matthias Scheutz, Subbarao Kambhampati
Affiliations: Arizona State University; Tufts University
Link: https://arxiv.org/abs/2107.04303
Abstract: The game of monopoly is an adversarial multi-agent domain where there is no fixed goal other than to be the last player solvent. There are useful subgoals like monopolizing sets of properties, and developing them. There is also a lot of randomness from dice rolls, card-draws, and adversaries' strategies. This unpredictability is made worse when unknown novelties are added during gameplay. Given these challenges, Monopoly was one of the test beds chosen for the DARPA-SAILON program which aims to create agents that can detect and accommodate novelties. To handle the game complexities, we developed an agent that eschews complete plans, and adapts its policy online as the game evolves. In the most recent independent evaluation in the SAILON program, our agent was the best performing agent on most measures. We herein present our approach and results.

【20】 WinoCNN: Kernel Sharing Winograd Systolic Array for Efficient Convolutional Neural Network Acceleration on FPGAs

Authors: Xinheng Liu, Yao Chen, Cong Hao, Ashutosh Dhar, Deming Chen
Affiliations: University of Illinois at Urbana-Champaign, IL, USA; Advanced Digital Sciences Center, Singapore; Georgia Institute of Technology, GA, USA
Note: Published in the proceedings of ASAP 2021
Link: https://arxiv.org/abs/2107.04244
Abstract: The combination of Winograd's algorithm and systolic array architecture has demonstrated the capability of improving DSP efficiency in accelerating convolutional neural networks (CNNs) on FPGA platforms. However, handling arbitrary convolution kernel sizes in FPGA-based Winograd processing elements and supporting efficient data access remain underexplored. In this work, we are the first to propose an optimized Winograd processing element (WinoPE), which can naturally support multiple convolution kernel sizes with the same amount of computing resources and maintains high runtime DSP efficiency. Using the proposed WinoPE, we construct a highly efficient systolic array accelerator, termed WinoCNN. We also propose a dedicated memory subsystem to optimize the data access. Based on the accelerator architecture, we build accurate resource and performance modeling to explore optimal accelerator configurations under different resource constraints. We implement our proposed accelerator on multiple FPGAs, which outperforms the state-of-the-art designs in terms of both throughput and DSP efficiency. Our implementation achieves DSP efficiency up to 1.33 GOPS/DSP and throughput up to 3.1 TOPS with the Xilinx ZCU102 FPGA. These are 29.1% and 20.0% better than the best solutions reported previously, respectively.
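For reference, the Winograd minimal-filtering algorithm such accelerators build on can be shown in a few lines. Below is the standard 1-D F(2,3) transform (the Lavin-Gray matrices), which computes two outputs of a 3-tap convolution with 4 multiplications instead of 6; this is background for the abstract, not the WinoPE design itself:

```python
import numpy as np

# Standard F(2,3) Winograd transform matrices (Lavin & Gray, 2016).
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Two outputs of the valid 3-tap correlation of d (len 4) with g (len 3)."""
    return AT @ ((G @ g) * (BT @ d))   # only 4 elementwise multiplications

d = np.array([1.0, 2.0, 3.0, 4.0])
g = np.array([0.5, 1.0, -1.0])
assert np.allclose(winograd_f23(d, g), np.correlate(d, g, mode="valid"))
```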

【21】 Exploring Dropout Discriminator for Domain Adaptation

Authors: Vinod K Kurmi, Venkatesh K Subramanian, Vinay P. Namboodiri
Affiliations: Electrical Engineering Department, Indian Institute of Technology Kanpur, Kanpur, India; Department of Computer Science and Engineering, Indian Institute of Technology
Note: This work is an extension of our BMVC-2019 paper (arXiv:1907.10628)
Link: https://arxiv.org/abs/2107.04231
Abstract: Adaptation of a classifier to new domains is one of the challenging problems in machine learning. This has been addressed using many deep and non-deep learning based methods. Among the methodologies used, that of adversarial learning is widely applied to solve many deep learning problems along with domain adaptation. These methods are based on a discriminator that ensures source and target distributions are close. However, here we suggest that rather than using a point estimate obtaining by a single discriminator, it would be useful if a distribution based on ensembles of discriminators could be used to bridge this gap. This could be achieved using multiple classifiers or using traditional ensemble methods. In contrast, we suggest that a Monte Carlo dropout based ensemble discriminator could suffice to obtain the distribution based discriminator. Specifically, we propose a curriculum based dropout discriminator that gradually increases the variance of the sample based distribution and the corresponding reverse gradients are used to align the source and target feature representations. An ensemble of discriminators helps the model to learn the data distribution efficiently. It also provides a better gradient estimates to train the feature extractor. The detailed results and thorough ablation analysis show that our model outperforms state-of-the-art results.
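The central trick, a Monte Carlo dropout discriminator standing in for an explicit ensemble, can be sketched briefly: keeping dropout active at evaluation time and sampling several forward passes yields a distribution of domain predictions rather than a point estimate. A minimal PyTorch sketch with our own layer sizes; the curriculum that schedules the variance of this distribution is part of the paper and is not reproduced here:

```python
import torch
import torch.nn as nn

class DropoutDiscriminator(nn.Module):
    """Domain discriminator whose dropout layers define an implicit ensemble."""
    def __init__(self, feat_dim, hidden=256, p=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Dropout(p),
            nn.Linear(hidden, 1))
    def forward(self, feats):
        return self.net(feats)

def mc_dropout_domain_probs(disc, feats, n_samples=10):
    """Sample the implicit ensemble: dropout stays stochastic at every pass."""
    disc.train()  # keep dropout active even when evaluating
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(disc(feats))
                             for _ in range(n_samples)])
    return probs.mean(0), probs.var(0)   # distribution-based estimate
```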

【22】 Activated Gradients for Deep Neural Networks

Authors: Mei Liu, Liangming Chen, Xiaohao Du, Long Jin, Mingsheng Shang
Affiliations: School of Information Science and Engineering, Lanzhou University
Link: https://arxiv.org/abs/2107.04228
Abstract: Deep neural networks often suffer from poor performance or even training failure due to the ill-conditioned problem, the vanishing/exploding gradient problem, and the saddle point problem. In this paper, a novel method by acting the gradient activation function (GAF) on the gradient is proposed to handle these challenges. Intuitively, the GAF enlarges the tiny gradients and restricts the large gradient. Theoretically, this paper gives conditions that the GAF needs to meet, and on this basis, proves that the GAF alleviates the problems mentioned above. In addition, this paper proves that the convergence rate of SGD with the GAF is faster than that without the GAF under some assumptions. Furthermore, experiments on CIFAR, ImageNet, and PASCAL visual object classes confirm the GAF's effectiveness. The experimental results also demonstrate that the proposed method is able to be adopted in various deep neural networks to improve their performance. The source code is publicly available at https://github.com/LongJin-lab/Activated-Gradients-for-Deep-Neural-Networks.
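The described mechanism can be sketched as a post-backward step that squashes each parameter gradient elementwise with a saturating function whose slope exceeds one near zero, so tiny gradients are enlarged and large ones restricted. A PyTorch sketch using a scaled tanh as one possible GAF; the specific function and hyperparameters here are illustrative choices, not necessarily the paper's:

```python
import torch

def apply_gaf_(model, alpha=2.0, beta=1.0):
    """Apply a gradient activation function g' = alpha * tanh(beta * g) in place.

    With alpha * beta > 1 the slope at zero exceeds 1, so tiny gradients are
    amplified, while |g'| is bounded by alpha, so large gradients are clipped.
    Call between loss.backward() and optimizer.step().
    """
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p.grad.copy_(alpha * torch.tanh(beta * p.grad))

# usage:
#   loss.backward()
#   apply_gaf_(model)
#   optimizer.step()
```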

【23】 Safe Learning of Lifted Action Models

Authors: Brendan Juba, Hai S. Le, Roni Stern
Affiliations: Washington University in St. Louis, USA; Palo Alto Research Center, USA; Ben Gurion University of the Negev, Israel
Link: https://arxiv.org/abs/2107.04169
Abstract: Creating a domain model, even for classical, domain-independent planning, is a notoriously hard knowledge-engineering task. A natural approach to solve this problem is to learn a domain model from observations. However, model learning approaches frequently do not provide safety guarantees: the learned model may assume actions are applicable when they are not, and may incorrectly capture actions' effects. This may result in generating plans that will fail when executed. In some domains such failures are not acceptable, due to the cost of failure or inability to replan online after failure. In such settings, all learning must be done offline, based on some observations collected, e.g., by some other agents or a human. Through this learning, the task is to generate a plan that is guaranteed to be successful. This is called the model-free planning problem. Prior work proposed an algorithm for solving the model-free planning problem in classical planning. However, they were limited to learning grounded domains, and thus they could not scale. We generalize this prior work and propose the first safe model-free planning algorithm for lifted domains. We prove the correctness of our approach, and provide a statistical analysis showing that the number of trajectories needed to solve future problems with high probability is linear in the potential size of the domain model. We also present experiments on twelve IPC domains showing that our approach is able to learn the real action model in all cases with at most two trajectories.

【24】 Parallel and Multi-Objective Falsification with Scenic and VerifAI

Authors: Kesav Viswanadha, Edward Kim, Francis Indaheng, Daniel J. Fremont, Sanjit A. Seshia
Affiliations: University of California, Berkeley; University of California, Santa Cruz
Link: https://arxiv.org/abs/2107.04164
Abstract: Falsification has emerged as an important tool for simulation-based verification of autonomous systems. In this paper, we present extensions to the Scenic scenario specification language and VerifAI toolkit that improve the scalability of sampling-based falsification methods by using parallelism and extend falsification to multi-objective specifications. We first present a parallelized framework that is interfaced with both the simulation and sampling capabilities of Scenic and the falsification capabilities of VerifAI, reducing the execution time bottleneck inherently present in simulation-based testing. We then present an extension of VerifAI's falsification algorithms to support multi-objective optimization during sampling, using the concept of rulebooks to specify a preference ordering over multiple metrics that can be used to guide the counterexample search process. Lastly, we evaluate the benefits of these extensions with a comprehensive set of benchmarks written in the Scenic language.

【25】 Levi Graph AMR Parser using Heterogeneous Attention

Authors: Han He, Jinho D. Choi
Affiliations: Computer Science, Emory University, Atlanta, GA, USA
Note: Accepted in IWPT 2021: The 17th International Conference on Parsing Technologies
Link: https://arxiv.org/abs/2107.04152
Abstract: Coupled with biaffine decoders, transformers have been effectively adapted to text-to-graph transduction and achieved state-of-the-art performance on AMR parsing. Many prior works, however, rely on the biaffine decoder for either or both arc and label predictions although most features used by the decoder may be learned by the transformer already. This paper presents a novel approach to AMR parsing by combining heterogeneous data (tokens, concepts, labels) as one input to a transformer to learn attention, and use only attention matrices from the transformer to predict all elements in AMR graphs (concepts, arcs, labels). Although our models use significantly fewer parameters than the previous state-of-the-art graph parser, they show similar or better accuracy on AMR 2.0 and 3.0.

【26】 Does Form Follow Function? An Empirical Exploration of the Impact of Deep Neural Network Architecture Design on Hardware-Specific Acceleration

Authors: Saad Abbasi, Mohammad Javad Shafiee, Ellick Chan, Alexander Wong
Affiliations: University of Waterloo, Waterloo, ON, Canada; DarwinAI; Intel Corporation, United States
Note: 8 pages
Link: https://arxiv.org/abs/2107.04144
Abstract: The fine-grained relationship between form and function with respect to deep neural network architecture design and hardware-specific acceleration is one area that is not well studied in the research literature, with form often dictated by accuracy as opposed to hardware function. In this study, a comprehensive empirical exploration is conducted to investigate the impact of deep neural network architecture design on the degree of inference speedup that can be achieved via hardware-specific acceleration. More specifically, we empirically study the impact of a variety of commonly used macro-architecture design patterns across different architectural depths through the lens of OpenVINO microprocessor-specific and GPU-specific acceleration. Experimental results showed that while leveraging hardware-specific acceleration achieved an average inference speed-up of 380%, the degree of inference speed-up varied drastically depending on the macro-architecture design pattern, with the greatest speedup achieved on the depthwise bottleneck convolution design pattern at 550%. Furthermore, we conduct an in-depth exploration of the correlation between FLOPs requirement, level 3 cache efficacy, and network latency with increasing architectural depth and width. Finally, we analyze the inference time reductions using hardware-specific acceleration when compared to native deep learning frameworks across a wide variety of hand-crafted deep convolutional neural network architecture designs as well as ones found via neural architecture search strategies. We found the DARTS-derived architecture to benefit from the greatest improvement from hardware-specific software acceleration (1200%) and the depthwise bottleneck convolution-based MobileNet-V2 to have the lowest overall inference time of around 2.4 ms.

【27】 Learning to Delegate for Large-scale Vehicle Routing

Authors: Sirui Li, Zhongxia Yan, Cathy Wu
Affiliations: MIT
Link: https://arxiv.org/abs/2107.04139
Abstract: Vehicle routing problems (VRPs) are a class of combinatorial problems with wide practical applications. While previous heuristic or learning-based works achieve decent solutions on small problem instances of up to 100 customers, their performance does not scale to large problems. This article presents a novel learning-augmented local search algorithm to solve large-scale VRP. The method iteratively improves the solution by identifying appropriate subproblems and delegating their improvement to a black box subsolver. At each step, we leverage spatial locality to consider only a linear number of subproblems, rather than exponential. We frame subproblem selection as a regression problem and train a Transformer on a generated training set of problem instances. We show that our method achieves state-of-the-art performance, with a speed-up of up to 15 times over strong baselines, on VRPs with sizes ranging from 500 to 3000.

【28】 A Systematic Survey of Text Worlds as Embodied Natural Language Environments

Authors: Peter A Jansen
Affiliations: School of Information, University of Arizona
Note: 18 pages
Link: https://arxiv.org/abs/2107.04132
Abstract: Text Worlds are virtual environments for embodied agents that, unlike 2D or 3D environments, are rendered exclusively using textual descriptions. These environments offer an alternative to higher-fidelity 3D environments due to their low barrier to entry, providing the ability to study semantics, compositional inference, and other high-level tasks with rich high-level action spaces while controlling for perceptual input. This systematic survey outlines recent developments in tooling, environments, and agent modeling for Text Worlds, while examining recent trends in knowledge graphs, common sense reasoning, transfer learning of Text World performance to higher-fidelity environments, as well as near-term development targets that, once achieved, make Text Worlds an attractive general research paradigm for natural language processing.

【29】 The Multi-phase spatial meta-heuristic algorithm for public health emergency transportation

Authors: Fariba Afrin Irany, Arnav Iyer, Rubenia Borge Flores, Armin R. Mikler
Affiliations: University of North Texas, Denton; Georgia State University, Atlanta
Note: 17 pages, 3 figures, 3 tables
Link: https://arxiv.org/abs/2107.04125
Abstract: The delivery of Medical Countermeasures (MCMs) for mass prophylaxis in the case of a bio-terrorist attack is an active research topic that has interested the research community over the past decades. The objective of this study is to design an efficient algorithm for the Receive Reload and Store Problem (RSS) in which we aim to find feasible routes to deliver MCMs to a target population considering time, physical, and human resources, and capacity limitations. For doing this, we adapt the p-median problem to the POD-based emergency response planning procedures and propose an efficient algorithm solution to perform the p-median in reasonable computational time. We present RE-PLAN, the Response PLan Analyzer system that contains some RSS solutions developed at The Center for Computational Epidemiology and Response Analysis (CeCERA) at the University of North Texas. Finally, we analyze a study case where we show how the computational performance of the algorithm can impact the process of decision making and emergency planning in the short and long terms.
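For context, the p-median objective the authors adapt picks p facility sites (here, PODs) minimizing the total distance from each demand point to its nearest open site; a greedy heuristic is a common way to obtain feasible solutions quickly. A numpy sketch of that generic heuristic, offered as a standard baseline rather than the specific algorithm inside RE-PLAN:

```python
import numpy as np

def greedy_p_median(dist, p):
    """Greedy heuristic for the p-median facility-location problem.

    dist[i, j] = distance from demand point i to candidate site j.
    Repeatedly opens the site that most reduces the total distance from
    every demand point to its nearest open site.
    """
    n_points, n_sites = dist.shape
    open_sites = []
    nearest = np.full(n_points, np.inf)   # distance to the closest open site
    for _ in range(p):
        costs = [np.minimum(nearest, dist[:, j]).sum()
                 if j not in open_sites else np.inf
                 for j in range(n_sites)]
        j_star = int(np.argmin(costs))
        open_sites.append(j_star)
        nearest = np.minimum(nearest, dist[:, j_star])
    return open_sites, nearest.sum()

# usage: sites, total_distance = greedy_p_median(np.random.rand(100, 20), p=4)
```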

【30】 Even Faster SNN Simulation with Lazy Event-driven Plasticity and Shared Atomics

Authors: Dennis Bautembach, Iason Oikonomidis, Antonis Argyros
Affiliations: FORTH - ICS & CSD - UOC
Note: Submitted to IEEE-HPEC 2021
Link: https://arxiv.org/abs/2107.04092
Abstract: We present two novel optimizations that accelerate clock-based spiking neural network (SNN) simulators. The first one targets spike timing dependent plasticity (STDP). It combines lazy- with event-driven plasticity and efficiently facilitates the computation of pre- and post-synaptic spikes using bitfields and integer intrinsics. It offers higher bandwidth than event-driven plasticity alone and achieves a 1.5x-2x speedup over our closest competitor. The second optimization targets spike delivery. We partition our graph representation in a way that bounds the number of neurons that need be updated at any given time which allows us to perform said update in shared memory instead of global memory. This is 2x-2.5x faster than our closest competitor. Both optimizations represent the final evolutionary stages of years of iteration on STDP and spike delivery inside "Spice" (/spaIk/), our state of the art SNN simulator. The proposed optimizations are not exclusive to our graph representation or pipeline but are applicable to a multitude of simulator designs. We evaluate our performance on three well-established models and compare ourselves against three other state of the art simulators.
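The bitfield idea behind the first optimization can be illustrated in plain Python: encode each neuron's recent spike history as an integer bitmask, so checking which past time steps saw both a presynaptic and a postsynaptic spike becomes a bitwise AND plus a population count. A toy sketch; real implementations use GPU integer intrinsics (e.g. __popc in CUDA), and the coincidence-count weighting below is a placeholder for a proper STDP rule:

```python
def coincident_spikes(pre_history, post_history, window=32):
    """Count time steps where both neurons spiked, using bitfields.

    Each history is an int whose bit t is 1 if the neuron spiked t steps
    ago; AND + popcount replaces an explicit loop over the window.
    """
    mask = (1 << window) - 1
    return bin(pre_history & post_history & mask).count("1")

def lazy_stdp_update(w, pre_history, post_history, lr=1e-3):
    """Deferred (lazy) plasticity: fold a whole window of spike history into
    one synapse update instead of updating at every spike event."""
    return w + lr * coincident_spikes(pre_history, post_history)

# usage: each step, shift in the newest spike bit per neuron:
#   history = ((history << 1) | int(spiked)) & ((1 << 32) - 1)
```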

【31】 Robust Counterfactual Explanations on Graph Neural Networks

Authors: Mohit Bajaj, Lingyang Chu, Zi Yu Xue, Jian Pei, Lanjun Wang, Peter Cho-Ho Lam, Yong Zhang
Affiliations: Huawei Technologies Canada Co. Ltd.; McMaster University; The University of British Columbia; Simon Fraser University
Link: https://arxiv.org/abs/2107.04086
Abstract: Massive deployment of Graph Neural Networks (GNNs) in high-stake applications generates a strong demand for explanations that are robust to noise and align well with human intuition. Most existing methods generate explanations by identifying a subgraph of an input graph that has a strong correlation with the prediction. These explanations are not robust to noise because independently optimizing the correlation for a single input can easily overfit noise. Moreover, they do not align well with human intuition because removing an identified subgraph from an input graph does not necessarily change the prediction result. In this paper, we propose a novel method to generate robust counterfactual explanations on GNNs by explicitly modelling the common decision logic of GNNs on similar input graphs. Our explanations are naturally robust to noise because they are produced from the common decision boundaries of a GNN that govern the predictions of many similar input graphs. The explanations also align well with human intuition because removing the set of edges identified by an explanation from the input graph changes the prediction significantly. Exhaustive experiments on many public datasets demonstrate the superior performance of our method.

【32】 Scaling Gaussian Processes with Derivative Information Using Variational Inference

Authors: Misha Padidar, Xinran Zhu, Leo Huang, Jacob R. Gardner, David Bindel
Affiliations: Cornell University; University of Pennsylvania
Link: https://arxiv.org/abs/2107.04061
Abstract: Gaussian processes with derivative information are useful in many settings where derivative information is available, including numerous Bayesian optimization and regression tasks that arise in the natural sciences. Incorporating derivative observations, however, comes with a dominating $O(N^3D^3)$ computational cost when training on $N$ points in $D$ input dimensions. This is intractable for even moderately sized problems. While recent work has addressed this intractability in the low-$D$ setting, the high-$N$, high-$D$ setting is still unexplored and of great value, particularly as machine learning problems increasingly become high dimensional. In this paper, we introduce methods to achieve fully scalable Gaussian process regression with derivatives using variational inference. Analogous to the use of inducing values to sparsify the labels of a training set, we introduce the concept of inducing directional derivatives to sparsify the partial derivative information of a training set. This enables us to construct a variational posterior that incorporates derivative information but whose size depends neither on the full dataset size $N$ nor the full dimensionality $D$. We demonstrate the full scalability of our approach on a variety of tasks, ranging from a high dimensional stellarator fusion regression task to training graph convolutional neural networks on Pubmed using Bayesian optimization. Surprisingly, we find that our approach can improve regression performance even in settings where only label data is available.

【33】 Entropy, Information, and the Updating of Probabilities

Authors: Ariel Caticha
Affiliations: Department of Physics, University at Albany-SUNY, Albany, NY, USA
Note: 28 pages. Invited paper to appear in Entropy in the special volume "Statistical Foundations of Entropy", ed. by P. Jizba and J. Korbel. arXiv admin note: text overlap with arXiv:1412.5644
Link: https://arxiv.org/abs/2107.04529
Abstract: This paper is a review of a particular approach to the method of maximum entropy as a general framework for inference. The discussion emphasizes the pragmatic elements in the derivation. An epistemic notion of information is defined in terms of its relation to the Bayesian beliefs of ideally rational agents. The method of updating from a prior to a posterior probability distribution is designed through an eliminative induction process. The logarithmic relative entropy is singled out as the unique tool for updating that (a) is of universal applicability; (b) that recognizes the value of prior information; and (c) that recognizes the privileged role played by the notion of independence in science. The resulting framework -- the ME method -- can handle arbitrary priors and arbitrary constraints. It includes MaxEnt and Bayes' rule as special cases and, therefore, it unifies entropic and Bayesian methods into a single general inference scheme. The ME method goes beyond the mere selection of a single posterior, but also addresses the question of how much less probable other distributions might be, which provides a direct bridge to the theories of fluctuations and large deviations.
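The updating scheme the review builds on can be stated compactly. Below is a standard rendering of the logarithmic relative entropy and the exponential-family update it yields under expectation constraints; these are textbook ME/MaxEnt results consistent with the abstract, with the notation chosen here:

```latex
% Relative entropy of a candidate posterior p(x) relative to the prior q(x):
S[p, q] = -\int dx \, p(x) \log \frac{p(x)}{q(x)}
% Maximizing S subject to normalization and expectation constraints
% \langle f_k \rangle = \int dx \, p(x) f_k(x) = F_k gives the update
p(x) = \frac{q(x)}{Z} \exp\!\Big( \sum_k \lambda_k f_k(x) \Big),
\qquad
Z = \int dx \, q(x) \exp\!\Big( \sum_k \lambda_k f_k(x) \Big),
% with the Lagrange multipliers \lambda_k fixed by the constraints F_k.
```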
