Artificial Intelligence Academic Digest [9.6]

2021-09-16 15:54:53


cs.AI (Artificial Intelligence): 30 papers in total

【1】 CREAK: A Dataset for Commonsense Reasoning over Entity Knowledge
Link: https://arxiv.org/abs/2109.01653

Authors: Yasumasa Onoe, Michael J. Q. Zhang, Eunsol Choi, Greg Durrett
Affiliations: The University of Texas at Austin
Abstract: Most benchmark datasets targeting commonsense reasoning focus on everyday scenarios: physical knowledge like knowing that you could fill a cup under a waterfall [Talmor et al., 2019], social knowledge like bumping into someone is awkward [Sap et al., 2019], and other generic situations. However, there is a rich space of commonsense inferences anchored to knowledge about specific entities: for example, deciding the truthfulness of a claim "Harry Potter can teach classes on how to fly on a broomstick." Can models learn to combine entity knowledge with commonsense reasoning in this fashion? We introduce CREAK, a testbed for commonsense reasoning about entity knowledge, bridging fact-checking about entities (Harry Potter is a wizard and is skilled at riding a broomstick) with commonsense inferences (if you're good at a skill you can teach others how to do it). Our dataset consists of 13k human-authored English claims about entities that are either true or false, in addition to a small contrast set. Crowdworkers can easily come up with these statements and human performance on the dataset is high (high 90s); we argue that models should be able to blend entity knowledge and commonsense reasoning to do well here. In our experiments, we focus on the closed-book setting and observe that a baseline model finetuned on an existing fact verification benchmark struggles on CREAK. Training a model on CREAK improves accuracy by a substantial margin, but still falls short of human performance. Our benchmark provides a unique probe into natural language understanding models, testing both their ability to retrieve facts (e.g., who teaches at the University of Chicago?) and unstated commonsense knowledge (e.g., butlers do not yell at guests).

【2】 Integration of Data and Theory for Accelerated Derivable Symbolic Discovery
Link: https://arxiv.org/abs/2109.01634

Authors: Cristina Cornelio, Sanjeeb Dash, Vernon Austel, Tyler Josephson, Joao Goncalves, Kenneth Clarkson, Nimrod Megiddo, Bachir El Khadir, Lior Horesh
Affiliations: IBM Research; Samsung AI Center Cambridge; Department of Chemical, Biochemical, and Environmental Engineering, University of Maryland, Baltimore County; Department of Chemistry and Chemical Theory Center, University of Minnesota
Abstract: Scientists have long aimed to discover meaningful equations which accurately describe data. Machine learning algorithms automate construction of accurate data-driven models, but ensuring that these are consistent with existing knowledge is a challenge. We developed a methodology combining automated theorem proving with symbolic regression, enabling principled derivations of laws of nature. We demonstrate this for Kepler's third law, Einstein's relativistic time dilation, and Langmuir's theory of adsorption, automatically connecting experimental data with background theory in each case. The combination of logical reasoning with machine learning provides generalizable insights into key aspects of the natural phenomena.

【3】 Stochastic Physics-Informed Neural Networks (SPINN): A Moment-Matching Framework for Learning Hidden Physics within Stochastic Differential Equations
Link: https://arxiv.org/abs/2109.01621

Authors: Jared O'Leary, Joel A. Paulson, Ali Mesbah
Affiliations: Department of Chemical and Biomolecular Engineering, University of California, Berkeley, CA, USA; Department of Chemical and Biomolecular Engineering, The Ohio State University, Columbus, OH, USA
Abstract: Stochastic differential equations (SDEs) are used to describe a wide variety of complex stochastic dynamical systems. Learning the hidden physics within SDEs is crucial for unraveling fundamental understanding of the stochastic and nonlinear behavior of these systems. We propose a flexible and scalable framework for training deep neural networks to learn constitutive equations that represent hidden physics within SDEs. The proposed stochastic physics-informed neural network framework (SPINN) relies on uncertainty propagation and moment-matching techniques along with state-of-the-art deep learning strategies. SPINN first propagates stochasticity through the known structure of the SDE (i.e., the known physics) to predict the time evolution of statistical moments of the stochastic states. SPINN learns (deep) neural network representations of the hidden physics by matching the predicted moments to those estimated from data. Recent advances in automatic differentiation and mini-batch gradient descent are leveraged to establish the unknown parameters of the neural networks. We demonstrate SPINN on three benchmark in-silico case studies and analyze the framework's robustness and numerical stability. SPINN provides a promising new direction for systematically unraveling the hidden physics of multivariate stochastic dynamical systems with multiplicative noise.
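
To make the moment-matching loop concrete, the following is a minimal sketch of the idea, not the authors' implementation: it assumes a scalar SDE with a hypothetical known linear drift, models the hidden physics with a small network, and propagates the first two moments with an Euler scheme (ignoring the drift's contribution to the variance for brevity).

```python
import torch
import torch.nn as nn

# hidden physics: an unknown term in the drift, modeled by a small network
f_hidden = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

def propagate_moments(mu0, var0, sigma, dt, n_steps):
    """Euler propagation of mean/variance for dx = (-0.5*x + f_hidden(x))dt + sigma dW.
    The -0.5*x drift stands in for the 'known physics' and is hypothetical."""
    mu, var = mu0, var0
    mus, variances = [], []
    for _ in range(n_steps):
        drift = -0.5 * mu + f_hidden(mu.unsqueeze(-1)).squeeze(-1)
        mu = mu + drift * dt
        var = var + sigma**2 * dt  # diffusion term only; drift Jacobian omitted for brevity
        mus.append(mu)
        variances.append(var)
    return torch.stack(mus), torch.stack(variances)

def moment_matching_loss(mu_pred, var_pred, mu_data, var_data):
    """Match predicted moments to moments estimated from observed sample paths."""
    return ((mu_pred - mu_data) ** 2).mean() + ((var_pred - var_data) ** 2).mean()

# one gradient step against (synthetic) empirical moments
opt = torch.optim.Adam(f_hidden.parameters(), lr=1e-3)
mu_p, var_p = propagate_moments(torch.tensor(1.0), torch.tensor(0.1),
                                sigma=0.2, dt=0.01, n_steps=50)
loss = moment_matching_loss(mu_p, var_p, torch.zeros(50), 0.1 * torch.ones(50))
opt.zero_grad(); loss.backward(); opt.step()
```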

【4】 Multi-model Machine Learning Inference Serving with GPU Spatial Partitioning
Link: https://arxiv.org/abs/2109.01611

Authors: Seungbeom Choi, Sunho Lee, Yeonjae Kim, Jongse Park, Youngjin Kwon, Jaehyuk Huh
Affiliations: School of Computing, KAIST
Abstract: As machine learning techniques are applied to a widening range of applications, high-throughput machine learning (ML) inference servers have become critical for online service applications. Such ML inference servers pose two challenges: first, they must provide bounded latency for each request to support consistent service-level objectives (SLOs), and second, they may serve multiple heterogeneous ML models in a system, as certain tasks involve invocation of multiple models and consolidating multiple models can improve system utilization. To address these two requirements, this paper proposes a new ML inference scheduling framework for multi-model ML inference servers. The paper first shows that with SLO constraints, current GPUs are not fully utilized for ML inference tasks. To maximize the resource efficiency of inference servers, a key mechanism proposed in this paper is to exploit hardware support for spatial partitioning of GPU resources. With the partitioning mechanism, a new abstraction layer of GPU resources is created with configurable GPU resources. The scheduler assigns requests to virtual GPUs, called gpu-lets, with the most effective amount of resources. The paper also investigates a remedy for potential interference effects when two ML tasks run concurrently on a GPU. Our prototype implementation shows that spatial partitioning enhances throughput by 102.6% on average while satisfying SLOs.
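
As a rough illustration of the gpu-let abstraction, the sketch below greedily assigns each model the smallest GPU partition whose profiled latency still meets its SLO; the partition sizes and latency table are hypothetical, not from the paper.

```python
PARTITIONS = [0.25, 0.5, 0.75, 1.0]  # fraction of a GPU assigned to a gpu-let

# hypothetical profiled latency (ms) of each model on each partition size
PROFILED_MS = {
    "resnet50":  {0.25: 18.0, 0.5: 10.0, 0.75: 8.0, 1.0: 7.0},
    "bert-base": {0.25: 45.0, 0.5: 24.0, 0.75: 17.0, 1.0: 14.0},
}

def pick_gpulet(model: str, slo_ms: float) -> float:
    """Smallest partition meeting the SLO, leaving the rest of the GPU for other models."""
    for frac in PARTITIONS:
        if PROFILED_MS[model][frac] <= slo_ms:
            return frac
    raise RuntimeError(f"no partition meets a {slo_ms} ms SLO for {model}")

print(pick_gpulet("bert-base", slo_ms=25.0))  # -> 0.5
```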

【5】 Super Neurons
Link: https://arxiv.org/abs/2109.01594

Authors: Serkan Kiranyaz, Junaid Malik, Mehmet Yamac, Esin Guldogan, Turker Ince, Moncef Gabbouj
Affiliations: Electrical & Electronics Engineering Department; Izmir University of Economics, Turkey
Abstract: Operational Neural Networks (ONNs) are new-generation network models that can perform any (non-linear) transformation with a proper combination of "nodal" and "pool" operators. However, they still have a certain restriction, namely the sole usage of a single nodal operator for all (synaptic) connections of each neuron. The idea behind the "generative neurons" was born as a remedy for this restriction, where each nodal operator can be "customized" during training in order to maximize the learning performance. Self-Organized ONNs (Self-ONNs) composed of generative neurons can achieve an utmost level of diversity even with a compact configuration; however, they still suffer from the last property inherited from CNNs: localized kernel operations, which impose a severe limitation on the information flow between layers. It is, therefore, desirable for the neurons to gather information from a larger area in the previous layer maps without increasing the kernel size. For certain applications, it might be even more desirable "to learn" the kernel locations of each connection during the training process along with the customized nodal operators, so that both can be optimized simultaneously. This study introduces the super (generative) neuron models that can accomplish this without altering the kernel sizes and will enable a significant diversity in terms of information flow. The two models of super neurons proposed in this study vary in the localization process of the kernels: i) randomly localized kernels within a bias range set for each layer, ii) optimized locations of each kernel during Back-Propagation (BP) training. The extensive set of comparative evaluations shows that Self-ONNs with super neurons can indeed achieve superior learning and generalization capability without any significant rise in computational complexity.
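
A minimal sketch of the first (randomly localized) variant follows, assuming the per-connection shifts can be modeled by rolling each input channel before an ordinary convolution; this is our reading of the mechanism, not the authors' code.

```python
import torch
import torch.nn as nn

class RandomlyLocalizedConv(nn.Module):
    """Each input channel is read from a spatially shifted window, with shifts
    drawn once within a bias range (max_shift) as in variant (i)."""
    def __init__(self, c_in, c_out, k=3, max_shift=2):
        super().__init__()
        self.conv = nn.Conv2d(c_in, c_out, k, padding=k // 2)
        # one fixed (dy, dx) shift per input channel
        self.register_buffer("shifts",
                             torch.randint(-max_shift, max_shift + 1, (c_in, 2)))

    def forward(self, x):                       # x: (batch, c_in, H, W)
        shifted = torch.stack(
            [torch.roll(x[:, c], shifts=tuple(self.shifts[c].tolist()), dims=(1, 2))
             for c in range(x.shape[1])], dim=1)
        return self.conv(shifted)
```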

【6】 Learning from Multiple Noisy Augmented Data Sets for Better Cross-Lingual Spoken Language Understanding
Link: https://arxiv.org/abs/2109.01583

Authors: Yingmei Guo, Linjun Shou, Jian Pei, Ming Gong, Mingxing Xu, Zhiyong Wu, Daxin Jiang
Affiliations: Department of Computer Science and Technology, Tsinghua University; NLP Group, Microsoft STCA; School of Computing Science, Simon Fraser University
Note: Long paper at EMNLP 2021
Abstract: Lack of training data presents a grand challenge to scaling out spoken language understanding (SLU) to low-resource languages. Although various data augmentation approaches have been proposed to synthesize training data in low-resource target languages, the augmented data sets are often noisy, and thus impede the performance of SLU models. In this paper we focus on mitigating noise in augmented data. We develop a denoising training approach: multiple models are trained with data produced by various augmentation methods, and those models provide supervision signals to each other. The experimental results show that our method outperforms the existing state of the art by 3.05 and 4.24 percentage points on two benchmark datasets, respectively. The code will be open-sourced on GitHub.
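
The mutual-supervision loop might look like the following sketch, in which each model fits its own augmented dataset while being regularized toward its peers' averaged predictions; the exact formulation in the paper may differ, and the 0.5 weight is an arbitrary choice.

```python
import torch
import torch.nn.functional as F

models = [torch.nn.Linear(16, 4) for _ in range(3)]   # one model per augmentation method
opts = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in models]

def step(batches):  # batches[i] = (x, y) from the i-th augmented dataset
    for i, (x, y) in enumerate(batches):
        logits = models[i](x)
        ce = F.cross_entropy(logits, y)               # noisy hard labels
        with torch.no_grad():                          # peers' soft predictions on the same x
            peer = torch.stack([F.softmax(m(x), -1)
                                for j, m in enumerate(models) if j != i]).mean(0)
        kl = F.kl_div(F.log_softmax(logits, -1), peer, reduction="batchmean")
        loss = ce + 0.5 * kl                           # peer agreement as a denoising signal
        opts[i].zero_grad(); loss.backward(); opts[i].step()

# one synthetic step
step([(torch.randn(8, 16), torch.randint(0, 4, (8,))) for _ in range(3)])
```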

【7】 Continuous-Time Behavior Trees as Discontinuous Dynamical Systems
Link: https://arxiv.org/abs/2109.01575

Authors: Christopher Iliffe Sprague, Petter Ögren
Affiliations: School of Electrical Engineering and Computer Science, Royal Institute of Technology (KTH)
Note: To be submitted to the IEEE Control Systems Letters (L-CSS)
Abstract: Behavior trees represent a hierarchical and modular way of combining several low-level control policies into a high-level task-switching policy. Hybrid dynamical systems can also be seen in terms of task switching between different policies, and therefore several comparisons between behavior trees and hybrid dynamical systems have been made, but only informally, and only in discrete time. A formal continuous-time formulation of behavior trees has been lacking. Additionally, convergence analyses of specific classes of behavior tree designs have been made, but not for general designs. In this letter, we provide the first continuous-time formulation of behavior trees, show that they can be seen as discontinuous dynamical systems (a subclass of hybrid dynamical systems), which enables the application of existence and uniqueness results to behavior trees, and finally, provide sufficient conditions under which such systems will converge to a desired region of the state space for general designs. With these results, a large body of results on continuous-time dynamical systems can be brought to use when designing behavior tree controllers.

【8】 Situated Conditional Reasoning
Link: https://arxiv.org/abs/2109.01552

Authors: Giovanni Casini, Thomas Meyer, Ivan Varzinczak
Affiliations: ISTI-CNR, Italy; University of Cape Town, South Africa; CRIL, Univ. Artois & CNRS, France; CAIR, South Africa; Stellenbosch University, South Africa
Note: 51 pages
Abstract: Conditionals are useful for modelling, but are not always sufficiently expressive for capturing information accurately. In this paper we make the case for a form of conditional that is situation-based. These conditionals are more expressive than classical conditionals, are general enough to be used in several application domains, and are able to distinguish, for example, between expectations and counterfactuals. Formally, they are shown to generalise the conditional setting in the style of Kraus, Lehmann, and Magidor. We show that situation-based conditionals can be described in terms of a set of rationality postulates. We then propose an intuitive semantics for these conditionals, and present a representation result which shows that our semantic construction corresponds exactly to the description in terms of postulates. With the semantics in place, we proceed to define a form of entailment for situated conditional knowledge bases, which we refer to as minimal closure. It is reminiscent of and, indeed, inspired by, the version of entailment for propositional conditional knowledge bases known as rational closure. Finally, we proceed to show that it is possible to reduce the computation of minimal closure to a series of propositional entailment and satisfiability checks. While this is also the case for rational closure, it is somewhat surprising that the result carries over to minimal closure.

【9】 Ontology-driven Knowledge Graph for Android Malware
Link: https://arxiv.org/abs/2109.01544

Authors: Ryan Christian, Sharmishtha Dutta, Youngja Park, Nidhi Rastogi
Affiliations: Rensselaer Polytechnic Institute; IBM TJ Watson Research Center, New York, USA
Note: 3 pages, 5 figures
Abstract: We present MalONT2.0 -- an ontology for malware threat intelligence (Rastogi et al., 2020). New classes (attack patterns, infrastructural resources to enable attacks, malware analysis to incorporate static analysis, and dynamic analysis of binaries) and relations have been added following a broadened scope of core competency questions. MalONT2.0 allows researchers to extensively capture all requisite classes and relations that gather semantic and syntactic characteristics of an Android malware attack. This ontology forms the basis for the malware threat intelligence knowledge graph, MalKG, which we exemplify using three different, non-overlapping demonstrations. Malware features have been extracted from CTI reports on Android threat intelligence shared on the Internet and written in the form of unstructured text. Some of these sources are blogs, threat intelligence reports, tweets, and news articles. The smallest unit of information that captures malware features is written as triples comprising head and tail entities, each connected with a relation. In the poster and demonstration, we discuss MalONT2.0, MalKG, as well as the dynamically growing knowledge graph, TINKER.
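
For illustration, the head-relation-tail triples could look like the hypothetical examples below; the entity and relation names are invented for clarity and are not taken from the MalONT2.0 specification.

```python
# Hypothetical examples of the (head, relation, tail) triples MalKG is built from;
# names, domain, and IP use documentation-reserved placeholders.
triples = [
    ("SpyNote",        "usesAttackPattern", "screen_capture"),
    ("SpyNote",        "communicatesWith",  "c2.example.com"),
    ("c2.example.com", "resolvesTo",        "203.0.113.7"),
]
```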

【10】 A Longitudinal Multi-modal Dataset for Dementia Monitoring and Diagnosis
Link: https://arxiv.org/abs/2109.01537

Authors: Dimitris Gkoumas, Bo Wang, Adam Tsakalidis, Maria Wolters, Arkaitz Zubiaga, Matthew Purver, Maria Liakata
Affiliations: School of Electronic Engineering and Computer Science, Queen Mary University of London, UK; Department of Psychiatry, University of Oxford, UK; The Alan Turing Institute, London, UK; School of Informatics, University of Edinburgh, UK
Abstract: Dementia is a family of neurodegenerative conditions affecting memory and cognition in an increasing number of individuals in our globally aging population. Automated analysis of language, speech and paralinguistic indicators has been gaining popularity as a potential indicator of cognitive decline. Here we propose a novel longitudinal multi-modal dataset collected from people with mild dementia and age-matched controls over a period of several months in a natural setting. The multi-modal data consists of spoken conversations, a subset of which are transcribed, as well as typed and written thoughts and associated extra-linguistic information such as pen strokes and keystrokes. We describe the dataset in detail and proceed to focus on a task using the speech modality. The latter involves distinguishing controls from people with dementia by exploiting the longitudinal nature of the data. Our experiments showed significant differences in how the speech varied from session to session in the control and dementia groups.

【11】 A brief history of AI: how to prevent another winter (a critical review)
Link: https://arxiv.org/abs/2109.01517

Authors: Amirhosein Toosi, Andrea Bottino, Babak Saboury, Eliot Siegel, Arman Rahmim
Affiliations: Department of Integrative Oncology, BC Cancer Research Institute, Vancouver, BC; Department of Computer and Control Eng., Polytechnic University of Turin, Turin, Italy; Department of Radiology and Imaging Sciences, National Institutes of Health, Bethesda, MD
Note: 20 pages, 12 figures, 106 references; a Glossary section comes at the end of the paper, right after References. The article is accepted and will be published by Elsevier in the journal PET Clinics.
Abstract: The field of artificial intelligence (AI), regarded as one of the most enigmatic areas of science, has witnessed exponential growth in the past decade including a remarkably wide array of applications, having already impacted our everyday lives. Advances in computing power and the design of sophisticated AI algorithms have enabled computers to outperform humans in a variety of tasks, especially in the areas of computer vision and speech recognition. Yet, AI's path has never been smooth, having essentially fallen apart twice in its lifetime ('winters' of AI), both after periods of popular success ('summers' of AI). We provide a brief rundown of AI's evolution over the course of decades, highlighting its crucial moments and major turning points from inception to the present. In doing so, we attempt to learn, anticipate the future, and discuss what steps may be taken to prevent another 'winter'.

【12】 Computing Graph Descriptors on Edge Streams
Link: https://arxiv.org/abs/2109.01494

Authors: Zohair Raza Hassan, Imdadullah Khan, Mudassir Shabbir, Waseem Abbas
Affiliations: Rochester Institute of Technology, USA; Lahore University of Management Sciences, Pakistan; Information Technology University, Pakistan; The University of Texas at Dallas, USA
Note: Extension of work accepted to PAKDD 2020
Abstract: Graph feature extraction is a fundamental task in graph analytics. Using feature vectors (graph descriptors) in tandem with data mining algorithms that operate on Euclidean data, one can solve problems such as classification, clustering, and anomaly detection on graph-structured data. This idea has proved fruitful in the past, with spectral-based graph descriptors providing state-of-the-art classification accuracy on benchmark datasets. However, these algorithms do not scale to large graphs since: 1) they require storing the entire graph in memory, and 2) the end-user has no control over the algorithm's runtime. In this paper, we present single-pass streaming algorithms to approximate structural features of graphs (counts of subgraphs of order $k \geq 4$). Operating on edge streams allows us to avoid keeping the entire graph in memory, and controlling the sample size enables us to control the time taken by the algorithm. We demonstrate the efficacy of our descriptors by analyzing the approximation error, classification accuracy, and scalability to massive graphs. Our experiments showcase the effect of the sample size on approximation error and predictive accuracy. The proposed descriptors are applicable to graphs with millions of edges within minutes and outperform the state-of-the-art descriptors in classification accuracy.
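
The single-pass sampling idea can be sketched as follows; for brevity the sketch counts triangles rather than the order-4+ subgraphs used in the paper, and the scaling factor assumes edges survive sampling roughly independently.

```python
import random

def stream_triangle_estimate(edge_stream, k=10_000):
    """One pass over (u, v) edges; the reservoir size k trades accuracy for time."""
    reservoir, seen = [], 0
    for u, v in edge_stream:
        seen += 1
        if len(reservoir) < k:
            reservoir.append((u, v))
        else:
            j = random.randrange(seen)          # classic reservoir sampling
            if j < k:
                reservoir[j] = (u, v)
    if seen == 0:
        return 0.0
    # count triangles inside the sample
    adj = {}
    for u, v in reservoir:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    tri_in_sample = sum(len(adj[u] & adj[v]) for u, v in reservoir) // 3
    p = min(1.0, k / seen)                      # chance a given edge survived sampling
    return tri_in_sample / p**3                 # a triangle needs all 3 of its edges
```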

【13】 LG4AV: Combining Language Models and Graph Neural Networks for Author Verification
Link: https://arxiv.org/abs/2109.01479

Authors: Maximilian Stubbemann, Gerd Stumme
Affiliations: L3S Research Center and University of Kassel, Kassel, Germany; University of Kassel and L3S Research Center
Note: 9 pages, 1 figure
Abstract: The automatic verification of document authorships is important in various settings. Researchers are for example judged and compared by the amount and impact of their publications, and public figures are confronted by their posts on social media platforms. Therefore, it is important that authorship information in frequently used web services and platforms is correct. The question whether a given document is written by a given author is commonly referred to as authorship verification (AV). While AV is a widely investigated problem in general, only a few works consider settings where the documents are short and written in a rather uniform style. This makes most approaches impractical for online databases and knowledge graphs in the scholarly domain. Here, authorships of scientific publications have to be verified, often with just abstracts and titles available. To this end, we present our novel approach LG4AV, which combines language models and graph neural networks for authorship verification. By directly feeding the available texts into a pre-trained transformer architecture, our model does not need any hand-crafted stylometric features that are not meaningful in scenarios where the writing style is, at least to some extent, standardized. By the incorporation of a graph neural network structure, our model can benefit from relations between authors that are meaningful with respect to the verification process. For example, scientific authors are more likely to write about topics that are addressed by their co-authors, and Twitter users tend to post about the same subjects as people they follow. We experimentally evaluate our model and study to which extent the inclusion of co-authorships enhances verification decisions in bibliometric environments.

【14】 Is Machine Learning Ready for Traffic Engineering Optimization?
Link: https://arxiv.org/abs/2109.01445

Authors: Guillermo Bernárdez, José Suárez-Varela, Albert López, Bo Wu, Shihan Xiao, Xiangle Cheng, Pere Barlet-Ros, Albert Cabellos-Aparicio
Affiliations: Barcelona Neural Networking Center, Universitat Politècnica de Catalunya, Barcelona, Spain; Network Technology Lab., Huawei Technologies Co., Ltd., Beijing, China
Note: To appear at IEEE ICNP 2021
Abstract: Traffic Engineering (TE) is a basic building block of the Internet. In this paper, we analyze whether modern Machine Learning (ML) methods are ready to be used for TE optimization. We address this open question through a comparative analysis between the state of the art in ML and the state of the art in TE. To this end, we first present a novel distributed system for TE that leverages the latest advancements in ML. Our system implements a novel architecture that combines Multi-Agent Reinforcement Learning (MARL) and Graph Neural Networks (GNN) to minimize network congestion. In our evaluation, we compare our MARL+GNN system with DEFO, a network optimizer based on Constraint Programming that represents the state of the art in TE. Our experimental results show that the proposed MARL+GNN solution achieves equivalent performance to DEFO in a wide variety of network scenarios including three real-world network topologies. At the same time, we show that MARL+GNN can achieve significant reductions in execution time (from the scale of minutes with DEFO to a few seconds with our solution).

【15】 The Impact of Algorithmic Risk Assessments on Human Predictions and its Analysis via Crowdsourcing Studies
Link: https://arxiv.org/abs/2109.01443

Authors: Riccardo Fogliato, Alexandra Chouldechova, Zachary Lipton
Affiliations: Carnegie Mellon University
Note: Proceedings of the ACM on Human-Computer Interaction 5, CSCW2, Article 428 (October 2021)
Abstract: As algorithmic risk assessment instruments (RAIs) are increasingly adopted to assist decision makers, their predictive performance and potential to promote inequity have come under scrutiny. However, while most studies examine these tools in isolation, researchers have come to recognize that assessing their impact requires understanding the behavior of their human interactants. In this paper, building off of several recent crowdsourcing works focused on criminal justice, we conduct a vignette study in which laypersons are tasked with predicting future re-arrests. Our key findings are as follows: (1) Participants often predict that an offender will be re-arrested even when they deem the likelihood of re-arrest to be well below 50%; (2) Participants do not anchor on the RAI's predictions; (3) The time spent on the survey varies widely across participants and most cases are assessed in less than 10 seconds; (4) Judicial decisions, unlike participants' predictions, depend in part on factors that are orthogonal to the likelihood of re-arrest. These results highlight the influence of several crucial but often overlooked design decisions and concerns around generalizability when constructing crowdsourcing studies to analyze the impacts of RAIs.

【16】 Building Interpretable Models for Business Process Prediction using Shared and Specialised Attention Mechanisms
Link: https://arxiv.org/abs/2109.01419

Authors: Bemali Wickramanayake, Zhipeng He, Chun Ouyang, Catarina Moreira, Yue Xu, Renuka Sindhgatta
Affiliations: Queensland University of Technology, Brisbane, Australia; IBM Research, Bangalore, India
Note: 25 pages, 11 figures, 5 tables
Abstract: In this paper, we address the "black-box" problem in predictive process analytics by building interpretable models that are capable of informing both what a prediction is and why it was made. Predictive process analytics is a newly emerged discipline dedicated to providing business process intelligence in modern organisations. It uses event logs, which capture process execution traces in the form of multi-dimensional sequence data, as the key input to train predictive models. These predictive models, often built upon deep learning techniques, can be used to make predictions about the future states of business process execution. We apply attention mechanisms to achieve model interpretability. We propose i) two types of attention: event attention, which captures the impact of specific process events on a prediction, and attribute attention, which reveals which attribute(s) of an event influenced the prediction; and ii) two attention mechanisms: a shared attention mechanism and a specialised attention mechanism, reflecting different design decisions on whether to construct attribute attention on individual input features (specialised) or on the concatenated feature tensor of all input feature vectors (shared). These lead to two distinct attention-based models, both of which are interpretable models that incorporate interpretability directly into the structure of a process predictive model. We conduct an experimental evaluation of the proposed models using a real-life dataset, perform a comparative analysis between the models for accuracy and interpretability, and draw insights from the evaluation and analysis results.
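
As a sketch of the event-attention ingredient (attribute attention is analogous, scoring an event's attributes instead of a trace's events), consider the following; dimensions and layer choices are illustrative, not the paper's configuration.

```python
import torch
import torch.nn as nn

class EventAttention(nn.Module):
    """Scores each event in an encoded trace; high weights mark influential events.
    A 'specialised' variant would keep one such scorer per event attribute,
    while the 'shared' variant scores the concatenated attribute tensor."""
    def __init__(self, d):
        super().__init__()
        self.score = nn.Linear(d, 1)

    def forward(self, h):                       # h: (batch, n_events, d)
        a = torch.softmax(self.score(h).squeeze(-1), dim=-1)
        context = (a.unsqueeze(-1) * h).sum(dim=1)
        return context, a                       # prediction input + explanation weights
```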

【17】 Efficient Communication in Multi-Agent Distributed Reinforcement Learning
Link: https://arxiv.org/abs/2109.01417

Authors: Daniel Jarne Ornia, Manuel Mazo Jr
Affiliations: Delft University of Technology
Abstract: We present in this work an approach to reduce the communication of information needed on a multi-agent learning system inspired by Event Triggered Control (ETC) techniques. We consider a baseline scenario of a distributed Q-learning problem on a Markov Decision Process (MDP). Following an event-based approach, N agents explore the MDP and communicate experiences to a central learner only when necessary, which performs updates of the actor Q functions. We analyse the convergence guarantees retained with respect to a regular Q-learning algorithm, and present experimental results showing that event-based communication results in a substantial reduction of data transmission rates in such distributed systems. Additionally, we discuss what effects (desired and undesired) these event-based approaches have on the learning processes studied, and how they can be applied to more complex multi-agent learning systems.
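
One way to realize the event-based trigger is to forward an experience only when its TD error exceeds a threshold, as in the sketch below; this trigger condition is our reading of the ETC analogy, not necessarily the paper's exact rule.

```python
import numpy as np

class EventTriggeredAgent:
    """Keeps a local copy of Q and communicates only 'surprising' experiences."""
    def __init__(self, Q, gamma=0.99, threshold=0.1):
        self.Q, self.gamma, self.threshold = Q, gamma, threshold
        self.sent = self.total = 0

    def maybe_send(self, s, a, r, s_next, send_fn):
        self.total += 1
        td_error = r + self.gamma * np.max(self.Q[s_next]) - self.Q[s, a]
        if abs(td_error) > self.threshold:      # event: local copy is stale enough
            self.sent += 1
            send_fn((s, a, r, s_next))          # central learner updates Q, broadcasts back

    def comm_rate(self):
        return self.sent / max(self.total, 1)   # fraction of experiences transmitted
```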

【18】 An Exploratory Study on Utilising the Web of Linked Data for Product Data Mining
Link: https://arxiv.org/abs/2109.01411

Authors: Ziqi Zhang, Xingyi Song
Note: Currently under review at the LRE journal
Abstract: The Linked Open Data practice has led to a significant growth of structured data on the Web in the last decade. Such structured data describe real-world entities in a machine-readable way, and have created an unprecedented opportunity for research in the field of Natural Language Processing. However, there is a lack of studies on how such data can be used, for what kind of tasks, and to what extent they can be useful for these tasks. This work focuses on the e-commerce domain to explore methods of utilising such structured data to create language resources that may be used for product classification and linking. We process billions of structured data points in the form of RDF n-quads to create multi-million-word product-related corpora that are later used in three different ways to create language resources: training word embedding models, continued pre-training of BERT-like language models, and training Machine Translation models that are used as a proxy to generate product-related keywords. Our evaluation on an extensive set of benchmarks shows word embeddings to be the most reliable and consistent method to improve the accuracy on both tasks (with up to 6.9 percentage points in macro-average F1 on some datasets). The other two methods, however, are not as useful. Our analysis shows that this could be due to a number of reasons, including the biased domain representation in the structured data and lack of vocabulary coverage. We share our datasets and discuss how our lessons learned could be taken forward to inform future research in this direction.

【19】 CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models
Link: https://arxiv.org/abs/2109.01401

Authors: Arjun R. Akula, Keze Wang, Changsong Liu, Sari Saba-Sadiya, Hongjing Lu, Sinisa Todorovic, Joyce Chai, Song-Chun Zhu
Affiliations: Oregon State University; University of Michigan
Note: Accepted by iScience (Cell Press), 2021. arXiv admin note: text overlap with arXiv:1909.06907
Abstract: We propose CX-ToM, short for counterfactual explanations with theory-of-mind, a new explainable AI (XAI) framework for explaining decisions made by a deep convolutional neural network (CNN). In contrast to the current methods in XAI that generate explanations as a single-shot response, we pose explanation as an iterative communication process, i.e. dialog, between the machine and human user. More concretely, our CX-ToM framework generates a sequence of explanations in a dialog by mediating the differences between the minds of machine and human user. To do this, we use Theory of Mind (ToM), which helps us in explicitly modeling the human's intention, the machine's mind as inferred by the human, as well as the human's mind as inferred by the machine. Moreover, most state-of-the-art XAI frameworks provide attention (or heat map) based explanations. In our work, we show that these attention-based explanations are not sufficient for increasing human trust in the underlying CNN model. In CX-ToM, we instead use counterfactual explanations called fault-lines, which we define as follows: given an input image I for which a CNN classification model M predicts class c_pred, a fault-line identifies the minimal semantic-level features (e.g., stripes on zebra, pointed ears of dog), referred to as explainable concepts, that need to be added to or deleted from I in order to alter the classification category of I by M to another specified class c_alt. We argue that, due to the iterative, conceptual and counterfactual nature of CX-ToM explanations, our framework is practical and more natural for both expert and non-expert users to understand the internal workings of complex deep learning models. Extensive quantitative and qualitative experiments verify our hypotheses, demonstrating that our CX-ToM significantly outperforms the state-of-the-art explainable AI models.
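
A toy greedy search for a fault-line might look like the following; `concept_clf` (a classifier over binary concept vectors, with hypothetical `predict` and `prob` methods) is a stand-in for the paper's pipeline, not its actual interface.

```python
def fault_line(concepts, concept_clf, c_alt, max_edits=3):
    """Greedily flip concepts (add or delete) until the model predicts c_alt."""
    current = dict(concepts)                    # e.g. {"stripes": 1, "pointed_ears": 0}
    edits = []
    for _ in range(max_edits):
        if concept_clf.predict(current) == c_alt:
            return edits                        # a small edit set (fault-line) was found
        # flip the single concept that most raises the probability of c_alt
        best = max(current, key=lambda n: concept_clf.prob(
            {**current, n: 1 - current[n]}, c_alt))
        current[best] = 1 - current[best]
        edits.append(("add" if current[best] else "delete", best))
    return None                                 # no fault-line within the edit budget
```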

【20】 Topographic VAEs learn Equivariant Capsules
Link: https://arxiv.org/abs/2109.01394

Authors: T. Anderson Keller, Max Welling
Affiliations: UvA-Bosch Delta Lab, University of Amsterdam
Abstract: In this work we seek to bridge the concepts of topographic organization and equivariance in neural networks. To accomplish this, we introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables. We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST. Furthermore, through topographic organization over time (i.e. temporal coherence), we demonstrate how predefined latent space transformation operators can be encouraged for observed transformed input sequences -- a primitive form of unsupervised learned equivariance. We demonstrate that this model successfully learns sets of approximately equivariant features (i.e. "capsules") directly from sequences and achieves higher likelihood on correspondingly transforming test sequences. Equivariance is verified quantitatively by measuring the approximate commutativity of the inference network and the sequence transformations. Finally, we demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks.
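
The commutativity measurement could be sketched as below, assuming the latent transformation is a cyclic roll within each capsule (one plausible choice, not necessarily the paper's).

```python
import torch

def commutativity_error(encoder, x, transform, capsule_dim, shift=1):
    """Relative gap between 'encode then roll capsules' and 'transform then encode';
    near zero indicates approximate equivariance. Assumes the latent dimension
    is a multiple of capsule_dim."""
    z_roll = torch.roll(encoder(x).view(-1, capsule_dim), shifts=shift, dims=-1)
    t_z = encoder(transform(x)).view(-1, capsule_dim)
    return (z_roll - t_z).norm() / t_z.norm()
```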

【21】 Edge-featured Graph Neural Architecture Search
Link: https://arxiv.org/abs/2109.01356

Authors: Shaofei Cai, Liang Li, Xinzhe Han, Zheng-jun Zha, Qingming Huang
Affiliations: Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; University of Science and Technology of China; Peng Cheng Laboratory, Shenzhen, China
Abstract: Graph neural networks (GNNs) have been successfully applied to learning representations on graphs in many relational tasks. Recently, researchers have studied neural architecture search (NAS) to reduce the dependence on human expertise and explore better GNN architectures, but they over-emphasize entity features and ignore the latent relation information concealed in the edges. To solve this problem, we incorporate edge features into the graph search space and propose Edge-featured Graph Neural Architecture Search (EGNAS) to find the optimal GNN architecture. Specifically, we design rich entity and edge updating operations to learn high-order representations, which convey more generic message passing mechanisms. Moreover, the architecture topology in our search space allows exploring complex feature dependencies of both entities and edges, which can be efficiently optimized by a differentiable search strategy. Experiments on three graph tasks across six datasets show that EGNAS can search for better GNNs with higher performance than current state-of-the-art human-designed and search-based GNNs.
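
A minimal example of a joint entity/edge updating operation of the kind such a search space contains might look like this; the shapes and update choices (a GRU cell for nodes, a linear map for edges) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class EdgeFeaturedLayer(nn.Module):
    """Nodes aggregate messages built from neighbor and edge features;
    edge states are then refreshed from their (updated) endpoints."""
    def __init__(self, d):
        super().__init__()
        self.msg = nn.Linear(3 * d, d)     # [h_u, h_v, e_uv] -> message
        self.node = nn.GRUCell(d, d)       # node update from aggregated messages
        self.edge = nn.Linear(3 * d, d)    # edge update from endpoints + old edge

    def forward(self, h, e, edges):        # h: (N, d), e: (E, d), edges: [(u, v), ...]
        agg = torch.zeros_like(h)
        for k, (u, v) in enumerate(edges):
            m = self.msg(torch.cat([h[u], h[v], e[k]]))
            agg[u] = agg[u] + m
            agg[v] = agg[v] + m
        h = self.node(agg, h)
        e = torch.stack([self.edge(torch.cat([h[u], h[v], e[k]]))
                         for k, (u, v) in enumerate(edges)])
        return h, e
```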

【22】 Self-Taught Cross-Domain Few-Shot Learning with Weakly Supervised Object Localization and Task-Decomposition
Link: https://arxiv.org/abs/2109.01302

Authors: Xiyao Liu, Zhong Ji, Yanwei Pang, Zhongfei Zhang
Affiliations: School of Electrical and Information Engineering, Tianjin University
Abstract: The domain shift between the source and target domain is the main challenge in Cross-Domain Few-Shot Learning (CD-FSL). However, the target domain is absolutely unknown during training on the source domain, which results in a lack of directed guidance for target tasks. We observe that since there are similar backgrounds in target domains, self-labeled samples can be used as prior tasks to transfer knowledge onto target tasks. To this end, we propose a task-expansion-decomposition framework for CD-FSL, called the Self-Taught (ST) approach, which alleviates the problem of non-target guidance by constructing task-oriented metric spaces. Specifically, Weakly Supervised Object Localization (WSOL) and self-supervised techniques are employed to enrich task-oriented samples by exchanging and rotating the discriminative regions, which generates a more abundant task set. Then these tasks are decomposed into several tasks to perform few-shot recognition and rotation classification. This helps to transfer the source knowledge onto the target tasks and focus on discriminative regions. We conduct extensive experiments under the cross-domain setting including 8 target domains: CUB, Cars, Places, Plantae, CropDiseases, EuroSAT, ISIC, and ChestX. Experimental results demonstrate that the proposed ST approach is applicable to various metric-based models, and provides promising improvements in CD-FSL.

【23】 Information Symmetry Matters: A Modal-Alternating Propagation Network for Few-Shot Learning
Link: https://arxiv.org/abs/2109.01295

Authors: Zhong Ji, Zhishen Hou, Xiyao Liu, Yanwei Pang, Jungong Han
Abstract: Semantic information provides intra-class consistency and inter-class discriminability beyond visual concepts, which has been employed in Few-Shot Learning (FSL) to achieve further gains. However, semantic information is only available for labeled samples but absent for unlabeled samples, in which the embeddings are rectified unilaterally by guiding the few labeled samples with semantics. Therefore, it is inevitable to bring a cross-modal bias between semantic-guided samples and nonsemantic-guided samples, which results in an information asymmetry problem. To address this problem, we propose a Modal-Alternating Propagation Network (MAP-Net) to supplement the absent semantic information of unlabeled samples, which builds information symmetry among all samples in both visual and semantic modalities. Specifically, the MAP-Net transfers the neighbor information by graph propagation to generate the pseudo-semantics for unlabeled samples guided by the completed visual relationships and rectify the feature embeddings. In addition, due to the large discrepancy between visual and semantic modalities, we design a Relation Guidance (RG) strategy to guide the visual relation vectors via semantics so that the propagated information is more beneficial. Extensive experimental results on three semantic-labeled datasets, i.e., Caltech-UCSD-Birds 200-2011, SUN Attribute Database, and Oxford 102 Flower, have demonstrated that our proposed method achieves promising performance and outperforms the state-of-the-art approaches, which indicates the necessity of information symmetry.
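
One propagation step for generating pseudo-semantics might be sketched as follows, using a simple Gaussian affinity over visual features as the graph (an assumption; the paper's propagation may differ).

```python
import torch

def propagate_semantics(vis, sem, labeled_mask, sigma=1.0):
    """vis: (N, dv) visual features; sem: (N, ds) semantics (zeros if unlabeled);
    labeled_mask: (N,) float 1/0. Returns semantics completed for unlabeled samples."""
    d2 = torch.cdist(vis, vis) ** 2
    A = torch.exp(-d2 / (2 * sigma**2))                   # visual affinity graph
    A = A * labeled_mask.unsqueeze(0)                     # only pull from labeled nodes
    A = A / A.sum(-1, keepdim=True).clamp_min(1e-8)       # row-normalize
    pseudo = A @ sem                                      # weighted average of semantics
    return torch.where(labeled_mask.unsqueeze(-1).bool(), sem, pseudo)
```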

【24】 Symbol Emergence and The Solutions to Any Task
Link: https://arxiv.org/abs/2109.01281

Authors: Michael Timothy Bennett
Affiliations: School of Computing, Australian National University, Canberra, Australia
Abstract: The following defines intent, an arbitrary task and its solutions, and then argues that an agent which always constructs what is called an Intensional Solution would qualify as artificial general intelligence. We then explain how natural language may emerge and be acquired by such an agent, conferring the ability to model the intent of other individuals labouring under similar compulsions, because an abstract symbol system and the solution to a task are one and the same.

【25】 An Empirical Study on Leveraging Position Embeddings for Target-oriented Opinion Words Extraction
Link: https://arxiv.org/abs/2109.01238

Authors: Samuel Mensah, Kai Sun, Nikolaos Aletras
Affiliations: Computer Science Department, University of Sheffield, UK; BDBC and SKLSDE, Beihang University, China
Note: Accepted at EMNLP 2021
Abstract: Target-oriented opinion words extraction (TOWE) (Fan et al., 2019b) is a new subtask of target-oriented sentiment analysis that aims to extract opinion words for a given aspect in text. Current state-of-the-art methods leverage position embeddings to capture the relative position of a word to the target. However, the performance of these methods depends on the ability to incorporate this information into word representations. In this paper, we explore a variety of text encoders based on pretrained word embeddings or language models that leverage part-of-speech and position embeddings, aiming to examine the actual contribution of each component in TOWE. We also adapt a graph convolutional network (GCN) to enhance word representations by incorporating syntactic information. Our experimental results demonstrate that BiLSTM-based models can effectively encode position information into word representations, while using a GCN only achieves marginal gains. Interestingly, our simple methods outperform several state-of-the-art complex neural structures.
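
The position-embedding ingredient can be sketched as below: each token receives its clipped signed distance to the target span, which indexes a learned embedding later concatenated to the word representation before the encoder; sizes are illustrative.

```python
import torch
import torch.nn as nn

MAX_DIST = 50
pos_emb = nn.Embedding(2 * MAX_DIST + 1, 25)   # embedding dims are illustrative

def relative_positions(seq_len, target_start, target_end):
    """0 inside the target span, signed clipped distance elsewhere."""
    pos = []
    for i in range(seq_len):
        if target_start <= i <= target_end:
            d = 0
        elif i < target_start:
            d = max(i - target_start, -MAX_DIST)
        else:
            d = min(i - target_end, MAX_DIST)
        pos.append(d + MAX_DIST)               # shift into embedding index range
    return torch.tensor(pos)

# e.g. "the battery life is great" with target "battery life" (tokens 1..2)
idx = relative_positions(5, 1, 2)              # tensor([49, 50, 50, 51, 52])
pos_vectors = pos_emb(idx)                     # concatenated to word vectors downstream
```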

【26】 So Cloze yet so Far: N400 Amplitude is Better Predicted by Distributional Information than Human Predictability Judgements
Link: https://arxiv.org/abs/2109.01226

Authors: James A. Michaelov, Seana Coulson, Benjamin K. Bergen
Affiliations: Department of Cognitive Science, University of California San Diego
Note: Submitted
Abstract: More predictable words are easier to process - they are read faster and elicit smaller neural signals associated with processing difficulty, most notably, the N400 component of the event-related brain potential. Thus, it has been argued that prediction of upcoming words is a key component of language comprehension, and that studying the amplitude of the N400 is a valuable way to investigate the predictions that we make. In this study, we investigate whether the linguistic predictions of computational language models or humans better reflect the way in which natural language stimuli modulate the amplitude of the N400. One important difference in the linguistic predictions of humans versus computational language models is that while language models base their predictions exclusively on the preceding linguistic context, humans may rely on other factors. We find that the predictions of three top-of-the-line contemporary language models - GPT-3, RoBERTa, and ALBERT - match the N400 more closely than human predictions. This suggests that the predictive processes underlying the N400 may be more sensitive to the surface-level statistics of language than previously thought.

【27】 An Oracle and Observations for the OpenAI Gym / ALE Freeway Environment
Link: https://arxiv.org/abs/2109.01220

Authors: James S. Plank, Catherine D. Schuman, Robert M. Patton
Abstract: The OpenAI Gym project contains hundreds of control problems whose goal is to provide a testbed for reinforcement learning algorithms. One such problem is Freeway-ram-v0, where the observations presented to the agent are 128 bytes of RAM. While the goals of the project are for non-expert AI agents to solve the control problems with general training, in this work, we seek to learn more about the problem, so that we can better evaluate solutions. In particular, we develop an oracle to play the game, so that we may have baselines for success. We present details of the oracle, plus optimal game-playing situations that can be used for training and testing AI agents.

【28】 Multi-Agent Inverse Reinforcement Learning: Suboptimal Demonstrations and Alternative Solution Concepts
Link: https://arxiv.org/abs/2109.01178

Authors: Sage Bergerson
Affiliations: Stanford Existential Risk Initiative, Stanford University, Stanford, CA
Abstract: Multi-agent inverse reinforcement learning (MIRL) can be used to learn reward functions from agents in social environments. To model realistic social dynamics, MIRL methods must account for suboptimal human reasoning and behavior. Traditional formalisms of game theory provide computationally tractable behavioral models, but assume agents have unrealistic cognitive capabilities. This research identifies and compares mechanisms in MIRL methods which a) handle noise, biases and heuristics in agent decision making and b) model realistic equilibrium solution concepts. MIRL research is systematically reviewed to identify solutions for these challenges. The methods and results of these studies are analyzed and compared based on factors including performance accuracy, efficiency, and descriptive quality. We found that the primary methods for handling noise, biases and heuristics in MIRL were extensions of Maximum Entropy (MaxEnt) IRL to multi-agent settings. We also found that many successful solution concepts are generalizations of the traditional Nash Equilibrium (NE). These solutions include the correlated equilibrium, logistic stochastic best response equilibrium and entropy regularized mean field NE. Methods which use recursive reasoning or updating also perform well, including the feedback NE and archive multi-agent adversarial IRL. Success in modeling specific biases and heuristics in single-agent IRL and promising results using a Theory of Mind approach in MIRL imply that modeling specific biases and heuristics may be useful. Flexibility and unbiased inference in the identified alternative solution concepts suggest that a solution concept which has both recursive and generalized characteristics may perform well at modeling realistic social interactions.

【29】 Black-Box Attacks on Sequential Recommenders via Data-Free Model Extraction
Link: https://arxiv.org/abs/2109.01165

Authors: Zhenrui Yue, Zhankui He, Huimin Zeng, Julian McAuley
Affiliations: Technical University of Munich; University of California
Note: Accepted to RecSys 2021
Abstract: We investigate whether model extraction can be used to "steal" the weights of sequential recommender systems, and the potential threats posed to victims of such attacks. This type of risk has attracted attention in image and text classification, but to our knowledge not in recommender systems. We argue that sequential recommender systems are subject to unique vulnerabilities due to the specific autoregressive regimes used to train them. Unlike many existing recommender attackers, which assume the dataset used to train the victim model is exposed to attackers, we consider a data-free setting, where training data are not accessible. Under this setting, we propose an API-based model extraction method via limited-budget synthetic data generation and knowledge distillation. We investigate state-of-the-art models for sequential recommendation and show their vulnerability under model extraction and downstream attacks. We perform attacks in two stages. (1) Model extraction: given different types of synthetic data and their labels retrieved from a black-box recommender, we extract the black-box model to a white-box model via distillation. (2) Downstream attacks: we attack the black-box model with adversarial samples generated by the white-box recommender. Experiments show the effectiveness of our data-free model extraction and downstream attacks on sequential recommenders in both profile pollution and data poisoning settings.
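
The first stage of the pipeline might be sketched as follows; `blackbox_topk` stands in for the victim's query API, the white-box model is assumed to return item logits, and the rank-decayed distillation loss is our simplification of ranking distillation.

```python
import torch
import torch.nn.functional as F

def extract(whitebox, blackbox_topk, n_items, n_queries=5000, seq_len=20):
    """Stage 1: distill a black-box sequential recommender into a white-box one."""
    opt = torch.optim.Adam(whitebox.parameters(), lr=1e-3)
    for _ in range(n_queries):
        seq = torch.randint(1, n_items, (1, seq_len))   # synthetic interaction history
        teacher_items = blackbox_topk(seq)              # ranked item ids from the API
        logits = whitebox(seq)                          # assumed shape (1, n_items)
        # push the student toward the teacher's ranking, weighting top ranks more
        loss = sum(F.cross_entropy(logits, torch.tensor([item])) * 0.9**rank
                   for rank, item in enumerate(teacher_items))
        opt.zero_grad(); loss.backward(); opt.step()
    return whitebox   # stage 2 crafts adversarial inputs against this surrogate
```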

【30】 Challenges in Generalization in Open Domain Question Answering
Link: https://arxiv.org/abs/2109.01156

Authors: Linqing Liu, Patrick Lewis, Sebastian Riedel, Pontus Stenetorp
Affiliations: University College London; Facebook AI Research
Abstract: Recent work on Open Domain Question Answering has shown that there is a large discrepancy in model performance between novel test questions and those that largely overlap with training questions. However, it is as yet unclear which aspects of novel questions make them challenging. Drawing upon studies on systematic generalization, we introduce and annotate questions according to three categories that measure different levels and kinds of generalization: training set overlap, compositional generalization (comp-gen), and novel entity generalization (novel-entity). When evaluating six popular parametric and non-parametric models, we find that for the established Natural Questions and TriviaQA datasets, even the strongest model performance for comp-gen/novel-entity is 13.1/5.4% and 9.6/1.5% lower compared to that for the full test set -- indicating the challenge posed by these types of questions. Furthermore, we show that whilst non-parametric models can handle questions containing novel entities, they struggle with those requiring compositional generalization. Through thorough analysis we find that the key question difficulty factors are: cascading errors from the retrieval component, frequency of the question pattern, and frequency of the entity.

