AI Academic Digest [7.16]

2021-07-27 11:00:22

Visit www.arxivdaily.com for digests with abstracts, covering CS, physics, mathematics, economics, statistics, finance, biology, and electrical engineering, plus search, bookmarking, and posting features!

cs.AI Artificial Intelligence: 55 papers in total

【1】 A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification

Authors: Anastasios N. Angelopoulos, Stephen Bates Note: Blog and tutorial video this http URL Link: https://arxiv.org/abs/2107.07511
Abstract: Black-box machine learning methods are now routinely used in high-risk settings, like medical diagnostics, which demand uncertainty quantification to avoid consequential model failures. Distribution-free uncertainty quantification (distribution-free UQ) is a user-friendly paradigm for creating statistically rigorous confidence intervals/sets for such predictions. Critically, the intervals/sets are valid without distributional assumptions or model assumptions, with explicit guarantees for finitely many datapoints. Moreover, they adapt to the difficulty of the input; when the input example is difficult, the uncertainty intervals/sets are large, signaling that the model might be wrong. Without much work, one can use distribution-free methods on any underlying algorithm, such as a neural network, to produce confidence sets guaranteed to contain the ground truth with a user-specified probability, such as 90%. Indeed, the methods are easy to understand and general, applying to many modern prediction problems arising in computer vision, natural language processing, deep reinforcement learning, and so on. This hands-on introduction is aimed at readers interested in the practical implementation of distribution-free UQ, including conformal prediction and related methods, who are not necessarily statisticians. We include many explanatory illustrations, examples, and code samples in Python, with PyTorch syntax. The goal is to provide the reader a working understanding of distribution-free UQ, allowing them to put confidence intervals on their algorithms, in one self-contained document.

【2】 MultiBench: Multiscale Benchmarks for Multimodal Representation Learning

Authors: Paul Pu Liang, Yiwei Lyu, Xiang Fan, Zetian Wu, Yun Cheng, Jason Wu, Leslie Chen, Peter Wu, Michelle A. Lee, Yuke Zhu, Ruslan Salakhutdinov, Louis-Philippe Morency Affiliations: CMU, Johns Hopkins, Northeastern, Stanford, UT Austin Note: Code: this https URL and Website: this https URL Link: https://arxiv.org/abs/2107.07502
Abstract: Learning multimodal representations involves integrating information from multiple heterogeneous sources of data. It is a challenging yet crucial area with numerous real-world applications in multimedia, affective computing, robotics, finance, human-computer interaction, and healthcare. Unfortunately, multimodal research has seen limited resources to study (1) generalization across domains and modalities, (2) complexity during training and inference, and (3) robustness to noisy and missing modalities. In order to accelerate progress towards understudied modalities and tasks while ensuring real-world robustness, we release MultiBench, a systematic and unified large-scale benchmark spanning 15 datasets, 10 modalities, 20 prediction tasks, and 6 research areas. MultiBench provides an automated end-to-end machine learning pipeline that simplifies and standardizes data loading, experimental setup, and model evaluation. To enable holistic evaluation, MultiBench offers a comprehensive methodology to assess (1) generalization, (2) time and space complexity, and (3) modality robustness. MultiBench introduces impactful challenges for future research, including scalability to large-scale multimodal datasets and robustness to realistic imperfections. To accompany this benchmark, we also provide a standardized implementation of 20 core approaches in multimodal learning. Simply applying methods proposed in different research areas can improve state-of-the-art performance on 9 of the 15 datasets. Therefore, MultiBench presents a milestone in unifying disjoint efforts in multimodal research and paves the way towards a better understanding of the capabilities and limitations of multimodal models, all the while ensuring ease of use, accessibility, and reproducibility. MultiBench, our standardized code, and leaderboards are publicly available and will be regularly updated; we welcome input from the community.

【3】 An End-to-End Differentiable Framework for Contact-Aware Robot Design

Authors: Jie Xu, Tao Chen, Lara Zlokapa, Michael Foshey, Wojciech Matusik, Shinjiro Sueda, Pulkit Agrawal Affiliations: Massachusetts Institute of Technology, Texas A&M University Note: Robotics: Science and Systems Link: https://arxiv.org/abs/2107.07501
Abstract: The current dominant paradigm for robotic manipulation involves two separate stages: manipulator design and control. Because the robot's morphology and how it can be controlled are intimately linked, joint optimization of design and control can significantly improve performance. Existing methods for co-optimization are limited and fail to explore a rich space of designs. The primary reason is the trade-off between the design complexity necessary for contact-rich tasks and the practical constraints of manufacturing, optimization, contact handling, etc. We overcome several of these challenges by building an end-to-end differentiable framework for contact-aware robot design. The two key components of this framework are: a novel deformation-based parameterization that allows for the design of articulated rigid robots with arbitrary, complex geometry, and a differentiable rigid body simulator that can handle contact-rich scenarios and computes analytical gradients for a full spectrum of kinematic and dynamic parameters. On multiple manipulation tasks, our framework outperforms existing methods that either optimize only for control, optimize for design using alternate representations, or co-optimize using gradient-free methods.
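The core idea, taking analytic gradients jointly through control and design parameters, can be illustrated with a toy "simulator" (this is not the paper's framework; the one-link arm, target, and learning rate are invented for the sketch).

```python
import math

def fk(theta, length):
    """Differentiable 'simulator': forward kinematics of a one-link arm."""
    return length * math.cos(theta), length * math.sin(theta)

def codesign_step(theta, length, target, lr=0.1):
    """One gradient step jointly over a control parameter (theta) and a
    design parameter (length), via analytic gradients of the reaching loss
    L = 0.5 * ((x - tx)^2 + (y - ty)^2)."""
    x, y = fk(theta, length)
    dx, dy = x - target[0], y - target[1]
    g_theta = dx * (-length * math.sin(theta)) + dy * (length * math.cos(theta))
    g_length = dx * math.cos(theta) + dy * math.sin(theta)
    return theta - lr * g_theta, length - lr * g_length

theta, length = 0.3, 1.0
for _ in range(500):
    theta, length = codesign_step(theta, length, target=(0.0, 2.0))
x, y = fk(theta, length)  # converges near the target (0, 2)
```

Neither parameter alone can reach the target here (the initial arm is too short), which is the point of co-optimizing morphology and control.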

【4】 FewCLUE: A Chinese Few-shot Learning Evaluation Benchmark

Authors: Liang Xu, Xiaojing Lu, Chenyang Yuan, Xuanwei Zhang, Hu Yuan, Huilin Xu, Guoao Wei, Xiang Pan, Hai Hu Affiliations: CLUE team Note: Work in Progress; 8 pages, 3 tables Link: https://arxiv.org/abs/2107.07498
Abstract: Pretrained Language Models (PLMs) have achieved tremendous success in natural language understanding tasks. While different learning schemes -- fine-tuning, zero-shot and few-shot learning -- have been widely explored and compared for languages such as English, there is comparatively little work in Chinese to fairly and comprehensively evaluate and compare these methods. This work first introduces the Chinese Few-shot Learning Evaluation Benchmark (FewCLUE), the first comprehensive small-sample evaluation benchmark in Chinese. It includes nine tasks, ranging from single-sentence and sentence-pair classification tasks to machine reading comprehension tasks. Given the high variance of few-shot learning performance, we provide multiple training/validation sets to facilitate a more accurate and stable evaluation of few-shot modeling. An unlabeled training set with up to 20,000 additional samples per task is provided, allowing researchers to explore better ways of using unlabeled samples. Next, we implement a set of state-of-the-art (SOTA) few-shot learning methods (including PET, ADAPET, LM-BFF, P-tuning and EFL), and compare their performance with fine-tuning and zero-shot learning schemes on the newly constructed FewCLUE benchmark. Our results show that: 1) all five few-shot learning methods exhibit better performance than fine-tuning or zero-shot learning; 2) among the five methods, PET is the best performing few-shot method; 3) few-shot learning performance is highly dependent on the specific task. Our benchmark and code are available at https://github.com/CLUEbenchmark/FewCLUE
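PET, the strongest method in this comparison, reformulates classification as a cloze task via a pattern and a verbalizer. A schematic sketch of that reformulation (the pattern, label words, and scores below are invented; a real implementation queries a pretrained masked LM for the [MASK] position):

```python
# Pattern: wrap the input so the label becomes a masked word.
def pattern(text):
    return f"{text} 总之，很[MASK]。"  # "... In short, it was [MASK]."

# Verbalizer: map each class to a label word the LM can predict.
VERBALIZER = {"positive": "好", "negative": "差"}

def pet_classify(text, mask_scores):
    """mask_scores: hypothetical LM probabilities for candidate words at
    the [MASK] position in pattern(text). Pick the class whose label word
    scores highest."""
    return max(VERBALIZER, key=lambda label: mask_scores[VERBALIZER[label]])

print(pattern("服务很棒"))
print(pet_classify("服务很棒", {"好": 0.9, "差": 0.1}))  # positive
```

Because the task is cast as language modeling, a handful of labeled examples suffices to tune the pattern, which is why such methods shine in the few-shot regime FewCLUE measures.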

【5】 Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks

Authors: Andrey Malinin, Neil Band, German Chesnokov, Yarin Gal, Mark J. F. Gales, Alexey Noskov, Andrey Ploskonosov, Liudmila Prokhorenkova, Ivan Provilkov, Vatsal Raina, Vyas Raina, Mariya Shmatova, Panos Tigas, Boris Yangel Affiliations: HSE University, Moscow Institute of Physics and Technology, University of Cambridge, University of Oxford, Alan Turing Institute Note: Preprint Link: https://arxiv.org/abs/2107.07455
Abstract: There has been significant research done on developing methods for improving robustness to distributional shift and uncertainty estimation. In contrast, only limited work has examined developing standard datasets and benchmarks for assessing these approaches. Additionally, most work on uncertainty estimation and robustness has developed new techniques based on small-scale regression or image classification tasks. However, many tasks of practical interest have different modalities, such as tabular data, audio, text, or sensor data, which offer significant challenges involving regression and discrete or continuous structured prediction. Thus, given the current state of the field, a standardized large-scale dataset of tasks across a range of modalities affected by distributional shifts is necessary. This will enable researchers to meaningfully evaluate the plethora of recently developed uncertainty quantification methods, as well as assessment criteria and state-of-the-art baselines. In this work, we propose the Shifts Dataset for evaluation of uncertainty estimates and robustness to distributional shift. The dataset, which has been collected from industrial sources and services, is composed of three tasks, with each corresponding to a particular data modality: tabular weather prediction, machine translation, and self-driving car (SDC) vehicle motion prediction. All of these data modalities and tasks are affected by real, 'in-the-wild' distributional shifts and pose interesting challenges with respect to uncertainty estimation. In this work we provide a description of the dataset and baseline results for all tasks.

【6】 GI-NNet & RGI-NNet: Development of Robotic Grasp Pose Models, Trainable with Large as well as Limited Labelled Training Datasets, under supervised and semi supervised paradigms

Authors: Priya Shukla, Nilotpal Pramanik, Deepesh Mehta, G. C. Nandi Affiliations: Indian Institute of Information Technology Link: https://arxiv.org/abs/2107.07452
Abstract: Our way of grasping objects is challenging for efficient, intelligent and optimal grasping by cobots. To streamline the process, here we use deep learning techniques to help robots learn to generate and execute appropriate grasps quickly. We developed a Generative Inception Neural Network (GI-NNet) model, capable of generating antipodal robotic grasps on seen as well as unseen objects. It is trained on the Cornell Grasping Dataset (CGD) and attained 98.87% grasp pose accuracy for detecting both regular and irregular shaped objects from RGB-Depth (RGB-D) images, while requiring only one third of the network trainable parameters of existing approaches. However, to attain this level of performance the model requires 90% of the available labelled data of CGD, keeping only 10% of the labelled data for testing, which makes it vulnerable to poor generalization. Furthermore, obtaining a sufficient and high-quality labelled dataset is increasingly difficult, keeping pace with the requirements of gigantic networks. To address these issues, we attach our model as a decoder to a semi-supervised learning architecture known as a Vector Quantized Variational Auto-Encoder (VQVAE), which works efficiently when trained with both the available labelled and unlabelled data. The proposed model, which we name Representation-based GI-NNet (RGI-NNet), has been trained on CGD with various labelled-data splits, from as little as 10% up to 50% labelled data, together with the latent embeddings generated by the VQVAE. The grasp pose accuracy of RGI-NNet varies between 92.13% and 95.6%, which is far better than several existing models trained with only the labelled dataset. For performance verification of both the GI-NNet and RGI-NNet models, we use the Anukul (Baxter) hardware cobot.

【7】 High-level Decisions from a Safe Maneuver Catalog with Reinforcement Learning for Safe and Cooperative Automated Merging

Authors: Danial Kamran, Yu Ren, Martin Lauer Affiliations: Institute of Measurement and Control Systems, Karlsruhe Institute of Technology (KIT) Link: https://arxiv.org/abs/2107.07413
Abstract: Reinforcement learning (RL) has recently been used for solving challenging decision-making problems in the context of automated driving. However, one of the main drawbacks of the presented RL-based policies is the lack of safety guarantees, since they strive to reduce the expected number of collisions but still tolerate them. In this paper, we propose an efficient RL-based decision-making pipeline for safe and cooperative automated driving in merging scenarios. The RL agent is able to predict the current situation and provide high-level decisions, specifying the operation mode of the low-level planner which is responsible for safety. In order to learn a more generic policy, we propose a scalable RL architecture for the merging scenario that is not sensitive to changes in the environment configurations. According to our experiments, the proposed RL agent can efficiently identify cooperative drivers from their vehicle state history and generate interactive maneuvers, resulting in faster and more comfortable automated driving. At the same time, thanks to the safety constraints inside the planner, all of the maneuvers are collision-free and safe.

【8】 Two-Sided Matching Meets Fair Division

Authors: Rupert Freeman, Evi Micha, Nisarg Shah Affiliations: University of Virginia, University of Toronto Link: https://arxiv.org/abs/2107.07404
Abstract: We introduce a new model for two-sided matching which allows us to borrow popular fairness notions from the fair division literature, such as envy-freeness up to one good and the maximin share guarantee. In our model, each agent is matched to multiple agents on the other side over whom she has additive preferences. We demand fairness for each side separately, giving rise to notions such as double envy-freeness up to one match (DEF1) and double maximin share guarantee (DMMS). We show that (a slight strengthening of) DEF1 cannot always be achieved, but in the special case where both sides have identical preferences, the round-robin algorithm with a carefully designed agent ordering achieves it. In contrast, DMMS cannot be achieved even when both sides have identical preferences.
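The round-robin procedure and the EF1 notion it targets are easy to state in code. Below is a one-sided sketch under additive valuations (agent names and values invented; the paper's setting is two-sided, with fairness demanded for each side separately):

```python
def round_robin(agents, goods, value):
    """Agents pick in a fixed order; each takes their most-valued
    remaining good until none are left."""
    bundles = {a: [] for a in agents}
    remaining = list(goods)
    i = 0
    while remaining:
        a = agents[i % len(agents)]
        best = max(remaining, key=lambda g: value[a][g])
        bundles[a].append(best)
        remaining.remove(best)
        i += 1
    return bundles

def is_ef1(bundles, agents, value):
    """Envy-freeness up to one good: any envy towards another bundle
    vanishes after removing some single good from it."""
    for a in agents:
        va = sum(value[a][g] for g in bundles[a])
        for b in agents:
            if a == b:
                continue
            vb = sum(value[a][g] for g in bundles[b])
            if vb > va and all(vb - value[a][g] > va for g in bundles[b]):
                return False
    return True

agents = ["A", "B"]
goods = ["g1", "g2", "g3", "g4"]
value = {a: {"g1": 5, "g2": 3, "g3": 2, "g4": 1} for a in agents}  # identical prefs
bundles = round_robin(agents, goods, value)
print(bundles)            # {'A': ['g1', 'g3'], 'B': ['g2', 'g4']}
print(is_ef1(bundles, agents, value))  # True
```

With identical preferences, each agent weakly prefers her own pick at every turn, which is the intuition behind the paper's positive DEF1 result for a carefully ordered round-robin.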

【9】 Explore and Control with Adversarial Surprise

Authors: Arnaud Fickinger, Natasha Jaques, Samyak Parajuli, Michael Chang, Nicholas Rhinehart, Glen Berseth, Stuart Russell, Sergey Levine Affiliations: UC Berkeley; Google Research, Brain Team Link: https://arxiv.org/abs/2107.07394
Abstract: Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards. However, since designing rewards often requires substantial engineering effort, we are interested in the problem of learning without rewards, where agents must discover useful behaviors in the absence of task-specific incentives. Intrinsic motivation is a family of unsupervised RL techniques which develop general objectives for an RL agent to optimize that lead to better exploration or the discovery of skills. In this paper, we propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences. The policies each take turns controlling the agent. The Explore policy maximizes entropy, putting the agent into surprising or unfamiliar situations. Then, the Control policy takes over and seeks to recover from those situations by minimizing entropy. The game harnesses the power of multi-agent competition to drive the agent to seek out increasingly surprising parts of the environment while learning to gain mastery over them. We show empirically that our method leads to the emergence of complex skills by exhibiting clear phase transitions. Furthermore, we show both theoretically (via a latent state space coverage argument) and empirically that our method has the potential to be applied to the exploration of stochastic, partially-observed environments. We show that Adversarial Surprise learns more complex behaviors and explores more effectively than competitive baselines, outperforming intrinsic motivation methods based on active inference, novelty-seeking (Random Network Distillation (RND)), and multi-agent unsupervised RL (Asymmetric Self-Play (ASP)) in MiniGrid, Atari and VizDoom environments.
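The adversarial objective itself is simple to write down: reward Explore with the entropy of its state visitation and Control with the negative entropy of its own. A sketch of just the reward computation (state labels are invented; the actual method trains two RL policies against these rewards):

```python
import math
from collections import Counter

def visitation_entropy(states):
    """Shannon entropy (nats) of the empirical state-visitation distribution."""
    n = len(states)
    return -sum(c / n * math.log(c / n) for c in Counter(states).values())

def adversarial_surprise_rewards(explore_states, control_states):
    """Zero-sum episode rewards: the Explore policy maximizes surprise
    (visitation entropy), the Control policy minimizes it."""
    return visitation_entropy(explore_states), -visitation_entropy(control_states)

r_explore, r_control = adversarial_surprise_rewards(
    ["room1", "room2", "room3", "room4"],  # Explore spread out: high entropy
    ["home", "home", "home", "home"])      # Control settled down: zero entropy
```

Explore earns log(4) nats for spreading over four states, while Control earns its best possible reward (0) by collapsing onto one state, exactly the opposing pressures the abstract describes.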

【10】 Auditing for Diversity using Representative Examples

Authors: Vijay Keswani, L. Elisa Celis Affiliations: Yale University Link: https://arxiv.org/abs/2107.07393
Abstract: Assessing the diversity of a dataset of information associated with people is crucial before using such data for downstream applications. For a given dataset, this often involves computing the imbalance or disparity in the empirical marginal distribution of a protected attribute (e.g. gender, dialect, etc.). However, real-world datasets, such as images from Google Search or collections of Twitter posts, often do not have protected attributes labeled. Consequently, to derive disparity measures for such datasets, the elements need to be hand-labeled or crowd-annotated, which are expensive processes. We propose a cost-effective approach to approximate the disparity of a given unlabeled dataset, with respect to a protected attribute, using a control set of labeled representative examples. Our proposed algorithm uses the pairwise similarity between elements in the dataset and elements in the control set to effectively bootstrap an approximation to the disparity of the dataset. Importantly, we show that using a control set whose size is much smaller than the size of the dataset is sufficient to achieve a small approximation error. Further, based on our theoretical framework, we also provide an algorithm to construct adaptive control sets that achieve smaller approximation errors than randomly chosen control sets. Simulations on two image datasets and one Twitter dataset demonstrate the efficacy of our approach (using random and adaptive control sets) in auditing the diversity of a wide variety of datasets.
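A minimal sketch of the control-set idea follows. The similarity function, 1-D "embeddings", and binary attribute are stand-ins for illustration; the paper's algorithm and its adaptive control-set construction are more refined.

```python
def estimate_disparity(dataset, control, attr, sim):
    """Approximate the disparity of an unlabeled dataset w.r.t. a binary
    protected attribute: assign each element the attribute of its most
    similar labeled control example, then measure the imbalance."""
    counts = {0: 0, 1: 0}
    for x in dataset:
        nearest = max(control, key=lambda c: sim(x, c))
        counts[attr[nearest]] += 1
    return abs(counts[0] - counts[1]) / len(dataset)

# Toy 1-D "embeddings": similarity is negative distance.
sim = lambda x, c: -abs(x - c)
control = [0.0, 1.0]            # one labeled representative per group
attr = {0.0: 0, 1.0: 1}
disparity = estimate_disparity([0.1, 0.2, 0.3, 0.9], control, attr, sim)
print(disparity)  # 0.5
```

Three of the four elements fall nearest the group-0 representative, so the estimated imbalance is |3 - 1| / 4 = 0.5, obtained from only two labeled examples.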

【11】 Proceedings of the Sixteenth Workshop on Logical Frameworks and Meta-Languages: Theory and Practice

Authors: Elaine Pimentel, Enrico Tassi Link: https://arxiv.org/abs/2107.07376
Abstract: Logical frameworks and meta-languages form a common substrate for representing, implementing and reasoning about a wide variety of deductive systems of interest in logic and computer science. Their design, implementation and their use in reasoning tasks, ranging from the correctness of software to the properties of formal systems, have been the focus of considerable research over the last two decades. This workshop brings together designers, implementors and practitioners to discuss various aspects impinging on the structure and utility of logical frameworks, including the treatment of variable binding, inductive and co-inductive reasoning techniques and the expressiveness and lucidity of the reasoning process.

【12】 A Reinforcement Learning Environment for Mathematical Reasoning via Program Synthesis

Authors: Joseph Palermo, Johnny Ye, Alok Singh Affiliations: Cash App Labs, Lawrence Berkeley National Laboratory Link: https://arxiv.org/abs/2107.07373
Abstract: We convert the DeepMind Mathematics Dataset into a reinforcement learning environment by interpreting it as a program synthesis problem. Each action taken in the environment adds an operator or an input into a discrete compute graph. Graphs which compute correct answers yield positive reward, enabling the optimization of a policy to construct compute graphs conditioned on problem statements. Baseline models are trained using Double DQN on various subsets of problem types, demonstrating the capability to learn to correctly construct graphs despite the challenges of combinatorial explosion and noisy rewards.
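The action semantics can be sketched as a stack-built compute graph: each action either pushes an input or applies an operator to the two most recent nodes, with reward only if the final graph evaluates to the correct answer. The operator set and episode format below are invented for the sketch, not the paper's exact environment.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def episode_reward(inputs, target, actions):
    """Replay a sequence of actions that builds a compute graph.
    Integer actions push inputs[i]; operator actions combine the two
    most recently produced nodes. Reward 1.0 iff the final single node
    evaluates to the target answer."""
    stack = []
    for act in actions:
        if act in OPS:
            if len(stack) < 2:
                return 0.0  # malformed graph: no reward
            b, a = stack.pop(), stack.pop()
            stack.append(OPS[act](a, b))
        else:
            stack.append(inputs[act])
    return 1.0 if stack == [target] else 0.0

# "What is (3 + 4) * 3?" -> push 3, push 4, add, push 3, multiply.
print(episode_reward([3, 4], 21, [0, 1, "add", 0, "mul"]))  # 1.0
```

The sparse 0/1 reward at episode end is what makes the learning problem noisy and combinatorially hard, as the abstract notes.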

【13】 DiRe Committee: Diversity and Representation Constraints in Multiwinner Elections

Authors: Kunal Relia Affiliations: New York University, USA Note: 30 pages, 8 figures, 2 tables, 4 algorithms Link: https://arxiv.org/abs/2107.07356
Abstract: The study of fairness in multiwinner elections focuses on settings where candidates have attributes. However, voters may also be divided into predefined populations under one or more attributes (e.g., "California" and "Illinois" populations under the "state" attribute), which may be the same as or different from candidate attributes. Models that focus on candidate attributes alone may systematically under-represent smaller voter populations. Hence, we develop a model, DiRe Committee Winner Determination (DRCWD), which delineates candidate and voter attributes to select a committee by specifying diversity and representation constraints and a voting rule. We show the generalizability of our model, and analyze its computational complexity, inapproximability, and parameterized complexity. We develop a heuristic-based algorithm, which finds the winning DiRe committee in under two minutes on 63% of the instances of synthetic datasets and on 100% of instances of real-world datasets. We present an empirical analysis of the running-time, feasibility, and utility trade-offs. Overall, DRCWD shows that a study of multiwinner elections should consider both of its actors, namely candidates and voters, as candidate-specific "fair" models can unknowingly harm voter populations, and vice versa. Additionally, even when the attributes of candidates and voters coincide, it is important to treat them separately: having a female candidate on the committee, for example, is different from having a candidate on the committee who is preferred by the female voters, and who themselves may or may not be female.
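The constraint structure can be made concrete with a brute-force sketch under approval voting. This is an illustration of the model, not the paper's heuristic algorithm; the candidates, voters, and constraints are invented.

```python
from itertools import combinations

def dire_committee(candidates, voters, k, diversity, representation):
    """Brute-force DiRe-style committee selection under approval voting.
    candidates: dict name -> attribute group (e.g. gender).
    voters: list of (population, approval set) pairs.
    diversity: dict candidate group -> minimum seats on the committee.
    representation: dict voter population -> minimum number of committee
        members approved by at least one voter of that population.
    Returns the feasible size-k committee with the highest approval score."""
    best, best_score = None, -1
    for comm in combinations(candidates, k):
        if any(sum(candidates[c] == g for c in comm) < m
               for g, m in diversity.items()):
            continue  # violates a diversity constraint
        feasible = True
        for pop, m in representation.items():
            approved = set().union(*(ap for p, ap in voters if p == pop))
            if len(set(comm) & approved) < m:
                feasible = False  # violates a representation constraint
        if not feasible:
            continue
        score = sum(len(ap & set(comm)) for _, ap in voters)
        if score > best_score:
            best, best_score = comm, score
    return best

candidates = {"a": "F", "b": "M", "c": "F", "d": "M"}
voters = [("CA", {"a", "b"}), ("CA", {"a"}), ("IL", {"d"})]
winner = dire_committee(candidates, voters, 2, {"F": 1}, {"IL": 1})
print(winner)  # ('a', 'd')
```

Without the IL representation constraint, the highest-scoring committee could ignore the smaller IL population entirely, which is exactly the failure mode the model guards against.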

【14】 Copula-Based Normalizing Flows

Authors: Mike Laszkiewicz, Johannes Lederer, Asja Fischer Note: Accepted for presentation at the ICML 2021 Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models (INNF 2021) Link: https://arxiv.org/abs/2107.07352
Abstract: Normalizing flows, which learn a distribution by transforming the data to samples from a Gaussian base distribution, have proven to be powerful density approximators. But their expressive power is limited by this choice of the base distribution. We therefore propose to generalize the base distribution to a more elaborate copula distribution to capture the properties of the target distribution more accurately. In a first empirical analysis, we demonstrate that this replacement can dramatically improve vanilla normalizing flows in terms of flexibility, stability, and effectiveness for heavy-tailed data. Our results suggest that the improvements are related to an increased local Lipschitz-stability of the learned flow.
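The core move, replacing the Gaussian base with a copula that admits heavy-tailed marginals, can be illustrated without any flow at all: a Gaussian copula separates the dependence structure from the marginal behaviour. The sketch below (with invented parameters and standard-Cauchy marginals) samples from such a base distribution.

```python
import math
import random

def gaussian_copula_cauchy(n, rho, seed=0):
    """Sample n pairs from a Gaussian copula with standard-Cauchy marginals:
    correlated normals -> uniforms via the normal CDF -> heavy tails via
    the Cauchy quantile function."""
    rng = random.Random(seed)
    normal_cdf = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    cauchy_quantile = lambda u: math.tan(math.pi * (u - 0.5))
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rho * z1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        pairs.append((cauchy_quantile(normal_cdf(z1)),
                      cauchy_quantile(normal_cdf(z2))))
    return pairs

samples = gaussian_copula_cauchy(1000, rho=0.9)
# Strong dependence survives the heavy-tailed marginals:
agree = sum(x * y > 0 for x, y in samples) / len(samples)
```

A flow transporting data to this base no longer has to stretch a light-tailed Gaussian to cover heavy tails, which is one intuition for the stability gains the paper reports.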

【15】 A multi-schematic classifier-independent oversampling approach for imbalanced datasets

Authors: Saptarshi Bej, Kristian Schultz, Prashant Srivastava, Markus Wolfien, Olaf Wolkenhauer Affiliations: Department of Systems Biology & Bioinformatics, University of Rostock, Germany; Leibniz Institute for Food Systems Biology, Technical University of Munich; University of London Note: 12 tables, 6 figures Link: https://arxiv.org/abs/2107.07349
Abstract: Over 85 oversampling algorithms, mostly extensions of the SMOTE algorithm, have been built over the past two decades to solve the problem of imbalanced datasets. However, it has been evident from previous studies that different oversampling algorithms have different degrees of efficiency with different classifiers. With numerous algorithms available, it is difficult to decide on an oversampling algorithm for a chosen classifier. Here, we overcome this problem with a multi-schematic and classifier-independent oversampling approach: ProWRAS (Proximity Weighted Random Affine Shadowsampling). ProWRAS integrates the Localized Random Affine Shadowsampling (LoRAS) algorithm and the Proximity Weighted Synthetic oversampling (ProWSyn) algorithm. By controlling the variance of the synthetic samples, and with a proximity-weighted clustering system for the minority class data, the ProWRAS algorithm improves performance compared to algorithms that generate synthetic samples by modelling high-dimensional convex spaces of the minority class. ProWRAS has four oversampling schemes, each of which has its unique way to model the variance of the generated data. Most importantly, the performance of ProWRAS, with a proper choice of oversampling scheme, is independent of the classifier used. We have benchmarked our newly developed ProWRAS algorithm against five state-of-the-art oversampling models and four different classifiers on 20 publicly available datasets. ProWRAS outperforms other oversampling algorithms in a statistically significant way, in terms of both F1-score and Kappa-score. Moreover, we have introduced a novel measure of classifier independence, the I-score, and shown quantitatively that ProWRAS performs better independent of the classifier used. In practice, ProWRAS customizes synthetic sample generation according to a classifier of choice and thereby reduces benchmarking efforts.
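The shadowsampling idea, generating synthetic points as convex combinations within a small minority neighborhood so their variance stays bounded, can be sketched as follows. This is a simplified LoRAS-style generator, not the full ProWRAS algorithm; the neighborhood size and data are invented.

```python
import random

def affine_oversample(minority, n_new, n_neighbors=3, seed=0):
    """Generate synthetic minority points as random convex combinations of
    each anchor's nearest minority neighbors, keeping samples inside the
    local convex hull (which controls their variance)."""
    rng = random.Random(seed)
    sqdist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))
    synthetic = []
    for _ in range(n_new):
        anchor = rng.choice(minority)
        hood = sorted(minority, key=lambda p: sqdist(p, anchor))[:n_neighbors]
        w = [rng.random() for _ in hood]
        total = sum(w)
        synthetic.append(tuple(
            sum(wi / total * p[i] for wi, p in zip(w, hood))
            for i in range(len(anchor))))
    return synthetic

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = affine_oversample(minority, n_new=5)
# Every synthetic point lies within the convex hull of the minority class.
```

Unlike plain SMOTE, which interpolates along a single line segment, convex combinations over a whole neighborhood fill the local region more evenly while never leaving it.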

【16】 Framework for A Personalized Intelligent Assistant to Elderly People for Activities of Daily Living

Authors: Nirmalya Thakur, Chia Y. Han Affiliations: Department of Electrical Engineering and Computer Science, University of Cincinnati Link: https://arxiv.org/abs/2107.07344
Abstract: The increasing population of elderly people is associated with the need to meet their increasing requirements and to provide solutions that improve their quality of life in a smart home. In addition to fear of and anxiety towards interfacing with systems, cognitive disabilities, weakened memory, disorganized behavior and even physical limitations are some of the problems that elderly people tend to face with increasing age. The essence of providing technology-based solutions to address these needs, and of creating smart and assisted living spaces for the elderly, lies in developing systems that can adapt to their diversity and augment their performance in the context of their day-to-day goals. Therefore, this work proposes a framework for the development of a Personalized Intelligent Assistant to help elderly people perform Activities of Daily Living (ADLs) in a smart and connected Internet of Things (IoT) based environment. This Personalized Intelligent Assistant can analyze different tasks performed by the user and recommend activities by considering their daily routine, current affective state and the underlying user experience. To uphold the efficacy of the proposed framework, it has been tested on a couple of datasets for modelling an average user and a specific user, respectively. The results show that the model achieves a performance accuracy of 73.12% when modelling a specific user, which is considerably higher than its performance when modelling an average user; this upholds the relevance of developing and implementing the proposed framework.

【17】 Leveraging wisdom of the crowds to improve consensus among radiologists by real time, blinded collaborations on a digital swarm platform

Authors: Rutwik Shah, Bruno Astuto, Tyler Gleason, Will Fletcher, Justin Banaga, Kevin Sweetwood, Allen Ye, Rina Patel, Kevin McGill, Thomas Link, Jason Crane, Valentina Pedoia, Sharmila Majumdar
Affiliations: Center for Intelligent Imaging, Dept. of Radiology and Biomedical Imaging, UCSF
Note: 24 pages, 2 tables, 7 figures
Link: https://arxiv.org/abs/2107.07341
Abstract: Radiologists today play a key role in making diagnostic decisions and labeling images for training A.I. algorithms. Low inter-reader reliability (IRR) can be seen between experts when interpreting challenging cases. While team-based decisions are known to outperform individual decisions, inter-personal biases often creep into group interactions, which limits non-dominant participants from expressing true opinions. To overcome the dual problems of low consensus and inter-personal bias, we explored a solution modeled on biological swarms of bees. Two separate cohorts, three radiologists and five radiology residents, collaborated on a digital swarm platform in real time and in a blinded fashion, grading meniscal lesions on knee MR exams. These consensus votes were benchmarked against clinical (arthroscopy) and radiological (senior-most radiologist) observations. The IRR of the consensus votes was compared to the IRR of the majority and most-confident votes of the two cohorts. The radiologist cohort saw an improvement of 23% in IRR of swarm votes over majority vote. A similar 23% improvement in IRR was observed for 3-resident swarm votes over majority vote. The 5-resident swarm had an even higher improvement of 32% in IRR over majority vote. Swarm consensus votes also improved specificity by up to 50%. The swarm consensus votes outperformed individual and majority vote decisions in both the radiologist and resident cohorts. The 5-resident swarm had higher IRR than the 3-resident swarm, indicating a positive effect of increased swarm size. The attending and resident swarms also outperformed predictions from a state-of-the-art A.I. algorithm. Utilizing a digital swarm platform improved agreement and allowed participants to express judgement-free intent, resulting in superior clinical performance and robust A.I. training labels.

【18】 Towards Natural Brain-Machine Interaction using Endogenous Potentials based on Deep Neural Networks

Authors: Hyung-Ju Ahn, Dae-Hyeok Lee, Ji-Hoon Jeong, Seong-Whan Lee
Note: 6 pages, 3 figures
Link: https://arxiv.org/abs/2107.07335
Abstract: Human-robot collaboration has the potential to maximize the efficiency of the operation of autonomous robots. A brain-machine interface (BMI) would be a desirable technology for collaborating with robots, since the intention or state of users can be translated from neural activities. However, the electroencephalogram (EEG), one of the most popular non-invasive BMI modalities, has low accuracy and a limited degree of freedom (DoF) due to a low signal-to-noise ratio. Thus, improving the performance of multi-class EEG classification is crucial for developing more flexible BMI-based human-robot collaboration. In this study, we investigated the possibility of inter-paradigm classification of multiple endogenous BMI paradigms, such as motor imagery (MI), visual imagery (VI), and speech imagery (SI), to enhance the limited DoF while maintaining robust accuracy. We conducted statistical and neurophysiological analyses on MI, VI, and SI and classified the three paradigms using the proposed temporal information-based neural network (TINN). We confirmed that statistically significant features can be extracted from different brain regions when classifying the three endogenous paradigms. Moreover, our proposed TINN achieved the highest accuracy of 0.93 compared to previous methods for classifying three different types of mental imagery tasks (MI, VI, and SI).

【19】 Modeling Accurate Human Activity Recognition for Embedded Devices Using Multi-level Distillation

Authors: Runze Chen, Haiyong Luo, Fang Zhao, Xuechun Meng, Zhiqing Xie, Yida Zhu
Note: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Link: https://arxiv.org/abs/2107.07331
Abstract: Human activity recognition (HAR) based on IMU sensors is an essential domain in ubiquitous computing. With the growing trend of deploying artificial intelligence on IoT devices and smartphones, more researchers are designing HAR models for embedded devices. We propose SMLDist, a plug-and-play HAR modeling pipeline with multi-level distillation that builds deep convolutional HAR models with native support for embedded devices. SMLDist consists of stage distillation, memory distillation, and logits distillation, which together cover the entire information flow of the deep models. Stage distillation constrains the learning direction of the intermediate features. Memory distillation teaches student models how to explain and store the inner relationships between high-dimensional features based on Hopfield networks. Logits distillation constructs distilled logits via a smoothed conditional rule to preserve the probability distribution and improve the correctness of the soft target. We compare the accuracy, F1 macro score, and energy cost on embedded platforms of various state-of-the-art HAR frameworks against a MobileNet V3 model built by SMLDist. The produced model strikes a good balance among robustness, efficiency, and accuracy. SMLDist can also compress models with minor performance loss at an equal compression rate compared to other state-of-the-art knowledge distillation methods on seven public datasets.
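The smoothed soft-target idea behind logits distillation can be illustrated with a temperature-scaled KL divergence, the classic knowledge-distillation loss. This is a generic sketch, not SMLDist's actual conditional rule; the function names and temperature value are illustrative assumptions.

```python
import numpy as np

def softmax(z, t=1.0):
    # Temperature-scaled softmax: larger t gives a smoother distribution.
    z = np.asarray(z, dtype=float) / t
    e = np.exp(z - z.max())
    return e / e.sum()

def kd_loss(teacher_logits, student_logits, t=4.0):
    # KL(teacher || student) on temperature-smoothed distributions,
    # scaled by t^2 as in classic knowledge distillation.
    p = softmax(teacher_logits, t)
    q = softmax(student_logits, t)
    return float(t * t * np.sum(p * (np.log(p) - np.log(q))))
```

A student whose logits match the teacher's incurs zero loss; any divergence yields a positive penalty, with the temperature controlling how much of the teacher's "dark knowledge" in the non-maximal classes is transferred.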

【20】 Minimizing Safety Interference for Safe and Comfortable Automated Driving with Distributional Reinforcement Learning

Authors: Danial Kamran, Tizian Engelgeh, Marvin Busch, Johannes Fischer, Christoph Stiller
Affiliations: Institute of Measurement and Control Systems, Karlsruhe Institute of Technology (KIT)
Link: https://arxiv.org/abs/2107.07316
Abstract: Despite recent advances in reinforcement learning (RL), its application in safety-critical domains like autonomous vehicles remains challenging. Although punishing RL agents for risky situations can help to learn safe policies, it may also lead to highly conservative behavior. In this paper, we propose a distributional RL framework to learn adaptive policies that can tune their level of conservativity at run-time based on the desired comfort and utility. Using a proactive safety verification approach, the proposed framework can guarantee that actions generated from RL are fail-safe according to worst-case assumptions. Concurrently, the policy is encouraged to minimize safety interference and generate more comfortable behavior. We trained and evaluated the proposed approach and baseline policies using a high-level simulator with a variety of randomized scenarios, including several corner cases which rarely happen in reality but are very crucial. In light of our experiments, the behavior of policies learned using distributional RL is adaptive at run-time and robust to environment uncertainty. Quantitatively, the learned distributional RL agent drives on average 8 seconds faster than the normal DQN policy and requires 83% less safety interference compared to the rule-based policy, while only slightly increasing the average crossing time. We also study the sensitivity of the learned policy in environments with higher perception noise and show that our algorithm learns policies that can still drive reliably when the perception noise is twice as high as in the training configuration, for automated merging and crossing at occluded intersections.
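The run-time tunable conservativity can be sketched with quantile-based action values: scoring each action at a low quantile of its learned return distribution yields pessimistic, conservative behavior, while a high quantile is optimistic. The helpers below are a hypothetical illustration, not the paper's verified framework.

```python
import numpy as np

def conservative_value(quantiles, tau):
    # Score an action by the tau-quantile of its learned return distribution:
    # tau near 0 is pessimistic/conservative, tau near 1 is optimistic.
    qs = np.sort(np.asarray(quantiles, dtype=float))
    return float(qs[int(tau * (len(qs) - 1))])

def pick_action(action_quantiles, tau):
    # Greedy action selection under the chosen conservativity level.
    return max(action_quantiles,
               key=lambda a: conservative_value(action_quantiles[a], tau))
```

With a risky action whose return distribution has a high upside but a bad tail, and a safe action with a narrow distribution, lowering tau flips the choice from risky to safe without retraining.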

【21】 Spanish Language Models

Authors: Asier Gutiérrez-Fandiño, Jordi Armengol-Estapé, Marc Pàmies, Joan Llop-Palao, Joaquín Silveira-Ocampo, Casimiro Pio Carrino, Aitor Gonzalez-Agirre, Carme Armentano-Oller, Carlos Rodriguez-Penagos, Marta Villegas
Affiliations: Text Mining Unit, Barcelona Supercomputing Center
Link: https://arxiv.org/abs/2107.07253
Abstract: This paper presents the Spanish RoBERTa-base and RoBERTa-large models, as well as the corresponding performance evaluations. Both models were pre-trained using the largest Spanish corpus known to date, with a total of 570GB of clean and deduplicated text processed for this work, compiled from the web crawlings performed by the National Library of Spain from 2009 to 2019.

【22】 Deep Automatic Natural Image Matting

Authors: Jizhizi Li, Jing Zhang, Dacheng Tao
Affiliations: The University of Sydney, Australia
Note: Accepted to IJCAI-21; code and dataset available at https://github.com/JizhiziLi/AIM
Link: https://arxiv.org/abs/2107.07235
Abstract: Automatic image matting (AIM) refers to estimating the soft foreground from an arbitrary natural image without any auxiliary input like a trimap, which is useful for image editing. Prior methods try to learn semantic features to aid the matting process but are limited to images with salient opaque foregrounds such as humans and animals. In this paper, we investigate the difficulties of extending them to natural images with salient transparent/meticulous foregrounds or non-salient foregrounds. To address the problem, a novel end-to-end matting network is proposed, which can predict a generalized trimap for any image of the above types as a unified semantic representation. Simultaneously, the learned semantic features guide the matting network to focus on the transition areas via an attention mechanism. We also construct a test set, AIM-500, that contains 500 diverse natural images covering all types along with manually labeled alpha mattes, making it feasible to benchmark the generalization ability of AIM models. Experimental results demonstrate that our network, trained on available composite matting datasets, outperforms existing methods both objectively and subjectively. The source code and dataset are available at https://github.com/JizhiziLi/AIM.

【23】 Genetic CFL: Optimization of Hyper-Parameters in Clustered Federated Learning

Authors: Shaashwat Agrawal, Sagnik Sarkar, Mamoun Alazab, Praveen Kumar Reddy Maddikunta, Thippa Reddy Gadekallu, Quoc-Viet Pham
Affiliations: School of Computer Science and Engineering, VIT, Vellore, India; College of Engineering, IT and Environment, Charles Darwin University, Casuarina, NT, Australia; School of Information Technology and Engineering, VIT, Vellore, India
Note: 7 pages, 4 figures, 4 tables
Link: https://arxiv.org/abs/2107.07233
Abstract: Federated learning (FL) is a distributed model for deep learning that integrates client-server architecture, edge computing, and real-time intelligence. FL has the capability of revolutionizing machine learning (ML) but lacks practicality of implementation due to technological limitations, communication overhead, non-IID (independent and identically distributed) data, and privacy concerns. Training an ML model over heterogeneous non-IID data highly degrades the convergence rate and performance. Existing traditional and clustered FL algorithms exhibit two main limitations: inefficient client training and static hyper-parameter utilization. To overcome these limitations, we propose a novel hybrid algorithm, namely genetic clustered FL (Genetic CFL), that clusters edge devices based on the training hyper-parameters and genetically modifies the parameters cluster-wise. We then introduce an algorithm that drastically increases individual cluster accuracy by integrating density-based clustering and genetic hyper-parameter optimization. The results are benchmarked using the MNIST handwritten digit dataset and the CIFAR-10 dataset. The proposed Genetic CFL shows significant improvements and works well with realistic cases of non-IID and ambiguous data.
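The cluster-wise genetic modification of hyper-parameters can be sketched minimally: each cluster keeps the fitter of its current hyper-parameters and a random mutation, generation after generation. The `mutate`/`evolve` helpers, the mutation choices, and the hyper-parameter names are illustrative assumptions, not the paper's algorithm.

```python
import random

def mutate(hp, rate=0.3, rng=random):
    # Randomly perturb one cluster's hyper-parameters (illustrative choices:
    # halve/double the learning rate, nudge the batch size).
    out = dict(hp)
    if rng.random() < rate:
        out["lr"] = hp["lr"] * rng.choice([0.5, 2.0])
    if rng.random() < rate:
        out["batch"] = max(8, hp["batch"] + rng.choice([-8, 8]))
    return out

def evolve(clusters, fitness, generations=20, rng=random):
    # Each generation, every cluster keeps the fitter of parent vs. mutated
    # child, so per-cluster fitness never decreases.
    for _ in range(generations):
        clusters = [max((hp, mutate(hp, rng=rng)), key=fitness)
                    for hp in clusters]
    return clusters
```

In the real setting, `fitness` would be each cluster's validation accuracy after a local training round rather than a closed-form function.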

【24】 Trusting RoBERTa over BERT: Insights from CheckListing the Natural Language Inference Task

Authors: Ishan Tarunesh, Somak Aditya, Monojit Choudhury
Affiliations: Samsung Korea; Microsoft Research India, Bengaluru, India
Note: 15 pages, 5 figures and 9 tables
Link: https://arxiv.org/abs/2107.07229
Abstract: Recent state-of-the-art natural language understanding (NLU) systems often behave unpredictably, failing on simple reasoning examples. Despite this, there has been limited focus on quantifying progress towards systems with more predictable behavior. We think that a reasoning-capability-wise behavioral summary is a step towards bridging this gap. We create a CheckList test-suite (184K examples) for the Natural Language Inference (NLI) task, a representative NLU task. We benchmark state-of-the-art NLI systems on this test-suite, which reveals fine-grained insights into the reasoning abilities of BERT and RoBERTa. Our analysis further reveals inconsistencies of the models on examples derived from the same template, or from distinct templates pertaining to the same reasoning capability, indicating that generalizing the models' behavior through observations made on a CheckList is non-trivial. Through a user study, we find that users were able to utilize behavioral information to generalize much better for examples predicted by RoBERTa than for those predicted by BERT.

【25】 Deep Learning based Food Instance Segmentation using Synthetic Data

Authors: D. Park, J. Lee, J. Lee, K. Lee
Affiliations: School of Integrated Technology (SIT), Gwangju Institute of Science and Technology (GIST)
Note: Technical Report
Link: https://arxiv.org/abs/2107.07191
Abstract: In the process of intelligently segmenting foods in images using deep neural networks for diet management, data collection and labeling for network training are important but labor-intensive tasks. To address the difficulties of data collection and annotation, this paper proposes a food segmentation method applicable to the real world through synthetic data. To perform food segmentation on healthcare robot systems, such as a meal-assistance robot arm, we generate synthetic data using the open-source 3D graphics software Blender, placing multiple objects on a meal plate, and train Mask R-CNN for instance segmentation. We also build a data collection system and verify our segmentation model on real-world food data. As a result, on our real-world dataset, the model trained only on synthetic data can segment food instances unseen during training with 52.2% mask AP@all, and improves performance by 6.4%p after fine-tuning compared to the model trained from scratch. In addition, we confirm the feasibility and performance improvement on the public dataset for fair analysis. Our code and pre-trained weights are available online at: https://github.com/gist-ailab/Food-Instance-Segmentation

【26】 Scene-adaptive Knowledge Distillation for Sequential Recommendation via Differentiable Architecture Search

Authors: Lei Chen, Fajie Yuan, Jiaxi Yang, Min Yang, Chengming Li
Link: https://arxiv.org/abs/2107.07173
Abstract: Sequential recommender systems (SRS) have become a research hotspot due to their power in modeling user dynamic interests and sequential behavioral patterns. To maximize model expressive ability, a default choice is to apply a larger and deeper network architecture, which, however, often brings high network latency when generating online recommendations. Naturally, we argue that compressing heavy recommendation models into middle- or light-weight neural networks is of great importance for practical production systems. To realize such a goal, we propose AdaRec, a knowledge distillation (KD) framework which compresses the knowledge of a teacher model into a student model adaptively according to its recommendation scene by using differentiable Neural Architecture Search (NAS). Specifically, we introduce a target-oriented distillation loss to guide the structure search process for finding the student network architecture, and a cost-sensitive loss as a constraint on model size, which achieves a superior trade-off between recommendation effectiveness and efficiency. In addition, we leverage Earth Mover's Distance (EMD) to realize many-to-many layer mapping during knowledge distillation, which enables each intermediate student layer to learn from other intermediate teacher layers adaptively. Extensive experiments on real-world recommendation datasets demonstrate that our model achieves competitive or better accuracy with notable inference speedup compared to strong counterparts, while discovering diverse neural architectures for sequential recommender models under different recommendation scenes.
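The Earth Mover's Distance used above for many-to-many layer mapping has a simple closed form in one dimension: for two histograms on shared bins, it equals the summed absolute difference of their cumulative sums. The sketch below shows that 1-D special case only (AdaRec itself operates on layer representations, which is a more general transport problem).

```python
import numpy as np

def emd_1d(p, q):
    # EMD between two 1-D histograms defined on the same bins:
    # the total mass moved equals the summed absolute CDF difference.
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    return float(np.abs(np.cumsum(p - q)).sum())
```

Moving all mass one bin over costs 1; moving it two bins costs 2, so the distance respects how far, not just whether, distributions differ, which is why it is a natural soft matching score between layers.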

【27】 What and When to Look?: Temporal Span Proposal Network for Video Visual Relation Detection

Authors: Sangmin Woo, Junhyug Noh, Kangil Kim
Note: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Link: https://arxiv.org/abs/2107.07154
Abstract: Identifying relations between objects is central to understanding a scene. While several works have been proposed for relation modeling in the image domain, the video domain poses many challenges due to the dynamics of spatio-temporal interactions (e.g., between which objects is there an interaction? When do relations occur and end?). To date, two representative methods have been proposed to tackle Video Visual Relation Detection (VidVRD): segment-based and window-based. We first point out the limitations of these two methods and propose the Temporal Span Proposal Network (TSPN), a novel method with two advantages in terms of efficiency and effectiveness. 1) TSPN tells what to look at: it sparsifies the relation search space by scoring the relationness (i.e., the confidence that a relation exists between a pair of objects) of each object pair. 2) TSPN tells when to look: it leverages the full video context to simultaneously predict the temporal spans and categories of the entire relations. TSPN demonstrates its effectiveness by achieving a new state of the art by a significant margin on two VidVRD benchmarks (ImageNet-VidVRD and VidOR), while also showing lower time complexity than existing methods; in particular, it is twice as efficient as a popular segment-based approach.

【28】 Learning Mixed-Integer Linear Programs from Contextual Examples

Authors: Mohit Kumar, Samuel Kolb, Luc De Raedt, Stefano Teso
Affiliations: KU Leuven, Belgium; University of Trento, Italy
Link: https://arxiv.org/abs/2107.07136
Abstract: Mixed-integer linear programs (MILPs) are widely used in artificial intelligence and operations research to model complex decision problems like scheduling and routing. Designing such programs, however, requires both domain and modelling expertise. In this paper, we study the problem of acquiring MILPs from contextual examples, a novel and realistic setting in which examples capture solutions and non-solutions within a specific context. The resulting learning problem involves acquiring continuous parameters, namely a cost vector and a feasibility polytope, but has a distinctly combinatorial flavor. To solve this complex problem, we also contribute MISSLE, an algorithm for learning MILPs from contextual examples. MISSLE uses a variant of stochastic local search that is guided by the gradient of a continuous surrogate loss function. Our empirical evaluation on synthetic data shows that MISSLE acquires better MILPs faster than alternatives based on stochastic local search and gradient descent.
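The idea of guiding search by the gradient of a continuous surrogate loss can be illustrated for the cost-vector part alone: a hinge-style surrogate that prefers labeled solutions to be cheaper than labeled non-solutions. This is a hypothetical simplification; MISSLE also learns the feasibility polytope and uses stochastic local search rather than plain gradient descent.

```python
import numpy as np

def surrogate_loss(c, sols, nonsols, margin=1.0):
    # Hinge penalty whenever a labeled solution is not cheaper than a
    # labeled non-solution by at least `margin` under cost vector c.
    return sum(max(0.0, margin + c @ s - c @ n) for s in sols for n in nonsols)

def learn_cost(sols, nonsols, steps=200, lr=0.05, margin=1.0):
    # Gradient descent on the surrogate: each violated pair contributes
    # gradient (s - n), pushing solutions toward lower cost.
    c = np.zeros(len(sols[0]))
    for _ in range(steps):
        g = np.zeros_like(c)
        for s in sols:
            for n in nonsols:
                if margin + c @ s - c @ n > 0:
                    g += np.asarray(s) - np.asarray(n)
        c -= lr * g
    return c
```

After training, the learned cost vector ranks labeled solutions below non-solutions, which is the minimal property the surrogate enforces.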

【29】 An Educational System for Personalized Teacher Recommendation in K-12 Online Classrooms

Authors: Jiahao Chen, Hang Li, Wenbiao Ding, Zitao Liu
Affiliations: TAL Education Group, Beijing, China
Note: AIED'21: The 22nd International Conference on Artificial Intelligence in Education, 2021
Link: https://arxiv.org/abs/2107.07124
Abstract: In this paper, we propose a simple yet effective solution to build practical teacher recommender systems for online one-on-one classes. Our system consists of (1) a pseudo matching score module that provides reliable training labels; (2) a ranking model that scores every candidate teacher; (3) a novelty boosting module that gives additional opportunities to new teachers; and (4) a diversity metric that guardrails the recommended results to reduce the chance of collision. Offline experimental results show that our approach outperforms a wide range of baselines. Furthermore, we show that our approach is able to reduce the number of student-teacher matching attempts from 7.22 to 3.09 in a five-month observation on a third-party online education platform.

【30】 Solving ESL Sentence Completion Questions via Pre-trained Neural Language Models

Authors: Qiongqiong Liu, Tianqiao Liu, Jiafu Zhao, Qiang Fang, Wenbiao Ding, Zhongqin Wu, Feng Xia, Jiliang Tang, Zitao Liu
Affiliations: TAL Education Group, Beijing, China; Data Science and Engineering Lab, Michigan State University, USA; Federation University Australia, Australia
Note: AIED'21: The 22nd International Conference on Artificial Intelligence in Education, 2021
Link: https://arxiv.org/abs/2107.07122
Abstract: Sentence completion (SC) questions present a sentence with one or more blanks to be filled in, with three to five possible words or phrases as options. SC questions are widely used for students learning English as a Second Language (ESL), and building computational approaches to automatically solve such questions is beneficial to language learners. In this work, we propose a neural framework to solve SC questions in English examinations by utilizing pre-trained language models. We conduct extensive experiments on a real-world K-12 ESL SC question dataset, and the results demonstrate the superiority of our model in terms of prediction accuracy. Furthermore, we run a precision-recall trade-off analysis to discuss the practical issues of deploying it in real-life scenarios. To encourage reproducible results, we make our code publicly available at https://github.com/AIED2021/ESL-SentenceCompletion.

【31】 Multi-Task Learning based Online Dialogic Instruction Detection with Pre-trained Language Models

Authors: Yang Hao, Hang Li, Wenbiao Ding, Zhongqin Wu, Jiliang Tang, Rose Luckin, Zitao Liu
Affiliations: TAL Education Group, Beijing, China; Data Science and Engineering Lab, Michigan State University, USA; UCL Knowledge Lab, London, UK
Note: AIED'21: The 22nd International Conference on Artificial Intelligence in Education, 2021
Link: https://arxiv.org/abs/2107.07119
Abstract: In this work, we study computational approaches to detect online dialogic instructions, which are widely used to help students understand learning materials and build effective study habits. This task is rather challenging due to the widely varying quality and pedagogical styles of dialogic instructions. To address these challenges, we utilize pre-trained language models and propose a multi-task paradigm which enhances the ability to distinguish instances of different classes by enlarging the margin between categories via a contrastive loss. Furthermore, we design a strategy to fully exploit the misclassified examples during the training stage. Extensive experiments on a real-world online educational dataset demonstrate that our approach achieves superior performance compared to representative baselines. To encourage reproducible results, we make our implementation available online at https://github.com/AIED2021/multitask-dialogic-instruction.
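The margin-enlarging contrastive loss mentioned above can be sketched in its classic pairwise (Hadsell-style) form: same-class embedding pairs are pulled together, different-class pairs are pushed at least a margin apart. The paper's exact loss may differ; this is an illustrative sketch with an assumed margin value.

```python
import numpy as np

def contrastive_loss(za, zb, same_class, margin=2.0):
    # Pull same-class embedding pairs together; push different-class
    # pairs at least `margin` apart in embedding space.
    d = np.linalg.norm(np.asarray(za, dtype=float) - np.asarray(zb, dtype=float))
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2
```

Note the asymmetry: different-class pairs already farther apart than the margin incur zero loss, so gradient effort is spent only where category boundaries are too close.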

【32】 Transformer-based Machine Learning for Fast SAT Solvers and Logic Synthesis

Authors: Feng Shi, Chonghan Lee, Mohammad Khairul Bashar, Nikhil Shukla, Song-Chun Zhu, Vijaykrishnan Narayanan
Affiliations: University of California Los Angeles; The Pennsylvania State University; University of Virginia
Link: https://arxiv.org/abs/2107.07116
Abstract: CNF-based SAT and MaxSAT solvers are central to logic synthesis and verification systems. The increasing popularity of these constraint problems in electronic design automation encourages studies of different SAT problems and their properties for further computational efficiency. Modern conflict-driven clause learning (CDCL) SAT solvers have seen both theoretical and practical success, allowing very large industrial instances to be solved in a relatively short amount of time. Recently, machine learning approaches have provided a new dimension to solving this challenging problem. Neural symbolic models can serve as generic solvers that are specialized for specific domains based on data without any changes to the structure of the model. In this work, we propose a one-shot model derived from the Transformer architecture to solve the MaxSAT problem, the optimization version of SAT whose goal is to satisfy the maximum number of clauses. Our model has a scale-free structure that can process instances of varying size. We use meta-paths and a self-attention mechanism to capture interactions among homogeneous nodes, and adopt cross-attention mechanisms on the bipartite graph to capture interactions among heterogeneous nodes. We further apply an iterative algorithm to our model to satisfy additional clauses, enabling solutions that approach those of an exact solver. The attention mechanisms leverage parallelism for speedup. Our evaluation indicates improved speedup compared to heuristic approaches and an improved completion rate compared to machine learning approaches.
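The MaxSAT objective itself, the number of satisfied clauses that the model tries to maximize, is easy to state in code. The helper below is a hypothetical illustration of the objective, not the paper's model.

```python
def satisfied_clauses(clauses, assignment):
    # MaxSAT objective: the number of clauses with at least one true literal.
    # Literals are signed ints: v > 0 means variable v, v < 0 its negation.
    return sum(
        any((lit > 0) == assignment[abs(lit)] for lit in clause)
        for clause in clauses
    )
```

For example, the CNF {(x1 OR NOT x2), (x2), (NOT x1)} has no assignment satisfying all three clauses, so its MaxSAT optimum is 2.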

【33】 Uncertainty-Aware Reliable Text Classification

作者:Yibo Hu,Latifur Khan 机构:The University of Texas at Dallas, Richardson, TX, USA 备注:KDD 2021 链接:https://arxiv.org/abs/2107.07114 摘要:深度神经网络对分类任务预测精度的提升做出了重要贡献。然而,在存在领域偏移和分布外(OOD)样本的现实环境中,它们往往做出过于自信的预测。不确定性估计的研究主要集中在计算机视觉领域,因为视觉便于直观检验不确定性的质量;在自然语言处理领域却鲜有报道。与通过权重不确定性间接推断不确定性的贝叶斯方法不同,现有的基于证据不确定性的方法通过主观意见对类概率的不确定性进行显式建模,并进一步按不同的根本原因区分数据中固有的不确定性:真空度(vacuity,即由于缺乏证据导致的不确定性)和失调度(dissonance,即由于证据相互冲突导致的不确定性)。本文首次将证据不确定性应用于文本分类任务的分布外(OOD)检测。我们提出了一个低开销的框架,同时利用辅助离群样本和伪离流形(off-manifold)样本训练模型,使其在具备特定类别先验知识的同时,对OOD样本给出很高的真空度。大量实证实验表明,基于证据不确定性的模型在检测OOD样本方面优于其他同类方法。我们的方法可以很容易地部署到传统的循环神经网络和微调的预训练Transformer上。 摘要:Deep neural networks have significantly contributed to the success in predictive accuracy for classification tasks. However, they tend to make over-confident predictions in real-world settings, where domain shifting and out-of-distribution (OOD) examples exist. Most research on uncertainty estimation focuses on computer vision because it provides visual validation on uncertainty quality. However, few have been presented in the natural language process domain. Unlike Bayesian methods that indirectly infer uncertainty through weight uncertainties, current evidential uncertainty-based methods explicitly model the uncertainty of class probabilities through subjective opinions. They further consider inherent uncertainty in data with different root causes, vacuity (i.e., uncertainty due to a lack of evidence) and dissonance (i.e., uncertainty due to conflicting evidence). In our paper, we firstly apply evidential uncertainty in OOD detection for text classification tasks. We propose an inexpensive framework that adopts both auxiliary outliers and pseudo off-manifold samples to train the model with prior knowledge of a certain class, which has high vacuity for OOD samples. Extensive empirical experiments demonstrate that our model based on evidential uncertainty outperforms other counterparts for detecting OOD examples. 
Our approach can be easily deployed to traditional recurrent neural networks and fine-tuned pre-trained transformers.
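摘要中的真空度与失调度源自主观逻辑对 Dirichlet 证据的解读。下面按主观逻辑文献中常见的定义给出一个计算示意(alpha_k = e_k + 1、u = K/S 等公式取自该类文献的一般形式,并非论文的完整方法):

```python
def vacuity_and_dissonance(evidence):
    """由 Dirichlet 证据向量计算真空度(vacuity)与失调度(dissonance)。
    常见定义:alpha_k = e_k + 1, S = sum(alpha),信念 b_k = e_k / S,
    真空度 u = K / S;失调度按信念间的平衡度(balance)加权得到。"""
    K = len(evidence)
    S = sum(e + 1 for e in evidence)
    b = [e / S for e in evidence]
    vacuity = K / S

    def balance(bj, bk):
        # 两个信念越接近且越大,平衡度越高
        return 1 - abs(bj - bk) / (bj + bk) if bj + bk > 0 else 0.0

    dissonance = 0.0
    for k in range(K):
        denom = sum(b[j] for j in range(K) if j != k)
        if denom > 0:
            dissonance += b[k] * sum(
                b[j] * balance(b[j], b[k]) for j in range(K) if j != k
            ) / denom
    return vacuity, dissonance

# 几乎没有证据 -> 高真空度;证据充足但两类相互冲突 -> 高失调度
print(vacuity_and_dissonance([0, 0, 0]))    # (1.0, 0.0)
print(vacuity_and_dissonance([50, 50, 0]))  # 真空度低、失调度高
```

可以看到,OOD 样本(证据稀少)对应高真空度,而类间证据冲突对应高失调度,这正是摘要区分两种不确定性的动机。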

【34】 Robust Learning for Text Classification with Multi-source Noise Simulation and Hard Example Mining 标题:基于多源噪声模拟和硬例挖掘的文本分类鲁棒学习

作者:Guowei Xu,Wenbiao Ding,Weiping Fu,Zhongqin Wu,Zitao Liu 机构:TAL Education Group, Beijing, China 备注:ECML-PKDD'21: The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021 链接:https://arxiv.org/abs/2107.07113 摘要:许多实际应用需要使用光学字符识别(OCR)引擎将手写图像转换成文本,再对其应用下游自然语言处理(NLP)模型。在这个过程中,OCR引擎可能会引入错误,使下游NLP模型的输入变得有噪声。尽管预训练模型在许多NLP基准测试中取得了最先进的性能,但我们证明它们对真实OCR引擎产生的噪声文本并不鲁棒。这大大限制了NLP模型在现实场景中的应用。为了提高模型在含噪OCR文本上的性能,很自然的做法是直接在含噪的标注文本上训练NLP模型。然而,在大多数情况下只有标注过的干净文本。由于不存在与文本相对应的手写图片,无法直接使用识别模型来获得含噪的标注数据。虽然可以雇人抄写文本并拍照,但考虑到模型训练所需的数据量,这样做的代价极其高昂。因此,我们希望以低资源的方式使NLP模型对OCR错误具有内在的鲁棒性。我们提出了一种新的鲁棒训练框架:1)采用简单而有效的方法,直接从干净文本模拟自然OCR噪声;2)从大量模拟样本中迭代挖掘难样本以获得最佳性能;3)采用稳定性损失,使模型学习噪声不变的表示。在三个真实数据集上的实验表明,该框架大幅提升了预训练模型的鲁棒性。尽管我们使用的算法简单直接,我们相信这项工作能够极大地促进NLP模型在实际场景中的应用。我们公开了代码和三个数据集:https://github.com/tal-ai/Robust-learning-MSSHEM 。 摘要:Many real-world applications involve the use of Optical Character Recognition (OCR) engines to transform handwritten images into transcripts on which downstream Natural Language Processing (NLP) models are applied. In this process, OCR engines may introduce errors and inputs to downstream NLP models become noisy. Despite that pre-trained models achieve state-of-the-art performance in many NLP benchmarks, we prove that they are not robust to noisy texts generated by real OCR engines. This greatly limits the application of NLP models in real-world scenarios. In order to improve model performance on noisy OCR transcripts, it is natural to train the NLP model on labelled noisy texts. However, in most cases there are only labelled clean texts. Since there are no handwritten pictures corresponding to the text, it is impossible to directly use the recognition model to obtain noisy labelled data. Human resources can be employed to copy texts and take pictures, but it is extremely expensive considering the size of data for model training. 
Consequently, we are interested in making NLP models intrinsically robust to OCR errors in a low resource manner. We propose a novel robust training framework which 1) employs simple but effective methods to directly simulate natural OCR noises from clean texts and 2) iteratively mines the hard examples from a large number of simulated samples for optimal performance. 3) To make our model learn noise-invariant representations, a stability loss is employed. Experiments on three real-world datasets show that the proposed framework boosts the robustness of pre-trained models by a large margin. We believe that this work can greatly promote the application of NLP models in actual scenarios, although the algorithm we use is simple and straightforward. We make our codes and three datasets publicly available at https://github.com/tal-ai/Robust-learning-MSSHEM.
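框架的第一步是"从干净文本直接模拟自然 OCR 噪声"。下面是一个假设性的字符级噪声模拟示意(混淆表与概率均为示例自拟,论文的实际噪声模型以原文为准):

```python
import random

# 假设性的形近字符混淆表(真实 OCR 混淆模式远比这丰富)
CONFUSION = {"0": "O", "O": "0", "1": "l", "l": "1", "5": "S", "S": "5"}

def simulate_ocr_noise(text, sub_p=0.05, del_p=0.02, seed=0):
    """对干净文本注入字符级替换/删除噪声,粗略模拟 OCR 错误。"""
    rng = random.Random(seed)
    out = []
    for ch in text:
        r = rng.random()
        if r < del_p:
            continue  # 模拟字符漏识别(删除)
        if r < del_p + sub_p and ch in CONFUSION:
            out.append(CONFUSION[ch])  # 模拟形近字符误识别(替换)
        else:
            out.append(ch)
    return "".join(out)

print(simulate_ocr_noise("OCR noise level 2021", sub_p=0.3, del_p=0.1, seed=42))
```

在这样的模拟样本上继续训练,再按框架第二步挑出模型预测最差的"难样本"迭代训练,即可在没有真实手写图片的条件下获得噪声鲁棒性。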

【35】 Neural Code Summarization: How Far Are We? 标题:神经代码总结:我们还有多远?

作者:Ensheng Shi,Yanlin Wang,Lun Du,Junjie Chen,Shi Han,Hongyu Zhang,Dongmei Zhang,Hongbin Sun 机构:Xi'an Jiaotong University, Tianjin University, The University of Newcastle 链接:https://arxiv.org/abs/2107.07112 摘要:源代码摘要对于理解和维护程序非常重要。然而,许多程序的摘要缺失、过时或与代码不匹配。近年来,人们利用深度学习技术为给定的代码片段自动生成摘要。为了深入理解我们离解决这个问题还有多远,本文在三个广泛使用的数据集上,对五种最先进的神经源代码摘要模型进行了系统深入的分析。我们的评估结果表明:(1)现有工作广泛用于评估摘要模型性能的BLEU度量存在许多变体,忽略这些变体之间的差异可能会影响所声称结果的有效性;此外,我们在一个常用的软件包中发现了一个重要的、此前未知的BLEU计算错误。(2)代码预处理方式的选择对摘要性能有很大影响,因此不应被忽略。(3)数据集的一些重要特征(语料库大小、数据划分方法和重复率)对模型评估有显著影响。在实验结果的基础上,我们给出了一些可操作的指导方针,为更系统地评估代码摘要和在不同场景下选择最佳方法提供指导。我们还提出了未来可能的研究方向。我们相信我们的研究结果对这一有趣领域的从业者和研究人员有很大帮助。 摘要:Source code summaries are important for the comprehension and maintenance of programs. However, there are plenty of programs with missing, outdated, or mismatched summaries. Recently, deep learning techniques have been exploited to automatically generate summaries for given code snippets. To achieve a profound understanding of how far we are from solving this problem, in this paper, we conduct a systematic and in-depth analysis of five state-of-the-art neural source code summarization models on three widely used datasets. Our evaluation results suggest that: (1) The BLEU metric, which is widely used by existing work for evaluating the performance of the summarization models, has many variants. Ignoring the differences among the BLEU variants could affect the validity of the claimed results. Furthermore, we discover an important, previously unknown bug about BLEU calculation in a commonly-used software package. (2) Code pre-processing choices can have a large impact on the summarization performance, therefore they should not be ignored. (3) Some important characteristics of datasets (corpus size, data splitting method, and duplication ratio) have a significant impact on model evaluation. 
Based on the experimental results, we give some actionable guidelines on more systematic ways for evaluating code summarization and choosing the best method in different scenarios. We also suggest possible future research directions. We believe that our results can be of great help for practitioners and researchers in this interesting area.
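摘要的结论(1)指出 BLEU 变体差异会影响结论有效性。下面手写一个简化的句子级 BLEU,演示"是否加一平滑"这一常见变体差异如何让同一对句子得到 0 与非 0 两种分数(实现为教学示意,与论文所测软件包无关):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(candidate, reference, max_n=4, smooth=False):
    """简化的句子级 BLEU:截断 n-gram 精度的几何平均 × 简洁惩罚。
    smooth=True 时对各阶精度的分子分母各加 1(一种常见的加一平滑变体)。"""
    cand, ref = candidate.split(), reference.split()
    log_sum = 0.0
    for n in range(1, max_n + 1):
        c_counts, r_counts = Counter(ngrams(cand, n)), Counter(ngrams(ref, n))
        match = sum(min(c, r_counts[g]) for g, c in c_counts.items())
        total = max(len(cand) - n + 1, 0)
        if smooth:
            match, total = match + 1, total + 1
        if match == 0 or total == 0:
            return 0.0  # 无平滑时,任一阶精度为零整个分数即为零
        log_sum += math.log(match / total) / max_n
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_sum)

cand = "the model generates a summary"
ref = "the model produces a short summary"
print(sentence_bleu(cand, ref, smooth=False))  # 无 3-gram 匹配时为 0.0
print(sentence_bleu(cand, ref, smooth=True))   # 平滑后为非零分数
```

短句在高阶 n-gram 上常常零匹配,因此平滑与否会系统性地改变排名,这正是摘要强调要报告所用 BLEU 变体的原因。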

【36】 Applying the Case Difference Heuristic to Learn Adaptations from Deep Network Features 标题:案例差异启发式在深度网络特征自适应学习中的应用

作者:Xiaomeng Ye,Ziwei Zhao,David Leake,Xizi Wang,David Crandall 机构:Luddy School of Informatics, Computing, and Engineering, Indiana University, Bloomington, IN, USA 备注:7 pages, 2 figures, 1 table. To be published in the IJCAI-21 Workshop on Deep Learning, Case-Based Reasoning, and AutoML: Present and Future Synergies 链接:https://arxiv.org/abs/2107.07095 摘要:案例差异启发式(CDH)方法是一种从基于案例推理(CBR)系统的案例库中学习案例适应知识的轻知识方法。给定一对案例,CDH方法将两者解的差异归因于它们所解问题的差异;当检索到的案例与新查询之间存在相似的问题差异时,生成适应规则来相应地调整解。作为学习适应规则的一种替代方法,一些研究人员已经应用神经网络从问题差异中学习预测解的差异。以前关于这类方法的工作假设描述问题的特征集是预先定义的。本文研究了一种两阶段过程,将用于特征提取的深度学习与基于所提取特征的神经网络适应学习相结合。我们在一个图像回归任务(根据人脸图像预测年龄)上演示了该方法的性能。结果表明,该组合过程能成功学习适用于案例间非符号化差异的适应知识。CBR系统的总体性能略低于基线深度网络回归器,但在新颖查询上优于基线。 摘要:The case difference heuristic (CDH) approach is a knowledge-light method for learning case adaptation knowledge from the case base of a case-based reasoning system. Given a pair of cases, the CDH approach attributes the difference in their solutions to the difference in the problems they solve, and generates adaptation rules to adjust solutions accordingly when a retrieved case and new query have similar problem differences. As an alternative to learning adaptation rules, several researchers have applied neural networks to learn to predict solution differences from problem differences. Previous work on such approaches has assumed that the feature set describing problems is predefined. This paper investigates a two-phase process combining deep learning for feature extraction and neural network based adaptation learning from extracted features. Its performance is demonstrated in a regression task on an image data: predicting age given the image of a face. Results show that the combined process can successfully learn adaptation knowledge applicable to nonsymbolic differences in cases. The CBR system achieves slightly lower performance overall than a baseline deep network regressor, but better performance than the baseline on novel queries.

【37】 STAR: Sparse Transformer-based Action Recognition 标题:STAR:基于稀疏Transformer的动作识别

作者:Feng Shi,Chonghan Lee,Liang Qiu,Yizhou Zhao,Tianyi Shen,Shivran Muralidhar,Tian Han,Song-Chun Zhu,Vijaykrishnan Narayanan 机构:University of California Los Angeles, The Pennsylvania State University, Stevens Institute of Technology 链接:https://arxiv.org/abs/2107.07089 摘要:面向人类动作与行为的认知系统已经进入深度学习时代,尤其是近年来图卷积网络的出现改变了这一领域。然而,以往的工作主要集中在基于稠密图卷积网络的过参数化复杂模型上,导致训练和推理效率低下。同时,基于Transformer架构的模型在人类动作与行为估计的认知应用方面尚未得到充分探索。本工作提出了一种新的基于骨架的人体动作识别模型,在数据的空间维度上采用稀疏注意力,在时间维度上采用分段线性注意力。我们的模型还可以把长度不一的视频片段组成单个批次进行处理。实验表明,该模型在使用少得多的可训练参数的情况下可以达到相当的性能,并且训练和推理速度快;与基线模型相比,在具有竞争力的精度下实现了4~18倍的加速和1/7~1/15的模型大小。 摘要:The cognitive system for human action and behavior has evolved into a deep learning regime, and especially the advent of Graph Convolution Networks has transformed the field in recent years. However, previous works have mainly focused on over-parameterized and complex models based on dense graph convolution networks, resulting in low efficiency in training and inference. Meanwhile, the Transformer architecture-based model has not yet been well explored for cognitive application in human action and behavior estimation. This work proposes a novel skeleton-based human action recognition model with sparse attention on the spatial dimension and segmented linear attention on the temporal dimension of data. Our model can also process the variable length of video clips grouped as a single batch. Experiments show that our model can achieve comparable performance while utilizing much less trainable parameters and achieve high speed in training and inference. Experiments show that our model achieves 4~18x speedup and 1/7~1/15 model size compared with the baseline models at competitive accuracy.

【38】 Deep Reinforcement Learning based Dynamic Optimization of Bus Timetable 标题:基于深度强化学习的公交时刻表动态优化

作者:Guanqun Ai,Xingquan Zuo,Gang Chen,Binglin Wu 机构:School of Computer Science, Beijing University of Posts and Telecommunications; Key Laboratory of Trustworthy Distributed Computing and Service; School of Engineering and Computer Science 链接:https://arxiv.org/abs/2107.07066 摘要:公交时刻表优化是公交公司降低运营成本、提高服务质量的关键问题。现有方法采用精确算法或启发式算法以离线方式优化时刻表。实际上,客流可能随时间发生显著变化,离线确定的时刻表无法调整发车间隔以适应变化的客流。为了提高公交时刻表的在线性能,我们提出了一种基于深度强化学习的公交时刻表动态优化方法(DRL-TO)。该方法将时刻表优化视为一个序贯决策问题,采用深度Q网络(DQN)作为决策模型,在服务时段的每一分钟决定是否发出一班公交服务,从而根据乘客需求实时确定公交服务的发车间隔。我们为DQN识别了若干新颖且有用的状态特征,包括负载系数、载客能力利用率和滞留乘客数。同时考虑公交公司和乘客的利益,设计了一个奖励函数,包含满载率、空载率、乘客等待时间和滞留乘客数等指标。在现有载客能力计算方法的基础上,我们开发了一种提高各公交站点匹配度的新技术。实验表明,与最新的基于模因算法的公交时刻表优化方法(BTOA-MA)、遗传算法(GA)和人工方法生成的时刻表相比,DRL-TO能够根据实时客流动态确定发车间隔,平均节省8%的车辆,减少17%的乘客等候时间。 摘要:Bus timetable optimization is a key issue to reduce operational cost of bus companies and improve the service quality. Existing methods use exact or heuristic algorithms to optimize the timetable in an offline manner. In practice, the passenger flow may change significantly over time. Timetables determined in offline cannot adjust the departure interval to satisfy the changed passenger flow. Aiming at improving the online performance of bus timetable, we propose a Deep Reinforcement Learning based bus Timetable dynamic Optimization method (DRL-TO). In this method, the timetable optimization is considered as a sequential decision problem. A Deep Q-Network (DQN) is employed as the decision model to determine whether to dispatch a bus service during each minute of the service period. Therefore, the departure intervals of bus services are determined in real time in accordance with passenger demand. We identify several new and useful state features for the DQN, including the load factor, carrying capacity utilization rate, and the number of stranding passengers. 
Taking into account both the interests of the bus company and passengers, a reward function is designed, which includes the indicators of full load rate, empty load rate, passengers' waiting time, and the number of stranding passengers. Building on an existing method for calculating the carrying capacity, we develop a new technique to enhance the matching degree at each bus station. Experiments demonstrate that compared with the timetable generated by the state-of-the-art bus timetable optimization approach based on a memetic algorithm (BTOA-MA), Genetic Algorithm (GA) and the manual method, DRL-TO can dynamically determine the departure intervals based on the real-time passenger flow, saving 8% of vehicles and reducing 17% of passengers' waiting time on average.
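摘要提到奖励函数综合了满载率、空载率、候车时间与滞留乘客数。下面是按这一描述自拟的一个线性奖励示意(权重为假设值,论文中的具体定义以原文为准):

```python
def timetable_reward(full_load_rate, empty_load_rate, avg_wait_min, stranded,
                     w=(1.0, 1.0, 0.1, 0.5)):
    """示意性的奖励函数:鼓励高满载率,惩罚空载、候车时间和滞留乘客。
    w = (满载率权重, 空载率权重, 候车时间权重, 滞留乘客权重),均为假设值。"""
    w1, w2, w3, w4 = w
    return (w1 * full_load_rate - w2 * empty_load_rate
            - w3 * avg_wait_min - w4 * stranded)

# 发车及时、车辆利用充分的状态,奖励高于空驶多、滞留多的状态
print(timetable_reward(0.8, 0.1, 5, 0))    # 较高奖励
print(timetable_reward(0.2, 0.6, 20, 10))  # 较低(负)奖励
```

DQN 在每分钟的"发车/不发车"二元动作上最大化这类奖励的折扣和,从而在公司成本与乘客体验之间取得平衡。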

【39】 Expert Graphs: Synthesizing New Expertise via Collaboration 标题:专家图表:通过协作综合新的专业知识

作者:Bijan Mazaheri,Siddharth Jain,Jehoshua Bruck 机构:California Institute of Technology 备注:13 pages, 11 figures 链接:https://arxiv.org/abs/2107.07054 摘要:考虑多位专业领域相互重叠的专家在不确定输入下处理同一个分类问题。什么构成一套一致的意见?如何预测专家在缺失子领域上的意见?在本文中,我们定义了一个分析这一问题的框架,称之为"专家图"。在专家图中,顶点表示类别,边表示关于其两端顶点所对应主题的二元意见。我们推导了专家图有效性的必要条件,并用它们来构造"综合专家",其描述的意见与其他专家的已观测意见相一致。我们证明了该框架等价于已被深入研究的线性序多面体(linear ordering polytope)。我们还证明这些条件不足以刻画团(clique)上的所有专家图,但足以刻画圈(cycle)。 摘要:Consider multiple experts with overlapping expertise working on a classification problem under uncertain input. What constitutes a consistent set of opinions? How can we predict the opinions of experts on missing sub-domains? In this paper, we define a framework of to analyze this problem, termed "expert graphs." In an expert graph, vertices represent classes and edges represent binary opinions on the topics of their vertices. We derive necessary conditions for expert graph validity and use them to create "synthetic experts" which describe opinions consistent with the observed opinions of other experts. We show this framework to be equivalent to the well-studied linear ordering polytope. We show our conditions are not sufficient for describing all expert graphs on cliques, but are sufficient for cycles.
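线性序多面体的三角不等式 0 ≤ p_ij + p_jk − p_ik ≤ 1 是此类成对意见一致性的经典必要条件。下面是一个按该不等式自拟的检查示意(仅为说明必要条件的含义,并非论文中的完整判据):

```python
from itertools import permutations

def check_triangle_conditions(p):
    """检查成对意见矩阵 p(p[i][j] 为"i 优于 j"的概率,p[i][j]+p[j][i]=1)
    是否满足线性序多面体的三角不等式 0 <= p_ij + p_jk - p_ik <= 1。
    满足是意见集合一致的必要条件;违反则必不一致。"""
    n = len(p)
    for i, j, k in permutations(range(n), 3):
        v = p[i][j] + p[j][k] - p[i][k]
        if not (0.0 <= v <= 1.0 + 1e-9):
            return False, (i, j, k)  # 返回违反不等式的三元组
    return True, None

# 一致的意见:来自固定排序 0 > 1 > 2 的确定性比较
consistent = [[0.5, 1.0, 1.0], [0.0, 0.5, 1.0], [0.0, 0.0, 0.5]]
# 不一致的"石头剪刀布"式循环意见
cyclic = [[0.5, 1.0, 0.0], [0.0, 0.5, 1.0], [1.0, 0.0, 0.5]]
print(check_triangle_conditions(consistent)[0])  # True
print(check_triangle_conditions(cyclic)[0])      # False
```

循环意见违反三角不等式,因而无法由任何对排序的概率分布产生,这与摘要所说"必要但在团上不充分"的结论相呼应。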

【40】 Explainable AI: current status and future directions 标题:可解释人工智能:现状与未来发展方向

作者:Prashant Gohel,Priyanka Singh,Manoranjan Mohanty 机构:DA-IICT, Gandhinagar, Gujarat, India, Centre for Forensic Science, University of Technology Sydney, Australia 链接:https://arxiv.org/abs/2107.07045 摘要:可解释人工智能(XAI)是人工智能领域的一个新兴研究方向。XAI可以解释AI是如何得到特定解(例如分类或目标检测)的,还可以回答其他"wh"问题。这种可解释性在传统的人工智能中是无法实现的。可解释性对于国防、医疗保健、法律与秩序以及自动驾驶车辆等关键应用至关重要,在这些应用中,了解系统的决策依据是建立信任和透明所必需的。迄今为止,已有许多XAI技术针对此类应用被提出。本文从多媒体(即文本、图像、音频和视频)的角度对这些技术进行了概述,讨论了它们的优缺点,并指出了一些未来的发展方向。 摘要:Explainable Artificial Intelligence (XAI) is an emerging area of research in the field of Artificial Intelligence (AI). XAI can explain how AI obtained a particular solution (e.g., classification or object detection) and can also answer other "wh" questions. This explainability is not possible in traditional AI. Explainability is essential for critical applications, such as defense, health care, law and order, and autonomous driving vehicles, etc, where the know-how is required for trust and transparency. A number of XAI techniques so far have been proposed for such applications. This paper provides an overview of these techniques from a multimedia (i.e., text, image, audio, and video) point of view. The advantages and shortcomings of these techniques have been discussed, and pointers to some future directions have also been provided.

【41】 GGT: Graph-Guided Testing for Adversarial Sample Detection of Deep Neural Network 标题:GGT:深度神经网络对抗性样本检测的图导测试

作者:Zuohui Chen,Renxuan Wang,Jingyang Xiang,Yue Yu,Xin Xia,Shouling Ji,Qi Xuan,Xiaoniu Yang 机构:Institute of Cyberspace Security, Zhejiang University of Technology, Hangzhou, China, National University of Defense Technology, Changsha, China, Monash University, Melbourne, Australia, Zhejiang University, Hangzhou, China 链接:https://arxiv.org/abs/2107.07043 摘要:众所周知,深度神经网络(DNN)易受对抗样本攻击,对这类样本的检测对于DNN模型的广泛应用至关重要。近年来,软件工程领域提出了许多深度测试方法来发现DNN系统的脆弱性,其中模型变异测试(MMT)被成功用于检测各类对抗攻击产生的对抗样本。然而,MMT中的变异模型数量庞大(例如超过100个模型)且缺乏多样性(例如容易被高置信度对抗样本绕过),这使得MMT在实际应用中效率较低,且难以有效检测高置信度对抗样本。为克服上述挑战,本文提出了用于对抗样本检测的图引导测试(GGT)。GGT在图特征的引导下生成剪枝模型,每个剪枝模型的参数量仅为MMT中变异模型的5%左右,且图引导生成的模型具有更高的多样性。在CIFAR10和SVHN上的实验验证了GGT在有效性和效率上都明显优于MMT。 摘要:Deep Neural Networks (DNN) are known to be vulnerable to adversarial samples, the detection of which is crucial for the wide application of these DNN models. Recently, a number of deep testing methods in software engineering were proposed to find the vulnerability of DNN systems, and one of them, i.e., Model Mutation Testing (MMT), was used to successfully detect various adversarial samples generated by different kinds of adversarial attacks. However, the mutated models in MMT are always huge in number (e.g., over 100 models) and lack diversity (e.g., can be easily circumvented by high-confidence adversarial samples), which makes it less efficient in real applications and less effective in detecting high-confidence adversarial samples. In this study, we propose Graph-Guided Testing (GGT) for adversarial sample detection to overcome these aforementioned challenges. GGT generates pruned models with the guide of graph characteristics, each of them has only about 5% parameters of the mutated model in MMT, and graph guided models have higher diversity. The experiments on CIFAR10 and SVHN validate that GGT performs much better than MMT with respect to both effectiveness and efficiency.

【42】 Conditional Teaching Size 标题:有条件的教学规模

作者:Manuel Garcia-Piqueras,José Hernández-Orallo 机构:Math. Dept., Universidad de Castilla-La Mancha, Albacete, Spain, VRAIN, Universitat Politecnica de Valencia, Valencia, Spain 备注:26 pages 链接:https://arxiv.org/abs/2107.07038 摘要:最近的机器教学研究探索了对用通用语言表达的任意概念的教学。在这种组合背景下,新的实验结果表明,存在比概念描述本身短得惊人的教学数据集。然而,这些显著的实验发现受到一个由教学规模与概念复杂度给出的界的限制,我们将在此进一步探讨。由于概念很少被单独教授,我们研究教授给定概念集合时的最佳概念配置,其中先习得的概念可以被重用于描述新概念。这种新的条件教学规模概念揭示了新的见解,例如插入(interposition)现象:某些先验知识会产生更简单的兼容概念,反而增加我们想要教授的概念的教学规模;而条件Kolmogorov复杂度不存在这种现象。此外,我们给出了一个基于避免插入来构造最优课程的算法。本文给出了一系列理论结果及其证明,并指出了若干未来工作方向。组合情境下的课程教学研究如今有广阔的探索空间。 摘要:Recent research in machine teaching has explored the instruction of any concept expressed in a universal language. In this compositional context, new experimental results have shown that there exist data teaching sets surprisingly shorter than the concept description itself. However, there exists a bound for those remarkable experimental findings through teaching size and concept complexity that we further explore here. As concepts are rarely taught in isolation we investigate the best configuration of concepts to teach a given set of concepts, where those that have been acquired first can be reused for the description of new ones. This new notion of conditional teaching size uncovers new insights, such as the interposition phenomenon: certain prior knowledge generates simpler compatible concepts that increase the teaching size of the concept that we want to teach. This does not happen for conditional Kolmogorov complexity. Furthermore, we provide an algorithm that constructs optimal curricula based on interposition avoidance. This paper presents a series of theoretical results, including their proofs, and some directions for future work. New research possibilities in curriculum teaching in compositional scenarios are now wide open to exploration.

【43】 Experimental Evidence that Empowerment May Drive Exploration in Sparse-Reward Environments 标题:赋权可能在稀疏奖励环境中驱动探索的实验证据

作者:Francesco Massari,Martin Biehl,Lisa Meeden,Ryota Kanai 机构:Swarthmore College, Swarthmore, USA, Araya Inc., Tokyo, Japan, Department of Computer Science 备注:6 pages, 3 figures, to be published in proceedings of the International Conference on Development and Learning 2021 链接:https://arxiv.org/abs/2107.07031 摘要:众所周知,强化学习(RL)在外在奖励稀疏的环境中往往难以成功。一个可能的对策是赋予RL代理一个内在奖励函数,即"内在动机",它根据当前传感器状态的某些特征来奖励代理。基于赋权(empowerment)原理的内在奖励函数按照代理对自身传感器的控制量按比例分配奖励。我们实现了一个最近提出的内在动机代理的变体(我们称之为"好奇"代理),以及一个受赋权启发的代理。前者利用变分自编码器对传感器状态进行编码,后者通过变分信息瓶颈预测下一个传感器状态。在四个奖励稀疏的网格世界中,我们将这两个代理的性能与优势演员-评论家(advantage actor-critic)基线进行了比较。赋权代理和好奇代理似乎都从各自的内在奖励中获得了相近的收益。这为赋权可以用来驱动探索的猜想提供了一些实验支持。 摘要:Reinforcement Learning (RL) is known to be often unsuccessful in environments with sparse extrinsic rewards. A possible countermeasure is to endow RL agents with an intrinsic reward function, or 'intrinsic motivation', which rewards the agent based on certain features of the current sensor state. An intrinsic reward function based on the principle of empowerment assigns rewards proportional to the amount of control the agent has over its own sensors. We implemented a variation on a recently proposed intrinsically motivated agent, which we refer to as the 'curious' agent, and an empowerment-inspired agent. The former leverages sensor state encoding with a variational autoencoder, while the latter predicts the next sensor state via a variational information bottleneck. We compared the performance of both agents to that of an advantage actor-critic baseline in four sparse reward grid worlds. Both the empowerment agent and its curious competitor seem to benefit to similar extents from their intrinsic rewards. This provides some experimental support to the conjecture that empowerment can be used to drive exploration.
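赋权通常定义为动作到后继传感器状态这条信道的容量;在确定性动力学下,它化简为可达的不同后继状态数的对数。下面是按这一化简自拟的网格世界示意(论文实际使用的是变分估计器,此处仅演示概念):

```python
import math

def empowerment_deterministic(transitions):
    """确定性环境下的一步赋权:log2(可达的不同后继状态数),
    即"动作 -> 状态"信道在确定性动力学下的容量。
    transitions: 动作 -> 后继状态 的映射(示意,非论文的变分估计)。"""
    return math.log2(len(set(transitions.values())))

# 开阔位置:4 个动作到达 4 个不同格子 -> 2 比特
print(empowerment_deterministic({"up": (0, 1), "down": (0, -1),
                                 "left": (-1, 0), "right": (1, 0)}))  # 2.0
# 角落:两个动作被墙挡住、停留原地 -> 赋权更低
print(empowerment_deterministic({"up": (0, 0), "left": (0, 0),
                                 "down": (0, -1), "right": (1, 0)}))
```

以赋权为内在奖励会把代理推向"选择余地多"的状态(如开阔地带而非角落),从而在没有外在奖励时也产生探索行为。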

【44】 Forgetting in Answer Set Programming -- A Survey 标题:答案集编程中的遗忘--综述

作者:Ricardo Gonçalves,Matthias Knorr,João Leite 机构:NOVA LINCS, Departamento de Inform´atica, Faculdade de Ciˆencias e Tecnologia, Universidade Nova de Lisboa, Portugal 备注:Under consideration in Theory and Practice of Logic Programming (TPLP) 链接:https://arxiv.org/abs/2107.07016 摘要:遗忘或变量消除是一种操作,它允许从知识库中删除不再相关的中间变量。近年来,关于答案集规划中的遗忘问题,人们提出了许多不同的解决方法,它们以特定的算子或此类算子的类的形式出现,通常遵循不同的原则,遵循不同的性质。每一种方法都是为了解决遗忘的某些特殊观点而发展起来的,其目的是遵守在这种观点中被认为是可取的一组特定的属性,但是缺少对所有现有操作符和属性的全面和统一的概述。本文深入研究了答案集规划中遗忘算子的现有性质和(类),给出了这类遗忘算子的全貌,其中包括许多关于性质和算子之间关系的新结果,包括计算遗忘结果的具体算子和计算复杂度的考虑。我们的目标是提供指导,帮助用户选择最适合其应用需求的运营商。 摘要:Forgetting - or variable elimination - is an operation that allows the removal, from a knowledge base, of middle variables no longer deemed relevant. In recent years, many different approaches for forgetting in Answer Set Programming have been proposed, in the form of specific operators, or classes of such operators, commonly following different principles and obeying different properties. Each such approach was developed to somehow address some particular view on forgetting, aimed at obeying a specific set of properties deemed desirable in such view, but a comprehensive and uniform overview of all the existing operators and properties is missing. In this paper, we thoroughly examine existing properties and (classes of) operators for forgetting in Answer Set Programming, drawing a complete picture of the landscape of these classes of forgetting operators, which includes many novel results on relations between properties and operators, including considerations on concrete operators to compute results of forgetting and computational complexity. Our goal is to provide guidance to help users in choosing the operator most adequate for their application requirements.

【45】 Do Humans Trust Advice More if it Comes from AI? An Analysis of Human-AI Interactions 标题:如果建议来自人工智能,人类会更信任吗?人工智能与人的交互分析

作者:Kailas Vodrahalli,Tobias Gerstenberg,James Zou 机构:Stanford University 备注:34 pages, 6 figures, 18 full page figures 链接:https://arxiv.org/abs/2107.07015 摘要:在人工智能的许多应用中,算法的输出被呈现为给人类用户的建议。用户可以忽略这些建议,也可以参考它们来修改自己的决定。随着此类人类-AI交互的日益普及,重要的是了解用户如何根据AI的建议采取行动(或不采取行动),以及当用户认为建议来自"AI"而非另一个人时,他们是否会以不同的方式看待建议。在本文中,我们在多种实验设置下刻画了人类如何使用AI建议,并与来自一组人类同伴的等效建议进行对比。我们发现,参与者对人类与AI在给定任务上表现的信念会影响他们是否听从建议;而当参与者决定采纳建议时,他们对待人类建议和AI建议的方式是相似的。这些结果为影响人类-AI交互的因素提供了见解。 摘要:In many applications of AI, the algorithm's output is framed as a suggestion to a human user. The user may ignore the advice or take it into consideration to modify his/her decisions. With the increasing prevalence of such human-AI interactions, it is important to understand how users act (or do not act) upon AI advice, and how users regard advice differently if they believe the advice come from an "AI" versus another human. In this paper, we characterize how humans use AI suggestions relative to equivalent suggestions from a group of peer humans across several experimental settings. We find that participants' beliefs about the human versus AI performance on a given task affects whether or not they heed the advice. When participants decide to use the advice, they do so similarly for human and AI suggestions. These results provide insights into factors that affect human-AI interactions.

【46】 WeightScale: Interpreting Weight Change in Neural Networks 标题:WeightScale:解释神经网络中的权重变化

作者:Ayush Manish Agrawal,Atharva Tendle,Harshvardhan Sikka,Sahib Singh 机构:University of Nebraska-Lincoln, Georgia Institute of Technology, Ford Motor Company 备注:9 pages, 8 figures. arXiv admin note: text overlap with arXiv:2011.06735 链接:https://arxiv.org/abs/2107.07005 摘要:解释神经网络的学习动力学可以提供有用的见解,了解网络是如何学习的,以及开发更好的训练和设计方法。我们提出了一种解释神经网络学习的方法,通过在每层基础上测量相对权重的变化,并通过降维和聚类的结合动态地聚集新的趋势,使我们能够扩展到非常深的网络。我们使用这种方法来研究视觉任务背景下跨各种最先进网络的学习,并深入了解这些网络的学习行为,包括任务复杂性如何影响网络深层次的分层学习。 摘要:Interpreting the learning dynamics of neural networks can provide useful insights into how networks learn and the development of better training and design approaches. We present an approach to interpret learning in neural networks by measuring relative weight change on a per layer basis and dynamically aggregating emerging trends through combination of dimensionality reduction and clustering which allows us to scale to very deep networks. We use this approach to investigate learning in the context of vision tasks across a variety of state-of-the-art networks and provide insights into the learning behavior of these networks, including how task complexity affects layer-wise learning in deeper layers of networks.

【47】 DeepHyperion: Exploring the Feature Space of Deep Learning-Based Systems through Illumination Search 标题:DeepHyperion:通过光照搜索探索深度学习系统的特征空间

作者:Tahereh Zohdinasab,Vincenzo Riccio,Alessio Gambi,Paolo Tonella 机构:Università della Svizzera Italiana, Lugano, Switzerland, University of Passau, Passau, Germany 备注:To be published in Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA '21), July 11-17, 2021, Virtual, Denmark. ACM, New York, NY, USA, 12 pages 链接:https://arxiv.org/abs/2107.06997 摘要:深度学习(Deep Learning,DL)已成功应用于包括安全关键领域在内的广泛应用领域。文献中最近提出了几种DL测试方法,但没有一种方法旨在评估生成输入的不同可解释特征如何影响系统行为。在本文中,我们借助于光照搜索来寻找分布在表示系统特征空间的地图单元中性能最高的测试用例(即,行为不端和最接近行为不端的测试用例)。我们介绍了一种方法,指导我们的方法在识别和量化的任务,为给定领域的特征空间的维度的用户。我们开发了DeepHyperion,这是一种基于搜索的DL系统工具,通过向开发人员提供可解释的特征图,自动生成的输入与暴露行为的信息一起放置在该特征图中,从而照亮(即,全面探索)特征空间。 摘要:Deep Learning (DL) has been successfully applied to a wide range of application domains, including safety-critical ones. Several DL testing approaches have been recently proposed in the literature but none of them aims to assess how different interpretable features of the generated inputs affect the system's behaviour. In this paper, we resort to Illumination Search to find the highest-performing test cases (i.e., misbehaving and closest to misbehaving), spread across the cells of a map representing the feature space of the system. We introduce a methodology that guides the users of our approach in the tasks of identifying and quantifying the dimensions of the feature space for a given domain. We developed DeepHyperion, a search-based tool for DL systems that illuminates, i.e., explores at large, the feature space, by providing developers with an interpretable feature map where automatically generated inputs are placed along with information about the exposed behaviours.
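光照搜索(MAP-Elites 一类方法)的核心是一个按可解释特征划分格子、每格只保留最优个体的归档。下面是按这一思想自拟的极简示意(`illuminate` 为示例函数名,并非 DeepHyperion 的完整搜索流程):

```python
def illuminate(candidates, feature_fn, fitness_fn, bins=5):
    """MAP-Elites 式归档步骤:把每个候选测试输入按其可解释特征
    映射到特征图的一个格子,每个格子只保留适应度最高的候选,
    从而"照亮"特征空间的各个区域而非只找单个最优解。"""
    archive = {}
    for c in candidates:
        # 假设各特征值已归一化到 [0, 1]
        cell = tuple(min(int(f * bins), bins - 1) for f in feature_fn(c))
        if cell not in archive or fitness_fn(c) > fitness_fn(archive[cell]):
            archive[cell] = c
    return archive

# 玩具示例:一维特征即候选值本身,适应度也取候选值
archive = illuminate([0.10, 0.15, 0.90], lambda c: (c,), lambda c: c)
print(archive)  # {(0,): 0.15, (4,): 0.9}
```

与只追求最高适应度的搜索不同,归档同时覆盖了特征空间中相距很远的格子,这正是"照亮特征空间"并定位各区域内最接近出错的输入的含义。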

【48】 Elastic Graph Neural Networks 标题:弹性图神经网络

作者:Xiaorui Liu,Wei Jin,Yao Ma,Yaxin Li,Hua Liu,Yiqi Wang,Ming Yan,Jiliang Tang 机构:Department of Computer Science and Engineering, Michigan State University, USA; School of Mathematics, Shandong University, China; Department of Computational Mathematics 备注:ICML 2021 (International Conference on Machine Learning) 链接:https://arxiv.org/abs/2107.06996 摘要:虽然许多现有的图神经网络(GNN)已被证明执行的是强制全局平滑的基于$\ell_2$的图平滑,但在这项工作中,我们的目标是通过基于$\ell_1$的图平滑进一步增强GNN的局部平滑自适应性。为此,我们引入了一类基于$\ell_1$和$\ell_2$图平滑的GNN(Elastic GNNs)。特别地,我们提出了一种新颖且通用的GNN消息传递方案。该消息传递算法不仅对反向传播训练友好,而且在理论收敛保证下达到预期的平滑性质。在半监督学习任务上的实验表明,所提出的Elastic GNNs在基准数据集上具有更好的自适应性,并对图对抗攻击具有显著的鲁棒性。Elastic GNNs的实现可从 https://github.com/lxiaorui/ElasticGNN 获得。 摘要:While many existing graph neural networks (GNNs) have been proven to perform $\ell_2$-based graph smoothing that enforces smoothness globally, in this work we aim to further enhance the local smoothness adaptivity of GNNs via $\ell_1$-based graph smoothing. As a result, we introduce a family of GNNs (Elastic GNNs) based on $\ell_1$ and $\ell_2$-based graph smoothing. In particular, we propose a novel and general message passing scheme into GNNs. This message passing algorithm is not only friendly to back-propagation training but also achieves the desired smoothing properties with a theoretical convergence guarantee. Experiments on semi-supervised learning tasks demonstrate that the proposed Elastic GNNs obtain better adaptivity on benchmark datasets and are significantly robust to graph adversarial attacks. The implementation of Elastic GNNs is available at https://github.com/lxiaorui/ElasticGNN.
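ℓ2 图平滑会把信号向邻居均值全局收缩,而 ℓ1 平滑通过对边差分做软阈值来保留陡峭的局部变化。下面用一条链图给出两类平滑基本构件的示意(仅演示两种范数的差异,并非论文中 Elastic GNN 的完整消息传递方案):

```python
import numpy as np

def l2_smooth(x, A, lam=0.5, steps=10):
    """ℓ2 图平滑的梯度步:x <- x - eta * L x,等价于向邻居均值收缩。
    A 为邻接矩阵,L 为组合拉普拉斯;步长按最大度做了保守缩放。"""
    deg = A.sum(axis=1)
    L = np.diag(deg) - A
    for _ in range(steps):
        x = x - lam * (L @ x) / deg.max()
    return x

def soft_threshold(z, t):
    """软阈值算子 prox_{t|.|}:ℓ1 平滑中处理边差分的基本构件,
    小差分被压为 0(局部抹平),大差分只缩小不消失(保留阶跃)。"""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

# 链图 0-1-2-3-4,节点信号在中间有一个阶跃
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1
x = np.array([0.0, 0.0, 0.0, 1.0, 1.0])
print(l2_smooth(x, A))                   # 阶跃被逐渐全局抹平
print(soft_threshold(np.diff(x), 0.5))   # 小差分归零,大差分保留
```

这正是摘要中"ℓ2 强制全局平滑、ℓ1 提供局部自适应"的直观含义:Elastic GNN 把两者结合进同一个消息传递方案。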

【49】 What underlies rapid learning and systematic generalization in humans 标题:人类快速学习和系统归纳的基础是什么?

作者:Andrew Joohun Nam,James L. McClelland 机构:Department of Psychology, Stanford University 备注:22 pages, 48 references, 6 Figures, and one Table, plus SI 链接:https://arxiv.org/abs/2107.06994 摘要:尽管神经网络取得了突破性的成功,但当代模型需要在海量数据集上进行大量训练,且样本外泛化能力较差。一种被提出的解决方案是在模型中内置系统性和特定领域的约束,呼应经典符号认知架构的原则。在本文中,我们通过考察成年人仅凭简短的教学指导和对错误作答的解释性反馈来学习一个抽象推理任务的能力,来审视这一方案的局限性,证明了人类的学习动态以及在训练样本范围之外进行泛化的能力与一个有代表性的神经网络模型截然不同,并且该模型对其作者未曾预料到的特征变化十分脆弱。我们从人类数据中给出了进一步的证据,表明持续解决这些谜题的能力与教育程度(尤其是基础数学教育)相关,也与能否给出可被可靠识别的、有效的策略描述相关。我们提出,人类的快速学习和系统泛化可能依赖于一个渐进的、依赖经验的"学会学习"过程,即利用指导和解释来引导构建支持可泛化推理的显式抽象规则。 摘要:Despite the groundbreaking successes of neural networks, contemporary models require extensive training with massive datasets and exhibit poor out-of-sample generalization. One proposed solution is to build systematicity and domain-specific constraints into the model, echoing the tenets of classical, symbolic cognitive architectures. In this paper, we consider the limitations of this approach by examining human adults' ability to learn an abstract reasoning task from a brief instructional tutorial and explanatory feedback for incorrect responses, demonstrating that human learning dynamics and ability to generalize outside the range of the training examples differ drastically from those of a representative neural network model, and that the model is brittle to changes in features not anticipated by its authors. We present further evidence from human data that the ability to consistently solve the puzzles was associated with education, particularly basic mathematics education, and with the ability to provide a reliably identifiable, valid description of the strategy used. We propose that rapid learning and systematic generalization in humans may depend on a gradual, experience-dependent process of learning-to-learn using instructions and explanations to guide the construction of explicit abstract rules that support generalizable inferences.

【50】 Annotation and Classification of Evidence and Reasoning Revisions in Argumentative Writing 标题:议论文写作中证据与推理修改的注解与分类

作者:Tazin Afrin,Elaine Wang,Diane Litman,Lindsay C. Matsumura,Richard Correnti 机构:Learning Research and Development Center, University of Pittsburgh, Pittsburgh, Pennsylvania 备注:10 pages, 11 tables, 15th Workshop on Innovative Use of NLP for Building Educational Applications 链接:https://arxiv.org/abs/2107.06990 摘要:自动写作评价系统能够提升学生的写作水平,前提是学生关注系统提供的反馈,并据此修改文章草稿。然而,现有针对此类系统中议论文修改的研究主要关注学生修改的类型(如表层修改与内容修改),而非修改在多大程度上回应了反馈并改进了文章。我们引入了一个标注方案(RER方案),用于刻画证据使用与推理方面句子级修订的性质,并将其应用于五、六年级学生的议论文。我们表明,可靠的人工标注是可以实现的,且修订标注与符合反馈方向的文章整体改进评估相关。此外,我们还探讨了按照该方案自动分类修订的可行性。 摘要:Automated writing evaluation systems can improve students' writing insofar as students attend to the feedback provided and revise their essay drafts in ways aligned with such feedback. Existing research on revision of argumentative writing in such systems, however, has focused on the types of revisions students make (e.g., surface vs. content) rather than the extent to which revisions actually respond to the feedback provided and improve the essay. We introduce an annotation scheme to capture the nature of sentence-level revisions of evidence use and reasoning (the `RER' scheme) and apply it to 5th- and 6th-grade students' argumentative essays. We show that reliable manual annotation can be achieved and that revision annotations correlate with a holistic assessment of essay improvement in line with the feedback provided. Furthermore, we explore the feasibility of automatically classifying revisions according to our scheme.

【51】 FetalNet: Multi-task deep learning framework for fetal ultrasound biometric measurements 标题:FetalNet:胎儿超声生物特征测量的多任务深度学习框架

作者:Szymon Płotka,Tomasz Włodarczyk,Adam Klasa,Michał Lipa,Arkadiusz Sitek,Tomasz Trzciński 机构: Sano Centre for Computational Medicine, Cracow, Poland, Warsaw University of Technology, Warsaw, Poland, Medical University of Warsaw, Warsaw, Poland, Fetai Health Ltd., Tooploox, Wroclaw, Poland 备注:Submitted to ICONIP 2021 链接:https://arxiv.org/abs/2107.06943 摘要:在本文中,我们提出了一种名为FetalNet的端到端多任务神经网络,带有注意力机制和堆叠模块,用于时空胎儿超声扫描视频分析。胎儿生物测量是孕期的一项标准检查,用于监测胎儿生长并估计胎龄和胎儿体重。胎儿超声扫描视频分析的主要目标是找到合适的标准切面,以测量胎儿的头部、腹部和股骨。由于超声数据中固有的高散斑噪声和阴影,找到合适的采集切面并对胎儿进行精确测量需要医学专业知识和超声检查经验。此外,现有的计算机辅助胎儿超声生物测量方法只处理单帧图像,未考虑时间特征。针对这些不足,我们提出了一种端到端多任务神经网络,用于时空超声扫描视频分析,同时对胎儿身体部位进行定位、分类和测量。我们提出了一种新的编码器-解码器分割架构,其中包含一个分类分支。此外,我们利用带堆叠模块的注意力机制学习显著图,以抑制无关的超声区域并实现高效的扫描切面定位。我们在来自700名不同患者常规检查的胎儿超声视频上进行了训练。我们的方法FetalNet在胎儿超声视频的分类和分割任务上均优于现有的最新方法。 摘要:In this paper, we propose an end-to-end multi-task neural network called FetalNet with an attention mechanism and stacked module for spatio-temporal fetal ultrasound scan video analysis. Fetal biometric measurement is a standard examination during pregnancy used for the fetus growth monitoring and estimation of gestational age and fetal weight. The main goal in fetal ultrasound scan video analysis is to find proper standard planes to measure the fetal head, abdomen and femur. Due to natural high speckle noise and shadows in ultrasound data, medical expertise and sonographic experience are required to find the appropriate acquisition plane and perform accurate measurements of the fetus. In addition, existing computer-aided methods for fetal US biometric measurement address only one single image frame without considering temporal features. To address these shortcomings, we propose an end-to-end multi-task neural network for spatio-temporal ultrasound scan video analysis to simultaneously localize, classify and measure the fetal body parts. We propose a new encoder-decoder segmentation architecture that incorporates a classification branch.
Additionally, we employ an attention mechanism with a stacked module to learn salient maps that suppress irrelevant US regions and enable efficient scan plane localization. We trained on fetal ultrasound videos coming from routine examinations of 700 different patients. Our method, called FetalNet, outperforms existing state-of-the-art methods in both classification and segmentation on fetal ultrasound video recordings.
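上述多任务思路(同一网络同时输出分割与分类)可以用一个加权联合损失来粗略示意。下面是一个极简的纯Python草图;其中的权重 w_seg、w_cls 以及损失的具体形式均为假设,并非论文给出的实现:

```python
import math

def cross_entropy(probs, target_idx):
    # 目标类别的负对数似然
    return -math.log(probs[target_idx])

def multitask_loss(seg_probs, seg_labels, cls_probs, cls_label,
                   w_seg=1.0, w_cls=1.0):
    # seg_probs: 每个像素的类别概率分布列表; seg_labels: 每个像素的真实类别
    # cls_probs: 整幅图像(标准切面类别)的概率分布
    seg_loss = sum(cross_entropy(p, t)
                   for p, t in zip(seg_probs, seg_labels)) / len(seg_labels)
    cls_loss = cross_entropy(cls_probs, cls_label)
    # 假设的加权和: 分割分支与分类分支共同驱动共享的编码器
    return w_seg * seg_loss + w_cls * cls_loss
```

这种联合目标的直觉是:分类分支迫使共享编码器识别当前帧是否为标准切面,而分割分支负责勾勒身体部位,两者互相提供正则化。
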

【52】 Mutually improved endoscopic image synthesis and landmark detection in unpaired image-to-image translation 标题:非配对图像到图像转换中相互改进的内窥镜图像合成和标志点检测

作者:Lalith Sharan,Gabriele Romano,Sven Koehler,Halvar Kelm,Matthias Karck,Raffaele De Simone,Sandy Engelhardt 机构:at Heidelberg University Hospital 备注:Submitted to IEEE JBHI 2021, 13 pages, 8 figures, 4 tables 链接:https://arxiv.org/abs/2107.06941 摘要:CycleGAN框架允许对非配对数据进行无监督的图像到图像转换。在物理手术模拟器上进行手术训练的场景中,该方法可用于将内窥镜下的体模图像转换为更接近同一手术目标结构术中外观的图像。这可以被视为一种新颖的增强现实方法,我们在先前的工作中将其命名为"超现实主义"(Hyperrealism)。在这一用例中,最重要的是使针、缝线或器械等对象在两个域中保持一致显示,同时将风格转换为更接近组织的外观。对这些对象进行分割可以实现直接迁移;然而,为这些部分细小、纤薄的前景对象勾勒轮廓既繁琐,也可能不准确。相反,我们建议对缝线穿入组织的位置点进行标志点检测。通过把预训练检测器模型的性能作为额外的优化目标,该目标被直接纳入CycleGAN框架。我们表明,基于这些稀疏标志点标签定义的任务提高了生成网络在两个域中合成的一致性。将基线CycleGAN架构与我们提出的扩展(DetCycleGAN)相比,平均精确率(PPV)提高了61.32,平均灵敏度(TPR)提高了37.91,平均F1得分提高了0.4743。此外,实验还表明,通过数据集融合,生成的术中图像可以作为检测网络本身的额外训练数据。数据已在AdaptOR MICCAI Challenge 2021范围内发布于 https://adaptor2021.github.io/ ,代码见 https://github.com/Cardio-AI/detcyclegan_pytorch 。 摘要:The CycleGAN framework allows for unsupervised image-to-image translation of unpaired data. In a scenario of surgical training on a physical surgical simulator, this method can be used to transform endoscopic images of phantoms into images which more closely resemble the intra-operative appearance of the same surgical target structure. This can be viewed as a novel augmented reality approach, which we coined Hyperrealism in previous work. In this use case, it is of paramount importance to display objects like needles, sutures or instruments consistent in both domains while altering the style to a more tissue-like appearance. Segmentation of these objects would allow for a direct transfer, however, contouring of these, partly tiny and thin foreground objects is cumbersome and perhaps inaccurate. Instead, we propose to use landmark detection on the points when sutures pass into the tissue. This objective is directly incorporated into a CycleGAN framework by treating the performance of pre-trained detector models as an additional optimization goal.
We show that a task defined on these sparse landmark labels improves consistency of synthesis by the generator network in both domains. Comparing a baseline CycleGAN architecture to our proposed extension (DetCycleGAN), mean precision (PPV) improved by 61.32, mean sensitivity (TPR) by 37.91, and mean F1 score by 0.4743. Furthermore, it could be shown that by dataset fusion, generated intra-operative images can be leveraged as additional training data for the detection network itself. The data is released within the scope of the AdaptOR MICCAI Challenge 2021 at https://adaptor2021.github.io/, and code at https://github.com/Cardio-AI/detcyclegan_pytorch.
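把冻结的标志点检测器当作额外优化目标的思路,可以用下面的纯Python草图示意。标志点损失取预测点与真实点之间的平均平方欧氏距离,生成器总目标把它与对抗项、循环一致性项加权求和;系数 lam_cyc、lam_det 为假设值,摘要中并未给出:

```python
def landmark_loss(pred_points, true_points):
    # 预测标志点与真实标志点之间的平均平方欧氏距离
    total = 0.0
    for (px, py), (tx, ty) in zip(pred_points, true_points):
        total += (px - tx) ** 2 + (py - ty) ** 2
    return total / len(pred_points)

def generator_objective(adv_loss, cycle_loss, det_loss,
                        lam_cyc=10.0, lam_det=1.0):
    # 生成器总目标: 对抗项 + 循环一致性项 + 冻结检测器给出的检测项
    # 系数 lam_cyc、lam_det 为示意用的假设值
    return adv_loss + lam_cyc * cycle_loss + lam_det * det_loss
```

关键设计在于:检测项只惩罚"合成图像让预训练检测器找不到缝线点"的情况,从而迫使生成器在改变风格的同时保持这些稀疏前景对象的位置一致。
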

【53】 Training Compact CNNs for Image Classification using Dynamic-coded Filter Fusion 标题:利用动态编码滤波器融合训练用于图像分类的紧凑CNN

作者:Mingbao Lin,Rongrong Ji,Bohong Chen,Fei Chao,Jianzhuang Liu,Wei Zeng,Yonghong Tian,Qi Tian 链接:https://arxiv.org/abs/2107.06916 摘要:滤波器剪枝的主流方法,通常要么对计算量巨大的预训练模型强加硬编码的重要性估计来挑选"重要"滤波器,要么在损失目标上施加对超参数敏感的稀疏约束来正则化网络训练。本文提出了一种新的滤波器剪枝方法——动态编码滤波器融合(DCFF),以计算经济且无需正则化的方式得到紧凑的CNN,实现高效的图像分类。DCFF首先借助温度参数为每个滤波器给出一个互相似度分布作为滤波器代理,在此基础上,提出一种基于Kullback-Leibler散度的动态编码准则来评估滤波器的重要性。与其他方法简单保留高分滤波器不同,我们提出了滤波器融合的概念,即把按所赋代理加权平均得到的滤波器作为保留滤波器。当温度参数趋于无穷大时,互相似度分布变为one-hot分布。因此,每个滤波器的相对重要性可随紧凑CNN的训练而变化,从而产生动态可变的融合滤波器,既不依赖预训练模型,也无需引入稀疏约束。在分类基准上的大量实验证明了DCFF的优越性。例如,我们的DCFF得到一个仅有72.77M FLOPs和1.06M参数的紧凑VGGNet-16,同时在CIFAR-10上达到93.47%的top-1准确率;得到的紧凑ResNet-50在FLOPs和参数量上分别减少63.8%和58.6%,在ILSVRC-2012上保持75.60%的top-1准确率。我们的代码、更窄的模型和训练日志可在 https://github.com/lmbxmu/DCFF 获取。 摘要:The mainstream approach for filter pruning is usually either to force a hard-coded importance estimation upon a computation-heavy pretrained model to select "important" filters, or to impose a hyperparameter-sensitive sparse constraint on the loss objective to regularize the network training. In this paper, we present a novel filter pruning method, dubbed dynamic-coded filter fusion (DCFF), to derive compact CNNs in a computation-economical and regularization-free manner for efficient image classification. Each filter in our DCFF is firstly given an inter-similarity distribution with a temperature parameter as a filter proxy, on top of which, a fresh Kullback-Leibler divergence based dynamic-coded criterion is proposed to evaluate the filter importance. In contrast to simply keeping high-score filters in other methods, we propose the concept of filter fusion, i.e., the weighted averages using the assigned proxies, as our preserved filters. We obtain a one-hot inter-similarity distribution as the temperature parameter approaches infinity.
Thus, the relative importance of each filter can vary along with the training of the compact CNN, leading to dynamically changeable fused filters without both the dependency on the pretrained model and the introduction of sparse constraints. Extensive experiments on classification benchmarks demonstrate the superiority of our DCFF over the compared counterparts. For example, our DCFF derives a compact VGGNet-16 with only 72.77M FLOPs and 1.06M parameters while reaching top-1 accuracy of 93.47% on CIFAR-10. A compact ResNet-50 is obtained with 63.8% FLOPs and 58.6% parameter reductions, retaining 75.60% top-1 accuracy on ILSVRC-2012. Our code, narrower models and training logs are available at https://github.com/lmbxmu/DCFF.
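"滤波器融合"这一核心操作——按温度控制的相似度分布对滤波器加权平均——可以用下面的纯Python草图示意。其中相似度得分 similarities 只是假设的输入;温度的参数化方式与摘要一致:温度越大分布越趋近one-hot,即退化为只保留单个滤波器:

```python
import math

def softmax_with_temperature(scores, t):
    # t 越大, 分布越趋近 one-hot(对应摘要中"温度趋于无穷得到one-hot分布")
    exps = [math.exp(s * t) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_filters(filters, similarities, t):
    # filters: 展平后的滤波器权重向量列表
    # similarities: 各滤波器对该代理的互相似度得分(此处为假设输入)
    weights = softmax_with_temperature(similarities, t)
    fused = [0.0] * len(filters[0])
    for w, f in zip(weights, filters):
        for i, v in enumerate(f):
            fused[i] += w * v
    return fused
```

由于权重在训练中随相似度分布更新,融合出的滤波器是动态可变的,这正是DCFF不需要预训练模型或稀疏约束的原因。
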

【54】 Multi-label Chaining with Imprecise Probabilities 标题:具有不精确概率的多标签链

作者:Yonatan Carlos Carranza Alarcón,Sébastien Destercke 机构:Sorbonne Universités, Université de Technologie de Compiègne 链接:https://arxiv.org/abs/2107.07443 摘要:我们提出了两种不同的策略,将经典的多标签链式方法扩展到处理不精确的概率估计。这些估计使用分布的凸集(即credal集)来描述不确定性,而非单一的精确分布。使用这类估计的主要动机有二:(1)当链式过程中检测到高度不确定性时做出谨慎预测(或干脆不做决定);(2)通过避免链式过程早期决策引入的偏差,做出更好的精确预测。借助朴素credal分类器,我们为这两种策略提出了具有理论依据的高效求解过程。我们在缺失标签上的实验考察了两种方法预测的可靠性,结果表明,在精确模型失效的那些难以预测的实例上,我们的方法表现出恰当的谨慎性。 摘要:We present two different strategies to extend the classical multi-label chaining approach to handle imprecise probability estimates. These estimates use convex sets of distributions (or credal sets) in order to describe our uncertainty rather than a precise one. The main reasons one could have for using such estimations are (1) to make cautious predictions (or no decision at all) when a high uncertainty is detected in the chaining and (2) to make better precise predictions by avoiding biases caused in early decisions in the chaining. Through the use of the naive credal classifier, we propose efficient procedures with theoretical justifications to solve both strategies. Our experimental results on missing labels, which investigate how reliable these predictions are in both approaches, indicate that our approaches produce relevant cautiousness on those hard-to-predict instances where the precise models fail.
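"高度不确定时弃权"的谨慎预测思想可以用概率区间做一个极简示意:每个标签的估计不是单一概率,而是一个区间(credal集在二分类下的退化形式);只有当整个区间落在决策阈值同一侧时才做出决定。以下为假设性的草图,函数名与接口均为示意,并非论文的实际算法:

```python
def cautious_label(lower, upper, threshold=0.5):
    # 仅当整个概率区间位于阈值同一侧时才做决定, 否则弃权(返回 None)
    if lower > threshold:
        return 1
    if upper < threshold:
        return 0
    return None  # 区间跨越阈值: 谨慎起见不做预测

def imprecise_chain(interval_estimators, x):
    # 按链式顺序逐个预测标签, 把前面的(可能弃权的)决定
    # 作为附加特征传给后续的区间估计器
    preds = []
    for estimate in interval_estimators:
        lower, upper = estimate(x, preds)
        preds.append(cautious_label(lower, upper))
    return preds
```

这也体现了摘要中的第二个动机:链条前端一旦弃权而不是强行二选一,其误差就不会作为"既成事实"传播给后续标签的估计。
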

【55】 Multiclass Permanent Magnets Superstructure for Indoor Localization using Artificial Intelligence 标题:基于人工智能的多级永磁上层建筑室内定位

作者:Amir Ivry,Elad Fisher,Roger Alimi,Idan Mosseri,Kanna Nahir 机构:Technion – Israel Institute of Technology, Haifa , Israel, Technology Division, Soreq NRC, Yavne , Israel, Department of Computer Science, Ben-Gurion University of the Negev, P.O.B. , Be’er Sheva, Israel 备注:None 链接:https://arxiv.org/abs/2107.07425 摘要:智能手机已成为室内定位和用户位置估计的流行工具。现有解决方案主要采用Wi-Fi、RFID和磁感应技术在拥挤场馆中跟踪运动。这些技术对磁杂波高度敏感,并依赖于局部环境磁场,性能因此经常下降。此外,它们通常需要对区域预先进行测绘,或依赖有源信标,而这些条件并不总能满足。我们在已知位置嵌入小体积、大磁矩的磁铁,并将它们排列成特定的几何星座,从而产生由受监督磁特征构成的磁性超结构图案。这些特征相对于移动的传感器载体构成了无歧义的磁环境。定位算法在训练阶段学习散布磁铁的独特图案,并在定位阶段从持续的数据流中检测它们。我们的贡献有两点:第一,我们部署的是无需供电的无源永磁体,与有源磁发射器形成对比;第二,我们基于智能手机的运动而非磁力计的静态放置进行定位。在我们先前的研究中,我们只考虑了单一超结构图案;本文提出该算法的扩展版本用于多超结构定位,覆盖了用户更大的定位区域。实验结果表明,使用人工智能可达到95%的定位准确率,平均定位误差小于1米。 摘要:Smartphones have become a popular tool for indoor localization and position estimation of users. Existing solutions mainly employ Wi-Fi, RFID, and magnetic sensing techniques to track movements in crowded venues. These are highly sensitive to magnetic clutters and depend on local ambient magnetic fields, which frequently degrades their performance. Also, these techniques often require pre-known mapping surveys of the area, or the presence of active beacons, which are not always available. We embed small-volume and large-moment magnets in pre-known locations and arrange them in specific geometric constellations that create magnetic superstructure patterns of supervised magnetic signatures. These signatures constitute an unambiguous magnetic environment with respect to the moving sensor carrier. The localization algorithm learns the unique patterns of the scattered magnets during training and detects them from the ongoing streaming of data during localization. Our contribution is twofold. First, we deploy passive permanent magnets that do not require a power supply, in contrast to active magnetic transmitters. Second, we perform localization based on smartphone motion rather than on static positioning of the magnetometer.
In our previous study, we considered a single superstructure pattern. Here, we present an extended version of that algorithm for multi-superstructure localization, which covers a broader localization area of the user. Experimental results demonstrate localization accuracy of 95% with a mean localization error of less than 1m using artificial intelligence.
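"从数据流中检测已学习的磁特征"这一步,可以用最简单的模板相关来示意:把滑动窗口信号与每个超结构的已存特征做归一化相关,取得分最高者。注意,论文实际使用的是学习得到的人工智能分类器;下面的纯Python草图只是一个说明性的替代方案,阈值 min_score 与特征形式均为假设:

```python
import math

def normalized_correlation(window, signature):
    # 滑动窗口信号与已存磁特征模板的归一化相关(余弦相似度)
    dot = sum(a * b for a, b in zip(window, signature))
    na = math.sqrt(sum(a * a for a in window))
    nb = math.sqrt(sum(b * b for b in signature))
    return dot / (na * nb)

def match_superstructure(window, signatures, min_score=0.8):
    # 返回与当前数据窗口最匹配的超结构名称; 相关度过低则视为未检测到
    best_name, best_score = None, -1.0
    for name, sig in signatures.items():
        score = normalized_correlation(window, sig)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_score else None
```

由于各超结构的几何星座被刻意设计为互不相同,其特征之间相关度低,这种"最近模板"式判别才能在多超结构场景下给出无歧义的结果。
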
