AI Academic Digest [12.20]

2021-12-22 17:14:10

cs.AI (Artificial Intelligence), 35 papers in total

【1】 An Online Data-Driven Emergency-Response Method for Autonomous Agents in Unforeseen Situations
Link: https://arxiv.org/abs/2112.09670

Authors: Glenn Maguire, Nicholas Ketz, Praveen Pilly, Jean-Baptiste Mouret
Affiliation: Inria, CNRS, Université de Lorraine; Center for Human-Machine Collaboration, Information and Systems Sciences Laboratory, HRL Laboratories
Abstract: Reinforcement learning agents perform well when presented with inputs within the distribution of those encountered during training. However, they are unable to respond effectively when faced with novel, out-of-distribution events, until they have undergone additional training. This paper presents an online, data-driven, emergency-response method that aims to provide autonomous agents with the ability to react to unexpected situations that are very different from those they have been trained or designed to address. In such situations, learned policies cannot be expected to perform appropriately, since the observations obtained in these novel situations would fall outside the distribution of inputs that the agent has been optimized to handle. The proposed approach devises a customized response to the unforeseen situation sequentially, by selecting actions that minimize the rate of increase of the reconstruction error from a variational autoencoder. This optimization is achieved online in a data-efficient manner (on the order of 30 data points) using a modified Bayesian optimization procedure. We demonstrate the potential of this approach in a simulated 3D car-driving scenario, in which the agent devises a response in under 2 seconds to avoid collisions with objects it has not seen during training.
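The action-selection rule described above lends itself to a compact sketch. The following minimal Python sketch replaces the paper's modified Bayesian optimization with an exhaustive search over a discrete candidate set, and assumes hypothetical callables predict_next_obs (a one-step observation model) and recon_error (the VAE reconstruction error); none of these names come from the paper.

    import numpy as np

    def select_emergency_action(obs, candidate_actions, predict_next_obs, recon_error):
        # Rate of increase of the VAE reconstruction error for each candidate
        # action; the emergency response picks the action that minimizes it.
        base = recon_error(obs)
        rates = [recon_error(predict_next_obs(obs, a)) - base
                 for a in candidate_actions]
        return candidate_actions[int(np.argmin(rates))]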

【2】 Deep Learning for Spatiotemporal Modeling of Urbanization
Link: https://arxiv.org/abs/2112.09668

Authors: Tang Li, Jing Gao, Xi Peng
Affiliation: University of Delaware, Newark, DE
Note: Accepted by the NeurIPS 2021 MLPH (Machine Learning in Public Health) Workshop; Best Paper Award at the same workshop
Abstract: Urbanization has a strong impact on the health and wellbeing of populations across the world. Predictive spatial modeling of urbanization therefore can be a useful tool for effective public health planning. Many spatial urbanization models have been developed using classic machine learning and numerical modeling techniques. However, deep learning, with its proven capacity to capture complex spatiotemporal phenomena, has not been applied to urbanization modeling. Here we explore the capacity of deep spatial learning for the predictive modeling of urbanization. We treat numerical geospatial data as images with pixels and channels, and enrich the dataset by augmentation, in order to leverage the high capacity of deep learning. Our resulting model can generate end-to-end multi-variable urbanization predictions, and outperforms a state-of-the-art classic machine learning urbanization model in preliminary comparisons.

【3】 Distillation of RL Policies with Formal Guarantees via Variational Abstraction of Markov Decision Processes (Technical Report)
Link: https://arxiv.org/abs/2112.09655

Authors: Florent Delgrange, Ann Nowé, Guillermo A. Pérez
Affiliation: AI Lab, Vrije Universiteit Brussel; University of Antwerp – Flanders Make
Note: Accepted at AAAI 2022; technical report including supplementary material (10 pages main text, 14 pages appendix)
Abstract: We consider the challenge of policy simplification and verification in the context of policies learned through reinforcement learning (RL) in continuous environments. In well-behaved settings, RL algorithms have convergence guarantees in the limit. While these guarantees are valuable, they are insufficient for safety-critical applications. Furthermore, they are lost when applying advanced techniques such as deep-RL. To recover guarantees when applying advanced RL algorithms to more complex environments with (i) reachability, (ii) safety-constrained reachability, or (iii) discounted-reward objectives, we build upon the DeepMDP framework introduced by Gelada et al. to derive new bisimulation bounds between the unknown environment and a learned discrete latent model of it. Our bisimulation bounds enable the application of formal methods for Markov decision processes. Finally, we show how one can use a policy obtained via state-of-the-art RL to efficiently train a variational autoencoder that yields a discrete latent model with provably approximately correct bisimulation guarantees. Additionally, we obtain a distilled version of the policy for the latent model.

【4】 Local contrastive loss with pseudo-label based self-training for semi-supervised medical image segmentation
Link: https://arxiv.org/abs/2112.09645

Authors: Krishna Chaitanya, Ertunc Erdil, Neerav Karani, Ender Konukoglu
Note: 13 pages, 4 figures, 7 tables. This article is under review at a journal.
Abstract: Supervised deep learning-based methods yield accurate results for medical image segmentation. However, they require large labeled datasets for this, and obtaining them is a laborious task that requires clinical expertise. Semi/self-supervised learning-based approaches address this limitation by exploiting unlabeled data along with limited annotated data. Recent self-supervised learning methods use contrastive loss to learn good global level representations from unlabeled images and achieve high performance in classification tasks on popular natural image datasets like ImageNet. In pixel-level prediction tasks such as segmentation, it is crucial to also learn good local level representations along with global representations to achieve better accuracy. However, the impact of the existing local contrastive loss-based methods remains limited for learning good local representations, because similar and dissimilar local regions are defined based on random augmentations and spatial proximity, not based on the semantic label of local regions, due to the lack of large-scale expert annotations in the semi/self-supervised setting. In this paper, we propose a local contrastive loss to learn good pixel-level features useful for segmentation by exploiting semantic label information obtained from pseudo-labels of unlabeled images alongside limited annotated images. In particular, we define the proposed loss to encourage similar representations for pixels that have the same pseudo-label/label, while being dissimilar to the representations of pixels with a different pseudo-label/label in the dataset. We perform pseudo-label based self-training and train the network by jointly optimizing the proposed contrastive loss on both labeled and unlabeled sets and the segmentation loss on only the limited labeled set. We evaluate on three public cardiac and prostate datasets and obtain high segmentation performance.
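A pseudo-label-driven pixel contrastive loss of the kind described above can be sketched in a few lines of PyTorch. This is an illustrative supervised-contrastive formulation over sampled pixel embeddings, not the authors' exact loss; the sampling size and temperature are arbitrary assumptions.

    import torch
    import torch.nn.functional as F

    def local_contrastive_loss(feats, labels, temperature=0.1, n_samples=256):
        # feats: (N, D) pixel embeddings sampled from the feature maps
        # labels: (N,) pseudo-labels (unlabeled images) or ground-truth labels
        idx = torch.randperm(feats.size(0), device=feats.device)[:n_samples]
        f = F.normalize(feats[idx], dim=1)
        y = labels[idx]
        n = f.size(0)
        sim = f @ f.t() / temperature                    # cosine similarities
        self_mask = torch.eye(n, dtype=torch.bool, device=f.device)
        sim = sim.masked_fill(self_mask, float('-inf')) # exclude self-pairs
        pos = (y.unsqueeze(0) == y.unsqueeze(1)) & ~self_mask
        log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        # Pull together pixels sharing a (pseudo-)label, push apart the rest.
        loss = -(log_prob.masked_fill(~pos, 0.0)).sum(1) / pos.sum(1).clamp(min=1)
        return loss.mean()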

【5】 Explanation as Question Answering based on Design Knowledge
Link: https://arxiv.org/abs/2112.09616

Authors: Ashok Goel, Vrinda Nandan, Eric Gregori, Sungeun An, Spencer Rugaber
Affiliation: Design Intelligence Laboratory, School of Interactive Computing, Georgia Institute of Technology
Note: 7 pages, 8 figures
Abstract: Explanation of an AI agent requires knowledge of its design and operation. An open question is how to identify, access and use this design knowledge for generating explanations. Many AI agents used in practice, such as intelligent tutoring systems fielded in educational contexts, typically come with a User Guide that explains what the agent does, how it works and how to use the agent. However, few humans actually read the User Guide in detail. Instead, most users seek answers to their questions on demand. In this paper, we describe a question answering agent (AskJill) that uses the User Guide for an interactive learning environment (VERA) to automatically answer questions and thereby explains the domain, functioning, and operation of VERA. We present a preliminary assessment of AskJill in VERA.

【6】 Global explainability in aligned image modalities
Link: https://arxiv.org/abs/2112.09591

Authors: Justin Engelmann, Amos Storkey, Miguel O. Bernabeu
Affiliation: UKRI CDT Biomedical AI, University of Edinburgh; School of Informatics; Usher Institute
Abstract: Deep learning (DL) models are very effective on many computer vision problems and increasingly used in critical applications. They are also inherently black box. A number of methods exist to generate image-wise explanations that allow practitioners to understand and verify model predictions for a given image. Beyond that, it would be desirable to validate that a DL model generally works in a sensible way, i.e. consistent with domain knowledge and not relying on undesirable data artefacts. For this purpose, the model needs to be explained globally. In this work, we focus on image modalities that are naturally aligned, such that each pixel position represents a similar relative position on the imaged object, as is common in medical imaging. We propose the pixel-wise aggregation of image-wise explanations as a simple method to obtain label-wise and overall global explanations. These can then be used for model validation, knowledge discovery, and as an efficient way to communicate qualitative conclusions drawn from inspecting image-wise explanations. We further propose Progressive Erasing Plus Progressive Restoration (PEPPR) as a method to quantitatively validate that these global explanations are faithful to how the model makes its predictions. We then apply these methods to ultra-widefield retinal images, a naturally aligned modality. We find that the global explanations are consistent with domain knowledge and faithfully reflect the model's workings.
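Since the modality is aligned, image-wise explanations share a common pixel grid and can be aggregated directly. A minimal sketch of the label-wise aggregation idea, assuming per-image saliency maps of identical shape (the input names are hypothetical):

    import numpy as np

    def global_explanations(saliency_maps, labels):
        # saliency_maps: (N, H, W) array of image-wise explanations
        # labels: (N,) array giving the label each explanation was computed for
        out = {}
        for lab in np.unique(labels):
            # Pixel-wise mean over all images of one label: a label-wise map.
            out[lab] = saliency_maps[labels == lab].mean(axis=0)
        return out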

【7】 cgSpan: Closed Graph-Based Substructure Pattern Mining
Link: https://arxiv.org/abs/2112.09573

Authors: Zevin Shaul, Sheikh Naaz
Affiliation: Informatica; University of Wisconsin–Madison
Abstract: gSpan is a popular algorithm for mining frequent subgraphs. cgSpan (closed graph-based substructure pattern mining) is a gSpan extension that only mines closed subgraphs. A subgraph g is closed in the graph database if there is no proper frequent supergraph of g that has equivalent occurrence with g. cgSpan adds the Early Termination pruning method to the gSpan pruning methods, while leaving the original gSpan steps unchanged. cgSpan also detects and handles cases in which Early Termination should not be applied. To the best of our knowledge, cgSpan is the first publicly available implementation for closed graph mining.

【8】 CPPE-5: Medical Personal Protective Equipment Dataset
Link: https://arxiv.org/abs/2112.09569

Authors: Rishit Dagli, Ali Mustufa Shaikh
Affiliation: High School, Narayana Junior College, Mumbai, India; Student Community Lead, Postman Inc
Note: 16 pages, 6 tables, 6 figures. Code and models are available at https://git.io/cppe5-dataset
Abstract: We present a new challenging dataset, CPPE-5 (Medical Personal Protective Equipment), with the goal to allow the study of subordinate categorization of medical personal protective equipment, which is not possible with other popular datasets that focus on broad level categories (such as PASCAL VOC, ImageNet, Microsoft COCO, OpenImages, etc). To make it easy for models trained on this dataset to be used in practical scenarios in complex scenes, our dataset mainly contains images that show complex scenes with several objects in each scene in their natural context. The image collection for this dataset focuses on obtaining as many non-iconic images as possible and making sure all the images are real-life images, unlike other existing datasets in this area. Our dataset includes 5 object categories (coveralls, face shield, gloves, mask, and goggles), and each image is annotated with a set of bounding boxes and positive labels. We present a detailed analysis of the dataset in comparison to other popular broad category datasets as well as datasets focusing on personal protective equipment; we also find that at present there exist no such publicly available datasets. Finally, we also analyze performance and compare model complexities on baseline and state-of-the-art models for bounding box results. Our code, data, and trained models are available at https://git.io/cppe5-dataset.

【9】 Symmetry-aware Neural Architecture for Embodied Visual Navigation
Link: https://arxiv.org/abs/2112.09515

Authors: Shuang Liu, Takayuki Okatani
Affiliation: Tohoku University; RIKEN Center for AIP
Abstract: Visual exploration is a task that seeks to visit all the navigable areas of an environment as quickly as possible. The existing methods employ deep reinforcement learning (RL) as the standard tool for the task. However, they tend to be vulnerable to statistical shifts between the training and test data, resulting in poor generalization over novel environments that are out-of-distribution (OOD) from the training data. In this paper, we attempt to improve the generalization ability by utilizing the inductive biases available for the task. Employing the active neural SLAM (ANS) that learns exploration policies with the advantage actor-critic (A2C) method as the base framework, we first point out that the mappings represented by the actor and the critic should satisfy specific symmetries. We then propose a network design for the actor and the critic to inherently attain these symmetries. Specifically, we use $G$-convolution instead of the standard convolution and insert the semi-global polar pooling (SGPP) layer, which we newly design in this study, in the last section of the critic network. Experimental results show that our method increases area coverage by $8.1 m^2$ when trained on the Gibson dataset and tested on the MP3D dataset, establishing the new state-of-the-art.
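The symmetry constraint is realized with group convolutions. As an illustration of the general idea only (the paper's exact architecture differs and additionally uses the SGPP layer), a minimal p4 lifting convolution can be built in plain PyTorch by applying the same kernel at four rotations:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class P4LiftingConv(nn.Module):
        # Applies one learned kernel at 0/90/180/270 degrees, so the output is
        # equivariant to 90-degree rotations of the input (bias omitted).
        def __init__(self, in_ch, out_ch, k=3):
            super().__init__()
            self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.1)

        def forward(self, x):                     # x: (B, C, H, W)
            outs = []
            for r in range(4):                    # rotate the kernel r * 90 deg
                w = torch.rot90(self.weight, r, dims=(2, 3))
                outs.append(F.conv2d(x, w, padding=self.weight.shape[-1] // 2))
            return torch.stack(outs, dim=2)       # (B, out_ch, 4, H, W)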

【10】 Learning Reward Machines: A Study in Partially Observable Reinforcement Learning
Link: https://arxiv.org/abs/2112.09477

Authors: Rodrigo Toro Icarte, Ethan Waldie, Toryn Q. Klassen, Richard Valenzano, Margarita P. Castro, Sheila A. McIlraith
Affiliation: Pontificia Universidad Católica de Chile, Vicuña Mackenna, Macul, RM, Chile; University of Toronto, Toronto, ON, Canada; Vector Institute, Toronto, ON, Canada; Ryerson University, Toronto, ON, Canada
Abstract: Reinforcement learning (RL) is a central problem in artificial intelligence. This problem consists of defining artificial agents that can learn optimal behaviour by interacting with an environment -- where the optimal behaviour is defined with respect to a reward signal that the agent seeks to maximize. Reward machines (RMs) provide a structured, automata-based representation of a reward function that enables an RL agent to decompose an RL problem into structured subproblems that can be efficiently learned via off-policy learning. Here we show that RMs can be learned from experience, instead of being specified by the user, and that the resulting problem decomposition can be used to effectively solve partially observable RL problems. We pose the task of learning RMs as a discrete optimization problem where the objective is to find an RM that decomposes the problem into a set of subproblems such that the combination of their optimal memoryless policies is an optimal policy for the original problem. We show the effectiveness of this approach on three partially observable domains, where it significantly outperforms A3C, PPO, and ACER, and discuss its advantages, limitations, and broader potential.
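A reward machine itself is a small data structure: a finite automaton whose transitions are driven by high-level events and emit rewards. A minimal sketch of that structure, not of the paper's learning algorithm (real RMs label transitions with propositional formulas over detected events; atomic event strings are used here for brevity):

    class RewardMachine:
        def __init__(self, n_states, transitions, rewards, u0=0):
            self.u = u0                    # current RM state
            self.n_states = n_states
            self.delta = transitions       # dict: (u, event) -> next state u'
            self.rho = rewards             # dict: (u, event) -> reward

        def step(self, event):
            # Emit the reward for the taken transition, then move RM state.
            r = self.rho.get((self.u, event), 0.0)
            self.u = self.delta.get((self.u, event), self.u)
            return r

    # Toy example: reward 1.0 only for 'door' after 'key' has been seen.
    rm = RewardMachine(2, {(0, 'key'): 1}, {(1, 'door'): 1.0})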

【11】 Towards fuzzification of adaptation rules in self-adaptive architectures
Link: https://arxiv.org/abs/2112.09468

Authors: Tomáš Bureš, Petr Hnětynka, Martin Kruliš, Danylo Khalyeyev, Sebastian Hahner, Stephan Seifermann, Maximilian Walter, Robert Heinrich
Affiliation: Charles University, Prague, Czech Republic; Karlsruhe Institute of Technology (KIT), Germany
Abstract: In this paper, we focus on exploiting neural networks for the analysis and planning stage in self-adaptive architectures. The studied motivating cases in the paper involve existing (legacy) self-adaptive architectures and their adaptation logic, which has been specified by logical rules. We further assume that there is a need to endow these systems with the ability to learn based on examples of inputs and expected outputs. One simple option to address such a need is to replace the reasoning based on logical rules with a neural network. However, this step brings several problems that often create at least a temporary regress. The reason is the logical rules typically represent a large and tested body of domain knowledge, which may be lost if the logical rules are replaced by a neural network. Further, the black-box nature of generic neural networks obfuscates how the systems work inside and consequently introduces more uncertainty. In this paper, we present a method that makes it possible to endow existing self-adaptive architectures with the ability to learn using neural networks, while preserving domain knowledge existing in the logical rules. We introduce a continuum between the existing rule-based system and a system based on a generic neural network. We show how to navigate in this continuum and create a neural network architecture that naturally embeds the original logical rules, and how to gradually scale the learning potential of the network, thus controlling the uncertainty inherent to all soft computing models. We showcase and evaluate the approach on representative excerpts from two larger real-life use cases.

【12】 Contrastive Explanations for Comparing Preferences of Reinforcement Learning Agents
Link: https://arxiv.org/abs/2112.09462

Authors: Jasmina Gajcin, Rahul Nair, Tejaswini Pedapati, Radu Marinescu, Elizabeth Daly, Ivana Dusparic
Affiliation: Trinity College Dublin; IBM Ireland
Note: 7 pages, 3 figures
Abstract: In complex tasks where the reward function is not straightforward and consists of a set of objectives, multiple reinforcement learning (RL) policies that perform the task adequately but employ different strategies can be trained by adjusting the impact of individual objectives on the reward function. Understanding the differences in strategies between policies is necessary to enable users to choose between offered policies, and can help developers understand different behaviors that emerge from various reward functions and training hyperparameters in RL systems. In this work, we compare the behavior of two policies trained on the same task, but with different preferences in objectives. We propose a method for distinguishing between differences in behavior that stem from different abilities and those that are a consequence of opposing preferences of two RL agents. Furthermore, we use only data on preference-based differences in order to generate contrasting explanations about agents' preferences. Finally, we test and evaluate our approach on an autonomous driving task and compare the behavior of a safety-oriented policy and one that prefers speed.

【13】 Weakly Supervised Semantic Segmentation via Alternative Self-Dual Teaching
Link: https://arxiv.org/abs/2112.09459

Authors: Dingwen Zhang, Wenyuan Zeng, Guangyu Guo, Chaowei Fang, Lechao Cheng, Junwei Han
Affiliation: The Brain and Artificial Intelligence Laboratory, Northwestern Polytechnical University, Xi'an, China; Xidian University, Xi'an, China; Zhejiang Lab, Hangzhou, China
Abstract: Current weakly supervised semantic segmentation (WSSS) frameworks usually contain a separated mask-refinement model and a main semantic region mining model. These approaches contain redundant feature extraction backbones and biased learning objectives, making them computationally complex yet sub-optimal for addressing the WSSS task. To solve this problem, this paper establishes a compact learning framework that embeds the classification and mask-refinement components into a unified deep model. With the shared feature extraction backbone, our model is able to facilitate knowledge sharing between the two components while preserving a low computational complexity. To encourage high-quality knowledge interaction, we propose a novel alternative self-dual teaching (ASDT) mechanism. Unlike the conventional distillation strategy, the knowledge of the two teacher branches in our model is alternatively distilled to the student branch by a Pulse Width Modulation (PWM), which generates a PW wave-like selection signal to guide the knowledge distillation process. In this way, the student branch can help prevent the model from falling into local minimum solutions caused by the imperfect knowledge provided by either teacher branch. Comprehensive experiments on PASCAL VOC 2012 and COCO-Stuff 10K demonstrate the effectiveness of the proposed alternative self-dual teaching mechanism as well as the new state-of-the-art performance of our approach.
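The alternating selection signal can be sketched as a square (PWM-like) wave that decides, at each training step, which teacher branch is distilled into the student. The duty cycle, period, and KL-based distillation loss below are illustrative assumptions, not the paper's exact schedule:

    import torch.nn.functional as F

    def pwm_select(step, period=100, duty=0.5):
        # Square-wave selection signal: teacher 0 for the first part of each
        # period, teacher 1 for the remainder.
        return 0 if (step % period) < duty * period else 1

    def distillation_loss(student_logits, teacher_logits, T=2.0):
        # Standard temperature-scaled KL distillation objective.
        p_teacher = F.softmax(teacher_logits / T, dim=1)
        log_p_student = F.log_softmax(student_logits / T, dim=1)
        return F.kl_div(log_p_student, p_teacher, reduction='batchmean') * T * T

    # Inside a training loop one would pick: teacher = teachers[pwm_select(step)]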

【14】 Visual Learning-based Planning for Continuous High-Dimensional POMDPs
Link: https://arxiv.org/abs/2112.09456

Authors: Sampada Deglurkar, Michael H. Lim, Johnathan Tucker, Zachary N. Sunberg, Aleksandra Faust, Claire J. Tomlin
Affiliation: Department of Electrical Engineering and Computer Sciences, UC Berkeley; Department of Aerospace Engineering Sciences, CU Boulder; Google Research
Abstract: The Partially Observable Markov Decision Process (POMDP) is a powerful framework for capturing decision-making problems that involve state and transition uncertainty. However, most current POMDP planners cannot effectively handle the very high-dimensional observations they often encounter in the real world (e.g. image observations in robotic domains). In this work, we propose Visual Tree Search (VTS), a learning and planning procedure that combines generative models learned offline with online model-based POMDP planning. VTS bridges offline model training and online planning by utilizing a set of deep generative observation models to predict and evaluate the likelihood of image observations in a Monte Carlo tree search planner. We show that VTS is robust to different observation noises and, since it utilizes online, model-based planning, can adapt to different reward structures without the need to re-train. This new approach outperforms a baseline state-of-the-art on-policy planning algorithm while using significantly less offline training time.

【15】 ML Supported Predictions for SAT Solvers Performance
Link: https://arxiv.org/abs/2112.09438

Authors: A.-M. Leventi-Peetz, Jörg-Volker Peetz, Martina Rohde
Affiliation: Federal Office for Information Security, Bonn, Germany
Abstract: In order to classify the indeterministic termination behavior of the open source SAT solver CryptoMiniSat in multi-threading mode while processing hard-to-solve boolean satisfiability problem instances, internal solver runtime parameters have been collected and analyzed. A subset of these parameters has been selected and employed as a feature vector to successfully create a machine learning model for the binary classification of the solver's termination behavior for any single new solving run of a not-yet-solved instance. The model can be used for the early estimation of a solving attempt as belonging or not belonging to the class of candidates with good chances for a fast termination. In this context, a combination of active profiles of runtime characteristics appears to mirror the influence of the solver's momentary heuristics on the immediate quality of the solver's resolution process. Because the runtime parameters of just the first two solving iterations are enough to forecast termination of the attempt with good success scores, the results of the present work deliver a promising basis which can be further developed in order to enrich CryptoMiniSat, or generally any modern SAT solver, with AI abilities.
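The described setup amounts to standard supervised binary classification over solver-runtime feature vectors. A sketch with scikit-learn, where the feature files and the random-forest model choice are placeholders rather than the paper's pipeline:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    # Hypothetical inputs: per-run feature vectors built from runtime
    # parameters of the first two solving iterations, plus binary labels.
    X = np.load('runtime_features.npy')   # shape (n_runs, n_features)
    y = np.load('terminated_fast.npy')    # shape (n_runs,)

    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    print(cross_val_score(clf, X, y, scoring='f1', cv=5).mean())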

【16】 Can Machine Learning Tools Support the Identification of Sustainable Design Leads From Product Reviews? Opportunities and Challenges
Link: https://arxiv.org/abs/2112.09391

Authors: Michael Saidani, Harrison Kim, Bernard Yannou
Affiliation: Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign, Illinois, USA; Laboratoire Genie Industriel, CentraleSupélec, Université Paris-Saclay, Gif-sur-Yvette, France
Note: ASME 2021 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference, Aug 2021, Virtual, United States
Abstract: The increasing number of product reviews posted online is a gold mine for designers to know better about the products they develop, by capturing the voice of customers, and to improve these products accordingly. In the meantime, product design and development have an essential role in creating a more sustainable future. With the recent advance of artificial intelligence techniques in the field of natural language processing, this research aims to develop an integrated machine learning solution to obtain sustainable design insights from online product reviews automatically. In this paper, the opportunities and challenges offered by existing frameworks - including Python libraries, packages, as well as state-of-the-art algorithms like BERT - are discussed, illustrated, and positioned along an ad hoc machine learning process. This contribution discusses the opportunities to reach and the challenges to address for building a machine learning pipeline, in order to get insights from product reviews to design more sustainable products, covering the five following stages, from the identification of sustainability-related reviews to the interpretation of sustainable design leads: data collection, data formatting, model training, model evaluation, and model deployment. Examples of sustainable design insights that can be produced out of product review mining and processing are given. Finally, promising lines for future research in the field are provided, including case studies putting standard products in parallel with their sustainable alternatives, to compare the features valued by customers and ultimately to generate relevant sustainable design leads.

【17】 Full Transformer Framework for Robust Point Cloud Registration with Deep Information Interaction
Link: https://arxiv.org/abs/2112.09385

Authors: Guangyan Chen, Meiling Wang, Yufeng Yue, Qingxiang Zhang, Li Yuan
Affiliation: Beijing Institute of Technology; National University of Singapore
Note: 10 pages, 7 figures
Abstract: Recent Transformer-based methods have achieved advanced performance in point cloud registration by utilizing the Transformer's advantages in order-invariance and dependency modeling to aggregate information. However, they still suffer from indistinct feature extraction, sensitivity to noise, and outliers. The reasons are: (1) the adoption of CNNs fails to model global relations due to their local receptive fields, resulting in extracted features susceptible to noise; (2) the shallow-wide architecture of Transformers and lack of positional encoding lead to indistinct feature extraction due to inefficient information interaction; (3) the omission of geometrical compatibility leads to inaccurate classification between inliers and outliers. To address the above limitations, a novel full Transformer network for point cloud registration is proposed, named the Deep Interaction Transformer (DIT), which incorporates: (1) a Point Cloud Structure Extractor (PSE) to model global relations and retrieve structural information with Transformer encoders; (2) a deep-narrow Point Feature Transformer (PFT) to facilitate deep information interaction across two point clouds with positional encoding, such that Transformers can establish comprehensive associations and directly learn the relative position between points; (3) a Geometric Matching-based Correspondence Confidence Evaluation (GMCCE) method to measure spatial consistency and estimate inlier confidence by designing the triangulated descriptor. Extensive experiments on clean, noisy, and partially overlapping point cloud registration demonstrate that our method outperforms state-of-the-art methods.

【18】 WebGPT: Browser-assisted question-answering with human feedback
Link: https://arxiv.org/abs/2112.09332

Authors: Reiichiro Nakano, Jacob Hilton, Suchir Balaji, Jeff Wu, Long Ouyang, Christina Kim, Christopher Hesse, Shantanu Jain, Vineet Kosaraju, William Saunders, Xu Jiang, Karl Cobbe, Tyna Eloundou, Gretchen Krueger, Kevin Button, Matthew Knight, Benjamin Chess, John Schulman
Affiliation: OpenAI
Note: 30 pages
Abstract: We fine-tune GPT-3 to answer long-form questions using a text-based web-browsing environment, which allows the model to search and navigate the web. By setting up the task so that it can be performed by humans, we are able to train models on the task using imitation learning, and then optimize answer quality with human feedback. To make human evaluation of factual accuracy easier, models must collect references while browsing in support of their answers. We train and evaluate our models on ELI5, a dataset of questions asked by Reddit users. Our best model is obtained by fine-tuning GPT-3 using behavior cloning, and then performing rejection sampling against a reward model trained to predict human preferences. This model's answers are preferred by humans 56% of the time to those of our human demonstrators, and 69% of the time to the highest-voted answer from Reddit.
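Rejection sampling against a reward model, as used for the best model, reduces to best-of-n selection. A minimal sketch with hypothetical generate and reward_model callables (these names are stand-ins, not OpenAI's actual interfaces):

    def best_of_n(question, generate, reward_model, n=16):
        # Sample n candidate answers, keep the one the reward model
        # scores highest.
        candidates = [generate(question) for _ in range(n)]
        return max(candidates, key=lambda ans: reward_model(question, ans))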

【19】 Dilemma of the Artificial Intelligence Regulatory Landscape
Link: https://arxiv.org/abs/2112.09325

Authors: Weiyue Wu, Shaoshan Liu
Affiliation: PerceptIn
Abstract: As a startup company in the autonomous driving space, we have undergone four years of painful experiences dealing with a broad spectrum of regulatory requirements. Compared to the software industry norm of spending 13% of the overall budget on compliance, we were forced to spend 42% of our budget on compliance. Our situation is not unique and, in a way, reflects the dilemma of the artificial intelligence (AI) regulatory landscape. The root cause is the lack of AI expertise in the legislative and executive branches, leading to a lack of standardization for the industry to follow. In this article, we share our first-hand experiences and advocate for the establishment of an FDA-like agency to regulate AI properly.

【20】 Optimal discharge of patients from intensive care via a data-driven policy learning framework
Link: https://arxiv.org/abs/2112.09315

Authors: Fernando Lejarza, Jacob Calvert, Misty M Attwood, Daniel Evans, Qingqing Mao
Affiliation: Dascena, Inc., Houston, TX, USA; McKetta Department of Chemical Engineering, The University of Texas at Austin, Austin, TX
Abstract: Clinical decision support tools rooted in machine learning and optimization can provide significant value to healthcare providers, including through better management of intensive care units. In particular, it is important that the patient discharge task addresses the nuanced trade-off between decreasing a patient's length of stay (and associated hospitalization costs) and the risk of readmission or even death following the discharge decision. This work introduces an end-to-end general framework for capturing this trade-off to recommend optimal discharge timing decisions given a patient's electronic health records. A data-driven approach is used to derive a parsimonious, discrete state space representation that captures a patient's physiological condition. Based on this model and a given cost function, an infinite-horizon discounted Markov decision process is formulated and solved numerically to compute an optimal discharge policy, whose value is assessed using off-policy evaluation strategies. Extensive numerical experiments are performed to validate the proposed framework using real-life intensive care unit patient data.
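Given the discrete state space and cost function, the discounted infinite-horizon MDP can be solved with textbook value iteration. A generic sketch, where the transition tensor P and reward tensor R stand in for the paper's patient model:

    import numpy as np

    def value_iteration(P, R, gamma=0.99, tol=1e-8):
        # P: (A, S, S) transition probabilities; R: (A, S) expected rewards.
        A, S, _ = P.shape
        V = np.zeros(S)
        while True:
            Q = R + gamma * P @ V              # (A, S) one-step lookahead
            V_new = Q.max(axis=0)
            if np.max(np.abs(V_new - V)) < tol:
                # Greedy policy (action per state) and converged values.
                return Q.argmax(axis=0), V_new
            V = V_new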

【21】 Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages
Link: https://arxiv.org/abs/2112.09301

Authors: Thomas Mandl, Sandip Modha, Gautam Kishore Shahi, Hiren Madhu, Shrey Satapara, Prasenjit Majumder, Johannes Schaefer, Tharindu Ranasinghe, Marcos Zampieri, Durgesh Nandini, Amit Kumar Jaiswal
Affiliation: University of Hildesheim, Germany; LDRP-ITR, Gandhinagar, India; University of Duisburg-Essen, Germany; Indian Institute of Science, Bangalore, India; DA-IICT, Gandhinagar, India; University of Wolverhampton, United Kingdom; Rochester Institute of Technology, USA
Abstract: The widespread presence of offensive content online, such as hate speech, poses a growing societal problem. AI tools are necessary for supporting the moderation process at online platforms. For the evaluation of these identification tools, continuous experimentation with datasets in different languages is necessary. The HASOC track (Hate Speech and Offensive Content Identification) is dedicated to developing benchmark data for this purpose. This paper presents the HASOC subtrack for English, Hindi, and Marathi. The dataset was assembled from Twitter. This subtrack has two sub-tasks. Task A is a binary classification problem (Hate and Not Offensive) offered for all three languages. Task B is a fine-grained classification problem for three classes (HATE hate speech, OFFENSIVE, and PROFANITY) offered for English and Hindi. Overall, 652 runs were submitted by 65 teams. The best classification algorithms for Task A achieve F1 measures of 0.91, 0.78, and 0.83 for Marathi, Hindi, and English, respectively. This overview presents the tasks and the data development as well as the detailed results. The systems submitted to the competition applied a variety of technologies. The best-performing algorithms were mainly variants of transformer architectures.

【22】 PeopleSansPeople: A Synthetic Data Generator for Human-Centric Computer Vision
Link: https://arxiv.org/abs/2112.09290

Authors: Salehe Erfanian Ebadi, You-Cyuan Jhang, Alex Zook, Saurav Dhakad, Adam Crespi, Pete Parisi, Steven Borkman, Jonathan Hogins, Sujoy Ganguly
Affiliation: Unity Technologies
Note: PeopleSansPeople template Unity environment, benchmark binaries, and source code are available at this https URL
Abstract: In recent years, person detection and human pose estimation have made great strides, helped by large-scale labeled datasets. However, these datasets had no guarantees or analysis of human activities, poses, or context diversity. Additionally, privacy, legal, safety, and ethical concerns may limit the ability to collect more human data. An emerging alternative to real-world data that alleviates some of these issues is synthetic data. However, creation of synthetic data generators is incredibly challenging and prevents researchers from exploring their usefulness. Therefore, we release a human-centric synthetic data generator, PeopleSansPeople, which contains simulation-ready 3D human assets, a parameterized lighting and camera system, and generates 2D and 3D bounding box, instance and semantic segmentation, and COCO pose labels. Using PeopleSansPeople, we performed benchmark synthetic data training using a Detectron2 Keypoint R-CNN variant [1]. We found that pre-training a network using synthetic data and fine-tuning on target real-world data (few-shot transfer to limited subsets of COCO-person train [2]) resulted in a keypoint AP of $60.37 \pm 0.48$ (COCO test-dev2017), outperforming models trained with the same real data alone (keypoint AP of $55.80$) and pre-trained with ImageNet (keypoint AP of $57.50$). This freely available data generator should enable a wide range of research into the emerging field of simulation-to-real transfer learning in the critical area of human-centric computer vision.

【23】 Neural Architectures for Biological Inter-Sentence Relation Extraction
Link: https://arxiv.org/abs/2112.09288

Authors: Enrique Noriega-Atala, Peter M. Lovett, Clayton T. Morrison, Mihai Surdeanu
Affiliation: The University of Arizona, Tucson, AZ, USA
Note: Accepted at the Scientific Document Understanding workshop at AAAI'22
Abstract: We introduce a family of deep-learning architectures for inter-sentence relation extraction, i.e., relations where the participants are not necessarily in the same sentence. We apply these architectures to an important use case in the biomedical domain: assigning biological context to biochemical events. In this work, biological context is defined as the type of biological system within which the biochemical event is observed. The neural architectures encode and aggregate multiple occurrences of the same candidate context mention to determine whether it is the correct context for a particular event mention. We propose two broad types of architectures: the first type aggregates multiple instances that correspond to the same candidate context with respect to an event mention before emitting a classification; the second type independently classifies each instance and uses the results to vote for the final class, akin to an ensemble approach. Our experiments show that the proposed neural classifiers are competitive and some achieve better performance than the previous state-of-the-art traditional machine learning methods, without the need for feature engineering. Our analysis shows that the neural methods particularly improve precision compared to traditional machine learning classifiers, and also demonstrates how the difficulty of inter-sentence relation extraction increases as the distance between the event and context mentions increases.

【24】 Link-Intensive Alignment for Incomplete Knowledge Graphs
Link: https://arxiv.org/abs/2112.09266

Authors: Vinh Van Tong, Thanh Trung Huynh, Thanh Tam Nguyen, Hongzhi Yin, Quoc Viet Hung Nguyen, Quyet Thang Huynh
Affiliation: Hanoi University of Science and Technology, Vietnam; Griffith University, Australia; The University of Queensland, Australia
Abstract: Knowledge graph (KG) alignment - the task of recognizing entities referring to the same thing in different KGs - is recognized as one of the most important operations in the field of KG construction and completion. However, existing alignment techniques often assume that the input KGs are complete and isomorphic, which is not true due to the real-world heterogeneity in domain, size, and sparsity. In this work, we address the problem of aligning incomplete KGs with representation learning. Our KG embedding framework exploits two feature channels: transitivity-based and proximity-based. The former captures the consistency constraints between entities via translation paths, while the latter captures the neighbourhood structure of KGs via an attention-guided relation-aware graph neural network. The two feature channels are jointly learned to exchange important features between the input KGs while enforcing the output representations of the input KGs in the same embedding space. Also, we develop a missing-link detector that discovers and recovers the missing links in the input KGs during the training process, which helps mitigate the incompleteness issue and thus improve the compatibility of the learned representations. The embeddings then are fused to generate the alignment result, and the high-confidence matched node pairs are updated to the pre-aligned supervision data to improve the embeddings gradually. Empirical results show that our model is up to 15.2% more accurate than the SOTA and is robust against different levels of incompleteness. We also demonstrate that the knowledge exchanged between the KGs helps reveal unseen facts from the knowledge graphs (a.k.a. knowledge completion), with the result being 3.5% higher than the SOTA knowledge graph completion techniques.

【25】 Confidence-Aware Subject-to-Subject Transfer Learning for Brain-Computer Interface
Link: https://arxiv.org/abs/2112.09243

Authors: Dong-Kyun Han, Serkan Musellim, Dong-Young Kim
Affiliation: Dept. Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea; Dept. Artificial Intelligence
Note: Submitted to the 2022 10th IEEE International Winter Conference on Brain-Computer Interface
Abstract: The inter/intra-subject variability of electroencephalography (EEG) makes the practical use of the brain-computer interface (BCI) difficult. In general, a BCI system requires a calibration procedure to tune the model every time the system is used. This problem is recognized as a major obstacle to BCI, and to overcome it, approaches based on transfer learning (TL) have recently emerged. However, many BCI paradigms are limited in that they consist of a structure that shows labels first and then measures "imagery"; the negative effects of source subjects whose data do not contain control signals have been ignored in many cases of the subject-to-subject TL process. The main purpose of this paper is to propose a method of excluding subjects that are expected to have a negative impact on subject-to-subject TL training, which generally uses data from as many subjects as possible. In this paper, we propose a BCI framework using only high-confidence subjects for TL training. In our framework, a deep neural network selects useful subjects for the TL process and excludes noisy subjects, using a co-teaching algorithm based on the small-loss trick. We experimented with leave-one-subject-out validation on two public datasets (2020 international BCI competition track 4 and the OpenBMI dataset). Our experimental results show that confidence-aware TL, which selects subjects with small-loss instances, improves the generalization performance of BCI.

【26】 Two-view Graph Neural Networks for Knowledge Graph Completion
Link: https://arxiv.org/abs/2112.09231

Authors: Vinh Tong, Dai Quoc Nguyen, Dinh Phung, Dat Quoc Nguyen
Affiliation: VinAI Research, Vietnam; Oracle Labs, Australia; Monash University, Australia
Abstract: In this paper, we introduce a novel GNN-based knowledge graph embedding model, named WGE, to capture entity-focused graph structure and relation-focused graph structure. In particular, given the knowledge graph, WGE builds a single undirected entity-focused graph that views entities as nodes. In addition, WGE also constructs another single undirected graph from relation-focused constraints, which views entities and relations as nodes. WGE then proposes a new architecture that utilizes two vanilla GNNs directly on these two single graphs to better update the vector representations of entities and relations, followed by a weighted score function to return the triple scores. Experimental results show that WGE obtains state-of-the-art performance on the three new and challenging CoDEx benchmark datasets for knowledge graph completion.

【27】 Sim2Real Docs: Domain Randomization for Documents in Natural Scenes using Ray-traced Rendering
Link: https://arxiv.org/abs/2112.09220

Authors: Nikhil Maddikunta, Huijun Zhao, Sumit Keswani, Alfy Samuel, Fu-Ming Guo, Nishan Srishankar, Vishwa Pardeshi, Austin Huang
Affiliation: Fidelity Investments, Artificial Intelligence Center of Excellence
Note: Accepted to the NeurIPS 2021 Data-Centric AI (DCAI) Workshop
Abstract: In the past, computer vision systems for digitized documents could rely on systematically captured, high-quality scans. Today, transactions involving digital documents are more likely to start as mobile phone photo uploads taken by non-professionals. As such, computer vision for document automation must now account for documents captured in natural scene contexts. An additional challenge is that task objectives for document processing can be highly use-case specific, which makes publicly available datasets limited in their utility, while manual data labeling is also costly and poorly translates between use cases. To address these issues we created Sim2Real Docs - a framework for synthesizing datasets and performing domain randomization of documents in natural scenes. Sim2Real Docs enables programmatic 3D rendering of documents using Blender, an open source tool for 3D modeling and ray-traced rendering. By using rendering that simulates physical interactions of light, geometry, camera, and background, we synthesize datasets of documents in a natural scene context. Each render is paired with use-case specific ground truth data specifying latent characteristics of interest, producing unlimited fit-for-task training data. The role of machine learning models is then to solve the inverse problem posed by the rendering pipeline. Such models can be further iterated upon with real-world data by either fine-tuning or making adjustments to domain randomization parameters.

【28】 Hyperbolic Disentangled Representation for Fine-Grained Aspect Extraction
Link: https://arxiv.org/abs/2112.09215

Authors: Chang-You Tai, Ming-Yao Li, Lun-Wei Ku
Affiliation: Academia Sinica, Taipei, Taiwan
Abstract: Automatic identification of salient aspects from user reviews is especially useful for opinion analysis. There has been significant progress in utilizing weakly supervised approaches, which require only a small set of seed words for training aspect classifiers. However, there is always room for improvement. First, no weakly supervised approaches fully utilize the latent hierarchies between words. Second, each seed word's representation should have different latent semantics and be distinct when it represents a different aspect. In this paper, we propose HDAE, a hyperbolic disentangled aspect extractor in which a hyperbolic aspect classifier captures words' latent hierarchies, and an aspect-disentangled representation models the distinct latent semantics of each seed word. Compared to previous baselines, HDAE achieves average F1 performance gains of 18.2% and 24.1% on the Amazon product review and restaurant review datasets, respectively. In addition, the embedding visualization demonstrates that HDAE is a more effective approach to leveraging seed words. An ablation study and a case study further attest to the effectiveness of the proposed components.

【29】 Semantic-Based Few-Shot Learning by Interactive Psychometric Testing
Link: https://arxiv.org/abs/2112.09201

Authors: Lu Yin, Vlado Menkovski, Yulong Pei, Mykola Pechenizkiy
Affiliation: Eindhoven University of Technology, Eindhoven, The Netherlands
Note: Accepted by the AAAI 2022 Workshop on Interactive Machine Learning (IML@AAAI22)
Abstract: Few-shot classification tasks aim to classify images in query sets based on only a few labeled examples in support sets. Most studies usually assume that each image in a task has a single and unique class association. Under these assumptions, these algorithms may not be able to identify the proper class assignment when there is no exact matching between support and query classes. For example, an agent may be given a few images of lions, bikes, and apples and asked to classify a tiger. However, in a more general setting, we could consider the higher-level concept of large carnivores to match the tiger to the lion for semantic classification. Existing studies rarely consider this situation due to the incompatibility of label-based supervision with complex concept relationships. In this work, we advance few-shot learning towards this more challenging scenario, semantic-based few-shot learning, and propose a method to address the paradigm by capturing the inner semantic relationships using interactive psychometric learning. We evaluate our method on the CIFAR-100 dataset. The results show the merits of our proposed method.

【30】 Verification of Neural-Network Control Systems by Integrating Taylor Models and Zonotopes Link: https://arxiv.org/abs/2112.09197

Authors: Christian Schilling, Marcelo Forets, Sebastian Guadalupe Affiliations: Aalborg University, Denmark; Universidad de la República, Uruguay Note: Accepted at AAAI-22 Abstract: We study the verification problem for closed-loop dynamical systems with neural-network controllers (NNCS). This problem is commonly reduced to computing the set of reachable states. When considering dynamical systems and neural networks in isolation, there exist precise approaches for this task based on set representations called Taylor models and zonotopes, respectively. However, combining these approaches for NNCS is non-trivial because, when converting between the set representations, dependency information is lost in each control cycle, and the accumulated approximation error quickly renders the result useless. We present an algorithm to chain approaches based on Taylor models and zonotopes, yielding a precise reachability algorithm for NNCS. Because the algorithm only acts at the interface of the isolated approaches, it is applicable to general dynamical systems and neural networks and can benefit from future advances in these areas. Our implementation delivers state-of-the-art performance and is the first to successfully analyze all benchmark problems of an annual reachability competition for NNCS.
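To make the set representations concrete: a zonotope is a center plus generator vectors; affine maps are exact on it, while converting to an interval box over-approximates and drops exactly the dependency information the abstract mentions. A minimal sketch (illustrative, not the authors' implementation):

    import numpy as np

    class Zonotope:
        # Set of points: center + G @ a, for all a in [-1, 1]^k.
        def __init__(self, center, generators):
            self.c = np.asarray(center, dtype=float)
            self.G = np.asarray(generators, dtype=float)  # shape (dim, k)

        def affine(self, A, b):
            # Exact image under x -> A x + b (no over-approximation).
            return Zonotope(A @ self.c + b, A @ self.G)

        def interval_hull(self):
            # Over-approximating box: dependencies between dimensions are dropped.
            radius = np.abs(self.G).sum(axis=1)
            return self.c - radius, self.c + radius

    Z = Zonotope([0.0, 0.0], [[1.0, 1.0], [1.0, -1.0]])  # a square rotated 45 degrees
    A = np.array([[0.5, -0.2], [0.1, 0.9]])              # one linear control step
    lo, hi = Z.affine(A, np.zeros(2)).interval_hull()
    print(lo, hi)  # repeating such hull conversions each cycle inflates the error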

【31】 Causal Modeling With Infinitely Many Variables Link: https://arxiv.org/abs/2112.09171

Authors: Spencer Peters, Joseph Y. Halpern Affiliations: Cornell University Abstract: Structural-equations models (SEMs) are perhaps the most commonly used framework for modeling causality. However, as we show, naively extending this framework to infinitely many variables, which is necessary, for example, to model dynamical systems, runs into several problems. We introduce GSEMs (generalized SEMs), a flexible generalization of SEMs that directly specifies the results of interventions, in which (1) systems of differential equations can be represented in a natural and intuitive manner, (2) certain natural situations, which cannot be represented by SEMs at all, can be represented easily, and (3) the definition of actual causality in SEMs carries over essentially without change.
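A minimal sketch of a finite SEM and an intervention, the machinery GSEMs generalize (illustrative only; note that this evaluation relies on a finite topological order, which is one of the things that breaks with infinitely many variables, e.g., one variable per real time point):

    # Structural equations: each variable is computed from previously solved ones.
    equations = {
        "rain": lambda v: 1,                         # exogenous: it rains
        "sprinkler": lambda v: 0 if v["rain"] else 1,
        "wet_grass": lambda v: v["rain"] or v["sprinkler"],
    }
    order = ["rain", "sprinkler", "wet_grass"]       # a topological order

    def solve(eqs):
        values = {}
        for name in order:
            values[name] = eqs[name](values)
        return values

    def do(eqs, var, value):
        # Intervention: replace the structural equation with a constant.
        new_eqs = dict(eqs)
        new_eqs[var] = lambda v: value
        return new_eqs

    print(solve(equations))                 # observational outcome
    print(solve(do(equations, "rain", 0)))  # outcome under do(rain = 0)

GSEMs sidestep this equation-solving step by specifying the outcomes of interventions directly, which is what lets them handle differential-equation systems naturally.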

【32】 On Optimizing Interventions in Shared Autonomy Link: https://arxiv.org/abs/2112.09169

Authors: Weihao Tan, David Koleczek, Siddhant Pradhan, Nicholas Perello, Vivek Chettiar, Vishal Rohra, Aaslesha Rajaram, Soundararajan Srinivasan, H M Sajjad Hossain, Yash Chandak Affiliations: University of Massachusetts Amherst, Microsoft, MassMutual Data Science Note: Accepted by AAAI 2022 Abstract: Shared autonomy refers to approaches for enabling an autonomous agent to collaborate with a human with the aim of improving human performance. However, besides improving performance, it may often also be beneficial for the agent to concurrently account for preserving the user's experience or satisfaction with the collaboration. In order to address this additional goal, we examine approaches for improving the user experience by constraining the number of interventions by the autonomous agent. We propose two model-free reinforcement learning methods that can account for both hard and soft constraints on the number of interventions. We show that our method not only outperforms the existing baseline but also eliminates the need to manually tune a black-box hyperparameter for controlling the level of assistance. We also provide an in-depth analysis of intervention scenarios to further illuminate understanding of the system.
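A minimal sketch of the hard-constraint flavor described above, assuming a hypothetical advantage estimate and a fixed intervention budget (not the authors' RL method):

    import random

    def assistive_step(human_action, agent_action, advantage, budget,
                       threshold=0.5):
        # advantage: estimated value gain of agent_action over human_action
        # budget:    hard cap on how many more times the agent may intervene
        if budget > 0 and advantage > threshold:
            return agent_action, budget - 1   # intervene
        return human_action, budget           # defer to the human

    budget = 3
    for t in range(10):
        adv = random.uniform(0.0, 1.0)        # stand-in for a learned estimate
        action, budget = assistive_step("human", "agent", adv, budget)
        print(t, action, "budget left:", budget)

A soft-constraint variant would instead subtract a per-intervention penalty from the reward, letting the learner trade assistance against user autonomy rather than enforcing a fixed cap.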

【33】 High Fidelity Visualization of What Your Self-Supervised Representation Knows About Link: https://arxiv.org/abs/2112.09164

Authors: Florian Bordes, Randall Balestriero, Pascal Vincent Affiliations: Meta AI; Mila, Universite de Montreal; Canada CIFAR AI Chair Abstract: Discovering what is learned by neural networks remains a challenge. In self-supervised learning, classification is the most common task used to evaluate how good a representation is. However, relying only on such a downstream task can limit our understanding of how much information is retained in the representation of a given input. In this work, we showcase the use of a conditional diffusion-based generative model (RCDM) to visualize representations learned with self-supervised models. We further demonstrate that this model's generation quality is on par with state-of-the-art generative models while being faithful to the representation used as conditioning. Using this new tool to analyze self-supervised models, we show visually that i) SSL (backbone) representations are not really invariant to many of the data augmentations they were trained on; ii) SSL projector embeddings appear too invariant for tasks like classification; iii) SSL representations are more robust to small adversarial perturbations of their inputs; and iv) there is an inherent structure learned with SSL models that can be used for image manipulation.
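A minimal probe for claim (i), assuming a hypothetical stand-in encoder for a pretrained SSL backbone (the paper's RCDM renders such differences as images rather than as similarity scores):

    import torch

    torch.manual_seed(0)
    encoder = torch.nn.Sequential(             # stand-in for a pretrained SSL backbone
        torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 128))

    def horizontal_flip(x):
        return torch.flip(x, dims=[3])          # flip the width axis, a typical SSL augmentation

    x = torch.rand(1, 3, 32, 32)
    z, z_aug = encoder(x), encoder(horizontal_flip(x))

    cos = torch.nn.functional.cosine_similarity(z, z_aug).item()
    print(f"cosine similarity under augmentation: {cos:.3f}")
    # Values well below 1.0 indicate the representation still encodes the
    # augmentation, i.e., the non-invariance that RCDM makes visible as images.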

【34】 An Empirical Investigation of the Role of Pre-training in Lifelong Learning Link: https://arxiv.org/abs/2112.09153

Authors: Sanket Vaibhav Mehta, Darshan Patil, Sarath Chandar, Emma Strubell Affiliations: Carnegie Mellon University; Mila - Quebec AI Institute; University of Montreal; École Polytechnique de Montréal; Canada CIFAR AI Chair Note: 30 pages Abstract: The lifelong learning paradigm in machine learning is an attractive alternative to the more prominent isolated learning scheme, not only due to its resemblance to biological learning but also due to its potential to reduce energy waste by obviating excessive model re-training. A key challenge to this paradigm is the phenomenon of catastrophic forgetting. With the increasing popularity and success of pre-trained models in machine learning, we pose the question: What role does pre-training play in lifelong learning, specifically with respect to catastrophic forgetting? We investigate existing methods in the context of large, pre-trained models and evaluate their performance on a variety of text and image classification tasks, including a large-scale study using a novel dataset of 15 diverse NLP tasks. Across all settings, we observe that generic pre-training implicitly alleviates the effects of catastrophic forgetting when learning multiple tasks sequentially, compared to randomly initialized models. We then further investigate why pre-training alleviates forgetting in this setting. We study this phenomenon by analyzing the loss landscape, finding that pre-trained weights appear to ease forgetting by leading to wider minima. Based on this insight, we propose jointly optimizing for the current task loss and loss basin sharpness in order to explicitly encourage wider basins during sequential fine-tuning. We show that this optimization approach leads to performance comparable to the state of the art in task-sequential continual learning across multiple settings, without retaining a memory that scales in size with the number of tasks.
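One plausible instantiation of jointly penalizing loss-basin sharpness is a sharpness-aware minimization step: perturb the weights toward higher loss, then update using the gradient taken at the perturbed point. A minimal sketch (an assumed instantiation, not necessarily the paper's exact procedure):

    import torch

    torch.manual_seed(0)
    model = torch.nn.Linear(10, 2)
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    loss_fn = torch.nn.CrossEntropyLoss()
    x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
    rho = 0.05  # radius of the sharpness neighborhood

    for step in range(5):
        # 1) Gradient at the current weights.
        loss = loss_fn(model(x), y)
        opt.zero_grad(); loss.backward()
        grads = [p.grad.detach().clone() for p in model.parameters()]
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        # 2) Step to the (approximate) worst point in the rho-ball.
        with torch.no_grad():
            for p, g in zip(model.parameters(), grads):
                p.add_(rho * g / (norm + 1e-12))
        # 3) Gradient at the perturbed weights drives the actual update.
        loss_sharp = loss_fn(model(x), y)
        opt.zero_grad(); loss_sharp.backward()
        with torch.no_grad():
            for p, g in zip(model.parameters(), grads):
                p.sub_(rho * g / (norm + 1e-12))   # undo the perturbation
        opt.step()
        print(step, float(loss))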

【35】 ASC-Net: Unsupervised Medical Anomaly Segmentation Using an Adversarial-based Selective Cutting Network Link: https://arxiv.org/abs/2112.09135

Authors: Raunak Dey, Wenbo Sun, Haibo Xu, Yi Hong Affiliations: Department of Computer Science, University of Georgia; Department of Radiology, Zhongnan Hospital of Wuhan University; Department of Computer Science and Engineering, Shanghai Jiao Tong University Note: Currently in submission to the Medical Image Analysis journal. Extension of DOI 10.1007/978-3-030-87240-3_23 with more details, experiments, and in-depth analysis. arXiv admin note: substantial text overlap with arXiv:2103.03664 Abstract: In this paper we consider the problem of unsupervised anomaly segmentation in medical images, which has attracted increasing attention in recent years due to the expense of pixel-level annotation by experts and the existence of a large amount of unannotated normal and abnormal image scans. We introduce a segmentation network that utilizes adversarial learning to partition an image into two cuts, with one of them falling into a reference distribution provided by the user. This Adversarial-based Selective Cutting network (ASC-Net) bridges the two domains of cluster-based deep segmentation and adversarial-based anomaly/novelty detection algorithms. Our ASC-Net learns from normal and abnormal medical scans to segment anomalies in medical scans without any masks for supervision. We evaluate this unsupervised anomaly segmentation model on three public datasets, i.e., BraTS 2019 for brain tumor segmentation, LiTS for liver lesion segmentation, and MS-SEG 2015 for brain lesion segmentation, as well as on a private dataset for brain tumor segmentation. Compared to existing methods, our model demonstrates tremendous performance gains in unsupervised anomaly segmentation tasks. Although there is still room to improve performance further compared to supervised learning algorithms, the promising experimental results and interesting observations shed light on building an unsupervised learning algorithm for medical anomaly identification using user-defined knowledge.
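A heavily simplified sketch of the adversarial selective-cutting idea (illustrative only, not the authors' architecture; random tensors stand in for real scans): a segmenter proposes a soft mask splitting each scan into two cuts, and a discriminator pushes one cut toward the user-provided reference distribution.

    import torch

    torch.manual_seed(0)
    seg = torch.nn.Sequential(torch.nn.Conv2d(1, 1, 3, padding=1), torch.nn.Sigmoid())
    disc = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(32 * 32, 1))
    opt_s = torch.optim.Adam(seg.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
    bce = torch.nn.BCEWithLogitsLoss()

    scan = torch.rand(8, 1, 32, 32)        # unannotated scans (stand-in data)
    normal_ref = torch.rand(8, 1, 32, 32)  # user-provided reference distribution

    for step in range(3):
        mask = seg(scan)                   # soft "cut": 1 = normal region
        normal_cut = scan * mask           # cut that should match the reference
        # Discriminator: reference is real (1), masked cut is fake (0).
        d_loss = bce(disc(normal_ref), torch.ones(8, 1)) + \
                 bce(disc(normal_cut.detach()), torch.zeros(8, 1))
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()
        # Segmenter: make the normal cut indistinguishable from the reference.
        s_loss = bce(disc(normal_cut), torch.ones(8, 1))
        opt_s.zero_grad(); s_loss.backward(); opt_s.step()
        print(step, float(d_loss), float(s_loss))
    # Anomalies end up in the complementary cut, scan * (1 - mask).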

