cs.AI人工智能,共计28篇
【1】 Neural Fixed-Point Acceleration for Convex Optimization 标题:凸优化的神经不动点加速算法
作者:Shobha Venkataraman,Brandon Amos 机构:Facebook AI 备注:AutoML@ICML2021 链接:https://arxiv.org/abs/2107.10254 摘要:定点迭代是数值计算的核心,在实时应用中常常是计算瓶颈,而实时应用通常需要中等精度的快速解。经典的不动点问题加速方法侧重于设计具有适用于任何不动点问题的理论保证的算法。利用元学习和经典加速算法的思想,我们提出了一个神经不动点加速框架,该框架可以自动学习加速从分布中提取的凸不动点问题。我们将我们的框架应用于SCS(最先进的凸锥规划求解器),并设计模型和损失函数,以克服学习展开优化和加速不稳定性的挑战。我们的工作将神经加速引入到任何可以用CVXPY表示的优化问题中。本文的源代码可以在https://github.com/facebookresearch/neural-scs 摘要:Fixed-point iterations are at the heart of numerical computing and are often a computational bottleneck in real-time applications, which typically instead need a fast solution of moderate accuracy. Classical acceleration methods for fixed-point problems focus on designing algorithms with theoretical guarantees that apply to any fixed-point problem. We present neural fixed-point acceleration, a framework to automatically learn to accelerate convex fixed-point problems that are drawn from a distribution, using ideas from meta-learning and classical acceleration algorithms. We apply our framework to SCS, the state-of-the-art solver for convex cone programming, and design models and loss functions to overcome the challenges of learning over unrolled optimization and acceleration instabilities. Our work brings neural acceleration into any optimization problem expressible with CVXPY. The source code behind this paper is available at https://github.com/facebookresearch/neural-scs
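下面给出一个极简的示意性代码(仅基于摘要中的思路,并非论文或 neural-scs 仓库的实现;其中的 AccelMLP 结构、历史窗口大小 k 等均为本文的假设),用来说明"用一个可学习模块根据最近若干次迭代与残差提出下一个迭代点"这一神经不动点加速的基本框架。
```python
# 示意:经典不动点迭代 + 学习型加速模块(假设性的极简版本,非论文实现)
import torch
import torch.nn as nn

def f(x):
    # 一个简单的压缩映射示例:不动点 x* 满足 x = cos(x)
    return torch.cos(x)

class AccelMLP(nn.Module):
    """根据最近 k 个迭代点与残差,预测对 f(x) 的修正量(结构为假设)。"""
    def __init__(self, dim, k=3, hidden=64):
        super().__init__()
        self.k = k
        self.net = nn.Sequential(
            nn.Linear(2 * k * dim, hidden), nn.ReLU(), nn.Linear(hidden, dim))

    def forward(self, xs, residuals):
        # xs, residuals:各含 k 个形状为 (dim,) 的张量
        return self.net(torch.cat(xs + residuals))

def accelerated_iteration(x0, model, steps=20):
    xs, rs = [x0] * model.k, [f(x0) - x0] * model.k
    x = x0
    for _ in range(steps):
        fx = f(x)
        xs = xs[1:] + [x]
        rs = rs[1:] + [fx - x]
        # 学习模块在 f(x) 的基础上给出修正后的下一个迭代点
        x = fx + model(xs, rs)
    return x, torch.norm(f(x) - x)  # 返回迭代点与不动点残差

dim = 4
model = AccelMLP(dim)
x_final, res = accelerated_iteration(torch.zeros(dim), model)
print(float(res))  # 若按展开迭代最小化残差来训练 model,残差应更快下降
```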
【2】 Demonstration-Guided Reinforcement Learning with Learned Skills 标题:带学习技能的示范引导式强化学习
作者:Karl Pertsch,Youngwoon Lee,Yue Wu,Joseph J. Lim 机构:University of Southern California 链接:https://arxiv.org/abs/2107.10253 摘要:示范引导强化学习(RL)是一种利用奖励反馈和一组目标任务示范来学习复杂行为的有效方法。先前的演示引导RL方法将每个新任务视为一个独立的学习问题,并尝试一步一步地跟随所提供的演示,类似于人类试图通过跟随演示者的精确肌肉运动来模仿完全看不见的行为。当然,这样的学习会很慢,但新的行为往往不是完全看不见的:它们与我们以前学过的行为共享子任务。在这项工作中,我们的目标是利用这种共享的子任务结构来提高演示引导RL的效率。我们首先从跨多个任务收集的大量离线经验数据集中学习一组可重用的技能。然后,我们提出了基于技能的示范学习(SkiLD),这是一种示范引导RL算法,它通过遵循示范技能而不是原始动作来有效地利用所提供的示范,从而比以前的示范引导RL方法有显著的性能改进。在长视距迷宫导航和复杂机器人操作任务中验证了该方法的有效性。 摘要:Demonstration-guided reinforcement learning (RL) is a promising approach for learning complex behaviors by leveraging both reward feedback and a set of target task demonstrations. Prior approaches for demonstration-guided RL treat every new task as an independent learning problem and attempt to follow the provided demonstrations step-by-step, akin to a human trying to imitate a completely unseen behavior by following the demonstrator's exact muscle movements. Naturally, such learning will be slow, but often new behaviors are not completely unseen: they share subtasks with behaviors we have previously learned. In this work, we aim to exploit this shared subtask structure to increase the efficiency of demonstration-guided RL. We first learn a set of reusable skills from large offline datasets of prior experience collected across many tasks. We then propose Skill-based Learning with Demonstrations (SkiLD), an algorithm for demonstration-guided RL that efficiently leverages the provided demonstrations by following the demonstrated skills instead of the primitive actions, resulting in substantial performance improvements over prior demonstration-guided RL approaches. We validate the effectiveness of our approach on long-horizon maze navigation and complex robot manipulation tasks.
【3】 Bridging the Gap between Spatial and Spectral Domains: A Theoretical Framework for Graph Neural Networks 标题:弥合空间域和谱域之间的鸿沟:一个图神经网络的理论框架
作者:Zhiqian Chen,Fanglan Chen,Lei Zhang,Taoran Ji,Kaiqun Fu,Liang Zhao,Feng Chen,Lingfei Wu,Charu Aggarwal,Chang-Tien Lu 机构:Department of Computer Science and Engineering, Mississippi State University 链接:https://arxiv.org/abs/2107.10234 摘要:在过去的十年中,深度学习的性能在各种机器学习任务中得到了广泛的认可,从图像分类、语音识别到自然语言理解。图神经网络(GNN)是一种深度学习,它利用传统深度学习技术难以解决的图结构数据来处理非欧氏问题。大多数GNN是使用各种过程创建的,包括随机游走、PageRank、图卷积和热扩散,因此无法进行直接比较。以前的研究主要集中在将现有模型划分为不同的类别,而很少研究它们的内部关系。这项研究提出了一个统一的理论框架和一个新的视角,可以将现有的GNN方法整合到我们的框架中。我们调查和分类现有的GNN模型分为空间域和光谱域,以及显示之间的联系在每个领域的子类别。进一步的研究揭示了这些域的空间、光谱和子群之间的密切关系。 摘要:During the past decade, deep learning's performance has been widely recognized in a variety of machine learning tasks, ranging from image classification, speech recognition to natural language understanding. Graph neural networks (GNN) are a type of deep learning that is designed to handle non-Euclidean issues using graph-structured data that are difficult to solve with traditional deep learning techniques. The majority of GNNs were created using a variety of processes, including random walk, PageRank, graph convolution, and heat diffusion, making direct comparisons impossible. Previous studies have primarily focused on classifying current models into distinct categories, with little investigation of their internal relationships. This research proposes a unified theoretical framework and a novel perspective that can methodologically integrate existing GNN into our framework. We survey and categorize existing GNN models into spatial and spectral domains, as well as show linkages between subcategories within each domain. Further investigation reveals a strong relationship between the spatial, spectral, and subgroups of these domains.
【4】 Distribution of Classification Margins: Are All Data Equal? 标题:分类边距的分布:所有数据都相等吗?
作者:Andrzej Banburski,Fernanda De La Torre,Nishka Pant,Ishana Shastri,Tomaso Poggio 机构: 2Brown University 备注:Previously online as CBMM Memo 115 on the CBMM MIT site 链接:https://arxiv.org/abs/2107.10199 摘要:最近的理论结果表明,在指数损失函数下,深度神经网络的梯度下降使分类裕度局部最大,这相当于在裕度约束下使权重矩阵的范数最小化。然而,解的这一性质并不能完全描述其泛化性能。我们从理论上证明了训练集上边缘分布曲线下的面积实际上是一个很好的泛化度量。然后,我们证明,在实现数据分离后,可以动态地将训练集减少99%以上,而不会显著降低性能。有趣的是,得到的“高容量”特征子集在不同的训练运行中并不一致,这与理论上的观点一致,即在SGD下,以及在存在批量归一化和权重衰减的情况下,所有训练点都应收敛到相同的渐近边界。 摘要:Recent theoretical results show that gradient descent on deep neural networks under exponential loss functions locally maximizes classification margin, which is equivalent to minimizing the norm of the weight matrices under margin constraints. This property of the solution however does not fully characterize the generalization performance. We motivate theoretically and show empirically that the area under the curve of the margin distribution on the training set is in fact a good measure of generalization. We then show that, after data separation is achieved, it is possible to dynamically reduce the training set by more than 99% without significant loss of performance. Interestingly, the resulting subset of "high capacity" features is not consistent across different training runs, which is consistent with the theoretical claim that all training points should converge to the same asymptotic margin under SGD and in the presence of both batch normalization and weight decay.
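摘要中"训练集上间隔(margin)分布曲线下的面积"这一泛化度量可以用如下极简代码示意。这里间隔取"正确类得分减去最高其他类得分"这一常见定义,未做论文中可能采用的范数归一化,仅作假设性演示。
```python
import numpy as np

def classification_margins(logits, labels):
    """间隔 = 正确类 logit - 其余类中的最大 logit(常见定义,未按论文做归一化)。"""
    correct = logits[np.arange(len(labels)), labels]
    masked = logits.copy()
    masked[np.arange(len(labels)), labels] = -np.inf
    return correct - masked.max(axis=1)

def margin_distribution_auc(margins):
    """把间隔从小到大排序,画出"样本比例 vs 间隔"曲线并取其下方面积,作为度量的示意。"""
    m = np.sort(margins)
    x = np.linspace(0.0, 1.0, len(m))
    return np.trapz(m, x)

rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 10))
labels = rng.integers(0, 10, size=1000)
print(margin_distribution_auc(classification_margins(logits, labels)))
```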
【5】 Answer-Set Programs for Reasoning about Counterfactual Interventions and Responsibility Scores for Classification 标题:关于反事实干预和分类责任得分推理的答案集程序
作者:Leopoldo Bertossi,Gabriela Reyes 机构:Universidad Adolfo Ib´a˜nez, and, Millennium Inst. for Foundational Research on Data (IMFD), Santiago, Chile 备注:Extended version with appendices of conference submission (under review). arXiv admin note: text overlap with arXiv:2106.10562 链接:https://arxiv.org/abs/2107.10159 摘要:我们描述了如何使用答案集程序声明性地指定对分类下实体的反事实干预,以及它们的原因。特别是,它们可以用来定义和计算责任分数,作为分类模型结果的基于归因的解释。该方法允许包含领域知识并支持查询应答。给出了一个朴素贝叶斯分类器的详细实例。 摘要:We describe how answer-set programs can be used to declaratively specify counterfactual interventions on entities under classification, and reason about them. In particular, they can be used to define and compute responsibility scores as attribution-based explanations for outcomes from classification models. The approach allows for the inclusion of domain knowledge and supports query answering. A detailed example with a naive-Bayes classifier is presented.
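摘要提到用反事实干预为分类结果计算基于归因的责任得分(responsibility score)。下面是一个与答案集程序无关的 Python 示意:特征取二值、分类器为任意黑盒、责任得分采用常见的 1/(1+额外改动数) 形式,这些设定均为本文假设,并非论文定义的完整复现。
```python
import itertools

def responsibility_scores(classify, x):
    """对二值特征向量 x 逐特征计算一个简化的责任得分:
    若翻转特征 i(可能还需同时翻转 k 个其他特征)能改变分类结果,
    则得分为 1/(1+k);否则为 0。"""
    base = classify(x)
    n = len(x)
    scores = [0.0] * n
    for i in range(n):
        done = False
        for k in range(n):  # k 为除 i 之外额外翻转的特征数
            for extra in itertools.combinations([j for j in range(n) if j != i], k):
                y = list(x)
                for j in (i,) + extra:
                    y[j] = 1 - y[j]
                if classify(y) != base:
                    scores[i] = 1.0 / (1 + k)
                    done = True
                    break
            if done:
                break
    return scores

# 一个玩具"分类器":至少两个特征为 1 时输出 1
classify = lambda v: int(sum(v) >= 2)
print(responsibility_scores(classify, [1, 1, 0]))  # 翻转任一为 1 的特征即可改变结果
```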
【6】 A Deep Reinforcement Learning Approach for Fair Traffic Signal Control 标题:一种用于公平交通信号控制的深度强化学习方法
作者:Majid Raeis,Alberto Leon-Garcia 机构:UniversityofToronto 备注:7 pages, Accepted at ITSC 2021 (International Conference on Intelligent Transportation Systems) 链接:https://arxiv.org/abs/2107.10146 摘要:交通信号控制是城市交通管理中最有效的方法之一。近年来,基于深度强化学习(DRL)的交通控制方法因其对实时交通数据的挖掘能力而受到广泛关注,而传统的手工方法往往使用较少。最近的基于DRL的方法主要集中在最大化车辆的吞吐量或最小化车辆的平均行驶时间,而交通信号控制器的公平性常常被忽略。这一点尤其重要,因为忽略公平性可能导致某些车辆经历极端等待时间,或者特定交通流的吞吐量受到交叉口另一冲突流量波动的高度影响。为了解决这些问题,我们引入了两个公平性的概念:基于延迟的公平性和基于吞吐量的公平性。此外,我们还提出了两种基于DRL的交通信号控制方法来实现这些公平性概念,这两种方法都可以获得较高的吞吐量。我们使用三种流量到达分布来评估我们提出的方法的性能,发现我们的方法在测试场景中的性能优于基线。 摘要:Traffic signal control is one of the most effective methods of traffic management in urban areas. In recent years, traffic control methods based on deep reinforcement learning (DRL) have gained attention due to their ability to exploit real-time traffic data, which is often poorly used by the traditional hand-crafted methods. While most recent DRL-based methods have focused on maximizing the throughput or minimizing the average travel time of the vehicles, the fairness of the traffic signal controllers has often been neglected. This is particularly important as neglecting fairness can lead to situations where some vehicles experience extreme waiting times, or where the throughput of a particular traffic flow is highly impacted by the fluctuations of another conflicting flow at the intersection. In order to address these issues, we introduce two notions of fairness: delay-based and throughput-based fairness, which correspond to the two issues mentioned above. Furthermore, we propose two DRL-based traffic signal control methods for implementing these fairness notions, that can achieve a high throughput as well. We evaluate the performance of our proposed methods using three traffic arrival distributions, and find that our methods outperform the baselines in the tested scenarios.
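摘要定义了基于延迟和基于吞吐量的两种公平性概念,但未给出具体公式。下面是一个假设性的示意:用等待时间的高分位数刻画"个别车辆等待过久",用各车流吞吐量的 Jain 公平指数刻画流量间的均衡;具体形式与奖励中的权重均为本文假设,以论文为准。
```python
import numpy as np

def delay_based_fairness(wait_times, penalty_quantile=0.95):
    """示意:以高分位等待时间的相反数作为公平性项,值越大越公平。"""
    return -float(np.quantile(wait_times, penalty_quantile))

def throughput_based_fairness(flow_throughputs):
    """示意:各冲突车流吞吐量的 Jain 公平指数,取值 (0, 1],1 表示完全均衡。"""
    x = np.asarray(flow_throughputs, dtype=float)
    return float((x.sum() ** 2) / (len(x) * (x ** 2).sum() + 1e-12))

# 可能的奖励组合(系数 alpha 为假设):reward = 吞吐量 + alpha * 公平性项
waits = np.random.default_rng(1).exponential(30.0, size=200)
flows = [120, 95, 40, 110]
print(delay_based_fairness(waits), throughput_based_fairness(flows))
```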
【7】 Peer Selection with Noisy Assessments 标题:带噪声评估的同行选择
作者:Omer Lev,Nicholas Mattei,Paolo Turrini,Stanislav Zhydkov 机构:Ben-Gurion University, Israel, Tulane University, USA, University of Warwick, UK 备注:15 pages, 5 figures 链接:https://arxiv.org/abs/2107.10121 摘要:在同行选择问题中,一组代理必须选择自己的一个子集作为获奖者,例如,同行评审的赠款或奖金。在这里,我们从Condorcet的角度来看待这个聚合问题,也就是说,在代理上存在一个基本的真值排序,我们希望根据节点的噪声评估来选择最佳的代理集。考虑到这个模型,一些代理可能是不可靠的,而另一些代理可能是自私自利的,试图影响对他们有利的结果。本文将目前最精确的同行评议算法PeerNomination扩展为加权PeerNomination,它能够处理有噪声和不精确的agent。为了做到这一点,我们明确制定评估员的可靠性权重的方式,不违反策略性,并使用此信息来重新衡量他们的得分。分析表明,加权方案可以显著提高选择的整体精度。最后,我们实现了几个重加权方法的实例,并从经验上证明了我们的方法在噪声评估中的鲁棒性。 摘要:In the peer selection problem a group of agents must select a subset of themselves as winners for, e.g., peer-reviewed grants or prizes. Here, we take a Condorcet view of this aggregation problem, i.e., that there is a ground-truth ordering over the agents and we wish to select the best set of agents, subject to the noisy assessments of the peers. Given this model, some agents may be unreliable, while others might be self-interested, attempting to influence the outcome in their favour. In this paper we extend PeerNomination, the most accurate peer reviewing algorithm to date, into WeightedPeerNomination, which is able to handle noisy and inaccurate agents. To do this, we explicitly formulate assessors' reliability weights in a way that does not violate strategyproofness, and use this information to reweight their scores. We show analytically that a weighting scheme can improve the overall accuracy of the selection significantly. Finally, we implement several instances of reweighting methods and show empirically that our methods are robust in the face of noisy assessments.
【8】 Training Electric Vehicle Charging Controllers with Imitation Learning 标题:用模仿学习训练电动汽车充电控制器
作者:Martin Pilát 备注:Submitted to ICTAI 2021 链接:https://arxiv.org/abs/2107.10111 摘要:随着电动汽车数量的增加,协调电动汽车充电的问题变得越来越重要。本文提出了一种电动汽车充电协调控制器的训练方法。与此主题的大多数现有工作不同,我们要求控制器保护用户的隐私,因此我们不允许控制器与任何第三方进行任何通信。为了训练控制器,我们使用了模仿学习的思想——我们首先用二次优化方法为问题的松弛版本找到一个最优解,然后训练控制器来模仿这个解。研究了最优解的正则化对控制器性能的影响。在实际数据上对该方法进行了评估,结果表明,与使用进化算法训练的类似控制器相比,该方法的性能和训练速度都有所提高。 摘要:The problem of coordinating the charging of electric vehicles gains more importance as the number of such vehicles grows. In this paper, we develop a method for the training of controllers for the coordination of EV charging. In contrast to most existing works on this topic, we require the controllers to preserve the privacy of the users, therefore we do not allow any communication from the controller to any third party. In order to train the controllers, we use the idea of imitation learning -- we first find an optimum solution for a relaxed version of the problem using quadratic optimization and then train the controllers to imitate this solution. We also investigate the effects of regularization of the optimum solution on the performance of the controllers. The method is evaluated on realistic data and shows improved performance and training speed compared to similar controllers trained using evolutionary algorithms.
【9】 SituationCO v1.2's Terms, Properties, Relationships and Axioms -- A Core Ontology for Particular and Generic Situations 标题:SituationCOV1.2的术语、属性、关系和公理--针对特殊和一般情况的核心本体
作者:Luis Olsina,Guido Tebes,Pablo Becker 机构:GIDIS_Web, Facultad de Ingeniería, UNLPam, General Pico, LP, Argentina 备注:9 pgs 链接:https://arxiv.org/abs/2107.10083 摘要:当前的预印本是对SituationCO v1.1(情境核心本体)的更新,它代表了其新版本1.2。它规定和定义了SituationCO v1.2的所有术语、属性、关系和公理,是一个面向特定和一般情境的本体,在称为FCD-OntoArch(面向科学的基础、核心和领域本体架构)的四层本体架构中位于核心层。这是一个四层本体架构,考虑了基础、核心、领域和实例级别。领域层次又分为两个子层次,即顶级领域和低级领域本体层次。因此事实上,我们可以认为它是一个五层体系结构。同一层次的本体可以相互关联,但基础层次例外,该层次只有ThingFO(Thing Foundational Ontology)。此外,较低层次的本体术语和关系可以通过较高层次的本体术语和关系来丰富语义。请注意,ThingFO和核心级别的本体(如SituationCO、ProcessCO等)都是独立于领域的。SituationCO的术语和关系主要从ThingFO特化而来。它还完全重用了主要来自ProcessCO、ProjectCO和GoalCO本体的术语。构造型(stereotype)是用于丰富SituationCO术语的机制。请注意,在本文的最后,我们讨论了SituationCO与ThingFO之间的非分类学关系验证矩阵。 摘要:The current preprint is an update to SituationCO v1.1 (Situation Core Ontology), which represents its new version 1.2. It specifies and defines all the terms, properties, relationships and axioms of SituationCO v1.2, being an ontology for particular and generic Situations placed at the core level in the context of a four-layered ontological architecture called FCD-OntoArch (Foundational, Core, and Domain Ontological Architecture for Sciences). This is a four-layered ontological architecture, which considers Foundational, Core, Domain and Instance levels. In turn, the domain level is split down in two sub-levels, namely: Top-domain and Low-domain ontological levels. So in fact, we can consider it to be a five-tier architecture. Ontologies at the same level can be related to each other, except for the foundational level where only ThingFO (Thing Foundational Ontology) is found. In addition, ontologies' terms and relationships at lower levels can be semantically enriched by ontologies' terms and relationships from the higher levels. Note that both ThingFO and ontologies at the core level such as SituationCO, ProcessCO, among others, are domain independent. SituationCO's terms and relationships are specialized primarily from ThingFO. It also completely reuses terms primarily from ProcessCO, ProjectCO and GoalCO ontologies. Stereotypes are the used mechanism for enriching SituationCO terms. Note that in the end of this document, we address the SituationCO vs. ThingFO non-taxonomic relationship verification matrix.
【10】 Learning Theorem Proving Components 标题:学习定理证明组件
作者:Karel Chvalovský,Jan Jakubův,Miroslav Olšák,Josef Urban 机构:Czech Technical University in Prague, Prague, Czechia, University of Innsbruck, Innsbruck, Austria 备注:Accepted to TABLEAUX'21 链接:https://arxiv.org/abs/2107.10034 摘要:基于给定子句过程的饱和式自动定理证明器(ATPs)是目前经典一阶逻辑最强大的通用推理器。然而,在这样的系统中,子句选择启发式算法常常是孤立地评估子句,而忽略其他子句。最近,通过为E/ENIGMA系统配备一个图形神经网络(GNN),该网络基于在先前选择的子句上下文中的评估来选择下一个给定的子句,这种情况发生了变化。在这项工作中,我们描述了几种算法并用ENIGMA进行了实验,提出了基于学习从句图重要成分的上下文评价思想。 摘要:Saturation-style automated theorem provers (ATPs) based on the given clause procedure are today the strongest general reasoners for classical first-order logic. The clause selection heuristics in such systems are, however, often evaluating clauses in isolation, ignoring other clauses. This has changed recently by equipping the E/ENIGMA system with a graph neural network (GNN) that chooses the next given clause based on its evaluation in the context of previously selected clauses. In this work, we describe several algorithms and experiments with ENIGMA, advancing the idea of contextual evaluation based on learning important components of the graph of clauses.
【11】 An artificial intelligence natural language processing pipeline for information extraction in neuroradiology 标题:一种用于神经放射学信息提取的人工智能自然语言处理流水线
作者:Henry Watkins,Robert Gray,Ashwani Jha,Parashkev Nachev 机构:UCL Queen Square Institute of Neurology, University College London 备注:20 pages, 2 png image figures 链接:https://arxiv.org/abs/2107.10021 摘要:电子病历由于其非结构化的格式,在医学研究中的应用非常困难。从报告中提取信息,并以一种便于下游分析的方式总结患者陈述,将对运营和临床研究大有裨益。在这项工作中,我们提出了一个用于神经病学放射报告信息提取的自然语言处理流水线。我们的流水线使用由基于规则的模型和人工智能模型组成的混合序列,来准确地提取和总结神经病学报告。我们在来自伦敦国立神经病学与神经外科医院(National Hospital for Neurology and Neurosurgery)的150000份放射(MRI)报告语料上训练并评估了一个自定义语言模型。我们还给出了在特定领域神经放射学数据集上标准NLP任务的结果。我们展示了我们的流水线(称为"neuroNLP")能够可靠地从这些报告中提取临床相关信息,使报告及相关影像的下游建模达到前所未有的规模。 摘要:The use of electronic health records in medical research is difficult because of the unstructured format. Extracting information within reports and summarising patient presentations in a way amenable to downstream analysis would be enormously beneficial for operational and clinical research. In this work we present a natural language processing pipeline for information extraction of radiological reports in neurology. Our pipeline uses a hybrid sequence of rule-based and artificial intelligence models to accurately extract and summarise neurological reports. We train and evaluate a custom language model on a corpus of 150000 radiological reports from National Hospital for Neurology and Neurosurgery, London MRI imaging. We also present results for standard NLP tasks on domain-specific neuroradiology datasets. We show our pipeline, called `neuroNLP', can reliably extract clinically relevant information from these reports, enabling downstream modelling of reports and associated imaging on a heretofore unprecedented scale.
【12】 Optimal Operation of Power Systems with Energy Storage under Uncertainty: A Scenario-based Method with Strategic Sampling 标题:不确定条件下储能电力系统最优运行:一种基于情景的策略抽样方法
作者:Ren Hu,Qifeng Li 机构:E 链接:https://arxiv.org/abs/2107.10013 摘要:储能系统的多周期动态特性、间歇性可再生能源发电以及电力负荷的不可控性,使得电力系统运行优化具有挑战性。采用机会约束优化(CCO)建模方法,建立了不确定条件下的多周期最优PSO模型,其中约束条件包括非线性储能模型和交流潮流模型。针对这一具有挑战性的CCO问题,提出了一种新的求解方法。提出的方法在计算上是有效的,主要有两个原因。首先,利用一组基于广义最小绝对收缩和选择算子的学习辅助二次凸不等式来逼近交流潮流约束。其次,考虑到数据的物理模式,以基于学习的抽样为动力,提出了策略抽样方法,通过不同的抽样策略显著减少了所需的场景数。在IEEE标准系统上的仿真结果表明:1)所提出的策略抽样方法显著提高了基于情景的机会约束最优PSO问题求解方法的计算效率,2)数据驱动的潮流凸逼近是一种很有前途的非线性、非凸交流潮流的替代方法。 摘要:The multi-period dynamics of energy storage (ES), intermittent renewable generation and uncontrollable power loads, make the optimization of power system operation (PSO) challenging. A multi-period optimal PSO under uncertainty is formulated using the chance-constrained optimization (CCO) modeling paradigm, where the constraints include the nonlinear energy storage and AC power flow models. Based on the emerging scenario optimization method which does not rely on pre-known probability distribution functions, this paper develops a novel solution method for this challenging CCO problem. The proposed meth-od is computationally effective for mainly two reasons. First, the original AC power flow constraints are approximated by a set of learning-assisted quadratic convex inequalities based on a generalized least absolute shrinkage and selection operator. Second, considering the physical patterns of data and motived by learning-based sampling, the strategic sampling method is developed to significantly reduce the required number of scenarios through different sampling strategies. The simulation results on IEEE standard systems indicate that 1) the proposed strategic sampling significantly improves the computational efficiency of the scenario-based approach for solving the chance-constrained optimal PSO problem, 2) the data-driven convex approximation of power flow can be promising alternatives of nonlinear and nonconvex AC power flow.
【13】 MarsExplorer: Exploration of Unknown Terrains via Deep Reinforcement Learning and Procedurally Generated Environments 标题:MarsExplorer:通过深度强化学习和程序生成环境探索未知地形
作者:Dimitrios I. Koutras,Athanasios Ch. Kapoutsis,Angelos A. Amanatiadis,Elias B. Kosmatopoulos 机构:Department of Electrical and Computer Engineering, Democritus University of Thrace, Xanthi, Greece; Information Technologies Institute, The Centre for Research & Technology Hellas, Thessaloniki, Greece 链接:https://arxiv.org/abs/2107.09996 摘要:本文是弥合强大的深度强化学习方法与未知地形探索/覆盖问题之间差距的初步尝试。在此范围内,我们提出了MarsExplorer,一个与 OpenAI Gym 兼容、专门面向未知区域探索/覆盖的环境。MarsExplorer将原始的机器人学问题转化为一个可由各种现成算法求解的强化学习设定。任何学习到的策略都可以直接应用到机器人平台上,而无需为机器人动力学建立详细的仿真模型,也无需额外的学习/适应阶段。它的核心特征之一是可控的多维地形程序化生成,这是生成具有较强泛化能力策略的关键。我们在MarsExplorer环境中训练了四种不同的最新RL算法(A3C、PPO、Rainbow和SAC),并与人类水平的平均性能进行了比较。在后续的实验分析中,分析了多维难度设置对表现最好的算法(PPO)学习能力的影响。一个里程碑式的结果是生成了一个遵循希尔伯特(Hilbert)曲线的探索策略,而无需向环境提供相关信息,也没有直接或间接地奖励类似希尔伯特曲线的轨迹。实验分析最后在更大的地形尺寸下,将PPO学习到的策略与基于前沿(frontier-based)的探索方法进行了比较。源代码位于:https://github.com/dimikout3/GeneralExplorationPolicy. 摘要:This paper is an initial endeavor to bridge the gap between powerful Deep Reinforcement Learning methodologies and the problem of exploration/coverage of unknown terrains. Within this scope, MarsExplorer, an openai-gym compatible environment tailored to exploration/coverage of unknown areas, is presented. MarsExplorer translates the original robotics problem into a Reinforcement Learning setup that various off-the-shelf algorithms can tackle. Any learned policy can be straightforwardly applied to a robotic platform without an elaborate simulation model of the robot's dynamics to apply a different learning/adaptation phase. One of its core features is the controllable multi-dimensional procedural generation of terrains, which is the key for producing policies with strong generalization capabilities. Four different state-of-the-art RL algorithms (A3C, PPO, Rainbow, and SAC) are trained on the MarsExplorer environment, and a proper evaluation of their results compared to the average human-level performance is reported. In the follow-up experimental analysis, the effect of the multi-dimensional difficulty setting on the learning capabilities of the best-performing algorithm (PPO) is analyzed. A milestone result is the generation of an exploration policy that follows the Hilbert curve without providing this information to the environment or rewarding directly or indirectly Hilbert-curve-like trajectories. The experimental analysis is concluded by comparing PPO learned policy results with frontier-based exploration context for extended terrain sizes. The source code can be found at: https://github.com/dimikout3/GeneralExplorationPolicy.
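由于 MarsExplorer 与 OpenAI Gym 接口兼容,可以直接套用标准的 Gym 交互循环(或现成的 RL 库)进行训练。下面是一个假设性的最小示例:环境 ID "MarsExplorer-v0" 与注册方式均为猜测,实际名称请以仓库 README 为准;这里只演示通用的随机交互循环,并未复现论文中 A3C/PPO/Rainbow/SAC 的训练配置。
```python
import gym
# import mars_explorer  # 假设:导入该包以注册环境(实际模块名以仓库为准)

env = gym.make("MarsExplorer-v0")       # 环境 ID 为假设
obs = env.reset()                        # 经典 Gym 接口(4 元组 step 返回)
total_reward, done = 0.0, False
while not done:
    action = env.action_space.sample()   # 随机策略,仅演示接口;训练时可替换为 PPO 等
    obs, reward, done, info = env.step(action)
    total_reward += reward
print("episode return:", total_reward)
env.close()
```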
【14】 Strategic Mitigation of Agent Inattention in Drivers with Open-Quantum Cognition Models 标题:开放量子认知模型下驾驶员主体注意力疏忽的策略缓解
作者:Qizi Zhang,Venkata Sriram Siddhardh Nadendla,S. N. Balakrishnan,Jerome Busemeyer 机构:∗Dept. of Mechanical and Aerospace Engineering and †Dept. of Computer Science, Missouri Univ. of Science and Technology, Rolla, Missouri ,., ‡Dept. of Psychological and Brain Sciences, Indiana University Bloomington, Bloomington, IN ,. 备注:12 pages, 4 figures, submitted to IEEE Transactions on Human-Machine Systems 链接:https://arxiv.org/abs/2107.09888 摘要:最先进的驾驶员辅助系统未能有效缓解驾驶员的注意力不集中,对日益增多的道路事故(例如,由于各种因素导致驾驶员注意力不集中而导致的事故造成的生命损失、人身伤害)的影响微乎其微。这是因为传统的人机交互设置是在经典和行为博弈论领域建模的,这些领域在技术上适合描述两个效用最大化代理或人类决策者之间的战略交互。因此,为了提高驾驶员辅助系统的说服效果,我们开发了一种新的策略性的、个性化的、适应驾驶员心理状态和选择行为的驾驶员辅助系统。首先,我们在人-系统交互博弈中提出了一个新的均衡概念,即系统的期望效用最大化,人的决策可以用任何一般的决策模型来描述。然后,我们使用这个新的均衡概念来研究策略性的驾驶员-车辆互动博弈,其中汽车提出了一个有说服力的建议来引导驾驶员做出更安全的驾驶决策。我们假设驾驶员采用了一个开放的量子系统认知模型,该模型捕捉到了人类决策的复杂方面,如违反经典的全概率定律和某些信息的心理表征的不相容性。我们给出了参与者对对方策略的最终反应的封闭形式表达式,这样我们就可以数值计算纯平衡点和混合平衡点。数值结果说明了这两种平衡。 摘要:State-of-the-art driver-assist systems have failed to effectively mitigate driver inattention and had minimal impacts on the ever-growing number of road mishaps (e.g. life loss, physical injuries due to accidents caused by various factors that lead to driver inattention). This is because traditional human-machine interaction settings are modeled in classical and behavioral game-theoretic domains which are technically appropriate to characterize strategic interaction between either two utility maximizing agents, or human decision makers. Therefore, in an attempt to improve the persuasive effectiveness of driver-assist systems, we develop a novel strategic and personalized driver-assist system which adapts to the driver's mental state and choice behavior. First, we propose a novel equilibrium notion in human-system interaction games, where the system maximizes its expected utility and human decisions can be characterized using any general decision model. Then we use this novel equilibrium notion to investigate the strategic driver-vehicle interaction game where the car presents a persuasive recommendation to steer the driver towards safer driving decisions. We assume that the driver employs an open-quantum system cognition model, which captures complex aspects of human decision making such as violations to classical law of total probability and incompatibility of certain mental representations of information. We present closed-form expressions for players' final responses to each other's strategies so that we can numerically compute both pure and mixed equilibria. Numerical results are presented to illustrate both kinds of equilibria.
【15】 CogME: A Novel Evaluation Metric for Video Understanding Intelligence 标题:CogME:一种新的视频理解智能评价指标
作者:Minjung Shin,Jeonghoon Kim,Seongho Choi,Yu-Jung Heo,Donghyun Kim,Minsu Lee,Byoung-Tak Zhang,Jeh-Kwang Ryu 机构:Ryu ,∗, Laboratory for Natural & Artificial Kin¨asthese, Convergence Research Center for, Artificial Intelligence(CRC,AI), Dongguk University, Seoul, South Korea, Department of Artificial Intelligence, Dongguk University, Seoul, South Korea 备注:17 pages with 3 figures and 3 tables 链接:https://arxiv.org/abs/2107.09847 摘要:开发视频理解智能非常具有挑战性,因为它需要基于自然语言处理、时间依赖性和推理的图像、脚本和声音的整体集成。近年来,人们在大量的视频数据集上进行了大量的尝试。然而,现有的视频问答(VideoQA)评价指标并不能提供有意义的分析。为了取得进展,我们认为一个完善的框架,建立在人类理解的方式上,需要详细解释和评估理解的表现。基于人的认知过程和故事要素,提出了一个自上而下的视频质量评价系统:用于评价的认知模块(CogME)。CogME由三个认知模块组成:目标、内容和思维。理解过程中各模块之间的相互作用可以用一句话来表达:“我通过一种思维方式来理解目标的内容。”每个模块都有派生自故事元素的子组件。我们可以通过对个别问题的子部分进行注释来指定理解所需的方面。因此,CogME为VideoQA数据集的详细规范提供了一个框架。为了检验VideoQA数据集在验证视频理解智能方面的适用性,我们应用CogME评估了VideoQA数据集的基线模型。评估结果显示,故事元素在现有数据集中的反映是不均匀的,基于数据集的模型可能会导致预测偏差。虽然这项研究只能掌握很小范围的故事,但我们期望它能为研究人类在视频上的认知过程,理解人类和人工智能提供第一步。 摘要:Developing video understanding intelligence is quite challenging because it requires holistic integration of images, scripts, and sounds based on natural language processing, temporal dependency, and reasoning. Recently, substantial attempts have been made on several video datasets with associated question answering (QA) on a large scale. However, existing evaluation metrics for video question answering (VideoQA) do not provide meaningful analysis. To make progress, we argue that a well-made framework, established on the way humans understand, is required to explain and evaluate the performance of understanding in detail. Then we propose a top-down evaluation system for VideoQA, based on the cognitive process of humans and story elements: Cognitive Modules for Evaluation (CogME). CogME is composed of three cognitive modules: targets, contents, and thinking. The interaction among the modules in the understanding procedure can be expressed in one sentence as follows: "I understand the CONTENT of the TARGET through a way of THINKING." Each module has sub-components derived from the story elements. We can specify the required aspects of understanding by annotating the sub-components to individual questions. CogME thus provides a framework for an elaborated specification of VideoQA datasets. To examine the suitability of a VideoQA dataset for validating video understanding intelligence, we evaluated the baseline model of the DramaQA dataset by applying CogME. The evaluation reveals that story elements are unevenly reflected in the existing dataset, and the model based on the dataset may cause biased predictions. Although this study has only been able to grasp a narrow range of stories, we expect that it offers the first step in considering the cognitive process of humans on the video understanding intelligence of humans and AI.
【16】 Bayesian Controller Fusion: Leveraging Control Priors in Deep Reinforcement Learning for Robotics 标题:贝叶斯控制器融合:在机器人深度强化学习中利用控制先验
作者:Krishan Rana,Vibhavari Dasagi,Jesse Haviland,Ben Talbot,Michael Milford,Niko Sünderhauf 机构:Queensland University of Technology (QUT) Centre for Robotics 备注:Under review for The International Journal of Robotics Research (IJRR). Project page: this https URL 链接:https://arxiv.org/abs/2107.09822 摘要:我们提出了贝叶斯控制器融合(BCF):一种混合控制策略,结合了传统手工控制器和无模型深度强化学习(RL)的优点。BCF适用于机器人领域:在该领域中,许多任务存在可靠但次优的控制先验,而从零开始的RL仍然不安全且数据效率低下。通过融合来自每个系统的具有不确定性意识的分布式输出,BCF在它们之间仲裁控制,利用它们各自的优势。我们在两个真实世界的机器人任务上研究了BCF,一个涉及在广阔且长时域的环境中导航,另一个是涉及可操作度最大化的复杂到达任务。对于这两个领域,都存在简单的手工控制器,它们可以以规避风险的方式解决手头的任务,但由于分析建模、控制器校准误差和任务变化的限制,不一定给出最优解。由于在训练的早期阶段探索自然地受到先验的引导,BCF加速了学习,并随着策略获得更多经验而大幅超越控制先验的表现。更重要的是,考虑到控制先验的风险规避性,BCF确保了安全的探索和部署,其中控制先验在策略未知的状态中自然主导动作分布。此外,我们还展示了BCF在零样本仿真到现实(sim-to-real)设定下的适用性,以及它处理现实世界中分布外状态的能力。BCF是一种很有前途的方法,它将深度RL和传统机器人控制的互补优势结合起来,超越了两者各自可以独立实现的水平。代码和补充视频材料公开于 https://krishanrana.github.io/bcf 。 摘要:We present Bayesian Controller Fusion (BCF): a hybrid control strategy that combines the strengths of traditional hand-crafted controllers and model-free deep reinforcement learning (RL). BCF thrives in the robotics domain, where reliable but suboptimal control priors exist for many tasks, but RL from scratch remains unsafe and data-inefficient. By fusing uncertainty-aware distributional outputs from each system, BCF arbitrates control between them, exploiting their respective strengths. We study BCF on two real-world robotics tasks involving navigation in a vast and long-horizon environment, and a complex reaching task that involves manipulability maximisation. For both these domains, there exist simple handcrafted controllers that can solve the task at hand in a risk-averse manner but do not necessarily exhibit the optimal solution given limitations in analytical modelling, controller miscalibration and task variation. As exploration is naturally guided by the prior in the early stages of training, BCF accelerates learning, while substantially improving beyond the performance of the control prior, as the policy gains more experience. More importantly, given the risk-aversity of the control prior, BCF ensures safe exploration and deployment, where the control prior naturally dominates the action distribution in states unknown to the policy. We additionally show BCF's applicability to the zero-shot sim-to-real setting and its ability to deal with out-of-distribution states in the real-world. BCF is a promising approach for combining the complementary strengths of deep RL and traditional robotic control, surpassing what either can achieve independently. The code and supplementary video material are made publicly available at https://krishanrana.github.io/bcf.
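BCF 的核心是把手工控制先验与 RL 策略各自输出的带不确定性的动作分布融合起来。下面用"两个一维高斯分布的归一化乘积"给出一个极简示意(论文中融合的具体形式与多维情形请以原文为准):不确定性大的一方对融合结果的影响自然变小。
```python
import numpy as np

def fuse_gaussians(mu_prior, sigma_prior, mu_policy, sigma_policy):
    """两个高斯的乘积(归一化后)仍为高斯,其均值为按精度(1/方差)加权的平均。"""
    p1, p2 = 1.0 / sigma_prior ** 2, 1.0 / sigma_policy ** 2
    var = 1.0 / (p1 + p2)
    mu = var * (p1 * mu_prior + p2 * mu_policy)
    return mu, np.sqrt(var)

# 训练初期:策略不确定性大 -> 融合结果贴近控制先验
print(fuse_gaussians(mu_prior=0.2, sigma_prior=0.05, mu_policy=0.8, sigma_policy=0.5))
# 训练后期:策略不确定性小 -> 融合结果更多采纳策略
print(fuse_gaussians(mu_prior=0.2, sigma_prior=0.05, mu_policy=0.8, sigma_policy=0.02))
```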
【17】 Multi-agent Reinforcement Learning Improvement in a Dynamic Environment Using Knowledge Transfer 标题:动态环境下基于知识转移的多智能体强化学习改进
作者:Mahnoosh Mahdavimoghaddama,Amin Nikanjama,Monireh Abdoos 机构:K. N. Toosi University of Technology, Tehran, Iran, Shahid Beheshti University, Tehran, Iran, c SWAT Lab., Polytechnique Montréal, Quebec, Canada 备注:arXiv admin note: text overlap with arXiv:1912.07796 by other authors 链接:https://arxiv.org/abs/2107.09807 摘要:多智能体协作系统在不同的领域有着广泛的应用。代理之间的交互将带来好处,包括降低操作成本、高可扩展性和促进并行处理。这些系统也是处理大规模、未知和动态环境的好选择。然而,在这些环境中学习已经成为各种应用中一个非常重要的挑战。这些挑战包括搜索空间大小对学习时间的影响、代理之间的低效合作以及代理决策之间缺乏适当的协调。此外,在这些问题中,强化学习算法的收敛时间较长。本文提出了一种基于知识转移概念的通信框架来解决大状态空间羊群问题。为了解决算法的收敛性问题,采用了知识转移的方法,大大提高了强化学习算法的效率。代理之间的协调分别通过每个代理组中的一个主代理和一个协调代理来执行。结果表明,该框架确实可以提高学习速度,缩短收敛时间。 摘要:Cooperative multi-agent systems are being widely used in different domains. Interaction among agents would bring benefits, including reducing operating costs, high scalability, and facilitating parallel processing. These systems are also a good option for handling large-scale, unknown, and dynamic environments. However, learning in these environments has become a very important challenge in various applications. These challenges include the effect of search space size on learning time, inefficient cooperation among agents, and the lack of proper coordination among agents' decisions. Moreover, reinforcement learning algorithms may suffer from long convergence time in these problems. In this paper, a communication framework using knowledge transfer concepts is introduced to address such challenges in the herding problem with large state space. To handle the problems of convergence, knowledge transfer has been utilized that can significantly increase the efficiency of reinforcement learning algorithms. Coordination between the agents is carried out through a head agent in each group of agents and a coordinator agent respectively. The results demonstrate that this framework could indeed enhance the speed of learning and reduce convergence time.
【18】 Two-phase Optimization of Binary Sequences with Low Peak Sidelobe Level Value 标题:低峰值旁瓣电平二元序列的两阶段优化
作者:Borko Bošković,Janez Brest 备注:8 pages, 4 figures, 5 tables 链接:https://arxiv.org/abs/2107.09801 摘要:搜索具有低峰值旁瓣电平(PSL)值的二进制序列是一个艰巨的计算问题。为了解决这个问题,我们设计了一个使用两个适应度函数的随机算法。在这两个适应度函数中,自相关函数的值对最终适应度值有不同的影响,这种影响由作用在自相关函数值上的指数来定义。每个函数在相应的优化阶段使用,优化过程在这两个阶段之间切换,直到满足停止条件。所提出的算法使用统一计算设备架构(CUDA)实现,因此可以利用图形处理单元的计算能力。该算法在长度为 $L=2^m-1$($14 \le m \le 20$)的序列上进行了测试。结果表明,两个适应度函数的使用显著地提高了算法的效率,得到了新的已知最优解,且得到的PSL值明显小于 $\sqrt{L}$。 摘要:The search for binary sequences with low peak sidelobe level value represents a formidable computational problem. To locate better sequences for this problem, we designed a stochastic algorithm that uses two fitness functions. In these fitness functions, the value of the autocorrelation function has a different impact on the final fitness value. It is defined with the value of the exponent over the autocorrelation function values. Each function is used in the corresponding optimization phase, and the optimization process switches between these two phases until the stopping condition is satisfied. The proposed algorithm was implemented using the compute unified device architecture and therefore allowed us to exploit the computational power of graphics processing units. This algorithm was tested on sequences with lengths $L = 2^m - 1$, for $14 \le m \le 20$. From the obtained results it is evident that the usage of two fitness functions improved the efficiency of the algorithm significantly, new-best known solutions were achieved, and the achieved PSL values were significantly less than $\sqrt{L}$.
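摘要中的两个适应度函数都作用在非周期自相关的旁瓣值上,只是指数不同。下面的示意代码给出 PSL 的计算,以及"对旁瓣绝对值取 p 次幂再求和"这一类适应度的通用形式(指数的具体取值、两阶段切换条件等均以论文为准,示例中的 p 为假设)。
```python
import numpy as np

def aperiodic_autocorrelation(seq):
    """seq 为 +1/-1 序列;返回 k=1..L-1 的非周期自相关旁瓣 C_k。"""
    s = np.asarray(seq, dtype=float)
    L = len(s)
    return np.array([np.dot(s[:L - k], s[k:]) for k in range(1, L)])

def psl(seq):
    return int(np.max(np.abs(aperiodic_autocorrelation(seq))))

def fitness(seq, p):
    """两阶段优化中使用的一类适应度:sum_k |C_k|^p,p 决定对大旁瓣的惩罚力度。"""
    return float(np.sum(np.abs(aperiodic_autocorrelation(seq)) ** p))

rng = np.random.default_rng(0)
seq = rng.choice([-1, 1], size=127)          # L = 2^7 - 1 的随机序列
print(psl(seq), fitness(seq, p=2), fitness(seq, p=4))
# 随机序列的 PSL 通常在 sqrt(L) 量级;论文算法得到的 PSL 明显小于 sqrt(L)
```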
【19】 High-dimensional Multivariate Time Series Forecasting in IoT Applications using Embedding Non-stationary Fuzzy Time Series 标题:物联网应用中嵌入非平稳模糊时间序列的高维多变量时间序列预测
作者:Hugo Vinicius Bitencourt,Frederico Gadelha Guimarães 机构:Machine Intelligence and Data Science Lab (MINDS), Graduate Program in Electrical Engineering, Universidade Federal de Minas Gerais, Av. Antˆonio Carlos ,-, Belo Horizonte, MG, Brazil, Frederico Gadelha Guimar˜aes 备注:6 pages, 1 figure, submitted to the 7th IEEE LA-CCI (Latin American Conference on Computational Intelligence) 链接:https://arxiv.org/abs/2107.09785 摘要:在物联网(IoT)中,数据是从不同的数据源连续记录的,设备的嵌入式电子设备会发生故障,从而导致高维数据集和概念漂移事件。因此,能够处理高维非平稳时间序列的方法在物联网应用中具有重要价值。模糊时间序列(Fuzzy Time Series,FTS)模型是数据驱动的非参数模型,具有实现简单、精度高等特点。不幸的是,FTS在处理多变量数据集和概念漂移的场景时遇到了困难。本文提出了一种处理高维非平稳时间序列的新方法,将原始高维数据投影到低维嵌入空间,并采用FTS方法。结合这些技术可以更好地表示非平稳多元时间序列的复杂内容和准确的预测。该模型能解释98%的方差,RMSE、MAE和MAPE分别达到11.52%、2.68%和2.91%。 摘要:In Internet of things (IoT), data is continuously recorded from different data sources and devices can suffer faults in their embedded electronics, thus leading to a high-dimensional data sets and concept drift events. Therefore, methods that are capable of high-dimensional non-stationary time series are of great value in IoT applications. Fuzzy Time Series (FTS) models stand out as data-driven non-parametric models of easy implementation and high accuracy. Unfortunately, FTS encounters difficulties when dealing with data sets of many variables and scenarios with concept drift. We present a new approach to handle high-dimensional non-stationary time series, by projecting the original high-dimensional data into a low dimensional embedding space and using FTS approach. Combining these techniques enables a better representation of the complex content of non-stationary multivariate time series and accurate forecasts. Our model is able to explain 98% of the variance and reach 11.52% of RMSE, 2.68% of MAE and 2.91% of MAPE.
【20】 Explainable AI Enabled Inspection of Business Process Prediction Models 标题:可解释的人工智能支持的业务流程预测模型检查
作者:Chun Ouyang,Renuka Sindhgatta,Catarina Moreira 备注:17 pages, 6 figures, 1 table 链接:https://arxiv.org/abs/2107.09767 摘要:以机器学习技术为基础的现代数据分析已经成为以数据为主导的决策自动化的关键因素。作为最新数据分析的一个重要分支,业务流程预测也面临着一个挑战,即缺乏对底层“黑箱”预测模型的推理和决策的解释。随着可解释机器学习技术的发展,可以为黑盒模型生成解释,使得(人类)用户能够访问机器学习预测背后的推理。在本文中,我们的目标是提出一种方法,允许我们使用模型解释来研究机器学习预测应用的某些推理,并检测潜在的问题,从而增强对业务流程预测模型的信任。我们的方法的一个新贡献是模型检查的建议,它利用了可解释的机器学习机制产生的解释和从记录历史进程执行的事件日志中提取的上下文或领域知识。从这项工作中得出的结论有望作为开发模型可靠性度量和业务流程预测上下文中的评估的关键输入。 摘要:Modern data analytics underpinned by machine learning techniques has become a key enabler to the automation of data-led decision making. As an important branch of state-of-the-art data analytics, business process predictions are also faced with a challenge in regard to the lack of explanation to the reasoning and decision by the underlying `black-box' prediction models. With the development of interpretable machine learning techniques, explanations can be generated for a black-box model, making it possible for (human) users to access the reasoning behind machine learned predictions. In this paper, we aim to present an approach that allows us to use model explanations to investigate certain reasoning applied by machine learned predictions and detect potential issues with the underlying methods thus enhancing trust in business process prediction models. A novel contribution of our approach is the proposal of model inspection that leverages both the explanations generated by interpretable machine learning mechanisms and the contextual or domain knowledge extracted from event logs that record historical process execution. Findings drawn from this work are expected to serve as a key input to developing model reliability metrics and evaluation in the context of business process predictions.
【21】 Enhancing Loop-Invariant Synthesis via Reinforcement Learning 标题:利用强化学习增强循环不变式综合
作者:Takeshi Tsukada,Hiroshi Unno,Taro Sekiyama,Kohei Suenaga 机构:Chiba University, Japan, University of Tsukuba, Japan, National Institute of Informatics, Japan, Kyoto University, Japan 链接:https://arxiv.org/abs/2107.09766 摘要:循环不变综合是每个程序验证过程的基础。由于其一般不确定性,用于不变综合的工具必然使用启发式。尽管人们普遍认为启发式算法的设计对于验证器的有效性能至关重要,但是对于获得每个不变综合工具的最优启发式算法的研究却很少。相反,开发人员手工调整了工具的启发式。这项研究表明,我们可以有效地自动学习一个良好的启发式强化学习为一个不变的合成器PCSat。我们的实验表明,PCSat结合强化学习的启发式学习算法在这项任务上的表现优于目前最先进的求解算法。据我们所知,这是第一个工作,研究学习启发式的不变综合工具。 摘要:Loop-invariant synthesis is the basis of every program verification procedure. Due to its undecidability in general, a tool for invariant synthesis necessarily uses heuristics. Despite the common belief that the design of heuristics is vital for the effective performance of a verifier, little work has been performed toward obtaining the optimal heuristics for each invariant-synthesis tool. Instead, developers have hand-tuned the heuristics of tools. This study demonstrates that we can effectively and automatically learn a good heuristic via reinforcement learning for an invariant synthesizer PCSat. Our experiment shows that PCSat combined with the heuristic learned by reinforcement learning outperforms the state-of-the-art solvers for this task. To the best of our knowledge, this is the first work that investigates learning the heuristics of an invariant synthesis tool.
【22】 Uncertainty Estimation and Out-of-Distribution Detection for Counterfactual Explanations: Pitfalls and Solutions 标题:反事实解释的不确定性估计与分布外检测:陷阱与解决方案
作者:Eoin Delaney,Derek Greene,Mark T. Keane 机构:The provision of uncertaintyestimations on counterfactual explanations can avoid pre-senting users with overconfident and potentially harmful 1School of Computer Science, University College Dublin 备注:None 链接:https://arxiv.org/abs/2107.09734 摘要:虽然最近提出了大量的技术来产生对不透明黑匣子系统的预测的反事实解释,但对探索这些产生的解释的不确定性的关注明显较少。在高风险场景中,这成为一个关键问题,不确定和误导性的解释可能会产生可怕的后果(例如,医疗诊断和治疗计划)。此外,通常很难确定生成的解释是否基于训练数据并且对分布变化敏感。本文提出了一些实用的解决方案,可以通过与其他研究工作在解释性(如信任分数)和不确定性估计(如蒙特卡罗退出)方面建立新的联系来解决这些问题。两个实验证明了我们提出的解决方案的实用性。 摘要:Whilst an abundance of techniques have recently been proposed to generate counterfactual explanations for the predictions of opaque black-box systems, markedly less attention has been paid to exploring the uncertainty of these generated explanations. This becomes a critical issue in high-stakes scenarios, where uncertain and misleading explanations could have dire consequences (e.g., medical diagnosis and treatment planning). Moreover, it is often difficult to determine if the generated explanations are well grounded in the training data and sensitive to distributional shifts. This paper proposes several practical solutions that can be leveraged to solve these problems by establishing novel connections with other research works in explainability (e.g., trust scores) and uncertainty estimation (e.g., Monte Carlo Dropout). Two experiments demonstrate the utility of our proposed solutions.
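摘要提出借助 Monte Carlo Dropout 等手段估计反事实解释的不确定性。下面是一个通用的 MC Dropout 示意(网络结构、前向次数 T 等均为本文假设):在推理阶段保持 dropout 开启、多次前向并统计预测概率的均值与标准差,方差偏大的反事实可视为不确定、需谨慎呈现。
```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Dropout(p=0.2), nn.Linear(64, 2))

def mc_dropout_predict(model, x, T=50):
    """推理时保持 dropout 激活,前向 T 次,返回 softmax 概率的均值与标准差。"""
    model.train()          # 让 Dropout 在推理时仍然随机丢弃
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=-1) for _ in range(T)])
    return probs.mean(dim=0), probs.std(dim=0)

x_cf = torch.randn(1, 10)          # 某个候选反事实样本(随机示例)
mean, std = mc_dropout_predict(model, x_cf)
print(mean, std)                   # std 偏大说明该反事实的预测不稳定/不确定
```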
【23】 What Do You Get When You Cross Beam Search with Nucleus Sampling? 标题:当你将波束搜索与Nucleus采样结合时,会得到什么?
作者:Uri Shaham,Omer Levy 机构:The Blavatnik School of Computer Science, Tel Aviv University 链接:https://arxiv.org/abs/2107.09729 摘要:本文将波束搜索与核采样的概率剪枝技术相结合,提出了两种用于自然语言生成的确定性核搜索算法。第一种算法p-exact search对下一个令牌分布进行局部剪枝,并对剩余空间进行精确搜索。第二种算法,动态波束搜索,根据候选概率分布的熵来缩小和扩大波束大小。尽管nucleus搜索背后有概率直觉,但在机器翻译和摘要基准测试上的实验表明,这两种算法都达到了与标准beam搜索相同的性能水平。 摘要:We combine beam search with the probabilistic pruning technique of nucleus sampling to create two deterministic nucleus search algorithms for natural language generation. The first algorithm, p-exact search, locally prunes the next-token distribution and performs an exact search over the remaining space. The second algorithm, dynamic beam search, shrinks and expands the beam size according to the entropy of the candidate's probability distribution. Despite the probabilistic intuition behind nucleus search, experiments on machine translation and summarization benchmarks show that both algorithms reach the same performance levels as standard beam search.
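摘要中两种确定性算法的关键操作可以用如下极简代码示意:对下一词分布做 top-p(nucleus)截断,以及按分布熵调整 beam 大小。其中熵到 beam 大小的线性映射方式为本文假设,论文中才有确切定义。
```python
import numpy as np

def nucleus_prune(probs, p=0.9):
    """保留累计概率达到 p 的最小词集合(nucleus),其余概率置零后重新归一化。"""
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    keep = order[: int(np.searchsorted(cum, p) + 1)]
    pruned = np.zeros_like(probs)
    pruned[keep] = probs[keep]
    return pruned / pruned.sum()

def dynamic_beam_size(probs, k_min=1, k_max=10):
    """按分布熵在 [k_min, k_max] 之间插值得到 beam 大小(映射方式为示意性假设)。"""
    q = probs[probs > 0]
    entropy = -np.sum(q * np.log(q))
    return int(round(k_min + (k_max - k_min) * entropy / np.log(len(probs))))

probs = np.array([0.5, 0.2, 0.1, 0.1, 0.05, 0.05])
print(nucleus_prune(probs, p=0.8))   # 只保留概率最高、累计达 0.8 的若干词
print(dynamic_beam_size(probs))      # 分布越平(熵越大),beam 越大
```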
【24】 An Efficient Multi-objective Evolutionary Approach for Solving the Operation of Multi-Reservoir System Scheduling in Hydro-Power Plants 标题:求解水电厂多水库系统调度问题的一种高效多目标进化方法
作者:C. G. Marcelino,G. M. C. Leite,C. A. D. M Delgado,L. B. de Oliveira,E. F. Wanner,S. Jiménez-Fernández,S. Salcedo-Sanz 机构:Department of Signal Processing and Communications, Universidad de Alcalá, Spain., Institute of Computing, Federal University of Rio de Janeiro, Brazil., Post-Graduate Program in Systems Engineering and Computer Science, Federal 链接:https://arxiv.org/abs/2107.09718 摘要:本文研究了多水库系统中的短期水电机组组合问题——基于梯级运行的方案。为此,我们提出了一种新的数学模型,其目标是使水电站在次日常运行中的总发电量最大化,同时使水库的总含水量(体积)最大化。为了解决这个问题,我们讨论了多目标进化群杂交(MESH)算法,这是最近提出的一种基于多目标群智能的优化方法,与现有的进化算法相比,在具体应用中取得了非常有竞争力的结果。应用网格法求解水电站中所有可能的水轮机组合在最大库容下的最优排水量和发电量。在一个实际问题中,考虑到来自巴西两个梯级水电站的水电能源系统的数据,将MESH的性能与著名的进化方法(如NSGA-II、NSGA-III、SPEA2和MOEA/D)进行了比较。结果表明,在效率和准确性方面,MESH比其他多目标方法表现出更高的性能,在进行的投影分析中提供了每月41.25万美元的利润。 摘要:This paper tackles the short-term hydro-power unit commitment problem in a multi-reservoir system - a cascade-based operation scenario. For this, we propose a new mathematical modelling in which the goal is to maximize the total energy production of the hydro-power plant in a sub-daily operation, and, simultaneously, to maximize the total water content (volume) of reservoirs. For solving the problem, we discuss the Multi-objective Evolutionary Swarm Hybridization (MESH) algorithm, a recently proposed multi-objective swarm intelligence-based optimization method which has obtained very competitive results when compared to existing evolutionary algorithms in specific applications. The MESH approach has been applied to find the optimal water discharge and the power produced at the maximum reservoir volume for all possible combinations of turbines in a hydro-power plant. The performance of MESH has been compared with that of well-known evolutionary approaches such as NSGA-II, NSGA-III, SPEA2, and MOEA/D in a realistic problem considering data from a hydro-power energy system with two cascaded hydro-power plants in Brazil. Results indicate that MESH showed a superior performance than alternative multi-objective approaches in terms of efficiency and accuracy, providing a profit of $412,500 per month in a projection analysis carried out.
【25】 Learning MR-Sort Models from Non-Monotone Data 标题:从非单调数据中学习MR排序模型
作者:Pegdwende Minoungou,Vincent Mousseau,Wassila Ouerdane,Paolo Scotton 机构:Université Paris-Saclay 链接:https://arxiv.org/abs/2107.09668 摘要:多数规则排序(MR-Sort)方法将根据多个准则评估的备选方案分配到预定义的有序类别之一。逆MR-Sort问题(Inv-MR-Sort)计算与数据集相匹配的MR-Sort参数。现有的 Inv-MR-Sort 学习算法假设各准则上的偏好是单调的。我们将这个问题推广到准则偏好不一定单调、而可能是单峰(或单谷)的情况。我们提出了一种基于混合整数规划的算法,该算法从训练数据中学习各准则上的偏好以及其他MR-Sort参数。我们通过数值实验研究了该算法的性能,并通过一个实际案例说明了它的应用。 摘要:The Majority Rule Sorting (MR-Sort) method assigns alternatives evaluated on multiple criteria to one of the predefined ordered categories. The Inverse MR-Sort problem (Inv-MR-Sort) computes MR-Sort parameters that match a dataset. Existing learning algorithms for Inv-MR-Sort consider monotone preferences on criteria. We extend this problem to the case where the preferences on criteria are not necessarily monotone, but possibly single-peaked (or single-valley). We propose a mixed-integer programming based algorithm that learns the preferences on criteria together with the other MR-Sort parameters from the training data. We investigate the performance of the algorithm using numerical experiments and we illustrate its use on a real-world case study.
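作为背景,标准(单调情形)MR-Sort 的分配规则可以用几行代码示意:当"方案在其上不劣于类别下边界轮廓的那些准则"的权重之和达到多数阈值 λ 时,方案至少属于该类别。下面只演示这一经典规则本身(权重、轮廓与 λ 均为虚构数值);论文处理的非单调/单峰偏好情形需在此基础上扩展。
```python
import numpy as np

def mr_sort_assign(a, profiles, weights, lam):
    """悲观式 MR-Sort:profiles[h] 为类别 h+1 的下边界轮廓(按类别从低到高排列)。
    返回 a 被分配到的类别下标(0 为最低类)。"""
    category = 0
    for h, b in enumerate(profiles, start=1):
        coalition = weights[np.asarray(a) >= np.asarray(b)].sum()
        if coalition >= lam:      # 多数准则支持"a 不劣于该轮廓"
            category = h
        else:
            break
    return category

weights = np.array([0.3, 0.3, 0.2, 0.2])       # 各准则权重(虚构)
profiles = [np.array([10, 10, 10, 10]),         # 类别 1 的下边界
            np.array([15, 15, 14, 14])]         # 类别 2 的下边界
print(mr_sort_assign([12, 16, 11, 15], profiles, weights, lam=0.6))  # -> 1
```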
【26】 Human Perception of Audio Deepfakes 标题:人类对音频Deepfake的感知
作者:Nicolas M. Müller,Karla Markert,Konstantin Böttinger 机构:Fraunhofer AISEC, Germany, ∗authors contributed equally 链接:https://arxiv.org/abs/2107.09667 摘要:最近出现的深度伪造(deepfake),即计算机生成的逼真多媒体赝品,将被操纵和生成内容的检测问题推到了前沿。虽然已经提出了许多用于深度伪造检测的机器学习模型,但人类的检测能力仍然没有得到充分的研究。这一点尤为重要,因为人类的感知不同于机器的感知,而深度伪造通常正是为了欺骗人类而设计的。到目前为止,这个问题只在图像和视频领域得到研究。为了比较人类和机器在检测音频深度伪造方面的能力,我们进行了一项在线游戏化实验,让用户从用各种算法生成的伪造音频中辨别出真实的音频样本。200名用户与一个为音频深度伪造检测而训练的人工智能(AI)算法同场竞技,共进行了8976轮游戏。根据收集到的数据,我们发现机器在检测音频深度伪造方面通常优于人类,但对于某种攻击类型则相反,人类仍然更准确。此外,我们发现年轻的参与者比年长的参与者平均更善于检测音频深度伪造,而IT专业人士相比外行并无优势。我们的结论是,将人类与机器的知识相结合对提高音频深度伪造检测的准确性具有重要意义。 摘要:The recent emergence of deepfakes, computerized realistic multimedia fakes, brought the detection of manipulated and generated content to the forefront. While many machine learning models for deepfakes detection have been proposed, the human detection capabilities have remained far less explored. This is of special importance as human perception differs from machine perception and deepfakes are generally designed to fool the human. So far, this issue has only been addressed in the area of images and video. To compare the ability of humans and machines in detecting audio deepfakes, we conducted an online gamified experiment in which we asked users to discern bona-fide audio samples from spoofed audio, generated with a variety of algorithms. 200 users competed for 8976 game rounds with an artificial intelligence (AI) algorithm trained for audio deepfake detection. With the collected data we found that the machine generally outperforms the humans in detecting audio deepfakes, but that the converse holds for a certain attack type, for which humans are still more accurate. Furthermore, we found that younger participants are on average better at detecting audio deepfakes than older participants, while IT-professionals hold no advantage over laymen. We conclude that it is important to combine human and machine knowledge in order to improve audio deepfake detection.
【27】 Conditional Sound Generation Using Neural Discrete Time-Frequency Representation Learning 标题:基于神经离散时频表示学习的条件声音生成
作者:Xubo Liu,Turab Iqbal,Jinzheng Zhao,Qiushi Huang,Mark D. Plumbley,Wenwu Wang 机构:Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, UK 备注:Submitted to MLSP 2021 链接:https://arxiv.org/abs/2107.09998 摘要:深度生成模型最近在语音合成和音乐生成方面取得了令人瞩目的成绩。然而,与这些特定领域声音的产生相比,一般声音(如汽车喇叭、狗吠声和枪声)的产生虽然有着广泛的潜在应用,但受到的关注较少。在我们以前的工作中,声音是在时域中使用SampleRNN生成的。然而,使用这种方法很难捕获录音中的长程依赖关系。在这项工作中,我们提出通过神经离散时频表示学习来产生以声音类别为条件的声音。这在建模远程依赖关系和在声音片段中保留局部细粒度结构方面提供了优势。我们在UrbanSound8K数据集上评估了我们提出的方法,并将其与SampleRNN基线进行了比较,其性能指标衡量了生成的声音样本的质量和多样性。实验结果表明,与基线方法相比,该方法在分集性能和质量上都有显著提高。 摘要:Deep generative models have recently achieved impressive performance in speech synthesis and music generation. However, compared to the generation of those domain-specific sounds, the generation of general sounds (such as car horn, dog barking, and gun shot) has received less attention, despite their wide potential applications. In our previous work, sounds are generated in the time domain using SampleRNN. However, it is difficult to capture long-range dependencies within sound recordings using this method. In this work, we propose to generate sounds conditioned on sound classes via neural discrete time-frequency representation learning. This offers an advantage in modelling long-range dependencies and retaining local fine-grained structure within a sound clip. We evaluate our proposed approach on the UrbanSound8K dataset, as compared to a SampleRNN baseline, with the performance metrics measuring the quality and diversity of the generated sound samples. Experimental results show that our proposed method offers significantly better performance in diversity and comparable performance in quality, as compared to the baseline method.
【28】 CL4AC: A Contrastive Loss for Audio Captioning 标题:CL4AC:音频字幕的对比损失
作者:Xubo Liu,Qiushi Huang,Xinhao Mei,Tom Ko,H Lilian Tang,Mark D. Plumbley,Wenwu Wang 机构:Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, UK 备注:The first two authors contributed equally, 5 pages, 3 figures, submitted to DCASE2021 Workshop 链接:https://arxiv.org/abs/2107.09990 摘要:自动音频字幕(AAC)是一种跨模态的翻译任务,旨在使用自然语言来描述音频片段的内容。如DCASE 2021挑战任务6提交的资料所示,社区对这个问题越来越感兴趣。现有的AAC系统通常基于编解码器结构,将音频信号编码成潜在的表示,并与相应的文本描述对齐,然后使用解码器生成字幕。然而,AAC系统的训练经常遇到数据匮乏的问题,这可能导致不准确的表示和音频文本对齐。为了解决这个问题,我们提出了一种新的编码-解码器框架,称为对比损失音频字幕(CL4AC)。在CL4AC中,利用原始音频-文本配对数据产生的自监督信号,通过对比样本来挖掘音频与文本之间的对应关系,在有限数据的训练下,提高了潜在表征的质量和音频与文本的对齐度。在Clotho数据集上进行了实验,验证了该方法的有效性。 摘要:Automated Audio captioning (AAC) is a cross-modal translation task that aims to use natural language to describe the content of an audio clip. As shown in the submissions received for Task 6 of the DCASE 2021 Challenges, this problem has received increasing interest in the community. The existing AAC systems are usually based on an encoder-decoder architecture, where the audio signal is encoded into a latent representation, and aligned with its corresponding text descriptions, then a decoder is used to generate the captions. However, training of an AAC system often encounters the problem of data scarcity, which may lead to inaccurate representation and audio-text alignment. To address this problem, we propose a novel encoder-decoder framework called Contrastive Loss for Audio Captioning (CL4AC). In CL4AC, the self-supervision signals derived from the original audio-text paired data are used to exploit the correspondences between audio and texts by contrasting samples, which can improve the quality of latent representation and the alignment between audio and texts, while trained with limited data. Experiments are performed on the Clotho dataset to show the effectiveness of our proposed approach.
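CL4AC 利用音频-文本配对数据的自监督对比信号来改进表征与对齐。其确切损失形式摘要中未给出,下面用常见的 InfoNCE 式音频-文本对比损失作一个假设性示意:同一批次内配对的(音频,文本)表征互为正样本,其余组合为负样本;温度系数等超参数均为假设。
```python
import torch
import torch.nn.functional as F

def audio_text_contrastive_loss(audio_emb, text_emb, temperature=0.07):
    """InfoNCE 式对比损失(示意):audio_emb/text_emb 形状均为 (B, D),第 i 行互为正样本对。"""
    a = F.normalize(audio_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = a @ t.T / temperature                      # (B, B) 相似度矩阵
    targets = torch.arange(a.size(0), device=a.device)  # 对角线为正样本
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets))

audio_emb, text_emb = torch.randn(8, 256), torch.randn(8, 256)
print(audio_text_contrastive_loss(audio_emb, text_emb).item())
```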