cs.AI人工智能,共计29篇
【1】 CoMPS: Continual Meta Policy Search 标题:COMPS:连续元策略搜索 链接:https://arxiv.org/abs/2112.04467
作者:Glen Berseth,Zhiwei Zhang,Grace Zhang,Chelsea Finn,Sergey Levine 备注:23 pages, under review 摘要:我们开发了一种新的持续元学习方法来应对顺序多任务学习中的挑战。在此设置中,代理的目标是在任何任务序列中快速获得高回报。先前的元强化学习算法在加速新任务的获取方面已显示出良好的效果。但是,它们需要在训练期间访问所有任务。除了简单地将过去的经验转移到新任务之外,我们的目标是设计出能够学会学习的持续强化学习算法,利用他们在以前任务中的经验更快地学习新任务。我们引入了一种新的方法,即连续元策略搜索(CoMPS),该方法通过以增量方式对每个任务进行元训练来消除这一限制,而无需重新访问以前的任务。CoMPS不断重复两个子例程:使用RL学习新任务,使用RL的经验执行完全离线元学习,为后续任务学习做好准备。我们发现,在几个具有挑战性的连续控制任务序列上,CoMPS优于先前的连续学习和非策略元强化方法。 摘要:We develop a new continual meta-learning method to address challenges in sequential multi-task learning. In this setting, the agent's goal is to achieve high reward over any sequence of tasks quickly. Prior meta-reinforcement learning algorithms have demonstrated promising results in accelerating the acquisition of new tasks. However, they require access to all tasks during training. Beyond simply transferring past experience to new tasks, our goal is to devise continual reinforcement learning algorithms that learn to learn, using their experience on previous tasks to learn new tasks more quickly. We introduce a new method, continual meta-policy search (CoMPS), that removes this limitation by meta-training in an incremental fashion, over each task in a sequence, without revisiting prior tasks. CoMPS continuously repeats two subroutines: learning a new task using RL and using the experience from RL to perform completely offline meta-learning to prepare for subsequent task learning. We find that CoMPS outperforms prior continual learning and off-policy meta-reinforcement methods on several sequences of challenging continuous control tasks.
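下面给出一个基于上述摘要思路的极简示意性草图（并非论文官方实现；其中 run_rl、offline_meta_learn 等函数名均为假设的占位），用于说明 CoMPS 按任务顺序交替执行"在新任务上做 RL"与"用已有经验做完全离线元学习"两个子例程、且不回访先前任务的整体流程。

# CoMPS 主循环的示意性草图（假设性占位实现，仅说明控制流程）
from typing import Any, Callable, List


def comps(tasks: List[Any],
          meta_policy: Any,
          run_rl: Callable[[Any, Any], tuple],
          offline_meta_learn: Callable[[Any, list], Any]) -> Any:
    """按任务序列增量式地元训练，不重新访问以前的任务。"""
    experience_buffer = []               # 保存历史任务的经验，只用于离线元学习
    for task in tasks:                   # 任务按顺序到达
        # 子例程 1：以当前元策略为初始化，在新任务上用 RL 学习
        task_policy, task_experience = run_rl(task, meta_policy)
        experience_buffer.append(task_experience)
        # 子例程 2：用已收集的经验做完全离线的元学习，为后续任务做准备
        meta_policy = offline_meta_learn(meta_policy, experience_buffer)
    return meta_policy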
【2】 Adapting Procedural Content Generation to Player Personas Through Evolution 标题:通过进化使过程性内容生成适应玩家角色 链接:https://arxiv.org/abs/2112.04406
作者:Pedro M. Fernandes,Jonathan Jørgensen,Niels N. T. G. Poldervaart 摘要:自动适应玩家的游戏内容为游戏开发打开了新的大门。在本文中,我们提出了一个使用角色代理和经验度量的体系结构,该体系结构支持为特定玩家角色定制的逐步生成的级别。使用我们的游戏“Grave Rave”,我们证明了这种方法在三种不同的体验指标上成功地适应了四种基于规则的角色代理。此外,适应性本质上是特定的,这意味着级别是角色意识的,而不仅仅是针对所选度量的一般优化。 摘要:Automatically adapting game content to players opens new doors for game development. In this paper we propose an architecture using persona agents and experience metrics, which enables evolving procedurally generated levels tailored for particular player personas. Using our game, "Grave Rave", we demonstrate that this approach successfully adapts to four rule-based persona agents over three different experience metrics. Furthermore, the adaptation is shown to be specific in nature, meaning that the levels are persona-conscious, and not just general optimizations with regard to the selected metric.
【3】 Gaudí: Conversational Interactions with Deep Representations to Generate Image Collections 标题:Gaudí:与深层表示的对话交互以生成图像集合 链接:https://arxiv.org/abs/2112.04404
作者:Victor S. Bursztyn,Jennifer Healey,Vishwa Vinay 备注:Accepted at the NeurIPS 2021 Workshop on Machine Learning for Creativity and Design 摘要:基于现实主义语言建模(GPT-3)和跨模态表示(CLIP)的最新进展，Gaudí被开发用于帮助设计师使用自然语言搜索灵感图像。在设计过程的早期阶段，为了引出客户偏好的创意方向，设计师通常会创建名为"情绪板"(mood-board)的灵感图像主题合集。创建情绪板涉及顺序图像搜索，当前使用关键字或图像执行这些搜索。Gaudí将这个过程转化为一个对话，用户在对话中逐渐详细描述情绪板的主题。这种表示方式允许我们的AI根据GPT-3假设的主题，直接从项目简报中从头开始生成新的搜索查询。与之前的情绪板创建计算方法相比，据我们所知，我们首次尝试将情绪板表示为设计师在向客户展示创意方向时讲述的故事。 摘要:Based on recent advances in realistic language modeling (GPT-3) and cross-modal representations (CLIP), Gaudí was developed to help designers search for inspirational images using natural language. In the early stages of the design process, with the goal of eliciting a client's preferred creative direction, designers will typically create thematic collections of inspirational images called "mood-boards". Creating a mood-board involves sequential image searches which are currently performed using keywords or images. Gaudí transforms this process into a conversation where the user is gradually detailing the mood-board's theme. This representation allows our AI to generate new search queries from scratch, straight from a project briefing, following a theme hypothesized by GPT-3. Compared to previous computational approaches to mood-board creation, to the best of our knowledge, ours is the first attempt to represent mood-boards as the stories that designers tell when presenting a creative direction to a client.
【4】 Truth-tracking via Approval Voting: Size Matters 标题:通过批准投票追踪真相:大小很重要 链接:https://arxiv.org/abs/2112.04387
作者:Tahar Allouche,Jérôme Lang,Florian Yger 备注:Accepted in the 36th AAAI Conference on Artificial Intelligence (AAAI 2022) 摘要:认知社会选择的目的在于根据给定的选票揭示隐藏的基本事实，而选票被解释为关于它的嘈杂信号。我们在这里考虑一个简单的设置，其中投票采用批准投票：每个选民认可一组他们认为可能是基本事实的备选方案。基于更可靠的投票包含更少备选方案的直观想法，我们定义了几个噪声模型，它们是Mallows模型的批准投票变体。然后，似然最大化的备选方案被刻画为某个加权批准规则的获胜者，其中选票的权重随其基数的增加而减小。我们在三个图像标注数据集上进行了实验；实验结果表明，基于我们的噪声模型的规则优于标准的批准投票；最好的性能是通过Condorcet噪声模型的变体获得的。 摘要:Epistemic social choice aims at unveiling a hidden ground truth given votes, which are interpreted as noisy signals about it. We consider here a simple setting where votes consist of approval ballots: each voter approves a set of alternatives which they believe can possibly be the ground truth. Based on the intuitive idea that more reliable votes contain fewer alternatives, we define several noise models that are approval voting variants of the Mallows model. The likelihood-maximizing alternative is then characterized as the winner of a weighted approval rule, where the weight of a ballot decreases with its cardinality. We have conducted an experiment on three image annotation datasets; they conclude that rules based on our noise model outperform standard approval voting; the best performance is obtained by a variant of the Condorcet noise model.
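摘要中"选票权重随其基数（所含备选项个数）递减的加权批准规则"可以用下面的极简 Python 草图说明；其中权重函数取 w(k)=1/k 仅作示例（属假设），论文中的实际权重由具体噪声模型推导得出。

from collections import defaultdict

def weighted_approval_winner(ballots, weight=lambda k: 1.0 / k):
    """ballots: 每张选票是一个候选项集合；选票包含的候选项越多，权重越小。"""
    scores = defaultdict(float)
    for ballot in ballots:
        w = weight(len(ballot))
        for alternative in ballot:
            scores[alternative] += w
    return max(scores, key=scores.get)   # 得分最高者即该噪声模型下的似然最大化候选项

# 用法示例：三张批准选票
print(weighted_approval_winner([{"a"}, {"a", "b"}, {"b", "c", "d"}]))  # -> "a"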
【5】 Semantic TrueLearn: Using Semantic Knowledge Graphs in Recommendation Systems 标题:语义TrueLearn:在推荐系统中使用语义知识图 链接:https://arxiv.org/abs/2112.04368
作者:Sahan Bulathwela,María Pérez-Ortiz,Emine Yilmaz,John Shawe-Taylor 备注:Presented at the First International Workshop on Joint Use of Probabilistic Graphical Models and Ontology at Conference on Knowledge Graph and Semantic Web 2021 摘要:在信息推荐者中,需要处理知识领域之间的语义和层次结构,这就产生了许多挑战。这项工作的目的是建立一个国家意识的教育推荐系统,该系统整合了知识主题之间的语义关联,在语义相关的主题之间传播潜在信息。我们介绍了一种新的学习者模型,该模型利用维基百科链接图利用学习资源中知识组件之间的语义相关性,旨在更好地预测终身学习场景中的学习者参与度和潜在知识。从这个意义上说,语义TrueLearn构建了一个人性化的直观知识表示,同时利用贝叶斯机器学习来提高教育参与的预测性能。我们在一个大数据集上的实验表明,这种新的语义版本的TrueLearn算法在预测性能方面取得了统计上的显著改进,它通过一个简单的扩展将语义感知添加到模型中。 摘要:In informational recommenders, many challenges arise from the need to handle the semantic and hierarchical structure between knowledge areas. This work aims to advance towards building a state-aware educational recommendation system that incorporates semantic relatedness between knowledge topics, propagating latent information across semantically related topics. We introduce a novel learner model that exploits this semantic relatedness between knowledge components in learning resources using the Wikipedia link graph, with the aim to better predict learner engagement and latent knowledge in a lifelong learning scenario. In this sense, Semantic TrueLearn builds a humanly intuitive knowledge representation while leveraging Bayesian machine learning to improve the predictive performance of the educational engagement. Our experiments with a large dataset demonstrate that this new semantic version of TrueLearn algorithm achieves statistically significant improvements in terms of predictive performance with a simple extension that adds semantic awareness to the model.
【6】 On visual self-supervision and its effect on model robustness 标题:论视觉自我监督及其对模型稳健性的影响 链接:https://arxiv.org/abs/2112.04367
作者:Michal Kucer,Diane Oyen,Garrett Kenyon 摘要:最近的自我监督方法在学习特征表示方面取得了成功，这些特征表示可以与完全监督中的特征表示相媲美，并且已经证明在几个方面对模型有益：例如，提高模型鲁棒性和分布外检测。在我们的论文中，我们进行了一项实证研究，以更准确地了解自我监督学习(作为预训练技术或对抗性训练的一部分)以何种方式影响模型对$l_2$和$l_\infty$对抗性扰动和自然图像损坏的鲁棒性。自我监督确实可以提高模型的稳健性，但事实证明，问题在于细节。如果简单地将自我监督损失与对抗性训练结合起来，那么当对抗性扰动小于或可与稳健模型训练时的$\epsilon_{train}$值相比较时，可以看到模型准确性的提高。但是，如果观察$\epsilon_{test} \ge \epsilon_{train}$时的精度，则模型精度会下降。事实上，监督损失的权重越大，性能下降的幅度就越大，即损害模型的稳健性。我们确定了将自我监督添加到对抗性训练中的主要方法，并观察到使用自我监督损失优化网络参数和查找对抗性示例可以最大程度地提高模型鲁棒性，因为这可以被视为整体对抗性训练的一种形式。虽然与随机权重初始化相比，自我监督预训练在改进对抗性训练方面有好处，但如果将自我监督纳入对抗性训练，我们观察到在模型鲁棒性或准确性方面没有好处。 摘要:Recent self-supervision methods have found success in learning feature representations that could rival ones from full supervision, and have been shown to be beneficial to the model in several ways: for example improving models robustness and out-of-distribution detection. In our paper, we conduct an empirical study to understand more precisely in what way can self-supervised learning - as a pre-training technique or part of adversarial training - affects model robustness to $l_2$ and $l_\infty$ adversarial perturbations and natural image corruptions. Self-supervision can indeed improve model robustness, however it turns out the devil is in the details. If one simply adds self-supervision loss in tandem with adversarial training, then one sees improvement in accuracy of the model when evaluated with adversarial perturbations smaller or comparable to the value of $\epsilon_{train}$ that the robust model is trained with. However, if one observes the accuracy for $\epsilon_{test} \ge \epsilon_{train}$, the model accuracy drops. In fact, the larger the weight of the supervision loss, the larger the drop in performance, i.e. harming the robustness of the model. We identify primary ways in which self-supervision can be added to adversarial training, and observe that using a self-supervised loss to optimize both network parameters and find adversarial examples leads to the strongest improvement in model robustness, as this can be viewed as a form of ensemble adversarial training. Although self-supervised pre-training yields benefits in improving adversarial training as compared to random weight initialization, we observe no benefit in model robustness or accuracy if self-supervision is incorporated into adversarial training.
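下面是一段高度简化的 PyTorch 草图（非论文官方代码；骨干网络、旋转预测自监督任务以及 λ、eps 等超参数均为说明用的假设），用于示意摘要中效果最好的做法——同时用"监督 + 自监督"联合损失来寻找对抗样本并更新网络参数。

import torch
import torch.nn as nn
import torch.nn.functional as F

def rotation_batch(x):
    """构造 0/90/180/270 度旋转副本及其标签，作为一个简单的自监督(旋转预测)任务。"""
    xs = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)], dim=0)
    ys = torch.arange(4).repeat_interleave(x.size(0))
    return xs, ys

def combined_loss(backbone, cls_head, ssl_head, x, y, lam):
    """监督交叉熵 + lam * 自监督旋转预测损失。"""
    sup = F.cross_entropy(cls_head(backbone(x)), y)
    xr, yr = rotation_batch(x)
    ssl = F.cross_entropy(ssl_head(backbone(xr)), yr)
    return sup + lam * ssl

def pgd_with_ssl(backbone, cls_head, ssl_head, x, y, eps, alpha, steps, lam):
    """用联合损失做 PGD（l_inf 约束），生成对抗样本。"""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        loss = combined_loss(backbone, cls_head, ssl_head, x + delta, y, lam)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return (x + delta).detach()

# 训练步骤示意（模型为占位的小型网络，仅保证可运行）
backbone = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())
cls_head, ssl_head = nn.Linear(8, 10), nn.Linear(8, 4)
opt = torch.optim.SGD([*backbone.parameters(), *cls_head.parameters(),
                       *ssl_head.parameters()], lr=0.1)

x, y = torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,))
x_adv = pgd_with_ssl(backbone, cls_head, ssl_head, x, y,
                     eps=8 / 255, alpha=2 / 255, steps=5, lam=1.0)
opt.zero_grad()
combined_loss(backbone, cls_head, ssl_head, x_adv, y, lam=1.0).backward()
opt.step()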
【7】 Geometry-Aware Fruit Grasping Estimation for Robotic Harvesting in Orchards 标题:果园机器人采摘中基于几何感知的果实抓取估计 链接:https://arxiv.org/abs/2112.04363
作者:Hanwen Kang,Xing Wang,Chao Chen 摘要:田间机器人收获是近年来农业发展中一项很有前途的技术。在自然果园收获水果之前，机器人识别和定位水果是至关重要的。然而，果园采摘机器人的工作空间是复杂的：许多水果被树枝和树叶遮挡。在进行操作之前，估计每个水果的正确抓取姿势是很重要的。在这项研究中，提出了一种几何感知网络A3N，用于使用来自RGB-D相机的颜色和几何感知数据执行端到端实例分割和抓取估计。此外，利用工作空间几何建模辅助机器人操作。此外，我们实施了一种全局到局部的扫描策略，该策略使机器人能够使用两个消费者级RGB-D摄像头在田间环境中准确识别和检索水果。在实验中，我们还对该网络的准确性和鲁棒性进行了综合评估。实验结果表明，A3N的实例分割精度为0.873，平均计算时间为35ms，抓取估计的中心和方向平均精度分别为0.61cm和4.8$^\circ$。总体而言，该机器人系统利用全局到局部扫描和A3N，在田间收割试验中获得了70%-85%的收割成功率。 摘要:Field robotic harvesting is a promising technique in recent development of agricultural industry. It is vital for robots to recognise and localise fruits before the harvesting in natural orchards. However, the workspace of harvesting robots in orchards is complex: many fruits are occluded by branches and leaves. It is important to estimate a proper grasping pose for each fruit before performing the manipulation. In this study, a geometry-aware network, A3N, is proposed to perform end-to-end instance segmentation and grasping estimation using both color and geometry sensory data from a RGB-D camera. Besides, workspace geometry modelling is applied to assist the robotic manipulation. Moreover, we implement a global-to-local scanning strategy, which enables robots to accurately recognise and retrieve fruits in field environments with two consumer-level RGB-D cameras. We also evaluate the accuracy and robustness of proposed network comprehensively in experiments. The experimental results show that A3N achieves 0.873 on instance segmentation accuracy, with an average computation time of 35 ms. The average accuracy of grasping estimation is 0.61 cm and 4.8$^\circ$ in centre and orientation, respectively. Overall, the robotic system that utilizes the global-to-local scanning and A3N, achieves success rate of harvesting ranging from 70% - 85% in field harvesting experiments.
【8】 Ethical and social risks of harm from Language Models 标题:语言模式危害的伦理和社会风险 链接:https://arxiv.org/abs/2112.04359
作者:Laura Weidinger,John Mellor,Maribeth Rauh,Conor Griffin,Jonathan Uesato,Po-Sen Huang,Myra Cheng,Mia Glaese,Borja Balle,Atoosa Kasirzadeh,Zac Kenton,Sasha Brown,Will Hawkins,Tom Stepleton,Courtney Biles,Abeba Birhane,Julia Haas,Laura Rimell,Lisa Anne Hendricks,William Isaac,Sean Legassick,Geoffrey Irving,Iason Gabriel 摘要:本文旨在帮助构建与大规模语言模型(LMs)相关的风险景观。为了促进负责任创新的进步，需要深入了解这些模型带来的潜在风险。利用计算机科学、语言学和社会科学的多学科专业知识和文献，详细分析了广泛的已确定和预期风险。我们概述了六个具体的风险领域：一、歧视、排斥和毒性；二、信息危害；三、错误信息危害；四、恶意使用；五、人机交互危害；六、自动化、访问和环境危害。第一个领域涉及陈规定型观念、不公平歧视、排他性规范、有毒语言，以及LMs对不同社会群体表现较差的问题。第二个重点是私人数据泄漏或LMs正确推断敏感信息的风险。第三个领域涉及由不良、虚假或误导性信息(包括敏感领域中的信息)引起的风险，以及对共享信息的信任受损等连锁风险。第四部分考虑了试图使用LMs造成伤害的行为者的风险。第五个重点关注用于支持与人类用户交互的会话代理的LLM的特定风险，包括不安全使用、操纵或欺骗。第六部分讨论了可能对不同社会群体或社区产生不同影响的环境危害、工作自动化和其他挑战的风险。我们总共深入审查了21项风险。我们讨论了不同风险的起源点，并指出了潜在的缓解方法。最后，我们讨论了实施缓解措施的组织责任，以及协作和参与的作用。我们强调了进一步研究的方向，特别是扩展用于评估与衡量上述LMs风险的工具包。 摘要:This paper aims to help structure the risk landscape associated with large-scale Language Models (LMs). In order to foster advances in responsible innovation, an in-depth understanding of the potential risks posed by these models is needed. A wide range of established and anticipated risks are analysed in detail, drawing on multidisciplinary expertise and literature from computer science, linguistics, and social sciences. We outline six specific risk areas: I. Discrimination, Exclusion and Toxicity, II. Information Hazards, III. Misinformation Harms, IV. Malicious Uses, V. Human-Computer Interaction Harms, VI. Automation, Access, and Environmental Harms. The first area concerns the perpetuation of stereotypes, unfair discrimination, exclusionary norms, toxic language, and lower performance by social group for LMs. The second focuses on risks from private data leaks or LMs correctly inferring sensitive information. The third addresses risks arising from poor, false or misleading information including in sensitive domains, and knock-on risks such as the erosion of trust in shared information. The fourth considers risks from actors who try to use LMs to cause harm. The fifth focuses on risks specific to LLMs used to underpin conversational agents that interact with human users, including unsafe use, manipulation or deception. The sixth discusses the risk of environmental harm, job automation, and other challenges that may have a disparate effect on different social groups or communities. In total, we review 21 risks in-depth. We discuss the points of origin of different risks and point to potential mitigation approaches. Lastly, we discuss organisational responsibilities in implementing mitigations, and the role of collaboration and participation. We highlight directions for further research, particularly on expanding the toolkit for assessing and evaluating the outlined risks in LMs.
【9】 Trainability for Universal GNNs Through Surgical Randomness 标题:基于手术随机性的通用GNN可训练性 链接:https://arxiv.org/abs/2112.04314
作者:Billy Joe Franks,Markus Anders,Marius Kloft,Pascal Schweitzer 摘要:消息传递神经网络(MPNN)具有可证明的局限性，通用网络可以克服这些局限性。然而，通用网络通常是不切实际的。唯一的例外是随机节点初始化(RNI)，这是一种数据增强方法，可以产生可证明的通用网络。不幸的是，RNI存在严重的缺点，如收敛速度慢和对超参数变化高度敏感。我们将强大的技术从图同构测试的实际世界转移到MPNNs，解决了这些缺点。这最终导致个体化-细化节点初始化(IRNI)。我们将RNI中不加区分、随意注入的随机性，替换为仅在精心选择的节点上"外科手术式"地注入少量随机位。我们新颖的非侵入式数据扩充方案在解决可训练性问题的同时保持了网络的通用性。我们形式化地证明了所声称的通用性，并在先前专门为此目的设计的合成基准集上，通过实验证实IRNI克服了MPNN的局限性。我们还在标准基准数据集PROTEINS和NCI1上验证了我们方法的实际有效性。 摘要:Message passing neural networks (MPNN) have provable limitations, which can be overcome by universal networks. However, universal networks are typically impractical. The only exception is random node initialization (RNI), a data augmentation method that results in provably universal networks. Unfortunately, RNI suffers from severe drawbacks such as slow convergence and high sensitivity to changes in hyperparameters. We transfer powerful techniques from the practical world of graph isomorphism testing to MPNNs, resolving these drawbacks. This culminates in individualization-refinement node initialization (IRNI). We replace the indiscriminate and haphazard randomness used in RNI by a surgical incision of only a few random bits at well-selected nodes. Our novel non-intrusive data-augmentation scheme maintains the networks' universality while resolving the trainability issues. We formally prove the claimed universality and corroborate experimentally -- on synthetic benchmarks sets previously explicitly designed for that purpose -- that IRNI overcomes the limitations of MPNNs. We also verify the practical efficacy of our approach on the standard benchmark data sets PROTEINS and NCI1.
【10】 iRoPro: An interactive Robot Programming Framework 标题:iRoPro:一种交互式机器人编程框架 链接:https://arxiv.org/abs/2112.04289
作者:Ying Siu Liang,Damien Pellier,Humbert Fiorino,Sylvie Pesty 备注:None 摘要:从制造环境到个人住宅,终端用户任务的巨大多样性使得用于通用应用的预编程机器人极具挑战性。事实上,从头开始教机器人新的动作,这些动作可以重复用于以前看不见的任务,这仍然是一项艰巨的挑战,通常留给机器人专家来完成。在这项工作中,我们介绍了iRoPro,一个交互式机器人编程框架,它允许几乎没有技术背景的最终用户教授机器人新的可重用动作。我们将演示编程和自动规划技术相结合,允许用户通过动觉演示教授新动作来构建机器人的知识库。这些操作被概括并与任务计划器一起重用,以解决用户定义的以前未发现的问题。我们将iRoPro作为一个端到端系统在巴克斯特研究机器人上实现,通过演示用户可以通过图形用户界面进行定制以适应其特定用例,从而同时教授低级和高级动作。为了评估我们的方法的可行性,我们首先进行了预设计实验,以更好地理解用户对相关概念的采用以及建议的机器人编程过程。我们将结果与后设计实验进行比较,在后设计实验中,我们进行了一项用户研究,以验证我们的方法在实际最终用户中的可用性。总的来说,我们展示了具有不同编程水平和教育背景的用户可以轻松地学习和使用iRoPro及其机器人编程过程。 摘要:The great diversity of end-user tasks ranging from manufacturing environments to personal homes makes pre-programming robots for general purpose applications extremely challenging. In fact, teaching robots new actions from scratch that can be reused for previously unseen tasks remains a difficult challenge and is generally left up to robotics experts. In this work, we present iRoPro, an interactive Robot Programming framework that allows end-users with little to no technical background to teach a robot new reusable actions. We combine Programming by Demonstration and Automated Planning techniques to allow the user to construct the robot's knowledge base by teaching new actions by kinesthetic demonstration. The actions are generalised and reused with a task planner to solve previously unseen problems defined by the user. We implement iRoPro as an end-to-end system on a Baxter Research Robot to simultaneously teach low- and high-level actions by demonstration that the user can customise via a Graphical User Interface to adapt to their specific use case. To evaluate the feasibility of our approach, we first conducted pre-design experiments to better understand the user's adoption of involved concepts and the proposed robot programming process. We compare results with post-design experiments, where we conducted a user study to validate the usability of our approach with real end-users. Overall, we showed that users with different programming levels and educational backgrounds can easily learn and use iRoPro and its robot programming process.
【11】 TempAMLSI : Temporal Action Model Learning based on Grammar Induction 标题:TempAMLSI:基于语法归纳的时间动作模型学习 链接:https://arxiv.org/abs/2112.04286
作者:Maxence Grand,Damien Pellier,Humbert Fiorino 备注:Proceedings of the International workshop of Knowledge Engineering for Planning and Scheduling (ICAPS), 2021 摘要:手工编码PDDL域通常被认为是困难、乏味和容易出错的。当必须对时态域进行编码时，难度更大。事实上，动作是有持续时间的，其效果不是瞬时的。本文提出了一种基于AMLSI方法的时态域学习算法TempAMLSI。TempAMLSI基于时态规划中的经典假设，即可以将非时态域转换为时态域。TempAMLSI是第一种能够使用单一硬包络和Cushing区间学习时态域的方法。我们通过实验证明，TempAMLSI能够学习精确的时态域，即可以直接用于求解具有不同形式动作并发性的新规划问题的时态域。 摘要:Hand-encoding PDDL domains is generally accepted as difficult, tedious and error-prone. The difficulty is even greater when temporal domains have to be encoded. Indeed, actions have a duration and their effects are not instantaneous. In this paper, we present TempAMLSI, an algorithm based on the AMLSI approach able to learn temporal domains. TempAMLSI is based on the classical assumption done in temporal planning that it is possible to convert a non-temporal domain into a temporal domain. TempAMLSI is the first approach able to learn temporal domain with single hard envelope and Cushing's intervals. We show experimentally that TempAMLSI is able to learn accurate temporal domains, i.e., temporal domain that can be used directly to solve new planning problem, with different forms of action concurrency.
【12】 On the Use of Unrealistic Predictions in Hundreds of Papers Evaluating Graph Representations 标题:论不切实际的预测在数百篇评价图形表示的论文中的运用 链接:https://arxiv.org/abs/2112.04274
作者:Li-Chung Lin,Cheng-Hung Liu,Chih-Ming Chen,Kai-Chin Hsu,I-Feng Wu,Ming-Feng Tsai,Chih-Jen Lin 备注:Accepted by AAAI 2022 摘要:使用基本事实进行预测听起来像是机器学习中的一个矛盾修饰法。然而,这种不切实际的设置在数百篇(如果不是数千篇)寻找图形表示的论文中被使用。为了使用获得的表示来评估节点分类的多标签问题,许多工作在预测阶段假设每个测试实例的标签数量是已知的。在实践中,这样的地面真相信息很少可用,但我们指出,这种不适当的设置现在在这个研究领域无处不在。我们详细调查了这种情况发生的原因。我们的分析表明,如果信息不切实际,性能可能会被高估。为了了解为什么没有使用合适的预测,我们确定了应用一些多标签技术的困难。为了在未来的研究中使用,我们建议在不使用实际未知信息的情况下进行简单有效的设置。最后,我们借此机会对多标签节点分类中的主要图表示学习方法进行了公平而认真的比较。 摘要:Prediction using the ground truth sounds like an oxymoron in machine learning. However, such an unrealistic setting was used in hundreds, if not thousands of papers in the area of finding graph representations. To evaluate the multi-label problem of node classification by using the obtained representations, many works assume in the prediction stage that the number of labels of each test instance is known. In practice such ground truth information is rarely available, but we point out that such an inappropriate setting is now ubiquitous in this research area. We detailedly investigate why the situation occurs. Our analysis indicates that with unrealistic information, the performance is likely over-estimated. To see why suitable predictions were not used, we identify difficulties in applying some multi-label techniques. For the use in future studies, we propose simple and effective settings without using practically unknown information. Finally, we take this chance to conduct a fair and serious comparison of major graph-representation learning methods on multi-label node classification.
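摘要批评的"不切实际设置"可以用下面的小例子直观说明（假设性示例，非论文代码）：前者利用每个测试样本真实标签个数 k 取 top-k，后者不依赖这一实践中通常未知的信息、改用阈值判定。

import numpy as np

scores = np.array([[0.9, 0.6, 0.2, 0.1],    # 每行是某个测试节点对 4 个标签的打分
                   [0.4, 0.3, 0.8, 0.7]])
true_label_counts = [1, 3]                   # 每个节点真实标签的个数（实际中通常未知）

# 不切实际的设置：假设已知每个样本的标签个数 k，取分数最高的 k 个标签
unrealistic = [set(np.argsort(-row)[:k]) for row, k in zip(scores, true_label_counts)]

# 更现实的设置：不使用标签个数信息，例如按固定或在验证集上调好的阈值判定
threshold = 0.5
realistic = [set(np.where(row >= threshold)[0]) for row in scores]

print(unrealistic)  # [{0}, {0, 2, 3}] —— 借助未知信息，性能往往被高估
print(realistic)    # [{0, 1}, {2, 3}]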
【13】 Efficient Batch Homomorphic Encryption for Vertically Federated XGBoost 标题:一种高效的垂直联合XGBoost批量同态加密算法 链接:https://arxiv.org/abs/2112.04261
作者:Wuxing Xu,Hao Fan,Kaixin Li,Kai Yang 摘要:越来越多的组织和机构致力于利用外部数据来提高人工智能服务的性能。为了解决数据隐私和安全问题，联合学习吸引了学术界和工业界越来越多的关注，以跨多个孤立的数据提供商安全地构建AI模型。在本文中，我们研究了在实际应用中广泛使用的XGBoost模型适应垂直联合学习环境的效率问题。最先进的垂直联合XGBoost框架需要大量加密操作和密文传输，这使得模型训练的效率远远低于本地训练XGBoost模型。为了弥补这一差距，我们提出了一种新的批量同态加密方法，将加密相关的计算和传输成本降低近一半。这是通过将一阶导数和二阶导数编码为一个数字来实现的，用于加密、密文传输和同态加法操作。多个一阶导数和二阶导数之和可以从编码值之和同时解码。我们受BatchCrypt在水平联邦学习工作中批处理思想的启发，设计了一种新的批处理方法，以解决其只能容纳极少量负数的限制。该方法的编码过程包括移位、截断、量化和批处理四个步骤，而解码过程包括去量化和向后移位。通过理论分析和大量数值实验证明了该方法的优越性。 摘要:More and more orgainizations and institutions make efforts on using external data to improve the performance of AI services. To address the data privacy and security concerns, federated learning has attracted increasing attention from both academia and industry to securely construct AI models across multiple isolated data providers. In this paper, we studied the efficiency problem of adapting widely used XGBoost model in real-world applications to vertical federated learning setting. State-of-the-art vertical federated XGBoost frameworks requires large number of encryption operations and ciphertext transmissions, which makes the model training much less efficient than training XGBoost models locally. To bridge this gap, we proposed a novel batch homomorphic encryption method to cut the cost of encryption-related computation and transmission in nearly half. This is achieved by encoding the first-order derivative and the second-order derivative into a single number for encryption, ciphertext transmission, and homomorphic addition operations. The sum of multiple first-order derivatives and second-order derivatives can be simultaneously decoded from the sum of encoded values. We are motivated by the batch idea in the work of BatchCrypt for horizontal federated learning, and design a novel batch method to address the limitations of allowing quite few number of negative numbers. The encode procedure of the proposed batch method consists of four steps, including shifting, truncating, quantizing and batching, while the decoding procedure consists of de-quantization and shifting back. The advantages of our method are demonstrated through theoretical analysis and extensive numerical experiments.
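下面用一段与同态加密无关的纯 Python 草图，说明摘要中"把一阶导数 g 和二阶导数 h 经移位、量化后打包成一个数，使打包值之和可同时解码出两者之和"的编码思路；位宽、缩放因子与偏移量均为示意性假设，真实方案还需结合加法同态加密（如 Paillier）以及截断与负数处理等细节。

SCALE = 1 << 16      # 定点量化比例（假设值）
OFFSET = 8.0         # 平移量，把可能为负的导数移成非负（假设 |g|、|h| < 8）
BITS = 40            # 低位段位宽，需保证 n 个 h 的量化值之和不溢出

def quantize(v):
    return int(round((v + OFFSET) * SCALE))      # 移位 + 量化（此处省略截断细节）

def encode(g, h):
    return (quantize(g) << BITS) | quantize(h)   # 高位放 g，低位放 h，打包成一个整数

def decode_sum(total, n):
    """对 n 个编码值之和同时解码出 sum(g) 与 sum(h)。"""
    sum_qh = total & ((1 << BITS) - 1)
    sum_qg = total >> BITS
    return sum_qg / SCALE - n * OFFSET, sum_qh / SCALE - n * OFFSET

# 用法示例：三个样本的 (g, h)，只需对编码值做一次求和（对应密文上的同态加法）
pairs = [(0.5, 1.2), (-0.3, 0.8), (1.1, 2.0)]
total = sum(encode(g, h) for g, h in pairs)
print(decode_sum(total, len(pairs)))   # 约为 (1.3, 4.0)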
【14】 Application of Deep Reinforcement Learning to Payment Fraud 标题:深度强化学习在支付欺诈中的应用 链接:https://arxiv.org/abs/2112.04236
作者:Siddharth Vimal,Kanishka Kayathwal,Hardik Wadhwa,Gaurav Dhama 备注:Multi-Armed Bandits and Reinforcement Learning: Advancing Decision Making in E-Commerce and Beyond at KDD 2021 摘要:在过去十年中，消费者可以选择的各种数字支付方式一直是电子商务交易的关键驱动力。不幸的是，这也导致了网络罪犯和欺诈者不断通过部署越来越复杂的欺诈攻击来寻找这些系统中的漏洞。典型的欺诈检测系统采用标准的监督学习方法，重点是最大化欺诈召回率。然而，我们认为这样的公式可能导致次优解。这些欺诈模型的设计要求它们对数据中高度的类别不平衡具有鲁棒性，能够适应欺诈模式的变化，在欺诈率和拒绝率之间保持平衡以实现收入最大化，并且易于接受异步反馈，因为通常在交易和欺诈实现之间存在明显的滞后。为了实现这一点，我们将欺诈检测描述为一个序贯决策问题，通过在模型中以奖励函数的形式包含效用最大化。系统状态由历史拒绝率和欺诈率定义，动作空间为由批准或拒绝交易构成的二元动作空间。在这项研究中，我们主要关注效用最大化，并为此探索不同的奖励函数。针对两个公开的欺诈数据集，使用深度Q-学习对所提出的强化学习系统的性能进行了评估，并与不同的分类器进行了比较。我们的目标是在今后的工作中解决其余问题。 摘要:The large variety of digital payment choices available to consumers today has been a key driver of e-commerce transactions in the past decade. Unfortunately, this has also given rise to cybercriminals and fraudsters who are constantly looking for vulnerabilities in these systems by deploying increasingly sophisticated fraud attacks. A typical fraud detection system employs standard supervised learning methods where the focus is on maximizing the fraud recall rate. However, we argue that such a formulation can lead to sub-optimal solutions. The design requirements for these fraud models requires that they are robust to the high-class imbalance in the data, adaptive to changes in fraud patterns, maintain a balance between the fraud rate and the decline rate to maximize revenue, and be amenable to asynchronous feedback since usually there is a significant lag between the transaction and the fraud realization. To achieve this, we formulate fraud detection as a sequential decision-making problem by including the utility maximization within the model in the form of the reward function. The historical decline rate and fraud rate define the state of the system with a binary action space composed of approving or declining the transaction. In this study, we primarily focus on utility maximization and explore different reward functions to this end. The performance of the proposed Reinforcement Learning system has been evaluated for two publicly available fraud datasets using Deep Q-learning and compared with different classifiers. We aim to address the rest of the issues in future work.
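为说明摘要中"以历史拒绝率与欺诈率构成状态、以批准/拒绝为二元动作、把效用最大化写进奖励函数"的建模方式，下面给出一个假设性的极简环境草图（奖励形式、概率与数值均为示意，并不代表论文采用的具体设定，也未建模欺诈反馈的滞后）。

import random

APPROVE, DECLINE = 1, 0

class FraudEnv:
    """极简交易审批环境：状态 = (历史拒绝率, 历史欺诈率)。"""

    def __init__(self, fraud_prob=0.02, revenue_rate=0.02, fraud_loss=1.0):
        self.fraud_prob, self.revenue_rate, self.fraud_loss = fraud_prob, revenue_rate, fraud_loss
        self.n = self.declined = self.fraud_approved = 0

    def state(self):
        decline_rate = self.declined / max(self.n, 1)
        fraud_rate = self.fraud_approved / max(self.n, 1)
        return decline_rate, fraud_rate

    def step(self, action):
        self.n += 1
        is_fraud = random.random() < self.fraud_prob
        if action == DECLINE:
            self.declined += 1
            reward = 0.0                          # 拒绝：既无收入也无欺诈损失
        elif is_fraud:
            self.fraud_approved += 1
            reward = -self.fraud_loss             # 批准了欺诈交易：承担损失
        else:
            reward = self.revenue_rate            # 批准了正常交易：获得手续费收入
        return self.state(), reward

# 用法示例：随机策略与环境交互若干步
env = FraudEnv()
for _ in range(5):
    s, r = env.step(APPROVE if random.random() < 0.9 else DECLINE)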
【15】 Towards automation of threat modeling based on a semantic model of attack patterns and weaknesses 标题:基于攻击模式和弱点语义模型的威胁建模自动化 链接:https://arxiv.org/abs/2112.04231
作者:Andrei Brazhuk 摘要:这项工作考虑了建立和使用一个正式的知识库(模型)的挑战，该知识库将ATT&CK、CAPEC、CWE、CVE安全枚举结合起来。所提出的模型可用于学习攻击技术、攻击模式、弱点和漏洞之间的关系，以便构建各种威胁场景，特别是用于威胁建模。该模型创建为本体，具有OWL和RDF格式的免费可用数据集。本体的使用是集成安全枚举的结构化和基于图的方法的替代方法。在这项工作中，我们考虑了一种基于该知识库和本体驱动威胁建模框架、利用ATT&CK数据组件进行威胁建模的方法。此外，我们还做了一些评估，讨论了如何使用这种本体论的威胁建模方法以及可能面临的挑战。 摘要:This works considers challenges of building and usage a formal knowledge base (model), which unites the ATT&CK, CAPEC, CWE, CVE security enumerations. The proposed model can be used to learn relations between attack techniques, attack pattern, weaknesses, and vulnerabilities in order to build various threat landscapes, in particular, for threat modeling. The model is created as an ontology with freely available datasets in the OWL and RDF formats. The use of ontologies is an alternative of structural and graph based approaches to integrate the security enumerations. In this work we consider an approach of threat modeling with the data components of ATT&CK based on the knowledge base and an ontology driven threat modeling framework. Also, some evaluations are made, how it can be possible to use the ontological approach of threat modeling and which challenges this can be faced.
【16】 Replay For Safety 标题:为安全起见重播 链接:https://arxiv.org/abs/2112.04229
作者:Liran Szlak,Ohad Shamir 摘要:经验重放（Lin, 1993; Mnih et al., 2015）是一种广泛使用的技术，用于在RL算法中实现数据的高效使用和改进性能。在经验回放中，过去的转换存储在内存缓冲区中，并在学习过程中重复使用。在以前的工作中，已经对回放缓冲区中的采样方案提出了各种建议，试图以最佳方式选择那些最有助于收敛到最优策略的经验。这里，我们给出了重播抽样方案的一些条件，以确保收敛性，重点讨论了著名的表格式Q-学习算法。在建立了收敛的充分条件之后，我们转而建议对经验重放的一种稍微不同的用法——以一种有偏见的方式重放记忆，作为改变结果策略属性的一种手段。我们开始对经验回放进行严格研究，将其作为控制和修改最终策略属性的工具。特别是，我们证明了使用适当的有偏采样方案可以得到安全（safe）的策略。我们相信，使用经验重播作为一种偏向机制，允许以理想的方式控制产生的策略，这是一种在许多应用中具有广阔前景的想法。 摘要:Experience replay (Lin, 1993; Mnih et al., 2015) is a widely used technique to achieve efficient use of data and improved performance in RL algorithms. In experience replay, past transitions are stored in a memory buffer and re-used during learning. Various suggestions for sampling schemes from the replay buffer have been suggested in previous works, attempting to optimally choose those experiences which will most contribute to the convergence to an optimal policy. Here, we give some conditions on the replay sampling scheme that will ensure convergence, focusing on the well-known Q-learning algorithm in the tabular setting. After establishing sufficient conditions for convergence, we turn to suggest a slightly different usage for experience replay - replaying memories in a biased manner as a means to change the properties of the resulting policy. We initiate a rigorous study of experience replay as a tool to control and modify the properties of the resulting policy. In particular, we show that using an appropriate biased sampling scheme can allow us to achieve a safe policy. We believe that using experience replay as a biasing mechanism that allows controlling the resulting policy in desirable ways is an idea with promising potential for many applications.
【17】 Convergence Results For Q-Learning With Experience Replay 标题:带经验回放的Q-学习的收敛性结果 链接:https://arxiv.org/abs/2112.04213
作者:Liran Szlak,Ohad Shamir 摘要:RL中一种常用的启发式方法是经验重放（例如 Lin, 1993; Mnih et al., 2015），在这种方法中，学习者存储并重复使用过去的轨迹，就像在线采样一样。在这项工作中，我们开始在表格Q-学习环境中对这种启发式进行严格的研究。我们提供了一个收敛速度保证，并讨论了它如何与Q-学习的收敛性进行比较，这取决于重要参数，如重放迭代的频率和次数。通过引入和分析一类简单的MDP，我们还提供了理论证据，表明我们何时可以期望这种启发式严格提高性能。最后，我们提供了一些实验来支持我们的理论发现。 摘要:A commonly used heuristic in RL is experience replay (e.g., Lin, 1993; Mnih et al., 2015), in which a learner stores and re-uses past trajectories as if they were sampled online. In this work, we initiate a rigorous study of this heuristic in the setting of tabular Q-learning. We provide a convergence rate guarantee, and discuss how it compares to the convergence of Q-learning depending on important parameters such as the frequency and number of replay iterations. We also provide theoretical evidence showing when we might expect this heuristic to strictly improve performance, by introducing and analyzing a simple class of MDPs. Finally, we provide some experiments to support our theoretical findings.
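下面给出"表格型 Q-learning + 经验重放"的极简示意代码（环境是一个假设的 5 状态链式 MDP，参数均为示例取值，仅用于说明"存储转移并反复回放"这一启发式，并不对应论文的理论分析设置）。

import random
from collections import defaultdict

# 一个 5 状态的链式 MDP：动作 0 向左、1 向右，走到最右端获得奖励 1 并重置
N_STATES, GAMMA, ALPHA = 5, 0.9, 0.1

def step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r

Q = defaultdict(float)
buffer = []                                   # 经验重放缓冲区

def q_update(s, a, r, s2):
    target = r + GAMMA * max(Q[(s2, 0)], Q[(s2, 1)])
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])

s = 0
for t in range(2000):
    a = random.randint(0, 1)                  # 探索策略：均匀随机
    s2, r = step(s, a)
    buffer.append((s, a, r, s2))              # 存储转移
    q_update(s, a, r, s2)                     # 在线更新一次
    for (bs, ba, br, bs2) in random.sample(buffer, min(8, len(buffer))):
        q_update(bs, ba, br, bs2)             # 再从缓冲区均匀采样若干条转移进行回放
    s = 0 if s2 == N_STATES - 1 else s2

print(round(Q[(0, 1)], 3))                    # 应接近 gamma^3 = 0.729（向右直达的折扣回报）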
【18】 Pretrained Cost Model for Distributed Constraint Optimization Problems 标题:分布式约束优化问题的预训练成本模型 链接:https://arxiv.org/abs/2112.04187
作者:Yanchen Deng,Shufeng Kong,Bo An 备注:Accepted by AAAI-22 摘要:分布式约束优化问题(DCOP)是组合优化问题的一个重要子类,其中信息和控制分布在多个自治代理之间。以前,机器学习(ML)主要通过学习有效的启发式算法来解决组合优化问题。然而,现有的基于ML的启发式方法往往不能推广到不同的搜索算法。最重要的是,这些方法通常需要完全了解要解决的问题,这不适用于分布式环境,因为在分布式环境中,由于地理限制或隐私问题,集中化不现实。为了解决通用性问题,我们提出了一种新的DCOP有向无环图表示模式,并利用图注意网络(GAT)嵌入图表示。然后,我们的模型GAT-PCM以离线方式使用最佳标记数据进行预训练,以便构造有效的启发式算法,以促进广泛的DCOP算法,其中评估部分分配的质量至关重要,例如局部搜索或回溯搜索。此外,为了实现分散的模型推理,我们提出了GAT-PCM的分布式嵌入模式,其中每个代理只交换嵌入向量,并展示了其合理性和复杂性。最后,我们将模型与局部搜索或回溯搜索算法相结合,证明了模型的有效性。大量的实证评估表明,GAT PCM增强算法在各种基准测试中显著优于最先进的方法。预训练模型可在以下网址获得:https://github.com/dyc941126/GAT-PCM. 摘要:Distributed Constraint Optimization Problems (DCOPs) are an important subclass of combinatorial optimization problems, where information and controls are distributed among multiple autonomous agents. Previously, Machine Learning (ML) has been largely applied to solve combinatorial optimization problems by learning effective heuristics. However, existing ML-based heuristic methods are often not generalizable to different search algorithms. Most importantly, these methods usually require full knowledge about the problems to be solved, which are not suitable for distributed settings where centralization is not realistic due to geographical limitations or privacy concerns. To address the generality issue, we propose a novel directed acyclic graph representation schema for DCOPs and leverage the Graph Attention Networks (GATs) to embed graph representations. Our model, GAT-PCM, is then pretrained with optimally labelled data in an offline manner, so as to construct effective heuristics to boost a broad range of DCOP algorithms where evaluating the quality of a partial assignment is critical, such as local search or backtracking search. Furthermore, to enable decentralized model inference, we propose a distributed embedding schema of GAT-PCM where each agent exchanges only embedded vectors, and show its soundness and complexity. Finally, we demonstrate the effectiveness of our model by combining it with a local search or a backtracking search algorithm. Extensive empirical evaluations indicate that the GAT-PCM-boosted algorithms significantly outperform the state-of-the-art methods in various benchmarks. The pretrained model is available at https://github.com/dyc941126/GAT-PCM.
【19】 Equity Promotion in Online Resource Allocation 标题:网络资源配置中的公平促进 链接:https://arxiv.org/abs/2112.04169
作者:Pan Xu,Yifan Xu 备注:A preliminary version will appear in the 36th AAAI conference on artificial intelligence (AAAI 22) 摘要:我们考虑典型非营利环境下的在线资源分配问题，其中有限甚至稀缺的资源由政府等非营利组织管理。我们通过假设到达请求者在其外部因素(如需求)方面是同质的，但在其内部属性(如人口统计)方面是异质的，从而关注内部公平。具体而言，我们根据每个到达的请求者的人口统计(即种族、性别和年龄)将其与一个或多个群体相关联，并且我们旨在设计一个公平的分配策略，以便每个请求者群体都能获得与预设目标比率成比例的公平资源份额。我们提出了两种基于LP的抽样算法，并基于明尼苏达州卫生部维护的真实COVID-19疫苗接种数据，从理论(竞争比分析)和实验两方面对其进行了研究。理论和数值结果均表明，我们的基于LP的抽样策略可以有效促进公平性，尤其是当到达人群的构成比例失衡时，正如在COVID-19疫苗接种推广早期阶段所观察到的那样。 摘要:We consider online resource allocation under a typical non-profit setting, where limited or even scarce resources are administered by a not-for-profit organization like a government. We focus on the internal-equity by assuming that arriving requesters are homogeneous in terms of their external factors like demands but heterogeneous for their internal attributes like demographics. Specifically, we associate each arriving requester with one or several groups based on their demographics (i.e., race, gender, and age), and we aim to design an equitable distributing strategy such that every group of requesters can receive a fair share of resources proportional to a preset target ratio. We present two LP-based sampling algorithms and investigate them both theoretically (in terms of competitive-ratio analysis) and experimentally based on real COVID-19 vaccination data maintained by the Minnesota Department of Health. Both theoretical and numerical results show that our LP-based sampling strategies can effectively promote equity, especially when the arrival population is disproportionately represented, as observed in the early stage of the COVID-19 vaccine rollout.
【20】 SNEAK: Synonymous Sentences-Aware Adversarial Attack on Natural Language Video Localization 标题:SNEAK:自然语言视频定位的同义句感知对抗性攻击 链接:https://arxiv.org/abs/2112.04154
作者:Wenbo Gou,Wen Shi,Jian Lou,Lijie Huang,Pan Zhou,Ruixuan Li 摘要:自然语言视频定位(NLVL)是视觉语言理解领域的一项重要任务，它不仅要求深入理解计算机视觉和自然语言方面，更重要的是深入理解两者之间的相互作用。对抗性漏洞已被公认为深层神经网络模型的一个关键安全问题，需要谨慎调查。尽管对视频和语言任务进行了广泛而独立的研究，但目前对NLVL等视觉-语言联合任务中对抗性稳健性的理解还不太成熟。因此，本文旨在通过从攻击和防御两个方面考察漏洞的三个方面，全面研究NLVL模型的对抗鲁棒性。为了实现攻击目标，我们提出了一种新的对抗性攻击范式，称为NLVL上的同义句感知对抗性攻击(SNEAK)，它捕获了视觉和语言双方之间的跨模态相互作用。 摘要:Natural language video localization (NLVL) is an important task in the vision-language understanding area, which calls for an in-depth understanding of not only computer vision and natural language side alone, but more importantly the interplay between both sides. Adversarial vulnerability has been well-recognized as a critical security issue of deep neural network models, which requires prudent investigation. Despite its extensive yet separated studies in video and language tasks, current understanding of the adversarial robustness in vision-language joint tasks like NLVL is less developed. This paper therefore aims to comprehensively investigate the adversarial robustness of NLVL models by examining three facets of vulnerabilities from both attack and defense aspects. To achieve the attack goal, we propose a new adversarial attack paradigm called synonymous sentences-aware adversarial attack on NLVL (SNEAK), which captures the cross-modality interplay between the vision and language sides.
【21】 Model-Value Inconsistency as a Signal for Epistemic Uncertainty 标题:作为认知不确定性信号的模型-值不一致 链接:https://arxiv.org/abs/2112.04153
作者:Angelos Filos,Eszter Vértes,Zita Marinho,Gregory Farquhar,Diana Borsa,Abram Friesen,Feryal Behbahani,Tom Schaul,André Barreto,Simon Osindero 备注:The first three authors contributed equally 摘要:通过使用环境模型和价值函数，agent可以通过将模型展开为不同的长度并使用其价值函数进行自举，来构建对状态价值的许多估计。我们的关键洞察是，我们可以将这组价值估计视为一种集成，我们称之为隐式价值集成(implicit value ensemble, IVE)。因此，这些估计之间的差异可以作为代理认知不确定性的代理量；我们将此信号称为模型-价值不一致(model-value inconsistency)，简称自不一致(self-inconsistency)。与之前通过训练多个模型和/或价值函数的集成来估计不确定性的工作不同，这种方法只需要单个模型和价值函数，而大多数基于模型的强化学习算法已经在学习这些模型和价值函数。我们在表格型设置和基于像素的函数逼近设置中均提供了经验证据，证明自不一致性是有用的：(i)可作为探索信号，(ii)有助于在分布偏移下安全行动，以及(iii)可用于增强基于模型的价值规划的鲁棒性。 摘要:Using a model of the environment and a value function, an agent can construct many estimates of a state's value, by unrolling the model for different lengths and bootstrapping with its value function. Our key insight is that one can treat this set of value estimates as a type of ensemble, which we call an implicit value ensemble (IVE). Consequently, the discrepancy between these estimates can be used as a proxy for the agent's epistemic uncertainty; we term this signal model-value inconsistency or self-inconsistency for short. Unlike prior work which estimates uncertainty by training an ensemble of many models and/or value functions, this approach requires only the single model and value function which are already being learned in most model-based reinforcement learning algorithms. We provide empirical evidence in both tabular and function approximation settings from pixels that self-inconsistency is useful (i) as a signal for exploration, (ii) for acting safely under distribution shifts, and (iii) for robustifying value-based planning with a model.
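摘要中"把不同展开长度得到的多个价值估计视为隐式集成、用它们之间的分歧近似认知不确定性"的想法，可以用下面这个假设性的表格型小例子来说明（模型、奖励与价值函数均为虚构的玩具设定）。

import numpy as np

# 玩具 MDP：3 个状态的（学到的）确定性模型与奖励，以及一个（学到的）价值函数 V
P = {0: 1, 1: 2, 2: 2}          # s -> s'（确定性转移，便于演示）
R = {0: 0.0, 1: 0.5, 2: 1.0}    # 在状态 s 获得的奖励
V = np.array([2.0, 4.0, 9.0])   # 学到的价值函数（与模型并不完全一致）
GAMMA = 0.9

def k_step_estimate(s, k):
    """把模型展开 k 步，再用 V 做自举，得到状态 s 的一个价值估计。"""
    ret, discount = 0.0, 1.0
    for _ in range(k):
        ret += discount * R[s]
        s, discount = P[s], discount * GAMMA
    return ret + discount * V[s]

def self_inconsistency(s, max_k=4):
    """隐式价值集成：不同展开长度的估计之间的分歧，作为认知不确定性的代理量。"""
    estimates = np.array([k_step_estimate(s, k) for k in range(max_k + 1)])  # k=0 即 V(s)
    return estimates, estimates.std()

est, unc = self_inconsistency(0)
print(est)   # 各展开长度下对状态 0 的价值估计
print(unc)   # 估计之间的标准差：越大说明模型与价值函数越"自相矛盾"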
【22】 BA-Net: Bridge Attention for Deep Convolutional Neural Networks 标题:BA-Net:深度卷积神经网络的桥接注意力 链接:https://arxiv.org/abs/2112.04150
作者:Yue Zhao,Junzhou Chen,Zirui Zhang,Ronghui Zhang 摘要:近年来，通道注意机制因其在改善深度卷积神经网络(CNN)性能方面的巨大潜力而被广泛研究。然而，在大多数现有方法中，仅将相邻卷积层的输出馈送到注意层以计算通道权重，而忽略了来自其他卷积层的信息。基于这些观察，我们提出了一种简单的策略，称为桥接注意网(BA-Net)，以改善通道注意机制。该设计的主要思想是通过跳跃连接桥接先前卷积层的输出，以生成通道权重。BA-Net不仅可以在前馈时提供更丰富的特征来计算通道权重，而且可以在反向传播时提供多条参数更新路径。综合评估表明，与现有方法相比，该方法在准确性和速度方面达到了最先进的性能。桥接注意为神经网络结构的设计提供了一个新的视角，在改善现有通道注意机制的性能方面显示出巨大的潜力。代码见 https://github.com/zhaoy376/Attention-mechanism 。 摘要:In recent years, channel attention mechanism is widely investigated for its great potential in improving the performance of deep convolutional neural networks (CNNs). However, in most existing methods, only the output of the adjacent convolution layer is fed to the attention layer for calculating the channel weights. Information from other convolution layers is ignored. With these observations, a simple strategy, named Bridge Attention Net (BA-Net), is proposed for better channel attention mechanisms. The main idea of this design is to bridge the outputs of the previous convolution layers through skip connections for channel weights generation. BA-Net can not only provide richer features to calculate channel weight when feedforward, but also multiply paths of parameters updating when backforward. Comprehensive evaluation demonstrates that the proposed approach achieves state-of-the-art performance compared with the existing methods in regards to accuracy and speed. Bridge Attention provides a fresh perspective on the design of neural network architectures and shows great potential in improving the performance of the existing channel attention mechanisms. The code is available at https://github.com/zhaoy376/Attention-mechanism
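下面是按摘要思路写的一个 PyTorch 示意模块（并非论文官方实现，具体结构请以论文及上述代码仓库为准）：把若干更早卷积层的输出经全局平均池化后通过跳跃连接"桥接"起来，与当前层特征一起生成当前层的通道权重。

import torch
import torch.nn as nn

class BridgeAttention(nn.Module):
    """示意性的桥接注意力：用多个前置层的特征共同计算通道权重（假设性实现）。"""

    def __init__(self, bridge_channels, out_channels, reduction=4):
        super().__init__()
        total = sum(bridge_channels) + out_channels
        self.fc = nn.Sequential(
            nn.Linear(total, max(total // reduction, 4)),
            nn.ReLU(inplace=True),
            nn.Linear(max(total // reduction, 4), out_channels),
            nn.Sigmoid(),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)

    def forward(self, current, bridged_feats):
        # 对当前层输出与各桥接层输出分别做全局平均池化，并拼接成一个通道描述向量
        descriptors = [self.pool(f).flatten(1) for f in (*bridged_feats, current)]
        weights = self.fc(torch.cat(descriptors, dim=1))          # (B, C_out) 通道权重
        return current * weights.unsqueeze(-1).unsqueeze(-1)      # 按通道重新加权当前特征

# 用法示例：两个前置层（16、32 通道）桥接到当前 64 通道的特征图上
ba = BridgeAttention(bridge_channels=[16, 32], out_channels=64)
f1, f2 = torch.randn(2, 16, 56, 56), torch.randn(2, 32, 28, 28)
cur = torch.randn(2, 64, 14, 14)
print(ba(cur, [f1, f2]).shape)   # torch.Size([2, 64, 14, 14])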
【23】 A Review for Deep Reinforcement Learning in Atari:Benchmarks, Challenges, and Solutions 标题:Atari的深度强化学习综述:基准、挑战和解决方案 链接:https://arxiv.org/abs/2112.04145
作者:Jiajun Fan 备注:AAAI-22 Workshop on Reinforcement Learning in Games 摘要:Arcade Learning Environment(ALE)被提议作为一个评估平台，用于经验性地评估数十款Atari 2600游戏中代理的通用性。ALE提供了各种具有挑战性的问题，并引起了深度强化学习(RL)社区的极大关注。从深度Q网络(DQN)到Agent57，RL代理似乎已在ALE中实现了超人的性能。然而，是这样吗？在本文中，为了探讨这个问题，我们首先回顾了Atari基准中当前的评估指标，然后揭示了当前关于"实现超人性能"的评估标准并不恰当：相对于人类可能达到的水平，它低估了人类的表现。为了解决这些问题并促进RL研究的发展，我们提出了一种基于人类世界纪录(HWR)的新型Atari基准，该基准对RL代理的最终性能和学习效率提出了更高的要求。此外，我们总结了Atari基准上的最新技术(SOTA)方法，并给出了基于人类世界纪录的新评估指标下的基准结果。从这些新的基准结果来看，我们得出结论：至少有四个开放挑战阻碍RL代理实现超人性能。最后，我们还讨论了一些有希望的方法来处理这些问题。 摘要:The Arcade Learning Environment (ALE) is proposed as an evaluation platform for empirically assessing the generality of agents across dozens of Atari 2600 games. ALE offers various challenging problems and has drawn significant attention from the deep reinforcement learning (RL) community. From Deep Q-Networks (DQN) to Agent57, RL agents seem to achieve superhuman performance in ALE. However, is this the case? In this paper, to explore this problem, we first review the current evaluation metrics in the Atari benchmarks and then reveal that the current evaluation criteria of achieving superhuman performance are inappropriate, which underestimated the human performance relative to what is possible. To handle those problems and promote the development of RL research, we propose a novel Atari benchmark based on human world records (HWR), which puts forward higher requirements for RL agents on both final performance and learning efficiency. Furthermore, we summarize the state-of-the-art (SOTA) methods in Atari benchmarks and provide benchmark results over new evaluation metrics based on human world records. We concluded that at least four open challenges hinder RL agents from achieving superhuman performance from those new benchmark results. Finally, we also discuss some promising ways to handle those problems.
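作为参考，基于人类世界纪录(HWR)的归一化得分常见的计算方式如下；这只是与摘要思路一致的惯用形式（随机策略记 0、人类世界纪录记 1），论文实际使用的指标可能不同。

def hwr_normalized_score(agent_score, random_score, hwr_score):
    """以随机策略为 0、人类世界纪录为 1 的归一化得分（示意性定义）。"""
    return (agent_score - random_score) / (hwr_score - random_score)

# 假设性数字：随机策略 200 分、某智能体 9000 分、人类世界纪录 40000 分
print(hwr_normalized_score(9000, 200, 40000))   # 约 0.221，远未达到"超人"水平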
【24】 Improving Knowledge Graph Representation Learning by Structure Contextual Pre-training 标题:通过结构上下文预训练改进知识图表征学习 链接:https://arxiv.org/abs/2112.04087
作者:Ganqiang Ye,Wen Zhang,Zhen Bi,Chi Man Wong,Chen Hui,Huajun Chen 备注:Accepted to IJCKG 2021 摘要:知识图谱(KG)的表示学习模型已被证明能够有效地编码结构信息并在KG上进行推理。在本文中，我们提出了一种新的知识图谱表示学习的预训练-微调框架，该框架首先使用三元组分类任务对KG模型进行预训练，然后在实体类型预测和实体对齐等特定下游任务上进行判别式微调。基于在典型的预训练语言模型中学习深层语境化单词表示的一般思想，我们提出SCoP来学习预训练的KG表示，并对目标三元组的结构三元组和上下文三元组进行编码。实验结果表明，微调SCoP不仅在一系列下游任务上优于基线结果，而且还避免了繁琐的特定任务模型设计和参数训练。 摘要:Representation learning models for Knowledge Graphs (KG) have proven to be effective in encoding structural information and performing reasoning over KGs. In this paper, we propose a novel pre-training-then-fine-tuning framework for knowledge graph representation learning, in which a KG model is firstly pre-trained with triple classification task, followed by discriminative fine-tuning on specific downstream tasks such as entity type prediction and entity alignment. Drawing on the general ideas of learning deep contextualized word representations in typical pre-trained language models, we propose SCoP to learn pre-trained KG representations with structural and contextual triples of the target triple encoded. Experimental results demonstrate that fine-tuning SCoP not only outperforms results of baselines on a portfolio of downstream tasks but also avoids tedious task-specific model design and parameter training.
【25】 Hyper-parameter optimization based on soft actor critic and hierarchical mixture regularization 标题:基于软演员-评论家(Soft Actor-Critic)和分层混合正则化的超参数优化 链接:https://arxiv.org/abs/2112.04084
作者:Chaoyue Liu,Yulai Zhang 摘要:超参数优化是机器学习中的一个关键问题，因为它的目标是在任何模型中实现最先进的性能。在这一领域已经做出了很大的努力，如随机搜索、网格搜索、贝叶斯优化。在本文中，我们将超参数优化过程建模为一个马尔可夫决策过程，并用强化学习进行处理。提出了一种新的基于软演员-评论家(Soft Actor-Critic)和分层混合正则化的超参数优化方法。实验表明，该方法能在较短的时间内获得较好的超参数。 摘要:Hyper-parameter optimization is a crucial problem in machine learning as it aims to achieve the state-of-the-art performance in any model. Great efforts have been made in this field, such as random search, grid search, Bayesian optimization. In this paper, we model hyper-parameter optimization process as a Markov decision process, and tackle it with reinforcement learning. A novel hyper-parameter optimization method based on soft actor critic and hierarchical mixture regularization has been proposed. Experiments show that the proposed method can obtain better hyper-parameters in a shorter time.
【26】 Accelerating Understanding of Scientific Experiments with End to End Symbolic Regression 标题:用端到端符号回归促进对科学实验的理解 链接:https://arxiv.org/abs/2112.04023
作者:Nikos Arechiga,Francine Chen,Yan-Ying Chen,Yanxia Zhang,Rumen Iliev,Heishiro Toyoda,Kent Lyons 摘要:我们考虑从原始数据(例如任何科学领域的实验所产生的数据)中学习自由形式符号表达式的问题。科学现象的精确和可解释的模型是科学研究的基石。简单但可解释的模型，如线性或逻辑回归和决策树通常缺乏预测准确性。或者，精确的黑盒模型(如深度神经网络)提供了很高的预测精度，但难以以能够丰富该现象科学理论的方式为人类所理解。科学上的许多重大突破都是围绕着开发具有高预测精度的简约方程模型展开的，如牛顿定律、万有引力和麦克斯韦方程。以前关于从数据中自动搜索等式模型的工作结合了特定领域的启发式方法以及计算昂贵的技术，如遗传编程和蒙特卡罗搜索。我们开发了一个深度神经网络(MACSYMA)，将符号回归问题作为端到端的监督学习问题来处理。MACSYMA可以生成描述数据集的符号表达式。该任务的计算复杂度降低为神经网络的前馈计算。我们在一个合成数据集上训练神经网络，该数据集由不同长度和不同噪声水平的数据表组成，为此，神经网络必须学会逐个token地生成正确的符号表达式。最后，我们通过在行为科学的公共数据集上运行来验证我们的技术。 摘要:We consider the problem of learning free-form symbolic expressions from raw data, such as that produced by an experiment in any scientific domain. Accurate and interpretable models of scientific phenomena are the cornerstone of scientific research. Simple yet interpretable models, such as linear or logistic regression and decision trees often lack predictive accuracy. Alternatively, accurate blackbox models such as deep neural networks provide high predictive accuracy, but do not readily admit human understanding in a way that would enrich the scientific theory of the phenomenon. Many great breakthroughs in science revolve around the development of parsimonious equational models with high predictive accuracy, such as Newton's laws, universal gravitation, and Maxwell's equations. Previous work on automating the search of equational models from data combine domain-specific heuristics as well as computationally expensive techniques, such as genetic programming and Monte-Carlo search. We develop a deep neural network (MACSYMA) to address the symbolic regression problem as an end-to-end supervised learning problem. MACSYMA can generate symbolic expressions that describe a dataset. The computational complexity of the task is reduced to the feedforward computation of a neural network. We train our neural network on a synthetic dataset consisting of data tables of varying length and varying levels of noise, for which the neural network must learn to produce the correct symbolic expression token by token. Finally, we validate our technique by running on a public dataset from behavioral science.
【27】 Emotion-Cause Pair Extraction in Customer Reviews 标题:客户评论中情感原因对的提取 链接:https://arxiv.org/abs/2112.03984
作者:Arpit Mittal,Jeel Tejaskumar Vaishnav,Aishwarya Kaliki,Nathan Johns,Wyatt Pease 备注:7 Pages, 8 Figures 摘要:情感原因对提取(ECPE)是自然语言处理中一个复杂而热门的领域,因为它在各个领域都有着重要的意义和潜在的应用。在本报告中,我们的目标是介绍我们在在线评论领域的ECPE工作。利用人工标注的数据集,我们探索了一种使用神经网络提取情感-原因对的算法。此外,我们提出了一个模型,该模型使用了以前的参考资料,并将情绪原因对提取与情绪感知单词嵌入领域的研究相结合,将这些嵌入发送到Bi LSTM层,该层为我们提供情绪相关子句。在有限数据集的约束下,我们实现了。我们报告的总体范围包括全面的文献综述、数据集构建和初始模型训练参考方法的实施、通过建议改进管道来修改ECPE中以前的工作,以及特定审查领域的算法开发和实施。 摘要:Emotion-Cause Pair Extraction (ECPE) is a complex yet popular area in Natural Language Processing due to its importance and potential applications in various domains. In this report , we aim to present our work in ECPE in the domain of online reviews. With a manually annotated dataset, we explore an algorithm to extract emotion cause pairs using a neural network. In addition, we propose a model using previous reference materials and combining emotion-cause pair extraction with research in the domain of emotion-aware word embeddings, where we send these embeddings into a Bi-LSTM layer which gives us the emotionally relevant clauses. With the constraint of a limited dataset, we achieved . The overall scope of our report comprises of a comprehensive literature review, implementation of referenced methods for dataset construction and initial model training, and modifying previous work in ECPE by proposing an improvement to the pipeline, as well as algorithm development and implementation for the specific domain of reviews.
【28】 Synthetic Acute Hypotension and Sepsis Datasets Based on MIMIC-III and Published as Part of the Health Gym Project 标题:基于MIMIC-III的合成急性低血压和脓毒症数据集,并作为健康健身房项目的一部分出版 链接:https://arxiv.org/abs/2112.03914
作者:Nicholas I-Hsien Kuo,Mark Polizzotto,Simon Finfer,Louisa Jorm,Sebastiano Barbieri 摘要:这两个合成数据集包括重症监护病房(ICU)中3910名急性低血压患者和2164名败血症患者的生命体征、实验室检查结果、给药的液体和血管升压药。患者队列是使用先前公布的纳入和排除标准建立的,数据是使用生成性对抗网络(GANs)和MIMIC-III临床数据库创建的。与这些数据发布相关的身份披露风险估计非常低(0.045%)。这些数据集是作为Health Gym项目的一部分生成和发布的,该项目旨在公开发布合成纵向健康数据,用于开发机器学习算法(特别关注离线强化学习)和教育目的。 摘要:These two synthetic datasets comprise vital signs, laboratory test results, administered fluid boluses and vasopressors for 3,910 patients with acute hypotension and for 2,164 patients with sepsis in the Intensive Care Unit (ICU). The patient cohorts were built using previously published inclusion and exclusion criteria and the data were created using Generative Adversarial Networks (GANs) and the MIMIC-III Clinical Database. The risk of identity disclosure associated with the release of these data was estimated to be very low (0.045%). The datasets were generated and published as part of the Health Gym, a project aiming to publicly distribute synthetic longitudinal health data for developing machine learning algorithms (with a particular focus on offline reinforcement learning) and for educational purposes.
【29】 RID-Noise: Towards Robust Inverse Design under Noisy Environments 标题:RID-Noise:噪声环境下的鲁棒逆设计 链接:https://arxiv.org/abs/2112.03912
作者:Jia-Qi Yang,Ke-Bin Fan,Hao Ma,De-Chuan Zhan 备注:AAAI'22 摘要:从工程角度来看，设计不仅应在理想条件下运行良好，还应抵抗噪音。这种设计方法，即稳健设计，已广泛应用于工业产品质量控制。然而，传统稳健设计需要对单个设计目标进行大量评估，而这些评估的结果不能用于新的目标。为了实现数据高效的鲁棒设计，我们提出了噪声下的鲁棒逆设计(RID-Noise)，它可以利用现有的噪声数据来训练条件可逆神经网络(cINN)。具体来说，我们通过前向神经网络的预测误差来衡量设计参数的可预测性，从而估计其鲁棒性。我们还定义了样本权重，可用于基于cINN的逆模型的最大加权似然估计。通过实验的可视化结果，我们清楚地证明了RID-Noise是如何通过从数据中学习分布和鲁棒性来工作的。在多个有噪声的真实基准任务上的进一步实验证实，我们的方法比其他最先进的逆设计方法更有效。代码及补充材料见 https://github.com/ThyrixYang/rid-noise-aaai22 。 摘要:From an engineering perspective, a design should not only perform well in an ideal condition, but should also resist noises. Such a design methodology, namely robust design, has been widely implemented in the industry for product quality control. However, classic robust design requires a lot of evaluations for a single design target, while the results of these evaluations could not be reused for a new target. To achieve a data-efficient robust design, we propose Robust Inverse Design under Noise (RID-Noise), which can utilize existing noisy data to train a conditional invertible neural network (cINN). Specifically, we estimate the robustness of a design parameter by its predictability, measured by the prediction error of a forward neural network. We also define a sample-wise weight, which can be used in the maximum weighted likelihood estimation of an inverse model based on a cINN. With the visual results from experiments, we clearly justify how RID-Noise works by learning the distribution and robustness from data. Further experiments on several real-world benchmark tasks with noises confirm that our method is more effective than other state-of-the-art inverse design methods. Code and supplementary is publicly available at https://github.com/ThyrixYang/rid-noise-aaai22
机器翻译,仅供参考