Natural Language Processing Academic Digest [7.27]

2021-07-28 14:57:03

cs.CL: 31 papers in total today

Transformer (1 paper)

【1】 H-Transformer-1D: Fast One-Dimensional Hierarchical Attention for Sequences

Authors: Zhenhai Zhu, Radu Soricut
Affiliation: Google Research
Comments: ACL 2021 long paper, oral presentation
Link: https://arxiv.org/abs/2107.11906
Abstract: We describe an efficient hierarchical method to compute attention in the Transformer architecture. The proposed attention mechanism exploits a matrix structure similar to the Hierarchical Matrix (H-Matrix) developed by the numerical analysis community, and has linear run time and memory complexity. We perform extensive experiments to show that the inductive bias embodied by our hierarchical attention is effective in capturing the hierarchical structure in the sequences typical for natural language and vision tasks. Our method is superior to alternative sub-quadratic proposals by over 6 points on average on the Long Range Arena benchmark. It also sets a new SOTA test perplexity on the One-Billion Word dataset with 5x fewer model parameters than the previous-best Transformer-based models.

QA | VQA | Question Answering | Dialogue (2 papers)

【1】 One Question Answering Model for Many Languages with Cross-lingual Dense Passage Retrieval

Authors: Akari Asai, Xinyan Yu, Jungo Kasai, Hannaneh Hajishirzi
Affiliations: University of Washington, Allen Institute for AI
Comments: Our code and trained model are publicly available at this https URL
Link: https://arxiv.org/abs/2107.11976
Abstract: We present CORA, a Cross-lingual Open-Retrieval Answer Generation model that can answer questions across many languages even when language-specific annotated data or knowledge sources are unavailable. We introduce a new dense passage retrieval algorithm that is trained to retrieve documents across languages for a question. Combined with a multilingual autoregressive generation model, CORA answers directly in the target language without any translation or in-language retrieval modules as used in prior work. We propose an iterative training method that automatically extends annotated data available only in high-resource languages to low-resource ones. Our results show that CORA substantially outperforms the previous state of the art on multilingual open question answering benchmarks across 26 languages, 9 of which are unseen during training. Our analyses show the significance of cross-lingual retrieval and generation in many languages, particularly under low-resource settings.
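
The retrieval step described in the abstract above can be pictured with a minimal sketch of dense passage retrieval in general: questions and passages are mapped to vectors in a shared space, and passages are ranked by inner product regardless of their language. The embeddings, the `retrieve` helper, and the language tags below are hypothetical toy values chosen for illustration. This is not CORA's trained encoder or its actual API.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve(question_vec, passage_vecs, k=2):
    """Return the indices of the k passages with the highest inner-product score."""
    ranked = sorted(
        range(len(passage_vecs)),
        key=lambda i: dot(question_vec, passage_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy embeddings: in a real system these would come from a
# multilingual encoder; here they are hand-picked so the ranking is obvious.
passages = [
    ("en", [0.9, 0.1, 0.0]),
    ("ja", [0.8, 0.2, 0.1]),
    ("es", [0.0, 0.1, 0.9]),
]
question = [1.0, 0.0, 0.0]

top = retrieve(question, [vec for _, vec in passages], k=2)
top_languages = [passages[i][0] for i in top]
```

In this toy setup the two best-scoring passages are in different languages, which is the property a cross-lingual retriever is trained to exploit.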

【2】 Transferable Dialogue Systems and User Simulators

Authors: Bo-Hsiang Tseng, Yinpei Dai, Florian Kreyssig, Bill Byrne
Affiliations: Engineering Department, University of Cambridge, UK; Alibaba Group
Comments: Accepted by ACL-IJCNLP 2021
Link: https://arxiv.org/abs/2107.11904
Abstract: One of the difficulties in training dialogue systems is the lack of training data. We explore the possibility of creating dialogue data through the interaction between a dialogue system and a user simulator. Our goal is to develop a modelling framework that can incorporate new dialogue scenarios through self-play between the two agents. In this framework, we first pre-train the two agents on a collection of source domain dialogues, which equips the agents to converse with each other via natural language. With further fine-tuning on a small amount of target domain data, the agents continue to interact with the aim of improving their behaviors using reinforcement learning with structured reward functions. In experiments on the MultiWOZ dataset, two practical transfer learning problems are investigated: 1) domain adaptation and 2) single-to-multiple domain transfer. We demonstrate that the proposed framework is highly effective in bootstrapping the performance of the two agents in transfer learning. We also show that our method leads to improvements in dialogue system performance on complete datasets.

Machine Translation (3 papers)

【1】 Revisiting Negation in Neural Machine Translation

Authors: Gongbo Tang, Philipp Rönchen, Rico Sennrich, Joakim Nivre
Affiliations: Department of Linguistics and Philology, Uppsala University; Department of Computational Linguistics, University of Zurich; School of Informatics, University of Edinburgh
Comments: To appear at TACL and to be presented at ACL 2021. Authors' final version
Link: https://arxiv.org/abs/2107.12203
Abstract: In this paper, we evaluate the translation of negation both automatically and manually, in English-German (EN-DE) and English-Chinese (EN-ZH). We show that the ability of neural machine translation (NMT) models to translate negation has improved with deeper and more advanced networks, although the performance varies between language pairs and translation directions. The accuracy of manual evaluation in EN-DE, DE-EN, EN-ZH, and ZH-EN is 95.7%, 94.8%, 93.4%, and 91.7%, respectively. In addition, we show that under-translation is the most significant error type in NMT, which contrasts with the more diverse error profile previously observed for statistical machine translation. To better understand the root of the under-translation of negation, we study the model's information flow and training data. While our information flow analysis does not reveal any deficiencies that could be used to detect or fix the under-translation of negation, we find that negation is often rephrased during training, which could make it more difficult for the model to learn a reliable link between source and target negation. We finally conduct intrinsic analysis and extrinsic probing tasks on negation, showing that NMT models can distinguish negation and non-negation tokens very well and encode a lot of information about negation in hidden states but nevertheless leave room for improvement.

【2】 Extending Challenge Sets to Uncover Gender Bias in Machine Translation: Impact of Stereotypical Verbs and Adjectives

Authors: Jonas-Dario Troles, Ute Schmid
Comments: 16 pages, 4 figures
Link: https://arxiv.org/abs/2107.11584
Abstract: Human gender bias is reflected in language and text production. Because state-of-the-art machine translation (MT) systems are trained on large corpora of text, mostly generated by humans, gender bias can also be found in MT. For instance, when occupations are translated from a language like English, which mostly uses gender-neutral words, to a language like German, which mostly uses a feminine and a masculine version for an occupation, a decision must be made by the MT system. Recent research showed that MT systems are biased towards stereotypical translation of occupations. In 2019, the first, and so far only, challenge set explicitly designed to measure the extent of gender bias in MT systems was published. In this set, measurement of gender bias is solely based on the translation of occupations. In this paper we present an extension of this challenge set, called WiBeMT, which adds gender-biased adjectives and sentences with gender-biased verbs. The resulting challenge set consists of over 70,000 sentences and has been translated with three commercial MT systems: DeepL Translator, Microsoft Translator, and Google Translate. Results show a gender bias for all three MT systems. This gender bias is to a great extent significantly influenced by adjectives and to a lesser extent by verbs.

【3】 The USYD-JD Speech Translation System for IWSLT 2021

Authors: Liang Ding, Di Wu, Dacheng Tao
Affiliations: The University of Sydney, Peking University
Comments: IWSLT 2021 winning system of the low-resource speech translation track
Link: https://arxiv.org/abs/2107.11572
Abstract: This paper describes the University of Sydney & JD's joint submission to the IWSLT 2021 low-resource speech translation task. We participated in the Swahili-English direction and got the best sacreBLEU (25.3) score among all the participants. Our constrained system is based on a pipeline framework, i.e. ASR and NMT. We trained our models with the officially provided ASR and MT datasets. The ASR system is based on the open-sourced tool Kaldi and this work mainly explores how to make the most of the NMT models. To reduce the punctuation errors generated by the ASR model, we employ our previous work SlotRefine to train a punctuation correction model. To achieve better translation performance, we explored the most recent effective strategies, including back translation, knowledge distillation, multi-feature reranking and transductive finetuning. For model structure, we tried auto-regressive and non-autoregressive models, respectively. In addition, we proposed two novel pre-training approaches, i.e. de-noising training and bidirectional training, to fully exploit the data. Extensive experiments show that adding the above techniques consistently improves the BLEU scores, and the final submission system outperforms the baseline (Transformer ensemble model trained with the original parallel data) by approximately 10.8 BLEU points, achieving the SOTA performance.

Graph | Knowledge Graph | Knowledge (4 papers)

【1】 How Knowledge Graph and Attention Help? A Quantitative Analysis into Bag-level Relation Extraction

Authors: Zikun Hu, Yixin Cao, Lifu Huang, Tat-Seng Chua
Affiliations: National University of Singapore; S-Lab, Nanyang Technological University; Computer Science Department, Virginia Tech
Link: https://arxiv.org/abs/2107.12064
Abstract: Knowledge graphs (KGs) and attention mechanisms have been shown to be effective in introducing and selecting useful information for weakly supervised methods. However, only qualitative analyses and ablation studies are provided as evidence. In this paper, we contribute a dataset and propose a paradigm to quantitatively evaluate the effect of attention and KG on bag-level relation extraction (RE). We find that (1) higher attention accuracy may lead to worse performance as it may harm the model's ability to extract entity mention features; (2) the performance of attention is largely influenced by various noise distribution patterns, which are closely related to real-world datasets; (3) KG-enhanced attention indeed improves RE performance, not through enhanced attention but by incorporating entity priors; and (4) the attention mechanism may exacerbate the issue of insufficient training data. Based on these findings, we show that a straightforward variant of the RE model can achieve significant improvements (6% AUC on average) on two real-world datasets as compared with three state-of-the-art baselines. Our code and datasets are available at https://github.com/zig-kwin-hu/how-KG-ATT-help.

【2】 Graph-free Multi-hop Reading Comprehension: A Select-to-Guide Strategy

Authors: Bohong Wu, Zhuosheng Zhang, Hai Zhao
Affiliations: Department of Computer Science and Engineering, Shanghai Jiao Tong University; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, AI Institute, Shanghai Jiao Tong University
Link: https://arxiv.org/abs/2107.11823
Abstract: Multi-hop reading comprehension (MHRC) requires not only predicting the correct answer span in the given passage, but also providing a chain of supporting evidence for reasoning interpretability. It is natural to model such a process as a graph structure by understanding multi-hop reasoning as jumping over entity nodes, which has made graph modelling dominant on this task. Recently, there have been dissenting voices about whether graph modelling is indispensable, given the inconvenience of graph building; however, existing state-of-the-art graph-free attempts suffer from a huge performance gap compared to graph-based ones. This work presents a novel graph-free alternative which is the first to outperform all graph models on MHRC. In detail, we exploit a select-to-guide (S2G) strategy to accurately retrieve evidence paragraphs in a coarse-to-fine manner, incorporated with two novel attention mechanisms, which surprisingly conforms to the nature of multi-hop reasoning. Our graph-free model achieves significant and consistent performance gains over strong baselines and the current new state-of-the-art on the MHRC benchmark, HotpotQA, among all the published works.

【3】 Graph Convolutional Network with Generalized Factorized Bilinear Aggregation

Authors: Hao Zhu, Piotr Koniusz
Affiliations: Australian National University and Data, CSIRO, Canberra, Australia
Link: https://arxiv.org/abs/2107.11666
Abstract: Although Graph Convolutional Networks (GCNs) have demonstrated their power in various applications, the graph convolutional layers, as the most important component of GCN, still use linear transformations and a simple pooling step. In this paper, we propose a novel generalization of the Factorized Bilinear (FB) layer to model the feature interactions in GCNs. FB performs two matrix-vector multiplications, that is, the weight matrix is multiplied with the outer product of the vector of hidden features from both sides. However, the FB layer suffers from the quadratic number of coefficients, overfitting and the spurious correlations due to correlations between channels of hidden representations that violate the i.i.d. assumption. Thus, we propose a compact FB layer by defining a family of summarizing operators applied over the quadratic term. We analyze the proposed pooling operators and motivate their use. Our experimental results on multiple datasets demonstrate that the GFB-GCN is competitive with other methods for text classification.
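
The "quadratic number of coefficients" mentioned in the abstract above can be made concrete with a generic bilinear form. The notation ($x$, $W$, $F$, $n$, $k$) is assumed here for illustration and is not taken from the paper. The low-rank factorization shown is the standard one and may differ from the authors' compact layer.

```latex
% Generic bilinear term over a hidden feature vector x \in \mathbb{R}^n (assumed notation).
\[
  x^\top W x \;=\; \sum_{i=1}^{n}\sum_{j=1}^{n} W_{ij}\, x_i x_j
  \;=\; \langle W,\; x x^\top \rangle ,
  \qquad W \in \mathbb{R}^{n \times n} \quad (n^2 \text{ coefficients}).
\]
% A rank-k factorization W = F^\top F reduces the parameter count from n^2 to kn.
\[
  W = F^\top F,\; F \in \mathbb{R}^{k \times n}
  \;\;\Longrightarrow\;\;
  x^\top W x = (Fx)^\top (Fx) = \lVert F x \rVert_2^2 .
\]
```

Under this reading, the full bilinear term touches every pair of channels $(i, j)$, which is where both the parameter growth and the sensitivity to correlated channels come from.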

【4】 Differentiable Allophone Graphs for Language-Universal Speech Recognition

Authors: Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe
Affiliation: Language Technologies Institute, Carnegie Mellon University, USA
Comments: INTERSPEECH 2021. Contains additional studies on phone recognition for unseen languages
Link: https://arxiv.org/abs/2107.11628
Abstract: Building language-universal speech recognition systems entails producing phonological units of spoken sound that can be shared across languages. While speech annotations at the language-specific phoneme or surface levels are readily available, annotations at a universal phone level are relatively rare and difficult to produce. In this work, we present a general framework to derive phone-level supervision from only phonemic transcriptions and phone-to-phoneme mappings with learnable weights represented using weighted finite-state transducers, which we call differentiable allophone graphs. By training multilingually, we build a universal phone-based speech recognition model with interpretable probabilistic phone-to-phoneme mappings for each language. These phone-based systems with learned allophone graphs can be used by linguists to document new languages, build phone-based lexicons that capture rich pronunciation variations, and re-evaluate the allophone mappings of seen languages. We demonstrate the aforementioned benefits of our proposed framework with a system trained on 7 diverse languages.

Reasoning | Analysis | Understanding | Explanation (3 papers)

【1】 Hybrid Autoregressive Solver for Scalable Abductive Natural Language Inference

Authors: Marco Valentino, Mokanarangan Thayaparan, Deborah Ferreira, André Freitas
Affiliations: Department of Computer Science, University of Manchester, United Kingdom; Idiap Research Institute, Switzerland
Link: https://arxiv.org/abs/2107.11879
Abstract: Regenerating natural language explanations for science questions is a challenging task for evaluating complex multi-hop and abductive inference capabilities. In this setting, Transformers trained on human-annotated explanations achieve state-of-the-art performance when adopted as cross-encoder architectures. However, while much attention has been devoted to the quality of the constructed explanations, the problem of performing abductive inference at scale is still under-studied. As intrinsically not scalable, the cross-encoder architectural paradigm is not suitable for efficient multi-hop inference on massive fact banks. To maximise both accuracy and inference time, we propose a hybrid abductive solver that autoregressively combines a dense bi-encoder with a sparse model of explanatory power, computed leveraging explicit patterns in the explanations. Our experiments demonstrate that the proposed framework can achieve performance comparable with the state-of-the-art cross-encoder while being approximately 50 times faster and scalable to corpora of millions of facts. Moreover, we study the impact of the hybridisation on semantic drift and science question answering without additional training, showing that it boosts the quality of the explanations and contributes to improved downstream inference performance.

【2】 A Joint and Domain-Adaptive Approach to Spoken Language Understanding

Authors: Linhao Zhang, Yu Shi, Linjun Shou, Ming Gong, Houfeng Wang, Michael Zeng
Affiliations: MOE Key Lab of Computational Linguistics, Peking University; Microsoft
Link: https://arxiv.org/abs/2107.11768
Abstract: Spoken Language Understanding (SLU) is composed of two subtasks: intent detection (ID) and slot filling (SF). There are two lines of research on SLU. One jointly tackles these two subtasks to improve their prediction accuracy, and the other focuses on the domain-adaptation ability of one of the subtasks. In this paper, we attempt to bridge these two lines of research and propose a joint and domain-adaptive approach to SLU. We formulate SLU as a constrained generation task and utilize a dynamic vocabulary based on domain-specific ontology. We conduct experiments on the ASMixed and MTOD datasets and achieve competitive performance with previous state-of-the-art joint models. In addition, results show that our joint model can be effectively adapted to a new domain.

【3】 MuSe-Toolbox: The Multimodal Sentiment Analysis Continuous Annotation Fusion and Discrete Class Transformation Toolbox

Authors: Lukas Stappen, Lea Schumann, Benjamin Sertolli, Alice Baird, Benjamin Weigel, Erik Cambria, Björn W. Schuller
Affiliations: University of Augsburg, Augsburg, Germany; Nanyang Technological University, Singapore; Imperial College London, London, United Kingdom
Comments: (1) this https URL (2) docker pull musetoolbox/musetoolbox
Link: https://arxiv.org/abs/2107.11757
Abstract: We introduce the MuSe-Toolbox, a Python-based open-source toolkit for creating a variety of continuous and discrete emotion gold standards. In a single framework, we unify a wide range of fusion methods and propose the novel Rater Aligned Annotation Weighting (RAAW), which aligns the annotations in a translation-invariant way before weighting and fusing them based on the inter-rater agreements between the annotations. Furthermore, discrete categories tend to be easier for humans to interpret than continuous signals. With this in mind, the MuSe-Toolbox provides the functionality to run exhaustive searches for meaningful class clusters in the continuous gold standards. To our knowledge, this is the first toolkit that provides a wide selection of state-of-the-art emotional gold standard methods and their transformation to discrete classes. Experimental results indicate that MuSe-Toolbox can provide promising and novel class formations which can be better predicted than hard-coded class boundaries with minimal human intervention. The implementation (1) is available out of the box with all dependencies using a Docker container (2).

GAN | Adversarial | Attack | Generation (4 papers)

【1】 Meta-Learning Adversarial Domain Adaptation Network for Few-Shot Text Classification

Authors: ChengCheng Han, Zeqiu Fan, Dongxiang Zhang, Minghui Qiu, Ming Gao, Aoying Zhou
Affiliations: School of Data Science and Engineering, East China Normal University; College of Computer Science and Technology, Zhejiang University; Alibaba Group
Link: https://arxiv.org/abs/2107.12262
Abstract: Meta-learning has emerged as a trending technique to tackle few-shot text classification and has achieved state-of-the-art performance. However, existing solutions heavily rely on the exploitation of lexical features and their distributional signatures on training data, while neglecting to strengthen the model's ability to adapt to new tasks. In this paper, we propose a novel meta-learning framework integrated with an adversarial domain adaptation network, aiming to improve the adaptive ability of the model and generate high-quality text embeddings for new classes. Extensive experiments are conducted on four benchmark datasets and our method demonstrates clear superiority over the state-of-the-art models in all the datasets. In particular, the accuracy of 1-shot and 5-shot classification on the 20 Newsgroups dataset is boosted from 52.1% to 59.6%, and from 68.3% to 77.8%, respectively.

【2】 Towards Controlled and Diverse Generation of Article Comments

Authors: Linhao Zhang, Houfeng Wang
Affiliations: MOE Key Lab of Computational Linguistics, Peking University, Beijing, China; Baidu Inc., China
Link: https://arxiv.org/abs/2107.11781
Abstract: Much research in recent years has focused on automatic article commenting. However, few previous studies focus on the controllable generation of comments. Besides, they tend to generate dull and commonplace comments, which further limits their practical application. In this paper, we take the first step towards controllable generation of comments by building a system that can explicitly control the emotion of the generated comments. To achieve this, we associate each emotion category with an embedding and adopt a dynamic fusion mechanism to fuse this embedding into the decoder. A sentence-level emotion classifier is further employed to better guide the model to generate comments expressing the desired emotion. To increase the diversity of the generated comments, we propose a hierarchical copy mechanism that allows our model to directly copy words from the input articles. We also propose a restricted beam search (RBS) algorithm to increase intra-sentence diversity. Experimental results show that our model can generate informative and diverse comments that express the desired emotions with high accuracy.

【3】 Context-aware Adversarial Training for Name Regularity Bias in Named Entity Recognition

Authors: Abbas Ghaddar, Philippe Langlais, Ahmad Rashid, Mehdi Rezagholizadeh
Affiliations: Huawei Noah's Ark Lab, Montreal Research Center, Canada; RALI/DIRO, Université de Montréal, Canada
Link: https://arxiv.org/abs/2107.11610
Abstract: In this work, we examine the ability of NER models to use contextual information when predicting the type of an ambiguous entity. We introduce NRB, a new testbed carefully designed to diagnose Name Regularity Bias of NER models. Our results indicate that all state-of-the-art models we tested show such a bias; BERT fine-tuned models significantly outperform feature-based (LSTM-CRF) ones on NRB, despite having comparable (sometimes lower) performance on standard benchmarks. To mitigate this bias, we propose a novel model-agnostic training method that adds learnable adversarial noise to some entity mentions, thus forcing models to focus more strongly on the contextual signal, leading to significant gains on NRB. Combining it with two other training strategies, data augmentation and parameter freezing, leads to further gains.

【4】 Similarity Based Label Smoothing For Dialogue Generation

Authors: Sougata Saha, Souvik Das, Rohini Srihari
Affiliation: Department of Computer Science and Engineering, University at Buffalo, New York
Link: https://arxiv.org/abs/2107.11481
Abstract: Generative neural conversational systems are generally trained with the objective of minimizing the entropy loss between the training "hard" targets and the predicted logits. Often, performance gains and improved generalization can be achieved by using regularization techniques like label smoothing, which converts the training "hard" targets to "soft" targets. However, label smoothing enforces a data-independent uniform distribution on the incorrect training targets, which leads to an incorrect assumption of equi-probable incorrect targets for each correct target. In this paper we propose and experiment with incorporating data-dependent, word-similarity-based weighting methods to transform the uniform distribution of the incorrect target probabilities in label smoothing into a more natural distribution based on semantics. We introduce hyperparameters to control the incorrect target distribution, and report significant performance gains over networks trained using a standard label-smoothing-based loss, on two standard open-domain dialogue corpora.
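
The contrast drawn in the abstract above, uniform versus similarity-weighted mass on the incorrect targets, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the toy similarity scores, and the single hyperparameter `eps` are hypothetical, and the paper's actual weighting scheme and hyperparameters are not reproduced here.

```python
def uniform_label_smoothing(num_classes, target, eps=0.1):
    """Standard variant assumed here: the target keeps 1 - eps and the
    remaining eps is shared equally among the other classes."""
    off_value = eps / (num_classes - 1)
    return [1.0 - eps if i == target else off_value for i in range(num_classes)]


def similarity_label_smoothing(similarities, target, eps=0.1):
    """Share eps among incorrect classes in proportion to their
    (non-negative) similarity to the target word."""
    weights = [0.0 if i == target else max(s, 0.0) for i, s in enumerate(similarities)]
    total = sum(weights)
    if total == 0.0:
        # No similarity information: fall back to the uniform scheme.
        return uniform_label_smoothing(len(similarities), target, eps)
    return [1.0 - eps if i == target else eps * w / total for i, w in enumerate(weights)]


# Hypothetical 4-word vocabulary; similarities[i] is the similarity of word i
# to the target word (index 0), e.g. from cosine similarity of embeddings.
similarities = [1.0, 0.6, 0.3, 0.1]
uniform = uniform_label_smoothing(4, target=0)
soft = similarity_label_smoothing(similarities, target=0)
```

Both vectors sum to one and keep the same mass on the correct word. The only difference is that `soft` gives semantically closer words a larger share of the smoothing mass.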

Recognition / Classification (4 papers)

【1】 DYPLODOC: Dynamic Plots for Document Classification

Authors: Anastasia Malysheva, Alexey Tikhonov, Ivan P. Yamshchikov
Affiliations: Open Data Science, Moscow, Russia, Berlin, Germany, LEYA Lab, Yandex, Higher School of Economics, St. Petersburg, Russia
Link: https://arxiv.org/abs/2107.12226
Abstract: Narrative generation and analysis are still on the fringe of modern natural language processing yet are crucial in a variety of applications. This paper proposes a feature extraction method for plot dynamics. We present a dataset that consists of the plot descriptions for thirteen thousand TV shows alongside meta-information on their genres and dynamic plots extracted from them. We validate the proposed tool for plot dynamics extraction and discuss possible applications of this method to the tasks of narrative analysis and generation.

【2】 Preliminary Steps Towards Federated Sentiment Classification

Authors: Xin-Chun Li, De-Chuan Zhan, Yunfeng Shao, Bingshuai Li, Shaoming Song
Affiliations: State Key Laboratory for Novel Software Technology, Nanjing University; Huawei Noah's Ark Lab
Link: https://arxiv.org/abs/2107.11956
Abstract: Automatically mining the sentiment tendency contained in natural language is a fundamental research topic for some artificial intelligence applications, where solutions alternate with challenges. Transfer learning and multi-task learning techniques have been leveraged to mitigate the supervision sparsity and to coordinate multiple heterogeneous domains, respectively. In recent years, the sensitive nature of users' private data has raised another challenge for sentiment classification, i.e., data privacy protection. In this paper, we resort to federated learning for multiple-domain sentiment classification under the constraint that the corpora must be stored on decentralized devices. In view of the heterogeneous semantics across multiple parties and the peculiarities of word embedding, we provide corresponding targeted solutions. First, we propose a Knowledge Transfer Enhanced Private-Shared (KTEPS) framework for better model aggregation and personalization in federated sentiment classification. Second, we propose KTEPS$^\star$, which takes into account the rich semantics and huge embedding size of word vectors and utilizes Projection-based Dimension Reduction (PDR) methods for privacy protection and efficient transmission simultaneously. We propose two federated sentiment classification scenes based on public benchmarks, and verify the superiority of our proposed methods with extensive experimental investigations.

【3】 Negation Handling in Machine Learning-Based Sentiment Classification for Colloquial Arabic

Authors: Omar Al-Harbi Affiliations: Jazan University, Saudi Arabia Link: https://arxiv.org/abs/2107.11597 Abstract: One crucial aspect of sentiment analysis is negation handling, where the occurrence of negation can flip the sentiment of a sentence and negatively affect machine learning-based sentiment classification. The role of negation in Arabic sentiment analysis has been explored only to a limited extent, especially for colloquial Arabic. In this paper, the author addresses the negation problem of machine learning-based sentiment classification for colloquial Arabic. To this end, we propose a simple rule-based algorithm for handling the problem; the rules were crafted based on observing many cases of negation. Additionally, simple linguistic knowledge and a sentiment lexicon are used for this purpose. The author also examines the impact of the proposed algorithm on the performance of different machine learning algorithms. The results given by the proposed algorithm are compared with three baseline models. The experimental results show that there is a positive impact on the classifiers' accuracy, precision and recall when the proposed algorithm is used compared to the baselines.

【4】 Brazilian Portuguese Speech Recognition Using Wav2vec 2.0

Authors: Lucas Rafael Stefanel Gris, Edresson Casanova, Frederico Santos de Oliveira, Anderson da Silva Soares, Arnaldo Candido Junior Affiliations: Federal University of Technology - Paraná, Medianeira, Brazil, University of São Paulo, São Carlos, Brazil, Federal University of Goiás, Goiânia, Brazil Link: https://arxiv.org/abs/2107.11414 Abstract: Deep learning techniques have been shown to be efficient in various tasks, especially in the development of speech recognition systems, that is, systems that aim to transcribe a sentence in audio into a sequence of words. Despite the progress in the area, speech recognition can still be considered difficult, especially for languages lacking available data, such as Brazilian Portuguese. In this sense, this work presents the development of a public Automatic Speech Recognition system using only openly available audio data, obtained by fine-tuning the Wav2vec 2.0 XLSR-53 model, pre-trained on many languages, on Brazilian Portuguese data. The final model presents a Word Error Rate of 11.95% (Common Voice Dataset). This corresponds to 13% less than the best open Automatic Speech Recognition model for Brazilian Portuguese available to the best of our knowledge, which is a promising result for the language. In general, this work validates the use of self-supervised learning techniques, in particular the use of the Wav2vec 2.0 architecture, in the development of robust systems, even for languages with little available data.
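
The abstract above reports a Word Error Rate (WER). As a reminder of how this metric is commonly defined (word-level edit distance divided by the number of reference words), here is a minimal Python sketch. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; this is not the authors' evaluation script, and real evaluations usually normalize text first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.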

Word2Vec | Text | Words (1 paper)

【1】 Stress Test Evaluation of Biomedical Word Embeddings

Authors: Vladimir Araujo, Andrés Carvallo, Carlos Aspillaga, Camilo Thorne, Denis Parra Affiliations: Pontificia Universidad Católica de Chile, Millennium Institute for Foundational Research on Data (IMFD), Elsevier Comments: Accepted paper, BioNLP 2021 Link: https://arxiv.org/abs/2107.11652 Abstract: The success of pretrained word embeddings has motivated their use in the biomedical domain, with contextualized embeddings yielding remarkable results in several biomedical NLP tasks. However, there is a lack of research on quantifying their behavior under severe "stress" scenarios. In this work, we systematically evaluate three language models with adversarial examples -- automatically constructed tests that allow us to examine how robust the models are. We propose two types of stress scenarios focused on the biomedical named entity recognition (NER) task, one inspired by spelling errors and another based on the use of synonyms for medical terms. Our experiments with three benchmarks show that the performance of the original models decreases considerably, in addition to revealing their weaknesses and strengths. Finally, we show that adversarial training causes the models to improve their robustness and even to exceed the original performance in some cases.
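
To make the "spelling error" stress scenario more concrete, the sketch below shows one simple way such a test set could be built for NER: swap two adjacent inner characters of some tokens while leaving the labels untouched. This is only an illustrative assumption, not the perturbation procedure used in the paper; the names `swap_adjacent_chars` and `perturb_tokens`, the `rate` parameter, and the example labels are all hypothetical.

```python
import random


def swap_adjacent_chars(word: str, rng: random.Random) -> str:
    """Swap one random pair of adjacent inner characters (first and last stay)."""
    if len(word) < 4:
        return word  # too short to perturb without touching the edges
    i = rng.randrange(1, len(word) - 2)  # i and i + 1 are both inner positions
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def perturb_tokens(tokens, labels, rate=0.5, seed=0):
    """Return a noisy copy of the tokens; NER labels are kept aligned and unchanged."""
    rng = random.Random(seed)  # fixed seed so the stress set is reproducible
    noisy = [swap_adjacent_chars(t, rng) if rng.random() < rate else t
             for t in tokens]
    return noisy, list(labels)


tokens = ["aspirin", "reduces", "fever"]
labels = ["B-Chemical", "O", "B-Disease"]  # hypothetical tag set
noisy_tokens, noisy_labels = perturb_tokens(tokens, labels, rate=1.0)
```

A model's score on `noisy_tokens` can then be compared with its score on the clean tokens to estimate the drop under this kind of noise.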

Other Neural Networks | Deep Learning | Models | Modeling (4 papers)

【1】 Thought Flow Nets: From Single Predictions to Trains of Model Thought

Authors: Hendrik Schuff, Heike Adel, Ngoc Thang Vu Affiliations: Bosch Center for Artificial Intelligence, Renningen, Germany, Institut für Maschinelle Sprachverarbeitung, University of Stuttgart Link: https://arxiv.org/abs/2107.12220 Abstract: When humans solve complex problems, they rarely come up with a decision right away. Instead, they start with an intuitive decision, reflect upon it, spot mistakes, resolve contradictions and jump between different hypotheses. Thus, they create a sequence of ideas and follow a train of thought that ultimately reaches a conclusive decision. Contrary to this, today's neural classification models are mostly trained to map an input to one single and fixed output. In this paper, we investigate how we can give models the opportunity of a second, third and $k$-th thought. We take inspiration from Hegel's dialectics and propose a method that turns an existing classifier's class prediction (such as the image class forest) into a sequence of predictions (such as forest $\rightarrow$ tree $\rightarrow$ mushroom). Concretely, we propose a correction module that is trained to estimate the model's correctness as well as an iterative prediction update based on the prediction's gradient. Our approach results in a dynamic system over class probability distributions, the thought flow. We evaluate our method on diverse datasets and tasks from computer vision and natural language processing. We observe surprisingly complex but intuitive behavior and demonstrate that our method (i) can correct misclassifications, (ii) strengthens model performance, (iii) is robust to high levels of adversarial attacks, (iv) can increase accuracy up to 4% in a label-distribution-shift setting and (v) provides a tool for model interpretability that uncovers model knowledge which otherwise remains invisible in a single distribution prediction.

【2】 Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction

Authors: Lu Xu, Yew Ken Chia, Lidong Bing Affiliations: Singapore University of Technology and Design, DAMO Academy, Alibaba Group Comments: ACL 2021, long paper, main conference Link: https://arxiv.org/abs/2107.12214 Abstract: Aspect Sentiment Triplet Extraction (ASTE) is the most recent subtask of ABSA, which outputs triplets of an aspect target, its associated sentiment, and the corresponding opinion term. Recent models perform the triplet extraction in an end-to-end manner but heavily rely on the interactions between each target word and opinion word. Therefore, they cannot perform well on targets and opinions which contain multiple words. Our proposed span-level approach explicitly considers the interaction between the whole spans of targets and opinions when predicting their sentiment relation. Thus, it can make predictions with the semantics of whole spans, ensuring better sentiment consistency. To ease the high computational cost caused by span enumeration, we propose a dual-channel span pruning strategy by incorporating supervision from the Aspect Term Extraction (ATE) and Opinion Term Extraction (OTE) tasks. This strategy not only improves computational efficiency but also distinguishes the opinion and target spans more properly. Our framework simultaneously achieves strong performance for the ASTE as well as ATE and OTE tasks. In particular, our analysis shows that our span-level approach achieves more significant improvements over the baselines on triplets with multi-word targets or opinions.

【3】 Exploiting Language Model for Efficient Linguistic Steganalysis: An Empirical Study

Authors: Biao Yi, Hanzhou Wu, Guorui Feng, Xinpeng Zhang Affiliations: Shanghai University, Shanghai, China Comments: 5 pages Link: https://arxiv.org/abs/2107.12168 Abstract: Recent advances in linguistic steganalysis have successively applied CNNs, RNNs, GNNs and other deep learning models for detecting secret information in generative texts. These methods tend to seek stronger feature extractors to achieve higher steganalysis effects. However, we have found through experiments that there actually exists a significant difference between automatically generated steganographic texts and carrier texts in terms of the conditional probability distribution of individual words. This kind of statistical difference can be naturally captured by the language model used for generating steganographic texts, which drives us to give the classifier a priori knowledge of the language model to enhance the steganalysis ability. To this end, we present two methods for efficient linguistic steganalysis in this paper. One is to pre-train a language model based on RNN, and the other is to pre-train a sequence autoencoder. Experimental results show that the two methods have different degrees of performance improvement when compared to the randomly initialized RNN classifier, and the convergence speed is significantly accelerated. Moreover, our methods have achieved the best detection results.

【4】 Learn to Focus: Hierarchical Dynamic Copy Network for Dialogue State Tracking

Authors: Linhao Zhang, Houfeng Wang Affiliations: MOE Key Lab of Computational Linguistics, Peking University, Beijing, China Link: https://arxiv.org/abs/2107.11778 Abstract: Recently, researchers have explored using the encoder-decoder framework to tackle dialogue state tracking (DST), which is a key component of task-oriented dialogue systems. However, they regard a multi-turn dialogue as a flat sequence, failing to focus on useful information when the sequence is long. In this paper, we propose a Hierarchical Dynamic Copy Network (HDCN) to facilitate focusing on the most informative turn, making it easier to extract slot values from the dialogue context. Based on the encoder-decoder framework, we adopt a hierarchical copy approach that calculates two levels of attention at the word- and turn-level, which are then renormalized to obtain the final copy distribution. A focus loss term is employed to encourage the model to assign the highest turn-level attention weight to the most informative turn. Experimental results show that our model achieves 46.76% joint accuracy on the MultiWOZ 2.1 dataset.
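
The abstract above describes combining word-level and turn-level attention and renormalizing the result into a single copy distribution. The sketch below illustrates one plausible reading of that idea, assuming each word's copy probability is its within-turn attention weight multiplied by its turn's attention weight. The function names, the use of softmax over raw scores, and the multiplicative combination are assumptions for illustration and may differ from the HDCN formulation in the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_copy_distribution(word_scores, turn_scores):
    """Combine per-turn word attention with turn attention.

    word_scores: one list of raw word scores per dialogue turn.
    turn_scores: one raw score per dialogue turn.
    Returns per-turn lists of copy probabilities that sum to 1 overall.
    """
    turn_attn = softmax(turn_scores)
    combined = []
    for turn_weight, scores in zip(turn_attn, word_scores):
        word_attn = softmax(scores)  # attention over the words of this turn
        combined.append([turn_weight * w for w in word_attn])
    # Explicit renormalization over all words in the dialogue context.
    total = sum(sum(row) for row in combined)
    return [[p / total for p in row] for row in combined]


# Two turns (2 and 3 words); the second turn gets a higher turn-level score.
dist = hierarchical_copy_distribution([[1.0, 2.0], [0.5, 0.5, 0.5]], [0.0, 3.0])
```

With this combination, a turn that receives little turn-level attention contributes little copy probability no matter how sharply its word-level attention is peaked.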

Other (5 papers)

【1】 Fine-Grained Emotion Prediction by Modeling Emotion Definitions

Authors: Gargi Singh, Dhanajit Brahma, Piyush Rai, Ashutosh Modi Affiliations: CSE Department, Indian Institute of Technology Kanpur (IIT-K), Kanpur, India Comments: 8 pages, accepted at ACII 2021 for Orals Link: https://arxiv.org/abs/2107.12135 Abstract: In this paper, we propose a new framework for fine-grained emotion prediction in text through emotion definition modeling. Our approach involves a multi-task learning framework that models definitions of emotions as an auxiliary task while being trained on the primary task of emotion prediction. We model definitions using masked language modeling and class definition prediction tasks. Our models outperform the existing state of the art on the fine-grained emotion dataset GoEmotions. We further show that this trained model can be used for transfer learning on other benchmark datasets in emotion prediction with varying emotion label sets, domains, and sizes. The proposed models outperform the baselines on transfer learning experiments, demonstrating the generalization capability of the models.

【2】 Multilingual Coreference Resolution with Harmonized Annotations

Authors: Ondřej Pražák, Miloslav Konopík, Jakub Sido Affiliations: NTIS – New Technologies for the Information Society, Department of Computer Science and Engineering, University of West Bohemia, Technická, Plzeň, Czech Republic Link: https://arxiv.org/abs/2107.12088 Abstract: In this paper, we present coreference resolution experiments with a newly created multilingual corpus CorefUD. We focus on the following languages: Czech, Russian, Polish, German, Spanish, and Catalan. In addition to monolingual experiments, we combine the training data in multilingual experiments and train two joined models -- for Slavic languages and for all the languages together. We rely on an end-to-end deep learning model that we slightly adapted for the CorefUD corpus. Our results show that we can profit from harmonized annotations, and using joined models helps significantly for the languages with smaller training data.

【3】 An Argumentative Dialogue System for COVID-19 Vaccine Information

Authors: Bettina Fazzinga, Andrea Galassi, Paolo Torroni Affiliations: ICAR CNR, Rende, Italy, DISI, University of Bologna, Bologna, Italy Comments: 20 pages, 2 figures, currently under submission Link: https://arxiv.org/abs/2107.12079 Abstract: Dialogue systems are widely used in AI to support timely and interactive communication with users. We propose a general-purpose dialogue system architecture that leverages computational argumentation and state-of-the-art language technologies. We illustrate and evaluate the system using a COVID-19 vaccine information case study.

【4】 Clinical Utility of the Automatic Phenotype Annotation in Unstructured Clinical Notes: ICU Use Cases

Authors: Jingqing Zhang, Luis Bolanos, Ashwani Tanwar, Albert Sokol, Julia Ive, Vibhor Gupta, Yike Guo Affiliations: Pangaea Data Limited, UK, USA, Data Science Institute, Imperial College London, London, SW,AZ, UK, Department of Computing, Imperial College London, London, SW,AZ, UK, Hong Kong Baptist University, Hong Kong SAR, China Comments: Manuscript under review Link: https://arxiv.org/abs/2107.11665 Abstract: Clinical notes contain information not present elsewhere, including drug response and symptoms, all of which are highly important when predicting key outcomes in acute care patients. We propose the automatic annotation of phenotypes from clinical notes as a method to capture essential information to predict outcomes in the Intensive Care Unit (ICU). This information is complementary to typically used vital signs and laboratory test results. We demonstrate and validate our approach by conducting experiments on the prediction of in-hospital mortality, physiological decompensation and length of stay in the ICU setting for over 24,000 patients. The prediction models incorporating phenotypic information consistently outperform the baseline models leveraging only vital signs and laboratory test results. Moreover, we conduct a thorough interpretability study, showing that phenotypes provide valuable insights at the patient and cohort levels. Our approach illustrates the viability of using phenotypes to determine outcomes in the ICU.

【5】 MIPE: A Metric Independent Pipeline for Effective Code-Mixed NLG Evaluation

Authors: Ayush Garg, Sammed S Kagi, Vivek Srivastava, Mayank Singh Affiliations: IIT Gandhinagar, India, TCS Research, India Link: https://arxiv.org/abs/2107.11534 Abstract: Code-mixing is the phenomenon of mixing words and phrases from two or more languages in a single utterance of speech or text. Due to the high linguistic diversity, code-mixing presents several challenges in evaluating standard natural language generation (NLG) tasks. Various widely popular metrics perform poorly on code-mixed NLG tasks. To address this challenge, we present a metric-independent evaluation pipeline, MIPE, that significantly improves the correlation between evaluation metrics and human judgments on the generated code-mixed text. As a use case, we demonstrate the performance of MIPE on the machine-generated Hinglish (code-mixing of Hindi and English languages) sentences from the HinGE corpus. We can extend the proposed evaluation strategy to other code-mixed language pairs, NLG tasks, and evaluation metrics with minimal to no effort.
