NLP Academic Digest [7.9]

2021-07-27 10:44:27

cs.CL: 17 papers today

Transformer (2 papers)

【1】 A Review of Bangla Natural Language Processing Tasks and the Utility of Transformer Models

Authors: Firoj Alam, Arid Hasan, Tanvir Alam, Akib Khan, Jannatul Tajrin, Naira Khan, Shammur Absar Chowdhury
Affiliations: BJIT Limited, Bangladesh; Cognitive Insight Limited, Bangladesh
Note: Under review. Keywords: Bangla language processing, text classification, sequence tagging, datasets, benchmarks, transformer models
Link: https://arxiv.org/abs/2107.03844
Abstract: Bangla -- ranked as the 6th most widely spoken language across the world (https://www.ethnologue.com/guides/ethnologue200), with 230 million native speakers -- is still considered a low-resource language in the natural language processing (NLP) community. Despite three decades of research, Bangla NLP (BNLP) is still lagging behind, mainly due to the scarcity of resources and the challenges that come with it. There is sparse work in different areas of BNLP; however, a thorough survey reporting previous work and recent advances is yet to be done. In this study, we first provide a review of Bangla NLP tasks, resources, and tools available to the research community; we benchmark datasets collected from various platforms for nine NLP tasks using current state-of-the-art algorithms (i.e., transformer-based models). We provide comparative results for the studied NLP tasks by comparing monolingual and multilingual models of varying sizes. We report our results using both individual and consolidated datasets and provide data splits for future research. We reviewed a total of 108 papers and conducted 175 sets of experiments. Our results show promising performance using transformer-based models while highlighting the trade-off with computational costs. We hope that such a comprehensive survey will motivate the community to build on and further advance the research on Bangla NLP.

【2】 Can Transformer Models Measure Coherence In Text? Re-Thinking the Shuffle Test

Authors: Philippe Laban, Luke Dai, Lucas Bandarkar, Marti A. Hearst
Affiliation: UC Berkeley
Link: https://arxiv.org/abs/2107.03448
Abstract: The Shuffle Test is the most common task used to evaluate whether NLP models can measure coherence in text. Most recent work uses direct supervision on the task; we show that by simply finetuning a RoBERTa model, we can achieve a near-perfect accuracy of 97.8%, a new state of the art. We argue that this outstanding performance is unlikely to lead to a good model of text coherence, and suggest that the Shuffle Test should be approached in a zero-shot setting: models should be evaluated without being trained on the task itself. We evaluate common models in this setting, such as generative and bi-directional Transformers, and find that larger architectures achieve high performance out of the box. Finally, we suggest the k-Block Shuffle Test, a modification of the original that increases the size of the shuffled blocks. Even though human reader performance remains high (around 95% accuracy), model performance drops from 94% to 78% as block size increases, creating a conceptually simple challenge for benchmarking NLP models. Code available: https://github.com/tingofurro/shuffle_test/
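
The k-Block variant described in the abstract is easy to illustrate. The sketch below is a toy construction of shuffled negatives (not the authors' code): sentences are cut into consecutive blocks of k sentences and the block order is permuted, so larger k leaves more local structure intact and makes the task harder for models.

```python
import random

def k_block_shuffle(sentences, k, seed=0):
    """Split a sentence list into consecutive blocks of k sentences
    and shuffle the order of the blocks; the sentences inside each
    block keep their original relative order."""
    blocks = [sentences[i:i + k] for i in range(0, len(sentences), k)]
    rng = random.Random(seed)
    rng.shuffle(blocks)
    # Flatten the shuffled blocks back into a flat sentence list.
    return [s for block in blocks for s in block]

doc = ["S1.", "S2.", "S3.", "S4.", "S5.", "S6."]
print(k_block_shuffle(doc, 1))  # k=1: the classic sentence-level Shuffle Test
print(k_block_shuffle(doc, 2))  # k=2: k-Block variant, pairs stay contiguous
```

A coherence model is then scored on telling the shuffled output from the original order.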

QA | VQA | Question Answering | Dialogue (1 paper)

【1】 CANDLE: Decomposing Conditional and Conjunctive Queries for Task-Oriented Dialogue Systems

Authors: Aadesh Gupta, Kaustubh D. Dhole, Rahul Tarway, Swetha Prabhakar, Ashish Shrivastava
Affiliation: Amelia Science, IPsoft R&D
Link: https://arxiv.org/abs/2107.03884
Abstract: Domain-specific dialogue systems generally determine user intents by relying on sentence-level classifiers that mainly focus on single-action sentences. Such classifiers are not designed to effectively handle complex queries composed of conditional and sequential clauses representing multiple actions. We attempt to decompose such queries into smaller single-action sub-queries that are reasonable for intent classifiers to understand in a dialogue pipeline. We release CANDLE (Conditional & AND type Expressions), a dataset consisting of 3124 utterances manually tagged with conditional and sequential labels, and demonstrate this decomposition by training two baseline taggers.
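
The decomposition the paper targets can be pictured with a deliberately naive sketch. The splitter below is hypothetical (CANDLE itself trains sequence taggers on manually labelled data); it only illustrates what turning one conditional-plus-sequential query into single-action sub-queries looks like.

```python
import re

def decompose(query):
    """Toy rule-based sketch: peel off an 'if <condition>,' clause,
    then split the remaining actions on sequential connectives.
    The markers and the example query are illustrative only."""
    parts = []
    m = re.match(r"if (.+?), (.+)", query, flags=re.IGNORECASE)
    if m:
        parts.append(("condition", m.group(1)))
        rest = m.group(2)
    else:
        rest = query
    # Split the remaining clause on 'and then' / 'then' / 'and'.
    for action in re.split(r"\s*\b(?:and then|then|and)\s+", rest):
        if action:
            parts.append(("action", action))
    return parts

print(decompose("if my balance is low, alert me and then freeze the card"))
```

Each resulting sub-query is a single action that a sentence-level intent classifier can handle on its own.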

Machine Translation (1 paper)

【1】 Using CollGram to Compare Formulaic Language in Human and Neural Machine Translation

Authors: Yves Bestgen
Affiliation: Université catholique de Louvain, Place Cardinal Mercier, Louvain-la-Neuve
Note: Accepted at Translation and Interpreting Technology Online (TRITON 2021)
Link: https://arxiv.org/abs/2107.03625
Abstract: A comparison of formulaic sequences in human and neural machine translation of quality newspaper articles shows that neural machine translations contain fewer lower-frequency but strongly-associated formulaic sequences, and more high-frequency formulaic sequences. These differences were statistically significant, and the effect sizes were almost always medium or large. These observations can be related to the differences between second-language learners of various levels and between translated and untranslated texts. The comparison between the neural machine translation systems indicates that some systems produce more formulaic sequences of both types than other systems.

Semantic Parsing (1 paper)

【1】 COMBO: a new module for EUD parsing

Authors: Mateusz Klimaszewski, Alina Wróblewska
Affiliations: Warsaw University of Technology; Institute of Computer Science, Polish Academy of Sciences
Note: Accepted at IWPT 2021
Link: https://arxiv.org/abs/2107.03809
Abstract: We introduce the COMBO-based approach for EUD parsing and its implementation, which took part in the IWPT 2021 EUD shared task. The goal of this task is to parse raw texts in 17 languages into Enhanced Universal Dependencies (EUD). The proposed approach uses COMBO to predict UD trees and EUD graphs. These structures are then merged into the final EUD graphs. Some EUD edge labels are extended with case information using a single language-independent expansion rule. In the official evaluation, the solution ranked fourth, achieving an average ELAS of 83.79%. The source code is available at https://gitlab.clarin-pl.eu/syntactic-tools/combo.

GAN | Adversarial | Attacks | Generation (1 paper)

【1】 HinGE: A Dataset for Generation and Evaluation of Code-Mixed Hinglish Text

Authors: Vivek Srivastava, Mayank Singh
Affiliations: TCS Research, Pune, Maharashtra, India; IIT Gandhinagar, Gandhinagar, Gujarat, India
Link: https://arxiv.org/abs/2107.03760
Abstract: Text generation is a highly active area of research in the computational linguistics community. Evaluating generated text is a challenging task, and multiple theories and metrics have been proposed over the years. Unfortunately, text generation and evaluation are relatively understudied for code-mixed languages, where words and phrases from multiple languages are mixed in a single utterance of text and speech, due to the scarcity of high-quality resources. To address this challenge, we present HinGE, a corpus for the widely popular code-mixed language Hinglish (code-mixing of Hindi and English). HinGE contains Hinglish sentences generated by humans as well as by two rule-based algorithms from the corresponding parallel Hindi-English sentences. In addition, we demonstrate the inefficacy of widely used evaluation metrics on code-mixed data. The HinGE dataset will facilitate progress in natural language generation research for code-mixed languages.

Semi-/Weakly-/Unsupervised Learning | Uncertainty (1 paper)

【1】 Keep it Simple: Unsupervised Simplification of Multi-Paragraph Text

Authors: Philippe Laban, Tobias Schnabel, Paul N. Bennett, Marti A. Hearst
Affiliations: Microsoft; UC Berkeley
Link: https://arxiv.org/abs/2107.03444
Abstract: This work presents Keep it Simple (KiS), a new approach to unsupervised text simplification which learns to balance a reward across three properties: fluency, salience and simplicity. We train the model with a novel algorithm to optimize the reward (k-SCST), in which the model proposes several candidate simplifications, computes each candidate's reward, and encourages candidates that outperform the mean reward. Finally, we propose a realistic text comprehension task as an evaluation method for text simplification. When tested on the English news domain, the KiS model outperforms strong supervised baselines by more than 4 SARI points, and can help people complete a comprehension task an average of 18% faster while retaining accuracy, when compared to the original text. Code available: https://github.com/tingofurro/keep_it_simple
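
The k-SCST idea is compact: sample k candidate simplifications, score each, and use the mean reward of the batch as the baseline, so only candidates beating the mean are reinforced. A minimal sketch of the advantage computation follows; the reward values are made up, and in KiS the reward itself combines fluency, salience and simplicity scores.

```python
def kscst_advantages(rewards):
    """k-SCST baseline: the mean reward over the k sampled candidates
    serves as the baseline; candidates that beat the mean get a
    positive advantage (their log-likelihood is pushed up during
    policy-gradient training), the rest a negative one."""
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]

# Hypothetical rewards for k=4 candidate simplifications.
advs = kscst_advantages([1.0, 3.0, 2.0, 4.0])
print(advs)  # [-1.5, 0.5, -0.5, 1.5]
```

Because the baseline is the batch mean, the advantages always sum to zero, which keeps the gradient estimate centred without a learned value function.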

Word2Vec | Text | Words (2 papers)

【1】 Inspiration through Observation: Demonstrating the Influence of Automatically Generated Text on Creative Writing

Authors: Melissa Roemmele
Affiliation: Language Weaver (RWS Group), Los Angeles, CA, USA
Note: Accepted at ICCC 2021
Link: https://arxiv.org/abs/2107.04007
Abstract: Getting machines to generate text perceived as creative is a long-pursued goal. A growing body of research directs this goal towards augmenting the creative writing abilities of human authors. In this paper, we pursue this objective by analyzing how observing examples of automatically generated text influences writing. In particular, we examine a task referred to as sentence infilling, which involves transforming a list of words into a complete sentence. We emphasize "storiability" as a desirable feature of the resulting sentences, where "storiable" sentences are those that suggest a story a reader would be curious to hear about. Both humans and an automated system (based on a neural language model) performed this sentence infilling task. In one setting, people wrote sentences on their own; in a different setting, people observed the sentences produced by the model while writing their own sentences. Readers then assigned storiability preferences to the resulting sentences in a subsequent evaluation. We find that human-authored sentences were judged as more storiable when authors observed the generated examples, and that storiability increased as authors derived more semantic content from the examples. This result gives evidence of an "inspiration through observation" paradigm for human-computer collaborative writing, through which human writing can be enhanced by text generation models without directly copying their output.

【2】 Handling Heavily Abbreviated Manuscripts: HTR engines vs text normalisation approaches

Authors: Jean-Baptiste Camps, Chahan Vidal-Gorène, Marguerite Vernet
Affiliation: École nationale des chartes, Université Paris Sciences & Lettres, rue de Richelieu, Paris, France
Note: Accompanying data available at: this https URL
Link: https://arxiv.org/abs/2107.03450
Abstract: Although abbreviations are fairly common in handwritten sources, particularly in medieval and modern Western manuscripts, previous research dealing with computational approaches to their expansion is scarce. Yet abbreviations present particular challenges to computational approaches such as handwritten text recognition and natural language processing tasks. Often, pre-processing ultimately aims to lead from a digitised image of the source to a normalised text, which includes expansion of the abbreviations. We explore different setups to obtain such a normalised text, either directly, by training HTR engines on normalised (i.e., expanded, disabbreviated) text, or by decomposing the process into discrete steps, each making use of specialist models for recognition, word segmentation and normalisation. The case studies considered here are drawn from the medieval Latin tradition.

Other Neural Networks | Deep Learning | Models | Modeling (2 papers)

【1】 Vector Space Morphology with Linear Discriminative Learning

Authors: Yu-Ying Chuang, Mihi Kang, Xuefeng Luo, R. Harald Baayen
Affiliation: University of Tübingen
Link: https://arxiv.org/abs/2107.03950
Abstract: This paper presents three case studies of modeling aspects of lexical processing with Linear Discriminative Learning (LDL), the computational engine of the Discriminative Lexicon model (Baayen et al., 2019). With numeric representations of word forms and meanings, LDL learns to map one vector space onto the other, without being informed about any morphological structure or inflectional classes. The modeling results demonstrate that LDL not only performs well for understanding and producing morphologically complex words, but also generates quantitative measures that are predictive of human behavioral data. LDL models are straightforward to implement with the JudiLing package (Luo et al., 2021). Worked examples are provided for three modeling challenges: producing and understanding Korean verb inflection, predicting primed Dutch lexical decision latencies, and predicting the acoustic duration of Mandarin words.
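
The core of LDL is a linear mapping between form and meaning vector spaces, estimated in closed form. A minimal NumPy sketch (toy matrices, not the JudiLing package): given a cue matrix C with one row per word form and a semantic matrix S with one row per meaning, comprehension solves C F ≈ S by least squares, with no morphological structure supplied anywhere.

```python
import numpy as np

# Toy cue matrix C (one row per word form) and semantic matrix S
# (one row per word meaning); both are made-up illustrations.
# The fourth form combines the cues of the first two, and its
# meaning vector is likewise the sum of their meanings.
C = np.array([[1., 0., 0.],
              [0., 1., 0.],
              [0., 0., 1.],
              [1., 1., 0.]])
S = np.array([[2., 1.],
              [0., 3.],
              [1., 1.],
              [2., 4.]])

# Comprehension mapping F: least-squares solution of C @ F = S.
F, *_ = np.linalg.lstsq(C, S, rcond=None)

S_hat = C @ F  # predicted meanings for the training forms
print(np.round(S_hat, 2))
```

Production works the same way in the opposite direction, mapping S back onto C; behavioral predictors are then derived from quantities such as the distance between predicted and target vectors.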

【2】 LanguageRefer: Spatial-Language Model for 3D Visual Grounding

Authors: Junha Roh, Karthik Desingh, Ali Farhadi, Dieter Fox
Affiliation: Paul G. Allen School, University of Washington, United States
Note: 11 pages, 3 figures
Link: https://arxiv.org/abs/2107.03438
Abstract: To realize robots that can understand human instructions and perform meaningful tasks in the near future, it is important to develop learned models that can understand referential language to identify common objects in real-world 3D scenes. In this paper, we develop a spatial-language model for a 3D visual grounding problem. Specifically, given a reconstructed 3D scene in the form of a point cloud with 3D bounding boxes of potential object candidates, and a language utterance referring to a target object in the scene, our model identifies the target object from a set of potential candidates. Our spatial-language model uses a transformer-based architecture that combines spatial embeddings from the bounding boxes with finetuned language embeddings from DistilBert, and reasons among the objects in the 3D scene to find the target object. We show that our model performs competitively on visio-linguistic datasets proposed by ReferIt3D. We provide additional analysis of performance on spatial reasoning tasks decoupled from perception noise, the effect of view-dependent utterances on accuracy, and view-point annotations for potential robotics applications.

Other (6 papers)

【1】 Meeting the SDGs: Enabling the Goals by Cooperation with Crowd using a Conversational AI Platform

Authors: J. Haqbeen, T. Ito, S. Sahab, R. Hadfi, T. Sato, S. Okuhara
Affiliation: Department of Computer Science, Nagoya Institute of Technology, Nagoya, Japan
Note: 7 pages, 6 figures, 1 table; to appear as a conference paper at KICSS 2020
Link: https://arxiv.org/abs/2107.04011
Abstract: In this paper, we report on a large-scale online discussion with 1099 citizens on the Afghanistan Sustainable Development Goals.

【2】 Privacy Concerns in Chatbot Interactions: When to Trust and When to Worry

Authors: Rahime Belen Saglam, Jason R. C. Nurse, Duncan Hodges
Affiliations: University of Kent, UK; Cranfield University, Defence Academy of the United Kingdom, UK
Link: https://arxiv.org/abs/2107.03959
Abstract: Through advances in their conversational abilities, chatbots have started to request and process an increasing variety of sensitive personal information. The accurate disclosure of sensitive information is essential where it is used to provide advice and support to users in the healthcare and finance sectors. In this study, we explore users' concerns regarding factors associated with the use of sensitive data by chatbot providers. We surveyed a representative sample of 491 British citizens. Our results show that user concerns focus on deleting personal information and on the inappropriate use of their data. We also identified that individuals were concerned about losing control over their data after a conversation with conversational agents. We found no effect from a user's gender or education but did find an effect from the user's age, with those over 45 being more concerned than those under 45. We also considered the factors that engender trust in a chatbot. Our respondents' primary focus was on the chatbot's technical elements, with factors such as response quality identified as the most critical. We again found no effect from the user's gender or education level; however, when we considered some social factors (e.g. avatars or perceived 'friendliness'), we found those under 45 years old rated these as more important than those over 45. The paper concludes with a discussion of these results within the context of designing inclusive digital systems that support a wide range of users.

【3】 Multilingual Speech Evaluation: Case Studies on English, Malay and Tamil

Authors: Huayun Zhang, Ke Shi, Nancy F. Chen
Affiliation: Institute for Infocomm Research, A*STAR, Singapore
Note: Accepted at INTERSPEECH 2021
Link: https://arxiv.org/abs/2107.03675
Abstract: Speech evaluation is an essential component in computer-assisted language learning (CALL). While speech evaluation on English has been popular, automatic speech scoring for low-resource languages remains challenging. Work in this area has focused on monolingual-specific designs and handcrafted features stemming from resource-rich languages like English. Such approaches are often difficult to generalize to other languages, especially if we also want to consider suprasegmental qualities such as rhythm. In this work, we examine three languages that possess distinct rhythm patterns: English (stress-timed), Malay (syllable-timed), and Tamil (mora-timed). We exploit robust feature representations inspired by music processing and vector representation learning. Empirical validations show consistent gains for all three languages when predicting pronunciation, rhythm and intonation performance.

【4】 POSLAN: Disentangling Chat with Positional and Language encoded Post Embeddings

Authors: Bhashithe Abeysinghe, Dhara Shah, Chris Freas, Robert Harrison, Rajshekhar Sunderraman
Affiliation: Department of Computer Science, Georgia State University, Atlanta
Link: https://arxiv.org/abs/2107.03529
Abstract: Most online message threads are inherently cluttered, and any new user, or an existing user returning after a hiatus, will have a difficult time understanding what is being discussed in the thread. Cluttered responses in a message thread likewise make analyzing the messages a difficult problem. The need for disentangling the clutter is much higher when the platform hosting the discussion does not provide functions to retrieve the reply relations of the messages. This introduces an interesting problem, which Wang et al. (2011) phrase as a structural learning problem. We create vector embeddings for the posts in a thread so that they capture both linguistic and positional features relative to the context in which a given message appears. Using these post embeddings, we compute a similarity-based connectivity matrix, which is then converted into a graph. After employing a pruning mechanism, the resultant graph can be used to discover the reply relations of the posts in the thread. The process of discovering, or disentangling, the chat is kept as an unsupervised mechanism. We present our experimental results on a dataset obtained from Telegram with limited metadata.
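
The pipeline sketched in the abstract, pairwise similarity, then pruning, then a reply graph, can be illustrated with toy embeddings. The vectors, the threshold, and the parent-selection rule below are illustrative assumptions, not POSLAN's exact procedure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def reply_edges(embeddings, threshold=0.5):
    """Toy sketch: compare each post to every earlier post via the
    similarity matrix, prune pairs below the threshold, and link the
    post to its most similar earlier post (its inferred parent)."""
    edges = {}
    for i in range(1, len(embeddings)):
        sims = [(cosine(embeddings[i], embeddings[j]), j) for j in range(i)]
        best_sim, best_j = max(sims)
        if best_sim >= threshold:   # pruning step
            edges[i] = best_j
    return edges

# Four hypothetical post embeddings: post 2 resembles post 0,
# post 3 resembles post 1, and post 1 starts a new sub-thread.
posts = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
print(reply_edges(posts))  # {2: 0, 3: 1}
```

Posts whose best similarity falls below the threshold (post 1 here) get no parent and start a new conversation, which is how the pruning disentangles interleaved sub-threads without supervision.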

【5】 Worry, coping and resignation -- A repeated-measures study on emotional responses after a year in the pandemic

Authors: Maximilian Mozes, Isabelle van der Vegt, Bennett Kleinberg (all authors contributed equally)
Affiliations: Department of Security and Crime Science, University College London, UK; Dawes Centre for Future Crime, University College London, UK; Department of Computer Science, University College London, UK
Note: Preprint
Link: https://arxiv.org/abs/2107.03466
Abstract: The introduction of COVID-19 lockdown measures and an outlook on return to normality are demanding societal changes. Among the most pressing questions is how individuals adjust to the pandemic. This paper examines emotional responses to the pandemic in a repeated-measures design. Data (n=1698) were collected in April 2020 (during strict lockdown measures) and in April 2021 (when vaccination programmes gained traction). We asked participants to report their emotions and to express them in text data. Statistical tests revealed an average trend towards better adjustment to the pandemic. However, clustering analyses suggested a more complex, heterogeneous pattern, with a well-coping and a resigning subgroup of participants. Linguistic computational analyses uncovered that topics and n-gram frequencies shifted towards attention to the vaccination programme and away from general worrying. Implications for public mental health efforts in identifying people at heightened risk are discussed. The dataset is made publicly available.

【6】 Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling

Authors: Emily Dinan, Gavin Abercrombie, A. Stevie Bergman, Shannon Spruit, Dirk Hovy, Y-Lan Boureau, Verena Rieser
Affiliations: Facebook AI Research; Heriot-Watt University; Responsible AI, Facebook; Independent Ethics Advisor at Populytics, Netherlands; Bocconi University
Link: https://arxiv.org/abs/2107.03451
Abstract: Over the last several years, end-to-end neural conversational agents have vastly improved in their ability to carry a chit-chat conversation with humans. However, these models are often trained on large datasets from the internet and, as a result, may learn undesirable behaviors from this data, such as toxic or otherwise harmful language. Researchers must thus wrestle with the issue of how and when to release these models. In this paper, we survey the problem landscape for safety in end-to-end conversational AI and discuss recent and related work. We highlight tensions between values, potential positive impact, and potential harms, and provide a framework for making decisions about whether and how to release these models, following the tenets of value-sensitive design. We additionally provide a suite of tools to enable researchers to make better-informed decisions about training and releasing end-to-end conversational AI models.
