cs.LG: 123 papers today
Graph-related (graph learning | graph neural networks | graph optimization, etc.) (6 papers)
【1】 Graph Attention Multi-Layer Perceptron Link: https://arxiv.org/abs/2108.10097
Authors: Wentao Zhang, Ziqi Yin, Zeang Sheng, Wen Ouyang, Xiaosen Li, Yangyu Tao, Zhi Yang, Bin Cui
Affiliations: EECS, Peking University; Tencent Inc.; Beijing Institute of Technology
Note: 12 pages, 4 figures
Abstract: Graph neural networks (GNNs) have recently achieved state-of-the-art performance in many graph-based applications. Despite their high expressive power, they typically need to perform an expensive recursive neighborhood expansion over multiple training epochs and face a scalability issue. Moreover, most of them are inflexible since they are restricted to fixed-hop neighborhoods and are insensitive to the actual receptive field demands of different nodes. We circumvent these limitations by introducing a scalable and flexible Graph Attention Multilayer Perceptron (GAMLP). By separating the non-linear transformation from feature propagation, GAMLP significantly improves scalability and efficiency by performing the propagation procedure in a precomputed manner. With three principled receptive field attention mechanisms, each node in GAMLP is flexible and adaptive in leveraging the propagated features over receptive fields of different sizes. We conduct extensive evaluations on three large open graph benchmarks (ogbn-papers100M, ogbn-products, and ogbn-mag), demonstrating that GAMLP not only achieves state-of-the-art performance but also provides high scalability and efficiency.
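As a rough illustration of the precompute-then-attend idea, a hedged PyTorch sketch (the single hop-attention form below is a simplification; the paper proposes three principled attention mechanisms, and sparse matrix products would be used at real scale):

```python
import torch
import torch.nn as nn

def precompute_hops(adj_norm, x, num_hops):
    """[X, AX, A^2 X, ...] computed once, so training never propagates."""
    feats = [x]
    for _ in range(num_hops):
        feats.append(adj_norm @ feats[-1])
    return torch.stack(feats, dim=1)            # (N, K+1, d)

class HopAttentionMLP(nn.Module):
    def __init__(self, in_dim, hidden, num_classes):
        super().__init__()
        self.att = nn.Linear(in_dim, 1)         # per-node score for each hop
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, num_classes))

    def forward(self, hop_feats):               # (N, K+1, d)
        w = torch.softmax(self.att(hop_feats).squeeze(-1), dim=1)
        mixed = (w.unsqueeze(-1) * hop_feats).sum(dim=1)  # adaptive receptive field
        return self.mlp(mixed)

x = torch.randn(100, 32)                        # toy graph: 100 nodes
adj = (torch.rand(100, 100) < 0.05).float()
adj_norm = adj / adj.sum(1, keepdim=True).clamp(min=1)  # row-normalized
logits = HopAttentionMLP(32, 64, 7)(precompute_hops(adj_norm, x, num_hops=3))
```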
【2】 Relative Entropy-Regularized Optimal Transport on a Graph: a new algorithm and an experimental comparison Link: https://arxiv.org/abs/2108.10004
Authors: Sylvain Courtain, Guillaume Guex, Ilkka Kivimaki, Marco Saerens
Abstract: Following [21, 23], the present work investigates a new relative entropy-regularized algorithm for solving the optimal transport on a graph problem within the randomized shortest paths formalism. More precisely, a unit flow is injected into a set of input nodes and collected from a set of output nodes while minimizing the expected transportation cost together with a relative entropy regularization term over paths, providing a randomized routing policy. The main advantage of this new formulation is the fact that it can easily accommodate edge flow capacity constraints, which commonly occur in real-world problems. The resulting optimal routing policy, i.e., the probability distribution of following an edge at each node, is Markovian and is computed by constraining the input and output flows to the prescribed marginal probabilities, thanks to a variant of the algorithm developed in [8]. In addition, experimental comparisons with other recently developed techniques show that the distance measure between nodes derived from the introduced model provides competitive results on semi-supervised classification tasks.
【3】 Generative and Contrastive Self-Supervised Learning for Graph Anomaly Detection Link: https://arxiv.org/abs/2108.09896
Authors: Yu Zheng, Ming Jin, Yixin Liu, Lianhua Chi, Khoa T. Phan, Yi-Ping Phoebe Chen
Note: 14 pages, 5 figures
Abstract: Anomaly detection from graph data has drawn much attention due to its practical significance in many critical applications, including cybersecurity, finance, and social networks. Existing data mining and machine learning methods are either shallow methods that cannot effectively capture the complex interdependency of graph data or graph autoencoder methods that cannot fully exploit contextual information as supervision signals for effective anomaly detection. To overcome these challenges, in this paper we propose a novel method, Self-Supervised Learning for Graph Anomaly Detection (SL-GAD). Our method constructs different contextual subgraphs (views) based on a target node and employs two modules, generative attribute regression and multi-view contrastive learning, for anomaly detection. While the generative attribute regression module allows us to capture anomalies in the attribute space, the multi-view contrastive learning module can exploit richer structure information from multiple subgraphs and is thus able to capture anomalies in the structure space as well as in the mixture of structure and attribute information. We conduct extensive experiments on six benchmark datasets, and the results demonstrate that our method outperforms state-of-the-art methods by a large margin.
【4】 A Hard Label Black-box Adversarial Attack Against Graph Neural Networks Link: https://arxiv.org/abs/2108.09513
Authors: Jiaming Mu, Binghui Wang, Qi Li, Kun Sun, Mingwei Xu, Zhuotao Liu
Affiliations: Institute for Network Sciences and Cyberspace & Department of Computer Science & BNRist, Tsinghua University; Illinois Institute of Technology; George Mason University
Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art performance in various graph-structure-related tasks such as node classification and graph classification. However, GNNs are vulnerable to adversarial attacks. Existing works mainly focus on attacking GNNs for node classification; nevertheless, attacks against GNNs for graph classification have not been well explored. In this work, we conduct a systematic study of adversarial attacks against GNNs for graph classification via perturbing the graph structure. In particular, we focus on the most challenging attack, i.e., the hard label black-box attack, where an attacker has no knowledge about the target GNN model and can only obtain predicted labels by querying the target model. To achieve this goal, we formulate our attack as an optimization problem, whose objective is to minimize the number of edges to be perturbed in a graph while maintaining a high attack success rate. The original optimization problem is intractable to solve, and we relax it to a tractable one, which is solved with a theoretical convergence guarantee. We also design a coarse-grained searching algorithm and a query-efficient gradient computation algorithm to decrease the number of queries to the target GNN model. Our experimental results on three real-world datasets demonstrate that our attack can effectively attack representative GNNs for graph classification with fewer queries and perturbations. We also evaluate the effectiveness of our attack under two defenses: one is a well-designed adversarial graph detector, and in the other the target GNN model itself is equipped with a defense to prevent adversarial graph generation. Our experimental results show that such defenses are not effective enough, which highlights the need for more advanced defenses.
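To make the query-only setting concrete, a hedged sketch of a naive greedy baseline (not the paper's relaxed optimization with convergence guarantees; `query_label` stands in for the victim model, which returns only a predicted class):

```python
import itertools
import numpy as np

def greedy_hard_label_attack(adj, query_label, true_label, budget):
    """adj: symmetric 0/1 numpy adjacency; query_label returns a class only."""
    adj = adj.copy()
    for _ in range(budget):
        if query_label(adj) != true_label:          # already evades
            return adj, True
        flipped = None
        for i, j in itertools.combinations(range(adj.shape[0]), 2):
            trial = adj.copy()
            trial[i, j] = trial[j, i] = 1 - trial[i, j]   # flip one edge
            if query_label(trial) != true_label:
                flipped = (i, j)
                break
        if flipped is None:                         # no single flip helps; a real
            return adj, False                       # attack needs a better search
        i, j = flipped
        adj[i, j] = adj[j, i] = 1 - adj[i, j]
    return adj, query_label(adj) != true_label
```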
【5】 Crown Jewels Analysis using Reinforcement Learning with Attack Graphs Link: https://arxiv.org/abs/2108.09358
Authors: Rohit Gangupantulu, Tyler Cody, Abdul Rahman, Christopher Redino, Ryan Clark, Paul Park
Affiliations: Deloitte Consulting, LLC; Hume Center for National Security and Technology, Virginia Polytechnic University; Deloitte & Touche LLP
Abstract: Cyber attacks pose existential threats to nations and enterprises. Current practice favors piece-wise analysis using threat models instead of rigorous cyber terrain analysis and intelligence preparation of the battlefield. Automated penetration testing using reinforcement learning offers a new and promising approach for developing methodologies that are driven by network structure and cyber terrain, that can later be interpreted in terms of threat models, but that are principally network-driven analyses. This paper presents a novel method for crown jewel analysis, termed CJA-RL, that uses reinforcement learning to identify key terrain and avenues of approach for exploiting crown jewels. In our experiment, CJA-RL identified ideal entry points, choke points, and pivots for exploiting a network with multiple crown jewels, exemplifying how CJA-RL, and reinforcement learning for penetration testing generally, can benefit computer network operations workflows.
【6】 Graph-Convolutional Deep Learning to Identify Optimized Molecular Configurations Link: https://arxiv.org/abs/2108.09637
Authors: Eshan Joshi, Samuel Somuyiwa, Hossein Z. Jooya
Note: 15 pages, 3 figures
Abstract: Tackling molecular optimization problems using conventional computational methods is challenging because determining the optimized configuration is known to be an NP-hard problem. Recently, there has been increasing interest in applying different deep-learning techniques to benchmark molecular optimization tasks. In this work, we implement a graph-convolutional method to classify molecular structures using the equilibrium and non-equilibrium configurations provided in the QM7-X data set. Atomic forces are encoded in graph vertices, and the substantial suppression of the total force magnitude on the atoms in the optimized structure is learned for the graph classification task. We demonstrate the results using two different graph pooling layers and compare their respective performances.
Transformer (4 papers)
【1】 C5T5: Controllable Generation of Organic Molecules with Transformers Link: https://arxiv.org/abs/2108.10307
Authors: Daniel Rothchild, Alex Tamkin, Julie Yu, Ujval Misra, Joseph Gonzalez
Affiliations: EECS, UC Berkeley; Computer Science, Stanford University; Data Science & Chemistry
Abstract: Methods for designing organic materials with desired properties have high potential impact across fields such as medicine, renewable energy, petrochemical engineering, and agriculture. However, using generative modeling to design substances with desired properties is difficult because candidate compounds must satisfy multiple constraints, including synthetic accessibility and other metrics that are intuitive to domain experts but challenging to quantify. We propose C5T5, a novel self-supervised pretraining method that enables transformers to make zero-shot select-and-replace edits, altering organic substances towards desired property values. C5T5 operates on IUPAC names -- a standardized molecular representation that intuitively encodes rich structural information for organic chemists but that has been largely ignored by the ML community. Our technique requires no edited molecule pairs to train and only a rough estimate of molecular properties, and it has the potential to model long-range dependencies and symmetric molecular structures more easily than graph-based methods. C5T5 also provides a powerful interface to domain experts: it grants users fine-grained control over the generative process by selecting and replacing IUPAC name fragments, which enables experts to leverage their intuitions about structure-activity relationships. We demonstrate C5T5's effectiveness on four physical properties relevant for drug discovery, showing that it learns successful and chemically intuitive strategies for altering molecules towards desired property values.
【2】 Power transformer faults diagnosis using undestructive methods (Roger and IEC) and artificial neural network for dissolved gas analysis applied on the functional transformer in the Algerian north-eastern: a comparative study Link: https://arxiv.org/abs/2108.10205
Authors: Bouchaoui Lahcene, Kamel Eddine Hemsas, Hacene Mellah, Saad Eddine Benlahneche
Abstract: Nowadays, power transformer aging and failures are viewed with great attention in the power transmission industry. Dissolved gas analysis (DGA) is among the most widely used methods, in the context of asset management policy, for detecting incipient faults in power transformers at an early stage. Up to now, several procedures have been employed for the interpretation of DGA results. Among these useful means, we find Key Gases, Rogers Ratios, IEC Ratios, the historical and less-used-today Doernenburg Ratios, the two types of Duval Pentagon methods, several versions of the Duval Triangle method, and the Logarithmic Nomograph. Problem. DGA data extracted from different units in service served to verify the ability and reliability of these methods in assessing the state of health of the power transformer. Aim. Improving the quality of diagnostics of electrical power transformers with artificial neural network tools based on two conventional methods, in the case of a functional power transformer at Sétif province in northeastern Algeria. Methodology. Design an intelligent tool for power transformer diagnosis using neural networks based on the traditional IEC and Rogers methods, which allows early detection of faults, increases the reliability of the entire electrical energy system from transmission to consumers, and improves continuity and quality of service. Results. The solution was obtained using feed-forward back-propagation neural networks implemented in the MATLAB-Simulink environment. Four real power transformers working under different environmental and climate conditions, such as desert, humid, and cold, were taken into account. The practical results of the diagnosis of these power transformers by DGA are presented. Practical value. ...
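To make the ratio-plus-ANN pipeline concrete, a minimal sketch (the three ratios shown, C2H2/C2H4, CH4/H2, and C2H4/C2H6, follow IEC 60599; the fault classes and layer sizes are illustrative assumptions, not the paper's configuration):

```python
import numpy as np
import torch
import torch.nn as nn

def iec_ratios(h2, ch4, c2h2, c2h4, c2h6, eps=1e-9):
    """Gas concentrations in ppm -> the three IEC 60599 ratios."""
    return np.array([c2h2 / (c2h4 + eps),
                     ch4 / (h2 + eps),
                     c2h4 / (c2h6 + eps)], dtype=np.float32)

class DGANet(nn.Module):
    def __init__(self, num_faults=6):            # e.g. PD, D1, D2, T1, T2, T3
        super().__init__()
        self.net = nn.Sequential(nn.Linear(3, 16), nn.ReLU(),
                                 nn.Linear(16, num_faults))
    def forward(self, x):
        return self.net(x)

x = torch.from_numpy(iec_ratios(h2=120.0, ch4=80.0, c2h2=2.0,
                                c2h4=30.0, c2h6=25.0)).unsqueeze(0)
logits = DGANet()(x)                              # untrained; for shapes only
```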
【3】 Automated Identification of Cell Populations in Flow Cytometry Data with Transformers Link: https://arxiv.org/abs/2108.10072
Authors: Matthias Wödlinger, Michael Reiter, Lisa Weijler, Margarita Maurer-Granofszky, Angela Schumich, Michael Dworzak
Affiliations: TU Wien, Vienna, Austria; St. Anna Children's Cancer Research Institute, Vienna, Austria
Abstract: Acute Lymphoblastic Leukemia (ALL) is the most frequent hematologic malignancy in children and adolescents. A strong prognostic factor in ALL is the Minimal Residual Disease (MRD), a measure of the number of leukemic cells persisting in a patient. Manual MRD assessment from Multiparameter Flow Cytometry (FCM) data after treatment is time-consuming and subjective. In this work, we present an automated method to compute the MRD value directly from FCM data. We present a novel neural network approach based on the transformer architecture that learns to directly identify blast cells in a sample. We train our method in a supervised manner and evaluate it on publicly available ALL FCM data from three different clinical centers. Our method reaches a median f1 score of ~0.93 when tested on 200 B-ALL samples.
【4】 A Transformer Architecture for Stress Detection from ECG Link: https://arxiv.org/abs/2108.09737
Authors: Behnam Behinaein, Anubhav Bhatti, Dirk Rodenburg, Paul Hungler, Ali Etemad
Affiliations: Department of Electrical and Computer Engineering, Ingenuity Labs Research Institute, Queen's University, Kingston, Canada
Note: Accepted by 2021 International Symposium on Wearable Computers (ISWC)
Abstract: Electrocardiogram (ECG) has been widely used for emotion recognition. This paper presents a deep neural network based on convolutional layers and a transformer mechanism to detect stress using ECG signals. We perform leave-one-subject-out experiments on two publicly available datasets, WESAD and SWELL-KW, to evaluate our method. Our experiments show that the proposed model achieves strong results, comparable to or better than the state-of-the-art models for ECG-based stress detection on these two datasets. Moreover, our method is end-to-end, does not require handcrafted features, and can learn robust representations with only a few convolutional blocks and the transformer component.
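A minimal sketch of the described convolution-plus-transformer pattern (layer sizes and pooling are illustrative assumptions, not the paper's exact configuration):

```python
import torch
import torch.nn as nn

class ECGStressNet(nn.Module):
    def __init__(self, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        # a few 1-D conv blocks embed the raw ECG into a shorter sequence
        self.conv = nn.Sequential(
            nn.Conv1d(1, d_model, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=7, stride=2, padding=3), nn.ReLU())
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 2)          # stress vs. no stress

    def forward(self, ecg):                        # ecg: (batch, 1, samples)
        z = self.conv(ecg).transpose(1, 2)         # (batch, seq, d_model)
        z = self.encoder(z).mean(dim=1)            # average pooling over time
        return self.head(z)

logits = ECGStressNet()(torch.randn(4, 1, 1024))   # toy batch of ECG windows
```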
GAN | adversarial | attacks | generation (4 papers)
【1】 Back to the Drawing Board: A Critical Evaluation of Poisoning Attacks on Federated Learning Link: https://arxiv.org/abs/2108.10241
Authors: Virat Shejwalkar, Amir Houmansadr, Peter Kairouz, Daniel Ramage
Affiliations: University of Massachusetts Amherst; Google Research
Abstract: While recent works have indicated that federated learning (FL) is vulnerable to poisoning attacks by compromised clients, we show that these works make a number of unrealistic assumptions and arrive at somewhat misleading conclusions. For instance, they often use impractically high percentages of compromised clients or assume unrealistic capabilities for the adversary. We perform the first critical analysis of poisoning attacks under practical production FL environments by carefully characterizing the set of realistic threat models and adversarial capabilities. Our findings are rather surprising: contrary to the established belief, we show that FL, even without any defenses, is highly robust in practice. In fact, we go even further and propose novel, state-of-the-art poisoning attacks under two realistic threat models, and show via an extensive set of experiments across three benchmark datasets how (in)effective poisoning attacks are, especially when simple defense mechanisms are used. We correct previous misconceptions and give concrete guidelines that we hope will encourage our community to conduct more accurate research in this space and build stronger (and more realistic) attacks and defenses.
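For intuition about the threat setting analyzed here, a hedged sketch of FedAvg aggregation with a generic scaled model-poisoning update (a textbook baseline, not the paper's specific attacks; updates are flattened model deltas):

```python
import torch

def fedavg(updates):
    """Plain federated averaging over a list of flattened updates."""
    return torch.stack(updates).mean(dim=0)

def poisoned_round(benign_updates, num_malicious=1, scale=10.0):
    # a generic attack: push a scaled update opposite the benign mean
    direction = -fedavg(benign_updates)
    malicious = [scale * direction] * num_malicious
    return fedavg(benign_updates + malicious)

benign = [torch.randn(1000) for _ in range(99)]   # 99 honest clients
aggregated = poisoned_round(benign, num_malicious=1)
```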
【2】 ExamGAN and Twin-ExamGAN for Exam Script Generation Link: https://arxiv.org/abs/2108.09656
Authors: Zhengyang Wu, Ke Deng, Judy Qiu, Yong Tang
Affiliations: RMIT University
Abstract: Nowadays, learning management systems (LMS) have been widely used at different educational stages, from primary to tertiary education, for student administration, documentation, tracking, reporting, and the delivery of educational courses, training programs, or learning and development programs. Towards effective learning outcome assessment, the exam script generation problem has attracted much attention and has been investigated recently. But research in this field is still in its early stage, and there are opportunities to further improve the quality of generated exam scripts in various aspects. In particular, two essential issues have been largely ignored by existing solutions. First, given a course, it is unknown yet how to generate an exam script that can result in a desirable distribution of student scores in a class (or across different classes). Second, while it is frequently encountered in practice, it is unknown so far how to generate a pair of high-quality exam scripts that are equivalent in assessment (i.e., the student scores are comparable by taking either of them) but have significantly different sets of questions. To fill the gap, this paper proposes ExamGAN (Exam Script Generative Adversarial Network) to generate high-quality exam scripts, and then extends ExamGAN to T-ExamGAN (Twin-ExamGAN) to generate a pair of high-quality exam scripts. Extensive experiments on three benchmark datasets verify the superiority of the proposed solutions in various aspects against the state-of-the-art. Moreover, we have conducted a case study that demonstrated the effectiveness of the proposed solution in a real teaching scenario.
【3】 Cascade Watchdog: A Multi-tiered Adversarial Guard for Outlier Detection Link: https://arxiv.org/abs/2108.09375
Authors: Glauco A. Amigo Galán, Justin Bui, Robert J. Marks
Abstract: The identification of out-of-distribution content is critical to the successful implementation of neural networks. Watchdog techniques have been developed to support the detection of these inputs, but performance can be limited by the amount of available data. Generative adversarial networks have displayed numerous capabilities, including the ability to generate facsimiles with excellent accuracy. This paper presents and empirically evaluates a multi-tiered watchdog, developed using GAN-generated data, for improved out-of-distribution detection. The cascade watchdog uses adversarial training to increase the amount of available data similar to the out-of-distribution elements that are more difficult to detect. Then, a specialized second guard is added in sequential order. The results show a solid and significant improvement in the detection of the most challenging out-of-distribution inputs while preserving an extremely low false positive rate.
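A minimal sketch of the two-tier cascade (thresholds and guard models are assumptions; the second guard would be the one trained on GAN-generated near-distribution negatives):

```python
import random

def cascade_watchdog(x, first_guard, second_guard, t1=0.5, t2=0.5):
    """Each guard returns a score in [0, 1]: the probability x is OOD."""
    if first_guard(x) > t1:           # first tier: clearly OOD inputs
        return "out-of-distribution"
    if second_guard(x) > t2:          # second tier: hard, near-OOD inputs
        return "out-of-distribution"
    return "in-distribution"

# placeholder scorers; real guards would be trained detectors
first_guard = lambda x: random.random()
second_guard = lambda x: random.random()
print(cascade_watchdog([0.1, 0.2], first_guard, second_guard))
```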
【4】 An Adversarial Learning Based Approach for Unknown View Tomographic Reconstruction Link: https://arxiv.org/abs/2108.09873
Authors: Mona Zehni, Zhizhen Zhao
Affiliations: Department of Electrical and Computer Engineering and Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Abstract: The goal of 2D tomographic reconstruction is to recover an image given its projection lines from various views. It is often presumed that the projection angles associated with the projection lines are known in advance. Under certain situations, however, these angles are known only approximately or are completely unknown. It then becomes more challenging to reconstruct the image from a collection of random projection lines. We propose an adversarial learning based approach to recover the image and the projection angle distribution by matching the empirical distribution of the measurements with the generated data. Fitting the distributions is achieved by solving a min-max game between a generator and a critic based on the Wasserstein generative adversarial network structure. To accommodate updating the projection angle distribution through gradient back-propagation, we approximate the loss using the Gumbel-Softmax reparameterization of samples from discrete distributions. Our theoretical analysis verifies the unique recovery of the image and the projection distribution, up to a rotation and reflection, upon convergence. Our extensive numerical experiments showcase the potential of our method to accurately recover the image and the projection angle distribution under noise contamination.
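A minimal sketch of the reparameterization step (assuming, as the use of Gumbel-Softmax over discrete distributions suggests, that the angle distribution is discretized into bins; bin count and temperature are illustrative):

```python
import math
import torch
import torch.nn.functional as F

K = 180                                             # angle bins over [0, pi)
angle_logits = torch.zeros(K, requires_grad=True)   # learnable distribution
bin_centers = torch.linspace(0.0, math.pi, K + 1)[:-1]

def sample_angles(batch, tau=0.5):
    """Differentiable angle samples: gradients flow back into angle_logits."""
    logits = angle_logits.expand(batch, K)
    onehot = F.gumbel_softmax(logits, tau=tau, hard=True)  # (batch, K) one-hots
    return onehot @ bin_centers                      # (batch,) sampled angles

angles = sample_angles(32)                           # feed into the projector
```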
Semi/weakly/un/fully-supervised | uncertainty | active learning (5 papers)
【1】 Adaptive unsupervised learning with enhanced feature representation for intra-tumor partitioning and survival prediction for glioblastoma Link: https://arxiv.org/abs/2108.09423
Authors: Yifan Li, Chao Li, Yiran Wei, Stephen Price, Carola-Bibiane Schönlieb, Xi Chen
Affiliations: Department of Computer Science, University of Bath, Bath, UK; Department of Applied Mathematics and Theoretical Physics
Abstract: Glioblastoma is profoundly heterogeneous in regional microstructure and vasculature. Characterizing the spatial heterogeneity of glioblastoma could lead to more precise treatment. With unsupervised learning techniques, glioblastoma MRI-derived radiomic features have been widely utilized for tumor sub-region segmentation and survival prediction. However, the reliability of algorithm outcomes is often challenged by both an ambiguous intermediate process and the instability introduced by the randomness of clustering algorithms, especially for data from heterogeneous patients. In this paper, we propose an adaptive unsupervised learning approach for efficient MRI intra-tumor partitioning and glioblastoma survival prediction. A novel and problem-specific Feature-enhanced Auto-Encoder (FAE) is developed to enhance the representation of pairwise clinical modalities and therefore improve the clustering stability of unsupervised learning algorithms such as K-means. Moreover, the entire process is modeled with the Bayesian optimization (BO) technique and a custom loss function, so that the hyper-parameters can be adaptively optimized in reasonably few steps. The results demonstrate that the proposed approach can produce robust and clinically relevant MRI sub-regions and statistically significant survival predictions.
【2】 SemiFed: Semi-supervised Federated Learning with Consistency and Pseudo-Labeling Link: https://arxiv.org/abs/2108.09412
Authors: Haowen Lin, Jian Lou, Li Xiong, Cyrus Shahabi
Affiliations: University of Southern California; Emory University; Xidian University
Abstract: Federated learning enables multiple clients, such as mobile phones and organizations, to collaboratively learn a shared model for prediction while protecting local data privacy. However, most recent research on and applications of federated learning assume that all clients have fully labeled data, which is impractical in real-world settings. In this work, we focus on a new scenario for cross-silo federated learning, where the data samples of each client are partially labeled. We borrow ideas from semi-supervised learning methods, in which a large amount of unlabeled data is utilized to improve the model's accuracy despite limited access to labeled examples. We propose a new framework dubbed SemiFed that unifies two dominant approaches for semi-supervised learning: consistency regularization and pseudo-labeling. SemiFed first applies advanced data augmentation techniques to enforce consistency regularization and then generates pseudo-labels using the model's predictions during training. SemiFed takes advantage of the federation so that, for a given image, a pseudo-label holds only if multiple models from different clients produce high-confidence predictions and agree on the same label. Extensive experiments on two image benchmarks demonstrate the effectiveness of our approach under both homogeneous and heterogeneous data distribution settings.
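A hedged sketch of the cross-client agreement rule for pseudo-labels described above (the confidence threshold, agreement count, and helper models are assumptions):

```python
from collections import Counter
import torch

def federated_pseudo_label(x, client_models, conf_thresh=0.95, min_agree=3):
    """x: a single unlabeled example with batch dim (1, ...)."""
    votes = []
    for model in client_models:
        probs = torch.softmax(model(x), dim=-1).squeeze(0)
        conf, pred = probs.max(dim=-1)
        if conf.item() >= conf_thresh:              # keep only confident votes
            votes.append(int(pred))
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agree else None    # accept only on agreement
```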
【3】 Influence Selection for Active Learning Link: https://arxiv.org/abs/2108.09331
Authors: Zhuoming Liu, Hao Ding, Huaping Zhong, Weijia Li, Jifeng Dai, Conghui He
Affiliations: University of Southern California; Johns Hopkins University; SenseTime Research; CUHK-SenseTime Joint Lab, The Chinese University of Hong Kong
Note: ICCV 2021 accepted paper
Abstract: Existing active learning methods select samples by evaluating a sample's uncertainty or its effect on the diversity of the labeled dataset based on different task-specific or model-specific criteria. In this paper, we propose Influence Selection for Active Learning (ISAL), which selects the unlabeled samples that can provide the most positive influence on model performance. To obtain the influence of an unlabeled sample in the active learning scenario, we design the Untrained Unlabeled sample Influence Calculation (UUIC) to estimate the unlabeled sample's expected gradient, with which we calculate its influence. To prove the effectiveness of UUIC, we provide both theoretical and experimental analyses. Since UUIC depends only on model gradients, which can be obtained easily from any neural network, our active learning algorithm is task-agnostic and model-agnostic. ISAL achieves state-of-the-art performance in different active learning settings for different tasks with different datasets. Compared with previous methods, our method decreases the annotation cost by at least 12%, 13%, and 16% on CIFAR10, VOC2012, and COCO, respectively.
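A rough sketch of influence scoring in this spirit (UUIC's exact estimator is in the paper; here, as a stated assumption, influence is approximated by the alignment between a sample's expected gradient over pseudo-labels and the gradient of a held-out validation loss):

```python
import torch
import torch.nn.functional as F

def flat_grad(loss, params):
    return torch.cat([g.reshape(-1)
                      for g in torch.autograd.grad(loss, params)])

def influence_scores(model, unlabeled, val_x, val_y):
    params = [p for p in model.parameters() if p.requires_grad]
    g_val = flat_grad(F.cross_entropy(model(val_x), val_y), params)
    scores = []
    for x in unlabeled:                        # each x: (1, ...)
        probs = torch.softmax(model(x), dim=-1).squeeze(0).detach()
        g_exp = torch.zeros_like(g_val)
        for c, p in enumerate(probs):          # expectation over pseudo-labels
            loss_c = F.cross_entropy(model(x), torch.tensor([c]))
            g_exp += p * flat_grad(loss_c, params)
        scores.append(torch.dot(g_exp, g_val).item())
    return scores                              # higher = more positive influence
```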
【4】 SALIENCE: An Unsupervised User Adaptation Model for Multiple Wearable Sensors Based Human Activity Recognition Link: https://arxiv.org/abs/2108.10213
Authors: Ling Chen, Yi Zhang, Sirou Zhu, Shenghuan Miao, Liangying Peng, Rong Hu, Mingqi Lv
Affiliations: Alibaba-Zhejiang University Joint Research Institute of Frontier Technologies; Zhejiang University of Technology
Abstract: Unsupervised user adaptation aligns the feature distributions of data from training users and a new user, so that a well-trained wearable human activity recognition (WHAR) model can be well adapted to the new user. With the development of wearable sensors, WHAR based on multiple wearable sensors is gaining more and more attention. To address the challenge that the transferabilities of different sensors differ, we propose the SALIENCE (unsupervised user adaptation model for multiple wearable sensors based human activity recognition) model. It aligns the data of each sensor separately to achieve local alignment, while uniformly aligning the data of all sensors to ensure global alignment. In addition, an attention mechanism is proposed to focus SALIENCE's activity classifier on the sensors with strong feature discrimination and good distribution alignment. Experiments are conducted on two public WHAR datasets, and the results show that our model yields competitive performance.
【5】 Self-Supervised Delineation of Geological Structures using Orthogonal Latent Space Projection Link: https://arxiv.org/abs/2108.09605
Authors: Oluwaseun Joseph Aribido, Ghassan AlRegib, Yazeed Alaudah
Note: A revised version of this manuscript has been accepted to Geophysics.
Abstract: We developed two machine learning frameworks that can assist in automated litho-stratigraphic interpretation of seismic volumes without any manual hand labeling from an experienced seismic interpreter. The first framework is an unsupervised hierarchical clustering model that divides seismic images from a volume into a number of clusters determined by the algorithm. The clustering framework uses a combination of density and hierarchical techniques to determine the size and homogeneity of the clusters. The second framework consists of a self-supervised deep learning framework to label regions of geological interest in seismic images. It projects the latent space of an encoder-decoder architecture onto two orthogonal subspaces, from which it learns to delineate regions of interest in the seismic images. To demonstrate an application of both frameworks, a seismic volume was clustered into various contiguous clusters, from which four clusters were selected based on distinct seismic patterns: horizons, faults, salt domes, and chaotic structures. Images from the selected clusters are used to train the encoder-decoder network. The output of the encoder-decoder network is a probability map of the possibility that an amplitude reflection event belongs to an interesting geological structure. The structures are delineated using the probability map. The delineated images are further used to post-train a segmentation model to extend our results to full vertical sections. The results on vertical sections show that we can factorize a seismic volume into its corresponding structural components. Lastly, we show that our deep learning framework can be modeled as an attribute extractor, and we compare our attribute results with various existing attributes in the literature and demonstrate competitive performance with them.
Transfer | zero/few/one-shot | adaptation (5 papers)
【1】 Dynamic Neural Network Architectural and Topological Adaptation and Related Methods -- A Survey Link: https://arxiv.org/abs/2108.10066
Authors: Lorenz Kummer
Affiliations: Computer Science Department, University of Vienna
Note: 12 pages, preprint
Abstract: Training and inference in deep neural networks (DNNs) have, due to a steady increase in architectural complexity and data set size, led to the development of strategies for reducing the time and space requirements of DNN training and inference. This is of particular importance in scenarios where training takes place in resource-constrained computation environments or inference is part of a time-critical application. In this survey, we aim to provide a general overview and categorization of the state of the art (SOTA) of techniques to reduce DNN training and inference time and space complexities, with a particular focus on architectural adaptations.
【2】 Study of Proximal Normalized Subband Adaptive Algorithm for Acoustic Echo Cancellation Link: https://arxiv.org/abs/2108.10219
Authors: Gang Guo, Yi Yu, Rodrigo C. de Lamare, Zongsheng Zheng, Lu Lu, Qiangming Cai
Note: 13 pages, 12 figures
Abstract: In this paper, we propose a novel normalized subband adaptive filter algorithm suited for sparse scenarios, which combines the proportionate and sparsity-aware mechanisms. The proposed algorithm is derived based on the proximal forward-backward splitting and soft-thresholding methods. We analyze the mean and mean-square behaviors of the algorithm, which is supported by simulations. In addition, an adaptive approach for choosing the thresholding parameter in the proximal step is proposed based on minimization of the mean square deviation. Simulations in the contexts of system identification and acoustic echo cancellation verify the superiority of the proposed algorithm over its counterparts.
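A minimal NumPy sketch of one proximal iteration (the update form is an assumption based on the abstract; the soft-thresholding operator is the proximal map of the l1 penalty that promotes sparse filter weights):

```python
import numpy as np

def soft_threshold(w, thresh):
    """Proximal operator of the l1 norm: shrinks weights toward zero."""
    return np.sign(w) * np.maximum(np.abs(w) - thresh, 0.0)

def nsaf_prox_update(w, subband_inputs, subband_errors, mu=0.5,
                     thresh=1e-4, eps=1e-8):
    """One iteration over all subbands: gradient step, then proximal step."""
    for u_i, e_i in zip(subband_inputs, subband_errors):  # u_i: (L,) regressor
        w = w + mu * e_i * u_i / (u_i @ u_i + eps)        # normalized update
    return soft_threshold(w, thresh)                      # sparsity-aware prox
```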
【3】 Learned Image Coding for Machines: A Content-Adaptive Approach Link: https://arxiv.org/abs/2108.09992
Authors: Nam Le, Honglei Zhang, Francesco Cricri, Ramin Ghaznavi-Youvalari, Hamed Rezazadegan Tavakoli, Esa Rahtu
Affiliations: Nokia Technologies; Tampere University, Tampere, Finland
Abstract: Today, according to the Cisco Annual Internet Report (2018-2023), the fastest-growing category of Internet traffic is machine-to-machine communication. In particular, machine-to-machine communication of images and videos represents a new challenge and opens up new perspectives in the context of data compression. One possible solution approach consists of adapting current human-targeted image and video coding standards to the use case of machine consumption. Another approach consists of developing completely new compression paradigms and architectures for machine-to-machine communication. In this paper, we focus on image compression and present an inference-time content-adaptive finetuning scheme that optimizes the latent representation of an end-to-end learned image codec, aimed at improving compression efficiency for machine consumption. The experiments show that our online finetuning brings an average bitrate saving (BD-rate) of -3.66% with respect to our pretrained image codec. In particular, at low bitrate points, our proposed method yields a significant bitrate saving of -9.85%. Overall, our pretrained-and-then-finetuned system achieves a -30.54% BD-rate over the state-of-the-art image/video codec Versatile Video Coding (VVC).
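A hedged sketch of inference-time latent finetuning (`codec.encode`, `codec.decode`, `codec.rate`, and `task_loss` are stand-in names for a learned codec and a downstream machine-task network, not the paper's API):

```python
import torch

def finetune_latent(codec, task_loss, image, steps=100, lr=1e-3, lam=0.01):
    with torch.no_grad():
        y = codec.encode(image)                 # initial latent from encoder
    y = y.clone().requires_grad_(True)          # latent is the only free variable
    opt = torch.optim.Adam([y], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        rate = codec.rate(y)                    # estimated bits for y
        recon = codec.decode(y)
        loss = task_loss(recon) + lam * rate    # task-aware R-D trade-off
        loss.backward()
        opt.step()
    return y.detach()                           # finetuned latent to be coded
```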
【4】 FEDI: Few-shot learning based on Earth Mover's Distance algorithm combined with deep residual network to identify diabetic retinopathy Link: https://arxiv.org/abs/2108.09711
Authors: Liangrui Pan, Boya Ji, Peng Xi, Xiaoqi Wang, Mitchai Chongcheawchamnan, Shaoliang Peng
Affiliations: College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; Prince of Songkla University, Songkhla, Thailand
Abstract: Diabetic retinopathy (DR) is the main cause of blindness in diabetic patients, but the onset of blindness can be delayed through diagnosis of the fundus. In reality, it is difficult to collect a large amount of diabetic retina data in clinical practice. This paper proposes a few-shot learning model of a deep residual network based on the Earth Mover's Distance algorithm to assist in diagnosing DR. We build training and validation classification tasks for few-shot learning based on 39 categories with 1000 samples, train deep residual networks, and obtain an experience-maximization pre-training model. Based on the weights of the pre-trained model, the Earth Mover's Distance algorithm calculates the distance between images, obtains the similarity between images, and updates the model's parameters to improve the accuracy of the trained model. Finally, we construct a small-sample classification task on the test set to further optimize the model, and achieve an accuracy of 93.5667% on the 3-way 10-shot task of the diabetic retina test set. For the experimental code and results, please refer to: https://github.com/panliangrui/few-shot-learning-funds.
【5】 Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform Link: https://arxiv.org/abs/2108.09551
Authors: Myungseo Song, Jinyoung Choi, Bohyung Han
Affiliations: ECE & ASRI, Seoul National University, Korea
Note: ICCV 2021
Abstract: We propose a versatile deep image compression network based on the Spatial Feature Transform (SFT, arXiv:1804.02815), which takes a source image and a corresponding quality map as inputs and produces compressed images at variable rates. Our model covers a wide range of compression rates using a single model, controlled by arbitrary pixel-wise quality maps. In addition, the proposed framework allows us to perform task-aware image compression for various tasks, e.g., classification, by efficiently estimating optimized quality maps specific to target tasks for our encoding network. This is even possible with a pretrained network, without learning separate models for individual tasks. Our algorithm achieves an outstanding rate-distortion trade-off compared to approaches based on multiple models that are optimized separately for several different target rates. At the same level of compression, the proposed approach successfully improves performance on image classification and text region quality preservation via task-aware quality map estimation, without additional model training. The code is available at the project website: https://github.com/micmic123/QmapCompression
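A minimal sketch of an SFT-style conditioning block (following SFT, arXiv:1804.02815: per-pixel scale and shift derived from the quality map modulate intermediate features; channel sizes are illustrative assumptions):

```python
import torch
import torch.nn as nn

class SFTBlock(nn.Module):
    def __init__(self, feat_ch=64, cond_ch=32):
        super().__init__()
        # a small CNN maps the quality map to a conditioning representation
        self.cond = nn.Sequential(nn.Conv2d(1, cond_ch, 3, padding=1), nn.ReLU())
        self.to_gamma = nn.Conv2d(cond_ch, feat_ch, 3, padding=1)  # scale
        self.to_beta = nn.Conv2d(cond_ch, feat_ch, 3, padding=1)   # shift

    def forward(self, feat, quality_map):         # quality_map: (B, 1, H, W)
        c = self.cond(quality_map)
        return self.to_gamma(c) * feat + self.to_beta(c)

feat = torch.randn(2, 64, 32, 32)                 # codec features
qmap = torch.rand(2, 1, 32, 32)                   # per-pixel target quality
out = SFTBlock()(feat, qmap)                      # modulated features
```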
Reinforcement learning (5 papers)
【1】 Collect & Infer -- a fresh look at data-efficient Reinforcement Learning Link: https://arxiv.org/abs/2108.10273
Authors: Martin Riedmiller, Jost Tobias Springenberg, Roland Hafner, Nicolas Heess
Affiliations: DeepMind, UK
Abstract: This position paper proposes a fresh look at Reinforcement Learning (RL) from the perspective of data-efficiency. Data-efficient RL has gone through three major stages: pure online RL, where every data point is considered only once; RL with a replay buffer, where additional learning is done on a portion of the experience; and finally transition-memory-based RL, where, conceptually, all transitions are stored and re-used in every update step. While inferring knowledge from all explicitly stored experience has led to a tremendous gain in data-efficiency, the question of how this data is collected has been vastly understudied. We argue that data-efficiency can only be achieved through careful consideration of both aspects. We propose to make this insight explicit via a paradigm that we call 'Collect and Infer', which explicitly models RL as two separate but interconnected processes, concerned with data collection and knowledge inference respectively. We discuss implications of the paradigm, how its ideas are reflected in the literature, and how it can guide future research into data-efficient RL.
【2】 Distilling Neuron Spike with High Temperature in Reinforcement Learning Agents Link: https://arxiv.org/abs/2108.10078
Authors: Ling Zhang, Jian Cao, Yuan Zhang, Bohan Zhou, Shuo Feng
Affiliations: School of Software and Microelectronics, Peking University, Beijing, China
Note: 7 pages, 5 figures, conference
Abstract: Spiking neural networks (SNNs), compared with deep neural networks (DNNs), have faster processing speed, lower energy consumption, and higher biological interpretability, and are expected to approach strong AI. Reinforcement learning is similar to learning in biology, so it is of great significance to study the combination of SNNs and RL. We propose a reinforcement learning method using a spike distillation network (SDN) with STBP. The method uses distillation to effectively sidestep the weaknesses of STBP: it achieves SOTA performance in classification and yields a smaller, faster-converging, and lower-power SNN reinforcement learning model. Experiments show that our method converges faster than traditional SNN reinforcement learning and DNN reinforcement learning methods, about 1000 epochs faster, and obtains an SNN 200 times smaller than the DNN. We also deploy SDN on the PKU nc64c chip, which shows that SDN has lower power consumption than DNN; on large-scale devices, the power consumption of SDN is more than 600 times lower than that of DNN. SDN provides a new way of doing SNN reinforcement learning and can achieve SOTA performance, demonstrating the possibility of further development of SNN reinforcement learning.
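A minimal sketch of the temperature-scaled distillation loss the title points to (the exact SDN objective is assumed; this is the standard softened-teacher form, where a high temperature T spreads probability mass over non-target classes):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.7):
    # soft term: student matches the teacher's temperature-softened distribution
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)   # standard T^2 scaling
    # hard term: ordinary cross-entropy against the true labels
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard
```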
【3】 A Boosting Approach to Reinforcement Learning Link: https://arxiv.org/abs/2108.09767
Authors: Nataly Brukhim, Elad Hazan, Karan Singh
Affiliations: Princeton University; Google AI Princeton; Microsoft Research
Abstract: We study efficient algorithms for reinforcement learning in Markov decision processes whose complexity is independent of the number of states. This formulation succinctly captures large-scale problems, but is also known to be computationally hard in its general form. Previous approaches attempt to circumvent the computational hardness by assuming structure in either the transition function or the value function, or by relaxing the solution guarantee to a local optimality condition. We consider the methodology of boosting, borrowed from supervised learning, for converting weak learners into an accurate policy. The notion of weak learning we study is that of sample-based approximate optimization of linear functions over policies. Under this assumption of weak learnability, we give an efficient algorithm that is capable of improving the accuracy of such weak learning methods until global optimality is reached. We prove sample complexity and running time bounds on our method that are polynomial in the natural parameters of the problem: approximation guarantee, discount factor, distribution mismatch, and number of actions. In particular, our bound does not depend on the number of states. A technical difficulty in applying previous boosting results is that the value function over policy space is not convex. We show how to use a non-convex variant of the Frank-Wolfe method, coupled with recent advances in gradient boosting that allow incorporating a weak learner with a multiplicative approximation guarantee, to overcome the non-convexity and attain global convergence.
【4】 MimicBot: Combining Imitation and Reinforcement Learning to win in Bot Bowl Link: https://arxiv.org/abs/2108.09478
Authors: Nicola Pezzotti
Affiliations: AI, Data Science and Digital Twin Department, Philips Research, Eindhoven, The Netherlands; Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, The Netherlands
Abstract: This paper describes a hybrid agent trained to play in Fantasy Football AI that participated in the Bot Bowl III competition. The agent, MimicBot, is implemented using a specifically designed deep policy network and trained using a combination of imitation and reinforcement learning. Previous attempts to use a reinforcement learning approach in this context failed for a number of reasons, e.g., due to the intrinsic randomness of the environment and the large and uneven number of actions available, with a curriculum learning approach failing to consistently beat a randomly playing agent. Currently no machine learning approach can beat a scripted bot that makes use of domain knowledge of the game. Our solution, thanks to imitation learning and a hybrid decision-making process, consistently beats such scripted agents. Moreover, we shed light on how to train more efficiently in a reinforcement learning setting while drastically increasing sample efficiency. MimicBot is the winner of the Bot Bowl III competition, and it is currently the state-of-the-art solution.
【5】 Cooperative Localization Utilizing Reinforcement Learning for 5G Networks Link: https://arxiv.org/abs/2108.10222
Authors: Ghazaleh Kia, Laura Ruotsalainen
Affiliations: Dept. of Computer Science, University of Helsinki, Helsinki, Finland
Note: 2 pages, 1 figure, presented as a poster at the Second 6G Wireless Summit 2020
Abstract: The demand for accurate localization has risen in recent years to enable the emergence of autonomous vehicles. For these vehicles to enter the traffic ecosystem of smart cities, an accurate positioning system is needed. To achieve accurate positioning, collaborative localization plays an important role. This type of localization computes range measurements between vehicles and improves position accuracy by correcting the possibly faulty values of one vehicle using the more accurate values of another. 5G signals with Millimeter Wave (mmWave) technology support precise range measurements, and 5G networks provide Device to Device (D2D) communication, which improves collaborative localization. The aim of this paper is to provide accurate collaborative positioning for autonomous vehicles that is less prone to errors, utilizing a reinforcement learning technique to select the most accurate and suitable range measurement technique for the 5G signal.
Meta-learning (1 paper)
【1】 Fairness-Aware Online Meta-learning Link: https://arxiv.org/abs/2108.09435
Authors: Chen Zhao, Feng Chen, Bhavani Thuraisingham
Affiliations: The University of Texas at Dallas, Richardson, Texas, USA
Note: KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
Abstract: In contrast to offline working fashions, two research paradigms have been devised for online learning: (1) Online Meta-Learning (OML), which learns good priors over model parameters (or learns to learn) in a sequential setting where tasks are revealed one after another. Although it provides a sub-linear regret bound, such techniques completely ignore the importance of learning with fairness, which is a significant hallmark of human intelligence. (2) Online Fairness-Aware Learning. This setting captures many classification problems for which fairness is a concern. But it aims to attain zero-shot generalization without any task-specific adaptation. This therefore limits the capability of a model to adapt to newly arrived data. To overcome these issues and bridge the gap, in this paper we propose, for the first time, a novel online meta-learning algorithm, namely FFML, under the setting of unfairness prevention. The key part of FFML is to learn good priors for an online fair classification model's primal and dual parameters, which are associated with the model's accuracy and fairness, respectively. The problem is formulated in the form of a bi-level convex-concave optimization. Theoretical analysis provides sub-linear upper bounds for loss regret and for violation of cumulative fairness constraints. Our experiments demonstrate the versatility of FFML by applying it to classification on three real-world datasets and show substantial improvements over the best prior work on the tradeoff between fairness and classification accuracy.
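A rough, simplified single-level sketch of a primal-dual step of this kind (FFML's actual update is bi-level and task-sequential; `fairness_violation` is a stand-in for a differentiable constraint function, not the paper's API):

```python
import torch

def primal_dual_step(model, dual, batch, loss_fn, fairness_violation,
                     lr=1e-2, dual_lr=1e-2):
    x, y, group = batch
    # primal: descend task loss plus dual-weighted fairness violation
    loss = loss_fn(model(x), y) + dual * fairness_violation(model, x, y, group)
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            p -= lr * p.grad
        # dual: ascend on the re-evaluated constraint, projected to be >= 0
        g = fairness_violation(model, x, y, group)
        dual = torch.clamp(dual + dual_lr * g, min=0.0)
    return dual

dual = torch.tensor(0.0)   # initial dual variable, updated each round
```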
医学相关(9篇)
【1】 Spatio-Temporal Split Learning for Privacy-Preserving Medical Platforms: Case Studies with COVID-19 CT, X-Ray, and Cholesterol Data 标题:隐私保护医学平台的时空分裂学习:冠状病毒CT、X射线和胆固醇数据的案例研究 链接:https://arxiv.org/abs/2108.10147
作者:Yoo Jeong Ha,Minjae Yoo,Gusang Lee,Soyi Jung,Sae Won Choi,Joongheon Kim,Seehwan Yoo 机构:This work was supported by the Ministry of Health and Welfare (MHW), Korea (HI,C,). 摘要:机器学习需要大量的样本数据,尤其是在高精度医学应用中。然而,患者记录是最敏感的私人信息之一,通常不会在研究所之间共享。本文介绍了时空分割学习,一种分布式深度神经网络框架,它是允许隐私敏感组织之间协作的一个转折点。我们的时空分割学习展示了分布式机器学习如何在最小隐私关注的情况下高效地进行。提出的分割学习由多个客户端和一个集中式服务器组成。每个客户端只有一个隐藏层,作为隐私保护层,集中式服务器包括其他隐藏层和输出层。由于集中式服务器不需要访问训练数据,并且使用从隐私保护层接收的参数训练深度神经网络,因此保证了原始数据的隐私性。我们创造了时空分割学习这一术语,因为多个客户端在空间上分布以覆盖来自不同参与者的不同数据集,我们可以在时间上分割学习过程,将隐私保护层与学习过程的其余部分分离,以最大限度地减少隐私泄露。本文展示了如何使用我们提出的多站点时空分割学习算法,对新冠肺炎(COVID-19)胸部计算机断层扫描(CT)图像、肌肉骨骼X线片(MURA)图像和胆固醇水平进行分析,同时确保隐私。 摘要:Machine learning requires a large volume of sample data, especially when it is used in high-accuracy medical applications. However, patient records are one of the most sensitive private information that is not usually shared among institutes. This paper presents spatio-temporal split learning, a distributed deep neural network framework, which is a turning point in allowing collaboration among privacy-sensitive organizations. Our spatio-temporal split learning presents how distributed machine learning can be efficiently conducted with minimal privacy concerns. The proposed split learning consists of a number of clients and a centralized server. Each client has only one hidden layer, which acts as the privacy-preserving layer, and the centralized server comprises the other hidden layers and the output layer. Since the centralized server does not need to access the training data and trains the deep neural network with parameters received from the privacy-preserving layer, privacy of original data is guaranteed. We have coined the term, spatio-temporal split learning, as multiple clients are spatially distributed to cover diverse datasets from different participants, and we can temporally split the learning process, detaching the privacy preserving layer from the rest of the learning process to minimize privacy breaches. This paper shows how we can analyze the medical data whilst ensuring privacy using our proposed multi-site spatio-temporal split learning algorithm on Coronavirus Disease-19 (COVID-19) chest Computed Tomography (CT) scans, MUsculoskeletal RAdiographs (MURA) X-ray images, and cholesterol levels.
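下面给出上述分割学习结构的一个最小示意(假设性草图,并非论文原始实现:模型维度、优化器与任务均为示例):客户端仅持有充当隐私保护层的一个隐藏层,服务器持有其余隐藏层与输出层;训练时客户端只向服务器发送该层的激活值,原始数据不出本地,梯度再经激活值回传到客户端。

```python
import torch
import torch.nn as nn

# 客户端:仅一个隐藏层,充当隐私保护层(维度为示例假设)
class Client(nn.Module):
    def __init__(self, in_dim=784, hidden_dim=128):
        super().__init__()
        self.privacy_layer = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())

    def forward(self, x):
        return self.privacy_layer(x)

# 服务器:其余隐藏层与输出层,只接收客户端传来的激活值
class Server(nn.Module):
    def __init__(self, hidden_dim=128, num_classes=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(hidden_dim, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, h):
        return self.body(h)

client, server = Client(), Server()
opt = torch.optim.Adam(list(client.parameters()) + list(server.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# 一次联合训练步:原始数据 x 不离开客户端,服务器只见到激活值 h
x, y = torch.randn(32, 784), torch.randint(0, 2, (32,))
h = client(x)              # 客户端前向(隐私保护层)
logits = server(h)         # 服务器完成其余前向
loss = loss_fn(logits, y)
opt.zero_grad()
loss.backward()            # 梯度经由 h 回传到客户端
opt.step()
```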
【2】 Remote Sensing and Machine Learning for Food Crop Production Data in Africa Post-COVID-19 标题:冠状病毒后非洲粮食作物生产数据的遥感和机器学习 链接:https://arxiv.org/abs/2108.10054
作者:Racine Ly,Khadim Dia,Mariam Diallo 机构:AKADEMIYA, Kigali, Rwanda 备注:This chapter has been submitted to the Annual Trends and Outlook Report (ATOR, 2021). The ATOR is a flagship report of the Regional Strategic Analysis and Knowledge Support System (ReSAKSS) program at AKADEMIYA2063. The chapter has 22 pages, 14 images, 9 tables, and 36 references 摘要:在农业部门,COVID-19可能在该地区引发严重的粮食安全危机:粮食供应链中断,农业生产预计将收缩2.6%至7%。从粮食作物生产方面看,旅行禁令和边境关闭,以及进口种子、化肥和农药等农业投入品的延迟接收和使用,可能导致粮食作物生产表现不佳。流动限制措施带来的另一层破坏是农业工人(主要是季节性工人)的短缺。封锁措施和边境封锁限制了季节性工人准时到达农场进行种植和收割活动。此外,大多数进口的农业投入品都是空运的,这一流行病对空运造成了严重影响,这种运输中断也会对粮食作物生产系统产生负面影响。本章评估了2020年(收获期之前)所有非洲地区以及玉米、木薯、水稻和小麦四种主粮的粮食作物产量水平。利用从卫星图像检索的生物地球物理遥感数据与机器学习人工神经网络(ANN)技术相结合来预测生产水平:以遥感产品为输入变量,人工神经网络为预测建模框架。输入的遥感产品包括归一化植被指数(NDVI)、白天地表温度(LST)、降雨数据和农田蒸散量(ET)。输出地图和数据在基于网络的平台AAgWa(非洲农业观察,www.aagwa.org)上公开,以便于决策者和其他利益相关者获取此类信息。 摘要:In the agricultural sector, the COVID-19 threatens to lead to a severe food security crisis in the region, with disruptions in the food supply chain and agricultural production expected to contract between 2.6% and 7%. From the food crop production side, the travel bans and border closures, the late reception and the use of agricultural inputs such as imported seeds, fertilizers, and pesticides could lead to poor food crop production performances. Another layer of disruption introduced by the mobility restriction measures is the scarcity of agricultural workers, mainly seasonal workers. The lockdown measures and border closures limit seasonal workers' availability to get to the farm on time for planting and harvesting activities. Moreover, most of the imported agricultural inputs travel by air, which the pandemic has heavily impacted. Such transportation disruptions can also negatively affect the food crop production system. This chapter assesses food crop production levels in 2020 -- before the harvesting period -- in all African regions and four staples such as maize, cassava, rice, and wheat. The production levels are predicted using the combination of biogeophysical remote sensing data retrieved from satellite images and machine learning artificial neural networks (ANNs) technique. The remote sensing products are used as input variables and the ANNs as the predictive modeling framework. The input remote sensing products are the Normalized Difference Vegetation Index (NDVI), the daytime Land Surface Temperature (LST), rainfall data, and agricultural lands' Evapotranspiration (ET). The output maps and data are made publicly available on a web-based platform, AAgWa (Africa Agriculture Watch, www.aagwa.org), to facilitate access to such information to policymakers, deciders, and other stakeholders.
【3】 Integrating LSTMs and GNNs for COVID-19 Forecasting 标题:集成LSTMS和GNNs进行冠状病毒预测 链接:https://arxiv.org/abs/2108.10052
作者:Nathan Sesti,Juan Jose Garau-Luis,Edward Crawley,Bruce Cameron 机构:Massachusetts Institute of Technology 摘要:COVID-19的传播恰逢图神经网络(GNN)的兴起,一些研究因此提议用GNN来更好地预测疫情的演变。许多这样的模型还包括长短时记忆(LSTM)网络,这是时间序列预测的常用工具。在这项工作中,我们通过在LSTM门内实现GNN并利用空间信息,进一步研究这两种方法的集成。此外,我们还引入了一个跳跃连接,它被证明对联合捕获数据中的空间和时间模式至关重要。我们在37个欧洲国家过去472天的数据上验证了我们的COVID-19每日新增病例预测模型,并基于平均绝对缩放误差(MASE)显示出优于最先进图时间序列模型的性能。这一研究领域在政策制定方面有着重要的应用,我们分析了其在大流行资源控制方面的潜力。 摘要:The spread of COVID-19 has coincided with the rise of Graph Neural Networks (GNNs), leading to several studies proposing their use to better forecast the evolution of the pandemic. Many such models also include Long Short Term Memory (LSTM) networks, a common tool for time series forecasting. In this work, we further investigate the integration of these two methods by implementing GNNs within the gates of an LSTM and exploiting spatial information. In addition, we introduce a skip connection which proves critical to jointly capture the spatial and temporal patterns in the data. We validate our daily COVID-19 new cases forecast model on data of 37 European nations for the last 472 days and show superior performance compared to state-of-the-art graph time series models based on mean absolute scaled error (MASE). This area of research has important applications to policy-making and we analyze its potential for pandemic resource control.
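下面用一个最小示意说明"在LSTM门内实现GNN"的思想(假设性草图,并非论文原始模型:此处用行归一化邻接矩阵乘法近似图卷积,层数、维度与跳跃连接的具体形式均为示例):每个节点(国家)的门控在计算时同时聚合其邻居的信息,另用一条跳跃连接把最近一步的输入直接拼入预测头。

```python
import torch
import torch.nn as nn

class GraphLSTMCell(nn.Module):
    """LSTM 单元,其门控变换中加入一次图传播(A_hat @ X @ W)。"""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, 4 * hid_dim)    # 输入侧的图卷积权重
        self.U = nn.Linear(hid_dim, 4 * hid_dim)   # 隐状态侧的图卷积权重

    def forward(self, x, h, c, A_hat):
        # x: [N, in_dim], h/c: [N, hid_dim], A_hat: 归一化邻接矩阵 [N, N]
        gates = A_hat @ self.W(x) + A_hat @ self.U(h)   # 门内的邻居聚合
        i, f, g, o = gates.chunk(4, dim=-1)
        i, f, o = i.sigmoid(), f.sigmoid(), o.sigmoid()
        c = f * c + i * g.tanh()
        h = o * c.tanh()
        return h, c

# 玩具示例:37 个节点(国家),每个节点一个标量输入(当日新增病例)
N, T, hid = 37, 10, 16
A = torch.rand(N, N).round()                        # 随机 0/1 邻接(示例)
A_hat = A / A.sum(-1, keepdim=True).clamp(min=1)    # 简单的行归一化
cell = GraphLSTMCell(1, hid)
head = nn.Linear(hid + 1, 1)                        # 跳跃连接:拼接最后一步输入
h = c = torch.zeros(N, hid)
xs = torch.randn(T, N, 1)
for t in range(T):
    h, c = cell(xs[t], h, c, A_hat)
pred = head(torch.cat([h, xs[-1]], dim=-1))         # 各节点下一日新增病例预测
```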
【4】 Subject Envelope based Multitype Reconstruction Algorithm of Speech Samples of Parkinson's Disease 标题:基于主题包络的帕金森病语音样本多类型重建算法 链接:https://arxiv.org/abs/2108.09922
作者:Yongming Li,Chengyu Liu,Pin Wang,Hehua Zhang,Anhai Wei 机构:School of Microcommunication Engineering, Chongqing University, Chongqing, P.R. China; Department of Medical Engineering, Daping Hospital, Army Medical University (Third Military Medical University), Chongqing, China 备注:11 pages, 6 tables 摘要:帕金森病(Parkinson's disease, PD)的危害非常严重,PD语音识别是目前一种有效的诊断方法。然而,由于疾病分期、语料库和其他因素对数据收集的影响,同一受试者内每个样本反映PD状态的能力各不相同。没有样本是完全无用的,也没有样本是100%完美的。这一特点意味着不适合仅仅移除或保留部分样本;为了获得高质量的新样本,需要考虑样本变换。遗憾的是,现有的PD语音识别方法主要集中在特征学习和分类器设计上,而不是样本学习,很少有方法考虑样本变换。针对上述问题,本文提出了一种基于多类型重建算子的PD语音样本变换算法。该算法分为四个主要步骤,并设计了三种类型的重建算子:A型、B型和C型。对于A型算子,通过设计线性变换直接重构原始数据集以获得第一个数据集。B型算子用于对数据集进行聚类和线性变换,以获得第二个新数据集。第三种算子,即C型算子,通过聚类和卷积重构数据集,得到第三个数据集。最后,基于这三个新的数据集训练基分类器,然后对分类结果进行决策加权融合。在实验部分,使用了两个具有代表性的PD语音数据集进行验证。结果表明,该算法是有效的;与其他算法相比,该算法在分类精度上有明显的提高。 摘要:The risk of Parkinson's disease (PD) is extremely serious, and PD speech recognition is an effective method of diagnosis nowadays. However, due to the influence of the disease stage, corpus, and other factors on data collection, the ability of each sample within one subject to reflect the status of PD varies. No sample is totally useless, and no sample is 100% perfect. This characteristic means that it is not suitable simply to remove or keep some samples. It is necessary to consider the sample transformation for obtaining high quality new samples. Unfortunately, existing PD speech recognition methods focus mainly on feature learning and classifier design rather than sample learning, and few methods consider the sample transformation. To solve the problem above, a PD speech sample transformation algorithm based on multitype reconstruction operators is proposed in this paper. The algorithm is divided into four major steps. Three types of reconstruction operators are designed in the algorithm: types A, B and C. Concerning the type A operator, the original dataset is directly reconstructed by designing a linear transformation to obtain the first dataset. The type B operator is designed for clustering and linear transformation of the dataset to obtain the second new dataset. The third operator, namely, the type C operator, reconstructs the dataset by clustering and convolution to obtain the third dataset. Finally, the base classifier is trained based on the three new datasets, and then the classification results are fused by decision weighting. In the experimental section, two representative PD speech datasets are used for verification. The results show that the proposed algorithm is effective. Compared with other algorithms, the proposed algorithm achieves apparent improvements in terms of classification accuracy.
【5】 A Multi-Task Learning Framework for COVID-19 Monitoring and Prediction of PPE Demand in Community Health Centres 标题:用于社区卫生中心个人防护用品需求监测和预测的多任务学习框架 链接:https://arxiv.org/abs/2108.09402
作者:Bonaventure Chidube Molokwu,Shaon Bhatta Shuvo,Ziad Kobti,Anne Snowdon 机构:School of Computer Science, University of Windsor, Windsor, Ontario, Canada; SCAN in Health 备注:6-page article/manuscript 摘要:目前,全世界都在寻求合适的缓解技术来控制和预防新型SARS-CoV-2的传播。在本文中,我们提出了一个独特的多任务学习框架,该框架针对特定人群,联合预测SARS-CoV-2的影响以及社区卫生中心的个人防护装备消耗。通过研究和分析来预测病毒(SARS-CoV-2)的影响,使我们能够了解SARS-CoV-2的性质以及促进其生长和传播的因素,从而有助于提高广泛的认识;民众可以变得更加主动和谨慎,以缓解2019年冠状病毒病(COVID-19)的传播。此外,了解和预测个人防护装备的需求可以提高社区卫生中心医护人员的效率和安全性。由于SARS-CoV-2及其毒株的新颖性,这方面的文献和研究相对较少。现有文献试图使用基于代理的模型、机器学习模型或数学模型来解决问题陈述。有鉴于此,我们的工作通过将问题陈述建模为多任务学习问题来补充现有文献。我们的研究结果表明,政府行为和人为因素是影响SARS-CoV-2传播的最重要决定因素。 摘要:Currently, the world seeks to find appropriate mitigation techniques to control and prevent the spread of the new SARS-CoV-2. In our paper herein, we present a peculiar Multi-Task Learning framework that jointly predicts the effect of SARS-CoV-2 as well as Personal-Protective-Equipment consumption in Community Health Centres for a given populace. Predicting the effect of the virus (SARS-CoV-2), via studies and analyses, enables us to understand the nature of SARS-CoV-2 with reference to factors that promote its growth and spread. Therefore, these foster widespread awareness; and the populace can become more proactive and cautious so as to mitigate the spread of Corona Virus Disease 2019 (COVID-19). Furthermore, understanding and predicting the demand for Personal Protective Equipment promotes the efficiency and safety of healthcare workers in Community Health Centres. Owing to the novel nature and strains of SARS-CoV-2, relatively little literature and research exists in this regard. The existing literature has attempted to solve the problem statement(s) using either Agent-based Models, Machine Learning Models, or Mathematical Models. In view of this, our work herein adds to existing literature via modeling our problem statements as Multi-Task Learning problems. Results from our research indicate that government actions and human factors are the most significant determinants that influence the spread of SARS-CoV-2.
【6】 ECG-Based Heart Arrhythmia Diagnosis Through Attentional Convolutional Neural Networks 标题:基于注意力卷积神经网络的心电图心律失常诊断 链接:https://arxiv.org/abs/2108.10226
作者:Ziyu Liu,Xiang Zhang 机构:University of New South Wales, Sydney, Australia, Harvard Medical School, Harvard University, Boston, USA 备注:7 pages, in review of an international conference 摘要:心电图(ECG)信号是一种广泛用于评估个体心脏状况的测量手段,人们长期致力于基于机器学习的心律失常自动诊断。然而,传统的机器学习模型在对原始数据进行预处理和特征提取时需要投入大量的时间和精力,而且分类性能较差。在此,我们提出了一种新的深度学习模型,称为基于注意力的卷积神经网络(ABCNN),它利用CNN和多头注意力直接处理原始心电信号,并自动提取信息性依赖关系,以实现准确的心律失常检测。为了评估所提出的方法,我们在一个基准ECG数据集上进行了大量实验。我们的主要任务是从正常心跳中发现心律失常,同时从五种心律失常类型中准确识别心脏疾病。我们还提供了ABCNN的收敛性分析,并通过可视化直观地展示了所提取表示的意义。实验结果表明,所提出的ABCNN优于广泛使用的基线方法,使我们离智能心脏病诊断系统更近了一步。 摘要:The electrocardiography (ECG) signal is a widely applied measurement of individual heart condition, and much effort has been devoted to automatic heart arrhythmia diagnosis based on machine learning. However, traditional machine learning models require a large investment of time and effort for raw data preprocessing and feature extraction, and are also challenged by poor classification performance. Here, we propose a novel deep learning model, named Attention-Based Convolutional Neural Network (ABCNN), which takes advantage of CNNs and multi-head attention to work directly on raw ECG signals and automatically extract the informative dependencies for accurate arrhythmia detection. To evaluate the proposed approach, we conduct extensive experiments over a benchmark ECG dataset. Our main task is to find the arrhythmia from normal heartbeats and, at the same time, accurately recognize the heart diseases from five arrhythmia types. We also provide convergence analysis of ABCNN and intuitively show the meaningfulness of extracted representation through visualization. The experimental results show that the proposed ABCNN outperforms the widely used baselines, which brings us one step closer to an intelligent heart disease diagnosis system.
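下面给出"CNN与多头注意力直接处理原始ECG"这一思路的一个最小示意(假设性草图,并非ABCNN的原始结构:卷积核、头数、类别数等均为示例,其中类别数6对应正常心跳加五种心律失常类型):先用一维卷积提取局部波形特征,再用多头自注意力建模心拍间的长程依赖,最后池化分类。

```python
import torch
import torch.nn as nn

class AttnECG(nn.Module):
    def __init__(self, num_classes=6, d_model=64, n_heads=4):
        super().__init__()
        # 一维卷积:从原始单导联 ECG 序列提取局部特征
        self.conv = nn.Sequential(
            nn.Conv1d(1, d_model, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=5, stride=2, padding=2), nn.ReLU(),
        )
        # 多头自注意力:捕获卷积特征序列上的长程依赖
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.fc = nn.Linear(d_model, num_classes)

    def forward(self, x):                     # x: [B, 1, L] 原始 ECG 片段
        z = self.conv(x).transpose(1, 2)      # [B, L', d_model]
        z, _ = self.attn(z, z, z)             # 自注意力
        return self.fc(z.mean(dim=1))         # 时间维平均池化后分类

model = AttnECG()
logits = model(torch.randn(8, 1, 360))        # 8 个长度为 360 的心拍片段
```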
【7】 Pediatric Automatic Sleep Staging: A comparative study of state-of-the-art deep learning methods 标题:儿科自动睡眠分期:最新深度学习方法的比较研究 链接:https://arxiv.org/abs/2108.10211
作者:Huy Phan,Alfred Mertins,Mathias Baumert 机构:Phan is with the School of Electronic Engineering and Computer Science, Queen Mary University of London, UK and the Alan Turing Institute; Mertins is with the Institute for Signal Processing, University of Lübeck 备注:10 pages, 7 figures 摘要:尽管最近在成人自动睡眠分期方面取得了巨大进展,但目前尚不清楚最先进的算法能否推广到在夜间多导睡眠图(PSG)中表现出独特特征的儿童人群。为了回答这个问题,在这项工作中,我们对用于儿童自动睡眠分期的最先进深度学习方法进行了大规模的对比研究。我们选择了六种特征各异的深度神经网络,在涵盖各种阻塞性睡眠呼吸暂停(OSA)严重程度的1200多名儿童样本上进行评估。我们的实验结果表明,在新受试者上评估时,自动儿童睡眠分期的表现与成人研究中报告的专家水平相当:在单通道脑电图的情况下,总体准确率为87.0%,Cohen's kappa为0.829,宏观F1评分为83.5%。当使用双通道EEG·EOG时,性能进一步提高,准确率达到88.2%,Cohen's kappa为0.844,宏观F1评分为85.1%。结果还表明,当训练和测试数据的记录间隔7个月时,所研究的算法对概念漂移具有鲁棒性。详细的分析进一步证明了各自动评分器之间"几乎完美"的一致性,以及它们在分期错误上相似的行为模式。 摘要:Despite the tremendous progress recently made towards automatic sleep staging in adults, it is currently unknown whether the most advanced algorithms generalize to the pediatric population, which displays distinctive characteristics in overnight polysomnography (PSG). To answer the question, in this work, we conduct a large-scale comparative study on the state-of-the-art deep learning methods for pediatric automatic sleep staging. A selection of six different deep neural networks with diverging features are adopted to evaluate a sample of more than 1,200 children across a wide spectrum of obstructive sleep apnea (OSA) severity. Our experimental results show that the performance of automated pediatric sleep staging when evaluated on new subjects is equivalent to the expert-level one reported on adults, reaching an overall accuracy of 87.0%, a Cohen's kappa of 0.829, and a macro F1-score of 83.5% in case of single-channel EEG. The performance is further improved when dual-channel EEG·EOG is used, reaching an accuracy of 88.2%, a Cohen's kappa of 0.844, and a macro F1-score of 85.1%. The results also show that the studied algorithms are robust to concept drift when the training and test data were recorded 7 months apart. Detailed analyses further demonstrate "almost perfect" agreement between the automatic scorers to one another and their similar behavioral patterns on the staging errors.
【8】 Modeling COVID-19 uncertainties evolving over time and density-dependent social reinforcement and asymptomatic infections 标题:对随时间演变的COVID-19不确定性以及密度依赖的社会强化与无症状感染进行建模 链接:https://arxiv.org/abs/2108.10029
作者:Qing Liu,Longbing Cao 机构:University of Technology Sydney, Advanced Analytics Institute, Sydney, Australia 摘要:新型冠状病毒肺炎(COVID-19)呈现出独特而未知的问题复杂性和建模挑战,其中一项紧迫任务是对其过程和数据的不确定性进行建模,这些不确定性表现为隐性且比例很高的未记录感染、无症状传染、感染的社会强化,以及报告数据中的各种质量问题。在已接种疫苗但仍易感的人群中以变异株为主的大规模疫情复发中,这些不确定性变得更加显著。本文引入一种新的混合方法:(1)通过为基础的仓室流行病模型"易感-感染-康复"(SIR)增加两个仓室,对COVID-19潜伏期和无症状感染期间常见的未记录(U)感染与已记录(D)感染进行表征和区分,得到新的"易感-未记录感染-已记录感染-康复"(SUDR)模型;(2)通过使SUDR能够刻画聚集性传染交互、超级传播和社会强化等外生过程,来表征感染的概率密度;(3)将贝叶斯推断引入SUDR,以逼近COVID-19流行程度随时间变化的密度似然。与现有COVID-19模型不同,SUDR刻画了未知传播过程中的未记录感染。为捕捉COVID-19传染期间时间传播和社会强化的不确定性,传播率由未记录传染病例的时变密度函数建模。我们通过在合理先验下从平均场后验分布中采样来求解该模型,使SUDR适合处理公共COVID-19病例数据中普遍存在的观测随机性、噪声和稀疏性。 摘要:The novel coronavirus disease 2019 (COVID-19) presents unique and unknown problem complexities and modeling challenges, where an imperative task is to model both its process and data uncertainties, represented in implicit and high-proportional undocumented infections, asymptomatic contagion, social reinforcement of infections, and various quality issues in the reported data. These uncertainties become even more phenomenal in the overwhelming mutation-dominated resurgences with vaccinated but still susceptible populations. Here we introduce a novel hybrid approach to (1) characterizing and distinguishing Undocumented (U) and Documented (D) infections commonly seen during COVID-19 incubation periods and asymptomatic infections by expanding the foundational compartmental epidemic Susceptible-Infected-Recovered (SIR) model with two compartments, resulting in a new Susceptible-Undocumented infected-Documented infected-Recovered (SUDR) model; (2) characterizing the probabilistic density of infections by empowering SUDR to capture exogenous processes like clustering contagion interactions, superspreading and social reinforcement; and (3) approximating the density likelihood of COVID-19 prevalence over time by incorporating Bayesian inference into SUDR. Different from existing COVID-19 models, SUDR characterizes the undocumented infections during unknown transmission processes. To capture the uncertainties of temporal transmission and social reinforcement during the COVID-19 contagion, the transmission rate is modeled by a time-varying density function of undocumented infectious cases. We solve the modeling by sampling from the mean-field posterior distribution with reasonable priors, making SUDR suitable to handle the randomness, noise and sparsity of COVID-19 observations widely seen in the public COVID-19 case data.
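作为参考,下面给出与摘要所述仓室结构一致的SUDR模型的一个简化确定性版本(假设性草图:论文中传播率是未记录传染病例的时变密度函数并通过贝叶斯推断估计,此处将其取为常数,各转移项与参数取值均为示例):传染由未记录感染U驱动,U以一定速率被检测后转入D。

```python
import numpy as np
from scipy.integrate import solve_ivp

# SUDR 的简化确定性版本(示例假设:beta 为常数;delta 为检测/记录速率,
# gamma_u 与 gamma_d 分别为未记录与已记录感染的康复速率)
def sudr(t, y, beta, delta, gamma_u, gamma_d, N):
    S, U, D, R = y
    dS = -beta * S * U / N            # 易感者被未记录感染者传染
    dU = beta * S * U / N - delta * U - gamma_u * U
    dD = delta * U - gamma_d * D      # 未记录感染被检测后转为已记录
    dR = gamma_u * U + gamma_d * D    # 两类感染者康复
    return [dS, dU, dD, dR]

N = 1e6
y0 = [N - 100, 100, 0, 0]             # 初始:100 例未记录感染
sol = solve_ivp(sudr, (0, 180), y0, t_eval=np.linspace(0, 180, 181),
                args=(0.35, 0.15, 0.05, 0.1, N))
S, U, D, R = sol.y                    # 各仓室随时间的轨迹
```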
【9】 APObind: A Dataset of Ligand Unbound Protein Conformations for Machine Learning Applications in De Novo Drug Design 标题:APObind:用于机器学习的配体非结合蛋白构象数据集在de Novo药物设计中的应用 链接:https://arxiv.org/abs/2108.09926
作者:Rishal Aggarwal,Akash Gupta,U Deva Priyakumar 备注:The 2021 ICML Workshop on Computational Biology 摘要:蛋白质-配体复合物结构已被用于设计基准机器学习方法,执行与药物设计相关的重要任务,如受体结合位点检测、小分子对接和结合亲和力预测。然而,这些方法通常只针对蛋白质的配体结合构象(holo)进行训练,因此,当蛋白质结构处于其天然未结合构象(apo)时,不能保证表现良好,而后者通常是新鉴定受体可用的构象。其主要原因是结合位点的局部结构通常随配体结合而改变。为了促进该问题的解决,我们提出了一个名为APObind的数据集,旨在提供PDBbind数据集中蛋白质的apo构象;PDBbind是药物设计中常用的数据集。此外,我们还在该数据集上考察了针对三个用例的方法的性能,以此证明在APObind数据集上验证这些方法的重要性。 摘要:Protein-ligand complex structures have been utilised to design benchmark machine learning methods that perform important tasks related to drug design such as receptor binding site detection, small molecule docking and binding affinity prediction. However, these methods are usually trained on only ligand bound (or holo) conformations of the protein and therefore are not guaranteed to perform well when the protein structure is in its native unbound conformation (or apo), which is usually the conformation available for a newly identified receptor. A primary reason for this is that the local structure of the binding site usually changes upon ligand binding. To facilitate solutions for this problem, we propose a dataset called APObind that aims to provide apo conformations of proteins present in the PDBbind dataset, a popular dataset used in drug design. Furthermore, we explore the performance of methods specific to three use cases on this dataset, through which, the importance of validating them on the APObind dataset is demonstrated.
推荐(1篇)
【1】 Data Augmentation Using Many-To-Many RNNs for Session-Aware Recommender Systems 标题:基于多对多RNN的会话感知推荐系统数据增强 链接:https://arxiv.org/abs/2108.09858
作者:Martín Baigorria Alonso 备注:None 摘要:Booking.com组织的ACM WSDM WebTour 2021挑战集中于在旅游领域应用会话感知推荐系统。给定用户旅行中的一系列旅行预订,我们要推荐用户的下一个目的地。为了处理输出空间的高维性,我们提出了一个多对多RNN模型,在每个序列步都预测用户选择的下一个目的地,而不是只预测最后一个目的地。我们展示了这等价于在多对一RNN中把会话的每个从首元素开始的子序列都用作训练样本的数据增强,但在计算上更高效。我们的解决方案以0.5566的accuracy@4在最终排行榜中排名第四。 摘要:The ACM WSDM WebTour 2021 Challenge organized by Booking.com focuses on applying Session-Aware recommender systems in the travel domain. Given a sequence of travel bookings in a user trip, we look to recommend the user's next destination. To handle the large dimensionality of the output's space, we propose a many-to-many RNN model, predicting the next destination chosen by the user at every sequence step as opposed to only the final one. We show how this is a computationally efficient alternative to doing data augmentation in a many-to-one RNN, where we consider every subsequence of a session starting from the first element. Our solution achieved 4th place in the final leaderboard, with an accuracy@4 of 0.5566.
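下面是"多对多RNN在每个序列步预测下一个目的地"的一个最小示意(假设性草图,城市数、维度等均为示例):对每个时间步都计算下一城市的交叉熵损失,其效果相当于对多对一RNN做"所有从首元素开始的子序列"的数据增强,但只需一次前向计算。

```python
import torch
import torch.nn as nn

num_cities, emb, hid = 10000, 64, 128
model = nn.ModuleDict({
    "emb": nn.Embedding(num_cities, emb),
    "rnn": nn.GRU(emb, hid, batch_first=True),
    "out": nn.Linear(hid, num_cities),
})
loss_fn = nn.CrossEntropyLoss()

# 一个批次的行程序列:在每一步都预测"下一个目的地"
trips = torch.randint(0, num_cities, (32, 8))       # [B, T] 城市 id
inputs, targets = trips[:, :-1], trips[:, 1:]       # 错位一个时间步
h, _ = model["rnn"](model["emb"](inputs))           # [B, T-1, hid]
logits = model["out"](h)                            # 每个时间步都有输出
loss = loss_fn(logits.reshape(-1, num_cities), targets.reshape(-1))
loss.backward()

# 推断:取最后一步 logits 的 top-4,即 accuracy@4 所评估的候选集
top4 = logits[:, -1].topk(4, dim=-1).indices
```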
聚类(3篇)
【1】 Cube Sampled K-Prototype Clustering for Featured Data 标题:面向特征数据的立方体抽样K-Prototype聚类 链接:https://arxiv.org/abs/2108.10262
作者:Seemandhar Jain,Aditya A. Shastri,Kapil Ahuja,Yann Busnel,Navneet Pratap Singh 机构:Computer Science and Engineering, Indian Institute of Technology Indore, Indore, India, Information Technology, NMIMS, Shirpur, Shirpur, India, Math of Data Science & Simulation (MODSS) Lab, Network Systems, Cybersecurity and Digital Law Department 备注:5 Pages, 2 Columns, 5 Tables, 2 Figures 摘要:在当今时代,对大量数据进行聚类变得越来越重要。由于数据量大,聚类算法往往耗时过长。在聚类之前对这些数据进行采样通常用于减少这一时间。在这项工作中,我们提出了一种称为立方体抽样的概率抽样技术,并将其与K-Prototype聚类结合使用。使用立方体抽样是因为其精确的样本选择。当数据既有数值型又有类别型(在当今非常常见)时,K-Prototype是最常用的聚类算法。这项工作的新颖之处在于使用主成分分析(PCA)获得立方体抽样所需的关键包含概率。在UCI存储库的多个数据集上的实验表明,经立方体抽样的K-Prototype算法在同样经过抽样的其他流行聚类算法(K-Means、层次聚类(HC)、谱聚类(SC))中具有最佳的聚类精度。与未采样的K-Prototype、K-Means、HC和SC相比,它仍然具有最佳的精度,同时还具有计算复杂度降低的额外优势(由于数据量减少)。 摘要:Clustering large amounts of data is becoming increasingly important in the current times. Due to the large size of the data, clustering algorithms often take too much time. Sampling this data before clustering is commonly used to reduce this time. In this work, we propose a probabilistic sampling technique called cube sampling along with K-Prototype clustering. Cube sampling is used because of its accurate sample selection. K-Prototype is the most frequently used clustering algorithm when the data is numerical as well as categorical (very common in today's time). The novelty of this work is in obtaining the crucial inclusion probabilities for cube sampling using Principal Component Analysis (PCA). Experiments on multiple datasets from the UCI repository demonstrate that the cube sampled K-Prototype algorithm gives the best clustering accuracy among similarly sampled other popular clustering algorithms (K-Means, Hierarchical Clustering (HC), Spectral Clustering (SC)). When compared with unsampled K-Prototype, K-Means, HC and SC, it still has the best accuracy with the added advantage of reduced computational complexity (due to reduced data size).
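下面示意"用PCA构造包含概率、按概率抽样后再做K-Prototype聚类"的流程(假设性草图:这里用按包含概率独立抽取的泊松抽样近似立方抽样的效果——立方抽样还会额外配平辅助变量;包含概率取第一主成分得分的归一化幅值,这一具体构造只是示例;K-Prototype来自第三方库kmodes):

```python
import numpy as np
from sklearn.decomposition import PCA
from kmodes.kprototypes import KPrototypes   # 第三方库:pip install kmodes

rng = np.random.default_rng(0)
n = 1000
num = rng.normal(size=(n, 4))                         # 数值特征
cat = rng.integers(0, 3, size=(n, 2)).astype(str)     # 类别特征

# 1) 用 PCA 第一主成分得分构造包含概率(示例构造,总和约等于期望样本量)
score = np.abs(PCA(n_components=1).fit_transform(num)).ravel()
target_size = 200
pi = np.clip(score / score.sum() * target_size, 0, 1)

# 2) 按包含概率做泊松抽样(作为立方抽样的简化近似)
keep = rng.random(n) < pi
X = np.column_stack([num[keep], cat[keep]])

# 3) 对抽样后的混合型数据做 K-Prototype 聚类(列 4、5 为类别列)
kp = KPrototypes(n_clusters=3, init="Huang", random_state=0)
labels = kp.fit_predict(X, categorical=[4, 5])
```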
【2】 Rainfall-runoff prediction using a Gustafson-Kessel clustering based Takagi-Sugeno Fuzzy model 标题:基于Gustafson-Kessel聚类的Takagi-Sugeno模糊模型在降雨径流预测中的应用 链接:https://arxiv.org/abs/2108.09684
作者:Subhrasankha Dey,Tanmoy Dam 备注:This paper is under review for IEEE SSCI 2022 摘要:降雨径流模型使用基于物理的方法或基于系统的方法预测地表径流。Takagi-Sugeno(TS)模糊模型是一种基于系统的方法,由于相比其他现有模型具有若干优势和更高的预测精度,近几十年来是水文学家的一种流行建模选择。在本文中,我们提出了一个新的降雨径流模型,该模型基于Gustafson-Kessel(GK)聚类的TS模糊模型构建。我们给出了GK算法与另外两种聚类算法的性能比较指标:(i)模糊C均值(FCM)和(ii)减法聚类(SC)。我们提出的TS模糊模型使用以下输入预测地表径流:(i)流域内观测到的降雨和(ii)流域出口先前观测到的水流。利用安装在印度理工学院卡拉格普尔分校校园内的传感器收集的降雨径流数据验证了所提出的模型,并通过不同的验证指标得到了该模型的最优规则数。对每种聚类算法定量比较了四个性能标准:均方根误差(RMSE)、效率系数(CE)、体积误差(VE)和相关确定系数(R)。 摘要:A rainfall-runoff model predicts surface runoff either using a physically-based approach or using a systems-based approach. Takagi-Sugeno (TS) Fuzzy models are systems-based approaches and a popular modeling choice for hydrologists in recent decades due to several advantages and improved accuracy in prediction over other existing models. In this paper, we propose a new rainfall-runoff model developed using a Gustafson-Kessel (GK) clustering-based TS Fuzzy model. We present comparative performance measures of GK algorithms with two other clustering algorithms: (i) Fuzzy C-Means (FCM), and (ii) Subtractive Clustering (SC). Our proposed TS Fuzzy model predicts surface runoff using: (i) observed rainfall in a drainage basin and (ii) previously observed precipitation flow in the basin outlet. The proposed model is validated using the rainfall-runoff data collected from the sensors installed on the campus of the Indian Institute of Technology, Kharagpur. The optimal number of rules of the proposed model is obtained by different validation indices. A comparative study of four performance criteria: Root Mean Square Error (RMSE), Coefficient of Efficiency (CE), Volumetric Error (VE), and Correlation Coefficient of Determination (R) has been quantitatively demonstrated for each clustering algorithm.
【3】 The Exploitation of Distance Distributions for Clustering 标题:距离分布在聚类中的应用 链接:https://arxiv.org/abs/2108.09649
作者:Michael C. Thrun 机构:Databionics Research Group, Philipps-University of Marburg, D-, Marburg, Germany 备注:None 摘要:虽然许多机器学习算法都使用了距离度量,但是关于距离度量的上下文无关选择和评估的文献在使用先验知识的意义上是有限的。在聚类分析中,当前的研究在应用基于错误概率的无监督方法后评估距离度量的选择,隐含地设定了在数据中重现预定义分区的目标。此类研究使用的数据集群通常基于数据的上下文以及特定研究的自定义目标。根据数据上下文,判断距离分布的不同属性与适当的距离选择相关。然而,如果聚类分析是基于寻找相似数据分区的任务,那么分区内距离应该小于分区间距离。通过使用镜像密度图的分布分析系统地研究该规范,表明多峰距离分布在聚类分析中更可取。因此,在无监督方法的评估阶段之前,使用高斯混合建模距离分布是有利的。在几个人工数据集和自然数据集上进行了聚类实验。 摘要:Although distance measures are used in many machine learning algorithms, the literature on the context-independent selection and evaluation of distance measures is limited in the sense that prior knowledge is used. In cluster analysis, current studies evaluate the choice of distance measure after applying unsupervised methods based on error probabilities, implicitly setting the goal of reproducing predefined partitions in data. Such studies use clusters of data that are often based on the context of the data as well as the custom goal of the specific study. Depending on the data context, different properties for distance distributions are judged to be relevant for appropriate distance selection. However, if cluster analysis is based on the task of finding similar partitions of data, then the intrapartition distances should be smaller than the interpartition distances. By systematically investigating this specification using distribution analysis through a mirrored-density plot, it is shown that multimodal distance distributions are preferable in cluster analysis. As a consequence, it is advantageous to model distance distributions with Gaussian mixtures prior to the evaluation phase of unsupervised methods. Experiments are performed on several artificial datasets and natural datasets for the task of clustering.
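下面用一个小例子示意"先用高斯混合为距离分布建模、再判断其是否多峰"的做法(假设性草图:论文使用镜像密度图做分布分析,此处以比较单分量与双分量高斯混合的BIC作为一种简化替代):簇内距离小于簇间距离时,两两距离的分布应呈多峰。

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture
from scipy.spatial.distance import pdist

# 有明显簇结构的数据:簇内距离小、簇间距离大,距离分布应呈多峰
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)
d = pdist(X).reshape(-1, 1)          # 全部两两欧氏距离

# 用 BIC 比较单峰与双峰高斯混合:BIC 更低者拟合更优
bic = {k: GaussianMixture(n_components=k, random_state=0).fit(d).bic(d)
       for k in (1, 2)}
print(bic)   # 若 bic[2] 远小于 bic[1],说明距离分布是多峰的,适合做聚类分析
```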
自动驾驶|车辆|车道检测等(1篇)
【1】 Deep Representation of Imbalanced Spatio-temporal Traffic Flow Data for Traffic Accident Detection 标题:交通事故检测中非平衡时空交通流数据的深度表示 链接:https://arxiv.org/abs/2108.09506
作者:Pouya Mehrannia,Shayan Shirahmad Gale Bagi,Behzad Moshiri,Otman Adam Al-Basir 机构:Moshiri are with the Department of Electrical andComputer Engineering, University of Tehran 摘要:交通事故的自动检测对于改善交通、公共安全和路径规划有着至关重要的作用。从事故发生到派遣救援队之间的时间缩短,可以挽救许多人的生命,通过通知驾驶员选择替代路线可以节省大量的旅行时间。这一问题之所以具有挑战性,主要是因为事故的罕见性和环境的空间异质性。本文研究了利用长短时记忆(LSTM)网络对环路检测器数据进行深度表示,用于高速公路事故的自动检测。基于LSTM的框架增加了编码特征空间中的类可分性,同时降低了数据的维数。我们对从明尼苏达州双城地铁高速公路收集的真实事故和环路检测器数据进行的实验表明,使用LSTM网络对交通流数据进行深度表示有可能在不到18分钟内检测出高速公路事故,真阳性率为0.71,假阳性率为0.25,优于同一安排中的其他竞争方法。 摘要:Automatic detection of traffic accidents has a crucial effect on improving transportation, public safety, and path planning. Many lives can be saved by the consequent decrease in the time between when the accidents occur and when rescue teams are dispatched, and much travelling time can be saved by notifying drivers to select alternative routes. This problem is challenging mainly because of the rareness of accidents and spatial heterogeneity of the environment. This paper studies deep representation of loop detector data using Long-Short Term Memory (LSTM) network for automatic detection of freeway accidents. The LSTM-based framework increases class separability in the encoded feature space while reducing the dimension of data. Our experiments on real accident and loop detector data collected from the Twin Cities Metro freeways of Minnesota demonstrate that deep representation of traffic flow data using LSTM network has the potential to detect freeway accidents in less than 18 minutes with a true positive rate of 0.71 and a false positive rate of 0.25 which outperforms other competing methods in the same arrangement.
联邦学习|隐私保护|加密(5篇)
【1】 Federated Multi-Task Learning under a Mixture of Distributions 标题:混合分布下的联合多任务学习 链接:https://arxiv.org/abs/2108.10252
作者:Othmane Marfoq,Giovanni Neglia,Aurélien Bellet,Laetitia Kameni,Richard Vidal 机构:Inria, Université Côte d’Azur, Sophia Antipolis, France, Inria, Lille, France, Accenture Labs, Sophia Antipolis, France 备注:73 pages 摘要:智能手机和物联网设备生成的数据越来越大,这推动了联合学习(FL)的发展,FL是一种机器学习模型的设备上协作训练框架。FL的最初努力集中于学习单个全局模型,该模型在客户中具有良好的平均性能,但由于本地数据分布的固有异构性,全局模型对于给定客户可能是任意不好的。联邦多任务学习(MTL)方法可以通过制定一个适当的惩罚优化问题来学习个性化模型。惩罚项可以捕捉个性化模型之间的复杂关系,但避开了关于局部数据分布的明确统计假设。在这项工作中,我们建议在灵活的假设下研究联邦MTL,即每个本地数据分布是未知底层分布的混合。这一假设涵盖了大多数现有的个性化FL方法,并为客户机-服务器和完全分散的设置带来了类似EM的联合算法。此外,它还提供了一种原则性的方法,为训练时未见到的客户提供个性化模型。通过一个新的联邦代理优化框架分析了算法的收敛性,该框架具有普遍意义。FL基准测试的实验结果表明,在大多数情况下,我们的方法提供的模型比最先进的方法具有更高的准确性和公平性。 摘要:The increasing size of data generated by smartphones and IoT devices motivated the development of Federated Learning (FL), a framework for on-device collaborative training of machine learning models. First efforts in FL focused on learning a single global model with good average performance across clients, but the global model may be arbitrarily bad for a given client, due to the inherent heterogeneity of local data distributions. Federated multi-task learning (MTL) approaches can learn personalized models by formulating an opportune penalized optimization problem. The penalization term can capture complex relations among personalized models, but eschews clear statistical assumptions about local data distributions. In this work, we propose to study federated MTL under the flexible assumption that each local data distribution is a mixture of unknown underlying distributions. This assumption encompasses most of the existing personalized FL approaches and leads to federated EM-like algorithms for both client-server and fully decentralized settings. Moreover, it provides a principled way to serve personalized models to clients not seen at training time. The algorithms' convergence is analyzed through a novel federated surrogate optimization framework, which can be of general interest. Experimental results on FL benchmarks show that in most cases our approach provides models with higher accuracy and fairness than state-of-the-art methods.
【2】 Federated Learning Meets Fairness and Differential Privacy 标题:联合学习遇到公平和差异隐私 链接:https://arxiv.org/abs/2108.09932
作者:Manisha Padala,Sankarshan Damle,Sujit Gujar 机构:Machine Learning Lab, International Institute of Information Technology (IIIT), Hyderabad 摘要:深度学习的空前成功引发了从偏见预测到数据隐私等诸多伦理问题。研究人员通过引入公平性指标、联邦学习或差分隐私来解决这些问题。本工作首次提出了一个同时包含这三类措施的合乎伦理的联邦学习模型。在Adult、Bank和Dutch数据集上的实验突出了准确性、公平性和隐私性之间由此产生的"经验性相互作用"。 摘要:Deep learning's unprecedented success raises several ethical concerns ranging from biased predictions to data privacy. Researchers tackle these issues by introducing fairness metrics, or federated learning, or differential privacy. As a first, this work presents an ethical federated learning model incorporating all three measures simultaneously. Experiments on the Adult, Bank and Dutch datasets highlight the resulting "empirical interplay" between accuracy, fairness, and privacy.
【3】 Anarchic Federated Learning 标题:无政府联邦学习 链接:https://arxiv.org/abs/2108.09875
作者:Haibo Yang,Xin Zhang,Prashant Khanduri,Jia Liu 机构:†Department of Electrical and Computer Engineering, The Ohio State University, ‡Department of Statistics, Iowa State University, ∗Department of Electrical and Computer Engineering, University of Minnesota 摘要:当今部署在边缘网络上的联邦学习(FL)系统必须持续处理大量在数据和/或计算能力方面高度异构的工作节点。这种多样化的工作节点群体要求开发的FL算法支持:(1)灵活的参与,即工作节点可以随意加入训练;(2)各工作节点可变的本地更新步数(取决于计算资源)以及与服务器的异步通信;(3)工作节点之间的异构数据。为了应对这些挑战,在这项工作中,我们提出了一种新的FL范式,称为"无政府联邦学习"(AFL)。与传统FL模型形成鲜明对比的是,AFL中的每个工作节点都有完全的自由来选择:i)何时参与FL,以及ii)根据其当前状况(例如电池电量、通信信道、隐私顾虑)在每轮中执行的本地步数。然而,AFL也给算法设计带来了重大挑战,因为服务器需要处理混乱的工作节点行为。为此,我们针对跨设备(cross-device)和跨孤岛(cross-silo)设置提出了两种具有双边学习率的无政府FedAvg类算法,分别命名为AFedAvg-TSLR-CD和AFedAvg-TSLR-CS。对于一般的工作节点信息到达过程,我们表明,在新的AFL范式中,两种算法都保持了非常理想的线性加速效果。此外,我们还表明,我们的AFedAvg-TSLR算法框架可以被视为AFL的"元算法"(meta-algorithm),因为它们可以利用先进的FL算法作为工作端和/或服务器端优化器,以在AFL下实现更好的性能。我们通过在真实数据集上的大量实验验证了所提出的算法。 摘要:Present-day federated learning (FL) systems deployed over edge networks have to consistently deal with a large number of workers with high degrees of heterogeneity in data and/or computing capabilities. This diverse set of workers necessitates the development of FL algorithms that allow: (1) flexible worker participation that grants the workers' capability to engage in training at will, (2) varying number of local updates (based on computational resources) at each worker along with asynchronous communication with the server, and (3) heterogeneous data across workers. To address these challenges, in this work, we propose a new paradigm in FL called "Anarchic Federated Learning" (AFL). In stark contrast to conventional FL models, each worker in AFL has complete freedom to choose i) when to participate in FL, and ii) the number of local steps to perform in each round based on its current situation (e.g., battery level, communication channels, privacy concerns). However, AFL also introduces significant challenges in algorithmic design because the server needs to handle the chaotic worker behaviors. Toward this end, we propose two Anarchic FedAvg-like algorithms with two-sided learning rates for both cross-device and cross-silo settings, which are named AFedAvg-TSLR-CD and AFedAvg-TSLR-CS, respectively. For general worker information arrival processes, we show that both algorithms retain the highly desirable linear speedup effect in the new AFL paradigm. Moreover, we show that our AFedAvg-TSLR algorithmic framework can be viewed as a meta-algorithm for AFL in the sense that they can utilize advanced FL algorithms as worker- and/or server-side optimizers to achieve enhanced performance under AFL. We validate the proposed algorithms with extensive experiments on real-world datasets.
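下面给出"双边学习率"结构的一个最小示意(假设性草图,并非AFedAvg-TSLR的完整算法:到场客户端集合、加权方式与收敛条件以论文为准):客户端用本地学习率做任意步SGD并返回模型增量,服务器对本轮"自愿到场"的增量取平均后,再乘以服务器端学习率应用到全局模型。

```python
import copy
import torch
import torch.nn as nn

def local_update(global_model, data, steps, eta_l):
    """客户端:随意选择本地步数 steps,返回模型增量(伪梯度)。"""
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=eta_l)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        x, y = data
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return [p_new.data - p_old.data
            for p_new, p_old in zip(model.parameters(), global_model.parameters())]

global_model = nn.Linear(10, 1)
eta_g, eta_l = 1.0, 0.05                   # 双边学习率:服务器端与本地端
# 本轮恰好到场的客户端,各自执行了不同的本地步数
arrivals = [local_update(global_model, (torch.randn(16, 10), torch.randn(16, 1)),
                         steps, eta_l) for steps in (1, 5, 20)]
# 服务器:对到场增量取平均,再乘以服务器端学习率 eta_g
with torch.no_grad():
    for i, p in enumerate(global_model.parameters()):
        p += eta_g * torch.stack([d[i] for d in arrivals]).mean(0)
```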
【4】 Flexible Clustered Federated Learning for Client-Level Data Distribution Shift 标题:面向客户端级数据分布转移的灵活集群联合学习 链接:https://arxiv.org/abs/2108.09749
作者:Moming Duan,Duo Liu,Xinyuan Ji,Yu Wu,Liang Liang,Xianzhang Chen,Yujuan Tan 机构:Liang Liang is with the School of Microelectronics and Communication Engineering, Chongqing University 备注:Manuscript under review. arXiv admin note: substantial text overlap with arXiv:2010.06870 摘要:联邦学习(FL)使多个参与设备能够在将训练数据保留在本地的同时,协作地为全局神经网络模型做出贡献。与集中式训练设置不同,FL的非IID、不平衡(统计异质性)和分布偏移的训练数据分散在联邦网络中,这会增大局部模型和全局模型之间的差异,进一步降低性能。在本文中,我们提出了一个灵活的集群联邦学习(CFL)框架FlexCFL,其中我们1)根据客户端优化方向之间的相似性对客户端的训练进行分组,以降低训练差异;2)实现高效的新来者设备冷启动机制,以实现框架的可扩展性和实用性;3)灵活地迁移客户端,以应对客户端级数据分布漂移的挑战。FlexCFL通过将联合优化划分为多组子优化来获得改进,并且可以在分布漂移环境中在准确性和通信效率之间取得平衡。通过分析FlexCFL的收敛性和复杂性,证明了FlexCFL的有效性。我们还在几个开放数据集上评估了FlexCFL,并与相关的CFL框架进行了比较。结果表明,与FedAvg相比,FlexCFL在FEMNIST上将绝对测试精度显著提高10.6%;与FedProx相比,在FashionMNIST上提高3.5%;与FeSEM相比,在MNIST上提高8.4%。实验结果还表明,FlexCFL在分布漂移环境下具有良好的通信效率。 摘要:Federated Learning (FL) enables the multiple participating devices to collaboratively contribute to a global neural network model while keeping the training data locally. Unlike the centralized training setting, the non-IID, imbalanced (statistical heterogeneity) and distribution shifted training data of FL is distributed in the federated network, which will increase the divergences between the local models and the global model, further degrading performance. In this paper, we propose a flexible clustered federated learning (CFL) framework named FlexCFL, in which we 1) group the training of clients based on the similarities between the clients' optimization directions for lower training divergence; 2) implement an efficient newcomer device cold start mechanism for framework scalability and practicality; 3) flexibly migrate clients to meet the challenge of client-level data distribution shift. FlexCFL can achieve improvements by dividing joint optimization into groups of sub-optimization and can strike a balance between accuracy and communication efficiency in the distribution shift environment. The convergence and complexity are analyzed to demonstrate the efficiency of FlexCFL. We also evaluate FlexCFL on several open datasets and made comparisons with related CFL frameworks. The results show that FlexCFL can significantly improve absolute test accuracy by 10.6% on FEMNIST compared to FedAvg, 3.5% on FashionMNIST compared to FedProx, and 8.4% on MNIST compared to FeSEM. The experiment results show that FlexCFL is also communication efficient in the distribution shift environment.
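下面示意"按客户端优化方向的相似性分组"这一核心步骤(假设性草图,并非FlexCFL的完整流程:分组数、相似度度量与迁移策略的细节以论文为准):把各客户端展平后的模型更新单位化后做K-Means,即按余弦相似度聚类;新来者或发生分布漂移的客户端被分配到中心方向最相似的组。

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)
# 每行是一个客户端展平后的模型更新(优化方向);用两簇方向模拟非IID客户端
dirs = np.vstack([rng.normal(+1.0, 0.3, size=(10, 50)),
                  rng.normal(-1.0, 0.3, size=(10, 50))])

# 先单位化再做 K-Means:单位球面上的欧氏距离与余弦相似度单调对应
unit = normalize(dirs)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(unit)
print(km.labels_)        # 同组客户端在组内做联邦平均,降低组内训练差异

# 新来者冷启动 / 分布漂移时的迁移:分配到中心方向余弦相似度最高的组
centers = normalize(km.cluster_centers_)
newcomer = normalize(rng.normal(+1.0, 0.3, size=(1, 50)))
print(int((newcomer @ centers.T).argmax()))
```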
【5】 Personalised Federated Learning: A Combinational Approach 标题:个性化联合学习:一种组合方法 链接:https://arxiv.org/abs/2108.09618
作者:Sone Kyaw Pye,Han Yu 机构:School of Computer Science and Engineering, Nanyang Technological University, Singapore 备注:in Proceedings of the 1st International Student Conference on Artificial Intelligence (STCAI'21), 2021 摘要:联邦学习(FL)是一种分布式机器学习方法,涉及多个客户端协作训练共享模型。这样的系统的优点是拥有来自多个客户端的更多训练数据,但数据可能是非独立同分布的(non-i.i.d.)。差分隐私(DP)和鲁棒聚合(RA)等隐私和完整性保护功能通常用于FL。在这项工作中,我们表明,在常见的深度学习任务中,FL模型的性能因客户端和场景而异,并且由于non-i.i.d.数据,FL模型有时比本地模型的性能更差。其次,我们表明,加入DP和RA会进一步降低性能。然后,我们对FL常用个性化方法(如微调、专家混合集成、多任务学习和知识蒸馏)的不同组合对性能的影响进行了消融研究。据观察,某些个性化方法的组合在特定场景中更具影响力,而另一些方法总能提高性能,并且组合方法优于单一方法。大多数客户端通过组合式个性化FL获得了更好的性能,并从non-i.i.d.数据、DP和RA导致的性能下降中恢复过来。 摘要:Federated learning (FL) is a distributed machine learning approach involving multiple clients collaboratively training a shared model. Such a system has the advantage of more training data from multiple clients, but data can be non-identically and independently distributed (non-i.i.d.). Privacy and integrity preserving features such as differential privacy (DP) and robust aggregation (RA) are commonly used in FL. In this work, we show that on common deep learning tasks, the performance of FL models differs amongst clients and situations, and FL models can sometimes perform worse than local models due to non-i.i.d. data. Secondly, we show that incorporating DP and RA degrades performance further. Then, we conduct an ablation study on the performance impact of different combinations of common personalization approaches for FL, such as finetuning, mixture-of-experts ensemble, multi-task learning, and knowledge distillation. It is observed that certain combinations of personalization approaches are more impactful in certain scenarios while others always improve performance, and combination approaches are better than individual ones. Most clients obtained better performance with combined personalized FL and recover from performance degradation caused by non-i.i.d. data, DP, and RA.
推理|分析|理解|解释(7篇)
【1】 TRAPDOOR: Repurposing backdoors to detect dataset bias in machine learning-based genomic analysis 标题:陷门:重新调整后门的用途以检测基于机器学习的基因组分析中的数据集偏差 链接:https://arxiv.org/abs/2108.10132
作者:Esha Sarkar,Michail Maniatakos 机构:NYU Tandon School of Engineering, Brooklyn, New York, USA, Center for Cybersecurity, New York University Abu Dhabi, Abu Dhabi, UAE 摘要:机器学习(ML)在图像、语音、文本和数据分析等应用中取得了前所未有的性能。利用ML了解基因突变(基因组学)的基本模式具有深远的意义,不仅可以克服诊断缺陷,还可以设计癌症等威胁生命的疾病的治疗方案。ML算法的成功和可持续性取决于收集和用于训练的数据的质量和多样性。在这类数据集中,群体(种族群体、性别群体等)的代表性不足可能导致对某些群体的预测不准确,从而进一步加剧系统性歧视问题。在这项工作中,我们提出了TRAPDOOR,这是一种识别有偏差数据集的方法,它重新利用了一种通常被用于恶意目的的技术:神经网络后门(neural network backdoors)。我们考虑基因组供应链中一个典型的协作学习场景:数据可能从医院、合作项目或研究机构汇聚到中心云,而对针对敏感群体的偏见一无所知。在此背景下,我们开发了一种方法,借助面向基因组应用的ML后门技术,在不损害模型真实性能的前提下泄露集体数据的潜在偏差信息。我们使用真实世界的癌症数据集,既分析了本身就对白人个体存在偏差的数据集,也在数据集中人为引入偏差;实验结果表明,TRAPDOOR能够以100%的准确率检测数据集偏差的存在,并能以较小的误差恢复偏差百分比,从而提取偏差的程度。 摘要:Machine Learning (ML) has achieved unprecedented performance in several applications including image, speech, text, and data analysis. Use of ML to understand underlying patterns in gene mutations (genomics) has far-reaching results, not only in overcoming diagnostic pitfalls, but also in designing treatments for life-threatening diseases like cancer. Success and sustainability of ML algorithms depends on the quality and diversity of data collected and used for training. Under-representation of groups (ethnic groups, gender groups, etc.) in such a dataset can lead to inaccurate predictions for certain groups, which can further exacerbate systemic discrimination issues. In this work, we propose TRAPDOOR, a methodology for identification of biased datasets by repurposing a technique that has been mostly proposed for nefarious purposes: Neural network backdoors. We consider a typical collaborative learning setting of the genomics supply chain, where data may come from hospitals, collaborative projects, or research institutes to a central cloud without awareness of bias against a sensitive group. In this context, we develop a methodology to leak potential bias information of the collective data without hampering the genuine performance using ML backdooring catered for genomic applications. Using a real-world cancer dataset, we analyze the dataset with the bias that already existed towards white individuals and also introduced biases in datasets artificially, and our experimental result show that TRAPDOOR can detect the presence of dataset bias with 100% accuracy, and furthermore can also extract the extent of bias by recovering the percentage with a small error.
【2】 On the Acceleration of Deep Neural Network Inference using Quantized Compressed Sensing 标题:基于量化压缩感知的深度神经网络推理加速研究 链接:https://arxiv.org/abs/2108.10101
作者:Meshia Cédric Oveneke 机构:Artificial Intelligence Research Lab, Fit-For-Purpose Technologies 备注:3 pages, no figures, paper accepted at Black In AI at the 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada 摘要:在资源受限的设备上加速深度神经网络(DNN)推理,是确保其得到更广泛、更具包容性采用所面临的最重要挑战之一。为了缓解这一问题,尽管会带来严重的精度下降,能够加快卷积并节省内存的DNN二值量化仍是最有希望的策略之一。因此,本文提出了一种新的基于量化压缩感知(QCS)的二值量化函数。理论论证推测,我们的方案在保留标准方法实际好处的同时,减少了量化误差及由此导致的精度下降。 摘要:Accelerating deep neural network (DNN) inference on resource-limited devices is one of the most important challenges in ensuring a wider and more inclusive adoption. To alleviate this, DNN binary quantization for faster convolution and memory savings is one of the most promising strategies despite its serious drop in accuracy. The present paper therefore proposes a novel binary quantization function based on quantized compressed sensing (QCS). Theoretical arguments conjecture that our proposal preserves the practical benefits of standard methods, while reducing the quantization error and the resulting drop in accuracy.
【3】 On Quantifying Literals in Boolean Logic and Its Applications to Explainable AI 标题:布尔逻辑中文字的量化及其在可解释人工智能中的应用 链接:https://arxiv.org/abs/2108.09876
作者:Adnan Darwiche,Pierre Marquis 机构:Computer Science Department, UCLA, Los Angeles, CA, USA, CRIL, Université d'Artois & CNRS, Institut Universitaire de France, F-, Lens Cedex, France 备注:To be published in Journal of Artificial Intelligence Research (JAIR) with minor modifications 摘要:量化布尔逻辑是在布尔逻辑中加入对变量进行存在量化和全称量化的运算符而得到的结果。这扩展了布尔逻辑的能力范围,使几十年来不断被探索的各种应用成为可能。文献中也研究了文字(即变量状态)的存在量化及其应用。在本文中,我们通过研究文字的全称量化及其应用(特别是在可解释人工智能方面的应用)来补充这一点。我们还为量化提供了一种新的语义,并讨论了变量/文字量化与存在/全称量化之间的相互作用。我们进一步确定了一些可以高效进行量化的布尔公式和电路类别。文字量化比变量量化更细粒度,因为后者可以用前者来定义。这导致了一种以文字量化为原语的量化布尔逻辑的细化。 摘要:Quantified Boolean logic results from adding operators to Boolean logic for existentially and universally quantifying variables. This extends the reach of Boolean logic by enabling a variety of applications that have been explored over the decades. The existential quantification of literals (variable states) and its applications have also been studied in the literature. In this paper, we complement this by studying universal literal quantification and its applications, particularly to explainable AI. We also provide a novel semantics for quantification, discuss the interplay between variable/literal and existential/universal quantification. We further identify some classes of Boolean formulas and circuits on which quantification can be done efficiently. Literal quantification is more fine-grained than variable quantification as the latter can be defined in terms of the former. This leads to a refinement of quantified Boolean logic with literal quantification as its primitive.
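作为背景,下面给出布尔逻辑中变量量化的标准语义(Shannon展开;这是文献中的通用定义,文中文字量化的精确语义以原文为准)。摘要所说的"变量量化可以用文字量化来定义",指的正是文字量化比下式更细的粒度:

```latex
% 变量量化的标准语义(Shannon 展开);存在量化与全称量化互为对偶:
\exists x.\, f \;=\; f|_{x=1} \,\lor\, f|_{x=0},
\qquad
\forall x.\, f \;=\; f|_{x=1} \,\land\, f|_{x=0}.
```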
【4】 Explainable Machine Learning using Real, Synthetic and Augmented Fire Tests to Predict Fire Resistance and Spalling of RC Columns 标题:用真实、合成和扩大火灾试验预测钢筋混凝土柱抗火和层裂的解释性机器学习 链接:https://arxiv.org/abs/2108.09862
作者:M. Z. Naser,V. K. Kodur 机构:Glenn Department of Civil Engineering, Clemson University 摘要:本文介绍了一种系统性机器学习(ML)方法的发展,该方法能够对钢筋混凝土(RC)柱的耐火性和火灾引起的剥落进行可解释且快速的评估。所开发的方法由三种新颖ML算法的集成构成,即随机森林(RF)、极端梯度提升树(ExGBT)和深度学习(DL)。通过分析包含494多个观测值的火灾试验综合数据库,对这些算法进行训练,使其考虑广泛的几何特征、材料特性以及荷载条件,以考察普通强度和高强度钢筋混凝土柱的火灾性能。所开发的集成还能够为ML预测提供可量化的见解,从而打破了"黑盒"ML的概念,朝着透明和可解释的ML迈出了坚实的一步。最重要的是,这项工作通过提出利用真实、合成和增强火灾试验观测数据的新技术,解决了可用火灾试验不足的问题。所开发的ML集成已针对标准和设计火灾暴露以及单面、双面、三面和四面火灾暴露进行了校准和验证,涵盖了火灾事故期间的各种实际场景。当完全部署时,所开发的集成可以在60秒内分析超过5000根钢筋混凝土柱,因此为研究人员和从业者提供了一个有吸引力的解决方案。所提出的方法也可以很容易地扩展到评估其他结构构件在不同火灾场景和荷载条件下的耐火性和剥落,从而为该研究领域和实践的现代化铺平道路。 摘要:This paper presents the development of a systematic machine learning (ML) approach to enable explainable and rapid assessment of fire resistance and fire-induced spalling of reinforced concrete (RC) columns. The developed approach comprises an ensemble of three novel ML algorithms, namely random forest (RF), extreme gradient boosted trees (ExGBT), and deep learning (DL). These algorithms are trained to account for a wide collection of geometric characteristics and material properties, as well as loading conditions, to examine the fire performance of normal and high strength RC columns by analyzing a comprehensive database of fire tests comprising over 494 observations. The developed ensemble is also capable of presenting quantifiable insights into ML predictions; thus breaking free from the notion of 'blackbox' ML and establishing a solid step towards transparent and explainable ML. Most importantly, this work tackles the scarcity of available fire tests by proposing new techniques to leverage the use of real, synthetic and augmented fire test observations. The developed ML ensemble has been calibrated and validated for standard and design fire exposures and for one, two, three and four-sided fire exposures, thus covering a wide range of practical scenarios present during fire incidents. When fully deployed, the developed ensemble can analyze over 5,000 RC columns in under 60 seconds, thus providing an attractive solution for researchers and practitioners. The presented approach can also be easily extended for evaluating fire resistance and spalling of other structural members and under varying fire scenarios and loading conditions and hence paves the way to modernize the state of this research area and practice.
【5】 Automating Crystal-Structure Phase Mapping: Combining Deep Learning with Constraint Reasoning 标题:深度学习与约束推理相结合的晶体结构相图自动化 链接:https://arxiv.org/abs/2108.09523
作者:Di Chen,Yiwei Bai,Sebastian Ament,Wenting Zhao,Dan Guevarra,Lan Zhou,Bart Selman,R. Bruce van Dover,John M. Gregoire,Carla P. Gomes 机构: Cornell University, Department of Computer Science, California Institute of Technology, Joint Center for Artificial Photosynthesis, Cornell University, Department of Materials Science and Engineering 摘要:晶体结构相位映射是材料科学中一个核心的长期挑战,需要识别合成材料中的晶体结构或其混合物。材料科学专家擅长解决简单体系,但无法解决复杂体系,这在高通量材料发现中造成了一个主要瓶颈。在此,我们展示了如何将晶体结构相位映射自动化。我们将相位映射表述为一个无监督的模式分解问题,并描述了如何使用深度推理网络(DRNets)解决它。DRNets将深度学习与约束推理结合起来,以整合科学先验知识,因此只需要少量(未标记的)数据。DRNets利用并放大了关于支配晶体混合物的热力学规则的丰富先验知识,以弥补有限的数据。DRNets设计了一个可解释的潜在空间,用于编码先验知识域约束,并将约束推理无缝地集成到神经网络优化中。DRNets在晶体结构相位映射上超越了以往的方法,揭示了Bi-Cu-V氧化物相图,并助力发现了太阳能燃料材料。 摘要:Crystal-structure phase mapping is a core, long-standing challenge in materials science that requires identifying crystal structures, or mixtures thereof, in synthesized materials. Materials science experts excel at solving simple systems but cannot solve complex systems, creating a major bottleneck in high-throughput materials discovery. Herein we show how to automate crystal-structure phase mapping. We formulate phase mapping as an unsupervised pattern demixing problem and describe how to solve it using Deep Reasoning Networks (DRNets). DRNets combine deep learning with constraint reasoning for incorporating scientific prior knowledge and consequently require only a modest amount of (unlabeled) data. DRNets compensate for the limited data by exploiting and magnifying the rich prior knowledge about the thermodynamic rules governing the mixtures of crystals with constraint reasoning seamlessly integrated into neural network optimization. DRNets are designed with an interpretable latent space for encoding prior-knowledge domain constraints and seamlessly integrate constraint reasoning into neural network optimization. DRNets surpass previous approaches on crystal-structure phase mapping, unraveling the Bi-Cu-V oxide phase diagram, and aiding the discovery of solar-fuels materials.
【6】 Understanding and Co-designing the Data Ingestion Pipeline for Industry-Scale RecSys Training 标题:理解并共同设计行业规模的RecSys训练的数据获取管道 链接:https://arxiv.org/abs/2108.09373
作者:Mark Zhao,Niket Agarwal,Aarti Basant,Bugra Gedik,Satadru Pan,Mustafa Ozdal,Rakesh Komuravelli,Jerry Pan,Tianshu Bao,Haowei Lu,Sundaram Narayanan,Jack Langman,Kevin Wilfong,Harsha Rastogi,Carole-Jean Wu,Christos Kozyrakis,Parik Pol 机构:Stanford University, Stanford, USA, Facebook, Menlo Park, USA 摘要:数据摄取管道负责存储和预处理训练数据,是任何机器学习训练工作的重要组成部分。在Facebook,我们在我们的服务中广泛使用推荐模型。训练这些模型所需的数据摄取需求很大。在本文中,我们对行业级推荐模型训练的数据摄取挑战进行了广泛的描述。首先,数据集存储需求庞大且多变;超过本地存储容量。其次,读取和预处理数据的计算成本很高,需要的计算、内存和网络资源远远超过训练师本身的可用资源。当使用当前的训练师预处理解决方案时,这些需求会导致训练吞吐量大幅降低,从而浪费GPU资源。为了应对这些挑战,我们提出了一个分类数据摄取管道。它包括一个构建在分布式存储节点上的中央数据仓库。我们引入了数据预处理服务(DPP),这是一种完全分解的预处理服务,可扩展到数百个节点,消除了可将训练吞吐量降低56%的数据暂停。我们跨存储和DPP实施了重要的优化,将存储和预处理吞吐量分别提高了1.9倍和2.3倍,解决了数据摄取的大量功耗需求。最后,我们总结了经验教训,并介绍了围绕大规模数据摄取而存在的重要挑战和机遇。 摘要:The data ingestion pipeline, responsible for storing and preprocessing training data, is an important component of any machine learning training job. At Facebook, we use recommendation models extensively across our services. The data ingestion requirements to train these models are substantial. In this paper, we present an extensive characterization of the data ingestion challenges for industry-scale recommendation model training. First, dataset storage requirements are massive and variable; exceeding local storage capacities. Secondly, reading and preprocessing data is computationally expensive, requiring substantially more compute, memory, and network resources than are available on trainers themselves. These demands result in drastically reduced training throughput, and thus wasted GPU resources, when current on-trainer preprocessing solutions are used. To address these challenges, we present a disaggregated data ingestion pipeline. It includes a central data warehouse built on distributed storage nodes. We introduce Data PreProcessing Service (DPP), a fully disaggregated preprocessing service that scales to hundreds of nodes, eliminating data stalls that can reduce training throughput by 56%. We implement important optimizations across storage and DPP, increasing storage and preprocessing throughput by 1.9x and 2.3x, respectively, addressing the substantial power requirements of data ingestion. We close with lessons learned and cover the important remaining challenges and opportunities surrounding data ingestion at scale.
【7】 Electroencephalogram Signal Processing with Independent Component Analysis and Cognitive Stress Classification using Convolutional Neural Networks 标题:基于独立分量分析的脑电信号处理和基于卷积神经网络的认知应激分类 链接:https://arxiv.org/abs/2108.09817
作者:Venkatakrishnan Sutharsan,Alagappan Swaminathan,Saisrinivasan Ramachandran,Madan Kumar Lakshmanan,Balaji Mahadevan 机构: Dept. of Electrical and Electronics Engineering, SSN College of Engineering, India, CSIR - Central Electronics Engineering Research Institute, Pilani, Rajasthan, India 备注:16 pages, 10 figures, 2 tables, 8 equations, 16 references 摘要:脑电图(EEG)是通过放置在头皮上的电极采集到的生物电活动的记录。在EEG记录中,所获得的信号主要受到眼电信号(EOG)的污染。由于该伪迹的幅度高于EEG信号,必须去除这些噪声信号,以便在医学诊断等应用中更好地了解人脑的功能。本文提出了一种将独立分量分析(ICA)与互相关相结合的脑电信号去噪方法:基于带阈值的互相关系数来选择分量,并削弱其影响而非将其完全置零,从而减少信息损失。记录数据的实验结果表明,该算法能够在脑电数据损失较小的情况下消除EOG信号伪迹。去噪效果通过信噪比值的增加和互相关系数值的降低得到验证。去噪后的信号用于训练人工神经网络(ANN),该网络将检查输入EEG信号的特征,并预测个体的应激水平。 摘要:Electroencephalogram (EEG) is the recording of bio-electrical brain activity acquired from electrodes placed on the scalp. In EEG recordings, the signals obtained are contaminated predominantly by the Electrooculogram (EOG) signal. Since this artifact has a higher magnitude compared to EEG signals, these noise signals have to be removed in order to have a better understanding of the functioning of a human brain for applications such as medical diagnosis. This paper proposes an idea of using Independent Component Analysis (ICA) along with cross-correlation to de-noise the EEG signal. This is done by selecting the component based on the cross-correlation coefficient with a threshold value and reducing its effect instead of zeroing it out completely, thus reducing the information loss. The results on the recorded data show that this algorithm can eliminate the EOG signal artifact with little loss in EEG data. The denoising is verified by an increase in SNR value and the decrease in cross-correlation coefficient value. The denoised signals are used to train an Artificial Neural Network (ANN) which would examine the features of the input EEG signal and predict the stress levels of the individual.
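下面给出该去噪流程的一个最小示意(假设性草图:阈值0.6与削弱系数0.1为示例取值,EOG参考信号用模拟的眨眼尖峰代替,通道数与混合增益也均为假设):对ICA分离出的各独立成分计算与EOG参考的相关系数,超过阈值的成分按系数削弱而非置零,再重构信号。

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 2000)
eog = (np.abs(np.sin(0.5 * np.pi * t)) > 0.98).astype(float)    # 模拟眨眼伪迹
eeg_true = 0.5 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.normal(size=t.size)
# 4 个通道,以不同增益混入 EOG 伪迹
X = np.stack([eeg_true + g * eog for g in (2.0, 1.5, 1.0, 0.5)], axis=1)

ica = FastICA(n_components=4, random_state=0)
S = ica.fit_transform(X)                  # 独立成分 [T, 4]

# 计算各成分与 EOG 参考信号的相关系数;超阈值者按系数 alpha 削弱(而非置零)
thr, alpha = 0.6, 0.1
for k in range(S.shape[1]):
    r = np.corrcoef(S[:, k], eog)[0, 1]
    if abs(r) > thr:
        S[:, k] *= alpha                  # 保留少量信息,减少信息损失

X_clean = ica.inverse_transform(S)        # 重构去噪后的 EEG
```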
检测相关(3篇)
【1】 An Interpretable Approach to Hateful Meme Detection 标题:一种可解释的仇恨模因检测方法 链接:https://arxiv.org/abs/2108.10069
作者:Tanvi Deshpande,Nitya Mani 机构:Irvington High School, Fremont, CA, USA, Massachusetts Institute of Technology, Cambridge, MA, USA 备注:5 pages. 2021 ACM International Conference on Multimodal Interaction (ICMI) 摘要:仇恨模因是一种新兴的在互联网上传播仇恨的方法,它依靠图像和文本来传达仇恨信息。我们采用可解释的方法来检测可恨的模因,使用机器学习和简单的启发式方法来识别对将模因归类为可恨模因最重要的特征。在此过程中,我们构建了一个梯度增强的决策树和一个基于LSTM的模型,在这项具有挑战性的任务中,该模型的性能(73.8验证和72.7测试auROC)与人类金标准和最先进的Transformer模型相当。 摘要:Hateful memes are an emerging method of spreading hate on the internet, relying on both images and text to convey a hateful message. We take an interpretable approach to hateful meme detection, using machine learning and simple heuristics to identify the features most important to classifying a meme as hateful. In the process, we build a gradient-boosted decision tree and an LSTM-based model that achieve comparable performance (73.8 validation and 72.7 test auROC) to the gold standard of humans and state-of-the-art transformer models on this challenging task.
【2】 Sarcasm Detection in Twitter -- Performance Impact when using Data Augmentation: Word Embeddings 标题:Twitter中的讽刺检测--使用数据增强时的性能影响:Word嵌入 链接:https://arxiv.org/abs/2108.09924
作者:Alif Tri Handoyo,Hidayaturrahman,Derwin Suhartono 机构:Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta, Indonesia 备注:7 pages, 4 figures. arXiv admin note: text overlap with arXiv:2104.09261 by other authors 摘要:讽刺通常是用来嘲弄或惹恼某人,或用于幽默目的的话语。讽刺主要出现在社交网络和微博网站上,人们在这些网站上以嘲笑或指责的方式发言,甚至让人很难辨别其字面意思是否就是真实意图。在情感分析和观点挖掘等自然语言处理应用中,如果不能识别讽刺话语,将混淆分类算法并产生错误结果。一些关于讽刺检测的研究使用了不同的学习算法。然而,这些学习模型大多只关注表达的内容本身,而将上下文信息孤立起来;结果,它们未能捕捉到讽刺表达中的上下文信息。此外,一些研究所使用的数据集并不平衡,这会影响模型结果。在本文中,我们使用RoBERTa提出了一个用于Twitter讽刺识别的上下文模型,并通过应用全局向量表示(Global Vector Representation, GloVe)构建词嵌入与上下文学习来对数据集进行扩充,以生成更多数据并平衡数据集。我们在多种数据集和数据扩充设置下测试了该技术的有效性。特别是,当使用数据扩充将标记为讽刺的数据增加20%时,我们在iSarcasm数据集上实现了3.2%的性能提升,使F分数从不使用数据扩充时的37.2%提高到40.4%。 摘要:Sarcasm is the use of words usually used to either mock or annoy someone, or for humorous purposes. Sarcasm is largely used in social networks and microblogging websites, where people mock or censure in a way that makes it difficult even for humans to tell if what is said is what is meant. Failure to identify sarcastic utterances in Natural Language Processing applications such as sentiment analysis and opinion mining will confuse classification algorithms and generate false results. Several studies on sarcasm detection have utilized different learning algorithms. However, most of these learning models have always focused on the contents of expression only, leaving the contextual information in isolation. As a result, they failed to capture the contextual information in the sarcastic expression. Moreover, some datasets used in several studies have an unbalanced dataset which impacts the model result. In this paper, we propose a contextual model for sarcasm identification in Twitter using RoBERTa, and augment the dataset by applying Global Vector representation (GloVe) for the construction of word embedding and context learning to generate more data and balance the dataset. The effectiveness of this technique is tested with various datasets and data augmentation settings. In particular, we achieve a performance gain of 3.2% in the iSarcasm dataset when using data augmentation to increase 20% of data labeled as sarcastic, resulting in an F-score of 40.4% compared to 37.2% without data augmentation.
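下面示意一种基于GloVe近邻替换的简单文本数据扩充(假设性草图,不一定与论文的具体扩充流程一致:替换概率、近邻数均为示例;gensim为第三方库,首次调用会下载预训练词向量):以一定概率把句中的词替换为GloVe向量空间中的近邻词,从而生成新的同标签样本来平衡数据集。

```python
import random
import gensim.downloader as api   # 第三方库 gensim;首次调用会下载词向量

glove = api.load("glove-twitter-25")     # 预训练的 Twitter GloVe 词向量

def augment(tokens, p=0.2, topn=5, seed=0):
    """以概率 p 把词替换成 GloVe 空间中的近邻词,生成新的(同标签)样本。"""
    rng = random.Random(seed)
    out = []
    for w in tokens:
        if w in glove and rng.random() < p:
            out.append(rng.choice([s for s, _ in glove.most_similar(w, topn=topn)]))
        else:
            out.append(w)
    return out

print(augment("oh great another monday morning meeting".split()))
```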
【3】 Data-driven Smart Ponzi Scheme Detection 标题:数据驱动的智能庞氏骗局检测 链接:https://arxiv.org/abs/2108.09305
作者:Yuzhi Liang,Weijing Wu,Kai Lei,Feiyang Wang 机构: Peking University 摘要:智能庞氏骗局是一种新的经济犯罪形式,它使用以太坊智能合约账户和加密货币实施庞氏骗局。智能庞氏骗局已经损害了许多投资者的利益,但对智能庞氏骗局检测的研究仍然十分有限。现有的智能庞氏骗局检测方法存在特征工程需要大量人力资源和模型可移植性差的问题。为了解决这些问题,本文提出了一种数据驱动的智能庞氏骗局检测系统。该系统利用动态图嵌入技术,基于与账户交易相关的多源多模式数据,自动学习账户的表示。与传统方法相比,该系统需要非常有限的人机交互。据我们所知,这是第一个通过动态图嵌入实现智能庞氏骗局检测的工作。实验结果表明,该方法明显优于现有的智能庞氏骗局检测方法。 摘要:A smart Ponzi scheme is a new form of economic crime that uses Ethereum smart contract account and cryptocurrency to implement Ponzi scheme. The smart Ponzi scheme has harmed the interests of many investors, but researches on smart Ponzi scheme detection is still very limited. The existing smart Ponzi scheme detection methods have the problems of requiring many human resources in feature engineering and poor model portability. To solve these problems, we propose a data-driven smart Ponzi scheme detection system in this paper. The system uses dynamic graph embedding technology to automatically learn the representation of an account based on multi-source and multi-modal data related to account transactions. Compared with traditional methods, the proposed system requires very limited human-computer interaction. To the best of our knowledge, this is the first work to implement smart Ponzi scheme detection through dynamic graph embedding. Experimental results show that this method is significantly better than the existing smart Ponzi scheme detection methods.
分类|识别(5篇)
【1】 Pattern Inversion as a Pattern Recognition Method for Machine Learning 标题:模式反演作为一种机器学习的模式识别方法 链接:https://arxiv.org/abs/2108.10242
作者:Alexei Mikhailov,Mikhail Karavay 机构:Institute for Control Problems, Russian Academy of Sciences, Moscow, Russia 备注:9 pages, 12 figures 摘要:人工神经网络使用大量系数,这些系数的调整需要大量计算能力,在采用深度学习网络时尤甚。然而,也存在一些无系数、基于索引的极快技术,它们已在谷歌搜索引擎、基因组测序等场景中发挥作用。本文讨论了基于索引的模式识别方法。结果表明,对于模式识别应用,这种索引方法用逆模式(inverse patterns)取代了搜索引擎中通常使用的完全倒排文件。这种反演不仅提供了自动特征提取这一深度学习的显著标志,而且与深度学习不同,由于不存在系数,模式反演支持几乎瞬时的学习。本文讨论了一种基于新模式变换的模式反演形式体系及其在无监督即时学习中的应用。示例演示了在任意背景下与视角无关的三维对象(如汽车)识别、飞机发动机剩余使用寿命预测以及其他应用。最后值得注意的是,在神经生理学中,新皮质微柱的功能自1957年以来一直存在广泛争论。本文假设,从数学上讲,皮质微柱可以描述为一种逆模式,它在物理上充当连接乘数,扩展输入与相关模式类的关联。 摘要:Artificial neural networks use a lot of coefficients that take a great deal of computing power for their adjustment, especially if deep learning networks are employed. However, there exist coefficient-free, extremely fast indexing-based technologies that work, for instance, in Google search engines, in genome sequencing, etc. The paper discusses the use of indexing-based methods for pattern recognition. It is shown that for pattern recognition applications such indexing methods replace the fully inverted files, which are typically employed in search engines, with inverse patterns. Not only does such inversion provide automatic feature extraction, which is a distinguishing mark of deep learning, but, unlike deep learning, pattern inversion supports almost instantaneous learning, which is a consequence of the absence of coefficients. The paper discusses a pattern inversion formalism that makes use of a novel pattern transform and its application for unsupervised instant learning. Examples demonstrate a view-angle independent recognition of three-dimensional objects, such as cars, against arbitrary background, prediction of remaining useful life of aircraft engines, and other applications. In conclusion, it is noted that, in neurophysiology, the function of the neocortical mini-column has been widely debated since 1957. This paper hypothesizes that, mathematically, the cortical mini-column can be described as an inverse pattern, which physically serves as a connection multiplier expanding associations of inputs with relevant pattern classes.
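作为对"无系数、基于索引"思路的一个玩具示意(只是普通倒排索引,并非论文的逆模式形式体系):学习阶段仅向索引中登记特征到类别的映射,因而近乎即时;识别阶段按命中投票,全程没有任何可调系数:

# A toy illustration of coefficient-free, indexing-based recognition
# (a plain inverted index, not the paper's inverse-pattern formalism).
from collections import defaultdict, Counter

index = defaultdict(set)            # feature -> set of class labels

def learn(features, label):
    for f in features:              # "instant learning": just add index entries
        index[f].add(label)

def recognize(features):
    votes = Counter()
    for f in features:
        for label in index.get(f, ()):
            votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None

learn({"wheel", "window", "door"}, "car")
learn({"wing", "engine", "window"}, "aircraft")
print(recognize({"wheel", "door"}))   # -> car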
【2】 Improving the trustworthiness of image classification models by utilizing bounding-box annotations 标题:利用包围盒标注提高图像分类模型的可信性 链接:https://arxiv.org/abs/2108.10131
作者:Dharma KC,Chicheng Zhang 机构:University of Arizona 备注:None 摘要:我们研究利用训练数据中的辅助信息来提高机器学习模型的可信度。具体地说,在图像分类的背景下,我们建议优化包含边界框信息的训练目标,这在许多图像分类数据集中都是可用的。初步实验结果表明,与基线算法相比,该算法在准确性、鲁棒性和可解释性方面都有较好的性能。 摘要:We study utilizing auxiliary information in training data to improve the trustworthiness of machine learning models. Specifically, in the context of image classification, we propose to optimize a training objective that incorporates bounding box information, which is available in many image classification datasets. Preliminary experimental results show that the proposed algorithm achieves better performance in accuracy, robustness, and interpretability compared with baselines.
【3】 Face Photo-Sketch Recognition Using Bidirectional Collaborative Synthesis Network 标题:基于双向协同合成网络的人脸照片素描识别 链接:https://arxiv.org/abs/2108.09898
作者:Seho Bae,Nizam Ud Din,Hyunkyu Park,Juneho Yi 机构:Department of Electrical and Computer Engineering, Sungkyunkwan University, Suwon, Republic of Korea 摘要:本研究采用了一个基于深度学习的框架来解决给定人脸草图图像与人脸照片数据库的匹配问题。照片-草图匹配问题具有挑战性,因为1)照片和草图之间存在较大的模态差异,2)成对的训练样本数量不足以训练基于深度学习的网络。为了避免大模态间隙的问题,我们的方法是在两种模态之间使用一个中间的潜在空间。通过采用双向(照片->草图和草图->照片)协作合成网络,我们有效地调整了这一潜在空间中两种模式的分布。利用StyleGAN式结构,使中间潜在空间具有丰富的表现力。为了解决训练样本不足的问题,我们引入了三步训练方案。对公共合成人脸草图数据库的广泛评估证实了我们的方法比现有的最先进的方法具有更高的性能。所提出的方法可用于匹配其他模态对。 摘要:This research features a deep-learning based framework to address the problem of matching a given face sketch image against a face photo database. The problem of photo-sketch matching is challenging because 1) there is large modality gap between photo and sketch, and 2) the number of paired training samples is insufficient to train deep learning based networks. To circumvent the problem of large modality gap, our approach is to use an intermediate latent space between the two modalities. We effectively align the distributions of the two modalities in this latent space by employing a bidirectional (photo -> sketch and sketch -> photo) collaborative synthesis network. A StyleGAN-like architecture is utilized to make the intermediate latent space be equipped with rich representation power. To resolve the problem of insufficient training samples, we introduce a three-step training scheme. Extensive evaluation on public composite face sketch database confirms superior performance of our method compared to existing state-of-the-art methods. The proposed methodology can be employed in matching other modality pairs.
【4】 A Sparse Structure Learning Algorithm for Bayesian Network Identification from Discrete High-Dimensional Data 标题:离散高维数据贝叶斯网络辨识的稀疏结构学习算法 链接:https://arxiv.org/abs/2108.09501
作者:Nazanin Shajoonnezhad,Amin Nikanjam 备注:submitted to the journal of Statistics and Computing 摘要:本文讨论了从高维离散数据中学习稀疏结构贝叶斯网络的问题。与连续贝叶斯网络相比,离散贝叶斯网络由于参数空间大,学习是一个具有挑战性的问题。虽然已经发展了许多学习连续贝叶斯网络的方法,但针对离散贝叶斯网络的学习方法却很少。在本文中,我们将学习贝叶斯网络视为一个优化问题,并提出一个同时满足稀疏性和DAG性质的得分函数。此外,我们还实现了一种分块随机坐标下降算法来优化得分函数。具体地说,我们在优化算法中使用了一种方差缩减方法,使算法在高维数据中有效地工作。该方法被应用于来自著名基准网络的合成数据,并对所构建网络的质量、可伸缩性和健壮性进行了度量。与一些有竞争力的方法相比,结果表明我们的算法在评价指标上优于其他方法。 摘要:This paper addresses the problem of learning a sparse structure Bayesian network from high-dimensional discrete data. Compared to continuous Bayesian networks, learning a discrete Bayesian network is a challenging problem due to the large parameter space. Although many approaches have been developed for learning continuous Bayesian networks, few approaches have been proposed for the discrete ones. In this paper, we address learning Bayesian networks as an optimization problem and propose a score function that satisfies the sparsity and the DAG property simultaneously. Besides, we implement a block-wise stochastic coordinate descent algorithm to optimize the score function. Specifically, we use a variance reducing method in our optimization algorithm to make the algorithm work efficiently in high-dimensional data. The proposed approach is applied to synthetic data from well-known benchmark networks. The quality, scalability, and robustness of the constructed network are measured. Compared to some competitive approaches, the results reveal that our algorithm outperforms the others in evaluation metrics.
【5】 Emotion Recognition from Multiple Modalities: Fundamentals and Methodologies 标题:多通道情绪识别的基本原理和方法 链接:https://arxiv.org/abs/2108.10152
作者:Sicheng Zhao,Guoli Jia,Jufeng Yang,Guiguang Ding,Kurt Keutzer 备注:Accepted by IEEE Signal Processing Magazine (SPM) 摘要:人类是情感动物。当我们表达情感时,往往涉及多种形式,无论是明确表达(如面部表情、言语)还是隐含表达(如文本、图像)。使机器具有情绪智能,即识别、解释、处理和模拟情绪,变得越来越重要。在本教程中,我们将讨论多模态情感识别(MER)的几个关键方面。我们首先简要介绍了广泛使用的情感表征模型和情感模式。然后,我们总结了现有的情绪注释策略和相应的计算任务,然后描述了MER中的主要挑战。此外,我们还提出了一些有代表性的方法,包括每种情感模式的表征学习、不同情感模式的特征融合、MER分类器的优化以及MER的领域自适应。最后,我们概述了一些实际应用,并讨论了一些未来的方向。 摘要:Humans are emotional creatures. Multiple modalities are often involved when we express emotions, whether we do so explicitly (e.g., facial expression, speech) or implicitly (e.g., text, image). Enabling machines to have emotional intelligence, i.e., recognizing, interpreting, processing, and simulating emotions, is becoming increasingly important. In this tutorial, we discuss several key aspects of multi-modal emotion recognition (MER). We begin with a brief introduction on widely used emotion representation models and affective modalities. We then summarize existing emotion annotation strategies and corresponding computational tasks, followed by the description of main challenges in MER. Furthermore, we present some representative approaches on representation learning of each affective modality, feature fusion of different affective modalities, classifier optimization for MER, and domain adaptation for MER. Finally, we outline several real-world applications and discuss some future directions.
3D|3D重建等相关(2篇)
【1】 Tracked 3D Ultrasound and Deep Neural Network-based Thyroid Segmentation reduce Interobserver Variability in Thyroid Volumetry 标题:基于跟踪三维超声和深度神经网络的甲状腺分割降低甲状腺体积测量中的观察者间变异性 链接:https://arxiv.org/abs/2108.10118
作者:Markus Krönke,Christine Eilers,Desislava Dimova,Melanie Köhler,Gabriel Buschner,Lilit Mirzojan,Lemonia Konstantinidou,Marcus R. Makowski,James Nagarajah,Nassir Navab,Wolfgang Weber,Thomas Wendler 机构:Department of Nuclear Medicine, School of Medicine, Technical University of Munich, Department of Radiology, School of Medicine, Technical University of Munich, Munich, Chair for Computer Aided Medical Procedures and Augmented Reality, Department of 备注:7 figures, 19 pages, under review 摘要:背景:甲状腺容量测定在甲状腺疾病的诊断、治疗和监测中至关重要。然而,传统的二维超声甲状腺容量测定法高度依赖于操作者。本研究就观察者间和观察者内变异性、耗时和准确性,比较了二维超声与结合基于深度神经网络的甲状腺自动分割的跟踪三维超声,体积参考标准为MRI。方法:对28名健康志愿者进行二维、三维超声扫描及MRI检查。三位经验水平不同(分别为6年、4年和1年)的医师(MD 1、2、3)对每位志愿者进行了三次二维超声和三次跟踪三维超声扫描。在二维扫描中,甲状腺叶体积采用椭球体公式计算。卷积深度神经网络(CNN)自动分割三维甲状腺叶。在MRI(T1 VIBE序列)上,由经验丰富的医生手动分割甲状腺。结果:训练后CNN的Dice得分为0.94。两两比较医师的观察者间变异性显示,2D和3D的平均差异分别为0.58 ml与0.52 ml(MD1对2)、-1.33 ml与-0.17 ml(MD1对3)以及-1.89 ml与-0.70 ml(MD2对3)。配对样本t检验显示,2D有两组比较存在显著差异,3D则没有。2D和3D超声的观察者内变异性相似。通过配对样本t检验比较超声体积和MRI体积,发现所有医师的2D体积测定结果均存在显著差异,而3D超声体积测定结果无显著差异。3D超声的采集时间明显更短。结论:跟踪三维超声结合CNN分割显著降低了甲状腺容量测定的观察者间变异性,并以更短的采集时间提高了测量的准确性。 摘要:Background: Thyroid volumetry is crucial in diagnosis, treatment and monitoring of thyroid diseases. However, conventional thyroid volumetry with 2D ultrasound is highly operator-dependent. This study compares 2D ultrasound and tracked 3D ultrasound with an automatic thyroid segmentation based on a deep neural network regarding inter- and intraobserver variability, time and accuracy. Volume reference was MRI. Methods: 28 healthy volunteers were scanned with 2D and 3D ultrasound as well as by MRI. Three physicians (MD 1, 2, 3) with different levels of experience (6, 4 and 1 years) performed three 2D ultrasound and three tracked 3D ultrasound scans on each volunteer. In the 2D scans the thyroid lobe volumes were calculated with the ellipsoid formula. A convolutional deep neural network (CNN) segmented the 3D thyroid lobes automatically. On MRI (T1 VIBE sequence) the thyroid was manually segmented by an experienced medical doctor. Results: The CNN was trained to obtain a Dice score of 0.94. The interobserver variability comparing two MDs showed mean differences for 2D and 3D respectively of 0.58 ml to 0.52 ml (MD1 vs. 2), -1.33 ml to -0.17 ml (MD1 vs. 3) and -1.89 ml to -0.70 ml (MD2 vs. 3). Paired samples t-tests showed significant differences in two comparisons for 2D and none for 3D. Intraobserver variability was similar for 2D and 3D ultrasound. Comparison of ultrasound volumes and MRI volumes by paired samples t-tests showed a significant difference for the 2D volumetry of all MDs, and no significant difference for 3D ultrasound. Acquisition time was significantly shorter for 3D ultrasound. Conclusion: Tracked 3D ultrasound combined with a CNN segmentation significantly reduces interobserver variability in thyroid volumetry and increases the accuracy of the measurements with shorter acquisition times.
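摘要中的"椭球体公式"即二维超声甲状腺叶容量测定的常用近似(具体校正因子以论文原文为准,此处给出惯用形式):每叶体积由长度 $l$、宽度 $w$、深度 $d$ 估计为 $V \approx \frac{\pi}{6}\, l \cdot w \cdot d \approx 0.524\, l \cdot w \cdot d$,两叶体积相加即得甲状腺总体积。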
【2】 Rotationally Equivariant Neural Operators for Learning Transformations on Tensor Fields (eg 3D Images and Vector Fields) 标题:学习张量场(如三维图像和矢量场)变换的旋转等变神经算子 链接:https://arxiv.org/abs/2108.09541
作者:Paul Shen,Michael Herbst,Venkat Viswanathan 机构:Carnegie Mellon University, RWTH Aachen University 摘要:我们引入等变神经算子来学习分辨率不变量以及张量场集之间的平移和旋转等变变换。输入和输出可以包含标量场、向量场、二阶张量场和高阶场的任意混合。我们的张量场卷积层通过学习其脉冲响应或格林函数作为卷积核来模拟任何线性算子。我们的张量场注意层通过局部张量积模拟成对场耦合。卷积和相关伴随可以是实空间或傅里叶空间,允许线性缩放。通过统一E3NN、TBNN和FNO的概念,我们在工程和量子化学中对广泛的偏微分方程和动力学系统实现了良好的预测性能。代码在Julia中,可根据作者的要求提供。 摘要:We introduce equivariant neural operators for learning resolution invariant as well as translation and rotation equivariant transformations between sets of tensor fields. Input and output may contain arbitrary mixes of scalar fields, vector fields, second order tensor fields and higher order fields. Our tensor field convolution layers emulate any linear operator by learning its impulse response or Green's function as the convolution kernel. Our tensor field attention layers emulate pairwise field coupling via local tensor products. Convolutions and associated adjoints can be in real or Fourier space allowing for linear scaling. By unifying concepts from E3NN, TBNN and FNO, we achieve good predictive performance on a wide range of PDEs and dynamical systems in engineering and quantum chemistry. Code is in Julia and available upon request from authors.
编码器(1篇)
【1】 DTWSSE: Data Augmentation with a Siamese Encoder for Time Series 标题:DTWSSE:使用孪生编码器的时间序列数据增强 链接:https://arxiv.org/abs/2108.09885
作者:Xinyu Yang,Xinlan Zhang,Zhenguo Zhang,Yahui Zhao,Rongyi Cui 机构:Department of Computer Science and Technology, Yanbian University, Gongyuan Road, Yanji, People's Republic of China 备注:Accepted as full research paper in APWEB-WAIM 2021 摘要:在现实世界中,对标记时间序列数据的访问往往受到限制,这限制了深度学习模型在时间序列分析领域的性能。数据扩充是解决时间序列数据样本量小、类别不平衡问题的有效途径。数据扩充的两个关键因素是距离度量和插值方法的选择。SMOTE在时间序列数据上表现不佳,因为它使用欧几里德距离度量并直接在对象上插值。因此,我们提出了一种基于DTW的、使用孪生编码器进行插值的合成少数类过采样技术,名为DTWSSE。为了合理地度量时间序列之间的距离,采用已被证实对时间序列有效的DTW作为距离度量。为了适应DTW度量,我们使用以无监督自训练方式训练的自动编码器进行插值。编码器是一个孪生神经网络,用于将时间序列数据从DTW隐藏空间映射到欧氏深度特征空间,解码器用于将深度特征空间映射回DTW隐藏空间。我们在大量不同的平衡或非平衡时间序列数据集上验证了所提出的方法。实验结果表明,该方法能使下游深度学习模型具有更好的性能。 摘要:Access to labeled time series data is often limited in the real world, which constrains the performance of deep learning models in the field of time series analysis. Data augmentation is an effective way to solve the problem of small sample size and imbalance in time series datasets. The two key factors of data augmentation are the distance metric and the choice of interpolation method. SMOTE does not perform well on time series data because it uses a Euclidean distance metric and interpolates directly on the object. Therefore, we propose a DTW-based synthetic minority oversampling technique using a siamese encoder for interpolation, named DTWSSE. In order to reasonably measure the distance of the time series, DTW, which has been verified to be an effective method for time series, is employed as the distance metric. To adapt the DTW metric, we use an autoencoder trained in an unsupervised self-training manner for interpolation. The encoder is a Siamese Neural Network for mapping the time series data from the DTW hidden space to the Euclidean deep feature space, and the decoder is used to map the deep feature space back to the DTW hidden space. We validate the proposed methods on a number of different balanced or unbalanced time series datasets. Experimental results show that the proposed method can lead to better performance of the downstream deep learning model.
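作为背景,下面给出DTWSSE所用的标准DTW距离的一个简短参考实现(仅为示意,孪生编码器部分不在此展示):

# A compact reference implementation of the DTW distance used as the
# metric in DTWSSE (sketch only; the siamese encoder is not shown).
import numpy as np

def dtw(a, b):
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

print(dtw([1, 2, 3, 4], [1, 1, 2, 3, 4]))  # small warping cost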
优化|敛散性(3篇)
【1】 LoOp: Looking for Optimal Hard Negative Embeddings for Deep Metric Learning 标题:LoOp:为深度度量学习寻找最优困难负样本嵌入 链接:https://arxiv.org/abs/2108.09335
作者:Bhavya Vasudeva,Puneesh Deora,Saumik Bhattacharya,Umapada Pal,Sukalpa Chanda 机构:Indian Statistical Institute, Kolkata, India, Indian Institute of Technology, Kharagpur, India, Østfold University College, Halden, Norway 备注:17 pages, 9 figures, 5 tables. Accepted at The IEEE/CVF International Conference on Computer Vision (ICCV) 2021 摘要:深度度量学习已被有效地用于学习不同视觉任务(如图像检索、聚类等)的距离度量。为了帮助训练过程,现有方法要么使用硬挖掘策略提取信息量最大的样本,要么使用附加网络生成硬合成。这种方法面临着不同的挑战,在前一种情况下可能导致有偏差的嵌入,(i)更难的优化(ii)更慢的训练速度(iii)在后一种情况下更高的模型复杂度。为了克服这些挑战,我们提出了一种在嵌入空间中寻找最优硬负(LoOp)的新方法,通过计算一对正和一对负之间的最小距离来充分利用每个元组。与基于挖掘的方法不同,我们的方法考虑嵌入对之间的整个空间来计算最佳硬负。结合我们的方法和具有代表性的度量学习损失的大量实验表明,在三个基准数据集上,性能显著提高。 摘要:Deep metric learning has been effectively used to learn distance metrics for different visual tasks like image retrieval, clustering, etc. In order to aid the training process, existing methods either use a hard mining strategy to extract the most informative samples or seek to generate hard synthetics using an additional network. Such approaches face different challenges and can lead to biased embeddings in the former case, and (i) harder optimization (ii) slower training speed (iii) higher model complexity in the latter case. In order to overcome these challenges, we propose a novel approach that looks for optimal hard negatives (LoOp) in the embedding space, taking full advantage of each tuple by calculating the minimum distance between a pair of positives and a pair of negatives. Unlike mining-based methods, our approach considers the entire space between pairs of embeddings to calculate the optimal hard negatives. Extensive experiments combining our approach and representative metric learning losses reveal a significant boost in performance on three benchmark datasets.
【2】 New Q-Newton's method meets Backtracking line search: good convergence guarantee, saddle points avoidance, quadratic rate of convergence, and easy implementation 标题:新Q-牛顿法结合回溯线搜索:良好的收敛保证、避开鞍点、二次收敛速度、易于实现 链接:https://arxiv.org/abs/2108.10249
作者:Tuyen Trung Truong 备注:29 pages 摘要:在最近的一项联合工作中,作者开发了一种改进的牛顿法,称为新Q-牛顿法,它可以避开鞍点,并且具有二次收敛速度。虽然该方法尚未建立良好的理论收敛保证,但小规模问题上的实验表明,该方法与牛顿法的其他著名改进(如自适应三次正则化和BFGS)以及无界双向回溯梯度下降等一阶方法相比具有很强的竞争力。在本文中,我们通过提出新Q-牛顿法的一种改进来解决收敛保证问题,称为新Q-牛顿法回溯(New Q-Newton's method Backtracking),它结合了更精细的超参数使用方式和回溯线搜索。这个新方法有很好的理论保证,对于莫尔斯函数(Morse function)可得到如下结果(对新Q-牛顿法而言尚属未知):定理:设 $f:\mathbb{R}^m\rightarrow\mathbb{R}$ 为莫尔斯函数,即其所有临界点的Hessian均可逆。则对于由新Q-牛顿法回溯从随机初始点 $x_0$ 构造的序列 $\{x_n\}$,以下两种情形必居其一:i) $\lim_{n\rightarrow\infty}\|x_n\|=\infty$;或 ii) $\{x_n\}$ 收敛到点 $x_{\infty}$,该点是 $f$ 的局部极小值点,且收敛速度是二次的。此外,如果 $f$ 具有紧的下水平集,则只有情形 ii) 发生。据我们所知,对于莫尔斯函数,这是迄今为止文献中迭代优化算法的最佳理论保证。我们在小规模问题上对新Q-牛顿法回溯的一些进一步简化版本进行了实验,发现新方法显著改进了新Q-牛顿法。 摘要:In a recent joint work, the author has developed a modification of Newton's method, named New Q-Newton's method, which can avoid saddle points and has quadratic rate of convergence. While good theoretical convergence guarantee has not been established for this method, experiments on small scale problems show that the method works very competitively against other well known modifications of Newton's method such as Adaptive Cubic Regularization and BFGS, as well as first order methods such as Unbounded Two-way Backtracking Gradient Descent. In this paper, we resolve the convergence guarantee issue by proposing a modification of New Q-Newton's method, named New Q-Newton's method Backtracking, which incorporates a more sophisticated use of hyperparameters and a Backtracking line search. This new method has very good theoretical guarantees, which for a Morse function yields the following (which is unknown for New Q-Newton's method): Theorem. Let $f:\mathbb{R}^m\rightarrow\mathbb{R}$ be a Morse function, that is, all its critical points have invertible Hessian. Then for a sequence $\{x_n\}$ constructed by New Q-Newton's method Backtracking from a random initial point $x_0$, we have the following two alternatives: i) $\lim_{n\rightarrow\infty}\|x_n\|=\infty$, or ii) $\{x_n\}$ converges to a point $x_{\infty}$ which is a local minimum of $f$, and the rate of convergence is quadratic. Moreover, if $f$ has compact sublevels, then only case ii) happens. As far as we know, for Morse functions, this is the best theoretical guarantee for iterative optimization algorithms so far in the literature. We have run experiments on small scale problems, with some further simplified versions of New Q-Newton's method Backtracking, and found that the new method significantly improves New Q-Newton's method.
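摘要中的"回溯线搜索"即经典的Armijo型回溯;其通用形式如下(仅为示意,步长收缩率等超参数与论文中新Q-牛顿法的具体设置无关):

# Generic Armijo backtracking line search (illustrative; not the exact
# hyperparameter scheme of New Q-Newton's method Backtracking).
import numpy as np

def backtracking(f, grad, x, d, alpha0=1.0, beta=0.5, c=1e-4):
    """Shrink alpha until the Armijo sufficient-decrease condition
    f(x + alpha*d) <= f(x) + c*alpha*<grad(x), d> holds (d: descent dir)."""
    alpha, fx, gd = alpha0, f(x), np.dot(grad(x), d)
    while f(x + alpha * d) > fx + c * alpha * gd:
        alpha *= beta
    return alpha

f = lambda x: np.sum(x ** 2)
g = lambda x: 2 * x
x = np.array([3.0, -2.0])
step = backtracking(f, g, x, d=-g(x))
print(step, x - step * g(x))   # accepted step and the updated iterate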
【3】 Sequential Stochastic Optimization in Separable Learning Environments 标题:可分离学习环境中的序贯随机优化 链接:https://arxiv.org/abs/2108.09585
作者:R. Reid Bishop,Chelsea C. White III 备注:30 pages (Main), 12 pages (Figures, References, Appendices), 5 figures 摘要:我们考虑一类在不确定条件下的序贯决策问题,可以包含不同类型的监督学习概念。这些问题具有完全观察到的状态过程和部分观察到的调制过程,其中状态过程仅通过观察过程受调制过程的影响,观察过程仅观察调制过程,并且调制过程是外部控制的。我们将这一大类问题建模为一个部分观测的马尔可夫决策过程(POMDP)。调制过程的置信函数是控制不变的,因此将调制过程的估计与状态过程的控制分离。我们将这种特殊结构的POMDP称为可分离的POMDP,或SEP-POMDP,并表明它(i)可以作为广泛应用领域的模型,例如库存控制、金融、医疗保健系统,(ii)从一组完全观察到的MDP中继承价值函数和最优政策结构,(iii)可以作为具有完全指定的模型工件的不确定性下顺序决策的经典模型与未完全指定且需要使用统计和机器学习的预测方法的此类模型之间的桥梁,以及(iv)允许专门的近似解程序。 摘要:We consider a class of sequential decision-making problems under uncertainty that can encompass various types of supervised learning concepts. These problems have a completely observed state process and a partially observed modulation process, where the state process is affected by the modulation process only through an observation process, the observation process only observes the modulation process, and the modulation process is exogenous to control. We model this broad class of problems as a partially observed Markov decision process (POMDP). The belief function for the modulation process is control invariant, thus separating the estimation of the modulation process from the control of the state process. We call this specially structured POMDP the separable POMDP, or SEP-POMDP, and show it (i) can serve as a model for a broad class of application areas, e.g., inventory control, finance, healthcare systems, (ii) inherits value function and optimal policy structure from a set of completely observed MDPs, (iii) can serve as a bridge between classical models of sequential decision making under uncertainty having fully specified model artifacts and such models that are not fully specified and require the use of predictive methods from statistics and machine learning, and (iv) allows for specialized approximate solution procedures.
预测|估计(5篇)
【1】 ChiNet: Deep Recurrent Convolutional Learning for Multimodal Spacecraft Pose Estimation 标题:ChiNet:用于多模态航天器姿态估计的深度递归卷积学习 链接:https://arxiv.org/abs/2108.10282
作者:Duarte Rondao,Nabil Aouf,Mark A. Richardson 机构: Aouf is a Professor of Robotics and Autonomous Systems with theDepartment of Electrical and Electronic Engineering at City, University ofLondon 备注:This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible 摘要:本文提出了一种创新的深度学习管道,该管道通过结合交会序列的时间信息来估计航天器的相对姿态。它利用长短期记忆(LSTM)单元在数据序列建模中的性能来处理卷积神经网络(CNN)主干提取的特征。三种不同的训练策略,遵循从粗到细的漏斗式方法,结合起来促进特征学习,并通过回归改进端到端姿势估计。利用CNN自主确定图像特征表示的能力,将热红外数据与红-绿-蓝(RGB)输入进行融合,从而减轻在可见光波长下对空间物体成像产生的伪影的影响。在一个合成数据集上演示了所提出框架(称为ChiNet)的每一项贡献,并在实验数据上验证了整个管道。 摘要:This paper presents an innovative deep learning pipeline which estimates the relative pose of a spacecraft by incorporating the temporal information from a rendezvous sequence. It leverages the performance of long short-term memory (LSTM) units in modelling sequences of data for the processing of features extracted by a convolutional neural network (CNN) backbone. Three distinct training strategies, which follow a coarse-to-fine funnelled approach, are combined to facilitate feature learning and improve end-to-end pose estimation by regression. The capability of CNNs to autonomously ascertain feature representations from images is exploited to fuse thermal infrared data with red-green-blue (RGB) inputs, thus mitigating the effects of artefacts from imaging space objects in the visible wavelength. Each contribution of the proposed framework, dubbed ChiNet, is demonstrated on a synthetic dataset, and the complete pipeline is validated on experimental data.
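"CNN主干逐帧提特征 + LSTM建模交会序列 + 回归相对姿态"这一流水线可用如下PyTorch骨架示意(通道数、层数、维度均为笔者假设,并非ChiNet的实际配置):

# Structural sketch of a CNN-backbone + LSTM pose regressor (dimensions
# and layer choices are illustrative, not ChiNet's actual configuration).
import torch
import torch.nn as nn

class CnnLstmPose(nn.Module):
    def __init__(self, feat_dim=128, hidden=256):
        super().__init__()
        self.cnn = nn.Sequential(                       # per-frame feature extractor
            nn.Conv2d(4, 16, 3, stride=2), nn.ReLU(),   # assume RGB + thermal = 4 channels
            nn.Conv2d(16, 32, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim))
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 7)                # 3 translation + 4 quaternion

    def forward(self, frames):                          # frames: (B, T, 4, H, W)
        B, T = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(B, T, -1)
        out, _ = self.lstm(feats)                       # temporal modelling
        return self.head(out[:, -1])                    # pose from last time step

pose = CnnLstmPose()(torch.randn(2, 5, 4, 64, 64))
print(pose.shape)  # torch.Size([2, 7])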
【2】 Construction Cost Index Forecasting: A Multi-feature Fusion Approach 标题:工程造价指数预测:一种多特征融合方法 链接:https://arxiv.org/abs/2108.10155
作者:Tianxiang Zhan,Yuanpeng He,Fuyuan Xiao 机构:School of Computer and Information Science, Southwest University, Chongqing, China, School of Big Data and Software Engineering, Chongqing University, Chongqing, China 摘要:建筑成本指数是建筑业的一项重要指标。预测CCI具有重要的现实意义。本文将信息融合与机器学习相结合,提出了一种用于时间序列预测的多特征融合框架。MFF采用滑动窗口算法,并提出一个函数序列将时间序列转换为特征序列进行信息融合。MFF用机器学习代替传统的信息方法,实现信息融合,大大提高了CCI预测效果。MFF对CCI和时间序列预测具有重要意义。 摘要:The construction cost index is an important indicator in the construction industry. Predicting CCI has great practical significance. This paper combines information fusion with machine learning, and proposes a Multi-feature Fusion framework for time series forecasting. MFF uses a sliding window algorithm and proposes a function sequence to convert the time sequence into a feature sequence for information fusion. MFF replaces the traditional information method with machine learning to achieve information fusion, which greatly improves the CCI prediction effect. MFF is of great significance to CCI and time series forecasting.
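摘要所述"滑动窗口 + 函数序列"的转换可用如下假设性示意说明(窗口大小与统计函数均为示例,非论文原设置):

# Hypothetical illustration of the sliding-window "function sequence"
# idea: each window is summarized by several statistics for fusion.
import numpy as np

def to_feature_sequence(series, window=12, funcs=(np.mean, np.std, np.min, np.max)):
    series = np.asarray(series, dtype=float)
    feats = []
    for start in range(len(series) - window + 1):
        w = series[start:start + window]
        feats.append([f(w) for f in funcs])   # one feature vector per window
    return np.array(feats)                     # shape: (n_windows, n_funcs)

cci = np.cumsum(np.random.randn(100)) + 100.0  # synthetic CCI-like series
print(to_feature_sequence(cci).shape)          # (89, 4)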
【3】 Integrating Transductive And Inductive Embeddings Improves Link Prediction Accuracy 标题:融合直推式与归纳式嵌入可提高链接预测精度 链接:https://arxiv.org/abs/2108.10108
作者:Chitrank Gupta,Yash Jain,Abir De,Soumen Chakrabarti 机构:IIT Bombay, India 备注:5 Pages, Accepted by CIKM 2021 摘要:近年来,归纳式图嵌入模型,即图神经网络(GNN),在在线社交网络的链接预测(LP)任务上变得越来越精确。这类网络的性能在很大程度上取决于输入节点特征,而这些特征因网络和应用而异。选择合适的节点特征仍然依赖于具体应用,通常是一个悬而未决的问题。此外,由于隐私和道德问题,个性化节点特征的使用往往受到限制。事实上,来自在线社交网络的许多公开数据不包含任何节点特征(例如人口统计信息)。在这项工作中,我们提供了一个全面的实验分析,表明先利用直推式技术(例如Node2Vec)获得初始节点表示、再交由归纳式节点嵌入技术处理,可以显著提高链接预测的准确性。我们证明,对于各种GNN变体,从Node2Vec获得的节点表示向量可以作为GNN的高质量输入特征,从而提高LP性能。 摘要:In recent years, inductive graph embedding models, viz., graph neural networks (GNNs) have become increasingly accurate at link prediction (LP) in online social networks. The performance of such networks depends strongly on the input node features, which vary across networks and applications. Selecting appropriate node features remains application-dependent and generally an open question. Moreover, owing to privacy and ethical issues, use of personalized node features is often restricted. In fact, many publicly available data from online social networks do not contain any node features (e.g., demography). In this work, we provide a comprehensive experimental analysis which shows that harnessing a transductive technique (e.g., Node2Vec) for obtaining initial node representations, after which an inductive node embedding technique takes over, leads to substantial improvements in link prediction accuracy. We demonstrate that, for a wide variety of GNN variants, node representation vectors obtained from Node2Vec serve as high quality input features to GNNs, thereby improving LP performance.
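该组合可概括为:直推式嵌入(Node2Vec)作为输入特征,归纳式GNN在其上做链接预测。下面用手写的两层GCN传播来示意(以随机矩阵代替真实的Node2Vec输出,归一化方式取常用的对称归一化):

# Sketch: feeding (precomputed) Node2Vec embeddings into a hand-rolled
# GCN. The random X stands in for real Node2Vec output.
import torch

N, D, H = 5, 16, 8
X = torch.randn(N, D)                    # pretend: Node2Vec embeddings
A = torch.eye(N)                         # adjacency with self-loops
A[0, 1] = A[1, 0] = A[1, 2] = A[2, 1] = 1.0
deg = A.sum(1)
A_hat = A / torch.sqrt(deg[:, None] * deg[None, :])   # D^-1/2 A D^-1/2

W1, W2 = torch.randn(D, H) * 0.1, torch.randn(H, H) * 0.1
Z = torch.relu(A_hat @ X @ W1)           # GCN layer 1
Z = A_hat @ Z @ W2                       # GCN layer 2: node embeddings

score = (Z[0] * Z[2]).sum()              # dot-product link score for pair (0, 2)
print(score.item())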
【4】 Evolutionary Ensemble Learning for Multivariate Time Series Prediction 标题:进化集成学习在多变量时间序列预测中的应用 链接:https://arxiv.org/abs/2108.09659
作者:Hui Song,A. K. Qin,Flora D. Salim 摘要:多变量时间序列(MTS)预测在金融、能源和交通等领域发挥着关键作用,其中每个单独的时间序列对应于从特定数据源(即所谓的通道)收集的数据。构建MTS预测模型(PM)的典型流程包括从所有可用通道中选择一个子集,从所选通道中提取特征,并基于提取的特征构建PM,其中每个组件都涉及某些优化任务,即通道选择、特征提取(FE)方法,和PM,以及所选FE方法和PM的配置。因此,追求最佳预测性能对应于通过解决所有涉及的优化问题来优化管道。由于解决方案空间的巨大性,这是一项非常重要的任务。与大多数现有的针对优化管道某些组件的工作不同,我们提出了一种新的进化集成学习框架来整体优化整个管道。在该框架中,将特定的管道编码为候选解,并在不同的种群规模下应用多目标进化算法生成多个Pareto最优集(POSs)。最后,设计了选择性集成学习,从POSs中选择最优解子集,并使用贪婪序列选择和最小二乘法将其组合,以产生最终预测。我们实现了所提出的框架,并在两个实际应用中评估了我们的实现,即用电量预测和空气质量预测。与最新技术的性能比较表明了该方法的优越性。 摘要:Multivariate time series (MTS) prediction plays a key role in many fields such as finance, energy and transport, where each individual time series corresponds to the data collected from a certain data source, so-called channel. A typical pipeline of building an MTS prediction model (PM) consists of selecting a subset of channels among all available ones, extracting features from the selected channels, and building a PM based on the extracted features, where each component involves certain optimization tasks, i.e., selection of channels, feature extraction (FE) methods, and PMs as well as configuration of the selected FE method and PM. Accordingly, pursuing the best prediction performance corresponds to optimizing the pipeline by solving all of its involved optimization problems. This is a non-trivial task due to the vastness of the solution space. Different from most of the existing works which target at optimizing certain components of the pipeline, we propose a novel evolutionary ensemble learning framework to optimize the entire pipeline in a holistic manner. In this framework, a specific pipeline is encoded as a candidate solution and a multi-objective evolutionary algorithm is applied under different population sizes to produce multiple Pareto optimal sets (POSs). Finally, selective ensemble learning is designed to choose the optimal subset of solutions from the POSs and combine them to yield final prediction by using greedy sequential selection and least square methods. We implement the proposed framework and evaluate our implementation on two real-world applications, i.e., electricity consumption prediction and air quality prediction. The performance comparison with state-of-the-art techniques demonstrates the superiority of the proposed approach.
【5】 Reservoir Computing with Diverse Timescales for Prediction of Multiscale Dynamics 标题:用于多尺度动态预测的不同时间尺度的油藏计算 链接:https://arxiv.org/abs/2108.09446
作者:Gouhei Tanaka,Tadayoshi Matsumori,Hiroaki Yoshida,Kazuyuki Aihara 机构:International Research Center for Neurointelligence, Department of Electrical Engineering and Information Systems, Graduate School of Engineering, The University of Tokyo, Tokyo ,-, Japan. 备注:6 pages, 5 figures, supplemental material (10 pages) 摘要:机器学习方法最近被用作动力系统物理/数学建模方法的替代或辅助手段。为了开发一种高效的多尺度动力学建模和预测的机器学习方法,我们提出了一种基于非均匀漏积分神经元的递归网络的不同时间尺度的储层计算模型。在快-慢混沌动力学系统的预测任务中,包括其子系统动力学时间尺度上的巨大差距,我们证明,所提出的模型比现有的标准模型具有更高的潜力,即使不优化泄漏率参数,也能产生与最佳标准模型相当的性能。我们的分析表明,通过模型训练,从储层动力学中适当灵活地选择了产生目标动力学各组成部分所需的时间尺度。 摘要:Machine learning approaches have recently been leveraged as a substitute or an aid for physical/mathematical modeling approaches to dynamical systems. To develop an efficient machine learning method dedicated to modeling and prediction of multiscale dynamics, we propose a reservoir computing model with diverse timescales by using a recurrent network of heterogeneous leaky integrator neurons. In prediction tasks with fast-slow chaotic dynamical systems including a large gap in timescales of their subsystems dynamics, we demonstrate that the proposed model has a higher potential than the existing standard model and yields a performance comparable to the best one of the standard model even without an optimization of the leak rate parameter. Our analysis reveals that the timescales required for producing each component of target dynamics are appropriately and flexibly selected from the reservoir dynamics by model training.
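带泄漏积分神经元的储层状态更新通常写作 $x(t+1)=(1-a)\odot x(t)+a\odot\tanh(Wx(t)+W_{in}u(t))$;"异质时间尺度"即让各神经元的泄漏率 $a_i$ 互不相同。下面是该更新式的numpy示意(规模、谱半径与泄漏率分布均为假设值):

# Reservoir with heterogeneous leak rates (illustrative sizes/scaling).
import numpy as np

rng = np.random.default_rng(0)
N = 200
W = rng.normal(0, 1, (N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius ~0.9
W_in = rng.normal(0, 0.5, (N, 1))
a = np.logspace(-2, 0, N)                 # leak rates spanning timescales

def step(x, u):
    # leaky-integrator update: small-a neurons retain long memory (slow),
    # large-a neurons react quickly (fast)
    return (1 - a) * x + a * np.tanh(W @ x + (W_in * u).ravel())

x = np.zeros(N)
for t in range(100):
    x = step(x, np.sin(0.1 * t))          # drive with a toy input
print(x[:5])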
其他神经网络|深度学习|模型|建模(23篇)
【1】 ReSpawn: Energy-Efficient Fault-Tolerance for Spiking Neural Networks considering Unreliable Memories 标题:REPAWN:考虑不可靠记忆的尖峰神经网络能量高效容错 链接:https://arxiv.org/abs/2108.10271
作者:Rachmad Vidya Wicaksana Putra,Muhammad Abdullah Hanif,Muhammad Shafique 机构:∗†Technische Universit¨at Wien (TU Wien), Vienna, Austria, †‡New York University Abu Dhabi (NYUAD), Abu Dhabi, United Arab Emirates 备注:To appear at the 40th IEEE/ACM International Conference on Computer-Aided Design (ICCAD), November 2021, Virtual Event 摘要:由于其受生物启发的计算能力,尖峰神经网络(SNN)具有低能量和无监督学习能力的潜力。然而,如果在存储器中存在硬件引起的故障(可能来自制造缺陷或电压引起的近似误差)的情况下执行它们的处理,则它们可能会受到精度降低的影响。由于最近的工作仍然集中于SNN中的故障建模和随机故障注入,因此SNN硬件架构中的内存故障对准确性的影响以及相应的故障缓解技术没有得到彻底的探讨。为此,我们提出了ReSpawn,这是一个新的框架,用于缓解弹性和节能SNN的片外和片内存储器中故障的负面影响。ReSpawn的关键机制是:(1)分析SNNs的容错性;(2)通过(a)存储器中的故障感知映射(FAM)和(b)故障感知训练和映射(FATM)提高SNN容错性。如果训练数据集不完全可用,则通过有效的位洗牌技术使用FAM,将有效位放在非故障存储单元上,将不重要位放在故障存储单元上,同时最小化内存访问能量。同时,如果训练数据集完全可用,则在数据映射和训练过程中,通过考虑故障存储单元,采用FATM。实验结果表明,与没有故障缓解技术的基线SNN相比,使用故障感知映射方案的ReSpawn在没有重新训练的情况下,对于包含900个神经元的网络,准确率提高了70%。 摘要:Spiking neural networks (SNNs) have shown a potential for having low energy with unsupervised learning capabilities due to their biologically-inspired computation. However, they may suffer from accuracy degradation if their processing is performed under the presence of hardware-induced faults in memories, which can come from manufacturing defects or voltage-induced approximation errors. Since recent works still focus on the fault-modeling and random fault injection in SNNs, the impact of memory faults in SNN hardware architectures on accuracy and the respective fault-mitigation techniques are not thoroughly explored. Toward this, we propose ReSpawn, a novel framework for mitigating the negative impacts of faults in both the off-chip and on-chip memories for resilient and energy-efficient SNNs. The key mechanisms of ReSpawn are: (1) analyzing the fault tolerance of SNNs; and (2) improving the SNN fault tolerance through (a) fault-aware mapping (FAM) in memories, and (b) fault-aware training-and-mapping (FATM). If the training dataset is not fully available, FAM is employed through efficient bit-shuffling techniques that place the significant bits on the non-faulty memory cells and the insignificant bits on the faulty ones, while minimizing the memory access energy. Meanwhile, if the training dataset is fully available, FATM is employed by considering the faulty memory cells in the data mapping and training processes. The experimental results show that, compared to the baseline SNN without fault-mitigation techniques, ReSpawn with a fault-aware mapping scheme improves the accuracy by up to 70% for a network with 900 neurons without retraining.
【2】 Molecular Design Based on Artificial Neural Networks, Integer Programming and Grid Neighbor Search 标题:基于人工神经网络、整数规划和网格邻域搜索的分子设计 链接:https://arxiv.org/abs/2108.10266
作者:Naveed Ahmed Azam,Jianshen Zhu,Kazuya Haraguchi,Liang Zhao,Hiroshi Nagamochi,Tatsuya Akutsu 机构: Department of Applied Mathematics and Physics, Kyoto University, Kyoto ,-, Japan, Graduate School of Advanced Integrated Studies in Human Survavibility (Shishu-Kan), Kyoto Univer- 备注:arXiv admin note: substantial text overlap with arXiv:2107.02381 摘要:最近,人们提出了一种新的框架,利用人工神经网络和混合整数线性规划设计具有所需化学性质的化合物的分子结构。在该框架中,具有目标化学值的化学图被推断为一个混合整数线性规划的可行解,该混合整数线性规划表示预测函数和对图结构的其他要求。本文提出了一种在搜索空间中通过搜索输出化学图的邻域生成混合整数线性规划其他可行解的方法。该过程作为一个新的构建块组合在框架中。我们的计算实验结果表明,所提出的方法可以产生额外数量的新的化学图,最多有50个非氢原子。 摘要:A novel framework has recently been proposed for designing the molecular structure of chemical compounds with a desired chemical property using both artificial neural networks and mixed integer linear programming. In the framework, a chemical graph with a target chemical value is inferred as a feasible solution of a mixed integer linear program that represents a prediction function and other requirements on the structure of graphs. In this paper, we propose a procedure for generating other feasible solutions of the mixed integer linear program by searching the neighbor of output chemical graph in a search space. The procedure is combined in the framework as a new building block. The results of our computational experiments suggest that the proposed method can generate an additional number of new chemical graphs with up to 50 non-hydrogen atoms.
【3】 A New Constructive Heuristic driven by Machine Learning for the Traveling Salesman Problem 标题:机器学习驱动的旅行商问题的一种新的构造性启发式算法 链接:https://arxiv.org/abs/2108.10224
作者:Umberto Junior Mele,Luca Maria Gambardella,Roberto Montemanni 机构:Department of Sciences and Methods for Engineering, University of Modena and Reggio Emilia, Reggio Emilia, Italy 摘要:最近使用机器学习(ML)来解决旅行商问题(TSP)的系统在试图扩展到具有数百个顶点的真实场景时会出现问题。候选列表(CLs)的使用被提出来解决这些问题。该过程允许在创建解的过程中限制搜索空间,从而减少求解器的计算负担。到目前为止,ML被用于创建CLs以及这些CLs边上的值,以表达解插入时的ML偏好。尽管前景看好,但这些系统并没有明确限制ML学习什么、以及如何参与解的创建,因而带来了一些泛化问题。因此,受探索性和统计研究的启发,在这项工作中,我们转而使用机器学习模型,仅对高概率的边确认其加入解中。以高概率边的CLs作为输入,ML负责区分这些边属于最优解与不属于最优解的情况。这种策略可以实现更好的泛化,并在机器学习和搜索技术之间建立有效的平衡。我们的ML构造性启发式算法在小实例上训练,之后也能够在不损失质量的情况下为大规模问题生成解。我们将我们的结果与经典的构造性启发式算法进行了比较,在多达1748个城市的TSPLIB实例上显示出良好的性能。尽管我们的启发式算法包含代价较高的常数时间操作,但我们证明了训练后构造解的最坏情况计算复杂度为 $O(n^2 \log n^2)$,其中 $n$ 是TSP实例中的顶点数。 摘要:Recent systems applying Machine Learning (ML) to solve the Traveling Salesman Problem (TSP) exhibit issues when they try to scale up to real case scenarios with several hundred vertices. The use of Candidate Lists (CLs) has been brought up to cope with the issues. The procedure allows to restrict the search space during solution creation, consequently reducing the solver computational burden. So far, ML were engaged to create CLs and values on the edges of these CLs expressing ML preferences at solution insertion. Although promising, these systems do not clearly restrict what the ML learns and does to create solutions, bringing with them some generalization issues. Therefore, motivated by exploratory and statistical studies, in this work we instead use a machine learning model to confirm the addition in the solution just for high probable edges. CLs of the high probable edges are employed as input, and the ML is in charge of distinguishing cases where such edges are in the optimal solution from those where they are not. This strategy enables a better generalization and creates an efficient balance between machine learning and searching techniques. Our ML-Constructive heuristic is trained on small instances. Then, it is able to produce solutions, without losing quality, to large problems as well. We compare our results with classic constructive heuristics, showing good performances for TSPLIB instances up to 1748 cities. Although our heuristic exhibits an expensive constant time operation, we prove that the computational complexity in the worst-case scenario, for the solution construction after training, is $O(n^2 \log n^2)$, $n$ being the number of vertices in the TSP instance.
【4】 A Learning-Based Fast Uplink Grant for Massive IoT via Support Vector Machines and Long Short-Term Memory 标题:基于支持向量机和长短期记忆的海量物联网基于学习的快速上行授权 链接:https://arxiv.org/abs/2108.10070
作者:Eslam Eldeeb,Mohammad Shehab,Hirley Alves 摘要:当前的随机接入(RA)分配技术在服务于大规模机器类型通信(mMTC)应用时存在拥塞和高信令开销的问题。为此,3GPP提出需要使用快速上行链路授权(FUG)分配,以减少时延并提高具有严格QoS约束的智能物联网(IoT)应用的可靠性。我们提出了一种新的基于支持向量机(SVM)的FUG分配方案:首先,利用SVM分类器对MTC设备进行优先级排序;其次,采用LSTM结构进行流量预测,并辅以校正技术来克服预测误差。两者的结果结合起来,实现了在平均时延和总吞吐量方面高效的资源调度器。应用混合报警和常规流量的耦合马尔可夫调制泊松过程(CMMPP)流量模型,将所提出的FUG分配与其他现有分配技术进行比较。此外,基于CMMPP的扩展流量模型被用于在更密集的网络中评估所提出的算法。我们使用从Numenta异常基准(NAB)数据库收集的实时测量数据来测试所提出的方案。仿真结果表明,在使用有限资源服务于目标大规模和关键MTC应用时,所提出的模型达到了98%的预测精度,获得了最高的吞吐量和1 ms量级的最低接入时延,优于现有的RA分配方案。 摘要:The current random access (RA) allocation techniques suffer from congestion and high signaling overhead while serving massive machine type communication (mMTC) applications. To this end, 3GPP introduced the need to use fast uplink grant (FUG) allocation in order to reduce latency and increase reliability for smart internet-of-things (IoT) applications with strict QoS constraints. We propose a novel FUG allocation based on support vector machines (SVM). First, MTC devices are prioritized using an SVM classifier. Second, an LSTM architecture is used for traffic prediction and correction techniques to overcome prediction errors. Both results are used to achieve an efficient resource scheduler in terms of the average latency and total throughput. A Coupled Markov Modulated Poisson Process (CMMPP) traffic model with mixed alarm and regular traffic is applied to compare the proposed FUG allocation to other existing allocation techniques. In addition, an extended traffic model based on CMMPP is used to evaluate the proposed algorithm in a more dense network. We test the proposed scheme using real-time measurement data collected from the Numenta Anomaly Benchmark (NAB) database. Our simulation results show the proposed model outperforms the existing RA allocation schemes by achieving the highest throughput and the lowest access delay of the order of 1 ms, with a prediction accuracy of 98%, when serving the target massive and critical MTC applications with a limited number of resources.
【5】 Deep Relational Metric Learning 标题:深度关系度量学习 链接:https://arxiv.org/abs/2108.10026
作者:Wenzhao Zheng,Borui Zhang,Jiwen Lu,Jie Zhou 机构:Department of Automation, Tsinghua University, China, Beijing National Research Center for Information Science and Technology, China 备注:Accepted to ICCV 2021. Source code available at this https URL 摘要:提出了一种用于图像聚类和检索的深度关系度量学习(DRML)框架。大多数现有的深度度量学习方法学习嵌入空间,其总体目标是增加类间距离和减少类内距离。然而,传统的度量学习损失通常会抑制组内变化,这可能有助于识别未知类的样本。为了解决这个问题,我们建议自适应地学习从不同方面表征图像的特征集合,以建模类内和类内分布。我们进一步使用关系模块来捕获集合中每个特征之间的相关性,并构造一个图来表示图像。然后,我们在图上执行关系推理来集成集合,并获得关系感知嵌入来度量相似度。在广泛使用的CUB-200-2011、Cars196和斯坦福在线产品数据集上进行的大量实验表明,我们的框架改进了现有的深度度量学习方法,并取得了非常有竞争力的结果。 摘要:This paper presents a deep relational metric learning (DRML) framework for image clustering and retrieval. Most existing deep metric learning methods learn an embedding space with a general objective of increasing interclass distances and decreasing intraclass distances. However, the conventional losses of metric learning usually suppress intraclass variations which might be helpful to identify samples of unseen classes. To address this problem, we propose to adaptively learn an ensemble of features that characterizes an image from different aspects to model both interclass and intraclass distributions. We further employ a relational module to capture the correlations among each feature in the ensemble and construct a graph to represent an image. We then perform relational inference on the graph to integrate the ensemble and obtain a relation-aware embedding to measure the similarities. Extensive experiments on the widely-used CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate that our framework improves existing deep metric learning methods and achieves very competitive results.
【6】 TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment 标题:TACo:用于视频-文本对齐的标记感知级联对比学习 链接:https://arxiv.org/abs/2108.09980
作者:Jianwei Yang,Yonatan Bisk,Jianfeng Gao 机构:Microsoft Research, Carnegie Mellon University 备注:Accepted by ICCV 2021 摘要:对比学习被广泛用于训练基于变换器的视觉语言模型,用于视频文本对齐和多模态表示学习。本文提出了一种新的标记感知级联对比学习(TACo)算法,该算法利用两种新技术改进对比学习。第一种是标记感知对比损失,它是通过考虑单词的句法类别来计算的。这是因为观察到,对于视频-文本对,文本中的内容词(如名词和动词)比虚词更可能与视频中的视觉内容对齐。其次,采用级联采样方法生成一组小的硬负示例,用于有效估计多模态融合层的损耗。为了验证TACo的有效性,在我们的实验中,我们对一组下游任务的预训练模型进行了微调,包括文本视频检索(YouCook2、MSR-VTT和ActivityNet)、视频动作步骤定位(CrossTask)、视频动作分段(COIN)。结果表明,与以前的方法相比,我们的模型在不同的实验环境中取得了一致的改进,在YouCook2、MSR-VTT和ActivityNet三个公共文本视频检索基准上建立了最新的技术水平。 摘要:Contrastive learning has been widely used to train transformer-based vision-language models for video-text alignment and multi-modal representation learning. This paper presents a new algorithm called Token-Aware Cascade contrastive learning (TACo) that improves contrastive learning using two novel techniques. The first is the token-aware contrastive loss which is computed by taking into account the syntactic classes of words. This is motivated by the observation that for a video-text pair, the content words in the text, such as nouns and verbs, are more likely to be aligned with the visual contents in the video than the function words. Second, a cascade sampling method is applied to generate a small set of hard negative examples for efficient loss estimation for multi-modal fusion layers. To validate the effectiveness of TACo, in our experiments we finetune pretrained models for a set of downstream tasks including text-video retrieval (YouCook2, MSR-VTT and ActivityNet), video action step localization (CrossTask), video action segmentation (COIN). The results show that our models attain consistent improvements across different experimental settings over previous methods, setting new state-of-the-art on three public text-video retrieval benchmarks of YouCook2, MSR-VTT and ActivityNet.
【7】 Convolutional Filtering and Neural Networks with Non Commutative Algebras 标题:卷积滤波与非交换代数神经网络 链接:https://arxiv.org/abs/2108.09923
作者:Alejandro Parada-Mayorga,Alejandro Ribeiro 机构: it was found that stability is a property that isDepartment of Electrical and Systems Engineering, University of Pennsylvania 摘要:本文给出了基于非交换代数的代数神经网络的稳定性结果。ALGN是堆叠的分层结构,每一层与由代数、向量空间和同态确定的代数信号模型(ASM)相关联。信号被建模为向量空间的元素,滤波器是代数中的元素,而同态提供了滤波器作为具体算子的实现。研究了非交换代数中的代数滤波器对同态扰动的稳定性,并给出了保证稳定性的条件。我们证明了移位算子之间以及移位和扰动之间的交换性并不影响体系结构的稳定性。这就回答了移位不变性是否是卷积结构保证稳定性的必要属性的问题。此外,我们还表明,尽管非交换代数中滤波器的频率响应与交换代数中滤波器的频率响应存在显著差异,但它们对稳定滤波器的导数具有类似的行为。 摘要:In this paper we provide stability results for algebraic neural networks (AlgNNs) based on non commutative algebras. AlgNNs are stacked layered structures with each layer associated to an algebraic signal model (ASM) determined by an algebra, a vector space, and a homomorphism. Signals are modeled as elements of the vector space, filters are elements in the algebra, while the homomorphism provides a realization of the filters as concrete operators. We study the stability of the algebraic filters in non commutative algebras to perturbations on the homomorphisms, and we provide conditions under which stability is guaranteed. We show that the commutativity between shift operators and between shifts and perturbations does not affect the property of an architecture of being stable. This provides an answer to the question of whether shift invariance was a necessary attribute of convolutional architectures to guarantee stability. Additionally, we show that although the frequency responses of filters in non commutative algebras exhibit substantial differences with respect to filters in commutative algebras, their derivatives for stable filters have a similar behavior.
【8】 Genetic Programming for Manifold Learning: Preserving Local Topology 标题:流形学习的遗传规划:保持局部拓扑 链接:https://arxiv.org/abs/2108.09914
作者:Andrew Lensen,Bing Xue,Mengjie Zhang 机构:This work was supported in part by the Marsden Fund of the New ZealandGovernment under Contracts VUW 19 1 3 and VUW 19 1 4 and the UniversityResearch Fund at Te Herenga Waka–Victoria University of Wellington undergrant number 2 26 16 1 4 16 4 备注:Accepted by IEEE Transactions on Evolutionary Computation, 2021 摘要:在数据集日益庞大的今天,流形学习方法是一种非常宝贵的工具。流形学习算法可以通过保留原始数据最重要结构的非线性变换发现高维数据集的低维表示(嵌入)。最先进的流形学习方法直接优化嵌入,而无需在原始空间和发现的嵌入空间之间进行映射。这使得可解释性——探索性数据分析的关键要求——几乎不可能实现。最近,遗传规划通过将函数映射从原始空间演化到嵌入空间,成为一种非常有前途的流形学习方法。然而,基于遗传编程的流形学习一直难以与其他方法的性能相匹配。在这项工作中,我们提出了一种新的方法来使用遗传规划的流形学习,保持局部拓扑。这有望显著提高局部邻域结构(拓扑)至关重要的任务的性能。我们将我们提出的方法与各种基线流形学习方法进行比较,发现它通常优于其他方法,包括比以前的遗传编程方法有明显的改进。鉴于进化映射的潜在可解释性和可重用性,这些结果尤其有希望。 摘要:Manifold learning methods are an invaluable tool in today's world of increasingly huge datasets. Manifold learning algorithms can discover a much lower-dimensional representation (embedding) of a high-dimensional dataset through non-linear transformations that preserve the most important structure of the original data. State-of-the-art manifold learning methods directly optimise an embedding without mapping between the original space and the discovered embedded space. This makes interpretability - a key requirement in exploratory data analysis - nearly impossible. Recently, genetic programming has emerged as a very promising approach to manifold learning by evolving functional mappings from the original space to an embedding. However, genetic programming-based manifold learning has struggled to match the performance of other approaches. In this work, we propose a new approach to using genetic programming for manifold learning, which preserves local topology. This is expected to significantly improve performance on tasks where local neighbourhood structure (topology) is paramount. We compare our proposed approach with various baseline manifold learning methods and find that it often outperforms other methods, including a clear improvement over previous genetic programming approaches. These results are particularly promising, given the potential interpretability and reusability of the evolved mappings.
【9】 Convex Latent Effect Logit Model via Sparse and Low-rank Decomposition 标题:基于稀疏低秩分解的凸潜效应Logit模型 链接:https://arxiv.org/abs/2108.09859
作者:Hongyuan Zhan,Kamesh Madduri,Venkataraman Shankar 机构:Facebook Inc., Penn State University, Department of Computer Science and, Texas Tech University, Department of Civil, Environmental, and, Construction Engineering 摘要:在这篇文章中,我们提出了一个学习logistic回归模型(logit)的凸公式,该模型对子种群具有潜在的异质性影响。在交通运输领域,logistic回归及其变体通常被解释为效用理论下的离散选择模型(McFadden,2001)。logit模型在交通领域的两个突出应用是交通事故分析和选择建模。在这些应用程序中,研究人员通常希望了解并捕获相同事故或选择情景下的个体变化。混合效应logistic回归(mixed logit)是交通研究人员采用的一种流行模型。为了估计混合logit参数的分布,需要解决一个具有嵌套高维积分的非凸优化问题。基于仿真的优化通常用于解决混合logit参数估计问题。尽管这种方法很受欢迎,但用于学习个体异质性的混合logit方法有几个缺点。首先,分布的参数形式需要领域知识和用户强加的假设,尽管这个问题可以通过使用非参数方法在一定程度上解决。其次,混合logit的参数估计会产生优化问题,非参数扩展是非凸的,这导致模型解释不稳定。第三,仿真辅助估计中的仿真规模缺乏有限样本的理论保证,在实践中有点随意选择。为了解决这些问题,我们开发了一种公式,在保持凸性的同时对潜在的个体异质性进行建模,并避免了基于模拟的近似。我们的设置基于将参数分解为人口中的稀疏同质部分和每个个体的低阶异质部分。 摘要:In this paper, we propose a convex formulation for learning logistic regression model (logit) with latent heterogeneous effect on sub-population. In transportation, logistic regression and its variants are often interpreted as discrete choice models under utility theory (McFadden, 2001). Two prominent applications of logit models in the transportation domain are traffic accident analysis and choice modeling. In these applications, researchers often want to understand and capture the individual variation under the same accident or choice scenario. The mixed effect logistic regression (mixed logit) is a popular model employed by transportation researchers. To estimate the distribution of mixed logit parameters, a non-convex optimization problem with nested high-dimensional integrals needs to be solved. Simulation-based optimization is typically applied to solve the mixed logit parameter estimation problem. Despite its popularity, the mixed logit approach for learning individual heterogeneity has several downsides. First, the parametric form of the distribution requires domain knowledge and assumptions imposed by users, although this issue can be addressed to some extent by using a non-parametric approach. Second, the optimization problems arise from parameter estimation for mixed logit and the non-parametric extensions are non-convex, which leads to unstable model interpretation. Third, the simulation size in simulation-assisted estimation lacks finite-sample theoretical guarantees and is chosen somewhat arbitrarily in practice. To address these issues, we are motivated to develop a formulation that models the latent individual heterogeneity while preserving convexity, and avoids the need for simulation-based approximation. Our setup is based on decomposing the parameters into a sparse homogeneous component in the population and low-rank heterogeneous parts for each individual.
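按摘要的分解思路,"稀疏同质部分 + 低秩异质部分"的一个示意性凸目标可写为(记号为笔者示意,非论文原式):$\min_{\beta,\Delta}\ \sum_{i=1}^{n}\log\bigl(1+\exp(-y_i\, x_i^{\top}(\beta+\delta_i))\bigr)+\lambda_1\|\beta\|_1+\lambda_2\|\Delta\|_{*}$,其中 $\beta$ 为总体同质系数,$\delta_i$ 为个体 $i$ 的异质偏移,$\Delta=[\delta_1,\dots,\delta_n]$;$\ell_1$ 范数促使 $\beta$ 稀疏,核范数 $\|\Delta\|_{*}$ 促使异质部分低秩,两项惩罚均为凸,从而保持整体凸性。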
【10】 Temporal Network Embedding via Tensor Factorization 标题:基于张量分解的时态网络嵌入 链接:https://arxiv.org/abs/2108.09837
作者:Jing Ma,Qiuchen Zhang,Jian Lou,Li Xiong,Joyce C. Ho 机构:Emory University,Xidian University 备注:To appear in CIKM 2021 摘要:静态图结构数据的表示学习对许多实际应用产生了重大影响。然而,人们对时间网络的演化性质关注较少,在这种网络中,边缘往往随时间而变化。这种时间网络的嵌入应该同时编码图形结构信息和时间演化模式。现有的学习时间演化网络表示的方法无法捕捉到时间的相互依赖性。在本文中,我们提出了一种基于张量分解的时态网络表示学习新方法Toffee。我们的方法利用张量-张量积算子对交叉时间信息进行编码,从而可以捕获进化网络中的周期性变化。实验结果表明,在多个真实时间网络上,Toffee算法在生成链路预测任务的有效嵌入方面优于现有方法。 摘要:Representation learning on static graph-structured data has shown a significant impact on many real-world applications. However, less attention has been paid to the evolving nature of temporal networks, in which the edges are often changing over time. The embeddings of such temporal networks should encode both graph-structured information and the temporally evolving pattern. Existing approaches in learning temporally evolving network representations fail to capture the temporal interdependence. In this paper, we propose Toffee, a novel approach for temporal network representation learning based on tensor decomposition. Our method exploits the tensor-tensor product operator to encode the cross-time information, so that the periodic changes in the evolving networks can be captured. Experimental results demonstrate that Toffee outperforms existing methods on multiple real-world temporal networks in generating effective embeddings for the link prediction tasks.
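摘要中的张量-张量积(t-product)通常按"沿第三维FFT、逐频率切片做矩阵乘、再逆FFT"来计算,下面给出numpy示意(与Toffee的具体实现无关):

# The tensor-tensor product (t-product): FFT along the 3rd (time) axis,
# face-wise matrix products, then inverse FFT. Sketch, not Toffee's code.
import numpy as np

def t_product(A, B):
    """A: (n1, n2, T), B: (n2, n3, T) -> (n1, n3, T)."""
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    Cf = np.einsum('ikt,kjt->ijt', Af, Bf)   # per-frequency matrix product
    return np.real(np.fft.ifft(Cf, axis=2))

A = np.random.rand(4, 3, 5)
B = np.random.rand(3, 2, 5)
print(t_product(A, B).shape)   # (4, 2, 5)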
【11】 Efficient Algorithms for Learning from Coarse Labels 标题:一种从粗标签中学习的高效算法 链接:https://arxiv.org/abs/2108.09805
作者:Dimitris Fotakis,Alkis Kalavasis,Vasilis Kontonis,Christos Tzamos 机构:National Technical University of Athens, University of Wisconsin-Madison 摘要:对于许多学习问题,可能无法访问细粒度标签信息;e、 例如,根据注释者的专业知识,图像可以标记为哈士奇、狗甚至动物。在这项工作中,我们将这些设置形式化,并研究从这些粗糙数据中学习的问题。我们不是观察集合$mathcal{Z}$中的实际标签,而是观察对应于$mathcal{Z}$分区(或分区的混合)的粗略标签。我们的主要算法结果是,当粗数据具有足够的信息量时,从细粒度标签中学习的任何问题基本上都可以有效地学习。我们通过对仅给出粗标签的细粒度标签上的回答统计查询(SQ)的一般化约简来获得我们的结果。所需的粗标签数量在多项式上取决于粗化引起的信息失真和精细标签的数量$|mathcal{Z}|$。我们还研究了(无穷多个)实值标签的情况,重点是截尾和截断统计中的一个中心问题:粗数据的高斯平均估计。当划分中的集合是凸的时,我们给出了一个有效的算法,并且证明了即使对于非常简单的非凸集,问题也是NP难的。 摘要:For many learning problems one may not have access to fine grained label information; e.g., an image can be labeled as husky, dog, or even animal depending on the expertise of the annotator. In this work, we formalize these settings and study the problem of learning from such coarse data. Instead of observing the actual labels from a set $mathcal{Z}$, we observe coarse labels corresponding to a partition of $mathcal{Z}$ (or a mixture of partitions). Our main algorithmic result is that essentially any problem learnable from fine grained labels can also be learned efficiently when the coarse data are sufficiently informative. We obtain our result through a generic reduction for answering Statistical Queries (SQ) over fine grained labels given only coarse labels. The number of coarse labels required depends polynomially on the information distortion due to coarsening and the number of fine labels $|mathcal{Z}|$. We also investigate the case of (infinitely many) real valued labels focusing on a central problem in censored and truncated statistics: Gaussian mean estimation from coarse data. We provide an efficient algorithm when the sets in the partition are convex and establish that the problem is NP-hard even for very simple non-convex sets.
【12】 Wind Power Projection using Weather Forecasts by Novel Deep Neural Networks 标题:基于新型深度神经网络的天气预报风电预测 链接:https://arxiv.org/abs/2108.09797
作者:Alagappan Swaminathan,Venkatakrishnan Sutharsan,Tamilselvi Selvaraj 机构:Selvaraj, Associate Professor, SSN College of Engineering, Kalavakkam, Chennai 备注:27 pages, 12 figures, 12 tables, 7 equations, 22 references 摘要:从传统的能源生产方法过渡到可再生能源生产需要更好地预测即将到来的可再生能源供应。在风力发电生产中,由于风力的间歇性,预测产量的误差是不可能消除的。对于成功的电网整合,了解预测风力发电量时产生的不确定性并利用这些信息建立准确可靠的预测至关重要。这可以通过观察风力发电量的波动以及不同参数(如风速、温度和风向)的变化来实现,并得出相同参数的函数依赖关系。使用优化的机器学习算法,可以在观测中发现模糊模式并获得有意义的数据,然后可以使用这些数据准确预测风力发电需求。利用Gamesa位于Bableshwar的风电场提供的所需数据,本文探讨了利用功率曲线计算风电预测的参数模型和非参数模型的使用。对获得的结果进行比较,以更好地理解所用模型的准确性,并根据给定数据集确定预测风力发电量的最合适模型。 摘要:The transition from conventional methods of energy production to renewable energy production necessitates better prediction models of the upcoming supply of renewable energy. In wind power production, error in forecasting production is impossible to negate owing to the intermittence of wind. For successful power grid integration, it is crucial to understand the uncertainties that arise in predicting wind power production and use this information to build an accurate and reliable forecast. This can be achieved by observing the fluctuations in wind power production with changes in different parameters such as wind speed, temperature, and wind direction, and deriving functional dependencies for the same. Using optimized machine learning algorithms, it is possible to find obscured patterns in the observations and obtain meaningful data, which can then be used to accurately predict wind power requirements . Utilizing the required data provided by the Gamesa's wind farm at Bableshwar, the paper explores the use of both parametric and the non-parametric models for calculating wind power prediction using power curves. The obtained results are subject to comparison to better understand the accuracy of the utilized models and to determine the most suitable model for predicting wind power production based on the given data set.
【13】 A universally consistent learning rule with a universally monotone error 标题:具有普遍单调错误的普遍一致的学习规则 链接:https://arxiv.org/abs/2108.09733
作者:Vladimir Pestov 机构:Departamento de Matemática, Universidade Federal de Santa Catarina, Campus Universitário Trindade, Florianópolis-SC, Brasil; Department of Mathematics and Statistics, University of Ottawa, STEM Complex 备注:latex, 26 pp., 4 figures 摘要:我们提出了一个普遍一致的学习规则,其期望误差在每个数据分布下都随样本量单调不增。此类规则是否存在的问题由Devroye、Györfi和Lugosi于1996年提出(他们称这类规则为"smart")。我们的规则是完全确定性的,是一种利用循环序在任意域(标准Borel空间)中构造的依赖于数据的分区规则。其核心思想是在每一步仅划分那些标签经验多样性足够的循环区间,从而避开误差函数为凸的区域。 摘要:We present a universally consistent learning rule whose expected error is monotone non-increasing with the sample size under every data distribution. The question of existence of such rules was brought up in 1996 by Devroye, Györfi and Lugosi (who called them "smart"). Our rule is fully deterministic, a data-dependent partitioning rule constructed in an arbitrary domain (a standard Borel space) using a cyclic order. The central idea is to only partition at each step those cyclic intervals that exhibit a sufficient empirical diversity of labels, thus avoiding a region where the error function is convex.
【14】 Evaluation Methodologies for Code Learning Tasks 标题:代码学习任务的评估方法 链接:https://arxiv.org/abs/2108.09619
作者:Pengyu Nie,Jiyang Zhang,Junyi Jessy Li,Raymond J. Mooney,Milos Gligoric 机构:The University of Texas at Austin 摘要:人们对开发用于代码学习任务的机器学习(ML)模型越来越感兴趣,例如注释生成和方法命名。尽管ML模型的有效性大幅提高,但评估方法,即人们将数据集划分为训练集、验证集和测试集的方式,并没有得到很好的设计。具体而言,在评估过程中,先前关于上述主题的工作没有考虑到代码和注释的时间戳(例如,测试集中的示例可能来自2010年,训练集中的示例可能来自2020年)。这可能导致评估结果与ML模型的预期用例不一致。在本文中,我们形式化了一种新的时间分段评估方法,以及文献中常用的两种方法:混合项目和交叉项目。我们认为,时间分段方法是最现实的。我们还描述了ML模型的各种用例,并提供了使用方法来评估每个用例的指南。为了评估方法的影响,我们收集了一个带有时间戳的代码注释对数据集,以训练和评估用于注释生成和方法命名任务的几个最近的代码学习ML模型。我们的结果表明,不同的方法可能导致相互冲突和不一致的结果。我们邀请社区采用时间分段评估方法。 摘要:There has been a growing interest in developing machine learning (ML) models for code learning tasks, e.g., comment generation and method naming. Despite substantial increase in the effectiveness of ML models, the evaluation methodologies, i.e., the way people split datasets into training, validation, and testing sets, were not well designed. Specifically, no prior work on the aforementioned topics considered the timestamps of code and comments during evaluation (e.g., examples in the testing set might be from 2010 and examples from the training set might be from 2020). This may lead to evaluations that are inconsistent with the intended use cases of the ML models. In this paper, we formalize a novel time-segmented evaluation methodology, as well as the two methodologies commonly used in the literature: mixed-project and cross-project. We argue that time-segmented methodology is the most realistic. We also describe various use cases of ML models and provide a guideline for using methodologies to evaluate each use case. To assess the impact of methodologies, we collect a dataset of code-comment pairs with timestamps to train and evaluate several recent code learning ML models for the comment generation and method naming tasks. Our results show that different methodologies can lead to conflicting and inconsistent results. We invite the community to adopt the time-segmented evaluation methodology.
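下面给出时间分段切分的一个最小示意(假设性示例,非论文工件;样本字段与切分日期均为虚构),用以说明"训练数据必须严格早于测试数据"这一核心约束:

```python
from datetime import datetime

# 虚构的代码-注释样本,每条带有时间戳
samples = [
    {"code": "...", "comment": "...", "timestamp": datetime(2015, 3, 1)},
    {"code": "...", "comment": "...", "timestamp": datetime(2019, 7, 9)},
    {"code": "...", "comment": "...", "timestamp": datetime(2021, 1, 5)},
]

def time_segmented_split(samples, train_end, valid_end):
    """按时间先后切分训练/验证/测试集,避免"用未来数据预测过去"。"""
    train = [s for s in samples if s["timestamp"] < train_end]
    valid = [s for s in samples if train_end <= s["timestamp"] < valid_end]
    test = [s for s in samples if s["timestamp"] >= valid_end]
    return train, valid, test

train, valid, test = time_segmented_split(
    samples, datetime(2017, 1, 1), datetime(2020, 1, 1))
print(len(train), len(valid), len(test))
```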
【15】 SERF: Towards better training of deep neural networks using log-Softplus ERror activation Function 标题:SERF:使用LOG-Softplus误差激活函数更好地训练深度神经网络 链接:https://arxiv.org/abs/2108.09598
作者:Sayan Nag,Mayukh Bhattacharyya 机构:University of Toronto, Stony Brook University 摘要:激活函数在决定训练动态和神经网络性能方面起着关键作用。广泛采用的激活函数ReLU尽管简单有效,但也存在一些缺点,包括"死亡ReLU"(Dying ReLU)问题。为了解决这些问题,我们提出了一种新的激活函数,称为Serf,它是自正则的,本质上是非单调的。与Mish类似,Serf也属于Swish函数家族。基于计算机视觉(图像分类和目标检测)和自然语言处理(机器翻译、情感分类和多模态蕴涵)任务的多个实验,采用不同的最新架构,据观察,Serf的性能远远优于ReLU(基线)和其他激活函数,包括Swish和Mish,在更深层次的体系结构上优势更为明显。消融研究进一步证明,基于Serf的体系结构在不同场景下的性能优于Swish和Mish,验证了Serf在不同深度、复杂度、优化器、学习率、批量大小、初始化方法和dropout率下的有效性和兼容性。最后,我们研究了Swish和Serf之间的数学关系,从而显示了Serf一阶导数中固有的预条件函数的影响,它提供了一种正则化效果,使梯度更平滑,优化速度更快。 摘要:Activation functions play a pivotal role in determining the training dynamics and neural network performance. The widely adopted activation function ReLU despite being simple and effective has few disadvantages including the Dying ReLU problem. In order to tackle such problems, we propose a novel activation function called Serf which is self-regularized and nonmonotonic in nature. Like Mish, Serf also belongs to the Swish family of functions. Based on several experiments on computer vision (image classification and object detection) and natural language processing (machine translation, sentiment classification and multimodal entailment) tasks with different state-of-the-art architectures, it is observed that Serf vastly outperforms ReLU (baseline) and other activation functions including both Swish and Mish, with a markedly bigger margin on deeper architectures. Ablation studies further demonstrate that Serf based architectures perform better than those of Swish and Mish in varying scenarios, validating the effectiveness and compatibility of Serf with varying depth, complexity, optimizers, learning rates, batch sizes, initializers and dropout rates. Finally, we investigate the mathematical relation between Swish and Serf, thereby showing the impact of preconditioner function ingrained in the first derivative of Serf which provides a regularization effect making gradients smoother and optimization faster.
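根据摘要中"log-Softplus ERror"的命名,Serf 通常写作 f(x) = x·erf(ln(1+e^x))。下面是按此定义给出的 PyTorch 示意实现(假设性草图,细节以论文为准):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Serf(nn.Module):
    """Serf 激活:f(x) = x * erf(softplus(x)),其中 softplus(x) = ln(1 + e^x)。"""
    def forward(self, x):
        return x * torch.erf(F.softplus(x))

x = torch.linspace(-5.0, 5.0, steps=11)
print(Serf()(x))  # 与 Swish/Mish 类似:负半轴有界、非单调、处处光滑
```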
【16】 Principal Gradient Direction and Confidence Reservoir Sampling for Continual Learning 标题:用于持续学习的主梯度方向和置信度库抽样 链接:https://arxiv.org/abs/2108.09592
作者:Zhiyi Chen,Tong Lin 机构:Georgia Institute of Technology, USA; The Key Laboratory of Machine Perception (MOE), School of EECS, Peking University; Peng Cheng Laboratory, Shenzhen, China 摘要:无任务在线持续学习旨在减轻学习者在非iid数据流上的灾难性遗忘。经验重放(ER)是一种SOTA持续学习方法,广泛用作其他基于重放的方法的主干算法。然而,ER的训练策略过于简单,无法充分利用重放的示例,其储层采样策略也不理想。在这项工作中,我们提出了一个通用的近端梯度框架,使ER可以被视为一个特例。我们进一步提出了两个相应的改进:主梯度方向(PGD)和置信储层采样(CRS)。在主梯度方向中,我们优化了一个目标梯度,该梯度不仅代表了过去梯度的主要贡献,而且保留了当前梯度的新知识。然后,我们提出了置信储层采样,该方法基于度量存储示例价值的边距度量来维护信息量更大的内存缓冲区。实验证明了我们两项改进的有效性,我们的新算法持续提升了基于ER的SOTA重放方法MIR-replay的性能:我们的算法在四个数据集上将平均准确率最多提高7.9%,并将遗忘最多减少15.4%。 摘要:Task-free online continual learning aims to alleviate catastrophic forgetting of the learner on a non-iid data stream. Experience Replay (ER) is a SOTA continual learning method, which is broadly used as the backbone algorithm for other replay-based methods. However, the training strategy of ER is too simple to take full advantage of replayed examples and its reservoir sampling strategy is also suboptimal. In this work, we propose a general proximal gradient framework so that ER can be viewed as a special case. We further propose two improvements accordingly: Principal Gradient Direction (PGD) and Confidence Reservoir Sampling (CRS). In Principal Gradient Direction, we optimize a target gradient that not only represents the major contribution of past gradients, but also retains the new knowledge of the current gradient. We then present Confidence Reservoir Sampling for maintaining a more informative memory buffer based on a margin-based metric that measures the value of stored examples. Experiments substantiate the effectiveness of both our improvements and our new algorithm consistently boosts the performance of MIR-replay, a SOTA ER-based method: our algorithm increases the average accuracy up to 7.9% and reduces forgetting up to 15.4% on four datasets.
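下面是"基于边距分数的储层采样"的一个假设性草图(接收概率沿用经典储层采样,被替换对象改为缓冲区中价值最低的样本;具体细节以论文为准):

```python
import random

class ConfidenceReservoir:
    """维护容量为 capacity 的回放缓冲区;margin_score 衡量样本价值(假设接口)。"""
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = []   # 元素为 (example, margin_score)
        self.n_seen = 0

    def add(self, example, margin_score):
        self.n_seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append((example, margin_score))
        elif random.random() < self.capacity / self.n_seen:
            # 与经典储层采样不同:替换价值最低的旧样本,而非随机位置
            victim = min(range(len(self.buffer)),
                         key=lambda i: self.buffer[i][1])
            self.buffer[victim] = (example, margin_score)
```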
【17】 CushLEPOR: Customised hLEPOR Metric Using LABSE Distilled Knowledge Model to Improve Agreement with Human Judgements 标题:CushLEPOR:使用LABSE精炼知识模型改进与人类判断的一致性的定制hLEPOR度量 链接:https://arxiv.org/abs/2108.09484
作者:Lifeng Han,Irina Sorokina,Gleb Erofeev,Serge Gladkoff 机构: ADAPT Research Centre, DCU, Ireland, Logrus Global, Translation & Localization 备注:Extended work from MT SUMMIT 2021: Gleb Erofeev, Irina Sorokina, Lifeng Han, and Serge Gladkoff. 2021. cushLEPOR uses LABSE distilled knowledge to improve correlation with human translation evaluations. In Proceedings for the MT summit - User Track (In Press), online. Association for Computational Linguistics & AMTA 摘要:在研究人员努力信任自动度量的同时,人工评估的成本一直很高。为了解决这个问题,我们建议通过利用预先训练的语言模型(PLM)和有限的可用人类标记分数来定制传统指标。我们首先重新介绍了hLEPOR度量因子,然后介绍了我们开发的Python可移植版本,该版本实现了hLEPOR度量中权重参数的自动调整。然后,我们提出了定制的hLEPOR(cushLEPOR),它使用LABSE提取的知识模型,通过自动优化与cushLEPOR部署到的精确机器翻译语言对相关的因子权重,来改进度量与人类判断的一致性。我们还优化了基于MQM和pSQM框架的英语-德语和汉英语言对的cushLEPOR人类评估数据。实验研究表明,cushLEPOR以更低的成本提高了hLEPOR的性能,使其与PLM(如LABSE)达成更好的协议,并与人类评估(包括MQM和pSQM分数)达成更好的协议,并且产生了比BLEU更好的性能(数据可在 https://github.com/poethan/cushLEPOR 获取)。 摘要:Human evaluation has always been expensive while researchers struggle to trust the automatic metrics. To address this, we propose to customise traditional metrics by taking advantages of the pre-trained language models (PLMs) and the limited available human labelled scores. We first re-introduce the hLEPOR metric factors, followed by the Python portable version we developed which achieved the automatic tuning of the weighting parameters in hLEPOR metric. Then we present the customised hLEPOR (cushLEPOR) which uses LABSE distilled knowledge model to improve the metric agreement with human judgements by automatically optimised factor weights regarding the exact MT language pairs that cushLEPOR is deployed to. We also optimise cushLEPOR towards human evaluation data based on MQM and pSQM framework on English-German and Chinese-English language pairs. The experimental investigations show cushLEPOR boosts hLEPOR performances towards better agreements to PLMs like LABSE with much lower cost, and better agreements to human evaluations including MQM and pSQM scores, and yields much better performances than BLEU (data available at https://github.com/poethan/cushLEPOR).
【18】 "Adversarial Examples" for Proof-of-Learning 链接:https://arxiv.org/abs/2108.09454
作者:Rui Zhang,Jian Liu,Yuan Ding,Qingbiao Wu,Kui Ren 机构:Zhejiang University 摘要:在S&P'21中,Jia等人提出了一种新的概念/机制,名为学习证明(PoL),它允许验证人通过证明训练过程的完整性来证明机器学习模型的所有权。它保证了对手不能以比证明者生成证明的成本更低的成本(计算和存储)构造有效证明。PoL证明包括一组在训练期间记录的中间模型,以及用于获得每个记录模型的相应数据点。Jia等人声称,仅仅知道最终模型和训练数据集的对手无法有效地找到一组具有正确数据点的中间模型。然而,在本文中,我们表明PoL容易受到“对抗性示例”的攻击!具体来说,与优化对抗性示例类似,我们可以使任意选择的数据点“生成”给定模型,从而有效地生成具有正确数据点的中间模型。我们从理论和经验上证明,我们能够以比证明人生成证明所需的成本少得多的成本生成有效证明,从而成功地打破了PoL。 摘要:In S&P '21, Jia et al. proposed a new concept/mechanism named proof-of-learning (PoL), which allows a prover to demonstrate ownership of a machine learning model by proving integrity of the training procedure. It guarantees that an adversary cannot construct a valid proof with less cost (in both computation and storage) than that made by the prover in generating the proof. A PoL proof includes a set of intermediate models recorded during training, together with the corresponding data points used to obtain each recorded model. Jia et al. claimed that an adversary merely knowing the final model and training dataset cannot efficiently find a set of intermediate models with correct data points. In this paper, however, we show that PoL is vulnerable to "adversarial examples"! Specifically, in a similar way as optimizing an adversarial example, we could make an arbitrarily-chosen data point "generate" a given model, hence efficiently generating intermediate models with correct data points. We demonstrate, both theoretically and empirically, that we are able to generate a valid proof with significantly less cost than generating a proof by the prover, thereby we successfully break PoL.
【19】 Integer-arithmetic-only Certified Robustness for Quantized Neural Networks 标题:量化神经网络的仅整数运算认证鲁棒性 链接:https://arxiv.org/abs/2108.09413
作者:Haowen Lin,Jian Lou,Li Xiong,Cyrus Shahabi 机构:University of Southern California, Emory University, Xidian University 摘要:对抗性数据示例引起了机器学习和安全社区的极大关注。处理对抗性示例的一条工作路线是通过随机平滑实现认证鲁棒性,它可提供理论上的鲁棒性保证。然而,这种机制通常在推理计算中使用浮点运算,并且需要大量内存占用和高昂的计算成本。这些防御模型既不能在边缘设备上高效运行,也不能部署在纯整数逻辑单元(如图灵张量核或纯整数ARM处理器)上。为了克服这些挑战,我们提出了一种带量化的整数随机化平滑方法,将任何分类器转换为一个新的平滑分类器,该方法使用纯整数运算来认证对对抗性扰动的鲁棒性。我们证明了该方法在L2范数下的严格鲁棒性保证。我们表明,在通用CPU和移动设备上,在两个不同的数据集(CIFAR-10和Caltech-101)上,我们的方法可以获得与基于浮点运算的认证鲁棒方法相当的精度和4x~5x的加速比。 摘要:Adversarial data examples have drawn significant attention from the machine learning and security communities. A line of work on tackling adversarial examples is certified robustness via randomized smoothing that can provide a theoretical robustness guarantee. However, such a mechanism usually uses floating-point arithmetic for calculations in inference and requires large memory footprints and daunting computational costs. These defensive models cannot run efficiently on edge devices nor be deployed on integer-only logical units such as Turing Tensor Cores or integer-only ARM processors. To overcome these challenges, we propose an integer randomized smoothing approach with quantization to convert any classifier into a new smoothed classifier, which uses integer-only arithmetic for certified robustness against adversarial perturbations. We prove a tight robustness guarantee under L2-norm for the proposed approach. We show our approach can obtain a comparable accuracy and 4x~5x speedup over floating-point arithmetic certified robust methods on general-purpose CPUs and mobile devices on two distinct datasets (CIFAR-10 and Caltech-101).
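作为背景,下面给出标准随机平滑预测(浮点版本)的最小示意;论文的贡献正是把其中的噪声采样与投票全部替换为纯整数运算(f 为任意基分类器,接口为假设):

```python
import numpy as np

def smoothed_predict(f, x, sigma=0.25, n=1000, num_classes=10, seed=0):
    """对输入叠加 n 次高斯噪声,返回基分类器 f 的多数票类别。"""
    rng = np.random.default_rng(seed)
    counts = np.zeros(num_classes, dtype=int)
    for _ in range(n):
        noisy = x + sigma * rng.standard_normal(x.shape)
        counts[f(noisy)] += 1
    return int(counts.argmax())

# 用法示例:任意返回类别编号的分类器均可
f = lambda x: int(x.sum() > 0)
print(smoothed_predict(f, np.ones(8)))
```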
【20】 Early-exit deep neural networks for distorted images: providing an efficient edge offloading 标题:用于失真图像的提前退出深度神经网络:提供有效的边缘卸载 链接:https://arxiv.org/abs/2108.09343
作者:Roberto G. Pacheco,Fernanda D. V. R. Oliveira,Rodrigo S. Couto 机构:Universidade Federal do Rio de Janeiro, GTAPADSPEE-COPPEDEL-Poli, Rio de Janeiro, RJ, Brazil 备注:to appear in Proc. IEEE Global Communications Conference (GLOBECOM) 2021 摘要:深度神经网络(DNN)的边缘卸载可以通过使用早期退出DNN来适应输入的复杂性。这些DNN在整个体系结构中都有分支,允许推断在边缘中提前结束。这些分支估计给定输入的精度。如果此估计精度达到阈值,则推断将在边缘结束。否则,边缘将推理卸载到云以处理剩余的DNN层。然而,用于图像分类的DNN处理的是扭曲的图像,这会对分支的估计精度产生负面影响。因此,边缘将更多的推断转移到云上。这项工作引入了在特定失真类型上训练的专家分支,以提高对图像失真的鲁棒性。边缘检测畸变类型并选择适当的专家分支来执行推理。这种方法提高了边缘的估计精度,改善了卸载决策。我们在一个现实场景中验证了我们的建议,在这个场景中,edge将DNN推理卸载到Amazon EC2实例。 摘要:Edge offloading for deep neural networks (DNNs) can be adaptive to the input's complexity by using early-exit DNNs. These DNNs have side branches throughout their architecture, allowing the inference to end earlier in the edge. The branches estimate the accuracy for a given input. If this estimated accuracy reaches a threshold, the inference ends on the edge. Otherwise, the edge offloads the inference to the cloud to process the remaining DNN layers. However, DNNs for image classification deal with distorted images, which negatively impact the branches' estimated accuracy. Consequently, the edge offloads more inferences to the cloud. This work introduces expert side branches trained on a particular distortion type to improve robustness against image distortion. The edge detects the distortion type and selects appropriate expert branches to perform the inference. This approach increases the estimated accuracy on the edge, improving the offloading decisions. We validate our proposal in a realistic scenario, in which the edge offloads DNN inference to Amazon EC2 instances.
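边缘/云之间的卸载决策可以概括为下面的假设性草图(分支接口与失真检测器均为示意,并非论文代码):

```python
def infer_with_offloading(x, expert_branches, cloud_model,
                          detect_distortion, threshold=0.8):
    """早退DNN的边缘推理:先按失真类型选择专家分支,逐个分支尝试提前退出;
    估计精度达不到阈值时再卸载到云端处理剩余层。"""
    distortion = detect_distortion(x)            # 如 "blur"、"noise"、"none"
    for branch in expert_branches[distortion]:   # 分支按网络深度排列
        label, confidence = branch(x)            # 分支同时给出预测与估计精度
        if confidence >= threshold:
            return label, "edge"                 # 在边缘提前退出
    return cloud_model(x), "cloud"               # 卸载到云
```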
【21】 Inverse Aerodynamic Design of Gas Turbine Blades using Probabilistic Machine Learning 标题:基于概率机器学习的燃气轮机叶片气动反求设计 链接:https://arxiv.org/abs/2108.10163
作者:Sayan Ghosh,Govinda A. Padmanabha,Cheng Peng,Steven Atkinson,Valeria Andreoli,Piyush Pandita,Thomas Vandeputte,Nicholas Zabaras,Liping Wang 机构:General Electric Research, Niskayuna, New York, Department of Aerospace and Mechanical Engineering, University of Notre Dame, Notre Dame, Indiana 摘要:工业燃气轮机(IGT)的关键部件之一是涡轮叶片。涡轮叶片的设计需要考虑空气动力效率、耐久性、安全性和制造等多个方面,这使得设计过程是顺序和迭代的。这些迭代的顺序性质导致设计周期很长,从几个月到几年不等。由于这些迭代的被动响应性质,很少有工作以允许对整个设计空间进行深入探索和理解的方式积累数据。这在设计IGT单个组件的过程中得到了体现,从而导致潜在的未实现效率。为了克服上述挑战,我们展示了一个概率逆向设计机器学习框架(PMI),以执行显式逆向设计。PMI无需过度昂贵的迭代即可显式计算设计,并克服了与不适定反问题相关的挑战。在这项工作中,该框架将在三维涡轮叶片的逆气动设计中演示。 摘要:One of the critical components in Industrial Gas Turbines (IGT) is the turbine blade. Design of turbine blades needs to consider multiple aspects like aerodynamic efficiency, durability, safety and manufacturing, which make the design process sequential and iterative. The sequential nature of these iterations forces a long design cycle time, ranging from several months to years. Due to the reactionary nature of these iterations, little effort has been made to accumulate data in a manner that allows for deep exploration and understanding of the total design space. This is exemplified in the process of designing the individual components of the IGT resulting in a potential unrealized efficiency. To overcome the aforementioned challenges, we demonstrate a probabilistic inverse design machine learning framework (PMI), to carry out an explicit inverse design. PMI calculates the design explicitly without excessive costly iteration and overcomes the challenges associated with ill-posed inverse problems. In this work, the framework will be demonstrated on inverse aerodynamic design of three-dimensional turbine blades.
【22】 Deep learning for surrogate modelling of 2D mantle convection 标题:深度学习在二维地幔对流代理模拟中的应用 链接:https://arxiv.org/abs/2108.10105
作者:Siddhant Agarwal,Nicola Tosi,Pan Kessel,Doris Breuer,Grégoire Montavon 机构:Planetary Physics, Institute of Planetary Research, German Aerospace Center (DLR), Berlin, Germany, Machine Learning Group, Berlin Institute of Technology, Berlin, Germany 摘要:传统上,基于标度定律的一维模型被用于参数化地球、火星、水星和金星等类地行星岩质内部的对流传热,以解决二维或三维高保真正演计算的瓶颈。然而,这些模型所能涵盖的物理有限(例如,随深度变化的材料特性),并且只能预测平均量,如平均地幔温度。我们最近表明,使用大量2D模拟训练的前馈神经网络(FNN)可以克服这一限制,并可靠地预测复杂模型的整个一维横向平均温度分布随时间的演变[Agarwal等人,2020]。现在,我们将该方法扩展到预测完整的二维温度场,该温度场以热羽流和冷降流等对流结构的形式包含更多信息。利用10525个类似火星的行星地幔热演化二维模拟数据集,我们表明,深度学习技术可以为基本偏微分方程产生可靠的参数化替代物(即仅基于参数预测温度等状态变量的替代物)。我们首先使用卷积自动编码器将温度场压缩142倍,然后使用FNN和长短时记忆网络(LSTM)预测压缩场。相对于未观测到的模拟,FNN预测平均为99.30%,LSTM预测平均为99.22%。对LSTM和FNN预测的固有正交分解(POD)表明,尽管平均绝对相对精度较低,但LSTM比FNN更好地捕捉流动动力学。当求和时,FNN预测和LSTM预测的POD系数相对于原始模拟的系数分别为96.51%和97.66%。 摘要:Traditionally, 1D models based on scaling laws have been used to parameterize convective heat transfer in the rocky interiors of terrestrial planets like Earth, Mars, Mercury and Venus to tackle the computational bottleneck of high-fidelity forward runs in 2D or 3D. However, these are limited in the amount of physics they can model (e.g. depth dependent material properties) and predict only mean quantities such as the mean mantle temperature. We recently showed that feedforward neural networks (FNN) trained using a large number of 2D simulations can overcome this limitation and reliably predict the evolution of entire 1D laterally-averaged temperature profile in time for complex models [Agarwal et al. 2020]. We now extend that approach to predict the full 2D temperature field, which contains more information in the form of convection structures such as hot plumes and cold downwellings. Using a dataset of 10,525 two-dimensional simulations of the thermal evolution of the mantle of a Mars-like planet, we show that deep learning techniques can produce reliable parameterized surrogates (i.e. surrogates that predict state variables such as temperature based only on parameters) of the underlying partial differential equations. We first use convolutional autoencoders to compress the temperature fields by a factor of 142 and then use FNN and long-short term memory networks (LSTM) to predict the compressed fields. On average, the FNN predictions are 99.30% and the LSTM predictions are 99.22% accurate with respect to unseen simulations. Proper orthogonal decomposition (POD) of the LSTM and FNN predictions shows that despite a lower mean absolute relative accuracy, LSTMs capture the flow dynamics better than FNNs. When summed, the POD coefficients from FNN predictions and from LSTM predictions amount to 96.51% and 97.66% relative to the coefficients of the original simulations, respectively.
【23】 New Trends in Quantum Machine Learning 标题:量子机器学习的新动向 链接:https://arxiv.org/abs/2108.09664
作者:Lorenzo Buffoni,Filippo Caruso 机构:Dipartimento di Fisica e Astronomia, Università di Firenze, Sesto Fiorentino, Italy, Dipartimento di Ingegneria dell'Informazione, Università di Firenze, Firenze, Italy, LENS, QSTAR and CNR-INO, Sesto Fiorentino, Italy 备注:None 摘要:在这里,我们将对机器学习和量子物理之间新的可能相互作用给出一个观点,包括实际案例和应用。我们将探索机器学习从新的量子技术和算法中获益的方式:通过物理硬件的突破寻找加速计算的新途径,以及在量子领域改进现有模型或设计新的学习方案。此外,量子物理中有许多实验确实会产生海量数据,机器学习将是分析这些数据、做出预测甚至控制实验本身的一个伟大工具。除此之外,数据可视化技术和从机器学习中借鉴的其他方案,对于理论家更好地把握复杂流形结构的直觉或对理论模型进行预测非常有用。这一被称为量子机器学习的新研究领域正在迅速发展,因为预计它将比经典方法提供巨大的优势;同时亟需更深入的研究,因为这些方法已经可以在商用量子机器上进行测试。 摘要:Here we will give a perspective on new possible interplays between Machine Learning and Quantum Physics, including also practical cases and applications. We will explore the ways in which machine learning could benefit from new quantum technologies and algorithms to find new ways to speed up their computations by breakthroughs in physical hardware, as well as to improve existing models or devise new learning schemes in the quantum domain. Moreover, there are lots of experiments in quantum physics that do generate incredible amounts of data and machine learning would be a great tool to analyze those and make predictions, or even control the experiment itself. On top of that, data visualization techniques and other schemes borrowed from machine learning can be of great use to theoreticians to have better intuition on the structure of complex manifolds or to make predictions on theoretical models. This new research field, named as Quantum Machine Learning, is very rapidly growing since it is expected to provide huge advantages over its classical counterpart and deeper investigations are timely needed since they can be already tested on the already commercially available quantum machines.
其他(25篇)
【1】 Exclusive Group Lasso for Structured Variable Selection 标题:用于结构化变量选择的独家组合套索 链接:https://arxiv.org/abs/2108.10284
作者:David Gregoratti,Xavier Mestre,Carlos Buelga 机构:Formerly with the Centre Tecnològic de Telecomunicacions de Catalunya (CTTCCERCA), Av. C. F. Gauss, Castelldefels (Barcelona, Spain) 备注:This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible 摘要:考虑了一个结构化变量选择问题,在该问题中,被划分为预定义组的协变量根据稀疏模式激活,每个组中只有很少的非零项。利用原子范数的概念,可以适当地设计复合范数来促进这种排他性群稀疏模式。由此产生的范数适用于支持恢复的高效灵活的正则化优化算法,如近端算法。此外,提出了一种主动集算法,该算法通过将结构原子依次加入到估计支持度中来构建解。它还表明,这种算法可以定制,以匹配更多的刚性结构比平原排他性组稀疏。渐近一致性分析(参数的数量以及组的数量随着观测规模的增加而增加)确定了在常规假设下,所提出的解决方案在有符号支持恢复方面的有效性。最后,一组数值模拟进一步证实了结果。 摘要:A structured variable selection problem is considered in which the covariates, divided into predefined groups, activate according to sparse patterns with few nonzero entries per group. Capitalizing on the concept of atomic norm, a composite norm can be properly designed to promote such exclusive group sparsity patterns. The resulting norm lends itself to efficient and flexible regularized optimization algorithms for support recovery, like the proximal algorithm. Moreover, an active set algorithm is proposed that builds the solution by successively including structure atoms into the estimated support. It is also shown that such an algorithm can be tailored to match more rigid structures than plain exclusive group sparsity. Asymptotic consistency analysis (with both the number of parameters as well as the number of groups growing with the observation size) establishes the effectiveness of the proposed solution in terms of signed support recovery under conventional assumptions. Finally, a set of numerical simulations further corroborates the results.
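独占组稀疏罚项可写为 Ω(w) = Σ_g ‖w_g‖₁²。论文采用近端算法与主动集算法;下面是一个可运行的简化替代(次梯度下降,仅作示意,数据为虚构):

```python
import numpy as np

def subgradient_descent(X, y, groups, lam=0.1, lr=1e-3, iters=2000):
    """最小化 0.5*||y - Xw||^2 + lam * sum_g ||w_g||_1^2 的次梯度法草图。"""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ w - y)
        for g in groups:
            # ||w_g||_1^2 的次梯度:2 * ||w_g||_1 * sign(w_g)
            grad[g] += lam * 2.0 * np.abs(w[g]).sum() * np.sign(w[g])
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 8))
y = X[:, [0, 4]] @ np.array([2.0, -1.5])           # 每组仅一个真实非零项
groups = [np.arange(0, 4), np.arange(4, 8)]
print(subgradient_descent(X, y, groups).round(2))  # 期望得到组内稀疏的解
```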
【2】 Exploring Biases and Prejudice of Facial Synthesis via Semantic Latent Space 标题:基于语义潜在空间的人脸合成偏向和偏向研究 链接:https://arxiv.org/abs/2108.10265
作者:Xuyang Shen,Jo Plested,Sabrina Caldwell,Tom Gedeon 机构:Research School of Computer Science, the Australian National University, Canberra, Australia 备注:8 pages, 11 figures; accepted by IJCNN2021 摘要:深度学习(DL)模型被广泛应用于提供更方便、更智能的生活。然而,有偏见的算法会对我们产生负面影响。例如,被有偏见的算法针对的群体会感到受到不公平对待,甚至害怕这些偏见的负面后果。这项工作针对有偏见的生成模型的行为,找出偏见的成因并消除这些偏见。我们可以(如预期的)得出结论,有偏见的数据会导致人脸正面化模型做出有偏见的预测。改变训练数据中男性和女性面孔的比例会对测试数据上的行为产生实质性影响:我们发现,似乎显而易见的50:50比例并不是该数据集上减少女性面孔有偏见行为的最佳选择,其无偏率为71%,而我们最高的无偏率为84%。生成失败和生成错误性别的面孔是这些模型的两种行为。此外,只有人脸正面化模型中的某些层容易受到有偏见数据集的影响。优化正面化模型中生成器的跳过连接可以减少模型的偏差。我们的结论是,如果没有一个无限大小的数据集,很可能不可能消除所有的训练偏差,我们的实验表明,偏差可以被减少和量化。我们相信,仅次于完美无偏预测器的,是一个将剩余已知偏差最小化的预测器。 摘要:Deep learning (DL) models are widely used to provide a more convenient and smarter life. However, biased algorithms will negatively influence us. For instance, groups targeted by biased algorithms will feel unfairly treated and even fearful of negative consequences of these biases. This work targets biased generative models' behaviors, identifying the cause of the biases and eliminating them. We can (as expected) conclude that biased data causes biased predictions of face frontalization models. Varying the proportions of male and female faces in the training data can have a substantial effect on behavior on the test data: we found that the seemingly obvious choice of 50:50 proportions was not the best for this dataset to reduce biased behavior on female faces, which was 71% unbiased as compared to our top unbiased rate of 84%. Failure in generation and generating incorrect gender faces are two behaviors of these models. In addition, only some layers in face frontalization models are vulnerable to biased datasets. Optimizing the skip-connections of the generator in face frontalization models can make models less biased. We conclude that it is likely to be impossible to eliminate all training bias without an unlimited size dataset, and our experiments show that the bias can be reduced and quantified. We believe the next best to a perfect unbiased predictor is one that has minimized the remaining known bias.
【3】 Influence-guided Data Augmentation for Neural Tensor Completion 标题:神经张量补全的影响引导数据增强算法 链接:https://arxiv.org/abs/2108.10248
作者:Sejoon Oh,Sungchul Kim,Ryan A. Rossi,Srijan Kumar 机构:Georgia Institute of Technology, United States, Adobe Research 备注:Accepted for publication at 30th ACM International Conference on Information and Knowledge Management (ACM CIKM 2021). Code and data: this https URL 摘要:我们如何更准确地预测多维数据(或张量)中的缺失值?张量完成任务在社交网络中的个性化推荐、图像和视频恢复以及链接预测等应用中至关重要。许多基于张量因子分解和神经网络的张量完成算法已经被开发出来,用于预测部分观测张量中的缺失项。然而,它们可能会产生不准确的估计,因为现实世界中的张量非常稀疏,并且这些方法往往在少量数据上过度拟合。在这里,我们通过提出一种张量数据增强技术来克服这些缺点。在本文中,我们提出了DAIN,这是一个通用的数据增强框架,可以提高神经张量补全方法的预测精度。具体来说,DAIN首先训练一个神经模型,并用影响函数发现张量细胞的重要性。然后,DAIN聚合单元格重要性以计算每个实体(即维度的索引)的重要性。最后,DAIN通过实体重要性的加权采样和值预测器来增强张量。大量实验结果表明,DAIN在提高四种不同的真实世界张量的神经张量补全插补精度方面优于所有数据增强基线。DAIN的消融研究证实了DAIN各成分的有效性。此外,我们还表明,DAIN可近线性地扩展到大型数据集。 摘要:How can we predict missing values in multi-dimensional data (or tensors) more accurately? The task of tensor completion is crucial in many applications such as personalized recommendation, image and video restoration, and link prediction in social networks. Many tensor factorization and neural network-based tensor completion algorithms have been developed to predict missing entries in partially observed tensors. However, they can produce inaccurate estimations as real-world tensors are very sparse, and these methods tend to overfit on the small amount of data. Here, we overcome these shortcomings by presenting a data augmentation technique for tensors. In this paper, we propose DAIN, a general data augmentation framework that enhances the prediction accuracy of neural tensor completion methods. Specifically, DAIN first trains a neural model and finds tensor cell importances with influence functions. After that, DAIN aggregates the cell importance to calculate the importance of each entity (i.e., an index of a dimension). Finally, DAIN augments the tensor by weighted sampling of entity importances and a value predictor. Extensive experimental results show that DAIN outperforms all data augmentation baselines in terms of enhancing imputation accuracy of neural tensor completion on four diverse real-world tensors. Ablation studies of DAIN substantiate the effectiveness of each component of DAIN. Furthermore, we show that DAIN scales near linearly to large datasets.
【4】 No DBA? No regret! Multi-armed bandits for index tuning of analytical and HTAP workloads with provable guarantees 标题:没有DBA?不后悔!多臂强盗,用于分析和HTAP工作负载的索引调优,并提供可证明的保证 链接:https://arxiv.org/abs/2108.10130
作者:R. Malinga Perera,Bastian Oetomo,Benjamin I. P. Rubinstein,Renata Borovica-Gajic 备注:25 pages, 20 figures, 5 tables. arXiv admin note: substantial text overlap with arXiv:2010.09208 摘要:自动化物理数据库设计一直是数据库研究的一个长期兴趣,因为优化的结构带来了巨大的性能提升。尽管取得了重大进展,但当今的大多数商业解决方案都是高度手动的,需要数据库管理员(DBA)进行离线调用,他们需要识别并提供具有代表性的训练工作负载。即使是查询存储之类的最新改进也只能为动态环境提供有限的支持。这种现状是站不住脚的:识别具有代表性的静态工作负载不再现实;而物理设计工具仍然容易受到查询优化器错误成本估计的影响。此外,现代应用程序环境,如混合事务和分析处理(HTAP)系统,使得分析建模几乎不可能。我们提出了一种在线索引选择的自驱动方法,它避开了DBA和查询优化器,而是通过战略探索和直接性能观察来学习可行结构的好处。我们将该问题视为不确定性下的顺序决策问题,特别是在bandit学习环境中。多臂强盗在探索与利用之间取得平衡,可证明地保证平均性能收敛到事后看来最优的策略。我们针对最先进的商业调优工具进行的综合实证评估表明,在分析处理环境中,在动态变化和临时工作负载上最多可提速75%,在静态工作负载上最多可提速28%。在HTAP环境中,我们的解决方案在动态变化工作负载上最多提速59%,在静态工作负载上最多提速51%。此外,我们的bandit框架在收敛速度和性能波动性方面优于深度强化学习(RL)(最多可提速58%)。 摘要:Automating physical database design has remained a long-term interest in database research due to substantial performance gains afforded by optimised structures. Despite significant progress, a majority of today's commercial solutions are highly manual, requiring offline invocation by database administrators (DBAs) who are expected to identify and supply representative training workloads. Even the latest advancements like query stores provide only limited support for dynamic environments. This status quo is untenable: identifying representative static workloads is no longer realistic; and physical design tools remain susceptible to the query optimiser's cost misestimates. Furthermore, modern application environments such as hybrid transactional and analytical processing (HTAP) systems render analytical modelling next to impossible. We propose a self-driving approach to online index selection that eschews the DBA and query optimiser, and instead learns the benefits of viable structures through strategic exploration and direct performance observation. We view the problem as one of sequential decision making under uncertainty, specifically within the bandit learning setting. Multi-armed bandits balance exploration and exploitation to provably guarantee average performance that converges to policies that are optimal with perfect hindsight. Our comprehensive empirical evaluation against a state-of-the-art commercial tuning tool demonstrates up to 75% speed-up on shifting and ad-hoc workloads and up to 28% speed-up on static workloads in analytical processing environments. In HTAP environments, our solution provides up to 59% speed-up on shifting and 51% speed-up on static workloads. Furthermore, our bandit framework outperforms deep reinforcement learning (RL) in terms of convergence speed and performance volatility (providing up to 58% speed-up).
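把每个候选索引配置看作一条"臂",即可用UCB1式的策略在探索与利用之间折中。下面是一个通用示意(与论文实现无关;run_workload 为假设接口,返回该轮收益):

```python
import math

def ucb_index_tuning(configs, run_workload, rounds=200):
    """UCB1:先各尝试一次,此后选择"均值收益 + 置信半径"最大的配置。"""
    n = [0] * len(configs)
    mean = [0.0] * len(configs)
    for t in range(1, rounds + 1):
        if t <= len(configs):
            arm = t - 1
        else:
            arm = max(range(len(configs)),
                      key=lambda i: mean[i] + math.sqrt(2 * math.log(t) / n[i]))
        reward = run_workload(configs[arm])       # 例如负的查询执行时间
        n[arm] += 1
        mean[arm] += (reward - mean[arm]) / n[arm]
    return configs[max(range(len(configs)), key=lambda i: mean[i])]
```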
【5】 Effective Streaming Low-tubal-rank Tensor Approximation via Frequent Directions 标题:基于频繁方向的有效流动低阶张量逼近 链接:https://arxiv.org/abs/2108.10129
作者:Qianxin Yi,Chenhao Wang,Kaidong Wang,Yao Wang 机构:School of Management, Xi'an Jiaotong University; Qianxin Yi and Kaidong Wang are also with the School of Mathematics and Statistics 摘要:低管秩张量近似已被提出用于分析大规模多维数据。然而,由于计算资源有限,在流式环境中找到这样一个精确的近似值很有挑战性。为了缓解这一问题,本文扩展了一种流行的矩阵草图(sketching)技术,即频繁方向(Frequent Directions),用于基于张量奇异值分解(t-SVD)从流数据构造高效、准确的低管秩张量近似。具体而言,新算法允许逐切片地观察张量数据,但只需要维护和增量更新一个小得多的草图,该草图可以捕获原始张量的主要信息。严格的理论分析表明,当草图尺寸线性增长时,新算法的逼近误差可以任意小。在合成多维数据和真实多维数据上的大量实验结果进一步表明,与其他草图算法相比,该算法在获得低管秩近似时在效率和准确性方面都具有优势。 摘要:Low-tubal-rank tensor approximation has been proposed to analyze large-scale and multi-dimensional data. However, finding such an accurate approximation is challenging in the streaming setting, due to the limited computational resources. To alleviate this issue, this paper extends a popular matrix sketching technique, namely Frequent Directions, for constructing an efficient and accurate low-tubal-rank tensor approximation from streaming data based on the tensor Singular Value Decomposition (t-SVD). Specifically, the new algorithm allows the tensor data to be observed slice by slice, but only needs to maintain and incrementally update a much smaller sketch which could capture the principal information of the original tensor. The rigorous theoretical analysis shows that the approximation error of the new algorithm can be arbitrarily small when the sketch size grows linearly. Extensive experimental results on both synthetic and real multi-dimensional data further reveal the superiority of the proposed algorithm compared with other sketching algorithms for getting low-tubal-rank approximation, in terms of both efficiency and accuracy.
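作为参照,下面是矩阵版 Frequent Directions 的标准实现示意(numpy);论文将这一思想结合 t-SVD 推广到逐切片到达的张量流,此处仅演示"草图满即SVD收缩"的核心循环:

```python
import numpy as np

def frequent_directions(rows, d, ell):
    """维护 ell x d 草图 B:逐行插入,满时做SVD并按中位奇异值收缩。"""
    B = np.zeros((ell, d))
    for row in rows:
        zero = np.where(~B.any(axis=1))[0]
        if len(zero) == 0:
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[ell // 2] ** 2
            s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
            B = s[:, None] * Vt                  # 后一半行被压为零,腾出空间
            zero = np.where(~B.any(axis=1))[0]
        B[zero[0]] = row
    return B

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 30))
B = frequent_directions(A, d=30, ell=10)
print(np.linalg.norm(A.T @ A - B.T @ B, 2))      # 协方差近似误差
```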
【6】 Effective and Privacy preserving Tabular Data Synthesizing 标题:有效保护隐私的表格数据综合 链接:https://arxiv.org/abs/2108.10064
作者:Aditya Kunar 机构:Technische Universiteit Delft 备注:Thesis to obtain the degree of Master of Science at the Delft University of Technology 摘要:虽然数据共享对知识开发至关重要,但隐私问题和严格的监管(例如,欧洲通用数据保护条例(GDPR))不幸地限制了其充分有效性。合成表格数据作为一种替代方案出现,在满足监管和隐私限制的同时实现数据共享。最先进的表格数据合成器借鉴了生成对抗网络(GAN)的方法。在本论文中,我们开发了CTAB-GAN,这是一种新的条件表格GAN体系结构,可以有效地建模具有复杂分布的多种数据类型。我们在数据相似性和分析实用性方面,将CTAB-GAN与生成合成表格的最先进GAN进行了广泛的评估。在五个数据集上的结果表明,CTAB-GAN的合成数据在全部三类变量上都与真实数据非常相似,并使五种机器学习算法的精度最多提高17%。此外,为了确保针对恶意隐私攻击训练表格GAN时的更高安全性,研究并使用差分隐私(DP)训练具有严格隐私保证的CTAB-GAN。DP-CTAB-GAN使用最先进的DP表格GAN,在数据实用性以及针对成员和属性推断攻击的隐私鲁棒性方面进行了严格评估。我们在三个数据集上的结果表明,严格的理论差分隐私保证只有在严重影响数据效用之后才会出现。然而,经验表明,这些保证有助于提供更强的隐私攻击防御。总的来说,我们发现DP-CTAB-GAN能够抵御隐私攻击,同时与以前的工作相比保持最高的数据效用,在平均精度得分上最多高出18%。 摘要:While data sharing is crucial for knowledge development, privacy concerns and strict regulation (e.g., European General Data Protection Regulation (GDPR)) unfortunately limits its full effectiveness. Synthetic tabular data emerges as an alternative to enable data sharing while fulfilling regulatory and privacy constraints. The state-of-the-art tabular data synthesizers draw methodologies from Generative Adversarial Networks (GAN). In this thesis, we develop CTAB-GAN, a novel conditional table GAN architecture that can effectively model diverse data types with complex distributions. CTAB-GAN is extensively evaluated with the state of the art GANs that generate synthetic tables, in terms of data similarity and analysis utility. The results on five datasets show that the synthetic data of CTAB-GAN remarkably resembles the real data for all three types of variables and results in higher accuracy for five machine learning algorithms, by up to 17%. Additionally, to ensure greater security for training tabular GANs against malicious privacy attacks, differential privacy (DP) is studied and used to train CTAB-GAN with strict privacy guarantees. DP-CTAB-GAN is rigorously evaluated using state-of-the-art DP-tabular GANs in terms of data utility and privacy robustness against membership and attribute inference attacks. Our results on three datasets indicate that strict theoretical differential privacy guarantees come only after severely affecting data utility. However, it is shown empirically that these guarantees help provide a stronger defence against privacy attacks. Overall, it is found that DP-CTAB-GAN is capable of being robust to privacy attacks while maintaining the highest data utility as compared to prior work, by up to 18% in terms of the average precision score.
【7】 An Extensible and Modular Design and Implementation of Monte Carlo Tree Search for the JVM 标题:一种可扩展的模块化JVM蒙特卡罗树搜索设计与实现 链接:https://arxiv.org/abs/2108.10061
作者:Larkin Liu,Jun Tao Luo 机构:University of Toronto, Carnegie Mellon University 备注:18 pages, 7 figures, Manuscript 摘要:Monte Carlo树搜索(MCTS)的灵活实现,结合特定领域的知识以及与其他搜索算法的混合,可以有效地找到复杂规划问题的解决方案。我们将介绍mctreesearch4j,这是一个按照面向对象编程的关键设计原则编写的标准JVM库MCTS实现。我们定义了关键类抽象,使MCTS库能够灵活地适应任何定义良好的马尔可夫决策过程或基于回合的对抗性游戏。此外,我们的库设计为模块化和可扩展的,利用类继承和泛型类型来标准化自定义算法定义。我们证明了该MCTS实现的设计为跨不同马尔可夫决策过程(MDP)域的独特启发式和定制提供了便利。此外,对于标准MDP而言,该实现具有合理的性能和准确性。最后,借助mctreesearch4j的实现,我们讨论了不同类型MCTS算法的细微差别。 摘要:Flexible implementations of Monte Carlo Tree Search (MCTS), combined with domain specific knowledge and hybridization with other search algorithms, can be powerful for finding the solutions to problems in complex planning. We introduce mctreesearch4j, an MCTS implementation written as a standard JVM library following key design principles of object oriented programming. We define key class abstractions allowing the MCTS library to flexibly adapt to any well defined Markov Decision Process or turn-based adversarial game. Furthermore, our library is designed to be modular and extensible, utilizing class inheritance and generic typing to standardize custom algorithm definitions. We demonstrate that the design of the MCTS implementation provides ease of adaptation for unique heuristics and customization across varying Markov Decision Process (MDP) domains. In addition, the implementation is reasonably performant and accurate for standard MDPs. Finally, via the implementation of mctreesearch4j, the nuances of different types of MCTS algorithms are discussed.
【8】 Image coding for machines: an end-to-end learned approach 标题:机器图像编码:一种端到端的学习方法 链接:https://arxiv.org/abs/2108.09993
作者:Nam Le,Honglei Zhang,Francesco Cricri,Ramin Ghaznavi-Youvalari,Esa Rahtu 机构:Nokia Technologies; Tampere University, Tampere, Finland 备注:None 摘要:近年来,基于深度学习的计算机视觉系统以不断增长的速度应用于图像,并且往往代表这些图像的唯一消费形式。鉴于每天生成的图像数量急剧增加,出现了一个问题:针对机器消费的图像编解码器,与针对人类消费的最先进编解码器相比,性能会好多少?在本文中,我们提出了一种基于神经网络(NN)并端到端学习的机器图像编解码器。特别是,我们提出了一套训练策略,解决了平衡竞争损失函数(如计算机视觉任务损失、图像失真损失和码率损失)的微妙问题。我们的实验结果表明,基于神经网络的编解码器在目标检测和实例分割任务上优于最先进的通用视频编码(VVC)标准,分别实现了-37.87%和-32.90%的BD-rate增益,同时由于其紧凑的尺寸,速度很快。据我们所知,这是第一个端到端学习的、面向机器的图像编解码器。 摘要:Over recent years, deep learning-based computer vision systems have been applied to images at an ever-increasing pace, oftentimes representing the only type of consumption for those images. Given the dramatic explosion in the number of images generated per day, a question arises: how much better would an image codec targeting machine-consumption perform against state-of-the-art codecs targeting human-consumption? In this paper, we propose an image codec for machines which is neural network (NN) based and end-to-end learned. In particular, we propose a set of training strategies that address the delicate problem of balancing competing loss functions, such as computer vision task losses, image distortion losses, and rate loss. Our experimental results show that our NN-based codec outperforms the state-of-the-art Versatile Video Coding (VVC) standard on the object detection and instance segmentation tasks, achieving -37.87% and -32.90% of BD-rate gain, respectively, while being fast thanks to its compact size. To the best of our knowledge, this is the first end-to-end learned machine-targeted image codec.
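"平衡竞争损失"在训练步中通常表现为码率、失真与任务损失的加权和;下面是一个假设性草图(编解码器与任务网络的接口、权重数值均为示意,并非论文实现):

```python
def codec_training_step(codec, task_head, x, lambda_d=0.1, lambda_t=1.0):
    """面向机器的图像编码:总损失 = 码率 + lambda_d*失真 + lambda_t*视觉任务损失。"""
    x_hat, rate = codec(x)                # 重建图像与估计比特率(假设接口)
    distortion = ((x - x_hat) ** 2).mean()
    task_loss = task_head(x_hat)          # 例如检测/分割的任务损失(假设接口)
    return rate + lambda_d * distortion + lambda_t * task_loss
```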
【9】 Revealing Distributional Vulnerability of Explicit Discriminators by Implicit Generators 标题:用隐式生成器揭示显式鉴别器的分布脆弱性 链接:https://arxiv.org/abs/2108.09976
作者:Zhilin Zhao,Longbing Cao,Kun-Yu Lin 机构:Sun Yat-sen University 摘要:基于可观测分布内(ID)样本训练的显式鉴别器由于其分布脆弱性,可能对分布外(OOD)样本做出高置信度预测。这主要是由于当OOD样本不可用时,可用于训练鉴别器的ID样本有限所造成的。为了解决这个问题,最先进的方法在不考虑数据和网络特征的情况下,用按一般假设生成的OOD样本来训练鉴别器。然而,不同的网络体系结构和训练ID数据集可能会导致不同的脆弱性,因此生成的OOD样本通常无法对准显式鉴别器特定的分布脆弱性。为了揭示并修补分布脆弱性,我们提出了一种通过隐式生成器微调显式鉴别器(FIG)的新方法。根据香农熵,显式鉴别器可以构造其相应的隐式生成器来生成特定的OOD样本,而无需额外的训练开销。然后,朗之万动力学采样器从生成器中提取高质量的OOD样本,以揭示脆弱性。最后,根据隐式生成器的设计原则构造正则化器,通过鼓励生成高熵的OOD样本来修补分布脆弱性。我们在四个网络、四个ID数据集和七个OOD数据集上的实验表明,FIG实现了最先进的OOD检测性能,并保持了具有竞争力的分类能力。 摘要:An explicit discriminator trained on observable in-distribution (ID) samples can make high-confidence prediction on out-of-distribution (OOD) samples due to its distributional vulnerability. This is primarily caused by the limited ID samples observable for training discriminators when OOD samples are unavailable. To address this issue, the state-of-the-art methods train the discriminator with OOD samples generated by general assumptions without considering the data and network characteristics. However, different network architectures and training ID datasets may cause diverse vulnerabilities, and the generated OOD samples thus usually misaddress the specific distributional vulnerability of the explicit discriminator. To reveal and patch the distributional vulnerabilities, we propose a novel method of fine-tuning explicit discriminators by implicit generators (FIG). According to the Shannon entropy, an explicit discriminator can construct its corresponding implicit generator to generate specific OOD samples without extra training costs. A Langevin Dynamic sampler then draws high-quality OOD samples from the generator to reveal the vulnerability. Finally, a regularizer, constructed according to the design principle of the implicit generator, patches the distributional vulnerability by encouraging those generated OOD samples with high entropy. Our experiments on four networks, four ID datasets and seven OOD datasets demonstrate that FIG achieves state-of-the-art OOD detection performance and maintains a competitive classification capability.
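一种常见的"分类器即能量模型"构造是取 E(x) = -logsumexp_y f(x)[y],再用朗之万动力学从对应的隐式分布中采样;下面的 PyTorch 草图仅按这一通用构造给出示意(步长等超参数为假设值,论文的具体设计以原文为准):

```python
import torch

def langevin_sample(f, shape, steps=60, step_size=1.0, noise_scale=0.01):
    """从判别器 f 诱导的隐式生成器中采样 OOD 候选样本。"""
    x = torch.randn(shape, requires_grad=True)
    for _ in range(steps):
        energy = -torch.logsumexp(f(x), dim=-1).sum()
        grad, = torch.autograd.grad(energy, x)
        with torch.no_grad():
            x -= 0.5 * step_size * grad              # 往低能量(高密度)方向走
            x += noise_scale * torch.randn_like(x)   # 注入噪声保持随机性
    return x.detach()

# 用法示例:f 为任意输出各类 logits 的网络
f = torch.nn.Linear(16, 10)
samples = langevin_sample(f, (4, 16))
print(samples.shape)
```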
【10】 Fluent: An AI Augmented Writing Tool for People who Stutter 标题:流利:一款针对口吃者的人工智能增强写作工具 链接:https://arxiv.org/abs/2108.09918
作者:Bhavya Ghai,Klaus Mueller 机构:Stony Brook University 备注:Accepted to ACM ASSETS 2021 conference 摘要:口吃是一种言语障碍,影响着全世界数百万人的个人和职业生活。为了使自己免于耻辱和歧视,口吃者(PWS)可能会采取不同的策略来掩盖口吃。其中一个常见的策略是单词替换,即个体避免说他们可能结巴的单词,而是使用另一个替代词。这个过程本身会造成压力,增加负担。在这项工作中,我们介绍了Fluent,一种人工智能增强的写作工具,它可以帮助PWS编写他们能够更流利地说出的脚本。Fluent体现了一种新的基于主动学习的方法,用于识别个人可能难以发音的单词。这些词在界面中突出显示。将鼠标悬停在任何此类单词上,Fluent会显示一组具有类似含义但更容易说出的备选单词。用户可以自由接受或忽略这些建议。基于这种用户交互(反馈),Fluent不断改进其分类器,以更好地满足每个用户的个性化需求。我们通过在10个模拟用户上测量工具识别困难单词的能力来对其进行评估。我们发现,我们的工具可以在20次以下的交互中以平均超过80%的准确率识别困难单词,并且随着反馈的增加,它会不断改进。我们的工具可以在某些重要的生活场合发挥作用,如演讲、演示等。该工具的源代码已在github.com/bhavyaghai/Fluent上公开。 摘要:Stuttering is a speech disorder which impacts the personal and professional lives of millions of people worldwide. To save themselves from stigma and discrimination, people who stutter (PWS) may adopt different strategies to conceal their stuttering. One of the common strategies is word substitution where an individual avoids saying a word they might stutter on and use an alternative instead. This process itself can cause stress and add more burden. In this work, we present Fluent, an AI augmented writing tool which assists PWS in writing scripts which they can speak more fluently. Fluent embodies a novel active learning based method of identifying words an individual might struggle pronouncing. Such words are highlighted in the interface. On hovering over any such word, Fluent presents a set of alternative words which have similar meaning but are easier to speak. The user is free to accept or ignore these suggestions. Based on such user interaction (feedback), Fluent continuously evolves its classifier to better suit the personalized needs of each user. We evaluated our tool by measuring its ability to identify difficult words for 10 simulated users. We found that our tool can identify difficult words with a mean accuracy of over 80% in under 20 interactions and it keeps improving with more feedback. Our tool can be beneficial for certain important life situations like giving a talk, presentation, etc. The source code for this tool has been made publicly accessible at github.com/bhavyaghai/Fluent.
【11】 FRUGAL: Unlocking SSL for Software Analytics 标题:节俭:解锁SSL for Software Analytics 链接:https://arxiv.org/abs/2108.09847
作者:Huy Tu,Tim Menzies 机构:Com Sci, NCState, USA 备注:Accepted for ASE 2022 摘要:标准软件分析通常需要拥有大量带标签的数据,才能构建性能可接受的模型。然而,之前的工作表明,这样的需求可能很昂贵,需要几周的时间来标记数千个提交,并且在面对新的研究问题和领域时并不总是可用。无监督学习是在未标记数据中学习隐藏模式的一个很有前途的方向,但它仅在缺陷预测中得到了广泛研究。此外,无监督学习本身可能是无效的,并且尚未在其他领域(例如静态分析和问题关闭时间)得到探索。基于这一文献空白和技术限制,我们提出了FRUGAL,一种经过调优的半监督方法,它建立在一个简单的优化方案之上,不需要复杂(如深度学习者)和昂贵(如100%人工标记数据)的方法。FRUGAL通过简单的网格搜索优化无监督学习者的配置,同时验证了我们在预测前仅标记2.5%数据的设计决策。如本文实验所示,FRUGAL优于最先进的可采用静态代码警告识别器和问题关闭时间预测器,同时将标记成本降低了40倍(从100%降至2.5%)。因此,我们断言,FRUGAL可以在数据标记方面节省大量精力,特别是在验证以前的工作或研究新问题时。基于这项工作,我们建议,复杂和昂贵方法的支持者应始终将此类方法与更简单、更便宜的替代方法进行基线比较。例如,像FRUGAL这样的半监督学习者可以作为最先进软件分析的基线。 摘要:Standard software analytics often involves having a large amount of data with labels in order to commission models with acceptable performance. However, prior work has shown that such requirements can be expensive, taking several weeks to label thousands of commits, and not always available when traversing new research problems and domains. Unsupervised Learning is a promising direction to learn hidden patterns within unlabelled data, which has only been extensively studied in defect prediction. Nevertheless, unsupervised learning can be ineffective by itself and has not been explored in other domains (e.g., static analysis and issue close time). Motivated by this literature gap and technical limitations, we present FRUGAL, a tuned semi-supervised method that builds on a simple optimization scheme that does not require sophisticated (e.g., deep learners) and expensive (e.g., 100% manually labelled data) methods. FRUGAL optimizes the unsupervised learner's configurations (via a simple grid search) while validating our design decision of labelling just 2.5% of the data before prediction. As shown by the experiments of this paper FRUGAL outperforms the state-of-the-art adoptable static code warning recognizer and issue closed time predictor, while reducing the cost of labelling by a factor of 40 (from 100% to 2.5%). Hence we assert that FRUGAL can save considerable effort in data labelling especially in validating prior work or researching new problems. Based on this work, we suggest that proponents of complex and expensive methods should always baseline such methods against simpler and cheaper alternatives. For instance, a semi-supervised learner like FRUGAL can serve as a baseline to the state-of-the-art software analytics.
【12】 Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World TriFinger 标题:将GPU仿真中的灵巧操作转移到远程真实的三指格上 链接:https://arxiv.org/abs/2108.09779
作者:Arthur Allshire,Mayank Mittal,Varun Lodaya,Viktor Makoviychuk,Denys Makoviichuk,Felix Widmaier,Manuel Wüthrich,Stefan Bauer,Ankur Handa,Animesh Garg 机构:University of Toronto, Vector Institute,ETH Zurich,Nvidia,Snap,MPI Tubingen 备注:13 pages, 11 figures 摘要:我们提出了一个学习具有挑战性的灵巧操作任务的系统,该任务要求仅用3根手指将立方体移动到任意6自由度位姿,并使用NVIDIA的IsaacGym模拟器进行训练。我们在模拟和模拟到真实的迁移两方面展示了经验优势:在策略观察和奖励计算中,用关键点(而非位置+四元数)表示物体的6自由度位姿,来训练无模型强化学习代理。通过利用域随机化策略以及被操纵物体位姿的关键点表示,我们在由真实机器人挑战赛组织者维护的远程TriFinger系统上实现了83%的高成功率。为了帮助进一步研究手内操作的学习,我们在 https://s2r2-ig.github.io 公开了系统代码库以及附带数十亿步经验的训练检查点。 摘要:We present a system for learning a challenging dexterous manipulation task involving moving a cube to an arbitrary 6-DoF pose with only 3-fingers trained with NVIDIA's IsaacGym simulator. We show empirical benefits, both in simulation and sim-to-real transfer, of using keypoints as opposed to position+quaternion representations for the object pose in 6-DoF for policy observations and in reward calculation to train a model-free reinforcement learning agent. By utilizing domain randomization strategies along with the keypoint representation of the pose of the manipulated object, we achieve a high success rate of 83% on a remote TriFinger system maintained by the organizers of the Real Robot Challenge. With the aim of assisting further research in learning in-hand manipulation, we make the codebase of our system, along with trained checkpoints that come with billions of steps of experience available, at https://s2r2-ig.github.io
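关键点表示的要点:把位姿(位置, 四元数)映射为立方体8个角点的三维坐标,再用角点间的欧氏距离构造观察与奖励。下面是假设性示意(半边长等数值为虚构):

```python
import numpy as np
from scipy.spatial.transform import Rotation

CUBE_HALF = 0.0325  # 假设的立方体半边长(米)

def cube_keypoints(position, quat_xyzw):
    """由6自由度位姿计算8个角点关键点坐标。"""
    corners = CUBE_HALF * np.array([[sx, sy, sz]
                                    for sx in (-1, 1)
                                    for sy in (-1, 1)
                                    for sz in (-1, 1)])
    R = Rotation.from_quat(quat_xyzw).as_matrix()
    return corners @ R.T + np.asarray(position)

def keypoint_reward(pose, goal_pose):
    """奖励取负的平均角点距离:比直接比较四元数更平滑、无二义性。"""
    kp, kp_goal = cube_keypoints(*pose), cube_keypoints(*goal_pose)
    return -np.linalg.norm(kp - kp_goal, axis=1).mean()

pose = (np.zeros(3), np.array([0.0, 0.0, 0.0, 1.0]))          # 单位四元数
goal = (np.array([0.1, 0.0, 0.0]), np.array([0.0, 0.0, 0.0, 1.0]))
print(keypoint_reward(pose, goal))
```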
【13】 Efficient Gaussian Neural Processes for Regression 标题:回归的有效高斯神经过程 链接:https://arxiv.org/abs/2108.09676
作者:Stratis Markou,James Requeima,Wessel Bruinsma,Richard Turner 机构:Department of Engineering, University of Cambridge 备注:6 pages 摘要:条件神经过程(CNP;Garnelo等人,2018)是一个极具吸引力的元学习模型家族,该家族能够产生校准良好的预测,在测试时能够进行快速推理,并且可以通过简单的最大似然程序进行训练。CNP的一个限制是无法对输出中的依赖关系进行建模。这严重损害了预测性能,使得无法提取一致的函数样本,从而限制了CNP在下游应用和决策中的适用性。神经过程(NP;Garnelo等人,2018)试图通过使用潜在变量来缓解这一问题,依靠这些变量对输出依赖性进行建模,但引入了源自近似推理的困难。最近的一个备选方案(Bruinsma等人,2021),我们称之为FullConvGNP,它在预测中建模依赖性,同时仍然可以通过精确的最大似然法进行训练。不幸的是,FullConvGNP依赖于昂贵的二维卷积,这限制了它仅适用于一维数据。在这项工作中,我们提出了一种对输出依赖性建模的替代方法,这种方法同样可以通过最大似然进行训练,但与FullConvGNP不同,它可以扩展到二维和三维数据。所提出的模型在合成实验中表现出良好的性能。 摘要:Conditional Neural Processes (CNPs; Garnelo et al., 2018) are an attractive family of meta-learning models which produce well-calibrated predictions, enable fast inference at test time, and are trainable via a simple maximum likelihood procedure. A limitation of CNPs is their inability to model dependencies in the outputs. This significantly hurts predictive performance and renders it impossible to draw coherent function samples, which limits the applicability of CNPs in downstream applications and decision making. Neural Processes (NPs; Garnelo et al., 2018) attempt to alleviate this issue by using latent variables, relying on these to model output dependencies, but introduce difficulties stemming from approximate inference. One recent alternative (Bruinsma et al., 2021), which we refer to as the FullConvGNP, models dependencies in the predictions while still being trainable via exact maximum likelihood. Unfortunately, the FullConvGNP relies on expensive 2D convolutions, which limit its applicability to only one-dimensional data. In this work, we present an alternative way to model output dependencies which also lends itself to maximum-likelihood training but, unlike the FullConvGNP, can be scaled to two- and three-dimensional data. The proposed models exhibit good performance in synthetic experiments.
【14】 Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift 标题:PI-NAS:通过减少超网训练一致性漂移来改进神经结构搜索 链接:https://arxiv.org/abs/2108.09671
作者:Jiefeng Peng,Jiqi Zhang,Changlin Li,Guangrun Wang,Xiaodan Liang,Liang Lin 机构:Sun Yat-sen University, DarkMatter AI Research, GORSE Lab, Dept. of DSAI, Monash University, University of Oxford 备注:Accepted to ICCV 2021 摘要:最近提出的神经结构搜索(NAS)方法在一个超网中联合训练数十亿个结构,并使用从超网分离的网络权重估计其潜在精度。然而,体系结构的预测精度与其实际能力之间的排序相关性是不正确的,这导致了现有NAS方法的困境。我们将这种排序相关性问题归因于超网训练一致性偏移,包括特征偏移和参数偏移。由于随机路径采样,特征偏移被识别为隐藏层的动态输入分布。输入分布的动态变化影响损失下降,最终影响架构排名。参数偏移被识别为在不同的训练步骤中,对位于不同路径中的共享层进行的相互矛盾的参数更新。快速变化的参数无法保持体系结构排名。我们使用一个非平凡的超网Pi模型(称为Pi-NAS)同时处理这两种偏移。具体来说,我们采用了一个包含交叉路径学习的超网Pi模型,以减少不同路径之间的特征一致性偏移。同时,我们采用了一种新颖的、包含负样本的非平凡mean teacher(均值教师)来克服参数偏移和模型碰撞。此外,我们的Pi-NAS以无监督的方式运行,可以搜索更多可迁移的体系结构。在ImageNet和一系列下游任务(如COCO 2017、ADE20K和Cityscapes)上进行的大量实验证明了我们的Pi-NAS与监督NAS相比的有效性和普遍性。见代码:https://github.com/Ernie1/Pi-NAS. 摘要:Recently proposed neural architecture search (NAS) methods co-train billions of architectures in a supernet and estimate their potential accuracy using the network weights detached from the supernet. However, the ranking correlation between the architectures' predicted accuracy and their actual capability is incorrect, which causes the existing NAS methods' dilemma. We attribute this ranking correlation problem to the supernet training consistency shift, including feature shift and parameter shift. Feature shift is identified as dynamic input distributions of a hidden layer due to random path sampling. The input distribution dynamic affects the loss descent and finally affects architecture ranking. Parameter shift is identified as contradictory parameter updates for a shared layer lay in different paths in different training steps. The rapidly-changing parameter could not preserve architecture ranking. We address these two shifts simultaneously using a nontrivial supernet-Pi model, called Pi-NAS. Specifically, we employ a supernet-Pi model that contains cross-path learning to reduce the feature consistency shift between different paths. Meanwhile, we adopt a novel nontrivial mean teacher containing negative samples to overcome parameter shift and model collision. Furthermore, our Pi-NAS runs in an unsupervised manner, which can search for more transferable architectures. Extensive experiments on ImageNet and a wide range of downstream tasks (e.g., COCO 2017, ADE20K, and Cityscapes) demonstrate the effectiveness and universality of our Pi-NAS compared to supervised NAS. See Codes: https://github.com/Ernie1/Pi-NAS.
【15】 A Systematic Literature Review of Automated Query Reformulations in Source Code Search 标题:源代码搜索中自动查询重构的系统文献综述 链接:https://arxiv.org/abs/2108.09646
作者:Mohammad Masudur Rahman,Chanchal K. Roy 备注:68 pages, 15 tables, 25 figures, to be submitted to CSUR 摘要:软件开发人员经常修复关键的bug,以确保其软件的可靠性。他们可能还需要定期为软件添加新功能,以保持市场竞争力。这些缺陷和特性被报告为变更请求(即软件用户编写的技术文档)。开发人员查阅这些文档以实现软件代码中所需的更改。作为变更实施的一部分,他们通常从变更请求中选择几个重要的关键字作为临时查询。然后,他们使用代码搜索引擎(如Lucene)执行查询,并试图找出软件代码中需要更改的确切位置。不幸的是,即使是有经验的开发人员也常常无法选择正确的查询。因此,开发人员在检测代码中的适当位置时经常会遇到困难,他们的大部分时间都花在大量的尝试和错误上。有许多研究试图通过自动重构开发人员的临时查询来支持其构造查询。在这篇系统的文献综述中,我们从2970个候选研究中仔细选择了70个关于查询重构的初步研究,使用扎根理论方法进行了深入的定性分析,然后回答了六个重要的研究问题。我们的调查报告了几项重大发现。首先,到目前为止,在查询重构中采用了八种主要方法(例如,术语加权、查询术语共现分析、同义词库查找)。其次,现有的研究存在一些主要的局限性(例如,缺乏普遍性、词汇不匹配问题、评估不力、开发人员的额外负担),这可能会阻碍它们的广泛采用。最后,我们讨论了搜索查询重构中的几个开放问题,并提出了未来的多个研究机会。 摘要:Software developers often fix critical bugs to ensure the reliability of their software. They might also need to add new features to their software at a regular interval to stay competitive in the market. These bugs and features are reported as change requests (i.e., technical documents written by software users). Developers consult these documents to implement the required changes in the software code. As a part of change implementation, they often choose a few important keywords from a change request as an ad hoc query. Then they execute the query with a code search engine (e.g., Lucene) and attempt to find out the exact locations within the software code that need to be changed. Unfortunately, even experienced developers often fail to choose the right queries. As a consequence, the developers often experience difficulties in detecting the appropriate locations within the code and spend the majority of their time in numerous trials and errors. There have been many studies that attempt to support developers in constructing queries by automatically reformulating their ad hoc queries. In this systematic literature review, we carefully select 70 primary studies on query reformulations from 2,970 candidate studies, perform an in-depth qualitative analysis using the Grounded Theory approach, and then answer six important research questions. Our investigation has reported several major findings. First, to date, eight major methodologies (e.g., term weighting, query-term co-occurrence analysis, thesaurus lookup) have been adopted in query reformulation. Second, the existing studies suffer from several major limitations (e.g., lack of generalizability, vocabulary mismatch problem, weak evaluation, the extra burden on the developers) that might prevent their wide adoption. Finally, we discuss several open issues in search query reformulations and suggest multiple future research opportunities.
【16】 Joint Characterization of Spatiotemporal Data Manifolds 标题:时空数据流形的联合刻画 链接:https://arxiv.org/abs/2108.09545
作者:Daniel Sousa,Christopher Small 机构:Department of Geography, San Diego State University, San Diego, CA, USA, Lamont-Doherty Earth Observatory, Columbia University, Palisades, NY, USA 备注:20 pages, 5 figures 摘要:时空(ST)图像数据越来越常见,并且通常是高维(high-D)的。ST数据建模可能是一个挑战,因为可能存在大量独立且相互作用的过程,它们可能对测量有贡献,也可能没有。通过帮助指导关于生成过程及其在数据中的表示的假设,可以认为特征化是对建模的补充。降维(DR)是一种常用的特征化方法,旨在缓解高维信号的"维数灾难"。几十年来,主成分(PC)和经验正交函数(EOF)分析一直被用作线性、可逆的DR和ST分析方法。近年来,又出现了一系列非线性DR算法,这些算法通常被归类为"流形学习"。在这里,我们探讨了使用PCs/EOFs以及两种非线性DR方法(拉普拉斯特征映射(LE)和t分布随机邻居嵌入(t-SNE))联合表征ST数据流形的思想。从一个合成示例开始,逐步扩展到全球、区域和现场规模的ST数据集(空间跨度约5个数量级,时间跨度约2个数量级),我们表明这三种DR方法可以产生关于ST流形拓扑的互补信息。与PCs/EOF产生的相对弥散的TFS相比,非线性方法产生更紧凑的流形,减少了时间端成员(LE)和/或时空聚类(t-SNE)中的模糊性。作为对这些性质的补偿,与LE或t-SNE相比,PCs/EOF具有更高的可解释性、更低的计算需求和更低的空间混叠敏感性。综上所述,我们发现,与单独使用任何一种方法相比,使用三种互补的DR方法进行联合表征能够更深入地洞察生成性ST过程。 摘要:Spatiotemporal (ST) image data are increasingly common and often high-dimensional (high-D). Modeling ST data can be a challenge due to the plethora of independent and interacting processes which may or may not contribute to the measurements. Characterization can be considered the complement to modeling by helping guide assumptions about generative processes and their representation in the data. Dimensionality reduction (DR) is a frequently implemented type of characterization designed to mitigate the "curse of dimensionality" on high-D signals. For decades, Principal Component (PC) and Empirical Orthogonal Function (EOF) analysis has been used as a linear, invertible approach to DR and ST analysis. Recent years have seen the additional development of a suite of nonlinear DR algorithms, frequently categorized as "manifold learning". Here, we explore the idea of joint characterization of ST data manifolds using PCs/EOFs alongside two nonlinear DR approaches: Laplacian Eigenmaps (LE) and t-distributed stochastic neighbor embedding (t-SNE). Starting with a synthetic example and progressing to global, regional, and field scale ST datasets spanning roughly 5 orders of magnitude in space and 2 in time, we show these three DR approaches can yield complementary information about ST manifold topology. Compared to the relatively diffuse TFS produced by PCs/EOFs, the nonlinear approaches yield more compact manifolds with decreased ambiguity in temporal endmembers (LE) and/or in spatiotemporal clustering (t-SNE). These properties are compensated by the greater interpretability, significantly lower computational demand and diminished sensitivity to spatial aliasing for PCs/EOFs than LE or t-SNE. Taken together, we find joint characterization using the three complementary DR approaches capable of greater insight into generative ST processes than possible using any single approach alone.
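"联合表征"的做法可直接用 scikit-learn 搭出最小示意:同一展平后的时空矩阵分别送入 PCA(即PC/EOF)、拉普拉斯特征映射与 t-SNE,并排比较三种嵌入(数据为随机示意,并非论文数据):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import SpectralEmbedding, TSNE

rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 50))   # 每行:一个空间位置的时间序列(示意)

pcs = PCA(n_components=3).fit_transform(data)                  # 线性、可逆
le = SpectralEmbedding(n_components=3).fit_transform(data)     # 拉普拉斯特征映射
ts = TSNE(n_components=2, perplexity=30).fit_transform(data)   # 聚类式嵌入
print(pcs.shape, le.shape, ts.shape)
```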
【17】 Using growth transform dynamical systems for spatio-temporal data sonification 标题:利用生长变换动力系统进行时空数据可听化 链接:https://arxiv.org/abs/2108.09537
作者:Oindrila Chatterjee,Shantanu Chakrabartty 机构:Washington University in St. Louis, Missouri , USA. 备注:This article was submitted to PLoS One in March, 2021 and is currently under peer review 摘要:可听化(sonification),即把信息编码为有意义的音频特征,在增强或取代传统的人在回路决策可视化方法方面有若干优势。文献中报告的标准可听化方法涉及(i)仅使用变量子集,或(ii)首先解决数据上的学习任务,然后将输出映射到音频波形,最终用户利用该波形做出决策。本文提出了一种利用复杂增长变换动力系统模型对高维数据进行可听化的新框架,该框架将学习(或更一般地说,优化)过程和可听化过程集成在一起。我们的算法将学习或预测任务的数据和优化参数作为输入,并将其与用户定义的心理声学参数相结合。因此,该框架输出的双耳音频特征不仅编码了高维数据的一些统计特性,而且还揭示了优化/学习过程的潜在复杂性。除了使用合成数据集进行的大量实验之外,我们还演示了将脑电图(EEG)数据可听化的框架,其有潜力检测儿科患者的癫痫发作。 摘要:Sonification, or encoding information in meaningful audio signatures, has several advantages in augmenting or replacing traditional visualization methods for human-in-the-loop decision-making. Standard sonification methods reported in the literature involve either (i) using only a subset of the variables, or (ii) first solving a learning task on the data and then mapping the output to an audio waveform, which is utilized by the end-user to make a decision. This paper presents a novel framework for sonifying high-dimensional data using a complex growth transform dynamical system model where both the learning (or, more generally, optimization) and the sonification processes are integrated together. Our algorithm takes as input the data and optimization parameters underlying the learning or prediction task and combines it with the psychoacoustic parameters defined by the user. As a result, the proposed framework outputs binaural audio signatures that not only encode some statistical properties of the high-dimensional data but also reveal the underlying complexity of the optimization/learning process. Along with extensive experiments using synthetic datasets, we demonstrate the framework on sonifying Electro-encephalogram (EEG) data with the potential for detecting epileptic seizures in pediatric patients.
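论文把学习过程与可听化过程耦合在增长变换动力系统中;下面的草图只演示其中最基础的一环:把一条一维时间序列(例如一道 EEG 信号)映射为频率调制的正弦波。它不涉及论文的增长变换模型与双耳输出,所有参数均为假设值。

```python
import numpy as np

def sonify(series, sr=22_050, dur=4.0, f_lo=220.0, f_hi=880.0):
    """Map a 1-D series to a frequency-modulated sine wave (minimal sonification)."""
    t = np.linspace(0.0, dur, int(sr * dur), endpoint=False)
    s = np.interp(t, np.linspace(0.0, dur, len(series)), series)   # resample to audio rate
    s = (s - s.min()) / (np.ptp(s) + 1e-12)                        # normalize to [0, 1]
    freq = f_lo + (f_hi - f_lo) * s                                # value -> pitch
    phase = 2.0 * np.pi * np.cumsum(freq) / sr                     # integrate frequency
    return np.sin(phase)                                           # mono audio at rate sr

audio = sonify(np.sin(np.linspace(0, 8 * np.pi, 1000)))  # toy "EEG-like" input
```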
【18】 Term Interrelations and Trends in Software Engineering 标题:软件工程中的术语相互关系及发展趋势 链接:https://arxiv.org/abs/2108.09529
作者:Janusan Baskararajah,Lei Zhang,Andriy Miranskyy 机构:Department of Computer Science, Ryerson University, Toronto, Canada 备注:None 摘要:软件工程(SE)社区是多产的,这使得专家们很难跟上新论文的浪潮,新手也很难进入这个领域。因此,我们假设社区可能受益于从SE社区的文本语料库中提取术语及其相互关系并显示术语趋势的工具。在本文中,我们使用单词嵌入技术构建了一个原型工具。我们在SE Body of Knowledge手册和15233篇研究论文的标题和摘要中对嵌入进行了训练。我们还创建必要的测试用例来验证嵌入的训练。我们提供了具有代表性的例子,说明嵌入可能有助于总结术语和揭示知识库中的趋势。 摘要:The Software Engineering (SE) community is prolific, making it challenging for experts to keep up with the flood of new papers and for neophytes to enter the field. Therefore, we posit that the community may benefit from a tool extracting terms and their interrelations from the SE community's text corpus and showing terms' trends. In this paper, we build a prototyping tool using the word embedding technique. We train the embeddings on the SE Body of Knowledge handbook and 15,233 research papers' titles and abstracts. We also create test cases necessary for validation of the training of the embeddings. We provide representative examples showing that the embeddings may aid in summarizing terms and uncovering trends in the knowledge base.
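该原型工具的核心是词嵌入。下面用 gensim 的 Word2Vec 给出一个可运行的小示例;语料只是几句占位文本(论文实际在 SWEBOK 手册与 15,233 篇论文的标题和摘要上训练),超参数亦为假设值。

```python
from gensim.models import Word2Vec

# placeholder corpus; the paper trains on SWEBOK plus 15,233 titles/abstracts
docs = [
    "graph neural networks achieve strong performance on node classification".split(),
    "software maintenance bug localization relies on code search engines".split(),
    "code search engines index software repositories for maintenance tasks".split(),
]
model = Word2Vec(docs, vector_size=64, window=5, min_count=1, sg=1, epochs=100)
print(model.wv.most_similar("software", topn=3))  # nearest terms = term interrelations
```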
【19】 A computational study on imputation methods for missing environmental data 标题:缺失环境数据补偿方法的计算研究 链接:https://arxiv.org/abs/2108.09500
作者:Paul Dixneuf,Fausto Errico,Mathias Glaus 机构: Université du Québec 摘要:数据库形式的数据采集和记录是例行操作。然而,收集数据的过程可能会出现异常情况,导致数据库中数据丢失。缺少条目可能会改变分析效率,从而改变相关的决策过程。本文的重点是收集自然环境相关信息的数据库。鉴于记录的活动范围广泛,这些数据库通常是混合性质的。因此,考虑到这一特点,评估缺失数据处理方法的性能是相关的。本文研究了几种缺失数据插补方法的性能及其在环境缺失数据问题中的应用。进行了一项计算研究,以比较missForest(MF)方法与其他两种插补方法,即链式方程多变量插补(MICE)和K近邻插补(KNN)。对10个不同类型的预处理数据集进行了测试。结果显示,MF在插补误差方面通常优于MICE和KNN,混合型数据库的性能差距更为明显,与其他方法相比,MF将插补误差最多降低了150%。KNN通常是最快的方法。MF随后成功应用于魁北克污水处理厂性能监测的案例研究。我们相信,本研究证明了在处理缺失的环境数据时,使用MF作为插补方法的相关性。 摘要:Data acquisition and recording in the form of databases are routine operations. The process of collecting data, however, may experience irregularities, resulting in databases with missing data. Missing entries might alter analysis efficiency and, consequently, the associated decision-making process. This paper focuses on databases collecting information related to the natural environment. Given the broad spectrum of recorded activities, these databases typically are of mixed nature. It is therefore relevant to evaluate the performance of missing data processing methods considering this characteristic. In this paper we investigate the performances of several missing data imputation methods and their application to the problem of missing environmental data. A computational study was performed to compare the method missForest (MF) with two other imputation methods, namely Multivariate Imputation by Chained Equations (MICE) and K-Nearest Neighbors (KNN). Tests were made on 10 pretreated datasets of various types. Results revealed that MF generally outperformed MICE and KNN in terms of imputation errors, with a more pronounced performance gap for mixed-type databases, where MF reduced the imputation error by up to 150% compared to the other methods. KNN was usually the fastest method. MF was then successfully applied to a case study on Quebec wastewater treatment plants performance monitoring. We believe that the present study demonstrates the pertinence of using MF as an imputation method when dealing with missing environmental data.
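文中比较的三种方法都可以用 scikit-learn 近似搭建:MICE 约对应 IterativeImputer(sample_posterior=True),missForest 可用以随机森林为估计器的 IterativeImputer 近似,KNN 对应 KNNImputer。以下仅为对比流程的示意,数据为随机占位,并非论文所用的 10 个环境数据集。

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables API)
from sklearn.impute import IterativeImputer, KNNImputer

rng = np.random.default_rng(0)
X = rng.random((200, 8))                     # placeholder complete dataset
mask = rng.random(X.shape) < 0.1             # knock out 10% of entries at random
X_miss = X.copy()
X_miss[mask] = np.nan

imputers = {
    "MICE-like": IterativeImputer(sample_posterior=True, random_state=0),
    "MF-like": IterativeImputer(
        estimator=RandomForestRegressor(n_estimators=50, random_state=0),
        random_state=0,
    ),
    "KNN": KNNImputer(n_neighbors=5),
}
for name, imp in imputers.items():
    err = np.abs(imp.fit_transform(X_miss) - X)[mask].mean()
    print(f"{name}: mean absolute imputation error = {err:.4f}")
```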
【20】 Temporal Induced Self-Play for Stochastic Bayesian Games 标题:随机贝叶斯对策的时间诱导自玩 链接:https://arxiv.org/abs/2108.09444
作者:Weizhe Chen,Zihan Zhou,Yi Wu,Fei Fang 机构:Shanghai Jiao Tong University, Shanghai Qi Zhi Institute, Tsinghua University, Carnegie Mellon University 备注:None 摘要:解决动态博弈的一个实际要求是确保玩家从任何决策点开始都能玩得很好。为了满足这一需求,现有的工作主要集中在均衡精炼(equilibrium refinement)上,但现有技术的可扩展性和适用性受到限制。在本文中,我们提出了一个新的基于强化学习的框架——时间诱导自博弈(TISP),它可以从任何决策点开始寻找性能良好的策略。TISP使用信念空间表示、反向归纳、策略学习和非参数近似。在TISP的基础上,我们设计了一个基于策略梯度的算法TISP-PG。我们证明了基于TISP的算法可以在有限视界的零和单边随机贝叶斯博弈中找到近似的完美贝叶斯均衡。我们在各种博弈中测试基于TISP的算法,包括有限重复安全博弈和一个网格世界博弈。结果表明,TISP-PG比现有的基于数学规划的方法具有更好的可扩展性,并且显著优于其他基于学习的方法。 摘要:One practical requirement in solving dynamic games is to ensure that the players play well from any decision point onward. To satisfy this requirement, existing efforts focus on equilibrium refinement, but the scalability and applicability of existing techniques are limited. In this paper, we propose Temporal-Induced Self-Play (TISP), a novel reinforcement learning-based framework to find strategies with decent performances from any decision point onward. TISP uses belief-space representation, backward induction, policy learning, and non-parametric approximation. Building upon TISP, we design a policy-gradient-based algorithm TISP-PG. We prove that TISP-based algorithms can find approximate Perfect Bayesian Equilibrium in zero-sum one-sided stochastic Bayesian games with finite horizon. We test TISP-based algorithms in various games, including finitely repeated security games and a grid-world game. The results show that TISP-PG is more scalable than existing mathematical programming-based methods and significantly outperforms other learning-based methods.
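TISP 的关键步骤之一是在有限视界上做反向归纳。下面的草图把论文中的策略学习与信念空间近似全部省略,只在一个极小的两人零和随机博弈上演示反向归纳本身:从最后一个时间步开始,用线性规划精确求解每个状态的阶段博弈,并把博弈值逐层回传。博弈的收益与转移均为随意假设的玩具数值。

```python
import numpy as np
from scipy.optimize import linprog

def solve_zero_sum(M):
    """Row player's maximin mixed strategy and value for payoff matrix M."""
    n, m = M.shape
    c = np.zeros(n + 1)
    c[-1] = -1.0                                      # maximize v == minimize -v
    A_ub = np.hstack([-M.T, np.ones((m, 1))])         # v - (M^T x)_j <= 0 for all j
    A_eq = np.ones((1, n + 1))
    A_eq[0, -1] = 0.0                                 # probabilities sum to 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(m), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * n + [(None, None)])
    return res.x[:n], -res.fun

# toy game: 2 states, 2 actions per player, horizon 3; P[s, a, b] = next state
R = np.array([[[1.0, -1.0], [-1.0, 1.0]],             # stage payoffs per state
              [[0.5, 0.0], [0.0, -0.5]]])
P = np.array([[[0, 1], [1, 0]],
              [[1, 0], [0, 1]]])
H = 3
V = np.zeros(2)                                       # terminal values
for t in reversed(range(H)):                          # backward induction over time
    V = np.array([solve_zero_sum(R[s] + V[P[s]])[1] for s in range(2)])
print("game values at t=0:", V)
```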
【21】 Fast Sketching of Polynomial Kernels of Polynomial Degree 标题:多项式次多项式核的快速绘制 链接:https://arxiv.org/abs/2108.09420
作者:Zhao Song,David P. Woodruff,Zheng Yu,Lichen Zhang 机构: Princeton University 备注:ICML 2021 摘要:核方法是机器学习的基础,核近似的快速算法为机器学习中的许多核心任务提供了直接的加速。多项式核尤其重要,因为其他核通常可以通过泰勒级数展开由多项式核近似。最近的不经意草图(oblivious sketching)技术将运行时间对多项式核阶数$q$的依赖从指数级降至多项式级,这对高斯核尤为有用,因为其$q$可以取为多对数级别。然而,对于增长较慢的核,例如神经正切核和弧余弦核,$q$需要是多项式级的,而以前的工作会导致运行时间出现多项式因子的减慢。我们给出了一个新的不经意草图,它通过消除主导阶项中对$q$的依赖,大大改进了这一运行时间。结合一种新的采样方案,我们给出了逼近一大类缓慢增长核的最快算法。 摘要:Kernel methods are fundamental in machine learning, and faster algorithms for kernel approximation provide direct speedups for many core tasks in machine learning. The polynomial kernel is especially important as other kernels can often be approximated by the polynomial kernel via a Taylor series expansion. Recent techniques in oblivious sketching reduce the dependence in the running time on the degree $q$ of the polynomial kernel from exponential to polynomial, which is useful for the Gaussian kernel, for which $q$ can be chosen to be polylogarithmic. However, for more slowly growing kernels, such as the neural tangent and arc-cosine kernels, $q$ needs to be polynomial, and previous work incurs a polynomial factor slowdown in the running time. We give a new oblivious sketch which greatly improves upon this running time, by removing the dependence on $q$ in the leading order term. Combined with a novel sampling scheme, we give the fastest algorithms for approximating a large family of slow-growing kernels.
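被本文改进的基线是多项式核的经典不经意草图 TensorSketch:对 q 个独立 CountSketch 的结果做 FFT 逐点相乘再做逆变换,即可近似 <x,y>^q。下面是一个自包含的 numpy 示意(注意:这只是基线思路的演示,并非论文提出的新草图);草图宽度 m 为假设值,近似误差随 m 增大而减小。

```python
import numpy as np

def tensor_sketch(X, q, m, seed=0):
    """TensorSketch of each row of X for the degree-q polynomial kernel <x,y>^q."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    h = rng.integers(0, m, size=(q, d))               # q CountSketch hash functions
    s = rng.choice([-1.0, 1.0], size=(q, d))          # and their random signs
    prod = np.ones((n, m), dtype=complex)
    for i in range(q):
        cs = np.zeros((n, m))
        for j in range(d):                            # scatter signed coords into buckets
            cs[:, h[i, j]] += s[i, j] * X[:, j]
        prod *= np.fft.fft(cs, axis=1)                # product in frequency = convolution
    return np.fft.ifft(prod, axis=1).real

rng = np.random.default_rng(1)
X, Y = rng.standard_normal((4, 10)), rng.standard_normal((4, 10))
q, m = 3, 4096
SX, SY = tensor_sketch(X, q, m, seed=0), tensor_sketch(Y, q, m, seed=0)
print(np.abs(SX @ SY.T - (X @ Y.T) ** q).max())       # error shrinks as m grows
```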
【22】 D-DARTS: Distributed Differentiable Architecture Search 标题:D-DARTS:分布式可区分体系结构搜索 链接:https://arxiv.org/abs/2108.09306
作者:Alexandre Heuillet,Hedi Tabia,Hichem Arioui,Kamal Youcef-Toumi 机构:Université Paris-Saclay, MIT 摘要:可微结构搜索(DARTS)是最流行的神经结构搜索(NAS)方法之一,通过采用随机梯度下降(SGD)和权重共享,大大降低了搜索成本。然而,它也大大减少了搜索空间,从而排除了潜在的有希望的架构被发现。在本文中,我们提出了D-DARTS,这是一种新的解决方案,它通过在单元(cell)级别嵌套多个神经网络而不是使用权重共享,来产生更多样化和更专门化的架构,从而解决此问题。此外,我们还介绍了一种新的算法,该算法可以从少量训练单元中派生出更深层次的体系结构,从而提高了性能并节省了计算时间。我们的解决方案能够在CIFAR-10、CIFAR-100和ImageNet上提供最先进的结果,同时使用的参数比以前的基线少得多,从而产生更具硬件效率的神经网络。 摘要:Differentiable ARchiTecture Search (DARTS) is one of the most trending Neural Architecture Search (NAS) methods, drastically reducing search cost by resorting to Stochastic Gradient Descent (SGD) and weight-sharing. However, it also greatly reduces the search space, thus excluding potential promising architectures from being discovered. In this paper, we propose D-DARTS, a novel solution that addresses this problem by nesting several neural networks at cell-level instead of using weight-sharing to produce more diversified and specialized architectures. Moreover, we introduce a novel algorithm which can derive deeper architectures from a few trained cells, increasing performance and saving computation time. Our solution is able to provide state-of-the-art results on CIFAR-10, CIFAR-100 and ImageNet while using significantly fewer parameters than previous baselines, resulting in more hardware-efficient neural networks.
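DARTS 的基本构件是"混合操作":用架构参数的 softmax 对候选算子加权求和。按摘要的描述,D-DARTS 的改动可以大致理解为让每个 cell 拥有自己独立的架构参数而非全局共享。下面用 PyTorch 写一个最小的 MixedOp 示意(非官方实现,候选算子集合为假设)。

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """Softmax-weighted sum of candidate ops, with module-local architecture params."""
    def __init__(self, channels: int):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Identity(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.Conv2d(channels, channels, 5, padding=2, bias=False),
            nn.AvgPool2d(3, stride=1, padding=1),
        ])
        # each MixedOp owns its own alpha (no sharing across cells, in the D-DARTS spirit)
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = F.softmax(self.alpha, dim=0)              # continuous relaxation of op choice
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

x = torch.randn(2, 16, 32, 32)
print(MixedOp(16)(x).shape)                           # torch.Size([2, 16, 32, 32])
```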
【23】 Primal and Dual Combinatorial Dimensions 标题:原始和对偶组合维数 链接:https://arxiv.org/abs/2108.10037
作者:Pieter Kleer,Hans Simon 机构:Tilburg University, Tilburg, The Netherlands, Max Planck Institute for Informatics, Saarland Informatics Campus (SIC), Saarbrücken, Germany 摘要:对于多值函数类,我们给出了各种组合维数(如伪维数和fat-shattering维数)的原始维数与对偶维数之间关系的紧界。这些维数概念在学习理论领域发挥着重要作用。我们首先回顾了一些众所周知的(folklore)结果,这些结果用函数类的原始维数来界定其对偶维数,随后给出了(几乎)匹配的下界。特别是,我们将Assouad(1983)提出的、联系二元函数类原始VC维与对偶VC维的著名界,适当地推广到了多值函数类。 摘要:We give tight bounds on the relation between the primal and dual of various combinatorial dimensions, such as the pseudo-dimension and fat-shattering dimension, for multi-valued function classes. These dimensional notions play an important role in the area of learning theory. We first review some (folklore) results that bound the dual dimension of a function class in terms of its primal, and after that give (almost) matching lower bounds. In particular, we give an appropriate generalization to multi-valued function classes of a well-known bound due to Assouad (1983), that relates the primal and dual VC-dimension of a binary function class.
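原始维与对偶维的关系可以在小例子上直接暴力验证:把二元函数类写成 0/1 矩阵(行为函数、列为样本点),原始 VC 维考察列的打散,对偶 VC 维即转置矩阵的 VC 维。下面的 Python 草图实现了这一检查,可用来核对 Assouad 型上界 dual < 2^(primal+1);随机矩阵仅为演示。

```python
import itertools
import numpy as np

def vc_dimension(M: np.ndarray) -> int:
    """VC dimension of a binary function class given as a 0/1 matrix
    (rows = functions, columns = domain points)."""
    best = 0
    for k in range(1, M.shape[1] + 1):
        for cols in itertools.combinations(range(M.shape[1]), k):
            if len({tuple(r) for r in M[:, cols]}) == 2 ** k:   # all 2^k patterns appear
                best = k
                break
        else:
            break                          # no k-set is shattered, so none larger is
    return best

rng = np.random.default_rng(0)
F = rng.integers(0, 2, size=(16, 10))      # a random binary function class
primal, dual = vc_dimension(F), vc_dimension(F.T)
print(primal, dual, dual < 2 ** (primal + 1))   # check the Assouad-type bound
```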
【24】 An Efficient Mini-batch Method via Partial Transportation 标题:一种高效的小批量部分运输方法 链接:https://arxiv.org/abs/2108.09645
作者:Khai Nguyen,Dang Nguyen,Tung Pham,Nhat Ho 机构:University of Texas, Austin†; VinAI Research, Vietnam⋄ 备注:25 pages, 10 figures 摘要:近年来,小批量最优传输(m-OT)被广泛用于解决大规模应用中OT的存储问题。尽管m-OT具有实用性,但其存在错误指定的映射,即在小批量级别上是最优、但在原始测度之间的最优运输计划中并不存在的映射。为了解决错误指定映射的问题,我们提出了一种新的小批量方法,通过使用小批量经验测度之间的部分最优传输(POT),我们称之为小批量部分最优传输(m-POT)。利用部分运输的洞察,我们解释了m-OT中错误指定映射的来源,并说明了为什么通过POT限制小批量之间运输的质量可以缓解错误映射。最后,我们在各种应用上进行了广泛的实验,以比较m-POT与m-OT以及最近提出的小批量方法——小批量非平衡最优传输(m-UOT)。我们观察到,在深度域自适应应用中,m-POT优于m-OT,同时具有与m-UOT相当的性能。在其他应用中,如深度生成模型、梯度流和颜色转移,m-POT比m-OT和m-UOT都能产生更好的性能。 摘要:Mini-batch optimal transport (m-OT) has been widely used recently to deal with the memory issue of OT in large-scale applications. Despite their practicality, m-OT suffers from misspecified mappings, namely, mappings that are optimal on the mini-batch level but do not exist in the optimal transportation plan between the original measures. To address the misspecified mappings issue, we propose a novel mini-batch method by using partial optimal transport (POT) between mini-batch empirical measures, which we refer to as mini-batch partial optimal transport (m-POT). Leveraging the insight from the partial transportation, we explain the source of misspecified mappings from the m-OT and motivate why limiting the amount of transported masses among mini-batches via POT can alleviate the incorrect mappings. Finally, we carry out extensive experiments on various applications to compare m-POT with m-OT and the recently proposed mini-batch method, mini-batch unbalanced optimal transport (m-UOT). We observe that m-POT is better than m-OT on deep domain adaptation applications while having comparable performance with m-UOT. On other applications, such as deep generative model, gradient flow, and color transfer, m-POT yields more favorable performance than both m-OT and m-UOT.
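m-POT 的小批量流程可以借助 POT 库(pip install pot,其中 ot.partial.partial_wasserstein 提供部分最优传输求解)来示意:在每对小批量经验测度之间只运输一部分质量,再对多个批次取平均。以下仅为流程草图,批大小、批次数与质量比例 mass 均为假设的超参数。

```python
import numpy as np
import ot  # POT: Python Optimal Transport

def minibatch_partial_ot(X, Y, batch=64, n_batches=32, mass=0.7, seed=0):
    """Average partial-OT cost over random mini-batch pairs (m-POT-style sketch)."""
    rng = np.random.default_rng(seed)
    a = np.full(batch, 1.0 / batch)                   # uniform empirical weights
    total = 0.0
    for _ in range(n_batches):
        xb = X[rng.choice(len(X), batch, replace=False)]
        yb = Y[rng.choice(len(Y), batch, replace=False)]
        M = ot.dist(xb, yb)                           # squared Euclidean cost matrix
        plan = ot.partial.partial_wasserstein(a, a, M, m=mass)  # move only `mass`
        total += float((plan * M).sum())
    return total / n_batches

rng = np.random.default_rng(1)
X = rng.standard_normal((500, 2))
Y = rng.standard_normal((500, 2)) + 3.0
print(minibatch_partial_ot(X, Y))
```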
【25】 How Can Increased Randomness in Stochastic Gradient Descent Improve Generalization? 标题:随机梯度下降中增加的随机性如何改善泛化? 链接:https://arxiv.org/abs/2108.09507
作者:Arwen V. Bradley,Carlos Alberto Gomez-Uribe 机构:Apple, Infinite Loop, Cupertino, CA 摘要:最近的研究表明,在随机梯度下降(SGD)中提高学习率或减小小批量可以提高测试集的性能。我们认为,在具有多个局部极小值的损失函数的模型中,这在某些条件下是预期的。我们的主要贡献是受物理学方法启发的一种近似分析方法,用于研究SGD学习率和批量大小在泛化中的作用。对于具有多个极小值的损失函数,我们刻画了训练和测试数据分布之间发生偏移时测试集的性能。这种偏移可能仅仅由采样造成,因此通常出现在实际应用中。我们表明,由此产生的局部极小值的偏移会因引入额外的曲率而恶化测试性能,这意味着选择宽和/或偏移较小的局部极小值可以改进泛化。然后我们专门研究SGD,并研究其在平稳性下的测试性能。由于SGD的精确平稳分布难以获得,我们推导了SGD的福克-普朗克(Fokker-Planck)近似,并得到了它的平稳分布。这一过程表明,学习率除以小批量大小的作用类似于统计力学中的温度,并意味着SGD(包括其平稳分布)在很大程度上对保持温度恒定的学习率或批量变化保持不变。我们表明,提高SGD温度有助于选择曲率较低的局部极小值,并且可以实现更好的泛化。我们在CIFAR10上进行了实验,证明了SGD的温度不变性,展示了随着SGD温度升高测试损失的改善,并量化了采样与域偏移在驱动这种效应中的作用。最后,我们给出了合成实验,展示了我们的理论如何应用于具有两个局部极小值的简化损失。 摘要:Recent works report that increasing the learning rate or decreasing the minibatch size in stochastic gradient descent (SGD) can improve test set performance. We argue this is expected under some conditions in models with a loss function with multiple local minima. Our main contribution is an approximate but analytical approach inspired by methods in Physics to study the role of the SGD learning rate and batch size in generalization. We characterize test set performance under a shift between the training and test data distributions for loss functions with multiple minima. The shift can simply be due to sampling, and is therefore typically present in practical applications. We show that the resulting shift in local minima worsens test performance by picking up curvature, implying that generalization improves by selecting wide and/or little-shifted local minima. We then specialize to SGD, and study its test performance under stationarity. Because obtaining the exact stationary distribution of SGD is intractable, we derive a Fokker-Planck approximation of SGD and obtain its stationary distribution instead. This process shows that the learning rate divided by the minibatch size plays a role analogous to temperature in statistical mechanics, and implies that SGD, including its stationary distribution, is largely invariant to changes in learning rate or batch size that leave its temperature constant. We show that increasing SGD temperature encourages the selection of local minima with lower curvature, and can enable better generalization. We provide experiments on CIFAR10 demonstrating the temperature invariance of SGD, improvement of the test loss as SGD temperature increases, and quantifying the impact of sampling versus domain shift in driving this effect. Finally, we present synthetic experiments showing how our theory applies in a simplified loss with two local minima.
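摘要中"温度 = 学习率/批大小"的不变性可以在最简单的玩具模型上检验:对二次损失,SGD 的平稳展布近似为 lr·σ²/(2·曲率·batch),因此 lr/batch 相同的两组设置表现几乎一致,温度升高则展布变大。下面的草图即演示这一点(论文关于宽/窄极小值选择的效应需要多极小值损失,此处从略);所有数值均为假设。

```python
import numpy as np

def stationary_std(lr, batch, curv=1.0, sigma=1.0, steps=200_000, seed=0):
    """Empirical stationary spread of SGD on f(w) = curv * w**2 / 2 with
    minibatch gradient noise of std sigma / sqrt(batch)."""
    rng = np.random.default_rng(seed)
    w = 0.0
    ws = np.empty(steps)
    for i in range(steps):
        g = curv * w + sigma / np.sqrt(batch) * rng.standard_normal()
        w -= lr * g
        ws[i] = w
    return ws[steps // 2:].std()            # discard the first half as burn-in

for lr, batch in [(0.01, 8), (0.04, 32), (0.04, 8)]:
    T = lr / batch
    print(f"lr={lr:<5} batch={batch:<3} T={T:.1e}  std={stationary_std(lr, batch):.4f}")
# the first two settings share T and give matching spread; the third (4x hotter)
# spreads about 2x wider, matching Var ~ lr * sigma**2 / (2 * curv * batch)
```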