Visit www.arxivdaily.com for daily digests with abstracts, covering CS | Physics | Math | Economics | Statistics | Finance | Biology | Electrical Engineering, with search, favorites, posting, and more!
cs.LG: 67 papers today
Graph-related (graph learning | graph neural networks | graph optimization, etc.) (5 papers)
【1】 Graph Kernel Attention Transformers
Authors: Krzysztof Choromanski, Han Lin, Haoxian Chen, Jack Parker-Holder
Affiliations: Google Brain Robotics & Columbia University; University of Oxford
Comments: 18 pages, 9 figures
Link: https://arxiv.org/abs/2107.07999
Abstract: We introduce a new class of graph neural networks (GNNs) by combining several concepts that were so far studied independently: graph kernels, attention-based networks with structural priors, and, more recently, efficient Transformer architectures applying small-memory-footprint implicit attention methods via low-rank decomposition techniques. The goal of the paper is twofold. The proposed Graph Kernel Attention Transformers (GKATs) are much more expressive than SOTA GNNs, as they are capable of modeling longer-range dependencies within a single layer; consequently, they can use shallower architecture designs. Furthermore, GKAT attention layers scale linearly rather than quadratically in the number of nodes of the input graphs, even when those graphs are dense, requiring less compute than their regular graph attention counterparts. They achieve this by applying new classes of graph kernels admitting random feature map decomposition via random walks on graphs. As a byproduct of the introduced techniques, we obtain a new class of learnable graph sketches, called graphots, compactly encoding topological graph properties as well as nodes' features. We conducted an exhaustive empirical comparison of our method with nine different GNN classes on tasks ranging from motif detection through social network classification to bioinformatics challenges, showing consistent gains coming from GKATs.
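The linear scaling claimed here rests on the standard kernelized-attention trick: given a feature map phi for the kernel, attention can be computed as phi(Q)(phi(K)^T V) without ever forming the n x n attention matrix. A minimal NumPy sketch of that trick follows; the generic positive feature map below is a stand-in assumption, not the paper's random-walk graph-kernel features.

```python
import numpy as np

def phi(x):
    # Positive feature map (ELU + 1); a generic stand-in for the paper's
    # random-walk graph-kernel feature maps, which are not reproduced here.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    # O(n) in the number of nodes: summarize keys/values once into a
    # (d x d_v) matrix, then read it out per query.
    Kp = phi(K)                                     # (n, d)
    num = phi(Q) @ (Kp.T @ V)                       # (n, d) @ (d, d_v)
    den = phi(Q) @ Kp.sum(axis=0, keepdims=True).T  # (n, 1) normalizer
    return num / den

n, d = 1000, 16
Q, K, V = (np.random.randn(n, d) for _ in range(3))
out = linear_attention(Q, K, V)  # (1000, 16), no n x n matrix ever built
```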
【2】 Graph Representation Learning for Road Type Classification
Authors: Zahra Gharaee, Shreyas Kowshik, Oliver Stromann, Michael Felsberg
Affiliations: Computer Vision Laboratory (CVL), Department of Electrical Engineering, Linköping University, Linköping, Sweden; Department of Mathematics, Indian Institute of Technology Kharagpur, India; Autonomous Transport Solutions Research, Scania CV AB, Sweden
Link: https://arxiv.org/abs/2107.07791
Abstract: We present a novel learning-based approach to graph representations of road networks employing state-of-the-art graph convolutional neural networks. Our approach is applied to realistic road networks of 17 cities from Open Street Map. While edge features are crucial to generate descriptive graph representations of road networks, graph convolutional networks usually rely on node features only. We show that the highly representative edge features can still be integrated into such networks by applying a line graph transformation. We also propose a method for neighborhood sampling based on a topological neighborhood composed of both local and global neighbors. We compare the performance of learning representations using different types of neighborhood aggregation functions in transductive and inductive tasks and in supervised and unsupervised learning. Furthermore, we propose a novel aggregation approach, the Graph Attention Isomorphism Network (GAIN). Our results show that GAIN outperforms state-of-the-art methods on the road type classification problem.
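The line-graph transformation is easy to reproduce: each edge of the road graph becomes a node of the line graph, so edge attributes become node features that any vanilla GCN can consume. A small sketch with networkx, under the assumption that attributes are simply copied over:

```python
import networkx as nx

# Toy road graph: nodes are intersections, edges carry road attributes.
G = nx.Graph()
G.add_edge(0, 1, length=120.0, lanes=2)
G.add_edge(1, 2, length=80.0, lanes=4)
G.add_edge(1, 3, length=300.0, lanes=2)

# Line graph: each node of L corresponds to an edge of G, so the edge
# attributes of G become node features of L.
L = nx.line_graph(G)
for u, v in L.nodes():
    L.nodes[(u, v)].update(G.edges[u, v])

print(L.nodes(data=True))
```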
【3】 EGC2: Enhanced Graph Classification with Easy Graph Compression
Authors: Jinyin Chen, Dunjie Zhang, Zhaoyan Ming, Mingwei Jia, Yi Liu
Affiliations: Zhejiang University of Technology; Institute of Computing Innovation, Zhejiang University
Comments: 14 pages, 11 figures
Link: https://arxiv.org/abs/2107.07737
Abstract: Graph classification plays a significant role in network analysis. It also faces potential security threats such as adversarial attacks. Some defense methods trade algorithmic complexity for robustness, like adversarial training, while others sacrifice clean-example performance, such as smoothing-based defenses; most suffer from high complexity or limited transferability. To address this problem, we propose EGC$^2$, an enhanced graph classification model with easy graph compression. EGC$^2$ captures the relationship between features of different nodes by constructing feature graphs and improving the aggregated node-level representation. To achieve a lower-complexity defense applicable to various graph classification models, EGC$^2$ utilizes a centrality-based edge-importance index to compress graphs, filtering out trivial structures and even adversarial perturbations of the input graphs, thus improving its robustness. Experiments on seven benchmark datasets demonstrate that the proposed feature read-out and graph compression mechanisms enhance the robustness of various basic models, achieving state-of-the-art accuracy and robustness under the threat of different adversarial attacks.
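A hedged sketch of the compression step: rank edges by a centrality-based importance index and keep only the top fraction. Plain edge betweenness is used here as a stand-in for the paper's exact importance index.

```python
import networkx as nx

def compress_graph(G, keep_ratio=0.7):
    # Score every edge by betweenness centrality (a stand-in for EGC^2's
    # centrality-based edge-importance index).
    scores = nx.edge_betweenness_centrality(G)
    ranked = sorted(scores, key=scores.get, reverse=True)
    # Keep only the most important edges; trivial structures (and, ideally,
    # adversarial perturbations) fall in the discarded tail.
    kept = ranked[: max(1, int(keep_ratio * len(ranked)))]
    return G.edge_subgraph(kept).copy()

H = compress_graph(nx.karate_club_graph(), keep_ratio=0.6)
```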
【4】 Correlation detection in trees for partial graph alignment
Authors: Luca Ganassali, Laurent Massoulié, Marc Lelarge
Comments: 22 pages, 1 figure. Preliminary version
Link: https://arxiv.org/abs/2107.07623
Abstract: We consider alignment of sparse graphs, which consists in finding a mapping between the nodes of two graphs which preserves most of the edges. Our approach is to compare local structures in the two graphs, matching two nodes if their neighborhoods are 'close enough': for correlated Erdős–Rényi random graphs, this problem can be locally rephrased in terms of testing whether a pair of branching trees is drawn from either a product distribution, or a correlated distribution. We design an optimal test for this problem which gives rise to a message-passing algorithm for graph alignment, which provably returns in polynomial time a positive fraction of correctly matched vertices, and a vanishing fraction of mismatches. With an average degree $\lambda = O(1)$ in the graphs, and a correlation parameter $s \in [0,1]$, this result holds with $\lambda s$ large enough, and $1-s$ small enough, completing the recent state-of-the-art diagram. Tighter conditions for determining whether partial graph alignment (or correlation detection in trees) is feasible in polynomial time are given in terms of Kullback-Leibler divergences.
【5】 Online Graph Topology Learning from Matrix-valued Time Series
Authors: Yiye Jiang, Jérémie Bigot, Sofian Maabout
Affiliations: Institut de Mathématiques de Bordeaux, Université de Bordeaux; Laboratoire Bordelais de Recherche en Informatique, Université de Bordeaux
Link: https://arxiv.org/abs/2107.08020
Abstract: This paper is concerned with the statistical analysis of matrix-valued time series. These are data collected over a network of sensors (typically a set of spatial locations), recording, over time, observations of multiple measurements. From such data, we propose to learn, in an online fashion, a graph that captures two aspects of dependency: one describing the sparse spatial relationship between sensors, and the other characterizing the measurement relationship. To this purpose, we introduce a novel multivariate autoregressive model to infer the graph topology encoded in the coefficient matrix, which captures the sparse Granger causality dependency structure present in such matrix-valued time series. We decompose the graph by imposing a Kronecker sum structure on the coefficient matrix. We develop two online approaches to learn the graph in a recursive way. The first one uses the Wald test for the projected OLS estimation, where we derive the asymptotic distribution of the estimator. For the second one, we formalize a Lasso-type optimization problem. We rely on homotopy algorithms to derive updating rules for estimating the coefficient matrix. Furthermore, we provide an adaptive tuning procedure for the regularization parameter. Numerical experiments using both synthetic and real data are performed to support the effectiveness of the proposed learning approaches.
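The Kronecker-sum decomposition can be written down directly: if A1 captures sensor-to-sensor dependencies and A2 measurement-to-measurement ones, the VAR coefficient acting on the vectorized data matrix is kron(I, A1) + kron(A2, I). A small NumPy illustration; the paper's estimation procedures (Wald-tested projected OLS, homotopy Lasso) are omitted.

```python
import numpy as np

p, q = 4, 3                           # p sensors, q measurements each
A1 = 0.1 * np.random.randn(p, p)      # spatial (sensor-to-sensor) graph
A2 = 0.1 * np.random.randn(q, q)      # measurement-relationship graph

# Kronecker sum: a single VAR(1) coefficient coupling both graphs.
A = np.kron(np.eye(q), A1) + np.kron(A2, np.eye(p))

X_t = np.random.randn(p, q)           # one matrix-valued observation
x_next = A @ X_t.flatten(order="F")   # predicts vec(X_{t+1}), column-major
```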
GAN | Adversarial | Attacks | Generation (3 papers)
【1】 Privacy-preserving Spatiotemporal Scenario Generation of Renewable Energies: A Federated Deep Generative Learning Approach
Authors: Yang Li, Jiazheng Li, Yi Wang
Affiliations: School of Electrical Engineering, Northeast Electric Power University
Comments: Accepted by IEEE Transactions on Industrial Informatics
Link: https://arxiv.org/abs/2107.07738
Abstract: Scenario generation is a fundamental and crucial tool for decision-making in power systems with high-penetration renewables. Based on big historical data, a novel federated deep generative learning framework, called Fed-LSGAN, is proposed by integrating federated learning and least square generative adversarial networks (LSGANs) for renewable scenario generation. Specifically, federated learning learns a shared global model in a central server from renewable sites at network edges, which enables the Fed-LSGAN to generate scenarios in a privacy-preserving manner without sacrificing the generation quality, by transferring model parameters rather than all data. Meanwhile, the LSGANs-based deep generative model generates scenarios that conform to the distribution of historical data through fully capturing the spatial-temporal characteristics of renewable powers, and leverages the least squares loss function to improve the training stability and generation quality. The simulation results demonstrate that the proposal manages to generate high-quality renewable scenarios and outperforms the state-of-the-art centralized methods. Besides, an experiment with different federated learning settings is designed and conducted to verify the robustness of our method.
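The LSGAN part is the standard least-squares objective, which replaces the GAN log-loss with squared errors against 0/1 targets. A PyTorch sketch of just those losses; the federated part (FedAvg-style aggregation of model parameters across sites) is omitted.

```python
import torch

def lsgan_losses(D, G, x_real, z):
    # Least-squares GAN objectives (Mao et al.): real -> 1, fake -> 0.
    x_fake = G(z)
    d_loss = 0.5 * ((D(x_real) - 1) ** 2).mean() \
           + 0.5 * (D(x_fake.detach()) ** 2).mean()
    g_loss = 0.5 * ((D(x_fake) - 1) ** 2).mean()
    return d_loss, g_loss
```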
【2】 ECG-Adv-GAN: Detecting ECG Adversarial Examples with Conditional Generative Adversarial Networks
Authors: Khondker Fariha Hossain, Sharif Amit Kamran, Alireza Tavakkoli, Lei Pan, Daniel Ma, Sutharshan Rajasegarar, Chandan Karmaker
Affiliations: University of Nevada, Reno, NV, USA; Deakin University, Australia
Comments: 8 pages, 3 figures, 4 tables
Link: https://arxiv.org/abs/2107.07677
Abstract: Electrocardiogram (ECG) acquisition requires an automated system and analysis pipeline for understanding specific rhythm irregularities. Deep neural networks have become a popular technique for tracing ECG signals, outperforming human experts. Despite this, convolutional neural networks are susceptible to adversarial examples that can misclassify ECG signals and decrease the model's precision. Moreover, they do not generalize well on out-of-distribution datasets. The GAN architecture has been employed in recent works to synthesize adversarial ECG signals to augment existing training data. However, these works use a disjointed CNN-based classification architecture to detect arrhythmia. Till now, no versatile architecture has been proposed that can detect adversarial examples and classify arrhythmia simultaneously. To alleviate this, we propose a novel Conditional Generative Adversarial Network to simultaneously generate ECG signals for different categories and detect cardiac abnormalities. Moreover, the model is conditioned on class-specific ECG signals to synthesize realistic adversarial examples. Consequently, we compare our architecture and show how it outperforms other classification models in normal/abnormal ECG signal detection by benchmarking real-world and adversarial signals.
【3】 Adversarial Attack for Uncertainty Estimation: Identifying Critical Regions in Neural Networks
Authors: Ismail Alarab, Simant Prakoonwit
Affiliations: Bournemouth University, United Kingdom
Comments: 15 pages, 6 figures. Submitted to Neural Processing Letters
Link: https://arxiv.org/abs/2107.07618
Abstract: We propose a novel method to capture data points near the decision boundary in a neural network, which are often associated with a specific type of uncertainty. In our approach, we seek to perform uncertainty estimation based on the idea of adversarial attack methods. In this paper, uncertainty estimates are derived from input perturbations, unlike previous studies that provide perturbations on the model's parameters, as in the Bayesian approach. We are able to produce uncertainty with a couple of perturbations on the inputs. Interestingly, we apply the proposed method to datasets derived from blockchain. We compare the performance of model uncertainty with the most recent uncertainty methods, and show that the proposed method significantly outperforms them while carrying less risk in capturing model uncertainty in machine learning.
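A minimal sketch of input-perturbation uncertainty: perturb the input several times and score uncertainty as the spread of the resulting predictions. Random Gaussian perturbations are used below as a simplification; the paper derives its perturbations from adversarial-attack directions instead.

```python
import numpy as np

def perturbation_uncertainty(predict_proba, x, eps=0.01, n=20, rng=None):
    # Spread of class probabilities under small input perturbations;
    # points near the decision boundary yield high spread.
    rng = rng or np.random.default_rng(0)
    probs = np.stack([predict_proba(x + eps * rng.standard_normal(x.shape))
                      for _ in range(n)])
    return probs.std(axis=0).mean()
```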
半/弱/无/有监督|不确定性|主动学习(5篇)
【1】 Uncertainty Prediction for Machine Learning Models of Material Properties
Authors: Francesca Tavazza, Brian De Cost, Kamal Choudhary
Affiliations: Materials Science and Engineering Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
Link: https://arxiv.org/abs/2107.07997
Abstract: Uncertainty quantification in Artificial Intelligence (AI)-based predictions of material properties is of immense importance for the success and reliability of AI applications in materials science. While confidence intervals are commonly reported for machine learning (ML) models, prediction intervals, i.e., the evaluation of the uncertainty on each prediction, are seldom available. In this work we compare 3 different approaches to obtain such individual uncertainty, testing them on 12 ML-physical properties. Specifically, we investigated using the quantile loss function, machine learning the prediction intervals directly, and using Gaussian Processes. We identify each approach's advantages and disadvantages and end up slightly favoring the modeling of the individual uncertainties directly, as it is the easiest to fit and, in most cases, minimizes over- and under-estimation of the predicted errors. All data for training and testing were taken from the publicly available JARVIS-DFT database, and the codes developed for computing the prediction intervals are available through JARVIS-Tools.
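The quantile-loss approach is straightforward to sketch: training one regressor with the pinball loss at tau = 0.05 and another at tau = 0.95 yields a 90% prediction interval per sample. A NumPy version of the loss (the tau values are illustrative):

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    # Quantile (pinball) loss: penalizes under-prediction with weight tau
    # and over-prediction with weight (1 - tau).
    diff = y_true - y_pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

# e.g. fit one model minimizing pinball_loss(y, f(x), 0.05) and another
# with tau = 0.95 to bracket each individual prediction.
```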
【2】 An Uncertainty-Aware, Shareable and Transparent Neural Network Architecture for Brain-Age Modeling
Authors: Tim Hahn, Jan Ernsting, Nils R. Winter, Vincent Holstein, Ramona Leenings, Marie Beisemann, Lukas Fisch, Kelvin Sarink, Daniel Emden, Nils Opel, Ronny Redlich, Jonathan Repple, Dominik Grotegerd, Susanne Meinert, Jochen G. Hirsch, Thoralf Niendorf, Beate Endemann, Fabian Bamberg, Thomas Kröncke, Robin Bülow, Henry Völzke, Oyunbileg von Stackelberg, Ramona Felizitas Sowade, Lale Umutlu, Börge Schmidt, Svenja Caspers, German National Cohort Study Center Consortium, Harald Kugel, Tilo Kircher, Benjamin Risse, Christian Gaser, James H. Cole, Udo Dannlowski, Klaus Berger
Affiliations: Institute for Translational Psychiatry, University of Münster, Germany; Department of Statistics, TU Dortmund University, Dortmund, Germany; Department of Psychology, University of Halle, Halle, Germany; Fraunhofer MEVIS, Bremen, Germany
Link: https://arxiv.org/abs/2107.07977
Abstract: The deviation between chronological age and age predicted from neuroimaging data has been identified as a sensitive risk-marker of cross-disorder brain changes, growing into a cornerstone of biological age-research. However, the Machine Learning models underlying the field do not consider uncertainty, thereby confounding results with training data density and variability. Also, existing models are commonly based on homogeneous training sets, often not independently validated, and cannot be shared due to data protection issues. Here, we introduce an uncertainty-aware, shareable, and transparent Monte-Carlo Dropout Composite-Quantile-Regression (MCCQR) Neural Network trained on N=10,691 datasets from the German National Cohort. The MCCQR model provides robust, distribution-free uncertainty quantification in high-dimensional neuroimaging data, achieving lower error rates compared to existing models across ten recruitment centers and in three independent validation samples (N=4,004). In two examples, we demonstrate that it prevents spurious associations and increases the power to detect accelerated brain-aging. We make the pre-trained model publicly available.
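The Monte-Carlo dropout half of MCCQR can be sketched in a few lines: keep dropout stochastic at inference and use the spread over repeated forward passes as the per-prediction uncertainty. The composite-quantile-regression head is omitted here.

```python
import torch

def mc_dropout_predict(model, x, n_samples=50):
    # model.train() keeps dropout layers stochastic at test time, so
    # each forward pass samples a different sub-network.
    model.train()
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    # Mean = point prediction; std = per-prediction uncertainty.
    return preds.mean(0), preds.std(0)
```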
【3】 Semi-supervised Learning for Marked Temporal Point Processes
Authors: Shivshankar Reddy, Anand Vir Singh Chauhan, Maneet Singh, Karamjit Singh
Affiliations: AI Garage, Mastercard, India
Link: https://arxiv.org/abs/2107.07729
Abstract: Temporal Point Processes (TPPs) are often used to represent a sequence of events ordered by time of occurrence. Owing to their flexible nature, TPPs have been used to model different scenarios and have shown applicability in various real-world applications. While TPPs focus on modeling event occurrence, the Marked Temporal Point Process (MTPP) models the category/class of the event as well (termed the marker). Research in MTPP has garnered substantial attention over the past few years, with an extensive focus on supervised algorithms. Despite this, limited attention has been given to the challenging problem of developing solutions in semi-supervised settings, where algorithms have access to a mix of labeled and unlabeled data. This research proposes a novel algorithm for Semi-supervised Learning for Marked Temporal Point Processes (SSL-MTPP) applicable in such scenarios. The proposed SSL-MTPP algorithm utilizes a combination of labeled and unlabeled data for learning a robust marker prediction model, and employs an RNN-based Encoder-Decoder module for learning effective representations of the time sequence. The efficacy of the proposed algorithm has been demonstrated via multiple protocols on the Retweet dataset, where SSL-MTPP shows improved performance in comparison to the traditional supervised learning approach.
【4】 Recognizing bird species in diverse soundscapes under weak supervision
Authors: Christof Henkel, Pascal Pfeiffer, Philipp Singer
Affiliations: NVIDIA, Munich, Germany; RWTH Aachen University, Aachen, Germany; H2O.ai, Mountain View, USA
Comments: All authors contributed equally. 8 pages, 4 figures, submitted to CEUR-WS
Link: https://arxiv.org/abs/2107.07728
Abstract: We present a robust classification approach for avian vocalization in complex and diverse soundscapes, achieving second place in the BirdCLEF2021 challenge. We illustrate how to make full use of pre-trained convolutional neural networks by using an efficient modeling and training routine supplemented by novel augmentation methods. Thereby, we improve the generalization of weakly labeled crowd-sourced data to productive data collected by autonomous recording units. As such, we illustrate how to progress towards an accurate automated assessment of avian populations, which would enable global biodiversity monitoring at scale, impossible by manual annotation.
【5】 Active learning for online training in imbalanced data streams under cold start
Authors: Ricardo Barata, Miguel Leite, Ricardo Pacheco, Marco O. P. Sampaio, João Tiago Ascensão, Pedro Bizarro
Affiliations: Feedzai
Comments: 9 pages, 6 figures, 2 tables
Link: https://arxiv.org/abs/2107.07724
Abstract: Labeled data is essential in modern systems that rely on Machine Learning (ML) for predictive modelling. Such systems may suffer from the cold-start problem: supervised models work well but, initially, there are no labels, which are costly or slow to obtain. This problem is even worse in imbalanced data scenarios. Online financial fraud detection is an example where labeling is: i) expensive, or ii) it suffers from long delays, if relying on victims filing complaints. The latter may not be viable if a model has to be in place immediately, so an option is to ask analysts to label events while minimizing the number of annotations to control costs. We propose an Active Learning (AL) annotation system for datasets with orders of magnitude of class imbalance, in a cold start streaming scenario. We present a computationally efficient Outlier-based Discriminative AL approach (ODAL) and design a novel 3-stage sequence of AL labeling policies where it is used as warm-up. Then, we perform empirical studies on four real world datasets, with various magnitudes of class imbalance. The results show that our method can more quickly reach a high performance model than standard AL policies. Its observed gains over random sampling can reach 80% and be competitive with policies with an unlimited annotation budget or additional historical data (with 1/10 to 1/50 of the labels).
Transfer | Zero/Few/One-Shot | Adaptation (3 papers)
【1】 Property-aware Adaptive Relation Networks for Molecular Property Prediction
Authors: Yaqing Wang, Abulikemu Abuduweili, Dejing Dou
Affiliations: Business Intelligence Lab, Baidu Research
Comments: molecular property prediction, few-shot learning, meta learning
Link: https://arxiv.org/abs/2107.07994
Abstract: Molecular property prediction plays a fundamental role in drug discovery to discover candidate molecules with target properties. However, molecular property prediction is essentially a few-shot problem, which makes it hard to obtain regular models. In this paper, we propose Property-aware Adaptive Relation networks (PAR) for the few-shot molecular property prediction problem. In comparison to existing works, we leverage the fact that both substructures and relationships among molecules differ across molecular properties. Our PAR is compatible with existing graph-based molecular encoders, and is further equipped with the ability to obtain property-aware molecular embeddings and model molecular relation graphs adaptively. The resultant relation graph also facilitates effective label propagation within each task. Extensive experiments on benchmark molecular property prediction datasets show that our method consistently outperforms state-of-the-art methods and is able to obtain property-aware molecular embeddings and model molecular relation graphs properly.
【2】 MS-MDA: Multisource Marginal Distribution Adaptation for Cross-subject and Cross-session EEG Emotion Recognition
Authors: Hao Chen, Ming Jin, Zhunan Li, Cunhang Fan, Jinpeng Li, Huiguang He
Affiliations: School of Computer Science and Technology
Comments: 10 pages, 8 figures
Link: https://arxiv.org/abs/2107.07740
Abstract: As an essential element for the diagnosis and rehabilitation of psychiatric disorders, electroencephalogram (EEG) based emotion recognition has achieved significant progress due to its high precision and reliability. However, one obstacle to practicality lies in the variability between subjects and sessions. Although several studies have adopted domain adaptation (DA) approaches to tackle this problem, most of them treat multiple EEG data from different subjects and sessions together as a single source domain for transfer, which either fails to satisfy the assumption of domain adaptation that the source has a certain marginal distribution, or increases the difficulty of adaptation. We therefore propose multi-source marginal distribution adaptation (MS-MDA) for EEG emotion recognition, which takes both domain-invariant and domain-specific features into consideration. First, we assume that different EEG data share the same low-level features; then we construct independent branches for multiple EEG data source domains to adopt one-to-one domain adaptation and extract domain-specific features. Finally, the inference is made by multiple branches. We evaluate our method on SEED and SEED-IV for recognizing three and four emotions, respectively. Experimental results show that MS-MDA outperforms the comparison methods and state-of-the-art models in cross-session and cross-subject transfer scenarios in our settings. Code at https://github.com/VoiceBeer/MS-MDA.
【3】 Adaptive first-order methods revisited: Convex optimization without Lipschitz requirements
Authors: Kimon Antonakopoulos, Panayotis Mertikopoulos
Comments: 34 pages, 4 figures
Link: https://arxiv.org/abs/2107.08011
Abstract: We propose a new family of adaptive first-order methods for a class of convex minimization problems that may fail to be Lipschitz continuous or smooth in the standard sense. Specifically, motivated by a recent flurry of activity on non-Lipschitz (NoLips) optimization, we consider problems that are continuous or smooth relative to a reference Bregman function, as opposed to a global, ambient norm (Euclidean or otherwise). These conditions encompass a wide range of problems with singular objectives, such as Fisher markets, Poisson tomography, D-design, and the like. In this setting, the application of existing order-optimal adaptive methods, like UnixGrad or AcceleGrad, is not possible, especially in the presence of randomness and uncertainty. The proposed method, which we call adaptive mirror descent (AdaMir), aims to close this gap by concurrently achieving min-max optimal rates in problems that are relatively continuous or smooth, including stochastic ones.
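For intuition, mirror descent with the negative-entropy Bregman function on the probability simplex reduces to multiplicative (exponentiated-gradient) updates. A sketch with a fixed step size; AdaMir's contribution is precisely adapting that step from observed gradients, which is not reproduced here.

```python
import numpy as np

def mirror_descent_simplex(grad, x0, steps=100, eta=0.1):
    # Entropic mirror descent on the simplex: multiplicative update
    # followed by a Bregman projection (renormalization).
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x * np.exp(-eta * grad(x))
        x /= x.sum()
    return x

# Example: minimize a quadratic <x, Ax> over the simplex.
A = np.diag([1.0, 2.0, 3.0])
x_star = mirror_descent_simplex(lambda x: 2 * A @ x, np.ones(3) / 3)
```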
Reinforcement Learning (2 papers)
【1】 Reinforcement Learning for Optimal Stationary Control of Linear Stochastic Systems
Authors: Bo Pang, Zhong-Ping Jiang
Affiliations: Tandon School of Engineering, New York University
Comments: 9 pages, 1 figure
Link: https://arxiv.org/abs/2107.07788
Abstract: This paper studies the optimal stationary control of continuous-time linear stochastic systems with both additive and multiplicative noises, using reinforcement learning techniques. Based on policy iteration, a novel off-policy reinforcement learning algorithm, named optimistic least-squares-based policy iteration, is proposed, which is able to iteratively find near-optimal policies of the optimal stationary control problem directly from input/state data, without explicitly identifying any system matrices, starting from an initial admissible control policy. The solutions given by the proposed optimistic least-squares-based policy iteration are proved to converge to a small neighborhood of the optimal solution with probability one, under mild conditions. The application of the proposed algorithm to a triple inverted pendulum example validates its feasibility and effectiveness.
【2】 Geometric Value Iteration: Dynamic Error-Aware KL Regularization for Reinforcement Learning
Authors: Toshinori Kitamura, Lingwei Zhu, Takamitsu Matsubara
Affiliations: Nara Institute of Science and Technology, Nara, Japan
Link: https://arxiv.org/abs/2107.07659
Abstract: The recent boom in the entropy-regularized literature reveals that Kullback-Leibler (KL) regularization brings advantages to Reinforcement Learning (RL) algorithms by canceling out errors under mild assumptions. However, existing analyses focus on fixed regularization with a constant weighting coefficient and have not considered the case where the coefficient is allowed to change dynamically. In this paper, we study the dynamic coefficient scheme and present the first asymptotic error bound. Based on the dynamic coefficient error bound, we propose an effective scheme to tune the coefficient according to the magnitude of error in favor of more robust learning. On top of this development, we propose a novel algorithm, Geometric Value Iteration (GVI), that features a dynamic error-aware KL coefficient design aiming to mitigate the impact of errors on performance. Our experiments demonstrate that GVI can effectively exploit the trade-off between learning speed and robustness over uniform averaging with a constant KL coefficient. The combination of GVI and deep networks shows stable learning behavior even in the absence of a target network, where algorithms with a constant KL coefficient would greatly oscillate or even fail to converge.
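The KL-regularized improvement step that GVI builds on has a closed form: pi_new(a|s) is proportional to pi_prev(a|s) * exp(Q(s,a) / lam), where lam weights the KL penalty toward the previous policy. A NumPy sketch with a fixed lam; GVI's point is to set lam dynamically from an error estimate.

```python
import numpy as np

def kl_policy_update(Q, pi_prev, lam):
    # Closed-form KL-regularized policy improvement:
    # pi_new(a|s) ∝ pi_prev(a|s) * exp(Q(s, a) / lam).
    logits = np.log(pi_prev + 1e-12) + Q / lam
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    pi = np.exp(logits)
    return pi / pi.sum(axis=1, keepdims=True)     # (n_states, n_actions)
```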
Meta-Learning (1 paper)
【1】 A Channel Coding Benchmark for Meta-Learning
Authors: Rui Li, Ondrej Bohdal, Rajesh Mishra, Hyeji Kim, Da Li, Nicholas Lane, Timothy Hospedales
Affiliations: Samsung AI Center, Cambridge, UK; School of Informatics, University of Edinburgh, UK; UT Austin, US
Link: https://arxiv.org/abs/2107.07579
Abstract: Meta-learning provides a popular and effective family of methods for data-efficient learning of new tasks. However, several important issues in meta-learning have proven hard to study thus far. For example, performance degrades in real-world settings where meta-learners must learn from a wide and potentially multi-modal distribution of training tasks, and when distribution shift exists between meta-train and meta-test task distributions. These issues are typically hard to study since the shape of task distributions, and the shift between them, are not straightforward to measure or control in standard benchmarks. We propose the channel coding problem as a benchmark for meta-learning. Channel coding is an important practical application where task distributions naturally arise, and fast adaptation to new tasks is practically valuable. We use this benchmark to study several aspects of meta-learning, including the impact of task distribution breadth and shift, which can be controlled in the coding problem. Going forward, this benchmark provides a tool for the community to study the capabilities and limitations of meta-learning, and to drive research on practically robust and effective meta-learners.
Recommendation (1 paper)
【1】 Modeling User Behaviour in Research Paper Recommendation System
Authors: Arpita Chaudhuri, Debasis Samanta, Monalisa Sarma
Comments: 23 pages
Link: https://arxiv.org/abs/2107.07831
Abstract: User intention, which often changes dynamically, is considered to be an important factor for modeling users in the design of recommendation systems. Recent studies are starting to focus on predicting user intention (what users want) beyond user preference (what users like). In this work, a user intention model is proposed based on deep sequential topic analysis. The model predicts a user's intention in terms of the topic of interest. The Hybrid Topic Model (HTM), comprising Latent Dirichlet Allocation (LDA) and Word2Vec, is proposed to derive the topic of interest of users and the history of preferences. HTM finds the true topics of papers, estimating the word-topic distribution, which includes syntactic and semantic correlations among words. Next, to model user intention, a Long Short-Term Memory (LSTM) based sequential deep learning model is proposed. This model takes into account temporal context, namely the time difference between clicks on two consecutive papers seen by a user. Extensive experiments with a real-world research paper dataset indicate that the proposed approach significantly outperforms the state-of-the-art methods. Further, the proposed approach introduces a new road map to model user activity suitable for the design of a research paper recommendation system.
Clustering (3 papers)
【1】 Measuring and Explaining the Inter-Cluster Reliability of Multidimensional Projections
Authors: Hyeon Jeon, Hyung-Kwon Ko, Jaemin Jo, Youngtaek Kim, Jinwook Seo
Comments: IEEE Transactions on Visualization and Computer Graphics (TVCG, Proc. VIS 2021), to appear
Link: https://arxiv.org/abs/2107.07859
Abstract: We propose Steadiness and Cohesiveness, two novel metrics to measure the inter-cluster reliability of multidimensional projection (MDP), specifically how well the inter-cluster structures are preserved between the original high-dimensional space and the low-dimensional projection space. Measuring inter-cluster reliability is crucial as it directly affects how well inter-cluster tasks (e.g., identifying cluster relationships in the original space from a projected view) can be conducted; however, despite the importance of inter-cluster tasks, we found that previous metrics, such as Trustworthiness and Continuity, fail to measure inter-cluster reliability. Our metrics consider two aspects of inter-cluster reliability: Steadiness measures the extent to which clusters in the projected space form clusters in the original space, and Cohesiveness measures the opposite. They extract random clusters with arbitrary shapes and positions in one space and evaluate how much the clusters are stretched or dispersed in the other space. Furthermore, our metrics can quantify pointwise distortions, allowing for the visualization of inter-cluster reliability in a projection, which we call a reliability map. Through quantitative experiments, we verify that our metrics precisely capture the distortions that harm inter-cluster reliability, while previous metrics have difficulty capturing those distortions. A case study also demonstrates that our metrics and the reliability map 1) support users in selecting the proper projection techniques or hyperparameters and 2) prevent misinterpretation while performing inter-cluster tasks, thus allowing an adequate identification of inter-cluster structure.
【2】 ScRAE: Deterministic Regularized Autoencoders with Flexible Priors for Clustering Single-cell Gene Expression Data
Authors: Arnab Kumar Mondal, Himanshu Asnani, Parag Singla, Prathosh AP
Comments: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Link: https://arxiv.org/abs/2107.07709
Abstract: Clustering single-cell RNA sequence (scRNA-seq) data poses statistical and computational challenges due to their high-dimensionality and data-sparsity, also known as 'dropout' events. Recently, Regularized Auto-Encoder (RAE) based deep neural network models have achieved remarkable success in learning robust low-dimensional representations. The basic idea in RAEs is to learn a non-linear mapping from the high-dimensional data space to a low-dimensional latent space and vice-versa, simultaneously imposing a distributional prior on the latent space, which brings in a regularization effect. This paper argues that RAEs suffer from the infamous problem of bias-variance trade-off in their naive formulation. While a simple AE without a latent regularization results in data over-fitting, a very strong prior leads to under-representation and thus bad clustering. To address the above issues, we propose a modified RAE framework (called scRAE) for effective clustering of single-cell RNA sequencing data. scRAE consists of a deterministic AE with a flexibly learnable prior generator network, which is jointly trained with the AE. This facilitates scRAE to trade off better between the bias and variance in the latent space. We demonstrate the efficacy of the proposed method through extensive experimentation on several real-world single-cell gene expression datasets.
【3】 Measuring inter-cluster similarities with Alpha Shape TRIangulation in loCal Subspaces (ASTRICS) facilitates visualization and clustering of high-dimensional data
Authors: Joshua M. Scurll
Affiliations: Department of Mathematics and Institute of Applied Mathematics, University of British Columbia, Vancouver, British Columbia, Canada
Comments: 35 pages, 7 figures
Link: https://arxiv.org/abs/2107.07603
Abstract: Clustering and visualizing high-dimensional (HD) data are important tasks in a variety of fields. For example, in bioinformatics, they are crucial for analyses of single-cell data such as mass cytometry (CyTOF) data. Some of the most effective algorithms for clustering HD data are based on representing the data by nodes in a graph, with edges connecting neighbouring nodes according to some measure of similarity or distance. However, users of graph-based algorithms are typically faced with the critical but challenging task of choosing the value of an input parameter that sets the size of neighbourhoods in the graph, e.g. the number of nearest neighbours to which to connect each node or a threshold distance for connecting nodes. The burden on the user could be alleviated by a measure of inter-node similarity that can have value 0 for dissimilar nodes without requiring any user-defined parameters or thresholds. This would determine the neighbourhoods automatically while still yielding a sparse graph. To this end, I propose a new method called ASTRICS to measure similarity between clusters of HD data points based on local dimensionality reduction and triangulation of critical alpha shapes. I show that my ASTRICS similarity measure can facilitate both clustering and visualization of HD data by using it in Stage 2 of a three-stage pipeline: Stage 1 = perform an initial clustering of the data by any method; Stage 2 = let graph nodes represent initial clusters instead of individual data points and use ASTRICS to automatically define edges between nodes; Stage 3 = use the graph for further clustering and visualization. This trades the critical task of choosing a graph neighbourhood size for the easier task of essentially choosing a resolution at which to view the data. The graph and consequently downstream clustering and visualization are then automatically adapted to the chosen resolution.
Super-Resolution | Denoising | Deblurring | Dehazing (1 paper)
【1】 Beyond In-Place Corruption: Insertion and Deletion In Denoising Probabilistic Models
Authors: Daniel D. Johnson, Jacob Austin, Rianne van den Berg, Daniel Tarlow
Comments: Accepted at the ICML 2021 Workshop on Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models (poster)
Link: https://arxiv.org/abs/2107.07675
Abstract: Denoising diffusion probabilistic models (DDPMs) have shown impressive results on sequence generation by iteratively corrupting each example and then learning to map corrupted versions back to the original. However, previous work has largely focused on in-place corruption, adding noise to each pixel or token individually while keeping their locations the same. In this work, we consider a broader class of corruption processes and denoising models over sequence data that can insert and delete elements, while still being efficient to train and sample from. We demonstrate that these models outperform standard in-place models on an arithmetic sequence task, and that when trained on the text8 dataset they can be used to fix spelling errors without any fine-tuning.
Point Clouds | SLAM | Radar | LiDAR | Depth/RGB-D (1 paper)
【1】 CutDepth: Edge-aware Data Augmentation in Depth Estimation
Authors: Yasunori Ishii, Takayoshi Yamashita
Affiliations: Panasonic, Kadoma City, Osaka, Japan; Chubu University, Matsumotocho, Kasugai, Aichi, Japan
Link: https://arxiv.org/abs/2107.07684
Abstract: It is difficult to collect data on a large scale for monocular depth estimation because the task requires the simultaneous acquisition of RGB images and depths. Data augmentation is thus important to this task. However, there has been little research on data augmentation for tasks such as monocular depth estimation, where the transformation is performed pixel by pixel. In this paper, we propose a data augmentation method called CutDepth. In CutDepth, part of the depth is pasted onto an input image during training. The method extends variations of the data without destroying edge features. Experiments objectively and subjectively show that the proposed method outperforms conventional methods of data augmentation. The estimation accuracy is improved with CutDepth even when there are few training data at long distances.
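CutDepth itself is only a few lines: sample a random rectangle and paste the ground-truth depth values into the RGB input at the same location. A hedged NumPy sketch; the crop-size distribution below is illustrative rather than the paper's exact schedule.

```python
import numpy as np

def cut_depth(image, depth, p=0.5, rng=None):
    # image: (H, W, 3) RGB; depth: (H, W) ground-truth depth map.
    rng = rng or np.random.default_rng()
    if rng.random() > p:
        return image
    h, w = depth.shape
    ch, cw = rng.integers(h // 4, h // 2), rng.integers(w // 4, w // 2)
    y, x = rng.integers(0, h - ch), rng.integers(0, w - cw)
    out = image.copy()
    # Broadcast the depth crop across the three channels at the same spot.
    out[y:y + ch, x:x + cw, :] = depth[y:y + ch, x:x + cw, None]
    return out
```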
Inference | Analysis | Understanding | Explanation (2 papers)
【1】 DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference
Authors: Chaojian Li, Wuyang Chen, Yuchen Gu, Tianlong Chen, Yonggan Fu, Zhangyang Wang, Yingyan Lin
Affiliations: Rice University; University of Texas at Austin
Comments: 16 pages, 6 figures
Link: https://arxiv.org/abs/2107.07706
Abstract: Semantic segmentation for scene understanding is nowadays widely demanded, raising significant challenges for algorithm efficiency, especially for applications on resource-limited platforms. Current segmentation models are trained and evaluated on massive high-resolution scene images ("data level") and suffer from the expensive computation arising from the required multi-scale aggregation ("network level"). In both folds, the computational and energy costs in training and inference are notable due to the often desired large input resolutions and heavy computational burden of segmentation models. To this end, we propose DANCE, general automated DAta-Network Co-optimization for Efficient segmentation model training and inference. Distinct from existing efficient segmentation approaches that focus merely on light-weight network design, DANCE distinguishes itself as an automated simultaneous data-network co-optimization via both input data manipulation and network architecture slimming. Specifically, DANCE integrates automated data slimming, which adaptively downsamples/drops input images and controls their corresponding contribution to the training loss guided by the images' spatial complexity. Such a downsampling operation, in addition to slimming down the cost associated with the input size directly, also shrinks the dynamic range of input object and context scales, therefore motivating us to also adaptively slim the network to match the downsampled data. Extensive experiments and ablation studies (on four SOTA segmentation models with three popular segmentation datasets under two training settings) demonstrate that DANCE can achieve "all-win" towards efficient segmentation (reduced training cost, less expensive inference, and better mean Intersection-over-Union (mIoU)).
【2】 Constrained Feedforward Neural Network Training via Reachability Analysis
Authors: Long Kiu Chung, Adam Dai, Derek Knowles, Shreyas Kousik, Grace X. Gao
Comments: 5 pages, 4 figures
Link: https://arxiv.org/abs/2107.07696
Abstract: Neural networks have recently become popular for a wide variety of uses, but have seen limited application in safety-critical domains such as robotics near and around humans. This is because it remains an open challenge to train a neural network to obey safety constraints. Most existing safety-related methods only seek to verify that already-trained networks obey constraints, requiring alternating training and verification. Instead, this work proposes a constrained method to simultaneously train and verify a feedforward neural network with rectified linear unit (ReLU) nonlinearities. Constraints are enforced by computing the network's output-space reachable set and ensuring that it does not intersect with unsafe sets; training is achieved by formulating a novel collision-check loss function between the reachable set and unsafe portions of the output space. The reachable and unsafe sets are represented by constrained zonotopes, a convex polytope representation that enables differentiable collision checking. The proposed method is demonstrated successfully on a network with one nonlinearity layer and approximately 50 parameters.
Detection (4 papers)
【1】 Contrastive Predictive Coding for Anomaly Detection
Authors: Puck de Haan, Sindy Löwe
Affiliations: University of Amsterdam
Comments: 7 pages, ICML 2021 Workshop on Uncertainty and Robustness in Deep Learning
Link: https://arxiv.org/abs/2107.07820
Abstract: Reliable detection of anomalies is crucial when deploying machine learning models in practice, but remains challenging due to the lack of labeled data. To tackle this challenge, contrastive learning approaches are becoming increasingly popular, given the impressive results they have achieved in self-supervised representation learning settings. However, while most existing contrastive anomaly detection and segmentation approaches have been applied to images, none of them can use the contrastive losses directly for both anomaly detection and segmentation. In this paper, we close this gap by making use of the Contrastive Predictive Coding model (arXiv:1807.03748). We show that its patch-wise contrastive loss can directly be interpreted as an anomaly score, and how this allows for the creation of anomaly segmentation masks. The resulting model achieves promising results for both anomaly detection and segmentation on the challenging MVTec-AD dataset.
【2】 Pseudo-labelling Enhanced Media Bias Detection
Authors: Qin Ruan, Brian Mac Namee, Ruihai Dong
Affiliations: School of Computer Science, University College Dublin, Dublin, Ireland; Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland
Link: https://arxiv.org/abs/2107.07705
Abstract: Leveraging unlabelled data through weak or distant supervision is a compelling approach to developing more effective text classification models. This paper proposes a simple but effective data augmentation method, which leverages the idea of pseudo-labelling to select samples from noisy distant-supervision annotation datasets. The results show that the proposed method improves the accuracy of biased news detection models.
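A minimal sketch of the selection step: run the current model over the distantly-supervised pool and keep only samples whose confidence clears a threshold, using the argmax as the pseudo-label. The 0.9 threshold is illustrative, not from the paper.

```python
import numpy as np

def select_pseudo_labels(probs, threshold=0.9):
    # probs: (n_samples, n_classes) predicted probabilities on the noisy,
    # distantly-supervised pool. Keep only high-confidence samples and
    # return their indices plus argmax pseudo-labels for retraining.
    conf = probs.max(axis=1)
    keep = np.where(conf >= threshold)[0]
    return keep, probs[keep].argmax(axis=1)
```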
【3】 Neural Contextual Anomaly Detection for Time Series
Authors: Chris U. Carmona, François-Xavier Aubet, Valentin Flunkert, Jan Gasthaus
Affiliations: University of Oxford; AWS AI Labs
Comments: Chris and François-Xavier contributed equally
Link: https://arxiv.org/abs/2107.07702
Abstract: We introduce Neural Contextual Anomaly Detection (NCAD), a framework for anomaly detection on time series that scales seamlessly from the unsupervised to the supervised setting, and is applicable to both univariate and multivariate time series. This is achieved by effectively combining recent developments in representation learning for multivariate time series with techniques for deep anomaly detection originally developed for computer vision, which we tailor to the time series setting. Our window-based approach facilitates learning the boundary between normal and anomalous classes by injecting generic synthetic anomalies into the available data. Moreover, our method can effectively take advantage of all the available information, be it as domain knowledge, or as training labels in the semi-supervised setting. We demonstrate empirically on standard benchmark datasets that our approach obtains state-of-the-art performance in these settings.
【4】 On the Importance of Regularisation & Auxiliary Information in OOD Detection
Authors: John Mitros, Brian Mac Namee
Affiliations: School of Computer Science, University College Dublin, Ireland
Link: https://arxiv.org/abs/2107.07564
Abstract: Neural networks are often utilised in critical domain applications (e.g. self-driving cars, financial markets, and aerospace engineering), even though they exhibit overconfident predictions for ambiguous inputs. This deficiency demonstrates a fundamental flaw indicating that neural networks often overfit on spurious correlations. To address this problem, in this work we present two novel objectives that improve the ability of a network to detect out-of-distribution samples and therefore avoid overconfident predictions for ambiguous inputs. We empirically demonstrate that our methods outperform the baseline and perform better than the majority of existing approaches, while remaining competitive with those they do not outperform. Additionally, we empirically demonstrate the robustness of our approach against common corruptions and demonstrate the importance of regularisation and auxiliary information in out-of-distribution detection.
分类|识别(3篇)
【1】 Revisiting IoT Device Identification 标题:重温物联网设备标识
作者:Roman Kolcun,Diana Andreea Popescu,Vadim Safronov,Poonam Yadav,Anna Maria Mandalari,Richard Mortier,Hamed Haddadi 机构:University of Cambridge, University of York, Imperial College London 备注:To appear in TMA 2021 conference. 9 pages, 6 figures. arXiv admin note: text overlap with arXiv:2011.08605 链接:https://arxiv.org/abs/2107.07818 摘要:众所周知,物联网设备是许多安全问题的根源,因此,它们将从自动化管理中受益匪浅。这需要可靠地识别设备,以便应用适当的网络安全策略。我们通过探索如何基于网络行为准确识别物联网设备来应对这一挑战,同时利用其他研究人员先前提出的方法。我们比较了四种不同的机器学习模型(基于树和基于神经网络)识别物联网设备的准确性。我们使用从大型物联网试验台收集的6个月的数据包跟踪数据。我们发现,虽然所有模型在与训练集相同的数据集上评估时都达到了很高的精度,但在训练集之外收集的数据上评估时,其精度会随着时间的推移而下降。我们发现,模型的精确度在几周后最多下降40个百分点(平均下降12到21个百分点)。我们认为,为了使模型的准确性保持在较高水平,模型需要不断更新。 摘要:Internet-of-Things (IoT) devices are known to be the source of many security problems, and as such, they would greatly benefit from automated management. This requires robustly identifying devices so that appropriate network security policies can be applied. We address this challenge by exploring how to accurately identify IoT devices based on their network behavior, while leveraging approaches previously proposed by other researchers. We compare the accuracy of four different previously proposed machine learning models (tree-based and neural network-based) for identifying IoT devices. We use packet trace data collected over a period of six months from a large IoT test-bed. We show that, while all models achieve high accuracy when evaluated on the same dataset as they were trained on, their accuracy degrades over time, when evaluated on data collected outside the training set. We show that the models' accuracy degrades after a couple of weeks by up to 40 percentage points (on average between 12 and 21 percentage points). We argue that, in order to keep the models' accuracy at a high level, these need to be continuously updated.
【2】 The Application of Active Query K-Means in Text Classification 标题:主动查询K-Means算法在文本分类中的应用
作者:Yukun Jiang 机构:Department of Computer Science, New York University, New York, United States 备注:None 链接:https://arxiv.org/abs/2107.07682 摘要:主动学习是一种处理大量未标记数据的先进机器学习方法。在自然语言处理领域,通常对所有的数据进行注释既费钱又费时。这种低效性启发我们将主动学习应用于文本分类。本文首先将传统的无监督k-均值聚类算法改进为半监督聚类算法。然后,将该算法进一步扩展到具有惩罚最小最大选择的主动学习场景中,使得有限的查询产生更稳定的初始质心。该方法利用了用户的交互查询结果和底层的距离表示。在一个中文新闻数据集上进行测试后,它显示了在降低训练成本的同时,准确率的持续提高。 摘要:Active learning is a state-of-art machine learning approach to deal with an abundance of unlabeled data. In the field of Natural Language Processing, typically it is costly and time-consuming to have all the data annotated. This inefficiency inspires our application of active learning in text classification. Traditional unsupervised k-means clustering is first modified into a semi-supervised version in this research. Then, a novel attempt is applied to further extend the algorithm into an active learning scenario with Penalized Min-Max-selection, so as to make limited queries that yield more stable initial centroids. This method utilizes both the interactive query results from users and the underlying distance representation. After being tested on a Chinese news dataset, it shows a consistent increase in accuracy while lowering the cost in training.
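The penalized min-max selection can be sketched as a greedy farthest-point query rule; the norm penalty below (discouraging extreme outliers as seeds) is only one plausible reading of "penalized", not the paper's exact formula.

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 2))          # unlabeled points (e.g. text embeddings)

    def penalized_min_max_select(X, k, penalty=0.1):
        # Greedily pick k points to query: each new point maximizes its minimum
        # distance to those already chosen, minus a penalty so that extreme
        # outliers are not always preferred as initial centroids.
        chosen = [int(rng.integers(len(X)))]
        for _ in range(k - 1):
            d = np.min(np.linalg.norm(X[:, None] - X[chosen][None], axis=2), axis=1)
            score = d - penalty * np.linalg.norm(X, axis=1)
            score[chosen] = -np.inf
            chosen.append(int(np.argmax(score)))
        return chosen

    print("indices to query for labels:", penalized_min_max_select(X, k=3))

The queried labels would then seed the centroids of the semi-supervised k-means step.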
【3】 Real-Time Face Recognition System for Remote Employee Tracking 标题:一种用于远程员工跟踪的实时人脸识别系统
作者:Mohammad Sabik Irbaz,MD Abdullah Al Nasim,Refat E Ferdous 机构:Machine Learning Team, Pioneer Alpha Ltd, A PREPRINT 备注:Accepted in International Conference on Big Data, IoT and Machine Learning (BIM 2021) 链接:https://arxiv.org/abs/2107.07576 摘要:在COVID-19大流行期间,大多数人与人之间的交往已经停止。为了减少致命冠状病毒的传播,许多办公室采取了主动,让员工可以在家工作。但是,跟踪这些员工,确认他们是否真的在履行本该履行的职责,对于所有推行"在家工作"的公司和组织来说,都是一个严峻的挑战。为了有效应对这一挑战,我们提出了一个通过人脸识别跟踪员工的解决方案。我们一直在为我们的办公室实验性地测试这个系统。在人脸识别模块的训练中,我们使用了带有KNN的FaceNet,并使用了野外标记人脸(LFW)数据集,达到了97.8%的准确率。我们将经过训练的模型集成到我们的中心系统中,员工在其中记录时间。在本文中,我们简要地讨论了我们正在试验的系统以及该系统的优缺点。 摘要:During the COVID-19 pandemic, most human-to-human interactions have stopped. To mitigate the spread of the deadly coronavirus, many offices took the initiative to let their employees work from home. But tracking the employees and finding out whether they are really performing their duties turned out to be a serious challenge for all the companies and organizations facilitating "Work From Home". To deal with the challenge effectively, we came up with a solution to track the employees with face recognition. We have been testing this system experimentally for our office. To train the face recognition module, we used FaceNet with KNN using the Labeled Faces in the Wild (LFW) dataset and achieved 97.8% accuracy. We integrated the trained model into our central system, where the employees log their time. In this paper, we discuss in brief the system we have been experimenting with and the pros and cons of the system.
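The recognition pipeline (FaceNet embeddings classified with KNN) can be sketched as below; embed() is a stand-in for a real FaceNet forward pass and the enrolment images are dummies, so only the structure is meaningful.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def embed(face_image):
        # Stand-in for a FaceNet forward pass mapping a face crop to a 128-d
        # embedding; here we derive a fake but deterministic vector from pixels.
        rng = np.random.default_rng(abs(hash(face_image.tobytes())) % (2 ** 32))
        return rng.normal(size=128)

    # Enrolment: one (or more) embeddings per employee.
    gallery = {"alice": np.zeros((8, 8)), "bob": np.ones((8, 8))}
    X = np.stack([embed(img) for img in gallery.values()])
    y = list(gallery.keys())
    knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)

    # At log-in time, embed the camera frame and look up the nearest identity.
    probe = embed(np.zeros((8, 8)))
    print("recognised as:", knn.predict(probe[None])[0])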
表征(1篇)
【1】 Representation Consolidation for Training Expert Students 标题:训练专家型学生的表征整合
作者:Zhizhong Li,Avinash Ravichandran,Charless Fowlkes,Marzia Polito,Rahul Bhotika,Stefano Soatto 机构:Amazon AWS Rekognition 链接:https://arxiv.org/abs/2107.08039 摘要:传统上,蒸馏被用来训练学生模型来模拟教师的输入/输出功能。一个比仿真更有用的目标是让学生学习能够很好地转移到未来任务中的特征表示法,但这一目标尚未得到充分的探索。然而,我们观察到,任务型教师的标准提炼实际上减少了学生表征向下游任务的转移。我们证明,使用未标记的代理数据集和多面手教师的多头、多任务提取方法足以整合来自任务特定教师的表示并改善下游性能,优于教师和ImageNet预训练特征的强基线。该方法还可以将多个教师在一个或多个领域的表征知识组合成一个单一的模型,该模型在所有教师领域的表征都得到了改进。 摘要:Traditionally, distillation has been used to train a student model to emulate the input/output functionality of a teacher. A more useful goal than emulation, yet under-explored, is for the student to learn feature representations that transfer well to future tasks. However, we observe that standard distillation of task-specific teachers actually *reduces* the transferability of student representations to downstream tasks. We show that a multi-head, multi-task distillation method using an unlabeled proxy dataset and a generalist teacher is sufficient to consolidate representations from task-specific teacher(s) and improve downstream performance, outperforming the teacher(s) and the strong baseline of ImageNet pretrained features. Our method can also combine the representational knowledge of multiple teachers trained on one or multiple domains into a single model, whose representation is improved on all teachers' domain(s).
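A minimal PyTorch sketch of the multi-head, multi-task distillation idea above: a shared student backbone with one head per frozen teacher, trained only on unlabeled proxy data to match every teacher at once. All dimensions and module choices are illustrative.

    import torch
    import torch.nn as nn

    # Frozen task-specific "teachers" (stand-ins) and a shared student backbone.
    teacher_a = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64)).eval()
    teacher_b = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64)).eval()
    backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128), nn.ReLU())
    heads = nn.ModuleList([nn.Linear(128, 64), nn.Linear(128, 64)])  # one per teacher

    opt = torch.optim.Adam(list(backbone.parameters()) + list(heads.parameters()), lr=1e-3)
    mse = nn.MSELoss()

    for step in range(100):
        x = torch.randn(16, 3, 32, 32)          # unlabeled proxy batch
        feats = backbone(x)                      # consolidated representation
        with torch.no_grad():
            targets = [teacher_a(x), teacher_b(x)]
        loss = sum(mse(h(feats), t) for h, t in zip(heads, targets))
        opt.zero_grad(); loss.backward(); opt.step()

After training, the heads can be discarded and the backbone serves as the consolidated feature extractor for downstream tasks.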
优化|敛散性(1篇)
【1】 A Penalized Shared-parameter Algorithm for Estimating Optimal Dynamic Treatment Regimens 标题:估计最优动态治疗方案的惩罚共享参数算法
作者:Trikay Nalamada,Shruti Agarwal,Maria Jahja,Bibhas Chakraborty,Palash Ghosh 机构:Department of Mathematics, Indian, Institute of Technology Guwahati, Assam , India, Department of Statistics, North, Carolina State University, USA, Centre for Quantitative Medicine, Duke-NUS Medical School, National, University of Singapore, Singapore 链接:https://arxiv.org/abs/2107.07875 摘要:动态治疗方案(DTR)是一组决策规则,用于根据患者的病史对患者进行个性化治疗。基于Q-learning的Q-shared算法已被用于开发涉及跨多个干预阶段共享决策规则的dtr。我们证明了现有的Q-shared算法由于在Q-learning设置中使用了线性模型而存在非收敛性,并确定了Q-shared失败的条件。利用扩展约束普通最小二乘法的性质,给出了一种惩罚Q-共享算法,该算法不仅能在违反条件的情况下收敛,而且在满足条件的情况下仍能优于原Q-共享算法。我们给出了该方法在实际应用和若干综合仿真中的证明。 摘要:A dynamic treatment regimen (DTR) is a set of decision rules to personalize treatments for an individual using their medical history. The Q-learning based Q-shared algorithm has been used to develop DTRs that involve decision rules shared across multiple stages of intervention. We show that the existing Q-shared algorithm can suffer from non-convergence due to the use of linear models in the Q-learning setup, and identify the condition in which Q-shared fails. Leveraging properties from expansion-constrained ordinary least-squares, we give a penalized Q-shared algorithm that not only converges in settings that violate the condition, but can outperform the original Q-shared algorithm even when the condition is satisfied. We give evidence for the proposed method in a real-world application and several synthetic simulations.
预测|估计(4篇)
【1】 Is attention to bounding boxes all you need for pedestrian action prediction? 标题:要预测行人的行动,只需要注意边界框就行了吗?
作者:Lina Achaji,Julien Moreau,Thibault Fouqueray,Francois Aioun,Francois Charpillet 机构:Stellantis Group, Technical center of Velizy , France, Inria, Nancy , France 链接:https://arxiv.org/abs/2107.08031 摘要:人类驾驶员不再是唯一关心驾驶场景复杂性的人。自动驾驶汽车(AV)也同样参与了这一过程。如今,城市地区视听技术的发展为行人等易受伤害的道路使用者(VRU)提供了重要的安全保障。因此,为了使道路更安全,对其未来行为进行分类和预测是至关重要的。本文提出了一个基于多重变型变换模型的行人过街行为分析框架,对行人过街和不过街行为进行了分析和预测。我们证明了仅使用边界框作为模型的输入可以比以前的最新模型有更好的性能,并且在PIE数据集上达到91%的预测准确率和0.83的F1分数,在未来最多提前2秒。此外,我们还介绍了一个使用CARLA进行动作预测的大型模拟数据集(CP2A)。我们的模型在这个数据集上同样达到了高精度(91%)和F1分数(0.91)。有趣的是,我们证明了在模拟数据集上预先训练Transformer模型,然后在真实数据集上对其进行微调,对于动作预测任务是非常有效的。 摘要:The human driver is no longer the only one concerned with the complexity of the driving scenarios. Autonomous vehicles (AV) are similarly becoming involved in the process. Nowadays, the development of AV in urban places underpins essential safety concerns for vulnerable road users (VRUs) such as pedestrians. Therefore, to make the roads safer, it is critical to classify and predict their future behavior. In this paper, we present a framework based on multiple variations of the Transformer models to reason attentively about the dynamic evolution of the pedestrians' past trajectory and predict its future actions of crossing or not crossing the street. We proved that using only bounding boxes as input to our model can outperform the previous state-of-the-art models and reach a prediction accuracy of 91 % and an F1-score of 0.83 on the PIE dataset up to two seconds ahead in the future. In addition, we introduced a large-size simulated dataset (CP2A) using CARLA for action prediction. Our model has similarly reached high accuracy (91 %) and F1-score (0.91) on this dataset. Interestingly, we showed that pre-training our Transformer model on the simulated dataset and then fine-tuning it on the real dataset can be very effective for the action prediction task.
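A compact PyTorch sketch of the core idea above, a Transformer encoder over past bounding boxes followed by a crossing/not-crossing classifier; the layer sizes and mean-pooling readout are assumptions, not the paper's exact architecture.

    import torch
    import torch.nn as nn

    class BoxActionPredictor(nn.Module):
        # Encode a pedestrian's past bounding boxes (x, y, w, h per frame) with a
        # Transformer encoder, then classify crossing vs. not crossing.
        def __init__(self, d_model=64, nhead=4, layers=2):
            super().__init__()
            self.proj = nn.Linear(4, d_model)
            enc = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            self.encoder = nn.TransformerEncoder(enc, num_layers=layers)
            self.head = nn.Linear(d_model, 2)

        def forward(self, boxes):                  # boxes: (batch, frames, 4)
            h = self.encoder(self.proj(boxes))
            return self.head(h.mean(dim=1))        # pool over time, then classify

    model = BoxActionPredictor()
    past_boxes = torch.rand(8, 16, 4)              # 8 tracks, 16 observed frames
    print(model(past_boxes).shape)                 # -> torch.Size([8, 2])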
【2】 Towards an Interpretable Latent Space in Structured Models for Video Prediction 标题:视频预测结构化模型中的可解释潜在空间研究
作者:Rushil Gupta,Vishal Sharma,Yash Jain,Yitao Liang,Guy Van den Broeck,Parag Singla 机构:Indian Institute of Technology Delhi, University of California, Los Angeles 备注:Accepted at Weakly Supervised Representation Learning Workshop at IJCAI 2021 链接:https://arxiv.org/abs/2107.07713 摘要:我们关注由底层物理动力学支配的视频中的未来帧预测任务。我们使用以对象为中心的模型,即显式地使用对象表示,并在潜在空间中传播损失。具体来说,我们的研究建立在Kipf等人[Kipf et al., 2020]最近工作的基础上,该工作通过使用图神经网络对潜在空间中的对象交互进行对比学习来预测下一个状态。我们认为,以一般物理定律的形式在模型中注入显式归纳偏差,不仅可以提高模型的可解释性,而且可以提高模型的整体预测能力。作为一个自然的副产品,我们的模型可以学习与图像中实际物体位置非常相似的特征映射,而不需要在训练时对物体位置进行任何明确的监督。与假定以物理引擎的形式完全掌握运动动力学的早期工作[Jaques et al., 2020]相比,我们只依赖于一般物理定律的知识,例如,世界是由物体组成的,物体具有位置和速度。我们提出了一个额外的基于解码器的像素空间损失,以课程学习的方式施加,以进一步细化潜在空间预测。在多个不同环境中的实验表明,虽然Kipf等人的模型在捕捉对象交互方面是有效的,但我们的模型在定位对象方面可以显著更有效,从而在我们实验的4个域中有3个域的性能得到了改善。此外,我们的模型可以学习高度可解释的特征图,与实际对象位置非常相似。 摘要:We focus on the task of future frame prediction in video governed by underlying physical dynamics. We work with models which are object-centric, i.e., explicitly work with object representations, and propagate a loss in the latent space. Specifically, our research builds on recent work by Kipf et al. [2020], which predicts the next state via contrastive learning of object interactions in a latent space using a Graph Neural Network. We argue that injecting explicit inductive bias in the model, in form of general physical laws, can help not only make the model more interpretable, but also improve the overall prediction of the model. As a natural by-product, our model can learn feature maps which closely resemble actual object positions in the image, without having any explicit supervision about the object positions at the training time. In comparison with earlier works [Jaques et al., 2020], which assume a complete knowledge of the dynamics governing the motion in the form of a physics engine, we rely only on the knowledge of general physical laws, such as, world consists of objects, which have position and velocity. We propose an additional decoder based loss in the pixel space, imposed in a curriculum manner, to further refine the latent space predictions. Experiments in multiple different settings demonstrate that while the Kipf et al. model is effective at capturing object interactions, our model can be significantly more effective at localising objects, resulting in improved performance in 3 out of 4 domains that we experiment with. Additionally, our model can learn highly interpretable feature maps, resembling actual object positions.
【3】 Simultaneous boundary shape estimation and velocity field de-noising in Magnetic Resonance Velocimetry using Physics-informed Neural Networks 标题:基于物理信息神经网络的磁共振测速中边界形状估计和速度场同时去噪
作者:Ushnish Sengupta,Alexandros Kontogiannis,Matthew P. Juniper 机构:Department of Engineering, University of Cambridge, Cambridge, CB,PZ 链接:https://arxiv.org/abs/2107.07863 摘要:磁共振测速技术(MRV)是一种非侵入性的测量流体速度场的实验技术,在医学和工程中有着广泛的应用。这些测量是密集的,但具有低信噪比(SNR)。测量可以通过对流动施加物理约束来消除噪声,这些约束被封装在质量和动量的控制方程中。以前的研究要求边界的形状(例如血管)是先验的。然而,这需要一组额外的测量,获得这些测量可能很昂贵。本文提出了一种基于物理信息的神经网络,该网络仅利用含噪声的MRV数据同时推断最可能的边界形状和去噪后的速度场。我们通过训练一个辅助神经网络来实现这一点,该网络在控制偏微分方程的推断域内取1.0,在控制偏微分方程的推断域外取0.0。该网络用于加权损失函数中的偏微分方程残差项,并隐式地学习系统的几何结构。我们通过同化合成的和真实的MRV测量值来测试我们的算法,这些测量值可以很好地用Poisson和Stokes方程来模拟。我们发现,我们可以重建非常嘈杂的(信噪比=2.5)磁共振成像信号和恢复地面真值与低重建误差3.7-7.5%。我们的物理信息神经网络方法的简单性和灵活性可以很容易地扩展到同化具有复杂三维几何、时变4D数据或物理模型中未知参数的MRV数据。 摘要:Magnetic resonance velocimetry (MRV) is a non-invasive experimental technique widely used in medicine and engineering to measure the velocity field of a fluid. These measurements are dense but have a low signal-to-noise ratio (SNR). The measurements can be de-noised by imposing physical constraints on the flow, which are encapsulated in governing equations for mass and momentum. Previous studies have required the shape of the boundary (for example, a blood vessel) to be known a priori. This, however, requires a set of additional measurements, which can be expensive to obtain. In this paper, we present a physics-informed neural network that instead uses the noisy MRV data alone to simultaneously infer the most likely boundary shape and de-noised velocity field. We achieve this by training an auxiliary neural network that takes the value 1.0 within the inferred domain of the governing PDE and 0.0 outside. This network is used to weight the PDE residual term in the loss function accordingly and implicitly learns the geometry of the system. We test our algorithm by assimilating both synthetic and real MRV measurements for flows that can be well modeled by the Poisson and Stokes equations. We find that we are able to reconstruct very noisy (SNR = 2.5) MRV signals and recover the ground truth with low reconstruction errors of 3.7 - 7.5%. The simplicity and flexibility of our physics-informed neural network approach can readily scale to assimilating MRV data with complex 3D geometries, time-varying 4D data, or unknown parameters in the physical model.
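The weighting trick above can be sketched in 1-D: one network models the field, and an auxiliary network outputs a soft inside/outside indicator that multiplies the PDE residual. This toy uses a Poisson-type ODE and omits whatever the paper uses to keep the indicator from collapsing to zero, so it illustrates only the structure of the loss.

    import math
    import torch
    import torch.nn as nn

    mlp = lambda: nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
    u_net = mlp()                                  # de-noised field
    s_net = nn.Sequential(mlp(), nn.Sigmoid())     # soft domain indicator in (0, 1)
    opt = torch.optim.Adam(list(u_net.parameters()) + list(s_net.parameters()), lr=1e-3)

    x_data = torch.linspace(-1, 1, 64).unsqueeze(1)
    noisy = torch.sin(math.pi * x_data) + 0.1 * torch.randn_like(x_data)  # fake "MRV" data

    for step in range(200):
        x = x_data.clone().requires_grad_(True)
        u = u_net(x)
        du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
        d2u = torch.autograd.grad(du.sum(), x, create_graph=True)[0]
        residual = d2u + math.pi ** 2 * u          # toy PDE residual: u'' + pi^2 u = 0
        s = s_net(x)                               # residual only counts inside the
        loss = ((u - noisy) ** 2).mean() + (s * residual ** 2).mean()  # inferred domain
        opt.zero_grad(); loss.backward(); opt.step()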
【4】 Prediction of Blood Lactate Values in Critically Ill Patients: A Retrospective Multi-center Cohort Study 标题:危重患者血乳酸值的预测:一项回顾性多中心队列研究
作者:Behrooz Mamandipoor,Wesley Yeung,Louis Agha-Mir-Salim,David J. Stone,Venet Osmani,Leo Anthony Celi 机构:Affiliations:, Fondazione Bruno Kessler Research Institute, Trento, Italy, Laboratory for Computational Physiology, Harvard-MIT Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, MA, USA 备注:None 链接:https://arxiv.org/abs/2107.07582 摘要:目的。最初获得的血清乳酸水平升高是危重病人死亡率的有力预测因素。确定血清乳酸水平更可能升高的患者可以提醒医生加强护理,并指导他们进行血液检测的频率。我们研究机器学习模型是否能预测随后的血清乳酸变化。方法。我们使用模拟III和eICU CRD数据集,在模拟III队列的eICU队列的内部和外部验证中研究血清乳酸变化预测。根据初始乳酸水平分为三个亚组:i)正常组(<2 mmol/L),ii)轻度组(2-4 mmol/L),以及iii)重度组(>4 mmol/L)。根据各组间血清乳酸水平的增加或减少来确定结果。我们还通过将结果定义为乳酸变化>10%来进行敏感性分析,并进一步研究后续乳酸测量之间的时间间隔对预测性能的影响。结果。LSTM模型能够预测正常组AUC为0.77(95%CI 0.762-0.771),轻度组AUC为0.77(95%CI 0.768-0.772),重度组AUC为0.85(95%CI 0.840-0.851)的模拟III患者血清乳酸值的恶化,外部验证的表现稍低。结论。LSTM对血清乳酸水平下降的患者有很好的鉴别能力。临床研究需要评估基于这些结果的临床决策支持工具的使用是否会对决策和患者结果产生积极影响。 摘要:Purpose. Elevations in initially obtained serum lactate levels are strong predictors of mortality in critically ill patients. Identifying patients whose serum lactate levels are more likely to increase can alert physicians to intensify care and guide them in the frequency of tending the blood test. We investigate whether machine learning models can predict subsequent serum lactate changes. Methods. We investigated serum lactate change prediction using the MIMIC-III and eICU-CRD datasets in internal as well as external validation of the eICU cohort on the MIMIC-III cohort. Three subgroups were defined based on the initial lactate levels: i) normal group (<2 mmol/L), ii) mild group (2-4 mmol/L), and iii) severe group (>4 mmol/L). Outcomes were defined based on increase or decrease of serum lactate levels between the groups. We also performed sensitivity analysis by defining the outcome as lactate change of >10% and furthermore investigated the influence of the time interval between subsequent lactate measurements on predictive performance. Results. The LSTM models were able to predict deterioration of serum lactate values of MIMIC-III patients with an AUC of 0.77 (95% CI 0.762-0.771) for the normal group, 0.77 (95% CI 0.768-0.772) for the mild group, and 0.85 (95% CI 0.840-0.851) for the severe group, with a slightly lower performance in the external validation. Conclusion. The LSTM demonstrated good discrimination of patients who had deterioration in serum lactate levels. Clinical studies are needed to evaluate whether utilization of a clinical decision support tool based on these results could positively impact decision-making and patient outcomes.
其他神经网络|深度学习|模型|建模(16篇)
【1】 Continual Learning for Automated Audio Captioning Using The Learning Without Forgetting Approach 标题:基于学习不忘方法的自动音频字幕的持续学习
作者:Jan Berg,Konstantinos Drossos 机构:Audio Research Group, Tampere University, Finland 链接:https://arxiv.org/abs/2107.08028 摘要:自动音频字幕(AAC)是为一般音频信号的内容自动创建文本描述(即字幕)的任务。大多数AAC方法都是使用现有的数据集来优化和/或评估。考虑到AAC数据集所拥有的有限信息,AAC方法很可能只学习所使用的数据集中包含的信息。在本文中,我们提出了首个利用持续学习方法使AAC方法不断适应新信息的方案。在我们的场景中,一个预先优化的AAC方法被用于未见过的一般音频信号,并且在给定新的参考字幕时可以更新其参数以适应新信息。我们使用一个免费可用的预优化AAC方法和两个免费可用的AAC数据集来评估我们的方法,并与三个场景进行比较:其中两个场景是在一个数据集上训练并在另一个数据集上评估,第三个场景是在一个数据集上训练并在另一个数据集上微调。结果表明,该方法在提炼新知识和不遗忘已有知识之间取得了很好的平衡。 摘要:Automated audio captioning (AAC) is the task of automatically creating textual descriptions (i.e. captions) for the contents of a general audio signal. Most AAC methods are using existing datasets to optimize and/or evaluate upon. Given the limited information held by the AAC datasets, it is very likely that AAC methods learn only the information contained in the utilized datasets. In this paper we present a first approach for continuously adapting an AAC method to new information, using a continual learning method. In our scenario, a pre-optimized AAC method is used for some unseen general audio signals and can update its parameters in order to adapt to the new information, given a new reference caption. We evaluate our method using a freely available, pre-optimized AAC method and two freely available AAC datasets. We compare our proposed method against three scenarios: two of training on one of the datasets and evaluating on the other, and a third of training on one dataset and fine-tuning on the other. Obtained results show that our method achieves a good balance between distilling new knowledge and not forgetting the previous one.
【2】 Port-Hamiltonian Neural Networks for Learning Explicit Time-Dependent Dynamical Systems 标题:学习显式时变动力系统的端口-哈密顿神经网络
作者:Shaan Desai,Marios Mattheakis,David Sondak,Pavlos Protopapas,Stephen Roberts 机构:John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, United States 备注:[under review] 链接:https://arxiv.org/abs/2107.08024 摘要:准确地学习动态系统的时间行为需要具有良好选择的学习偏差的模型。最近的创新将哈密顿量和拉格朗日形式嵌入到神经网络中,在预测物理系统的轨迹方面比其他方法有了显著的改进。这些方法通常适用于隐含地依赖于时间的自治系统或控制信号先验已知的系统。尽管取得了这样的成功,但现实世界中的许多动力系统都是非自治的,由依赖于时间的力驱动并经历能量耗散。在这项研究中,我们通过将端口哈密顿形式嵌入神经网络来解决从这种非自治系统学习的挑战;该形式是一个可以捕捉能量耗散和时变控制力的通用框架。结果表明,所提出的端口哈密顿神经网络能有效地学习具有实际意义的非线性物理系统的动力学,并能准确地恢复系统的稳态哈密顿量、含时力和耗散系数。我们的网络的一个很有希望的结果是它能够学习和预测混沌系统,例如Duffing方程,其轨迹通常很难学习。 摘要:Accurately learning the temporal behavior of dynamical systems requires models with well-chosen learning biases. Recent innovations embed the Hamiltonian and Lagrangian formalisms into neural networks and demonstrate a significant improvement over other approaches in predicting trajectories of physical systems. These methods generally tackle autonomous systems that depend implicitly on time or systems for which a control signal is known a priori. Despite this success, many real world dynamical systems are non-autonomous, driven by time-dependent forces and experience energy dissipation. In this study, we address the challenge of learning from such non-autonomous systems by embedding the port-Hamiltonian formalism into neural networks, a versatile framework that can capture energy dissipation and time-dependent control forces. We show that the proposed port-Hamiltonian neural network can efficiently learn the dynamics of nonlinear physical systems of practical interest and accurately recover the underlying stationary Hamiltonian, time-dependent force, and dissipative coefficient. A promising outcome of our network is its ability to learn and predict chaotic systems such as the Duffing equation, for which the trajectories are typically hard to learn.
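For reference, the port-Hamiltonian form that the network embeds can be written as follows (a common convention; the paper's notation may differ slightly):

    % State x, Hamiltonian H(x) (stored energy), skew-symmetric interconnection
    % J(x) = -J(x)^T, positive semi-definite dissipation R(x), input force F(t):
    \dot{x} = \bigl[ J(x) - R(x) \bigr] \, \nabla_x H(x) + F(t)
    % Along trajectories the energy then obeys
    %   dH/dt = -\nabla H^{\top} R \, \nabla H + \nabla H^{\top} F(t),
    % so R accounts for dissipation and F(t) for external energy exchange.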
【3】 Ranking labs-of-origin for genetically engineered DNA using Metric Learning 标题:利用度量学习对基因工程DNA原产地实验室进行排序
作者:I. Muniz,F. H. F. Camargo,A. Marques 备注:4 pages, 2 figures, 1 algorithm 链接:https://arxiv.org/abs/2107.07878 摘要:随着基因工程的不断发展,一个共同关注的问题是如何识别基因工程DNA序列的起源实验室。因此,AltLabs主办了基因工程归因挑战赛,召集了许多团队提出新的工具来解决这个问题。在这里,我们展示了我们提出的方法,以排名最有可能的起源实验室和生成嵌入DNA序列和实验室。这些嵌入还可以执行各种其他任务,比如对DNA序列和实验室进行聚类,并将它们用作用于解决其他问题的机器学习模型的特征。这项工作表明,我们的方法优于经典的训练方法,为这个任务,同时产生其他有用的信息。 摘要:With the constant advancements of genetic engineering, a common concern is to be able to identify the lab-of-origin of genetically engineered DNA sequences. For that reason, AltLabs has hosted the genetic Engineering Attribution Challenge to gather many teams to propose new tools to solve this problem. Here we show our proposed method to rank the most likely labs-of-origin and generate embeddings for DNA sequences and labs. These embeddings can also perform various other tasks, like clustering both DNA sequences and labs and using them as features for Machine Learning models applied to solve other problems. This work demonstrates that our method outperforms the classic training method for this task while generating other helpful information.
【4】 Nearest neighbor Methods and their Applications in Design of 5G & Beyond Wireless Networks 标题:最近邻法及其在5G及Beyond无线网络设计中的应用
作者:Syed Ali Raza Zaidi 链接:https://arxiv.org/abs/2107.07869 摘要:在本文中,我们概述了最近邻(NN)方法,该方法常用于在监督学习中求解分类问题。本文简要介绍了其理论背景、算法与实现,以及关键应用。本文还从应用的角度出发,探讨了5G及未来无线网络中可以用最近邻分类技术解决的挑战。 摘要:In this paper, we present an overview of Nearest neighbor (NN) methods, which are frequently employed for solving classification problems using supervised learning. The article concisely introduces the theoretical background, algorithmic, and implementation aspects along with the key applications. From an application standpoint, this article explores the challenges related to the 5G and beyond wireless networks which can be solved using NN classification techniques.
【5】 Versatile modular neural locomotion control with fast learning 标题:具有快速学习功能的多功能模块化神经运动控制
作者:Mathias Thor,Poramate Manoonpong 机构:Embodied AI and Neurorobotics Laboratory, SDU Biorobotics, The Mærsk Mc-Kinney Møller Institute, The University of Southern Denmark, Campusvej , Odense , Denmark, Bio-Inspired Robotics and Neural Engineering Laboratory 备注:For supplementary video files see: this https URL 链接:https://arxiv.org/abs/2107.07844 摘要:腿部机器人在高度非结构化的环境中有着巨大的潜力。然而,运动控制的设计仍然具有挑战性。目前,控制器必须为特定的机器人和任务手动设计,或者通过机器学习方法自动设计,这些方法需要较长的训练时间并产生大量不透明的控制器。从动物运动的启发,我们提出了一个简单而通用的模块化快速学习神经控制结构。该方法的主要优点是可以逐步增加特定于行为的控制模块,以获得日益复杂的紧急运动行为,并且可以快速自动地学习与现有模块接口的神经连接。在一系列实验中,我们展示了如何快速学习八个模块,并将其添加到一个基本控制模块中,以获得紧急的自适应行为,从而使六足机器人能够在复杂环境中导航。我们还表明,模块可以在操作过程中添加和删除,而不影响其余控制器的功能。最后,在一个物理六足机器人上成功地演示了该控制方法。综上所述,我们的研究为复杂机器人系统的多功能神经运动控制的快速自动设计迈出了重要的一步。 摘要:Legged robots have significant potential to operate in highly unstructured environments. The design of locomotion control is, however, still challenging. Currently, controllers must be either manually designed for specific robots and tasks, or automatically designed via machine learning methods that require long training times and yield large opaque controllers. Drawing inspiration from animal locomotion, we propose a simple yet versatile modular neural control structure with fast learning. The key advantages of our approach are that behavior-specific control modules can be added incrementally to obtain increasingly complex emergent locomotion behaviors, and that neural connections interfacing with existing modules can be quickly and automatically learned. In a series of experiments, we show how eight modules can be quickly learned and added to a base control module to obtain emergent adaptive behaviors allowing a hexapod robot to navigate in complex environments. We also show that modules can be added and removed during operation without affecting the functionality of the remaining controller. Finally, the control approach was successfully demonstrated on a physical hexapod robot. Taken together, our study reveals a significant step towards fast automatic design of versatile neural locomotion control for complex robotic systems.
【6】 Measuring Fairness in Generative Models 标题:产生式模型中公平性的度量
作者:Christopher T. H Teo,Ngai-Man Cheung 备注:Accepted in ICML 2021 Workshop - Machine Learning for Data: Automated Creation, Privacy, Bias 链接:https://arxiv.org/abs/2107.07754 摘要:深层生成模型在提高训练稳定性和生成数据质量方面取得了很大进展。最近,人们对深层生成数据的公平性越来越感兴趣。公平在许多应用中很重要,例如执法,因为偏见会影响效率。公平数据生成的核心是评估和评估不同生成模型的公平性度量。在本文中,我们首先回顾了公平性指标在以往的工作中提出的,并强调潜在的弱点。然后,我们讨论了一个性能基准框架以及替代指标的评估。 摘要:Deep generative models have made much progress in improving training stability and quality of generated data. Recently there has been increased interest in the fairness of deep-generated data. Fairness is important in many applications, e.g. law enforcement, as biases will affect efficacy. Central to fair data generation are the fairness metrics for the assessment and evaluation of different generative models. In this paper, we first review fairness metrics proposed in previous works and highlight potential weaknesses. We then discuss a performance benchmark framework along with the assessment of alternative metrics.
【7】 Intersectional Bias in Causal Language Models 标题:因果语言模型中的交叉性偏差
作者:Liam Magee,Lida Ghahremanlou,Karen Soldatic,Shanthi Robertson 机构: Western Sydney University, Australia, Microsoft, United Kingdom 备注:18 pages, 4 figures 链接:https://arxiv.org/abs/2107.07691 摘要:为了检验在语言生成中是否可以观察到交叉偏误,我们检验了GPT-2和GPT-NEO模型,其大小从1.24亿到约27亿个参数不等。我们进行了一项实验,将性别、宗教和残疾这三个社会类别组合成无条件或zero-shot提示,用来生成句子,然后分析句子的情感。我们的结果证实了早期使用自回归因果模型(包括GPT模型家族)进行的测试。我们还说明了为什么偏见可能会抵制针对单一类别(如性别、宗教和种族)的技术,因为它也可能以微妙的方式表现在由串联的社会类别引发的文本中。为了解决这些困难,我们建议将技术方法和基于社区的方法结合起来,以承认和解决复杂和交叉的语言模型偏见。 摘要:To examine whether intersectional bias can be observed in language generation, we examine GPT-2 and GPT-NEO models, ranging in size from 124 million to ~2.7 billion parameters. We conduct an experiment combining up to three social categories - gender, religion and disability - into unconditional or zero-shot prompts used to generate sentences that are then analysed for sentiment. Our results confirm earlier tests conducted with auto-regressive causal models, including the GPT family of models. We also illustrate why bias may be resistant to techniques that target single categories (e.g. gender, religion and race), as it can also manifest, in often subtle ways, in texts prompted by concatenated social categories. To address these difficulties, we suggest technical and community-based approaches need to combine to acknowledge and address complex and intersectional language model bias.
【8】 Algorithmic insights on continual learning from fruit flies 标题:从果蝇身上持续学习的算法洞察力
作者:Yang Shen,Sanjoy Dasgupta,Saket Navlakha 机构:Cold Spring Harbor Laboratory, Simons Center for Quantitative Biology, Cold Spring, Harbor, NY, Computer Science and Engineering Department, University of California San Diego, La, Jolla, CA 链接:https://arxiv.org/abs/2107.07617 摘要:由于灾难性遗忘,计算系统中的持续学习具有挑战性。我们在果蝇嗅觉系统中发现了一个两层的神经回路,通过将稀疏编码和联想学习相结合来解决这个问题。在第一层,气味使用稀疏的高维表示进行编码,通过激活不同气味的非重叠神经元群来减少记忆干扰。在第二层,只有气味激活神经元和与气味相关的输出神经元之间的突触在学习过程中发生改变;其余的权重被冻结,以防止不相关的记忆被覆盖。我们的经验和分析表明,这种简单而轻量级的算法显著提高了连续学习的性能。苍蝇联想学习算法与经典的感知器学习算法有着惊人的相似性,尽管有两个改进,这对于减少灾难性遗忘是至关重要的。总的来说,果蝇进化出了一种高效的终身学习算法,神经科学的电路机制可以转化为改进机器计算。 摘要:Continual learning in computational systems is challenging due to catastrophic forgetting. We discovered a two layer neural circuit in the fruit fly olfactory system that addresses this challenge by uniquely combining sparse coding and associative learning. In the first layer, odors are encoded using sparse, high dimensional representations, which reduces memory interference by activating non overlapping populations of neurons for different odors. In the second layer, only the synapses between odor activated neurons and the output neuron associated with the odor are modified during learning; the rest of the weights are frozen to prevent unrelated memories from being overwritten. We show empirically and analytically that this simple and lightweight algorithm significantly boosts continual learning performance. The fly associative learning algorithm is strikingly similar to the classic perceptron learning algorithm, albeit with two modifications, which we show are critical for reducing catastrophic forgetting. Overall, fruit flies evolved an efficient lifelong learning algorithm, and circuit mechanisms from neuroscience can be translated to improve machine computation.
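The two-layer circuit described above can be caricatured in a few lines: a sparse high-dimensional expansion with top-k winner-take-all, and learning that touches only the synapses of currently active neurons. All sizes and the learning rule below are illustrative simplifications.

    import numpy as np

    rng = np.random.default_rng(0)
    d_in, d_hidden, n_classes, k = 50, 2000, 5, 40

    proj = (rng.random((d_in, d_hidden)) < 0.1).astype(float)  # sparse random expansion
    W = np.zeros((d_hidden, n_classes))                        # association weights

    def sparse_code(x):
        # High-dimensional expansion + top-k winner-take-all gives nearly
        # non-overlapping codes for different inputs (less interference).
        h = x @ proj
        code = np.zeros(d_hidden)
        code[np.argsort(h)[-k:]] = 1.0
        return code

    def learn(x, label, lr=0.1):
        # Only synapses on the active "odor" neurons are updated for this class;
        # all other weights stay frozen, protecting unrelated memories.
        W[sparse_code(x) > 0, label] += lr

    xs = rng.random((n_classes, d_in))
    for label, x in enumerate(xs):        # learn the classes strictly in sequence
        learn(x, label)
    preds = [int(np.argmax(W.T @ sparse_code(x))) for x in xs]
    print("recalled labels:", preds)      # early memories survive later learning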
【9】 Globally Convergent Multilevel Training of Deep Residual Networks 标题:深度残差网络的全局收敛多层训练
作者:Alena Kopaničáková,Rolf Krause 机构:Euler Institute, Università della Svizzera italiana 链接:https://arxiv.org/abs/2107.07572 摘要:提出了一种全局收敛的深度残差网络多级训练方法。该方法是递归多层次信赖域(RMTR)方法的一种新变种,它通过在训练过程中自适应地调整小批量大小,在混合(随机-确定性)环境下运行。利用动态系统的观点构造了多级层次结构和传递算子,将通过ResNet的前向传播解释为初值问题的前向Euler离散化。与传统的训练方法不同,本文提出的RMTR方法还利用有限记忆SR1方法,在多层结构的各个层次上引入了曲率信息。利用分类和回归领域的实例,对多级训练方法的整体性能和收敛性进行了数值研究。 摘要:We propose a globally convergent multilevel training method for deep residual networks (ResNets). The devised method can be seen as a novel variant of the recursive multilevel trust-region (RMTR) method, which operates in hybrid (stochastic-deterministic) settings by adaptively adjusting mini-batch sizes during the training. The multilevel hierarchy and the transfer operators are constructed by exploiting a dynamical system's viewpoint, which interprets forward propagation through the ResNet as a forward Euler discretization of an initial value problem. In contrast to traditional training approaches, our novel RMTR method also incorporates curvature information on all levels of the multilevel hierarchy by means of the limited-memory SR1 method. The overall performance and the convergence properties of our multilevel training method are numerically investigated using examples from the field of classification and regression.
【10】 SA-GD: Improved Gradient Descent Learning Strategy with Simulated Annealing 标题:SA-GD:改进的模拟退火梯度下降学习策略
作者:Zhicheng Cai 机构:School of Electronic Science and Engineering, Nanjing University, Nanjing, China 备注:10 pages, 10 figures 链接:https://arxiv.org/abs/2107.07558 摘要:梯度下降算法是优化机器学习问题时最常用的方法。然而,损失函数中存在许多局部极小值和鞍点,特别是对于深度学习这类高维非凸优化问题。梯度下降可能使损失函数陷入这些局部区域,阻碍进一步优化,导致泛化能力差。本文将模拟退火算法的思想引入梯度下降,提出了SA-GD算法。SA-GD方法为模型提供了概率爬山的能力,使模型能够跳出这些局部区域,最终收敛到最优状态。我们以CNN模型为例,在各种基准数据集上测试了基本的CNN模型。与采用传统梯度下降算法的基线模型相比,采用SA-GD算法的模型在不牺牲收敛效率和稳定性的前提下,具有更好的泛化能力。此外,SA-GD可以作为一种有效的集成学习方法,显著提高最终性能。 摘要:Gradient descent is the most utilized method when optimizing machine learning problems. However, there exist many local minima and saddle points in the loss function, especially for high dimensional non-convex optimization problems like deep learning. Gradient descent may leave the loss function trapped in these local regions, impeding further optimization and resulting in poor generalization ability. This paper proposes the SA-GD algorithm, which introduces the idea of simulated annealing to gradient descent. The SA-GD method offers the model the ability to climb hills probabilistically, enabling it to jump out of these local areas and finally converge to an optimal state. We took CNN models as an example and tested the basic CNN models on various benchmark datasets. Compared to the baseline models with the traditional gradient descent algorithm, models trained with the SA-GD algorithm possess better generalization ability without sacrificing the efficiency and stability of model convergence. In addition, SA-GD can be utilized as an effective ensemble learning approach which improves the final performance significantly.
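The SA-GD idea reduces, in caricature, to a descent step that may be accepted even when it increases the loss, with a probability controlled by a cooling temperature. This toy 1-D version only shows the accept/reject mechanic; the paper applies it to full CNN training, and all schedules here are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)

    def loss(w):                      # a 1-D loss with several local minima
        return np.sin(5 * w) + 0.5 * w ** 2

    def grad(w, eps=1e-5):            # numerical gradient for the toy loss
        return (loss(w + eps) - loss(w - eps)) / (2 * eps)

    w, T, lr = 2.0, 1.0, 0.05
    for step in range(500):
        w_new = w - lr * grad(w) + rng.normal(scale=0.1)   # perturbed descent step
        delta = loss(w_new) - loss(w)
        if delta < 0 or rng.random() < np.exp(-delta / T): # hill-climb in probability
            w = w_new
        T *= 0.99                                          # cooling schedule
    print("final w:", round(w, 3), "loss:", round(loss(w), 3))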
【11】 Machine-learning Kondo physics using variational autoencoders 标题:使用变分自动编码器的机器学习近藤物理
作者:Cole Miles,Matthew R. Carbone,Erica J. Sturm,Deyu Lu,Andreas Weichselbaum,Kipton Barros,Robert M. Konik 机构:Department of Physics, Cornell University, Ithaca, New York , USA, Computational Science Initiative, Brookhaven National Laboratory, Upton, New York , USA, Condensed Matter Physics and Materials Science Division 备注:9 pages 5 pages appendix, 14 figures 链接:https://arxiv.org/abs/2107.08013 摘要:我们使用变分自动编码器从单粒子安德森杂质模型光谱函数的数据集中提取物理细节。自动编码器被训练来寻找一个低维的,潜在的空间表示,它忠实地描述了训练集的每个元素,如重建误差所测量的。变分自动编码器是标准自动编码器的一种概率推广,它进一步调节学习的潜在空间,以促进高度可解释的特征。在我们的研究中,我们发现学习到的潜在空间成分与安德森杂质模型中表征涌现行为的众所周知但非平凡的参数密切相关。特别地,一个潜在空间分量与粒子-空穴不对称性相关,而另一个与近藤温度(杂质模型中动态生成的低能标度)近似一一对应。通过符号回归,我们将这个分量建模为物理输入参数的函数,并“重新发现”近藤温度的非微扰公式。我们开发的机器学习管道为在其他物理系统中发现新的领域知识提供了机会。 摘要:We employ variational autoencoders to extract physical insight from a dataset of one-particle Anderson impurity model spectral functions. Autoencoders are trained to find a low-dimensional, latent space representation that faithfully characterizes each element of the training set, as measured by a reconstruction error. Variational autoencoders, a probabilistic generalization of standard autoencoders, further condition the learned latent space to promote highly interpretable features. In our study, we find that the learned latent space components strongly correlate with well known, but nontrivial, parameters that characterize emergent behaviors in the Anderson impurity model. In particular, one latent space component correlates with particle-hole asymmetry, while another is in near one-to-one correspondence with the Kondo temperature, a dynamically generated low-energy scale in the impurity model. With symbolic regression, we model this component as a function of bare physical input parameters and "rediscover" the non-perturbative formula for the Kondo temperature. The machine learning pipeline we develop opens opportunities to discover new domain knowledge in other physical systems.
【12】 Tracing Halpha Fibrils through Bayesian Deep Learning 标题:基于贝叶斯深度学习的Halpha纤维追踪
作者:Haodi Jiang,Ju Jing,Jiasheng Wang,Chang Liu,Qin Li,Yan Xu,Jason T. L. Wang,Haimin Wang 机构:Institute for Space Weather Sciences, New Jersey Institute of Technology, University Heights, Newark, NJ ,-, USA;, Department of Computer Science, New Jersey Institute of Technology, University Heights, Newark, NJ ,-, USA 备注:20 pages, 12 figures 链接:https://arxiv.org/abs/2107.07886 摘要:我们提出了一种新的深度学习方法,称为纤维网,用于跟踪太阳观测Halpha图像中的色球纤维。我们的方法包括一个数据预处理组件,它从一个基于阈值的工具中准备训练数据,一个作为贝叶斯卷积神经网络实现的深度学习模型,用于概率图像分割和不确定性量化以预测纤维,以及包含用于确定纤维取向的纤维拟合算法的后处理组件。FibrilNet工具应用于大熊太阳天文台(BBSO)配备高阶自适应光学系统的1.6米古德太阳望远镜(GST)采集的活动区域(AR 12665)的高分辨率Halpha图像。我们定量评估了FibrilNet工具,并将其图像分割算法和fibril拟合算法与基于阈值的工具进行了比较。我们的实验结果和主要发现总结如下。首先,两种工具的图像分割结果(即检测到的纤维)非常相似,说明纤维网具有良好的学习能力。其次,与基于阈值的工具相比,FibrilNet可以找到更精确、更平滑的原纤方向角。第三,FibrilNet比基于阈值的工具更快,FibrilNet生成的不确定度图不仅提供了一种定量的方法来测量每个检测到的原纤维的置信度,而且有助于识别那些没有被基于阈值的工具检测到但通过机器学习推断出来的原纤维结构。最后,我们将FibrilNet应用于来自其他太阳观测站的全盘Halpha图像和BBSO/GST收集的额外高分辨率Halpha图像,展示了该工具在不同数据集中的可用性。 摘要:We present a new deep learning method, dubbed FibrilNet, for tracing chromospheric fibrils in Halpha images of solar observations. Our method consists of a data pre-processing component that prepares training data from a threshold-based tool, a deep learning model implemented as a Bayesian convolutional neural network for probabilistic image segmentation with uncertainty quantification to predict fibrils, and a post-processing component containing a fibril-fitting algorithm to determine fibril orientations. The FibrilNet tool is applied to high-resolution Halpha images from an active region (AR 12665) collected by the 1.6 m Goode Solar Telescope (GST) equipped with high-order adaptive optics at the Big Bear Solar Observatory (BBSO). We quantitatively assess the FibrilNet tool, comparing its image segmentation algorithm and fibril-fitting algorithm with those employed by the threshold-based tool. Our experimental results and major findings are summarized as follows. First, the image segmentation results (i.e., detected fibrils) of the two tools are quite similar, demonstrating the good learning capability of FibrilNet. Second, FibrilNet finds more accurate and smoother fibril orientation angles than the threshold-based tool. Third, FibrilNet is faster than the threshold-based tool and the uncertainty maps produced by FibrilNet not only provide a quantitative way to measure the confidence on each detected fibril, but also help identify fibril structures that are not detected by the threshold-based tool but are inferred through machine learning. Finally, we apply FibrilNet to full-disk Halpha images from other solar observatories and additional high-resolution Halpha images collected by BBSO/GST, demonstrating the tool's usability in diverse datasets.
【13】 Finite Basis Physics-Informed Neural Networks (FBPINNs): a scalable domain decomposition approach for solving differential equations 标题:有限基物理信息神经网络(FBPINNs):求解微分方程的一种可伸缩区域分解方法
作者:Ben Moseley,Andrew Markham,Tarje Nissen-Meyer 机构:Department of Computer Science, University of Oxford, Oxford, UK, Department of Earth Sciences 备注:27 pages, 13 figures 链接:https://arxiv.org/abs/2107.07871 摘要:近年来,物理信息神经网络(PINNs)为求解微分方程问题提供了一种强有力的新范式。与经典的数值方法相比,PINNs具有许多优点,例如,它们能够提供微分方程的无网格解,并且能够在同一优化问题中进行正演和逆演建模。虽然前景看好,但迄今为止的一个关键限制是,PINNs一直难以准确有效地解决具有大域和/或多尺度解的问题,而这对其实际应用至关重要。多个重要且相关的因素共同导致了这个问题,包括随着问题规模的增长,潜在PINN优化问题的复杂性不断增加,以及神经网络的光谱偏差。在这项工作中,我们提出了一种新的、可扩展的方法来解决与微分方程有关的大规模问题,称为有限基PINNs(FBPINNs)。FBPINNs的灵感来自经典的有限元方法,其中微分方程的解被表示为具有紧支撑的有限组基函数的和。在FBPINNs中,神经网络用于学习这些基函数,这些基函数定义在小的、重叠的子域上。FBPINNs通过在每个子域上使用单独的输入归一化来解决神经网络的光谱偏差,并通过在并行分治方法中使用许多较小的神经网络来降低潜在优化问题的复杂性。我们的数值实验表明,FBPINNs在解决小型和大型多尺度问题方面都是有效的,在精度和所需计算资源方面都优于标准PINNs,为PINNs在大型实际问题中的应用铺平了道路。 摘要:Recently, physics-informed neural networks (PINNs) have offered a powerful new paradigm for solving problems relating to differential equations. Compared to classical numerical methods PINNs have several advantages, for example their ability to provide mesh-free solutions of differential equations and their ability to carry out forward and inverse modelling within the same optimisation problem. Whilst promising, a key limitation to date is that PINNs have struggled to accurately and efficiently solve problems with large domains and/or multi-scale solutions, which is crucial for their real-world application. Multiple significant and related factors contribute to this issue, including the increasing complexity of the underlying PINN optimisation problem as the problem size grows and the spectral bias of neural networks. In this work we propose a new, scalable approach for solving large problems relating to differential equations called Finite Basis PINNs (FBPINNs). FBPINNs are inspired by classical finite element methods, where the solution of the differential equation is expressed as the sum of a finite set of basis functions with compact support. In FBPINNs neural networks are used to learn these basis functions, which are defined over small, overlapping subdomains. FBPINNs are designed to address the spectral bias of neural networks by using separate input normalisation over each subdomain, and reduce the complexity of the underlying optimisation problem by using many smaller neural networks in a parallel divide-and-conquer approach. Our numerical experiments show that FBPINNs are effective in solving both small and larger, multi-scale problems, outperforming standard PINNs in both accuracy and computational resources required, potentially paving the way to the application of PINNs on large, real-world problems.
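The FBPINN ansatz can be pictured in 1-D as a sum of small subdomain models, each multiplied by a compactly supported window and fed an input normalised to its own subdomain. Random-feature models stand in for the trained subdomain networks below; only the decomposition itself is illustrated, not the PINN training loop.

    import numpy as np

    centers = np.linspace(0, 10, 6)    # subdomain centres
    width = 2.5                        # half-width controlling the overlap

    def window(x, c):
        # Smooth bump supported (approximately) on |x - c| <= width.
        r = np.clip(np.abs(x - c) / width, 0, 1)
        return np.cos(0.5 * np.pi * r) ** 2

    def subnet(x, c, seed):
        # Stand-in for a small trained network living on this subdomain.
        rng = np.random.default_rng(seed)
        z = (x - c) / width            # per-subdomain input normalisation
        w, b = rng.normal(size=8), rng.normal(size=8)
        return np.tanh(np.outer(z, w) + b) @ rng.normal(size=8)

    x = np.linspace(0, 10, 200)
    u = sum(window(x, c) * subnet(x, c, i) for i, c in enumerate(centers))
    print(u.shape)    # (200,) -- a global field assembled from local pieces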
【14】 NeXtQSM -- A complete deep learning pipeline for data-consistent quantitative susceptibility mapping trained with hybrid data 标题:NeXtQSM--用混合数据训练的数据一致性定量磁化率图的完整深度学习管道
作者:Francesco Cognolato,Kieran O'Brien,Jin Jin,Simon Robinson,Frederik B. Laun,Markus Barth,Steffen Bollmann 机构: Centre for Advanced Imaging, The University of Queensland, Brisbane, Australia, ARC Training Centre for Innovation in Biomedical Imaging Technology, The, School of Information Technology and Electrical Engineering, The University of 链接:https://arxiv.org/abs/2107.07752 摘要:近年来,基于深度学习的定量磁化率成像(QSM)显示出巨大的潜力,在速度和准确性上都优于传统的非学习方法。然而,目前许多深度学习方法的数据不一致,需要在体训练数据,或不能解决QSM处理管道的所有步骤。在这里,我们旨在克服这些局限性,并开发了一个联合求解QSM处理步骤的框架。我们开发了一种新的混合训练数据生成方法,该方法利用QSM模型项和学习正则化子相结合的变分网络,实现了以数据一致的方式进行背景场校正和偶极子反演的端到端训练。我们证明了NeXtQSM克服了以前模型不可知的深度学习方法的局限性,并且证明了NeXtQSM提供了一个完整的基于深度学习的管道,用于计算稳健、快速和准确的定量磁化率图。 摘要:Deep learning based Quantitative Susceptibility Mapping (QSM) has shown great potential in recent years, outperforming traditional non-learning approaches in speed and accuracy. However, many of the current deep learning approaches are not data consistent, require in vivo training data or do not solve all steps of the QSM processing pipeline. Here we aim to overcome these limitations and developed a framework to solve the QSM processing steps jointly. We developed a new hybrid training data generation method that enables the end-to-end training for solving background field correction and dipole inversion in a data-consistent fashion using a variational network that combines the QSM model term and a learned regularizer. We demonstrate that NeXtQSM overcomes the limitations of previous model-agnostic deep learning methods and show that NeXtQSM offers a complete deep learning based pipeline for computing robust, fast and accurate quantitative susceptibility maps.
【15】 Robust Online Control with Model Misspecification 标题:具有模型偏差的鲁棒在线控制
作者:Xinyi Chen,Udaya Ghai,Elad Hazan,Alexandre Megretski 机构:Department of Computer Science, Princeton University, NJ; Department of Electrical Engineering and Computer Science 链接:https://arxiv.org/abs/2107.07732 摘要:研究了一个未知非线性动力系统的在线控制问题,该系统由一个存在模型偏差的时不变线性系统逼近。我们的研究集中在鲁棒性上,它衡量了与事后最优控制相比,在保持有界的$\ell_2$-增益的同时,所能容忍的偏离假定线性近似的程度。有些模型即使在完全了解其系数的情况下也无法镇定:鲁棒性受到假设动力学与不可镇定动力学集合之间最小距离的限制。因此,有必要假设这个距离的下限。在这个假设下,在充分观察到$d$维状态的情况下,我们描述了一个有效的控制器,它具有$\Omega(\frac{1}{\sqrt{d}})$鲁棒性和维数依赖性接近最优的$\ell_2$-增益。我们还给出了一个效率较低的算法,该算法可以获得与维数无关的恒定鲁棒性,以及有限但次优的$\ell_2$-增益。 摘要:We study online control of an unknown nonlinear dynamical system that is approximated by a time-invariant linear system with model misspecification. Our study focuses on robustness, which measures how much deviation from the assumed linear approximation can be tolerated while maintaining a bounded $\ell_2$-gain compared to the optimal control in hindsight. Some models cannot be stabilized even with perfect knowledge of their coefficients: the robustness is limited by the minimal distance between the assumed dynamics and the set of unstabilizable dynamics. Therefore it is necessary to assume a lower bound on this distance. Under this assumption, and with full observation of the $d$ dimensional state, we describe an efficient controller that attains $\Omega(\frac{1}{\sqrt{d}})$ robustness together with an $\ell_2$-gain whose dimension dependence is near optimal. We also give an inefficient algorithm that attains constant robustness independent of the dimension, with a finite but sub-optimal $\ell_2$-gain.
【16】 Multi-task Learning with Cross Attention for Keyword Spotting 标题:基于交叉注意的关键词识别多任务学习
作者:Takuya Higuchi,Anmol Gupta,Chandra Dhir 机构:Apple, Department of Computer Science, The University of Hong Kong 备注:Submitted to ASRU 2021 链接:https://arxiv.org/abs/2107.07634 摘要:关键词定位(keyword spotting,KWS)是语音应用中的一项重要技术,它使用户能够通过说出关键词短语来激活设备。尽管音素分类器可以用于KWS,从而利用大量为自动语音识别(ASR)转录的数据,但训练标准(音素识别)和目标任务(KWS)之间存在不匹配。最近,多任务学习被应用到KWS中,以同时利用ASR和KWS训练数据。在这种方法中,声学模型的输出被分成两个分支,一个是用ASR数据训练的音素转录,另一个是用KWS数据训练的关键词分类。本文在多任务学习框架下引入了一种交叉注意解码器。与简单分割输出层的传统多任务学习方法不同,交叉注意解码器通过在编码器输出和可训练查询序列之间执行交叉注意来总结来自语音编码器的信息,以预测KWS任务的置信度得分。在KWS任务上的实验结果表明,该方法比传统的分支分裂多任务学习和双向长短时记忆解码器的性能平均提高了12%。 摘要:Keyword spotting (KWS) is an important technique for speech applications, which enables users to activate devices by speaking a keyword phrase. Although a phoneme classifier can be used for KWS, exploiting a large amount of transcribed data for automatic speech recognition (ASR), there is a mismatch between the training criterion (phoneme recognition) and the target task (KWS). Recently, multi-task learning has been applied to KWS to exploit both ASR and KWS training data. In this approach, an output of an acoustic model is split into two branches for the two tasks, one for phoneme transcription trained with the ASR data and one for keyword classification trained with the KWS data. In this paper, we introduce a cross attention decoder in the multi-task learning framework. Unlike the conventional multi-task learning approach with the simple split of the output layer, the cross attention decoder summarizes information from a phonetic encoder by performing cross attention between the encoder outputs and a trainable query sequence to predict a confidence score for the KWS task. Experimental results on KWS tasks show that the proposed approach outperformed the conventional multi-task learning with split branches and a bi-directional long short-term memory decoder by 12% on average.
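A minimal PyTorch sketch of the cross-attention decoder described above: a trainable query sequence attends over the phonetic encoder's outputs, and the pooled summary is mapped to a keyword confidence. Dimensions, pooling, and the sigmoid readout are assumptions, not the paper's exact design.

    import torch
    import torch.nn as nn

    class CrossAttentionKWS(nn.Module):
        # Trainable queries attend over encoder outputs; the pooled summary is
        # mapped to a single keyword confidence score.
        def __init__(self, d_model=128, nhead=4, n_queries=4):
            super().__init__()
            self.queries = nn.Parameter(torch.randn(n_queries, d_model))
            self.attn = nn.MultiheadAttention(d_model, nhead, batch_first=True)
            self.out = nn.Linear(d_model, 1)

        def forward(self, enc):                    # enc: (batch, frames, d_model)
            q = self.queries.unsqueeze(0).expand(enc.size(0), -1, -1)
            summary, _ = self.attn(q, enc, enc)    # cross attention onto encoder
            return torch.sigmoid(self.out(summary.mean(dim=1)))

    enc_out = torch.randn(2, 100, 128)             # e.g. 100 acoustic frames
    print(CrossAttentionKWS()(enc_out).shape)      # -> torch.Size([2, 1])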
其他(11篇)
【1】 SOK: Seeing and Believing: Evaluating the Trustworthiness of Twitter Users 标题:SOK:眼见为实:评估推特用户的可信度
作者:Tanveer Khan,Antonis Michalas 机构:Tampere University, Tampere, Finland 链接:https://arxiv.org/abs/2107.08027 摘要:社交网络和微博服务(如Twitter)在共享数字信息方面发挥着重要作用。尽管社交媒体很受欢迎,也很有用,但在很多情况下,腐败的用户会发现滥用社交媒体的方法,比如通过提高或降低用户的可信度。因此,社交媒体在为获取信息提供前所未有便利的同时,也带来了一个新的挑战:确定共享信息的可信度。目前,还没有自动的方法来确定哪些新闻或用户是可信的,哪些是不可信的。因此,建立一个能够衡量社交媒体用户可信度的系统已经成为一个非常重要的问题。给用户分配可信度分数,不仅激起了研究界的兴趣,也激起了各方主要参与者的兴趣,例如产业界的Facebook和社会领域的政党。在这项工作中,我们创建了一个模型,我们希望它最终将促进和支持社交网络社区中信任的增加。我们的模型收集了数据,并分析了大约5万名政客在推特上的行为。基于几个选定特征的影响力分数被分配给每个被评估的用户。此外,我们使用随机森林、多层感知器和支持向量机将政治Twitter用户分类为可信或不可信。采用主动学习模型对数据集中未标记的模糊记录进行分类。最后,我们以查准率、召回率、F1分数和准确率作为主要的评估指标来衡量模型的性能。 摘要:Social networking and micro-blogging services, such as Twitter, play an important role in sharing digital information. Despite the popularity and usefulness of social media, there have been many instances where corrupted users found ways to abuse it, for instance through raising or lowering a user's credibility. As a result, while social media facilitates an unprecedented ease of access to information, it also introduces a new challenge - that of ascertaining the credibility of shared information. Currently, there is no automated way of determining which news or users are credible and which are not. Hence, establishing a system that can measure the social media user's credibility has become an issue of great importance. Assigning a credibility score to a user has piqued the interest of not only the research community but also most of the big players on both sides - such as Facebook, on the side of industry, and political parties on the societal one. In this work, we created a model which, we hope, will ultimately facilitate and support the increase of trust in the social network communities. Our model collected data and analysed the behaviour of ~50,000 politicians on Twitter. An influence score, based on several chosen features, was assigned to each evaluated user. Further, we classified the political Twitter users as either trusted or untrusted using random forest, multilayer perceptron, and support vector machine. An active learning model was used to classify any unlabelled ambiguous records from our dataset. Finally, to measure the performance of the proposed model, we used precision, recall, F1 score, and accuracy as the main evaluation metrics.
【2】 Controlled AutoEncoders to Generate Faces from Voices 标题:受控自动编码器,可从语音生成人脸
作者:Hao Liang,Lulan Yu,Guikang Xu,Bhiksha Raj,Rita Singh 机构:Carnegie Mellon University, Pittsburgh PA , USA 链接:https://arxiv.org/abs/2107.07988 摘要:以往的多项研究表明,人声特征与面部特征之间存在着很强的相关性。然而,现有的方法只是从声音中生成人脸,而没有探索导致这些观察到的相关性的一组特征。为了探索这一点,可以将问题重新表述为:"为了被视为源语音的发出者,目标人脸需要改变多少?"据此,本文提出了一种根据给定语音对目标人脸进行变形的框架,其中面部特征由学习到的语音-人脸相关性隐式引导。我们的框架包括一个引导式自动编码器,可将一张脸转换为另一张脸,由一个称为门控控制器的独特模型条件组件控制,该控制器根据输入语音记录修改重建的脸。我们通过人类受试者和人脸检索在VoxCeleb和VGGFace数据集上对该框架进行了评估。各种实验证明了该模型的有效性。 摘要:Multiple studies in the past have shown that there is a strong correlation between human vocal characteristics and facial features. However, existing approaches generate faces simply from voice, without exploring the set of features that contribute to these observed correlations. A computational methodology to explore this can be devised by rephrasing the question to: "how much would a target face have to change in order to be perceived as the originator of a source voice?" With this in perspective, we propose a framework to morph a target face in response to a given voice in a way that facial features are implicitly guided by learned voice-face correlation in this paper. Our framework includes a guided autoencoder that converts one face to another, controlled by a unique model-conditioning component called a gating controller which modifies the reconstructed face based on input voice recordings. We evaluate the framework on the VoxCeleb and VGGFace datasets through human subjects and face retrieval. Various experiments demonstrate the effectiveness of our proposed model.
【3】 S2TA: Exploiting Structured Sparsity for Energy-Efficient Mobile CNN Acceleration 标题:S2TA:利用结构稀疏性实现节能移动CNN加速
作者:Zhi-Gang Liu,Paul N. Whatmough,Yuhao Zhu,Matthew Mattina 机构:Arm ML Research Lab, Boston, MA, USA, University of Rochester, Rochester, NY, USA 链接:https://arxiv.org/abs/2107.07983 摘要:利用稀疏性是加速量化卷积神经网络(CNN)在移动设备上推理的关键技术。以前的稀疏CNN加速器很大程度上利用了非结构化稀疏性,实现了显著的加速。然而,由于稀疏模式的无界性和很大程度上的不可预测性,利用非结构化稀疏性需要复杂的硬件设计,带来显著的能量和面积开销,这对于能量和面积效率至关重要的移动/IoT推理场景尤其不利。我们建议对权重和激活同时利用结构化稀疏性,更具体地说,是密度限制块(DBB)稀疏性。DBB块张量限制了每个块的最大非零数。因此,DBB暴露出静态可预测的稀疏模式,使精简的稀疏利用硬件成为可能。我们提出了新的硬件原语,以非常低的开销分别实现(静态)权重和(动态)激活的DBB稀疏性。在这些原语的基础上,我们描述了S2TA,一个基于脉动阵列的CNN加速器,它利用了联合权重和激活DBB稀疏性,以及传统脉动阵列上无法利用的数据重用的新维度。与具有零值时钟门控的脉动阵列的强基线相比,16nm的S2TA在5个流行的CNN基准上实现了超过2倍的加速和能量降低。与两种最新的非脉动稀疏加速器Eyeriss v2(65nm)和SparTen(45nm)相比,65nm的S2TA每次推理使用的能量分别减少了约2.2倍和3.1倍。 摘要:Exploiting sparsity is a key technique in accelerating quantized convolutional neural network (CNN) inference on mobile devices. Prior sparse CNN accelerators largely exploit unstructured sparsity and achieve significant speedups. Due to the unbounded, largely unpredictable sparsity patterns, however, exploiting unstructured sparsity requires complicated hardware design with significant energy and area overhead, which is particularly detrimental to mobile/IoT inference scenarios where energy and area efficiency are crucial. We propose to exploit structured sparsity, more specifically, Density Bound Block (DBB) sparsity for both weights and activations. DBB block tensors bound the maximum number of non-zeros per block. DBB thus exposes statically predictable sparsity patterns that enable lean sparsity-exploiting hardware. We propose new hardware primitives to implement DBB sparsity for (static) weights and (dynamic) activations, respectively, with very low overheads. Building on top of the primitives, we describe S2TA, a systolic array-based CNN accelerator that exploits joint weight and activation DBB sparsity and new dimensions of data reuse unavailable on the traditional systolic array. S2TA in 16nm achieves more than 2x speedup and energy reduction compared to a strong baseline of a systolic array with zero-value clock gating, over five popular CNN benchmarks. Compared to two recent non-systolic sparse accelerators, Eyeriss v2 (65nm) and SparTen (45nm), S2TA in 65nm uses about 2.2x and 3.1x less energy per inference, respectively.
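Density-Bound Block (DBB) sparsity itself is easy to state in code: within every fixed-size block of weights, at most a fixed number of entries may be non-zero. The magnitude-based pruning below is a generic way to impose the constraint, not necessarily how S2TA's training flow does it.

    import numpy as np

    def dbb_prune(w, block=8, max_nz=2):
        # Keep at most `max_nz` largest-magnitude entries per block of `block`
        # consecutive weights, so hardware can budget a fixed non-zero count.
        w = w.copy().reshape(-1, block)
        for row in w:
            row[np.argsort(np.abs(row))[:-max_nz]] = 0.0
        return w.reshape(-1)

    weights = np.random.default_rng(0).normal(size=32)
    print(dbb_prune(weights).reshape(-1, 8))   # each row has at most 2 non-zeros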
【4】 Single Pass Entrywise-Transformed Low Rank Approximation 标题:单遍逐项变换的低秩逼近
作者:Yifei Jiang,Yi Li,Yiming Sun,Jiaxin Wang,David P. Woodruff 机构:School of Physical and Mathematical Sciences, Nanyang Technological University 备注:Accepted to ICML 2021 链接:https://arxiv.org/abs/2107.07889 摘要:在诸如自然语言处理或计算机视觉等应用中,给定一个大的$n \times d$矩阵$A=(a_{i,j})$,希望计算将函数$f$逐项应用于$A$所得矩阵$f(A)=(f(a_{i,j}))$的矩阵分解,例如低秩近似。一个非常重要的特例是似然函数$f(A)=\log(|a_{i,j}|+1)$。一种自然的做法是直接对$A$的每个条目应用$f$,然后计算矩阵分解,但这需要存储整个$A$并对其条目进行多次遍历。Liang等人最近的工作展示了如何仅使用$n\cdot\operatorname{poly}(\epsilon^{-1}k\log n)$个内存字,为$n\times n$矩阵$A$求得$f(A)$的秩-$k$分解,总体误差为$10\|f(A)-[f(A)]_k\|_F^2+\operatorname{poly}(\epsilon/k)\|f(A)\|_{1,2}^2$,式中$[f(A)]_k$是$f(A)$的最佳秩-$k$近似,而$\|f(A)\|_{1,2}^2$是$f(A)$各行欧氏长度之和的平方。他们的算法对$A$的条目进行三次遍历。作者提出了一个公开问题:能否仅对$A$的条目进行一次遍历,得到一个使用$n\cdot\operatorname{poly}(\epsilon^{-1}k\log n)$个内存字的算法。本文解决了这一公开问题,针对该问题以及Liang等人研究的同一类函数$f$,给出了第一个单遍算法。此外,我们的误差为$\|f(A)-[f(A)]_k\|_F^2+\operatorname{poly}(\epsilon/k)\|f(A)\|_F^2$,其中$\|f(A)\|_F^2$是$f(A)$各行欧氏长度的平方和。因此我们的误差要小得多:它既去掉了因子$10$,又有$\|f(A)\|_F^2\leq\|f(A)\|_{1,2}^2$。我们还给出了一个回归算法,指出了以前工作中的一个错误,并对我们的结果进行了实证验证。 摘要:In applications such as natural language processing or computer vision, one is given a large $n \times d$ matrix $A = (a_{i,j})$ and would like to compute a matrix decomposition, e.g., a low rank approximation, of a function $f(A) = (f(a_{i,j}))$ applied entrywise to $A$. A very important special case is the likelihood function $f(A) = \log(|a_{i,j}| + 1)$. A natural way to do this would be to simply apply $f$ to each entry of $A$, and then compute the matrix decomposition, but this requires storing all of $A$ as well as multiple passes over its entries. Recent work of Liang et al. shows how to find a rank-$k$ factorization to $f(A)$ for an $n \times n$ matrix $A$ using only $n \cdot \operatorname{poly}(\epsilon^{-1}k\log n)$ words of memory, with overall error $10\|f(A)-[f(A)]_k\|_F^2 + \operatorname{poly}(\epsilon/k)\|f(A)\|_{1,2}^2$, where $[f(A)]_k$ is the best rank-$k$ approximation to $f(A)$ and $\|f(A)\|_{1,2}^2$ is the square of the sum of Euclidean lengths of rows of $f(A)$. Their algorithm uses three passes over the entries of $A$. The authors pose the open question of obtaining an algorithm with $n \cdot \operatorname{poly}(\epsilon^{-1}k\log n)$ words of memory using only a single pass over the entries of $A$. In this paper we resolve this open question, obtaining the first single-pass algorithm for this problem and for the same class of functions $f$ studied by Liang et al. Moreover, our error is $\|f(A)-[f(A)]_k\|_F^2 + \operatorname{poly}(\epsilon/k)\|f(A)\|_F^2$, where $\|f(A)\|_F^2$ is the sum of squares of Euclidean lengths of rows of $f(A)$. Thus our error is significantly smaller, as it removes the factor of $10$ and also $\|f(A)\|_F^2 \leq \|f(A)\|_{1,2}^2$. We also give an algorithm for regression, pointing out an error in previous work, and empirically validate our results.
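For orientation, the object being approximated is simply the best rank-$k$ factorisation of the entrywise-transformed matrix; the snippet below computes it offline with a full SVD, which is exactly the memory-heavy baseline that the paper's single-pass streaming algorithm avoids.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(100, 40))
    fA = np.log(np.abs(A) + 1.0)            # f applied entrywise: log(|a_ij| + 1)

    k = 5
    U, s, Vt = np.linalg.svd(fA, full_matrices=False)
    fA_k = (U[:, :k] * s[:k]) @ Vt[:k]      # best rank-k approximation [f(A)]_k
    print("relative error:", np.linalg.norm(fA - fA_k) / np.linalg.norm(fA))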
【5】 When does loss-based prioritization fail? 标题:基于损失的优先级排序何时会失败?
作者:Niel Teng Hu,Xinyu Hu,Rosanne Liu,Sara Hooker,Jason Yosinski 链接:https://arxiv.org/abs/2107.07741 摘要:并不是所有的例子都是平等的,但标准的深度神经网络训练协议统一处理每个训练点。每个示例通过网络向前和向后传播相同的次数,与示例对学习协议的贡献程度无关。最近的工作提出了一些方法来加快训练的偏离这种统一的待遇。流行的方法是对损失较大的例子进行加权,直觉地认为损失较小的例子已经被模型学习,因此它们对训练过程的边际价值应该较低。这种观点认为,用高损失的例子来更新模型对模型是有益的。然而,这可能不适用于嘈杂的真实世界数据。在本文中,我们从理论上证明了基于损失的加速方法在有噪声和损坏数据的情况下会退化。我们的工作表明,衡量实例难度的方法需要正确地将噪声从其他类型的具有挑战性的实例中分离出来。 摘要:Not all examples are created equal, but standard deep neural network training protocols treat each training point uniformly. Each example is propagated forward and backward through the network the same amount of times, independent of how much the example contributes to the learning protocol. Recent work has proposed ways to accelerate training by deviating from this uniform treatment. Popular methods entail up-weighting examples that contribute more to the loss with the intuition that examples with low loss have already been learned by the model, so their marginal value to the training procedure should be lower. This view assumes that updating the model with high loss examples will be beneficial to the model. However, this may not hold for noisy, real world data. In this paper, we theorize and then empirically demonstrate that loss-based acceleration methods degrade in scenarios with noisy and corrupted data. Our work suggests measures of example difficulty need to correctly separate out noise from other types of challenging examples.
【6】 An Energy-Efficient Edge Computing Paradigm for Convolution-based Image Upsampling 标题:一种面向基于卷积的图像上采样的节能边缘计算范式
作者:Ian Colbert,Ken Kreutz-Delgado,Srinjoy Das 链接:https://arxiv.org/abs/2107.07647 摘要:针对实时的基于深度学习的图像上采样应用,我们提出一种新颖的节能边缘计算范式。目前最先进的深度学习图像上采样方案,使用 resize 卷积或亚像素卷积来学习能生成高保真、低伪影图像的卷积核。然而,用这些学得的卷积核做推理,需要内存密集的特征图变换,而这些变换主导了实时应用中的时间与能量开销。为缓解内存带宽压力,我们把 resize 卷积或亚像素卷积的使用限制在云端训练阶段:在部署前把学得的卷积核变换为反卷积核,再作为功能等效的反卷积用于推理。这些核变换是从训练切换到推理时的一次性成本,使系统设计者能够在各算法最合适的环境中使用它们:既保持云端训练学到的图像保真度,又最小化边缘推理期间的数据传输代价。我们还考察了现有反卷积推理算法的变体,并提出一个新的变体以供考虑。我们用时间与能耗的定量模型分析并比较了基于卷积的上采样算法的推理特性,结果表明:与亚像素或 resize 卷积方案相比,在边缘用反卷积做推理能同时改善系统延迟和能效。 摘要:A novel energy-efficient edge computing paradigm is proposed for real-time deep learning-based image upsampling applications. State-of-the-art deep learning solutions for image upsampling are currently trained using either resize or sub-pixel convolution to learn kernels that generate high fidelity images with minimal artifacts. However, performing inference with these learned convolution kernels requires memory-intensive feature map transformations that dominate time and energy costs in real-time applications. To alleviate this pressure on memory bandwidth, we confine the use of resize or sub-pixel convolution to training in the cloud by transforming learned convolution kernels to deconvolution kernels before deploying them for inference as a functionally equivalent deconvolution. These kernel transformations, intended as a one-time cost when shifting from training to inference, enable a systems designer to use each algorithm in their optimal context by preserving the image fidelity learned when training in the cloud while minimizing data transfer penalties during inference at the edge. We also explore existing variants of deconvolution inference algorithms and introduce a novel variant for consideration. We analyze and compare the inference properties of convolution-based upsampling algorithms using a quantitative model of incurred time and energy costs and show that using deconvolution for inference at the edge improves both system latency and energy efficiency when compared to their sub-pixel or resize convolution counterparts.
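下面用 PyTorch 给出两条上采样路径的一个最小示意:亚像素卷积(conv + PixelShuffle,论文建议留在云端训练)与跨步反卷积(ConvTranspose2d,论文建议经核变换后部署到边缘做推理)。使两者功能等效的具体核变换见论文;此处的层结构与超参数均为示例假设,仅演示两者输出形状一致:

```python
# 两种基于卷积的上采样路径(示意):亚像素卷积 vs. 跨步反卷积。
import torch
import torch.nn as nn

r, c_in, c_out = 2, 16, 3  # 上采样倍率与通道数(示例取值)

subpixel = nn.Sequential(
    nn.Conv2d(c_in, c_out * r * r, kernel_size=3, padding=1),
    nn.PixelShuffle(r),               # 把通道维重排为空间分辨率
)
deconv = nn.ConvTranspose2d(c_in, c_out, kernel_size=2 * r, stride=r, padding=r // 2)

x = torch.randn(1, c_in, 32, 32)
print(subpixel(x).shape, deconv(x).shape)  # 两者均输出 (1, 3, 64, 64)
```

设计上的取舍正如摘要所述:亚像素路径的特征图变换内存开销大,适合云端训练;反卷积路径推理时的数据搬运更少,适合边缘部署。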
【7】 Efficient Bayesian Sampling Using Normalizing Flows to Assist Markov Chain Monte Carlo Methods 标题:用归一化流辅助马尔可夫链蒙特卡罗方法的高效贝叶斯抽样
作者:Marylou Gabrié,Grant M. Rotskoff,Eric Vanden-Eijnden 机构:Stanford University, Stanford, CA 94305; Courant Institute, New York University 链接:https://arxiv.org/abs/2107.08001 摘要:归一化流可以生成复杂的目标分布,因此在贝叶斯统计的许多应用中,有望作为MCMC后验采样的替代或补充。由于事先没有来自目标后验分布的数据集可用,流通常用反向Kullback-Leibler(KL)散度来训练,它只需要来自基分布的样本。当后验很复杂、难以用未经训练的归一化流采样时,这种策略可能效果不佳。在这里,我们探索一种不同的训练策略,以直接KL散度作为损失,其中后验样本通过以下方式产生:(i)用归一化流辅助作用于后验的局部MCMC算法,以加速其混合速率;(ii)用由此生成的数据来训练该流。该方法只需要关于后验的有限先验(a priori)信息,并且可以用来估计模型验证所需的证据,正如我们在示例中所展示的。 摘要:Normalizing flows can generate complex target distributions and thus show promise in many applications in Bayesian statistics as an alternative or complement to MCMC for sampling posteriors. Since no data set from the target posterior distribution is available beforehand, the flow is typically trained using the reverse Kullback-Leibler (KL) divergence that only requires samples from a base distribution. This strategy may perform poorly when the posterior is complicated and hard to sample with an untrained normalizing flow. Here we explore a distinct training strategy, using the direct KL divergence as loss, in which samples from the posterior are generated by (i) assisting a local MCMC algorithm on the posterior with a normalizing flow to accelerate its mixing rate and (ii) using the data generated this way to train the flow. The method only requires a limited amount of \textit{a priori} input about the posterior, and can be used to estimate the evidence required for model validation, as we illustrate on examples.
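下面是"流辅助 MCMC"思想的一个最小 numpy 示意:把归一化流当作独立 Metropolis-Hastings 的提议分布,接受的样本随后可用于以直接 KL 训练流(训练步此处省略)。其中 flow_sample/flow_logprob 是假设的接口,目标后验也是虚构的玩具例子:

```python
# 流辅助 MCMC(示意):以"流"为提议分布的独立 Metropolis-Hastings。
import numpy as np

rng = np.random.default_rng(0)

def log_post(x):                      # 玩具目标:一维双峰未归一化对数后验(假设)
    return np.logaddexp(-0.5 * (x - 3) ** 2, -0.5 * (x + 3) ** 2)

def flow_sample():                    # 占位"流":宽高斯;实际应为已部分训练的归一化流
    return rng.normal(0.0, 5.0)

def flow_logprob(x):
    return -0.5 * (x / 5.0) ** 2 - np.log(5.0 * np.sqrt(2 * np.pi))

x, chain = 0.0, []
for _ in range(5000):
    y = flow_sample()                 # 流提出全局移动,帮助局部 MCMC 跨越峰间
    log_alpha = (log_post(y) - log_post(x)) - (flow_logprob(y) - flow_logprob(x))
    if np.log(rng.uniform()) < log_alpha:
        x = y
    chain.append(x)                   # chain 即可作为训练流的后验样本
```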
【8】 A Causal Perspective on Meaningful and Robust Algorithmic Recourse 标题:关于有意义且鲁棒的算法追索的因果视角
作者:Gunnar König,Timo Freiesleben,Moritz Grosse-Wentrup 机构:Institute for Statistics, University of Vienna; Munich Center for Mathematical Philosophy 备注:ICML (International Conference on Machine Learning) Workshop on Algorithmic Recourse 链接:https://arxiv.org/abs/2107.07853 摘要:算法追索解释告知利益相关者如何采取行动以扭转不利的预测。然而,一般而言,ML模型在干预分布下的预测并不好。因此,一个以期望方式改变预测的行动,未必带来底层目标的改善。这样的追索既无意义,对模型的重新拟合也不鲁棒。我们扩展Karimi等人(2021)的工作,提出了有意义的算法追索(MAR),它只推荐能同时改善预测和目标的行动。我们通过强调模型审计与有意义、可操作的追索解释之间的差异,来论证这一选择约束的合理性。此外,我们引入了MAR的一种松弛,称为有效算法追索(EAR):在某些假设下,它通过只允许对目标的原因进行干预来产生有意义的追索。 摘要:Algorithmic recourse explanations inform stakeholders on how to act to revert unfavorable predictions. However, in general ML models do not predict well in interventional distributions. Thus, an action that changes the prediction in the desired way may not lead to an improvement of the underlying target. Such recourse is neither meaningful nor robust to model refits. Extending the work of Karimi et al. (2021), we propose meaningful algorithmic recourse (MAR) that only recommends actions that improve both prediction and target. We justify this selection constraint by highlighting the differences between model audit and meaningful, actionable recourse explanations. Additionally, we introduce a relaxation of MAR called effective algorithmic recourse (EAR), which, under certain assumptions, yields meaningful recourse by only allowing interventions on causes of the target.
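下面用一个极小的线性结构因果模型(变量与系数均为假设)示意摘要的核心论点:对目标的非原因特征进行干预可以改变模型预测,却不会改善底层目标:

```python
# 玩具 SCM(示意):Y 由 X1 导致;X2 只是 Y 的"果",不是原因。
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x1 = rng.normal(size=n)                       # X1:目标的真实原因
y = 2.0 * x1 + rng.normal(scale=0.1, size=n)  # Y := 2*X1 + 噪声
x2 = y + rng.normal(scale=0.1, size=n)        # X2 := Y + 噪声(非原因)

# 在观测数据上用最小二乘拟合预测模型:它会重用与 Y 高度相关的 X2
X = np.column_stack([x1, x2])
w, *_ = np.linalg.lstsq(X, y, rcond=None)

x_before = np.array([0.0, 0.0])
x_after = np.array([0.0, 5.0])                # do(X2 = 5):只干预非原因特征
print("预测变化:", (x_after - x_before) @ w)   # 预测显著上升……
# ……但在该干预下 Y := 2*X1 不变,目标毫无改善:这种追索"无意义",
# 而 EAR 通过只允许干预目标的原因(此处为 X1)来排除它。
```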
【9】 Entropic alternatives to initialization 标题:初始化的熵替代方案
作者:Daniele Musso 机构:Centro de Supercomputación de Galicia (CESGA), Avenida de Vigo s/n, Santiago de Compostela, Spain 备注:19 pages, 5 figures, 2 appendices 链接:https://arxiv.org/abs/2107.07757 摘要:局部熵损失函数提供了一个通用框架,用以定义架构感知的正则化过程。除了可以在突触空间中各向异性之外,损失函数的局部熵平滑还可以在训练过程中变化,从而得到可调的模型复杂度。一种在训练早期正则化较强、随后逐渐减弱的调度方案,可以替代深度卷积神经网络的标准初始化流程,而且具有更广的适用性。我们用统计物理与信息论的语言分析各向异性的局部熵平滑,为其解释与工作机理提供洞见。我们还评述了与重正化物理以及卷积网络时空结构相关的若干方面。 摘要:Local entropic loss functions provide a versatile framework to define architecture-aware regularization procedures. Besides the possibility of being anisotropic in the synaptic space, the local entropic smoothening of the loss function can vary during training, thus yielding a tunable model complexity. A scoping protocol where the regularization is strong in the early-stage of the training and then fades progressively away constitutes an alternative to standard initialization procedures for deep convolutional neural networks, nonetheless, it has wider applicability. We analyze anisotropic, local entropic smoothenings in the language of statistical physics and information theory, providing insight into both their interpretation and workings. We comment some aspects related to the physics of renormalization and the spacetime structure of convolutional networks.
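下面是一个最小的 PyTorch 示意:用高斯权重扰动的蒙特卡洛平均近似平滑损失 $\mathbb{E}_{\delta\sim\mathcal{N}(0,\sigma^2 I)}[L(w+\delta)]$,并随训练退火 $\sigma$,以体现"早期强、随后逐渐消失"的调度思想。这只是局部熵正则化思想的一个简化替身(并非论文的精确局部熵定义),函数名与退火日程均为假设:

```python
# 高斯平滑损失(示意):在扰动后的权重处评估并累积梯度。
import torch

def smoothed_loss_step(model, loss_fn, optimizer, x, y, sigma, n_samples=4):
    """以 N(w, σ²I) 扰动做蒙特卡洛平均;σ 大时正则化强,σ→0 退化为普通 SGD。"""
    optimizer.zero_grad()
    params = list(model.parameters())
    avg_loss = 0.0
    for _ in range(n_samples):
        noises = [sigma * torch.randn_like(p) for p in params]
        with torch.no_grad():
            for p, eps in zip(params, noises):
                p.add_(eps)                # 移动到扰动权重 w + δ
        loss = loss_fn(model(x), y) / n_samples
        loss.backward()                    # 在扰动点计算并累积梯度
        with torch.no_grad():
            for p, eps in zip(params, noises):
                p.sub_(eps)                # 恢复原权重 w
        avg_loss += loss.item()
    optimizer.step()
    return avg_loss

# σ 退火示例(假设的日程):sigma_t = sigma_0 * max(0.0, 1.0 - t / T)
```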
【10】 Auto-differentiable Ensemble Kalman Filters 标题:自动可微集合Kalman滤波器
作者:Yuming Chen,Daniel Sanz-Alonso,Rebecca Willett 机构:University of Chicago 链接:https://arxiv.org/abs/2107.07687 摘要:数据同化关注的是对随时间演化的状态进行序贯估计。这项任务出现在广泛的科学与工程应用中,当状态是高维的且状态空间动力学未知时尤其具有挑战性。本文介绍了一个在数据同化中学习动力系统的机器学习框架。我们的自动可微集合Kalman滤波器(AD-EnKFs)将用于状态恢复的集合Kalman滤波器与用于学习动力学的机器学习工具相结合。由此,AD-EnKFs既利用了集合Kalman滤波器扩展到高维状态的能力,又利用自动微分来训练动力学的高维代理模型。使用Lorenz-96模型的数值结果表明,AD-EnKFs优于现有的用期望最大化或粒子滤波来融合数据同化与机器学习的方法。此外,AD-EnKFs易于实现,且几乎不需要调参。 摘要:Data assimilation is concerned with sequentially estimating a temporally-evolving state. This task, which arises in a wide range of scientific and engineering applications, is particularly challenging when the state is high-dimensional and the state-space dynamics are unknown. This paper introduces a machine learning framework for learning dynamical systems in data assimilation. Our auto-differentiable ensemble Kalman filters (AD-EnKFs) blend ensemble Kalman filters for state recovery with machine learning tools for learning the dynamics. In doing so, AD-EnKFs leverage the ability of ensemble Kalman filters to scale to high-dimensional states and the power of automatic differentiation to train high-dimensional surrogate models for the dynamics. Numerical results using the Lorenz-96 model show that AD-EnKFs outperform existing methods that use expectation-maximization or particle filters to merge data assimilation and machine learning. In addition, AD-EnKFs are easy to implement and require minimal tuning.
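下面用 numpy 给出集合 Kalman 滤波分析步的一个最小示意(线性观测、摄动观测形式;维数、观测算子等均为示例假设)。AD-EnKF 的要点是把整个滤波过程写成可自动微分的形式,从而端到端训练动力学代理模型,此处不展开:

```python
# 集合 Kalman 滤波分析步(示意):摄动观测版本。
import numpy as np

rng = np.random.default_rng(0)

def enkf_analysis(E, H, y, R):
    """E: (d, N) 状态集合;H: (m, d) 观测算子;y: (m,) 观测;R: (m, m) 观测噪声协方差。"""
    d, N = E.shape
    X = E - E.mean(axis=1, keepdims=True)                 # 集合异常
    P = X @ X.T / (N - 1)                                 # 样本协方差
    K = P @ H.T @ np.linalg.solve(H @ P @ H.T + R, np.eye(len(y)))  # Kalman 增益
    Y = y[:, None] + rng.multivariate_normal(np.zeros(len(y)), R, size=N).T  # 摄动观测
    return E + K @ (Y - H @ E)                            # 分析(后验)集合

# 用法示例:d=40 对应 Lorenz-96 的常见设置,此处仅演示形状
E = rng.standard_normal((40, 20))
H = np.eye(20, 40)                                        # 观测前 20 个分量(假设)
y = rng.standard_normal(20)
E_post = enkf_analysis(E, H, y, 0.5 * np.eye(20))
```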
【11】 Improving application performance with biased distributions of quantum states 标题:利用量子态的偏置分布提高应用性能
作者:Sanjaya Lohani,Joseph M. Lukens,Daniel E. Jones,Thomas A. Searles,Ryan T. Glasser,Brian T. Kirby 机构:IBM-HBCU Quantum Center, Howard University, Washington, DC , USA, Tulane University, New Orleans, LA , USA, Quantum Information Science Group, Oak Ridge National Laboratory, Oak Ridge, Tennessee , USA, United States Army Research Laboratory, Adelphi, MD , USA 备注:16 pages, 15 figures 链接:https://arxiv.org/abs/2107.07642 摘要:我们考虑任意维数混合量子态的一类特定分布的性质,该分布可以偏向特定的平均纯度。特别地,我们分析了以Dirichlet分布为系数的Haar随机纯态的混合。我们解析地导出了在任意维数下匹配 Bures 与 Hilbert-Schmidt 分布的平均纯度所需的集中参数。数值模拟表明,该取值精确地恢复了 Hilbert-Schmidt 分布,为服从 Hilbert-Schmidt 分布的随机量子态系综提供了另一种直观的物理解释。然后,我们演示了如何用这些Dirichlet加权的Haar混合替代 Bures 与 Hilbert-Schmidt 分布,从而在基于机器学习的量子态层析系统和贝叶斯量子态重建中获得可测量的性能优势。最后,我们在实验上刻画了云端访问的IBM量子计算机与自建的偏振纠缠光子源所产生的量子态的分布。在各种实验条件下,我们的方法都能比 Bures 或 Hilbert-Schmidt 分布态更贴近底层分布。 摘要:We consider the properties of a specific distribution of mixed quantum states of arbitrary dimension that can be biased towards a specific mean purity. In particular, we analyze mixtures of Haar-random pure states with Dirichlet-distributed coefficients. We analytically derive the concentration parameters required to match the mean purity of the Bures and Hilbert--Schmidt distributions in any dimension. Numerical simulations suggest that this value recovers the Hilbert--Schmidt distribution exactly, offering an alternative and intuitive physical interpretation for ensembles of Hilbert--Schmidt-distributed random quantum states. We then demonstrate how substituting these Dirichlet-weighted Haar mixtures in place of the Bures and Hilbert--Schmidt distributions results in measurable performance advantages in machine-learning-based quantum state tomography systems and Bayesian quantum state reconstruction. Finally, we experimentally characterize the distribution of quantum states generated by both a cloud-accessed IBM quantum computer and an in-house source of polarization-entangled photons. In each case, our method can more closely match the underlying distribution than either Bures or Hilbert--Schmidt distributed states for various experimental conditions.
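下面是该构造的一个最小 numpy 示意:以 Dirichlet 分布的权重混合 Haar 随机纯态 $\rho=\sum_i p_i|\psi_i\rangle\langle\psi_i|$,并数值地观察集中参数 $\alpha$ 如何使平均纯度 $\mathrm{Tr}(\rho^2)$ 发生偏置。维数、分量数与 $\alpha$ 取值均为示例;匹配 Bures/Hilbert-Schmidt 平均纯度所需的解析 $\alpha$ 值见论文:

```python
# Dirichlet 加权 Haar 混合态(示意):α 控制平均纯度的偏置。
import numpy as np

rng = np.random.default_rng(0)

def haar_pure_state(d):
    z = rng.standard_normal(d) + 1j * rng.standard_normal(d)
    return z / np.linalg.norm(z)          # 复高斯向量归一化 = Haar 随机纯态

def dirichlet_haar_mixture(d, n_components, alpha):
    p = rng.dirichlet([alpha] * n_components)      # Dirichlet 分布的混合系数
    rho = np.zeros((d, d), dtype=complex)
    for w in p:
        psi = haar_pure_state(d)
        rho += w * np.outer(psi, psi.conj())
    return rho

for alpha in (0.1, 1.0, 10.0):            # α 越小,权重越集中,态越接近纯态
    purities = [np.real(np.trace(rho @ rho))
                for rho in (dirichlet_haar_mixture(4, 4, alpha) for _ in range(200))]
    print(alpha, np.mean(purities))       # 平均纯度 Tr(ρ²) 随 α 单调变化
```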