cs.LG 方向,今日共计95篇
Graph相关(图学习|图神经网络|图优化等)(4篇)
【1】 A Graph Data Augmentation Strategy with Entropy Preserving 标题:一种保持图熵的图数据增强策略
作者:Xue Liu,Dan Sun,Wei Wei 机构:Beijing System Design Institute of Electro-Mechanic Engineering, Beijing, China, School of Mathematical Sciences, Beihang University, Beijing, China, Key Laboratory of Mathematics, Informatics and Behavioral Semantics, Ministry of Education, China 链接:https://arxiv.org/abs/2107.06048 摘要:Kipf和Welling提出的图卷积网络(GCNs)是半监督学习的有效模型,但面临着过度平滑的障碍,这会削弱GCNs的表示能力。近年来,一些研究者提出通过随机扰动图的拓扑结构或特征矩阵来生成数据增强,作为训练的输入。然而,这些操作都要付出破坏信息结构完整性的代价,不可避免地随机牺牲原始图中的信息。本文提出了一种新的图熵定义,作为评价图中特征信息扩散的定量指标。在保持图熵的考量下,我们提出了一种利用随机机制生成扰动训练数据的有效策略,既保证了图拓扑的完整性,又仅带来少量的图熵衰减。在真实数据集上进行的大量实验表明,与大量基线方法相比,本文提出的方法能有效提高半监督节点分类精度。除此之外,我们提出的方法在训练过程中显著提高了GCNs的鲁棒性和泛化能力。 摘要:The Graph Convolutional Networks (GCNs) proposed by Kipf and Welling are effective models for semi-supervised learning, but face the obstacle of over-smoothing, which will weaken the representation ability of GCNs. Recently some works have been proposed to tackle the above limitation by randomly perturbing graph topology or feature matrix to generate data augmentations as input for training. However, these operations have to pay the price of information structure integrity breaking, and inevitably sacrifice information stochastically from the original graph. In this paper, we introduce a novel graph entropy definition as a quantitative index to evaluate feature information diffusion among a graph. Under considerations of preserving graph entropy, we propose an effective strategy to generate perturbed training data using a stochastic mechanism but guaranteeing graph topology integrity and with only a small amount of graph entropy decaying. Extensive experiments have been conducted on real-world datasets and the results verify the effectiveness of our proposed method in improving semi-supervised node classification accuracy compared with a surge of baselines. Beyond that, our proposed approach significantly enhances the robustness and generalization ability of GCNs during the training process.
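下面给出一个极简示意(假设性实现,非论文代码):用归一化度分布的香农熵充当"图熵"的代理指标,并以拒绝采样的方式只接受熵变化很小的边翻转;论文的图熵基于特征信息扩散,具体定义以原文为准,函数名与容差 tol 均为此处为说明而设。

```python
import numpy as np

def graph_entropy(A, eps=1e-12):
    """示意性"图熵":归一化度分布的香农熵。
    仅为假设性简化,论文中的定义基于特征信息扩散。"""
    deg = A.sum(axis=1)
    p = deg / (deg.sum() + eps)
    return -np.sum(p * np.log(p + eps))

def perturb_preserving_entropy(A, n_trials=100, tol=1e-2, seed=0):
    """随机翻转一条边,仅接受图熵变化小于 tol 的扰动(拒绝采样示意)。"""
    rng = np.random.default_rng(seed)
    h0 = graph_entropy(A)
    n = A.shape[0]
    for _ in range(n_trials):
        i, j = rng.integers(0, n, size=2)
        if i == j:
            continue
        B = A.copy()
        B[i, j] = B[j, i] = 1 - B[i, j]  # 翻转边(保持对称)
        if abs(graph_entropy(B) - h0) < tol:
            return B
    return A  # 未找到满足约束的扰动则退回原图

# 用法示例:4 节点环形图
A = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
print(graph_entropy(A), graph_entropy(perturb_preserving_entropy(A)))
```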
【2】 Towards Representation Identical Privacy-Preserving Graph Neural Network via Split Learning 标题:基于分裂学习的表示一致的隐私保护图神经网络
作者:Chuanqiang Shan,Huiyun Jiao,Jie Fu 链接:https://arxiv.org/abs/2107.05917 摘要:近年来,随着图神经网络(GNN)研究数量的迅速增加,它已从理论研究走向实际应用阶段。尽管GNN取得了令人鼓舞的性能,但相关文献对分布式图数据上的隐私保护训练和推理关注较少。由于图结构的特殊性,将现有的隐私学习框架扩展到GNN具有挑战性。基于分裂学习的思想,我们提出了一种服务器辅助的隐私保护GNN(SAPGNN),用于水平划分的跨孤岛(cross-silo)场景下的节点级任务。它将集中式GNN自然地扩展到具有max/min池化聚合的孤立图,同时保证所有参与计算的私有数据仍然保留在本地数据持有者处。为了进一步增强数据隐私,提出了一种安全的池化聚合机制。理论和实验结果表明,该模型与在合并数据上学习的模型具有相同的精度。 摘要:In recent years, the fast rise in the number of studies on graph neural networks (GNN) has taken them from theoretical research to the real-world application stage. Despite the encouraging performance achieved by GNN, less attention has been paid to the privacy-preserving training and inference over distributed graph data in the related literature. Due to the particularity of graph structure, it is challenging to extend the existing private learning framework to GNN. Motivated by the idea of split learning, we propose a \textbf{S}erver \textbf{A}ided \textbf{P}rivacy-preserving \textbf{GNN} (SAPGNN) for the node level task on horizontally partitioned cross-silo scenario. It offers a natural extension of centralized GNN to isolated graph with max/min pooling aggregation, while guaranteeing that all the private data involved in computation still stays at local data holders. To further enhance the data privacy, a secure pooling aggregation mechanism is proposed. Theoretical and experimental results show that the proposed model achieves the same accuracy as the one learned over the combined data.
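按摘要所述,SAPGNN 将集中式 GNN 扩展到带 max/min 池化聚合的孤立图。下面是一个高度简化的草图(假设性实现,省略了论文中的安全聚合协议与训练细节):各数据持有方在本地计算节点嵌入,服务器只对嵌入做逐元素 max 池化,原始特征不出本地。

```python
import numpy as np

def local_embed(X_local, W):
    """数据持有方侧:线性变换 + ReLU;原始特征 X_local 不出本地。"""
    return np.maximum(X_local @ W, 0.0)

def server_max_pool(embeddings):
    """服务器侧:对各持有方上传的同一批节点嵌入做逐元素 max 池化。"""
    return np.maximum.reduce(embeddings)

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))                          # 各方共享的模型参数(示意)
silos = [rng.normal(size=(5, 8)) for _ in range(3)]  # 3 个持有方,各持有同一批 5 个节点的不同特征
h = server_max_pool([local_embed(X, W) for X in silos])
print(h.shape)  # (5, 4)
```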
【3】 Generalization of graph network inferences in higher-order probabilistic graphical models 标题:图网络推断在高阶概率图模型中的泛化
作者:Yicheng Fei,Xaq Pitkow 备注:9 pages, 2 figures 链接:https://arxiv.org/abs/2107.05729 摘要:概率图模型为描述复杂的统计结构提供了一个强有力的工具,在科学和工程中有许多实际应用,从控制机械臂到理解神经元计算。这些图模型的一个主要挑战是,边缘化等推断对于一般图来说是难以处理的。这些推断通常由分布式消息传递算法(如信念传播)来近似,这种算法在有圈的图上并不总是表现得很好,对于复杂的连续概率分布也不容易指定。这种困难经常出现在包含难解高阶相互作用的表达性图模型中。本文利用定义在因子图上的图神经网络构造迭代消息传递算法,实现对涉及多变量相互作用的图模型的快速近似推断。在多个图模型族上的实验结果表明了该方法对不同规模的图的分布外(out-of-distribution)泛化能力,并指出了该方法优于信念传播的领域。 摘要:Probabilistic graphical models provide a powerful tool to describe complex statistical structure, with many real-world applications in science and engineering from controlling robotic arms to understanding neuronal computations. A major challenge for these graphical models is that inferences such as marginalization are intractable for general graphs. These inferences are often approximated by a distributed message-passing algorithm such as Belief Propagation, which does not always perform well on graphs with cycles, nor can it always be easily specified for complex continuous probability distributions. Such difficulties arise frequently in expressive graphical models that include intractable higher-order interactions. In this paper we construct iterative message-passing algorithms using Graph Neural Networks defined on factor graphs to achieve fast approximate inference on graphical models that involve many-variable interactions. Experimental results on several families of graphical models demonstrate the out-of-distribution generalization capability of our method to different sized graphs, and indicate the domain in which our method gains advantage over Belief Propagation.
【4】 Drug-Target Interaction Prediction with Graph Attention networks 标题:基于图注意网络的药物与靶点相互作用预测
作者:Haiyang Wang,Guangyu Zhou,Siqi Liu,Jyun-Yu Jiang,Wei Wang 机构:Wang ,∗, Zhiyuan College, Shanghai Jiao Tong University, Shanghai, China and, Department of Computer Science, University of California, Los Angeles, USA, ∗To whom correspondence should be addressed. † These authors contributed equally to this work. 链接:https://arxiv.org/abs/2107.06099 摘要:动机:预测药物-靶点相互作用(DTI)在蛋白质组学和药物研究领域具有重要意义,是生物信息学研究的热点。尽管许多机器学习方法已经成功地应用于这项任务中,但很少有人利用DTI网络中固有的异构图结构来应对这一挑战。为了更好地学习和解释DTI拓扑结构和相似性,需要有专门用于从图结构预测相互作用的方法。结果:我们提出了一个端到端的框架,DTI-GAT(药物-靶点相互作用预测与图注意网络)用于DTI预测。DTI-GAT结合了一种深层神经网络结构,该结构利用了药物和蛋白质序列的相互作用模式和特征,并通过注意机制对图结构数据进行操作。DTI-GAT通过自注意机制为每个节点分配不同的注意权重,有助于解释DTI的拓扑结构。实验结果表明,DTI-GAT在二分类DTI预测问题上的性能优于现有的各种系统。此外,独立的研究结果进一步证明了我们的模型比其他传统方法具有更好的泛化能力。可用性:源代码和所有数据集可在https://github.com/Haiyang-W/DTI-GRAPH获取 摘要:Motivation: Predicting Drug-Target Interaction (DTI) is a well-studied topic in bioinformatics due to its relevance in the fields of proteomics and pharmaceutical research. Although many machine learning methods have been successfully applied in this task, few of them aim at leveraging the inherent heterogeneous graph structure in the DTI network to address the challenge. For better learning and interpreting the DTI topological structure and the similarity, it is desirable to have methods specifically for predicting interactions from the graph structure. Results: We present an end-to-end framework, DTI-GAT (Drug-Target Interaction prediction with Graph Attention networks) for DTI predictions. DTI-GAT incorporates a deep neural network architecture that operates on graph-structured data with the attention mechanism, which leverages both the interaction patterns and the features of drug and protein sequences. DTI-GAT facilitates the interpretation of the DTI topological structure by assigning different attention weights to each node with the self-attention mechanism. Experimental evaluations show that DTI-GAT outperforms various state-of-the-art systems on the binary DTI prediction problem. Moreover, the independent study results further demonstrate that our model can be generalized better than other conventional methods. Availability: The source code and all datasets are available at https://github.com/Haiyang-W/DTI-GRAPH
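DTI-GAT 的核心构件是带自注意力的图注意力层。下面用 PyTorch 给出单头图注意力层的最小示意(Velickovic 等人的标准 GAT 形式,并非 DTI-GAT 的完整实现;维度与参数均为演示假设),其中的注意力权重 alpha 正是摘要所说可用于解释拓扑结构的量:

```python
import torch
import torch.nn.functional as F

def gat_layer(h, adj, W, a):
    """单头图注意力层的最小示意。
    h: (N, F_in) 节点特征; adj: (N, N) 含自环的 0/1 邻接;
    W: (F_in, F_out); a: (2*F_out,) 注意力参数。"""
    z = h @ W                                    # (N, F_out)
    f = z.size(1)
    # e_ij = LeakyReLU(a^T [z_i || z_j]),按 GAT 惯例可分解为两项相加
    e = F.leaky_relu(
        (z @ a[:f]).unsqueeze(1) + (z @ a[f:]).unsqueeze(0),
        negative_slope=0.2)                      # (N, N)
    e = e.masked_fill(adj == 0, float("-inf"))   # 只在边上计算注意力
    alpha = torch.softmax(e, dim=1)              # 每个节点的注意力权重,可用于解释
    return alpha @ z

torch.manual_seed(0)
h = torch.randn(4, 8)
adj = torch.eye(4) + torch.diag(torch.ones(3), 1) + torch.diag(torch.ones(3), -1)
out = gat_layer(h, adj, torch.randn(8, 16), torch.randn(32))
print(out.shape)  # torch.Size([4, 16])
```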
Transformer(3篇)
【1】 Transformer-Based Behavioral Representation Learning Enables Transfer Learning for Mobile Sensing in Small Datasets 标题:基于Transformer的行为表征学习实现小数据集移动传感的迁移学习
作者:Mike A. Merrill,Tim Althoff 链接:https://arxiv.org/abs/2107.06097 摘要:虽然深度学习已经彻底改变了自然语言处理和计算机视觉的研究和应用,但行为建模和行为健康应用还没有出现这种情况。这是因为该领域的数据集较小,具有异构数据类型,并且通常表现出很大程度的缺失。因此,现成的深度学习模型需要大量的、往往难以承受的适配。因此,许多研究应用程序仍然依赖于带有提升树模型的手动编码特征,有时还依赖于由专家手工制作的特定于任务的特征。在这里,我们通过为移动传感数据提供一个神经结构框架来解决这些挑战,该框架可以从时间序列中学习可泛化的特征表示,并通过微调证明了在小数据域上进行迁移学习的可行性。这种体系结构结合了CNN和Transformer体系结构的优点:(1)直接从原始的分钟级传感器数据中学习而无需手工特征,预测性能(ROC AUC)最多提升0.33;(2)利用预训练,仅用十几个参与者的数据就超越了更简单的神经模型和提升决策树。 摘要:While deep learning has revolutionized research and applications in NLP and computer vision, this has not yet been the case for behavioral modeling and behavioral health applications. This is because the domain's datasets are smaller, have heterogeneous datatypes, and typically exhibit a large degree of missingness. Therefore, off-the-shelf deep learning models require significant, often prohibitive, adaptation. Accordingly, many research applications still rely on manually coded features with boosted tree models, sometimes with task-specific features handcrafted by experts. Here, we address these challenges by providing a neural architecture framework for mobile sensing data that can learn generalizable feature representations from time series and demonstrates the feasibility of transfer learning on small data domains through finetuning. This architecture combines benefits from CNN and Transformer architectures to (1) enable better prediction performance by learning directly from raw minute-level sensor data without the need for handcrafted features by up to 0.33 ROC AUC, and (2) use pretraining to outperform simpler neural models and boosted decision trees with data from as few as a dozen participants.
【2】 Combiner: Full Attention Transformer with Sparse Computation Cost 标题:合并器:具有稀疏计算成本的全注意力Transformer
作者:Hongyu Ren,Hanjun Dai,Zihang Dai,Mengjiao Yang,Jure Leskovec,Dale Schuurmans,Bo Dai 机构:‡University of Alberta 链接:https://arxiv.org/abs/2107.05768 摘要:Transformers提供了一类对序列建模非常有效的表达性架构。然而,transformers的关键限制是其相对于注意层中序列长度的二次内存和时间复杂度$\mathcal{O}(L^2)$,这限制了它在超长序列中的应用。大多数现有的方法都利用注意力矩阵中的稀疏性或低秩假设来降低成本,但牺牲了表达能力。相反,我们提出了组合器,它在保持低计算和内存复杂度的同时,在每个注意头中提供完全的注意能力。其核心思想是将自我注意机制视为每个位置嵌入的条件期望,并用结构化因子分解近似条件分布。每个位置都可以关注所有其他位置,或者通过直接关注,或者通过间接关注抽象,这些抽象又是来自相应局部区域的嵌入的条件期望。我们表明,现有稀疏变换器中使用的大多数稀疏注意模式都能够激发这种分解的设计以获得充分的注意,从而产生相同的次二次代价($\mathcal{O}(L\log(L))$或$\mathcal{O}(L\sqrt{L})$)。Combiner是现有Transformer中注意层的一个替代品,可以很容易地在通用框架中实现。对自回归和双向序列任务的实验评估表明了该方法的有效性,在多个图像和文本建模任务中得到了最新的结果。 摘要:Transformers provide a class of expressive architectures that are extremely effective for sequence modeling. However, the key limitation of transformers is their quadratic memory and time complexity $\mathcal{O}(L^2)$ with respect to the sequence length in attention layers, which restricts application in extremely long sequences. Most existing approaches leverage sparsity or low-rank assumptions in the attention matrix to reduce cost, but sacrifice expressiveness. Instead, we propose Combiner, which provides full attention capability in each attention head while maintaining low computation and memory complexity. The key idea is to treat the self-attention mechanism as a conditional expectation over embeddings at each location, and approximate the conditional distribution with a structured factorization. Each location can attend to all other locations, either via direct attention, or through indirect attention to abstractions, which are again conditional expectations of embeddings from corresponding local regions. We show that most sparse attention patterns used in existing sparse transformers are able to inspire the design of such factorization for full attention, resulting in the same sub-quadratic cost ($\mathcal{O}(L\log(L))$ or $\mathcal{O}(L\sqrt{L})$). Combiner is a drop-in replacement for attention layers in existing transformers and can be easily implemented in common frameworks. An experimental evaluation on both autoregressive and bidirectional sequence tasks demonstrates the effectiveness of this approach, yielding state-of-the-art results on several image and text modeling tasks.
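Combiner 把自注意力看作各位置嵌入上的条件期望,并用结构化分解近似该条件分布。下面是这一思想的极简草图(假设性实现,采用"块内直接注意 + 块摘要间接注意"的两级分解,并非论文的精确分解,混合权重取 0.5 亦为演示假设):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def two_level_attention(q, K, V, block=4):
    """把注意力输出写成条件期望 E_{p(j|q)}[v_j],
    并用"块内直接注意 + 块摘要间接注意"的结构化分解近似全注意力。
    每个查询成本 O(B + L/B),取 B≈sqrt(L) 时整体为次二次 O(L*sqrt(L))。"""
    L, d = K.shape
    nb = L // block
    Kb = K.reshape(nb, block, d).mean(axis=1)   # 每块的抽象(均值摘要,示意)
    Vb = V.reshape(nb, block, d).mean(axis=1)
    p_block = softmax(Kb @ q)                   # 对块摘要的间接注意
    p_local = softmax(K[:block] @ q)            # 仅在局部块内直接注意(取第 0 块演示)
    return 0.5 * (p_local @ V[:block]) + 0.5 * (p_block @ Vb)

rng = np.random.default_rng(0)
q, K, V = rng.normal(size=8), rng.normal(size=(16, 8)), rng.normal(size=(16, 8))
print(two_level_attention(q, K, V).shape)  # (8,)
```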
【3】 Uncertainty-based Query Strategies for Active Learning with Transformers 标题:基于不确定性的Transformer主动学习查询策略
作者:Christopher Schröder,Andreas Niekler,Martin Potthast 机构:Leipzig University 链接:https://arxiv.org/abs/2107.05687 摘要:主动学习是通过有针对性的标记来迭代构建分类模型,从而显著节省标记成本。由于大多数关于主动学习的研究都是在基于Transformer的语言模型(transformers)流行之前进行的,尽管它具有重要的实际意义,但迄今为止很少有论文研究Transformer如何与主动学习相结合。这可以归因于这样一个事实,即对transformers使用最先进的查询策略会导致令人望而却步的运行时开销,这实际上抵消了甚至超过了前面提到的成本节约。在本文中,我们将重新讨论基于不确定性的查询策略,这些策略在很大程度上优于以前的查询策略,但特别适合于微调转换器的环境。在对五个广泛使用的文本分类基准的广泛评估中,我们表明,在学习曲线下的面积上取得了高达14.4个百分点的显著改进,并且对于除一个基准外的所有基准,仅使用0.4%到15%的训练数据,最终的准确率接近最新水平。 摘要:Active learning is the iterative construction of a classification model through targeted labeling, enabling significant labeling cost savings. As most research on active learning has been carried out before transformer-based language models ("transformers") became popular, despite its practical importance, comparably few papers have investigated how transformers can be combined with active learning to date. This can be attributed to the fact that using state-of-the-art query strategies for transformers induces a prohibitive runtime overhead, which effectively cancels out, or even outweighs aforementioned cost savings. In this paper, we revisit uncertainty-based query strategies, which had been largely outperformed before, but are particularly suited in the context of fine-tuning transformers. In an extensive evaluation on five widely used text classification benchmarks, we show that considerable improvements of up to 14.4 percentage points in area under the learning curve are achieved, as well as a final accuracy close to the state of the art for all but one benchmark, using only between 0.4% and 15% of the training data.
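论文重新审视的是经典的基于不确定性的查询策略。下面给出三种常见策略(最小置信度、预测熵、最优-次优差)的最小示意,作用于分类器在未标注池上的 softmax 输出;具体哪种变体与论文一致以原文为准:

```python
import numpy as np

def least_confidence(probs):
    """最小置信度:1 - max_c p(c|x),值越大越值得标注。"""
    return 1.0 - probs.max(axis=1)

def prediction_entropy(probs, eps=1e-12):
    """预测熵:H(p) = -sum_c p log p。"""
    return -(probs * np.log(probs + eps)).sum(axis=1)

def breaking_ties(probs):
    """最优与次优类别概率之差(差越小越不确定,故取负以统一方向)。"""
    part = np.sort(probs, axis=1)
    return -(part[:, -1] - part[:, -2])

def select_batch(probs, k, strategy=prediction_entropy):
    """从未标注池中选出 k 个最不确定样本的索引(示意性主动学习查询)。"""
    return np.argsort(strategy(probs))[-k:]

# 用法:假设微调后的 Transformer 分类器在未标注池上的 softmax 输出
rng = np.random.default_rng(0)
pool_probs = rng.dirichlet(np.ones(3), size=100)
print(select_batch(pool_probs, k=5))
```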
GAN|对抗|攻击|生成相关(7篇)
【1】 Generative Adversarial Learning via Kernel Density Discrimination 标题:基于核密度判别的生成性对抗性学习
作者:Abdelhak Lemkhenter,Adam Bielski,Alp Eren Sari,Paolo Favaro 机构:Institute of Computer Science, University of Bern 链接:https://arxiv.org/abs/2107.06197 摘要:本文介绍了一种新的生成对抗学习方法——核密度判别GAN(KDD-GAN)。KDD-GAN将训练描述为似然比优化问题,其中数据分布通过(局部)核密度估计(KDE)显式地写出。这是受对比学习的最新进展及其与KDE关系的启发。我们直接在特征空间中定义KDE,放弃了核特征映射可逆性的要求。在我们的方法中,特征不再像在原始GAN公式中那样针对线性可分性进行优化,而是针对特征空间中更一般的分布区分进行优化。我们分析了我们的损失相对于特征表示的梯度,并表明它比原来的铰链损失表现更好。我们在CIFAR10和ImageNet的缩放版本上用所提出的基于KDE的损失进行了实验,该损失被用作训练损失或正则化项。我们使用BigGAN/SA-GAN作为主干和基线,因为我们的重点不是设计网络的体系结构。我们显示,与基线相比,以FID衡量的生成样本质量提升了10%到40%。代码将公开。 摘要:We introduce Kernel Density Discrimination GAN (KDD GAN), a novel method for generative adversarial learning. KDD GAN formulates the training as a likelihood ratio optimization problem where the data distributions are written explicitly via (local) Kernel Density Estimates (KDE). This is inspired by the recent progress in contrastive learning and its relation to KDE. We define the KDEs directly in feature space and forgo the requirement of invertibility of the kernel feature mappings. In our approach, features are no longer optimized for linear separability, as in the original GAN formulation, but for the more general discrimination of distributions in the feature space. We analyze the gradient of our loss with respect to the feature representation and show that it is better behaved than that of the original hinge loss. We perform experiments with the proposed KDE-based loss, used either as a training loss or a regularization term, on both CIFAR10 and scaled versions of ImageNet. We use BigGAN/SA-GAN as a backbone and baseline, since our focus is not to design the architecture of the networks. We show a boost in the quality of generated samples with respect to FID from 10% to 40% compared to the baseline. Code will be made available.
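下面用 NumPy 勾勒"在特征空间中以(局部)KDE 显式写出数据分布、并构造对数似然比"这一思路(高斯核、带宽等均为演示假设;论文损失的具体形式以原文为准):

```python
import numpy as np

def log_kde(x, feats, bandwidth=1.0):
    """特征空间中高斯核密度估计的 log p(x)。
    x: (d,) 查询特征; feats: (n, d) 一批样本的特征。"""
    n, d = feats.shape
    sq = ((feats - x) ** 2).sum(axis=1) / (2 * bandwidth ** 2)
    m = (-sq).max()                         # log-sum-exp 保持数值稳定
    log_p = m + np.log(np.exp(-sq - m).mean())
    return log_p - 0.5 * d * np.log(2 * np.pi * bandwidth ** 2)

def kde_likelihood_ratio(x, real_feats, fake_feats):
    """以 KDE 显式写出的对数似然比 log p_real(x) - log p_fake(x),
    可作为判别式训练信号的一个草图。"""
    return log_kde(x, real_feats) - log_kde(x, fake_feats)

rng = np.random.default_rng(0)
real, fake = rng.normal(0, 1, (256, 16)), rng.normal(2, 1, (256, 16))
print(kde_likelihood_ratio(rng.normal(0, 1, 16), real, fake) > 0)  # 通常为 True
```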
【2】 Force-in-domain GAN inversion 标题:强制域内GAN反演
作者:Guangjie Leng,Yeku Zhu,Zhi-Qin John Xu 机构: Institute of Natural Sciences and School of Mathematical Sciences and MOE-LSC, Shanghai Jiao Tong University, Qing Yuan Research Institute, Shanghai Jiao Tong University 链接:https://arxiv.org/abs/2107.06050 摘要:实证研究表明,生成对抗网络(GANs)在训练生成图像时,其潜在空间会出现各种语义。为了编辑真实图像,需要从真实图像到潜在空间的精确映射以利用这些学到的语义,这一点重要而困难。最近提出的域内(in-domain)GAN反演方法,通过强制由反演编码重建的图像落在真实图像空间内,来将反演编码约束在潜在空间中。经验上,我们发现域内GAN得到的反演编码仍可能明显偏离潜在空间。为了解决这一问题,我们在域内GAN的基础上提出了一种强制域内(force-in-domain)GAN,它利用一个判别器将反演编码强制约束在潜在空间内。强制域内GAN也可以解释为经过轻微修改的cycle-GAN。大量实验表明,我们的强制域内GAN不仅能在像素级重建目标图像,而且能使反演编码与潜在空间良好对齐,以便进行语义编辑。 摘要:Empirical works suggest that various semantics emerge in the latent space of Generative Adversarial Networks (GANs) when being trained to generate images. To perform real image editing, it requires an accurate mapping from the real image to the latent space to leverage these learned semantics, which is important yet difficult. An in-domain GAN inversion approach is recently proposed to constrain the inverted code within the latent space by forcing the reconstructed image obtained from the inverted code within the real image space. Empirically, we find that the inverted code by the in-domain GAN can deviate from the latent space significantly. To solve this problem, we propose a force-in-domain GAN based on the in-domain GAN, which utilizes a discriminator to force the inverted code within the latent space. The force-in-domain GAN can also be interpreted by a cycle-GAN with slight modification. Extensive experiments show that our force-in-domain GAN not only reconstructs the target image at the pixel level, but also aligns the inverted code with the latent space well for semantic editing.
【3】 EvoBA: An Evolution Strategy as a Strong Baseline for Black-Box Adversarial Attacks 标题:EvoBA:作为黑箱对抗攻击强基线的进化策略
作者:Andrei Ilie,Marius Popescu,Alin Stefanescu 机构:University of Bucharest, Romania 链接:https://arxiv.org/abs/2107.05754 摘要:最近的工作表明,白盒对抗攻击可以很容易地应用于最先进的图像分类器。然而,现实生活中的场景更接近黑盒对抗条件:缺乏透明度,并且通常对查询预算施加天然的硬约束。我们提出了EvoBA,一种基于出乎意料地简单的进化搜索策略的黑盒对抗攻击。EvoBA查询高效,最小化$L_0$对抗扰动,并且不需要任何形式的训练。EvoBA的结果与AutoZOOM等复杂得多的最新黑盒攻击相当,展现了其效率和有效性。它比SimBA(一种简单而强大的基线黑盒攻击)的查询效率更高,并且具有类似的复杂度。因此,我们建议将其既作为黑盒对抗攻击的一个新的强基线,又作为一个快速而通用的工具,用于获得关于图像分类器在$L_0$对抗扰动下鲁棒性的经验洞察。已有快速可靠的$L_2$黑盒攻击(如SimBA)和$L_\infty$黑盒攻击(如DeepSearch)。我们提出EvoBA作为一种查询高效的$L_0$黑盒对抗攻击,与上述方法一起,可以作为评估图像分类器经验鲁棒性的通用工具。这类方法的主要优点是运行速度快、查询效率高,并且易于集成到图像分类器的开发流水线中。虽然我们的攻击最小化$L_0$对抗扰动,但我们也报告了$L_2$,并注意到与最先进的$L_2$黑盒攻击AutoZOOM以及$L_2$强基线SimBA相比,我们的结果毫不逊色。 摘要:Recent work has shown how easily white-box adversarial attacks can be applied to state-of-the-art image classifiers. However, real-life scenarios resemble more the black-box adversarial conditions, lacking transparency and usually imposing natural, hard constraints on the query budget. We propose \textbf{EvoBA}, a black-box adversarial attack based on a surprisingly simple evolutionary search strategy. \textbf{EvoBA} is query-efficient, minimizes $L_0$ adversarial perturbations, and does not require any form of training. \textbf{EvoBA} shows efficiency and efficacy through results that are in line with much more complex state-of-the-art black-box attacks such as \textbf{AutoZOOM}. It is more query-efficient than \textbf{SimBA}, a simple and powerful baseline black-box attack, and has a similar level of complexity. Therefore, we propose it both as a new strong baseline for black-box adversarial attacks and as a fast and general tool for gaining empirical insight into how robust image classifiers are with respect to $L_0$ adversarial perturbations. There exist fast and reliable $L_2$ black-box attacks, such as \textbf{SimBA}, and $L_{\infty}$ black-box attacks, such as \textbf{DeepSearch}. We propose \textbf{EvoBA} as a query-efficient $L_0$ black-box adversarial attack which, together with the aforementioned methods, can serve as a generic tool to assess the empirical robustness of image classifiers. The main advantages of such methods are that they run fast, are query-efficient, and can easily be integrated in image classifiers development pipelines. While our attack minimises the $L_0$ adversarial perturbation, we also report $L_2$, and notice that we compare favorably to the state-of-the-art $L_2$ black-box attack, \textbf{AutoZOOM}, and to the $L_2$ strong baseline, \textbf{SimBA}.
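下面是一个受 EvoBA 启发的极简进化式 $L_0$ 黑盒攻击草图(假设性实现,非官方代码):每代随机改动少量像素,保留使正确类别概率下降最多的候选;种群大小、查询预算等参数均为演示假设。

```python
import numpy as np

def evo_l0_attack(model, x, label, budget=1000, pop=10, pixels_per_step=1, seed=0):
    """极简进化搜索式 L0 黑盒攻击:只需 model(x) 返回各类别概率,
    不访问梯度;x 的取值范围假定为 [0, 1]。"""
    rng = np.random.default_rng(seed)
    best = x.copy()
    best_p = model(best)[label]
    for _ in range(budget // pop):
        children, scores = [], []
        for _ in range(pop):
            c = best.copy()
            idx = rng.integers(0, c.size, size=pixels_per_step)
            c.flat[idx] = rng.random(pixels_per_step)  # 只改 L0 意义下的少量分量
            children.append(c)
            scores.append(model(c)[label])
        i = int(np.argmin(scores))
        if scores[i] < best_p:                 # 贪心保留最优后代
            best, best_p = children[i], scores[i]
        if np.argmax(model(best)) != label:
            return best                        # 攻击成功,提前返回
    return best

# 用法:model 可以是任何黑盒分类器(此处用假设的线性 softmax 模型演示)
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 28 * 28))
model = lambda x: np.exp(W @ x.ravel()) / np.exp(W @ x.ravel()).sum()
x_adv = evo_l0_attack(model, rng.random((28, 28)), label=3)
```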
【4】 A Closer Look at the Adversarial Robustness of Information Bottleneck Models 标题:仔细研究信息瓶颈模型的对抗性稳健性
作者:Iryna Korshunova,David Stutz,Alexander A. Alemi,Olivia Wiles,Sven Gowal 链接:https://arxiv.org/abs/2107.05712 摘要:我们研究了用于分类的信息瓶颈模型的对抗鲁棒性。以往的研究表明,利用信息瓶颈训练的模型,其鲁棒性可以超越对抗训练。我们在各种白盒$l_{\infty}$攻击下的评估表明,信息瓶颈本身并不是一种强有力的防御策略,而且以前的结果可能受到了梯度模糊的影响。 摘要:We study the adversarial robustness of information bottleneck models for classification. Previous works showed that the robustness of models trained with information bottlenecks can improve upon adversarial training. Our evaluation under a diverse range of white-box $l_{\infty}$ attacks suggests that information bottlenecks alone are not a strong defense strategy, and that previous results were likely influenced by gradient obfuscation.
【5】 Hidden Convexity of Wasserstein GANs: Interpretable Generative Models with Closed-Form Solutions 标题:Wasserstein Gans的隐凸性:闭式解的可解释生成模型
作者:Arda Sahiner,Tolga Ergen,Batu Ozturkler,Burak Bartan,John Pauly,Morteza Mardani,Mert Pilanci 机构:Department of Electrical Engineering, Stanford University 备注:First two authors contributed equally to this work; 30 pages, 11 figures 链接:https://arxiv.org/abs/2107.05680 摘要:生成对抗网络(Generative Adversarial Networks,GANs)通常用于建模复杂的数据分布。GANs的生成器和判别器通常都用神经网络建模,从而产生一个对生成器非凸、对判别器非凹的非透明优化问题。这类网络通常用梯度下降-上升法(GDA)进行启发式优化,但目前尚不清楚优化问题是否包含鞍点,也不清楚启发式方法在实践中能否找到鞍点。本文从凸对偶的角度分析了用两层神经网络判别器训练Wasserstein GAN的过程,并针对多种生成器给出了Wasserstein GAN可以用凸优化方法精确求解、或可以表示为凸-凹对策的条件。利用这种凸对偶解释,我们进一步展示了判别器不同激活函数的影响。数值结果验证了凸解释的威力,并将其应用于与线性生成器和二次激活判别器对应的凸结构在CelebA图像生成上的渐进训练。我们的实验代码见https://github.com/ardasahiner/ProCoGAN. 摘要:Generative Adversarial Networks (GANs) are commonly used for modeling complex distributions of data. Both the generators and discriminators of GANs are often modeled by neural networks, posing a non-transparent optimization problem which is non-convex and non-concave over the generator and discriminator, respectively. Such networks are often heuristically optimized with gradient descent-ascent (GDA), but it is unclear whether the optimization problem contains any saddle points, or whether heuristic methods can find them in practice. In this work, we analyze the training of Wasserstein GANs with two-layer neural network discriminators through the lens of convex duality, and for a variety of generators expose the conditions under which Wasserstein GANs can be solved exactly with convex optimization approaches, or can be represented as convex-concave games. Using this convex duality interpretation, we further demonstrate the impact of different activation functions of the discriminator. Our observations are verified with numerical results demonstrating the power of the convex interpretation, with applications in progressive training of convex architectures corresponding to linear generators and quadratic-activation discriminators for CelebA image generation. The code for our experiments is available at https://github.com/ardasahiner/ProCoGAN.
【6】 Parameterization of Forced Isotropic Turbulent Flow using Autoencoders and Generative Adversarial Networks 标题:用自动编码器和生成对抗性网络对强迫各向同性湍流的参数化
作者:Kanishk,Tanishk Nandal,Prince Tyagi,Raj Kumar Singh 机构:Delhi Technological University, New Delhi, India 链接:https://arxiv.org/abs/2107.06264 摘要:自动编码器和生成式神经网络模型因其即时性和较低的处理时间,近年来在流体力学中日益流行,成为高保真CFD模拟的替代方案。在流体力学应用中,自动编码器被用作模型降阶工具,通过编码器将输入的高维数据压缩到低维的潜在空间。而变分自动编码器(VAEs)和生成对抗网络(GANs)等生成模型被证明能有效地为湍流等高"随机性"混沌模型生成解。在这项研究中,通过参数化为一些基本统计特征来生成强迫各向同性湍流。模型在依赖这些特征的预模拟数据上训练,然后通过改变这些参数来影响流场生成。沿解码器和生成器等生成模型推送的潜在向量包含独立的条目,可用于创建具有相似属性的不同输出。基于神经网络的结构消除了对许多CFD软件中常见的、经典的基于网格的Navier-Stokes方程估计的依赖。 摘要:Autoencoders and generative neural network models have recently gained popularity in fluid mechanics due to their spontaneity and low processing time compared to high fidelity CFD simulations. Autoencoders are used as model order reduction tools in applications of fluid mechanics by compressing input high-dimensional data using an encoder to map the input space into a lower-dimensional latent space. Whereas, generative models such as Variational Auto-encoders (VAEs) and Generative Adversarial Networks (GANs) are proving to be effective in generating solutions to chaotic models with high 'randomness' such as turbulent flows. In this study, forced isotropic turbulence flow is generated by parameterizing into some basic statistical characteristics. The models trained on pre-simulated data from dependencies on these characteristics and the flow generation is then affected by varying these parameters. The latent vectors pushed along the generator models like the decoders and generators contain independent entries which can be used to create different outputs with similar properties. The use of neural network-based architecture removes the need for dependency on the classical mesh-based Navier-Stokes equation estimation which is prominent in many CFD software.
【7】 Wasserstein GAN: Deep Generation applied on Bitcoins financial time series 标题:Wasserstein GAN:深度生成在比特币金融时间序列上的应用
作者:Rikli Samuel,Bigler Daniel Nico,Pfenninger Moritz,Osterrieder Joerg 机构:Bigler Nico, Samuel Rikli, Joerg Osterrieder, School of Engineering, Zurich University of Applied Sciences, Winterthur, Switzerland, The Hightech Business and Entrepreneurship Group, Management and Social Sciences, University of Twente, Enschede, Netherlands 链接:https://arxiv.org/abs/2107.06008 摘要:由于金融时间序列的高波动性和市场上的突发事件,对其进行建模具有挑战性。大多数试图弥补历史金融时间序列不足的金融模型和算法难以奏效,而且极易过拟合。作为替代,我们在本文中引入了一种称为WGAN-GP的深度神经网络,这是一种专注于样本生成的数据驱动模型。WGAN-GP由生成器和判别器函数组成,二者均采用LSTM体系结构。WGAN-GP旨在学习输入数据(在我们的例子中即比特币)的底层结构。比特币的行为独一无二;价格波动使得猜测价格走势几乎不可能。通过对抗训练,WGAN-GP应当学习比特币的底层结构,并生成与比特币分布非常相似的样本。生成的合成时间序列在视觉上与真实数据无法区分。但数值结果表明,生成的数据接近真实数据分布,但仍可分辨。该模型总体上表现出稳定的学习行为。不过,该模型仍有优化空间,可以通过调整超参数来实现。 摘要:Modeling financial time series is challenging due to their high volatility and unexpected happenings on the market. Most financial models and algorithms trying to fill the lack of historical financial time series struggle to perform and are highly vulnerable to overfitting. As an alternative, we introduce in this paper a deep neural network called the WGAN-GP, a data-driven model that focuses on sample generation. The WGAN-GP consists of a generator and discriminator function which utilize an LSTM architecture. The WGAN-GP is supposed to learn the underlying structure of the input data, which in our case, is the Bitcoin. Bitcoin is unique in its behavior; the prices fluctuate in a way that makes guessing the price trend nearly impossible. Through adversarial training, the WGAN-GP should learn the underlying structure of the bitcoin and generate very similar samples of the bitcoin distribution. The generated synthetic time series are visually indistinguishable from the real data. But the numerical results show that the generated data were close to the real data distribution but distinguishable. The model mainly shows a stable learning behavior. However, the model has space for optimization, which could be achieved by adjusting the hyperparameters.
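WGAN-GP 名称中的"GP"即梯度惩罚。下面用 PyTorch 给出标准 WGAN-GP 梯度惩罚项(Gulrajani 等人)的最小示意;论文中的生成器/判别器为 LSTM,此处用作用于展平序列的小型 MLP 判别器演示:

```python
import torch

def gradient_penalty(D, real, fake, lam=10.0):
    """WGAN-GP 的标准梯度惩罚:在真假样本的随机插值点处,
    惩罚判别器梯度范数偏离 1 的程度。"""
    eps = torch.rand(real.size(0), 1)                 # 每个样本一个插值系数
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    d_out = D(x_hat)
    grads = torch.autograd.grad(outputs=d_out.sum(), inputs=x_hat,
                                create_graph=True)[0]
    return lam * ((grads.norm(2, dim=1) - 1) ** 2).mean()

# 用法示例(把长度 32 的序列展平后送入简化判别器)
torch.manual_seed(0)
D = torch.nn.Sequential(torch.nn.Linear(32, 64), torch.nn.ReLU(),
                        torch.nn.Linear(64, 1))
gp = gradient_penalty(D, torch.randn(8, 32), torch.randn(8, 32))
print(gp.item())
```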
半/弱/无/有监督|不确定性|主动学习(4篇)
【1】 Retrieve in Style: Unsupervised Facial Feature Transfer and Retrieval 标题:风格检索:无监督的面部特征迁移与检索
作者:Min Jin Chong,Wen-Sheng Chu,Abhishek Kumar 机构:University of Illinois at Urbana-Champaign, Google Research 备注:Code is here this https URL 链接:https://arxiv.org/abs/2107.06256 摘要:本文提出了一种在真实图像上进行细粒度人脸特征迁移和检索的无监督框架——风格检索(RIS)。最近的工作表明,通过利用StyleGAN潜在空间的解纠缠特性,可以学习一个目录,该目录允许在生成的图像上局部语义迁移面部特征。RIS在以下方面改进了现有技术:1)特征解耦,支持此前SoTA方法未能实现的高难度迁移(如头发和姿态)。2)无需逐图像的超参数调节,也无需在大批量图像上计算目录。3)支持利用所提出的面部特征(如眼睛)进行人脸检索,据我们所知,这是第一个在细粒度级别检索人脸图像的工作。4)具有鲁棒性,可自然应用于真实图像。我们的定性和定量分析表明,RIS在真实图像上同时实现了高保真的特征迁移和精确的细粒度检索。我们还讨论了RIS的负责任应用。 摘要:We present Retrieve in Style (RIS), an unsupervised framework for fine-grained facial feature transfer and retrieval on real images. Recent work shows that it is possible to learn a catalog that allows local semantic transfers of facial features on generated images by capitalizing on the disentanglement property of the StyleGAN latent space. RIS improves existing art on: 1) feature disentanglement and allows for challenging transfers (i.e., hair and pose) that were not shown possible in SoTA methods. 2) eliminating the need for per-image hyperparameter tuning, and for computing a catalog over a large batch of images. 3) enabling face retrieval using the proposed facial features (e.g., eyes), and to our best knowledge, is the first work to retrieve face images at the fine-grained level. 4) robustness and natural application to real images. Our qualitative and quantitative analyses show RIS achieves both high-fidelity feature transfers and accurate fine-grained retrievals on real images. We discuss the responsible application of RIS.
【2】 Domain-Irrelevant Representation Learning for Unsupervised Domain Generalization 标题:无监督领域泛化的领域无关表示学习
作者:Xingxuan Zhang,Linjun Zhou,Renzhe Xu,Peng Cui,Zheyan Shen,Haoxin Liu 机构:Department of Computer Science, Tsinghua University, Beijing, China 链接:https://arxiv.org/abs/2107.06219 摘要:领域泛化(DG)的目的是帮助在一组源域上训练的模型更好地泛化到不可见的目标域上。现有DG方法的性能很大程度上依赖于充足的标记数据,然而这些数据通常昂贵或不可得。未标记数据则容易获取得多,因此我们试图探索无监督学习如何帮助深度模型跨领域泛化。具体来说,我们研究了一个新的泛化问题,称为无监督领域泛化,其目的是利用未标记数据学习可泛化的模型。此外,我们提出了一种与领域无关的无监督学习(DIUL)方法,来处理未标记数据中显著且具有误导性的异质性,以及源数据和目标数据之间的严重分布偏移。令人惊讶的是,我们发现DIUL不仅可以弥补标记数据的不足,而且在标记数据充足时还能进一步增强模型的泛化能力。作为一种预训练方法,即使可用数据没有标记且数量远小于ImageNet,DIUL也优于ImageNet预训练协议。大量的实验清楚地证明了我们的方法相比最先进的无监督学习方法的有效性。 摘要:Domain generalization (DG) aims to help models trained on a set of source domains generalize better on unseen target domains. The performances of current DG methods largely rely on sufficient labeled data, which however are usually costly or unavailable. While unlabeled data are far more accessible, we seek to explore how unsupervised learning can help deep models generalize across domains. Specifically, we study a novel generalization problem called unsupervised domain generalization, which aims to learn generalizable models with unlabeled data. Furthermore, we propose a Domain-Irrelevant Unsupervised Learning (DIUL) method to cope with the significant and misleading heterogeneity within unlabeled data and severe distribution shifts between source and target data. Surprisingly we observe that DIUL can not only counterbalance the scarcity of labeled data but also further strengthen the generalization ability of models when the labeled data are sufficient. As a pretraining approach, DIUL shows superior to ImageNet pretraining protocol even when the available data are unlabeled and of a greatly smaller amount compared to ImageNet. Extensive experiments clearly demonstrate the effectiveness of our method compared with state-of-the-art unsupervised learning counterparts.
【3】 Calibrated Uncertainty for Molecular Property Prediction using Ensembles of Message Passing Neural Networks 标题:用消息传递神经网络集成校正分子性质预测的不确定度
作者:Jonas Busk,Peter Bjørn Jørgensen,Arghya Bhowmik,Mikkel N. Schmidt,Ole Winther,Tejs Vegge 链接:https://arxiv.org/abs/2107.06068 摘要:基于机器学习的数据驱动方法有可能加速原子结构的分析。然而,机器学习模型会产生过度自信的预测,因此仔细地检测和处理不确定性是至关重要的。在这里,我们扩展了一个专门为预测分子和材料性质而设计的消息传递神经网络,使其具有经过校准的概率预测分布。本文提出的方法与以往工作的不同之处在于:在统一的框架下同时考虑偶然(aleatoric)不确定性和认知(epistemic)不确定性,并在未见数据上对预测分布进行再校准。通过计算机实验,我们表明,在QM9和PC9两个公共分子基准数据集上,无论在训练数据分布之内还是之外,我们的方法都能得到具有校准不确定性的精确分子形成能预测模型。所提出的方法为训练和评估神经网络集成模型提供了一个通用框架,这类模型能够对分子性质给出带校准不确定性的精确预测。 摘要:Data-driven methods based on machine learning have the potential to accelerate analysis of atomic structures. However, machine learning models can produce overconfident predictions and it is therefore crucial to detect and handle uncertainty carefully. Here, we extend a message passing neural network designed specifically for predicting properties of molecules and materials with a calibrated probabilistic predictive distribution. The method presented in this paper differs from the previous work by considering both aleatoric and epistemic uncertainty in a unified framework, and by re-calibrating the predictive distribution on unseen data. Through computer experiments, we show that our approach results in accurate models for predicting molecular formation energies with calibrated uncertainty in and out of the training data distribution on two public molecular benchmark datasets, QM9 and PC9. The proposed method provides a general framework for training and evaluating neural network ensemble models that are able to produce accurate predictions of properties of molecules with calibrated uncertainty.
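集成 + 概率输出下,"偶然/认知"不确定性的标准分解可以写成:总方差 = 成员报告方差的均值(偶然项)+ 成员均值间的方差(认知项)。下面是这一分解以及一个假设性的方差缩放式再校准的草图(论文的再校准方案以原文为准):

```python
import numpy as np

def ensemble_uncertainty(mus, sigmas):
    """集成预测分布的不确定性分解。
    mus, sigmas: (M, N),M 为集成大小,N 为分子数;
    每个成员输出高斯参数 (mu_i, sigma_i)。"""
    mu = mus.mean(axis=0)
    aleatoric = (sigmas ** 2).mean(axis=0)   # 成员自身报告的数据噪声
    epistemic = mus.var(axis=0)              # 成员间分歧(模型不确定性)
    return mu, np.sqrt(aleatoric + epistemic)

def scale_recalibrate(sigma, z_val):
    """极简的事后再校准(假设性方案):用验证集标准化残差
    z = (y - mu)/sigma 的经验二阶矩缩放 sigma,使其接近 1。"""
    s = np.sqrt((z_val ** 2).mean())
    return sigma * s

rng = np.random.default_rng(0)
mus = rng.normal(size=(5, 100))
sigmas = np.abs(rng.normal(1.0, 0.1, size=(5, 100)))
mu, sigma = ensemble_uncertainty(mus, sigmas)
print(mu.shape, sigma.shape)  # (100,) (100,)
```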
【4】 SoftHebb: Bayesian inference in unsupervised Hebbian soft winner-take-all networks 标题:SoftHebb:无监督Hebbian软赢家通吃网络中的贝叶斯推理
作者:Timoleon Moraitis,Dmitry Toichkin,Yansong Chua,Qinghai Guo 机构:Huawei - Zurich Research Center, Zurich, Switzerland, Moscow, Russia, Laboratories, Huawei Technologies, Shenzhen, China 链接:https://arxiv.org/abs/2107.05747 摘要:最先进的人工神经网络(ANN)需要标记数据或层间反馈,通常在生物学上是不可信的,并且容易受到人类不易受到的对抗攻击。另一方面,winner-take-all(WTA)网络中的Hebbian学习是无监督、前馈且在生物学上合理的。然而,除了在非常有限的假设条件下,WTA网络的目标优化理论一直缺乏。在这里,我们基于生物学上合理但通用的人工神经网络元素,正式推导出这样一个理论。通过Hebbian学习,网络参数维护了数据的贝叶斯生成模型。不存在监督损失函数,但网络确实最小化了其激活与输入分布之间的交叉熵。关键在于一种"软"WTA——其中不存在绝对的"硬"赢家神经元——以及一类特殊的类Hebbian权重与偏置可塑性。我们在实践中证实了我们的理论:在手写数字(MNIST)识别中,我们的Hebbian算法SoftHebb在不访问交叉熵的情况下将其最小化,并且优于更常用的基于硬WTA的方法。引人注目的是,在某些条件下,它甚至优于有监督的端到端反向传播。具体地说,在两层网络中,当训练数据只呈现一次、测试数据有噪声以及在基于梯度的对抗攻击下,SoftHebb的性能优于反向传播。能混淆SoftHebb的对抗攻击同样会混淆人眼。最后,该模型可以根据输入分布生成对象的插值。 摘要:State-of-the-art artificial neural networks (ANNs) require labelled data or feedback between layers, are often biologically implausible, and are vulnerable to adversarial attacks that humans are not susceptible to. On the other hand, Hebbian learning in winner-take-all (WTA) networks, is unsupervised, feed-forward, and biologically plausible. However, an objective optimization theory for WTA networks has been missing, except under very limiting assumptions. Here we derive formally such a theory, based on biologically plausible but generic ANN elements. Through Hebbian learning, network parameters maintain a Bayesian generative model of the data. There is no supervisory loss function, but the network does minimize cross-entropy between its activations and the input distribution. The key is a "soft" WTA where there is no absolute "hard" winner neuron, and a specific type of Hebbian-like plasticity of weights and biases. We confirm our theory in practice, where, in handwritten digit (MNIST) recognition, our Hebbian algorithm, SoftHebb, minimizes cross-entropy without having access to it, and outperforms the more frequently used, hard-WTA-based method. Strikingly, it even outperforms supervised end-to-end backpropagation, under certain conditions. Specifically, in a two-layered network, SoftHebb outperforms backpropagation when the training dataset is only presented once, when the testing data is noisy, and under gradient-based adversarial attacks. Adversarial attacks that confuse SoftHebb are also confusing to the human eye. Finally, the model can generate interpolations of objects from its input distribution.
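下面是"软 WTA + 类 Hebbian 可塑性"的一个通用草图(假设性实现:采用经典的软竞争学习规则,偏置更新亦为示意;SoftHebb 中权重与偏置可塑性的精确形式以论文为准):

```python
import numpy as np

def soft_wta_hebbian_step(W, b, x, lr=0.01, temp=1.0):
    """软赢家通吃的 Hebbian 更新:没有绝对的"硬"赢家,
    每个神经元按 softmax 激活比例更新(通用软竞争学习规则,非论文原始规则)。"""
    u = W @ x + b
    y = np.exp((u - u.max()) / temp)
    y = y / y.sum()                              # 软 WTA:归一化激活
    W += lr * y[:, None] * (x[None, :] - W)      # 类 Oja 的回归项防止权重发散
    b += lr * (y - np.exp(b) / np.exp(b).sum())  # 偏置趋向类别对数频率(纯示意)
    return W, b, y

# 无监督训练循环示意:输入为随机向量,实际应用中为图像等数据
rng = np.random.default_rng(0)
W, b = rng.normal(0, 0.1, size=(10, 64)), np.zeros(10)
for _ in range(1000):
    x = rng.random(64)
    W, b, y = soft_wta_hebbian_step(W, b, x)
print(y.round(3))
```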
迁移|Zero/Few/One-Shot|自适应(5篇)
【1】 Adaptive Machine Learning for Time-Varying Systems: Low Dimensional Latent Space Tuning 标题:时变系统的自适应机器学习:低维潜在空间调谐
作者:Alexander Scheinker 链接:https://arxiv.org/abs/2107.06207 摘要:机器学习(ML)工具,如编码器-解码器卷积神经网络(CNN),可以表示图像和标量组合之间极其复杂的非线性映射。例如,CNNs可以用来映射加速器参数和图像的组合,这些图像是带电粒子束在不同粒子加速器位置之间传输时6D相空间分布的2D投影。尽管ML有其优点,但将其应用于时变系统或分布发生偏移的系统仍然是一个开放的问题,特别是对于收集新数据以进行再训练不切实际或会中断运行的大型系统。粒子加速器就是这样一类大型时变系统:收集详细的训练数据需要长时间的专用束流测量,而这些测量在常规运行中可能不再可用。本文提出了一种新近发展的时变系统自适应ML方法。我们的方法是将非常高(N>100k)维的输入(标量参数和图像的组合)映射到编码器-解码器CNN编码器部分输出处的低维(N~2)潜在空间。然后,我们在解码器重建基于图像的高维相空间密度表示之前,直接加入一个自适应调谐的反馈向量,从而主动调谐复杂系统动力学的低维潜在空间表示。这种方法使我们能够学习系统内部的相关性,快速调整参数量极大的系统的特性,并基于反馈实时跟踪其演化,而无需用海量新数据集重新训练。 摘要:Machine learning (ML) tools such as encoder-decoder convolutional neural networks (CNN) can represent incredibly complex nonlinear functions which map between combinations of images and scalars. For example, CNNs can be used to map combinations of accelerator parameters and images which are 2D projections of the 6D phase space distributions of charged particle beams as they are transported between various particle accelerator locations. Despite their strengths, applying ML to time-varying systems, or systems with shifting distributions, is an open problem, especially for large systems for which collecting new data for re-training is impractical or interrupts operations. Particle accelerators are one example of large time-varying systems for which collecting detailed training data requires lengthy dedicated beam measurements which may no longer be available during regular operations. We present a recently developed method of adaptive ML for time-varying systems. Our approach is to map very high (N>100k) dimensional inputs (a combination of scalar parameters and images) into the low dimensional (N~2) latent space at the output of the encoder section of an encoder-decoder CNN. We then actively tune the low dimensional latent space-based representation of complex system dynamics by the addition of an adaptively tuned feedback vector directly before the decoder section builds back up to our image-based high-dimensional phase space density representations. This method allows us to learn correlations within and to quickly tune the characteristics of incredibly high parameter systems and to track their evolution in real time based on feedback without massive new data sets for re-training.
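下面用一个玩具例子勾勒"在低维潜在编码上叠加自适应反馈向量"的思路(假设性实现:以有限差分近似梯度充当自适应调谐律,论文实际使用的调谐方法以原文为准;decode、measure 等函数名均为演示而设):

```python
import numpy as np

def adapt_latent(decode, z0, measure, target, steps=200, lr=0.1, eps=1e-3):
    """在固定编码 z0 上在线调整反馈向量 delta,
    使解码输出的测量值逼近实时目标;潜在空间仅 ~2 维,可承受有限差分。"""
    delta = np.zeros_like(z0)
    for _ in range(steps):
        err0 = np.sum((measure(decode(z0 + delta)) - target) ** 2)
        grad = np.zeros_like(delta)
        for i in range(delta.size):
            d = delta.copy()
            d[i] += eps
            err = np.sum((measure(decode(z0 + d)) - target) ** 2)
            grad[i] = (err - err0) / eps
        delta -= lr * grad                   # 基于反馈的自适应调谐
    return z0 + delta

# 玩具示例:decode 为固定随机线性"解码器",measure 取输出的均值与标准差
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 2))
decode = lambda z: A @ z
measure = lambda img: np.array([img.mean(), img.std()])
z = adapt_latent(decode, np.zeros(2), measure, target=np.array([0.5, 1.0]))
print(measure(decode(z)))  # 应接近 [0.5, 1.0]
```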
【2】 Transfer Learning in Multi-Agent Reinforcement Learning with Double Q-Networks for Distributed Resource Sharing in V2X Communication 标题:V2X通信分布式资源共享双Q网络多Agent强化学习中的迁移学习
作者:Hammad Zafar,Zoran Utkovski,Martin Kasparick,Slawomir Stanczak 机构:Wireless Communications and Networks, Fraunhofer Heinrich Hertz Institute Berlin, Germany 备注:Submitted for publication 链接:https://arxiv.org/abs/2107.06195 摘要:本文研究了车辆到一切(V2X)通信网络中的分散频谱共享问题。其目的是提供车辆到基础设施(V2I)和车辆到车辆(V2V)链路的资源高效共存。最近的一项研究提出了一种基于深度Q学习的多智能体强化学习(MARL)方法,该方法利用了基于指纹的深度Q网络(DQN)结构。本文将双Q学习(通过双DQN)和迁移学习相结合,对该框架进行了扩展。其背后的动机是:双Q学习可以缓解传统Q学习中存在的动作价值被高估的问题,而迁移学习可以利用专家模型获得的知识来加速MARL环境下的学习。提出的算法在真实的V2X环境中进行评估,合成数据是基于几何传播模型生成的,该模型包含模拟环境的特定位置地理描述符(建筑物、树叶和车辆的轮廓)。通过数值模拟验证了该方法的优越性。 摘要:This paper addresses the problem of decentralized spectrum sharing in vehicle-to-everything (V2X) communication networks. The aim is to provide resource-efficient coexistence of vehicle-to-infrastructure(V2I) and vehicle-to-vehicle(V2V) links. A recent work on the topic proposes a multi-agent reinforcement learning (MARL) approach based on deep Q-learning, which leverages a fingerprint-based deep Q-network (DQN) architecture. This work considers an extension of this framework by combining Double Q-learning (via Double DQN) and transfer learning. The motivation behind is that Double Q-learning can alleviate the problem of overestimation of the action values present in conventional Q-learning, while transfer learning can leverage knowledge acquired by an expert model to accelerate learning in the MARL setting. The proposed algorithm is evaluated in a realistic V2X setting, with synthetic data generated based on a geometry-based propagation model that incorporates location-specific geographical descriptors of the simulated environment(outlines of buildings, foliage, and vehicles). The advantages of the proposed approach are demonstrated via numerical simulations.
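其中 Double DQN 缓解高估的机制是标准的"在线网络选动作、目标网络评估"解耦,最小示意如下:

```python
import numpy as np

def double_dqn_target(q_online, q_target, r, s_next, gamma=0.99, done=False):
    """Double DQN 目标值:用在线网络 argmax 选动作,
    用目标网络评估该动作的价值,从而缓解 max 算子带来的高估。"""
    if done:
        return r
    a_star = int(np.argmax(q_online(s_next)))    # 在线网络选择动作
    return r + gamma * q_target(s_next)[a_star]  # 目标网络评估其价值

# 对比:普通 DQN 目标 r + gamma * max_a q_target(s')[a] 会放大估计噪声
q_online = lambda s: np.array([1.0, 2.0, 1.5])
q_target = lambda s: np.array([1.1, 1.2, 1.4])
print(double_dqn_target(q_online, q_target, r=0.5, s_next=None))  # 0.5 + 0.99*1.2
```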
【3】 Deep Ranking with Adaptive Margin Triplet Loss 标题:基于自适应边际三元组损失的深度排序
作者:Mai Lan Ha,Volker Blanz 机构:University of Siegen 链接:https://arxiv.org/abs/2107.06187 摘要:我们提出了一个简单的修改,将固定边际三元组损失改为自适应边际三元组损失。原始三元组损失广泛应用于人脸识别、人脸再识别和细粒度相似性等分类问题,而我们提出的损失非常适合评分为连续值的评分数据集。与原始三元组损失必须仔细采样数据不同,在我们的方法中,可以利用整个数据集生成三元组,并且优化仍然可以收敛,不会频繁遇到模型崩溃的问题。自适应边际只需在训练前计算一次,这比固定边际情况下在每个epoch后生成三元组要便宜得多。除了显著提高训练稳定性(在我们的实验中,所提出的模型从未崩溃,相比之下,使用现有三元组损失时训练崩溃了数次),我们还在各种评分数据集和网络架构上取得了比原始三元组损失略好的性能。 摘要:We propose a simple modification from a fixed margin triplet loss to an adaptive margin triplet loss. While the original triplet loss is used widely in classification problems such as face recognition, face re-identification and fine-grained similarity, our proposed loss is well suited for rating datasets in which the ratings are continuous values. In contrast to original triplet loss where we have to sample data carefully, in our method, we can generate triplets using the whole dataset, and the optimization can still converge without frequently running into a model collapsing issue. The adaptive margins only need to be computed once before the training, which is much less expensive than generating triplets after every epoch as in the fixed margin case. Besides substantially improved training stability (the proposed model never collapsed in our experiments compared to a couple of times that the training collapsed on existing triplet loss), we achieved slightly better performance than the original triplet loss on various rating datasets and network architectures.
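下面是自适应边际三元组损失的最小示意:边际在训练前按评分一次性预计算,替代固定常数;此处的边际公式是为演示而假设的,论文中的具体定义以原文为准。

```python
import torch

def adaptive_margin_triplet(anchor, pos, neg, margins):
    """三元组损失,但每个三元组使用自己的预计算边际。
    anchor/pos/neg: (B, d) 嵌入; margins: (B,)。"""
    d_ap = (anchor - pos).pow(2).sum(dim=1)
    d_an = (anchor - neg).pow(2).sum(dim=1)
    return torch.clamp(d_ap - d_an + margins, min=0).mean()

def precompute_margins(r_a, r_p, r_n, scale=1.0):
    """假设性的边际定义:与(锚-负)评分差减(锚-正)评分差成正比;
    评分为连续值时整个数据集均可参与构造三元组。"""
    return scale * (torch.abs(r_a - r_n) - torch.abs(r_a - r_p))

torch.manual_seed(0)
m = precompute_margins(torch.tensor([3.0]), torch.tensor([3.2]), torch.tensor([1.0]))
loss = adaptive_margin_triplet(torch.randn(1, 32), torch.randn(1, 32),
                               torch.randn(1, 32), m)
print(loss.item())
```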
【4】 Induced Domain Adaptation 标题:诱导域适应
作者:Yang Liu,Yatong Chen,Jiaheng Wei 机构:UC Santa Cruz, Santa Cruz, CA 备注:Preprint under review 链接:https://arxiv.org/abs/2107.05911 摘要:我们形式化了诱导域适应(IDA)问题,即底层分布/域偏移由被部署的模型本身引入的情形。我们的表述源于这样一类应用:部署的机器学习模型与人类智能体交互,最终将面对有响应、可交互的数据分布。通过研究在可用源分布(数据)上训练的模型如何转化为诱导域上的性能,我们正式讨论了IDA设定下学习的可迁移性。我们既给出了由诱导域偏移导致的性能差距的上界,也给出了分类器在源训练分布或诱导目标分布上必须承受的权衡的下界。我们进一步对带有协变量偏移和标签偏移的两种流行领域适应设定给出了实例化分析。我们强调了IDA的一些关键性质,以及计算和学习方面的挑战。 摘要:We formulate the problem of induced domain adaptation (IDA) when the underlying distribution/domain shift is introduced by the model being deployed. Our formulation is motivated by applications where the deployed machine learning models interact with human agents, and will ultimately face responsive and interactive data distributions. We formalize the discussions of the transferability of learning in our IDA setting by studying how the model trained on the available source distribution (data) would translate to the performance on the induced domain. We provide both upper bounds for the performance gap due to the induced domain shift, as well as lower bound for the trade-offs a classifier has to suffer on either the source training distribution or the induced target distribution. We provide further instantiated analysis for two popular domain adaptation settings with covariate shift and label shift. We highlight some key properties of IDA, as well as computational and learning challenges.
【5】 Toward Efficient Transfer Learning in 6G 标题:面向6G的高效迁移学习
作者:Saeedeh Parsaeefard,Alberto Leon-Garcia 机构:Electrical and Computer Engineering Department, University of Toronto 链接:https://arxiv.org/abs/2107.05728 摘要:6G网络将极大地扩展对面向数据、自主应用的OTT和网络用例的支持。这些用例的成功将取决于大数据集的可用性,由于系统的高度动态行为和数据收集过程的成本,大数据集在许多实际场景中并不实用。迁移学习(Transfer learning,TL)通过在不同的学习算法之间共享知识来应对这些挑战,是一种很有前途的方法。使用TL,可以大大提高学习速度和学习精度。然而,在6G中高效地部署和利用TL存在实现挑战。在本文中,我们通过提供一些性能指标来衡量TL成功与否来展开讨论。然后,我们展示了如何调整6G的基础设施、应用程序、管理和训练平面来处理TL。我们提供了6G中TL的示例,并强调了6G中数据的时空特征,这些特征可以导致有效的TL。通过仿真结果,我们演示了如何在两个用例之间传递量化的神经网络权重,从而在开销和性能之间进行权衡,并在6G中获得更有效的TL。我们还提供了6G在TL方面的未来研究方向。 摘要:6G networks will greatly expand the support for data-oriented, autonomous applications for over the top (OTT) and networking use cases. The success of these use cases will depend on the availability of big data sets which is not practical in many real scenarios due to the highly dynamic behavior of systems and the cost of data collection procedures. Transfer learning (TL) is a promising approach to deal with these challenges through the sharing of knowledge among diverse learning algorithms. with TL, the learning rate and learning accuracy can be considerably improved. However, there are implementation challenges to efficiently deploy and utilize TL in 6G. In this paper, we initiate this discussion by providing some performance metrics to measure the TL success. Then, we show how infrastructure, application, management, and training planes of 6G can be adapted to handle TL. We provide examples of TL in 6G and highlight the spatio-temporal features of data in 6G that can lead to efficient TL. By simulation results, we demonstrate how transferring the quantized neural network weights between two use cases can make a trade-off between overheads and performance and attain more efficient TL in 6G. We also provide a list of future research directions in TL for 6G.
强化学习(5篇)
【1】 Conservative Offline Distributional Reinforcement Learning 标题:保守的离线分布式强化学习
作者:Yecheng Jason Ma,Dinesh Jayaraman,Osbert Bastani 机构:University of Pennsylvania 链接:https://arxiv.org/abs/2107.06106 摘要:实践中的许多强化学习(RL)问题是离线的,纯粹从观测数据中学习。一个关键的挑战是如何确保学习到的策略是安全的,这需要量化与不同动作相关的风险。在在线环境下,分布式RL算法通过学习收益(即累积奖励)的分布而非期望收益来做到这一点;除了量化风险外,它们还被证明能学到更利于规划的表示。我们提出了保守离线分布式Actor-Critic算法(CODAC),一种适用于风险中性和风险厌恶领域的离线RL算法。CODAC通过惩罚分布外动作的预测收益分位数,使分布式RL适应离线设置。我们证明了CODAC学习了一个保守的收益分布——特别是对于有限MDP,CODAC收敛到收益分布分位数上的一致下界;我们的证明依赖于对分布Bellman算子的一个新的分析。在我们的实验中,在两个具有挑战性的机器人导航任务上,CODAC成功地利用纯粹从风险中性智能体收集的离线数据学习了风险规避策略。此外,在期望性能和风险敏感性能两方面,CODAC在D4RL MuJoCo基准上都是最先进的。 摘要:Many reinforcement learning (RL) problems in practice are offline, learning purely from observational data. A key challenge is how to ensure the learned policy is safe, which requires quantifying the risk associated with different actions. In the online setting, distributional RL algorithms do so by learning the distribution over returns (i.e., cumulative rewards) instead of the expected return; beyond quantifying risk, they have also been shown to learn better representations for planning. We propose Conservative Offline Distributional Actor Critic (CODAC), an offline RL algorithm suitable for both risk-neutral and risk-averse domains. CODAC adapts distributional RL to the offline setting by penalizing the predicted quantiles of the return for out-of-distribution actions. We prove that CODAC learns a conservative return distribution -- in particular, for finite MDPs, CODAC converges to an uniform lower bound on the quantiles of the return distribution; our proof relies on a novel analysis of the distributional Bellman operator. In our experiments, on two challenging robot navigation tasks, CODAC successfully learns risk-averse policies using offline data collected purely from risk-neutral agents. Furthermore, CODAC is state-of-the-art on the D4RL MuJoCo benchmark in terms of both expected and risk-sensitive performance.
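分布式RL中学习收益分位数通常使用分位数回归(pinball)损失;CODAC 在此基础上惩罚分布外动作的预测分位数。下面给出这两个成分的示意(惩罚项的形式是假设性简化,论文中的精确目标以原文为准):

```python
import numpy as np

def pinball_loss(pred_quantiles, target, taus):
    """分位数回归的 pinball 损失,用于学习收益分布的分位数。
    pred_quantiles: (K,) 预测分位数; taus: (K,) 对应分位水平。"""
    u = target - pred_quantiles
    return np.mean(np.maximum(taus * u, (taus - 1) * u))

def codac_style_objective(pred_q, target, taus, pred_q_ood, alpha=1.0):
    """示意性目标:拟合数据内收益分位数,同时压低
    分布外动作的预测分位数,得到保守的收益分布。"""
    return pinball_loss(pred_q, target, taus) + alpha * pred_q_ood.mean()

taus = np.linspace(0.05, 0.95, 19)
print(codac_style_objective(np.zeros(19), 1.0, taus, np.zeros(19)))
```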
【2】 Cautious Policy Programming: Exploiting KL Regularization in Monotonic Policy Improvement for Reinforcement Learning 标题:谨慎的策略规划:在强化学习的单调策略改进中利用KL正则化
作者:Lingwei Zhu,Toshinori Kitamura,Takamitsu Matsubara 机构:Graduate School of Science and Technology, Nara Institute of Science and Technology, Ikoma, Nara, Japan 备注:15 pages. arXiv admin note: text overlap with arXiv:2008.10806 链接:https://arxiv.org/abs/2107.05798 摘要:本文提出了一种新的基于值的强化学习(RL)算法——谨慎策略规划(CPP),该算法能保证学习过程中策略的单调改进。基于熵正则化RL的性质,我们推导出一种新的感知熵正则化的策略改进下界,该下界只需要估计期望的策略优势函数。CPP利用这个下界作为调整策略更新程度的准则,以减轻策略振荡。不同于类似算法大多面向理论,我们还提出了一种新的插值方案,使CPP在高维控制问题中具有更好的扩展性。我们证明了所提出的算法可以在教学性的经典控制问题和具有挑战性的高维Atari游戏中权衡性能与稳定性。 摘要:In this paper, we propose cautious policy programming (CPP), a novel value-based reinforcement learning (RL) algorithm that can ensure monotonic policy improvement during learning. Based on the nature of entropy-regularized RL, we derive a new entropy regularization-aware lower bound of policy improvement that only requires estimating the expected policy advantage function. CPP leverages this lower bound as a criterion for adjusting the degree of a policy update for alleviating policy oscillation. Different from similar algorithms that are mostly theory-oriented, we also propose a novel interpolation scheme that makes CPP better scale in high dimensional control problems. We demonstrate that the proposed algorithm can trade off performance and stability in both didactic classic control problems and challenging high-dimensional Atari games.
【3】 Representation Learning for Out-Of-Distribution Generalization in Reinforcement Learning 标题:强化学习中非分布泛化的表征学习
作者:Andrea Dittadi,Frederik Träuble,Manuel Wüthrich,Felix Widmaier,Peter Gehler,Ole Winther,Francesco Locatello,Olivier Bachem,Bernhard Schölkopf,Stefan Bauer 机构:Technical University of Denmark,Max Planck Institute for Intelligent Systems, Tübingen, Amazon Lablets,Google Brain,CIFAR Azrieli Global Scholar 链接:https://arxiv.org/abs/2107.05686 摘要:学习对各种下游任务有用的数据表示是人工智能的基石。虽然现有的方法通常是在分类或生成图像质量等下游任务上进行评估,但我们建议通过表征在下游控制任务(如到达或推动物体)中的有用性来评估它们。通过训练10000多个强化学习策略,我们广泛地评估了不同的表征属性对分布外(OOD)泛化的影响程度。最后,我们演示了这些策略从模拟到现实世界的零样本迁移,没有任何领域随机化或微调。本文旨在首次系统性地刻画学习到的表征对现实世界分布外下游任务的有用性。 摘要:Learning data representations that are useful for various downstream tasks is a cornerstone of artificial intelligence. While existing methods are typically evaluated on downstream tasks such as classification or generative image quality, we propose to assess representations through their usefulness in downstream control tasks, such as reaching or pushing objects. By training over 10,000 reinforcement learning policies, we extensively evaluate to what extent different representation properties affect out-of-distribution (OOD) generalization. Finally, we demonstrate zero-shot transfer of these policies from simulation to the real world, without any domain randomization or fine-tuning. This paper aims to establish the first systematic characterization of the usefulness of learned representations for real-world OOD downstream tasks.
【4】 A Deep Reinforcement Learning Approach for Traffic Signal Control Optimization 标题:一种用于交通信号控制优化的深度强化学习方法
作者:Zhenning Li,Chengzhong Xu,Guohui Zhang 机构:a University of Macau, b University of Hawaii at Manoa, Corresponding Author 链接:https://arxiv.org/abs/2107.06115 摘要:低效的交通信号控制方法会导致交通拥挤和能源浪费等问题。强化学习是一种数据驱动的自适应交通信号控制方法。虽然深度神经网络(DNN)的发展进一步增强了它的学习能力,但将深度RLs应用于多信号交叉口交通网络仍然面临着一些挑战,包括非平稳环境、探索-开发困境、多智能体训练方案、连续动作空间等,为了解决这些问题,本文首先通过扩展actor-critic策略梯度算法,提出了一种多agent深度确定性策略梯度(MADDPG)方法。MADDPG有一个集中的学习和分散的执行范例,在这个范例中,评论家使用额外的信息来简化训练过程,而演员则根据他们自己的本地观察采取行动。在城市交通仿真平台(SUMO)上对该模型进行了仿真评价。模型比较结果表明了该算法在交通信号灯控制中的有效性。 摘要:Inefficient traffic signal control methods may cause numerous problems, such as traffic congestion and waste of energy. Reinforcement learning (RL) is a trending data-driven approach for adaptive traffic signal control in complex urban traffic networks. Although the development of deep neural networks (DNN) further enhances its learning capability, there are still some challenges in applying deep RLs to transportation networks with multiple signalized intersections, including non-stationarity environment, exploration-exploitation dilemma, multi-agent training schemes, continuous action spaces, etc. In order to address these issues, this paper first proposes a multi-agent deep deterministic policy gradient (MADDPG) method by extending the actor-critic policy gradient algorithms. MADDPG has a centralized learning and decentralized execution paradigm in which critics use additional information to streamline the training process, while actors act on their own local observations. The model is evaluated via simulation on the Simulation of Urban MObility (SUMO) platform. Model comparison results show the efficiency of the proposed algorithm in controlling traffic lights.
【5】 Model Selection with Near Optimal Rates for Reinforcement Learning with General Model Classes 标题:一般模型类强化学习的近似最优率模型选择
作者:Avishek Ghosh,Sayak Ray Chowdhury,Kannan Ramchandran 机构:Dept. of EECS, UC Berkeley, Dept. of ECE, Indian Institute of Science, Bangalore 备注:24 pages 链接:https://arxiv.org/abs/2107.05849 摘要:我们讨论了有限视界幕式强化学习(RL)问题的模型选择问题,其中转移核$P^*$属于一个具有有限度量熵的模型族$\mathcal{P}^*$。在模型选择框架中,我们得到的不是$\mathcal{P}^*$本身,而是$M$个嵌套的转移核族$\mathcal{P}_1 \subset \mathcal{P}_2 \subset \ldots \subset \mathcal{P}_M$。我们提出并分析了一种新的算法——自适应强化学习(通用版)(ARL-GEN),它能自适应到真实转移核$P^*$所在的最小族。ARL-GEN使用带有价值目标回归的上置信强化学习(UCRL)算法作为黑盒,并在每个回合的开始处放置模型选择模块。在模型类的温和可分性假设下,我们证明了ARL-GEN以高概率获得$\tilde{\mathcal{O}}(d_{\mathcal{E}}^* H^2 \sqrt{d_{\mathcal{E}}^* \mathbb{M}^* H^2 T})$的遗憾,其中$H$是视界长度,$T$是总步数,$d_{\mathcal{E}}^*$是Eluder维度,$\mathbb{M}^*$是对应于$\mathcal{P}^*$的度量熵。注意,这种遗憾的尺度与提前知道$\mathcal{P}^*$的oracle相匹配。我们证明了ARL-GEN的模型选择代价是遗憾中的一个加性项,对$T$的依赖很弱。随后,我们去掉可分性假设,考虑线性混合MDP的设定,其中转移核$P^*$具有线性函数逼近。利用这种低秩结构,我们提出了新的自适应模型选择算法,得到了与知道真实模型类的oracle(在阶上)完全相同的遗憾。 摘要:We address the problem of model selection for the finite horizon episodic Reinforcement Learning (RL) problem where the transition kernel $P^*$ belongs to a family of models $\mathcal{P}^*$ with finite metric entropy. In the model selection framework, instead of $\mathcal{P}^*$, we are given $M$ nested families of transition kernels $\mathcal{P}_1 \subset \mathcal{P}_2 \subset \ldots \subset \mathcal{P}_M$. We propose and analyze a novel algorithm, namely \emph{Adaptive Reinforcement Learning (General)} (\texttt{ARL-GEN}) that adapts to the smallest such family where the true transition kernel $P^*$ lies. \texttt{ARL-GEN} uses the Upper Confidence Reinforcement Learning (\texttt{UCRL}) algorithm with value targeted regression as a blackbox and puts a model selection module at the beginning of each epoch. Under a mild separability assumption on the model classes, we show that \texttt{ARL-GEN} obtains a regret of $\tilde{\mathcal{O}}(d_{\mathcal{E}}^* H^2 \sqrt{d_{\mathcal{E}}^* \mathbb{M}^* H^2 T})$, with high probability, where $H$ is the horizon length, $T$ is the total number of steps, $d_{\mathcal{E}}^*$ is the Eluder dimension and $\mathbb{M}^*$ is the metric entropy corresponding to $\mathcal{P}^*$. Note that this regret scaling matches that of an oracle that knows $\mathcal{P}^*$ in advance. We show that the cost of model selection for \texttt{ARL-GEN} is an additive term in the regret having a weak dependence on $T$. Subsequently, we remove the separability assumption and consider the setup of linear mixture MDPs, where the transition kernel $P^*$ has a linear function approximation. With this low rank structure, we propose novel adaptive algorithms for model selection, and obtain (order-wise) regret identical to that of an oracle with knowledge of the true model class.
符号|符号学习(2篇)
【1】 Identification of Dynamical Systems using Symbolic Regression 标题:基于符号回归的动力系统辨识
作者:Gabriel Kronberger,Lukas Kammerer,Michael Kommenda 备注:None 链接:https://arxiv.org/abs/2107.06131 摘要:我们描述了一种从观测数据中识别动力系统模型的方法。该方法基于符号回归的概念,利用遗传规划方法建立常微分方程组。新颖之处在于我们增加了一个基于梯度的ODE参数优化步骤。为此,我们用自动微分法计算了初值问题(IVP)解的灵敏度。所提出的方法在一组19个问题实例上进行了测试,这些问题实例包括来自模拟系统的数据集和来自机械系统的数据集。我们发现基于梯度的参数优化可以提高模型的预测精度。当我们首先将单个方程拟合到数值差分,然后通过将IVP解拟合到观测变量值来对识别的参数值进行微调时,可以获得最佳结果。 摘要:We describe a method for the identification of models for dynamical systems from observational data. The method is based on the concept of symbolic regression and uses genetic programming to evolve a system of ordinary differential equations (ODE). The novelty is that we add a step of gradient-based optimization of the ODE parameters. For this we calculate the sensitivities of the solution to the initial value problem (IVP) using automatic differentiation. The proposed approach is tested on a set of 19 problem instances taken from the literature which includes datasets from simulated systems as well as datasets captured from mechanical systems. We find that gradient-based optimization of parameters improves predictive accuracy of the models. The best results are obtained when we first fit the individual equations to the numeric differences and then subsequently fine-tune the identified parameter values by fitting the IVP solution to the observed variable values.
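下面用 SciPy 勾勒"对候选 ODE 结构以基于梯度的最小二乘拟合 IVP 解来微调参数"这一步(示意:候选结构假设为 Lotka-Volterra 形式;论文用自动微分计算灵敏度,此处 SciPy 内部以数值差分近似雅可比,仅作简化替代):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def rhs(t, y, a, b, c, d):
    """假设由符号回归找到的候选 ODE 结构(Lotka-Volterra 形式)。"""
    x1, x2 = y
    return [a * x1 - b * x1 * x2, c * x1 * x2 - d * x2]

def residuals(theta, t_obs, y_obs, y0):
    """把 IVP 解与观测变量值之差作为残差,供最小二乘微调参数。"""
    sol = solve_ivp(rhs, (t_obs[0], t_obs[-1]), y0, t_eval=t_obs,
                    args=tuple(theta), rtol=1e-6)
    return (sol.y.T - y_obs).ravel()

# 生成"观测"数据并从粗略初值出发微调参数
t = np.linspace(0, 10, 50)
truth = solve_ivp(rhs, (0, 10), [1.0, 0.5], t_eval=t, args=(1.0, 0.4, 0.1, 0.4))
fit = least_squares(residuals, x0=[0.8, 0.5, 0.2, 0.3],
                    args=(t, truth.y.T, [1.0, 0.5]))
print(fit.x)  # 应接近真值 (1.0, 0.4, 0.1, 0.4)
```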
【2】 Towards Automatic Instrumentation by Learning to Separate Parts in Symbolic Multitrack Music 标题:通过学习分离符号多音轨音乐中的声部迈向自动配器
作者:Hao-Wen Dong,Chris Donahue,Taylor Berg-Kirkpatrick,Julian McAuley 机构: University of California San Diego, Stanford University 备注:Accepted to ISMIR 2021 链接:https://arxiv.org/abs/2107.05916 摘要:现代键盘允许音乐家通过给不同的乐器指定区域(键盘的固定音高范围)来同时演奏多种乐器。在本文中,我们旨在进一步扩展这一思想,并探讨自动配器的可行性,即在独奏音乐演奏过程中为音符动态分配乐器。除了为演奏用例提供在线、实时的设定外,自动配器还可以在离线设定下的辅助创作工具中找到应用。由于缺乏原始独奏音乐及其完整编曲的配对数据,我们通过学习从符号多音轨音乐的混合中分离声部(如人声、乐器和音轨)来逼近自动配器,并假设该混合是在键盘上演奏的。我们将声部分离任务定义为一个序列多类分类问题,并采用机器学习将音符序列映射为声部标签序列。为了检验所提出模型的有效性,我们在四个不同流派和编制的数据集(巴赫合唱、弦乐四重奏、游戏音乐和流行音乐)上进行了全面的实证评估。实验表明,所提出的模型优于各种基线。我们还展示了所提出的模型通过将混合分离成声部,为现有编曲生成另一种令人信服的配器的潜力。所有源代码和音频样本可以在https://salu133445.github.io/arranger/找到。 摘要:Modern keyboards allow a musician to play multiple instruments at the same time by assigning zones -- fixed pitch ranges of the keyboard -- to different instruments. In this paper, we aim to further extend this idea and examine the feasibility of automatic instrumentation -- dynamically assigning instruments to notes in solo music during performance. In addition to the online, real-time-capable setting for performative use cases, automatic instrumentation can also find applications in assistive composing tools in an offline setting. Due to the lack of paired data of original solo music and their full arrangements, we approach automatic instrumentation by learning to separate parts (e.g., voices, instruments and tracks) from their mixture in symbolic multitrack music, assuming that the mixture is to be played on a keyboard. We frame the task of part separation as a sequential multi-class classification problem and adopt machine learning to map sequences of notes into sequences of part labels. To examine the effectiveness of our proposed models, we conduct a comprehensive empirical evaluation over four diverse datasets of different genres and ensembles -- Bach chorales, string quartets, game music and pop music. Our experiments show that the proposed models outperform various baselines. We also demonstrate the potential for our proposed models to produce alternative convincing instrumentations for an existing arrangement by separating its mixture into parts. All source code and audio samples can be found at https://salu133445.github.io/arranger/ .
医学相关(5篇)
【1】 DiCOVA-Net: Diagnosing COVID-19 using Acoustics based on Deep Residual Network for the DiCOVA Challenge 2021 标题:DiCOVA-Net:面向2021年DiCOVA挑战赛的基于深度残差网络的声学COVID-19诊断
作者:Jiangeng Chang,Shaoze Cui,Mengling Feng 备注:5 figures 链接:https://arxiv.org/abs/2107.06126 摘要:在本文中,我们提出了一种基于深度残差网络的方法,即DiCOVA网络,根据咳嗽声记录识别COVID-19感染患者。由于健康人群远远多于感染者,这一分类问题面临着数据不平衡的挑战。为了提高模型对少数群体(感染者)的识别能力,我们在模型中引入了数据扩充和成本敏感的方法。此外,考虑到本课题的特殊性,我们采用了一些微调技术来调整预训练ResNet50。此外,为了提高模型的可推广性,我们使用集成学习来整合使用不同随机种子产生的多个基分类器的预测结果。为了评估提出的DiCOVA网络的性能,我们使用DiCOVA挑战数据集进行了实验。结果表明,该方法的AUC达到85.43%,在所有参赛队伍中名列前茅。 摘要:In this paper, we propose a deep residual network-based method, namely the DiCOVA-Net, to identify COVID-19 infected patients based on the acoustic recording of their coughs. Since there are far more healthy people than infected patients, this classification problem faces the challenge of imbalanced data. To improve the model's ability to recognize minority class (the infected patients), we introduce data augmentation and cost-sensitive methods into our model. Besides, considering the particularity of this task, we deploy some fine-tuning techniques to adjust the pre-training ResNet50. Furthermore, to improve the model's generalizability, we use ensemble learning to integrate prediction results from multiple base classifiers generated using different random seeds. To evaluate the proposed DiCOVA-Net's performance, we conducted experiments with the DiCOVA challenge dataset. The results show that our method has achieved 85.43% in AUC, among the top of all competing teams.
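摘要提到的两项关键技巧——代价敏感损失与多随机种子集成——可浓缩成如下示意(PyTorch;并非DiCOVA-Net原始代码,类权重、占位模型结构与种子数均为演示用假设,论文中的基模型为微调后的预训练ResNet50):

    import torch
    import torch.nn as nn

    # 代价敏感:给少数类(感染者)更大的损失权重
    class_weights = torch.tensor([1.0, 8.0])      # [健康, 感染],比例为假设
    criterion = nn.CrossEntropyLoss(weight=class_weights)

    def make_model():
        # 占位的小型分类头,仅便于演示
        return nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))

    def ensemble_predict(x, seeds=(0, 1, 2)):
        # 集成学习:平均不同随机种子下训练的基分类器的softmax概率
        probs = []
        for s in seeds:
            torch.manual_seed(s)
            model = make_model()              # 实际使用时应加载各自训练好的权重
            model.eval()
            with torch.no_grad():
                probs.append(torch.softmax(model(x), dim=-1))
        return torch.stack(probs).mean(dim=0)

    x = torch.randn(4, 128)
    loss = criterion(make_model()(x), torch.tensor([0, 1, 0, 1]))   # 训练时的加权损失
    scores = ensemble_predict(x)                                    # 推断时的集成概率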
【2】 AutoScore-Imbalance: An interpretable machine learning tool for development of clinical scores with rare events data 标题:AutoScore-Imbalance:一种利用罕见事件数据开发临床评分的可解释机器学习工具
作者:Han Yuan,Feng Xie,Marcus Eng Hock Ong,Yilin Ning,Marcel Lucas Chee,Seyed Ehsan Saffari,Hairil Rizal Abdullah,Benjamin Alan Goldstein,Bibhas Chakraborty,Nan Liu 机构: Duke-NUS Medical School, National University of Singapore, Singapore, Singapore, Department of Emergency Medicine, Singapore General Hospital, Singapore, Singapore, Health Services Research Centre, Singapore Health Services, Singapore 链接:https://arxiv.org/abs/2107.06039 摘要:背景:医疗决策影响个人和公众健康。临床评分常用于各种各样的决策模型,用于确定床边疾病恶化的程度。AutoScore是一种基于机器学习和广义线性模型的临床评分生成器。然而,它目前的框架在处理罕见事件的不平衡数据时仍有改进的余地。方法:采用机器智能的方法,我们开发了AutoScore-Imbalance,它包括三个部分:训练数据集优化、样本权重优化和调整AutoScore。所有评分模型均根据受试者操作特征分析中的曲线下面积(AUC)和平衡准确度(即敏感性和特异性的平均值)进行评估。通过利用贝斯以色列女执事医疗中心的一个可公开访问的数据集,我们评估了提出的模型和基线方法在预测住院病人死亡率方面的作用。结果:在AUC和平衡准确性方面,AutoScore-Imbalance优于基线。9个变量的AutoScore-Imbalance子模型的AUC最高,为0.786(0.732-0.839),11个变量的原始AutoScore模型的AUC为0.723(0.663-0.783),21个变量的logistic回归模型的AUC为0.743(0.685-0.800)。AutoScore-Imbalance子模型(使用下采样算法)仅使用五个变量,AUC为0.771(0.718-0.823),显示了性能和变量稀疏性之间的良好平衡。结论:AutoScore-Imbalance工具有可能应用于高度不平衡的数据集,以进一步了解罕见的医疗事件,并促进现实世界的临床决策。 摘要:Background: Medical decision-making impacts both individual and public health. Clinical scores are commonly used among a wide variety of decision-making models for determining the degree of disease deterioration at the bedside. AutoScore was proposed as a useful clinical score generator based on machine learning and a generalized linear model. Its current framework, however, still leaves room for improvement when addressing unbalanced data of rare events. Methods: Using machine intelligence approaches, we developed AutoScore-Imbalance, which comprises three components: training dataset optimization, sample weight optimization, and adjusted AutoScore. All scoring models were evaluated on the basis of their area under the curve (AUC) in the receiver operating characteristic analysis and balanced accuracy (i.e., mean value of sensitivity and specificity). By utilizing a publicly accessible dataset from Beth Israel Deaconess Medical Center, we assessed the proposed model and baseline approaches in the prediction of inpatient mortality. Results: AutoScore-Imbalance outperformed baselines in terms of AUC and balanced accuracy. The nine-variable AutoScore-Imbalance sub-model achieved the highest AUC of 0.786 (0.732-0.839) while the eleven-variable original AutoScore obtained an AUC of 0.723 (0.663-0.783), and the logistic regression with 21 variables obtained an AUC of 0.743 (0.685-0.800). The AutoScore-Imbalance sub-model (using down-sampling algorithm) yielded an AUC of 0.771 (0.718-0.823) with only five variables, demonstrating a good balance between performance and variable sparsity. Conclusions: The AutoScore-Imbalance tool has the potential to be applied to highly unbalanced datasets to gain further insight into rare medical events and to facilitate real-world clinical decision-making.
【3】 Scalable, Axiomatic Explanations of Deep Alzheimer's Diagnosis from Heterogeneous Data 标题:基于异质数据的深度阿尔茨海默病诊断的可扩展、公理化解释
作者:Sebastian Pölsterl,Christina Aigner,Christian Wachinger 机构:Artificial Intelligence in Medical Imaging (AI-Med), Department of Child and Adolescent Psychiatry, Ludwig-Maximilians-Universität, Munich, Germany 备注:Accepted at 2021 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 链接:https://arxiv.org/abs/2107.05997 摘要:深度神经网络(DNNs)具有从复杂的生物医学数据中学习的巨大潜力。特别是,DNNs已被用于无缝融合来自神经解剖学、遗传学、生物标志物和神经心理学测试的异质信息,以实现对阿尔茨海默病的高准确诊断。另一方面,它们的黑匣子性质仍然是在诊所采用这种系统的一个障碍,在诊所,解释性是绝对必要的。我们提出异质神经网络的Shapley值解释(SVEHNN),用于解释DNN基于神经解剖的三维点云和表格化生物标志物做出的阿尔茨海默病诊断。我们的解释基于Shapley值,它是唯一满足文献中先前确立的局部解释全部基本公理的方法。因此,SVEHNN具有许多令人满意的特性,而这些特性正是以往医学决策可解释性工作所缺乏的。为了避免Shapley值的指数时间复杂度,我们提出将给定的DNN转化为一个轻量级的概率深度网络而无需再训练,从而使复杂度仅为特征数的二次方。在合成数据和真实数据的实验中,我们证明了我们可以在极大地减少运行时间的情况下逼近精确的Shapley值,并且可以揭示网络从数据中学习到的隐藏知识。 摘要:Deep Neural Networks (DNNs) have an enormous potential to learn from complex biomedical data. In particular, DNNs have been used to seamlessly fuse heterogeneous information from neuroanatomy, genetics, biomarkers, and neuropsychological tests for highly accurate Alzheimer's disease diagnosis. On the other hand, their black-box nature is still a barrier for the adoption of such a system in the clinic, where interpretability is absolutely essential. We propose Shapley Value Explanation of Heterogeneous Neural Networks (SVEHNN) for explaining the Alzheimer's diagnosis made by a DNN from the 3D point cloud of the neuroanatomy and tabular biomarkers. Our explanations are based on the Shapley value, which is the unique method that satisfies all fundamental axioms for local explanations previously established in the literature. Thus, SVEHNN has many desirable characteristics that previous work on interpretability for medical decision making is lacking. To avoid the exponential time complexity of the Shapley value, we propose to transform a given DNN into a Lightweight Probabilistic Deep Network without re-training, thus achieving a complexity only quadratic in the number of features. In our experiments on synthetic and real data, we show that we can closely approximate the exact Shapley value with a dramatically reduced runtime and can reveal the hidden knowledge the network has learned from the data.
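摘要所依据的Shapley值在特征数很少时可以按定义精确计算,其指数复杂度也正是论文提出轻量级概率网络近似的动机;下面是与SVEHNN无关的通用定义式示意(Python,效用函数v为演示用的玩具模型):

    from itertools import combinations
    from math import factorial

    def shapley_values(v, n):
        # 按定义精确计算:对每个特征i,枚举不含i的子集S,
        # 以权重 |S|!(n-|S|-1)!/n! 累加边际贡献 v(S∪{i}) - v(S)
        phi = [0.0] * n
        for i in range(n):
            rest = [j for j in range(n) if j != i]
            for k in range(len(rest) + 1):
                for S in combinations(rest, k):
                    w = factorial(k) * factorial(n - k - 1) / factorial(n)
                    phi[i] += w * (v(set(S) | {i}) - v(set(S)))
        return phi

    # 玩具效用:特征0和1的贡献分别为3和1,特征2无贡献
    v = lambda S: 3.0 * (0 in S) + 1.0 * (1 in S)
    print(shapley_values(v, 3))   # ≈ [3.0, 1.0, 0.0],满足有效性等基本公理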
【4】 Detecting when pre-trained nnU-Net models fail silently for Covid-19 标题:检测预训练nnU-Net模型何时在Covid-19数据上悄然失效
作者:Camila Gonzalez,Karol Gotkowski,Andreas Bucher,Ricarda Fischbach,Isabel Kaltenborn,Anirban Mukhopadhyay 机构: Darmstadt University of Technology, Karolinenpl. , Darmstadt, Germany, University Hospital Frankfurt, Theodor-Stern-Kai , Frankfurt am Main 链接:https://arxiv.org/abs/2107.05975 摘要:计算机断层扫描中肺部病变的自动分割有可能减轻临床医生在Covid-19大流行期间的负担。然而,预测性深度学习模型在临床常规中并不被信任,因为它们会在分布外(OOD)数据上悄然失效。我们提出了一种利用特征空间中马氏距离的轻量级OOD检测方法。该方法可以无缝地集成到最先进的分割管道中,而无需改变模型结构或训练过程,因此可用于评估预训练模型对新数据的适用性。我们用一个在多机构数据集上训练的基于图像块的nnU-Net结构验证了我们的方法,发现它能有效地检测出模型会错误分割的样本。 摘要:Automatic segmentation of lung lesions in computer tomography has the potential to ease the burden of clinicians during the Covid-19 pandemic. Yet predictive deep learning models are not trusted in the clinical routine due to failing silently in out-of-distribution (OOD) data. We propose a lightweight OOD detection method that exploits the Mahalanobis distance in the feature space. The proposed approach can be seamlessly integrated into state-of-the-art segmentation pipelines without requiring changes in model architecture or training procedure, and can therefore be used to assess the suitability of pre-trained models to new data. We validate our method with a patch-based nnU-Net architecture trained with a multi-institutional dataset and find that it effectively detects samples that the model segments incorrectly.
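摘要的核心打分机制——在分布内训练特征上拟合均值与协方差、再用马氏距离给新样本打分——可示意如下(numpy;与nnU-Net管道无关,特征与阈值均为演示用随机数据):

    import numpy as np

    def fit_gaussian(train_feats):
        # 在分布内(ID)训练特征上估计均值与协方差(加小正则保证可逆)
        mu = train_feats.mean(axis=0)
        cov = np.cov(train_feats, rowvar=False) + 1e-6 * np.eye(train_feats.shape[1])
        return mu, np.linalg.inv(cov)

    def mahalanobis_score(x, mu, cov_inv):
        # 得分越高,样本越可能是分布外(OOD),分割结果越不可信
        d = x - mu
        return float(np.sqrt(d @ cov_inv @ d))

    rng = np.random.default_rng(0)
    train = rng.normal(size=(500, 16))                 # 假设的ID特征
    mu, cov_inv = fit_gaussian(train)
    s_id = mahalanobis_score(rng.normal(size=16), mu, cov_inv)
    s_ood = mahalanobis_score(rng.normal(loc=5.0, size=16), mu, cov_inv)
    print(s_id < s_ood)                                # OOD样本得分显著更高 → True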
【5】 Challenges for machine learning in clinical translation of big data imaging studies 标题:机器学习在大数据影像研究临床转化中面临的挑战
作者:Nicola K Dinsdale,Emma Bluemke,Vaanathi Sundaresan,Mark Jenkinson,Stephen Smith,Ana IL Namburete 链接:https://arxiv.org/abs/2107.05630 摘要:深度学习图像分析方法与大规模影像数据集的结合为影像神经科学和流行病学提供了许多机会。然而,尽管深度学习在许多神经成像任务中取得了成功,但在大规模数据集和处理工具的临床转化中仍然存在障碍。在这里,我们探讨了主要的挑战和克服这些挑战的方法。我们关注与数据可用性、可解释性、评估和后勤挑战相关的问题,并讨论我们认为仍然需要克服的挑战,以使大数据深度学习方法能够在研究领域之外获得全面成功。 摘要:The combination of deep learning image analysis methods and large-scale imaging datasets offers many opportunities to imaging neuroscience and epidemiology. However, despite the success of deep learning when applied to many neuroimaging tasks, there remain barriers to the clinical translation of large-scale datasets and processing tools. Here, we explore the main challenges and the approaches that have been explored to overcome them. We focus on issues relating to data availability, interpretability, evaluation and logistical challenges, and discuss the challenges we believe are still to be overcome to enable the full success of big data deep learning approaches to be experienced outside of the research field.
自动驾驶|车辆|车道检测等(1篇)
【1】 Practical and Configurable Network Traffic Classification Using Probabilistic Machine Learning 标题:基于概率机器学习的实用可配置网络流量分类
作者:Jiahui Chen,Joe Breen,Jeff M. Phillips,Jacobus Van der Merwe 备注:Published in the Springer Cluster Computing journal 链接:https://arxiv.org/abs/2107.06080 摘要:网络流量分类具有广泛的适用性和较高的准确度,在许多网络安全和管理任务中具有重要的应用价值。一个灵活且易于配置的分类框架是理想的,因为它可以定制为在各种各样的网络中使用。在本文中,我们提出了一种高度可配置且灵活的机器学习流量分类方法,该方法仅依赖于数据包序列的统计信息来区分已知或批准的流量与未知流量。我们的方法基于似然估计,为分类决策提供确定性度量,并且可以在可调整的确定性水平上对流量进行分类。我们的分类方法也可以应用于不同的分类场景,每个场景都有不同的分类目标。我们演示了我们的分类方案及其所有配置如何在高性能计算网络环境中的实际流量上表现良好。 摘要:Network traffic classification that is widely applicable and highly accurate is valuable for many network security and management tasks. A flexible and easily configurable classification framework is ideal, as it can be customized for use in a wide variety of networks. In this paper, we propose a highly configurable and flexible machine learning traffic classification method that relies only on statistics of sequences of packets to distinguish known, or approved, traffic from unknown traffic. Our method is based on likelihood estimation, provides a measure of certainty for classification decisions, and can classify traffic at adjustable certainty levels. Our classification method can also be applied in different classification scenarios, each prioritizing a different classification goal. We demonstrate how our classification scheme and all its configurations perform well on real-world traffic from a high performance computing network environment.
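摘要所述"基于似然估计、在可调确定度水平上区分已知/未知流量"的思路,可用一个高斯似然模型的示意来表达(Python/scipy;特征选择与阈值均为演示用假设,并非论文的具体实现):

    import numpy as np
    from scipy.stats import multivariate_normal

    # 用已批准流量的数据包序列统计量(示例:平均包长、平均到达间隔)拟合似然模型
    rng = np.random.default_rng(1)
    approved = rng.normal(loc=[500.0, 0.02], scale=[80.0, 0.005], size=(1000, 2))
    model = multivariate_normal(mean=approved.mean(axis=0), cov=np.cov(approved.T))

    # 可调整的确定性水平:以训练集对数似然的分位数作为阈值
    train_ll = model.logpdf(approved)
    threshold = np.quantile(train_ll, 0.05)      # 例如 95% 确定度

    def classify(flow_stats):
        # 返回 (标签, 对数似然),后者可作为分类决策的确定性度量
        ll = float(model.logpdf(flow_stats))
        return ("approved" if ll >= threshold else "unknown"), ll

    print(classify(np.array([510.0, 0.021])))    # 典型的已批准流量
    print(classify(np.array([1500.0, 0.5])))     # 偏离统计特征 → unknown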
推理|分析|理解|解释(5篇)
【1】 Learning a Discriminant Latent Space with Neural Discriminant Analysis 标题:用神经判别分析学习判别潜在空间
作者:Mai Lan Ha,Gianni Franchi,Emanuel Aldea,Volker Blanz 机构:Department of Computer Science, University of Siegen, Germany, U2IS, ENSTA Paris, Institut Polytechnique de Paris, France, Laboratoire SATIE, Paris-Saclay University 链接:https://arxiv.org/abs/2107.06209 摘要:判别特征在图像和目标分类以及半监督学习、细粒度分类、分布外检测等领域都有重要的应用。受线性判别分析(LDA)的启发,我们提出了一种用于深度卷积神经网络(DCNNs)的神经判别分析(NDA)优化方法。NDA将深度特征转换为更具辨别力的特征,从而提高了在各种任务中的性能。我们提出的优化针对类间和类内方差有两个主要目标。第一个目标是最小化每个类内部的方差。第二个目标是最大化来自不同类的特征之间的成对距离。我们在一般监督分类、细粒度分类、半监督学习和分布外检测等不同的研究领域对我们的NDA优化进行了评估。与不使用NDA的基线方法相比,我们在所有领域都实现了性能改进。此外,使用NDA,我们在各种测试数据集的四项任务上也超过了最新的水平。 摘要:Discriminative features play an important role in image and object classification and also in other fields of research such as semi-supervised learning, fine-grained classification, out of distribution detection. Inspired by Linear Discriminant Analysis (LDA), we propose an optimization called Neural Discriminant Analysis (NDA) for Deep Convolutional Neural Networks (DCNNs). NDA transforms deep features to become more discriminative and, therefore, improves the performances in various tasks. Our proposed optimization has two primary goals for inter- and intra-class variances. The first one is to minimize variances within each individual class. The second goal is to maximize pairwise distances between features coming from different classes. We evaluate our NDA optimization in different research fields: general supervised classification, fine-grained classification, semi-supervised learning, and out of distribution detection. We achieve performance improvements in all the fields compared to baseline methods that do not use NDA. Besides, using NDA, we also surpass the state of the art on the four tasks on various testing datasets.
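摘要给出的两个优化目标——最小化类内方差、最大化类间成对距离——可直接写成一个可与分类损失叠加的损失项(PyTorch;这只是按摘要描述构造的示意,hinge形式与margin取值为假设,并非论文的具体公式):

    import torch

    def nda_style_loss(features, labels, margin=10.0):
        # 类内项:各类特征到类中心的均方距离;类间项:鼓励拉大类中心两两距离
        classes = labels.unique()
        centers = torch.stack([features[labels == c].mean(dim=0) for c in classes])
        intra = torch.stack([
            ((features[labels == c] - centers[i]) ** 2).sum(dim=1).mean()
            for i, c in enumerate(classes)
        ]).mean()
        dists = torch.cdist(centers, centers)
        off_diag = ~torch.eye(len(classes), dtype=torch.bool)
        inter = torch.clamp(margin - dists[off_diag], min=0).mean()
        return intra + inter

    feats = torch.randn(32, 64, requires_grad=True)   # 某一层的深度特征(占位)
    labels = torch.randint(0, 4, (32,))
    nda_style_loss(feats, labels).backward()          # 可与交叉熵联合反向传播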
【2】 Correlation Analysis between the Robustness of Sparse Neural Networks and their Random Hidden Structural Priors 标题:稀疏神经网络的鲁棒性与其随机隐含结构先验的相关分析
作者:M. Ben Amor,J. Stier,M. Granitzer 机构:University of Passau 链接:https://arxiv.org/abs/2107.06158 摘要:深度学习模型已被证明易受对抗攻击。这一认识促使人们不仅从性能指标的角度,而且从对某些类型对抗攻击的鲁棒性的角度来分析深度学习模型。我们从图论的角度将神经网络的体系结构与它们的鲁棒性联系起来,又向前迈出了一步。我们的目的是研究稀疏神经网络的图论性质和鲁棒性之间存在的任何相关性。我们的假设是,作为神经网络结构先验的图论性质与其鲁棒性有关。为了检验这一假设,我们设计了一个实证研究,通过随机图获得的神经网络模型作为网络的稀疏结构先验。我们还评估了一个随机剪枝的全连接网络作为参考点。我们发现,鲁棒性度量与初始化方法无关,但与图的性质表现出弱相关性:图的密度越高,鲁棒性越低;而平均路径长度和平均节点偏心率越高,鲁棒性度量则表现出负相关性。我们希望激励进一步的实证和分析研究,以更严谨地检验我们的假设。 摘要:Deep learning models have been shown to be vulnerable to adversarial attacks. This perception led to analyzing deep learning models not only from the perspective of their performance measures but also their robustness to certain types of adversarial attacks. We take another step forward in relating the architectural structure of neural networks from a graph theoretic perspective to their robustness. We aim to investigate any existing correlations between graph theoretic properties and the robustness of Sparse Neural Networks. Our hypothesis is, that graph theoretic properties as a prior of neural network structures are related to their robustness. To answer to this hypothesis, we designed an empirical study with neural network models obtained through random graphs used as sparse structural priors for the networks. We additionally investigated the evaluation of a randomly pruned fully connected network as a point of reference. We found that robustness measures are independent of initialization methods but show weak correlations with graph properties: higher graph densities correlate with lower robustness, but higher average path lengths and average node eccentricities show negative correlations with robustness measures. We hope to motivate further empirical and analytical research to tightening an answer to our hypothesis.
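摘要中"图论性质 vs. 鲁棒性"相关性分析的计算骨架大致如下(Python/networkx;鲁棒性分数此处用随机数占位,实际应来自对以各随机图为稀疏结构先验的网络的对抗评测):

    import numpy as np
    import networkx as nx

    rng = np.random.default_rng(0)
    graphs = [nx.erdos_renyi_graph(30, p, seed=i)
              for i, p in enumerate(rng.uniform(0.2, 0.8, size=20))]

    density = [nx.density(g) for g in graphs]
    # 平均路径长度与离心率要求图连通;此参数范围内的ER随机图几乎必然连通
    avg_path = [nx.average_shortest_path_length(g) for g in graphs]
    avg_ecc = [float(np.mean(list(nx.eccentricity(g).values()))) for g in graphs]

    robustness = rng.uniform(size=20)    # 占位:实际为各网络的对抗鲁棒性度量
    for name, prop in [("density", density), ("avg_path", avg_path), ("avg_ecc", avg_ecc)]:
        print(name, round(float(np.corrcoef(prop, robustness)[0, 1]), 3))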
【3】 Using Causal Analysis for Conceptual Deep Learning Explanation 标题:因果分析在概念深度学习解释中的应用
作者:Sumedha Singla,Stephen Wallace,Sofia Triantafillou,Kayhan Batmanghelich 机构: Computer Science Department, University of Pittsburgh, USA, University of Pittsburgh School of Medicine, University of Pittsburgh, USA, Department of Biomedical Informatics, University of Pittsburgh, USA 备注:10 pages, 6 figures 链接:https://arxiv.org/abs/2107.06098 摘要:模型的可解释性对于在医疗领域建立可信的机器学习模型至关重要。理想的解释类似于领域专家的决策过程,并使用对临床医生有意义的概念或术语来表达。为了提供这样的解释,我们首先将分类器的隐藏单位与临床相关概念联系起来。我们利用伴随胸部X光图像的放射学报告来定义概念。我们使用线性稀疏逻辑回归发现概念和隐藏单元之间的稀疏关联。为了确保确定的单位真正影响分类器的结果,我们采用了因果推理文献中的工具,更具体地说,通过反事实干预进行中介分析。最后,我们构造一个低深度的决策树,将所有发现的概念转换成一个直接的决策规则,表达给放射科医生。我们在一个大的胸部x光数据集上评估了我们的方法,在这个数据集上,我们的模型产生了一个与临床知识一致的全局解释。 摘要:Model explainability is essential for the creation of trustworthy Machine Learning models in healthcare. An ideal explanation resembles the decision-making process of a domain expert and is expressed using concepts or terminology that is meaningful to the clinicians. To provide such an explanation, we first associate the hidden units of the classifier to clinically relevant concepts. We take advantage of radiology reports accompanying the chest X-ray images to define concepts. We discover sparse associations between concepts and hidden units using a linear sparse logistic regression. To ensure that the identified units truly influence the classifier's outcome, we adopt tools from Causal Inference literature and, more specifically, mediation analysis through counterfactual interventions. Finally, we construct a low-depth decision tree to translate all the discovered concepts into a straightforward decision rule, expressed to the radiologist. We evaluated our approach on a large chest x-ray dataset, where our model produces a global explanation consistent with clinical knowledge.
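摘要的第一步(用L1稀疏logistic回归把隐藏单元与概念关联)和最后一步(用低深度决策树给出决策规则)都可用几行sklearn示意(数据为随机占位,概念定义为人为构造;中间的因果中介分析步骤此处省略):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier, export_text

    rng = np.random.default_rng(0)
    hidden = rng.normal(size=(200, 50))                        # 分类器隐藏单元激活(占位)
    concept = (hidden[:, 3] + hidden[:, 7] > 0).astype(int)    # 某临床概念的标注(占位)

    # L1惩罚 → 概念只与少数隐藏单元建立稀疏关联
    assoc = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(hidden, concept)
    print("与该概念关联的隐藏单元:", np.flatnonzero(assoc.coef_[0]))

    # 低深度决策树把(经因果筛选的)概念转成放射科医生可读的决策规则
    concepts = np.stack([concept, rng.integers(0, 2, 200)], axis=1)
    outcome = concept                                          # 占位的分类器结论
    tree = DecisionTreeClassifier(max_depth=2).fit(concepts, outcome)
    print(export_text(tree, feature_names=["concept_A", "concept_B"]))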
【4】 Teaching Agents how to Map: Spatial Reasoning for Multi-Object Navigation 标题:教Agent如何绘制地图:多目标导航的空间推理
作者:Pierre Marza,Laetitia Matignon,Olivier Simonin,Christian Wolf 机构: LIRIS, UMR CNRS , Université de Lyon, INSA Lyon, Villeurbanne, France, Université de Lyon, Univ. Lyon , CITI Lab, INRIA Chroma team 链接:https://arxiv.org/abs/2107.06011 摘要:在视觉导航的背景下,为了使智能体能够利用其在所处地点的观察历史并有效到达已知目标,绘制新环境地图的能力是必要的。这种能力与空间推理相关:智能体能够感知空间关系和规律,并发现对象的可供性(affordances)。在经典的强化学习(RL)设置中,这种能力仅从奖励中学习。我们以辅助任务的形式引入补充监督,旨在促使为达成目标这一下游任务而训练的智能体涌现出空间感知能力。我们发现,学习估计量化给定位置的智能体与目标之间空间关系的度量,在多目标导航设置中具有很高的积极影响。我们的方法显著提高了不同基线智能体的性能,这些智能体构建环境的显式或隐式表示,甚至可以匹配以真实地图作为输入、本不具可比性的oracle智能体的性能。 摘要:In the context of visual navigation, the capacity to map a novel environment is necessary for an agent to exploit its observation history in the considered place and efficiently reach known goals. This ability can be associated with spatial reasoning, where an agent is able to perceive spatial relationships and regularities, and discover object affordances. In classical Reinforcement Learning (RL) setups, this capacity is learned from reward alone. We introduce supplementary supervision in the form of auxiliary tasks designed to favor the emergence of spatial perception capabilities in agents trained for a goal-reaching downstream objective. We show that learning to estimate metrics quantifying the spatial relationships between an agent at a given location and a goal to reach has a high positive impact in Multi-Object Navigation settings. Our method significantly improves the performance of different baseline agents, that either build an explicit or implicit representation of the environment, even matching the performance of incomparable oracle agents taking ground-truth maps as input.
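摘要所述辅助监督的一种常见实现方式,是在策略网络上附加一个预测"智能体与目标之间空间度量"(如距离)的头,并与RL目标联合优化;下面是按这一思路构造的示意(PyTorch;网络规模、辅助损失权重均为假设,RL损失以占位代替,并非论文的具体实现):

    import torch
    import torch.nn as nn

    class AgentWithAuxHead(nn.Module):
        def __init__(self, obs_dim=64, n_actions=4):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
            self.policy = nn.Linear(128, n_actions)
            self.aux = nn.Linear(128, 1)     # 辅助任务:预测到目标的距离

        def forward(self, obs):
            h = self.encoder(obs)
            return self.policy(h), self.aux(h).squeeze(-1)

    model = AgentWithAuxHead()
    logits, pred_dist = model(torch.randn(16, 64))
    true_dist = torch.rand(16)               # 训练时可由模拟器提供的真实距离
    rl_loss = torch.tensor(0.0)              # 占位:实际为PPO/A2C等策略梯度损失
    aux_loss = nn.functional.mse_loss(pred_dist, true_dist)
    (rl_loss + 0.5 * aux_loss).backward()    # 权重0.5为演示假设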
【5】 Experience Report: Deep Learning-based System Log Analysis for Anomaly Detection 标题:经验报告:基于深度学习的系统日志分析异常检测
作者:Zhuangbin Chen,Jinyang Liu,Wenwei Gu,Yuxin Su,Michael R. Lyu 机构:The Chinese University of Hong Kong, Hong Kong, China 链接:https://arxiv.org/abs/2107.05908 摘要:日志是保证许多软件系统,特别是大型分布式系统的可靠性和连续性必不可少的资源。它们忠实地记录运行时信息,以便于系统故障排除和行为理解。由于现代软件系统的大规模和复杂性,日志的数量达到了前所未有的水平。因此,对于基于日志的异常检测,传统的手工检测方法甚至传统的基于机器学习的方法都变得不切实际,这成为基于深度学习的解决方案快速发展的催化剂。然而,目前有代表性的基于测井的异常检测方法中,对神经网络模型的异常检测缺乏严格的比较。此外,重新实现过程需要付出大量的努力,而且很容易引入偏差。为了更好地了解不同异常探测器的特点,本文对六种最新方法所使用的五种流行模型进行了综合评述和评价。特别地,所选择的方法中有四种是无监督的,其余两种是有监督的。使用两个公开的日志数据集对这些方法进行了评估,其中包含近1600万条日志消息和40万个异常实例。我们相信,我们的工作可以作为这一领域的基础,并有助于未来的学术研究和工业应用。 摘要:Logs have been an imperative resource to ensure the reliability and continuity of many software systems, especially large-scale distributed systems. They faithfully record runtime information to facilitate system troubleshooting and behavior understanding. Due to the large scale and complexity of modern software systems, the volume of logs has reached an unprecedented level. Consequently, for log-based anomaly detection, conventional methods of manual inspection or even traditional machine learning-based methods become impractical, which serve as a catalyst for the rapid development of deep learning-based solutions. However, there is currently a lack of rigorous comparison among the representative log-based anomaly detectors which resort to neural network models. Moreover, the re-implementation process demands non-trivial efforts and bias can be easily introduced. To better understand the characteristics of different anomaly detectors, in this paper, we provide a comprehensive review and evaluation on five popular models used by six state-of-the-art methods. Particularly, four of the selected methods are unsupervised and the remaining two are supervised. These methods are evaluated with two publicly-available log datasets, which contain nearly 16 millions log messages and 0.4 million anomaly instances in total. We believe our work can serve as a basis in this field and contribute to the future academic researches and industrial applications.
检测相关(1篇)
【1】 Intermittent Jamming against Telemetry and Telecommand of Satellite Systems and A Learning-driven Detection Strategy 标题:卫星系统遥测遥控间歇干扰及学习驱动检测策略
作者:Selen Gecgel,Gunes Karabulut Kurt 链接:https://arxiv.org/abs/2107.06181 摘要:随着第六代网络(6G)的发展,卫星通信系统,特别是基于低轨(LEO)网络的卫星通信系统,由于其独特的综合性能而具有广阔的应用前景。这些优势伴随着各种各样的挑战,如安全漏洞、混合系统的管理和高移动性。本文首先从概念上分析了物理层的安全缺陷,结合卫星系统的网络物理特性,突出了潜在的攻击。其次,提出了一种学习驱动的检测方案,并设计了轻量级卷积神经网络(CNN)。将所设计的CNN结构的性能与一种流行的机器学习算法支持向量机(SVM)进行了比较。结果表明,该方案能够检测出针对卫星系统的缺陷攻击。 摘要:Towards sixth-generation networks (6G), satellite communication systems, especially based on Low Earth Orbit (LEO) networks, become promising due to their unique and comprehensive capabilities. These advantages are accompanied by a variety of challenges such as security vulnerabilities, management of hybrid systems, and high mobility. In this paper, firstly, a security deficiency in the physical layer is addressed with a conceptual framework, considering the cyber-physical nature of the satellite systems, highlighting the potential attacks. Secondly, a learning-driven detection scheme is proposed, and the lightweight convolutional neural network (CNN) is designed. The performance of the designed CNN architecture is compared with a prevalent machine learning algorithm, support vector machine (SVM). The results show that deficiency attacks against the satellite systems can be detected by employing the proposed scheme.
分类|识别(5篇)
【1】 Timbre Classification of Musical Instruments with a Deep Learning Multi-Head Attention-Based Model 标题:基于深度学习多头注意力模型的乐器音色分类
作者:Carlos Hernandez-Olivan,Jose R. Beltran 机构:Universidad de Zaragoza 链接:https://arxiv.org/abs/2107.06231 摘要:这项工作的目的是定义一个基于深度学习的模型,使其能够用尽可能少的参数识别不同的乐器音色。为此,我们使用了以不同力度演奏的古典管弦乐器,它们分属少数几个乐器族,并在相同的音高范围内演奏音符。由此可以评估即使在乐器以相同力度演奏同一音符时、仅凭音色对乐器进行分类的能力。所采用的网络使用多头注意力机制(8个头),输出端为一个稠密网络,并以声音样本的log-mel幅度谱图作为输入。这个网络能够识别古典管弦乐队的20个乐器类别,总体F$_1$值达到0.62。我们对注意力层的权重进行了分析,并给出了模型的混淆矩阵,从而评估所提出的体系结构区分音色的能力,并确定未来工作应关注的方面。 摘要:The aim of this work is to define a model based on deep learning that is able to identify different instrument timbres with as few parameters as possible. For this purpose, we have worked with classical orchestral instruments played with different dynamics, which are part of a few instrument families and which play notes in the same pitch range. It has been possible to assess the ability to classify instruments by timbre even if the instruments are playing the same note with the same intensity. The network employed uses a multi-head attention mechanism, with 8 heads and a dense network at the output taking as input the log-mel magnitude spectrograms of the sound samples. This network allows the identification of 20 instrument classes of the classical orchestra, achieving an overall F$_1$ value of 0.62. An analysis of the weights of the attention layer has been performed and the confusion matrix of the model is presented, allowing us to assess the ability of the proposed architecture to distinguish timbre and to establish the aspects on which future work should focus.
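摘要描述的结构(log-mel幅度谱图输入 → 8头注意力 → 稠密分类头输出20类)大致可示意如下(PyTorch;嵌入维度与池化方式为演示用假设,并非论文的完整结构):

    import torch
    import torch.nn as nn

    class TimbreClassifier(nn.Module):
        def __init__(self, n_mels=128, n_classes=20, d_model=128, n_heads=8):
            super().__init__()
            self.proj = nn.Linear(n_mels, d_model)
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.head = nn.Linear(d_model, n_classes)

        def forward(self, spec):              # spec: (batch, 帧数, n_mels)
            x = self.proj(spec)
            x, weights = self.attn(x, x, x)   # 自注意力;weights可用于权重分析
            return self.head(x.mean(dim=1))   # 时间维平均池化后分类

    model = TimbreClassifier()
    logits = model(torch.randn(4, 200, 128))  # → (4, 20)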
【2】 What classifiers know what they don't? 标题:哪些分类器知道自己不知道什么?
作者:Mohamed Ishmael Belghazi,David Lopez-Paz 机构:Facebook AI Research, Paris, France 备注:27 pages 链接:https://arxiv.org/abs/2107.06217 摘要:面对未知时的不确定性是智能决策的关键。然而,机器学习算法缺乏对其预测不确定性的可靠估计。这导致模型在遇到训练中未见过的类别时做出错误且过于自信的决定。尽管为分类器配备适合现实世界的不确定性估计非常重要,但以前的工作主要集中在小数据集上,且训练数据与测试数据之间很少或没有类别差异。为了弥补这一差距,我们引入UIMNET:一个真实的、ImageNet规模的测试平台,用于评估深度图像分类器的预测不确定性估计。我们的基准测试提供了八种最先进算法、六种不确定性度量、四种域内指标、三种域外指标的实现,以及用于训练、校准、集成、选择和评估模型的全自动管道。我们的测试平台是开源的,所有结果都可以从存储库中的固定提交中复现。添加新的数据集、算法、度量或指标只是几行代码的问题,我们希望UIMNET由此成为现实、严格和可复现的不确定性估计研究的垫脚石。我们的结果表明,ERM分类器的集成和单个MIMO分类器是目前度量域内和域外类别不确定性的两个最佳选择。 摘要:Being uncertain when facing the unknown is key to intelligent decision making. However, machine learning algorithms lack reliable estimates about their predictive uncertainty. This leads to wrong and overly-confident decisions when encountering classes unseen during training. Despite the importance of equipping classifiers with uncertainty estimates ready for the real world, prior work has focused on small datasets and little or no class discrepancy between training and testing data. To close this gap, we introduce UIMNET: a realistic, ImageNet-scale test-bed to evaluate predictive uncertainty estimates for deep image classifiers. Our benchmark provides implementations of eight state-of-the-art algorithms, six uncertainty measures, four in-domain metrics, three out-domain metrics, and a fully automated pipeline to train, calibrate, ensemble, select, and evaluate models. Our test-bed is open-source and all of our results are reproducible from a fixed commit in our repository. Adding new datasets, algorithms, measures, or metrics is a matter of a few lines of code, in the hope that UIMNET becomes a stepping stone towards realistic, rigorous, and reproducible research in uncertainty estimation. Our results show that ensembles of ERM classifiers as well as single MIMO classifiers are the two best alternatives currently available to measure uncertainty about both in-domain and out-domain classes.
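在此类测试平台比较的各种不确定性度量中,最常见的一种是"集成平均概率的预测熵",其计算本身很短(numpy;与UIMNET的具体实现无关):

    import numpy as np

    def predictive_entropy(member_probs):
        # member_probs: (集成成员数, 样本数, 类别数) 的softmax输出
        p = member_probs.mean(axis=0)                    # 集成平均概率
        return -(p * np.log(p + 1e-12)).sum(axis=-1)     # 熵越大 → 越不确定

    rng = np.random.default_rng(0)
    confident = np.tile([[0.97, 0.02, 0.01]], (5, 1, 1))  # 5个成员一致且自信
    disagree = rng.dirichlet(np.ones(3), size=(5, 1))     # 5个成员意见分歧
    print(predictive_entropy(confident), predictive_entropy(disagree))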
【3】 Emotion Recognition for Healthcare Surveillance Systems Using Neural Networks: A Survey 标题:基于神经网络的医疗监护系统情感识别研究综述
作者:Marwan Dhuheir,Abdullatif Albaseer,Emna Baccour,Aiman Erbad,Mohamed Abdallah,Mounir Hamdi 备注:conference paper accepted and presented at 17th Int. Wireless Communications & Mobile Computing Conference - IWCMC 2021, Harbin, China 链接:https://arxiv.org/abs/2107.05989 摘要:由于技术的进步,利用深度学习技术来识别患者的情绪已经引起了广泛的关注。自动识别情绪有助于建立智能医疗中心,能够检测患者的抑郁和压力,以便尽早开始用药。使用先进的技术来识别情感是最令人兴奋的话题之一,因为它定义了人与机器之间的关系。机器通过采用各种方法学会了如何预测情绪。在这项调查中,我们介绍了最近的研究领域中使用神经网络来识别情绪。我们重点研究从语音、面部表情和视听输入中识别情感,并展示了在现实世界中部署这些算法的不同技术。这三种情绪识别技术可以作为医疗中心的监控系统来监控患者。最后,我们提出了本研究所面临的挑战和未来的相关工作,以期对情绪识别的应用有一个更深入的了解。 摘要:Recognizing the patient's emotions using deep learning techniques has attracted significant attention recently due to technological advancements. Automatically identifying the emotions can help build smart healthcare centers that can detect depression and stress among the patients in order to start the medication early. Using advanced technology to identify emotions is one of the most exciting topics as it defines the relationships between humans and machines. Machines learned how to predict emotions by adopting various methods. In this survey, we present recent research in the field of using neural networks to recognize emotions. We focus on studying emotions' recognition from speech, facial expressions, and audio-visual input and show the different techniques of deploying these algorithms in the real world. These three emotion recognition techniques can be used as a surveillance system in healthcare centers to monitor patients. We conclude the survey with a presentation of the challenges and the related future work to provide an insight into the applications of using emotion recognition.
【4】 Multi-Scale Label Relation Learning for Multi-Label Classification Using 1-Dimensional Convolutional Neural Networks 标题:基于一维卷积神经网络的多标签分类多尺度标签关系学习
作者:Junhyung Kim,Byungyoon Park,Charmgil Hong 机构:School of Computer Science and Electrical Engineering, Handong Global University, Pohang, South Korea 链接:https://arxiv.org/abs/2107.05941 摘要:提出了一种基于一维卷积核、在多尺度上学习标签依赖关系的多标签分类(MLC)新方法——多尺度标签依赖关系网络(MSDN)。现代多标签分类器一直采用循环神经网络(RNNs)作为记忆结构来捕获和利用标签依赖关系。然而,基于RNN的MLC模型往往会引入大量的参数,这些参数可能会导致欠拟合/过拟合问题。该方法利用一维卷积神经网络(1D-CNN)以更有效的方式达到同样的目的。通过训练具有多个核尺寸的模型,该方法可以在多个尺度上学习标签之间的依赖关系,同时使用的参数数量大大减少。在公开基准数据集上,我们证明了与基于RNN的MLC模型相比,我们的模型可以用少得多的模型参数获得更好的精度。 摘要:We present Multi-Scale Label Dependence Relation Networks (MSDN), a novel approach to multi-label classification (MLC) using 1-dimensional convolution kernels to learn label dependencies at multi-scale. Modern multi-label classifiers have been adopting recurrent neural networks (RNNs) as a memory structure to capture and exploit label dependency relations. The RNN-based MLC models however tend to introduce a very large number of parameters that may cause under-/over-fitting problems. The proposed method uses the 1-dimensional convolutional neural network (1D-CNN) to serve the same purpose in a more efficient manner. By training a model with multiple kernel sizes, the method is able to learn the dependency relations among labels at multiple scales, while it uses a drastically smaller number of parameters. With public benchmark datasets, we demonstrate that our model can achieve better accuracies with much smaller number of model parameters compared to RNN-based MLC models.
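摘要的核心模块——用多个卷积核尺寸的一维CNN在多尺度上建模标签依赖——可示意为(PyTorch;标签数、嵌入维度与核尺寸均为演示假设):

    import torch
    import torch.nn as nn

    class MultiScaleLabelConv(nn.Module):
        def __init__(self, n_labels=20, emb_dim=32, kernel_sizes=(2, 3, 5), n_filters=16):
            super().__init__()
            self.emb = nn.Embedding(n_labels, emb_dim)
            # 不同的一维卷积核尺寸对应不同尺度上的标签依赖关系
            self.convs = nn.ModuleList([
                nn.Conv1d(emb_dim, n_filters, k, padding=k // 2) for k in kernel_sizes
            ])
            self.out = nn.Linear(n_filters * len(kernel_sizes), n_labels)

        def forward(self, label_seq):                 # label_seq: (batch, 序列长度)
            x = self.emb(label_seq).transpose(1, 2)   # → (batch, emb_dim, 长度)
            feats = [torch.relu(c(x)).max(dim=-1).values for c in self.convs]
            return self.out(torch.cat(feats, dim=-1)) # 各标签的logits

    logits = MultiScaleLabelConv()(torch.randint(0, 20, (4, 7)))   # → (4, 20)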
【5】 Stress Classification and Personalization: Getting the most out of the least 标题:压力分类和个性化:从最少的东西中获得最多
作者:Ramesh Kumar Sah,Hassan Ghasemzadeh 机构:School of Electrical Engineering and Computer Science, Washington State University 备注:4 pages, 4 figures, IEEE International Conference on Wearable and Implantable Body Sensor Networks 链接:https://arxiv.org/abs/2107.05666 摘要:压力检测和监测是一个活跃的研究领域,对个人的个人、职业和社会健康具有重要意义。目前的情感状态分类方法使用传统的机器学习算法,其特征由多个传感器模态计算得到。这些方法是数据密集型的,并且依赖手工构造的特征,这妨碍了这些传感器系统在日常生活中的实际应用。为了克服这些缺点,我们提出了一种新的基于卷积神经网络(CNN)的压力检测和分类框架,不需要任何特征计算,只使用来自单一传感器模态的数据。我们的方法具有竞争力,优于目前最先进的技术,实现了92.85%的分类准确率和0.89的F$_1$分数。通过留一受试者(leave-one-subject-out)分析,我们也展示了个性化压力模型的重要性。 摘要:Stress detection and monitoring is an active area of research with important implications for the personal, professional, and social health of an individual. Current approaches for affective state classification use traditional machine learning algorithms with features computed from multiple sensor modalities. These methods are data-intensive and rely on hand-crafted features which impede the practical applicability of these sensor systems in daily lives. To overcome these shortcomings, we propose a novel Convolutional Neural Network (CNN) based stress detection and classification framework without any feature computation using data from only one sensor modality. Our method is competitive and outperforms current state-of-the-art techniques and achieves a classification accuracy of $92.85\%$ and an $F_1$ score of $0.89$. Through our leave-one-subject-out analysis, we also show the importance of personalizing stress models.
表征(4篇)
【1】 On Designing Good Representation Learning Models 标题:论设计良好的表征学习模型
作者:Qinglin Li,Bin Li,Jonathan M Garibaldi,Guoping Qiu 备注:15 pages, 链接:https://arxiv.org/abs/2107.05948 摘要:表征学习的目标不同于决策等机器学习的最终目标,因此很难为表征学习模型的训练建立清晰直接的目标。有人认为,一个好的表示应该解耦潜在的变化因素,但如何将其转化为训练目标仍然是未知的。本文试图建立直接的训练准则和设计原则,以开发良好的表征学习模型。我们提出,一个好的表征学习模型应该具有最大的表达能力,即能够区分最大数量的输入配置。我们正式定义了表达能力,并引入了一般学习模型的最大表达能力(MEXS)定理。我们建议在训练模型时最大化其表达能力,同时纳入模型光滑性等一般先验。我们提出了一种带良知机制的竞争学习算法,该算法在遵循模型光滑性先验的同时,促使模型达到其MEXS。我们还引入了标签一致性训练(LCT)技术,通过鼓励模型为相似的样本分配一致的标签来提高模型的光滑性。我们给出了大量的实验结果,表明我们的方法确实可以设计出表征学习模型,其学到的表示与现有技术相当甚至更好。我们还表明,我们的技术计算效率高,对不同的参数设置具有鲁棒性,并能在各种数据集上有效工作。 摘要:The goal of representation learning is different from the ultimate objective of machine learning such as decision making, it is therefore very difficult to establish clear and direct objectives for training representation learning models. It has been argued that a good representation should disentangle the underlying variation factors, yet how to translate this into training objectives remains unknown. This paper presents an attempt to establish direct training criterions and design principles for developing good representation learning models. We propose that a good representation learning model should be maximally expressive, i.e., capable of distinguishing the maximum number of input configurations. We formally define expressiveness and introduce the maximum expressiveness (MEXS) theorem of a general learning model. We propose to train a model by maximizing its expressiveness while at the same time incorporating general priors such as model smoothness. We present a conscience competitive learning algorithm which encourages the model to reach its MEXS whilst at the same time adheres to model smoothness prior. We also introduce a label consistent training (LCT) technique to boost model smoothness by encouraging it to assign consistent labels to similar samples. We present extensive experimental results to show that our method can indeed design representation learning models capable of developing representations that are as good as or better than state of the art. We also show that our technique is computationally efficient, robust against different parameter settings and can work effectively on a variety of datasets.
【2】 Exploiting Network Structures to Improve Semantic Representation for the Financial Domain 标题:利用网络结构改进金融领域的语义表示
作者:Chao Feng,Shi-jie We 机构:University of Zurich, Harbin Institute of Technology 备注:5 pages, 4 figures 链接:https://arxiv.org/abs/2107.05885 摘要:本文介绍了MiniTrue团队参与FinSim-3共享任务(英语金融领域语义相似性学习)的情况。我们的方法将基于transformer的语言模型学习到的上下文嵌入与从外部知识源中提取的网络结构嵌入相结合,以创建金融领域实体和术语的更有意义的表示。为此,使用了两个基于BERT的语言模型和一个知识图谱嵌入模型。此外,我们提出了一个投票函数来联合三个基本模型进行最终推理。实验结果表明,带有知识图谱嵌入的模型比仅使用上下文嵌入的模型取得了更优的结果。此外,我们也观察到,我们的投票函数为最终系统带来了额外的收益。 摘要:This paper presents the participation of the MiniTrue team in the FinSim-3 shared task on learning semantic similarities for the financial domain in English language. Our approach combines contextual embeddings learned by transformer-based language models with network structures embeddings extracted on external knowledge sources, to create more meaningful representations of financial domain entities and terms. For this, two BERT based language models and a knowledge graph embedding model are used. Besides, we propose a voting function to join three basic models for the final inference. Experimental results show that the model with the knowledge graph embeddings has achieved a superior result compared to these models with only contextual embeddings. Nevertheless, we also observe that our voting function brings an extra benefit to the final system.
【3】 Codified audio language modeling learns useful representations for music information retrieval 标题:编码音频语言建模学习用于音乐信息检索的有用表示
作者:Rodrigo Castellon,Chris Donahue,Percy Liang 机构:Stanford University 备注:To appear in the proceedings of ISMIR 2021 链接:https://arxiv.org/abs/2107.05677 摘要:我们证明,在编码(离散编码)音乐音频上预训练的语言模型学到的表示对下游MIR(音乐信息检索)任务是有用的。具体来说,我们探讨了Jukebox(Dhariwal et al. 2020)的表示:这是一个音乐生成系统,其中包含一个在来自100万首歌曲的编码音频上训练的语言模型。为了确定Jukebox的表示是否包含对MIR有用的信息,我们将它们用作输入特征,在若干MIR任务上训练浅层模型。相对于在标签预测上预训练的传统MIR模型的表示,我们发现使用Jukebox的表示作为输入特征,在标签预测、流派分类、情感识别和调性检测四个MIR任务中平均带来30%的性能提升。对于调性检测,我们观察到Jukebox的表示比在标签预测上预训练的模型的表示强得多,这表明通过编码音频语言建模进行预训练可以弥补传统方法中的盲点。我们将Jukebox表示的强度解释为如下证据:对音频本身而非标签进行建模,可以为MIR提供更丰富的表示。 摘要:We demonstrate that language models pre-trained on codified (discretely-encoded) music audio learn representations that are useful for downstream MIR tasks. Specifically, we explore representations from Jukebox (Dhariwal et al. 2020): a music generation system containing a language model trained on codified audio from 1M songs. To determine if Jukebox's representations contain useful information for MIR, we use them as input features to train shallow models on several MIR tasks. Relative to representations from conventional MIR models which are pre-trained on tagging, we find that using representations from Jukebox as input features yields 30% stronger performance on average across four MIR tasks: tagging, genre classification, emotion recognition, and key detection. For key detection, we observe that representations from Jukebox are considerably stronger than those from models pre-trained on tagging, suggesting that pre-training via codified audio language modeling may address blind spots in conventional approaches. We interpret the strength of Jukebox's representations as evidence that modeling audio instead of tags provides richer representations for MIR.
【4】 Optimal input representation in neural systems at the edge of chaos 标题:混沌边缘神经系统的最优输入表示
作者:Guillermo B. Morales,Miguel A. Muñoz 机构:Departamento de Electromagnetismo y Física de la Materia, Instituto Carlos I de Física Teórica y Computacional., Universidad de Granada, E-, Granada, Spain 链接:https://arxiv.org/abs/2107.05709 摘要:阐明生物系统如何在嘈杂的环境中表现、处理和储存信息是一个关键而富有挑战性的目标。一个刺激的,尽管有争议的假设提出,在接近相变边缘的动态状态下运行,即在临界状态或“混沌边缘”,可以为信息处理生命系统提供重要的操作优势,例如,在鲁棒性和灵活性之间创造最佳的权衡。在这里,我们详细阐述了一个最新的理论结果,该结果确定了以稳健的方式表示复杂输入的神经网络的协方差矩阵的频谱需要以秩的幂律衰减,指数接近于单位,这一结果确实在小鼠视皮层的神经元中得到了实验验证。为了理解和模仿这些结果,我们构造了一个人工神经网络,并训练它对图像进行分类。值得注意的是,当网络在临界点附近运行时,协方差矩阵的本征谱与实际神经元的统计特性完全相同,在这种情况下获得了最佳的性能。因此,我们得出结论,在临界点附近操作,除了通常所说的优点之外,还具有允许灵活、健壮和高效的输入表示的优点。 摘要:Shedding light onto how biological systems represent, process and store information in noisy environments is a key and challenging goal. A stimulating, though controversial, hypothesis poses that operating in dynamical regimes near the edge of a phase transition, i.e. at criticality or the "edge of chaos", can provide information-processing living systems with important operational advantages, creating, e.g., an optimal trade-off between robustness and flexibility. Here, we elaborate on a recent theoretical result, which establishes that the spectrum of covariance matrices of neural networks representing complex inputs in a robust way needs to decay as a power-law of the rank, with an exponent close to unity, a result that has been indeed experimentally verified in neurons of the mouse visual cortex. Aimed at understanding and mimicking these results, we construct an artificial neural network and train it to classify images. Remarkably, we find that the best performance in such a task is obtained when the network operates near the critical point, at which the eigenspectrum of the covariance matrix follows the very same statistics as actual neurons do. Thus, we conclude that operating near criticality can also have -- besides the usually alleged virtues -- the advantage of allowing for flexible, robust and efficient input representations.
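摘要中的关键量——协方差矩阵特征谱随秩按幂律衰减且指数接近1——可以这样数值验证(numpy;响应矩阵按目标谱人为构造,仅用于演示拟合流程):

    import numpy as np

    rng = np.random.default_rng(0)
    n_neurons, n_stimuli = 200, 5000
    target = np.arange(1, n_neurons + 1) ** -1.0       # 期望谱:特征值 ∝ 1/rank
    basis, _ = np.linalg.qr(rng.normal(size=(n_neurons, n_neurons)))
    mixing = (basis * np.sqrt(target)) @ basis.T       # 协方差的对称平方根
    responses = rng.normal(size=(n_stimuli, n_neurons)) @ mixing

    cov = np.cov(responses, rowvar=False)
    eig = np.sort(np.linalg.eigvalsh(cov))[::-1]

    # 在log-log坐标下线性拟合,斜率的相反数即幂律指数 alpha
    ranks = np.arange(1, len(eig) + 1)
    sel = slice(1, 100)                                # 忽略最头部与噪声尾部
    alpha = -np.polyfit(np.log(ranks[sel]), np.log(eig[sel]), 1)[0]
    print(round(float(alpha), 2))                      # 应大致接近 1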
3D|3D重建等相关(2篇)
【1】 'CADSketchNet' -- An Annotated Sketch dataset for 3D CAD Model Retrieval with Deep Neural Networks 标题:CADSketchNet--一种用于深度神经网络三维CAD模型检索的注释草图数据集
作者:Bharadwaj Manda,Shubham Dhayarkar,Sai Mitheran,V. K. Viekash,Ramanathan Muthuganapathy 机构:Indian Institute of Technology Madras, National Institute of Technology Tiruchirappalli 备注:Computers & Graphics Journal, Special Section on 3DOR 2021 链接:https://arxiv.org/abs/2107.06212 摘要:三维建模和数字存档领域的不断进步导致了数字存储数据量的激增。因此,根据存储在这些数据库中的数据类型,开发了若干检索系统。然而,与文本数据或图像不同,对三维模型进行检索并非易事。在三维模型中,由于存在孔洞、体特征、锐边等,检索三维工程/CAD模型或机械零件更具挑战性,这使CAD本身成为一个独立的领域。本文的研究工作旨在构建一个适合于建立基于深度学习的三维CAD模型检索系统的数据集。我们从可用的CAD数据库中收集三维CAD模型,并准备了一个计算机生成的草图数据集,称为'CADSketchNet'。此外,零部件的手绘草图也被添加到CADSketchNet中。利用该数据集的草图图像,本文还旨在评估各种以草图图像作为输入查询的三维CAD模型检索系统或搜索引擎的性能。我们在CADSketchNet上构建并测试了多个实验模型。本文报告了这些实验、模型架构与相似性度量的选择,以及相应的搜索结果。 摘要:Ongoing advancements in the fields of 3D modelling and digital archiving have led to an outburst in the amount of data stored digitally. Consequently, several retrieval systems have been developed depending on the type of data stored in these databases. However, unlike text data or images, performing a search for 3D models is non-trivial. Among 3D models, retrieving 3D Engineering/CAD models or mechanical components is even more challenging due to the presence of holes, volumetric features, presence of sharp edges etc., which make CAD a domain unto itself. The research work presented in this paper aims at developing a dataset suitable for building a retrieval system for 3D CAD models based on deep learning. 3D CAD models from the available CAD databases are collected, and a dataset of computer-generated sketch data, termed 'CADSketchNet', has been prepared. Additionally, hand-drawn sketches of the components are also added to CADSketchNet. Using the sketch images from this dataset, the paper also aims at evaluating the performance of various retrieval system or a search engine for 3D CAD models that accepts a sketch image as the input query. Many experimental models are constructed and tested on CADSketchNet. These experiments, along with the model architecture, choice of similarity metrics are reported along with the search results.
【2】 Combining 3D Image and Tabular Data via the Dynamic Affine Feature Map Transform 标题:基于动态仿射要素地图变换的三维图像与表格数据融合
作者:Sebastian Pölsterl,Tom Nuno Wolf,Christian Wachinger 机构:Artificial Intelligence in Medical Imaging (AI-Med), Department of Child and Adolescent Psychiatry, Ludwig-Maximilians-Universität, Munich, Germany 备注:Accepted at 2021 International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 链接:https://arxiv.org/abs/2107.05990 摘要:先前利用脑部磁共振图像诊断阿尔茨海默病的研究表明,卷积神经网络(CNNs)可以利用高维图像信息对患者进行分类。然而,很少有研究关注这些模型如何利用通常为低维的表格信息,如患者人口统计学数据或实验室测量值。我们介绍了动态仿射特征图变换(DAFT),这是一个用于CNN的通用模块,它根据患者的表格临床信息,动态地对卷积层的特征图进行重缩放和平移。我们发现,DAFT在结合3D图像和表格信息进行痴呆诊断和预测痴呆发生时间方面非常有效,分别以0.622的平均平衡准确率和0.748的平均c指数优于与之竞争的CNN。我们广泛的消融研究为DAFT的结构特性提供了有价值的见解。我们的实现可在https://github.com/ai-med/DAFT 获取。 摘要:Prior work on diagnosing Alzheimer's disease from magnetic resonance images of the brain established that convolutional neural networks (CNNs) can leverage the high-dimensional image information for classifying patients. However, little research focused on how these models can utilize the usually low-dimensional tabular information, such as patient demographics or laboratory measurements. We introduce the Dynamic Affine Feature Map Transform (DAFT), a general-purpose module for CNNs that dynamically rescales and shifts the feature maps of a convolutional layer, conditional on a patient's tabular clinical information. We show that DAFT is highly effective in combining 3D image and tabular information for diagnosis and time-to-dementia prediction, where it outperforms competing CNNs with a mean balanced accuracy of 0.622 and mean c-index of 0.748, respectively. Our extensive ablation study provides valuable insights into the architectural properties of DAFT. Our implementation is available at https://github.com/ai-med/DAFT.
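摘要所述机制——由表格临床信息为卷积特征图动态生成按通道的缩放与平移——本质上是特征级仿射调制;以下示意并非官方DAFT实现(官方代码见上述GitHub链接),维度与MLP结构均为演示假设:

    import torch
    import torch.nn as nn

    class AffineFromTabular(nn.Module):
        """由表格数据为每个通道生成 (scale, shift),调制3D卷积特征图。"""
        def __init__(self, n_tabular=8, n_channels=32):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(n_tabular, 16), nn.ReLU(),
                nn.Linear(16, 2 * n_channels),
            )

        def forward(self, fmap, tabular):      # fmap: (B, C, D, H, W)
            scale, shift = self.mlp(tabular).chunk(2, dim=-1)
            scale = scale.view(*scale.shape, 1, 1, 1)    # 广播到空间维度
            shift = shift.view(*shift.shape, 1, 1, 1)
            return (1 + scale) * fmap + shift            # 以恒等变换为中心

    mod = AffineFromTabular()
    out = mod(torch.randn(2, 32, 8, 8, 8), torch.randn(2, 8))   # 输出形状不变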
优化|敛散性(2篇)
【1】 Motion Planning by Learning the Solution Manifold in Trajectory Optimization 标题:轨迹优化中基于解流形学习的运动规划
作者:Takayuki Osa 机构:Kyushu Institute of Technology, Department of Human Intelligence Systems & Research Center for Neuromorphic AI Hardware, Behavior Learning Systems Laboratory 备注:24 pages, to appear in the International Journal of Robotics Research 链接:https://arxiv.org/abs/2107.05842 摘要:轨迹优化中使用的目标函数通常是非凸的,可以有无穷多个局部最优解。在这种情况下,有不同的解决方案来执行给定的任务。虽然有一些方法可以找到运动规划的多个解决方案,但它们仅限于生成一组有限的解决方案。为了解决这个问题,我们提出了一种在轨迹优化中学习无穷解集的优化方法。在我们的框架中,通过学习解的潜在表示来获得不同的解。我们的方法可以解释为训练一个深层的无碰撞轨迹生成模型来进行运动规划。实验结果表明,训练后的模型代表了运动规划问题的无穷多个同伦解。 摘要:The objective function used in trajectory optimization is often non-convex and can have an infinite set of local optima. In such cases, there are diverse solutions to perform a given task. Although there are a few methods to find multiple solutions for motion planning, they are limited to generating a finite set of solutions. To address this issue, we present an optimization method that learns an infinite set of solutions in trajectory optimization. In our framework, diverse solutions are obtained by learning latent representations of solutions. Our approach can be interpreted as training a deep generative model of collision-free trajectories for motion planning. The experimental results indicate that the trained model represents an infinite set of homotopic solutions for motion planning problems.
【2】 Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges 标题:超参数优化:基础、算法、最佳实践和开放挑战
作者:Bernd Bischl,Martin Binder,Michel Lang,Tobias Pielok,Jakob Richter,Stefan Coors,Janek Thomas,Theresa Ullmann,Marc Becker,Anne-Laure Boulesteix,Difan Deng,Marius Lindauer 备注:67 pages, 13 figures, to be published in WIREs: Data Mining and Knowledge Discovery 链接:https://arxiv.org/abs/2107.05847 摘要:大多数机器学习算法都由一个或多个超参数配置,这些超参数必须仔细选择,且往往对性能有很大影响。为了避免通过费时且不可复现的手动试错过程来寻找性能良好的超参数配置,可以采用各种自动超参数优化(HPO)方法,例如基于有监督机器学习的重采样误差估计的方法。本文在从总体角度介绍HPO之后,综述了网格或随机搜索、进化算法、贝叶斯优化、Hyperband和racing等重要的HPO方法,并就执行HPO时需要做出的重要选择给出了实用建议,包括HPO算法本身、性能评估、如何将HPO与ML管道相结合、运行时改进和并行化。 摘要:Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization.
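作为综述中最基础的HPO方法之一,"随机搜索 + 交叉验证的重采样误差估计"只需几行即可写出(Python/sklearn;搜索空间与评估预算为演示假设):

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    X, y = load_digits(return_X_y=True)
    rng = np.random.default_rng(0)
    best_cfg, best_score = None, -np.inf
    for _ in range(20):                       # 评估预算:20个随机抽取的配置
        cfg = {
            "n_estimators": int(rng.integers(50, 300)),
            "max_depth": int(rng.integers(3, 20)),
            "max_features": float(rng.uniform(0.1, 1.0)),
        }
        # 用交叉验证的重采样误差估计作为该超参数配置的性能
        score = cross_val_score(
            RandomForestClassifier(**cfg, random_state=0), X, y, cv=3).mean()
        if score > best_score:
            best_cfg, best_score = cfg, score
    print(best_cfg, round(best_score, 4))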
预测|估计(6篇)
【1】 Smoothed Bernstein Online Aggregation for Day-Ahead Electricity Demand Forecasting 标题:用于日前电力需求预测的平滑Bernstein在线聚合
作者:Florian Ziel 机构:University of Duisburg-Essen 链接:https://arxiv.org/abs/2107.06268 摘要:我们提出了在IEEE DataPort日前电力需求预测竞赛"后COVID范式"中获胜的方法。该日前负荷预测方法基于多个点预测模型的在线预测组合,包括四个步骤:i)数据清理和预处理,ii)假日调整过程,iii)单个预测模型的训练,iv)通过平滑Bernstein在线聚合(BOA)进行预测组合。这种方法十分灵活,可以迅速适应新的能源系统状况,例如COVID-19封锁期间及之后出现的情况。单个预测模型的范围从相当简单的时间序列模型到复杂的模型,如广义加性模型(GAMs)和由lasso估计的高维线性模型。它们有效地结合了自回归、日历和天气效应。所有的步骤都包含了新的概念,这有助于该方法取得优异的预测性能,尤其是假日调整过程和完全自适应的平滑BOA方法。 摘要:We present a winning method of the IEEE DataPort Competition on Day-Ahead Electricity Demand Forecasting: Post-COVID Paradigm. The day-ahead load forecasting approach is based on online forecast combination of multiple point prediction models. It contains four steps: i) data cleaning and preprocessing, ii) a holiday adjustment procedure, iii) training of individual forecasting models, iv) forecast combination by smoothed Bernstein Online Aggregation (BOA). The approach is flexible and can quickly adopt to new energy system situations as they occurred during and after COVID-19 shutdowns. The pool of individual prediction models ranges from rather simple time series models to sophisticated models like generalized additive models (GAMs) and high-dimensional linear models estimated by lasso. They incorporate autoregressive, calendar and weather effects efficiently. All steps contain novel concepts that contribute to the excellent forecasting performance of the proposed method. This holds particularly for the holiday adjustment procedure and the fully adaptive smoothed BOA approach.
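该方法第iv步所属的"指数加权在线聚合"家族,其最简骨架如下(numpy;这不是平滑BOA本身——BOA另含梯度技巧与平滑处理——仅示意在线权重更新的思路,学习率eta为演示假设):

    import numpy as np

    def online_aggregate(preds, y, eta=0.1):
        """preds: (T, K),K个单项模型的逐时刻点预测;y: (T,) 实际负荷。"""
        T, K = preds.shape
        w = np.full(K, 1.0 / K)
        combined = np.empty(T)
        for t in range(T):
            combined[t] = w @ preds[t]             # 先给出组合预测
            losses = (preds[t] - y[t]) ** 2        # 再观察各模型的损失
            w = w * np.exp(-eta * losses)          # 指数权重更新
            w = w / w.sum()
        return combined

    rng = np.random.default_rng(0)
    y = 60 + 30 * np.sin(np.linspace(0, 10, 200))
    preds = np.stack([y + rng.normal(0, s, 200) for s in (1, 5, 15)], axis=1)
    agg = online_aggregate(preds, y)
    print(np.mean((agg - y) ** 2))    # 应接近噪声最小的单项模型的误差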
【2】 Predictive models for wind speed using artificial intelligence and copula 标题:基于人工智能和Copula的风速预测模型
作者:Md Amimul Ehsan 机构:A thesis submitted to the University of the District of Columbia in Partial Fulfillment of the Requirements for the Degree of Master of Science in Electrical Engineering, Washington, DC 备注:This is a Masters thesis that compares various machine learning algorithms for wind speed prediction using weather data. It also applies Copula to model joint probability distribution of two far apart wind sites. arXiv admin note: text overlap with arXiv:2005.12401 链接:https://arxiv.org/abs/2107.06182 摘要:燃烧化石燃料发电是造成全球变暖的主要原因之一。可再生能源是发电和减少电力工业排放的可行替代能源。这些能源是绿色能源的基石,具有不同的特点。根据地理位置和其他参数,它们的可用性也各不相同。较低的实现成本和遍布全球的分布式可用性使它们的受欢迎程度成倍提高。因此,这为消费者提供了在当地生产并就地使用电力的机会,从而减少了对集中式公用事业公司的依赖。该研究考虑两个主要目标:其一是简化风电场规划和可行性研究的风速预测;其二是理解多个相距遥远地点风速的相关性结构。针对第一个目标,我们使用12种人工智能算法根据收集的气象参数进行风速预测,并比较了模型性能以确定风速预测精度。结果表明,深度学习方法长短期记忆网络(LSTM)以97.8%的最高精度优于其他模型。对于相关性,采用多元累积分布函数Copula求出两个或多个远距离地点风速的联合分布,并进行了案例研究。我们发现,合适的copula族和参数会随站点间距离而变化。在案例研究中,Joe-Frank(BB8)copula被证明能为所研究的风速对提供有效的联合分布拟合,标准误差为0.0094。最后,对风速相关性的不确定性方面提出了一些见解。 摘要:Electricity generation from burning fossil fuels is one of the major contributors to global warming. Renewable energy sources are a viable alternative to produce electrical energy and to reduce the emission from the power industry. These energy sources are the building blocks of green energy, which all have different characteristics. Their availabilities are also diverse, depending on geographical locations and other parameters. Low implementation cost and distributed availability all over the world uplifts their popularity exponentially. Therefore, it has unlocked opportunities for consumers to produce electricity locally and use it on-site, which reduces dependency on centralized utility companies. The research considers two main objectives: the prediction of wind speed that simplifies wind farm planning and feasibility study. Secondly, the need to understand the dependency structure of the wind speeds of multiple distant locations. To address the first objective, twelve artificial intelligence algorithms were used for wind speed prediction from collected meteorological parameters. The model performances were compared to determine the wind speed prediction accuracy. The results show a deep learning approach, long short-term memory (LSTM) outperforms other models with the highest accuracy of 97.8%. For dependency, a multivariate cumulative distribution function, Copula, was used to find the joint distribution of two or more distant location wind speeds, followed by a case study. We found that the appropriate copula family and the parameters vary based on the distance in between. For the case study, Joe-Frank (BB8) copula shows an efficient joint distribution fit for a wind speed pair with a standard error of 0.0094. Finally, some insights about the uncertainty aspects of wind speed dependency were addressed.
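针对第二个目标(两地风速的联合分布),常用的起点是先用经验分布把各站点边缘变换到(0,1),再在正态分数上估计copula参数;下面以高斯copula示意这一流程(Python/scipy;数据为构造示例,论文最终选用的是Joe-Frank (BB8) copula):

    import numpy as np
    from scipy.stats import norm, rankdata

    rng = np.random.default_rng(0)
    # 两个站点的风速(构造为相关的对数正态示例数据)
    z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=2000)
    site_a, site_b = np.exp(0.4 * z[:, 0] + 2), np.exp(0.4 * z[:, 1] + 2)

    def to_uniform(x):
        # 经验分布(秩)变换:任意边缘分布 → (0,1)上的近似均匀分布
        return rankdata(x) / (len(x) + 1)

    u, v = to_uniform(site_a), to_uniform(site_b)
    rho = np.corrcoef(norm.ppf(u), norm.ppf(v))[0, 1]   # 高斯copula的相关参数
    print(round(float(rho), 2))   # 应接近0.6;相距更远的站点对通常得到更小的rho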
【3】 Auto IV: Counterfactual Prediction via Automatic Instrumental Variable Decomposition 标题:Auto IV:通过自动工具变量分解进行反事实预测
作者:Junkun Yuan,Anpeng Wu,Kun Kuang,Bo Li,Runze Wu,Fei Wu,Lanfen Lin 备注:12 pages 链接:https://arxiv.org/abs/2107.05884 摘要:工具变量(IVs)是治疗随机化的来源,与结果有条件独立,在因果推断中起着重要作用。然而,现有的基于IV的反事实预测方法需要预先定义好的IV,而在许多真实场景中找到有效的IV是一门艺术而不是科学。此外,预定义的手工IVs可能会因违反有效IVs的条件而变弱或出错。这些棘手的事实阻碍了基于IV的反事实预测方法的应用。在本文中,我们提出了一种新的自动工具变量分解(AutoIV)算法,从观察变量(IV候选者)中自动生成服务于IVs角色的表示。具体地说,我们分别通过互信息最大化和最小化约束,使学习到的IV表示满足处理的相关条件和结果的排除条件。我们还通过鼓励他们与治疗和结果相关来学习混杂因素表征。在对抗性博弈中,IV和混杂表示与它们的约束条件竞争信息,这使得我们能够得到有效的IV表示,用于基于IV的反事实预测。大量的实验表明,我们的方法生成了有效的IV表示,用于精确的基于IV的反事实预测。 摘要:Instrumental variables (IVs), sources of treatment randomization that are conditionally independent of the outcome, play an important role in causal inference with unobserved confounders. However, the existing IV-based counterfactual prediction methods need well-predefined IVs, while it's an art rather than science to find valid IVs in many real-world scenes. Moreover, the predefined hand-made IVs could be weak or erroneous by violating the conditions of valid IVs. These thorny facts hinder the application of the IV-based counterfactual prediction methods. In this paper, we propose a novel Automatic Instrumental Variable decomposition (AutoIV) algorithm to automatically generate representations serving the role of IVs from observed variables (IV candidates). Specifically, we let the learned IV representations satisfy the relevance condition with the treatment and exclusion condition with the outcome via mutual information maximization and minimization constraints, respectively. We also learn confounder representations by encouraging them to be relevant to both the treatment and the outcome. The IV and confounder representations compete for the information with their constraints in an adversarial game, which allows us to get valid IV representations for IV-based counterfactual prediction. Extensive experiments demonstrate that our method generates valid IV representations for accurate IV-based counterfactual prediction.
【4】 DDCNet-Multires: Effective Receptive Field Guided Multiresolution CNN for Dense Prediction 标题:DDCNet-Multires:有效感受场引导的多分辨率细胞神经网络密度预测
作者:Ali Salehi,Madhusudhanan Balasubramanian 机构:Tennessee, United States 备注:27 pages, 10 figures, 2 tables. arXiv admin note: text overlap with arXiv:2107.04715 链接:https://arxiv.org/abs/2107.05634 摘要:在具有非均匀运动动力学、遮挡和场景均匀性的场景中,当存在大位移时,密集光流估计是一个挑战。处理这些挑战的传统方法包括分层和多分辨率处理方法。基于学习的光流方法通常使用多分辨率的方法,当存在大范围的流速和非均匀运动时,图像扭曲。这种从粗到精的方法的精度受到多分辨率图像扭曲时的重影伪影以及具有较高运动对比度的较小场景范围内的消失问题的影响。在此之前,我们设计了以有效感受野(ERF)特性为指导的密集预测网络(DDCNet)的构建策略。DDCNet的设计有意地简单和紧凑,允许它被用作设计更复杂但紧凑的网络的构建块。在这项工作中,我们扩展了DDCNet策略,通过级联基于DDCNet的子网来处理异构的运动动力学,减少了子网的ERF。我们的具有多分辨率功能的DDCNet(DDCNet Multires)结构紧凑,没有任何专门的网络层。我们使用标准光流基准数据集评估了DDCNet Multires网络的性能。我们的实验表明,DDCNet-Multires比DDCNet-B0和-B1改进,并且提供了与类似的基于轻量级学习的方法相当的精度的光流估计。 摘要:Dense optical flow estimation is challenging when there are large displacements in a scene with heterogeneous motion dynamics, occlusion, and scene homogeneity. Traditional approaches to handle these challenges include hierarchical and multiresolution processing methods. Learning-based optical flow methods typically use a multiresolution approach with image warping when a broad range of flow velocities and heterogeneous motion is present. Accuracy of such coarse-to-fine methods is affected by the ghosting artifacts when images are warped across multiple resolutions and by the vanishing problem in smaller scene extents with higher motion contrast. Previously, we devised strategies for building compact dense prediction networks guided by the effective receptive field (ERF) characteristics of the network (DDCNet). The DDCNet design was intentionally simple and compact allowing it to be used as a building block for designing more complex yet compact networks. In this work, we extend the DDCNet strategies to handle heterogeneous motion dynamics by cascading DDCNet based sub-nets with decreasing extents of their ERF. Our DDCNet with multiresolution capability (DDCNet-Multires) is compact without any specialized network layers. We evaluate the performance of the DDCNet-Multires network using standard optical flow benchmark datasets. Our experiments demonstrate that DDCNet-Multires improves over the DDCNet-B0 and -B1 and provides optical flow estimates with accuracy comparable to similar lightweight learning-based methods.
【5】 National-scale electricity peak load forecasting: Traditional, machine learning, or hybrid model? 标题:国家级电力高峰负荷预测:传统、机器学习还是混合模型?
作者:Juyong Lee,Youngsang Cho 机构:Department of Industrial Engineering, College of Engineering, Yonsei University, Seoul, South Korea 链接:https://arxiv.org/abs/2107.06174 摘要:随着气候变化和电气化导致电力需求的波动性增加,准确的峰值负荷预测的重要性日益增加。传统的高峰负荷预测是通过基于时间序列的模型进行的;然而,最近,基于机器或深度学习的新模型被引入。本研究通过比较时间序列、机器学习和混合模型的性能,进行比较分析,以确定韩国最准确的峰值负荷预测模型。时间序列模型采用带外生变量的季节性自回归综合移动平均(SARIMAX)。机器学习模型采用人工神经网络(ANN)、支持向量回归(SVR)和长短时记忆(LSTM)。SARIMAX-ANN、SARIMAX-SVR和SARIMAX-LSTM用于混合模型。结果表明,混合模型比SARIMAX模型有明显的改进。基于LSTM的模型优于其他模型;单一和混合LSTM模型没有表现出显著的性能差异。在韩国2019年最高峰值负荷的情况下,LSTM模型的预测能力证明大于SARIMAX-LSTM模型。LSTM、SARIMAX-SVR和SARIMAX-LSTM模型的性能优于韩国目前使用的基于时间序列的预测模型。因此,韩国的峰值负荷预测性能可以通过加入机器学习或混合模型来提高。 摘要:As the volatility of electricity demand increases owing to climate change and electrification, the importance of accurate peak load forecasting is increasing. Traditional peak load forecasting has been conducted through time series-based models; however, recently, new models based on machine or deep learning are being introduced. This study performs a comparative analysis to determine the most accurate peak load-forecasting model for Korea, by comparing the performance of time series, machine learning, and hybrid models. Seasonal autoregressive integrated moving average with exogenous variables (SARIMAX) is used for the time series model. Artificial neural network (ANN), support vector regression (SVR), and long short-term memory (LSTM) are used for the machine learning models. SARIMAX-ANN, SARIMAX-SVR, and SARIMAX-LSTM are used for the hybrid models. The results indicate that the hybrid models exhibit significant improvement over the SARIMAX model. The LSTM-based models outperformed the others; the single and hybrid LSTM models did not exhibit a significant performance difference. In the case of Korea's highest peak load in 2019, the predictive power of the LSTM model proved to be greater than that of the SARIMAX-LSTM model. The LSTM, SARIMAX-SVR, and SARIMAX-LSTM models outperformed the current time series-based forecasting model used in Korea. Thus, Korea's peak load-forecasting performance can be improved by including machine learning or hybrid models.
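比较中作为时间序列基线、同时也是各混合模型组成部分的SARIMAX,其statsmodels用法大致如下(阶数、外生变量与数据均为演示假设;混合模型随后再用ANN/SVR/LSTM对其残差建模):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    t = np.arange(24 * 60)                                   # 60天的小时级数据
    load = 60 + 10 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, t.size)
    temp = (20 + 8 * np.sin(2 * np.pi * t / 24 + 1))[:, None]   # 外生变量:温度

    model = sm.tsa.statespace.SARIMAX(
        load, exog=temp,
        order=(1, 0, 1), seasonal_order=(1, 0, 1, 24),       # 24小时季节周期
    ).fit(disp=False)

    future_temp = temp[-24:]                  # 演示:以最后一天温度充当预报值
    forecast = model.forecast(steps=24, exog=future_temp)    # 日前24小时负荷预测
    residuals = model.resid                   # 混合模型:再用LSTM等拟合该残差
    print(forecast[:3])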
【6】 Effects of personality traits in predicting grade retention of Brazilian students 标题:人格特质在预测巴西学生年级保持中的作用
作者:Carmen Melo Toledo,Guilherme Mendes Bassedon,Jonathan Batista Ferreira,Lucka de Godoy Gianvechio,Carlos Guatimosim,Felipe Maia Polo,Renato Vicente 链接:https://arxiv.org/abs/2107.05767 摘要:学生成绩保持是许多教育系统,特别是发展中国家教育系统面临的一个关键问题。在这篇论文中,我们试图衡量巴西学生的人格特质在预测成绩保持方面的相关性。为此,我们使用了2012年和2017年在巴西圣保罗州农村Sertaozinho市收集的数据。在Sertaozinho进行的调查包括几个社会经济问题、标准化测试和人格测试。此外,2012年的学生分为4、5和6年级。我们的方法是基于调查数据训练机器学习模型,利用2012年或之前的信息预测2012年至2017年的年级保持率,然后使用一些策略量化人格特质的预测能力。我们的结论是,除了被证明在孤立时比随机分类器更好外,人格特质甚至在使用社会经济变量和标准化测试结果时也有助于预测。 摘要:Student's grade retention is a key issue faced by many education systems, especially those in developing countries. In this paper, we seek to gauge the relevance of students' personality traits in predicting grade retention in Brazil. For that, we used data collected in 2012 and 2017, in the city of Sertaozinho, countryside of the state of Sao Paulo, Brazil. The surveys taken in Sertaozinho included several socioeconomic questions, standardized tests, and a personality test. Moreover, students were in grades 4, 5, and 6 in 2012. Our approach was based on training machine learning models on the surveys' data to predict grade retention between 2012 and 2017 using information from 2012 or before, and then using some strategies to quantify personality traits' predictive power. We concluded that, besides proving to be fairly better than a random classifier when isolated, personality traits contribute to prediction even when using socioeconomic variables and standardized tests results.
其他神经网络|深度学习|模型|建模(16篇)
【1】 Pessimistic Model-based Offline RL: PAC Bounds and Posterior Sampling under Partial Coverage 标题:基于悲观模型的离线RL:部分覆盖下的PAC界和后验抽样
作者:Masatoshi Uehara,Wen Sun 机构:Department of Computer Science, Cornell University 链接:https://arxiv.org/abs/2107.06226 摘要:研究了具有一般函数逼近的基于模型的离线强化学习。提出了一种名为约束悲观策略优化(CPPO)的算法,该算法利用一个通用的函数类,并用一个约束来编码悲观性。在假设真实模型属于我们的函数类的情况下,CPPO可以仅凭只提供部分覆盖的离线数据进行学习,也就是说,它能以关于函数类统计复杂度的多项式样本复杂度,学到一个可与离线数据所覆盖的任何策略相竞争的策略。然后我们证明了这个算法框架可以应用于许多特殊的马尔可夫决策过程,在这些过程中,附加的结构假设可以进一步细化部分覆盖的概念。一个显著的例子是具有表示学习的低秩MDP,其中部分覆盖是用相对条件数的概念定义的,而该条件数由潜在未知的真实特征表示来度量。最后,介绍并研究了离线RL中的贝叶斯设置。贝叶斯离线RL的主要优点是,在算法上我们不需要显式地构造悲观项或奖励惩罚,而这在线性结构以外的模型中可能很难做到。提出了一种基于后验抽样的增量式策略优化算法(PS-PO),该算法迭代地从后验分布中抽样一个模型,并在抽样模型内执行一步增量式策略优化。理论上,在对先验分布取期望的意义下,PS-PO可以在部分覆盖条件下以多项式样本复杂度学习到近似最优策略。 摘要:We study model-based offline Reinforcement Learning with general function approximation. We present an algorithm named Constrained Pessimistic Policy Optimization (CPPO) which leverages a general function class and uses a constraint to encode pessimism. Under the assumption that the ground truth model belongs to our function class, CPPO can learn with the offline data only providing partial coverage, i.e., it can learn a policy that competes against any policy that is covered by the offline data, in polynomial sample complexity with respect to the statistical complexity of the function class. We then demonstrate that this algorithmic framework can be applied to many specialized Markov Decision Processes where the additional structural assumptions can further refine the concept of partial coverage. One notable example is low-rank MDP with representation learning where the partial coverage is defined using the concept of relative condition number measured by the underlying unknown ground truth feature representation. Finally, we introduce and study the Bayesian setting in offline RL. The key benefit of Bayesian offline RL is that algorithmically, we do not need to explicitly construct pessimism or reward penalty which could be hard beyond models with linear structures. We present a posterior sampling-based incremental policy optimization algorithm (PS-PO) which proceeds by iteratively sampling a model from the posterior distribution and performing one-step incremental policy optimization inside the sampled model. Theoretically, in expectation with respect to the prior distribution, PS-PO can learn a near optimal policy under partial coverage with polynomial sample complexity.
【2】 ML-Quest: A Game for Introducing Machine Learning Concepts to K-12 Students 标题:ML-Quest:向K-12学生介绍机器学习概念的游戏
作者:Shruti Priya,Shubhankar Bhadra,Sridhar Chimalakonda 备注:13 pages, 5 figures, 3 tables 链接:https://arxiv.org/abs/2107.06206 摘要:今天,机器学习(ML)由于海量数据和高计算资源的可得性而在社会上具有重要意义。这最终促使ML概念被引入包括K-12学生在内的多个教育层次,以促进计算思维。然而,通过视频讲座和书籍等传统教学方法向K-12学生教授这些概念具有挑战性。文献中的许多研究表明,利用游戏等交互环境来教授计算思维和编程,可以提高学生的记忆能力和学习动机。因此,用游戏引入ML概念可能增进学生对主题的理解,并激励他们进一步学习。然而,我们尚不知道有任何现有游戏明确侧重于通过游戏玩法向学生介绍ML概念。因此,在本文中,我们提出了ML-Quest,一个三维视频游戏,用于提供三个ML概念的概念概述:有监督学习、梯度下降和K-最近邻(KNN)分类。游戏的核心在于在模拟场景中介绍这些概念的定义与工作原理(我们称之为概念概述),而不让学生被ML的复杂细节所淹没。在23名高中生的帮助下,我们主要使用技术接受模型(TAM)对游戏的有用性和玩家体验进行了评估。调查结果显示,约有70%的参与者同意或强烈同意ML-Quest互动性强,且有助于向他们介绍ML概念。 摘要:Today, Machine Learning (ML) is of a great importance to society due to the availability of huge data and high computational resources. This ultimately led to the introduction of ML concepts at multiple levels of education including K-12 students to promote computational thinking. However, teaching these concepts to K-12 through traditional methodologies such as video lectures and books is challenging. Many studies in the literature have reported that using interactive environments such as games to teach computational thinking and programming improves retention capacity and motivation among students. Therefore, introducing ML concepts using a game might enhance students' understanding of the subject and motivate them to learn further. However, we are not aware of any existing game which explicitly focuses on introducing ML concepts to students using game play. Hence, in this paper, we propose ML-Quest, a 3D video game to provide conceptual overview of three ML concepts: Supervised Learning, Gradient Descent and K-Nearest Neighbor (KNN) Classification. The crux of the game is to introduce the definition and working of these concepts, which we call conceptual overview, in a simulated scenario without overwhelming students with the intricacies of ML. The game has been predominantly evaluated for its usefulness and player experience using the Technology Acceptance Model (TAM) model with the help of 23 higher-secondary school students. The survey result shows that around 70% of the participants either agree or strongly agree that the ML-Quest is quite interactive and useful in introducing them to ML concepts.
【3】 No Regrets for Learning the Prior in Bandits 标题:在bandit问题中学习先验的无悔算法
作者:Soumya Basu,Branislav Kveton,Manzil Zaheer,Csaba Szepesvári 机构:Google Research, DeepMind, University of Alberta 链接:https://arxiv.org/abs/2107.06196 摘要:我们提出了${\tt AdaTS}$,一种能顺序适应与其交互的bandit任务的Thompson采样算法。${\tt AdaTS}$的关键思想是通过维护任务先验参数上的分布,来适应未知的任务先验分布。在求解某个bandit任务时,这种不确定性被边缘化并得到恰当的考虑。${\tt AdaTS}$是一种完全贝叶斯算法,可以在多类bandit问题中高效实现。我们推导了它的Bayes后悔上界,该上界量化了由于不知道任务先验而造成的损失,并证明该损失很小。我们的理论得到了实验支持:${\tt AdaTS}$优于已有算法,即使在具有挑战性的现实问题中也表现良好。 摘要:We propose ${\tt AdaTS}$, a Thompson sampling algorithm that adapts sequentially to bandit tasks that it interacts with. The key idea in ${\tt AdaTS}$ is to adapt to an unknown task prior distribution by maintaining a distribution over its parameters. When solving a bandit task, that uncertainty is marginalized out and properly accounted for. ${\tt AdaTS}$ is a fully-Bayesian algorithm that can be implemented efficiently in several classes of bandit problems. We derive upper bounds on its Bayes regret that quantify the loss due to not knowing the task prior, and show that it is small. Our theory is supported by experiments, where ${\tt AdaTS}$ outperforms prior algorithms and works well even in challenging real-world problems.
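为体现"对任务先验本身保持不确定性"这一思想,下面给出一个玩具示意(并非论文算法):假设一系列Bernoulli老虎机任务共享某个未知的Beta先验,我们在少量候选先验上维护近似权重,任务内按采样到的先验做Thompson采样;任务结束后用各臂平滑经验均值做一次粗略的经验贝叶斯式更新,候选先验集合与更新方式均为演示性假设。

```python
# 示意:带先验不确定性的分层 Thompson 采样(演示用,非 AdaTS 原始实现)
import numpy as np
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(1)
K, T, n_tasks = 5, 500, 20
priors = [(1.0, 1.0), (4.0, 2.0), (2.0, 4.0)]   # 候选任务先验 Beta(a,b)
log_w = np.zeros(len(priors))                   # 候选先验上的对数权重

for task in range(n_tasks):
    true_means = rng.beta(4.0, 2.0, size=K)     # 真实任务先验是 Beta(4,2)
    S, F = np.zeros(K), np.zeros(K)             # 各臂成功/失败计数
    for t in range(T):
        p = np.exp(log_w - log_w.max()); p /= p.sum()
        a0, b0 = priors[rng.choice(len(priors), p=p)]  # 先对先验采样
        theta = rng.beta(a0 + S, b0 + F)               # 再做任务内 TS
        arm = int(np.argmax(theta))
        r = float(rng.random() < true_means[arm])
        S[arm] += r; F[arm] += 1.0 - r
    # 用各臂的平滑经验均值近似更新先验权重(粗略的伪似然,仅作演示)
    means = (S + 1.0) / (S + F + 2.0)
    for i, (a0, b0) in enumerate(priors):
        log_w[i] += beta_dist.logpdf(means, a0, b0).sum()

w = np.exp(log_w - log_w.max())
print("candidate-prior posterior:", np.round(w / w.sum(), 3))
```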
【4】 On Choice of Hyper-parameter in Extreme Value Theory based on Machine Learning Techniques 标题:基于机器学习技术的极值理论中超参数的选择
作者:Chikara Nakamura 链接:https://arxiv.org/abs/2107.06074 摘要:极值理论是分析极端事件的一种统计工具。它有很强的理论背景,但应用EVT需要选择超参数。在最近的机器学习研究中,超参数选择技术得到了很好的研究。本文提出了一种基于机器学习技术的EVT超参数选择方法。我们还对实际数据进行了实验,结果表明该方法具有良好的可用性。 摘要:Extreme value theory (EVT) is a statistical tool for analysis of extreme events. It has a strong theoretical background, however, we need to choose hyper-parameters to apply EVT. In recent studies of machine learning, techniques of choosing hyper-parameters have been well-studied. In this paper, we propose a new method of choosing hyper-parameters in EVT based on machine learning techniques. We also experiment our method to real-world data and show good usability of our method.
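与上述思路相呼应,下面给出一个示意(并非论文提出的具体方法):用留出集上的对数似然来选择超越阈值法(POT)中的阈值,这是EVT中最典型的超参数;候选分位数与数据均为假设设定。

```python
# 示意:用留出集对数似然选择 GPD 阈值(演示用)
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
data = rng.standard_t(df=3, size=5000)          # 厚尾玩具样本
train, val = data[:4000], data[4000:]

best = None
for q in (0.90, 0.95, 0.975, 0.99):             # 候选阈值(训练集分位数)
    u = np.quantile(train, q)
    exc = train[train > u] - u                  # 训练集超越量
    val_exc = val[val > u] - u                  # 留出集超越量
    if exc.size < 30 or val_exc.size == 0:
        continue
    c, _, scale = genpareto.fit(exc, floc=0)    # 拟合广义 Pareto 分布
    ll = genpareto.logpdf(val_exc, c, loc=0, scale=scale).mean()
    if best is None or ll > best[0]:
        best = (ll, q, c, scale)
print("chosen quantile=%.3f shape=%.3f scale=%.3f" % best[1:])
```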
【5】 Model of the Weak Reset Process in HfOx Resistive Memory for Deep Learning Frameworks 标题:面向深度学习框架的HfOx阻变存储器弱复位过程模型
作者:Atreya Majumdar,Marc Bocquet,Tifenn Hirtzlin,Axel Laborieux,Jacques-Olivier Klein,Etienne Nowak,Elisa Vianello,Jean-Michel Portal,Damien Querlioz 链接:https://arxiv.org/abs/2107.06064 摘要:由于内存与逻辑单元之间的数据搬运,当前深度学习训练算法的实现非常耗电。基于氧化物的RRAM是实现低功耗存内计算的理想候选。其弱复位(weak RESET)工作区对学习尤其有吸引力,因为它允许以出色的耐久性调节器件电阻。然而,该工作区中的阻变行为存在大量涨落,建模十分困难,尤其是要以与深度学习仿真工具兼容的方式建模。在这项工作中,我们提出了氧化铪RRAM弱复位过程的模型,并将其集成到PyTorch深度学习框架中。在混合CMOS/RRAM工艺的实验上得到验证后,我们的模型同时重现了带噪声的渐进行为和器件间(D2D)变异性。我们用这一工具训练二值化神经网络,完成MNIST手写数字识别任务和CIFAR-10目标分类任务。我们在包含或不包含器件非理想因素各个方面的情形下仿真该模型,以理解它们对训练过程的影响,并确定D2D变异性是最有害的方面。该框架可以以同样的方式用于其他类型的存储器,以识别导致最大退化的器件缺陷,进而用于优化器件以减少这些缺陷的影响。 摘要:The implementation of current deep learning training algorithms is power-hungry, owing to data transfer between memory and logic units. Oxide-based RRAMs are outstanding candidates to implement in-memory computing, which is less power-intensive. Their weak RESET regime, is particularly attractive for learning, as it allows tuning the resistance of the devices with remarkable endurance. However, the resistive change behavior in this regime suffers many fluctuations and is particularly challenging to model, especially in a way compatible with tools used for simulating deep learning. In this work, we present a model of the weak RESET process in hafnium oxide RRAM and integrate this model within the PyTorch deep learning framework. Validated on experiments on a hybrid CMOS/RRAM technology, our model reproduces both the noisy progressive behavior and the device-to-device (D2D) variability. We use this tool to train Binarized Neural Networks for the MNIST handwritten digit recognition task and the CIFAR-10 object classification task. We simulate our model with and without various aspects of device imperfections to understand their impact on the training process and identify that the D2D variability is the most detrimental aspect. The framework can be used in the same manner for other types of memories to identify the device imperfections that cause the most degradation, which can, in turn, be used to optimize the devices to reduce the impact of these imperfections.
【6】 Fast-Slow Streamflow Model Using Mass-Conserving LSTM 标题:基于质量守恒LSTM的快慢水流模型
作者:Miguel Paredes Quiñones,Maciel Zortea,Leonardo S. A. Martins 备注:None 链接:https://arxiv.org/abs/2107.06057 摘要:径流预报是有效管理水资源、应对因气候变化而加剧的自然灾害的关键。在这里,我们使用快流和慢流分量的概念,创建了一个新的质量守恒长短时记忆(LSTM)神经网络模型。它使用水文气象时间序列和流域属性来预测每日河川流量。初步结果表明,与近期文献相比,该模型在多种评分指标上的预报技巧均有提升。 摘要:Streamflow forecasting is key to effectively managing water resources and preparing for the occurrence of natural calamities being exacerbated by climate change. Here we use the concept of fast and slow flow components to create a new mass-conserving Long Short-Term Memory (LSTM) neural network model. It uses hydrometeorological time series and catchment attributes to predict daily river discharges. Preliminary results evidence improvement in skills for different scores compared to the recent literature.
【7】 DIVINE: Diverse Influential Training Points for Data Visualization and Model Refinement 标题:DIVINE:用于数据可视化与模型改进的多样化有影响力训练点
作者:Umang Bhatt,Isabel Chien,Muhammad Bilal Zafar,Adrian Weller 机构:Amazon AWS AI, University of Cambridge & The Alan Turing Institute 备注:30 pages, 32 figures 链接:https://arxiv.org/abs/2107.05978 摘要:随着机器学习(ML)模型复杂性的增加导致模型预测缺乏可解释性,人们开发了多种方法,用对模型影响最大的训练数据点来解释模型行为。然而,这些方法倾向于将离群点标记为高影响力点,从而限制了从业者从这些并不代表训练数据的点中所能获得的洞察。在这项工作中,我们朝着寻找既有影响力、又能很好代表训练数据的训练点迈出了一步。我们首先回顾了为训练点分配重要性得分的方法。在给定重要性得分的基础上,我们提出了一种方法来选择一组多样化的有影响力(DIVerse INfluEntial,DIVINE)训练点,作为对模型行为的有用解释。由于从业者可能不仅对发现影响模型准确性的数据点感兴趣,还关心其他重要指标,我们展示了如何基于群体公平性来评估训练数据点。我们的方法可以识别导致不公平的训练点,移除这些点可以改善公平性结果。我们的定量实验和用户研究表明,与早期方法相比,可视化DIVINE点能帮助从业者更好地理解和解释模型行为。 摘要:As the complexity of machine learning (ML) models increases, resulting in a lack of prediction explainability, several methods have been developed to explain a model's behavior in terms of the training data points that most influence the model. However, these methods tend to mark outliers as highly influential points, limiting the insights that practitioners can draw from points that are not representative of the training data. In this work, we take a step towards finding influential training points that also represent the training data well. We first review methods for assigning importance scores to training points. Given importance scores, we propose a method to select a set of DIVerse INfluEntial (DIVINE) training points as a useful explanation of model behavior. As practitioners might not only be interested in finding data points influential with respect to model accuracy, but also with respect to other important metrics, we show how to evaluate training data points on the basis of group fairness. Our method can identify unfairness-inducing training points, which can be removed to improve fairness outcomes. Our quantitative experiments and user studies show that visualizing DIVINE points helps practitioners understand and explain model behavior better than earlier approaches.
【8】 Fast approximations of the Jeffreys divergence between univariate Gaussian mixture models via exponential polynomial densities 标题:用指数多项式密度快速逼近单变量高斯混合模型间的Jeffreys散度
作者:Frank Nielsen 备注:29 pages 链接:https://arxiv.org/abs/2107.05901 摘要:Jeffreys散度是统计Kullback-Leibler散度的一种著名对称化,常用于机器学习、信号处理和信息科学。由于无处不在的高斯混合模型(GMM)之间的Jeffreys散度没有闭式表达,文献中提出了许多各有利弊的技术来(i)估计、(ii)近似、或(iii)给出该散度的下界和上界。在这项工作中,我们提出了一个简单而快速的启发式方法,来近似任意分量数的两个GMM之间的Jeffreys散度。该启发式方法依赖于将GMM转换为一对属于指数族的对偶参数化概率密度。特别地,我们考虑多项式指数密度(PED),并设计了一个拟合优度准则来度量GMM与PED之间的相异性,该准则是Hyvärinen散度的推广。这一准则允许我们选择用于近似GMM的PED的阶数。我们通过实验证明,我们的启发式方法的计算时间比随机Monte Carlo估计基线快几个数量级,同时能相当好地逼近Jeffreys散度,特别是当单变量混合只有少量模态时。 摘要:The Jeffreys divergence is a renown symmetrization of the statistical Kullback-Leibler divergence which is often used in machine learning, signal processing, and information sciences. Since the Jeffreys divergence between the ubiquitous Gaussian Mixture Models are not available in closed-form, many techniques with various pros and cons have been proposed in the literature to either (i) estimate, (ii) approximate, or (iii) lower and upper bound this divergence. In this work, we propose a simple yet fast heuristic to approximate the Jeffreys divergence between two GMMs of arbitrary number of components. The heuristic relies on converting GMMs into pairs of dually parameterized probability densities belonging to exponential families. In particular, we consider Polynomial Exponential Densities, and design a goodness-of-fit criterion to measure the dissimilarity between a GMM and a PED which is a generalization of the Hyvärinen divergence. This criterion allows one to select the orders of the PEDs to approximate the GMMs. We demonstrate experimentally that the computational time of our heuristic improves over the stochastic Monte Carlo estimation baseline by several orders of magnitude while approximating reasonably well the Jeffreys divergence, especially when the univariate mixtures have a small number of modes.
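为说明文中作为对比基线的"随机Monte Carlo估计",下面给出两个一维GMM之间Jeffreys散度 J(p,q)=KL(p||q)+KL(q||p) 的朴素MC估计示意;混合参数为任意假设,仅用于演示。

```python
# 示意:一维 GMM 间 Jeffreys 散度的朴素 Monte Carlo 估计(基线方法)
import numpy as np
from scipy.stats import norm

def gmm_logpdf(x, w, mu, s):
    comp = np.stack([np.log(wi) + norm.logpdf(x, mi, si)
                     for wi, mi, si in zip(w, mu, s)])
    m = comp.max(axis=0)
    return m + np.log(np.exp(comp - m).sum(axis=0))    # log-sum-exp

def gmm_sample(n, w, mu, s, rng):
    idx = rng.choice(len(w), size=n, p=w)
    return rng.normal(np.asarray(mu)[idx], np.asarray(s)[idx])

rng = np.random.default_rng(0)
p = ([0.5, 0.5], [-1.0, 2.0], [0.5, 1.0])   # (权重, 均值, 标准差)
q = ([0.3, 0.7], [0.0, 2.5], [1.0, 0.8])
xp, xq = gmm_sample(100000, *p, rng), gmm_sample(100000, *q, rng)
kl_pq = (gmm_logpdf(xp, *p) - gmm_logpdf(xp, *q)).mean()
kl_qp = (gmm_logpdf(xq, *q) - gmm_logpdf(xq, *p)).mean()
print("Jeffreys ≈", kl_pq + kl_qp)
```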
【9】 Automated Learning Rate Scheduler for Large-batch Training 标题:面向大批量训练的自动学习率调度器
作者:Chiheon Kim,Saehoon Kim,Jongmin Kim,Donghoon Lee,Sungwoong Kim 机构:Kakao Brain, South Korea 备注:15 pages, 7 figures, 4 tables, 8th ICML Workshop on Automated Machine Learning (2021) 链接:https://arxiv.org/abs/2107.05855 摘要:大批量训练对于在深度学习中利用大规模数据集和模型至关重要。虽然使用大批量在计算上是有益的,但它通常需要专门设计的学习率(LR)调度,才能达到与小批量训练相当的性能水平。特别是在训练轮数受限的情况下,由于更新步数减少,使用大LR和预热策略对大批量训练的最终性能至关重要。在这项工作中,我们提出了一种自动LR调度算法,它对给定epoch预算下的大批量神经网络训练很有效。具体而言,整个调度由两个阶段组成:自适应预热和预定衰减。在前者中LR不断增大,直到训练损失不再下降;在后者中LR逐渐减小,直至训练结束时降为零。在这里,训练损失是否已不再下降,是通过高斯过程平滑以低计算开销的在线方式进行稳健检验的。该调度器与AdamP和LAMB等自适应随机优化器相结合,无需繁琐的超参数调整即可成功调整LR,并在各种图像分类基准和架构以及很宽的批量范围内,取得与精心调参的基线相当或更好的性能。 摘要:Large-batch training has been essential in leveraging large-scale datasets and models in deep learning. While it is computationally beneficial to use large batch sizes, it often requires a specially designed learning rate (LR) schedule to achieve a comparable level of performance as in smaller batch training. Especially, when the number of training epochs is constrained, the use of a large LR and a warmup strategy is critical in the final performance of large-batch training due to the reduced number of updating steps. In this work, we propose an automated LR scheduling algorithm which is effective for neural network training with a large batch size under the given epoch budget. In specific, the whole schedule consists of two phases: adaptive warmup and predefined decay, where the LR is increased until the training loss no longer decreases and decreased to zero until the end of training. Here, whether the training loss has reached the minimum value is robustly checked with Gaussian process smoothing in an online manner with a low computational burden. Coupled with adaptive stochastic optimizers such as AdamP and LAMB, the proposed scheduler successfully adjusts the LRs without cumbersome hyperparameter tuning and achieves comparable or better performances than tuned baselines on various image classification benchmarks and architectures with a wide range of batch sizes.
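下面是"自适应预热 + 预定衰减"这一两阶段结构的极简示意。为保持示例简短,这里用指数滑动平均代替论文中的高斯过程平滑来在线判断损失是否仍在下降;类名、阈值与增长率均为假设,仅体现调度的骨架。

```python
# 示意:自适应预热(损失不再下降即停止升 LR)+ 线性衰减到 0
class AdaptiveWarmupScheduler:
    def __init__(self, optimizer, lr0, growth=1.05, total_steps=10000, patience=50):
        self.opt, self.lr, self.growth = optimizer, lr0, growth
        self.total, self.patience = total_steps, patience
        self.step_i, self.ema, self.best, self.stall = 0, None, float("inf"), 0
        self.decaying, self.peak_step, self.peak_lr = False, None, None

    def step(self, loss):
        self.step_i += 1
        # 用 EMA 近似替代 GP 平滑(假设;论文用的是高斯过程)
        self.ema = loss if self.ema is None else 0.99 * self.ema + 0.01 * loss
        if not self.decaying:
            if self.ema < self.best - 1e-4:
                self.best, self.stall = self.ema, 0
            else:
                self.stall += 1
            if self.stall >= self.patience:   # 平滑损失不再下降 -> 进入衰减
                self.decaying = True
                self.peak_step, self.peak_lr = self.step_i, self.lr
            else:
                self.lr *= self.growth        # 预热阶段:持续增大 LR
        else:                                 # 预定衰减:线性降到 0
            frac = (self.total - self.step_i) / max(self.total - self.peak_step, 1)
            self.lr = max(self.peak_lr * frac, 0.0)
        for g in self.opt.param_groups:       # 假设 opt 为带 param_groups 的优化器
            g["lr"] = self.lr

# 用法:sched = AdaptiveWarmupScheduler(opt, lr0=1e-4)
#       每个训练步计算 loss 后调用 sched.step(float(loss))
```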
【10】 A Hierarchical Bayesian model for Inverse RL in Partially-Controlled Environments 标题:部分受控环境下逆RL的分层贝叶斯模型
作者:Kenneth Bogert,Prashant Doshi 机构:Department of Computer Science, University of North Carolina; University of Georgia 备注:8 pages, 10 figures 链接:https://arxiv.org/abs/2107.05818 摘要:在真实世界中,使用逆强化学习(IRL)从观察中学习的机器人,在演示过程中可能会遇到环境中除专家以外的物体或智能体,它们会造成干扰观测。在虚拟仿真或实验室设置等完全受控的环境中,这些混杂元素通常会被移除。当无法完全清除时,必须滤除这些有害观测。然而,在观测量很大时,很难确定每条观测的来源。为了解决这个问题,我们提出了一个分层贝叶斯模型,它同时纳入专家和混杂元素的观测,从而显式地为机器人可能接收到的多样化观测建模。我们扩展了一个原本为专家被部分遮挡情形设计的现有IRL算法,使其能够考虑这些多样化的观测。在一个同时包含遮挡和混杂元素的模拟机器人分拣领域中,我们证明了该模型的有效性。特别是,我们的技术优于其他几种对比方法,仅次于完全已知对象轨迹的情形。 摘要:Robots learning from observations in the real world using inverse reinforcement learning (IRL) may encounter objects or agents in the environment, other than the expert, that cause nuisance observations during the demonstration. These confounding elements are typically removed in fully-controlled environments such as virtual simulations or lab settings. When complete removal is impossible the nuisance observations must be filtered out. However, identifying the source of observations when large amounts of observations are made is difficult. To address this, we present a hierarchical Bayesian model that incorporates both the expert's and the confounding elements' observations thereby explicitly modeling the diverse observations a robot may receive. We extend an existing IRL algorithm originally designed to work under partial occlusion of the expert to consider the diverse observations. In a simulated robotic sorting domain containing both occlusion and confounding elements, we demonstrate the model's effectiveness. In particular, our technique outperforms several other comparative methods, second only to having perfect knowledge of the subject's trajectory.
【11】 AlterSGD: Finding Flat Minima for Continual Learning by Alternative Training 标题:AlterSGD:通过交替训练寻找持续学习中的平坦极小值
作者:Zhongzhan Huang,Mingfu Liang,Senwei Liang,Wei He 机构:Tsinghua University, Northwestern University, Purdue University, Nanyang Technological University 链接:https://arxiv.org/abs/2107.05804 摘要:深度神经网络在顺序学习多项知识时会遭受灾难性遗忘,越来越多的方法被提出来缓解这一问题。其中一些方法通过将平坦的局部极小值与持续学习中的遗忘缓解联系起来,取得了相当好的效果。然而,它们不可避免地需要(1)繁琐的超参数调整,以及(2)额外的计算成本。为缓解这些问题,本文提出了一种简单而有效的优化方法AlterSGD,用于在损失景观中寻找平坦极小值。在AlterSGD中,当网络在每一段新知识的学习中趋于收敛时,我们交替地进行梯度下降和梯度上升。此外,我们从理论上证明了这样的策略能促使优化收敛到平坦极小值。我们在语义分割的持续学习基准上验证了AlterSGD,实验结果表明,在具有挑战性的持续学习协议下,AlterSGD能够显著减轻遗忘,并以较大优势优于最先进的方法。 摘要:Deep neural networks suffer from catastrophic forgetting when learning multiple knowledge sequentially, and a growing number of approaches have been proposed to mitigate this problem. Some of these methods achieved considerable performance by associating the flat local minima with forgetting mitigation in continual learning. However, they inevitably need (1) tedious hyperparameters tuning, and (2) additional computational cost. To alleviate these problems, in this paper, we propose a simple yet effective optimization method, called AlterSGD, to search for a flat minima in the loss landscape. In AlterSGD, we conduct gradient descent and ascent alternatively when the network tends to converge at each session of learning new knowledge. Moreover, we theoretically prove that such a strategy can encourage the optimization to converge to a flat minima. We verify AlterSGD on continual learning benchmark for semantic segmentation and the empirical results show that we can significantly mitigate the forgetting and outperform the state-of-the-art methods with a large margin under challenging continual learning protocols.
【12】 How many degrees of freedom do we need to train deep networks: a loss landscape perspective 标题:我们需要多少自由度来训练深层网络:损失景观视角
作者:Brett W. Larsen,Stanislav Fort,Nic Becker,Surya Ganguli 机构:Stanford University, Facebook AI Research 链接:https://arxiv.org/abs/2107.05802 摘要:最近的各种工作,涵盖剪枝、彩票假设(lottery tickets)和随机子空间内的训练,都表明深层神经网络可以用远少于参数总数的自由度来训练。我们首先通过检验在给定训练维度的随机子空间内训练时命中训练损失子水平集的成功概率来解释这一现象。我们发现,当训练维度超过某个阈值时,成功概率从$0$到$1$发生急剧相变。该阈值训练维数随期望最终损失的减小而增大,但随初始损失的减小而减小。然后,我们依据损失景观高维几何的精确性质,从理论上解释了这种相变的起源及其对初始化和最终期望损失的依赖。特别地,我们通过Gordon逃逸定理证明:训练维数加上期望损失子水平集(投影到围绕初始化的单位球上)的高斯宽度必须超过参数总数,成功概率才会很大。在若干架构和数据集上,我们测量了作为初始化函数的阈值训练维数,并证明它只占参数总数的一小部分;由我们的理论可知,之所以能用如此少的维度成功训练,正是因为低损失子水平集的高斯宽度非常大。此外,该阈值训练维数为评估更精细的降低训练自由度方法的有效性提供了一个强有力的零模型,这些方法包括彩票假设,以及我们提出的一种更优方法:彩票子空间(lottery subspaces)。 摘要:A variety of recent works, spanning pruning, lottery tickets, and training within random subspaces, have shown that deep neural networks can be trained using far fewer degrees of freedom than the total number of parameters. We explain this phenomenon by first examining the success probability of hitting a training loss sub-level set when training within a random subspace of a given training dimensionality. We find a sharp phase transition in the success probability from $0$ to $1$ as the training dimension surpasses a threshold. This threshold training dimension increases as the desired final loss decreases, but decreases as the initial loss decreases. We then theoretically explain the origin of this phase transition, and its dependence on initialization and final desired loss, in terms of precise properties of the high dimensional geometry of the loss landscape. In particular, we show via Gordon's escape theorem, that the training dimension plus the Gaussian width of the desired loss sub-level set, projected onto a unit sphere surrounding the initialization, must exceed the total number of parameters for the success probability to be large. In several architectures and datasets, we measure the threshold training dimension as a function of initialization and demonstrate that it is a small fraction of the total number of parameters, thereby implying, by our theory, that successful training with so few dimensions is possible precisely because the Gaussian width of low loss sub-level sets is very large. Moreover, this threshold training dimension provides a strong null model for assessing the efficacy of more sophisticated ways to reduce training degrees of freedom, including lottery tickets as well as a more optimal method we introduce: lottery subspaces.
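下面是"随机子空间内训练"这一设定的极简复现示意:令 theta = theta0 + P z,其中 P 为固定随机投影,只优化低维向量 z;网络结构、子空间维数与数据均为演示性假设。

```python
# 示意:在 d 维随机子空间内训练一个小网络,只有 z 是可训练自由度
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
theta0 = torch.cat([p.detach().flatten() for p in net.parameters()])
D, d = theta0.numel(), 50
P = torch.randn(D, d) / d ** 0.5               # 固定随机基
z = torch.zeros(d, requires_grad=True)         # 唯一被训练的参数
opt = torch.optim.Adam([z], lr=1e-2)

X, y = torch.randn(256, 20), torch.randint(0, 2, (256,))
for _ in range(300):
    theta = theta0 + P @ z                     # 子空间内的完整参数向量
    i, out = 0, X
    for layer in net:                          # 函数式前向:按层切出权重
        if isinstance(layer, nn.Linear):
            w_n, b_n = layer.weight.numel(), layer.bias.numel()
            W = theta[i:i + w_n].view_as(layer.weight); i += w_n
            b = theta[i:i + b_n]; i += b_n
            out = out @ W.T + b
        else:
            out = layer(out)
    loss = nn.functional.cross_entropy(out, y)
    opt.zero_grad(); loss.backward(); opt.step()
print("final loss:", float(loss))
```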
【13】 Data-Driven Low-Rank Neural Network Compression 标题:数据驱动的低秩神经网络压缩
作者:Dimitris Papadimitriou,Swayambhoo Jain 机构:UC Berkeley; InterDigital AI Lab 链接:https://arxiv.org/abs/2107.05787 摘要:尽管深度神经网络(DNN)有许多现代应用,但隐藏层中的大量参数使得它们在存储容量受限的设备上缺乏部署吸引力。本文提出了一种数据驱动低秩(DDLR)方法,通过在全连接层上施加低秩结构来减少预训练DNN的参数数目并加速推理,同时控制总体精度且不需要任何再训练。我们将问题表述为在给定性能保证下寻找每个全连接层的最低秩近似,并将其松弛为一个易于处理的凸优化问题。我们证明,在常见的DNN架构中,只需付出很小的分类精度下降,就可以显著减少参数数量。我们将DDLR与另一种基于稀疏性的数据驱动DNN压缩技术Net-Trim进行了比较,结果表明DDLR在保持更高精度的同时,能一致地产生压缩率更高的神经网络。 摘要:Despite many modern applications of Deep Neural Networks (DNNs), the large number of parameters in the hidden layers makes them unattractive for deployment on devices with storage capacity constraints. In this paper we propose a Data-Driven Low-rank (DDLR) method to reduce the number of parameters of pretrained DNNs and expedite inference by imposing low-rank structure on the fully connected layers, while controlling for the overall accuracy and without requiring any retraining. We pose the problem as finding the lowest rank approximation of each fully connected layer with given performance guarantees and relax it to a tractable convex optimization problem. We show that it is possible to significantly reduce the number of parameters in common DNN architectures with only a small reduction in classification accuracy. We compare DDLR with Net-Trim, which is another data-driven DNN compression technique based on sparsity and show that DDLR consistently produces more compressed neural networks while maintaining higher accuracy.
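低秩压缩的通用骨架可用截断SVD说明:将全连接层权重 W 近似为两个小矩阵之积,从而把一层拆成两层。下面是一个PyTorch示意;DDLR在此之上加入了数据驱动的秩选择与性能约束(凸优化),这里未包含。

```python
# 示意:对全连接层做秩 r 截断 SVD 分解,W ≈ (U_r√S_r)(√S_r V_r^T)
import torch
import torch.nn as nn

def low_rank_factorize(fc: nn.Linear, r: int) -> nn.Sequential:
    W, b = fc.weight.data, fc.bias.data                  # W: (out, in)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    first = nn.Linear(fc.in_features, r, bias=False)
    second = nn.Linear(r, fc.out_features)
    first.weight.data = S[:r].sqrt().unsqueeze(1) * Vh[:r]   # (r, in)
    second.weight.data = U[:, :r] * S[:r].sqrt()             # (out, r)
    second.bias.data = b
    return nn.Sequential(first, second)

fc = nn.Linear(512, 512)
compressed = low_rank_factorize(fc, r=64)
x = torch.randn(8, 512)
print("relative error:", float((fc(x) - compressed(x)).norm() / fc(x).norm()))
# 参数量对比:约 262k -> 约 66k(512*64 + 64*512 + 512)
```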
【14】 Kernel Continual Learning 标题:核连续学习
作者:Mohammad Mahdi Derakhshani,Xiantong Zhen,Ling Shao,Cees G. M. Snoek 机构:University of Amsterdam, The Netherlands; Inception Institute of Artificial Intelligence 备注:accepted to ICML 2021 链接:https://arxiv.org/abs/2107.05757 摘要:本文介绍了核连续学习,它是一种简单而有效的连续学习变体,利用核方法的非参数特性来解决灾难性遗忘问题。我们部署了一个情景记忆单元,为每个任务存储一个子集样本,以学习基于核岭回归的任务特定分类器。这不需要记忆回放,并且系统地避免了分类器中的任务干扰。我们进一步引入变分随机特征来学习每个任务的数据驱动内核。为此,我们将核连续学习描述为一个变分推理问题,其中一个随机Fourier基被合并为潜变量。从每个任务的核心集推断出随机Fourier基上的变分后验分布。通过这种方式,我们能够生成针对每个任务的更多信息内核,更重要的是,核心集的大小可以减少,以实现更紧凑的内存,从而在情景记忆的基础上实现更有效的连续学习。对四个基准的广泛评估证明了内核用于持续学习的有效性和前景。 摘要:This paper introduces kernel continual learning, a simple but effective variant of continual learning that leverages the non-parametric nature of kernel methods to tackle catastrophic forgetting. We deploy an episodic memory unit that stores a subset of samples for each task to learn task-specific classifiers based on kernel ridge regression. This does not require memory replay and systematically avoids task interference in the classifiers. We further introduce variational random features to learn a data-driven kernel for each task. To do so, we formulate kernel continual learning as a variational inference problem, where a random Fourier basis is incorporated as the latent variable. The variational posterior distribution over the random Fourier basis is inferred from the coreset of each task. In this way, we are able to generate more informative kernels specific to each task, and, more importantly, the coreset size can be reduced to achieve more compact memory, resulting in more efficient continual learning based on episodic memory. Extensive evaluation on four benchmarks demonstrates the effectiveness and promise of kernels for continual learning.
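下面是"情景记忆核心集 + 逐任务核岭回归分类器"这一骨架的极简示意,使用固定RBF核与闭式解;论文进一步用变分随机特征为每个任务学习数据驱动的核,示例中的数据与超参数均为假设。

```python
# 示意:每个任务只保存一个核心集,用闭式核岭回归做任务内分类
import numpy as np

def rbf(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

class KernelTaskClassifier:
    def __init__(self, lam=1e-2, gamma=1.0):
        self.lam, self.gamma, self.tasks = lam, gamma, {}

    def fit_task(self, task_id, X_core, y_core, n_cls):
        Y = np.eye(n_cls)[y_core]                       # one-hot 目标
        K = rbf(X_core, X_core, self.gamma)
        alpha = np.linalg.solve(K + self.lam * np.eye(len(X_core)), Y)
        self.tasks[task_id] = (X_core, alpha)           # 只存核心集与系数

    def predict(self, task_id, X):
        Xc, alpha = self.tasks[task_id]
        return (rbf(X, Xc, self.gamma) @ alpha).argmax(1)

rng = np.random.default_rng(0)
clf = KernelTaskClassifier()
for tid in range(3):                                    # 逐任务学习互不干扰
    X = rng.normal(size=(60, 2)) + tid
    y = (X[:, 0] > tid).astype(int)
    clf.fit_task(tid, X[:40], y[:40], n_cls=2)          # 每任务仅存 40 个样本
    print("task", tid, "acc:", (clf.predict(tid, X[40:]) == y[40:]).mean())
```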
【15】 Learning based E2E Energy Efficient in Joint Radio and NFV Resource Allocation for 5G and Beyond Networks 标题:基于学习的5G及以上网络无线和NFV联合资源分配中的E2E节能
作者:Narges Gholipoor,Ali Nouruzi,Shima Salarhosseini,Mohammad Reza Javan,Nader Mokari,Eduard A. Jorswieck 链接:https://arxiv.org/abs/2107.05991 摘要:在本文中,我们为支持NFV的网络提出了一个无线与核心资源联合分配框架。在所提出的系统模型中,目标是在保证不同服务类型的端到端(E2E)服务质量(QoS)的前提下最大化能源效率(EE)。为此,我们构建了一个优化问题:在无线部分分配功率和频谱资源;在核心部分,对网络功能进行链接、放置和调度,以保证所有用户的QoS。考虑到可用资源和无线信道的时变特性,该联合优化问题被建模为马尔可夫决策过程(MDP)。随后,利用基于最大熵框架的柔性行动者-评论家深度强化学习(SAC-DRL)算法求解上述MDP。数值结果表明,与分别优化R-RA和NFV-RA问题的情形相比,基于SAC-DRL算法的联合方法能显著降低能耗。 摘要:In this paper, we propose a joint radio and core resource allocation framework for NFV-enabled networks. In the proposed system model, the goal is to maximize energy efficiency (EE), by guaranteeing end-to-end (E2E) quality of service (QoS) for different service types. To this end, we formulate an optimization problem in which power and spectrum resources are allocated in the radio part. In the core part, the chaining, placement, and scheduling of functions are performed to ensure the QoS of all users. This joint optimization problem is modeled as a Markov decision process (MDP), considering time-varying characteristics of the available resources and wireless channels. A soft actor-critic deep reinforcement learning (SAC-DRL) algorithm based on the maximum entropy framework is subsequently utilized to solve the above MDP. Numerical results reveal that the proposed joint approach based on the SAC-DRL algorithm could significantly reduce energy consumption compared to the case in which R-RA and NFV-RA problems are optimized separately.
【16】 Deep Autoregressive Models with Spectral Attention 标题:具有谱注意的深度自回归模型
作者:Fernando Moreno-Pino,Pablo M. Olmos,Antonio Artés-Rodríguez 链接:https://arxiv.org/abs/2107.05984 摘要:时间序列预测是跨多个领域的重要问题,在许多实际应用中起着至关重要的作用。本文提出了一种将深度自回归模型与谱注意(SA)模块相结合的预测架构,该模块在模型的嵌入空间中融合全局和局部频域信息。通过在谱域中将时间序列的嵌入刻画为随机过程的实现,我们的方法能够识别全局趋势和季节性模式。两个谱注意模块,分别针对时间序列的全局与局部,将这些信息整合进预测,并通过谱滤波去除时间序列中的噪声。所提出的架构具有许多有用的特性:它可以有效地整合进众所周知的预测架构,需要的参数较少,并能产生可解释的结果,从而提高预测精度。我们在几个知名的预测数据集上测试了谱注意自回归模型(SAAM),一致地证明我们的模型与最先进的方法相比表现出色。 摘要:Time series forecasting is an important problem across many domains, playing a crucial role in multiple real-world applications. In this paper, we propose a forecasting architecture that combines deep autoregressive models with a Spectral Attention (SA) module, which merges global and local frequency domain information in the model's embedded space. By characterizing in the spectral domain the embedding of the time series as occurrences of a random process, our method can identify global trends and seasonality patterns. Two spectral attention models, global and local to the time series, integrate this information within the forecast and perform spectral filtering to remove time series's noise. The proposed architecture has a number of useful properties: it can be effectively incorporated into well-know forecast architectures, requiring a low number of parameters and producing interpretable results that improve forecasting accuracy. We test the Spectral Attention Autoregressive Model (SAAM) on several well-know forecast datasets, consistently demonstrating that our model compares favorably to state-of-the-art approaches.
其他(18篇)
【1】 Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability 标题:RL中为什么难以泛化:认知POMDP与隐含部分可观测性
作者:Dibya Ghosh,Jad Rahme,Aviral Kumar,Amy Zhang,Ryan P. Adams,Sergey Levine 机构:UC Berkeley, Princeton University, Facebook AI Research 备注:First two authors contributed equally 链接:https://arxiv.org/abs/2107.06277 摘要:泛化是强化学习(RL)系统在现实世界中部署的核心挑战。在这篇文章中,我们证明了RL问题的序列结构要求我们超越监督学习中已被充分研究的技术,采用新的泛化方法。虽然监督学习方法可以在不显式考虑认知不确定性的情况下有效泛化,但我们发现,也许令人惊讶的是,在RL中情况并非如此。我们证明,从有限的训练条件推广到未见过的测试条件会诱导隐式的部分可观测性,有效地将即使是完全可观测的MDP也变成POMDP。在此观察的启发下,我们将RL中的泛化问题重新表述为求解所诱导的部分可观测马尔可夫决策过程,我们称之为认知POMDP(epistemic POMDP)。我们展示了未能妥善处理这种部分可观测性的算法的失败模式,并提出了一种简单的基于集成(ensemble)的技术来近似求解该部分可观测问题。在Procgen基准测试套件上,我们通过实验证明,从认知POMDP导出的简单算法在泛化上比现有方法有显著提升。 摘要:Generalization is a central challenge for the deployment of reinforcement learning (RL) systems in the real world. In this paper, we show that the sequential structure of the RL problem necessitates new approaches to generalization beyond the well-studied techniques used in supervised learning. While supervised learning methods can generalize effectively without explicitly accounting for epistemic uncertainty, we show that, perhaps surprisingly, this is not the case in RL. We show that generalization to unseen test conditions from a limited number of training conditions induces implicit partial observability, effectively turning even fully-observed MDPs into POMDPs. Informed by this observation, we recast the problem of generalization in RL as solving the induced partially observed Markov decision process, which we call the epistemic POMDP. We demonstrate the failure modes of algorithms that do not appropriately handle this partial observability, and suggest a simple ensemble-based technique for approximately solving the partially observed problem. Empirically, we demonstrate that our simple algorithm derived from the epistemic POMDP achieves significant gains in generalization over current methods on the Procgen benchmark suite.
【2】 Object Tracking and Geo-localization from Street Images 标题:基于街道图像的目标跟踪与地理定位
作者:Daniel Wilson,Thayer Alshaabi,Colin Van Oort,Xiaohan Zhang,Jonathan Nelson,Safwan Wshah 备注:28 pages, 7 figures, to be submitted to Elsevier Pattern Recognition 链接:https://arxiv.org/abs/2107.06257 摘要:从街景图像中对静态物体进行地理定位具有挑战性,但对道路资产测绘和自动驾驶也非常重要。在本文中,我们提出了一个两阶段框架,从低帧率街景视频中检测交通标志并对其进行地理定位。我们提出的系统使用了RetinaNet的一个改进版本(GPS-RetinaNet),除了执行标准的分类和边界框回归外,还预测每个标志相对于相机的位置偏移。GPS-RetinaNet给出的候选标志检测由我们的自定义跟踪器浓缩为带地理位置的标志,该跟踪器由一个学习得到的度量网络和匈牙利算法的一个变体组成。我们的度量网络估计检测对之间的相似性,然后匈牙利算法使用度量网络提供的相似性分数在图像间匹配检测。我们的模型使用更新版本的ARTS数据集训练,该数据集包含25,544幅图像和47,589个标志标注(ARTS)。该数据集涵盖了从广泛道路中采集的多样环境。每个标注包含一个标志类别标签、其地理空间位置、组装标签(assembly label)、道路侧别指示器,以及有助于评估的唯一标识符。该数据集将支持该领域的未来进展,而所提出的系统展示了如何利用一个真实地理定位数据集的某些独特特性。 摘要:Geo-localizing static objects from street images is challenging but also very important for road asset mapping and autonomous driving. In this paper we present a two-stage framework that detects and geolocalizes traffic signs from low frame rate street videos. Our proposed system uses a modified version of RetinaNet (GPS-RetinaNet), which predicts a positional offset for each sign relative to the camera, in addition to performing the standard classification and bounding box regression. Candidate sign detections from GPS-RetinaNet are condensed into geolocalized signs by our custom tracker, which consists of a learned metric network and a variant of the Hungarian Algorithm. Our metric network estimates the similarity between pairs of detections, then the Hungarian Algorithm matches detections across images using the similarity scores provided by the metric network. Our models were trained using an updated version of the ARTS dataset, which contains 25,544 images and 47,589 sign annotations (ARTS). The proposed dataset covers a diverse set of environments gathered from a broad selection of roads. Each annotation contains a sign class label, its geospatial location, an assembly label, a side of road indicator, and unique identifiers that aid in the evaluation. This dataset will support future progress in the field, and the proposed system demonstrates how to take advantage of some of the unique characteristics of a realistic geolocalization dataset.
【3】 Everybody Is Unique: Towards Unbiased Human Mesh Recovery 标题:每个人都是独一无二的:迈向无偏的人体网格恢复
作者:Ren Li,Meng Zheng,Srikrishna Karanam,Terrence Chen,Ziyan Wu 机构:United Imaging Intelligence, Cambridge MA 备注:10 pages, 5 figures, 4 tables 链接:https://arxiv.org/abs/2107.06239 摘要:我们考虑肥胖人体网格恢复问题,即将参数化人体网格拟合到肥胖人群的图像上。尽管肥胖者的网格拟合是许多应用(如医疗保健)中的一个重要问题,但网格恢复方面的许多最新进展仅限于非肥胖者的图像。在这项工作中,我们通过呈现并讨论现有算法的局限性,指出了当前文献中的这一关键空白。接下来,我们提出了一个解决该问题的简单基线,它具有可扩展性,并且可以容易地与现有算法结合使用以提高其性能。最后,我们提出了一种广义的人体网格优化算法,它在肥胖者图像以及社区标准基准数据集上都大幅提升了现有方法的性能。该技术的一个关键创新是,它不依赖于制作成本高昂的网格参数监督。相反,从广泛且廉价可得的二维关键点标注出发,我们的方法自动生成网格参数,这些参数进而可用于重新训练和微调任何现有的网格估计算法。通过这种方式,我们的方法可以作为即插即用组件,提升各种当代网格估计方法的性能。我们在包含标准与肥胖者图像的多个数据集上进行了广泛实验,证明了所提技术的有效性。 摘要:We consider the problem of obese human mesh recovery, i.e., fitting a parametric human mesh to images of obese people. Despite obese person mesh fitting being an important problem with numerous applications (e.g., healthcare), much recent progress in mesh recovery has been restricted to images of non-obese people. In this work, we identify this crucial gap in the current literature by presenting and discussing limitations of existing algorithms. Next, we present a simple baseline to address this problem that is scalable and can be easily used in conjunction with existing algorithms to improve their performance. Finally, we present a generalized human mesh optimization algorithm that substantially improves the performance of existing methods on both obese person images as well as community-standard benchmark datasets. A key innovation of this technique is that it does not rely on supervision from expensive-to-create mesh parameters. Instead, starting from widely and cheaply available 2D keypoints annotations, our method automatically generates mesh parameters that can in turn be used to re-train and fine-tune any existing mesh estimation algorithm. This way, we show our method acts as a drop-in to improve the performance of a wide variety of contemporary mesh estimation methods. We conduct extensive experiments on multiple datasets comprising both standard and obese person images and demonstrate the efficacy of our proposed techniques.
【4】 Pattern Discovery and Validation Using Scientific Research Methods 标题:基于科学研究方法的模式发现与验证
作者:Dirk Riehle,Nikolay Harutyunyan,Ann Barcomb 机构:Friedrich-Alexander-University, Erlangen-Nürnberg, University of Calgary 链接:https://arxiv.org/abs/2107.06065 摘要:模式发现,即发现先前未被识别的模式的过程,通常作为一个临时性过程来执行,对所提出模式的质量几乎没有确定性保障。模式验证,即验证所提出模式准确性的过程,仍然由简单的"三次法则"(rule of three)启发式主导。本文展示了如何将已确立的科学研究方法用于模式发现与验证。我们提出了一种称为手册方法(handbook method)的具体途径,使用定性调查、行动研究和案例研究来发现和评估模式,并讨论了使用科学方法的一般性基本原则。我们通过三项探索性研究评估了手册方法,并证明了它的有用性。 摘要:Pattern discovery, the process of discovering previously unrecognized patterns, is often performed as an ad-hoc process with little resulting certainty in the quality of the proposed patterns. Pattern validation, the process of validating the accuracy of proposed patterns, remains dominated by the simple heuristic of "the rule of three". This article shows how to use established scientific research methods for the purpose of pattern discovery and validation. We present a specific approach, called the handbook method, that uses the qualitative survey, action research, and case study research for pattern discovery and evaluation, and we discuss the underlying principle of using scientific methods in general. We evaluate the handbook method using three exploratory studies and demonstrate its usefulness.
【5】 A Deep Generative Artificial Intelligence system to decipher species coexistence patterns 标题:破译物种共存模式的深度生成式人工智能系统
作者:J. Hirn,J. E. García,A. Montesinos-Navarro,R. Sanchez-Martín,V. Sanz,M. Verdú 机构:Department of Physics and Astronomy, University of Sussex 备注:15 pages, 5 figures 链接:https://arxiv.org/abs/2107.06020 摘要:1.破译共存模式是理解多样性维持的一个当前挑战,特别是在物种丰富的群落中,间接相互作用放大了这些模式的复杂性,使其难以用经典实验方法逼近。2.我们探索被称为生成式人工智能(GenAI)的前沿机器学习技术来破译植被斑块中的物种共存模式,训练生成对抗网络(GAN)和变分自编码器(VAE),然后用它们来揭示群落组装背后的一些机制。3.GAN准确地再现了真实斑块的物种组成以及植物物种对不同土壤类型的亲和性,VAE也达到了99%以上的高准确率。利用人工生成的斑块,我们发现高阶相互作用往往会抑制低阶相互作用的积极影响。最后,通过重建演替轨迹,我们可以识别出最有潜力在物种组成方面产生高度多样化斑块的先锋物种。4.理解多样生态群落中物种共存模式的复杂性需要超越启发式规则的新方法。生成式人工智能可以成为实现这一目标的有力工具,因为它能够克服这一挑战固有的高维性。 摘要:1. Deciphering coexistence patterns is a current challenge to understanding diversity maintenance, especially in rich communities where the complexity of these patterns is magnified through indirect interactions that prevent their approximation with classical experimental approaches. 2. We explore cutting-edge Machine Learning techniques called Generative Artificial Intelligence (GenAI) to decipher species coexistence patterns in vegetation patches, training generative adversarial networks (GAN) and variational AutoEncoders (VAE) that are then used to unravel some of the mechanisms behind community assemblage. 3. The GAN accurately reproduces the species composition of real patches as well as the affinity of plant species to different soil types, and the VAE also reaches a high level of accuracy, above 99%. Using the artificially generated patches, we found that high order interactions tend to suppress the positive effects of low order interactions. Finally, by reconstructing successional trajectories we could identify the pioneer species with larger potential to generate a high diversity of distinct patches in terms of species composition. 4. Understanding the complexity of species coexistence patterns in diverse ecological communities requires new approaches beyond heuristic rules. Generative Artificial Intelligence can be a powerful tool to this end as it allows to overcome the inherent dimensionality of this challenge.
【6】 Can Less be More? When Increasing-to-Balancing Label Noise Rates Considered Beneficial 标题:少即是多?何时将标签噪声率增至平衡是有益的
作者:Yang Liu,Jialu Wang 机构:University of California, Santa Cruz 备注:Preprint under review 链接:https://arxiv.org/abs/2107.05913 摘要:在本文中,我们回答了这样一个问题:何时插入标签噪声(信息量更低的标签)反而能使我们得到更准确、更公平的模型。我们主要受到两个观察的启发:1)增加某一类实例的标签噪声来平衡各类噪声率(增至平衡,increasing-to-balancing)会带来一个更容易的学习问题;2)增至平衡可以改善针对标签偏差的公平性保证。在本文中,我们将首先量化通过增加某一组实例的标签噪声率所引入的、关于学习难度与性能保证的权衡。我们通过解析分析证明了这种增加在何种情况下是有益的,无论是在改善泛化误差还是公平性保证方面。然后,我们提出了一种方法,将插入标签噪声的想法用于带噪声标签学习的任务,包括无公平性约束和有公平性约束两种情形。我们面临的主要技术挑战在于:我们不知道哪些数据实例受到更高噪声的影响,也没有真实标签来验证任何可能的假设。我们提出了一种检测方法,在不使用真实标签信息的情况下,告诉我们哪一组标签可能遭受更高的噪声。我们正式确立了所提出方案的有效性,并通过大量实验加以验证。 摘要:In this paper, we answer the question when inserting label noise (less informative labels) can instead return us more accurate and fair models. We are primarily inspired by two observations that 1) increasing a certain class of instances' label noise to balance the noise rates (increasing-to-balancing) results in an easier learning problem; 2) Increasing-to-balancing improves fairness guarantees against label bias. In this paper, we will first quantify the trade-offs introduced by increasing a certain group of instances' label noise rate w.r.t. the learning difficulties and performance guarantees. We analytically demonstrate when such an increase proves to be beneficial, in terms of either improved generalization errors or the fairness guarantees. Then we present a method to leverage our idea of inserting label noise for the task of learning with noisy labels, either without or with a fairness constraint. The primary technical challenge we face is due to the fact that we would not know which data instances are suffering from higher noise, and we would not have the ground truth labels to verify any possible hypothesis. We propose a detection method that informs us which group of labels might suffer from higher noise, without using ground truth information. We formally establish the effectiveness of the proposed solution and demonstrate it with extensive experiments.
【7】 Recent Advances in Leveraging Human Guidance for Sequential Decision-Making Tasks 标题:利用人类指导进行顺序决策任务的最新进展
作者:Ruohan Zhang,Faraz Torabi,Garrett Warnell,Peter Stone 机构:Department of Computer Science, The University of Texas at Austin 备注:None 链接:https://arxiv.org/abs/2107.05825 摘要:人工智能的一个长期目标是创造能够学习执行需要顺序决策的任务的人工智能体。重要的是,虽然学习和行动的是人工智能体,但仍由人类来指定要执行的具体任务。经典的任务指定方法通常是由人类提供静态的奖励函数,或对所需任务进行显式演示。然而,最近有大量研究精力投入到探索人类引导学习智能体的替代方式上,这些方式可能更适合某些任务,或需要更少的人力。本综述对五个近期的机器学习框架进行了高层次概述,这些框架主要依赖于预先指定的奖励函数或传统分步动作演示之外的人类指导。我们回顾了每个框架的动机、假设和实现,并讨论了未来可能的研究方向。 摘要:A longstanding goal of artificial intelligence is to create artificial agents capable of learning to perform tasks that require sequential decision making. Importantly, while it is the artificial agent that learns and acts, it is still up to humans to specify the particular task to be performed. Classical task-specification approaches typically involve humans providing stationary reward functions or explicit demonstrations of the desired tasks. However, there has recently been a great deal of research energy invested in exploring alternative ways in which humans may guide learning agents that may, e.g., be more suitable for certain tasks or require less human effort. This survey provides a high-level overview of five recent machine learning frameworks that primarily rely on human guidance apart from pre-specified reward functions or conventional, step-by-step action demonstrations. We review the motivation, assumptions, and implementation of each framework, and we discuss possible future research directions.
【8】 Carle's Game: An Open-Ended Challenge in Exploratory Machine Creativity 标题:Carle‘s Game:探索性机器创造力的无限制挑战
作者:Q. Tyrell Davis 备注:8 pages, 11 figures, accepted to IEEE Conference on Games 2021: 978-1-6654-3886-5/21/$31.00 copyright 2021 IEEE 链接:https://arxiv.org/abs/2107.05786 摘要:本文既是一篇引介,也是一份邀请。它介绍了CARLE,一个类生命(Life-like)元胞自动机模拟器与强化学习环境;同时也是对Carle's Game的邀请,这是一项关于开放式机器探索与创造力的挑战。促使机器智能体擅长在多个元胞自动机宇宙中创造有趣的模式是一项重大挑战,应对这一挑战可能需要人工生命、人工智能、机器学习和复杂性等多个相关领域在多个层面上的贡献。Carle's Game基于机器智能体与CARLE(一个元胞自动机强化学习环境)的交互。CARLE十分灵活,能够模拟定义类生命元胞自动机宇宙的全部262,144种不同规则。CARLE速度也很快,通过向量化与GPU加速相结合,能以每秒数万步的速度模拟自动机宇宙。最后,CARLE很简单:与为人类玩家设计的高保真物理模拟器和视频游戏相比,CARLE的二维网格世界提供了一个离散、确定且原子化的通用游乐场,尽管它具有复杂性。与CARLE配套,Carle's Game提供了一组初始的智能体策略、学习与元学习算法,以及可定制的奖励包装器(reward wrapper),用以鼓励探索或特定任务。 摘要:This paper is both an introduction and an invitation. It is an introduction to CARLE, a Life-like cellular automata simulator and reinforcement learning environment. It is also an invitation to Carle's Game, a challenge in open-ended machine exploration and creativity. Inducing machine agents to excel at creating interesting patterns across multiple cellular automata universes is a substantial challenge, and approaching this challenge is likely to require contributions from the fields of artificial life, AI, machine learning, and complexity, at multiple levels of interest. Carle's Game is based on machine agent interaction with CARLE, a Cellular Automata Reinforcement Learning Environment. CARLE is flexible, capable of simulating any of the 262,144 different rules defining Life-like cellular automaton universes. CARLE is also fast and can simulate automata universes at a rate of tens of thousands of steps per second through a combination of vectorization and GPU acceleration. Finally, CARLE is simple. Compared to high-fidelity physics simulators and video games designed for human players, CARLE's two-dimensional grid world offers a discrete, deterministic, and atomic universal playground, despite its complexity. In combination with CARLE, Carle's Game offers an initial set of agent policies, learning and meta-learning algorithms, and reward wrappers that can be tailored to encourage exploration or specific tasks.
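类生命元胞自动机由"出生/存活"(B/S)规则定义:B、S 各为 {0..8} 的子集,共 2^9 × 2^9 = 262,144 种组合,这正是摘要中规则数的来源。下面是一步更新的向量化示意(与 CARLE 的具体实现无关):

```python
# 示意:可配置 B/S 规则的类生命元胞自动机一步更新(B3/S23 即康威生命游戏)
import numpy as np

def ca_step(grid, birth={3}, survive={2, 3}):
    # 周期边界下的 8 邻居计数(向量化)
    n = sum(np.roll(np.roll(grid, i, 0), j, 1)
            for i in (-1, 0, 1) for j in (-1, 0, 1) if (i, j) != (0, 0))
    born = (grid == 0) & np.isin(n, list(birth))
    stay = (grid == 1) & np.isin(n, list(survive))
    return (born | stay).astype(np.uint8)

rng = np.random.default_rng(0)
world = (rng.random((64, 64)) < 0.3).astype(np.uint8)
for _ in range(100):
    world = ca_step(world)
print("alive cells after 100 steps:", int(world.sum()))
```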
【9】 Fast and Explicit Neural View Synthesis 标题:快速显式神经视图综合
作者:Pengsheng Guo,Miguel Angel Bautista,Alex Colburn,Liang Yang,Daniel Ulbricht,Joshua M. Susskind,Qi Shan 机构:Apple 链接:https://arxiv.org/abs/2107.05775 摘要:我们研究由三维物体组成的场景的新视图合成问题。我们提出了一种简单而有效的方法,它既不是连续的也不是隐式的,对近期视图合成的研究趋势提出了挑战。我们证明,尽管连续辐射场表示因其表达能力而受到广泛关注,但我们的简单方法在获得与最新基线相当甚至更好的新视图重建质量的同时,将渲染速度提高了400倍以上。我们的模型以类别无关的方式训练,不需要针对特定场景的优化。因此,它能够将新视图合成推广到训练过程中未见过的物体类别。此外,我们证明,借助这一简洁的建模形式,我们可以将视图合成用作自监督信号,在没有显式三维监督的情况下高效地学习三维几何。 摘要:We study the problem of novel view synthesis of a scene comprised of 3D objects. We propose a simple yet effective approach that is neither continuous nor implicit, challenging recent trends on view synthesis. We demonstrate that although continuous radiance field representations have gained a lot of attention due to their expressive power, our simple approach obtains comparable or even better novel view reconstruction quality comparing with state-of-the-art baselines while increasing rendering speed by over 400x. Our model is trained in a category-agnostic manner and does not require scene-specific optimization. Therefore, it is able to generalize novel view synthesis to object categories not seen during training. In addition, we show that with our simple formulation, we can use view synthesis as a self-supervision signal for efficient learning of 3D geometry without explicit 3D supervision.
【10】 Strategic Instrumental Variable Regression: Recovering Causal Relationships From Strategic Responses 标题:战略工具变量回归:从战略反应中恢复因果关系
作者:Keegan Harris,Daniel Ngo,Logan Stapleton,Hoda Heidari,Zhiwei Steven Wu 机构:Carnegie Mellon University; University of Minnesota 链接:https://arxiv.org/abs/2107.05762 摘要:机器学习算法经常提示个体策略性地修改其可观察属性以获得更有利的预测。因此,预测模型所训练的分布可能与部署时所操作的分布不同。一般来说,这种分布变化阻碍了准确的预测,但我们的工作发现了一个与战略反应引起的变化相关的独特机会:我们表明,我们可以有效地利用战略反应来恢复我们希望预测的可观察特征和结果之间的因果关系。更具体地说,我们研究了一个博弈论模型,其中校长部署了一系列模型来预测一系列战略代理人(例如,大学申请人)的利益结果(例如,大学平均绩点)。作为回应,战略代理人投入努力并修改其特征以获得更好的预测。在这种情况下,未观察到的混杂变量可以影响代理人的可观察特征(如高中记录)和结果。因此,标准回归方法通常产生有偏估计。为了解决这个问题,我们的工作建立了机器学习模型的策略反应和工具变量(IV)回归之间的新联系,通过观察部署模型的序列可以被视为影响代理的可观察特征的工具,但不直接影响其结果。因此,两阶段最小二乘(2SLS)回归可以恢复可观察特征和结果之间的因果关系。除了因果恢复,我们还可以在2SLS方法的基础上解决两个额外的相关优化目标:代理结果最大化和预测风险最小化。最后,我们对半合成数据的数值模拟表明,我们的方法在因果关系估计方面明显优于OLS回归。 摘要:Machine Learning algorithms often prompt individuals to strategically modify their observable attributes to receive more favorable predictions. As a result, the distribution the predictive model is trained on may differ from the one it operates on in deployment. While such distribution shifts, in general, hinder accurate predictions, our work identifies a unique opportunity associated with shifts due to strategic responses: We show that we can use strategic responses effectively to recover causal relationships between the observable features and outcomes we wish to predict. More specifically, we study a game-theoretic model in which a principal deploys a sequence of models to predict an outcome of interest (e.g., college GPA) for a sequence of strategic agents (e.g., college applicants). In response, strategic agents invest efforts and modify their features for better predictions. In such settings, unobserved confounding variables can influence both an agent's observable features (e.g., high school records) and outcomes. Therefore, standard regression methods generally produce biased estimators. In order to address this issue, our work establishes a novel connection between strategic responses to machine learning models and instrumental variable (IV) regression, by observing that the sequence of deployed models can be viewed as an instrument that affects agents' observable features but does not directly influence their outcomes. Therefore, two-stage least squares (2SLS) regression can recover the causal relationships between observable features and outcomes. Beyond causal recovery, we can build on our 2SLS method to address two additional relevant optimization objectives: agent outcome maximization and predictive risk minimization. Finally, our numerical simulations on semi-synthetic data show that our methods significantly outperform OLS regression in causal relationship estimation.
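文中2SLS的思路可用如下极简示意说明:把随时间部署的模型(参数)视作工具变量Z,第一阶段用Z回归特征X得到拟合值,第二阶段用拟合值回归结果y。数据生成过程为虚构,仅用于展示OLS在混杂下的偏倚与2SLS的纠偏效果。

```python
# 示意:两阶段最小二乘(2SLS)对比 OLS(玩具数据,真实因果效应 = 1.5)
import numpy as np

rng = np.random.default_rng(0)
n = 5000
Z = rng.normal(size=(n, 1))                  # 工具:随时间部署的模型参数
u = rng.normal(size=n)                       # 未观测混杂
X = 2.0 * Z[:, 0] + u + rng.normal(size=n)   # 特征受工具与混杂共同影响
y = 1.5 * X + 3.0 * u + rng.normal(size=n)   # 结果同时受混杂影响

add1 = lambda M: np.column_stack([np.ones(n), M])
# 第一阶段:用 Z 回归 X,得到拟合值 X_hat
Xhat = add1(Z) @ np.linalg.lstsq(add1(Z), X, rcond=None)[0]
# 第二阶段:用 X_hat 回归 y,取 X_hat 的系数
beta_2sls = np.linalg.lstsq(add1(Xhat), y, rcond=None)[0][1]
beta_ols = np.linalg.lstsq(add1(X), y, rcond=None)[0][1]
print(f"OLS: {beta_ols:.2f} (biased)  2SLS: {beta_2sls:.2f} (~1.5)")
```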
【11】 Adapting to Misspecification in Contextual Bandits 标题:适应上下文bandit中的模型误设
作者:Dylan J. Foster,Claudio Gentile,Mehryar Mohri,Julian Zimmert 机构:Massachusetts Institute of Technology; Courant Institute of Mathematical Sciences 备注:Appeared at NeurIPS 2020 链接:https://arxiv.org/abs/2107.05745 摘要:上下文bandit的一个主要研究方向是开发计算上高效、同时支持灵活通用函数逼近的算法。基于奖励建模的算法已显示出很强的经验性能,但通常要求模型被正确设定,当这一假设不成立时可能失效。我们能否设计出既高效又灵活、并在模型误设时优雅退化的算法?我们为$\varepsilon$-误设的上下文bandit引入了一类新的oracle高效算法,它们能适应未知的模型误设程度,并同时适用于有限与无限动作设置。在可访问平方损失回归在线oracle的条件下,我们的算法在没有先验知识的情况下达到最优的后悔,特别是对误设水平的最优依赖。针对$d$维、具有无限动作的线性上下文bandit,我们得到了第一个在未知误设水平$\varepsilon$下达到最优$O(d\sqrt{T} + \varepsilon\sqrt{d}T)$后悔界的算法。在概念层面上,我们的结果得益于对Foster和Rakhlin的回归oracle归约框架的一种新的基于优化的视角,我们预计它将得到更广泛的应用。 摘要:A major research direction in contextual bandits is to develop algorithms that are computationally efficient, yet support flexible, general-purpose function approximation. Algorithms based on modeling rewards have shown strong empirical performance, but typically require a well-specified model, and can fail when this assumption does not hold. Can we design algorithms that are efficient and flexible, yet degrade gracefully in the face of model misspecification? We introduce a new family of oracle-efficient algorithms for $\varepsilon$-misspecified contextual bandits that adapt to unknown model misspecification -- both for finite and infinite action settings. Given access to an online oracle for square loss regression, our algorithm attains optimal regret and -- in particular -- optimal dependence on the misspecification level, with no prior knowledge. Specializing to linear contextual bandits with infinite actions in $d$ dimensions, we obtain the first algorithm that achieves the optimal $O(d\sqrt{T} + \varepsilon\sqrt{d}T)$ regret bound for unknown misspecification level $\varepsilon$. On a conceptual level, our results are enabled by a new optimization-based perspective on the regression oracle reduction framework of Foster and Rakhlin, which we anticipate will find broader use.
【12】 Computational modelling and data-driven homogenisation of knitted membranes 标题:针织薄膜的计算建模与数据驱动均匀化
作者:Sumudu Herath,Xiao Xiao,Fehmi Cirak 机构:Department of Civil Engineering, University of Moratuwa, Moratuwa, Sri Lanka; Inria, Sophia Antipolis, France; Department of Engineering, University of Cambridge, Cambridge, U.K. 备注:23 pages, 14 figures 链接:https://arxiv.org/abs/2107.05707 摘要:针织是一种生产复杂三维曲面的有效技术,这得益于交织纱线固有的柔性,而制造技术的最新进展也提供了对局部针法模式的更好控制。对大尺度针织膜进行完全纱线级建模是不可行的。因此,我们采用双尺度均匀化方法:在宏观尺度上将膜建模为Kirchhoff-Love壳,在微观尺度上将纱线建模为Euler-Bernoulli杆。壳和杆的控制方程均用三次B样条基函数离散。由于大变形和接触约束的施加,非线性微尺度问题的求解需要大量时间,使得传统的在线计算均匀化方法不可行。为避免这一问题,我们使用预先训练的统计高斯过程回归(GPR)模型将宏观变形映射到宏观应力。在离线学习阶段,GPR模型通过对一组足够丰富的变形状态(由均匀采样或Sobol采样获得)求解微尺度问题来训练。训练好的GPR模型编码了微尺度中存在的非线性和各向异性,并作为宏观Kirchhoff-Love壳的材料模型。在验证了所提方法的各个组成部分之后,我们给出了若干膜受拉伸和剪切的算例,以展示其通用性和良好性能。 摘要:Knitting is an effective technique for producing complex three-dimensional surfaces owing to the inherent flexibility of interlooped yarns and recent advances in manufacturing providing better control of local stitch patterns. Fully yarn-level modelling of large-scale knitted membranes is not feasible. Therefore, we consider a two-scale homogenisation approach and model the membrane as a Kirchhoff-Love shell on the macroscale and as Euler-Bernoulli rods on the microscale. The governing equations for both the shell and the rod are discretised with cubic B-spline basis functions. The solution of the nonlinear microscale problem requires a significant amount of time due to the large deformations and the enforcement of contact constraints, rendering conventional online computational homogenisation approaches infeasible. To sidestep this problem, we use a pre-trained statistical Gaussian Process Regression (GPR) model to map the macroscale deformations to macroscale stresses. During the offline learning phase, the GPR model is trained by solving the microscale problem for a sufficiently rich set of deformation states obtained by either uniform or Sobol sampling. The trained GPR model encodes the nonlinearities and anisotropies present in the microscale and serves as a material model for the macroscale Kirchhoff-Love shell. After verifying and validating the different components of the proposed approach, we introduce several examples involving membranes subjected to tension and shear to demonstrate its versatility and good performance.
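"离线训练GPR代理、在线替代昂贵微尺度求解"这一流程可用如下示意说明。这里用一个一维玩具本构关系代替真实的纱线级模型,核与采样方式均为假设,仅体现离线-在线两个阶段:

```python
# 示意:GPR 代理把宏观变形映射到宏观应力(玩具一维本构,非论文模型)
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def microscale_solver(strain):              # 假想的昂贵微尺度求解器
    return np.tanh(3 * strain) + 0.1 * strain ** 3

# 离线阶段:在采样得到的变形状态上求解微尺度问题并训练代理
strain_train = np.linspace(-1, 1, 40)[:, None]       # 也可用 Sobol 采样
stress_train = microscale_solver(strain_train[:, 0])
gpr = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(1e-6),
                               normalize_y=True).fit(strain_train, stress_train)

# 在线阶段:宏观壳单元直接查询代理,无需再解微尺度问题
mean, std = gpr.predict(np.array([[0.37]]), return_std=True)
print(f"stress ≈ {mean[0]:.4f} ± {std[0]:.4f}")
```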
【13】 Quantifying Explainability in NLP and Analyzing Algorithms for Performance-Explainability Tradeoff 标题:NLP中可解释性的量化及性能-可解释性折衷算法分析
作者:Mitchell Naylor,Christi French,Samantha Terker,Uday Kamath 备注:To appear at Interpretable ML in Healthcare workshop at ICML 2021. 9 pages (excluding references), 6 figures 链接:https://arxiv.org/abs/2107.05693 摘要:医疗保健领域是机器学习最令人兴奋的应用领域之一,但模型透明度的缺乏导致行业内采用的滞后。在这项工作中,我们通过一个临床文本分类的案例研究考察当前可解释性技术的现状,所用任务是基于MIMIC-III临床记录的死亡率预测。我们展示了适用于完全可解释方法以及模型无关的事后归因(post hoc attribution)的多种可视化技术,并提供了一种通用方法,使用不忠实度(infidelity)和局部Lipschitz来评估从logistic回归到BERT变体等各类模型的解释质量。借助这些度量,我们引入了一个框架,从业者和研究人员可以通过它评估模型预测性能与其可用解释质量之间的权衡前沿。我们公开了代码,以鼓励对这些方法的持续改进。 摘要:The healthcare domain is one of the most exciting application areas for machine learning, but a lack of model transparency contributes to a lag in adoption within the industry. In this work, we explore the current art of explainability and interpretability within a case study in clinical text classification, using a task of mortality prediction within MIMIC-III clinical notes. We demonstrate various visualization techniques for fully interpretable methods as well as model-agnostic post hoc attributions, and we provide a generalized method for evaluating the quality of explanations using infidelity and local Lipschitz across model types from logistic regression to BERT variants. With these metrics, we introduce a framework through which practitioners and researchers can assess the frontier between a model's predictive performance and the quality of its available explanations. We make our code available to encourage continued refinement of these methods.
【14】 Least-Squares Linear Dilation-Erosion Regressor Trained using Stochastic Descent Gradient or the Difference of Convex Methods 标题:用随机下降梯度或凸差法训练的最小二乘线性膨胀-侵蚀回归器
作者:Angelica Lourenço Oliveira,Marcos Eduardo Valle 备注:None 链接:https://arxiv.org/abs/2107.05682 摘要:本文提出了一种用于回归任务的混合形态神经网络,称为线性膨胀-侵蚀回归器($\ell$-DER)。简而言之,一个$\ell$-DER模型由线性算子与初等形态算子复合后的凸组合给出。因此,它们产生连续的分段线性函数,从而是通用逼近器。除了介绍$\ell$-DER模型外,我们还提出了三种训练这些模型的方法:一种基于随机梯度下降(SDG),另两种基于凸函数之差(DC)规划问题。最后,我们使用14个回归任务评估了$\ell$-DER模型的性能。尽管基于SDG的方法被证明比其他两种方法更快,但使用规范化凸凹规划问题训练的$\ell$-DER在平均绝对误差上优于其他方法。 摘要:This paper presents a hybrid morphological neural network for regression tasks called linear dilation-erosion regression ($\ell$-DER). In few words, an $\ell$-DER model is given by a convex combination of the composition of linear and elementary morphological operators. As a result, they yield continuous piecewise linear functions and, thus, are universal approximators. Apart from introducing the $\ell$-DER models, we present three approaches for training these models: one based on stochastic descent gradient and two based on the difference of convex programming problems. Finally, we evaluate the performance of the $\ell$-DER model using 14 regression tasks. Although the approach based on SDG proved to be faster than the other two, the $\ell$-DER trained using a disciplined convex-concave programming problem outperformed the others in terms of the least mean absolute error score.
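按摘要所述结构,$\ell$-DER 的输出是线性算子与初等形态算子(max-plus 膨胀、min-plus 侵蚀)凸组合后的分段线性函数。下面给出其前向计算的一个示意;形态项的具体组合方式与参数均为演示性假设,训练部分(SDG 或凸差规划)未包含:

```python
# 示意:线性膨胀-侵蚀回归器的前向计算(结构示意,非论文实现)
import numpy as np

def ell_der_forward(x, w, c, a, b, beta):
    linear = x @ w + c                           # 线性算子
    dilation = np.max(x + a, axis=-1)            # max-plus 膨胀
    erosion = np.min(x + b, axis=-1)             # min-plus 侵蚀
    morph = dilation + erosion                   # 初等形态算子的一种组合(假设)
    return beta * linear + (1 - beta) * morph    # 凸组合 => 连续分段线性

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
w, c = rng.normal(size=3), 0.1
a, b = rng.normal(size=3), rng.normal(size=3)
print(ell_der_forward(X, w, c, a, b, beta=0.7))
```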
【15】 Functional Magnetic Resonance Imaging data augmentation through conditional ICA 标题:基于条件ICA的功能磁共振成像数据增强
作者:Badr Tajini,Hugo Richard,Bertrand Thirion 机构:Inria, CEA, Université Paris-Saclay, France 备注:14 pages, 5 figures, 7 tables 链接:https://arxiv.org/abs/2107.06104 摘要:计算认知神经成像研究的进展与大量带标注脑成像数据的可得性密切相关,但此类数据既稀缺又昂贵。虽然强大的数据生成机制(如生成对抗网络,GAN)在过去十年中已为计算机视觉设计出来,但这些进展尚未延伸到脑成像。一个可能的原因是,GAN的训练不适合功能性神经成像中噪声大、维度高且样本量小的数据。本文介绍了条件独立分量分析(Conditional ICA):一种快速的功能磁共振成像(fMRI)数据增强技术,它利用丰富的静息态数据,通过从ICA分解中采样来生成图像。随后我们提出了一种机制,使生成器以仅有少量样本的类别为条件。我们首先证明该生成机制能成功合成与真实观测不可区分的数据,并能在脑解码问题中带来分类准确率的提升;特别是,它的性能优于GAN,同时更易于优化和解释。最后,Conditional ICA在无需进一步参数调整的情况下提高了八个数据集上的分类准确率。 摘要:Advances in computational cognitive neuroimaging research are related to the availability of large amounts of labeled brain imaging data, but such data are scarce and expensive to generate. While powerful data generation mechanisms, such as Generative Adversarial Networks (GANs), have been designed in the last decade for computer vision, such improvements have not yet carried over to brain imaging. A likely reason is that GANs training is ill-suited to the noisy, high-dimensional and small-sample data available in functional neuroimaging. In this paper, we introduce Conditional Independent Components Analysis (Conditional ICA): a fast functional Magnetic Resonance Imaging (fMRI) data augmentation technique, that leverages abundant resting-state data to create images by sampling from an ICA decomposition. We then propose a mechanism to condition the generator on classes observed with few samples. We first show that the generative mechanism is successful at synthesizing data indistinguishable from observations, and that it yields gains in classification accuracy in brain decoding problems. In particular it outperforms GANs while being much easier to optimize and interpret. Lastly, Conditional ICA enhances classification accuracy in eight datasets without further parameters tuning.
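"从ICA分解中采样以生成数据"这一骨架可用如下示意说明:在(假想的)静息态数据上拟合ICA,对源信号做自助重采样,再经混合矩阵映射回观测空间。真正的Conditional ICA还会按类别对生成器加条件,这里省略:

```python
# 示意:基于 ICA 分解的简单数据增强骨架(非论文完整方法)
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
rest = rng.laplace(size=(1000, 20)) @ rng.normal(size=(20, 64))  # 假想静息态数据

ica = FastICA(n_components=20, random_state=0)
S = ica.fit_transform(rest)                 # 源信号 (n_samples, n_components)

def augment(n):
    # 对每个独立成分独立地自助重采样,再映射回数据空间
    idx = rng.integers(0, S.shape[0], size=(n, S.shape[1]))
    S_new = S[idx, np.arange(S.shape[1])]
    return ica.inverse_transform(S_new)

fake = augment(256)
print(fake.shape)   # (256, 64);与真实观测的不可区分性需另行检验
```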
【16】 Oversampling Divide-and-conquer for Response-skewed Kernel Ridge Regression
Authors: Jingyi Zhang, Xiaoxiao Sun
Affiliation: Department of Industrial Engineering and Center for Statistical Science, Tsinghua University; Epidemiology and Biostatistics Department, University of Arizona
Link: https://arxiv.org/abs/2107.05834
Abstract: The divide-and-conquer method has been widely used to compute large-scale kernel ridge regression estimates. Unfortunately, when the response variable is highly skewed, divide-and-conquer kernel ridge regression (dacKRR) may overlook the underrepresented region and yield unacceptable results. We develop a novel response-adaptive partition strategy to overcome this limitation. In particular, we propose to allocate replicates of some carefully identified informative observations to multiple nodes (local processors). The idea is analogous to the popular oversampling technique. Although that technique has been widely used to address discrete label skewness, extending it to the dacKRR setting is nontrivial. We provide both theoretical and practical guidance on how to effectively oversample observations in the dacKRR setting. Furthermore, we show that the proposed estimate has a smaller asymptotic mean squared error (AMSE) than the classical dacKRR estimate under mild conditions. Our theoretical findings are supported by both simulated- and real-data analyses.
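The allocation idea can be sketched in a few lines: identify a small set of informative observations, replicate them on every node, fit a kernel ridge regressor per node, and average the node predictions. The tail-quantile rule below is a crude stand-in for the paper's principled identification criterion; everything else is illustrative scikit-learn usage.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(2000, 1))
y = np.sinc(X.ravel()) + rng.exponential(0.3, size=2000)  # skewed response

# Crude stand-in for "informative" observations: the response tail.
cutoff = np.quantile(y, 0.95)
tail = np.where(y > cutoff)[0]
rest = rng.permutation(np.where(y <= cutoff)[0])

models = []
for idx in np.array_split(rest, 8):                  # 8 local nodes
    idx = np.concatenate([idx, tail])                # replicate tail points
    models.append(KernelRidge(kernel="rbf", alpha=1e-2).fit(X[idx], y[idx]))

X_new = np.linspace(-3, 3, 200).reshape(-1, 1)
y_hat = np.mean([m.predict(X_new) for m in models], axis=0)  # average nodes
print(y_hat[:5])
```

Without the replication step, each node sees only a handful of tail responses and the averaged fit systematically underestimates the skewed region.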
【17】 Calibrating Predictions to Decisions: A Novel Approach to Multi-Class Calibration
Authors: Shengjia Zhao, Michael P. Kim, Roshni Sahoo, Tengyu Ma, Stefano Ermon
Affiliation: Stanford University, UC Berkeley
Link: https://arxiv.org/abs/2107.05719
Abstract: When facing uncertainty, decision-makers want predictions they can trust. A machine learning provider can convey confidence to decision-makers by guaranteeing that its predictions are distribution calibrated: among the inputs that receive a predicted class-probability vector $q$, the actual distribution over classes is $q$. For multi-class prediction problems, however, achieving distribution calibration tends to be infeasible, requiring sample complexity exponential in the number of classes $C$. In this work, we introduce a new notion, \emph{decision calibration}, which requires the predicted distribution and the true distribution to be "indistinguishable" to a set of downstream decision-makers. When all possible decision-makers are considered, decision calibration coincides with distribution calibration. However, when we only consider decision-makers choosing between a bounded number of actions (e.g., polynomial in $C$), our main result shows that decision calibration becomes feasible: we design a recalibration algorithm whose sample complexity is polynomial in the number of actions and the number of classes. We validate our recalibration algorithm empirically: compared to existing methods, decision calibration improves decision-making on skin-lesion and ImageNet classification with modern neural network predictors.
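One way to read the definition: a predictor is decision calibrated for a loss-minimizing decision-maker if, within each group of inputs that triggers the same action, the loss the forecasts promise matches the loss actually incurred. A minimal numpy sketch of that check follows, with a random loss matrix standing in for a downstream decision-maker; labels are drawn from the forecasts themselves, so the two columns should roughly agree. This illustrates the criterion only, not the authors' recalibration algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
C, K, n = 3, 2, 5000      # classes, actions, samples

L = rng.uniform(size=(C, K))                      # decision-maker's loss matrix
q = rng.dirichlet(np.ones(C), size=n)             # predicted class probabilities
y = np.array([rng.choice(C, p=qi) for qi in q])   # labels drawn from q itself

actions = np.argmin(q @ L, axis=1)  # loss-minimizing action under the forecast
for a in range(K):
    mask = actions == a
    if not mask.any():
        continue
    predicted = float((q[mask] @ L[:, a]).mean())  # loss the forecasts promise
    realized = float(L[y[mask], a].mean())         # loss actually incurred
    print(f"action {a}: predicted {predicted:.3f}, realized {realized:.3f}")
```

A miscalibrated predictor would show a systematic gap between the predicted and realized columns for at least one action.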
【18】 Quality of Service Guarantees for Physical Unclonable Functions
Authors: Onur Günlü, Rafael F. Schaefer, H. Vincent Poor
Affiliation: Chair of Communications Engineering and Security, University of Siegen, Siegen, Germany; Electrical and Computer Engineering Department, Princeton University, Princeton, NJ, USA
Comments: Submitted to IEEE
Link: https://arxiv.org/abs/2107.05675
Abstract: We consider a secret-key agreement problem in which noisy physical unclonable function (PUF) outputs facilitate reliable, secure, and private key agreement with the help of public, noiseless, authenticated storage. PUF outputs are highly correlated, so transform-coding methods have been combined with scalar quantizers to extract uncorrelated bit sequences with reliability guarantees. For PUF circuits with continuous-valued outputs, the models for transformed outputs are made more realistic by replacing the fitted distributions with corresponding truncated ones. State-of-the-art PUF methods that provide reliability guarantees for each extracted bit are shown to be inadequate for guaranteeing the same reliability level across all PUF outputs. Thus, a quality-of-service parameter is introduced to control the percentage of PUF outputs for which a target reliability level can be guaranteed. Using a public ring oscillator (RO) output dataset, we illustrate that a truncated Gaussian distribution can be fitted to the transformed RO outputs that feed uniform scalar quantizers, so that, by eliminating a small subset of PUF outputs, reliability guarantees can be provided for every bit extracted from any PUF device under additive Gaussian noise. Conversely, we show that it is not possible to provide such reliability guarantees without eliminating any PUF output, unless extra secrecy and privacy leakage is allowed.
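A back-of-the-envelope illustration of the quality-of-service idea for a one-bit sign quantizer under additive Gaussian noise: each transformed output's per-bit reliability is $\Phi(|x|/\sigma)$, and the QoS parameter corresponds to the fraction of outputs retained after discarding those below a target. Gaussian samples stand in for the truncated-Gaussian fits to real RO data, and the noise level and threshold below are made-up values.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)  # Gaussian stand-in for transformed RO outputs
sigma = 0.3                  # additive Gaussian noise standard deviation
target = 0.999               # per-bit reliability target

# One-bit sign quantizer: a bit flips when noise pushes x across zero,
# so per-output reliability is Phi(|x| / sigma).
reliability = norm.cdf(np.abs(x) / sigma)

keep = reliability >= target
print(f"{keep.mean():.1%} of outputs meet the {target} reliability target")
```

Outputs near the quantizer boundary are exactly the ones that cannot meet the target, which is why discarding a small subset buys a uniform guarantee for the rest.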