Machine Learning Academic Digest [8.19]

2021-08-24 16:35:12


cs.LG: 61 papers in total today

Graph-related (graph learning | graph neural networks | graph optimization, etc.) (3 papers)

【1】 Deep Graph Memory Networks for Forgetting-Robust Knowledge Tracing Link: https://arxiv.org/abs/2108.08105

Authors: Ghodai Abdelrahman, Qing Wang Affiliation: Research School of Computer Science, Australian National University Abstract: Tracing a student's knowledge is vital for tailoring the learning experience. Recent knowledge tracing methods tend to respond to these challenges by modelling knowledge state dynamics across learning concepts. However, they still suffer from several inherent challenges including: modelling forgetting behaviours and identifying relationships among latent concepts. To address these challenges, in this paper, we propose a novel knowledge tracing model, namely Deep Graph Memory Network (DGMN). In this model, we incorporate a forget gating mechanism into an attention memory structure in order to capture forgetting behaviours dynamically during the knowledge tracing process. Particularly, this forget gating mechanism is built upon attention forgetting features over latent concepts considering their mutual dependencies. Further, this model has the capability of learning relationships between latent concepts from a dynamic latent concept graph in light of a student's evolving knowledge states. A comprehensive experimental evaluation has been conducted using four well-established benchmark datasets. The results show that DGMN consistently outperforms the state-of-the-art KT models over all the datasets. The effectiveness of modelling forgetting behaviours and learning latent concept graphs has also been analyzed in our experiments.

【2】 Variational Graph Normalized Auto-Encoders Link: https://arxiv.org/abs/2108.08046

Authors: Seong Jin Ahn, Myoung Ho Kim Affiliation: KAIST, Daejeon, Republic of Korea Abstract: Link prediction is one of the key problems for graph-structured data. With the advancement of graph neural networks, graph autoencoders (GAEs) and variational graph autoencoders (VGAEs) have been proposed to learn graph embeddings in an unsupervised way. It has been shown that these methods are effective for link prediction tasks. However, they do not work well in link prediction when a node whose degree is zero (i.e., an isolated node) is involved. We have found that GAEs/VGAEs make embeddings of isolated nodes close to zero regardless of their content features. In this paper, we propose a novel Variational Graph Normalized AutoEncoder (VGNAE) that utilizes $L_2$-normalization to derive better embeddings for isolated nodes. We show that our VGNAEs outperform the existing state-of-the-art models for link prediction tasks. The code is available at https://github.com/SeongJinAhn/VGNAE.
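As a rough sketch of the normalization idea, the snippet below L2-normalizes node embeddings before a propagation step so that isolated nodes retain informative (non-collapsed) representations; the `scale` constant and the single sparse propagation step are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn.functional as F

def normalized_propagate(x, adj, scale=1.8):
    # Row-wise L2-normalize each node embedding so that isolated nodes
    # do not collapse to zero regardless of their content features.
    # `scale` is a hypothetical constant controlling the embedding norm.
    x = scale * F.normalize(x, p=2, dim=1)
    return torch.sparse.mm(adj, x)  # one step of graph propagation
```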

【3】 Predicting Dynamic Stability of Power Grids using Graph Neural Networks Link: https://arxiv.org/abs/2108.08230

Authors: Christian Nauck, Michael Lindner, Konstantin Schürholt, Haoming Zhang, Paul Schultz, Jürgen Kurths, Ingrid Isenhardt, Frank Hellmann Note: 8 pages, 13 pages including appendix, 31 pictures plus tikz pictures Abstract: The prediction of dynamical stability of power grids becomes more important and challenging with increasing shares of renewable energy sources due to their decentralized structure, reduced inertia and volatility. We investigate the feasibility of applying graph neural networks (GNN) to predict dynamic stability of synchronisation in complex power grids using the single-node basin stability (SNBS) as a measure. To do so, we generate two synthetic datasets for grids with 20 and 100 nodes respectively and estimate SNBS using Monte-Carlo sampling. Those datasets are used to train and evaluate the performance of eight different GNN-models. All models use the full graph without simplifications as input and predict SNBS in a nodal-regression setup. We show that SNBS can be predicted in general and that the performance significantly changes using different GNN-models. Furthermore, we observe interesting transfer capabilities of our approach: GNN-models trained on smaller grids can directly be applied on larger grids without the need of retraining.
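A minimal nodal-regression GNN of the kind evaluated here might look as follows (a generic sketch, not one of the paper's eight models; the sigmoid head reflects that SNBS is a probability in [0, 1]):

```python
import torch
from torch_geometric.nn import GCNConv

class SNBSRegressor(torch.nn.Module):
    """Minimal GNN for per-node regression of single-node basin stability."""
    def __init__(self, in_dim, hidden_dim=64):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.head = torch.nn.Linear(hidden_dim, 1)

    def forward(self, x, edge_index):
        h = torch.relu(self.conv1(x, edge_index))
        h = torch.relu(self.conv2(h, edge_index))
        # One scalar prediction per node, squashed to the [0, 1] range of SNBS.
        return torch.sigmoid(self.head(h)).squeeze(-1)
```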

Transformer (1 paper)

【1】 Transformers predicting the future. Applying attention in next-frame and time series forecasting Link: https://arxiv.org/abs/2108.08224

Authors: Radostin Cholakov, Todor Kolev Affiliation: High School of Mathematics "Acad. Kiril Popov", Plovdiv, Bulgaria; Comrade Cooperative, Sofia, Bulgaria Note: 8 pages, 5 figures. Written during Summer Research School 2021 in Apriltsi, Bulgaria Abstract: Recurrent Neural Networks were, until recently, one of the best ways to capture the timely dependencies in sequences. However, with the introduction of the Transformer, it has been proven that an architecture with only attention mechanisms and without any RNN can improve on the results in various sequence processing tasks (e.g. NLP). Multiple studies since then have shown that similar approaches can be applied to images, point clouds, video, audio or time series forecasting. Furthermore, solutions such as the Perceiver or the Informer have been introduced to expand on the applicability of the Transformer. Our main objective is testing and evaluating the effectiveness of applying Transformer-like models on time series data, tackling susceptibility to anomalies, context awareness and space complexity by fine-tuning the hyperparameters, preprocessing the data, applying dimensionality reduction or convolutional encodings, etc. We are also looking at the problem of next-frame prediction and exploring ways to modify existing solutions in order to achieve higher performance and learn generalized knowledge.

GANs | adversarial | attacks | generation (1 paper)

【1】 Revisiting Adversarial Robustness Distillation: Robust Soft Labels Make Student Better Link: https://arxiv.org/abs/2108.07969

Authors: Bojia Zi, Shihao Zhao, Xingjun Ma, Yu-Gang Jiang Affiliation: Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University; Shanghai Collaborative Innovation Center on Intelligent Visual Computing; School of Information Technology, Deakin University, Geelong, Australia Abstract: Adversarial training is one effective approach for training robust deep neural networks against adversarial attacks. While being able to bring reliable robustness, adversarial training (AT) methods in general favor high capacity models, i.e., the larger the model the better the robustness. This tends to limit their effectiveness on small models, which are more preferable in scenarios where storage or computing resources are very limited (e.g., mobile devices). In this paper, we leverage the concept of knowledge distillation to improve the robustness of small models by distilling from adversarially trained large models. We first revisit several state-of-the-art AT methods from a distillation perspective and identify one common technique that can lead to improved robustness: the use of robust soft labels -- predictions of a robust model. Following this observation, we propose a novel adversarial robustness distillation method called Robust Soft Label Adversarial Distillation (RSLAD) to train robust small student models. RSLAD fully exploits the robust soft labels produced by a robust (adversarially-trained) large teacher model to guide the student's learning on both natural and adversarial examples in all loss terms. We empirically demonstrate the effectiveness of our RSLAD approach over existing adversarial training and distillation methods in improving the robustness of small models against state-of-the-art attacks including the AutoAttack. We also provide a set of understandings on our RSLAD and the importance of robust soft labels for adversarial robustness distillation.
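The central loss idea can be sketched as follows, assuming a standard KL-based distillation formulation in which the teacher's soft labels on natural examples guide the student on both natural and adversarial inputs (the paper's exact loss terms and weighting may differ):

```python
import torch.nn.functional as F

def robust_soft_label_loss(student_nat_logits, student_adv_logits,
                           teacher_nat_logits, T=1.0):
    # Robust soft labels: predictions of an adversarially-trained teacher
    # on the *natural* examples, used in all loss terms.
    soft = F.softmax(teacher_nat_logits.detach() / T, dim=1)
    kl_nat = F.kl_div(F.log_softmax(student_nat_logits / T, dim=1),
                      soft, reduction="batchmean")
    kl_adv = F.kl_div(F.log_softmax(student_adv_logits / T, dim=1),
                      soft, reduction="batchmean")
    return kl_nat + kl_adv
```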

Semi-/weakly-/un-/supervised | uncertainty | active learning (4 papers)

【1】 A new semi-supervised inductive transfer learning framework: Co-Transfer Link: https://arxiv.org/abs/2108.07930

Authors: Ze Yuan, Yimin Wen Affiliation: Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin, China; School of Computer Science and Information Safety, Guilin University of Electronic Technology Abstract: In many practical data mining scenarios, such as network intrusion detection, Twitter spam detection, and computer-aided diagnosis, a source domain that is different from but related to a target domain is very common. In addition, a large amount of unlabeled data is available in both source and target domains, but labeling each of them is difficult, expensive, time-consuming, and sometimes unnecessary. Therefore, it is very important and worthwhile to fully explore the labeled and unlabeled data in source and target domains to settle the task in the target domain. In this paper, a new semi-supervised inductive transfer learning framework, named Co-Transfer, is proposed. Co-Transfer first generates three TrAdaBoost classifiers for transfer learning from the source domain to the target domain, and meanwhile another three TrAdaBoost classifiers are generated for transfer learning from the target domain to the source domain, using bootstrapped samples from the original labeled data. In each round of co-transfer, each group of TrAdaBoost classifiers is refined using the carefully labeled data. Finally, the group of TrAdaBoost classifiers learned to transfer from the source domain to the target domain produces the final hypothesis. Experimental results illustrate that Co-Transfer can effectively exploit and reuse the labeled and unlabeled data in source and target domains.

【2】 Affect-Aware Deep Belief Network Representations for Multimodal Unsupervised Deception Detection Link: https://arxiv.org/abs/2108.07897

Authors: Leena Mathur, Maja J Matarić Affiliation: Department of Computer Science, University of Southern California Abstract: Automated systems that detect the social behavior of deception can enhance human well-being across medical, social work, and legal domains. Labeled datasets to train supervised deception detection models can rarely be collected for real-world, high-stakes contexts. To address this challenge, we propose the first unsupervised approach for detecting real-world, high-stakes deception in videos without requiring labels. This paper presents our novel approach for affect-aware unsupervised Deep Belief Networks (DBN) to learn discriminative representations of deceptive and truthful behavior. Drawing on psychology theories that link affect and deception, we experimented with unimodal and multimodal DBN-based approaches trained on facial valence, facial arousal, audio, and visual features. In addition to using facial affect as a feature on which DBN models are trained, we also introduce a DBN training procedure that uses facial affect as an aligner of audio-visual representations. We conducted classification experiments with unsupervised Gaussian Mixture Model clustering to evaluate our approaches. Our best unsupervised approach (trained on facial valence and visual features) achieved an AUC of 80%, outperforming human ability and performing comparably to fully-supervised models. Our results motivate future work on unsupervised, affect-aware computational approaches for detecting deception and other social behaviors in the wild.
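A hedged sketch of the evaluation protocol described above, with hypothetical feature/label files; since the identity of GMM clusters is arbitrary, both score orientations are evaluated and the better one is reported:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.metrics import roc_auc_score

# Fit a two-component GMM on unsupervised multimodal features, then use the
# posterior of one component as a deception score (file names hypothetical).
features = np.load("affect_visual_features.npy")
labels = np.load("labels.npy")  # ground truth, used only for evaluation
gmm = GaussianMixture(n_components=2, random_state=0).fit(features)
scores = gmm.predict_proba(features)[:, 1]
auc = max(roc_auc_score(labels, scores), roc_auc_score(labels, 1 - scores))
print(f"AUC: {auc:.3f}")
```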

【3】 Coverage Hole Detection for mmWave Networks: An Unsupervised Learning Approach Link: https://arxiv.org/abs/2108.07854

Authors: Chethan K. Anjinappa, Ismail Guvenc Affiliation: Department of Electrical and Computer Engineering, NC State University, Raleigh, NC Note: To appear in IEEE Commun. Lett. Abstract: The utilization of millimeter-wave (mmWave) bands in 5G networks poses new challenges to network planning. Vulnerability to blockages at mmWave bands can cause coverage holes (CHs) in the radio environment, leading to radio link failure when a user enters these CHs. Detection of the CHs carries critical importance so that necessary remedies can be introduced to improve coverage. In this letter, we propose a novel approach to identify the CHs in an unsupervised fashion using a state-of-the-art manifold learning technique: uniform manifold approximation and projection. The key idea is to preserve the local-connectedness structure inherent in the collected unlabelled channel samples, such that the CHs from the service area are detectable. Our results on the DeepMIMO dataset scenario demonstrate that the proposed method can learn the structure within the data samples and provide visual holes in the low-dimensional embedding while preserving the CH boundaries. Once the CH boundary is determined in the low-dimensional embedding, channel-based localization techniques can be applied to these samples to obtain the geographical boundaries of the CHs.
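A minimal sketch of the embedding step using the umap-learn package (parameter values are illustrative defaults, and the input file name is hypothetical):

```python
import numpy as np
import umap  # pip install umap-learn

# Project unlabelled channel samples to 2-D while preserving local
# connectedness, so coverage holes appear as visual gaps in the embedding.
channel_samples = np.load("channel_features.npy")  # hypothetical file
embedding = umap.UMAP(n_neighbors=15, min_dist=0.1,
                      n_components=2).fit_transform(channel_samples)
# `embedding` can now be plotted; sparse regions suggest coverage holes.
```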

【4】 Bagging Supervised Autoencoder Classifier for Credit Scoring Link: https://arxiv.org/abs/2108.07800

Authors: Mahsan Abdoli, Mohammad Akbari, Jamal Shahrabi Affiliation: Department of Industrial Engineering, Amirkabir University of Technology; Department of Mathematics and Computer Science, Amirkabir University of Technology (Tehran Polytechnic) Abstract: Credit scoring models, which are among the most potent risk management tools that banks and financial institutes rely on, have been a popular subject for research in the past few decades. Accordingly, many approaches have been developed to address the challenges in classifying loan applicants and improve and facilitate decision-making. The imbalanced nature of credit scoring datasets, as well as the heterogeneous nature of features in credit scoring datasets, pose difficulties in developing and implementing effective credit scoring models, targeting the generalization power of classification models on unseen data. In this paper, we propose the Bagging Supervised Autoencoder Classifier (BSAC) that mainly leverages the superior performance of the Supervised Autoencoder, which learns low-dimensional embeddings of the input data exclusively with regards to the ultimate classification task of credit scoring, based on the principles of multi-task learning. BSAC also addresses the data imbalance problem by employing a variant of the Bagging process based on the undersampling of the majority class. The obtained results from our experiments on the benchmark and real-life credit scoring datasets illustrate the robustness and effectiveness of the Bagging Supervised Autoencoder Classifier in the classification of loan applicants that can be regarded as a positive development in credit scoring models.
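A minimal supervised-autoencoder sketch in PyTorch, assuming an MSE reconstruction term plus binary cross-entropy for the classification task (layer sizes and the `alpha` weighting are illustrative assumptions, not the paper's configuration):

```python
import torch
import torch.nn as nn

class SupervisedAutoencoder(nn.Module):
    """Joint reconstruction + classification, in the multi-task learning spirit."""
    def __init__(self, in_dim, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, in_dim))
        self.classifier = nn.Linear(latent_dim, 1)

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.classifier(z)

def sae_loss(x, y, recon, logit, alpha=0.5):
    # y: float tensor of 0/1 labels; alpha balances the auxiliary
    # reconstruction task against the credit-scoring classification task.
    return alpha * nn.functional.mse_loss(recon, x) + \
           nn.functional.binary_cross_entropy_with_logits(logit.squeeze(-1), y)
```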

Transfer | zero/few/one-shot | adaptation (2 papers)

【1】 Confidence Adaptive Regularization for Deep Learning with Noisy Labels Link: https://arxiv.org/abs/2108.08212

Authors: Yangdi Lu, Yang Bo, Wenbo He Affiliation: Department of Computing and Software, McMaster University, Hamilton, Canada Abstract: Recent studies on the memorization effects of deep neural networks on noisy labels show that the networks first fit the correctly-labeled training samples before memorizing the mislabeled samples. Motivated by this early-learning phenomenon, we propose a novel method to prevent memorization of the mislabeled samples. Unlike the existing approaches which use the model output to identify or ignore the mislabeled samples, we introduce an indicator branch to the original model and enable the model to produce a confidence value for each sample. The confidence values are incorporated in our loss function which is learned to assign large confidence values to correctly-labeled samples and small confidence values to mislabeled samples. We also propose an auxiliary regularization term to further improve the robustness of the model. To improve the performance, we gradually correct the noisy labels with a well-designed target estimation strategy. We provide the theoretical analysis and conduct the experiments on synthetic and real-world datasets, demonstrating that our approach achieves comparable results to the state-of-the-art methods.

【2】 DRDrV3: Complete Lesion Detection in Fundus Images Using Mask R-CNN, Transfer Learning, and LSTM Link: https://arxiv.org/abs/2108.08095

Authors: Farzan Shenavarmasouleh, Farid Ghareh Mohammadi, M. Hadi Amini, Thiab Taha, Khaled Rasheed, Hamid R. Arabnia Affiliation: Department of Computer Science, University of Georgia, Athens, GA; Knight Foundation School of Computing and Information Sciences, Florida International University, Miami, FL Note: The 7th International Conference on Health Informatics & Medical Systems (HIMS'21: July 26-29, 2021, USA) Abstract: Medical imaging is one of the growing fields in the world of computer vision. In this study, we aim to address the Diabetic Retinopathy (DR) problem as one of the open challenges in medical imaging. In this research, we propose a new lesion detection architecture, comprising two sub-modules, which is an optimal solution to detect and find not only the type of lesions caused by DR, their corresponding bounding boxes, and their masks; but also the severity level of the overall case. Aside from traditional accuracy, we also use two popular evaluation criteria to evaluate the outputs of our models, which are intersection over union (IOU) and mean average precision (mAP). We hypothesize that this new solution enables specialists to detect lesions with high confidence and estimate the severity of the damage with high accuracy.
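For reference, the IoU criterion mentioned above reduces to a few lines for axis-aligned boxes (a standard formulation, not code from the paper):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)
```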

Medical (4 papers)

【1】 TB-ICT: A Trustworthy Blockchain-Enabled System for Indoor COVID-19 Contact Tracing Link: https://arxiv.org/abs/2108.08275

Authors: Mohammad Salimibeni, Zohreh Hajiakhondi-Meybodi, Arash Mohammadi, Yingxu Wang Affiliation: Concordia Institute for Information Systems Engineering, Concordia University, Montreal, Canada; Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada Abstract: Recently, as a consequence of the COVID-19 pandemic, dependence on Contact Tracing (CT) models has significantly increased to prevent spread of this highly contagious virus and be prepared for the potential future ones. Since the spreading probability of the novel coronavirus in indoor environments is much higher than that of the outdoors, there is an urgent and unmet quest to develop/design efficient, autonomous, trustworthy, and secure indoor CT solutions. Despite such an urgency, this field is still in its infancy. The paper addresses this gap and proposes the Trustworthy Blockchain-enabled system for Indoor Contact Tracing (TB-ICT) framework. The TB-ICT framework is proposed to protect privacy and integrity of the underlying CT data from unauthorized access. More specifically, it is a fully distributed and innovative blockchain platform exploiting the proposed dynamic Proof of Work (dPoW) credit-based consensus algorithm coupled with Randomized Hash Window (W-Hash) and dynamic Proof of Credit (dPoC) mechanisms to differentiate between honest and dishonest nodes. The TB-ICT not only provides a decentralization in data replication but also quantifies the node's behavior based on its underlying credit-based mechanism. For achieving high localization performance, we capitalize on availability of Internet of Things (IoT) indoor localization infrastructures, and develop a data driven localization model based on Bluetooth Low Energy (BLE) sensor measurements. The simulation results show that the proposed TB-ICT prevents the COVID-19 from spreading by implementation of a highly accurate contact tracing model while improving the users' privacy and security.

【2】 Optimising Knee Injury Detection with Spatial Attention and Validating Localisation Ability Link: https://arxiv.org/abs/2108.08136

Authors: Niamh Belton, Ivan Welaratne, Adil Dahlan, Ronan T Hearne, Misgina Tsighe Hagos, Aonghus Lawlor, Kathleen M. Curran Affiliation: Science Foundation Ireland Centre for Research Training in Machine Learning; School of Medicine, University College Dublin; Department of Radiology, Mater Misericordiae University Hospital, Dublin; School of Electronic Engineering, University College Dublin Abstract: This work employs a pre-trained, multi-view Convolutional Neural Network (CNN) with a spatial attention block to optimise knee injury detection. An open-source Magnetic Resonance Imaging (MRI) data set with image-level labels was leveraged for this analysis. As MRI data is acquired from three planes, we compare our technique using data from a single-plane and multiple planes (multi-plane). For multi-plane, we investigate various methods of fusing the planes in the network. This analysis resulted in the novel 'MPFuseNet' network and state-of-the-art Area Under the Curve (AUC) scores for detecting Anterior Cruciate Ligament (ACL) tears and Abnormal MRIs, achieving AUC scores of 0.977 and 0.957 respectively. We then developed an objective metric, Penalised Localisation Accuracy (PLA), to validate the model's localisation ability. This metric compares binary masks generated from Grad-Cam output and the radiologist's annotations on a sample of MRIs. We also extracted explainability features in a model-agnostic approach that were then verified as clinically relevant by the radiologist.

【3】 De-identification of Unstructured Clinical Texts from Sequence to Sequence Perspective Link: https://arxiv.org/abs/2108.07971

Authors: Md Monowar Anjum, Noman Mohammed, Xiaoqian Jiang Affiliation: Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada; School of Biomedical Informatics, University of Texas, Houston, TX, USA Note: Currently under consideration for ACM CCS 2021 Abstract: In this work, we propose a novel problem formulation for de-identification of unstructured clinical text. We formulate the de-identification problem as a sequence to sequence learning problem instead of a token classification problem. Our approach is inspired by the recent state-of-the-art performance of sequence to sequence learning models for named entity recognition. Early experimentation of our proposed approach achieved a 98.91% recall rate on the i2b2 dataset. This performance is comparable to current state-of-the-art models for unstructured clinical text de-identification.

【4】 Effective and scalable clustering of SARS-CoV-2 sequences Link: https://arxiv.org/abs/2108.08143

Authors: Sarwan Ali, Tamkanat-E-Ali, Muhammad Asad Khan, Imdadullah Khan, Murray Patterson Abstract: SARS-CoV-2, like any other virus, continues to mutate as it spreads, according to an evolutionary process. Unlike any other virus, the number of currently available sequences of SARS-CoV-2 in public databases such as GISAID is already several million. This amount of data has the potential to uncover the evolutionary dynamics of a virus like never before. However, a million is already several orders of magnitude beyond what can be processed by the traditional methods designed to reconstruct a virus's evolutionary history, such as those that build a phylogenetic tree. Hence, new and scalable methods will need to be devised in order to make use of the ever increasing number of viral sequences being collected. Since identifying variants is an important part of understanding the evolution of a virus, in this paper, we propose an approach based on clustering sequences to identify the current major SARS-CoV-2 variants. Using a $k$-mer based feature vector generation and efficient feature selection methods, our approach is effective in identifying variants, as well as being efficient and scalable to millions of sequences. Such a clustering method allows us to show the relative proportion of each variant over time, giving the rate of spread of each variant in different locations -- something which is important for vaccine development and distribution. We also compute the importance of each amino acid position of the spike protein in identifying a given variant in terms of information gain. Positions of high variant-specific importance tend to agree with those reported by the USA's Centers for Disease Control and Prevention (CDC), further demonstrating our approach.
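A minimal sketch of the $k$-mer feature-vector generation step; the choice $k=3$ and the 20-letter amino-acid alphabet are illustrative assumptions, not necessarily the paper's settings:

```python
from itertools import product
from collections import Counter

def kmer_vector(sequence, k=3, alphabet="ACDEFGHIKLMNPQRSTVWY"):
    """Count-based k-mer feature vector for a protein sequence."""
    # Fixed index over all possible k-mers so vectors are comparable.
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    vec = [0] * len(index)
    for kmer, c in counts.items():
        if kmer in index:  # skip k-mers containing ambiguous characters
            vec[index[kmer]] = c
    return vec
```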

Recommendation (2 papers)

【1】 A Unified Framework for Cross-Domain and Cross-System Recommendations Link: https://arxiv.org/abs/2108.07976

Authors: Feng Zhu, Yan Wang, Jun Zhou, Chaochao Chen, Longfei Li, Guanfeng Liu Affiliation: Department of Computing, Macquarie University Note: 14 pages; this paper has been accepted as a regular paper in an upcoming issue of the Transactions on Knowledge and Data Engineering (TKDE) Abstract: Cross-Domain Recommendation (CDR) and Cross-System Recommendation (CSR) have been proposed to improve the recommendation accuracy in a target dataset (domain/system) with the help of a source one with relatively richer information. However, most existing CDR and CSR approaches are single-target, namely, there is a single target dataset, which can only help the target dataset and thus cannot benefit the source dataset. In this paper, we focus on three new scenarios, i.e., Dual-Target CDR (DTCDR), Multi-Target CDR (MTCDR), and CDR+CSR, and aim to improve the recommendation accuracy in all datasets simultaneously for all scenarios. To do this, we propose a unified framework, called GA (based on Graph embedding and Attention techniques), for all three scenarios. In GA, we first construct separate heterogeneous graphs to generate more representative user and item embeddings. Then, we propose an element-wise attention mechanism to effectively combine the embeddings of common entities (users/items) learned from different datasets. Moreover, to avoid negative transfer, we further propose a Personalized training strategy to minimize the embedding difference of common entities between a richer dataset and a sparser dataset, deriving three new models, i.e., GA-DTCDR-P, GA-MTCDR-P, and GA-CDR+CSR-P, for the three scenarios respectively. Extensive experiments conducted on four real-world datasets demonstrate that our proposed GA models significantly outperform the state-of-the-art approaches.
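One plausible reading of the element-wise attention idea, sketched in PyTorch as a gated combination of the two embeddings of a common entity (an illustrative layer, not the paper's exact formulation):

```python
import torch
import torch.nn as nn

class ElementwiseAttention(nn.Module):
    """Combine two embeddings of the same common entity (user/item)
    with learned element-wise weights."""
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, e_source, e_target):
        # Per-dimension weights in (0, 1), conditioned on both embeddings.
        w = torch.sigmoid(self.gate(torch.cat([e_source, e_target], dim=-1)))
        return w * e_source + (1.0 - w) * e_target
```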

【2】 Learning Federated Representations and Recommendations with Limited Negatives Link: https://arxiv.org/abs/2108.07931

Authors: Lin Ning, Karan Singhal, Ellie X. Zhou, Sushant Prakash Affiliation: Google Research Abstract: Deep retrieval models are widely used for learning entity representations and recommendations. Federated learning provides a privacy-preserving way to train these models without requiring centralization of user data. However, federated deep retrieval models usually perform much worse than their centralized counterparts due to non-IID (independent and identically distributed) training data on clients, an intrinsic property of federated learning that limits negatives available for training. We demonstrate that this issue is distinct from the commonly studied client drift problem. This work proposes batch-insensitive losses as a way to alleviate the non-IID negatives issue for federated movie recommendation. We explore a variety of techniques and identify that batch-insensitive losses can effectively improve the performance of federated deep retrieval models, increasing the relative recall of the federated model by up to 93.15% and reducing the relative gap in recall between it and a centralized model from 27.22% - 43.14% to 0.53% - 2.42%. We open-source our code framework to accelerate further research and applications of federated deep retrieval models.

Clustering (2 papers)

【1】 Stochastic Cluster Embedding Link: https://arxiv.org/abs/2108.08003

Authors: Zhirong Yang, Yuwei Chen, Denis Sedov, Samuel Kaski, Jukka Corander Affiliation: Norwegian University of Science and Technology; Aalto University; Finnish Geospatial Research Institute; University of Oslo; University of Helsinki Abstract: Neighbor Embedding (NE) that aims to preserve pairwise similarities between data items has been shown to yield an effective principle for data visualization. However, even the currently best NE methods such as Stochastic Neighbor Embedding (SNE) may leave large-scale patterns such as clusters hidden despite strong signals being present in the data. To address this, we propose a new cluster visualization method based on Neighbor Embedding. We first present a family of Neighbor Embedding methods which generalizes SNE by using non-normalized Kullback-Leibler divergence with a scale parameter. In this family, much better cluster visualizations often appear with a parameter value different from the one corresponding to SNE. We also develop an efficient software which employs asynchronous stochastic block coordinate descent to optimize the new family of objective functions. The experimental results demonstrate that our method consistently and substantially improves visualization of data clusters compared with the state-of-the-art NE approaches.
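Assuming the standard non-normalized (I-)divergence form, the generalized objective family with scale parameter $s$ over pairwise similarities $p_{ij}$ and $q_{ij}$ can be sketched as follows (inferred from the abstract, not necessarily the paper's exact notation):

```latex
D\bigl(P \,\|\, s\,Q\bigr) \;=\; \sum_{ij} \Bigl( p_{ij} \ln \frac{p_{ij}}{s\, q_{ij}} \;-\; p_{ij} \;+\; s\, q_{ij} \Bigr), \qquad s > 0,
```

with SNE corresponding to one particular value of $s$, per the abstract, and other values often revealing cluster structure more clearly.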

【2】 HyperSF: Spectral Hypergraph Coarsening via Flow-based Local Clustering Link: https://arxiv.org/abs/2108.07901

Authors: Ali Aghdaei, Zhiqiang Zhao, Zhuo Feng Affiliation: Stevens Institute of Technology Abstract: Hypergraphs allow modeling problems with multi-way high-order relationships. However, the computational cost of most existing hypergraph-based algorithms can be heavily dependent upon the input hypergraph sizes. To address the ever-increasing computational challenges, graph coarsening can be potentially applied for preprocessing a given hypergraph by aggressively aggregating its vertices (nodes). However, state-of-the-art hypergraph partitioning (clustering) methods that incorporate heuristic graph coarsening techniques are not optimized for preserving the structural (global) properties of hypergraphs. In this work, we propose an efficient spectral hypergraph coarsening scheme (HyperSF) for well preserving the original spectral (structural) properties of hypergraphs. Our approach leverages a recent strongly-local max-flow-based clustering algorithm for detecting the sets of hypergraph vertices that minimize ratio cut. To further improve the algorithm efficiency, we propose a divide-and-conquer scheme by leveraging spectral clustering of the bipartite graphs corresponding to the original hypergraphs. Our experimental results for a variety of hypergraphs extracted from real-world VLSI design benchmarks show that the proposed hypergraph coarsening algorithm can significantly improve the multi-way conductance of hypergraph clustering as well as runtime efficiency when compared with existing state-of-the-art algorithms.

Federated learning | privacy preservation | encryption (1 paper)

【1】 Fed-TGAN: Federated Learning Framework for Synthesizing Tabular Data Link: https://arxiv.org/abs/2108.07927

Authors: Zilong Zhao, Robert Birke, Aditya Kunar, Lydia Y. Chen Affiliation: TU Delft, Delft, Netherlands; ABB Corporate Research Switzerland, Dättwil, Switzerland Abstract: Generative Adversarial Networks (GANs) are typically trained to synthesize data, from images and more recently tabular data, under the assumption of directly accessible training data. Recently, federated learning (FL) is an emerging paradigm that features decentralized learning on client's local data with a privacy-preserving capability. And, while learning GANs to synthesize images on FL systems has just been demonstrated, it is unknown if GANs for tabular data can be learned from decentralized data sources. Moreover, it remains unclear which distributed architecture suits them best. Different from image GANs, state-of-the-art tabular GANs require prior knowledge on the data distribution of each (discrete and continuous) column to agree on a common encoding -- risking privacy guarantees. In this paper, we propose Fed-TGAN, the first Federated learning framework for Tabular GANs. To effectively learn a complex tabular GAN on non-identical participants, Fed-TGAN designs two novel features: (i) a privacy-preserving multi-source feature encoding for model initialization; and (ii) table similarity aware weighting strategies to aggregate local models for countering data skew. We extensively evaluate the proposed Fed-TGAN against variants of decentralized learning architectures on four widely used datasets. Results show that Fed-TGAN accelerates training time per epoch up to 200% compared to the alternative architectures, for both IID and Non-IID data. Overall, Fed-TGAN not only stabilizes the training loss, but also achieves better similarity between generated and original data.
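The similarity-aware aggregation can be sketched generically as a weighted FedAvg, where per-client weights come from table-similarity scores rather than sample counts (an illustrative sketch; the paper's exact weighting strategies may differ):

```python
import numpy as np

def aggregate(client_weights, similarities):
    """Weighted model aggregation.

    client_weights: list over clients, each a list of per-layer numpy arrays.
    similarities:   one table-similarity score per client (higher = closer
                    to the global data distribution).
    """
    w = np.asarray(similarities, dtype=float)
    w = w / w.sum()  # normalize scores into aggregation weights
    return [sum(wi * layer for wi, layer in zip(w, layers))
            for layers in zip(*client_weights)]  # layer-wise weighted average
```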

Inference | analysis | understanding | explanation (4 papers)

【1】 CARE: Coherent Actionable Recourse based on Sound Counterfactual Explanations Link: https://arxiv.org/abs/2108.08197

Authors: Peyman Rasouli, Ingrid Chieh Yu Affiliation: Department of Informatics, University of Oslo, Oslo, Norway Abstract: Counterfactual explanation methods interpret the outputs of a machine learning model in the form of "what-if scenarios" without compromising the fidelity-interpretability trade-off. They explain how to obtain a desired prediction from the model by recommending small changes to the input features, aka recourse. We believe an actionable recourse should be created based on sound counterfactual explanations originating from the distribution of the ground-truth data and linked to the domain knowledge. Moreover, it needs to preserve the coherency between changed/unchanged features while satisfying user/domain-specified constraints. This paper introduces CARE, a modular explanation framework that addresses the model- and user-level desiderata in a consecutive and structured manner. We tackle the existing requirements by proposing novel and efficient solutions that are formulated in a multi-objective optimization framework. The designed framework enables including arbitrary requirements and generating counterfactual explanations and actionable recourse by choice. As a model-agnostic approach, CARE generates multiple, diverse explanations for any black-box model in tabular classification and regression settings. Several experiments on standard data sets and black-box models demonstrate the effectiveness of our modular framework and its superior performance compared to the baselines.

【2】 FOX-NAS: Fast, On-device and Explainable Neural Architecture Search Link: https://arxiv.org/abs/2108.08189

Authors: Chia-Hsiang Liu, Yu-Shin Han, Yuan-Yao Sung, Yi Lee, Hung-Yueh Chiang, Kai-Chiang Wu Affiliation: National Yang Ming Chiao Tung University; The University of Texas at Austin Note: Accepted by ICCV 2021 Low-Power Computer Vision Workshop Abstract: Neural architecture search can discover neural networks with good performance, and One-Shot approaches are prevalent. One-Shot approaches typically require a supernet with weight sharing and predictors that predict the performance of architecture. However, the previous methods take much time to generate performance predictors thus are inefficient. To this end, we propose FOX-NAS that consists of fast and explainable predictors based on simulated annealing and multivariate regression. Our method is quantization-friendly and can be efficiently deployed to the edge. The experiments on different hardware show that FOX-NAS models outperform some other popular neural network architectures. For example, FOX-NAS matches MobileNetV2 and EfficientNet-Lite0 accuracy with 240% and 40% less latency on the edge CPU. FOX-NAS is the 3rd place winner of the 2020 Low-Power Computer Vision Challenge (LPCVC), DSP classification track. See all evaluation results at https://lpcv.ai/competitions/2020. Search code and pre-trained models are released at https://github.com/great8nctu/FOX-NAS.
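A generic simulated-annealing search loop of the kind described, where `predict` stands in for a fitted multivariate-regression performance predictor scoring an architecture encoding (all names and the cooling schedule are illustrative assumptions):

```python
import math
import random

def simulated_annealing(init, neighbor, predict, steps=1000, t0=1.0, cooling=0.995):
    """Maximize `predict` over architecture encodings; `neighbor` proposes
    a small random mutation of the current encoding."""
    current, best = init, init
    t = t0
    for _ in range(steps):
        cand = neighbor(current)
        delta = predict(cand) - predict(current)
        # Accept improvements always; accept regressions with a probability
        # that shrinks as the temperature cools.
        if delta > 0 or random.random() < math.exp(delta / t):
            current = cand
            if predict(current) > predict(best):
                best = current
        t *= cooling
    return best
```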

【3】 Stack Index Prediction Using Time-Series Analysis Link: https://arxiv.org/abs/2108.08120

Authors: Raja CSP Raman, Rohith Mahadevan, Divya Perumal, Vedha Sankar, Talha Abdur Rahman Note: 10 pages, 9 figures Abstract: The prevalence of community support and engagement for different domains in the tech industry has changed and evolved throughout the years. In this study, we aim to understand, analyze and predict the trends of technology in a scientific manner, having collected data on numerous topics and their growth throughout the years in the past decade. We apply machine learning models on collected data, to understand, analyze and forecast the trends in the advancement of different fields. We show that certain technical concepts such as python, machine learning, and Keras have an undisputed uptrend, finally concluding that the Stackindex model forecasts with high accuracy and can be a viable tool for forecasting different tech domains.

【4】 M-ar-K-Fast Independent Component Analysis Link: https://arxiv.org/abs/2108.07908

Authors: Luca Parisi Affiliation: Coventry, United Kingdom; PhD in Machine Learning for Clinical Decision Support Systems; MBA Candidate with Artificial Intelligence Specialism Note: 17 pages, 2 listings/Python code snippets, 2 figures, 5 tables. arXiv admin note: text overlap with arXiv:2009.07530 Abstract: This study presents the m-arcsinh Kernel ('m-ar-K') Fast Independent Component Analysis ('FastICA') method ('m-ar-K-FastICA') for feature extraction. The kernel trick has enabled dimensionality reduction techniques to capture a higher extent of non-linearity in the data; however, reproducible, open-source kernels to aid with feature extraction are still limited and may not be reliable when projecting features from entropic data. The m-ar-K function, freely available in Python and compatible with its open-source library 'scikit-learn', is hereby coupled with FastICA to achieve more reliable feature extraction in presence of a high extent of randomness in the data, reducing the need for pre-whitening. Different classification tasks were considered, as related to five (N = 5) open access datasets of various degrees of information entropy, available from scikit-learn and the University California Irvine (UCI) Machine Learning repository. Experimental results demonstrate improvements in the classification performance brought by the proposed feature extraction. The novel m-ar-K-FastICA dimensionality reduction approach is compared to the 'FastICA' gold standard method, supporting its higher reliability and computational efficiency, regardless of the underlying uncertainty in the data.
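A hedged sketch of plugging a custom contrast function into scikit-learn's FastICA; plain `arcsinh` is used here as an illustrative stand-in, since the exact m-arcsinh definition is given in the paper, and the toy data is synthetic:

```python
import numpy as np
from sklearn.decomposition import FastICA

def arcsinh_contrast(x):
    # Custom contrast for scikit-learn's FastICA: return the function value
    # and the mean of its derivative, as the library's `fun` callable expects.
    return np.arcsinh(x), (1.0 / np.sqrt(x ** 2 + 1.0)).mean(axis=-1)

# Toy heavy-tailed ("entropic") data in place of a real dataset.
X = np.random.RandomState(0).standard_t(df=5, size=(1000, 4))
ica = FastICA(n_components=4, fun=arcsinh_contrast, random_state=0)
S = ica.fit_transform(X)  # extracted independent components
```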

Detection (4 papers)

【1】 Fake News and Phishing Detection Using a Machine Learning Trained Expert System Link: https://arxiv.org/abs/2108.08264

Authors: Benjamin Fitzpatrick, Xinyu "Sherwin" Liang, Jeremy Straub Affiliation: Department of Electrical and Computer Engineering, University of Alabama, Tuscaloosa, AL; Dallas College – North Lake, Irving, TX Abstract: Expert systems have been used to enable computers to make recommendations and decisions. This paper presents the use of a machine learning trained expert system (MLES) for phishing site detection and fake news detection. Both topics share a similar goal: to design a rule-fact network that allows a computer to make explainable decisions like domain experts in each respective area. The phishing website detection study uses a MLES to detect potential phishing websites by analyzing site properties (like URL length and expiration time). The fake news detection study uses a MLES rule-fact network to gauge news story truthfulness based on factors such as emotion, the speaker's political affiliation status, and job. The two studies use different MLES network implementations, which are presented and compared herein. The fake news study utilized a more linear design while the phishing project utilized a more complex connection structure. Both networks' inputs are based on commonly available data sets.

【2】 SOME/IP Intrusion Detection using Deep Learning-based Sequential Models in Automotive Ethernet Networks Link: https://arxiv.org/abs/2108.08262

Authors: Natasha Alkhatib, Hadi Ghauch, Jean-Luc Danger Abstract: Intrusion Detection Systems are widely used to detect cyberattacks, especially on protocols vulnerable to hacking attacks such as SOME/IP. In this paper, we present a deep learning-based sequential model for offline intrusion detection on the SOME/IP application layer protocol. To assess our intrusion detection system, we have generated and labeled a dataset with several classes representing realistic intrusions, and a normal class - a significant contribution due to the absence of such publicly available datasets. Furthermore, we also propose a simple recurrent neural network (RNN), as an instance of deep learning-based sequential model, that we apply to our generated dataset. The numerical results show that RNNs excel at predicting in-vehicle intrusions, with F1 scores and AUC values of 0.99 for each type of intrusion.

【3】 Out-of-Distribution Detection using Outlier Detection Methods Link: https://arxiv.org/abs/2108.08218

Authors: Jan Diers, Christian Pigorsch Affiliation: Friedrich-Schiller-University Jena, Germany Abstract: Out-of-distribution detection (OOD) deals with anomalous input to neural networks. In the past, specialized methods have been proposed to reject predictions on anomalous input. We use outlier detection algorithms to detect anomalous input as reliably as specialized methods from the field of OOD. No neural network adaptation is required; detection is based on the model's softmax score. Our approach works unsupervised with an Isolation Forest or with supervised classifiers such as a Gradient Boosting machine.
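The unsupervised variant reduces to a few lines with scikit-learn (array file names are hypothetical; the detector is fit on softmax outputs for in-distribution data and flags anomalous test inputs):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

train_softmax = np.load("softmax_train.npy")  # shape (n_samples, n_classes)
detector = IsolationForest(random_state=0).fit(train_softmax)
# IsolationForest.predict returns -1 for outliers, +1 for inliers.
is_ood = detector.predict(np.load("softmax_test.npy")) == -1
```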

【4】 Towards Deep and Efficient: A Deep Siamese Self-Attention Fully Efficient Convolutional Network for Change Detection in VHR Images Link: https://arxiv.org/abs/2108.08157

Authors: Hongruixuan Chen, Chen Wu, Bo Du Affiliation: State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, China; School of Computer Science, Wuhan University, Wuhan, China Abstract: Recently, FCNs have attracted widespread attention in the change detection (CD) field. In pursuit of better CD performance, it has become a tendency to design deeper and more complicated FCNs, which inevitably brings about huge numbers of parameters and an unbearable computational burden. With the goal of designing a quite deep architecture to obtain more precise CD results while simultaneously decreasing parameter numbers to improve efficiency, in this work, we present a very deep and efficient CD network, entitled EffCDNet. In EffCDNet, to reduce the numerous parameters associated with deep architecture, an efficient convolution consisting of depth-wise convolution and group convolution with a channel shuffle mechanism is introduced to replace standard convolutional layers. In terms of the specific network architecture, EffCDNet does not use mainstream UNet-like architecture, but rather adopts the architecture with a very deep encoder and a lightweight decoder. In the very deep encoder, two very deep siamese streams stacked by efficient convolution first extract two highly representative and informative feature maps from input image-pairs. Subsequently, an efficient ASPP module is designed to capture multi-scale change information. In the lightweight decoder, a recurrent criss-cross self-attention (RCCA) module is applied to efficiently utilize non-local similar feature representations to enhance discriminability for each pixel, thus effectively separating the changed and unchanged regions. Moreover, to tackle the optimization problem in confused pixels, two novel loss functions based on information entropy are presented. On two challenging CD datasets, our approach outperforms other SOTA FCN-based methods, with only benchmark-level parameter numbers and quite low computational overhead.
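The efficient-convolution building block described (depth-wise convolution plus grouped point-wise convolution with channel shuffle) can be sketched as follows; channel counts and group size are illustrative, not the paper's configuration:

```python
import torch
import torch.nn as nn

def channel_shuffle(x, groups):
    # Interleave channels across groups so information flows between them.
    n, c, h, w = x.shape
    return x.view(n, groups, c // groups, h, w).transpose(1, 2).reshape(n, c, h, w)

class EfficientConv(nn.Module):
    """Depth-wise + grouped point-wise convolution with channel shuffle,
    replacing a standard convolutional layer."""
    def __init__(self, channels, groups=4):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, 1, groups=groups)
        self.groups = groups

    def forward(self, x):
        return channel_shuffle(self.pointwise(self.depthwise(x)), self.groups)
```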

Classification | recognition (2 papers)

【1】 XAI Methods for Neural Time Series Classification: A Brief Review Link: https://arxiv.org/abs/2108.08009

Authors: Ilija Šimić, Vedran Sabol, Eduardo Veas Affiliation: Graz University of Technology, Austria Abstract: Deep learning models have recently demonstrated remarkable results in a variety of tasks, which is why they are being increasingly applied in high-stake domains, such as industry, medicine, and finance. Considering that automatic predictions in these domains might have a substantial impact on the well-being of a person, as well as considerable financial and legal consequences to an individual or a company, all actions and decisions that result from applying these models have to be accountable. Given that a substantial amount of data that is collected in high-stake domains are in the form of time series, in this paper we examine the current state of eXplainable AI (XAI) methods with a focus on approaches for opening up deep learning black boxes for the task of time series classification. Finally, our contribution also aims at deriving promising directions for future work, to advance XAI for deep learning on time series data.

【2】 Contrastive Identification of Covariate Shift in Image Data Link: https://arxiv.org/abs/2108.08000

Authors: Matthew L. Olson, Thuy-Vy Nguyen, Gaurav Dixit, Neale Ratzlaff, Weng-Keen Wong, Minsuk Kahng Affiliation: Oregon State University Abstract: Identifying covariate shift is crucial for making machine learning systems robust in the real world and for detecting training data biases that are not reflected in test data. However, detecting covariate shift is challenging, especially when the data consists of high-dimensional images, and when multiple types of localized covariate shift affect different subspaces of the data. Although automated techniques can be used to detect the existence of covariate shift, our goal is to help human users characterize the extent of covariate shift in large image datasets with interfaces that seamlessly integrate information obtained from the detection algorithms. In this paper, we design and evaluate a new visual interface that facilitates the comparison of the local distributions of training and test data. We conduct a quantitative user study on multi-attribute facial data to compare two different learned low-dimensional latent representations (pretrained ImageNet CNN vs. density ratio) and two user analytic workflows (nearest-neighbor vs. cluster-to-cluster). Our results indicate that the latent representation of our density ratio model, combined with a nearest-neighbor comparison, is the most effective at helping humans identify covariate shift.
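Density-ratio scores of the kind compared above can be approximated with the standard classifier-based trick, sketched below (an illustrative stand-in; the paper's learned density-ratio model may differ):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def density_ratio_scores(train_feats, test_feats):
    """Estimate p_test/p_train per sample via a probabilistic classifier
    that discriminates training from test data."""
    X = np.vstack([train_feats, test_feats])
    y = np.r_[np.zeros(len(train_feats)), np.ones(len(test_feats))]
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    p = clf.predict_proba(X)[:, 1]
    return p / (1.0 - p)  # large values indicate regions of covariate shift
```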

优化|敛散性(4篇)

【1】 Existence, uniqueness, and convergence rates for gradient flows in the training of artificial neural networks with ReLU activation 标题:RELU激活人工神经网络训练中梯度流的存在性、唯一性和收敛速度 链接:https://arxiv.org/abs/2108.08106

作者:Simon Eberle,Arnulf Jentzen,Adrian Riekert,Georg S. Weiss 机构:AG Analysis of Partial Differential Equations, University of Duisburg-Essen, Germany (e-mail: simon.eberle@uni-due.de); Applied Mathematics: Institute for Analysis and Numerics, University of Münster, Germany (e-mail: ajentzen@uni-muenster.de) 备注:30 pages. arXiv admin note: text overlap with arXiv:2107.04479 摘要:通过梯度下降(GD)型优化方案训练带修正线性单元(ReLU)激活的人工神经网络(ANN),是目前工业界常见的相关流程。直到今天,在科学文献中,一般还没有数学收敛分析来解释GD型优化方案在训练具有ReLU激活的人工神经网络方面的数值成功。GD型优化方案可视为与所考虑优化问题相关的梯度流(GF)微分方程的时间离散化方法,鉴于此,首先发展时间连续GF微分方程的数学收敛理论,然后将这种时间连续收敛理论推广到可实现的时间离散GD型优化方法,这似乎是一个自然的研究方向。在本文中,我们建立了GF微分方程在训练具有一个隐层和ReLU激活的全连接前馈神经网络时的两个基本结果。在本文的第一个主要结果中,我们在假设所考虑的监督学习问题的输入数据的概率分布绝对连续且具有有界密度函数的情况下,证明了在训练此类ANN时,每个GF微分方程对每个初始值都存在解,且该解在一类合适的解中是唯一的。在本文的第二个主要结果中,我们在假设目标函数与输入数据概率分布的密度函数均为分段多项式的情况下,证明了在训练此类ANN时,每个非发散GF轨迹以适当的收敛速度收敛到某个临界点,并且该非发散GF轨迹的风险以速率1收敛到该临界点的风险。 摘要:The training of artificial neural networks (ANNs) with rectified linear unit (ReLU) activation via gradient descent (GD) type optimization schemes is nowadays a common industrially relevant procedure. Till this day in the scientific literature there is in general no mathematical convergence analysis which explains the numerical success of GD type optimization schemes in the training of ANNs with ReLU activation. GD type optimization schemes can be regarded as temporal discretization methods for the gradient flow (GF) differential equations associated to the considered optimization problem and, in view of this, it seems to be a natural direction of research to first aim to develop a mathematical convergence theory for time-continuous GF differential equations and, thereafter, to aim to extend such a time-continuous convergence theory to implementable time-discrete GD type optimization methods. In this article we establish two basic results for GF differential equations in the training of fully-connected feedforward ANNs with one hidden layer and ReLU activation. In the first main result of this article we establish in the training of such ANNs under the assumption that the probability distribution of the input data of the considered supervised learning problem is absolutely continuous with a bounded density function that every GF differential equation admits for every initial value a solution which is also unique among a suitable class of solutions. In the second main result of this article we prove in the training of such ANNs under the assumption that the target function and the density function of the probability distribution of the input data are piecewise polynomial that every non-divergent GF trajectory converges with an appropriate rate of convergence to a critical point and that the risk of the non-divergent GF trajectory converges with rate 1 to the risk of the critical point.

【2】 Statistically Near-Optimal Hypothesis Selection 标题:统计上接近最优的假设选择 链接:https://arxiv.org/abs/2108.07880

作者:Olivier Bousquet,Mark Braverman,Klim Efremenko,Gillat Kol,Shay Moran 机构:‡Department of Computer Science, Ben Gurion University, §Department of Computer Science, Princeton University 备注:Accepted to FOCS 2021 摘要:假设选择是一个基本的分布学习问题:给定一个由分布组成的比较器类$Q=\{q_1,\ldots,q_n\}$,以及对未知目标分布$p$的抽样访问,目标是输出分布$q$,使得$\mathsf{TV}(p,q)$接近$opt$,其中$opt=\min_i \mathsf{TV}(p,q_i)$,$\mathsf{TV}(\cdot,\cdot)$表示总变差距离。尽管这一问题自19世纪就被研究,但其在基本资源(如样本数量和近似保证)方面的复杂性仍未解决(例如,Devroye与Lugosi '00 的著作中对此有讨论)。这与其他(较年轻的)学习设定形成了鲜明对比,例如PAC学习,其复杂性已被很好地理解。我们为假设选择问题导出了一个最优的$2$-近似学习策略,输出$q$使得$\mathsf{TV}(p,q)\leq 2\cdot opt+\epsilon$,且具有(接近)最优的样本复杂度$\tilde O(\log n/\epsilon^2)$。这是第一个同时达到最佳近似因子和最佳样本复杂度的算法:此前,Bousquet、Kane和Moran(COLT '19)给出的学习者达到了最优的$2$-近似,但其样本复杂度$\tilde O(\sqrt{n}/\epsilon^{2.5})$呈指数级变差;而Yatracos(Annals of Statistics '85)给出的学习者具有最优样本复杂度$O(\log n/\epsilon^2)$,但近似因子为次优的$3$。 摘要:Hypothesis Selection is a fundamental distribution learning problem where given a comparator-class $Q=\{q_1,\ldots, q_n\}$ of distributions, and a sampling access to an unknown target distribution $p$, the goal is to output a distribution $q$ such that $\mathsf{TV}(p,q)$ is close to $opt$, where $opt = \min_i \mathsf{TV}(p,q_i)$ and $\mathsf{TV}(\cdot, \cdot)$ denotes the total-variation distance. Despite the fact that this problem has been studied since the 19th century, its complexity in terms of basic resources, such as number of samples and approximation guarantees, remains unsettled (this is discussed, e.g., in the charming book by Devroye and Lugosi '00). This is in stark contrast with other (younger) learning settings, such as PAC learning, for which these complexities are well understood. We derive an optimal $2$-approximation learning strategy for the Hypothesis Selection problem, outputting $q$ such that $\mathsf{TV}(p,q) \leq 2 \cdot opt + \epsilon$, with a (nearly) optimal sample complexity of $\tilde O(\log n/\epsilon^2)$. This is the first algorithm that simultaneously achieves the best approximation factor and sample complexity: previously, Bousquet, Kane, and Moran (COLT '19) gave a learner achieving the optimal $2$-approximation, but with an exponentially worse sample complexity of $\tilde O(\sqrt{n}/\epsilon^{2.5})$, and Yatracos (Annals of Statistics '85) gave a learner with optimal sample complexity of $O(\log n/\epsilon^2)$ but with a sub-optimal approximation factor of $3$.
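作为背景,下面用Python演示经典的Scheffé竞赛式假设选择(即摘要中Yatracos '85一脉的思路)在离散分布上的最小实现;它并非本文提出的最优$2$-近似算法,仅用于说明问题设定,支撑集与候选分布均为玩具数据。

```python
import numpy as np

def scheffe_select(hypotheses, samples, support):
    """Classic Scheffe-tournament selection among candidate discrete
    distributions (illustrative; not the paper's new algorithm)."""
    emp = np.array([(samples == s).mean() for s in support])  # empirical masses
    wins = np.zeros(len(hypotheses))
    for i in range(len(hypotheses)):
        for j in range(len(hypotheses)):
            if i == j:
                continue
            A = hypotheses[i] > hypotheses[j]   # Scheffe set of the pair (i, j)
            di = abs(hypotheses[i][A].sum() - emp[A].sum())
            dj = abs(hypotheses[j][A].sum() - emp[A].sum())
            if di <= dj:                        # i's mass on A is closer to empirical
                wins[i] += 1
    return int(np.argmax(wins))

support = np.arange(6)
q1 = np.array([.3, .2, .2, .1, .1, .1])
q2 = np.array([.1, .1, .1, .2, .2, .3])
p_samples = np.random.default_rng(0).choice(support, 5000, p=q1)
print(scheffe_select([q1, q2], p_samples, support))  # expect 0
```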

【3】 Structure Parameter Optimized Kernel Based Online Prediction with a Generalized Optimization Strategy for Nonstationary Time Series 标题:基于非平稳时间序列广义优化策略的结构参数优化核在线预测 链接:https://arxiv.org/abs/2108.08180

作者:Jinhua Guo,Hao Chen,Jingxin Zhang,Sheng Chen 机构:Department of Automation, Tsinghua University 备注:Journal article submitted to T-SP 摘要:本文研究了非平稳时间序列在再生核希尔伯特空间中的稀疏化技术辅助在线预测算法。在线预测算法通常包括核结构参数的选择和核权向量的更新。对于结构参数,采用在线选择性建模准则,通过稀疏化技术选择核字典,并根据协方差矩阵自适应进化策略(CMA-ES)对核协方差矩阵进行间歇性优化。优化实对称协方差矩阵不仅可以利用输入变量的交叉相关性提高核结构的灵活性,而且可以部分缓解非平稳时间序列核字典选择带来的预测不确定性。为了充分捕捉预测误差时间序列的基本动态特性,设计了一种广义优化策略,在多个核连接模式下依次构造核字典。广义优化策略为构建整个内核连接提供了一种更加独立的方法,从而增强了自适应跟踪动态特性变化的能力。数值模拟表明,该方法对非平稳时间序列具有良好的预测性能。 摘要:In this paper, sparsification techniques aided online prediction algorithms in a reproducing kernel Hilbert space are studied for nonstationary time series. The online prediction algorithms as usual consist of the selection of kernel structure parameters and the kernel weight vector updating. For structure parameters, the kernel dictionary is selected by some sparsification techniques with online selective modeling criteria, and moreover the kernel covariance matrix is intermittently optimized in the light of the covariance matrix adaptation evolution strategy (CMA-ES). Optimizing the real symmetric covariance matrix can not only improve the kernel structure's flexibility by the cross relatedness of the input variables, but also partly alleviate the prediction uncertainty caused by the kernel dictionary selection for nonstationary time series. In order to sufficiently capture the underlying dynamic characteristics in prediction-error time series, a generalized optimization strategy is designed to construct the kernel dictionary sequentially in multiple kernel connection modes. The generalized optimization strategy provides a more self-contained way to construct the entire kernel connections, which enhances the ability to adaptively track the changing dynamic characteristics. Numerical simulations have demonstrated that the proposed approach has superior prediction performance for nonstationary time series.

【4】 On Multimarginal Partial Optimal Transport: Equivalent Forms and Computational Complexity 标题:关于多边际部分最优运输:等价形式和计算复杂度 链接:https://arxiv.org/abs/2108.07992

作者:Khang Le,Huy Nguyen,Tung Pham,Nhat Ho 机构:University of Texas, Austin†;, VinAI Research, Vietnam⋄ 备注:20 pages, 3 figures. Khang Le and Huy Nguyen contributed equally to this work 摘要:我们研究了$m$个支持度不超过$n$的离散(不平衡)测度之间的多边际部分最优运输(POT)问题。我们首先证明了通过代价张量的新扩展,我们可以从多边际最优运输问题的角度得到多边际POT问题的两种等价形式。第一种等价形式是在假设每个测度的总质量足够接近的情况下推导出来的,而第二种等价形式不需要对这些质量施加任何条件,但代价是使用更复杂的扩展代价张量。我们获得这些等价形式的证明技术依赖于图论中移动质量的新程序,以将运输计划推到适当的区域。最后,基于等价形式,我们开发了名为ApproxMPOT的优化算法,该算法建立在求解熵正则化多边际最优运输的Sinkhorn算法之上。我们证明了ApproxMPOT算法可以逼近多边际POT问题的最优值,其计算复杂度上界为$\tilde{\mathcal{O}}(m^3(n+1)^{m}/\varepsilon^2)$,其中$\varepsilon>0$表示期望的容差。 摘要:We study the multi-marginal partial optimal transport (POT) problem between $m$ discrete (unbalanced) measures with at most $n$ supports. We first prove that we can obtain two equivalence forms of the multimarginal POT problem in terms of the multimarginal optimal transport problem via novel extensions of cost tensor. The first equivalence form is derived under the assumptions that the total masses of each measure are sufficiently close while the second equivalence form does not require any conditions on these masses but at the price of more sophisticated extended cost tensor. Our proof techniques for obtaining these equivalence forms rely on novel procedures of moving mass in graph theory to push transportation plan into appropriate regions. Finally, based on the equivalence forms, we develop optimization algorithm, named ApproxMPOT algorithm, that builds upon the Sinkhorn algorithm for solving the entropic regularized multimarginal optimal transport. We demonstrate that the ApproxMPOT algorithm can approximate the optimal value of multimarginal POT problem with a computational complexity upper bound of the order $\tilde{\mathcal{O}}(m^3(n+1)^{m}/\varepsilon^2)$ where $\varepsilon > 0$ stands for the desired tolerance.
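下面是熵正则化多边际最优运输在$m=3$时Sinkhorn式迭代的最小numpy草图,用于说明ApproxMPOT所依赖的底层过程;注意这里只处理平衡(完全)运输,未包含部分运输(POT)的质量约束,规模与迭代次数均为演示性假设。

```python
import numpy as np

def multimarginal_sinkhorn(C, margs, eps=0.1, iters=200):
    """Entropic multimarginal OT for m=3 marginals via Sinkhorn-style
    scalings (a small illustrative sketch, not the paper's ApproxMPOT)."""
    K = np.exp(-C / eps)                      # Gibbs kernel, shape (n, n, n)
    u = [np.ones(d) for d in C.shape]         # one scaling vector per marginal
    for _ in range(iters):
        for k in range(3):
            P = np.einsum('ijk,i,j,k->ijk', K, u[0], u[1], u[2])
            mk = P.sum(axis=tuple(a for a in range(3) if a != k))
            u[k] *= margs[k] / mk             # fit the k-th marginal
    return np.einsum('ijk,i,j,k->ijk', K, u[0], u[1], u[2])

rng = np.random.default_rng(0)
n = 5
C = rng.random((n, n, n))                     # toy cost tensor
margs = [np.ones(n) / n] * 3
P = multimarginal_sinkhorn(C, margs)
print(P.sum(axis=(1, 2)))                     # approximately the first marginal
```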

预测|估计(2篇)

【1】 LOKI: Long Term and Key Intentions for Trajectory Prediction 标题:LOKI:轨迹预测的长期和关键意图 链接:https://arxiv.org/abs/2108.08236

作者:Harshayu Girase,Haiming Gang,Srikanth Malla,Jiachen Li,Akira Kanehara,Karttikeya Mangalam,Chiho Choi 机构:Honda Research Institute USA, University of California, Berkeley, Honda R&D Co., Ltd. 备注:ICCV 2021 (The dataset is available at this https URL) 摘要:轨迹预测方面的最新进展表明,关于主体意图的明确推理对于准确预测其运动非常重要。然而,目前的研究活动并不直接适用于智能和安全关键系统。这主要是因为可用的公共数据集很少,而且它们只从受限的自我中心视角出发、在较短的时间范围内考虑行人特有的意图。为此,我们提出了LOKI(长期和关键意图),这是一种新型的大规模数据集,旨在解决自主驾驶环境中异构交通代理(行人和车辆)的联合轨迹和意图预测问题。创建LOKI数据集是为了发现可能影响意图的几个因素,包括i)代理人自身意愿,ii)社会互动,iii)环境约束,以及iv)上下文信息。我们还提出了一个联合执行轨迹和意图预测的模型,表明关于意图的循环推理可以辅助轨迹预测。我们展示了我们的方法比最先进的轨迹预测方法最多高出27%,并且还为逐帧意图估计提供了基线。 摘要:Recent advances in trajectory prediction have shown that explicit reasoning about agents' intent is important to accurately forecast their motion. However, the current research activities are not directly applicable to intelligent and safety critical systems. This is mainly because very few public datasets are available, and they only consider pedestrian-specific intents for a short temporal horizon from a restricted egocentric view. To this end, we propose LOKI (LOng term and Key Intentions), a novel large-scale dataset that is designed to tackle joint trajectory and intention prediction for heterogeneous traffic agents (pedestrians and vehicles) in an autonomous driving setting. The LOKI dataset is created to discover several factors that may affect intention, including i) agent's own will, ii) social interactions, iii) environmental constraints, and iv) contextual information. We also propose a model that jointly performs trajectory and intention prediction, showing that recurrently reasoning about intention can assist with trajectory prediction. We show our method outperforms state-of-the-art trajectory prediction methods by up to 27% and also provides a baseline for frame-wise intention estimation.

【2】 DeepExpress: Heterogeneous and Coupled Sequence Modeling for Express Delivery Prediction 标题:DeepExpress:用于快递预测的异构耦合序列建模 链接:https://arxiv.org/abs/2108.08170

作者:Siyuan Ren,Bin Guo,Longbing Cao,Ke Li,Jiaqi Liu,Zhiwen Yu 机构:Northwestern Polytechnical University, University of Technology Sydney 摘要:快递序列的预测,即建模和估计每日进出包裹的数量,对于在线业务、物流和积极的客户体验,特别是对于资源分配优化和促销活动安排至关重要。对消费者交付请求的精确估计必须涉及序列性因素,如购物行为、天气条件、事件、商业活动及其耦合。此外,传统的序列预测假设序列演化稳定,无法处理复杂的非线性序列和上述多源数据中的各种特征效应。尽管深层网络和注意机制显示了复杂序列建模的潜力,但现有网络忽略了特征和序列之间的异构和耦合情况,导致预测精度低下。为了解决这些问题,我们提出了基于深度学习的快递序列预测模型DeepExpress,该模型将经典的seq2seq框架扩展到学习序列和特征之间的复杂耦合。DeepExpress利用快递seq2seq学习、精心设计的异构特征表示和新颖的联合训练注意机制自适应映射异构数据,并捕获序列-特征耦合以进行精确估计。对真实数据的实验结果表明,该方法优于浅层和深层基线模型。 摘要:The prediction of express delivery sequence, i.e., modeling and estimating the volumes of daily incoming and outgoing parcels for delivery, is critical for online business, logistics, and positive customer experience, and specifically for resource allocation optimization and promotional activity arrangement. A precise estimate of consumer delivery requests has to involve sequential factors such as shopping behaviors, weather conditions, events, business campaigns, and their couplings. Besides, conventional sequence prediction assumes a stable sequence evolution, failing to address complex nonlinear sequences and various feature effects in the above multi-source data. Although deep networks and attention mechanisms demonstrate the potential of complex sequence modeling, extant networks ignore the heterogeneous and coupling situation between features and sequences, resulting in weak prediction accuracy. To address these issues, we propose DeepExpress - a deep-learning based express delivery sequence prediction model, which extends the classic seq2seq framework to learning complex coupling between sequence and features. DeepExpress leverages an express delivery seq2seq learning, a carefully-designed heterogeneous feature representation, and a novel joint training attention mechanism to adaptively map heterogeneous data, and capture sequence-feature coupling for precise estimation. Experimental results on real-world data demonstrate that the proposed method outperforms both shallow and deep baseline models.

其他神经网络|深度学习|模型|建模(11篇)

【1】 ALLNet: A Hybrid Convolutional Neural Network to Improve Diagnosis of Acute Lymphocytic Leukemia (ALL) in White Blood Cells 标题:ALLNet:一种改进白细胞急性淋巴细胞白血病(ALL)诊断的混合卷积神经网络 链接:https://arxiv.org/abs/2108.08195

作者:Sai Mattapalli,Rishi Athavale 机构:Thomas Jefferson High School for Science and Technology; Academy of Engineering and Technology (equal contribution regardless of the order of names) 备注:20 pages, 13 figures, 4 tables 摘要:由于微观层面上的形态学相似性,在受急性淋巴细胞白血病(ALL)影响的血细胞与健康的血细胞之间进行准确和时间敏感的区分需要使用机器学习架构。然而,最常见的三种模型,VGG、ResNet和Inception,每种模型都有自己的缺陷,都有改进的余地,这就需要一种更优的模型。ALLNet是本文提出的混合卷积神经网络结构,由VGG、ResNet和Inception模型组合而成。ISBI 2019的ALL挑战赛数据集(此处提供)包含10691张白细胞图像,用于训练和测试模型。数据集中的7272张图像是含有ALL的细胞,3419张图像是健康细胞。在这些图像中,60%用于训练模型,20%用于交叉验证集,20%用于测试集。ALLNet在交叉验证集中的准确性为92.6567%,敏感性为95.5304%,特异性为85.9155%,AUC得分为0.966347,F1得分为0.94803,全面优于VGG、ResNet和Inception模型。在测试集中,ALLNet的准确度为92.0991%,敏感性为96.5446%,特异性为82.8035%,AUC评分为0.959972,F1评分为0.942963。在临床工作环境中使用ALLNet可以更好地治疗世界各地成千上万患有ALL的人,其中许多是儿童。 摘要:Due to morphological similarity at the microscopic level, making an accurate and time-sensitive distinction between blood cells affected by Acute Lymphocytic Leukemia (ALL) and their healthy counterparts calls for the usage of machine learning architectures. However, three of the most common models, VGG, ResNet, and Inception, each come with their own set of flaws with room for improvement which demands the need for a superior model. ALLNet, the proposed hybrid convolutional neural network architecture, consists of a combination of the VGG, ResNet, and Inception models. The ALL Challenge dataset of ISBI 2019 (available here) contains 10,691 images of white blood cells which were used to train and test the models. 7,272 of the images in the dataset are of cells with ALL and 3,419 of them are of healthy cells. Of the images, 60% were used to train the model, 20% were used for the cross-validation set, and 20% were used for the test set. ALLNet outperformed the VGG, ResNet, and the Inception models across the board, achieving an accuracy of 92.6567%, a sensitivity of 95.5304%, a specificity of 85.9155%, an AUC score of 0.966347, and an F1 score of 0.94803 in the cross-validation set. In the test set, ALLNet achieved an accuracy of 92.0991%, a sensitivity of 96.5446%, a specificity of 82.8035%, an AUC score of 0.959972, and an F1 score of 0.942963. The utilization of ALLNet in the clinical workspace can better treat the thousands of people suffering from ALL across the world, many of whom are children.
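下面用一个自包含的PyTorch草图示意"混合三种骨干风格、拼接特征后分类"的总体思路;各分支只取VGG/ResNet/Inception的最小代表性结构,通道数、层数等均为演示而假设,并非ALLNet的真实配置。

```python
import torch
import torch.nn as nn

class VGGBranch(nn.Module):                     # plain stacked convs, VGG-style
    def __init__(self, ch=16):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
    def forward(self, x):
        return self.body(x)

class ResBranch(nn.Module):                     # one residual block, ResNet-style
    def __init__(self, ch=16):
        super().__init__()
        self.inp = nn.Conv2d(3, ch, 3, padding=1)
        self.f = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))
        self.pool = nn.MaxPool2d(2)
    def forward(self, x):
        h = self.inp(x)
        return self.pool(torch.relu(h + self.f(h)))

class InceptionBranch(nn.Module):               # parallel multi-scale convs
    def __init__(self, ch=8):
        super().__init__()
        self.b1 = nn.Conv2d(3, ch, 1)
        self.b3 = nn.Conv2d(3, ch, 3, padding=1)
        self.pool = nn.MaxPool2d(2)
    def forward(self, x):
        return self.pool(torch.relu(torch.cat([self.b1(x), self.b3(x)], 1)))

class HybridNet(nn.Module):                     # fuse all three branches
    def __init__(self, n_classes=2):
        super().__init__()
        self.branches = nn.ModuleList([VGGBranch(), ResBranch(), InceptionBranch()])
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(16 + 16 + 16, n_classes))
    def forward(self, x):
        return self.head(torch.cat([b(x) for b in self.branches], dim=1))

print(HybridNet()(torch.randn(2, 3, 64, 64)).shape)  # torch.Size([2, 2])
```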

【2】 DeepCVA: Automated Commit-level Vulnerability Assessment with Deep Multi-task Learning 标题:DeepCVA:基于深度多任务学习的自动提交级漏洞评估 链接:https://arxiv.org/abs/2108.08041

作者:Triet H. M. Le,David Hin,Roland Croft,M. Ali Babar 机构:∗CREST - The Centre for Research on Engineering Software Technologies, The University of Adelaide, Australia, †Cyber Security Cooperative Research Centre, Australia 备注:Accepted as a full paper at the 36th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2021 摘要:越来越多的人建议在代码提交中识别软件漏洞(SV),以便对潜在的安全风险发出早期警告。然而,在检测到漏洞贡献提交后,缺乏对其进行评估,以及时提供有关SVs可利用性、影响和严重性的信息。这些信息对于规划和优先考虑已识别SV的缓解措施非常重要。我们提出了一种新的深度多任务学习模型DeepCVA,该模型基于通用漏洞评分系统(CVSS)度量同时自动化七个提交级漏洞评估任务。我们在246个真实软件项目中对1229个包含542个不同SV的漏洞贡献提交进行了大规模实验,以评估我们模型的有效性和效率。我们发现,与许多有监督和无监督的基线模型相比,DeepCVA是性能最好的模型,其Matthews相关系数高38%到59.8%。DeepCVA还需要比七个累积评估模型少6.3倍的训练和验证时间,从而大大降低了模型维护成本。总体而言,DeepCVA提供了第一个在软件系统早期自动评估SV的有效解决方案。 摘要:It is increasingly suggested to identify Software Vulnerabilities (SVs) in code commits to give early warnings about potential security risks. However, there is a lack of effort to assess vulnerability-contributing commits right after they are detected to provide timely information about the exploitability, impact and severity of SVs. Such information is important to plan and prioritize the mitigation for the identified SVs. We propose a novel Deep multi-task learning model, DeepCVA, to automate seven Commit-level Vulnerability Assessment tasks simultaneously based on Common Vulnerability Scoring System (CVSS) metrics. We conduct large-scale experiments on 1,229 vulnerability-contributing commits containing 542 different SVs in 246 real-world software projects to evaluate the effectiveness and efficiency of our model. We show that DeepCVA is the best-performing model with 38% to 59.8% higher Matthews Correlation Coefficient than many supervised and unsupervised baseline models. DeepCVA also requires 6.3 times less training and validation time than seven cumulative assessment models, leading to significantly less model maintenance cost as well. Overall, DeepCVA presents the first effective and efficient solution to automatically assess SVs early in software systems.
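下面是"共享编码器加七个任务头、联合损失"这一多任务结构的最小PyTorch草图;词表大小、编码器类型(GRU)和各任务类别数均为演示性假设,并非DeepCVA的实际设计。

```python
import torch
import torch.nn as nn

class CommitAssessor(nn.Module):
    """Shared commit encoder with seven task heads (illustrative sketch;
    the per-task class counts here are made up, not the CVSS ones)."""
    def __init__(self, vocab=5000, dim=128, task_classes=(3, 3, 3, 3, 3, 3, 3)):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.GRU(dim, dim, batch_first=True)
        self.heads = nn.ModuleList([nn.Linear(dim, c) for c in task_classes])

    def forward(self, tokens):
        _, h = self.encoder(self.embed(tokens))      # h: (1, B, dim)
        return [head(h[-1]) for head in self.heads]  # one logit set per task

model = CommitAssessor()
tokens = torch.randint(0, 5000, (4, 60))             # a batch of commit token ids
labels = [torch.randint(0, 3, (4,)) for _ in range(7)]
loss = sum(nn.functional.cross_entropy(out, y)       # joint multi-task loss
           for out, y in zip(model(tokens), labels))
loss.backward()
print(float(loss))
```

一个共享编码器同时服务七个头,正是摘要中"比七个独立评估模型少6.3倍训练时间"的直观来源。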

【3】 Verifying Low-dimensional Input Neural Networks via Input Quantization 标题:基于输入量化的低维输入神经网络验证 链接:https://arxiv.org/abs/2108.07961

作者:Kai Jia,Martin Rinard 机构:MIT CSAIL, Cambridge MA , USA 备注:SAS 2021 摘要:在机载防撞系统(ACAS)等系统中,深度神经网络是压缩控制策略查找表的一种有吸引力的工具。通过验证技术确保此类神经控制器的安全性至关重要。分析ACAS Xu网络的问题激励了许多成功的神经网络验证器。这些验证器通常分析神经网络的内部计算,以确定输入/输出的属性是否成立。神经网络计算的内在复杂性使得这种验证器运行缓慢,容易受到浮点错误的影响。本文重新讨论了验证ACAS Xu网络的原始问题。该网络利用预先计算的查找表提供的训练数据获取低维感官输入。我们建议在网络上预先设置一个输入量化层。量化允许通过输入状态枚举进行有效验证,其复杂性受量化空间大小的限制。量化相当于运行时的最近邻插值,这已被证明在模拟中为ACAS提供了可接受的精度。此外,如果我们直接枚举目标推理实现或目标实现的精确模拟上的网络输出,我们的技术可以提供不受浮点错误影响的精确验证结果。 摘要:Deep neural networks are an attractive tool for compressing the control policy lookup tables in systems such as the Airborne Collision Avoidance System (ACAS). It is vital to ensure the safety of such neural controllers via verification techniques. The problem of analyzing ACAS Xu networks has motivated many successful neural network verifiers. These verifiers typically analyze the internal computation of neural networks to decide whether a property regarding the input/output holds. The intrinsic complexity of neural network computation renders such verifiers slow to run and vulnerable to floating-point error. This paper revisits the original problem of verifying ACAS Xu networks. The networks take low-dimensional sensory inputs with training data provided by a precomputed lookup table. We propose to prepend an input quantization layer to the network. Quantization allows efficient verification via input state enumeration, whose complexity is bounded by the size of the quantization space. Quantization is equivalent to nearest-neighbor interpolation at run time, which has been shown to provide acceptable accuracy for ACAS in simulation. Moreover, our technique can deliver exact verification results immune to floating-point error if we directly enumerate the network outputs on the target inference implementation or on an accurate simulation of the target implementation.
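下面用一个玩具例子演示"前置输入量化层、再枚举整个量化空间做验证"的思路;其中的"网络"与待验证性质均为本文假设,仅用于说明复杂度由量化空间大小决定这一点。

```python
import itertools
import numpy as np

def quantize(x, grid):
    # Snap each input dimension to its nearest grid point (nearest-neighbor)
    return np.array([g[np.argmin(np.abs(g - xi))] for g, xi in zip(grid, x)])

def verify_by_enumeration(net, grid, prop):
    """Exhaustively check `prop` on every quantized input.
    Complexity is bounded by the size of the quantization space."""
    for point in itertools.product(*grid):
        if not prop(net(np.array(point))):
            return False, point                 # concrete counterexample
    return True, None

# toy 2-input "network" and a property: output stays below 1.5
net = lambda x: np.tanh(x).sum()
grid = [np.linspace(-1, 1, 21), np.linspace(-1, 1, 21)]
ok, cex = verify_by_enumeration(net, grid, lambda y: y < 1.5)
print(ok, cex)                                  # False with a counterexample
print(quantize(np.array([0.13, -0.52]), grid))  # run-time nearest-neighbor snap
```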

【4】 Learning to Collaborate 标题:学习协作 链接:https://arxiv.org/abs/2108.07926

作者:Sen Cui,Jian Liang,Weishen Pan,Kun Chen,Changshui Zhang,Fei Wang 机构:Institute for Artificial Intelligence, Tsinghua University (THUAI), State Key Lab of Intelligent Technologies and Systems, Beijing National Research Center for Information Science and Technology (BNRist) 摘要:在本文中,我们关注的是在涉及多个客户的协作研究网络上的有效学习。每个客户端都有自己的样本群体,由于隐私问题,可能无法与其他客户端共享。我们的目标是通过与网络中其他客户机的安全协作,为每个客户机学习一个模型,该模型比从其自身数据中学习的模型表现得更好。由于不同客户之间样本分布的差异,与每个人合作并不一定会产生最佳的本地模型。我们提出了一个学习协作框架,其中每个客户可以选择与网络中的某些成员协作以实现“协作均衡”,在网络中形成较小的协作联盟,以便每个客户都可以获得具有最佳效用的模型。我们提出了效益图的概念,它描述了每个客户如何从与其他客户的合作中获益,并开发了一种帕累托优化方法来获得它。最后,可以基于图操作从中导出协作联盟。我们的框架提供了一种在研究网络中建立协作的新方法。通过对合成数据集和真实数据集的实验,验证了该方法的有效性。 摘要:In this paper, we focus on effective learning over a collaborative research network involving multiple clients. Each client has its own sample population which may not be shared with other clients due to privacy concerns. The goal is to learn a model for each client, which behaves better than the one learned from its own data, through secure collaborations with other clients in the network. Due to the discrepancies of the sample distributions across different clients, it is not necessarily that collaborating with everyone will lead to the best local models. We propose a learning to collaborate framework, where each client can choose to collaborate with certain members in the network to achieve a "collaboration equilibrium", where smaller collaboration coalitions are formed within the network so that each client can obtain the model with the best utility. We propose the concept of benefit graph which describes how each client can benefit from collaborating with other clients and develop a Pareto optimization approach to obtain it. Finally the collaboration coalitions can be derived from it based on graph operations. Our framework provides a new way of setting up collaborations in a research network. Experiments on both synthetic and real world data sets are provided to demonstrate the effectiveness of our method.

【5】 Data Pricing in Machine Learning Pipelines 标题:机器学习管道中的数据定价 链接:https://arxiv.org/abs/2108.07915

作者:Zicun Cong,Xuan Luo,Pei Jian,Feida Zhu,Yong Zhang 摘要:机器学习具有颠覆性。与此同时,机器学习只有通过多方协作才能取得成功,在生态系统中自然地通过多个步骤进行协作,如为可能的机器学习应用收集数据、多方协作训练模型以及向最终用户提供机器学习服务。数据在整个机器学习流程中至关重要且无处不在。由于机器学习管道涉及多方,为了取得成功,必须形成一个建设性的动态生态系统,因此市场和数据定价对于连接和促进这些多方至关重要。本文综述了机器学习管道中数据定价的原理和最新研究进展。我们首先简要回顾一下数据市场和定价需求。然后,我们重点讨论机器学习管道中三个重要步骤的定价。为了理解训练数据收集步骤中的定价,我们回顾了原始数据集和数据标签的定价。我们还研究了机器学习模型协作训练步骤中的定价,并概述了机器学习部署步骤中面向最终用户的机器学习模型定价。我们还讨论了一系列可能的未来方向。 摘要:Machine learning is disruptive. At the same time, machine learning can only succeed by collaboration among many parties in multiple steps naturally as pipelines in an eco-system, such as collecting data for possible machine learning applications, collaboratively training models by multiple parties and delivering machine learning services to end users. Data is critical and penetrating in the whole machine learning pipelines. As machine learning pipelines involve many parties and, in order to be successful, have to form a constructive and dynamic eco-system, marketplaces and data pricing are fundamental in connecting and facilitating those many parties. In this article, we survey the principles and the latest research development of data pricing in machine learning pipelines. We start with a brief review of data marketplaces and pricing desiderata. Then, we focus on pricing in three important steps in machine learning pipelines. To understand pricing in the step of training data collection, we review pricing raw data sets and data labels. We also investigate pricing in the step of collaborative training of machine learning models, and overview pricing machine learning models for end users in the step of machine learning deployment. We also discuss a series of possible future directions.

【6】 Modulating Language Models with Emotions 标题:用情感调整语言模型 链接:https://arxiv.org/abs/2108.07886

作者:Ruibo Liu,Jason Wei,Chenyan Jia,Soroush Vosoughi 机构:Dartmouth College, ProtagoLabs, University of Texas at Austin 备注:Findings of ACL 2021 摘要:生成包含不同情感的上下文感知语言是构建移情NLP系统的重要一步。在本文中,我们提出了一种调制层规范化的公式——一种受计算机视觉启发的技术——它允许我们使用大规模语言模型来生成情绪反应。在MojiTalk数据集的自动和人工评估中,我们提出的调制层标准化方法在保持多样性、流畅性和一致性的同时优于先前的基线方法。即使只使用10%的可用训练数据,我们的方法也能获得有竞争力的性能。 摘要:Generating context-aware language that embodies diverse emotions is an important step towards building empathetic NLP systems. In this paper, we propose a formulation of modulated layer normalization -- a technique inspired by computer vision -- that allows us to use large-scale language models for emotional response generation. In automatic and human evaluation on the MojiTalk dataset, our proposed modulated layer normalization method outperforms prior baseline methods while maintaining diversity, fluency, and coherence. Our method also obtains competitive performance even when using only 10% of the available training data.
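下面给出"用情感嵌入调制LayerNorm的缩放与偏移"(FiLM风格)的最小PyTorch草图;情感类别数、嵌入维度等均为演示性假设,并非论文的确切公式。

```python
import torch
import torch.nn as nn

class ModulatedLayerNorm(nn.Module):
    """LayerNorm whose scale/shift are predicted from an emotion embedding.
    A FiLM-style sketch of the idea; sizes here are illustrative guesses."""
    def __init__(self, hidden, n_emotions=64, emo_dim=32):
        super().__init__()
        self.ln = nn.LayerNorm(hidden, elementwise_affine=False)
        self.emo = nn.Embedding(n_emotions, emo_dim)
        self.to_gamma = nn.Linear(emo_dim, hidden)
        self.to_beta = nn.Linear(emo_dim, hidden)

    def forward(self, h, emotion_id):
        e = self.emo(emotion_id)                 # (B, emo_dim)
        gamma = self.to_gamma(e).unsqueeze(1)    # (B, 1, hidden)
        beta = self.to_beta(e).unsqueeze(1)
        return (1 + gamma) * self.ln(h) + beta   # emotion-conditioned norm

h = torch.randn(2, 10, 256)                      # (batch, seq, hidden)
print(ModulatedLayerNorm(256)(h, torch.tensor([3, 7])).shape)
```

这样,预训练语言模型的主体权重可以保持不变,情感信号只通过归一化层的仿射参数注入。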

【7】 An Extensible Benchmark Suite for Learning to Simulate Physical Systems 标题:一种用于学习模拟物理系统的可扩展基准测试套件 链接:https://arxiv.org/abs/2108.07799

作者:Karl Otness,Arvi Gjoka,Joan Bruna,Daniele Panozzo,Benjamin Peherstorfer,Teseo Schneider,Denis Zorin 机构:Courant Institute of Mathematical Sciences, New York University, University of Victoria 备注:Accepted to NeurIPS 2021 track on datasets and benchmarks 摘要:模拟物理系统是科学计算的核心组成部分,涵盖了广泛的物理领域和应用。最近,由于有机会降低计算成本和/或利用对大量数据的访问学习新的物理模型,数据驱动方法在传统数值模拟方法的补充方面出现了激增。然而,问题设置和应用的多样性导致了过多的方法,每种方法都在不同的设置和不同的评估指标上进行评估。我们引入了一组基准问题,以朝着统一基准和评估协议迈出一步。我们提出了四个具有代表性的物理系统,以及广泛使用的经典时间积分器和具有代表性的数据驱动方法(基于内核、MLP、CNN、最近邻)的集合。我们的框架允许客观和系统地评估数据驱动方法的稳定性、准确性和计算效率。此外,它是可配置的,允许调整以适应其他学习任务,并为科学计算机器学习的未来发展奠定基础。 摘要:Simulating physical systems is a core component of scientific computing, encompassing a wide range of physical domains and applications. Recently, there has been a surge in data-driven methods to complement traditional numerical simulations methods, motivated by the opportunity to reduce computational costs and/or learn new physical models leveraging access to large collections of data. However, the diversity of problem settings and applications has led to a plethora of approaches, each one evaluated on a different setup and with different evaluation metrics. We introduce a set of benchmark problems to take a step towards unified benchmarks and evaluation protocols. We propose four representative physical systems, as well as a collection of both widely used classical time integrators and representative data-driven methods (kernel-based, MLP, CNN, nearest neighbors). Our framework allows evaluating objectively and systematically the stability, accuracy, and computational efficiency of data-driven methods. Additionally, it is configurable to permit adjustments for accommodating other learning tasks and for establishing a foundation for future developments in machine learning for scientific computing.

【8】 Distinguishing Healthy Ageing from Dementia: a Biomechanical Simulation of Brain Atrophy using Deep Networks 标题:区分健康衰老和痴呆症:基于深度网络的脑萎缩生物力学模拟 链接:https://arxiv.org/abs/2108.08214

作者:Mariana Da Silva,Carole H. Sudre,Kara Garcia,Cher Bass,M. Jorge Cardoso,Emma C. Robinson 机构: School of Biomedical Engineering and Imaging Sciences, King’s College London, MRC Unit for Lifelong Health and Ageing at UCL, University College London, Centre for Medical Image Computing, Department of Computer Science, University 备注:MLCN 2021 摘要:组织变形的生物力学模型可以用来模拟大脑纵向演化的不同场景。在这项工作中,我们为健康老龄化和阿尔茨海默病期间的脑萎缩超弹性应变建模提供了一个深入的学习框架。该框架直接模拟年龄、疾病状态和扫描间隔的影响,以回归萎缩的区域模式,基于应变的模型据此估计变形。该模型使用ADNI队列的3D结构磁共振成像数据进行训练和验证。结果表明,该框架可以根据阿尔茨海默病的已知病程估计真实的变形,从而明确区分健康和痴呆的衰老模式。这表明该框架有可能被纳入疾病的可解释模型中,用于探索干预措施和反事实的例子。 摘要:Biomechanical modeling of tissue deformation can be used to simulate different scenarios of longitudinal brain evolution. In this work,we present a deep learning framework for hyper-elastic strain modelling of brain atrophy, during healthy ageing and in Alzheimer's Disease. The framework directly models the effects of age, disease status, and scan interval to regress regional patterns of atrophy, from which a strain-based model estimates deformations. This model is trained and validated using 3D structural magnetic resonance imaging data from the ADNI cohort. Results show that the framework can estimate realistic deformations, following the known course of Alzheimer's disease, that clearly differentiate between healthy and demented patterns of ageing. This suggests the framework has potential to be incorporated into explainable models of disease, for the exploration of interventions and counterfactual examples.

【9】 Moser Flow: Divergence-based Generative Modeling on Manifolds 标题:Moser流:基于散度的流形产生式建模 链接:https://arxiv.org/abs/2108.08052

作者:Noam Rozen,Aditya Grover,Maximilian Nickel,Yaron Lipman 机构:FAIR and WIS 摘要:我们感兴趣的是学习通过流形描述的复杂几何体的生成模型,例如球面、环面和其他隐式曲面。现有(欧几里得)生成模型的当前扩展仅限于特定的几何结构,并且通常会遭受较高的计算成本。我们介绍了连续规范化流(CNF)族中的一类新的生成模型——Moser流(MF)。MF同样通过变量替换公式的解来产生CNF,但与其他CNF方法不同,其模型(学习到的)密度被参数化为源(先验)密度减去神经网络(NN)的散度。散度是一个局部线性微分算子,易于在流形上逼近和计算。因此,与其他CNF不同,MF在训练期间不需要调用ODE求解器或通过其反向传播。此外,将模型密度显式表示为NN的散度而非ODE的解,有助于学习高保真密度。理论上,我们在适当的假设下证明了MF是一个普适密度近似器。从经验上看,我们首次展示了使用流模型从一般弯曲曲面采样,并在具有挑战性的合成几何以及来自地球与气候科学的真实基准上,在密度估计、样本质量和训练复杂性方面比现有CNF取得了显著改进。 摘要:We are interested in learning generative models for complex geometries described via manifolds, such as spheres, tori, and other implicit surfaces. Current extensions of existing (Euclidean) generative models are restricted to specific geometries and typically suffer from high computational costs. We introduce Moser Flow (MF), a new class of generative models within the family of continuous normalizing flows (CNF). MF also produces a CNF via a solution to the change-of-variable formula, however differently from other CNF methods, its model (learned) density is parameterized as the source (prior) density minus the divergence of a neural network (NN). The divergence is a local, linear differential operator, easy to approximate and calculate on manifolds. Therefore, unlike other CNFs, MF does not require invoking or backpropagating through an ODE solver during training. Furthermore, representing the model density explicitly as the divergence of a NN rather than as a solution of an ODE facilitates learning high fidelity densities. Theoretically, we prove that MF constitutes a universal density approximator under suitable assumptions. Empirically, we demonstrate for the first time the use of flow models for sampling from general curved surfaces and achieve significant improvements in density estimation, sample quality, and training complexity over existing CNFs on challenging synthetic geometries and real-world benchmarks from the earth and climate sciences.
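下面的PyTorch草图演示Moser Flow的核心参数化:模型密度等于先验密度减去神经网络向量场的散度,其中散度用自动微分精确计算(欧氏二维情形)。网络结构与先验均为演示性假设;真实方法还需惩罚负密度并在流形上计算。

```python
import math
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 2))

def divergence(f, x):
    # Exact divergence of a 2-D vector field f at points x, via autograd
    x = x.requires_grad_(True)
    y = f(x)
    div = torch.zeros(x.shape[0])
    for i in range(2):
        grad = torch.autograd.grad(y[:, i].sum(), x, create_graph=True)[0]
        div += grad[:, i]
    return div

def model_density(x, prior_log_density):
    # Moser-Flow-style parameterization: mu_model = mu_prior - div(v_theta);
    # the real method additionally penalizes the negative part of this value
    return prior_log_density(x).exp() - divergence(net, x)

# standard 2-D Gaussian as the source (prior) density
prior = lambda x: -0.5 * (x ** 2).sum(-1) - math.log(2 * math.pi)
x = torch.randn(8, 2)
print(model_density(x, prior))
```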

【10】 Aggregated Customer Engagement Model 标题:聚合客户参与度模型 链接:https://arxiv.org/abs/2108.07872

作者:Priya Gupta,Cuize Han 机构:Amazon 摘要:电子商务网站使用机器学习的排名模型向客户提供购物结果。通常,网站会记录客户搜索事件,其中包括输入的查询以及由此产生的与购物结果的互动,如点击和购买。每个客户搜索事件都作为模型的输入训练数据,单个客户参与度作为客户偏好的信号。因此,例如,购买的购物结果被认为比不购买的结果更重要。然而,新产品或印象不足的产品没有足够的客户参与信号,在与流行产品并列时处于劣势。在本文中,我们提出了一种新的数据管理方法,该方法在一天内聚合所有客户参与,以便将同一查询用作输入训练数据。这种聚合的客户参与度为模型提供了购物结果相对重要性的完整图像。基于此聚合数据的训练模型减少了对行为特征的依赖。这有助于缓解冷启动问题,并将相关新产品推到搜索结果的前列。在本文中,我们提供了离线和在线分析,并比较了在电子商务数据上训练的单个和聚合客户参与模型的结果。 摘要:E-commerce websites use machine learned ranking models to serve shopping results to customers. Typically, the websites log the customer search events, which include the query entered and the resulting engagement with the shopping results, such as clicks and purchases. Each customer search event serves as input training data for the models, and the individual customer engagement serves as a signal for customer preference. So a purchased shopping result, for example, is perceived to be more important than one that is not. However, new or under-impressed products do not have enough customer engagement signals and end up at a disadvantage when being ranked alongside popular products. In this paper, we propose a novel method for data curation that aggregates all customer engagements within a day for the same query to use as input training data. This aggregated customer engagement gives the models a complete picture of the relative importance of shopping results. Training models on this aggregated data leads to less reliance on behavioral features. This helps mitigate the cold start problem and boosted relevant new products to top search results. In this paper, we present the offline and online analysis and results comparing the individual and aggregated customer engagement models trained on e-commerce data.
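下面用pandas给出"按天与查询聚合同一商品的全部用户参与信号"这一数据整理步骤的最小示例;字段名与玩具日志均为本文假设,仅用于说明聚合逻辑。

```python
import pandas as pd

# Toy search log: one row per (query, product) customer engagement event
log = pd.DataFrame({
    "date":      ["2021-08-18"] * 5,
    "query":     ["shoes", "shoes", "shoes", "mug", "mug"],
    "product":   ["A", "A", "B", "C", "C"],
    "clicks":    [1, 0, 1, 1, 1],
    "purchases": [0, 1, 0, 0, 1],
})

# Aggregate all engagement for the same query within a day, so each
# (query, product) training example reflects relative importance
agg = (log.groupby(["date", "query", "product"], as_index=False)
          .agg(impressions=("clicks", "size"),   # rows as an impression proxy
               clicks=("clicks", "sum"),
               purchases=("purchases", "sum")))
print(agg)
```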

【11】 OncoPetNet: A Deep Learning based AI system for mitotic figure counting on H&E stained whole slide digital images in a large veterinary diagnostic lab setting 标题:OncoPetNet:一种基于深度学习的大型兽医诊断实验室H&E染色全玻片数字图像有丝分裂图像人工智能系统 链接:https://arxiv.org/abs/2108.07856

作者:Michael Fitzke,Derick Whitley,Wilson Yau,Fernando Rodrigues Jr,Vladimir Fadeev,Cindy Bacmeister,Chris Carter,Jeffrey Edwards,Matthew P. Lungren,Mark Parkinson 机构:Mars Digital Technologies, Antech Diagnostics, Stanford University 摘要:背景:组织病理学是现代医疗中许多疾病诊断和治疗的重要手段,在癌症治疗中起着至关重要的作用。病理学样本可能很大,需要多点取样,导致单个肿瘤的载玻片多达20张,而人类专家的位置选择和有丝分裂图的定量评估任务既耗时又主观。在数字病理学服务环境中实现这些任务的自动化为提高工作流效率和增强实践中的人类专家提供了重要机会。方法:在OncoPetNet的开发过程中,使用了多种最先进的组织病理学图像分类和有丝分裂图检测深度学习技术。此外,采用了无模型方法来提高速度和精度。健壮且可扩展的推理引擎利用PyTorch的性能优化以及专门开发的推理加速技术。结果:与人类专家基线相比,拟议的系统在14种癌症类型的41例癌症病例中显示出显著改善的有丝分裂计数性能。与人类专家评估相比,21.9%的病例使用OncoPetNet导致肿瘤分级发生变化。在部署过程中,在2个中心的高通量兽医病理诊断服务中实现了0.27分钟/张玻片的有效推理速度,每天处理3323张数字全玻片图像。结论:这项工作代表了首次成功地自动化部署深度学习系统,以便在大规模临床实践中,在重要的组织病理学任务中实现实时专家级性能。由此产生的影响概述了模型开发、部署、临床决策的重要考虑因素,并为数字组织病理学实践中实施深度学习系统的最佳实践提供了参考。 摘要:Background: Histopathology is an important modality for the diagnosis and management of many diseases in modern healthcare, and plays a critical role in cancer care. Pathology samples can be large and require multi-site sampling, leading to upwards of 20 slides for a single tumor, and the human-expert tasks of site selection and quantitative assessment of mitotic figures are time consuming and subjective. Automating these tasks in the setting of a digital pathology service presents significant opportunities to improve workflow efficiency and augment human experts in practice. Approach: Multiple state-of-the-art deep learning techniques for histopathology image classification and mitotic figure detection were used in the development of OncoPetNet. Additionally, model-free approaches were used to increase speed and accuracy. The robust and scalable inference engine leverages PyTorch's performance optimizations as well as specifically developed speed up techniques in inference. Results: The proposed system demonstrated significantly improved mitotic counting performance for 41 cancer cases across 14 cancer types compared to human expert baselines. In 21.9% of cases use of OncoPetNet led to change in tumor grading compared to human expert evaluation. In deployment, an effective 0.27 min/slide inference was achieved in a high throughput veterinary diagnostic pathology service across 2 centers processing 3,323 digital whole slide images daily. Conclusion: This work represents the first successful automated deployment of deep learning systems for real-time expert-level performance on important histopathology tasks at scale in a high volume clinical practice. The resulting impact outlines important considerations for model development, deployment, clinical decision making, and informs best practices for implementation of deep learning systems in digital histopathology practices.

其他(14篇)

【1】 OACAL: Finding Module-Consistent Solutions to Weaken User Obligations 标题:OACAL:寻找模块一致的解决方案来削弱用户义务 链接:https://arxiv.org/abs/2108.08282

作者:Pengcheng Jiang,Kenji Tei 机构:Tokyo, Japan, Waseda University, National Institute of Informatics 备注:9 pages, 15 figures, 3 tables, not submitted to conference yet 摘要:与UI嵌入式机器或系统交互的用户通常必须按照预先确定的顺序执行操作,以成功实现某些功能目标。然而,用户往往没有严格遵守这些义务,这可能导致违反安全属性,特别是在安全关键系统中。为了提高系统的安全性和对用户意外行为的感知能力,可以通过改变系统规范中的操作顺序将系统重新设计为更健壮的系统。同时,我们预计修改后功能将保持一致。在本文中,我们提出了一种有效的算法来自动生成规范修订,以应对因用户义务减弱而导致的攻击场景。通过我们的算法,所有的修改都保持了原始规范中功能的完整性,这些规范是使用一种新的重组方法生成的。然后,结合模型检查和机器学习技术的混合方法可以有效地发现满足安全性要求的合格修订。我们通过比较我们的算法与最先进的方法在覆盖率和搜索速度方面的性能来评估我们的算法。 摘要:Users interacting with a UI-embedded machine or system are typically obliged to perform their actions in a pre-determined order, to successfully achieve certain functional goals. However, such obligations are often not followed strictly by users, which may lead to the violation to security properties, especially in security-critical systems. In order to improve the security with the awareness of unexpected user behaviors, a system can be redesigned to a more robust one by changing the order of actions in its specification. Meanwhile, we anticipate that the functionalities would remain consistent following the modifications. In this paper, we propose an efficient algorithm to automatically produce specification revisions tackling with attack scenarios caused by the weakened user obligations. By our algorithm, all the revisions maintain the integrity of the functionalities as the original specification, which are generated using a novel recomposition approach. Then, the qualified revisions that can satisfy the security requirements would be efficiently spotted by a hybrid approach combining model checking and machine learning techniques. We evaluate our algorithm by comparing its performance with a state-of-the-art approach regarding their coverage and searching speed of the desirable revisions.

【2】 X-modaler: A Versatile and High-performance Codebase for Cross-modal Analytics 标题:X-modaler:一个用于跨模态分析的通用高性能代码库 链接:https://arxiv.org/abs/2108.08217

作者:Yehao Li,Yingwei Pan,Jingwen Chen,Ting Yao,Tao Mei 机构:JD AI Research, Beijing, China 备注:Accepted by 2021 ACMMM Open Source Software Competition. Source code: this https URL 摘要:随着深度学习在过去十年中的兴起和发展,出现了稳定的创新和突破势头,令人信服地推动了多媒体领域视觉和语言之间的跨模态分析的最新发展。然而,还没有一个开源代码库来支持以统一和模块化的方式训练和部署用于跨模态分析的众多神经网络模型。在这项工作中,我们提出了X-modaler——一种通用的高性能代码库,它将最先进的跨模态分析封装到几个通用阶段(例如预处理、编码器、跨模态交互、解码器和解码策略)。每个阶段都具有功能,涵盖了一系列在最新技术中广泛采用的模块,并允许在这些模块之间无缝切换。这种方式自然能够灵活地实现图像字幕、视频字幕和视觉语言预训练的最新算法,以促进研究社区的快速发展。同时,由于几个阶段的有效模块化设计(例如,跨模态交互)在不同的视觉语言任务中共享,因此X-modaler可以简单地扩展为跨模态分析中其他任务的启动原型,包括视觉问答、视觉常识推理和跨模态检索。X-modaler是Apache许可的代码库,其源代码、示例项目和预先训练的模型可在线获取:https://github.com/YehLi/xmodaler. 摘要:With the rise and development of deep learning over the past decade, there has been a steady momentum of innovation and breakthroughs that convincingly push the state-of-the-art of cross-modal analytics between vision and language in multimedia field. Nevertheless, there has not been an open-source codebase in support of training and deploying numerous neural network models for cross-modal analytics in a unified and modular fashion. In this work, we propose X-modaler -- a versatile and high-performance codebase that encapsulates the state-of-the-art cross-modal analytics into several general-purpose stages (e.g., pre-processing, encoder, cross-modal interaction, decoder, and decode strategy). Each stage is empowered with the functionality that covers a series of modules widely adopted in state-of-the-arts and allows seamless switching in between. This way naturally enables a flexible implementation of state-of-the-art algorithms for image captioning, video captioning, and vision-language pre-training, aiming to facilitate the rapid development of research community. Meanwhile, since the effective modular designs in several stages (e.g., cross-modal interaction) are shared across different vision-language tasks, X-modaler can be simply extended to power startup prototypes for other tasks in cross-modal analytics, including visual question answering, visual commonsense reasoning, and cross-modal retrieval. X-modaler is an Apache-licensed codebase, and its source codes, sample projects and pre-trained models are available on-line: https://github.com/YehLi/xmodaler.

【3】 Generalizing MLPs With Dropouts, Batch Normalization, and Skip Connections 标题:使用辍学、批归一化和跳过连接对MLP进行泛化 链接:https://arxiv.org/abs/2108.08186

作者:Taewoon Kim 机构:Vrije Universiteit Amsterdam 备注:8 pages not including references 摘要:多层感知器(MLP)通常由多个带非线性激活函数的全连接层组成。已经有多种方法使其变得更好(例如更快的收敛速度、更好的收敛极限等),但相关研究缺乏更结构化的测试方式。我们通过在年龄和性别数据集上进行实验来测试不同的MLP架构。我们的经验表明,通过在每个线性层之前白化输入并添加跳跃连接,我们提出的MLP架构可以获得更好的性能。由于白化过程包含dropout,它也可用于近似贝叶斯推断。我们已在 https://github.com/tae898/age-gender/ 开源了代码,并发布了模型和Docker镜像。 摘要:A multilayer perceptron (MLP) is typically made of multiple fully connected layers with nonlinear activation functions. There have been several approaches to make them better (e.g. faster convergence, better convergence limit, etc.). But the research lacks more structured ways to test them. We test different MLP architectures by carrying out the experiments on the age and gender datasets. We empirically show that by whitening inputs before every linear layer and adding skip connections, our proposed MLP architecture can result in better performance. Since the whitening process includes dropouts, it can also be used to approximate Bayesian inference. We have open-sourced our code, released models, and Docker images at https://github.com/tae898/age-gender/.
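下面按摘要的描述给出"每个线性层前先白化(这里用BatchNorm加Dropout示意)并加跳跃连接"的MLP最小PyTorch草图;层宽、dropout率等均为演示性假设,并非论文的确切配置。

```python
import torch
import torch.nn as nn

class Block(nn.Module):
    """Whiten (BatchNorm + Dropout) before the linear layer, plus a skip;
    a sketch of the described design, hyperparameters are guesses."""
    def __init__(self, dim, p=0.2):
        super().__init__()
        self.f = nn.Sequential(nn.BatchNorm1d(dim), nn.Dropout(p),
                               nn.Linear(dim, dim), nn.ReLU())
    def forward(self, x):
        return x + self.f(x)                   # residual/skip connection

class GeneralizedMLP(nn.Module):
    def __init__(self, d_in=128, dim=256, n_blocks=3, n_out=2):
        super().__init__()
        self.inp = nn.Linear(d_in, dim)
        self.blocks = nn.Sequential(*[Block(dim) for _ in range(n_blocks)])
        self.out = nn.Linear(dim, n_out)
    def forward(self, x):
        return self.out(self.blocks(self.inp(x)))

print(GeneralizedMLP()(torch.randn(4, 128)).shape)  # torch.Size([4, 2])
```

由于白化块中含有dropout,测试时保持dropout开启、对多次前向取平均,即可做MC-dropout式的近似贝叶斯推断。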

【4】 Single-DARTS: Towards Stable Architecture Search 标题:Single-DARTS:迈向稳定的架构搜索 链接:https://arxiv.org/abs/2108.08128

作者:Pengfei Hou,Ying Jin,Yukang Chen 机构:Tsinghua University, The Chinese University of Hong Kong 备注:Accepted by ICCV 2021 NeurArch Workshop 摘要:可微结构搜索(DARTS)是神经结构搜索(NAS)的一个里程碑,具有简单性和较小的搜索成本。但是,DARTS仍然经常受到性能崩溃的影响,这种情况发生在某些操作(如跳跃连接、零操作和池化)主导体系结构时。在本文中,我们首先指出这种现象是由双层优化造成的。我们提出了Single-DARTS,它仅使用单级优化,用同一数据批同时更新网络权重和结构参数。尽管此前已有人尝试过单级优化,但尚无文献对这一关键点给出系统的解释。Single-DARTS取代了双层优化,明显缓解了性能崩溃,并增强了架构搜索的稳定性。实验结果表明,Single-DARTS在主流搜索空间上达到了最先进的性能。例如,在NAS-Benchmark-201上,搜索到的体系结构几乎是最优的。我们还验证了单级优化框架比双层优化框架稳定得多。我们希望这种简单而有效的方法能为可微体系结构搜索提供一些见解。该代码可在 https://github.com/PencilAndBike/Single-DARTS.git 获取。 摘要:Differentiable architecture search (DARTS) marks a milestone in Neural Architecture Search (NAS), boasting simplicity and small search costs. However, DARTS still suffers from frequent performance collapse, which happens when some operations, such as skip connections, zeroes and poolings, dominate the architecture. In this paper, we are the first to point out that the phenomenon is attributed to bi-level optimization. We propose Single-DARTS which merely uses single-level optimization, updating network weights and architecture parameters simultaneously with the same data batch. Even single-level optimization has been previously attempted, no literature provides a systematic explanation on this essential point. Replacing the bi-level optimization, Single-DARTS obviously alleviates performance collapse as well as enhances the stability of architecture search. Experiment results show that Single-DARTS achieves state-of-the-art performance on mainstream search spaces. For instance, on NAS-Benchmark-201, the searched architectures are nearly optimal ones. We also validate that the single-level optimization framework is much more stable than the bi-level one. We hope that this simple yet effective method will give some insights on differential architecture search. The code is available at https://github.com/PencilAndBike/Single-DARTS.git.
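下面用一个玩具混合操作演示Single-DARTS的关键点:同一小批数据、同一步内同时更新权重$w$与结构参数$\alpha$(单级优化),而非DARTS在两批数据上的双层交替。真实实现中两组参数通常仍各用一个优化器,这里为简洁合并为一个;模型与数据均为假设的玩具设定。

```python
import torch

w = torch.randn(10, 1, requires_grad=True)      # stand-in network weights
alpha = torch.zeros(2, requires_grad=True)      # two candidate operations

def forward(x):
    probs = torch.softmax(alpha, dim=0)         # mixture over operations
    return probs[0] * (x @ w) + probs[1] * torch.relu(x @ w)

opt = torch.optim.SGD([w, alpha], lr=0.1)       # one optimizer, one batch
for _ in range(100):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = ((forward(x) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()                             # grads for w AND alpha together
    opt.step()                                  # joint single-level update
print(torch.softmax(alpha, 0))                  # learned operation weights
```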

【5】 RANK-NOSH: Efficient Predictor-Based Architecture Search via Non-Uniform Successive Halving 标题:RANK-NOSH:基于非均匀连续减半的高效预测器体系结构搜索 链接:https://arxiv.org/abs/2108.08019

作者:Ruochen Wang,Xiangning Chen,Minhao Cheng,Xiaocheng Tang,Cho-Jui Hsieh 机构:Department of Computer Science, UCLA, DiDi AI Labs 备注:To Appear in ICCV2021. The code will be released shortly at this https URL 摘要:基于预测器的算法在神经结构搜索(NAS)任务中取得了显著的性能。然而,这些方法的计算成本很高,因为训练性能预测器通常需要从头开始训练和评估数百种体系结构。以前的工作主要集中在减少适应预测器所需的体系结构数量。在这项工作中,我们从另一个角度来应对这一挑战——通过减少体系结构训练的计算预算来提高搜索效率。我们提出了非均匀连续减半(NOSH)算法,这是一种分层调度算法,可以提前终止对性能不佳的体系结构的训练,以避免浪费预算。为了有效地利用NOSH产生的非均匀监督信号,我们将基于预测器的架构搜索描述为通过成对比较学习排序。由此产生的RANK-NOSH方法将搜索预算减少了约5倍,同时在各种空间和数据集上实现了比以前基于最先进的预测器的方法更具竞争力甚至更好的性能。 摘要:Predictor-based algorithms have achieved remarkable performance in the Neural Architecture Search (NAS) tasks. However, these methods suffer from high computation costs, as training the performance predictor usually requires training and evaluating hundreds of architectures from scratch. Previous works along this line mainly focus on reducing the number of architectures required to fit the predictor. In this work, we tackle this challenge from a different perspective - improve search efficiency by cutting down the computation budget of architecture training. We propose NOn-uniform Successive Halving (NOSH), a hierarchical scheduling algorithm that terminates the training of underperforming architectures early to avoid wasting budget. To effectively leverage the non-uniform supervision signals produced by NOSH, we formulate predictor-based architecture search as learning to rank with pairwise comparisons. The resulting method - RANK-NOSH, reduces the search budget by ~5x while achieving competitive or even better performance than previous state-of-the-art predictor-based methods on various spaces and datasets.

【6】 Look Before You Leap! Designing a Human-Centered AI System for Change Risk Assessment 标题:三思而后行!设计一个以人为中心的变更风险评估人工智能系统 链接:https://arxiv.org/abs/2108.07951

作者:Binay Gupta,Anirban Chatterjee,Harika Matha,Kunal Banerjee,Lalitdutt Parsai,Vijay Agneeswaran 机构:Walmart Global Tech, Bangalore, India 摘要:减少生产系统中的故障数量是技术驱动行业(如在线零售行业)中最具挑战性的问题之一。为了应对这一挑战,变更管理已经成为运营中一个很有前途的子领域,它以系统的方式管理和审查将在生产中部署的变更。然而,每天手动审查大量变更并评估与之相关的风险实际上是不可能的。这就需要开发一个自动化系统来评估与大量变更相关的风险。有一些商业解决方案可以解决这个问题,但这些解决方案缺乏将领域知识和领域专家的持续反馈纳入风险评估过程的能力。作为这项工作的一部分,我们的目标是通过在风险评估过程中建立一个持续的反馈回路,弥合模型驱动的变更请求风险评估和领域专家评估之间的差距。在这里,我们介绍了我们构建端到端机器学习系统的工作,并讨论了我们面临的一些实际挑战,这些挑战涉及到类分布的极端偏斜、概念漂移、与模型预测相关的不确定性估计以及系统的整体可伸缩性。 摘要:Reducing the number of failures in a production system is one of the most challenging problems in technology driven industries, such as, the online retail industry. To address this challenge, change management has emerged as a promising sub-field in operations that manages and reviews the changes to be deployed in production in a systematic manner. However, it is practically impossible to manually review a large number of changes on a daily basis and assess the risk associated with them. This warrants the development of an automated system to assess the risk associated with a large number of changes. There are a few commercial solutions available to address this problem but those solutions lack the ability to incorporate domain knowledge and continuous feedback from domain experts into the risk assessment process. As part of this work, we aim to bridge the gap between model-driven risk assessment of change requests and the assessment of domain experts by building a continuous feedback loop into the risk assessment process. Here we present our work to build an end-to-end machine learning system along with the discussion of some of practical challenges we faced related to extreme skewness in class distribution, concept drift, estimation of the uncertainty associated with the model's prediction and the overall scalability of the system.

【7】 Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay 标题:基于多样性的轨迹和目标选择及其后见之明经验回放 链接:https://arxiv.org/abs/2108.07887

作者:Tianhong Dai,Hengyan Liu,Kai Arulkumaran,Guangyu Ren,Anil Anthony Bharath 机构:Imperial College London, London, UK, Araya Inc., Tokyo, Japan 摘要:事后经验重播(HER)是一种目标重新标记技术,通常与非策略深度强化学习算法一起用于解决面向目标的任务;它非常适合只提供稀疏奖励的机器人操作任务。在HER中,轨迹和转移都是均匀采样进行训练的。然而,并不是所有代理的经验对训练都有同样的贡献,因此朴素的均匀采样可能会导致学习效率低下。在本文中,我们提出了基于多样性的轨迹和目标选择(DTGSH)。首先,根据行列式点过程(DPP)建模的目标状态多样性对轨迹进行采样。其次,使用k-DPP从轨迹中选择目标状态多样的转移。我们在模拟机器人环境中的五个具有挑战性的机器人操作任务上评估了DTGSH,结果表明,与其他最先进的方法相比,我们的方法在所有任务上都能更快地学习并达到更高的性能。 摘要:Hindsight experience replay (HER) is a goal relabelling technique typically used with off-policy deep reinforcement learning algorithms to solve goal-oriented tasks; it is well suited to robotic manipulation tasks that deliver only sparse rewards. In HER, both trajectories and transitions are sampled uniformly for training. However, not all of the agent's experiences contribute equally to training, and so naive uniform sampling may lead to inefficient learning. In this paper, we propose diversity-based trajectory and goal selection with HER (DTGSH). Firstly, trajectories are sampled according to the diversity of the goal states as modelled by determinantal point processes (DPPs). Secondly, transitions with diverse goal states are selected from the trajectories by using k-DPPs. We evaluate DTGSH on five challenging robotic manipulation tasks in simulated robot environments, where we show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.

【8】 Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory 标题:不妥协的边缘人工智能:电阻性随机存取存储器中的高效、通用和精确的神经计算 链接:https://arxiv.org/abs/2108.07879

作者:Weier Wan,Rajkumar Kubendran,Clemens Schaefer,S. Burc Eryilmaz,Wenqiang Zhang,Dabin Wu,Stephen Deiss,Priyanka Raina,He Qian,Bin Gao,Siddharth Joshi,Huaqiang Wu,H. -S. Philip Wong,Gert Cauwenberghs 机构: Stanford University, CA, USA; , University of California San Diego, CA, USA; , Tsinghua University, Beijing, China; , University of Notre Dame, IN, USA; , University of Pittsburgh, PA, USA 备注:34 pages, 14 figures, 1 table 摘要:直接在分布于互联网边缘的设备上实现当今的云级人工智能功能,需要能够以前所未有的能效处理多种感官数据(如视频、音频)的边缘硬件。今天的人工智能硬件体系结构无法满足需求,因为存在一道基本的"内存墙":单独的计算和内存单元之间的数据移动会消耗大量的能量,并且会产生较长的延迟。基于电阻随机存取存储器(RRAM)的内存中计算(CIM)体系结构通过直接在内存中执行计算,有望带来几个数量级的能效改进。然而,CIM硬件设计的传统方法限制了其处理各种AI工作负载所需的功能灵活性,并且必须克服降低推理精度的硬件缺陷。这种效率、多功能性和准确性之间的权衡不能通过对任何单一设计层次的单独改进来解决。通过在从算法和架构到电路和设备的所有设计层次上进行协同优化,我们展示了NeuRRAM——第一款使用RRAM CIM的多模态边缘AI芯片,可同时为各种模型架构提供高度的通用性,在多种计算位精度下能效比现有技术高5到8倍,创下纪录,推理精度可与在所有测量的标准AI基准上具有4位权重的软件模型相媲美,包括MNIST上99.0%的精度和CIFAR-10图像分类上85.7%的精度,谷歌语音命令识别的准确率为84.7%,贝叶斯图像恢复任务的图像重建误差降低了70%。这项工作为构建高效、可重构的边缘人工智能硬件平台铺平了道路,以满足未来更高要求和更异构的人工智能应用。 摘要:Realizing today's cloud-level artificial intelligence functionalities directly on devices distributed at the edge of the internet calls for edge hardware capable of processing multiple modalities of sensory data (e.g. video, audio) at unprecedented energy-efficiency. AI hardware architectures today cannot meet the demand due to a fundamental "memory wall": data movement between separate compute and memory units consumes large energy and incurs long latency. Resistive random-access memory (RRAM) based compute-in-memory (CIM) architectures promise to bring orders of magnitude energy-efficiency improvement by performing computation directly within memory. However, conventional approaches to CIM hardware design limit its functional flexibility necessary for processing diverse AI workloads, and must overcome hardware imperfections that degrade inference accuracy. Such trade-offs between efficiency, versatility and accuracy cannot be addressed by isolated improvements on any single level of the design. By co-optimizing across all hierarchies of the design from algorithms and architecture to circuits and devices, we present NeuRRAM - the first multimodal edge AI chip using RRAM CIM to simultaneously deliver a high degree of versatility for diverse model architectures, record energy-efficiency $5\times$-$8\times$ better than prior art across various computational bit-precisions, and inference accuracy comparable to software models with 4-bit weights on all measured standard AI benchmarks including accuracy of 99.0% on MNIST and 85.7% on CIFAR-10 image classification, 84.7% accuracy on Google speech command recognition, and a 70% reduction in image reconstruction error on a Bayesian image recovery task. This work paves a way towards building highly efficient and reconfigurable edge AI hardware platforms for the more demanding and heterogeneous AI applications of the future.

【9】 Compressing gradients by exploiting temporal correlation in momentum-SGD 标题:利用动量-SGD的时间相关性压缩梯度 链接:https://arxiv.org/abs/2108.07827

作者:Tharindu B. Adikari,Stark C. Draper 机构: University of Toronto 备注:None 摘要:分散优化中一个日益增长的瓶颈是通信。更大的模型和不断增长的数据集意味着计算的分散非常重要,交换的信息量正在迅速增长。虽然已经引入了压缩技术来处理后者,但没有一种技术考虑利用连续向量更新中存在的时间相关性。一个重要的例子是分布式动量SGD,其中通过应用动量的低通滤波效应增强了时间相关性。在本文中,我们设计并分析了在有误差反馈和无误差反馈的系统中利用时间相关性的压缩方法。使用ImageNet数据集进行的实验表明,我们提出的方法在计算复杂度几乎不增加的情况下显著降低了通信速率。我们进一步分析了当采用误差反馈压缩时,SGD的收敛性。在文献中,收敛保证仅针对提供逐点误差边界的压缩器开发,即针对压缩器的每个输入。相比之下,许多重要的代码(例如率失真代码)仅在预期情况下提供错误界限,从而提供更一般的保证。本文通过建立最小梯度范数的界,证明了在期望误差假设下SGD的收敛性。 摘要:An increasing bottleneck in decentralized optimization is communication. Bigger models and growing datasets mean that decentralization of computation is important and that the amount of information exchanged is quickly growing. While compression techniques have been introduced to cope with the latter, none has considered leveraging the temporal correlations that exist in consecutive vector updates. An important example is distributed momentum-SGD where temporal correlation is enhanced by the low-pass-filtering effect of applying momentum. In this paper we design and analyze compression methods that exploit temporal correlation in systems both with and without error-feedback. Experiments with the ImageNet dataset demonstrate that our proposed methods offer significant reduction in the rate of communication at only a negligible increase in computation complexity. We further analyze the convergence of SGD when compression is applied with error-feedback. In the literature, convergence guarantees are developed only for compressors that provide error-bounds point-wise, i.e., for each input to the compressor. In contrast, many important codes (e.g. rate-distortion codes) provide error-bounds only in expectation and thus provide a more general guarantee. In this paper we prove the convergence of SGD under an expected error assumption by establishing a bound for the minimum gradient norm.
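下面的numpy草图演示带误差反馈的压缩动量-SGD训练循环,以top-k稀疏化充当示例压缩器;论文的贡献在于设计利用连续更新间时间相关性的压缩码,这里仅搭出其所处的误差反馈框架,目标函数、步长等均为演示性取值。

```python
import numpy as np

def topk(v, k):
    # Keep the k largest-magnitude entries (a standard sparsifying compressor)
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

rng = np.random.default_rng(0)
dim, k, lr, beta = 100, 10, 1e-3, 0.9
x = rng.normal(size=dim)                    # parameters; objective is ||x||^2
x0_norm = np.linalg.norm(x)
m = np.zeros(dim)                           # momentum buffer (low-pass filter,
e = np.zeros(dim)                           #   hence temporally correlated)
err = np.zeros(dim)
for _ in range(1000):
    g = 2 * x + 0.01 * rng.normal(size=dim)  # noisy gradient
    m = beta * m + g                         # momentum smooths the updates
    c = topk(m + err, k)                     # compress update plus residual
    err = (m + err) - c                      # feed compression error back
    x = x - lr * c
print(x0_norm, np.linalg.norm(x))            # final norm far below the initial
```

误差反馈把被压缩丢弃的部分留到后续步骤补发,这正是摘要中"期望误差界"收敛分析所覆盖的训练循环。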

【10】 Learned holographic light transport 标题:习得的全息光传输 链接:https://arxiv.org/abs/2108.08253

作者:Koray Kavaklı,Hakan Urey,Kaan Akşit 机构:Department of Electrical and Electronics Engineering, Koç University, Istanbul, Turkey, Department of Computer Science, University College London, London, UK 摘要:计算机生成全息(CGH)算法通常无法将模拟结果与物理全息显示结果进行匹配。我们的工作通过学习全息显示中的全息光传输来解决这种不匹配。使用相机和全息显示器,我们捕获优化全息图的图像重建,这些图像重建依赖于理想模拟来生成数据集。受理想模拟的启发,我们学习了复值卷积核,它可以将给定的全息图传播到数据集中捕获的照片。我们的方法可以显著提高全息显示的模拟精度和图像质量,同时为物理信息学习方法铺平道路。 摘要:Computer-Generated Holography (CGH) algorithms often fall short in matching simulations with results from a physical holographic display. Our work addresses this mismatch by learning the holographic light transport in holographic displays. Using a camera and a holographic display, we capture the image reconstructions of optimized holograms that rely on ideal simulations to generate a dataset. Inspired by the ideal simulations, we learn a complex-valued convolution kernel that can propagate given holograms to captured photographs in our dataset. Our method can dramatically improve simulation accuracy and image quality in holographic displays while paving the way for physically informed learning approaches.

【11】 Quantitative Uniform Stability of the Iterative Proportional Fitting Procedure 标题:迭代比例拟合法的定量一致稳定性 链接:https://arxiv.org/abs/2108.08129

作者:George Deligiannidis,Valentin De Bortoli,Arnaud Doucet 机构:Department of Statistics, University of Oxford, UK 备注:15 pages 摘要:我们建立了用于求解熵正则化最优运输问题的迭代比例拟合过程(也称为Sinkhorn算法)关于边缘分布的在时间上一致的稳定性。我们的结果是定量的,以1-Wasserstein度量表述。作为推论,我们建立了Schrödinger桥的定量稳定性结果。 摘要:We establish the uniform in time stability, w.r.t. the marginals, of the Iterative Proportional Fitting Procedure, also known as Sinkhorn algorithm, used to solve entropy-regularised Optimal Transport problems. Our result is quantitative and stated in terms of the 1-Wasserstein metric. As a corollary we establish a quantitative stability result for Schrödinger bridges.
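下面是IPFP/Sinkhorn的最小numpy实现,并用两组彼此接近的边缘分布直观展示"边缘接近则输出运输计划接近"的稳定性;这里用$L_1$距离代替论文中的1-Wasserstein度量,仅作数值示意,成本矩阵与正则化强度均为演示性取值。

```python
import numpy as np

def ipfp(C, mu, nu, eps=0.05, iters=500):
    """Iterative Proportional Fitting / Sinkhorn for entropy-regularised OT."""
    K = np.exp(-C / eps)
    u, v = np.ones_like(mu), np.ones_like(nu)
    for _ in range(iters):
        u = mu / (K @ v)           # fit the first marginal
        v = nu / (K.T @ u)         # fit the second marginal
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
n = 8
C = rng.random((n, n))
mu = np.full(n, 1 / n)
nu1 = rng.dirichlet(np.ones(n))
nu2 = np.abs(nu1 + 1e-3 * rng.normal(size=n))
nu2 /= nu2.sum()                   # a slightly perturbed second marginal
P1, P2 = ipfp(C, mu, nu1), ipfp(C, mu, nu2)
# Quantitative stability: close marginals give close Sinkhorn plans
print(np.abs(P1 - P2).sum(), np.abs(nu1 - nu2).sum())
```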

【12】 Towards Interpreting Zoonotic Potential of Betacoronavirus Sequences With Attention 标题:关注解释β冠状病毒序列的人畜共患病潜力 链接:https://arxiv.org/abs/2108.08077

作者:Kahini Wadhawan,Payel Das,Barbara A. Han,Ilya R. Fischhoff,Adrian C. Castellanos,Arvind Varsani,Kush R. Varshney 机构:IBM Research, New Delhi, India, IBM Research, Yorktown Heights, NY, USA, Cary Institute of Ecosystem Studies, NY, USA, The Biodesign Institute, Arizona State University, USA 备注:11 pages, 8 figures, 1 table, accepted at ICLR 2021 workshop Machine learning for preventing and combating pandemics 摘要:目前发现病毒的方法以进化上保守的蛋白质为目标,这些蛋白质能够准确地识别病毒科,但仍然无法区分新发现病毒的潜在人畜共患能力。在这里,我们将注意力增强的长短期记忆(LSTM)深度神经网络分类器应用于高度保守的病毒蛋白靶点,以预测β冠状病毒(betacoronavirus)的人畜共患潜力。分类器的准确率为94%。对序列和结构层面特征上注意力的分析和可视化表明,在人畜共患β冠状病毒中控制病毒复制的重要蛋白质-蛋白质相互作用可能与人畜共患传播存在关联。 摘要:Current methods for viral discovery target evolutionarily conserved proteins that accurately identify virus families but remain unable to distinguish the zoonotic potential of newly discovered viruses. Here, we apply an attention-enhanced long-short-term memory (LSTM) deep neural net classifier to a highly conserved viral protein target to predict zoonotic potential across betacoronaviruses. The classifier performs with a 94% accuracy. Analysis and visualization of attention at the sequence and structure-level features indicate possible association between important protein-protein interactions governing viral replication in zoonotic betacoronaviruses and zoonotic transmission.

【13】 Combining K-means type algorithms with Hill Climbing for Joint Stratification and Sample Allocation Designs
Link: https://arxiv.org/abs/2108.08038

Authors: Mervyn O'Luing, Steven Prestwich, S. Armagan Tarim
Comments: 39 pages, 20 tables, 8 figures
Abstract: In this paper we combine k-means and/or k-means type algorithms with a hill-climbing algorithm, in stages, to solve the joint stratification and sample allocation problem. This is a combinatorial optimisation problem in which we search for the optimal stratification among all possible stratifications of the basic strata; each stratification is a candidate solution whose quality is measured by its cost. The problem is intractable for larger sets, and evaluating the cost of each solution is expensive. A number of heuristic algorithms have been developed to find acceptable solutions in reasonable computation times; however, the heuristics of these algorithms need to be trained in order to optimise performance on each instance. We compare the above multi-stage combination of algorithms with three recent algorithms and report the solution costs, evaluation times, and training times. The multi-stage combinations generally compare well with the recent algorithms for both atomic and continuous strata, and give the survey designer a greater choice of algorithms.
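A sketch of the two-stage pattern, assuming a toy within-stratum variance objective as a stand-in for the survey cost function, which is not reproduced from the paper:

```python
import numpy as np
from sklearn.cluster import KMeans

def variance_cost(labels, X, k):
    """Toy stand-in for the survey cost: total within-stratum variance."""
    return sum(X[labels == s].var() if (labels == s).any() else 0.0
               for s in range(k))

def hill_climb(labels, X, cost_fn, k, iters=2000, seed=0):
    """Local search: reassign one unit at a time, keeping only the moves
    that reduce the (expensive) cost function."""
    rng = np.random.default_rng(seed)
    best = cost_fn(labels, X, k)
    for _ in range(iters):
        cand = labels.copy()
        cand[rng.integers(len(labels))] = rng.integers(k)  # move one unit
        c = cost_fn(cand, X, k)
        if c < best:
            labels, best = cand, c
    return labels, best

# Stage 1: k-means provides a cheap initial stratification of the units;
# Stage 2: hill climbing refines it against the costly objective.
X = np.random.default_rng(1).random((500, 3))   # illustrative auxiliary data
init = KMeans(n_clusters=5, n_init=10).fit_predict(X)
labels, cost = hill_climb(init, X, variance_cost, k=5)
```

The point of staging is that k-means is cheap and gets close to a good stratification, so the expensive cost evaluations are spent only on local refinement.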

【14】 Semantic Perturbations with Normalizing Flows for Improved Generalization
Link: https://arxiv.org/abs/2108.07958

Authors: Oguz Kaan Yuksel, Sebastian U. Stich, Martin Jaggi, Tatjana Chavdarova
Affiliations: Machine Learning and Optimization Lab, EPFL; Department of Electrical Engineering and Computer Sciences, UC Berkeley
Comments: In Proceedings of the IEEE International Conference on Computer Vision
Abstract: Data augmentation is a widely adopted technique for avoiding overfitting when training deep neural networks. However, this approach requires domain-specific knowledge and is often limited to a fixed set of hard-coded transformations. Recently, several works proposed using generative models to produce semantically meaningful perturbations for training a classifier. Because accurate encoding and decoding are critical, however, these methods, built on architectures that only approximate latent-variable inference, have remained limited to pilot studies on small datasets. Exploiting the exactly invertible encoder-decoder structure of normalizing flows, we perform on-manifold perturbations in the latent space to define fully unsupervised data augmentations. We demonstrate that such perturbations match the performance of advanced data augmentation techniques, reaching 96.6% test accuracy on CIFAR-10 with a ResNet-18, and outperform existing methods, particularly in low-data regimes, yielding a 10-25% relative improvement in test accuracy over classical training. We find that latent adversarial perturbations that adapt to the classifier throughout its training are the most effective, yielding the first test-accuracy improvements via latent-space perturbations on real-world datasets (CIFAR-10/100).
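A hedged sketch of the augmentation pattern. The `ToyFlow` below is a placeholder for a trained flow (a real pipeline would use an expressive invertible model such as RealNVP or Glow), and the `clf` classifier, perturbation scales, and forward/inverse API are assumptions for illustration:

```python
import torch
import torch.nn as nn

class ToyFlow(nn.Module):
    """Stand-in for a trained normalizing flow: an invertible elementwise
    affine map, so encoding and decoding are exact (no reconstruction error)."""

    def __init__(self, dim):
        super().__init__()
        self.log_scale = nn.Parameter(torch.zeros(dim))
        self.shift = nn.Parameter(torch.zeros(dim))

    def forward(self, x):             # encode: data -> latent, exactly
        return (x - self.shift) * torch.exp(-self.log_scale)

    def inverse(self, z):             # decode: latent -> data, exactly
        return z * torch.exp(self.log_scale) + self.shift

def latent_augment(x, flow, sigma=0.1):
    """Unsupervised on-manifold augmentation: perturb the exact latent code."""
    z = flow.forward(x)
    return flow.inverse(z + sigma * torch.randn_like(z))

def latent_adversarial(x, y, flow, clf, eps=0.1):
    """Classifier-adaptive variant: one gradient-ascent step on the latent code."""
    z = flow.forward(x).detach().requires_grad_(True)
    nn.functional.cross_entropy(clf(flow.inverse(z)), y).backward()
    return flow.inverse(z + eps * z.grad.sign()).detach()
```

Exact invertibility is the design point: because nothing is lost in the encode-decode round trip, every perturbed latent decodes to a valid on-manifold sample, which is what lets this approach scale beyond the approximate-inference pilot studies the abstract mentions.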
