cs.RO (Robotics): 13 papers in total
【1】 DULA: A Differentiable Ergonomics Model for Postural Optimization in Physical HRI
Authors: Amir Yazdani, Roya Sabbagh Novin, Andrew Merryweather, Tucker Hermans
Affiliations: University of Utah Robotics Center, Salt Lake City, UT; NVIDIA, Seattle, WA
Link: https://arxiv.org/abs/2107.06875
Abstract: Ergonomics and human comfort are essential concerns in physical human-robot interaction applications. Defining an accurate and easy-to-use ergonomic assessment model stands as an important step in providing feedback for postural correction to improve operator health and comfort. In order to enable efficient computation, previously proposed automated ergonomic assessment and correction tools make approximations or simplifications to gold-standard assessment tools used by ergonomists in practice. In order to retain assessment quality while improving computational efficiency, we introduce DULA, a differentiable and continuous ergonomics model learned to replicate the popular and scientifically validated RULA assessment. We show that DULA provides assessment comparable to RULA while providing computational benefits. We highlight DULA's strength in a demonstration of gradient-based postural optimization for a simulated teleoperation task.
【2】 A novel approach for modelling and classifying sit-to-stand kinematics using inertial sensors
Authors: Maitreyee Wairagkar, Emma Villeneuve, Rachel King, Balazs Janko, Malcolm Burnett, Ann Ashburn, Veena Agarwal, R. Simon Sherratt, William Holderbaum, William Harwin
Affiliations: Department of Mechanical Engineering, Imperial College London, London, UK; Care Research and Technology Centre, UK Dementia Research Institute, London, UK; Department of Biomedical Engineering, University of Reading, Reading, UK; CEA Grenoble, Paris, France
Comments: 25 pages, 11 figures
Link: https://arxiv.org/abs/2107.06859
Abstract: Sit-to-stand transitions are an important part of activities of daily living and play a key role in functional mobility in humans. The sit-to-stand movement is often affected in older adults due to frailty and in patients with motor impairments such as Parkinson's disease, leading to falls. Studying the kinematics of sit-to-stand transitions can provide insight for assessment, monitoring and developing rehabilitation strategies for the affected populations. We propose a three-segment body model for estimating sit-to-stand kinematics using only two wearable inertial sensors, placed on the shank and back. Reducing the number of sensors to two instead of one per body segment facilitates monitoring and classifying movements over extended periods, making the system more comfortable to wear while reducing the power requirements of the sensors. We applied this model to 10 younger healthy adults (YH), 12 older healthy adults (OH) and 12 people with Parkinson's disease (PwP). We achieved this by incorporating a unique sit-to-stand classification technique using unsupervised learning into the model-based reconstruction of angular kinematics using an extended Kalman filter. Our proposed model showed that it was possible to successfully estimate thigh kinematics despite not measuring the thigh motion with an inertial sensor. We classified sit-to-stand transitions, sitting and standing states with accuracies of 98.67%, 94.20% and 91.41% for YH, OH and PwP respectively. We have proposed a novel integrated approach of modelling and classification for estimating the body kinematics during sit-to-stand motion and successfully applied it to the YH, OH and PwP groups.
【3】 FAST-LIO2: Fast Direct LiDAR-inertial Odometry
Authors: Wei Xu, Yixi Cai, Dongjiao He, Jiarong Lin, Fu Zhang
Link: https://arxiv.org/abs/2107.06829
Abstract: This paper presents FAST-LIO2: a fast, robust, and versatile LiDAR-inertial odometry framework. Building on a highly efficient tightly-coupled iterated Kalman filter, FAST-LIO2 has two key novelties that allow fast, robust, and accurate LiDAR navigation (and mapping). The first one is directly registering raw points to the map (and subsequently updating the map, i.e., mapping) without extracting features. This enables the exploitation of subtle features in the environment and hence increases the accuracy. The elimination of a hand-engineered feature extraction module also makes it naturally adaptable to emerging LiDARs of different scanning patterns. The second main novelty is maintaining a map by an incremental k-d tree data structure, ikd-Tree, that enables incremental updates (i.e., point insertion and deletion) and dynamic re-balancing. Compared with existing dynamic data structures (octree, R*-tree, nanoflann k-d tree), ikd-Tree achieves superior overall performance while naturally supporting downsampling on the tree. We conduct an exhaustive benchmark comparison in 19 sequences from a variety of open LiDAR datasets. FAST-LIO2 achieves consistently higher accuracy at a much lower computation load than other state-of-the-art LiDAR-inertial navigation systems. Various real-world experiments on solid-state LiDARs with small FoV are also conducted. Overall, FAST-LIO2 is computationally efficient (e.g., up to 100 Hz odometry and mapping in large outdoor environments), robust (e.g., reliable pose estimation in cluttered indoor environments with rotation up to 1000 deg/s), and versatile (i.e., applicable to both multi-line spinning and solid-state LiDARs, UAV and handheld platforms, and Intel and ARM-based processors), while still achieving higher accuracy than existing methods. Our implementation of the system FAST-LIO2 and the data structure ikd-Tree are both open-sourced on GitHub.
【4】 Person-MinkUNet: 3D Person Detection with LiDAR Point Cloud
Authors: Dan Jia, Bastian Leibe
Affiliations: Visual Computing Institute, RWTH Aachen
Comments: Accepted as an extended abstract at the JRDB-ACT Workshop at CVPR21
Link: https://arxiv.org/abs/2107.06780
Abstract: In this preliminary work we attempt to apply submanifold sparse convolution to the task of 3D person detection. In particular, we present Person-MinkUNet, a single-stage 3D person detection network based on Minkowski Engine with U-Net architecture. The network achieves a 76.4% average precision (AP) on the JRDB 3D detection benchmark.
【5】 Dynamic Event Camera Calibration
Authors: Kun Huang, Yifu Wang, Laurent Kneip
Affiliations: ShanghaiTech University
Comments: Accepted at the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Link: https://arxiv.org/abs/2107.06749
Abstract: Camera calibration is an important prerequisite towards the solution of 3D computer vision problems. Traditional methods rely on static images of a calibration pattern. This raises interesting challenges towards the practical usage of event cameras, which notably require image change to produce sufficient measurements. The current standard for event camera calibration therefore consists of using flashing patterns. They have the advantage of simultaneously triggering events in all reprojected pattern feature locations, but it is difficult to construct or use such patterns in the field. We present the first dynamic event camera calibration algorithm. It calibrates directly from events captured during relative motion between camera and calibration pattern. The method is propelled by a novel feature extraction mechanism for calibration patterns, and leverages existing calibration tools before optimizing all parameters through a multi-segment continuous-time formulation. As demonstrated through our results on real data, the obtained calibration method is highly convenient and reliably calibrates from data sequences spanning less than 10 seconds.
【6】 Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks
Authors: Ingmar Schubert, Ozgur S. Oguz, Marc Toussaint
Affiliations: Learning and Intelligent Systems Group, TU Berlin, Germany; Max Planck Institute for Intelligent Systems, Stuttgart, Germany; Machine Learning and Robotics Lab, University of Stuttgart, Germany
Link: https://arxiv.org/abs/2107.06661
Abstract: In high-dimensional state spaces, the usefulness of Reinforcement Learning (RL) is limited by the problem of exploration. This issue has previously been addressed using potential-based reward shaping (PB-RS). In the present work, we introduce Final-Volume-Preserving Reward Shaping (FV-RS). FV-RS relaxes the strict optimality guarantees of PB-RS to a guarantee of preserved long-term behavior. Being less restrictive, FV-RS allows for reward shaping functions that are even better suited for improving the sample efficiency of RL algorithms. In particular, we consider settings in which the agent has access to an approximate plan. Here, we use examples of simulated robotic manipulation tasks to demonstrate that plan-based FV-RS can indeed significantly improve the sample efficiency of RL over plan-based PB-RS.
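For readers unfamiliar with the PB-RS baseline named in this abstract, the following is a minimal Python sketch of the standard potential-based shaping term, F(s, s') = gamma * Phi(s') - Phi(s). The 1-D state, the distance-based potential, the goal value, and all function names are illustrative assumptions and are not taken from the paper; FV-RS itself is not shown here.

```python
def potential(state, goal=10.0):
    """Hypothetical potential: larger (less negative) closer to the goal."""
    return -abs(goal - state)


def shaped_reward(reward, state, next_state, gamma=0.99):
    """Environment reward plus the potential-based shaping term."""
    return reward + gamma * potential(next_state) - potential(state)


def discounted_shaping_sum(states, gamma=0.99):
    """Discounted sum of shaping terms along a trajectory.

    The sum telescopes to gamma**T * potential(s_T) - potential(s_0),
    so it depends only on the first and last states of the trajectory.
    """
    total = 0.0
    for t in range(len(states) - 1):
        term = gamma * potential(states[t + 1]) - potential(states[t])
        total += gamma**t * term
    return total
```

The telescoping property is what keeps the added reward from changing which policies are optimal in the PB-RS setting; the abstract describes FV-RS as relaxing that strict guarantee.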
【7】 Model-free Reinforcement Learning for Robust Locomotion Using Trajectory Optimization for Exploration
Authors: Miroslav Bogdanovic, Majid Khadiv, Ludovic Righetti
Affiliations: New York University
Link: https://arxiv.org/abs/2107.06629
Abstract: In this work we present a general, two-stage reinforcement learning approach for going from a single demonstration trajectory to a robust policy that can be deployed on hardware without any additional training. The demonstration is used in the first stage as a starting point to facilitate initial exploration. In the second stage, the relevant task reward is optimized directly and a policy robust to environment uncertainties is computed. We demonstrate and examine in detail the performance and robustness of our approach on highly dynamic hopping and bounding tasks on a real quadruped robot.
【8】 AutoMCM: Maneuver Coordination Service with Abstracted Functions for Autonomous Driving
Authors: Masaya Mizutani, Manabu Tsukada, Hiroshi Esaki
Affiliations: The University of Tokyo
Comments: Accepted to 24th IEEE International Conference on Intelligent Transportation (ITSC) 2021
Link: https://arxiv.org/abs/2107.06627
Abstract: A cooperative intelligent transport system (C-ITS) uses vehicle-to-everything (V2X) technology to make self-driving vehicles safer and more efficient. Current C-ITS applications have mainly focused on real-time information sharing, such as for cooperative perception. In addition to better real-time perception, self-driving vehicles need to achieve higher safety and efficiency by coordinating action plans. This study designs a maneuver coordination (MC) protocol that uses seven messages to cover various scenarios and an abstracted MC support service. We implement our proposal as AutoMCM by extending two open-source software tools: Autoware for autonomous driving and OpenC2X for C-ITS. The results show that our system effectively reduces the communication bandwidth by limiting message exchange in an event-driven manner. Furthermore, the results show that, using our scheme, the vehicles run 15% faster when the vehicle speed is 30 km/h and 28% faster when the vehicle speed is 50 km/h. Our system shows robustness against packet loss in experiments when the message timeout parameters are appropriately set.
【9】 Probabilistic Human Motion Prediction via A Bayesian Neural Network
Authors: Jie Xu, Xingyu Chen, Xuguang Lan, Nanning Zheng
Affiliations: Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University
Comments: Accepted at ICRA 2021
Link: https://arxiv.org/abs/2107.06564
Abstract: Human motion prediction is an important and challenging topic that has promising prospects in efficient and safe human-robot-interaction systems. Currently, the majority of the human motion prediction algorithms are based on deterministic models, which may lead to risky decisions for robots. To solve this problem, we propose a probabilistic model for human motion prediction in this paper. The key idea of our approach is to extend the conventional deterministic motion prediction neural network to a Bayesian one. On the one hand, our model can generate several future motions when given an observed motion sequence. On the other hand, by calculating the Epistemic Uncertainty and the Heteroscedastic Aleatoric Uncertainty, our model can tell the robot whether the observation has been seen before and also give the optimal result among all possible predictions. We extensively validate our approach on the large-scale benchmark dataset Human3.6m. The experiments show that our approach performs better than deterministic methods. We further evaluate our approach in a Human-Robot-Interaction (HRI) scenario. The experimental results show that our approach makes the interaction more efficient and safer.
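The abstract does not say how the two uncertainties are computed. As a rough illustration only, the sketch below uses a decomposition that is common for Bayesian neural networks: given several stochastic forward passes that each return a predicted mean and a predicted variance, the epistemic part is taken as the spread of the means and the aleatoric part as the average predicted variance. The function name and the input layout (one list per forward pass, one value per output dimension) are assumptions and may differ from the paper's formulation.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(means, variances):
    """Split predictive uncertainty per output dimension.

    means, variances: lists of equal shape, one inner list per
    stochastic forward pass, one value per output dimension.
    Returns (epistemic, aleatoric) as lists over output dimensions.
    """
    # Epistemic: how much the predicted means disagree across passes.
    epistemic = [pvariance(dim) for dim in zip(*means)]
    # Aleatoric: the noise level the model itself predicts, averaged.
    aleatoric = [fmean(dim) for dim in zip(*variances)]
    return epistemic, aleatoric
```

Under this reading, a high epistemic value on a new input suggests the observation is unlike the training data, which matches the abstract's claim that the model can tell the robot whether an observation has been seen before.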
【10】 Robust and Recursively Feasible Real-Time Trajectory Planning in Unknown Environments
Authors: Inkyu Jang, Dongjae Lee, Seungjae Lee, H. Jin Kim
Comments: 8 pages, 11 figures; accepted at the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Link: https://arxiv.org/abs/2107.06484
Abstract: Motion planners for mobile robots in unknown environments face the challenge of simultaneously maintaining both robustness against unmodeled uncertainties and persistent feasibility of the trajectory-finding problem. That is, while dealing with uncertainties, a motion planner must update its trajectory, adapting to the newly revealed environment in real time; failing to do so may lead to unsafe circumstances. Many existing planning algorithms guarantee these by maintaining the clearance needed to perform an emergency brake, which is itself a robust and persistently feasible maneuver. However, such maneuvers are not applicable to systems in which braking is impossible or risky, such as fixed-wing aircraft. To that end, we propose a real-time robust planner that recursively guarantees persistent feasibility without any need for braking. The planner ensures robustness against bounded uncertainties and persistent feasibility by constructing a loop of sequentially composed funnels, starting from the receding horizon local trajectory's forward reachable set. We implement the proposed algorithm for a robotic car tracking a speed-fixed reference trajectory. The experimental results show that the proposed algorithm can run at more than 16 Hz while successfully keeping the system from entering any dead end, maintaining safety and feasibility.
【11】 Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks
Authors: Sungryull Sohn, Sungtae Lee, Jongwook Choi, Harm van Seijen, Mehdi Fatemi, Honglak Lee
Affiliations: University of Michigan; LG AI Research; Yonsei University; Microsoft Research
Comments: In proceedings of ICML 2021
Link: https://arxiv.org/abs/2107.06405
Abstract: We propose the k-Shortest-Path (k-SP) constraint: a novel constraint on the agent's trajectory that improves the sample efficiency in sparse-reward MDPs. We show that any optimal policy necessarily satisfies the k-SP constraint. Notably, the k-SP constraint prevents the policy from exploring state-action pairs along the non-k-SP trajectories (e.g., going back and forth). However, in practice, excluding state-action pairs may hinder the convergence of RL algorithms. To overcome this, we propose a novel cost function that penalizes a policy that violates the SP constraint, instead of completely excluding it. Our numerical experiment in a tabular RL setting demonstrates that the SP constraint can significantly reduce the trajectory space of the policy. As a result, our constraint enables more sample-efficient learning by suppressing redundant exploration and exploitation. Our experiments on MiniGrid, DeepMind Lab, Atari, and Fetch show that the proposed method significantly improves proximal policy optimization (PPO) and outperforms existing novelty-seeking exploration methods, including count-based exploration, even in continuous control tasks, indicating that it improves the sample efficiency by preventing the agent from taking redundant actions.
【12】 Semantically-Aware Strategies for Stereo-Visual Robotic Obstacle Avoidance
Authors: Jungseok Hong, Karin de Langis, Cole Wyeth, Christopher Walaszek, Junaed Sattar
Link: https://arxiv.org/abs/2107.06401
Abstract: Mobile robots in unstructured, mapless environments must rely on an obstacle avoidance module to navigate safely. The standard avoidance techniques estimate the locations of obstacles with respect to the robot but are unaware of the obstacles' identities. Consequently, the robot cannot take advantage of semantic information about obstacles when making decisions about how to navigate. We propose an obstacle avoidance module that combines visual instance segmentation with a depth map to classify and localize objects in the scene. The system avoids obstacles differentially, based on the identity of the objects: for example, the system is more cautious in response to unpredictable objects such as humans. The system can also navigate closer to harmless obstacles and ignore obstacles that pose no collision danger, enabling it to navigate more efficiently. We validate our approach in two simulated environments: one terrestrial and one underwater. Results indicate that our approach is feasible and can enable more efficient navigation strategies.
【13】 Distributionally Robust Policy Learning via Adversarial Environment Generation
Authors: Allen Z. Ren, Anirudha Majumdar
Affiliations: Department of Mechanical and Aerospace Engineering, Princeton University
Link: https://arxiv.org/abs/2107.06353
Abstract: Our goal is to train control policies that generalize well to unseen environments. Inspired by the Distributionally Robust Optimization (DRO) framework, we propose DRAGEN - Distributionally Robust policy learning via Adversarial Generation of ENvironments - for iteratively improving robustness of policies to realistic distribution shifts by generating adversarial environments. The key idea is to learn a generative model for environments whose latent variables capture cost-predictive and realistic variations in environments. We perform DRO with respect to a Wasserstein ball around the empirical distribution of environments by generating realistic adversarial environments via gradient ascent on the latent space. We demonstrate strong Out-of-Distribution (OoD) generalization in simulation for (i) swinging up a pendulum with onboard vision and (ii) grasping realistic 2D/3D objects. Grasping experiments on hardware demonstrate better sim2real performance compared to domain randomization.
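To illustrate what "gradient ascent on the latent space" can look like in the simplest case, here is a toy Python sketch. Everything in it is an assumption made for illustration: the 2-D latent vector, the analytic quadratic cost standing in for a learned cost predictor, the fixed step size, and the function names. It does not model the Wasserstein-ball constraint or the generative model described in the abstract.

```python
def cost(z):
    """Hypothetical policy cost for the environment decoded from latent z.

    Concave, with its maximum (the hardest environment) at z = (2, -1).
    """
    return -((z[0] - 2.0) ** 2) - (z[1] + 1.0) ** 2


def cost_grad(z):
    """Analytic gradient of the toy cost with respect to z."""
    return [-2.0 * (z[0] - 2.0), -2.0 * (z[1] + 1.0)]


def adversarial_latent(z0, step=0.1, n_steps=50):
    """Move a latent code uphill on the cost to get a harder environment."""
    z = list(z0)
    for _ in range(n_steps):
        grad = cost_grad(z)
        z = [zi + step * gi for zi, gi in zip(z, grad)]
    return z
```

Starting from a latent code of a training environment, the returned code has a higher predicted cost; in a full system it would be decoded into a new environment and added to the training set.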