cs.RO (Robotics): 36 papers in total
【1】 Discovering and Achieving Goals via World Models
Link: https://arxiv.org/abs/2110.09514
Authors: Russell Mendonca, Oleh Rybkin, Kostas Daniilidis, Danijar Hafner, Deepak Pathak
Affiliations: Carnegie Mellon University, University of Pennsylvania, University of Toronto
Comments: NeurIPS 2021. First two authors contributed equally. Website at this https URL
Abstract: How can artificial agents learn to solve many diverse tasks in complex visual environments in the absence of any supervision? We decompose this question into two problems: discovering new goals and learning to reliably achieve them. We introduce Latent Explorer Achiever (LEXA), a unified solution to these that learns a world model from image inputs and uses it to train an explorer and an achiever policy from imagined rollouts. Unlike prior methods that explore by reaching previously visited states, the explorer plans to discover unseen surprising states through foresight, which are then used as diverse targets for the achiever to practice. After the unsupervised phase, LEXA solves tasks specified as goal images zero-shot without any additional learning. LEXA substantially outperforms previous approaches to unsupervised goal-reaching, both on prior benchmarks and on a new challenging benchmark with a total of 40 test tasks spanning across four standard robotic manipulation and locomotion domains. LEXA further achieves goals that require interacting with multiple objects in sequence. Finally, to demonstrate the scalability and generality of LEXA, we train a single general agent across four distinct environments. Code and videos at https://orybkin.github.io/lexa/
【2】 MTP: Multi-Hypothesis Tracking and Prediction for Reduced Error Propagation
Link: https://arxiv.org/abs/2110.09481
Authors: Xinshuo Weng, Boris Ivanovic, Marco Pavone
Affiliations: Stanford University
Comments: Project page: this https URL
Abstract: Recently, there has been tremendous progress in developing each individual module of the standard perception-planning robot autonomy pipeline, including detection, tracking, prediction of other agents' trajectories, and ego-agent trajectory planning. Nevertheless, there has been less attention given to the principled integration of these components, particularly in terms of the characterization and mitigation of cascading errors. This paper addresses the problem of cascading errors by focusing on the coupling between the tracking and prediction modules. First, by using state-of-the-art tracking and prediction tools, we conduct a comprehensive experimental evaluation of how severely errors stemming from tracking can impact prediction performance. On the KITTI and nuScenes datasets, we find that predictions consuming tracked trajectories as inputs (the typical case in practice) can experience a significant (even order of magnitude) drop in performance in comparison to the idealized setting where ground truth past trajectories are used as inputs. To address this issue, we propose a multi-hypothesis tracking and prediction framework. Rather than relying on a single set of tracking results for prediction, our framework simultaneously reasons about multiple sets of tracking results, thereby increasing the likelihood of including accurate tracking results as inputs to prediction. We show that this framework improves overall prediction performance over the standard single-hypothesis tracking-prediction pipeline by up to 34.2% on the nuScenes dataset, with even more significant improvements (up to ~70%) when restricting the evaluation to challenging scenarios involving identity switches and fragments -- all with an acceptable computation overhead.
【3】 Trajectory Optimization for Thermally-Actuated Soft Planar Robot Limbs
Link: https://arxiv.org/abs/2110.09474
Authors: Anthony Wertz, Andrew P. Sabelhaus, Carmel Majidi
Affiliations: Carnegie Mellon University
Comments: 8 pages, 6 figures, submitted to IEEE RA-L & RoboSoft conference
Abstract: Practical use of robotic manipulators made from soft materials will require planning for complex motions. We present the first approach for generating trajectories of a thermally-actuated soft robotic manipulator. Based on simplified approximations of the soft arm and its shape-memory-alloy (SMA) wires, we justify a dynamics model of a discretized rigid manipulator with joint torques proportional to wire temperature. Then, we propose a method to calibrate this model from hardware data, and demonstrate that the simulation aligns well with a test trajectory. Finally, we use direct collocation trajectory optimization with the non-linear dynamics to derive open-loop controls for feasible trajectories that closely align with desired reference inputs. Two example trajectories are verified in hardware. The results show promise for both open-loop planning as well as for future applications with feedback.
【4】 FAR Planner: Fast, Attemptable Route Planner using Dynamic Visibility Update
Link: https://arxiv.org/abs/2110.09460
Authors: Fan Yang, Chao Cao, Hongbiao Zhu, Jean Oh, Ji Zhang
Affiliations: CMU Robotics Institute (all authors)
Comments: This paper has been submitted to ICRA 2022 and is currently under review
Abstract: We present our work on a fast route planner based on a visibility graph. The method extracts edge points around obstacles in the environment to form polygons, with which it dynamically updates a global visibility graph, expanding the visibility graph along with the navigation and removing edges that become occluded by dynamic obstacles. When guiding a vehicle to the goal, the method can deal with both known and unknown environments. In the latter case, the method is attemptable in discovering a way to the goal by picking up the environment layout on the fly. We evaluate the method using both ground and aerial vehicles, in simulated and real-world settings. In highly convoluted unknown or partially known environments, our method is able to reduce travel time by 13-27% compared to RRT*, RRT-Connect, A*, and D* Lite, and finds a path within 3ms in all of our experiments.
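Illustration (not from the paper): the abstract builds on a visibility graph over obstacle polygons. The sketch below shows only the textbook static version of that idea, assuming convex polygonal obstacles given as vertex lists, and searches the graph with Dijkstra's algorithm. The function names are hypothetical, the dynamic update and edge removal described in the abstract are not modeled, and degenerate cases (a sight line grazing a vertex or running exactly through two vertices of one obstacle) are not handled.

```python
import heapq
import math
from itertools import combinations


def _orient(a, b, c):
    """Signed area test: >0 if c is left of a->b, <0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def _properly_intersect(p, q, a, b):
    """True if segments pq and ab cross at a point interior to both."""
    d1, d2 = _orient(p, q, a), _orient(p, q, b)
    d3, d4 = _orient(a, b, p), _orient(a, b, q)
    return d1 * d2 < 0 and d3 * d4 < 0


def shortest_path(start, goal, polygons):
    """Shortest path from start to goal around convex polygons.

    Returns (list_of_points, length), or None if the goal is unreachable.
    """
    nodes, owner, edges = [start, goal], [None, None], []
    for k, poly in enumerate(polygons):
        n = len(poly)
        for i, v in enumerate(poly):
            nodes.append(v)
            owner.append((k, i))
            edges.append((poly[i], poly[(i + 1) % n]))

    def visible(i, j):
        oi, oj = owner[i], owner[j]
        if oi is not None and oj is not None and oi[0] == oj[0]:
            # Same convex polygon: only adjacent vertices see each other.
            n = len(polygons[oi[0]])
            return (oi[1] - oj[1]) % n in (1, n - 1)
        return not any(
            _properly_intersect(nodes[i], nodes[j], a, b) for a, b in edges
        )

    # Build the visibility graph with Euclidean edge weights.
    graph = {i: [] for i in range(len(nodes))}
    for i, j in combinations(range(len(nodes)), 2):
        if visible(i, j):
            w = math.dist(nodes[i], nodes[j])
            graph[i].append((j, w))
            graph[j].append((i, w))

    # Dijkstra from node 0 (start) to node 1 (goal).
    dist, prev, heap = {0: 0.0}, {}, [(0.0, 0)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == 1:
            break
        if d > dist.get(u, math.inf):
            continue
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if 1 not in dist:
        return None
    path = [1]
    while path[-1] != 0:
        path.append(prev[path[-1]])
    return [nodes[i] for i in reversed(path)], dist[1]
```

For example, with a single square obstacle between start and goal, `shortest_path((0.0, 0.0), (4.0, 0.0), [[(1.0, -1.0), (3.0, -1.0), (3.0, 1.0), (1.0, 1.0)]])` routes around two corners of the square instead of going straight through it.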
【5】 A New Approach to Complex Dynamic Geofencing for Unmanned Aerial Vehicles
Link: https://arxiv.org/abs/2110.09453
Authors: Vihangi Vagal, Konstantinos Markantonakis, Carlton Shepherd
Affiliations: Risk Advisory, Deloitte LLP, London, United Kingdom; Information Security Group, Royal Holloway University of London, Egham, United Kingdom
Comments: Accepted to the 40th IEEE Digital Avionics Systems Conference
Abstract: The anticipated widespread use of unmanned aerial vehicles (UAVs) raises significant safety and security concerns, including trespassing in restricted areas, colliding with other UAVs, and disrupting high-traffic airspaces. To mitigate these risks, geofences have been proposed as one line of defence, which limit UAVs from flying into the perimeters of other UAVs and restricted locations. In this paper, we address the concern that existing geometric geofencing algorithms lack accuracy during the calculation of complex geofences, particularly in dynamic urban environments. We propose a new algorithm based on alpha shapes and Voronoi diagrams, which we integrate into an on-drone framework using an open-source mapping database from OpenStreetMap. To demonstrate its efficacy, we present performance results using Microsoft's AirSim and a low-cost commercial UAV platform in a real-world urban environment.
【6】 NeuralBlox: Real-Time Neural Representation Fusion for Robust Volumetric Mapping
Link: https://arxiv.org/abs/2110.09415
Authors: Stefan Lionar, Lukas Schmid, Cesar Cadena, Roland Siegwart, Andrei Cramariuc
Affiliations: Autonomous Systems Lab, ETH Zürich, Switzerland
Comments: 3DV 2021. Equal contribution between the first two authors. Code: this https URL
Abstract: We present a novel 3D mapping method leveraging the recent progress in neural implicit representation for 3D reconstruction. Most existing state-of-the-art neural implicit representation methods are limited to object-level reconstructions and cannot incrementally perform updates given new data. In this work, we propose a fusion strategy and training pipeline to incrementally build and update neural implicit representations that enable the reconstruction of large scenes from sequential partial observations. By representing an arbitrarily sized scene as a grid of latent codes and performing updates directly in latent space, we show that incrementally built occupancy maps can be obtained in real-time even on a CPU. Compared to traditional approaches such as Truncated Signed Distance Fields (TSDFs), our map representation is significantly more robust in yielding a better scene completeness given noisy inputs. We demonstrate the performance of our approach in thorough experimental validation on real-world datasets with varying degrees of added pose noise.
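Illustration (not from the paper): the abstract compares against traditional TSDF mapping. As background, the sketch below shows the usual per-voxel weighted running average used to fuse a new truncated signed-distance observation into a TSDF grid. It illustrates that baseline only, not NeuralBlox's latent-space fusion; the function name, the truncation value, and the array layout are assumptions made for the example.

```python
import numpy as np


def fuse_tsdf(tsdf, weight, new_sdf, new_weight, trunc=0.1):
    """Fuse one observation into a TSDF grid by weighted averaging.

    tsdf, weight: current per-voxel distance and accumulated weight.
    new_sdf, new_weight: per-voxel observation and its weight (0 = unobserved).
    trunc: truncation distance (hypothetical default, same units as the SDF).
    """
    d = np.clip(new_sdf, -trunc, trunc)  # truncate the signed distance
    total = weight + new_weight
    averaged = (weight * tsdf + new_weight * d) / np.maximum(total, 1e-12)
    # Voxels that have never been observed keep their previous value.
    fused = np.where(total > 0, averaged, tsdf)
    return fused, total
```

Because each voxel is an average of raw distance observations, an observation taken from a noisy pose contributes its error directly to the stored value, which is the sensitivity the abstract contrasts its method against.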
【7】 How Far Two UAVs Should Be subject to Communication Uncertainties
Link: https://arxiv.org/abs/2110.09391
Authors: Quan Quan, Rao Fu, Kai-Yuan Cai
Affiliations: School of Automation Science and Electrical Engineering, Beihang University
Abstract: Unmanned aerial vehicles are now becoming increasingly accessible to amateur and commercial users alike. A safety air traffic management system is needed to help ensure that every newest entrant into the sky does not collide with others. Much research has been done to design various methods to perform collision avoidance with obstacles. However, how to decide the safety radius subject to communication uncertainties is still an open question. Based on assumptions on communication uncertainties and supposed control performance, a separation principle of the safety radius design and controller design is proposed. With it, the safety radius corresponding to the safety area in the design phase (without uncertainties) and flight phase (subject to uncertainties) are studied. Furthermore, the results are extended to multiple obstacles. Simulations and experiments are carried out to show the effectiveness of the proposed methods.
【8】 Does human-robot trust need reciprocity?
Link: https://arxiv.org/abs/2110.09359
Authors: Joshua Zonca, Alessandra Sciutti
Abstract: Trust is one of the hallmarks of human-human and human-robot interaction. Extensive evidence has shown that trust among humans requires reciprocity. Conversely, research in human-robot interaction (HRI) has mostly relied on a unidirectional view of trust that focuses on robots' reliability and performance. The current paper argues that reciprocity may also play a key role in the emergence of mutual trust and successful collaboration between humans and robots. We will gather and discuss works that reveal a reciprocal dimension in human-robot trust, paving the way to a bidirectional and dynamic view of trust in HRI.
【9】 FAST3D: Flow-Aware Self-Training for 3D Object Detectors
Link: https://arxiv.org/abs/2110.09355
Authors: Christian Fruhwirth-Reisinger, Michael Opitz, Horst Possegger, Horst Bischof
Affiliations: Christian Doppler Laboratory for Embedded Machine Learning, Institute of Computer Graphics and Vision, Graz University of Technology
Comments: Accepted to BMVC 2021
Abstract: In the field of autonomous driving, self-training is widely applied to mitigate distribution shifts in LiDAR-based 3D object detectors. This eliminates the need for expensive, high-quality labels whenever the environment changes (e.g., geographic location, sensor setup, weather condition). State-of-the-art self-training approaches, however, mostly ignore the temporal nature of autonomous driving data. To address this issue, we propose a flow-aware self-training method that enables unsupervised domain adaptation for 3D object detectors on continuous LiDAR point clouds. In order to get reliable pseudo-labels, we leverage scene flow to propagate detections through time. In particular, we introduce a flow-based multi-target tracker that exploits flow consistency to filter and refine resulting tracks. The emerged precise pseudo-labels then serve as a basis for model re-training. Starting with a pre-trained KITTI model, we conduct experiments on the challenging Waymo Open Dataset to demonstrate the effectiveness of our approach. Without any prior target domain knowledge, our results show a significant improvement over the state-of-the-art.
【10】 Electric Vehicle Automatic Charging System Based on Vision-force Fusion
Link: https://arxiv.org/abs/2110.09191
Authors: Dashun Guo, Liang Xie, Hongxiang Yu, Yue Wang, Rong Xiong
Affiliations: Zhejiang University
Abstract: Electric vehicles are an emerging means of transportation with environmental friendliness. The automatic charging is a hot topic in this field that is full of challenges. We introduce a complete automatic charging system based on vision-force fusion, which includes perception, planning and control for robot manipulations of the system. We design the whole system in simulation and transfer it to the real world. The experimental results prove the effectiveness of our system.
【11】 A unified framework for walking and running of bipedal robots
Link: https://arxiv.org/abs/2110.09172
Authors: Mahrokh Ghoddousi Boroujeni, Elham Daneshmand, Ludovic Righetti, Majid Khadiv
Affiliations: Institute of Mechanical Engineering; Max Planck Institute for Intelligent Systems; Tandon School of Engineering
Abstract: In this paper, we propose a novel framework capable of generating various walking and running gaits for bipedal robots. The main goal is to relax the fixed center of mass (CoM) height assumption of the linear inverted pendulum model (LIPM) and generate a wider range of walking and running motions, without a considerable increase in complexity. To do so, we use the concept of virtual constraints in the centroidal space which enables generating motions beyond walking while keeping the complexity at a minimum. By a proper choice of these virtual constraints, we show that we can generate different types of walking and running motions. More importantly, enforcing the virtual constraints through feedback renders the dynamics linear and enables us to design a feedback control mechanism which adapts the next step location and timing in face of disturbances, through a simple quadratic program (QP). To show the effectiveness of this framework, we showcase different walking and running simulations of the biped robot Bolt in the presence of both environmental uncertainties and external disturbances.
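Illustration (not from the paper): for readers unfamiliar with the model the abstract sets out to relax, the standard LIPM keeps the CoM at a constant height z0 above a point foot at p, which gives the linear dynamics x'' = (g / z0) (x - p). The sketch below evaluates the closed-form solution of that baseline model in one horizontal dimension. It does not implement the paper's virtual-constraint framework, and the function names and parameter values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def lipm_state(x0, v0, p, z0, t):
    """CoM position and velocity after time t under the standard LIPM.

    x0, v0: initial CoM position and velocity; p: stance foot position;
    z0: constant CoM height assumed by the model.
    """
    omega = math.sqrt(G / z0)  # natural frequency of the pendulum
    c, s = math.cosh(omega * t), math.sinh(omega * t)
    x = p + (x0 - p) * c + (v0 / omega) * s
    v = (x0 - p) * omega * s + v0 * c
    return x, v


def orbital_energy(x, v, p, z0):
    """Quantity conserved along any LIPM trajectory with a fixed foot."""
    return 0.5 * v**2 - 0.5 * (G / z0) * (x - p) ** 2
```

The constant `z0` in `omega` is exactly the fixed-height assumption named in the abstract: once the CoM height is allowed to vary, this closed form no longer applies.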
【12】 Enhancing exploration algorithms for navigation with visual SLAM
Link: https://arxiv.org/abs/2110.09156
Authors: Kirill Muravyev, Andrey Bokovoy, Konstantin Yakovlev
Affiliations: Artificial Intelligence Research Institute, Federal Research Center for Computer Science and Control of Russian Academy of Sciences, Moscow, Russia; Moscow Institute of Physics and Technology, Dolgoprudny, Russia
Comments: Camera-ready version as submitted to RNCAI 2021 conference
Abstract: Exploration is an important step in autonomous navigation of robotic systems. In this paper we introduce a series of enhancements for exploration algorithms in order to use them with vision-based simultaneous localization and mapping (vSLAM) methods. We evaluate developed approaches in a photo-realistic simulator in two modes: with ground-truth depths and with neural network reconstructed depth maps as vSLAM input. We evaluate standard metrics in order to estimate exploration coverage.
【13】 Probabilistic Inference in Planning for Partially Observable Long Horizon Problems
Link: https://arxiv.org/abs/2110.09153
Authors: Alphonsus Adu-Bredu, Nikhil Devraj, Pin-Han Lin, Zhen Zeng, Odest Chadwicke Jenkins
Affiliations: Pin-Han Lin and Odest Chadwicke Jenkins are with the Robotics Institute and Department of Electrical Engineering and Computer Science, University of Michigan
Comments: International Conference on Intelligent Robots and Systems (IROS), 2021
Abstract: For autonomous service robots to successfully perform long horizon tasks in the real world, they must act intelligently in partially observable environments. Most Task and Motion Planning approaches assume full observability of their state space, making them ineffective in stochastic and partially observable domains that reflect the uncertainties in the real world. We propose an online planning and execution approach for performing long horizon tasks in partially observable domains. Given the robot's belief and a plan skeleton composed of symbolic actions, our approach grounds each symbolic action by inferring continuous action parameters needed to execute the plan successfully. To achieve this, we formulate the problem of joint inference of action parameters as a Hybrid Constraint Satisfaction Problem (H-CSP) and solve the H-CSP using Belief Propagation. The robot executes the resulting parameterized actions, updates its belief of the world and replans when necessary. Our approach is able to efficiently solve partially observable tasks in a realistic kitchen simulation environment. Our approach outperformed an adaptation of the state-of-the-art method across our experiments.
【14】 A Tactile-enabled Grasping Method for Robotic Fruit Harvesting
Link: https://arxiv.org/abs/2110.09051
Authors: Hongyu Zhou, Xing Wang, Hanwen Kang, Chao Chen
Affiliations: Department of Mechanical and Aerospace Engineering, Monash University, Clayton, VIC, Australia
Comments: Submitted to a conference
Abstract: In the robotic crop harvesting environment, foreign object intrusion into the gripper workspace occurs frequently and cannot be ignored, yet it is rarely addressed. This paper presents a novel intelligent robotic grasping method capable of handling obstacle interference, which is the first of its kind in the literature. The proposed method combines deep learning algorithms with low-cost tactile sensing hardware on a multi-DoF soft robotic gripper. Through experimental validations, the proposed method demonstrated promising performance in distinguishing various grasping scenarios. The 4-finger independently controlled gripper presented outstanding adaptability to handle various picking scenarios. The overall performance of this work indicated great potential for solving the robotic fruit harvesting challenges.
【15】 Reinforcement Learning-Based Coverage Path Planning with Implicit Cellular Decomposition
Link: https://arxiv.org/abs/2110.09018
Authors: Javad Heydari, Olimpiya Saha, Viswanath Ganapathy
Comments: 20 pages
Abstract: Coverage path planning in a generic known environment is shown to be NP-hard. When the environment is unknown, it becomes more challenging as the robot is required to rely on its online map information built during coverage for planning its path. A significant research effort focuses on designing heuristic or approximate algorithms that achieve reasonable performance. Such algorithms have sub-optimal performance in terms of covering the area or the cost of coverage, e.g., coverage time or energy consumption. In this paper, we provide a systematic analysis of the coverage problem and formulate it as an optimal stopping time problem, where the trade-off between coverage performance and its cost is explicitly accounted for. Next, we demonstrate that reinforcement learning (RL) techniques can be leveraged to solve the problem computationally. To this end, we provide some technical and practical considerations to facilitate the application of the RL algorithms and improve the efficiency of the solutions. Finally, through experiments in grid world environments and Gazebo simulator, we show that reinforcement learning-based algorithms efficiently cover realistic unknown indoor environments, and outperform the current state of the art.
【16】 Online Motion Planning with Soft Timed Temporal Logic in Dynamic and Unknown Environment
Link: https://arxiv.org/abs/2110.09007
Authors: Zhiliang Li, Mingyu Cai, Shaoping Xiao, Zhen Kan
Affiliations: Mingyu Cai is with the Department of Mechanical Engineering and Mechanics, Lehigh University
Comments: Under review
Abstract: Motion planning of an autonomous system with high-level specifications has wide applications. However, research of formal languages involving timed temporal logic is still under investigation. Furthermore, many existing results rely on a key assumption that user-specified tasks are feasible in the given environment. Challenges arise when the operating environment is dynamic and unknown since the environment can be found prohibitive, leading to potentially conflicting tasks where pre-specified LTL tasks cannot be fully satisfied. Such issues become even more challenging when considering timed requirements. To address these challenges, this work proposes a control framework that considers hard constraints to enforce safety requirements and soft constraints to enable task relaxation. The metric interval temporal logic (MITL) specifications are employed to deal with time constraints. By constructing a relaxed timed product automaton, an online motion planning strategy is synthesized with a receding horizon controller to generate policies, achieving multiple objectives in decreasing order of priority: 1) formally guarantee the satisfaction of hard safety constraints; 2) mostly fulfill soft timed tasks; and 3) collect time-varying rewards as much as possible. Another novelty of the relaxed structure is to consider violations of both time and tasks for infeasible cases. Simulation results are provided to validate the proposed approach.
【17】 Accurate and Robust Object-oriented SLAM with 3D Quadric Landmark Construction in Outdoor Environment
Link: https://arxiv.org/abs/2110.08977
Authors: Rui Tian, Yunzhou Zhang, Yonghui Feng, Linghao Yang, Zhenzhong Cao, Sonya Coleman, Dermot Kerr
Affiliations: Ulster University
Comments: Submitting to RA-L
Abstract: Object-oriented SLAM is a popular technology in autonomous driving and robotics. In this paper, we propose a stereo visual SLAM with a robust quadric landmark representation method. The system consists of four components, including deep learning detection, object-oriented data association, dual quadric landmark initialization and object-based pose optimization. State-of-the-art quadric-based SLAM algorithms always face observation related problems and are sensitive to observation noise, which limits their application in outdoor scenes. To solve this problem, we propose a quadric initialization method based on the decoupling of the quadric parameters method, which improves the robustness to observation noise. The sufficient object data association algorithm and object-oriented optimization with multiple cues enables a highly accurate object pose estimation that is robust to local observations. Experimental results show that the proposed system is more robust to observation noise and significantly outperforms current state-of-the-art methods in outdoor environments. In addition, the proposed system demonstrates real-time performance.
【18】 Keypoint-Based Bimanual Shaping of Deformable Linear Objects under Environmental Constraints using Hierarchical Action Planning
Link: https://arxiv.org/abs/2110.08962
Authors: Shengzeng Huo, Anqing Duan, Chengxi Li, Peng Zhou, Wanyu Ma, David Navarro-Alarcon
Affiliations: The Hong Kong Polytechnic University (all authors)
Abstract: This paper addresses the problem of contact-based manipulation of deformable linear objects (DLOs) towards desired shapes with a dual-arm robotic system. To alleviate the burden of high-dimensional continuous state-action spaces, we model the DLO as a kinematic multibody system via our proposed keypoint detection network. This new perception network is trained on a synthetic labeled image dataset and transferred to real manipulation scenarios without conducting any manual annotations. Our goal-conditioned policy can efficiently learn to rearrange the configuration of the DLO based on the detected keypoints. The proposed hierarchical action framework tackles the manipulation problem in a coarse-to-fine manner (with high-level task planning and low-level motion control) by leveraging two action primitives. The identification of deformation properties is avoided since the algorithm replans its motion after each bimanual execution. The conducted experimental results reveal that our method achieves high performance in state representation of the DLO, and is robust to uncertain environmental constraints.
【19】 Deep Tactile Experience: Estimating Tactile Sensor Output from Depth Sensor Data
Link: https://arxiv.org/abs/2110.08946
作者:Karankumar Patel,Soshi Iba,Nawid Jamali 机构:©, IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media 备注:Accepted for publication in the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2020) 摘要:触觉感知本质上是基于接触的。为了使用触觉数据,机器人需要与物体表面接触。在代理需要在多个备选方案之间做出决定的应用程序中,这是低效的,这些备选方案取决于联系人位置的物理属性。我们提出了一种非侵入式获取触觉数据的方法。所提出的方法根据过去的经验从物体表面的深度数据估计触觉传感器的输出。通过允许机器人与各种物体交互,收集触觉数据和相应的物体表面深度数据,建立了一个经验数据集。我们使用经验数据集训练神经网络,仅从深度数据估计触觉输出。我们使用GelSight触觉传感器(一种基于图像的传感器)生成图像,捕捉接触位置的详细表面特征。我们用一个包含578个触觉图像的数据集训练一个网络,使其与深度映射对应。给定物体表面的深度图,如果触觉传感器与物体接触,网络输出触觉传感器响应的估计值。我们使用结构相似性指数矩阵(SSIM)对该方法进行评估,SSIM是图像处理界常用的两幅图像之间的相似性度量。我们给出的实验结果表明,该方法优于使用具有统计显著性的随机图像的基线,SSIM得分分别为0.84 /-0.0056和0.80 /-0.0036。 摘要:Tactile sensing is inherently contact based. To use tactile data, robots need to make contact with the surface of an object. This is inefficient in applications where an agent needs to make a decision between multiple alternatives that depend the physical properties of the contact location. We propose a method to get tactile data in a non-invasive manner. The proposed method estimates the output of a tactile sensor from the depth data of the surface of the object based on past experiences. An experience dataset is built by allowing the robot to interact with various objects, collecting tactile data and the corresponding object surface depth data. We use the experience dataset to train a neural network to estimate the tactile output from depth data alone. We use GelSight tactile sensors, an image-based sensor, to generate images that capture detailed surface features at the contact location. We train a network with a dataset containing 578 tactile-image to depthmap correspondences. Given a depth-map of the surface of an object, the network outputs an estimate of the response of the tactile sensor, should it make a contact with the object. 
We evaluate the method with structural similarity index matrix (SSIM), a similarity metric between two images commonly used in image processing community. We present experimental results that show the proposed method outperforms a baseline that uses random images with statistical significance getting an SSIM score of 0.84 /- 0.0056 and 0.80 /- 0.0036, respectively.
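Entry 【19】 scores its predictions with SSIM. As a rough illustration of what that number measures, the sketch below computes a simplified SSIM in plain Python. It rests on assumptions that are not from the paper: it uses one global window over the whole image instead of the usual average over local sliding windows, and the name `global_ssim`, the nested-list image format, and the 8-bit dynamic range are illustrative choices rather than the authors' evaluation code.

```python
def global_ssim(img_a, img_b, dynamic_range=255.0):
    """Single-window SSIM between two equally sized grayscale images.

    Images are lists of rows of pixel intensities (an assumed format).
    """
    a = [float(p) for row in img_a for p in row]
    b = [float(p) for row in img_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and the same size")

    n = len(a)
    # Standard stabilising constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2

    mu_a = sum(a) / n
    mu_b = sum(b) / n
    var_a = sum((p - mu_a) ** 2 for p in a) / n
    var_b = sum((q - mu_b) ** 2 for q in b) / n
    cov = sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / n

    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator
```

A score of 1 means the two images agree in mean, contrast, and structure. Windowed implementations in image-processing libraries generally give different values from this global variant, so the sketch is for intuition only.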
【20】 Characterizing and Improving the Resilience of Accelerators in Autonomous Robots Link: https://arxiv.org/abs/2110.08906
Authors: Deval Shah, Zi Yu Xue, Karthik Pattabiraman, Tor M. Aamodt Affiliations: Electrical and Computer Engineering, The University of British Columbia Comments: 14 pages Abstract: Motion planning is a computationally intensive and well-studied problem in autonomous robots. However, motion planning hardware accelerators (MPAs) must be soft-error resilient for deployment in safety-critical applications, and blanket application of traditional mitigation techniques is ill-suited due to cost, power, and performance overheads. We propose Collision Exposure Factor (CEF), a novel metric to assess the failure vulnerability of circuits processing spatial relationships, including motion planning. CEF is based on the insight that the safety violation probability increases with the surface area of the physical space exposed by a bit-flip. We evaluate CEF on four MPAs. We demonstrate empirically that CEF is correlated with safety violation probability, and that CEF-aware selective error mitigation provides 12.3x, 9.6x, and 4.2x lower Failures-In-Time (FIT) rate on average for the same amount of protected memory compared to uniform, bit-position, and access-frequency-aware selection of critical data, respectively. Furthermore, we show how to employ CEF to enable fault characterization using 23,000x fewer fault injection (FI) experiments than exhaustive FI, and evaluate our FI approach on different robots and MPAs. We demonstrate that CEF-aware FI can provide insights on vulnerable bits in an MPA while taking the same amount of time as uniform statistical FI. Finally, we use the CEF to formulate guidelines for designing soft-error resilient MPAs.
【21】 On-line Optimal Ranging Sensor Deployment for Robotic Exploration Link: https://arxiv.org/abs/2110.08853
Authors: Luca Santoro, Davide Brunelli, Daniele Fontanelli Comments: 10 pages, 12 figures, 3 tables, in IEEE Sensors Journal Abstract: Navigation in an unknown environment without any preexisting positioning infrastructure has always been hard for mobile robots. This paper presents an ultra-wideband (UWB) infrastructure that is self-deployed by mobile agents, permitting dynamic placement and runtime extension of the UWB anchor infrastructure while the robot explores the new environment. We provide a detailed analysis of the uncertainty of the positioning system as the UWB infrastructure grows. Moreover, we develop a genetic algorithm that minimizes the deployment of new anchors, saving energy and resources on the mobile robot and maximizing the mission time. Although the presented approach is general for any class of mobile system, we run simulations and experiments with indoor drones. Results demonstrate that the maximum positioning uncertainty, measured with the Geometric Dilution of Precision (GDoP), is always kept under the user's threshold.
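Entry 【21】 bounds positioning uncertainty with GDoP. The following is a minimal sketch of how a GDoP value can be computed for range-only positioning in 2D, under assumptions that are mine rather than the paper's: ranges are measured without a clock-bias unknown, measurement noise is equal across anchors, and the function name `gdop_2d` is hypothetical. GDoP is then the square root of the trace of (HᵀH)⁻¹, where each row of H is the unit vector from an anchor to the robot.

```python
import math


def gdop_2d(position, anchors):
    """GDoP of a 2D range-only fix at `position` given anchor coordinates."""
    px, py = position
    sxx = sxy = syy = 0.0  # entries of the 2x2 matrix H^T H
    for ax, ay in anchors:
        dx, dy = px - ax, py - ay
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            raise ValueError("position coincides with an anchor")
        ux, uy = dx / dist, dy / dist
        sxx += ux * ux
        sxy += ux * uy
        syy += uy * uy

    det = sxx * syy - sxy * sxy
    if det <= 1e-12:
        # Degenerate geometry (e.g. robot and anchors on one line).
        return math.inf
    # For a 2x2 symmetric matrix, trace(inverse) = (sxx + syy) / det.
    return math.sqrt((sxx + syy) / det)
```

A deployment rule in the spirit of the abstract could evaluate this value along the planned path and drop a new anchor once it approaches the user's threshold. The genetic algorithm the authors use to choose anchor locations is not reproduced here.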
【22】 Siamese Transformer Pyramid Networks for Real-Time UAV Tracking Link: https://arxiv.org/abs/2110.08822
Authors: Daitao Xing, Nikolaos Evangeliou, Athanasios Tsoukalas, Anthony Tzes Affiliations: New York University, USA; New York University Abu Dhabi, UAE Comments: 10 pages, 8 figures, accepted by WACV2022 Abstract: Recent object tracking methods depend upon deep networks or convoluted architectures. Most of those trackers can hardly meet real-time processing requirements on mobile platforms with limited computing resources. In this work, we introduce the Siamese Transformer Pyramid Network (SiamTPN), which inherits the advantages of both CNN and Transformer architectures. Specifically, we exploit the inherent feature pyramid of a lightweight network (ShuffleNetV2) and reinforce it with a Transformer to construct a robust target-specific appearance model. A centralized architecture with lateral cross attention is developed for building augmented high-level feature maps. To avoid the computation and memory cost of fusing pyramid representations with the Transformer, we further introduce the pooling attention module, which significantly reduces memory and time complexity while improving robustness. Comprehensive experiments on both aerial and prevalent tracking benchmarks achieve competitive results while operating at high speed, demonstrating the effectiveness of SiamTPN. Moreover, our fastest variant tracker operates at over 30 Hz on a single CPU core and obtains an AUC score of 58.1% on the LaSOT dataset. Source code is available at https://github.com/RISCNYUAD/SiamTPNTracker
【23】 Coordinated Multi-Agent Pathfinding for Drones and Trucks over Road Networks Link: https://arxiv.org/abs/2110.08802
Authors: Shushman Choudhury, Kiril Solovey, Mykel Kochenderfer, Marco Pavone Affiliations: Stanford University, Stanford, CA, USA; Technion - Israel Institute of Technology, Haifa, Israel Abstract: We address the problem of routing a team of drones and trucks over large-scale urban road networks. To conserve their limited flight energy, drones can use trucks as temporary modes of transit en route to their own destinations. Such coordination can yield significant savings in total vehicle distance traveled, i.e., truck travel distance and drone flight distance, compared to operating drones and trucks independently. But it comes at the potentially prohibitive computational cost of deciding which trucks and drones should coordinate and when and where it is most beneficial to do so. We tackle this fundamental trade-off by decoupling our overall intractable problem into tractable sub-problems that we solve stage-wise. The first stage solves only for trucks, by computing paths that make them more likely to be useful transit options for drones. The second stage solves only for drones, by routing them over a composite of the road network and the transit network defined by truck paths from the first stage. We design a comprehensive algorithmic framework that frames each stage as a multi-agent path-finding problem and implement two distinct methods for solving them. We evaluate our approach on extensive simulations with up to 100 agents on the real-world Manhattan road network containing nearly 4500 vertices and 10000 edges. Our framework saves more than 50% of vehicle distance traveled compared to independently solving for trucks and drones, and computes solutions for all settings within 5 minutes on commodity hardware.
【24】 TIP: Task-Informed Motion Prediction for Intelligent Systems Link: https://arxiv.org/abs/2110.08750
Authors: Xin Huang, Guy Rosman, Ashkan Jasour, Stephen G. McGill, John J. Leonard, Brian C. Williams Affiliations: Massachusetts Institute of Technology, Toyota Research Institute Comments: 8 pages, 6 figures, 2 tables Abstract: Motion prediction is important for intelligent driving systems, providing the future distributions of road agent behaviors and supporting various decision-making tasks. Existing motion predictors are often optimized and evaluated via task-agnostic measures based on prediction accuracy. Such measures fail to account for the use of prediction in downstream tasks, and could result in sub-optimal task performance. We propose a task-informed motion prediction framework that jointly reasons about prediction accuracy and task utility, to better support downstream tasks through its predictions. The task utility function does not require the full task information, but rather a specification of the utility of the task, resulting in predictors that serve a wide range of downstream tasks. We demonstrate our framework on two use cases of task utilities, in the context of autonomous driving and parallel autonomy, and show the advantage of task-informed predictors over task-agnostic ones on the Waymo Open Motion dataset.
【25】 CLASP: Constrained Latent Shape Projection for Refining Object Shape from Robot Contact Link: https://arxiv.org/abs/2110.08719
Authors: Brad Saund, Dmitry Berenson Affiliations: Robotics Institute, University of Michigan Comments: 16 pages, 10 figures, Accepted at the Conference on Robot Learning (CoRL) 2021 Abstract: Robots need both visual and contact sensing to effectively estimate the state of their environment. Camera RGBD data provides rich information about the objects surrounding the robot, and shape priors can help correct noise and fill in gaps and occluded regions. However, when the robot senses unexpected contact, the estimate should be updated to explain the contact. To address this need, we propose CLASP: Constrained Latent Shape Projection. This approach consists of a shape completion network that generates a prior from RGBD data and a procedure to generate shapes consistent with both the network prior and robot contact observations. We find that CLASP consistently decreases the Chamfer Distance between the predicted and ground truth scenes, while other approaches do not benefit from contact information.
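Entry 【25】 reports results as Chamfer Distance between predicted and ground-truth scenes. The sketch below shows one common convention for that metric on point sets: the symmetric sum of mean nearest-neighbour Euclidean distances. Papers differ on details such as squared versus unsquared distances and sum versus mean, and the abstract does not say which variant CLASP uses, so the function `chamfer_distance` is an illustrative assumption rather than the authors' metric code.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two point sets (sequences of tuples)."""
    if not points_a or not points_b:
        raise ValueError("both point sets must be non-empty")

    def mean_nearest(src, dst):
        # Average distance from each source point to its closest target point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return mean_nearest(points_a, points_b) + mean_nearest(points_b, points_a)
```

This brute-force version costs O(|A|·|B|) distance evaluations. For dense scene point clouds a spatial index such as a k-d tree would be the usual replacement for the inner `min`.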
【26】 Dynamic Compressed Sensing of Unsteady Flows with a Mobile Robot Link: https://arxiv.org/abs/2110.08658
Authors: Sachin Shriwastav, Gregory Snyder, Zhuoyuan Song Affiliations: Department of Mechanical Engineering, University of Hawaiʻi at Mānoa Comments: 9 pages, 7 figures Abstract: Large-scale environmental sensing with a finite number of mobile sensors is a challenging task that requires a lot of resources and time. This is especially true when features in the environment are spatiotemporally changing with unknown or partially known dynamics. However, these dynamic features often evolve in a low-dimensional space, making it possible to capture their dynamics sufficiently well with only one or several properly planned mobile sensors. This paper investigates the problem of dynamic compressed sensing (DCS) of an unsteady flow field, which takes advantage of the inherently low dimensionality of the underlying flow dynamics to reduce the number of waypoints for a mobile sensing robot. The optimal sensing waypoints are identified by an iterative compressed sensing algorithm that optimizes the flow reconstruction based on the proper orthogonal decomposition (POD) modes. An optimized robot trajectory is then found to traverse these waypoints while minimizing the energy consumption, time, and flow reconstruction error. Simulation results in an unsteady double-gyre flow field are presented to demonstrate the efficacy of the proposed algorithms. Experimental results with an indoor quadcopter are presented to show the feasibility of the resulting trajectory.
【27】 Partial Hierarchical Pose Graph Optimization for SLAM Link: https://arxiv.org/abs/2110.08639
Authors: Alexander Korovko, Dmitry Robustov Affiliations: NVIDIA Comments: 5 pages, 4 figures Abstract: In this paper we consider hierarchical pose graph optimization (HPGO) for Simultaneous Localization and Mapping (SLAM). We propose a fast incremental procedure for building hierarchy levels in pose graphs. We study the properties of this procedure and show that our solution delivers high execution speed, a high reduction rate, and good flexibility. We propose a way to do partial hierarchical optimization and compare it to other optimization modes. We show that, given a comparatively large number of poses, partial HPGO gives a 10x speed-up compared to the original optimization without sacrificing quality.
【28】 Learning Cloth Folding Tasks with Refined Flow Based Spatio-Temporal Graphs Link: https://arxiv.org/abs/2110.08620
Authors: Peng Zhou, Omar Zahra, Anqing Duan, Shengzeng Huo, Zeyu Wu, David Navarro-Alarcon Affiliations: The Hong Kong Polytechnic University Comments: 8 pages, 6 figures Abstract: Cloth folding is a widespread domestic task that humans perform with apparent ease but that is highly challenging for autonomous robots to execute due to the highly deformable nature of textiles; it is hard to engineer and learn manipulation pipelines that execute it efficiently. In this paper, we propose a new solution for robotic cloth folding (using a standard folding board) via learning from demonstrations. Our demonstration video encoding is based on a high-level abstraction, namely, a refined optical flow-based spatiotemporal graph, as opposed to a low-level encoding such as image pixels. By constructing a new spatiotemporal graph with an advanced visual correspondence descriptor, the policy learning can focus on key points and relations with a 3D spatial configuration, which allows quick generalization across different environments. To further boost the policy search, we combine optical flow and static motion saliency maps to discriminate the dominant motions for better handling of the system dynamics in real time, which aligns with the attentional motion mechanism that dominates the human imitation process. To validate the proposed approach, we analyze the manual folding procedure and develop a custom-made end-effector to efficiently interact with the folding board. Multiple experiments on a real robotic platform were conducted to validate the effectiveness and robustness of the proposed method.
【29】 MAAD: A Model and Dataset for "Attended Awareness" in Driving Link: https://arxiv.org/abs/2110.08610
Authors: Deepak Gopinath, Guy Rosman, Simon Stent, Katsuya Terahata, Luke Fletcher, Brenna Argall, John Leonard Comments: 25 pages, 13 figures, 14 tables, Accepted at EPIC@ICCV 2021 Workshop. Main paper + Supplementary Material Abstract: We propose a computational model to estimate a person's attended awareness of their environment. We define attended awareness to be those parts of a potentially dynamic scene which a person has attended to in recent history and which they are still likely to be physically aware of. Our model takes as input scene information in the form of a video and noisy gaze estimates, and outputs visual saliency, a refined gaze estimate, and an estimate of the person's attended awareness. In order to test our model, we capture a new dataset with a high-precision gaze tracker including 24.5 hours of gaze sequences from 23 subjects attending to videos of driving scenes. The dataset also contains third-party annotations of the subjects' attended awareness based on observations of their scan path. Our results show that our model is able to reasonably estimate attended awareness in a controlled setting, and in the future could potentially be extended to real egocentric driving data to help enable more effective ahead-of-time warnings in safety systems and thereby augment driver performance. We also demonstrate our model's effectiveness on the tasks of saliency, gaze calibration, and denoising, using both our dataset and an existing saliency dataset. We make our model and dataset available at https://github.com/ToyotaResearchInstitute/att-aware/.
【30】 Generative Adversarial Imitation Learning for End-to-End Autonomous Driving on Urban Environments Link: https://arxiv.org/abs/2110.08586
Authors: Gustavo Claudio Karl Couto, Eric Aislan Antonelo Affiliations: Automation and Systems Engineering Department, Federal University of Santa Catarina, Florianopolis, Brazil Abstract: Autonomous driving is a complex task that has been tackled since the first self-driving car, ALVINN, in 1989 with a supervised learning approach known as behavioral cloning (BC). In BC, a neural network is trained with state-action pairs that constitute the training set made by an expert, i.e., a human driver. However, this type of imitation learning does not take into account the temporal dependencies that might exist between actions taken at different moments of a navigation trajectory. These types of tasks are better handled by reinforcement learning (RL) algorithms, which require a reward function to be defined. On the other hand, more recent approaches to imitation learning, such as Generative Adversarial Imitation Learning (GAIL), can train policies without explicitly requiring a reward function, allowing an agent to learn by trial and error directly on a training set of expert trajectories. In this work, we propose two variations of GAIL for autonomous navigation of a vehicle in the realistic CARLA simulation environment for urban scenarios. Both use the same network architecture, which processes high-dimensional image input from three frontal cameras and nine other continuous inputs representing the velocity, the next point from the sparse trajectory, and a high-level driving command. We show that both are capable of imitating the expert trajectory from start to end after training ends, but the variant whose GAIL loss function is augmented with BC outperforms the other in terms of convergence time and training stability.
【31】 Lifelong Topological Visual Navigation Link: https://arxiv.org/abs/2110.08488
Authors: Rey Reza Wiyatno, Anqi Xu, Liam Paull Affiliations: Montréal Robotics and Embodied AI Lab (REAL) and DIRO, University of Montréal (Rey Reza Wiyatno, Liam Paull) Comments: Project page: this https URL Abstract: The ability of a robot to navigate using only vision is appealing due to its simplicity. Traditional vision-based navigation approaches required a prior map-building step that was arduous and prone to failure, or could only exactly follow previously executed trajectories. Newer learning-based visual navigation techniques reduce the reliance on a map and instead directly learn policies from image inputs for navigation. There are currently two prevalent paradigms: end-to-end approaches, which forego the explicit map representation entirely, and topological approaches, which still preserve some loose connectivity of the space. However, while end-to-end methods tend to struggle in long-distance navigation tasks, topological map-based solutions are prone to failure due to spurious edges in the graph. In this work, we propose a learning-based topological visual navigation method with graph update strategies that improve lifelong navigation performance over time. We take inspiration from sampling-based planning algorithms to build image-based topological graphs, resulting in sparser graphs yet with higher navigation performance compared to baseline methods. Also, unlike controllers that learn from fixed training environments, we show that our model can be fine-tuned using a relatively small dataset from the real-world environment where the robot is deployed. We further assess the performance of our system in real-world deployments.
【32】 Extended Version of Reactive Task Allocation and Planning of A Heterogeneous Multi-Robot System Link: https://arxiv.org/abs/2110.08436
Authors: Ziyi Zhou, Dong Jae Lee, Yuki Yoshinaga, Dejun Guo, Ye Zhao Abstract: This paper takes the first step towards a reactive, hierarchical multi-robot task allocation and planning framework given a global Linear Temporal Logic specification. In our scenario, legged and wheeled robots collaborate in a heterogeneous team to accomplish a variety of navigation and delivery tasks. However, all robots are susceptible to different types of disturbances, including locomotion failures, human interventions, and obstructions from the environment. To address these disturbances, we propose task-level local and global reallocation strategies to efficiently generate updated action-state sequences online while guaranteeing the completion of the original task. In addition, these task reallocation approaches eliminate the need to reconstruct the entire plan or resynthesize a new task. Lastly, a Behavior Tree execution layer monitors different types of disturbances and employs the reallocation methods to produce corresponding recovery strategies. To evaluate this planning framework, dynamic simulations are conducted in a realistic hospital environment with a heterogeneous robot team consisting of quadrupeds and wheeled robots for delivery tasks.
【33】 sbp-env: A Python Package for Sampling-based Motion Planner and Samplers Link: https://arxiv.org/abs/2110.08402
Authors: Tin Lai Affiliations: School of Computer Science, The University of Sydney, Australia Abstract: Sampling-based motion planners' testing environment (sbp-env) is a full-featured framework for quickly testing different sampling-based algorithms for motion planning. sbp-env focuses on the flexibility of tinkering with different aspects of the framework, and divides the main planning components into two categories: (i) samplers and (ii) planners. The focus of motion planning research has been mainly on (i) improving the sampling efficiency (with methods such as heuristics or learned distributions) and (ii) the algorithmic aspect of the planner, using different routines to build a connected graph. Therefore, by separating the two components one can quickly swap out different components to test novel ideas.
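The sampler/planner split described in entry 【33】 can be illustrated with a small, self-contained sketch. This is not sbp-env's API: every name below (`uniform_sampler`, `goal_biased_sampler`, `rrt`) is hypothetical, the planner is a bare-bones 2D RRT, and collision checking is done only at the new node rather than along the edge. The point is only that the planner takes the sampler as an argument, so either piece can be swapped independently.

```python
import math
import random


def uniform_sampler(rng, bounds):
    """Sampler: draws points uniformly from an axis-aligned 2D box."""
    (xmin, xmax), (ymin, ymax) = bounds
    return lambda: (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))


def goal_biased_sampler(base_sampler, goal, rng, bias=0.2):
    """Sampler: returns the goal with probability `bias`, else defers to base."""
    return lambda: goal if rng.random() < bias else base_sampler()


def rrt(start, goal, sampler, is_free, step=0.5, goal_tol=0.5, max_iters=5000):
    """Planner: grows a tree from `start` toward whatever the sampler proposes."""
    parent = {start: None}
    for _ in range(max_iters):
        target = sampler()
        nearest = min(parent, key=lambda node: math.dist(node, target))
        gap = math.dist(nearest, target)
        if gap == 0.0:
            continue
        scale = min(1.0, step / gap)  # move at most `step` toward the sample
        new = (nearest[0] + scale * (target[0] - nearest[0]),
               nearest[1] + scale * (target[1] - nearest[1]))
        if new in parent or not is_free(new):
            continue
        parent[new] = nearest
        if math.dist(new, goal) <= goal_tol:
            # Walk parent links back to the start to recover the path.
            path = [new]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            return path[::-1]
    return None
```

Swapping `goal_biased_sampler(...)` for a plain `uniform_sampler(...)`, or replacing `rrt` with another tree- or graph-building routine that accepts the same callable, changes one component without touching the other.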
【34】 Starkit: RoboCup Humanoid KidSize 2021 Worldwide Champion Team Paper Link: https://arxiv.org/abs/2110.08377
Authors: Egor Davydenko, Ivan Khokhlov, Vladimir Litvinenko, Ilya Ryakin, Ilya Osokin, Azer Babaev Affiliations: Team Starkit, Moscow Institute of Physics and Technology, Russia Comments: 15 pages, 10 figures Abstract: This article is devoted to the features that were under development between RoboCup 2019 Sydney and RoboCup 2021 Worldwide. These features include vision-related matters, such as detection and localization, as well as mechanical and algorithmic novelties. Since the competition was held virtually, the simulation-specific features are also considered in the article. We give an overview of the approaches that were tried out, along with an analysis of their preconditions and perspectives and an evaluation of their performance.
【35】 Learning When and What to Ask: a Hierarchical Reinforcement Learning Framework Link: https://arxiv.org/abs/2110.08258
Authors: Khanh Nguyen, Yonatan Bisk, Hal Daumé III Affiliations: University of Maryland, Carnegie Mellon University, Microsoft Research Comments: 15 pages, 3 figures, 4 tables Abstract: Reliable AI agents should be mindful of the limits of their knowledge and consult humans when sensing that they do not have sufficient knowledge to make sound decisions. We formulate a hierarchical reinforcement learning framework for learning to decide when to request additional information from humans and what type of information would be helpful to request. Our framework extends partially-observed Markov decision processes (POMDPs) by allowing an agent to interact with an assistant to leverage their knowledge in accomplishing tasks. Results on a simulated human-assisted navigation problem demonstrate the effectiveness of our framework: aided with an interaction policy learned by our method, a navigation policy achieves up to a 7x improvement in task success rate compared to performing tasks only by itself. The interaction policy is also efficient: on average, only a quarter of all actions taken during a task execution are requests for information. We analyze benefits and challenges of learning with a hierarchical policy structure and suggest directions for future work.
【36】 Koopman Operator Theory for Nonlinear Dynamic Modeling using Dynamic Mode Decomposition Link: https://arxiv.org/abs/2110.08442
Authors: Gregory Snyder, Zhuoyuan Song Affiliations: Department of Mechanical Engineering, University of Hawaiʻi at Mānoa Comments: 8 pages, 16 figures Abstract: The Koopman operator is a linear operator that describes the evolution of scalar observables (i.e., measurement functions of the states) in an infinite-dimensional Hilbert space. This operator-theoretic point of view lifts the dynamics of a finite-dimensional nonlinear system to an infinite-dimensional function space where the evolution of the original system becomes linear. In this paper, we provide a brief summary of the Koopman operator theorem for nonlinear dynamics modeling and focus on analyzing several data-driven implementations using dynamic mode decomposition (DMD) for autonomous and controlled canonical problems. We apply the extended dynamic mode decomposition (EDMD) to identify the leading Koopman eigenfunctions and approximate a finite-dimensional representation of the discovered linear dynamics. This allows us to apply linear control approaches to nonlinear systems without linearization approximations around fixed points. We can then examine the fidelity of using a linear controller based on a Koopman-operator-approximated system on under-actuated systems with basic maneuvers. Numerical simulations on two classic dynamical systems demonstrate the effectiveness of this theory, showing how DMD methods evaluate and approximate the Koopman operator and how effective it is at linearizing these systems.
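The lifting idea in entry 【36】 is easiest to see on a toy system where a finite set of observables closes exactly. The sketch below is such a toy case and not one of the paper's two test systems: the discrete-time map x1' = a·x1, x2' = b·x2 + c·x1² is nonlinear in the state (x1, x2) but exactly linear in the lifted observables (x1, x2, x1²). The coefficients and function names are illustrative assumptions, and the matrix K is written down analytically, whereas EDMD would estimate it from trajectory data by least squares.

```python
def step_nonlinear(x, a=0.9, b=0.5, c=1.0):
    """One step of the original nonlinear map on the state (x1, x2)."""
    x1, x2 = x
    return (a * x1, b * x2 + c * x1 * x1)


def lift(x):
    """Observables z = (x1, x2, x1^2) spanning a Koopman-invariant subspace."""
    x1, x2 = x
    return (x1, x2, x1 * x1)


def koopman_matrix(a=0.9, b=0.5, c=1.0):
    """Matrix K with z' = K z; the last row follows from (a*x1)^2 = a^2 * x1^2."""
    return ((a, 0.0, 0.0),
            (0.0, b, c),
            (0.0, 0.0, a * a))


def step_linear(K, z):
    """One step of the lifted linear system."""
    return tuple(sum(k * zi for k, zi in zip(row, z)) for row in K)


def max_lifting_error(x0, steps=20):
    """Largest state mismatch between the nonlinear and lifted simulations."""
    K = koopman_matrix()
    x, z = x0, lift(x0)
    worst = 0.0
    for _ in range(steps):
        x = step_nonlinear(x)
        z = step_linear(K, z)
        worst = max(worst, abs(x[0] - z[0]), abs(x[1] - z[1]))
    return worst
```

Because the lifted model is linear, standard linear tools (eigen-decomposition, linear feedback design) apply to it directly. For most systems no finite set of observables closes exactly, which is where data-driven approximations such as DMD and EDMD come in.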
Machine translation, for reference only.