cs.RO (Robotics): 16 papers in total
【1】 Reinforcement Learning with Formal Performance Metrics for Quadcopter Attitude Control under Non-nominal Contexts
Authors: Nicola Bernini, Mikhail Bessa, Rémi Delmas, Arthur Gold, Eric Goubault, Romain Pennec, Sylvie Putot, François Sillion
Affiliations: Uber ATCP, Paris, France; LIX, Ecole polytechnique, CNRS, IP-Paris, Palaiseau, France
Link: https://arxiv.org/abs/2107.12942
Abstract: We explore the reinforcement learning approach to designing controllers by extensively discussing the case of a quadcopter attitude controller. We provide all the details needed to reproduce our approach, starting with a model of the dynamics of a Crazyflie 2.0 under various nominal and non-nominal conditions, including partial motor failures and wind gusts. We develop a robust form of signal temporal logic to quantitatively evaluate the vehicle's behavior and measure the performance of controllers. The paper thoroughly describes the choices of training algorithms, neural net architecture, hyperparameters, and observation space in view of the different performance metrics we have introduced. We discuss the robustness of the obtained controllers, both to partial loss of power for one rotor and to wind gusts, and finish by drawing conclusions on practical controller design by reinforcement learning.
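To give a feel for what a quantitative (robust) reading of a temporal-logic requirement looks like, here is a minimal sketch in Python. The two requirements, the thresholds, the sample trace, and the function names are all assumptions chosen for illustration; they are not the metrics defined in the paper.

```python
def rob_always_below(signal, bound):
    """Robustness of 'always |x| < bound': the worst-case margin over the trace.

    Positive means the requirement holds with that much slack,
    negative means it is violated by that much.
    """
    return min(bound - abs(x) for x in signal)


def rob_eventually_below(signal, bound):
    """Robustness of 'eventually |x| < bound': the best-case margin over the trace."""
    return max(bound - abs(x) for x in signal)


# Hypothetical roll-angle error samples (radians) after a step command.
errors = [0.30, 0.18, 0.09, 0.04, 0.02]

# "The error always stays below 0.5 rad" (an overshoot-style bound).
overshoot_margin = rob_always_below(errors, 0.5)

# "The error eventually drops below 0.05 rad" (a settling-style bound).
settling_margin = rob_eventually_below(errors, 0.05)
```

Because the result is a signed margin rather than a pass/fail flag, it can be used to rank controllers or, potentially, to shape a reward.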
【2】 Persistent Reinforcement Learning via Subgoal Curricula
Authors: Archit Sharma, Abhishek Gupta, Sergey Levine, Karol Hausman, Chelsea Finn
Affiliations: Stanford University; Google Brain; UC Berkeley
Link: https://arxiv.org/abs/2107.12931
Abstract: Reinforcement learning (RL) promises to enable autonomous acquisition of complex behaviors for diverse agents. However, the success of current reinforcement learning algorithms is predicated on an often under-emphasised requirement -- each trial needs to start from a fixed initial state distribution. Unfortunately, resetting the environment to its initial state after each trial requires a substantial amount of human supervision and extensive instrumentation of the environment, which defeats the purpose of autonomous reinforcement learning. In this work, we propose Value-accelerated Persistent Reinforcement Learning (VaPRL), which generates a curriculum of initial states such that the agent can bootstrap on the success of easier tasks to efficiently learn harder tasks. The agent also learns to reach the initial states proposed by the curriculum, minimizing the reliance on human interventions during learning. We observe that VaPRL reduces the interventions required by three orders of magnitude compared to episodic RL while outperforming prior state-of-the-art methods for reset-free RL both in terms of sample efficiency and asymptotic performance on a variety of simulated robotics problems.
【3】 Learning Local Recurrent Models for Human Mesh Recovery
Authors: Runze Li, Srikrishna Karanam, Ren Li, Terrence Chen, Bir Bhanu, Ziyan Wu
Affiliations: United Imaging Intelligence, Cambridge MA, USA; University of California Riverside, Riverside CA, USA
Comments: 10 pages, 6 figures, 2 tables
Link: https://arxiv.org/abs/2107.12847
Abstract: We consider the problem of estimating frame-level full human body meshes given a video of a person with natural motion dynamics. While much progress in this field has been in single image-based mesh estimation, there has been a recent uptick in efforts to infer mesh dynamics from video given its role in alleviating issues such as depth ambiguity and occlusions. However, a key limitation of existing work is the assumption that all the observed motion dynamics can be modeled using one dynamical/recurrent model. While this may work well in cases with relatively simplistic dynamics, inference with in-the-wild videos presents many challenges. In particular, it is typically the case that different body parts of a person undergo different dynamics in the video, e.g., legs may move in a way that is dynamically different from hands (e.g., a person dancing). To address these issues, we present a new method for video mesh recovery that divides the human mesh into several local parts following the standard skeletal model. We then model the dynamics of each local part with separate recurrent models, with each model conditioned appropriately based on the known kinematic structure of the human body.
This results in a structure-informed local recurrent learning architecture that can be trained in an end-to-end fashion with available annotations. We conduct a variety of experiments on standard video mesh recovery benchmark datasets such as Human3.6M, MPI-INF-3DHP, and 3DPW, demonstrating the efficacy of our design of modeling local dynamics as well as establishing state-of-the-art results based on standard evaluation metrics.
【4】 A Storytelling Robot managing Persuasive and Ethical Stances via ACT-R: an Exploratory Study
Authors: Agnese Augello, Giuseppe Città, Manuel Gentile, Antonio Lieto
Affiliations: ICAR-CNR, Italy, Palermo, ITD-CNR, Italy, Università di Torino, Dipartimento di Informatica, Italy
Link: https://arxiv.org/abs/2107.12845
Abstract: We present a storytelling robot, controlled via the ACT-R cognitive architecture, able to adopt different persuasive techniques and ethical stances while conversing about some topics concerning COVID-19. The main contribution of the paper consists in the proposal of a needs-driven model that guides and evaluates, during the dialogue, the use (if any) of persuasive techniques available in the agent's procedural memory. The portfolio of persuasive techniques tested in such a model ranges from the use of storytelling to framing techniques and rhetorical-based arguments. To the best of our knowledge, this represents the first attempt to build a persuasive agent able to integrate a mix of explicitly grounded cognitive assumptions about dialogue management, storytelling and persuasive techniques as well as ethical attitudes. The paper presents the results of an exploratory evaluation of the system on 63 participants.
【5】 Information-Theoretic Based Target Search with Multiple Agents
Authors: Minkyu Kim, Ryan Gupta, Luis Sentis
Affiliations: Department of Mechanical Engineering, The University of Texas at Austin, Department of Aerospace Engineering
Comments: 6 pages, 6 figures
Link: https://arxiv.org/abs/2107.12715
Abstract: This paper proposes an online path planning and motion generation algorithm for heterogeneous robot teams performing target search in a real-world environment. Path selection for each robot is optimized using an information-theoretic formulation and is computed sequentially for each agent. First, we generate candidate trajectories sampled from both global waypoints derived from vertical cell decomposition and local frontier points. From this set, we choose the path with maximum information gain. We demonstrate that the hierarchical sequential decision-making structure provided by the algorithm is scalable to multiple agents in a simulation setup. We also validate our framework in a real-world apartment setting using a two-robot team comprising the Unitree A1 quadruped and the Toyota HSR mobile manipulator searching for a person. The agents leverage an efficient leader-follower communication structure where only critical information is shared.
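The idea of picking, agent by agent, the candidate path with maximum information gain can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's formulation: the belief is a hypothetical per-cell probability that the target is present, the sensor is assumed perfect, and cells claimed by an earlier agent are treated as carrying no remaining uncertainty.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) belief; zero when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def information_gain(path, belief):
    """Expected entropy removed by observing each distinct cell on the path once."""
    return sum(binary_entropy(belief[cell]) for cell in set(path))


def plan_sequentially(candidates_per_agent, belief):
    """Choose one path per agent in order, discounting cells already claimed."""
    remaining = dict(belief)  # work on a copy so the caller's belief is untouched
    chosen = []
    for candidates in candidates_per_agent:
        best = max(candidates, key=lambda path: information_gain(path, remaining))
        chosen.append(best)
        for cell in best:
            remaining[cell] = 0.0  # claimed cells offer no further gain
    return chosen


# Hypothetical four-cell map and identical candidate sets for two agents.
belief = {"A": 0.5, "B": 0.5, "C": 0.1, "D": 0.9}
candidates = [["A", "B"], ["C", "D"]]
plans = plan_sequentially([candidates, candidates], belief)
```

In this example the first agent takes the most uncertain cells and the second agent is steered to the remaining ones, which is the practical effect of computing the selection sequentially.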
【6】 End-To-End Real-Time Visual Perception Framework for Construction Automation
Authors: Mohit Vohra, Ashish Kumar, Ravi Prakash, Laxmidhar Behera
Comments: The paper has been accepted as a regular paper in IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2021
Link: https://arxiv.org/abs/2107.12701
Abstract: In this work, we present a robotic solution to automate the task of wall construction. To that end, we present an end-to-end visual perception framework that can quickly detect and localize bricks in clutter. Further, we present a computationally light method of brick pose estimation that incorporates the above information. Unlike YOLO and SSD, the proposed detection network predicts a rotated box, thereby maximizing the object's region within the predicted box. In addition, precision P, recall R, and mean-average-precision (mAP) scores are reported to evaluate the proposed framework. We observed that for our task, the proposed scheme outperforms the upright bounding box detectors. Further, we deploy the proposed visual perception framework on a robotic system endowed with a UR5 robot manipulator and demonstrate that the system can successfully replicate a simplified version of the wall-building task in an autonomous mode.
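Why a rotated box covers a brick more tightly than an upright one can be checked with a little geometry. The sketch below is an illustration only (the function name and the brick dimensions are assumptions): it computes the fraction of the tightest axis-aligned box that a rotated rectangle actually fills. A perfectly fitted rotated box would have a ratio of 1 by construction.

```python
import math


def upright_box_fill_ratio(width, height, angle_rad):
    """Fraction of the tightest axis-aligned box covered by a rotated rectangle."""
    c, s = abs(math.cos(angle_rad)), abs(math.sin(angle_rad))
    box_w = width * c + height * s  # horizontal extent of the rotated rectangle
    box_h = width * s + height * c  # vertical extent of the rotated rectangle
    return (width * height) / (box_w * box_h)


# Hypothetical 2 x 1 brick face, first axis-aligned, then rotated by 45 degrees.
aligned = upright_box_fill_ratio(2.0, 1.0, 0.0)
tilted = upright_box_fill_ratio(2.0, 1.0, math.radians(45))
```

For the tilted brick, less than half of the upright box is brick; the rest is background or neighbouring bricks, which matters when bricks lie in clutter.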
【7】 Dynamic and Static Object Detection Considering Fusion Regions and Point-wise Features
Authors: Andrés Gómez, Thomas Genevois, Jerome Lussereau, Christian Laugier
Comments: 6 pages, 7 figures
Link: https://arxiv.org/abs/2107.12692
Abstract: Object detection is a critical problem for the safe interaction between autonomous vehicles and road users. Deep-learning methodologies have enabled the development of object detection approaches with better performance. However, obtaining more characteristics from the objects detected in real time remains a challenge. The main reason is that more information about the objects in the environment can improve the autonomous vehicle's capacity to face different urban situations. This paper proposes a new approach to detect static and dynamic objects in front of an autonomous vehicle. Our approach can also get other characteristics from the objects detected, like their position, velocity, and heading. We develop our proposal by fusing the environment interpretations obtained from YoloV3 and a Bayesian filter. To demonstrate our proposal's performance, we assess it on a benchmark dataset and on real-world data obtained from an autonomous platform. We compare the results with those of another approach.
【8】 The Pursuit and Evasion of Drones Attacking an Automated Turret
Authors: Daniel Biediger, Luben Popov, Aaron T. Becker
Affiliations: University of Houston
Comments: 8 pages, 10 figures, 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2021)
Link: https://arxiv.org/abs/2107.12660
Abstract: This paper investigates the pursuit-evasion problem of a defensive gun turret and one or more attacking drones. The turret must "visit" each attacking drone once, as quickly as possible, to defeat the threat. This constitutes a Shortest Hamiltonian Path (SHP) through the drones. The investigation considers situations with increasing fidelity, starting with a 2D kinematic model and progressing to a 3D dynamic model. In 2D we determine the region from which one or more drones can always reach a turret, or the region close enough to it where they can evade the turret. This provides optimal starting angles for $n$ drones around a turret and the maximum starting radius for one and two drones. We show that safety regions also exist in 3D and provide a controller so that a drone in this region can evade the pan-tilt turret. Through simulations we explore the maximum range at which $n$ drones can start and still have at least one reach the turret, and analyze the effect of turret behavior and the drones' number, starting configuration, and behaviors.
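The Shortest Hamiltonian Path view of the turret's task can be made concrete with a brute-force sketch. The simplifications here are assumptions made for illustration and are much cruder than the paper's models: the drones are frozen at fixed bearings, the turret only pans, and the cost is the total angle swept. All function names are hypothetical.

```python
import itertools
import math


def angular_distance(a, b):
    """Smallest rotation (radians) needed to turn from bearing a to bearing b."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def shortest_hamiltonian_path(start, bearings):
    """Try every visiting order and return (best_order, total_rotation)."""
    best_order, best_cost = None, math.inf
    for order in itertools.permutations(range(len(bearings))):
        cost, current = 0.0, start
        for index in order:
            cost += angular_distance(current, bearings[index])
            current = bearings[index]
        if cost < best_cost:
            best_order, best_cost = order, cost
    return best_order, best_cost


# Turret initially at 0 degrees; three static drones at hypothetical bearings.
drone_bearings = [math.radians(90), math.radians(350), math.radians(180)]
order, total = shortest_hamiltonian_path(0.0, drone_bearings)
```

Exhaustive search grows factorially with the number of drones, so it is only practical for small swarms; with moving drones the edge costs would also depend on the time of arrival.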
【9】 Topology Design and Position Analysis of a Reconfigurable Modular Hybrid-Parallel Manipulator
Authors: Rajashekhar Vachiravelu Saminathan
Affiliations: Researcher, Tentacles Robotic Foundation, Kanchipuram, Tamil Nadu, India
Comments: 11 pages, 14 figures, Accepted for IDETC/CIE 2019, the ASME 2019 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference
Link: https://arxiv.org/abs/2107.12637
Abstract: Today, manipulators are found in the automated assembly lines of industries that mass-produce products. These manipulators can be used in only one configuration, either serial or parallel. In this paper, a new module with two degrees of freedom is introduced. By connecting two and three modules in series, 4 and 6 DoF hybrid manipulators can be formed, respectively. By erecting 3 modules in parallel with some minor modifications, a 6 DoF parallel manipulator can be formed. Hence the manipulator is reconfigurable and can be used as a hybrid or parallel manipulator by disassembling and reassembling it. The topology design and the forward and inverse position analyses have been carried out for the two hybrid configurations and the parallel configuration. This manipulator can be used in industries where flexible manufacturing systems are to be deployed. The three configurations of the parallel manipulator have been experimentally demonstrated using graphical user interface (GUI) control through a computer.
【10】 VIPose: Real-time Visual-Inertial 6D Object Pose Tracking
Authors: Rundong Ge, Giuseppe Loianno
Affiliations: New York University, Tandon School of Engineering
Comments: Accepted by The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021
Link: https://arxiv.org/abs/2107.12617
Abstract: Estimating the 6D pose of objects is beneficial for robotics tasks such as transportation, autonomous navigation, and manipulation, as well as in scenarios beyond robotics like virtual and augmented reality. Compared with single-image pose estimation, pose tracking takes into account the temporal information across multiple frames to overcome possible detection inconsistencies and to improve the pose estimation efficiency. In this work, we introduce a novel Deep Neural Network (DNN) called VIPose that combines inertial and camera data to address the object pose tracking problem in real time. The key contribution is the design of a novel DNN architecture which fuses visual and inertial features to predict the objects' relative 6D pose between consecutive image frames. The overall 6D pose is then estimated by consecutively combining relative poses. Our approach shows remarkable pose estimation results for heavily occluded objects, which are well known to be very challenging for existing state-of-the-art solutions. The effectiveness of the proposed approach is validated on a new dataset called VIYCB with RGB images, IMU data, and accurate 6D pose annotations created by employing an automated labeling technique. The approach achieves accuracy comparable to state-of-the-art techniques, with the additional benefit of running in real time.
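Estimating an overall pose by consecutively combining relative poses amounts to chaining rigid transforms. The sketch below shows that chaining with plain 4x4 homogeneous matrices. It is a generic illustration, not the VIPose network: the right-multiplication convention, the yaw-only rotations, and the helper names are assumptions.

```python
import math


def matmul4(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def pose_from_yaw(yaw, tx, ty, tz):
    """Homogeneous transform: rotation about z by yaw, then translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def accumulate(initial_pose, relative_poses):
    """Chain frame-to-frame relative poses onto an initial pose."""
    pose = initial_pose
    for relative in relative_poses:
        # Assumed convention: each relative pose is expressed in the
        # previous frame, so it is applied on the right.
        pose = matmul4(pose, relative)
    return pose


identity = pose_from_yaw(0.0, 0.0, 0.0, 0.0)
step = pose_from_yaw(math.radians(45), 1.0, 0.0, 0.0)  # hypothetical per-frame motion
final_pose = accumulate(identity, [step, step])
```

One consequence of chaining is that any error in a single relative estimate is carried into every later pose, which is why the accuracy of the per-frame prediction matters so much in tracking.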
【11】 Design and Analysis of a Robotic Lizard using Five-Bar Mechanism
Authors: Rajashekhar V S, Dinakar Raj C K, Vishwesh S, Selva Perumal E, Nirmal Kumar M
Affiliations: Researcher, Tentacles Robotic Foundation, Kanchipuram, Tamil Nadu, India; Department of Mechanical Engineering, Adhiparasakthi Engineering College, Melmaruvathur
Link: https://arxiv.org/abs/2107.12614
Abstract: Legged robots are being used to explore rough terrains as they are capable of traversing gaps and obstacles. In this paper, a new mechanism is designed to replicate a robotic lizard using integrated five-bar mechanisms. There are two five-bar mechanisms, from which two more are formed by connecting the links in a particular order. The legs are attached to the links of the five-bar mechanism such that, when the mechanism is actuated, they move the robot forward. Position analysis using the vector loop approach has been carried out for the mechanism. A prototype has been built and controlled using servo motors to verify the robotic lizard mechanism.
【12】 A Neurorobotics Approach to Behaviour Selection based on Human Activity Recognition
Authors: Caetano M. Ranieri, Renan C. Moioli, Patricia A. Vargas, Roseli A. F. Romero
Affiliations: Institute of Mathematical and Computer Sciences, University of Sao Paulo, Sao Carlos, SP, Brazil; Digital Metropolis Institute, Federal University of Rio Grande do Norte, Natal, RN, Brazil; Edinburgh Centre for Robotics, Heriot-Watt University, Edinburgh, Scotland, UK
Link: https://arxiv.org/abs/2107.12540
Abstract: Behaviour selection has been an active research topic for robotics, in particular in the field of human-robot interaction. For a robot to interact effectively and autonomously with humans, the coupling between techniques for human activity recognition, based on sensing information, and robot behaviour selection, based on decision-making mechanisms, is of paramount importance. However, most approaches to date consist of deterministic associations between the recognised activities and the robot behaviours, neglecting the uncertainty inherent to sequential predictions in real-time applications. In this paper, we address this gap by presenting a neurorobotics approach based on computational models that resemble neurophysiological aspects of living beings. This neurorobotics approach was compared to a non-bioinspired, heuristics-based approach. To evaluate both approaches, a robot simulation was developed, in which a mobile robot has to accomplish tasks according to the activity being performed by the inhabitant of an intelligent home. The outcomes of each approach were evaluated according to the number of correct outcomes provided by the robot.
Results revealed that the neurorobotics approach is advantageous, especially considering the computational models based on more complex animals.
【13】 Language Grounding with 3D Objects
Authors: Jesse Thomason, Mohit Shridhar, Yonatan Bisk, Chris Paxton, Luke Zettlemoyer
Affiliations: University of Southern California; University of Washington; Carnegie Mellon University; NVIDIA
Link: https://arxiv.org/abs/2107.12514
Abstract: Seemingly simple natural language requests to a robot are generally underspecified, for example "Can you bring me the wireless mouse?" When viewing mice on the shelf, the number of buttons or presence of a wire may not be visible from certain angles or positions. Flat images of candidate mice may not provide the discriminative information needed for "wireless". The world, and objects in it, are not flat images but complex 3D shapes. If a human requests an object based on any of its basic properties, such as color, shape, or texture, robots should perform the necessary exploration to accomplish the task. In particular, while substantial effort and progress have been made on understanding explicitly visual attributes like color and category, comparatively little progress has been made on understanding language about shapes and contours. In this work, we introduce a novel reasoning task that targets both visual and non-visual language about 3D objects. Our new benchmark, ShapeNet Annotated with Referring Expressions (SNARE), requires a model to choose which of two objects is being referenced by a natural language description.
We introduce several CLIP-based models for distinguishing objects and demonstrate that while recent advances in jointly modeling vision and language are useful for robotic language understanding, it is still the case that these models are weaker at understanding the 3D nature of objects -- properties which play a key role in manipulation. In particular, we find that adding view estimation to language grounding models improves accuracy on both SNARE and when identifying objects referred to in language on a robot platform.
【14】 SpectGRASP: Robotic Grasping by Spectral Correlation
Authors: Maxime Adjigble, Cristiana de Farias, Rustam Stolkin, Naresh Marturi
Affiliations: School of Metallurgy and Materials, University of Birmingham
Comments: Accepted for 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS): September 27 - October 1, Prague, Czech Republic (Online)
Link: https://arxiv.org/abs/2107.12492
Abstract: This paper presents a spectral correlation-based method (SpectGRASP) for robotic grasping of arbitrarily shaped, unknown objects. Given a point cloud of an object, SpectGRASP extracts contact points on the object's surface matching the hand configuration. It requires neither offline training nor a priori object models. We propose a novel Binary Extended Gaussian Image (BEGI), which represents the point cloud surface normals of both object and robot fingers as signals on a 2-sphere. Spherical harmonics are then used to estimate the correlation between finger and object BEGIs. The resulting spectral correlation density function provides a similarity measure of gripper and object surface normals. This is highly efficient in that it is simultaneously evaluated at all possible finger rotations in SO(3). A set of contact points is then extracted for each finger using rotations with high correlation values. We then use our previously proposed Local Contact Moment (LoCoMo) similarity metric to sequentially rank the generated grasps such that the one with maximum likelihood is executed.
We evaluate the performance of SpectGRASP by conducting experiments with a 7-axis robot fitted with a parallel-jaw gripper, in a physics simulation environment. Obtained results indicate that the method not only can grasp individual objects, but also can successfully clear randomly organized groups of objects. The SpectGRASP method also outperforms the closest state-of-the-art method in terms of grasp generation time and grasp-efficiency.
【15】 High-Payload Online Identification and Adaptive Control for an Electrically-actuated Quadruped Robot
Authors: Bingchen Jin, Shusheng Ye, Juntong Su, Chaoyang Song, Ye Zhao, Aidong Zhang, Ning Ding, Jianwen Luo
Affiliations: The Chinese University of Hong Kong (CUHK); Song is with the Southern University of Science and Technology (SUSTech)
Link: https://arxiv.org/abs/2107.12482
Abstract: Quadruped robots show great potential to traverse rough terrains with payload. Numerous traditional control methods for legged dynamic locomotion are model-based and exhibit high sensitivity to model uncertainties and payload variations. Therefore, high-performance model parameter estimation becomes indispensable. However, the inertia parameters of the payload are usually unknown and dynamically changing when the quadruped robot is deployed in versatile tasks. To address this problem, online identification of the inertia parameters and the Center of Mass (CoM) position of the payload for quadruped robots has drawn increasing interest. This study presents an adaptive controller based on online payload identification for quadruped locomotion with high payload capacity (the ratio between payload and the robot's self-weight). We name it Adaptive Controller for Quadruped Locomotion (ACQL); it consists of a recursive update law and a control law. ACQL estimates the external forces and torques induced by the payload online. The estimate is incorporated into inverse-dynamics-based Quadratic Programming (QP) to realize a trotting gait. As such, the tracking accuracy of the robot's CoM and orientation trajectories is improved. The proposed method, ACQL, is verified on a real quadruped robot platform.
Experiments demonstrate effective estimation for payloads weighing from 20 kg to 75 kg and loaded at different locations on the robot's torso.
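To illustrate what a recursive online estimate of a payload parameter can look like, the sketch below uses scalar recursive least squares, a standard textbook estimator swapped in here for illustration. It is not the ACQL update law, and the measurement model (vertical force equals mass times vertical acceleration plus gravity), the noiseless data, and all names are assumptions.

```python
GRAVITY = 9.81  # m/s^2


def rls_mass_estimate(accelerations, forces, initial_mass=0.0, initial_cov=1e6):
    """Recursively fit mass m in the assumed model: force = m * (accel + g)."""
    mass, cov = initial_mass, initial_cov
    for accel, force in zip(accelerations, forces):
        phi = accel + GRAVITY                        # regressor for this sample
        gain = cov * phi / (1.0 + phi * cov * phi)   # scalar RLS gain
        mass += gain * (force - phi * mass)          # correct by prediction error
        cov = (1.0 - gain * phi) * cov               # shrink the uncertainty
    return mass


# Hypothetical noiseless samples generated with a 40 kg payload.
true_mass = 40.0
vertical_accels = [0.0, 0.5, -0.3, 0.2]
vertical_forces = [true_mass * (a + GRAVITY) for a in vertical_accels]
estimated_mass = rls_mass_estimate(vertical_accels, vertical_forces)
```

Each new sample only updates two scalars, so the estimate can be refreshed at control rate; with noisy measurements the estimate would converge more gradually than in this idealized example.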
【16】 Terrain-perception-free Quadrupedal Spinning Locomotion on Versatile Terrains: Modeling, Analysis, and Experimental Validation
Authors: Hongwu Zhu, Dong Wang, Ganyu Deng, Nathan Boyd, Ziyi Zhou, Lecheng Ruan, Caiming Sun, Aidong Zhang, Ning Ding, Ye Zhao, Jianwen Luo
Affiliations: The Chinese University of Hong Kong (CUHK); Ruan is with Beijing Institute of General Artificial Intelligence (BIGAI)
Link: https://arxiv.org/abs/2107.12479
Abstract: Dynamic quadrupedal locomotion over rough terrains, despite remarkable progress over the last few decades, remains a challenging task. Small-scale quadruped robots are adequately flexible and adaptable to traverse numerous uneven terrains, such as slopes and stairs, while moving along their sagittal direction. However, spinning behaviors on uneven terrain often exhibit position drifts. Motivated by this problem, this study presents an algorithmic method to enable accurate spinning motions over uneven terrain and constrain the spinning radius of the Center of Mass (CoM) to a small range so as to minimize the risk of drift. A modified spherical foot kinematics representation is proposed to improve the foot kinematic model and rolling dynamics of the quadruped during locomotion. A CoM planner is proposed to generate stable spinning motion based on projected stability margins. Accurate motion tracking is accomplished with a Linear Quadratic Regulator (LQR) to bound the position drift during the spinning movement. Experiments are conducted on a small-scale quadruped robot, and the effectiveness of the proposed method is verified on versatile terrains including flat ground, stairs, and slopes.
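How an LQR bounds a drifting quantity can be seen on a one-dimensional toy problem. The sketch below is not the paper's controller: the scalar model x[k+1] = a*x[k] + b*u[k], the cost weights, and the names are assumptions chosen so that the Riccati recursion fits in a few lines.

```python
def scalar_lqr(a, b, q, r, iterations=200):
    """Iterate the scalar discrete-time Riccati equation to a fixed point.

    Returns (p, k): the cost-to-go coefficient and the feedback gain
    for the control law u = -k * x.
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return p, k


def simulate(a, b, k, x0, steps):
    """Roll the closed loop x[k+1] = (a - b*k) * x[k] forward."""
    x = x0
    for _ in range(steps):
        x = a * x + b * (-k * x)
    return x


# Hypothetical drift model: an integrator with equal state and input weights.
p, k = scalar_lqr(a=1.0, b=1.0, q=1.0, r=1.0)
final_drift = simulate(1.0, 1.0, k, x0=1.0, steps=10)
```

Raising q relative to r penalizes drift more heavily and yields a larger gain, at the price of more aggressive control effort; the same trade-off applies in the multi-dimensional case.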