vpp Buffer Metadata

2023-03-07 17:10:27

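This article is my personal translation of the official vpp developer documentation. Corrections to anything mistranslated are welcome. Original: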

https://fd.io/docs/vpp/master/gettingstarted/developers/metadata.html#buffer-metadata-extensions

Buffer Metadata

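Each vlib_buffer_t (packet buffer) carries buffer metadata that describes the current packet-processing state. The underlying technique has been used for decades across many packet-processing environments.

We will look at how vpp uses buffer metadata in some detail, but anyone who needs to modify or extend the scheme should expect to do a certain amount of code inspection.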

Vlib (Vector library) primary buffer metadata

The first 64 bytes of each vlib_buffer_t carry the primary buffer metadata. See …/src/vlib/buffer.h for full details.

Important fields:

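  • i16 current_data: the signed offset in data[] / pre_data[] that is currently being processed. If negative, the current header points into the pre-data (rewrite space) area.
  • u16 current_length: number of bytes between current_data and the end of the data in this buffer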
  • u32 flags: buffer flag bits. Heavily used; many bits are already taken.
    • src/vlib/buffer.h flag bits
      • VLIB_BUFFER_IS_TRACED: buffer is traced
      • VLIB_BUFFER_NEXT_PRESENT: buffer has multiple chunks
      • VLIB_BUFFER_TOTAL_LENGTH_VALID: total_length_not_including_first_buffer is valid (see below)
    • src/vnet/buffer.h flag bits
      • VNET_BUFFER_F_L4_CHECKSUM_COMPUTED: tcp/udp checksum has been computed
      • VNET_BUFFER_F_L4_CHECKSUM_CORRECT: tcp/udp checksum is correct
      • VNET_BUFFER_F_VLAN_2_DEEP: two vlan tags present
      • VNET_BUFFER_F_VLAN_1_DEEP: one vlan tag present
      • VNET_BUFFER_F_SPAN_CLONE: packet has already been cloned (span feature)
      • VNET_BUFFER_F_LOOP_COUNTER_VALID: packet look-up loop count valid
      • VNET_BUFFER_F_LOCALLY_ORIGINATED: packet built by vpp
      • VNET_BUFFER_F_IS_IP4: packet is ipv4, for checksum offload
      • VNET_BUFFER_F_IS_IP6: packet is ipv6, for checksum offload
      • VNET_BUFFER_F_OFFLOAD_IP_CKSUM: hardware ip checksum offload requested
      • VNET_BUFFER_F_OFFLOAD_TCP_CKSUM: hardware tcp checksum offload requested
      • VNET_BUFFER_F_OFFLOAD_UDP_CKSUM: hardware udp checksum offload requested
      • VNET_BUFFER_F_IS_NATED: natted packet, skip input checks
      • VNET_BUFFER_F_L2_HDR_OFFSET_VALID: L2 header offset valid
      • VNET_BUFFER_F_L3_HDR_OFFSET_VALID: L3 header offset valid
      • VNET_BUFFER_F_L4_HDR_OFFSET_VALID: L4 header offset valid
      • VNET_BUFFER_F_FLOW_REPORT: packet is an ipfix packet
      • VNET_BUFFER_F_IS_DVR: packet to be reinjected into the l2 output path
      • VNET_BUFFER_F_QOS_DATA_VALID: QoS data valid in vnet_buffer_opaque2
      • VNET_BUFFER_F_GSO: generic segmentation offload requested
      • VNET_BUFFER_F_AVAIL1: available bit
      • VNET_BUFFER_F_AVAIL2: available bit
      • VNET_BUFFER_F_AVAIL3: available bit
      • VNET_BUFFER_F_AVAIL4: available bit
      • VNET_BUFFER_F_AVAIL5: available bit
      • VNET_BUFFER_F_AVAIL6: available bit
      • VNET_BUFFER_F_AVAIL7: available bit
  • u32 flow_id: generic flow identifier. Currently used mainly by the NIC FDIR feature, and also by UDPI and vxlan offload.
  • u8 ref_count: buffer reference / clone count (e.g. for span replication)
  • u8 buffer_pool_index: index of the buffer pool which owns this buffer. In practice it mostly corresponds to the NUMA node id.
  • vlib_error_t (u16) error: error code for buffers enqueued to the error handler. This is what the show error output is built from, although it is easy to overlook in day-to-day coding.
  • u32 next_buffer: buffer index of the next buffer in the chain. Only valid if VLIB_BUFFER_NEXT_PRESENT is set.
  • union
    • u32 current_config_index: current index on the feature arc; used to obtain the next node (next0) on the arc
    • u32 punt_reason: reason code once packet punted. Mutually exclusive with current_config_index
  • u32 opaque[10]: primary vnet-layer opaque data (see below)
  • END of first cache line / data initialized by the buffer allocator
  • u32 trace_index: buffer’s index in the packet trace subsystem
  • u32 total_length_not_including_first_buffer: see VLIB_BUFFER_TOTAL_LENGTH_VALID above
  • u32 opaque2[14]: secondary vnet-layer opaque data (see below)
  • u8 pre_data[VLIB_BUFFER_PRE_DATA_SIZE]: rewrite space, often used to prepend tunnel encapsulations
  • u8 data[0]: buffer data received from the wire. Ordinarily, hardware devices use b->data[0] as the DMA target, but there are exceptions. Do not write code which blindly assumes that packet data starts at b->data[0]. Use vlib_buffer_get_current(…).
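
The relationship between current_data, pre_data[] and data[] can be sketched with a stand-in struct. This is a simplified model, not the real vlib_buffer_t: mock_buffer_t, the MOCK_* sizes, mock_get_current, mock_advance and mock_prepend are hypothetical names that only imitate what vlib_buffer_get_current(…) and a signed advance of the current header do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sizes, chosen for this sketch only. */
#define MOCK_PRE_DATA_SIZE 128
#define MOCK_DATA_SIZE 256

/* Simplified stand-in for vlib_buffer_t: only the fields discussed above. */
typedef struct
{
  int16_t current_data;    /* signed offset relative to data[0] */
  uint16_t current_length; /* bytes from current_data to the end of data */
  uint8_t pre_data[MOCK_PRE_DATA_SIZE]; /* rewrite space, ends at data[0] */
  uint8_t data[MOCK_DATA_SIZE];         /* packet bytes */
} mock_buffer_t;

/* Model of vlib_buffer_get_current(): data[0] plus the signed offset. */
static uint8_t *
mock_get_current (mock_buffer_t *b)
{
  assert (b->current_data >= -MOCK_PRE_DATA_SIZE);
  assert (b->current_data + b->current_length <= MOCK_DATA_SIZE);
  return (uint8_t *) b + offsetof (mock_buffer_t, data) + b->current_data;
}

/* Move the current header by n bytes: n > 0 strips, n < 0 prepends. */
static void
mock_advance (mock_buffer_t *b, int16_t n)
{
  b->current_data = (int16_t) (b->current_data + n);
  b->current_length = (uint16_t) (b->current_length - n);
}

/* Prepend hdr_len bytes of encapsulation in front of the current header. */
static uint8_t *
mock_prepend (mock_buffer_t *b, const void *hdr, uint16_t hdr_len)
{
  mock_advance (b, (int16_t) -hdr_len);
  uint8_t *p = mock_get_current (b);
  memcpy (p, hdr, hdr_len);
  return p;
}
```

Prepending an 8-byte header to a buffer whose current_data is 0 leaves current_data at -8, so the current header now lives in the last 8 bytes of pre_data[]. This is why code should ask for the current pointer instead of reading b->data[0] directly.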

Vnet (network stack) primary buffer metadata

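The vnet primary buffer metadata occupies the space reserved in the vlib opaque field shown above. Its type is vnet_buffer_opaque_t, and it is normally accessed with the vnet_buffer(b) macro. See ../src/vnet/buffer.h for details.

Important fields: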

  • u32 sw_if_index[2]: RX and TX interface handles. At the ip lookup stage, vnet_buffer(b)->sw_if_index[VLIB_TX] is interpreted as a FIB index.
  • i16 l2_hdr_offset: offset from b->data[0] of the packet L2 header. Valid only if b->flags & VNET_BUFFER_F_L2_HDR_OFFSET_VALID is set
  • i16 l3_hdr_offset: offset from b->data[0] of the packet L3 header. Valid only if b->flags & VNET_BUFFER_F_L3_HDR_OFFSET_VALID is set
  • i16 l4_hdr_offset: offset from b->data[0] of the packet L4 header. Valid only if b->flags & VNET_BUFFER_F_L4_HDR_OFFSET_VALID is set
  • u8 feature_arc_index: feature arc that the packet is currently traversing
  • union
    • IP fields
      • u32 adj_index[2]: adjacency from dest IP lookup in [VLIB_TX], adjacency from source IP lookup in [VLIB_RX], set to ~0 until source lookup done
      • union
        • generic fields
        • ICMP fields
        • reassembly fields
    • mpls fields
    • l2 bridging fields, only valid in the L2 path
    • l2tpv3 fields
    • l2 classify fields
    • vnet policer fields
    • MAP fields
    • MAP-T fields
    • ip fragmentation fields
    • COP (whitelist/blacklist filter) fields
    • LISP fields
    • TCP fields
      • connection index
      • sequence numbers
      • header and data offsets
      • data length
      • flags
    • SCTP fields
    • NAT fields
    • u32 unused[6]
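
The two ideas in this list, a typed view laid over the opaque words and header offsets that are meaningful only when the matching flag is set, can be sketched as follows. All names prefixed with mock_ or MOCK_ are hypothetical stand-ins. The real layout and flag values live in src/vnet/buffer.h, and the field named pad exists only to round the fixed part of this sketch to 16 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position; the real value is defined by vpp. */
#define MOCK_F_L3_HDR_OFFSET_VALID (1u << 0)

enum { MOCK_RX = 0, MOCK_TX = 1 }; /* stand-ins for VLIB_RX / VLIB_TX */

/* Stand-in for the vlib layer: flags, ten opaque words, packet bytes. */
typedef struct
{
  uint32_t flags;
  uint32_t opaque[10];
  uint8_t data[128];
} mock_buffer_t;

/* Stand-in for vnet_buffer_opaque_t: a typed view of opaque[]. */
typedef struct
{
  uint32_t sw_if_index[2];
  int16_t l2_hdr_offset;
  int16_t l3_hdr_offset;
  int16_t l4_hdr_offset;
  uint8_t feature_arc_index;
  uint8_t pad;        /* sketch-only padding */
  uint32_t unused[6]; /* stands in for the per-protocol union */
} mock_vnet_opaque_t;

/* The view must never be larger than the space it is laid over. */
_Static_assert (sizeof (mock_vnet_opaque_t)
                  <= sizeof (((mock_buffer_t *) 0)->opaque),
                "typed view does not fit in opaque[]");

/* Same shape as vnet_buffer(b): cast the opaque words to the typed view. */
#define mock_vnet_buffer(b) ((mock_vnet_opaque_t *) (b)->opaque)

/* Record the L3 offset and mark it valid in one place. */
static void
mock_set_l3_offset (mock_buffer_t *b, int16_t off)
{
  mock_vnet_buffer (b)->l3_hdr_offset = off;
  b->flags |= MOCK_F_L3_HDR_OFFSET_VALID;
}

/* Return the L3 header, or NULL when the offset was never recorded. */
static uint8_t *
mock_l3_header (mock_buffer_t *b)
{
  if (!(b->flags & MOCK_F_L3_HDR_OFFSET_VALID))
    return NULL;
  return b->data + mock_vnet_buffer (b)->l3_hdr_offset;
}
```

Setting the offset and the flag together in one helper keeps a reader from ever seeing a stale l3_hdr_offset that happens to look plausible.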

Vnet (network stack) secondary buffer metadata

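The vnet secondary buffer metadata occupies the space reserved in the vlib opaque2 field shown above. Its type is vnet_buffer_opaque2_t, and it is normally accessed with the vnet_buffer2(b) macro. See ../src/vnet/buffer.h for details.

Important fields: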

  • qos fields
    • u8 bits
    • u8 source
  • u8 loop_counter: used to detect and report internal forwarding loops
  • group-based policy fields
    • u8 flags
    • u16 sclass: the packet’s source class
  • u16 gso_size: L4 payload size, persists all the way to interface-output in case GSO is not enabled
  • u16 gso_l4_hdr_sz: size of the L4 protocol header
  • union
    • packet trajectory tracer (largely deprecated)
      • u16 *trajectory_trace; only #if VLIB_BUFFER_TRACE_TRAJECTORY > 0. Records the node index of each node that handles the packet along its forwarding path, which is very useful for tracking down buffer leaks and other anomalies. It may misbehave when buffers are copied or cloned; pre_data can be borrowed to hold the trace instead.
    • packet generator
      • u64 pg_replay_timestamp: timestamp for replayed pcap trace packets
    • u32 unused[8]
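
How loop_counter and VNET_BUFFER_F_LOOP_COUNTER_VALID work together can be illustrated with a small model. This is a sketch under stated assumptions, not vpp's code: mock_buffer_t, MOCK_F_LOOP_COUNTER_VALID, MOCK_MAX_LOOKUPS and mock_note_lookup are hypothetical, and the limit of 4 is an arbitrary value picked for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MOCK_F_LOOP_COUNTER_VALID (1u << 0) /* hypothetical bit position */
#define MOCK_MAX_LOOKUPS 4                  /* hypothetical limit */

/* Stand-in buffer: flags plus the one secondary-metadata field we need. */
typedef struct
{
  uint32_t flags;
  uint8_t loop_counter; /* stands in for vnet_buffer2(b)->loop_counter */
} mock_buffer_t;

/* Count one more lookup; return true when the packet looks like a loop. */
static bool
mock_note_lookup (mock_buffer_t *b)
{
  if (!(b->flags & MOCK_F_LOOP_COUNTER_VALID))
    {
      /* Without the flag the field holds leftover data, so reset it. */
      b->loop_counter = 0;
      b->flags |= MOCK_F_LOOP_COUNTER_VALID;
    }
  b->loop_counter++;
  return b->loop_counter > MOCK_MAX_LOOKUPS;
}
```

The point of the valid flag is that opaque2 is not cleared for every packet, so a field must be initialized on first use by whichever node needs it.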

Buffer Metadata Extensions

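Plugin developers may wish to extend either the primary or the secondary vnet buffer opaque union. Before doing so, perform a manual live-variable analysis; otherwise nodes that share the same buffer metadata space may break one another.

It is not appropriate to add plugin or proprietary metadata to the core vpp header files. Instead, proceed as follows. The example uses the vnet primary buffer opaque union vnet_buffer_opaque_t. Using the vnet secondary buffer opaque union vnet_buffer_opaque2_t is a very simple variation.

In the plugin header file: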

    /* Add arbitrary buffer metadata */
    #include <vnet/buffer.h>

    typedef struct
    {
      u32 my_stuff[6];
    } my_buffer_opaque_t;

    STATIC_ASSERT (sizeof (my_buffer_opaque_t) <=
                   STRUCT_SIZE_OF (vnet_buffer_opaque_t, unused),
                   "Custom meta-data too large for vnet_buffer_opaque_t");

    #define my_buffer_opaque(b) \
      ((my_buffer_opaque_t *)((u8 *)((b)->opaque) + STRUCT_OFFSET_OF (vnet_buffer_opaque_t, unused)))

To set data in the custom buffer opaque type given a vlib_buffer_t *b:

    my_buffer_opaque (b)->my_stuff[2] = 123;

To read data from the custom buffer opaque type:

    u32 stuff0 = my_buffer_opaque (b)->my_stuff[2];
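
To see which opaque words the macro actually touches, the same pattern can be reproduced outside vpp with stand-in types. In this sketch mock_buffer_t and mock_vnet_opaque_t are hypothetical simplifications (the real vnet_buffer_opaque_t has more fields ahead of unused), and the C11 _Static_assert and offsetof take the place of vpp's STATIC_ASSERT, STRUCT_SIZE_OF and STRUCT_OFFSET_OF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for vlib_buffer_t: only the ten primary opaque words. */
typedef struct
{
  uint32_t opaque[10];
} mock_buffer_t;

/* Stand-in for vnet_buffer_opaque_t: 16 bytes in use, then unused[6]. */
typedef struct
{
  uint32_t sw_if_index[2];
  uint32_t other[2]; /* sketch-only placeholder for the remaining fields */
  uint32_t unused[6];
} mock_vnet_opaque_t;

/* Plugin-private metadata, as in the article. */
typedef struct
{
  uint32_t my_stuff[6];
} my_buffer_opaque_t;

/* Compile-time guard: the custom struct must fit inside unused[]. */
_Static_assert (sizeof (my_buffer_opaque_t)
                  <= sizeof (((mock_vnet_opaque_t *) 0)->unused),
                "Custom meta-data too large for the unused area");

/* Point at the unused tail of the opaque area, viewed as our own type. */
#define my_buffer_opaque(b)                                                  \
  ((my_buffer_opaque_t *) ((uint8_t *) ((b)->opaque)                         \
                           + offsetof (mock_vnet_opaque_t, unused)))
```

With this layout unused starts at byte 16, so my_stuff[2] lands in opaque[6] and the words used by the core fields (opaque[0] to opaque[3]) are left alone. If my_stuff grew to seven words, the static assertion would stop the build instead of letting the plugin overwrite the next field.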
