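This article is a personal translation of the official vpp developer documentation; corrections to any mistranslations are welcome. Original: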
https://fd.io/docs/vpp/master/gettingstarted/developers/metadata.html#buffer-metadata-extensions
Buffer Metadata
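Each vlib_buffer_t (packet buffer) carries buffer metadata which describes the current packet-processing state. The underlying technique has been used for decades across many packet-processing environments.
We will examine vpp buffer metadata in some detail, but anyone who needs to modify or extend the scheme should expect to do a certain amount of code inspection.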
Vlib (Vector library) primary buffer metadata
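The first 64 bytes of each vlib_buffer_t carry the primary buffer metadata. See …/src/vlib/buffer.h for full details.
Important fields:
- i16 current_data: the signed offset in data[] / pre_data[] that is currently being processed. If negative, the current header points into the pre-data (rewrite space) area.
- u16 current_length: number of bytes between current_data and the end of the data in this buffer
- u32 flags: buffer flag bits. Many bits are already in use.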
- VNET_BUFFER_F_L4_CHECKSUM_COMPUTED: tcp/udp checksum has been computed
- VNET_BUFFER_F_L4_CHECKSUM_CORRECT: tcp/udp checksum is correct
- VNET_BUFFER_F_VLAN_2_DEEP: two vlan tags present
- VNET_BUFFER_F_VLAN_1_DEEP: one vlan tag present
- VNET_BUFFER_F_SPAN_CLONE: packet has already been cloned (span feature)
- VNET_BUFFER_F_LOOP_COUNTER_VALID: packet look-up loop count valid
- VNET_BUFFER_F_LOCALLY_ORIGINATED: packet built by vpp
- VNET_BUFFER_F_IS_IP4: packet is ipv4, for checksum offload
- VNET_BUFFER_F_IS_IP6: packet is ipv6, for checksum offload
- VNET_BUFFER_F_OFFLOAD_IP_CKSUM: hardware ip checksum offload requested
- VNET_BUFFER_F_OFFLOAD_TCP_CKSUM: hardware tcp checksum offload requested
- VNET_BUFFER_F_OFFLOAD_UDP_CKSUM: hardware udp checksum offload requested
- VNET_BUFFER_F_IS_NATED: natted packet, skip input checks
- VNET_BUFFER_F_L2_HDR_OFFSET_VALID: L2 header offset valid
- VNET_BUFFER_F_L3_HDR_OFFSET_VALID: L3 header offset valid
- VNET_BUFFER_F_L4_HDR_OFFSET_VALID: L4 header offset valid
- VNET_BUFFER_F_FLOW_REPORT: packet is an ipfix packet
- VNET_BUFFER_F_IS_DVR: packet to be reinjected into the l2 output path
- VNET_BUFFER_F_QOS_DATA_VALID: QoS data valid in vnet_buffer_opaque2
- VNET_BUFFER_F_GSO: generic segmentation offload requested
- VNET_BUFFER_F_AVAIL1: available bit
- VNET_BUFFER_F_AVAIL2: available bit
- VNET_BUFFER_F_AVAIL3: available bit
- VNET_BUFFER_F_AVAIL4: available bit
- VNET_BUFFER_F_AVAIL5: available bit
- VNET_BUFFER_F_AVAIL6: available bit
- VNET_BUFFER_F_AVAIL7: available bit
- VLIB_BUFFER_IS_TRACED: buffer is traced
- VLIB_BUFFER_NEXT_PRESENT: buffer has multiple chunks
- VLIB_BUFFER_TOTAL_LENGTH_VALID: total_length_not_including_first_buffer is valid (see below)
- The VLIB_BUFFER_* flag bits above are defined in src/vlib/buffer.h; the VNET_BUFFER_F_* flag bits are defined in src/vnet/buffer.h.
- u32 flow_id: generic flow identifier. Currently used mainly by the NIC FDIR feature; also used by UDPI and vxlan offload.
- u8 ref_count: buffer reference / clone count (e.g. for span replication)
- u8 buffer_pool_index: index of the buffer pool which owns this buffer. This mostly corresponds to the NUMA node id.
- vlib_error_t (u16) error: error code for buffers enqueued to the error handler. This is mainly what "show error" displays, and it is easy to overlook in day-to-day coding.
- u32 next_buffer: buffer index of the next buffer in the chain. Only valid if VLIB_BUFFER_NEXT_PRESENT is set
- union
- u32 current_config_index: current index on the feature arc; used to look up the feature's next0
- u32 punt_reason: reason code once packet punted. Mutually exclusive with current_config_index
- u32 opaque[10]: primary vnet-layer opaque data (see below)
- END of first cache line / data initialized by the buffer allocator
- u32 trace_index: buffer’s index in the packet trace subsystem
- u32 total_length_not_including_first_buffer: see VLIB_BUFFER_TOTAL_LENGTH_VALID above
- u32 opaque2[14]: secondary vnet-layer opaque data (see below)
- u8 pre_data[VLIB_BUFFER_PRE_DATA_SIZE]: rewrite space, often used to prepend tunnel encapsulations
- u8 data[0]: buffer data received from the wire. Ordinarily, hardware devices use b->data[0] as the DMA target, but there are exceptions. Do not write code which blindly assumes that packet data starts in b->data[0]. Use vlib_buffer_get_current(…).
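The relationship between current_data, current_length, pre_data[] and data[] can be sketched as follows. This is a simplified stand-in, not the real vlib_buffer_t: the type mock_buffer_t, the two sizes, and the helpers mock_get_current / mock_advance are assumptions made for illustration, loosely modeled on what vlib_buffer_get_current(…) and the buffer-advance helper in src/vlib/buffer.h are understood to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_PRE_DATA_SIZE 128 /* stand-in for VLIB_BUFFER_PRE_DATA_SIZE */
#define MOCK_DATA_SIZE 2048    /* hypothetical payload capacity */

/* Simplified stand-in for vlib_buffer_t: only the fields discussed above. */
typedef struct
{
  int16_t current_data;    /* signed offset relative to data[] */
  uint16_t current_length; /* bytes from current_data to end of buffer data */
  uint8_t pre_data[MOCK_PRE_DATA_SIZE]; /* rewrite space, right before data[] */
  uint8_t data[MOCK_DATA_SIZE];
} mock_buffer_t;

/* The negative-offset trick only works if pre_data[] ends where data[] begins. */
_Static_assert (offsetof (mock_buffer_t, data) ==
                  offsetof (mock_buffer_t, pre_data) + MOCK_PRE_DATA_SIZE,
                "pre_data must immediately precede data");

/* Return a pointer to the header currently being processed. */
static inline void *
mock_get_current (mock_buffer_t *b)
{
  assert (b->current_data >= -MOCK_PRE_DATA_SIZE);
  return (uint8_t *) b + offsetof (mock_buffer_t, data) + b->current_data;
}

/* Move the current header by l bytes: positive strips a header,
   negative makes room in front of the packet (e.g. for an encapsulation). */
static inline void
mock_advance (mock_buffer_t *b, int l)
{
  assert (l <= (int) b->current_length);
  assert (b->current_data + l >= -MOCK_PRE_DATA_SIZE);
  b->current_data = (int16_t) (b->current_data + l);
  b->current_length = (uint16_t) (b->current_length - l);
}
```

Advancing by a negative amount from current_data == 0 leaves current_data negative, which is exactly the "points into pre-data" case described for the current_data field.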
Vnet (network stack) primary buffer metadata
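Vnet primary buffer metadata occupies the space reserved in the vlib opaque field shown above, and has the type name vnet_buffer_opaque_t. It is ordinarily accessed using the vnet_buffer(b) macro. See ../src/vnet/buffer.h for full details.
Important fields: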
- u32 sw_if_index[2]: RX and TX interface handles. At the ip lookup stage, vnet_buffer(b)->sw_if_index[VLIB_TX] is interpreted as a FIB index.
- i16 l2_hdr_offset: offset from b->data[0] of the packet L2 header. Valid only if b->flags & VNET_BUFFER_F_L2_HDR_OFFSET_VALID is set
- i16 l3_hdr_offset: offset from b->data[0] of the packet L3 header. Valid only if b->flags & VNET_BUFFER_F_L3_HDR_OFFSET_VALID is set
- i16 l4_hdr_offset: offset from b->data[0] of the packet L4 header. Valid only if b->flags & VNET_BUFFER_F_L4_HDR_OFFSET_VALID is set
- u8 feature_arc_index: feature arc that the packet is currently traversing
- union
- connection index
- sequence numbers
- header and data offsets
- data length
- flags
- u32 adj_index[2]: adjacency from dest IP lookup in [VLIB_TX], adjacency from source ip lookup in [VLIB_RX], set to ~0 until source lookup done
- union
- generic fields
- ICMP fields
- reassembly fields
- ip
- mpls fields
- l2 bridging fields, only valid in the L2 path
- l2tpv3 fields
- l2 classify fields
- vnet policer fields
- MAP fields
- MAP-T fields
- ip fragmentation fields
- COP (whitelist/blacklist filter) fields
- LISP fields
- TCP fields
- SCTP fields
- NAT fields
- u32 unused[6]
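As a rough illustration of how the vnet layer overlays its own structure on the vlib opaque words, here is a minimal sketch. Everything prefixed mock_ or MOCK_ is hypothetical: the trimmed-down structs, the flag bit position, and the helper mock_l3_header are stand-ins, and only the cast-based overlay mirrors the way vnet_buffer(b) reinterprets b->opaque.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MOCK_RX = 0, MOCK_TX = 1 };           /* stand-ins for VLIB_RX / VLIB_TX */
#define MOCK_F_L3_HDR_OFFSET_VALID (1u << 0) /* hypothetical bit position */

/* Trimmed stand-in for vlib_buffer_t. */
typedef struct
{
  uint32_t flags;
  uint32_t opaque[10]; /* owned by the vnet layer */
  uint8_t data[256];
} mock_buffer_t;

/* Trimmed stand-in for vnet_buffer_opaque_t. */
typedef struct
{
  uint32_t sw_if_index[2]; /* RX and TX interface handles */
  int16_t l2_hdr_offset;   /* offsets are relative to b->data[0] */
  int16_t l3_hdr_offset;
  int16_t l4_hdr_offset;
  uint8_t feature_arc_index;
  uint32_t unused[6];
} mock_vnet_opaque_t;

_Static_assert (sizeof (mock_vnet_opaque_t) <=
                  sizeof (((mock_buffer_t *) 0)->opaque),
                "vnet metadata must fit in the opaque words");

/* Same idea as vnet_buffer(b): reinterpret the opaque words in place. */
#define mock_vnet_buffer(b) ((mock_vnet_opaque_t *) (b)->opaque)

/* Return the L3 header, or NULL when the offset has not been marked valid. */
static inline uint8_t *
mock_l3_header (mock_buffer_t *b)
{
  assert (b != NULL);
  if (!(b->flags & MOCK_F_L3_HDR_OFFSET_VALID))
    return NULL;
  return b->data + mock_vnet_buffer (b)->l3_hdr_offset;
}
```

The point of the flag check is that the offset fields hold stale data until some node fills them in and sets the matching *_HDR_OFFSET_VALID bit.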
Vnet (network stack) secondary buffer metadata
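Vnet secondary buffer metadata occupies the space reserved in the vlib opaque2 field shown above, and has the type name vnet_buffer_opaque2_t. It is ordinarily accessed using the vnet_buffer2(b) macro. See ../src/vnet/buffer.h for full details.
Important fields: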
- qos fields
- u8 bits
- u8 source
- u8 loop_counter: used to detect and report internal forwarding loops
- group-based policy fields
- u8 flags
- u16 sclass: the packet’s source class
- u16 gso_size: L4 payload size, persists all the way to interface-output in case GSO is not enabled
- u16 gso_l4_hdr_sz: size of the L4 protocol header
- union
- u64 pg_replay_timestamp: timestamp for replayed pcap trace packets
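- u16 *trajectory_trace; only #if VLIB_BUFFER_TRACE_TRAJECTORY > 0. Records the node indices of the graph nodes that handle the packet along its forwarding path. This is very useful for tracking down buffer leaks and other anomalies, although it may misbehave when buffers are copied or cloned. pre_data could also be borrowed to store the trajectory.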
- packet trajectory tracer (largely deprecated)
- packet generator
- u32 unused[8]
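The loop_counter field only means something once VNET_BUFFER_F_LOOP_COUNTER_VALID is set, because the secondary metadata lies beyond the part of the buffer header that the allocator initializes. A minimal sketch of that "validate, then count" pattern follows. The mock_ types, the flag bit position, the threshold MOCK_MAX_LOOKUPS, and the helper mock_count_lookup are all assumptions for illustration, not vpp code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MOCK_F_LOOP_COUNTER_VALID (1u << 0) /* hypothetical bit position */
#define MOCK_MAX_LOOKUPS 4                  /* hypothetical loop threshold */

/* Trimmed stand-in for vlib_buffer_t. */
typedef struct
{
  uint32_t flags;
  uint32_t opaque2[14]; /* secondary vnet-layer opaque data */
} mock_buffer_t;

/* Trimmed stand-in for vnet_buffer_opaque2_t. */
typedef struct
{
  struct
  {
    uint8_t bits;
    uint8_t source;
  } qos;
  uint8_t loop_counter;
  uint8_t pad;
  uint32_t unused[8];
} mock_vnet_opaque2_t;

_Static_assert (sizeof (mock_vnet_opaque2_t) <=
                  sizeof (((mock_buffer_t *) 0)->opaque2),
                "secondary metadata must fit in opaque2");

/* Same idea as vnet_buffer2(b): reinterpret the opaque2 words in place. */
#define mock_vnet_buffer2(b) ((mock_vnet_opaque2_t *) (b)->opaque2)

/* Count one more lookup for this packet; return 1 if a loop is suspected. */
static inline int
mock_count_lookup (mock_buffer_t *b)
{
  assert (b != NULL);
  if (!(b->flags & MOCK_F_LOOP_COUNTER_VALID))
    {
      /* First visit: the field may hold stale data, so reset it. */
      mock_vnet_buffer2 (b)->loop_counter = 0;
      b->flags |= MOCK_F_LOOP_COUNTER_VALID;
    }
  mock_vnet_buffer2 (b)->loop_counter++;
  return mock_vnet_buffer2 (b)->loop_counter > MOCK_MAX_LOOKUPS;
}
```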
Buffer Metadata Extensions
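Plugin developers may wish to extend either the primary or the secondary vnet buffer opaque union.
It is not appropriate to patch plugin-specific or proprietary metadata into the header files of the vpp core. Instead, proceed as follows. This example concerns the vnet primary buffer opaque union vnet_buffer_opaque_t. Using the vnet secondary buffer opaque union vnet_buffer_opaque2_t is a very simple variation.
In the plugin header file: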
/* Add arbitrary buffer metadata */
#include <vnet/buffer.h>

typedef struct
{
  u32 my_stuff[6];
} my_buffer_opaque_t;

STATIC_ASSERT (sizeof (my_buffer_opaque_t) <=
               STRUCT_SIZE_OF (vnet_buffer_opaque_t, unused),
               "Custom meta-data too large for vnet_buffer_opaque_t");

/* Overlay the custom type on the unused[] words of vnet_buffer_opaque_t */
#define my_buffer_opaque(b) \
  ((my_buffer_opaque_t *)((u8 *)((b)->opaque) + \
    STRUCT_OFFSET_OF (vnet_buffer_opaque_t, unused)))
To set data in the custom buffer opaque type given a vlib_buffer_t *b:
my_buffer_opaque (b)->my_stuff[2] = 123;
To read data from the custom buffer opaque type:
stuff0 = my_buffer_opaque (b)->my_stuff[2];
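To try the overlay pattern outside a vpp build tree, the following self-contained sketch may help. It replaces the vpp pieces with stand-ins: mock_buffer_t and mock_vnet_opaque_t are hypothetical, trimmed-down layouts, and MOCK_STRUCT_OFFSET_OF / MOCK_STRUCT_SIZE_OF are assumed equivalents of the vppinfra macros built on offsetof and sizeof. Only my_buffer_opaque_t and the shape of the accessor macro come from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the vppinfra helper macros. */
#define MOCK_STRUCT_OFFSET_OF(t, f) offsetof (t, f)
#define MOCK_STRUCT_SIZE_OF(t, f) sizeof (((t *) 0)->f)

/* Trimmed stand-in for vlib_buffer_t. */
typedef struct
{
  uint32_t opaque[10];
} mock_buffer_t;

/* Trimmed stand-in for vnet_buffer_opaque_t; only unused[] matters here. */
typedef struct
{
  uint32_t sw_if_index[2];
  uint32_t other[2]; /* placeholder for the fields in between */
  uint32_t unused[6];
} mock_vnet_opaque_t;

/* Plugin-private metadata, as in the example above. */
typedef struct
{
  uint32_t my_stuff[6];
} my_buffer_opaque_t;

_Static_assert (sizeof (my_buffer_opaque_t) <=
                  MOCK_STRUCT_SIZE_OF (mock_vnet_opaque_t, unused),
                "Custom meta-data too large for the unused[] words");

/* Point at the unused[] words inside the opaque area. */
#define my_buffer_opaque(b)                                \
  ((my_buffer_opaque_t *) ((uint8_t *) ((b)->opaque) +     \
     MOCK_STRUCT_OFFSET_OF (mock_vnet_opaque_t, unused)))

/* Write then read back one word of plugin metadata. */
static inline uint32_t
my_opaque_roundtrip (mock_buffer_t *b, uint32_t value)
{
  assert (b != NULL);
  my_buffer_opaque (b)->my_stuff[2] = value;
  return my_buffer_opaque (b)->my_stuff[2];
}
```

Because the accessor adds the offset of unused[] rather than casting b->opaque directly, the plugin data cannot collide with sw_if_index[] or the other fields that core nodes rely on.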