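This article introduces the l3xc plugin, which cross-connects all traffic received on a layer-3 interface to a specified FIB path. The effect is similar to installing a default route in the same VRF, but l3xc uses less memory and less CPU per packet than forwarding through a default route.
To verify l3xc, the test environment pings a Baidu IP address:
The setup creates a tap0 interface in VPP whose other end attaches to the Linux kernel, and configures post-routing NAT (output-feature mode) on GigabitEthernet2/6/0. On the Linux side, a host route to the Baidu IP address sends traffic into VPP through tap0. VPP has no default route configured, yet with l3xc configured a ping from the kernel to the Baidu address succeeds. The configuration is as follows: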
# Set the WAN interface IP address
set interface state GigabitEthernet2/6/0 up
set interface ip address GigabitEthernet2/6/0 192.168.1.20/24
# Enable the NAT plugin, set the NAT address, and configure post-routing (output-feature) NAT.
nat44 plugin enable
nat44 add interface address GigabitEthernet2/6/0
set interface nat44 out GigabitEthernet2/6/0 output-feature
# Create tap0
create tap
set interface state tap0 up
set interface ip address tap0 192.168.100.1/24
# Configure l3xc
l3xc add tap0 via 192.168.1.1 GigabitEthernet2/6/0
# Linux kernel side configuration
ifconfig tap0 192.168.100.2/24
# Route pings to Baidu (39.156.66.14) with VPP's tap0 interface as the next hop
ip route add 39.156.66.14 via 192.168.100.1 dev tap0
Ping 39.156.66.14 from the kernel with tracing enabled in VPP. The resulting trace is:
00:09:38:564603: virtio-input
virtio: hw_if_index 2 next-index 4 vring 0 len 98
hdr: flags 0x00 gso_type 0x00 hdr_len 0 gso_size 0 csum_start 0 csum_offset 0 num_buffers 1
00:09:38:564611: ethernet-input
IP4: 02:fe:e6:d1:c6:be -> 02:fe:bc:19:ce:e8
00:09:38:564616: ip4-input
ICMP: 192.168.100.2 -> 39.156.66.14
tos 0x00, ttl 64, length 84, checksum 0xf415 dscp CS0 ecn NON_ECN
fragment id 0xb83e, flags DONT_FRAGMENT
ICMP echo_request checksum 0x6c45 id 2
00:09:38:564620: l3xc-input-ip4 # the l3xc input node
l3xc-index:0 lb-index:3
00:09:38:564623: ip4-rewrite
tx_sw_if_index 1 dpo-idx 3 : ipv4 via 192.168.1.1 GigabitEthernet2/6/0: mtu:9000 next:3 flags:[features ] f4de0cf8128000505621aac20800 flow hash: 0x00000000
00000000: f4de0cf8128000505621aac2080045000054b83e40003f01f515c0a86402279c
00000020: 420e08006c450002016d8bfa276400000000101a0800000000001011
00:09:38:564638: ip4-sv-reassembly-output-feature
[not-fragmented]
00:09:38:564640: nat-pre-in2out-output
in2out next_index 4 arc_next_index 11
00:09:38:564642: nat44-ed-in2out-output
NAT44_IN2OUT_ED_FAST_PATH: sw_if_index 2, next index 11, session 0, translation result 'success' via i2of
i2of match: saddr 192.168.100.2 sport 2 daddr 39.156.66.14 dport 2 proto ICMP fib_idx 0 rewrite: saddr 192.168.1.20 daddr 39.156.66.14 icmp-id 63327 txfib 0
o2if match: saddr 39.156.66.14 sport 63327 daddr 192.168.1.20 dport 63327 proto ICMP fib_idx 0 rewrite: daddr 192.168.100.2 icmp-id 2 txfib 0
search key local 192.168.100.2:2 remote 39.156.66.14:2 proto ICMP fib 0 thread-index 0 session-index 0
00:09:38:564649: GigabitEthernet2/6/0-output
GigabitEthernet2/6/0 flags 0x00380005
IP4: 00:50:56:21:aa:c2 -> f4:de:0c:f8:12:80
ICMP: 192.168.1.20 -> 39.156.66.14
tos 0x00, ttl 63, length 84, checksum 0x5804 dscp CS0 ecn NON_ECN
fragment id 0xb83e, flags DONT_FRAGMENT
ICMP echo_request checksum 0x74e7 id 63327
00:09:38:564651: GigabitEthernet2/6/0-tx
GigabitEthernet2/6/0 tx queue 0
buffer 0x9ee9d: current data 0, length 98, buffer-pool 0, ref-count 1, trace handle 0x2
natted l2-hdr-offset 0 l3-hdr-offset 14
PKT MBUF: port 65535, nb_segs 1, pkt_len 98
buf_len 2176, data_len 98, ol_flags 0x0, data_off 128, phys_addr 0x6abba7c0
packet_type 0x0 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0
rss 0x0 fdir.hi 0x0 fdir.lo 0x0
IP4: 00:50:56:21:aa:c2 -> f4:de:0c:f8:12:80
ICMP: 192.168.1.20 -> 39.156.66.14
tos 0x00, ttl 63, length 84, checksum 0x5804 dscp CS0 ecn NON_ECN
fragment id 0xb83e, flags DONT_FRAGMENT
ICMP echo_request checksum 0x74e7 id 63327
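The data structures behind l3xc are also quite simple: when the l3xc configuration is applied, the route paths passed in are used to build the DPO information that the forwarding node needs. I am not familiar with the routing module, so I cannot walk through the l3xc configuration path in detail; the code below can be read directly.
 {
/* Allocate a new l3xc entry from the l3xc pool.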
* create a new x-connect
*/
pool_get_aligned_zero (l3xc_pool, l3xc, CLIB_CACHE_LINE_BYTES);
l3xci = l3xc - l3xc_pool;
/* Initialize the FIB node */
fib_node_init (&l3xc->l3xc_node, l3xc_fib_node_type);
l3xc->l3xc_sw_if_index = sw_if_index;
l3xc->l3xc_proto = fproto;
/*
* Create the FIB path list from the route paths
*/
l3xc->l3xc_pl = fib_path_list_create ((FIB_PATH_LIST_FLAG_SHARED |
FIB_PATH_LIST_FLAG_NO_URPF),
rpaths);
l3xc->l3xc_sibling = fib_path_list_child_add (l3xc->l3xc_pl,
l3xc_fib_node_type,
l3xci);
l3xc_stack (l3xc);
/*
* Add this new policy to the database and enable the feature on the input interface */
l3xc_db_add (sw_if_index, fproto, l3xci);
vnet_feature_enable_disable ((FIB_PROTOCOL_IP4 == fproto ?
"ip4-unicast" :
"ip6-unicast"),
(FIB_PROTOCOL_IP4 == fproto ?
"l3xc-input-ip4" :
"l3xc-input-ip6"),
l3xc->l3xc_sw_if_index,
1, &l3xci, sizeof (l3xci));
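A rough, self-contained model of the bookkeeping in this excerpt may help. It assumes, as the "create a new x-connect" comment suggests, that creation is preceded by a lookup keyed on protocol and interface, so each (protocol, interface) pair owns at most one x-connect. All xdb_ names and the fixed table sizes are hypothetical and are not VPP APIs; the real code uses VPP pools and vectors rather than fixed arrays.

```c
#include <assert.h>
#include <stdint.h>

#define XDB_INVALID (~0u)	/* "no x-connect configured" marker */
#define XDB_MAX_IF 16		/* hypothetical interface limit for the sketch */

enum xdb_proto { XDB_IP4 = 0, XDB_IP6 = 1, XDB_PROTO_MAX = 2 };

/* db[proto][sw_if_index] holds the pool index of the x-connect, or XDB_INVALID. */
static uint32_t xdb_db[XDB_PROTO_MAX][XDB_MAX_IF];
static uint32_t xdb_next_free;	/* stands in for the pool allocator */

/* Mark every (protocol, interface) slot as empty. */
void
xdb_init (void)
{
  for (int p = 0; p < XDB_PROTO_MAX; p++)
    for (int i = 0; i < XDB_MAX_IF; i++)
      xdb_db[p][i] = XDB_INVALID;
  xdb_next_free = 0;
}

/* Return the x-connect index for this interface and protocol, if any. */
uint32_t
xdb_find (enum xdb_proto p, uint32_t sw_if_index)
{
  if (sw_if_index >= XDB_MAX_IF)
    return XDB_INVALID;
  return xdb_db[p][sw_if_index];
}

/* Reuse an existing x-connect or allocate a new index and record it. */
uint32_t
xdb_add (enum xdb_proto p, uint32_t sw_if_index)
{
  assert (sw_if_index < XDB_MAX_IF);
  uint32_t xci = xdb_find (p, sw_if_index);
  if (xci != XDB_INVALID)
    return xci;			/* already configured: reuse it */
  xci = xdb_next_free++;	/* models pool_get: hand out the next index */
  xdb_db[p][sw_if_index] = xci;
  return xci;
}
```

After xdb_init (), calling xdb_add (XDB_IP4, 2) twice returns index 0 both times, while xdb_add (XDB_IP6, 2) returns a separate index 1, since IPv4 and IPv6 x-connects on the same interface are tracked independently in this model.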
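In the l3xc creation excerpt, note that the vnet_feature_enable_disable call writes the l3xc pool index into the feature configuration data (the &l3xci, sizeof (l3xci) arguments); the l3xc-input-ip4 node later reads it back from that configuration data. This is also an efficient way to handle it. Next, the core forwarding code:
 u32 l3xci0, next_u32;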
const l3xc_t *l3xc0;
/* Read the l3xc pool index from the feature configuration data. */
l3xci0 =
*(u32 *) vnet_feature_next_with_data (&next_u32, b[0],
sizeof (l3xci0));
/* Fetch the l3xc forwarding entry from the pool. */
l3xc0 = l3xc_get (l3xci0);
/* Take the next-node slot and DPO index, sending the packet straight to ip4-rewrite. */
next[0] = l3xc0->l3xc_dpo.dpoi_next_node;
vnet_buffer (b[0])->ip.adj_index[VLIB_TX] = l3xc0->l3xc_dpo.dpoi_index;
The four statements above are all the node does per packet. They stand in for the entire route lookup, which is why l3xc is so efficient.
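To make the cost of that path concrete, here is a minimal standalone model of it, not VPP code. The fp_ types and functions are hypothetical stand-ins: fp_feature_cfg plays the role of the per-interface feature configuration data, fp_pool plays the role of the l3xc pool, and fp_buffer_t carries only the two metadata fields the node touches. The sketch assumes the DPO has already been resolved by the control plane.

```c
#include <assert.h>
#include <stdint.h>

#define FP_MAX_IF 8		/* hypothetical table sizes for the sketch */
#define FP_POOL_SIZE 8

/* Stand-in for the DPO of each x-connect (models dpoi_next_node / dpoi_index). */
typedef struct
{
  uint16_t next_node;
  uint32_t adj_index;
} fp_dpo_t;

/* Stand-in for the packet metadata the node reads and writes. */
typedef struct
{
  uint32_t rx_sw_if_index;
  uint32_t tx_adj_index;	/* models vnet_buffer (b)->ip.adj_index[VLIB_TX] */
} fp_buffer_t;

static fp_dpo_t fp_pool[FP_POOL_SIZE];		/* models the l3xc pool */
static uint32_t fp_feature_cfg[FP_MAX_IF];	/* per-interface config data */

/* Control plane: store the resolved DPO, then record its pool index
 * in the configuration data of the input interface. */
void
fp_enable (uint32_t sw_if_index, uint32_t xci, uint16_t next_node,
	   uint32_t adj_index)
{
  assert (sw_if_index < FP_MAX_IF && xci < FP_POOL_SIZE);
  fp_pool[xci].next_node = next_node;
  fp_pool[xci].adj_index = adj_index;
  fp_feature_cfg[sw_if_index] = xci;
}

/* Data plane: two array reads and one store, with no prefix lookup.
 * Returns the next-node slot for the packet. */
uint16_t
fp_input (fp_buffer_t *b)
{
  assert (b->rx_sw_if_index < FP_MAX_IF);
  uint32_t xci = fp_feature_cfg[b->rx_sw_if_index];
  const fp_dpo_t *dpo = &fp_pool[xci];
  b->tx_adj_index = dpo->adj_index;
  return dpo->next_node;
}
```

For example, after fp_enable (2, 0, 5, 3), a buffer received on interface 2 leaves fp_input with next-node slot 5 and adjacency 3. The values 2, 0, and 3 mirror hw_if_index 2, l3xc-index:0, and dpo-idx 3 in the trace; slot 5 is arbitrary. The per-packet work is constant regardless of how many routes the FIB holds, which is the property the article attributes to l3xc.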