linux
anatoly_tim, 2014-07-07 13:33:31

Link aggregation issue in Oracle VM Server 3.2.8. What could be causing dropped packets on the interfaces? How can I find out what these packets are?

Good afternoon. I have been struggling with link aggregation for more than two weeks. The setup is a Cisco 4948 switch paired with two network adapters on the server side.
On the switch, two ports are combined into a port-channel that carries tagged traffic:

interface GigabitEthernet1/3
 description test-eth0
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2,3
 switchport mode trunk
 no cdp enable
 channel-protocol lacp
 channel-group 6 mode active

interface GigabitEthernet1/39
 description test-eth1
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2,3
 switchport mode trunk
 no cdp enable
 channel-protocol lacp
 channel-group 6 mode active

interface Port-channel6
 description test
 switchport
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2,3
 switchport mode trunk
 spanning-tree portfast
 spanning-tree bpduguard enable
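
To see whether the LACP bundle actually formed on the switch side, the standard IOS check is (a generic command, not part of the original post):
show etherchannel 6 summary
In a healthy bundle, both Gi1/3 and Gi1/39 appear under Po6 flagged (P), i.e. bundled in the port-channel.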

From the server side, the configuration is as follows:
# uname -a
Linux ovm-test 2.6.39-300.32.6.el5uek #1 SMP Fri Oct 11 22:05:27 PDT 2013 x86_64 x86_64 x86_64 GNU/Linux

The network card is an Intel® PRO/1000 PT Dual Port Server Adapter:
# lspci | grep Eth
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 03)
04:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
04:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)

Output of ifconfig. As you can see, there are quite a few dropped packets:
# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:15:17:D6:B0:14
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:9381 errors:0 dropped:6123 overruns:0 frame:0
          TX packets:1234 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1352583 (1.2 MiB)  TX bytes:165782 (161.8 KiB)

bond0.3  Link encap:Ethernet  HWaddr 00:15:17:D6:B0:14
          inet addr:192.168.32.7  Bcast:192.168.32.127  Mask:255.255.255.128
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1561 errors:0 dropped:146 overruns:0 frame:0
          TX packets:210 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:131627 (128.5 KiB)  TX bytes:33176 (32.3 KiB)

eth0      Link encap:Ethernet  HWaddr 00:15:17:D6:B0:14
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:5038 errors:0 dropped:2521 overruns:0 frame:0
          TX packets:104 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:758116 (740.3 KiB)  TX bytes:13150 (12.8 KiB)
          Interrupt:16 Memory:feba0000-febc0000

eth1      Link encap:Ethernet  HWaddr 00:15:17:D6:B0:14
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:4344 errors:0 dropped:26 overruns:0 frame:0
          TX packets:1141 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:594535 (580.6 KiB)  TX bytes:155006 (151.3 KiB)
          Interrupt:17 Memory:febe0000-fec00000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:46 errors:0 dropped:0 overruns:0 frame:0
          TX packets:46 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:6906 (6.7 KiB)  TX bytes:6906 (6.7 KiB)

The bonding interface is configured with options for working with tagged traffic:
# cat ./ifcfg-bond0
DEVICE=bond0
USERCTL=no
BONDING_OPTS="debug=1 mode=802.3ad miimon=100 xmit_hash_policy=layer2 lacp_rate=1"
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
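
For completeness, the bonding driver's own view of the aggregation can be checked under /proc (a generic check, nothing specific to this setup):
# cat /proc/net/bonding/bond0
It should report "Bonding Mode: IEEE 802.3ad Dynamic link aggregation" and a section per slave; both slaves should share the same Aggregator ID.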

Here we configure the corresponding VLAN
# cat ./ifcfg-bond0.3
VLAN=yes
DEVICE=bond0.3
BOOTPROTO=static
ONBOOT=yes
TYPE=Ethernet
IPADDR=192.168.32.7
NETMASK=255.255.255.128
GATEWAY=192.168.32.125
DNS=192.168.32.70
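
As a sanity check of the VLAN subinterface, the 8021q module lists all configured VLAN devices and their parents under /proc (again a generic check):
# cat /proc/net/vlan/config
Each line maps a VLAN device to its ID and parent, e.g. "bond0.3 | 3 | bond0". Only VLAN 3 should appear here, i.e. the server terminates just one of the two VLANs allowed on the trunk.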

Physical interface configuration
# cat ./ifcfg-eth0
# Intel Corporation 82571EB Gigabit Ethernet Controller
DEVICE=eth0
BOOTPROTO=none
HWADDR=00:15:17:D6:B0:14
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=no

# cat ./ifcfg-eth1
# Intel Corporation 82571EB Gigabit Ethernet Controller
DEVICE=eth1
BOOTPROTO=none
HWADDR=00:15:17:D6:B0:15
ONBOOT=yes
MASTER=bond0
SLAVE=yes
USERCTL=no

I can't figure out where these packets come from. The rest of the network works. Perhaps there are errors in my configuration. Has anyone faced something similar?
This is a test bench; rolling the scheme out on a production server is impractical until the dropped-packet problem is solved.
Thank you in advance.

2 answers

throughtheether, 2014-07-07
@anatoly_tim

Notice the difference between the total number of "drops" on the bond0 interface:

bond0     Link encap:Ethernet  HWaddr 00:15:17:D6:B0:14
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:9381 errors:0 dropped:6123 overruns:0 frame:0

and the number of "drops" on the interface that processes VLAN 3 traffic:
bond0.3  Link encap:Ethernet  HWaddr 00:15:17:D6:B0:14
          inet addr:192.168.32.7  Bcast:192.168.32.127  Mask:255.255.255.128
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1561 errors:0 dropped:146 overruns:0 frame:0

I assume:
1) as already noted, untagged traffic is most likely arriving (by default the native VLAN on Cisco equipment is VLAN 1, and service traffic such as CDP is sent untagged in it);
2) it is also likely that frames tagged with VLAN 2 are arriving at the server: VLAN 2 is allowed on the switch side, but, as I understand it, you do not process it on the server.
As a possible solution: assign an unused VLAN as the native VLAN on the switch side and allow only the necessary VLANs on the trunk (you use VLAN 3; VLAN 2 is presumably not needed). A capture sketch just below shows how to confirm what is actually arriving.
Unfortunately, I can't explain the presence of "drops" on the bond0.3 interface. Probably these are dropped packets of some kind, such as multicast that nothing on the host subscribes to.
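
To confirm both hypotheses, frames reaching a physical slave can be captured with link-level headers shown, so the 802.1Q tag (or its absence) is visible. This is plain tcpdump usage, nothing specific to this setup:
# tcpdump -e -nn -i eth0 -c 100
Untagged native-VLAN traffic prints without a "vlan" field, while VLAN 2 frames print with "vlan 2"; both kinds are delivered to bond0 but terminate nowhere, which matches the drop counter. (On some NIC/driver combinations hardware offload strips the tag before tcpdump sees it, so the capture may need to be repeated on bond0 itself.)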
Just in case, please provide the ethtool output for both slave interfaces.
UPD:
I asked for the ethtool output to make sure the ports operate in the correct mode (1000 Mbit/s, full duplex). Strictly speaking, the absence of collisions and CRC errors already suggests the ports are fine, but the more data, the better in such cases.
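
A minimal version of that check (generic ethtool usage):
# ethtool eth0 | egrep 'Speed|Duplex|Link detected'
# ethtool eth1 | egrep 'Speed|Duplex|Link detected'
Expected for a healthy gigabit port: "Speed: 1000Mb/s", "Duplex: Full", "Link detected: yes".
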
"Native VLAN is not used; I pass only the 2 VLANs listed above to the Port-channel."
Please provide the output of
show interface Port-channel6 switchport
from the switch. If it turns out that the native VLAN on the Po6 interface is set to 1, also check CDP on the switch:
show cdp
show cdp interface GigabitEthernet1/3
show cdp interface GigabitEthernet1/39

"Do you think it has something to do with the kernel?"
Unfortunately, I don't know enough about the Linux kernel to answer this question. I can only suggest running tcpdump on the bond0.3 interface (a sketch follows). Keep in mind that a packet "dropped" by the kernel may or may not appear in the traffic dump, depending on the reason for the "drop". But it may well be that you are seeing someone else's multicast (an OSPF Hello from the other end of the L2 domain, for example). My point is that it might be related to the kernel, or it might be related to the network environment.
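
A possible starting point, using only standard tcpdump filters (the multicast filter is an assumption about what is worth looking at first):
# tcpdump -nn -i bond0.3 ether multicast
OSPF Hellos, if present, show up as IP packets to 224.0.0.5; anything else addressed to multicast groups the host has not joined is a plausible source of the bond0.3 drop counter.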
UPD2:
"It can be seen here that native vlan 1 is still present. I have googled stubbornly, but have not understood how to remove these parameters from the configuration."
Create a new (previously unused) VLAN on the switch and assign it as native, but only on this interface. Example:
configure terminal
interface Port-channel6
 switchport trunk native vlan XXXX
where XXXX is the VLAN number.
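
To verify the change took effect (a standard IOS show command; exact field wording may vary by platform):
show interface Port-channel6 switchport | include Native
It should report the new VLAN in "Trunking Native Mode VLAN". Since nothing lives in that VLAN, untagged frames should stop arriving at the server.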
I would also like to share an observation:
eth0      Link encap:Ethernet  HWaddr 00:15:17:D6:B0:14
          TX packets:104 errors:0 dropped:0 overruns:0 carrier:0
          
eth1      Link encap:Ethernet  HWaddr 00:15:17:D6:B0:14
          TX packets:1141 errors:0 dropped:0 overruns:0 carrier:0

As this listing shows, the volume of outgoing traffic on eth1 is an order of magnitude larger than on eth0. Of course, one should not rush to conclusions from such small numbers, but I strongly recommend testing the scheme at traffic volumes above a gigabit. I suspect that if the traffic consumers sit in a different L2 segment (i.e., traffic to them is routed through the default gateway), throughput may be capped at 1 Gbit/s. If that happens, pay attention to this line:
# cat ./ifcfg-bond0
BONDING_OPTS="debug=1 mode=802.3ad miimon=100 xmit_hash_policy=layer2 lacp_rate=1"

It is likely that the hashing algorithm will need to be changed to one that takes more header fields into account. But it only makes sense to think about that once you have test results at loads of 1 Gbit/s and higher.
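
For reference, a sketch of what such a change might look like. layer3+4 is a standard bonding xmit_hash_policy value that hashes on IP addresses and L4 ports, so flows heading to a single gateway MAC can still spread across slaves; whether it helps here is untested:
BONDING_OPTS="debug=1 mode=802.3ad miimon=100 xmit_hash_policy=layer3+4 lacp_rate=1"
Note that the switch hashes its own egress direction independently, so the Cisco side may also need its load-balancing scheme adjusted (e.g. port-channel load-balance src-dst-ip in global configuration).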

Sergey Petrikov, 2014-07-07
@RicoX

I would assume that the dropped packets are those arriving in the native VLAN of the trunk: change it to an unused one and they should stop pouring in. The bonding itself is configured normally, judging by the config.
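
A quick way to confirm that after the change (generic ip(8) usage): snapshot the RX drop counter before and after swapping the native VLAN and see whether it keeps growing:
# ip -s link show bond0 | grep -A1 RX
If the "dropped" column stops increasing, the untagged native-VLAN traffic was indeed the culprit.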
