Asterisk
torum, 2015-01-02 17:42:16

Why does a storm periodically appear in the VLAN used for IP telephony?

A storm appears in the VLAN segment.
1) There is an Asterisk server with a static IP address and an interface in the VLAN
2) All phones are also in VLAN 4
3) D-Link switches
4) At different times there are many ARP requests
5) UDP registration requests go from the phones to the server but are not processed by the Asterisk server.
At the same time, before switching to VLAN, everything worked fine.
It is possible that such packets were previously handled (dropped) by the firewall.
Tried to disable ARP on the server side with ifconfig eth0 -arp, then loaded the phones' IP-to-MAC mappings from a file. As a result, all the phones "disconnected", so I reverted everything.
Limited the number of UDP requests on port 5060 to 60/minute in iptables; it did not help.
Moved several phones out of the VLAN and connected them to Asterisk; they work without problems.
Softphones also work without problems.
Some phones sit inline between a computer and the switch; when the network in the VLAN "goes down" (storm), those computers also misbehave.
What could be the problem?
Some changes:
1) Set up a 2nd Asterisk and linked it to the 1st one.
2) Moved most of the phones to the 2nd Asterisk and removed the VLAN.
3) 5 phones are left on the 1st Asterisk; there are no problems.
4) 20 phones are on the 2nd Asterisk; the lags continue.
5) Checking for loops found nothing; everything looks fine.
6) The MAC address table on the main switch is updated normally; no problems are visible there either. All phones are visible, and their IP addresses match the table of MAC addresses and ports.
7) Checking the configuration of the phones also turned up nothing.
Outcome:
The backup server did not solve the problem.
So far, only one thing that has not yet been checked comes to mind:
- Remove the phone autoconfiguration via TFTP. It is not an obvious fix, but it is the only thing that changed right when the problems started.
Solution:
1) Configured a DHCP server
2) Configured a DNS server, just in case
3) Disabled Keep Alive on the IP phones
4) Reset the Asterisk settings
As a result, the storm stopped.

3 answer(s)
throughtheether, 2015-01-02
@throughtheether

4) At different times there are many ARP requests
Are the requests coming from the same device or from different ones? What is the approximate intensity? Do these traffic spikes occur at random or at recurring times?
At the same time, before switching to VLAN, everything worked fine.
What did the network look like before the switch to VLAN? How was it organized? One L2 domain for all devices? What exactly was changed during the "switch to VLAN"?
Tried to disable ARP on the server side with ifconfig eth0 -arp, then loaded the phones' IP-to-MAC mappings from a file. As a result, all the phones "disconnected", so I reverted everything.
Do your phones have manually assigned IP addresses or do they receive them automatically?
A storm appears in the VLAN segment.
What is the traffic intensity? Besides ARP requests and UDP packets, are there other significant traffic components?
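To answer the questions above about who sends the ARP requests and how intense they are, a capture can be tallied per sender. The following is a minimal sketch, assuming the text output of something like `tcpdump -n -i eth0 arp` saved to a file; the function name `summarize_arp` and the interface name are illustrative, and the regular expression assumes tcpdump's usual one-line ARP summary format.

```python
import re
from collections import Counter

# Assumed tcpdump one-line format, e.g.:
# "17:42:16.000001 ARP, Request who-has 192.168.20.1 tell 192.168.20.11, length 46"
ARP_REQUEST = re.compile(
    r"^(\d{2}):(\d{2}):(\d{2})\.\d+ ARP, Request who-has (\S+).*? tell ([^,\s]+)"
)

def summarize_arp(lines):
    """Return (Counter of requests per sender, average requests per second)."""
    senders = Counter()
    seconds = set()
    for line in lines:
        m = ARP_REQUEST.match(line)
        if not m:
            continue  # skip replies and non-ARP lines
        h, mi, s, _target, sender = m.groups()
        senders[sender] += 1
        # Time of day in whole seconds; captures crossing midnight are not handled.
        seconds.add(int(h) * 3600 + int(mi) * 60 + int(s))
    total = sum(senders.values())
    # Span from the first to the last second seen, counting both ends.
    span = (max(seconds) - min(seconds) + 1) if seconds else 0
    rate = total / span if span else 0.0
    return senders, rate
```

Calling `summarize_arp(open("arp.txt"))` and then `senders.most_common(5)` would show whether one host dominates the requests or whether they are spread across many devices.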
Other recommendations and comments:
1) Set up equipment monitoring, unless, of course, you want to spend several days troubleshooting similar problems in the future.
2) On the switches, limit the level of broadcast and multicast traffic on user ports (those that PCs or phones are connected to). This feature is called "storm control".
3) It is possible that an L2 loop temporarily appears in the network. The suspects: hardware VoIP phones with a built-in switch, client PCs (if connected with two cables), WiFi access points (simultaneous bridging on the access point and on the PC, or simultaneous bridging on two access points if configured incorrectly).
UPD:
I have some questions about the materials you provided.
Topology (link).
1) Horizontal links between switches (circled by hand in red): are they really present, or is this carelessness in drawing the topology?
2) Duplicate addresses are circled with a red rectangular frame. Is this an inaccuracy? Check whether any of the manually assigned addresses are duplicated.
3) In the topology you show the central switch as a DGS-3120, yet you provided a screenshot of the settings of a DES-3200. How are the two related? In general, the topology you provided raises more questions than it answers. If possible, please redraw it in a clearer form (you may find it useful yourself in the future).
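The duplicate-address check from point 2 can be done mechanically if the manually assigned addresses are kept in a list. This is only a sketch under assumptions: the input is a hypothetical plain-text list with one "IP MAC" pair per line (not necessarily the format of the file used on the server), and `find_duplicates` is an illustrative name.

```python
from collections import defaultdict

def find_duplicates(lines):
    """Return (IPs claimed by several MACs, MACs holding several IPs)."""
    macs_by_ip = defaultdict(set)
    ips_by_mac = defaultdict(set)
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or malformed line
        ip, mac = parts[0], parts[1].lower().replace("-", ":")
        macs_by_ip[ip].add(mac)
        ips_by_mac[mac].add(ip)
    dup_ips = {ip: sorted(m) for ip, m in macs_by_ip.items() if len(m) > 1}
    dup_macs = {mac: sorted(i) for mac, i in ips_by_mac.items() if len(i) > 1}
    return dup_ips, dup_macs
```

An IP address that shows up with two different MAC addresses is the case worth chasing first, since two hosts answering ARP for the same address can make entries flap.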
On the dumps:
4) If possible, please provide dumps as .pcap files in the future; they are easier to process.
5) I can't say anything about SIP; I'm not a specialist.
6) About ARP: the traffic intensity is estimated at about 2000 pps, which in my opinion is not normal for ARP. Please check the host with address 192.168.20.11. How exactly is it connected? Through a phone? What is the phone model? What are the phone's settings? Which switch port is the phone connected to, and how is that port configured? Are there entries in the phone or switch logs that correlate with the problem?
UPD2:
I went to work and found something interesting:
1. During the "storm", checking with the arp -a command on the Asterisk server shows that it cannot "bind" the IP and MAC addresses of some phones.
2. At the same time, UDP requests on port 5060 keep coming from exactly these phones.
3. That is, at some point the server cannot work out where the phone is located. The phone tries to send a request to the server, and the server cannot answer because it does not know where the phone is. The result is a storm.
It is logical that the phone generates ARP replies in response to ARP requests that the server sends when, for example, an entry in its ARP table has expired.
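The observation from arp -a can also be checked in a script during a storm. Below is a minimal sketch, assuming a Linux server where the kernel exposes the ARP table as text in /proc/net/arp; the function name `incomplete_entries` is illustrative. It lists the IP addresses whose entry has no resolved MAC address.

```python
ATF_COM = 0x02  # "completed entry" flag from Linux <linux/if_arp.h>

def incomplete_entries(proc_net_arp_text):
    """Return the IPs whose ARP entry is not marked complete."""
    unresolved = []
    for line in proc_net_arp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 6:
            continue  # not a table row
        ip, flags = fields[0], fields[2]
        if not int(flags, 16) & ATF_COM:
            unresolved.append(ip)
    return unresolved
```

On the server this could be fed with `open("/proc/net/arp").read()` and run every few seconds, so the list of unresolved phones can be compared against the switch ports they hang off.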
A couple of new ideas:
7) You can increase the lifetime of ARP entries on the server, which should make the problem occur less often. Keep in mind, though, that this value must be lower than the lifetime of an entry in the switch's MAC address table.
8) By the way, it is quite possible that the entry for the server's MAC address on the switch ages out, so frames addressed to the server are flooded to all ports of the switch, including those involved in the intermittent loop. Check the settings and logs of the switch closest to the server.
9) This is unlikely to be the cause, but check the settings of each trunk on both ends. In rare cases (bridging between different VLANs plus collisions on D-Link) this can, I think, lead to loops.
For now, without access to the network, it is difficult to give any further recommendations beyond the general ones: look for a loop and check every device on the path from the phone to the server.

Ref, 2015-01-02
@KargoZ

3) D-Link switches
Check how much memory the switches have per port; it may simply not be enough.
