How to Recognize a Distributed Denial of Service (DDoS) Attack
To defend against an attack, you first need to recognize one. A Distributed Denial of Service (DDoS) attack is an attempt to disrupt the regular operation of a target server, service, or network by overwhelming it with unwanted traffic.
The attack denies access to the target by generating a large volume of malicious Internet traffic that exhausts the target’s available resources. We implement traffic filtering solutions to prevent these types of attacks and ensure maximum uptime.
We are constantly improving our services and updating our systems to stay ahead of the curve. Today we will detail how we combat DDoS attacks and explain the setup of our Wanguard traffic filter and general infrastructure.
What does an attack look like?
Here is a real-life example of a DDoS attack: imagine 400 Mbps of UDP traffic directed at a VPS with 100 Mbps of available bandwidth.
Simple Jekyll website load time results:
Before attack: 0.08 seconds.
During an attack: 23.35 seconds (first attempt), 30.86 seconds (second attempt).
DDoS attacks are often difficult to mitigate because they typically involve one or more entire botnets targeting you. A botnet consists of many infected systems, so fighting one on your own is usually futile.
How we deal with DDoS attacks
We have two DDoS mitigation solutions to deal with incoming attacks on our infrastructure: remotely triggered black hole (RTBH) filtering and traffic filtering.
RTBH filtering offers a way to drop unwanted traffic quickly, before it enters our infrastructure. While this method effectively protects our infrastructure as a service provider, it does so by discarding all traffic to the targeted IP address, legitimate or not, which is not something our customers prefer. Your websites and VPS become completely unreachable, so the attackers effectively achieve their goal.
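As a sketch, an RTBH announcement via a BGP speaker such as ExaBGP could tag the attacked address with the well-known BLACKHOLE community (RFC 7999) so that upstream routers drop its traffic. The addresses below are placeholders, and the exact invocation depends on how ExaBGP is configured:

```shell
# Hypothetical example: announce a /32 for the attacked host with the
# RFC 7999 BLACKHOLE community (65535:666) so upstreams discard its traffic.
# 203.0.113.10 (target) and 192.0.2.1 (next hop) are placeholder addresses.
exabgpcli announce route 203.0.113.10/32 next-hop 192.0.2.1 community [65535:666]
```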
Traffic filtering is the next level of DDoS protection for our services. It stops only the malicious traffic instead of giving up on everything. Malicious traffic is identified by examining the packets flowing through our infrastructure. The following traffic elements are inspected for specific patterns:
- packet payload
- source port
- source IP
- destination port
- and more
This filtering process is done in our infrastructure before the traffic reaches our services, so our customers don’t have to worry.
We have implemented offline filtering in our setup. Since we rarely experience powerful DDoS attacks, always-on inline filtering would be inefficient in terms of practicality and cost; for the rare large-scale attacks, we still have the RTBH method as a fallback.
Simplified filter setup topology
Our setup involves filter instances connected to spine switches, through which the diverted traffic flows. sFlow samples, sent from the backbone instances to the filter instance, let us inspect traffic and divert it if necessary. Clean traffic is forwarded to the leaf switches, while malicious traffic is dropped at the filter instance. It is important to keep in mind that both the filtering process and the traffic diversion are fully automated.
If any destination host experiences a traffic spike above our set thresholds, we advertise that IP address to the backbones using ExaBGP. When the traffic reaches a filter instance, we examine the incoming packets to identify the attack pattern. Once the pattern is identified, new rules are added to the firewall, preventing malicious traffic from reaching its destination.
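To illustrate the last step, once a pattern is identified (say, a UDP flood toward one host with oversized payloads), a drop rule could be installed on the filter instance. This is a hedged sketch using nftables, not our production ruleset; the table name, addresses, and match criteria are illustrative:

```shell
# Hypothetical example: drop UDP packets matching an identified attack
# pattern before they are forwarded to the destination.
# 203.0.113.10 is a placeholder target address.
nft add table inet ddos
nft add chain inet ddos forward '{ type filter hook forward priority 0; }'
nft add rule inet ddos forward ip daddr 203.0.113.10 udp length 1400-1500 counter drop
```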
The main elements that the filter server depends on are the CPU and the NIC. After some testing and research, we decided to go with the following:
CPU: Intel(R) Xeon(R) Silver 4215R @ 3.2GHz.
NIC: Intel XL710 (40G).
During a DDoS attack with ~1.5 Mpps and 8 Gbps of traffic, the CPU usage looks like this:
It would be difficult to manually manage multiple filter instances across all data centers. As a result, the entire solution is fully automated, from attack detection to threshold configuration. We currently use Chef and Ansible for our infrastructure as code (IaC). Changing thresholds or other settings for all instances at once is as easy as changing a few lines of code.
Here’s a sneak peek of our setup:
Our instance needs to be able to route packets between interfaces, so forwarding is enabled for both IPv4 and IPv6. Since we don’t have return routes through the interfaces used for forwarding, we need to disable reverse path filtering or, as we have done, set it to loose mode, so that packets arriving through those interfaces are not dropped.
We have increased the maximum number of packets processed in a NAPI polling cycle (net.core.netdev_budget) to 1000. Since we prefer throughput over latency in this case, we have also set our ring buffers to their maximum size.
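A sketch of those two tweaks, assuming a placeholder interface name `eth0` (on an XL710 the ring maxima are typically 4096, but check your NIC’s limits first with `ethtool -g`):

```shell
# Raise the NAPI polling budget, as described above.
sysctl -w net.core.netdev_budget=1000
# Show the NIC's preset maximum ring sizes, then pin RX/TX to them.
ethtool -g eth0
ethtool -G eth0 rx 4096 tx 4096
```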
We have been running this solution for six months, and these small changes have proven sufficient to handle attacks of the anticipated scale. We have not delved deeper into system tuning, as the default values are reasonable and do not cause any problems.
Next, we have actions. An action is triggered when an attack is detected or ends. We use actions to divert traffic (route advertisement via ExaBGP), inform our monitoring team about the attack (a Slack message from the instance), and more.
Thresholds are also managed as code, providing numerous options for detecting an attack. For example, if we detect 100,000 UDP packets per second addressed to a single target, we start the filtering process. Thresholds can also be based on TCP traffic, HTTP/HTTPS requests, and so on.
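The idea behind such a threshold can be sketched in a few lines of shell: given per-target packet rates, flag anyone above the 100,000 UDP pps mark from the example. The input format and addresses are made up for illustration; the real detection is done by the filtering software, not a script like this:

```shell
# Hypothetical sketch: flag destinations exceeding a UDP pps threshold.
# Input lines: "<dst_ip> <udp_pps>".
awk -v t=100000 '$2 > t { print $1, "over-threshold" }' <<'EOF'
203.0.113.10 250000
203.0.113.11 4000
EOF
```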
Prefixes that should be under protection are also automatically added from Chef data bags.
What does handling a DDoS attack look like in Grafana? Let’s look at a recent attack with 8 Gbps and 1 Mpps of traffic below.
Here’s the traffic incoming to the filter instance:
And here’s the traffic outgoing to the end device:
Incoming packets per second:
Outgoing packets per second:
As you can see, there is a short burst of traffic passing from the filtering instance to the end device. This gap is caused by the attack pattern identification process. It lasts only a short time, usually between 1 and 10 seconds, but it is something to keep in mind. As the graph shows, once the attack pattern is identified, you are safe!
What about attack detection speed? This part depends on sFlow, which is not as fast as port mirroring. That said, it’s easy to set up, flexible, and costs less. Once an attack begins, diverting traffic to the filter instance takes between 20 and 50 seconds.
This is what the whole process looks like from the target instance:
Packets per second:
There’s a brief spike and we’re back to normal. Depending on the service you’re running, you may not even notice it.
We like to know what’s going on in our infrastructure, so let’s investigate this case a bit more:
Attack source. We noticed an increase in IPv4 traffic from certain countries, with India and Taiwan contributing the most. There is a strong possibility that those IP addresses were spoofed, so this information may be inaccurate. We have the list of source addresses and ASNs, but we won’t publish them here for the same reason (spoofing).
Attack protocol. This attack was mostly UDP-based, as we didn’t see any unusual spikes in the TCP graph.
Attack type. It was generating a lot of traffic to random UDP ports. Some of them are seen in the following graph:
RTBH is effective as DDoS protection, but it ultimately causes downtime. After implementing the traffic filtering solution on our infrastructure, we remove only the malicious traffic instead of all of it. We have noticed that our use of RTBH has decreased by 90–95%, which translates into better uptime for our services and customers!
Carlos is a professional in digital marketing, eCommerce and website builders. He loves helping businesses grow online through his tips. In his free time, he is surely singing or practicing martial arts.