One of the most important best practices for the NSX Distributed Firewall is using “Applied To”. In this blog, I explain why “Applied To” is so important for the inner workings and performance of the NSX Distributed Firewall.
The best practice for using “Applied To” has been around since the early days of NSX microsegmentation and remains one of the most important.
In this second part of the blog, I’ll dive into the more technical details behind the Applied To option. If you're not familiar with the Applied To option, you can first read the basics here: Demystify Applied To for NSX DFW - part one.
Applied To is used to limit the scope of published firewall rules. And as always, there is a maximum or limit to the number of rules that can be programmed. For most VMware products, these configuration maximums can be found at: https://configmax.vmware.com
For the Distributed Firewall, the rule maximum is 120k per ESXi host and 100k for the management plane. These are the tested maximums for NSX version 4.1.
These numbers say that you can create 100,000 firewall rules on the NSX management plane, meaning in the NSX GUI. The data plane resides in the ESXi hosts, where each vNIC on each VM gets its own set of firewall rules. All firewall rules from all VMs on a single ESXi host combined cannot exceed 120,000 firewall rules, according to the configmax website.
For example, if there are 100 VMs (with one vNIC each) running on a single ESXi server and 1,000 rules are Applied To the entire DFW, then the total number of firewall rules on the ESXi host would be 100,000 rules (100 VMs * 1,000 rules). As you can see, these numbers add up quickly.
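To make that arithmetic concrete, here is a minimal Python sketch. It assumes one vNIC per VM, and the figure of 50 rules per vNIC for the scoped case is purely hypothetical, chosen only to show the effect of Applied To. The 120,000 limit is the tested maximum quoted above.

```python
# Tested maximum from the text (NSX 4.1, per configmax): rules per ESXi host.
HOST_RULE_MAX = 120_000


def host_rule_total(rules_per_vnic):
    """Sum the rules programmed on every vNIC of one ESXi host."""
    return sum(rules_per_vnic)


# No Applied To scoping: all 1,000 rules land on each of the 100 vNICs.
unscoped = host_rule_total([1_000] * 100)

# Hypothetical scoping: Applied To leaves each vNIC with only 50 relevant rules.
scoped = host_rule_total([50] * 100)

for label, total in (("unscoped", unscoped), ("scoped", scoped)):
    share = total / HOST_RULE_MAX
    print(f"{label}: {total} rules, {share:.0%} of the tested host maximum")
```

The point of the sketch is only the multiplication: the host total grows with every vNIC that receives the full rule set, so shrinking the per-vNIC rule set is what keeps the host away from the limit.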
What will happen if there are 120,001 rules on the ESXi host? One more than the tested maximum. To answer this, we need to look further at how the rules in the data plane are programmed.
As you probably know, the Security Groups are translated to IP addresses, and the rules in the management plane are broken down by protocol. TCP and UDP therefore have their own rules. So, there can be more rules in the data plane than the number of rules programmed into the management plane. This is called rule sprawl.
Rule sprawl can occur when a single firewall rule in the management plane has both TCP and UDP services. Then 2 firewall rules are programmed into the data plane. Take a look at the example below.
Here you see two rules 2075 and 2076, both with TCP and UDP services. But rule 2076 combined them into a Service group.
The rules for each vNIC can be found on the ESXi host by using summarize-dvfilter to list all the filters. With the vsipioctl command, the firewall rules or address sets per vNIC can then be retrieved. For example: vsipioctl getrules -f nic-3414546-eth0-vmware-sfw.2
As you can see, both firewall rules consist of two firewall rules in the data plane. So there is a minor rule sprawl. The reason I’m showing two different firewall rules is because in NSX-V, the firewall rule with the separate services would create 7 different firewall rules in the data plane. Fortunately, NSX-T combines as many services as possible into a single rule. The maximum number of service ports per rule is 15; beyond that, a new firewall rule is programmed into the data plane.
Now that we know that there may be more rules on the data plane than we see in the management plane, how do I know when the maximum has been reached?
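As a rough illustration of rule sprawl, the sketch below estimates how many data plane rules a single management plane rule could turn into. It is a simplified model built only on the two statements above (one rule per protocol, at most 15 service ports per rule). The function name and the input format are hypothetical, and the real NSX realization logic may group services differently.

```python
from collections import defaultdict
from math import ceil

# Per the text: at most 15 service ports fit in one data plane rule.
MAX_PORTS_PER_RULE = 15


def estimate_dataplane_rules(services):
    """Estimate data plane rules for one management plane rule.

    `services` is an iterable of (protocol, port) tuples, for example
    ("tcp", 443). Simplified model, not the actual NSX algorithm.
    """
    ports_by_protocol = defaultdict(set)
    for protocol, port in services:
        ports_by_protocol[protocol.upper()].add(port)
    # Each protocol gets its own rule(s); every 15 ports start a new rule.
    return sum(
        ceil(len(ports) / MAX_PORTS_PER_RULE)
        for ports in ports_by_protocol.values()
    )


# One rule with TCP and UDP services becomes two data plane rules.
mixed = estimate_dataplane_rules([("tcp", 80), ("tcp", 443), ("udp", 53)])

# Sixteen TCP ports no longer fit into a single rule of 15 ports.
many_ports = estimate_dataplane_rules(("tcp", port) for port in range(8000, 8016))

print(mixed, many_ports)
```

Multiplying an estimate like this by the number of vNICs the rule is applied to gives a feel for how quickly the host total can outgrow the rule count shown in the management plane.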
To answer that question, we need to know how the rules are stored on the ESXi hosts.
Ultimately, everything has to be stored somewhere, and that includes the firewall rules programmed on the vNICs of the VMs. Each ESXi host has 3GB of Heap size memory reserved for the Distributed Firewall configuration. So, the more rules and the larger the address sets that are used, the sooner the memory runs out. Again, it’s important to use the Applied To, because it can drastically reduce the number of firewall rules that are programmed and stored in memory.
For NSX-V, there is actually a knowledge base article about the Heap Size (KB2146298), because it caused major problems for a number of customers.
I haven’t found a similar KB article for NSX-T, but my assumption is that this still applies.
This means that the tested configuration limits in the configmax give a good estimate and can be viewed as 'soft' limits, while the Heap size will trigger a 'hard' limit.
So, now we know how the rules are programmed and stored. Is there a way to calculate how much space the firewall rules are using on the ESXi hosts?
Well, not that I know of, but there is a way to check the Heap size on the ESXi hosts. At least one way I’ve found so far.
Monitoring the Heap Size
To monitor the Heap size on the ESXi hosts and track its growth over time, take a look at the following page where Dale Coghlan explains it: Monitoring DFW Heap Usage – SneakU
This has helped me several times, because I can never remember all of these commands by heart.
I hope this gives you a little more insight into how “Applied To” works in NSX, and especially why it is so important to use it as often as possible.
Hopefully you enjoyed reading this blog. If you have any questions or would like to see some more, please leave a remark below.