Validating Layer 3 SDN Switch Configurations
Introduction
I have a colleague who builds out network infrastructure for <redacted>, and part of his duties involves configuring Software-Defined Networking (SDN) switches that connect industrial devices into his employer's monitoring apparatus. One purpose of the SDN in this context is to improve the overall security of the network by strictly enforcing the path layer 3 packets take on the layer 2 network (layer 3 switching). For example, before deployment, my colleague sets valid IP:PORT:MAC pairings on each physical port of the SDN switch. If a device tries sending traffic to the switch without respecting these pairings (a scenario ranging from an improperly connected device to an attacker attempting an intrusion), the SDN will drop the offending packets and notify a network administrator. Since the connected hardware rarely changes, these strict networking rules protect sensitive infrastructure from cyber attack without placing a significant burden on engineers to update the SDN.
The downside of this “layer bundling” comes during testing. Typically, if a network needs to test routability for 20 different services, it is easy to spin up a single device, assign that device 20 IP addresses, and then bind each service to an IP. The specific path the layer 3 packets take on the layer 2 network does not matter as long as the packets reach their destination. In the SDN’s case, the layer 2 path does matter because each service must appear on a unique physical port, so replicating this test would require 20 devices or a single device with 20 ethernet interfaces.
Because each industrial computer can cost thousands of dollars, my colleague's group does not have enough devices to properly test the SDN, so the common practice at his company is to test the SDN configuration in the field. Even minor errors can then lead to expensive and time-consuming delays in infrastructure roll-out, because an engineer needs to travel on-site and debug why endpoints are not connecting.
SDN Tester
I wanted to come up with a way for my colleague to test his SDN configurations before deployment using only a single device. The industrial computers used by his company are capable of running multiple services on different ports, so the idea I landed on was to deploy a NAT coupled with a managed switch between the industrial computer and the SDN switch. The managed switch would use port-based VLANs, each of these VLANs would get a single IP assigned to it, and then the NAT would convert between the VLAN IP and the port running the service on the industrial computer. Below is a network diagram outlining the VLAN-NAT SDN tester architecture for 18 endpoints connected to a single device.
The devil is in the details though, and the difficult part here is getting the NAT working. To function properly, the NAT needs to do the following:
1. Create a set of VLAN interfaces on the parent interface connected to the managed switch, each with its own MAC address.
2. Assign a single IP to each of these interfaces on a /32 subnet.
3. Set up sysctl rules to allow ARP replies and IP forwarding from each interface.
4. Create a DNAT rule that changes the destination IP:PORT of inbound traffic to the industrial computer IP:PORT and an SNAT rule that changes the source address to the NAT IP for outbound traffic to the industrial computer.
5. Create FORWARD rules allowing traffic to flow bi-directionally between the VLAN interface and the industrial computer.
6. Set up policy-based routing for return traffic from the industrial computer to the VLAN interface and for traffic originating from the VLAN interface.
Point 6 is especially important because each VLAN interface should only reply to ARP requests received on its own interface. Without the policy routes, the kernel would drop any inbound ARP requests since a route back to the SDN would not exist.
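Before walking through each step, the whole sequence for a single endpoint can be sketched as a dry run that only prints the commands it would execute. All addresses, ports, and the table name below are illustrative placeholders of my own choosing, not values from the real script:

```shell
#!/bin/sh
# Dry-run sketch: print the setup commands for one VLAN<->port mapping.
# Every value here is a placeholder; substitute your own addresses/ports.
print_nat_setup() {
    egress_iface="eth2"; ingress_iface="eth1"
    vlan_id=1; vlan_iface="eth2.1"; vlan_ip="192.168.1.10"
    vlan_subnet="192.168.1.0/24"; nat_port=502
    ic_ip="192.168.2.2"; ic_port=9000; ic_nat_ip="192.168.2.1"
    table="vlan201"

    # 1-2: VLAN interface with a /32 address
    echo "ip link add link $egress_iface name $vlan_iface type vlan id $vlan_id"
    echo "ip addr add $vlan_ip/32 dev $vlan_iface"
    # 3: forwarding (the arp/rp_filter sysctls are set once globally)
    echo "sysctl -w net.ipv4.ip_forward=1"
    # 4: DNAT in, SNAT out
    echo "iptables -t nat -A PREROUTING -i $vlan_iface -p tcp -d $vlan_ip --dport $nat_port -s $vlan_subnet -j DNAT --to-destination $ic_ip:$ic_port"
    echo "iptables -t nat -A POSTROUTING -o $ingress_iface -p tcp -d $ic_ip --dport $ic_port -s $vlan_subnet -j SNAT --to-source $ic_nat_ip"
    # 5: bi-directional forwarding
    echo "iptables -A FORWARD -i $vlan_iface -o $ingress_iface -p tcp --dport $ic_port -j ACCEPT"
    echo "iptables -A FORWARD -i $ingress_iface -o $vlan_iface -p tcp --sport $ic_port -j ACCEPT"
    # 6: per-VLAN policy routing plus the fwmark hook for return traffic
    echo "ip route add $vlan_subnet dev $vlan_iface src $vlan_ip table $table"
    echo "ip rule add from $vlan_ip table $table"
    echo "ip rule add fwmark $vlan_id table $table"
    echo "iptables -t mangle -A PREROUTING -i $ingress_iface -p tcp -s $ic_ip -d $ic_nat_ip --sport $ic_port -j MARK --set-mark $vlan_id"
}
print_nat_setup
```

The real script additionally handles tear-down and status; this sketch only mirrors the creation path.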
Create the VLAN interfaces & assign an IP
Creating the VLAN interfaces is straightforward. For each interface we want to define, create a new link on the interface connected to the managed switch (the egress interface) with a given VLAN ID, set the MAC address, and then assign an IP to the newly created VLAN.
sudo ip link add link "$egress_interface" name "$vlan_iface" type vlan id "$vlan_id"
sudo ip link set dev "$vlan_iface" address "$new_mac"
sudo ip link set "$vlan_iface" up
sudo ip addr flush dev "$vlan_iface"
sudo ip addr add "$vlan_ip/32" dev "$vlan_iface"
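For multiple endpoints, the same commands can be generated in a loop. The sketch below only prints what it would run, and the MAC/IP derivation scheme is my own illustrative choice (the real script reads these values from a csv file):

```shell
#!/bin/sh
# Print (rather than run) the interface-creation commands for four VLANs.
# The base MAC and IP values are illustrative placeholders.
gen_vlan_cmds() {
    egress_interface="eth2"
    for vlan_id in 1 2 3 4; do
        vlan_iface="${egress_interface}.${vlan_id}"
        # Derive a unique MAC and /32 IP from the VLAN ID
        mac=$(printf 'bc:24:11:e1:21:%02x' $((0x95 + vlan_id)))
        vlan_ip="192.168.1.$((9 + vlan_id))"
        echo "ip link add link $egress_interface name $vlan_iface type vlan id $vlan_id"
        echo "ip link set dev $vlan_iface address $mac"
        echo "ip link set $vlan_iface up"
        echo "ip addr add $vlan_ip/32 dev $vlan_iface"
    done
}
gen_vlan_cmds
```

Piping the output through sh (as root) would apply it; the ifconfig listing below shows the result of running the equivalent commands.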
$ ifconfig | grep -A 2 'eth.*\.'
eth2.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.10 netmask 255.255.255.255 broadcast 0.0.0.0
ether bc:24:11:e1:21:96 txqueuelen 1000 (Ethernet)
--
eth2.2: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.11 netmask 255.255.255.255 broadcast 0.0.0.0
ether bc:24:11:e1:21:97 txqueuelen 1000 (Ethernet)
--
eth2.3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.12 netmask 255.255.255.255 broadcast 0.0.0.0
ether bc:24:11:e1:21:98 txqueuelen 1000 (Ethernet)
--
eth2.4: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.1.13 netmask 255.255.255.255 broadcast 0.0.0.0
ether bc:24:11:e1:21:99 txqueuelen 1000 (Ethernet)
Kernel Parameters
The kernel needs the following parameters set to facilitate communication between the SDN and the industrial computer:
ip_forward:1
This allows packets to travel between interfaces and is a requirement of any NAT system. This setting needs to go first because changing it resets all other IPv4 configuration parameters to their default state.
arp_ignore:1
This tells the kernel to reply to ARP requests only if the ARP target IP is the local address of the interface.
arp_announce:2
When sending ARP messages, always use the best local address for the target, ignoring source addresses that do not belong to the outgoing interface. Since each target IP is only reachable from its VLAN interface, only the correct interface will reply.
arp_filter:1
Only answer ARP requests if the kernel would route a packet to the requesting source out of the receiving interface. This setting allows multiple interfaces to share a subnet.
rp_filter:2
Loose reverse-path filtering: only allow incoming packets whose source address is reachable via any interface's FIB, not strictly the receiving interface's.
The code to set these kernel parameters is simply:
sudo sysctl -w net.ipv4.ip_forward=1
sudo sysctl -w net.ipv4.conf.all.arp_ignore=1
sudo sysctl -w net.ipv4.conf.all.arp_announce=2
sudo sysctl -w net.ipv4.conf.all.arp_filter=1
sudo sysctl -w net.ipv4.conf.all.rp_filter=2
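These sysctl -w settings do not survive a reboot. A persistent variant (the file name here is my choice, not from the original script) could live in /etc/sysctl.d:

```
# /etc/sysctl.d/99-vlan-nat.conf (hypothetical file name)
# ip_forward goes first: changing it resets the other IPv4 parameters
net.ipv4.ip_forward = 1
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.all.arp_filter = 1
net.ipv4.conf.all.rp_filter = 2
```

Apply it without rebooting via sudo sysctl --system.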
Create DNAT/SNAT rules
The destination NAT (DNAT) rules change an incoming packet's (PREROUTING) destination IP and PORT to predefined values. In our case, we set the destination IP to the industrial computer and the destination PORT to the PORT expected to receive communications for this VLAN IP.
sudo iptables -t nat -A PREROUTING -i "$vlan_iface" -p "$protocol" \
-d "$nat_ip" --dport "$nat_port" -s "$vlan_subnet" \
-j DNAT --to-destination "$industrial_computer_ip:$industrial_computer_port"
The above command says: in the nat table (-t nat), add a rule to the PREROUTING chain (-A PREROUTING) that matches packets arriving on a specific interface (-i $vlan_iface), using a specific protocol (-p $protocol), with the destination IP:PORT of the VLAN NAT interface (-d $nat_ip --dport $nat_port), from a source on the VLAN subnet (-s $vlan_subnet). If an incoming packet matches, we use the DNAT extension (-j DNAT) to change the destination IP:PORT of the packet to the industrial computer IP:PORT (--to-destination "$industrial_computer_ip:$industrial_computer_port"). Below is an example of what the PREROUTING chain looks like after running four of these commands.
$ iptables -t nat -L PREROUTING -v -n
Chain PREROUTING (policy ACCEPT 33416 packets, 6960K bytes)
pkts bytes target prot opt in out source destination
1 60 DNAT 6 -- eth2.1 * 192.168.1.0/24 192.168.1.10 tcp dpt:502 to:192.168.2.2:9000
1 60 DNAT 6 -- eth2.2 * 192.168.1.0/24 192.168.1.11 tcp dpt:502 to:192.168.2.2:9001
1 60 DNAT 6 -- eth2.3 * 192.168.1.0/24 192.168.1.12 tcp dpt:502 to:192.168.2.2:9002
1 60 DNAT 6 -- eth2.4 * 192.168.1.0/24 192.168.1.13 tcp dpt:502 to:192.168.2.2:9003
The SNAT rules work in the same manner but for traffic leaving the NAT (POSTROUTING). Any traffic outbound on the IP:PORT:INTERFACE connected to the industrial computer will have its source address changed to the NAT IP assigned on the link facing the industrial computer.
sudo iptables -t nat -A POSTROUTING -o "$ingress_interface" -p "$protocol" \
-d "$industrial_computer_ip" --dport "$industrial_computer_port" -s "$vlan_subnet" \
-j SNAT --to-source "$industrial_computer_nat_ip"
$ iptables -t nat -L POSTROUTING -v -n
Chain POSTROUTING (policy ACCEPT 390 packets, 32984 bytes)
pkts bytes target prot opt in out source destination
1 60 SNAT 6 -- * eth1 192.168.1.0/24 192.168.2.2 tcp dpt:9000 to:192.168.2.1
1 60 SNAT 6 -- * eth1 192.168.1.0/24 192.168.2.2 tcp dpt:9001 to:192.168.2.1
1 60 SNAT 6 -- * eth1 192.168.1.0/24 192.168.2.2 tcp dpt:9002 to:192.168.2.1
1 60 SNAT 6 -- * eth1 192.168.1.0/24 192.168.2.2 tcp dpt:9003 to:192.168.2.1
Finally, we set up forwarding rules to allow traffic between the industrial computer and VLAN interfaces.
sudo iptables -A FORWARD -i "$vlan_iface" -o "$ingress_interface" -p "$protocol" --dport "$industrial_computer_port" -j ACCEPT
sudo iptables -A FORWARD -i "$ingress_interface" -o "$vlan_iface" -p "$protocol" --sport "$industrial_computer_port" -j ACCEPT
Policy-Based Routing
The DNAT/SNAT rules ensure that inbound traffic reaches the industrial computer on the correct port, but the problem now becomes how to deal with return traffic. When the industrial computer sends reply traffic back through the NAT, the Linux kernel will set the DST address of these packets back to the SRC of the arriving packets. The kernel sees a range of interfaces it can use to reach the 192.168.1.0/24 subnet and will choose the first available route from this set. This means return traffic can go out on a different interface than it arrived on, violating the SDN's layer 2/layer 3 pairing requirement.
If we try setting DNAT/SNAT rules for return traffic, the kernel will ignore these directives because of conntrack. Conntrack is a kernel module that keeps a per-flow record of all the NAT's IP:PORT modifications, allowing the kernel to automatically set the correct SRC/DST IP:PORT on return packets. Once conntrack registers a flow, both inbound and outbound packets use this record for NAT translation instead of traversing the PREROUTING/POSTROUTING chains, which reduces NAT load. This means specifying DNAT/SNAT rules for return traffic does not work, since the conntrack record will already be handling the translation.
Rather than relying on NAT rules to route traffic back to the correct interface, we can instead modify how the kernel makes routing decisions directly through policy-based routing. Policy-based routing works by creating a new routing table and then conditionally matching packets via ip rules and mangle table entries.
We start by creating a new routing table containing a single route: the entire VLAN subnet is routable from each VLAN interface.
echo "$table_id $table_name" | sudo tee -a /etc/iproute2/rt_tables >/dev/null
sudo ip route add "$vlan_subnet" dev "$vlan_iface" src "$vlan_ip" table "$table_name" 2>/dev/null
$ ip route show table vlan201
192.168.1.0/24 dev eth2.1 scope link src 192.168.1.10
We then add a rule stating that any traffic originating from the VLAN IP address must use this table for routing decisions. This rule is critical for allowing ARP replies from each VLAN interface. Without it, ARP replies would be sent out on whatever interface happens to be first in the system routing table with a route to the VLAN subnet.
sudo ip rule add from "$vlan_ip" table "$table_name" 2>/dev/null
$ ip rule show table vlan201
32757: from 192.168.1.10 lookup vlan201
The next rule applies this table to any traffic originating from the industrial computer port associated with a VLAN. This rule solves the problem with return traffic from the industrial computer - if the returning packets use the system routing table, they will all route using the first entry that connects to the VLAN subnet. Since we know the industrial computer PORT:VLAN IP pairings, we can set any traffic originating from that PORT to use the routing table associated with its VLAN IP.
sudo ip rule add fwmark "$vlan_id" table "$table_name" 2>/dev/null
sudo iptables -t mangle -A PREROUTING -i "$ingress_interface" -p "$protocol" \
-s "$industrial_computer_ip" -d "$industrial_computer_nat_ip" --sport "$industrial_computer_port" \
-j MARK --set-mark "$vlan_id"
$ ip rule show table vlan201
32756: from all fwmark 0x1 lookup vlan201
32757: from 192.168.1.10 lookup vlan201
$ iptables -L -t mangle -n -v
Chain PREROUTING (policy ACCEPT 62495 packets, 12M bytes)
pkts bytes target prot opt in out source destination
0 0 MARK 6 -- eth1 * 192.168.2.2 192.168.2.1 tcp spt:9000 MARK set 0x1
0 0 MARK 6 -- eth1 * 192.168.2.2 192.168.2.1 tcp spt:9001 MARK set 0x2
0 0 MARK 6 -- eth1 * 192.168.2.2 192.168.2.1 tcp spt:9002 MARK set 0x3
0 0 MARK 6 -- eth1 * 192.168.2.2 192.168.2.1 tcp spt:9003 MARK set 0x4
Putting it all together
With the primary NAT functions complete, the final step is to combine these functions into an easy-to-use bash script that reads a csv file containing the INDUSTRIAL_COMPUTER_IP, INDUSTRIAL_COMPUTER_PORT, NAT_IP, NAT_PORT, VLAN, and VLAN_SUBNET pairings. With the hard work of designing the NAT complete, coding this amounts to digital plumbing - i.e. the perfect job for a language model. I wrote a prompt detailing the specification and how each section should perform, and after a bit of back and forth with Claude I had a fully working SDN VLAN NAT script complete with creation, tear-down, and status commands. I had to further modify the script by hand since some of the grep commands Claude wrote weren't matching properly, but overall the generated code worked as expected.
As an aside, when I first started this project I tried throwing the original problem statement at multiple LLMs, but none were able to come up with a working solution. After a few hours of fighting I gave up and sat down with the ip, sysctl, and iptables documentation to create a solution by hand. I've been using LLMs for nearly two years now and have found this to be a familiar theme - anything vaguely novel and you might as well sit down and write the code yourself, but afterwards the LLMs are fantastic for creating all the boilerplate on top. Since I don't use bash as part of my day-to-day, writing the code for this would have been a painful experience of constantly referencing the documentation, but with Claude it took about 15 minutes to get a working script going.
$ ./vlan-nat.sh --help
Usage: ./vlan-nat.sh [OPTIONS] COMMAND
Commands:
create Create VLAN and NAT configuration
destroy Remove VLAN and NAT configuration
status Show current configuration
Options:
-f, --file FILE CSV file (default: data.csv)
-i, --ingress/--industrial_computer IFACE Ingress interface (default: eth0)
-e, --egress/--switch IFACE Egress interface (default: eth1)
-p, --protocol PROTO Protocol (default: tcp)
--dry-run Show what would be done without making changes
--debug Enable debug output
-h, --help Show this help message
I am hosting the full code on my gitlab instance at: SDN-Repo
Testing the SDN VLAN NAT
Since I don’t have an industrial computer or an SDN at home, I have to make do with running everything as VMs in Proxmox. To do this I needed 4 different hosts: an “industrial computer” VM running pymodbus on a range of ports, a “VLAN-NAT” VM running the above NAT code, a “Managed Switch” VM running an Open vSwitch bridge set up in port-based VLAN mode, and a tester VM that would connect to all the outbound ports on the Open vSwitch and ensure each port only had a single reachable IP. To simulate the physical connections between each device, I used Linux bridges in Proxmox.
For the test I set up the data.csv file for vlan-nat.sh as:
INDUSTRIAL_COMPUTER_IP | INDUSTRIAL_COMPUTER_PORT | NAT_IP | NAT_PORT | VLAN | VLAN_SUBNET |
---|---|---|---|---|---|
192.168.2.2 | 9000 | 192.168.1.10 | 502 | 1 | 192.168.1.0/24 |
192.168.2.2 | 9001 | 192.168.1.11 | 502 | 2 | 192.168.1.0/24 |
192.168.2.2 | 9002 | 192.168.1.12 | 502 | 3 | 192.168.1.0/24 |
192.168.2.2 | 9003 | 192.168.1.13 | 502 | 4 | 192.168.1.0/24 |
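On disk, the table above corresponds to a data.csv along these lines (I am assuming a plain comma-separated layout with a header row; the exact format the script expects may differ):

```
INDUSTRIAL_COMPUTER_IP,INDUSTRIAL_COMPUTER_PORT,NAT_IP,NAT_PORT,VLAN,VLAN_SUBNET
192.168.2.2,9000,192.168.1.10,502,1,192.168.1.0/24
192.168.2.2,9001,192.168.1.11,502,2,192.168.1.0/24
192.168.2.2,9002,192.168.1.12,502,3,192.168.1.0/24
192.168.2.2,9003,192.168.1.13,502,4,192.168.1.0/24
```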
And the mapping.csv for the SDN Tester script (a simple wrapper for nmap I wrote for testing SDN networks - it allows specification of the outgoing interface and source IP):
Interface | ServerIP | DeviceIP | Ports |
---|---|---|---|
eth0 | 192.168.1.100/24 | 192.168.1.10 | 502 |
eth2 | 192.168.1.100/24 | 192.168.1.10 | 502 |
eth3 | 192.168.1.100/24 | 192.168.1.10 | 502 |
eth4 | 192.168.1.100/24 | 192.168.1.10 | 502 |
eth0 | 192.168.1.100/24 | 192.168.1.11 | 502 |
eth2 | 192.168.1.100/24 | 192.168.1.11 | 502 |
eth3 | 192.168.1.100/24 | 192.168.1.11 | 502 |
eth4 | 192.168.1.100/24 | 192.168.1.11 | 502 |
eth0 | 192.168.1.100/24 | 192.168.1.12 | 502 |
eth2 | 192.168.1.100/24 | 192.168.1.12 | 502 |
eth3 | 192.168.1.100/24 | 192.168.1.12 | 502 |
eth4 | 192.168.1.100/24 | 192.168.1.12 | 502 |
eth0 | 192.168.1.100/24 | 192.168.1.13 | 502 |
eth2 | 192.168.1.100/24 | 192.168.1.13 | 502 |
eth3 | 192.168.1.100/24 | 192.168.1.13 | 502 |
eth4 | 192.168.1.100/24 | 192.168.1.13 | 502 |
This test checks each endpoint IP from all 4 interfaces to see which would reply. And the results:

I ran another test to show TCP transport working properly. In this test I load a different message to each Modbus server and then connect from the “SDN Test” VM. I threw in some wrong pairings to show each IP is only accessible on certain interfaces:
$ poetry run python modbus-client.py
Received from server 192.168.1.10 on port 502 via 192.168.1.4: This is server 1 running on port 9000
Received from server 192.168.1.11 on port 502 via 192.168.1.5: This is server 2 running on port 9001
Received from server 192.168.1.12 on port 502 via 192.168.1.6: This is server 3 running on port 9002
Received from server 192.168.1.13 on port 502 via 192.168.1.7: This is server 4 running on port 9003
Failed to connect to server 192.168.1.13 at port 502 via 192.168.1.4
Failed to connect to server 192.168.1.12 at port 502 via 192.168.1.4
Failed to connect to server 192.168.1.11 at port 502 via 192.168.1.4
Failed to connect to server 192.168.1.10 at port 502 via 192.168.1.7