Diving into OpenStack Network Architecture - Part 2 - Basic Use Cases
- by Ronen Kofman
In the previous post we reviewed several network components
including Open vSwitch, Network Namespaces, Linux
Bridges and veth pairs. In this post we will take
three simple use cases and see how those basic components come together to
create a complete SDN solution in OpenStack. With those three use cases we will
review almost the entire network setup and see how all the pieces work
together. The use cases we will use are:
1. Create network – what happens when we create a network, and how we can
create multiple isolated networks.
2. Launch a VM – once we have networks we can launch VMs and connect them
to networks.
3. DHCP request from a VM – OpenStack can automatically assign IP addresses
to VMs. This is done through a local DHCP service controlled by OpenStack
Neutron. We will see how this service runs and what a DHCP request and
response look like.
In this post we will show connectivity: we will see how packets
get from point A to point B. We first focus on what a configured deployment
looks like, and only later will we discuss how and when the configuration is
created. Personally I found it very valuable to see the actual interfaces and
how they connect to each other through examples and hands-on experiments. Once
the end game is clear and we know how the connectivity works, a later post
will take a step back and explain how Neutron configures the components to
provide such connectivity.
We are going to get pretty technical shortly and I recommend
trying these examples on your own deployment or using the Oracle OpenStack Tech Preview. Understanding
these three use cases thoroughly and how to look at them will be very helpful
when trying to debug a deployment in case something does not work.
Use case #1: Create Network
Creating a network is a simple operation; it can be performed
from the GUI or the command line. When we create a network in OpenStack, the
network is available only to the tenant who created it, unless it is defined as
“shared”, in which case it can be used by all tenants. A network can have
multiple subnets, but for simplicity this demonstration assumes
that each network has exactly one subnet. Creating a network from the command
line looks like this:
# neutron net-create net1
Created a new network:
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| id                        | 5f833617-6179-4797-b7c0-7d420d84040c |
| name                      | net1                                 |
| provider:network_type     | vlan                                 |
| provider:physical_network | default                              |
| provider:segmentation_id  | 1000                                 |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   |                                      |
| tenant_id                 | 9796e5145ee546508939cd49ad59d51f     |
+---------------------------+--------------------------------------+
Creating a subnet for this network will look like this:
# neutron subnet-create net1 10.10.10.0/24
Created a new subnet:
+------------------+------------------------------------------------+
| Field            | Value                                          |
+------------------+------------------------------------------------+
| allocation_pools | {"start": "10.10.10.2", "end": "10.10.10.254"} |
| cidr             | 10.10.10.0/24                                  |
| dns_nameservers  |                                                |
| enable_dhcp      | True                                           |
| gateway_ip       | 10.10.10.1                                     |
| host_routes      |                                                |
| id               | 2d7a0a58-0674-439a-ad23-d6471aaae9bc           |
| ip_version       | 4                                              |
| name             |                                                |
| network_id       | 5f833617-6179-4797-b7c0-7d420d84040c           |
| tenant_id        | 9796e5145ee546508939cd49ad59d51f               |
+------------------+------------------------------------------------+
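The gateway and allocation pool in the output above follow directly from the CIDR. As a rough sketch (not Neutron's own code), the Python standard library can reproduce the same numbers, assuming the gateway takes the first usable address as it does here. The function name default_allocation_pool is ours, used only for illustration.

```python
import ipaddress


def default_allocation_pool(cidr):
    """Return (gateway, pool_start, pool_end) for an IPv4 subnet.

    Assumption for this sketch: the gateway is the first usable address
    and the pool is every remaining usable address, which matches the
    subnet-create output shown in this post.
    """
    network = ipaddress.ip_network(cidr)
    hosts = list(network.hosts())  # excludes the network and broadcast addresses
    gateway = hosts[0]
    return str(gateway), str(hosts[1]), str(hosts[-1])


print(default_allocation_pool("10.10.10.0/24"))
```

For 10.10.10.0/24 this gives 10.10.10.1 as the gateway and 10.10.10.2 through 10.10.10.254 as the pool, which is what the subnet-create command reported.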
We now have a network and a subnet. In the network topology
view it looks like this:
Now let’s dive in and see what happened under the hood.
Looking at the control node we will discover that a new namespace was created:
# ip netns list
qdhcp-5f833617-6179-4797-b7c0-7d420d84040c
The name of the namespace is qdhcp-<network id> (see above). Let’s look
into the namespace and see what’s in it:
# ip netns exec qdhcp-5f833617-6179-4797-b7c0-7d420d84040c ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
12: tap26c9b807-7c: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:1d:5c:81 brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.3/24 brd 10.10.10.255 scope global tap26c9b807-7c
    inet6 fe80::f816:3eff:fe1d:5c81/64 scope link
       valid_lft forever preferred_lft forever
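When checking many networks it can help to script this lookup. The sketch below builds the namespace name from a network ID and pulls the IPv4 addresses out of `ip addr` text. The helper names (dhcp_namespace, ipv4_addresses) are ours, and the sample text is an abbreviated copy of the output above; on a real node you would feed in the command's actual output.

```python
import re


def dhcp_namespace(network_id):
    """Namespace name as seen in this post: qdhcp-<network id>."""
    return "qdhcp-" + network_id


def ipv4_addresses(ip_addr_output):
    """Map interface name -> list of IPv4 addresses (CIDR form) from `ip addr` text."""
    result = {}
    current = None
    for line in ip_addr_output.splitlines():
        header = re.match(r"^(\d+):\s+([^:@\s]+)", line)
        if header:
            current = header.group(2)
            result[current] = []
            continue
        # "inet " only: the required whitespace keeps inet6 lines out
        inet = re.match(r"^\s+inet\s+(\S+)", line)
        if inet and current is not None:
            result[current].append(inet.group(1))
    return result


SAMPLE = """\
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
12: tap26c9b807-7c: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:1d:5c:81 brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.3/24 brd 10.10.10.255 scope global tap26c9b807-7c
"""

print(dhcp_namespace("5f833617-6179-4797-b7c0-7d420d84040c"))
print(ipv4_addresses(SAMPLE))
```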
We see two interfaces in the namespace: one is the loopback
and the other is an interface called “tap26c9b807-7c”. This
interface has the IP address 10.10.10.3 and it will also serve DHCP requests,
in a way we will see later. Let’s trace the connectivity of the
“tap26c9b807-7c” interface from the namespace. The first stop is OVS, where we
see that the interface connects to the bridge “br-int”:
# ovs-vsctl show
8a069c7c-ea05-4375-93e2-b9fc9e4b3ca1
    Bridge "br-eth2"
        Port "br-eth2"
            Interface "br-eth2"
                type: internal
        Port "eth2"
            Interface "eth2"
        Port "phy-br-eth2"
            Interface "phy-br-eth2"
    Bridge br-ex
        Port br-ex
            Interface br-ex
                type: internal
    Bridge br-int
        Port "int-br-eth2"
            Interface "int-br-eth2"
        Port "tap26c9b807-7c"
            tag: 1
            Interface "tap26c9b807-7c"
                type: internal
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "1.11.0"
In the output above we have a veth pair with two ends called
“int-br-eth2” and “phy-br-eth2”. This veth pair connects two OVS bridges,
“br-eth2” and “br-int”. In the previous post
we explained how to check veth connectivity using
the ethtool command. It shows that the
two are indeed a pair:
# ethtool -S int-br-eth2
NIC statistics:
     peer_ifindex: 10
.
.
# ip link
.
.
10: phy-br-eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
.
.
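The manual check above (read peer_ifindex, then find that index in `ip link`) is easy to automate. Here is a minimal sketch that works on the text of the two commands. The function names are ours, and the sample strings are trimmed copies of the output above.

```python
import re


def peer_ifindex(ethtool_output):
    """Extract the peer interface index from `ethtool -S <veth>` text."""
    match = re.search(r"peer_ifindex:\s*(\d+)", ethtool_output)
    if match is None:
        raise ValueError("no peer_ifindex found; is this a veth device?")
    return int(match.group(1))


def ifname_by_index(ip_link_output, index):
    """Return the interface name that has the given index in `ip link` text."""
    for line in ip_link_output.splitlines():
        match = re.match(r"^(\d+):\s+([^:@\s]+)", line)
        if match and int(match.group(1)) == index:
            return match.group(2)
    return None


ETHTOOL = "NIC statistics:\n     peer_ifindex: 10\n"
IP_LINK = (
    "10: phy-br-eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> "
    "mtu 1500 qdisc pfifo_fast state UP qlen 1000\n"
)

print(ifname_by_index(IP_LINK, peer_ifindex(ETHTOOL)))
```

With the sample text this resolves the peer of int-br-eth2 to phy-br-eth2, the same conclusion we reached by eye.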
Note that “phy-br-eth2”
is connected to a bridge called “br-eth2”, and one of this bridge's interfaces
is the physical link eth2. This means that creating the network has
created a namespace which is connected to the physical interface
eth2. eth2 is the “VM network”, the physical interface
to which all the virtual machines are connected.
About network isolation:
OpenStack supports creation of multiple isolated networks
and can use several mechanisms to isolate the networks from one another. The
isolation mechanism can be VLANs, VXLANs or GRE tunnels. This
is configured as part of the initial setup; in our deployment we use VLANs. When
using VLAN tagging as an isolation mechanism, a VLAN tag is allocated by
Neutron from a pre-defined pool of VLAN tags and assigned to the newly created
network. By provisioning VLAN tags to the networks, Neutron allows creation of
multiple isolated networks on the same physical link. The big difference between
this and other platforms is that the user does not have to deal with allocating
and managing VLANs for networks. VLAN allocation and provisioning is handled by
Neutron, which keeps track of the VLAN tags and is responsible for allocating
and reclaiming them. In the example above net1 has the VLAN tag 1000,
which means that whenever a VM is created and connected to this network, the
packets from that VM will have to be tagged with VLAN tag 1000 to go on this
particular network. This is true for a namespace as well: if we would like to
connect a namespace to a particular network we have to make sure that the
packets to and from the namespace are correctly tagged when they reach the VM
network.
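To make the allocate-and-reclaim idea concrete, here is a toy model of a VLAN tag pool. It is only an illustration of the bookkeeping described above and is not how Neutron implements it. The class name VlanPool and the range 1000 to 1002 are made up for the example.

```python
class VlanPool:
    """Toy VLAN tag pool: hands out the lowest free tag per network.

    Illustration only; Neutron's real allocation logic is not shown in
    this post and may differ.
    """

    def __init__(self, first, last):
        self._free = set(range(first, last + 1))
        self._by_network = {}

    def allocate(self, network_id):
        # A network that already has a tag keeps it.
        if network_id in self._by_network:
            return self._by_network[network_id]
        if not self._free:
            raise RuntimeError("VLAN pool exhausted")
        tag = min(self._free)
        self._free.remove(tag)
        self._by_network[network_id] = tag
        return tag

    def release(self, network_id):
        # Return the tag to the pool when the network is deleted.
        tag = self._by_network.pop(network_id)
        self._free.add(tag)


demo = VlanPool(1000, 1002)  # hypothetical tag range
print(demo.allocate("net1"))
print(demo.allocate("net2"))
```

The point of the model is that the user only ever names the network; the tag is picked and later reclaimed without the user being involved.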
In the example above we see that the namespace interface “tap26c9b807-7c”
has VLAN tag 1 assigned to it. If we examine OVS we
see that it has flows which modify VLAN tag 1 to VLAN tag 1000 when a packet
goes to the VM network on eth2, and vice versa. We can see this using the
dump-flows command on OVS. For packets going to the VM network we see the
modification done on br-eth2:
# ovs-ofctl dump-flows br-eth2
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=18669.401s, table=0, n_packets=857, n_bytes=163350, idle_age=25, priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:1000,NORMAL
 cookie=0x0, duration=165108.226s, table=0, n_packets=14, n_bytes=1000, idle_age=5343, hard_age=65534, priority=2,in_port=2 actions=drop
 cookie=0x0, duration=165109.813s, table=0, n_packets=1671, n_bytes=213304, idle_age=25, hard_age=65534, priority=1 actions=NORMAL
For packets arriving from the VM network on their way to the namespace we
see the reverse modification on br-int:
# ovs-ofctl dump-flows br-int
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=18690.876s, table=0, n_packets=1610, n_bytes=210752, idle_age=1, priority=3,in_port=1,dl_vlan=1000 actions=mod_vlan_vid:1,NORMAL
 cookie=0x0, duration=165130.01s, table=0, n_packets=75, n_bytes=3686, idle_age=4212, hard_age=65534, priority=2,in_port=1 actions=drop
 cookie=0x0, duration=165131.96s, table=0, n_packets=863, n_bytes=160727, idle_age=1, hard_age=65534, priority=1 actions=NORMAL
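Reading VLAN rewrites out of dump-flows output by eye gets tedious on a busy node. The sketch below extracts them from the text, assuming flows of the same shape as the ones shown in this post (an in_port and dl_vlan match followed by a mod_vlan_vid action). The function name vlan_rewrites is ours, and the sample strings are shortened copies of the flows above.

```python
import re

# Matches flows shaped like: in_port=2,dl_vlan=1 actions=mod_vlan_vid:1000
REWRITE = re.compile(r"in_port=(\d+),dl_vlan=(\d+) actions=mod_vlan_vid:(\d+)")


def vlan_rewrites(dump_flows_output):
    """Return a list of (in_port, from_vlan, to_vlan) tuples."""
    # Collapse whitespace so flows wrapped across lines still match.
    text = " ".join(dump_flows_output.split())
    return [tuple(int(g) for g in m.groups()) for m in REWRITE.finditer(text)]


BR_ETH2 = """\
 cookie=0x0, table=0, priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:1000,NORMAL
 cookie=0x0, table=0, priority=2,in_port=2 actions=drop
 cookie=0x0, table=0, priority=1 actions=NORMAL
"""
BR_INT = """\
 cookie=0x0, table=0, priority=3,in_port=1,dl_vlan=1000
 actions=mod_vlan_vid:1,NORMAL
 cookie=0x0, table=0, priority=2,in_port=1 actions=drop
"""

print(vlan_rewrites(BR_ETH2))
print(vlan_rewrites(BR_INT))
```

On the samples this reports tag 1 becoming 1000 on br-eth2 and tag 1000 becoming 1 on br-int, the two directions of the same translation.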
To summarize, when a user creates a network Neutron creates a namespace, and
this namespace is connected through OVS to the “VM network”. OVS also takes
care of tagging the packets from the namespace to the VM network with the
correct VLAN tag and knows to modify the VLAN for packets coming from the VM
network to the namespace. Now let’s see what happens when a VM is launched and
how it is connected to the “VM network”.
Use case #2: Launch a VM
Launching a VM can be done from Horizon or from the command
line. This is how we do it from Horizon:
Attach the network:
And Launch
Once the virtual machine is up and running we can see the
associated IP using the nova list command:
# nova list
+--------------------------------------+--------------+--------+------------+-------------+-----------------+
| ID                                   | Name         | Status | Task State | Power State | Networks        |
+--------------------------------------+--------------+--------+------------+-------------+-----------------+
| 3707ac87-4f5d-4349-b7ed-3a673f55e5e1 | Oracle Linux | ACTIVE | None       | Running     | net1=10.10.10.2 |
+--------------------------------------+--------------+--------+------------+-------------+-----------------+
The nova list command shows us that the VM is
running and that the IP 10.10.10.2 is assigned to it. Let’s trace the
connectivity from the VM to the VM network on eth2, starting with the VM
definition file. The configuration files of the VM, including the virtual
disk(s) in the case of ephemeral storage, are stored on the compute node at
/var/lib/nova/instances/<instance-id>/.
Looking into the VM definition file, libvirt.xml, we see that the VM is
connected to an interface called “tap53903a95-82” which is connected to a
Linux bridge called “qbr53903a95-82”:
<interface type="bridge">
  <mac address="fa:16:3e:fe:c7:87"/>
  <source bridge="qbr53903a95-82"/>
  <target dev="tap53903a95-82"/>
</interface>
Looking at the bridge using the brctl show
command we see this:
# brctl show
bridge name      bridge id           STP enabled   interfaces
qbr53903a95-82   8000.7e7f3282b836   no            qvb53903a95-82
                                                   tap53903a95-82
The bridge has two interfaces: one connected
to the VM (“tap53903a95-82”) and another one (“qvb53903a95-82”) connected
to the “br-int” bridge on OVS, where its peer appears as “qvo53903a95-82”:
# ovs-vsctl show
83c42f80-77e9-46c8-8560-7697d76de51c
    Bridge "br-eth2"
        Port "br-eth2"
            Interface "br-eth2"
                type: internal
        Port "eth2"
            Interface "eth2"
        Port "phy-br-eth2"
            Interface "phy-br-eth2"
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
        Port "int-br-eth2"
            Interface "int-br-eth2"
        Port "qvo53903a95-82"
            tag: 3
            Interface "qvo53903a95-82"
    ovs_version: "1.11.0"
As we showed earlier, “br-int”
is connected to “br-eth2” on OVS using the veth
pair int-br-eth2 / phy-br-eth2, and br-eth2 is connected to the physical
interface eth2.
The whole flow end to end looks like this:
VM → tap53903a95-82 (virtual interface) → qbr53903a95-82 (Linux bridge) →
qvb53903a95-82 (interface connected from Linux bridge to OVS bridge br-int) →
int-br-eth2 (veth one end) → phy-br-eth2 (veth the other end) → eth2
physical interface.
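All the per-VM devices in this chain share the same suffix (53903a95-82 here), with only the prefix changing. The sketch below uses that pattern, as observed in the outputs in this post, to derive the sibling device names from the tap name and list the hops to the physical NIC. It also lists the qvo port and the two OVS bridges seen in the ovs-vsctl output. The function names and default bridge names are ours and reflect this deployment only.

```python
PREFIXES = ("tap", "qbr", "qvb", "qvo")


def hybrid_device_names(tap_name):
    """Derive the qbr/qvb/qvo names that share a tap device's suffix.

    Assumption: the naming pattern observed in this post's output.
    """
    if not tap_name.startswith("tap"):
        raise ValueError("expected a device name starting with 'tap'")
    suffix = tap_name[3:]
    return {prefix: prefix + suffix for prefix in PREFIXES}


def path_to_physical(tap_name, integration_bridge="br-int",
                     physical_bridge="br-eth2", nic="eth2"):
    """List the hops from the VM's tap device to the physical interface."""
    names = hybrid_device_names(tap_name)
    return [
        names["tap"],             # VM's virtual interface
        names["qbr"],             # Linux bridge (security groups)
        names["qvb"],             # Linux bridge side of the link to OVS
        names["qvo"],             # port on the integration bridge
        integration_bridge,
        "int-" + physical_bridge,  # veth end on br-int
        "phy-" + physical_bridge,  # veth end on br-eth2
        physical_bridge,
        nic,
    ]


print(" -> ".join(path_to_physical("tap53903a95-82")))
```

A list like this is a handy checklist when debugging: each name is a place where tcpdump can be attached to see how far a packet gets.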
The purpose of the Linux bridge connected to the VM is to
allow security group enforcement with iptables.
Security groups are enforced at the edge point, which is
the interface of the VM. Since iptables rules cannot
be applied to OVS bridges, we use a Linux bridge to apply them. In the future we
hope to see this Linux bridge go away.
VLAN tags: As we discussed in the first use case, net1
is using VLAN tag 1000. Looking at OVS above we see that qvo53903a95-82
is tagged with VLAN tag 3. The modification from VLAN tag 3 to 1000 as
we go to the physical network is done by OVS as part of the packet flow of
br-eth2, in the same way we showed before.
To summarize, when a VM is launched it is connected to the
VM network through a chain of elements as described here. As a packet travels
from the VM to the network and back, its VLAN tag is modified.
Use case #3: Serving a DHCP request coming from the virtual machine
In the previous use cases we have shown that both the
namespace called qdhcp-<network id> and the VM end
up connecting to the physical interface eth2 on their respective nodes, and
that packets from both are tagged with VLAN tag 1000 on the VM network. We saw
that the namespace has an interface with the IP
10.10.10.3. Since the VM and the namespace are connected to each other and
have interfaces on the same subnet, they can ping each other. In this picture we
see a ping from the VM, which was assigned 10.10.10.2, to the namespace:
The fact that they are connected and can ping each other can
become very handy when something doesn’t work right and we need to isolate the
problem. In such a case, knowing that we should be able to ping from the VM to
the namespace and back lets us trace where connectivity breaks
using tcpdump or other monitoring tools.
To serve DHCP requests coming from VMs on the network,
Neutron uses a Linux tool called “dnsmasq”, a lightweight DNS and DHCP
service. If we look at the dnsmasq process on the control node with the ps
command we see this:
dnsmasq --no-hosts --no-resolv --strict-order --bind-interfaces
--interface=tap26c9b807-7c --except-interface=lo
--pid-file=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/pid
--dhcp-hostsfile=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/host
--dhcp-optsfile=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/opts
--leasefile-ro --dhcp-range=tag0,10.10.10.0,static,120s
--dhcp-lease-max=256 --conf-file= --domain=openstacklocal
The service connects to the tap interface in the namespace (“--interface=tap26c9b807-7c”).
If we look at the hosts file we see this:
# cat /var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/host
fa:16:3e:fe:c7:87,host-10-10-10-2.openstacklocal,10.10.10.2
If you look at the VM definition file above (libvirt.xml) you can see the MAC
address fa:16:3e:fe:c7:87, which is the VM's MAC. This MAC
address is mapped to IP 10.10.10.2, so when a DHCP request comes in with this
MAC, dnsmasq will return 10.10.10.2. If we look
into the namespace at the time we initiate a DHCP request from the VM (this can
be done by simply restarting the network service in the VM) we see the
following:
# ip netns exec qdhcp-5f833617-6179-4797-b7c0-7d420d84040c tcpdump -n
19:27:12.191280 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:fe:c7:87, length 310
19:27:12.191666 IP 10.10.10.3.bootps > 10.10.10.2.bootpc: BOOTP/DHCP, Reply, length 325
To summarize, the DHCP service is handled by dnsmasq, which is configured by
Neutron to listen on the interface in the DHCP namespace. Neutron also
configures dnsmasq with the combination of MAC and IP, so when a DHCP request
comes along the VM will receive its assigned IP.
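The static mapping is simple enough to model in a few lines. The sketch below parses hosts-file lines of the form shown earlier (MAC, hostname, IP separated by commas) and looks up the address a given MAC would be handed. It imitates the lookup only and is not dnsmasq's implementation; the function names are ours.

```python
def parse_dhcp_hosts(text):
    """Map MAC -> (hostname, ip) from lines shaped like 'mac,hostname,ip'."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        mac, hostname, ip = line.split(",")
        table[mac.lower()] = (hostname, ip)
    return table


def address_for(table, mac):
    """Return the IP reserved for this MAC, or None if the MAC is unknown."""
    entry = table.get(mac.lower())
    return entry[1] if entry else None


HOSTS = "fa:16:3e:fe:c7:87,host-10-10-10-2.openstacklocal,10.10.10.2\n"

table = parse_dhcp_hosts(HOSTS)
print(address_for(table, "FA:16:3E:FE:C7:87"))
```

A MAC that is not in the file gets no answer in this model, which is a useful thing to check when a VM fails to obtain an address: compare the VM's MAC against the hosts file first.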
Summary
In this post we relied on the components described in the
previous post and saw how network connectivity is achieved using three simple
use cases. These use cases gave a good view of the entire network stack and
helped us understand how an end-to-end connection is made between a VM on a
compute node and the DHCP namespace on the control node. One conclusion we can draw
from what we saw here is that if we launch a VM and it is able to perform a
DHCP request and receive a correct IP then there is reason to believe that the
network is working as expected. We saw that a packet has to travel through a
long list of components before reaching its destination and if it has done so
successfully this means that many components are functioning properly.
In the next post we will look at some more sophisticated services
Neutron supports and see how they work. We will see that while there are some
more components involved, for the most part the concepts are the same.
@RonenKofman