The controller can modify flow entries on demand to update the route for each packet.

The control plane runs on the controller server, with the purpose of sending instructions to the network to update routes. Our orchestrator server sets up applications in the computing nodes and collects statistics to measure performance. The northbound and southbound interfaces enable communication from the SDN controller to the orchestrator and to the data plane, respectively. We also configured SNMP and SCPI, for coarse-grained statistics and for sending instructions to the optical switch, respectively. To differentiate between host and guest machines for both servers and switches, we use the terms host or physical when not referring to the virtualized elements.

The purpose of the control network is to create out-of-band links that facilitate experiment orchestration and data collection from the testbed without sending instructions through the UC Davis network. In this manner, we have full control over the infrastructure and the network design, and we keep dedicated links without worrying about bandwidth availability or external outages. The individual clouds in Figure 2.3 illustrate different networks, each with its own IP addressing. The orchestrator, EPS chassis, controller, node1, and node2 are connected with physical 1 Gbps Ethernet links through an electronic switch. The optical switch is also connected to the orchestrator with a 1 Gbps Ethernet cable. Our computing nodes vm1, vm2, vm3, and vm4 are hosted in node1 and node2 and interconnected with virtual links.

As we mentioned at the beginning of the chapter, our data plane is a reconfigurable optical network driven by an optical switch and four electronic packet switches. Bridges br1, br2, br3, and br4 are hosted in an EPS chassis configured to run in OVS mode, which means that they are optimized for OpenFlow applications and receive instructions from a centralized SDN controller. Each individual link can forward data at rates up to 10 Gbps. Figure 2.4 represents the general topology of our data plane. Nevertheless, the physical links can be attached to or removed from different bridges in software, a consequence of the flexibility that OVS provides; a minimal sketch of such an operation is shown below. We use the terms bridge, EPS, and ToR as synonyms.
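As an illustration of this flexibility, the snippet below shows how a physical port might be moved between OVS bridges from the orchestrator. The bridge and port names are hypothetical, and invoking `ovs-vsctl` locally through a Python wrapper is an assumption for this sketch rather than the exact procedure used in our deployment.

```python
import subprocess

def run(cmd):
    """Run a shell command and return its stdout, raising on failure."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

def move_port(port, src_bridge, dst_bridge):
    """Detach a physical port from one OVS bridge and attach it to another."""
    # Remove the port from its current bridge (no error if it is not there).
    run(["ovs-vsctl", "--if-exists", "del-port", src_bridge, port])
    # Attach the same physical port to the destination bridge.
    run(["ovs-vsctl", "add-port", dst_bridge, port])

if __name__ == "__main__":
    # Hypothetical names: move port te-1/1/1 from br1 to br2.
    move_port("te-1/1/1", "br1", "br2")
    print(run(["ovs-vsctl", "list-ports", "br2"]))
```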

SDN-enabled devices communicate with a controller through a channel using the OpenFlow protocol. In our testbed, the SDN controller is the built-in application OFCTL. OpenFlow switches use flow tables to route packets by matching one or more fields, which is how we achieve dynamic routing in a software-defined network. Pica8 PicOS is the network operating system that runs on our physical EPS. It allows the chassis to run in two modes, traditional L2/L3 and OVS; we use the latter, as it fits our SDN design. In our testbed, we handle the traffic as needed without traditional routing protocols like OSPF. If entries in a flow table overlap, the forwarding action is chosen unpredictably, which is not desired; to avoid that, we vary the priority field in our experiments. We use a single flow table for each bridge, and addressing is done with IPv4. The relevant fields for our testbed are described in Table 2.1; a minimal example of installing such a flow is sketched below.

An alternative way to run SDN tests is to use the L2/L3 mode in PicOS with cross flow enabled. This hybrid approach allows a combination of both networking paradigms. Network operators interact with the switch through Linux commands or a CLI similar to the Juniper Junos interface. The flexibility of having two consoles is useful for researchers with different backgrounds: some may be experienced with Linux servers, others with enterprise or service-provider networking. However, throughout our experiments we found issues while performing optical reconfiguration in hybrid operation. We observed that the ports where the transceivers are connected remained in the down state after we executed the optical reconfiguration. We had to restart the PicOS service and reinstall the flows to bring the ports back up, adding at least one minute to the reconfiguration. Hence, we decided to run in OVS mode.

So far, we have discussed the pertinence of flows in our testbed and how the orchestrator uses a REST API to the SDN controller to update routes. Now we will introduce the concept of a flow in computer networks. RFCs are technical documents published by the IETF after being written and reviewed by interested parties.
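As an illustration of the fields in Table 2.1, a flow entry that matches on the IPv4 destination and carries an explicit priority could be installed on one of the bridges as sketched below. The bridge name, addresses, and output port are hypothetical, and the sketch uses the standard `ovs-ofctl` utility rather than the controller's REST API.

```python
import subprocess

def ofctl(*args):
    """Invoke ovs-ofctl and return its output."""
    return subprocess.run(["ovs-ofctl", *args],
                          check=True, capture_output=True, text=True).stdout

def install_route(bridge, dst_ip, out_port, priority=100):
    """Install a flow that forwards IPv4 traffic for dst_ip out of out_port."""
    entry = f"priority={priority},ip,nw_dst={dst_ip},actions=output:{out_port}"
    ofctl("add-flow", bridge, entry)

if __name__ == "__main__":
    # Hypothetical values: steer traffic destined to vm3 out of port 2 on br2.
    install_route("br2", "10.0.0.3", 2)
    # Verify the installed entries, including their match counters.
    print(ofctl("dump-flows", "br2"))
```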

RFCs cover the foundations of computer networks, namely transport, addressing, and routing. From RFCs 2722 and 3697, a flow is defined as the set of packets sent from a source to a destination, which may be unicast, multicast, or anycast, with specific attributes. An alternate definition of flow refers to the packets in a single physical medium or stream of data. In the context of our testbed, a flow is an entry in a table attached to a specific bridge, with fields that are matched so that packets follow the desired route towards their destination.

Two EXXACT chassis, node1 and node2, host the virtual machines vm1, vm2, vm3, and vm4. Each physical server has an Intel X710 dual-port NIC that supports up to 10 Gbps per port. The main issue with this card was transceiver compatibility, because it does not accept generic devices but only products listed in the Intel compatibility tool. Every virtual server has its own dedicated 10 Gbps port that connects to the data plane, as illustrated in Figure 2.4. Additionally, we connected the virtual machines to the orchestrator with dedicated 1 Gbps NICs, separate from the 10 Gbps cards. Figure 3.2 represents the wired and virtual links between the orchestrator, the host servers, and the virtual machines. Virtual bridges inside node1 and node2 map the traffic from the guest servers to the exterior. These interconnections enable the orchestration of applications in our experiments from a principal server. In the rest of the thesis, we use virtual machine, virtual server, computing node, and computing server as synonyms.

Technologies for optical switches were introduced in Chapter 1. In our testbed, optical reconfigurations were performed with a Polatis optical switch tray (OST), a single-mode MEMS device that takes up to 25 ms to steer the light beam to the new port. Instructions for switching are sent from the orchestrator as SCPI commands through a TCP socket; a minimal sketch of such an exchange is shown below. The Ethernet interface of the switch speeds up deployment and integration with our testbed control network. The lack of an OpenFlow agent in the OST does not allow us to seamlessly integrate the optical switch with an SDN controller. Nevertheless, more recent chassis such as the Polatis 6000 and 7000 series come with OpenFlow agents that enable centralized management with SDN.
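To give an idea of how the orchestrator drives the optical switch, the sketch below opens a TCP socket and sends an SCPI cross-connect command. The IP address, TCP port, and command strings are assumptions; the exact SCPI syntax accepted by the switch depends on its firmware and should be taken from the Polatis documentation.

```python
import socket

SWITCH_ADDR = ("192.168.1.50", 3082)   # hypothetical management IP and SCPI port

def send_scpi(command, timeout=2.0):
    """Send one SCPI command to the optical switch and return any reply."""
    with socket.create_connection(SWITCH_ADDR, timeout=timeout) as sock:
        sock.sendall((command + "\r\n").encode("ascii"))
        try:
            return sock.recv(4096).decode("ascii", errors="replace")
        except socket.timeout:
            return ""  # set-type commands may not send a response

if __name__ == "__main__":
    # Hypothetical cross-connect: join ingress port 1 to egress port 9.
    send_scpi(":oxc:swit:conn:add (@1),(@9)")
    # Query the current connection table to confirm the change (assumed query form).
    print(send_scpi(":oxc:swit:conn:stat?"))
```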

These products offer flexibility for the industry. However, they are built on top of the same MEMS-based technology, and the switching latency is still on the order of tens of milliseconds.

Small Form-factor Pluggable (SFP) modules are defined by Cisco as compact optical transceivers. They create an interface between optical and electrical communications, with a transmitter and a receiver on both sides of the link, and support several communication standards such as Ethernet, SONET, and PON. They are manufactured for different purposes: single mode, multimode, short and long distances, fiber and copper. It is important to choose the appropriate SFP for the application; otherwise the deployment may not work properly or the devices could be damaged. We installed two types of 10 Gbps transceivers. The first works with multimode fiber in the 850 nm band and allows us to connect the computing nodes with the electronic packet switches, as observed in Figure 2.4. The second kind is a 10 Gbps DWDM SFP operating in the C band. By using single-mode fiber patch cords to connect our electronic packet switches and the optical switch, we created a reconfigurable optical network that supports multiple wavelengths.

Various tools are available to generate packets between hosts, among them packETH, Ostinato, D-ITG, MGEN, iperf, and TRex. The last is developed and maintained by Cisco and offers scalability up to 200 Gbps per server. However, this tool is made for a single server; that is, the transmitter and receiver are hosted in the same physical machine, which must have at least two NICs from the supported models. The Device Under Test can be an external element or virtualized. CodiLime, a company that specializes in software and computer networks, customized TRex to allow testing a network from different start and end points. We tested this traffic generator, but the installation, configuration, and integration with our testbed were not straightforward. In contrast, iPerf is quick to install and execute, shows no issues with drivers, and integrated successfully with our experiments (a minimal invocation is sketched below). This tool has been widely used in research and generates synthetic packets to emulate traffic between servers.

Several studies have shown different approaches to gathering statistics for analyzing network performance. Platforms like NetSeer, Jetstream, and Planck have demonstrated improvements in the detection of network performance anomalies, including packet drops, decreased throughput, and increased latency, at different scales such as data center and cloud. Zhang et al. [99] compare two ways of gathering data from computer networks, fine and coarse sampling. To analyze testbed latencies and network performance metrics, we used tcpdump, as we need end-to-end per-packet resolution. On the other hand, SNMP counters work well to confirm that the data stream is going through the desired route when we design the reconfiguration paths. Zabbix is a popular tool in enterprises for monitoring systems, including SNMP statistics, and Docker versions exist for agile deployment. With this software we observed how the traffic flows through the EPS interfaces. To monitor traffic per flow, we deployed an sFlow collector and sFlow agents on the virtual bridges.

Recalling the first chapter, the goals of this research comprise building a networking and computing testbed with SDN capabilities to demonstrate the benefit of a make-before-break (MBB) approach that combines optical and electronic packet switches to achieve hitless reconfiguration.
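For reference, traffic between a pair of servers can be driven from the orchestrator roughly as follows. The sketch assumes iperf3 is installed on both virtual machines and that the receiving side is already running `iperf3 -s`; the address, rate, and duration are illustrative values in the range used in our experiments.

```python
import json
import subprocess

def run_iperf3(server_ip, rate="5G", duration=20):
    """Run an iperf3 client toward server_ip and return the parsed JSON report."""
    cmd = ["iperf3", "-c", server_ip, "-b", rate, "-t", str(duration), "-J"]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return json.loads(out)

if __name__ == "__main__":
    # Hypothetical address of the receiving VM on the 10 Gbps data plane.
    result = run_iperf3("10.0.0.3", rate="5G", duration=20)
    gbps = result["end"]["sum_received"]["bits_per_second"] / 1e9
    print(f"mean receiver throughput: {gbps:.2f} Gbps")
```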
In Figure 2.1 we show the architecture of the testbed and the meaning of each element in the network diagram. The data plane encompasses an optical switch and four electronic packet switches connected to an SDN control plane. The topology can be modified as needed with Open vSwitch commands. Additionally, an orchestrator sends the route updates to the controller and to the optical switch. In the first subsection we analyze the hardware and software switching latency introduced by our control plane. Next, we compare the throughput, round-trip time, and packet loss of a single data stream between a pair of servers, and we show the benefit of our make-before-break approach for updating the route compared to a plain optical reconfiguration. Finally, similar to the single-stream experiments, we show the metrics of a route update with two data streams to demonstrate the benefit of make before break; we refer to this last subsection as the bandwidth steering experiments.

Five main switching latencies were found in our testbed; a summary is shown in Table 4.1, and we discuss each one in the following subsections. Overall, the dominant latency of 605 ms is introduced by the transceivers and the operating system of the host electronic packet switch. In later sections, we show how our MBB approach avoids interrupting the data stream when a path reconfiguration makes a link unavailable.

We define the optical switching latency as the time from the beginning of the reconfiguration performed by the optical switch, which causes the ports to go down, until all the transceivers report an up state in the ToR chassis. The topology in Figure 4.1 shows a single data stream between servers vm2 and vm3. At the beginning, the data transfer goes through EPSs br2, br1, br4, and br3. Then we perform the path reconfiguration with the optical switch ost1, and the new route between servers passes through switches br2 and br3. A summary of the steps performed in our experiment is shown in Table 4.2. Servers vm2 and vm3 run iperf in server and client mode, respectively. The data rate on the sender side is set at values from 5 Gbps to 10 Gbps, with a duration of 20 s. After 485 experiments, we obtained the logs from the physical electronic packet switch, which hosts the virtual switches br1, br2, br3, and br4. We then identified all the interface-flapping events for the ports in the topology shown in Figure 4.1. Finally, we calculated the time difference between these events, port down and port up, to obtain the summary statistics in Table 4.3 and Figure 4.2; the sketch below illustrates this post-processing.
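As an illustration of that last step, the script below pairs port-down and port-up events from a switch log and summarizes the resulting latencies. The log line format, port naming, and file name are hypothetical; the actual PicOS log syntax differs, so the regular expression would need to be adapted.

```python
import re
import statistics
from datetime import datetime

# Hypothetical log format: "2023-05-04 12:00:01.123 ... te-1/1/49 ... link down"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+).*?"
    r"(?P<port>te-\S+).*?link (?P<state>up|down)"
)

def flap_durations(log_path):
    """Return the list of down-to-up durations (seconds), one per flap event."""
    down_since = {}          # port -> timestamp of the last "down" event
    durations = []
    with open(log_path) as f:
        for line in f:
            m = LINE_RE.search(line)
            if not m:
                continue
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f")
            if m["state"] == "down":
                down_since[m["port"]] = ts
            elif m["port"] in down_since:
                durations.append((ts - down_since.pop(m["port"])).total_seconds())
    return durations

if __name__ == "__main__":
    d = flap_durations("eps_chassis.log")   # hypothetical log file name
    print(f"events: {len(d)}, mean: {statistics.mean(d):.3f} s, "
          f"min: {min(d):.3f} s, max: {max(d):.3f} s")
```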