Using the OpenStack APIs for building LBaaS

By Jake Edge
November 11, 2015

Praveen Yalagandula gave a presentation at the Tokyo OpenStack Summit about some of the work that his company, Avi Networks, did in creating a network service to run atop OpenStack. His talk also focused on the OpenStack APIs that the company used. In the grand tradition of such talks, it was billed as "the good, the bad, and the ugly" of those APIs. In implementing "load balancing as a service" (LBaaS) using OpenStack, the company found APIs of each type, which Yalagandula described in his talk.

Load balancing as a service

The application that the company is building is an "enterprise-grade scalable network service" to do load balancing. It needs to scale out and scale in as demand changes, be highly available, and deliver high performance. It also provides tenant isolation, so that multiple customers can use the service at the same time and allocate resources for themselves without affecting other customers.

One of the main goals of the project was to build on top of OpenStack, rather than in or alongside the cloud service. That means using only the APIs provided by OpenStack components; it is a layered design somewhat akin to running a program in user space. Running in OpenStack would mean adding another component to the cloud-management layer that accessed the OpenStack core message queue directly, or perhaps adding the functionality directly into the Neutron network component. Running LBaaS alongside OpenStack would mean creating a component that ran outside of the framework, but that still understood and could use the virtualized networking set up by an OpenStack cloud.

Running the service on top of OpenStack was chosen because it provides flexible deployment models for the service, Yalagandula said. There are multiple ways to deploy the various components of the service, and OpenStack supports that flexibility well. In addition, OpenStack provides easy management for the compute nodes and underlying network virtualization that the service can use.

He then gave a quick introduction to load balancing. Essentially, the idea is to spread the traffic from users across multiple web servers so that no single server gets overloaded. In addition, load balancers monitor the servers and stop sending traffic to those that have failed. Many web applications, though, have several tiers (web servers, application servers, database servers), each of which has multiple servers that are fronted by its own load balancer.
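As a minimal sketch of the two behaviors just described, spreading requests across servers and skipping those that have failed, consider the following Python. The RoundRobinBalancer class and its method names are hypothetical, and the manual mark_down() call stands in for whatever health monitoring a real load balancer performs.

```python
class RoundRobinBalancer:
    """Toy balancer: rotates through servers, skipping ones marked as failed."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0

    def mark_down(self, server):
        # A real balancer would decide this from periodic health probes.
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once per call, starting after the last pick.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

Each tier of a multi-tier application would have its own instance of something like this in front of it.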

So, Yalagandula said, that sounds like it could be done with a simple "packet sprayer" that could be handled by the switch, the router, or in Neutron. But there is more expected of a load balancer in these kinds of enterprise deployments—they act more as an "application delivery controller". For example, the web is not really stateless, so all of the traffic for a user's session should be sent to the same backend server, which requires more than just routing at the packet level.
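One simple way to get that kind of session persistence is to hash a session identifier onto the list of backends. The sketch below assumes the load balancer can extract a session ID (from a cookie, say); the function name is hypothetical and the talk did not say which persistence mechanism Avi Networks uses.

```python
import hashlib


def backend_for_session(session_id, backends):
    """Map a session ID to one backend, the same one on every call."""
    if not backends:
        raise ValueError("no backends configured")
    # Use a stable hash; Python's built-in hash() varies between processes.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

A scheme this simple remaps many sessions whenever the backend list changes, which is one reason real load balancers tend to use persistence tables, cookies, or consistent hashing instead.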

In addition, there is an element of intelligence that may be required by load balancers. If a user is simply browsing the inventory at a site, then a set of regular servers could be used. But if they have something in their shopping cart, perhaps premium servers should be used, which requires the load balancer to make a decision based on the URL in the request.
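A URL-based decision like that one might look roughly like the following. The path prefixes and pool names are invented for illustration; a real deployment would configure its own rules.

```python
from urllib.parse import urlsplit

# Hypothetical paths that indicate a shopper with something in the cart.
PREMIUM_PREFIXES = ("/cart", "/checkout")


def choose_pool(url):
    """Return the name of the server pool that should handle this URL."""
    path = urlsplit(url).path
    for prefix in PREMIUM_PREFIXES:
        # Match the prefix itself or anything below it, but not "/cartoon".
        if path == prefix or path.startswith(prefix + "/"):
            return "premium"
    return "regular"
```

The point is that the load balancer has to parse the request at the HTTP level to make this choice, which a packet-level device cannot do.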

Another feature for load balancers is SSL termination. That allows putting all of the SSL handshake, encryption, and decryption handling on a smaller set of servers that can be more tightly controlled in terms of policies for acceptable ciphers, protocol versions, key lengths, and so on. SSL also uses lots of CPU and memory, so moving it to dedicated servers and allowing the backend servers to handle regular unencrypted traffic makes sense.
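To make the policy-control idea concrete, here is a sketch that uses Python's standard ssl module to build a server-side context with a restricted protocol version and cipher list. The specific choices (TLS 1.2 as a minimum, ECDHE with AES-GCM) and the function name are assumptions for the example, not anything recommended in the talk.

```python
import ssl


def make_termination_context(certfile=None, keyfile=None):
    """Build a server-side TLS context with a deliberately narrow policy."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Refuse anything older than TLS 1.2 (illustrative policy choice).
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Restrict the TLS 1.2 cipher suites using an OpenSSL cipher string.
    ctx.set_ciphers("ECDHE+AESGCM")
    if certfile:
        # Certificate and key paths are supplied by the caller.
        ctx.load_cert_chain(certfile, keyfile)
    return ctx
```

With termination centralized, a policy like this is set in one place rather than on every backend server.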

The legacy architecture for LBaaS is one that runs alongside OpenStack and must be managed separately from the OpenStack cloud. The approach Avi Networks has taken is to run "service engines" that handle the load balancing in OpenStack virtual machines (VMs). The "Avi controllers" then allocate resources using the Nova compute component and Neutron networking component to run those service engines. It is similar to software-defined networking in some ways, Yalagandula said. One of the strengths of the architecture is that the controllers can easily add more service engines as demand increases and destroy them when they are no longer needed.
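The scale-out and scale-in decision a controller makes could be sketched as below. The talk did not describe how the Avi controllers decide, so the load metric, the per-engine capacity, and the limits are all hypothetical.

```python
import math


def desired_engine_count(current_load, capacity_per_engine,
                         min_engines=1, max_engines=10):
    """Return how many service engines the current load calls for."""
    if capacity_per_engine <= 0:
        raise ValueError("capacity_per_engine must be positive")
    needed = math.ceil(current_load / capacity_per_engine)
    # Clamp to the configured bounds.
    return max(min_engines, min(max_engines, needed))


def scaling_action(running, desired):
    """Translate a target count into a create or destroy step."""
    if desired > running:
        return ("create", desired - running)
    if desired < running:
        return ("destroy", running - desired)
    return ("none", 0)
```

The "create" and "destroy" steps would then be carried out through Nova and Neutron.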

OpenStack APIs

Having provided that background, Yalagandula shifted to the kinds of APIs required by the service. It needs the elasticity features to create and delete VMs, as well as to attach them to the right networks and move them between networks. The high-availability features are required to detect VM failures or network connectivity disruptions and quickly switch over to replacements. The multi-tenancy features are needed to support multiple users of the service. It also needs high performance, especially from the networking, so that it can support high packet rates.

He started by mentioning the OpenStack APIs that worked well for the LBaaS application. The Nova VM creation and deletion APIs were solid; it was easy to create VMs and plug them into the right networks, which allows scaling in and scaling out on demand. But the options for placing VMs on certain hosts could be better, he said. For non-admin users (which is how the Avi controller sometimes runs), there is limited support for VM placement.
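For readers who have not used those APIs, creating and deleting a VM looks roughly like this with the openstacksdk Python library. That library is an assumption for the example (the talk did not say which client Avi Networks uses), and the function names and arguments are hypothetical. The conn argument is expected to be an openstacksdk Connection object.

```python
def launch_service_engine(conn, name, image_id, flavor_id, network_id):
    """Boot one VM attached to the given network and wait for it to be active."""
    server = conn.compute.create_server(
        name=name,
        image_id=image_id,
        flavor_id=flavor_id,
        # Nova asks Neutron for a port on this network at boot time.
        networks=[{"uuid": network_id}],
    )
    # Blocks until the server reaches ACTIVE (or the SDK's timeout expires).
    return conn.compute.wait_for_server(server)


def retire_service_engine(conn, server):
    """Delete a VM that is no longer needed."""
    conn.compute.delete_server(server)
```

A connection would typically come from something like openstack.connect(cloud="mycloud"), where the cloud name refers to an entry in a local clouds.yaml file.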

The multi-tenancy support from the Keystone identity service and the integration of that with Nova and Neutron is "very good and very solid", he said. When compared to other infrastructure-as-a-service offerings, OpenStack stands out here. At the basic level, Neutron's CRUD API for handling networks, ports, subnets, and so on is "pretty solid". It makes it easy to create multiple application tiers with the proper isolation between them.

He then moved on to some of the problem areas, but he noted that he wasn't trying to "bash" OpenStack—these were areas for improvement. The fact that a service like LBaaS can be written on top of OpenStack is a testament to the quality of the APIs overall.

Notifications are one such area; they are lacking from the core OpenStack services. If a VM dies, there is no way to get notified. The same is true if a port gets deleted in Neutron, and the same goes for Keystone, he said. He would like to have a way to subscribe to alerts specific to a particular resource. Since there isn't one, the system has to periodically check on the status of each resource, which generates lots of traffic. Using alerts from the Ceilometer telemetry service might be possible, but it does not appear to be popular with customers, so they do not enable it.
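The polling workaround amounts to taking periodic snapshots of resource status and comparing them. A sketch of the comparison step, with hypothetical names and event labels, might be:

```python
def diff_status(previous, current):
    """Compare two {resource_id: status} snapshots and report what changed."""
    events = []
    for rid, status in current.items():
        if rid not in previous:
            events.append(("appeared", rid, status))
        elif previous[rid] != status:
            events.append(("changed", rid, status))
    for rid, status in previous.items():
        if rid not in current:
            # The last known status is reported for a vanished resource.
            events.append(("disappeared", rid, status))
    return events
```

Every snapshot costs at least one API request per resource type, whether or not anything changed, which is where the extra traffic comes from; a failure also goes unnoticed until the next poll.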

There is a mismatch between Nova "interfaces" (virtual NICs) and Neutron "ports" (virtual switch ports) that stems from an improper separation when Neutron was split out from Nova. The result is that there is no way to move an existing interface from one network to another. Instead, the interface for the VM must be destroyed and a new one in a different network needs to be attached, which is a pretty heavyweight operation. In the physical world, it is the equivalent of moving a wire from one switch port to another, but it can't be done that simply in OpenStack. A better separation between Nova and Neutron would have allowed that, Yalagandula said.
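The detach-and-reattach sequence can be sketched with openstacksdk as follows. The library choice is again an assumption, as is the helper's name; the two compute-proxy method names are given as I understand the SDK and should be checked against the installed version.

```python
def move_interface(conn, server, old_interface, new_network_id):
    """Emulate moving a NIC between networks by detaching and reattaching."""
    # Step 1: destroy the existing interface.
    conn.compute.delete_server_interface(old_interface, server=server)
    # Step 2: attach a brand-new interface on the target network.
    return conn.compute.create_server_interface(server, net_id=new_network_id)
```

Between the two calls the VM has no interface in either network, and the replacement is a new Neutron port rather than the old one moved.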

There is also inconsistency in the semantics of the security group APIs. Depending on which component implements those APIs, the behavior can be different. Both Nova and Neutron implement the APIs but, for Nova, policies (e.g. firewall rules) apply across all interfaces in a VM, while in Neutron they apply per-port. That means users of the API need to know which component is implementing it to get the behavior they need, but there is no way to query for which is doing so in a given installation.
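The difference can be shown with a toy model; it is not OpenStack code, and the port names and rule strings are made up. Given the rules intended for each port of a VM, it returns what each port actually ends up allowing under the two semantics.

```python
def effective_rules(intended, per_port):
    """Return {port: rules} under per-port or VM-wide semantics.

    intended maps each port name to the set of rules meant for it.
    """
    if per_port:
        # Neutron-style: each port gets only its own rules.
        return {port: set(rules) for port, rules in intended.items()}
    # Nova-style: rules attach to the VM, so every interface gets the union.
    merged = set().union(*intended.values())
    return {port: set(merged) for port in intended}
```

For a VM with a management port meant to allow only SSH and a data port meant to allow only HTTPS, the VM-wide semantics open both on both ports, which is why callers need to know which implementation they are talking to.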

The OpenStack APIs do not allow for customization, in general. For example, source IP address spoofing is not allowed even on the local network. But that is an essential primitive for building high-availability servers. There are some ad hoc Neutron extension APIs that alleviate the problem, but they are not core APIs so there is no guarantee they will be present on any given installation.
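The talk did not name the extensions, but one Neutron extension that addresses this problem is allowed-address-pairs, which lets a port send traffic from additional IP addresses such as a shared virtual IP. Assuming that is the kind of extension meant, the sketch below checks an extension listing (the response to GET /v2.0/extensions) for it and builds the body for a port update (PUT /v2.0/ports/{port_id}). The helper names are hypothetical.

```python
def has_extension(extensions_response, alias="allowed-address-pairs"):
    """Check a parsed GET /v2.0/extensions response for an extension alias."""
    return any(ext.get("alias") == alias
               for ext in extensions_response.get("extensions", []))


def allowed_address_pairs_body(virtual_ips):
    """Build the port-update body permitting extra source IPs on a port."""
    return {
        "port": {
            "allowed_address_pairs": [{"ip_address": ip} for ip in virtual_ips]
        }
    }
```

The need to check first is the problem Yalagandula described: code written against an extension cannot count on it being there.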

The last issue he described was not really an API issue, but was more of a problem with the reference implementation of OpenStack. Network performance can be fairly poor, though it has been getting better over time. There can be many layers between the VM and the physical network and that number is dependent on the plugins and installation configuration. Each of those layers imposes a cost. Those are fixable issues and things are getting better, but it is a problem Avi Networks ran into when trying to build LBaaS on OpenStack.

To summarize, Yalagandula said that the basic APIs are really good, but the advanced APIs for building highly available, scalable network services still need some work. As a result, it took the team only about a month to get something up and running, but another year and a half before it was able to get the service running on customers' OpenStack deployments.

[I would like to thank the OpenStack Foundation for travel assistance to Tokyo for the summit.]
