I just returned home from Barcelona, where I attended the OpenStack Design Summit for Ocata, the next release of OpenStack. It was a good trip; I think I ate more excellent food in one week than ever before. The weather cooperated, providing opportunities for sitting on the beach with a cerveza in hand, talking with engineers from several different companies working on the same Open Source projects.
Here are my observations, coming from the perspective of an OpenStack deployment networking architect. My job is to develop the framework and automation which performs network-based deployment and configuration of OpenStack onto bare metal or virtual hardware. To do this, I have to interface with many different levels, from bare hardware to the graphical user interface, and since I work primarily on TripleO (OpenStack On OpenStack), that means the tools at my disposal are the OpenStack services themselves.
Gone are the heady days of competing massive parties thrown for thousands of attendees by the event sponsors. The trade show portion of the conference seemed to lack energy, and the recent consolidation in the space meant that vendors who would have had their own booth in previous years were folded into their new parent companies. For instance, a few years ago Cloudscaling, Dell, and EMC would each have had a separate booth instead of the single Dell/EMC booth. The same is happening with some of the storage and management startups, and it looks like the consolidation trend will continue for a while as market leadership becomes clear. Recent layoffs at both HP and Mirantis created a mood of insecurity among some of the attending engineers, but the future of OpenStack looks bright, despite the turmoil.
This was the last of the combined conference and design sessions. In the future, the Project Teams Gathering (PTG) will take the place of the Design Summit, and will happen midway between conferences. I can’t help but wonder if splitting these two events will end up draining the life out of the trade show. After all, unless companies send their engineers to both conferences and both PTGs (which costs more money than sending them to a combined event twice a year), there will be less interaction between engineers and the end-user customers who attend the OpenStack conference. I have heard from many end-users and customers that a big part of the reason they attend the conference is to talk to engineers, so this may negatively affect conference attendance.
In order to shift the development cycle so that OpenStack releases happen 2 months before the conference, the development cycle of Ocata was shortened to 4 months. This had the net effect of creating limitations on the number of features which can be folded into Ocata. It’s unfortunate that the development cycle had to be shifted, rather than shifting the conference schedule, but in the classic battle between Marketing and Product Engineering, typically it’s the Marketers who seem to win out. Well, that and the fact that conferences of this size have to be booked and planned a year or more in advance. It seems to me like it’s the end-users who fare worst in this arrangement, since it means a longer time before many planned features can be released in Pike in mid-2017.
The big star of the talks and presentations this year is containers, Containers, CONTAINERS! There were at least half a dozen talks which amounted to an introduction and overview of how containers work, the various networking solutions which try to add functionality and connectivity between container hosts, and some high-level information about how one can install OpenStack on top of a container pool. Unfortunately, the sessions I attended on the topic were woefully high level, but still informative for the majority of the attendees. One of the better talks, “Networking Approaches in a Container World”, is available on YouTube. I would have liked for someone to present a low-level technical deep-dive into the underlying tech. Instead of just a big block on a slide labeled “Kubernetes” (how obtuse is that?), how about an overview of k8s scheduling, and some comparison of various types of container sets and pods? Maybe next time.
Containers are Coming: What Does it Mean?
I do see the value in containers, and I think it makes sense to deploy a service-oriented modular application like OpenStack on top of containers, but at the same time I think that this is an implementation detail rather than a revolution. Moving OpenStack to containers won’t fundamentally change the way it operates, nor will it add any features. In fact, the reduced feature set of containers (as opposed to bare metal) will create some limitations and make it more difficult to implement some features. So why do it? Because deployment is actually only the second-biggest headache for OpenStack operators: upgrades are worse, and disaster recovery is almost as bad. Containers won’t solve these problems, but will provide a toolset for developers to automate those procedures. Things like rolling upgrades, hardware replacement, and disaster recovery are easier to do right with containers. I expect that all commercial distributions will be using containers within 2 years, if not sooner.
TripleO and Containers Work in Ocata
There will be a lot of groundwork laid for containers during Ocata. The containers CI job for TripleO will be resurrected and should become gating. Improvements to composable roles will help pave the way for further containerization in Pike and beyond. There is a lot of work to be done to smooth out upgrades for composable roles. A fair amount of time was dedicated to growing the TripleO team, and a proposal to introduce task-oriented “squads” was discussed. Hopefully this will allow subject-matter experts in the various projects which TripleO utilizes (Ironic, Nova, Neutron, Heat, Mistral, etc.) to contribute to the project without feeling like they have to understand every aspect of TripleO. New features which can be expected in Ocata:
- Better upgrade support
- Changes to the HA architecture to better support containers
- Composable services for the Undercloud (add modularity)
- TripleO GUI
Ironic Work in Ocata
A big topic for the Ironic and Nova teams this year is improving bare-metal provisioning, and making bare metal instances available to cloud tenants. It may seem counterintuitive, since “cloud computing” has become almost synonymous with virtualization, but there are still plenty of reasons to provision bare metal, and many operators want a “single pane” interface for provisioning both VMs and bare metal servers. Containers once again come to the foreground, since a common use case is to spin up a bare metal host with an OS dedicated to hosting containers. In this way, developers can worry only about their application, and not about managing a host OS on virtual nodes. The Ironic team will be working on improving this workflow, and some of the things that are expected to be largely or entirely completed in Ocata are:
- Ironic port-groups (for bonding multiple NICs within a bare metal host)
- Rolling upgrades
- Security groups for bare metal instances
- Micro-versioned API changes
- Collection of LLDP data from network switches for physical topology awareness
- Better synchronization of events with Neutron
- VNC console for bare metal instances using BMC controllers
- Port attach/detach to unblock development of VLAN-aware bare metal instances
Some work in Ironic was recently completed, or is largely complete with code not yet accepted upstream:
Putting the Work Together
The most exciting part of this OpenStack summit for me was seeing one of my pet projects getting close to fruition.
I’ve been wanting to bring routed networks (spine-and-leaf or Clos) to OpenStack deployments using Neutron for several years. At a previous job, I was involved with a replacement for Nova Networking which supported AWS-style L3 routing between instances. This architecture, where each rack is connected to one or more routing switches and the VLANs are local to the rack, scales horizontally because nothing is shared between racks. When Neutron began to take hold, I was disappointed that this model wasn’t addressed except by 3rd-party SDN solutions, especially since it is quickly gaining traction as the de facto network architecture for newly built datacenters.
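To make the routed model a little more concrete, here is a minimal sketch in plain Python of the addressing idea: each rack gets its own subnet, and traffic between racks has to be routed at L3 rather than switched at L2. The supernet, the rack names, and the helper functions are all hypothetical, chosen only for illustration. This is not how Neutron or TripleO implement routed networks.

```python
import ipaddress

# Hypothetical supernet and rack names, used only for illustration.
SUPERNET = ipaddress.ip_network("10.0.0.0/16")
RACKS = ["rack1", "rack2", "rack3"]


def allocate_rack_subnets(supernet, racks, new_prefix=24):
    """Carve one subnet per rack out of the supernet, in rack order."""
    subnets = supernet.subnets(new_prefix=new_prefix)
    return {rack: next(subnets) for rack in racks}


def needs_routing(rack_subnets, ip_a, ip_b):
    """Return True when the two addresses live in different rack subnets."""
    a = ipaddress.ip_address(ip_a)
    b = ipaddress.ip_address(ip_b)
    for subnet in rack_subnets.values():
        if a in subnet:
            # Same subnet: traffic stays on the rack-local L2 segment.
            # Different subnet: traffic must cross the routing switches.
            return b not in subnet
    raise ValueError(f"{ip_a} is not in any rack subnet")


rack_subnets = allocate_rack_subnets(SUPERNET, RACKS)
for rack, subnet in rack_subnets.items():
    print(rack, subnet)
```

In this sketch, adding a rack only means carving out one more subnet. No L2 segment spans racks, which is the property that lets the design scale horizontally.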
Most of the components of OpenStack support this model, but the dependencies weren’t in place to support automatic deployment onto a routed network in TripleO until now. Some of the work done by the Ironic, TripleO, Nova, and Neutron teams will now make it possible:
- Routed networks (Neutron)
- Ironic/Neutron integration (Ironic/Neutron)
- Node tagging (Ironic)
- Port groups (Ironic)
- Composable Roles (TripleO)
Where OpenStack is Headed
In the last several years, OpenStack has changed form so much that it is nearly unrecognizable when compared to the initial versions. One trend remains clear, however: anything which can be abstracted and described in software, eventually will be. It all started with system virtualization, which paved the way for AWS and cloud computing. Eventually, the network was redesigned in a software-defined model with Neutron, but today we see software-defined storage, software-defined security policies, software-defined network functions, and more. At the risk of sounding absurd, with technologies such as hardware partitioning, SR-IOV, unified server/network/storage platforms, and dynamically-attached pass-through hardware, we are even getting close to software-defined hardware.
As an example of this approach, a recently announced project named Valence aims to implement “Rack Scale Control” over pools of CPU, RAM, and disk storage: https://wiki.openstack.org/wiki/Valence
The end-goal is software-defined infrastructure, with hardware added to pools which are utilized as needed. The trend is for added layers of abstraction. Add containers to the mix, and you have several layers of abstraction between the running code and the physical hardware, but with lower overhead and better utilization. In other words, OpenStack is getting more flexible and more efficient as more and more things are virtualized, but it’s also getting more complex.