Tuesday, May 8, 2018

Introducing nsx-t-gen: Automating NSX-T Install with Concourse

This blog post covers automating the install of VMware NSX-T v2.1 using a Concourse pipeline.


Overview of NSX-T and Concourse


VMware NSX-T 2.1 is the next-generation SDN for securely managing cloud native microservices at the container level as well as the VM level, along with standard networking capabilities. Installing NSX-T v2.1 requires very careful planning and a series of manual steps to get a fully working configuration. Any misstep won't be visible until VMs or app containers are up and running.

Concourse is the open source CI/CD toolkit used heavily by Cloud Foundry developers and Pivotal customers. It provides a much cleaner and more user-friendly abstraction compared to other CI/CD tools.

NSX-T integration with PAS & PKS


Pivotal and VMware have built an integration between Pivotal Application Service (PAS), formerly known as Elastic Runtime (ERT), and NSX-T v2.1 to manage networking and container-level security. Additionally, a new product line, Pivotal Container Service (PKS), was launched that provides a BOSH-managed install of Kubernetes along with NSX-T to manage container networking and security. All of this means that both VMware and Pivotal teams have to gain experience with NSX-T and be able to install it quickly and easily, both for their own learning and for use by customers.

Introducing nsx-t-gen


VMware internal dev teams have built a set of automation scripts (based on Ansible) for building a demo version of NSX-T 2.1. While the tool achieves most of the functionality required for a very basic demo of the NSX-T product for field personnel, it still requires heavy grunt work and Ansible knowledge to customize it and make it flexible enough to handle different configurations (support for PAS or PKS, multiple T1 Routers, logical switches). Additional configurations like NAT rules, load balancer configs, etc. are still to be implemented (as of writing this blog post).

To reduce the learning curve for both VMware and Pivotal field teams, and to automate the entire process without losing the option for customization and flexibility, I have built nsx-t-gen. It uses a Concourse pipeline (for Cloud Foundry and Pivotal folks who are used to Concourse) to wrap and execute the Ansible scripts that handle the final install. The end user only has to configure a set of values for various parameters. Along with the pipeline that handles the install, a default set of sample parameters is bundled that shows what to configure, and how, for an install with PAS or PKS.

The entire nsx-t-gen pipeline is in a public GitHub repo:
https://github.com/sparameswaran/nsx-t-gen


Things handled by nsx-t-gen:

  • Deploy the VMware NSX-T Manager, Controller and Edge ova images
  • Configure the Controller cluster and add it to the management plane
  • Configure hostswitches, profiles, transport zones
  • Configure the Edges and ESXi Hosts to be part of the Fabric
  • Create T0 Router (one per run, in HA vip mode) with uplink and static route
  • Configure arbitrary set of T1 Routers with logical switches and ports
  • NAT Rules setup for T0 Router
  • Container IP Pools and External IP Blocks
  • Self-signed cert generation and registration against NSX-T Manager
  • Route redistribution for T0 Router
  • HA Spoofguard Switching Profile
  • Load Balancer (with virtual servers and server pool) creation

Acknowledgements

The author of this blog wishes to thank the following VMware folks:
  • Yasen Simeonov, TPM in VMware NSBU team, Twitter: @yasensim, creator of the Ansible scripts, who made this possible.
  • Yves Fauser, TPM in VMware NSBU team, Twitter: @yfauser, for technical guidance and troubleshooting.
  • Niran Even Chen, VMware road warrior leading the NSX Pipeline field initiative, Twitter: @NiranEC, for extensive testing, feedback and fine tuning of the tool (as with nsx-edge-gen earlier).


Disclaimer

The tools discussed in this blog post are neither officially supported nor managed by VMware or Pivotal. They are unsupported, and users are cautioned to use them at their own risk.


Concourse

Users new to Concourse can set up a quick install using the Concourse Docker images. Please check out: https://github.com/concourse/concourse-docker. Make sure you have additional disk space on the VM that would run Concourse and possibly also host the web server (install nginx if needed) that would serve the OVA images and the ovftool binary. Refer to the Concourse documentation for usage of the fly tool and other Concourse-related details.

Pipeline Jobs

There are 4 jobs registered in the pipeline:
  1. Full Install: Complete deployment of NSX-T, from the base deploy of the OVAs (NSX-T Edge, Mgr, Controller) and configuration of the Mgr and Controller cluster, to creation of routers and additional configurations.
  2. Base Install: Deploy of the OVAs (NSX-T Edge, Mgr, Controller), configuration of the Mgr, and Controller cluster creation and membership. Users can keep OVA memory reservation ON (for production) or turn it off (for demo/simple POC setups) so that creation of all the controllers and edges does not cause resource starvation.
  3. Adding Routers: Creation of overlay & VLAN networks and hostswitches, addition of the ESXi hosts and edges as transport nodes, followed by creation of the T0 Router and any set of nested T1 Routers and logical switches. Additionally, container IP blocks and external IP pools are created.
  4. Extra Configurations: Creation of HA Spoofguard profiles, NAT rules for the T0 Router, load balancers (with virtual servers and server pools), and a custom self-signed cert for the NSX Manager.
The Full Install automatically invokes Base Install, adds routers, and configures the extras, while each of the others runs only its own job. Base Install and Adding Routers use customized Ansible scripts (provided by VMware), while the last one uses direct REST API calls to configure things.
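
As a rough illustration of how the four jobs relate, the composition described above can be modeled in a few lines of Python. This is only a sketch: the job keys and step labels below are hypothetical names chosen for readability, not the identifiers used in the actual pipeline definition.

```python
# Illustrative model only: job keys and step labels are hypothetical,
# not the identifiers used in the nsx-t-gen pipeline itself.
BASE_INSTALL = [
    "deploy OVAs (Manager, Controller, Edge)",
    "configure Manager",
    "create Controller cluster and membership",
]
ADD_ROUTERS = [
    "create overlay and VLAN networks and hostswitches",
    "add ESXi hosts and Edges as transport nodes",
    "create T0 Router, T1 Routers and logical switches",
    "create container IP blocks and external IP pools",
]
EXTRAS = [
    "create HA Spoofguard profile",
    "create NAT rules on T0 Router",
    "create load balancers",
    "generate and register self-signed cert",
]

# Full install is simply the other three jobs run back to back.
JOBS = {
    "full-install": BASE_INSTALL + ADD_ROUTERS + EXTRAS,
    "base-install": BASE_INSTALL,
    "add-routers": ADD_ROUTERS,
    "extras": EXTRAS,
}


def steps_for(job):
    """Return the ordered list of steps a given job would run."""
    return list(JOBS[job])
```

Calling steps_for("full-install") returns the base install steps, then the router steps, then the extras, which mirrors the order in which the Full Install runs.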

Pipeline Configurations

All the configurations required by the pipeline are driven by the parameters specified during pipeline registration. There is a full sample params file in the GitHub repo.

vCenter config

The pipeline requires default vCenter related configuration details (like vCenter endpoint, creds, datastore, cluster, resource pool etc.).  These are specified in the params file.


It also requires a standard management portgroup (like 'VM Network') that is not of NSX-T type for deploying the OVAs. Users would hit the following error if there is no such portgroup: Host did not have any virtual network defined

Web Server

A web server needs to be configured to act as the hosting site for the OVA images and the ovftool binary. The configuration can be modified to use an S3 bucket if necessary (this also requires modifying the pipeline resources section).
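
For a quick lab setup where installing nginx is more than is needed, a minimal static file server can be sketched with the Python standard library. This is an assumption-laden stand-in, not part of nsx-t-gen: the function names, the default port and the directory layout are all hypothetical, and a real web server is a better fit for anything beyond a demo.

```python
import functools
import http.server


def make_server(directory, port=8080, host="0.0.0.0"):
    """Build an HTTP server that serves the files found in `directory`.

    `directory` is assumed to hold the OVA images and the ovftool bundle.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


def serve(directory, port=8080):
    """Serve files until interrupted (Ctrl-C)."""
    with make_server(directory, port) as httpd:
        httpd.serve_forever()
```

Calling serve() with the directory that holds the downloaded files would make them reachable at http://<host>:8080/<filename>, which is the kind of URL the pipeline parameters would then point at.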










ESXi Hosts
Name, IP and root password of the ESXi hosts that should be configured as part of the transport nodes.


NSX-T Manager and Controller Config

Configurations (ip, hostnames, creds, prefix names for controllers and edges) for the NSX-T install along with portgroups to be used for external uplink and internal overlay transport.



TEP and Edge interfaces


Most of these defaults can be used as is.

Edges created:


Reservation

For users who are running smaller setups or don't want to reserve all memory, it's recommended to turn off reservation. For production setups, it's recommended to leave it set to true (ON).

Edge Size

The size of the edge determines its memory and CPU usage as well as the number and size of load balancers that can run on it. The minimum edge size should be medium, while large is recommended (for PKS and PAS).


T0 Router config

One T0 Router can be specified per run. It is tied to all the T1 Routers, logical switches, NAT rules, load balancers, etc. defined in the rest of the configuration associated with that run.


The configuration includes the name, VIP, uplink IPs, static route, and tags (used by PAS).

T1 Routers config

Multiple T1 Routers can be configured, each with its own set of logical switches. The sample template provides a topology for PKS and PAS.



HA Spoofguard Profile

HA Spoofguard profile is required to maintain locks in NSX-T.


Container IP Blocks

Multiple IP blocks can be defined for use by containers.
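
When several blocks are defined, it can help to sanity check that they do not overlap before kicking off a run. The following is a small sketch using Python's standard ipaddress module; the helper name and the sample CIDRs are made up for illustration and are not taken from the sample params file.

```python
import ipaddress
from itertools import combinations


def overlapping_blocks(cidrs):
    """Return pairs of CIDR strings whose address ranges overlap."""
    nets = [(cidr, ipaddress.ip_network(cidr)) for cidr in cidrs]
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets, 2)
        if net_a.overlaps(net_b)
    ]
```

An empty result means the blocks are disjoint; any returned pair points at two blocks that would compete for the same addresses.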

External IP Pools


NAT Rules on T0 Router

NAT rules can be defined for DNAT (incoming traffic to Ops Mgr, PKS, etc.) and SNAT (for containers to talk to the outside).
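
Since the extras job talks to NSX-T over REST, a NAT rule ultimately becomes a small JSON payload posted against the T0 Router. The sketch below only builds such a payload and URL; it does not send anything. The helper names are hypothetical, and the field names and URL path reflect my understanding of the NSX-T Manager API, so please verify them against the API guide for your version before relying on them.

```python
def nat_rule(action, match_network, translated_network, enabled=True):
    """Build a NAT rule payload (assumed NSX-T Manager API field names).

    DNAT matches on the destination network, SNAT on the source network.
    """
    if action == "DNAT":
        match_key = "match_destination_network"
    elif action == "SNAT":
        match_key = "match_source_network"
    else:
        raise ValueError("unsupported NAT action: %s" % action)
    return {
        "resource_type": "NatRule",
        "action": action,
        match_key: match_network,
        "translated_network": translated_network,
        "enabled": enabled,
    }


def nat_rules_url(manager, router_id):
    """URL the payload would be POSTed to (path is an assumption)."""
    return "https://%s/api/v1/logical-routers/%s/nat/rules" % (manager, router_id)
```

For example, a DNAT rule would map an external IP to the internal Ops Mgr address, while an SNAT rule would map a container subnet to an external IP.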


Custom Self-signed cert for NSX-Manager

One important step is to assign a custom self-signed cert to the NSX-Manager so that the cert it presents is valid. Provide the various attributes used for cert generation.



Load Balancers

Multiple load balancers can be configured (but their size and number of instances are limited by the size of the Edge instance).

LB Size    Virtual Servers    Pool Members
small      10                 30
medium     100                300
large      1000               3000

The number of LBs per Edge depends on the size of the Edge.

Edge Size    Small LBs    Medium LBs    Large LBs
small        0            0             0
medium       1            0             0
large        4            1             0
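
The two tables above can be turned into a small lookup to check a planned load balancer configuration before running the pipeline. This is a sketch with hypothetical helper names; the numbers are copied from the tables as listed, and each per-Edge count is treated as an independent limit, which is an assumption on my part.

```python
# LB size -> (max virtual servers, max pool members), from the first table.
LB_LIMITS = {
    "small": (10, 30),
    "medium": (100, 300),
    "large": (1000, 3000),
}

# Edge size -> max number of LBs of each size, from the second table.
LBS_PER_EDGE = {
    "small": {"small": 0, "medium": 0, "large": 0},
    "medium": {"small": 1, "medium": 0, "large": 0},
    "large": {"small": 4, "medium": 1, "large": 0},
}


def smallest_lb_size(virtual_servers, pool_members):
    """Return the smallest LB size that fits, or None if none does."""
    for size in ("small", "medium", "large"):
        max_vs, max_members = LB_LIMITS[size]
        if virtual_servers <= max_vs and pool_members <= max_members:
            return size
    return None


def max_lbs(edge_size, lb_size):
    """Return how many LBs of `lb_size` an Edge of `edge_size` can host."""
    return LBS_PER_EDGE[edge_size][lb_size]
```

For instance, a plan with 11 virtual servers no longer fits a small LB, so smallest_lb_size returns "medium", and max_lbs("large", "medium") then shows that a large Edge can host one such LB.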


Snapshots of LBR, Virtual servers & Server Pools



Running the pipeline

Copy the sample params over as nsx-t-params.yml and then use the following script to register the pipeline (after editing the Concourse endpoint, target, etc.) against a running Concourse instance.


After registering the pipeline, unpause it before kicking off any job group.
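
The register-then-unpause sequence can be sketched as a pair of fly invocations driven from Python. The fly subcommands and flags used here (set-pipeline with -p, -c and -l, and unpause-pipeline) are standard fly usage, but the target name, pipeline name and pipeline file path are assumptions for illustration, and the sketch presumes a prior fly login against the target.

```python
import subprocess


def fly_commands(target, pipeline, config, params):
    """Build the fly commands to register and then unpause a pipeline."""
    return [
        # Register (or update) the pipeline, loading variables from the params file.
        ["fly", "-t", target, "set-pipeline",
         "-p", pipeline, "-c", config, "-l", params],
        # Pipelines start out paused, so unpause before triggering any job.
        ["fly", "-t", target, "unpause-pipeline", "-p", pipeline],
    ]


def register(target, pipeline, config, params):
    """Run the commands in order, stopping at the first failure."""
    for cmd in fly_commands(target, pipeline, config, params):
        subprocess.run(cmd, check=True)
```

With these helpers, register() would be called with your own target and pipeline names, the path to the pipeline definition, and nsx-t-params.yml. Note that set-pipeline asks for confirmation of the diff on the terminal before applying it.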

Video of Run

Here is a recording of the pipeline execution.
Part 1 deals with the basic install of the OVA images and waiting for the VMs to come up.



Part 2 deals with the remaining steps (joining the control plane, configuring the hostswitches, transport nodes, T0 Router and T1 Routers with logical switches) and the extra config steps (NAT rules, HA Spoofguard creation, etc.).


Conclusion

The aim of nsx-t-gen is to alleviate the pain points of manually installing NSX-T v2.1, reduce the chances of misconfiguration, and deliver an easy, smooth and consistent install experience for the field and customers via the Concourse user interface.

Please post feedback or questions either here or on Twitter: @sabha_mp.

