Thursday, October 11, 2018

VMworld 2018 Session on Automating NSX and CNA with Concourse

Check out the video recording of the NSX-T automation with Concourse session presented at VMworld, Las Vegas, in August 2018.



The video walks through the nsx-t-gen pipeline and a real-time install of NSX-T, all completed within the 20 minutes of the presentation.

Evolution of nsx-t-gen, nsx-t-ci-pipeline and canned-pks toolkits

I have been building and supporting a set of tools aimed at easing and automating the install and use of VMware's NSX-T SDN product with Pivotal's PAS (Pivotal Application Service) and PKS (Pivotal Container Service, the Kubernetes offering) products on vSphere. I wanted to share the evolution of these tools and how they work together.

nsx-t-gen

nsx-t-gen is a toolkit that automates the install of the VMware NSX-T SDN product on the vSphere platform. It uses Concourse pipelines to keep the various tasks and pieces together and make them easy to run, with the goal of a fully automated NSX-T install.



The various components used in this toolkit are:

  • Github repos for the pipeline definitions, tasks and scripts
  • NSX-T and other binary install bits (ovftool)
  • User-supplied params for vCenter, networks, credentials etc.
  • NSX-T management and runtime configuration for the Edges, Manager and Controllers, along with routers/switches, IP blocks, pools, NAT rules, security groups etc.
The end result is a fully installed NSX-T management plane running on a managed cluster (with edges), using one or more compute clusters as transport nodes. Various cloud native product offerings (like PAS, PKS or others) can then be installed and managed on top of this NSX-T managed network infrastructure.

The NSX-T versions currently supported against vSphere are (each version in its own branch):
  • 2.3 (most recent release supported with PKS 1.2)
  • 2.2
  • 2.1

All of these pipelines use a similar set of pipeline parameters and tasks while abstracting the version dependencies away from the end user, providing the same zero-touch install experience.



Things handled by nsx-t-gen:

  • Deploy the VMware NSX-T Manager, Controller and Edge ova images
  • Configure the Controller cluster and add it to the management plane
  • Configure hostswitches, profiles, transport zones
  • Configure the Edges and ESXi Hosts to be part of the NSX-T Fabric
  • Create T0 Router (one per run, in HA VIP mode) with uplink and static route
  • Configure arbitrary set of T1 Routers with logical switches and ports
  • NAT Rules setup for T0 Router
  • Container IP Pools and External IP Blocks
  • Self-signed cert generation and registration against NSX-T Manager
  • Route redistribution for T0 Router
  • HA Spoofguard Switching Profile
  • Load Balancer (with virtual servers and server pool) creation
  • Security groups to associate with job types, VMs or others (leveraged for PAS so that load balancer server pools can be dynamically linked to the GoRouter, TCP Router or SSH Proxy VM instances)


nsx-t-ci-pipeline


The nsx-t-ci-pipeline is aimed at fully automated install of Pivotal products with NSX-T integration, including but not limited to:
  • Installing Pivotal Ops Manager with NSX-T mode enabled and configuring the BOSH Director
  • Installing the NSX-T add-on tile along with the PAS 2.x tile with external CNI provider configuration
  • Installing the PKS tile (v1.0, 1.1.x, and the most recent v1.2) with NSX-T enabled, auto-configuring the PKS super user credentials to be used against NSX-T
  • Handling dynamic integration of PAS components (like GoRouter, TCP Router or Diego Brain) with the load balancer by associating security groups with the related job groups and using pre-configured server pools tied to those security groups for membership (handled by the nsx-t-gen pipeline), compared to the old way of assigning static IPs to job types and adding them to the load balancer server pool membership
  • Automatically configuring NAT rules so the PKS API controller is reachable externally
  • Automatically creating and configuring the PKS CLI user
  • Installing the Harbor tile and automatically configuring NAT rules to expose Harbor externally


Canned PKS


One of the challenges many customers and field folks face is installing these products in completely offline or isolated environments (with no online access to any external resource).


The Concourse pipelines implicitly assume an online, interconnected setup to pull down the various resources, GitHub repos and Docker images that are used internally as part of the task definitions. This breaks the whole isolated, offline working model.

I helped build a pipeline model aimed at truly offline execution to install NSX-T and PKS (Pivotal Container Service) in such restricted environments. The outcome was canned-pks.

 

There are two portions to using the canned-pks toolkit:
  1. Capture all the materials required for executing the pipeline in purely offline mode by saving or downloading the materials, while in online mode, into an S3-compatible store (Minio is a simple, easy and free S3 equivalent that anyone can run on any machine); see the upload sketch following these steps.
    • The BOM file has the bill of materials (github repos, docker images, install bits, including Pivotal Tiles and stemcells or VMware NSX-T install bits)
    • Tools to download and upload the saved bits into the store 
    • All of the above mentioned bits are downloaded and saved using scripts.
  2. Run the offline equivalent of the pipeline, bundled in the canned-pks repo, against the saved resources. The pipeline extracts them into a form that can then be used to execute the actual steps that install the products, without requiring any online access. 
The complete steps are detailed in the canned-pks repo. It supports an opinionated install of NSX-T (a single compute cluster, versus the arbitrary number of compute clusters supported in the base nsx-t-gen pipeline, plus certain precooked configs) to achieve a fast, simple and easy install of NSX-T with PKS in offline environments.
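As an illustration of step 1, here is a minimal sketch of mirroring previously downloaded bits into a Minio (S3-compatible) store with the Minio mc client; the endpoint, credentials, bucket and file names below are hypothetical, and the real download/upload scripts and BOM layout live in the canned-pks repo.

# Register the locally running Minio server with the mc client (endpoint/creds are placeholders)
mc config host add offline-store http://10.0.0.5:9000 ACCESS_KEY SECRET_KEY

# Create a bucket to hold the offline bits
mc mb offline-store/canned-pks-bits

# Upload previously downloaded materials (file names are illustrative, not the exact BOM entries)
mc cp nsx-unified-appliance-2.3.0.ova offline-store/canned-pks-bits/
mc cp pivotal-container-service-1.2.0.pivotal offline-store/canned-pks-bits/
mc cp docker-images.tar offline-store/canned-pks-bits/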


Also, I am happy to announce that the nsx-t-gen pipeline codebase has now been adopted by VMware and will be maintained going forward in the form of their nsx-t-datacenter-ci-pipelines repo.

All of these tools should help users get an easier and faster ramp-up experience with the NSX-T and PAS/PKS product stack.

Just a note of caution: none of the above tools are officially supported; they are free to use at the user's own discretion and risk.



ThreadLogic v2.5.2 now supports OpenJDK Thread Dumps


With Oracle announcing the end of free support for older versions of Java, more users are expected to move to OpenJDK, and I believe it is very useful for ThreadLogic to support OpenJDK thread dumps.

With that in mind, ThreadLogic version 2.5.2 now supports parsing OpenJDK thread dumps along with some additional thread group (Tomcat, Datastax etc.) classification.
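For anyone trying it out, here is a quick, hedged sketch of capturing an OpenJDK thread dump and loading it into the tool (the jar file name is illustrative; use whatever the 2.5.2 release artifact is actually called):

# Capture a thread dump (with lock details) from a running OpenJDK process
jstack -l <pid> > mydump.txt

# Alternatively, send SIGQUIT and grab the dump from the process stdout/log
kill -3 <pid>

# Launch ThreadLogic (jar name is illustrative) and open mydump.txt from the File menu
java -jar ThreadLogic-2.5.2.jar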

People new to ThreadLogic can check the older posts on the need for such a tool and its use in analyzing thread dumps.

Older Blog Entries:

  1. Introducing ThreadLogic
  2. Analyzing Thread Dumps in Middleware - Part 1
  3. Analyzing Thread Dumps in Middleware - Part 2
  4. Analyzing Thread Dumps in Middleware - Part 3
  5. Analyzing Thread Dumps in Middleware - Part 4
Feedback welcome on the tool.

Tuesday, May 8, 2018

Introducing nsx-t-gen: Automating NSX-T Install with Concourse

This blog post deals with automating the install of VMware NSX-T v2.1 using a Concourse pipeline.


Overview of NSX-T and Concourse


VMware NSX-T 2.1 is the next-generation SDN for securely managing cloud native microservices at the container level as well as the VM level, along with standard networking capabilities. Installing NSX-T v2.1 requires very careful planning and a series of manual steps to get to a fully working configuration, and any mis-step may not become visible until VMs or app containers are up and running.

Concourse is the open source CI/CD toolkit used heavily by Cloud Foundry developers and Pivotal customers. It provides a much cleaner and more user-friendly abstraction compared to other CI/CD tools.

NSX-T integration with PAS & PKS


Pivotal and VMware have built an integration between Pivotal Application Service (PAS), formerly known as Elastic Runtime (ERT), and NSX-T v2.1 to manage networking and container-level security. Additionally, a new product line, Pivotal Container Service (PKS), was launched that provides a BOSH-managed install of Kubernetes with NSX-T managing container networking and security. All of this means both VMware and Pivotal teams have to gain experience and be able to install NSX-T quickly and easily, for their own learning and for use by customers.

Introducing nsx-t-gen


VMware internal dev teams have built a set of automation scripts (based on Ansible) for standing up a demo version of NSX-T 2.1. While the tool achieves most of the functionality required for a very basic demo of the NSX-T product for field personnel, it still requires heavy grunt work and Ansible knowledge to make it customizable and flexible enough to handle different configurations (supporting PAS or PKS, multiple T1 Routers, logical switches). Additional configurations like NAT rules, load balancer configs etc. were still to be implemented (as of writing this blog post).

To help both VMware and Pivotal field teams reduce the learning curve and make the entire process automated without losing customization and flexibility, I built nsx-t-gen, which uses a Concourse pipeline (familiar to Cloud Foundry and Pivotal folks) to wrap around and execute the Ansible scripts that handle the final install. The end user only has to configure a set of values for various parameters. Along with the pipeline that handles the install, a default sample parameters file is bundled that shows what and how to configure for an install with PAS or PKS.

The entire nsx-t-gen pipeline is on a public Github repo:
https://github.com/sparameswaran/nsx-t-gen


Things handled by nsx-t-gen:

  • Deploy the VMware NSX-T Manager, Controller and Edge ova images
  • Configure the Controller cluster and add it to the management plane
  • Configure hostswitches, profiles, transport zones
  • Configure the Edges and ESXi Hosts to be part of the Fabric
  • Create T0 Router (one per run, in HA VIP mode) with uplink and static route
  • Configure arbitrary set of T1 Routers with logical switches and ports
  • NAT Rules setup for T0 Router
  • Container IP Pools and External IP Blocks
  • Self-signed cert generation and registration against NSX-T Manager
  • Route redistribution for T0 Router
  • HA Spoofguard Switching Profile
  • Load Balancer (with virtual servers and server pool) creation

Acknowledgements

The author of this blog wishes to thank following VMware folks:
  • Yasen Simeonov, TPM in the VMware NSBU team, Twitter: @yasensim, creator of the Ansible scripts, who made this possible. 
  • Yves Fauser, TPM in the VMware NSBU team, Twitter: @yfauser, for technical guidance and troubleshooting.
  • Niran Even Chen, VMware road warrior leading the NSX pipeline field initiative, Twitter: @NiranEC, for extensive testing, feedback and fine-tuning of the tool (as with nsx-edge-gen earlier).


Disclaimer

The tools discussed in this blog post are neither officially supported nor managed by VMware or Pivotal. They are unsupported, and users are cautioned to use them at their own risk.


Concourse

For users new to Concourse, you can set up a quick install using the Concourse Docker images; please check out https://github.com/concourse/concourse-docker. Make sure there is additional disk space on the VM that will run Concourse and possibly also host the web server (install nginx if needed) that serves the OVA images and the ovftool binary. Refer to the Concourse documentation for usage of the fly tool and other Concourse-related details.
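A minimal sketch of such a quick setup, assuming Docker and docker-compose are already installed on the VM (follow the concourse-docker README for the exact key-generation step; the target name and test/test credentials below are illustrative defaults):

git clone https://github.com/concourse/concourse-docker.git
cd concourse-docker

# Generate the web/worker keys as described in the repo's README, then bring up Concourse
docker-compose up -d

# Download the fly CLI from the web UI (http://<vm-ip>:8080), then log in;
# adjust the target name and credentials to whatever the compose file configures
fly -t nsx login -c http://<vm-ip>:8080 -u test -p test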

Pipeline Jobs

There are 4 jobs registered in the pipeline:
  1. Full install: complete deployment of NSX-T, starting from the base deploy of the OVAs (NSX-T Edge, Manager, Controller) and configuration of the Manager and Control cluster, through creation of routers and additional configurations.
  2. Base install: deploy of the OVAs (NSX-T Edge, Manager, Controller), configuration of the Manager, and Control cluster creation and membership. Users can keep OVA memory reservation ON (for production) or turn it OFF (for demo/simple POC setups) so that creating all the controllers and edges does not cause resource starvation.
  3. Adding routers: creation of the overlay and VLAN networks and hostswitches, addition of the ESXi hosts and edges as transport nodes, followed by creation of the T0 Router and any set of nested T1 Routers and logical switches. Additionally, container IP blocks and external IP pools are also created.
  4. Extra configurations: creation of HA Spoofguard profiles, NAT rules for the T0 Router, load balancers (with virtual servers and server pools) and a custom signed cert for the NSX Manager.
The full install job automatically invokes the base install, adds routers and configures the extras, while the other jobs run only their own steps. Base install and adding routers use customized Ansible scripts (provided by VMware), while the extra configurations job uses direct REST API calls.
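Once the pipeline is registered and unpaused, the jobs can be kicked off from the Concourse UI or via fly; a hedged sketch follows (the pipeline and job names are illustrative, so use the names shown in your registered pipeline):

# Run the complete install end to end
fly -t nsx trigger-job -j install-nsx-t/full-install-nsx-t --watch

# Or run only the base OVA deploy and management plane configuration
fly -t nsx trigger-job -j install-nsx-t/base-install-nsx-t --watch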

Pipeline Configurations

All the configurations required by the pipeline are driven by the parameters specified during pipeline registration. There is a full sample params file in the GitHub repo.

vCenter config

The pipeline requires default vCenter related configuration details (like vCenter endpoint, creds, datastore, cluster, resource pool etc.).  These are specified in the params file.


It also requires a standard management portgroup (like 'VM Network') that is not an NSX-T type for deploying the OVAs. Users will hit the following error if no such portgroup exists: 'Host did not have any virtual network defined'.
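For illustration, the vCenter portion of the params file is shaped roughly like the fragment below; the key names are approximations rather than the repo's exact parameter names, so consult the bundled sample params file for the real ones.

# Append an illustrative vCenter section to the params file
# (key names are approximations; check the sample params in the repo)
cat >> nsx-t-params.yml <<'EOF'
vcenter_ip: vcenter.corp.local
vcenter_username: administrator@vsphere.local
vcenter_password: CHANGE-ME
vcenter_datacenter: Datacenter
vcenter_cluster: Mgmt-Cluster
vcenter_datastore: vsanDatastore
mgmt_portgroup: 'VM Network'
EOF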

Web Server

A web server needs to be configured to act as the hosting site for the OVA images and the ovftool binary. The configuration can be modified to use an S3 bucket if necessary (this requires modifying the pipeline's resources section as well).
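A minimal sketch of standing up such a web server on the Concourse VM (the paths, file names and port are hypothetical); nginx serving the same directory is the more durable option for repeated runs:

# Host the NSX-T OVAs and the ovftool bundle from a plain directory
sudo mkdir -p /var/www/nsx-t-bits
sudo cp nsx-unified-appliance-*.ova nsx-edge-*.ova nsx-controller-*.ova \
        VMware-ovftool-*.bundle /var/www/nsx-t-bits/

# Quick throwaway file server (Python 3); point the pipeline's OVA/ovftool URLs at it
cd /var/www/nsx-t-bits && python3 -m http.server 40001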










ESXi Hosts
Name, IP and root password of the ESXi hosts that should be configured as part of the transport nodes.


NSX-T Manager and Controller Config

Configurations (IPs, hostnames, credentials, prefix names for the controllers and edges) for the NSX-T install, along with the portgroups to be used for the external uplink and the internal overlay transport.



TEP and Edge interfaces


Most of these defaults can be used as is.

Edges created:


Reservation

For users running smaller setups or who don't want to reserve all of the memory, it is recommended to turn reservation off. For production setups, it is recommended to leave it set to true (ON).

Edge Size

The size of the edge determines the memory and CPU usage as well as the number and size of load balancers that can run on a given edge. The minimum edge size should be medium, while large is recommended (for PKS and PAS).


T0 Router config

One T0 Router can be specified per run; it is tied to all of the T1 Routers, logical switches, NAT rules, load balancers etc. defined in the rest of the configuration associated with that run.


The configuration specifies the name, VIP, uplink IPs, static route, and tags (used by PAS).

T1 Routers config

Multiple T1 Routers can be configured, each with its own set of logical switches. The sample template provides a topology for PKS and PAS.
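An illustrative fragment of how the T0/T1 portion of the params might be shaped (the keys and values are hypothetical approximations; the bundled PAS/PKS samples show the exact structure):

# Illustrative T0/T1 router section of the params file
# (structure is approximate; follow the sample params in the repo)
cat >> nsx-t-params.yml <<'EOF'
t0_router_name: DefaultT0Router
t0_ha_vip: 10.100.0.5/24
t0_uplink_ips: 10.100.0.6,10.100.0.7
t0_static_route_next_hop: 10.100.0.1
t1_routers:
- name: T1-Router-PKS-Infra
  switches:
  - name: PKS-Infra
    subnet: 192.168.10.1/24
- name: T1-Router-PKS-Services
  switches:
  - name: PKS-Services
    subnet: 192.168.20.1/24
EOF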



HA Spoofguard Profile

The HA Spoofguard switching profile is required to maintain locks in NSX-T.


Container IP Blocks

There can be multiple IP blocks to be used by containers.

External IP Pools


NAT Rules on T0 Router

NAT rules can be defined for DNAT (incoming traffic to Ops Manager, the PKS API etc.) and SNAT (for containers to talk to the outside).
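A hypothetical sketch of how DNAT/SNAT entries could be expressed in the params (illustrative keys only; the sample params show the real shape):

# Illustrative NAT rule entries for the T0 Router
cat >> nsx-t-params.yml <<'EOF'
nat_rules:
- nat_type: dnat             # expose Ops Manager externally
  destination_network: 10.100.0.20
  translated_network: 192.168.10.5
- nat_type: snat             # let the internal networks reach outside
  source_network: 192.168.10.0/24
  translated_network: 10.100.0.30
EOF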


Custom Self-signed cert for NSX-Manager

One of the important steps is to assign a custom self-signed cert to the NSX Manager so that the cert it presents is valid. Provide the various attributes used for cert generation.



Load Balancers

Multiple load balancers can be configured (but the size and number of instances are determined by the size of the edge instance).

LB Size   Virtual Servers   Pool Members
small     10                30
medium    100               300
large     1000              3000

The number of LBs per edge is based on the edge size:

Edge Size   Small LBs   Medium LBs   Large LBs
small       0           0            0
medium      1           0            0
large       4           1            0


Snapshots of LBR, Virtual servers & Server Pools



Running the pipeline

Copy over the sample params as nsx-t-params.yml and then use the following script to register the pipeline against a running Concourse instance (after editing the Concourse endpoint, target etc.).
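A minimal sketch of such a registration script, assuming a Concourse target named 'nsx' and a pipeline name of 'install-nsx-t' (both illustrative; the pipeline yml path in the repo may differ from the one shown):

# Register the pipeline against the running Concourse instance
fly -t nsx set-pipeline -p install-nsx-t \
    -c pipelines/nsx-t-install.yml \
    -l nsx-t-params.yml

# Unpause it before kicking off any job
fly -t nsx unpause-pipeline -p install-nsx-t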


After registering the pipeline, unpause it before kicking off any of the jobs.

Video of Run

Here is a recording of the pipeline execution.
Part 1 deals with the basic install of the OVA images and waiting for the VMs to come up.



Part 2 deals with the remaining steps (joining the control plane; configuring the hostswitches, transport nodes, T0 Router, and T1 Routers with logical switches) and the extra config steps (NAT rules, HA Spoofguard creation etc.). 


Conclusion

The aim of nsx-t-gen is to alleviate the pain involved in a manual installation of NSX-T v2.1, remove the various chances for mis-configuration, and deliver an easy, smooth and consistent install experience for the field and customers via the Concourse user interface.

Please post feedback and questions either here or on Twitter: @sabha_mp.


Tuesday, January 9, 2018

Concourse Pipeline for PCF 2.0 + NSX-T Add-on Tile

Users of Pivotal Cloud Foundry 2.0 (with ERT renamed to PAS) and the VMware NSX-T Container Plug-in tile can use the nsx-t-ci-pipeline Concourse pipeline and scripts to bring up a fully configured Ops Manager 2.0 with PAS and NSX-T. Users can choose either just the base install of PAS or the full install of the MySQL, RabbitMQ and SCS service tiles in addition to PAS.





Versions installed:

Pivotal Operations Manager: 2.0.x
Pivotal Application Service (formerly ERT): 2.0.x
NSX-T Container Plug-in Tile: 2.1.x
Spring Cloud Services: 1.5.x
MySQL: 1.10.x
RabbitMQ: 1.11.x

Steps:
  1. Install and configure NSX-T with a T0 Router and T1 Routers, along with separate logical switches for Infra, ERT, Services and Dynamic Services (following the standard PCF NSX reference architecture). The Concourse pipeline does not install or configure NSX-T.
  2. Clone the repo or just create a local copy.
  3. Use an existing Concourse install or bring up a new Concourse instance (scripts are available to create one). The Concourse install can use GitHub-based auth to allow access. Configure the BOSH Director and cloud configs first to bring up the BOSH Director, then configure the Concourse portion and bring up Concourse.
  4. Configure the parameters required for the pipeline (a sample params file is available in the repo under pipelines/params.sample.yml).
  5. The NSX-T Container Plug-in tile is available from the VMware site but not yet on network.pivotal.io. Until it becomes directly available for download as a Pivnet resource, download it from other sources (VMware or friendly SE contacts) and upload it to an S3 bucket referenced by the pipeline.
  6. Register and run the pipeline. Edit the pipelines/setup.sh script as needed to set the Concourse endpoint, pipeline, params etc. (see the sketch after this list).
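As a rough sketch of steps 5 and 6, assuming an S3 bucket named pcf-tiles and the repo's pipelines/setup.sh wrapper (the tile file name and bucket are illustrative placeholders):

# Step 5: stage the NSX-T Container Plug-in tile in an S3 bucket the pipeline can reference
# (file name and bucket are illustrative)
aws s3 cp VMware-NSX-T-2.1.pivotal s3://pcf-tiles/

# Step 6: edit the Concourse endpoint/target and params inside the wrapper script,
# then register and run the pipeline
vi pipelines/setup.sh
./pipelines/setup.sh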


Thursday, December 14, 2017

Spring One Platform 2017 Session on PCF & NSX

Check out the video recording and presentation of the Pivotal Cloud Foundry and VMware NSX-T integration from the recently held Spring One Platform 2017 session:


PCF in the Land of NSX: A Closer Look at PCF with NSX-V vs. NSX-T



Comments, feedback welcome.

Saturday, December 9, 2017

Restitution of ThreadLogic, a Java Thread Dump Analysis tool

I have been busy with Cloud Foundry in general and Pivotal in particular for the past 4 years and haven't had a chance to work on or contribute to ThreadLogic, the Java thread dump analysis tool I created, since leaving Oracle. Folks who don't know about ThreadLogic can check out the older ThreadLogic-related blog posts on this site.

In the meantime, the java.net projects site has been shuttered and all references to ThreadLogic have vanished, including the tool and docs. Going over the various blog posts and questions, it appeared that users really missed the tool. I have decided to bring it back to life and host it on the ThreadLogic GitHub repo so users can continue to use it and I can continue to add advisory patterns for newer thread interactions; I hope it will be of use to most Java and enterprise users. The newly re-versioned 2.5.1 is available here.

I would sincerely appreciate it if users could share feedback and samples of their thread dumps, either as posts or gists or by filing issues on the above GitHub URL, so we can add patterns and advisories that allow the tool to grow richer with more known patterns and more usage.

Just be aware that I haven't had a chance to run a full set of tests against thread dumps from recent versions of Java (1.8 from Oracle/IBM etc.), so there can be problems and errors in handling those dumps. Kindly let me know and I will try to fix those issues. Thanks for your patience and consideration.

Feedback, comments and questions on ThreadLogic welcome as always.