Tuesday, May 8, 2018

Introducing nsx-t-gen: Automating NSX-T Install with Concourse

This blog post deals with automating the install of VMware NSX-T v2.1 using a Concourse pipeline.


Overview of NSX-T and Concourse


VMware NSX-T 2.1 is the next-generation SDN for securely managing cloud native microservices at the container level as well as at the VM level, along with standard networking capabilities. Installing NSX-T v2.1 requires very careful planning and a series of manual steps to get a fully working configuration. Any mis-step won't be visible until VMs or app containers are up and running.

Concourse is the open source CI/CD toolkit used heavily by Cloud Foundry developers and Pivotal customers. It provides a much cleaner and more user-friendly abstraction compared to other CI/CD tools.

NSX-T integration with PAS & PKS


Pivotal and VMware have built an integration between Pivotal Application Service (PAS), formerly known as Elastic Runtime (ERT), and NSX-T v2.1 to manage networking and container-level security. Additionally, a new product line, Pivotal Container Service (PKS), was launched that provides a BOSH-managed install of Kubernetes along with NSX-T to manage container networking and security. All of this means both VMware and Pivotal teams have to gain experience with NSX-T and be able to install it quickly and easily, for their own learning and for use by customers.

Introducing nsx-t-gen


VMware internal dev teams have built a set of automation scripts (based on Ansible) for building a demo version of NSX-T 2.1. While the tooling achieves most of the functionality required for a very basic demo of the NSX-T product for field personnel, it still requires heavy grunt work and Ansible knowledge to make it customizable and flexible enough to handle different configurations (supporting PAS or PKS, multiple T1 Routers, logical switches). Additional configurations like NAT rules, load balancer configs, etc. were still to be implemented (as of writing this blog post).

To reduce the learning curve for both VMware and Pivotal field teams and to automate the entire process without losing customization and flexibility, I have built nsx-t-gen, which uses a Concourse pipeline (familiar to Cloud Foundry and Pivotal folks) to wrap around and execute the Ansible scripts that handle the final install. The end user only has to configure a set of values for various parameters. Along with the pipeline that handles the install, a default sample parameters file is bundled that shows what to configure, and how, for an install with PAS or PKS.

The entire nsx-t-gen pipeline is on a public Github repo:
https://github.com/sparameswaran/nsx-t-gen


Things handled by nsx-t-gen:

  • Deploy the VMware NSX-T Manager, Controller and Edge OVA images
  • Configure the Controller cluster and add it to the management plane
  • Configure hostswitches, profiles, transport zones
  • Configure the Edges and ESXi Hosts to be part of the Fabric
  • Create a T0 Router (one per run, in HA VIP mode) with uplink and static route
  • Configure an arbitrary set of T1 Routers with logical switches and ports
  • NAT Rules setup for T0 Router
  • Container IP blocks and external IP pools
  • Self-signed cert generation and registration against NSX-T Manager
  • Route redistribution for T0 Router
  • HA Spoofguard Switching Profile
  • Load Balancer (with virtual servers and server pool) creation

Acknowledgements

The author of this blog wishes to thank the following VMware folks:
  • Yasen Simeonov, TPM in the VMware NSBU team, Twitter: @yasensim, creator of the Ansible scripts, who made this possible.
  • Yves Fauser, TPM in the VMware NSBU team, Twitter: @yfauser, for technical guidance and troubleshooting.
  • Niran Even Chen, VMware road warrior leading the NSX pipeline field initiative, Twitter: @NiranEC, for extensive testing, feedback and fine-tuning of the tool (as with nsx-edge-gen earlier).


Disclaimer

The tools discussed in this blog post are neither officially supported nor managed by VMware or Pivotal. These are unsupported, and users are cautioned to use them at their own risk.


Concourse

For users new to Concourse, you can set up a quick install of Concourse using the Concourse Docker images. Please check out https://github.com/concourse/concourse-docker. Make sure you have additional disk space on the VM that runs Concourse and possibly also hosts the web server (install nginx if needed) serving the OVA images and the ovftool binary. Refer to the Concourse documentation page for usage of the fly tool and other Concourse-related details.

Pipeline Jobs

There are 4 jobs registered in the pipeline:
  1. Full Install: complete deployment of NSX-T, starting from the base deploy of the OVAs (NSX-T Edge, Mgr, Controller), through configuration of the Mgr and control cluster, to creation of routers and additional configurations.
  2. Base Install: deploy of the OVAs (NSX-T Edge, Mgr, Controller), configuration of the Mgr, and control cluster creation and membership. Users can keep OVA memory reservation on (for production) or turn it off (for demo/simple POC setups) so that creating all the controllers and edges does not end up causing resource starvation.
  3. Adding Routers: creation of the overlay and VLAN networks and hostswitches, addition of the ESXi hosts and edges as transport nodes, followed by creation of the T0 Router and any set of nested T1 Routers and logical switches. Additionally, container IP blocks and external IP pools are also created.
  4. Extra Configurations: creation of HA Spoofguard switching profiles, NAT rules for the T0 Router, load balancers (with virtual servers and server pools), and a custom self-signed cert for the NSX Manager.
The Full Install job automatically invokes Base Install, Adding Routers and Extra Configurations, while each of the other jobs runs just its own piece. Base Install and Adding Routers use customized Ansible scripts (provided by VMware), while Extra Configurations uses direct REST API calls against the NSX Manager to configure things.
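
As a rough illustration of how these surface in the Concourse UI, the pipeline exposes job groups along these lines (the group and job names below are illustrative; the actual ones are defined in the pipeline YAML in the repo):

```
# Hypothetical sketch of the Concourse job groups; the actual group and
# job names are defined in the pipeline YAML in the nsx-t-gen repo.
groups:
- name: full-install
  jobs: [install-nsx-t]        # end-to-end: base install + routers + extras
- name: base-install
  jobs: [base-install]         # OVA deploys, Mgr config, control cluster
- name: add-routers
  jobs: [add-routers]          # transport nodes, T0/T1 routers, switches
- name: extra-configs
  jobs: [extra-configs]        # NAT rules, LBRs, HA Spoofguard, custom cert
```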

Pipeline Configurations

All the configuration required by the pipeline is driven by the parameters specified during pipeline registration. There is a full sample params file in the GitHub repo.

vCenter config

The pipeline requires the default vCenter-related configuration details (vCenter endpoint, creds, datastore, cluster, resource pool, etc.). These are specified in the params file.


It also requires a standard management portgroup (like 'VM Network') that is not of NSX-T type for deploying the OVAs. Users would hit the following error if there is no such portgroup: "Host did not have any virtual network defined".
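
For illustration, the vCenter block of the params file could look roughly like this (the key names below are hypothetical placeholders; the bundled sample params file is authoritative):

```
# Illustrative vCenter block of nsx-t-params.yml; the key names are
# hypothetical placeholders, not the exact keys from the sample params.
vcenter_ip: vcsa-01.corp.local
vcenter_usr: administrator@vsphere.local
vcenter_pwd: 'vcenter-admin-password'   # placeholder
vcenter_datacenter: Datacenter1
vcenter_cluster: Cluster1
vcenter_datastore: vsanDatastore
vcenter_rp: nsx-t-rp                    # resource pool for the NSX-T appliance VMs
mgmt_portgroup: 'VM Network'            # non NSX-T portgroup used to deploy the OVAs
```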

Web Server

A web server needs to be configured to act as the hosting site for the OVA images and the ovftool binary. The configuration can be modified to use an S3 bucket if necessary (this requires modification of the pipeline's resources section as well).
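
A sketch of how the hosted bits might be referenced (key names and file names below are illustrative placeholders, not the exact ones expected by the pipeline):

```
# Illustrative web-server references; key names and file names are
# hypothetical, substitute the bits you actually host.
nsx_image_webserver: http://192.168.110.11:40001
nsx_mgr_ova: nsx-unified-appliance-2.1.0.ova   # fetched relative to the webserver root
nsx_controller_ova: nsx-controller-2.1.0.ova
nsx_edge_ova: nsx-edge-2.1.0.ova
ovftool_image: VMware-ovftool-4.2.0-lin.x86_64.bundle
```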


ESXi Hosts
The name, IP and root password of each ESXi host that should be configured as a transport node.
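
Roughly, the host entries look like this (key names, hostnames and IPs below are placeholders):

```
# Illustrative ESXi host entries; key names, hostnames and IPs are placeholders.
esxi_hosts:
- name: esxi-01.corp.local
  ip: 192.168.110.51
  root_pwd: 'esxi-root-password'
- name: esxi-02.corp.local
  ip: 192.168.110.52
  root_pwd: 'esxi-root-password'
```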


NSX-T Manager and Controller Config

Configuration (IPs, hostnames, creds, prefix names for controllers and edges) for the NSX-T install, along with the portgroups to be used for the external uplink and the internal overlay transport.
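
An illustrative sketch of this block (key names are placeholders; check the sample params for the real ones):

```
# Illustrative NSX-T Manager/Controller block; key names are placeholders.
nsx_manager_ip: 192.168.110.20
nsx_manager_hostname: nsx-mgr.corp.local
nsx_manager_root_pwd: 'nsx-root-password'
nsx_manager_admin_pwd: 'nsx-admin-password'
controller_ips: 192.168.110.21,192.168.110.22,192.168.110.23
controller_prefix: nsx-controller        # VM name prefix for the controllers
edge_prefix: nsx-edge                    # VM name prefix for the edges
vlan_uplink_portgroup: uplink-pg         # external/VLAN uplink portgroup
overlay_portgroup: overlay-pg            # internal overlay (TEP) transport portgroup
```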



TEP and Edge interfaces


Most of these defaults can be used as is.
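
For reference, these defaults cover roughly the following kind of settings (keys and values below are placeholders; the sample params carries the real defaults):

```
# Illustrative TEP pool and edge interface defaults; keys and values are placeholders.
tep_pool_cidr: 192.168.213.0/24
tep_pool_gateway: 192.168.213.1
tep_pool_start: 192.168.213.10
tep_pool_end: 192.168.213.200
edge_overlay_interface: fp-eth0   # edge dataplane nic carrying overlay/TEP traffic
edge_uplink_interface: fp-eth1    # edge dataplane nic carrying the VLAN uplink
```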

Edges created:


Reservation

For users running smaller setups or not wanting to reserve all of the memory, it is recommended to turn off reservation. For production setups, it is recommended to leave it set to true (on).

Edge Size

The size of the edge determines the memory and CPU usage, as well as the number and size of load balancers that can run on a given edge. The minimum edge size should be medium, while large is recommended (for PKS and PAS).
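
Both knobs boil down to a couple of parameters, sketched here with hypothetical key names:

```
# Illustrative reservation and edge sizing knobs; key names are placeholders.
nsx_t_keep_reservation: false   # false for demo/POC setups, true for production
nsx_t_edge_size: large          # medium is the minimum; large recommended for PAS/PKS
```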


T0 Router config

One T0 Router can be specified per run; it is tied to all of the T1 Routers, logical switches, NAT rules, load balancers, etc. defined in the rest of the configuration associated with that run.


The configuration includes the name, VIP, uplink IPs, static route, and tags (used by PAS).
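
A hedged sketch of the T0 Router section (the key names, IPs and the tag value below are illustrative placeholders):

```
# Illustrative T0 Router block; key names, IPs and the tag value are placeholders.
t0_router:
  name: T0-Router
  ha_mode: ACTIVE_STANDBY
  vip: 10.100.0.10/24                # HA VIP on the uplink
  uplink_ips: 10.100.0.11,10.100.0.12
  static_route_gateway: 10.100.0.1   # next hop for the default static route
  tags:
    ncp/cluster: pas1                # tag consumed by PAS (NCP)
```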

T1 Routers config

Multiple T1 Routers can be configured, each with its own set of logical switches. The sample template provides topologies for PKS and PAS.
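
An illustrative layout (router/switch names and CIDRs below are placeholders):

```
# Illustrative T1 Router layout; router/switch names and CIDRs are placeholders.
t1_routers:
- name: T1-Infra
  switches:
  - name: infra-ls
    subnet: 192.168.10.0/26
- name: T1-PAS
  switches:
  - name: pas-ert-ls
    subnet: 192.168.20.0/22
  - name: pas-services-ls
    subnet: 192.168.24.0/22
```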



HA Spoofguard Profile

An HA Spoofguard switching profile is required to maintain locks in NSX-T.


Container IP Blocks

Multiple IP blocks can be defined for use by containers (sketched together with the external IP pools below).

External IP Pools


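External IP pools provide routable IPs, consumed for things like SNAT translations and load balancer VIPs. Both sections might look roughly like this in the params file (block names, key names and CIDRs below are illustrative placeholders):

```
# Illustrative container IP blocks and external IP pools;
# all names, key names and CIDRs are hypothetical.
container_ip_blocks:
- name: pas-container-ip-block
  cidr: 10.4.0.0/16
external_ip_pools:
- name: external-snat-pool
  cidr: 10.100.4.0/24
  start: 10.100.4.10
  end: 10.100.4.200
```
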
NAT Rules on T0 Router

NAT rules can be defined for DNAT (incoming traffic to Ops Mgr or PKS, etc.) and SNAT (for containers to talk to the outside world).
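
A sketch of how such rules might be expressed (the keys and IPs below are illustrative placeholders):

```
# Illustrative NAT rules on the T0 Router; key names and IPs are placeholders.
nat_rules:
- rule_type: DNAT
  destination_ip: 10.100.0.20       # externally reachable IP for Ops Manager
  translated_ip: 192.168.10.5       # Ops Manager VM on the infra logical switch
- rule_type: SNAT
  source_network: 192.168.20.0/22   # internal deployment network
  translated_ip: 10.100.0.21        # routable SNAT address
```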


Custom Self-signed cert for NSX-Manager

One of the important steps is to assign a custom self-signed cert to the NSX Manager so that the cert it presents is valid. Provide the various attributes used for cert generation.
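
The cert-generation attributes amount to something like the following (key names below are placeholders; the sample params file has the authoritative ones):

```
# Illustrative cert-generation attributes; key names are placeholders.
nsx_t_csr_request:
  common_name: nsx-mgr.corp.local   # should match the NSX Manager hostname/IP
  org_name: Company
  org_unit: net-team
  country: US
  state: CA
  city: SF
  key_size: 2048
```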



Load Balancers

Multiple load balancers can be configured, but their size and the number of instances are bounded by the size of the Edge instance, as shown in the tables below.

LB Size    Virtual Servers    Pool Members
small      10                 30
medium     100                300
large      1000               3000

The number of LBs per edge is based on the edge size:

Edge Size    Small LBs    Medium LBs    Large LBs
small        0            0             0
medium       1            0             0
large        4            1             0
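
A hedged sketch of a load balancer definition (names, ports and key names below are illustrative placeholders):

```
# Illustrative load balancer definition; names, ports and key names are placeholders.
loadbalancers:
- name: pas-lbr
  size: small                  # must fit within the chosen edge size (see tables above)
  t1_router: T1-PAS
  virtual_servers:
  - name: gorouter-https
    ip: 10.100.4.50            # VIP, typically drawn from an external IP pool
    port: 443
    server_pool:
      name: gorouter-pool
      members: 192.168.20.11,192.168.20.12
```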


Snapshots of LBR, Virtual servers & Server Pools



Running the pipeline

Copy over the sample params as nsx-t-params.yml and then use the following script to register the pipeline (after editing the Concourse endpoint, target, etc.) against a running Concourse instance.


After registering the pipeline, unpause it before kicking off any job group.

Video of Run

Here is a recording of the pipeline execution.
Part 1 deals with the basic install of the OVA images and waiting for the VMs to come up.



Part 2 deals with the remaining steps (joining the control plane; configuring the hostswitches, transport nodes, T0 Router, and T1 Routers with logical switches) and the extra config steps (NAT rules, HA Spoofguard creation, etc.).


Conclusion

The aim of nsx-t-gen is to alleviate the pain involved in a manual installation of NSX-T v2.1, remove the various chances for misconfiguration, and deliver an easy, smooth and consistent install experience for the field and customers via the Concourse user interface.

Please post feedback and questions either here or on Twitter: @sabha_mp.


Tuesday, January 9, 2018

Concourse Pipeline for PCF 2.0 + NSX-T Add-on Tile

Users of Pivotal Cloud Foundry 2.0 (with ERT renamed to PAS) and the VMware NSX-T Container Plug-in tile can use the nsx-t-ci-pipeline Concourse pipeline and scripts to bring up a fully configured Ops Mgr 2.0 with PAS and NSX-T. Users can choose either just the base install of PAS, or a full install that adds the MySQL, RabbitMQ and SCS service tiles on top of PAS.





Versions installed:

Pivotal Operations Manager: 2.0.x
Pivotal Application Service (formerly ERT): 2.0.x
NSX-T Container Plug-in Tile: 2.1.x
Spring Cloud Services: 1.5.x
MySQL: 1.10.x
RabbitMQ: 1.11.x

Steps:
  1. Install and configure NSX-T with a T0 Router and T1 Routers, along with separate logical switches for Infra, ERT, Services and Dynamic-Services (following the standard PCF NSX reference architecture). The Concourse pipeline does not install or configure NSX-T.
  2. Clone the repo or just create a local copy.
  3. Use an existing Concourse install or bring up a new Concourse instance (scripts are available to create one). The Concourse install can use GitHub-based auth to allow access. Configure the BOSH director and cloud configs first to bring up the BOSH director, then configure the Concourse portion and bring up Concourse.
  4. Configure the parameters required for the pipeline (a sample params file is available in the repo under pipelines/params.sample.yml; see the hedged sketch after this list).
  5. The NSX-T Container Plug-in tile is available from the VMware site but not yet on network.pivotal.io. Until it becomes directly available for download as a Pivnet resource, download it from other sources (VMware or friendly SE contacts) and upload it to an S3 bucket for the pipeline to reference.
  6. Register and run the pipeline. Edit the pipelines/setup.sh script as needed to set the Concourse endpoint, pipeline name, params file, etc.
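
A minimal sketch of the kind of values the params file carries, assuming illustrative key names (pipelines/params.sample.yml in the repo is the authoritative list):

```
# Illustrative subset of the pipeline params; key names are placeholders,
# pipelines/params.sample.yml in the repo is the authoritative list.
opsman_domain_or_ip_address: opsmgr.corp.local
opsman_admin_username: admin
opsman_admin_password: 'opsman-admin-password'
vcenter_host: vcsa-01.corp.local
nsx_api_manager: 192.168.110.20            # NSX-T Manager consumed by the NCP tile
nsx_api_user: admin
nsx_api_password: 'nsx-admin-password'
nsx_t_container_tile_bucket: nsx-t-tiles   # S3 bucket holding the NCP tile (step 5)
```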


Thursday, December 14, 2017

Spring One Platform 2017 Session on PCF & NSX

Check out the video recording and presentation of Pivotal Cloud Foundry and VMware NSX-T integration at the recently conducted Spring One Platform 2017 session:


PCF in the Land of NSX: A Closer Look at PCF with NSX-V vs. NSX-T



Comments, feedback welcome.

Saturday, December 9, 2017

Restitution of ThreadLogic, a Java Thread Dump Analysis tool

I have been busy with Cloud Foundry in general and Pivotal in particular for the past 4 years and haven't had a chance to work on or contribute to ThreadLogic, the Java thread dump analysis tool I created, since leaving Oracle. Folks who don't know about ThreadLogic can check out the older ThreadLogic-related blog posts on this site.

In the meantime, the java.net projects site has been shuttered and all references to ThreadLogic have vanished, including the tool and docs. Going over the various blog posts and questions, it appeared that users really miss the tool. I have decided to bring it back alive and host it on the ThreadLogic GitHub repo so users can continue to use it and I can continue to add advisory patterns for newer thread interactions; I hope it will be of use to most Java and enterprise users. The newly re-versioned 2.5.1 is available here.

I would sincerely appreciate it if users could share feedback and samples of their thread dumps, either as posts or gists or by filing issues at the above GitHub URL, so we can add the patterns and advisories that would allow the tool to grow richer with more known patterns and more usage.

Just to be aware: I haven't had a chance to run a full set of tests against thread dumps from recent versions of Java (1.8 from Oracle/IBM, etc.), so there can be problems and errors in handling those thread dumps. Kindly let me know and I will try to fix those issues. Thanks for your patience and consideration.

Feedback, comments and questions on ThreadLogic welcome as always.

Thursday, August 17, 2017

Automating NSX Integration with PCF

Customers using Pivotal Cloud Foundry (PCF) with VMware NSX (SDN solution) can now automate the creation of NSX Edge Gateway instances with a pre-configured set of subnets and load balancers for the various PCF tile/product components. nsx-edge-gen helps automate the creation of NSX Edge instances pre-configured for PCF, while nsx-ci-pipeline helps install a core set of PCF products integrated with NSX using a Concourse pipeline.

Overview

Pivotal Cloud Foundry is a Cloud Foundry platform for running cloud native applications on various IaaSes (vSphere, AWS, GCP, Azure, OpenStack).

For customers using VMware NSX as the SDN solution and vSphere as the IaaS, configuring NSX Edge instances for networking and load balancing requires multiple manual steps: creating different networks for logical partitioning of the various PCF products, then creating and configuring NSX Edge instances with these subnets, virtual servers, etc. in order to get the PCF platform up and running. Creating an NSX Edge can be tricky and time consuming, and repeating it for multiple PCF foundations is quite a labor even for experienced administrators. The manual steps are detailed in the NSX Edge Cookbook for PCF.

Reference architecture of NSX + PCF

Ref: https://docs.pivotal.io/pivotalcf/1-11/refarch/images/vsphere-overview-arch.png



Ref: https://docs.pivotal.io/pivotalcf/1-11/refarch/images/vsphere-port-groups.png

Details

NSX Edge acts as the gatekeeper to a set of logical switches and load balancers associated with a PCF Foundation (managed by one Ops Mgr and BOSH Director).

The Logical switches are associated with the subnets used by various products/layers:
  • Infrastructure - for BOSH and Ops Mgr managing the entire install
  • Deployment - for the main Elastic Runtime tile, which is Cloud Foundry itself
  • Services - for other supporting tiles that provide services (like MySQL, RabbitMQ, SCS...)
  • Dynamic Services - for tiles that support the On-Demand Broker model of spinning up new service instances on demand.
  • Isolation Segments - for apps that require their own Routers and Diego cells for specialized hardware/routing/isolation.


Virtual servers are required with pools (application profiles, rules and monitors) to handle load balancing for components like the GoRouters, Diego SSH access, MySQL proxy, RabbitMQ proxy and so on.

Additionally, some generic coarse-grained firewall rules can be applied at the NSX Edge instance level to allow or disallow communication between tiles/products or in the inbound and outbound directions (East-West and North-South).

Carrying out all of these manual steps is tedious and requires great care, and repeating them for each foundation means more time and grunt work.


nsx-edge-gen Tool


Automating the NSX Edge Creation

Users can now automate the creation of NSX Edge instances with a set of logical switches, load balancers and pools that conform to a template, using the nsx-edge-gen toolkit. The tool supports the NSX-V flavor (6.2.x versions) of the VMware NSX product, not NSX-T (to be released later in 2017 or early 2018).

Note: the tool does not install or configure vSphere or NSX Manager itself; it only works against an existing installation and creates/deletes NSX Edge instances on an existing NSX Manager.

To start with, nsx-edge-gen provides a template of logical switches and routed components as laid out in the reference architecture, and the user can modify it either via the command line or a YAML config file.


Distributed Logical Router (DLR)

The Distributed Logical Router (DLR) handles a special use case: it allows communication between the logical switches and subnets to avoid a hairpin bend across the NSX Edge and instead go through the DLR. A DLR also allows well above the default maximum of 10 logical switches that can be associated with an NSX Edge instance. An auto-generated OSPF subnet wraps the DLR and connects it to the NSX Edge instance. There is a 1-1-1 mapping between each NSX Edge, OSPF subnet and DLR, and these are auto-created. There is no overhead cost (license or performance) due to the OSPF and DLR layers.

If the DLR option is disabled, then the standard PCF reference architecture as defined in the NSX cookbook is used (no DLR or OSPF).

Logical switches


```
logical_switches:
- name: OSPF  
  cidr: 172.16.100.10/24
  primary_ip: 172.16.100.2
- name: Infra  
  cidr: 192.168.10.0/26
  primary_ip: 192.168.10.2
- name: Ert
  cidr: 192.168.20.0/22
  primary_ip: 192.168.20.2
- name: PCF-Tiles
  cidr: 192.168.24.0/22
  primary_ip: 192.168.24.2
- name: Dynamic-Services
  cidr: 192.168.28.0/22
  primary_ip: 192.168.28.2
#- name: IsoZone-01
#  cidr: 192.168.32.0/22
#  primary_ip: 192.168.32.2
# - name: IsoZone-02
#   cidr: 192.168.36.0/22
#   primary_ip: 192.168.36.2
```
Additional logical switches can be added, like additional isolation segments. The subnets can also be tweaked.

Routed Components

PCF components like the GoRouter, Diego Brain and MySQL proxy require a load balancer in front for HA and distribution of traffic across multiple instances. A similar requirement exists for the RabbitMQ and MySQL tiles, etc. Others, like Operations Manager (Ops Mgr for short), require only a VIP for access.

nsx-edge-gen provides a default set of routed components with an associated logical switch, offset, number of instances, etc. for each component.

```
routed_components:
- id: OPS 
  name: OPS
  switch: INFRA
  external: true
  useVIP: false
  instances: 1
  offset: 5
  monitor_id: monitor-3
  transport:
    ingress:
      port: '443'
      protocol: https
    egress:
      port: '443'
      protocol: https
      monitor_port: '443'
      url: "/"
  
- id: GO-ROUTER
  name: GO-ROUTER
  switch: ERT
  external: true
  useVIP: true
  instances: 4
  offset: 200
  monitor_id: monitor-4
  transport:
    ingress:
      port: '443'
      protocol: https
    egress:
      port: '80'
      protocol: tcp
      monitor_port: '80'
      # protocol: http
      # monitor_port: '8080'
      # url: "/health"
```

One can specify whether the component needs to be external or not, the number of VMs hosting that component, whether to use a VIP, the type of monitor (http/tcp/...), and the ingress and egress for the load balancer. The sample pasted above specifies an Ops Manager that needs to be exposed outside via a VIP but does not require a load balancer (single instance) and needs to run on the Infra subnet. The offset determines the IPs assigned to the VMs from the subnet CIDR.

The default configuration includes Ops Manager (tagged as OPS), GoRouter (GO-ROUTER), Diego Brain for SSH (DIEGO), TCP Router (TCP-ROUTER), MySQL bundled within ERT as well as the separate service tile (as MYSQL-<type>), RabbitMQ (RABBITMQ-TILE), and Iso Segment GoRouters (GO-ROUTER-ISO). Each of these components is associated with a logical switch.

The default built-in configuration is good enough for most deployments.

These components are then tied together with NSX load balancers and pools using Application Rules and Profiles. Profiles specify the ingress and egress protocol for the load balancer (LBR). For instance, the GoRouter might let the LBR handle SSL termination and only accept plain traffic over HTTP, so it can use the https-http profile, while MySQL would use a pure TCP-style profile. The Application Rules cover HTTP logging, forwarding, inserting X-Forwarded-Proto headers, etc.


Generation

nsx-edge-gen requires the user to provide some default configurations (like endpoints and credentials for vSphere and NSX Manager, cluster, datastore, datacenter). Other required configurations are the name of the edge, the transport zone used for the logical switches, SSL certs (or allow auto-generation) for the LBRs, the distributed portgroup in case DLR is enabled, uplink IPs for the various components (as VIPs), etc. Multiple edge instances can also be created using the same template.

Each NSX Edge instance is created with a default set of firewall rules, virtual servers, pools, profiles, etc. For those that have DLR enabled, there is an OSPF network acting as the bridge between the NSX Edge and its DLR.

Use the tool with the build, list or delete options to either build an NSX Edge instance, list available instances (and logical switches, and verify the parameters), or delete a specified edge instance.

OSPF and DLR



Logical Switches



Firewall Rules




NATs




Virtual Servers





Additionally, based on whether the user indicates that the target BOSH environment supports NSX, the tool either does not populate the pool members (relying instead on NSX Security Group association for the jobs) when BOSH supports NSX (as in PCF 1.11), or statically fills in the member IPs based on the offsets and instance counts.

Customizing the configuration

Using command-line arguments, it is entirely possible to override the subnets, names, offsets, instances, etc. Check the sample test script under the test folder of nsx-edge-gen.

```
#!/bin/bash
echo "Use build, list, delete"
echo "Default option: list"
echo ""

RUN_CMD=${1:-list}
CONFIG_NAME=test-nsx
rm -rf $CONFIG_NAME

./nsx-gen/bin/nsxgen -i $CONFIG_NAME init

./nsx-gen/bin/nsxgen -c $CONFIG_NAME  \
  -esg_name_1 edge1 \
  -esg_size_1 compact \
  -esg_cli_user_1 admin \
  -esg_cli_pass_1 'P1v0t4l!P1v0t4l!' \
  -esg_ert_certs_1 Foundation1 \
  -nsxmanager_dportgroup DPortGroupTest \
  -nsxmanager_en_dlr true \
  -nsxmanager_bosh_nsx_enabled true \
  -nsxmanager_tz TestTZ \
  -nsxmanager_tz_clusters 'Cluster1,Cluster2' \
  -esg_ert_certs_config_sysd_1 sys2.test.pivotal.io \
  -esg_ert_certs_config_appd_1 apps3.test.pivotal.io \
  -esg_iso_certs_1_1 iso-1 \
  -esg_iso_certs_config_switch_1_1 IsoZone-1 \
  -esg_iso_certs_config_ou_1_1 Pivotal \
  -esg_iso_certs_config_cc_1_1 US \
  -esg_iso_certs_config_domains_1_1 zone1-app.test.pivotal.io \
  -esg_opsmgr_uplink_ip_1 10.193.99.171 \
  -esg_go_router_uplink_ip_1 10.193.99.172 \
  -esg_diego_brain_uplink_ip_1 10.193.99.173 \
  -esg_tcp_router_uplink_ip_1 10.193.99.174 \
  -esg_mysql_ert_uplink_ip_1 192.168.23.250 \
  -esg_mysql_ert_inst_1 5  \
  -esg_mysql_tile_uplink_ip_1 192.168.27.250 \
  -esg_mysql_tile_inst_1 2  \
  -esg_rabbitmq_tile_uplink_ip_1 192.168.27.251 \
  -esg_rabbitmq_tile_inst_1 5 \
  -esg_rabbitmq_tile_off_1 10 \
  -vcenter_addr vcsa-01.test.pivotal.io \
  -vcenter_user administrator@vsphere.local \
  -vcenter_pass 'passwd!' \
  -vcenter_dc "" \
  -vcenter_ds vsanDatastore \
  -vcenter_cluster Cluster1 \
  -nsxmanager_addr 10.193.99.20 \
  -nsxmanager_user admin \
  -nsxmanager_pass 'passwd!' \
  -nsxmanager_uplink_ip 10.193.99.170 \
  -nsxmanager_uplink_port 'VM Network' \
  -esg_gateway_1 10.193.99.1  \
 -isozone_switch_name_1 IsoZone-1 \
 -isozone_switch_cidr_1 192.168.34.0/22 \
 -isozone_switch_name_2 IsoZone-2 \
 -isozone_switch_cidr_2 192.168.38.0/22 \
 -esg_go_router_isozone_1_uplink_ip_1  10.193.99.181 \
 ....
```
This allows the user to override the configuration at invocation time without rebuilding YAML configurations for each run.

NSX CI Pipeline

The Concourse pipelines built in nsx-ci-pipeline allow users to drive the creation of the NSX Edge instances and then install Ops Mgr along with ERT and other Pivotal products that leverage the NSX Edge for load balancing and/or security policies.

Users create a parameter file that is consumed by the pipeline to connect to NSX and create the NSX Edge instances using the nsx-edge-gen tool mentioned previously. Ops Mgr is installed after the edge creation, the networks are auto-populated based on the logical switches of the newly created edge instance, and this is followed by installation of ERT and the other product tiles.

Users can either stop with just the installation of Ops Mgr (for vSphere) and ERT, or go all the way with installation of the MySQL, RabbitMQ, Spring Cloud Services and Isolation Segment tiles.
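
A hedged sketch of the kind of parameters the pipeline consumes (key names below are placeholders; the sample params file in the repo is the authoritative reference):

```
# Illustrative nsx-ci-pipeline params; key names are placeholders, the
# sample params file in the repo is the authoritative reference.
vcenter_host: vcsa-01.test.pivotal.io
vcenter_usr: administrator@vsphere.local
vcenter_pwd: 'vcenter-admin-password'
nsx_manager_address: 10.193.99.20
nsx_manager_admin_user: admin
nsx_manager_admin_passwd: 'nsx-manager-password'
nsx_edge_gen_name: edge1            # NSX Edge instance created via nsx-edge-gen
om_admin_user: admin                # Ops Mgr credentials used after edge creation
om_admin_password: 'opsman-admin-password'
ert_version: 1.11.x                 # product versions to pull and install
```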

Just Ops Mgr and ERT:




Full Install (with MySQL, RabbitMQ, Iso Segments):




The pipeline supports both PCF 1.10 and 1.11. The product versions, NSX connection information, and so on can all be specified in the Concourse parameters file.

With the NSX integration in the PCF 1.11 BOSH layer, the pipeline automatically calls Ops Mgr APIs to register the routed components with the pre-created load balancer pools, along with any additional security groups specified for components that require proxying/load balancing (GoRouters, TCP routers, MySQL proxy, RabbitMQ proxy, etc.). This allows the pools to stay correctly associated with their members as the platform scales its components up or down, rather than going with a static predefined set of members. This behavior is not available in 1.10, so there the pools are statically populated with the IP addresses pre-determined by the nsx-edge-gen execution.

Multiple Isolation Segment Tiles can be installed using the add-additional-iso-segment pipeline.

Summary

Using nsx-edge-gen and nsx-ci-pipeline, users of PCF and NSX can automate the manual steps involved in NSX Edge creation and configuration and in PCF installs, enabling fast, easy and efficient creation of multiple NSX Edges and PCF foundations that conform to a desired layout in a consistent manner.

Wednesday, June 19, 2013

WLS Cluster Messaging Protocols on A-Team Chronicles

Oracle A-Team has launched a new Oracle A-Team Chronicles web site for all A-Team generated content that covers architecture, product features, technical tips, performance tuning, troubleshooting and best practices on Oracle Middleware products.

There is a new blog post on WLS cluster messaging protocols in the A-Team Chronicles. For more details and related A-Team recommendations, kindly refer to WebLogic Server Cluster Messaging Protocols.

Friday, September 28, 2012

OSB, Service Callouts and OQL - Part 3



This final section of the series focuses on the corrective actions to avoid Service Callout related OSB server hangs. Before we dive into the solution, we need to briefly discuss Work Managers in WLS.

WLS Work Managers


WLS version 9 and newer releases use the concept of Work Managers (WMs) and self-tuning thread pools to schedule and execute server requests (internal or external). All WLS server (ExecuteThread) threads are held in a global self-tuning thread pool, and requests are associated with WMs. The threads in the global pool can grow or shrink based on built-in monitoring and heuristics (grow when requests are piling up over a period of time, shrink when threads are sitting idle for long, etc.). Once a request is finished, the thread goes back to the global self-tuning pool. A Work Manager is a way to associate a request with a scheduling policy, not with a thread. There is a "default" WM in WLS that is automatically created; a copy or template of the "default" WM is used for all deployed applications out of the box.

There are no dedicated threads associated with any WM. If an application decides to use a custom (non-default) WM, the requests meant for that application (webapp servlet requests, EJBs or MDBs) will be scheduled based on that WM's policies. The association of a Work Manager with an application is made via wl-dispatch-policy in the application descriptor. Multiple applications can use a copy of a Work Manager policy (each inheriting the policies associated with that WM) or refer to their own custom WMs.

Within a given Work Manager, there are options like Min-Thread and Max-Thread Constraints. Explaining the whole WM concept and the various constraints is beyond the scope of this blog entry; please refer to the Workload Management in WebLogic Server white paper on WMs and the Thread Constraints in Work Managers blog post to understand more about constraints and WMs. We will later use custom WMs and a Min-Thread Constraint for our particular problem.

Suffice to say, use a custom Work Manager with a Min-Thread Constraint (set to a really low value, say 3 or 5) for a given application only under rare circumstances, to avoid thread starvation issues such as incoming requests requiring an additional server thread to complete (as in the case of a Service Callout requiring an additional thread to complete the response notification) or loop-backs of requests (AppA makes an outbound call which again lands on the same server as a new request for AppB); using Min-Thread Constraints excessively can cause too many threads or priority inversion, as mentioned in the previously referenced blog post. Use a custom WM with a Max-Thread Constraint only in the case of MDBs (to increase the number of MDB instances processing messages in parallel).

Corrective Actions for handling Service Callouts


Now that we have seen (or detected) how Service Callouts can contribute to Stuck threads and thread starvation, there are solutions that can be implemented to make OSB gracefully recover from such situations.

1) Ensure the remote Backend Services invoked via Service callouts or Publish can scale under higher loads and still maintain response SLAs. Using the heap dump analysis, identify the remote services involved and improve their scalability and performance.

2) Use the Route action instead of a Service Callout whenever possible, i.e., when the actual invocation for a proxy is a single service rather than multiple and can be implemented using a simple Route. Avoid Service Callouts for calling co-located services that only do simple transformations or logging; just invoke them as replace/insert/rename/log actions directly instead of using Service Callouts to achieve the same result.

3) Protect OSB from thread starvation due to excessive usage of Service Callouts under load. The actual response handling is done by an additional thread (thread T2 in the Service Callout implementation image) for a very short duration, and it just notifies the proxy thread waiting on the Service Callout response.

This is a good match for the custom Work Manager with Min-Thread Constraint we discussed earlier, as there is a requirement for an additional thread to complete a given request (a bit of a loop-back). For the remote Business Service definition that is invoked via Service Callouts, we can associate a custom Work Manager with a Min-Thread Constraint so the response handling part (T2) gets a thread and is scheduled right away due to the Min-Thread Constraint (as long as we have not exhausted it) instead of waiting to be scheduled. Since the thread is really used for only a very short duration, it is the right fit for our situation. The thread can pick up the Service Callout response of the remote service from the native muxer layer when the response is ready, immediately notify the waiting Service Callout thread, and then either work on the next Service Callout response or go back to the global self-tuning pool. This will ensure there are no thread starvation issues with the Service Callout pattern under high loads.

Custom Work Managers and Min-Thread Constraint


Create a custom Work Manager in WLS with a low Min Thread Constraint (at most 5) via the WLS Console.

Log in to the WLS Console and expand the Environment node to select Work Managers.





Start with creation of a Min Thread Constraint.
Select the Count to be 5 (or less).




Target the Constraint to the relevant servers (or OSB Cluster).
Next create a new Work Manager.

 



Target the new custom Work Manager to the relevant servers (or OSB Cluster).
Next associate the Work Manager with the previously created Min Thread Constraint.
Leave the rest as empty (Max Thread Constraint/Capacity Constraint).





Note: the server instances would have to be restarted to pick up the changes.

OSB Business Service with Custom Work Manager


Log in to the sbconsole.
Create a session.
Go to the related Business Service configuration.
Edit the HTTP Transport Configuration.
Select the newly created custom Work Manager for the Dispatch Policy.


Save changes.




Commit the session changes

Now, with the above changes to the Business Service, OSB Service Callouts will always use the custom Work Manager with the Min-Thread Constraint to handle the response from the Business Service and the notification of the waiting proxy thread, and will not run into any of the thread starvation issues that we observed earlier with Service Callouts.

The same custom Work Manager can be used by multiple Business Services that are all invoked via Service Callout actions, as each thread is used for a very short duration and can handle responses for multiple Business Services. Set the dispatch policy of the Business Services invoked via Service Callouts to use the custom WM.

Summary


Hopefully this series gave some pointers on the internal OSB implementation of Route vs. Service Callout, the correct usage of Service Callouts, identifying issues with callouts using thread dump and heap dump analysis, and the solutions to resolve them.