Troubleshooting

Bayware Engine Diagnostics

  1. I have just installed and configured the Bayware processor (proc-1, proc-2, proc-3, proc-4), but it does not show up in Topology on the orchestrator.

    • Ensure that the service is running. As root on the processor node, type the following

      ]# systemctl status ib_engine
      

      The status should indicate both loaded and active (along with either exited or running). If you do not see this status, restart the service

      ]# systemctl restart ib_engine
      
    • Check the login credentials used to attach the node to the orchestrator. You can verify the orchestrator FQDN, domain, username, and password used during engine configuration. As root, view the following

      ]# more /opt/ib_engine/releases/1/sys.config
      

      Find keystone_token near the top. This shows the FQDN of the orchestrator (ignore the trailing path), for example

      {keystone_token,"https://c1382fd7.sb.bayware.io/api/v1"}
      

      Ensure that c1382fd7.sb.bayware.io matches the FQDN for the orchestrator shown on the SIS. You can find the SIS FQDN in the URL section (everything that comes after https:// in the orchestrator row).

      Search further for login, password, and domain and ensure that these match processor login credentials on your SIS.

      If the credentials do not match, re-run the ib_engine configuration script

      ]# /opt/ib_engine/bin/ib_configure -i
      
  2. The Bayware processor shows up on the orchestrator, but it doesn’t make connections with any other processor.

    • Be patient. It can take up to one minute to form the link.

    • Click Resources and then click on the processor Node Name on the orchestrator. Scroll to the bottom and ensure you configured the expected link.

    • As root on the processor node, ensure the IPsec client, strongSwan, has been configured.

      ]# systemctl status strongswan
      

      If strongSwan is not active, restart the service

      ]# systemctl restart strongswan
      

      Once strongSwan is active, ensure that it has security associations set up with other nodes. There should be one security association established for each green link shown on the Topology page.

      ]# strongswan status
      

      If there are no security associations, or if systemctl indicates that the strongswan service is not running, then strongSwan may not have been configured. Re-run the engine configuration script as described in item 1 above and be sure to answer yes to IPsec.
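The orchestrator FQDN check in item 1 can be scripted. The following is a minimal sketch, assuming the keystone_token entry sits on a single line in the form shown above. The function name orchestrator_fqdn is hypothetical and is not part of the Bayware tooling.

```shell
# orchestrator_fqdn FILE
# Hypothetical helper: print the host part of the keystone_token URL found in
# an engine sys.config. Assumes the entry is on one line, for example
#   {keystone_token,"https://c1382fd7.sb.bayware.io/api/v1"}
orchestrator_fqdn() {
    sed -n 's|.*{keystone_token,[[:space:]]*"https\{0,1\}://\([^/"]*\).*|\1|p' "$1" |
        head -n 1
}

# Usage (as root on the processor node):
#   orchestrator_fqdn /opt/ib_engine/releases/1/sys.config
```

Compare the printed host name with the orchestrator FQDN on the SIS. The login, password, and domain values still need to be checked by hand.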

Bayware Agent Diagnostics

  1. I have just installed and configured the Bayware agent, but it does not show up in Topology on the orchestrator.

    • Ensure that the service is running. As root on the workload node, type the following

      ]# systemctl status ib_agent
      

      The status should indicate both loaded and active (running). If you do not see this status, restart the service

      ]# systemctl restart ib_agent
      
    • Check the login credentials used to attach the node to the orchestrator. You can verify the orchestrator FQDN, domain, username, and password used during agent configuration. As root, view the following

      ]# more /etc/ib_agent.conf
      

      Verify that controller_ip matches the IP address listed for aws-c1 on your SIS, and that login, password, and domain match the values expected from the SIS.

      If the credentials do not match, re-run the ib_agent configuration script

      ]# /opt/ib_agent/bin/ib_configure -i
      
    • Check ib_agent status to ensure that it is properly registered with the orchestrator. To do this, you need the IP address and port used for the REST interface. Look for the [rest] section near the bottom of the following file

      ]# more /etc/ib_agent.conf
      

      It should look like

      ...
      
      [rest]
      rest_ip = 192.168.250.1
      rest_port = 5500
      log_file = /var/log/ib_agent_rest.log
      log_level = DEBUG
      

      Note the rest_ip and rest_port and use them in the following curl command. For example,

      [root@aws-11-382fd7 ~]# curl 192.168.250.1:5500/api/v1/status
      

      The ready, registered, and success keys should all have a value of true. You can also verify the login credentials and the orchestrator IP address (called controller in this context).

  2. The Bayware agent shows up on the orchestrator, but it doesn’t make connections with any other processor.

    • Be patient. It can take up to one minute to form the link.

    • As root on the workload node, ensure the IPsec client, strongSwan, has been configured.

      ]# systemctl status strongswan
      

      If strongSwan is not active, restart the service

      ]# systemctl restart strongswan
      

      Once strongSwan is active, ensure that it has security associations set up with other nodes. There should be one security association established for each green link shown on the Topology page.

      ]# strongswan status
      

      If there are no security associations, or if systemctl indicates that the strongswan service is not running, then strongSwan may not have been configured. Re-run the agent configuration script as described in item 1 above and be sure to answer yes to IPsec.
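For the registration check in item 1, the REST address can be read from the configuration file rather than copied by hand. The following is a minimal sketch, assuming /etc/ib_agent.conf uses the key = value layout shown in the [rest] example above. The function name rest_endpoint is hypothetical.

```shell
# rest_endpoint FILE
# Hypothetical helper: print "rest_ip:rest_port" from the [rest] section of an
# agent configuration file. Exits with status 1 if either key is missing.
rest_endpoint() {
    awk -F'[ \t]*=[ \t]*' '
        { sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "") }      # trim the line
        /^\[/ { in_rest = ($0 ~ /^\[rest\]/); next }    # track current section
        in_rest && $1 == "rest_ip"   { ip = $2 }
        in_rest && $1 == "rest_port" { port = $2 }
        END {
            if (ip != "" && port != "") print ip ":" port
            else exit 1
        }
    ' "$1"
}

# Usage (as root on the workload node):
#   curl "$(rest_endpoint /etc/ib_agent.conf)/api/v1/status"
```

Only keys inside the [rest] section are considered, so a key with the same name in another section does not affect the result.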

Getaway App & Voting App Diagnostics

It’s best to ensure that the App is running on a properly configured Bayware interconnection fabric before checking individual microservices.

Table 6 Getaway App Connectivity

Host      Host Owner     URL
------    -----------    ----------------------------------
aws-11    http-proxy     frontend.getaway-app.ib.loc
aws-12    getaway-svc    news-api.getaway-app.ib.loc
aws-12    getaway-svc    weather-api.getaway-app.ib.loc
aws-12    getaway-svc    places-api.getaway-app.ib.loc

Table 7 Voting App Connectivity

Host      Host Owner     URL
------    -----------    ----------------------------------
aws-11    http-proxy     result-frontend.voting-app.ib.loc
aws-11    http-proxy     voting-frontend.voting-app.ib.loc
aws-12    worker         result-worker.voting-app.ib.loc
aws-12    worker         voting-worker.voting-app.ib.loc
gcp-11    voting-svc     voting-backend.voting-app.ib.loc
azr-11    result-svc     result-backend.voting-app.ib.loc

To check the fabric, log in to one of the workload hosts listed in Table 6 and Table 7 and ping the URL listed for that host. For example,

[centos@aws-11-382fd7]$ ping frontend.getaway-app.ib.loc

If connectivity exists over the Bayware interconnection fabric, then you should see ongoing responses indicating 64 bytes from .... If you do not see response packets, then resume troubleshooting ib_agent and ib_engine in the sections above.
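When several URLs need checking from the same host, the ping test can be wrapped in a loop. The following is a minimal sketch, assuming the Linux iputils ping (-c sets the packet count, -W the per-reply timeout in seconds). The function name check_fabric is hypothetical.

```shell
# check_fabric HOST...
# Hypothetical helper: send three pings to each name and print OK or FAIL.
# Returns 1 if any name did not answer, 0 otherwise.
check_fabric() {
    failed=0
    for host in "$@"; do
        if ping -c 3 -W 2 "$host" >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Usage (on aws-12, names taken from Table 6):
#   check_fabric news-api.getaway-app.ib.loc weather-api.getaway-app.ib.loc \
#                places-api.getaway-app.ib.loc
```

Any name reported as FAIL points back to the ib_agent and ib_engine sections above.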

If you do see ping response packets as indicated, then ensure the application service units are installed and running on the proper VMs. This is performed differently for Getaway App and Voting App.

With Getaway App, for instance, as indicated in Getaway Microservices VM Mapping, the http-proxy microservice running on aws-11 relies on a service unit called getaway-proxy, which should be installed and started on aws-11. Log in to aws-11 as root and ensure it is installed

]# yum list installed | grep getaway-proxy

If you get a positive response, then ensure that the service unit is running under systemd

]# systemctl status getaway-proxy

You should see a response of active (running). If the service unit is not installed or it is not running, you can follow the tutorial installation instructions to reinstall and start or restart it (systemctl start getaway-proxy or systemctl restart getaway-proxy).
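The two checks above can be combined into one helper. The following is a minimal sketch that swaps yum list installed for rpm -q (which exits non-zero when a package is absent) and uses systemctl is-active. It assumes the package and the systemd unit share the same name, which the source does not state explicitly. The function name check_service_unit is hypothetical.

```shell
# check_service_unit NAME
# Hypothetical helper: report whether package NAME is installed and whether
# the systemd unit of the same name is active.
# Returns 0 if both hold, 1 if not installed, 2 if installed but not running.
check_service_unit() {
    if ! rpm -q "$1" >/dev/null 2>&1; then
        echo "$1: not installed"
        return 1
    fi
    if ! systemctl is-active --quiet "$1"; then
        echo "$1: installed but not running"
        return 2
    fi
    echo "$1: installed and running"
}

# Usage (as root on aws-11):
#   check_service_unit getaway-proxy
```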

Also ensure that only a single getaway service is running; that is, only one getaway-* unit should be listed among the running services. Show all running services with

]# systemctl list-units --type service

If an unexpected getaway-* service appears in the list, stop the service. For example, to stop the getaway-service service

]# systemctl stop getaway-service
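To spot an extra getaway-* service without reading the whole list, the output can be filtered. The following is a minimal sketch, assuming the usual systemctl list-units column order (UNIT, LOAD, ACTIVE, SUB, DESCRIPTION). The function name running_getaway_units is hypothetical.

```shell
# running_getaway_units
# Hypothetical helper: read `systemctl list-units` output on stdin and print
# the name of every getaway-* service whose SUB state is "running".
running_getaway_units() {
    awk '$1 ~ /^getaway-.*\.service$/ && $4 == "running" { print $1 }'
}

# Usage (as root on the workload node):
#   systemctl list-units --type service --no-legend | running_getaway_units
```

If more than one unit name is printed, stop the unexpected one as shown above.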

With Voting App, you should find a container running on each VM where you expect a microservice. Log in to a workload node as root and execute the following

]# systemctl list-units --type service --all | grep _container

A positive response should show a container service recognizable as being part of Voting App, for instance, http-proxy_container. It should be loaded, active, and running with an output similar to

http-proxy_container.service       loaded    active   running "http-proxy_container"

Re-run the deploy-voting-app.sh script as described in Installation with Ansible if any service is missing or its status is incorrect.
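Before re-running the deployment, it can help to list exactly which container services are in the wrong state. The following is a minimal sketch, assuming the list-units line format shown above (unit name followed by the LOAD, ACTIVE, and SUB columns). The function name unhealthy_containers is hypothetical.

```shell
# unhealthy_containers
# Hypothetical helper: read `systemctl list-units` output on stdin and print
# every *_container.service that is not "loaded active running".
# The unit name is located by pattern, so a leading status marker on the line
# does not shift the columns that are checked.
unhealthy_containers() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /_container\.service$/) {
                if ($(i+1) != "loaded" || $(i+2) != "active" || $(i+3) != "running")
                    print $i
                next
            }
        }
    }'
}

# Usage (as root on the workload node):
#   systemctl list-units --type service --all --no-legend | unhealthy_containers
```

An empty result means every listed container service is healthy. It does not show that an expected container is present, so still compare the list against Table 7.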