Apache Camel Cluster Service Environment Debugging Tips

In certain scenarios, you may need to debug issues related to the Apache Camel Cluster Service. One symptom of a non-working setup is “unacked” records remaining in the blc_notification_state tables, which indicates that the Retry Cluster Service is not working.

If you are unfamiliar with how Broadleaf leverages Apache Camel’s Cluster Service, you can read more about it here: Broadleaf Dev Central

The retry logic mainly kicks in for two scenarios:

  1. to send notifications for notification states that have been inserted as seed data (or directly in the DB with an “unacked” status), OR
  2. if the initial message send performed during a CRUD operation fails for whatever reason (e.g. the broker is temporarily down).

In case (2), the initial send usually succeeds and the retry flow is never hit. As a result, a problem with the retry flow only becomes visible when that initial send fails, which it typically doesn’t.

Debug Tip 1: Check ENV Properties

When running on Kubernetes, specific ENV properties must be set to tell the system which Camel Cluster implementation to use: for example, file-based for local development, and kubernetes or zookeeper when deploying to a higher environment (e.g. on Kubernetes in the cloud).

Double-check that the appropriate ENV properties are set and injected into your pods: Broadleaf Dev Central

Additional Tip: Turn on the Broadleaf Environment Report and verify that your ENV properties are being picked up as expected. For example, in your config server, add the following property for all of your flex packages (setting disabled to false enables the report):

broadleaf:
  environment:
    report:
      disabled: false

Debug Tip 2: When using the Kubernetes Camel Cluster Service - check leases

If you have configured the Kubernetes Camel Cluster Service, the first thing to check is that each of your deployed applications is able to create a Kubernetes lease. You can verify this by running kubectl get leases; you should see output similar to what is presented here: Broadleaf Dev Central
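
To inspect a single lease in more detail, kubectl get lease <lease-name> -o yaml prints the underlying Lease object. The sketch below shows the standard fields of a coordination.k8s.io/v1 Lease so you know what to look for. The name, namespace, holder, and timestamps are hypothetical placeholders, not values taken from a Broadleaf deployment.

```yaml
# Illustrative Lease object; all names and values below are hypothetical.
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: example-lease                        # hypothetical lease name
  namespace: default                         # hypothetical namespace
spec:
  holderIdentity: browse-7c9f8d6b5-abcde     # identity of the current holder (hypothetical pod name)
  leaseDurationSeconds: 15                   # how long the lease stays valid without renewal
  acquireTime: "2024-01-01T12:00:00.000000Z" # when the current holder acquired the lease
  renewTime: "2024-01-01T12:00:10.000000Z"   # should keep advancing while the holder is healthy
  leaseTransitions: 1                        # number of times the holder has changed
```

An empty holderIdentity, or a renewTime that has stopped advancing, suggests that the holder is no longer renewing the lease.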

Important: each type of FLEX PACKAGE should hold various leases. For example, if you deploy the Balanced composition, you should see lease holders for the auth, browse, cart, processing, and supporting flex packages. If one or more of these are missing, that likely points to that flex package type being unable to interact with the Camel Cluster Service.
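
One possible cause worth ruling out, offered here as an assumption rather than a confirmed diagnosis, is that the pod's service account is not allowed to manage Lease objects. Kubernetes controls this through RBAC on the leases resource in the coordination.k8s.io API group. A minimal sketch of a Role and RoleBinding granting that access follows. The names, namespace, and service account are hypothetical, and your Helm charts may already provision equivalent rules.

```yaml
# Hypothetical Role and RoleBinding; adjust names and namespace to your deployment.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: camel-lease-access       # hypothetical name
  namespace: default             # hypothetical namespace
rules:
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: camel-lease-access
  namespace: default
subjects:
  - kind: ServiceAccount
    name: browse                 # hypothetical service account used by the flex package pod
    namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: camel-lease-access
```

To check the effective permission for a given service account, you can run something like kubectl auth can-i create leases -n default --as=system:serviceaccount:default:browse, substituting your own namespace and service account.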

Debug Tip 3: Turn on more granular logging

To see what the Camel Cluster Service implementation is doing, you may find it useful to turn on additional logging. The following log levels may be helpful:

logging:
  level:
    io:
      fabric8: DEBUG
    org:
      apache:
        camel: DEBUG # you may also want to set this to TRACE for more visibility, but it gets noisy
    com:
      broadleafcommerce:
        common:
          messaging:
            notification: TRACE

Additional Tip: The Flex Package startup logs can also indicate a problem. The following screenshots show normal INFO-level logging (assuming you have Kubernetes as your backing cluster implementation):

A successful startup should have logs emitted that look like this:

A startup for a Flex Package that has a problem communicating with the Kubernetes Camel Cluster Service might look like this:

Debug Tip 4: Hook Up a Remote JVM Debugger

If possible, it may be useful to hook up a remote JVM debugger to the pod that is having Camel Cluster Service issues.

To hook up a remote debugger to a pod, the application must have the remote debug port enabled. The default Helm charts for Broadleaf’s flex packages have a property that can be set to enable this. For example, the blc-browse flex package chart has the following property in its values.yml file (shown here with debugging disabled; set enabled to true to open the port):

debug:
  enabled: false
  port: 9004
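
For context, enabling a remote debug port on a JVM generally means starting it with the JDWP agent. The fragment below is a generic sketch of how a container spec could do that through the JAVA_TOOL_OPTIONS environment variable. It is not taken from the Broadleaf charts, which may wire this up differently, and the container name is hypothetical.

```yaml
# Generic container fragment, not the Broadleaf chart's actual template.
containers:
  - name: browse                 # hypothetical container name
    env:
      - name: JAVA_TOOL_OPTIONS
        # server=y: the JVM listens for a debugger; suspend=n: do not block startup.
        # The "*:" host prefix (Java 9+) accepts connections from outside localhost.
        value: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=*:9004"
    ports:
      - containerPort: 9004      # matches debug.port above
```

If the problem occurs during startup itself, suspend=y makes the JVM wait for the debugger to attach before continuing.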

Next, port-forward the debug port to your local machine. You can do this via kubectl (for example, kubectl port-forward <pod-name> 9004:9004) or via a tool like Lens.

Once that is set up, you can start a remote JVM debug session in IntelliJ. You can set this up using a Debug configuration similar to the one shown here:

Debug Tip 5: Understanding the Kubernetes Camel Cluster Service Flow

It’s helpful to understand the flow of how the Kubernetes Camel Cluster Service works to set appropriate breakpoints.

The general call stack looks like this:

DeferServiceStartupListener -> BaseService -> AbstractCamelClusterService -> KubernetesClusterView

On initialization, DeferServiceStartupListener holds two service lists: earlyServices and services.

During execution of this class, the doStart method is called twice, once per list: first with the earlyServices list (in the method onCamelContextStarting) and then with the services list (in the method onCamelContextStarted).

The KubernetesClusterService is supposed to be loaded in the services list (not the earlyServices). If you notice that the KubernetesClusterService class is NOT present in the services list, then that indicates a Spring bean loading problem/conflict. You’ll want to check all instantiations of the CamelClusterService in your codebase for issues.

A successful breakpoint and step through of this flow would look something like this:

  1. DeferServiceStartupListener starts up, iterates through the list of earlyServices first, and starts them all up.

  2. DeferServiceStartupListener next iterates through the list of services and starts them all up (this list should include the KubernetesClusterService).

  3. The start() method of BaseService, which is a parent class of KubernetesClusterView, will be called.

  4. The start() method of BaseService, which is a parent class of KubernetesClusterService, will be called.

  5. The AbstractCamelClusterService#doStart() method will be called.

  6. The KubernetesClusterView#doStart() method will be called and will try to get a connection to the running Kubernetes Control Plane API.

Additional Resources and Context: