Next, let's implement the controller. This is the brain of the cluster. Helix makes sure there is exactly one active controller running the cluster.
Creating the controller's HelixManager requires the following parameters:
manager = HelixManagerFactory.getZKHelixManager(clusterName, instanceName, instanceType, zkConnectString);
The Controller needs to know about all changes in the cluster. Helix takes care of this with the default implementation. If you need additional functionality, see GenericHelixController on how to configure the pipeline.
manager = HelixManagerFactory.getZKHelixManager(clusterName, instanceName, InstanceType.CONTROLLER, zkConnectString);
manager.connect();
GenericHelixController controller = new GenericHelixController();
manager.addConfigChangeListener(controller);
manager.addLiveInstanceChangeListener(controller);
manager.addIdealStateChangeListener(controller);
manager.addExternalViewChangeListener(controller);
manager.addControllerListener(controller);
The snippet above shows how the controller is started. You can also start the controller from the command-line interface.
cd helix/helix-core/target/helix-core-pkg/bin
./run-helix-controller.sh --zkSvr <Zookeeper ServerAddress (Required)> --cluster <Cluster name (Required)>
Helix provides multiple options to deploy the controller.
The Controller can be started as a separate process to manage a cluster. This is the recommended approach. However, since a single controller is a single point of failure, multiple controller processes are required for reliability. Even if multiple controllers are running, only one actively manages the cluster at any time; it is chosen by a leader-election process. If the leader fails, another controller takes over as leader and manages the cluster.
Even though we recommend this method of deployment, it has the drawback that you must manage an additional service for each cluster. See the Controller As a Service option.
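The failover behavior described above can be pictured with a small toy model. This is only a sketch and not Helix's election code: the class name `ControllerFailoverSketch` and the rule "the controller that joined earliest is the leader" are assumptions made for illustration. It shows only the property the text states, namely that at most one controller is active at a time and another takes over when it fails.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Toy model (hypothetical, not part of Helix): tracks live controllers
// and treats the earliest joiner as the single active leader.
final class ControllerFailoverSketch {
    // Live controllers in join order; index 0 is the current leader.
    private final List<String> liveControllers = new ArrayList<>();

    // A controller process starts and joins the candidate list.
    void join(String controllerName) {
        if (!liveControllers.contains(controllerName)) {
            liveControllers.add(controllerName);
        }
    }

    // A controller process dies or disconnects.
    void fail(String controllerName) {
        liveControllers.remove(controllerName);
    }

    // At most one leader exists; empty if no controller is alive.
    Optional<String> leader() {
        return liveControllers.isEmpty()
                ? Optional.empty()
                : Optional.of(liveControllers.get(0));
    }

    public static void main(String[] args) {
        ControllerFailoverSketch cluster = new ControllerFailoverSketch();
        cluster.join("controller-1");
        cluster.join("controller-2");
        System.out.println("Leader: " + cluster.leader().orElse("none"));

        // The active leader fails; a standby takes over.
        cluster.fail("controller-1");
        System.out.println("Leader after failure: " + cluster.leader().orElse("none"));
    }
}
```

Running standby controllers costs little in this model because they stay idle until the leader disappears.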
If setting up a separate controller process is not viable, then it is possible to embed the controller as a library in each of the participants.
One of the cool features we added in Helix is the ability to use a set of controllers to manage a large number of clusters.
For example, if you have X clusters to manage, instead of deploying X*3 controllers (3 per cluster for fault tolerance), you can deploy just 3 controllers. Each controller then manages X/3 clusters. If any controller fails, the remaining two each manage X/2 clusters.
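To make the arithmetic concrete, here is a small worked calculation. It is a sketch under the assumption that clusters are spread evenly across live controllers; the class and method names are hypothetical and do not come from Helix, and the actual assignment Helix produces may differ.

```java
// Hypothetical helper (not a Helix API): compares a dedicated controller
// group per cluster with one shared pool of controllers.
final class ControllerPoolSketch {
    // Controllers needed when every cluster gets its own group.
    static int dedicatedControllers(int clusters, int controllersPerCluster) {
        return clusters * controllersPerCluster;
    }

    // Largest number of clusters one controller handles when the clusters
    // are split as evenly as possible (ceiling division).
    static int maxClustersPerController(int clusters, int liveControllers) {
        if (liveControllers <= 0) {
            throw new IllegalArgumentException("need at least one live controller");
        }
        return (clusters + liveControllers - 1) / liveControllers;
    }

    public static void main(String[] args) {
        int clusters = 12; // X in the text
        System.out.println("Dedicated controllers: " + dedicatedControllers(clusters, 3));
        System.out.println("Shared pool, 3 alive: " + maxClustersPerController(clusters, 3) + " clusters each");
        System.out.println("Shared pool, 2 alive: " + maxClustersPerController(clusters, 2) + " clusters each");
    }
}
```

With X = 12, the dedicated layout needs 36 controllers, while a shared pool of 3 handles 4 clusters each, rising to 6 each if one controller fails.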