Use two different Elasticsearch clusters with Spring Data Elasticsearch
Motivation
In the Spring Data Elasticsearch issue tracker someone recently asked whether it is possible to access two different Elasticsearch clusters from an application that uses Spring Data Elasticsearch. The reason they needed this was that the application should perform a migration - I assume that data from one cluster should be transferred to a second cluster. In this blog post I will show how this can be set up. In this article I call the one cluster the default cluster and the other the secondary cluster.
Used versions and the sample code
The code was created using the Spring Initializr; the selected version of Spring Boot was 3.1.4. I added the “Spring Data Elasticsearch” (of course) and “web” dependencies; the latter is not strictly needed, but I always add it because I normally need it. The version of Spring Data Elasticsearch that comes with this setup is 5.1.4.
The code for this example is available at Codeberg.
Prerequisites
I assume that you are familiar with Spring Data Elasticsearch and with how to configure it by providing a configuration class that sets up the connection to the cluster. If not, check the reference documentation. I am not using any Spring Boot autoconfiguration features or values taken from the environment or from properties files.
What is needed
To achieve our goal we need to
- set up two different ElasticsearchOperations beans, one for each cluster
- provide a mechanism to create ElasticsearchRepository beans that use the designated ElasticsearchOperations
Providing the ElasticsearchOperations beans
The default connection
For the connection to the default cluster we use the normal way of defining a @Configuration annotated class that derives from org.springframework.data.elasticsearch.client.elc.ElasticsearchConfiguration. We need to implement the method clientConfiguration(); here I show the minimal version, which just sets the host and port where the default cluster can be accessed.
ClusterConfiguration.java:
@Configuration(proxyBeanMethods = false)
public class ClusterConfiguration extends ElasticsearchConfiguration {

    @Override
    public ClientConfiguration clientConfiguration() {
        return ClientConfiguration.create("es-primary:9200");
    }

    @Bean
    @Primary
    public ElasticsearchOperations elasticsearchOperations(ElasticsearchConverter elasticsearchConverter, ElasticsearchClient elasticsearchClient) {
        return super.elasticsearchOperations(elasticsearchConverter, elasticsearchClient);
    }
}
The second method to override is the one providing the default ElasticsearchOperations bean. We add no logic here and just call the base class implementation, but we add the @Primary annotation so that whenever a bean of this type is requested without a different qualifier, this one is used.
The secondary connection
For this one we create a second class derived from ElasticsearchConfiguration and specify the necessary host and port, just like for the default one. But this time we do not add the @Configuration annotation: if we did, the base class would create beans that conflict with the ones created from our default cluster configuration. We still need some of the logic from the base class, though.
SecondaryClusterConfiguration.java:
// NOTE: no @Configuration here!
public class SecondaryClusterConfiguration extends ElasticsearchConfiguration {

    @Override
    public ClientConfiguration clientConfiguration() {
        return ClientConfiguration.create("es-secondary:9200");
    }
}
With this class in place we add a new method to the configuration of the first cluster:
ClusterConfiguration.java:
@Configuration(proxyBeanMethods = false)
public class ClusterConfiguration extends ElasticsearchConfiguration {

    @Override
    public ClientConfiguration clientConfiguration() {
        return ClientConfiguration.create("es-primary:9200");
    }

    @Bean
    @Primary
    public ElasticsearchOperations elasticsearchOperations(ElasticsearchConverter elasticsearchConverter, ElasticsearchClient elasticsearchClient) {
        return super.elasticsearchOperations(elasticsearchConverter, elasticsearchClient);
    }

    @Bean
    @Qualifier("secondaryCluster")
    public ElasticsearchOperations secondaryCluster(ElasticsearchConverter elasticsearchConverter) {
        var elasticsearchConfiguration = new SecondaryClusterConfiguration();
        var clientConfiguration = elasticsearchConfiguration.clientConfiguration();
        var restClient = elasticsearchConfiguration.elasticsearchRestClient(clientConfiguration);
        var elasticsearchClient = elasticsearchClient(restClient);
        return elasticsearchConfiguration.elasticsearchOperations(elasticsearchConverter, elasticsearchClient);
    }
}
This new method provides the second ElasticsearchOperations bean, qualified by the name "secondaryCluster". With this we can already use both ElasticsearchOperations to access the different clusters. Just inject them like this:
@Autowired
private ElasticsearchOperations primaryClusterOperations;
@Autowired
@Qualifier("secondaryCluster")
private ElasticsearchOperations secondaryClusterOperations;
Each one will use its own connection to the corresponding cluster.
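Since the whole exercise was motivated by a migration, here is a minimal sketch of how the two beans could be used together to copy data from the default cluster to the secondary one. The MigrationService class is not part of the sample project; it works on the Data entity from that project, and a real migration would of course page or stream through the data instead of relying on a single match-all search:

import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.data.elasticsearch.core.ElasticsearchOperations;
import org.springframework.data.elasticsearch.core.SearchHits;
import org.springframework.data.elasticsearch.core.query.Query;
import org.springframework.stereotype.Service;

// hypothetical service, not part of the sample project
@Service
public class MigrationService {

    private final ElasticsearchOperations primaryClusterOperations;
    private final ElasticsearchOperations secondaryClusterOperations;

    public MigrationService(ElasticsearchOperations primaryClusterOperations,
            @Qualifier("secondaryCluster") ElasticsearchOperations secondaryClusterOperations) {
        this.primaryClusterOperations = primaryClusterOperations;
        this.secondaryClusterOperations = secondaryClusterOperations;
    }

    public void migrate() {
        // read the documents from the default cluster (a real migration would scroll or stream)
        SearchHits<Data> searchHits = primaryClusterOperations.search(Query.findAll(), Data.class);

        // and save them into the secondary cluster
        searchHits.forEach(searchHit -> secondaryClusterOperations.save(searchHit.getContent()));
    }
}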
The repositories
Default cluster repositories
The implementations for Spring Data repositories are automatically created on application startup and by default use the bean named "elasticsearchTemplate", which is the name that the org.springframework.data.elasticsearch.client.elc.ElasticsearchConfiguration class assigns to the created bean. So we do not need any additional configuration here.
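For illustration, here is a rough sketch of what the Data entity and the PrimaryRepository from the sample project might look like; the index name and the fields of Data are my assumptions, only the file names are taken from the project layout shown further down.

Data.java (sketch):

import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

// the index name and the fields are assumptions for illustration
@Document(indexName = "data")
public class Data {

    @Id
    private String id;

    @Field(type = FieldType.Text)
    private String text;

    // getters and setters omitted
}

PrimaryRepository.java (sketch):

import org.springframework.data.elasticsearch.repository.ElasticsearchRepository;

// bound to the default cluster, no additional configuration needed
public interface PrimaryRepository extends ElasticsearchRepository<Data, String> {
}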
Secondary cluster repositories
We now need a way to tell Spring Data Elasticsearch that some repositories should use a different ElasticsearchOperations bean.
To achieve this, we create a new package (I named it secondarycluster) and put the repository interfaces and one configuration class in it. The following shows the layout in the sample project:
.
├── BlogSdeMultipleClustersApplication.java
├── ClusterConfiguration.java
├── Data.java
├── PrimaryRepository.java
├── SecondaryClusterConfiguration.java
├── package-info.java
└── secondarycluster
├── SecondaryRepository.java
└── SecondaryRepositoryConfiguration.java
The repository itself is nothing special.
SecondaryRepository.java:
public interface SecondaryRepository extends ElasticsearchRepository<Data, String> {
}
The important file in this package is
SecondaryRepositoryConfiguration.java:
@Configuration
@EnableElasticsearchRepositories(elasticsearchTemplateRef = "secondaryCluster")
public class SecondaryRepositoryConfiguration {
}
This configuration enables the repository scanning in this package and its sub-packages and specifies the name of the ElasticsearchOperations bean to be used when instantiating these repository interfaces.
What’s left to do is to exclude this package from the normal default repository scan; we achieve this by adding the following to our ClusterConfiguration class:
@Configuration(proxyBeanMethods = false)
@EnableElasticsearchRepositories(excludeFilters = {
        @ComponentScan.Filter(
                type = FilterType.REGEX,
                pattern = "com\\.sothawo\\.blogsdemultipleclusters\\.secondarycluster\\..*"
        )
})
public class ClusterConfiguration extends ElasticsearchConfiguration {
    // code shown above
}
We enable the repository scan that uses the default ElasticsearchOperations, but we exclude the package that contains the repositories that should use the "secondaryCluster" bean.
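With both repository scans in place, repositories for either cluster can simply be injected side by side. As a counterpart to the ElasticsearchOperations based sketch above, the same copy could be done through the repositories; the following class is again only a hypothetical example, not part of the sample project:

import com.sothawo.blogsdemultipleclusters.secondarycluster.SecondaryRepository;
import org.springframework.stereotype.Service;

// hypothetical example: copy all documents using the two repositories
@Service
public class RepositoryMigration {

    private final PrimaryRepository primaryRepository;     // talks to the default cluster
    private final SecondaryRepository secondaryRepository; // talks to the secondary cluster

    public RepositoryMigration(PrimaryRepository primaryRepository, SecondaryRepository secondaryRepository) {
        this.primaryRepository = primaryRepository;
        this.secondaryRepository = secondaryRepository;
    }

    public void copyAll() {
        // findAll() reads from the default cluster, saveAll() writes to the secondary cluster
        secondaryRepository.saveAll(primaryRepository.findAll());
    }
}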
Summing it up
With just some configuration files and adaptations to the default setup we achieved our goal of accessing two different Elasticsearch clusters from one application. Check out the code from https://codeberg.org/sothawo/blog-sde-multiple-clusters; you can give feedback by mail or on Mastodon.