A couple of months ago I published the post Using geo-distance sort in Spring Data Elasticsearch 4. In the comments, the question came up: “What about searching within a distance?”
Well, this is not supported by query derivation from the method name, but it can easily be done with a custom repository implementation (see the documentation for more information about that).
I updated the example, which is available on GitHub, and will explain what is needed for this implementation. I won’t describe the entity and setup here; please check the original post for that.
The custom repository interface
First we need to define a new repository interface that defines the method we want to provide:
public interface FoodPOIRepositoryCustom {

    /**
     * search all {@link FoodPOI} that are within a given distance of a point
     *
     * @param geoPoint
     *     the center point
     * @param distance
     *     the distance
     * @param unit
     *     the distance unit
     * @return the found entities
     */
    List<SearchHit<FoodPOI>> searchWithin(GeoPoint geoPoint, Double distance, String unit);
}
The custom repository implementation
Next we need to provide an implementation. It is important that this class has the same name as the interface plus the suffix “Impl”:
public class FoodPOIRepositoryCustomImpl implements FoodPOIRepositoryCustom {

    private final ElasticsearchOperations operations;

    public FoodPOIRepositoryCustomImpl(ElasticsearchOperations operations) {
        this.operations = operations;
    }

    @Override
    public List<SearchHit<FoodPOI>> searchWithin(GeoPoint geoPoint, Double distance, String unit) {

        Query query = new CriteriaQuery(
            new Criteria("location").within(geoPoint, distance.toString() + unit)
        );

        // add a sort to get the actual distance back in the sort value
        Sort sort = Sort.by(new GeoDistanceOrder("location", geoPoint).withUnit(unit));
        query.addSort(sort);

        return operations.search(query, FoodPOI.class).getSearchHits();
    }
}
In this implementation we have an ElasticsearchOperations instance injected by Spring. In the method implementation we build a CriteriaQuery that specifies the distance query we want. In addition to that we add a GeoDistanceOrder sort to have the actual distance of the found entities in the output. We pass this query to the ElasticsearchOperations instance and return the search result.
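To make the idea behind this query concrete, here is a minimal plain-Java sketch of "keep everything within a radius, then sort by distance and return that distance with each hit". It does not use Elasticsearch or Spring at all. The names WithinDistanceSketch, Poi, Hit and haversineMeters are made up for this illustration, the earth radius is an approximation, and the distance calculation Elasticsearch performs internally may differ slightly from the haversine formula used here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Plain-Java illustration of "filter within a radius, then sort by distance". */
final class WithinDistanceSketch {

    // Mean earth radius in meters; an approximation chosen for this sketch.
    static final double EARTH_RADIUS_METERS = 6_371_000.0;

    /** A named point; a hypothetical stand-in for the FoodPOI entity. */
    static final class Poi {
        final String name;
        final double lat;
        final double lon;

        Poi(String name, double lat, double lon) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /** A point plus its computed distance, comparable to a search hit and its sort value. */
    static final class Hit {
        final Poi poi;
        final double distanceMeters;

        Hit(Poi poi, double distanceMeters) {
            this.poi = poi;
            this.distanceMeters = distanceMeters;
        }
    }

    /** Great-circle distance between two coordinates, using the haversine formula. */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
    }

    /** Keeps the points within the radius and returns them ordered by ascending distance. */
    static List<Hit> searchWithin(List<Poi> pois, double lat, double lon, double radiusMeters) {
        List<Hit> hits = new ArrayList<>();
        for (Poi poi : pois) {
            double distance = haversineMeters(lat, lon, poi.lat, poi.lon);
            if (distance <= radiusMeters) {
                hits.add(new Hit(poi, distance));
            }
        }
        hits.sort(Comparator.comparingDouble(hit -> hit.distanceMeters));
        return hits;
    }
}
```

In the real implementation both steps happen inside Elasticsearch: the criteria does the filtering and the sort delivers the distance as the sort value of each hit, so no distance has to be computed in the application.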
Adapt the repository
We need to add the new interface to our FoodPOIRepository definition, which otherwise is unchanged:
public interface FoodPOIRepository extends ElasticsearchRepository<FoodPOI, String>, FoodPOIRepositoryCustom {

    List<SearchHit<FoodPOI>> searchTop3By(Sort sort);

    List<SearchHit<FoodPOI>> searchTop3ByName(String name, Sort sort);
}
Use it in the controller
In the rest controller, there is a new method that uses the distance search:
@PostMapping("/within")
List<ResultData> withinDistance(@RequestBody RequestData requestData) {

    GeoPoint location = new GeoPoint(requestData.getLat(), requestData.getLon());

    List<SearchHit<FoodPOI>> searchHits
        = repository.searchWithin(location, requestData.distance, requestData.unit);

    return toResultData(searchHits);
}

private List<ResultData> toResultData(List<SearchHit<FoodPOI>> searchHits) {
    return searchHits.stream()
        .map(searchHit -> {
            // the first sort value is the distance produced by the geo-distance sort
            Double distance = (Double) searchHit.getSortValues().get(0);
            FoodPOI foodPOI = searchHit.getContent();
            return new ResultData(foodPOI.getName(), foodPOI.getLocation(), distance);
        }).collect(Collectors.toList());
}
We extract the needed parameters from the requestData that came in, call our repository method and convert the results to our output format.
And that’s it
So with a small custom repository implementation we were able to add the desired functionality to our repository.
Also (apart from the code above), I tried doing it the Spring way:
public List getWithin(final Short userId,
        final Double latitude, final Double longitude, final Double radius) {

    final String geoPointFieldName = getFieldNameFromAnnotation(LocEntity.class,
            GeoPointField.class);

    GeoPoint geoPoint = new GeoPoint(latitude, longitude);
    Criteria geoCriteria = new Criteria(geoPointFieldName)
            .within(geoPoint, radius.toString() + LOCKER_SEARCH_DISTANCE_UNIT);

    // I have a few more criteria to query on
    Criteria andCriteria = Criteria.where("userId").is(String.valueOf(userId))
            .and("isActive").greaterThan(0);

    // Added a pageable
    final Pageable pageable = Utils
            .getPageable(ApiConstants.DEFAULT_PAGE_NUMBER,
                    ApiConstants.DEFAULT_PAGE_SIZE);

    CriteriaQuery criteriaQuery = new CriteriaQuery(andCriteria, pageable);
    criteriaQuery.addCriteria(geoCriteria);

    Sort sort = Sort.by(new GeoDistanceOrder(geoPointFieldName, geoPoint)
            .withUnit("m"));
    criteriaQuery.addSort(sort);
It gives me the same results as my ES client code.
But what is the possible set of values for UNIT?
Can it be in metres?
I set both “mi” and “m”, which gave me the same results. Is this expected?
Ah! I see Spring is using “org.elasticsearch.common.unit.DistanceUnit”
Is that correct?
[So,
m/meters -> Metres
mi/miles -> Miles]
We are using org.elasticsearch.common.unit.DistanceUnit to parse the unit string, so the possible values are:

    INCH(0.0254, "in", "inch"),
    YARD(0.9144, "yd", "yards"),
    FEET(0.3048, "ft", "feet"),
    KILOMETERS(1000.0, "km", "kilometers"),
    NAUTICALMILES(1852.0, "NM", "nmi", "nauticalmiles"),
    MILLIMETERS(0.001, "mm", "millimeters"),
    CENTIMETERS(0.01, "cm", "centimeters"),

    // 'm' is a suffix of 'nmi' so it must follow 'nmi'
    MILES(1609.344, "mi", "miles"),

    // since 'm' is suffix of other unit
    // it must be the last entry of unit
    // names ending with 'm'. otherwise
    // parsing would fail
    METERS(1, "m", "meters");
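The comments in that enum hint at how a string like "2km" is resolved: the unit names are tried in declaration order and the first one that is a suffix of the string wins. The following is a simplified sketch of that idea and not the Elasticsearch source. The enum name UnitSketch and the methods unitOf and toMeters are hypothetical; only the unit names and factors are taken from the list above.

```java
/** Simplified sketch of suffix-based distance parsing; not the Elasticsearch source. */
enum UnitSketch {
    INCH(0.0254, "in", "inch"),
    YARD(0.9144, "yd", "yards"),
    FEET(0.3048, "ft", "feet"),
    KILOMETERS(1000.0, "km", "kilometers"),
    NAUTICALMILES(1852.0, "NM", "nmi", "nauticalmiles"),
    MILLIMETERS(0.001, "mm", "millimeters"),
    CENTIMETERS(0.01, "cm", "centimeters"),
    MILES(1609.344, "mi", "miles"),
    // "m" is a suffix of "km", "mm" and "cm", so METERS has to be tried last
    METERS(1.0, "m", "meters");

    final double metersPerUnit;
    final String[] names;

    UnitSketch(double metersPerUnit, String... names) {
        this.metersPerUnit = metersPerUnit;
        this.names = names;
    }

    /** Returns the first unit, in declaration order, that has a name the text ends with. */
    static UnitSketch unitOf(String text) {
        for (UnitSketch unit : values()) {
            for (String name : unit.names) {
                if (text.endsWith(name)) {
                    return unit;
                }
            }
        }
        throw new IllegalArgumentException("no known unit in: " + text);
    }

    /** Parses a string such as "2.5km" and returns the distance in meters. */
    static double toMeters(String text) {
        for (UnitSketch unit : values()) {
            for (String name : unit.names) {
                if (text.endsWith(name)) {
                    String number = text.substring(0, text.length() - name.length()).trim();
                    return Double.parseDouble(number) * unit.metersPerUnit;
                }
            }
        }
        throw new IllegalArgumentException("no known unit in: " + text);
    }
}
```

With this ordering, UnitSketch.unitOf("5km") resolves to KILOMETERS and UnitSketch.unitOf("10m") to METERS. Because every unit is a fixed multiple of a meter, changing the unit of a distance sort rescales the reported value but cannot change the order of the hits.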
Thank you for your response.
May I ask one question about the geo search within a distance using Criteria and the custom repository implementation:
How can we get all the data in the result set?
ES only returns 10 hits by default.
(I mean, in a repository method name we can use “findAllBy…”; how do we get the same for the implementation you have shown above?)
And I don’t mean paging/Pageable.
I am asking for an alternative to “findAll…”, where I don’t need to deal with pages (number, size, etc.).
You need to set the from and size values in your query, either to the maximum value of 10000, or you do a count before the data retrieval and set the exact value.
Yes, wouldn’t the count be a performance hit?
How about using:
final SearchHitsIterator iterator = elasticsearchOperations
        .searchForStream(criteriaQuery, ElasticsearchEntity.class);
If I iterate over the iterator, I can get ALL results. Right?
Is this approach better than getting the count from ES first and then querying?
For large result sets the streaming variant is better, because under the hood it will use multiple calls to get the data when the stream is processed.
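To illustrate what "multiple calls when the stream is processed" means, here is a minimal plain-Java sketch of an iterator that fetches its data in batches and only when the caller advances. It is not the Spring Data Elasticsearch implementation: the class BatchIteratorSketch, the fetchBatch function and the fetchCount field are hypothetical, and the sketch pages by offset for simplicity, whereas a real streaming search would typically keep a server-side cursor.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

/** Iterator that pulls results batch by batch, only when the caller advances. */
final class BatchIteratorSketch<T> implements Iterator<T> {

    // (offset, size) -> items; a hypothetical stand-in for one call to the backend
    private final BiFunction<Integer, Integer, List<T>> fetchBatch;
    private final int batchSize;

    private Iterator<T> current = Collections.emptyIterator();
    private int offset = 0;
    private boolean exhausted = false;

    /** Number of backend calls made so far; exposed only for the demonstration. */
    int fetchCount = 0;

    BatchIteratorSketch(BiFunction<Integer, Integer, List<T>> fetchBatch, int batchSize) {
        this.fetchBatch = fetchBatch;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        // Fetch the next batch only when the current one is used up.
        while (!current.hasNext() && !exhausted) {
            List<T> batch = fetchBatch.apply(offset, batchSize);
            fetchCount++;
            offset += batch.size();
            // A batch shorter than requested means there is nothing left to fetch.
            exhausted = batch.size() < batchSize;
            current = batch.iterator();
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

Iterating over 25 items with a batch size of 10 triggers three backend calls in this sketch, and none of them happens before the first call to hasNext(). The caller never holds more than one batch in memory and does not need a count up front.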
Got it. Thanks a lot for all the responses!
Thanks for writing a post on this, I’m glad I asked that question!
However, this is what I have written for a location search within a distance (I wrote it before Spring Data 4, so I used the ES clients):
final BoolQueryBuilder finalQuery = QueryBuilders.boolQuery();

final QueryBuilder geoDistanceQueryBuilder = QueryBuilders
        .geoDistanceQuery(geoPointField)
        .point(lat, lon)
        .distance(distance, DistanceUnit.METERS);
finalQuery.filter(geoDistanceQueryBuilder);

final SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(finalQuery);

// Assume I have a paged request
if (pageable != null) {
    final int fromIndex = pageable.getPageNumber() * pageable.getPageSize();
    searchSourceBuilder.from(fromIndex)
            .size(pageable.getPageSize());
}

SearchRequest searchRequest = new SearchRequest().indices(indexName);
return searchRequest
        .source(searchSourceBuilder.sort(SortBuilders.geoDistanceSort(geoPointField, lat, lon)
                .order(SortOrder.fromString("ASC"))
                .unit(DistanceUnit.METERS)));
Does this look okay?
I am in a dilemma whether to continue using this or to switch to Spring Data for this.