Search entities within a geographic distance with Spring Data Elasticsearch 4

A couple of months ago I published the post Using geo-distance sort in Spring Data Elasticsearch 4. In the comments, the question came up: “What about searching within a distance?”

Well, this is not supported by query derivation from the method name, but it can easily be done with a custom repository implementation (see the documentation for more information about that).

I updated the example, which is available on GitHub, and will explain what is needed for this implementation. I won’t describe the entity and setup; please check the original post for that.

The custom repository interface

First we need to define a new repository interface that defines the method we want to provide:

public interface FoodPOIRepositoryCustom {

    /**
     * search all {@link FoodPOI} that are within a given distance of a point
     *
     * @param geoPoint
     *     the center point
     * @param distance
     *     the distance
     * @param unit
     *     the distance unit
     * @return the found entities
     */
    List<SearchHit<FoodPOI>> searchWithin(GeoPoint geoPoint, Double distance, String unit);
}

The custom repository implementation

Next we need to provide an implementation. It is important that this class is named like the interface with the suffix “Impl”:

public class FoodPOIRepositoryCustomImpl implements FoodPOIRepositoryCustom {

    private final ElasticsearchOperations operations;

    public FoodPOIRepositoryCustomImpl(ElasticsearchOperations operations) {
        this.operations = operations;
    }

    @Override
    public List<SearchHit<FoodPOI>> searchWithin(GeoPoint geoPoint, Double distance, String unit) {

        Query query = new CriteriaQuery(
          new Criteria("location").within(geoPoint, distance.toString() + unit)
        );

        // add a sort to get the actual distance back in the sort value
        Sort sort = Sort.by(new GeoDistanceOrder("location", geoPoint).withUnit(unit));
        query.addSort(sort);

        return operations.search(query, FoodPOI.class).getSearchHits();
    }
}

In this implementation we have an ElasticsearchOperations instance injected by Spring. In the method implementation we build a CriteriaQuery whose Criteria specifies the distance condition we want. In addition to that, we add a sort with a GeoDistanceOrder so that the actual distance of each found entity is returned as its sort value, in the requested unit. We pass this query to the ElasticsearchOperations instance and return the search hits.
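To make concrete what the within criterion and the distance sort compute, here is a minimal plain-Java sketch that does the same filtering and ordering in memory. It is not how Elasticsearch or Spring Data implement it: the class WithinDistanceSketch, the Poi type and the earth radius constant are assumptions made for illustration, and the haversine formula is used as an approximation of the great-circle distance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class WithinDistanceSketch {

    // Mean earth radius in meters (an assumption of this sketch).
    static final double EARTH_RADIUS_METERS = 6_371_000.0;

    /** Hypothetical stand-in for an entity with a name and a location. */
    static final class Poi {
        final String name;
        final double lat;
        final double lon;

        Poi(String name, double lat, double lon) {
            this.name = name;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /** Great-circle distance in meters between two points, using the haversine formula. */
    static double haversineMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
            + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
            * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
    }

    /** Keeps the points within maxMeters of the center, nearest first. */
    static List<Poi> within(List<Poi> pois, double lat, double lon, double maxMeters) {
        List<Poi> result = new ArrayList<>();
        for (Poi poi : pois) {
            if (haversineMeters(lat, lon, poi.lat, poi.lon) <= maxMeters) {
                result.add(poi);
            }
        }
        result.sort(Comparator.comparingDouble(
            poi -> haversineMeters(lat, lon, poi.lat, poi.lon)));
        return result;
    }

    public static void main(String[] args) {
        List<Poi> pois = Arrays.asList(
            new Poi("A", 0.0, 0.5),
            new Poi("B", 0.0, 2.0),
            new Poi("C", 0.0, 0.1));
        for (Poi poi : within(pois, 0.0, 0.0, 100_000.0)) {
            System.out.println(poi.name + " " + haversineMeters(0.0, 0.0, poi.lat, poi.lon));
        }
    }
}
```

In the real implementation both steps run inside Elasticsearch, so only the matching documents and their sort values travel back to the application.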

Adapt the repository

We need to add the new interface to our FoodPOIRepository definition, which otherwise is unchanged:

public interface FoodPOIRepository extends ElasticsearchRepository<FoodPOI, String>, FoodPOIRepositoryCustom {

    List<SearchHit<FoodPOI>> searchTop3By(Sort sort);

    List<SearchHit<FoodPOI>> searchTop3ByName(String name, Sort sort);
}

Use it in the controller

In the rest controller, there is a new method that uses the distance search:

@PostMapping("/within")
List<ResultData> withinDistance(@RequestBody RequestData requestData) {

    GeoPoint location = new GeoPoint(requestData.getLat(), requestData.getLon());

    List<SearchHit<FoodPOI>> searchHits
        = repository.searchWithin(location, requestData.distance, requestData.unit);

    return toResultData(searchHits);
}

private List<ResultData> toResultData(List<SearchHit<FoodPOI>> searchHits) {
    return searchHits.stream()
        .map(searchHit -> {
            Double distance = (Double) searchHit.getSortValues().get(0);
            FoodPOI foodPOI = searchHit.getContent();
            return new ResultData(foodPOI.getName(), foodPOI.getLocation(), distance);
        }).collect(Collectors.toList());
}

We extract the needed parameters from the requestData that came in, call our repository method and convert the results to our output format.
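The unit from the request is passed on as a plain string such as "km" or "mi", and the repository appends it to the distance (for example "5.0km"). As discussed in the comments below, the unit string is parsed with org.elasticsearch.common.unit.DistanceUnit. The following self-contained sketch mimics suffix-based parsing of such a value, using the unit names and factors quoted there, and shows why the order of the entries matters. DistanceUnitSketch and toMeters are hypothetical names, and this is not the Elasticsearch implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DistanceUnitSketch {

    // Unit name -> meters per unit. Insertion order is the lookup order:
    // "m" is a suffix of "km", "mm", "cm" and "nmi", so it has to come last.
    private static final Map<String, Double> METERS_PER_UNIT = new LinkedHashMap<>();

    private static void register(double meters, String... names) {
        for (String name : names) {
            METERS_PER_UNIT.put(name, meters);
        }
    }

    static {
        register(0.0254, "in", "inch");
        register(0.9144, "yd", "yards");
        register(0.3048, "ft", "feet");
        register(1000.0, "km", "kilometers");
        register(1852.0, "NM", "nmi", "nauticalmiles");
        register(0.001, "mm", "millimeters");
        register(0.01, "cm", "centimeters");
        register(1609.344, "mi", "miles");   // must follow "nmi"
        register(1.0, "m", "meters");        // must be the last name ending in "m"
    }

    /** Converts a value like "5.0km" to meters; rejects strings without a known unit. */
    static double toMeters(String distance) {
        String value = distance.trim();
        for (Map.Entry<String, Double> unit : METERS_PER_UNIT.entrySet()) {
            if (value.endsWith(unit.getKey())) {
                String number = value.substring(0, value.length() - unit.getKey().length());
                return Double.parseDouble(number.trim()) * unit.getValue();
            }
        }
        throw new IllegalArgumentException("no known distance unit in: " + distance);
    }

    public static void main(String[] args) {
        // Double.toString(5.0) is "5.0", so the repository would send "5.0km".
        System.out.println(toMeters(Double.valueOf(5.0).toString() + "km"));
    }
}
```

If "m" were registered before "km", the value "5.0km" would match "m" first and the remaining "5.0k" would fail to parse as a number.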

And that’s it

So with a small custom repository implementation we were able to add the desired functionality to our repository.

10 thoughts on “Search entities within a geographic distance with Spring Data Elasticsearch 4”

  1. Thanks for writing a post on this, I’m glad I asked that question!
    However, this is what I have written for a location search within a distance (I wrote it before Spring Data Elasticsearch 4, so I used the ES client):

    final BoolQueryBuilder finalQuery = QueryBuilders.boolQuery();

    final QueryBuilder geoDistanceQueryBuilder = QueryBuilders
    .geoDistanceQuery(geoPointField)
    .point(lat, lon)
    .distance(distance, DistanceUnit.METERS);

    finalQuery.filter(geoDistanceQueryBuilder);

    final SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(finalQuery);
    //Assume I have paged request
    if (pageable != null) {
    final int fromIndex = pageable.getPageNumber() * pageable.getPageSize();
    searchSourceBuilder.from(fromIndex)
    .size(pageable.getPageSize());
    }

    SearchRequest searchRequest = new SearchRequest().indices(indexName);
    return searchRequest
    .source(searchSourceBuilder.sort(SortBuilders.geoDistanceSort(geoPointField, lat, lon)
    .order(SortOrder.fromString("ASC"))
    .unit(DistanceUnit.METERS)));

    Does this look okay?
    I am in a dilemma whether to continue using this or to switch to Spring Data for it.

  2. Also (apart from above code),

    I tried using Spring’s way:

    public List getWithin(final Short userId,
    final Double latitude, final Double longitude, final Double radius) {

    final String geoPointFieldName = getFieldNameFromAnnotation(LocEntity.class,
    GeoPointField.class);

    GeoPoint geoPoint = new GeoPoint(latitude, longitude);
    Criteria geoCriteria = new Criteria(geoPointFieldName)
    .within(geoPoint, radius.toString() + LOCKER_SEARCH_DISTANCE_UNIT);

    // I have a few more criteria to query on
    Criteria andCriteria = Criteria.where("userId").is(String.valueOf(userId))
    .and("isActive").greaterThan(0);

    // Added a pageable
    final Pageable pageable = Utils
    .getPageable(ApiConstants.DEFAULT_PAGE_NUMBER,
    ApiConstants.DEFAULT_PAGE_SIZE);

    CriteriaQuery criteriaQuery = new CriteriaQuery(andCriteria, pageable);
    criteriaQuery.addCriteria(geoCriteria);
    Sort sort = Sort.by(new GeoDistanceOrder(geoPointFieldName, geoPoint)
    .withUnit("m"));
    criteriaQuery.addSort(sort);

    It gives me the same results as my ES client code.
    But what is the possible set of values for the unit?
    Can it be in metres?
    I had set “mi” and “m”, which gave me the same results. Is this expected?

    • Ah! I see Spring is using “org.elasticsearch.common.unit.DistanceUnit”

      Is that correct?
      [So,
      m/meters -> Metres
      mi/miles -> Miles]

      • we are using org.elasticsearch.common.unit.DistanceUnit to parse the unit string, so the possible values are
        INCH(0.0254, "in", "inch"),
        YARD(0.9144, "yd", "yards"),
        FEET(0.3048, "ft", "feet"),
        KILOMETERS(1000.0, "km", "kilometers"),
        NAUTICALMILES(1852.0, "NM", "nmi", "nauticalmiles"),
        MILLIMETERS(0.001, "mm", "millimeters"),
        CENTIMETERS(0.01, "cm", "centimeters"),

        // 'm' is a suffix of 'nmi' so it must follow 'nmi'
        MILES(1609.344, "mi", "miles"),

        // since 'm' is suffix of other unit
        // it must be the last entry of unit
        // names ending with 'm'. otherwise
        // parsing would fail
        METERS(1, "m", "meters");

        • Thank you for your response.
          May I ask one question on the Geo search within using Criteria and custom repository implementation:

          How can we get all data in the result set?
          As ES only return 10 hits by default.

          (I mean, in a repository method name we can have “findAllBy…”; how do we get the same for the implementation you have shown above?)

          • And I don’t mean Paging/pageable.
            I am asking for an alternative to “findAll…”, where I don’t need to check for pages (number, size, etc)

            • You need to set the from and size values in your query, either to the maximum value of 10000, or you do a count before the data retrieval and set the exact value.

              • Yes, wouldn’t the count be a performance hit?
                How about using:

                final SearchHitsIterator iterator = elasticsearchOperations
                .searchForStream(criteriaQuery, ElasticsearchEntity.class);

                If I iterate over the iterator, I can get ALL results. Right?

                Is this approach better than getting the count from ES first and then querying?
