Using geo-distance sort in Spring Data Elasticsearch 4

The release of Spring Data Elasticsearch in version 4.0 (see the documentation) brings two new features that now enable users to use geo-distance sorts in repository queries: The first is a new class GeoDistanceOrder and the second is a new return type for repository methods SearchHit<T>. In this post I will show how easy it is to use these classes to answer questions like “Which pubs are the nearest to a given location?”.

The source code

The complete runnable code used for this post is available on GitHub. In order to run the application you will need Java 8 or higher and a running instance of Elasticsearch. If this is not accessible at localhost:9200 you need to set the correct value in the src/main/resources/application.yaml file.

Update 12.09.2020

The original code was a little extended for the follow-up post Search entities within a geographic distance with Spring Data Elasticsearch 4

The sample data

For this sample application I use a csv file with POI data from OpenStreetMap that contains POIs in Germany which are categorized as kind of food, like restaurants, pubs, fast-food and more. All together there are 826843 records.

When the application is started, the index in Elasticsearch is created and loaded with the data if it does not yet exist. So the first startup takes a little longer, the progress can be seen on the console. Within the application, these POIs are modelled by the following entity:

@Document(indexName = "foodpois")
public class FoodPOI {
    @Id
    private String id;
    @Field(type = FieldType.Text)
    private String name;
    @Field(type = FieldType.Integer)
    private Integer category;
    private GeoPoint location;
    // constructors, getter/setter left out for brevity
}

The interesting properties for this blog post are the location and the name.

The Repository

In order to search the data we need a Repository Definition:

public interface FoodPOIRepository extends ElasticsearchRepository<FoodPOI, String> {
    List<SearchHit<FoodPOI>> searchTop3By(Sort sort);
    List<SearchHit<FoodPOI>> searchTop3ByName(String name, Sort sort);
}

We have two functions defined, the first we will use to search any POI near a given point, with the second on we can search for the POIs with a name. Defining these methods in the interface is all we need as Spring Data Elasticsearch will under the hood create the implementation for these methods by analyzing the method names and parameters.

In Spring Data Elasticsearch before version 4 we could only get a List<FoodPOI> from a repository method. But now there is the SearchHit<T> class, which not only contains the entity, but also other values like a score, highlights or – what we need here – the sort value. When doing a geo-distance sort, the sort value contains the actual distance of the POI to the value we passed into the search.

The Controller

We define a REST controller, so we can call our application to get the data. The request parameters will come in a POST body that will be mapped to the following class:

public class RequestData {
    private String name;
    private double lat;
    private double lon;
    // constructors, getter/setter ...
}

The result data that will be sent to the client looks like this:

public class ResultData {
    private String name;
    private GeoPoint location;
    private Double distance;

  // constructor, gette/setter ...
}

The controller has just one method:

@RestController
@RequestMapping("/foodpois")
public class FoodPOIController {

    private final FoodPOIRepository repository;

    public FoodPOIController(FoodPOIRepository repository) {
        this.repository = repository;
    }

    @PostMapping("/nearest3")
    List<ResultData> nearest3(@RequestBody RequestData requestData) {

        GeoPoint location = new GeoPoint(requestData.getLat(), requestData.getLon());
        Sort sort = Sort.by(new GeoDistanceOrder("location", location).withUnit("km"));

        List<SearchHit<FoodPOI>> searchHits;

        if (StringUtils.hasText(requestData.getName())) {
            searchHits = repository.searchTop3ByName(requestData.getName(), sort);
        } else {
            searchHits = repository.searchTop3By(sort);
        }

        return searchHits.stream()
            .map(searchHit -> {
                Double distance = (Double) searchHit.getSortValues().get(0);
                FoodPOI foodPOI = searchHit.getContent();
                return new ResultData(foodPOI.getName(), foodPOI.getLocation(), distance);
            }).collect(Collectors.toList());
    }
}

In line 15 we create a Sort object that specifies that Elasticsearch should return the data ordered by the geographical distance to the given value which we take from the request data. Then, depending if we have a name, we call the corresponding method and get back a List<SearchHit<FoodPOI>>.

We then in the lines 27 to 29 extract the information we need from the returned objects and build our result data object.

Check the result

After starting the application we can hit the endpoint. I use curl here and pipe the output through jq to have it formatted:

$curl -X "POST" "http://localhost:8080/foodpois/nearest3" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "lat": 49.02,
  "lon": 8.4
}'|jq

[
  {
    "name": "Cantina Majolika",
    "location": {
      "lat": 49.0190808,
      "lon": 8.4014792
    },
    "distance": 0.14860088197123017
  },
  {
    "name": "Waldgaststätte FSSV",
    "location": {
      "lat": 49.023578,
      "lon": 8.3954656
    },
    "distance": 0.5173117164589114
  },
  {
    "name": "Hatz",
    "location": {
      "lat": 49.0155358,
      "lon": 8.3975457
    },
    "distance": 0.5276800664204232
  }
]

And the Pubs?

curl -X "POST" "http://localhost:8080/foodpois/nearest3" \
     -H 'Content-Type: application/json; charset=utf-8' \
     -d $'{
  "lat": 49.02,
  "lon": 8.4,
  "name": "pub"
}'|jq
[
  {
    "name": "Scruffy's Irish Pub",
    "location": {
      "lat": 49.0116335,
      "lon": 8.3950194
    },
    "distance": 0.998711100164643
  },
  {
    "name": "Irish Pub “Sean O'Casey's”",
    "location": {
      "lat": 49.0090639,
      "lon": 8.4028365
    },
    "distance": 1.2335132790824628
  },
  {
    "name": "Oxford Pub",
    "location": {
      "lat": 49.0086149,
      "lon": 8.4129781
    },
    "distance": 1.5806674447458173
  }
]

And that’s it

Without even needing to know how these request are sent to Elasticsearch and what Elasticsearch sends back, we can easily use these features in our Spring application. Hope you enjoyed it!

 

 

 

6 thoughts on “Using geo-distance sort in Spring Data Elasticsearch 4

  1. Hi, I try to developp a simple app that index GeoJson data in ES. I’m using Spring Data Elastic search. Dealing with GeoPoints is ok. but in GeoJson the geometry attribute have this form:
    {
    “type”: “Feature”,
    “properties”: {},
    “geometry”: {
    “type”: “Point”,
    “coordinates”: [
    9.195556640625,
    34.8047829195724
    ]
    }
    }

    but in Spring data elastic search we can only use @GeoPointField . so my question is haw to create a property in my @document that spring data can index a Geo_shape in elastic search.
    this is my git with simple app https://github.com/kobra007/spring-wfs
    I hope that mu question is clear thank you.

      • Thx for the replay so I ness to use some concerter on that specific properti to let spring data do the serialisation as needed in geojson format ? I think if the concerter work correctly elastic search will index the transformed properti in geospacial forment as es support geo_shape. Thank you for help I will try to write the code and publish it to help how need it. I will let you know about it i hope more collaboration with you. Have a Nice day

        • This would be the fist step. You would need to define the queries using StringQuery or a NativeSearchQuery then to do your searches.

Leave a Reply

Your email address will not be published.