Basic Spatial Data with SQL Server and Entity Framework 5.0

Posted by Rick Strahl on West-Wind See other posts from West-Wind or by Rick Strahl
Published on Thu, 21 Jun 2012 14:10:00 GMT Indexed on 2012/06/21 9:16 UTC
Read the original article Hit count: 762

Filed under:
|
|

In my most recent project we needed to do a bit of geo-spatial referencing. While spatial features have been in SQL Server for a while using those features inside of .NET applications hasn't been as straight forward as could be, because .NET natively doesn't support spatial types. There are workarounds for this with a few custom project like SharpMap or a hack using the Sql Server specific Geo types found in the Microsoft.SqlTypes assembly that ships with SQL server.

While these approaches work for manipulating spatial data from .NET code, they didn't work with database access if you're using Entity Framework. Other ORM vendors have been rolling their own versions of spatial integration. In Entity Framework 5.0 running on .NET 4.5 the Microsoft ORM finally adds support for spatial types as well.

In this post I'll describe basic geography features that deal with single location and distance calculations which is probably the most common usage scenario.

SQL Server Transact-SQL Syntax for Spatial Data

Before we look at how things work with Entity framework, lets take a look at how SQL Server allows you to use spatial data to get an understanding of the underlying semantics. The following SQL examples should work with SQL 2008 and forward.

Let's start by creating a test table that includes a Geography field and also a pair of Long/Lat fields that demonstrate how you can work with the geography functions even if you don't have geography/geometry fields in the database. Here's the CREATE command:

CREATE TABLE [dbo].[Geo](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [Location] [geography] NULL,
    [Long] [float] NOT NULL,
    [Lat] [float] NOT NULL
)

Now using plain SQL you can insert data into the table using geography::STGeoFromText SQL CLR function:

insert into Geo( Location , long, lat ) values
               ( geography::STGeomFromText ('POINT(-121.527200 45.712113)', 4326), -121.527200, 45.712113 )
insert into Geo( Location , long, lat ) values 
               ( geography::STGeomFromText ('POINT(-121.517265 45.714240)', 4326), -121.517265, 45.714240 )
insert into Geo( Location , long, lat ) values 
               ( geography::STGeomFromText ('POINT(-121.511536 45.714825)', 4326), -121.511536, 45.714825)


The STGeomFromText function accepts a string that points to a geometric item (a point here but can also be a line or path or polygon and many others). You also need to provide an SRID (Spatial Reference System Identifier) which is an integer value that determines the rules for how geography/geometry values are calculated and returned. For mapping/distance functionality you typically want to use 4326 as this is the format used by most mapping software and geo-location libraries like Google and Bing.

The spatial data in the Location field is stored in binary format which looks something like this:

spatialresults

Once the location data is in the database you can query the data and do simple distance computations very easily. For example to calculate the distance of each of the values in the database to another spatial point is very easy to calculate.

Distance calculations compare two points in space using a direct line calculation. For our example I'll compare a new point to all the points in the database.

Using the Location field the SQL looks like this:

-- create a source point
DECLARE @s geography
SET @s = geography:: STGeomFromText('POINT(-121.527200 45.712113)' , 4326);


--- return the ids
select ID, Location as Geo , 
       Location .ToString() as Point ,
       @s.STDistance( Location) as distance
from Geo
order by distance

The code defines a new point which is the base point to compare each of the values to. You can also compare values from the database directly, but typically you'll want to match a location to another location and determine the difference for which you can use the geography::STDistance function.

This query produces the following output:

DistanceQuery

The STDistance function returns the straight line distance between the passed in point and the point in the database field. The result for SRID 4326 is always in meters. Notice that the first value passed was the same point so the difference is 0. The other two points are two points here in town in Hood River a little ways away - 808 and 1256 meters respectively.

Notice also that you can order the result by the resulting distance, which effectively gives you results that are ordered radially out from closer to further away. This is great for searches of points of interest near a central location (YOU typically!).

These geolocation functions are also available to you if you don't use the Geography/Geometry types, but plain float values. It's a little more work, as each point has to be created in the query using the string syntax, but the following code doesn't use a geography field but produces the same result as the previous query.

--- using float fields
select ID,
    geography::STGeomFromText ('POINT(' + STR (long, 15,7 ) + ' ' + Str(lat ,15, 7) + ')' , 4326),
    geography::STGeomFromText ('POINT(' + STR (long, 15,7 ) + ' ' + Str(lat ,15, 7) + ')' , 4326). ToString(),        
    @s.STDistance( geography::STGeomFromText ('POINT(' + STR(long ,15, 7) + ' ' + Str(lat ,15, 7) + ')' , 4326)) as distance          
from geo
order by distance

Spatial Data in the Entity Framework

Prior to Entity Framework 5.0 on .NET 4.5 consuming of the data above required using stored procedures or raw SQL commands to access the spatial data. In Entity Framework 5 however, Microsoft introduced the new DbGeometry and DbGeography types. These immutable location types provide a bunch of functionality for manipulating spatial points using geometry functions which in turn can be used to do common spatial queries like I described in the SQL syntax above.

The DbGeography/DbGeometry types are immutable, meaning that you can't write to them once they've been created. They are a bit odd in that you need to use factory methods in order to instantiate them - they have no constructor() and you can't assign to properties like Latitude and Longitude.

Creating a Model with Spatial Data

Let's start by creating a simple Entity Framework model that includes a Location property of type DbGeography:

    public class GeoLocationContext : DbContext
    {
        public DbSet<GeoLocation> Locations { get; set; }
    }

    public class GeoLocation
    {
        public int Id { get; set; }
        public DbGeography Location { get; set; }
        public string Address { get; set; }
    }

That's all there's to it. When you run this now against SQL Server, you get a Geography field for the Location property, which looks the same as the Location field in the SQL examples earlier.

Adding Spatial Data to the Database

Next let's add some data to the table that includes some latitude and longitude data. An easy way to find lat/long locations is to use Google Maps to pinpoint your location, then right click and click on What's Here. Click on the green marker to get the GPS coordinates.

To add the actual geolocation data create an instance of the GeoLocation type and use the DbGeography.PointFromText() factory method to create a new point to assign to the Location property:

[TestMethod]
public void AddLocationsToDataBase()
{
    var context = new GeoLocationContext();

    // remove all
    context.Locations.ToList().ForEach( loc => context.Locations.Remove(loc));
    context.SaveChanges();

    var location = new GeoLocation()
    {
        // Create a point using native DbGeography Factory method
        Location = DbGeography.PointFromText(
                    string.Format("POINT({0} {1})", -121.527200,45.712113)
                    ,4326),
        Address = "301 15th Street, Hood River"
    };
    context.Locations.Add(location);

    location = new GeoLocation()
    {
        Location = CreatePoint(45.714240, -121.517265),
        Address = "The Hatchery, Bingen"
    };
    context.Locations.Add(location);

    location = new GeoLocation()
    {
        // Create a point using a helper function (lat/long)
        Location = CreatePoint(45.708457, -121.514432),
        Address = "Kaze Sushi, Hood River"
    };
    context.Locations.Add(location);

    location = new GeoLocation()
    {
        Location = CreatePoint(45.722780, -120.209227),
        Address = "Arlington, OR"
    };
    context.Locations.Add(location);

    context.SaveChanges();
}

As promised, a DbGeography object has to be created with one of the static factory methods provided on the type as the Location.Longitude and Location.Latitude properties are read only. Here I'm using PointFromText() which uses a "Well Known Text" format to specify spatial data. In the first example I'm specifying to create a Point from a longitude and latitude value, using an SRID of 4326 (just like earlier in the SQL examples).

You'll probably want to create a helper method to make the creation of Points easier to avoid that string format and instead just pass in a couple of double values. Here's my helper called CreatePoint that's used for all but the first point creation in the sample above:

public static DbGeography CreatePoint(double latitude, double longitude)
{
    var text = string.Format(CultureInfo.InvariantCulture.NumberFormat,
                             "POINT({0} {1})", longitude, latitude);
    // 4326 is most common coordinate system used by GPS/Maps
    return DbGeography.PointFromText(text, 4326);
}

Using the helper the syntax becomes a bit cleaner, requiring only a latitude and longitude respectively. Note that my method intentionally swaps the parameters around because Latitude and Longitude is the common format I've seen with mapping libraries (especially Google Mapping/Geolocation APIs with their LatLng type).

When the context is changed the data is written into the database using the SQL Geography type which looks the same as in the earlier SQL examples shown.

Querying

Once you have some location data in the database it's now super easy to query the data and find out the distance between locations. A common query is to ask for a number of locations that are near a fixed point - typically your current location and order it by distance.

Using LINQ to Entities a query like this is easy to construct:

[TestMethod]
public void QueryLocationsTest()
{
    var sourcePoint = CreatePoint(45.712113, -121.527200);

    var context = new GeoLocationContext();

    // find any locations within 5 kilometers ordered by distance
    var matches = 
        context.Locations
                .Where(loc => loc.Location.Distance(sourcePoint) < 5000)
                .OrderBy( loc=> loc.Location.Distance(sourcePoint) )
                .Select( loc=> new { Address = loc.Address, Distance = loc.Location.Distance(sourcePoint) });

    Assert.IsTrue(matches.Count() > 0);

    foreach (var location in matches)
    {
        Console.WriteLine("{0} ({1:n0} meters)", location.Address, location.Distance);
    }
}

This example produces:

301 15th Street, Hood River (0 meters)
The Hatchery, Bingen (809 meters)
Kaze Sushi, Hood River (1,074 meters)

 

The first point in the database is the same as my source point I'm comparing against so the distance is 0. The other two are within the 5 mile radius, while the Arlington location which is 65 miles or so out is not returned. The result is ordered by distance from closest to furthest away.

In the code, I first create a source point that is the basis for comparison. The LINQ query then selects all locations that are within 5km of the source point using the Location.Distance() function, which takes a source point as a parameter. You can either use a pre-defined value as I'm doing here, or compare against another database DbGeography property (say when you have to points in the same database for things like routes).

What's nice about this query syntax is that it's very clean and easy to read and understand. You can calculate the distance and also easily order by the distance to provide a result that shows locations from closest to furthest away which is a common scenario for any application that places a user in the context of several locations. It's now super easy to accomplish this.

Meters vs. Miles

As with the SQL Server functions, the Distance() method returns data in meters, so if you need to work with miles or feet you need to do some conversion. Here are a couple of helpers that might be useful (can be found in GeoUtils.cs of the sample project):

/// <summary>
/// Convert meters to miles
/// </summary>
/// <param name="meters"></param>
/// <returns></returns>
public static double MetersToMiles(double? meters)
{
    if (meters == null)
        return 0F;

    return meters.Value * 0.000621371192;
}

/// <summary>
/// Convert miles to meters
/// </summary>
/// <param name="miles"></param>
/// <returns></returns>
public static double MilesToMeters(double? miles)
{
    if (miles == null)
        return 0;

    return miles.Value * 1609.344;
}

Using these two helpers you can query on miles like this:

[TestMethod]
public void QueryLocationsMilesTest()
{
    var sourcePoint = CreatePoint(45.712113, -121.527200);

    var context = new GeoLocationContext();

    // find any locations within 5 miles ordered by distance
    var fiveMiles = GeoUtils.MilesToMeters(5);

    var matches =
        context.Locations
                .Where(loc => loc.Location.Distance(sourcePoint) <= fiveMiles)
                .OrderBy(loc => loc.Location.Distance(sourcePoint))
                .Select(loc => new { Address = loc.Address, Distance = loc.Location.Distance(sourcePoint) });

    Assert.IsTrue(matches.Count() > 0);

    foreach (var location in matches)
    {
        Console.WriteLine("{0} ({1:n1} miles)", location.Address, GeoUtils.MetersToMiles(location.Distance));
    }
}

which produces:

301 15th Street, Hood River (0.0 miles)
The Hatchery, Bingen (0.5 miles)
Kaze Sushi, Hood River (0.7 miles)

Nice 'n simple.

.NET 4.5 Only

Note that DbGeography and DbGeometry are exclusive to Entity Framework 5.0 (not 4.4 which ships in the same NuGet package or installer) and requires .NET 4.5. That's because the new DbGeometry and DbGeography (and related) types are defined in the 4.5 version of System.Data.Entity which is a CLR assembly and is only updated by major versions of .NET. Why this decision was made to add these types to System.Data.Entity rather than to the frequently updated EntityFramework assembly that would have possibly made this work in .NET 4.0 is beyond me, especially given that there are no native .NET framework spatial types to begin with.

I find it also odd that there is no native CLR spatial type. The DbGeography and DbGeometry types are specific to Entity Framework and live on those assemblies. They will also work for general purpose, non-database spatial data manipulation, but then you are forced into having a dependency on System.Data.Entity, which seems a bit silly. There's also a System.Spatial assembly that's apparently part of WCF Data Services which in turn don't work with Entity framework. Another example of multiple teams at Microsoft not communicating and implementing the same functionality (differently) in several different places. Perplexed as a I may be, for EF specific code the Entity framework specific types are easy to use and work well.

Working with pre-.NET 4.5 Entity Framework and Spatial Data

If you can't go to .NET 4.5 just yet you can also still use spatial features in Entity Framework, but it's a lot more work as you can't use the DbContext directly to manipulate the location data. You can still run raw SQL statements to write data into the database and retrieve results using the same TSQL syntax I showed earlier using Context.Database.ExecuteSqlCommand().

Here's code that you can use to add location data into the database:

[TestMethod]
public void RawSqlEfAddTest()
{
    string sqlFormat = 
    @"insert into GeoLocations( Location, Address) values
            ( geography::STGeomFromText('POINT({0} {1})', 4326),@p0 )";

    var sql = string.Format(sqlFormat,-121.527200, 45.712113);

    Console.WriteLine(sql);

    var context = new GeoLocationContext();

    Assert.IsTrue(context.Database.ExecuteSqlCommand(sql,"301 N. 15th Street") > 0);
}

Here I'm using the STGeomFromText() function to add the location data.

Note that I'm using string.Format here, which usually would be a bad practice but is required here. I was unable to use ExecuteSqlCommand() and its named parameter syntax as the longitude and latitude parameters are embedded into a string. Rest assured it's required as the following does not work:

string sqlFormat = 
    @"insert into GeoLocations( Location, Address) values
            ( geography::STGeomFromText('POINT(@p0 @p1)', 4326),@p2 )";
context.Database.ExecuteSqlCommand(sql, -121.527200, 45.712113, "301 N. 15th Street")

Explicitly assigning the point value with string.format works however.

There are a number of ways to query location data. You can't get the location data directly, but you can retrieve the point string (which can then be parsed to get Latitude and Longitude) and you can return calculated values like distance.

Here's an example of how to retrieve some geo data into a resultset using EF's and SqlQuery method:

[TestMethod]
public void RawSqlEfQueryTest()
{
    var sqlFormat =
    @"
    DECLARE @s geography
    SET @s = geography:: STGeomFromText('POINT({0} {1})' , 4326);

    SELECT     
        Address,        
        Location.ToString() as GeoString,        
        @s.STDistance( Location) as Distance        
    FROM GeoLocations
    ORDER BY Distance";

    var sql = string.Format(sqlFormat, -121.527200, 45.712113);    

    var context = new GeoLocationContext();
    var locations = context.Database.SqlQuery<ResultData>(sql);

    Assert.IsTrue(locations.Count() > 0);

    foreach (var location in locations)
    {
        Console.WriteLine(location.Address + " " + location.GeoString + " " + location.Distance);
    }
}

public class ResultData
{
    public string GeoString { get; set; }
    public double Distance { get; set; }
    public string Address { get; set; }
}

Hopefully you don't have to resort to this approach as it's fairly limited. Using the new DbGeography/DbGeometry types makes this sort of thing so much easier. When I had to use code like this before I typically ended up retrieving data pks only and then running another query with just the PKs to retrieve the actual underlying DbContext entities. This was very inefficient and tedious but it did work.

Summary

For the current project I'm working on we actually made the switch to .NET 4.5 purely for the spatial features in EF 5.0. This app heavily relies on spatial queries and it was worth taking a chance with pre-release code to get this ease of integration as opposed to manually falling back to stored procedures or raw SQL string queries to return spatial specific queries. Using native Entity Framework code makes life a lot easier than the alternatives. It might be a late addition to Entity Framework, but it sure makes location calculations and storage easy. Where do you want to go today? ;-)

Resources

© Rick Strahl, West Wind Technologies, 2005-2012
Posted in ADO.NET  Sql Server  .NET  

© West-Wind or respective owner

Related posts about ADO.NET

Related posts about SQL Server