Batch Geocoding and Batch Reverse-Geocoding with Bing Maps

Introduction

Bing Maps provides geocoding and reverse geocoding in the AJAX Control as well as in the SOAP and REST web services. But what if you are about to start a project and have to geocode thousands of addresses, or you need to batch-process data updates as a recurring task? Of course you could just call the geocoder again and again, but that is not a very efficient approach. With our June release we also launched a new batch geocoder and batch reverse geocoder as part of the Bing Spatial Data API to address just these scenarios. Chris Pendleton briefly touched on it in his blog post here.

Today I would like to go into a bit more detail and build a little application that uses the Bing Spatial Data Services. During this walkthrough we will follow the process pictured below.

[Figure 01: process overview]

As a prerequisite you will need a Bing Maps Key which you can create yourself at the Bing Maps Portal.

Format Data

Your data can be in either XML or text files. In text files you can separate values with a comma, a tab or a pipe (|). The data can be:

  • latitudes and longitudes which would be reverse geocoded
  • query-strings such as place-names, postcodes or unformatted addresses
  • formatted addresses with separate attributes for each address-part

You will find a full description of the data schema here and some sample data here. An interesting aspect of the service is that we can mix different types of information in one file. The sample data set below contains, for example, formatted addresses, well-known places, UK postcodes, a latitude and longitude for reverse geocoding, and an empty entry which I intentionally put in there to demonstrate what happens if a record cannot be resolved.

<GeocodeFeed>
  <GeocodeEntity Id="1" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="de-DE">
      <Address AddressLine="Konrad-Zuse-Str. 1" Locality="Unterschleißheim" PostalCode="85716" />
    </GeocodeRequest>
  </GeocodeEntity>
  <GeocodeEntity Id="4" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="Tower of London">
    </GeocodeRequest>
  </GeocodeEntity>
  <GeocodeEntity Id="5" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="Angel of the North">
    </GeocodeRequest>
  </GeocodeEntity>
  <GeocodeEntity Id="6" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="RG6 1WG">
    </GeocodeRequest>
  </GeocodeEntity>
  <GeocodeEntity Id="7" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <ReverseGeocodeRequest Culture="fr-FR">
      <Location Longitude="2.265087118043766" Latitude="48.83431718199653"/>
    </ReverseGeocodeRequest>
  </GeocodeEntity>
  <GeocodeEntity Id="8" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-US" Query="">
      <Address AddressLine="" AdminDistrict="" />
    </GeocodeRequest>
  </GeocodeEntity>
</GeocodeFeed>
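If your records live in a database or a spreadsheet you will probably want to generate such a feed rather than type it. The following is a minimal sketch in Python, not part of the SDK or of the sample application in this post. The record layout (dictionaries with `id`, `culture` and one of `query`, `address` or `location`) and the function name `build_geocode_feed` are assumptions made for illustration; the element names, attribute names and namespace are taken from the sample above.

```python
import xml.etree.ElementTree as ET

# Namespace copied from the sample feed above.
GEOCODE_NS = "http://schemas.microsoft.com/search/local/2010/5/geocode"


def build_geocode_feed(records):
    """Build a GeocodeFeed XML string from plain dicts (hypothetical layout)."""
    feed = ET.Element("GeocodeFeed")
    for rec in records:
        # Writing xmlns as a plain attribute reproduces the per-entity
        # namespace declaration used in the sample feed.
        entity = ET.SubElement(
            feed, "GeocodeEntity", {"Id": str(rec["id"]), "xmlns": GEOCODE_NS}
        )
        if "location" in rec:
            # Latitude/longitude pairs become reverse-geocode requests.
            lat, lon = rec["location"]
            req = ET.SubElement(
                entity, "ReverseGeocodeRequest", {"Culture": rec["culture"]}
            )
            ET.SubElement(req, "Location", {"Longitude": lon, "Latitude": lat})
        else:
            attrs = {"Culture": rec["culture"]}
            if "query" in rec:
                attrs["Query"] = rec["query"]
            req = ET.SubElement(entity, "GeocodeRequest", attrs)
            if "address" in rec:
                # Address parts are passed through as XML attributes.
                ET.SubElement(req, "Address", rec["address"])
    return ET.tostring(feed, encoding="unicode")


sample_records = [
    {"id": 1, "culture": "de-DE",
     "address": {"AddressLine": "Konrad-Zuse-Str. 1",
                 "Locality": "Unterschleißheim",
                 "PostalCode": "85716"}},
    {"id": 4, "culture": "en-GB", "query": "Tower of London"},
    {"id": 7, "culture": "fr-FR",
     "location": ("48.83431718199653", "2.265087118043766")},
]
feed_xml = build_geocode_feed(sample_records)
```

Coordinates are kept as strings here so that no precision is lost to float formatting on the way into the file.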

There are size limits to consider, though. The file you upload must not exceed 100 MB, and you can have up to 10 jobs at a time. If you really need to go to the limits, consider a more compact file format such as a pipe(|)-delimited text file. In that format the sample data above looks as shown below and is only a quarter of the size of the XML file:

1|de-DE||Konrad-Zuse-Str. 1|||||Unterschleißheim|85716||||||||||||||||||||||
4|en-GB|Tower of London||||||||||||||||||||||||||||
5|en-GB|Angel of the North||||||||||||||||||||||||||||
6|en-GB|RG6 1WG||||||||||||||||||||||||||||
7|fr-FR||||||||||||||||||||||||||||48.83431718199653|2.265087118043766
8|en-US|||||||||||||||||||||||||||||
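To make the column layout a little easier to read, here is a small Python sketch that splits one such line into named fields. It is an illustration only: the column positions are inferred from the sample lines above (Id, Culture and Query first, AddressLine in the fourth column, Locality and PostalCode in the ninth and tenth), the function name `parse_pipe_record` is hypothetical, and the schema description in the SDK remains the authoritative reference for all columns.

```python
def parse_pipe_record(line):
    """Split one pipe-delimited input record into the fields seen in the sample."""
    fields = line.rstrip("\n").split("|")
    # Pad short lines so the positions below can always be indexed safely.
    fields += [""] * max(0, 10 - len(fields))
    return {
        "Id": fields[0],
        "Culture": fields[1],
        "Query": fields[2],
        "AddressLine": fields[3],
        "Locality": fields[8],
        "PostalCode": fields[9],
    }


# First sample line, with the trailing empty columns left off for brevity.
address = parse_pipe_record("1|de-DE||Konrad-Zuse-Str. 1|||||Unterschleißheim|85716")
place = parse_pipe_record("4|en-GB|Tower of London")
```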

Create a Job

In the SDK you will find sample code for a console application in C#. In this walkthrough we will build a WinForms application in VB.NET. The final application looks as shown below, and you can download the source code here.

[Figure 02: screenshot of the final application]

Once we have selected our source data file, we first set the content type.

' The 'Content-Type' header must be "text/plain" or "application/xml"
' depending on the input data format.
Dim contentType As String = "text/plain"
If Microsoft.VisualBasic.Right(txtSelectedFile.Text, 3).ToLower = "xml" Then
  contentType = "application/xml"
End If

Next we build our HTTP POST request, adding parameters for the source data format and the Bing Maps key. We also add our source data file as bytes from a file stream. If the job was submitted successfully, we receive the location of the new job, including its job ID, in the 'Location' response header. Together with the desired output format (JSON or XML) and the Bing Maps key we can use this location to monitor the job status. We will start a timer to do just that every 30 seconds (or whatever interval you think is appropriate).

Dim queryStringBuilder As New StringBuilder()

' The 'input' and 'key' parameters are required.
' EscapeDataString also escapes reserved characters such as '&' and '=',
' so a value can never break the query string.
queryStringBuilder.Append("input=").Append(Uri.EscapeDataString(cbInputFormat.Text))
queryStringBuilder.Append("&")
queryStringBuilder.Append("key=").Append(Uri.EscapeDataString(txtBMKey.Text))

' The 'description' parameter is optional.
If Not String.IsNullOrEmpty(txtDescription.Text) Then
  queryStringBuilder.Append("&")
  queryStringBuilder.Append("description=").Append(Uri.EscapeDataString(txtDescription.Text))
End If

Dim uriBuilder As New UriBuilder("http://spatial.virtualearth.net")
uriBuilder.Path = "/REST/v1/dataflows/geocode"
uriBuilder.Query = queryStringBuilder.ToString()

Using dataStream As FileStream = File.OpenRead(txtSelectedFile.Text)
  Dim request As HttpWebRequest = DirectCast(WebRequest.Create(uriBuilder.Uri), HttpWebRequest)

  ' The method must be 'POST'.
  request.Method = "POST"
  request.ContentType = contentType

  Using requestStream As Stream = request.GetRequestStream()
    Dim buffer As Byte() = New Byte(16383) {}
    Dim bytesRead As Integer = dataStream.Read(buffer, 0, buffer.Length)
    While bytesRead > 0
      requestStream.Write(buffer, 0, bytesRead)

      bytesRead = dataStream.Read(buffer, 0, buffer.Length)
    End While
  End Using

  Try
    Using response As HttpWebResponse = DirectCast(request.GetResponse(), HttpWebResponse)
      ' If the job was created successfully, the status code should be
      ' 201 (Created) and the 'Location' header should contain the
      ' location of the new dataflow job.
      Dim dataflowJobLocation As String = response.GetResponseHeader("Location")

      If response.StatusCode <> HttpStatusCode.Created Then
        lblStatus.Text = "Unexpected status code."
      ElseIf String.IsNullOrEmpty(dataflowJobLocation) Then
        lblStatus.Text = "Expected the 'Location' header."
      Else
        ' Only build the status URL and start polling if the job really exists.
        myStatusUrl = dataflowJobLocation & "?output=" & Uri.EscapeDataString(cbOutputFormat.Text) & _
          "&key=" & Uri.EscapeDataString(txtBMKey.Text)
        lblStatusUrl.Visible = True

        ' Start a timer to monitor the status.
        ' In this sample the timer ticks every 30 seconds.
        myTimer.Start()
      End If
    End Using
  Catch ex As Exception
    lblStatus.Text = ex.Message
  End Try
End Using
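For readers who do not work in VB.NET, the same request can be sketched in a few lines of Python. This is a hedged illustration rather than SDK code: the function names `build_job_request` and `submit_job`, the file name and the key placeholder are made up, and `"pipe"` is assumed to be the value of the 'input' parameter for a pipe-delimited file. The endpoint, the parameter names, the content types and the 'Location' header are the ones used in the code above.

```python
import os
import urllib.request
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint used by the VB.NET sample above.
BASE_URL = "http://spatial.virtualearth.net/REST/v1/dataflows/geocode"


def build_job_request(file_path, input_format, key, description=None):
    """Return (url, content_type) for a 'create job' request."""
    params = {"input": input_format, "key": key}
    if description:
        params["description"] = description  # optional parameter
    # XML files are sent as application/xml, everything else as text/plain.
    ext = os.path.splitext(file_path)[1].lower()
    content_type = "application/xml" if ext == ".xml" else "text/plain"
    # urlencode percent-escapes every value, including '&' and '='.
    return BASE_URL + "?" + urlencode(params), content_type


def submit_job(file_path, input_format, key, description=None):
    """POST the data file and return (HTTP status, value of the Location header)."""
    url, content_type = build_job_request(file_path, input_format, key, description)
    with open(file_path, "rb") as data_file:
        # Reads the whole file into memory; fine for a sketch, but large
        # uploads would be better streamed.
        body = data_file.read()
    request = urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(request) as response:
        return response.status, response.headers.get("Location")


# Building the request does not touch the network:
url, content_type = build_job_request(
    "addresses.txt", "pipe", "YOUR_BING_MAPS_KEY", "My Batch Job"
)
```

Calling `submit_job` with the same arguments would perform the actual upload; a status of 201 together with a non-empty location corresponds to the success branch in the VB.NET code.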

Monitor Status

In the previous section we created our batch job, retrieved the job ID and started a timer that checks the job status periodically. The job status can be returned in either XML or JSON format and looks as shown below. As you can see, it contains the status of the job as well as URLs from which we can download the successfully geocoded data and the records that failed to geocode.

<?xml version="1.0" encoding="utf-8"?>
<Response xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema"
          xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1">
  <Copyright>Copyright © 2010 Microsoft and its suppliers. All rights reserved...</Copyright>
  <BrandLogoUri>http://spatial.virtualearth.net/Branding/logo_powered_by.png</BrandLogoUri>
  <StatusCode>200</StatusCode>
  <StatusDescription>OK</StatusDescription>
  <AuthenticationResultCode>ValidCredentials</AuthenticationResultCode>
  <TraceId>0508be9c784f4a9c898003942643b7f2|LTSM001003|02.00.136.1000|</TraceId>
  <ResourceSets>
    <ResourceSet>
      <EstimatedTotal>1</EstimatedTotal>
      <Resources>
        <DataflowJob>
          <Id>ce3548b360ca42d3adac0f7c4a26f392</Id>
          <Link role="self">https://spatial.virtualearth.net/REST/v1/…</Link>
          <Link role="output" name="succeeded">https://spatial.virtualearth.net/…</Link>
          <Link role="output" name="failed">https://spatial.virtualearth.net/REST/…</Link>
          <Description>My Batch Job 31/08/2010 00:00:00</Description>
          <Status>Completed</Status>
          <CreatedDate>2010-08-31T02:46:47.1744785-07:00</CreatedDate>
          <CompletedDate>2010-08-31T02:47:36.7504986-07:00</CompletedDate>
          <TotalEntityCount>12</TotalEntityCount>
          <ProcessedEntityCount>12</ProcessedEntityCount>
          <FailedEntityCount>1</FailedEntityCount>
        </DataflowJob>
      </Resources>
    </ResourceSet>
  </ResourceSets>
</Response>

In the procedure that runs on each timer tick we evaluate the job status. If the job has completed, we update our user interface with statistical information and download links.

Dim myXmlDocument As New XmlDocument
Dim numTotal As Integer = 0
Dim numProcessed As Integer = 0
Dim numFailed As Integer = 0

myXmlDocument.Load(myStatusUrl)

' Navigate to the DataflowJob element of the status response.
Dim myResourceSet As XmlNode = myXmlDocument.Item("Response").Item("ResourceSets").Item("ResourceSet")
Dim myXmlNode As XmlNode = myResourceSet.Item("Resources").Item("DataflowJob")
Dim myJobStatus As String = myXmlNode.Item("Status").InnerText

If myJobStatus = "Completed" Then
  lblStatus.Text = "Job Complete"

  For Each child As XmlNode In myXmlNode.ChildNodes
    Select Case child.Name
      Case "Link"
        ' Only the output links carry both a 'role' and a 'name' attribute.
        If child.Attributes.Count > 1 Then
          If child.Attributes("role").Value = "output" AndAlso _
             child.Attributes("name").Value = "succeeded" Then
            mySucessUrl = child.InnerText & "?key=" & txtBMKey.Text
          ElseIf child.Attributes("role").Value = "output" AndAlso _
                 child.Attributes("name").Value = "failed" Then
            myFailedUrl = child.InnerText & "?key=" & txtBMKey.Text
          End If
        End If
      Case "TotalEntityCount"
        numTotal = CInt(child.InnerText)
      Case "ProcessedEntityCount"
        numProcessed = CInt(child.InnerText)
      Case "FailedEntityCount"
        numFailed = CInt(child.InnerText)
    End Select
  Next

  lblSummary.Text = "Summary" & vbCrLf & _
    "Total Entities: " & numTotal.ToString() & vbCrLf & _
    "Processed Entities: " & numProcessed.ToString() & vbCrLf & _
    "Failed Entities: " & numFailed.ToString()
End If
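The status check can also be sketched outside of a WinForms timer. The Python below is an illustration under stated assumptions, not the application's code: `summarize_job` and `wait_for_completion` are hypothetical names, the sample document is a trimmed version of the response shown earlier, and its link URLs are placeholders on `example.invalid` because the real ones are abbreviated in this post. Only the "Completed" status is checked, since that is the only value shown here.

```python
import time
import xml.etree.ElementTree as ET

# Default namespace of the status response shown earlier.
REST_NS = {"r": "http://schemas.microsoft.com/search/local/ws/rest/v1"}

# Trimmed sample; the link URLs are placeholders, not real service URLs.
SAMPLE_STATUS = """<Response xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1">
  <ResourceSets><ResourceSet><Resources>
    <DataflowJob>
      <Link role="self">https://example.invalid/job</Link>
      <Link role="output" name="succeeded">https://example.invalid/succeeded</Link>
      <Link role="output" name="failed">https://example.invalid/failed</Link>
      <Status>Completed</Status>
      <TotalEntityCount>12</TotalEntityCount>
      <ProcessedEntityCount>12</ProcessedEntityCount>
      <FailedEntityCount>1</FailedEntityCount>
    </DataflowJob>
  </Resources></ResourceSet></ResourceSets>
</Response>"""


def summarize_job(status_xml):
    """Extract status, output links and entity counts from a status response."""
    job = ET.fromstring(status_xml).find(".//r:DataflowJob", REST_NS)
    if job is None:
        raise ValueError("no DataflowJob element in the response")
    summary = {"status": job.findtext("r:Status", namespaces=REST_NS), "links": {}}
    for link in job.findall("r:Link", REST_NS):
        if link.get("role") == "output":
            summary["links"][link.get("name")] = link.text
    for name in ("TotalEntityCount", "ProcessedEntityCount", "FailedEntityCount"):
        summary[name] = int(job.findtext("r:" + name, default="0", namespaces=REST_NS))
    return summary


def wait_for_completion(fetch_status_xml, interval_seconds=30, max_polls=20,
                        sleep=time.sleep):
    """Poll until the job reports 'Completed' or the polling budget is used up."""
    for attempt in range(max_polls):
        summary = summarize_job(fetch_status_xml())
        if summary["status"] == "Completed":
            return summary
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError("job did not complete within the polling budget")


summary = summarize_job(SAMPLE_STATUS)
```

Passing the fetch function and the sleep function in as parameters keeps the polling logic independent of the network, which also makes it easy to try out with canned responses.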

Download Results

Results will remain available for download for up to 14 days. Again, a detailed description of the data schema is available here in the SDK but let’s have a quick look at our sample data in XML-format:

<?xml version="1.0"?>
<GeocodeFeed>
  <GeocodeEntity Id="1" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="de-DE">
      <Address AddressLine="Konrad-Zuse-Str. 1"
               Locality="Unterschleißheim"
               PostalCode="85716" />
    </GeocodeRequest>
    <GeocodeResponse DisplayName="Konrad-Zuse-straße 1, 85716 Unterschleißheim"
                     EntityType="Address"
                     Confidence="Medium"
                     StatusCode="Success">
      <Address AddressLine="Konrad-Zuse-straße 1"
               AdminDistrict="BY"
               CountryRegion="Germany"
               FormattedAddress="Konrad-Zuse-straße 1, 85716 Unterschleißheim"
               Locality="Unterschleißheim"
               PostalCode="85716" />
      <RooftopLocation Latitude="48.290643" Longitude="11.581654" />
      <InterpolatedLocation Latitude="48.290542" Longitude="11.581076" />
    </GeocodeResponse>
  </GeocodeEntity>
  <GeocodeEntity Id="4" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB"
                    Query="Tower of London" />
    <GeocodeResponse DisplayName="Tower of London, United Kingdom"
                     EntityType="HistoricalSite"
                     Confidence="High"
                     StatusCode="Success">
      <Address AdminDistrict="England"
               CountryRegion="United Kingdom"
               FormattedAddress="Tower of London, United Kingdom"
               Locality="London" />
      <RooftopLocation Latitude="51.5081448107958" Longitude="-0.0762598961591721" />
    </GeocodeResponse>
  </GeocodeEntity>
  <GeocodeEntity Id="5" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB"
                    Query="Angel of the North" />
    <GeocodeResponse DisplayName="Angel of the North, United Kingdom"
                     EntityType="LandmarkBuilding"
                     Confidence="High"
                     StatusCode="Success">
      <Address AdminDistrict="England"
               CountryRegion="United Kingdom"
               FormattedAddress="Angel of the North, United Kingdom"
               Locality="Gateshead" />
      <RooftopLocation Latitude="54.9144704639912" Longitude="-1.58999472856522" />
    </GeocodeResponse>
  </GeocodeEntity>
  <GeocodeEntity Id="6" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB"
                    Query="RG6 1WG" />
    <GeocodeResponse DisplayName="RG6 1WG, Wokingham, United Kingdom"
                     EntityType="Postcode1"
                     Confidence="High"
                     StatusCode="Success">
      <Address AdminDistrict="England"
               CountryRegion="United Kingdom"
               FormattedAddress="RG6 1WG, Wokingham, United Kingdom"
               PostalCode="RG6 1WG" />
      <RooftopLocation Latitude="51.461179330945" Longitude="-0.925943478941917" />
    </GeocodeResponse>
  </GeocodeEntity>
  <GeocodeEntity Id="7" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <ReverseGeocodeRequest Culture="fr-FR">
      <Location Latitude="48.8343171819965" Longitude="2.26508711804377" />
    </ReverseGeocodeRequest>
    <GeocodeResponse DisplayName="Quai du Président Roosevelt, 92130 Issy-les-Moulineaux"
                     EntityType="Address"
                     Confidence="Medium"
                     StatusCode="Success">
      <Address AddressLine="Quai du Président Roosevelt"
               AdminDistrict="IdF"
               CountryRegion="France"
               FormattedAddress="Quai du Président Roosevelt, 92130 Issy-les-Moulineaux"
               Locality="Issy-les-Moulineaux"
               PostalCode="92130" />
      <InterpolatedLocation Latitude="48.8343036174774" Longitude="2.26509869098663" />
    </GeocodeResponse>
  </GeocodeEntity>
</GeocodeFeed>
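Once the file is downloaded you will usually want the coordinates back in your own data. Here is a minimal Python sketch of that step, given as an illustration rather than as SDK code. The function name `extract_locations` and the choice to prefer a RooftopLocation over an InterpolatedLocation are my assumptions; the element names, attribute names and the two sample entities are taken from the result file above.

```python
import xml.etree.ElementTree as ET

# Namespace declared on each GeocodeEntity in the result file.
GEOCODE_NS = {"g": "http://schemas.microsoft.com/search/local/2010/5/geocode"}

# Two entities from the result above, shortened to the parts used here.
SAMPLE_RESULT = """<GeocodeFeed>
  <GeocodeEntity Id="4" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <GeocodeRequest Culture="en-GB" Query="Tower of London" />
    <GeocodeResponse DisplayName="Tower of London, United Kingdom" StatusCode="Success">
      <RooftopLocation Latitude="51.5081448107958" Longitude="-0.0762598961591721" />
    </GeocodeResponse>
  </GeocodeEntity>
  <GeocodeEntity Id="7" xmlns="http://schemas.microsoft.com/search/local/2010/5/geocode">
    <ReverseGeocodeRequest Culture="fr-FR">
      <Location Latitude="48.8343171819965" Longitude="2.26508711804377" />
    </ReverseGeocodeRequest>
    <GeocodeResponse DisplayName="Quai du Président Roosevelt, 92130 Issy-les-Moulineaux" StatusCode="Success">
      <InterpolatedLocation Latitude="48.8343036174774" Longitude="2.26509869098663" />
    </GeocodeResponse>
  </GeocodeEntity>
</GeocodeFeed>"""


def extract_locations(result_xml):
    """Return (Id, DisplayName, latitude, longitude) for each geocoded entity."""
    rows = []
    for entity in ET.fromstring(result_xml).findall("g:GeocodeEntity", GEOCODE_NS):
        response = entity.find("g:GeocodeResponse", GEOCODE_NS)
        if response is None:
            continue  # nothing was resolved for this entity
        # Prefer the rooftop position; fall back to the interpolated one.
        point = response.find("g:RooftopLocation", GEOCODE_NS)
        if point is None:
            point = response.find("g:InterpolatedLocation", GEOCODE_NS)
        if point is None:
            continue
        rows.append((
            entity.get("Id"),
            response.get("DisplayName"),
            float(point.get("Latitude")),
            float(point.get("Longitude")),
        ))
    return rows


locations = extract_locations(SAMPLE_RESULT)
```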

That’s it for today. Happy coding!

 
