I wrote an initial blog post on Gisgraphy about a month ago. I wanted to write a follow-up, but hadn't gotten around to it, what with all the other stuff I have going on. But I'm going to take a few minutes now and write something up.
The initial import process to get the Gisgraphy server up and running took about 250 hours. The documentation said it would take around 40 hours, but of course there's no way to be accurate about that kind of thing without knowing about the specific hardware of the server it's being installed on, and the environment it's in. I'm guessing that, if I had more experience with AWS/EC2, I would have been better able to configure a machine properly for this project.
Once the import was complete, I started experimenting with the geocoding web service. I quickly discovered something that I'd overlooked when I was first testing against the author's hosted web service. The geocoding web service takes a free-form address string as a parameter. It's not set up to accept the parts of the address (street, city, state, zip) separately. It runs that string through an address parser, and here's where we hit a problem. The address parser, while "part of the gisgraphy project," is not actually open source. An installed instance of Gisgraphy, by default, calls out to the author's web service site to do the address parsing. And, if you call it too often, you get locked out. At which point, you have to talk about licensing the DLL or JAR for it, or paying for access to it via web service.
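Since the geocoder only takes a single free-form string, address data that's stored as separate fields has to be glued together before you can send it anywhere. Here's a rough sketch of that step in Python. The function name `build_address_query` and the sample address are made up for illustration, and this doesn't call Gisgraphy at all; it just builds the string.

```python
def build_address_query(street, city, state, zip_code):
    """Join separate address fields into one free-form address string.

    Hypothetical helper: the name and the output format are assumptions,
    not anything defined by Gisgraphy.
    """
    # Keep only the fields that actually contain something.
    parts = [p.strip() for p in (street, city, state) if p and p.strip()]
    line = ", ".join(parts)

    # Append the zip after the state, US style, with no comma before it.
    zip_code = (zip_code or "").strip()
    if zip_code:
        line = f"{line} {zip_code}" if line else zip_code
    return line


if __name__ == "__main__":
    # Made-up sample address.
    print(build_address_query("123 Main St", "Trenton", "NJ", "08608"))
```

The resulting string is what would then get passed along as the address parameter, and from there it's up to the address parser to split it back apart.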
Technically, the geocoder will work without the address parser, but I found that it returns largely useless results without it. For instance, it will happily return a result in California, given an address in New Jersey. I'm not entirely sure how the internal logic works, but it appears to just be doing a text search when it's trying to geocode an un-parsed address, merely returning a location with a similar street name, for instance, regardless of where in the US it is.
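One cheap way to catch results like that is to check whether the returned coordinates land anywhere near the state you asked about. Below is a minimal sketch, assuming you already have a latitude and longitude from some geocoder response. The bounding box numbers for New Jersey are rough approximations of my own, and `in_box` is a hypothetical helper, not part of Gisgraphy.

```python
# Rough bounding box for New Jersey (approximate values, an assumption):
# (min_lat, max_lat, min_lon, max_lon)
NJ_BOX = (38.9, 41.4, -75.6, -73.9)


def in_box(lat, lon, box):
    """Return True if the point falls inside the bounding box."""
    min_lat, max_lat, min_lon, max_lon = box
    return min_lat <= lat <= max_lat and min_lon <= lon <= max_lon


if __name__ == "__main__":
    # Approximate coordinates for Trenton, NJ and Los Angeles, CA.
    print(in_box(40.22, -74.76, NJ_BOX))   # a point in New Jersey
    print(in_box(34.05, -118.24, NJ_BOX))  # a point in California
```

A box this coarse will also accept points just across the border in neighboring states, so it only filters out the wildly wrong answers, which is the kind I was seeing.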
While I don't think the author is purposely running a bait-and-switch, I also don't think he's clear enough about the fact that the address parser isn't part of the open source project, and that the geocoder is fairly useless without it. So, we shut down the EC2 instance for this and moved on to other things.
Specifically, we moved on to MapQuest Open, which I was going to write up here in this post, but I need to head out to work now, so maybe another time.