Getting All the Places
The StreetCred community enriches point-of-interest data from a web crawl.
MapNYC, our first test, started with a blank map: users created 20,000 POIs entirely from scratch as they competed for Bitcoin prizes. We wanted to test the “blank map” approach first, but most datasets start off with some pre-existing data. The problem with all POI datasets? As soon as you collect them, they start going stale: new businesses open and old ones close or change their hours. This is the core problem StreetCred will solve.
MapAustin tests a number of new dynamics as we build a high-quality POI dataset across a 6,700-square-mile area in the middle of Texas in just one month. If you're participating in the contest, you might have noticed some Places without photos that invite you to complete the record. By adding data in person, such as hours, category, website, and phone number, you can earn points that count toward your position on the leaderboard.
Where does this data come from? We’re bulk importing open POI data from the All the Places project. A secret about the POI industry is that much of the data is crawled and scraped from across the web. Many companies duplicate this work behind closed doors, creating private scrapers for brands with store locator data. We thought: why duplicate this effort? And wouldn’t it be better with real-world verification?
All the Places is a series of web crawlers, but unlike proprietary efforts, it’s all open source. If you know a little Python and have some free time, you can head over to the GitHub repo and contribute your own crawler. Do you love Buffalo Wild Wings? If you don’t see it in the list of crawlers (as of now it’s not in there), you can create your own. Your crawler will then pull all of that brand’s locations into a big, open dataset that ideally will grow to include all the POIs available via store locators throughout the world.
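To give a feel for what a store-locator crawler does at its core, here is a minimal sketch in plain Python. It is not the All the Places API: the project has its own spider conventions and output schema, so check the repo before contributing. The payload shape, the field names (`stores`, `lat`, `lng`), the brand name, and the helper functions are all hypothetical, chosen only to illustrate turning locator data into GeoJSON points.

```python
import json

# Hypothetical store-locator response; real locators each use their own format.
SAMPLE_PAYLOAD = """
{"stores": [
  {"name": "Downtown", "lat": "30.2672", "lng": "-97.7431",
   "address": "100 Example St", "phone": null},
  {"name": "No Coordinates", "address": "Unknown"}
]}
"""


def store_to_feature(store, brand):
    """Convert one locator record into a GeoJSON Feature (assumed field names)."""
    properties = {
        "brand": brand,
        "name": store.get("name"),
        "address": store.get("address"),
        "phone": store.get("phone"),
    }
    # Drop fields the locator did not supply, so gaps stay visibly empty.
    properties = {key: value for key, value in properties.items() if value is not None}
    return {
        "type": "Feature",
        # GeoJSON orders coordinates as [longitude, latitude].
        "geometry": {
            "type": "Point",
            "coordinates": [float(store["lng"]), float(store["lat"])],
        },
        "properties": properties,
    }


def parse_locator(payload, brand):
    """Parse a locator payload into a FeatureCollection, skipping records without coordinates."""
    stores = json.loads(payload)["stores"]
    features = [
        store_to_feature(store, brand)
        for store in stores
        if "lat" in store and "lng" in store
    ]
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    print(json.dumps(parse_locator(SAMPLE_PAYLOAD, "Example Wings"), indent=2))
```

Records that come out of a crawl like this are often missing fields such as phone number, hours, or photos, and those are exactly the gaps that in-person contributors can fill.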
StreetCred imagines a decentralized protocol that will encourage data imports, like we’re doing for MapAustin. We want developers who are able to produce POI datasets like All the Places to be compensated through tokens for the valid places they enter. Of course, the data will need to be validated on location and proven by consensus to be accurate, but improving imported data is a key part of what we’re building.
Before MapAustin, we imported 1,807 POIs from All the Places. As of now, 643 of those incomplete records have been enriched: more than a third (about 36%) at the halfway mark of the contest. We're excited about these results!
So if you’re in Austin this month and are using the app (on iOS or Android), remember that you can and should add all the mom-and-pop places, BBQ joints, boutique shops, etc. But if you run across a data record that a big chain made available on the web, you might consider adding a photo and some other data to improve the record. And you can thank the community behind All the Places for the head start!