Can You Trust a Player?

Chris Shughrue - May 28, 2020

Every time you add a place to the StreetCred map, that data is validated by other players. But how do you know that the other players are trustworthy? To answer this question, we apply the same network validation approach we use for place data to the players themselves. In this post, we’ll take a deeper dive into how our peer validation network plays out in the real world, using Jakarta as an example.

Validating trustworthiness

Some players are more accurate than others. By leveraging data contributions from the whole mapping network, we can continuously compare accuracy throughout the community. Two factors influence our assessment of player trustworthiness: the number of place data points validated by peers, and the rate at which data from that player agrees with the consensus of the peer network. The typical well-validated player has contributed over 1,700 data points, but extremely diligent players who avoid mistakes and receive overlapping validation from other players can demonstrate trustworthiness in as few as 14 data points.
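As a rough illustration of how these two factors can combine into a single score, here is a minimal sketch assuming a Beta-Bernoulli model, where each peer validation either agrees or disagrees with a player’s contribution. The prior and the example tallies are illustrative, not our production model; the posterior mean tracks the agreement rate, and the posterior spread (the vertical error bars in Figure 1 below) shrinks as more of a player’s data is validated.

```python
from math import sqrt

def trust_estimate(agreements: int, disagreements: int,
                   prior_a: float = 1.0, prior_b: float = 1.0):
    """Toy Beta-Bernoulli trust score: returns (estimated agreement
    rate, uncertainty). Uncertainty shrinks as validations accumulate."""
    a = prior_a + agreements      # posterior alpha
    b = prior_b + disagreements   # posterior beta
    mean = a / (a + b)            # estimated agreement rate
    var = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, sqrt(var)

# A new player with little validated data: near-neutral, uncertain.
print(trust_estimate(2, 1))    # ~(0.60, 0.20)
# A diligent player: 14 validated points with no disagreements.
print(trust_estimate(14, 0))   # ~(0.94, 0.06)
```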

Figure 1. Illustrative example of three StreetCred players’ quality assessments versus number of data points contributed. A new player with few data contributions (orange) has a neutral initial quality assessment with a high degree of uncertainty (vertical error bars). As more of a player’s data is validated, their quality assessment converges toward a stable value. A player who contributes high-quality data (blue) follows the upper trajectory, with uncertainty about their accuracy decreasing as they play. A player with poor performance (red) follows the lower trajectory, with increasing peer validation revealing sloppy data contributions.

Trust spreads from the top

Within the community of validated players, there’s a sub-community containing the most trustworthy players in the StreetCred “world.” About 5% of StreetCred players graduate into this high level of trustworthiness, where the average player makes mistakes less than 4% of the time. These players are also the most experienced, having contributed more than 16,500 data points on average.

The most trustworthy players, in turn, contribute the most to the expansion of the validation network, because their volume makes them more likely to overlap with and validate other players in the community. A typical trustworthy player has data overlap with eight other players in the network, while prolific users validate the data of more than 35 others. Building up this user base is critical, not just because of the volume of place data these players contribute, but also because they enable us to understand the validity of data throughout the community.
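For intuition, here’s a sketch of how overlapping contributions define that network, using networkx as a stand-in for our internal tooling and made-up player and place IDs:

```python
from collections import defaultdict
from itertools import combinations
import networkx as nx

# Hypothetical (player, place) contribution pairs.
contributions = [
    ("alice", "p1"), ("bob", "p1"), ("bob", "p2"),
    ("carol", "p1"), ("carol", "p2"), ("dave", "p3"),
]

# Group contributors by place, then connect every pair of players
# who contributed data about the same place.
players_by_place = defaultdict(set)
for player, place in contributions:
    players_by_place[place].add(player)

G = nx.Graph()
G.add_nodes_from(player for player, _ in contributions)
for players in players_by_place.values():
    G.add_edges_from(combinations(sorted(players), 2))

# A player's validation reach is their degree in this graph;
# prolific validators show up as high-degree nodes.
print(sorted(G.degree, key=lambda kv: -kv[1]))
# [('alice', 2), ('bob', 2), ('carol', 2), ('dave', 0)]
```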

Top players drive the map

The most skilled and prolific players account for more than 70% of the data on the map. For some of the more challenging data fields, such as operating hours, this differentiation is even starker: more than 90% of hours data is generated by the most highly trusted users. Harnessing the trustworthiness of this sub-community enables us to capture accurate, up-to-date values for this conventionally challenging kind of information.

Figure 2. Sample of the peer validation network, where nodes represent players and edges represent overlapping place data. Average data quality per player is represented by color (cooler = higher quality, warmer = lower quality). Strongly connected red nodes in the center of the graph represent players whose low-quality contributions have been identified through multiple validations.

Sloppiness doesn’t scale

Not everyone shines in the light of validation by the peer network, however. Though a majority of players validated by the peer network demonstrate reliability and fastidiousness, validation also helps us automatically identify and remove sloppy data. Only 0.2% of all incoming data is ultimately generated by players found through validation to have low reliability, and this data can be easily segregated from high-quality contributions. This reflects, in part, the effectiveness of the peer validation network in flagging unfair usage for moderation. When a poor player plays at high volume to try to win, their data attracts more scrutiny from the peer network, automatically catching unfair behavior. Good players who play to win ultimately shine, while poor players reveal themselves through competition.
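In practice, segregating low-reliability data can be as simple as thresholding on a trust score. A sketch, reusing the toy `trust_estimate` from earlier (the thresholds and tallies here are illustrative, not our production values):

```python
MIN_AGREEMENT = 0.80    # required estimated agreement rate
MAX_UNCERTAINTY = 0.10  # required confidence in that rate

def is_trusted(agreements: int, disagreements: int) -> bool:
    """Accept a player's data only once the network is confident
    their agreement rate is high enough."""
    mean, sd = trust_estimate(agreements, disagreements)
    return mean >= MIN_AGREEMENT and sd <= MAX_UNCERTAINTY

# Hypothetical validation tallies: (agreements, disagreements).
tallies = {"alice": (40, 1), "bob": (3, 4), "carol": (4, 0)}
print({p for p, t in tallies.items() if is_trusted(*t)})
# {'alice'} -- bob disagrees too often; carol is clean so far
# but hasn't been validated enough times to be certain.
```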

Community builds better data, faster

Figure 3. Distribution of data point quality by contest period. By week four, the peer validation network is sufficiently extensive for a majority of data points to receive a high quality assessment.
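One way to see the mechanism behind this shift: a data point from an already-trusted player can be assessed almost immediately, while data from unvalidated players needs several overlapping contributions to compare. A toy rule (the overlap threshold is illustrative):

```python
def assessable(contributors: set[str], trusted: set[str],
               min_overlap: int = 3) -> bool:
    """Toy rule: a place's data can be assessed once a trusted player
    has contributed to it, or enough contributions overlap to compare."""
    return bool(contributors & trusted) or len(contributors) >= min_overlap

# Week one: few trusted players, so most places need triple coverage.
print(assessable({"new1", "new2"}, trusted=set()))        # False
# Week four: one trusted contributor is enough.
print(assessable({"new1", "alice"}, trusted={"alice"}))   # True
```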

Player behavior and validation dynamics reinforce the value of StreetCred player communities. In the initial ramp-up of a competition, multiple contributions per place are needed to verify the accuracy of player behavior (Figure 3). The validation network quickly expands in subsequent contest weeks as prolific players contribute and validate new place data. As players become validated through this process, the community takes on a self-regulating character: by week four, most active players have been validated, which raises the quality of data as it is created. This pattern illustrates how our growing understanding of player behavior improves the pace at which we can assess the quality of the data they contribute. By focusing on community, StreetCred builds a mapping network that sustainably contributes and updates high-quality data.