Mark as Spam

Chris Shughrue - August 18, 2020

As the StreetCred community expands, it’s been rewarding to see that the vast majority of players contribute accurate data about the places around them. There are always exceptions, though, so we’ve developed methods to flag the occasional spam creators who pop up.

In addition to a system in which players are monitored by their peers, we’re continually refining our tools to detect and deter spam. Using indicators derived from player activity patterns, we’ve built a predictive spam detection model with a 98% overall accuracy rate.

Here are a few examples of behavior patterns that our model uses to predict whether a player is trying to play us (Figure 1):

Figure 1. Interactions among spam indicators. Sample of 50 spammy users’ activity flags plotted by interactions among indicators. Spam strategies are often characterized by multiple types of anomalous behaviors, making it easier to identify spammers.

Speeding: Working quickly is the only way to win, but when users work too fast, something might be awry. About 10% of spam users engage in activities that would require superhuman speeds.
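To make the speeding idea concrete, here is a minimal sketch that measures how often a player's consecutive actions arrive faster than a person plausibly could work. The function name and the two-second floor are assumptions chosen for illustration; they are not StreetCred's actual thresholds or code.

```python
from typing import Sequence

# Hypothetical floor: assume a legitimate action takes at least this long.
MIN_SECONDS_PER_ACTION = 2.0


def speeding_fraction(timestamps_s: Sequence[float],
                      min_gap_s: float = MIN_SECONDS_PER_ACTION) -> float:
    """Return the share of gaps between consecutive actions shorter than min_gap_s."""
    ordered = sorted(timestamps_s)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        # Fewer than two actions: nothing to compare, so nothing to flag.
        return 0.0
    return sum(gap < min_gap_s for gap in gaps) / len(gaps)
```

A player whose fraction sits near 1.0 would be a candidate for the speeding flag; where to draw that line is a tuning decision the sketch leaves open.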

Globetrotting: Winning players are typically nimble, exploring their cities in search of new places to add to the map. But when you see a player travel thousands of kilometers a week, or leapfrog from one neighborhood to another in mere moments, questions about plausibility emerge. More than 20% of spam activity is indicated by impossible distances.
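One way to test for impossible distances is to compute the speed implied by each pair of consecutive contributions. The sketch below uses the haversine great-circle formula; the `Event` type, the function names, and the 150 km/h ceiling are all hypothetical and only illustrate the idea.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 150.0  # assumed ceiling, not a StreetCred value


@dataclass
class Event:
    timestamp_s: float  # seconds since some reference time
    lat: float          # degrees
    lon: float          # degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def implausible_hops(events: Sequence[Event],
                     max_kmh: float = MAX_PLAUSIBLE_KMH) -> List[Tuple[Event, Event]]:
    """Return consecutive event pairs whose implied travel speed exceeds max_kmh."""
    ordered = sorted(events, key=lambda e: e.timestamp_s)
    flagged = []
    for prev, cur in zip(ordered, ordered[1:]):
        km = haversine_km(prev.lat, prev.lon, cur.lat, cur.lon)
        hours = (cur.timestamp_s - prev.timestamp_s) / 3600.0
        if hours <= 0:
            # Same timestamp: any movement at all is impossible.
            if km > 0:
                flagged.append((prev, cur))
        elif km / hours > max_kmh:
            flagged.append((prev, cur))
    return flagged
```

Summing `haversine_km` over a week of a player's events would give the total distance for the "thousands of kilometers a week" check in the same spirit.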

Peer review: Peer validation continues to be one of the primary gatekeepers in spam detection. Players who consistently perform poorly in peer validation are likely to be engaging in less-than-fair play. Our credibility network automatically discounts contributions from poorly performing players as it harmonizes the dataset, but this metric is also valuable as an indicator of bad behavior. 55% of spam users get bad marks when graded by their peers.
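As a rough illustration of the peer review indicator, the sketch below flags a player whose contributions are rejected by peers more often than an assumed cutoff, but only once enough reviews have accumulated to make the rate meaningful. The function names, the 50% cutoff, and the ten-review minimum are hypothetical and do not describe how the credibility network is actually implemented.

```python
from typing import Sequence


def peer_rejection_rate(verdicts: Sequence[bool]) -> float:
    """Share of peer verdicts that were rejections (True means approved)."""
    if not verdicts:
        return 0.0
    return sum(not approved for approved in verdicts) / len(verdicts)


def flags_peer_review(verdicts: Sequence[bool],
                      max_rejection_rate: float = 0.5,  # assumed cutoff
                      min_reviews: int = 10) -> bool:   # assumed minimum sample
    """Flag a player only when enough reviews exist and most of them are rejections."""
    if len(verdicts) < min_reviews:
        return False
    return peer_rejection_rate(verdicts) > max_rejection_rate
```

The minimum review count keeps a new player with a couple of unlucky reviews from being flagged on thin evidence.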

Antisocial behavior: In StreetCred contests, players create and edit a shared set of places within their neighborhoods; each player interacts regularly with data added by others. Players who are conspicuously disengaged from the network are likely to be creating illegitimate places. Nearly 90% of users engaged in spam exhibit anomalously low integration into the credibility network.
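A simple proxy for integration is the share of a player's actions that touch places created by someone else. The sketch below assumes a hypothetical action log of (actor, place creator) pairs; the real credibility network is presumably richer than this, so treat it as an illustration of the idea only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def integration_scores(actions: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Map each actor to the fraction of their actions on places created by others.

    Each action is an (actor_id, place_creator_id) pair (a hypothetical log format).
    """
    total = defaultdict(int)
    shared = defaultdict(int)
    for actor, creator in actions:
        total[actor] += 1
        if actor != creator:
            shared[actor] += 1
    return {actor: shared[actor] / total[actor] for actor in total}
```

A score near zero means the player works almost exclusively on their own places, which is the kind of disengagement the antisocial flag is meant to catch.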

Other: We also flag anomalous behavior using a number of additional characteristics. One example is the rate of duplicative place creation, which may indicate players attempting to unfairly squeeze additional points from already-created places. Other flags draw on technical measures of app activity associated with implausible usage.

These analytics help us understand behavior patterns that indicate sketchy activity. We’ve found that bad actors typically light up several of these indicators, making them easy to identify. Our machine learning models take advantage of the co-occurrence of anomalies to make predictions based on holistic patterns of behavior.
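To see how co-occurrence can be summarized before any model is trained, the sketch below counts how often each pair of flags is raised for the same player. The flag names and the input format are assumptions for illustration; this is a plain tally, not the machine learning model described above.

```python
from collections import Counter
from itertools import combinations
from typing import Mapping, Set


def flag_cooccurrence(player_flags: Mapping[str, Set[str]]) -> Counter:
    """Count, for every pair of flags, how many players raised both."""
    pairs = Counter()
    for flags in player_flags.values():
        # Sort so each pair has one canonical ordering, e.g. ("a", "b") not ("b", "a").
        for pair in combinations(sorted(flags), 2):
            pairs[pair] += 1
    return pairs


def multi_flag_players(player_flags: Mapping[str, Set[str]], min_flags: int = 2) -> Set[str]:
    """Return the players who raised at least min_flags distinct flags."""
    return {player for player, flags in player_flags.items() if len(flags) >= min_flags}
```

Pair counts like these are the raw material for a plot of interactions among indicators, and the per-player flag sets could serve as input features for a classifier.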

By drawing together all the available metrics in training our spam prediction model, we’ve achieved a 98% overall accuracy rate. This is just a start. As the StreetCred community grows, we’re working continuously to improve the sensitivity of our model by uncovering new strategies undertaken by spammers.
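Overall accuracy and sensitivity measure different things, which is why improving sensitivity remains a goal even at high accuracy. The sketch below computes both from confusion-matrix counts. The counts in the usage example are invented purely to show the arithmetic and are not StreetCred's results.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all players classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp: int, fn: int) -> float:
    """Share of actual spammers that the model catches (also called recall)."""
    return tp / (tp + fn)


# Invented counts for illustration only: 1,000 players, 20 of them spammers.
tp, fn = 10, 10   # spammers caught / spammers missed
fp, tn = 10, 970  # honest players wrongly flagged / correctly cleared

overall = accuracy(tp, fp, tn, fn)
caught = sensitivity(tp, fn)
```

With these made-up counts, `overall` works out to 0.98 while `caught` is only 0.5. When spammers are rare, a model can score high on accuracy and still miss many of them, so the two numbers are worth tracking side by side.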