Introducing BotSight: A New Tool to Detect Bots on Twitter in Real Time
Quantifying Disinformation on Twitter, One Tweet at a Time
NortonLifeLock Research Group (formerly known as Symantec Research Labs) has released a beta tool that can detect bots
on Twitter in real time, to help Twitter users understand the prevalence of bots and disinformation campaigns within
their personal feeds. The tool has also been made available in New Zealand.
Awareness around misinformation is higher than ever before, particularly as major social media platforms clamp down on
misleading content and accounts – yet there is still little understanding of how much disinformation is actually out there.
With this in mind, we trained a state-of-the-art machine learning model that can detect Twitter bots
with a high degree of accuracy, achieving an Area Under the Curve (AUC) – a common indicator of model quality – of 0.967 on
popular research datasets, matching or exceeding the best current academic results. But we didn’t stop there. We
created a tool – called BotSight – which takes the results of our model and injects them directly into the Twitter feed.
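As a brief illustration of the AUC figure above: AUC can be read as the probability that a randomly chosen bot account receives a higher bot score than a randomly chosen human account, which suggests a direct way to compute it. The sketch below is our own minimal implementation of that pairwise definition, not BotSight’s evaluation code.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed directly as the probability
    that a randomly chosen positive (label 1) outscores a randomly
    chosen negative (label 0); ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# e.g. auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]) -> 0.75
```

An AUC of 0.967 therefore means that, for roughly 97 out of 100 random bot/human pairs, the model scores the bot higher.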
Now, we are releasing a beta version of BotSight (on popular browsers and iOS) to give people a better understanding of
how bots operate on Twitter. You can download it for free here.
To determine whether an account is a bot, we look at over 20 distinguishing features per account, including the amount
of randomness in the Twitter handle, whether the account is verified, the rate at which it is acquiring followers, and
the account’s description. We verified our approach by observing BotSight in action: so far, BotSight’s beta users have
analysed over 100,000 Twitter accounts.
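The features above are described only at a high level, so the sketch below shows roughly what such per-account signals might look like. The `Account` fields, the feature choices, and the use of character entropy as a proxy for handle randomness are all our illustrative assumptions, not BotSight’s actual implementation.

```python
import math
from dataclasses import dataclass

@dataclass
class Account:
    handle: str
    verified: bool
    followers: int
    account_age_days: int
    description: str

def handle_entropy(handle: str) -> float:
    """Shannon entropy (bits per character) of the handle: a
    random-looking handle like 'x7k2q9' scores higher than 'johnsmith'."""
    if not handle:
        return 0.0
    counts = {}
    for ch in handle:
        counts[ch] = counts.get(ch, 0) + 1
    n = len(handle)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def extract_features(acct: Account) -> list:
    """A handful of the kinds of per-account signals mentioned in the
    post (the real model uses over 20; these are hypothetical)."""
    return [
        handle_entropy(acct.handle),                      # handle randomness
        1.0 if acct.verified else 0.0,                    # verified flag
        acct.followers / max(acct.account_age_days, 1),   # follower acquisition rate
        float(len(acct.description)),                     # description length
    ]
```

A feature vector like this would then be fed to a trained classifier, which outputs a bot-likelihood score for the account.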
BotSight works across the majority of Twitter including search, trending topics, and your home timeline. For the past
six months, our team has been diligently scrolling through Twitter with BotSight enabled in order to continuously test
and improve both our model and our design. It has also enabled us to better understand bots, contextualizing where they
are likely to appear and how they act.
Using BotSight’s classifier on what we believe is the largest archive of Twitter’s historical data ever collected
outside Twitter (over 4TB), we found many interesting and surprising things. One is that the problem of disinformation
is not as small as Twitter’s numbers suggest at first blush, but it is also nowhere near the scale of the more sensational headlines
we’ve seen. We’ve found that about 5% of tweets belong to bots overall, and this percentage has gone down over time, which is a testament to the hard work of Twitter’s Site Integrity team.
However, this percentage can go up as high as 20% when viewing trending topics, such as #COVID19 or other trending
hashtags. In our analysis of recent coronavirus-related tweets, we found that between 6% and 18% of users tweeting on this
subject were bots, depending on which time period we sampled, while a random sample of the Twitter stream indicates 4% to 8%
bot activity by volume over the same time period. This contrast shows that bots are strategic about their behaviour:
favouring current events to maximize their impact.
All these numbers vary by language, topic, and time of day – which is precisely why seeing bot scores directly in your
own Twitter feed is so helpful.
While we have made every effort to ensure BotSight works well, it is still a research prototype. We invite you to use
BotSight and share your feedback with us at DLfirstname.lastname@example.org.