Instagram Uses AI to Track Online Bullying in Real Time

Starting today, Facebook's Instagram is rolling out a new feature that notifies people when their captions on a photo or video may be considered offensive, and gives them a chance to pause and reconsider their words before posting.

Instagram said it has developed and tested AI that can recognize different forms of bullying on Instagram. Earlier this year, the company launched a feature that notifies people when their comments may be considered offensive before they’re posted.

Today, when someone writes a caption for a feed post and Instagram's AI detects the caption as potentially offensive, they will receive a prompt informing them that their caption is similar to those reported for bullying. They will have the opportunity to edit their caption before it’s posted.
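The flow described above (score the caption, then prompt the author before posting) can be sketched as follows. This is only an illustration under stated assumptions: the article does not say how Instagram's AI scores captions, so `toy_score`, `WARN_THRESHOLD`, `CaptionDecision`, and `review_caption` are all hypothetical names, and the word list stands in for a real classifier.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical threshold; the article does not describe how Instagram decides.
WARN_THRESHOLD = 0.8


@dataclass
class CaptionDecision:
    show_warning: bool
    message: str


def toy_score(caption: str) -> float:
    """Stand-in for a real classifier: returns 1.0 if a listed word appears."""
    flagged = {"loser", "ugly"}  # illustrative word list only
    words = {w.strip(".,!?").lower() for w in caption.split()}
    return 1.0 if words & flagged else 0.0


def review_caption(
    caption: str,
    score_fn: Callable[[str], float] = toy_score,
    threshold: float = WARN_THRESHOLD,
) -> CaptionDecision:
    """Decide whether to prompt the author before the caption is posted."""
    if score_fn(caption) >= threshold:
        # The author can still edit the caption or post it unchanged.
        return CaptionDecision(
            True, "This caption looks similar to others that have been reported."
        )
    return CaptionDecision(False, "")
```

In this sketch the check only produces a prompt; it never blocks the post, which matches the article's description of giving people a chance to edit before posting.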

In addition to limiting the reach of bullying, this warning helps educate people on what Instagram doesn't allow on its platform, and when an account may be at risk of breaking the company's rules. To start, this feature will be rolling out in select countries, and Instagram will begin expanding it globally in the coming months.

Combatting Misinformation

In May of this year, Instagram began working with third-party fact-checkers in the US to help identify, review, and label false information. These partners independently assess false information to help Instagram catch it and reduce its distribution. Today, Instagram is expanding its fact-checking program globally to allow fact-checking organizations around the world to assess and rate misinformation on its platform.

When content has been rated as false or partly false by a third-party fact-checker, Instagram reduces its distribution by removing it from Explore and hashtag pages. In addition, it will be labeled so people can better decide for themselves what to read, trust, and share. When these labels are applied, they will appear to everyone around the world viewing that content – in feed, profile, stories, and direct messages.

Instagram uses image matching technology to find further instances of this content and apply the label, helping reduce the spread of misinformation. In addition, if something is rated false or partly false on Facebook, starting today Instagram will automatically label identical content if it is posted on Instagram (and vice versa). The label will link out to the rating from the fact-checker and provide links to articles from credible sources that debunk the claim(s) made in the post. Instagram makes content from accounts that repeatedly receive these labels harder to find by removing it from Explore and hashtag pages.
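A rough sketch of how a fact-check rating might carry over to identical copies of the same content is shown below. It makes a deliberate simplification: it matches only byte-identical files using a SHA-256 hash, whereas the image matching technology mentioned above presumably tolerates small differences. `LabelRegistry`, `FALSE_RATINGS`, and `eligible_for_discovery` are hypothetical names, not Instagram's API.

```python
import hashlib
from typing import Dict, Optional

# Ratings that trigger a label, per the article.
FALSE_RATINGS = {"false", "partly false"}


class LabelRegistry:
    """Maps a content fingerprint to the rating a fact-checker gave it."""

    def __init__(self) -> None:
        self._ratings: Dict[str, str] = {}

    @staticmethod
    def fingerprint(content: bytes) -> str:
        # Assumption: exact-match hashing as a stand-in for image matching.
        return hashlib.sha256(content).hexdigest()

    def record_rating(self, content: bytes, rating: str) -> None:
        """Store a rating only if it is one that triggers a label."""
        if rating in FALSE_RATINGS:
            self._ratings[self.fingerprint(content)] = rating

    def label_for(self, content: bytes) -> Optional[str]:
        """Return the stored rating for identical content, if any."""
        return self._ratings.get(self.fingerprint(content))


def eligible_for_discovery(label: Optional[str]) -> bool:
    """Labeled content is kept out of Explore and hashtag pages."""
    return label is None
```

A single shared registry of this kind would let a rating recorded for a post on one service be looked up when identical content appears on the other, which is the cross-posting behavior the article describes.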

To determine which content should be sent to fact-checkers for review, Instagram uses a combination of technology and feedback from its community. Earlier this year, Instagram added a “False Information” feedback option, and these reports, along with other signals, help the company better identify and take action on potentially false information.