Unsafe organizations are often illegal and related to:
Content related to unsafe organizations may include names of people or movements, slogans, or ideologies such as antisemitism, white supremacism, Islamism or anti-government ideas.
As user-generated content grows on social media, so does extremist content. For example, "between August 1, 2015 and December 31, 2017, [Twitter] suspended 1,210,357 accounts [...] for violations related to the promotion of terrorism".
Most content promoting or supporting unsafe organizations or activities, such as terrorism, unauthorized events or sectarianism, is illegal in the UK, the US and the EU.
Extremism is not always illegal, but it is generally not tolerated by the users of platforms and applications. The promotion of extremism also often leads to hate speech and identity attacks against protected groups, which should not be accepted on platforms hosting user-generated messages.
It is therefore important, if not mandatory, to detect this type of content and report it to legal authorities when necessary.
Ignoring this threatening content may lead to very serious and unsafe situations, involving injury or death in the worst case.
Here are some examples of ways a platform could be used for illegal purposes related to unsafe organizations:
There are several ways to detect these kinds of behaviors but it can be difficult to do so as they are often not explicit:
Platforms should rely on reports coming from their users. Users can report suspicious messages to the trust and safety team so that they are reviewed by moderators, handled by the platform and/or reported to the authorities. This is a simple way to moderate: users only need a way to submit a report. The main disadvantage of user reports is the delay before the content is reported and handled by the platform.
To simplify the handling of user reports and to distinguish between the possible topics of a message, it is important that users can categorize their reports, with categories such as terrorism or sectarianism.
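As an illustration of categorized reports, here is a minimal Python sketch. The names `ReportCategory`, `UserReport` and `group_by_category` are hypothetical and not part of any existing tool; only the two example categories come from the text above, and the `OTHER` fallback is an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class ReportCategory(Enum):
    # The first two categories are the examples given in the text;
    # OTHER is an assumed fallback for reports that fit neither.
    TERRORISM = "terrorism"
    SECTARIANISM = "sectarianism"
    OTHER = "other"


@dataclass
class UserReport:
    message_id: str
    reporter_id: str
    category: ReportCategory
    # Timestamp kept so the delay between report and handling can be measured.
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def group_by_category(reports):
    """Group reports by category so each queue can be reviewed separately."""
    queues = {category: [] for category in ReportCategory}
    for report in reports:
        queues[report.category].append(report)
    return queues
```

With reports grouped this way, a trust and safety team could, for example, review the terrorism queue before the others.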
Human moderators are effective at detecting content related to unsafe organizations because they are skilled at picking up implicit information. Yet such content needs to be detected as quickly as possible, and lack of speed is a common issue with human moderation.
Also, detecting extremist content can be challenging even for human moderators. Some slogans or terms used by extremist groups can be unknown to moderators. Such examples would be:
To help detect those, make sure that moderators are trained on the latest trends used by extremist groups, and encourage them to seek and share feedback on flagged items, to spur continuous improvement and learn from the ever-evolving behaviors of your community. One suggestion is to keep a database of these groups with their names, logos, slogans, leading members, etc. for quick reference during reviews. This would probably lower the error rate and speed up decision-making.
Here are some examples of resources that can be used for a quick reference to identified unsafe groups:
Keywords are a good way to pre-filter comments and messages by detecting all the words related to unsafe organizations such as:
Although the above keywords are relevant, we should keep in mind the context of the words and the differences in language and culture. For example, kkk is often used to express laughter in Portuguese and could lead to false positives. For more information on existing lists of keywords, you can refer to this repository.
Also, note that criticizing government officials or organizations is usually allowed as long as it is not threatening or calling for violence (ACAB, for instance, is not an explicit call for violence).
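The pre-filtering idea, including the language caveat, can be sketched as follows. This is an illustration under stated assumptions, not a production filter: `KEYWORDS` holds only the two terms quoted in the text as placeholders, `BENIGN_BY_LANGUAGE` and `flag_keywords` are hypothetical names, and the language code is assumed to be supplied by the caller.

```python
import re

# Placeholder list: only the two terms quoted in the text are used here.
# A real list would be longer and maintained by the moderation team.
KEYWORDS = {"kkk", "acab"}

# Assumption: keywords known to be harmless in a given language,
# keyed by a language code provided by the caller.
BENIGN_BY_LANGUAGE = {"pt": {"kkk"}}

# Splitting into whole words avoids matching keywords inside longer words.
TOKEN_RE = re.compile(r"\w+")


def flag_keywords(text, language):
    """Return the sorted keywords found in text, skipping benign ones."""
    tokens = {token.lower() for token in TOKEN_RE.findall(text)}
    benign = BENIGN_BY_LANGUAGE.get(language, set())
    return sorted((tokens & KEYWORDS) - benign)
```

A non-empty result only marks a message as a candidate for human review; it is not a moderation decision in itself, since a flagged term may be allowed in context.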
With these detected keywords, human moderators can focus on a smaller sample to verify, but this method is not well suited to detecting implicit content and carries a high risk of missing dangerous messages. The example below, for instance, does not contain any word related to unsafe organizations but could still be a threat depending on the context:
ML models may complement the keyword and human-moderation approaches by detecting such cases, using annotated datasets that contain examples of promotion of or support for unsafe organizations.
Automated moderation can also be used to moderate images and videos submitted by users by detecting offensive symbols representing hateful ideas (Nazi-era or supremacist symbols) as well as terrorist flags and symbols.
As soon as the unsafe content is found and verified, platforms and applications must handle the issue depending on its intensity and severity.
Messages referring to illegal organizations or suggesting an upcoming threat such as a terrorist attack are considered high-risk messages.
When encountering such comments, platforms should react extremely quickly by reporting them to the country's authorities and providing them with all the necessary information.
Rules and legislation regarding such content and actions depend on the country, but below are some examples:
In any case, the platform or application should not handle the issue by itself and should refer to the legal and relevant authorities.
For more information on regulations by country, you can refer to the guide compiled by Tech Against Terrorism, which describes the state of online regulation of terrorism-related content.
Other messages are less urgent, either because they are not illegal or because they are less obvious. A user promoting extremist ideas is probably less of a risk than a user expressing a desire to kill anyone who disagrees with their extremist ideas.
In such cases, platforms could decide to ban the user for a limited time or permanently. They could also decide to wait and see whether the user posts more comments of the same kind.
Setting up a moderation log to allow moderators to report the incidents encountered for a specific user is a way to track possible at-risk users.
The process defined for high-priority situations could then be used for users that are in fact at risk.
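A moderation log of this kind could look like the minimal in-memory sketch below. The `ModerationLog` and `Incident` names and the `review_threshold` value are assumptions made for illustration; a real system would persist incidents and set the threshold according to its own policy.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Incident:
    user_id: str
    message_id: str
    note: str
    created_at: datetime


class ModerationLog:
    """In-memory log of incidents per user (illustrative only)."""

    def __init__(self, review_threshold=3):
        # Assumption: the threshold is arbitrary and should follow policy.
        self.review_threshold = review_threshold
        self._incidents = defaultdict(list)

    def record(self, user_id, message_id, note):
        """Store an incident reported by a moderator for a given user."""
        incident = Incident(
            user_id, message_id, note, datetime.now(timezone.utc)
        )
        self._incidents[user_id].append(incident)
        return incident

    def incidents_for(self, user_id):
        """Return a copy of the incidents recorded for a user."""
        return list(self._incidents.get(user_id, []))

    def needs_escalation(self, user_id):
        """True once a user has reached the incident threshold."""
        count = len(self._incidents.get(user_id, []))
        return count >= self.review_threshold
```

Once `needs_escalation` returns true for a user, the case could be handed over to the process defined for high-priority situations.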
It is important to remember that, in case of any doubt, the issue should still be reported to local authorities, since a mistake could have very serious consequences. Social media and other platforms hosting user-generated content have become a preferred means of promoting extremist ideas, especially among young or vulnerable users.