FAQ / Text Moderation

How do rule-based pattern-matching algorithms work?

The existing moderation categories for Rule-based Detection are the following: profanity, personal details, links, misleading usernames, extremist references, weapon names, and medical or recreational drugs.
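At a high level, rule-based detection normalizes the incoming text and matches it against curated word lists and patterns maintained for each category. The snippet below is a deliberately simplified Python sketch of that idea; the patterns and category names are made up for illustration and are not the rules the API actually uses.

```python
import re

# Toy rule sets: each category maps to a list of regular expressions.
# These patterns are illustrative only, not the engine's actual rules.
RULES = {
    "link": [re.compile(r"https?://\S+", re.IGNORECASE)],
    "personal": [re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")],  # naive email pattern
    "profanity": [re.compile(r"\bd[a@]mn\b", re.IGNORECASE)],
}

def check(text: str) -> list[dict]:
    """Return every rule match with its category and position in the text."""
    matches = []
    for category, patterns in RULES.items():
        for pattern in patterns:
            for m in pattern.finditer(text):
                matches.append({
                    "category": category,
                    "match": m.group(0),
                    "start": m.start(),
                    "end": m.end(),
                })
    return matches

print(check("damn, email me at john@example.com"))
```

A real engine also handles obfuscation (character substitutions, spacing tricks) and language-specific variants, which is why the curated lists matter more than the matching loop itself.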

Profanity Detection

Profanity Detection lets you detect unwanted, hateful, sexual and toxic content in any user-generated text: comments, messages, posts, reviews, usernames. The available categories are the following:

  • sexual: term or expression that refers to sexual acts, sexual organs, body parts or bodily fluids typically associated with sexual acts. Example: "here's my d*ck"
  • discriminatory: discriminatory and derogatory content, mostly hate speech that instigates violence or hate against groups based on specific characteristics such as religion, national or ethnic origin, sexual orientation or gender identity. Example: "he's a tard"
  • insult: words or phrases that undermine the dignity or honor of an individual, that are signs of disrespect and are generally used to refer to someone. Example: "you fatso!"
  • inappropriate: inappropriate language such as swear words, slang, familiar/informal or socially inappropriate/unacceptable words or phrases used to describe something or to talk to someone. Example: "what the cr@p?"
  • grawlix: string of typographical symbols typically used in place of obscenity or profanity. Example: "you #@$%!!"

For each profanity match, the API also returns an intensity level, which ranks toxic language from mild to extreme. The three intensity levels are:

  • high: the highest level of profanity, with words and expressions that are problematic in most if not all contexts.
  • medium: medium-level profanity; might be acceptable in some adult-only circles, while being unacceptable in public or in front of children.
  • low: the lowest level of profanity; mild language.
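As a rough integration sketch, the snippet below submits a comment for rule-based moderation and applies a policy based on the returned intensity. The endpoint URL, the parameter names (mode, api_user, api_secret) and the response fields (profanity.matches[].intensity) are assumptions based on this page; check the API reference for the exact names.

```python
import requests

# Assumed endpoint and parameter names; verify against the API reference.
API_URL = "https://api.sightengine.com/1.0/text/check.json"

def moderate_comment(text: str, api_user: str, api_secret: str) -> str:
    response = requests.post(API_URL, data={
        "text": text,
        "lang": "en",
        "mode": "rules",        # assumed name for rule-based detection
        "api_user": api_user,
        "api_secret": api_secret,
    })
    response.raise_for_status()
    result = response.json()

    # Assumed response shape: one entry per profanity match, each with an intensity.
    intensities = {m.get("intensity") for m in result.get("profanity", {}).get("matches", [])}
    if "high" in intensities:
        return "reject"          # problematic in most if not all contexts
    if "medium" in intensities:
        return "manual_review"   # acceptable only in some adult-only circles
    if "low" in intensities:
        return "allow_with_warning"
    return "allow"
```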

PII Detection

The API has been developed to detect Personally Identifiable Information (PII) shared in text messages, reviews or comments.

Personal details that the API detects include the following:

  • Phone numbers from various countries
  • Email addresses
  • Social security numbers
  • IP addresses

Detecting personal details can be important for multiple reasons:

  • Privacy: prevent users from publicly sharing their or other people's personal information.
  • Abuse: detect users who use your service inappropriately (ads, scammers, sales, etc.).
  • Compliance: make sure you don't store PII.

For more details, check out our PII Detection model.
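A common follow-up is to mask detected personal details before a message is stored or displayed. The sketch below assumes the detector returns each match with character offsets (start, end); that response shape is an assumption used here for illustration.

```python
def redact(text: str, matches: list[dict]) -> str:
    """Replace each detected PII span with a fixed placeholder.

    `matches` is assumed to contain dicts with `start` and `end`
    character offsets, as a PII detector might return them.
    """
    redacted = text
    # Work from the end of the string so earlier offsets stay valid.
    for m in sorted(matches, key=lambda m: m["start"], reverse=True):
        redacted = redacted[:m["start"]] + "[REDACTED]" + redacted[m["end"]:]
    return redacted

message = "Call me at 555-0100 or write to jane@example.com"
detected = [
    {"type": "phone_number", "start": 11, "end": 19},
    {"type": "email", "start": 32, "end": 48},
]
print(redact(message, detected))
# Call me at [REDACTED] or write to [REDACTED]
```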

URL and Link Detection

URL Moderation automatically flags links and URLs. When applicable, the API also returns the category of unsafe content the link or URL belongs to. Here is the list of categories that Sightengine will flag for you:

  • unsafe: sites presenting a risk for visitors, such as phishing, malware, scams
  • adult: sites containing porn, erotica, escort services
  • gambling: legal and illegal casinos, money games
  • drugs: sites promoting or selling recreational drugs
  • hate: extremist or hateful content
  • custom: your own custom disallow lists and allow lists

All categories apart from the custom one work directly out of the box. The categories cover links and URLs from more than 5 million domains known to host unwanted content. Our lists are updated weekly to reflect the ever-changing nature of the web.
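A typical integration maps each returned category to a moderation action. The sketch below uses the categories from the list above; the shape of the match objects (a category field per detected link) is an assumption.

```python
# Map the documented link categories to moderation actions.
LINK_POLICY = {
    "unsafe": "block",      # phishing, malware, scams
    "adult": "block",
    "hate": "block",
    "gambling": "review",
    "drugs": "review",
    "custom": "block",      # hits on your own disallow list
}

def action_for_links(link_matches: list[dict]) -> str:
    """Pick the strictest action across all detected links.

    `link_matches` is assumed to be a list of dicts with a `category` key;
    links with no known category default to manual review.
    """
    actions = {LINK_POLICY.get(m.get("category"), "review") for m in link_matches}
    if "block" in actions:
        return "block"
    if "review" in actions:
        return "review"
    return "allow"
```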

Misleading usernames

The Username Moderation API can detect misleading usernames, i.e. usernames that might mislead other users, for instance because they attempt to impersonate someone else or because they mimic app features, actions or technical terms. It can therefore be used to prevent users from choosing such usernames.

Misleading usernames generally belong to one of the following categories:

  • Impersonation of privileged users or employees. Examples: admin, support
  • Common features and actions in apps and websites. Examples: about, profile, upgrade
  • Technical terms related to apps, websites and development. Examples: 404, cgi-bin, ubuntu
  • Commercial terms that are widely known. Example: facebook
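If you want a fast client-side pre-check before calling the API, you can screen submissions against a handful of reserved terms like those above. The term list below is a tiny illustrative sample taken from the examples in this section, not the list the API relies on.

```python
# Small illustrative sample of reserved terms, based on the categories above.
RESERVED_TERMS = {
    "admin", "support",            # privileged users, employees
    "about", "profile", "upgrade", # common app features and actions
    "404", "cgi-bin", "ubuntu",    # technical terms
    "facebook",                    # widely known commercial terms
}

def is_misleading(username: str) -> bool:
    """Flag usernames that are, or contain, a reserved term."""
    normalized = username.strip().lower()
    return any(term in normalized for term in RESERVED_TERMS)

print(is_misleading("Admin_2024"))  # True
print(is_misleading("jane_doe"))    # False
```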

Other optional capabilities

Extremism Detection

This capability is useful for detecting whether user-generated texts (comments, messages, posts, reviews, usernames, etc.) contain words related to extremist ideologies used to promote hate, violence or terror acts.

The words and phrases that will be flagged for being extremist-related or terrorist-related can be names of people, groups or movements, slogans or known keywords. Here is an overview of the different types:

  • people: individuals linked to past or present extremist or terrorist organizations or events. Examples: mussolini, bin laden
  • group / movement: organizations known to promote or incite hate. Example: al qaeda
  • keyword: words frequently used to promote hate or describe extremist practices or theories. Examples: holohoax, mein Kampf
  • slogan: catch phrases used by extremist people or organizations to promote hateful ideas. Examples: 6mwe, acab

Detection covers a range of ideologies considered extremist, including but not limited to islamic extremism, white supremacism, antisemitism, anti-government extremism, and left-wing and right-wing extremism.
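Because extremism matches are rarely acceptable in any context, a common policy is to block the content outright and keep a record for moderators. The sketch below assumes each match carries a type (people, group / movement, keyword or slogan) and the matched text; those field names are assumptions.

```python
from datetime import datetime, timezone

def handle_extremism_matches(user_id: str, text: str, matches: list[dict]) -> dict | None:
    """Block flagged content and build an audit record for moderators.

    Each match is assumed to be a dict with `type` and `match` keys.
    Returns None when nothing was flagged.
    """
    if not matches:
        return None
    return {
        "user_id": user_id,
        "action": "blocked",
        "reason": "extremism",
        "types": sorted({m["type"] for m in matches}),    # e.g. ["keyword", "slogan"]
        "matched_terms": [m["match"] for m in matches],
        "flagged_at": datetime.now(timezone.utc).isoformat(),
        "original_text": text,
    }
```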

See the documentation to learn more about Extremism Detection.

Drug Detection

This capability is useful for detecting whether user-generated texts (comments, messages, posts, reviews, usernames, etc.) contain words related to recreational drugs.

Detected text items include common names of drugs as well as more informal words such as nicknames and shortened variants. Here are a few examples of the names of drugs that are detected:

  • cannabis: also detected as weed, ganja, marijuana...
  • ecstasy: also detected as md, mdma, xtc...
  • tnt: also detected as poppers...
  • cocaine: also detected as coke, crack...
  • ...
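Conceptually, variant detection maps nicknames and shortened forms back to the commonly used name. The toy dictionary below only illustrates that idea with the examples from the list above; it is not the detection list used by the API.

```python
# Illustrative variant map built from the examples above.
DRUG_VARIANTS = {
    "weed": "cannabis", "ganja": "cannabis", "marijuana": "cannabis",
    "md": "ecstasy", "mdma": "ecstasy", "xtc": "ecstasy",
    "coke": "cocaine", "crack": "cocaine",
}

def canonical_drug_names(words: list[str]) -> set[str]:
    """Map informal words to the commonly used drug name they refer to."""
    return {DRUG_VARIANTS[w.lower()] for w in words if w.lower() in DRUG_VARIANTS}

print(canonical_drug_names("anyone selling xtc or ganja".split()))
# {'ecstasy', 'cannabis'}
```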

See the documentation to learn more about Drug Detection.

Medical Drug Detection

This capability is useful for detecting whether user-generated texts (comments, messages, posts, reviews, usernames, etc.) contain words related to medical drugs.

Detected text items include common names of medication used against pain, depression, anxiety, insomnia, obesity and erectile dysfunction, as well as names of molecules found in these medicines. These topics are sensitive: mentions of such medication can signal a feeling of insecurity, or point to dependence on or addiction to the medication used to treat or control these conditions. Here are a few examples of the names of medication that are detected:

  • pain: fentanyl
  • depression: prozac
  • anxiety: valium
  • insomnia: flurazepam
  • obesity: bontril
  • erectile dysfunction: viagra

The API also returns an intensity level for each detected medical term, depending on how the medication is made available:

  • high: the most unsafe level, with medicines that are only available under prescription and for which selling, exchanging or simply giving away in a private circle would be a serious concern.
  • medium: medium level, with medicines that are available either over the counter or under prescription and for which selling, exchanging or giving away in a private circle would be a reasonable concern.
  • low: the lowest level, with medicines only available over the counter, for which selling, exchanging or giving away in a private circle could be a lower concern.
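The intensity level lends itself to a graded policy, similar to the profanity intensities earlier on this page. The mapping below mirrors the list above; the intensity field name on each match is an assumption.

```python
# Severity ranking for the documented intensity levels.
MEDICAL_SEVERITY = {"low": 1, "medium": 2, "high": 3}

def medical_drug_action(matches: list[dict]) -> str:
    """Choose an action from the highest intensity found among the matches.

    Each match is assumed to carry an `intensity` of "low", "medium" or "high".
    """
    highest = max((MEDICAL_SEVERITY.get(m.get("intensity"), 0) for m in matches), default=0)
    if highest == 3:
        return "block"           # prescription-only medication
    if highest == 2:
        return "manual_review"   # over the counter or under prescription
    if highest == 1:
        return "allow_with_note"
    return "allow"
```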

See the documentation to learn more about Medical Drug Detection.

Weapon Detection

This capability is useful for detecting whether user-generated texts (comments, messages, posts, reviews, usernames, etc.) contain words related to weapons.

Detected text items include common names referring to weapons as well as proper names such as weapon brands and weapon models. Here are a few examples of the names of weapons that are detected:

  • common nouns: handgun, rifle, carbine...
  • models: AK-47, HK33, PT 1911...
  • brands: Imbel, Colt, Kalashnikov...

See the documentation to learn more about Weapon Detection.
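In practice, the rule-based categories described on this page can be requested together in a single call. The snippet below assumes a comma-separated categories parameter and a per-category matches array in the response; both are assumptions to verify against the API reference, as are the category identifiers used here.

```python
import requests

API_URL = "https://api.sightengine.com/1.0/text/check.json"  # assumed endpoint

def flagged_categories(text: str, api_user: str, api_secret: str) -> list[str]:
    """Return the rule-based categories that produced at least one match."""
    categories = ["profanity", "personal", "link", "extremism", "drug", "medical", "weapon"]
    response = requests.post(API_URL, data={
        "text": text,
        "lang": "en",
        "mode": "rules",                     # assumed parameter name
        "categories": ",".join(categories),  # assumed parameter name
        "api_user": api_user,
        "api_secret": api_secret,
    })
    response.raise_for_status()
    result = response.json()
    # Assumed response shape: one top-level object per category with a `matches` list.
    return [c for c in categories if result.get(c, {}).get("matches")]
```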
