
Social Sentinel Blog

Alert Identification Series (Part 1): The Complexity of Harmful Language

[Image: various sentences diagrammed on a blackboard]
Summary: For native English speakers, it’s easy to overlook how complicated the English language can be, especially when you’re trying to glean intent from an unruly linguistic landscape like social media. The first article in this ten-part series begins to explore the complexity of the English language through the lens of artificial intelligence.

The Alert Identification 10-part blog series provides insight into the inner workings of our social media scanning product. Brought to you by our Data Science team.

Imagine you’re in charge of reviewing social media posts as part of your school’s safety plan and find the following two posts:

“My boyfriend is going to kill me.”

“My hangnail is going to kill me.”

You easily categorize one as being potentially harmful and requiring further contextual research. The other is heavy with hyperbole, lacking any legitimate indication of harm. Now imagine doing that a billion times a day.

That’s where artificial intelligence and machine learning take over to make the process simple and our lives more efficient. Our platform’s logic processes and understands the example posts with a more systematic approach:

  • future tense
    The author is writing about something that will happen

  • negative sentiment
    The overall emotional tone of the words in the sentence is unhappy

  • identical structure
    The sentences share the same construction and lexical ambiguity; specifically, the meaning of “kill me” depends entirely on the object

My [object] is going to kill me
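As a toy illustration (not the platform’s actual logic), the template and its future-tense marker can be captured with a simple pattern match that pulls out the object:

```python
import re
from typing import Optional

# Hypothetical pattern for the template "My [object] is going to kill me".
# "is going to" marks the future tense; the named group captures the object.
TEMPLATE = re.compile(r"^my (?P<object>.+?) is going to kill me\b", re.IGNORECASE)

def extract_object(post: str) -> Optional[str]:
    """Return the object noun phrase if the post matches the template."""
    match = TEMPLATE.match(post.strip())
    return match.group("object") if match else None

print(extract_object("My boyfriend is going to kill me"))  # boyfriend
print(extract_object("My hangnail is going to kill me"))   # hangnail
```

Both example posts match the same structure; only the captured object differs, which is exactly where the ambiguity lives.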

Let’s dive a little deeper to see how much this single sentence structure can vary. By changing the object, we greatly alter the potential harm it communicates.  

[Image: sentence structure example based on “My [object] is going to kill me”]

What if the object is:

  • a human
    In this case, the post may be harmful, although it is not guaranteed. Further context is required to determine the appropriate follow-up.

  • a disease or wellness issue
    For example, if the object noun were depression or anorexia, the post may require further attention.

  • a wide variety of object nouns that indicate hyperbole
    For example, cat, exam, period, vacation, or dentist. Our brains immediately process the exaggeration, so the post reads as humorous rather than potentially dangerous.
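The branching above can be sketched as a toy rule table. The word lists and decision labels below are invented for illustration; a real system would use far richer resources (named-entity recognition, wellness ontologies, learned classifiers) than a handful of lexicons:

```python
# Toy lexicons -- invented examples, not the product's actual vocabulary.
HUMAN_NOUNS = {"boyfriend", "girlfriend", "brother", "neighbor"}
WELLNESS_NOUNS = {"depression", "anorexia", "anxiety"}
HYPERBOLE_NOUNS = {"cat", "exam", "period", "vacation", "dentist", "hangnail"}

def triage(obj: str) -> str:
    """Map the extracted object noun to a follow-up decision."""
    obj = obj.lower()
    if obj in HUMAN_NOUNS:
        return "needs further context"    # may be harmful, but not guaranteed
    if obj in WELLNESS_NOUNS:
        return "needs further attention"  # possible wellness concern
    if obj in HYPERBOLE_NOUNS:
        return "hyperbole"                # reads as humorous, not dangerous
    return "unknown -- escalate"          # unseen objects default to review

print(triage("boyfriend"))  # needs further context
print(triage("hangnail"))   # hyperbole
```

Note the default branch: an object the rules have never seen is escalated rather than dismissed, which previews the fail-safe bias discussed below.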

A.I. vs. H.I.

The nuance contained in the structure above shows the difficulty of identifying harmful language. While our machine learning models are innovative, robust, continually updated, and a whole list of other superlatives that inflate the robots’ digital ego, they’ll never be as intelligent as a human.

We weigh heavily the risk of misidentifying a true Actionable Alert as a false positive. If our machine learning models cannot confidently identify a post as a false positive, the system is programmed to send the alert. This practice lowers overall precision, but considering the possible repercussions of missing an alert, it is necessary. (A future article will describe this feature in more detail.)
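In pseudocode terms, that fail-safe might look like the sketch below. The threshold value and function name are assumptions for illustration, not details of the actual system:

```python
FALSE_POSITIVE_THRESHOLD = 0.95  # hypothetical confidence cutoff

def should_send_alert(p_false_positive: float) -> bool:
    """Send the alert unless the model is highly confident it's a false positive.

    Biasing toward sending trades precision for recall: some harmless posts
    reach human reviewers, but genuinely harmful ones are far less likely
    to be missed.
    """
    return p_false_positive < FALSE_POSITIVE_THRESHOLD

print(should_send_alert(0.60))  # True  -> sent for human review
print(should_send_alert(0.99))  # False -> confidently suppressed
```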

Machine learning substantially reduces the number of posts our users have to sift through: approximately 1 out of every 1,000 associated posts is sent as an Actionable Alert, at which point human intelligence takes over as the final evaluator of the potentially harmful content in context.

Only the user knows whether a student who posts “I am going to bomb this election” fears they’ll fail their run for Student Council President, or something much more dangerous. Language is incredibly complex and constantly changes, but Social Sentinel’s Data Science team can help you focus on the posts that matter most to your schools’ safety and wellness climate.

Next in our series: Keeping Up With Slang. Stay tuned!


Key takeaways:

  • The complexity of language makes understanding meaning, especially in digital conversations, very difficult.
  • Two social media posts can be identical in structure, yet communicate two very different things.
  • Machine learning and artificial intelligence are highly effective partners when identifying potentially harmful language.
  • In the end, though, it’s human intelligence that ultimately knows whether an alert is actionable or a false positive.