Empathization refers to an effort that has so far produced two data-science products aimed at fostering empathy and shaping how people regard themselves and each other. The products are informed by user feedback and are meant to mitigate an epidemic -- online gender harassment -- that is part of an even larger issue: global gender inequality that exists online and offline.
The current products revolve around two groups of Twitter users: people who repeatedly send online gender harassment tweets, and people who receive and are affected by such harassment. The products are built upon artificial intelligence (AI) algorithms that learn from humans' detection of gender harassment, and do so in an automated way.
Among tweets the algorithms predict as online gender harassment, the algorithms are 76-80% accurate. The user-facing products have shown the scalable potential to not only detect but take action on over 1 million offensive tweets per week.
Unequal treatment of women cuts across race/ethnicity, nationality, and region. However, the opportunity to elevate gender equality exists through daily interpersonal interactions.
According to the United Nations Human Rights Office of the High Commissioner, gender equality refers to “equal rights, responsibilities and opportunities. . . However, after 60 years, it is clear that it is the human rights of women that we see most widely ignored around the world”. In addition, McKinsey Global Institute estimates that achieving gender equality could add $12+ trillion per year to future global GDP. While both social and financial aspects exist, our efforts revolve around the social aspect.
Given that gender inequality comprises myriad issues, we target our efforts at a specific sub-issue that has reached epidemic proportions: online gender harassment. Such harassment is prevalent (Norton, 2016; Women, Action, & the Media, 2015; Pew Research Center, 2014) across Twitter, Facebook, YouTube, etc. Consequently, many women disengage from social media and share their perspectives less often. Yet, women deserve equal opportunities to contribute online and offline.
Various women granted valuable interviews, helping us understand their personal experiences with Twitter online gender harassment and what they view as potential solutions. For instance, regarding potential solutions, women provided feedback that's been incorporated in our product work in the next section. As for the global problem, some women said they anonymize their usernames to reduce gender-based backlash. Various women said their engagement with Twitter and other social media has declined. And some believe they are treated worse online, where offenders (men and women) reveal their true character. Moreover, people's harassing behavior can bleed into their offline behavior.
Tara Moss (Canadian-Australian author, women's rights advocate, and UNICEF ambassador) explains, “it's feeding into the higher rates of sexual violence and sexual harassment that women are experiencing in the physical world.” Other research uncovers:
The Automated Twitter Bot is the first of our products. This bot, disguised to appear as a young white male, is designed to detect gender harassment tweets and intervene by calling out the offensive language of the tweet in a reply to the offender. This product is designed with the intent to mitigate abusive online behavior at the source.
The AI behind the bot detects gender harassment tweets using an ensemble of eight models: five Gradient Boosting Decision Trees (GBDT), two Feed Forward Neural Networks (FNN), and one Logistic Regression (LR). Tweets are classified based on the average predicted probability of harassment across these models. Our default probability threshold is set at 70%. This threshold was defined through the rigorous process of analyzing almost 20,000 tweets for language specifically indicative of gender harassment.
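The classification rule described above (average the eight models' probabilities, then compare against the 70% threshold) can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the function names and the example probabilities are hypothetical, and the only values taken from the text are the eight-model ensemble and the 0.70 threshold.

```python
from statistics import mean

THRESHOLD = 0.70  # default probability threshold stated in the text


def ensemble_probability(model_probs):
    """Average the per-model predicted probabilities of harassment for one tweet."""
    return mean(model_probs)


def is_harassment(model_probs, threshold=THRESHOLD):
    """Flag a tweet when the averaged probability is at or above the threshold."""
    return ensemble_probability(model_probs) >= threshold


# Hypothetical outputs of the eight models (5 GBDT, 2 FNN, 1 LR) for one tweet.
example_probs = [0.82, 0.77, 0.91, 0.68, 0.74, 0.80, 0.59, 0.73]
avg = ensemble_probability(example_probs)  # roughly 0.755
flagged = is_harassment(example_probs)     # True, since 0.755 >= 0.70
```

Note that two of the hypothetical models fall below 70% on their own, yet the tweet is still flagged because the average is what gets compared to the threshold.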
The method and message of intervention was informed by two studies. One is a study using ReThink, a software product designed to prevent adolescents from sending or posting hurtful messages. The second is an NYU Field Experiment, which addressed racial harassment on Twitter. Both studies found that checking offensive language with a simple message was effective.
(Names and parts of some messages have been blacked out for privacy reasons)
Hypothesis: The number of offensive tweets (tweets with harassment probability of 70% or more) per offender in the treatment group will be lower than the number of offensive tweets per offender in the control group post intervention. The intervention is the response from the bot to the offender.
Setup: Over 25 million tweets were run through our AI ensemble of models to identify about 4K offenders, excluding porn and bot accounts. Each bot is now set up to track around 1.5K offender accounts for offensive tweets.
Randomization: As soon as a tweet from a selected offender is flagged as gender harassment, the offender is randomly placed in treatment or control. If they are placed in treatment, the bot replies to their offensive tweet six minutes later. Offenders in the control group receive no reply.
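The design above can be sketched as follows. This is an illustration under stated assumptions, not the experiment's actual code: the 50/50 assignment probability, the function names, and the use of plain seconds for timestamps are all hypothetical. Only the six-minute delay, the two groups, and the per-offender outcome measure come from the text.

```python
import random

REPLY_DELAY_SECONDS = 6 * 60  # the text states the bot replies six minutes later


def assign_group(offender_id, assignments, rng):
    """Place an offender in 'treatment' or 'control' the first time they are flagged.

    The stored assignment is reused afterwards, so an offender never switches groups.
    Assumption: both groups are equally likely.
    """
    if offender_id not in assignments:
        assignments[offender_id] = rng.choice(["treatment", "control"])
    return assignments[offender_id]


def reply_time_for_first_flag(offender_id, flagged_at, assignments, rng):
    """Return when the bot should reply (in seconds), or None for the control group."""
    group = assign_group(offender_id, assignments, rng)
    if group == "treatment":
        return flagged_at + REPLY_DELAY_SECONDS
    return None


def mean_offensive_tweets(post_counts, assignments, group):
    """Average post-intervention offensive tweets per offender in one group."""
    counts = [post_counts.get(o, 0) for o, g in assignments.items() if g == group]
    return sum(counts) / len(counts) if counts else 0.0
```

The hypothesis is then a comparison of `mean_offensive_tweets(..., "treatment")` against `mean_offensive_tweets(..., "control")`; which statistical test was applied to that difference is not stated in the text.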
The experiment ran April 15-23 (with results posted and presented previously) after a brief pilot study. A new, larger pilot study was run June-July 2017, while a full study ran July-September 2017. The rightmost column will be populated with full study results by December 2017.
The Gender Harassment Tweets Blocker is the second product. In contrast to the first product, women can download this Chrome browser extension to automatically block tweets that the product predicts to be gender harassment. Based on feedback, women have two customizable features to start. One can adjust a setting that automatically hides/removes tweets at their preferred level (e.g., tweets that have 60%+ chance of harassment, 80%+ chance of harassment, etc.). In addition, one can click a button to flag tweets as gender harassment that the browser extension didn't block (similar to clicking spam in email), or flag tweets as not gender harassment (similar to restoring email that went to spam incorrectly).
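The two customizable features can be illustrated with a short Python sketch of the filtering logic. The actual product is a Chrome extension, so this is only a language-neutral illustration: the `Tweet` class, its field names, and the rule that a user's flag overrides the model's prediction are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tweet:
    text: str
    probability: float                # model's predicted probability of harassment
    user_flag: Optional[bool] = None  # True = flagged as harassment, False = flagged as not, None = no flag


def is_hidden(tweet, threshold):
    """Hide a tweet if the user flagged it, or if its probability meets the user's threshold."""
    if tweet.user_flag is not None:
        return tweet.user_flag  # assumption: an explicit user flag overrides the model
    return tweet.probability >= threshold


def visible_tweets(tweets, threshold):
    """Return the tweets that remain on screen at the chosen threshold."""
    return [t for t in tweets if not is_hidden(t, threshold)]


timeline = [
    Tweet("placeholder tweet A", 0.65),
    Tweet("placeholder tweet B", 0.85),
    Tweet("placeholder tweet C", 0.10),
]
shown_at_60 = visible_tweets(timeline, 0.60)  # hides A and B
shown_at_80 = visible_tweets(timeline, 0.80)  # hides only B
```

A lower threshold hides more tweets at the cost of more false alarms; the flag buttons let the user correct either kind of mistake.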
With positive user experience our top priority, we plan to add a complex but essential layer of web security around the product before release. In addition, the product will be accompanied by a 5-minute, step-by-step video tutorial showing how to start using the product quickly. Please stay tuned for December 2017. . . The product has been demonstrated at UC Berkeley and at Google.
Please email us to join other subscribers, share your input, and receive our latest product developments:
We used active learning (machine learning) to collect enough tweets (dated 2017 and earlier) that humans such as Mechanical Turk women workers regard as gender harassment. That allowed our AI models to better learn and predict what humans regard as gender harassment language and symbols.
We found roughly 0.09% (9 out of 10,000) of tweets are harassment. Rather than read through 10,000 tweets to find roughly 9 harassment ones, we used active learning (machine learning) to circumvent that effort. We first labeled 1K tweets gathered via various methods (e.g., Twitter live stream via API, Twitter keyword searches via API, harassment tweets via articles), then used our earliest baseline model (Logistic Regression) to output predicted probabilities on the tweets, before starting the cycle of active learning. Below are actual tweets we presented to an initial audience on 2/15/2017.
[If you prefer not to read what many regard as highly offensive / misogynistic tweets, please bypass the table below, and jump to the next circular diagram.]
With active learning -- iteratively moving back and forth between data collection and machine learning -- we retrained our earliest baseline model, improving its ability to predict probability of harassment on past labeled tweets and new unlabeled tweets. For instance, our earliest model predicted the tweet, “you deserve vagina cancer”, at only 50.6% probability of harassment. As the model learned further, it eventually predicted that tweet with 70%+ probability of harassment. Our active learning process. . .
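One cycle of the back-and-forth between data collection and machine learning can be sketched as below. This is a rough illustration rather than the project's pipeline. The selection rule (send the highest-scoring unlabeled tweets to human labelers) is an assumption, since the text does not say how tweets were chosen in each cycle, and the keyword-based `train_keyword_model` is a toy stand-in for the Logistic Regression baseline. All names and the placeholder tweets are hypothetical.

```python
def train_keyword_model(labeled):
    """Toy stand-in for the baseline model: score = share of words seen in positive tweets."""
    positive_words = {w for text, label in labeled if label for w in text.lower().split()}

    def predict(text):
        words = text.lower().split()
        return sum(w in positive_words for w in words) / len(words) if words else 0.0

    return predict


def active_learning_round(labeled, unlabeled, train, ask_human, batch_size):
    """Train, score the unlabeled pool, and send the top-scoring tweets for human labeling."""
    model = train(labeled)                               # model: tweet text -> probability
    ranked = sorted(unlabeled, key=model, reverse=True)  # assumption: label likely positives first
    batch, remaining = ranked[:batch_size], ranked[batch_size:]
    labeled = labeled + [(tweet, ask_human(tweet)) for tweet in batch]
    return labeled, remaining


def ask_human(tweet):
    """Placeholder for a human labeler (e.g., a Mechanical Turk worker)."""
    return "badword" in tweet


labeled = [("you are badword", True), ("nice weather today", False)]
unlabeled = ["badword again", "lovely day", "you badword are", "weather report"]
labeled, unlabeled = active_learning_round(
    labeled, unlabeled, train_keyword_model, ask_human, batch_size=2
)
```

Because only about 9 in 10,000 tweets are harassment, ranking the pool before labeling concentrates human effort on tweets that are far more likely to be positive than a random draw would be. Repeating the round retrains the model on the enlarged labeled set each time.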
Our data collection process. . .
We chose to leverage and tune three categories of models: Gradient Boosting Decision Trees (GBDTs), Feed Forward Neural Networks (FNNs), and Logistic Regression (LR). And we sought not the best single model but the best combination of models. Our artificial intelligence process. . .
Rather than take a tweet's predicted probability of harassment from one model, we took a tweet's average predicted probability of harassment across multiple models for better reliability. Different models can make different mistakes in predicting the likelihood that tweets are offensive. For instance, for specific tweets, two models might predict low probability of gender harassment incorrectly, whereas six models might predict high probability of gender harassment correctly. By taking their average, the models can compensate for each other.
We used an automated approach that viewed the results of thousands of ensembles (where one ensemble refers to one combination of models). The graph below shows results for three separate combinations of models. The combination in the leftmost column is tied to our user-facing products: the Twitter Bot and Gender Harassment Tweets Blocker. Note that our flexible approach allows one ensemble to be replaced with another if users and stakeholders prefer different performance. [Technical Language (Optional): As an interim step, we trained our final ensemble on the labeled train + validation data, then ran it on the labeled test data once (AUC: 91.3%, Precision: 78.9%, Recall: 32.5%). Then we proceeded to create a sampling distribution of results. Our final ensemble and specific alternative ensembles were eventually retrained on all labeled data (train + validation + test data), before linking our final ensemble to our user-facing products in the wild.]
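An automated search over combinations of models might look like the sketch below. It enumerates every ensemble of a given size, averages the member models' probabilities, and reports precision and recall at the 70% threshold. The model names, the tiny validation set, and the helper functions are hypothetical; how the actual search ranked or filtered its thousands of ensembles is not described in the text.

```python
from itertools import combinations


def precision_recall(probs, labels, threshold=0.70):
    """Precision and recall when tweets at or above the threshold are predicted as harassment."""
    predicted = [p >= threshold for p in probs]
    tp = sum(p and y for p, y in zip(predicted, labels))
    fp = sum(p and not y for p, y in zip(predicted, labels))
    fn = sum((not p) and y for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def evaluate_ensembles(model_probs, labels, size, threshold=0.70):
    """Score every combination of `size` models by the average of their probabilities."""
    results = {}
    for combo in combinations(sorted(model_probs), size):
        columns = zip(*(model_probs[name] for name in combo))
        averaged = [sum(col) / size for col in columns]
        results[combo] = precision_recall(averaged, labels, threshold)
    return results


# Hypothetical validation labels and per-model probabilities for four tweets.
labels = [True, True, False, False]
model_probs = {
    "gbdt": [0.90, 0.60, 0.20, 0.80],
    "fnn": [0.80, 0.90, 0.10, 0.30],
    "lr": [0.70, 0.40, 0.40, 0.75],
}
results = evaluate_ensembles(model_probs, labels, size=2)
```

In this toy data, the `gbdt` model alone would wrongly flag the fourth tweet, but averaging it with `fnn` pulls that tweet below the threshold, which is the compensation effect that motivates ensembling.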
As we collect more labeled tweets via various channels, including via the Gender Harassment Tweets Blocker, our ensemble of models should improve even further.
[Technical Language (Optional): Our final models analyze words, not characters, despite our preference for some models in our ensemble to analyze characters. For instance, we initially analyzed characters as well, leveraging vectorizer "analyzer='char'" with random search across "ngram_range=(1,4)". Some of our initial GBDT models achieved about 95%+ precision and 80% recall. However, GBDT feature importances revealed that some single characters such as " ' " carried too much importance in the predicted probability of harassment, despite limited occurrences. So, we concluded that a dataset larger than 18.8K tweets seems necessary to analyze characters in the future, and that we should not use that character-level method until then. Thus, to be fair and reasonable, we discarded ensembles that use that method and yield better results, and instead selected ensembles that both perform well and should generalize to new tweets in the Twitter universe. As revealed in the graph above, we built a sampling distribution to show not only our averages, but our standard errors around those averages. The small standard errors indicate each ensemble's consistent performance on 15 cross-validation samples of 6K+ tweets. The 15 samples were derived by randomizing the seed for 5 iterations and, within each iteration, implementing 3-fold cross-validation.]
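The sampling scheme described above (5 random seeds, 3 folds each, 15 held-out samples) can be sketched as follows. This is an illustration only. The function names are hypothetical, the standard error is computed as the sample standard deviation divided by the square root of the number of samples (the text does not state the exact formula used), and the three scores at the end are placeholder values, not project results.

```python
import random
from statistics import mean, stdev


def repeated_kfold_indices(n_items, n_folds=3, n_repeats=5, base_seed=0):
    """Yield (seed, fold, test_indices) for n_repeats reshuffles split into n_folds folds."""
    for repeat in range(n_repeats):
        seed = base_seed + repeat
        indices = list(range(n_items))
        random.Random(seed).shuffle(indices)  # a new seed gives a new random partition
        for fold in range(n_folds):
            yield seed, fold, indices[fold::n_folds]


def mean_and_standard_error(scores):
    """Return the sample mean and its standard error (sample std / sqrt(n))."""
    return mean(scores), stdev(scores) / len(scores) ** 0.5


# 18.8K labeled tweets split three ways gives held-out folds of a bit over 6K tweets each.
splits = list(repeated_kfold_indices(18_800))
fold_sizes = [len(test) for _, _, test in splits]

# Placeholder scores for illustration only.
avg, se = mean_and_standard_error([0.5, 0.6, 0.7])
```

In practice each of the 15 held-out folds would be scored by an ensemble trained on the other two folds, and the 15 resulting precision (or recall) values would be passed to `mean_and_standard_error`. Folds from the same shuffle share training data, so the standard error is best read as an approximate consistency measure.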
Our combination of 8 models (5 GBDTs, 2 FNNs, and 1 LR) yielded gender harassment probabilities across a sample of 46.2 million tweets. . .
Implement an existing list of user feedback for the Twitter Bot and Gender Harassment Tweets Blocker
Continue to learn from users on product concepts aimed to mitigate online gender harassment
Reach out to writers and organizations whose gender-harassment research and advocacy have inspired us, to consider partnerships
Create a corresponding tweets blocker for phone and tablet given 82% of Twitter active users are on mobile
Continue field experiments for cause-effect conclusions on product impact
Collect more labeled tweets and/or try other AI methods to further detect and mitigate online gender harassment