Introducing Hatebase: the world’s largest online database of hate speech

March 25, 2013

Predicting genocide is, by definition, an almost impossible task due to the scarcity of early, actionable data. There’s no chi-squared test or Monte Carlo method for reliably distributing societies along a spectrum from homogeneous to homicidal, both because the extermination of entire populations has become a relatively rare occurrence (thanks to the ever-increasing internationalization of human rights, law, media, and trade) and because those societies which do succeed at systematized annihilation are often equally resourceful at hiding evidence of their crimes.

In the information-rich twenty-first century, good data remains the Achilles’ heel of genocide studies.

At the Sentinel Project for Genocide Prevention, we’re tackling this problem on two fronts. First, in order to improve our data intake we’ve begun to engage in direct field work through our situations of concern (SOCs). Earlier this month, staff from the Sentinel Project were in Kenya during the contested presidential elections, monitoring tensions in urban hubs such as Nairobi and Mombasa as well as in known regional conflict zones such as the Tana River District.

Our second strategy has been to improve the tools with which we parse and prioritize data, whether from the field, from mainstream media or from social networks. To this end, the Sentinel Project recently partnered with my own organization, Mobiocracy, on the development of Hatebase, an authoritative, multilingual, usage-based repository of structured hate speech which data-driven NGOs can use to better contextualize conversations from known conflict zones.

Hatebase is available to casual users through a Wikipedia-like web interface, and to developers through an authenticating API. Although the core of Hatebase is its community-edited vocabulary of multilingual hate speech, a critical concept in Hatebase is regionality: users can associate hate speech with geography, thus building a parallel dataset of “sightings” which can be monitored for frequency, localization, migration, and transformation.

For instance, an organization monitoring several simultaneous theaters of operation might integrate location-based Hatebase data into its monitoring software to assign additional real-time “weight” to specific conflict zones, providing guidance on how to best redeploy limited resources. For genocide monitoring organizations in particular, regional hate speech is a widely recognized indicator of elevated risk.
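As a rough illustration of that weighting idea, the sketch below counts recent sightings per region and turns the counts into relative weights. It is not Hatebase's API or data model: the `Sighting` record, the `region_weights` function, the term and region names, and the dates are all hypothetical, and a real monitoring tool would likely account for population, term severity, and reporting bias as well.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sighting:
    """One hypothetical record: a vocabulary term observed in a region on a day."""
    term: str
    region: str
    seen_on: date


def region_weights(sightings, since):
    """Return each region's share of the sightings recorded on or after `since`."""
    counts = Counter(s.region for s in sightings if s.seen_on >= since)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {region: n / total for region, n in counts.items()}


# Made-up sample data, for illustration only.
sightings = [
    Sighting("term_a", "Region X", date(2013, 3, 1)),
    Sighting("term_a", "Region X", date(2013, 3, 4)),
    Sighting("term_b", "Region X", date(2013, 3, 5)),
    Sighting("term_a", "Region Y", date(2013, 3, 6)),
    Sighting("term_b", "Region Y", date(2013, 1, 10)),  # outside the window
]

weights = region_weights(sightings, since=date(2013, 3, 1))
for region, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{region}: {weight:.2f}")
```

With this sample data, three of the four in-window sightings fall in Region X, so it receives the larger weight.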

There are some weaknesses implicit in a solely vocabulary-based approach to linguistic analysis. Innocuous language, when localized, can adopt a sinister secondary meaning (e.g. “cockroaches,” meaning Tutsis in Rwanda), and threats can be communicated without the need for easily identified keywords (“their days are numbered”). Despite these limitations, Hatebase can provide a layer of relevance which complements other context-based information sources, not unlike traffic congestion layered onto a city map.
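Both weaknesses show up in even the simplest keyword matcher. The sketch below is a hypothetical whole-word lookup, not how Hatebase itself processes text; the one-entry `VOCABULARY` set and the `flag_terms` function are assumptions made for the example.

```python
import re

# Placeholder vocabulary; a real list would come from a curated source.
VOCABULARY = {"cockroaches"}


def flag_terms(text, vocabulary=VOCABULARY):
    """Return the vocabulary terms that appear as whole words in `text`."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(words & vocabulary)


# An innocuous sentence is flagged, because the matcher has no sense of context.
print(flag_terms("Cockroaches were found in the kitchen."))

# A veiled threat passes unflagged, because it contains no listed keyword.
print(flag_terms("Their days are numbered."))
```

The first call returns the listed term even though the sentence is harmless, and the second returns nothing even though the sentence is a threat, which is why vocabulary matches are better treated as one signal among several than as a verdict.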

In the months ahead, we’ll be adding further data attributes, visualizations, and end-user functionality to Hatebase, with a particular focus on strengthening the API in accordance with our commitment to partnership-based innovation. Our hope is that other individuals, groups and organizations will embrace this collaborative model by leveraging Hatebase data in their own applications.

hatebase.org


How can we work together?
Posted by Dr.kamal kumar saha | March 26, 2013 @ 6:32 am
This is a very good and intelligent innovation. It is a good database for geographic information on hate speech, and could be used as an early warning mechanism in violence prevention and peacebuilding. Thanks a lot; I am in.
Posted by Dr. Don John Omale | March 26, 2013 @ 1:48 pm
keep moving.
Posted by naftali mutahi | March 27, 2013 @ 3:29 pm
This posting is under a sub-heading of Kenya. Is Hatebase to be a general international database or, at least at the present time, focused on Kenya?
Posted by Marc Altman | March 27, 2013 @ 4:47 pm
Hi Marc - Hatebase has a general international scope. We may have mis-tagged it, I will take a look. Thanks for the comment! -Taneem
Posted by Sentinel Project | March 28, 2013 @ 4:36 pm
[...] A new database project attempts to identify impending genocide by spotting key textual indicators.  It’s crowdsourced, called Hatebase, and a co-sponsor describes it like so: [...]
Posted by Text analysis to anticipate genocide | Scanning for Futures | April 6, 2013 @ 6:22 pm
[...] A new database project attempts to identify impending genocide by spotting key textual indicators.  It’s crowdsourced, called Hatebase, and a co-sponsor describes it like so: [...]
Posted by Anticipating the future through vocabulary | Bryan Alexander | April 6, 2013 @ 6:38 pm
[...] Launched on 25 March, the database is still in its early stages, but the developers say that further functionality will be added in the coming months. In the future Hatebase may become a valuable tool for NGOs trawling through vast amounts of online communication, providing “a layer of relevance which complements other context-based information sources, not unlike traffic congestion layered onto a city map.” [...]
Posted by #breakingnews Crowdsourced hate speech database could spot early signs of genocide | | April 7, 2013 @ 8:30 pm
[...] Launched on 25 March, the database is still in its early stages, but the developers say that further functionality will be added in the coming months. In the future Hatebase may become a valuable tool for NGOs trawling through vast amounts of online communication, providing “a layer of relevance which complements other context-based information sources, not unlike traffic congestion layered onto a city map”. [...]
Posted by Crowdsourced hate speech database could spot early signs of genocide | umuvugizi | April 8, 2013 @ 7:13 am
[...] Launched on 25 March, the database is still in its early stages, but the developers say that further functionality will be added in the coming months. In the future Hatebase may become a valuable tool for NGOs trawling through vast amounts of online communication, providing “a layer of relevance which complements other context-based information sources, not unlike traffic congestion layered onto a city map”. [...]
Posted by Crowdsourced hate speech database could spot early signs of genocide | April 8, 2013 @ 5:54 pm
[...] few weeks ago, The Sentinel Project for Genocide Prevention announced the launch of Hatebase, a web- and API-based data platform for tracking regionalized hate [...]
Posted by Hatebase: seeking NGO partners for open hate speech data | appvocacy | April 16, 2013 @ 12:45 pm
I forgot to tell you: 1. I shall be presenting a paper at the conference of IAGS (International Association of Genocide Scholars) this coming June in Siena, Italy; hope to see any of you there. 2. I happen to be a survivor of Genocide-Holocaust myself. Olek Nezer, PhD
Posted by Olek Netzer | April 17, 2013 @ 6:44 am
There is an ongoing issue in Tibet.... hundreds of people immolating themselves in protest over the 50+ year human rights violations on the part of the Chinese Communist Party. I'm a little surprised to NOT see Tibet listed in your "SITUATIONS OF CONCERN".
Posted by Gerry | April 27, 2013 @ 7:27 pm
[...] users through a Wikipedia-like web interface, and to developers through an authenticating API,” said Sentinel’s ICT advisor Timothy Quinn.  “Although the core of Hatebase is its community-edited vocabulary of multilingual hate speech, [...]
Posted by Mapping Hate Speech and Human Rights Abuses | Global Wire Associates | May 13, 2013 @ 12:03 pm
[...] that it can make a difference. Read the article (in English). In addition, an article by the NGO Sentinel Project presenting its HateBase database, and an article on the normalization of racist remarks on the Internet. Read on Business Analytics [...]
Posted by Une base de données pour détecter les risques de génocides « Analyse « Business-analytics-info.fr | June 11, 2013 @ 7:08 pm
[...] Launched on 25 March, the database is still in its early stages, but the developers say that further functionality will be added in the coming months. In the future Hatebase may become a valuable tool for NGOs trawling through vast amounts of online communication, providing “a layer of relevance which complements other context-based information sources, not unlike traffic congestion layered onto a city map”. [...]
Posted by Crowdsourced hate speech database could spot early signs of genocide | Dr Ko Ko Gyi’s Blog | July 24, 2013 @ 8:02 am
[...] is the world’s largest online database of hate speech launched in March. On top of being a catalog of hate speech terms, it also tracks usage of hate speech, either [...]
Posted by Learn how to access the Sentinel Project’s open data with our APIs | July 30, 2013 @ 3:16 pm
[…] more information, read about their work on The Sentinel Project and follow them on twitter […]
Posted by Rising Voices » Say What?! Website Collects Hate Speech for Genocide Prevention | December 18, 2013 @ 9:01 am
[…] information and more about their work can be found at The Sentinel Project [en]. Follow them on Twitter […]
Posted by Say What?! Webseite sammelt Hassrede zur Genozidprävention · Global Voices auf Deutsch | January 8, 2014 @ 10:19 am
[…] more information, read about Hatebase’s work at The Sentinel Project [en] and follow them on Twitter […]
Posted by Hatebase: Uma rede que reúne discursos de ódio para prevenir o genocídio · Global Voices em Português | January 17, 2014 @ 10:11 am
