Machine learning models/Proposed/Tone Check
This model card page currently has draft status. It is a piece of model documentation that is in the process of being written. Once the model card is completed, this template should be removed.
Model card
This page is an on-wiki machine learning model card. A model card is a document about a machine learning model that seeks to answer basic questions about the model.

Model Information Hub
Model creator(s) | Diego Sáez Trumper |
Model owner(s) | Diego Sáez Trumper |
Model interface | Coming soon |
Code | Coming soon |
Uses PII | No |
In production? | Yes |
Which projects? | Tone Check |
This model uses the page title and edit text to predict the likelihood that a new edit to an article contains a tone violation.
Motivation
The model was developed to support Wikipedia’s commitment to a neutral point of view, especially in contributions from newer editors who may be less familiar with the platform’s content policies. Initially aimed at detecting “peacock language” (overly promotional or flattering phrases), the scope of the model expanded to address a broader category of tone issues, including language that may come across as promotional, subjective, or derogatory.
This model powers Tone Check, an edit check that prompts editors to consider neutralizing their language when their edits include problematic tone. By surfacing tone-related suggestions at the point of editing, this model helps contributors, especially newcomers, improve the quality and neutrality of their contributions, making it easier to participate in Wikipedia in a productive and policy-aligned way.
Users and uses
- Predicting (with an associated probability) whether a given sentence or paragraph of human-written content has a tone that is promotional, derogatory, or otherwise subjective
- Providing suggestions to editors to improve the tone of an edit or published article
- As with any AI/ML model, we recommend keeping humans in the loop when applying this model in a new setting
- We don't recommend using this model's predictions as training data for other ML models
- Currently used in production by the Tone Check feature
Ethical considerations, caveats, and recommendations
This model relies on Multilingual BERT, a pretrained language model that might contain certain biases.
Model
Tone Check leverages a Small Language Model (SLM) to detect the presence of promotional, derogatory, or otherwise subjective language. The SLM we use is a BERT model, which is open source and whose weights are openly available.
The model is fine-tuned on examples of Wikipedia revisions. It learns from instances where experienced editors have applied a specific template ("peacock") to flag tone violations, as well as instances where that template was removed. This process teaches the BERT model to identify patterns associated with appropriate and inappropriate tone based on Wikipedia's editorial standards. Under the hood, the model transforms text into high-dimensional vectors; during fine-tuning, a classification layer learns a decision boundary (a hyperplane) in that vector space that separates positive cases from negative ones.
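To make this concrete, here is a minimal fine-tuning sketch using the Hugging Face transformers library. The checkpoint name and the toy examples are illustrative assumptions, not the exact checkpoint or data used in production.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: a standard multilingual BERT checkpoint; the production
# checkpoint may differ.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2  # 0 = neutral, 1 = tone violation
)

texts = [
    "She is a visionary, world-renowned pioneer of the field.",  # peacock-style text
    "She served as department chair from 2005 to 2012.",         # neutral text
]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**batch, labels=labels)  # returns loss and logits
outputs.loss.backward()  # one gradient step of an ordinary training loop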
The model was trained using 20,000 data points from 10 languages, consisting of:
- Positive examples: Revisions on Wikipedia that were marked with the "peacock" template, indicating a tone policy violation.
- Negative examples: Revisions where the "peacock" template had been removed (signifying no policy violation).
The languages covered are:
Language | wiki_db | lang_code |
---|---|---|
English | enwiki | en |
French | frwiki | fr |
Arabic | arwiki | ar |
Russian | ruwiki | ru |
Spanish | eswiki | es |
Japanese | jawiki | ja |
Dutch | nlwiki | nl |
Portuguese | ptwiki | pt |
Chinese | zhwiki | zh |
German | dewiki | de |
Small Language Models (like the one used for Tone Check) differ from Large Language Models (LLMs) in that the former are trained to adapt to particular use cases by learning from a focused dataset. In the case of Tone Check, this means the SLM learns directly from the expertise of experienced Wikipedia volunteers. Hence, SLMs offer more explainability and flexibility than LLMs, and they require significantly fewer computational resources than their larger counterparts.
LLMs, on the other hand, are designed for general-purpose use, typically accessed with limited context through a chat or prompting interface. They require a huge amount of computational resources, and their behavior is difficult to explain due to the large number of parameters involved.
Performance
wiki_db | precision | f1 | recall | accuracy |
---|---|---|---|---|
nlwiki | 0.763 | 0.556 | 0.438 | 0.651 |
frwiki | 0.706 | 0.581 | 0.494 | 0.644 |
ptwiki | 0.694 | 0.555 | 0.462 | 0.629 |
enwiki | 0.668 | 0.519 | 0.424 | 0.607 |
ruwiki | 0.665 | 0.509 | 0.412 | 0.602 |
eswiki | 0.650 | 0.529 | 0.446 | 0.603 |
dewiki | 0.647 | 0.489 | 0.394 | 0.589 |
zhwiki | 0.631 | 0.465 | 0.368 | 0.576 |
arwiki | 0.617 | 0.634 | 0.651 | 0.624 |
jawiki | 0.597 | 0.393 | 0.292 | 0.547 |
Implementation
The information provided to the BERT model is the article’s title, language, and content:
{article_title} <SEP> {language} <SEP> {article_content}
This is the original model input format; see below for how to invoke the model on LiftWing.
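For illustration, a small helper (hypothetical, not part of the released code) that assembles this input string might look like:

def build_model_input(article_title: str, language: str, article_content: str) -> str:
    # Join the fields with the <SEP> separator shown above.
    return f"{article_title} <SEP> {language} <SEP> {article_content}"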
{
"prediction": true,
"probability": 0.831,
}
This is the original model output; the section below shows how to call the model on LiftWing.
This is the format for calling the model on LiftWing:
curl https://api.wikimedia.org/service/lw/inference/v1/models/edit-check:predict \
  -X POST \
  -d '{"instances": [{"lang": "en", "check_type": "tone", "page_title": "this is a test", "original_text": "text", "modified_text": "this is a great example of work"}]}'
{
"predictions": [
{
"status_code": 200,
"model_name": "edit-check",
"model_version": "v1",
"check_type": "tone",
"language": "en",
"page_title": "this is a test",
"prediction": true,
"probability": 0.831,
"details": {}
}
]
}
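The same call can be made from Python. The sketch below mirrors the curl example using the requests library; sending the request without authentication headers is an assumption and may not hold for all access levels.

import requests

response = requests.post(
    "https://api.wikimedia.org/service/lw/inference/v1/models/edit-check:predict",
    json={
        "instances": [
            {
                "lang": "en",
                "check_type": "tone",
                "page_title": "this is a test",
                "original_text": "text",
                "modified_text": "this is a great example of work",
            }
        ]
    },
    timeout=30,
)
response.raise_for_status()

# Print the boolean prediction and its probability for each instance.
for prediction in response.json()["predictions"]:
    print(prediction["prediction"], prediction["probability"])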
Data
We use templates as labels; in this case, the Peacock Language template was used. We collect pairs of positive/negative samples for each template, where a positive sample corresponds to the revision that adds the template, and a negative sample to the first revision where the template is removed.
More details can be found here.
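As an illustration, the pairing logic described above could be sketched as follows; the function and the template-detection test are simplified assumptions, not the actual pipeline.

def collect_pairs(revisions):
    """revisions: chronologically ordered (rev_id, wikitext) tuples for one article."""
    pairs = []
    open_positive = None
    for rev_id, text in revisions:
        has_template = "{{peacock" in text.lower()
        if has_template and open_positive is None:
            # Template added: this revision is a positive (violation) sample.
            open_positive = (rev_id, text)
        elif not has_template and open_positive is not None:
            # First revision with the template removed: the negative sample of the pair.
            pairs.append((open_positive, (rev_id, text)))
            open_positive = None
    return pairs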
We collect this data for all languages covered by the AYA 23 model, then exclude all languages with fewer than 1K pairs; 10 languages remain after this step.
We sample 1k pairs for each remaining language as evaluation data, yielding 20k evaluation samples in total (10 languages × 1k pairs, with each pair contributing one positive and one negative sample).
Licenses
- Code: Apache 2.0 License
- Model: Apache 2.0 License
Citation
Cite this model as:
@misc{Tone_Check,
title={Tone Check},
author={Saez-Trumper, Diego and Baigutanova, Aitolkyn and Chou, Aiko and Aslam, Muniza},
year={2024},
url={https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Tone_Check}
}