|Emergency robot shutoff button|
|Administrators: Use this button if the bot is malfunctioning.
Non-administrators can report misbehaving bots to Wikipedia:Administrators' noticeboard.
|ChenzwBot (Talk · Contribs)|
ChenzwBot patrols the sea of recent changes.
|Flagged?||Yes (11 April 2008)|
|Edit rate:||Variable (Anti-vandalism)|
|Automatic or manual?||Automatic|
|Programming language/s:||PHP and Python|
|Source code published?||https://gitlab.com/antivandalbot-ng (partial)|
- Since 2010: Anti-vandalism task begins, using Chris G Bot's code. Vandalism detection was achieved by evaluating edits against regular expressions, an approach prone to low detection rates and high false-positive rates.
- Approximately Dec 2015: Bot begins using the revscoring library (which powers ORES) to extract features about each edit (e.g. the number of characters added or removed). Vandalism probability is predicted by a random forest classifier.
- Mid-2016: Bot core rewritten in line with the reactor design pattern.
- 7 May 2018: Bot core rewritten (again) in Python, which is vastly more efficient than the original PHP implementation. Classifier changed to XGBoost.
- Mid-October 2019: Classifier changed to LightGBM, with substantial improvements to how each diff is evaluated. Words added by editors are transformed to tf–idf vectors and fed into a separate Bayesian classifier. These words are also tagged by part of speech (noun, verb, pronoun, etc.), and the counts for each category become new inputs to the main LightGBM classifier.
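The text stage described in the last entry can be illustrated with a small standard-library sketch. This is not the bot's code. It assumes a multinomial naive Bayes model stands in for the "Bayesian classifier", that added words are found with a whitespace-token diff, and that the training edits and the labels "good" and "bad" are invented toy data. All function and class names are hypothetical. Part-of-speech tagging and the main LightGBM classifier are left out.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def added_words(old_text, new_text):
    """Return words that appear in inserted or replaced spans of the new revision."""
    old_tokens, new_tokens = old_text.lower().split(), new_text.lower().split()
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    words = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            words.extend(new_tokens[j1:j2])
    return words


def fit_idf(documents):
    """Smoothed inverse document frequency over a list of token lists."""
    n_docs = len(documents)
    doc_freq = Counter(word for doc in documents for word in set(doc))
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def tfidf(tokens, idf):
    """Map tokens to a sparse tf-idf vector; words unseen in training are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    return {w: c * idf[w] for w, c in counts.items()}


class NaiveBayes:
    """Multinomial naive Bayes over tf-idf weights, with add-one smoothing."""

    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        self.vocab = {w for vec in vectors for w in vec}
        self.prior = {c: math.log(labels.count(c) / len(labels)) for c in self.classes}
        self.weight = {c: Counter() for c in self.classes}
        for vec, label in zip(vectors, labels):
            self.weight[label].update(vec)  # accumulate tf-idf mass per class
        self.total = {c: sum(self.weight[c].values()) for c in self.classes}
        return self

    def predict_proba(self, vec):
        log_post = {}
        for c in self.classes:
            denom = self.total[c] + len(self.vocab)
            log_post[c] = self.prior[c] + sum(
                x * math.log((self.weight[c][w] + 1.0) / denom) for w, x in vec.items()
            )
        peak = max(log_post.values())  # subtract the max for numerical stability
        exp = {c: math.exp(lp - peak) for c, lp in log_post.items()}
        norm = sum(exp.values())
        return {c: e / norm for c, e in exp.items()}


# Toy training data (invented for illustration only).
train = [
    ("the city was founded in 1850", "good"),
    ("population grew after the railway opened", "good"),
    ("you are stupid lol", "bad"),
    ("lol this page is stupid", "bad"),
]
docs = [text.split() for text, _ in train]
idf = fit_idf(docs)
model = NaiveBayes().fit([tfidf(d, idf) for d in docs], [label for _, label in train])

words = added_words("The city was founded in 1850.",
                    "The city was founded in 1850. lol stupid")
proba = model.predict_proba(tfidf(words, idf))
score = proba["bad"]  # could serve as one input feature to a downstream classifier
```

In a pipeline like the one described, a score of this kind would be one feature among many (alongside the part-of-speech counts) passed to the main classifier, not a verdict on its own.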
Summary of algorithm
- Write-up in progress.