A Web-Forum Free of Disguised Profanity by Means of Sequence Alignment

Mogollón, Christian; Rojas-Galeano, Sergio A.

Date

2016-06-20

Authors

Mogollón, Christian
Rojas-Galeano, Sergio A.

Publisher

Pontificia Universidad Javeriana

Type

Journal article

ISSN

2011-2769

0123-2126

Abstract

Profanity is the use of offensive, obscene or abusive words or expressions in public conversations. A major source of text conversations today is digital media such as forums, blogs and social networks, where malicious users exploit the worldwide reach of these platforms to spread profanity aimed at insulting or denigrating opinions, names or trademarks. Exact comparison against a lexicon is the most common filtering technique in these media; however, ingenious users disguise profanity by transliterating or masking the original word while still conveying its intended meaning (e.g. by writing piss as P!55 or p.i.s.s), thereby defeating the filter. Recent approaches to this problem, inspired by the sequence alignment methods of comparative genomics in bioinformatics, have shown promise in detecting such disguises. Building upon those results, we have developed an experimental Web forum where user comments are screened for disguised profanity. In this paper we introduce the software (ForumForte) and briefly describe the technique and engineering behind it, along with some empirical evidence of its filtering performance. Our software is open-source under the New BSD License and is available at: http://tinyurl.com/ForumForte.
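To make the idea of alignment-based screening concrete, here is a minimal sketch of how a comment token might be scored against a lexicon entry with a global (Needleman-Wunsch style) alignment. It is an illustration only and not the ForumForte implementation: the look-alike table, the scoring constants, the 0.75 threshold and the function names are all assumptions chosen for this example.

```python
# Hypothetical look-alike table: disguise character -> letter it imitates.
LOOKALIKES = {"!": "i", "1": "i", "5": "s", "$": "s", "0": "o", "@": "a", "3": "e"}

# Assumed scoring scheme (illustrative values, not taken from the paper).
MATCH, MISMATCH, GAP, FILLER_GAP = 2, -1, -2, 0


def char_score(a, b):
    """Score one candidate character against one lexicon character."""
    a, b = a.lower(), b.lower()
    if a == b or LOOKALIKES.get(a) == b:
        return MATCH
    return MISMATCH


def alignment_score(candidate, profanity):
    """Global alignment score of a candidate token against a lexicon word."""
    m = len(profanity)
    # prev[j]: best score aligning the candidate prefix seen so far
    # with profanity[:j]; initially the candidate prefix is empty.
    prev = [j * GAP for j in range(m + 1)]
    for c in candidate:
        # Filler punctuation such as '.' can be skipped at no cost;
        # any other skipped candidate character pays the full gap penalty.
        skip = FILLER_GAP if not c.isalnum() and c not in LOOKALIKES else GAP
        cur = [prev[0] + skip]
        for j in range(1, m + 1):
            cur.append(max(
                prev[j - 1] + char_score(c, profanity[j - 1]),  # align the pair
                prev[j] + skip,                                 # skip candidate char
                cur[j - 1] + GAP,                               # skip lexicon char
            ))
        prev = cur
    return prev[m]


def is_disguised(candidate, profanity, threshold=0.75):
    """Flag the candidate if it reaches a fraction of the perfect score."""
    return alignment_score(candidate, profanity) >= threshold * MATCH * len(profanity)


if __name__ == "__main__":
    for token in ["P!55", "p.i.s.s", "pass"]:
        print(token, alignment_score(token, "piss"), is_disguised(token, "piss"))
```

Under these assumed scores, both disguises from the abstract reach the same score as the undisguised word, while an ordinary word that differs by one letter falls below the threshold. A real filter would need a tuned substitution table and threshold to balance missed disguises against false alarms.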

Link to the resource

http://revistas.javeriana.edu.co/index.php/iyu/article/view/14811

URI

http://hdl.handle.net/10554/25529

Collections

Ingeniería y Universidad

Except where otherwise noted, this document is licensed under Attribution-NonCommercial-NoDerivatives 4.0 International