Package details

cleanword

nabarupdev0 · ISC · 1.0.0

A lightweight package to detect and filter profanity, especially Indian bad words.

profanity, profane, obscenity, obscene

readme

Cleanword

A simple, fast, and extensible JavaScript package to detect and censor abusive words in multiple Indian and international languages. Useful for chat moderation, content filtering, and building safe online communities.

Features

  • Detects and censors abusive words in Hindi, English, Bengali, Urdu, and more
  • Customizable censorship character (grawlix)
  • Fine-grained control with alwaysAllow and alwaysBlock word lists
  • Easy to use and integrate in Node.js projects

Installation

npm install cleanword

Usage

Example

const { cleanText } = require('cleanword');

const options = {
  language: ['english', 'hindi'],
  grawlixChar: '@',
  alwaysAllow: ['kutto'],
  alwaysBlock: ['test', 'what'],
};
const cleaned = cleanText('This is a test sentence with kutto and what.', options);
console.log(cleaned); // This is a @@@@ sentence with kutto and @@@@.
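
This README documents only cleanText, so a yes/no detection check can be approximated by comparing the input with the cleaned output. The following is a minimal sketch under that assumption: containsAbuse is a hypothetical helper name, not part of cleanword, and the cleaner is passed in as an argument so the helper can be tried with a stand-in function.

```javascript
// Hypothetical helper (not part of cleanword): reports whether cleaning
// changed the text. `clean` is expected to be cleanword's cleanText.
function containsAbuse(text, clean, options = {}) {
  return clean(text, options) !== text;
}

// Stand-in cleaner for demonstration only. In a project, pass cleanText:
//   const { cleanText } = require('cleanword');
const demoClean = (text, { grawlixChar = '*' } = {}) =>
  text.replace(/\bdarn\b/gi, (word) => grawlixChar.repeat(word.length));

console.log(containsAbuse('well, darn it', demoClean)); // true
console.log(containsAbuse('all good here', demoClean)); // false
```

Note that words listed in alwaysBlock also change the output, so with that option set the helper reports blocked words as well as abusive ones.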

TypeScript Example

import { cleanText } from 'cleanword';

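// Local type mirroring the options documented under API; every field is optional.
interface CleanTextOptions {
  language?: string | string[];
  grawlixChar?: string;
  alwaysAllow?: string[];
  alwaysBlock?: string[];
  customAbuseSet?: Set<string>;
}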

const options: CleanTextOptions = {
  language: ['english', 'hindi'],
  grawlixChar: '@',
  alwaysAllow: ['kutto'],
  alwaysBlock: ['test', 'what'],
};

const cleaned: string = cleanText('This is a test sentence with kutto and what.', options);
console.log(cleaned); // This is a @@@@ sentence with kutto and @@@@.

API

cleanText(text, options)

  • text (string): The input string to clean.
  • options (object, optional):
    • language: string | string[] — Language(s) to check (default: 'hindi').
    • grawlixChar: string — Character to use for censorship (default: '*').
    • alwaysAllow: string[] — Words that should never be censored, even if abusive.
    • alwaysBlock: string[] — Words that should always be censored, even if not abusive.
    • customAbuseSet: Set<string> — Custom set of abusive words (for advanced use/testing).

Returns: The cleaned string with abusive words replaced by the grawlix character.
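
To make the interaction between alwaysAllow, alwaysBlock, and the abusive word set concrete, here is a small stand-in written for illustration. It is not the cleanword source. sketchCleanText is a hypothetical name, the built-in language lists are left out, and three behaviors are assumptions: matching is case-insensitive, alwaysAllow wins over alwaysBlock, and each character of a censored word is replaced by one grawlix character (which matches the README example output).

```javascript
// Illustrative stand-in, NOT the cleanword implementation.
function sketchCleanText(text, options = {}) {
  const {
    grawlixChar = '*',            // default taken from the README
    alwaysAllow = [],
    alwaysBlock = [],
    customAbuseSet = new Set(),   // the only source of abusive words in this sketch
  } = options;

  const lower = (words) => new Set([...words].map((w) => w.toLowerCase()));
  const allow = lower(alwaysAllow);
  const block = lower(alwaysBlock);
  const abusive = lower(customAbuseSet);

  // Match runs of letters, combining marks, and digits in any script.
  return text.replace(/[\p{L}\p{M}\p{N}]+/gu, (word) => {
    const key = word.toLowerCase();
    if (allow.has(key)) return word; // assumption: allow list takes priority
    if (block.has(key) || abusive.has(key)) {
      return grawlixChar.repeat([...word].length); // one grawlix per code point
    }
    return word;
  });
}

console.log(
  sketchCleanText('This is a test sentence with kutto and what.', {
    grawlixChar: '@',
    alwaysAllow: ['kutto'],
    alwaysBlock: ['test', 'what'],
    customAbuseSet: new Set(['kutto']),
  })
); // This is a @@@@ sentence with kutto and @@@@.
```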

Config Options

| Option | Type | Description |
| --- | --- | --- |
| language | string / string[] | Languages to check (e.g. 'hindi', 'english', 'bengali', 'urdu') |
| grawlixChar | string | Character to use for censorship (default: '*') |
| alwaysAllow | string[] | Words to never censor |
| alwaysBlock | string[] | Words to always censor |
| customAbuseSet | Set<string> | Custom abusive word set (advanced/testing) |
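
When word lists come from user input or a config file, it can help to tidy them before passing them on. The sketch below assembles an options object with the keys from the table. buildOptions and its parameter names (allow, block, extraWords) are hypothetical and not part of cleanword. The defaults ('hindi' and '*') are the ones stated in this README. Whether customAbuseSet replaces or extends the built-in lists is not stated here, so check that before relying on it.

```javascript
// Hypothetical helper (not part of cleanword): builds an options object
// with the documented keys from loosely formatted input.
function buildOptions({
  language = 'hindi',
  grawlixChar = '*',
  allow = [],
  block = [],
  extraWords = [],
} = {}) {
  // Trim entries, drop empty strings, and remove duplicates.
  const tidy = (words) => [...new Set(words.map((w) => w.trim()).filter(Boolean))];

  const options = {
    language: Array.isArray(language) ? language : [language],
    grawlixChar,
    alwaysAllow: tidy(allow),
    alwaysBlock: tidy(block),
  };
  // Only set customAbuseSet when extra words were actually supplied.
  const extras = tidy(extraWords);
  if (extras.length > 0) options.customAbuseSet = new Set(extras);
  return options;
}

const options = buildOptions({ language: 'english', block: [' spam ', 'spam', ''] });
console.log(options.language);    // [ 'english' ]
console.log(options.alwaysBlock); // [ 'spam' ]
```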

Supported Languages

  • Hindi
  • English
  • Assamese
  • Bengali
  • Bhojpuri
  • Marathi
  • Chhattisgarhi
  • Gujarati
  • Haryanvi
  • Kannada
  • Kashmiri
  • Konkani
  • Ladakhi
  • Malayalam
  • Manipuri
  • Marwari
  • Nepali
  • Odia
  • Punjabi
  • Rajasthani
  • Tamil
  • Telugu
  • Urdu

You can specify one or more languages using the language option. Example:

cleanText('some text', { language: ['hindi', 'english'] });
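
A small guard can catch misspelled language names before they reach cleanText. The sketch below assumes that each language is identified by its lowercase English name. The README's examples confirm this only for 'hindi', 'english', 'bengali', and 'urdu'; the other identifiers are inferred from the list above and may differ in the package. LISTED_LANGUAGES and pickLanguages are hypothetical names.

```javascript
// Lowercased names from the Supported Languages list (identifiers assumed).
const LISTED_LANGUAGES = [
  'hindi', 'english', 'assamese', 'bengali', 'bhojpuri', 'marathi',
  'chhattisgarhi', 'gujarati', 'haryanvi', 'kannada', 'kashmiri', 'konkani',
  'ladakhi', 'malayalam', 'manipuri', 'marwari', 'nepali', 'odia', 'punjabi',
  'rajasthani', 'tamil', 'telugu', 'urdu',
];

// Hypothetical helper: splits requested names into listed and unlisted ones.
function pickLanguages(requested) {
  const names = (Array.isArray(requested) ? requested : [requested])
    .map((name) => String(name).trim().toLowerCase());
  return {
    known: names.filter((name) => LISTED_LANGUAGES.includes(name)),
    unknown: names.filter((name) => !LISTED_LANGUAGES.includes(name)),
  };
}

const picked = pickLanguages(['Hindi', 'english', 'klingon']);
console.log(picked.known);   // [ 'hindi', 'english' ]
console.log(picked.unknown); // [ 'klingon' ]
```

The known array can then be passed as the language option, and unknown can be logged or rejected, depending on how strict the caller wants to be.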

Contributing

  1. Fork this repository and clone your fork.
  2. Install dependencies:
    npm install
    
  3. Add or improve abusive word lists in src/abuse_words.js.
  4. Add or update tests in Test/cleanText.test.js.
  5. Run tests:
    npm test
    
  6. Submit a pull request with a clear description of your changes.

Guidelines:

  • Please be respectful and avoid adding non-abusive or irrelevant words.
  • Keep word lists accurate and up-to-date for each language.
  • Add tests for any new features or language support.

Author

Developed with ❤️ by Nabarup

If you find this package useful, ⭐ star the repo and share it!

License

MIT © 2025 Nabarup.
Use freely. Contribute with respect.


Feedback & Contact

For feature requests, feedback, or bug reports, open an issue or email me at nabaruproy.dev@gmail.com.