Python module designed to replace common internet slang and abbreviations with their full forms, enhancing the readability of informal text. It efficiently cleans text data from chats, social media, and online communication. The module also supports tokenization and integrates seamlessly with pandas for batch processing of text in DataFrames.

mrqadeer/internet_words_remover

Internet Words Remover

Internet Words Remover is a Python module that replaces common internet slang and abbreviations with their full forms. It can be used to clean text data containing informal language commonly used in chats, social media, and online communication.

Installation

You can install Internet Words Remover using pip:

pip install internet_words_remover

How to Use

from internet_words_remover import words_remover

text = "OMG! It works! Osm"
cleaned = words_remover(text)  # returns the text with slang expanded
print(cleaned)

Output

oh my god It works! Awesome

Tokenization

If you want the tokens of the given string instead of a single cleaned string, pass is_token=True:

from internet_words_remover import words_remover

text = "OMG! It works! Osm"
cleaned = words_remover(text, is_token=True)  # returns a list of tokens
print(cleaned)

Output

['oh', 'my', 'god', 'It', 'works!', 'Awesome']
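
For readers curious how this kind of replacement can work, here is a minimal sketch of dictionary-based slang expansion. It is not the package's actual implementation: the `SLANG` mapping and the `expand_slang` function are hypothetical names invented for illustration, and the mapping holds only a few entries. It assumes words are split on whitespace and that punctuation attached to a matched slang word is dropped, which is consistent with the outputs shown above.

```python
import string

# Hypothetical mapping for illustration only; the real package ships its own list.
SLANG = {
    "omg": "oh my god",
    "osm": "Awesome",
    "gm": "good morning",
}


def expand_slang(text, is_token=False):
    """Replace known slang words in `text` with their full forms."""
    words = []
    for word in text.split():
        # Look the word up without surrounding punctuation, ignoring case.
        key = word.strip(string.punctuation).lower()
        # Known slang is replaced (its punctuation is dropped);
        # any other word passes through unchanged.
        words.append(SLANG.get(key, word))
    result = " ".join(words)
    # A replacement may contain several words, so split again for tokens.
    return result.split() if is_token else result


print(expand_slang("OMG! It works! Osm"))
print(expand_slang("OMG! It works! Osm", is_token=True))
```

Under these assumptions the sketch prints the same string and token list as the two examples above. Because tokens come from a plain whitespace split, punctuation stays attached to ordinary words such as 'works!'.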

Bonus

It also works on a pandas Series, such as a DataFrame column:

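from internet_words_remover import words_remover
import pandas as pd

data = {
    'Name': ['Qadeer'],
    'Message': ['Hi gm TIL something new. PTL']
}
df = pd.DataFrame(data)
# Series.apply forwards extra keyword arguments to words_remover,
# so each row comes back as a list of tokens.
df['Message'].apply(words_remover, is_token=True)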

Output

['Hi', 'good', 'morning', 'today', 'I', 'learned', 'something', 'new.', 'praise', 'the', 'lord']

Catch me on

Github
LinkedIn

Thanks

Keep Learning and Exploring!
License: MIT
