Package maleo
Maleo
is wrapper library for text cleansing, preprocessing and POS Tagging in NLP
Overview of features
- Scanner : get insight about your text dataset (ex: number of chars, words, emojis, etc)
- Remove hyperlink, punctuation, stopword, emoticon, etc
- Extract hashtags, price from text
- Convert email, phone number, date to [TAG]
- Convert Indonesian slang to formal word
- Convert emoji to word or [TAG]
- Convert word to number
- Predict Part-of-Speech (POS) tags
Installation
>>> pip install maleo
Getting Started
>>> from maleo.wizard import Wizard
>>> from maleo.pos_tag import POS
>>> wiz = Wizard()
>>> pos = POS()
>>> wiz.scanner(df, 'text')
>>> wiz.emoji_to_word(df.text)
>>> wiz.slang_to_formal(df.text)
>>> pos.predict('saya mau pergi beli makan siang dulu', output_pair=False)
Universal POS tags
https://universaldependencies.org/u/pos/index.html
Contributor:
- Ruben Stefanus
Expand source code Browse git
"""
`Maleo` is wrapper library for text cleansing, preprocessing and POS Tagging in NLP
.. include:: ./documentation.md
"""
Sub-modules
maleo.cleansing
maleo.pos_tag
maleo.preprocessing
maleo.scanner
maleo.stopword_remover
maleo.wizard