Data Science by ODS.ai 🦜
We Have Published a Model For Text Repunctuation and Recapitalization
The model works with SINGLE sentences (albeit long ones) and:
- Inserts capital letters and basic punctuation marks (dot, comma, hyphen, question mark, exclamation mark, dash for Russian);
- Works for 4 languages (Russian, English, German, Spanish) and can be extended;
- By design is domain agnostic and is not based on any hard-coded rules;
- Has non-trivial metrics and succeeds in the task of improving text readability.
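
To make the task concrete, here is a minimal sketch of the inverse operation: turning a normal sentence into the kind of lowercase, unpunctuated input that a repunctuation model is meant to restore. It does not call the model. The function name `strip_enhancement` and the choice to replace hyphens and dashes with spaces are assumptions made for illustration, not part of the published code.

```python
# Marks the post says the model restores: dot, comma, hyphen, question mark,
# exclamation mark, and the long dash used in Russian (U+2014).
RESTORED_MARKS = ".,-?!\u2014"


def strip_enhancement(sentence: str) -> str:
    """Lowercase a sentence and drop the marks listed in RESTORED_MARKS.

    Hypothetical helper: each mark is replaced by a space (an assumption),
    then whitespace runs are collapsed to single spaces.
    """
    table = str.maketrans({mark: " " for mark in RESTORED_MARKS})
    cleaned = sentence.lower().translate(table)
    return " ".join(cleaned.split())


if __name__ == "__main__":
    print(strip_enhancement("Hello, world! Is it you?"))
```

Pairs of (stripped, original) sentences built this way could serve as a quick sanity check when trying the model on your own text.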
Links:
- Model repo - https://github.com/snakers4/silero-models#text-enhancement
- Colab notebook - https://colab.research.google.com/github/snakers4/silero-models/blob/master/examples_te.ipynb
- Russian article - https://habr.com/ru/post/581946/
- English article - https://habr.com/ru/post/581960/