Alternative Text utilities and packages for Python
Updated
Dec. 9, 2022
textual
Github stargazers
16336
Github forks
470
Commits
3421
Code contributors
59
Textual is a TUI (Text User Interface) framework for Python inspired by modern web development.
Created
April 8, 2021
Updated
Dec. 9, 2022
License
mit
Primary language, based on Github data
Python
Issues
107
newspaper
Github stargazers
12298
Github forks
1988
Commits
651
Code contributors
96
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Created
Nov. 25, 2013
Updated
Sept. 2, 2020
License
mit
Type
Module/library
Primary language, based on Github data
Python
Issues
490
TextBlob
Github stargazers
8378
Github forks
1098
Commits
562
Code contributors
27
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
Created
June 30, 2013
Updated
Oct. 22, 2021
License
mit
Type
Module/library
Primary language, based on Github data
Python
Issues
108
OCRmyPDF
Github stargazers
7836
Github forks
636
Commits
3349
Code contributors
66
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Created
Dec. 20, 2013
Updated
Oct. 4, 2022
License
mpl-2.0
Primary language, based on Github data
Python
Issues
101
snownlp
Github stargazers
5963
Github forks
1353
Commits
57
Code contributors
8
Python library for processing Chinese text
Created
Nov. 26, 2013
Updated
Jan. 19, 2020
License
mit
Type
Module/library
Primary language, based on Github data
Python
Issues
42
pyWhat
Github stargazers
5588
Github forks
298
Commits
629
Code contributors
35
๐Ÿธ Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! ๐Ÿง™โ€โ™€๏ธ
Created
March 19, 2021
Updated
Nov. 15, 2022
License
mit
Primary language, based on Github data
Python
Issues
24
textgenrnn
Github stargazers
4818
Github forks
752
Commits
174
Code contributors
16
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
Created
Aug. 7, 2017
Updated
July 14, 2020
License
other
Type
Module/library
Primary language, based on Github data
Python
Issues
139
snips-nlu
Github stargazers
3730
Github forks
519
Commits
2154
Code contributors
15
Snips Python library to extract meaning from text
Created
Feb. 8, 2017
Updated
May 3, 2021
License
apache-2.0
Type
Module/library
Primary language, based on Github data
Python
Issues
63
python-ftfy
Github stargazers
3381
Github forks
116
Commits
609
Code contributors
13
Fixes mojibake and other glitches in Unicode text, after the fact.
Created
Aug. 24, 2012
Updated
Oct. 25, 2022
License
mit
Primary language, based on Github data
Python
Issues
15
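To make "mojibake" concrete: it is what appears when UTF-8 bytes are decoded with the wrong encoding. The sketch below uses only the standard library and is not ftfy's implementation or API; the function name `fix_mojibake` and the assumption that the wrong encoding was Latin-1 are illustrative only. ftfy itself detects and repairs many more cases.

```python
def fix_mojibake(text: str, wrong_encoding: str = "latin-1") -> str:
    """Undo one round of UTF-8 bytes that were mis-decoded as `wrong_encoding`.

    Hypothetical helper for illustration; assumes a single mis-decoding step.
    """
    try:
        # Recover the original bytes, then decode them as UTF-8.
        return text.encode(wrong_encoding).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # The text does not fit the assumption; return it unchanged.
        return text


# Simulate the glitch: UTF-8 bytes read as Latin-1.
broken = "café".encode("utf-8").decode("latin-1")
repaired = fix_mojibake(broken)
print(broken)
print(repaired)
```

Text that is already correct fails the round trip and is returned unchanged, so the helper does not damage clean input under this assumption.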
asciimatics
Github stargazers
3191
Github forks
233
Commits
1083
Code contributors
39
A cross platform package to do curses-like operations, plus higher level APIs and widgets to create text UIs and ASCII art animations
Created
April 15, 2015
Updated
July 3, 2022
License
apache-2.0
Type
Module/library
Primary language, based on Github data
Python
Issues
24
gpt-2-simple
Github stargazers
3062
Github forks
631
Commits
149
Code contributors
21
Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts
Created
April 13, 2019
Updated
May 22, 2022
License
other
Type
Module/library
Primary language, based on Github data
Python
Issues
169
textdistance
Github stargazers
3006
Github forks
240
Commits
376
Code contributors
10
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
Created
May 5, 2017
Updated
Sept. 18, 2022
License
mit
Type
Module/library
Primary language, based on Github data
Python
Issues
9
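As an illustration of what a distance between sequences means, here is a minimal pure-Python sketch of one classic measure, the Levenshtein edit distance. It is not textdistance's code or interface; the function name `levenshtein` is an assumption made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in `b` so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the processed prefix of `a` and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Only two rows of the dynamic-programming table are kept, so memory grows with the shorter input and not with the product of both lengths.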
sumy
Github stargazers
2979
Github forks
488
Commits
426
Code contributors
22
Module for automatic summarization of text documents and HTML pages.
Created
Feb. 20, 2013
Updated
Oct. 23, 2022
License
apache-2.0
Type
Module/library
Primary language, based on Github data
Python
Issues
15
TextRank4ZH
Github stargazers
2801
Github forks
811
Commits
29
Code contributors
1
🌳 Automatically extract keywords and summaries from Chinese text
Created
Dec. 1, 2014
Updated
July 3, 2018
License
mit
Primary language, based on Github data
Python
Issues
8
texar
Github stargazers
2325
Github forks
379
Commits
1719
Code contributors
29
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Created
July 22, 2017
Updated
July 29, 2020
License
apache-2.0
Type
Module/library
Primary language, based on Github data
Python
Issues
36
TextAttack
Github stargazers
2168
Github forks
285
Commits
2576
Code contributors
46
TextAttack ๐Ÿ™ is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
Created
Oct. 15, 2019
Updated
Dec. 6, 2022
License
mit
Type
Module/library
Primary language, based on Github data
Python
Issues
21
aeneas
Github stargazers
2068
Github forks
208
Commits
280
Code contributors
6
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Created
May 11, 2015
Updated
May 13, 2020
License
agpl-3.0
Type
Module/library
Primary language, based on Github data
Python
Issues
54
textacy
Github stargazers
2001
Github forks
251
Commits
1716
Code contributors
28
NLP, before and after spaCy
Created
Feb. 3, 2016
Updated
March 6, 2022
License
other
Type
Module/library
Primary language, based on Github data
Python
Issues
31
pytextrank
Github stargazers
1936
Github forks
333
Commits
452
Code contributors
16
Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
Created
Oct. 2, 2016
Updated
July 27, 2022
License
mit
Type
Module/library
Primary language, based on Github data
Python
Issues
20
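For readers unfamiliar with TextRank, the idea can be sketched in a few lines: build a graph of words that occur near each other, then score the nodes with PageRank. The code below is a simplified standard-library illustration, not pytextrank's API. The function name `textrank_keywords`, the window size, and the pre-tokenized input are all assumptions; real implementations also filter by part of speech and merge words into phrases.

```python
from collections import defaultdict


def textrank_keywords(words, window=2, damping=0.85, iterations=50):
    """Rank words by PageRank over an undirected co-occurrence graph.

    Hypothetical helper: `words` is an already-tokenized list of strings.
    """
    # Link each word to the distinct words within `window` positions after it.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for other in words[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)
    if not neighbors:
        return []

    n = len(neighbors)
    scores = {w: 1.0 / n for w in neighbors}
    for _ in range(iterations):
        # Each node shares its score equally among its neighbors.
        scores = {
            w: (1 - damping) / n
            + damping * sum(scores[v] / len(neighbors[v]) for v in neighbors[w])
            for w in neighbors
        }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranked = textrank_keywords(["a", "b", "a", "c", "a", "d"], window=1)
print(ranked[0][0])  # a
```

In the toy input every other word is adjacent only to "a", so "a" sits at the center of the graph and receives the highest score.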