Alternative Text utilities and packages for Python
Updated: May 25, 2022
newspaper
Github stargazers
11867
Github forks
1958
Commits
651
Code contributors
96
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Created
Nov. 25, 2013
Updated
Sept. 2, 2020
License
other
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
480
Homepage
textual
Github stargazers
11239
Github forks
293
Commits
380
Code contributors
34
Textual is a TUI (Text User Interface) framework for Python inspired by modern web development.
Created
April 8, 2021
Updated
May 4, 2022
License
mit
Github repo
Primary Language, based on Github Data
Python
Issues
94
TextBlob
Github stargazers
8162
Github forks
1076
Commits
562
Code contributors
27
Simple, Pythonic, text processing--Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more.
Created
June 30, 2013
Updated
Oct. 22, 2021
License
mit
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
104
OCRmyPDF
Github stargazers
6439
Github forks
570
Commits
3196
Code contributors
59
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
Created
Dec. 20, 2013
Updated
May 24, 2022
License
mpl-2.0
Github repo
Primary Language, based on Github Data
Python
Issues
91
snownlp
Github stargazers
5804
Github forks
1342
Commits
57
Code contributors
8
Python library for processing Chinese text
Created
Nov. 26, 2013
Updated
Jan. 19, 2020
License
mit
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
42
pyWhat
Github stargazers
5167
Github forks
242
Commits
627
Code contributors
34
๐Ÿธ Identify anything. pyWhat easily lets you identify emails, IP addresses, and more. Feed it a .pcap file or some text and it'll tell you what it is! ๐Ÿง™โ€โ™€๏ธ
Created
March 19, 2021
Updated
May 9, 2022
License
mit
Github repo
Primary Language, based on Github Data
Python
Issues
20
textgenrnn
Github stargazers
4685
Github forks
751
Commits
174
Code contributors
16
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.
Created
Aug. 7, 2017
Updated
July 14, 2020
License
other
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
136
snips-nlu
Github stargazers
3652
Github forks
499
Commits
2154
Code contributors
15
Snips Python library to extract meaning from text
Created
Feb. 8, 2017
Updated
May 3, 2021
License
apache-2.0
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
61
python-ftfy
Github stargazers
3244
Github forks
113
Commits
605
Code contributors
13
Fixes mojibake and other glitches in Unicode text, after the fact.
Created
Aug. 24, 2012
Updated
Feb. 9, 2022
License
mit
Github repo
Primary Language, based on Github Data
Python
Issues
13
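To make the term "mojibake" concrete: it is what appears when bytes written in one encoding are decoded with another. The sketch below uses only the standard library and is not how python-ftfy works internally; the helper names `make_mojibake` and `undo_mojibake` are hypothetical, and it assumes a single layer of UTF-8 text mis-decoded as Latin-1.

```python
def make_mojibake(text: str, wrong_codec: str = "latin-1") -> str:
    """Simulate mojibake: encode as UTF-8, then decode with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


def undo_mojibake(garbled: str, wrong_codec: str = "latin-1") -> str:
    """Reverse one layer of mis-decoding.

    If the text does not round-trip cleanly, it was probably not
    garbled this way, so it is returned unchanged.
    """
    try:
        return garbled.encode(wrong_codec).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return garbled


original = "caf\u00e9"              # "café"
garbled = make_mojibake(original)   # UTF-8 bytes read as Latin-1
repaired = undo_mojibake(garbled)   # back to the original string
print(garbled, "->", repaired)
```

Real-world text is often damaged in several layers or with mixed codecs, which is the harder case a dedicated library is meant to handle.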
asciimatics
Github stargazers
3053
Github forks
227
Commits
1079
Code contributors
39
A cross platform package to do curses-like operations, plus higher level APIs and widgets to create text UIs and ASCII art animations
Created
April 15, 2015
Updated
April 26, 2022
License
apache-2.0
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
22
gpt-2-simple
Github stargazers
2931
Github forks
621
Commits
149
Code contributors
21
Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts
Created
April 13, 2019
Updated
May 22, 2022
License
other
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
165
textdistance
Github stargazers
2830
Github forks
231
Commits
321
Code contributors
9
Compute distance between sequences. 30+ algorithms, pure python implementation, common interface, optional external libs usage.
Created
May 5, 2017
Updated
Nov. 29, 2021
License
mit
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
11
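As an illustration of what "distance between sequences" means, here is a minimal pure-Python sketch of one well-known measure, the Levenshtein edit distance. It is a standalone example and not textdistance's own code; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn `a` into `b`."""
    # Keep the shorter string in `b` so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the processed prefix of `a` and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of `b`
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
```

Only two rows of the dynamic-programming table are kept at a time, so memory use grows with the shorter input rather than with the product of both lengths.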
sumy
Github stargazers
2807
Github forks
472
Commits
411
Code contributors
20
Module for automatic summarization of text documents and HTML pages.
Created
Feb. 20, 2013
Updated
April 21, 2022
License
apache-2.0
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
14
TextRank4ZH
Github stargazers
2671
Github forks
790
Commits
29
Code contributors
1
🌳 Automatically extract keywords and summaries from Chinese text
Created
Dec. 1, 2014
Updated
July 3, 2018
License
mit
Github repo
Primary Language, based on Github Data
Python
Issues
6
texar
Github stargazers
2285
Github forks
375
Commits
1719
Code contributors
29
Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/
Created
July 22, 2017
Updated
July 29, 2020
License
apache-2.0
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
36
Homepage
aeneas
Github stargazers
1987
Github forks
202
Commits
280
Code contributors
6
aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
Created
May 11, 2015
Updated
May 13, 2020
License
agpl-3.0
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
47
TextAttack
Github stargazers
1969
Github forks
250
Commits
2457
Code contributors
41
TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
Created
Oct. 15, 2019
Updated
April 4, 2022
License
mit
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
45
textacy
Github stargazers
1928
Github forks
240
Commits
1716
Code contributors
28
NLP, before and after spaCy
Created
Feb. 3, 2016
Updated
March 6, 2022
License
other
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
29
pytextrank
Github stargazers
1803
Github forks
332
Commits
438
Code contributors
16
Python implementation of TextRank algorithms ("textgraphs") for phrase extraction
Created
Oct. 2, 2016
Updated
March 7, 2022
License
mit
Github repo
Type
Module/library
Primary Language, based on Github Data
Python
Issues
25
Homepage