Alternative scraping tools and utilities for Python
Updated: May 25, 2022
twint
GitHub stargazers: 13206
GitHub forks: 2224
Commits: 845
Code contributors: 62
An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.
Created: June 10, 2017
Updated: March 2, 2021
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 590
pattern
GitHub stargazers: 8213
GitHub forks: 1584
Commits: 1434
Code contributors: 20
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Created: May 3, 2011
Updated: April 25, 2020
License: BSD-3-Clause
Primary language (based on GitHub data): Python
Issues: 172
instagram-scraper
GitHub stargazers: 6221
GitHub forks: 1324
Commits: 366
Code contributors: 55
Scrapes an Instagram user's photos and videos.
Created: April 9, 2013
Updated: March 30, 2022
License: Unlicense
Type: App
Primary language (based on GitHub data): Python
Issues: 408
autoscraper
GitHub stargazers: 4412
GitHub forks: 471
Commits: 134
Code contributors: 6
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Created: Aug. 31, 2020
Updated: Feb. 3, 2021
License: MIT
Primary language (based on GitHub data): Python
Issues: 9
twitter-scraper
GitHub stargazers: 3194
GitHub forks: 556
Commits: 208
Code contributors: 28
Scrape the Twitter Frontend API without authentication.
Created: Feb. 22, 2018
Updated: Dec. 17, 2021
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 46
cloudflare-scrape
GitHub stargazers: 2792
GitHub forks: 406
Commits: 156
Code contributors: 23
A Python module to bypass Cloudflare's anti-bot page.
Created: Feb. 28, 2013
Updated: March 23, 2020
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 113
instagram-scraper
GitHub stargazers: 2357
GitHub forks: 383
Commits: 161
Code contributors: 29
Scrapes media, likes, followers, tags and all metadata. Inspired by instagram-php-scraper, bot
Created: June 10, 2019
Updated: April 27, 2021
License: MIT
Primary language (based on GitHub data): Python
Issues: 111
grab
GitHub stargazers: 2188
GitHub forks: 269
Commits: 2233
Code contributors: 48
Web Scraping Framework
Created: May 1, 2013
Updated: March 1, 2022
License: MIT
Type: Tool/utility
Primary language (based on GitHub data): Python
Issues: 9
lazynlp
GitHub stargazers: 2020
GitHub forks: 306
Commits: 14
Code contributors: 4
Library to scrape and clean web pages to create massive datasets.
Created: Feb. 27, 2019
Updated: Oct. 7, 2019
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 7
GitDorker
GitHub stargazers: 1766
GitHub forks: 353
Commits: 113
Code contributors: 4
A Python program to scrape secrets from GitHub through usage of a large repository of dorks.
Created: July 13, 2020
Updated: May 7, 2021
Type: Tool/utility
Primary language (based on GitHub data): Python
Issues: 13
pagodo
GitHub stargazers: 1609
GitHub forks: 342
Commits: 108
Code contributors: 5
pagodo (Passive Google Dork) - Automate Google Hacking Database scraping and searching
Created: Aug. 19, 2016
Updated: Jan. 25, 2022
License: GPL-3.0
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 1
ruia
GitHub stargazers: 1578
GitHub forks: 170
Commits: 436
Code contributors: 14
Async Python 3.6+ web scraping micro-framework based on asyncio
Created: July 10, 2018
Updated: March 27, 2022
License: Apache-2.0
Primary language (based on GitHub data): Python
Issues: 3
JobFunnel
GitHub stargazers: 1556
GitHub forks: 182
Commits: 413
Code contributors: 12
Scrape job websites into a single spreadsheet with no duplicates.
Created: Aug. 25, 2017
Updated: Nov. 25, 2021
License: MIT
Type: CLI
Primary language (based on GitHub data): Python
Issues: 8
SoundScrape
GitHub stargazers: 1345
GitHub forks: 145
Commits: 250
Code contributors: 13
SoundCloud (and Bandcamp and Mixcloud) downloader in Python.
Created: Dec. 29, 2013
Updated: Nov. 22, 2020
License: MIT
Primary language (based on GitHub data): Python
Issues: 66
search-script-scrape
GitHub stargazers: 1205
GitHub forks: 234
Commits: 56
Code contributors: 1
101 real world web scraping exercises in Python 3 for data journalists
Created: June 7, 2015
Updated: Oct. 5, 2015
Type: Script
Primary language (based on GitHub data): Python
Issues: 2
django-dynamic-scraper
GitHub stargazers: 1100
GitHub forks: 319
Commits: 552
Code contributors: 9
Creating Scrapy scrapers via the Django admin interface
Created: Dec. 16, 2011
Updated: June 25, 2021
License: BSD-3-Clause
Primary language (based on GitHub data): Python
Issues: 39
scrapy-cluster
GitHub stargazers: 1038
GitHub forks: 313
Commits: 747
Code contributors: 25
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
Created: April 14, 2015
Updated: April 7, 2021
License: MIT
Primary language (based on GitHub data): Python
Issues: 14
recipe-scrapers
GitHub stargazers: 877
GitHub forks: 319
Commits: 880
Code contributors: 102
Python package for scraping recipes data
Created: Sept. 14, 2015
Updated: May 24, 2022
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 56
amazon-scraper-python
GitHub stargazers: 826
GitHub forks: 151
Commits: 104
Code contributors: 6
Non-official client to get some info about products sold on Amazon
Created: June 2, 2018
Updated: Oct. 13, 2020
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 13
openstates-scrapers
GitHub stargazers: 729
GitHub forks: 420
Commits: 18080
Code contributors: 143
Source for Open States scrapers
Created: Feb. 26, 2009
Updated: May 23, 2022
License: GPL-3.0
Primary language (based on GitHub data): Python
Issues: 4