Alternative scraping tools and utilities for Python
Updated: April 23, 2024
amazon-scraper-python
GitHub stargazers: 859
GitHub forks: 199
Commits: 104
Contributors: 6
Non-official client to get some info about products sold on Amazon
Created: June 2, 2018
Updated: Oct. 13, 2020
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 13
openstates-scrapers
GitHub stargazers: 834
GitHub forks: 456
Commits: 20167
Contributors: 161
Source for Open States scrapers
Created: Feb. 26, 2009
Updated: April 22, 2024
License: GPL-3.0
Primary language (based on GitHub data): Python
Issues: 7
ImageScraper
GitHub stargazers: 749
GitHub forks: 98
Commits: 260
Contributors: 12
High performance, multi-threaded image scraper
Created: May 24, 2014
Updated: Jan. 4, 2018
License: GPL-3.0
Type: Script
Primary language (based on GitHub data): Python
Issues: 24
URS
GitHub stargazers: 724
GitHub forks: 100
Commits: 1348
Contributors: 5
Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.
Created: March 20, 2019
Updated: May 25, 2023
License: MIT
Type: Tool/utility
Primary language (based on GitHub data): Python
Issues: 6
google-play-scraper
GitHub stargazers: 689
GitHub forks: 186
Commits: 171
Contributors: 20
Google Play scraper for Python inspired by <facundoolano/google-play-scraper>
Created: June 4, 2019
Updated: Jan. 29, 2024
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 46
web-scraping
GitHub stargazers: 672
GitHub forks: 161
Commits: 211
Contributors: 1
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Created: April 4, 2018
Updated: June 28, 2021
License: Apache-2.0
Type: Module/library
Primary language (based on GitHub data): Python
DaProfiler
GitHub stargazers: 658
GitHub forks: 2
Commits: 121
Contributors: 7
DaProfiler is an OSINT tool that lets you collect certain information about yourself so that you can rectify, through GDPR (RGPD) requests, the traces you may have left on the net. DaProfiler can recover addresses, social media accounts, e-mail addresses, mobile/landline numbers, and jobs for a specified subject in a limited time. DaProfiler is designe
Created: June 26, 2021
Updated: June 11, 2023
License: GPL-3.0
Primary language (based on GitHub data): Python
instascrape
Status: Archived (read-only repository, archived by owner)
GitHub stargazers: 618
GitHub forks: 110
Commits: 862
Contributors: 8
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
Created: Sept. 21, 2020
Updated: April 20, 2023
License: MIT
Type: Tool/utility
Primary language (based on GitHub data): Python
Issues: 53
stweet
GitHub stargazers: 569
GitHub forks: 62
Commits: 274
Contributors: 1
Advanced Python library to scrape Twitter (tweets, users) from unofficial API
Created: Nov. 16, 2020
Updated: Feb. 6, 2023
License: MIT
Primary language (based on GitHub data): Python
Issues: 12
scrapli
GitHub stargazers: 551
GitHub forks: 55
Commits: 388
Contributors: 14
Fast, flexible, sync/async, Python 3.7+ screen scraping client specifically for network devices
Created: Jan. 27, 2020
Updated: April 11, 2024
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 1
dryscrape
Status: Archived (read-only repository, archived by owner)
GitHub stargazers: 537
GitHub forks: 72
Commits: 99
Contributors: 7
[not actively maintained] A lightweight Python library that uses WebKit to enable easy scraping of dynamic, JavaScript-heavy web pages
Created: Jan. 11, 2012
Updated: Aug. 22, 2017
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 21
socialreaper
GitHub stargazers: 527
GitHub forks: 93
Commits: 83
Contributors: 1
Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Created: Jan. 31, 2017
Updated: Jan. 20, 2019
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 1
dorks-eye
GitHub stargazers: 509
GitHub forks: 111
Commits: 31
Contributors: 1
Dorks Eye: Google hacking dork scraping and searching script. Dorks Eye is a script I made in Python 3. With this tool, you can easily find Google Dorks. Dorks Eye collects potentially vulnerable web pages and applications on the Internet or other awesome info that is picked up by Google's search bots. Author: Jolanda de Koff
Created: April 30, 2020
Updated: Jan. 8, 2022
License: AGPL-3.0
Type: Tool/utility
Primary language (based on GitHub data): Python
Issues: 7
Search-Engines-Scraper
GitHub stargazers: 438
GitHub forks: 127
Commits: 165
Contributors: 4
Search Google, Bing, Yahoo, and other search engines with Python
Created: Jan. 8, 2018
Updated: Oct. 27, 2023
License: MIT
Type: Module/library
Primary language (based on GitHub data): Python
Issues: 9
search-engine-parser
GitHub stargazers: 434
GitHub forks: 84
Commits: 162
Contributors: 22
Lightweight package to query popular search engines and scrape for result titles, links and descriptions
Created: Feb. 1, 2019
Updated: Nov. 21, 2022
Primary language (based on GitHub data): Python
Issues: 17
List-of-user-agents
GitHub stargazers: 416
GitHub forks: 223
Commits: 5
Contributors: 2
List of major web + mobile browser user agent strings. +1 Bonus script to scrape :)
Created: Oct. 21, 2017
Updated: Dec. 17, 2020
License: MIT
Primary language (based on GitHub data): Python
Issues: 3
wayback-machine-scraper
GitHub stargazers: 405
GitHub forks: 72
Commits: 42
Contributors: 1
A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Created: April 4, 2017
Updated: Feb. 15, 2021
License: ISC
Primary language (based on GitHub data): Python
Issues: 9
social-media-profile-scrapers
GitHub stargazers: 402
GitHub forks: 67
Commits: 112
Contributors: 1
Fetch a user's data across social media
Created: April 26, 2020
Updated: Oct. 1, 2023
License: Apache-2.0
Primary language (based on GitHub data): Python
Issues: 10
googlesearch
GitHub stargazers: 402
GitHub forks: 103
Commits: 33
Contributors: 8
A Python library for scraping the Google search engine.
Created: July 5, 2020
Updated: May 30, 2023
License: MIT
Primary language (based on GitHub data): Python
Issues: 32