Alternative scraping tools and utilities for Python
amazon-scraper-python
Github stargazers
872
Github forks
159
Commits
104
Contributors
6
Non-official client to get some info about products sold on Amazon
Created
June 2, 2018
Updated
Oct. 13, 2020
License
mit
Github repo
Type
Module/library
Primary language, based on Github data
Python
Issues
13
openstates-scrapers
Github stargazers
846
Github forks
464
Commits
20167
Contributors
163
Source for Open States scrapers
Created
Feb. 26, 2009
Updated
Sept. 27, 2024
License
gpl-3.0
Github repo
Primary language, based on Github data
Python
Issues
11
URS
Github stargazers
799
Github forks
108
Commits
1348
Contributors
5
Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.
Created
March 20, 2019
Updated
May 25, 2023
License
mit
Github repo
Type
Tool/utility
Primary language, based on Github data
Python
Issues
8
ImageScraper
Github stargazers
761
Github forks
100
Commits
260
Contributors
12
High performance, multi-threaded image scraper
Created
May 24, 2014
Updated
Jan. 4, 2018
License
gpl-3.0
Github repo
Type
Script
Primary language, based on Github data
Python
Issues
24
google-play-scraper
Github stargazers
765
Github forks
208
Commits
171
Contributors
21
Google Play scraper for Python inspired by <facundoolano/google-play-scraper>
Created
June 4, 2019
Updated
June 7, 2024
License
mit
Github repo
Type
Module/library
Primary language, based on Github data
Python
Issues
60
web-scraping
Github stargazers
730
Github forks
173
Commits
211
Contributors
1
Detailed web scraping tutorials for dummies with financial data crawlers on Reddit WallStreetBets, CME (both options and futures), US Treasury, CFTC, LME, MacroTrends, SHFE and alternative data crawlers on Tomtom, BBC, Wall Street Journal, Al Jazeera, Reuters, Financial Times, Bloomberg, CNN, Fortune, The Economist
Created
April 4, 2018
Updated
June 28, 2021
License
apache-2.0
Github repo
Type
Module/library
Primary language, based on Github data
Python
DaProfiler
Github stargazers
658
Github forks
11
Commits
121
Contributors
7
DaProfiler is an OSINT tool that lets you collect certain information about yourself so that you can rectify, through GDPR (RGPD) requests, the traces you may have left on the net. DaProfiler can recover addresses, social media accounts, e-mail addresses, mobile and landline numbers, and jobs for a specified subject in a limited time. DaProfiler is designe
Created
June 26, 2021
Updated
June 11, 2023
License
gpl-3.0
Github repo
Primary language, based on Github data
Python
instascrape
Archived (read-only repository, archived by owner)
Github stargazers
635
Github forks
111
Commits
862
Contributors
8
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
Created
Sept. 21, 2020
Updated
April 20, 2023
License
mit
Github repo
Type
Tool/utility
Primary language, based on Github data
Python
Issues
53
stweet
Github stargazers
588
Github forks
67
Commits
274
Contributors
1
Advanced Python library to scrape Twitter (tweets, users) from the unofficial API
Created
Nov. 16, 2020
Updated
Feb. 6, 2023
License
mit
Github repo
Primary language, based on Github data
Python
Issues
12
scrapli
Github stargazers
583
Github forks
61
Commits
388
Contributors
20
Fast, flexible, sync/async, Python 3.7+ screen scraping client specifically for network devices
Created
Jan. 27, 2020
Updated
Sept. 22, 2024
License
mit
Github repo
Type
Module/library
Primary language, based on Github data
Python
Issues
2
dorks-eye
Github stargazers
570
Github forks
122
Commits
31
Contributors
1
Dorks Eye: a Google hacking dork scraping and searching script. Dorks Eye is a script I made in Python 3. With this tool, you can easily find Google Dorks. Dorks Eye collects potentially vulnerable web pages and applications on the Internet, or other awesome info that is picked up by Google's search bots. Author: Jolanda de Koff
Created
April 30, 2020
Updated
Jan. 8, 2022
License
agpl-3.0
Github repo
Type
Tool/utility
Primary language, based on Github data
Python
Issues
10
socialreaper
Github stargazers
552
Github forks
92
Commits
83
Contributors
1
Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Created
Jan. 31, 2017
Updated
Jan. 20, 2019
License
mit
Github repo
Type
Module/library
Primary language, based on Github data
Python
Issues
1
dryscrape
Archived (read-only repository, archived by owner)
Github stargazers
537
Github forks
67
Commits
99
Contributors
7
[not actively maintained] A lightweight Python library that uses WebKit to enable easy scraping of dynamic, JavaScript-heavy web pages
Created
Jan. 11, 2012
Updated
Aug. 22, 2017
License
mit
Github repo
Type
Module/library
Primary language, based on Github data
Python
Issues
21
Search-Engines-Scraper
Github stargazers
532
Github forks
143
Commits
165
Contributors
7
Search Google, Bing, Yahoo, and other search engines with Python
Created
Jan. 8, 2018
Updated
Sept. 21, 2024
License
mit
Github repo
Type
Module/library
Primary language, based on Github data
Python
Issues
12
googlesearch
Github stargazers
509
Github forks
118
Commits
33
Contributors
15
A Python library for scraping the Google search engine.
Created
July 5, 2020
Updated
Aug. 13, 2024
License
mit
Github repo
Primary language, based on Github data
Python
Issues
13
Homepage
search-engine-parser
Github stargazers
455
Github forks
87
Commits
162
Contributors
22
Lightweight package to query popular search engines and scrape for result titles, links and descriptions
Created
Feb. 1, 2019
Updated
Nov. 21, 2022
Github repo
Primary language, based on Github data
Python
Issues
19
social-media-profile-scrapers
Github stargazers
446
Github forks
76
Commits
112
Contributors
1
Fetch a user's data across social media
Created
April 26, 2020
Updated
Oct. 1, 2023
License
apache-2.0
Github repo
Primary language, based on Github data
Python
Issues
10
List-of-user-agents
Github stargazers
430
Github forks
221
Commits
5
Contributors
2
List of major web + mobile browser user agent strings. +1 Bonus script to scrape :)
Created
Oct. 21, 2017
Updated
Dec. 17, 2020
License
mit
Github repo
Primary language, based on Github data
Python
Issues
3
wayback-machine-scraper
Github stargazers
418
Github forks
74
Commits
42
Contributors
1
A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
Created
April 4, 2017
Updated
Feb. 15, 2021
License
isc
Github repo
Primary language, based on Github data
Python
Issues
10