Top alternative scraping utilities for Node.js
Updated: April 23, 2024
jDistiller
GitHub stargazers
51
GitHub forks
10
Commits
57
Code contributors
3
A page scraping DSL for extracting structured information from unstructured XHTML, built on Node.js and jQuery
Created
Aug. 28, 2012
Updated
Jan. 9, 2015
Platform
Node.js
Primary language (based on GitHub data)
JavaScript
TapNews
GitHub stargazers
50
GitHub forks
13
Commits
30
Code contributors
1
A real-time news scraping and recommendation system using React, Node.js, MongoDB, and TensorFlow.
Created
Sept. 13, 2017
Updated
Oct. 2, 2017
Type
App
Platform
Node.js
Primary language (based on GitHub data)
Python
scraper
GitHub stargazers
46
GitHub forks
18
Commits
16
Code contributors
2
Node.js-based scraper using headless Chrome
Created
Oct. 14, 2017
Updated
Dec. 31, 2019
License
MIT
Platform
Node.js
Primary language (based on GitHub data)
JavaScript
Issues
11
yahoo-stock-prices
GitHub stargazers
46
GitHub forks
20
Commits
34
Code contributors
4
Node.js API to scrape stock prices from Yahoo Finance
Created
Aug. 8, 2017
Updated
Jan. 11, 2021
License
MIT
Primary language (based on GitHub data)
JavaScript
Issues
14
node-microdata-scraper
Archived (read-only repository, archived by owner)
GitHub stargazers
45
GitHub forks
9
Commits
12
Code contributors
3
Scrape & parse a webpage to return a JSON with found microdata (schema.org)
Created
Jan. 7, 2014
Updated
June 22, 2017
License
MIT
Platform
Node.js
Primary language (based on GitHub data)
JavaScript
Issues
2
News_Mining_and_Recommendation_System
GitHub stargazers
44
GitHub forks
5
Commits
4
Code contributors
1
News website (React + Node.js + Python + MongoDB + RabbitMQ), with news scraping and recommendation.
Created
March 23, 2018
Updated
March 28, 2018
Primary language (based on GitHub data)
JavaScript
webparsy
GitHub stargazers
43
GitHub forks
7
Commits
157
Code contributors
5
Node.js library and CLI for scraping websites using Puppeteer (or not) and YAML definitions
Created
March 9, 2019
Updated
Dec. 16, 2020
License
MIT
Type
Module/library
Platform
Browser
Primary language (based on GitHub data)
JavaScript
Issues
19
amazon-scrapper-1M-nodejs
GitHub stargazers
43
GitHub forks
12
Commits
51
Code contributors
1
A crawler to scrape 1 million products using Node.js and Puppeteer
Created
Sept. 1, 2018
Updated
Sept. 2, 2018
Platform
Node.js
Primary language (based on GitHub data)
JavaScript
scrape-github-trending
GitHub stargazers
42
GitHub forks
8
Commits
7
Code contributors
1
Tutorial for web scraping / crawling with Node.js.
Created
May 24, 2019
Updated
Sept. 6, 2019
Primary language (based on GitHub data)
JavaScript
Issues
6
logo-scrape
GitHub stargazers
42
GitHub forks
11
Commits
63
Code contributors
1
🕷🚀 Scrapes/crawls the logo from the provided URL(s) or website for your Node.js applications.
Created
March 7, 2019
Updated
March 11, 2019
License
MIT
Platform
Node.js
Primary language (based on GitHub data)
TypeScript
Issues
9
jsdom-based-screen-scraper
GitHub stargazers
39
GitHub forks
9
Commits
3
Code contributors
1
This is a learning project that uses jsdom and jQuery with Node.js to scrape web screens.
Created
Nov. 29, 2010
Updated
Nov. 29, 2010
Primary language (based on GitHub data)
JavaScript
Issues
1
noscrape
GitHub stargazers
38
GitHub forks
8
Commits
178
Code contributors
2
Obfuscate text via Node.js to make scraping your content really difficult
Created
Dec. 9, 2021
Updated
March 29, 2024
License
MIT
Primary language (based on GitHub data)
TypeScript
Issues
1
node-express-ajax-craigslist
GitHub stargazers
38
GitHub forks
32
Commits
35
Code contributors
1
Scraping Craigslist
Created
Oct. 24, 2013
Updated
April 22, 2014
Platform
Node.js
Primary language (based on GitHub data)
JavaScript
react-node-web-scraper
GitHub stargazers
38
GitHub forks
14
Commits
27
Code contributors
1
Final-year project that scrapes data from e-commerce stores and displays it in a ReactJS app.
Created
Jan. 19, 2019
Updated
Nov. 3, 2023
Primary language (based on GitHub data)
JavaScript
export-github-stars
GitHub stargazers
37
GitHub forks
3
Commits
32
Code contributors
1
A Node.js webapp to scrape the stars of specified GitHub users
Created
March 23, 2019
Updated
Nov. 19, 2020
Platform
Node.js
Primary language (based on GitHub data)
JavaScript
Issues
12
web-scraping
GitHub stargazers
37
GitHub forks
10
Commits
5
Code contributors
2
Scraping the web with Node.js
Created
Jan. 28, 2018
Updated
March 30, 2018
License
MIT
Primary language (based on GitHub data)
JavaScript
Issues
1
ESPN-Fantasy-Football-Scraper
GitHub stargazers
36
GitHub forks
13
Commits
27
Code contributors
1
Node.js script to scrape ESPN's Fantasy Football league pages and save data to MongoDB.
Created
Sept. 19, 2015
Updated
Nov. 10, 2015
Primary language (based on GitHub data)
JavaScript
Issues
2
coursera-scraper
GitHub stargazers
36
GitHub forks
13
Commits
26
Code contributors
5
A lightweight Node.js app to fetch assets / videos for Coursera courses.
Created
June 20, 2021
Updated
Sept. 28, 2023
License
MIT
Primary language (based on GitHub data)
JavaScript
Issues
1
jedi-crawler
GitHub stargazers
35
GitHub forks
5
Commits
10
Code contributors
1
Lightsabing Node/PhantomJS crawler; scrape dynamic content without the hassle
Created
Aug. 6, 2013
Updated
Oct. 3, 2013
Platform
Node.js
Primary language (based on GitHub data)
JavaScript
papercut
GitHub stargazers
35
GitHub forks
2
Commits
65
Code contributors
2
Papercut is a scraping/crawling library for Node.js built on top of JSDOM. It provides basic selector features together with features like Page Caching and Geosearch.
Created
July 25, 2020
Updated
Nov. 15, 2021
License
MIT
Primary language (based on GitHub data)
TypeScript
Issues
12