Thursday, 13 June 2019

Creating my very own API that is seeded by a webscraper

I want to create my own API, since the free ones covering my interests (sports) are either no longer updated or cost money.

I was originally playing around with Puppeteer, but after reading around this subreddit I see users saying that a headless browser is not a good idea for performance reasons. The data I am scraping is also rather simple: just a few names and records. So I guess that means cheerio or x-ray is a better fit for simple scraping needs?

I am rather new to web scraping, so I want to make sure this entire process is correct:

1. Run the scraper to parse websites for data.
2. Structure the data into objects in my Node.js backend and save them as documents in MongoDB.
3. Have my frontend query the recently scraped data through designated endpoints.
4. Use PM2(?) to periodically run the scraping program to update my data.

I'm not sure if I am missing anything, or if what I am doing is overkill. I appreciate any help.
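
To make the "structure the data into objects and save it as documents" part concrete, here is a minimal sketch in plain Node.js. Everything in it is an assumption for illustration: the row shape ({ name, record }), the "wins-losses" record format, and the helper names toDocument and buildUpsertOps are hypothetical and do not come from any real site or library. It only builds the operations; it does not connect to a database.

```javascript
// Hypothetical shape of one scraped row: { name: " Team A ", record: "10-3" }.
// The "wins-losses" record format is an assumption for illustration.

// Turn one raw scraped row into a clean document, or null if it cannot be parsed.
function toDocument(raw, scrapedAt) {
  const name = String(raw.name || "").trim();
  const match = /^(\d+)\s*-\s*(\d+)$/.exec(String(raw.record || "").trim());
  if (!name || !match) return null;
  return {
    name,
    wins: Number(match[1]),
    losses: Number(match[2]),
    scrapedAt,
  };
}

// Build upsert operations keyed on name, so that re-running the scraper
// updates existing documents instead of inserting duplicates.
function buildUpsertOps(rawRows, scrapedAt = new Date()) {
  return rawRows
    .map((raw) => toDocument(raw, scrapedAt))
    .filter((doc) => doc !== null)
    .map((doc) => ({
      updateOne: {
        filter: { name: doc.name },
        update: { $set: doc },
        upsert: true,
      },
    }));
}

const ops = buildUpsertOps([
  { name: " Team A ", record: "10-3" },
  { name: "Team B", record: "n/a" }, // skipped: record does not parse
]);
console.log(JSON.stringify(ops, null, 2));
```

The operations use the updateOne/upsert shape that the official MongoDB Node.js driver accepts in collection.bulkWrite(ops), so a periodic run (via PM2 or any other scheduler) would refresh existing records rather than pile up duplicates. Keying on name alone is a simplification; real data would probably need a more stable identifier.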

Submitted June 14, 2019 at 06:32AM by mynonohole
