Tuesday 19 February 2019

LinkedIn Scraper / Crawler working in 2019 for Node.js!

Since LinkedIn changed its website to a SPA-ish design in 2018 (?), almost all scrapers stopped working, but it seems that no one cared about it! I suppose that is due to the LinkedIn API. I also tried to use the API, but it has a lot of limitations: it only gives us basic information, unle$$ you are a partner. So I built a scraper for myself, and it seems to be the first one that works on Node.js: https://github.com/linkedtales/scrapedin

It took over 2 months and more than 10k profiles were tested. It reads full profile data, and I'll be very happy if it also helps other people 😊. The "trick" that made it possible was using "puppeteer", a Node.js library that drives a headless Chromium browser. It is really not very performant if you scrape only one profile at a time (~1 profile per second), but it works amazingly well in parallel (~5 profiles in 1.5 seconds). I also built a crawler that uses scrapedin in parallel: https://github.com/linkedtales/scrapedin-linkedin-crawler
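
To give a rough idea of what "in parallel" means here, this is a minimal sketch of a concurrency-limited runner in plain Node.js. It is not the actual code from scrapedin or the crawler: `mapWithConcurrency` and `fakeScrape` are hypothetical names made up for illustration, and `fakeScrape` only simulates a slow page load instead of opening a real browser page.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// `worker` is a hypothetical stand-in for "scrape one profile".
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner keeps claiming the next unprocessed index until none are left.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start up to `limit` runners and wait for all of them to finish.
  const runners = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    runners.push(runner());
  }
  await Promise.all(runners);
  return results; // same order as `items`
}

// Hypothetical placeholder for a real scraper: waits a bit, returns a fake record.
function fakeScrape(url) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ url, ok: true }), 50);
  });
}
```

Something like `mapWithConcurrency(profileUrls, 5, fakeScrape)` would then keep about 5 profiles in flight at once. With a real scraper you would pass your own function in place of `fakeScrape`.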

Submitted February 19, 2019 at 11:50AM by tumeni
