Thursday 25 July 2019

[puppeteer] Can't get the height of certain web sites' pages

I'm using the excellent puppeteer library to scrape a few web sites' front pages and take a screenshot of the whole page. It works fine for most sites, but for a few I can't get the page's height; the value I get back is the same as the viewport height.

Example site: https://www.dagbladet.no/

Here's the code I currently use to get the page's height:

    const bodyHandle = await page.$('body');
    const boundingBox = await bodyHandle.boundingBox();
    const height = boundingBox.height;
    console.log("height: " + height);

    const altHeight = await page.evaluate(() => document.documentElement.offsetHeight);
    console.log("altHeight: " + altHeight);

height and altHeight don't differ, but I included both in the hope that a different approach would work.

Any ideas?

EDIT: I could just scroll down until I can't scroll any further, but I'm afraid of "never-ending" sites. I could of course set a maximum scroll length as well, but I'm not sure that's an ideal solution.
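The capped-scroll idea from the EDIT could be sketched roughly as follows. This is only a sketch under assumptions, not a confirmed fix for the example site: it guesses that the page reports a viewport-sized height because of its CSS or because content loads lazily, so it reads several height properties, keeps the largest, and scrolls in steps until the height stops growing or a cap is reached. The helper names (pickLargestHeight, readHeightsInPage, measurePageHeight) and the options maxHeight, step and settleMs are made up for illustration; the only puppeteer call used is page.evaluate.

```javascript
// Hypothetical helper: keep the largest finite value among several measurements.
function pickLargestHeight(measurements) {
  const finite = measurements.filter((n) => Number.isFinite(n));
  return finite.length > 0 ? Math.max(...finite) : 0;
}

// Runs in the browser context (passed to page.evaluate), not in Node.
function readHeightsInPage() {
  const body = document.body;
  const html = document.documentElement;
  return [
    body.scrollHeight,
    body.offsetHeight,
    html.clientHeight,
    html.scrollHeight,
    html.offsetHeight,
  ];
}

// Scrolls down step by step so lazily loaded content can extend the page,
// and stops at maxHeight so "never-ending" sites cannot loop forever.
// The default values are arbitrary assumptions; tune them per site.
async function measurePageHeight(page, options = {}) {
  const { maxHeight = 20000, step = 1000, settleMs = 250 } = options;
  let height = pickLargestHeight(await page.evaluate(readHeightsInPage));
  let scrolled = 0;

  while (scrolled < Math.min(height, maxHeight)) {
    scrolled += step;
    await page.evaluate((y) => window.scrollTo(0, y), scrolled);
    // Give the page a moment to load more content after each scroll.
    await new Promise((resolve) => setTimeout(resolve, settleMs));
    height = pickLargestHeight(await page.evaluate(readHeightsInPage));
  }

  return Math.min(height, maxHeight);
}
```

With a real page object this would be called as `const height = await measurePageHeight(page, { maxHeight: 15000 });`. If every measurement still equals the viewport height, the content may be scrolling inside an inner container rather than the document, in which case that element's scrollHeight would have to be read instead.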

Submitted July 25, 2019 at 10:44AM by UnityNorway
