Commit

Add info from #45 and bump NPM version.
Lewis Donovan committed Mar 18, 2024
1 parent e00eb18 commit 81f75a1
Showing 3 changed files with 21 additions and 3 deletions.
13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,19 @@
# google-news-scraper CHANGELOG
All notable changes to this project will be documented in this file.

## [1.2.2] - 2024-03-18

To update please run `npm update google-news-scraper`

### Changed

- **index.js**
- Merge changes from [#45](https://github.com/lewisdonovan/google-news-scraper/pull/45) (credit to [ole-ve](https://github.com/ole-ve/))
- **package.json**
- Bump version
- **README.md**
- Include details of new `puppeteerHeadlessMode` config item.

## [1.2.1] - 2024-03-11

To update please run `npm update google-news-scraper`
9 changes: 7 additions & 2 deletions README.md
@@ -128,10 +128,15 @@ Defaults to `null`
#### puppeteerArgs
An array of Chromium flags to pass to the browser instance. By default, this will be an empty array. A full list of available flags can be found [here](https://peter.sh/experiments/chromium-command-line-switches/). NB: if you are launching this in a Heroku app, you will need to pass the `--no-sandbox` and `--disable-setuid-sandbox` flags, as explained in [this SO answer](https://stackoverflow.com/a/52228855/7546845).

defaults to `[]`
Defaults to `[]`

#### puppeteerHeadlessMode
Whether or not Puppeteer should run in [headless mode](https://www.browserstack.com/guide/puppeteer-headless). Running in headless mode increases performance by approximately 30% (credit to [ole-ve](https://github.com/lewisdonovan/google-news-scraper/pull/45) for finding this). If you're not sure about this setting, leave it as it is.

Defaults to `true`
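
As a rough illustration of how the two Puppeteer-related settings above combine, here is a minimal sketch. The `buildPuppeteerOptions` helper and the `HEROKU_FLAGS` constant are hypothetical names introduced only for this example; they are not part of google-news-scraper. The defaults and the Heroku flags are the ones documented above.

```javascript
// Hypothetical helper, not part of google-news-scraper itself.
// It layers caller overrides on top of the two documented defaults.
function buildPuppeteerOptions(overrides = {}) {
  const defaults = {
    puppeteerArgs: [],           // documented default: []
    puppeteerHeadlessMode: true, // documented default: true
  };
  return { ...defaults, ...overrides };
}

// Flags the README says are required when launching in a Heroku app.
const HEROKU_FLAGS = ["--no-sandbox", "--disable-setuid-sandbox"];

// Leave both settings at their defaults.
const localOptions = buildPuppeteerOptions();

// Heroku deployment: pass the sandbox flags, stay headless.
const herokuOptions = buildPuppeteerOptions({ puppeteerArgs: HEROKU_FLAGS });

// Debugging: show the browser window, accepting the slower run.
const debugOptions = buildPuppeteerOptions({ puppeteerHeadlessMode: false });

console.log(localOptions, herokuOptions, debugOptions);
```

The resulting object would presumably be merged into the config passed to the scraper alongside the other options; the call itself is not shown in this diff, so it is left out here.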

## Performance 📈
My test query returned 99 results, which took 4.5 seconds with article content and 3.6 seconds without it. I'm on a fibre connection, and other queries may return a different number of results, so your mileage may vary.
My test query returned 94 results, which took 4.5 seconds with article content and 3.6 seconds without it. I'm on a fibre connection, and other queries may return a different number of results, so your mileage may vary.

## Upkeep 🧹
Please note that this is a web-scraper, which relies on DOM selectors, so any fundamental changes in the markup on the Google News site will probably break this tool. I'll try my best to keep it up-to-date, but changes to the markup on Google News will be silent and therefore difficult to keep track of. Feel free to submit an issue if the tool stops working.
2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "google-news-scraper",
"version": "1.2.1",
"version": "1.2.2",
"description": "Lightweight async scraper for Google News",
"main": "index.js",
"scripts": {
