Crawlee/Playwright struggles to start with default project #26925

Open
drewbitt opened this issue Nov 19, 2024 · 0 comments

Version: Deno 2.0.6
macOS 15.2 arm64
https://github.com/apify/crawlee

Reproduction

deno run -A npm:crawlee create crawlee -t getting-started-ts
cd crawlee
deno run -A npm:playwright install
deno run -A src/main.ts

Result

❯ deno run -A src/main.ts
INFO  PlaywrightCrawler: Starting the crawler.
WARN  PlaywrightCrawler: Reclaiming failed request back to the list or queue. page.goto: net::ERR_TIMED_OUT at https://crawlee.dev/
Call log:
  - navigating to "https://crawlee.dev/", waiting until "load"

    at eventLoopTick (/Users/drewbitt/Documents/Repos/crawlee/ext:core/01_core.js:214:9) {"id":"3FULW9cbbkMrV4R","url":"https://crawlee.dev","retryCount":1}
INFO  PlaywrightCrawler:Statistics: PlaywrightCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":0,"requestTotalDurationMillis":0,"requestsTotal":0,"crawlerRuntimeMillis":60151,"retryHistogram":[]}
INFO  PlaywrightCrawler:AutoscaledPool: state {"currentConcurrency":1,"desiredConcurrency":3,"systemStatus":{"isSystemIdle":true,"memInfo":{"isOverloaded":false,"limitRatio":0.2,"actualRatio":0},"eventLoopInfo":{"isOverloaded":false,"limitRatio":0.6,"actualRatio":0},"cpuInfo":{"isOverloaded":false,"limitRatio":0.4,"actualRatio":0},"clientInfo":{"isOverloaded":false,"limitRatio":0.3,"actualRatio":0}}}
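For reading the statistics line above: a minimal sketch, assuming the text after the first `{` is plain JSON, that extracts the counters from that log line. `parseStatsLine` and `CrawlerStats` are hypothetical names used only for illustration and are not part of Crawlee.

```typescript
// Hypothetical helper (not a Crawlee API): extract the JSON payload from a
// "request statistics" log line and read a few of its fields.
interface CrawlerStats {
    requestsTotal: number;
    requestsFinishedPerMinute: number;
    requestsFailedPerMinute: number;
    crawlerRuntimeMillis: number;
}

function parseStatsLine(line: string): CrawlerStats {
    // Assumption: everything from the first "{" to the end of the line is JSON.
    const start = line.indexOf('{');
    if (start === -1) {
        throw new Error('no JSON payload found in log line');
    }
    return JSON.parse(line.slice(start)) as CrawlerStats;
}

// The statistics line copied verbatim from the Deno run above.
const statsLine =
    'INFO  PlaywrightCrawler:Statistics: PlaywrightCrawler request statistics: {"requestAvgFailedDurationMillis":null,"requestAvgFinishedDurationMillis":null,"requestsFinishedPerMinute":0,"requestsFailedPerMinute":0,"requestTotalDurationMillis":0,"requestsTotal":0,"crawlerRuntimeMillis":60151,"retryHistogram":[]}';

const stats = parseStatsLine(statsLine);
const runtimeSeconds = Math.round(stats.crawlerRuntimeMillis / 1000);
const summary = `requestsTotal=${stats.requestsTotal} after ${runtimeSeconds} s`;
console.log(summary);
```

On the line from this report, the sketch shows `requestsTotal` still at 0 after roughly 60 seconds of crawler runtime, which matches the `net::ERR_TIMED_OUT` warning on the first navigation.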

Expected result

The same script run with tsx (on Node.js) succeeds:

❯ npx tsx src/main.ts
INFO  PlaywrightCrawler: Starting the crawler.
INFO  PlaywrightCrawler: Title of https://crawlee.dev/ is 'Crawlee · Build reliable crawlers. Fast.'

Comments

Crawlee encourages using its CLI, so this is a typical flow for someone starting a new Crawlee project in Deno, even though the CLI currently scaffolds only a Node project.
