Add Playwright headless browser as 3rd crawling fallback
Crawl chain: Jsoup → Jina Reader → Playwright (headless Chromium).
Error page detection (403, Access Denied, etc.) triggers next fallback.
Switch to exploded classpath for Playwright driver-bundle compatibility.
Fix Next.js standalone static file serving with symlink.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -104,6 +104,18 @@
             <version>1.18.3</version>
         </dependency>
 
+        <!-- Playwright (headless browser, driver-bundle includes node runtime) -->
+        <dependency>
+            <groupId>com.microsoft.playwright</groupId>
+            <artifactId>playwright</artifactId>
+            <version>1.51.0</version>
+        </dependency>
+        <dependency>
+            <groupId>com.microsoft.playwright</groupId>
+            <artifactId>driver-bundle</artifactId>
+            <version>1.51.0</version>
+        </dependency>
+
         <!-- Jackson -->
         <dependency>
             <groupId>com.fasterxml.jackson.core</groupId>
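The fallback behaviour described in the commit message (try each crawler in order, and move on when the response looks like an error page) could be sketched roughly as below. This is an illustration only, not the project's actual code: the `CrawlChain` class, the `Fetcher` interface, the marker strings, and the stub fetchers are all hypothetical names chosen for the example, and the real Jsoup, Jina Reader, and Playwright calls are replaced by lambdas so the sketch runs with the JDK alone.

```java
import java.io.IOException;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

public class CrawlChain {

    /** One crawling strategy (hypothetical interface); returns the page body for a URL. */
    @FunctionalInterface
    public interface Fetcher {
        String fetch(String url) throws Exception;
    }

    // Assumed markers: the commit only mentions "403" and "Access Denied".
    private static final List<String> ERROR_MARKERS =
            List.of("403 forbidden", "access denied");

    /** Heuristic check: empty bodies and known block-page phrases count as failures. */
    public static boolean looksLikeErrorPage(String body) {
        if (body == null || body.isBlank()) {
            return true;
        }
        String lower = body.toLowerCase(Locale.ROOT);
        return ERROR_MARKERS.stream().anyMatch(lower::contains);
    }

    /** Tries each fetcher in order and returns the first body that passes the check. */
    public static Optional<String> crawl(String url, List<Fetcher> chain) {
        for (Fetcher fetcher : chain) {
            try {
                String body = fetcher.fetch(url);
                if (!looksLikeErrorPage(body)) {
                    return Optional.of(body);
                }
            } catch (Exception e) {
                // A thrown error is treated like an error page: try the next fetcher.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stubs standing in for the Jsoup, Jina Reader, and Playwright crawlers.
        Fetcher jsoup = url -> "<html><title>403 Forbidden</title></html>";
        Fetcher jina = url -> { throw new IOException("timeout"); };
        Fetcher playwright = url -> "<html><body>Rendered article</body></html>";

        Optional<String> result =
                crawl("https://example.com", List.of(jsoup, jina, playwright));
        System.out.println(result.orElse("all fallbacks failed"));
    }
}
```

In this sketch the first stub is rejected by the marker check, the second throws, and the third supplies the result. A plain substring match is a deliberate simplification: a legitimate article that merely quotes "access denied" would also be rejected, so a real detector would likely combine it with the HTTP status code or page length.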