Add Playwright headless browser as 3rd crawling fallback
Crawl chain: Jsoup → Jina Reader → Playwright (headless Chromium).
Error page detection (403, Access Denied, etc.) triggers next fallback.
Switch to exploded classpath for Playwright driver-bundle compatibility.
Fix Next.js standalone static file serving with symlink.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
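
The fallback behavior described in the message can be illustrated with a minimal sketch. This is not the code from the commit: the `CrawlChain` class, the `Fetcher` record, the marker list, and the lambda fetchers are hypothetical stand-ins for the real Jsoup, Jina Reader, and Playwright calls. Treating a blank body as a failure is also an assumption.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.function.Function;

public class CrawlChain {

    /** A named fetch strategy; the function stands in for a real Jsoup, Jina Reader, or Playwright call. */
    record Fetcher(String name, Function<String, String> fetch) {}

    // Hypothetical marker list; the commit message only names "403" and "Access Denied".
    private static final List<String> ERROR_MARKERS = List.of("403 forbidden", "access denied");

    /** Returns true when the body is empty or contains one of the error markers (case-insensitive). */
    static boolean looksLikeErrorPage(String body) {
        if (body == null || body.isBlank()) {
            return true; // assumption: an empty response counts as a failed fetch
        }
        String lower = body.toLowerCase(Locale.ROOT);
        return ERROR_MARKERS.stream().anyMatch(lower::contains);
    }

    /** Tries each fetcher in order and returns the first body that does not look like an error page. */
    static Optional<String> crawl(String url, List<Fetcher> chain) {
        for (Fetcher fetcher : chain) {
            try {
                String body = fetcher.fetch().apply(url);
                if (!looksLikeErrorPage(body)) {
                    return Optional.of(body);
                }
            } catch (RuntimeException e) {
                // A thrown error is handled like an error page: move on to the next fetcher.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Fetcher> chain = List.of(
            new Fetcher("jsoup", url -> "<h1>403 Forbidden</h1>"),
            new Fetcher("jina-reader", url -> { throw new IllegalStateException("timeout"); }),
            new Fetcher("playwright", url -> "<article>rendered content</article>"));

        System.out.println(crawl("https://example.com", chain).orElse("all fetchers failed"));
    }
}
```

In this sketch the first fetcher returns a 403 page and the second throws, so the result comes from the third. Keeping the cheapest fetcher first means the headless browser only starts when the lighter options have failed.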
@@ -6,4 +6,10 @@ set +a
 JAVA_HOME=${JAVA_HOME:-/usr/lib/jvm/java-21}
 export JAVA_HOME
 
-exec $JAVA_HOME/bin/java -jar /home/opc/sundol/sundol-backend/target/sundol-backend-0.0.1-SNAPSHOT.jar
+# Playwright: use pre-installed browsers, skip auto-download
+export PLAYWRIGHT_BROWSERS_PATH=/home/opc/.cache/ms-playwright
+export PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1
+
+# Playwright driver-bundle requires exploded classpath (fat JAR extraction fails)
+BACKEND_DIR=/home/opc/sundol/sundol-backend
+exec $JAVA_HOME/bin/java -cp "$BACKEND_DIR/target/classes:$BACKEND_DIR/target/dependency/*" com.sundol.SundolApplication