Upgrades content processing from a single LLM call to a structured
5-step document reconstruction pipeline:
1. Normalize — clean up colloquialisms, restore punctuation, extract key entities
2. Index Tree — scan the full text → generate a hierarchical table of contents (JSON)
3. Leaf Summarize — detailed per-section summaries (with a 300-character context overlap)
4. Consistency Check — verify and backfill missing entities
5. Assemble — assemble the final Markdown document (no LLM call needed)
- Short texts (< 3000 chars): simple 1-pass fallback
- Long texts: full pipeline (N+4 LLM calls where N = section count)
- worker.py: uses body_md from enricher as Obsidian note body
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
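The length-based dispatch above can be sketched as follows. This is a minimal illustration, not the actual enricher API: the function name, the constant name, and the section-counting interface are assumptions; only the 3000-char threshold and the N+4 call count come from the description above.

```python
SHORT_TEXT_LIMIT = 3000  # chars; below this, one LLM call suffices

def estimate_llm_calls(text: str, section_count: int) -> int:
    """Return how many LLM calls the reconstruction will make.

    Short texts take the simple 1-pass fallback (a single call).
    Long texts run the full pipeline: one leaf summary per section
    plus the fixed-cost steps, N + 4 calls in total (Assemble is
    pure string work and needs no LLM).
    """
    if len(text) < SHORT_TEXT_LIMIT:
        return 1
    return section_count + 4
```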
Use Optional[T] with from __future__ import annotations instead of the
T | None syntax, which requires Python 3.10+.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
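A minimal sketch of the compatibility pattern above. On Python < 3.10, evaluating `T | None` in an annotation raises TypeError; `Optional[T]` works on every supported version, and the `__future__` import additionally defers annotation evaluation. The function and its parameters are hypothetical, chosen only to show the annotation style:

```python
from __future__ import annotations  # annotations become lazy strings

from typing import Optional


def find_title(video_id: str, cache: Optional[dict] = None) -> Optional[str]:
    """Return a cached title, or None when it is unknown."""
    if cache is None:
        return None
    return cache.get(video_id)
```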
- youtube.py: fetch real title via YouTube oEmbed API instead of falling back to video ID
- youtube.py: paragraphize transcript text by grouping sentences (4 per para)
- enricher.py: increase max_tokens 1024→2048 to prevent summary truncation
- web.py: restore paragraph breaks after HTML stripping
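The oEmbed lookup in youtube.py might look like the sketch below. The endpoint (`https://www.youtube.com/oembed`) and its JSON `title` field are YouTube's public oEmbed API; the helper names and the fallback-to-ID behavior on failure are assumptions about the implementation:

```python
import json
import urllib.parse
import urllib.request


def oembed_url(video_id: str) -> str:
    # Public oEmbed endpoint; returns JSON metadata including "title".
    params = urllib.parse.urlencode({
        "url": f"https://www.youtube.com/watch?v={video_id}",
        "format": "json",
    })
    return f"https://www.youtube.com/oembed?{params}"


def fetch_youtube_title(video_id: str) -> str:
    """Fetch the real title, falling back to the video ID on any
    failure (the pre-change behavior)."""
    try:
        with urllib.request.urlopen(oembed_url(video_id), timeout=10) as resp:
            return json.load(resp)["title"]
    except Exception:
        return video_id
```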
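The sentence-grouping change can be sketched like this. The regex split on sentence-final punctuation is an assumption; the actual youtube.py sentence splitter may differ, but the grouping of four sentences per paragraph follows the description above:

```python
import re


def paragraphize(text: str, per_para: int = 4) -> str:
    """Group sentences into blank-line-separated paragraphs of
    `per_para` sentences each."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    paras = [
        " ".join(sentences[i:i + per_para])
        for i in range(0, len(sentences), per_para)
    ]
    return "\n\n".join(paras)
```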
- core/vocab.py: extract B1-B2 level vocabulary from English content via Gemini Flash
- core/anki.py: register vocab cards to AnkiConnect (English::Vocabulary deck)
- core/enricher.py: add language detection field + summary_ko (Korean summary)
- core/obsidian.py: render Korean + English summary in note
- daemon/worker.py: call vocab extraction and Anki registration for English content
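The AnkiConnect registration in core/anki.py could look like the sketch below. The request shape (`addNote`, `version: 6`, port 8765) is AnkiConnect's documented API and the deck name comes from the bullet above; the note model ("Basic"), field names, and helper names are assumptions, not the actual core/anki.py interface:

```python
import json
import urllib.request

ANKI_CONNECT_URL = "http://127.0.0.1:8765"


def vocab_note_payload(word: str, definition: str) -> dict:
    """Build an AnkiConnect addNote request for the vocab deck."""
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": "English::Vocabulary",
                "modelName": "Basic",
                "fields": {"Front": word, "Back": definition},
                "options": {"allowDuplicate": False},
            }
        },
    }


def register_vocab_card(word: str, definition: str) -> int:
    """POST the note to a local AnkiConnect instance and return the
    new note id (requires Anki running with the AnkiConnect add-on)."""
    req = urllib.request.Request(
        ANKI_CONNECT_URL,
        data=json.dumps(vocab_note_payload(word, definition)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        result = json.load(resp)
    if result.get("error"):
        raise RuntimeError(result["error"])
    return result["result"]
```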