joungmin d31b17f3e8 [Architect] #215 ADR-0003 + design spec for Gemma frame suggest

- ADR-0003: decision to adopt the on-device LLM Gemma 4 E2B Q4_0 together with flutter_gemma.
  States the reasons for rejecting five alternatives (cloud / static expansion / Llama / E4B / APK bundling).
- docs/design/215-gemma-frame-suggest/: deliverables that passed the design-spec gate.
  README.md (all 12 sections + AC10 + OQ6 + 15 functions) +
  fn-suggest_frame.md (suggestFrame/buildFewShotPrompt/parseFrameCandidates) +
  fn-model_lifecycle.md (LlmService/GemmaLlmService/ModelLifecycle).
- Graceful degradation throughout: on AI failure, return an empty list without throwing and keep manual input available.
- The LlmService abstraction separates the domain from flutter_gemma (testability).

Refs #215

2026-06-12 11:16:15 +09:00

Function design spec: `suggestFrame` + `buildFewShotPrompt` + `parseFrameCandidates` (#215)

Parent design spec: ./README.md · Status: Draft · Author: [AI] Architect · Implementation: app/lib/domain/ai/suggest_frame.dart, few_shot_builder.dart, parse_response.dart (TBD) · Tests: app/test/domain/ai/{suggest_frame,few_shot_builder,parse_response}_test.dart (TBD)

This document covers the three core domain algorithm functions together. All three are pure functions (or depend only on LlmService), which puts testability first.

§A. `suggestFrame` (main)

1. Signature

Future<List<FrameCandidate>> suggestFrame(
  SuggestFrameInput input, {
  required LlmService llm,
  required List<FramePattern> framePatterns,
  FrameValidator validator = const FrameValidator(),
  Duration timeout = const Duration(seconds: 10),
});

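2. Responsibility (one line)

Sends the raw text to Gemma 4, receives up to 3 L2/L3 frame candidates, and returns them. L0/L1 responses are discarded automatically.

3. Input

Parameter	Type	Constraints / validation	Description
`input.rawText`	String	1 ≤ length ≤ 200, NFC-normalized	Free-form user input
`input.habitType`	enum {build, break}	required	Determines the few-shot matching direction
`input.anchorHint`	String?	nullable	Optional hint such as "아침 양치 후" ("after brushing teeth in the morning")
`llm`	`LlmService`	abstract interface (DI)	flutter_gemma implementation or a mock
`framePatterns`	`List<FramePattern>`	30 seed patterns from #204	Source for dynamic few-shot extraction
`validator`	`FrameValidator`	default is fine	Wrapper around `validateFrameLevel`
`timeout`	Duration	1–30 s	Timeout for LlmService.generateStructured

4. Output

Returns: List<FrameCandidate> with length 0–3.
Side effects: none (pure), apart from the failure-metadata logging in §5 step 4. LlmService is called only through the dependency passed in as an argument.
Graceful: on failure, returns an empty list instead of throwing. The caller (UI provider) decides which message to show.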

5. Behavior / algorithm

1. Validate input bounds
   - if rawText.trim().length is not in [1, 200] → return [] (the caller decides what to show)
   - apply NFC normalization if the text is not already normalized

2. prompt = buildFewShotPrompt(input, framePatterns)
   - see §B

3. JSON schema = FrameCandidate function-calling schema (README §6)

4. try:
     json = await llm.generateStructured(prompt, schema).timeout(timeout)
   catch TimeoutException, StateError, FormatException, Exception:
     log metadata (latency, error type), never the prompt body
     return []

5. candidates = parseFrameCandidates(json)   // see §C
   - on FormatException (malformed response): catch, log, return []

6. validated = candidates.where((c) {
     final result = validator.validate(FrameInput(
       level: c.level,
       framedText: c.framedText,
       originalText: input.rawText,
     ));
     return result.status != FrameStatus.error;  // L0/L1 or hard avoid → discard
   }).toList()
   - if validator.validate itself throws: catch, log, skip that candidate

7. return validated.take(3).toList()
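
The steps above can be sketched in Dart roughly as follows. This is a minimal illustration under stated assumptions, not the planned implementation: `FrameCandidate` is reduced to two fields, `FakeLlm` is a hypothetical test double, the `parse` and `isAcceptable` callbacks stand in for `parseFrameCandidates` and the `FrameValidator` check, and the prompt is a placeholder for `buildFewShotPrompt`. NFC normalization is left out because Dart core has no normalizer.

```dart
import 'dart:async';

// Hypothetical stand-ins for the domain types; the real ones live elsewhere.
class FrameCandidate {
  const FrameCandidate(this.level, this.framedText);
  final String level; // 'L0'..'L3' (the spec uses a FrameLevel enum)
  final String framedText;
}

abstract class LlmService {
  Future<Map<String, dynamic>> generateStructured(
      String prompt, Map<String, dynamic> schema);
}

/// Test double: answers with [response] after [delay], or throws [error].
class FakeLlm implements LlmService {
  FakeLlm({this.response = const {}, this.delay = Duration.zero, this.error});
  final Map<String, dynamic> response;
  final Duration delay;
  final Object? error;
  int calls = 0;

  @override
  Future<Map<String, dynamic>> generateStructured(
      String prompt, Map<String, dynamic> schema) async {
    calls++;
    await Future<void>.delayed(delay);
    if (error != null) throw error!;
    return response;
  }
}

Future<List<FrameCandidate>> suggestFrameSketch(
  String rawText, {
  required LlmService llm,
  required List<FrameCandidate> Function(Map<String, dynamic>) parse,
  required bool Function(FrameCandidate) isAcceptable,
  Duration timeout = const Duration(seconds: 10),
}) async {
  final text = rawText.trim();
  if (text.isEmpty || text.length > 200) return const []; // step 1

  var candidates = const <FrameCandidate>[];
  try {
    final prompt = 'raw_text: "$text"'; // placeholder for buildFewShotPrompt
    final json =
        await llm.generateStructured(prompt, const {}).timeout(timeout);
    candidates = parse(json); // step 5, under the same guard
  } catch (_) {
    // Steps 4-5: swallow everything; log metadata only, never the prompt.
    return const [];
  }

  final validated = <FrameCandidate>[];
  for (final c in candidates) {
    try {
      if (isAcceptable(c)) validated.add(c); // step 6
    } catch (_) {
      // A throwing validator drops only this candidate.
    }
  }
  return validated.take(3).toList(); // step 7
}
```

The bare `catch (_)` is deliberate in this sketch: it covers `Error` subclasses such as `StateError` as well as `Exception` types, which matches the "never throws" contract.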

6. Errors & failure modes

Condition	Handling	Returns
`rawText` empty or over 200 characters	return immediately	`[]`
LlmService timeout	catch → log latency, `error_type=timeout`	`[]`
LlmService throws StateError (model not loaded)	catch → log	`[]`
Response is malformed JSON	`parseFrameCandidates` throws FormatException → catch	`[]`
Every candidate is L0/L1 or hard avoid	normal flow; validated is an empty list	`[]`
validator itself throws	abnormal; catch → log + skip that candidate	partial list

All exceptions are caught: the domain function never throws (graceful). When the list is empty, the caller decides the UI message.
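
One Dart detail matters for the catch-everything rule: `StateError` is an `Error`, not an `Exception`, so an `on Exception` clause alone would let it escape. The sketch below shows the difference; `handledBy` and `orEmpty` are hypothetical helper names used only for illustration.

```dart
import 'dart:async';

/// Reports which clause handled whatever [body] threw ('none' if nothing).
String handledBy(void Function() body) {
  try {
    body();
    return 'none';
  } on Exception {
    return 'on Exception';
  } catch (_) {
    return 'bare catch';
  }
}

/// Graceful wrapper in the spirit of this section: never throws.
List<T> orEmpty<T>(List<T> Function() body) {
  try {
    return body();
  } catch (_) {
    // Covers Exception and Error alike.
    return <T>[];
  }
}
```

`TimeoutException` and `FormatException` land in the `on Exception` branch, while `StateError` reaches only the bare `catch`, which is why a bare `catch` (or an explicit `on StateError`) is needed to honor the contract.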

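7. Edge cases

rawText = " " (whitespace) → length is 0 after trim → [].
rawText contains only code, emoji, or English → it goes into the prompt as-is and the model attempts a Korean response. Output quality may be low; this is covered by the AC-10 evaluation.
framePatterns = [] (seed not loaded) → buildFewShotPrompt uses only the fallback system prompt; warn about degraded quality.
habitType = break but the raw text is closer to a build pattern → few-shot matching is weak. The model attempts to frame it in the break direction.
LlmService returns different responses for the same call → expected. There is no cache.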

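8. Complexity / performance

Call frequency: only when the user taps "AI suggest". Throttled to 5 calls per session.
Time: cold start 1–3 s (including model load), warm 0.5–2 s. The function itself (excluding the LlmService call) is O(N), where N = framePatterns length = 30; in practice < 5 ms.
Space: prompt string ≈ 2–4 KB. JSON response ≈ 1 KB.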
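
The 5-calls-per-session throttle could be as small as a counter. The sketch below is an assumption about its shape: the spec does not say where the limit is enforced (presumably in the caller, since `suggestFrame` takes no throttle parameter), and `SessionThrottle` is a hypothetical name.

```dart
/// Hypothetical per-session limiter for the "AI suggest" action.
class SessionThrottle {
  SessionThrottle({this.maxCalls = 5});
  final int maxCalls;
  int _used = 0;

  /// Returns true and consumes one slot if a call is still allowed.
  bool tryAcquire() {
    if (_used >= maxCalls) return false;
    _used++;
    return true;
  }

  /// Slots left in the current session.
  int get remaining => maxCalls - _used;

  /// Call when a new session starts.
  void reset() => _used = 0;
}
```

A caller would check `tryAcquire()` before invoking `suggestFrame` and fall back to manual input once it returns false.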

9. Dependencies

Called functions: buildFewShotPrompt (§B), parseFrameCandidates (§C), validator.validate (= wrapper around validateFrameLevel, #204).
External API: LlmService.generateStructured (interface; implementation = GemmaLlmService).
Models: FramePattern (#204 catalog), FrameCandidate (domain), SuggestFrameInput (domain).

10. Test cases

Happy path: rawText="술 끊고 싶어" ("I want to quit drinking"), habitType=break, mock LlmService returns 3 valid JSON candidates → result.length == 3, all L2/L3.
L0/L1 discard: mock response with one L1 + two L2 → result.length == 2.
Timeout: mock LlmService uses Future.delayed(15s) → timeout at 10 s → [].
Malformed JSON: mock response {"foo": "bar"} → parseFrameCandidates throws → catch → [].
Empty rawText: rawText: " " → LlmService is not called, [].
rawText > 200 characters: 201-character input → [].
Empty framePatterns: LlmService is still called, but with the fallback prompt. With a mocked response, normal behavior is guaranteed.
LlmService throws StateError (model not loaded): catch → [].
Non-blocking guarantee: no exception case throws (assert no exception thrown).

11. Traceability

Acceptance criteria: #215 AC-6, AC-7, AC-9 (graceful).
Related ADR: ADR-0003 (on-device LLM + function calling + dynamic few-shot extraction).

§B. `buildFewShotPrompt`

1. Signature

String buildFewShotPrompt(
  SuggestFrameInput input,
  List<FramePattern> framePatterns, {
  int maxFewShot = 5,
});

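2. Responsibility (one line)

Extracts the top N entries from the FramePattern catalog by keyword match against the raw text and habit type, and returns a prompt string composed of system, few-shot, and user sections.

3. Input

Parameter	Type	Constraints / validation	Description
`input`	`SuggestFrameInput`	validated input	rawText, habitType, anchorHint
`framePatterns`	`List<FramePattern>`	assumes 30 seed patterns	matching pool
`maxFewShot`	int	1–10	number of top-N few-shot examples

4. Output

Returns: prompt string (≈ 2–4 KB).
Side effects: none (pure). The result is determined by the input arguments alone.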

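5. Behavior / algorithm

1. tokens = word tokenization of rawText (whitespace + lightweight Korean morpheme handling)
   - no morphological analyzer; regex split, keeping only Korean substrings of length ≥ 2

2. scored = framePatterns
   .where((p) => p.habitType == input.habitType || p.habitType == null)
   .map((p) => MapEntry(p, scoreMatch(tokens, p.keywords)))
   .where((e) => e.value > 0)
   .toList()
   ..sort((a, b) => b.value.compareTo(a.value))  // descending by score

3. selected = scored.take(maxFewShot).toList()
   - if scored is empty, fall back to 3 arbitrary entries from framePatterns (matching habit_type only)

4. Assemble the prompt (template shown in English translation; the actual prompt text is Korean):
   <SYSTEM>
   You are a Korean-language coach for the Huberman protocol. You convert the user's raw text
   into a Korean sentence in an L2 (conditional positive) or L3 (identity) frame.
   - L2 example: "스트레스 받을 때 책 한 페이지를 펼친다" ("When stressed, I open one page of a book")
   - L3 example: "나는 글을 읽는 사람이다" ("I am a person who reads")
   - L0/L1 (avoidance/negation) forbidden: "안" ("not"), "끊다" ("quit"), "그만두다" ("stop")
   - The response must be the function call emit_frame_candidates(candidates: [...]).

   <FEW_SHOT>
   for p in selected:
     # Example {n}: {p.title}
     L0: {p.level_l0_example}
     L2: {p.level_l2_example}
     L3: {p.level_l3_example}

   <USER>
   habit_type: {input.habitType}
   raw_text: "{input.rawText}"
   anchor_hint: {input.anchorHint ?? "없음"}

   Convert the raw_text above into 3 L2/L3 candidates.

5. return prompt

scoreMatch(tokens, keywords) = size of the intersection of the two lists + a partial-match adjustment for Korean substrings. The implementation starts with a simple set intersection as the scoring formula and refines it after evaluation.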
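
A minimal Dart sketch of steps 1–3, under stated assumptions: `FramePattern` is reduced to three hypothetical fields, `habitType` is a plain string, the score is a containment count rather than the final formula, and ties are broken on `id` (my addition) so that the result stays deterministic even though `List.sort` is not guaranteed to be stable. The fallback takes the first three pool entries rather than a random pick, for the same reason.

```dart
// Hypothetical minimal pattern model; the real FramePattern comes from #204.
class FramePattern {
  const FramePattern(this.id, this.habitType, this.keywords);
  final String id;
  final String? habitType; // 'build', 'break', or null for "any"
  final List<String> keywords;
}

/// Keeps only runs of two or more Hangul syllables (no morphological analysis).
List<String> tokenize(String rawText) => RegExp(r'[가-힣]{2,}')
    .allMatches(rawText)
    .map((m) => m.group(0)!)
    .toList();

/// Baseline score: number of keywords contained in at least one token.
int scoreMatch(List<String> tokens, List<String> keywords) =>
    keywords.where((k) => tokens.any((t) => t.contains(k))).length;

List<FramePattern> selectFewShot(
    String rawText, String habitType, List<FramePattern> patterns,
    {int maxFewShot = 5}) {
  final tokens = tokenize(rawText);
  final pool = patterns
      .where((p) => p.habitType == habitType || p.habitType == null)
      .toList();
  final scored = [
    for (final p in pool) MapEntry(p, scoreMatch(tokens, p.keywords)),
  ].where((e) => e.value > 0).toList();
  // Score descending; tie-break on id so the order is deterministic.
  scored.sort((a, b) {
    final byScore = b.value.compareTo(a.value);
    return byScore != 0 ? byScore : a.key.id.compareTo(b.key.id);
  });
  if (scored.isEmpty) return pool.take(3).toList(); // deterministic fallback
  return scored.take(maxFewShot).map((e) => e.key).toList();
}
```

Note that under the length ≥ 2 rule a one-syllable word such as "술" never becomes a token, so single-syllable keywords would need separate handling in a real implementation.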

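6. Errors & failure modes

Condition	Handling	Returns
framePatterns is empty	fallback: system + user only (few-shot section omitted)	shortened prompt
rawText is empty	the caller (`suggestFrame`) validates beforehand; this function still returns a prompt regardless	prompt with empty user_input
no keyword matches	fall back to 3 arbitrary entries (matching habitType)	normal prompt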
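
The prompt assembly, including the shortened form without a few-shot section, might look like the sketch below. `FewShotExample` and `assemblePrompt` are hypothetical names, the system text is passed in rather than hard-coded, and the closing instruction line of the template is omitted for brevity.

```dart
/// Hypothetical few-shot example record (field names are illustrative).
class FewShotExample {
  const FewShotExample(this.title, this.l0, this.l2, this.l3);
  final String title, l0, l2, l3;
}

String assemblePrompt({
  required String systemText,
  required List<FewShotExample> selected,
  required String habitType,
  required String rawText,
  String? anchorHint,
}) {
  final b = StringBuffer()
    ..writeln('<SYSTEM>')
    ..writeln(systemText)
    ..writeln();
  // With no examples the whole section is left out (shortened prompt).
  if (selected.isNotEmpty) {
    b.writeln('<FEW_SHOT>');
    for (var i = 0; i < selected.length; i++) {
      final p = selected[i];
      b
        ..writeln('# 예시 ${i + 1}: ${p.title}') // "Example n"
        ..writeln('L0: ${p.l0}')
        ..writeln('L2: ${p.l2}')
        ..writeln('L3: ${p.l3}');
    }
    b.writeln();
  }
  b
    ..writeln('<USER>')
    ..writeln('habit_type: $habitType')
    ..writeln('raw_text: "$rawText"')
    ..writeln("anchor_hint: ${anchorHint ?? '없음'}"); // '없음' = "none"
  return b.toString();
}
```

Because the function only concatenates its arguments, calling it twice with the same input yields the same string, which is the determinism property the test cases rely on.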

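7. Edge cases

No Korean morphological analyzer → many false negatives in keyword matching. This is the v1 baseline; adopting mecab-ko is under consideration for v1.1.
The same raw text submitted twice is handled deterministically → no cache; the same prompt is generated every time.
anchorHint length blow-up (the user enters 100 characters) → bloated prompt. The UI limits it to ≤ 50 characters.

8. Complexity / performance

O(N × M), where N = framePatterns length (30) and M = average keyword count (≈ 3). In practice < 5 ms.
Prompt string concatenation: O(L), where L = total length (≈ 4 KB).

9. Dependencies

Only the SuggestFrameInput and FramePattern domain models.
Dart core (String, List).
Zero external dependencies: a pure function.

10. Test cases

Normal match: rawText="술 끊고" ("quit drinking"), patterns contain 3 alcohol-related + 5 exercise-related entries → the first 3 of selected are alcohol-related.
Fallback: rawText="xyz unknown", 0 matches → falls back to 3 arbitrary entries with habit_type=break.
Empty patterns: framePatterns=[] → prompt without the few-shot section, L2/L3 guidance only.
anchorHint null: the prompt explicitly contains "anchor_hint: 없음" ("none").
maxFewShot=1: selected.length = 1.
Determinism: same input twice → same output string.
NFC: if rawText arrives in NFD form, the caller is responsible for normalizing (this function assumes it has been done).

11. Traceability

Acceptance criteria: #215 AC-6 (dynamic few-shot extraction), AC-10 (Korean quality).
Related ADR: ADR-0003 (SoT few-shot dynamic extraction principle).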

§C. `parseFrameCandidates`

1. Signature

List<FrameCandidate> parseFrameCandidates(Map<String, dynamic> json);

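2. Responsibility (one line)

Converts the function-calling JSON response into a List<FrameCandidate>. Throws FormatException on a format violation.

3. Input

Parameter	Type	Constraints / validation	Description
`json`	`Map<String, dynamic>`	function-calling response	assumes the structure `{"candidates": [...]}`

4. Output

Returns: List<FrameCandidate>, normally 0–3 items. Extra candidates are not truncated here; `suggestFrame` applies take(3).
Side effects: none (pure).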

5. Behavior / algorithm

1. raw = json['candidates']
   - null → throw FormatException("candidates missing")
   - not a List → throw FormatException("candidates not array")

2. result = []
3. for each item in raw:
   - levelStr = item['level'] as String? ?? throw FormatException
   - level = FrameLevel.parse(levelStr)   // L0/L1/L2/L3 enum
     - unknown value → skip (log)
   - framedText = item['framed_text'] as String? ?? throw FormatException
     - trim; skip if length is not in [1, 120]
   - confidence = item['confidence'] is num ? (item['confidence'] as num).toDouble() : 0.5
     - clamp(0.0, 1.0)
   - sourcePatternId = item['source_pattern_id'] as String?  // optional
   - result.add(FrameCandidate(level, framedText, confidence, sourcePatternId))

4. return result

Discarding L0/L1 is done not in this function but in the caller, suggestFrame, via validateFrameLevel. parseFrameCandidates validates format only.
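
The parsing rules can be sketched in Dart as follows. The types are hypothetical stand-ins for the domain models, `parseLevel` replaces `FrameLevel.parse` and returns null for unknown values, and the "candidate not object" check is an assumption of mine, since the spec does not say what happens when an item is not a map. Type tests (`is String`, `is num`) are used instead of `as` casts so that a wrongly typed field ends in a FormatException or a fallback rather than a TypeError.

```dart
enum FrameLevel { l0, l1, l2, l3 }

class FrameCandidate {
  const FrameCandidate(this.level, this.framedText, this.confidence,
      [this.sourcePatternId]);
  final FrameLevel level;
  final String framedText;
  final double confidence;
  final String? sourcePatternId;
}

// Hypothetical replacement for FrameLevel.parse: null for unknown values.
FrameLevel? parseLevel(String s) => const {
      'L0': FrameLevel.l0,
      'L1': FrameLevel.l1,
      'L2': FrameLevel.l2,
      'L3': FrameLevel.l3,
    }[s];

List<FrameCandidate> parseFrameCandidatesSketch(Map<String, dynamic> json) {
  final raw = json['candidates'];
  if (raw == null) throw const FormatException('candidates missing');
  if (raw is! List) throw const FormatException('candidates not array');

  final result = <FrameCandidate>[];
  for (final item in raw) {
    if (item is! Map) throw const FormatException('candidate not object');
    final levelStr = item['level'];
    if (levelStr is! String) throw const FormatException('level missing');
    final level = parseLevel(levelStr);
    if (level == null) continue; // unknown level such as "L99": skip

    final text = item['framed_text'];
    if (text is! String) throw const FormatException('framed_text missing');
    final framed = text.trim();
    if (framed.isEmpty || framed.length > 120) continue; // length violation

    final c = item['confidence'];
    final double confidence =
        (c is num ? c.toDouble() : 0.5).clamp(0.0, 1.0).toDouble();
    final id = item['source_pattern_id'];
    result.add(
        FrameCandidate(level, framed, confidence, id is String ? id : null));
  }
  return result;
}
```

L0 and L1 items pass through unchanged here; filtering them stays with the caller, as described above.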

6. Errors & failure modes

Condition	Handling	Returns / exception
json has no candidates key	throw	FormatException("candidates missing")
candidates is not a List	throw	FormatException("candidates not array")
item is missing level	throw	FormatException(...)
level value outside the enum ("L99")	skip item + log	partial list
framed_text length violation	skip item + log	partial list
confidence is not a number	fall back to 0.5	continues normally

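7. Edge cases

candidates: [] → returns an empty list (not an exception).
4 or more candidates returned → all are parsed. suggestFrame applies take(3).
Two items with identical framed_text → the duplicates are returned as-is. Deduplication is up to the caller.
Unicode emoji included → allowed (length is counted in UTF-16 code units, not grapheme clusters).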
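
The effect of counting UTF-16 code units shows up with emoji outside the Basic Multilingual Plane, which take two units each. A small sketch (the helper names are hypothetical):

```dart
/// Length check as described above: UTF-16 code units, via String.length.
bool withinFramedTextLimit(String s, {int max = 120}) {
  final t = s.trim();
  return t.isNotEmpty && t.length <= max;
}

/// Code-point count, for comparison (still not grapheme clusters).
int codePointCount(String s) => s.runes.length;
```

Here `'👍'.length` is 2 while `codePointCount('👍')` is 1, so 60 such emoji already fill the 120 limit. Hangul syllables are one code unit each. Counting by grapheme cluster would need something like the `characters` package, which this spec does not adopt.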

8. Complexity / performance

O(N), where N = candidates length (usually 3).
In practice < 1 ms.

9. Dependencies

FrameCandidate and FrameLevel domain models.
Dart core (Map, List).

10. Test cases

Happy path: 3 valid items → length 3.
candidates missing: {"foo": "bar"} → throw FormatException.
candidates not a list: {"candidates": "string"} → throw.
L0 + L2 + L3 mix: all parsed (discarding L0 is the caller's responsibility).
Unknown level "L99": skip → length 2 (2 of 3).
framed_text longer than 120: skip.
confidence missing: falls back to 0.5.
confidence -0.1: clamped to 0.0.
Empty candidates list: [] → returns an empty list (no exception).
Emoji included: parsed normally.

11. Traceability

Acceptance criteria: #215 AC-7 (function-calling JSON parsing combined with L0/L1 discarding).
Related ADR: ADR-0003 (function calling enforced).
