joungmin 1fa4f24a8a [02-Architect] #311 design spec + UX-Reviewer persona for LLM warm-up

- docs/design/311-llm-warmup/README.md: feature design doc. ChatWarmupController (5-state) + GemmaLlmService _loadingFuture concurrent guard + ModelLifecycle.quickCheck (lightweight ready).
- docs/design/311-llm-warmup/UX-REVIEW.md: UX-Reviewer parallel pass. Recommendations: 4 Strong + 2 Suggest. Keep the input field enabled (typing allowed), swap only the hintText, and separate state from action.
- docs/design/311-llm-warmup/fn-chat_warmup_controller.md: start/retry details + fast path (skip Loading when isLoaded).
- docs/design/311-llm-warmup/fn-concurrent_load_guard.md: _loadingFuture pattern + whenComplete cleanup.
- .claude/agents/ux-reviewer.md: new persona (parallel reviewer within the 02-Architect stage; no category assigned).

AC 8 → 12 (4 new UX items merged). All 3 OQs resolved. No ADR (backward-compatible addition).

Refs #311 #260

2026-06-15 11:41:03 +09:00

Function design doc: `GemmaLlmService.load` concurrent guard (#311)

Parent design doc: ./README.md · Status: Draft · Author: [AI] Architect · Implementation: app/lib/data/ai/gemma_llm_service.dart:load (modified) · Tests: app/test/data/ai/gemma_llm_service_test.dart (concurrent cases added) / chat_warmup_test.dart (simulation)

1. Signature

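class GemmaLlmService implements LlmService {
  Future<void>? _loadingFuture;   // new field

  @override
  Future<void> load() {
    // Already loaded: nothing to do.
    if (_loaded) return Future.value();
    // A load is in flight: share it instead of starting another.
    final existing = _loadingFuture;
    if (existing != null) return existing;
    // Store the whenComplete result itself so every caller receives
    // the same Future instance and the field is cleared on success or failure.
    final future = _doLoad().whenComplete(() {
      _loadingFuture = null;
    });
    _loadingFuture = future;
    return future;
  }

  Future<void> _doLoad() async {
    // Existing load() body (initialize → installModel → getActiveModel).
  }
}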

MockLlmService.load() applies the same pattern (a _loadingFuture field is added), so that concurrency checks in tests behave consistently with the real service.
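As an illustration of how the mock could mirror the guard, here is a minimal sketch. The `LlmService` interface shape, the `loadDelay` parameter, the `doLoadCalls` counter, and the `isLoaded` getter are assumptions made for this example; the project's real interface and mock may differ.

```dart
import 'dart:async';

// Hypothetical minimal interface; the real LlmService likely has more members.
abstract class LlmService {
  Future<void> load();
}

class MockLlmService implements LlmService {
  MockLlmService({this.loadDelay = const Duration(milliseconds: 10)});

  final Duration loadDelay; // simulated native init time (assumption)
  int doLoadCalls = 0; // test-only counter (assumption)
  bool _loaded = false;
  Future<void>? _loadingFuture;

  bool get isLoaded => _loaded;

  @override
  Future<void> load() {
    if (_loaded) return Future.value();
    final existing = _loadingFuture;
    if (existing != null) return existing;
    // Keep the guarded future so all callers share one instance.
    final future = _doLoad().whenComplete(() {
      _loadingFuture = null;
    });
    _loadingFuture = future;
    return future;
  }

  Future<void> _doLoad() async {
    doLoadCalls++;
    await Future<void>.delayed(loadDelay); // stands in for the native work
    _loaded = true;
  }
}

Future<void> main() async {
  final service = MockLlmService();
  await service.load(); // normal path: runs _doLoad once
  await service.load(); // already loaded: completes without new work
  print('loaded=${service.isLoaded} doLoadCalls=${service.doLoadCalls}');
  // prints "loaded=true doLoadCalls=1"
}
```

The counter lets a unit test assert how many times the simulated init ran without touching the native runtime.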

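2. Responsibility (single responsibility)

While a load() is in progress, a call from another caller does not start new work; it returns the same Future. This keeps the native runtime's FlutterGemma.installModel + getActiveModel from being invoked twice.

3. Input

None (the method takes no parameters).

4. Output

Returns: Future<void>, the completion future of the single native init operation. All callers share the same instance.
Side effects: mutates the _loadingFuture, _loaded, and _model fields.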

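5. Behavior / algorithm

1. _loaded == true  → return an already-completed Future immediately.
2. _loadingFuture != null → return that future as-is. (No new work is started.)
3. Otherwise:
   a. future = _doLoad().whenComplete(() => _loadingFuture = null);
   b. _loadingFuture = future;
   c. return future;

Storing and returning the whenComplete result, rather than the raw _doLoad() future, keeps every caller on one Future instance and leaves no unobserved error future behind when _doLoad() fails.

_doLoad() body = the existing load() body unchanged (initialize → installModel → getActiveModel → _loaded = true).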
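The steps above can be exercised with a small standalone sketch. `GuardedLoader`, its injected `_nativeInit` callback, and the `doLoadCalls` counter are hypothetical names used only for this illustration; a `Completer` stands in for the native init so that both callers arrive while the load is still in flight.

```dart
import 'dart:async';

class GuardedLoader {
  GuardedLoader(this._nativeInit);

  final Future<void> Function() _nativeInit; // stand-in for the native init
  bool _loaded = false;
  Future<void>? _loadingFuture;
  int doLoadCalls = 0;

  Future<void> load() {
    if (_loaded) return Future.value(); // step 1: already loaded
    final existing = _loadingFuture;
    if (existing != null) return existing; // step 2: share the in-flight load
    final future = _doLoad().whenComplete(() {
      _loadingFuture = null; // step 3: clear on success or failure
    });
    _loadingFuture = future;
    return future;
  }

  Future<void> _doLoad() async {
    doLoadCalls++;
    await _nativeInit();
    _loaded = true;
  }
}

Future<void> main() async {
  final gate = Completer<void>();
  final loader = GuardedLoader(() => gate.future);

  final a = loader.load(); // starts the single _doLoad
  final b = loader.load(); // arrives while the load is still pending
  print(identical(a, b)); // true: both callers hold the same Future

  gate.complete(); // let the simulated init finish
  await Future.wait([a, b]);
  await loader.load(); // step 1 fast path, no new work
  print(loader.doLoadCalls); // 1
}
```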

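6. Errors & failure modes

| Condition | Handling | Return / exception |
| --- | --- | --- |
| `_doLoad()` throws | `whenComplete` sets `_loadingFuture = null`, then the error propagates | All callers receive the same exception |
| caller B calls while caller A is awaiting | Same future returned (step 2) | Both complete or fail identically |
| Retry after the first call fails | `_loadingFuture` has been cleared to `null`, so the next call starts a new `_doLoad()` | Retry works normally |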
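The failure rows can be illustrated with a sketch in which the first attempt throws and the second succeeds. `FlakyLoader`, `attempts`, and the `errorOf` helper are hypothetical names for this example, and the `StateError` merely simulates a native init failure.

```dart
import 'dart:async';

class FlakyLoader {
  bool _loaded = false;
  Future<void>? _loadingFuture;
  int attempts = 0;

  bool get isLoaded => _loaded;

  Future<void> load() {
    if (_loaded) return Future.value();
    final existing = _loadingFuture;
    if (existing != null) return existing;
    final future = _doLoad().whenComplete(() {
      _loadingFuture = null; // cleared even when _doLoad throws
    });
    _loadingFuture = future;
    return future;
  }

  Future<void> _doLoad() async {
    attempts++;
    await Future<void>.delayed(const Duration(milliseconds: 5));
    if (attempts == 1) throw StateError('simulated init failure');
    _loaded = true;
  }
}

// Awaits [f] and returns the error it failed with, or null on success.
Future<Object?> errorOf(Future<void> f) async {
  try {
    await f;
    return null;
  } catch (e) {
    return e;
  }
}

Future<void> main() async {
  final loader = FlakyLoader();

  // Two concurrent callers share the failing load.
  final errors = await Future.wait([
    errorOf(loader.load()),
    errorOf(loader.load()),
  ]);
  print(identical(errors[0], errors[1])); // true: same exception object

  // The guard was cleared, so a retry starts a fresh _doLoad.
  await loader.load();
  print('attempts=${loader.attempts} loaded=${loader.isLoaded}');
  // prints "attempts=2 loaded=true"
}
```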

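7. Edge cases

load() vs. unload() race: caller A calls load(), and while it is in progress caller B calls unload(). unload() could then call _model.close() right after _doLoad() sets _model. Out of scope for this issue: no code path currently calls unload() (to be covered by #219). This design addresses only concurrent load() calls.
whenComplete timing: when load() takes the already-loaded branch (step 1), it returns early and never touches _loadingFuture, so the field stays null. On every other path, whenComplete resets _loadingFuture to null whether _doLoad() succeeds or fails.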

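8. Complexity / performance

Time: first call = the existing _doLoad cost. Subsequent callers = O(1), sharing the future.
Space: one future plus one nullable field.
Call frequency: ChatScreen mount, the first userTurn call, and frame suggestion (#215); each occurs only a few times over the app's lifetime.

9. Dependencies

flutter_gemma FlutterGemma.initialize / installModel / getActiveModel (existing).
_loaded / _model fields (existing).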

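10. Test cases

Normal: load() called once → _doLoad() runs once.
Concurrent: two load() calls issued before the first completes → _doLoad() runs only once, and both calls return the same Future instance.
Retry after failure: the first _doLoad throws → the error propagates to caller A → _loadingFuture is cleared → a second load() starts a new _doLoad.
isLoaded already true: load() → completes immediately, _doLoad is not executed.

The native Gemma runtime can only be verified in integration tests. Unit tests simulate it with the identical guard in MockLlmService.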

11. Traceability

Acceptance criterion: AC7.
Related follow-up: #220 (purge try/catch, in the same spirit).
Related ADR: none.
