Janggi AI Progress Log

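This is the public reference document. It drops the completed long-form logs and keeps only the core gates of Main Plan 1 that still have to be closed. The next execution priority is the new 1~7 workflow. The point is not to widen the scalar evaluator but to close the evidence of playing strength in this order: C++ root candidate ledger, deep margin oracle, pairwise/listwise root ranking, bounded runtime integration, root-transfer sentinel, same-time long gate.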

Latest status: 2026-05-13 · 1~7 workflow applied · Next baseline: restart from the root candidate ledger · Completed work keeps core evidence only · Legacy long-form logs deleted/compressed · promotionAllowed=false · runtimeReady=false

Current conclusion

Main Plan 1 is currently at step 3

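The step 1 rule/result corpus and the step 2 tactical replay are preserved as completed evidence. What must be proven now is that the NNUE/strong evaluator actually improves root selection in real search. Holdout performance or scalar TSV improvements alone do not justify promotion.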

Not yet a strength promotion

The 3-0 contract and the 3-A root-decision target ledger have been generated, but step 3 still has no positive searched root-decision transfer. The step 4 support-on/off ablation and the step 5 same-time 250ms/500ms/1500ms long gate also remain, so promotionAllowed=false and runtimeReady=false stay in place.

Next execution workflow: 1~7

Current verdict

Existing 3-0/3-A artifacts downgraded to design and diagnostic use

We generated main_plan1_remaining_3_0_target_contract_pre_gate_v1.json, main_plan1_remaining_3_a_root_decision_objective_targets_v1.json, and main_plan1_remaining_3_a_root_decision_objective_targets_v1.jsonl. The current inventory holds 600 target rows, 1,215 candidate move refs, and 997 hard-negative refs. However, 480 of the rows come from a shallow/tie oracle and 120 from a recorded-move proxy, and only 10 rows show a positive improvement. These artifacts are therefore not used directly as training labels. They serve only as a pool of hard negatives and design diagnostics.

diagnostic inventory · weak oracle · promotion ban maintained

3-0-R0: C++ Raw Root Candidate Ledger

Targets are not assembled from the existing comparison summaries. Instead, raw root candidates are collected afresh per position from the C++ search JSON. Each row must carry the full legal root list, searched survivors, rejected/pruned reasons, per-depth scores, margin, tie status, terminal/tactical family, root bonus/penalty, and the exact search controls.

full candidate list · no 1-candidate rows · no assembly from summaries
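A ledger row with the fields listed above could be modelled roughly as follows. This is only a sketch: the class names, field names, and move notation are hypothetical and are not taken from the actual C++ search JSON schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RootCandidate:
    """One legal root move as reported by the search (hypothetical fields)."""
    move: str                      # move notation is an assumption
    searched: bool                 # True if the move survived to be searched
    reject_reason: Optional[str]   # rejected/pruned reason when not searched
    score_by_depth: Dict[int, int] = field(default_factory=dict)
    root_bonus: int = 0
    root_penalty: int = 0


@dataclass
class LedgerRow:
    """One position together with its full legal root candidate list."""
    position_signature: str
    game_id: str
    family: str                       # terminal/tactical family tag
    search_controls: Dict[str, int]   # exact controls, e.g. {"depth": 4}
    candidates: List[RootCandidate]

    def best_margin(self, depth: int) -> Optional[int]:
        """Score gap between the two best searched candidates at `depth`."""
        scores = sorted(
            (c.score_by_depth[depth] for c in self.candidates
             if c.searched and depth in c.score_by_depth),
            reverse=True,
        )
        if len(scores) < 2:
            return None  # a margin needs at least two searched candidates
        return scores[0] - scores[1]

    def is_tie(self, depth: int) -> bool:
        """True only when two searched candidates share the best score."""
        return self.best_margin(depth) == 0
```

Keeping rejected candidates in the same list as searched ones, with a reason attached, is what lets later steps tell "never searched" apart from "searched and lost".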

Ledger Validator / Leakage Split

On the candidate ledger, zero overlap at the game, position, and position+move level is guaranteed again. Rows with a missing candidate list, a single-candidate target, a missing position signature, split leakage, or missing oracle provenance are excluded from hard-preference training.

leakage 0 · split re-frozen · fail-closed
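The exclusion rules and the zero-overlap check can be sketched as below. The dictionary keys and function names are illustrative assumptions, and candidates are simplified to plain move strings. Raising an error on any overlap is one possible reading of "fail-closed".

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical key names; the real ledger schema may differ.
REQUIRED_KEYS = ("game_id", "position_signature", "candidates", "oracle_provenance")


def row_defects(row: Dict) -> List[str]:
    """List the reasons a row may not be used as a hard preference."""
    defects = [f"missing:{key}" for key in REQUIRED_KEYS if not row.get(key)]
    if len(row.get("candidates") or []) == 1:
        defects.append("single_candidate")
    return defects


def _keys(rows: Iterable[Dict]) -> Tuple[Set, Set, Set]:
    games, positions, pos_moves = set(), set(), set()
    for r in rows:
        games.add(r["game_id"])
        positions.add(r["position_signature"])
        for move in r["candidates"]:
            pos_moves.add((r["position_signature"], move))
    return games, positions, pos_moves


def split_overlap(train: Iterable[Dict], held_out: Iterable[Dict]) -> Dict[str, int]:
    """Count game, position, and position+move overlap between two splits."""
    a, b = _keys(train), _keys(held_out)
    return {
        "game": len(a[0] & b[0]),
        "position": len(a[1] & b[1]),
        "position_move": len(a[2] & b[2]),
    }


def hard_preference_rows(train: List[Dict], held_out: List[Dict]) -> List[Dict]:
    """Return clean training rows, or fail closed if any leakage exists."""
    clean_train = [r for r in train if not row_defects(r)]
    clean_held = [r for r in held_out if not row_defects(r)]
    overlap = split_overlap(clean_train, clean_held)
    if any(overlap.values()):
        raise ValueError(f"split leakage detected: {overlap}")
    return clean_train
```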

Deep Margin Oracle

A deeper fixed-depth oracle is attached only to accepted candidates. Depth stability, score margin, and tactical/failure-family verification decide accept versus reject. Tie, unknown, and recorded-only rows are kept only as soft/diagnostic rows, never as hard positives for training.

winner provenance · no hard labels for ties · recorded moves are not treated as ground truth
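One way to express the accept/reject split is a small classifier over per-depth oracle results. The thresholds (`min_margin`, `min_stable_depths`), the provenance strings, and the three return labels are assumptions made for illustration. The tactical/failure-family verification is left out of this sketch.

```python
from typing import Dict


def classify_oracle(
    best_move_by_depth: Dict[int, str],
    margin_by_depth: Dict[int, int],
    provenance: str,
    min_margin: int = 80,          # hypothetical decisive-margin threshold
    min_stable_depths: int = 2,    # hypothetical depth-stability requirement
) -> str:
    """Return 'hard', 'soft', or 'reject' for one oracle row."""
    # Recorded-only or unknown provenance can never become a hard label.
    if provenance in ("recorded_only", "unknown"):
        return "soft"
    depths = sorted(best_move_by_depth)
    if len(depths) < min_stable_depths:
        return "reject"
    deepest = depths[-min_stable_depths:]
    winner = best_move_by_depth[deepest[-1]]
    # Depth stability: the same winner at each of the deepest depths.
    if any(best_move_by_depth[d] != winner for d in deepest):
        return "reject"
    margin = margin_by_depth.get(deepest[-1], 0)
    # A tie (margin 0) or a sub-threshold margin stays soft/diagnostic.
    if margin < min_margin:
        return "soft"
    return "hard"
```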

3-A Rebuild: Decisive Root Preference Targets

3-A is regenerated as +1 / 0 / -1 ranking targets per candidate move, not as board-value prediction. Decisive margins are used as pairwise/listwise hard preferences, and ambiguous/tie rows are split off as evaluation-only or soft targets.

pairwise/listwise · ambiguous rows separated · no promotion from scalar fit
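A minimal sketch of turning per-candidate oracle scores into +1 / 0 / -1 labels and hard pairs is shown below. The labelling rule is an assumption: the best move gets +1, moves trailing by at least `decisive_margin` get -1, and everything in between gets 0.

```python
from typing import Dict, List, Tuple


def rank_labels(scores: Dict[str, int], decisive_margin: int) -> Dict[str, int]:
    """Label each candidate move +1 (best), -1 (decisively worse), or 0."""
    best = max(scores.values())
    labels = {}
    for move, score in scores.items():
        gap = best - score
        if gap == 0:
            labels[move] = 1
        elif gap >= decisive_margin:
            labels[move] = -1
        else:
            labels[move] = 0   # ambiguous: not a hard preference
    return labels


def hard_pairs(labels: Dict[str, int]) -> List[Tuple[str, str]]:
    """Return (preferred, rejected) pairs; none if the best move is tied."""
    winners = [m for m, label in labels.items() if label == 1]
    losers = [m for m, label in labels.items() if label == -1]
    if len(winners) != 1:
        return []   # tied best moves make the row evaluation-only/soft
    return [(winners[0], loser) for loser in losers]
```

For example, with scores `{"a": 100, "b": 90, "c": -50}` and a decisive margin of 100, only the pair `("a", "c")` becomes a hard preference, and `b` stays a 0-labelled soft candidate.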

3-B: Root Ranking Training

The root candidate order is learned from position + candidate move inputs. Features include move type, from/to, capture/check, reply count, repetition delta, palace/invasion delta, root bonus components, and shallow score margin. The existing 13-feature scalar TSV is treated only as an auxiliary signal.

candidate-aware features · hard-negative cap · no 13-feature-only model
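As an illustration of pairwise root ranking, the sketch below uses a plain linear scorer with a pairwise logistic loss. The model form, the learning rate, and the way the hard-negative cap is applied are all assumptions. The source does not specify the actual model or training setup.

```python
import math
from typing import List, Sequence, Tuple

Features = Sequence[float]   # one candidate-aware feature vector per move


def score(weights: Sequence[float], features: Features) -> float:
    """Linear ranking score for a single candidate move."""
    return sum(w * x for w, x in zip(weights, features))


def pairwise_loss(weights: Sequence[float], preferred: Features, rejected: Features) -> float:
    """Logistic loss that is small when `preferred` outscores `rejected`."""
    diff = score(weights, preferred) - score(weights, rejected)
    return math.log1p(math.exp(-diff))


def sgd_step(weights: Sequence[float], preferred: Features, rejected: Features,
             lr: float = 0.1) -> List[float]:
    """One gradient step on the pairwise logistic loss."""
    diff = score(weights, preferred) - score(weights, rejected)
    grad_diff = -1.0 / (1.0 + math.exp(diff))   # d(loss)/d(diff)
    return [w - lr * grad_diff * (p - r)
            for w, p, r in zip(weights, preferred, rejected)]


def cap_hard_negatives(pairs: List[Tuple[Features, Features]],
                       cap: int) -> List[Tuple[Features, Features]]:
    """Keep at most `cap` hard-negative pairs per position."""
    return pairs[:cap]
```

The cap keeps a single position with many decisively bad moves from dominating the training signal.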

3-C: Bounded C++ Runtime Integration

The learned signal is not a final chooser. It is limited to root ordering support or to extending the verification challenger. It does not override large search-score margins and must act only inside a verified near-tie margin window. The C++ loader must pass feature parity, quantization drift, deterministic load/fail, safe fallback, and node/time overhead cap checks.

bounded support · verification challenger · no hidden chooser
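The bounded behaviour can be illustrated with an ordering helper that lets the learned signal reorder only the candidates inside the near-tie window. It is written in Python for brevity even though the real integration is in C++. The function name, the window parameter, and the fallback rule are assumptions.

```python
from typing import Dict, List, Optional


def ordering_support(
    search_scores: Dict[str, int],
    learned_scores: Optional[Dict[str, float]],
    near_tie_window: int,
) -> List[str]:
    """Return root moves in search order, reordered only inside the near-tie window.

    The result is an ordering hint for search or verification, not a final choice.
    """
    ranked = sorted(search_scores, key=lambda m: (-search_scores[m], m))
    if not learned_scores:
        return ranked   # safe fallback: learned signal failed to load
    best = search_scores[ranked[0]]
    contenders = [m for m in ranked if best - search_scores[m] <= near_tie_window]
    if len(contenders) < 2 or any(m not in learned_scores for m in contenders):
        return ranked   # nothing to reorder, or incomplete learned scores
    rest = [m for m in ranked if m not in contenders]
    reordered = sorted(
        contenders,
        key=lambda m: (-learned_scores[m], -search_scores[m], m),
    )
    # Moves outside the window keep their search order and never move ahead.
    return reordered + rest
```

With search scores `{"a": 100, "b": 95, "c": 20}` and a window of 10, a learned preference for `b` moves it ahead of `a`, while `c` stays last no matter how high its learned score is.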

Root-Transfer Sentinel → Same-Time Long Gate

The long gate is not retried immediately. First, the fixed-depth root-transfer sentinel must confirm searchImproved > searchWorsened, the hard-negative regression cap, passes in the quiet/tactical/failure-family buckets, and zero recurrence of the enriched v3 regression. After that, testing expands to the 250ms mini same-time sentinel and then to the 500ms/1500ms long gate.

root-transfer sentinel · 250→500→1500ms · no promotion before the long gate
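The sentinel conditions and the staged escalation can be sketched as a simple gate check. Apart from the searchImproved/searchWorsened comparison taken from the text above, the field names, bucket names, and stage labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional

REQUIRED_BUCKETS = ("quiet", "tactical", "failure_family")
STAGES = ("fixed_depth", "250ms", "500ms", "1500ms")   # escalation order


@dataclass
class SentinelResult:
    search_improved: int
    search_worsened: int
    hard_negative_regressions: int
    bucket_pass: Dict[str, bool]
    enriched_v3_regressions: int


def sentinel_passes(result: SentinelResult, hard_negative_cap: int) -> bool:
    """All four conditions must hold; a missing bucket counts as a failure."""
    return (
        result.search_improved > result.search_worsened
        and result.hard_negative_regressions <= hard_negative_cap
        and all(result.bucket_pass.get(b, False) for b in REQUIRED_BUCKETS)
        and result.enriched_v3_regressions == 0
    )


def next_stage(passed: Dict[str, bool]) -> Optional[str]:
    """Return the first stage not yet passed, or None when all are passed."""
    for stage in STAGES:
        if not passed.get(stage, False):
            return stage
    return None
```

Because `next_stage` always returns the earliest unpassed stage, a 500ms run cannot be attempted while the fixed-depth sentinel or the 250ms mini sentinel is still open.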

Completed evidence preserved

Step 1: rule/result equivalence corpus

The JS-vs-C++ rule/result corpus passed with 623 cases, 12 families, and 0 critical mismatches. It is preserved only as the basis for trusting the rules and is excluded from the current execution targets.

Step 2: tactical failure-family replay

Under fixed-depth/no-time-noise replay, 62 rows were confirmed, with 0 missing required/neutral/profile regression families and 14 bounded tactical risk reductions. This is preserved only as evidence of tactical stabilization.

1-1~1-6: implementation, smoke, and contract passed

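Rule consistency, PVS/TT, tactical features, NNUE feature smoke, auxiliary-signal authority limits, and the long gate manifest have passed the first round. This is not a strength promotion, however. It is safety gating and groundwork.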

Main Plan 2: speed optimization

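For 2-1~2-5, only a completion summary is kept, covering build, hot path, integer keys, persistent bridge, and safe optimization. This work is supporting infrastructure for running the current step 3 verification quickly.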

Legacy deleted or compressed

Removed from the current execution plan

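The earlier plans/main_plan1_v5 family centered on prior/posterior/transfer/shadow/runtime options, the past limited JS multi-head hookup, and the long explanations of unpromoted C++ baseline/ranker work do not determine the current execution order, so they were deleted or compressed.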

Preserved only as historical evidence

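Past engine screening failures, game-record collection expansion, and long dated experiment logs were removed from the current public progress page. When needed, they are treated as archived information to be checked only in internal artifacts and JSON/MD documents.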

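From now on, the public page shows "what needs to be done now" first. Of the completed detail logs and past experiment lists, only the core evidence is kept, to the extent that it does not cloud the current judgment.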