背景与价值多语言文本的词边界差异显著;使用原生分词提升选择与高亮的语义准确性。基本用法function segmentText(text: string, locale = 'zh', granularity: 'grapheme'|'word'|'sentence' = 'word') { const seg = new (Intl as any).Segmenter(locale, { granularity }); return Array.from(seg.segment(text)); } 高亮与选择function findWordAt(text: string, index: number, locale = 'zh') { const seg = new (Intl as any).Segmenter(locale, { granularity: 'word' }); let pos = 0; for (const s of seg.segment(text)) { const start = s.index; const end = s.index + s.segment.length; if (index >= start && index < end) return { start, end }; pos++; } return { start: 0, end: 0 }; } 指标验证(Chrome 128/Edge 130)分词准确率:中英混合文本 ≥ 98%。选择准确性:编辑与高亮边界匹配率 ≥ 97%。性能:大段文本分词耗时(P95) ≤ 12ms。回退策略不支持环境:使用轻量库或基于空格与标点的启发式切分。测试清单多语言与混合文本:分词与选择边界正确;高亮行为合理。

发表评论 取消回复