[Day 9] A local Japanese sentiment AI (BERT) read 8 years of a LINE chat, and the ups and downs surfaced from numbers alone

Intro

Day 9. Today is less about model internals and more of a personal experiment: have a local AI analyze the entire chat history with one LINE friend. (LINE is the dominant messaging app in Japan.) When I exported it, 8 years were sitting there, from the very first message to today. It started, we talked a lot, it went quiet for a while, then picked up again. That whole arc is in there. Because the content is private, nothing left my machine: everything ran locally on my DGX Spark.

What I used: my home AI box (DGX Spark) + a Japanese sentiment model (for tone) + a bigger local model (to guess events from numbers).

Today’s setup

What I wanted to do: re-reading 8 years of messages one by one isn’t realistic. So instead of reading the content, I looked only at the “shape” of the conversation: when, how much, and in what tone we talked. Concretely: monthly message volume, the trend of tone (positive / negative), then asking an AI to find “when something big happened.”

Heads-up (the result)

From message counts and tone alone, the 8-year arc came out clearly on a chart. Started, went quiet, came back: the flow was visible without me re-reading a thing.

🔧 Pipeline

LINE chat export (text)
  │
  ▼
1. Parse: split each message into {datetime, who, type, text}
  │   (from here on, message text never leaves the machine)
  ▼
2. Aggregate: monthly counts, time-of-day, reply gaps
  │
  ▼
3. Tone scoring: classify each of 66k messages pos/neu/neg
  │
  ▼
4. Turning-point detection: from sudden changes in the numbers
  │   + also show ONLY the numbers to a bigger AI and ask it to guess
  ▼
5. Answer check: compare against the real timeline.
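To make step 2 concrete, here is a minimal sketch of the aggregation, assuming step 1 has already produced a list of dicts with the {datetime, who, type, text} fields named in the pipeline. The function names and the rule "a reply gap is the time between consecutive messages from different senders" are my own assumptions for illustration, not necessarily what the original script does.

```python
from collections import Counter
from datetime import datetime


def monthly_counts(messages):
    """Count messages per calendar month, keyed as 'YYYY-MM'."""
    return Counter(m["datetime"].strftime("%Y-%m") for m in messages)


def reply_gaps_minutes(messages):
    """Minutes between consecutive messages whenever the sender changes.

    Assumption: a 'reply' is any message that follows one from the other person.
    """
    ordered = sorted(messages, key=lambda m: m["datetime"])
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if prev["who"] != cur["who"]:
            delta = cur["datetime"] - prev["datetime"]
            gaps.append(delta.total_seconds() / 60)
    return gaps


# Tiny made-up sample (not real chat data).
sample = [
    {"datetime": datetime(2020, 1, 5, 7, 30), "who": "A", "type": "text", "text": "..."},
    {"datetime": datetime(2020, 1, 5, 7, 42), "who": "B", "type": "text", "text": "..."},
    {"datetime": datetime(2020, 1, 5, 7, 43), "who": "B", "type": "sticker", "text": ""},
    {"datetime": datetime(2020, 2, 1, 8, 0), "who": "A", "type": "text", "text": "..."},
]
counts = monthly_counts(sample)
gaps = reply_gaps_minutes(sample)
```

A `Counter` returns 0 for months that never appear, which is handy when plotting a long silent stretch.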

You can export a LINE chat as text from the chat screen (“send chat history”). Data size:

Item	Value
Span	~8 years 2 months
Total messages	87,621
Text messages	66,329
Stickers	15,605
Photos	3,982

15,605 stickers… that’s a lot.

The two AIs

Step	Model	What it does	What it sees
Tone	Japanese sentiment model (koheiduck/bert-japanese-finetuned-sentiment)	scores each message pos/neu/neg	66k message texts (scores averaged per month)
Turning points	a bigger local model (Qwen2.5 72B)	guesses “what happened to these two?”	only the per-month table of counts + tone scores (no conversation, no words)

Both run locally on my own machine.

📊 Results

The 8-year arc of volume and tone

This chart is the highlight. Top: monthly message count. Bottom: tone (up = positive, down = negative). The x-axis is months since the conversation started. (Axis labels are in Japanese.) Plotted, it isn’t a steady climb or a flat line. It splits cleanly into “chapters”: ramp-up → an 8-month silence → a second peak → a stable plateau. Four phases, at a glance.

Tone has two peaks of about +0.6, one around the start and one around when things resumed (the overall mean is ≈ 0, leaning slightly negative in the later years). The interesting part: in the month before the silence, tone had already dropped to −0.1. The mood dimmed before the volume did. There are two dips into negative tone. The one before the silence was an “omen.” The other, in recent years, is not an omen but the effect of logistics-y messages (“what time are you home?”) piling up.

💡 Mini-note: how is “tone” turned into a number?

The scoring is done by a Japanese sentiment model. Roughly: it is a BERT model fine-tuned on lots of Japanese text labeled by sentiment; it judges with context, not just by spotting keywords; and it returns a probability of “positive-ness” / “negative-ness” per message. I used the difference (positive minus negative) as a per-message score.
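The "difference as a score" idea can be sketched as below. This is an illustration under assumptions, not the exact script: the function names are hypothetical, and the lowercase label names ("positive", "neutral", "negative") are what I assume the model returns after lowercasing. `classify_locally` uses the Hugging Face `transformers` text-classification pipeline and is only imported when called, so the scoring helpers run without it.

```python
from collections import defaultdict


def tone_score(probs):
    """Per-message score in [-1, 1]: P(positive) minus P(negative)."""
    return probs.get("positive", 0.0) - probs.get("negative", 0.0)


def monthly_mean_tone(scored):
    """Average per-message scores by month; input is (month, score) pairs."""
    buckets = defaultdict(list)
    for month, score in scored:
        buckets[month].append(score)
    return {month: sum(vals) / len(vals) for month, vals in buckets.items()}


def classify_locally(texts, model_name="koheiduck/bert-japanese-finetuned-sentiment"):
    """Return one {label: probability} dict per text, computed on this machine.

    Assumption: label names lowercase to 'positive' / 'neutral' / 'negative'.
    """
    from transformers import pipeline  # lazy import: only needed for real scoring

    clf = pipeline("text-classification", model=model_name, top_k=None)
    return [
        {item["label"].lower(): item["score"] for item in per_message}
        for per_message in clf(list(texts))
    ]


# Made-up probabilities, just to show the arithmetic.
example = tone_score({"positive": 0.80, "neutral": 0.15, "negative": 0.05})
monthly = monthly_mean_tone([("2020-01", 0.6), ("2020-01", 0.2), ("2020-02", -0.1)])
```

A neutral message with near-equal positive and negative probabilities lands close to 0, which is why months full of logistics messages drift toward the middle of the chart.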

What kinds of messages scored how?

A few actual judgments (short one-liners with no names or places):

Message	Verdict
「楽しかったね！」 (that was fun!)	Positive
「これめちゃうまい」 (this is so good)	Positive
「おはようございます」 (good morning)	Neutral
「もうお家？」 (home already?)	Neutral
「全く集中できない」 (can’t focus at all)	Negative
「それは悔しいな、、」 (that’s frustrating…)	Negative
(a long trip-planning message)	Neutral
(a snappy one-liner sent in a huff)	Negative

Plain happy lines score positive; greetings and logistics (“good morning”, “home already?”) score neutral; tiredness or irritation scores negative. Even long, businesslike planning messages lean neutral.

Mornings are when we talk

Message density by weekday × hour (brighter = more). A clear concentration at 7–9 a.m.!
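The weekday × hour density behind that heatmap is just a 7 × 24 grid of counts. A minimal sketch, assuming each message already has a Python `datetime` (the function names are hypothetical):

```python
from datetime import datetime


def weekday_hour_grid(timestamps):
    """Build a 7 x 24 grid of message counts (rows: Monday=0 .. Sunday=6)."""
    grid = [[0] * 24 for _ in range(7)]
    for ts in timestamps:
        grid[ts.weekday()][ts.hour] += 1
    return grid


def busiest_hours(grid, top=3):
    """Hours of the day ranked by total messages across all weekdays."""
    totals = [sum(row[hour] for row in grid) for hour in range(24)]
    return sorted(range(24), key=lambda hour: totals[hour], reverse=True)[:top]


# Made-up timestamps; 2020-01-06 is a Monday.
stamps = [
    datetime(2020, 1, 6, 7, 15),
    datetime(2020, 1, 6, 8, 5),
    datetime(2020, 1, 7, 8, 40),
    datetime(2020, 1, 7, 21, 0),
]
grid = weekday_hour_grid(stamps)
top_hour = busiest_hours(grid, top=1)
```

The grid can be passed straight to a heatmap function in any plotting library.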

Could the AI guess the turning points?

First, the simple method: mechanically pick the points where message volume jumped or dropped, then check against the real timeline.

Real event	Auto-detected timing
When it started	exact match
When it went quiet	exact match
When it resumed	exact match
When it got lively again	a few months off
A big life milestone	hard to detect (barely shows in counts)
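One way to "mechanically pick" jumps and drops is a month-over-month ratio test. The sketch below is an assumption about how such a rule could look; the thresholds (`ratio=3.0`, `min_count=30`) and the sample numbers are invented for illustration and are not the values used in the experiment.

```python
def volume_turning_points(counts, ratio=3.0, min_count=30):
    """Flag months whose volume rises or falls by at least `ratio` vs. the previous month.

    counts: list of monthly message counts in chronological order.
    Returns (month_index, "jump" | "drop") pairs.
    """
    events = []
    for i in range(1, len(counts)):
        prev, cur = counts[i - 1], counts[i]
        if max(prev, cur) < min_count:
            continue  # both months are near-silent; treat as noise
        if cur >= ratio * max(prev, 1):
            events.append((i, "jump"))
        elif prev >= ratio * max(cur, 1):
            events.append((i, "drop"))
    return events


# Invented monthly counts: start, busy stretch, silence, resumption.
monthly = [0, 400, 900, 850, 20, 0, 0, 0, 600, 1200, 1100]
events = volume_turning_points(monthly)
```

A rule like this only reacts to sharp changes between neighboring months, which is consistent with a gradual change or an event that barely moves the counts slipping through.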

Sharp volume changes were nailed, but “a big life milestone” got missed. So I showed the same numbers to the bigger local model and asked “what happened?” It came back with:

“around when it started” → roughly matches
“a stretch of going silent” → matches the quiet period
“a major life change” → falls right before the real milestone

Rather than hunting for a single spike, it reads the whole sequence of numbers as a “flow,” so it could pick up even an event that barely moves the counts.
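Showing the model "only the numbers" comes down to building a prompt from the per-month table and nothing else. A minimal sketch, where the prompt wording, the column names, and the function name are all hypothetical (the original prompt is not given in this post):

```python
def build_numbers_only_prompt(rows):
    """Build a prompt from (month_index, message_count, mean_tone) tuples.

    No message text is passed in, so none can leak into the prompt.
    """
    lines = ["month\tcount\ttone"]
    for month_index, count, tone in rows:
        lines.append(f"{month_index}\t{count}\t{tone:+.2f}")
    table = "\n".join(lines)
    return (
        "Below is a per-month table for a chat between two people.\n"
        "It contains only message counts and an average tone score (-1 to +1).\n"
        "Guess when something significant happened between them, and why.\n\n"
        + table
    )


# Invented rows, not the real data.
rows = [(0, 400, 0.55), (1, 900, 0.60), (2, 20, -0.10), (3, 0, 0.0)]
prompt = build_numbers_only_prompt(rows)
```

The resulting string would then be sent to whatever local inference server hosts the larger model; that call is left out here because the post does not say which one was used.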

💡 Takeaways

Volume + tone alone reveal the arc: Counts and tone were enough to see the 8-year shape. Silence marks the quiet stretch; a surge marks the resumption — straight off the chart.
A local model reads a story out of numbers: Given only monthly numbers, the model inferred even a barely-visible event (“something big around here”), and it lined up with reality. It connects scattered points into one flow.
A “negative” tone doesn’t mean a bad relationship: The slight negative lean in later years isn’t about getting along badly. Logistics messages (“what time are you home?”) just don’t score high. Low score ≠ trouble. It isn’t that sentiment analysis is poor — the scores need to be read together with context.

🛠️ Technical details

Parsing & aggregation: the LINE export format is a date header followed by lines of time, name, and text. Multi-line messages (4,987 of them) are merged back into t…
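A parser for that layout could look roughly like the sketch below. Several things here are assumptions on my part and should be checked against a real export: that date headers start with `YYYY/MM/DD`, that message lines are tab-separated as time, name, text, that stickers and photos appear as the placeholder texts `[スタンプ]` and `[写真]`, and that any other non-empty line is a continuation of the previous message.

```python
import re
from datetime import datetime

# Assumed formats; adjust to match the actual export file.
DATE_HEADER = re.compile(r"^(\d{4})/(\d{1,2})/(\d{1,2})")
MESSAGE_LINE = re.compile(r"^(\d{1,2}):(\d{2})\t([^\t]*)\t(.*)$")
TYPE_MARKERS = {"[スタンプ]": "sticker", "[写真]": "photo"}


def parse_line_export(raw):
    """Split an exported chat into {datetime, who, type, text} records."""
    messages, current_date = [], None
    for line in raw.splitlines():
        header = DATE_HEADER.match(line)
        if header:
            current_date = tuple(int(part) for part in header.groups())
            continue
        hit = MESSAGE_LINE.match(line)
        if hit and current_date:
            hour, minute, who, text = hit.groups()
            messages.append({
                "datetime": datetime(*current_date, int(hour), int(minute)),
                "who": who,
                "type": TYPE_MARKERS.get(text, "text"),
                "text": text,
            })
        elif messages and line.strip():
            # Continuation of a multi-line message: merge into the previous record.
            messages[-1]["text"] += "\n" + line
    return messages


# Made-up sample in the assumed format.
raw = "2020/01/05(日)\n07:30\tA\tおはよう\n07:42\tB\t[スタンプ]\n07:45\tA\t明日の予定\n駅に9時\n"
parsed = parse_line_export(raw)
```

One known weakness of this sketch: a continuation line that happens to start with a date would be mistaken for a header, so a real parser would want a stricter header pattern.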