[Day 9] A local Japanese sentiment AI (BERT) read 8 years of a LINE chat, and the ups and downs surfaced from numbers alone


Intro

Day 9. Today is less about model internals and more of a personal experiment: have a local AI analyze the entire chat history with one LINE friend. (LINE is the dominant messaging app in Japan.) When I exported it, 8 years were sitting there, from the very first message to today. It started, we talked a lot, it went quiet for a while, then picked up again. That whole arc is in there.

Because the content is what it is, nothing left my machine: everything ran locally on my DGX Spark.

What I used: my home AI box (DGX Spark) + a Japanese sentiment model (for tone) + a bigger local model (to guess events from numbers).

Today’s setup

What I wanted to do: re-reading 8 years of messages one by one isn’t realistic. So instead of reading the content, I looked only at the “shape” of the conversation: when, how much, and in what tone we talked.

Concretely: monthly message volume, the trend of tone (positive / negative), then asking an AI to find “when something big happened.”

Heads-up (the result)

From message counts and tone alone, the 8-year arc came out clearly on a chart. Started, went quiet, came back: the flow was visible without me re-reading a thing.

🔧 Pipeline

Input: LINE chat export (text).

  1. Parse: split each message into {datetime, who, type, text}. From here on, message text never leaves the machine.
  2. Aggregate: monthly counts, time-of-day, reply gaps.
  3. Tone scoring: classify each of 66k messages pos/neu/neg.
  4. Turning-point detection: from sudden changes in the numbers; also show ONLY the numbers to a bigger AI and ask it to guess.
  5. Answer check: compare against the real timeline.
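As a rough illustration of the aggregation step, here is a minimal sketch, assuming each parsed message is a dict with the `datetime` and `who` fields named in the pipeline. The function names, the "gap only when the sender changes" rule, and the sample records are all my own assumptions for illustration, not the code used in this experiment.

```python
from collections import Counter
from datetime import datetime

def monthly_counts(messages):
    """Count messages per calendar month, keyed as 'YYYY-MM'."""
    return Counter(m["datetime"].strftime("%Y-%m") for m in messages)

def reply_gaps_minutes(messages):
    """Minutes between consecutive messages whenever the sender changes."""
    ordered = sorted(messages, key=lambda m: m["datetime"])
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if prev["who"] != cur["who"]:  # only count a gap when the other person answers
            gaps.append((cur["datetime"] - prev["datetime"]).total_seconds() / 60)
    return gaps

# Made-up sample records (not real chat data).
sample = [
    {"datetime": datetime(2020, 1, 31, 8, 0), "who": "A"},
    {"datetime": datetime(2020, 1, 31, 8, 5), "who": "B"},
    {"datetime": datetime(2020, 2, 1, 7, 30), "who": "B"},
    {"datetime": datetime(2020, 2, 1, 7, 45), "who": "A"},
]
counts = monthly_counts(sample)
gaps = reply_gaps_minutes(sample)
print(dict(counts), gaps)
```

Both helpers touch only timestamps and sender names, which fits the idea of looking at the shape of the conversation rather than its content.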

You can export a LINE chat as text from the chat screen (“send chat history”). Data size:

Item | Value
--- | ---
Span | ~8 years 2 months
Total messages | 87,621
Text messages | 66,329
Stickers | 15,605
Photos | 3,982

15,605 stickers… that’s a lot.

The two AIs

Step | Model | What it does | What it sees
--- | --- | --- | ---
1. Tone | Japanese sentiment model (koheiduck/bert-japanese-finetuned-sentiment) | scores each message pos/neu/neg | 66k message texts (scores averaged per month)
2. Turning points | a bigger local model (Qwen2.5 72B) | guesses “what happened to these two?” | only the per-month table of counts + tone scores (no conversation, no words)

Both run locally on my own machine.

📊 Results

The 8-year arc of volume and tone

This chart is the highlight. Top: monthly message count. Bottom: tone (up = positive, down = negative). The x-axis is months since the conversation started. (Axis labels are in Japanese.)

Plotted, it isn’t a steady climb or a flat line. It splits cleanly into “chapters”: ramp-up → an 8-month silence → a second peak → a stable plateau. Four phases, at a glance.

Tone has two peaks of about +0.6, around the start and around when things resumed (overall mean ≈ 0, slightly negative in the later years). The interesting part: in the month before the silence, tone had already dropped to −0.1. The mood dimmed before the volume did.

There are two dips into negative tone. The one before the silence was an “omen.” The other is in recent years: not an omen, but the effect of logistics-y messages (“what time are you home?”) piling up.

💡 Mini-note: how is “tone” turned into a number?

The scoring is done by a Japanese sentiment model. Roughly: it was trained in advance on lots of Japanese text labeled positive / negative; it judges with context, not just by spotting keywords; and it returns a probability of “positive-ness” / “negative-ness” per message. I used the difference (positive probability minus negative probability) as the per-message score, so a score above zero leans positive and a score below zero leans negative.
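To make the scoring concrete, here is a minimal sketch of "positive probability minus negative probability", followed by a per-month average. It assumes the model exposes labels named POSITIVE / NEUTRAL / NEGATIVE (worth checking against the model card), and the helper names are made up for illustration. `score_texts` uses the Hugging Face `transformers` pipeline and is not called here, since it needs the model weights; Japanese BERT tokenizers may also need extra packages installed.

```python
from collections import defaultdict

MODEL_NAME = "koheiduck/bert-japanese-finetuned-sentiment"

def tone_score(label_probs):
    """Per-message score in [-1, 1]: P(positive) minus P(negative)."""
    probs = {label.upper(): p for label, p in label_probs.items()}
    # Assumed label names; adjust if the model card says otherwise.
    return probs.get("POSITIVE", 0.0) - probs.get("NEGATIVE", 0.0)

def monthly_mean(scored):
    """Average (month, score) pairs into one mean score per month."""
    buckets = defaultdict(list)
    for month, score in scored:
        buckets[month].append(score)
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}

def score_texts(texts):
    """Run the classifier locally and return one tone score per text."""
    from transformers import pipeline  # lazy import: helpers above work without it
    clf = pipeline("text-classification", model=MODEL_NAME, top_k=None)
    results = clf(list(texts))  # one list of {"label", "score"} dicts per text
    return [tone_score({r["label"]: r["score"] for r in res}) for res in results]

# Made-up probabilities, just to show the arithmetic.
example = tone_score({"POSITIVE": 0.7, "NEUTRAL": 0.2, "NEGATIVE": 0.1})
means = monthly_mean([("2020-01", 0.5), ("2020-01", -0.5), ("2020-02", 0.2)])
print(round(example, 2), means)
```

A neutral message ends up near zero because neither probability dominates, which is why logistics-heavy months pull the monthly average toward zero rather than toward a strongly negative value.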

What kinds of messages scored how? A few actual judgments (short, name- and place-free one-liners):

Message | Verdict
--- | ---
「楽しかったね!」 (that was fun!) | Positive
「これめちゃうまい」 (this is so good) | Positive
「おはようございます」 (good morning) | Neutral
「もうお家?」 (home already?) | Neutral
「全く集中できない」 (can’t focus at all) | Negative
「それは悔しいな、、」 (that’s frustrating…) | Negative
(a long trip-planning message) | Neutral
(a snappy one-liner sent in a huff) | Negative

Plain happy lines score positive; greetings and logistics (“good morning”, “home already?”) score neutral; tiredness or irritation scores negative. Even long, businesslike planning messages lean neutral.

Mornings are when we talk

Message density by weekday × hour (brighter = more). A clear concentration at 7–9 a.m.!
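The weekday × hour density can be computed from timestamps alone. Below is a minimal sketch, with hypothetical function names and made-up timestamps; the resulting 7×24 grid is what a heatmap would then color.

```python
from collections import Counter
from datetime import datetime

def weekday_hour_grid(timestamps):
    """Return a 7x24 grid of counts; row 0 is Monday, columns are hours 0-23."""
    counts = Counter((ts.weekday(), ts.hour) for ts in timestamps)
    return [[counts[(day, hour)] for hour in range(24)] for day in range(7)]

def busiest_hours(grid, top=3):
    """Hours of the day ranked by total messages across all weekdays."""
    totals = [sum(row[hour] for row in grid) for hour in range(24)]
    return sorted(range(24), key=lambda hour: totals[hour], reverse=True)[:top]

# Made-up timestamps (2024-01-01 is a Monday).
stamps = [
    datetime(2024, 1, 1, 7, 30),
    datetime(2024, 1, 1, 8, 10),
    datetime(2024, 1, 2, 8, 45),
    datetime(2024, 1, 3, 21, 0),
]
grid = weekday_hour_grid(stamps)
print(busiest_hours(grid, top=1))
```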

Could the AI guess the turning points?

First, the simple method: mechanically pick the points where message volume jumped or dropped, then check against the real timeline.

Real event | Auto-detected timing
--- | ---
When it started | exact match
When it went quiet | exact match
When it resumed | exact match
When it got lively again | a few months off
A big life milestone | hard to detect (barely shows in counts)
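One way the "mechanical" method could look is a month-over-month ratio check. This is a sketch under my own assumptions: the thresholds (`ratio`, `floor`) are illustrative guesses and the series is made up; the post does not say which rule was actually used.

```python
def volume_turning_points(monthly, ratio=3.0, floor=10):
    """Flag months whose count jumps or drops by `ratio` versus the previous month.

    `monthly` is a list of (month_label, count) in time order. `floor` skips
    pairs where both months are tiny, so noise does not trigger a flag.
    """
    points = []
    for (_, prev), (label, cur) in zip(monthly, monthly[1:]):
        if max(prev, cur) < floor:
            continue
        if cur >= ratio * max(prev, 1):
            points.append((label, "jump"))
        elif prev >= ratio * max(cur, 1):
            points.append((label, "drop"))
    return points

# Made-up monthly counts: start, busy stretch, silence, resumption.
series = [("m0", 0), ("m1", 400), ("m2", 900), ("m3", 20), ("m4", 15), ("m5", 600)]
points = volume_turning_points(series)
print(points)
```

A rule like this only sees sharp changes in volume, which is consistent with the table: starts, silences, and resumptions stand out, while an event that barely moves the counts does not.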

Sharp volume changes were nailed. But “a big life milestone” got missed. So I showed the same numbers to the bigger local model and asked “what happened?” It came back with: “around when it started” → roughly matches; “a stretch of going silent” → matches the quiet period; “a major life change” → placed just before the real milestone.

Rather than hunting for a single spike, it reads the whole sequence of numbers as a “flow,” so it could pick up even an event that barely moves the counts.
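For illustration, a numbers-only prompt for the bigger model might be assembled like this. The column names and the prompt wording are my own assumptions, not the prompt used in the experiment, and the rows are made up. The point is only that the table carries counts and mean tone, with no message text.

```python
def build_numbers_only_prompt(rows):
    """Format per-month stats as a plain table; no message text is included."""
    lines = ["month_index | messages | mean_tone"]
    for idx, count, tone in rows:
        lines.append(f"{idx} | {count} | {tone:+.2f}")
    table = "\n".join(lines)
    return (
        "Below are monthly statistics for a chat between two people.\n"
        "Only message counts and average tone (-1 to +1) are given.\n"
        "Guess when major events happened and what kind they were.\n\n"
        + table
    )

# Made-up rows: (months since start, message count, mean tone).
rows = [(0, 120, 0.55), (1, 900, 0.60), (2, 40, -0.10), (3, 0, 0.00)]
prompt = build_numbers_only_prompt(rows)
print(prompt)
```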

💡 Takeaways

  1. Volume + tone alone reveal the arc: Counts and tone were enough to see the 8-year shape. Silence marks the quiet stretch; a surge marks the resumption — straight off the chart.
  2. A local model reads a story out of numbers: Given only monthly numbers, the model inferred even a barely-visible event (“something big around here”), and it lined up with reality. It connects scattered points into one flow.
  3. A “negative” tone doesn’t mean a bad relationship: The slight negative lean in later years isn’t about getting along badly. Logistics messages (“what time are you home?”) just don’t score high. Low score ≠ trouble. It isn’t that sentiment analysis is poor — the scores need to be read together with context.

🛠️ Technical details

Parsing & aggregation: LINE export format is a date header plus time, name, text lines. Multi-line messages (4,987 of them) are merged back into t…
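A parser for a format like the one described above could be sketched as follows. Several details are assumptions on my part and should be checked against a real export: the date header shape (`2020/01/31(金)`), tab-separated `time<TAB>name<TAB>text` lines, and the `[スタンプ]` / `[写真]` placeholders for stickers and photos. Any other non-empty line is treated as a continuation of the previous message.

```python
import re
from datetime import datetime

DATE_RE = re.compile(r"^(\d{4})/(\d{1,2})/(\d{1,2})\(.\)$")  # assumed date-header shape
MSG_RE = re.compile(r"^(\d{1,2}):(\d{2})\t([^\t]*)\t(.*)$")  # assumed time<TAB>name<TAB>text
TYPE_MARKERS = {"[スタンプ]": "sticker", "[写真]": "photo"}  # assumed placeholders

def parse_export(lines):
    """Turn export lines into {datetime, who, type, text} records."""
    messages, day = [], None
    for raw in lines:
        line = raw.rstrip("\n")
        date_m = DATE_RE.match(line)
        msg_m = MSG_RE.match(line)
        if date_m:
            day = tuple(int(g) for g in date_m.groups())
        elif msg_m and day:
            hh, mm, who, text = msg_m.groups()
            messages.append({
                "datetime": datetime(*day, int(hh), int(mm)),
                "who": who,
                "type": TYPE_MARKERS.get(text, "text"),
                "text": text,
            })
        elif messages and line:
            # Continuation of a multi-line message: merge into the previous record.
            messages[-1]["text"] += "\n" + line
    return messages

# Made-up sample lines (not real chat data).
sample = [
    "2020/01/31(金)",
    "08:00\tA\tおはよう",
    "08:05\tB\t[スタンプ]",
    "08:06\tB\t明日の予定",
    "10時に駅で",
]
msgs = parse_export(sample)
print(len(msgs), msgs[2]["text"])
```

This sketch drops blank lines inside a multi-line message and does not strip any quoting the export may add around such messages; a real parser would need to handle both.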
