TaigiSpeech

A Real-World Taiwanese Speech Intent Dataset for Elderly Users

CC BY 4.0

About TaigiSpeech

Taiwanese Taigi (Taiwanese Hokkien / Southern Min) is a low-resource language that is used primarily in spoken form. In Taiwan, 64.9% of residents aged 65 and older report Taiwanese as their primary language, yet existing speech technologies offer little support for it.

TaigiSpeech is the first Taiwanese Hokkien speech intent dataset designed specifically for elderly speakers, covering healthcare and smart-home assistant scenarios. It comprises more than 3,000 utterances from 21 elderly speakers across 8 intent categories (4 emergency + 4 functional), and aims to advance speech technology research for low-resource languages.

Intent Categories

🚨 Emergency Intents

SOS_CALL: Calls for emergency help
FALL_HELP: Indicates a fall and requests help
BREATH_EMERG: Reports breathing difficulty
PAIN_GENERAL: Reports physical pain or discomfort

🏠 Functional Commands

CALL_CONTACT: Requests to contact a specific person
LIGHT_ON: Requests to turn on the lights
LIGHT_OFF: Requests to turn off the lights
CANCEL_ALERT: Cancels a previously triggered alert
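
For readers wiring these labels into a classifier or an alerting pipeline, the eight intents can be represented as a small enumeration. This is only a minimal sketch: the label strings are taken from the lists above, while the `Intent` class, the `EMERGENCY` set, and the `is_emergency` helper are hypothetical names used for illustration and are not part of any released TaigiSpeech tooling.

```python
from enum import Enum


class Intent(Enum):
    """The eight TaigiSpeech intent labels, as listed on this page."""

    # Emergency intents
    SOS_CALL = "SOS_CALL"
    FALL_HELP = "FALL_HELP"
    BREATH_EMERG = "BREATH_EMERG"
    PAIN_GENERAL = "PAIN_GENERAL"
    # Functional commands
    CALL_CONTACT = "CALL_CONTACT"
    LIGHT_ON = "LIGHT_ON"
    LIGHT_OFF = "LIGHT_OFF"
    CANCEL_ALERT = "CANCEL_ALERT"


# Hypothetical grouping that mirrors the two headings above.
EMERGENCY = frozenset(
    {Intent.SOS_CALL, Intent.FALL_HELP, Intent.BREATH_EMERG, Intent.PAIN_GENERAL}
)


def is_emergency(intent: Intent) -> bool:
    """Return True if the intent belongs to the emergency group."""
    return intent in EMERGENCY


if __name__ == "__main__":
    for intent in Intent:
        group = "emergency" if is_emergency(intent) else "functional"
        print(f"{intent.value}: {group}")
```

Keeping the emergency group as an explicit set makes it straightforward for a downstream system to treat those four labels differently, for example by escalating them without a confirmation step.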

Dataset Statistics

Speakers: 21 (13F, 8M)
Utterances: 3,000+
Hours of speech: 6.1
Intent categories: 8
Mean age: 67.9 years
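
As a rough sanity check on these figures, the snippet below derives two bounds from them. It assumes that the 6.1 hours is the total duration of the 3,000+ utterances, which the page does not state explicitly. Because the utterance count is only given as a lower bound, the results are bounds rather than exact values, and the constant names are illustrative only.

```python
# Figures taken from the statistics above.
TOTAL_HOURS = 6.1
MIN_UTTERANCES = 3000  # the page says "3,000+", so this is a lower bound
NUM_SPEAKERS = 21

# More utterances in the same total time means shorter clips on average,
# so dividing by the minimum count gives an upper bound on the mean length.
max_mean_seconds = TOTAL_HOURS * 3600 / MIN_UTTERANCES

# Lower bound on the average number of utterances contributed per speaker.
min_utts_per_speaker = MIN_UTTERANCES / NUM_SPEAKERS

print(f"mean utterance length <= {max_mean_seconds:.2f} s")
print(f"average utterances per speaker >= {min_utts_per_speaker:.1f}")
```

Under that assumption, the average utterance is at most about 7.3 seconds long, and each speaker contributed roughly 143 utterances or more on average.
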
Figure: Speaker Age Distribution

Figure: Ambient Noise Level Distribution

Scenario-Driven Recording

We developed a web-based recording application in which elderly participants, under researcher supervision, respond naturally in Taiwanese to imagined scenarios (e.g., "You slipped in the bathroom and cannot stand up"). Each intent has 20 distinct scenarios, with optional 10-second silent video prompts generated by Google Veo 3 to help participants picture the situation and speak more naturally and expressively.
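
The prompt structure described above (8 intents, 20 scenarios each) can be sketched as a simple grid. This is a hypothetical illustration, not the recording app's actual data model: `build_prompts` and the `(intent, scenario_index)` representation are assumptions, and the page does not say whether every speaker recorded every scenario.

```python
from itertools import product

# Intent labels as listed on this page.
INTENTS = [
    "SOS_CALL", "FALL_HELP", "BREATH_EMERG", "PAIN_GENERAL",
    "CALL_CONTACT", "LIGHT_ON", "LIGHT_OFF", "CANCEL_ALERT",
]
SCENARIOS_PER_INTENT = 20
NUM_SPEAKERS = 21


def build_prompts(intents, per_intent):
    """Return one (intent, scenario_index) pair per scenario prompt."""
    return list(product(intents, range(1, per_intent + 1)))


prompts = build_prompts(INTENTS, SCENARIOS_PER_INTENT)
print(len(prompts))  # 8 intents x 20 scenarios = 160 prompts

# Hypothetical ceiling: the utterance count if every speaker recorded
# every prompt exactly once (an assumption, not stated by the source).
print(len(prompts) * NUM_SPEAKERS)  # 160 x 21 = 3360
```

The 3,360 figure is only a back-of-the-envelope ceiling under the one-recording-per-prompt assumption; the dataset itself reports "more than 3,000" utterances.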

Figure: TaigiSpeech recording app interface

Figure: Recording session, with elderly participants recording under researcher supervision

Figure: Recording equipment

Acknowledgments

Special thanks to the Taiwanese Culture Club at Keelung Community University for their generous support. Club members enthusiastically took part in the recording sessions, contributing invaluable Taiwanese Hokkien speech data to this dataset.

We sincerely thank all the elderly participants who contributed their voices and the research team members who supported this work. This project is a collaboration among the following institutions:

MIT: Massachusetts Institute of Technology
NTU: National Taiwan University
Academia Sinica
NYCU: National Yang Ming Chiao Tung University
USC: University of Southern California

Citation

@inproceedings{chang2026taigispeech,
  title     = {TaigiSpeech: A Low-Resource Real-World Speech Intent Dataset
               and Preliminary Results with Scalable Data Mining In-the-Wild},
  author    = {Chang, Kai-Wei and Lin, Yi-Cheng and Chou, Huang-Cheng and
               Ren, Wenze and Huang, Yu-Han and Tsai, Yun-Shao and
               Chen, Chien-Cheng and Tsao, Yu and Liao, Yuan-Fu and
               Narayanan, Shrikanth and Glass, James and Lee, Hung-yi},
  year      = {2026}
}