Speech Visualization by Integrating Features for the Hearing Impaired

http://hdl.handle.net/2298/3522
5aa45ccf-8e3b-472e-9566-fb6b2bc1dc31
File scansnap002.pdf (10.4 MB)
Item type Journal Article
Date published 2007-08-15
Title Speech Visualization by Integrating Features for the Hearing Impaired
Language eng
Keywords Feature extraction, reading test, speech visualization
Resource type journal article
Authors Watanabe, Akira (WEKO 80434)
Tomishige, Shingo (WEKO 80435)
Nakatake, Masahiro (WEKO 80436)
Author in another language 渡邉, 亮 (WEKO 80440)
Description This paper describes the development of a new speech visualization system that creates readable patterns by integrating different speech features into a single picture. The system extracts the phonemic and prosodic features from speech signals and converts them into a visual image using neither speech segmentation nor speech recognition. We used four time-delay neural networks (TDNNs) to generate phonemic features in the new system. Training the TDNNs on three selected frames of eight kinds of acoustic parameters significantly improved performance. The TDNN outputs control the brightness of the patterns used for consonants: each consonant pattern is represented by a different white texture whose brightness is weighted by the output of the corresponding TDNN. All the weighted consonant patterns are simply added and then overlaid synchronously on colors determined by the formant frequencies. When this is done, phonemic sequences and boundaries manifest themselves in the resulting visual patterns. In addition, the color of a single vowel sandwiched between consonants looks uniform. These visual phenomena are very useful for decoding the complex speech code, which is generated by the continuous movements of the speech organs. We evaluated the visualized speech in a preliminary test. When three students read the patterns of 75 words uttered by four males (300 items), the learning curves showed a steep rise and the correct answer rate reached 96-99%. The learning effect was durable: after five months away from the system, a subject read 96.3% of the 300 tokens with a response time that averaged only 1.3 s/word.
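The composition step in the description (TDNN-weighted white textures summed and overlaid on a formant-derived color) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `overlay_frame`, the tiny 2x2 textures, the clipping of the summed brightness, and the blend-toward-white overlay rule are all assumptions made for illustration, and the formant-to-color mapping is replaced by a fixed RGB value.

```python
def overlay_frame(tdnn_outputs, textures, vowel_color):
    """Compose one display frame (hypothetical sketch).

    tdnn_outputs: one weight in [0, 1] per consonant network.
    textures:     one 2D list per network; entries in [0, 1] mark the
                  white texture assigned to that consonant class.
    vowel_color:  (r, g, b) in [0, 1]; stands in for the color that the
                  real system derives from formant frequencies.
    """
    rows, cols = len(textures[0]), len(textures[0][0])
    frame = []
    for y in range(rows):
        row = []
        for x in range(cols):
            # Weight each texture by its network output and add them up.
            white = sum(w * t[y][x] for w, t in zip(tdnn_outputs, textures))
            white = min(1.0, white)  # assumed clipping to the display range
            # Assumed overlay rule: blend the background color toward white.
            row.append(tuple(c + (1.0 - c) * white for c in vowel_color))
        frame.append(row)
    return frame


# Four toy textures, one per network (the source mentions four TDNNs).
TEXTURES = [
    [[1.0, 1.0], [0.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
    [[0.0, 0.0], [1.0, 1.0]],
]
RED = (1.0, 0.0, 0.0)  # placeholder for a formant-derived vowel color

vowel_only = overlay_frame([0.0, 0.0, 0.0, 0.0], TEXTURES, RED)
consonant = overlay_frame([0.0, 1.0, 0.0, 0.0], TEXTURES, RED)
```

With all network outputs at zero, the frame is the uniform vowel color; when one network fires, its texture appears in white over that color, which is the kind of boundary cue the description refers to.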
Bibliographic information IEEE Transactions on Speech and Audio Processing, vol. 8, no. 4, pp. 454-466, published 2000-07
Bibliographic record ID
Source identifier type NCID
Source identifier AA10888994
DOI
Relation type isIdenticalTo
Related identifier 10.1109/89.848226
Rights
Rights information © 2000 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. IEEE, IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 8, 4, 2000, 454-466
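Format
Description type Other
Description application/pdf
Extent
10386188 bytes
Author version flag
Publication type VoR
Nippon Decimal Classification
Subject scheme NDC
Subject 500
Publisher
Publisher Institute of Electrical and Electronics Engineers
Resource type
Description type Other
Description Article
Resource type (local)
Journal article
Resource type (NII)
Journal Article
Resource type (DCMI)
text
Resource type (local display code)
01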
Versions

Ver.1 2023-06-19 19:15:59.540951