An introduction to various speech analysis software (speechanalysistool PPT)

Uploaded by: 日度 | Document ID: 150839500 | Uploaded: 2020-11-09 | Format: PPT | Pages: 30 | Size: 940 KB

Speech analysis tools
Jean-Philippe Goldman, 03.03.2004

Slide 2: Two questions
- What kind of data?
- Which task?

Slide 3: What kind of data?
- Speech content (noise, multivoice, ...)
- Data file: sound / transcription / pitch curve
- Sampling/Quantization: 16k, 12k, 8k, 4k; 8 bit
- Size: 16 kHz × 16 bit = 256 kbps, i.e. 1.9 MB/min or 115 MB/h
- Format
  - Sound: wav, wma, mp3, ogg, aiff, aifc, au, vox, raw, sd, CSL, Ogg/Vorbis, NIST/Sphere
  - Transcription: HTK, TIMIT, TextGrid, Phondat
- Number of files

Slide 4: Which task?
- Visualization and editing: record, play, edit, mix, add effects
- Analysis: spectral, pitch
- Speech manipulation: filtering, mixing, adding effects, prosodic manipulation
- Annotation: segmentation, labeling
- Scripting: batch, communication with outside
- Plotting

Slide 5: Examples of tasks
- build stimuli for an experiment (e.g. cross-splicing)
- manage a speech database for a TTS engine
- create a prosodic database
- analyze a speech corpus from experiment recordings
- verify/correct an automatic segmentation

Slide 6: Two questions, two rules
- Two questions: What kind of data? Which task?
- Two rules:
  - no single tool does everything
  - there are plenty of ways to do one thing

Slide 7: Tool features
- Functions: visualization/editing, analysis, speech manipulation, annotation, scripting, plotting
- Practicalities: supported formats, platform/installation, evolution/community, accessibility, price

Slide 8: Software
- Goldwave (audio editor)
- Esps Xwaves (routines + visualization)
- Praat (speech analysis)
- Wavesurfer (speech editor)
- Transcriber (annotation tool)
- Matlab (general-purpose software)
- OGI speech tools (routines + application development)
- winpitch, pitchworks, phonedit, cooledit

Slide 9: Goldwave
- Self-defined as "top rated, professional digital audio editor"

Slide 10: Goldwave
- Pros: editing (good memory management for big files), many effects (FX), noise reduction, real-time spectrum and VU meters, various formats, batch conversion, chained effects, easy interface
- Cons: nothing for speech (pitch, formants), Windows only, no scripting
- Good for file editing, not for speech

Slide 11: (no text extracted)

Slide 12: Esps - Waves
- Developed by Entropic + AT&T. Now public.
- The comp.speech FAQ says:
  - Esps: comprehensive set of speech analysis/processing tools
  - Waves is a graphical front-end for speech processing (waveforms, spectrograms, pitch)
  - includes a signal labeling utility

Slide 13: (no text extracted)

Slide 14: Esps waves
- Pros: powerful, designed for big files
- Cons: UNIX only (free BSD), non-standard formats, requires programming skills, development has stopped

Slide 15: Praat
- Developed by P. Boersma and D. Weenink at the Institute of Phonetic Sciences, University of Amsterdam
- General-purpose speech tool: editing, segmentation and labeling, prosodic manipulation

Slide 16: (no text extracted)

Slide 17: Praat
- Pros: designed for speech analysis (not only sound editing or spectrogram visualization), nice GUI, scripting, active development and community, prosodic manipulation
- Cons: limited scripting language, native format of transcription and pitch files

Slide 18: WaveSurfer
- Open Source tool for sound visualization and manipulation
- Speech/sound analysis and sound annotation/transcription
- Platform for more advanced/specialized applications: extending WaveSurfer with new custom plug-ins, or embedding WaveSurfer visualization components in other applications
- Requires the Snack Toolkit

Slide 19: (no text extracted)

Slide 20: Transcriber
- Authors: C. Barras, E. Geoffrois
- Relies on Snack (Tcl/Tk)
- Good for annotation
- Nice, simple GUI
- No speech analysis

Slide 21: (no text extracted)

Slide 22: Matlab (Mathworks)
- Mathematical environment
- Signal processing toolbox: filter design, spectral analysis, waveform generation, linear prediction
- voicebox (2002), mike.brookes@ic.ac.uk
- pitch determination algorithm (2002), Xuejing Sun, sunxj@northwestern.edu
- colea speech editor (1998), Philip Loizou, loizou@utdallas.edu, Univ. of Texas-Dallas

Slide 23: Matlab (Mathworks)
- Pros: open, powerful, scripting, excellent plotting
- Cons: poor speech community, standards, not designed for big files

Slide 24: OGI speech tools / CSLU Toolkit
- Development started in 1992 in C on Unix, at the Center for Spoken Language Understanding (CSLU) at OGI
- Includes:
  - an X Windows display tool (LYRE): display and edit speech signals, spectrograms, phoneme labels, and other information
  - a set of C library routines (LIBNSPEECH)
  - utilities for converting file formats, filtering, neural network training, vector quantization
  - a database utility to automate speech-database-related enquiries
  - a set of Perl scripts, used mainly to automate the use of the OGI Speech Tools
  - man pages
- RAD: rapid application development
- Points of entry: package (C), script (Tcl), GUI (Tk) levels
- Free for research use

Slide 25: (no text extracted)

Slide 26: Summary
- (The comparison table was not extracted. Surviving legend fragment: "= yes but requires some dev.")

Slide 27: Expect to do conversions
- Sound files: goldwave (Windows), sox (Unix)
- Transcription files: scripts to convert text-formatted label files

Slide 28: Links
- www.speech.kth.se/software/#esps
- www.praat.org
- www.speech.kth.se/software/#wavesurfer
- www.cse.ogi.edu/toolkit
- (Matlab)
- www.lpl.univ-aix.fr/sqlab/ (phonedit)
- (PitchWorks)
- (WinPitch)
- (CoolEdit Audition)

Slide 29: Other toolkits
- ASR: HTK
- TTS: MBROLA, Festival, OGI
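The size figures on the "What kind of data?" slide (16 kHz sampling, 16-bit samples, 256 kbps, about 1.9 MB per minute and 115 MB per hour) can be checked with a few lines of arithmetic. The sketch below assumes uncompressed mono PCM and decimal megabytes (1 MB = 1,000,000 bytes); the function name `pcm_size` is made up for illustration and does not come from the slides.

```python
def pcm_size(sample_rate_hz, bits_per_sample, channels=1):
    """Return (kbps, MB per minute, MB per hour) for uncompressed PCM audio.

    Assumes decimal units: 1 kbit = 1000 bits, 1 MB = 1,000,000 bytes.
    """
    bits_per_second = sample_rate_hz * bits_per_sample * channels
    bytes_per_second = bits_per_second / 8
    kbps = bits_per_second / 1000
    mb_per_minute = bytes_per_second * 60 / 1_000_000
    mb_per_hour = bytes_per_second * 3600 / 1_000_000
    return kbps, mb_per_minute, mb_per_hour


if __name__ == "__main__":
    kbps, mb_min, mb_hour = pcm_size(16_000, 16)
    # Prints: 256 kbps, 1.92 MB/min, 115.2 MB/h
    print(f"{kbps:.0f} kbps, {mb_min:.2f} MB/min, {mb_hour:.1f} MB/h")
```

The same function gives a quick estimate of how much disk space a corpus needs at other sampling rates, for example `pcm_size(8_000, 8)` for 8 kHz, 8-bit recordings.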
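The "Expect to do conversions" slide mentions scripts that convert text-formatted label files. As one possible illustration, the sketch below converts HTK-style labels (one `start end label` triple per line, with times in units of 100 ns) into a single-tier Praat TextGrid in the long text format. The function names are invented for this example, and the TextGrid layout reflects my understanding of the format, so it is worth comparing the output with a file saved by Praat itself. It also assumes the intervals are contiguous and sorted, as an interval tier expects.

```python
HTK_UNITS_PER_SECOND = 10_000_000  # HTK label times are integers in 100 ns units


def parse_htk_labels(text):
    """Parse 'start end label' lines into (start_s, end_s, label) tuples."""
    intervals = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        intervals.append(
            (start / HTK_UNITS_PER_SECOND, end / HTK_UNITS_PER_SECOND, label)
        )
    return intervals


def to_textgrid(intervals, tier_name="labels"):
    """Render contiguous, sorted intervals as a one-tier TextGrid string."""
    if not intervals:
        raise ValueError("need at least one interval")
    xmin, xmax = intervals[0][0], intervals[-1][1]
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        f"xmin = {xmin}",
        f"xmax = {xmax}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        f"        xmin = {xmin}",
        f"        xmax = {xmax}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (start, end, label) in enumerate(intervals, start=1):
        escaped = label.replace('"', '""')  # double any quotes inside the label
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {start}",
            f"            xmax = {end}",
            f'            text = "{escaped}"',
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    htk = "0 5000000 sil\n5000000 12000000 a\n"
    print(to_textgrid(parse_htk_labels(htk), tier_name="phones"))
```

Keeping parsing and rendering as separate functions means the intermediate list of `(start, end, label)` tuples can be reused for the other label formats the slides name, such as TIMIT or Phondat, by swapping only the parser.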
