
Speech Production (SPE-SPRD)

AN IMPROVED AIR TISSUE BOUNDARY SEGMENTATION TECHNIQUE FOR REAL TIME MAGNETIC RESONANCE IMAGING VIDEO USING SEGNET


This paper presents an improved methodology for segmenting the Air-Tissue boundaries (ATBs) in the upper airway of the human vocal tract from Real-Time Magnetic Resonance Imaging (rtMRI) videos. The proposed approach performs semantic segmentation with a deep learning architecture called SegNet. The network processes an input image and produces a binary output image of the same dimensions in which each pixel is classified as air cavity or tissue; contours are then predicted from this output. A multi-dimensional least squares smoothing technique is applied to smooth the contours.
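As a rough illustration of the step from a binary air/tissue image to a contour, the sketch below marks every tissue pixel that touches an air pixel. It is not the authors' contour prediction method: the function name `boundary_pixels`, the 4-neighbour rule, the 1 = tissue / 0 = air encoding and the toy mask are all assumptions made for the example.

```python
def boundary_pixels(mask):
    """Return (row, col) of tissue pixels (1) that have an air pixel (0)
    among their 4 neighbours, in row-major order."""
    rows, cols = len(mask), len(mask[0])
    edge = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1:
                continue  # only tissue pixels can lie on the boundary
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < rows and 0 <= cc < cols
                if inside and mask[rr][cc] == 0:
                    edge.append((r, c))
                    break
    return edge


# Hypothetical 4x4 network output: 0 = air cavity, 1 = tissue.
toy_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
edge = boundary_pixels(toy_mask)
print(edge)
```

A real pipeline would additionally order these pixels along the boundary before any smoothing is applied.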

Paper Details

Authors:
Valliappan CA, Avinash Kumar, Renuka Mannem, Karthik GR, Prasanta Kumar Ghosh
Submitted On:
8 May 2019 - 6:06am

Document Files

Icassp_2019.pdf

(30)


[1] Valliappan CA, Avinash Kumar, Renuka Mannem, Karthik GR, Prasanta Kumar Ghosh, "AN IMPROVED AIR TISSUE BOUNDARY SEGMENTATION TECHNIQUE FOR REAL TIME MAGNETIC RESONANCE IMAGING VIDEO USING SEGNET", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4062. Accessed: Sep. 18, 2019.

AIR-TISSUE BOUNDARY SEGMENTATION IN REAL TIME MAGNETIC RESONANCE IMAGING VIDEO USING A CONVOLUTIONAL ENCODER-DECODER NETWORK


In this paper, we propose a convolutional encoder-decoder network (CEDN) based approach for upper and lower Air-Tissue Boundary (ATB) segmentation within the vocal tract in real-time magnetic resonance imaging (rtMRI) video frames. The output images from the CEDN are processed using perimeter and moving average filters to generate smooth contours representing the ATBs. Experiments are performed in both seen-subject and unseen-subject conditions to examine the generalizability of the CEDN-based approach.
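To make the moving average step concrete, here is a minimal sketch that smooths an ordered list of contour points with a centred window. The abstract does not give the window length or the treatment of contour ends, so the name `moving_average`, the default `window=3` and the shrinking window at the ends are assumptions for illustration only.

```python
def moving_average(points, window=3):
    """Smooth ordered (x, y) contour points with a centred moving average.
    Near the ends the window shrinks to the points that are available."""
    half = window // 2
    smoothed = []
    for i in range(len(points)):
        lo = max(0, i - half)
        hi = min(len(points), i + half + 1)
        xs = [p[0] for p in points[lo:hi]]
        ys = [p[1] for p in points[lo:hi]]
        smoothed.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return smoothed


# Hypothetical jagged contour: y alternates between 0 and 3.
jagged = [(0, 0.0), (1, 3.0), (2, 0.0), (3, 3.0), (4, 0.0)]
smooth = moving_average(jagged, window=3)
print(smooth)
```

A longer window gives a smoother contour at the cost of flattening sharp articulatory features, so the window length is a tuning choice.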

Paper Details

Authors:
Renuka Mannem, Prasanta Kumar Ghosh
Submitted On:
8 May 2019 - 5:42am

Document Files

main.pdf

(29)


[1] Renuka Mannem, Prasanta Kumar Ghosh, "AIR-TISSUE BOUNDARY SEGMENTATION IN REAL TIME MAGNETIC RESONANCE IMAGING VIDEO USING A CONVOLUTIONAL ENCODER-DECODER NETWORK", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/4058. Accessed: Sep. 18, 2019.

Representation learning using convolution neural network for acoustic-to-articulatory inversion

Paper Details

Authors:
Aravind Illa, Prasanta Kumar Ghosh
Submitted On:
4 May 2019 - 8:15am

Document Files

E2E_AAI_ICASSP_19.pdf

(37)


[1] Aravind Illa, Prasanta Kumar Ghosh, "Representation learning using convolution neural network for acoustic-to-articulatory inversion", IEEE SigPort, 2019. [Online]. Available: http://sigport.org/3907. Accessed: Sep. 18, 2019.

Voice Impersonation Using Generative Adversarial Networks

Paper Details

Authors:
Rita Singh, Bhiksha Raj
Submitted On:
14 April 2018 - 8:39pm

Document Files

YG_poster.pdf

(145)


[1] Rita Singh, Bhiksha Raj, "Voice Impersonation Using Generative Adversarial Networks", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2862. Accessed: Sep. 18, 2019.

Voice Impersonation Using Generative Adversarial Networks

Paper Details

Authors:
Rita Singh, Bhiksha Raj
Submitted On:
14 April 2018 - 8:39pm

Document Files

YG_poster.pdf

(170)


[1] Rita Singh, Bhiksha Raj, "Voice Impersonation Using Generative Adversarial Networks", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2861. Accessed: Sep. 18, 2019.

DIRECT, NEAR REAL TIME ANIMATION OF A 3D TONGUE MODEL USING NON-INVASIVE ULTRASOUND IMAGES


A new technique for representing speech articulation with an ultrasound-driven finite element model of the tongue is presented. By using a snake contour extraction algorithm with anatomically motivated constraints and a common coordinate system between the ultrasound and the tongue model, it is possible for the first time to obtain a realistic 3D simulation of the tongue directly from a non-invasive sensor (ultrasound), without mapping through any intermediate sensor modalities, and at near real-time frame rates.

Paper Details

Authors:
Submitted On:
19 April 2018 - 9:35pm

Document Files

ICASSP.pptx

(120)

ICASSP.pptx

(243)


[1] "DIRECT, NEAR REAL TIME ANIMATION OF A 3D TONGUE MODEL USING NON-INVASIVE ULTRASOUND IMAGES", IEEE SigPort, 2018. [Online]. Available: http://sigport.org/2846. Accessed: Sep. 18, 2019.

Production and Perception of Focus in L2 Mandarin of Qiang Speakers


The present study investigated the production and perception of focus in the L2 Mandarin of Qiang speakers. Ten Qiang-Mandarin speakers uttered three target sentences under four focus conditions: initial, medial, final and neutral focus. Systematic acoustic analysis showed that: (1) In Qiang-Mandarin, on-focus words exhibit a significant rise in F0, increased intensity and longer duration. There is no Post-focus Compression (PFC). The duration of pre-focus and post-focus words remains largely intact.

Paper Details

Authors:
Bei Wang
Submitted On:
15 October 2016 - 5:27am

Document Files

Qiang-Mandarin_ID174.pdf

(341)


[1] Bei Wang, "Production and Perception of Focus in L2 Mandarin of Qiang Speakers", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1224. Accessed: Sep. 18, 2019.

An Interface Research on Rhetorical Structure and Prosody Features in Chinese Reading Texts


This paper investigates the interface between the rhetorical and prosodic aspects of Chinese read discourse within the Rhetorical Structure Theory (RST) framework. Ten discourses in 3 genres (Commentary, Narrative, and Descriptive) from the Annotated Speech Corpus of Chinese Discourse (ASCCD) were diagrammed in RST. The recordings from 5 male and 5 female speakers were annotated and then analyzed acoustically and statistically using Praat and R.

Paper Details

Authors:
Liang Zhang, Yuan Jia, Aijun Li
Submitted On:
18 October 2016 - 7:52am

Document Files

Liang_O10-4-1016.pptx

(349)

Liang_O10-4-1016.pptx

(350)

Liang_O10-4-1018.pptx

(366)


[1] Liang Zhang, Yuan Jia, Aijun Li, "An Interface Research on Rhetorical Structure and Prosody Features in Chinese Reading Texts", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1217. Accessed: Sep. 18, 2019.

Contributions of the Piriform Fossa of Female Speakers to Vowel Spectra


The bilateral cavities of the piriform fossa are side branches of the vocal tract and produce anti-resonance(s) in the transfer function. This effect is known for male vocal tracts, but data for female vocal tracts have been scarce. This study investigates the contributions of the piriform fossa to vowel spectra in female vocal tracts by means of MRI-based vocal-tract modeling and an acoustic experiment with the water-filling technique. Results from three female subjects indicate that the piriform fossa generates one or two dips in the frequency region of 4-6 kHz.
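For intuition about why a side branch produces a spectral dip, a closed side branch can be idealized as a uniform quarter-wavelength resonator whose zeros fall at odd multiples of c / (4L). The sketch below uses that idealization only; it is not the MRI-based model of the study. The function name `side_branch_zero_hz`, the speed of sound of 350 m/s and the 1.75 cm branch length are assumptions, and end corrections and losses are ignored.

```python
def side_branch_zero_hz(length_m, speed_of_sound=350.0, n=0):
    """Frequency (Hz) of the n-th zero of an idealized uniform side branch
    that is closed at its far end: f = (2n + 1) * c / (4 * L)."""
    return (2 * n + 1) * speed_of_sound / (4.0 * length_m)


# Hypothetical branch length of 1.75 cm, speed of sound assumed 350 m/s.
first_zero = side_branch_zero_hz(0.0175)
print(round(first_zero))
```

Under these assumptions, branch lengths of roughly 1.5 to 2.2 cm place the first zero in the 4-6 kHz region mentioned in the abstract.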

Paper Details

Authors:
Congcong Zhang, Kiyoshi Honda, Ju Zhang, Jianguo Wei
Submitted On:
15 October 2016 - 12:24am

Document Files

zcc_ISCSLP2016.pdf

(351)


[1] Congcong Zhang, Kiyoshi Honda, Ju Zhang, Jianguo Wei, "Contributions of the Piriform Fossa of Female Speakers to Vowel Spectra", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1203. Accessed: Sep. 18, 2019.

Individual difference and acoustic effect of female laryngeal cavities


This study examines the acoustic effect of the laryngeal cavity of female speakers on the higher vowel spectra. To do so, MRI data of vowels /a/ and /i/ obtained from three female speakers were analyzed with data from a male speaker as reference. 3D vocal-tract shapes were extracted from the MRI data and printed as solid mechanical models. Transfer functions of the models' vocal tracts were estimated by a transmission line model. Individual variations of the laryngeal cavity were described by the area functions of the cavity.
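As a simplified illustration of how an area function can be turned into a transfer function, the sketch below chains lossless tube sections using 2x2 transmission (ABCD) matrices and assumes an ideal open end at the lips (zero pressure). The abstract does not specify the study's transmission line model, so this is a generic textbook form; the names `section_matrix` and `transfer_function`, the constants `RHO` and `C`, and the two-section uniform tube are assumptions.

```python
import math

RHO = 1.2    # assumed air density, kg/m^3
C = 350.0    # assumed speed of sound, m/s


def section_matrix(area_m2, length_m, freq_hz):
    """ABCD matrix of one lossless uniform tube section."""
    kl = 2.0 * math.pi * freq_hz / C * length_m
    z = RHO * C / area_m2  # characteristic acoustic impedance
    return ((math.cos(kl), 1j * z * math.sin(kl)),
            (1j * math.sin(kl) / z, math.cos(kl)))


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2))
        for i in range(2)
    )


def transfer_function(area_function, freq_hz):
    """Volume-velocity ratio U_lips / U_glottis for sections listed from
    glottis to lips as (area_m2, length_m), with zero pressure at the lips."""
    total = ((1.0, 0.0), (0.0, 1.0))
    for area, length in area_function:
        total = matmul2(total, section_matrix(area, length, freq_hz))
    return 1.0 / total[1][1]


# Hypothetical uniform 17.5 cm tube split into two equal sections.
uniform = [(3e-4, 0.0875), (3e-4, 0.0875)]
gain_250 = abs(transfer_function(uniform, 250.0))
print(round(gain_250, 4))
```

For a uniform tube this reduces to 1 / cos(kL), which gives a quick sanity check before applying the same chain to a measured area function.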

Paper Details

Authors:
Jing Li, Kiyoshi Honda, Ju Zhang, Jianguo Wei
Submitted On:
14 October 2016 - 10:43am

Document Files

[ISCSLP2016] ID82 Oral.PDF

(41)


[1] Jing Li, Kiyoshi Honda, Ju Zhang, Jianguo Wei, "Individual difference and acoustic effect of female laryngeal cavities", IEEE SigPort, 2016. [Online]. Available: http://sigport.org/1202. Accessed: Sep. 18, 2019.
