ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2020 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.
- Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction
- A MULTI-SCALED RECEPTIVE FIELD LEARNING APPROACH FOR MEDICAL IMAGE SEGMENTATION
Biomedical image segmentation has been widely studied, and many methods have been proposed. Among these methods, attention U-Net has achieved promising performance. However, it struggles to extract multi-scale receptive field features from the high-level feature maps, which degrades its performance on lesions with pronounced scale variations. To solve this problem, this paper integrates an atrous spatial pyramid pooling (ASPP) module into the contracting path of attention U-Net.
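The idea behind an ASPP module can be illustrated with a minimal one-dimensional sketch: several branches apply the same small kernel at different dilation (atrous) rates, so each output position sees context at several scales. This is a simplified illustration, not the paper's implementation; the function names, the rates (1, 2, 4), and the use of plain Python lists are assumptions made for the example.

```python
def dilated_conv1d(signal, kernel, rate):
    """'Same'-padded 1D atrous convolution: kernel taps are `rate` samples apart."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp_1d(signal, kernel, rates=(1, 2, 4)):
    """Run one branch per dilation rate and stack the branch outputs per position."""
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [list(column) for column in zip(*branches)]


def receptive_field(kernel_size, rate):
    """Number of input samples spanned by one dilated kernel."""
    return (kernel_size - 1) * rate + 1


# A single impulse in the middle of the signal (hypothetical test input).
impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
features = aspp_1d(impulse, [1.0, 1.0, 1.0])
```

With a kernel of size 3, the branch at rate 4 spans 9 input samples while the branch at rate 1 spans only 3, which is how parallel atrous branches capture both small and large structures without extra downsampling.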
- LEVERAGING ORDINAL REGRESSION WITH SOFT LABELS FOR 3D HEAD POSE ESTIMATION FROM POINT SETS
Head pose estimation from a depth image is a challenging problem because of large pose variations, severe occlusions, and the low quality of depth data. In contrast to existing approaches that take a 2D depth image as input, we propose a novel deep regression architecture called Head PointNet, which consumes 3D point sets derived from a depth image describing the visible surface of a head. To cope with the non-stationary property of the pose variation process, the network is equipped with an ordinal regression module that incorporates metric penalties into the ground-truth label representation.
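One common way to build a label representation that carries a metric penalty is to replace a one-hot class label with a soft distribution over angle bins, where a bin's weight decays with its distance from the true angle. The sketch below illustrates that general idea only; the bin layout, the absolute-distance penalty, and the `temperature` parameter are assumptions, not the formulation used in the paper.

```python
import math


def soft_ordinal_label(angle_deg, bin_centers, temperature=10.0):
    """Softmax over negative absolute distances between the angle and each bin center."""
    scores = [-abs(angle_deg - center) / temperature for center in bin_centers]
    peak = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(score - peak) for score in scores]
    total = sum(weights)
    return [weight / total for weight in weights]


# Hypothetical yaw bins every 15 degrees from -90 to 90 (13 bins).
bin_centers = list(range(-90, 91, 15))
label = soft_ordinal_label(20.0, bin_centers)
```

For a true angle of 20 degrees, most of the mass falls on the bin centered at 15 degrees, and the bin at 30 degrees receives more weight than the bin at 0 degrees, so a prediction that misses by one bin is penalized less than one that misses by several.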
- Multimodal active speaker detection and virtual cinematography for video conferencing
Active speaker detection (ASD) and virtual cinematography (VC) can significantly improve the remote user experience of a video conference by automatically panning, tilting, and zooming a video conferencing camera: users subjectively rate an expert video cinematographer’s video significantly higher than unedited video. We describe a new automated ASD and VC system that performs within 0.3 MOS (mean opinion score) of an expert cinematographer, based on subjective ratings on a 1-5 scale.
- Generalized Kernel-Based Dynamic Mode Decomposition
- POWER SPECTRUM OPTIMIZATION FOR CAPACITY OF THE EXTENDED SPECTRUM HYBRID FIBER COAX NETWORK
Capacity requirements of the fixed access network keep increasing towards multi-gigabit connections. For Hybrid Fiber Coaxial (HFC) networks, aggregated rates around 30 Gbit/s can be achieved by extending the DOCSIS spectrum to 3 GHz, assuming a spectral efficiency around 10 bit/s/Hz. Replacing spectrum-limiting components such as passive taps in the HFC network is an efficient way to achieve these data rates, compared with the cost of fiber to the home (FTTH).
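The 30 Gbit/s figure follows directly from multiplying the bandwidth by the spectral efficiency. A minimal check of that arithmetic, with a hypothetical helper name:

```python
def aggregate_rate_gbps(bandwidth_hz, spectral_efficiency_bps_per_hz):
    """Aggregate rate in Gbit/s = bandwidth (Hz) x spectral efficiency (bit/s/Hz)."""
    return bandwidth_hz * spectral_efficiency_bps_per_hz / 1e9


# Figures quoted in the abstract: 3 GHz of spectrum at about 10 bit/s/Hz.
extended_rate = aggregate_rate_gbps(3e9, 10.0)
print(extended_rate)  # 30.0
```

This treats the spectral efficiency as uniform across the band, which is the same simplification the abstract makes when it quotes an average of around 10 bit/s/Hz.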
- Counting dense objects in remote sensing images
Accurately estimating the number of objects of interest in a given image is a challenging yet important task. Significant effort has been devoted to this problem and great progress has been made, yet counting ground objects in remote sensing images has barely been studied. In this paper, we are interested in counting dense objects in remote sensing images. Compared with object counting in natural scenes, this task is challenging because of the following factors: large scale variation, complex cluttered backgrounds, and arbitrary orientation.
- MoGA: Searching Beyond MobileNetV3
In this paper, we aim to push forward the frontier of mobile neural architecture design by utilizing the latest neural architecture search (NAS) approaches. First, we shift the search focus from mobile CPUs to mobile GPUs, with which we can gauge the speed of a model more accurately and provide a production-ready solution. On this account, our overall search approach is named Mobile GPU-Aware neural architecture search (MoGA).
- A whiteness test based on the spectral measure of large non-Hermitian random matrices
In the context of multivariate time series, a whiteness test against an MA(1)
correlation model is proposed. This test is built on the eigenvalue
distribution (spectral measure) of the non-Hermitian one-lag sample
autocovariance matrix, instead of its singular value distribution. The
large-dimensional limiting spectral measure of this matrix is derived. To obtain this
result, a control over the smallest singular value of a related random matrix
is provided. Numerical simulations show the excellent performance of this
test.
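For readers unfamiliar with the objects named in this abstract, they can be written out as follows. The notation is assumed for illustration ($N$-dimensional observations $x_1, \dots, x_n$, noise vectors $\xi_t$, coefficient matrix $A$), and the normalization and exact model in the paper may differ.

```latex
% One-lag sample autocovariance matrix (non-Hermitian in general)
% and its spectral measure, i.e. the empirical distribution of its eigenvalues.
\[
  \widehat{R}_1 = \frac{1}{n} \sum_{t=1}^{n-1} x_{t+1}\, x_t^{*},
  \qquad
  \mu_n = \frac{1}{N} \sum_{i=1}^{N} \delta_{\lambda_i(\widehat{R}_1)} .
\]
% Hypotheses: white noise versus a first-order moving average, MA(1).
\[
  H_0 :\; x_t = \xi_t
  \qquad \text{versus} \qquad
  H_1 :\; x_t = \xi_t + A\, \xi_{t-1} .
\]
```

Because $\widehat{R}_1$ is not Hermitian, its eigenvalues $\lambda_i$ are complex in general, which is why the test works with a spectral measure on the complex plane rather than with singular values.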