This supplementary material presents detailed transformer model architectures, training parameters, and comprehensive evaluation metrics to complement our comparison of RNN and transformer models for Indonesian news classification. Our analysis provides deeper insights into why transformer models outperform RNN approaches despite their larger parameter counts.

Supplement to the TASE 2025 submission

Supplemental Material

This document contains the supplementary material for the ICIP_2050 paper with ID #2494 and title "An End-to-End Class-Aware and Attention-Guided Model for Object State Classification".

Supplementary materials for the paper "Pose-free 3D Gaussian Splatting via Shape-Ray Estimation".

The emergence of text-to-image diffusion models marked a revolutionary breakthrough in generative AI. However, training a text-to-image model to consistently reproduce the same subject remains a challenging task. Existing methods often require costly setups and lengthy fine-tuning processes, and they struggle to generate diverse, text-aligned images. Moreover, the increasing size of diffusion models over the years highlights a scalability challenge for previous fine-tuning methods, as tuning these large models is even more costly. To address these limitations, we present Stencil.

In the last few years, vision transformers have increasingly been adopted for medical image classification and other applications due to their improved accuracies compared to other deep learning models. However, due to their size and complex interactions via the self-attention mechanism, they are not well understood. In particular, it is unclear whether the representations produced by such models are semantically meaningful.

Fine-grained action localization in untrimmed sports videos is a challenging task, as motion transitions are subtle and occur within short time spans. Traditional supervised and weakly supervised methods require extensive labeled data, making them less scalable and generalizable. To address these challenges, we propose an unsupervised skeleton-based action localization pipeline that detects fine-grained action boundaries using spatio-temporal graph embeddings.

Handshape recognition is a fundamental component of Sign Language Recognition (SLR). However, most existing approaches are language-dependent and require extensive training data, which limits their scalability. To address this limitation, we explore the use of SignWriting as a standardized, language-agnostic representation for handshapes. Our method employs MediaPipe for hand landmark extraction, followed by normalization and data augmentation to enhance robustness.
