Abstract: Due to advancements in digital cameras, it is easy to gather multiple images (or videos) of an object under different conditions. Therefore, image-set classification has attracted more ...
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language 2022 Audio Continuous WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing 2021 ...
2025-10-28 Bayesian Speech Synthesizers Can Learn from Multiple Teachers, Ziyang Zhang et al., 2510.24372
2025-10-28 emg2speech: synthesizing speech from electromyography using self-supervised ...