Yochai Yemini

I completed my Ph.D. in Electrical Engineering at Bar-Ilan University, advised by Prof. Sharon Gannot and Dr. Ethan Fetaya.

My research focuses on deep learning for audio-visual speech processing, spanning speech separation, enhancement, dereverberation, and generative synthesis. A central theme is leveraging visual cues such as lip movements to guide and improve acoustic models.

I have interned at OriginAI, working on audio-visual speech separation, and at Amazon Alexa, improving wake-word detection for non-English accents.

Publications

SSNAPS: Audio-Visual Separation of Speech and Background Noise with Diffusion Inverse Sampling
Preprint  ·  Paper
Diffusion-based Unsupervised Audio-Visual Speech Separation in Noisy Environments with Noise Prior
ICASSP 2026  ·  Paper
LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading
ICLR 2024  ·  Paper
Scene-Agnostic Multi-Microphone Speech Dereverberation
Interspeech 2021  ·  Paper
GP-Tree: A Gaussian Process Classifier for Few-Shot Incremental Learning
ICML 2021  ·  Paper
A Composite DNN Architecture for Speech Enhancement
ICASSP 2020  ·  Paper
Single Microphone Speech Separation by Diffusion-Based HMM Estimation
EURASIP Journal on Audio, Speech, and Music Processing, 2016  ·  Paper
Speech Enhancement Using a Multidimensional Mixture-Maximum Model
IWAENC 2010  ·  Paper