# Speech2Face

Aqib Ahmad, Taimoor Aftab, Syed Haider Bokhari, Dr. Omer Ishaq

Final Year Project (FYP) for Spring 2020 at FAST-NUCES, Islamabad


An implementation of the following papers:

  1. Speech2Face: Learning the Face Behind a Voice (Tae-Hyun Oh, Tali Dekel, Changil Kim, Inbar Mosseri, William T. Freeman, Michael Rubinstein, Wojciech Matusik) CVPR 2019
  2. Synthesizing Normalized Faces from Facial Identity Features (Forrester Cole, David Belanger, Dilip Krishnan, Aaron Sarna, Inbar Mosseri, William T. Freeman) CVPR 2017

The repository includes the following code:

  1. Scripts for data preprocessing for the facial decoder and the voice encoder models
  2. PyTorch models for the Facial Encoder (VGG-Face recognition), Facial Decoder, and Voice Encoder
  3. Flask Server to deploy all these models
  4. Links to datasets for Facial Decoder and Voice Encoder
  5. Python notebooks for training the Facial Decoder and Voice Encoder models

References:

Face Morphing Library: https://github.com/alyssaq/face_morpher

Data pre-processing for Voice Encoder: https://github.com/saiteja-talluri/Speech2Face

Facial image synthesis based on speech recordings

The project consists of two major models:

  1. Sound to FaceVector: converts a speech waveform into a facial recognition (identity) vector
  2. FaceVector to Image: converts the above-mentioned vector into a face image

The current implementation covers the FaceVector to Image model.
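
The sketch below outlines how the two models are assumed to fit together. The class names `VoiceEncoder` and `FaceDecoder`, the layer shapes, and the feature dimensions are placeholders for illustration only; the repository's actual architectures live in the PyTorch model files listed above.

```python
import torch
import torch.nn as nn

class VoiceEncoder(nn.Module):
    """Hypothetical stand-in for the Sound-to-FaceVector model:
    maps a speech spectrogram to a face identity vector."""
    def __init__(self, feature_dim=4096):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, feature_dim)

    def forward(self, spectrogram):           # (B, 1, freq, time)
        x = self.conv(spectrogram).flatten(1)
        return self.fc(x)                      # (B, feature_dim)

class FaceDecoder(nn.Module):
    """Hypothetical stand-in for the FaceVector-to-Image model:
    maps the identity vector to a (small) face image."""
    def __init__(self, feature_dim=4096):
        super().__init__()
        self.fc = nn.Linear(feature_dim, 64 * 7 * 7)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, face_vector):
        x = self.fc(face_vector).view(-1, 64, 7, 7)
        return self.deconv(x)                  # (B, 3, 28, 28) face image

# End-to-end: speech spectrogram -> identity vector -> face image
spec = torch.randn(1, 1, 257, 600)             # dummy spectrogram
face = FaceDecoder()(VoiceEncoder()(spec))
```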

INSTRUCTIONS:

  1. Upload the notebook to Google Drive
  2. For the VGG-16 backend, make sure the allocated GPU has at least 10 GB of CUDA memory (a quick check is sketched after this list)
  3. For the Facenet backend, any graphics card available on Colab will suffice
  4. Mount Google Drive in the notebook
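
A minimal setup cell for these steps, assuming the notebook runs on a Google Colab GPU runtime; the 10 GB figure is the VGG-16 requirement from step 2.

```python
# Colab setup sketch: mount Drive and verify GPU memory.
import torch
from google.colab import drive

# Mount Google Drive so the notebook can read datasets and checkpoints.
drive.mount('/content/drive')

# Check that the allocated GPU has enough memory for the VGG-16 backend (>= 10 GB).
assert torch.cuda.is_available(), "Switch the Colab runtime type to GPU."
total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
print(f"GPU: {torch.cuda.get_device_name(0)}, {total_gb:.1f} GB")
```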

TEST INSTRUCTIONS:

  1. Run the cells containing the imports, model classes, and model loading
  2. Upload the test images
  3. Run the testing cell (a minimal version is sketched after this list)
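
A minimal test cell under these assumptions: the Facenet backend comes from the `facenet_pytorch` package, `FaceDecoder` is the hypothetical class sketched earlier, and the checkpoint path and test image name are placeholders for whatever the notebook actually uses.

```python
# Test-cell sketch (hypothetical file names; adapt to the notebook's own cells).
import torch
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1

device = "cuda" if torch.cuda.is_available() else "cpu"

mtcnn = MTCNN(image_size=160, device=device)              # face detection / cropping
encoder = InceptionResnetV1(pretrained="vggface2").eval().to(device)

decoder = FaceDecoder(feature_dim=512).to(device)          # Facenet embeddings are 512-d
decoder.load_state_dict(
    torch.load("/content/drive/MyDrive/speech2face/decoder.pth",  # hypothetical path
               map_location=device)
)
decoder.eval()

img = Image.open("test_face.jpg")                          # one of the uploaded test images
face = mtcnn(img).unsqueeze(0).to(device)                  # (1, 3, 160, 160) cropped face

with torch.no_grad():
    identity = encoder(face)                               # (1, 512) identity vector
    reconstruction = decoder(identity)                      # decoded face image tensor
```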

TRAIN INSTRUCTIONS:

  1. Download the required data batches from Google Drive
  2. Specify the required learning rate and number of iterations
  3. Load the pre-saved model
  4. Select "Run all" from the Colab Runtime menu (a minimal training loop is sketched after this list)
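
A minimal training loop matching these steps, assuming the hypothetical `FaceDecoder` from the pipeline sketch and placeholder hyperparameters, paths, and data; the real loop and data loading live in the training notebooks.

```python
# Training-loop sketch (placeholder batches and hyperparameters).
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

learning_rate = 1e-4          # set as required (step 2)
iterations = 10000            # set as required (step 2)

decoder = FaceDecoder(feature_dim=512).to(device)
# Optionally resume from a pre-saved model (step 3); hypothetical path.
# decoder.load_state_dict(torch.load("/content/drive/MyDrive/speech2face/decoder.pth"))

optimizer = torch.optim.Adam(decoder.parameters(), lr=learning_rate)
criterion = nn.L1Loss()

for step in range(iterations):
    # Each downloaded batch is assumed to hold precomputed identity vectors and
    # their target face images; replace these placeholders with the real loader.
    identity = torch.randn(32, 512, device=device)          # placeholder identity batch
    target = torch.rand(32, 3, 28, 28, device=device)       # placeholder target images

    optimizer.zero_grad()
    output = decoder(identity)
    loss = criterion(output, target)
    loss.backward()
    optimizer.step()

    if step % 1000 == 0:
        print(f"step {step}: loss {loss.item():.4f}")

torch.save(decoder.state_dict(), "decoder.pth")
```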