An implementation of the following papers:
- Speech2Face: Learning the Face Behind a Voice (Tae-Hyun Oh, Tali Dekel, Changil Kim, Inbar Mosseri, William T. Freeman, Michael Rubinstein, Wojciech Matusik) CVPR 2019
- Synthesizing Normalized Faces from Facial Identity Features (Forrester Cole, David Belanger, Dilip Krishnan, Aaron Sarna, Inbar Mosseri, William T. Freeman) CVPR 2017
The repository includes the following code:
- Data preprocessing scripts for the facial decoder and voice encoder models
- PyTorch models for Facial Encoder (VGG-face recognition), Facial Decoder and Voice Encoder
- Flask Server to deploy all these models
- Links to datasets for Facial Decoder and Voice Encoder
- Python notebooks for training the Facial Decoder and Voice Encoder models
References:
Face Morphing Library: https://github.com/alyssaq/face_morpher
Data pre-processing for Voice Encoder: https://github.com/saiteja-talluri/Speech2Face
Face image generation from speech recordings
The project consists of 2 major models:
- Sound to FaceVector: converts a sound waveform into a facial recognition feature vector
- FaceVector to Image: converts the above-mentioned vector into an image
The current implementation includes the FaceVector to Image model
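The two-stage pipeline above can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's actual architecture: the module internals, layer sizes, and the 4096-dimensional face-vector size are assumptions made for the example.

```python
import torch
import torch.nn as nn

FACE_VECTOR_DIM = 4096  # assumed size of the facial recognition embedding

class VoiceEncoder(nn.Module):
    """Sound to FaceVector: maps a spectrogram to a face embedding (sketch)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),        # pool over frequency and time
        )
        self.fc = nn.Linear(32, FACE_VECTOR_DIM)

    def forward(self, spectrogram):          # (B, 1, freq, time)
        h = self.conv(spectrogram).flatten(1)
        return self.fc(h)                    # (B, FACE_VECTOR_DIM)

class FaceDecoder(nn.Module):
    """FaceVector to Image: maps a face embedding to an RGB image (sketch)."""
    def __init__(self, out_size=64):
        super().__init__()
        self.fc = nn.Linear(FACE_VECTOR_DIM, 3 * out_size * out_size)
        self.out_size = out_size

    def forward(self, face_vec):
        img = torch.sigmoid(self.fc(face_vec))   # pixel values in (0, 1)
        return img.view(-1, 3, self.out_size, self.out_size)

encoder, decoder = VoiceEncoder(), FaceDecoder()
spec = torch.randn(2, 1, 257, 100)           # dummy batch of spectrograms
face = decoder(encoder(spec))                # (2, 3, 64, 64)
```

Since only the FaceVector to Image model is implemented here, the face vector would normally come from the pretrained facial encoder rather than the voice encoder.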
INSTRUCTIONS:
- Upload notebook onto Google Drive
- For the VGG-16 backend, make sure the allocated GPU has at least 10 GB of CUDA memory
- For the FaceNet backend, any GPU available on Colab will suffice
- Connect to Google Drive
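A small helper like the following can verify the 10 GB memory requirement before running the VGG-16 backend. The helper name is hypothetical (not part of this repository); the `total_bytes` parameter exists only so the check can be demonstrated without a GPU.

```python
def has_enough_cuda_memory(required_gb, total_bytes=None):
    """Return True if the GPU (or a given byte count) meets the requirement.

    Hypothetical helper: with no `total_bytes`, it queries the active
    CUDA device via PyTorch; otherwise it checks the supplied value.
    """
    if total_bytes is None:
        import torch  # imported lazily: only needed to query a live device
        if not torch.cuda.is_available():
            return False
        total_bytes = torch.cuda.get_device_properties(0).total_memory
    return total_bytes >= required_gb * 1024 ** 3

# Example: a 16 GB card passes the 10 GB requirement for the VGG-16 backend.
print(has_enough_cuda_memory(10, total_bytes=16 * 1024 ** 3))  # True
```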
TEST INSTRUCTIONS:
- Run the cells containing imports, model classes and model loading
- Upload test images
- Run the cell for testing
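The testing cell boils down to something like the sketch below: restore saved weights, preprocess an uploaded image, and run the model with gradients disabled. The function names, checkpoint handling, and preprocessing choices are assumptions for illustration, not the notebook's exact code.

```python
import numpy as np
import torch

def preprocess(image_u8):
    """uint8 HxWx3 array -> float tensor of shape (1, 3, H, W) in [0, 1]."""
    t = torch.from_numpy(np.ascontiguousarray(image_u8)).float() / 255.0
    return t.permute(2, 0, 1).unsqueeze(0)

def run_test(model, image_u8, checkpoint_path=None):
    """Optionally restore weights, then run the model on one test image."""
    if checkpoint_path is not None:
        model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
    model.eval()                       # disable dropout/batch-norm updates
    with torch.no_grad():              # no gradients needed at test time
        return model(preprocess(image_u8))
```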
TRAIN INSTRUCTIONS:
- Download the required batches from Google Drive
- Specify the required learning rate and number of iterations
- Load the pre-saved model
- Select "Run all" in Google Colab
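The training steps above can be sketched as a single loop: set the learning rate and iteration count, optionally resume from a pre-saved checkpoint, then optimize. The hyperparameter defaults, loss choice, and checkpoint handling are assumptions for the sketch, not the notebook's exact configuration.

```python
import os
import torch
import torch.nn as nn

def train(model, batches, lr=1e-4, iterations=1000, ckpt=None):
    """Train on a list of (inputs, targets) batches.

    ckpt: optional path of a pre-saved state_dict to resume from.
    """
    if ckpt is not None and os.path.exists(ckpt):
        model.load_state_dict(torch.load(ckpt, map_location="cpu"))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()             # assumed reconstruction loss
    model.train()
    for step in range(iterations):
        inputs, targets = batches[step % len(batches)]  # cycle over batches
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
    return model
```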