Session Name: | Machine Learning Summit: Developing and Running Neural Audio in Constrained Environments |
Speaker(s): | Carter Huffman, Brendan Kelly |
Company Name(s): | Modulate.ai, Modulate.ai |
Track / Format: | Machine Learning Summit |
Overview: | Speech recognition, manipulation, and synthesis are opening up new types of experiences in games, but the deep neural networks that achieve state-of-the-art performance at these tasks are difficult to develop and use efficiently. The first part of this talk explores challenges and solutions for developing neural speech systems that meet gaming-centric requirements, viewed through the lens of an early-stage startup. It discusses strategies for effective iteration with small teams and for efficient use of limited compute resources. The second part of the talk covers methods for running audio neural networks on-device, focusing on real-time audio manipulation and synthesis. It includes lessons learned about audio neural networks for both traditional audio practitioners (e.g., choice of sample rates, buffer sizes, memory allocations) and traditional machine learning practitioners (e.g., suitability of deployment frameworks, model distillation). |