Multimodal Deep Learning Approaches and Applications
Apply multimodal learning approaches to challenging business use cases.
Multimodal learning research focuses on developing models that combine multiple modes of data with varying structures such as sequential relationships between words in natural language and spatial pixel relationships in images.
These models aspire to create joint representations of the input data that provide richer features for downstream tasks compared to models leveraging a single mode of data. In this post we will introduce multimodal learning approaches as well as possible applications.