Module 5: Vision Language Action
Chapter 26: Multi-Modal Interaction Speech Gesture Vision
multimodal-fusion-techniques