Ga offline met de app Player FM !
Llama 3.2 Vision and Molmo: Foundations for the multimodal open-source ecosystem
Fetch error
Hmmm there seems to be a problem fetching this series right now. Last successful fetch was on October 02, 2024 13:35 ()
What now? This series will be checked again in the next hour. If you believe it should be working, please verify the publisher's feed link below is valid and includes actual episode links. You can contact support to request the feed be immediately fetched.
Manage episode 442330816 series 3590272
Sorry this one was late! Thanks for bearing with me, and keep sending feedback my way. Still a year or two away from when I have time to record these, but I would love to.
Open-source tools, examples, limits, and the state of training multimodal models.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/molmo-and-llama-3-vision
00:00 Llama 3.2 Vision and Molmo: Foundations for the multimodal open-source ecosystem
02:47 Llama vision: Multimodality for the masses of developers
03:27 Molmo: a (mostly) open-source equivalent to Llama vision
08:45 How adding vision changes capabilities and reasoning
11:47 Multimodal language models: Earlier on the exponential
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_013.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_015.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_021.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_023.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_027.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_030.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_037.png
Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_046.png
Fig 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_048.png
Fig 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_050.png
Fig 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_052.png
Fig 12: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_054.png
Fig 13: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_058.png
Fig 14: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_065.png
58 afleveringen
Fetch error
Hmmm there seems to be a problem fetching this series right now. Last successful fetch was on October 02, 2024 13:35 ()
What now? This series will be checked again in the next hour. If you believe it should be working, please verify the publisher's feed link below is valid and includes actual episode links. You can contact support to request the feed be immediately fetched.
Manage episode 442330816 series 3590272
Sorry this one was late! Thanks for bearing with me, and keep sending feedback my way. Still a year or two away from when I have time to record these, but I would love to.
Open-source tools, examples, limits, and the state of training multimodal models.
This is AI generated audio with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/molmo-and-llama-3-vision
00:00 Llama 3.2 Vision and Molmo: Foundations for the multimodal open-source ecosystem
02:47 Llama vision: Multimodality for the masses of developers
03:27 Molmo: a (mostly) open-source equivalent to Llama vision
08:45 How adding vision changes capabilities and reasoning
11:47 Multimodal language models: Earlier on the exponential
Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_013.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_015.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_021.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_023.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_027.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_030.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_037.png
Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_046.png
Fig 9: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_048.png
Fig 10: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_050.png
Fig 11: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_052.png
Fig 12: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_054.png
Fig 13: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_058.png
Fig 14: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/llama-and-molmo/img_065.png
58 afleveringen
Alla avsnitt
×Welkom op Player FM!
Player FM scant het web op podcasts van hoge kwaliteit waarvan u nu kunt genieten. Het is de beste podcast-app en werkt op Android, iPhone en internet. Aanmelden om abonnementen op verschillende apparaten te synchroniseren.