Multimodal learning with next-token prediction for large multimodal models

www.nature.com

Nature · English · 23 days ago

Nature, Published online: 28 January 2026; doi:10.1038/s41586-025-10041-x

Emu3 enables large-scale text, image and video learning based solely on next-token prediction, matching the generation and perception performance of task-specific methods, with implications for the development of scalable and unified multimodal intelligence systems.

From Nature via this RSS feed

Nature

Nature is a weekly international journal publishing the finest peer-reviewed research in all fields of science and technology on the basis of its originality, importance, interdisciplinary interest, timeliness, accessibility, elegance and surprising conclusions. Nature also provides rapid, authoritative, insightful and arresting news and interpretation of topical and coming trends affecting science, scientists and the wider public.

Don’t post archive.is links or the full text of articles; doing so will earn you a temporary ban.
