Training large language models on narrow tasks can lead to broad misalignment

www.nature.com

Nature, Published online: 14 January 2026; doi:10.1038/s41586-025-09937-5

Finetuning a large language model on a narrow task of writing insecure code causes a broad range of concerning behaviours unrelated to coding.

From Nature via this RSS feed
