Last time, we encountered a mystery where the synthesis of CF_OEM­TEXT from CF_TEXT did not use Ansi­To­Oem. Today we will begin the investigation.

Recall that we have a table showing how Windows synthesizes each of the various text formats from the other two. But in the case where the clipboard has two formats available, and you ask for the third, there are two ways that the third format could be synthesized: It could convert the first, or it could convert the second. How does Windows decide?

The preference table is

| To get | First try | Then try | And then try |
|---|---|---|---|
| CF_TEXT | CF_TEXT | CF_UNICODETEXT | CF_OEMTEXT |
| CF_OEMTEXT | CF_OEMTEXT | CF_UNICODETEXT | CF_TEXT |
| CF_UNICODETEXT | CF_UNICODETEXT | CF_TEXT | CF_OEMTEXT |

In words, first look for a perfect match. If that’s not available, then try (in order) CF_UNICODE­TEXT, then CF_TEXT, then CF_OEM­TEXT. (One of those last three checks is redundant with the perfect match check.)
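As a rough illustration of that search order, here is a minimal Python sketch. It only models the rule as described above; the function name `pick_source_format` and the use of plain strings for the format names are assumptions made for the example, not anything Windows exposes.

```python
# Fixed fallback order from the text: after the perfect-match check,
# try CF_UNICODETEXT, then CF_TEXT, then CF_OEMTEXT.
FALLBACK_ORDER = ["CF_UNICODETEXT", "CF_TEXT", "CF_OEMTEXT"]


def pick_source_format(requested, available):
    """Return the format to read from, or None if no text format is available.

    `requested` is the format the caller asked for; `available` is the set of
    text formats currently on the (hypothetical) clipboard.
    """
    # A perfect match always wins.
    if requested in available:
        return requested
    # Otherwise walk the fixed order. One of these checks repeats the
    # perfect-match check, which is the redundancy noted in the text.
    for fmt in FALLBACK_ORDER:
        if fmt in available:
            return fmt
    return None


if __name__ == "__main__":
    # CF_TEXT and CF_OEMTEXT are available and the caller asks for CF_UNICODETEXT.
    print(pick_source_format("CF_UNICODETEXT", {"CF_TEXT", "CF_OEMTEXT"}))
```

Running the three target formats through this function with every combination of available formats reproduces the rows of the preference table.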

Combining that with our previous table produces this conversion table with priorities:

| To get | First try | Then try | And then try |
|---|---|---|---|
| CF_TEXT | CF_TEXT | CF_UNICODETEXT + WC2MB(ANSI CP) | CF_OEMTEXT + OemToAnsi |
| CF_OEMTEXT | CF_OEMTEXT | CF_UNICODETEXT + WC2MB(OEM CP) | CF_TEXT + AnsiToOem |
| CF_UNICODETEXT | CF_UNICODETEXT | CF_TEXT + MB2WC(ANSI CP) | CF_OEMTEXT + MB2WC(OEM CP) |

Again, “ANSI CP” means “the code page reported by calling Get­Locale­Info with the LCID in the CF_LOCALE clipboard format, and the LOCALE_IDEFAULT­ANSI­CODE­PAGE locale attribute”. Similarly for “OEM CP”, using LOCALE_IDEFAULT­CODE­PAGE instead of LOCALE_IDEFAULT­ANSI­CODE­PAGE.
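The prioritized conversion table can be read as an ordered lookup. The sketch below encodes it as data and returns the first usable row entry. The names `CONVERSIONS` and `plan_conversion` are hypothetical, and the conversion steps are kept as the table's own labels rather than real API calls, so this shows how to read the table and does not perform any conversion.

```python
# Each target format maps to its candidates in priority order, as
# (source format, conversion label) pairs copied from the table.
# None means the data is used as-is (perfect match).
CONVERSIONS = {
    "CF_TEXT": [
        ("CF_TEXT", None),
        ("CF_UNICODETEXT", "WC2MB(ANSI CP)"),
        ("CF_OEMTEXT", "OemToAnsi"),
    ],
    "CF_OEMTEXT": [
        ("CF_OEMTEXT", None),
        ("CF_UNICODETEXT", "WC2MB(OEM CP)"),
        ("CF_TEXT", "AnsiToOem"),
    ],
    "CF_UNICODETEXT": [
        ("CF_UNICODETEXT", None),
        ("CF_TEXT", "MB2WC(ANSI CP)"),
        ("CF_OEMTEXT", "MB2WC(OEM CP)"),
    ],
}


def plan_conversion(requested, available):
    """Return (source format, conversion label) for the first candidate
    whose source is in `available`, or None if nothing matches."""
    for source, step in CONVERSIONS[requested]:
        if source in available:
            return source, step
    return None


if __name__ == "__main__":
    # Only CF_OEMTEXT is available and the caller asks for CF_UNICODETEXT.
    print(plan_conversion("CF_UNICODETEXT", {"CF_OEMTEXT"}))
```

Trying the function with different sets of available formats is a quick way to see which row entry applies in each situation.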

If you stare at this table, you might notice something odd, possibly even disturbing. And that is part of the answer to the mystery. We’ll talk about it next time.

The post Resolving an ambiguity in the Windows clipboard automated text conversion table appeared first on The Old New Thing.

