Windows has three built-in text formats for the clipboard:

CF_UNICODETEXT: UTF-16 text.

CF_TEXT: 8-bit text in ANSI code page.

CF_OEMTEXT: 8-bit text in OEM code page.

If you don’t provide all three formats, then the system will synthesize the missing ones from the ones you have. How does this work?

Believe it or not, we’re going to spend the rest of the week on this topic.

One thing to note is that the synthesis is done on demand. It is only when somebody asks for, say, CF_TEXT, and the clipboard realizes that it has only CF_OEMTEXT, that the clipboard creates the CF_TEXT data on the fly. Once done, the result is cached so that it doesn’t have to be converted again.
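The on-demand behavior can be modeled with a small toy. This is only a sketch of the idea (synthesize on first request, then cache), not how the Windows clipboard is implemented. The class `ToyClipboard`, its methods, the `conversions` counter, and the `CONVERTERS` table are all hypothetical names invented for illustration, and the converters are stand-ins that assume code page 1252 for ANSI and code page 437 for OEM.

```python
from typing import Callable, Dict, Optional, Tuple

Converter = Callable[[bytes], bytes]

# Stand-in converters (assumption: ANSI = cp1252, OEM = cp437).
CONVERTERS: Dict[Tuple[str, str], Converter] = {
    ("CF_OEMTEXT", "CF_TEXT"):
        lambda b: b.decode("cp437").encode("cp1252", errors="replace"),
    ("CF_TEXT", "CF_OEMTEXT"):
        lambda b: b.decode("cp1252", errors="replace").encode("cp437", errors="replace"),
}


class ToyClipboard:
    """Hypothetical model of lazy format synthesis with caching."""

    def __init__(self, converters: Dict[Tuple[str, str], Converter]) -> None:
        self._data: Dict[str, bytes] = {}
        self._converters = converters
        self.conversions = 0  # how many times a converter actually ran

    def put(self, fmt: str, data: bytes) -> None:
        # Storing data does no conversion work at all.
        self._data[fmt] = data

    def get(self, fmt: str) -> Optional[bytes]:
        # A format that is already present (or already cached) is returned as-is.
        if fmt in self._data:
            return self._data[fmt]
        # Otherwise, look for a format we do have that can be converted.
        for (src, dst), convert in self._converters.items():
            if dst == fmt and src in self._data:
                self.conversions += 1
                self._data[fmt] = convert(self._data[src])  # cache the result
                return self._data[fmt]
        return None


clip = ToyClipboard(CONVERTERS)
clip.put("CF_OEMTEXT", b"hello")
first = clip.get("CF_TEXT")   # converter runs here, on first request
second = clip.get("CF_TEXT")  # served from the cache, no second conversion
```

In this model, placing data on the clipboard costs nothing extra; the conversion cost is paid only if somebody actually asks for the missing format, and only once.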

Today’s conversion is the one that looks easiest on the surface: Converting CF_TEXT to CF_OEMTEXT and back.

To convert CF_TEXT to CF_OEMTEXT, Windows uses the AnsiToOem function. And to convert the other way, it uses the OemToAnsi function.

These are legacy function names; the modern names are CharToOem and OemToChar. But I used the legacy names because that’s what they were called in 16-bit Windows, and that’s how the conversion was done in 16-bit Windows.
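To get a feel for what an ANSI-to-OEM conversion does to the bytes, here is a rough portable approximation that routes through Unicode using Python's built-in codecs. It does not call the Windows functions, and it is not a claim about how they work internally. It assumes the ANSI code page is 1252 and the OEM code page is 437 (a common US English configuration), and the function names `ansi_to_oem` and `oem_to_ansi` are made up for this sketch. Characters with no counterpart are replaced with a question mark here; the real functions may pick a different substitute.

```python
# Assumption: a system whose ANSI code page is 1252 and OEM code page is 437.
ANSI_CODE_PAGE = "cp1252"
OEM_CODE_PAGE = "cp437"


def ansi_to_oem(data: bytes) -> bytes:
    """Approximate an ANSI-to-OEM conversion by going through Unicode."""
    text = data.decode(ANSI_CODE_PAGE, errors="replace")
    return text.encode(OEM_CODE_PAGE, errors="replace")


def oem_to_ansi(data: bytes) -> bytes:
    """Approximate an OEM-to-ANSI conversion by going through Unicode."""
    text = data.decode(OEM_CODE_PAGE, errors="replace")
    return text.encode(ANSI_CODE_PAGE, errors="replace")


# "café": the é is byte 0xE9 in code page 1252 but byte 0x82 in code page 437.
oem_bytes = ansi_to_oem(b"caf\xe9")    # b"caf\x82"
ansi_bytes = oem_to_ansi(oem_bytes)    # b"caf\xe9"

# The euro sign (0x80 in code page 1252) has no slot in code page 437,
# so this sketch loses it.
lossy = ansi_to_oem(b"\x80")           # b"?"
```

The same letter lives at a different byte value in each code page, and some characters exist in only one of them, so the round trip is not always lossless.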

This was anticlimactic, but we’re just getting started. And when we get to the end, we’ll see that what looks like a simple answer is actually quite complicated.

Back in the days of 16-bit Windows, ANSI text and OEM text were the only two clipboard text formats, so there were only two possible conversions (one in each direction). Things get more complicated with the introduction of CF_UNICODETEXT, which we’ll look at next time.

The post "How does Windows synthesize CF_OEMTEXT from CF_TEXT and vice versa?" appeared first on The Old New Thing.

