Suppose you have a large block of memory, and you want to swap two blocks that reside inside that large block. For concreteness, say you have a block of memory of the form A1, A2, B1, B2, C1, C2, D1, D2, D3, E1, and you want to swap the B’s and D’s, producing A1, A2, D1, D2, D3, C1, C2, B1, B2, E1.

A1A2B1B2C1C2D1D2D3E1⤩A1A2D1D2D3C1C2B1B2E1

Like, say, you have a double-null-terminated string and you want to swap two strings in it. Now, of course, the easy way would be just to allocate a second buffer equal in size to the original buffer and copy the strings to the second buffer in the desired order. But just as an exercise, can you do this without allocating additional memory?

Here’s a hint: There is an algorithm to swap two adjacent buffers using no additional space. In C++, this algorithm is implemented by the std::rotate method. For example, if you have a block of memory of the form X1, X2, Y1, Y2, Y3, then you can call std::rotate(v.begin(), v.begin() + 2, v.end()) to get Y1, Y2, Y3, X1, X2.

X1X2Y1Y2Y3⤩Y1Y2Y3X1X2

The function is called rotate because you can view it as taking the combined buffer and doing a circular rotation until Y1 is the new first element.

You can perform the desired exchange of non-adjacent blocks by performing three rotations. Starting with A1, A2, B1, B2, C1, C2, D1, D2, D3, E1, your first rotation exchanges the B’s with the C’s: your second rotation exchanges the B’s with the D’s, and your final rotation exchanges the C’s with the D’s.

A1A2B1B2C1C2D1D2D3E1⤩A1A2C1C2B1B2D1D2D3E1⤩A1A2C1C2D1D2D3B1B2E1⤩A1A2D1D2D3C1C2B1B2E1

(There are other ways to accomplish this with three rotations.)

But you can do better.

The way rotation works is by reversing the two buffers separately, and then reversing the combined buffer. Starting with X1, X2, Y1, Y2, Y3, we reverse the X’s (producing X2, X1) and reverse the Y’s (producing Y3, Y2, Y1), and then reverse the whole thing, giving a final result of Y1, Y2, Y3, X1, X2.

X1X2Y1Y2Y3⇆⇆X2X1Y3Y2Y1⇆Y1Y2Y3X1X2

The cost of reversing n elements is n/2 swaps, so the cost of a rotation of n elements is n swaps. (You reverse the first part and the second part, which together add up to n/2 swaps; and then you reverse the whole thing, which is another n/2 swaps, for a total of n.)

Therefore, our three-rotation version costs a total of 2n swaps, where n is the combined size of the two blocks being swapped as well as the block that is sandwiched between them.

We can reduce the cost to n swaps by applying the reversal trick directly: Reverse the two blocks, reverse the block that is sandwiched between them, and then reverse the combo block.

A1A2B1B2C1C2D1D2D3E1⇄⇄⇄A1A2B2B1C2C1D3D2D1E1⇄A1A2D1D2D3C1C2B1B2E1template<typename BiDirIt> void swap_discontiguous(BiDirIt first1, BiDirIt last1, BiDirIt first2, BiDirIt last2) { std::reverse(first1, last1); std::reverse(last1, first2); std::reverse(first2, last2); std::reverse(first1, last2); }

But wait. If you look at the specification for std::rotate, you see that it requires only a forward iterator, not a bidirectional iterator. Yet std::reverse requires a bidirectional iterator. So how does std::rotate operate if it is given only forward iterators? We’ll look at that next time.

Bonus chatter: clang’s libcxx and gcc’s libstdc++ contain specializations of std::rotate for random-access iterators which perform only n/2 swaps, moving each element directly to its final destination by breaking the element movements into disjoint cycles and then moving the elements within each cycle. They also have special-cases for rotating zero elements (nop), as well as rotating a single element and swapping two equal-sized blocks (which reduces the number of swaps to n/2). I guess these cases happen often enough for the extra code to be worth it.

The post Swapping two blocks of memory that reside inside a larger block, in constant memory appeared first on The Old New Thing.


From The Old New Thing via this RSS feed