C++ pack 1: Know your compiler, lock-free multithreading

10.11.2009 19:44 in c++, multithreading

Do you consider yourself a good programmer? I believe that there are 3 "skill levels" -- not only when we talk about programming:

  • level 1 programmer doesn't know any tricks nor fancy optimalizations. If he's about to write a loop, he'd probably write i++ as incrementation. If he needs an absolute value, he uses abs function (found after 2 hours spent with Google).
  • level 2 programmer finds himself more powerful. He knows that postincrementation requires copying variable, and he knows that branching is serious pain -- so instead of v>0?v:-v, he would say
    __forceinline float abs(float v)
    {
      union { float a; unsigned int b; } f;
      f.a = v; f.b &= 0x7FFFFFFFU;
      return f.a;
    }
    And yes, FFFFFFFU is very well related to the idea. :)
  • and at last, level 3 programmer knows that i++ and ++i will result in the same code (if i is integer and we don't use result). He knows that compiler is way more comfortable with playing with low-level stuff (such as bitwise operations), so instead of trying to help compiler generate good code, he uses his skill not to disturb his compiler.

So, maybe document about compilers would be a good start if you are really interested in reasonable and pointless optimizations. Good knowledge about inlining and profiling is valuable. In fact, if you want to be a "maximum performance" programmer, you'll need to enjoy reading (and NOT writing) assembly code.

And remember: If you do an optimization, test it on real world data. If it’s not drastically faster but makes the code less readable: undo it. (final quote from referenced article)

Multithreading

And now for something completely different: multithreading. If you are a computer science student, you've probably had your Parallel programming course. Have you learned about small producers and big consumers (and vice versa)? Have you noticed that most important part of every piece of code you've written on such course (well, if you've written any code at all) is wait and release?

If you are a driver (not a graphics card driver), you're aware that your car can start skidding in bad weather conditions. But what is more important? An ability to control your skid? Or an ability to avoid it? Same goes for parallel programming. While multithreading is fun, using a lot of synchronization is not. If each method in your code starts with GlobalMutex->Acquire() and ends with GlobalMutex->Release(), would you call it multithreading?

Possible solution is lock-free programming. Andrei Alexandrescu (C++ expert) has written an article about lock-free data structures. Basic construction used is compare-and-swap -- an instruction that modern CPU can perform atomically (it can't be interrupted during its execution, unlike i++). It look like this:

template <class T>
bool CAS(T* addr, T exp, T val)
{
  if (*addr == exp)
  {
    *addr = val;
    return true;
  }
  return false;
}

Using such instruction, we can, for example, implement a Dictionary class that can be used in multithreading environment and doesn't require locking if we only want to get a value from a dictionary. Lock is needed only if user wants to update dictionary. However, this is not as simple as you want it to be. In the end, author ends up with Write-Rarely-Read-Many-But-Not-Too-Many Dictionary. Read the article if you are interested in details.