My First Experiment with C++'s std::thread

  • Posted on: 30 January 2016

std::thread is a big step forward for multi-threaded applications written in C++. I suppose people have written their own cross-platform thread wrappers before, but for the rest of us, there's no more dealing with POSIX threads vs. Windows threads, etc. std::thread also feels far more 'natural' in C++ than OpenMP does out of the box, which is what I used pretty often in the past. By natural, I mean that one can write:

std::vector<std::thread> 

This is all to say, std::thread is an object which represents a thread on whatever system you're on.

Four or so years after std::thread was introduced into the C++ standard, I finally got around to trying it out. By now, there are many tutorials available, but I'll add my own first-pass thoughts into the mix.

First, let's take a look at the constructors for std::thread:

thread();
 
thread( thread&& other );
 
template< class Function, class... Args >
explicit thread( Function&& f, Args&&... args );
 
thread(const thread&) = delete;

Of these, only explicit thread( Function&& f, Args&&... args ) really creates a new physical thread on the system. It takes a function to execute, and that function's args.

thread() creates a thread object that doesn't yet have a physical thread. thread( thread&& other ) moves thread other into a new thread object, leaving other in a state as if it were created with thread(). The copy constructor is deleted, so a std::thread cannot be copied.
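To make the three cases concrete, here's a small sketch that asks each thread object whether it currently owns a physical thread, using joinable(). The struct joinable_report and the function demo_thread_states are names I made up for illustration; only std::thread and joinable() come from the standard library.

```cpp
#include <thread>
#include <utility>

// Hypothetical helper type: records which thread objects own a physical thread.
struct joinable_report {
    bool default_constructed;  // object made with thread()
    bool moved_from;           // source object after thread(thread&&)
    bool moved_to;             // destination object after thread(thread&&)
};

joinable_report demo_thread_states()
{
    joinable_report r;

    std::thread empty;                // thread(): no physical thread behind it
    r.default_constructed = empty.joinable();

    std::thread worker([] {});        // thread(Function&&, Args&&...): starts a real thread
    std::thread moved(std::move(worker));  // thread(thread&&): ownership is transferred

    r.moved_from = worker.joinable(); // worker now looks default-constructed
    r.moved_to = moved.joinable();    // moved owns the thread and must be joined

    moved.join();
    return r;
}
```

Calling demo_thread_states() should report false for the default-constructed and moved-from objects and true for the moved-to object.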

Let's count!:

Now, I'm going to write a simple program that spawns 10 threads, each of which appends the current value of a shared integer counter to a std::vector and then increments the counter. After this is complete, the vector should hold 0,1,2,...,9.

First, let's make a class to hold our variable to increment, and the vector:

	
class my_class
{
    std::vector<double> d;
    std::mutex my_mutex;
    int m_ctr;
 
public:
    my_class(){}
 
    void doit();
    void thread_func();
 
};

We also have a std::mutex, since std::vector is not thread safe. If you're not familiar with mutexes: for our purposes, it's going to make sure that only one thread can increment m_ctr at a time, and that only one thread can do the push_back at a time. If we don't use a mutex, we have a classic race condition on both incrementing m_ctr and growing the vector.
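As a rough illustration of what the mutex buys us, here's a sketch with a bare counter instead of the class. The function count_with_mutex and its parameters are my own invention for this example. Every increment happens between lock() and unlock(), so no update can be lost.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: each of nthreads threads adds 1 to a shared counter
// iters times, holding the mutex for every single increment.
int count_with_mutex(int nthreads, int iters)
{
    int counter = 0;
    std::mutex m;

    std::vector<std::thread> workers;
    for (int t = 0; t < nthreads; ++t) {
        workers.push_back(std::thread([&counter, &m, iters] {
            for (int i = 0; i < iters; ++i) {
                m.lock();    // only one thread at a time gets past this line
                ++counter;   // cannot throw, so the unlock below is always reached
                m.unlock();
            }
        }));
    }

    for (auto& w : workers) {
        w.join();
    }
    return counter;  // nthreads * iters, since every increment was protected
}
```

Without the lock()/unlock() pair, two threads could read the same old value of counter and both write back the same new value, and the final total could come up short.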

Ok, let's see my_class::thread_func -- the function that each thread will be constructed with:

void my_class::thread_func()
{
    std::lock_guard<std::mutex> g(my_mutex);
    d.push_back(double(m_ctr++));
}

Here, std::lock_guard requires some explanation. If you're familiar with RAII, you probably already know the answer. If not, this is a good example of the idiom.

A Detour: RAII and the Mutex:

Let's take a look at a simpler version of my_class::thread_func, one that locks and unlocks the mutex by hand:

void my_class::thread_func()
{
    my_mutex.lock();
    d.push_back(double(m_ctr++));
    my_mutex.unlock();
}

thread_func is executed by each thread, so the first thread to reach my_mutex.lock() will prevent any other thread from executing d.push_back(double(m_ctr++)) until the first thread reaches my_mutex.unlock(). This is exactly what we want to get out of our mutex. So, why use std::lock_guard?

Imagine this was our thread_func:

void my_class::thread_func()
{
    my_mutex.lock();
    //lots of code with exceptions
    my_mutex.unlock();
}

If an exception occurs after my_mutex.lock(), the function will exit without ever reaching my_mutex.unlock(). The mutex stays locked, every other thread blocks forever at my_mutex.lock(), and the program will deadlock!

On the other hand, as the exception unwinds the stack on its way to a handler, the destructor of every object defined locally within the function is still called. Say we wrote:

void my_class::thread_func()
{
    std::vector<double> z;
    my_mutex.lock();
    //lots of code with exceptions
    my_mutex.unlock();
}

z will be destroyed via its destructor even if an exception occurs. The vector's underlying memory is freed, which prevents a memory leak. In the same way that the destructor keeps z from leaking its underlying memory, lock_guard prevents a thread from 'leaking' its mutex by unlocking the mutex in its destructor. In other words, here are some greatly simplified destructors for vector and lock_guard:

vector<T>::~vector()
{
    delete[] m_ptr;
}
 
lock_guard<T>::~lock_guard()
{
    m_mutex.unlock();
}

If you didn't understand RAII, you (hopefully) do now. By the way, 'scope-based resource management' or something similar might be a better name for the idiom or philosophy.
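Here's a sketch of the difference that can be checked from a single thread. Instead of a real std::mutex, it uses a made-up tracing_mutex that only remembers whether it is locked (std::lock_guard works with any type that has lock() and unlock()). The names tracing_mutex, risky_work, still_locked_after_manual and still_locked_after_guard are all hypothetical.

```cpp
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for std::mutex: it does no real locking and only
// records its state so we can inspect it afterwards.
struct tracing_mutex {
    bool locked = false;
    void lock()   { locked = true; }
    void unlock() { locked = false; }
};

// Stands in for "lots of code with exceptions".
void risky_work()
{
    throw std::runtime_error("something went wrong");
}

// Manual locking: the exception skips the unlock() call.
bool still_locked_after_manual()
{
    tracing_mutex m;
    try {
        m.lock();
        risky_work();
        m.unlock();  // never reached
    } catch (const std::runtime_error&) {
    }
    return m.locked;
}

// lock_guard: its destructor runs during stack unwinding and unlocks.
bool still_locked_after_guard()
{
    tracing_mutex m;
    try {
        std::lock_guard<tracing_mutex> guard(m);  // constructor calls m.lock()
        risky_work();
    } catch (const std::runtime_error&) {
        // by the time we get here, guard's destructor has called m.unlock()
    }
    return m.locked;
}
```

The manual version leaves the mutex locked after the exception, while the lock_guard version leaves it unlocked.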

...We're Back!:

Now, let's look at my_class::doit, which creates the threads, and outputs the vector when the threads have done their work.

void my_class::doit()
{
    d.clear();
    m_ctr = 0;
 
    int nthread = 10;
 
    std::vector<std::thread> vthread;
 
    for(int i = 0; i < nthread; i++)
    {
        vthread.push_back(
            std::thread(&my_class::thread_func, this) );
    }
 
    for(auto& t : vthread){
        t.join();
    }
 
    std::for_each(d.begin(),d.end(),
      [](double& a){std::cout << a << " ";});
 
}

Let's break this down a bit. When I was first writing doit, I reached for the following constructor for std::vector (well, one that looks something like this):

	
vector<T>::vector(size_t n, const T& v);

Normally, this creates a vector of size n and initializes each element as a copy of v. If you've been paying attention, you already know why this won't work: the copy constructor for std::thread is deleted. If you understand move semantics, you also understand why vector::vector(size_t n, T&& v) could never be a thing: a single object can only be moved from once, so it can't fill n elements. Note that we can also deduce that we must be using push_back(T&& v) to move the threads into the vector.
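The copy-versus-move point can be checked at compile time with type traits. In this sketch the function spawn_and_join is a hypothetical helper of mine, and the threads do nothing at all; the static_asserts are the interesting part.

```cpp
#include <thread>
#include <type_traits>
#include <vector>

// These hold for any conforming standard library.
static_assert(!std::is_copy_constructible<std::thread>::value,
              "std::thread cannot be copied");
static_assert(std::is_move_constructible<std::thread>::value,
              "std::thread can be moved");

// Hypothetical helper: starts n do-nothing threads, joins them all,
// and returns how many were joined.
int spawn_and_join(int n)
{
    // std::vector<std::thread> bad(n, std::thread([] {}));
    // ^ would not compile: filling n elements requires copying the thread.

    std::vector<std::thread> vthread;
    for (int i = 0; i < n; ++i) {
        // The temporary is an rvalue, so this picks push_back(T&&) and moves it in.
        vthread.push_back(std::thread([] {}));
    }

    int joined = 0;
    for (auto& t : vthread) {
        t.join();
        ++joined;
    }
    return joined;
}
```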

Now, the actual constructor for std::thread. As noted earlier, explicit thread( Function&& f, Args&&... args ) is the only constructor which creates a new physical thread, so we use that to move real threads into the vector. I stole the syntax for constructing with a member function from this StackOverflow question. I must say, that is some pretty clunky syntax. In hindsight, a lambda function is probably reasonable to use here.
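For comparison, the lambda version might look something like the sketch below. To keep it self-contained I've used a separate, hypothetical class called lambda_counter that mirrors my_class, with a run function that returns the vector instead of printing it.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical class mirroring my_class; only the thread construction differs.
class lambda_counter
{
    std::vector<double> d;
    std::mutex my_mutex;
    int m_ctr = 0;

    void thread_func()
    {
        std::lock_guard<std::mutex> g(my_mutex);
        d.push_back(double(m_ctr++));
    }

public:
    std::vector<double> run(int nthread)
    {
        d.clear();
        m_ctr = 0;

        std::vector<std::thread> vthread;
        for (int i = 0; i < nthread; i++)
        {
            // The lambda captures this and calls the member function directly,
            // instead of passing &lambda_counter::thread_func plus this.
            vthread.push_back(std::thread([this] { thread_func(); }));
        }

        for (auto& t : vthread) {
            t.join();
        }
        return d;
    }
};
```

Because the push_back and the increment happen under the same lock, run(10) returns 0 through 9 in order no matter which thread gets there first.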

Note that, had thread_func not been a member function, adding to the vector just looks like this:

	
vthread.push_back(std::thread(thread_func) );

Each thread begins executing thread_func once constructed, and the main thread waits for each thread at t.join(), whereupon the main thread continues execution of the program.

Note that we have an issue here similar to the 'leaking' a mutex idea we had earlier. If one of our threads never finishes (say, because it is stuck waiting on a mutex that was never unlocked), the main thread will wait forever at t.join(), and the program will hang.
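A related hazard on the main thread's side: if an exception makes us skip t.join(), the std::thread destructor runs on a still-joinable thread and calls std::terminate. The same RAII trick applies here. Below is a minimal sketch, assuming a wrapper class I've called scoped_join (not a standard type; C++20 later added std::jthread, which joins in its destructor), that joins the thread when it goes out of scope.

```cpp
#include <atomic>
#include <stdexcept>
#include <thread>
#include <utility>

// Hypothetical wrapper: owns a thread and joins it in the destructor,
// the same way lock_guard unlocks a mutex in its destructor.
class scoped_join
{
    std::thread m_thread;

public:
    explicit scoped_join(std::thread t) : m_thread(std::move(t)) {}

    ~scoped_join()
    {
        if (m_thread.joinable()) {
            m_thread.join();
        }
    }

    scoped_join(const scoped_join&) = delete;
    scoped_join& operator=(const scoped_join&) = delete;
};

// The "main" code throws before it would have called join(), yet the
// worker is still joined while the stack unwinds.
bool worker_finished_despite_exception()
{
    std::atomic<bool> done(false);
    try {
        scoped_join guard(std::thread([&done] { done = true; }));
        throw std::runtime_error("main thread hit an error");
    } catch (const std::runtime_error&) {
        // guard's destructor has already joined the worker
    }
    return done.load();
}
```

This doesn't help with a worker that never returns, since the destructor would then wait forever too, but it does keep an exception on the main thread from tearing the whole program down.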

The only thing that remains is:

	
int main()
{ 
    my_class c;
    c.doit();
 
    return 0;
}

Here's the full code in one piece.

Well, I've presented a few of the thoughts I had writing my first C++ program with std::thread. Obviously, this was a toy problem and there's a lot to learn, but so far I'm diggin' it!