Troubleshooting an OpenMP and/or MPI application on a cluster

Some tips and tricks 🙂

1. Get it debugged and running in serial

  • use gdb; don’t forget to compile with -g and -Wall. Did you eliminate or address all warnings?

2. Know your memory

  • run pmap on your application at least once to get a snapshot; pay particular attention to the stacks and the total memory. Did you request enough memory through PBS etc.?

3. Know your code

  • use callgrind; do this with your serial code. With this data, does your parallel strategy make sense in light of Amdahl’s law?

4. Solve your leaks, avoid realloc

  • valgrind’s memcheck is your friend; do this with your serial code. Keep in mind that your scheduler will set hard limits — either some default or what you request explicitly in your job submission script — which brings me to…

5. Know your limits

  • run ulimit -a; pay particular attention to the stack size. It’s easy to blow, particularly if you insist on statically allocating massive arrays.

6. Know your compiler, e.g., there are many useful, lesser-used flags you can set

7. Know your thread programming pitfalls, e.g.:

  • the per-thread stack size (e.g., OMP_STACKSIZE) is different from the one you find with ulimit -s
  • gratuitous critical/atomic sections/ops; use valgrind --tool=helgrind
  • races, e.g., bad or missing shared/private clauses; use valgrind --tool=helgrind
  • deadlocks

8. Know some signals & system calls so you can run strace and interpret its output, e.g., “see that SIGFPE? That’s bad”

9. Avoid I/O

10. Debug in parallel, e.g., mpirun -np 4 valgrind my_application

