Category: programming

BBD

I really dig Halt and Catch Fire on AMC, for a wide variety of reasons. I’m sure the internet doesn’t need another post on AMC or one of its shows. However, the internet could use another post on the amazing cache of music exposed by HCF, e.g., Big Black Delta.

new GPU book

Numerical Computations with GPUs comes out later this year; Pierre-Yves and I were able to contribute a chapter on LU & QR decomposition (the latter using Givens rotations) for batches of dense matrices. We saw some impressive performance improvements for specific problem sizes. QR will benefit particularly from CUDA 6 and the availability of the fast/safe reciprocal hypotenuse function rhypot(x,y); more details here.
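To make the role of the reciprocal hypotenuse concrete, here is a minimal host-side sketch of a single Givens rotation. It is not the code from the chapter: the names Givens, make_givens and apply_givens are hypothetical, and 1.0 / std::hypot(a, b) stands in for what rhypot(a, b) would return in CUDA device code.

```cpp
#include <cmath>

// Coefficients of a 2x2 Givens rotation G = [c s; -s c].
struct Givens {
    double c;
    double s;
};

// Build the rotation that maps (a, b) to (r, 0), where |r| = hypot(a, b).
// The reciprocal 1/hypot(a, b) is computed once and reused for both
// coefficients. Assumption: in CUDA device code rhypot(a, b) would supply
// this value directly; std::hypot is a host-side stand-in.
inline Givens make_givens(double a, double b) {
    if (b == 0.0) {
        return {1.0, 0.0};  // nothing to eliminate, so use the identity
    }
    const double inv_r = 1.0 / std::hypot(a, b);
    return {a * inv_r, b * inv_r};
}

// Apply G to the pair (x, y) in place: x' = c*x + s*y, y' = -s*x + c*y.
inline void apply_givens(const Givens& g, double& x, double& y) {
    const double x_new = g.c * x + g.s * y;
    const double y_new = -g.s * x + g.c * y;
    x = x_new;
    y = y_new;
}
```

In a QR factorization the same rotation would be applied across the two matrix rows that contain a and b, zeroing one subdiagonal entry at a time.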

HPC Essentials 0

These are the notes from my last talk at PSU for the foreseeable future, delivered to the Math department at colloquium a few weeks back. A kind of prequel to the HPC Essentials series, the talk takes a simple Kriging process through the steps of making it an example of high performance computation. Lots of information, perhaps too much 🙂

cluster profiler

We’ve been working on a method to effectively monitor, and in some senses profile, all relevant processes running on one or more systems. An alpha has been released on github, and Pierre-Yves is working on a powerful flume+solr component for search. The readme follows, with more to come.


Overview

This code comprises the clpr_d daemon for the Cluster Profiler project, an Orwellian attempt to develop time series and statistics for all running processes on a single system or many systems. Process data is gathered or clustered according to process birthdate (rounded to the minute) and uid. The daemon uses several threads to work on a boost::multi_index data structure containing the acquired process data. The main thread reads from a named pipe specified in key_defines.h; the data is produced by running pidstat and redirecting its output, e.g., in an external shell loop. Appropriately formatted pidstat output may be produced from this forked code, a utility originally written by Sebastien Godard: https://github.com/wjb19/sysstat. An example usage is the following, removing process data from root and redirecting to a fifo in the bin directory of this distribution:

pidstat -d -u -h -l -r -w -U -v | grep -v root > bin/clpr_input

An archiving thread works asynchronously to write to port 80 using tcp/ipv4 whenever queried, according to the format specified in the ostream operator for clpr_proc_db, the wrapper around boost::multi_index. Acquired process data from the fifo is ‘blobbed’ together by the reader, and statistics are developed on the fly. A logging class will write a log file periodically as well, with the filename specified in key_defines.h. Top utilization statistics are reported in the log file, using the multiple search/sort indices of boost::multi_index. Finally, a manager thread periodically monitors the size of the database, trimming/deleting entries according to timestamps and a maximum size.
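The manager's trimming policy, by timestamp and then by maximum size, might look like the following sketch. It is an assumption-laden illustration rather than the daemon's code: TimedDb and trim are hypothetical names, a std::multimap replaces boost::multi_index, and the locking a real multi-threaded daemon would need is omitted.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Records keyed by last-update time (seconds since epoch). A multimap keeps
// them ordered oldest-first and allows several records per timestamp.
using TimedDb = std::multimap<std::int64_t, std::string>;

// Drop records strictly older than `cutoff`, then drop the oldest survivors
// until at most `max_size` remain. Returns the number of records removed.
inline std::size_t trim(TimedDb& db, std::int64_t cutoff, std::size_t max_size) {
    const std::size_t before = db.size();
    db.erase(db.begin(), db.lower_bound(cutoff));
    while (db.size() > max_size) {
        db.erase(db.begin());  // begin() is always the oldest remaining record
    }
    return before - db.size();
}
```

Keeping a time-ordered index is what makes both steps cheap; with boost::multi_index the same ordering would be one of several indices over the same records.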

Keep in mind this daemon can be a security risk, and both compute and i/o intensive. It has been designed with flume + solr in mind; for the overall project, flume is used to query various instances of clpr_d, and solr is used for indexing and search on records - WJB 03/14

Installation

Install the fork of pidstat specified above, then ‘make’ this distribution, specifying compile and linker paths for boost as needed.

avpipe alpha

While I nut out the details of working with the new open source codec for H.264 from Cisco, I’ve gone ahead and released the code for the aforementioned avi processing application, tentatively dubbed avpipe, on github. Performance is fairly good, although memory management needs attention at some point 🙂

hpc app for avi

There are many good open source tools for processing video data, but very little in an HPC context; and yet processing image data/video streams is becoming increasingly common, at least from the perspective of our unit. Hence I’ve invested a little time in an application that works directly on the stream from stdin, using FFmpeg, OpenCV and boost::threads. With the power of UNIX pipes, different processing algorithms are readily concatenated, all without the need for copious i/o. hexdump -C has been a lifesaver in terms of discovering the quirks of the format.


00000000  52 49 46 46 48 e0 6f 3e  41 56 49 20 4c 49 53 54  |RIFFH.o>AVI LIST|
00000010  88 22 00 00 68 64 72 6c  61 76 69 68 38 00 00 00  |."..hdrlavih8...|
00000020  21 00 00 00 5c 05 51 00  00 00 00 00 10 09 00 00  |!...\.Q.........|
00000030  d5 0a 4d 00 00 00 00 00  02 00 00 00 00 00 10 00  |..M.............|
00000040  80 07 00 00 38 04 00 00  00 00 00 00 00 00 00 00  |....8...........|
00000050  00 00 00 00 00 00 00 00  4c 49 53 54 ae 10 00 00  |........LIST....|
00000060  73 74 72 6c 73 74 72 68  38 00 00 00 76 69 64 73  |strlstrh8...vids|
00000070  61 76 63 31 00 00 00 00  00 00 00 00 00 00 00 00  |avc1............|
00000080  01 00 00 00 30 75 00 00  00 00 00 00 d5 0a 4d 00  |....0u........M.|
00000090  00 00 10 00 ff ff ff ff  00 00 00 00 00 00 00 00  |................|
000000a0  80 07 38 04 73 74 72 66  46 00 00 00 45 00 00 00  |..8.strfF...E...|
000000b0  80 07 00 00 38 04 00 00  01 00 18 00 61 76 63 31  |....8.......avc1|
000000c0  00 ec 5e 00 00 00 00 00  00 00 00 00 00 00 00 00  |..^.............|
000000d0  00 00 00 00 01 42 e0 32  ff e1 00 0e 67 42 e0 32  |.....B.2....gB.2|
000000e0  da 01 e0 08 9a 6e 02 02  0c 04 01 00 04 68 ce 3c  |.....n.......h.<|
000000f0  80 00 4a 55 4e 4b 14 10  00 00 04 00 00 00 00 00  |..JUNK..........|
00000100  00 00 30 30 64 63 00 00  00 00 00 00 00 00 00 00  |..00dc..........|
00000110  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
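The first 0x48 bytes of the dump above hold the RIFF header and the 'avih' main header, with every integer stored little-endian. The sketch below reads a few of those fields at the offsets seen in this particular dump. It is not taken from avpipe: AviInfo, le32, tag_at and parse_avi_header are hypothetical names, and the fixed offsets assume the same chunk order as the dump (RIFF/AVI, then LIST/hdrl, then avih at 0x18).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Subset of the fields in the RIFF and 'avih' headers.
struct AviInfo {
    std::uint32_t riff_size;       // bytes following the RIFF size field
    std::uint32_t usec_per_frame;  // dwMicroSecPerFrame
    std::uint32_t total_frames;    // dwTotalFrames
    std::uint32_t streams;         // dwStreams
    std::uint32_t width;           // dwWidth
    std::uint32_t height;          // dwHeight
};

// Read a little-endian 32-bit value starting at byte offset `off`.
inline std::uint32_t le32(const std::vector<std::uint8_t>& b, std::size_t off) {
    return static_cast<std::uint32_t>(b[off]) |
           (static_cast<std::uint32_t>(b[off + 1]) << 8) |
           (static_cast<std::uint32_t>(b[off + 2]) << 16) |
           (static_cast<std::uint32_t>(b[off + 3]) << 24);
}

// True if the four bytes at `off` match the given FourCC tag.
inline bool tag_at(const std::vector<std::uint8_t>& b, std::size_t off,
                   const char* tag) {
    return std::memcmp(b.data() + off, tag, 4) == 0;
}

// Parse the fixed-position headers; returns false if the buffer is too
// short or the tags are not where the dump above has them.
inline bool parse_avi_header(const std::vector<std::uint8_t>& b, AviInfo& out) {
    if (b.size() < 0x48) return false;
    if (!tag_at(b, 0x00, "RIFF") || !tag_at(b, 0x08, "AVI ") ||
        !tag_at(b, 0x0c, "LIST") || !tag_at(b, 0x14, "hdrl") ||
        !tag_at(b, 0x18, "avih")) {
        return false;
    }
    out.riff_size      = le32(b, 0x04);
    out.usec_per_frame = le32(b, 0x20);
    out.total_frames   = le32(b, 0x30);
    out.streams        = le32(b, 0x38);
    out.width          = le32(b, 0x40);
    out.height         = le32(b, 0x44);
    return true;
}
```

Fed the bytes shown in the dump, this yields a 1920 x 1080 frame size (80 07 00 00 and 38 04 00 00) and two streams. A more general reader would walk the chunk list by size rather than rely on fixed offsets.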

Advanced MPI

Sat in on an excellent tutorial today at SC13 in Denver, covering (mostly) relatively new additions to the standard; slides here. Highlights (for me anyway) included a new scalable method for creating graph topologies (apparently MPI_Graph_create is bad), remote memory access/one-sided communication and non-blocking collectives. The speakers were excellent communicators and had an obvious love for the subject; most definitely one of the better tutorials I’ve attended.