How would you count the number of bits set in a 32-bit integer? This question is amongst the favorite interviewing questions. The naïve approach is to loop through the bits aggregating the sum in a variable. It's already fast enough but you can make the interviewer happy with a much better "lookup table" solution: just use a static 256-entry table that contains pre-evaluated bits count for any 8-bit integer. With such table you need only 4 lookups and 4 summations to get the answer. "Lookup table" algorithm is even implemented as a GCC builtin function. Take a look at the code:
Here __builtin_popcount is a function that calculates the population count. If compiled withbecause it will save lives
because I don't want to buy a new laptop capable of running new version of nano/Notepad.
I also highly recommend an awesome article "Benchmarking CRC32 and PopCnt instructions" for further reading.
#include <stdint.h> const uint32_t tbl[] = { 4181239173/*,lots of random data, I used 16384 integers for tests*/,2866015773 } ; #define TEST_COUNT 100000 static const uint32_t ARRAY_SIZE = sizeof(tbl)/sizeof(tbl[0]); int main() { uint32_t i = 0; volatile register uint32_t c; for (; i< TEST_COUNT; ++i) { uint32_t j = 0; for (; j < ARRAY_SIZE; ++j) { c = __builtin_popcount(tbl[j]); } } }
Here __builtin_popcount is a function that calculates the population count. If compiled with
gcc -O3 t.c -o bitcountone can get such results:
$ time ./bitcount real 0m17.465s user 0m17.425s sys 0m0.000sNot too fast, huh? Some of the time is wasted calling the function and inlining it can speed up things. But! There is even a better way. Compiling it with
gcc -O3 -mpopcnt t.c -o bitcountresults in:
$ time ./bitcount real 0m1.738s user 0m1.728s sys 0m0.004sThat's simply because modern processors have POPCNT instruction that can be used to count the number of bits and GCC is nice enough to support it. As you can see, in this case hardware optimization beats smart algorithm design with ease.
Resume
Algorithms are designed on paper but executed on a real hardware. Knowing what your hardware is capable of is a must for a software engineer. And please don't tell me about all these patterns and object-oriented stuff: unless you invent a machine that can run concepts instead of machine instructions, you must be aware of all the low-level stuffI also highly recommend an awesome article "Benchmarking CRC32 and PopCnt instructions" for further reading.
А интересно, зачем такую инструкцию сделали? В каких задачах требуется считать количество битов?
ReplyDeleteЕсть известная байка о том, что в NSA (Агентство Национальной Безопасности США, контрразведка) есть внутреннее предписание - приобретать компы только с реализованной в железе POPCNT, для криптоанализа. Подробней о применимости population count можно почитать здесь - http://en.wikipedia.org/wiki/Hamming_weight
ReplyDelete