To inaugurate my latest attempt at a blog, I want to talk about something cool I’ve read about lately. I’ve received this question a couple times at interviews, and I think it was just to demonstrate that I could work with bitwise values.
For instance, take the number 57:
57 = 00111001
It’s easy enough to see that 57 has 4 bits set to 1. How do you check that in code? My answer has always been to loop through the number of bits (in this case 8, but usually 32), shifting to the right each time and then count how often the lowest bit comes up as a 1.
value = 57 count = 0 for i in xrange(8): count += (value >> i) & 1
Effectively what I’m doing here is just looping through the 8 bits, rotating the value to the right and stripping off the least significant bit and adding it to the total (if it’s 1, then it adds to the total).
Since I’m using python for the example, I can abbreviate it a bit like this:
count1s = lambda x: sum( (x >> i) & 1 for i in xrange(32) ) print count1s(57) > 4
That’s more compact, certainly, but it’s still linear time with respect to the number of bits that you are examining, and that’s where today’s coolness comes into play. It turns out there’s a log time algorithm for doing this.
Because it can get a bit long when I show all my work long-hand, let me jump straight to the code that will give the results for an 8 bit integer in 3 steps (and could be extended to do 32 bits in 5 steps, and so on):
x = 57 y = (x & 0x55) + ((x & 0xaa) >> 1) y = (y & 0x33) + ((y & 0xcc) >> 2) y = (y & 0x0f) + ((y & 0xf0) >> 4) print y > 4
Follow through the link if you want the nitty gritty.
The Nitty gritty:
Like most log time algorithms, it relies on cutting the problem in half, then half again, then half again, and so on; but in this case it kind of works it’s way up from the bottom, so to speak, starting with groups of 2 bits, finding the number of 1 bits in each group, then looking at groups of 4, counting the 1s, groups of 8s, and so on.
Well, let’s jump in, you’ll see what I mean.
First, start by breaking the binary up into group of two bits:
x = 57 = 00 11 10 01
And now go through and add the two bits in each group together: 0+0=0, 1+1=2, 1+0=1, 0+1=1. This gives us a new number to work with:
y = 00 10 01 01
Remember, all we did was take each set of two bits and added them together, and used that to create a new number.
Now, let’s consider them as sets of 4 bits, coming up a level:
y = 0010 0101
Last time, we added individual bits together, this time we’ll take groups of two bits and them together. So: 0+2=2, 1+1=2, and now we have a new number again:
z = 0010 0010
Now we’re almost done. Last time we took groups of 2 digits and added them together, now we’re going to take our groups of 4 and add them together: 2+2=4
answer = 00000100 = 4
Intuitively, it kinda makes sense, we’re just adding how many ones there are in each group at each level and rolling them up, but it’s still kind of weird. Let’s try it with another number, say 183.
x = 183 = 10110111
We can see that there are 6 1s, so let’s work through it:
First, break it up into sets of two bits:
x = 10 11 01 11
Add them together: 1+0=1, 1+1=2, 0+1=1, 1+1=2; create the new value from that:
y = 0110 0110
Take them in groups of two digits and add them together: 1+2=3, 1+2=3, and create a new value:
z = 0011 0011
Take each of the 4 bit groups and add them together and you get 3+3=6. And that’s how many 1s we knew we should have.
Take note that we did that in 3 steps for 8 bits, and indeed you will see the same logarithmic growth as the size of the integer grows; if it were a 32 bit integer, it would take 5 steps.
Now it looks as if each of those steps could be pretty complex, and you might think that the effort involved would quickly outstrip any savings over something as simple as my original ‘shift to the right, then and the result’, but I was certainly surprised.
For instance, take the initial step of breaking the integer into 2 bit groups and adding them. First it helps to remember that we can get the alternating bits by anding with 0x55 and 0xAA. Recall:
57 = 00111001
If we and with 0x55 (01010101), we’ll get every second binary digit. If we and it with 0xAA (10101010) we’ll get the other set of every second binary digit. If we shift that second one over 1 bit and then add them together, we’ll get our first result:
00111001 & 01010101 = 00010001 00111001 & 10101010 >> 1 = 00010100 -------- y = 00100101
Similarly, we can take the groups of 2 bits and add them together by anding with 0x33 & 0xcc (and then shifting by 2 bits):
00100101 & 00110011 = 00100001 00100101 & 11001100 >> 2 = 00000001 -------- z = 00100010
For the final step, we work with 4 bits at a time by anding with 0x0F & 0xF0 (and then shifting by 4 bits):
00100010 & 00001111 = 00000010 00100010 & 11110000 >> 4 = 00000010 -------- answer = 00000100
And there we have it. There are ways to optimize this a bit, but it just makes it harder to understand.
Of course, now that I have a good answer for the question, it pretty much guarantees I will never again be asked it…but now that I’ve written the solution up, it guarantees that I won’t forget it either.