Exponentiation, or calculating a power of a number, looks like a trivial task. Just multiply a number by itself several times. But, is it the most efficient way? In this article we’ll discuss several approaches to what seems to be a basic mathematical operation. We will only discuss powers that are integers here.

First, let’s take a look at the approaches that can be encountered all over the Internet.

## The straightforward approach

First, let’s do it in Python.

And a test code. It will be pretty much the same for every approach (except the function name will be different).

I didn’t want to include negative values of k at first, but since I was playing with those anyway, why not. Also, I learnt that zero in negative power is a complicated matter, so I’ll just leave it be. Python throws ZeroDivisionError: 0.0 cannot be raised to a negative power on 0**-1 anyway.

This approach takes O(n) time and constant space. Not bad, but it can be done faster.

## The recursive approach

The fundamental quality of powers is that you can multiply two powers and their exponents sum up: $x^a * x^b = x^{(a+b)}$. You can go the other way around by dividing exponents evenly until you reach the value of 1. It can be easily done in a recursive way.

This approach takes O(log k) time, but it requires more space for recursive calls (probably O(log k) by memory, correct me if I’m wrong).

## Iterative approach

As far as I’ve heard, any task that can be performed recursively can be done iteratively as well. Can we go deeper and make the algorithm satisfy the following requirements?

• O(log k) by time
• O(1) by memory
• No modulo or division operations (except the last one for negative powers)

It is possible, with some bit magic.

First of all, let’s look at what an integer actually is from a binary representation perspective.

$k = (0 \text{ or } 1) \cdot 2^{0} + (0 \text{ or } 1) \cdot 2^{1} + (0 \text{ or } 1) \cdot 2^{2} + (0 \text{ or } 1) \cdot 2^{3} + \cdots$

That’s how an integer is represented in memory, in reverse. The bit that corresponds to $2^{0}$ (or $1$) is the rightmost.

Now, back to our exponent. Let’s create a polynomial (or is it a polynomial? Math-savvy people may correct me).

$x^{k} = x^{(0 \text{ or } 1) \cdot 2^{0} + (0 \text{ or } 1) \cdot 2^{1} + (0 \text{ or } 1) \cdot 2^{2} + (0 \text{ or } 1) \cdot 2^{3} + \cdots} = x^{(0 \text{ or } 1) \cdot 2^{0}} \cdot x^{(0 \text{ or } 1) \cdot 2^{1}} \cdot x^{(0 \text{ or } 1) \cdot 2^{2}} \cdot x^{(0 \text{ or } 1) \cdot 2^{3}} \cdot \cdots$

The rightmost bit corresponds to $x^{(0 \text{ or } 1) \cdot 2^{0}}$. Which means, $x$ can be either $1$ if the bit is 0 or $x$ if it is 1. This can be done in code very simply:

What about the rest? We can easily iterate ofer the remaining bits using k >>= 1 (or k = k >> 1) and getting the next bit with k&1 on every iteration. The next number in this polynomial will be either $1$ or $x^{2}$. The one after it is either $1$ or $x^{2^{2}} = x^{4}$. The following one is either $1$ or $x^{2^{3}} = x^{8}$. See the pattern? We just have to make a number (which I call cur_pow_x in my program) and intialize it to $x^{2}$, and multiply it by itself on every iteration. If the current bit is 1, then we just multiply the current result by cur_pow_x.

There you go. Iterative approach.

## Benchmarks

I decided to run some benchmarks for Python algorithms on every approach we discussed so far for the $9^{9}$ operation (this way I stayed within the boundaries of a 32-bit integer. Python may get slow on higher numbers because it has to increase the number of operations on numbers that are beyond limitations of the CPU architecture). The results were slightly unexpected:

powah_naive: 1.29 µs ± 40.6 ns
powah_recursive: 2 µs ± 25.4 ns
powah_iterative: 1.39 µs ± 31.3 ns


The O(k) algorithm was the fastest. But we have to remember that 9 is likely a pretty small number, and some time is consumed by the overhead of additional operations in the two latter algorithms. So they might be slower on small numbers.

Let’s do the same thing for $9^{99}$. The results:

powah_naive: 10.1 µs ± 158 ns
powah_recursive: 3.99 µs ± 160 ns
powah_iterative: 2.68 µs ± 94.8 ns


Now we can finally see the might of O(log k). And the iterative approach is faster. But what if we go for $9^{999}$ now?

powah_naive: 177 µs ± 2.95 µs
powah_recursive: 10.3 µs ± 125 ns
powah_iterative: 14.1 µs ± 218 ns


Now the recursive one is faster. And I have tested it with higher values of exponent, the iterative algorithm only gets slower compared to the recursive one. I don’t have an explanation for this at the moment.

## In Golang

As a bonus, I’ll include the same code in Golang, since I played around with it as well. Didn’t make any benchmarks though.

Keep in ming that the precision may be lost on big numbers. If you set the upper limit for n to, say, 100, the test code will throw errors, even though the results are seemingly close. Strangely enough, powah_iterative did not have this behaviour, it was the most precise.

# Conclusion

There you have it! Several approaches to exponentiation. Sounds like a simple task but I’ve spent hours playing around with it. Of course, the practical approach is by simply using x**k in Python or math.Pow(x,k) in Go, the built-ins are way faster and more available.

Thanks for tuning in!