- Mastering Objectoriented Python
- Steven F. Lott
- 466字
- 2021-11-12 16:25:18
Improving performance
We'll look at two performance tweaks for the Power3
class.
First, a better algorithm. Then, a better algorithm combined with memoization, which involves a cache; therefore, the function becomes stateful. This is where callable objects shine.
The first modification is to use a Divide and Conquer design strategy. The previous version chopped into n steps; the loop carried out n individual multiplication operations. If we can find a way to split the problem into two equal portions, the problem decomposes into
steps. Given
pow1(2,1024)
, the Power1
callable performs the calculation 1024 multiplications by 2. We can optimize this down to 10 multiplications, a significant speedup.
Rather than simply multiplying by a fixed value, we'll use the "fast exponentiation" algorithm. It uses three essential rules for computing , as follows:
- If
:
, the result is simply 1.
- If n is odd and
, the result is
. This involves a recursive computation of
. This still does a multiplication but not a real optimization.
- If n is even and
, the result is
. This involves a recursive computation of
. This chops the number of multiplications in half.
The following is the recursive callable object:
class Power4( abc.Callable ): def __call__( self, x, n ): if n == 0: return 1 elif n % 2 == 1: return self.__call__(x, n-1)*x else: # n % 2 == 0: t= self.__call__(x, n//2) return t*t pow4= Power4()
We applied the three rules to the input value. If n is zero, we'll return 1. If n is odd, we'll make a recursive call and return . If n is even, we'll make a recursive call and return
.
The execution time is dramatically faster. We can use the timeit
module to see the difference in performance. See Some Preliminaries, for information on using timeit
. When we compare running pow1(2,1024)
and pow4(2,1024)
10,000 times, we'll see something like 183 seconds for the previous version versus 8 seconds for this version. We can do better, however, with memoization.
The following is how we can gather performance data using timeit
:
import timeit iterative= timeit.timeit( "pow1(2,1024)",""" import collections.abc class Power1( collections.abc.Callable ): def __call__( self, x, n ): p= 1 for i in range(n): p *= x return p pow1= Power1() """, number=100000 ) # otherwise it takes 3 minutes print( "Iterative", iterative )
We imported the timeit
module. The timeit.timeit()
function will evaluate a given statement in the defined context. In this case, our expression is the simple pow1(2,1024)
expression. The context for this statement is the definition of the pow1()
function; it includes the import, class definition, and creation of the instance.
Note that we provided number=100000
to speed things up. If we had used the default value for the number of iterations, it could have taken almost 2 minutes.