1. 程式人生 > >Parallel Python 平行計算

Parallel Python 平行計算

最近在關注如何提升Python執行效率的問題,自己沒有時間去深入研究,就直接選擇了開源的Parallel Python,希望能夠充分發揮多核CPU及叢集環境的優勢。
Parallel Python是Python進行分散式計算的開源模組,能夠將計算壓力分佈到多核CPU或叢集的多臺計算機上,能夠非常方便的在內網中搭建一個自組織的分散式計算平臺。先從多核計算開始,普通的Python應用程式只能夠使用一個CPU程序,而通過Parallel Python能夠很方便的將計算擴充套件到多個CPU程序中,使用官方網站上的一個例子。

不使用Parallel Python

import math, time

def
isprime(n):
"""Returns True if n is prime and False otherwise""" if not isinstance(n, int): raise TypeError("argument passed to is_prime is not of 'int' type") if n < 2: return False if n == 2: return True max = int(math.ceil(math.sqrt(n))) i = 2 while
i <= max: if n % i == 0: return False i += 1 return True def sum_primes(n): """Calculates sum of all primes below given integer n""" return sum([x for x in xrange(2,n) if isprime(x)]) start_time = time.time() inputs = (100000, 100100, 100200, 100300, 100400, 100500, 100600, 100700) jobs = [(input, sum_primes(input)) for
input in inputs] for input, job in jobs: print "Sum of primes below", input, "is", job print "Time elapsed: ", time.time() - start_time, "s"

執行結果
這裡寫圖片描述

使用Parallel Python

將程式稍作調整,引入pp模組。

#!
# File: sum_primes.py
# Author: VItalii Vanovschi
# Desc: This program demonstrates parallel computations with pp module
# It calculates the sum of prime numbers below a given integer in parallel
# Parallel Python Software: http://www.parallelpython.com/
import math, sys, time
import pp
def isprime(n):
    """Returns True if n is prime and False otherwise"""
    if not isinstance(n, int):
        raise TypeError("argument passed to is_prime is not of 'int' type")
    if n < 2:
        return False
    if n == 2:
        return True
    max = int(math.ceil(math.sqrt(n)))
    i = 2
    while i <= max:
        if n % i == 0:
            return False
        i += 1
    return True
def sum_primes(n):
    """Calculates sum of all primes below given integer n"""
    return sum([x for x in xrange(2,n) if isprime(x)])
print """Usage: python sum_primes.py [ncpus]
    [ncpus] - the number of workers to run in parallel, 
    if omitted it will be set to the number of processors in the system
"""
# tuple of all parallel python servers to connect with
ppservers = ()
#ppservers = ("10.0.0.1",)
if len(sys.argv) > 1:
    ncpus = int(sys.argv[1])
    # Creates jobserver with ncpus workers
    job_server = pp.Server(ncpus, ppservers=ppservers)
else:
    # Creates jobserver with automatically detected number of workers
    job_server = pp.Server(ppservers=ppservers)
print "Starting pp with", job_server.get_ncpus(), "workers"
# Submit a job of calulating sum_primes(100) for execution. 
# sum_primes - the function
# (100,) - tuple with arguments for sum_primes
# (isprime,) - tuple with functions on which function sum_primes depends
# ("math",) - tuple with module names which must be imported before sum_primes execution
# Execution starts as soon as one of the workers will become available
job1 = job_server.submit(sum_primes, (100,), (isprime,), ("math",))
# Retrieves the result calculated by job1
# The value of job1() is the same as sum_primes(100)
# If the job has not been finished yet, execution will wait here until result is available
result = job1()
print "Sum of primes below 100 is", result
start_time = time.time()
# The following submits 8 jobs and then retrieves the results
inputs = (100000, 100100, 100200, 100300, 100400, 100500, 100600, 100700)
jobs = [(input, job_server.submit(sum_primes,(input,), (isprime,), ("math",))) for input in inputs]
for input, job in jobs:
    print "Sum of primes below", input, "is", job()
print "Time elapsed: ", time.time() - start_time, "s"
job_server.print_stats()

執行結果
這裡寫圖片描述