用迭代器模塊為Python提速

Python官方文檔用"高效的循環(huán)"來形容itertools模塊,有些tools會(huì)帶來性能提升,而另外一些tools并不快,只是會(huì)節(jié)省一些開發(fā)時(shí)間而已,如果濫用還會(huì)導(dǎo)致代碼可讀性變差。我們不妨把itertools的兄弟們拉出來溜溜。
1. 數(shù)列累加
給定一個(gè)列表An,返回?cái)?shù)列累加和Sn。舉例說明:
輸入: [1, 2, 3, 4, 5]
返回: [1, 3, 6, 10, 15]
使用accumulate,性能提升了2.5倍
from?itertools?import?accumulate
def?_accumulate_list(arr):
????tot?=?0
????for?x?in?arr:
????????tot?+=?x
????????yield?tot
def?accumulate_list(arr):
????return?list(_accumulate_list(arr))
def?fast_accumulate_list(arr):
????return?list(accumulate(arr))
arr?=?list(range(1000))
%timeit?accumulate_list(arr)
61?μs?±?2.91?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?10000?loops?each)
%timeit?fast_accumulate_list(arr)
21.3?μs?±?811?ns?per?loop?(mean?±?std.?dev.?of?7?runs,?10000?loops?each)
2. 選擇數(shù)據(jù)
給定一個(gè)列表data,一個(gè)用0/1表示的列表selectors,返回被選擇的數(shù)據(jù)。舉例說明:
輸入: [1, 2, 3, 4, 5], [0, 1, 0, 1, 0]
返回: [2, 4]
使用compress,性能提升了2.8倍
from?itertools?import?compress
from?random?import?randint
def?select_data(data,?selectors):
????return?[x?for?x,?y?in?zip(data,?selectors)?if?y]
def?fast_select_data(data,?selectors):
????return?list(compress(data,?selectors))
data?=?list(range(10000))
selectors?=?[randint(0,?1)?for?_?in?range(10000)]
%timeit?select_data(data,?selectors)
341?μs?±?17.8?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?1000?loops?each)
%timeit?fast_select_data(data,?selectors)
130?μs?±?3.19?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?10000?loops?each)
3. 組合
給定一個(gè)列表arr和一個(gè)數(shù)字k,返回從arr中選擇k個(gè)元素的所有情況。舉例說明:
輸入: [1, 2, 3], 2
返回: [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
使用permutations,性能提升了10倍
from?itertools?import?permutations
def?_get_permutations(arr,?k,?i):
????if?i?==?k:
????????return?[arr[:k]]
????res?=?[]
????for?j?in?range(i,?len(arr)):
????????arr_cpy?=?arr.copy()
????????arr_cpy[i],?arr_cpy[j]?=?arr_cpy[j],?arr_cpy[i]
????????res?+=?_get_permutations(arr_cpy,?k,?i?+?1)
????return?res
def?get_permutations(arr,?k):
????return?_get_permutations(arr,?k,?0)
def?fast_get_permutations(arr,?k):
????return?list(permutations(arr,?k))
arr?=?list(range(10))
k?=?5
%timeit?-n?1?get_permutations(arr,?k)
15.5?ms?±?1.96?ms?per?loop?(mean?±?std.?dev.?of?7?runs,?1?loop?each)
%timeit?-n?1?fast_get_permutations(arr,?k)
1.56?ms?±?284?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?1?loop?each)
4. 篩選數(shù)據(jù)
給定一個(gè)列表arr,篩選出所有的偶數(shù)。舉例說明:
輸入: [3, 1, 4, 5, 9, 2]
返回: [(4, 2]
使用filterfalse,性能反而會(huì)變慢,所以不要迷信itertools。
from?itertools?import?filterfalse
def?get_even_nums(arr):
????return?[x?for?x?in?arr?if?x?%?2?==?0]
def?fast_get_even_nums(arr):
????return?list(filterfalse(lambda?x:?x?%?2,?arr))
arr?=?list(range(10000))
%timeit?get_even_nums(arr)
417?μs?±?18.8?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?1000?loops?each)
%timeit?fast_get_even_nums(arr)
823?μs?±?22.6?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?1000?loops?each)
5. 條件終止
給定一個(gè)列表arr,依次對(duì)列表的所有數(shù)字進(jìn)行求和,若遇到某個(gè)元素大于target之后則終止求和,返回這個(gè)和。舉例說明:
輸入: [1, 2, 3, 4, 5], 3
返回: 6 (4 > 3,終止)
使用takewhile,性能反而會(huì)變慢,所以不要迷信itertools。
from?itertools?import?takewhile
def?cond_sum(arr,?target):
????res?=?0
????for?x?in?arr:
????????if?x?>?target:
????????????break
????????res?+=?x
????return?res
def?fast_cond_sum(arr,?target):
????return?sum(takewhile(lambda?x:?x?<=?target,?arr))
arr?=?list(range(10000))
target?=?5000
%timeit?cond_sum(arr,?target)
245?μs?±?11.8?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?1000?loops?each)
%timeit?fast_cond_sum(arr,?target)
404?μs?±?13.3?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?1000?loops?each)
6. 循環(huán)嵌套
給定列表arr1,arr2,返回兩個(gè)列表的所有元素兩兩相加的和。舉例說明:
輸入: [1, 2], [4, 5]
返回: [1 + 4, 1 + 5, 2 + 4, 2 + 5]
使用product,性能提升了1.25倍。
from?itertools?import?product
def?_cross_sum(arr1,?arr2):
????for?x?in?arr1:
????????for?y?in?arr2:
????????????yield?x?+?y
def?cross_sum(arr1,?arr2):
????return?list(_cross_sum(arr1,?arr2))
def?fast_cross_sum(arr1,?arr2):
????return?[x?+?y?for?x,?y?in?product(arr1,?arr2)]
arr1?=?list(range(100))
arr2?=?list(range(100))
%timeit?cross_sum(arr1,?arr2)
484?μs?±?16.6?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?1000?loops?each)
%timeit?fast_cross_sum(arr1,?arr2)
373?μs?±?11.4?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?1000?loops?each)
7. 二維列表轉(zhuǎn)一維列表
給定二維列表arr,轉(zhuǎn)為一維列表 舉例說明:
輸入: [[1, 2], [3, 4]]
返回: [1, 2, 3, 4]
使用chain,性能提升了6倍。
from?itertools?import?chain
def?_flatten(arr2d):
????for?arr?in?arr2d:
????????for?x?in?arr:
????????????yield?x
def?flatten(arr2d):
????return?list(_flatten(arr2d))
def?fast_flatten(arr2d):
????return?list(chain(*arr2d))
arr2d?=?[[x?+?y?*?100?for?x?in?range(100)]?for?y?in?range(100)]
%timeit?flatten(arr2d)
379?μs?±?15.4?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?1000?loops?each)
%timeit?fast_flatten(arr2d)
66.9?μs?±?3.43?μs?per?loop?(mean?±?std.?dev.?of?7?runs,?10000?loops?each)
作者:李小文,先后從事過數(shù)據(jù)分析、數(shù)據(jù)挖掘工作,主要開發(fā)語言是Python,現(xiàn)任一家小型互聯(lián)網(wǎng)公司的算法工程師。
Github:?https://github.com/tushushu
更多閱讀
特別推薦

點(diǎn)擊下方閱讀原文加入社區(qū)會(huì)員
