本文为作者原创,转载请先与作者联系。 同发于 SegmentFault 和 简书
functools
, itertools
, operator
是Python标准库为我们提供的支持函数式编程的三大模块,合理的使用这三个模块,我们可以写出更加简洁可读的Pythonic代码,接下来我们通过一些example来了解三大模块的使用。
>>> from functools import partial >>> basetwo = partial(int, base=2) >>> basetwo('10010') 18
上述过程实际上等价于调用int(‘10010’, base=2),而partial内部实际上就是通过一个简单的闭包来实现的
defpartial(func, *args, **keywords): defnewfunc(*fargs, **fkeywords): newkeywords = keywords.copy() newkeywords.update(fkeywords) return func(*args, *fargs, **newkeywords) newfunc.func = func newfunc.args = args newfunc.keywords = keywords return newfunc
partialmethod和partial类似,但是对于 绑定一个非对象自身的方法
的时候,这个时候就只能使用partialmethod了,我们通过下面这个例子来看一下两者的差异
>>> from functools import partial, partialmethod >>> defstandalone(self, a=1, b=2): ... "Standalone function" ... print(' called standalone with:', (self, a, b)) ... if self is not None: ... print(' self.attr =', self.attr) >>> classMyClass: ... "Demonstration class for functools" ... def__init__(self): ... self.attr = 'instance attribute' ... method1 = functools.partialmethod(standalone) # 使用partialmethod ... method2 = functools.partial(standalone) # 使用partial >>> o = MyClass() >>> o.method1() called standalone with: (<__main__.MyClass object at 0x7f46d40cc550>, 1, 2) self.attr = instance attribute # 不能使用partial >>> o.method2() Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: standalone() missing 1 required positional argument: 'self'
虽然Python不支持同名方法允许有不同的参数类型,但是我们可以借用singledispatch来 动态指定相应的方法所接收的参数类型
,而不用把参数判断放到方法内部去判断从而降低代码的可读性
from functools import singledispatch classTestClass(object): @singledispatch deftest_method(arg, verbose=False): if verbose: print("Let me just say,", end=" ") print(arg) @test_method.register(int) def_(arg): print("Strength in numbers, eh?", end=" ") print(arg) @test_method.register(list) def_(arg): print("Enumerate this:") for i, elem in enumerate(arg): print(i, elem) if __name__ == '__main__': TestClass.test_method(55555) # call @test_method.register(int) TestClass.test_method([33, 22, 11]) # call @test_method.register(list) TestClass.test_method('hello world', verbose=True) # call default
根据运行结果来解释一下上面这段代码,上面通过@test_method.register(int)和@test_method.register(list)指定当test_method的第一个参数为int或者list的时候,分别调用不同的方法来进行处理
Enumerate this: 0 33 1 22 2 11 Let me just say, hello world
装饰器会遗失被装饰函数的__name__和__doc__等属性,可以使用@wraps来恢复
>>> from functools import wraps >>> defmy_decorator(f): ... @wraps(f) ... defwrapper(): ... """wrapper_doc""" ... print('Calling decorated function') ... return f() ... return wrapper ... >>> @my_decorator ... defexample(): ... """example_doc""" ... print('Called example function') ... >>> example.__name__ 'example' >>> example.__doc__ 'example_doc' # 尝试去掉@wraps(f)来看一下运行结果,example自身的__name__和__doc__都已经丧失了 >>> example.__name__ 'wrapper' >>> example.__doc__ 'wrapper_doc'
我们也可以使用update_wrapper来改写
defg(): ... g = update_wrapper(g, f) # 等价于 @wraps(f) defg(): ...
@wraps内部实际上就是基于update_wrapper来实现的
defwraps(wrapped, assigned=WRAPPER_ASSIGNMENTS, updated=WRAPPER_UPDATES): defdecorator(wrapper): return update_wrapper(wrapper, wrapped=wrapped...) return decorator
Python2中可以通过自定义__cmp__的返回值0/-1/1来比较对象的大小,在Python3中废弃了__cmp__,但是我们可以通过total ordering然后修改/ _lt__() , __le__() , __eq__() , __ne__() , __gt__(), __ge__()等魔术方法来自定义类的比较规则
import functools @functools.total_ordering classMyObject: def__init__(self, val): self.val = val def__eq__(self, other): print(' testing __eq__({}, {})'.format( self.val, other.val)) return self.val == other.val def__gt__(self, other): print(' testing __gt__({}, {})'.format( self.val, other.val)) return self.val > other.val a = MyObject(1) b = MyObject(2) for expr in ['a < b', 'a <= b', 'a == b', 'a >= b', 'a > b']: print('/n{:<6}:'.format(expr)) result = eval(expr) print(' result of {}: {}'.format(expr, result))
下面是运行结果
a < b : testing __gt__(1, 2) testing __eq__(1, 2) result of a < b: True a <= b: testing __gt__(1, 2) result of a <= b: True a == b: testing __eq__(1, 2) result of a == b: False a >= b: testing __gt__(1, 2) testing __eq__(1, 2) result of a >= b: False a > b : testing __gt__(1, 2) result of a > b: False
count(start=0, step=1) 返回一个无限的整数流,每次增加1。你可以选择提供起始编号,默认为0
>>> from itertools import count >>> for i in zip(count(1), ['a', 'b', 'c']): ... print(i, end=' ') ... (1, 'a') (2, 'b') (3, 'c')
cycle(iterable) saves a copy of the contents of a provided iterable and returns a new iterator that returns its elements from first to last. The new iterator will repeat these elements infinitely.
>>> from itertools import cycle >>> for i in zip(range(6), cycle(['a', 'b', 'c'])): ... print(i, end=' ') ... (0, 'a') (1, 'b') (2, 'c') (3, 'a') (4, 'b') (5, 'c')
repeat(object[, times]) 返回提供的元素n次,如果未提供n则无限次返回
>>> from itertools import repeat >>> for i, s in zip(count(1), repeat('over-and-over', 5)): ... print(i, s) ... 1 over-and-over 2 over-and-over 3 over-and-over 4 over-and-over 5 over-and-over
accumulate(iterable[, func])
>>> from itertools import accumulate >>> import operator >>> list(accumulate([1, 2, 3, 4, 5], operator.add)) [1, 3, 6, 10, 15] >>> list(accumulate([1, 2, 3, 4, 5], operator.mul)) [1, 2, 6, 24, 120]
itertools.chain(*iterables)可以将多个iterable组合成一个iterator
>>> from itertools import chain >>> list(chain([1, 2, 3], ['a', 'b', 'c'])) [1, 2, 3, 'a', 'b', 'c']
chain的实现原理如下
defchain(*iterables): # chain('ABC', 'DEF') --> A B C D E F for it in iterables: for element in it: yield element
chain.from_iterable(iterable)和chain类似,但是只是接收单个iterable,然后将这个iterable中的元素组合成一个iterator
>>> from itertools import chain >>> list(chain.from_iterable(['ABC', 'DEF'])) ['A', 'B', 'C', 'D', 'E', 'F']
实现原理也和chain类似
deffrom_iterable(iterables): # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F for it in iterables: for element in it: yield element
compress(data, selectors)接收两个iterable作为参数,只返回selectors中对应的元素为True的data,当data/selectors之一用尽时停止
>>> list(compress([1, 2, 3, 4, 5], [True, True, False, False, True])) [1, 2, 5]
zip_longest(*iterables, fillvalue=None)和zip类似,但是zip的缺陷是iterable中的某一个元素被遍历完,整个遍历都会停止,具体差异请看下面这个例子
from itertools import zip_longest r1 = range(3) r2 = range(2) print('zip stops early:') print(list(zip(r1, r2))) r1 = range(3) r2 = range(2) print('/nzip_longest processes all of the values:') print(list(zip_longest(r1, r2)))
下面是输出结果
zip stops early: [(0, 0), (1, 1)] zip_longest processes all of the values: [(0, 0), (1, 1), (2, None)]
islice(iterable, stop) or islice(iterable, start, stop[, step]) 与Python的字符串和列表切片有一些,只是你不能对start、start和step使用负值
>>> from itertools import islice >>> for i in islice(range(100), 0, 100, 10): ... print(i, end=' ') ... 0 10 20 30 40 50 60 70 80 90
tee(iterable, n=2) 返回n个独立的iterator,n默认为2
from itertools import islice, tee r = islice(count(), 5) i1, i2 = tee(r) print('i1:', list(i1)) print('i2:', list(i2)) for i in r: print(i, end=' ') if i > 1: break
下面是输出结果,注意tee(r)后,r作为iterator已经失效,所以for循环没有输出值
i1: [0, 1, 2, 3, 4] i2: [0, 1, 2, 3, 4]
starmap(func, iterable)假设iterable将返回一个元组流,并使用这些元组作为参数调用func:
>>> from itertools import starmap >>> import os >>> iterator = starmap(os.path.join, ... [('/bin', 'python'), ('/usr', 'bin', 'java'), ... ('/usr', 'bin', 'perl'), ('/usr', 'bin', 'ruby')]) >>> list(iterator) ['/bin/python', '/usr/bin/java', '/usr/bin/perl', '/usr/bin/ruby']
filterfalse(predicate, iterable) 与filter()相反,返回所有predicate返回False的元素
itertools.filterfalse(is_even, itertools.count()) => 1, 3, 5, 7, 9, 11, 13, 15, ...
takewhile(predicate, iterable) 只要predicate返回True,不停地返回iterable中的元素。一旦predicate返回False,iteration将结束
def less_than_10(x): return x < 10 itertools.takewhile(less_than_10, itertools.count()) => 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 itertools.takewhile(is_even, itertools.count()) => 0
dropwhile(predicate, iterable) 在predicate返回True时舍弃元素,然后返回其余迭代结果
itertools.dropwhile(less_than_10, itertools.count()) => 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ... itertools.dropwhile(is_even, itertools.count()) => 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...
groupby(iterable, key=None) collects all the consecutive elements from the underlying iterable that have the same key value, and returns a stream of 2-tuples containing a key value and an iterator for the elements with that key. If you don’t supply a key function, the key is simply each element itself.
例子很清晰,直接看例子,上面的定义我就不翻译了
city_list = [('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL'), ('Anchorage', 'AK'), ('Nome', 'AK'), ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ'), ... ] def get_state(city_state): return city_state[1] itertools.groupby(city_list, get_state) => ('AL', iterator-1), ('AK', iterator-2), ('AZ', iterator-3), ... iterator-1 => ('Decatur', 'AL'), ('Huntsville', 'AL'), ('Selma', 'AL') iterator-2 => ('Anchorage', 'AK'), ('Nome', 'AK') iterator-3 => ('Flagstaff', 'AZ'), ('Phoenix', 'AZ'), ('Tucson', 'AZ')
groupby() assumes that the underlying iterable’s contents will already be sorted based on the key.
product(*iterables, repeat=1)
from itertools import product defshow(iterable): for i, item in enumerate(iterable, 1): print(item, end=' ') if (i % 3) == 0: print() print() print('Repeat 2:/n') show(product(range(3), repeat=2)) print('Repeat 3:/n') show(product(range(3), repeat=3))
Repeat 2: (0, 0) (0, 1) (0, 2) (1, 0) (1, 1) (1, 2) (2, 0) (2, 1) (2, 2) Repeat 3: (0, 0, 0) (0, 0, 1) (0, 0, 2) (0, 1, 0) (0, 1, 1) (0, 1, 2) (0, 2, 0) (0, 2, 1) (0, 2, 2) (1, 0, 0) (1, 0, 1) (1, 0, 2) (1, 1, 0) (1, 1, 1) (1, 1, 2) (1, 2, 0) (1, 2, 1) (1, 2, 2) (2, 0, 0) (2, 0, 1) (2, 0, 2) (2, 1, 0) (2, 1, 1) (2, 1, 2) (2, 2, 0) (2, 2, 1) (2, 2, 2)
permutations(iterable, r=None)返回长度为r的所有可能的组合
from itertools import permutations defshow(iterable): first = None for i, item in enumerate(iterable, 1): if first != item[0]: if first is not None: print() first = item[0] print(''.join(item), end=' ') print() print('All permutations:/n') show(permutations('abcd')) print('/nPairs:/n') show(permutations('abcd', r=2))
下面是输出结果
All permutations: abcd abdc acbd acdb adbc adcb bacd badc bcad bcda bdac bdca cabd cadb cbad cbda cdab cdba dabc dacb dbac dbca dcab dcba Pairs: ab ac ad ba bc bd ca cb cd da db dc
combinations(iterable, r) 返回一个iterator,提供iterable中所有元素可能组合的r元组。每个元组中的元素保持与iterable返回的顺序相同。下面的实例中,不同于上面的permutations,a总是在bcd之前,b总是在cd之前,c总是在d之前
from itertools import combinations defshow(iterable): first = None for i, item in enumerate(iterable, 1): if first != item[0]: if first is not None: print() first = item[0] print(''.join(item), end=' ') print() print('Unique pairs:/n') show(combinations('abcd', r=2))
下面是输出结果
Unique pairs: ab ac ad bc bd cd
combinations_with_replacement(iterable, r)函数放宽了一个不同的约束:元素可以在单个元组中重复,即可以出现aa/bb/cc/dd等组合
from itertools import combinations_with_replacement defshow(iterable): first = None for i, item in enumerate(iterable, 1): if first != item[0]: if first is not None: print() first = item[0] print(''.join(item), end=' ') print() print('Unique pairs:/n') show(combinations_with_replacement('abcd', r=2))
下面是输出结果
aa ab ac ad bb bc bd cc cd dd
operator.attrgetter( attr )和operator.attrgetter( *attrs )
我们通过下面这个例子来了解一下itergetter的用法
>>> classStudent: ... def__init__(self, name, grade, age): ... self.name = name ... self.grade = grade ... self.age = age ... def__repr__(self): ... return repr((self.name, self.grade, self.age)) >>> student_objects = [ ... Student('john', 'A', 15), ... Student('jane', 'B', 12), ... Student('dave', 'B', 10), ... ] >>> sorted(student_objects, key=lambda student: student.age) # 传统的lambda做法 [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)] >>> from operator import itemgetter, attrgetter >>> sorted(student_objects, key=attrgetter('age')) [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)] # 但是如果像下面这样接受双重比较,Python脆弱的lambda就不适用了 >>> sorted(student_objects, key=attrgetter('grade', 'age')) [('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
attrgetter的实现原理
defattrgetter(*items): if any(not isinstance(item, str) for item in items): raise TypeError('attribute name must be a string') if len(items) == 1: attr = items[0] defg(obj): return resolve_attr(obj, attr) else: defg(obj): return tuple(resolve_attr(obj, attr) for attr in items) return g defresolve_attr(obj, attr): for name in attr.split("."): obj = getattr(obj, name) return obj
operator.itemgetter(item)和operator.itemgetter(*items)
我们通过下面这个例子来了解一下itergetter的用法
>>> student_tuples = [ ... ('john', 'A', 15), ... ('jane', 'B', 12), ... ('dave', 'B', 10), ... ] >>> sorted(student_tuples, key=lambda student: student[2]) # 传统的lambda做法 [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)] >>> from operator import attrgetter >>> sorted(student_tuples, key=itemgetter(2)) [('dave', 'B', 10), ('jane', 'B', 12), ('john', 'A', 15)] # 但是如果像下面这样接受双重比较,Python脆弱的lambda就不适用了 >>> sorted(student_tuples, key=itemgetter(1,2)) [('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
itemgetter的实现原理
defitemgetter(*items): if len(items) == 1: item = items[0] defg(obj): return obj[item] else: defg(obj): return tuple(obj[item] for item in items) return g
operator.methodcaller(name[, args…])
methodcaller的实现原理
defmethodcaller(name, *args, **kwargs): defcaller(obj): return getattr(obj, name)(*args, **kwargs) return caller
DOCUMENTATION-FUNCTOOLS
DOCUMENTATION-ITERTOOLS
DOCUMENTATION-OPERATOR
HWOTO-FUNCTIONAL
HWOTO-SORTING
PYMOTW