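Original: Designing Pythonic APIs
When writing a package (library), giving it a good API is almost as important as its functionality (well, at least if you want others to use it). But what makes an API good? In this post I will try to offer some insight into that question by comparing how Requests and Urllib (part of the Python standard library) handle some classic HTTP scenarios, and look at why Requests has become the de facto standard among Python users.
** This post is adapted from a talk I gave last week at a local Python meetup (PywebIL). You can find the slides here.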
import urllib.request
urllib.request.urlopen('http://python.org/')
<http.client.HTTPResponse at 0x7fdb08b1bba8>
import requests
requests.get('http://python.org/')
<Response [200]>
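The difference between the two outputs above comes down to what each object returns from its repr. As a minimal sketch of the idea, the hypothetical `Response` class below (an illustration, not the actual Requests class) defines `__repr__()` so that the interactive prompt shows the status code instead of a memory address:

```python
class Response:
    """Hypothetical stand-in for an HTTP response object (illustration only)."""

    def __init__(self, status_code):
        self.status_code = status_code

    def __repr__(self):
        # Surface the most useful fact (the status code) rather than
        # the default "<... object at 0x...>" form.
        return '<Response [%s]>' % self.status_code


print(repr(Response(200)))
```

The point of the design is that the first thing a user sees at the prompt already answers the most common question: did the request succeed?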
... the data parameter ... (accomplished through the __repr__() method).
(requests/api.py):
def request(method, url, **kwargs):
    with sessions.Session() as session:
        return session.request(method=method, url=url, **kwargs)

def get(url, params=None, **kwargs):
    kwargs.setdefault('allow_redirects', True)
    return request('get', url, params=params, **kwargs)

def post(url, data=None, json=None, **kwargs):
    return request('post', url, data=data, json=json, **kwargs)
... the request() main flow function. ... implements a "helper function" for each request() action, which gives us the explicitness we are looking for.
import urllib.request
r = urllib.request.urlopen('http://python.org/')
r.getcode()
import requests
r = requests.get('http://python.org/')
r.status_code
... the @property decorator.
http/client.py:
class HTTPResponse(io.BufferedIOBase):
    # ...
    def getcode(self):
        return self.status
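To contrast the getter-method style above with attribute-style access, here is a minimal sketch built on a hypothetical `Response` class (the class and its `_status` field are assumptions made for illustration and are not taken from either library). It shows how the `@property` decorator lets a value be read like a plain attribute while still being computed by a method:

```python
class Response:
    """Hypothetical response class, for illustration only."""

    def __init__(self, status):
        self._status = status

    @property
    def status_code(self):
        # Read like a plain attribute: no call parentheses needed.
        return self._status

    def getcode(self):
        # Getter-method style, in the spirit of HTTPResponse.getcode().
        return self._status


r = Response(200)
print(r.status_code)
print(r.getcode())
```

Both calls return the same value; the property version simply reads more naturally, and because no setter is defined, assigning to `r.status_code` raises `AttributeError`.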
import urllib.parse
import urllib.request
import json

url = 'http://www.httpbin.org/post'
values = {'name' : 'Michael Foord'}
data = urllib.parse.urlencode(values).encode()
response = urllib.request.urlopen(url, data)
body = response.read().decode()
json.loads(body)
import requests

url = 'http://www.httpbin.org/post'
data = {'name' : 'Michael Foord'}
response = requests.post(url, data=data)
response.json()
Also note that requests offers an elegant way to send JSON content:
import requests

url = 'http://www.httpbin.org/post'
data = {'name' : 'Michael Foord'}
response = requests.post(url, json=data)
response.json()
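To make the difference between a form-encoded body and a JSON body concrete, here is a stdlib-only sketch of the two encodings. It illustrates the wire formats in general (typically sent as `application/x-www-form-urlencoded` and `application/json` respectively) and is not a copy of what Requests does internally; the variable names are chosen for this example.

```python
import json
import urllib.parse

payload = {'name': 'Michael Foord'}

# Form encoding: key=value pairs, the way an HTML form is submitted.
form_body = urllib.parse.urlencode(payload).encode()

# JSON encoding: the same payload serialized as a JSON document.
json_body = json.dumps(payload).encode()

print(form_body)
print(json_body)
```

With an API that accepts both a `data` and a `json` argument, the caller states the intent and the library picks the matching encoding, instead of the caller encoding the body by hand.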
The following creates persistent credentials for HTTP requests and then sends a request:
import urllib.request

gh_url = 'https://api.github.com/user'
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, gh_url, 'user', 'pswd')
handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
opener = urllib.request.build_opener(handler)
opener.open(gh_url)
import requests

session = requests.Session()
session.auth = ('user', 'pswd')
session.get('https://api.github.com/user')
But what if we only want to make a single HTTP call? Do we need all of that code? Here, requests lets you do this:
import requests
requests.get('https://api.github.com/user', auth=('user', 'pswd'))
requests/models.py:
def prepare_auth(self, auth, url=''):
    """Prepares the given HTTP auth data."""
    # ...
    if auth:
        if isinstance(auth, tuple) and len(auth) == 2:
            # special-case basic HTTP auth
            auth = HTTPBasicAuth(*auth)
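The special case in the excerpt above can be tried out in isolation. The sketch below is a simplified, self-contained imitation: the `BasicAuth` class, its `header()` method and the standalone `prepare_auth()` function are hypothetical names invented for this example, not the Requests implementation. It accepts either a ready-made auth object or a two-item tuple, and builds the standard HTTP Basic `Authorization` value (base64 of `user:password`).

```python
import base64


class BasicAuth:
    """Hypothetical auth class; a simplified stand-in for illustration."""

    def __init__(self, username, password):
        self.username = username
        self.password = password

    def header(self):
        # HTTP Basic auth: base64 of "user:password", prefixed with "Basic ".
        raw = '%s:%s' % (self.username, self.password)
        return 'Basic ' + base64.b64encode(raw.encode()).decode()


def prepare_auth(auth):
    # Convenience path: a (user, password) tuple becomes an auth object.
    if isinstance(auth, tuple) and len(auth) == 2:
        auth = BasicAuth(*auth)
    return auth


print(prepare_auth(('user', 'pswd')).header())
```

Accepting the tuple keeps the common case to one short argument, while callers with more elaborate needs can still pass a full auth object.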
... converts the (user, pass) tuple into an auth class.
from urllib.request import urlopen
response = urlopen('http://www.httpbin.org/geta')
response.getcode()
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-45-5fba039d189a> in <module>()
      1 from urllib.request import urlopen
----> 2 response = urlopen('http://www.httpbin.org/geta')
      3 response.getcode()

/usr/lib/python3.5/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    161     else:
    162         opener = _opener
--> 163     return opener.open(url, data, timeout)
    164
    165 def install_opener(opener):

/usr/lib/python3.5/urllib/request.py in open(self, fullurl, data, timeout)
    470         for processor in self.process_response.get(protocol, []):
    471             meth = getattr(processor, meth_name)
--> 472             response = meth(req, response)
    473
    474         return response

/usr/lib/python3.5/urllib/request.py in http_response(self, request, response)
    580         if not (200 <= code < 300):
    581             response = self.parent.error(
--> 582                 'http', request, response, code, msg, hdrs)
    583
    584         return response

/usr/lib/python3.5/urllib/request.py in error(self, proto, *args)
    508         if http_err:
    509             args = (dict, 'default', 'http_error_default') + orig_args
--> 510             return self._call_chain(*args)
    511
    512 # XXX probably also want an abstract factory that knows when it makes

/usr/lib/python3.5/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    442         for handler in handlers:
    443             func = getattr(handler, meth_name)
--> 444             result = func(*args)
    445             if result is not None:
    446                 return result

/usr/lib/python3.5/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    588 class HTTPDefaultErrorHandler(BaseHandler):
    589     def http_error_default(self, req, fp, code, msg, hdrs):
--> 590         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    591
    592 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 404: NOT FOUND
import requests
r = requests.get('http://www.httpbin.org/geta')
r.status_code
... exceptions ... does not.
Usage examples:
from urllib.request import urlopen
from urllib.error import URLError, HTTPError

try:
    response = urlopen('http://www.httpbin.org/geta')
except HTTPError as e:
    if e.code == 404:
        print('Page not found')
else:
    # Runs only when urlopen() did not raise.
    print('All good')
Page not found
from requests.exceptions import HTTPError
import requests

r = requests.get('http://www.httpbin.org/posta')
try:
    r.raise_for_status()
except HTTPError as e:
    if e.response.status_code == 404:
        print('Page not found')
Page not found
import requests

r = requests.get('http://www.httpbin.org/geta')
if r.ok:
    print('All good')
elif r.status_code == requests.codes.not_found:
    print('Page not found')
Page not found
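The two styles shown above (checking a status flag, or opting in to an exception) can coexist on a single object. The sketch below uses hypothetical `Response` and `HTTPError` classes written for this example; it is a simplified illustration of the design, not the Requests source, and the "below 400 means success" rule is an assumption of the sketch.

```python
class HTTPError(Exception):
    """Hypothetical exception that carries the response which triggered it."""

    def __init__(self, response):
        super().__init__('HTTP Error %s' % response.status_code)
        self.response = response


class Response:
    """Hypothetical response class, for illustration only."""

    def __init__(self, status_code):
        self.status_code = status_code

    @property
    def ok(self):
        # Assumption for this sketch: any status below 400 counts as success.
        return self.status_code < 400

    def raise_for_status(self):
        # Raising is opt-in: the caller decides whether an error status
        # should be treated as exceptional.
        if not self.ok:
            raise HTTPError(self)


r = Response(404)
if r.ok:
    print('All good')
elif r.status_code == 404:
    print('Page not found')
```

Returning a response for every completed exchange and leaving the decision to raise with the caller keeps simple scripts free of try/except blocks, while stricter code can still call `raise_for_status()`.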
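That's it for now. I learned a lot while preparing this talk/article (Ele's note: I learned a lot while translating it too, O(∩_∩)O~), and I hope you did as well by reading it. I would be glad to see your comments below or on Twitter (@noamelf) (Ele's note: feel free to comment on the original post).
If you, like many people including me, ended up wondering why there is such a sharp difference between Requests and Urllib: Nick Coghlan shared his broad perspective on the question in the comments below and in a blog post with a self-explanatory title: What problem does it solve? (Ele's note: I happen to have translated that post into Chinese as well).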