Repost

Crawling web page content with Node.js (including single-page applications)

aimer

Remote web content crawler done right.

Motivation

Sometimes I want to grab some nice images from a URL like http://bbs.005.tv/thread-492392-1-1.html , so I made this little program, which combines node-fetch and cheerio to do the job. It uses nightmare to handle SPAs.

Install

$ npm install --save aimer 

Usage

const aimer = require('aimer')

aimer('http://some-url.com/a/b/c')
  .then($ => {
    $('img.nice-images').each(function () {
      const url = $(this).attr('src')
      console.log(url)
    })
  })

// or even a single-page website!
const nightmare = require('aimer/nightmare')
nightmare('http://some-url.com/#!/list')
  .then($ => {
    // your code goes here
  })
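The src values collected in the usage example are often relative paths, so they may need to be resolved against the page URL before they can be downloaded. Here is a minimal sketch of that step. The helper name toAbsoluteUrls and the sample paths are assumptions made for illustration; they are not part of aimer. It relies only on the URL class built into Node.js.

```javascript
// Hypothetical helper, not part of aimer: resolve scraped `src` values
// against the page URL and drop duplicates.
function toAbsoluteUrls(pageUrl, srcs) {
  const seen = new Set()
  for (const src of srcs) {
    if (!src) continue // skip <img> tags that have no src attribute
    try {
      seen.add(new URL(src, pageUrl).href)
    } catch (err) {
      // ignore values that cannot be parsed as a URL
    }
  }
  return [...seen]
}

// Sample input: the kind of values $(this).attr('src') might return.
const urls = toAbsoluteUrls('http://some-url.com/a/b/c', [
  '/img/1.jpg',
  'pics/2.png',
  '//cdn.some-url.com/3.gif',
  undefined,
  '/img/1.jpg'
])
console.log(urls)
```

Inside the .then($ => ...) callback, the src values could be pushed into an array and passed through this helper in one go.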

API

aimer(url, opts)

opts

cheerio

cheerio options, except that decodeEntities defaults to false here.

nightmare(url, opts)

Uses nightmare to retrieve HTML from url, which is useful for SPA websites.

opts

cheerio

cheerio options, except that decodeEntities defaults to false here.

nightmare

nightmare options.
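To make the option shapes concrete, here is a rough sketch of how the two entry points might be called with opts. It assumes that opts.cheerio and opts.nightmare are passed through to the respective libraries, as the descriptions above suggest. The names looksLikeSpa and load, the hash-route heuristic, and the option values are all hypothetical and not part of aimer.

```javascript
// Option shapes follow the API section: `cheerio` for both entry points,
// plus `nightmare` for the SPA one. The values are illustrative assumptions.
const staticOpts = { cheerio: { decodeEntities: true } }
const spaOpts = {
  cheerio: { decodeEntities: true },
  // `show` and `waitTimeout` are Nightmare constructor options; that aimer
  // forwards them unchanged is an assumption.
  nightmare: { show: false, waitTimeout: 10000 }
}

// Hypothetical heuristic: treat hash-routed URLs (#!/... or #/...) as SPAs.
function looksLikeSpa(url) {
  const { hash } = new URL(url)
  return hash.startsWith('#!') || hash.startsWith('#/')
}

// Lazy require, so the nightmare entry point is loaded only when it is used.
function load(url) {
  return looksLikeSpa(url)
    ? require('aimer/nightmare')(url, spaOpts)
    : require('aimer')(url, staticOpts)
}
```

With aimer installed, load('http://some-url.com/#!/list').then($ => { ... }) would go through nightmare, while a plain URL would go through the default fetcher.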

License

MIT © EGOIST

Source: https://github.com/egoist/aimer