Reposted

New Seller Center Bigpipe in Practice (Part 2)

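Since the previous post, New Seller Center Bigpipe in Practice (Part 1), explained the ideas and principles behind a Bigpipe implementation, spring has arrived in the blink of an eye. The work itself began in winter, pushing forward against the cold wind, and things have gradually warmed up since. There were many impressions and lessons along the way, and this post shares them. There is a lot of code, so bring your own compiler.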

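Core problem

Every technology is created or adopted to solve a problem, so before starting, let's look at the problems to be solved:

Synchronous loading of first-screen modules: the server generates the content of each module in parallel, but rendering on the client depends on when the last piece of content is generated. The pain point here is synchronization. Because several modules must be synchronized, the browser inevitably waits, and when the browser waits, the user waits.
So we switched to asynchronous loading on scroll: the page frame is output directly first, decorated with a few loading spinners, and then the first-screen modules appear one by one through asynchronous requests. Although whatever arrives can be rendered on the client right away, there is still a sense of delay. The pain point here is requests: every module needs one extra request, which also takes time.
Perhaps the Facebook engineers thought along these lines: with a single request, the server processes the first-screen modules in parallel, each piece of generated content is sent straight to the client for rendering, and the user sees content immediately. That would be great.
In fact, the idea behind Bigpipe was inspired by the pipelining of microprocessors.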

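Technical breakthrough

The main body of the Seller Center is also built from functional modules, so the problem is the same one Facebook faced. Put another way, the core question is: over a single request, can the server transfer dynamic content to the client in chunks, to be rendered as it arrives, until the content is fully transferred and the request ends?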

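Concept

Key technique: chunked transfer encoding in the HTTP protocol (available since HTTP 1.1).
If the Transfer-Encoding header of an HTTP message (a request or a response) has the value chunked, the message body consists of an unspecified number of chunks and ends with a final chunk of size 0.
This mechanism lets the page content be split into multiple chunks; the server and the browser establish a pipeline and manage their work at different stages.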
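
To make the wire format concrete, here is a minimal sketch of how a body is framed under chunked encoding. The helper names encodeChunk and encodeChunkedBody are made up for illustration; Node.js does this framing internally, so you would not normally write it yourself.

```javascript
// Frame one piece of data as an HTTP/1.1 chunk: <size in hex>\r\n<data>\r\n
// (hypothetical helper, for illustration only)
function encodeChunk(data) {
  var size = Buffer.byteLength(data, 'utf8'); // the size counts bytes, not characters
  return size.toString(16) + '\r\n' + data + '\r\n';
}

// Frame a whole body: every non-empty piece becomes a chunk,
// and a chunk of size 0 marks the end of the body.
function encodeChunkedBody(pieces) {
  return pieces
    .filter(function (piece) { return piece.length > 0; })
    .map(encodeChunk)
    .join('') + '0\r\n\r\n';
}

console.log(JSON.stringify(encodeChunkedBody(['hello', ' world'])));
```

Empty pieces are skipped because a zero-size chunk would be read as the end of the body.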

Implementation

How chunked data transfer is implemented differs from language to language.

The PHP way

<html>
<head>
    <title>php chunked</title>
</head>
<body>

    <?php sleep(1); ?>
    <div id="moduleA"><?php echo 'moduleA' ?></div>
    <?php ob_flush(); flush(); ?>
    
    <?php sleep(3); ?>
    <div id="moduleB"><?php echo 'moduleB' ?></div>
    <?php ob_flush(); flush(); ?>
    
    <?php sleep(2); ?>
    <div id="moduleC"><?php echo 'moduleC' ?></div>
    <?php ob_flush(); flush(); ?>

</body>
</html>

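PHP uses ob_flush and flush to push the buffered page to the browser piece by piece. In the network panel the page shows Transfer-Encoding=chunked, and the content is rendered chunk by chunk.
PHP does not support threads out of the box, so the server cannot use multiple threads to process the content of several modules in parallel.
PHP does have approaches to concurrent execution; they are not covered here, but interested readers can dig deeper.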

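The Java way

Java also has flush-like functions that implement chunked transfer for simple pages.
Java is multithreaded, which makes it convenient to process the content of each module in parallel.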

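Thoughts on flush

Yahoo's 34 performance optimization rules suggest flushing right after the head, so that the browser can start downloading the CSS/JS referenced in the head early.
We flush the content to the browser chunk by chunk, and what is flushed first should be what users care about most. For example, Yahoo used to flush the search box first, because it was the core feature.
The size of each flush needs to be split sensibly; large content can be broken into smaller pieces.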
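
The first two points can be sketched as follows. This is only an illustration: flushByPriority and the priority field are hypothetical names, and res stands for anything with a write() method, such as a Node.js response object.

```javascript
// Write the <head> first, then the modules in ascending priority order
// (a lower number means the user cares about it more).
function flushByPriority(res, headHtml, modules) {
  res.write(headHtml); // lets the browser start fetching CSS/JS right away
  modules
    .slice() // copy, so the caller's array is not reordered
    .sort(function (a, b) { return a.priority - b.priority; })
    .forEach(function (mod) { res.write(mod.html); });
}
```

With a search box at priority 1 and an ad slot at priority 3, the search box reaches the browser first regardless of the order in which the modules were declared.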

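Node.js implementation

After comparing the strengths and weaknesses of PHP and Java for implementing Bigpipe, it is easy to find happiness in Node.js.

The asynchronous nature of Node.js makes parallel work easy to handle.
Full control of the View layer is a natural advantage when data is processed on the server and rendered on the client.
The HTTP interfaces in Node.js are designed to support many features of the HTTP protocol that have traditionally been difficult to use.

Back to HelloWorld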

var http = require('http');

http.createServer(function (request, response){
  response.writeHead(200, {'Content-Type': 'text/html'});
  response.write('hello');
  response.write(' world ');
  response.write('~ ');
  response.end();
}).listen(8080, "127.0.0.1");

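The HTTP header shows Transfer-Encoding=chunked. Amazing!
If we only call response.write and never call response.end, the response does not finish and the browser keeps the request open. Until response.end is called, we can keep flushing content with response.write.
The Node.js implementation of Bigpipe starts from HelloWorld, which is a little exciting.

A more complete example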
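
To see the request staying open between writes, the pieces can be written with a pause in between. The sketch below is an assumption-laden illustration: writeInSteps is a made-up helper, and response is any object with write() and end(), such as the one passed to the HelloWorld handler.

```javascript
// Write each part `intervalMs` apart, then end the response.
// Until end() is called, the browser keeps the request open
// and can show whatever has already arrived.
function writeInSteps(response, parts, intervalMs) {
  var index = 0;

  function next() {
    if (index === parts.length) {
      response.end(); // nothing left to send: finish the response
      return;
    }
    response.write(parts[index]);
    index += 1;
    setTimeout(next, intervalMs);
  }

  next();
}
```

Inside the HelloWorld handler this could be called as writeInSteps(response, ['hello', ' world ', '~ '], 1000) right after writeHead, so that the three pieces arrive about one second apart.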

layout.html

<!DOCTYPE html>
<html>
<head>
 <!-- css and js tags -->
    <link rel="stylesheet" href="index.css" />
    <script>
    function renderFlushCon(selector, html) {
        document.querySelector(selector).innerHTML = html;
    }
    </script>
</head>
<body>
    <div id="A"></div>
    <div id="B"></div>
    <div id="C"></div>

The head holds the assets to be loaded.
The page frame is output, with placeholders for modules A/B/C.

var http = require('http');
var fs = require('fs');

http.createServer(function(request, response) {
  response.writeHead(200, { 'Content-Type': 'text/html' });

  // flush layout and assets
  var layoutHtml = fs.readFileSync(__dirname + "/layout.html").toString();
  response.write(layoutHtml);
  
  // fetch data and render
  response.write('<script>renderFlushCon("#A","moduleA");</script>');
  response.write('<script>renderFlushCon("#C","moduleC");</script>');
  response.write('<script>renderFlushCon("#B","moduleB");</script>');
  
  // close body and html tags
  response.write('</body></html>');
  // finish the response
  response.end();
}).listen(8080, "127.0.0.1");

Page output:

moduleA
moduleB
moduleC

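Flush the layout first; it contains the function the browser uses for rendering.
Then comes the core part: fetch data, assemble templates, and flush executable content to the browser.
The browser renders it (parallel processing has not been introduced yet).
Close the body and html tags.
End the response to complete the request.

express implementation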

var express = require('express');
var app = express();
var fs = require('fs');

app.get('/', function (req, res) {
  // flush layout and assets
  var layoutHtml = fs.readFileSync(__dirname + "/layout.html").toString();
  res.write(layoutHtml);
  
  // fetch data and render
  res.write('<script>renderFlushCon("#A","moduleA");</script>');
  res.write('<script>renderFlushCon("#C","moduleC");</script>');
  res.write('<script>renderFlushCon("#B","moduleB");</script>');
  
  // close body and html tags
  res.write('</body></html>');
  // finish the response
  res.end();
});

app.listen(3000);

Page output:

moduleA
moduleB
moduleC

express is built on top of the built-in Node.js HTTP module, so the implementation is much the same.

koa implementation

var koa = require('koa');
var app = koa();

app.use(function *() {
    this.body = 'Hello world';
});

app.listen(3000);

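Koa does not support calling the underlying res directly to handle the response. res.write()/res.end() is a minefield, and I had the fortune of stepping on it.
In koa, the context this wraps the Node.js request and response objects. this.body is a property of the response object.
It feels as if all that is left in the koa world is generators and this.body. What now? Keep reading the docs.
this.body can be set to a string, a buffer, a stream, an object, or null.
Stream, stream, stream: saying it three times marks it as important.

The meaning of streams

On streams, I recommend @愈之's article 通通连起来 – 无处不在的流 (roughly, "Connect Everything: Streams Everywhere"). It left a deep impression and gave me a new understanding of streams, so let's start connecting things.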

var koa = require('koa');
var View = require('./view');
var app = module.exports = koa();

app.use(function* () {
  this.type = 'html';
  this.body = new View(this);
});

app.listen(3000);

view.js

var Readable = require('stream').Readable;
var util = require('util');
var co = require('co');
var fs = require('fs');

module.exports = View

util.inherits(View, Readable);

function View(context) {
  Readable.call(this, {});

  // render the view on a different loop
  co.call(this, this.render).catch(context.onerror);
}

View.prototype._read = function () {};

View.prototype.render = function* () {
  // flush layout and assets
  var layoutHtml = fs.readFileSync(__dirname + "/layout.html").toString();
  this.push(layoutHtml);
  
  // fetch data and render
  this.push('<script>renderFlushCon("#A","moduleA");</script>');
  this.push('<script>renderFlushCon("#C","moduleC");</script>');
  this.push('<script>renderFlushCon("#B","moduleB");</script>');
  
  // close body and html tags
  this.push('</body></html>');
  // end the stream
  this.push(null);
};

Page output:

moduleA
moduleB
moduleC

Transfer-Encoding:chunked
A pipeline is established between the server and the browser, and this.push sends content from the server to the browser.
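
The push mechanism can also be observed without koa. The following is a small sketch using only the Node.js stream module; PushView is a made-up name, and the 'data' listener stands in for the HTTP response that koa would pipe the stream into.

```javascript
var Readable = require('stream').Readable;
var util = require('util');

// A Readable whose content is pushed from outside, as in view.js.
function PushView() {
  Readable.call(this);
}
util.inherits(PushView, Readable);

// No-op: data is pushed whenever it becomes available.
PushView.prototype._read = function () {};

var view = new PushView();
var received = [];

// Stand-in for the consumer (koa would pipe the stream to the response).
view.on('data', function (chunk) { received.push(chunk.toString()); });
view.on('end', function () { console.log(received.join('')); });

view.push('<div id="A"></div>');
view.push('<script>renderFlushCon("#A","moduleA");</script>');
view.push(null); // signals the end of the stream
```

Everything pushed before push(null) reaches the consumer in order, and push(null) is what lets the response finish.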

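Implementing parallelism

So far we have implemented chunked transfer in koa and express. We know that modules A, B, and C should generate their content in parallel on the server. At this point, let's look back at the traditional way of rendering a page, where modules A / B / C are rendered synchronously:

[Figure: traditional rendering, modules A / B / C rendered synchronously]

With chunked transfer, A / B / C are executed sequentially on the server and transferred to the browser chunk by chunk for rendering:

[Figure: chunked transfer, modules A / B / C executed sequentially on the server]

The time is clearly shorter. Then, if sequential execution on the server is replaced with parallel execution:

[Figure: chunked transfer, modules A / B / C executed in parallel on the server]

The figure makes the value of parallelism obvious. To find a way to execute in parallel, we have to trace back the history of asynchronous programming. (Reading history makes one wise, and shows how hard-won the present is.)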
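
The three cases can be compared with a little arithmetic. The sketch below rests on stated assumptions: the traditional page generates the modules one after another and sends everything at the end, network and layout time are ignored, and the costs (1000, 3000, 2000 ms) are borrowed from the sleep values in the PHP example. The function name timeline is made up.

```javascript
// Given the server time each module needs (ms), compute when each module
// becomes visible to the user under the three strategies.
function timeline(costs) {
  var total = costs.reduce(function (sum, cost) { return sum + cost; }, 0);

  // Traditional: nothing is shown until every module has been generated.
  var synchronous = costs.map(function () { return total; });

  // Chunked + sequential: a module is shown once it and its predecessors are done.
  var elapsed = 0;
  var chunkedSequential = costs.map(function (cost) {
    elapsed += cost;
    return elapsed;
  });

  // Chunked + parallel: a module is shown after its own cost only.
  var chunkedParallel = costs.slice();

  return {
    synchronous: synchronous,
    chunkedSequential: chunkedSequential,
    chunkedParallel: chunkedParallel
  };
}

console.log(timeline([1000, 3000, 2000]));
// synchronous:       [6000, 6000, 6000]
// chunkedSequential: [1000, 4000, 6000]
// chunkedParallel:   [1000, 3000, 2000]
```

Under these assumptions the last module appears after 6000 ms in the first two cases but after 3000 ms with parallel execution, and the fastest module no longer waits for the slower ones.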

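The callback way

First, implementing asynchronous code with deeply nested callbacks is hell.
Second, the choice is to go around hell and use a mature module instead.

The async way

async is a veteran among flow-control libraries for asynchronous code.
parallel(tasks, [callback]) runs multiple functions in parallel; each function starts immediately without waiting for the others. The data in the array passed to the final callback follows the order in which the tasks were declared, not the order in which they completed.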

var Readable = require('stream').Readable;
var inherits = require('util').inherits;
var co = require('co');
var fs = require('fs');
var async = require('async');


inherits(View, Readable);

function View(context) {
  Readable.call(this, {});

  // render the view on a different loop
  co.call(this, this.render).catch(context.onerror);
}

View.prototype._read = function () {};

View.prototype.render = function* () {
  // flush layout and assets
  var layoutHtml = fs.readFileSync(__dirname + "/layout.html").toString();
  this.push(layoutHtml);

  var context = this;

  async.parallel([
    function(cb) {
      setTimeout(function(){
        context.push('<script>renderFlushCon("#A","moduleA");</script>');
        cb();
      }, 1000);
    },
    function(cb) {
      context.push('<script>renderFlushCon("#C","moduleC");</script>');
      cb();
    },
    function(cb) {
      setTimeout(function(){
        context.push('<script>renderFlushCon("#B","moduleB");</script>');
        cb();
      }, 2000);
    }
  ], function (err, results) {
    // close body and html tags
    context.push('</body></html>');
    // end the stream
    context.push(null);
  });
  
};

module.exports = View;

Page output:

moduleC
moduleA
moduleB

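The modules appear in the order C > A > B, which also shows that I/O in Node.js is non-blocking.
The layout is flushed first.
async.parallel processes A, B, and C in parallel, and each task calls cb() to signal that it is done.
When all tasks are done, the final callback runs; it closes the body/html tags and ends the stream.

If any task passes an error to its callback, the final callback is invoked immediately. Tasks that have already started are not cancelled and keep running, but the final callback is not called again, so error handling needs care with this approach.

async also has an each method that can run asynchronous work in parallel: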
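
One way to keep a single failing module from affecting the rest is to handle the error inside each task. The sketch below uses native Promises rather than async.parallel; the task shape (id, render, fallback) and the name flushSafely are hypothetical, and push stands for context.push.

```javascript
// Each task has an id, a render() that returns a promise of HTML,
// and fallback HTML to show if rendering fails.
function flushSafely(tasks, push) {
  var jobs = tasks.map(function (task) {
    return task.render()
      .catch(function () {
        return task.fallback; // isolate the failure to this module
      })
      .then(function (html) {
        push('<script>renderFlushCon(' +
          JSON.stringify('#' + task.id) + ',' +
          JSON.stringify(html) + ');</script>');
      });
  });

  // Resolves once every module, failed or not, has been pushed.
  return Promise.all(jobs);
}
```

The closing tags and push(null) would then go in a then() on the returned promise, so the stream is ended exactly once and only after all modules have been written.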

each(arr, iterator(item, callback), callback(err))

With a small change:

var options = [
  {id:"A",html:"moduleA",delay:1000},
  {id:"B",html:"moduleB",delay:0},
  {id:"C",html:"moduleC",delay:2000}
];


async.forEach(options, function(item, callback) { 
  setTimeout(function(){
    context.push('<script>renderFlushCon("#'+item.id+'","'+item.html+'");</script>');
    callback();
  }, item.delay);
  
}, function(err) { 
  // close body and html tags
  context.push('</body></html>');
  // end the stream
  context.push(null);
});

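The result matches the parallel approach. The difference is that this style focuses on the execution process, while parallel is more often about the task data.

Notice that co was already required when we used async. co is also a powerful tool for asynchronous programming, so let's see whether it offers a simpler way.

co

co is a tool that simplifies asynchronous flow control. Can we use its generator support to reach our goal of parallel execution? The scenario we need is simple:

Several task functions run in parallel, and when the last one completes, a notification triggers the follow-up work.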

var Readable = require('stream').Readable;
var inherits = require('util').inherits;
var co = require('co');
var fs = require('fs');
// var async = require('async');

inherits(View, Readable);

function View(context) {
  Readable.call(this, {});

  // render the view on a different loop
  co.call(this, this.render).catch(context.onerror);
}

View.prototype._read = function () {};

View.prototype.render = function* () {
  // flush layout and assets
  var layoutHtml = fs.readFileSync(__dirname + "/layout.html").toString();
  this.push(layoutHtml);

  var context = this;
  var options = [
    {id:"A",html:"moduleA",delay:100},
    {id:"B",html:"moduleB",delay:0},
    {id:"C",html:"moduleC",delay:2000}
  ];

  var taskNum = options.length;
  var exec = options.map(function(item){opt(item,function(){
    taskNum --;
    if(taskNum === 0) {
      done();
    } 
  })});

  function opt(item,callback) {
    setTimeout(function(){
      context.push('<script>renderFlushCon("#'+item.id+'","'+item.html+'");</script>');
      callback();
    }, item.delay);
  }

  function done() {
    context.push('</body></html>');
      // end the stream
    context.push(null);
  }

  co(function* () {
     yield exec;
  });  
};

module.exports = View;

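yield array runs the tasks in the array in parallel.
To avoid promises when the number of tasks is known in advance, a counter was added to decide when everything has finished. Is there a better pure-co way?
At this point it became clear that I was not yet fluent with generators and needed to catch up.

co with promise

This approach was kindly contributed by @大果 and is much more elegant to write.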

var options = [
  {id:"A",html:"moduleAA",delay:100},
  {id:"B",html:"moduleBB",delay:0},
  {id:"C",html:"moduleCC",delay:2000}
];

var exec = options.map(function(item){ return opt(item); });

function opt(item) {
  return new Promise(function (resolve, reject) {
  setTimeout(function(){
      context.push('<script>renderFlushCon("#'+item.id+'","'+item.html+'");</script>');
      resolve(item);
    }, item.delay);
  });
}

function done() {
  context.push('</body></html>');
    // end the stream
  context.push(null);
}

co(function* () {
   yield exec;
}).then(function(){
  done();
});

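ES7 async/await

If it becomes a standard and gets adopted, the code should become more concise and readable, and the implementation idea clearer.

async function flush(Something) {
  await Promise.all([moduleA.flush(), moduleB.flush(), moduleC.flush()]);
  context.push('</body></html>');
  // end the stream
  context.push(null);
}

This code has never been run or verified; the idea and the code are put here for when ES7 is up and running ^_^.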
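
For readers on a Node.js version with async/await, the same idea can be written out in runnable form. This is a sketch under assumptions: delay and flushModule are made-up helpers that simulate module work with timers, and sink is anything with a push() method, such as the View stream used earlier.

```javascript
// Resolve after `ms` milliseconds.
function delay(ms) {
  return new Promise(function (resolve) { setTimeout(resolve, ms); });
}

// Hypothetical module renderer: wait, then push the module's script.
async function flushModule(sink, item) {
  await delay(item.delay);
  sink.push('<script>renderFlushCon("#' + item.id + '","' + item.html + '");</script>');
}

async function flushAll(sink, options) {
  // Start all modules at once and wait until the slowest one is done.
  await Promise.all(options.map(function (item) {
    return flushModule(sink, item);
  }));
  sink.push('</body></html>');
  sink.push(null); // end the stream
}
```

Because all flushModule calls are started before the first await, the modules are pushed in order of completion, and the closing tags are pushed only after the last one.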

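Midway

The sun has already set as I write this. If I ended here with "to find out what happens next, tune in to the next episode", everything above would be a novel without a protagonist.

Midway is a good thing, a product of separating front end and back end. Separation does not mean no contact; it means closer and smoother cooperation. Because responsibilities are clear, front end and back end can sometimes settle a requirement with little more than "you know what I mean" and "got it". Replacing the View layer of Webx MVC with Node.js makes it far more convenient for the front end to implement Bigpipe.

>Midway wraps koa's functionality and hides some of the complexity, exposing only the simplest MVC parts to the front end, which removes a large share of the configuration cost.

Some facts

Midway actually supports both the express and koa frameworks. koa is probably the mainstream choice now, and Midway will probably stop supporting both frameworks after 5.1.
Midway has better support for generators.
midway-render: this.render(xtpl, data) outputs the content to the page directly through this.body.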

function renderView(basePath, viewName, data) {
  var me = this;
  var filepath = path.join(basePath, viewName);
  data = utils.assign({}, me.state, data);
  return new Promise(function(resolve, reject) {
    function callback(err, ret) {
      if (err) {
        return reject(err);
      }
      // assign the assembled result directly to this.body
      me.body = ret;
      resolve(ret);
    }
    render(filepath, data, callback);
  });
}

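MVC

Midway focuses on front-end/back-end separation. The Model layer is really a proxy for the back-end Model, and the data is supplied by the back end.
The View layer uses xtpl templates, so front end and back end share the same templates.
The Controller ties routing and views together; this.render is usually called in the Controller.

Where Bigpipe fits

The point of all this Midway background is to work out where Bigpipe should plug into Midway:

Bigpipe has to transfer content in chunks, so it is also used in the Controller.
Template assembly relies on midway-xtpl to produce the final string, which is then output in chunks through Bigpipe.
Bigpipe can take care of fetching data and assembling the content for each module.

The recommendation is to bring Bigpipe into the Controller as a module and use it in place of this.render to output content in chunks.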

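Scenarios

Which scenarios suit Bigpipe, given what we have now and how we develop?

Pages like the Seller Center: many modules, a long page, and a first screen that holds the user's core content.
Each module's functionality is relatively independent, and so are its template and data.
For modules outside the first screen, loading on scroll is still recommended, to reduce how much the first screen has to transfer.
The main frame outputs the assets and the scripts Bigpipe needs; most importantly, it has to reserve placeholders for the modules.
The first-screen modules can be fixed or determined by calculation.
Besides chunked output, modules should ideally also support asynchronous loading and rendering.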

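Wrapping it up

Finally, here is how the Seller Center uses Bigpipe and how it is wrapped. Built around the chunked transfer and parallel execution implemented above, the current wrapper looks like this:

Besides assembling the template, Midway's this.render assigns the content directly to this.body, which ends the request right away and makes our goal of chunked transfer impossible. So a small extension was made:

A method this.Html was added to the midway-render engine; it only assembles the template and does not output it.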

// just output html no render;
 app.context.Html = utils.partial(engine.renderViewText, config.path);

renderViewText

function renderViewText(basePath, viewName, data) {
  var me = this;
  var filepath = path.join(basePath, viewName);
  data = utils.assign({}, me.state, data);

  return new Promise(function(resolve, reject) {
    render(filepath, data, function(err, ret){
      if (err) {
        return reject(err);
      }
      // me.body = ret has been removed here
      resolve(ret);
    });
  });
}

midway-render/midway-xtpl probably have an extension point, but I could not find how to use it, so I went with this approach.

View.js module

'use strict';
var util = require('util');
var async = require('async');
var Readable = require('stream').Readable;

var midway = require('midway');
var DataProxy = midway.getPlugin('dataproxy');

// default main layout
var defaultLayout = '<!DOCTYPE html><html><head></head><body></body>';

exports.createView = function() {
  function noop() {};

  util.inherits(View, Readable);

  function View(ctx, options) {
    Readable.call(this);

    ctx.type = 'text/html; charset=utf-8';
    ctx.body = this;
    ctx.options = options;
    this.context = ctx;

    this.layout = options.layout || defaultLayout;
    this.pagelets = options.pagelets || [];
    this.mod = options.mod || 'bigpipe';
    this.endCB = options.endCB || noop;
  }

  /**
   *
   * @type {noop}
   * @private
   */
  View.prototype._read = noop;


  /**
   * Flush the content
   */
  View.prototype.flush = function* () {
    // flush layout
    yield this.flushLayout();

    // flush pagelets
    yield this.flushPagelets();
  };

  /**
   * Flush the main layout
   */
  View.prototype.flushLayout = function* () {
    this.push(this.layout);
  }

  /**
   * Flush the content of the pagelets
   */
  View.prototype.flushPagelets = function* () {
    var self = this;
    var pagelets = this.pagelets;

    // run in parallel
    async.each(pagelets, function(pagelet, callback) {
      self.flushSinglePagelet(pagelet, callback);
    }, function(err) {
      self.flushEnd();
    });
  }


  /**
   * Flush a single pagelet
   * @param pagelet
   * @param callback
   */
  View.prototype.flushSinglePagelet = function(pagelet, callback) {
    var self = this,
      context = this.context;

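    this.getDataByDataProxy(pagelet,function(data){
      // formateData is optional; keep the raw data when it is missing or returns nothing
      if (typeof pagelet.formateData === 'function') {
        data = pagelet.formateData(data, pagelet) || data;
      }

      context.Html(pagelet.tpl, data).then(function(html) {
        var selector = '#' + pagelet.id;
        var js = pagelet.js;

        self.arrive(selector,html,js);

        callback();
      }).catch(function(err) {
        // skip this pagelet on a render failure; calling back without an error
        // lets the other pagelets finish before the stream is ended
        console.error(err);
        callback();
      });
    });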
  }

  /**
   * Fetch data from the back end
   * @param pagelet
   * @param callback
   */
  View.prototype.getDataByDataProxy = function(pagelet, callback) {
    var context = this.context;

    if (pagelet.proxy) {
      var proxy = DataProxy.create({
        getData: pagelet.proxy
      });

      proxy.getData()
        .withHeaders(context.request.headers)
        .done(function(data) {
          callback && callback(data);
        })
        .fail(function(err) {
          console.error(err);
          // still call back with empty data, otherwise the stream would never end
          callback && callback({});
        });
    }else {
      callback&&callback({});
    }
  }

  /**
   * Close the html tag and end the stream
   */
  View.prototype.flushEnd = function() {
    this.push('</html>');
    this.push(null);
  }



  // Replace the contents of `selector` with `html`.
  // Optionally execute the `js`.
  View.prototype.arrive = function (selector, html, js) {
      this.push(wrapScript(
          'BigPipe(' +
              JSON.stringify(selector) + ', ' +
              JSON.stringify(html) +
              (js ? ', ' + JSON.stringify(js) : '') + ')'
      ))
  }



  function wrapScript(js) {
    var id = 'id_' + Math.random().toString(36).slice(2)

    // the script removes its own tag after running; remove() must exist on the client
    return '<script id="' + id + '">'
      + js
      + ';remove(\'#' + id + '\');</script>'
  }

  return View;
}

context.Html assembles the content of each pagelet.

Calling it from the Controller
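
The scripts produced by arrive() and wrapScript() call two client-side functions, BigPipe and remove, which the article does not show. The following is only a guess at what they might look like, assuming they are defined in a script tag in the layout and that js is a string of code to run; the real implementations may differ.

```javascript
// Hypothetical client-side counterpart of the BigPipe(...) call emitted by arrive():
// fill the placeholder with the pagelet's HTML, then run its init code if any.
function BigPipe(selector, html, js) {
  var el = document.querySelector(selector);
  if (!el) {
    return; // placeholder missing: nothing to fill
  }
  el.innerHTML = html;
  if (js) {
    new Function(js)(); // assumption: js is a string of code to execute
  }
}

// Hypothetical counterpart of the remove(...) call emitted by wrapScript():
// drop the script tag once it has run, to keep the DOM clean.
function remove(selector) {
  var el = document.querySelector(selector);
  if (el && el.parentNode) {
    el.parentNode.removeChild(el);
  }
}
```

Both functions have to be defined before the first pagelet script arrives, which is why the layout's head is the natural place for them.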

var me = this;
var layoutHtml = yield this.Html('p/seller_admin_b/index', data);

yield new View(me, {
  layout: layoutHtml, // the assembled layout template
  pagelets: pageletsConfig,
  mod: 'bigpipe'  // reserved for mode selection
}).flush();

layoutHtml is the assembled main frame template.
The configuration of each pagelet:

{
    id: 'seller_info', // unique id of this pagelet
    proxy: 'Seller.Module.Data.seller_info', // interface configuration
    tpl: 'sellerInfo.xtpl', // template to use
    js: '' // js to execute
}

proxy and tpl: fetching the data and assembling the template need to run in parallel.
js usually initializes the module.

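Improvements

The ideas and code are all based on our current scenarios and technical background. So far there is only an approach and some experiments, not yet a unified solution, and more scenarios are needed to support it. Some points can still be improved:

The code could be reworked with new ES6/ES7 features to be more elegant, and should keep pace with Midway upgrades.
Chunked transfer is not compatible with some older browsers, so it is best to also implement an asynchronous module-loading solution, with two routes that are switched according to the user's device.
Handle exceptions for every module and its content, and set a time limit on the request. When the limit is reached, close the connection instead of leaving the page hanging, and bring in the modules that were meant to be transferred in chunks asynchronously instead.
Parallel execution currently uses async.each; the alternatives still need to be compared on performance.

Original article: http://taobaofed.org/blog/2016/03/25/seller-bigpipe-coding/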
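
The time limit mentioned in the list above could be sketched with Promise.race. This is an illustration, not part of the current wrapper: withTimeout is a made-up helper, and the fallback value might be, for example, a small placeholder telling the client to load that module asynchronously.

```javascript
// Resolve with `fallback` if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, fallback) {
  var timer;
  var timeout = new Promise(function (resolve) {
    timer = setTimeout(function () { resolve(fallback); }, ms);
  });

  return Promise.race([promise, timeout]).then(
    function (value) {
      clearTimeout(timer); // no-op if the timer has already fired
      return value;
    },
    function (err) {
      clearTimeout(timer);
      throw err;
    }
  );
}
```

Wrapping each pagelet's rendering promise this way bounds how long the stream stays open: a slow module yields its fallback, and the closing tags can still be pushed on time. Note that the slow work itself is not cancelled; its result is simply ignored.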
