Mesos provides a RESTful Scheduler HTTP API. Both software developers and software testers can use it to get a closer look at how Mesos works. This article shows how to call the API with curl.
The test environment has 1 Mesos master, started with the following command options:
MESOS_MASTER_OPTS="--work_dir=/var/lib/mesos --log_dir=/var/log/mesos"
It also has 2 Mesos slaves with identical configurations: 2 CPUs and 996 MB of memory each. The slave command options are:
MESOS_SLAVE_OPTS="--master=$MESOS_MASTER_IP:5050 --log_dir=/var/log/mesos"
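1. Registering is the first step in a framework's communication with the Mesos master. Only after the framework has registered successfully is its connection to the master established. Run the following command to register a framework.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -d@register.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler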
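The -d option specifies the data sent to the API, which here is a JSON string. Typing a long JSON string on the command line is painful, so it is best to put it in a file; in this example the file is named register.json. For the structure of the JSON, see mesos.proto and scheduler.proto.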
The response looks like this:
* upload completely sent off: 257 out of 257 bytes
< HTTP/1.1 200 OK
< Date: Fri, 15 Apr 2016 05:56:03 GMT
< Mesos-Stream-Id: b3a3e239-f0ad-43d3-b635-1c1d9c66566b
< Content-Type: application/json
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Transfer-Encoding: chunked
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
<
70
{"subscribed":{"framework_id":{"value":"test13"}},"type":"SUBSCRIBED"}
20
{"offers":{"offers":[{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},"framework_id":{"value":"test13"},"hostname":"slave1","id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O4"},"resources":[{"name":"cpus","role":"*","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":996.0},"type":"SCALAR"},{"name":"disk","role":"*","scalar":{"value":30509.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"}],"url":{"address":{"hostname":"slave1","ip":"9.111.254.36","port":5051},"path":"//slave(1)","scheme":"http"}},{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S0"},"framework_id":{"value":"test13"},"hostname":"slave0","id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O5"},"resources":[{"name":"cpus","role":"*","scalar":{"value":2.0},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":996.0},"type":"SCALAR"},{"name":"disk","role":"*","scalar":{"value":30509.0},"type":"SCALAR"},{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"}],"url":{"address":{"hostname":"slave0","ip":"9.111.254.62","port":5051},"path":"//slave(1)","scheme":"http"}}]},"type":"OFFERS"}
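The Mesos-Stream-Id header identifies this framework's subscription. Every later call, such as launching a task or declining offers, must include it. Once the framework has registered successfully, it receives offers from the master whenever the master has resources available.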
Example contents of register.json:
{
  "framework_id": {"value": "test13"},
  "type": "SUBSCRIBE",
  "subscribe": {
    "framework_info": {
      "user": "root",
      "name": "test13",
      "failover_timeout": 60,
      "role": "aa",
      "id": {"value": "test13"},
      "principal": "test",
      "capabilities": {"type": "REVOCABLE_RESOURCES"}
    },
    "force": true
  }
}
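The event stream returned by the subscribe call is framed in RecordIO format: each record is preceded by its length in bytes and a newline, which is why the output shows bare numbers such as 70 and 20 before the JSON. The following is a minimal Python sketch of a reader for that framing, not the way any Mesos client library does it. The function name `read_events` and the constant `RAW` are hypothetical, and the sample bytes are taken from the SUBSCRIBED and HEARTBEAT records shown in this article.

```python
import io
import json

# Two records as they appear on the wire: "<length>\n<JSON>" back to back.
RAW = (
    b'70\n{"subscribed":{"framework_id":{"value":"test13"}},"type":"SUBSCRIBED"}'
    b'20\n{"type":"HEARTBEAT"}'
)


def read_events(stream):
    """Yield decoded events from a RecordIO-framed binary stream."""
    while True:
        # The length is a decimal number terminated by a newline.
        length_line = stream.readline()
        if not length_line:
            return  # the stream was closed cleanly between records
        length = int(length_line.strip())
        payload = stream.read(length)
        if len(payload) < length:
            raise EOFError("stream ended in the middle of a record")
        yield json.loads(payload.decode("utf-8"))


if __name__ == "__main__":
    for event in read_events(io.BytesIO(RAW)):
        print(event["type"])
    # prints SUBSCRIBED, then HEARTBEAT
```

The reader only assumes a binary file-like object, so it can be tried against saved curl output before pointing it at a live connection.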
2. Use the registered framework to launch a task.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -H "Mesos-Stream-Id:b3a3e239-f0ad-43d3-b635-1c1d9c66566b" -d@launch_task.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler
The response looks like this:
* Done waiting for 100-continue
< HTTP/1.1 202 Accepted
< Date: Wed, 13 Apr 2016 05:37:44 GMT
< Content-Length: 0
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
Example contents of launch_task.json:
{
  "framework_id": {"value": "test13"},
  "type": "ACCEPT",
  "accept": {
    "offer_ids": [
      {"value": "81c111bd-f27d-41a4-b184-64090dec3048-O428"}
    ],
    "filters": {},
    "operations": [
      {
        "type": "LAUNCH",
        "launch": {
          "task_infos": [
            {
              "name": "task 00012",
              "task_id": {"value": "00012"},
              "agent_id": {"value": "5bb4aba0-628b-4309-b36f-767db1ebb7f4-S0"},
              "resources": [
                {"name": "mem", "type": "SCALAR", "scalar": {"value": 996.0}},
                {"name": "cpus", "type": "SCALAR", "scalar": {"value": 1.0}}
              ],
              "command": {"value": "/bin/sleep 3000s"}
            }
          ]
        }
      }
    ]
  }
}
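The offer ID and agent ID in an ACCEPT call have to come from an offer the framework currently holds, so copying them by hand is error-prone. As a rough illustration, the Python sketch below fills them in from a received offer and produces a payload of the same shape as launch_task.json. The helper names `scalar` and `build_accept` are hypothetical, and the offer is a trimmed copy of one shown earlier in this article.

```python
import json


def scalar(name, value):
    """Return a SCALAR resource entry such as cpus or mem."""
    return {"name": name, "type": "SCALAR", "scalar": {"value": value}}


def build_accept(framework_id, offer, task_id, command, cpus, mem):
    """Build an ACCEPT call that launches one command task on the offer's agent."""
    task = {
        "name": "task " + task_id,
        "task_id": {"value": task_id},
        # The agent ID is copied from the offer rather than typed by hand.
        "agent_id": offer["agent_id"],
        "resources": [scalar("mem", mem), scalar("cpus", cpus)],
        "command": {"value": command},
    }
    return {
        "framework_id": {"value": framework_id},
        "type": "ACCEPT",
        "accept": {
            "offer_ids": [offer["id"]],
            "filters": {},
            "operations": [{"type": "LAUNCH", "launch": {"task_infos": [task]}}],
        },
    }


# Trimmed copy of an offer from the OFFERS event above.
OFFER = {
    "agent_id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},
    "id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O4"},
}

if __name__ == "__main__":
    call = build_accept("test13", OFFER, "00016", "/bin/sleep 3000s", 1.0, 996.0)
    print(json.dumps(call, indent=2))
```

The printed JSON can be saved to a file and passed to curl with -d@ exactly as in the command above.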
Result
In the framework's output, that is, the output of the curl command that registered it, an UPDATE event arrives. The Mesos master sends it to report the task's new status:
{"type":"UPDATE","update":{"status":{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},"container_status":{"network_infos":[{"ip_address":"9.111.254.36","ip_addresses":[{"ip_address":"9.111.254.36"}]}]},"executor_id":{"value":"00016"},"source":"SOURCE_EXECUTOR","state":"TASK_RUNNING","task_id":{"value":"00016"},"timestamp":1460526113.05411,"uuid":"H9KoCldtRTu0T7halLxs9w=="}}}
20
{"type":"HEARTBEAT"}
391
Note: the master takes whatever remains of the accepted offer and offers it to other frameworks as a new offer, so the leftover becomes a new resource.
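3. After the framework receives a task status update, it must send an acknowledge call to the master to confirm receipt. Otherwise the master keeps sending the same event to the framework until it receives the acknowledgement.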
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -H "Mesos-Stream-Id:b3a3e239-f0ad-43d3-b635-1c1d9c66566b" -d@acknowledge.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler
The response looks like this:
* upload completely sent off: 264 out of 264 bytes
< HTTP/1.1 202 Accepted
< Date: Wed, 13 Apr 2016 06:01:31 GMT
< Content-Length: 0
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
Example contents of acknowledge.json:
{
  "framework_id": {"value": "test13"},
  "type": "ACKNOWLEDGE",
  "acknowledge": {
    "agent_id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},
    "task_id": {"value": "00016"},
    "uuid": "H9KoCldtRTu0T7halLxs9w=="
  }
}
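The three fields in acknowledge.json are all copied from the UPDATE event, so the call can be derived mechanically. The Python sketch below is one possible way to do that; the function name `build_acknowledge` is hypothetical. It returns nothing for updates that carry no uuid, on the understanding that only updates with a uuid need to be acknowledged.

```python
import json


def build_acknowledge(framework_id, event):
    """Return the ACKNOWLEDGE call for an UPDATE event, or None if none is needed."""
    if event.get("type") != "UPDATE":
        return None
    status = event["update"]["status"]
    if "uuid" not in status:
        return None  # nothing to acknowledge for this update
    return {
        "framework_id": {"value": framework_id},
        "type": "ACKNOWLEDGE",
        "acknowledge": {
            "agent_id": status["agent_id"],
            "task_id": status["task_id"],
            "uuid": status["uuid"],
        },
    }


# Trimmed copy of the UPDATE event shown in this article.
UPDATE = {
    "type": "UPDATE",
    "update": {
        "status": {
            "agent_id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},
            "state": "TASK_RUNNING",
            "task_id": {"value": "00016"},
            "uuid": "H9KoCldtRTu0T7halLxs9w==",
        }
    },
}

if __name__ == "__main__":
    print(json.dumps(build_acknowledge("test13", UPDATE), indent=2))
```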
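Result
Events with the same UUID no longer appear in the framework's output.
4. When a task finishes, the resources it occupied are released and become a new resource, which the master offers to the appropriate framework according to the DRF algorithm.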
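To give a feel for what DRF (Dominant Resource Fairness) compares, the Python sketch below computes each framework's dominant share, meaning its largest fraction of any single resource in the cluster, and picks the framework with the smallest one as the next to be offered resources. This covers only the core idea; the actual Mesos allocator also takes roles, weights and other settings into account. The function names and the second framework, "other", are hypothetical, and the cluster totals assume the two slaves described above (2 CPUs and 996 MB each).

```python
def dominant_share(allocated, total):
    """Largest fraction of any one resource that a framework currently holds."""
    return max(allocated[name] / total[name] for name in allocated)


def next_framework(allocations, total):
    """Pick the framework with the lowest dominant share."""
    return min(allocations, key=lambda fw: dominant_share(allocations[fw], total))


# Two slaves with 2 CPUs and 996 MB each.
TOTAL = {"cpus": 4.0, "mem": 1992.0}

ALLOCATIONS = {
    "test13": {"cpus": 1.0, "mem": 996.0},  # the task launched in step 2
    "other": {"cpus": 1.0, "mem": 256.0},   # hypothetical second framework
}

if __name__ == "__main__":
    for name, used in ALLOCATIONS.items():
        print(name, dominant_share(used, TOTAL))
    print("next:", next_framework(ALLOCATIONS, TOTAL))
```

Here test13 holds a quarter of the CPUs but half of the memory, so its dominant share is 0.5, while the hypothetical framework's is 0.25 and it would be served first.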
5. Decline an offer. If the offer currently allocated to the framework is not suitable, the framework can decline it.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -H "Mesos-Stream-Id:b3a3e239-f0ad-43d3-b635-1c1d9c66566b" -d@decline.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler
The response looks like this:
* upload completely sent off: 214 out of 214 bytes
< HTTP/1.1 202 Accepted
< Date: Wed, 13 Apr 2016 06:17:56 GMT
< Content-Length: 0
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
<
Example contents of decline.json:
{
  "framework_id": {"value": "test13"},
  "type": "DECLINE",
  "decline": {
    "offer_ids": [
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O15"}
    ]
  }
}
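Every call after the subscription follows the same pattern: a POST to the scheduler endpoint with a JSON body and the Mesos-Stream-Id header. For readers who would rather script this than use curl, here is a minimal sketch with Python's standard urllib; it only builds the request and leaves the sending commented out. The function name `build_call_request` and the master address are placeholders.

```python
import json
import urllib.request


def build_call_request(master, stream_id, call):
    """Build the POST request that the curl commands in this article send."""
    url = "http://%s/master/api/v1/scheduler" % master
    return urllib.request.Request(
        url,
        data=json.dumps(call).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Mesos-Stream-Id": stream_id,
        },
        method="POST",
    )


DECLINE = {
    "framework_id": {"value": "test13"},
    "type": "DECLINE",
    "decline": {
        "offer_ids": [{"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O15"}]
    },
}

# Placeholder master address; substitute the value of $MESOS_MASTER_IP.
REQUEST = build_call_request(
    "127.0.0.1:5050", "b3a3e239-f0ad-43d3-b635-1c1d9c66566b", DECLINE
)

if __name__ == "__main__":
    print(REQUEST.get_method(), REQUEST.full_url)
    # To actually send it (a 202 status is expected on success):
    # with urllib.request.urlopen(REQUEST) as response:
    #     print(response.status)
```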
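Result
The declined offer is no longer available.
6. When many tasks run in a Mesos cluster, the cluster keeps producing resource fragments. What can be done when none of these fragments is large enough to launch a task on its own? There are two ways to merge the fragments that belong to the same slave.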
6.1 ACCEPT event
When launching a task, list the offer IDs of all the fragments on the slave, and the fragments are merged automatically. Note that the fragments must all belong to the same slave.
Example contents of launch_task.json:
{
  "framework_id": {"value": "test13"},
  "type": "ACCEPT",
  "accept": {
    "offer_ids": [
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O14"},
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O16"}
    ],
    "filters": {},
    "operations": [
      {
        "type": "LAUNCH",
        "launch": {
          "task_infos": [
            {
              "name": "task 00017",
              "task_id": {"value": "00017"},
              "agent_id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},
              "resources": [
                {"name": "mem", "type": "SCALAR", "scalar": {"value": 996.0}},
                {"name": "cpus", "type": "SCALAR", "scalar": {"value": 2.0}}
              ],
              "command": {"value": "/bin/sleep 180s"}
            }
          ]
        }
      }
    ]
  }
}
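Because the fragments must all come from the same slave, it can help to check that before building the offer_ids list. The Python sketch below does only that check; the function name `merged_offer_ids` is hypothetical and the offers are trimmed to the two fields it needs.

```python
def merged_offer_ids(offers):
    """Return the offer_ids list for one ACCEPT, refusing offers from mixed agents."""
    agents = {offer["agent_id"]["value"] for offer in offers}
    if len(agents) != 1:
        raise ValueError("offers must all belong to the same agent: %s" % sorted(agents))
    return [offer["id"] for offer in offers]


S1 = "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"
S0 = "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S0"

# The two fragments used in launch_task.json above, both on slave S1.
FRAGMENTS = [
    {"agent_id": {"value": S1}, "id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O14"}},
    {"agent_id": {"value": S1}, "id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O16"}},
]

if __name__ == "__main__":
    print(merged_offer_ids(FRAGMENTS))
```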
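When the task finishes, the Mesos master offers these resources as a single block to the appropriate framework according to the DRF algorithm.
6.2 Merging fragments with the DECLINE event
Declining each of the fragments also consolidates them. For example, suppose there are three offers that all belong to the same slave. After these offers are declined, the Mesos master merges the fragments automatically and then offers the merged resources to the appropriate framework according to the DRF algorithm.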
{"offers":{"offers":[{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},"framework_id":{"value":"test13"},"hostname":"slave1","id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O24"},"resources":[{"name":"cpus","role":"*","scalar":{"value":1.0},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":512.0},"type":"SCALAR"}],"url":{"address":{"hostname":"slave1","ip":"9.111.254.36","port":5051},"path":"//slave(1)","scheme":"http"}}]},"type":"OFFERS"}
{"offers":{"offers":[{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},"framework_id":{"value":"test13"},"hostname":"slave1","id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O26"},"resources":[{"name":"cpus","role":"*","scalar":{"value":1.0},"type":"SCALAR"},{"name":"mem","role":"*","scalar":{"value":256.0},"type":"SCALAR"}],"url":{"address":{"hostname":"slave1","ip":"9.111.254.36","port":5051},"path":"//slave(1)","scheme":"http"}}]},"type":"OFFERS"}
{"offers":{"offers":[{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"},"framework_id":{"value":"test13"},"hostname":"slave1","id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O27"},"resources":[{"name":"ports","ranges":{"range":[{"begin":31000,"end":32000}]},"role":"*","type":"RANGES"},{"name":"mem","role":"*","scalar":{"value":228.0},"type":"SCALAR"},{"name":"disk","role":"*","scalar":{"value":30509.0},"type":"SCALAR"}],"url":{"address":{"hostname":"slave1","ip":"9.111.254.36","port":5051},"path":"//slave(1)","scheme":"http"}}]},"type":"OFFERS"}
The decline call looks like this:
{
  "framework_id": {"value": "test13"},
  "type": "DECLINE",
  "decline": {
    "offer_ids": [
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O24"},
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O26"},
      {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-O27"}
    ]
  }
}
After the decline succeeds, a single consolidated offer is sent to the appropriate framework.
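To see what the consolidated offer should add up to, the scalar resources of the three fragments can simply be summed. The Python sketch below does this for trimmed copies of offers O24, O26 and O27 from this section; the function name `total_scalars` is hypothetical, and range resources such as ports are skipped for brevity.

```python
def total_scalars(offers):
    """Sum the SCALAR resources (cpus, mem, disk) across a list of offers."""
    totals = {}
    for offer in offers:
        for resource in offer["resources"]:
            if resource["type"] != "SCALAR":
                continue  # ranges such as ports are not summed here
            name = resource["name"]
            totals[name] = totals.get(name, 0.0) + resource["scalar"]["value"]
    return totals


def res(name, value):
    """Shorthand for a SCALAR resource entry."""
    return {"name": name, "type": "SCALAR", "scalar": {"value": value}}


# Resources of offers O24, O26 and O27, all on slave S1.
FRAGMENTS = [
    {"resources": [res("cpus", 1.0), res("mem", 512.0)]},
    {"resources": [res("cpus", 1.0), res("mem", 256.0)]},
    {"resources": [
        {"name": "ports", "type": "RANGES",
         "ranges": {"range": [{"begin": 31000, "end": 32000}]}},
        res("mem", 228.0),
        res("disk", 30509.0),
    ]},
]

if __name__ == "__main__":
    print(total_scalars(FRAGMENTS))
    # prints {'cpus': 2.0, 'mem': 996.0, 'disk': 30509.0}
```

The totals, 2 CPUs and 996 MB of memory, match the full capacity of one slave in the test environment.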
7. Kill a task.
The KILL call is used to kill a specific task; each call kills one task.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -H "Mesos-Stream-Id:b3a3e239-f0ad-43d3-b635-1c1d9c66566b" -d@kill_task.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler
The response looks like this:
* upload completely sent off: 211 out of 211 bytes
< HTTP/1.1 202 Accepted
< Date: Fri, 15 Apr 2016 13:08:25 GMT
< Content-Length: 0
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
Example contents of kill_task.json:
{
  "framework_id": {"value": "test13"},
  "type": "KILL",
  "kill": {
    "task_id": {"value": "00020"},
    "agent_id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S1"}
  }
}
Result
The framework's output contains an UPDATE event like this:
{"type":"UPDATE","update":{"status":{"agent_id":{"value":"b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S0"},"container_status":{"network_infos":[{"ip_address":"9.111.254.62","ip_addresses":[{"ip_address":"9.111.254.62"}]}]},"executor_id":{"value":"00012"},"message":"Command terminated with signal Terminated","source":"SOURCE_EXECUTOR","state":"TASK_KILLED","task_id":{"value":"00012"},"timestamp":1460725705.73596,"uuid":"mKSkxCycRLKS8y4YA2HM4A=="}}}
The task's resources are released and reused. If there is only one framework, a new offer appears in that framework's output.
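A script that sends a KILL call usually wants to confirm the outcome by watching for the matching UPDATE event. The Python sketch below builds the call and checks whether an event reports a given task as killed. The helper names are hypothetical, and the set of terminal states is an assumption based on the task states defined in mesos.proto around the time of this article; check it against the Mesos version in use.

```python
# Assumed terminal task states; verify against mesos.proto for your version.
TERMINAL_STATES = {
    "TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST", "TASK_ERROR",
}


def build_kill(framework_id, task_id, agent_id):
    """Build a KILL call of the same shape as kill_task.json."""
    return {
        "framework_id": {"value": framework_id},
        "type": "KILL",
        "kill": {
            "task_id": {"value": task_id},
            "agent_id": {"value": agent_id},
        },
    }


def task_state(event, task_id):
    """Return the state reported for task_id by an UPDATE event, else None."""
    if event.get("type") != "UPDATE":
        return None
    status = event["update"]["status"]
    if status["task_id"]["value"] != task_id:
        return None
    return status["state"]


# Trimmed copy of the UPDATE event shown above.
KILLED = {
    "type": "UPDATE",
    "update": {
        "status": {
            "agent_id": {"value": "b5e0c23f-237e-4398-816f-ad1a4aa7d3f3-S0"},
            "state": "TASK_KILLED",
            "task_id": {"value": "00012"},
            "uuid": "mKSkxCycRLKS8y4YA2HM4A==",
        }
    },
}

if __name__ == "__main__":
    state = task_state(KILLED, "00012")
    print(state, state in TERMINAL_STATES)
    # prints: TASK_KILLED True
```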
8. Tear down the framework.
# curl -vv --no-buffer -X POST -H "Content-Type: application/json" -H "Mesos-Stream-Id: b3a3e239-f0ad-43d3-b635-1c1d9c66566b" -d@teardown.json http://$MESOS_MASTER_IP:5050/master/api/v1/scheduler
The response looks like this:
* upload completely sent off: 77 out of 77 bytes
< HTTP/1.1 202 Accepted
< Date: Fri, 15 Apr 2016 13:25:49 GMT
< Content-Length: 0
< X-Cache: MISS from db03b04
< X-Cache-Lookup: MISS from db03b04:3128
< Via: 1.1 db03b04 (squid/3.3.8)
< Connection: keep-alive
Contents of teardown.json:
{
  "framework_id": {"value": "test13"},
  "type": "TEARDOWN"
}
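Result
The process that registered the framework exits automatically, and the framework becomes inactive.
For more information about the Mesos RESTful API, please refer to
http://mesos.apache.org/documentation/latest/scheduler-http-api/
About the author: Gao Zhifang (高智芳) is a software test engineer at IBM who works mainly in the cloud computing field. Gao enjoys trying out new testing techniques and methods and takes pleasure in finding bugs during testing.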