Parameters:
{
"key": "CN208414307U",
"url": "http://epub.cnipa.gov.cn/pic.jpg",
"bucket": "patent",
"real": 0,
"proxy": 1
}
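- key (str): unique identifier (required)
- url (str): image URL (required)
- bucket (str): image source (required)
- proxy (int): whether to use a proxy; 0 disables it, 1 enables it; defaults to 1
- real (int): whether to return the result in real time; accepts 0, 1, 2, or 3; defaults to 0
  - 0: the result does not need to be returned in real time
  - 1: return the storage URL in real time
  - 2: return the image content and the storage URL in real time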
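A minimal sketch of assembling the request body from the parameters above, applying the documented defaults (proxy = 1, real = 0). The function name `build_task_params` and the validation rules are assumptions made for illustration; the service's own validation may differ.

```python
import json

VALID_PROXY = {0, 1}        # 0 = proxy off, 1 = proxy on
VALID_REAL = {0, 1, 2, 3}   # values the parameter list says are accepted


def build_task_params(key, url, bucket, proxy=1, real=0):
    """Assemble the request body (hypothetical helper, not part of the service)."""
    # key, url and bucket are the three required string parameters.
    for name, value in (("key", key), ("url", url), ("bucket", bucket)):
        if not isinstance(value, str) or not value:
            raise ValueError(f"{name} is required and must be a non-empty string")
    if proxy not in VALID_PROXY:
        raise ValueError("proxy must be 0 or 1")
    if real not in VALID_REAL:
        raise ValueError("real must be 0, 1, 2 or 3")
    return {"key": key, "url": url, "bucket": bucket, "real": real, "proxy": proxy}


payload = build_task_params(
    "CN208414307U", "http://epub.cnipa.gov.cn/pic.jpg", "patent"
)
print(json.dumps(payload))
```

Serializing with `json.dumps` also avoids hand-written JSON mistakes such as a trailing comma after the last field.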
Return value:
{
"spider_name": "picture_download",
"platform_name": "picture",
"http_code": 200,
"error_msg": "successful",
"task_result": 1000,
"data_type": "",
"bucket": "patent",
"spider_start_time": "2021-10-13 15:48:56",
"spider_end_time": "2021-10-13 15:49:08",
"spider_used_time_ms": 12,
"spider_ip": "10.8.6.30",
"task_params": {
"key": "CN204671179U",
"url": "http://qxb-img.oss-cn-hangzhou.aliyuncs.com/dlpatents/009c730db9230747893f2356324376ba.jpg",
"bucket": "patent"
},
"metadata": {},
"data": {
"key": "CN204671179U",
"bucket": "patent",
"store_path": "patent/ff/b1/d0/ffb1d035d18b1d8a37ad2ac54218adb9.jpg",
"content": "",
"basket_host": "10.8.8.59:31010"
}
}
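A caller might consume a response like the one above roughly as follows. This is only a sketch: `extract_store_path` is a hypothetical helper, and it assumes that `task_result` equal to 1000 is the only success value and that `data.store_path` is always present on success.

```python
SUCCESS = 1000  # task_result paired with error_msg "successful"


def extract_store_path(response):
    """Return data.store_path, or raise if the task did not succeed."""
    if response.get("task_result") != SUCCESS:
        raise RuntimeError(
            f"download failed: {response.get('error_msg')} "
            f"(task_result={response.get('task_result')})"
        )
    return response["data"]["store_path"]


# Trimmed copy of the sample response; only the fields used here are kept.
sample = {
    "http_code": 200,
    "error_msg": "successful",
    "task_result": 1000,
    "data": {
        "key": "CN204671179U",
        "bucket": "patent",
        "store_path": "patent/ff/b1/d0/ffb1d035d18b1d8a37ad2ac54218adb9.jpg",
        "content": "",
        "basket_host": "10.8.8.59:31010",
    },
}
print(extract_store_path(sample))
```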
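Mapping between message and task_result:

| message | task_result | Description |
| --- | --- | --- |
| successful | 1000 | Success |
| status_code error: {status} | 9110 | Abnormal HTTP status code |
| request error | 9100 | Exception raised by the requests call |
| url unidentified | 7000 | Image URL could not be parsed |
| params error | 7000 | Invalid parameters |
| decode error | 9300 | Image content could not be decoded |
| basket error | 6000 | Error calling basket |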
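The mapping above can be expressed as a small lookup. This is a sketch under two assumptions: that `message` is the value carried in the `error_msg` field of the response, and that `{status}` is replaced with the HTTP status code at run time, so that message is matched by prefix. The name `task_result_for` is hypothetical.

```python
# Fixed messages and their task_result codes, taken from the mapping above.
TASK_RESULTS = {
    "successful": 1000,
    "request error": 9100,
    "url unidentified": 7000,
    "params error": 7000,
    "decode error": 9300,
    "basket error": 6000,
}

STATUS_ERROR_PREFIX = "status_code error:"  # followed by the HTTP status
STATUS_ERROR_CODE = 9110


def task_result_for(message):
    """Look up the task_result code for a message (hypothetical helper)."""
    if message in TASK_RESULTS:
        return TASK_RESULTS[message]
    if message.startswith(STATUS_ERROR_PREFIX):
        return STATUS_ERROR_CODE
    raise KeyError(f"unknown message: {message!r}")


print(task_result_for("successful"))
print(task_result_for("status_code error: 404"))
```

Because `url unidentified` and `params error` share the code 7000, the code alone does not identify the message; the lookup only goes from message to code.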
The current plan is to store returned results as JSON files (this can be combined with the file download service).
File path format: /{bucket}/{date}/{uuid}.json
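A sketch of building a path in the /{bucket}/{date}/{uuid}.json format. The text does not specify how `{date}` or `{uuid}` are rendered, so the `YYYYMMDD` date form and the 32-character hex UUID used here are assumptions, as is the helper name `result_path`.

```python
import uuid
from datetime import date


def result_path(bucket, day=None, file_id=None):
    """Build /{bucket}/{date}/{uuid}.json (hypothetical helper)."""
    day = day or date.today()                # assumed: the day the result is stored
    file_id = file_id or uuid.uuid4().hex    # assumed: random UUID, hex form
    return f"/{bucket}/{day:%Y%m%d}/{file_id}.json"


print(result_path("patent", date(2021, 10, 13), "example-id"))
```

Passing `day` and `file_id` explicitly keeps the function deterministic for testing; omitting them falls back to today's date and a fresh UUID.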
Flowchart:
graph LR
A[spider]
F[udms] --> A-->E[basket]
B[web] --> A
C[redis] --> F
D[post]--> B
E -->|real = 0 |K[.json] --> H[kibana]
E -->|real = 1,2 |G[response] --> K