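Basic information
Environmental protection grade
Data name (Chinese)
环保等级 (environmental protection grade)
Data name (English)
environmental_protection_grade
Source websites (collection entry points)
Official website desktop (PC) entry points: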
Jiangsu: https://hblp.jsshbt.cn/shencai-envfacial-web/service/envFacial/hblp/tEnvBasKeylistModel/queryKeyListBusinessNumber
Zhejiang: http://223.4.71.96/portal/data/api/auto
Fujian: http://220.160.52.213:20071/api/template/page/p_list_eval_credit
Sichuan: http://103.203.219.138:18081/data/w/evaluateResults/list4Public
Hunan: http://218.76.24.162:5014/hnxypjqyd/xxgk/queryXxgsQy
Henan: http://222.143.24.250:8127/credit_publicService/system/company/systemcompanyinfo/getPublicResultListPage.do
Hubei: http://113.57.151.5:8030/HBHB/companyInfo.action
Guangdong: https://www-app.gdeei.cn/gdeepub/data/industry
Guizhou: http://202.98.194.198:6661/wwgs/xypj/public/credit/ratingpublicity/primaryPublicity.jsp
Guangxi: http://202.103.233.156:9081/xypjgx/pages/xypj/wzgs/qypjjgList.jsp
Hebei: http://110.249.223.66:8099/xypjww/xypj/listEntEvaluate
Liaoning: http://221.180.204.224:8080/LiaoNingQiYeXinYongPingJia/display/getEnterpriseInfo.do
Shandong: http://103.239.155.242:7002/xypjgzd/business/xypj/xypjcontroller/
Jilin: http://125.32.96.149:8081/was5/web/search
Anhui: http://112.27.211.29:8082/wznrfb/getQueryList
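Collected file storage path:
/data/gravel_spiders/environmental_protection
Collection frequency and strategy
Existing-data update strategy
Currently, one full update pass is run over all data.
Incremental collection strategy
Spider
Environmental protection grade spider: environmental_protection
Owner
蒋家升
Spider name
environmental_protection
Code location
Project URL: http://tech.pingansec.com/granite/project-gravel/-/tree/develop_environmental_protection_grade/scrapy_spiders
Queue name and queue address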
- redis host: redis://:utn@0818@bdp-mq-001.redis.rds.aliyuncs.com:6379/7
- redis port: 6379
- redis db: 7
- redis key:
- environmental_protection
Priority queue notes
- environmental_protection supports queue priorities
Task source
Task import configuration file: http://tech.pingansec.com/granite/project-gravel/-/blob/develop_environmental_protection_grade/app_environmental_protection_grade/data_pump/normal_list_task.yml
Task input parameters (samples)
Sample tasks
{"province": "jiangsu", "step": "start"},
{"province": "zhejiang", "step": "start"},
{"province": "fujian", "step": "start"},
{"province": "sichuan", "step": "start"},
{"province": "hunan", "step": "start"},
{"province": "henan", "step": "start", "index": 0, "city": "郑州市"},
{"province": "hubei", "step": "start"},
{"province": "guangdong", "step": "start"},
{"province": "guizhou", "step": "start", "index": 0, "year": 2019},
{"province": "guangxi", "step": "start"},
{"province": "hebei"},
{"province": "liaoning"}
{"province": "shandong", "company": "山东金穗木业股份有限公司"}
{"province": "jilin", "step": "start"}
{"province": "anhui", "step": "start", "index": 0, "year": 2019}
Task parameter notes
{"province": "henan", "step": "start", "index": 0, "city": "郑州市"}
{"province": "guizhou", "step": "start", "index": 0, "year": 2019}
{"province": "shandong", "company": "山东金穗木业股份有限公司"}
{"province": "jilin", "step": "start", "index": 0, "level": "blue"}
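- Main parameters
- province: province name in pinyin
- index: page number for pagination (not needed by some provinces)
- Optional parameters
- step: processing step
- Province-specific parameters
- city: prefecture-level city; Henan only
- year: year; Guizhou and Anhui only
- company: company name or unified social credit code; Shandong only
- level: grade; Jilin only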
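The parameter notes above can be turned into a small pre-submission check. The sketch below is illustrative only: the rule table and the helper name `unexpected_params` are assumptions derived from the list above, not code from the spider project.

```python
# Province-specific parameters, copied from the parameter notes above.
# This table is an illustrative assumption, not the spider's own code.
PROVINCE_PARAMS = {
    "henan": {"city"},
    "guizhou": {"year"},
    "anhui": {"year"},
    "shandong": {"company"},
    "jilin": {"level"},
}
# Parameters the notes list for every province.
COMMON_PARAMS = {"province", "step", "index"}


def unexpected_params(task: dict) -> set:
    """Return the keys that the notes do not list for the task's province."""
    province = task.get("province")
    if not province:
        raise ValueError("task is missing the required 'province' field")
    allowed = COMMON_PARAMS | PROVINCE_PARAMS.get(province, set())
    return set(task) - allowed
```

An empty result means the task only uses documented parameters. A non-empty result is a hint worth reviewing rather than a hard error, since the notes may not be exhaustive.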
data_type notes
list: list-page data
detail: detail-page data; currently only Shandong is of type detail
Spider result metadata
Actual spider result data structure
Jiangsu:
{
"data":
[
{
"creditDataID": "202110242303108479a164c13b45caa51598e28aac5b69",
"creditLevel": 4,
"creditName": "一般守信",
"efResultID": "13574254755210144971",
"fullName": "宿迁市-宿豫区-宿豫经济开发区",
"manageLevel": 2,
"manageName": "二星",
"spCode": "142332828000",
"spName": "宿迁市罐头食品有限责任公司",
"time": "2021-10-25"
},
{
"creditDataID": "202110242305284f9a3b85fc214f13b2cb5944fbec7e36",
"creditLevel": 4,
"creditName": "一般守信",
"efResultID": "13574254755210175105",
"fullName": "常州市-武进区",
"manageLevel": 5,
"manageName": "五星",
"spCode": "3101120200000099",
"spName": "江苏绿浥农业科技股份有限公司",
"time": "2021-10-25"
},
{
"creditDataID": "202110242308191eaf0c8cb8174469bc192039d249540c",
"creditLevel": 4,
"creditName": "一般守信",
"efResultID": "13574254755210175285",
"fullName": "苏州市-姑苏区",
"manageLevel": 5,
"manageName": "五星",
"spCode": "3200000200000537",
"spName": "中建三局第一建设工程有限责任公司",
"time": "2021-10-25"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-10-25 17:17:52.098",
"spider_end_time": "2021-10-25 17:17:54",
"task_params":{
"province": "jiangsu",
"step": "start",
"index": 1
},
"metadata":{
"province": "jiangsu",
"index": 1
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.18"
}
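Every sample in this document wraps its records in the same envelope (`data`, `http_code`, `task_result`, `task_params`, `metadata`). A downstream consumer could unwrap it roughly as follows. This is a sketch: the helper name `iter_records` is hypothetical, and treating `http_code` 200 plus `task_result` 1000 as the success marker is an assumption based only on the samples shown here.

```python
import json


def iter_records(raw_line: str):
    """Yield each record from one result envelope, skipping failed tasks."""
    envelope = json.loads(raw_line)
    # Assumption: http_code 200 together with task_result 1000 marks success,
    # because every sample in this document carries exactly that pair.
    if envelope.get("http_code") != 200 or envelope.get("task_result") != 1000:
        return
    for record in envelope.get("data", []):
        yield record
```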
Zhejiang:
{
"data":
[
{
"city": "舟山市", # city
"level_title": "优秀",
"district": "普陀区", # county
"level_code": "A", # credit grade
"social_credit_code": "91330903336897819J", # unified social credit code
"score_code": "8ca03c8e-634c-49cb-b35c-bac95ce46910",
"ent_name": "舟山丰瑞海洋生物制品有限公司",
"region_code": "330903",
"release_time": 1635264000000 # update time
},
{
"city": "舟山市",
"level_title": "优秀",
"district": "普陀区",
"level_code": "A",
"social_credit_code": "913309031487170831",
"score_code": "8ca03c8e-634c-49cb-b35c-bac95ce46910",
"ent_name": "中石化浙江舟山石油有限公司",
"region_code": "330903",
"release_time": 1635264000000
},
{
"city": "舟山市",
"level_title": "优秀",
"district": "普陀区",
"level_code": "A",
"social_credit_code": "9133090307868393X1",
"score_code": "8ca03c8e-634c-49cb-b35c-bac95ce46910",
"ent_name": "浙江荣生海洋生物制品有限公司",
"region_code": "330903",
"release_time": 1635264000000
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-10-27 11:28:51.445",
"spider_end_time": "2021-10-27 11:28:58",
"task_params":
{
"province": "zhejiang",
"step": "start",
"index": 2
},
"metadata":
{
"province": "zhejiang",
"index": 2
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.18"
}
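The Zhejiang `release_time` value is a 13-digit number, which looks like a Unix timestamp in milliseconds. A minimal conversion sketch follows, assuming the value should be read in China Standard Time (UTC+8); the helper name `release_date` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: release_time is epoch milliseconds, interpreted in UTC+8.
CST = timezone(timedelta(hours=8))


def release_date(release_time_ms: int) -> str:
    """Convert an epoch-millisecond value to a YYYY-MM-DD date string."""
    moment = datetime.fromtimestamp(release_time_ms / 1000, tz=CST)
    return moment.strftime("%Y-%m-%d")
```

Under that assumption the sample value 1635264000000 falls on 2021-10-27, the same day as the `spider_start_time` in the sample.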
Fujian:
{
"data":
[
{
"id": 11795,
"social_credit_code": "91350504MA2Y13F352",
"credit_year_batch": "2021年第二批",
"ent_name": "泉州佰份佰卫生用品有限公司", # enterprise name
"county": "洛江区", # district/county
"city": "泉州市", # prefecture-level city
"deptName": "泉州市洛江生态环境局", # evaluating authority
"createTime": "2021-06-03", # evaluation date
"credit": "79",
"credit_type": "环保良好企业" # credit grade
},
{
"id": 11794,
"social_credit_code": "91350504MA346T4N1T",
"credit_year_batch": "2021年第二批",
"ent_name": "泉州洛江凤栖石材厂",
"county": "洛江区",
"city": "泉州市",
"deptName": "泉州市洛江生态环境局",
"createTime": "2021-06-03",
"credit": "79",
"credit_type": "环保良好企业"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list_of_normal",
"spider_start_time": "2021-10-27 14:50:13.824",
"spider_end_time": "2021-10-27 14:50:14",
"task_params":
{
"province": "fujian",
"step": "start",
"index": 2020
},
"metadata":
{
"province": "fujian",
"index": 2020
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.6.51"
}
Sichuan: # see the field parsing notes
{
"data":
[
{
"id": 994222,
"enterprise":
{
"id": 354497052,
"name": "阆中市枣碧大梁山页岩机砖厂", # see the field parsing notes
"creditCode": "92511381MA695TYH83",
"orgCode": "MA695TYH8",
"enterpriseType": "PRIVATE_OWNED",
"industry": "OTHER",
"polluteType": "EXHAUST_GAS",
"controlType": "CITY_PREFECTURE",
"enterpriseAttr": "PRODUCTION_ENTERPRISE",
"legalName": "庄兆雄",
"legalTel": "15881493231",
"productScale": "7.5万吨/年",
"businessScope": "页岩砖生产、销售",
"headOfEPA": "庄剑平",
"regTime": "2008-05-10T00:00:00+08:00",
"enterpriseState": "PRODUCTING",
"regionCity":
{
"code": "511300000",
"name": "南充市",
"parent":
{
"code": "510000000",
"name": "四川省"
}
},
"regionDistrict":
{
"code": "511381000",
"name": "阆中市",
"parent":
{
"code": "511300000",
"name": "南充市"
}
},
"longitude": 105.2579,
"latitude": 31.0116,
"isSSGS": false,
"enable": true,
"lastModifiedDate": "2021-03-31T09:34:40.045+08:00",
"address": "阆中市枣碧乡大梁山村"
},
"evaluateScore": 84,
"selfScore": 98,
"countyScore": 88,
"cityScore": 84,
"evaluateResult": "HBLHQY",
"evaluateYear": 2020,
"last": true,
"publishOrg": "南充市生态环境局",
"evaluateState": "GSZ",
"veto": false,
"dataFrom": "NormalCreditEvaluation"
},
{
"id": 994221,
"enterprise":
{
"id": 354496929,
"name": "阆中市金福旺页岩机砖厂",
"creditCode": "92511381MA62HXHX0D",
"orgCode": "MA62HXHX-0",
"industry": "OTHER",
"controlType": "OTHER",
"enterpriseAttr": "PRODUCTION_ENTERPRISE",
"legalName": "金跃伟",
"legalTel": "",
"headOfEPA": "金光禄",
"regionCity":
{
"code": "511300000",
"name": "南充市",
"parent":
{
"code": "510000000",
"name": "四川省"
}
},
"regionDistrict":
{
"code": "511381000",
"name": "阆中市",
"parent":
{
"code": "511300000",
"name": "南充市"
}
},
"isSSGS": false,
"enable": true,
"lastModifiedDate": "2020-07-10T12:29:10.841+08:00",
"address": "阆中市江南镇瓦房沟村十四社"
},
"evaluateScore": 95,
"selfScore": 104,
"countyScore": 97,
"cityScore": 95,
"evaluateResult": "HBLHQY",
"evaluateYear": 2020,
"last": true,
"publishOrg": "南充市生态环境局",
"evaluateState": "GSZ",
"veto": false,
"dataFrom": "NormalCreditEvaluation"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list_of_normal",
"spider_start_time": "2021-10-27 14:54:17.216",
"spider_end_time": "2021-10-27 14:54:17",
"task_params":
{
"province": "sichuan",
"step": "start",
"index": 1
},
"metadata":
{
"province": "sichuan",
"index": 1
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.18"
}
Hunan:
{
"data":
[
{
"SSQX": "雨花区", # district/county
"GSDJ": "环保合格单位", # credit grade (for the year given in ND)
"ND": "2020", # year
"QYMC": "长沙博大环保科技有限公司", # enterprise or institution name
"SSDS": "长沙市", # city/prefecture
"GXSJ": "2021年09月27日", # update time
"TYSHXYDM": "91430111344823182Y", # unified social credit code
"CPDJ": 2, # evaluated grade
"ZXDJ": "环保合格单位" # current credit grade
},
{
"SSQX": "保靖县",
"GSDJ": "环保合格单位",
"ND": "2020",
"QYMC": "保靖县人民医院",
"SSDS": "湘西土家族苗族自治州",
"GXSJ": "2021年09月27日",
"TYSHXYDM": "12433125448636058Q",
"CPDJ": 2,
"ZXDJ": "环保合格单位"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-10-29 16:03:08.162",
"spider_end_time": "2021-10-29 16:03:10",
"task_params": {"province": "hunan","step": "start","index": 1},
"metadata": {"province": "hunan","index": 1},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.10"
}
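The Hunan `GXSJ` field uses a Chinese date format (`2021年09月27日`), unlike the ISO-style dates in other provinces. A small normalization sketch is shown below; the helper name `parse_gxsj` is hypothetical, and it assumes every value follows the zero-padded pattern seen in the sample.

```python
from datetime import datetime


def parse_gxsj(value: str) -> str:
    """Convert a date such as '2021年09月27日' to ISO format (YYYY-MM-DD)."""
    # strptime treats the Chinese characters as literal separators.
    return datetime.strptime(value, "%Y年%m月%d日").date().isoformat()
```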
Henan:
{
"data":
[
{
"belongsBasin": "1",
"companyAddress": "郑州市中原区桐柏南路158号",
"companyLevel": "2",
"companyName": "河南(郑州)中汇心血管病医院", # enterprise or institution name
"contactAddress": "郑州市中原区桐柏南路158号",
"contactNumber": "",
"contactPerson": "",
"contactTelphone": "15290405687",
"contactUser": "周毅鹏",
"contactWechat": "zyp15290405687",
"controlLimit": "",
"createTime": "2021-06-09 14:16:43",
"createUser": "d857c4afdcc047f1a1fa70417df5f5f7",
"emissionLimits": "",
"emissionsTo": "",
"enabledStatus": "1",
"evaluateDate": "2021-09-24", # rating date
"evaluation": "1",
"exhaustType": "01,02,04",
"finalResult": "警示", # grade
"hasInit": false,
"id": "0351cdb38249462f8589d065f8a207af",
"industry": "",
"industryInvolved": "Q8415",
"isUsed": "1",
"legalRepresentative": "毛慧娟", # legal representative
"officePhone": "",
"orgCode": "ceaa1f4652ae4d73a2bf82d36eb2e325",
"orgName": "中原区生态环境局", # rating authority
"organizationCode": "52410100MJF72040XB",
"outletName": "",
"pKName": "id",
"parentCode": "410100",
"pollutantName": "",
"postcode": "450000",
"pregion": "郑州市", # city
"productionDate": "2009-11-05",
"region": "中原区", # district/county
"regionCode": "410102",
"regionName": "",
"registeredAddress": "",
"remarks": "", # remarks
"scores": "75.0",
"unicode": "52410100MJF72040XB", # unified social credit code
"updateTime": "2021-07-08 15:57:38",
"updateUser": "cd83c18f52a847bdb3466dacebc80515"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-10-28 15:31:48.890",
"spider_end_time": "2021-10-28 15:33:17",
"task_params":
{
"province": "henan",
"step": "start",
"city": "郑州市",
"index": 1
},
"metadata":
{
"province": "henan",
"city": "郑州市",
"index": 1
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.18"
}
Hubei:
{
"data":
[
{
"序号": "1",
"行政区域": "天门市",
"单位名称": "天门市岳口新兴猪场",
"单位地址": "天门市岳口镇新兴垸村三组",
"法定代表人": "罗远松",
"当前得牌": "蓝标",
"时间": "2020-01-19 13:35:33.0"
},
{
"序号": "2",
"行政区域": "天门市",
"单位名称": "天门市新跃养殖场",
"单位地址": "天门市岳口镇新兴垸村",
"法定代表人": "罗良洲",
"当前得牌": "蓝标",
"时间": "2020-01-19 13:35:33.0"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-10-29 10:45:05.281",
"spider_end_time": "2021-10-29 10:45:09",
"task_params":
{
"province": "hubei",
"step": "start",
"index": 1
},
"metadata":
{
"province": "hubei",
"index": 1
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.10"
}
Guangdong:
{
"data":
[
{
"序号": "1",
"地区": "广州市",
"组织机构代码/社会统一信用代码": "",
"企业名称": "广东南方碱业股份有限公司",
"年度评定结果":
{
"2019": "蓝牌",
"2018": "蓝牌",
"2017": "红牌"
}
},
{
"序号": "2",
"地区": "清远市",
"组织机构代码/社会统一信用代码": "",
"企业名称": "清远市广业环保有限公司(源潭污水处理厂)",
"年度评定结果":
{
"2019": "绿牌",
"2018": "绿牌",
"2017": "绿牌"
}
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-11-08 14:35:56.383",
"spider_end_time": "2021-11-08 14:36:01",
"task_params":
{
"province": "guangdong",
"step": "start",
"year": "2007",
"index": 6
},
"metadata":
{
"province": "guangdong",
"year": "2007",
"index": 6
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.6"
}
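In the Guangdong sample, `年度评定结果` (annual rating results) nests one grade per year inside each record. If a downstream table needs one row per enterprise and year, the record can be flattened as sketched below. The helper name `flatten_yearly_results` and the output keys `year` and `result` are hypothetical choices for illustration.

```python
def flatten_yearly_results(record: dict) -> list:
    """Expand one Guangdong record into one row per rated year."""
    results = record.get("年度评定结果") or {}
    return [
        {
            "企业名称": record.get("企业名称"),
            "地区": record.get("地区"),
            "year": int(year),   # years arrive as string keys in the sample
            "result": grade,
        }
        for year, grade in sorted(results.items())
    ]
```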
Guizhou:
{
"data":
[
{
"企业名称": "遵义市红花岗区口腔医院",
"污染源地址": "遵义市红花岗区子尹路157号",
"参评年度": "2015",
"系统评分": "87",
"系统评级结果": "B++"
},
{
"企业名称": "贵州昊龙胜境建材有限责任公司",
"污染源地址": "",
"参评年度": "2015",
"系统评分": "87",
"系统评级结果": "B++"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-10-29 14:08:28.227",
"spider_end_time": "2021-10-29 14:08:29",
"task_params":
{
"province": "guizhou",
"step": "start",
"year": 2015,
"index": 3
},
"metadata":
{
"province": "guizhou",
"year": 2015,
"index": 3
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.10"
}
Guangxi:
{
"data":
[
{
"统一社会信用代码": "91451281340369170J",
"企业名称": "广西万屹工贸有限公司",
"行业分类": "化工",
"行政区划": "河池市 宜州市",
"法人代表": "韦泽达",
"是否参评": "已参评",
"最近评价时间": "2021-07-12",
"信用等级": "普通",
"是否有效": "有效"
},
{
"统一社会信用代码": "9145128168 51842278",
"企业名称": "广西鑫华源水务有限公司",
"行业分类": "污水处理",
"行政区划": "河池市 宜州市",
"法人代表": "霍军",
"是否参评": "已参评",
"最近评价时间": "2021-07-12",
"信用等级": "守信",
"是否有效": "有效"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-10-29 15:13:45.615",
"spider_end_time": "2021-10-29 15:13:48",
"task_params":
{
"province": "guangxi",
"step": "start",
"index": 4
},
"metadata":
{
"province": "guangxi",
"index": 4
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.10"
}
Hebei:
{
"data":
[
{
"qyid": 16622,
"qymc": "邯郸新兴发电有限责任公司", # enterprise name
"orgcode": "911304817840855238", # enterprise code
"qyxzname": "重点排污单位",
"id": 16153,
"stime": "1585908614273",
"etime": null,
"sfhmd": 0,
"ljpf": 96, # score
"qybz": 1, # enterprise class: 1 = class A, 2 = class B, 3 = class C, 4 = class D, 5 = class E
"sfcp": null,
"hstime": null,
"hetime": null,
"gstime": "1599548668767", # update timestamp? Not confirmed that this is the right field
"zq_stime": "1585908614273",
"zq_etime": null,
"sfww": null,
"hmdtime": null,
"qybzname": null,
"ljpfs": null,
"xqcount": 0
},
{
"qyid": 2511,
"qymc": "中普(邯郸)钢铁有限公司",
"orgcode": "75025136X",
"qyxzname": "重点排污单位",
"id": 17534,
"stime": null,
"etime": null,
"sfhmd": 0,
"ljpf": 89,
"qybz": 1,
"sfcp": null,
"hstime": null,
"hetime": null,
"gstime": "1599548496797",
"zq_stime": null,
"zq_etime": null,
"sfww": null,
"hmdtime": null,
"qybzname": null,
"ljpfs": null,
"xqcount": 0
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-10-30 10:16:49.339",
"spider_end_time": "2021-10-30 10:16:54.067",
"task_params": {"province": "hebei"},
"metadata": {"index": "36"},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.38"
}
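The Hebei `qybz` field is a numeric code whose meaning is given in the sample comment (1 = class A through 5 = class E). A lookup sketch follows; the helper name `enterprise_class` is hypothetical, and codes outside 1 to 5 are returned as `None` because the document does not define them.

```python
# Mapping taken from the qybz comment in the Hebei sample.
QYBZ_CLASS = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E"}


def enterprise_class(record: dict):
    """Return the class letter for a Hebei record, or None if undefined."""
    return QYBZ_CLASS.get(record.get("qybz"))
```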
Liaoning:
{
"data":
[
{
"year": 2019, # year
"enterpriseName": "大连金海润废物综合利用有限公司", # enterprise name
"city": "大连市", # city
"area": "金普新区", # district/county
"enterpriseCode": "91210213MA0YT5EY7C", # unified social credit code
"industry": "危险废物治理", # industry category
"pollutionGroup": "重点排污单位", # pollution source enterprise category
"scoreNum": 11, # see the field parsing notes
"isStop": "0" # see the field parsing notes
},
{
"year": 2019,
"enterpriseName": "沈阳广宇供热有限公司总参热源厂",
"city": "沈阳市",
"area": "和平区",
"enterpriseCode": "91210106738671337Y",
"industry": "热力生产和供应",
"pollutionGroup": "重点排污单位",
"scoreNum": 11,
"isStop": "0"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-11-01 15:12:39.342",
"spider_end_time": "2021-11-01 15:12:41",
"task_params":
{
"province": "liaoning",
"step": "start",
"year": 2019,
"index": 1
},
"metadata":
{
"province": "liaoning",
"year": 2019,
"index": 1
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.10"
}
Shandong:
{
"data":
[
{
"XH": "20211029230149891b2f3dcb6a41fbba72a2315794219a",
"QYGXJGMC": "",
"DF": 7,
"YSBS": "#ffff33",
"XTXH": "1635519677032004730880",
"QYMC": "利津誉鑫新型建材有限责任公司",
"QYBH": "8efdb9506bea47a9f78017a328a6b981",
"QYDZ": "山东省东营市利津县经济开发区S316路南(利津力能热电对过)",
"TYSHXYDM": "91370522MA3CFETM8A",
"FRDB": "刘彬",
"PJRQ": "2021-10-29",
"DJFLMC": "黄色等级"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-10-30 16:26:25.305",
"spider_end_time": "2021-10-30 16:26:29",
"task_params": {"province": "shandong","company": "利津誉鑫新型建材有限责任公司"},
"metadata": {"province": "shandong","company": "利津誉鑫新型建材有限责任公司"},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.14"
}
Jilin:
{
"data":
[
{
"address": "吉林省吉林市永吉县岔路河镇一委", # enterprise address
"capital": "100", # registered capital (in units of 10,000 RMB)
"code": "91220221MA0Y3JCJ8A", # social credit code / business license number
"id": "427",
"lawer": "许笑维", # legal representative or person in charge
"name": "永吉县仁合供热有限公司", # enterprise name
"score": "3", # total deduction points
"wasid": "218230",
"level": "blue" # current environmental credit status, derived from score: blue when 1 <= score <= 6; yellow when 7 <= score <= 11; red when score >= 12
},
{
"wasid": "218230",
"id": "596",
"code": "91220112310008964E",
"name": "长春市泓利供热有限公司",
"capital": "500",
"address": "长春市双阳区山河街道泓利港湾小区东侧",
"score": "2",
"lawer": "李建",
"level": "blue"
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-11-01 10:54:59.762",
"spider_end_time": "2021-11-01 10:55:01",
"task_params":
{
"province": "jilin",
"step": "start",
"level": "blue",
"index": 3
},
"metadata":
{
"province": "jilin",
"level": "blue",
"index": 3
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.10"
}
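The Jilin sample comment defines `level` in terms of `score`: blue for 1 to 6, yellow for 7 to 11, red for 12 and above. That rule can be written as a small function, for example to cross-check scraped records. The helper name `level_from_score` is hypothetical, and scores below 1 return `None` because the document does not define a level for them.

```python
from typing import Optional, Union


def level_from_score(score: Union[int, str]) -> Optional[str]:
    """Map a Jilin deduction score to its level per the documented ranges."""
    value = int(score)  # the sample delivers score as a string, e.g. "3"
    if 1 <= value <= 6:
        return "blue"
    if 7 <= value <= 11:
        return "yellow"
    if value >= 12:
        return "red"
    return None  # below 1: not covered by the documented rule
```

Both Jilin sample records (scores "3" and "2") map to blue under this rule, matching their `level` field.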
Anhui:
{
"data":
[
{
"area_id": "繁昌县", # area
"b_id": "9297d507725146d08b7d5f9d8cf0f9c6",
"com_id": "ebc07434060e4670827fd462df69ceeb",
"com_name": "芜湖海螺水泥有限公司", # enterprise name
"cplx": "省级", # evaluation type
"id": "a36e410a3bfc4b0091682f815db93b41",
"public_level": "环保诚信企业", # annual rating result
"row_number": 21,
"year": 2016 # year
},
{
"area_id": "经济技术开发区",
"b_id": "8d0e5dd6986949d198cf69af39eec955",
"com_id": "5a581012297445e5852afbb2d6a32afc",
"com_name": "安徽楚江科技新材料股份有限公司",
"cplx": "省级",
"id": "15850528a44541c2813782b6dcbcaa0b",
"public_level": "环保诚信企业",
"row_number": 22,
"year": 2016
}
],
"http_code": 200,
"error_msg": "",
"task_result": 1000,
"data_type": "list",
"spider_start_time": "2021-11-01 13:43:36.944",
"spider_end_time": "2021-11-01 13:43:37",
"task_params":
{
"province": "anhui",
"step": "start",
"year": 2016,
"index": 3
},
"metadata":
{
"province": "anhui",
"year": 2016,
"index": 3
},
"spider_name": "environmental_protection",
"spider_ip": "10.8.1.10"
}
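Spider runtime environment
scrapy
Spider deployment information
target: node_51,
spider_name: environmental_protection
10 processes
Taskhub address
Task submission URL:
Code authoring URL:
Taskhub scheduling rules
Spider monitoring metrics design
(Under observation; to be added)
Index:
Monitoring frequency:
Monitoring start and end time:
Alert conditions:
Alert group:
Alert content:
Data aggregation
Owner
Data aggregation method
- Spider writes directly to Kafka
- Spider writes files, which Logstash collects
Spider result directory
Collected file storage path:
/data/gravel_spiders/environmental_protection
Storage directory after aggregation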
10.8.6.228:
/data2_227/grvael_spider_result/environmental_protection
Logstash configuration file name
Logstash file collection type
Data aggregation topic
general-taxpayer
ES log index and filter conditions
gravel-spider-data-*
Monitoring dashboard
Data retention policy
Data cleaning
Owner
Code location
Deployment location
Deployment method and notes
- crontab + data_pump
- supervisor + data_pump
- supervisor + consumer
Data receiving source
Data storage table location
- Database address:
- Table name: