pc_wenshu · Changes

update: added a note on the html field in the court judgment document (裁判文书) crawler results; corrected an error in the redis configuration. Authored Sep 27, 2021 by 蒋家升
Showing with 5 additions and 3 deletions
data_stream/pc/pc_wenshu.md
View page @ d38d64fc
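@@ -77,9 +77,9 @@ wenshu_spider
 <!-- Notes on redis host, port, db, key, and priority -->
-* redis host: redis://:utn@0818@bdp-mq-001.redis.rds.aliyuncs.com:6379/0
+* redis host: redis://:utn@0818@bdp-mq-001.redis.rds.aliyuncs.com:6379/7
 * redis port: 6379
-* redis db: 0
+* redis db: 7
 * redis key:
 * wenshu_keys
@@ -203,6 +203,7 @@ detail: detail information, where each dictionary is one record; this is the only type
 <!-- May match the super data (超级数据); crawler results may be structured differently for different data_type values; the super data combines the results of all data_type values -->
 * Detail-page crawler; crawls by the date condition `"search_keys": "cprq"`
+* Special note: besides normal HTML, the html field can also hold plain text, for example: "不公开理由:以调解方式结案的" ("Reason for non-disclosure: the case was concluded by mediation");
 ```json
 {
@@ -384,4 +385,5 @@ public-company-spider-data-*
 * Database address: 10.8.6.74
 * Database type: hive
 * Database name: risk
-* Table name: ods_lawsuit
\ No newline at end of file
+* Table name: ods_lawsuit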
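
The note added in this change says the html field can hold either regular HTML or a plain-text string, so downstream consumers probably need to handle both shapes. The following is a minimal sketch of one way to do that, not the wenshu_spider implementation. The function name `normalize_html_field`, the `"html"` / `"text"` labels, and the rule "any start tag means HTML" are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects text nodes and records whether any start tag was seen."""

    def __init__(self):
        super().__init__()
        self.saw_tag = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Any start tag is treated as evidence of HTML (assumed heuristic).
        self.saw_tag = True

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def normalize_html_field(value):
    """Return (kind, text), where kind is "html" or "text" (hypothetical labels)."""
    collector = _TextCollector()
    collector.feed(value)
    collector.close()  # flush any buffered trailing text
    if collector.saw_tag:
        return "html", " ".join(collector.chunks)
    return "text", value.strip()


if __name__ == "__main__":
    # Sample values: the second is the plain-text example quoted in the note.
    print(normalize_html_field("<div><p>民事判决书</p></div>"))
    print(normalize_html_field("不公开理由:以调解方式结案的"))
```

Returning the kind alongside the extracted text lets a caller route plain-text values (such as non-disclosure reasons) separately from full judgment documents, if that distinction matters downstream.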