|
|
|
# 增量数据提取
|
|
|
|
|
|
|
|
## 方案流程
|
|
|
|
|
|
|
|
```plantuml
|
|
|
|
@startuml
|
|
|
|
|
|
|
|
file maxwell
|
|
|
|
queue topic_maxwell_test
|
|
|
|
database 监控的库表
|
|
|
|
file 原始数据消费组
|
|
|
|
file 处理数据消费组
|
|
|
|
file 原始数据落地
|
|
|
|
file 处理后数据落地
|
|
|
|
|
|
|
|
监控的库表 --> maxwell: 所有库表的binlog数据,统一采用producer为kafka
|
|
|
|
maxwell --> topic_maxwell_test: binlog数据统一入kafka的topic:maxwell_test
|
|
|
|
topic_maxwell_test --> 原始数据消费组: 对于binlog数据不做处理直接持久化到磁盘
|
|
|
|
topic_maxwell_test --> 处理数据消费组: 根据需求对数据进行清洗转换后持久化到指定目录
|
|
|
|
原始数据消费组 --> 原始数据落地: 持久化到10.8.6.227:/data4/data_sync/maxwell_binlog
|
|
|
|
处理数据消费组 --> 处理后数据落地: 持久化到10.8.6.227:/data/data_sync/lake_ic
|
|
|
|
|
|
|
|
@enduml
|
|
|
|
```
|
|
|
|
```
|
|
|
|
由maxwell监控业务需求的mysql库表,将binlog数据发送到kafka,以tablename的hash值作为partiti_key,通过同一个表的数据落在同一个分区中以保证binlog数据的时序性。由data_pump消费kafka数据,分两个方面,一方面对数据不做处理,将原始binlog文件持久化到磁盘做原始数据存储;一方面根据业务需求增加自定义udm模块对数据进行清洗转换,后持久化到指定目录。
|
|
|
|
```
|
|
|
|
|
|
|
|
## 监控表分布详情
|
|
|
|
|
|
|
|
| 服务器 | database | table | 数量 |
|
|
|
|
| -------------------------------------- | --------------- | ------------------------------- | ---- |
|
|
|
|
| **bdp-rds-001.mysql.rds.aliyuncs.com** | utn_ng_risk | risk_environment_punish | 12 |
|
|
|
|
| | | risk_tax_punish | 12 |
|
|
|
|
| | | risk_xianxiao | |
|
|
|
|
| | | risk_zhixing | |
|
|
|
|
| | | risk_zhongben | |
|
|
|
|
| | | risk_lawsuit | |
|
|
|
|
| | | risk_lawsuit_list | |
|
|
|
|
| | | risk_court_announcement | |
|
|
|
|
| | | risk_court_announcement_list | |
|
|
|
|
| | | risk_court_notice | |
|
|
|
|
| | | risk_court_notice_list | |
|
|
|
|
| | | company_own_tax_info | |
|
|
|
|
| **bdp-rds-005.mysql.rds.aliyuncs.com** | utn_ng_tm | sys_cat | 28 |
|
|
|
|
| | | ip_trademark_flow_list | 5 |
|
|
|
|
| | | ip_trademark_category_list | |
|
|
|
|
| | | ip_trademark_applicant_list | |
|
|
|
|
| | | ip_trademark | |
|
|
|
|
| | utn_ng_ip | ip_copyright_works_list | 7 |
|
|
|
|
| | | ip_copyright_works | |
|
|
|
|
| | | ip_copyright_software_list | |
|
|
|
|
| | | ip_copyright_software | |
|
|
|
|
| | | ip_patent_legal_status | |
|
|
|
|
| | | ip_patent | |
|
|
|
|
| | | ip_patent_applicant_list | |
|
|
|
|
| | utn_ng_emp | company_employment | 1 |
|
|
|
|
| | utn_stock | stock_financial_analysis | 11 |
|
|
|
|
| | | stock_executive_new | |
|
|
|
|
| | | stock_base_new | |
|
|
|
|
| | | stock_announcement | |
|
|
|
|
| | | stock_allotment | |
|
|
|
|
| | | stock_bonus | |
|
|
|
|
| | | stock_issue | |
|
|
|
|
| | | stock_holder | |
|
|
|
|
| | | tradable_stock_holder | |
|
|
|
|
| | | stock_holding | |
|
|
|
|
| | | stock_equity_change | |
|
|
|
|
| | utn_ng_qmp_info | qimingpian_history_rongzi | 4 |
|
|
|
|
| | | qimingpian_jingpin_map | |
|
|
|
|
| | | qimingpian_product | |
|
|
|
|
| | | qimingpian_team_member | |
|
|
|
|
| **bdp-rds-003.mysql.rds.aliyuncs.com** | utn_data | tb_icp_base_info | 13 |
|
|
|
|
| | utn_ng_bid | company_bid_main | 1 |
|
|
|
|
| | | company_bid_entity_list_new | 2 |
|
|
|
|
| | utn_ng_biz_info | tb_general_taxpayer | 6 |
|
|
|
|
| | | tax_credit | |
|
|
|
|
| | | land_result_announcement | |
|
|
|
|
| | | tb_certificate | |
|
|
|
|
| | | wechat_public_num | |
|
|
|
|
| | | company_logo | |
|
|
|
|
| | utn_risk | tb_judicial_sale | 3 |
|
|
|
|
| | | risk_shixin | |
|
|
|
|
| | | tb_executed_person | |
|
|
|
|
| | utn_ng_biz | company_customs_business_credit | 1 |
|
|
|
|
| **10.8.6.87** | utn_special | special_enterprise | 5 |
|
|
|
|
| | | special_foundation | 5 |
|
|
|
|
| | | special_law_office | |
|
|
|
|
| | | special_social_organ | |
|
|
|
|
| | | special_trade_union | |
|
|
|
|
| **bdp-rds-007.mysql.rds.aliyuncs.com** | utn_ic | tb_company_sme | 2 |
|
|
|
|
| | | tb_tag_company_qualifications | 2 |
|
|
|
|
| **bdp-rds-801.mysql.rds.aliyuncs.com** | bdp_company | company_contact_details | 1 |
|
|
|
|
| **bdp-rds-t02.mysql.rds.aliyuncs.com** | ic_ar | annual_report_asset | 37 |
|
|
|
|
| | | annual_report_base | 9 |
|
|
|
|
| | | annual_report_change | |
|
|
|
|
| | | annual_report_equity_change | |
|
|
|
|
| | | annual_report_guarantee | |
|
|
|
|
| | | annual_report_invest | |
|
|
|
|
| | | annual_report_partner | |
|
|
|
|
| | | annual_report_social_security | |
|
|
|
|
| | | annual_report_website | |
|
|
|
|
| | ic_base | company_base | 9 |
|
|
|
|
| | | company_branch | |
|
|
|
|
| | | company_change | |
|
|
|
|
| | | company_employee | |
|
|
|
|
| | | company_history_name | |
|
|
|
|
| | | company_industry | |
|
|
|
|
| | | company_partner | |
|
|
|
|
| | | company_partner_pay | |
|
|
|
|
| | | company_partner_realpay | |
|
|
|
|
| | ic_biz | company_abnormal | 19 |
|
|
|
|
| | | company_allow | |
|
|
|
|
| | | company_chattel | |
|
|
|
|
| | | company_chattel_change | |
|
|
|
|
| | | company_chattel_mortgage | |
|
|
|
|
| | | company_chattel_pawn | |
|
|
|
|
| | | company_check | |
|
|
|
|
| | | company_clear | |
|
|
|
|
| | | company_illegal | |
|
|
|
|
| | | company_intellectual | |
|
|
|
|
| | | company_intellectual_change | |
|
|
|
|
| | | company_justice | |
|
|
|
|
| | | company_justice_equity_change | |
|
|
|
|
| | | company_justice_freeze | |
|
|
|
|
| | | company_justice_invalid | |
|
|
|
|
| | | company_justice_keep_freeze | |
|
|
|
|
| | | company_justice_unfreeze | |
|
|
|
|
| | | company_pledge | |
|
|
|
|
| | | company_punish | |
|
|
|
|
|
|
|
|
## 配置文件分布
|
|
|
|
|
|
|
|
| 类型 | 文件名 | database | table数量 |
|
|
|
|
| ---------- | ---------------------- | --------------- | --------- |
|
|
|
|
| 原始数据 | | | |
|
|
|
|
| | maxwell_update_file_01 | | 19 |
|
|
|
|
| | | ic_biz | 19 |
|
|
|
|
| | maxwell_update_file_02 | | 14 |
|
|
|
|
| | | utn_ng_risk | 12 |
|
|
|
|
| | | utn_ic | 2 |
|
|
|
|
| | maxwell_update_file_03 | | 9 |
|
|
|
|
| | | ic_ar | 9 |
|
|
|
|
| | maxwell_update_file_04 | | 14 |
|
|
|
|
| | | ic_base | 9 |
|
|
|
|
| | | utn_ng_tm | 5 |
|
|
|
|
| | maxwell_update_file_05 | | 14 |
|
|
|
|
| | | utn_stock | 11 |
|
|
|
|
| | | utn_risk | 3 |
|
|
|
|
| | maxwell_update_file_06 | | 14 |
|
|
|
|
| | | utn_ng_ip | 7 |
|
|
|
|
| | | utn_ng_biz_info | 6 |
|
|
|
|
| | | bdp_company | 1 |
|
|
|
|
| | maxwell_update_file_07 | | 14 |
|
|
|
|
| | | utn_ng_emp | 1 |
|
|
|
|
| | | utn_ng_qmp_info | 4 |
|
|
|
|
| | | utn_data | 1 |
|
|
|
|
| | | utn_ng_bid | 2 |
|
|
|
|
| | | utn_ng_biz | 1 |
|
|
|
|
| | | utn_special | 5 |
|
|
|
|
| 处理的数据 | | | |
|
|
|
|
| | conv_binlog_file_01 | | 19 |
|
|
|
|
| | | ic_biz | 19 |
|
|
|
|
| | conv_binlog_file_02 | | 14 |
|
|
|
|
| | | utn_ng_risk | 12 |
|
|
|
|
| | | utn_ic | 2 |
|
|
|
|
| | conv_binlog_file_03 | | 9 |
|
|
|
|
| | | ic_ar | 9 |
|
|
|
|
| | conv_binlog_file_04 | | 9 |
|
|
|
|
| | | ic_base | 9 |
|
|
|
|
| | conv_binlog_file_05 | | 19 |
|
|
|
|
| | | utn_stock | 11 |
|
|
|
|
| | | utn_risk | 3 |
|
|
|
|
| | | utn_ng_tm | 5 |
|
|
|
|
| | conv_binlog_file_06 | | 14 |
|
|
|
|
| | | utn_ng_ip | 7 |
|
|
|
|
| | | utn_ng_biz_info | 6 |
|
|
|
|
| | | bdp_company | 1 |
|
|
|
|
| | conv_binlog_file_07 | | 14 |
|
|
|
|
| | | utn_ng_emp | 1 |
|
|
|
|
| | | utn_ng_qmp_info | 4 |
|
|
|
|
| | | utn_data | 1 |
|
|
|
|
| | | utn_ng_bid | 2 |
|
|
|
|
| | | utn_ng_biz | 1 |
|
|
|
|
| | | utn_special | 5 |
|
|
|
|
|
|
|
|
## kafka
|
|
|
|
|
|
|
|
```shell
|
|
|
|
topic:maxwell_test
|
|
|
|
consumer_group:
|
|
|
|
maxwell_conv_file_01
|
|
|
|
maxwell_conv_file_02
|
|
|
|
maxwell_conv_file_03
|
|
|
|
maxwell_conv_file_04
|
|
|
|
maxwell_conv_file_05
|
|
|
|
maxwell_conv_file_06
|
|
|
|
maxwell_conv_file_07
|
|
|
|
maxwell_update_file_01
|
|
|
|
maxwell_update_file_02
|
|
|
|
maxwell_update_file_03
|
|
|
|
maxwell_update_file_04
|
|
|
|
maxwell_update_file_05
|
|
|
|
maxwell_update_file_06
|
|
|
|
maxwell_update_file_07
|
|
|
|
```
|
|
|
|
|
|
|
|
## maxwell
|
|
|
|
|
|
|
|
* [maxwell](best_practice/maxwell)
|
|
|
|
|
|
|
|
```shell
|
|
|
|
服务器:10.8.6.74
|
|
|
|
目录/opt/maxwell/
|
|
|
|
配置文件/opt/maxwell/conf/
|
|
|
|
-rw-r--r-- 1 root root 12233 Sep 27 09:56 001_config.properties
|
|
|
|
-rw-r--r-- 1 root root 12281 Sep 27 09:56 003_config.properties
|
|
|
|
-rw-r--r-- 1 root root 12874 Sep 27 09:56 005_config.properties
|
|
|
|
-rw-r--r-- 1 root root 11821 Sep 27 09:56 007_config.properties
|
|
|
|
-rw-r--r-- 1 root root 11788 Sep 27 09:56 801_config.properties
|
|
|
|
-rw-r--r-- 1 root root 11929 Sep 27 09:56 87_config.properties
|
|
|
|
-rw-r--r-- 1 root root 11770 Sep 27 09:56 t02_config.properties
|
|
|
|
7个实例同时运行
|
|
|
|
/opt/maxwell/bin/maxwell --config /opt/maxwell/conf/001_config.properties --daemon
|
|
|
|
将监控到的binlog信息写入kafka的toupic:maxwell_test
|
|
|
|
```
|
|
|
|
|
|
|
|
## data_pump
|
|
|
|
|
|
|
|
```shell
|
|
|
|
服务器:10.8.6.47
|
|
|
|
app_maxwell:/home/collie/product
|
|
|
|
|
|
|
|
读取kafka转存原始binlog:
|
|
|
|
supervisor文件:/etc/supervisord.d/collie/maxwell_update_file.conf
|
|
|
|
7个进程读取kafka后转存到/data4_227/data_sync/maxwell_binlog
|
|
|
|
|
|
|
|
读取kafka转换数据格式转存:
|
|
|
|
supervisor文件:/etc/supervisord.d/collie/conv_binlog_file.conf
|
|
|
|
7个进程读取kafka后转存到/data4_227/data_sync/maxwell_data
|
|
|
|
```
|
|
|
|
|
|
|
|
## 数据存放目录
|
|
|
|
|
|
|
|
```shell
|
|
|
|
10.8.6.227
|
|
|
|
原始binlog:
|
|
|
|
/data4/data_sync/maxwell_binlog
|
|
|
|
|
|
|
|
转换数据格式后:
|
|
|
|
/data/data_sync/lake_ic
|
|
|
|
|
|
|
|
```
|
|
|
|
|