|
|
# 从mysql数据库作为数据输入源
|
|
|
|
|
|
**class**参数配置为```sql.SqlDocReader```
|
|
|
|
|
|
## init 参数
|
|
|
|配置参数|子项|说明|
|
|
|
|----|----|----|
|
|
|
|db||数据库配置|
|
|
|
|query| |查询条件配置|
|
|
|
| |table|要查询的数据库表名|
|
|
|
| |columns|查询返回的字段列表|
|
|
|
| |order_by|返回结果排序的字段|
|
|
|
| |limit|返回结果的条数|
|
|
|
| |condition|查询条件|
|
|
|
| |query_sql|完整的查询sql语句, 语句需必需包含占位符 $CONDITIONS。 |
|
|
|
| |pack|关联查询子表,并将子表记录以数据形式存放在结果数据的指定字段中|
|
|
|
|offset|查询偏移记录|
|
|
|
|
|
|
### 查询参数(query)
|
|
|
数据筛选的行为可以通过参数query_sql指定完整的SQL语句或分别指定 table, columns,condition等参数进行配置。
|
|
|
当指定了query_sql参数,table,columns,condition,order_by, limit等参数将被忽略。
|
|
|
|
|
|
#### 完整查询语句 (query_sql)
|
|
|
查询语句的Where子句中必需包含占位符 $CONDITIONS。
|
|
|
$CONDITIONS 可被认为是一个条件表达式, 且它应该与SQL其它条件以 AND 连接。
|
|
|
|
|
|
#### 关联查询(pack)
|
|
|
该功能用于将主表与关联子表数据一起查出。如: 将如下表的记录,进行关联查询,并打包为一条数据:
|
|
|
|
|
|
tb_team
|
|
|
|
|
|
| id |name |
|
|
|
| ---- |---- |
|
|
|
| 1 |marvel|
|
|
|
| 1 |marvel|
|
|
|
|
|
|
tb_members
|
|
|
|
|
|
|id|team_id|member_name|
|
|
|
|----|----|----|
|
|
|
|1 |1 |Iron Man|
|
|
|
|2 |1 |Spider Man|
|
|
|
|
|
|
|
|
|
打包后的数据为:
|
|
|
|
|
|
```json
|
|
|
{
|
|
|
"id": 1,
|
|
|
"name": "marvel",
|
|
|
"members": [
|
|
|
{
|
|
|
"id": 1,
|
|
|
"team_id": 1,
|
|
|
"member_name": "Iron Man"
|
|
|
},
|
|
|
{
|
|
|
"id": 2,
|
|
|
"team_id": 1,
|
|
|
"member_name": "Spider Man"
|
|
|
}
|
|
|
]
|
|
|
}
|
|
|
```
|
|
|
|
|
|
所需的配置如下:
|
|
|
|
|
|
```yaml
|
|
|
table: tb_team
|
|
|
pack:
|
|
|
- name: "members"
|
|
|
table: "tb_members"
|
|
|
join_on:
|
|
|
left: "id"
|
|
|
right: "team_id"
|
|
|
columns:
|
|
|
- id
|
|
|
- team_id
|
|
|
- member_name
|
|
|
```
|
|
|
|
|
|
|
|
|
|
|
|
tb_team
|
|
|
|
|
|
| id |name | sex |
|
|
|
| ---- |---- | --- |
|
|
|
| 1 |marvel| Man |
|
|
|
| 1 |marvel| Woman |
|
|
|
|
|
|
tb_members
|
|
|
|
|
|
|id|team_id|member_name| sex |
|
|
|
|----|----|----|----|
|
|
|
|1 |1 |Iron Man| Man |
|
|
|
|2 |1 |Spider Man| Man |
|
|
|
|3 |1 |Black widow| Woman |
|
|
|
|
|
|
|
|
|
打包后的数据为:
|
|
|
```json
|
|
|
{
|
|
|
"id": 1,
|
|
|
"name": "marvel",
|
|
|
"sex": "Woman",
|
|
|
"members": [
|
|
|
{
|
|
|
"id": 3,
|
|
|
"team_id": 1,
|
|
|
"member_name": "Black widow",
|
|
|
"sex": "Woman"
|
|
|
}
|
|
|
]
|
|
|
}
|
|
|
```
|
|
|
|
|
|
所需的配置如下:
|
|
|
```yaml
|
|
|
query_sql: "select * from tb_team where id='1' and sex='Woman' and $CONDITIONS"
|
|
|
pack:
|
|
|
- name: "members"
|
|
|
table: "tb_members"
|
|
|
join_on:
|
|
|
left: "id, sex"
|
|
|
right: "team_id, sex"
|
|
|
columns:
|
|
|
- id
|
|
|
- team_id
|
|
|
- member_name
|
|
|
- sex
|
|
|
```
|
|
|
|
|
|
|
|
|
|配置参数|说明|
|
|
|
|----|----|
|
|
|
|name|打包后数据的字段名|
|
|
|
|table|关联的子表名称|
|
|
|
|join_on|关联查询所使用的关联字段. left: 主表字段名, right: 子表字段名|
|
|
|
|columns|查询返回的字段列表|
|
|
|
|
|
|
|
|
|
## 配置示例:
|
|
|
```yaml
|
|
|
sql_qichacha_push:
|
|
|
class: sql.SqlDocReader
|
|
|
init:
|
|
|
db:
|
|
|
host: 192.168.109.220
|
|
|
port: 3306
|
|
|
database: collie
|
|
|
user: collie
|
|
|
password: eill
|
|
|
|
|
|
query:
|
|
|
table:
|
|
|
a: company_lawsuit
|
|
|
b: company_lawsuit_parsed_info
|
|
|
columns:
|
|
|
a: "*"
|
|
|
b:
|
|
|
- id as bid
|
|
|
- jd_case_reason
|
|
|
- plaintiffs
|
|
|
- defendants
|
|
|
- thirdParties
|
|
|
condition: a.uuid = b.lawsuit_uuid
|
|
|
|
|
|
offset:
|
|
|
field: id
|
|
|
store: "file:///home/collie/project-collie/offset_company_lawsuit.txt"
|
|
|
```
|
|
|
|
|
|
|
|
|
* **db**: mysql数据库连接配置
|
|
|
* **query**: 查询条件,支持直接完整sql语句、指定表名和列名两种方式
|
|
|
1. 完整sql语句
|
|
|
* **query_sql**: select查询语句,其中where子句必须包含**$CONDITIONS**关键字
|
|
|
2. 指定表名和列名
|
|
|
* **query.table**: 数据表,支持单表,多表join查询
|
|
|
* **query.columns**: select查询的列名列表,逗号(,)分割。默认是所有列
|
|
|
* **query.condition**: select查询的where条件
|
|
|
* **offset**: 记录上次查询结束时记录在表中的offset
|
|
|
1. **offset.field**: 记录offset的字段名称
|
|
|
2. **offset.store**: offset保存的路径,支持本地文件和mysql数据库,前缀分别是```file://```、```mysql://``` |