
Last edited by 吴一博 Apr 28, 2020
This is an old version of this page. You can view the most recent version or browse the history.

table_explode

Expanding child tables

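This feature handles data with a one-to-many relationship. Such data is usually stored in two related tables: a main table that holds the basic information and an associated table that holds the details, with a one-to-many relationship between them. The diagram below shows an example that records teams (team) and their members (member).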

tb_team and tb_members

@startuml

' hide the spot
hide circle

entity "tb_team" {
  *id : number <<generated>>
  --
  *name : text
}

entity "tb_members" {
  *id : number <<generated>>
  --
  *team_id : number <<FK>>
  *member_name : text
}

tb_team  --|{ tb_members

@enduml

For example, suppose the tables should contain the following rows after the data is loaded:

tb_team

id  name
1   marvel

tb_members

id  team_id  member_name
1   1        Iron Man
2   1        Spider Man

The data packet must be assembled in the following structure:

{
    "id": 1,
    "name": "marvel",
    "members": [
        {
            "id": 1,
            "member_name": "Iron Man"
        },
        {
            "id": 2,
            "member_name": "Spider Man"
        }
    ],

    "sync_condition": {
        "data_type": "teams",
        "operation": "upsert"
    }
}
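As a minimal sketch of how a producer might assemble such a packet, the Python below nests the member rows under `members` and appends `sync_condition`. The function name `build_team_packet` and its parameters are hypothetical and are not part of project-collie; only the field names come from the example above.

```python
import json


def build_team_packet(team, members, operation="upsert"):
    """Assemble one packet: team fields at the top level, members nested.

    Hypothetical helper for illustration; not a project-collie API.
    """
    packet = dict(team)                             # main-table fields (id, name)
    packet["members"] = [dict(m) for m in members]  # child rows, one dict each
    packet["sync_condition"] = {
        "data_type": "teams",                       # must match the configured data_type name
        "operation": operation,
    }
    return packet


packet = build_team_packet(
    {"id": 1, "name": "marvel"},
    [
        {"id": 1, "member_name": "Iron Man"},
        {"id": 2, "member_name": "Spider Man"},
    ],
)
print(json.dumps(packet, indent=4))
```

Serializing with `json.dumps` guarantees valid JSON, so stray trailing commas cannot creep into a hand-written packet.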

In the configuration file, the data_type named teams is defined as follows:

catalogues: 
  - data_type:                              #(1)
      - name: "teams"                       #(2)
        tables: ['tb_team', "tb_members"]   

    table_explode:                          #(3) 
      - table: "tb_members"                 #(4) 
        explode_field: "members"            #(5)
        attach_fields:                      #(6) 
           - field: "team_id"
             refer: "id"
        overwrite:                          #(7)
           foreign_keys:                    #(8)
               - field: "team_id"
                 refer: "id"
           logical_delete: True             #(9)
           deleted_field: "deleted"
           deleted_field_value: 1



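#(1) data_type is a list; multiple data_types can be defined under one catalogue.
#(2) The data_type named teams involves two tables: tb_team and tb_members.
#(3) Defines which tables receive data expanded from a specific field of the input data. table_explode is a list.
#(4) Table tb_members is populated by expanding a field of the input data.
#(5) explode_field names the input field to expand, here members.
#(6) Each expanded row gets an additional team_id field, whose value is the value of the id field in the input data.
#(7) Update in overwrite mode.
#(8) The rows in tb_members to overwrite are selected by the team_id field, whose value is the value of the id field in the input data.
#(9) Perform logical deletion. The deletion flag field is deleted; a value of 1 means the row is deleted.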
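
The semantics described in the notes can be sketched in Python as follows. This is an illustration under stated assumptions, not the project's implementation: the functions `explode` and `overwrite` are hypothetical, rows are modelled as plain dicts instead of database records, and it is assumed that `id` identifies a member row and that rows selected by `foreign_keys` but absent from the new data are the ones flagged as deleted.

```python
def explode(packet, rule):
    """Expand packet[explode_field] into rows, adding each attach_fields entry."""
    rows = []
    for item in packet.get(rule["explode_field"], []):
        row = dict(item)
        for attach in rule.get("attach_fields", []):
            # e.g. team_id takes the value of the packet's id field
            row[attach["field"]] = packet[attach["refer"]]
        rows.append(row)
    return rows


def overwrite(existing, new_rows, packet, rule):
    """Replace the rows selected by foreign_keys (assumed semantics)."""
    ow = rule["overwrite"]
    selector = {fk["field"]: packet[fk["refer"]] for fk in ow["foreign_keys"]}
    new_ids = {row["id"] for row in new_rows}  # assumption: id is the row key
    kept = []
    for row in existing:
        selected = all(row.get(f) == v for f, v in selector.items())
        if not selected:
            kept.append(row)  # belongs to another team, left untouched
        elif row["id"] not in new_ids and ow.get("logical_delete"):
            # flag the stale row instead of removing it
            kept.append({**row, ow["deleted_field"]: ow["deleted_field_value"]})
        # otherwise the old row is dropped and superseded by the new data
    return kept + new_rows


rule = {
    "table": "tb_members",
    "explode_field": "members",
    "attach_fields": [{"field": "team_id", "refer": "id"}],
    "overwrite": {
        "foreign_keys": [{"field": "team_id", "refer": "id"}],
        "logical_delete": True,
        "deleted_field": "deleted",
        "deleted_field_value": 1,
    },
}
packet = {
    "id": 1,
    "name": "marvel",
    "members": [
        {"id": 1, "member_name": "Iron Man"},
        {"id": 2, "member_name": "Spider Man"},
    ],
}
existing = [
    {"id": 1, "team_id": 1, "member_name": "Iron Man"},
    {"id": 3, "team_id": 1, "member_name": "Hulk"},
    {"id": 9, "team_id": 2, "member_name": "Batman"},
]

rows = explode(packet, rule)
result = overwrite(existing, rows, packet, rule)
```

In this run the stale row with id 3 is kept but flagged with `deleted` set to 1, the row for team 2 is left unchanged, and the two exploded rows carry `team_id` 1.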