SpringBoot3 + ShardingSphere-Proxy5.5.2 分库分表

发表于 2025-09-04 更新于 2025-09-15 分类于技术， springboot ， sharding-sphere

阅读次数：本文字数： 3.6k 阅读时长 ≈ 13 分钟

摘要

本文介绍 SpringBoot3.5.5 + ShardingSphere-Proxy5.5.2 分库分表的使用。
ShardingSphere官网
本文项目代码Github地址
本文将 SpringBoot3 + ShardingSphere-JDBC5.5.2 分库分表修改为 ShardingSphere-Proxy 的模式

ShardingSphere-Proxy 简介

ShardingSphere-Proxy 定位为透明化的数据库代理端，通过实现数据库二进制协议，对异构语言提供支持。目前提供 MySQL 和 PostgreSQL 协议，透明化数据库操作，对 DBA 更加友好。
- 向应用程序完全透明，可直接当做 MySQL/PostgreSQL 使用；
- 兼容 MariaDB 等基于 MySQL 协议的数据库，以及 openGauss 等基于 PostgreSQL 协议的数据库；
- 适用于任何兼容 MySQL/PostgreSQL 协议的的客户端，如：MySQL Command Client, MySQL Workbench, Navicat 等。
ShardingSphere-Proxy 独立部署架构图
ShardingSphere-Proxy 与 ShardingSphere-JDBC 的特性比较

特性	ShardingSphere-JDBC	ShardingSphere-Proxy
数据库支持	任意数据库	MySQL / PostgreSQL
连接消耗数	高	低
异构语言支持	仅支持 Java	支持任意语言
性能	损耗低	损耗略高
无中心化	是	否
静态入口	无	有

部署 ShardingSphere-Proxy

运行 ShardingSphere-Proxy 要求 JDK 1.8+
下载 ShardingSphere-Proxy5.5.2

wget https://www.apache.org/dyn/closer.lua/shardingsphere/5.5.2/apache-shardingsphere-5.5.2-shardingsphere-proxy-bin.tar.gz

tar -zxvf apache-shardingsphere-5.5.2-shardingsphere-proxy-bin.tar.gz
cd apache-shardingsphere-5.5.2-shardingsphere-proxy-bin/

配置 ShardingSphere-Proxy

conf/global.yaml: 全局配置，所谓全局，就是对所有逻辑库都生效的配置

# https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-proxy/yaml-config/authority/
authority:
 users:
   - user: root@127.0.0.1 # 格式：用户名@IP
     password: root
     admin: true          # 是否是管理员
   - user: sharding@%     # 所有IP都可以访问
     password: sharding
 privilege:
   type: DATABASE_PERMITTED # 权限类型
   props:
      user-database-mappings: root@127.0.0.1=*,sharding@%=sharding_db # 用户权限映射，*表示所有数据库

# 开启XA事务，如果是 springboot项目，只需要在方法上加上 @Transactional 注解即可开启事务(从 MySQL 5.7 开始就全面支持 XA 分布式事务)
# https://shardingsphere.apache.org/document/5.5.2/cn/user-manual/shardingsphere-jdbc/yaml-config/rules/transaction/
transaction:  # 配置事务
  defaultType: XA        # 事务类型，默认 LOCAL
  providerType: Atomikos # 事务管理器具体实现，默认就是 Atomikos

# 属性配置：https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-proxy/yaml-config/props/
props:
  sql-show: true # 控制台打印改写后的 SQL，便于排错，默认为 false
  check-table-metadata-enabled: false # 在程序启动和更新时，是否检查分片元数据的结构一致性，默认为 false

conf/database-my.yaml: 自定义分配规则配置

注意:
1.名称必须以 database-开头，实际上conf目录下有很多示例，我们可以根据需要进行配置
2.global.yaml中的配置项不能配置到 database-my.yaml中

# 数据库名称，默认值：logic_db
databaseName: sharding_db

# 数据源配置: https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-jdbc/yaml-config/data-source/
dataSources:
  ds_0: # 逻辑数据源名称
    # dataSourceClassName: com.zaxxer.hikari.HikariDataSource # 不要指定
    # driverClassName: com.mysql.cj.jdbc.Driver               # 不要指定
    url: jdbc:mysql://127.0.0.1:3306/shardingdb0?useSSL=false&serverTimezone=UTC&useUnicode=true&characterEncoding=UTF-8  # 注意这里属性为 url
    username: root
    password: newpwd

  ds_1:
    # dataSourceClassName: com.zaxxer.hikari.HikariDataSource
    # driverClassName: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://127.0.0.1:3306/shardingdb1?useSSL=false&serverTimezone=UTC&useUnicode=true&characterEncoding=UTF-8
    username: root
    password: newpwd

# 以下规则与前文中的 shardingsphere-jdbc 中的 rules 相同
# 分片规则配置: https://shardingsphere.apache.org/document/current/cn/user-manual/shardingsphere-jdbc/yaml-config/rules/sharding/
rules:
  - !SHARDING # 分片规则配置
    # 绑定表：同分片键 join 时走同路由，减少广播,多个逗号分隔，要求分片规则一致
    bindingTables:
      - t_order,t_order_item

    tables: # 手工分片规则配置
      course: # 逻辑表名称
        actualDataNodes: ds_${0..1}.course_${1..2} # 实际数据节点
        databaseStrategy: # 分库策略
          standard: # 用于单分片键的标准分片场景
            shardingColumn: user_id # 分片列名称
            shardingAlgorithmName: course_db_inline # 分片算法名称
        tableStrategy: # 分表策略
          standard:
            shardingColumn: cid
            shardingAlgorithmName: course_inline
        keyGenerateStrategy: # 分布式序列策略
          column: cid # 自增列名称
          keyGeneratorName: snowflake # 分布式序列算法名称
      t_order_complex: # 逻辑表名称
        actualDataNodes: ds_${0..1}.t_order_complex_${0..1} # 实际数据节点
        databaseStrategy: # 分库策略
          standard: # 用于单分片键的标准分片场景
            shardingColumn: user_id # 分片列名称
            shardingAlgorithmName: t_order_db_inline # 分片算法名称
        tableStrategy: # 分表策略
          complex: # 用于多分片键的复杂分片场景
            shardingColumns: user_id,order_id
            shardingAlgorithmName: t_order-complex-algorithm
        keyGenerateStrategy: # 分布式序列策略
          column: order_id # 自增列名称
          keyGeneratorName: snowflake # 分布式序列算法名称
      t_order_item_complex: # 逻辑表名称
        actualDataNodes: ds_${0..1}.t_order_item_complex_${0..1} # 实际数据节点
        databaseStrategy: # 分库策略
          standard: # 用于单分片键的标准分片场景
            shardingColumn: user_id # 分片列名称
            shardingAlgorithmName: t_order_db_inline # 分片算法名称
        tableStrategy: # 分表策略
          complex: # 用于多分片键的复杂分片场景
            shardingColumns: user_id,order_id  # 分片列名称,多个逗号分隔
            shardingAlgorithmName: t_order_item-class-based-algorithm_spi # 基于 SPI 的分片算法，效果同上，建议生产环境使用 SPI
        keyGenerateStrategy: # 分布式序列策略
          column: item_id # 自增列名称
          keyGeneratorName: snowflake # 分布式序列算法名称

      t_user:
        actualDataNodes: ds_${0..1}.t_user_${0..1} # 实际数据节点
        databaseStrategy: # 分库策略
          standard: # 用于单分片键的标准分片场景
            shardingColumn: id # 分片列名称
            shardingAlgorithmName: t_user_db_inline # 分片算法名称
        tableStrategy: # 分表策略
          standard:
            shardingColumn: id  # 分片列名称
            shardingAlgorithmName: t_user_inline # 分片算法名称
        keyGenerateStrategy: # 分布式序列策略
          column: id # 自增列名称，字符串类型
          #           keyGeneratorName: uuid # 分布式序列算法名称
          keyGeneratorName: custom_snowflake_string # 分布式序列算法名称


    #       t_address:      # 普通表（不分库分表，绑定到 ds_0）,没有默认的数据源配置，所以每个都要显示声明
    #         actualDataNodes: ds_0.t_address # 实际数据节点

    autoTables: # 自动分片规则配置
      t_order: # 逻辑表名称
        actualDataSources: ds_${0..1} # 数据源名称
        shardingStrategy: # 切分策略
          standard: # 用于单分片键的标准分片场景
            shardingColumn: user_id # 分片列名称
            shardingAlgorithmName: mod_2 # 自动分片算法名称
        keyGenerateStrategy: # 分布式序列策略
          column: order_id # 自增列名称
          keyGeneratorName: snowflake # 分布式序列算法名称
      t_order_item: # 逻辑表名称
        actualDataSources: ds_${0..1} # 数据源名称
        shardingStrategy: # 切分策略
          standard: # 用于单分片键的标准分片场景
            shardingColumn: user_id # 分片列名称
            shardingAlgorithmName: mod_2 # 自动分片算法名称
        keyGenerateStrategy: # 分布式序列策略
          column: item_id # 自增列名称
          keyGeneratorName: snowflake # 分布式序列算法名称

    shardingAlgorithms: # 分片算法 https://shardingsphere.apache.org/document/current/cn/dev-manual/sharding/
      course_inline: # 定义名称，在上面引用
        type: INLINE # 基于行表达式的分片算法，这里使用 MOD 会报错
        props: # 属性
          algorithm-expression: course_${cid % 2 + 1} # 表达式，这是因为表名称为 course_1, course_2
          allow-range-query-with-inline-sharding: true # 允许范围查询
      course_db_inline:
        type: INLINE
        props:
          algorithm-expression: ds_${user_id % 2} # 表示 ds_0, ds_1
      mod_2:
        type: MOD # 基于 MOD 的分片算法
        props:
          sharding-count: 2 # 分片数量，即 对 2 进行取余
      t_order_db_inline:
        type: INLINE
        props:
          algorithm-expression: ds_${user_id % 2}
      t_order-complex-algorithm:
        type: COMPLEX_INLINE # 基于行表达式的复合分片算法
        props:
          algorithm-expression: t_order_complex_${(user_id + order_id + 1) % 2}
      t_order_item-class-based-algorithm_spi: # SPI
        type: T_ORDER_ITEM_COMPLEX # 基于自定义类的分片算法
      t_user_db_inline:
        type: INLINE
        props:
          algorithm-expression: ds_${Math.abs(id.hashCode()%2)}
      t_user_inline:
        type: INLINE
        props:
          algorithm-expression: t_user_${Math.abs(id.hashCode()%4).intdiv(2)}

    keyGenerators: # 分布式主键生成器: https://shardingsphere.apache.org/document/current/cn/user-manual/common-config/builtin-algorithm/keygen/
      snowflake: # 定义名称，在上面引用
        type: SNOWFLAKE # 使用雪花算法，Long
      uuid: # 定义名称
        type: UUID # 字符串主键，String
      custom_snowflake_string:
        type: CUSTOM_SNOWFLAKE_STRING
        props:
          workerId: 2
          datacenterId: 2

  - !BROADCAST  # 广播表配置，即所有的库中都包含指定的表，写入数据时同时写入多个库，查询时随机读一个
    tables:
      - dict    # 广播表名称，⼴播表不能配置分表逻辑，只往多个库的同⼀个表中插⼊数据。

  - !ENCRYPT    # 数据加密配置
    tables:
      t_user: # 加密表名称
        columns:
          password: # 加密列名称
            cipher:
              name: password # 密文列名称
              encryptorName: aes_encryptor # 密文列加密算法名称
    # 加密算法配置: https://shardingsphere.apache.org/document/current/cn/user-manual/common-config/builtin-algorithm/encrypt/
    encryptors:
      aes_encryptor: # 加解密算法名称
        type: AES # 加解密算法类型
        props: # 加解密算法属性配置
          aes-key-value: 123456abc     # AES 使用的 KEY
          digest-algorithm-name: SHA-1 # AES KEY 的摘要算法
      md5_encryptor:
        type: MD5
        props:
          salt: 123456  # 盐值（可选）

  - !MASK  # 数据脱敏配置
    tables:
      t_user: # 脱敏表名称
        columns: # 脱敏列配置
          password: # 脱敏列名称
            maskAlgorithm: md5_mask # 脱敏算法名称
          email:
            maskAlgorithm: mask_before_special_chars_mask
          telephone:
            maskAlgorithm: keep_first_n_last_m_mask
          name:
            maskAlgorithm: my_mask

    maskAlgorithms: # 脱敏算法配置
      md5_mask: # 自定义脱敏算法名称
        type: MD5  # 脱敏算法类型，md5加密后展示
      mask_before_special_chars_mask:
        type: MASK_BEFORE_SPECIAL_CHARS # 在特殊字符（比如邮箱里的 @）前面做脱敏，示例：myemail@example.com → *******@example.com
        props:
          special-chars: '@'  # 遇到 @ 之前的部分做脱敏
          replace-char: '*'   # 脱敏字符用 * 代替
      keep_first_n_last_m_mask:
        type: KEEP_FIRST_N_LAST_M # 保留前 n 位和后 m 位，其余用替换字符填充，示例：13812345678 → 138****5678
        props:
          first-n: 3     # 保留前 3 位
          last-m: 4      # 保留后 4 位
          replace-char: '*' # 脱敏字符用 * 代替
      my_mask:
        type: MY_CUSTOM_MASK  # 自定义脱敏算法名称
        props:
          replace-char: "#"


  - !SINGLE # 单表规则配置，单表规则优先级高于分库分表规则
    tables:
      # MySQL 风格
      - ds_0.t_address # 加载指定单表
#       - ds_1.* # 加载指定数据源中的全部单表
#       - "*.*" # 加载全部单表

上面的rules中使用的是mysql数据库，所以我们需要引入mysql数据库的依赖

cd apache-shardingsphere-5.5.2-shardingsphere-proxy-bin/
mkdir ext-lib
cd ext-lib
wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/8.0.11/mysql-connector-java-8.0.11.jar

同时rules中包含一些自定义的算法，我们也需要将这些算法作为依赖进行引入，将这些算法类打成jar，然后也拷贝到ext-lib目录下，我已经将其发布到了github上，实际上和前文中的 shardingsphere-jdbc 项目中将算法配置为 spi 的方式是一致的。

# 1. 只克隆仓库的基本信息，不下载所有文件
git clone --filter=blob:none --sparse https://github.com/hanqunfeng/springbootchapter.git

# 2. 进入仓库目录
cd springbootchapter

# 3. 设置只检出你需要的目录
git sparse-checkout set springboot3-demo/shardingsphere-demo/algorithm-swapper

# 4. 编译打包
cd springboot3-demo/shardingsphere-demo/algorithm-swapper
mvn clean package -DskipTests

# 5. 将打好的包复制到 shardingsphere-proxy 的 ext-lib 目录下
cp target/algorithm-swapper-1.0.0.jar $shardingsphere-proxy$/ext-lib

启动与关闭 ShardingSphere Proxy

ShardingSphere Proxy 要求 JDK 1.8 或以上版本

cd $shardingsphere-proxy$/bin
# 启动
./start.sh
# 关闭
./stop.sh

参数说明

参数	说明	默认值
`-a`	绑定地址，可以是 IPv4、IPv6 或主机名，多个地址用逗号分隔。	`0.0.0.0`
`-p`	绑定端口号，可以在 `global.yaml` 中修改。`-p` 优先级更高	`3307`
`-c`	ShardingSphere-Proxy 配置目录路径。	`conf`
`-f`	强制启动 ShardingSphere-Proxy。	无
`-g`	如果在 `agent` 目录下部署了 `shardingsphere-agent`，启用 agent 功能。	无
`-s`	指定用于连接的 socket 文件路径。	无

ShardingSphere-Proxy 启动命令速查表

场景	命令示例	说明
默认启动	`./start.sh`	默认端口 `3307`，配置目录 `conf`
指定端口和配置目录	`./start.sh 3308 /opt/shardingsphere-proxy/conf`	简写模式，指定端口和配置目录
参数形式启动	`./start.sh -p 3308 -c /opt/shardingsphere-proxy/conf`	与上面等效，但更明确
指定监听地址	`./start.sh -a 192.168.1.100 -p 3307 -c conf`	指定单个 IP 地址
指定多个监听地址	`./start.sh -a 192.168.1.100,127.0.0.1 -p 3307 -c conf`	多个地址用逗号分隔
强制启动	`./start.sh -p 3307 -c conf -f`	遇到残留 PID 文件时使用
启用 agent	`./start.sh -p 3307 -c conf -g`	启动 ShardingSphere-Agent
使用 Unix Socket	`./start.sh -p 3307 -c conf -s /tmp/sharding-proxy.sock`	通过 Socket 文件进行连接
多选项组合	`./start.sh -a 127.0.0.1 -p 3310 -c /opt/proxy/conf -f -g -s /tmp/proxy.sock`	一次性指定多个选项

项目连接 ShardingSphere-Proxy 时，就像连接普通的 mysql 服务一样

示例项目：https://github.com/hanqunfeng/springbootchapter/tree/master/springboot3-demo/shardingsphere-demo/shardingsphere-proxy

后记

在 ShardingSphere-JDBC 和 ShardingSphere-Proxy 中，存在部分 MySQL(其它数据库也类似) 的 SQL 语法或功能目前还不完全支持的情况。ShardingSphere 在做 SQL 路由、改写、执行时，必须能解析 SQL 并理解其语义，但并不是 MySQL 的 100% 完全代理。因此，有些复杂或特定场景下的 SQL 可能无法被正确解析或执行。
支持良好的 SQL 语法
- 基础 DML
  SELECT、INSERT、UPDATE、DELETE
  基本的条件查询、排序、分页、分组、聚合函数（如 COUNT、SUM、AVG）
- DCL
  基本的事务语句：BEGIN、COMMIT、ROLLBACK
- DDL
  部分表结构管理语句：CREATE TABLE、ALTER TABLE、DROP TABLE
- 函数支持
  大部分常用的 MySQL 内置函数，如字符串、数学、日期函数

在 ShardingSphere + MySQL 下建议避免或谨慎使用的 SQL 清单

1. 跨分片复杂查询

SQL 场景	原因	建议处理方式
多表复杂 JOIN（特别是跨分片）	需要跨库数据聚合，性能差，可能报错	使用广播表/绑定表，或在应用层完成
跨分片子查询	SQL 路由困难，可能不支持	尽量改成单表查询或分步查询
跨分片的 GROUP BY / ORDER BY	在 Proxy 层聚合，性能很差	尽量避免，或控制数据量

1. DDL 相关

SQL 场景	原因	建议处理方式
`ALTER TABLE` 复杂变更	需在所有分片执行，可能执行失败	手动在每个分片库执行
`CREATE TRIGGER`、`PROCEDURE`	Proxy 不解析这些语法，直接透传不安全	尽量在单库手动创建
`CREATE FUNCTION`	同上	单库执行或应用层替代
`FULLTEXT INDEX`、`SPATIAL INDEX`	分片环境下无法自动维护索引	单库手动维护或避免使用

1. 文件导入导出

SQL 场景	原因	建议处理方式
`LOAD DATA INFILE`	Proxy 不支持文件系统直接访问	在分片库手动执行或通过应用导入
`SELECT ... INTO OUTFILE`	同上	应用层处理导出

1. MySQL 特有功能

功能	支持情况	建议
MySQL 8.0 公共表表达式（CTE）	部分支持	避免跨分片使用
窗口函数（`OVER() PARTITION BY`）	部分支持	避免跨分片大数据量使用
JSON 函数	基本支持	单表场景可用，跨分片需谨慎

分布式事务

场景	问题原因	建议处理方式
跨分片原生事务	MySQL 原生事务不支持跨库	使用 ShardingSphere XA / BASE
大事务 + 分布式事务	性能开销大	尽量控制事务范围