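Introduction: This post traces how Chinese text written from Java to MySQL turned into garbled characters in the database, and uses that investigation to show how such a problem can be analyzed and solved step by step.
1. Development environment
Spring Boot 1.4.0.RELEASE, JDK 1.8, MySQL 5.7, CentOS 7
2. Problem description
When the Java code saves Chinese text to the database, the stored value shows up as ???, which is a typical sign of an encoding problem. The remaining question is which link in the chain is at fault.
3. Analysis and reasoning
Garbling can be introduced at several points along the path: the Java code, the IDE, the operating system the code runs on, the MySQL connection, the operating system hosting the database, and the database itself. UTF-8 is used as the common encoding throughout.
We check each candidate in turn: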
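A. IDE encoding: checked and correct (UTF-8).
B. Operating system used for development
Confirmed to be the Chinese edition of Windows 7; this should not be the root cause.
C. MySQL connection driver
The connection URL currently in use is: jdbc:log4jdbc:mysql://localhost:3306/mealsystem?useUnicode=true&characterEncoding=utf-8
The parameters after the question mark enable Unicode support and set the character encoding to UTF-8.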
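To double-check what a connection URL actually requests, the query string can be split into its parameters. The following is a minimal sketch in plain Java; the class and method names (JdbcUrlParams, parse, requestsUtf8) are hypothetical helpers written for this post, not part of any JDBC driver, and the parsing is deliberately simplistic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JdbcUrlParams {

    // Splits the part after '?' of a JDBC URL into key/value pairs.
    public static Map<String, String> parse(String jdbcUrl) {
        Map<String, String> params = new LinkedHashMap<>();
        int q = jdbcUrl.indexOf('?');
        if (q < 0 || q == jdbcUrl.length() - 1) {
            return params; // the URL carries no parameters
        }
        for (String pair : jdbcUrl.substring(q + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(key, value);
        }
        return params;
    }

    // True when the URL sets useUnicode=true and a UTF-8 characterEncoding.
    public static boolean requestsUtf8(String jdbcUrl) {
        Map<String, String> p = parse(jdbcUrl);
        String encoding = p.getOrDefault("characterEncoding", "");
        return "true".equalsIgnoreCase(p.get("useUnicode"))
                && encoding.replace("-", "").equalsIgnoreCase("utf8");
    }

    public static void main(String[] args) {
        String url = "jdbc:log4jdbc:mysql://localhost:3306/mealsystem"
                + "?useUnicode=true&characterEncoding=utf-8";
        System.out.println(parse(url));
        System.out.println("requests UTF-8: " + requestsUtf8(url));
    }
}
```

A check like this only confirms what the URL asks for; it says nothing about how the server side is configured.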
D. Operating system hosting the database
[root@flybird ~]# lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.2.1511 (Core)
Release: 7.2.1511
Codename: Core
[root@flybird ~]# uname -a
Linux flybird 3.10.0-327.3.1.el7.x86_64 #1 SMP Wed Dec 9 14:09:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@flybird ~]# cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@flybird ~]#
E. Encoding and locale of the operating system:
[root@flybird ~]# locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
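Confirmed: no problem here, every setting uses UTF-8.
F. Database table analysis:
The table test has five columns: id, name, created_time, updated_time, version.
The character set of the table was checked and confirmed to be UTF-8.
The character set of the target column name was checked in the same way.
So the encoding of the name column itself is not the problem.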
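The column check can also be done from code instead of a GUI client. The sketch below reads CHARACTER_SET_NAME from information_schema.COLUMNS over JDBC. It assumes a MySQL JDBC driver is on the classpath and that the schema is named mealsystem as in the connection URL; the class name ColumnCharsetCheck, its methods, and the command-line arguments are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Locale;

public class ColumnCharsetCheck {

    // True for MySQL's UTF-8 character sets: utf8, utf8mb3 and utf8mb4.
    public static boolean isUtf8Family(String charsetName) {
        if (charsetName == null) {
            return false;
        }
        String cs = charsetName.toLowerCase(Locale.ROOT);
        return cs.equals("utf8") || cs.equals("utf8mb3") || cs.equals("utf8mb4");
    }

    // Reads the character set of one column; returns null if the column is not found.
    public static String columnCharset(Connection conn, String schema,
                                       String table, String column) throws SQLException {
        String sql = "SELECT CHARACTER_SET_NAME FROM information_schema.COLUMNS "
                + "WHERE TABLE_SCHEMA = ? AND TABLE_NAME = ? AND COLUMN_NAME = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, schema);
            ps.setString(2, table);
            ps.setString(3, column);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: ColumnCharsetCheck <jdbc-url> <user> <password>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            // Schema, table and column names are taken from this post.
            String cs = columnCharset(conn, "mealsystem", "test", "name");
            System.out.println("test.name charset = " + cs + ", UTF-8: " + isUtf8Family(cs));
        }
    }
}
```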
G. Spring Boot Java code analysis
Definition of TestEntity:
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Table;

@Entity
@Table(name="test")
public class TestEntity extends BaseEntity {

    private static final long serialVersionUID = -4437451262794760844L;

    @Column
    private String name;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
}
Definition of the DAO, TestRepository.java:
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.stereotype.Repository;
import com.rain.wx.meal.model.TestEntity;
@Repository
public interface TestRepository extends JpaRepository<TestEntity, Long> {
}
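Test code:
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.ActiveProfiles;
import org.springframework.test.context.junit4.SpringRunner;
import com.rain.wx.meal.model.TestEntity;
// TestRepository must be imported as well; its package is not shown in this post.

@RunWith(SpringRunner.class)
@SpringBootTest
@ActiveProfiles("dev")
public class TestEntityTest {

    @Autowired
    private TestRepository testRepo;

    @Test
    public void testEntity() {
        TestEntity test = new TestEntity();
        test.setName("我的天空");
        // Write the Chinese text, then read the same row back by id.
        test = testRepo.save(test);
        test = testRepo.findOne(test.getId());
        System.out.println("tst info:" + test);
    }
}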
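Because the IDE is already set to UTF-8, no extra transcoding should be needed in the code. Transcoding was nevertheless tried at the code level with utf-8, gb2312, gbk, iso8859_1 and other encodings, and the stored value was still garbled every time.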
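As a side note on why transcoding inside the application rarely helps: a Java String is already Unicode in memory, so encoding it with one charset and decoding with another can only damage it. The sketch below illustrates this with charsets from java.nio.charset.StandardCharsets; the class name TranscodeDemo and the method recode are hypothetical, and the exact conversions tried in the original investigation are not shown in this post.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TranscodeDemo {

    // Encodes the text with one charset and decodes the resulting bytes with another.
    public static String recode(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        String original = "我的天空";

        // Same charset on both sides: the text survives unchanged.
        String same = recode(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("UTF-8 -> UTF-8 intact: " + same.equals(original));

        // Mismatched charsets: each UTF-8 byte is misread as one Latin-1 character.
        String damaged = recode(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8 -> ISO-8859-1 intact: " + damaged.equals(original));
        System.out.println("length after misreading: " + damaged.length());
    }
}
```

If the string is correct before it leaves the JVM, as it is here, the fault has to lie further down the path, in the connection or in the database.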
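4. Verification with a MySQL client
Using a client tool such as Workbench or Navicat, open the target table test, type Chinese text into the name column by hand, save, and query again: the value is still Chinese. Querying these client-entered rows from the Java code also returns the Chinese text correctly.
This correct result confirms that reading from the database works; the path that is still broken is the one that writes Chinese text.
5. Focusing on the database itself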
With the operating system encodings already checked, the character set settings of the database itself need to be checked as well.
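One way to inspect the server-side settings from Java is to run SHOW VARIABLES LIKE 'character_set%' over JDBC and flag every variable that is not a UTF-8 character set. This is only a sketch: it assumes a MySQL JDBC driver on the classpath, and the class name ServerCharsetCheck, its methods, and the command-line arguments are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ServerCharsetCheck {

    // Runs SHOW VARIABLES and collects the character set related rows.
    public static Map<String, String> readCharsetVariables(Connection conn) throws SQLException {
        Map<String, String> vars = new LinkedHashMap<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SHOW VARIABLES LIKE 'character_set%'")) {
            while (rs.next()) {
                vars.put(rs.getString(1), rs.getString(2)); // Variable_name, Value
            }
        }
        return vars;
    }

    // Lists the variables whose value is not a UTF-8 character set.
    // character_set_filesystem (normally "binary") and character_sets_dir (a path) are skipped.
    public static List<String> nonUtf8(Map<String, String> vars) {
        List<String> suspects = new ArrayList<>();
        for (Map.Entry<String, String> e : vars.entrySet()) {
            String name = e.getKey();
            if (name.equals("character_set_filesystem") || name.equals("character_sets_dir")) {
                continue;
            }
            String value = e.getValue();
            if (value == null || !value.toLowerCase().startsWith("utf8")) {
                suspects.add(name);
            }
        }
        return suspects;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: ServerCharsetCheck <jdbc-url> <user> <password>");
            return;
        }
        try (Connection conn = DriverManager.getConnection(args[0], args[1], args[2])) {
            Map<String, String> vars = readCharsetVariables(conn);
            vars.forEach((k, v) -> System.out.println(k + " = " + v));
            System.out.println("not UTF-8: " + nonUtf8(vars));
        }
    }
}
```

Note that the session-level values seen this way depend on how the connection itself was negotiated, so the output reflects the view of this particular connection.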
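The check showed that character_set_server was set to latin1, and that turned out to be the source of the problem. With the cause largely confirmed, the next step is to fix it.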
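Why a latin1 setting ends in question marks can be illustrated in plain Java: a single-byte Latin character set has no code for Chinese characters, so each one is replaced by ? when the text is converted, and the original cannot be recovered afterwards. The sketch below uses Java's ISO-8859-1 as a stand-in for MySQL's latin1 (the two are close but not identical), so it illustrates the lossy conversion rather than reproducing what the server does internally. The class name Latin1Demo and the method throughLatin1 are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Latin1Demo {

    // Encodes the text as ISO-8859-1 and decodes it again.
    // Characters outside that charset are replaced by '?' during encoding.
    public static String throughLatin1(String text) {
        byte[] latin1Bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "我的天空";
        boolean encodable = StandardCharsets.ISO_8859_1.newEncoder().canEncode(original);
        System.out.println("encodable in ISO-8859-1: " + encodable); // false
        System.out.println(throughLatin1(original));                 // ????
        System.out.println(throughLatin1("abc"));                    // abc
    }
}
```

The conversion yields one ? per original character, which is consistent with the symptom seen in the table, and it also explains why no amount of decoding on the read side can bring the text back.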
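6. The fix
Change the encoding of character_set_server:
>> set global character_set_server = utf8;
After restarting the MySQL server, the change unfortunately had not taken effect, and at the time the reason was unclear. A likely explanation is that a value set with SET GLOBAL lives only in the running server and is not persisted, so the restart reset character_set_server to its configured default.
So we take a different approach and set the database encoding at startup in /etc/my.cnf: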
[client] # added: client character set
default-character-set=utf8
[mysql] # added: default character set for the mysql client
default-character-set=utf8
[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
# Recommended in standard MySQL setup
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
# added: character set settings for character_set_server
init-connect='SET NAMES utf8'
character-set-server = utf8
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
The following entries were added under [mysqld] to set the character set of the MySQL server at startup:
init-connect='SET NAMES utf8'
character-set-server = utf8
Then restart the MySQL service.
Re-running the test code produced the expected result:
2016-08-31 16:26:27.613 INFO 12556 --- [ main] jdbc.audit : 4. Connection.getWarnings() returned null
2016-08-31 16:26:27.614 INFO 12556 --- [ main] jdbc.audit : 4. Connection.clearWarnings() returned
2016-08-31 16:26:27.615 INFO 12556 --- [ main] jdbc.audit : 4. Connection.clearWarnings() returned
tst info:com.rain.wx.meal.model.TestEntity@578198d9[
name=我的天空
id=7
version=0
createdTime=<null>
updatedTime=<null>
]
2016-08-31 16:26:27.656 INFO 12556 --- [ Thread-2] o.s.w.c.s.GenericWebApplicationContext : Closing org.springframework.web.context.support.GenericWebApplicationContext@71687585: startup date [Wed Aug 31 16:26:08 CST 2016]; root of context hierarchy
2016-08-31 16:26:27.670 INFO 12556 --- [ Thread-2] j.LocalContainerEntityManagerFactoryBean : Closing JPA EntityManagerFactory for persistence unit 'default'
2016-08-31 16:26:27.677 INFO 12556 --- [ Thread-2] jdbc.connection : 1. Connection closed
2016-08-31 16:26:27.677 INFO 12556 --- [ Thread-2] jdbc.audit : 1. Connection.close() returned
2016-08-31 16:26:27.679 INFO 12556 --- [ Thread-2] jdbc.connection : 2. Connection closed
2016-08-31 16:26:27.680 INFO 12556 --- [ Thread-2] jdbc.audit : 2. Connection.close() returned
2016-08-31 16:26:27.680 INFO 12556 --- [ Thread-2] jdbc.connection : 3. Connection closed
2016-08-31 16:26:27.680 INFO 12556 --- [ Thread-2] jdbc.audit : 3. Connection.close() returned
2016-08-31 16:26:27.682 INFO 12556 --- [ Thread-2] jdbc.connection : 5. Connection closed
2016-08-31 16:26:27.683 INFO 12556 --- [ Thread-2] jdbc.audit : 5. Connection.close() returned
2016-08-31 16:26:27.684 INFO 12556 --- [ Thread-2] jdbc.audit : 4. PreparedStatement.close() returned
2016-08-31 16:26:27.685 INFO 12556 --- [ Thread-2] jdbc.audit : 4. PreparedStatement.close() returned
2016-08-31 16:26:27.685 INFO 12556 --- [ Thread-2] jdbc.connection : 4. Connection closed
2016-08-31 16:26:27.686 INFO 12556 --- [ Thread-2] jdbc.audit : 4. Connection.close() returned
2016-08-31 16:26:27.687 INFO 12556 --- [ Thread-2] com.alibaba.druid.pool.DruidDataSource : {dataSource-1} closed
7. References
- http://stackoverflow.com/questions/3513773/change-mysql-default-character-set-to-utf-8-in-my-cnf
- http://www.cnblogs.com/-1185767500/articles/3106194.html
- https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_character_set_database