Spring data 把数据库常见操作都封装得差不多了,甚至连分页都为不用自己写代码,确实大大提高了码砖的效率,对于一般的需求,往Reposistory里加几个接口就妥当了.那么反过来说,如果有’不一般’的需求呢?
这就是矛盾的地方,ORM用它自己的一套概念给程序员洗脑,让你感觉不到数据库的存在,但实际上要达到这个目标很难,总归要和数据库直接打交道.
我最近碰到的一个问题就是索引失效,这个问题一开始我还没察觉到,因为我已经提前根据查询语句创建了好了索引,但是当我观察数据库的性能报告却发现查询效率很低,把相应的语句弄出来执行EXPLAIN
,居然是ALL
.
背景 entity
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 @Data @Entity @Table(name = "t_clan_tracking") public class ClanTracking { @Id @Column(name = "f_clan_tag") private String clan; @Column(name = "f_name",nullable = false) private String name; @Column(name = "f_score") private int score; @Column(name = "f_last_hit_time") @Temporal(TemporalType.TIMESTAMP) private Date lastHit; }
repositiry
1 2 3 public interface ClanTrackingRepository extends JpaRepository <ClanTracking,String> { ClanTracking findFirstByOrderByLastHitAscScoreDesc () ; }
以上的代码来自shufork
项目的coc-discovery
模块,由于这个模块扮演着类似爬虫的角色(定期去抓数据),为了尽量抓取有意义的数据,因此需要简单记录一下抓取时间(lastHit
)和对象权重(score
).每一次需要抓数据的时候就从数据库里面按照抓取时间和权重值找一个目标出来,然后去抓取它的详细信息.总之,findFirstByOrderByLastHitAscScoreDesc
这个查询的执行非常频繁.
使用(伪代码)
1 2 3 4 5 6 7 8 9 10 11 12 13 public ClanTracking retrieveOne () { ClanTracking found = clanTrackingRepository.findFirstByOrderByLastHitAscScoreDesc(); if (found != null ){ found.setLastHit(DateTimeUtil.utc().toDate()); clanTrackingRepository.save(found); } return found; } public void pull () { ClanTracking o = retrieveOne(); log.debug("tag = {},name = {}" ,o.getClan(),o.getName()); }
为了便于理解,这里写了个伪代码pull()
来表示抓数据的方法.
现象 程序跑起来后,打开MySql的性能报告看了一下,发现t_clan_tracking
表存在明显的查询性能问题.
Executed (#)
Errors (#)
Warnings (#)
Total Time
Max Time
Avg Time
17962
0
0
2700217844
2069254.11
150329.4
其对应的SQL是
1 2 3 4 5 6 7 8 9 10 11 SELECT `clantracki0_` . `f_clan_tag` AS `f_clan_t1_0_` , `clantracki0_` . `z_created_time` AS `z_create2_0_` , `clantracki0_` . `f_last_hit_time` AS `f_last_h3_0_` , `clantracki0_` . `z_modified_time` AS `z_modifi4_0_` , `clantracki0_` . `f_name` AS `f_name5_0_` , `clantracki0_` . `f_score` AS `f_score6_0_` , `clantracki0_` . `z_version` AS `z_versio7_0_` FROM `t_clan_tracking` `clantracki0_` ORDER BY `clantracki0_` . `f_last_hit_time` ASC , `clantracki0_` . `f_score` DESC LIMIT ?
很显然它就是ClanTrackingRepository
的findFirstByOrderByLastHitAscScoreDesc
了.
检查一下索引:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 mysql> show index from t_clan_tracking \G; *************************** 1. row *************************** Table: t_clan_tracking Non_unique: 0 Key_name: PRIMARY Seq_in_index: 1 Column_name: f_clan_tag Collation: A Cardinality: 105610 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: Index_comment: *************************** 2. row *************************** Table: t_clan_tracking Non_unique: 1 Key_name: idx_top_last_hit_score Seq_in_index: 1 Column_name: f_last_hit_time Collation: A Cardinality: 13886 Sub_part: NULL Packed: NULL Null: YES Index_type: BTREE Comment: Index_comment: *************************** 3. row *************************** Table: t_clan_tracking Non_unique: 1 Key_name: idx_top_last_hit_score Seq_in_index: 2 Column_name: f_score Collation: A Cardinality: 21919 Sub_part: NULL Packed: NULL Null: YES Index_type: BTREE Comment: Index_comment: *************************** 4. row *************************** Table: t_clan_tracking Non_unique: 1 Key_name: idx_top_last_hit_score Seq_in_index: 3 Column_name: f_clan_tag Collation: A Cardinality: 101087 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: Index_comment: *************************** 5. row *************************** Table: t_clan_tracking Non_unique: 1 Key_name: idx_top_last_hit_score Seq_in_index: 4 Column_name: f_name Collation: A Cardinality: 94932 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: Index_comment: 5 rows in set (0.00 sec)
查询条件中的列是有索引的,顺序也正确,不过取出的列太多了,导致索引没有起作用.而上面的代码说明实际的业务逻辑需要用到的字段只有
查询条件字段f_last_hit_time
和f_score
查询结果字段f_clan_tag
和f_name
使用EXPLAIN验证一下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 mysql> EXPLAIN SELECT * FROM t_clan_tracking ORDER BY f_last_hit_time ASC, f_score DESC LIMIT 1 \G; *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t_clan_tracking partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 105610 filtered: 100.00 Extra: Using filesort
Join Type 是ALL,MySql文档对于它的描述是
A full table scan is done for each combination of rows from the previous tables. This is normally not good if the table is the first table not marked const, and usually very bad in all other cases. Normally, you can avoid ALL by adding indexes that enable row retrieval from the table based on constant values or column values from earlier tables.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 mysql> EXPLAIN SELECT f_clan_tag,f_name FROM t_clan_tracking ORDER BY f_last_hit_time ASC, f_score DESC LIMIT 1 \G; *************************** 1. row *************************** id: 1 select_type: SIMPLE table: t_clan_tracking partitions: NULL type: index possible_keys: NULL key: idx_top_last_hit_score key_len: 2055 ref: NULL rows: 105610 filtered: 100.00 Extra: Using index; Using filesort
知道原因就好对症下药了,查阅sping data
的文档发现投影(prijections) 操作就可以解决这个问题.
改造后的代码 entity
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 @Data @Entity @Table(name = "t_clan_tracking") public class ClanTracking { @Id @Column(name = "f_clan_tag") private String clan; @Column(name = "f_name",nullable = false) private String name; @Column(name = "f_score") private int score; @Column(name = "f_last_hit_time") @Temporal(TemporalType.TIMESTAMP) private Date lastHit; public interface ClanTracker { String getClan () ; String getName () ; } }
repositiry
1 2 3 4 5 6 7 8 9 public interface ClanTrackingRepository extends JpaRepository <ClanTracking,String> { ClanTracking findFirstByOrderByLastHitAscScoreDesc () ; <T> T findFirstByOrderByLastHitAscScoreDesc (Class<T> type) ; @Modifying @Query("update ClanTracking u set u.lastHit = ?2 where u.clan = ?1") void updateLastHit (String clan,Date lastHit) ; }
使用(伪代码)
1 2 3 4 5 6 7 8 9 10 11 12 public ClanTracking.ClanTracker retrieveOne () { ClanTracking.ClanTracker found = clanTrackingRepository.findFirstByOrderByLastHitAscScoreDesc(ClanTracking.ClanTracker.class); if (found != null ){ clanTrackingRepository.updateLastHit(found.getClan(),DateTimeUtil.utc().toDate()); } return found; } public void pull () { ClanTracking.ClanTracker o = retrieveOne(); log.debug("tag = {},name = {}" ,o.getClan(),o.getName()); }
参考资料