多版本并发控制(multiversion concurrency control,即MVCC): 指的是一种提高并发的技术。最早期的数据库系统,只有读读之间可以并发,读写、写读、写写都要阻塞。引入多版本之后,只有写写之间相互阻塞,其他三种操作都可以并行,这样大幅度提高了InnoDB的并发性能。在内部实现中,InnoDB通过undo log保存每条数据的多个版本,并且能够提供数据历史版本给用户读,每个事务读到的数据版本可能是不一样的。在同一个事务中,用户只能看到该事务创建快照之前已经提交的修改和该事务本身做的修改。
简单来说,MVCC表达的是维持一个数据的多个版本,使得读写操作没有冲突这么一个思想。
MVCC在read committed和repeatable read两个事务隔离级别下工作。
隐藏字段Internally, InnoDB adds three fields to each row stored in the database. A 6-byte DB_TRX_ID field indicates the transaction identifier for the last transaction that inserted or updated the row. Also, a deletion is treated internally as an update where a special bit in the row is set to mark it as deleted. Each row also contains a 7-byte DB_ROLL_PTR field called the roll pointer. The roll pointer points to an undo log record written to the rollback segment. If the row was updated, the undo log record contains the information necessary to rebuild the content of the row before it was updated. A 6-byte DB_ROW_ID field contains a row ID that increases monotonically as new rows are inserted. If InnoDB generates a clustered index automatically, the index contains row ID values. Otherwise, the DB_ROW_ID column does not appear in any index.
InnoDB存储引擎在每行数据的后面添加了三个隐藏字段,如下图所示:
DB_TRX_ID(6字节):表示最近一次对本记录行做修改(insert或update)的事务ID。至于delete操作,InnoDB认为是一个update操作,不过会更新一个另外的删除位,将行标识为deleted。并非真正删除。
DB_ROLL_PTR(7字节):回滚指针,指向当前记录行的undo log信息。
DB_ROW_ID(6字节):随着新行插入而单调递增的行ID。当表没有主键或唯一非空索引时,InnoDB就会使用这个行ID自动产生聚集索引。前文《一文读懂MySQL的索引结构及查询优化》中也有所提及。这个DB_ROW_ID跟MVCC关系不大。
undo logundo log中存储的是老版本数据,当一个事务需要读取记录行时,如果当前记录行不可见,可以顺着undo log链表找到满足其可见性条件的记录行版本。
对数据的变更操作主要包括insert/update/delete,在InnoDB中,undo log分为如下两类:
insert undo log: 事务对insert新记录时产生的undo log, 只在事务回滚时需要, 并且在事务提交后就可以立即丢弃。
update undo log: 事务对记录进行delete和update操作时产生的undo log,不仅在事务回滚时需要,快照读也需要,只有当数据库所使用的快照中不涉及该日志记录,对应的回滚日志才会被purge线程删除。
Purge线程:为了实现InnoDB的MVCC机制,更新或者删除操作都只是设置一下旧记录的deleted_bit,并不真正将旧记录删除。为了节省磁盘空间,InnoDB有专门的purge线程来清理deleted_bit为true的记录。purge线程自己也维护了一个read view,如果某个记录的deleted_bit为true,并且DB_TRX_ID相对于purge线程的read view可见,那么这条记录一定是可以被安全清除的。
不同事务或者相同事务的对同一记录行的修改形成的undo log如下图所示:
可见链首就是最新的记录,链尾就是最早的旧记录。
Read View结构Read View(读视图)提供了某一时刻事务系统的快照,主要是用来做可见性判断的, 里面保存了"对本事务不可见的其他活跃事务"。
MySQL5.7源码中对Read View定义如下(详情见https://github.com/mysql/mysql-server/blob/5.7/storage/innobase/include/read0types.h#L306):
class ReadView { private: /** The read should not see any transaction with trx id >= this value. In other words, this is the "high water mark". */ trx_id_t m_low_limit_id; /** The read should see all trx ids which are strictly smaller (<) than this value. In other words, this is the low water mark". */ trx_id_t m_up_limit_id; /** trx id of creating transaction, set to TRX_ID_MAX for free views. */ trx_id_t m_creator_trx_id; /** Set of RW transactions that was active when this snapshot was taken */ ids_t m_ids; /** The view does not need to see the undo logs for transactions whose transaction number is strictly smaller (<) than this value: they can be removed in purge if not needed by other views */ trx_id_t m_low_limit_no; /** AC-NL-RO transaction view that has been "closed". */ bool m_closed; typedef UT_LIST_NODE_T(ReadView) node_t; /** List of read views in trx_sys */ byte pad1[64 - sizeof(node_t)]; node_t m_view_list; };重点解释下面几个变量(建议仔细看上面的源码注释,以下仅为个人理解,有理解不到位的地方欢迎指出(●´ω`●)):