为了最大程度避免数据写入时io瓶颈带来的性能问题,MySQL采用了这样一种缓存机制:当query修改数据库内数据时,InnoDB先将该数据从磁盘读取到内存中,修改内存中的数据拷贝,并将该修改行为持久化到磁盘上的事务日志(先写redo log buffer,再定期批量写入),而不是每次都直接将修改过的数据记录到硬盘内,等事务日志持久化完成之后,内存中的脏数据可以慢慢刷回磁盘,称之为Write-Ahead Logging。事务日志采用的是追加写入,顺序io会带来更好的性能优势。
为了避免脏数据刷回磁盘过程中,掉电或系统故障带来的数据丢失问题,InnoDB采用事务日志(redo log)来解决该问题。
【redo log】
用于在实例故障恢复时,继续那些已经commit但数据尚未完全回写到磁盘的事务。
The redo log is a disk-based data structure used during crash recovery to correct data written by incomplete transactions.
通常会初始化2个或更多的 ib_logfile 存储 redo log,由参数 innodb_log_files_in_group 确定个数,命名从 ib_logfile0 开始,依次写满 ib_logfile 并顺序重用(in a circular fashion)。如果最后1个 ib_logfile 被写满,而第一个ib_logfile 中所有记录的事务对数据的变更已经被持久化到磁盘中,将清空并重用之。
在写入量比较高的情况下,redo log 的大小将显著地影响写入性能。
innodb_log_file_size 用来控制 ib_logfile 的大小。5.5版本及以前:默认5MB,最大4G。
所有事务日志累加大小不超过4G,事务日志过大,checkpoint会减少,在节省磁盘io的同时,大的事务日志也意味着数据库crash后,恢复起来较慢。
官方文档建议innodb_log_file_size的设置可参考 show engine innodb status \G
Log sequence number -Last checkpoint < (innodb_log_files_in_group * innodb_log_file_size ) * 0.75
【undo log】
记录了数据修改的前镜像。存放于ibdata中。(提一句,DDL操作会修改数据字典,该信息也存放在ibdata中)
用于在实例故障恢复时,借助undo log将尚未commit的事务,回滚到事务开始前的状态。
An undo log is a collection of undo log records associated with a single transaction. An undo log record contains information about how to undo the latest change by a transaction to a clustered index record. If another transaction needs to see the original data (as part of a consistent read operation), the unmodified data is retrieved from the undo log records. Undo logs exist within undo log segments, which are contained within rollback segments. Rollback segments are physically part of the system tablespace. For related information, see Section 14.6, “InnoDB Multi-Versioning”.
The InnoDB crash recovery process consists of several steps:(指的是MySQL server crash,不包括hardware failure 或者 server system error)
Redo log application
Redo log application is the first step and is performed during initialization, before accepting any connections. If all changes are flushed from the buffer pool to the tablespaces (ibdata* and *.ibd files) at the time of the shutdown or crash, redo log application is skipped. InnoDB also skips redo log application if redo log files are missing at startup.
Removing redo logs to speed up recovery is not recommended, even if some data loss is acceptable. Removing redo logs should only be considered after a clean shutdown, with innodb_fast_shutdown set to 0 or 1.
Roll back of incomplete transactions
Incomplete transactions are any transactions that were active at the time of crash or fast shutdown. The time it takes to roll back an incomplete transaction can be three or four times the amount of time a transaction is active before it is interrupted, depending on server load.
You cannot cancel transactions that are being rolled back. In extreme cases, when rolling back transactions is expected to take an exceptionally long time, it may be faster to start InnoDB with an innodb_force_recovery setting of 3 or greater. See Section 14.23.2, “Forcing InnoDB Recovery”.
Change buffer merge
Applying changes from the change buffer (part of the system tablespace) to leaf pages of secondary indexes, as the index pages are read to the buffer pool.
Purge
Deleting delete-marked records that are no longer visible to active transactions.
The steps that follow redo log application do not depend on the redo log (other than for logging the writes) and are performed in parallel with normal processing. Of these, only rollback of incomplete transactions is special to crash recovery. The insert buffer merge and the purge are performed during normal processing.
After redo log application, InnoDB attempts to accept connections as early as possible, to reduce downtime. As part of crash recovery, InnoDB rolls back transactions that were not committed or in XA PREPARE state when the server crashed. The rollback is performed by a background thread, executed in parallel with transactions from new connections. Until the rollback operation is completed, new connections may encounter locking conflicts with recovered transactions.