The AOF persistence logs every write operation received by the server, that will be played again at server startup, reconstructing the original dataset. Commands are logged using the same format as the Redis protocol itself, in an append-only fashion. Redis is able to rewrite the log in the background when it gets too big.
RDB 持久化是全量备份,比较耗时,所以Redis就提供了一种更为高效地AOF(Append Only-file)持久化方案,简单描述它的工作原理:AOF日志存储的是Redis服务器指令序列,AOF只记录对内存进行修改的指令记录。
在服务器从新启动时,Redis就会利用 AOF 日志中记录的这些操作从新构建原始数据集。
Redis会在收到客户端修改指令后,进行参数修改、逻辑处理,如果没有问题,就立即将该指令文本存储到 AOF 日志中,也就是说,先执行指令才将日志存盘。这点不同于 leveldb、hbase等存储引擎,它们都是先存储日志再做逻辑处理。
1. AOF 的触发配置AOF也有不同的触发方案,这里简要描述以下三种触发方案:
always:每次发生数据修改就会立即记录到磁盘文件中,这种方案的完整性好但是IO开销很大,性能较差;
everysec:在每一秒中进行同步,速度有所提升。但是如果在一秒内宕机的话可能失去这一秒内的数据;
no:默认配置,即不使用 AOF 持久化方案。
可以在redis.config中进行配置,appendonly no改换为yes,再通过注释或解注释appendfsync配置需要的方案:
############################## APPEND ONLY MODE ############################### # By default Redis asynchronously dumps the dataset on disk. This mode is # good enough in many applications, but an issue with the Redis process or # a power outage may result into a few minutes of writes lost (depending on # the configured save points). # # The Append Only File is an alternative persistence mode that provides # much better durability. For instance using the default data fsync policy # (see later in the config file) Redis can lose just one second of writes in a # dramatic event like a server power outage, or a single write if something # wrong with the Redis process itself happens, but the operating system is # still running correctly. # # AOF and RDB persistence can be enabled at the same time without problems. # If the AOF is enabled on startup Redis will load the AOF, that is the file # with the better durability guarantees. # # Please check for more information. appendonly no # The name of the append only file (default: "appendonly.aof") appendfilename "appendonly.aof" # ... 省略 # appendfsync always appendfsync everysec # appendfsync no 2. AOF 重写机制随着Redis的运行,AOF的日志会越来越长,如果实例宕机重启,那么重放整个AOF将会变得十分耗时,而在日志记录中,又有很多无意义的记录,比如我现在将一个数据 incr 一千次,那么就不需要去记录这1000次修改,只需要记录最后的值即可。所以就需要进行 AOF 重写。
Redis 提供了bgrewriteaof指令用于对AOF日志进行重写,该指令运行时会开辟一个子进程对内存进行遍历,然后将其转换为一系列的 Redis 的操作指令,再序列化到一个日志文件中。完成后再替换原有的AOF文件,至此完成。
同样的也可以在redis.config中对重写机制的触发进行配置:
通过将no-appendfsync-on-rewrite设置为yes,开启重写机制;auto-aof-rewrite-percentage 100意为比上次从写后文件大小增长了100%再次触发重写;
auto-aof-rewrite-min-size 64mb意为当文件至少要达到64mb才会触发制动重写。
# ... 省略 no-appendfsync-on-rewrite no # Automatic rewrite of the append only file. # ... 省略 auto-aof-rewrite-percentage 100 auto-aof-rewrite-min-size 64mb重写也是会耗费资源的,所以当磁盘空间足够的时候,这里可以将 64mb 调整更大写,降低重写的频率,达到优化效果。
3. fsync 函数再将AOF配置为appendfsync everysec之后,Redis在处理一条命令后,并不直接立即调用write将数据写入 AOF 文件,而是先将数据写入AOF buffer(server.aof_buf)。调用write和命令处理是分开的,Redis只在每次进入epoll_wait之前做 write 操作。
/* Write the append only file buffer on disk. * * Since we are required to write the AOF before replying to the client, * and the only way the client socket can get a write is entering when the * the event loop, we accumulate all the AOF writes in a memory * buffer and write it on disk using this function just before entering * the event loop again. * * About the 'force' argument: * * When the fsync policy is set to 'everysec' we may delay the flush if there * is still an fsync() going on in the background thread, since for instance * on Linux write(2) will be blocked by the background fsync anyway. * When this happens we remember that there is some aof buffer to be * flushed ASAP, and will try to do that in the serverCron() function. * * However if force is set to 1 we'll write regardless of the background * fsync. */ #define AOF_WRITE_LOG_ERROR_RATE 30 /* Seconds between errors logging. */ void flushAppendOnlyFile(int force) { // aofWrite 调用 write 将AOF buffer写入到AOF文件,处理了ENTR,其它没什么 ssize_t nwritten = aofWrite(server.aof_fd,server.aof_buf,sdslen(server.aof_buf)); /* Handle the AOF write error. */ if (server.aof_fsync == AOF_FSYNC_ALWAYS) { /* We can't recover when the fsync policy is ALWAYS since the * reply for the client is already in the output buffers, and we * have the contract with the user that on acknowledged write data * is synced on disk. */ serverLog(LL_WARNING,"Can't recover from AOF write error when the AOF fsync policy is 'always'. Exiting..."); exit(1); } else { return; /* We'll try again on the next call... */ } else { /* Successful write(2). If AOF was in error state, restore the * OK state and log the event. */ } /* Perform the fsync if needed. */ if (server.aof_fsync == AOF_FSYNC_ALWAYS) { // redis_fsync是一个宏,Linux实际为fdatasync,其它为fsync // 所以最好不要将redis.conf中的appendfsync设置为always,这极影响性能 redis_fsync(server.aof_fd); /* Let's try to get this data on the disk */ } else if ((server.aof_fsync == AOF_FSYNC_EVERYSEC && server.unixtime > server.aof_last_fsync)) { // 如果已在sync状态,则不再重复 // BIO线程会间隔设置sync_in_progress // if (server.aof_fsync == AOF_FSYNC_EVERYSEC) // sync_in_progress = bioPendingJobsOfType(BIO_AOF_FSYNC) != 0; if (!sync_in_progress) // everysec性能并不那么糟糕,因为它:后台方式执行fsync。 // Redis并不是严格意义上的单线程,实际上它创建一组BIO线程,专门处理阻塞和慢操作 // 这些操作就包括FSYNC,另外还有关闭文件和内存的free两个操作。 // 不像always,EVERYSEC模式并不立即调用fsync, // 而是将这个操作丢给了BIO线程异步执行, // BIO线程在进程启动时被创建,两者间通过bio_jobs和bio_pending两个 // 全局对象交互,其中主线程负责写,BIO线程负责消费。 aof_background_fsync(server.aof_fd); server.aof_last_fsync = server.unixtime; } }