Conditions under which PostgreSQL fails to read the checkpoint record during startup recovery

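1. First, read the checkpoint record that ControlFile->checkPoint points to.
2. If that read fails, a standby (StandbyMode) exits immediately with PANIC, while a primary retries with the checkpoint record that ControlFile->prevCheckPoint points to.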
StartupXLOG->
    |--checkPointLoc = ControlFile->checkPoint;
    |--record = ReadCheckpointRecord(xlogreader, checkPointLoc, 1, true);
    |-- if (record != NULL){
          ...
        }else if (StandbyMode){
            ereport(PANIC,(errmsg("could not locate a valid checkpoint record")));
        }else{
            checkPointLoc = ControlFile->prevCheckPoint;
            record = ReadCheckpointRecord(xlogreader, checkPointLoc, 2, true);
            if (record != NULL){
                InRecovery = true; // mark that recovery is entered from here on
            }else{
                ereport(PANIC,(errmsg("could not locate a valid checkpoint record")));
            }
        }
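
The branch structure above can be condensed into a small decision function. This is only a sketch of the control flow shown in the call tree, not PostgreSQL code: the names demo_choose_checkpoint and DemoOutcome are hypothetical, and the three booleans stand in for the results of the two ReadCheckpointRecord calls and for StandbyMode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome labels for the checkpoint lookup in StartupXLOG. */
typedef enum DemoOutcome {
    DEMO_USE_LATEST,    /* ControlFile->checkPoint was readable */
    DEMO_USE_PREVIOUS,  /* fell back to ControlFile->prevCheckPoint, InRecovery = true */
    DEMO_PANIC          /* "could not locate a valid checkpoint record" */
} DemoOutcome;

/*
 * latest_ok    : the read at ControlFile->checkPoint returned a record
 * standby_mode : stands in for StandbyMode
 * previous_ok  : the read at ControlFile->prevCheckPoint returned a record
 */
DemoOutcome
demo_choose_checkpoint(bool latest_ok, bool standby_mode, bool previous_ok)
{
    if (latest_ok)
        return DEMO_USE_LATEST;
    if (standby_mode)
        return DEMO_PANIC;      /* a standby never tries the previous checkpoint */
    return previous_ok ? DEMO_USE_PREVIOUS : DEMO_PANIC;
}
```

Note that a readable previous checkpoint does not help a standby: in this flow the fallback is attempted only when StandbyMode is false.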


I. Under what conditions is the checkpoint record that was read NULL (record == NULL)?

1. ControlFile->checkPoint % XLOG_BLCKSZ < SizeOfXLogShortPHD (the pointer falls inside a page header)
2. ReadRecord(xlogreader, ControlFile->checkPoint, LOG, true) returns NULL
3. ReadRecord returns record != NULL && record->xl_rmid != RM_XLOG_ID
4. ReadRecord returns record != NULL && info != XLOG_CHECKPOINT_SHUTDOWN && info != XLOG_CHECKPOINT_ONLINE, where info = record->xl_info & ~XLR_INFO_MASK
5. ReadRecord returns record != NULL && record->xl_tot_len != SizeOfXLogRecord + SizeOfXLogRecordDataHeaderShort + sizeof(CheckPoint)
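
The five checks above can be sketched as one function that reports which check rejected the record. This is a minimal illustration under stated assumptions, not the PostgreSQL implementation: DemoRecord is a cut-down stand-in for XLogRecord, every DEMO_ constant is an assumed value chosen for the demo (8192-byte WAL pages, a 24-byte short page header), and the expected record length is passed in by the caller instead of being computed from sizeof(CheckPoint).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed demo values; the real ones come from the PostgreSQL headers. */
#define DEMO_XLOG_BLCKSZ         8192u
#define DEMO_SHORT_PHD_SIZE      24u
#define DEMO_RM_XLOG_ID          0u
#define DEMO_RMGR_INFO_BITS      0xF0u  /* plays the role of ~XLR_INFO_MASK */
#define DEMO_CHECKPOINT_SHUTDOWN 0x00u
#define DEMO_CHECKPOINT_ONLINE   0x10u

/* Hypothetical cut-down record header with only the fields the checks use. */
typedef struct DemoRecord {
    uint32_t xl_tot_len;
    uint8_t  xl_info;
    uint8_t  xl_rmid;
} DemoRecord;

typedef enum DemoReject {
    DEMO_OK = 0,
    DEMO_BAD_OFFSET,   /* condition 1 */
    DEMO_READ_FAILED,  /* condition 2 */
    DEMO_BAD_RMID,     /* condition 3 */
    DEMO_BAD_INFO,     /* condition 4 */
    DEMO_BAD_LENGTH    /* condition 5 */
} DemoReject;

/* rec == NULL models "ReadRecord returned NULL". */
DemoReject
demo_check_checkpoint(uint64_t rec_ptr, const DemoRecord *rec,
                      uint32_t expected_len)
{
    unsigned info;

    if (rec_ptr % DEMO_XLOG_BLCKSZ < DEMO_SHORT_PHD_SIZE)
        return DEMO_BAD_OFFSET;
    if (rec == NULL)
        return DEMO_READ_FAILED;
    if (rec->xl_rmid != DEMO_RM_XLOG_ID)
        return DEMO_BAD_RMID;
    info = rec->xl_info & DEMO_RMGR_INFO_BITS;
    if (info != DEMO_CHECKPOINT_SHUTDOWN && info != DEMO_CHECKPOINT_ONLINE)
        return DEMO_BAD_INFO;
    if (rec->xl_tot_len != expected_len)
        return DEMO_BAD_LENGTH;
    return DEMO_OK;
}
```

The checks run in the listed order, so a pointer that lands inside a page header is rejected before any WAL is read at all.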

II. Conditions under which ReadRecord returns NULL

ReadRecord(xlogreader, ControlFile->checkPoint, LOG, true)
    |--record = XLogReadRecord(xlogreader, ControlFile->checkPoint, &errormsg);
    |-- 2.1 record==NULL && !StandbyMode
    |-- 2.2 record!=NULL && !tliInHistory(xlogreader->latestPageTLI, expectedTLEs)
    /*-----
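    note: whenever a WAL page is read and validated, latestPageTLI is set to that page's xlp_tli, the timeline of the first record on the page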
    XLogReaderValidatePageHeader
        -->xlogreader->latestPageTLI=hdr->xlp_tli;
    ------*/
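
Conditions 2.1 and 2.2 can be modeled as follows. This is a sketch only: the timeline history is reduced to a plain array of timeline IDs, the names demo_tli_in_history and demo_read_record are hypothetical, and the DEMO_RETRY outcome for "no record while in standby mode" is an assumption, since the list above only states when NULL is returned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum DemoReadResult {
    DEMO_RETURN_RECORD,
    DEMO_RETURN_NULL,
    DEMO_RETRY          /* assumption: a standby keeps trying instead of giving up */
} DemoReadResult;

/* Stand-in for tliInHistory(): is tli one of the expected timelines? */
bool
demo_tli_in_history(uint32_t tli, const uint32_t *expected, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (expected[i] == tli)
            return true;
    }
    return false;
}

/* latest_page_tli plays the role of xlogreader->latestPageTLI. */
DemoReadResult
demo_read_record(bool record_found, bool standby_mode,
                 uint32_t latest_page_tli,
                 const uint32_t *expected, size_t n)
{
    if (!record_found)
        return standby_mode ? DEMO_RETRY : DEMO_RETURN_NULL;   /* 2.1 */
    if (!demo_tli_in_history(latest_page_tli, expected, n))
        return DEMO_RETURN_NULL;                                /* 2.2 */
    return DEMO_RETURN_RECORD;
}
```

The second check explains why a record that parses correctly can still be rejected: the page it sits on belongs to a timeline outside the expected history.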

III. Under what conditions does XLogReadRecord return NULL when reading the checkpoint?

XLogReadRecord(xlogreader, ControlFile->checkPoint, &errormsg)
    targetPagePtr = ControlFile->checkPoint - (ControlFile->checkPoint % XLOG_BLCKSZ);
    targetRecOff = ControlFile->checkPoint % XLOG_BLCKSZ;
    readOff = ReadPageInternal(state,targetPagePtr, Min(targetRecOff + SizeOfXLogRecord, XLOG_BLCKSZ));
    pageHeaderSize = XLogPageHeaderSize((XLogPageHeader) state->readBuf);
    record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
    total_len = record->xl_tot_len;
    -------------
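    1. readOff < 0
    2. 0 < targetRecOff < pageHeaderSize
    3. (((XLogPageHeader) state->readBuf)->xlp_info & XLP_FIRST_IS_CONTRECORD) && targetRecOff == pageHeaderSize
      The page begins with the continuation of a record from the previous page, and the checkpoint pointer lands exactly at the end of the page header
    4. targetRecOff <= XLOG_BLCKSZ - SizeOfXLogRecord &&
      !ValidXLogRecordHeader(state, ControlFile->checkPoint, state->ReadRecPtr, record,randAccess)
      ---(record->xl_tot_len < SizeOfXLogRecord || record->xl_rmid > RM_MAX_ID || record->xl_prev != state->ReadRecPtr)
      ---when randAccess is true, as it is when reading at an explicit pointer such as the checkpoint location, the prev-link test is record->xl_prev >= RecPtr instead
    5. targetRecOff > XLOG_BLCKSZ - SizeOfXLogRecord && total_len < SizeOfXLogRecord
    6. total_len > state->readRecordBufSize && !allocate_recordbuf(state, total_len)
      If the record is corrupted and total_len is very large, allocate_recordbuf has to enlarge state->readRecordBuf, and that allocation may fail, so the read is abandoned
      The record's checksum is verified only after the complete record has been read, so a bogus total_len is not caught before this point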
    -------------
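
The offset arithmetic at the top of XLogReadRecord, together with the purely positional tests in conditions 2, 4 and 5, can be tried out with the sketch below. It is an illustration under assumptions: the DEMO_ constants (8192-byte pages, a 24-byte record header) are assumed values, and DemoTarget and the demo_ functions are hypothetical names that mirror targetPagePtr, targetRecOff and the requested read length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed demo values; the real ones come from the PostgreSQL headers. */
#define DEMO_XLOG_BLCKSZ      8192u
#define DEMO_SIZE_OF_XLOG_REC 24u

typedef struct DemoTarget {
    uint64_t page_ptr;  /* mirrors targetPagePtr */
    uint32_t rec_off;   /* mirrors targetRecOff */
    uint32_t req_len;   /* length requested from ReadPageInternal */
} DemoTarget;

/* Split a record pointer into page start, in-page offset, and read length. */
DemoTarget
demo_locate(uint64_t rec_ptr)
{
    DemoTarget t;
    uint32_t   want;

    t.rec_off = (uint32_t) (rec_ptr % DEMO_XLOG_BLCKSZ);
    t.page_ptr = rec_ptr - t.rec_off;
    want = t.rec_off + DEMO_SIZE_OF_XLOG_REC;
    t.req_len = want < DEMO_XLOG_BLCKSZ ? want : DEMO_XLOG_BLCKSZ;
    return t;
}

/* Condition 2: the pointer lands inside the page header. */
bool
demo_offset_inside_page_header(uint32_t rec_off, uint32_t page_header_size)
{
    return rec_off > 0 && rec_off < page_header_size;
}

/*
 * Selects between conditions 4 and 5: true when the fixed-size record
 * header lies entirely on this page, so it can be validated right away.
 */
bool
demo_header_fits_on_page(uint32_t rec_off)
{
    return rec_off <= DEMO_XLOG_BLCKSZ - DEMO_SIZE_OF_XLOG_REC;
}
```

When the header does not fit on the page, only total_len can be checked at this stage, which is why condition 5 is so much weaker than condition 4.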

IV. Conditions under which the readOff returned by ReadPageInternal is less than 0

ReadPageInternal(state,targetPagePtr, Min(targetRecOff + SizeOfXLogRecord, XLOG_BLCKSZ))
    1. When a WAL segment file is read for the first time, its first page is read first (readLen = state->read_page), and readLen < 0
    2. readLen > 0 && !XLogReaderValidatePageHeader(state, targetSegmentPtr, state->readBuf)
    --
    3. Reading the page that contains the checkpoint (readLen = state->read_page) gives readLen < 0
    4. readLen > 0 && readLen <= SizeOfXLogShortPHD
    5. !XLogReaderValidatePageHeader(state, pageptr, (char *) hdr)

V. When does XLogPageRead return a value < 0?
