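1) The readPoint is set when the scanner is constructed from the Scan object. scan.getIsolationLevel() returns either READ_UNCOMMITTED or READ_COMMITTED. Only for READ_COMMITTED is the readPoint of the current scanner thread set through MultiVersionConsistencyControl.resetThreadReadPoint(mvcc); it is then inserted into scannerReadPoints so that it can be tracked.
2) For each column family the scan needs to read, a StoreScanner is created (it selects the required MemStoreScanner and StoreFileScanner instances by bloom filter, time range, and TTL) and added to scanners. Finally, a KeyValueHeap is built from scanners.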
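The read point bookkeeping in step 1 can be illustrated with a small toy model. This is only a sketch, not HBase code: the class ReadPointSketch, the fields committedPoint and scannerReadPoints, and the method names are hypothetical stand-ins, and the choice of Long.MAX_VALUE for READ_UNCOMMITTED is an assumption made for the illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Toy model of per-scanner read point selection; not the HBase implementation. */
final class ReadPointSketch {
    enum Isolation { READ_COMMITTED, READ_UNCOMMITTED }

    // Hypothetical stand-in for the region's MVCC state: the highest
    // write number whose edits are fully committed.
    static final AtomicLong committedPoint = new AtomicLong(0);

    // Hypothetical stand-in for scannerReadPoints: scanner id -> read point.
    static final ConcurrentHashMap<String, Long> scannerReadPoints =
        new ConcurrentHashMap<>();

    /** Picks a read point for a new scanner and registers it. */
    static long openScanner(String scannerId, Isolation level) {
        long readPoint = (level == Isolation.READ_UNCOMMITTED)
            ? Long.MAX_VALUE          // assumption: no edit is filtered out
            : committedPoint.get();   // snapshot of the committed state
        scannerReadPoints.put(scannerId, readPoint);
        return readPoint;
    }

    /** Smallest read point still in use by any registered scanner. */
    static long smallestReadPoint() {
        long min = committedPoint.get();
        for (long rp : scannerReadPoints.values()) {
            min = Math.min(min, rp);
        }
        return min;
    }

    public static void main(String[] args) {
        committedPoint.set(5);
        long a = openScanner("scanner-a", Isolation.READ_COMMITTED);
        committedPoint.set(9);
        long b = openScanner("scanner-b", Isolation.READ_COMMITTED);
        System.out.println(a + " " + b + " " + smallestReadPoint());
    }
}
```

In this model each scanner keeps the snapshot it was opened with, and the registry makes it possible to ask which read point is the oldest one still in use.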
Next, look at the next method of RegionScannerImpl, which is called on every query:
boolean org.apache.hadoop.hbase.regionserver.HRegion.RegionScannerImpl.next(List<KeyValue> outResults, int limit) throws IOException
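This method reads the next entry through the next method of KeyValueHeap: it first locates the current KeyValueScanner (that is, the MemStoreScanner or StoreScanner passed in when the KeyValueHeap was constructed) and then calls that scanner's next method.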
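The idea of a heap that always hands out the smallest entry among several sorted scanners can be sketched as a k-way merge over a priority queue. This is a simplified illustration rather than the actual KeyValueHeap: SimpleScanner and mergeAll are hypothetical names, and plain strings stand in for KeyValue objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

/** Toy k-way merge over sorted scanners; not the HBase KeyValueHeap. */
final class HeapMergeSketch {
    /** Hypothetical scanner: a sorted iterator that exposes its current key. */
    static final class SimpleScanner {
        private final Iterator<String> it;
        private String current;

        SimpleScanner(List<String> sortedKeys) {
            this.it = sortedKeys.iterator();
            this.current = it.hasNext() ? it.next() : null;
        }

        /** Current key without advancing, or null when exhausted. */
        String peek() { return current; }

        /** Returns the current key and advances to the following one. */
        String next() {
            String result = current;
            current = it.hasNext() ? it.next() : null;
            return result;
        }
    }

    /** Drains all scanners in globally sorted order. */
    static List<String> mergeAll(List<SimpleScanner> scanners) {
        // Scanners are ordered by the key they currently point at.
        PriorityQueue<SimpleScanner> heap =
            new PriorityQueue<>((x, y) -> x.peek().compareTo(y.peek()));
        for (SimpleScanner s : scanners) {
            if (s.peek() != null) heap.add(s);
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            SimpleScanner top = heap.poll();    // scanner with the smallest key
            out.add(top.next());                // emit it and advance that scanner
            if (top.peek() != null) heap.add(top); // re-insert if not exhausted
        }
        return out;
    }

    public static void main(String[] args) {
        List<SimpleScanner> scanners = Arrays.asList(
            new SimpleScanner(Arrays.asList("row1", "row4")),
            new SimpleScanner(Arrays.asList("row2", "row3")));
        System.out.println(mergeAll(scanners)); // prints [row1, row2, row3, row4]
    }
}
```

The scanner is removed from the queue before it is advanced and re-inserted afterwards, so the ordering key never changes while the scanner sits inside the queue.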
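StoreFileScanner and MemStoreScanner are both KeyValueScanner implementations. Through the next() interface method they end up calling, respectively, the skipKVsNewerThanReadpoint method in StoreFileScanner.java and the getNext method of the MemStoreScanner object in Memstore.java.
1) The skipKVsNewerThanReadpoint method in StoreFileScanner.java: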
protected boolean skipKVsNewerThanReadpoint() throws IOException {
  long readPoint = MultiVersionConsistencyControl.getThreadReadPoint();

  // We want to ignore all key-values that are newer than our current
  // readPoint
  while (enforceMVCC
      && cur != null
      && (cur.getMemstoreTS() > readPoint)) {
    hfs.next();
    cur = hfs.getKeyValue();
  }

  if (cur == null) {
    close();
    return false;
  }

  // For the optimisation in HBASE-4346, we set the KV's memstoreTS to
  // 0, if it is older than all the scanners' read points. It is possible
  // that a newer KV's memstoreTS was reset to 0. But, there is an
  // older KV which was not reset to 0 (because it was
  // not old enough during flush). Make sure that we set it correctly now,
  // so that the comparison order does not change.
  if (cur.getMemstoreTS() <= readPoint) {
    cur.setMemstoreTS(0);
  }
  return true;
}
2) The getNext method of the MemStoreScanner object in Memstore.java:
protected KeyValue getNext(Iterator<KeyValue> it) {
  long readPoint = MultiVersionConsistencyControl.getThreadReadPoint();

  while (it.hasNext()) {
    KeyValue v = it.next();
    if (v.getMemstoreTS() <= readPoint) {
      return v;
    }
  }

  return null;
}
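Both methods apply the same visibility rule: an entry is returned only if its memstoreTS is not greater than the scanner's readPoint. The following self-contained sketch replays that rule on toy data. The Entry class, the explicit readPoint parameter (instead of the thread-local lookup), and the sample values are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Toy replay of the memstoreTS <= readPoint visibility check. */
final class MvccFilterSketch {
    /** Hypothetical stand-in for KeyValue: a key plus its MVCC write number. */
    static final class Entry {
        final String key;
        final long memstoreTS;

        Entry(String key, long memstoreTS) {
            this.key = key;
            this.memstoreTS = memstoreTS;
        }
    }

    /** Same loop shape as getNext, with the read point passed in explicitly. */
    static Entry getNext(Iterator<Entry> it, long readPoint) {
        while (it.hasNext()) {
            Entry e = it.next();
            if (e.memstoreTS <= readPoint) {
                return e;   // visible to this scanner
            }
            // otherwise the entry is newer than the read point: skip it
        }
        return null;        // nothing visible is left
    }

    /** Collects the keys of all entries visible at the given read point. */
    static List<String> visibleKeys(List<Entry> entries, long readPoint) {
        List<String> out = new ArrayList<>();
        Iterator<Entry> it = entries.iterator();
        for (Entry e = getNext(it, readPoint); e != null; e = getNext(it, readPoint)) {
            out.add(e.key);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
            new Entry("a", 3), new Entry("b", 7), new Entry("c", 5));
        System.out.println(visibleKeys(entries, 5));              // prints [a, c]
        System.out.println(visibleKeys(entries, Long.MAX_VALUE)); // prints [a, b, c]
    }
}
```

With a read point of 5 the entry written at 7 is skipped, which corresponds to a scanner that does not observe a write that had not yet been committed when the scanner took its snapshot.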