Greenplum SysCache and RelCache

2022-12-02 20:36:57 浏览数 (1)

1.Abstract

When accessing database tables, some information needs to be obtained from system tables. In order to improve retrieval efficiency, PostgreSQL provides caches, including SysCache and RelCache.

  • SysCache
    • Recently used system table tuple
  • RelCache
    • Schema information for recently accessed tables

2.SysCache

2.1 Some important data structures

  • struct cachedesc information defining a single syscache
  • struct catctup
    • individual tuple in the cache.
  • struct catclist
    • list of tuples matching a partial key.
  • struct catcache
    • information for managing a cache.
  • struct catcacheheader
    • information for managing all the caches.

SysCache is a CatCache array.

代码语言:javascript复制
static CatCache *SysCache[SysCacheSize];

The SysCache structure formed by the above data structure is as follows:

2.2 Some important operation functions

  • Init operation
    • InitCatalogCache

Use the cacheinfo array to initialize the SysCache array, call the InitCatCache function to create and initialize the CacheHeader. But after this round of initialization is completed, there are no tuples in CatCache. Therefore, there is a second stage initialization.

  • InitCatalogCachePhase2

cc_skey and cc_tupdesc will be filled

  • Search operation

Find by different number of keys,they will invoke SearchCatCacheInternal function.

  • step1: init
  • step2: get a hash bucket
  • step3: Iterate over hash bucket
  • step4: cache miss

key1: For each iteration, the catcup structure is obtained. If we get catcup, similar to LRU, the node in the two-way list is deleted and inserted into the head (increase the priority of the next access).

key2: Negative tuple design: Solve the problem of cache penetration and improve the cache hit rate. Specifically, invalid tuples are also cached.

  • cache miss

Tuple was not found in cache, so we have to try to retrieve it directly from the relation. If found, we will add it to the cache; if not found, we will add a negative cache entry instead.

  • step1: init
  • step2: tuple found
  • step3: tuple not found and build a negative cache entry containing a fake tuple.

代码语言:javascript复制
static inline HeapTuple
SearchCatCacheInternal(CatCache *cache,
                                           int nkeys,
                                           Datum v1,
                                           Datum v2,
                                           Datum v3,
                                           Datum v4)
{
        // some initialization operations
        if (unlikely(cache->cc_tupdesc == NULL)) 
                CatalogCacheInitializeCache(cache);
        arguments[0] = v1;
        arguments[1] = v2;
        arguments[2] = v3;
        arguments[3] = v4;
        // step2: get a hash bucket
        hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
        hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
        bucket = &cache->cc_bucket[hashIndex];
        
        // step3: Iterate over hash buckets
        dlist_foreach(iter, bucket)
        {
                ct = dlist_container(CatCTup, cache_elem, iter.cur);

                if (ct->dead)
                        continue;                        /* ignore dead entries */

                if (ct->hash_value != hashValue)
                        continue;                        /* quickly skip entry if wrong hash val */

                if (!CatalogCacheCompareTuple(cache, nkeys, ct->keys, arguments))
                        continue;

                // LRU 
                dlist_move_head(bucket, &ct->cache_elem);

                /*
                 * If it's a positive entry, bump its refcount and return it. If it's
                 * negative, we can report failure to the caller.
                 */
                if (!ct->negative)
                {
                        ResourceOwnerEnlargeCatCacheRefs(CurrentResourceOwner);
                        ct->refcount  ;
                        ResourceOwnerRememberCatCacheRef(CurrentResourceOwner, &ct->tuple);

                        CACHE_elog(DEBUG2, "SearchCatCache(%s): found in bucket %d",
                                           cache->cc_relname, hashIndex);

                        return &ct->tuple;
                }
                else
                {
                        CACHE_elog(DEBUG2, "SearchCatCache(%s): found neg entry in bucket %d",
                                           cache->cc_relname, hashIndex);

                        return NULL;
                }
        }
        // step4: cache miss
          
        return SearchCatCacheMiss(cache, nkeys, hashValue, hashIndex, v1, v2, v3, v4);
}

static pg_noinline HeapTuple
SearchCatCacheMiss(CatCache *cache,
                                   int nkeys,
                                   uint32 hashValue,
                                   Index hashIndex,
                                   Datum v1,
                                   Datum v2,
                                   Datum v3,
                                   Datum v4)
{
        /* step1: init */
        arguments[0] = v1;
        arguments[1] = v2;
        arguments[2] = v3;
        arguments[3] = v4;
        memcpy(cur_skey, cache->cc_skey, sizeof(ScanKeyData) * nkeys);
        cur_skey[0].sk_argument = v1;
        cur_skey[1].sk_argument = v2;
        cur_skey[2].sk_argument = v3;
        cur_skey[3].sk_argument = v4;

        /*
         * step2: tuple found
         */
        relation = table_open(cache->cc_reloid, AccessShareLock);
        scandesc = systable_beginscan(relation, cache->cc_indexoid, IndexScanOK(cache, cur_skey), NULL, nkeys, cur_skey);
        ct = NULL;
        while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
        {
                ct = CatalogCacheCreateEntry(cache, ntp, arguments, hashValue, hashIndex, false);
                /* immediately set the refcount to 1 */
                ResourceOwnerEnlargeCatCacheRefs(CurrentResourceOwner);
                ct->refcount  ;
                ResourceOwnerRememberCatCacheRef(CurrentResourceOwner, &ct->tuple);
                break;                                        /* assume only one match */
        }
        systable_endscan(scandesc);
        table_close(relation, AccessShareLock);

        /*
         * step3: tuple not found
         * If tuple was not found, we need to build a negative cache entry
         * containing a fake tuple.
         *
         * In bootstrap mode, we don't build negative entries.
         */
        if (ct == NULL)
        {
                if (IsBootstrapProcessingMode())
                        return NULL;
                ct = CatalogCacheCreateEntry(cache, NULL, arguments, hashValue, hashIndex, true);
                return NULL;
        }
        // print something
        return &ct->tuple;
}
  • Remove operation
  • CatCacheRemoveCTup(red strikethrough) When refcount is zero, it will remove cur CatCTup and invoke CatCacheRemoveCList to remove list.
  • CatCacheRemoveCList(blue strikethrough) When refcount is zero, it will remove all CatCTup(member field).

3.RelCache

  • RelationCacheInitialize

initializes the relation descriptor cache

  • RelationCacheInitializePhase2

load the shared relcache cache file

  • RelationCacheInitializePhase3 This stage will ensure that the information of pg_class, pg_attribute, pg_proc, pg_type related indexes is added to RelCache.First, the file pg_internal.init will be read and loaded. If the file does not exist, the RelationData structure will be rebuilt, added to the Hash table on RelCache, and the file will be rewritten.
  • Insert、Lookup and delete operations RelationCacheInsert、RelationIdCacheLookup、RelationCacheDelete invoke hash_search.hash_search is the external interface, and the input action is distinguished according to the following behaviors.
    • HASH_REMOVE
      • look up key in table
    • HASH_ENTER_NULL
      • look up key in table, creating entry if not present
    • HASH_ENTER
      • same, but return NULL if out of memory
    • HASH_FIND
      • look up key in table, remove entry if present

end

0 人点赞