Preface

Instead of starting with a formal definition, the goal is to approach this topic through a series of examples, introducing definitions along the way. The remark section Theory will contain all definitions, theorems and propositions, so that you can quickly look up specific aspects.

The remark section Sources contains the base material used for this topic as well as additional information for further reading. You will also find the full source code for the examples there. Please note that, to keep the source code of the examples shorter and more readable, it refrains from things like error handling. It also forgoes some language features which would obscure the clarity of the example, such as extensive use of advanced libraries.


Paging

The paging problem arises from the limitation of finite space. Let’s assume our cache C can hold k pages. Now we want to process a sequence of m page requests, each of which must have been placed in the cache before it is processed. Of course, if m <= k, we simply put all pages into the cache and are done, but usually m >> k.

We say a request is a cache hit when the page is already in the cache; otherwise it is called a cache miss. In that case we must bring the requested page into the cache and, if the cache is full, evict another page. The goal is an eviction schedule that minimizes the number of evictions.
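
To make the bookkeeping behind hits, misses and evictions concrete, here is a minimal sketch in C++ (the name countMisses and the overall shape are assumptions for this illustration, not taken from the example program referenced below). It replays a request sequence against a cache of capacity k and counts the misses; the eviction rule is a deliberately simple placeholder (evict the oldest page) so the counting logic stays in the foreground.

    #include <algorithm>
    #include <cstddef>
    #include <deque>
    #include <iostream>
    #include <vector>

    // Replays the request sequence against a cache of capacity k and
    // returns the number of cache misses. The eviction rule here is a
    // simple placeholder (evict the oldest page) just to make the
    // hit/miss bookkeeping concrete.
    std::size_t countMisses(const std::vector<int>& requests, std::size_t k)
    {
        std::deque<int> cache;      // pages currently in the cache
        std::size_t misses = 0;

        for (int page : requests)
        {
            // cache hit: the page is already present, nothing to do
            if (std::find(cache.begin(), cache.end(), page) != cache.end())
                continue;

            // cache miss: the page has to be brought into the cache
            ++misses;
            if (cache.size() == k)  // cache full -> evict one page
                cache.pop_front();  // placeholder rule: evict the oldest page
            cache.push_back(page);
        }
        return misses;
    }

    int main()
    {
        std::vector<int> requests = {1, 2, 3, 1, 4, 2, 5, 1};
        std::cout << countMisses(requests, 3) << '\n';
    }

Swapping in one of the strategies listed below only means replacing the line that chooses the victim page.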

There are numerous strategies for this problem; let’s look at some of them:

  1. First in, first out (FIFO): Evict the oldest page
  2. Last in, first out (LIFO): Evict the newest page
  3. Least recently used (LRU): Evict the page whose most recent access was earliest
  4. Least frequently used (LFU): Evict the page that was least frequently requested
  5. Longest forward distance (LFD): Evict the page in the cache that is not requested until farthest in the future (see the sketch after this list)
  6. Flush when full (FWF): Clear the cache completely as soon as a cache miss happens
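
As a small illustration of how such a rule can be expressed in code, the following sketch selects the eviction victim according to LFD: for every page currently in the cache it looks up the position of its next request and returns the page whose next request lies farthest in the future, or one that is never requested again. The name chooseVictimLFD and its interface are assumptions for this illustration, not part of the referenced example program.

    #include <cstddef>
    #include <vector>

    // LFD victim selection (illustrative sketch): given the pages currently
    // in the cache, the full request sequence and the current position in it,
    // return the index (into 'cache') of the page whose next request lies
    // farthest in the future. Pages that are never requested again win outright.
    std::size_t chooseVictimLFD(const std::vector<int>& cache,
                                const std::vector<int>& requests,
                                std::size_t currentPos)
    {
        std::size_t victim = 0;
        std::size_t farthest = 0;

        for (std::size_t i = 0; i < cache.size(); ++i)
        {
            // find the next request of cache[i] after the current position
            std::size_t next = currentPos + 1;
            while (next < requests.size() && requests[next] != cache[i])
                ++next;

            if (next == requests.size())   // never requested again -> ideal victim
                return i;

            if (next > farthest)           // requested later than the best so far
            {
                farthest = next;
                victim = i;
            }
        }
        return victim;
    }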

There are two ways to approach this problem:

  1. offline: the sequence of page requests is known ahead of time
  2. online: the sequence of page requests is not known ahead of time

Offline Approach

For the first approach, look at the topic Applications of Greedy technique. Its third example, Offline Caching, considers the first five strategies from above and gives you a good entry point for the following.

The example program has been extended with the FWF strategy: