17-5 Competitive analysis of self-organizing lists with move-to-front

A self-organizing list is a linked list of nn elements, in which each element has a unique key. When we search for an element in the list, we are given a key, and we want to find an element with that key.

A self-organizing list has two important properties:

  1. To find an element in the list, given its key, we must traverse the list from the beginning until we encounter the element with the given key. If that element is the kkth element from the start of the list, then the cost to find the element is kk.
  2. We may reorder the list elements after any operation, according to a given rule with a given cost. We may choose any heuristic we like to decide how to reorder the list.

Assume that we start with a given list of nn elements, and we are given an access sequence σ=σ1,σ2,,σm\sigma = \langle \sigma_1, \sigma_2, \ldots, \sigma_m \rangle of keys to find, in order. The cost of the sequence is the sum of the costs of the individual accesses in the sequence.

Out of the various possible ways to reorder the list after an operation, this problem focuses on transposing adjacent list elements-switching their positions in the list—with a unit cost for each transpose operation. You will show, by means of a potential function, that a particular heuristic for reordering the list, move-to-front, entails a total cost no worse than 44 times that of any other heuristic for maintaining the list order—even if the other heuristic knows the access sequence in advance! We call this type of analysis a competitive analysis.

For a heuristic HH and a given initial ordering of the list, denote the access cost of sequence σ\sigma by CH(σ)C_H(\sigma) Let mm be the number of accesses in σ\sigma.

a. Argue that if heuristic H\text H does not know the access sequence in advance, then the worst-case cost for H\text H on an access sequence σ\sigma is CH(σ)=Ω(mn)C_H(\sigma) = \Omega(mn).

With the move-to-front heuristic, immediately after searching for an element xx, we move xx to the first position on the list (i.e., the front of the list).

Let rankL(x)\text{rank}_L(x) denote the rank of element xx in list LL, that is, the position of xx in list LL. For example, if xx is the fourth element in LL, then rankL(x)=4\text{rank}_L(x) = 4. Let cic_i denote the cost of access σi\sigma_i using the move-to-front heuristic, which includes the cost of finding the element in the list and the cost of moving it to the front of the list by a series of transpositions of adjacent list elements.

b. Show that if σi\sigma_i accesses element xx in list LL using the move-to-front heuristic, then ci=2rankL(x)1c_i = 2 \cdot \text{rank}_L(x) - 1.

Now we compare move-to-front with any other heuristic H\text H that processes an access sequence according to the two properties above. Heuristic H\text H may transpose elements in the list in any way it wants, and it might even know the entire access sequence in advance.

Let LiL_i be the list after access σi\sigma_i using move-to-front, and let LiL_i^* be the list after access σi\sigma_i using heuristic H\text H. We denote the cost of access σi\sigma_i by cic_i for move-to-front and by cic_i^* for heuristic H\text H. Suppose that heuristic H\text H performs tit_i^* transpositions during access σi\sigma_i.

c. In part (b), you showed that ci=2rankLi1(x)1c_i = 2 \cdot \text{rank}_{L_{i - 1}}(x) - 1. Now show that ci=rankLi1(x)+tic_i^* = \text{rank}_{L_{i - 1}^*}(x) + t_i^*.

We define an inversion in list LiL_i as a pair of elements yy and zz such that yy precedes zz in LiL_i and zz precedes yy in list LiL_i^*. Suppose that list LiL_i has qiq_i inversions after processing the access sequence σ1,σ2,,σi\langle \sigma_1, \sigma_2, \ldots, \sigma_i \rangle. Then, we define a potential function Φ\Phi that maps LiL_i to a real number by Φ(Li)=2qi\Phi(L_i) = 2q_i. For example, if LiL_i has the elements e,c,a,d,b\langle e, c, a, d, b \rangle and LiL_i^* has the elements c,a,b,d,e\langle c, a, b, d, e \rangle, then LiL_i has 5 inversions ((e,c),(e,a),(e,d),(e,b),(d,b))((e, c), (e, a), (e, d), (e, b), (d, b)), and so Φ(Li)=10\Phi(L_i) = 10. Observe that Φ(Li)0\Phi(L_i) \ge 0 for all ii and that, if move-to-front and heuristic H\text H start with the same list L0L_0, then Φ(L0)=0\Phi(L_0) = 0.

d. Argue that a transposition either increases the potential by 22 or decreases the potential by 22.

Suppose that access σi\sigma_i finds the element xx. To understand how the potential changes due to σi\sigma_i, let us partition the elements other than xx into four sets, depending on where they are in the lists just before the iith access:

  • Set AA consists of elements that precede xx in both Li1L_{i - 1} and Li1L_{i - 1}^*.
  • Set BB consists of elements that precede xx in Li1L_{i - 1} and follow xx in Li1L_{i - 1}^*.
  • Set CC consists of elements that follow xx in Li1L_{i - 1} and precede xx in Li1L_{i - 1}^*.
  • Set DD consists of elements that follow xx in both Li1L_{i - 1} and Li1L_{i - 1}^*.

e. Argue that rankLi1(x)=A+B+1\text{rank}_{L_{i - 1}}(x) = |A| + |B| + 1 and rankLi1(x)=A+C+1\text{rank}_{L_{i - 1}^*}(x) = |A| + |C| + 1.

f. Show that access σi\sigma_i causes a change in potential of

Φ(Li)Φ(Li1)2(AB+ti),\Phi(L_i) - \Phi(L_{i - 1}) \le 2(|A| - |B| + t_i^*),

where, as before, heuristic H\text H performs tit_i^* transpositions during access σi\sigma_i.

Define the amortized cost c^i\hat c_i of access σi\sigma_i by c^i=ci+Φ(Li)Φ(Li1)\hat c_i = c_i + \Phi(L_i) - \Phi(L_{i - 1}).

g. Show that the amortized cost c^i\hat c_i of access σi\sigma_i is bounded from above by 4ci4c_i^*.

h. Conclude that the cost CMTF(σ)C_{\text{MTF}}(\sigma) of access sequence σ\sigma with move-to-front is at most 44 times the cost CH(σ)C_H(\sigma) of σ\sigma withany other heuristic H\text H, assuming that both heuristics start with the same list.

a. Since the heuristic is picked in advance, given any sequence of requests given so far, we can simulate what ordering the heuristic will call for, then, we will pick our next request to be whatever element will of been in the last position of the list. Continuing until all the requests have been made, we have that the cost of this sequence of accesses is =mn= mn.

b. The cost of finding an element is =rankL(x)= \text{rank}_L(x) and since it needs to be swapped with all the elements before it, of which there are rankL(x)1\text{rank}_L(x) - 1, the total cost is 2rankL(x)12 \cdot \text{rank}_L(x) - 1.

c. Regardless of the heuristic used, we first need to locate the element, which is left where ever it was after the previous step, so, needs rankLi1(x)\text{rank}_{L_{i - 1}}(x). After that, by definition, there are tit_i transpositions made, so, ci=rankLi1(x)+tic^*_i = \text{rank}_{L_{i - 1}}(x) + t_i^*.

d. If we perform a transposition of elements yy and zz, where yy is towards the left. Then there are two cases. The first is that the final ordering of the list in LiL_i^* is with yy in front of zz, in which case we have just increased the number of inversions by 11, so the potential increases by 22. The second is that in LIzL_I^*z occurs before yy, in which case, we have just reduced the number of inversions by one, reducing the potential by 22.

In both cases, whether or not there is an inversion between yy or zz and any other element has not changed, since the transposition only changed the relative ordering of those two elements.

e. By definition, AA and BB are the only two of the four categories to place elements that precede xx in Li1L_{i - 1}, since there are A+B|A| + |B| elements preceding it, it's rank in Li1L_{i - 1} is A+B+1|A| + |B| + 1. Similarly, the two categories in which an element can be if it precedes xx in Li1L^*_{i - 1} are AA and CC, so, in Li1L^*_{i - 1}, xx has rankA+C+1\text{rank} |A| + |C| + 1.

f. We have from part d that the potential increases by 22 if we transpose two elements that are being swapped so that their relative order in the final ordering is being screwed up, and decreases by two if they are begin placed into their correct order in LiL^*_i.

In particular, they increase it by at most 22, since we are keeping track of the number of inversions that may not be the direct effect of the transpositions that heuristic HH made, we see which ones the Move to front heuristic may of added. In particular, since the move to front heuristic only changed the relative order of xx with respect to the other elements, moving it in front of the elements that preceded it in Li1L_{i - 1}, we only care about sets AA and BB. For an element in AA, moving it to be behind AA created an inversion, since that element preceded xx in LiL^*_i. However, if the element were in BB, we are removing an inversion by placing xx in front of it.

g.

c^i2(A+B+1)1+2(AB+ti)=4A+1+2ti4(A+C+1+ti)=4ci. \begin{aligned} \hat c_i & \le 2(|A| + |B| + 1) - 1 + 2(|A| - |B| + t_i^*) \\ & = 4|A| + 1 + 2 t_i^* \\ & \le 4(|A| + |C| + 1 + t_i^*) \\ & = 4 c_i^*. \end{aligned}

h. We showed that the amortized cost of each operation under the move to front heuristic was at most four times the cost of the operation using any other heuristic. Since the amortized cost added up over all these operation is at most the total (real) cost, so we have that the total cost with movetofront is at most four times the total cost with an arbitrary other heuristic.