Binary Indexed Tree (Part 1)

Motivation

Binary indexed tree (BIT for short), aka Fenwick tree, is a popular data structure in the competitive programming (CP) context. I believe it could also be a valuable tool in practical software developers' toolboxes. But is yet another BIT tutorial worth it? Yes, as my aim in this two-part series is to provide a different/new perspective on the data structure, complementing the many existing (good) tutorials that I've seen.

Last but not least, BIT is one of my all-time favorite data structures, and the underlying ideas are beautiful, so I decided to write about it for my future self!

Update: I've added a reference C++ implementation for both parts on GitHub. Feel free to take a look and ignore the non-BIT code there!

What is BIT anyway?

In short, BIT is a data structure supporting efficient range queries on dynamic arrays, i.e., arrays whose content can be updated along the way. More concretely, a standard BIT supports the following operations on a size-\(n\) array \(a[1..n]\) in \(O(\log n)\) time per operation, with \(O(n)\) additional space (for its internal structure):

  1. Point update \(u(a[i], v)\), e.g., \(a[i] \leftarrow a[i] + v\).
  2. Prefix range query on \(a[1..i]\), e.g., \(sum(a[1..i])\).

There are ways to enable (range update, point query) and (range update, range query) as well, and they are all built on the key ideas that we're going to touch on later and in the next part.
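
To make these two operations concrete, here is a minimal usage sketch for the prefix-sum instantiation (it relies on the tree array and the update/query functions implemented later in this post; the exact API is up to you):

# a minimal usage sketch; update/query are defined later in this post
n = 8
tree = [0] * (n + 1)   # internal structure, 1-indexed, O(n) extra space

update(3, 5)           # point update: a[3] += 5
update(7, 2)           # point update: a[7] += 2
print(query(6))        # prefix range query: sum(a[1..6]) == 5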

Despite its name, and despite many existing tutorials describing it as a tree, I prefer to see BIT as a clever decomposition of the array for efficient queries and updates. More about this will be discussed below.

Honorable mentions:

I'll write about these range-query data structures in future posts about beautiful and practical (IMHO) data structures. So stay tuned!

Why does it matter?

Hang on, for the prefix range query and point update operations above, wouldn't the straightforward approach, \(O(1)\) update and \(O(n)\) query directly on the array, be enough? Why bother with BIT? Well, it depends: e.g., if updates vastly outnumber queries, the plain array is perfectly fine, but once queries and updates are interleaved and both frequent, the \(O(n)\) cost per query dominates, whereas BIT keeps both operations at \(O(\log n)\).

Cliché: There is no single best solution for all use cases.

Key Ideas

Range decomposition

For range query, without any preprocessing, the best we can do is to iterate over the whole range, which is essentially \(O(n)\). A natural idea is then to split the range into sub-ranges (smaller contiguous blocks) and "precompute" the queries on these blocks. To answer the query on the original range, we can just combine the precomputed answers on smaller sub-ranges.

The key point is to devise an efficient split-and-combine scheme.

Observation 1: BIT splits a prefix range based on binary expansion. Given a range \(R = [1..m] = (0..m]\), consider the binary representation of \(m = 2^{k_1} + 2^{k_2} + ... + 2^{k_t}\), where \(0 \leq k_1 < k_2 < ... < k_t\). Thus,

$$ R = (0..2^{k_t}] \cup (2^{k_t}..(2^{k_t} + 2^{k_{t-1}})] \cup ... \cup ((2^{k_t}+...+2^{k_2})..m] $$

Clearly, all the sub-ranges are disjoint, and their union covers the whole original range.

How do we quickly traverse all these subranges?

Start with \(m\) and remove the 1 bits one at a time, starting from the least significant bit!

Example: \(m = 7 = 2^2+2^1+2^0 = 111_{2}\); removing the least significant 1 bit each time gives \(7 \to 6 \to 4 \to 0\), i.e., the sub-ranges \((6..7]\), \((4..6]\), and \((0..4]\).

The number of sub-ranges is the number of times a 1 bit is removed until we reach 0, which is the number of 1s in the binary representation of \(m\); this is at most \(k_t + 1 \leq \log_2 m + 1\) (as \(2^{k_t} \leq m\)).

We can efficiently traverse and combine all the subranges of \([1..n]\) in \(O(\log n)\) time.

Remark

This binary expansion idea (also known as binary lifting) is so beautiful, and unsurprisingly it shows up in many places, such as RMQ (sparse table construction and non-idempotent queries), finding the lowest common ancestor (LCA), finding the ancestor \(k\) levels above a tree node, etc. For arrays of non-negative values (so all prefix sums are non-decreasing), we can use this technique on a BIT to find the smallest index whose prefix sum reaches a given value in \(O(\log n)\) time, just like a binary search but with no need for the array to be sorted! I won't digress here; the binary lifting idea deserves its own post. For now, see e.g. this post if you're interested.
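
As a teaser (an illustrative sketch only, assuming the prefix-sum \(tree[1..n]\) introduced in the implementation sections below and an array of non-negative values), the hypothetical lower_bound function below finds the smallest index whose prefix sum reaches a given target, by greedily extending a prefix with descending powers of two:

def lower_bound(target: int) -> int:
  '''Returns the smallest i with sum(a[1..i]) >= target (or n + 1 if none),
  assuming target >= 1 and a[j] >= 0 for all j. Illustrative sketch only.'''
  pos = 0                      # end of the largest prefix found with sum < target
  remaining = target
  for k in range(n.bit_length(), -1, -1):
    nxt = pos + (1 << k)
    # tree[nxt] is the sum over (pos..nxt]; extend only if we stay below target
    if nxt <= n and tree[nxt] < remaining:
      pos = nxt
      remaining -= tree[nxt]
  return pos + 1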

Implementation details

Given a prefix range \([1..n] = (0..n]\), it's fairly easy to implement the range query by iteratively extracting and removing the least significant bit of \(n\). Recall that modern computer systems use two's complement representation for integers; it's then easy to see that \(2^{\text{lsb}(n)} = n \& (-n)\).
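
For a quick sanity check (illustrative only), the tiny loop below prints the sub-ranges visited when decomposing the prefix \((0..13]\) by repeatedly stripping the least significant bit:

# illustrative: decompose the prefix (0..m] by stripping the least significant bit
m = 13                    # 13 = 1101 in binary
while m:
  lsb = m & (-m)          # lowest set bit: 1, then 4, then 8
  print(f"({m - lsb}..{m}]")
  m -= lsb
# output: (12..13]  (8..12]  (0..8]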

Let \(tree[i]\) denote the query value computed on the range \((i-(i\&-i)..i]\) (we'll get to how to update \(tree[1..n]\) later). In the code below, we use prefix sums as the running example, i.e., \(tree[i] := sum(a[i-(i\&-i)+1..i])\) so that \(query(i)\) returns \(sum(a[1..i])\); but the definition of \(tree[i]\) is really problem-specific, and we can adapt the logic to other kinds of queries like \(\text{xor}, \text{min}, \text{max}\), etc. where they fit. Be creative!

Overly commented code snippet:

def query(i: int) -> int:
  '''Returns sum(a[1..i]), 1 <= i <= n.'''

  ans = 0

  while i:
    # replace += with other combination ops for specific problems at hand
    ans += tree[i]
    # remove lsb to traverse to next subrange
    i -= (i & -i)

  return ans

Generally, BIT only supports prefix range queries, but for specific cases such as sum or xor, where the combine operation is invertible, we can answer a query on any range \([l..r]\) from two prefix queries:
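
For instance, for prefix sums, a hypothetical range_query helper (a sketch built on the query function above) could look like this:

def range_query(l: int, r: int) -> int:
  '''Returns sum(a[l..r]) for 1 <= l <= r <= n, as a difference of two prefix sums.'''
  # sum(a[l..r]) = sum(a[1..r]) - sum(a[1..l-1]); query(0) simply returns 0
  return query(r) - query(l - 1)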

Update relevant ranges

We now get to the point update operation, \(u(a[i],v)\). As mentioned above, \(tree[i]\) stores the current value computed on its corresponding range \((i-2^{\text{lsb}(i)}..i]\); in other words, \(tree[i]\) is responsible for the range \((i-2^{\text{lsb}(i)}..i]\). Thus an update to any element in that range needs to "inform" and update \(tree[i]\). Flipping it around, for an update \(u(a[i], v)\), we need to update all the relevant ranges to which \(a[i]\) belongs!

Denote \(R_j = (j-2^{\text{lsb}(j)}..j]\).

Claim 1: \(R_i, R_{i+2^{\text{lsb}(i)}}, ...\) are the only ranges to which \(a[i]\) belongs.

For example, given the array \(a[1..8]\), which ranges cover \(a[3]\)? They are \(R_3 = (2..3]\), \(R_4 = (0..4]\), and \(R_8 = (0..8]\), obtained by repeatedly adding the least significant bit: \(3 \to 3+1 = 4 \to 4+4 = 8\).

Not convinced yet? Here is a proof sketch, which also gives some insights into the BIT structure.

Lemma 1: \(R_i \subset R_{i+2^{\text{lsb}(i)}}\)

Proof: Let \(k = \text{lsb}(i)\) and write \(i = p \cdot 2^{k+1} + 2^k\) for some integer \(p \geq 0\). Then \(j := i + 2^k = (p+1) \cdot 2^{k+1}\), so \(\text{lsb}(j) \geq k+1\) and we can write \(2^{\text{lsb}(j)} = q \cdot 2^{k+1}\) for some \(q \geq 1\). Hence \(j - 2^{\text{lsb}(j)} = (p+1-q) \cdot 2^{k+1} \leq p \cdot 2^{k+1} = i - 2^k\), and together with \(i < j\) this gives \(R_i = (i-2^k..i] \subset (j-2^{\text{lsb}(j)}..j] = R_j\).

Lemma 1 shows that \(a[i]\) belongs to \(R_i, R_{i+2^{\text{lsb}(i)}}, ...\). Next, let's see why these ranges are the only ones covering \(a[i]\).

Lemma 2: For all \(j\) such that \(i < j < i+2^{\text{lsb}(i)}\), \(i \notin R_j\)

Proof: Let \(k = \text{lsb}(i)\) and write \(j = i + r\) with \(0 < r < 2^k\). The \(k\) lowest bits of \(i\) are all 0, so the \(k\) lowest bits of \(j\) are exactly those of \(r\); hence \(\text{lsb}(j) = \text{lsb}(r)\) and \(2^{\text{lsb}(j)} \leq r\). Therefore \(j - 2^{\text{lsb}(j)} = i + (r - 2^{\text{lsb}(j)}) \geq i\), i.e., \(R_j = (j-2^{\text{lsb}(j)}..j]\) starts at or above \(i\), and thus \(i \notin R_j\).

Note that we only consider even \(i\), since for odd \(i\), \(i+2^{\text{lsb}(i)} = i+1\), and there is no integer \(j \in (i,i+1)\).

Now apply Lemma 2 to \(i' = i+2^{\text{lsb}(i)}\): for all \(j \in (i', i' + 2^{\text{lsb}(i')})\), \(i' \notin R_j\), and it follows that \(i \notin R_j\) as well since \(i < i'\). Repeating this process, we can prove the claim that \(R_i, R_{i+2^{\text{lsb}(i)}}, ...\) are the only ranges covering \(a[i]\) (QED)!
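
If you'd rather convince yourself empirically, here is a tiny brute-force check (illustrative only, with a hypothetical small size \(N\)) that the set of indices \(j\) with \(a[i] \in R_j\) is exactly the chain \(i, i+2^{\text{lsb}(i)}, ...\) from Claim 1:

# brute-force verification of Claim 1 for a small N (illustrative only)
N = 16
for i in range(1, N + 1):
  # all j whose range R_j = (j - 2^lsb(j) .. j] contains i
  covering = {j for j in range(1, N + 1) if j - (j & -j) < i <= j}
  # the chain i, i + 2^lsb(i), ... predicted by Claim 1
  chain, j = set(), i
  while j <= N:
    chain.add(j)
    j += (j & -j)
  assert covering == chain, (i, covering, chain)
print("Claim 1 holds for all i in [1..16]")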

Implementation details

With the above claim, it's trivial to implement the update. As an example, this is the snippet for the increment update \(a[i] \leftarrow a[i] + v\):

def update(i: int, v: int) -> None:
  '''Applies a[i] += v by updating every range responsible for a[i], 1 <= i <= n.'''
  while i <= n:
    # every tree[i] on the chain covers a[i], so it must absorb the delta v
    tree[i] += v
    # jump to the next covering range by adding the least significant bit
    i += (i & -i)

Claim 2: There are at most \(O(\log n)\) ranges covering \(a[i]\), for \(1 \leq i \leq n\).

The while loop continues until \(i\) goes beyond \(n\). Every iteration updates \(i \leftarrow i + 2^{\text{lsb}(i)}\), i.e., it moves the least significant bit at least one position to the left. Thus there can be at most \(O(\log n)\) iterations, and the update op runs in \(O(\log n)\) time, assuming each increment is \(O(1)\).
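
Putting both operations together, here is a minimal end-to-end sketch (prefix sums as the running example, reusing the global tree/n convention and the update/query functions above):

# minimal end-to-end demo with prefix sums
n = 8
tree = [0] * (n + 1)              # 1-indexed; tree[0] is unused

a = [0, 5, 3, 7, 9, 6, 4, 1, 2]   # a[1..8]; a[0] unused
for i in range(1, n + 1):
  update(i, a[i])                 # build with n point updates: O(n log n)

print(query(4))                   # sum(a[1..4]) == 5 + 3 + 7 + 9 == 24
update(3, 10)                     # point update: a[3] += 10
print(query(4))                   # now 34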

Recap

In this part, we explored the key ideas of BIT:

  1. Range decomposition: a prefix range is split into \(O(\log n)\) sub-ranges based on the binary expansion of its endpoint, with \(tree[i]\) storing the precomputed answer on \((i-2^{\text{lsb}(i)}..i]\), so a prefix query combines \(O(\log n)\) precomputed values.
  2. Update relevant ranges: a point update on \(a[i]\) touches only the \(O(\log n)\) ranges covering \(a[i]\), reached by repeatedly adding the least significant bit.

References

  1. https://cp-algorithms.com/data_structures/fenwick.html
  2. Topcoder's excellent tutorial