Motivation
Binary indexed tree (BIT for short), aka Fenwick tree, is a popular data structure in the competitive programming (CP) context. I believe it could also be a valuable tool in a practical software developer's toolbox. But is it worth yet another BIT tutorial? Yes, as my aim is to provide a different/new perspective on the data structure, complementing the many existing (good) tutorials I've seen. In particular, in this two-part series,
- Part 1: Distill its core ideas with intuitive explanation provided when needed. If you're already familiar with the standard range-query/point-update BIT, feel free to skip this part.
- Part 2: Building on the core ideas from Part 1, suggest some flexible (and probably lesser-known) ways to extend the data structure to support more use cases, such as range-update/range-query and high-dimensional range queries.
Last but not least, BIT is one of my all-time favorite data structures, and the underlying ideas are beautiful, so I decided to write about it for my future self!
Update: I've added a reference C++ implementation for both parts on Github. Feel free to take a look and ignore non-BIT code there!
What is BIT anyway?
In short, BIT is a data structure supporting efficient range queries on dynamic arrays, i.e., arrays whose content can be updated along the way. More concretely, a standard BIT supports the following operations on a size-\(n\) array \(a[1..n]\) in \(O(\log n)\) time per op, with additional \(O(n)\) space (for its internal structure):
- Prefix range query: \(f(a[1..m])\), e.g., sum of the first \(m\) elements
- Point update: \(u(a[i], v)\), e.g., increasing \(a[i]\) by \(v\), \(a[i] \leftarrow a[i] + v\)
There are ways to enable (range update, point query) and (range update, range query) as well, and they are all built on the key ideas that we're going to touch on later and in the next part.
Despite its name, and although many existing tutorials describe it as a tree, I prefer to see BIT as a clever decomposition of the array for efficient queries and updates. More about this will be discussed below.
Honorable mentions:
- A more complex and powerful data structure that also supports similar operations on dynamic arrays (among other things) is the segment tree.
- For range queries on static arrays (no change to content), simpler data structures such as the sparse table (e.g., for range minimum query tasks), block decomposition techniques (e.g., square root decomposition), or even a simple prefix sum array would be enough.
I'll write about these range-query data structures in future posts about beautiful and practical (IMHO) data structures. So stay tuned!
Why does it matter?
Hang on: for the prefix range query and point update operations above, wouldn't the straightforward \(O(1)\) update, \(O(n)\) query on the plain array be enough? Why bother with BIT? Well, it depends, e.g.,
- The \(O(1)\) update, \(O(n)\) query approach is reasonable when most operations are updates and very few are range queries
- When the input size is large with many range queries, \(O(n)\) per query isn't ideal, and \(O(\log n)\) query at the expense of \(O(\log n)\) update seems better
Cliché: There is no single best solution for all usecases.
Key Ideas
Range decomposition
For range query, without any preprocessing, the best we can do is to iterate over the whole range, which is essentially \(O(n)\). A natural idea is then to split the range into sub-ranges (smaller contiguous blocks) and "precompute" the queries on these blocks. To answer the query on the original range, we can just combine the precomputed answers on smaller sub-ranges.
The key point is to devise efficient split and combine scheme.
Observation 1: BIT splits a prefix range based on binary expansion. Given a range \(R = [1..m] = (0..m]\), consider the binary representation of \(m = 2^{k_1} + 2^{k_2} + ... + 2^{k_t}\), where \(0 \leq k_1 < k_2 < ... < k_t\). Thus,
\((0..m] = (m-2^{k_1}..m] \cup (m-2^{k_1}-2^{k_2}..m-2^{k_1}] \cup \cdots \cup (0..2^{k_t}]\)
Clearly, all the subranges are disjoint, and their union covers the whole original range.
How do we quickly traverse all these subranges?
Start with \(m\) and remove the 1 bits one at a time, starting from the least significant bit!
Example: \(m = 7 = 2^2+2^1+2^0 = 111_{2}\),
- remove the lsb: \(7 (111) \rightarrow 6 (110)\)
- remove the next 1: \(6 (110) \rightarrow 4(100)\)
- remove the last 1: \(4(100) \rightarrow 0(000)\)

Hence, \([1..7] = (0..7] = (0,4] \cup (4,6] \cup (6,7]\)
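The decomposition above can be sketched in a few lines of Python (a toy helper of my own naming, not part of any BIT implementation):

```python
def prefix_subranges(m: int):
    """Split (0..m] into BIT subranges by removing the lsb of m repeatedly."""
    ranges = []
    while m:
        lsb = m & -m                  # value of the least significant set bit
        ranges.append((m - lsb, m))   # half-open subrange (m - lsb .. m]
        m -= lsb
    return ranges

print(prefix_subranges(7))  # -> [(6, 7), (4, 6), (0, 4)]
```

Each iteration peels off one subrange, so the number of subranges equals the number of set bits in \(m\).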
The number of sub-ranges is the number of times a 1 is removed until we reach 0, which is the number of 1s in the binary representation of \(m\): at most \(k_t + 1 \leq \log_2 m + 1 = O(\log m)\) (as \(2^{k_t} \leq m\)).
We can efficiently traverse and combine all the subranges of \([1..n]\) in \(O(\log n)\) time.
Remark
This binary expansion (also known as binary lifting) idea is so beautiful, and unsurprisingly I see it in many places, such as RMQ (sparse table construction and non-idempotent queries), finding the lowest common ancestor (LCA), finding the ancestor \(k\) nodes away from a tree node, etc. For arrays of non-negative values (so all prefix sums are non-decreasing), we can use this technique with BIT to find the smallest index whose prefix sum reaches a given value in \(O(\log n)\) time, just like binary search but with no need for a sorted array! I won't digress here, as the binary lifting idea deserves its own post; for now see, e.g., this post if you're interested.
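To make that remark concrete, here is a minimal sketch of the descend-from-the-top search on a sum BIT over non-negative values (the names `bit_update` and `bit_lower_bound` are mine, not from any library; the `tree[1..n]` layout is the standard one covered in the implementation sections):

```python
def bit_update(tree, i, v):
    """Point update a[i] += v on a sum BIT stored in tree[1..n]."""
    while i < len(tree):
        tree[i] += v
        i += i & -i

def bit_lower_bound(tree, target):
    """Smallest i with sum(a[1..i]) >= target (returns n + 1 if none),
    assuming all elements are non-negative."""
    n = len(tree) - 1
    pos = 0
    k = 1 << n.bit_length()  # a power of two above n
    while k:
        nxt = pos + k
        # tree[nxt] holds the sum over (pos..nxt]; advance if still short
        if nxt <= n and tree[nxt] < target:
            pos = nxt
            target -= tree[nxt]
        k >>= 1
    return pos + 1

tree = [0] * 5  # a[1..4]
for i, v in enumerate([1, 2, 3, 4], start=1):
    bit_update(tree, i, v)
# prefix sums are 1, 3, 6, 10
print(bit_lower_bound(tree, 6))   # -> 3
print(bit_lower_bound(tree, 11))  # -> 5, i.e., n + 1: no prefix reaches 11
```

Note how this probes one candidate bit per level, just like a binary search over prefix sums, without ever materializing them.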
Implementation details
Given a range \((0..n]\), it's fairly easy to implement the range query by iteratively extracting and removing the least significant bit from \(n\). Recall that modern computer systems use the two's complement representation for integers, so it's easy to see that \(2^{\text{lsb}(n)} = n \& (-n)\).
Let \(tree[i]\) denote the query value computed on the range \((i-(i\&-i)..i]\) (we'll get to how to update these \(tree[1..n]\) later). In the code below, \(tree[i]\) stores the sum over its subrange so that \(query(i)\) returns \(sum(a[1..i])\), as an example of prefix sum queries; but the definition of \(tree[i]\) is really problem-specific, and we can adapt the logic to other kinds of queries like \(\text{xor}, \text{min}, \text{max}\), etc. where they fit. Be creative!
Overly commented code snippet:
```python
def query(i: int) -> int:
    '''Returns sum(a[1..i]), 1 <= i <= n.'''
    ans = 0
    while i:
        # replace += with other combination ops for specific problems at hand
        ans += tree[i]
        # remove lsb to traverse to the next subrange
        i -= (i & -i)
    return ans
```
Generally, BIT only supports prefix range queries, but for specific cases such as sum or xor, where the combine operation is invertible, we can answer queries on any range:
- Range sum: \(sum(a[L..R]) = query(R) - query(L-1)\)
- Range XOR: \(\text{xor}(a[L..R]) = query(R)~\text{xor}~ query(L-1)\)
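Putting the pieces together, here is a minimal self-contained sketch of the range-sum trick (the query/update logic is repeated for completeness, and `range_sum` is a name of my own choosing):

```python
def update(tree, i, v):
    """Point update a[i] += v."""
    while i < len(tree):
        tree[i] += v
        i += i & -i

def query(tree, i):
    """Prefix sum sum(a[1..i])."""
    ans = 0
    while i:
        ans += tree[i]
        i -= i & -i
    return ans

def range_sum(tree, l, r):
    """sum(a[l..r]) as the difference of two prefix queries."""
    return query(tree, r) - query(tree, l - 1)

a = [5, 1, 4, 2, 3]
tree = [0] * (len(a) + 1)
for i, v in enumerate(a, start=1):
    update(tree, i, v)
print(range_sum(tree, 2, 4))  # -> 1 + 4 + 2 = 7
```

For xor, the same shape works with `+=`/`-` replaced by `^=`/`^`, since xor is its own inverse.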
Update relevant ranges
We now get to the point update operation, \(u(a[i],v)\). As mentioned above, \(tree[i]\) stores the current value computed on its corresponding range \((i-2^{\text{lsb}(i)}..i]\). In other words, \(tree[i]\) is responsible for the range \((i-2^{\text{lsb}(i)}..i]\). Thus an update to any element in that range needs to "inform" and update \(tree[i]\). Flipping it around, for an update \(u(a[i], v)\), we need to update all relevant ranges to which \(a[i]\) belongs!
Denote \(R_j = (j-2^{\text{lsb}(j)}..j]\),
Claim 1: \(R_i, R_{i+2^{\text{lsb}(i)}}, \ldots\) are the only ranges to which \(a[i]\) belongs.
For example, given the array \(a[1..8]\), which ranges cover \(a[3]\)?
- \(R_3 = (2..3]\), \(R_4 = (0..4]\), \(R_8 = (0..8]\)
Not convinced yet? Here is a proof sketch, which also gives some insights into the BIT structure.
Lemma 1: \(R_i \subset R_{i+2^{\text{lsb}(i)}}\)
Proof:
- \(2^{\text{lsb}(i)} > 0 \Rightarrow j = i+2^{\text{lsb}(i)} > i\)
- \(\text{lsb}(j) \geq \text{lsb}(i) +1\) (adding \(2^{\text{lsb}(i)}\) clears bit \(\text{lsb}(i)\) and carries upward) \(\Rightarrow 2^{\text{lsb}(j)} \geq 2\cdot2^{\text{lsb}(i)} \Rightarrow j-2^{\text{lsb}(j)} \leq j-2\cdot2^{\text{lsb}(i)} = i+2^{\text{lsb}(i)}-2\cdot2^{\text{lsb}(i)} = i-2^{\text{lsb}(i)}\)
Lemma 1 shows that \(a[i]\) belongs to \(R_i, R_{i+2^{\text{lsb}(i)}}, \ldots\). Next, let's see why these ranges are the only ones covering \(a[i]\).
Lemma 2: For all \(j\) such that \(i < j < i+2^{\text{lsb}(i)}\), \(i \notin R_j\)
Proof:
Note that we only need to consider even \(i\), since for odd \(i\), \(i+2^{\text{lsb}(i)} = i+1\), and there is no integer \(j \in (i,i+1)\).
- \(j\) is odd, clearly \(i \notin R_j = (j-1,j]\) since \(i \leq j-1\).
- \(j\) is even: \(k = j-i\) is even, and \(k < 2^{\text{lsb}(i)}\), thus \(2^{\text{lsb}(k)} \leq k < 2^{\text{lsb}(i)}\). Since the lowest set bit of \(i\) sits above that of \(k\), \(\text{lsb}(j) = \text{lsb}(i+k) = \text{lsb}(k)\). Hence, \(j-2^{\text{lsb}(j)} = j-2^{\text{lsb}(k)} \geq j-k = i\), i.e., \(i \notin R_j = (j-2^{\text{lsb}(j)}..j]\)
Now apply Lemma 2 to \(i' = i+2^{\text{lsb}(i)}\): \(\forall j \in (i', i' + 2^{\text{lsb}(i')})\), \(i' \notin R_j\), and it follows that \(i \notin R_j\) since \(i < i'\). Repeating this process, we can prove the claim that \(R_i, R_{i+2^{\text{lsb}(i)}}, \ldots\) are the only ranges covering \(a[i]\) (QED)!
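Claim 1 is also easy to sanity-check in code: enumerate the covering ranges by repeatedly adding the lsb, and compare against a brute-force scan over all ranges (a small self-check sketch with names of my own choosing, not part of the data structure itself):

```python
def covering_ranges(i, n):
    """Indices j with a[i] in R_j, traversed by repeatedly adding the lsb."""
    js = []
    while i <= n:
        js.append(i)
        i += i & -i
    return js

def covering_ranges_brute(i, n):
    """Brute force: all j whose range R_j = (j - (j & -j) .. j] contains i."""
    return [j for j in range(1, n + 1) if j - (j & -j) < i <= j]

n = 8
for i in range(1, n + 1):
    assert covering_ranges(i, n) == covering_ranges_brute(i, n)
print(covering_ranges(3, 8))  # -> [3, 4, 8], matching the example above
```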
Implementation details
With the above claim, it's trivial to implement the update. As an example, here is the snippet for the increment update \(a[i] \leftarrow a[i] + v\):
```python
def update(i: int, v: int) -> None:
    '''Increments a[i] by v, updating all ranges covering a[i].'''
    while i <= n:
        tree[i] += v
        # add lsb to move to the next (larger) range covering a[i]
        i += (i & -i)
```
Claim 2: There are at most \(O(\log n)\) ranges covering \(a[i]\), for \(1 \leq i \leq n\).
The while loop continues until \(i\) goes beyond \(n\). Every iteration updates \(i \leftarrow i + 2^{\text{lsb}(i)}\), i.e., it moves the LSB at least one position to the left. Thus there can be at most \(O(\log n)\) iterations, and the update op takes \(O(\log n)\) time, assuming each increment is \(O(1)\).
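As a final sanity check, the query and update snippets can be cross-validated against a naive array under random operations (a test-harness sketch with names of my own choosing):

```python
import random

def bit_update(tree, i, v):
    while i < len(tree):
        tree[i] += v
        i += i & -i

def bit_query(tree, i):
    ans = 0
    while i:
        ans += tree[i]
        i -= i & -i
    return ans

random.seed(42)
n = 64
a = [0] * (n + 1)        # naive reference array, 1-indexed
tree = [0] * (n + 1)
for _ in range(1000):
    if random.random() < 0.5:
        i, v = random.randint(1, n), random.randint(-10, 10)
        a[i] += v
        bit_update(tree, i, v)
    else:
        i = random.randint(1, n)
        assert bit_query(tree, i) == sum(a[1:i + 1])
print("all checks passed")
```

This kind of randomized cross-check against a brute-force reference is a cheap way to gain confidence in any BIT variant before relying on it.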
Recap
In this part, we explored the key ideas of BIT:
- Split a prefix range into disjoint subranges based on binary expansion for efficient range queries
- For each point update, update all the ranges covering the point; this is the tradeoff that speeds up range queries
- There is no single best solution for all use cases; BIT shines for large inputs with a lot of range queries
References
- https://cp-algorithms.com/data_structures/fenwick.html
- Topcoder's excellent tutorial