I have some length $L$ binary string consisting of an ordered array of bits: $(b_1, b_2, ..., b_{L})$, where all bit values $b_i$ are originally set to zero. I'd like a particular set of $n$ bits to hold the value $1$. For example, if $L = 5$, starting from $00000$ we might want to reach the state $01101$ where a specific set of $n = 3$ bits have the value $1$. However, I can only do two things:

[1] Choose a bit in the string at position $i$ where $b_i = 0$, with uniform random probability over all bits in the string with this value, and set $b_i = 1$. To be clear, here, if there are nine bits with the value $0$ and one bit with the value $1$, a particular bit with the value $0$ would be selected for a $0 \to 1$ transition with probability $\frac{1}{9}$.

[2] Choose a bit in the string at position $i$ where $b_i = 1$, with uniform random probability over all bits in the string with this value, and set $b_i = 0$.

What distribution do we have for the length of time until we reach the state where at least $k \leq L$ of the ordered array of bits have the desired $0$ and $1$ value ($k = L$ implying that only the aforementioned $n$ positions have the value $1$)?

What if we make it increasingly likely for us to select the previously flipped bit in [2], for example, if the immediately preceding $0 \to 1$ transition had a probability $r$ times greater of being reversed in [2] than any other bit with the value $1$ being flipped to the value $0$ (implying that the immediately previously flipped bit has a $1 \to 0$ transition probability during [2] of $Prob(1 \to 0) = \frac{r}{(q+r)}$ if there are $q$ OTHER $1$ bits that can be flipped to $0$ with probability $\frac{1}{(q+r)})$? As $r \to \infty$, one would imagine that it would be possible to quickly drift any binary string from some initial state to another desired state.

Alternative Question Formulation:

There's another fun way to ask this question. Given that the set of single bit flip transitions for a length $L$ binary string can be represented with a graph $G$ having the topology of an $L$-dimensional hypercube, draw this hypercube and replace each edge with a pair of directed RED and BLUE edges (that can only be traversed in a single direction). RED directed edges imply a $0 \to 1$ bit flip, and BLUE directed edges imply a $1 \to 0$ bit flip.

Starting one vertex $v_i$, our task is to reach another vertex $v_j$ in the fewest number of hops. However, we can only decide to select a RED or BLUE edge (pointing in the correct direction to allow traversal) at any given point in time, and then we need to pick one of the edges of the desired color randomly with uniform probability.

In terms of the "bias" in the original posting, we can say that we're $r$ times more likely to move back to the previously occupied vertex than any other vertex if we previously traversed a RED directed edge and now decide to traverse a BLUE directed EDGE.

Update - I wrote "...What distribution do we have for the length of time until we reach the state where at least $k \leq L$ of the ordered array of bits have the desired $0$ and $1$ value..." in the original posting. However, this could unnecessarily complicate things. So please feel free to set $k = L$. Also feel free to assume that the optimal "strategy" here is to immediately switch to process [2] when the wrong bit has its value flipped from $0$ to $1$.

Update 2 - Intuitively, letting $m_0$ be the number of $0$ bits and $m_1$ be the number of $1$ bits, we can probably guess that the length of time to reach the target string will probably have some dependency on $\delta = || m_1 - m_0 ||$, where smaller values of $\delta$ (i.e. where $m_0 \approx m_1$) imply longer average times to reach the desired string target structure. This "seems" like it might be true because ${L}\choose{m_0}$ and ${L}\choose{m_1}$ are simultaneously maximized at this limit.

Simulations (!!! where we immediately try to flip back "incorrect" $1$ bits by applying procedure [2] for as long as necessary to do so) appear to back up this intuition. For a length $L = 10$ binary string where the target string has $k$ bits with value $1$ (where exactly these bits are in the string should be irrelevant), and performing $10^4$ trials:

$k=0$ trivially implies a {mean, median} $= (\mu, \mu_{1/2}) = (0, 0)$

$k=1$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (19.3202, 13)$

$k=2$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (66.2872, 46)$

$k=3$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (152.303, 107)$

$k=4$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (258.273, 180)$

$k=5$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (333.897, 237)$

$k=6$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (321.758, 226)$

$k=7$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (238.742, 167)$

$k=8$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (130.412, 94)$

$k=9$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (50.1086, 37)$

$k=10$ trivially implies a {mean, median} $= (\mu, \mu_{1/2}) = (L, L) = (10, 10)$

However, we also see see that the length of the string $L$ and $\delta = || m_1 - m_0 ||$ do not provide enough information to predict waiting times, specifically that for the same value of $\delta$, $m_1 > m_0$ implies a larger number of necessary bit flips to reach the target string. I suppose this makes sense given that the initial string is populated with only $0$-valued bits.

I see no reason why these relationships wouldn't hold with the "previous flipped bit" selection bias discussed at the end of the original question posting, though I could obviously be wrong about this.

Let's add some $L = 20$ values for the unbiased case (we're only doing $10^3$ trials here because these trials take a lot longer):

$L=20$ and $k=9$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (118,487.382, 82,308)$

$L=20$ and $k=10$ yields a {mean, median} $= (\mu, \mu_{1/2}) = (127,382.07, 90,858)$

As Will Sawin nicely points out, for $k = 11$, we could quickly drift our string to the state where we have all-$1$ bit values, and then achieve roughly the same number of expected bit flips (in the unbiased case!) as if we had $k = 9$.

15more comments