(Please use monospaced font to read)
Example:
i: 0 1 2 3 4 5 6 7 8 9
S0: a a b x c a b x c b
T0: a b c b
expected output: [5,10)
n=len(S0), m=len(T0)
Collect indices in S0: O(n) time
a: 0 1 5
b: 2 6 9
c: 4 8
I wanted to use dynamic programming,
but I did not find any optimal substructure.
So I chose to backtrack on these indices instead.
T0: a b c b
i: 0 1 2 3 4 5 6 7 8 9
S0: a a b x c a b x c b
S1: a _ b _ c _ b _ _ _ -> [0,7)
S2: a _ b _ c _ _ _ _ b -> [0,10)
S3: a _ b _ _ _ _ _ c b -> [0,10)
S4: a _ _ _ _ _ b _ c b -> [0,10)
S5: _ a b _ c _ b _ _ _ -> [1,7)
S6: _ a b _ c _ _ _ _ b -> [1,10)
S7: _ a b _ _ _ _ _ c b -> [1,10)
S8: _ a _ _ _ _ b _ c b -> [1,10)
S9: _ _ _ _ _ a b _ c b -> [5,10) best
Compare across S1..S9 we pick S9.
But do we really have to do this comparison at this final stage?
No. Note that in solutions S1..S4 where the index of 'a' is 0,
when solution S1 is found, we don't need to explore S2..S4 at all.
Similarly, in solutions S5..S8 where the index of 'a' is 1,
once S5 is found, we don't need to explore S6..S8 any more.
The greedy principle applies here.
So in the end, we only need to compare S1, S5, S9.
i: 0 1 2 3 4 5 6 7 8 9
S1: a _ b _ c _ b _ _ _ -> [0,7)
S5: _ a b _ c _ b _ _ _ -> [1,7)
S9: _ _ _ _ _ a b _ c b -> [5,10) best
def greedy (s_indices, s, t, si_begin, tj):
if tj >= len(t):
return si_begin
for si in s_indices[t[tj]]:
if si >= si_begin:
return greedy (s_indices, s, t, si+1, tj+1)
def min_win_subs (s, t):
tset = set(t)
s_indices = dict()
for i, c in enumerate(s):
if c in tset:
s_indices.setdefault(c, []).append(i)
min_win_size = 999999
for si_begin in s_indices[t[0]]:
si_end = greedy (s_indices, s, t, si_begin+1, 1)
if si_end is None: continue
if si_end - si_begin < min_win_size:
si_min_begin = si_begin
si_min_end = si_end
min_win_size = si_end - si_begin
return si_min_begin, si_min_end
if __name__ == '__main__':
print(min_win_subs('aabxcabxcb', 'abcb')) # [5,10)