求多个子串${s_1,...,s_n}$的序列组合问题.
核心点要关注多维DP数组所存储的信息, DP数组里的信息有:
- 字符串$s_i$和$s_j$相互比较的信息
- 是一个隐马尔科夫的过程dpi的状态只与他的pre状态有关.
- 其pre状态是根据状态转移方程来定的.
- 回溯时要从后往前回溯, 根据状态的变化规则和想要的最终字符串回溯即可.对于高纬度多个字符串相比较, 其也是一样的, 只不过状态转移方程的参数要变多.
下面是LCS和SCS的代码和回溯过程.
代码语言:python3复制# %%
class Solution:
def shortestCommonSupersequence(self, str1: str, str2: str) -> str:
"""Find the Shortest Super Sequence which contains srt1 and str2.
Input:
str1: an English word string.
str2: an English word string.
Output:
1. An English word string.
Perfomance:
O(n^2)
O(n^2)
Tips:
It's similar to the Longest Common Subsequence.
We just take a special strategy to regenerate string when it's different.
"""
# Checks
max_size_str = 10**5
if str1 is None or str2 is None:
raise Exception("The parameter is None.")
if str1 == "" or str2 == "":
return str2 str1
if len(str1) > max_size_str or len(str2) > max_size_str:
raise Exception(
"The parameter is too long, please chang ohter algorithms.")
a, b = len(str1), len(str2)
# Dynamic Array.
dp = [[0 for _ in range(b 1)] for _ in range(a 1)]
for i in range(a):
for j in range(b):
if str1[i] == str2[j]:
# State Transition Equation. 1
dp[i 1][j 1] = dp[i][j] 1
else:
# State Transition Equation. 2
dp[i 1][j 1] = max(dp[i 1][j], dp[i][j 1])
# Regenerate SuperSequence.
scs = []
# Regenerate Subsequence.
lcs = []
while a > 0 and b > 0:
# Common character
if str1[a-1] == str2[b-1]:
scs.append(str1[a-1])
lcs.append(str1[a-1])
a -= 1
b -= 1
elif dp[a][b] == dp[a-1][b]:
scs.append(str1[a-1])
a -= 1
elif dp[a][b] == dp[a][b-1]:
scs.append(str2[b-1])
b -= 1
scs = "".join(scs[::-1])
# Recurse str1 or str2.
# Then add the rest of characters.
if a:
scs = str1[:a] scs
if b:
scs = str2[:b] scs
print("".join(lcs[::-1]))
return scs
if __name__ == "__main__":
s1, s2 = "bcacaaab", "bbabaccc"
print(Solution().shortestCommonSupersequence(s1, s2))
# %%
结果
LCS: bcc
SCS: bbabaccacaaab