If we can mark the end of each word, we can decode. To mark the end of a word, we cannot use a character from the alphabet because we would not know if it is a marker or part of a word.
Characters as numbers
The ord() of a character is a non-negative integer. So, if we encode the ord()‘s, then -1 could serve as a marker for word ends.
Say total character count is .
Time: , space:
.
class Codec:
def encode(self, strs: List[str]) -> str:
"""Encodes a list of strings to a single string.
"""
encoded = []
for s in strs:
for c in s:
encoded.append(ord(c))
encoded.append(-1)
return ",".join(str(x) for x in encoded)
def decode(self, s: str) -> List[str]:
"""Decodes a single string to a list of strings.
"""
decoded = []
curr = []
letters = s.split(",")
for l in letters:
if l == "-1":
decoded.append("".join(curr))
curr = []
continue
curr.append(chr(int(l)))
return decoded
# Your Codec object will be instantiated and called as such:
# codec = Codec()
# codec.decode(codec.encode(strs))
Append word length
We can append the length of each word in the beginning followed by some marker like “|”. So, the encoded string looks like: . As soon as we see “|”, we know where the next word begins and ends. After processing the word, we have the same pattern again.
class Codec:
def encode(self, strs: List[str]) -> str:
"""Encodes a list of strings to a single string.
"""
encoded = []
for s in strs:
encoded.append(f"{len(s)}|")
encoded.append(s)
return "".join(encoded)
def decode(self, s: str) -> List[str]:
"""Decodes a single string to a list of strings.
"""
decoded = []
# We have repeated pattern like: n1|word1
# `begin` marks the start of pattern
i = begin = 0
while i < len(s):
if s[i] != "|":
i += 1
continue
word_len = int(s[begin:i])
word_start = i+1
word_end = word_start+word_len
decoded.append( s[word_start : word_end] )
i = begin = word_end
return decoded
# Your Codec object will be instantiated and called as such:
# codec = Codec()
# codec.decode(codec.encode(strs))
Leave a comment