LeetCode 244: Shortest Word Distance II

link

There can be multiple instances of both word1 and word2. We need to find the shortest distance among the possible pairs.

List and sliding window

We do not maintain any auxiliary data structure. Using sliding window [begin_word, end_word], where begin_word in {word1, word2} and end_word in {word1, word2} \ {begin_word}, we can find the shortest distance between pairs: (word1, word2) or (word2, word1) in one pass through the wordDict.

Say, len(wordDict) = n and the longest word has length m.

OperationTimeSpace
__init__\mathcal{O}(1)\mathcal{O}(1)
shortest\mathcal{O}(n \cdot m)\mathcal{O}(m)
class WordDistance:

    def __init__(self, wordsDict: List[str]):
        self.words = wordsDict

    def shortest(self, word1: str, word2: str) -> int:
        min_dist = float("inf")
        begin = 0
        begin_word = None
        for end, end_word in enumerate(self.words):
            if not (end_word == word1 or end_word == word2):
                continue

            if not begin_word or end_word == begin_word:
                begin_word = end_word
                begin = end
                continue

            min_dist = min(min_dist, end - begin)
            begin_word = end_word
            begin = end

        return min_dist


# Your WordDistance object will be instantiated and called as such:
# obj = WordDistance(wordsDict)
# param_1 = obj.shortest(word1,word2)

Word index list and binary search

Since we are interested in the indices where word1 or word2 appear, we can collect the indices in the beginning and the index list of each word will be sorted. So, we can binary search each index from the shorter list in the longer list and find the candidate indices.

Say the most frequent word appears k times in wordDict.

OperationTimeSpace
__init__\mathcal{O}(n \cdot m)\mathcal{O}(n \cdot m)
shortest\mathcal{O}(k \cdot \lg{k})\mathcal{O}(1)
class WordDistance:

    def __init__(self, wordsDict: List[str]):
        self.word_index = {}
        for i, w in enumerate(wordsDict):
            if w not in self.word_index:
                self.word_index[w] = []
            self.word_index[w].append(i)

    def __find_insert_index(self, x, nums):
        lo, hi = 0, len(nums) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if nums[mid] < x:
                lo = mid + 1
            else:
                hi = mid - 1

        return lo

    def shortest(self, word1: str, word2: str) -> int:
        min_dist = float("inf")
        index_list1 = self.word_index[word1]
        index_list2 = self.word_index[word2]
        short_list, long_list = (
            (index_list1, index_list2)
            if len(index_list1) <= len(index_list2)
            else (index_list2, index_list1)
        )
        n = len(long_list)
        for i in short_list:
            j = self.__find_insert_index(i, long_list)
            at_dist = abs(long_list[j] - i) if j != n else abs(long_list[n - 1] - i)
            left_dist = abs(long_list[j - 1] - i) if j != 0 else abs(long_list[0] - i)
            min_dist = min(min_dist, left_dist, at_dist)

        return min_dist


# Your WordDistance object will be instantiated and called as such:
# obj = WordDistance(wordsDict)
# param_1 = obj.shortest(word1,word2)

Like merge in merge-sort

In merge of merge-sort, a line sweep (two pointers) through the two sorted sublists put all elements in sorted order. We can follow similar approach to traverse through the closest pair of indices.

OperationTimeSpace
__init__\mathcal{O}(n \cdot m)\mathcal{O}(n \cdot m)
shortest\mathcal{O}(k)\mathcal{O}(1)
class WordDistance:

    def __init__(self, wordsDict: List[str]):
        self.word_index = {}
        for i, w in enumerate(wordsDict):
            if w not in self.word_index:
                self.word_index[w] = []
            self.word_index[w].append(i)

    def shortest(self, word1: str, word2: str) -> int:
        min_dist = float("inf")
        index_list1 = self.word_index[word1]
        index_list2 = self.word_index[word2]
        i, j = 0, 0
        while i < len(index_list1) and j < len(index_list2):
            min_dist = min( min_dist, abs(index_list1[i] - index_list2[j]) )
            if index_list1[i] <= index_list2[j]:
                i += 1
            else:
                j += 1
        
        return min_dist


# Your WordDistance object will be instantiated and called as such:
# obj = WordDistance(wordsDict)
# param_1 = obj.shortest(word1,word2)

Leave a comment