Longest String Chain

Problem Statement

Given an array of string words[], the task is to return the longest string chain. A string chain is defined as a sequence of words where:

Each word (except the first) can be formed by inserting exactly one character into the previous word.

The first word of the chain can be any word from the words[] array.

The task is to determine the length of the longest chain.

Examples

Example 1:

Input: words = ["a", "ab", "abc", "abcd", "abcde"]
Output: 5

Explanation: The longest chaing is ["a", "ab", "abc", "abcd", "abcde"]
Each word in the chain is formed by adding exactly one character to the previous word.

Example 2:

Input: words = ["dog", "dogs", "dots", "dot", "d", "do"]
Output: 4

Explanation: The longest chain is ["d", "do", "dot", "dots"].
Each word is formed by inserting one character into the previous word.

Different Approaches

1️⃣ Recursion

A naive approach to solve this problem is to generate all the possible chains of string (using Power Set or Recursion method), and then store the longest among them. Clearly, this approach will not be a useful approach for the given problem as it will end up taking O(2^N), where N is the number of elements in the given array, i.e., exponential time complexity.

Intuition:

The idea is to generate all possible increasing subsequences and find the length of the longest among them. As we need to find all possible subsequences, recusion can be used to solve the problem.

Approach:

Steps to form the recursive solution:

Express the problem in terms of indices: Let us define a function f(ind) that represents the length of the Longest Increasing Subsequence (LIS) starting from ind. Ultimately, the result will be the f(0).
The problem is now reduced to calculating f(0).
Try all possible choices for each index: At each index ind, we consider whether the current element should extend a previous subsequence or not. This gives an idea to introduce another parameter prev_ind in the function such that the following two possibilities arise:
1. Not Take case (If arr[prev_ind] >= arr[ind]): In such case, the current element cannot be taken into consideration because the goal is to form an increasing subsequence. Since the element is not taken, the prev_ind remains the same and the call simply jumps to the next index i.e., f(ind+1, prev_ind).
2. Take case (If arr[prev_ind] < arr[ind]): In such case, the current element can be added to the previously considered subsequence (ending at prev_ind) thus increasing the length of LIS by 1 and the call simply jumps to func(ind + 1, ind).
  1. Note that, the prev_ind is updated to ind, as now the ind becomes the last selected index for the further subsequence.
3. Initial case when the subsequence is empty: Initially, when the function call begins, there is no element chosen for the subsequence. To mark this, prev_ind can be set as -1 representing that the current element will be the first element that will be taken into consideration.
  1. Hence, to get the answer, the initial function call must be f(0, -1).
Return the longest length of LIS found:After the take and notTake calls, the longest length of the subsequence found can be returned as the answer to the current function call.

/*It is pseudocode  and it is not tied to 
any specific programming  language*/

/* Function to return the length of LIS starting from 
index i with the element prev_ind element refers to
the previously selected element in the subsequence*/
func(int i, int prev_ind, int arr[]) {
    
    // Not Take case
    int notTake = func(i+1, prev_ind, arr);
    
    // Take case
    int take = 0;
    
    // If no element is chosen till now
    if(prev_ind == -1)
        take = func(i+1, i, arr) + 1;
    
    /* Else the current element can be 
    taken if it is strictly increasing */
    else if(arr[i] > arr[prev_ind])
        take = func(i+1, i, arr) + 1;
    
    // Return the maximum length obtained from both cases
    return max(take, notTake);
}

Base cases:

When the function call reaches the last index, the base case will be called and there will be two cases:

Take case: If the last index element is greater than the prev_ind element in the subsequnce or if the last index element is the first element in the subsequence, it can be taken in the subsequence and 1 can be returned.
Not Take case: Otherwise, the last index element can not be considered and 0 can be returned.

Code:

class Solution {
public:
    // Helper function to check if `shorter` can be transformed into `longer`
    // by adding exactly one character
    bool isValidPredecessor(string& longer, string& shorter) {
        if (longer.size() != shorter.size() + 1) return false;

        int i = 0, j = 0;
        while (i < longer.size()) {
            if (j < shorter.size() && longer[i] == shorter[j]) {
                i++;
                j++;
            } else {
                i++;
            }
        }
        return j == shorter.size();
    }

    // Recursive function to try forming the longest string chain
    // `currIndex` is the current word we are checking
    // `prevIndex` is the index of the previous word we took into the chain
    int findLongestChain(vector<string>& words, int currIndex, int prevIndex) {
        // Base case: if we have checked all words
        if (currIndex == words.size()) return 0;

        int take = 0;

        // If no word has been picked yet OR the current word can follow the previous word
        if (prevIndex == -1 || isValidPredecessor(words[currIndex], words[prevIndex])) {
            // Option to take the current word in the chain
            take = 1 + findLongestChain(words, currIndex + 1, currIndex);
        }

        // Option to skip the current word
        int skip = findLongestChain(words, currIndex + 1, prevIndex);

        // Return the maximum of both choices
        return max(take, skip);
    }

    int longestStringChain(vector<string>& words) {
        // Step 1: Sort the words based on length (shortest to longest)
        sort(words.begin(), words.end(), [](const string& a, const string& b) {
            return a.size() < b.size();
        });

        // Step 2: Use recursive function to find the longest chain
        return findLongestChain(words, 0, -1);
    }
};

Complexity Analysis:

Time Complexity: Exponential O(2^n) in worst case.
- In each function call, there is two recursive calls are made (take and skip) resulting in exponential time complexity.
Space Complexity:O(n) for recursion stack depth.

2️⃣ Memoization (Top-Down Approach)

If you draw the recursion tree, you will see that there are overlapping subproblems. Hence the DP approache can be applied to the recursive solution to optimise it.

Steps to convert the recursive solution to the Memoization solution:

Declare a dp array of size [n][n]: There are two states of dp which are:
1. ind: which can go from 0 to n-1, requiring a total of n indices.
2. prev_ind: which can go from -1 to n-2, requiring a total of (n-2 -(-1) +1) or n states.
3. Important:
  1. Note that there is no index as -1, hence, to store the state (prev_ind = -1), it is required to do a co-ordinate shift. We can map the -1 index to 0, 0 index to 1, and so on.…
  2. Hence, a 2D dp array is needed where dp[ind][prev_ind+1] stores the result of the function call func(ind, prev_ind). The dp array stores the calculations of subproblems hence it is initialized with -1, to indicate that no subproblem has been solved yet.
While encountering an overlapping subproblem: When we come across a subproblem, for which the dp array has value other than -1, it means we have found a subproblem which has been solved before hence it is an overlapping subproblem. No need to calculate it's value again just retrieve the value from dp array and return it.
Store the value of subproblem: Whenever, a subproblem is enocountered and it is not solved yet (the value at this index will be -1 in the dp array), store the calculated value of subproblem in the array before returning the function call.

Note

Note that submitting the following code will throw a runtime error because of the tight constraints set for the problem. N can go upto 10^5 and declaring a 2D dp of size N*N will require 1010 space causing Memory Limit Exceeded resulting in runtime error.

Code:

class Solution {
public:
    // Check if 'shorter' can become 'longer' by adding one character
    bool isValidPredecessor(string& longer, string& shorter) {
        if (longer.size() != shorter.size() + 1) return false;

        int i = 0, j = 0;
        while (i < longer.size()) {
            if (j < shorter.size() && longer[i] == shorter[j]) {
                i++;
                j++;
            } else {
                i++;
            }
        }
        return j == shorter.size();
    }

    // Recursive function with memoization
    int findLongestChain(int currIndex, int prevIndex, vector<string>& words, vector<vector<int>>& dp) {
        if (currIndex == words.size()) return 0;

        // Return memoized result if already calculated
        if (dp[currIndex][prevIndex + 1] != -1) return dp[currIndex][prevIndex + 1];

        int take = 0;
        if (prevIndex == -1 || isValidPredecessor(words[currIndex], words[prevIndex])) {
            take = 1 + findLongestChain(currIndex + 1, currIndex, words, dp);
        }

        int skip = findLongestChain(currIndex + 1, prevIndex, words, dp);

        return dp[currIndex][prevIndex + 1] = max(take, skip);
    }

    int longestStringChain(vector<string>& words) {
        // Step 1: Sort by word length (shortest to longest)
        sort(words.begin(), words.end(), [](const string& a, const string& b) {
            return a.size() < b.size();
        });

        int n = words.size();
        // Step 2: Initialize memoization table
        // Use prevIndex + 1 to handle -1 case (i.e., not picked any word yet)
        vector<vector<int>> dp(n, vector<int>(n + 1, -1));

        // Step 3: Compute and return the result
        return findLongestChain(0, -1, words, dp);
    }
};

Complexity Analysis:

Time Complexity:O(N^2), where N is the size of the array
- Because there will be a total of N*N states for which the function will be called.
Space Complexity:O(N^2), because of the 2D dp array created taking O(N^2) space and O(N) recursive stack space.

3️⃣ Tabulation (Bottom-Up Approach)

Code:

class Solution {
public:
    // Helper to check if word2 is a predecessor of word1 (word1 is longer by one character)
    bool isValidPredecessor(const string& word1, const string& word2) {
        if (word1.size() != word2.size() + 1) return false;

        int i = 0, j = 0;
        while (i < word1.size()) {
            if (j < word2.size() && word1[i] == word2[j]) {
                i++;
                j++;
            } else {
                i++; // skip the extra character from the longer word
            }
        }
        return j == word2.size();
    }

    int longestStringChain(vector<string>& words) {
        // Step 1: Sort words by length
        sort(words.begin(), words.end(), [](const string& a, const string& b) {
            return a.size() < b.size();
        });

        int n = words.size();
        vector<int> dp(n, 1); // dp[i] = longest chain ending at i
        int maxLength = 1;

        // Step 2: Build DP table
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                if (isValidPredecessor(words[i], words[j])) {
                    dp[i] = max(dp[i], dp[j] + 1);
                }
            }
            maxLength = max(maxLength, dp[i]);
        }

        return maxLength;
    }
};

Complexity Analysis:

Time Complexity:O(n^2)
Space Complexity:O(n)

Problem Statement

Examples

Different Approaches

1️⃣ Recursion

Intuition:

Approach:

Code:

Complexity Analysis:

2️⃣ Memoization (Top-Down Approach)

Complexity Analysis:

3️⃣ Tabulation (Bottom-Up Approach)

Code:

Complexity Analysis:

Leave a comment

Popular Posts

Variadic Function Working in C

How Characters are Stored in Memory

Tags

Quick links

Newsletter

🛠️ Settings

Problem Statement

Examples

Different Approaches

1️⃣ Recursion

Intuition:

Approach:

Code:

Complexity Analysis:

2️⃣ Memoization (Top-Down Approach)

Complexity Analysis:

3️⃣ Tabulation (Bottom-Up Approach)

Code:

Complexity Analysis:

Leave a comment

Variadic Function Working in C

How Characters are Stored in Memory