Solving the 0/1 Knapsack Problem

Explore how to solve the 0/1 knapsack problem by maximizing item value within capacity limits. Learn naive recursive and efficient dynamic programming methods including memoization and tabulation. Understand the time and space complexities to optimize your solution effectively.

We'll cover the following...

Statement
Examples
Try it yourself
Solution
- Naive approach
  - Time complexity
  - Space complexity
- Optimized solution using dynamic programming

Statement

Suppose you have a list of weights and corresponding values for n items. You have a knapsack that can carry items up to a specific maximum weight, known as the capacity of the knapsack.

You want to maximize the sum of values of the items in your knapsack while ensuring that the sum of the weights of the items remains less than or equal to the knapsack’s capacity.

If all the combinations exceed the given knapsack’s capacity, then return $0$ .

Note: While adding items in the knapsack, we either add the complete item or don’t add it. Moreover, we can’t add an item again that is already in the bag.

Let’s say you have a knapsack capacity of 5 and a list of items with weights and values as follows:

weights = [1, 2, 3, 5]

values = [10, 5, 4, 8]

There are four ways of storing items in the knapsack, such that the combined weight of stored items is less than or equal to the knapsack’s capacity.

Item of weight 1 and weight 2, with a total value of 15.
Item of weight 1 and weight 3, with a total value of 14.
Item of weight 2 and weight 3, with a total value of 9.
Item of weight 5, with a value of 8.

Though all of the combinations described above are valid, we need to select the one with the maximum value. Hence, we will select items with weights 1 and 2, as they give us the maximum value of 15.

Constraints:

$1 \leq$ capacity $\leq 10^4$
$1 \leq$ values.length $\leq 10^3$
weights.length $==$ values.length
$1 \leq$ values[i] $\leq 10^4$
$1 \leq$ weights[i] $\leq$ capacity

Note: If you clicked the “Submit” button and the code timed out, this means that your solution needs to be optimized in terms of running time.

Hint: Use dynamic programming and see the magic.

Solution

We will first explore the naive recursive solution to this problem and then see how it can be improved using dynamic programming.

Naive approach

A naive approach is to find all combinations of items such that their combined weight is less than the capacity of the knapsack and the total value is maximized.

For example, we have the following list of values and weights with the capacity of $5$ :

values: $[3, 5, 2, 7]$
weights: $[3, 1, 2, 4]$

To find the maximum value, we try all the possible combinations:

$3 + 5$ (total weight $4$ ) $= 8$
$3 + 2$ (total weight $5$ ) $= 5$
$5 + 2$ (total weight $3$ ) $= 7$
$5 + 7$ (total weight $5$ ) $= 12$

The calculation above shows that we need a recursive solution to make all possible combinations. In other words, we divide the problem into sub-problems, and for each item, if its weight is less than the capacity, we check whether we should place it in the knapsack.

Let’s implement the algorithm as discussed above:

Python 3.10.4

def find_knapsack(capacity, weights, values, n):
    #Base case
    if n == 0 or capacity == 0:
        return 0
    #check if the weight of the nth item is less than capacity then
    #either 
    # We include the item and reduce the weight of item from the total weight
    # or 
    # We don't include the item
    if (weights[n-1] <= capacity):
        return max(
            values[n-1] + find_knapsack(capacity-weights[n-1], weights, values, n-1),
            find_knapsack(capacity, weights, values, n-1)
            )
    #Item can't be added in our knapsack 
    #if its weight is greater than the capacity
    else:
        return find_knapsack(capacity, weights, values, n-1)
# Driver code
def main():
    weights = [[1, 2, 3, 5],[3],[2],[3,6,10,7,2], [3,6,10,7,2, 12, 15, 10, 13, 20]]
    values = [[1, 5, 4, 8],[2], [3], [12, 10, 15, 17,13], [12, 10, 15, 17,13, 12, 30, 15, 18, 20]]
    capacity = [6,3,3,10,20]
    n = [4, 1, 1, 5,10]
    # You can uncomment the lines below and check how this recursive solution causes a time-out 
    # weights.append([ 63,  55,  47,  83,  61,  82,   6,  34,   9,  38,   6,  69,  17,
    #     50,   7, 100, 101,   4,  41,  28, 119,  78,  98,  38,  75,  35,
    #      8,  10,  16,  93,  34,  23,  51,  79, 118,  86,  85, 109,  88,
    #     72,  99,  36,  21,  80,  42,  44,  62,   7,  54,   7,   6,   0,
    #     65,  25,  44,  86,  76,  18,  11,  10, 104,  17,  36,  91,  78,
    #     88,  79, 103,   1,   4,  34,  94,  73,  21,   8,   9,  79,  25,
    #    106,  76,  39,  78,   1,  92, 104,  84,  40, 100, 116,  84,  23,
    #     79, 109,  79,  71,  72, 116,  90,  79,  26])
    # values.append([ 35,  47,   8, 103,  83,  71,  11, 107,   9,  34,  41,  54,  73,
    #     72, 108, 100,  46,  27,  79,  98,  49,  63,  41, 116,  57,  86,
    #     51,  47,  88, 118,  65,   0,  64,  11,  45,  47,  36,  50, 114,
    #     90, 105,  55,  93,  12,  73,  96,  50,  27,  36,  97,  12,  21,
    #    107,  34, 106,  37,  84,  38, 110,  60,  34, 104,  92,  56,  94,
    #    109,  81,  17,  24, 106,  50,  68,  90,  73,  46,  99,   5,   5,
    #     22,  27,  58,  24,  20,  80,  37,   1,  16,  39,  26,  32,  12,
    #     47,  22,  28,  50,  95,   6, 105, 101,  20])
    # capacity.append(1000)
    # n.append(100)
    for i in range(len(values)):
        print(i+1, ". We have a knapsack of capacity ",capacity[i], " and we are given the following list of item values and weights:", sep="")
        print("-"*30, sep="")
        print("{:<10}{:<5}{:<5}".format("Weights", "|", "Values"))
        print("-"*30)
        for j in range(len(values[i])):
            print("{:<10}{:<5}{:<5}".format(weights[i][j], "|", values[i][j]))
        result = find_knapsack(capacity[i], weights[i], values[i], n[i])
        print("\nThe maximum we can earn is: ", result, sep="")
        print("-"*100, "\n", sep="")
if __name__ == '__main__':
    main()

0/1 Knapsack using recursion

Note: Please observe that if you include the test case commented out in the driver code, the solution is likely to time out. Try to solve the larger problem using the dynamic programming solutions provided below and see the difference.

Time complexity

The time complexity of the naive approach is $O(2^n)$ , where $n$ is the total number of items. This is because we have two possible choices every time, either to include the item or not.

Space complexity

As the maximum depth of the recursive call tree is $n$ , and each call stores its data on the stack, the space complexity of this approach is $O(n)$ .

Optimized solution using dynamic programming

Now, let’s improve the recursive solution using top-down and bottom-up approaches.

Top-down solution

The top-down solution, commonly known as the memoization technique, is an enhancement of the recursive solution. It overcomes the problem of calculating redundant solutions over and over again by storing them in a table.

In the recursive approach, the two variables that kept changing in each call were the total weight of the knapsack and the number of items we had. So, we’ll use these two variables to define each distinct subproblem. Therefore, we need a 2-D array to store these values and the result of any given subproblem when we encounter it for the first time. At any later time, if we encounter the same sub-problem, we can return the stored result from the table with an $O(1)$ lookup instead of recalculating that subproblem.

Let’s implement the algorithm as discussed above:

Python 3.10.4

def find_knapsack(capacity, weights, values, n):
    dp = [[-1 for i in range(capacity + 1)] for j in range(n + 1)]
    return find_knapsack_value(capacity, weights, values, n, dp)
def find_knapsack_value(capacity, weights, values, n, dp):
    # Base case
    if n == 0 or capacity == 0:
        return 0
    
    #If we have solved it earlier, then return the result from memory
    if dp[n][capacity] != -1:
        return dp[n][capacity]
 
    #Otherwise, we solve it for the new combination and save the results in the memory
    if weights[n-1] <= capacity:
        dp[n][capacity] = max(
            values[n-1] + find_knapsack_value(capacity-weights[n-1], weights, values, n-1, dp),
            find_knapsack_value(capacity, weights, values, n-1, dp)
            )
        return dp[n][capacity]
    dp[n][capacity] = find_knapsack_value(capacity, weights, values, n-1, dp)
    return dp[n][capacity] 
# Driver code
def main():
    weights = [[1, 2, 3, 5],[3],[2],[3,6,10,7,2], [3,6,10,7,2, 12, 15, 10, 13, 20]]
    values = [[1, 5, 4, 8],[2], [3], [12, 10, 15, 17,13], [12, 10, 15, 17,13, 12, 30, 15, 18, 20]]
    capacity = [6,3,3,10,20]
    n = [4, 1, 1, 5,10]
    # Let's uncomment this and check the effect of dynamic programming using memoization
   
    # weights.append([ 63,  55,  47,  83,  61,  82,   6,  34,   9,  38,   6,  69,  17,
    #     50,   7, 100, 101,   4,  41,  28, 119,  78,  98,  38,  75,  35,
    #      8,  10,  16,  93,  34,  23,  51,  79, 118,  86,  85, 109,  88,
    #     72,  99,  36,  21,  80,  42,  44,  62,   7,  54,   7,   6,   0,
    #     65,  25,  44,  86,  76,  18,  11,  10, 104,  17,  36,  91,  78,
    #     88,  79, 103,   1,   4,  34,  94,  73,  21,   8,   9,  79,  25,
    #    106,  76,  39,  78,   1,  92, 104,  84,  40, 100, 116,  84,  23,
    #     79, 109,  79,  71,  72, 116,  90,  79,  26])
    # values.append([ 35,  47,   8, 103,  83,  71,  11, 107,   9,  34,  41,  54,  73,
    #     72, 108, 100,  46,  27,  79,  98,  49,  63,  41, 116,  57,  86,
    #     51,  47,  88, 118,  65,   0,  64,  11,  45,  47,  36,  50, 114,
    #     90, 105,  55,  93,  12,  73,  96,  50,  27,  36,  97,  12,  21,
    #    107,  34, 106,  37,  84,  38, 110,  60,  34, 104,  92,  56,  94,
    #    109,  81,  17,  24, 106,  50,  68,  90,  73,  46,  99,   5,   5,
    #     22,  27,  58,  24,  20,  80,  37,   1,  16,  39,  26,  32,  12,
    #     47,  22,  28,  50,  95,   6, 105, 101,  20])
    # capacity.append(1000)
    # n.append(100)
    for i in range(len(values)):
        print(i+1, ". We have a knapsack of capacity ",capacity[i], " and we are given the following list of item values and weights:", sep="")
        print("-"*30, sep="")
        print("{:<10}{:<5}{:<5}".format("Weights", "|", "Values"))
        print("-"*30)
        for j in range(len(values[i])):
            print("{:<10}{:<5}{:<5}".format(weights[i][j], "|", values[i][j]))
        result = find_knapsack(capacity[i], weights[i], values[i], n[i])
        print("\nThe maximum we can earn is: ", result, sep="")
        print("-"*100, "\n", sep="")
if __name__ == '__main__':
    main()

0/1 Knapsack using memoization

Note: Please observe that if you now uncomment the large test case in the driver code, the dynamic programming solution will still work.

Time complexity

We avoided redundant calculations in this approach by storing all the intermediate results in a 2-D array in $O(1)$ time. Therefore, the time complexity of this approach is reduced to $O(n \times capacity)$ , where $n$ is the number of items.

Space complexity

We now need $O(n \times capacity)$ space in memory to store intermediate values.

Bottom-up solution

The bottom-up solution, also known as the tabulation technique, is an iterative method of solving dynamic programming problems. Here, we will create a 2-D array of size (n + 1) $*$ (capacity + 1). The row indicates the values of the available items, and the column shows the capacity of the knapsack at any given point.

We will initialize the array so that when the row or column is 0, the value in the table will also be 0. Next, we will check if the weight of an item is less than the capacity. If yes, we have two options: either we can add the item to the knapsack, or we can skip it. If we include the item, the optimal solution is the maximum of the two values. Otherwise, if the weight of an item is greater than the capacity, then we don’t include the item in the knapsack.

The algorithm work as follows:

In the first row, we have no items to pick from, so regardless of the knapsack’s capacity, we can only accumulate a value of 0.
In the second row, we can only pick the first item. With a knapsack of capacity 1, we wouldn’t be able to pick this item. If the knapsack capacity is 2, we can pick this item. As the knapsack capacity increased beyond value[0], we couldn’t get any more value because there was only one item to choose from.
In the third row, we have the first two items available to be picked. For a knapsack of capacity c, we can either try to pick the second item or skip it. If we choose to skip an item, the value we can accumulate is the value one can obtain using only the first item and a knapsack of capacity c. This is already known and stored in dp[1][c]. If we decide to pick this item, it will take up a weight of weights[1] in the knapsack while contributing an additional value of values[1]. However, we need to track how much value we can accumulate with the remaining capacity in the knapsack, i.e., capacity - weights[1] while using only the first item. This value is already stored in dp[1][c-weights[1]]. So, the total value, in this case, becomes values[1] + dp[1][c-weights[1]]. Since we’re trying to maximize the value, we’ll pick the more valuable of the two choices, i.e., picking or not picking the second item.
In general, the cell in row i, column j can be filled with the following formula:

dp[i][j] = max(values[i-1]+ dp[i-1][j-weights[i-1]], dp[i-1][j])

Let’s look at the following illustration to get a better understanding of the solution:

Python 3.10.4

def find_knapsack(capacity, weights, values, n):
    #create a table to hold intermediate values
    dp = [[0 for i in range(capacity + 1)] for j in range(n + 1)]
 
    for i in range(len(dp)):
        for j in range(len(dp[0])):
            #initialize the table with 0 when either the row or column is 0
            if i == 0 or j == 0:
                dp[i][j] = 0
            #check if the weight of an item is less than the capacity
            elif weights[i-1] <= j:
                dp[i][j] = max(values[i-1]+ dp[i-1][j-weights[i-1]], 
                              dp[i-1][j]
                            )
            #we don't include the item if the weight is greater than the capacity.                
            else:
                dp[i][j] = dp[i-1][j]
 
    return dp[-1][-1] #[n][capacity]
# Driver code
def main():
    weights = [[1, 2, 3, 5],[3],[2],[3,6,10,7,2], [3,6,10,7,2, 12, 15, 10, 13, 20]]
    values = [[1, 5, 4, 8],[2], [3], [12, 10, 15, 17,13], [12, 10, 15, 17,13, 12, 30, 15, 18, 20]]
    capacity = [6,3,3,10,20]
    n = [4, 1, 1, 5,10]
    # Let's uncomment this to see the benefit of using dynamic programming with tabulation
   
    # weights.append([ 63,  55,  47,  83,  61,  82,   6,  34,   9,  38,   6,  69,  17,
    #     50,   7, 100, 101,   4,  41,  28, 119,  78,  98,  38,  75,  35,
    #      8,  10,  16,  93,  34,  23,  51,  79, 118,  86,  85, 109,  88,
    #     72,  99,  36,  21,  80,  42,  44,  62,   7,  54,   7,   6,   0,
    #     65,  25,  44,  86,  76,  18,  11,  10, 104,  17,  36,  91,  78,
    #     88,  79, 103,   1,   4,  34,  94,  73,  21,   8,   9,  79,  25,
    #    106,  76,  39,  78,   1,  92, 104,  84,  40, 100, 116,  84,  23,
    #     79, 109,  79,  71,  72, 116,  90,  79,  26])
    # values.append([ 35,  47,   8, 103,  83,  71,  11, 107,   9,  34,  41,  54,  73,
    #     72, 108, 100,  46,  27,  79,  98,  49,  63,  41, 116,  57,  86,
    #     51,  47,  88, 118,  65,   0,  64,  11,  45,  47,  36,  50, 114,
    #     90, 105,  55,  93,  12,  73,  96,  50,  27,  36,  97,  12,  21,
    #    107,  34, 106,  37,  84,  38, 110,  60,  34, 104,  92,  56,  94,
    #    109,  81,  17,  24, 106,  50,  68,  90,  73,  46,  99,   5,   5,
    #     22,  27,  58,  24,  20,  80,  37,   1,  16,  39,  26,  32,  12,
    #     47,  22,  28,  50,  95,   6, 105, 101,  20])
    # capacity.append(1000)
    # n.append(100)
    for i in range(len(values)):
        print(i+1, ". We have a knapsack of capacity ",capacity[i], " and we are given the following list of item values and weights:", sep="")
        print("-"*30, sep="")
        print("{:<10}{:<5}{:<5}".format("Weights", "|", "Values"))
        print("-"*30)
        for j in range(len(values[i])):
            print("{:<10}{:<5}{:<5}".format(weights[i][j], "|", values[i][j]))
        result = find_knapsack(capacity[i], weights[i], values[i], n[i])
        print("\nThe maximum we can earn is: ", result, sep="")
        print("-"*100, "\n", sep="")
if __name__ == '__main__':
    main()

0/1 Knapsack using tabulation

Note: Please observe that if you now uncomment the large test case in the driver code, the dynamic programming solution will still work.

Time complexity

We have created a 2-D array to store the results of sub-problems that would be used later on, and filling each row in this table takes $O(1)$ time; therefore, the time complexity of this approach will also be $O(n \times capacity)$ .

Space complexity

We need $O(n \times capacity)$ space in memory to store the intermediate values.

Can we do better?

You must have noticed in the above algorithm that to fill up a row, we only require the previous rows’s values; that is, for filling the row against the $i^{th}$ element, we require the values from the previous row representing the $(i-1)^{th}$ element. Therefore, there is no point in storing all the previous $(i-2)$ rows.

We can further improve this approach by using a 1-D array of length $(capacity + 1)$ instead of creating a table. Next, we update this array for each element using the same calculation which we used earlier.

The time complexity would remain the same, $O(n \times capacity)$ , because we still have to do the calculation for each element. The space complexity, however, reduces to $O(capacity)$ since we are only maintaining an array of size $(capacity + 1)$ .

No.	capacity	weights	Values	n	Maximum Value
1	30	[10, 20, 30]	[22, 33, 44]	3	55
2	5	[1, 2, 3, 5]	[10, 5, 4, 8]	4	15

1.Getting Started

2.0/1 Knapsack

3.Unbounded Knapsack

4.Recursive Numbers

5.Longest Common Substring

6.Palindromic Subsequence

7.Conclusion

Solving the 0/1 Knapsack Problem

Statement

Examples

Try it yourself

Solution

Naive approach

Time complexity

Space complexity

Optimized solution using dynamic programming

Top-down solution

Time complexity

Space complexity

Bottom-up solution

Time complexity

Space complexity

Can we do better?