Skip to main content

The Number of Good Subsets

·4 mins

Link

Problem #

You are given an integer array nums. We call a subset of nums good if its product can be represented as a product of one or more distinct prime numbers.

For example, if nums = [1, 2, 3, 4]:
[2, 3], [1, 2, 3], and [1, 3] are good subsets with products 6 = 2*3, 6 = 2*3, and 3 = 3 respectively.
[1, 4] and [4] are not good subsets with products 4 = 2*2 and 4 = 2*2 respectively.
Return the number of different good subsets in nums modulo 1e9 + 7.

A subset of nums is any array that can be obtained by deleting some (possibly none or all) elements from nums. Two subsets are different if and only if the chosen indices to delete are different.

Constraints #

1 <= nums.length <= 1e5
1 <= nums[i] <= 30

Thoughts #

At the first glance, we should notice its a typical pick or not pick problem and the data range is 30, which is a strong hint that we can use bitmask to represent the status of the prime factors.

Solution #

class Solution:
    def numberOfGoodSubsets(self, nums):
        P = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
        cnt = Counter(nums)
        bm = [sum(1<<i for i, p in enumerate(P) if x % p == 0) for x in range(31)]
        bad = set([4, 8, 9, 12, 16, 18, 20, 24, 25, 27, 28])
        M = 10**9 + 7
        
        @cache
        def dp(mask, num):
            if num == 1: return 1
            ans = dp(mask, num - 1)
            if num not in bad and mask | bm[num] == mask:
                ans += dp(mask ^ bm[num], num - 1) * cnt[num]
            return ans % M

        return ((dp(1023, 30) - 1) * pow(2, cnt[1], M)) % M

Explanation #

  • There are only 10 prime numbers less than 30, so we can use a 10-bit bitmask to represent the status of the prime factors. For example, if a number has prime factors 2 and 3, its bitmask will be 0b11 (1«0 for 2 and 1«1 for 3).
  • We can use a set to store the bad numbers, which are the numbers that have repeated prime factors. For example, 4 has prime factor 2 repeated twice, so it’s a bad number.
  • Since the problem ask for the number of subset, so we don’t care about the order of the numbers, we can use a counter to count the frequency.
  • We can use a cached dp function to go over all possible conditions. For each number and each mask, we can either pick or not pick the number. If we pick the number, we need to check if it’s a bad number and if it has any common prime factor with the current mask. We can use O(1) time by preprocessing the bitmask for each number. If the results of xor operation is the same as the current mask, it means they have no common prime factor, so we can pick the number and update the mask by xor operation.
  • Finally, we need to multiply the result by 2^cnt[1] because for each 1 we can either pick or not pick it, and 1 doesn’t affect the product, so it can be included in any subset without affecting the good property.
  • We start with dp(1023, 30) because 1023 is the bitmask with all 10 bits set to 1, and we start from the largest number 30 and go down to 1. We need to subtract 1 from the result because we don’t want to count the empty subset. Finally, we take the result modulo 1e9 + 7 as required by the problem.

Insights #

  • Using bitmask to represent the status of the prime factors, actually use a tuple or string to represent the status of the prime factors is also fine, but bitmask is more efficient and easier to implement.