To comply with regulation a bank needs to make sure that at least 90% (or some other fixed proportion) of the money it holds is “clean”.
Money is considered clean if it is in a “clean” account. An account is considered “clean” if all the money in it comes from a verifiable source.
The bank can forcefully return money to the clients coming from a non-verifiable source in order to increment the proportion of "clean" money to comply with the regulation. The problem is to determine the minimum amount that the bank needs to return to comply with the regulation.
Account Verif NonVerif Divi Clean NotClean
1 889.77 157.01 5.67 0 1046.78
2 907.88 160.21 5.67 0 1068.09
3 1187.18 209.5 5.67 0 1396.68
4 918.12 162.02 5.67 0 1080.14
5 1721.88 303.86 5.67 0 2025.74
6 828.17 276.05 3.00 0 1104.22
7 1127.6 375.86 3.00 0 1503.46
8 1177.13 392.37 3.00 0 1569.5
9 801.81 267.27 3.00 0 1069.08
10 741.9 247.3 3.00 0 989.2
11 0 1468.11 0.00 0 1468.11
12 0 853.66 0.00 0 853.66
13 2354.81 0 -1.00 2354.81 0
14 2267.1 0 -1.00 2267.1 0
15 2238.3 0 -1.00 2238.3 0
16 2188.66 0 -1.00 2188.66 0
17 2167.85 0 -1.00 2167.85 0
18 2166.1 0 -1.00 2166.1 0
19 2165.59 0 -1.00 2165.59 0
20 2163.84 0 -1.00 2163.84 0
21 2145.43 0 -1.00 2145.43 0
22 2117.76 0 -1.00 2117.76 0
23 1320.26 0 -1.00 1320.26 0
24 1299.99 0 -1.00 1299.99 0
25 1241.02 0 -1.00 1241.02 0
26 1237.36 0 -1.00 1237.36 0
27 1208.74 0 -1.00 1208.74 0
28 1114.58 0 -1.00 1114.58 0
29 1048.63 0 -1.00 1048.63 0
30 1010.92 0 -1.00 1010.92 0
31 971.1 0 -1.00 971.1 0
32 874.95 0 -1.00 874.95 0
33 832.01 0 -1.00 832.01 0
34 825.72 0 -1.00 825.72 0
TOTAL 45262.16 4873.22 34960.72 15174.66
In the example above, total money held by the bank is 34960.72 + 15174.66 = 50135.38; without doing any cleansing, only the 69.7% (34960.72 / 50135.38 = 0.697...) can be considered clean so the bank needs to cleanse to comply with the regulation.
If the bank cleansed the first two accounts then the total money held by the bank would be 50135.38 - 157.01 - 160.21 = 49818.16 and clean money would be 34960.72 + 889.77 + 907.88 = 36758.37; the proportion of clean money would be 36758.37 / 49818.16 = 73.7%.
In the example above, Div=Verif/NonVerif (Verif would be value and NonVerif would be Weight, as to see the items that provide the best ratio to select those); the list in example is sorted by descending Div. Naive approach is to pick which accounts to cleanse in that order until the bank complies with the regulation.
I was thinking of using approach suggested by Avikalp Srivastava here: so Verif (value) would be treated as weight and NonVerif (cost) would be treated as value; the regular knapsack problem-solving approach would then be used to find the maximum cost that can be removed while Verif remains >= 90% * (total money held by the bank); the thing is the total money held by the bank decreases when you add not-clean items to the knapsack because the bank is returning that money to the customer (so the knapsack would become smaller as more items get added to it?). Brute force caused memory overflow for the data shown. I actually tried to solve this for hours without getting any closer to the answer. Maybe knapsack problem-solving is not the correct approach for this (?)
The naive approach will suffice for my purposes, but I still want to find out how to solve it properly.