My goal is to attack Java's HashMap with many strings (10000 or more) that all produce the same hash. I took the script available at this link, translated it to Python 3 (so that I can generate the strings from my terminal), and ran it on my machine to produce 177147 (3^11) strings, all of which yield the same value from String.hashCode(). According to this link, the time complexity should be O(N*N) when N keys with identical hashes are randomly read from and written to the HashMap, so with N greater than 10000 the run should be slow. But surprisingly, it finishes in under 2 s. I want it to take more than 10 s. These are the scripts I am using.
# Python3 script to generate strings that produce
# same hash with Java's String.hashCode() method
def combs(n, arr, prefix, arr_rtn):
    # Recursively build every length-n concatenation of the blocks in arr
    if n == 1:
        for j in range(3):
            arr_rtn[prefix + arr[j]] = 0
    else:
        for j in range(3):
            combs(n - 1, arr, prefix + arr[j], arr_rtn)

arr_src = ['at', 'bU', 'c6']  # each 2-char block has the same Java hash contribution
str_tmp = ''
arr_rtn = dict()
combs(11, arr_src, str_tmp, arr_rtn)
keys = list(arr_rtn.keys())
print(len(keys))
print(*keys, sep='\n')
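As a quick sanity check (not part of the original scripts): the blocks "at", "bU" and "c6" each contribute 3123 to Java's polynomial hash (97*31+116 = 98*31+85 = 99*31+54), so any two 22-character concatenations of 11 such blocks collide. A minimal verification:

```java
public class HashCheck {
    public static void main(String[] args) {
        // All three building blocks share the same hash value
        assert "at".hashCode() == "bU".hashCode();
        assert "bU".hashCode() == "c6".hashCode();

        // Therefore any two same-length concatenations of the blocks collide
        String a = "atatatatatatatatatatat";
        String b = "c6c6c6c6c6c6c6c6c6c6c6";
        System.out.println(a.hashCode() == b.hashCode());  // prints "true"
    }
}
```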
// Java code to insert the generated
// strings into HashMap
import java.util.*;

class Main {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        int n = sc.nextInt();
        HashMap<String, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            String s = sc.next();
            // Should take more time since all strings
            // hash to the same value
            map.put(s, s.hashCode());
        }
        System.out.println(map.size());
        sc.close();
    }
}
My only goal is to attack the HashMap with strings that produce the same hash, so that the program takes at least 20 s to execute.
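A likely explanation for the fast run: since Java 8 (JEP 180), HashMap converts a bucket into a balanced tree once it holds more than 8 colliding entries, and because String implements Comparable the tree can order the keys, so each insert costs roughly O(log n) rather than O(n). If you are free to choose the key type (an assumption; the question uses String), a sketch of a deliberately colliding key that is not Comparable, which prevents the tree bins from ordering entries and pushes per-insert cost back toward O(n):

```java
import java.util.HashMap;

// Hypothetical wrapper key: collides on purpose and is NOT Comparable,
// so HashMap's tree bins cannot order entries by key.
final class BadKey {
    final String s;
    BadKey(String s) { this.s = s; }
    @Override public int hashCode() { return 42; }  // every key collides
    @Override public boolean equals(Object o) {
        return o instanceof BadKey && ((BadKey) o).s.equals(this.s);
    }
}

class SlowInsert {
    public static void main(String[] args) {
        int n = 20000;
        HashMap<BadKey, Integer> map = new HashMap<>();
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            map.put(new BadKey(Integer.toString(i)), i);
        }
        long ms = (System.nanoTime() - t0) / 1_000_000;
        System.out.println(map.size() + " inserts in " + ms + " ms");
    }
}
```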
Comments:
- … sleeps in its compareTo method (or must you use String?)? In other words, 20 seconds is arbitrary. Isn't it better to show that you have reached the point where insertion is slowing down due to the collisions than to hit some random duration?
- … a LinkedList with n elements, and then search for a random element n times, doubling n every test, and it only started to take 20 seconds or more when n = 2^17.
- … n^2 would be sufficient. You can do that by repeatedly measuring the algorithm with increasing problem sizes and analyzing the experimental time growth.
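The suggestion in the comments (demonstrate the growth rate instead of chasing a wall-clock target) can be sketched as follows. The class and method names here are mine, not from the question; the key generator mirrors the Python script, concatenating the blocks "at", "bU", "c6" so all keys of one batch share a hash:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class GrowthBench {
    // Build at least n colliding keys: all same-length concatenations
    // of "at"/"bU"/"c6" share the same String.hashCode() value.
    static List<String> collidingKeys(int n) {
        String[] blocks = {"at", "bU", "c6"};
        List<String> keys = new ArrayList<>(List.of(""));
        while (keys.size() < n) {
            List<String> next = new ArrayList<>();
            for (String k : keys)
                for (String b : blocks)
                    next.add(k + b);
            keys = next;
        }
        return keys.subList(0, n);
    }

    static long timeInsertMs(List<String> keys) {
        HashMap<String, Integer> map = new HashMap<>();
        long t0 = System.nanoTime();
        for (String k : keys) map.put(k, 0);
        return (System.nanoTime() - t0) / 1_000_000;
    }

    public static void main(String[] args) {
        // If total cost is ~n^2, the time ratio between successive rows
        // approaches 4; ~n log n gives a ratio only slightly above 2.
        for (int n = 1 << 10; n <= 1 << 14; n <<= 1) {
            System.out.printf("n=%d  %d ms%n", n, timeInsertMs(collidingKeys(n)));
        }
    }
}
```

Plotting (or just eyeballing) the doubling ratios directly shows whether the collisions are actually degrading insertion, independent of any arbitrary 10 s or 20 s threshold.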