I'm writing my very first Go application and I would like a hint on how to make a certain check faster.
I have a huge text file in which each line is structured as follows:
key text
key text
.
.
key text
Where key is ALWAYS 6 HEX digits long and text is ALWAYS 16 HEX digits long.
I need to find and print every line whose text value also appears on at least one other line.
For instance, suppose we have two lines like the following:
- 000000 1234567890123456
- 111111 1234567890123456
They should both be printed.
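To make the layout concrete, here is a minimal sketch of how one line splits into its two fields. It assumes a single space between key and text (which is what slicing at index 7 in my code implies); `splitLine` is just an illustrative name, not part of my program.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLine separates a "key text" line into its two fields.
// Assumption: exactly one space between a 6-character key and a 16-character text.
// It checks lengths only, not that the characters are valid hex digits.
func splitLine(line string) (key, text string, ok bool) {
	key, text, ok = strings.Cut(line, " ")
	if !ok || len(key) != 6 || len(text) != 16 {
		return "", "", false
	}
	return key, text, true
}

func main() {
	for _, line := range []string{
		"000000 1234567890123456",
		"111111 1234567890123456",
	} {
		key, text, ok := splitLine(line)
		fmt.Println(key, text, ok)
	}
}
```

Both sample lines yield the same text field, which is the condition I want to detect.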
Here's my code:
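package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	var duplicates []string // text values seen so far

	r, err := os.Open("store.txt")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer r.Close()

	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		line := scanner.Text()
		text := line[7:] // skip the 6-digit key and the space
		if !contains(duplicates, text) {
			duplicates = append(duplicates, text)
		} else {
			// Text already seen: rescan the whole file and print every matching line.
			t, err := os.Open("store.txt")
			if err != nil {
				fmt.Println(err)
				return
			}
			dupScan := bufio.NewScanner(t)
			for dupScan.Scan() {
				currLine := dupScan.Text()
				currCipher := currLine[7:23]
				if currCipher == text {
					fmt.Println(currLine)
				}
			}
			t.Close()
		}
	}
	fmt.Println("Done Check")
}

// contains reports whether e is present in s (linear scan).
func contains(s []string, e string) bool {
	for _, a := range s {
		if a == e {
			return true
		}
	}
	return false
}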
It works fine but, since I have millions of lines, it's really slow. So far I only know how to do this in a single thread, without goroutines.
Do you have any suggestions on how I should change this code to speed up the check?
Thanks in advance.