OK, so I'm trying to delete lines from a text file with Java. Currently I keep track of the line number and take an index as input; the index is the line I want deleted. Each time I read a new line of data I increment the line count, and when the line count equals the index, I don't write that line to the temporary file. This works, but what if I'm working with huge files and have to worry about memory constraints? How can I do this with file markers? For example, place the file marker on the line I want to delete, then delete that line? Or is that just too much work?
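For reference, the temporary-file approach described above might look roughly like the sketch below. The class name `TempFileLineDeleter`, the method `deleteLine`, the 0-based `lineIndex`, and the UTF-8 encoding are all assumptions made for illustration, not the asker's actual code. Since it holds only one line in memory at a time, memory use should stay small even for large files, although the whole file is still copied once.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class TempFileLineDeleter {
    // Copies every line except the one at lineIndex (0-based, an assumption)
    // into a temporary file, then replaces the original with it.
    static void deleteLine(Path file, long lineIndex) throws IOException {
        Path dir = file.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "delete-line", ".tmp");
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
            String line;
            long count = 0;
            while ((line = in.readLine()) != null) {
                if (count++ != lineIndex) {
                    out.write(line);
                    out.newLine(); // writes the platform line separator
                }
            }
        }
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Calling `TempFileLineDeleter.deleteLine(path, 1)` on a file containing the lines `a`, `b`, `c` would leave `a` and `c`.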
4 Answers
You could use NIO to delete the region of the file that corresponds to that line.
EDIT: added some hints
By creating a FileChannel and using a Buffer, you could open the file and erase the required line by shifting the content that comes after it over the line.
Unfortunately, I must confess my knowledge of NIO stops approximately here...
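A minimal sketch of that idea, assuming you already know the byte offsets of the line (`start` inclusive, `end` exclusive, including its line terminator). The class `ChannelRegionDeleter` and the method `deleteRegion` are hypothetical names, and this is only one possible way to do it: the bytes after the line are copied down over it through a `ByteBuffer`, and the file is then truncated.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ChannelRegionDeleter {
    // Removes the bytes in [start, end) by shifting the tail of the file left.
    static void deleteRegion(Path file, long start, long end) throws IOException {
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("invalid region");
        }
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.allocate(8192);
            long readPos = end;    // next byte to copy
            long writePos = start; // where the copied bytes go
            while (ch.read(buf, readPos) > 0) {
                buf.flip();
                readPos += buf.remaining();
                while (buf.hasRemaining()) {
                    writePos += ch.write(buf, writePos);
                }
                buf.clear();
            }
            ch.truncate(writePos); // drop the now-duplicated tail
        }
    }
}
```

Because the write position always stays behind the read position, the copy never overwrites data that has not been read yet. Finding `start` and `end` still requires scanning the file for line terminators up to the target line.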
You could use a random access file. Keep one pointer to the byte you are reading and another to the byte you are writing. Fill a buffer with data and count the lines as you read it. If the buffer contains nothing to delete, seek to the write pointer, output the buffer, then seek back to the read pointer. If you find a line to delete, output the buffer up to that point at the write pointer, advance the read pointer until you reach the end of the line, then output the remainder of the buffer (refilling it as necessary). Repeat for each line to be deleted, and truncate the file at the write pointer when you are done.
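The two-pointer scheme could be sketched as follows for a single line. This is a simplified variant under stated assumptions: lines end in `\n`, `lineIndex` is 0-based, and the names `InPlaceLineDeleter` and `deleteLine` are made up for the example. Rather than splitting the buffer around the deleted line, it compacts the bytes to keep inside the buffer and writes them at the write pointer.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

final class InPlaceLineDeleter {
    // Deletes the line at lineIndex (0-based) in place, without a temporary file.
    static void deleteLine(Path file, long lineIndex) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
            byte[] buf = new byte[8192];
            long readPos = 0;  // next byte to read
            long writePos = 0; // next byte to write
            long line = 0;     // index of the line the current byte belongs to
            while (true) {
                raf.seek(readPos);
                int n = raf.read(buf);
                if (n <= 0) {
                    break; // end of file
                }
                readPos += n;
                int keep = 0;
                for (int i = 0; i < n; i++) {
                    byte b = buf[i];
                    if (line != lineIndex) {
                        buf[keep++] = b; // compact kept bytes to the front
                    }
                    if (b == '\n') {
                        line++;
                    }
                }
                raf.seek(writePos);
                raf.write(buf, 0, keep);
                writePos += keep;
            }
            raf.setLength(writePos); // truncate at the write pointer
        }
    }
}
```

For simplicity this rewrites the unchanged bytes before the deleted line as well; skipping those writes while the two pointers are still equal would be a straightforward optimization.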
Ideally, I would use an ETL tool to perform this kind of batch work. Assuming you do not have access to such a tool, I would recommend gzipping the file first and then reading it using java.util.zip.
Here is a good tutorial on how to do it.
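As a rough illustration of the java.util.zip route, the sketch below streams a gzipped file line by line into a new gzipped file and skips one line along the way. The names `GzipLineDeleter` and `copyWithoutLine`, the 0-based `lineIndex`, and the UTF-8 encoding are assumptions for the example. Note that it still copies the whole file; compression only reduces the amount of disk I/O.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipLineDeleter {
    // Copies gzIn to gzOut, omitting the line at lineIndex (0-based).
    static void copyWithoutLine(Path gzIn, Path gzOut, long lineIndex) throws IOException {
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                 new GZIPInputStream(Files.newInputStream(gzIn)), StandardCharsets.UTF_8));
             BufferedWriter out = new BufferedWriter(new OutputStreamWriter(
                 new GZIPOutputStream(Files.newOutputStream(gzOut)), StandardCharsets.UTF_8))) {
            String line;
            long count = 0;
            while ((line = in.readLine()) != null) {
                if (count++ != lineIndex) {
                    out.write(line);
                    out.write('\n');
                }
            }
        }
    }
}
```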
Hope this helps!