I use importing systems based on delimited text files. The files used can sometimes be almost 2 Gb big and I have to check some lines from that file. So I want to know how can I output (on another file, or just on screen) the lines of specific value? E.g. line number 1010123, 1002451, 994123, etc., exactly as they are in the source file?
3 Answers
To print line N, use:
sed 'Nq;d' file
To print multiple lines (assuming they are in ascending order) e.g. 994123, 1002451, 1010123:
sed '994123p;1002451p;1010123q;d' file
The q after the last line number tells sed to quit when it reaches the 1010123th line, instead of wasting time by looping over the remaining lines that we are not interested in. That is why it is efficient on large files.
2 Comments
ktaria
in relation to that, a good link : webhostinghero.org/how-to-view-and-edit-large-files-in-linux
Artem Russakovskii
"That is why it is efficient on large files" - unless you need a line toward the end of the file, in which case it's not very efficient anymore (I just tried on a 40GB file). Nevertheless, your answer is great, thank you.
You can do this with many Unix tools, for instance with awk:
# print first 5 lines with awk
awk 'NR>=1&&NR<=5{print}NR>=6{exit}' file
# print selection of lines
awk 'NR==994123||NR==1002451||NR==1010123{print}NR>1010123{exit}' file
3 Comments
Amal Antony
How does
sed / awk perform on very large files like those he mentioned in his question (~2GB)?BogdanM
The line numbers are not sequential. Not first Nth or last Nth. They are just lines with errors. I have line numbers in a table and I just want to output just specific line numbers.
Chris Seymour
@BogdanM see the 2nd
awk example for this (dogbane answers shows how with sed) I thought printing ranges would also be useful to you.In python:
readThisFile = open('YOURFILE')
outputFile = open('OUTPUT', w)
for actualline, linetext in enumerate(readThisFile):
if actualline == WANTEDLINE
outputFile.write(linetext)
else:
pass
If wanted you can modify that script to work with arguments (like getline.py 1234)
2 Comments
Chris Seymour
It would more efficient to quit after
outputFile.write(linetext) is printed also as the question is tagged only unix so I'm uncertain the OP has Python available.chill0r
I can't say if there's Python available on it or not (he didn't say which language he want's the solution written in). And because of quitting at the line: Yeah that can be done, but my script isn't a "it's ready to use, just copy and paste it" script, it's just a hint how it can be done