20

I use importing systems based on delimited text files. The files used can sometimes be almost 2 Gb big and I have to check some lines from that file. So I want to know how can I output (on another file, or just on screen) the lines of specific value? E.g. line number 1010123, 1002451, 994123, etc., exactly as they are in the source file?

3 Answers 3

42

To print line N, use:

sed 'Nq;d' file

To print multiple lines (assuming they are in ascending order) e.g. 994123, 1002451, 1010123:

sed '994123p;1002451p;1010123q;d' file

The q after the last line number tells sed to quit when it reaches the 1010123th line, instead of wasting time by looping over the remaining lines that we are not interested in. That is why it is efficient on large files.

Sign up to request clarification or add additional context in comments.

2 Comments

"That is why it is efficient on large files" - unless you need a line toward the end of the file, in which case it's not very efficient anymore (I just tried on a 40GB file). Nevertheless, your answer is great, thank you.
5

You can do this with many Unix tools, for instance with awk:

# print first 5 lines with awk
awk 'NR>=1&&NR<=5{print}NR>=6{exit}' file

# print selection of lines 
awk 'NR==994123||NR==1002451||NR==1010123{print}NR>1010123{exit}' file

3 Comments

How does sed / awk perform on very large files like those he mentioned in his question (~2GB)?
The line numbers are not sequential. Not first Nth or last Nth. They are just lines with errors. I have line numbers in a table and I just want to output just specific line numbers.
@BogdanM see the 2nd awk example for this (dogbane answers shows how with sed) I thought printing ranges would also be useful to you.
0

In python:

readThisFile = open('YOURFILE')
outputFile = open('OUTPUT', w)

for actualline, linetext in enumerate(readThisFile):
    if actualline == WANTEDLINE
        outputFile.write(linetext)
    else:
        pass

If wanted you can modify that script to work with arguments (like getline.py 1234)

2 Comments

It would more efficient to quit after outputFile.write(linetext) is printed also as the question is tagged only unix so I'm uncertain the OP has Python available.
I can't say if there's Python available on it or not (he didn't say which language he want's the solution written in). And because of quitting at the line: Yeah that can be done, but my script isn't a "it's ready to use, just copy and paste it" script, it's just a hint how it can be done

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.