Suppose my text file has the following layout:

    CONTAINER ID   IMAGE                                                                        COMMAND                  CREATED             STATUS                       PORTS                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       NAMES
    xxxxxxxxxxxx   yyyyyyyyyyyyyyyyyyyyyyyyyyy                                                  zzzzzzzzzzzzzzzzzzzzzz   aaaaaaaaaaaaaaaa    bbbbbbbbbbb                  ports                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       random name

This is the output of the docker ps command, written to a text file. What is the most effective way to load it into a data frame, or some other readable structure in Python, so that I can query the values? For example, given a name, get the matching container ID.

I tried

    df = pd.read_csv("docker.txt",sep="  ")

but some rows are parsed incorrectly, because the number of spaces between columns is not consistent.

  • Use regular expressions to handle the inconsistent delimiters. Commented Oct 17, 2023 at 2:26
  • Does sep='\s+' work for you? Commented Oct 17, 2023 at 2:29
  • pd.read_csv("docker.txt", sep='\s{2,}') Commented Oct 17, 2023 at 2:29
  • @PandaKim That does the job, yes, but it skips any empty column values, so the next column's value is taken instead. Commented Oct 17, 2023 at 2:38
  • df = pd.read_fwf('docker.txt') Commented Oct 17, 2023 at 4:19

1 Answer

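    import re

    import pandas as pd

    # The seven column headers printed by docker ps.
    column_names = ["CONTAINER ID", "IMAGE", "COMMAND", "CREATED", "STATUS", "PORTS", "NAMES"]

    data = []
    with open("docker.txt", "r") as file:
        next(file, None)  # skip the header row
        for line in file:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            # Split on runs of two or more whitespace characters, so single
            # spaces inside a value (e.g. "2 hours ago") are preserved.
            # Note: an empty field (e.g. no PORTS) merges with the neighbouring
            # gaps, so such a row comes out one value short.
            data.append(re.split(r"\s{2,}", line))

    df = pd.DataFrame(data, columns=column_names)

    print(df)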

This uses a regex to split each line on runs of two or more spaces and builds a DataFrame from the result. Hope this helps.
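As the comments point out, splitting on whitespace loses empty values such as a blank PORTS field. One possible workaround, sketched below, is to slice each row at the character offset where each header name starts, since docker ps pads every column to align with its header. This is only a sketch under that alignment assumption; `parse_docker_ps` and `container_id_for` are hypothetical helper names, not part of pandas or Docker, and the function takes the file contents as a string.

```python
# Column headers as shown in the question's sample docker ps output.
COLUMNS = ["CONTAINER ID", "IMAGE", "COMMAND", "CREATED", "STATUS", "PORTS", "NAMES"]


def parse_docker_ps(text):
    """Parse docker ps output into a list of dicts, one per container.

    Assumes every row is padded so that each value starts at the same
    character offset as its header (hypothetical helper, not a library API).
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]

    # A column starts where its header name starts in the header line.
    starts = [header.index(name) for name in COLUMNS]
    # A column ends where the next one starts; the last runs to end of line.
    ends = starts[1:] + [None]

    records = []
    for row in rows:
        # Slicing by position keeps empty fields as empty strings.
        records.append(
            {name: row[s:e].strip() for name, s, e in zip(COLUMNS, starts, ends)}
        )
    return records


def container_id_for(records, name):
    """Return the CONTAINER ID of the row whose NAMES equals name, else None."""
    for record in records:
        if record["NAMES"] == name:
            return record["CONTAINER ID"]
    return None
```

To use it, read the file with `open("docker.txt").read()`, pass the text to `parse_docker_ps`, and, if a data frame is wanted, hand the resulting list of dicts to `pd.DataFrame`. Lines must not be stripped before slicing, otherwise the offsets no longer match the header.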


2 Comments

Thank you for the answer. There are actually 7 columns in the output of docker ps, if you use the scroll bar.
It's just a matter of adding the column names. I have updated the script and the output.
