
I have stored a large number of temperature and humidity values in a text file (val.txt). I need to write them to an Excel sheet, with each quantity in its own column.

Values in val.txt file:

SHT1 E:   T1:30.45°C    H1:59.14 %RH
SHT2 S:   T2:29.93°C    H2:67.38 %RH

SHT1 E:   T1:30.49°C    H1:58.87 %RH
SHT2 S:   T2:29.94°C    H2:67.22 %RH

SHT1 E:   T1:30.53°C    H1:58.69 %RH
SHT2 S:   T2:29.95°C    H2:67.22 %RH
// it continues in the same pattern //

Expected output (in excel sheet):

Column1 (T1)     Column2 (H1)     Column3 (T2)     Column4 (H2)
30.45            59.14            29.93            67.38
30.49            58.87            29.94            67.22  
30.53            58.69            29.95            67.22
  • You should be able to do this using pandas as explained here Commented Mar 26, 2021 at 11:42
  • That is great, but here I have strings as well. I need to extract only the float values. Commented Mar 26, 2021 at 12:36

1 Answer


I'd suggest something like this, using pandas:

import itertools

import pandas as pd


def read_lines(file_object) -> list:
    return [
        parse_line(line) for line in file_object.readlines() if line.strip()
    ]


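def parse_line(line: str) -> list:
    # Keep only the "T1:..." / "H1:..." tokens, strip the units, and convert
    # to float so that Excel stores numbers rather than text.
    return [
        float(i.split(":")[-1].replace("°C", "").replace("%RH", ""))
        for i in line.strip().split()
        if i.startswith(("T1", "T2", "H1", "H2"))
    ]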


def flatten(parsed_lines: list) -> list:
    return list(itertools.chain.from_iterable(parsed_lines))


def cut_into_pieces(flattened_lines: list, piece_size: int = 4) -> list:
    return [
        flattened_lines[i:i + piece_size] for i
        in range(0, len(flattened_lines), piece_size)
    ]


with open("your_text_data.txt") as data:
    df = pd.DataFrame(
        cut_into_pieces(flatten(read_lines(data))),
        columns=["T1", "H1", "T2", "H2"],
    )
    print(df)
    df.to_excel("your_table.xlsx", index=False)

Output:

      T1     H1     T2     H2
0  30.45  59.14  29.93  67.38
1  30.49  58.87  29.94  67.22
2  30.53  58.69  29.95  67.22

EDIT:

A much shorter approach with regex.

import re

import pandas as pd

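with open("your_text_data.txt") as data_file:
    # Match any decimal number (not just two digits on each side of the dot)
    # and convert to float so Excel stores numbers rather than text.
    data_list = [
        float(value)
        for value in re.findall(r"-?\d+\.\d+", data_file.read())
    ]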

pd.DataFrame(
    [data_list[i:i + 4] for i in range(0, len(data_list), 4)],
    columns=["T1", "H1", "T2", "H2"],
).to_excel(
    "your_table.xlsx",
    index=False,
)

This version doesn't print anything to stdout, but it writes an Excel file with the same structure as the first approach.
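
One caveat: both approaches above rely on the values always arriving in the order T1, H1, T2, H2, four per group. If a sensor line is ever missing, every later value shifts into the wrong column. As a possible alternative, here is a minimal sketch that keys each value by its label instead. The names READING and parse_readings are made up for this illustration, and it assumes every value is written as a label such as T1 or H2 followed by a colon and a number, as in the sample data.

```python
import re

# A label such as T1 or H2, a colon, then the number that follows it.
# The leading \b stops the pattern from matching inside "SHT1".
READING = re.compile(r"\b([TH]\d+):\s*(-?\d+(?:\.\d+)?)")


def parse_readings(text: str) -> list:
    """Return one dict per group, starting a new group when a label repeats."""
    rows = []
    current = {}
    for label, value in READING.findall(text):
        if label in current:
            # Seeing the same label again means a new group has begun.
            rows.append(current)
            current = {}
        current[label] = float(value)
    if current:
        rows.append(current)
    return rows


sample = (
    "SHT1 E:   T1:30.45°C    H1:59.14 %RH\n"
    "SHT2 S:   T2:29.93°C    H2:67.38 %RH\n"
    "\n"
    "SHT1 E:   T1:30.49°C    H1:58.87 %RH\n"
    "SHT2 S:   T2:29.94°C    H2:67.22 %RH\n"
)
rows = parse_readings(sample)
print(rows)
```

The resulting list of dicts can be passed to pd.DataFrame(rows, columns=["T1", "H1", "T2", "H2"]) and written with to_excel as before. A label missing from a group then shows up as an empty cell in its own column rather than shifting the rest of the data.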
