My text file has the following format:
ID col_A col_B col_C
1 0.26 0.11 0.18
2 0.27 0.12 0.17
3 0.21 0.10 0.15
----------------------------
AVG 0.25 0.11 0.17
----------------------------
ID col_D col_E col_F
1 0.23 0.18 0.20
2 0.24 0.14 0.17
3 0.23 0.10 0.13
----------------------------
AVG 0.23 0.14 0.17
----------------------------
I'm trying to use Python and regex to export two separate CSV files in the following format:
Table 1
| ID | col_A | col_B | col_C | col_D | col_E | col_F |
|---|---|---|---|---|---|---|
| 1 | 0.26 | 0.11 | 0.18 | 0.23 | 0.18 | 0.20 |
| 2 | 0.27 | 0.12 | 0.17 | 0.24 | 0.14 | 0.17 |
| 3 | 0.21 | 0.10 | 0.15 | 0.23 | 0.10 | 0.13 |
Table 2
| | col_A | col_B | col_C | col_D | col_E | col_F |
|---|---|---|---|---|---|---|
| AVG | 0.25 | 0.11 | 0.17 | 0.23 | 0.14 | 0.17 |
Here's my code:
import re
import pandas as pd
with open('test.txt') as file:
    lines = file.readlines()
regex = r'\A(?P<ID>\S+)\s*(?P<COL_A>\S+)\s*(?P<COL_B>\S+)\s*(?P<COL_C>\S+)'
data = []
for line in lines:
    m = re.search(regex, line)
    if m is not None:
        data.append([m.group(1), m.group(2), m.group(3), m.group(4)])
df = pd.DataFrame(data)
df.to_csv('test.csv', index=False)
Instead, my code produces a strange format like this:
| 0 | 1 | 2 | 3 |
|---|---|---|---|
| ID | col_A | col_B | col_C |
| 1 | 0.26 | 0.11 | 0.18 |
| 2 | 0.27 | 0.12 | 0.17 |
| 3 | 0.21 | 0.10 | 0.15 |
| ------ | --------- | --------- | --------- |
| AVG | 0.25 | 0.11 | 0.17 |
| ------ | --------- | --------- | --------- |
| ID | col_D | col_E | col_F |
| 1 | 0.23 | 0.18 | 0.20 |
| 2 | 0.24 | 0.14 | 0.17 |
| 3 | 0.23 | 0.10 | 0.13 |
| ------ | --------- | --------- | --------- |
| AVG | 0.23 | 0.14 | 0.17 |
| ------ | --------- | --------- | --------- |
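As far as I can tell, two things cause this. First, `\S+` matches any run of non-whitespace characters, so the header, separator, and AVG lines match just like the data rows do. Second, `pd.DataFrame(data)` is built without `columns=`, so pandas writes the default labels 0 to 3 as the header. The check below is a small sketch of the first point; the `strict` pattern is only a hypothetical alternative that assumes an integer ID followed by three decimal numbers.

```python
import re

# The pattern from my code: four runs of non-whitespace characters.
loose = r'\A(?P<ID>\S+)\s*(?P<COL_A>\S+)\s*(?P<COL_B>\S+)\s*(?P<COL_C>\S+)'
# Hypothetical stricter variant: integer ID, then three decimal numbers.
strict = r'\A(\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s*\Z'

samples = [
    'ID col_A col_B col_C',   # header line
    '1 0.26 0.11 0.18',       # data row
    '-' * 28,                 # dashed separator
    'AVG 0.25 0.11 0.17',     # average line
]

# Lines accepted by each pattern.
loose_hits = [s for s in samples if re.search(loose, s)]
strict_hits = [s for s in samples if re.search(strict, s)]

print(len(loose_hits), len(strict_hits))
```

The loose pattern accepts all four sample lines, while the stricter one accepts only the data row.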
How can I modify my code to get the two tables above? Thank you!
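For reference, here is a minimal sketch of the direction I have in mind, not a finished solution. It assumes every block starts with an `ID ...` header, lists the same IDs, and ends with an `AVG` line, as in the sample. The names `parse_blocks`, `to_csv`, and `SAMPLE` are made up for illustration, and it uses the standard `csv` module instead of pandas to keep the example self-contained.

```python
import csv
import io
import re

# A data row: an integer ID followed by one or more numbers.
ROW_RE = re.compile(r'\d+(?:\s+\d+(?:\.\d+)?)+')

SAMPLE = """\
ID col_A col_B col_C
1 0.26 0.11 0.18
2 0.27 0.12 0.17
3 0.21 0.10 0.15
----------------------------
AVG 0.25 0.11 0.17
----------------------------
ID col_D col_E col_F
1 0.23 0.18 0.20
2 0.24 0.14 0.17
3 0.23 0.10 0.13
----------------------------
AVG 0.23 0.14 0.17
----------------------------
"""


def parse_blocks(text):
    """Merge all blocks side by side; return (header, rows, avg_row)."""
    header = ['ID']
    rows = {}        # ID -> values collected across blocks, in file order
    avg = ['AVG']
    for line in text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        if not fields or set(stripped) == {'-'}:
            continue                       # blank line or dashed separator
        if fields[0] == 'ID':
            header.extend(fields[1:])      # column names of this block
        elif fields[0] == 'AVG':
            avg.extend(fields[1:])         # averages of this block
        elif ROW_RE.fullmatch(stripped):
            rows.setdefault(fields[0], []).extend(fields[1:])
    table = [[row_id, *values] for row_id, values in rows.items()]
    return header, table, avg


def to_csv(header, rows):
    """Render a header and rows as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator='\n')
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


header, table, avg = parse_blocks(SAMPLE)
table1_csv = to_csv(header, table)               # per-ID values
table2_csv = to_csv([''] + header[1:], [avg])    # averages only
print(table1_csv)
print(table2_csv)
```

To work from the real file, the text could be read with `open('test.txt').read()` and each CSV string written out with a normal `open(..., 'w')` call. If the blocks do not all list the same IDs, the merge step would need to handle the missing values.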