I have some big txt files as input which look like this:
# USER_IP: 37.1.62.12 INTERFACE CHARMM-GUI
@<TRIPOS>MOLECULE
lig.pdb
54 56 1 0 0
SMALL
NO_CHARGES
@<TRIPOS>ATOM
1 CAA 2.9880 0.1910 12.9830 C.3 1 P0G 0.0000
2 CAB 1.3730 1.7370 10.6500 C.3 1 P0G 0.0000
3 CAC -0.5820 0.2000 10.5350 C.3 1 P0G 0.0000
4 OAD -5.1220 5.7850 8.9220 O.2 1 P0G 0.0000
5 OAE -2.7610 6.1960 4.9010 O.3 1 P0G 0.0000
6 OAF -0.8620 0.4430 6.3540 O.3 1 P0G 0.0000
7 CAG 0.7160 -2.5530 14.2490 C.ar 1 P0G 0.0000
8 CAH 0.1300 -3.0010 13.0720 C.ar 1 P0G 0.0000
...
Here, in each file, I have a lot of lines like:
6 OAF -0.8620 0.4430 6.3540 O.3 1 P0G 0.0000
7 CAG 0.7160 -2.5530 14.2490 C.ar 1 P0G 0.0000
8 CAH 0.1300 -3.0010 13.0720 C.ar 1 P0G 0.0000
My task is to use a Linux shell script with some combination of AWK and SED to remove all columns from those fragments except the first five (columns 1-5), which are the ones relevant for me. So the example file after processing should look like this:
# USER_IP: 37.1.62.12 INTERFACE CHARMM-GUI
@<TRIPOS>MOLECULE
lig.pdb
54 56 1 0 0
SMALL
NO_CHARGES
@<TRIPOS>ATOM
1 CAA 2.9880 0.1910 12.9830
2 CAB 1.3730 1.7370 10.6500
3 CAC -0.5820 0.2000 10.5350
4 OAD -5.1220 5.7850 8.9220
5 OAE -2.7610 6.1960 4.9010
6 OAF -0.8620 0.4430 6.3540
7 CAG 0.7160 -2.5530 14.2490
8 CAH 0.1300 -3.0010 13.0720
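
To illustrate just the column trimming, something like this is roughly what I have in mind (only a sketch, assuming the fields are whitespace-separated; input.mol2 and output.mol2 are placeholder file names):

awk '{ print $1, $2, $3, $4, $5 }' input.mol2 > output.mol2

but applied to the whole file this would of course also mangle the header lines at the top.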
The problem is that in files of this type there are always several lines before the segment that should be processed, and their number may differ from file to file. So my only idea is to use the line below
@<TRIPOS>ATOM
as a reference and only start trimming the columns of the lines that come after this reference line.
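For example, something in this direction is what I imagine (again only a sketch, assuming a standard awk such as GNU awk; the flag name in_atoms and the file names are just illustrative):

awk '
/@<TRIPOS>ATOM/ { in_atoms = 1; print; next }            # reference line: switch the flag on, keep the line as is
in_atoms && NF >= 5 { print $1, $2, $3, $4, $5; next }   # after the reference line, keep only columns 1-5
{ print }                                                # everything else (header lines) is printed unchanged
' input.mol2 > output.mol2

but I am not sure this is the cleanest way, or whether the same could be done with SED alone.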
I'd be thankful for a few examples and a short explanation of each.
Gleb