I'm tagging a file using Stanford NER and I want to replace every "O" tag with "NONE". I tried the code below, but it produces the wrong output: it replaces every "O" in the string, not just the tag. I'm not familiar with regex and don't know the right pattern for this. TIA.
Here's my code:
import re
tagged_text = st.tag(per_word(input_file))
string_type = "\n".join(" ".join(line) for line in tagged_text)
for line in string_type:
    output_file.write(re.sub('O$', 'NONE', line))
Sample Input:
Tropical O
Storm O
Jolina O
affects O
2,000 O
people O
MANILA LOCATION
, O
Philippines LOCATION
– O
Initial O
reports O
from O
the O
OUTPUT:
Tropical NONE
Storm NONE
Jolina NONE
affects NONE
2,000 NONE
people NONE
MANILA LNONECATINONEN
, NONE
Philippines LNONECATINONEN
– NONE
Initial NONE
reports NONE
from NONE
the NONE
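string_type? It seems you are looping through a string, so the loop goes character by character and the regex is applied to each single character. Every lone 'O' therefore matches 'O$'. Applied to a whole line, the same pattern only replaces the trailing tag:
    line = 'TrOpical O'
    re.sub('O$', 'NONE', line)   # gives 'TrOpical NONE'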
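A minimal sketch of two ways to avoid the character-by-character loop, assuming `st.tag(...)` returns a list of `(word, tag)` pairs as the `" ".join(line)` call in the question suggests. The function names `relabel_outside_tags` and `relabel_text` are hypothetical, introduced here only for illustration. The first compares the tag directly and needs no regex; the second keeps the regex but applies it to the whole joined string with `re.MULTILINE`, so `$` matches at the end of every line.

```python
import re


def relabel_outside_tags(tagged_pairs, old="O", new="NONE"):
    """Build 'word tag' lines, swapping the tag only when it equals `old`."""
    # Comparing the tag itself means an 'O' inside a word or inside
    # another tag (e.g. LOCATION) is never touched.
    return [f"{word} {new if tag == old else tag}" for word, tag in tagged_pairs]


def relabel_text(text):
    """Regex variant: replace a trailing 'O' tag on every line of `text`."""
    # (?<= ) requires a space right before the O, so only a standalone tag
    # matches; re.MULTILINE lets $ match at the end of each line.
    return re.sub(r"(?<= )O$", "NONE", text, flags=re.MULTILINE)


if __name__ == "__main__":
    pairs = [("Tropical", "O"), ("MANILA", "LOCATION"), (",", "O")]
    print("\n".join(relabel_outside_tags(pairs)))
    print(relabel_text("Tropical O\nMANILA LOCATION\n, O"))
```

Either result can be written to `output_file` in a single `write` call; if a loop is preferred, iterating over `string_type.splitlines()` rather than over `string_type` itself would hand the original `re.sub('O$', ...)` one full line at a time.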