I am stuck on a text-extraction problem. I have more than 3,000 observations, and each observation is a full text. For example:
text="Ganluo County People's Court of X Province。The plaintiff X, female, born on May, 1980, lives in X County, X Province。The defendant X, male, born on May, 1971, lives in X County, X Province。
It is a divorce dispute, according to 《marriage law》on June 21, 2016。"
I want to extract the sentences that describe the plaintiff and the defendant, and I also want to know whether the full text contains the string "《marriage law》" (T for yes, F for no).
The result I am after looks like this:
| text | plaintiff | defendant | law |
|---|---|---|---|
| Ganluo County People's Court of X Province。The plaintiff X, female, born on May, 1980, lives in X County, X Province。The defendant X, male, born on May, 1971, lives in X County, X Province。It is a divorce dispute, according to 《marriage law》on June 21, 2016。 | The plaintiff X, female, born on May, 1980, lives in X County, X Province。 | The defendant X, male, born on May, 1971, lives in X County, X Province。 | T |
I have tried several times, but I could not get it to work. Many thanks for your help!
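The question does not say which language is being used, so the following is only a minimal Python sketch of one way the table above could be produced. It assumes every sentence ends with "。" and that the wanted sentences begin with the literal phrases "The plaintiff" and "The defendant". The helper name `first_sentence` and the `row` dictionary are hypothetical, introduced only for illustration.

```python
import re

# Sample text from the question (the line break inside it is omitted here).
text = (
    "Ganluo County People's Court of X Province。"
    "The plaintiff X, female, born on May, 1980, lives in X County, X Province。"
    "The defendant X, male, born on May, 1971, lives in X County, X Province。"
    "It is a divorce dispute, according to 《marriage law》on June 21, 2016。"
)


def first_sentence(text, keyword):
    """Return the first span that starts at `keyword` and runs to the next '。'.

    Hypothetical helper: returns None when the keyword never occurs
    or is not followed by a '。'.
    """
    match = re.search(re.escape(keyword) + r"[^。]*。", text)
    return match.group(0) if match else None


row = {
    "text": text,
    "plaintiff": first_sentence(text, "The plaintiff"),
    "defendant": first_sentence(text, "The defendant"),
    # Plain substring test; 'T' if the law is mentioned anywhere in the text.
    "law": "T" if "《marriage law》" in text else "F",
}

print(row["plaintiff"])
print(row["defendant"])
print(row["law"])
```

Applied to 3,000 texts, the same three expressions could be evaluated once per observation. Note that this simple version returns whichever "The plaintiff" sentence comes first, whatever that sentence says.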
Follow up:
Thank you for your answers. However, the difficulty is that the full text may contain many sentences that start with "the plaintiff" and end with the punctuation mark "。". How can I extract only the first such sentence that contains the plaintiff's birth and residence information? The order is not fixed, but the "。" punctuation is always used.
For example, the full text may also contain a sentence like "the plaintiff declares that he is wrong。" The pattern given in the previous answer also extracts this sentence, which I do not want.
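To make the requirement concrete, here is a hedged Python sketch of the filtering described above: split the text into sentences at "。", then keep the first sentence that starts with the party name and also mentions both birth and residence. The markers "born" and "lives in" are assumptions taken from the sample text, and the function name `first_party_sentence` is hypothetical. Real data may need different markers.

```python
import re


def first_party_sentence(text, party):
    """Return the first '。'-terminated sentence about `party`
    ('plaintiff' or 'defendant') that mentions both birth and residence.

    Assumed markers: 'born' and 'lives in'. Text after the last '。'
    is ignored. Returns None if no sentence qualifies.
    """
    # Each match is one sentence including its closing '。'.
    sentences = [s.strip() for s in re.findall(r"[^。]*。", text)]
    for sentence in sentences:
        lowered = sentence.lower()
        if (
            lowered.startswith(f"the {party}")
            and "born" in lowered        # birth information, any position
            and "lives in" in lowered    # residence information, any position
        ):
            return sentence
    return None


sample = (
    "The plaintiff declares that he is wrong。"
    "Ganluo County People's Court of X Province。"
    "The plaintiff X, female, born on May, 1980, lives in X County, X Province。"
    "The defendant X, male, lives in X County, X Province, born on May, 1971。"
)

print(first_party_sentence(sample, "plaintiff"))
print(first_party_sentence(sample, "defendant"))
```

Because the two markers are checked with independent substring tests, it does not matter whether the birth or the residence information comes first within the sentence, and a sentence such as "The plaintiff declares that he is wrong。" is skipped because it contains neither marker.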