0

I have a text file which got connverted from pdf to text data . From the text data ,i would like to extract descriptions present followed by string "FIGURE" . Below is some sample lines of text data ,

FIGURE 1-1. An empirical approach to the design of a dosage regimen. The effects, both desired and adverse, are monitored after the administration of a dosage regimen of a drug and used to further refine and optimize the regimen through feedback ( dashed line ).

Derendorf5e_CH01.indd 4Derendorf5e_CH01.indd 4 5/25/19 11:07 PM5/25/19 11:07 PM

CHAPTER 1 • Therapeutic Relevance 5

Another way of looking at these two subdisciplines is that pharmacokinetics deals with what the body does to the drug (absorption, distribution, metabolism, excretion), whereas pharmacodynamics describes what the drug does to the body (both desired and undesired effects). From this definition, one could wrongly conclude that these are opposite disci- plines, whereas in reality, they go hand-in-hand. Figure 1-3 shows that pharmacokinetics deals with concentration–time relationships, whereas pharmacodynamics describes the relationship between drug concentration and both good (desired) and bad (adverse) effects. Each of these two puzzle pieces by itself is insufficient to guide therapy and optimize dosing; only when pharmacokinetics and pharmacodynamics are linked (PK/PD) and integrated do they become therapeutically useful. This integration is commonly achieved by developing mathematical models (PK/PD models) that capture the observed relationships and allow prediction and identification of optimum dosing regimens.

FIGURE 1-2. A rational approach to the design of a dosage regimen. The pharmacokinetics and pharmacodynam- ics of the drug are first defined. Then, responses to the drug, coupled with pharmacokinetic information, are used as feedback ( dashed lines ) to modify the dosage regimen to achieve optimal ther- apy. For some drugs, active metabolites formed in the body may also need to be taken into account.

I have read pdf file into text and tried applying re.search on the text data with some regex combinations. But no luck .

# Get files text content
text = file_data['content']
#print(text)
text1 = re.search('FIGURE[ ]*[0-9]-[0-9]. (.*)',text,re.MULTILINE)

1 Answer 1

1
text1 = re.findall('FIGURE\s*[0-9]+-[0-9]+. (.*)',text,re.MULTILINE)
>>> import re
>>> t="""FIGURE 1-1. An empirical approach to the design of a dosage regimen. The effects, both desired and adverse, are monitored after the administration of a dosage regimen of a drug and used to further refine and optimize the regimen through feedback ( dashed line ).
...
... Derendorf5e_CH01.indd 4Derendorf5e_CH01.indd 4 5/25/19 11:07 PM5/25/19 11:07 PM
...
... CHAPTER 1 • Therapeutic Relevance 5
...
... Another way of looking at these two subdisciplines is that pharmacokinetics deals with what the body does to the drug (absorption, distribution, metabolism, excretion), whereas pharmacodynamics describes what the drug does to the body (both desired and undesired effects). From this definition, one could wrongly conclude that these are opposite disci- plines, whereas in reality, they go hand-in-hand. Figure 1-3 shows that pharmacokinetics deals with concentration–time relationships, whereas pharmacodynamics describes the relationship between drug concentration and both good (desired) and bad (adverse) effects. Each of these two puzzle pieces by itself is insufficient to guide therapy and optimize dosing; only when pharmacokinetics and pharmacodynamics are linked (PK/PD) and integrated do they become therapeutically useful. This integration is commonly achieved by developing mathematical models (PK/PD models) that capture the observed relationships and allow prediction and identification of optimum dosing regimens.
...
... FIGURE 1-2. A rational approach to the design of a dosage regimen. The pharmacokinetics and pharmacodynam- ics of the drug are first defined. Then, responses to the drug, coupled with pharmacokinetic information, are used as feedback ( dashed lines ) to modify the dosage regimen to achieve optimal ther- apy. For some drugs, active metabolites formed in the body may also need to be taken into account."""
>>> re.findall('FIGURE\s*[0-9]-[0-9]. (.*)',t,re.MULTILINE)
['An empirical approach to the design of a dosage regimen. The effects, both desired and adverse, are monitored after the administration of a dosage regimen of a drug and used to further refine and optimize the regimen through feedback ( dashed line ).', 'A rational approach to the design of a dosage regimen. The pharmacokinetics and pharmacodynam- ics of the drug are first defined. Then, responses to the drug, coupled with pharmacokinetic information, are used as feedback ( dashed lines ) to modify the dosage regimen to achieve optimal ther- apy. For some drugs, active metabolites formed in the body may also need to be taken into account.']`
Sign up to request clarification or add additional context in comments.

4 Comments

its giving output of first line description of each FIGURE . I want complete text description .
What do you mean complete text description ? Can you show me an expected ouput?
I want ouput like below , An empirical approach to the design of a dosage regimen. The effects, both desired and adverse, are monitored after the administration of a dosage regimen of a drug and used to further refine and optimize the regimen through feedback ( dashed line ).
I am not sure in which format your data is, if your data is in a multiline as specified in above interpreter the code will work.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.