I have trimmed down an HTML file so that each element of a character vector in my data set looks like:
" <h3 class=\"personName\">Whitney Alicia Zimmerman</h3> <li>Assistant Teaching Professor</li>"
I want to use regular expressions to trim each element down to just the name and the position (to clarify, every element has a different name and position). The approach I used before won't work here: I used the grepl function to subset my original HTML file, but that only selects matching elements and doesn't extract text from them. How would I go about trimming this using regular expressions, or some other technique? Thanks in advance for any help.
Or, if it's easier to work with, I also have two other character vectors that hold the two parts separately. Their elements look like:
" <h3 class=\"personName\">Whitney Alicia Zimmerman</h3>"
and
" <li>Assistant Teaching Professor</li>"
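For reference, here is a minimal base R sketch of the kind of result I'm after, assuming every element has exactly the shape shown above (one h3 tag followed by one li tag). The variable names (combined, name_html, position_html, people) are made up for illustration, and this is only one possible approach, not necessarily the best one.

```r
# Hypothetical inputs, copied from the examples above
combined <- " <h3 class=\"personName\">Whitney Alicia Zimmerman</h3> <li>Assistant Teaching Professor</li>"
name_html <- " <h3 class=\"personName\">Whitney Alicia Zimmerman</h3>"
position_html <- " <li>Assistant Teaching Professor</li>"

# Option 1: capture the text between the tags of the combined string.
# [^<]* matches everything up to the next "<", so no lazy quantifier is needed.
# Note: sub() returns the input unchanged when the pattern does not match.
name <- sub(".*<h3[^>]*>([^<]*)</h3>.*", "\\1", combined)
position <- sub(".*<li>([^<]*)</li>.*", "\\1", combined)

# Option 2: for the separate vectors, delete every tag and trim whitespace.
name2 <- trimws(gsub("<[^>]+>", "", name_html))
position2 <- trimws(gsub("<[^>]+>", "", position_html))

# Both functions are vectorised, so the same calls work on whole vectors.
people <- data.frame(name = name, position = position, stringsAsFactors = FALSE)
print(people)
```

Both options depend on the markup being this regular; if the tags were nested or varied between elements, I assume an HTML parser would be more robust than regular expressions.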