I want to check whether the values in a column start with a specific pattern: one letter, a dash, then a year followed by a letter, for example "A-2012A". After that prefix, the rest of the column's value can be anything. I only want to validate this first part and get back a true/false value. Is this possible?
Pattern: letter-yearletter
String validation on one column with regular expression.
example_column_1
| DNA \ Assay |
|---|
| A-2000X-27 |
| A-2000X-32 |
| A-2000X-45 |
| A-2000X-48 |
| A-2000X-80 |
truth_value = df[r'DNA \ Assay'].str.match(r'').astype(bool)
This is a sample of what I have so far, with nothing yet in the r'' regular expression.
My expected output with example_column_1 would be True.
example_column_2
| DNA \ Assay |
|---|
| Embryo FTA-Code-ID-2 |
| Embryo FTA-Code-ID-3 |
| Embryo FTA-Code-ID-4 |
| Embryo FTA-Code-ID-5 |
| Embryo FTA-Code-ID-6 |
My expected output with example_column_2 would be False.
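
For reference, here is a minimal sketch of how such a check might look. It is not a definitive answer. It assumes the year is always four digits and the letters are plain ASCII, neither of which is stated above. The names `PATTERN`, `df1`, and `df2` are made up for illustration. `Series.str.match` only tests from the start of each string, so whatever follows the prefix is ignored.

```python
import pandas as pd

# Assumed pattern: one letter, a dash, a four-digit year, one letter.
# The four-digit year and ASCII-only letters are assumptions.
PATTERN = r'[A-Za-z]-\d{4}[A-Za-z]'

col = r'DNA \ Assay'  # raw string so the backslash is kept literally

df1 = pd.DataFrame({col: ['A-2000X-27', 'A-2000X-32', 'A-2000X-45',
                          'A-2000X-48', 'A-2000X-80']})
df2 = pd.DataFrame({col: ['Embryo FTA-Code-ID-2', 'Embryo FTA-Code-ID-3',
                          'Embryo FTA-Code-ID-4', 'Embryo FTA-Code-ID-5',
                          'Embryo FTA-Code-ID-6']})

# str.match anchors at the beginning of each value; na=False treats
# missing values as non-matching instead of propagating NaN.
truth_1 = df1[col].str.match(PATTERN, na=False)
truth_2 = df2[col].str.match(PATTERN, na=False)

# Collapse the per-row results to a single true/false for the column.
all_1 = bool(truth_1.all())
all_2 = bool(truth_2.all())
```

Here `truth_1` and `truth_2` hold one boolean per row, while `all_1` and `all_2` reduce each column to a single value. If a single value per column is not what is wanted, the per-row Series can be used directly.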