I want to filter rows in my DataFrame, keeping only rows where a column starts with "startSubString" and does not contain the character '#'.
I can do what I want with two filters:
.filter(!col("theCol").contains("#"))
.filter( col("theCol").startsWith("startSubString"))
But I was wondering whether it could be done in a single filter for better performance, something like:
.filter(col("theCol").rlike("^(startSubString).*^[^@]"))
This does not work, though. What am I missing?
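For what it's worth, here is a minimal sketch of a regex that seems to express the intent. In the attempt above, the second `^` asserts the start of the input again, which cannot hold once the prefix has been consumed, and `[^@]` excludes '@' rather than '#'. As far as I know, `rlike` does a partial ("find") match, so the trailing `$` is needed to make the "no '#'" part cover the rest of the string. The object name `RlikeSketch` and the sample values are made up for illustration, and the check uses plain `java.util.regex` so it runs without Spark.

```scala
import java.util.regex.Pattern

object RlikeSketch {
  // Prefix from the question; Pattern.quote guards against regex metacharacters in it.
  val prefix = "startSubString"

  // Prefix at the start, then only non-'#' characters up to the end of the string.
  val pattern: String = "^" + Pattern.quote(prefix) + "[^#]*$"

  // Uses find(), i.e. a partial match, which is why both anchors are needed.
  def keep(value: String): Boolean =
    Pattern.compile(pattern).matcher(value).find()

  def main(args: Array[String]): Unit = {
    // Hypothetical sample values, not taken from the original data.
    val samples = Seq("startSubString/ok", "startSubString#bad", "other/startSubString")
    samples.foreach(s => println(s"$s -> ${keep(s)}"))
    // With Spark on the classpath, the same string could be passed as:
    // df.filter(col("theCol").rlike(RlikeSketch.pattern))
  }
}
```

Whether this is any faster than the two plain string predicates is not something I have measured.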
.filter(!col("theCol").contains("#") && col("theCol").startsWith("startSubString"))
Doesn't that work?
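A rough sketch of that single-filter idea, assuming `spark-sql` is on the classpath and a local session is acceptable. Both conditions have to hold, so they are joined with `&&`, and the negation is written with `!` on the `Column`. The object name `CombinedFilterSketch`, the helper `keepRows`, and the sample rows are hypothetical. The last part prints the plans for the two-filter version, since as far as I know the Catalyst optimizer merges adjacent filters anyway, so the two forms may well end up with the same optimized plan.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object CombinedFilterSketch {
  // Keep rows whose column starts with the prefix AND contains no '#'.
  def keepRows(df: DataFrame): DataFrame =
    df.filter(col("theCol").startsWith("startSubString") && !col("theCol").contains("#"))

  def main(args: Array[String]): Unit = {
    // Local session purely for illustration.
    val spark = SparkSession.builder().master("local[1]").appName("filter-sketch").getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows, not taken from the original data.
    val df = Seq("startSubString/ok", "startSubString#bad", "other/startSubString").toDF("theCol")

    keepRows(df).show(false)

    // Two chained filters: print the plans to compare with the single-filter version.
    df.filter(!col("theCol").contains("#"))
      .filter(col("theCol").startsWith("startSubString"))
      .explain(true)

    spark.stop()
  }
}
```

If the optimized plans match, the choice between one filter and two is mostly about readability.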