I have a text column in a Spark DataFrame.
I want to extract every word that starts with the special character '@' from each row of that column. I am using regexp_extract, but when the text contains several words starting with '@', it returns only the first one.
How can I extract all of the words that match my pattern in Spark?
data_frame.withColumn("Names", regexp_extract($"text", """(?<=^|(?<=[^a-zA-Z0-9-_\.]))@([A-Za-z]+[A-Za-z0-9_]+)""", 1)).show
Sample input: @always_nidhi @YouTube no i dnt understand bt i loved the music nd their dance awesome all the song of this mve is rocking
Sample output: @always_nidhi,@YouTube
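To make the desired result concrete, here is a minimal sketch in plain Scala, outside Spark. It uses a simplified pattern, which is an assumption for illustration and not the exact regex from my Spark call, and the names MentionSketch and extractAll are hypothetical. It only shows the output I am after, not how to do this on a DataFrame column.

```scala
object MentionSketch {
  // Simplified pattern (assumption): '@', one letter, then one or more
  // letters, digits or underscores. It omits the lookbehind from the Spark regex.
  private val mention = "@[A-Za-z][A-Za-z0-9_]+".r

  // Collect every match in the text and join the matches with commas.
  def extractAll(text: String): String =
    mention.findAllIn(text).mkString(",")

  def main(args: Array[String]): Unit = {
    val sample = "@always_nidhi @YouTube no i dnt understand bt i loved the music"
    println(extractAll(sample)) // prints "@always_nidhi,@YouTube"
  }
}
```

This is the behaviour I would like to get for each row of the text column.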