In Databricks, we would like to filter rows that have special characters in a column.

Let's say we have a table, Table1, with a column Col1 that holds data like:

  • 199 Central Avenue
  • 1664 O'block Road
  • 1630 Hahn's Dairy Road
  • "N 40 Degrees 36' 15"" W -75 Degrees -27' -52"""
  • 40*61605 North Lat-75*38'39* West Long.
  • "40�3'13"" North

All rows except the last one are valid here. How can we pick out just the last row?

I tried the following: select b.* from Table1 b where (Col1 REGEXP '[^a-zA-Z0-9-&@.,/()#''``:$"" ]')

This still returns every row except the first one.

  • 199 Central Avenue 1664 O'block Road 577 Saint John's Road 1630 Hahn's Dairy Road --> Row 1 "N 40 Degrees 36' 15"" W -75 Degrees -27' -52""" 40*61605 North Lat-75*38'39* West Long. "40�3'13"" North Commented May 18, 2024 at 15:07
  • Hi - what’s this comment meant to be? Why did you add it? Commented May 18, 2024 at 19:44

1 Answer

You can use the query below, adding the special characters that you want to exclude inside the brackets.

SELECT * FROM Table1 WHERE Col1 NOT REGEXP '[+�]'

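Here, I list the characters I don't want inside the brackets. Any row in which one of them appears at least once is excluded from the result.

+: Inside the brackets this is a literal plus sign, not a "one or more" quantifier, so rows containing + are excluded too. Remove it if you only want to exclude the character below.

�: The special character I don't want.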
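To check the character-class logic outside of Databricks, here is a minimal Python sketch. It makes some assumptions: the sample rows are typed in by hand from the question (with the asterisks shown in the asker's comment and the doubled CSV quotes collapsed to single ones), the unwanted character is taken to be U+FFFD, and the names `DENY` and `ALLOW_VIOLATION` are made up for illustration. Python's `re` is not the same engine as the Java regex used by Databricks, but plain character classes like these behave the same way. The allow-list adds `*` to the set from the question, which is my assumption based on the rows that are supposed to be valid.

```python
import re

# Sample values from the question; the last one contains U+FFFD (shown as �).
rows = [
    "199 Central Avenue",
    "1664 O'block Road",
    "1630 Hahn's Dairy Road",
    "N 40 Degrees 36' 15\" W -75 Degrees -27' -52\"",
    "40*61605 North Lat-75*38'39* West Long.",
    "40\ufffd3'13\" North",
]

# Deny-list: a row is invalid if it contains any character in this class.
# Assumption: the only unwanted character is U+FFFD.
DENY = re.compile("[\ufffd]")

# Allow-list alternative: a row is invalid if it contains any character
# outside this class. The hyphen is placed last so it is a literal hyphen
# and not a range operator.
ALLOW_VIOLATION = re.compile(r"[^a-zA-Z0-9&@.,/()#'`:$\"* -]")

invalid_by_deny = [r for r in rows if DENY.search(r)]
invalid_by_allow = [r for r in rows if ALLOW_VIOLATION.search(r)]
valid = [r for r in rows if not DENY.search(r)]

print(invalid_by_deny)
print(invalid_by_allow)
print(len(valid))
```

Both approaches flag only the last row for this sample. The deny-list is shorter, while the allow-list also catches unexpected characters you did not think of in advance, at the cost of having to list every character you consider valid.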

1 Comment

@Nanda Does the solution above work for you? Let me know if I can help with any further input.
