0

I'm trying to extract a number of strings from column values in PostgreSQL.

So basically I have text in columns in a table that look something like:

blah blah blah.... <something="AValueIWant">....dfdf.gd d.fg d.fd... <something="AnotherValueIWant">.

I want to be able to run a query to extract "AValueIWant" and "AnotherValueIWant" (without the quotes) in a form like this:

regex_matches
-------------
AValueIWant
AnotherValueIWant

Let's call the column "body" and the table "tablebody".

So far I have something like:

select regexp_matches(body, <some kind of pattern that doesn't work>, 'g') from tablebody

I don't feel there is any value posting up what I've tried since they all don't work! I've looked at the docos and tried things like 'something(.+)\"' but that returned the whole body text. Other patterns I've found online like '/"([^"]*/"' return nothing at all. Does anyone else have a better grasp of regex that knows the answer, or another way to do this :)? I'm pretty newbie.

Thanks!

1
  • Is the "something" key always the same? Commented Apr 8, 2013 at 9:33

2 Answers 2

2
select a[1]
from (
    select regexp_matches(body, '="(.+?)"', 'g') a
    from tablebody
) s

But I would rather use a specialized HTML parser unless it is a very quick and dirt job you want.

Sign up to request clarification or add additional context in comments.

Comments

0

You can try this :

<something="([^"]+)">

Or more generic :

<[A-Za-z]+="([^"]+)">

Or even more generic, if you don't care about the name of the attribute:

="([^"]+)"

Or...well you got the idea.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.