1

I have a long text written in HTML:

<body>
    <h2>title 1</h2>
    <h2>This is an <b>important</b> title</h2>
        Some text
        <h3>This a subtitle b</h3>
        <h3>This is also <span style="font-weight:500">important</span></h3>        
</body>

I need to extract the titles from it to create a Table of Content. I expect the result as:

h2      Title 1
h2      This is an <b>important</b> title
h3      This a subtitle b
h3      This is also <span style="font-weight:500">important</span>

or:

h2      Title 1
h2      This is an important title
h3      This a subtitle b
h3      This is also important

I tried:

select * from xmltable('body/*'  passing xmltype('<body><h2>title 1</h2><h2>This is an <b>important</b> title</h2>Some text<h3>This a subtitle b</h3><h3>This is also <span style="font-weight:500">important</span></h3></body>') 
columns 
tag_name varchar2(1000) path 'name()',
tag_value varchar2(1000) path 'text()')
where tag_name in ('h1','h2','h3','h4','h5')

But I'm getting the error:

ORA-19279: XPTY0004 - XQuery dynamic type mismatch: expected singleton sequence - got multi-item sequence
19279. 00000 -  "XPTY0004 - XQuery dynamic type mismatch: expected singleton sequence - got multi-item sequence" 
*Cause:    The XQuery sequence passed in had more than one item.
*Action:   Correct the XQuery expression to return a single item sequence.  

How can I solve that?

1 Answer 1

1

Would this work as the output?

TAG_NAME TAG_VALUE                                                             
-------- ----------------------------------------------------------------------
h2       <h2>title 1</h2>                                                      
h2       <h2>This is an <b>important</b> title</h2>                            
h3       <h3>This a subtitle b</h3>                                            
h3       <h3>This is also <span style="font-weight:500">important</span></h3>  

This makes more sense to me - then let whatever xml tools you work with interpret the tag values as needed. (They may require that the tag values be xmltype data type - if so, then simply remove the xmlserialize wrapper in the select clause.)

If that's acceptable, you can get it with a small modification of your query.

select tag_name, xmlserialize(document tag_value) as tag_value
from xmltable('body/*'  passing xmltype('<body><h2>title 1</h2>
  <h2>This is an <b>important</b> title</h2>Some text<h3>This a subtitle b</h3>
  <h3>This is also <span style="font-weight:500">important</span></h3></body>') 
columns 
  tag_name varchar2(1000) path 'name()',
  tag_value xmltype path '.')
where tag_name in ('h1','h2','h3','h4','h5')
;
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.