3

I have a very complex task - create a software that imports XMl files to MSSQL database. There is one serious problem - each file has different structure than datatable. Example :

DATABASE: It has columns like : ImageURL, Title, Content

XML: Each XML file is different , I'll give you two examples :

  1. <contents>
        <ImageURL>www.123.com/image.png</ImageURL>
        <Title>Some text</title>
        <Content>Content of item</Content>
    </contents>
    <contents>
        <ImageURL>www.123.com/image.png</ImageURL>
        <Title>Some text</title>
        <Content>Content of item</Content>
    </contents>
    

2.

<item imageURL="url" title="title" content="content">

Is there any open source solution to this ?

* UPDATE *

I forgot to let you know that I will be using this code with an ASP.NET application in following steps:

  1. The user selects the url of the XML document which he wants to import
  2. The user would selectthe tags he wants to import to the database in GUI
  3. in code behind, rule for importing current file will be created and stored to database .

Any tips/tricks how to achieve this in the most simple way ? if you don't have source code, please give me some hints. thanks BE AWARE THAT EACH FILE HAS DIFFERENT STRUCTURE

3 Answers 3

3

You need to figure out the different version you have and write some code to handle each xml-structure.

You can do like this in SQL Server

1:

declare @XML xml

set @XML = 
'<contents>
    <ImageURL>www.123.com/image.png</ImageURL>
    <Title>Some text</Title>
    <Content>Content of item</Content>
</contents>
<contents>
    <ImageURL>www.123.com/image.png</ImageURL>
    <Title>Some text</Title>
    <Content>Content of item</Content>
</contents>'

select 
  N.value('ImageURL[1]', 'varchar(max)') as ImageURL,
  N.value('Title[1]', 'varchar(max)') as Title,
  N.value('Content[1]', 'varchar(max)') as Content
from @XML.nodes('/contents') as T(N)

Result:

ImageURL               Title      Content
---------------------  ---------  ---------------
www.123.com/image.png  Some text  Content of item
www.123.com/image.png  Some text  Content of item

2:

declare @XML xml
set @XML = '<item imageURL="url" title="title" content="content"></item>'

select 
  N.value('@imageURL', 'varchar(max)') as ImageURL,
  N.value('@title', 'varchar(max)') as Title,
  N.value('@content', 'varchar(max)') as Content
from @XML.nodes('item') as T(N)

Result:

ImageURL   Title     Content
--------   -----     -------
url        title     content

3.

declare @XML xml

set @XML = 
'<contents>
  <content>
    <someOtherNode>
      <ImageURL>www.FirstURL.com/image.png</ImageURL>
    </someOtherNode>
  </content>
</contents>
<contents>
  <content>
    <someOtherNode>
      <ImageURL>www.SecondURL.com/image.png</ImageURL>
    </someOtherNode>
  </content>
</contents>'

select 
  N.value('ImageURL[1]', 'varchar(max)') as ImageURL
from @XML.nodes('/contents/content/someOtherNode') as T(N)

Result:

ImageURL
---------------------------
www.FirstURL.com/image.png
www.SecondURL.com/image.png

4.

declare @XML xml

set @XML = 
'<content>
   <imageURL>
     <url>first url</url>
   </imageURL>
   <info>
     <title>title 1</title>
     <text>text 1</text>
   </info>
 </content>
 <content>
   <imageURL>
     <url>second url</url>
   </imageURL>
   <info>
     <title>title 2</title>
     <text>text 2</text>
   </info>
 </content>'

select 
  N.value('imageURL[1]/url[1]', 'varchar(max)') as ImageURL,
  N.value('info[1]/title[1]', 'varchar(max)') as Title,
  N.value('info[1]/text[1]', 'varchar(max)') as Content
from @XML.nodes('/content') as T(N)

Result:

ImageURL    Title    Content
---------   -------  -------
first url   title 1  text 1
second url  title 2  text 2
Sign up to request clarification or add additional context in comments.

8 Comments

Questions : in the first sample, why are you selecting 'Title[1]' not 'Title' ?
@Branislav – value can only return the value from 1 element. We know that there are only one but the parser does not know so [1] tells the value function to pick the first occurrence of Title.
is this functional for each CONTENTS element ?
@Branislav – Yes. @XML.nodes('/contents') will give you one row in the result set for each contents node. Title[1] will give you the first occurrence of Title in the "current" contents node.
And what if i want to use3 structure like this : <content> <someOtherNode> <imageURL>link to image</imageURL> ... </someOtherNode> </content>
|
1

Are you manually importing these files?

You've tagged this c#, so I assume you're wanting to write something.

I would;

  • Create a windows forms or WPF app,
    use opendialog and select the file/s you want to import.
    • For each file, check the presence of certain nodes to determine which
      file I was reading
    • Using Linq to query

out the items into a list of objects;

public class ContentItem 
{
    public string ImageUrl [get;set;}
    public string Title {get;set;}
    public string Content {get;set;}
}
  • Loop through your list of objects and insert them into the database.

1 Comment

I do not fully understand your solution
0

You can transform(for example using xsl) all other formats to your standard xml(for example you decided that standard schema is like in 1 example). Like that you just need to decide which transformation to use for non standard xml file.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.