I am considering storing my HTML document in a table like this:

id    content     parent      tag
1                 0           html
2                 1           head
3                 1           body
4     Main page   2           title
5     Hello world 3           h1

This is just a simplistic example. The result should be:

<html>
 <head>
   <title> Main page </title>
 </head>

 <body>
  <h1> Hello world </h1>
 </body>
</html> 

Right now, I am able to use a CTE in SQL to write a query that returns the correct tree structure. My idea was inspired by this page: https://www.sqlite.org/lang_with.html (scroll down for the best part: solving sudoku with SQL).
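For readers who want to see what such a query might look like, here is a minimal sketch using Python's built-in sqlite3 module. The table name `node` and the helper names are assumptions made for illustration; only the columns come from the example above. It relies on the SQLite behaviour where an ORDER BY on the recursive select controls the order in which rows are expanded, so ordering by depth descending gives a depth-first walk. Siblings are ordered by id here only because the example table has no explicit ordering column.

```python
import sqlite3


def sample_db():
    """In-memory copy of the example table (the name `node` is an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node ("
        "id INTEGER PRIMARY KEY, content TEXT, parent INTEGER, tag TEXT)"
    )
    conn.executemany(
        "INSERT INTO node VALUES (?, ?, ?, ?)",
        [
            (1, None, 0, "html"),
            (2, None, 1, "head"),
            (3, None, 1, "body"),
            (4, "Main page", 2, "title"),
            (5, "Hello world", 3, "h1"),
        ],
    )
    return conn


# Recursive CTE: start at the root (parent = 0) and repeatedly join children.
# "ORDER BY 2 DESC, 1" makes SQLite expand the deepest pending row first
# (depth-first), breaking ties between siblings by id.
TREE_QUERY = """
WITH RECURSIVE tree(id, depth, tag, content) AS (
    SELECT id, 0, tag, content FROM node WHERE parent = 0
    UNION ALL
    SELECT node.id, tree.depth + 1, node.tag, node.content
    FROM node JOIN tree ON node.parent = tree.id
    ORDER BY 2 DESC, 1
)
SELECT depth, tag, content FROM tree
"""


def ordered_nodes(conn):
    """Return (depth, tag, content) rows in document order."""
    return conn.execute(TREE_QUERY).fetchall()


# Print an indented outline of the document.
for depth, tag, content in ordered_nodes(sample_db()):
    print("  " * depth + tag, content or "")
```

The depth column is enough to decide where closing tags go: whenever the depth stops increasing, the elements opened at that depth or deeper have to be closed first.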

I want to use SQL as much as possible and avoid PHP code, for my own reasons. My questions are:

  1. Do you have any ideas on how to finish the process (e.g. mapping HTML tags, ordering, inserting and deleting nodes, etc.)? Any thoughts would be appreciated.

  2. Have you tried (or seen) anything similar? Personal experiences, tutorials and so on?

  3. How would you suggest structuring the tables? For example, how would you avoid repeating the same HTML structures (typically headers, menus, footers)?

  4. Is there anything else that could be useful and related to this topic?

I hope you find this topic as intriguing as I do :)

PS: I want to use SQLite, but I think that doesn't matter as long as you don't suggest anything too database-specific.

PPS: Please read this before you advise me that it is not a good idea :)

I would like to build most of the project in SQL. It is my time to waste, so don't worry :) It is just an experimental thing. I would use Python instead of PHP if the choice of language were that important. Basically, just as an ORM gives you database-independent apps, I am trying to do the opposite: to have a language-independent SQL database that can be accessed from any language. That is my goal, more or less. Speaking of wasting time, I could say the very same about the poor souls who are involved with any PHP framework. I recently checked out a few of them, and from my perspective, wasting time looks like something quite different :)

  • Don't do it... One day when you don't have hairs left you will read back this comment and realise you shouldn't have done it. Commented Jun 24, 2015 at 16:41
  • Is there a particular reason to want to do this? What problem are you trying to solve? Commented Jun 24, 2015 at 16:42
  • So basically you are trying to completely rebuild the DOM tree. Wasting your time with something like this seems silly. If you want to avoid PHP, look into building a new server stack (e.g. MEAN) that uses a different language (JavaScript in the case of MEAN). If it is absolutely necessary to build this in SQL, look into how the DOM tree is constructed (Document Object Model). Commented Jun 24, 2015 at 16:44
  • Sounds like a job for an HTML templating engine. Commented Jun 24, 2015 at 16:47
  • Will somebody please think of the attributes! Commented Jun 24, 2015 at 16:50

1 Answer

There are a number of ways to store a tree structure in an RDBMS. HTML, though, is not a perfect tree structure. You'll face numerous issues creating valid HTML from your data (should <p> be closed? should the selected attribute have a value? etc.).

Also, SQL is not exactly a language to easily manipulate trees. In other words, any non-trivial editing of your template in the database would be a huge pain.

So I suppose you want to serialize a DOM tree, which you know how to produce from a regular HTML file, to save time on parsing. You could just as well store it not as a complete DOM tree but as a sequence of fragments, only adding children where the HTML template has loops. This will exclude most of the DOM hairiness: why painstakingly parse it first only to serialize it back later?

This, BTW, will require the template itself to be a well-formed tree: no conditionally closed tags or suchlike. Some templating engines require this.

I'd not store the thing as a tree. Instead I'd store a parsed template as a flat sequence of fragments with markers where a nested structure begins and ends. It would be trivial to load, trivial to process (all you need is a stack to keep track of nesting), and much easier to inspect with eyes and debug.
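To make the flat-sequence idea concrete, here is a minimal sketch in Python. The fragment kinds ("text", "field", "begin", "end") and the function names are hypothetical, invented for this illustration rather than taken from any existing templating engine. Each row of a fragments table would become one (kind, value) pair, and a stack tracks which loops are currently open.

```python
from html import escape


def matching_end(fragments, begin_index):
    """Return the index of the "end" marker that closes the loop at begin_index."""
    depth = 0
    for j in range(begin_index, len(fragments)):
        kind = fragments[j][0]
        if kind == "begin":
            depth += 1
        elif kind == "end":
            depth -= 1
            if depth == 0:
                return j
    raise ValueError("unbalanced begin/end markers")


def render(fragments, data):
    """Render a flat list of (kind, value) fragments against a dict of data."""
    out = []
    loops = []      # one frame per open loop: [index of "begin", remaining items]
    scope = [data]  # lookup contexts, innermost last
    i = 0
    while i < len(fragments):
        kind, value = fragments[i]
        if kind == "text":
            out.append(value)                       # literal markup, emitted as-is
        elif kind == "field":
            out.append(escape(str(scope[-1][value])))
        elif kind == "begin":
            items = list(scope[-1][value])
            if items:
                scope.append(items.pop(0))          # enter the loop with the first item
                loops.append([i, items])
            else:
                i = matching_end(fragments, i)      # empty loop: skip the body
        elif kind == "end":
            start, remaining = loops[-1]
            scope.pop()
            if remaining:
                scope.append(remaining.pop(0))      # next item: jump back to the body
                i = start
            else:
                loops.pop()                         # loop exhausted: fall through
        i += 1
    return "".join(out)


template = [
    ("text", "<ul>"),
    ("begin", "items"),
    ("text", "<li>"),
    ("field", "name"),
    ("text", "</li>"),
    ("end", None),
    ("text", "</ul>"),
]
print(render(template, {"items": [{"name": "a"}, {"name": "b"}]}))
```

Because the literal markup is stored verbatim in "text" fragments, questions such as whether <p> gets closed never reach the renderer; it only has to balance the begin/end markers.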

Or maybe you'll look around and find a ready-made templating engine that does just that. I've no idea what the modern PHP landscape looks like, but the chances of finding an existing solution in such a mature environment are quite high.

If you still take the tree approach, make sure that you can load the entire tree in one query, because database round-trips are not so cheap, even for in-process SQLite.
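As a rough illustration of the one-query approach, the sketch below reads every row with a single SELECT and assembles the markup in memory. It uses Python's sqlite3 module; the table name `node` and the function names are assumptions for the example, and siblings are ordered by id only because the example table has no ordering column. Attributes and void elements are deliberately ignored.

```python
import sqlite3
from collections import defaultdict
from html import escape


def sample_db():
    """In-memory copy of the question's example table (name `node` is assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node ("
        "id INTEGER PRIMARY KEY, content TEXT, parent INTEGER, tag TEXT)"
    )
    conn.executemany(
        "INSERT INTO node VALUES (?, ?, ?, ?)",
        [
            (1, None, 0, "html"),
            (2, None, 1, "head"),
            (3, None, 1, "body"),
            (4, "Main page", 2, "title"),
            (5, "Hello world", 3, "h1"),
        ],
    )
    return conn


def load_children(conn):
    """Fetch the whole tree in one round-trip, grouped by parent id."""
    children = defaultdict(list)
    query = "SELECT id, content, parent, tag FROM node ORDER BY id"
    for node_id, content, parent, tag in conn.execute(query):
        children[parent].append((node_id, tag, content))
    return children


def render(children, parent=0):
    """Serialize the subtree under `parent`; text content precedes child elements."""
    parts = []
    for node_id, tag, content in children.get(parent, []):
        inner = escape(content or "") + render(children, node_id)
        parts.append(f"<{tag}>{inner}</{tag}>")
    return "".join(parts)


print(render(load_children(sample_db())))
```

The database is touched exactly once per page here, no matter how deep the tree is; everything after the SELECT is plain in-memory work.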

But before you even continue with any approach, profile your code first. I bet that templating is not the bottleneck, and lowering the number of database / file system accesses will have a much more pronounced effect on latency and CPU load.

1 Comment

I am thinking along the same lines. In the end, I believe the best approach would be to make reusable elements, probably mostly inspired by ASP.NET. My idea was to make a component as follows: <repeater datasource=AnyView> <element datafield=ColumnName></element> ... </repeater> This should map to a tree structure pretty nicely :) PS: You are right about the HTML tree structure. I would probably use XHTML.
