
Is it possible to have selective queries in PostgreSQL which select different tables/columns based on values of rows already selected?

Basically, I've got a table in which each row contains a sequence of two to five characters (tbl_roots), optionally with a length field which specifies how many characters the sequence is supposed to contain (it's meant to be made redundant once I figure out a better way, i.e. by counting the length of the sequences).

There are four tables containing patterns (tbl_patterns_biliteral, tbl_patterns_triliteral, ...etc), each of which corresponds to a root_length, and a fifth table (tbl_patterns) which is used to synchronise the pattern tables by providing an identifier for each row, so row #2 in tbl_patterns_biliteral corresponds to the same row in tbl_patterns_triliteral. The four length-specific pattern tables are restricted such that no row in tbl_patterns_(bi|tri|quadri|quinqui)literal can have a pattern_id that doesn't exist in tbl_patterns.

Each pattern table has nine other columns, each of which corresponds to an identifier (root_form).

The last table in the database (tbl_words), contains a column for each of the major tables (word_id, root_id, pattern_id, root_form, word). Each word is defined as being a root of a particular length and form, spliced into a particular pattern. The splicing is relatively simple: translate(pattern, '12345', array_to_string(root, '')) as word_combined does the job.

Now, what I want to do is select the appropriate pattern table based on the length of the sequence in tbl_roots, and select the appropriate column in the pattern table based on the value of root_form.

How could this be done? Can it be combined into a simple query, or will I need to make multiple passes? Once I've built up this query, I'll then be able to code it into a PHP script which can search my database.

EDIT

Here's some sample data (it's actually the data I'm using at the moment) and some more explanations as to how the system works: https://gist.github.com/823609

It's conceptually simpler than it appears at first, especially if you think of it as a coordinate system.

6
  • Can you post DDL and a little sample data, please? I'm pretty sure this can be done, but not in quite the way you're looking at it. Commented Feb 12, 2011 at 7:18
  • Sample data is best posted as INSERT statements. The console output cannot easily be used to quickly set up a test environment. Commented Feb 12, 2011 at 9:28
  • I've updated the Gist so it should contain the right statements, but I haven't tested them so I don't know for sure. Commented Feb 12, 2011 at 10:55
  • @Catcall I've updated that Gist again so it has the table definitions and the SQL statements to insert the sample data; the syntax for inserting the root character arrays might be wrong though. Commented Feb 13, 2011 at 4:42
  • I would be careful about using arrays. I'm not familiar with your domain: will tbl_root.root always have a length of 3? If it can change length, or if you need to search through it, it could become a very big headache. Arrays are OK when you can index them and know what each index means, e.g. quarter[1] for an account cycle stands on its own. If not, you basically have a hidden nested table that is not relationally sound. Commented Feb 13, 2011 at 20:13

2 Answers

1

I think you're going to have to change the structure of your tables to have any hope. Here's a first draft for you to think about. I'm not sure what the significance of the "i", "ii", and "iii" is in your column names. In my ignorance, I'm assuming they're meaningful to you, so I've preserved them in the table below. (I preserved their information as integers. Easy to change that to lowercase roman numerals if it matters.)

create table patterns_bilateral (
  pattern_id integer not null,
  root_num integer not null,
  pattern varchar(15) not null,
  primary key (pattern_id, root_num)
);

insert into patterns_bilateral values
(1,1, 'ya1u2a'), 
(1,2, 'ya1u22a'),
(1,3, 'ya12u2a'), 
(1,4, 'me11u2a'), 
(1,5, 'te1u22a'), 
(1,6, 'ina12u2a'), 
(1,7, 'i1u22a'), 
(1,8, 'ya1u22a'), 
(1,9, 'e1u2a');

I'm pretty sure a structure like this will be much easier to query, but you know your field better than I do. (On the other hand, database design is my field . . . )
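To illustrate why this layout is easier to query, here is a minimal sketch that picks one pattern by its two "coordinates" and splices a root into it. It assumes the patterns_bilateral table and rows above have already been created; the two-character root {k,t} is a made-up placeholder, not data from the question.

```sql
-- Look up a single pattern by (pattern_id, root_num) and splice a root into it.
-- array['k','t'] is a hypothetical biliteral root used only for illustration.
select pattern
     , translate(pattern, '12345', array_to_string(array['k','t'], '')) as word_combined
from patterns_bilateral
where pattern_id = 1
  and root_num = 3;
```

With the sample rows above, the pattern selected is ya12u2a. Because the root supplies only two characters, translate replaces 1 and 2 and deletes any 3, 4, or 5 in the pattern.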


Expanding on my earlier answer and our comments, take a look at this query. (The test table isn't even in 3NF, but the table's not important right now.)

create table test (
root_id integer,
root_substitution varchar[],
length integer,
form integer,
pattern varchar(15),
primary key (root_id, length, form, pattern));

insert into test values
(4,'{s,ş,m}', 3, 1, '1o2i3');

This is the important part.

select root_id
     , root_substitution
     , length
     , form
     , pattern
     , translate(pattern, '12345', array_to_string(root_substitution, '')) 
from test;

That query returns, among other things, the translation soşim.

Are we heading in the right direction?


9 Comments

I'm not sure about this. It'll make the structure of the tables less clear to follow, as the data is supposed to be represented in a "three-dimensional space" accessed by "coordinates" (root, form, pattern). Flattening the pattern table, as you appear to have done, will result in some kind of secondary key to access the data. Would it be possible to create a view that could rearrange the table into a coordinate-based system?
Probably. (Another alternative is to have a spreadsheet query the database, displaying the data in rows, columns, and sheets.) Usually, if you have tables normalized to 5NF, how you present the data has almost nothing to do with how it's stored in the database. In other words, 5NF buys you a lot of flexibility in presentation. I think it makes sense to try doing this with a small subset of the data to make sure it will work for your purposes. And I don't mind trying to help you with that.
@Catcall What do you mean by normalised to 5NF?
@zpqaeski: 5NF is shorthand for "fifth normal form", a database design principle for increasing data integrity. A side-effect of it is to make the database more flexible. See bkent.net/Doc/simple5.htm
You have "So, the word for "girl" is given by the root with id=2 . . . Root = {s,ş,m}". Should that be "root with id = 4" instead?
0

Well, that's certainly a bizarre set of requirements! Here's my best guess, but obviously I haven't tried it. I used UNION ALL to combine the patterns of different sizes and then filtered them based on length. You might need to move the length condition inside each of the subqueries for speed reasons, I don't know. Then I chose the column using the CASE expression.

select  word,
        translate(
            case root_form
              when 1 then patinfo.pattern1
              when 2 then patinfo.pattern2
              ... up to pattern9
            end,
            '12345',
            array_to_string(root.root, '')) as word_combined
from    tbl_words word
join    tbl_root root
on      word.root_id = root.root_id
join    tbl_patterns pat
on      word.pattern_id = pat.pattern_id
join    (
        select  2 as pattern_length, pattern_id, pattern1, ..., pattern9
        from    tbl_patterns_biliteral bi
        union all
        select  3, pattern_id, pattern1, pattern2, ..., pattern9
        from    tbl_patterns_triliteral tri
        union all
        ...same for quad and quin...
        ) patinfo
on
patinfo.pattern_id = pat.pattern_id
and array_length(root.root, 1) = patinfo.pattern_length

Consider combining all the different patterns into one pattern_details table with a root_length field to filter on. I think that would be easier than combining them all together with UNION ALL. It might be even easier if you had multiple rows in the pattern_details table and filtered based on root_form. Maybe the best would be to lay out pattern_details with fields for pattern_id, root_length, root_form, and pattern. Then you just join from the word table through the pattern table to the pattern detail that matches all the right criteria.
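A rough sketch of that last layout, in case it helps. Everything prefixed demo_ is a hypothetical stand-in (for tbl_root, tbl_words, and the proposed pattern_details table), and I'm assuming the root is stored as a varchar[] array. The root and pattern are borrowed from the test row in the other answer; the pattern_id and word_id values are invented for the demo.

```sql
-- Hypothetical stand-in tables; temp tables so nothing real is touched.
create temp table demo_roots (
  root_id integer primary key,
  root    varchar[] not null
);

create temp table demo_pattern_details (
  pattern_id  integer not null,
  root_length integer not null,
  root_form   integer not null,
  pattern     varchar(15) not null,
  primary key (pattern_id, root_length, root_form)
);

create temp table demo_words (
  word_id    integer primary key,
  root_id    integer not null references demo_roots,
  pattern_id integer not null,
  root_form  integer not null
);

-- Root 4 and pattern '1o2i3' come from the other answer's test row;
-- pattern_id 1 and word_id 1 are invented.
insert into demo_roots values (4, '{s,ş,m}');
insert into demo_pattern_details values (1, 3, 1, '1o2i3');
insert into demo_words values (1, 4, 1, 1);

-- One pass: the join conditions pick the right "table" (root_length)
-- and the right "column" (root_form) as ordinary row filters.
select w.word_id
     , translate(d.pattern, '12345', array_to_string(r.root, '')) as word_combined
from   demo_words w
join   demo_roots r
on     r.root_id = w.root_id
join   demo_pattern_details d
on     d.pattern_id  = w.pattern_id
and    d.root_form   = w.root_form
and    d.root_length = array_length(r.root, 1);
```

For the single demo row this should give soşim, the same translation the other answer reports. The stored length column becomes unnecessary here, since array_length derives it from the root itself.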

Of course, maybe I've completely misunderstood what you're looking for. If so, it would be clearer if you posted some example data and an example result.

2 Comments

I was thinking along the lines of creating a separate view for each length, then UNION on the views. I haven't looked at the sample data and DDL yet, though.
I just looked up the syntax for views and created one for each root length. I'll keep the root_length column for the time being, and drop it when it's definitely not needed.
