2

I am stuck between a row-based and a column-based table design for storing some items. The decision comes down to which table is easier to manage and, if I go with columns, how many columns are reasonable to have. For example, I have object metadata: ideally there are 45 pieces of information (after normalization) on the same level that I need to store per object. Is 45 columns in a heavy read/write table good? Can it work flawlessly in a real-world situation with heavy concurrent reads and writes?

2
  • I'm thinking columns is best. Does every object have exactly 45 pieces of information, or do some have fewer? If they all have exactly 45, use columns; if some have fewer, use rows in a separate associated table. Commented Dec 30, 2010 at 17:34
  • 45 is the best case that all objects have in common. Of course some values may be NULL. But there are more details that could be added to the column list if I de-normalize the schema a bit to remove some extra joins. Commented Dec 30, 2010 at 17:39

4 Answers

3

If all or most of your columns are filled with data and this number is fixed, then just use 45 fields. There is nothing inherently wrong with 45 columns.

If all of the following conditions are met:

  • Some attributes are neither known nor predictable at design time

  • The attributes are only occasionally filled (say, 10 or fewer per entity)

  • There are many possible attributes (hundreds or more)

  • No attribute is filled for most entities

then you have a so-called sparse matrix. This (and only this) model can be better represented with an EAV (entity-attribute-value) table.
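
To make the two designs concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for whatever database is actually in use. The table and column names (`object_meta`, `object_attr`, `title`, `width`, `height`) are made up for illustration, and only 3 of the 45 attributes are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Wide design: one row per object, one column per attribute
# (hypothetical columns; 3 shown instead of 45).
conn.execute("""
    CREATE TABLE object_meta (
        object_id INTEGER PRIMARY KEY,
        title     TEXT,
        width     INTEGER,
        height    INTEGER
    )
""")
conn.execute("INSERT INTO object_meta VALUES (1128, 'sunset', 800, 600)")

# EAV design: one row per (object, attribute) pair, values stored as text.
conn.execute("""
    CREATE TABLE object_attr (
        object_id INTEGER NOT NULL,
        attr_name TEXT    NOT NULL,
        attr_val  TEXT,
        PRIMARY KEY (object_id, attr_name)
    )
""")
conn.executemany(
    "INSERT INTO object_attr VALUES (?, ?, ?)",
    [(1128, "title", "sunset"), (1128, "width", "800"), (1128, "height", "600")],
)

# Wide table: a single-row lookup by primary key, with typed columns.
wide_row = conn.execute(
    "SELECT title, width, height FROM object_meta WHERE object_id = ?", (1128,)
).fetchone()

# EAV table: several rows that the application has to reassemble itself.
eav_attrs = dict(conn.execute(
    "SELECT attr_name, attr_val FROM object_attr WHERE object_id = ?", (1128,)
).fetchall())

print(wide_row)
print(eav_attrs)
```

Note that the EAV version loses column types (everything comes back as text here) and pushes the reassembly work into application code, which is part of why it only pays off for genuinely sparse data.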


3 Comments

My concern with this is table scans. Say I have 10 million users, and each user generates about 1000 objects a month. After 1 year I have (12,000 × 10 million) objects × (45 rows per object, or say 30 since some values may be NULL). So when the system has to find all the properties of an object, it has to find all rows in this table for, say, object ID 1128. Can it perform well with this many rows? If yes, then this method may work best.
@Dennis: 45 rows per object is an EAV table; 45 columns is a plain table. If most of your attributes are filled and there cannot be more than 45 of them, then just use a column per attribute.
@Dennis: why would you ever have a table scan? Table scans only occur if there are poor indexing decisions.
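
The point about indexing can be checked with a small experiment. The sketch below uses SQLite (via Python's `sqlite3`) purely as an assumption for illustration; other engines expose their own `EXPLAIN` output, and the table and index names are hypothetical. It asks the planner how it would look up object ID 1128 before and after an index on `object_id` exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object_attr (object_id INTEGER, attr_name TEXT, attr_val TEXT)"
)

QUERY = "SELECT attr_name, attr_val FROM object_attr WHERE object_id = 1128"


def plan(connection, sql):
    """Return the planner's description of how it would execute sql."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# No index yet: the only option is to read every row.
plan_without_index = plan(conn, QUERY)

# With an index on object_id the planner can seek straight to the rows.
conn.execute("CREATE INDEX idx_object_attr_object ON object_attr (object_id)")
plan_with_index = plan(conn, QUERY)

print(plan_without_index)
print(plan_with_index)
```

The exact wording of the plan text varies between SQLite versions, but the first plan reports a scan of the table and the second a search using the index.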
0

"There is a hard limit of 4096 columns per table", so it should be just fine.

2 Comments

But does it work in the real world on social network sites that require high read/write volumes?
Your 45 columns are ~1% of the maximum, so you do the math. If your site is not the next Facebook, you shouldn't be worried about it. Just optimize everything as much as you can.
0

Taking the "easier to manage" part of the question:

If the property names you are collecting do not change, then columns are just fine. Even if the table is sparsely populated, disk space is cheap.

However, if you have up to 45 properties per item (row) but those properties might be radically different from one item to another, then using rows is better.

For example, take a product catalog. One product might have color, weight, and height. Another might have a number of buttons or handles. These are obviously radically different properties. Further, this type of data suggests that new properties will be added that might relate only to a particular set of products. In this case, rows are much better.

Another option is to go NoSQL and use a document-based database server. This would allow you to set the named "columns" on a per-item basis.
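
As a rough illustration of the document idea, the sketch below uses plain Python dictionaries and the standard `json` module rather than any particular database product. The product names and property keys are invented for the example.

```python
import json

# Each item carries only the properties that apply to it (hypothetical data).
products = [
    {"id": 1, "name": "mug", "color": "blue", "weight_g": 310, "height_mm": 95},
    {"id": 2, "name": "jacket", "buttons": 6},
]

# A document store would persist something close to this serialized form.
documents = [json.dumps(p, sort_keys=True) for p in products]

# Reading back: a property an item lacks is a missing key, not a NULL column.
loaded = [json.loads(d) for d in documents]
buttons_by_name = {p["name"]: p.get("buttons") for p in loaded}

print(buttons_by_name)
```

The trade-off is the same one described for rows: the schema lives in the application, so the code reading the documents has to cope with any property being absent.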

All of that said, management of rows will be done by the application, which requires some advanced DB skills. Management of columns will be done by the developer at design time, which is usually easier for most people to get their minds around.

2 Comments

As mentioned in the other comment, my concern with rows is performance: 10 million users, 1000 objects per month per user, 45 rows per object. So 1 year later, 2 years later, and so on, the table is going to be extremely large. When running SELECTs to get the properties of an object, can the database find the object efficiently in such a large row-oriented table, or will it take minutes to perform such a task?
@Dennis: First, even going with columns you're talking about 10B new records added each and every month. Two years in, the table is going to have 240B records. Second, with 10M users you are going to have to partition or shard the data in some way to share the load. This isn't going to run on a single server, just due to the queries you'll need to run to support even 1/10th of that user population.
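
The figures in this exchange can be reproduced with a few lines of arithmetic. The constants come straight from the comments above; the helper name `rows_after` is just for illustration.

```python
USERS = 10_000_000
OBJECTS_PER_USER_PER_MONTH = 1_000
ATTRS_PER_OBJECT = 45

# New objects created across all users each month.
objects_per_month = USERS * OBJECTS_PER_USER_PER_MONTH


def rows_after(months, rows_per_object):
    """Total table rows after the given number of months."""
    return objects_per_month * months * rows_per_object


wide_rows_2y = rows_after(24, 1)                # one row per object
eav_rows_2y = rows_after(24, ATTRS_PER_OBJECT)  # one row per attribute

print(f"{objects_per_month:,} objects per month")
print(f"{wide_rows_2y:,} rows after 2 years (columns)")
print(f"{eav_rows_2y:,} rows after 2 years (rows/EAV)")
```

So the column design reaches 240 billion rows after two years, and the row-per-attribute design multiplies that by up to 45.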
0

I don't know if I'm correct, but I once read that in MySQL you should keep your tables to a minimum number of columns if possible (see http://dev.mysql.com/doc/refman/5.0/en/data-size.html). Note that this applies if you are using MySQL; I don't know whether the same concept applies to other DBMSs such as Oracle, Firebird, PostgreSQL, etc.

You could look at your 45-column table, analyze what you truly need, and move the optional fields into another table.
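
One way to read this suggestion is as a vertical split: frequently used columns stay in a narrow main table and the optional ones move to a second table sharing the same key. The sketch below uses SQLite through Python's `sqlite3` as an assumption, and every table and column name in it is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Narrow table holding the columns every query needs.
    CREATE TABLE object_core (
        object_id INTEGER PRIMARY KEY,
        owner_id  INTEGER NOT NULL,
        title     TEXT    NOT NULL
    );
    -- Optional fields live in a 1:1 side table keyed by the same ID.
    CREATE TABLE object_extra (
        object_id   INTEGER PRIMARY KEY REFERENCES object_core(object_id),
        description TEXT,
        location    TEXT
    );
    INSERT INTO object_core  VALUES (1, 7, 'sunset'), (2, 7, 'harbor');
    INSERT INTO object_extra VALUES (1, 'taken at dusk', 'Lisbon');
""")

# Common queries touch only the narrow table.
hot_path = conn.execute(
    "SELECT object_id, title FROM object_core WHERE owner_id = ? ORDER BY object_id",
    (7,),
).fetchall()

# The optional fields are joined in only when they are actually needed.
full = conn.execute("""
    SELECT c.object_id, c.title, e.description, e.location
    FROM object_core AS c
    LEFT JOIN object_extra AS e ON e.object_id = c.object_id
    ORDER BY c.object_id
""").fetchall()

print(hot_path)
print(full)
```

Objects without optional data simply have no row in the side table, and the `LEFT JOIN` returns NULLs for them.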

Hope it helps, good luck
