1

I have 2 tables. One stores nothing by email address along with a unique ID. Both fields are unique. I have another table which stores data about a person. Name, Phone, Address, Occupation, etc. There are a fixed amount of fields in the person table. I believe 10 is the number.

The reason for this setup is that I track changes to an individual person by email, as this is the only unique piece of information I have about someone. These individuals may post via several methods to my site with no long in. Depending on where they post and what method, I collect certain data. Data could change over time. People move, change phone numbers or occupations. So I have several records to match up to an email so that I can display the changes over time.

My question is, which is a better setup with MySQL. I'm sure this is a loaded question. Should I have:

ID | EmailID | Name | Phone | Address | City | Created | etc....

Or

ID | EmailID | FieldType | Value | Created 

The first setup obviously stores all data for a person along with an associated emailID in one row. The second stores data field by field. So I could insert a row with the FieldType occupation and the value developer under and emailID, as opposed to the whole data set.

Since not all data is required on each insertion, I feel that the second method is best, but I am not 100% sure. With the second method, I am only adding a row and inserting what I need to, as opposed to the whole row.

Hope this makes sense.

1
  • Depending on how you index your table the first one might be more efficient in the long run. Having a relational structure will require more elaborate queries, which can be cached or handled any number of ways so that may not even be an issue. But if you're worried about storing a few empty fields, then consider how minuscule the overhead of this actually in comparison to the possibility of having 10 million entries in structure 2 for 1 million users... Course, you could always try some benchmarking Commented Sep 18, 2013 at 18:34

1 Answer 1

1

The second method is called entity-attribute-value modeling (EAV).

In general, the first method is preferred. Among the problems with EAV is the typing of columns. For instance, one of your columns is created, which is presumably a date column. The others seem to be strings.

With the EAV structure, it is difficult to maintain foreign key relationships. It is harder to assign constraints on row values. Queries to return all values from a row require bringing together multiple rows. And, because the entity information has to be repeated (the email is your case), they often take up more space.

There are some situations where EAV is the right solution. However, I would encourage you to go with your first structure where all the columns are in a single table. Having NULL values is not a problem in terms of performance or in terms of inserting data.

Sign up to request clarification or add additional context in comments.

1 Comment

I won't argue with that. Thanks for the feedback. I will do some more research on EAV. Thanks again.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.