For future people:
The issue here is about data integrity, maintainability, and efficiency.
With free-text status values, you can accidentally insert "Dead", "DEAD", "deceased", etc., leading to inconsistent data. Storing variable-length text repeated thousands of times can waste space compared to storing a foreign key UUID (a UUID is 16 bytes in binary form, so the savings depend on how long your status strings are). If you need to rename a status or add metadata (like status descriptions, colors for UI, etc.), you'd need to update potentially millions of records vs 1 or 10. And joins on fixed-width UUID keys are generally faster than comparisons on variable-length strings.
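To make the integrity and rename points concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (person, person_status, full_name, status_id) and the sample rows are my own assumptions for illustration, and SQLite stores the UUIDs as text here; your engine will likely have a native UUID type.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: one lookup table, one main table pointing at it.
conn.executescript("""
CREATE TABLE person_status (
    id   TEXT PRIMARY KEY,          -- UUID stored as text in this sketch
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE person (
    id        TEXT PRIMARY KEY,
    full_name TEXT NOT NULL,
    status_id TEXT NOT NULL REFERENCES person_status(id)
);
""")

alive_id, dead_id = str(uuid.uuid4()), str(uuid.uuid4())
conn.executemany(
    "INSERT INTO person_status (id, name) VALUES (?, ?)",
    [(alive_id, "alive"), (dead_id, "dead")],
)
conn.executemany(
    "INSERT INTO person (id, full_name, status_id) VALUES (?, ?, ?)",
    [
        (str(uuid.uuid4()), "Ada", alive_id),
        (str(uuid.uuid4()), "Bob", dead_id),
        (str(uuid.uuid4()), "Cy", dead_id),
    ],
)

# A stray free-text value is not a valid key, so the insert is rejected.
try:
    conn.execute(
        "INSERT INTO person (id, full_name, status_id) VALUES (?, ?, ?)",
        (str(uuid.uuid4()), "Dee", "DEAD"),
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Renaming a status touches one lookup row, not every person row.
renamed_rows = conn.execute(
    "UPDATE person_status SET name = 'deceased' WHERE id = ?", (dead_id,)
).rowcount

deceased_names = [
    row[0]
    for row in conn.execute(
        """
        SELECT p.full_name
        FROM person p
        JOIN person_status s ON s.id = p.status_id
        WHERE s.name = 'deceased'
        ORDER BY p.full_name
        """
    )
]
```

The rename is a single-row update no matter how many people reference that status, which is the whole point of the lookup table.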
Duplicate tables: creating separate tables like alive_people and dead_people would violate several design principles. The first is DRY (Don't Repeat Yourself): you're duplicating the entire schema. That creates a maintenance nightmare, because every schema change has to be applied to multiple tables. It also makes queries complex, since getting all people requires UNION operations. And it violates the principle of having one source of truth: person data is scattered across multiple tables.
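A quick sketch of what that duplication costs in practice, again with Python's sqlite3. The full_name and email columns and the sample rows are assumptions made up for the example; only the alive_people and dead_people table names come from the discussion.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")

# The anti-pattern: the same schema written out twice.
conn.executescript("""
CREATE TABLE alive_people (id TEXT PRIMARY KEY, full_name TEXT NOT NULL);
CREATE TABLE dead_people  (id TEXT PRIMARY KEY, full_name TEXT NOT NULL);
""")
conn.execute("INSERT INTO alive_people VALUES (?, ?)", (str(uuid.uuid4()), "Ada"))
conn.execute("INSERT INTO dead_people VALUES (?, ?)", (str(uuid.uuid4()), "Bob"))

# Every schema change has to be repeated for each copy of the table.
for table in ("alive_people", "dead_people"):
    conn.execute(f"ALTER TABLE {table} ADD COLUMN email TEXT")

# "All people" now needs a UNION across the copies.
everyone = conn.execute("""
    SELECT full_name FROM alive_people
    UNION ALL
    SELECT full_name FROM dead_people
    ORDER BY full_name
""").fetchall()
```

Add a third status and you get a third table, a third ALTER, and a third branch in every UNION.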
Having a text "status" field in a table goes against fundamental database design principles about schema stability and logical data organization.
This is "What I Would Do" (oh, I can make bullet points! I forgot about that!)
- Create views like alive_people and dead_people that filter the main table (someone already said this! It's worth repeating!)
- Use repository patterns or ORM scopes
- When adding status columns, audit existing queries and update them to use the view when possible (and index the view! you can do that now!)
- Design your application to handle missing status filters gracefully OR make the foreign key field not null.
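Here is a rough sketch of the view idea and the NOT NULL option from the list above, using Python's sqlite3. The column names and sample data are assumptions for illustration. SQLite has no indexed views, so that part isn't shown; it applies to engines that support them, such as SQL Server.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

# One main table, one lookup table, and two views that filter the main table.
conn.executescript("""
CREATE TABLE person_status (id TEXT PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE person (
    id        TEXT PRIMARY KEY,
    full_name TEXT NOT NULL,
    status_id TEXT NOT NULL REFERENCES person_status(id)
);
CREATE VIEW alive_people AS
    SELECT p.id, p.full_name
    FROM person p
    JOIN person_status s ON s.id = p.status_id
    WHERE s.name = 'alive';
CREATE VIEW dead_people AS
    SELECT p.id, p.full_name
    FROM person p
    JOIN person_status s ON s.id = p.status_id
    WHERE s.name = 'dead';
""")

alive_id, dead_id = str(uuid.uuid4()), str(uuid.uuid4())
conn.executemany(
    "INSERT INTO person_status VALUES (?, ?)",
    [(alive_id, "alive"), (dead_id, "dead")],
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(str(uuid.uuid4()), "Ada", alive_id), (str(uuid.uuid4()), "Bob", dead_id)],
)

# Callers query the views as if they were the separate tables.
alive = [r[0] for r in conn.execute("SELECT full_name FROM alive_people")]
dead = [r[0] for r in conn.execute("SELECT full_name FROM dead_people")]

# NOT NULL on the foreign key means a person can never lack a status.
try:
    conn.execute(
        "INSERT INTO person VALUES (?, ?, ?)", (str(uuid.uuid4()), "Cy", None)
    )
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True
```

Existing code that expected alive_people and dead_people keeps working against the views, while the data lives in exactly one place.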
The foreign key approach to a status lookup table remains the best practice for data integrity and maintainability, even though it requires some upfront planning for query updates.
If you have multiple tables that need status, you can make the status tables parent-table specific, because status for one type of entity isn't actually the same thing as status for another type of entity. Examples:
- Person (records have all the data about a person) - Person_Status (has alive and dead)
- Project (records for every project) - Project_Status (has active, inactive, completed, et al)
- Customer - Customer_Status (active, dormant, no longer operating)
Why you should use UUIDs for keys and not ints is a topic for another post... but we all know you're probably just gonna use ints. Still, this is a good opportunity to explain one of the reasons not to use ints: misaligned joins. Accidentally joining Project to Person_Status will still return rows, especially if Project_Status and Person_Status both have a [name] field... but the results don't make sense. This is something a developer or DBA could easily miss. Using UUIDs (among other things) ensures that joins either work or return nothing, and an empty result is a strong indicator that you got something wrong. Another reason is the rowguid: if you ever need to scale out across multiple servers (with SQL Server merge replication, for example), every record must have a rowguid, and if you use a UUID as your PK then all your rows already have one. I could write six more paragraphs on all the reasons to never use ints (and especially never auto-incrementing ints!) but I'm already off topic so I'll stop. :)
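The misaligned join is easy to reproduce. This sketch builds the same made-up schema twice with Python's sqlite3, once with per-table integer counters (standing in for auto-increment) and once with UUIDs, then runs the same wrong join against both. The helper names (int_keys, uuid_keys, misjoined_rows) and the sample rows are hypothetical.

```python
import itertools
import sqlite3
import uuid


def int_keys():
    """Key source that mimics a per-table auto-increment starting at 1."""
    counter = itertools.count(1)
    return lambda: next(counter)


def uuid_keys():
    """Key source that hands out random UUIDs as text."""
    return lambda: str(uuid.uuid4())


def misjoined_rows(key_type, keys_for_table):
    """Build the schema with the given key style, then run the WRONG join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(f"""
    CREATE TABLE person_status  (id {key_type} PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE project_status (id {key_type} PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE project (
        id        {key_type} PRIMARY KEY,
        name      TEXT NOT NULL,
        status_id {key_type} NOT NULL REFERENCES project_status(id)
    );
    """)
    # Each table gets its own key source, as it would with auto-increment.
    person_status_key = keys_for_table()
    project_status_key = keys_for_table()
    project_key = keys_for_table()

    conn.executemany(
        "INSERT INTO person_status VALUES (?, ?)",
        [(person_status_key(), "alive"), (person_status_key(), "dead")],
    )
    active, completed = project_status_key(), project_status_key()
    conn.executemany(
        "INSERT INTO project_status VALUES (?, ?)",
        [(active, "active"), (completed, "completed")],
    )
    conn.executemany(
        "INSERT INTO project VALUES (?, ?, ?)",
        [(project_key(), "Website", active), (project_key(), "Migration", completed)],
    )
    # The mistake: joining project to person_status instead of project_status.
    return conn.execute("""
        SELECT p.name, s.name
        FROM project p
        JOIN person_status s ON s.id = p.status_id
        ORDER BY p.name
    """).fetchall()


with_ints = misjoined_rows("INTEGER", int_keys)
with_uuids = misjoined_rows("TEXT", uuid_keys)
print(with_ints)   # plausible-looking nonsense: projects that are "alive" or "dead"
print(with_uuids)  # empty, which tells you the join is wrong
```

With ints, 1 and 2 exist in both status tables, so the bad join quietly returns rows. With UUIDs no key from one table matches a key in the other, and you get nothing back.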