2

I am building an app where every company has its own private schema (PostgreSQL).

For every request, I set the Postgres search path in a before_action like this:

ActiveRecord::Base.connection.schema_search_path = 'company_id, public'

My question is: if I have multiple Unicorn workers, and worker 'A' sets the path while worker 'B' sets the path before worker 'A' has finished, I think it will cause a conflict, and worker 'A' could accidentally save or read models from the wrong schema, right?

Is there another solution that would work better with Unicorn's design?

Edit, schema details:

Each company has many users. Both the users and companies tables live in the public schema; the rest (products, clients...) live in private schemas.

Edit, more research:

After some research, I found that each database client connection has its own search path. Hence, if I change the search path on one connection, the others won't be affected. This could work with Unicorn because each request has its own connection, but it will not work with multi-threaded servers like Puma.
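
A minimal sketch of the per-connection idea, assuming a connection object that responds to `schema_search_path` and `schema_search_path=` as the ActiveRecord PostgreSQL adapter does. The helper name `with_tenant_schema` and the `FakeConnection` stand-in are hypothetical, used only so the snippet runs without a database. It switches the path for the duration of a block and restores the previous value afterwards, so a reused connection does not keep the last tenant's path:

```ruby
# Hypothetical stand-in for a database connection; a real app would pass
# ActiveRecord::Base.connection instead.
FakeConnection = Struct.new(:schema_search_path)

# Hypothetical helper: set the search path for the block, then restore it.
def with_tenant_schema(connection, tenant_schema)
  previous = connection.schema_search_path
  connection.schema_search_path = "#{tenant_schema}, public"
  yield connection
ensure
  # Always restore the earlier path, even if the block raises.
  connection.schema_search_path = previous
end

conn = FakeConnection.new("public")
seen = with_tenant_schema(conn, "company_42") { |c| c.schema_search_path }
```

In a controller this would presumably be wrapped in an around_action rather than a before_action, so the restore step runs after the request finishes.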

However, there are still some problems raised in the answers, such as ActiveRecord reloading the schema for each request. I would like to hear from someone who is using this approach in production.

7
  • Does each unique company have its own login, or do all companies share the same login? Also, how are the companies accessing the database? Commented May 18, 2014 at 22:15
  • Each company has many users. Both users and companies tables live in the public schema, the rest live in private schemas. Commented May 18, 2014 at 23:23
  • 1
    Hmm, I don't know enough about Rails, but I did find this, which simply reinforces your suspicion: groups.google.com/forum/#!topic/rails-oceania/tvzC85huXEA Commented May 19, 2014 at 0:01
  • @Jirico, I have solved this problem in a production environment and I have provided an alternative approach (which your question asks), along with insight into why changing the schema path dynamically without AR reparsing schema poses a risk to type handling. Just because there are real problems/risks and issues with your desired/preferred approach doesn't make the answers or advice invalid. I've invited you to check the adapter code for yourself and explained the detail of the issue you face. Commented May 28, 2014 at 1:21
  • I agree, so far your answer is the most complete. I will try my approach in production and keep updating this thread with my results. Thanks for the help. Cheers Commented May 28, 2014 at 1:42

4 Answers 4

3
+50

I don't think multiple schemas are a great idea, because your ORM will need to reload its schema on each request unless you plan on running a server instance for each tenant...

From what I have read, multiple schemas also do not scale well in Postgres. If you have tens of thousands of tenants, you will start to get performance issues.

The approach I have used is to have a tenant_id in each table, and to use a scope on your models plus some validation checks to ensure that related models are within the required tenant or user scope. It's really very simple and works well.

I use request_store to set both User.current and Tenant.current from a base controller, so that I have the needed context in my models to restrict and enforce tenant or user scope where required. I posted an example of this to another Stack Overflow question here.
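
A rough sketch of that pattern in plain Ruby, with no Rails or request_store dependency. `Tenant.current` comes from the description above; everything else (the `Product` struct, `for_current_tenant`, storing the value in `Thread.current`) is an assumption for illustration. In a real app the storage would be request_store and the filter would be a model scope on `tenant_id`:

```ruby
# Holds the tenant for the current request. Thread.current is used here as a
# stand-in for request_store, which is cleared between requests.
class Tenant
  def self.current
    Thread.current[:current_tenant_id]
  end

  def self.current=(tenant_id)
    Thread.current[:current_tenant_id] = tenant_id
  end
end

# Hypothetical row type; every tenant-owned table carries a tenant_id.
Product = Struct.new(:name, :tenant_id)

# Hypothetical equivalent of a model scope: keep only the current tenant's rows.
def for_current_tenant(rows)
  raise "no tenant set for this request" if Tenant.current.nil?
  rows.select { |row| row.tenant_id == Tenant.current }
end

products = [Product.new("Widget", 1), Product.new("Gadget", 2)]
Tenant.current = 1 # normally set from a base controller filter
visible = for_current_tenant(products).map(&:name)
```

Refusing to filter when no tenant is set is a deliberate choice here: failing loudly seems safer than silently returning every tenant's rows.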

I found that in my multi-tenant app not everything was isolated to a tenant, and I needed some tables to be shared, so I quickly discounted the multiple-schema solution. That, plus the per-request schema reload issue and the ability to create new tenants as normal model saves, made multiple schemas a non-starter.

Assuming you manage to avoid schema inconsistency problems within AR, you also need to consider that live streaming, SSE or websockets are ruled out or become incredibly difficult with your approach, as you cannot have threads operating in different tenants, and Unicorn also doesn't support long-running requests.

You may wish to instead consider using the EventMachine-based Thin server, and possibly rack-fiber-pool, so that you can support concurrency, slow clients and long-running requests for live streaming or SSE without thread-related issues while keeping excellent scalability. With the fiber approach you would need to work out how to switch and restore the schema context when the fiber is resumed, but in principle it is doable.


8 Comments

Could you explain in more detail why changing the PostgreSQL search path would cause my ORM to reload its schema on each request?
The ORM needs to interrogate the database schema so it can discover the tables and columns for your models. Switching the schema would need to be done per request, and you would be getting Active Record to parse all that schema metadata on each request. I don't even know how you would authenticate users without some default schema and a users table to tell you which tenant-specific schema to switch to. All in all, a lot of schema-switching overhead would be involved.
To be clear, the AR adapter code loads types and column definitions once, on opening the connection to the database. But schemas are containers for tables, views, functions, types and constraints, and may use different oids for the same type names, so you are playing a very dangerous game and should be closing and reopening the connection when changing the schema path. What is your strategy for migrations? You will have to migrate each tenant's schema independently, and I don't think Rails migrations will work that way for you.
I recommend looking at the AR Postgres adapter code (I did). AR caches table/column info and type info. The oids for types with the same names can be different in each schema, as each schema is independent. The only way to guarantee correct behavior is to close and reopen the connection on tenant/schema change, so that AR has a correct understanding of the schema you are using for that tenant. Your migration approach sounds very inefficient compared to having a tenant ID.
Aside from AR and Rails, you should really read this Microsoft article that talks about the pros and cons of a multi-tenant architecture. Basically, a shared design will be cheapest in the long run but has more upfront development costs.
2

I think this is not a feasible solution. This is what the docs say:

schema_search_path=(schema_csv) public Sets the schema search path to a string of comma-separated schema names. Names beginning with $ have to be quoted (e.g. $user => ‘$user’). See: http://www.postgresql.org/docs/current/static/ddl-schemas.html

This should be not be called manually but set in database.yml.

And this is the implementation:

    def schema_search_path=(schema_csv)
      if schema_csv
        execute("SET search_path TO #{schema_csv}", 'SCHEMA')
        @schema_search_path = schema_csv
      end
    end

It looks like it's too global for your use case.
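
Because the setter interpolates its argument straight into the SET statement, any schema name derived from request data should probably be validated before it gets there. A small sketch, assuming tenant schemas follow a `company_<id>` naming convention; that convention and the helper name `search_path_for` are hypothetical:

```ruby
# Hypothetical naming convention for tenant schemas: "company_" plus digits.
TENANT_SCHEMA_PATTERN = /\Acompany_\d+\z/

# Build a search path string, refusing anything that is not a known-safe name.
def search_path_for(tenant_schema)
  unless TENANT_SCHEMA_PATTERN.match?(tenant_schema)
    raise ArgumentError, "invalid tenant schema: #{tenant_schema.inspect}"
  end
  "#{tenant_schema}, public"
end

path = search_path_for("company_42")
```

The resulting string is what would be handed to `schema_search_path=`.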

1 Comment

Good information. Now my question is whether the Postgres search path is kept per database connection. If it is, I think this may work because, if I'm not wrong, each Unicorn worker has its own set of connections. If that is right, each worker would have its own Postgres search path.
1

I can't directly answer your question, but you might take a look at the acts_as_tenant gem.

Here is an example of usage: https://github.com/Bahanix/RubyBB/blob/master/app/controllers/application_controller.rb#L15-L25

2 Comments

It is the same approach suggested by @Andrew Hacking: using a tenant column and putting everything in the same table.
I used the ActsAsTenant gem in anger, but it didn't work well when you have shared data or other access rules on what should be visible, i.e. 'shared/public' or 'tenant only', or where the tenant could be optional.
1

I made a gem called acts_as_restricted_subdomain that implements the single-schema strategy outlined above. We have been using it successfully in production with Unicorn and Resque for about 4 years now to separate all of our clients' data from each other with no spillover.
