
I'm setting up a data-logger which is polling data from a RESTful web service (from a PLC installed in a production facility), and I need to write the data to a PostgreSQL database.

I need to read data every 30 seconds from 5 different machines, 24 hours a day and 6 days per week. That would be around 15,000 connections to the database every day if I close the connection after each set of queries. I'm assuming all 5 machines will be read at different times, but of course we can reduce that to about 3,000 connections if I read all of them simultaneously.
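For reference, the arithmetic works out roughly as stated. A quick sketch (the class and method names are just for illustration):

```java
// Quick sanity check of the connection counts above.
class PollMath {

    // Number of polls per machine per day at a fixed polling interval.
    static int pollsPerDay(int intervalSeconds) {
        return 24 * 60 * 60 / intervalSeconds; // 86400 / 30 = 2880
    }

    public static void main(String[] args) {
        int perMachine = pollsPerDay(30);    // 2880 polls per machine per day
        System.out.println(5 * perMachine);  // 14400 -- roughly the 15,000 quoted
        System.out.println(perMachine);      // 2880 -- roughly the 3,000 quoted for batched reads
    }
}
```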

What is the best way to achieve a persistent connection to PostgreSQL? My concern is that creating a database "handler" class that returns a Connection object could be affected by timeouts or errors (if the connection closes itself, I would not be able to log any data).

  • Are you saying you have five JDBC clients with each of them querying/writing Postgres every half-minute? Commented Jun 11, 2017 at 0:06

2 Answers


Keep connections open

You seem to be saying that you have five JDBC clients, each of which needs to do a query or write every half-minute, around the clock, six days per week. If that is the case, I see no need to close the connection at all. Just have each JDBC client maintain an open connection.

Why do you believe it is necessary to close your connections between the half-minute calls? If there are other factors to consider, edit your Question to clarify.

Be sure to test your JDBC connection, as it may be lost to a network interruption or a restart of the Postgres server. If the connection fails, open another.
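A minimal sketch of that check-and-reopen pattern, using the standard JDBC `Connection.isValid` test (the URL and credentials are placeholders; the backoff values are just a reasonable assumption):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Sketch: keep one long-lived connection, validate it before each use,
// and reopen with capped exponential backoff if it has gone away.
class PersistentConnection {
    // Placeholder connection details -- substitute your own.
    private static final String URL = "jdbc:postgresql://dbhost:5432/logger";

    private Connection conn;

    // Capped exponential backoff: 500 ms, 1 s, 2 s, ... up to 30 s.
    static long backoffMillis(int attempt) {
        long delay = 500L << Math.min(attempt, 6); // doubles up to 32 s
        return Math.min(delay, 30_000L);           // then capped at 30 s
    }

    // Return a live connection, reopening it if the old one died.
    Connection get() throws InterruptedException {
        int attempt = 0;
        while (true) {
            try {
                if (conn != null && conn.isValid(2)) { // 2-second validation timeout
                    return conn;
                }
                conn = DriverManager.getConnection(URL, "logger", "secret");
                return conn;
            } catch (SQLException e) {
                Thread.sleep(backoffMillis(attempt++)); // wait, then try again
            }
        }
    }
}
```

The loop never gives up, which matches a logger that must not stop; adjust to taste if you would rather surface a hard failure after N attempts.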

You can maintain a connection between your app and the Postgres server indefinitely. Keep in mind two things: txn timeout and fragility of network.

Transaction timeout

Each client connection has default settings. One of these is idle_in_transaction_session_timeout (integer). If a session sits idle inside an open transaction for longer than this limit, the transaction is rolled back and the connection (session) is terminated.

If you know you will have transactions open for long durations, you can disable the timeout feature, though that is not generally recommended. It is probably irrelevant in your use case: if you use transactions at all, your description suggests they will be brief.

To quote the documentation:

idle_in_transaction_session_timeout (integer)

Terminate any session with an open transaction that has been idle for longer than the specified duration in milliseconds. This allows any locks held by that session to be released and the connection slot to be reused; it also allows tuples visible only to this transaction to be vacuumed. See Section 24.1 for more details about this.

The default value of 0 disables this feature.
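For example, the setting can be adjusted per session or server-wide (a sketch; pick a value that suits your workload):

```sql
-- Per-session override (lasts until this connection closes):
SET idle_in_transaction_session_timeout = '5min';

-- Server-wide, in postgresql.conf (0 disables the feature, as noted above):
-- idle_in_transaction_session_timeout = 0
```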

Fragility of networks

Network connections are by their nature tenuous and fragile. Less experienced programmers tend to underappreciate this challenge, as the development environment tends to be vastly more reliable than real-world deployment environments.

A programmer must always take extra care when using network connections. See the fallacies of distributed computing. You must assume you will have network interruptions and failed database connections. Test the validity of your db connection each time you do work. Catch exceptions thrown by your JDBC driver. Use transactions where appropriate to protect the integrity of your data. That's the main reason transactions were invented: we expect failures.
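A sketch of that defensive pattern: wrap each write in a transaction, roll back on failure, and use the SQLState to decide whether the connection itself is gone (SQLState class "08" covers connection exceptions in the SQL standard). The table and column names here are made up for illustration:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class LogWriter {

    // SQLState class "08" = connection exception: the session is unusable,
    // so the caller should reconnect rather than merely retry the statement.
    static boolean isConnectionFailure(String sqlState) {
        return sqlState != null && sqlState.startsWith("08");
    }

    // Write one reading inside a transaction; roll back on any failure.
    // (Hypothetical table "readings(machine_id, value)".)
    static void writeReading(Connection conn, int machineId, double value) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement ps =
                 conn.prepareStatement("INSERT INTO readings (machine_id, value) VALUES (?, ?)")) {
            ps.setInt(1, machineId);
            ps.setDouble(2, value);
            ps.executeUpdate();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // leave no half-done work behind
            if (isConnectionFailure(e.getSQLState())) {
                // Caller should discard this connection and open a new one.
            }
            throw e;
        } finally {
            conn.setAutoCommit(true);
        }
    }
}
```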

In other words, there is no such thing as the "persistent connection" you mention in the Question. There are simply connections. Whether kept open for 50 milliseconds or 50 days is irrelevant: all connections are at risk of failure at any moment. So, again, expect failure.

Be sure to test for network failure in your development-testing cycles. There are high-tech ways to do that. And there is a low-tech favorite of mine: Pull the Ethernet cable while running.

Consider deploying your app and the Postgres server on the same box if otherwise practical (enough cores, enough RAM, enough stability). A local connection between app and database will be much, much more reliable (and faster) than one distributed (networked) across machines. But other deployment issues may dictate separate machines. System administration is all about trade-offs.

Threads

If using threads in your app, be sure they do not share a JDBC connection. Each thread should use its own connection.
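One way to enforce that rule is a small per-thread holder, so each thread lazily creates its own instance of the resource (a sketch; with JDBC the factory would open a new Connection, wrapping the checked SQLException as needed):

```java
import java.util.function.Supplier;

// Sketch: give every thread its own lazily-created instance of some resource.
// With JDBC, the factory would open a new Connection (e.g. via a DataSource).
class PerThread<T> {
    private final ThreadLocal<T> local;

    PerThread(Supplier<T> factory) {
        this.local = ThreadLocal.withInitial(factory);
    }

    T get() {
        return local.get(); // same instance for the same thread, a new one per thread
    }
}
```

Threads then call `get()` instead of sharing one Connection object between them.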

You mentioned a "database handler" to "get a connection". I am not sure what you meant. Generally I recommend a DataSource object on which you call getConnection; your JDBC driver should provide an implementation.
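A sketch of that arrangement, coding against the standard javax.sql.DataSource interface (the helper below just documents the standard Postgres JDBC URL shape; the host and database names are placeholders):

```java
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.SQLException;

// Sketch: depend only on the javax.sql.DataSource interface; the Postgres
// JDBC driver supplies an implementation (org.postgresql.ds.PGSimpleDataSource).
class Db {

    // Standard Postgres JDBC URL shape: jdbc:postgresql://host:port/database
    static String jdbcUrl(String host, int port, String database) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database;
    }

    // Call sites ask the DataSource for a connection when they need one.
    static Connection open(DataSource ds) throws SQLException {
        return ds.getConnection();
    }
}
```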

Do not use a connection pool

Connection pools are often suggested for database work, sometimes reflexively, without properly considering the pros and cons. Connection pools bring their own risks and complexities.

And connection pools are most useful when you have many clients making intermittent calls to the database. Your situation is the opposite: a few clients making many frequent, regular calls. So I would say a connection pool is contraindicated for you.


4 Comments

Thank you for your reply. I am using one single JDBC client to read values from five RESTful devices through an HTTP URL, and then pushing the results to the database every half-minute. I was wondering if keeping the connection open might cause disconnections due to timeouts, and how to deal with that kind of situation (in fact, I cannot stop logging because of disconnections or program restarts, so I need a highly reliable system).
@Mastarius Your comment is more clear than your Question. Please edit your Question with this clarification, for the sake of posterity.
@Mastarius I added two sections about txn timeout and fragility of networks in response to your Comment.
That is really complete, thank you again for your kind reply!

The best way to share connections is connection pooling, with a library like DBCP, for example.

If you have different machines connecting to the database, that is a little bit more difficult. I would set up a service on a separate machine, or on one of the existing machines, exposed for example via REST or a similar interface.

In the end, make sure to check your PostgreSQL config as well. You will find a good guide in the Postgres wiki.

2 Comments

Can you explain more? I see no need for a connection pool in the situation described in the Question: Few clients (one client?) hitting the database every half-minute. Why not keep an open connection? Why add the complexity of a connection pool?
That's my solution to the question and that would be my way to solve it. If your solution is the one in your answer, fine, it's yours! The connection pool keeps connections open as well :)
